From steve at pearwood.info  Fri Mar  1 00:49:56 2019
From: steve at pearwood.info (Steven D'Aprano)
Date: Fri, 1 Mar 2019 16:49:56 +1100
Subject: [Python-ideas] Dict joining using + and +=
In-Reply-To:
References: <0433166f-4f88-93d6-327b-ed1708c17fa7@gmail.com>
	<378B4AB0-A949-462F-ADCE-10D40E3969CC@killingar.net>
Message-ID: <20190301054954.GG4465@ando.pearwood.info>

On Thu, Feb 28, 2019 at 08:59:30PM -0800, Hasan Diwan wrote:
> Do we really need a "+" and a "-" operation on dictionaries?
> [dictinstance.update({k:v}) for k,v in dictinstance.items()] does handle
> merges already.

I don't think that does what you intended. That merges dictinstance
with itself (a no-op!), but one item at a time, so in the slowest, most
inefficient way possible.

Writing a comprehension for its side-effects is an anti-pattern that
should be avoided. You are creating a (potentially large) list of Nones
which has to be created, then garbage collected.

> And I'm assuming that "-" should return the difference --
> set(d1.keys()) - set(d2.keys()), right?

No. That throws away the values associated with the keys.

P.S. As per Guido's ~~command~~ request *wink* I'm writing a PEP for
this. I should have a draft ready later this evening.

-- 
Steven

From fhsxfhsx at 126.com  Fri Mar  1 00:36:45 2019
From: fhsxfhsx at 126.com (fhsxfhsx)
Date: Fri, 1 Mar 2019 13:36:45 +0800 (CST)
Subject: [Python-ideas] Dict joining using + and +=
Message-ID: <723351c8.4df2.16937c13f9b.Coremail.fhsxfhsx@126.com>

Considering potential ambiguity, I suggest `d1.append(d2)` so we can
have an additional argument saying `d1.append(d2, mode="some mode that
tells how this function behaves")`.
If we are really to have the new syntax `d1 + d2`, I suggest leaving it
for `d1.append(d2, mode="strict")`, which raises an error when there are
duplicate keys. The semantics are natural and clear when two dicts have
no overlapping keys.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From storchaka at gmail.com  Fri Mar  1 01:29:06 2019
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Fri, 1 Mar 2019 08:29:06 +0200
Subject: [Python-ideas] Dict joining using + and +=
In-Reply-To: <5C78506E.6040600@canterbury.ac.nz>
References: <0433166f-4f88-93d6-327b-ed1708c17fa7@gmail.com>
	<378B4AB0-A949-462F-ADCE-10D40E3969CC@killingar.net>
	<5C78506E.6040600@canterbury.ac.nz>
Message-ID: 

28.02.19 23:19, Greg Ewing wrote:
> Serhiy Storchaka wrote:
>> I do not understand why we discuss a new syntax for dict merging if we
>> already have a syntax for dict merging: {**d1, **d2} (which works with
>> *all* mappings).
>
> But that always returns a dict. A '+' operator could be implemented
> by other mapping types to return a mapping of the same type.

And this opens a non-easy problem: how to create a mapping of the same
type? Not all mappings, and even not all dict subclasses, have a
copying constructor.

From ricocotam at gmail.com  Fri Mar  1 01:44:57 2019
From: ricocotam at gmail.com (Adrien Ricocotam)
Date: Fri, 1 Mar 2019 07:44:57 +0100
Subject: [Python-ideas] Dict joining using + and +=
In-Reply-To: <723351c8.4df2.16937c13f9b.Coremail.fhsxfhsx@126.com>
References: <723351c8.4df2.16937c13f9b.Coremail.fhsxfhsx@126.com>
Message-ID: 

I really like this idea. It's not obvious how to deal with key
conflicts, and I don't think overwriting with the values of the second
dict is that obviously a good behaviour.

With the current merging syntax ({**d1, **d2}) it works the same as
when you build a dict literal, so it's usually known by people. If we
add a new syntax/function, we might think of better behaviors.

IMO, and I might be wrong, merging two mappings having common keys is
an error. Thus we would need a clean way to combine two dicts. A simple
way could be adding a key function that takes the values from each
merged dict and returns the new value:

d1 = ...
d2 = ...
d1.merge(d2, key=lambda values: values[0])

That's an example; I don't like the syntax.
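[A rough sketch of the kind of merge-with-policy helper being suggested here. The name `merge` and the `combine` callback are hypothetical illustrations, not an existing dict API.]

```python
def merge(d1, d2, combine=lambda v1, v2: v2):
    """Merge two dicts; `combine` decides the value for duplicate keys.

    The default mimics dict.update(): the second dict's value wins.
    """
    result = dict(d1)
    for key, value in d2.items():
        if key in result:
            # Duplicate key: delegate the conflict to the policy callback.
            result[key] = combine(result[key], value)
        else:
            result[key] = value
    return result

# Second value wins by default, like dict.update():
merge({"a": 1, "b": 2}, {"a": 3})                              # {'a': 3, 'b': 2}

# First value wins, as key=lambda values: values[0] above suggests:
merge({"a": 1, "b": 2}, {"a": 3}, combine=lambda v1, v2: v1)   # {'a': 1, 'b': 2}
```

With such a policy argument, the "strict" mode discussed earlier is just a `combine` that raises.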
On Fri 1 Mar 2019 at 07:09, fhsxfhsx wrote:

> Considering potential ambiguity, I suggest `d1.append(d2)` so we can have
> an additional argument saying `d1.append(d2, mode="some mode that tells how
> this function behaves")`.
> If we are really to have the new syntax `d1 + d2`, I suggest leaving it
> for `d1.append(d2, mode="strict")`, which raises an error when there are
> duplicate keys. The semantics are natural and clear when two dicts have no
> overlapping keys.
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From storchaka at gmail.com  Fri Mar  1 01:47:36 2019
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Fri, 1 Mar 2019 08:47:36 +0200
Subject: [Python-ideas] Dict joining using + and +=
In-Reply-To:
References: <0433166f-4f88-93d6-327b-ed1708c17fa7@gmail.com>
	<378B4AB0-A949-462F-ADCE-10D40E3969CC@killingar.net>
Message-ID: 

01.03.19 06:21, Guido van Rossum wrote:
> On Wed, Feb 27, 2019 at 11:18 PM Serhiy Storchaka wrote:
>
>> Counter uses + for a *different* behavior!
>>
>> >>> Counter(a=2) + Counter(a=3)
>> Counter({'a': 5})
>
> Well, you can see this as a special case. The proposed + operator on
> Mappings returns a new Mapping whose keys are the union of the keys of
> the two arguments; the value is the single value for a key that occurs
> in only one of the arguments, and *somehow* combined for a key that's in
> both. The way of combining keys is up to the type of Mapping. For dict,
> the second value wins (not so different from {'a': 1, 'a': 2}, which
> becomes {'a': 2}). But for other Mappings, the combination can be done
> differently -- and Counter chooses to add the two values.

Currently Counter += dict works and Counter + dict is an error.
With this change Counter + dict will return a value, but it will be
different from the result of the += operator.

Also, if a custom dict subclass implemented the plus operator with
different semantics which support addition with a dict, this change
will break it, because dict + CustomDict will call dict.__add__ instead
of CustomDict.__radd__. Adding support for new operators to builtin
types is dangerous.

>> I do not understand why we discuss a new syntax for dict merging if we
>> already have a syntax for dict merging: {**d1, **d2} (which works with
>> *all* mappings). Does this not contradict the Zen?
>
> But (as someone else pointed out) {**d1, **d2} always returns a dict,
> not the type of d1 and d2.

And this saves us from the hard problem of creating a mapping of the
same type. Note that the reference implementations discussed above make
d1 + d2 always return a dict. dict.copy() returns a dict.

> Also, I'm sorry for PEP 448, but even if you know about **d in simpler
> contexts, if you were to ask a typical Python user how to combine two
> dicts into a new one, I doubt many people would think of {**d1, **d2}. I
> know I myself had forgotten about it when this thread started! If you
> were to ask a newbie who has learned a few things (e.g. sequence
> concatenation) they would much more likely guess d1+d2.

Perhaps the better solution is to update the documentation.

From angala.agl at gmail.com  Fri Mar  1 03:30:26 2019
From: angala.agl at gmail.com (=?UTF-8?Q?Antonio_Gal=C3=A1n?=)
Date: Fri, 1 Mar 2019 09:30:26 +0100
Subject: [Python-ideas] Add a "week" function or attribute to datetime.date
In-Reply-To:
References:
Message-ID: 

Hi, datetime.date.today() (or any other date) has attributes .year and
.month which return the year and the month of that date; it also has a
function weekday() which returns the number of the day in the week.

I think it is a good idea to add a function or attribute "week" which
returns the number of the week in the year.
It is useful to execute scripts once a week, for example.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ricocotam at gmail.com  Fri Mar  1 03:33:46 2019
From: ricocotam at gmail.com (Adrien Ricocotam)
Date: Fri, 1 Mar 2019 09:33:46 +0100
Subject: [Python-ideas] Add a "week" function or attribute to datetime.date
In-Reply-To:
References:
Message-ID: 

I like the idea. But how to distinguish it from the number of weeks
passed since the beginning of the month?

But that's great.

On Fri 1 Mar 2019 at 09:31, Antonio Galán wrote:

> Hi, datetime.date.today() (or any other date) has attributes .year and
> .month which return the year and the month of that date; it also has a
> function weekday() which returns the number of the day in the week.
>
> I think it is a good idea to add a function or attribute "week" which
> returns the number of the week in the year. It is useful to execute
> scripts once a week, for example.
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From levkivskyi at gmail.com  Fri Mar  1 04:07:04 2019
From: levkivskyi at gmail.com (Ivan Levkivskyi)
Date: Fri, 1 Mar 2019 09:07:04 +0000
Subject: [Python-ideas] Dict joining using + and +=
In-Reply-To:
References: <0433166f-4f88-93d6-327b-ed1708c17fa7@gmail.com>
	<378B4AB0-A949-462F-ADCE-10D40E3969CC@killingar.net>
Message-ID: 

On Thu, 28 Feb 2019 at 07:18, Serhiy Storchaka wrote:

> [...]
>
> I do not understand why we discuss a new syntax for dict merging if we
> already have a syntax for dict merging: {**d1, **d2} (which works with
> *all* mappings). Does this not contradict the Zen?
FWIW there are already three ways for lists/sequences:

[*x, *y]
x + y
x.extend(y)  # in-place version

We already have the first and third for dicts/mappings, so I don't see
a big problem in adding a + for dicts. Also, this is not really new
syntax, just implementing a couple of dunders for a builtin class. So I
actually like this idea.

-- 
Ivan
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From angala.agl at gmail.com  Fri Mar  1 05:40:53 2019
From: angala.agl at gmail.com (=?UTF-8?Q?Antonio_Gal=C3=A1n?=)
Date: Fri, 1 Mar 2019 11:40:53 +0100
Subject: [Python-ideas] Add a "week" function or attribute to datetime.date
In-Reply-To:
References:
Message-ID: 

The week number usually refers to the week of the year, but the week of
the month is also interesting, for example for some holidays which
depend on the week number of the month. So, in analogy with "weekday",
we can use "yearweek" and "monthweek".

On Fri, 1 March 2019 at 9:33, Adrien Ricocotam wrote:

> I like the idea. But how to distinguish it from the number of weeks
> passed since the beginning of the month?
>
> But that's great.
>
> On Fri 1 Mar 2019 at 09:31, Antonio Galán wrote:
>
>> Hi, datetime.date.today() (or any other date) has attributes .year and
>> .month which return the year and the month of that date; it also has a
>> function weekday() which returns the number of the day in the week.
>>
>> I think it is a good idea to add a function or attribute "week" which
>> returns the number of the week in the year. It is useful to execute
>> scripts once a week, for example.
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> https://mail.python.org/mailman/listinfo/python-ideas
>> Code of Conduct: http://python.org/psf/codeofconduct/
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From steve at pearwood.info  Fri Mar  1 05:44:27 2019
From: steve at pearwood.info (Steven D'Aprano)
Date: Fri, 1 Mar 2019 21:44:27 +1100
Subject: [Python-ideas] Dict joining using + and +=
In-Reply-To:
References: <0433166f-4f88-93d6-327b-ed1708c17fa7@gmail.com>
	<378B4AB0-A949-462F-ADCE-10D40E3969CC@killingar.net>
Message-ID: <20190301104427.GH4465@ando.pearwood.info>

On Fri, Mar 01, 2019 at 08:47:36AM +0200, Serhiy Storchaka wrote:

> Currently Counter += dict works and Counter + dict is an error. With
> this change Counter + dict will return a value, but it will be different
> from the result of the += operator.

That's how list.__iadd__ works too: ListSubclass + list will return a
value, but it might not be the same as += since that operates in place
and uses a different dunder method.

Why is it a problem for dicts but not a problem for lists?

> Also, if a custom dict subclass implemented the plus operator with
> different semantics which support addition with a dict, this change
> will break it, because dict + CustomDict will call dict.__add__ instead
> of CustomDict.__radd__.

That's not how operators work in Python, or at least that's not how
they worked the last time I looked: if the behaviour has changed
without discussion, that's a breaking change that should be reverted.

Obviously I can't show this with dicts, but here it is with lists:

py> class MyList(list):
...     def __radd__(self, other):
...         print("called subclass first")
...         return "Something"
...
py> [1, 2, 3] + MyList()
called subclass first
'Something'

This is normal, standard behaviour for Python operators: if the right
operand is a subclass of the left operand, the reflected method __r*__
is called first.

> Adding support for new operators to builtin types is dangerous.

Explain what makes new operators more dangerous than old operators,
please.
>>> I do not understand why we discuss a new syntax for dict merging if we
>>> already have a syntax for dict merging: {**d1, **d2} (which works with
>>> *all* mappings). Does this not contradict the Zen?
>>
>> But (as someone else pointed out) {**d1, **d2} always returns a dict,
>> not the type of d1 and d2.
>
> And this saves us from the hard problem of creating a mapping of the
> same type.

What's wrong with doing this?

new = type(self)()

Or the equivalent from C code. If that doesn't work, surely that's the
fault of the subclass, the subclass is broken, and it will raise an
exception. I don't think it is our responsibility to do anything more
than call the subclass constructor. If that's broken, then so be it.

Possibly relevant: I've always been frustrated and annoyed at classes
that hardcode their own type into methods. E.g. something like:

class X:
    def spam(self, arg):
        return X(eggs)
        # Wrong! Bad! Please use type(self) instead.

That means that each subclass has to override every method:

class MySubclass(X):
    def spam(self, arg):
        # Do nothing except change the type returned.
        return type(self)( super().spam(arg) )

This gets really annoying really quickly. Try subclassing int, for
example, where you have to override something like 30+ methods and do
nothing but wrap calls to super.

-- 
Steven

From robertve92 at gmail.com  Fri Mar  1 05:47:30 2019
From: robertve92 at gmail.com (Robert Vanden Eynde)
Date: Fri, 1 Mar 2019 11:47:30 +0100
Subject: [Python-ideas] Add a "week" function or attribute to datetime.date
In-Reply-To:
References:
Message-ID: 

Currently one can do:

week = d.isocalendar()[1]

The ISO definition of a week number has some nice properties.
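[For instance, a quick illustration of the existing stdlib API, not new behaviour:]

```python
import datetime

# 2019-03-01 (the date of this thread) falls in ISO week 9 of 2019,
# and is a Friday (ISO weekday 5; Monday is 1).
d = datetime.date(2019, 3, 1)
year, week, weekday = d.isocalendar()
print(year, week, weekday)  # 2019 9 5
```

Note that the ISO year in the first slot can differ from `d.year` near the new year, which is one of the subtleties a `week` attribute would have to document.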
robertvandeneynde.be

On Fri, 1 Mar 2019, 11:44 Antonio Galán, wrote:

> The week number usually refers to the week of the year, but the week of
> the month is also interesting, for example for some holidays which
> depend on the week number of the month. So, in analogy with "weekday",
> we can use "yearweek" and "monthweek".
>
> On Fri, 1 March 2019 at 9:33, Adrien Ricocotam wrote:
>
>> I like the idea. But how to distinguish it from the number of weeks
>> passed since the beginning of the month?
>>
>> But that's great.
>>
>> On Fri 1 Mar 2019 at 09:31, Antonio Galán wrote:
>>
>>> Hi, datetime.date.today() (or any other date) has attributes .year and
>>> .month which return the year and the month of that date; it also has a
>>> function weekday() which returns the number of the day in the week.
>>>
>>> I think it is a good idea to add a function or attribute "week" which
>>> returns the number of the week in the year. It is useful to execute
>>> scripts once a week, for example.
>>> _______________________________________________
>>> Python-ideas mailing list
>>> Python-ideas at python.org
>>> https://mail.python.org/mailman/listinfo/python-ideas
>>> Code of Conduct: http://python.org/psf/codeofconduct/
>>>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From steve at pearwood.info  Fri Mar  1 06:11:54 2019
From: steve at pearwood.info (Steven D'Aprano)
Date: Fri, 1 Mar 2019 22:11:54 +1100
Subject: [Python-ideas] Dict joining using + and +=
In-Reply-To: <9B095C8A-A040-4DA8-A41B-30C250E01524@gmail.com>
References: <0433166f-4f88-93d6-327b-ed1708c17fa7@gmail.com>
	<378B4AB0-A949-462F-ADCE-10D40E3969CC@killingar.net>
	<9B095C8A-A040-4DA8-A41B-30C250E01524@gmail.com>
Message-ID: <20190301111154.GJ4465@ando.pearwood.info>

On Thu, Feb 28, 2019 at 07:40:25AM -0500, James Lu wrote:

> I agree with Storchaka here. The advantage of existing dict merge
> syntax is that it will cause an error if the object is not a dict or
> dict-like object, thus preventing people from doing bad things.

What sort of "bad things" are you afraid of?

-- 
Steven

From songofacandy at gmail.com  Fri Mar  1 06:59:45 2019
From: songofacandy at gmail.com (INADA Naoki)
Date: Fri, 1 Mar 2019 20:59:45 +0900
Subject: [Python-ideas] Dict joining using + and +=
In-Reply-To: <0433166f-4f88-93d6-327b-ed1708c17fa7@gmail.com>
References: <0433166f-4f88-93d6-327b-ed1708c17fa7@gmail.com>
Message-ID: 

I dislike adding more operator overloads to builtin types.

str is not commutative, but it satisfies a in (a+b), and b in (a+b).
There is no loss.

In the case of dict + dict, it is not only a sum. There may be lost
values.

{"a":1} + {"a":2} = ?

In the case of a.update(b), it's clear that b wins.
In the case of a + b, "which wins" or "is an exception raised on a
duplicated key?" is unclear to me.
Regards,

On Thu, Feb 28, 2019 at 1:28 AM João Matos wrote:

> Hello,
>
> I would like to propose that instead of using this (applies to Py3.5 and
> upwards)
> dict_a = {**dict_a, **dict_b}
>
> we could use
> dict_a = dict_a + dict_b
>
> or even better
> dict_a += dict_b
>
> Best regards,
>
> João Matos
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>

-- 
INADA Naoki
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From rosuav at gmail.com  Fri Mar  1 07:47:07 2019
From: rosuav at gmail.com (Chris Angelico)
Date: Fri, 1 Mar 2019 23:47:07 +1100
Subject: [Python-ideas] Dict joining using + and +=
In-Reply-To:
References: <0433166f-4f88-93d6-327b-ed1708c17fa7@gmail.com>
Message-ID: 

On Fri, Mar 1, 2019 at 11:00 PM INADA Naoki wrote:
>
> I dislike adding more operator overloads to builtin types.
>
> str is not commutative, but it satisfies a in (a+b), and b in (a+b).
> There is no loss.
>
> In the case of dict + dict, it is not only a sum. There may be lost values.
>
> {"a":1} + {"a":2} = ?
>
> In the case of a.update(b), it's clear that b wins.
> In the case of a + b, "which wins" or "is an exception raised on a
> duplicated key?" is unclear to me.

Picking semantics can be done as part of the PEP discussion, and
needn't be a reason for rejecting the proposal before it's even made.
We have at least one other precedent to consider:

>>> {1} | {1.0}
{1}
>>> {1.0} | {1}
{1.0}

I have absolutely no doubt that these kinds of questions will be
thoroughly hashed out (multiple times, even) before the PEP gets to
pronouncement.
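[For what it's worth, the closest dict analogue of that set precedent today is `{**d1, **d2}`, which keeps the first key object it sees but lets the last value win; a sketch of current behaviour, not a proposal:]

```python
# 1 == 1.0 and they hash the same, so the merged dict has one entry.
# The key object from the first mapping survives; the value from the
# second mapping wins.
d = {**{1: "x"}, **{1.0: "y"}}
print(d)                     # {1: 'y'}
print(type(next(iter(d))))   # <class 'int'>
```

So the set precedent ("first operand's element object wins") and the dict precedent ("last value wins, first key object kept") are not quite the same rule.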
ChrisA

From songofacandy at gmail.com  Fri Mar  1 07:58:08 2019
From: songofacandy at gmail.com (INADA Naoki)
Date: Fri, 1 Mar 2019 21:58:08 +0900
Subject: [Python-ideas] Dict joining using + and +=
In-Reply-To:
References: <0433166f-4f88-93d6-327b-ed1708c17fa7@gmail.com>
Message-ID: 

On Fri, Mar 1, 2019 at 9:47 PM Chris Angelico wrote:
>
> On Fri, Mar 1, 2019 at 11:00 PM INADA Naoki wrote:
> >
> > I dislike adding more operator overloads to builtin types.
> >
> > str is not commutative, but it satisfies a in (a+b), and b in (a+b).
> > There is no loss.
> >
> > In the case of dict + dict, it is not only a sum. There may be lost values.
> >
> > {"a":1} + {"a":2} = ?
> >
> > In the case of a.update(b), it's clear that b wins.
> > In the case of a + b, "which wins" or "is an exception raised on a
> > duplicated key?" is unclear to me.
>
> Picking semantics can be done as part of the PEP discussion, and
> needn't be a reason for rejecting the proposal before it's even made.

Yes. I am just saying that no semantics seem clear to me. I am not
discussing which one is best, and I am only saying that I dislike it.
One must be free to express like or dislike, no?

> We have at least one other precedent to consider:
>
> >>> {1} | {1.0}
> {1}
> >>> {1.0} | {1}
> {1.0}

That is just because of the behavior of int and float; it is not caused
by set behavior. Sets keep "no loss" semantics when viewed through
equality:

>>> {1} <= ({1} | {1.0})
True
>>> {1.0} <= ({1} | {1.0})
True

So dict + dict is totally different from set | set: dict + dict has
loss at the equality level.

-- 
INADA Naoki

From steve at pearwood.info  Fri Mar  1 08:09:50 2019
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 2 Mar 2019 00:09:50 +1100
Subject: [Python-ideas] Dict joining using + and +=
In-Reply-To:
References: <0433166f-4f88-93d6-327b-ed1708c17fa7@gmail.com>
Message-ID: <20190301130947.GK4465@ando.pearwood.info>

On Fri, Mar 01, 2019 at 08:59:45PM +0900, INADA Naoki wrote:

> I dislike adding more operator overloads to builtin types.
> str is not commutative, but it satisfies a in (a+b), and b in (a+b).
> There is no loss.

Is this an invariant you expect to apply to other classes that support
the addition operator?

5 in (5 + 6)

[1, 2, 3] in ([1, 2, 3] + [4, 5, 6])

Since it doesn't apply to int, float, complex, list or tuple, why do
you think it must apply to dicts?

> In the case of dict + dict, it is not only a sum. There may be lost values.

Yes? Why is that a problem?

> {"a":1} + {"a":2} = ?

Would you like to argue that Counter.__add__ is a mistake for the same
reason?

Counter(('a', 1)) + Counter(('a', 2)) = ?

For the record, what I expected the above to do turned out to be
*completely wrong* when I tried it. I expected Counter({'a': 3}) but
the actual results are Counter({'a': 2, 1: 1, 2: 1}).

Every operation is going to be mysterious if you have never learned
what it does:

from array import array
a = array('i', [1, 2, 3])
b = array('i', [10, 20, 30])
a + b = ?

Without trying it or reading the docs, should that be an error, or
concatenation, or element-wise addition?

> In the case of a.update(b), it's clear that b wins.

It wasn't clear to me when I was a beginner and first came across
dict.update. I had to learn what it did by experimenting with manual
loops until it made sense to me.

> In the case of a + b, "which wins" or "is an exception raised on a
> duplicated key?" is unclear to me.

Many things are unclear to me too. That doesn't make them any less
useful.

-- 
Steven

From steve at pearwood.info  Fri Mar  1 08:18:09 2019
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 2 Mar 2019 00:18:09 +1100
Subject: [Python-ideas] Dict joining using + and +=
In-Reply-To:
References: <0433166f-4f88-93d6-327b-ed1708c17fa7@gmail.com>
Message-ID: <20190301131809.GL4465@ando.pearwood.info>

On Fri, Mar 01, 2019 at 09:58:08PM +0900, INADA Naoki wrote:

> >>> {1} <= ({1} | {1.0})
> True
> >>> {1.0} <= ({1} | {1.0})
> True
>
> So dict + dict is totally different from set | set: dict + dict has
> loss at the equality level.
Is that an invariant you expect to apply to other uses of the +
operator?

py> x = -1
py> x <= (x + x)
False

py> [999] <= ([1, 2, 3] + [999])
False

-- 
Steven

From songofacandy at gmail.com  Fri Mar  1 08:19:34 2019
From: songofacandy at gmail.com (INADA Naoki)
Date: Fri, 1 Mar 2019 22:19:34 +0900
Subject: [Python-ideas] Dict joining using + and +=
In-Reply-To: <20190301130947.GK4465@ando.pearwood.info>
References: <0433166f-4f88-93d6-327b-ed1708c17fa7@gmail.com>
	<20190301130947.GK4465@ando.pearwood.info>
Message-ID: 

On Fri, Mar 1, 2019 at 10:10 PM Steven D'Aprano wrote:
>
> On Fri, Mar 01, 2019 at 08:59:45PM +0900, INADA Naoki wrote:
> > I dislike adding more operator overloads to builtin types.
> >
> > str is not commutative, but it satisfies a in (a+b), and b in (a+b).
> > There is no loss.
>
> Is this an invariant you expect to apply to other classes that support
> the addition operator?
>
> 5 in (5 + 6)

I meant more high-level semantics: "no loss". Not only "in". So my
example about sets used the "<=" operator.

5 + 6 is the sum of 5 and 6.

> [1, 2, 3] in ([1, 2, 3] + [4, 5, 6])

Neither [1, 2, 3] nor [4, 5, 6] is lost in the result.

> Since it doesn't apply to int, float, complex, list or tuple, why do
> you think it must apply to dicts?

You misunderstood my "no loss" expectation.

> > In the case of dict + dict, it is not only a sum. There may be lost values.
>
> Yes? Why is that a problem?

It's reason enough for me to dislike it.

> > {"a":1} + {"a":2} = ?
>
> Would you like to argue that Counter.__add__ is a mistake for the same
> reason?

In Counter's case, it's clear. In the case of dict, it's unclear.

> Counter(('a', 1)) + Counter(('a', 2)) = ?
>
> For the record, what I expected the above to do turned out to be
> *completely wrong* when I tried it. I expected Counter({'a': 3}) but
> the actual results are Counter({'a': 2, 1: 1, 2: 1}).

That's just because you misunderstood Counter's initializer argument.
It's not related to how the + or | operator is overloaded.
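[The distinction being pointed out, spelled out with plain stdlib behaviour:]

```python
from collections import Counter

# Counter(iterable) counts the *elements* of the iterable,
# so the tuple ('a', 1) contributes one 'a' and one 1:
print(Counter(("a", 1)))           # Counter({'a': 1, 1: 1})

# Counter(mapping) or Counter(**kwargs) sets counts directly,
# which is what the earlier Counter(a=2) + Counter(a=3) example used:
print(Counter(a=2) + Counter(a=3))  # Counter({'a': 5})
```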
> Every operation is going to be mysterious if you have never learned
> what it does:
>
> from array import array
> a = array('i', [1, 2, 3])
> b = array('i', [10, 20, 30])
> a + b = ?
>
> Without trying it or reading the docs, should that be an error, or
> concatenation, or element-wise addition?

I never said every operator must be expected by everyone. Don't
straw-man.

-- 
INADA Naoki

From songofacandy at gmail.com  Fri Mar  1 08:47:23 2019
From: songofacandy at gmail.com (INADA Naoki)
Date: Fri, 1 Mar 2019 22:47:23 +0900
Subject: [Python-ideas] Dict joining using + and +=
In-Reply-To: <20190301131809.GL4465@ando.pearwood.info>
References: <0433166f-4f88-93d6-327b-ed1708c17fa7@gmail.com>
	<20190301131809.GL4465@ando.pearwood.info>
Message-ID: 

> Is that an invariant you expect to apply to other uses of the +
> operator?
>
> py> x = -1
> py> x <= (x + x)
> False
>
> py> [999] <= ([1, 2, 3] + [999])
> False

Please calm down. I meant that each type implements "sum" in the
semantics of the type, in a lossless way. What "lossless" means is
changed by the semantics of the type.

-1 + -1 = -2 is a sum in numerical semantics. There is no loss.

[1, 2, 3] + [999] = [1, 2, 3, 999] is a (lossless) sum in sequence
semantics.

So what about {"a": 1} + {"a": 2}? Is there a (lossless) sum in dict
semantics?

* {"a": 1} -- It seems {"a": 2} is lost in dict semantics. Should it
  really be called "sum"?
* {"a": 2} -- It seems {"a": 1} is lost in dict semantics. Should it
  really be called "sum"?
* {"a": 3} -- It seems a bit curious compared with + on sequences,
  because [2] + [3] is not [5]. It looks more like Counter than a
  container.
* ValueError -- Hmm, it looks ugly to me.

So I don't think "sum" fits dict semantics.
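[The four candidate semantics enumerated above, as small sketches; the function names are illustrative only, and none of these is an existing or proposed dict API:]

```python
def merge_last_wins(d1, d2):
    # Current {**d1, **d2} behaviour: d2's value wins.
    return {**d1, **d2}

def merge_first_wins(d1, d2):
    # Reversed unpacking makes d1's value win.
    return {**d2, **d1}

def merge_add(d1, d2):
    # Counter-like: add the values for duplicate keys (assumes numbers).
    return {k: d1.get(k, 0) + d2.get(k, 0) for k in {**d1, **d2}}

def merge_strict(d1, d2):
    # Raise on any duplicate key.
    dup = d1.keys() & d2.keys()
    if dup:
        raise ValueError(f"duplicate keys: {dup}")
    return {**d1, **d2}

merge_last_wins({"a": 1}, {"a": 2})   # {'a': 2}
merge_first_wins({"a": 1}, {"a": 2})  # {'a': 1}
merge_add({"a": 1}, {"a": 2})         # {'a': 3}
```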
Regards,

-- 
INADA Naoki

From remi.lapeyre at henki.fr  Fri Mar  1 09:06:58 2019
From: remi.lapeyre at henki.fr (=?UTF-8?Q?R=C3=A9mi_Lapeyre?=)
Date: Fri, 1 Mar 2019 06:06:58 -0800
Subject: [Python-ideas] Dict joining using + and +=
Message-ID: 

I'm having trouble understanding the semantics of d1 + d2. I think
mappings are more complicated than sequences, and some things seem not
obvious to me.

What would be OrderedDict1 + OrderedDict2? In which positions would the
resulting keys be, and which value would be used if the same key is
present in both? What would be defaultdict1 + defaultdict2?

It seems to me that subclasses of dict are complex mappings for which
"merging" may be less obvious than for sequences.

From levkivskyi at gmail.com  Fri Mar  1 09:19:08 2019
From: levkivskyi at gmail.com (Ivan Levkivskyi)
Date: Fri, 1 Mar 2019 14:19:08 +0000
Subject: [Python-ideas] Dict joining using + and +=
In-Reply-To:
References: <0433166f-4f88-93d6-327b-ed1708c17fa7@gmail.com>
	<20190301131809.GL4465@ando.pearwood.info>
Message-ID: 

On Fri, 1 Mar 2019 at 13:48, INADA Naoki wrote:

> >
> > Is that an invariant you expect to apply to other uses of the +
> > operator?
> >
> > py> x = -1
> > py> x <= (x + x)
> > False
> >
> > py> [999] <= ([1, 2, 3] + [999])
> > False
>
> Please calm down. I meant that each type implements "sum" in the
> semantics of the type, in a lossless way. What "lossless" means is
> changed by the semantics of the type.
>
> -1 + -1 = -2 is a sum in numerical semantics. There is no loss.

TBH I don't understand what is lossless about numeric addition. What is
the definition of lossless? Clearly some information is lost, since you
can't uniquely restore the two numbers you added from the result.

Unless you define what lossless means, there will be just more
misunderstandings.

-- 
Ivan
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From rhodri at kynesim.co.uk  Fri Mar  1 09:31:11 2019
From: rhodri at kynesim.co.uk (Rhodri James)
Date: Fri, 1 Mar 2019 14:31:11 +0000
Subject: [Python-ideas] Dict joining using + and +=
In-Reply-To:
References:
Message-ID: <5c186ac7-6c35-6ad5-f972-94f65fff8f52@kynesim.co.uk>

On 01/03/2019 14:06, Rémi Lapeyre wrote:
> I'm having trouble understanding the semantics of d1 + d2.

That's understandable; clouds of confusion have been raised. As far as
I can tell it's pretty straightforward:

d = d1 + d2

is equivalent to:

>>> d = d1.copy()
>>> d.update(d2)

All of your subsequent questions then become "What does
DictSubclassInQuestion.update() do?", which should be well defined.

-- 
Rhodri James *-* Kynesim Ltd

From mistersheik at gmail.com  Fri Mar  1 09:38:04 2019
From: mistersheik at gmail.com (Neil Girdhar)
Date: Fri, 1 Mar 2019 06:38:04 -0800 (PST)
Subject: [Python-ideas] Dict joining using + and +=
In-Reply-To: <20190301104427.GH4465@ando.pearwood.info>
References: <0433166f-4f88-93d6-327b-ed1708c17fa7@gmail.com>
	<378B4AB0-A949-462F-ADCE-10D40E3969CC@killingar.net>
	<20190301104427.GH4465@ando.pearwood.info>
Message-ID: <90a25967-76ce-43de-89cf-6949090c79f4@googlegroups.com>

On Friday, March 1, 2019 at 5:47:06 AM UTC-5, Steven D'Aprano wrote:
>
> On Fri, Mar 01, 2019 at 08:47:36AM +0200, Serhiy Storchaka wrote:
>
> > Currently Counter += dict works and Counter + dict is an error. With
> > this change Counter + dict will return a value, but it will be different
> > from the result of the += operator.
>
> That's how list.__iadd__ works too: ListSubclass + list will return a
> value, but it might not be the same as += since that operates in place
> and uses a different dunder method.
>
> Why is it a problem for dicts but not a problem for lists?
>
> > Also, if a custom dict subclass implemented the plus operator with
> > different semantics which support addition with a dict, this change
> > will break it, because dict + CustomDict will call dict.__add__ instead
> > of CustomDict.__radd__.
>
> That's not how operators work in Python, or at least that's not how
> they worked the last time I looked: if the behaviour has changed
> without discussion, that's a breaking change that should be reverted.
>
> Obviously I can't show this with dicts, but here it is with lists:
>
> py> class MyList(list):
> ...     def __radd__(self, other):
> ...         print("called subclass first")
> ...         return "Something"
> ...
> py> [1, 2, 3] + MyList()
> called subclass first
> 'Something'
>
> This is normal, standard behaviour for Python operators: if the right
> operand is a subclass of the left operand, the reflected method __r*__
> is called first.
>
> > Adding support for new operators to builtin types is dangerous.
>
> Explain what makes new operators more dangerous than old operators,
> please.
>
> > > I do not understand why we discuss a new syntax for dict merging if we
> > > already have a syntax for dict merging: {**d1, **d2} (which works with
> > > *all* mappings). Does this not contradict the Zen?
> >
> > But (as someone else pointed out) {**d1, **d2} always returns a dict,
> > not the type of d1 and d2.
>
> > And this saves us from the hard problem of creating a mapping of the
> > same type.
>
> What's wrong with doing this?
>
> new = type(self)()
>
> Or the equivalent from C code. If that doesn't work, surely that's the
> fault of the subclass, the subclass is broken, and it will raise an
> exception. I don't think it is our responsibility to do anything more
> than call the subclass constructor. If that's broken, then so be it.
>
> Possibly relevant: I've always been frustrated and annoyed at classes
> that hardcode their own type into methods. E.g.
something like: > > class X: > def spam(self, arg): > return X(eggs) > # Wrong! Bad! Please use type(self) instead. > > That means that each subclass has to override every method: > > class MySubclass(X): > def spam(self, arg): > # Do nothing except change the type returned. > return type(self)( super().spam(arg) ) > > > This gets really annoying really quickly. Try subclassing int, for > example, where you have to override something like 30+ methods and do > nothing but wrap calls to super. > I agree with you here. You might want to start a different thread with this idea and possibly come up with a PEP. There might be some pushback for efficiency's sake, so you might have to reel in your proposal to collections.abc mixin methods and UserDict methods. Regarding the proposal, I agree with the reasoning put forward by Guido and I like it. I think there should be: * d1 + d2 * d1 += d2 * d1 - d2 * d1 -= d2 which are roughly (ignoring steve's point about types) * {**d1, **d2} * d1.update(d2) * {k: v for k, v in d1.items() if k not in d2} * for k in list(d1): if k not in d2: del d1[k] Seeing this like this, there should be no confusion about what the operators do. I understand the points people made about the Zen of Python. However, I think that just like with lists, we tend to use l1+l2 when combining lists and [*l1, x, *l2, y] when combining lists and elements. Similarly, I think {**d1, **d2} should only be written when there are also key value pairs, like {**d1, k: v, **d2, k2: v2}. Best, Neil > > -- > Steven > _______________________________________________ > Python-ideas mailing list > Python... at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From songofacandy at gmail.com Fri Mar 1 09:38:29 2019 From: songofacandy at gmail.com (INADA Naoki) Date: Fri, 1 Mar 2019 23:38:29 +0900 Subject: [Python-ideas] Dict joining using + and += In-Reply-To: References: <0433166f-4f88-93d6-327b-ed1708c17fa7@gmail.com> <20190301131809.GL4465@ando.pearwood.info> Message-ID: Sorry, I'm not good at English enough to explain my mental model. I meant no skip, no ignorance, no throw away. In case of 1+2=3, both of 1 and 2 are not skipped, ignored or thrown away. On the other hand, in case of {a:1, b:2}+{a:2}={a:2, b:2}, I feel {a:1} is skipped, ignored, or thrown away. I used "lost" to explain it. And I used "lossless" for "there is no lost". Not for reversible. If it isn't understandable to you, please ignore me. I think Rémi's comment is very similar to my thought. Merging mapping is more complex than concatenate sequence and it seems hard to call it "sum". Regards, 2019年3月1日(金) 23:19 Ivan Levkivskyi : > On Fri, 1 Mar 2019 at 13:48, INADA Naoki wrote: > >> > >> > >> > Is that an invariant you expect to apply to other uses of the + >> > operator? >> > >> > py> x = -1 >> > py> x <= (x + x) >> > False >> > >> > py> [999] <= ([1, 2, 3] + [999]) >> > False >> > >> >> Please calm down. I meant each type implements "sum" >> in semantics of the type, in lossless way. >> What "lossless" means is changed by the semantics of the type. >> >> -1 + -1 = -2 is sum in numerical semantics. There are no loss. >> > > TBH I don't understand what is lossless about numeric addition. What is > the definition of lossless? > Clearly some information is lost, since you can't uniquely restore two > numbers you add from the result. > > Unless you define what lossless means, there will be just more > misunderstandings. > > -- > Ivan > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From stefan_ml at behnel.de Fri Mar 1 09:40:25 2019 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 1 Mar 2019 15:40:25 +0100 Subject: [Python-ideas] Dict joining using + and += In-Reply-To: References: Message-ID: Rémi Lapeyre schrieb am 01.03.19 um 15:06: > I'm having issues to understand the semantics of d1 + d2. > > I think mappings are more complicated than sequences it some things > seems not obvious to me. > > What would be OrderedDict1 + OrderedDict2, in which positions would be > the resulting keys, which value would be used if the same key is > present in both? The only reasonable answer I can come up with is: 1) unique keys from OrderedDict1 are in the same order as before 2) duplicate keys and new keys from OrderedDict2 come after the keys from d1, in their original order in d2 since they replace keys in d1. Basically, the expression says: "take a copy of d1 and add the items from d2 to it". That's exactly what you should get, whether the mappings are ordered or not (and dict are ordered by insertion in Py3.6+). > What would be defaultdict1 + defaultdict2? No surprises here, the result is a copy of defaultdict1 (using the same missing-key function) with all items from defaultdict2 added. Remember that the order of the two operands matters. The first always defines the type of the result, the second is only added to it. > It seems to me that subclasses of dict are complex mappings for which > "merging" may be less obvious than for sequences. It's the same for subclasses of sequences.
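The defaultdict case above can be sketched today with the copy/update spelling that the proposed operator would shorthand (an illustration only; `+` itself does not exist yet for dicts):

```python
from collections import defaultdict

d1 = defaultdict(int, a=1)   # missing keys default to int() == 0
d2 = {'b': 2}

d = d1.copy()                # copy() preserves d1's default_factory
d.update(d2)

print(d.default_factory)     # <class 'int'> -- still behaves like d1
print(d['missing'])          # 0
```

The result keeps the first operand's type and missing-key function, which matches the "first operand defines the result" rule described above.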
Stefan From 2QdxY4RzWzUUiLuE at potatochowder.com Fri Mar 1 09:41:13 2019 From: 2QdxY4RzWzUUiLuE at potatochowder.com (Dan Sommers) Date: Fri, 1 Mar 2019 08:41:13 -0600 Subject: [Python-ideas] Dict joining using + and += In-Reply-To: References: <0433166f-4f88-93d6-327b-ed1708c17fa7@gmail.com> <20190301131809.GL4465@ando.pearwood.info> Message-ID: <04b1a765-f97d-0935-8990-d0db8b7bce5d@potatochowder.com> On 3/1/19 8:19 AM, Ivan Levkivskyi wrote: > On Fri, 1 Mar 2019 at 13:48, INADA Naoki wrote: > >> > >> > >> > Is that an invariant you expect to apply to other uses of the + >> > operator? >> > >> > py> x = -1 >> > py> x <= (x + x) >> > False >> > >> > py> [999] <= ([1, 2, 3] + [999]) >> > False >> > >> >> Please calm down. I meant each type implements "sum" >> in semantics of the type, in lossless way. >> What "lossless" means is changed by the semantics of the type. >> >> -1 + -1 = -2 is sum in numerical semantics. There are no loss. >> > > TBH I don't understand what is lossless about numeric addition. What is the > definition of lossless? > Clearly some information is lost, since you can't uniquely restore two > numbers you add from the result. > > Unless you define what lossless means, there will be just more > misunderstandings. I don't mean to put words into anyone's mouth, but I think I see what IDANA Naoki means: in other cases of summation, the result somehow includes or contains both operands. In the case of summing dicts, though, some of the operands are "lost" in the process. I'm sure that I'm nowhere near as prolific as many of the members of this list, but I don't remember ever merging dicts (and a quick grep of my Python source tree confirms same), so I won't comment further on the actual issue at hand. From eric at trueblade.com Fri Mar 1 09:49:02 2019 From: eric at trueblade.com (Eric V. 
Smith) Date: Fri, 1 Mar 2019 09:49:02 -0500 Subject: [Python-ideas] Dict joining using + and += In-Reply-To: References: <0433166f-4f88-93d6-327b-ed1708c17fa7@gmail.com> <20190301131809.GL4465@ando.pearwood.info> Message-ID: <484229fd-5034-aa2d-533e-02f4a388a920@trueblade.com> On 3/1/2019 9:38 AM, INADA Naoki wrote: > Sorry, I'm not good at English enough to explain my mental model. > > I meant no skip, no ignorance, no throw away. > > In case of 1+2=3, both of 1 and 2 are not skipped, ignored or thrown away. > > On the other hand, in case of {a:1, b:2}+{a:2}={a:2, b:2}, I feel {a:1} > is skipped, ignored, or thrown away.? I used "lost" to explain it. > > And I used "lossless" for "there is no lost".? Not for reversible. > > If it isn't understandable to you, please ignore me. > > I think R?mi?s comment is very similar to my thought.? Merging mapping > is more complex than concatenate sequence and it seems hard to call it > "sum". I understand Inada to be saying that each value on the LHS (as shown above) affects the result on the RHS. That's the case with addition of ints and other types, but not so with the proposed dict addition. As he says, the {a:1} doesn't affect the result. The result would be the same if this key wasn't present in the first dict, or if the key had a different value. This doesn't bother me, personally. I'm just trying to clarify. Eric > > Regards, > > > 2019?3?1?(?) 23:19 Ivan Levkivskyi >: > > On Fri, 1 Mar 2019 at 13:48, INADA Naoki > wrote: > > > > > > > Is that an invariant you expect to apply to other uses of the + > > operator? > > > > py> x = -1 > > py> x <= (x + x) > > False > > > > py> [999] <= ([1, 2, 3] + [999]) > > False > > > > Please calm down.? I meant each type implements "sum" > in semantics of the type, in lossless way. > What "lossless" means is changed by the semantics of the type. > > -1 + -1 = -2 is sum in numerical semantics.? There are no loss. > > > TBH I don't understand what is lossless about numeric addition. 
What > is the definition of lossless? > Clearly some information is lost, since you can't uniquely restore > two numbers you add from the result. > > Unless you define what lossless means, there will be just more > misunderstandings. > > -- > Ivan > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > From stefan_ml at behnel.de Fri Mar 1 10:03:13 2019 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 1 Mar 2019 16:03:13 +0100 Subject: [Python-ideas] Dict joining using + and += In-Reply-To: References: Message-ID: Rémi Lapeyre schrieb am 01.03.19 um 15:50: > Le 1 mars 2019 à 15:41:52, Stefan Behnel a écrit: > >> Rémi Lapeyre schrieb am 01.03.19 um 15:06: >>> I'm having issues to understand the semantics of d1 + d2. >>> >>> I think mappings are more complicated than sequences it some things >>> seems not obvious to me. >>> >>> What would be OrderedDict1 + OrderedDict2, in which positions would be >>> the resulting keys, which value would be used if the same key is >>> present in both? >> >> The only reasonable answer I can come up with is: >> >> 1) unique keys from OrderedDict1 are in the same order as before >> 2) duplicate keys and new keys from OrderedDict2 come after the keys from >> d1, in their original order in d2 since they replace keys in d1. >> >> Basically, the expression says: "take a copy of d1 and add the items from >> d2 to it". That's exactly what you should get, whether the mappings are >> ordered or not (and dict are ordered by insertion in Py3.6+). 
> > Thanks Stefan for your feedback, unless I?m mistaken this does not work like > Rhodri suggested, he said: > > I can tell it's pretty straightforward: > > d = d1 + d2 is equivalent to: > > >>> d = d1.copy() > >>> d.update(d2) > > But doing this: > > >>> d1 = OrderedDict({"a": 1, "b": 2, "c": 3}) > >>> d2 = OrderedDict({"d": 4, "b": 5}) > >>> d = d1.copy() > >>> d.update(d2) > >>> d > OrderedDict([('a', 1), ('b', 5), ('c', 3), ('d', 4)]) > > It looks like that the semantics are either not straightforward or what you > proposed is not the only reasonable answer. Am I missing something? No, I was, apparently. In Py3.7: >>> d1 = {"a": 1, "b": 2, "c": 3} >>> d1 {'a': 1, 'b': 2, 'c': 3} >>> d2 = {"d": 4, "b": 5} >>> d = d1.copy() >>> d.update(d2) >>> d {'a': 1, 'b': 5, 'c': 3, 'd': 4} I think the behaviour makes sense when you know how it's implemented (keys are stored separately from values). I would have been less surprised if the keys had also been reordered, but well, this is how it is now in Py3.6+, so this is how it's going to work also for the operator. No *additional* surprises here. ;) Stefan From levkivskyi at gmail.com Fri Mar 1 10:32:03 2019 From: levkivskyi at gmail.com (Ivan Levkivskyi) Date: Fri, 1 Mar 2019 15:32:03 +0000 Subject: [Python-ideas] Dict joining using + and += In-Reply-To: <484229fd-5034-aa2d-533e-02f4a388a920@trueblade.com> References: <0433166f-4f88-93d6-327b-ed1708c17fa7@gmail.com> <20190301131809.GL4465@ando.pearwood.info> <484229fd-5034-aa2d-533e-02f4a388a920@trueblade.com> Message-ID: On Fri, 1 Mar 2019 at 14:50, Eric V. Smith wrote: > On 3/1/2019 9:38 AM, INADA Naoki wrote: > > Sorry, I'm not good at English enough to explain my mental model. > > > > I meant no skip, no ignorance, no throw away. > > > > In case of 1+2=3, both of 1 and 2 are not skipped, ignored or thrown > away. > > > > On the other hand, in case of {a:1, b:2}+{a:2}={a:2, b:2}, I feel {a:1} > > is skipped, ignored, or thrown away. I used "lost" to explain it. 
> > > > And I used "lossless" for "there is no lost". Not for reversible. > > > > If it isn't understandable to you, please ignore me. > > > > I think R?mi?s comment is very similar to my thought. Merging mapping > > is more complex than concatenate sequence and it seems hard to call it > > "sum". > > I understand Inada to be saying that each value on the LHS (as shown > above) affects the result on the RHS. That's the case with addition of > ints and other types, but not so with the proposed dict addition. As he > says, the {a:1} doesn't affect the result. The result would be the same > if this key wasn't present in the first dict, or if the key had a > different value. > > This doesn't bother me, personally. I'm just trying to clarify. > OK, thanks for explaining! So more formally speaking, you want to say that for other examples of '+' in Python x1 + y == x2 + y if and only if x1 == x2, while for the proposed '+' for dicts there may be many different x_i such that x_i + y gives the same result. This doesn't bother me either, since this is not a critical requirement for addition. I would say this is rather a coincidence than a conscious decision. -- Ivan -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan_ml at behnel.de Fri Mar 1 10:34:31 2019 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 1 Mar 2019 16:34:31 +0100 Subject: [Python-ideas] Dict joining using + and += In-Reply-To: <484229fd-5034-aa2d-533e-02f4a388a920@trueblade.com> References: <0433166f-4f88-93d6-327b-ed1708c17fa7@gmail.com> <20190301131809.GL4465@ando.pearwood.info> <484229fd-5034-aa2d-533e-02f4a388a920@trueblade.com> Message-ID: Eric V. Smith schrieb am 01.03.19 um 15:49: > I understand Inada to be saying that each value on the LHS (as shown above) > affects the result on the RHS. That's the case with addition of ints and > other types, but not so with the proposed dict addition. As he says, the > {a:1} doesn't affect the result. 
The result would be the same if this key > wasn't present in the first dict, or if the key had a different value. > > This doesn't bother me, personally. +1 Stefan From songofacandy at gmail.com Fri Mar 1 10:56:41 2019 From: songofacandy at gmail.com (INADA Naoki) Date: Sat, 2 Mar 2019 00:56:41 +0900 Subject: [Python-ideas] Dict joining using + and += In-Reply-To: References: <0433166f-4f88-93d6-327b-ed1708c17fa7@gmail.com> <20190301131809.GL4465@ando.pearwood.info> <484229fd-5034-aa2d-533e-02f4a388a920@trueblade.com> Message-ID: > > OK, thanks for explaining! So more formally speaking, you want to say that for other examples of '+' in Python > x1 + y == x2 + y if and only if x1 == x2, while for the proposed '+' for dicts there may be many different x_i such that > x_i + y gives the same result. > It's a bit different from what I had in mind. I'm OK to violate "x1 + y == x2 + y if and only if x1 == x2", if it's not important for the semantics of the types of x1, x2, and y. A mapping is defined by its key: value pairs. That's the core part. I don't want to call an operator that loses key: value pairs a "sum". That's why I thought this proposal is a more serious abuse of the + operator. By the way, in the case of sequences, `len(a) + len(b) == len(a + b)`. In the case of sets, `len(a) + len(b) >= len(a | b)`. The proposed operation looks more similar to `set | set` than to `seq + seq` from this point of view. I don't propose | over +. I just mean the difference between dict.update() and seq+seq is not smaller than the difference between dict.update() and set|set. If | seems not to fit this operation, + does not fit it either. -- INADA Naoki From stefan_ml at behnel.de Fri Mar 1 10:57:32 2019 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 1 Mar 2019 16:57:32 +0100 Subject: [Python-ideas] Dict joining using + and += In-Reply-To: References: Message-ID: Rémi Lapeyre schrieb am 01.03.19 um 16:44: > Le 1 mars 2019 à 
16:04:47, Stefan Behnel a écrit: >> I think the behaviour makes sense when you know how it's implemented (keys >> are stored separately from values). > > Is a Python user expected to know the implementation details of all mappings > though? No, it just helps _me_ in explaining the behaviour to myself. Feel free to look it up in the documentation if you prefer. >> I would have been less surprised if the >> keys had also been reordered, but well, this is how it is now in Py3.6+, so >> this is how it's going to work also for the operator. >> >> No *additional* surprises here. ;) > > There are never any surprises left once all the details have been carefully worked > out, but having `+` for mappings makes it look like an easy operation whose > meaning is unambiguous and obvious. > > I'm still not convinced that the meaning is obvious, and gave an example > in my other message where I think it could be ambiguous. What I meant was that it's obvious in the sense that it is no new behaviour at all. It just provides an operator for behaviour that is already there. We are not discussing the current behaviour here. That ship has long sailed with the release of Python 3.6 beta 1 back in September 2016. The proposal that is being discussed here is the new operator. Stefan From steve at pearwood.info Fri Mar 1 11:26:45 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 2 Mar 2019 03:26:45 +1100 Subject: [Python-ideas] PEP: Dict addition and subtraction Message-ID: <20190301162645.GM4465@ando.pearwood.info> Attached is a draft PEP on adding + and - operators to dict for discussion. This should probably go here: https://github.com/python/peps but due to technical difficulties at my end, I'm very limited in what I can do on Github (at least for now). If there's anyone who would like to co-author and/or help with the process, that will be appreciated.
-- Steven -------------- next part -------------- ====================================== PEP-xxxx Dict addition and subtraction ====================================== **DRAFT** -- This is a draft document for discussion. Abstract -------- This PEP suggests adding merge ``+`` and difference ``-`` operators to the built-in ``dict`` class. The merge operator will have the same relationship to the ``dict.update`` method as the list concatenation operator has to ``list.extend``, with dict difference being defined analogously. Examples -------- Dict addition will return a new dict containing the left operand merged with the right operand. >>> d = {'spam': 1, 'eggs': 2, 'cheese': 3} >>> e = {'cheese': 'cheddar', 'aardvark': 'Ethel'} >>> d + e {'spam': 1, 'eggs': 2, 'cheese': 'cheddar', 'aardvark': 'Ethel'} >>> e + d {'cheese': 3, 'aardvark': 'Ethel', 'spam': 1, 'eggs': 2} The augmented assignment version operates in-place. >>> d += e >>> print(d) {'spam': 1, 'eggs': 2, 'cheese': 'cheddar', 'aardvark': 'Ethel'} Analogously with list addition, the operator version is more restrictive, and requires that both arguments are dicts, while the augmented assignment version allows anything the ``update`` method allows, such as iterables of key/value pairs. >>> d + [('spam', 999)] Traceback (most recent call last): ... TypeError: can only merge dict (not "list") to dict >>> d += [('spam', 999)] >>> print(d) {'spam': 999, 'eggs': 2, 'cheese': 'cheddar', 'aardvark': 'Ethel'} Dict difference ``-`` will return a new dict containing the items from the left operand which are not in the right operand. >>> d = {'spam': 1, 'eggs': 2, 'cheese': 3} >>> e = {'cheese': 'cheddar', 'aardvark': 'Ethel'} >>> d - e {'spam': 1, 'eggs': 2} >>> e - d {'aardvark': 'Ethel'} Augmented assignment will operate in place. 
>>> d -= e >>> print(d) {'spam': 1, 'eggs': 2} Like the merge operator and list concatenation, the difference operator requires both operands to be dicts, while the augmented version allows any iterable of keys. >>> d - {'spam', 'parrot'} Traceback (most recent call last): ... TypeError: cannot take the difference of dict and set >>> d -= {'spam', 'parrot'} >>> print(d) {'eggs': 2} >>> d = {'spam': 1, 'eggs': 2, 'cheese': 3} >>> d -= ['spam', 'cheese'] >>> print(d) {'eggs': 2} Semantics --------- For the merge operator, if a key appears in both operands, the last-seen value (i.e. that from the right-hand operand) wins. This shows that dict addition is not commutative; in general ``d + e`` will not equal ``e + d``. This joins a number of other non-commutative addition operators among the builtins, including lists, tuples, strings and bytes. Having the last-seen value win makes the merge operator match the semantics of the ``update`` method, so that ``d + e`` is an operator version of ``d.update(e)`` applied to a copy of ``d``. The error messages shown above are not part of the API, and may change at any time. Rejected semantics ~~~~~~~~~~~~~~~~~~ Rejected alternative semantics for ``d + e`` include: - Add only new keys from ``e``, without overwriting existing keys in ``d``. This may be done by reversing the operands ``e + d``, or using dict difference first, ``d + (e - d)``. The latter is especially useful for the in-place version ``d += (e - d)``. - Raise an exception if there are duplicate keys. This seems unnecessarily restrictive and is not likely to be useful in practice. For example, updating default configuration values with user-supplied values would most often fail under the requirement that keys are unique:: prefs = site_defaults + user_defaults + document_prefs - Add the values of d2 to the corresponding values of d1. This is the behaviour implemented by ``collections.Counter``.
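For comparison, the existing ``Counter`` behaviour can be contrasted with the proposed last-seen-wins semantics using only today's Python (a sketch, not part of the specification):

```python
from collections import Counter

c1 = Counter(spam=1, eggs=2)
c2 = Counter(spam=10)
print(c1 + c2)          # Counter({'spam': 11, 'eggs': 2}) -- values are added

d1, d2 = dict(c1), dict(c2)
print({**d1, **d2})     # {'spam': 10, 'eggs': 2} -- last-seen value wins
```

So ``Counter.__add__`` already gives ``+`` a different, value-combining meaning, which is why it would keep its own semantics under this proposal.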
Syntax ------ An alternative to the ``+`` operator is the pipe ``|`` operator, which is used for set union. This suggestion did not receive much support on Python-Ideas. The ``+`` operator was strongly preferred on Python-Ideas.[1] It is more familiar than the pipe operator, matches nicely with ``-`` as a pair, and the Counter subclass already uses ``+`` for merging. Current Alternatives -------------------- To create a new dict containing the merged items of two (or more) dicts, one can currently write:: {**d1, **d2} but this is neither obvious nor easily discoverable. It is only guaranteed to work if the keys are all strings. If the keys are not strings, it currently works in CPython, but it may not work with other implementations, or future versions of CPython[2]. It is also limited to returning a built-in dict, not a subclass, unless re-written as ``MyDict(**d1, **d2)``, in which case non-string keys will raise TypeError. There is currently no way to perform dict subtraction except through a manual loop. Implementation -------------- The implementation will be in C. (The author of this PEP would like to make it known that he is not able to write the implemention.) An approximate pure-Python implementation of the merge operator will be:: def __add__(self, other): if isinstance(other, dict): new = type(self)() # May be a subclass of dict. new.update(self) new.update(other) return new return NotImplemented def __radd__(self, other): if isinstance(other, dict): new = type(other)() new.update(other) new.update(self) return new return NotImplemented Note that the result type will be the type of the left operand; in the event of matching keys, the winner is the right operand. Augmented assignment will just call the ``update`` method. This is analogous to the way ``list +=`` calls the ``extend`` method, which accepts any iterable, not just lists. 
def __iadd__(self, other): self.update(other) An approximate pure-Python implementation of the difference operator will be:: def __sub__(self, other): if isinstance(other, dict): new = type(self)() for k in self: if k not in other: new[k] = self[k] return new return NotImplemented def __rsub__(self, other): if isinstance(other, dict): new = type(other)() for k in other: if k not in self: new[k] = other[k] return new return NotImplemented Augmented assignment will operate on equivalent terms to ``update``. If the operand has a key method, it will be used, otherwise the operand will be iterated over:: def __isub__(self, other): if hasattr(other, 'keys'): for k in other.keys(): if k in self: del self[k] else: for k in other: if k in self: del self[k] These semantics are intended to match those of ``update`` as closely as possible. For the dict built-in itself, calling ``keys`` is redundant as iteration over a dict iterates over its keys; but for subclasses or other mappings, ``update`` prefers to use the keys method. .. attention:: The above paragraph may be inaccurate. Although the dict docstring states that ``keys`` will be called if it exists, this does not seem to be the case for dict subclasses. Bug or feature? Contra-indications ------------------ (Or when to avoid using these new operators.) For merging multiple dicts, the ``d1 + d2 + d3 + d4 + ...`` idiom will suffer from the same unfortunate O(N\*\*2) Big Oh performance as does list and tuple addition, and for similar reasons. If one expects to be merging a large number of dicts where performance is an issue, it may be better to use an explicit loop and in-place merging:: new = {} for d in many_dicts: new += d This is unlikely to be a problem in practice as most uses of the merge operator are expected to only involve a small number of dicts. Similarly, most uses of list and tuple concatenation only use a few objects. 
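The in-place merging loop above can be written today with ``update`` standing in for the proposed ``+=`` (a sketch only, since the operator does not yet exist):

```python
def merge_all(dicts):
    """Merge an iterable of dicts left to right in O(total size),
    avoiding the O(N**2) cost of repeated d1 + d2 + d3 + ... copies."""
    new = {}
    for d in dicts:
        new.update(d)   # stand-in for the proposed ``new += d``
    return new

print(merge_all([{'a': 1}, {'b': 2}, {'a': 3}]))   # {'a': 3, 'b': 2}
```

Duplicate keys keep their first insertion position but take the last-seen value, matching the proposed merge semantics.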
Using the dict augmented assignment operators on a dict inside a tuple (or other immutable data structure) will lead to the same problem that occurs with list concatenation[3], namely the in-place addition will succeed, but the operation will raise an exception. >>> a_tuple = ({'spam': 1, 'eggs': 2}, None) >>> a_tuple[0] += {'spam': 999} Traceback (most recent call last): ... TypeError: 'tuple' object does not support item assignment >>> a_tuple[0] {'spam': 999, 'eggs': 2} Similar remarks apply to the ``-`` operator. Other discussions ----------------- `Latest discussion which motivated this PEP `_ `Ticket on the bug tracker `_ `A previous discussion `_ and `commentary on it `_. Note that the author of this PEP was skeptical of this proposal at the time. `How to merge dictionaries `_ in idiomatic Python. Open questions -------------- Should these operators be part of the ABC ``Mapping`` API? References ---------- [1] Guido's declaration that plus wins over pipe: https://mail.python.org/pipermail/python-ideas/2019-February/055519.html [2] Non-string keys: https://bugs.python.org/issue35105 and https://mail.python.org/pipermail/python-dev/2018-October/155435.html [3] Behaviour in tuples: https://docs.python.org/3/faq/programming.html#why-does-a-tuple-i-item-raise-an-exception-when-the-addition-works Copyright --------- This document has been placed in the public domain. Local Variables: mode: indented-text indent-tabs-mode: nil sentence-end-double-space: t fill-column: 70 coding: utf-8 End: From songofacandy at gmail.com Fri Mar 1 11:47:37 2019 From: songofacandy at gmail.com (INADA Naoki) Date: Sat, 2 Mar 2019 01:47:37 +0900 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <20190301162645.GM4465@ando.pearwood.info> References: <20190301162645.GM4465@ando.pearwood.info> Message-ID: > If the keys are not strings, it currently works in CPython, but it may not work with other implementations, or future versions of CPython[2]. I don't think so. 
https://bugs.python.org/issue35105 and https://mail.python.org/pipermail/python-dev/2018-October/155435.html are about kwargs. I think non string keys are allowed for {**d1, **d2} by language. -- INADA Naoki From brandtbucher at gmail.com Fri Mar 1 11:48:56 2019 From: brandtbucher at gmail.com (Brandt Bucher) Date: Fri, 1 Mar 2019 08:48:56 -0800 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <20190301162645.GM4465@ando.pearwood.info> References: <20190301162645.GM4465@ando.pearwood.info> Message-ID: <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> I?ve never been part of this process before, but I?m interested in learning and helping any way I can. My addition implementation is attached to the bpo, and I?m working today on bringing it in line with the PEP in its current form (specifically, subtraction operations). https://github.com/python/cpython/pull/12088 Brandt > On Mar 1, 2019, at 08:26, Steven D'Aprano wrote: > > Attached is a draft PEP on adding + and - operators to dict for > discussion. > > This should probably go here: > > https://github.com/python/peps > > but due to technical difficulties at my end, I'm very limited in what I > can do on Github (at least for now). If there's anyone who would like to > co-author and/or help with the process, that will be appreciated. > > > -- > Steven > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From eric at trueblade.com Fri Mar 1 11:52:10 2019 From: eric at trueblade.com (Eric V. 
Smith) Date: Fri, 1 Mar 2019 11:52:10 -0500 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <20190301162645.GM4465@ando.pearwood.info> References: <20190301162645.GM4465@ando.pearwood.info> Message-ID: <750892ed-4a36-c8d2-57db-a0f6a6cd3d9c@trueblade.com> Hi, Steven. I can help you with it. I added it as PEP 584. I had to add the PEP headers, but didn't do any other editing. I'm going to be out of town for the next 2 weeks, so I might be slow in responding. Eric On 3/1/2019 11:26 AM, Steven D'Aprano wrote: > Attached is a draft PEP on adding + and - operators to dict for > discussion. > > This should probably go here: > > https://github.com/python/peps > > but due to technical difficulties at my end, I'm very limited in what I > can do on Github (at least for now). If there's anyone who would like to > co-author and/or help with the process, that will be appreciated. > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > From mistersheik at gmail.com Fri Mar 1 12:05:54 2019 From: mistersheik at gmail.com (Neil Girdhar) Date: Fri, 1 Mar 2019 09:05:54 -0800 (PST) Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <20190301162645.GM4465@ando.pearwood.info> References: <20190301162645.GM4465@ando.pearwood.info> Message-ID: <2e10ef29-7d39-4c97-8a7a-8e85f24745fa@googlegroups.com> Looks like a good start. I think you should replace all of the lines: if isinstance(other, dict): with if isinstance(self, type(other)): Since if other is an instance of a dict subclass, he should be the one to process the addition. On the other hand, if self is an instance of the derived type, then we are free to do the combination. 
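Neil's suggested dispatch check could look like this in pure Python (``MyDict`` is a hypothetical subclass for illustration, not the proposed C implementation):

```python
class MyDict(dict):
    def __add__(self, other):
        # Only handle the merge if self is an instance of other's type,
        # i.e. self's type can represent both operands; otherwise defer
        # (return NotImplemented) so a more derived operand can handle it.
        if isinstance(self, type(other)):
            new = type(self)(self)
            new.update(other)
            return new
        return NotImplemented

m = MyDict(a=1)
print(m + {'a': 2, 'b': 3})   # {'a': 2, 'b': 3}, as a MyDict
```

With ``isinstance(other, dict)`` instead, ``MyDict + EvenMoreDerivedDict`` would be claimed by ``MyDict`` even when the more derived type should get a chance via ``__radd__``.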
I think you should also change this wording: "the result type will be the type of the left operand" since the result type will be negotiated between the operands (even in your implemenation). __sub__ can be implemented more simply as a dict comprehension. Don't forget to return self in __isub__ and __iadd__ or they won't work. I think __isub__ would be simpler like this: def __isub__(self, it): if it is self: self.clear() else: for value in it: del self[value] return self I don't see why you would bother looking for keys (iter will do that anyway). On Friday, March 1, 2019 at 11:27:54 AM UTC-5, Steven D'Aprano wrote: > > Attached is a draft PEP on adding + and - operators to dict for > discussion. > > This should probably go here: > > https://github.com/python/peps > > but due to technical difficulties at my end, I'm very limited in what I > can do on Github (at least for now). If there's anyone who would like to > co-author and/or help with the process, that will be appreciated. > > > -- > Steven > -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Fri Mar 1 14:08:19 2019 From: guido at python.org (Guido van Rossum) Date: Fri, 1 Mar 2019 11:08:19 -0800 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <750892ed-4a36-c8d2-57db-a0f6a6cd3d9c@trueblade.com> References: <20190301162645.GM4465@ando.pearwood.info> <750892ed-4a36-c8d2-57db-a0f6a6cd3d9c@trueblade.com> Message-ID: Thanks -- FYI I renamed the file to .rst (per convention for PEPs in ReST format) and folded long text lines. On Fri, Mar 1, 2019 at 8:53 AM Eric V. Smith wrote: > Hi, Steven. > > I can help you with it. I added it as PEP 584. I had to add the PEP > headers, but didn't do any other editing. > > I'm going to be out of town for the next 2 weeks, so I might be slow in > responding. > > Eric > > On 3/1/2019 11:26 AM, Steven D'Aprano wrote: > > Attached is a draft PEP on adding + and - operators to dict for > > discussion. 
> > > > This should probably go here: > > > > https://github.com/python/peps > > > > but due to technical difficulties at my end, I'm very limited in what I > > can do on Github (at least for now). If there's anyone who would like to > > co-author and/or help with the process, that will be appreciated. > > > > > > > > _______________________________________________ > > Python-ideas mailing list > > Python-ideas at python.org > > https://mail.python.org/mailman/listinfo/python-ideas > > Code of Conduct: http://python.org/psf/codeofconduct/ > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Fri Mar 1 14:31:30 2019 From: guido at python.org (Guido van Rossum) Date: Fri, 1 Mar 2019 11:31:30 -0800 Subject: [Python-ideas] Dict joining using + and += In-Reply-To: References: <0433166f-4f88-93d6-327b-ed1708c17fa7@gmail.com> <378B4AB0-A949-462F-ADCE-10D40E3969CC@killingar.net> <5C78506E.6040600@canterbury.ac.nz> Message-ID: On Thu, Feb 28, 2019 at 10:30 PM Serhiy Storchaka wrote: > 28.02.19 23:19, Greg Ewing ????: > > Serhiy Storchaka wrote: > >> I do not understand why we discuss a new syntax for dict merging if we > >> already have a syntax for dict merging: {**d1, **d2} (which works with > >> *all* mappings). > > > > But that always returns a dict. A '+' operator could be implemented > > by other mapping types to return a mapping of the same type. > > And this opens a non-easy problem: how to create a mapping of the same > type? Not all mappings, and even not all dict subclasses have a copying > constructor. > There's a compromise solution for this possible. 
We already do this for Sequence and MutableSequence: Sequence does *not* define __add__, but MutableSequence *does* define __iadd__, and the default implementation just calls self.extend(other). I propose the same for Mapping (do nothing) and MutableMapping: make the default __iadd__ implementation call self.update(other). Looking at the code for Counter, its __iadd__ and __add__ behave subtly different than Counter.update(): __iadd__ and __add__ (and __radd__) drop values that are <= 0, while update() does not. That's all fine -- Counter is not bound by the exact same semantics as dict (starting with its update() method, which adds values rather than overwriting). Anyways, the main reason to prefer d1+d2 over {**d1, **d2} is that the latter is highly non-obvious except if you've already encountered that pattern before, while d1+d2 is what anybody familiar with other Python collection types would guess or propose. And the default semantics for subclasses of dict that don't override these are settled with the "d = d1.copy(); d.update(d2)" equivalence. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Fri Mar 1 14:41:41 2019 From: brett at python.org (Brett Cannon) Date: Fri, 1 Mar 2019 11:41:41 -0800 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> Message-ID: On Fri, Mar 1, 2019 at 8:50 AM Brandt Bucher wrote: > I've never been part of this process before, but I'm interested in > learning and helping any way I can. > Thanks! > > My addition implementation is attached to the bpo, and I'm working today > on bringing it in line with the PEP in its current form (specifically, > subtraction operations).
> > https://github.com/python/cpython/pull/12088 > When your proposed patch is complete, Brandt, just ask Steven to update the PEP to mention that there's a proposed implementation attached to the issue tracking the idea. -Brett > > > Brandt > > On Mar 1, 2019, at 08:26, Steven D'Aprano wrote: > > Attached is a draft PEP on adding + and - operators to dict for > discussion. > > This should probably go here: > > https://github.com/python/peps > > but due to technical difficulties at my end, I'm very limited in what I > can do on Github (at least for now). If there's anyone who would like to > co-author and/or help with the process, that will be appreciated. > > > -- > Steven > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg.ewing at canterbury.ac.nz Fri Mar 1 17:33:31 2019 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 02 Mar 2019 11:33:31 +1300 Subject: [Python-ideas] Dict joining using + and += In-Reply-To: References: <0433166f-4f88-93d6-327b-ed1708c17fa7@gmail.com> <378B4AB0-A949-462F-ADCE-10D40E3969CC@killingar.net> <5C78506E.6040600@canterbury.ac.nz> Message-ID: <5C79B33B.8020003@canterbury.ac.nz> Serhiy Storchaka wrote: > And this opens a non-easy problem: how to create a mapping of the same > type? That's the responsibility of the class implementing the + operator. There doesn't have to be any guarantee that a subclass of it will automatically return an instance of the subclass (many existing types provide no such guarantee, e.g. 
+ on strings), so whatever strategy it uses doesn't have to be part of its public API. -- Greg From brandtbucher at gmail.com Fri Mar 1 19:10:44 2019 From: brandtbucher at gmail.com (Brandt Bucher) Date: Fri, 1 Mar 2019 16:10:44 -0800 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> Message-ID: While working through my implementation, I've come across a couple of inconsistencies with the current proposal: > The merge operator will have the same relationship to the dict.update method as the list concatenation operator has to list.extend, with dict difference being defined analogously. I like this premise. += for lists *behaves* like extend, and += for dicts *behaves* like update. However, later in the PEP it says: > Augmented assignment will just call the update method. This is analogous to the way list += calls the extend method, which accepts any iterable, not just lists. In your Python implementation samples from the PEP, dict subclasses will behave differently from how list subclasses do. List subclasses, without overrides, return *list* objects for bare "+" operations (and "+=" won't call an overridden "extend" method). So a more analogous pseudo-implementation (if that's what we seek) would look like:

    def __add__(self, other):
        if isinstance(other, dict):
            new = dict.copy(self)
            dict.update(new, other)
            return new
        return NotImplemented

    def __radd__(self, other):
        if isinstance(other, dict):
            new = dict.copy(other)
            dict.update(new, self)
            return new
        return NotImplemented

    def __iadd__(self, other):
        if isinstance(other, dict):
            dict.update(self, other)
            return self
        return NotImplemented

This is what my C looks like right now. We can choose to update these semantics to be "nicer" to subclasses, but I don't see any precedent for it (lists, sets, strings, etc.).
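The list precedent mentioned here can be checked directly; a minimal sketch:

```python
class MyList(list):
    pass

a = MyList([1, 2])
b = MyList([3])

# Bare "+" on a list subclass returns a plain list, dropping the subclass.
result = a + b
print(type(result))               # <class 'list'>
print(isinstance(result, MyList))  # False

# "+=" mutates in place, so the subclass type is preserved.
a += b
print(type(a).__name__)           # MyList
```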
Brandt On Fri, Mar 1, 2019 at 11:41 AM Brett Cannon wrote: > > > On Fri, Mar 1, 2019 at 8:50 AM Brandt Bucher > wrote: > >> I?ve never been part of this process before, but I?m interested in >> learning and helping any way I can. >> > > Thanks! > > >> >> My addition implementation is attached to the bpo, and I?m working today >> on bringing it in line with the PEP in its current form (specifically, >> subtraction operations). >> >> https://github.com/python/cpython/pull/12088 >> > > When your proposed patch is complete, Brandt, just ask Steven to update > the PEP to mention that there's a proposed implementation attached to the > issue tracking the idea. > > -Brett > > >> >> >> Brandt >> >> On Mar 1, 2019, at 08:26, Steven D'Aprano wrote: >> >> Attached is a draft PEP on adding + and - operators to dict for >> discussion. >> >> This should probably go here: >> >> https://github.com/python/peps >> >> but due to technical difficulties at my end, I'm very limited in what I >> can do on Github (at least for now). If there's anyone who would like to >> co-author and/or help with the process, that will be appreciated. >> >> >> -- >> Steven >> >> >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mistersheik at gmail.com Fri Mar 1 20:59:25 2019 From: mistersheik at gmail.com (Neil Girdhar) Date: Fri, 1 Mar 2019 20:59:25 -0500 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> Message-ID: I think that sequence should be fixed. On Fri., Mar. 1, 2019, 7:12 p.m. Brandt Bucher, wrote: > While working through my implementation, I've come across a couple of > inconsistencies with the current proposal: > > > The merge operator will have the same relationship to the dict.update > method as the list concatenation operator has to list.extend, with dict > difference being defined analogously. > > I like this premise. += for lists *behaves* like extend, and += for dicts > *behaves* like update. > > However, later in the PEP it says: > > > Augmented assignment will just call the update method. This is > analogous to the way list += calls the extend method, which accepts any > iterable, not just lists. > > In your Python implementation samples from the PEP, dict subclasses will > behave differently from how list subclasses do. List subclasses, without > overrides, return *list* objects for bare "+" operations (and "+=" won't > call an overridden "extend" method). So a more analogous > pseudo-implementation (if that's what we seek) would look like: > > def __add__(self, other): > if isinstance(other, dict): > new = dict.copy(self) > dict.update(new, other) > return new > return NotImplemented > > def __radd__(self, other): > if isinstance(other, dict): > new = dict.copy(other) > dict.update(other, self) > return new > return NotImplemented > > def __iadd__(self, other): > if isinstance(other, dict): > dict.update(self, other) > return self > return NotImplemented > > This is what my C looks like right now. We can choose to update these semantics to be "nicer" to subclasses, but I don't see any precedent for it (lists, sets, strings, etc.). 
> > Brandt > > > On Fri, Mar 1, 2019 at 11:41 AM Brett Cannon wrote: > >> >> >> On Fri, Mar 1, 2019 at 8:50 AM Brandt Bucher >> wrote: >> >>> I?ve never been part of this process before, but I?m interested in >>> learning and helping any way I can. >>> >> >> Thanks! >> >> >>> >>> My addition implementation is attached to the bpo, and I?m working today >>> on bringing it in line with the PEP in its current form (specifically, >>> subtraction operations). >>> >>> https://github.com/python/cpython/pull/12088 >>> >> >> When your proposed patch is complete, Brandt, just ask Steven to update >> the PEP to mention that there's a proposed implementation attached to the >> issue tracking the idea. >> >> -Brett >> >> >>> >>> >>> Brandt >>> >>> On Mar 1, 2019, at 08:26, Steven D'Aprano wrote: >>> >>> Attached is a draft PEP on adding + and - operators to dict for >>> discussion. >>> >>> This should probably go here: >>> >>> https://github.com/python/peps >>> >>> but due to technical difficulties at my end, I'm very limited in what I >>> can do on Github (at least for now). If there's anyone who would like to >>> co-author and/or help with the process, that will be appreciated. >>> >>> >>> -- >>> Steven >>> >>> >>> >>> _______________________________________________ >>> Python-ideas mailing list >>> Python-ideas at python.org >>> https://mail.python.org/mailman/listinfo/python-ideas >>> Code of Conduct: http://python.org/psf/codeofconduct/ >>> >>> _______________________________________________ >>> Python-ideas mailing list >>> Python-ideas at python.org >>> https://mail.python.org/mailman/listinfo/python-ideas >>> Code of Conduct: http://python.org/psf/codeofconduct/ >>> >> -- > > --- > You received this message because you are subscribed to a topic in the > Google Groups "python-ideas" group. > To unsubscribe from this topic, visit > https://groups.google.com/d/topic/python-ideas/jq5QVTt3CAI/unsubscribe. 
> To unsubscribe from this group and all its topics, send an email to > python-ideas+unsubscribe at googlegroups.com. > For more options, visit https://groups.google.com/d/optout. > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > -- > > --- > You received this message because you are subscribed to a topic in the > Google Groups "python-ideas" group. > To unsubscribe from this topic, visit > https://groups.google.com/d/topic/python-ideas/jq5QVTt3CAI/unsubscribe. > To unsubscribe from this group and all its topics, send an email to > python-ideas+unsubscribe at googlegroups.com. > For more options, visit https://groups.google.com/d/optout. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Fri Mar 1 22:52:25 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 2 Mar 2019 14:52:25 +1100 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> Message-ID: <20190302035224.GO4465@ando.pearwood.info> Executive summary: - I'm going to argue for subclass-preserving behaviour; - I'm not wedded to the idea that dict += should actually call the update method, so long as it has the same behaviour; - __iadd__ has no need to return NotImplemented or type-check its argument. Details below. On Fri, Mar 01, 2019 at 04:10:44PM -0800, Brandt Bucher wrote: [...] > In your Python implementation samples from the PEP, dict subclasses will > behave differently from how list subclasses do. 
List subclasses, without > overrides, return *list* objects for bare "+" operations

Right -- and I think they are wrong to do so, for reasons I explained here:

https://mail.python.org/pipermail/python-ideas/2019-March/055547.html

I think the standard handling of subclasses in Python builtins is wrong, and I don't wish to emulate that wrong behaviour without a really good reason. Or at least a better reason than "other methods break subclassing unless explicitly overloaded, so this should do so too". Or at least not without a fight :-)

> (and "+=" won't call an overridden "extend" method).

I'm slightly less opinionated about that. Looking more closely into the docs, I see that they don't actually say that += calls list.extend:

    s.extend(t)    extends s with the contents of t (for
    or s += t      the most part the same as s[len(s):len(s)] = t)

https://docs.python.org/3/library/stdtypes.html#mutable-sequence-types

only that they have the same effect. So the wording re lists calling extend certainly needs to be changed. But that doesn't mean that we must change the implementation. We have a choice:

- regardless of what lists do, we define += for dicts as literally calling dict.update; the more I think about it, the less I like this.

- Or we say that += behaves similarly to update, without actually calling the method. I think I prefer this.

(The second implies that += either contains a duplicate of the update logic, or that += and update both delegate to a private, C-level function that does most of the work.)

I think that the second approach (define += as having the equivalent semantics of update but without actually calling the update method) is probably better. That decouples the two methods, allows subclasses to change one without necessarily changing the other.
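That lists already behave this way -- += matching extend's behaviour without calling the method -- can be verified with a small sketch (hypothetical subclass name):

```python
class NoisyList(list):
    def extend(self, other):
        # If += called the method, this would blow up.
        raise RuntimeError("extend() was called")

x = NoisyList([1])
x += [2, 3]   # succeeds: += has extend's semantics without calling extend()
print(type(x).__name__, list(x))   # NoisyList [1, 2, 3]
```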
> So a more analogous > pseudo-implementation (if that's what we seek) would look like:

> def __add__(self, other):
>     if isinstance(other, dict):
>         new = dict.copy(self)
>         dict.update(new, other)
>         return new
>     return NotImplemented

We should not require the copy method.

The PEP should be more explicit that the approximate implementation does not imply the copy() and update() methods are actually called.

> def __iadd__(self, other):
>     if isinstance(other, dict):
>         dict.update(self, other)
>         return self
>     return NotImplemented

I don't agree with that implementation.

According to PEP 203, which introduced augmented assignment, the sequence of calls in ``d += e`` is:

1. Try to call ``d.__iadd__(e)``.
2. If __iadd__ is not present, try ``d.__add__(e)``.
3. If __add__ is missing too, try ``e.__radd__(d)``.

but my tests suggest this is inaccurate. I think the correct behaviour is this:

1. Try to call ``d.__iadd__(e)``.
2. If __iadd__ is not present, or if it returns NotImplemented, try ``d.__add__(e)``.
3. If __add__ is missing too, or if it returns NotImplemented, fail with TypeError.

In other words, e.__radd__ is not used. We don't want dict.__iadd__ to try calling __add__, since the latter is more restrictive and less efficient than the in-place merge. So there is no need for __iadd__ to return NotImplemented. It should either succeed on its own, or fail hard:

    def __iadd__(self, other):
        self.update(other)
        return self

Except that the actual C implementation won't call the update method itself, but will follow the same semantics. See the docstring for dict.update for details of what is accepted by update.
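For reference, a sketch of the input shapes dict.update accepts (and which += would therefore accept under these semantics):

```python
d = {'a': 1}
d.update({'b': 2})      # another mapping
d.update([('c', 3)])    # an iterable of key/value pairs
d.update(e=4)           # keyword arguments
print(d)                # {'a': 1, 'b': 2, 'c': 3, 'e': 4}
```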
-- Steven From raymond.hettinger at gmail.com Sat Mar 2 14:14:18 2019 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Sat, 2 Mar 2019 11:14:18 -0800 Subject: [Python-ideas] Dict joining using + and += In-Reply-To: References: <0433166f-4f88-93d6-327b-ed1708c17fa7@gmail.com> <378B4AB0-A949-462F-ADCE-10D40E3969CC@killingar.net> <5C78506E.6040600@canterbury.ac.nz> Message-ID: > On Mar 1, 2019, at 11:31 AM, Guido van Rossum wrote: > > There's a compromise solution for this possible. We already do this for Sequence and MutableSequence: Sequence does *not* define __add__, but MutableSequence *does* define __iadd__, and the default implementation just calls self.update(other). I propose the same for Mapping (do nothing) and MutableMapping: make the default __iadd__ implementation call self.update(other). Usually, it's easy to add methods to classes without creating disruption, but ABCs are more problematic. If MutableMapping grows an __iadd__() method, what would that mean for existing classes that register as MutableMapping but don't already implement __iadd__? When "isinstance(m, MutableMapping)" returns True, is it a promise that the API is fully implemented? Is this something that mypy could would or should complain about? > Anyways, the main reason to prefer d1+d2 over {**d1, **d2} is that the latter is highly non-obvious except if you've already encountered that pattern before I concur. The latter is also an eyesore and almost certain to be a stumbling block when reading code. That said, I'm not sure we actually need a short-cut for "d=e.copy(); d.update(f)". Code like this comes-up for me perhaps once a year. Having a plus operator on dicts would likely save me five seconds per year. If the existing code were in the form of "d=e.copy(); d.update(f); d.update(g); d.update(h)", converting it to "d = e + f + g + h" would be a tempting but algorithmically poor thing to do (because the behavior is quadratic). 
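A rough cost model (a sketch, not a benchmark) of why the chained + version is quadratic, assuming each + copies the accumulated result plus the next operand into a fresh dict:

```python
def items_copied(sizes):
    # Each "+" copies the accumulated result, then the next operand.
    copied = total = 0
    for size in sizes:
        copied += total + size
        total += size
    return copied

print(items_copied([100] * 4))    # 1000 items copied to build a 400-key dict
print(items_copied([100] * 40))   # 82000 items copied for 10x the input
```

The work grows with the square of the number of operands, while a single update-based pass (or a ChainMap) copies each item at most once.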
Most likely, the right thing to do would be "d = ChainMap(e, f, g, h)" for a zero-copy solution or "d = dict(ChainMap(e, f, g, h))" to flatten the result without incurring quadratic costs. Both of those are short and clear. Lastly, I'm still bugged by use of the + operator for replace-logic instead of additive-logic. With numbers and lists and Counters, the plus operator creates a new object where all the contents of each operand contribute to the result. With dicts, some of the contents for the left operand get thrown-away. This doesn't seem like addition to me (IIRC that is also why sets have "|" instead of "+"). Raymond From francismb at email.de Sat Mar 2 15:25:05 2019 From: francismb at email.de (francismb) Date: Sat, 2 Mar 2019 21:25:05 +0100 Subject: [Python-ideas] Dict joining using + and += In-Reply-To: References: <0433166f-4f88-93d6-327b-ed1708c17fa7@gmail.com> <378B4AB0-A949-462F-ADCE-10D40E3969CC@killingar.net> <5C78506E.6040600@canterbury.ac.nz> Message-ID: <05b38d6c-2da7-47c9-b008-71033f34b9b9@email.de> On 3/2/19 8:14 PM, Raymond Hettinger wrote: > Lastly, I'm still bugged by use of the + operator for replace-logic instead of additive-logic. With numbers and lists and Counters, the plus operator creates a new object where all the contents of each operand contribute to the result. With dicts, some of the contents for the left operand get thrown-away. This doesn't seem like addition to me (IIRC that is also why sets have "|" instead of "+"). +1, it's a good point. IMHO the proposed (meaning) overloading for + and += is too much/unclear. 
If the idea is to 'join' dicts why not to use "d.join(...here the other dicts ...)" Regards, --francis From francismb at email.de Sat Mar 2 17:02:27 2019 From: francismb at email.de (francismb) Date: Sat, 2 Mar 2019 23:02:27 +0100 Subject: [Python-ideas] Dict joining using + and += In-Reply-To: <8899b5c1-0b56-e04b-6437-89fa71e6fa25@mrabarnett.plus.com> References: <0433166f-4f88-93d6-327b-ed1708c17fa7@gmail.com> <8899b5c1-0b56-e04b-6437-89fa71e6fa25@mrabarnett.plus.com> Message-ID: <590f3ecc-6106-8f73-4b22-5174a131d3ae@email.de> On 2/27/19 7:14 PM, MRAB wrote: > Are there any advantages of using '+' over '|'? or for e.g. '<=' (d1 <= d2) over '+' (d1 + d2) From python at mrabarnett.plus.com Sat Mar 2 17:11:09 2019 From: python at mrabarnett.plus.com (MRAB) Date: Sat, 2 Mar 2019 22:11:09 +0000 Subject: [Python-ideas] Dict joining using + and += In-Reply-To: <590f3ecc-6106-8f73-4b22-5174a131d3ae@email.de> References: <0433166f-4f88-93d6-327b-ed1708c17fa7@gmail.com> <8899b5c1-0b56-e04b-6437-89fa71e6fa25@mrabarnett.plus.com> <590f3ecc-6106-8f73-4b22-5174a131d3ae@email.de> Message-ID: On 2019-03-02 22:02, francismb wrote: > > On 2/27/19 7:14 PM, MRAB wrote: >> Are there any advantages of using '+' over '|'? > or for e.g. '<=' (d1 <= d2) over '+' (d1 + d2) > '<=' is for comparison, less-than-or-equal (in the case of sets, subset, which is sort of the same kind of thing). Using it for anything else in Python would be too confusing. From brandtbucher at gmail.com Sat Mar 2 18:38:17 2019 From: brandtbucher at gmail.com (Brandt Bucher) Date: Sat, 2 Mar 2019 15:38:17 -0800 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <20190302035224.GO4465@ando.pearwood.info> References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> Message-ID: Hi Steven. Thanks for the clarifications. I've pushed a complete working patch (with tests) to GitHub. It's linked to the bpo issue. 
Branch: https://github.com/brandtbucher/cpython/tree/addiction PR: https://github.com/python/cpython/pull/12088 Right now, it's pretty much a straight reimplementation of your Python examples. I plan to update it periodically to keep it in sync with any changes, and to make a few optimizations (for example, when operands are identical or empty). Let me know if you have any questions/suggestions. Stoked to learn and help out with this process! :) Brandt On Fri, Mar 1, 2019 at 7:57 PM Steven D'Aprano wrote: > Executive summary: > > - I'm going to argue for subclass-preserving behaviour; > > - I'm not wedded to the idea that dict += should actually call the > update method, so long as it has the same behaviour; > > - __iadd__ has no need to return NotImplemented or type-check its > argument. > > Details below. > > > On Fri, Mar 01, 2019 at 04:10:44PM -0800, Brandt Bucher wrote: > > [...] > > In your Python implementation samples from the PEP, dict subclasses will > > behave differently from how list subclasses do. List subclasses, without > > overrides, return *list* objects for bare "+" operations > > Right -- and I think they are wrong to do so, for reasons I explained > here: > > https://mail.python.org/pipermail/python-ideas/2019-March/055547.html > > I think the standard handling of subclasses in Python builtins is wrong, > and I don't wish to emulate that wrong behaviour without a really good > reason. Or at least a better reason than "other methods break > subclassing unless explicitly overloaded, so this should do so too". > > Or at least not without a fight :-) > > > > > (and "+=" won't call an overridden "extend" method). > > I'm slightly less opinionated about that. 
Looking more closely into the > docs, I see that they don't actually say that += calls list.extend: > > s.extend(t) extends s with the contents of t (for > or s += t the most part the same as s[len(s):len(s)] = t) > > https://docs.python.org/3/library/stdtypes.html#mutable-sequence-types > > only that they have the same effect. So the wording re lists calling > extend certainly needs to be changed. But that doesn't mean that we must > change the implementation. We have a choice: > > - regardless of what lists do, we define += for dicts as literally > calling dict.update; the more I think about it, the less I like this. > > - Or we say that += behaves similarly to update, without actually > calling the method. I think I prefer this. > > (The second implies either that += either contains a duplicate of the > update logic, or that += and update both delegate to a private, C-level > function that does most of the work.) > > I think that the second approach (define += as having the equivalent > semantics of update but without actually calling the update method) is > probably better. That decouples the two methods, allows subclasses to > change one without necessarily changing the other. > > > > So a more analogous > > pseudo-implementation (if that's what we seek) would look like: > > > > def __add__(self, other): > > if isinstance(other, dict): > > new = dict.copy(self) > > dict.update(new, other) > > return new > > return NotImplemented > > We should not require the copy method. > > The PEP should be more explicit that the approximate implementation does > not imply the copy() and update() methods are actually called. > > > > def __iadd__(self, other): > > if isinstance(other, dict): > > dict.update(self, other) > > return self > > return NotImplemented > > I don't agree with that implementation. > > According to PEP 203, which introduced augmented assignment, the > sequence of calls in ``d += e`` is: > > 1. Try to call ``d.__iadd__(e)``. > > 2. 
If __iadd__ is not present, try ``d.__add__(e)``. > > 3. If __add__ is missing too, try ``e.__radd__(d)``. > > but my tests suggest this is inaccurate. I think the correct behaviour > is this: > > 1. Try to call ``d.__iadd__(e)``. > > 2. If __iadd__ is not present, or if it returns NotImplemented, > try ``d.__add__(e)``. > > 3. If __add__ is missing too, or if it returns NotImplemented, > fail with TypeError. > > In other words, e.__radd__ is not used. > > We don't want dict.__iadd__ to try calling __add__, since the later is > more restrictive and less efficient than the in-place merge. So there is > no need for __iadd__ to return NotImplemented. It should either succeed > on its own, or fail hard: > > def __iadd__(self, other): > self.update(other) > return self > > Except that the actual C implementation won't call the update method > itself, but will follow the same semantics. > > See the docstring for dict.update for details of what is accepted by > update. > > > -- > Steven > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From francismb at email.de Sun Mar 3 08:27:36 2019 From: francismb at email.de (francismb) Date: Sun, 3 Mar 2019 14:27:36 +0100 Subject: [Python-ideas] Dict joining using + and += In-Reply-To: References: <0433166f-4f88-93d6-327b-ed1708c17fa7@gmail.com> <8899b5c1-0b56-e04b-6437-89fa71e6fa25@mrabarnett.plus.com> <590f3ecc-6106-8f73-4b22-5174a131d3ae@email.de> Message-ID: <4a26631c-43c6-ff59-7765-dbf9a8d0b26b@email.de> On 3/2/19 11:11 PM, MRAB wrote: > '<=' is for comparison, less-than-or-equal (in the case of sets, subset, > which is sort of the same kind of thing). Using it for anything else in > Python would be too confusing. 
Understandable, so the proposed (meaning) overloading for <= is also too much/unclear. From francismb at email.de Sun Mar 3 08:36:37 2019 From: francismb at email.de (francismb) Date: Sun, 3 Mar 2019 14:36:37 +0100 Subject: [Python-ideas] Dict joining using + and += In-Reply-To: <8899b5c1-0b56-e04b-6437-89fa71e6fa25@mrabarnett.plus.com> References: <0433166f-4f88-93d6-327b-ed1708c17fa7@gmail.com> <8899b5c1-0b56-e04b-6437-89fa71e6fa25@mrabarnett.plus.com> Message-ID: <766ac408-4821-872c-2cbf-c7f5d20d247f@email.de> On 2/27/19 7:14 PM, MRAB wrote: > Are there any advantages of using '+' over '|'? or '<-' (d1 <- d2) meaning merge priority (overriding policy for equal keys) on the right dict, and maybe '->' (d1 -> d2) merge priority on the left dict over '+' (d1 + d2)? E.g.:

    >>> d1 = {'a':1, 'b':1 }
    >>> d2 = {'a':2 }
    >>> d3 = d1 -> d2
    >>> d3
    {'a':1, 'b':1 }

    >>> d1 = {'a':1, 'b':1 }
    >>> d2 = {'a':2 }
    >>> d3 = d1 <- d2
    >>> d3
    {'a':2, 'b':1 }

Regards, --francis From francismb at email.de Sun Mar 3 09:46:24 2019 From: francismb at email.de (francismb) Date: Sun, 3 Mar 2019 15:46:24 +0100 Subject: [Python-ideas] Left arrow and right arrow operators Message-ID: <4fd9518d-3707-7f9c-751a-b2eefa172634@email.de> Hi, the idea here is just to add the __larrow__ and __rarrow__ operators for <- and ->. E.g.
of use on dicts:

    >>> d1 = {'a':1, 'b':1 }
    >>> d2 = {'a':2 }
    >>> d3 = d1 -> d2
    >>> d3
    {'a':1, 'b':1 }

    >>> d1 = {'a':1, 'b':1 }
    >>> d2 = {'a':2 }
    >>> d3 = d1 <- d2
    >>> d3
    {'a':2, 'b':1 }

Or on bools as Modus Ponens [1]. Or your idea/imagination here :-) Regards, --francis [1] https://en.wikipedia.org/wiki/Modus_ponens From phd at phdru.name Sun Mar 3 10:06:38 2019 From: phd at phdru.name (Oleg Broytman) Date: Sun, 3 Mar 2019 16:06:38 +0100 Subject: [Python-ideas] Left arrow and right arrow operators In-Reply-To: <4fd9518d-3707-7f9c-751a-b2eefa172634@email.de> References: <4fd9518d-3707-7f9c-751a-b2eefa172634@email.de> Message-ID: <20190303150638.zv24yqah625uiypj@phdru.name> On Sun, Mar 03, 2019 at 03:46:24PM +0100, francismb wrote: > Hi, > the idea here is just to add the __larrow__ and __rarrow__ operators for > <- and ->. You cannot create operator ``<-`` because it's currently valid syntax:

    3 <- 2

is equivalent to

    3 < -2

> Regards, > --francis Oleg. -- Oleg Broytman https://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN. From jamtlu at gmail.com Sun Mar 3 21:28:30 2019 From: jamtlu at gmail.com (James Lu) Date: Sun, 3 Mar 2019 21:28:30 -0500 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <20190302035224.GO4465@ando.pearwood.info> References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> Message-ID: <98F8539E-D119-4299-B289-1EDA729FF10B@gmail.com> I propose that the + sign merge two python dictionaries such that if there are conflicting keys, a KeyError is thrown. This way, d1 + d2 isn't just another obvious way to do {**d1, **d2}. The second syntax makes it clear that a new dictionary is being constructed and that d2 overrides keys from d1.
One can reasonably expect or imagine a situation where a section of code that expects to merge two dictionaries with non-conflicting keys commits a semantic error if it merges two dictionaries with conflicting keys. To better explain, imagine a program where options is a global variable storing parsed values from the command line.

def verbose_options():
    if options.verbose:
        return {'verbose': True}

def quiet_options():
    if options.quiet:
        return {'verbose': False}

If we were to define an options() function, return {**quiet_options(), **verbose_options()} implies that verbose overrules quiet; whereas return quiet_options() + verbose_options() implies that verbose and quiet cannot be used simultaneously. I am not aware of another easy way in Python to merge dictionaries while checking for non-conflicting keys. Compare:

def settings():
    return {**quiet_options(), **verbose_options()}

def settings():
    try:
        return quiet_options() + verbose_options()
    except KeyError:
        print('conflicting options used', file=sys.stderr)
        sys.exit(1)

*** This is a simple scenario, but you can imagine more complex ones as well. Does --quiet-stage-1 loosen --verbose? Does --quiet-stage-1 conflict with --verbose-stage-1? Does --verbosity=5 override --verbosity=4 or cause an error? Having {**, **} and + do different things provides a convenient and Pythonic way to model such relationships in code. Indeed, you can even combine the two syntaxes in the same expression to show a mix of overriding and exclusionary behavior. Anyways, I think it's a good idea to have this semantic difference in behavior so Python developers have a good way to communicate what is expected of the two dictionaries being merged inside the language. This is like an assertion without Again, I propose that the + sign merge two Python dictionaries such that if there are conflicting keys, a KeyError is thrown, because such "non-conflicting merge" behavior would be useful in Python. It gives clarifying power to the + sign.
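The strict merge being proposed here can be sketched as a plain helper function with today's syntax (the name `merge_strict` is purely illustrative, not part of the proposal):

```python
def merge_strict(d1, d2):
    """Merge two dicts, raising KeyError on any conflicting key."""
    conflicts = d1.keys() & d2.keys()
    if conflicts:
        raise KeyError(f"conflicting keys: {sorted(conflicts)}")
    return {**d1, **d2}

# Non-conflicting keys merge normally:
merged = merge_strict({'verbose': True}, {'color': 'auto'})

# merge_strict({'verbose': True}, {'verbose': False}) raises KeyError
```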
The + and the {**, **} syntaxes should serve different roles. In other words, explicit + is better than implicit {**, **}, unless explicitly suppressed. Here + is explicit, whereas {**, **} implicitly allows inclusive keys, and the KeyError is expressly suppressed by virtue of not using the {**, **} syntax. People expect the + operator to be commutative, while the {**, **} syntax prompts further examination by virtue of its "weird" syntax. From songofacandy at gmail.com Sun Mar 3 21:53:27 2019 From: songofacandy at gmail.com (INADA Naoki) Date: Mon, 4 Mar 2019 11:53:27 +0900 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <20190301162645.GM4465@ando.pearwood.info> References: <20190301162645.GM4465@ando.pearwood.info> Message-ID: I think the "Current Alternatives" section must refer to the long-existing idiom, in addition to {**d1, **d2}: d3 = d1.copy() d3.update(d2) It is obvious and easily discoverable, while it takes two lines. "There is no obvious way" and "there is at least one obvious way" are very different. On Sat, Mar 2, 2019 at 1:27 AM Steven D'Aprano wrote: > > Attached is a draft PEP on adding + and - operators to dict for > discussion. > > This should probably go here: > > https://github.com/python/peps > > but due to technical difficulties at my end, I'm very limited in what I > can do on Github (at least for now). If there's anyone who would like to > co-author and/or help with the process, that will be appreciated.
> > > -- > Steven > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -- INADA Naoki From fhsxfhsx at 126.com Sun Mar 3 23:56:34 2019 From: fhsxfhsx at 126.com (fhsxfhsx) Date: Mon, 4 Mar 2019 12:56:34 +0800 (CST) Subject: [Python-ideas] Left arrow and right arrow operators In-Reply-To: <4fd9518d-3707-7f9c-751a-b2eefa172634@email.de> References: <4fd9518d-3707-7f9c-751a-b2eefa172634@email.de> Message-ID: <7b267f9b.48c7.169470f8936.Coremail.fhsxfhsx@126.com> I wonder if it is necessary to add two new operators, and for me, "arrow operator" is not clearer than `+`. Could you explain why you prefer this operator over `+`? Also -> is a symbol of propositional logic, like ∧ and ∨; do we also need these operators as well? At 2019-03-03 22:46:24, "francismb" wrote: >Hi, >the idea here is just to add the __larrow__ and __rarrow__ operators for ><- and ->. > > >E.g. of use on dicts : >>>> d1 = {'a':1, 'b':1 } >>>> d2 = {'a':2 } >>>> d3 = d1 -> d2 >>>> d3 >{'a':1, 'b':1 } > >>>> d1 = {'a':1, 'b':1 } >>>> d2 = {'a':2 } >>>> d3 = d1 <- d2 >>>> d3 >{'a':2, 'b':1 } > >Or on bools as Modus Ponens [1] > >Or your idea/imagination here :-) > > > >Regards, >--francis > >[1] https://en.wikipedia.org/wiki/Modus_ponens > > >_______________________________________________ >Python-ideas mailing list >Python-ideas at python.org >https://mail.python.org/mailman/listinfo/python-ideas >Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed...
URL: From stefan_ml at behnel.de Mon Mar 4 03:41:21 2019 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 4 Mar 2019 09:41:21 +0100 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <98F8539E-D119-4299-B289-1EDA729FF10B@gmail.com> References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <98F8539E-D119-4299-B289-1EDA729FF10B@gmail.com> Message-ID: James Lu wrote on 04.03.19 at 03:28: > I propose that the + sign merge two python dictionaries such that if there are conflicting keys, a KeyError is thrown. Please, no. That would be really annoying. If you need that feature, it can become a new method on dicts. Stefan From ijkl at netc.fr Mon Mar 4 04:12:09 2019 From: ijkl at netc.fr (Jimmy Girardet) Date: Mon, 4 Mar 2019 10:12:09 +0100 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <20190302035224.GO4465@ando.pearwood.info> References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> Message-ID: <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> Hi, I'm not old on this list, but every time there is a proposal, the answer is "what are you trying to solve?". Since z = {**x, **y} and z.update(y) exist, I can't find the answer. On 02/03/2019 at 04:52, Steven D'Aprano wrote: > Executive summary: > > - I'm going to argue for subclass-preserving behaviour; > > - I'm not wedded to the idea that dict += should actually call the > update method, so long as it has the same behaviour; > > - __iadd__ has no need to return NotImplemented or type-check its > argument. > > Details below. > > > On Fri, Mar 01, 2019 at 04:10:44PM -0800, Brandt Bucher wrote: > > [...] >> In your Python implementation samples from the PEP, dict subclasses will >> behave differently from how list subclasses do.
List subclasses, without >> overrides, return *list* objects for bare "+" operations > Right -- and I think they are wrong to do so, for reasons I explained > here: > > https://mail.python.org/pipermail/python-ideas/2019-March/055547.html > > I think the standard handling of subclasses in Python builtins is wrong, > and I don't wish to emulate that wrong behaviour without a really good > reason. Or at least a better reason than "other methods break > subclassing unless explicitly overloaded, so this should do so too". > > Or at least not without a fight :-) > > > >> (and "+=" won't call an overridden "extend" method). > I'm slightly less opinionated about that. Looking more closely into the > docs, I see that they don't actually say that += calls list.extend: > > s.extend(t) or s += t -- extends s with the contents of t > (for the most part the same as s[len(s):len(s)] = t) > > https://docs.python.org/3/library/stdtypes.html#mutable-sequence-types > > only that they have the same effect. So the wording re lists calling > extend certainly needs to be changed. But that doesn't mean that we must > change the implementation. We have a choice: > > - regardless of what lists do, we define += for dicts as literally > calling dict.update; the more I think about it, the less I like this. > > - Or we say that += behaves similarly to update, without actually > calling the method. I think I prefer this. > > (The second implies that += either contains a duplicate of the > update logic, or that += and update both delegate to a private, C-level > function that does most of the work.) > > I think that the second approach (define += as having the equivalent > semantics of update but without actually calling the update method) is > probably better. That decouples the two methods, allows subclasses to > change one without necessarily changing the other.
> > >> So a more analogous >> pseudo-implementation (if that's what we seek) would look like: >> >> def __add__(self, other): >> if isinstance(other, dict): >> new = dict.copy(self) >> dict.update(new, other) >> return new >> return NotImplemented > We should not require the copy method. > > The PEP should be more explicit that the approximate implementation does > not imply the copy() and update() methods are actually called. > > >> def __iadd__(self, other): >> if isinstance(other, dict): >> dict.update(self, other) >> return self >> return NotImplemented > I don't agree with that implementation. > > According to PEP 203, which introduced augmented assignment, the > sequence of calls in ``d += e`` is: > > 1. Try to call ``d.__iadd__(e)``. > > 2. If __iadd__ is not present, try ``d.__add__(e)``. > > 3. If __add__ is missing too, try ``e.__radd__(d)``. > > but my tests suggest this is inaccurate. I think the correct behaviour > is this: > > 1. Try to call ``d.__iadd__(e)``. > > 2. If __iadd__ is not present, or if it returns NotImplemented, > try ``d.__add__(e)``. > > 3. If __add__ is missing too, or if it returns NotImplemented, > fail with TypeError. > > In other words, e.__radd__ is not used. > > We don't want dict.__iadd__ to try calling __add__, since the latter is > more restrictive and less efficient than the in-place merge. So there is > no need for __iadd__ to return NotImplemented. It should either succeed > on its own, or fail hard: > > def __iadd__(self, other): > self.update(other) > return self > > Except that the actual C implementation won't call the update method > itself, but will follow the same semantics. > > See the docstring for dict.update for details of what is accepted by > update. > > -------------- next part -------------- An HTML attachment was scrubbed...
URL: From stefan_ml at behnel.de Mon Mar 4 04:51:58 2019 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 4 Mar 2019 10:51:58 +0100 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> Message-ID: Jimmy Girardet wrote on 04.03.19 at 10:12: > I'm not old on this list, but every time there is a proposal, the answer > is "what are you trying to solve?". > > Since > > z = {**x, **y} and z.update(y) exist, I can't find the answer. I think the main intention is to close a gap in the language. [1,2,3] + [4,5,6] works for lists and tuples, {1,2,3} | {4,5,6} works for sets, but joining two dicts isn't simply {1:2, 3:4} + {5:6} but requires either some obscure syntax or a statement instead of a simple expression. The proposal is to enable the obvious syntax for something that should be obvious. Stefan From songofacandy at gmail.com Mon Mar 4 05:15:14 2019 From: songofacandy at gmail.com (INADA Naoki) Date: Mon, 4 Mar 2019 19:15:14 +0900 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> Message-ID: On Mon, Mar 4, 2019 at 6:52 PM Stefan Behnel wrote: > > I think the main intention is to close a gap in the language. > > [1,2,3] + [4,5,6] > > works for lists and tuples, > > {1,2,3} | {4,5,6} > > works for sets, but joining two dicts isn't simply > > {1:2, 3:4} + {5:6} > Operators are syntax borrowed from math. * Operators are used for concatenation and repetition (Kleene star) in regular languages.
https://en.wikipedia.org/wiki/Regular_language seq + seq and seq * N are very similar to it, although Python used + instead of the middle dot (not in ASCII) for concatenation. * set relates directly to sets in math. | is a well-known operator for union. * In the case of merging dicts, I don't know of an obvious background in math or computer science. So I feel it's very natural that dicts don't have an operator for merging. Isn't "for consistency with other types" a wrong consistency? > but requires either some obscure syntax or a statement instead of a simple > expression. > > The proposal is to enable the obvious syntax for something that should be > obvious. dict.update is obvious already. Why is a statement not enough? Regards, From ijkl at netc.fr Mon Mar 4 05:27:12 2019 From: ijkl at netc.fr (Jimmy Girardet) Date: Mon, 4 Mar 2019 11:27:12 +0100 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> Message-ID: <0fdd2db2-52e3-9e2d-f65c-1dd8cbb6da9f@netc.fr> > but requires either some obscure syntax or a statement instead of a simple > expression. > > The proposal is to enable the obvious syntax for something that should be > obvious. > > Stefan The discussions on this list show that the behavior of the `+` operator with dict will never be obvious (first wins or second wins or add results or raise Exception). So the user will always have to look at the doc or test it to know the intended behavior. That said, [1,2] + [3] equals [1,2,3] but not [1,2,[3]], and that was not obvious to me, and I survived.
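The candidate behaviours Jimmy lists (first wins, second wins, add results, raise) can each already be spelled with existing syntax; a quick comparison sketch (illustrative only, none of these is the proposal itself):

```python
d1 = {'a': 1, 'b': 1}
d2 = {'a': 2}

second_wins = {**d1, **d2}   # {'a': 2, 'b': 1}
first_wins = {**d2, **d1}    # {'a': 1, 'b': 1}

# "add results" (Counter-style) for numeric values:
added = {k: d1.get(k, 0) + d2.get(k, 0) for k in d1.keys() | d2.keys()}

# "raise Exception": the conflicting keys a strict merge would reject
conflicts = d1.keys() & d2.keys()   # {'a'}, so a strict + would raise KeyError
```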
From levkivskyi at gmail.com Mon Mar 4 05:41:19 2019 From: levkivskyi at gmail.com (Ivan Levkivskyi) Date: Mon, 4 Mar 2019 10:41:19 +0000 Subject: [Python-ideas] Dict joining using + and += In-Reply-To: References: <0433166f-4f88-93d6-327b-ed1708c17fa7@gmail.com> <378B4AB0-A949-462F-ADCE-10D40E3969CC@killingar.net> <5C78506E.6040600@canterbury.ac.nz> Message-ID: On Sat, 2 Mar 2019 at 19:15, Raymond Hettinger wrote: > > > On Mar 1, 2019, at 11:31 AM, Guido van Rossum wrote: > > > > There's a compromise solution for this possible. We already do this for > Sequence and MutableSequence: Sequence does *not* define __add__, but > MutableSequence *does* define __iadd__, and the default implementation just > calls self.update(other). I propose the same for Mapping (do nothing) and > MutableMapping: make the default __iadd__ implementation call > self.update(other). > > Usually, it's easy to add methods to classes without creating disruption, > but ABCs are more problematic. If MutableMapping grows an __iadd__() > method, what would that mean for existing classes that register as > MutableMapping but don't already implement __iadd__? When "isinstance(m, > MutableMapping)" returns True, is it a promise that the API is fully > implemented? Is this something that mypy could, would, or should complain > about? > Just to clarify the situation, currently Mapping and MutableMapping are not protocols, from both the runtime and mypy points of view. I.e. they don't have the structural __subclasshook__() (as e.g. Iterable does), and are not declared as Protocol in typeshed. So to implement these (and be considered a subtype by mypy) one needs to explicitly subclass them (register() isn't supported by mypy). This means that adding a new method will not cause any problems here, since the new method will be non-abstract with a default implementation that calls update() (the same way as for MutableSequence).
The only potential for confusion I see is if there is a class that de facto implements the current MutableMapping API and was made a subclass (at runtime) of MutableMapping using register(). Then, after we add __iadd__, users of that class might expect that __iadd__ is implemented, while it might not be. This is however OK I think, since register() is already not type-safe. Also there is a simple way to find out whether there are any subclasses of MutableMapping in typeshed that don't have __iadd__: one can *try* declaring MutableMapping.__iadd__ as abstract, and mypy will error on all such subclasses. -- Ivan -------------- next part -------------- An HTML attachment was scrubbed... URL: From cspealma at redhat.com Mon Mar 4 08:09:21 2019 From: cspealma at redhat.com (Calvin Spealman) Date: Mon, 4 Mar 2019 08:09:21 -0500 Subject: [Python-ideas] Left arrow and right arrow operators In-Reply-To: <4fd9518d-3707-7f9c-751a-b2eefa172634@email.de> References: <4fd9518d-3707-7f9c-751a-b2eefa172634@email.de> Message-ID: I don't like the idea of arrows in both directions when you can just swap the operands instead. On Sun, Mar 3, 2019 at 9:52 AM francismb wrote: > Hi, > the idea here is just to add the __larrow__ and __rarrow__ operators for > <- and ->. > > > E.g. of use on dicts : > >>> d1 = {'a':1, 'b':1 } > >>> d2 = {'a':2 } > >>> d3 = d1 -> d2 > >>> d3 > {'a':1, 'b':1 } > > >>> d1 = {'a':1, 'b':1 } > >>> d2 = {'a':2 } > >>> d3 = d1 <- d2 > >>> d3 > {'a':2, 'b':1 } > > Or on bools as Modus Ponens [1] > > Or your idea/imagination here :-) > > > > Regards, > --francis > > [1] https://en.wikipedia.org/wiki/Modus_ponens > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- CALVIN SPEALMAN SENIOR QUALITY ENGINEER cspealma at redhat.com M: +1.336.210.5107 TRIED. TESTED. TRUSTED.
-------------- next part -------------- An HTML attachment was scrubbed... URL: From toddrjen at gmail.com Mon Mar 4 08:18:07 2019 From: toddrjen at gmail.com (Todd) Date: Mon, 4 Mar 2019 08:18:07 -0500 Subject: [Python-ideas] Left arrow and right arrow operators In-Reply-To: <4fd9518d-3707-7f9c-751a-b2eefa172634@email.de> References: <4fd9518d-3707-7f9c-751a-b2eefa172634@email.de> Message-ID: What is the operator supposed to do? On Sun, Mar 3, 2019, 09:52 francismb wrote: > Hi, > the idea here is just to add the __larrow__ and __rarrow__ operators for > <- and ->. > > > E.g. of use on dicts : > >>> d1 = {'a':1, 'b':1 } > >>> d2 = {'a':2 } > >>> d3 = d1 -> d2 > >>> d3 > {'a':1, 'b':1 } > > >>> d1 = {'a':1, 'b':1 } > >>> d2 = {'a':2 } > >>> d3 = d1 <- d2 > >>> d3 > {'a':2, 'b':1 } > > Or on bools as Modus Ponens [1] > > Or your idea/imagination here :-) > > > > Regards, > --francis > > [1] https://en.wikipedia.org/wiki/Modus_ponens > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From storchaka at gmail.com Mon Mar 4 08:29:06 2019 From: storchaka at gmail.com (Serhiy Storchaka) Date: Mon, 4 Mar 2019 15:29:06 +0200 Subject: [Python-ideas] Dict joining using + and += In-Reply-To: References: <0433166f-4f88-93d6-327b-ed1708c17fa7@gmail.com> <378B4AB0-A949-462F-ADCE-10D40E3969CC@killingar.net> <5C78506E.6040600@canterbury.ac.nz> Message-ID: 01.03.19 21:31, Guido van Rossum wrote: > On Thu, Feb 28, 2019 at 10:30 PM Serhiy Storchaka > > wrote: And this opens a non-easy problem: how to create a mapping of the same > type? Not all mappings, and even not all dict subclasses have a copying > constructor. > > > There's a compromise solution for this possible.
We already do this for > Sequence and MutableSequence: Sequence does *not* define __add__, but > MutableSequence *does* define __iadd__, and the default implementation > just calls self.update(other). I propose the same for Mapping (do > nothing) and MutableMapping: make the default __iadd__ implementation > call self.update(other). This LGTM for mappings. But the problem with dict subclasses still exists. If we use the copy() method for creating a copy, d1 + d2 will always return a dict (unless the plus operator or copy() are redefined in a subclass). If we use the constructor of the left argument's type, there will be problems with subclasses with non-compatible constructors (e.g. defaultdict). > Anyways, the main reason to prefer d1+d2 over {**d1, **d2} is that the > latter is highly non-obvious except if you've already encountered that > pattern before, while d1+d2 is what anybody familiar with other Python > collection types would guess or propose. And the default semantics for > subclasses of dict that don't override these are settled with the "d = > d1.copy(); d.update(d2)" equivalence. Dicts are not like lists or deques, or even sets. Iterating dicts produces keys, but not values. The "in" operator tests a key, but not a value. It is not that I like to add an operator for dict merging, but dicts are more like sets than sequences: they cannot contain duplicated keys, and the size of the result of merging two dicts can be less than the sum of their sizes. Using "|" looks more natural to me than using "+". We should look at the discussions for using the "|" operator for sets; if the alternative of using "+" was considered, I think the same arguments for preferring "|" for sets are applicable now for dicts. But is merging two dicts a common enough problem that it needs introducing an operator to solve it? I need to merge dicts maybe not more than one or two times a year, and I am fine with using the update() method.
Perhaps {**d1, **d2} can be more appropriate in some cases, but I have not encountered such cases yet. From songofacandy at gmail.com Mon Mar 4 08:41:38 2019 From: songofacandy at gmail.com (INADA Naoki) Date: Mon, 4 Mar 2019 22:41:38 +0900 Subject: [Python-ideas] Dict joining using + and += In-Reply-To: References: <0433166f-4f88-93d6-327b-ed1708c17fa7@gmail.com> <378B4AB0-A949-462F-ADCE-10D40E3969CC@killingar.net> <5C78506E.6040600@canterbury.ac.nz> Message-ID: On Mon, Mar 4, 2019 at 10:29 PM Serhiy Storchaka wrote: > > It is not that I like to add an operator for dict merging, but dicts are > more like sets than sequences: they cannot contain duplicated keys, and > the size of the result of merging two dicts can be less than the sum of > their sizes. Using "|" looks more natural to me than using "+". We > should look at the discussions for using the "|" operator for sets; if the > alternative of using "+" was considered, I think the same arguments for > preferring "|" for sets are applicable now for dicts. > I concur with Serhiy. While I don't like adding operators to dict, the proposed +/- look more similar to set |/- than to seq +/-. If we're going to add such set-like operations, the operators can be: * dict & dict_or_set * dict - dict_or_set * dict | dict Especially, dict - set can be more useful than the proposed dict - dict. > But is merging two dicts a common enough problem that it needs introducing > an operator to solve it? I need to merge dicts maybe not more than one > or two times a year, and I am fine with using the update() method. > +1. Adding a new method to a builtin should have a high bar. Adding a new operator to a builtin should have a higher bar. Adding new syntax should have the highest bar.
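The set-flavoured operations INADA sketches can already be approximated with dict views and comprehensions (illustrative only; no such operators exist on dict):

```python
d = {'a': 1, 'b': 2, 'c': 3}
s = {'a', 'c', 'z'}

# dict & set: keep only the keys that appear in the set
d_and_s = {k: d[k] for k in d.keys() & s}    # {'a': 1, 'c': 3}

# dict - set: drop the keys that appear in the set
d_minus_s = {k: d[k] for k in d.keys() - s}  # {'b': 2}
```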
-- INADA Naoki From storchaka at gmail.com Mon Mar 4 08:43:48 2019 From: storchaka at gmail.com (Serhiy Storchaka) Date: Mon, 4 Mar 2019 15:43:48 +0200 Subject: [Python-ideas] Dict joining using + and += In-Reply-To: <20190301104427.GH4465@ando.pearwood.info> References: <0433166f-4f88-93d6-327b-ed1708c17fa7@gmail.com> <378B4AB0-A949-462F-ADCE-10D40E3969CC@killingar.net> <20190301104427.GH4465@ando.pearwood.info> Message-ID: 01.03.19 12:44, Steven D'Aprano wrote: > On Fri, Mar 01, 2019 at 08:47:36AM +0200, Serhiy Storchaka wrote: > >> Currently Counter += dict works and Counter + dict is an error. With >> this change Counter + dict will return a value, but it will be different >> from the result of the += operator. > > That's how list.__iadd__ works too: ListSubclass + list will return a > value, but it might not be the same as += since that operates in place > and uses a different dunder method. > > Why is it a problem for dicts but not a problem for lists? Because the plus operator for lists predated any list subclasses. >> Also, if the custom dict subclass implemented the plus operator with >> different semantic which supports the addition with a dict, this change >> will break it, because dict + CustomDict will call dict.__add__ instead >> of CustomDict.__radd__. > > That's not how operators work in Python or at least that's not how they > worked the last time I looked: if the behaviour has changed without > discussion, that's a breaking change that should be reverted. You are right. > What's wrong with doing this? > > new = type(self)() > > Or the equivalent from C code. If that doesn't work, surely that's the > fault of the subclass, the subclass is broken, and it will raise an > exception. Try to do this with defaultdict. Note that none of the builtin sequences or sets do this. For good reasons they always return an instance of the base type.
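Serhiy's defaultdict point is easy to demonstrate: calling `type(self)()` silently drops constructor state that `copy()` preserves (a small illustration):

```python
from collections import defaultdict

d = defaultdict(int, {'a': 1})

# Re-calling the type with no arguments loses the default factory:
lossy = type(d)()
# lossy.default_factory is None, so lossy['missing'] would raise KeyError

# copy() preserves it:
preserved = d.copy()
# preserved.default_factory is int, so preserved['missing'] yields 0
```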
From jfine2358 at gmail.com Mon Mar 4 09:03:38 2019 From: jfine2358 at gmail.com (Jonathan Fine) Date: Mon, 4 Mar 2019 14:03:38 +0000 Subject: [Python-ideas] Current use of addition in Python Message-ID: Summary: This thread is for recording current use of addition in Python. This post covers the built-in types. I'll do Counter and numpy.array in another post. Please use another thread to discuss possible future use of addition. BACKGROUND At present constructions such as {'a': 1} + {'b': 2} produce TypeError: unsupported operand type(s) for +: 'dict' and 'dict' Elsewhere on this list, we're discussing whether to extend dict so that it supports such constructions, and if so what semantics to give to addition of dictionaries. In this thread I intend to record the CURRENT use of addition in Python, in a neutral manner. (However, I will focus on what might be called the mathematical properties of addition in Python.) BUILT-IN TYPES In this post I talk about the existing built-in types: see https://docs.python.org/3/library/stdtypes.html The NUMERIC types support addition, which has the following properties:
Commutative: a + b == b + a
Associative: (a + b) + c == a + (b + c)
Left-cancellation: if (a + b) == (a + c): assert b == c
Right-cancellation: if (a + b) == (c + b): assert a == c
Existence of zero (depending on type): zero + a == a + zero == a
Multiplication by non-negative integer: a + a + ... + a == a * n == n * a
Note: For floats, the equalities are ideal. For example, sometimes huge + tiny == huge + zero and so we don't exactly have cancellation. The SEQUENCE types (except range) have the following properties:
Associative
Left-cancellation
Right-cancellation
Existence of zero
Multiplication by non-negative integer
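The listed properties can be checked mechanically; a quick sanity script for the sequence case, plus the float caveat (illustrative only):

```python
a, b, c = [1], [2], [3]

assert (a + b) + c == a + (b + c)      # associative
assert [] + a == a + [] == a           # the empty list acts as zero
assert a + a + a == a * 3 == 3 * a     # repeated addition is multiplication
assert a + b != b + a                  # but list addition is not commutative

# The float caveat: huge + tiny == huge + zero, so cancellation fails.
huge, tiny = 1e300, 1e-300
assert huge + tiny == huge + 0.0
```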
Note: Even though range is a sequence type, range(3) + range(3) produces TypeError: unsupported operand type(s) for +: 'range' and 'range' By the way, although documented, I find this a bit surprising: >>> (0, 1, 2) * (-1) == () True I'd have expected a ValueError. As things stand, seq * (-1) * (-1) is not associative. And (-1) * seq == -seq is not true, although the left hand side is defined. CONCLUSION I've recorded the behaviour of addition for the built-in types. I'll do Counter and numpy.array in another post, later today. Please use another thread to discuss possible future use of addition. My aim in this thread is to establish and record the present use, to better support discussion of future use. -- Jonathan From mertz at gnosis.cx Mon Mar 4 09:42:53 2019 From: mertz at gnosis.cx (David Mertz) Date: Mon, 4 Mar 2019 09:42:53 -0500 Subject: [Python-ideas] Dict joining using + and += In-Reply-To: References: <0433166f-4f88-93d6-327b-ed1708c17fa7@gmail.com> <378B4AB0-A949-462F-ADCE-10D40E3969CC@killingar.net> <5C78506E.6040600@canterbury.ac.nz> Message-ID: On Mon, Mar 4, 2019, 8:30 AM Serhiy Storchaka wrote: > But is merging two dicts a common enough problem that it needs introducing > an operator to solve it? I need to merge dicts maybe not more than one > or two times a year, and I am fine with using the update() method. > Perhaps {**d1, **d2} can be more appropriate in some cases, but I have not > encountered such cases yet. > Like other folks in the thread, I also want to merge dicts three times per year. And every one of those times, collections.ChainMap is the right way to do that non-destructively, and without copying. > -------------- next part -------------- An HTML attachment was scrubbed...
URL: From jamtlu at gmail.com Mon Mar 4 10:01:23 2019 From: jamtlu at gmail.com (James Lu) Date: Mon, 4 Mar 2019 10:01:23 -0500 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <98F8539E-D119-4299-B289-1EDA729FF10B@gmail.com> Message-ID: <3200242B-8002-47FF-9FAD-52CA3B6410E6@gmail.com> > On Mar 4, 2019, at 3:41 AM, Stefan Behnel wrote: > > James Lu wrote on 04.03.19 at 03:28: >> I propose that the + sign merge two python dictionaries such that if there are conflicting keys, a KeyError is thrown. > > Please, no. That would be really annoying. > > If you need that feature, it can become a new method on dicts. > > Stefan If you want to merge it without a KeyError, learn and use the more explicit {**d1, **d2} syntax. From stefan_ml at behnel.de Mon Mar 4 10:02:06 2019 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 4 Mar 2019 16:02:06 +0100 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> Message-ID: INADA Naoki wrote on 04.03.19 at 11:15: >> Why is a statement not enough? I'm not sure I understand why you're asking this, but a statement is "not enough" because it's a statement and not an expression. It does not replace the convenience of an expression.
Stefan From jamtlu at gmail.com Mon Mar 4 10:09:32 2019 From: jamtlu at gmail.com (James Lu) Date: Mon, 4 Mar 2019 10:09:32 -0500 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> Message-ID: <9C42F773-A486-479E-BD9F-14EB588E5093@gmail.com> >> On Mar 4, 2019, at 4:51 AM, Stefan Behnel wrote: > > Jimmy Girardet wrote on 04.03.19 at 10:12: >> I'm not old on this list, but every time there is a proposal, the answer >> is "what are you trying to solve?". >> >> Since >> >> z = {**x, **y} and z.update(y) exist, I can't find the answer. > > I think the main intention is to close a gap in the language. > > [1,2,3] + [4,5,6] > > works for lists and tuples, > > {1,2,3} | {4,5,6} > > works for sets, but joining two dicts isn't simply > > {1:2, 3:4} + {5:6} > > but requires either some obscure syntax or a statement instead of a simple > expression. > > The proposal is to enable the obvious syntax for something that should be > obvious. Rebutting my "throw KeyError on conflicting keys for +" proposal: Indeed, but + is never destructive in those contexts: duplicate list items are okay because they're ordered, duplicated set items are okay because they mean the same thing (when two sets contain the same item and you merge the two, the "containing" means the same thing), but duplicate dict keys mean different things. In how many situations would you need to make a copy of a dictionary and then update that copy and override old keys from a new dictionary? It's better to have two different syntaxes for different situations. The KeyError of my proposal is a feature, a sign that something is wrong, a sign an invariant is being violated. Yes, {**, **} syntax looks abnormal and ugly. That's part of the point:
how many times have you needed to create a copy of a dictionary and update that dictionary with overriding keys from a new dictionary? It's much more common to have non-conflicting keys. The ugliness of the syntax makes one pause and think and ask: "Why is it important that the keys from this dictionary override the ones from another dictionary?" PROPOSAL EDIT: I think KeyError should only be thrown if the same keys from two dictionaries have values that are not __eq__. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jamtlu at gmail.com Mon Mar 4 10:12:05 2019 From: jamtlu at gmail.com (James Lu) Date: Mon, 4 Mar 2019 10:12:05 -0500 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> Message-ID: > On Mar 4, 2019, at 10:02 AM, Stefan Behnel wrote: > > INADA Naoki wrote on 04.03.19 at 11:15: >> Why statement is not enough? > > I'm not sure I understand why you're asking this, but a statement is "not > enough" because it's a statement and not an expression. It does not replace > the convenience of an expression. > > Stefan There is already an expression for key-overriding merge. Why do we need a new one?
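[Editorial note: James Lu's amended strict-merge rule above (a KeyError only when a shared key carries values that are not __eq__) can be sketched as a plain helper function today; `strict_merge` is an illustrative name for this sketch, not part of any actual proposal text.]

```python
def strict_merge(d1, d2):
    """Return a new dict combining d1 and d2, raising KeyError when a
    key appears in both with values that compare unequal (per the
    "PROPOSAL EDIT" above); equal duplicates are tolerated."""
    merged = dict(d1)
    for key, value in d2.items():
        if key in merged and merged[key] != value:
            raise KeyError(f"conflicting values for key {key!r}")
        merged[key] = value
    return merged

print(strict_merge({"a": 1}, {"b": 2}))  # {'a': 1, 'b': 2}
print(strict_merge({"a": 1}, {"a": 1}))  # equal values pass: {'a': 1}
try:
    strict_merge({"a": 1}, {"a": 2})     # unequal values: KeyError
except KeyError as exc:
    print("rejected:", exc)
```

Under this sketch, merging two configuration fragments that agree on their overlap succeeds silently, while a genuine disagreement surfaces immediately instead of being masked by last-wins semantics.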
From steve at pearwood.info Mon Mar 4 10:25:57 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 5 Mar 2019 02:25:57 +1100 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <3200242B-8002-47FF-9FAD-52CA3B6410E6@gmail.com> References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <98F8539E-D119-4299-B289-1EDA729FF10B@gmail.com> <3200242B-8002-47FF-9FAD-52CA3B6410E6@gmail.com> Message-ID: <20190304152557.GT4465@ando.pearwood.info> On Mon, Mar 04, 2019 at 10:01:23AM -0500, James Lu wrote: > If you want to merge it without a KeyError, learn and use the more explicit {**d1, **d2} syntax. In your previous email, you said the {**d ...} syntax was implicit: In other words, explicit + is better than implicit {**, **}, unless explicitly suppressed. Here + is explicit whereas {**, **} is implicitly allowing inclusive keys, and the KeyError is expressly suppressed by virtue of not using the {**, **} syntax. It is difficult to take your "explicit/implicit" argument seriously when you cannot even decide which is which. -- Steven From rhodri at kynesim.co.uk Mon Mar 4 09:18:12 2019 From: rhodri at kynesim.co.uk (Rhodri James) Date: Mon, 4 Mar 2019 14:18:12 +0000 Subject: [Python-ideas] Current use of addition in Python In-Reply-To: References: Message-ID: <5d9b4c2e-5a3b-fe20-3d74-9f52a82064f8@kynesim.co.uk> On 04/03/2019 14:03, Jonathan Fine wrote: > Summary: This thread is for recording current use of addition in > Python. TL;DR. Why is this in Python Ideas?
-- Rhodri James *-* Kynesim Ltd From steve at pearwood.info Mon Mar 4 11:25:43 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 5 Mar 2019 03:25:43 +1100 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <9C42F773-A486-479E-BD9F-14EB588E5093@gmail.com> References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> <9C42F773-A486-479E-BD9F-14EB588E5093@gmail.com> Message-ID: <20190304162543.GU4465@ando.pearwood.info> On Mon, Mar 04, 2019 at 10:09:32AM -0500, James Lu wrote: > How many situations would you need to make a copy of a dictionary and > then update that copy and override old keys from a new dictionary? Very frequently. That's why we have a dict.update method, which, if I remember correctly, was introduced in Python 1.5 because people were frequently re-inventing the same wheel:

    def update(d1, d2):
        for key in d2.keys():
            d1[key] = d2[key]

You should have a look at how many times it is used in the standard library:

    [steve at ando cpython]$ cd Lib/
    [steve at ando Lib]$ grep -U "\.update[(]" *.py */*.py | wc -l
    373

Now some of those are false positives (docstrings, comments, non-dicts, etc) but that still leaves a lot of examples of wanting to override old keys. This is a very common need. Wanting an exception if the key already exists is, as far as I can tell, very rare. It is true that many of the examples in the std lib involve updating an existing dict, not creating a new one. But that's only to be expected: since Python didn't provide an obvious functional version of update, only an in-place version, naturally people get used to writing in-place code. (Think about how long we made do without sorted(). I don't know about other people, but I now find sorted indispensable, and probably use it ten or twenty times more often than the in-place version.) [...]
> The KeyError of my proposal is a feature, a sign that something is > wrong, a sign an invariant is being violated. Why is "keys are unique" an invariant? The PEP gives a good example of when this "invariant" would be unnecessarily restrictive: For example, updating default configuration values with user-supplied values would most often fail under the requirement that keys are unique::

    prefs = site_defaults + user_defaults + document_prefs

Another example would be when reading command line options, where the most common convention is for "last option seen" to win:

    [steve at ando Lib]$ grep --color=always --color=never "zero" f*.py
    fileinput.py: numbers are zero; nextfile() has no effect.
    fractions.py: # the same way for any finite a, so treat a as zero.
    functools.py: # prevent their ref counts from going to zero during

and the output is printed without colour. (I've slightly edited the above output so it will fit in the email without wrapping.) The very name "update" should tell us that the most useful behaviour is the one the devs decided on back in 1.5: have the last seen value win. How can you update values if the operation raises an error if the key already exists? If this behaviour is ever useful, I would expect that it will be very rare. An update or merge is effectively just running through a loop setting the value of a key. See the pre-Python 1.5 function above. Having update raise an exception if the key already exists would be about as useful as having ``d[key] = value`` raise an exception if the key already exists. Unless someone can demonstrate that the design of dict.update() was a mistake, and the "require unique keys" behaviour is more common, then I maintain that for the very rare cases you want an exception, you can subclass dict and overload the __add__ method:

    # Intentionally simplified version.
    def __add__(self, other):
        if self.keys() & other.keys():
            raise KeyError
        return super().__add__(other)

> The ugliness of the syntax makes one pause > and think and ask: "Why is it important that the keys from this > dictionary override the ones from another dictionary?" Because that is the most common and useful behaviour. That's what it means to *update* a dict or database, and this proposal is for an update operator. The ugliness of the existing syntax is not a feature, it is a barrier. -- Steven From steve at pearwood.info Mon Mar 4 11:44:09 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 5 Mar 2019 03:44:09 +1100 Subject: [Python-ideas] Dict joining using + and += In-Reply-To: References: <0433166f-4f88-93d6-327b-ed1708c17fa7@gmail.com> <378B4AB0-A949-462F-ADCE-10D40E3969CC@killingar.net> <5C78506E.6040600@canterbury.ac.nz> Message-ID: <20190304164408.GV4465@ando.pearwood.info> On Mon, Mar 04, 2019 at 09:42:53AM -0500, David Mertz wrote: > On Mon, Mar 4, 2019, 8:30 AM Serhiy Storchaka wrote: > > > But is merging two dicts a common enough problem that needs introducing > > an operator to solve it? I need to merge dicts maybe not more than one > > or two times per year, and I am fine with using the update() method. > > Perhaps {**d1, **d2} can be more appropriate in some cases, but I did not > > encounter such cases yet. > > > > Like other folks in the thread, I also want to merge dicts three times per > year. I'm impressed that you have counted it with that level of accuracy. Is it on the same three days each year, or do they move about? *wink* > And every one of those times, collections.ChainMap is the right way to > do that non-destructively, and without copying. Can you elaborate on why ChainMap is the right way to merge multiple dicts into a single, new dict?
ChainMap also seems to implement the opposite behaviour to that usually desired: first value seen wins, instead of last:

    py> from collections import ChainMap
    py> cm = ChainMap({'a': 1}, {'b': 2}, {'a': 999})
    py> cm
    ChainMap({'a': 1}, {'b': 2}, {'a': 999})
    py> dict(cm)
    {'a': 1, 'b': 2}

If you know ahead of time which order you want, you can simply reverse it:

    # prefs = site_defaults + user_defaults + document_prefs
    prefs = dict(ChainMap(document_prefs, user_defaults, site_defaults))

but that seems a little awkward to me, and reads backwards. I'm used to thinking and reading left-to-right, not right-to-left. ChainMap seems, to me, to be ideal for implementing "first wins" mappings, such as emulating nested scopes, but not so ideal for update/merge operations. -- Steven From rhodri at kynesim.co.uk Mon Mar 4 11:44:52 2019 From: rhodri at kynesim.co.uk (Rhodri James) Date: Mon, 4 Mar 2019 16:44:52 +0000 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> Message-ID: <01ce6a83-534c-2e27-acc9-f3be4601b559@kynesim.co.uk> On 04/03/2019 15:12, James Lu wrote: > >> On Mar 4, 2019, at 10:02 AM, Stefan Behnel wrote: >> >> INADA Naoki wrote on 04.03.19 at 11:15: >>> Why statement is not enough? >> >> I'm not sure I understand why you're asking this, but a statement is "not >> enough" because it's a statement and not an expression. It does not replace >> the convenience of an expression. >> >> Stefan > There is already an expression for key-overriding merge. Why do we need a new one? Because the existing one is inobvious, hard to discover and ugly.
-- Rhodri James *-* Kynesim Ltd From steve at pearwood.info Mon Mar 4 11:50:12 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 5 Mar 2019 03:50:12 +1100 Subject: [Python-ideas] Dict joining using + and += In-Reply-To: References: <0433166f-4f88-93d6-327b-ed1708c17fa7@gmail.com> <378B4AB0-A949-462F-ADCE-10D40E3969CC@killingar.net> <20190301104427.GH4465@ando.pearwood.info> Message-ID: <20190304165011.GW4465@ando.pearwood.info> On Mon, Mar 04, 2019 at 03:43:48PM +0200, Serhiy Storchaka wrote: > 01.03.19 12:44, Steven D'Aprano wrote: > >On Fri, Mar 01, 2019 at 08:47:36AM +0200, Serhiy Storchaka wrote: > > > >>Currently Counter += dict works and Counter + dict is an error. With > >>this change Counter + dict will return a value, but it will be different > >>from the result of the += operator. > > > >That's how list.__iadd__ works too: ListSubclass + list will return a > >value, but it might not be the same as += since that operates in place > >and uses a different dunder method. > > > >Why is it a problem for dicts but not a problem for lists? > > Because the plus operator for lists predated any list subclasses. That doesn't answer my question. Just because it is older is no explanation for why this behaviour is not a problem for lists but is a problem for dicts. [...] > >What's wrong with doing this? > > > > new = type(self)() > > > >Or the equivalent from C code. If that doesn't work, surely that's the > >fault of the subclass, the subclass is broken, and it will raise an > >exception. > > Try to do this with defaultdict. I did. It seems to work fine with my testing:

    py> defaultdict()
    defaultdict(None, {})

is precisely the behaviour I would expect. If it isn't the right thing to do, then defaultdict can override __add__ and __radd__. > Note that none of builtin sequences or sets do this. For good reasons > they always return an instance of the base type. What are those good reasons?
-- Steven From eric at trueblade.com Mon Mar 4 12:17:01 2019 From: eric at trueblade.com (Eric V. Smith) Date: Mon, 4 Mar 2019 17:17:01 +0000 Subject: [Python-ideas] Current use of addition in Python In-Reply-To: <5d9b4c2e-5a3b-fe20-3d74-9f52a82064f8@kynesim.co.uk> References: <5d9b4c2e-5a3b-fe20-3d74-9f52a82064f8@kynesim.co.uk> Message-ID: <7979F375-1D96-495B-A48B-E5981CBD5C84@trueblade.com> > On Mar 4, 2019, at 2:18 PM, Rhodri James wrote: > >> On 04/03/2019 14:03, Jonathan Fine wrote: >> Summary: This thread is for recording current use of addition in >> Python. > > TL;DR. Why is this in Python Ideas? Because of the current discussion of dict + dict. I think this is helping answer the question: is there anything currently in Python that's a similar usage of "+"? Personally, I don't think it matters much, but it's interesting. If the requirement were for new usages of "+" to be like old ones, would we have added str + str? Eric From 2QdxY4RzWzUUiLuE at potatochowder.com Mon Mar 4 12:56:54 2019 From: 2QdxY4RzWzUUiLuE at potatochowder.com (Dan Sommers) Date: Mon, 4 Mar 2019 11:56:54 -0600 Subject: [Python-ideas] Dict joining using + and += In-Reply-To: <20190304164408.GV4465@ando.pearwood.info> References: <0433166f-4f88-93d6-327b-ed1708c17fa7@gmail.com> <378B4AB0-A949-462F-ADCE-10D40E3969CC@killingar.net> <5C78506E.6040600@canterbury.ac.nz> <20190304164408.GV4465@ando.pearwood.info> Message-ID: On 3/4/19 10:44 AM, Steven D'Aprano wrote: > If you know ahead of time which order you want, you can simply reverse > it: > > # prefs = site_defaults + user_defaults + document_prefs > prefs = dict(ChainMap(document_prefs, user_defaults, site_defaults)) > > but that seems a little awkward to me, and reads backwards. I'm used to > thinking reading left-to-right, not right-to-left. I read that as use document preferences first, then user defaults, then site defaults, exactly as I'd explain the functionality to someone else.
So maybe we're agreeing: if you think in terms of updating a dictionary of preferences, then maybe it reads backwards, but if you think of implementing features, then adding dictionaries of preferences reads backwards. From mertz at gnosis.cx Mon Mar 4 13:43:12 2019 From: mertz at gnosis.cx (David Mertz) Date: Mon, 4 Mar 2019 13:43:12 -0500 Subject: [Python-ideas] Dict joining using + and += In-Reply-To: <20190304164408.GV4465@ando.pearwood.info> References: <0433166f-4f88-93d6-327b-ed1708c17fa7@gmail.com> <378B4AB0-A949-462F-ADCE-10D40E3969CC@killingar.net> <5C78506E.6040600@canterbury.ac.nz> <20190304164408.GV4465@ando.pearwood.info> Message-ID: On Mon, Mar 4, 2019, 11:45 AM Steven D'Aprano wrote: > > Like other folks in the thread, I also want to merge dicts three times > per > > year. > > I'm impressed that you have counted it with that level of accuracy. Is it > on the same three days each year, or do they move about? *wink* > To be respectful, I always merge dicts on Eid al-Fitr, Diwali, and Lent. I was speaking approximately since those do not appear to line up with the same Gregorian year. > And every one of those times, collections.ChainMap is the right way to do > that non-destructively, and without copying. > > Can you elaborate on why ChainMap is the right way to merge multiple dicts > into a single, new dict? > Zero-copy. > ChainMap also seems to implement the opposite behaviour to that usually > desired: first value seen wins, instead of last: > True, the semantics are different, but equivalent, to the proposed dict addition. I put the key I want to "win" first rather than last. > If you know ahead of time which order you want, you can simply reverse it: This seems nonsensical. If I write, at some future time, 'dict1+dict2+dict3' I need exactly as much to know "ahead of time" which keys I intend to win. -------------- next part -------------- An HTML attachment was scrubbed...
URL: From jfine2358 at gmail.com Mon Mar 4 13:41:38 2019 From: jfine2358 at gmail.com (Jonathan Fine) Date: Mon, 4 Mar 2019 18:41:38 +0000 Subject: [Python-ideas] Current use of addition in Python In-Reply-To: <7979F375-1D96-495B-A48B-E5981CBD5C84@trueblade.com> References: <5d9b4c2e-5a3b-fe20-3d74-9f52a82064f8@kynesim.co.uk> <7979F375-1D96-495B-A48B-E5981CBD5C84@trueblade.com> Message-ID: First, I thank Rhodri for his question, and Eric for his reply (see earlier messages in this thread).

SUMMARY
I record the addition properties of collections.Counter and numpy.array. Finally, some comments about len(), and a promise of more tomorrow.

COUNTER
Now for collections.Counter -- this is "provided to support convenient and rapid tallies." https://docs.python.org/3/library/collections.html#collections.Counter In my previous post, I noted that the built-in numeric types have the properties: commutative, associative, left and right cancellation, existence of zero, and multiplication by a non-negative integer. Instances of Counter 'usually' have all of these properties, except for multiplication by a non-negative integer. Here's an example where cancellation fails:

    >>> Counter(a=-1) + Counter()
    Counter()

Here are two more examples:

    >>> Counter(a=+1, b=-1) + Counter(a=-1, b=+1)
    Counter()
    >>> Counter(a=+1, b=-2) + Counter(a=-2, b=+1)
    Counter()

In the first example, it seems that the counters cancel. But the second example shows that something else is going on. Here's an example of associativity failing:

    >>> (Counter(a=+1) + Counter(a=-2)) + Counter(a=2)
    Counter({'a': 2})
    >>> Counter(a=+1) + (Counter(a=-2) + Counter(a=2))
    Counter({'a': 1})

The Python docs (URL above) note that the Counter "methods are designed only for use cases with positive values."

NUMPY.ARRAY
These arrays have all the properties listed above (commutative, associative, left and right cancellation, multiplication by non-negative integer), provided all the arrays have the same shape.
(The shape of an array is a tuple of non-negative integers.) And for numpy.array, the zero must also have the same shape. Briefly, a numpy array acts like a multi-dimensional vector. Here's an example:

    >>> array(range(3)), array(range(3, 6))
    (array([0, 1, 2]), array([3, 4, 5]))
    >>> array(range(3)) + array(range(3, 6))
    array([3, 5, 7])

Here's another example:

    >>> array(range(3, 6)) * 4
    array([12, 16, 20])
    >>> 4 * array(range(3, 6))
    array([12, 16, 20])

LENGTH -- len()
The numeric types don't have a length:

    >>> len(0)
    TypeError: object of type 'int' has no len()

The sequence and mapping types (such as list, tuple, str, bytes, set, dict) are iterable, and have a length. Also, numpy.array and collections.Counter have a length. More on length tomorrow. -- Jonathan From guido at python.org Mon Mar 4 14:24:39 2019 From: guido at python.org (Guido van Rossum) Date: Mon, 4 Mar 2019 11:24:39 -0800 Subject: [Python-ideas] Dict joining using + and += In-Reply-To: References: <0433166f-4f88-93d6-327b-ed1708c17fa7@gmail.com> <378B4AB0-A949-462F-ADCE-10D40E3969CC@killingar.net> <5C78506E.6040600@canterbury.ac.nz> <20190304164408.GV4465@ando.pearwood.info> Message-ID: * Dicts are not like sets because the ordering operators (<, <=, >, >=) are not defined on dicts, but they implement subset comparisons for sets. I think this is another argument pleading against | as the operator to combine two dicts.

* Regarding how to construct the new dict in __add__, I now think this should be done like this:

    class dict:

        def __add__(self, other):
            new = self.copy()  # A subclass may or may not choose to override
            new.update(other)
            return new

AFAICT this will give the expected result for defaultdict -- it keeps the default factory from the left operand (i.e., self).

* Regarding how often this is needed, we know that this is proposed and discussed at length every few years, so I think this will fill a real need.
* Regarding possible anti-patterns that this might encourage, I'm not aware of problems around list + list, so this seems an unwarranted worry to me. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From mistersheik at gmail.com Mon Mar 4 15:07:28 2019 From: mistersheik at gmail.com (Neil Girdhar) Date: Mon, 4 Mar 2019 15:07:28 -0500 Subject: [Python-ideas] Dict joining using + and += In-Reply-To: References: <0433166f-4f88-93d6-327b-ed1708c17fa7@gmail.com> <378B4AB0-A949-462F-ADCE-10D40E3969CC@killingar.net> <5C78506E.6040600@canterbury.ac.nz> <20190304164408.GV4465@ando.pearwood.info> Message-ID: On Mon, Mar 4, 2019 at 2:26 PM Guido van Rossum wrote: > > * Dicts are not like sets because the ordering operators (<, <=, >, >=) are not defined on dicts, but they implement subset comparisons for sets. I think this is another argument pleading against | as the operator to combine two dicts. > I feel like dict should be treated like sets with the |, &, and - operators since in mathematics a mapping is sometimes represented as a set of pairs with unique first elements. Therefore, I think the set metaphor is stronger. > * Regarding how to construct the new set in __add__, I now think this should be done like this: > > class dict: > > def __add__(self, other): > > new = self.copy() # A subclass may or may not choose to override > new.update(other) > return new I like that, but it would be inefficient to do that for __sub__ since it would create elements that it might later delete.

    def __sub__(self, other):
        new = self.copy()
        for k in other:
            del new[k]
        return new

is less efficient than

    def __sub__(self, other):
        return type(self)({k: v for k, v in self.items() if k not in other})

when copying v is expensive. Also, users would probably not expect values that don't end up being returned to be copied.
> > AFAICT this will give the expected result for defaultdict -- it keeps the default factory from the left operand (i.e., self). > > * Regarding how often this is needed, we know that this is proposed and discussed at length every few years, so I think this will fill a real need. > > * Regarding possible anti-patterns that this might encourage, I'm not aware of problems around list + list, so this seems an unwarranted worry to me. > I agree with these points. Best, Neil > -- > --Guido van Rossum (python.org/~guido) > > -- > > --- > You received this message because you are subscribed to a topic in the Google Groups "python-ideas" group. > To unsubscribe from this topic, visit https://groups.google.com/d/topic/python-ideas/zfHYRHMIAdM/unsubscribe. > To unsubscribe from this group and all its topics, send an email to python-ideas+unsubscribe at googlegroups.com. > For more options, visit https://groups.google.com/d/optout. > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > -- > > --- > You received this message because you are subscribed to a topic in the Google Groups "python-ideas" group. > To unsubscribe from this topic, visit https://groups.google.com/d/topic/python-ideas/zfHYRHMIAdM/unsubscribe. > To unsubscribe from this group and all its topics, send an email to python-ideas+unsubscribe at googlegroups.com. > For more options, visit https://groups.google.com/d/optout. 
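[Editorial note: Guido's copy-then-update construction and Neil's comprehension-based subtraction can both be tried today as plain functions; the names `dict_add` and `dict_sub` are illustrative only. Note how `copy()` keeps a defaultdict's default factory, which is exactly the subclass behaviour discussed above.]

```python
from collections import defaultdict

def dict_add(d1, d2):
    # Guido's sketch: copy the left operand (a subclass may override
    # copy()), then update in place -- the right operand's values win.
    new = d1.copy()
    new.update(d2)
    return new

def dict_sub(d1, d2):
    # Neil's comprehension variant: keep only keys absent from d2.
    return type(d1)({k: v for k, v in d1.items() if k not in d2})

d = defaultdict(list, {"a": [1], "b": [2]})
merged = dict_add(d, {"b": [3]})
print(type(merged).__name__)  # defaultdict -- the default factory survives
print(merged["b"])            # [3]
print(dict_sub({"a": 1, "b": 2}, {"b": 0}))  # {'a': 1}
```

Note that `dict_sub` already trips over Serhiy's copy-constructor objection: `type(d1)(mapping)` fails for defaultdict, whose first positional argument must be the factory, which is one reason the `copy()` route is the safer of the two sketches.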
From guido at python.org Mon Mar 4 15:22:22 2019 From: guido at python.org (Guido van Rossum) Date: Mon, 4 Mar 2019 12:22:22 -0800 Subject: [Python-ideas] Dict joining using + and += In-Reply-To: References: <0433166f-4f88-93d6-327b-ed1708c17fa7@gmail.com> <378B4AB0-A949-462F-ADCE-10D40E3969CC@killingar.net> <5C78506E.6040600@canterbury.ac.nz> <20190304164408.GV4465@ando.pearwood.info> Message-ID: On Mon, Mar 4, 2019 at 12:12 PM Neil Girdhar wrote: > On Mon, Mar 4, 2019 at 2:26 PM Guido van Rossum wrote: > > > > * Dicts are not like sets because the ordering operators (<, <=, >, >=) > are not defined on dicts, but they implement subset comparisons for sets. I > think this is another argument pleading against | as the operator to > combine two dicts. > > > > I feel like dict should be treated like sets with the |, &, and - > operators since in mathematics a mapping is sometimes represented as a > set of pairs with unique first elements. Therefore, I think the set > metaphor is stronger. > That ship has long sailed. > > * Regarding how to construct the new set in __add__, I now think this > should be done like this: > > > > class dict: > > > > def __add__(self, other): > > > > new = self.copy() # A subclass may or may not choose to override > > new.update(other) > > return new > > I like that, but it would be inefficient to do that for __sub__ since > it would create elements that it might later delete. > > def __sub__(self, other): > new = self.copy() > for k in other: > del new[k] > return new > > is less efficient than > > def __sub__(self, other): > return type(self)({k: v for k, v in self.items() if k not in other}) > > when copying v is expensive. Also, users would probably not expect > values that don't end up being returned to be copied. > No, the values won't be copied -- it is a shallow copy that only increfs the keys and values. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mistersheik at gmail.com Mon Mar 4 15:33:36 2019 From: mistersheik at gmail.com (Neil Girdhar) Date: Mon, 4 Mar 2019 15:33:36 -0500 Subject: [Python-ideas] Dict joining using + and += In-Reply-To: References: <0433166f-4f88-93d6-327b-ed1708c17fa7@gmail.com> <378B4AB0-A949-462F-ADCE-10D40E3969CC@killingar.net> <5C78506E.6040600@canterbury.ac.nz> <20190304164408.GV4465@ando.pearwood.info> Message-ID: On Mon, Mar 4, 2019 at 3:22 PM Guido van Rossum wrote: > > On Mon, Mar 4, 2019 at 12:12 PM Neil Girdhar wrote: >> >> On Mon, Mar 4, 2019 at 2:26 PM Guido van Rossum wrote: >> > >> > * Dicts are not like sets because the ordering operators (<, <=, >, >=) are not defined on dicts, but they implement subset comparisons for sets. I think this is another argument pleading against | as the operator to combine two dicts. >> > >> >> I feel like dict should be treated like sets with the |, &, and - >> operators since in mathematics a mapping is sometimes represented as a >> set of pairs with unique first elements. Therefore, I think the set >> metaphor is stronger. > > > That ship has long sailed. Maybe, but reading through the various replies, it seems that if you are adding "-" to be analogous to set difference, then the combination operator should be analogous to set union "|". And it also opens an opportunity to add set intersection "&". After all, how do you filter a dictionary to a set of keys? >> d = {'some': 5, 'extra': 10, 'things': 55} >> d &= {'some', 'allowed', 'options'} >> d {'some': 5} >> >> > * Regarding how to construct the new set in __add__, I now think this should be done like this: >> > >> > class dict: >> > >> > def __add__(self, other): >> > >> > new = self.copy() # A subclass may or may not choose to override >> > new.update(other) >> > return new >> >> I like that, but it would be inefficient to do that for __sub__ since >> it would create elements that it might later delete. 
>> >> def __sub__(self, other): >> new = self.copy() >> for k in other: >> del new[k] >> return new >> >> is less efficient than >> >> def __sub__(self, other): >> return type(self)({k: v for k, v in self.items() if k not in other}) >> >> when copying v is expensive. Also, users would probably not expect >> values that don't end up being returned to be copied. > > > No, the values won't be copied -- it is a shallow copy that only increfs the keys and values. Oh right, good point. Then your way is better since it would preserve any other data stored by the dict subclass. > > -- > --Guido van Rossum (python.org/~guido) From guido at python.org Mon Mar 4 15:41:02 2019 From: guido at python.org (Guido van Rossum) Date: Mon, 4 Mar 2019 12:41:02 -0800 Subject: [Python-ideas] Dict joining using + and += In-Reply-To: References: <0433166f-4f88-93d6-327b-ed1708c17fa7@gmail.com> <378B4AB0-A949-462F-ADCE-10D40E3969CC@killingar.net> <5C78506E.6040600@canterbury.ac.nz> <20190304164408.GV4465@ando.pearwood.info> Message-ID: Honestly I would rather withdraw the subtraction operators than reopen the discussion about making dict more like set. On Mon, Mar 4, 2019 at 12:33 PM Neil Girdhar wrote: > On Mon, Mar 4, 2019 at 3:22 PM Guido van Rossum wrote: > > > > On Mon, Mar 4, 2019 at 12:12 PM Neil Girdhar > wrote: > >> > >> On Mon, Mar 4, 2019 at 2:26 PM Guido van Rossum > wrote: > >> > > >> > * Dicts are not like sets because the ordering operators (<, <=, >, > >=) are not defined on dicts, but they implement subset comparisons for > sets. I think this is another argument pleading against | as the operator > to combine two dicts. > >> > > >> > >> I feel like dict should be treated like sets with the |, &, and - > >> operators since in mathematics a mapping is sometimes represented as a > >> set of pairs with unique first elements. Therefore, I think the set > >> metaphor is stronger. > > > > > > That ship has long sailed. 
> > Maybe, but reading through the various replies, it seems that if you > are adding "-" to be analogous to set difference, then the combination > operator should be analogous to set union "|". And it also opens an > opportunity to add set intersection "&". After all, how do you filter > a dictionary to a set of keys? > > >> d = {'some': 5, 'extra': 10, 'things': 55} > >> d &= {'some', 'allowed', 'options'} > >> d > {'some': 5} > > >> > >> > * Regarding how to construct the new set in __add__, I now think this > should be done like this: > >> > > >> > class dict: > >> > > >> > def __add__(self, other): > >> > > >> > new = self.copy() # A subclass may or may not choose to > override > >> > new.update(other) > >> > return new > >> > >> I like that, but it would be inefficient to do that for __sub__ since > >> it would create elements that it might later delete. > >> > >> def __sub__(self, other): > >> new = self.copy() > >> for k in other: > >> del new[k] > >> return new > >> > >> is less efficient than > >> > >> def __sub__(self, other): > >> return type(self)({k: v for k, v in self.items() if k not in other}) > >> > >> when copying v is expensive. Also, users would probably not expect > >> values that don't end up being returned to be copied. > > > > > > No, the values won't be copied -- it is a shallow copy that only increfs > the keys and values. > > Oh right, good point. Then your way is better since it would preserve > any other data stored by the dict subclass. > > > > -- > > --Guido van Rossum (python.org/~guido) > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From pythonchb at gmail.com Mon Mar 4 15:58:24 2019 From: pythonchb at gmail.com (Christopher Barker) Date: Mon, 4 Mar 2019 12:58:24 -0800 Subject: [Python-ideas] Dict joining using + and += In-Reply-To: References: <0433166f-4f88-93d6-327b-ed1708c17fa7@gmail.com> <378B4AB0-A949-462F-ADCE-10D40E3969CC@killingar.net> <5C78506E.6040600@canterbury.ac.nz> <20190304164408.GV4465@ando.pearwood.info> Message-ID: On Mon, Mar 4, 2019 at 12:41 PM Guido van Rossum wrote: > Honestly I would rather withdraw the subtraction operators than reopen the > discussion about making dict more like set. > +1 I think the "dicts are like more-featured" sets is a math-geek perspective, and unlikely to make things more clear for the bulk of users. And may make it less clear. We need to be careful -- there are a lot more math geeks on this list than in the general Python coding population. Simply adding "+" is a non-critical nice to have, but it seems unlikely to really confuse anyone. -CHB -- Christopher Barker, PhD Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython -------------- next part -------------- An HTML attachment was scrubbed... URL: From pythonchb at gmail.com Mon Mar 4 16:09:17 2019 From: pythonchb at gmail.com (Christopher Barker) Date: Mon, 4 Mar 2019 13:09:17 -0800 Subject: [Python-ideas] Add a "week" function or attribute to datetime.date In-Reply-To: References: Message-ID: There are all sorts of "Calendar" operations one might want -- I think those belong in a separate library, rather than a few tacked on to datetime. -CHB On Fri, Mar 1, 2019 at 2:48 AM Robert Vanden Eynde wrote: > Currently one can do week = d.isocalendar()[1] > > The iso definition of a week number has some nice properties. 
> > robertvandeneynde.be > > On Fri, 1 Mar 2019, 11:44 Antonio Galán, wrote: > >> The week number usually refers to the week of the year, but the week >> of the month is also interesting, for example for some holidays which depend >> on the week number of the month, so in analogy with "weekday" we can use >> "yearweek" and "monthweek" >> El vie., 1 de marzo de 2019 9:33, Adrien Ricocotam >> escribió: >> >>> I like the idea. But how to distinguish it from the number of weeks passed >>> since the beginning of the month? >>> >>> But that's great. >>> >>> On Fri 1 Mar 2019 at 09:31, Antonio Galán wrote: >>> >>>> Hi, datetime.date.today() (or other day) has attributes .year and >>>> .month which return the year and the month of that date, also it has a >>>> function weekday() which returns the number of the day in the week. >>>> >>>> I think it is a good idea to add a function or attribute "week" which >>>> returns the number of the week in the year. It is useful to execute scripts >>>> once a week for example. >>>> _______________________________________________ >>>> Python-ideas mailing list >>>> Python-ideas at python.org >>>> https://mail.python.org/mailman/listinfo/python-ideas >>>> Code of Conduct: http://python.org/psf/codeofconduct/ >>>> >>> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- Christopher Barker, PhD Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython -------------- next part -------------- An HTML attachment was scrubbed... 
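As Robert notes above, the ISO week number is already reachable today via `isocalendar()`:

```python
import datetime

# What the proposed d.week attribute would return: the ISO week number,
# already available today as the second item of isocalendar().
d = datetime.date(2019, 3, 1)
year, week, weekday = d.isocalendar()  # ISO year, week 1..53, weekday 1..7
print(week)  # 9 (1 March 2019 falls in ISO week 9)
```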
URL: From mistersheik at gmail.com Mon Mar 4 16:28:24 2019 From: mistersheik at gmail.com (Neil Girdhar) Date: Mon, 4 Mar 2019 16:28:24 -0500 Subject: [Python-ideas] Dict joining using + and += In-Reply-To: References: <0433166f-4f88-93d6-327b-ed1708c17fa7@gmail.com> <378B4AB0-A949-462F-ADCE-10D40E3969CC@killingar.net> <5C78506E.6040600@canterbury.ac.nz> <20190304164408.GV4465@ando.pearwood.info> Message-ID: On Mon, Mar 4, 2019 at 3:58 PM Christopher Barker wrote: > > > > On Mon, Mar 4, 2019 at 12:41 PM Guido van Rossum wrote: >> >> Honestly I would rather withdraw the subtraction operators than reopen the discussion about making dict more like set. I think that's unfortunate. > > > +1 > > I think the "dicts are like more-featured" sets is a math-geek perspective, and unlikely to make things more clear for the bulk of users. And may make it less clear. I'd say reddit has some pretty "common users", and they're having a discussion of this right now (https://www.reddit.com/r/Python/comments/ax4zzb/pep_584_add_and_operators_to_the_builtin_dict/). The most popular comment is how it should be |. Anyway, I think that following the mathematical metaphors tends to make things more intuitive in the long run. Python is an adventure. You learn it for years and then it all makes sense. If dict uses +, yes, new users might find that sooner than |. However, when they learn set union, I think they will wonder why it's not consistent with dict union. The PEP's main justification for + is that it matches Counter, but counter is adding the values whereas | doesn't touch the values. I think it would be good to at least make a list of pros and cons of each proposed syntax. > We need to be careful -- there are a lot more math geeks on this list than in the general Python coding population. > > Simply adding "+" is a non-critical nice to have, but it seems unlikely to really confuse anyone. 
> > -CHB > > > -- > Christopher Barker, PhD > > Python Language Consulting > - Teaching > - Scientific Software Development > - Desktop GUI and Web Development > - wxPython, numpy, scipy, Cython From p.f.moore at gmail.com Mon Mar 4 16:34:34 2019 From: p.f.moore at gmail.com (Paul Moore) Date: Mon, 4 Mar 2019 21:34:34 +0000 Subject: [Python-ideas] Dict joining using + and += In-Reply-To: References: <0433166f-4f88-93d6-327b-ed1708c17fa7@gmail.com> <378B4AB0-A949-462F-ADCE-10D40E3969CC@killingar.net> <5C78506E.6040600@canterbury.ac.nz> <20190304164408.GV4465@ando.pearwood.info> Message-ID: On Mon, 4 Mar 2019 at 20:42, Guido van Rossum wrote: > > Honestly I would rather withdraw the subtraction operators than reopen the discussion about making dict more like set. I'm neutral on dict addition, but dict subtraction seemed an odd extension to the proposal. Using b in a - b solely for its keys, and ignoring its values, seems weird to me. Even if dict1 - dict2 were added to the language, I think I'd steer clear of it as being too obscure. I'm not going to get sucked into this debate, but I'd be happy to see the subtraction operator part of the proposal withdrawn. Paul From steve at pearwood.info Mon Mar 4 17:56:32 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 5 Mar 2019 09:56:32 +1100 Subject: [Python-ideas] Dict joining using + and += In-Reply-To: References: <0433166f-4f88-93d6-327b-ed1708c17fa7@gmail.com> <378B4AB0-A949-462F-ADCE-10D40E3969CC@killingar.net> <5C78506E.6040600@canterbury.ac.nz> Message-ID: <20190304225631.GX4465@ando.pearwood.info> On Sat, Mar 02, 2019 at 11:14:18AM -0800, Raymond Hettinger wrote: > If the existing code were in the form of "d=e.copy(); d.update(f); > d.update(g); d.update(h)", converting it to "d = e + f + g + h" would > be a tempting but algorithmically poor thing to do (because the > behavior is quadratic). I mention this in the PEP. 
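For the record, Raymond's copy/update chain and his flattened-ChainMap alternative do produce the same mapping; since ChainMap is "first seen wins", the maps must be listed in reverse order to match "last seen wins" (a small sketch with made-up dicts):

```python
from collections import ChainMap

e, f, g, h = {'a': 1}, {'b': 2}, {'a': 10}, {'c': 3}

# Chained copy/update -- what repeated + would do under the hood; each
# step re-copies everything accumulated so far.
d = e.copy()
d.update(f); d.update(g); d.update(h)

# Zero-copy view, flattened once.  ChainMap looks keys up in the first
# map that has them, so the operands go in reverse order.
flat = dict(ChainMap(h, g, f, e))
assert flat == d == {'a': 10, 'b': 2, 'c': 3}
```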
Unlike strings, but like lists and tuples, I don't expect that this will be a problem in practice: - it's easy to put repeated string concatenation in a tight loop; it is harder to think of circumstances where one needs to concatenate lists or tuples, or merge dicts, in a tight loop; - it's easy to have situations where one is concatenating thousands of strings; it's harder to imagine circumstances where one would be merging more than three or four dicts; - concatenation s1 + s2 + ... for strings, lists or tuples results in a new object of length equal to the sum of the lengths of each of the inputs, so the output is constantly growing; but merging dicts d1 + d2 + ... typically results in a smaller object of length equal to the number of unique keys. > Most likely, the right thing to do would be > "d = ChainMap(e, f, g, h)" for a zero-copy solution or "d = > dict(ChainMap(e, f, g, h))" to flatten the result without incurring > quadratic costs. Both of those are short and clear. And both result in the opposite behaviour of what you probably intended if you were trying to match e + f + g + h. Dict merging/updating operates on "last seen wins", but ChainMap is "first seen wins". To get the same behaviour, we have to write the dicts in opposite order compared to update, from most to least specific: # least specific to most specific prefs = site_defaults + user_defaults + document_prefs # most specific to least prefs = dict(ChainMap(document_prefs, user_defaults, site_defaults)) To me, the latter feels backwards: I'm applying document prefs first, and then trusting that the ChainMap doesn't overwrite them with the defaults. I know that's guaranteed behaviour, but every time I read it I'll feel the need to check :-) > Lastly, I'm still bugged by use of the + operator for replace-logic > instead of additive-logic. With numbers and lists and Counters, the > plus operator creates a new object where all the contents of each > operand contribute to the result. 
With dicts, some of the contents > for the left operand get thrown-away. This doesn't seem like addition > to me (IIRC that is also why sets have "|" instead of "+"). I'm on the fence here. Addition seems to be the most popular operator (it often gets requested) but you might be right that this is more like a union operation than concatenation or addition operation. MRAB also suggested this earlier. One point in its favour is that + goes nicely with - but on the other hand, sets have | and - with no + and that isn't a problem. -- Steven From steve at pearwood.info Mon Mar 4 18:11:02 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 5 Mar 2019 10:11:02 +1100 Subject: [Python-ideas] Dict joining using + and += In-Reply-To: References: <5C78506E.6040600@canterbury.ac.nz> <20190304164408.GV4465@ando.pearwood.info> Message-ID: <20190304231102.GY4465@ando.pearwood.info> On Mon, Mar 04, 2019 at 11:56:54AM -0600, Dan Sommers wrote: > On 3/4/19 10:44 AM, Steven D'Aprano wrote: > > > If you know ahead of time which order you want, you can simply reverse > > it: > > > > # prefs = site_defaults + user_defaults + document_prefs > > prefs = dict(ChainMap(document_prefs, user_defaults, site_defaults)) > > > > but that seems a little awkward to me, and reads backwards. I'm used to > > thinking reading left-to-right, not right-to-left. > > I read that as use document preferences first, then user > defaults, then site defautls, exactly as I'd explain the > functionality to someone else. If you explained it to me like that, with the term "use", I'd think that the same feature would be done three times: once with document prefs, then with user defaults, then site defaults. Clearly that's not what you mean, so I'd then have to guess what you meant by "use", since you don't actually mean use. That would leave me trying to guess whether you meant that *site defaults* overrode document prefs or the other way. 
I don't like guessing, so I'd probably explicitly ask: "Wait, I'm confused, which wins? It sounds like site defaults wins, surely that's not what you meant." > So maybe we're agreeing: if you think in terms of updating > a dictionary of preferences, then maybe it reads backwards, > but if you think of implementing features, then adding > dictionaries of preferences reads backwards. Do you think "last seen wins" is backwards for dict.update() or for command line options? -- Steven From delgan.py at gmail.com Mon Mar 4 18:30:06 2019 From: delgan.py at gmail.com (Del Gan) Date: Tue, 5 Mar 2019 00:30:06 +0100 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <01ce6a83-534c-2e27-acc9-f3be4601b559@kynesim.co.uk> References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> <01ce6a83-534c-2e27-acc9-f3be4601b559@kynesim.co.uk> Message-ID: Hi. > the augmented assignment version allows anything the ``update`` method allows, such as iterables of key/value pairs I am a little surprised by this choice. First, this means that "a += b" would not be equivalent to "a = a + b". Is there other built-in types which act differently if called with the operator or augmented assignment version? Secondly, that would imply I would no longer be able to infer the type of "a" while reading "a += [('foo', 'bar')]". Is it a list? A dict? Those two points make me uncomfortable with "+=" strictly behaving like ".update()". 2019-03-04 17:44 UTC+01:00, Rhodri James : > On 04/03/2019 15:12, James Lu wrote: >> >>> On Mar 4, 2019, at 10:02 AM, Stefan Behnel wrote: >>> >>> INADA Naoki schrieb am 04.03.19 um 11:15: >>>> Why statement is not enough? >>> >>> I'm not sure I understand why you're asking this, but a statement is >>> "not >>> enough" because it's a statement and not an expression. It does not >>> replace >>> the convenience of an expression. 
>>> >>> Stefan >> There is already an expression for key-overriding merge. Why do we need a >> new one? > > Because the existing one is inobvious, hard to discover and ugly. > > -- > Rhodri James *-* Kynesim Ltd > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > From 2QdxY4RzWzUUiLuE at potatochowder.com Mon Mar 4 18:49:41 2019 From: 2QdxY4RzWzUUiLuE at potatochowder.com (Dan Sommers) Date: Mon, 4 Mar 2019 17:49:41 -0600 Subject: [Python-ideas] Dict joining using + and += In-Reply-To: <20190304231102.GY4465@ando.pearwood.info> References: <5C78506E.6040600@canterbury.ac.nz> <20190304164408.GV4465@ando.pearwood.info> <20190304231102.GY4465@ando.pearwood.info> Message-ID: <4e3141e2-f82f-731b-6909-c38bfa9b8240@potatochowder.com> On 3/4/19 5:11 PM, Steven D'Aprano wrote: > On Mon, Mar 04, 2019 at 11:56:54AM -0600, Dan Sommers wrote: >> On 3/4/19 10:44 AM, Steven D'Aprano wrote: >> >> > If you know ahead of time which order you want, you can simply reverse >> > it: >> > >> > # prefs = site_defaults + user_defaults + document_prefs >> > prefs = dict(ChainMap(document_prefs, user_defaults, site_defaults)) >> > >> > but that seems a little awkward to me, and reads backwards. I'm used to >> > thinking reading left-to-right, not right-to-left. >> >> I read that as use document preferences first, then user >> defaults, then site defautls, exactly as I'd explain the >> functionality to someone else. > > If you explained it to me like that, with the term "use", I'd think that > the same feature would be done three times: once with document prefs, > then with user defaults, then site defaults. > > Clearly that's not what you mean, so I'd then have to guess what you > meant by "use", since you don't actually mean use. 
That would leave me > trying to guess whether you meant that *site defaults* overrode document > prefs or the other way. > > I don't like guessing, so I'd probably explicitly ask: "Wait, I'm > confused, which wins? It sounds like site defaults wins, surely that's > not what you meant." You're right: "use" is the wrong word. Perhaps "prefer" is more appropriate. To answer the question of which wins: the first one in the list [document, user, site] that contains a given preference in question. Users don't see dictionary updates; they see collections of preferences in order of priority. Documentation is hard. :-) Sorry. >> So maybe we're agreeing: if you think in terms of updating >> a dictionary of preferences, then maybe it reads backwards, >> but if you think of implementing features, then adding >> dictionaries of preferences reads backwards. > > Do you think "last seen wins" is backwards for dict.update() or for > command line options? As a user, "last seen wins" is clearly superior for command line options. As a programmer, because object methods operate on their underlying object, it's pretty obvious that d1.update(d2) starts with d1 and applies the changes expressed in d2, which is effectively "last seen wins." If I resist the temptation to guess in the face of ambiguity, though, I don't think that d1 + d2 is any less ambiguous than a hypothetical dict_update(d1, d2) function. When I see a + operator, I certainly don't think of one operand or the other winning. 
From guido at python.org Mon Mar 4 18:57:38 2019 From: guido at python.org (Guido van Rossum) Date: Mon, 4 Mar 2019 15:57:38 -0800 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> <01ce6a83-534c-2e27-acc9-f3be4601b559@kynesim.co.uk> Message-ID: On Mon, Mar 4, 2019 at 3:31 PM Del Gan wrote: > > the augmented assignment version allows anything the ``update`` method > allows, such as iterables of key/value pairs > > I am a little surprised by this choice. > > First, this means that "a += b" would not be equivalent to "a = a + > b". Is there other built-in types which act differently if called with > the operator or augmented assignment version? > Yes. The same happens for lists. [1] + 'a' is a TypeError, but a += 'a' works: >>> a = [1] >>> a + 'a' Traceback (most recent call last): File "", line 1, in TypeError: can only concatenate list (not "str") to list >>> a += 'a' >>> a [1, 'a'] >>> > Secondly, that would imply I would no longer be able to infer the type > of "a" while reading "a += [('foo', 'bar')]". Is it a list? A dict? > Real code more likely looks like "a += b" and there you already don't have much of a clue -- the author of the code should probably communicate this using naming conventions or type annotations. > Those two points make me uncomfortable with "+=" strictly behaving > like ".update()". > And yet that's how it works for lists. (Note that dict.update() still has capabilities beyond +=, since you can also invoke it with keyword args.) -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
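Guido's list example above can be re-run as-is: `+=` accepts any iterable (it behaves like `extend`), while `+` insists on another list, so the two spellings already diverge for lists.

```python
# list + list-only vs. list += any-iterable, per the example above.
a = [1]
try:
    a + 'a'
except TypeError as exc:
    print(exc)      # can only concatenate list (not "str") to list
a += 'a'            # fine: += iterates over the right-hand side
print(a)            # [1, 'a']
```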
URL: From jamtlu at gmail.com Mon Mar 4 15:26:20 2019 From: jamtlu at gmail.com (James Lu) Date: Mon, 4 Mar 2019 15:26:20 -0500 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <20190304152557.GT4465@ando.pearwood.info> References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <98F8539E-D119-4299-B289-1EDA729FF10B@gmail.com> <3200242B-8002-47FF-9FAD-52CA3B6410E6@gmail.com> <20190304152557.GT4465@ando.pearwood.info> Message-ID: <2EEBB26D-834A-437D-BE03-B1A6B7A085A6@gmail.com> > On Mon, Mar 04, 2019 at 10:01:23AM -0500, James Lu wrote: > > If you want to merge it without a KeyError, learn and use the more explicit {**d1, **d2} syntax. On Mar 4, 2019, at 10:25 AM, Steven D'Aprano wrote: > In your previous email, you said the {**d ...} syntax was implicit: > > In other words, explicit + is better than implicit {**, **}, unless > explicitly suppressed. Here + is explicit whereas {**, **} is > implicitly allowing inclusive keys, and the KeyError is expressly > suppressed by virtue of not using the {**, **} syntax. > > It is difficult to take your "explicit/implicit" argument seriously when > you cannot even decide which is which. I misspoke. > In your previous email, you said the {**d ...} syntax was implicit: > > In other words, explicit + is better than implicit {**, **}, unless > explicitly suppressed. Here + is explicit whereas {**, **} is > implicitly allowing inclusive keys, and the KeyError is expressly > suppressed by virtue of not using the {**, **} syntax. > > It is difficult to take your "explicit/implicit" argument seriously when > you cannot even decide which is which. Yes, + is explicit. {**, **} is implicit. My argument: We should set the standard that + is for non-conflicting merge and {**, **} is for overriding merge. 
That standard should be so that + explicitly asserts that the keys will not conflict whereas {**d1, **d2} is ambiguous on why d2 is overriding d1.^ ^Presumably you're making a copy of d1 so why should d3 have d2 take priority? The syntax deserves a comment, perhaps explaining that items from d2 are newer in time or that the items in d1 are always nonces. The + acts as an implicit assertion and an opportunity to catch an invariant violation or data input error. Give me an example of a situation where you need a third dictionary from two existing dictionaries and having a conflict where a key has a different value in both is desirable behavior. The situation where non-conflicting merge is what's desired is more common and in that case throwing an exception in the case of a conflicting value is a good thing, a way to catch code smell. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jamtlu at gmail.com Mon Mar 4 19:53:17 2019 From: jamtlu at gmail.com (James Lu) Date: Mon, 4 Mar 2019 19:53:17 -0500 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <20190304162543.GU4465@ando.pearwood.info> References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> <9C42F773-A486-479E-BD9F-14EB588E5093@gmail.com> <20190304162543.GU4465@ando.pearwood.info> Message-ID: On Mar 4, 2019, at 11:25 AM, Steven D'Aprano wrote: >> How many situations would you need to make a copy of a dictionary and >> then update that copy and override old keys from a new dictionary? >> > > Very frequently. 
> > That's why we have a dict.update method, which if I remember correctly, > was introduced in Python 1.5 because people were frequently re-inventing > the same wheel: > > def update(d1, d2): > for key in d2.keys(): > d1[key] = d2[key] > > > You should have a look at how many times it is used in the standard > library: > > [steve at ando cpython]$ cd Lib/ > [steve at ando Lib]$ grep -U "\.update[(]" *.py */*.py | wc -l > 373 > > Now some of those are false positives (docstrings, comments, non-dicts, > etc) but that still leaves a lot of examples of wanting to override old > keys. This is a very common need. Wanting an exception if the key > already exists is, as far as I can tell, very rare. It is very rare when you want to modify an existing dictionary. It's not rare at all when you're creating a new one. > > It is true that many of the examples in the std lib involve updating an > existing dict, not creating a new one. But that's only to be expected: > since Python didn't provide an obvious functional version of update, > only an in-place version, naturally people get used to writing > in-place code. My question was "How many situations would you need to make a copy of a dictionary and then update that copy and override old keys from a new dictionary?" Try to really think about my question, instead of answering with half of it to dismiss my point. -------------- next part -------------- An HTML attachment was scrubbed... 
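The pre-1.5 re-invented wheel quoted above, with its assignment written out, is just "last seen wins" key by key (the `prefs` example data here is made up):

```python
# The helper everyone used to re-invent before dict.update existed:
# plain per-key assignment, so the last value seen for a key wins.
def update(d1, d2):
    for key in d2.keys():
        d1[key] = d2[key]

prefs = {'colour': 'blue', 'font': 'mono'}
update(prefs, {'colour': 'red'})
print(prefs)  # {'colour': 'red', 'font': 'mono'}
```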
URL: From jamtlu at gmail.com Mon Mar 4 20:01:38 2019 From: jamtlu at gmail.com (James Lu) Date: Mon, 4 Mar 2019 20:01:38 -0500 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <20190304162543.GU4465@ando.pearwood.info> References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> <9C42F773-A486-479E-BD9F-14EB588E5093@gmail.com> <20190304162543.GU4465@ando.pearwood.info> Message-ID: > On Mar 4, 2019, at 11:25 AM, Steven D'Aprano wrote: > > The PEP gives a good example of when this "invariant" would be > unnecessarily restrictive: > > For example, updating default configuration values with > user-supplied values would most often fail under the > requirement that keys are unique:: > > prefs = site_defaults + user_defaults + document_prefs > > > Another example would be when reading command line options, where the > most common convention is for "last option seen" to win: > > [steve at ando Lib]$ grep --color=always --color=never "zero" f*.py > fileinput.py: numbers are zero; nextfile() has no effect. > fractions.py: # the same way for any finite a, so treat a as zero. > functools.py: # prevent their ref counts from going to zero during > Indeed, in this case you would want to use {**, **} syntax. > and the output is printed without colour. > > (I've slightly edited the above output so it will fit in the email > without wrapping.) > > The very name "update" should tell us that the most useful behaviour is > the one the devs decided on back in 1.5: have the last seen value win. > How can you update values if the operation raises an error if the key > already exists? If this behaviour is ever useful, I would expect that it > will be very rare. > An update or merge is effectively just running through a loop setting > the value of a key. See the pre-Python 1.5 function above. 
Having update > raise an exception if the key already exists would be about as useful as > having ``d[key] = value`` raise an exception if the key already exists. > > Unless someone can demonstrate that the design of dict.update() was a > mistake You're making a logical mistake here. + isn't supposed to have .update's behavior and it never was supposed to. > , and the "require unique keys" behaviour is more common, I just have. 99% of the time you want to have keys from one dict override another, you'd be better off doing it in-place and so would be using .update() anyways. > then > I maintain that for the very rare cases you want an exception, you can > subclass dict and overload the __add__ method: Well, yes, the whole point is to define the best default behavior. From jamtlu at gmail.com Mon Mar 4 20:03:18 2019 From: jamtlu at gmail.com (James Lu) Date: Mon, 4 Mar 2019 20:03:18 -0500 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> <01ce6a83-534c-2e27-acc9-f3be4601b559@kynesim.co.uk> Message-ID: <8B8B929B-94F5-45D5-9837-F5C26D1DF7EE@gmail.com> By the way, my "no same keys with different values" proposal would not apply to +=. 
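The "non-conflicting merge" semantics argued for in this subthread can be sketched as the suggested dict subclass with an overloaded __add__; `StrictDict` is a hypothetical name, not anything proposed for the stdlib:

```python
# Hypothetical sketch of a + that refuses conflicting merges: raise if
# any key appears in both operands with different values.
class StrictDict(dict):
    def __add__(self, other):
        conflicts = {k for k in self.keys() & other.keys()
                     if self[k] != other[k]}
        if conflicts:
            raise KeyError(f"conflicting values for keys: {conflicts}")
        new = self.copy()
        new.update(other)
        return new

print(StrictDict({'a': 1}) + {'b': 2})   # fine: no overlapping keys
try:
    StrictDict({'a': 1}) + {'a': 2}      # conflicting value for 'a'
except KeyError as exc:
    print(exc)
```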
From brett at python.org Mon Mar 4 20:54:25 2019 From: brett at python.org (Brett Cannon) Date: Mon, 4 Mar 2019 17:54:25 -0800 Subject: [Python-ideas] Dict joining using + and += In-Reply-To: References: <0433166f-4f88-93d6-327b-ed1708c17fa7@gmail.com> <378B4AB0-A949-462F-ADCE-10D40E3969CC@killingar.net> <5C78506E.6040600@canterbury.ac.nz> <20190304164408.GV4465@ando.pearwood.info> Message-ID: On Mon, Mar 4, 2019 at 1:29 PM Neil Girdhar wrote: > On Mon, Mar 4, 2019 at 3:58 PM Christopher Barker > wrote: > > > > > > > > On Mon, Mar 4, 2019 at 12:41 PM Guido van Rossum > wrote: > >> > >> Honestly I would rather withdraw the subtraction operators than reopen > the discussion about making dict more like set. > > I think that's unfortunate. > > > > > > +1 > > > > I think the "dicts are like more-featured" sets is a math-geek > perspective, and unlikely to make things more clear for the bulk of users. > And may make it less clear. > > I'd say reddit has some pretty "common users", and they're having a > discussion of this right now > ( > https://www.reddit.com/r/Python/comments/ax4zzb/pep_584_add_and_operators_to_the_builtin_dict/ > ). > The most popular comment is how it should be |. > > Anyway, I think that following the mathematical metaphors tends to > make things more intuitive in the long run. Only if you know the mathematical metaphors. ;) > Python is an adventure. > You learn it for years and then it all makes sense. If dict uses +, > yes, new users might find that sooner than |. However, when they > learn set union, I think they will wonder why it's not consistent with > dict union. > Not to me. I barely remember that | is supported for sets, but I sure know about + and lists (and strings, etc.) and I'm willing to bet the vast majority of folks are the some; addition is much more widely known than set theory. > > The PEP's main justification for + is that it matches Counter, but > counter is adding the values whereas | doesn't touch the values. 
I > think it would be good to at least make a list of pros and cons of > each proposed syntax. > I suspect Steven will add more details to a Rejected Ideas section. > > > We need to be careful -- there are a lot more math geeks on this list > than in the general Python coding population. > > > > Simply adding "+" is a non-critical nice to have, but it seems unlikely > to really confuse anyone. > I agree with Chris. -Brett > > > > -CHB > > > > > > -- > > Christopher Barker, PhD > > > > Python Language Consulting > > - Teaching > > - Scientific Software Development > > - Desktop GUI and Web Development > > - wxPython, numpy, scipy, Cython > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From davidfstr at gmail.com Tue Mar 5 00:37:54 2019 From: davidfstr at gmail.com (David Foster) Date: Mon, 4 Mar 2019 21:37:54 -0800 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <20190301162645.GM4465@ando.pearwood.info> References: <20190301162645.GM4465@ando.pearwood.info> Message-ID: <8f412d1d-4327-3ef3-bffe-cee8e66f5d68@gmail.com> I have seen a ton of discussion about what dict addition should do, but have seen almost no mention of dict difference. This lack of discussion interest combined with me not recalling having needed the proposed subtraction semantics personally makes me wonder if we should hold off on locking in subtraction semantics just yet. Perhaps we could just scope the proposal to dictionary addition only for now? If I *were* to define dict difference, my intuition suggests supporting a second operand that is any iterable of keys and not just dicts. (Augmented dict subtraction is already proposed to accept such a broader second argument.) 
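The broader second operand suggested above — any iterable of keys, not just a dict — can be sketched as a standalone helper (`dict_sub` is a made-up name for illustration):

```python
# Hypothetical dict difference where the right operand is any iterable
# of keys; a dict works too, since iterating a dict yields its keys.
def dict_sub(d, keys):
    unwanted = set(keys)
    return {k: v for k, v in d.items() if k not in unwanted}

d = {'some': 5, 'extra': 10, 'things': 55}
print(dict_sub(d, ['extra', 'things']))   # {'some': 5}
print(dict_sub(d, {'extra': 0}))          # {'some': 5, 'things': 55}
```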
David Foster | Seattle, WA, USA On 3/1/19 8:26 AM, Steven D'Aprano wrote: > Attached is a draft PEP on adding + and - operators to dict for > discussion. > > This should probably go here: > > https://github.com/python/peps > > but due to technical difficulties at my end, I'm very limited in what I > can do on Github (at least for now). If there's anyone who would like to > co-author and/or help with the process, that will be appreciated. > > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > From raymond.hettinger at gmail.com Tue Mar 5 00:52:04 2019 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Mon, 4 Mar 2019 21:52:04 -0800 Subject: [Python-ideas] Dict joining using + and += In-Reply-To: References: <0433166f-4f88-93d6-327b-ed1708c17fa7@gmail.com> <378B4AB0-A949-462F-ADCE-10D40E3969CC@killingar.net> <5C78506E.6040600@canterbury.ac.nz> <20190304164408.GV4465@ando.pearwood.info> Message-ID: > On Mar 4, 2019, at 11:24 AM, Guido van Rossum wrote: > > * Regarding how often this is needed, we know that this is proposed and discussed at length every few years, so I think this will fill a real need. I'm not sure that conclusion follows from the premise :-) Some ideas get proposed routinely because they are obvious things to propose, not because people actually need them. One hint is that the proposals always have generic variable names, "d = d1 + d2", and another is that they are almost never accompanied by actual use cases or real code that would be made better. I haven't seen anyone in this thread say they would use this more than once a year or that their existing code was unclear or inefficient in any way. 
The lack of dict addition support in other languages (like Java example) is another indicator that there isn't a real need -- afaict there is nothing about Python that would cause us to have a unique requirement that other languages don't have. FWIW, there are some downsides to the proposal -- it diminishes some of the unifying ideas about Python that I typically present on the first day of class: * One notion is that the APIs nudge users toward good code. The "copy.copy()" function has to be imported -- that minor nuisance is a subtle hint that copying isn't good for you. Likewise for dicts, writing "e=d.copy(); e.update(f)" is a minor nuisance that either serves to dissuade people from unnecessary copying or at least will make very clear what is happening. The original motivating use case for ChainMap() was to make a copy free replacement for excessively slow dict additions in ConfigParser. Giving a plus-operator to mappings is an invitation to writing code that doesn't scale well. * Another unifying notion is that the star-operator represents repeat addition across multiple data types. It is a nice demo to show that "a * 5 == a + a + a + a + a" where "a" is an int, float, complex, str, bytes, tuple, or list. Giving __add__() to dicts breaks this pattern. * When teaching dunder methods, the usual advice regarding operators is to use them only when their meaning is unequivocal; otherwise, have a preference for named methods where the method name clarifies what is being done -- don't use train+car to mean train.shunt_to_middle(car). For dicts that would mean not having the plus-operator implement something that isn't inherently additive (it applies replace/overwrite logic instead), that isn't commutative, and that isn't linear when applied in succession (d1+d2+d3). * In the advanced class where C extensions are covered, the organization of the slots is shown as a guide to which methods make sense together: tp_as_number, tp_as_sequence, and tp_as_mapping. 
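The star-operator pattern mentioned above is easy to check across the existing types; it holds for every sequence and number type, which is exactly the consistency a replace-logic dict __add__ would break:

```python
# a * 5 == a + a + a + a + a for int, float, complex, str, bytes,
# tuple and list alike -- the unifying demo described above.
for a in (3, 2.5, (1 + 2j), 'ab', b'ab', (1, 2), [1, 2]):
    assert a * 5 == a + a + a + a + a
```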
For dicts to gain the requisite methods, they will have to become numbers (in the sense of filling out the tp_as_number slots). That will slow down the abstract methods that search the slot groups, skipping over groups marked as NULL. It also exposes method groups that don't typically appear together, blurring their distinction.

* Lastly, there is a vague piece of zen-style advice, "if many things in the language have to change to implement idea X, it stops being worth it". In this case, it means that every dict-like API and the related abstract methods and typing equivalents would need to grow support for addition in mappings (would it even make sense to add shelve objects or os.environ objects together?)

That's my two cents worth. I'm ducking out now (nothing more to offer on the subject). Guido's participation in the thread has given it an air of inevitability so this post will likely not make a difference.

Raymond

From gadgetsteve at live.co.uk Tue Mar 5 01:00:21 2019 From: gadgetsteve at live.co.uk (Steve Barnes) Date: Tue, 5 Mar 2019 06:00:21 +0000 Subject: [Python-ideas] Add a "week" function or attribute to datetime.date In-Reply-To: References: Message-ID:

If anybody is looking for such components, there is wx.DateTime (https://wxpython.org/Phoenix/docs/html/datetime_overview.html); it is derived from wxDateTime (https://docs.wxwidgets.org/3.1/classwx_date_time.html) and should support all of its methods, including things like DST changes, etc. It supports dates from about 4714 B.C. to some 480 million years in the future.

From: Python-ideas On Behalf Of Christopher Barker Sent: 04 March 2019 21:09 To: python-ideas Subject: Re: [Python-ideas] Add a "week" function or attribute to datetime.date

There are all sorts of "Calendar" operations one might want -- I think those belong in a separate library, rather than a few tacked on to datetime.
-CHB

On Fri, Mar 1, 2019 at 2:48 AM Robert Vanden Eynde > wrote: Currently one can do week = d.isocalendar()[1] The iso definition of a week number has some nice properties. robertvandeneynde.be

On Fri, 1 Mar 2019, 11:44 Antonio Galán, > wrote: The week number usually refers to the week of the year, but the week of the month is also interesting, for example for some holidays which depend on the week number of the month, so in analogy with "weekday" we can use "yearweek" and "monthweek"

On Fri, 1 March 2019 at 9:33, Adrien Ricocotam > wrote: I like the idea. But how to distinguish it from the number of weeks passed since the beginning of the month? But that's great.

On Fri 1 Mar 2019 at 09:31, Antonio Galán > wrote: Hi, datetime.date.today() (or any other day) has attributes .year and .month which return the year and the month of that date, and it also has a function weekday() which returns the number of the day in the week. I think it is a good idea to add a function or attribute "week" which returns the number of the week in the year. It is useful, for example, to execute scripts once a week.

_______________________________________________ Python-ideas mailing list Python-ideas at python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/

-- Christopher Barker, PhD

Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython -------------- next part -------------- An HTML attachment was scrubbed...
URL: From amber.yust at gmail.com Tue Mar 5 01:18:13 2019 From: amber.yust at gmail.com (Amber Yust) Date: Mon, 4 Mar 2019 22:18:13 -0800 Subject: [Python-ideas] Dict joining using + and += In-Reply-To: References: <0433166f-4f88-93d6-327b-ed1708c17fa7@gmail.com> <378B4AB0-A949-462F-ADCE-10D40E3969CC@killingar.net> <5C78506E.6040600@canterbury.ac.nz> <20190304164408.GV4465@ando.pearwood.info> Message-ID:

Adding the + operator for dictionaries feels like it would be a mistake in that it offers at most sugar-y benefits, but introduces the significant drawback of making it easier to introduce unintended errors. This would be the first instance of "addition" where the result can potentially lose/overwrite data (lists and strings both preserve the full extent of each operand; Counters include the full value from each operand, etc). Combining dictionaries is fundamentally an operation that requires more than one piece of information, because there's no single well-defined way to combine a pair of them. Off the top of my head, I can think of at least 2 different common options (replacement aka .update(), combination of values a la Counter). Neither of these is really a more valid "addition" of dictionaries. For specific dict-like subclasses, addition may make sense - Counter is a great example of this, because the additional context adds definition to the most logical method via which two instances would be combined. If anything, this seems like an argument to avoid implementing __add__ on dict itself, to leave the possibility space open for interpretation by more specific classes.

*From: *Raymond Hettinger *Date: *Mon, Mar 4, 2019 at 9:53 PM *To: *Guido van Rossum *Cc: *python-ideas

> > > On Mar 4, 2019, at 11:24 AM, Guido van Rossum wrote: > > > > * Regarding how often this is needed, we know that this is proposed and > discussed at length every few years, so I think this will fill a real need.
> > I'm not sure that conclusion follows from the premise :-) Some ideas get > proposed routinely because they are obvious things to propose, not because > people actually need them. One hint is that the proposals always have > generic variable names, "d = d1 + d2", and another is that they are almost > never accompanied by actual use cases or real code that would be made > better. I haven't seen anyone in this thread say they would use this more > than once a year or that their existing code was unclear or inefficient in > any way. The lack of dict addition support in other languages (like Java > example) is another indicator that there isn't a real need -- afaict there > is nothing about Python that would cause us to have a unique requirement > that other languages don't have. > > FWIW, there are some downsides to the proposal -- it diminishes some of > the unifying ideas about Python that I typically present on the first day > of class: > > * One notion is that the APIs nudge users toward good code. The > "copy.copy()" function has to be imported -- that minor nuisance is a > subtle hint that copying isn't good for you. Likewise for dicts, writing > "e=d.copy(); e.update(f)" is a minor nuisance that either serves to > dissuade people from unnecessary copying or at least will make very clear > what is happening. The original motivating use case for ChainMap() was to > make a copy free replacement for excessively slow dict additions in > ConfigParser. Giving a plus-operator to mappings is an invitation to > writing code that doesn't scale well. > > * Another unifying notion is that the star-operator represents repeat > addition across multiple data types. It is a nice demo to show that "a * 5 > == a + a + a + a + a" where "a" is an int, float, complex, str, bytes, > tuple, or list. Giving __add__() to dicts breaks this pattern. 
> > * When teaching dunder methods, the usual advice regarding operators is to > use them only when their meaning is unequivocal; otherwise, have a > preference for named methods where the method name clarifies what is being > done -- don't use train+car to mean train.shunt_to_middle(car). For dicts > that would mean not having the plus-operator implement something that isn't > inherently additive (it applies replace/overwrite logic instead), that > isn't commutative, and that isn't linear when applied in succession > (d1+d2+d3). > > * In the advanced class where C extensions are covered, the organization > of the slots is shown as a guide to which methods make sense together: > tp_as_number, tp_as_sequence, and tp_as_mapping. For dicts to gain the > requisite methods, they will have to become numbers (in the sense of > filling out the tp_as_number slots). That will slow down the abstract > methods that search the slot groups, skipping over groups marked as NULL. > It also exposes method groups that don't typically appear together, > blurring their distinction. > > * Lastly, there is a vague piece of zen-style advice, "if many things in > the language have to change to implement idea X, it stops being worth it". > In this case, it means that every dict-like API and the related abstract > methods and typing equivalents would need to grow support for addition in > mappings (would it even make sense to add to shelve objects or os.environ > objects together?) > > That's my two cents worth. I'm ducking out now (nothing more to offer on > the subject). Guido's participation in the thread has given it an air of > inevitability so this post will likely not make a difference. 
> > > Raymond > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From brandtbucher at gmail.com Tue Mar 5 01:48:07 2019 From: brandtbucher at gmail.com (Brandt Bucher) Date: Mon, 4 Mar 2019 22:48:07 -0800 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <8f412d1d-4327-3ef3-bffe-cee8e66f5d68@gmail.com> References: <20190301162645.GM4465@ando.pearwood.info> <8f412d1d-4327-3ef3-bffe-cee8e66f5d68@gmail.com> Message-ID: I agree with David here. Subtraction wasn?t even part of the original discussion ? it seems that it was only added as an afterthought because Guido felt they were natural to propose together and formed a nice symmetry. It?s odd that RHS values are not used at all, period. Further, there?s no precedent for bulk sequence/mapping removals like this... except for sets, for which it is certainly justified. I?ve had the opportunity to play around with my reference implementation over the last few days, and despite my initial doubts, I have *absolutely* fallen in love with dictionary addition ? I even accidentally tried to += two dictionaries at work on Friday (a good, but frustrating, sign). For context, I was updating a module-level mapping with an imported one, a use case I hadn?t even previously considered. I have tried to fall in love with dict subtraction the same way, but every code sketch/test I come up with feels contrived and hack-y. I?m indifferent towards it, at best. TL;DR: I?ve lived with both for a week. Addition is now habit, subtraction is still weird. > Nice branch name! :) I couldn?t help myself. 
Brandt

From songofacandy at gmail.com Tue Mar 5 02:03:11 2019 From: songofacandy at gmail.com (INADA Naoki) Date: Tue, 5 Mar 2019 16:03:11 +0900 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> Message-ID:

On Tue, Mar 5, 2019 at 12:02 AM Stefan Behnel wrote: > > INADA Naoki wrote on 04.03.19 at 11:15: > > Why statement is not enough? > > I'm not sure I understand why you're asking this, but a statement is "not > enough" because it's a statement and not an expression. It does not replace > the convenience of an expression. > > Stefan >

That seems tautological and says nothing. What is the "convenience of an expression"? Is it needed to make Python a more readable language? Anyway, if "there is an expression" is the main reason for this proposal, a symbolic operator is not necessary. `new = d1.updated(d2)` or `new = dict.merge(d1, d2)` would be enough. Python prefers names over symbols in general. Symbols are readable and understandable only when they have a good math metaphor. Sets have symbolic operators because they are well known from set math, not because set is frequently used. In the case of dicts, there is no simple metaphor in math. It is just cryptic and hard to Google.
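As a rough sketch of the named spelling suggested above -- `dict.merge` written here as a free function, since no such method exists on dict today (the name and signature are illustrative assumptions, not part of any accepted proposal):

```python
def merge(d1, d2):
    """Hypothetical helper: return a new dict with d1's pairs updated by d2's.

    Equivalent to {**d1, **d2}: on duplicate keys the second dict wins,
    matching the behaviour of dict.update().
    """
    new = dict(d1)   # shallow copy of the first mapping; d1 is untouched
    new.update(d2)   # later values overwrite earlier ones
    return new
```

Note that, like `{**d1, **d2}`, this sketch always returns a plain dict regardless of the input types.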
-- INADA Naoki

From boxed at killingar.net Tue Mar 5 02:05:25 2019 From: boxed at killingar.net (Anders Hovmöller) Date: Tue, 5 Mar 2019 08:05:25 +0100 Subject: [Python-ideas] Dict joining using + and += In-Reply-To: References: <0433166f-4f88-93d6-327b-ed1708c17fa7@gmail.com> <378B4AB0-A949-462F-ADCE-10D40E3969CC@killingar.net> <5C78506E.6040600@canterbury.ac.nz> <20190304164408.GV4465@ando.pearwood.info> Message-ID: <99CADE31-AEF8-4B49-AD4E-B613A15B730F@killingar.net>

> Adding the + operator for dictionaries feels like it would be a mistake in that it offers at most sugar-y benefits, but introduces the significant drawback of making it easier to introduce unintended errors.

I disagree. This argument only really applies to the case "a = a + b", not "a = b + c". Making it easier and more natural to produce code that doesn't mutate in place is something that should reduce errors, not make them more common. The big mistake here was * for strings, which is unusual, would be just as well served by a method, and ensures that type errors blow up much later than they could have. This type of mistake for dicts when you expected numbers is a much stronger argument against this proposal in my opinion. Let's not create another pitfall! The current syntax is a bit unwieldy but is really fine.

/ Anders

From storchaka at gmail.com Tue Mar 5 02:23:20 2019 From: storchaka at gmail.com (Serhiy Storchaka) Date: Tue, 5 Mar 2019 09:23:20 +0200 Subject: [Python-ideas] Dict joining using + and += In-Reply-To: References: <0433166f-4f88-93d6-327b-ed1708c17fa7@gmail.com> <378B4AB0-A949-462F-ADCE-10D40E3969CC@killingar.net> <5C78506E.6040600@canterbury.ac.nz> <20190304164408.GV4465@ando.pearwood.info> Message-ID:

04.03.19 21:24, Guido van Rossum wrote: > * Dicts are not like sets because the ordering operators (<, <=, >, >=) > are not defined on dicts, but they implement subset comparisons for > sets.
> I think this is another argument pleading against | as the
> operator to combine two dicts.

Well, I suppose that the next proposition will be to implement the ordering operators for dicts. Because why not? Lists and numbers support them. /sarcasm/

Jokes aside, dicts have more in common with sets than with sequences. Neither can contain duplicate keys/elements. Both have constant computational complexity for the containment test. For both, the size of the merge/union can be less than the sum of the sizes of the original containers. Both have the same restrictions on keys/elements (hashability).

> * Regarding how to construct the new set in __add__, I now think this
> should be done like this:
>
> class dict:
>     ...
>     def __add__(self, other):
>         ...
>         new = self.copy()  # A subclass may or may not choose to override
>         new.update(other)
>         return new
>
> AFAICT this will give the expected result for defaultdict -- it keeps
> the default factory from the left operand (i.e., self).

No builtin type that implements __add__ uses the copy() method. Dict would be the only exception to the general rule. And it would be much less efficient than {**d1, **d2}.

> * Regarding how often this is needed, we know that this is proposed and
> discussed at length every few years, so I think this will fill a real need.

And every time this proposition was rejected. What has changed since it was rejected the last time? We now have the expression form of dict merging ({**d1, **d2}); this should decrease the need for the plus operator for dicts.

From songofacandy at gmail.com Tue Mar 5 02:39:40 2019 From: songofacandy at gmail.com (INADA Naoki) Date: Tue, 5 Mar 2019 16:39:40 +0900 Subject: [Python-ideas] dict.merge(d1, d2, ...) (Counter proposal for PEP 584) Message-ID:

I think some people in favor of PEP 584 just want a single expression for merging dicts without in-place update. But I feel it's an abuse of operator overloading.
I think functions and methods are better than operators unless the operator has a good math metaphor, or is used very frequently, as with concatenating strings.

This is why functions and methods are better:

* Easy to search.
* A name can describe its behavior better than an abused operator.
* Simpler lookup behavior. (e.g. subclass and __iadd__)

Then, I propose a `dict.merge` method. It is an out-of-place version of `dict.update`, but accepts multiple dicts. (dict.update() could also be changed to accept multiple dicts, but that is out of scope here).

* d = d1.merge(d2)      # d = d1.copy(); d.update(d2)
* d = d1.merge(d2, d3)  # d = d1.copy(); d.update(d2); d.update(d3)
* d = d1.merge(iter_of_pairs)
* d = d1.merge(key=value)

## Merits of dict.merge() over operator +

* Easy to Google (e.g. "python dict merge").
* Easy to help(dict.merge). (or dict.merge? in IPython)
* No inefficiency of d1+d2+d3+...+dN, or sum(list_of_many_dicts)
* Type of returned value is always the same as d1.copy(). No issubclass, no __iadd__.

## Why not dict.updated()?

sorted() is a function, so it looks different from L.sort(). But d.updated() is very similar to d.update() to human eyes.

## How about d1 - d2?

If it is really useful, it can be implemented as a method too: dict.discard(sequence_of_keys)

Regards,

-- INADA Naoki

From rosuav at gmail.com Tue Mar 5 03:22:47 2019 From: rosuav at gmail.com (Chris Angelico) Date: Tue, 5 Mar 2019 19:22:47 +1100 Subject: [Python-ideas] dict.merge(d1, d2, ...) (Counter proposal for PEP 584) In-Reply-To: References: Message-ID:

On Tue, Mar 5, 2019 at 6:40 PM INADA Naoki wrote: > This is why function and methods are better: > > * Easy to search. > > ## Merits of dict.merge() over operator + > > * Easy to Google (e.g. "python dict merge").

This keeps getting thrown around. It's simply not true.
https://www.google.com/search?q=%7B**d1%2C+**d2%7D First hit when I do that search is Stack Overflow: https://stackoverflow.com/questions/2255878/what-does-mean-in-the-expression-dictd1-d2 which, while it's not specifically about that exact syntax, does mention it in the comments on the question. Symbols ARE searchable. In fact, adding the word "python" to the beginning of that search produces a number of very useful hits, including a Reddit thread on combining dictionaries, and PEP 584 itself. Please can people actually test these lines of argument before reiterating them? ChrisA From songofacandy at gmail.com Tue Mar 5 03:31:16 2019 From: songofacandy at gmail.com (INADA Naoki) Date: Tue, 5 Mar 2019 17:31:16 +0900 Subject: [Python-ideas] dict.merge(d1, d2, ...) (Counter proposal for PEP 584) In-Reply-To: References: Message-ID: On Tue, Mar 5, 2019 at 5:23 PM Chris Angelico wrote: > > On Tue, Mar 5, 2019 at 6:40 PM INADA Naoki wrote: > > This is why function and methods are better: > > > > * Easy to search. > > > > ## Merits of dict.merge() over operator + > > > > * Easy to Google (e.g. "python dict merge"). > > This keeps getting thrown around. It's simply not true. > > https://www.google.com/search?q=%7B**d1%2C+**d2%7D > > First hit when I do that search is Stack Overflow: > > https://stackoverflow.com/questions/2255878/what-does-mean-in-the-expression-dictd1-d2 > > which, while it's not specifically about that exact syntax, does > mention it in the comments on the question. Symbols ARE searchable. In > fact, adding the word "python" to the beginning of that search > produces a number of very useful hits, including a Reddit thread on > combining dictionaries, and PEP 584 itself. > > Please can people actually test these lines of argument before reiterating them? > > ChrisA I'm surprised {**d1, **d2} is searchable. But in my proposal, I compared with one character operator `+`. 
I switched my browser to English and Googled "python str +" https://www.google.com/search?q=python+str+%2B&oq=python+str+%2B As far as I can see, the top result is https://docs.python.org/2/library/string.html When I search for "+" in the page, it's difficult to find string concatenation. I tried Googling "python set union" and "python set |" too. With "union" it is much easier to reach the answer. So I don't think "a name is easier to Google than a symbol" is fake or FUD.

Regards, -- INADA Naoki

From njs at pobox.com Tue Mar 5 03:49:48 2019 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 5 Mar 2019 00:49:48 -0800 Subject: [Python-ideas] dict.merge(d1, d2, ...) (Counter proposal for PEP 584) In-Reply-To: References: Message-ID:

On Mon, Mar 4, 2019 at 11:41 PM INADA Naoki wrote: > Then, I propose `dict.merge` method. It is outer-place version > of `dict.update`, but accepts multiple dicts. (dict.update() > can be updated to accept multiple dicts, but it's not out of scope). > > * d = d1.merge(d2) # d = d1.copy(); d.update(d2) > * d = d1.merge(d2, d3) # d = d1.copy(); d.update(d2); d.update(d3) > * d = d1.merge(iter_of_pairs) > * d = d1.merge(key=value)

Another similar option would be to extend the dict constructor to allow: d = dict(d1, d2, d3, ...)

-n

-- Nathaniel J. Smith -- https://vorpus.org

From jfine2358 at gmail.com Tue Mar 5 03:48:44 2019 From: jfine2358 at gmail.com (Jonathan Fine) Date: Tue, 5 Mar 2019 08:48:44 +0000 Subject: [Python-ideas] Suggestions: dict.flow_update and dict.__add__ Message-ID:

SUMMARY Instead of using dict + dict, perhaps use dict.flow_update. Here, flow_update is just like update, except that it returns self.

BACKGROUND There's a difference between a sorted copy of a list, and sorting the list in place.

>>> items = [2, 0, 1, 9]
>>> sorted(items), items
([0, 1, 2, 9], [2, 0, 1, 9])
>>> items.sort(), items
(None, [0, 1, 2, 9])

In Python, mutating methods generally return None.
Here, this prevents beginners thinking their code has produced a sorted copy of a list, when in fact it has done an in-place sort on the list. If they write

>>> aaa = my_list.sort()

they'll get an error about None when they use aaa. The same goes for dict.update. This is a useful feature, particularly for beginners. It helps them think clearly, and express themselves clearly.

THE PROBLEM This returning of None can be a nuisance, sometimes. Suppose we have a dictionary of default values, and a dictionary of user-supplied options. We wish to combine the two dictionaries, say into a new combined dictionary. One way to do this is:

combined = defaults.copy()
combined.update(options)

But this is awkward when you're in the middle of calling a function:

call_big_method(
    # lots of arguments, one to a line, with comments
    arg = combined,  # Look up to see what combined is.
    # more arguments
)

USING + There's a suggestion that instead one extends Python so that this works:

arg = defaults + options  # What does '+' mean here?

USING flow_update Here's another suggestion. Instead write:

dict_arg = defaults.copy().flow_update(options)  # Is this clearer?

IMPLEMENTATION Here's an implementation, as a subclass of dict.

class mydict(dict):

    def flow_update(self, *argv, **kwargs):
        self.update(*argv, **kwargs)
        return self

    def copy(self):
        return self.__class__(self)

A DIRTY HACK Not tested, using an assignment expression.

dict_arg = (tmp := defaults.copy(), tmp.update(options))[0]

Not recommended.

-- Jonathan

From songofacandy at gmail.com Tue Mar 5 04:04:40 2019 From: songofacandy at gmail.com (INADA Naoki) Date: Tue, 5 Mar 2019 18:04:40 +0900 Subject: [Python-ideas] dict.merge(d1, d2, ...) (Counter proposal for PEP 584) In-Reply-To: References: Message-ID:

On Tue, Mar 5, 2019 at 5:50 PM Nathaniel Smith wrote: > > On Mon, Mar 4, 2019 at 11:41 PM INADA Naoki wrote: > > Then, I propose `dict.merge` method. It is outer-place version > > of `dict.update`, but accepts multiple dicts.
(dict.update() > > can be updated to accept multiple dicts, but it's not out of scope). > > > > * d = d1.merge(d2) # d = d1.copy(); d.update(d2) > > * d = d1.merge(d2, d3) # d = d1.copy(); d.update(d2); d.update(d3) > > * d = d1.merge(iter_of_pairs) > > * d = d1.merge(key=value) > > Another similar option would be to extend the dict constructor to > allow: d = dict(d1, d2, d3, ...) > > -n > > -- > Nathaniel J. Smith -- https://vorpus.org

Yes, it's an option too. One obvious merit of d.merge(...) is that it returns the same type as d. `type(d1)(d1, d2)` looks ugly. But people just want dict instead of some subtype of dict, so this merit is not so important.

I'm a bit nervous about adding much overloading to the constructor. That's the main reason why I proposed a method instead of the constructor.

Regards, -- INADA Naoki

From songofacandy at gmail.com Tue Mar 5 04:12:32 2019 From: songofacandy at gmail.com (INADA Naoki) Date: Tue, 5 Mar 2019 18:12:32 +0900 Subject: [Python-ideas] dict.merge(d1, d2, ...) (Counter proposal for PEP 584) In-Reply-To: References: Message-ID:

> * Type of returned value is always same to d1.copy(). No issubclass, > no __iadd__.

I'm sorry, I meant __radd__, not __iadd__.

From steve at pearwood.info Tue Mar 5 04:28:33 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 5 Mar 2019 20:28:33 +1100 Subject: [Python-ideas] Dict joining using + and += In-Reply-To: References: <20190304164408.GV4465@ando.pearwood.info> Message-ID: <20190305092833.GZ4465@ando.pearwood.info>

On Mon, Mar 04, 2019 at 09:34:34PM +0000, Paul Moore wrote: > On Mon, 4 Mar 2019 at 20:42, Guido van Rossum wrote: > > > > Honestly I would rather withdraw the subtraction operators than > > reopen the discussion about making dict more like set.

As some people have repeatedly pointed out, we already have four ways to spell dict merging:

- in-place dict.update;
- copy, followed by update;
- use a ChainMap;
- the obscure new {**d1, ...} syntax.
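For reference, the four spellings listed above can be sketched side by side (`d1` and `d2` are placeholder names; nothing here is from the PEP itself):

```python
from collections import ChainMap

d1 = {'a': 1, 'b': 2}
d2 = {'b': 3, 'c': 4}

# 1. in-place dict.update -- mutates its target, so work on a copy here
merged1 = dict(d1)
merged1.update(d2)

# 2. copy, followed by update
merged2 = d1.copy()
merged2.update(d2)

# 3. a ChainMap: the first mapping wins on lookup, so d2 goes first
#    to reproduce update's "last seen wins" behaviour
merged3 = dict(ChainMap(d2, d1))

# 4. the unpacking syntax
merged4 = {**d1, **d2}

assert merged1 == merged2 == merged3 == merged4 == {'a': 1, 'b': 3, 'c': 4}
```

A bare ChainMap is just a view over the underlying dicts; the `dict()` call here materialises it so all four results compare equal.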
But if there's a good way to get dict difference apart from a manual loop or comprehension, I don't know it. So from my perspective, even though most of the attention has been on the merge operator, I'd rather keep the difference operator. As far as making dicts "more like set", I'm certainly not proposing that. The furthest I'd go is bow to the consensus if it happened to decide that | is a better choice than + (but that seems unlikely). > I'm neutral on dict addition, but dict subtraction seemed an odd > extension to the proposal. Using b in a - b solely for its keys, and > ignoring its values, seems weird to me. The PEP current says that dict subtraction requires the right-hand operand to be a dict. That's the conservative choice that follows the example of list addition (it requires a list, not just any iterable) and avoids breaking changes to code that uses operator-overloading: mydict - some_object works if some_object overloads __rsub__. If dict.__sub__ was greedy in what it accepted, it could break such code. Better (in my opinion) to be less greedy by only allowing dicts. dict -= on the other hand can take any iterable of keys, as the right-hand operand isn't called. Oh, another thing the PEP should gain... a use-case for dict subtraction. Here's a few: (1) You have a pair of dicts D and E, and you want to update D with only the new keys from E: D.update(E - D) which I think is nicer than writing a manual loop: D.update({k:E[k] for k in (E.keys() - D.keys())}) # or D.update({k:v for k,v in E.items() if k not in D}) (This is a form of update with "first seen wins" instead of the usual "last seen wins".) 
(2) You have a dict D, and you want to unconditionally remove keys from a blacklist, e.g.:

all_users = {'username': user, ...}
allowed_users = all_users - banned_users

(3) You have a dict, and you want to ensure there are no keys that you didn't expect:

if (d := actual - expected):
    print('unexpected key:value pairs', d)

> Even if dict1 - dict2 were > added to the language, I think I'd steer clear of it as being too > obscure.

Everything is obscure until people learn it and get used to it.

-- Steven

From ijkl at netc.fr Tue Mar 5 04:42:42 2019 From: ijkl at netc.fr (Jimmy Girardet) Date: Tue, 5 Mar 2019 10:42:42 +0100 Subject: [Python-ideas] Dict joining using + and += In-Reply-To: <20190305092833.GZ4465@ando.pearwood.info> References: <20190304164408.GV4465@ando.pearwood.info> <20190305092833.GZ4465@ando.pearwood.info> Message-ID:

Indeed the "obscure" argument should be thrown away. The `|` operator in sets seems to be evident to everyone on this list, but I would be curious to know how many people first got a TypeError doing set1 + set2 and then found set1 | set2 in the doc. Except for math geeks, the `|` is always something obscure.

>> Even if dict1 - dict2 were >> added to the language, I think I'd steer clear of it as being too >> obscure. > Everything is obscure until people learn it and get used to it. > >

From jfine2358 at gmail.com Tue Mar 5 04:43:56 2019 From: jfine2358 at gmail.com (Jonathan Fine) Date: Tue, 5 Mar 2019 09:43:56 +0000 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> Message-ID:

This is mainly for Steve, as the author of PEP 584. I'm grateful to Steve for preparing the current draft. Thank you. It's strong on implementation, but I find it weak on motivation.
I hope that when time is available you (and the other contributors) could transfer some motivating material into the PEP, from python-ideas. According to PEP 001, the PEP "should clearly explain why the existing language specification is inadequate to address the problem that the PEP solves". So it is important. -- Jonathan From songofacandy at gmail.com Tue Mar 5 04:56:49 2019 From: songofacandy at gmail.com (Inada Naoki) Date: Tue, 5 Mar 2019 18:56:49 +0900 Subject: [Python-ideas] Dict joining using + and += In-Reply-To: References: <20190304164408.GV4465@ando.pearwood.info> <20190305092833.GZ4465@ando.pearwood.info> Message-ID: On Tue, Mar 5, 2019 at 6:42 PM Jimmy Girardet wrote: > > Indeed the "obscure" argument should be thrown away. > > The `|` operator in sets seems to be evident for every one on this list > but I would be curious to know how many people first got a TypeError > doing set1 + set2 and then found set1 | set2 in the doc. > > Except for math geek the `|` is always something obscure. > Interesting point. In Japan, we learn set in high school, not in university. And I think it's good idea that people using `set` type learn about `set` in math. So I don't think "union" is not only for math geeks. But we use "A ? B" in math. `|` is borrowed from "bitwise OR" in C. And "bitwise" operators are for "geeks". Although I'm not in favor of adding `+` to set, it will be worth enough to add `+` to set too if it is added to dict for consistency. FWIW, Scala uses `++` for join all containers. Kotlin uses `+` for join all containers. 
(ref https://discuss.python.org/t/pep-584-survey-of-other-languages-operator-overload/977) Regards, -- Inada Naoki From steve at pearwood.info Tue Mar 5 04:59:08 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 5 Mar 2019 20:59:08 +1100 Subject: [Python-ideas] Dict joining using + and += In-Reply-To: References: <20190304164408.GV4465@ando.pearwood.info> Message-ID: <20190305095906.GA4465@ando.pearwood.info> On Mon, Mar 04, 2019 at 03:33:36PM -0500, Neil Girdhar wrote: > Maybe, but reading through the various replies, it seems that if you > are adding "-" to be analogous to set difference, then the combination > operator should be analogous to set union "|". That's the purpose of this discussion, to decide whether dict merging is more like addition/concatenation or union :-) > And it also opens an > opportunity to add set intersection "&". What should intersection do in the case of matching keys? I see the merge + operator as a kind of update, whether it makes a copy or does it in place, so to me it is obvious that "last seen wins" should apply just as it does for the update method. But dict *intersection* is a more abstract operation than merge/update. And that leads to the problem, what do you do with the values? {key: "spam"} & {key: "eggs"} # could result in any of: {key: "spam"} {key: "eggs"} {key: ("spam", "eggs")} {key: "spameggs"} an exception something else? Unlike "update", I don't have any good use-cases to prefer any one of those over the others. > After all, how do you filter a dictionary to a set of keys? 
> > >> d = {'some': 5, 'extra': 10, 'things': 55} > >> d &= {'some', 'allowed', 'options'} > >> d > {'some': 5} new = d - (d - allowed) {k: v for (k, v) in d.items() if k in allowed} > >> > * Regarding how to construct the new set in __add__, I now think this should be done like this: > >> > > >> > class dict: > >> > > >> > def __add__(self, other): > >> > > >> > new = self.copy() # A subclass may or may not choose to override > >> > new.update(other) > >> > return new > >> > >> I like that, but it would be inefficient to do that for __sub__ since > >> it would create elements that it might later delete. > >> > >> def __sub__(self, other): > >> new = self.copy() > >> for k in other: > >> del new[k] > >> return new > >> > >> is less efficient than > >> > >> def __sub__(self, other): > >> return type(self)({k: v for k, v in self.items() if k not in other}) I don't think you should be claiming what is more or less efficient unless you've actually profiled them for speed and memory use. Often, but not always, the two are in opposition: we make things faster by using more memory, and save memory at the cost of speed. Your version of __sub__ creates a temporary dict, which then has to be copied in order to preserve the type. It's not obvious to me that that's faster or more memory efficient than building a dict and then deleting keys. (Remember that dicts aren't lists, and deleting keys is an O(1) operation.)
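One rough way to settle such claims is to actually time both variants. A sketch using `timeit` (the helper names and test data below are invented for illustration; real numbers will vary with dict sizes and key overlap):

```python
import timeit

def sub_copy_del(d, other):
    # Copy first, then delete matching keys (guarding against missing keys).
    new = d.copy()
    for k in other:
        if k in new:
            del new[k]
    return new

def sub_comprehension(d, other):
    # Build the filtered dict in a single pass.
    return {k: v for k, v in d.items() if k not in other}

d1 = {i: str(i) for i in range(1000)}
d2 = {i: None for i in range(0, 1000, 2)}

# Both strategies must agree on the result before comparing speed.
assert sub_copy_del(d1, d2) == sub_comprehension(d1, d2)

t1 = timeit.timeit(lambda: sub_copy_del(d1, d2), number=200)
t2 = timeit.timeit(lambda: sub_comprehension(d1, d2), number=200)
print(f"copy-then-delete: {t1:.4f}s  comprehension: {t2:.4f}s")
```

Which one wins depends on how large the overlap is, so a single benchmark proves little either way.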
-- Steven From steve at pearwood.info Tue Mar 5 05:21:35 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 5 Mar 2019 21:21:35 +1100 Subject: [Python-ideas] Dict joining using + and += In-Reply-To: References: <5C78506E.6040600@canterbury.ac.nz> <20190304164408.GV4465@ando.pearwood.info> Message-ID: <20190305102135.GB4465@ando.pearwood.info> On Mon, Mar 04, 2019 at 10:18:13PM -0800, Amber Yust wrote: > Adding the + operator for dictionaries feels like it would be a mistake in > that it offers at most sugar-y benefits, but introduces the significant > drawback of making it easier to introduce unintended errors. What sort of errors? I know that some (mis-)features are "bug magnets" that encourage people to write buggy code, but I don't see how this proposal is worse than dict.update(). In one way it is better, since D + E returns a new dict, instead of over-writing the data in D. Ask any functional programmer, and they'll tell you that we should avoid side-effects. > This would be > the first instance of "addition" where the result can potentially > lose/overwrite data (lists and strings both preserve the full extent of > each operand; Counters include the full value from each operand, etc). I don't see why this is relevant to addition. It doesn't even apply to numeric addition! If I give you the result of an addition: 101 say, you can't tell what the operands were. And that's not even getting into the intricacies of floating point addition, which can violate associativity: ``(a + b) + c`` is not necessarily equal to ``a + (b + c)`` and distributivity: ``x*(a + b)`` is not necessarily equal to ``x*a + x*b`` even for well-behaved, numeric floats (not NANs or INFs). > Combining dictionaries is fundamentally an operation that requires more > than one piece of information, because there's no single well-defined way > to combine a pair of them. Indeed. But some ways are more useful than others.
> Off the top of my head, I can think of at least > 2 different common options (replacement aka .update(), combination of > values a la Counter). Neither of these is really a more valid "addition" of > dictionaries. That's why we have subclasses and operator overloading :-) By far the most commonly requested behaviour for this is copy-and-update (or merge, if you prefer). But subclasses are free to define it as they will, including: - add values, as Counter already does; - raise an exception if there is a duplicate key; - "first seen wins" or anything else. -- Steven From fhsxfhsx at 126.com Tue Mar 5 04:53:09 2019 From: fhsxfhsx at 126.com (fhsxfhsx) Date: Tue, 5 Mar 2019 17:53:09 +0800 (CST) Subject: [Python-ideas] dict.merge(d1, d2, ...) (Counter proposal for PEP 584) In-Reply-To: References: Message-ID: <29395fe2.8d3a.1694d456e43.Coremail.fhsxfhsx@126.com> I agree so much with your opinion that I was just about to create a topic about this if you hadn't. I also propose here a small modification to make it more general which adds an argument `how` (name to be discussed), telling how to merge the dicts, as many have pointed out that there could be different ways to merge dicts. So things would be like def addition_merge(key, values, exists): """ :param key: the key to merge :param values: values of dicts to merge indexed at `key` :param exists: whether each dict contains `key` """ if any(exists): return True, sum([value for exist, value in zip(exists, values) if exist]) else: return False, None d1.merge(d2, d3, ..., how=addition_merge) We could even have def discard(key, values, exists): return not any(exists[1:]), values[0] d1.merge(d2, how=discard) which does the same thing as the proposed `d1 - d2`. This would mean things like d = d1.merge(iter_of_pairs) d = d1.merge(key=value) no longer work, but people could easily wrap a `dict()` around the iterator or key-value arguments with no added complication.
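Since `dict.merge` does not exist, the `how=` protocol above can only be sketched as a free function; `merge_with` and its `(keep, value)` return convention are one reading of the proposal, not an existing API:

```python
def merge_with(how, first, *others):
    # Sketch of the proposed d1.merge(d2, ..., how=...) as a free function.
    # `how` receives (key, values, exists) and returns a (keep, value) pair.
    dicts = (first,) + others
    ordered_keys = {k: None for d in dicts for k in d}  # first-seen key order
    out = {}
    for key in ordered_keys:
        exists = [key in d for d in dicts]
        values = [d.get(key) for d in dicts]
        keep, value = how(key, values, exists)
        if keep:
            out[key] = value
    return out

def addition_merge(key, values, exists):
    # Sum the values from every dict that actually has the key.
    return True, sum(v for e, v in zip(exists, values) if e)

def discard(key, values, exists):
    # Keep a key only if none of the later dicts has it (proposed d1 - d2).
    return not any(exists[1:]), values[0]

print(merge_with(addition_merge, {'a': 1, 'b': 2}, {'b': 3}))  # {'a': 1, 'b': 5}
print(merge_with(discard, {'a': 1, 'b': 2}, {'b': 3}))         # {'a': 1}
```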
At 2019-03-05 15:39:40, "INADA Naoki" wrote: >I think some people in favor of PEP 584 just want >single expression for merging dicts without in-place update. > >But I feel it's abuse of operator overload. I think functions >and methods are better than operator unless the operator >has good math metaphor, or very frequently used as concatenate >strings. > >This is why function and methods are better: > >* Easy to search. >* Name can describe it's behavior better than abused operator. >* Simpler lookup behavior. (e.g. subclass and __iadd__) > >Then, I propose `dict.merge` method. It is outer-place version >of `dict.update`, but accepts multiple dicts. (dict.update() >can be updated to accept multiple dicts, but it's not out of scope). > >* d = d1.merge(d2) # d = d1.copy(); d.update(d2) >* d = d1.merge(d2, d3) # d = d1.copy(); d.update(d2); d2.update(d3) >* d = d1.merge(iter_of_pairs) >* d = d1.merge(key=value) > > >## Merits of dict.merge() over operator + > >* Easy to Google (e.g. "python dict merge"). >* Easy to help(dict.merge). (or dict.merge? in IPython) >* No inefficiency of d1+d2+d3+...+dN, or sum(list_of_many_dicts) >* Type of returned value is always same to d1.copy(). No issubclass, >no __iadd__. > >## Why not dict.updated()? > >sorted() is a function so it looks different from L.sort() >But d.updated() is very similar to d.update() for human eyes. > >## How about d1 - d2? > >If it is really useful, it can be implemented as method too. > >dict.discard(sequence_of_keys) > >Regards, >-- >INADA Naoki >_______________________________________________ >Python-ideas mailing list >Python-ideas at python.org >https://mail.python.org/mailman/listinfo/python-ideas >Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From steve at pearwood.info Tue Mar 5 05:26:33 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 5 Mar 2019 21:26:33 +1100 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> Message-ID: <20190305102632.GC4465@ando.pearwood.info> On Sat, Mar 02, 2019 at 01:47:37AM +0900, INADA Naoki wrote: > > If the keys are not strings, it currently works in CPython, but it may not work with other implementations, or future versions of CPython[2]. > > I don't think so. https://bugs.python.org/issue35105 and > https://mail.python.org/pipermail/python-dev/2018-October/155435.html > are about kwargs. I think non string keys are allowed for {**d1, > **d2} by language. Is this documented somewhere? Or is there a pronouncement somewhere that it is definitely expected to work in any language calling itself Python? Thanks, -- Steven From songofacandy at gmail.com Tue Mar 5 05:36:31 2019 From: songofacandy at gmail.com (Inada Naoki) Date: Tue, 5 Mar 2019 19:36:31 +0900 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <20190305102632.GC4465@ando.pearwood.info> References: <20190301162645.GM4465@ando.pearwood.info> <20190305102632.GC4465@ando.pearwood.info> Message-ID: On Tue, Mar 5, 2019 at 7:26 PM Steven D'Aprano wrote: > > On Sat, Mar 02, 2019 at 01:47:37AM +0900, INADA Naoki wrote: > > > If the keys are not strings, it currently works in CPython, but it may not work with other implementations, or future versions of CPython[2]. > > > > I don't think so. https://bugs.python.org/issue35105 and > > https://mail.python.org/pipermail/python-dev/2018-October/155435.html > > are about kwargs. I think non string keys are allowed for {**d1, > > **d2} by language. > > Is this documented somewhere? > It is not explicitly documented. But unlike keyword arguments, dict displays have supported non-string keys since very early on.
I believe {3: 4} is supported by the Python language, not just by CPython implementation behavior. https://docs.python.org/3/reference/expressions.html#grammar-token-dict-display > Or is there a pronouncement somewhere that it is definitely expected to > work in any language calling itself Python? > > > Thanks, > > > > -- > Steven > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -- Inada Naoki From andrew.svetlov at gmail.com Tue Mar 5 05:38:11 2019 From: andrew.svetlov at gmail.com (Andrew Svetlov) Date: Tue, 5 Mar 2019 12:38:11 +0200 Subject: [Python-ideas] dict.merge(d1, d2, ...) (Counter proposal for PEP 584) In-Reply-To: <29395fe2.8d3a.1694d456e43.Coremail.fhsxfhsx@126.com> References: <29395fe2.8d3a.1694d456e43.Coremail.fhsxfhsx@126.com> Message-ID: The Python C API has a PyDict_Merge (https://docs.python.org/3/c-api/dict.html#c.PyDict_Merge) function which has different behavior from the proposed Python-level method (it doesn't copy, but merges in place). This is a red flag for me. On Tue, Mar 5, 2019 at 12:24 PM fhsxfhsx wrote: > > I agree so much with your opinion that I was just about to create a topic about this if you hadn't. > I also propose here a small modification to make it more general which adds an argument `how` (name to be discussed), telling how to merge the dicts, as many have pointed out that there could be different ways to merge dicts.
> So things would be like > > def addition_merge(key, values, exists): > """ > :param key: the key to merge > :param values: values of dicts to merge indexed at `key` > :param exists: whether each dict contains `key` > """ > if any(exists): > return True, sum([value for exist, value in zip(exists, values) if exist]) > else: > return False > d1.merge(d2, d3, ..., how=addition_merge) > > We could even have > > def discard(key, values, exists): > return not any(exists[1:]), values[0] > d1.merge(d2, how=discard) > > which does the same thing as proposed `d1-d2`. > > This would make things like > d = d1.merge(iter_of_pairs) > d = d1.merge(key=value) > not working, but people could easily wrap a `dict()` over the iterator or key-value stuff and attach no complication. > > > At 2019-03-05 15:39:40, "INADA Naoki" wrote: > >I think some people in favor of PEP 584 just want > >single expression for merging dicts without in-place update. > > > >But I feel it's abuse of operator overload. I think functions > >and methods are better than operator unless the operator > >has good math metaphor, or very frequently used as concatenate > >strings. > > > >This is why function and methods are better: > > > >* Easy to search. > >* Name can describe it's behavior better than abused operator. > >* Simpler lookup behavior. (e.g. subclass and __iadd__) > > > >Then, I propose `dict.merge` method. It is outer-place version > >of `dict.update`, but accepts multiple dicts. (dict.update() > >can be updated to accept multiple dicts, but it's not out of scope). > > > >* d = d1.merge(d2) # d = d1.copy(); d.update(d2) > >* d = d1.merge(d2, d3) # d = d1.copy(); d.update(d2); d2.update(d3) > >* d = d1.merge(iter_of_pairs) > >* d = d1.merge(key=value) > > > > > >## Merits of dict.merge() over operator + > > > >* Easy to Google (e.g. "python dict merge"). > >* Easy to help(dict.merge). (or dict.merge? 
in IPython) > >* No inefficiency of d1+d2+d3+...+dN, or sum(list_of_many_dicts) > >* Type of returned value is always same to d1.copy(). No issubclass, > >no __iadd__. > > > >## Why not dict.updated()? > > > >sorted() is a function so it looks different from L.sort() > >But d.updated() is very similar to d.update() for human eyes. > > > >## How about d1 - d2? > > > >If it is really useful, it can be implemented as method too. > > > >dict.discard(sequence_of_keys) > > > >Regards, > >-- > >INADA Naoki > >_______________________________________________ > >Python-ideas mailing list > >Python-ideas at python.org > >https://mail.python.org/mailman/listinfo/python-ideas > >Code of Conduct: http://python.org/psf/codeofconduct/ > > > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -- Thanks, Andrew Svetlov From storchaka at gmail.com Tue Mar 5 05:46:46 2019 From: storchaka at gmail.com (Serhiy Storchaka) Date: Tue, 5 Mar 2019 12:46:46 +0200 Subject: [Python-ideas] Dict joining using + and += In-Reply-To: References: <0433166f-4f88-93d6-327b-ed1708c17fa7@gmail.com> <378B4AB0-A949-462F-ADCE-10D40E3969CC@killingar.net> <5C78506E.6040600@canterbury.ac.nz> Message-ID: 04.03.19 15:29, Serhiy Storchaka wrote: > Using "|" looks more natural to me than using "+". We > should look at discussions for using the "|" operator for sets, if the > alternative of using "+" was considered, I think the same arguments for > preferring "|" for sets are applicable now for dicts.
See the Python-Dev thread with the subject "Re: Re: PEP 218 (sets); moving set.py to Lib" starting from https://mail.python.org/pipermail/python-dev/2002-August/028104.html From steve at pearwood.info Tue Mar 5 05:59:09 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 5 Mar 2019 21:59:09 +1100 Subject: [Python-ideas] dict.merge(d1, d2, ...) (Counter proposal for PEP 584) In-Reply-To: References: Message-ID: <20190305105909.GD4465@ando.pearwood.info> On Tue, Mar 05, 2019 at 06:04:40PM +0900, INADA Naoki wrote: [...] > One obvious merit of d.merge(...) is it returns same type of d. > `type(d1)(d1, d2)` looks ugly. > > But people just want dict instead of some subtype of dict. > This merit is not so important. Not to me! It *is* important to me. I want builtins to honour their subclasses. It is probably too late to change existing behaviour, but my proposal specifies that subclasses are honoured. -- Steven From songofacandy at gmail.com Tue Mar 5 06:02:17 2019 From: songofacandy at gmail.com (Inada Naoki) Date: Tue, 5 Mar 2019 20:02:17 +0900 Subject: [Python-ideas] dict.merge(d1, d2, ...) (Counter proposal for PEP 584) In-Reply-To: <20190305105909.GD4465@ando.pearwood.info> References: <20190305105909.GD4465@ando.pearwood.info> Message-ID: On Tue, Mar 5, 2019 at 7:59 PM Steven D'Aprano wrote: > > On Tue, Mar 05, 2019 at 06:04:40PM +0900, INADA Naoki wrote: > [...] > > One obvious merit of d.merge(...) is it returns same type of d. > > `type(d1)(d1, d2)` looks ugly. > > > > But people just want dict instead of some subtype of dict. > > This merit is not so important. > > Not to me! It *is* important to me. I'm sorry, I missed "most". > > I want builtins to honour their subclasses. It is probably too late to > change existing behaviour, but my proposal specifies that subclasses are > honoured. > Then my proposal `d1.merge(d2)` is much better than alternative dict(d1, d2) for you. 
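The subclass point can be illustrated with a sketch; the `Defaults` class and its `merge` method are invented here to mimic the proposal, since no such method exists on dict:

```python
class Defaults(dict):
    """Toy dict subclass, invented for illustration."""
    def merge(self, *others):
        # Sketch of the proposed method: the result has the same type
        # as self, with "last seen wins" for duplicate keys.
        new = type(self)(self)
        for other in others:
            new.update(other)
        return new

d1 = Defaults(a=1, b=2)
d2 = {'b': 3}

assert type({**d1, **d2}) is dict   # unpacking always gives a plain dict
merged = d1.merge(d2)
assert type(merged) is Defaults     # the sketched method keeps the subclass
assert merged == {'a': 1, 'b': 3}
assert d1 == {'a': 1, 'b': 2}       # the original is untouched
```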
-- Inada Naoki From paal.drange at gmail.com Tue Mar 5 06:10:57 2019 From: paal.drange at gmail.com (Pål Grønås Drange) Date: Tue, 5 Mar 2019 12:10:57 +0100 Subject: [Python-ideas] Dict joining using + and += In-Reply-To: References: <0433166f-4f88-93d6-327b-ed1708c17fa7@gmail.com> <378B4AB0-A949-462F-ADCE-10D40E3969CC@killingar.net> <5C78506E.6040600@canterbury.ac.nz> Message-ID: I just wanted to mention this since it hasn't been brought up, but none of these work a.keys() + b.keys() a.values() + b.values() a.items() + b.items() However, the following do work: a.keys() | b.keys() a.items() | b.items() Perhaps they work by coincidence (being set types), but I think it's worth bringing up, since a naive/natural Python implementation of dict addition/union would possibly involve the |-operator. Pål -------------- next part -------------- An HTML attachment was scrubbed... URL: From rhodri at kynesim.co.uk Tue Mar 5 06:49:17 2019 From: rhodri at kynesim.co.uk (Rhodri James) Date: Tue, 5 Mar 2019 11:49:17 +0000 Subject: [Python-ideas] Dict joining using + and += In-Reply-To: References: <20190304164408.GV4465@ando.pearwood.info> <20190305092833.GZ4465@ando.pearwood.info> Message-ID: On 05/03/2019 09:42, Jimmy Girardet wrote: > Indeed the "obscure" argument should be thrown away. > > The `|` operator in sets seems to be evident for every one on this list > but I would be curious to know how many people first got a TypeError > doing set1 + set2 and then found set1 | set2 in the doc. Every. Single. Time. I don't use sets a lot (purely by happenstance rather than choice), and every time I do I have to go and look in the documentation because I expect the union operator to be '+'. > Except for math geek the `|` is always something obscure. Two thirds of my degree is in maths, and '|' is still something I don't associate with sets. It would be unreasonable to expect '∪' and '∩'
as the operators, but reasoning from '-' for set difference I always expect '+' and '*' as the union and intersection operators. Alas my hopes are always cruelly crushed :-) -- Rhodri James *-* Kynesim Ltd From daveshawley at gmail.com Tue Mar 5 08:11:29 2019 From: daveshawley at gmail.com (David Shawley) Date: Tue, 5 Mar 2019 08:11:29 -0500 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> Message-ID: On Mar 4, 2019, at 4:51 AM, Stefan Behnel wrote: > > I think the main intentions is to close a gap in the language. > > [1,2,3] + [4,5,6] > > works for lists and tuples, > > {1,2,3} | {4,5,6} > > works for sets, but joining two dicts isn't simply > > {1:2, 3:4} + {5:6} > > but requires either some obscure syntax or a statement instead of a simple > expression. > > The proposal is to enable the obvious syntax for something that should be > obvious. I would challenge that this dictionary merging is something that is obvious. The existing sequences are simple collections of values where a dictionary is a mapping of values. The difference between the two is akin to the difference between a mathematical array or set and a unary mapping function. There is a clear and obvious way to combine arrays and sets -- concatenation for arrays and union for sets. Combining mapping functions is less than obvious. "Putting Metaclasses to Work" (ISBN-13 978-0201433050) presents a more mathematical view of programming language types that includes two distinct operations for combining dictionaries -- merge and recursive merge. 
For two input dictionaries D1 & D2 and the output dictionary O D1 merge D2 O is D1 with the of those keys of D2 that do not have keys in D1 D1 recursive-merge D2 For all keys k, O[k] = D1[k] recursive merge D2[k] if both D1[k] and D2[k] are dictionaries, otherwise O[k] = (D1 merge D2)[k]. Note that neither of the cases is the same as: >>> O = D1.copy() >>> O.update(D2) So that gives us three different ways to combine dictionaries that are each sensible. The following example uses dictionaries from "Putting Metaclasses to Work": >>> d1 = { ... 'title': 'Structured Programming', ... 'authors': 'Dahl, Dijkstra, and Hoare', ... 'locations': { ... 'Dahl': 'University of Oslo', ... 'Dijkstra': 'University of Texas', ... 'Hoare': 'Oxford University', ... }, ... } >>> >>> d2 = { ... 'publisher': 'Academic Press', ... 'locations': { ... 'North America': 'New York', ... 'Europe': 'London', ... }, ... } >>> >>> o = d1.copy() >>> o.update(d2) >>> o {'publisher': 'Academic Press', 'title': 'Structured Programming', 'locations': {'North America': 'New York', 'Europe': 'London'}, 'authors': 'Dahl, Dijkstra, and Hoare'} >>> >>> merge(d1, d2) {'publisher': 'Academic Press', 'title': 'Structured Programming', 'locations': {'Dijkstra': 'University of Texas', 'Hoare': 'Oxford University', 'Dahl': 'University of Oslo'}, 'authors': 'Dahl, Dijkstra, and Hoare'} >>> >>> recursive_merge(d1, d2) {'publisher': 'Academic Press', 'title': 'Structured Programming', 'locations': {'North America': 'New York', 'Europe': 'London', 'Dijkstra': 'University of Texas', 'Hoare': 'Oxford University', 'Dahl': 'University of Oslo'}, 'authors': 'Dahl, Dijkstra, and Hoare'} >>> https://repl.it/@dave_shawley/PuttingMetaclassesToWork IMO, having more than one obvious outcome means that we should refuse the temptation to guess. If we do, then the result is only obvious to a subset of users and will be a surprise to the others. 
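For concreteness, here is one possible reading of those two operations as runnable code (my sketch, not code from the book):

```python
def merge(d1, d2):
    # D1's entries win; keys appearing only in D2 are added ("first seen wins").
    out = dict(d2)
    out.update(d1)
    return out

def recursive_merge(d1, d2):
    # Like merge, but descend into values that are dicts on both sides.
    out = merge(d1, d2)
    for k in d1:
        if k in d2 and isinstance(d1[k], dict) and isinstance(d2[k], dict):
            out[k] = recursive_merge(d1[k], d2[k])
    return out

d1 = {'locations': {'Dahl': 'University of Oslo'}}
d2 = {'publisher': 'Academic Press', 'locations': {'Europe': 'London'}}

# merge keeps d1's whole 'locations' value; recursive_merge combines them.
assert merge(d1, d2)['locations'] == {'Dahl': 'University of Oslo'}
assert recursive_merge(d1, d2)['locations'] == {
    'Europe': 'London', 'Dahl': 'University of Oslo'}
```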
It's also useful to note that I am having trouble coming up with another programming language that supports a "+" operator for map types. Does anyone have an example of another programming language that allows for addition of dictionaries/mappings? If so, what is the behavior there? - dave -- Any linter or project that treats PEP 8 as mandatory has *already* failed, as PEP 8 itself states that the rules can be broken as needed. - Paul Moore. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ijkl at netc.fr Tue Mar 5 09:20:35 2019 From: ijkl at netc.fr (Jimmy Girardet) Date: Tue, 5 Mar 2019 15:20:35 +0100 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> Message-ID: > Does anyone have an example of another programming language that > allows for addition of dictionaries/mappings? > Kotlin does that (`to` means `:`): fun main() { var a = mutableMapOf("a" to 1, "b" to 2) var b = mutableMapOf("c" to 1, "b" to 3) println(a) println(b) println(a + b) println(b + a) } {a=1, b=2} {c=1, b=3} {a=1, b=3, c=1} {c=1, b=2, a=1} -------------- next part -------------- An HTML attachment was scrubbed... URL: From pythonchb at gmail.com Tue Mar 5 11:02:03 2019 From: pythonchb at gmail.com (Christopher Barker) Date: Tue, 5 Mar 2019 08:02:03 -0800 Subject: [Python-ideas] Add a "week" function or attribute to datetime.date In-Reply-To: References: Message-ID: On Mon, Mar 4, 2019 at 10:00 PM Steve Barnes wrote: > If anybody is looking for such components then wx.DateTime > There has got to be a stand-alone Python library for that! Anyone know the status of the venerable mxDateTime?
-CHB -- Christopher Barker, PhD Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Tue Mar 5 11:07:45 2019 From: guido at python.org (Guido van Rossum) Date: Tue, 5 Mar 2019 08:07:45 -0800 Subject: [Python-ideas] Suggestions: dict.flow_update and dict.__add__ In-Reply-To: References: Message-ID: If you have to tell such a long and convoluted story to explain a name that you've picked out of the blue and that has no equivalent in other Python data types, it's probably a bad idea. If you're proposing that other mutating methods also gain a flow_XXX variant, please, no! That's like the theory of supersymmetry (SUSY) in particle physics, where ever known particle from the Standard Model would have to have a much heavier "superpartner" just to make some esoteric idea work. On Tue, Mar 5, 2019 at 12:54 AM Jonathan Fine wrote: > SUMMARY > Instead of using dict + dict, perhaps use dict.flow_update. Here, > flow_update is just like update, except that it returns self. > > BACKGROUND > There's a difference between a sorted copy of a list, and sorting the > list in place. > > >>> items = [2, 0, 1, 9] > >>> sorted(items), items > ([0, 1, 2, 9], [2, 0, 1, 9]) > >>> items.sort(), items > (None, [0, 1, 2, 9]) > > In Python, mutating methods generally return None. Here, this prevents > beginners thinking their code has produced a sorted copy of a list, > when in fact it has done an in-place sort on the list. If they write > >>> aaa = my_list.sort() > they'll get a None error when they use aaa. > > The same goes for dict.update. This is a useful feature, particularly > for beginners. It helps them think clearly, and express themselves > clearly. > > THE PROBLEM > This returning None can be a nuisance, sometimes. 
Suppose we have a > dictionary of default values, and a dictionary of user-supplied > options. We wish to combine the two dictionaries, say into a new > combined dictionary. > > One way to do this is: > > combined = defaults.copy() > combined.update(options) > > But this is awkward when you're in the middle of calling a function: > > call_big_method( > # lots of arguments, one to a line, with comments > arg = combined, # Look up to see what combined is. > # more arguments > ) > > USING + > There's a suggestion, that instead one extends Python so that this works: > arg = defaults + options # What does '+' mean here? > > USING flow_update > Here's another suggestion. Instead write: > dict_arg = defaults.copy().flow_update(options) # Is this clearer? > > IMPLEMENTATION > Here's an implementation, as a subclass of dict. > > class mydict(dict): > > def flow_update(self, *argv, **kwargs): > self.update(*argv, **kwargs) > return self > > def copy(self): > return self.__class__(self) > > A DIRTY HACK > Not tested, using an assignment expression. > dict_arg = (tmp := defaults.copy(), tmp.update(options))[0] > Not recommended. > > -- > Jonathan > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed...
URL: From steve at pearwood.info Tue Mar 5 11:11:01 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 6 Mar 2019 03:11:01 +1100 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> Message-ID: <20190305161100.GE4465@ando.pearwood.info> On Tue, Mar 05, 2019 at 08:11:29AM -0500, David Shawley wrote: > "Putting Metaclasses to Work" (ISBN-13 978-0201433050) presents a more > mathematical view of programming language types that includes two > distinct operations for combining dictionaries -- merge and recursive > merge. > > For two input dictionaries D1 & D2 and the output dictionary O > > D1 merge D2 > O is D1 with the of those keys of D2 that do not have keys in D1 > > D1 recursive-merge D2 > For all keys k, O[k] = D1[k] recursive merge D2[k] if both D1[k] > and D2[k] are dictionaries, otherwise O[k] = (D1 merge D2)[k]. I'm afraid I cannot understand either of those algorithms as written. I suspect that you've left at least one word out of the first. Fortunately your example below is extremely clear, thank you. [...] > The following example uses dictionaries from "Putting > Metaclasses to Work": > > >>> d1 = { > ... 'title': 'Structured Programming', > ... 'authors': 'Dahl, Dijkstra, and Hoare', > ... 'locations': { > ... 'Dahl': 'University of Oslo', > ... 'Dijkstra': 'University of Texas', > ... 'Hoare': 'Oxford University', > ... }, > ... } > >>> > >>> d2 = { > ... 'publisher': 'Academic Press', > ... 'locations': { > ... 'North America': 'New York', > ... 'Europe': 'London', > ... }, > ... 
} > >> > >> o = d1.copy() > >> o.update(d2) > >> o > {'publisher': 'Academic Press', > 'title': 'Structured Programming', > 'locations': {'North America': 'New York', 'Europe': 'London'}, > 'authors': 'Dahl, Dijkstra, and Hoare'} Yes, that's the classic "update with last seen wins". That's what the PEP proposes as that seems to be the most frequently requested behaviour. It is also the only behaviour which has been deemed useful enough in nearly 30 years of Python's history to be added to dict as a method. > >>> merge(d1, d2) > {'publisher': 'Academic Press', > 'title': 'Structured Programming', > 'locations': {'Dijkstra': 'University of Texas', > 'Hoare': 'Oxford University', > 'Dahl': 'University of Oslo'}, > 'authors': 'Dahl, Dijkstra, and Hoare'} That seems to be "update with first seen wins", which is easily done using ChainMap or the proposed dict difference operator: dict( ChainMap(d1, d2) ) # or d1 + (d2 - d1) or simply by swapping the order of the operands: d2 + d1 (These are not *identical* in effect, there are small differences with respect to key:value identity, and order of keys. But they ought to give *equal* results.) Personally, I don't think that behaviour is as useful as the first, but it is certainly a legitimate kind of merge. As far as I know, this has never been requested before. Perhaps it is too niche? > >>> recursive_merge(d1, d2) > {'publisher': 'Academic Press', > 'title': 'Structured Programming', > 'locations': {'North America': 'New York', > 'Europe': 'London', > 'Dijkstra': 'University of Texas', > 'Hoare': 'Oxford University', > 'Dahl': 'University of Oslo'}, > 'authors': 'Dahl, Dijkstra, and Hoare'} That's an interesting one. I'd write it something like this: def merge(a, b): new = a.copy() for key, value in b.items(): if key not in a: # Add new keys. new[key] = value else: v = new[key] if isinstance(value, dict) and isinstance(v, dict): # If both values are dicts, merge them.
new[key] = merge(v, value) else: # What to do if only one is a dict? # Or if neither is a dict? return new I've seen variants of this where duplicate keys are handled by building a list of the values: def merge(a, b): new = a.copy() for key, value in b.items(): if key in a: v = new[key] if isinstance(v, list): v.append(value) else: new[key] = [v, value] ... or by concatenating values, or adding them (as Counter does), etc. We have subclasses and operator overloading, so you can implement whatever behaviour you like. The question is, is this behaviour useful enough and common enough to be built into dict itself? > IMO, having more than one obvious outcome means that we should refuse > the temptation to guess. We're not *guessing*. We're *choosing* which behaviour we want. Nobody says: When I print some strings, I can separate them with spaces, or dots, or newlines, and print a newline at the end, or suppress the newline. Since all of these behaviours might be useful for somebody, we should not "guess" what the user wants. Therefore we should not have a print() function at all. The behaviour of print() is not a guess as to what the user wants. We offer a specific behaviour, and if the user is happy with that, then they can use print(), and if not, they can write their own. The same applies here: we're offering one specific behaviour that we think is the most important, and anyone who wants another can write their own. If people don't like my choice of what I think is the most important (copy-and-update, with last seen wins), they can argue for whichever alternative they like. If they make a convincing enough case, the PEP can change :-) James Lu has already tried to argue that the "raise on non-unique keys" is the best behaviour. I have disagreed with that, but if James makes a strong enough case for his idea, and it gains sufficient support, I could be persuaded to change my position. Or he can write a competing PEP and the Steering Council can decide between the two ideas.
> If we do, then the result is only obvious
> to a subset of users and will be a surprise to the others.

It's only a surprise to those users who don't read the docs and make
assumptions about behaviour based on their own wild guesses. We should
get away from the idea that the only behaviours we can provide are those
which are "obvious" (intuitive?) to people who guess what it means
without reading the docs.

It's great when a function's meaning can be guessed or inferred from a
basic understanding of English:

    len(string)  # assuming len is an abbreviation for length

but that sets the bar impossibly high. We can't guess what these do, not
with any precision:

    print(spam, eggs)  # prints spaces between arguments or not?
    spam is eggs       # that's another way of spelling == right?
    zip(spam, eggs)    # what does it do if args aren't the same length?

and who can guess what these do without reading the docs?

    property, classmethod, slice, enumerate, iter

I don't think that Python is a worse language for having specified a
meaning for these rather than leaving them out. The Zen's prohibition
against guessing in the face of ambiguity does not mean that we must not
add a feature to the language that requires the user to learn what it
does first.

> It's also useful to note that I am having trouble coming up with
> another programming language that supports a "+" operator for map types.
>
> Does anyone have an example of another programming language that
> allows for addition of dictionaries/mappings?
>
> If so, what is the behavior there?
An excellent example, but my browser just crashed and it's after 3am here so I'm going to take this opportunity to go to bed :-)

-- Steven

From pythonchb at gmail.com Tue Mar 5 11:11:06 2019 From: pythonchb at gmail.com (Christopher Barker) Date: Tue, 5 Mar 2019 08:11:06 -0800 Subject: [Python-ideas] Suggestions: dict.flow_update and dict.__add__ In-Reply-To: References: Message-ID:

On Tue, Mar 5, 2019 at 12:53 AM Jonathan Fine wrote:

> SUMMARY
> Instead of using dict + dict, perhaps use dict.flow_update. Here,
> flow_update is just like update, except that it returns self.

That violates an important convention in Python: mutating methods do not return self. We really want to preserve that convention.

On the other hand, as seen in other recent threads, there is a desire for chaining operations of many sorts, so a .flow_update() that returned a new dict would provide that feature. Though I would only recommend that if it was decided that we wanted to generally support that approach for all mutable containers -- which would mean adding quite a few methods. And we could then use the same naming convention for them all.

I'm not sure I like 'flow_' though, it's not very commonly known jargon.

-CHB

-- Christopher Barker, PhD Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From guido at python.org Tue Mar 5 11:45:31 2019 From: guido at python.org (Guido van Rossum) Date: Tue, 5 Mar 2019 08:45:31 -0800 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <20190305102632.GC4465@ando.pearwood.info> Message-ID: On Tue, Mar 5, 2019 at 2:38 AM Inada Naoki wrote: > On Tue, Mar 5, 2019 at 7:26 PM Steven D'Aprano > wrote: > > > > On Sat, Mar 02, 2019 at 01:47:37AM +0900, INADA Naoki wrote: > > > > If the keys are not strings, it currently works in CPython, but it > may not work with other implementations, or future versions of CPython[2]. > > > > > > I don't think so. https://bugs.python.org/issue35105 and > > > https://mail.python.org/pipermail/python-dev/2018-October/155435.html > > > are about kwargs. I think non string keys are allowed for {**d1, > > > **d2} by language. > > > > Is this documented somewhere? > > It is not explicitly documented. But unlike keyword argument, > dict display supported non-string keys from very old. > > I believe {3: 4} is supported by Python language, not CPython > implementation behavior. > > > https://docs.python.org/3/reference/expressions.html#grammar-token-dict-display > I'd like to remove all doubt: {**d1} needs to work regardless of the key type, as long as it's hashable (d1 could be some mapping implemented without hashing, e.g. using a balanced tree, so that it could support unhashable keys). If there's doubt about this anywhere, we could add an example to the docs and to the PEP. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From boxed at killingar.net Tue Mar 5 12:02:32 2019 From: boxed at killingar.net (=?utf-8?Q?Anders_Hovm=C3=B6ller?=) Date: Tue, 5 Mar 2019 18:02:32 +0100 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <20190305102632.GC4465@ando.pearwood.info> Message-ID:

> I'd like to remove all doubt: {**d1} needs to work regardless of the key type, as long as it's hashable (d1 could be some mapping implemented without hashing, e.g. using a balanced tree, so that it could support unhashable keys).
>
> If there's doubt about this anywhere, we could add an example to the docs and to the PEP.

On a related note: **kwargs, should they support arbitrary strings as keys? I depend on this behavior in production code and all python implementations handle it.

/ Anders

From jfine2358 at gmail.com Tue Mar 5 12:09:19 2019 From: jfine2358 at gmail.com (Jonathan Fine) Date: Tue, 5 Mar 2019 17:09:19 +0000 Subject: [Python-ideas] Suggestions: dict.flow_update and dict.__add__ In-Reply-To: References: Message-ID:

I thank Guido and Christopher for their thoughtful comments. You certainly found some weak points. I chose the name 'flow' to match: https://en.wikipedia.org/wiki/Fluent_interface#Python

Instead of my previous

    arg = defaults.copy().flow_update(options)

one could instead from somewhere import flow, and then write

    arg = flow(defaults.copy()).update(options)

This avoids a profusion of flow_ methods, and also the need to subclass dict. One could of course use a different name. Perhaps 'follow' would be better. And it would work 'out of the box' in other situations.

Christopher might prefer the flow(obj).update approach, as it respects the convention "mutating methods do not return self." (Thank you for your clear statement, Christopher.)

(Aside: For the non-experts, a word if I may about the implementation.
The key point is that in Python the programmer 'owns the dot' and so the desired semantics can be implemented. We use a custom __getattribute__.)

Finally, please forgive my fictional use case. I think studying real-world use cases for dict + dict would be very helpful to the discussion. I don't recall seeing any, but I haven't looked hard. Instructive use cases should, of course, be placed in the PEP.

-- Jonathan

From guido at python.org Tue Mar 5 12:13:22 2019 From: guido at python.org (Guido van Rossum) Date: Tue, 5 Mar 2019 09:13:22 -0800 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <20190305102632.GC4465@ando.pearwood.info> Message-ID:

On Tue, Mar 5, 2019 at 9:02 AM Anders Hovmöller wrote:

> On a related note: **kwargs, should they support arbitrary strings as
> keys? I depend on this behavior in production code and all python
> implementations handle it.

The ice is much thinner there, but my position is that as long as they are *strings* such keys should be allowed.

-- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL:

From songofacandy at gmail.com Tue Mar 5 12:20:45 2019 From: songofacandy at gmail.com (Inada Naoki) Date: Wed, 6 Mar 2019 02:20:45 +0900 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <20190305102632.GC4465@ando.pearwood.info> Message-ID:

There was a thread for the topic. https://mail.python.org/pipermail/python-dev/2018-October/155435.html

2019年3月6日(水) 2:02 Anders Hovmöller :

> > On a related note: **kwargs, should they support arbitrary strings as
> keys? I depend on this behavior in production code and all python
> implementations handle it.
>
> / Anders

-------------- next part -------------- An HTML attachment was scrubbed... 
URL: From delgan.py at gmail.com Tue Mar 5 13:30:46 2019 From: delgan.py at gmail.com (Del Gan) Date: Tue, 5 Mar 2019 19:30:46 +0100 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <20190305102632.GC4465@ando.pearwood.info> Message-ID:

2019-03-05 0:34 UTC+01:00, Brandt Bucher :
>> Is there other built-in types which act differently if called with
>> the operator or augmented assignment version?
> > list.__iadd__ and list.extend

2019-03-05 0:57 UTC+01:00, Guido van Rossum :
> Yes. The same happens for lists. [1] + 'a' is a TypeError, but a += 'a'
> works:

Oh, I can't believe I'm only learning that today, even though I've been using Python for years. Thanks for the clarification. This makes perfect sense for += to behave like .update() then.

From greg.ewing at canterbury.ac.nz Tue Mar 5 17:13:55 2019 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 06 Mar 2019 11:13:55 +1300 Subject: [Python-ideas] Dict joining using + and += In-Reply-To: References: <20190304164408.GV4465@ando.pearwood.info> <20190305092833.GZ4465@ando.pearwood.info> Message-ID: <5C7EF4A3.4060407@canterbury.ac.nz>

Rhodri James wrote:
> I have to go and look in the documentation because I
> expect the union operator to be '+'.

Anyone raised on Pascal is likely to find + and * more natural. Pascal doesn't have bitwise operators, so it re-uses + and * for set operations. I like the economy of this arrangement -- it's not as if there's any other obvious meaning that + and * could have for sets. 
-- Greg From greg.ewing at canterbury.ac.nz Tue Mar 5 17:40:38 2019 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 06 Mar 2019 11:40:38 +1300 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <20190305161100.GE4465@ando.pearwood.info> References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> <20190305161100.GE4465@ando.pearwood.info> Message-ID: <5C7EFAE6.8000105@canterbury.ac.nz> Steven D'Aprano wrote: > The question is, is [recursive merge] behaviour useful enough and > common enough to be built into dict itself? I think not. It seems like just one possible way of merging values out of many. I think it would be better to provide a merge function or method that lets you specify a function for merging values. -- Greg From brandtbucher at gmail.com Tue Mar 5 17:47:13 2019 From: brandtbucher at gmail.com (Brandt Bucher) Date: Tue, 5 Mar 2019 14:47:13 -0800 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <20190301162645.GM4465@ando.pearwood.info> References: <20190301162645.GM4465@ando.pearwood.info> Message-ID: > > These semantics are intended to match those of update as closely as > possible. For the dict built-in itself, calling keys is redundant as > iteration over a dict iterates over its keys; but for subclasses or other > mappings, update prefers to use the keys method. > > The above paragraph may be inaccurate. Although the dict docstring states > that keys will be called if it exists, this does not seem to be the case > for dict subclasses. Bug or feature? > >>> print(dict.update.__doc__) D.update([E, ]**F) -> None. Update D from dict/iterable E and F. 
If E is present and has a .keys() method, then does: for k in E: D[k] = E[k]
If E is present and lacks a .keys() method, then does: for k, v in E: D[k] = v
In either case, this is followed by: for k in F: D[k] = F[k]

It's actually pretty interesting... and misleading/wrongish. It never says that keys is *called*... in reality, it just checks for the "keys" method before deciding whether to proceed with PyDict_Merge or PyDict_MergeFromSeq2. It should really read more like:

D.update([E, ]**F) -> None. Update D from dict/iterable E and F.
If E is present, has a .keys() method, and is a subclass of dict, then does: for k in E: D[k] = E[k]
If E is present, has a .keys() method, and is not a subclass of dict, then does: for k in E.keys(): D[k] = E[k]
If E is present and lacks a .keys() method, then does: for k, v in E: D[k] = v
In either case, this is followed by: for k in F: D[k] = F[k]

Should our __sub__ behavior be the same (i.e., iterate for dict subclasses and objects without "keys()", otherwise call "keys()" and iterate over that)? __iadd__ calls into this logic already. It seems to be the most "natural" solution here, if we desire behavior analogous to "update".

Brandt

On Fri, Mar 1, 2019 at 8:26 AM Steven D'Aprano wrote:

> Attached is a draft PEP on adding + and - operators to dict for
> discussion.
>
> This should probably go here:
>
> https://github.com/python/peps
>
> but due to technical difficulties at my end, I'm very limited in what I
> can do on Github (at least for now). If there's anyone who would like to
> co-author and/or help with the process, that will be appreciated.
>
>
> --
> Steven
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
> -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From brandtbucher at gmail.com Tue Mar 5 17:54:22 2019 From: brandtbucher at gmail.com (Brandt Bucher) Date: Tue, 5 Mar 2019 14:54:22 -0800 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> Message-ID: > > Should our __sub__ behavior be the same... Sorry, our "__isub__" behavior. Long day... On Tue, Mar 5, 2019 at 2:47 PM Brandt Bucher wrote: > These semantics are intended to match those of update as closely as >> possible. For the dict built-in itself, calling keys is redundant as >> iteration over a dict iterates over its keys; but for subclasses or other >> mappings, update prefers to use the keys method. >> >> The above paragraph may be inaccurate. Although the dict docstring states >> that keys will be called if it exists, this does not seem to be the case >> for dict subclasses. Bug or feature? >> > > >>> print(dict.update.__doc__) > D.update([E, ]**F) -> None. Update D from dict/iterable E and F. > If E is present and has a .keys() method, then does: for k in E: D[k] = > E[k] > If E is present and lacks a .keys() method, then does: for k, v in E: > D[k] = v > In either case, this is followed by: for k in F: D[k] = F[k] > > It's actually pretty interesting... and misleading/wrongish. It never says > that keys is *called*... in reality, it just checks for the "keys" method > before deciding whether to proceed with PyDict_Merge or PyDict > _MergeFromSeq2. It should really read more like: > > D.update([E, ]**F) -> None. Update D from dict/iterable E and F. 
> If E is present, has a .keys() method, and is a subclass of dict, then > does: for k in E: D[k] = E[k] > If E is present, has a .keys() method, and is not a subclass of dict, then > does: for k in E.keys(): D[k] = E[k] > If E is present and lacks a .keys() method, then does: for k, v in E: > D[k] = v > In either case, this is followed by: for k in F: D[k] = F[k] > > Should our __sub__ behavior be the same (i.e., iterate for dict subclasses > and objects without "keys()", otherwise call "keys()" and iterate over > that)? __iadd__ calls into this logic already. It seems to be the most > "natural" solution here, if we desire behavior analogous to "update". > > Brandt > > On Fri, Mar 1, 2019 at 8:26 AM Steven D'Aprano > wrote: > >> Attached is a draft PEP on adding + and - operators to dict for >> discussion. >> >> This should probably go here: >> >> https://github.com/python/peps >> >> but due to technical difficulties at my end, I'm very limited in what I >> can do on Github (at least for now). If there's anyone who would like to >> co-author and/or help with the process, that will be appreciated. >> >> >> -- >> Steven >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg.ewing at canterbury.ac.nz Tue Mar 5 17:56:33 2019 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 06 Mar 2019 11:56:33 +1300 Subject: [Python-ideas] Suggestions: dict.flow_update and dict.__add__ In-Reply-To: References: Message-ID: <5C7EFEA1.7010207@canterbury.ac.nz> Christopher Barker wrote: > That violates an important convention in Python: mutating methods do not > return self. We really want to preserve that convention. 
Smalltalk has an abbreviated way of writing a series of method calls to the same object: x doThis; doThatWith: y; doTheOther. is equivalent to x doThis. x doThatWith: y. x doTheOther. Something like this could no doubt be added to Python, but I'm not sure it would be worth the bother. Giving a short name to the recipient and then writing the calls out explicitly isn't much harder and is clearer to read, IMO. -- Greg From brandtbucher at gmail.com Tue Mar 5 18:14:38 2019 From: brandtbucher at gmail.com (Brandt Bucher) Date: Tue, 5 Mar 2019 15:14:38 -0800 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> Message-ID: Actually, this was made even more condition-y in 3.8. Now we check __iter__ too: D.update([E, ]**F) -> None. Update D from dict/iterable E and F. If E is present, has a .keys() method, is a subclass of dict, and hasn't overridden __iter__, then does: for k in E: D[k] = E[k] If E is present, has a .keys() method, and is not a subclass of dict or has overridden __iter__, then does: for k in E.keys(): D[k] = E[k] If E is present and lacks a .keys() method, then does: for k, v in E: D[k] = v In either case, this is followed by: for k in F: D[k] = F[k] Bleh. On Tue, Mar 5, 2019 at 2:54 PM Brandt Bucher wrote: > Should our __sub__ behavior be the same... > > > Sorry, our "__isub__" behavior. Long day... > > On Tue, Mar 5, 2019 at 2:47 PM Brandt Bucher > wrote: > >> These semantics are intended to match those of update as closely as >>> possible. For the dict built-in itself, calling keys is redundant as >>> iteration over a dict iterates over its keys; but for subclasses or other >>> mappings, update prefers to use the keys method. >>> >>> The above paragraph may be inaccurate. Although the dict docstring >>> states that keys will be called if it exists, this does not seem to be >>> the case for dict subclasses. Bug or feature? 
>>> >> >> >>> print(dict.update.__doc__) >> D.update([E, ]**F) -> None. Update D from dict/iterable E and F. >> If E is present and has a .keys() method, then does: for k in E: D[k] = >> E[k] >> If E is present and lacks a .keys() method, then does: for k, v in E: >> D[k] = v >> In either case, this is followed by: for k in F: D[k] = F[k] >> >> It's actually pretty interesting... and misleading/wrongish. It never >> says that keys is *called*... in reality, it just checks for the "keys" >> method before deciding whether to proceed with PyDict_Merge or PyDict >> _MergeFromSeq2. It should really read more like: >> >> D.update([E, ]**F) -> None. Update D from dict/iterable E and F. >> If E is present, has a .keys() method, and is a subclass of dict, then >> does: for k in E: D[k] = E[k] >> If E is present, has a .keys() method, and is not a subclass of dict, >> then does: for k in E.keys(): D[k] = E[k] >> If E is present and lacks a .keys() method, then does: for k, v in E: >> D[k] = v >> In either case, this is followed by: for k in F: D[k] = F[k] >> >> Should our __sub__ behavior be the same (i.e., iterate for dict >> subclasses and objects without "keys()", otherwise call "keys()" and >> iterate over that)? __iadd__ calls into this logic already. It seems to be >> the most "natural" solution here, if we desire behavior analogous to >> "update". >> >> Brandt >> >> On Fri, Mar 1, 2019 at 8:26 AM Steven D'Aprano >> wrote: >> >>> Attached is a draft PEP on adding + and - operators to dict for >>> discussion. >>> >>> This should probably go here: >>> >>> https://github.com/python/peps >>> >>> but due to technical difficulties at my end, I'm very limited in what I >>> can do on Github (at least for now). If there's anyone who would like to >>> co-author and/or help with the process, that will be appreciated. 
>>> >>> >>> -- >>> Steven >>> _______________________________________________ >>> Python-ideas mailing list >>> Python-ideas at python.org >>> https://mail.python.org/mailman/listinfo/python-ideas >>> Code of Conduct: http://python.org/psf/codeofconduct/ >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL:

From steve at pearwood.info Tue Mar 5 18:14:53 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 6 Mar 2019 10:14:53 +1100 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <98F8539E-D119-4299-B289-1EDA729FF10B@gmail.com> References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <98F8539E-D119-4299-B289-1EDA729FF10B@gmail.com> Message-ID: <20190305231453.GF4465@ando.pearwood.info>

On Sun, Mar 03, 2019 at 09:28:30PM -0500, James Lu wrote:

> I propose that the + sign merge two python dictionaries such that if
> there are conflicting keys, a KeyError is thrown.

This proposal is for a simple, operator-based equivalent to dict.update() which returns a new dict. dict.update has existed since Python 1.5 (something like a quarter of a century!) and has never grown a "unique keys" version.

I don't recall even seeing a request for such a feature. If such a unique keys version is useful, I don't expect it will be useful often.

> This way, d1 + d2 isn't just another obvious way to do {**d1, **d2}.

One of the reasons for preferring + is that it is an obvious way to do something very common, while {**d1, **d2} is as far from obvious as you can get without becoming APL or Perl :-)

If I needed such a unique key version of update, I'd use a subclass:

class StrictDict(dict):
    def __add__(self, other):
        if isinstance(other, dict) and (self.keys() & other.keys()):
            raise KeyError('non-unique keys')
        return super().__add__(other)
    # and similar for __radd__.
rather than burden the entire language, and every user of it, with having to learn the subtle difference between the obvious + operator and the error-prone and unobvious trick of {*d1, *d2}.

(Did you see what I did there? *wink*)

> The second syntax makes it clear that a new dictionary is being
> constructed and that d2 overrides keys from d1.

Only because you have learned the rule that {**d, **e} means to construct a new dict by merging, with the rule that in the event of duplicate keys, the last key seen wins. If you hadn't learned that rule, there is nothing in the syntax which would tell you the behaviour. We could have chosen any rule we liked:

- raise an exception, like you get a TypeError if you pass the same keyword argument to a function twice: spam(foo=1, foo=2);
- first value seen wins;
- last value seen wins;
- random value wins;
- anything else we liked!

There is nothing "clear" about the syntax which makes it obvious which behaviour is implemented. We have to learn it.

> One can reasonably expect or imagine a situation where a section of
> code that expects to merge two dictionaries with non-conflicting keys
> commits a semantic error if it merges two dictionaries with
> conflicting keys.

I can imagine it, but I don't think I've ever needed it, and I can't imagine wanting it often enough to wish it was not just a built-in function or method, but actual syntax.

Do you have some real examples of wanting an error when trying to update a dict if keys match?

> To better explain, imagine a program where options is a global
> variable storing parsed values from the command line.
>
> def verbose_options():
>     if options.quiet:
>         return {'verbose': True}
>
> def quiet_options():
>     if options.quiet:
>         return {'verbose': False}

That seems very artificial to me. Why not use a single function:

def verbose_options():  # There's more than one?
    return {'verbose': not options.quiet}

The way you have written those functions seems weird to me. 
You already have a nice options object, with named fields like "options.quiet", why are you turning it into not one but *two* different dicts, both reporting the same field? And it's buggy: if options.quiet is True, then the key 'quiet' should be True, not the 'verbose' key.

Do you have *two* functions for every preference setting that takes a true/false flag? What do you do for preference settings that take multiple values? Create a vast number of specialised functions, one for each possible value?

def A4_page_options():
    if options.page_size == 'A4':
        return {'page_size': 'A4'}

def US_Letter_page_options():
    if options.page_size == 'US Letter':
        return {'page_size': 'US Letter'}

page_size = (
    A4_page_options() + A3_page_options() + A5_page_options() +
    Foolscape_page_options() + Tabloid_page_options() +
    US_Letter_page_options() + US_Legal_page_options()
    # and about a dozen more...
)

The point is, although I might be wrong, I don't think that this example is a practical, realistic use-case for a unique keys version of update. To me, your approach seems so complicated and artificial that it seems like it was invented specifically to justify this "unique key" operator, not something that we would want to write in real life.

But even if it is real code, the question is not whether it is EVER useful for a dict update to raise an exception on matching keys. The question is whether this is so often useful that this is the behaviour we want to make the default for dicts.

[...]

> Again, I propose that the + sign merge two python dictionaries such
> that if there are conflicting keys, a KeyError is thrown, because such
> 'non-conflicting merge' behavior would be useful in Python.

I don't think it would be, at least not often. If it were common enough to justify a built-in operator to do this, we would have had many requests for a dict.unique_update or similar by now, and I don't think we have.

> It gives
The + and the {**, **} should serve
> different roles.
>
> In other words, explicit + is better than implicit {**, **}, unless
> explicitly suppressed. Here + is explicit whereas {**, **} is
> implicitly allowing inclusive keys,

If I had a cent for every time people misused "explicit" to mean "the proposal that I like", I'd be rich.

In what way is the "+" operator *explicit* about raising an exception on duplicate keys? These are both explicit:

    merge_but_raise_exception_if_any_duplicates(d1, d2)
    merge(d1, d2, raise_if_duplicates=True)

and these are both equally implicit:

    d1 + d2
    {**d1, **d2}

since the behaviour on duplicates is not explicitly stated in clear and obvious language, but implied by the rules of the language.

[...]

> People expect the + operator to be commutative

They are wrong to expect that, because the + operator is already not commutative for: str, bytes, bytearray, list, tuple, array.array, collections.deque, collections.Counter, and possibly others.

-- Steven

From steve at pearwood.info Tue Mar 5 18:36:04 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 6 Mar 2019 10:36:04 +1100 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> <9C42F773-A486-479E-BD9F-14EB588E5093@gmail.com> <20190304162543.GU4465@ando.pearwood.info> Message-ID: <20190305233604.GG4465@ando.pearwood.info>

On Mon, Mar 04, 2019 at 08:01:38PM -0500, James Lu wrote:
> > > On Mar 4, 2019, at 11:25 AM, Steven D'Aprano wrote:
> > Another example would be when reading command line options, where the
> > most common convention is for "last option seen" to win:
> >
> > [steve at ando Lib]$ grep --color=always --color=never "zero" f*.py

[...]

> Indeed, in this case you would want to use {**, **} syntax.

No I would NOT want to use the {**, **} syntax, because it is ugly. 
That's why people ask for + instead. (Or perhaps I should say "as well as" since the double-star syntax is not going away.)

[...]

> > Unless someone can demonstrate that the design of dict.update() was a
> > mistake
> You're making a logical mistake here. + isn't supposed to have
> .update's behavior and it never was supposed to.

James, I'm the author of the PEP, and for the purposes of the proposal, the + operator is supposed to do what I say it is supposed to do.

You might be able to persuade me to change the PEP, if you have a sufficiently good argument, or you can write your own counter PEP making a different choice, but please don't tell me what I intended. I know what I intended, and it is for + to have the same last-key-wins behaviour as update. That's the behaviour which is most commonly requested in the various times this comes up.

> > , and the "require unique keys" behaviour is more common,
> I just have.

No you haven't -- you have simply *declared* that it is more common, without giving any evidence for it.

> 99% of the time you want to have keys from one dict override another,
> you'd be better off doing it in-place and so would be using .update()
> anyways.

I don't know if it is "99% of the time" or 50% of the time or 5%, but this PEP is for the remaining times where we don't want in-place updates but we want a new dict.

I use list.append or list.extend more often than list concatenation, but when I want a new list, list concatenation is very useful. This proposal is about those cases where we want a new dict. 
-- Steven

From shadowranger+pythonideas at gmail.com Tue Mar 5 18:48:41 2019 From: shadowranger+pythonideas at gmail.com (Josh Rosenberg) Date: Tue, 5 Mar 2019 23:48:41 +0000 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <20190305231453.GF4465@ando.pearwood.info> References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <98F8539E-D119-4299-B289-1EDA729FF10B@gmail.com> <20190305231453.GF4465@ando.pearwood.info> Message-ID:

On Tue, Mar 5, 2019 at 11:16 PM Steven D'Aprano wrote:

> On Sun, Mar 03, 2019 at 09:28:30PM -0500, James Lu wrote:
>
> > I propose that the + sign merge two python dictionaries such that if
> > there are conflicting keys, a KeyError is thrown.
>
> This proposal is for a simple, operator-based equivalent to
> dict.update() which returns a new dict. dict.update has existed since
> Python 1.5 (something like a quarter of a century!) and never grown a
> "unique keys" version.
>
> I don't recall even seeing a request for such a feature. If such a
> unique keys version is useful, I don't expect it will be useful often.

I have one argument in favor of such a feature: It preserves concatenation semantics. + means one of two things in all code I've ever seen (Python or otherwise):

1. Numeric addition (including element-wise numeric addition as in Counter and numpy arrays)
2. Concatenation (where the result preserves all elements, in order, including, among other guarantees, that len(seq1) + len(seq2) == len(seq1 + seq2))

dict addition that didn't reject non-unique keys wouldn't fit *either* pattern; the main proposal (making it equivalent to left.copy(), followed by .update(right)) would have the left hand side win on ordering, the right hand side on values, and wouldn't preserve the length invariant of concatenation. 
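The broken length invariant is easy to check directly; the dicts below are my own illustration, not values from the thread:

```python
d1 = {'a': 1, 'b': 2}
d2 = {'b': 3, 'c': 4}
merged = {**d1, **d2}

# Concatenation guarantees len(s1 + s2) == len(s1) + len(s2),
# but here the duplicate key 'b' collapses into a single entry:
assert len(d1) + len(d2) == 4
assert len(merged) == 3

# Left side wins on ordering (insertion order of d1 comes first),
# right side wins on values:
assert list(merged) == ['a', 'b', 'c']
assert merged == {'a': 1, 'b': 3, 'c': 4}
```

So a lossy dict merge behaves like neither numeric addition nor sequence concatenation, which is the crux of the objection.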
At least when repeated keys are rejected, most concatenation invariants are preserved; order is all of the left elements followed by all of the right, and no elements are lost. > > > This way, d1 + d2 isn?t just another obvious way to do {**d1, **d2}. > > One of the reasons for preferring + is that it is an obvious way to do > something very common, while {**d1, **d2} is as far from obvious as you > can get without becoming APL or Perl :-) > > >From the moment PEP 448 published, I've been using unpacking as a more composable/efficient form of concatenation, merging, etc. I'm sorry you don't find it obvious, but a couple e-mails back you said: "The Zen's prohibition against guessing in the face of ambiguity does not mean that we must not add a feature to the language that requires the user to learn what it does first." Learning to use the unpacking syntax in the case of function calls is necessary for tons of stuff (writing general function decorators, handling initialization in class hierarchies, etc.), and as PEP 448 is titled, this is just a generalization combining the features of unpacking arguments with collection literals. > The second syntax makes it clear that a new dictionary is being > > constructed and that d2 overrides keys from d1. > > Only because you have learned the rule that {**d, **e) means to > construct a new dict by merging, with the rule that in the event of > duplicate keys, the last key seen wins. If you hadn't learned that rule, > there is nothing in the syntax which would tell you the behaviour. We > could have chosen any rule we liked: > > No, because we learned the general rule for dict literals that {'a': 1, 'a': 2} produces {'a': 2}; the unpacking generalizations were very good about adhering to the existing rules, so it was basically zero learning curve if you already knew dict literal rules and less general unpacking rules. 
The only part to "learn" is that when there is a conflict between dict literal rules and function call rules, dict literal rules win. To be clear: I'm not supporting + as raising error on non-unique keys. Even if it makes dict + dict adhere to the rules of concatenation, I don't think it's a common or useful functionality. My order of preferences is roughly: 1. Do nothing (even if you don't like {**d1, **d2}, .copy() followed by .update() is obvious, and we don't need more than one way to do it) 2. Add a new method to dict, e.g. dict.merge (whether it's a class method or an instance method is irrelevant to me) 3. Use | (because dicts are *far* more like sets than they are like sequences, and the semi-lossy rules of unioning make more sense there); it would also make - make sense, since + is only matched by - in numeric contexts; on collections, | and - are paired. And I consider the - functionality the most useful part of this whole proposal (because I *have* wanted to drop a collection of known blacklisted keys from a dict and while it's obvious you can do it by looping, I always wanted to be able to do something like d1.keys() -= badkeys, and remain disappointed nothing like it is available) -Josh Rosenberg -------------- next part -------------- An HTML attachment was scrubbed... 
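Both behaviours Josh describes above -- a concatenation-style merge that rejects duplicate keys, and a subtraction that drops a set of blacklisted keys -- are easy to prototype (these helpers are illustrations for the thread, not part of any PEP):

```python
def strict_merge(d1, d2):
    # Concatenation-style merge: duplicate keys are an error, so the
    # length invariant len(d1) + len(d2) == len(result) always holds.
    dup = d1.keys() & d2.keys()
    if dup:
        raise KeyError(f"duplicate keys: {sorted(dup)}")
    return {**d1, **d2}

def dict_sub(d, badkeys):
    # "d - badkeys": a new dict without the blacklisted keys.
    return {k: v for k, v in d.items() if k not in badkeys}

merged = strict_merge({"a": 1}, {"b": 2})
assert len(merged) == 2                     # no elements are lost
assert dict_sub({"a": 1, "b": 2, "c": 3}, {"b", "c"}) == {"a": 1}
```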
URL: From guido at python.org Tue Mar 5 19:07:58 2019 From: guido at python.org (Guido van Rossum) Date: Tue, 5 Mar 2019 16:07:58 -0800 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <98F8539E-D119-4299-B289-1EDA729FF10B@gmail.com> <20190305231453.GF4465@ando.pearwood.info> Message-ID: On Tue, Mar 5, 2019 at 3:50 PM Josh Rosenberg < shadowranger+pythonideas at gmail.com> wrote: > > On Tue, Mar 5, 2019 at 11:16 PM Steven D'Aprano > wrote: > >> On Sun, Mar 03, 2019 at 09:28:30PM -0500, James Lu wrote: >> >> > I propose that the + sign merge two python dictionaries such that if >> > there are conflicting keys, a KeyError is thrown. >> >> This proposal is for a simple, operator-based equivalent to >> dict.update() which returns a new dict. dict.update has existed since >> Python 1.5 (something like a quarter of a century!) and never grown a >> "unique keys" version. >> >> I don't recall even seeing a request for such a feature. If such a >> unique keys version is useful, I don't expect it will be useful often. >> > > I have one argument in favor of such a feature: It preserves concatenation > semantics. + means one of two things in all code I've ever seen (Python or > otherwise): > > 1. Numeric addition (including element-wise numeric addition as in Counter > and numpy arrays) > 2. Concatenation (where the result preserves all elements, in order, > including, among other guarantees, that len(seq1) + len(seq2) == len(seq1 + > seq2)) > > dict addition that didn't reject non-unique keys wouldn't fit *either* > pattern; the main proposal (making it equivalent to left.copy(), followed > by .update(right)) would have the left hand side would win on ordering, the > right hand side on values, and wouldn't preserve the length invariant of > concatenation. 
At least when repeated keys are rejected, most concatenation
> invariants are preserved; order is all of the left elements followed by all
> of the right, and no elements are lost.
>

I must by now have seen dozens of posts complaining about this aspect of
the proposal. I think this is just making up rules (e.g. "+ never loses
information") to deal with an aspect of the design where a *choice* must
be made. This may reflect the Zen of Python's "In the face of ambiguity,
refuse the temptation to guess." But really, that's a pretty silly rule
(truly, they aren't all winners). Good interface design constantly makes
choices in ambiguous situations, because the alternative is constantly
asking, and that's just annoying.

We have a plethora of examples (in fact, almost all alternatives
considered) of situations related to dict merging where a choice is made
between conflicting values for a key, and it's always the value further
to the right that wins: from d[k] = v (which overrides the value when k
is already in the dict) to d1.update(d2) (which lets the values in d2
win), including the much lauded {**d1, **d2} and even plain
{'a': 1, 'a': 2} has a well-defined meaning where the latter value wins.

As to why raising is worse: First, none of the other situations I listed
above raises for conflicts. Second, there's the experience of
str+unicode in Python 2, which raises if the str argument contains any
non-ASCII bytes. In fact, we disliked it so much that we changed the
language incompatibly to deal with it.

--
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
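Guido's observation -- that everywhere dicts already merge, the value further to the right wins -- can be checked directly:

```python
# Item assignment: the later value wins.
d = {}
d["k"] = 1
d["k"] = 2
assert d == {"k": 2}

# dict.update(): values from the right-hand dict win.
d1, d2 = {"k": 1, "a": 0}, {"k": 2}
d1.update(d2)
assert d1 == {"k": 2, "a": 0}

# Unpacking: the last occurrence of a key wins.
assert {**{"k": 1}, **{"k": 2}} == {"k": 2}

# Even a plain literal with a duplicated key is well-defined: last wins.
assert {"k": 1, "k": 2} == {"k": 2}
```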
URL: From shadowranger+pythonideas at gmail.com Tue Mar 5 19:46:57 2019 From: shadowranger+pythonideas at gmail.com (Josh Rosenberg) Date: Wed, 6 Mar 2019 00:46:57 +0000 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <98F8539E-D119-4299-B289-1EDA729FF10B@gmail.com> <20190305231453.GF4465@ando.pearwood.info> Message-ID: On Wed, Mar 6, 2019 at 12:08 AM Guido van Rossum wrote: > On Tue, Mar 5, 2019 at 3:50 PM Josh Rosenberg < > shadowranger+pythonideas at gmail.com> wrote: > >> >> On Tue, Mar 5, 2019 at 11:16 PM Steven D'Aprano >> wrote: >> >>> On Sun, Mar 03, 2019 at 09:28:30PM -0500, James Lu wrote: >>> >>> > I propose that the + sign merge two python dictionaries such that if >>> > there are conflicting keys, a KeyError is thrown. >>> >>> This proposal is for a simple, operator-based equivalent to >>> dict.update() which returns a new dict. dict.update has existed since >>> Python 1.5 (something like a quarter of a century!) and never grown a >>> "unique keys" version. >>> >>> I don't recall even seeing a request for such a feature. If such a >>> unique keys version is useful, I don't expect it will be useful often. >>> >> >> I have one argument in favor of such a feature: It preserves >> concatenation semantics. + means one of two things in all code I've ever >> seen (Python or otherwise): >> >> 1. Numeric addition (including element-wise numeric addition as in >> Counter and numpy arrays) >> 2. 
Concatenation (where the result preserves all elements, in order, >> including, among other guarantees, that len(seq1) + len(seq2) == len(seq1 + >> seq2)) >> >> dict addition that didn't reject non-unique keys wouldn't fit *either* >> pattern; the main proposal (making it equivalent to left.copy(), followed >> by .update(right)) would have the left hand side would win on ordering, the >> right hand side on values, and wouldn't preserve the length invariant of >> concatenation. At least when repeated keys are rejected, most concatenation >> invariants are preserved; order is all of the left elements followed by all >> of the right, and no elements are lost. >> > > I must by now have seen dozens of post complaining about this aspect of > the proposal. I think this is just making up rules (e.g. "+ never loses > information") to deal with an aspect of the design where a *choice* must be > made. This may reflect the Zen of Python's "In the face of ambiguity, > refuse the temptation to guess." But really, that's a pretty silly rule > (truly, they aren't all winners). Good interface design constantly makes > choices in ambiguous situations, because the alternative is constantly > asking, and that's just annoying. > > We have a plethora of examples (in fact, almost all alternatives > considered) of situations related to dict merging where a choice is made > between conflicting values for a key, and it's always the value further to > the right that wins: from d[k] = v (which overrides the value when k is > already in the dict) to d1.update(d2) (which lets the values in d2 win), > including the much lauded {**d1, **d2} and even plain {'a': 1, 'a': 2} has > a well-defined meaning where the latter value wins. > > Yeah. And I'm fine with the behavior for update because the name itself is descriptive; we're spelling out, in English, that we're update-ing the thing it's called on, so it makes sense to have the thing we're sourcing for updates take precedence. 
Similarly, for dict literals (and by extension, unpacking), it's
following an existing Python convention which doesn't contradict
anything else. Overloading + lacks the clear descriptive aspect of
update that describes the goal of the operation, and contradicts
conventions (in Python and elsewhere) about how + works (addition or
concatenation, and a lot of people don't even like it doing the latter,
though I'm not that pedantic).

A couple "rules" from C++ on overloading are "*Whenever the meaning of
an operator is not obviously clear and undisputed, it should not be
overloaded. Instead, provide a function with a well-chosen name.*" and
"*Always stick to the operator's well-known semantics.*" (Source:
https://stackoverflow.com/a/4421708/364696 , though the principle is
restated in many other places). Obviously the C++ community isn't
perfect on this (see iostream and <> operators), but they're otherwise
pretty consistent. + means addition, and in many languages including C++
strings, concatenation, but I don't know of any languages outside the
"esoteric" category that use it for things that are neither addition nor
concatenation.

You've said you don't want the whole plethora of set-like behaviors on
dicts, but dicts are syntactically and semantically much more like sets
than they are like sequences, and if you add + (with semantics differing
from both sets and sequences), the language becomes less consistent.

I'm not against making it easier to merge dictionaries. But people seem
to be arguing that {**d1, **d2} is bad because of magic punctuation that
obscures meaning, when IMO:

d3 = d1 + d2

is obscuring meaning by adding yet a third rule for what + means,
inconsistent with both existing rules (from both Python and the majority
of languages I've had cause to use). A named method (class or instance)
or top-level function (a la sorted) is more explicit, easier to look up
(after all, the major complaint about ** syntax is the difficulty of
finding the documentation on it).
It's also easier to make it do the right thing; d1 + d2 + d3 + ... dN is inefficient (makes many unnecessary temporaries), {**d1, **d2, **d3, ..., **dN} is efficient but obscure (and not subclass friendly), but a varargs method like dict.combine(d1, d2, d3, ..., dN) (or merge, or whatever; I'm not trying to bikeshed) is correct, efficient, and most importantly, easy to look up documentation for. I occasionally find it frustrating that concatenation exists given the wealth of Schlemiel the Painter's algorithms it encourages, and the "correct" solution for combining sequences (itertools.chain for general cases, str.join/bytes.join for special cases) being less obvious means my students invariably use the "wrong" tool out of convenience (and it's not really wrong in 90% of code where the lengths are always short, but then they use it where lengths are often huge and suffer for it). If we're going to make dict merging more convenient, I'd prefer we make the obvious, convenient solution also the one that doesn't encourage non-scalable anti-patterns. As to why raising is worse: First, none of the other situations I listed > above raises for conflicts. Second, there's the experience of str+unicode > in Python 2, which raises if the str argument contains any non-ASCII bytes. > In fact, we disliked it so much that we changed the language incompatibly > to deal with it. > Agreed, I don't like raising. It's consistent with + (the only argument in favor of it really), but it's a bad idea, for all the reasons you mention. - Josh Rosenberg -------------- next part -------------- An HTML attachment was scrubbed... 
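A varargs merge along the lines of the dict.combine Josh sketches above (his illustrative name, not an existing method) avoids the chain of temporaries that repeated + would create -- each + in d1 + d2 + ... + dN copies everything accumulated so far, while a single-pass merge is linear:

```python
def merge(*dicts):
    # Merge any number of mappings into one new dict in a single pass;
    # later arguments win on duplicate keys (update semantics).
    result = {}
    for d in dicts:
        result.update(d)
    return result

assert merge({"a": 1}, {"b": 2}, {"a": 3}) == {"a": 3, "b": 2}
assert merge() == {}
```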
URL:

From raymond.hettinger at gmail.com  Tue Mar  5 21:05:52 2019
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Tue, 5 Mar 2019 18:05:52 -0800
Subject: [Python-ideas] Dict joining using + and +=
In-Reply-To: <5C7EF4A3.4060407@canterbury.ac.nz>
References: <20190304164408.GV4465@ando.pearwood.info>
 <20190305092833.GZ4465@ando.pearwood.info>
 <5C7EF4A3.4060407@canterbury.ac.nz>
Message-ID:

> On Mar 5, 2019, at 2:13 PM, Greg Ewing wrote:
>
> Rhodri James wrote:
>> I have to go and look in the documentation because I expect the union operator to be '+'.
>
> Anyone raised on Pascal is likely to find + and * more
> natural. Pascal doesn't have bitwise operators, so it
> re-uses + and * for set operations. I like the economy
> of this arrangement -- it's not as if there's any
> other obvious meaning that + and * could have for sets.

The language SETL (the language of sets) also uses + and * for set
operations. [1]

For us though, the decision to use | and & is set in stone. The time for
debating the decision was 19 years ago. [2]

Raymond

[1] https://www.linuxjournal.com/article/6805
[2] https://www.python.org/dev/peps/pep-0218/

From guido at python.org  Tue Mar  5 21:25:39 2019
From: guido at python.org (Guido van Rossum)
Date: Tue, 5 Mar 2019 18:25:39 -0800
Subject: [Python-ideas] Dict joining using + and +=
In-Reply-To:
References: <20190304164408.GV4465@ando.pearwood.info>
 <20190305092833.GZ4465@ando.pearwood.info>
 <5C7EF4A3.4060407@canterbury.ac.nz>
Message-ID:

On Tue, Mar 5, 2019 at 6:07 PM Raymond Hettinger <
raymond.hettinger at gmail.com> wrote:

> > On Mar 5, 2019, at 2:13 PM, Greg Ewing
> wrote:
> >
> > Rhodri James wrote:
> >> I have to go and look in the documentation because I expect the union
> operator to be '+'.
> >
> > Anyone raised on Pascal is likely to find + and * more
> > natural. Pascal doesn't have bitwise operators, so it
> > re-uses + and * for set operations.
I like the economy
> > of this arrangement -- it's not as if there's any
> > other obvious meaning that + and * could have for sets.
>
> The language SETL (the language of sets) also uses + and * for set
> operations. [1]
>

So the secret is out: Python inherits a lot from SETL, through ABC -- ABC
was heavily influenced by SETL.

> [1] https://www.linuxjournal.com/article/6805
> [2] https://www.python.org/dev/peps/pep-0218/

--
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jorropo.pgm at gmail.com  Tue Mar  5 23:01:25 2019
From: jorropo.pgm at gmail.com (Jorropo .)
Date: Wed, 6 Mar 2019 05:01:25 +0100
Subject: [Python-ideas] Allow creation of polymorph function (async
 function executable syncronously)
Message-ID:

I was doing some async networking and I wondered why I have to use 2
different APIs for doing the same things in async or sync regimes. Even
if we make 2 perfectly identical APIs (except one function will be sync
and the other async), they will still be 2 different pieces of code.

So I first thought to allow await in synchronous functions, but that
creates some problems (ex: an async function calling asyncio.create_task),
so if we allow that we have to assume we are always in async regime
(like JS).

Or we can differentiate async functions which can be awaited in
synchronous regime, maybe with a new keyword (here I will use polymorph
due to a lack of imagination, but I find that one too long)?

So a polymorph function can be awaited in a synchronous function, and a
polymorph function can only await polymorph functions.

Polymorph functions work exactly like async functions BUT they guarantee
the ability to execute synchronously. And in a synchronous regime, if an
await needs to wait (like asyncio.sleep or a network operation), it just
waits (like the equivalent of this function in a synchronous way).

So why make that? To provide the same API for async and sync regimes
when possible, for example an HTTP API. This allows writing fewer
libraries.
Synchronous users can just use the library like any other sync lib (with
the keyword await for executing, but personally, I think that is worth
it). And asynchronous users can run multiple tasks using the same lib.
Moving from one regime to the other is simpler, source code size is
reduced (no need to create 2 APIs for the same lib), and time is saved
for the same reason.

Also, why does it need await in a synchronous function, why not just
execute a polymorph function like any sync function when called in a
sync function? Because we need to create runnable objects for
asyncio.run, ...

So I would like your thoughts about that, what can be improved, and a
better name for polymorph?
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From wes.turner at gmail.com  Wed Mar  6 00:41:40 2019
From: wes.turner at gmail.com (Wes Turner)
Date: Wed, 6 Mar 2019 00:41:40 -0500
Subject: [Python-ideas] Suggestions: dict.flow_update and dict.__add__
In-Reply-To:
References:
Message-ID:

dicttoolz has functions for working with these objects; including
dicttoolz.merge (which returns a reference to the merged dicts but does
not mutate the arguments passed).

https://toolz.readthedocs.io/en/latest/api.html#dicttoolz
https://toolz.readthedocs.io/en/latest/api.html#toolz.dicttoolz.merge

pyrsistent has a PRecord class with invariants and type checking that
precedes dataclasses. pyrsistent also has 'freeze' and 'thaw' functions
for immutability. PRecord extends PMap, which implements __add__ as
self.update(arg) (which does not mutate self)

https://github.com/tobgu/pyrsistent/blob/master/README.rst#precord
https://github.com/tobgu/pyrsistent/blob/master/pyrsistent/_pmap.py

On Tuesday, March 5, 2019, Guido van Rossum wrote:

> If you have to tell such a long and convoluted story to explain a name
> that you've picked out of the blue and that has no equivalent in other
> Python data types, it's probably a bad idea.
If you're proposing that other
> mutating methods also gain a flow_XXX variant, please, no! That's like the
> theory of supersymmetry (SUSY) in particle physics, where every known
> particle from the Standard Model would have to have a much heavier
> "superpartner" just to make some esoteric idea work.
>
> On Tue, Mar 5, 2019 at 12:54 AM Jonathan Fine wrote:
>
>> SUMMARY
>> Instead of using dict + dict, perhaps use dict.flow_update. Here,
>> flow_update is just like update, except that it returns self.
>>
>> BACKGROUND
>> There's a difference between a sorted copy of a list, and sorting the
>> list in place.
>>
>> >>> items = [2, 0, 1, 9]
>> >>> sorted(items), items
>> ([0, 1, 2, 9], [2, 0, 1, 9])
>> >>> items.sort(), items
>> (None, [0, 1, 2, 9])
>>
>> In Python, mutating methods generally return None. Here, this prevents
>> beginners thinking their code has produced a sorted copy of a list,
>> when in fact it has done an in-place sort on the list. If they write
>> >>> aaa = my_list.sort()
>> they'll get a None error when they use aaa.
>>
>> The same goes for dict.update. This is a useful feature, particularly
>> for beginners. It helps them think clearly, and express themselves
>> clearly.
>>
>> THE PROBLEM
>> This returning None can be a nuisance, sometimes. Suppose we have a
>> dictionary of default values, and a dictionary of user-supplied
>> options. We wish to combine the two dictionaries, say into a new
>> combined dictionary.
>>
>> One way to do this is:
>>
>> combined = defaults.copy()
>> combined.update(options)
>>
>> But this is awkward when you're in the middle of calling a function:
>>
>> call_big_method(
>> # lots of arguments, one to a line, with comments
>> arg = combined, # Look up to see what combined is.
>> # more arguments
>> )
>>
>> USING +
>> There's a suggestion, that instead one extends Python so that this works:
>> arg = defaults + options # What does '+' mean here?
>>
>> USING flow_update
>> Here's another suggestion.
Instead write:
>> dict_arg = defaults.copy().flow_update(options) # Is this
>> clearer?
>>
>> IMPLEMENTATION
>> Here's an implementation, as a subclass of dict.
>>
>> class mydict(dict):
>>
>> def flow_update(self, *argv, **kwargs):
>> self.update(*argv, **kwargs)
>> return self
>>
>> def copy(self):
>> return self.__class__(self)
>>
>> A DIRTY HACK
>> Not tested, using an assignment expression.
>> dict_arg = (tmp := defaults.copy(), tmp.update(options))[0]
>> Not recommend.
>>
>> --
>> Jonathan
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> https://mail.python.org/mailman/listinfo/python-ideas
>> Code of Conduct: http://python.org/psf/codeofconduct/
>>
>
> --
> --Guido van Rossum (python.org/~guido)
> -------------- next part --------------
An HTML attachment was scrubbed...
URL:

From njs at pobox.com  Wed Mar  6 00:53:29 2019
From: njs at pobox.com (Nathaniel Smith)
Date: Tue, 5 Mar 2019 21:53:29 -0800
Subject: [Python-ideas] Allow creation of polymorph function (async
 function executable syncronously)
In-Reply-To:
References:
Message-ID:

Defining a single polymorphic function is easy at the library level.
For example, with asyncio:

----

import asyncio
import functools

def maybe_async(fn):
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        coro = fn(*args, **kwargs)
        try:
            asyncio.get_running_loop()
        except RuntimeError:
            # no event loop is running: run the coroutine to completion
            return asyncio.run(coro)
        # a loop is running: return the coroutine for the caller to await
        return coro
    return wrapper

@maybe_async
async def my_func(...):
    ... use asyncio freely in here ...

----

You can't do it at the language level though (e.g. with your proposed
'polymorph' keyword), because the language doesn't know whether an
event loop is running or not.

Extending this from a single function to a whole library API is
substantially more complex, because you have to wrap every function
and method, deal with __iter__ versus __aiter__, etc.

-n

On Tue, Mar 5, 2019 at 8:02 PM Jorropo .
wrote: > > I was doing some async networking and I wondered, why I have to use 2 different api for making the same things in async or sync regime. > Even if we make 2 perfectly identical api (except function will be sync and async), it will still 2 different code). > > So I first thinked to allow await in syncronous function but that create some problems (ex: an async function calling async.create_task) so if we allow that we have to asume to allways be in async regime (like js). > > Or we can differentiate async function wich can be awaited in syncronous regime, maybe with a new keyword (here I will use polymorph due to a lack of imagination but I find that one too long) ? > > So a polymorph function can be awaited in a syncronous function, and a polymorph function can only await polymorph functions. > > Polymorph function work exacly like async function BUT they assure of the ability to execute syncronously. > And in a syncronous regime if an await need to wait (like async.sleep or network operation), just wait (like the equivalent of this function in syncronous way). > > So why made that ? > To provide the same api for async and sync regime when its possible, example http api. > This allow to code less librairy. > Syncronous users can just use the librairy like any other sync lib (with the keyword await for executing but, personally, I think that is worth). > And asyncronous users can run multiples tasks using the same lib. > Moving from a regime to an other is simpler, source code size is reduced (doesn't need to create 2 api for the same lib), gain of time for the same reason. > > Also why it need to await in syncronous function, why not just execute polymorph function like any sync function while called in a sync function ? > Because we need to create runnable objects for async.run, ... > > So I would have your though about that, what can be improved, a better name for polymorph ? 
> _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -- Nathaniel J. Smith -- https://vorpus.org From wes.turner at gmail.com Wed Mar 6 01:07:04 2019 From: wes.turner at gmail.com (Wes Turner) Date: Wed, 6 Mar 2019 01:07:04 -0500 Subject: [Python-ideas] Allow creation of polymorph function (async function executable syncronously) In-Reply-To: References: Message-ID: Is this what syncer does with a sync() function? Why isn't this a built-in? https://github.com/miyakogi/syncer On Wednesday, March 6, 2019, Nathaniel Smith wrote: > Defining a single polymorphic function is easy at the library level. > For example, with asyncio: > > ---- > > def maybe_async(fn): > @functools.wraps(fn) > def wrapper(*args, **kwargs): > coro = fn(*args, **kwargs) > if asyncio.get_running_loop() is not None: > return coro > else: > return await coro > > @maybe_async > async def my_func(...): > ... use asyncio freely in here ... > > ---- > > You can't do it at the language level though (e.g. with your proposed > 'polymorph' keyword), because the language doesn't know whether an > event loop is running or not. > > Extending this from a single function to a whole library API is > substantially more complex, because you have to wrap every function > and method, deal with __iter__ versus __aiter__, etc. > > -n > > On Tue, Mar 5, 2019 at 8:02 PM Jorropo . wrote: > > > > I was doing some async networking and I wondered, why I have to use 2 > different api for making the same things in async or sync regime. > > Even if we make 2 perfectly identical api (except function will be sync > and async), it will still 2 different code). 
> > > > So I first thinked to allow await in syncronous function but that create > some problems (ex: an async function calling async.create_task) so if we > allow that we have to asume to allways be in async regime (like js). > > > > Or we can differentiate async function wich can be awaited in syncronous > regime, maybe with a new keyword (here I will use polymorph due to a lack > of imagination but I find that one too long) ? > > > > So a polymorph function can be awaited in a syncronous function, and a > polymorph function can only await polymorph functions. > > > > Polymorph function work exacly like async function BUT they assure of > the ability to execute syncronously. > > And in a syncronous regime if an await need to wait (like async.sleep or > network operation), just wait (like the equivalent of this function in > syncronous way). > > > > So why made that ? > > To provide the same api for async and sync regime when its possible, > example http api. > > This allow to code less librairy. > > Syncronous users can just use the librairy like any other sync lib (with > the keyword await for executing but, personally, I think that is worth). > > And asyncronous users can run multiples tasks using the same lib. > > Moving from a regime to an other is simpler, source code size is reduced > (doesn't need to create 2 api for the same lib), gain of time for the same > reason. > > > > Also why it need to await in syncronous function, why not just execute > polymorph function like any sync function while called in a sync function ? > > Because we need to create runnable objects for async.run, ... > > > > So I would have your though about that, what can be improved, a better > name for polymorph ? > > _______________________________________________ > > Python-ideas mailing list > > Python-ideas at python.org > > https://mail.python.org/mailman/listinfo/python-ideas > > Code of Conduct: http://python.org/psf/codeofconduct/ > > > > -- > Nathaniel J. 
Smith -- https://vorpus.org
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From stefan_ml at behnel.de  Wed Mar  6 03:33:54 2019
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Wed, 6 Mar 2019 09:33:54 +0100
Subject: [Python-ideas] PEP: Dict addition and subtraction
In-Reply-To:
References: <20190301162645.GM4465@ando.pearwood.info>
 <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com>
 <20190302035224.GO4465@ando.pearwood.info>
 <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr>
Message-ID:

INADA Naoki wrote on 05.03.19 at 08:03:
> On Tue, Mar 5, 2019 at 12:02 AM Stefan Behnel wrote:
>> INADA Naoki wrote on 04.03.19 at 11:15:
>>> Why statement is not enough?
>>
>> I'm not sure I understand why you're asking this, but a statement is
>> "not enough" because it's a statement and not an expression. It does
>> not replace the convenience of an expression.
>
> It seems tautology and say nothing.

That's close to what I thought when I read your question. :)

> What is "convenience of an expression"?

It's the convenience of being able to write an expression that generates
the thing you need, rather than having to split code into statements that
create it step by step before you can use it.

Think of comprehensions versus for-loops. Comprehensions are expressions
that don't add anything to the language that a for-loop cannot achieve.
Still, everyone uses them because they are extremely convenient.

> Is it needed to make Python more readable language?

No, just like comprehensions, it's not "needed". It's just convenient.

> Anyway, If "there is expression" is the main reason for this proposal,
> symbolic operator is not necessary.

As said, "needed" is not the right word. Being able to use an operator
closes a gap in the language.
Just like list comprehensions fit generator expressions and vice versa.
There is no "need" for being able to write

    [x**2 for x in seq]
    {x**2 for x in seq}

when you can equally well write

    list(x**2 for x in seq)
    set(x**2 for x in seq)

But I certainly wouldn't complain about that redundancy in the language.

> `new = d1.updated(d2)` or `new = dict.merge(d1, d2)` are enough. Python
> preferred name over symbol in general. Symbols are readable and
> understandable only when it has good math metaphor.
>
> Sets has symbol operator because it is well known in set in math, not
> because set is frequently used.
>
> In case of dict, there is no simple metaphor in math.

So then, if "list+list" and "tuple+tuple" wasn't available through an
operator, would you also reject the idea of adding it, arguing that we
could use this:

    L = L1.extended(L2)

I honestly do not see the math relation in concatenation via "+". But,
given that "+" and "|" already have the meaning of "merging two
containers into one" in Python, I think it makes sense to allow that
also for dicts.

> It just cryptic and hard to Google.

I honestly doubt that it's something people would have to search for any
more than they have to search for the list "+" operation. My guess is
that it's pretty much what most people would try first when they have
the need to merge two dicts, and only failing that, they would start a
web search. In comparison, very few users would be able to come up with
"{**d1, **d2}" on their own, or even "d1.updated(d2)".

My point is, given the current language, "dict+dict" is a gap that is
worth closing.
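Stefan's redundancy example, spelled out -- both forms produce the same result; the comprehension is simply the more convenient expression:

```python
seq = range(5)

# Comprehension syntax vs. constructor plus generator expression:
assert [x**2 for x in seq] == list(x**2 for x in seq) == [0, 1, 4, 9, 16]
assert {x**2 for x in seq} == set(x**2 for x in seq) == {0, 1, 4, 9, 16}
```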
Stefan From contact at brice.xyz Wed Mar 6 04:03:31 2019 From: contact at brice.xyz (Brice Parent) Date: Wed, 6 Mar 2019 10:03:31 +0100 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <5C7EFAE6.8000105@canterbury.ac.nz> References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> <20190305161100.GE4465@ando.pearwood.info> <5C7EFAE6.8000105@canterbury.ac.nz> Message-ID: <8812767a-1010-0923-36d4-22af4d2665ec@brice.xyz> On 05/03/2019 at 23:40, Greg Ewing wrote: > Steven D'Aprano wrote: >> The question is, is [recursive merge] behaviour useful enough and > > common enough to be built into dict itself? > > I think not. It seems like just one possible way of merging > values out of many. I think it would be better to provide > a merge function or method that lets you specify a function > for merging values. > That's what this conversation led me to. I'm not against the addition for the most general usage (and current PEP's describes the behaviour I would expect before reading the doc), but for all other more specific usages, where we intend any special or not-so-common behaviour, I'd go with modifying Dict.update like this: foo.update(bar, on_collision=updator)  # Although I'm not a fan of the keyword I used `updator` being a simple function like this one:

def updator(updated, updator, key) -> Any:
    if key == "related":
        return updated[key].update(updator[key])
    if key == "tags":
        return updated[key] + updator[key]
    if key in ["a", "b", "c"]:  # Those
        return updated[key]
    return updator[key]

There's nothing here that couldn't be made today by using a custom update function, but leaving the burden of checking for values that are in both and actually inserting the new values to Python's language, and keeping on our side only the parts that are specific to our use case, makes in my opinion the code more readable, with fewer possible bugs and possibly better optimization. From songofacandy at gmail.com Wed Mar 6 04:26:06 2019 From: songofacandy at gmail.com (Inada Naoki) Date: Wed, 6 Mar 2019 18:26:06 +0900 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> Message-ID: On Wed, Mar 6, 2019 at 5:34 PM Stefan Behnel wrote: > > INADA Naoki wrote on 05.03.19 at 08:03:> On Tue, Mar 5, 2019 at 12:02 AM > Stefan Behnel wrote: > >> INADA Naoki wrote on 04.03.19 at 11:15: > >>> Why statement is not enough? > >> > >> I'm not sure I understand why you're asking this, but a statement is > >> "not enough" because it's a statement and not an expression. It does > >> not replace the convenience of an expression. > > > > It seems tautology and say nothing. > > That's close to what I thought when I read your question. :) > > > > What is "convenience of an expression"? > > It's the convenience of being able to write an expression that generates > the thing you need, rather than having to split code into statements that > create it step by step before you can use it. > I don't think it's reasonable rationale for adding operator. First, Python sometimes force people to use statement intentionally. Strictly speaking, dict.update() is an expression. But it not return `self` so you must split statements. It's design decision. So "add operator because I want expression" is bad reasoning to me.
If it is valid reasoning, every mutating method should have operator. It's crazy idea. Second, operator is not required for expression. And adding operator must have high bar than adding method because it introduces more complexity and it could seen cryptic especially when the operator doesn't have good math metaphor. So I proposed adding dict.merge() instead of adding dict + as a counter proposal. If "I want expression" is really main motivation, it must be enough. > Think of comprehensions versus for-loops. Comprehensions are expressions > that don't add anything to the language that a for-loop cannot achieve. > Still, everyone uses them because they are extremely convenient. > I agree that comprehension is extremely convenient. But I think the main reason is it is compact and readable. If comprehension is not compact and readable as for-loop, it's not extremely convenient. > > > Is it needed to make Python more readable language? > > No, just like comprehensions, it's not "needed". It's just convenient. > I think comprehension is needed to make Python more readable language, not just for convenient. > > > Anyway, If "there is expression" is the main reason for this proposal, > > symbolic operator is not necessary. > > As said, "needed" is not the right word. Maybe, I misunderstood nuance of the word "needed". English and Japanese are very different language. sorry. > Being able to use a decorator > closes a gap in the language. Just like list comprehensions fit generator > expressions and vice versa. There is no "need" for being able to write > > [x**2 for x in seq] > {x**2 for x in seq} > > when you can equally well write > > list(x**2 for x in seq) > set(x**2 for x in seq) > > But I certainly wouldn't complain about that redundancy in the language. > OK, I must agree this point. [] and {} has good metaphor in math. We use [1, 2, 3,... ] for series, and {1, 2, 3, ...} for sets. > > > `new = d1.updated(d2)` or `new = dict.merge(d1, d2)` are enough. 
Python > > preferred name over symbol in general. Symbols are readable and > > understandable only when it has good math metaphor. > > > > Sets has symbol operator because it is well known in set in math, not > > because set is frequently used. > > > > In case of dict, there is no simple metaphor in math. > > So then, if "list+list" and "tuple+tuple" wasn't available through an > operator, would you also reject the idea of adding it, arguing that we > could use this: > > L = L1.extended(L2) > > I honestly do not see the math relation in concatenation via "+". > First of all, concatenating sequence (especially str) is extremely frequent than merging dict. My point is dict + dict is major abuse of + than seq + seq and it's usage is smaller than seq + seq. Let's describe why I think dict+dict is "major" abuse. As I said before, it's common to assign operator for concatenation in regular language, while middle-dot is used common. When the commonly-used operator is not in ASCII, other symbol can be used as alternative. We used | instead of ∪. In case of dict, it's not common to assign operator for merging in math, as far as I know. (Maybe, "direct sum" ⊕ is similar to it. But it doesn't allow intersection. So ValueError must be raised for duplicated key if we use "direct sum" for metaphor. But direct sum is higher-level math than "union" of set. I don't think it's good idea to use it as metaphor.) That's one of reasons I think seq + seq is "little" abuse and dict + dict is "major" abuse. Another reason is "throw some values away" doesn't fit mental model of "sum", as I said already in earlier mail. > But, given that "+" and "|" already have the meaning of "merging two > containers into one" in Python, I think it makes sense to allow that also > for dicts. > + is used for concatenate, it is more strict than just merge. If + is allowed for dict, set should support it too for consistency.
Then, meaning of "+ for container" become "sum up two containers in some way, defined by the container type." It's consistent. Kotlin uses + for this meaning. Scala uses ++ for this meaning. But this is a large design change of the language. Is this really required? I feel adding a method is enough. -- Inada Naoki From remi.lapeyre at henki.fr Wed Mar 6 04:50:38 2019 From: remi.lapeyre at henki.fr (Rémi Lapeyre) Date: Wed, 6 Mar 2019 01:50:38 -0800 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <8812767a-1010-0923-36d4-22af4d2665ec@brice.xyz> References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> <20190305161100.GE4465@ando.pearwood.info> <5C7EFAE6.8000105@canterbury.ac.nz> <8812767a-1010-0923-36d4-22af4d2665ec@brice.xyz> Message-ID: On 6 March 2019 at 10:26:15, Brice Parent (contact at brice.xyz(mailto:contact at brice.xyz)) wrote: > > On 05/03/2019 at 23:40, Greg Ewing wrote: > > Steven D'Aprano wrote: > >> The question is, is [recursive merge] behaviour useful enough and > > > common enough to be built into dict itself? > > > > I think not. It seems like just one possible way of merging > > values out of many. I think it would be better to provide > > a merge function or method that lets you specify a function > > for merging values. > > > That's what this conversation led me to. I'm not against the addition > for the most general usage (and current PEP's describes the behaviour I > would expect before reading the doc), but for all other more specific > usages, where we intend any special or not-so-common behaviour, I'd go > with modifying Dict.update like this: > > foo.update(bar, on_collision=updator) # Although I'm not a fan of the > keyword I used This won't be possible, update() already takes keyword arguments: >>> foo = {} >>> bar = {'a': 1} >>> foo.update(bar, on_collision=lambda e: e) >>> foo {'a': 1, 'on_collision': <function <lambda> at 0x10b8df598>} > `updator` being a simple function like this one: > > def updator(updated, updator, key) -> Any: > if key == "related": > return updated[key].update(updator[key]) > > if key == "tags": > return updated[key] + updator[key] > > if key in ["a", "b", "c"]: # Those > return updated[key] > > return updator[key] > > There's nothing here that couldn't be made today by using a custom > update function, but leaving the burden of checking for values that are > in both and actually inserting the new values to Python's language, and > keeping on our side only the parts that are specific to our use case, > makes in my opinion the code more readable, with fewer possible bugs and > possibly better optimization.
> > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From zestyping at gmail.com Wed Mar 6 05:29:27 2019 From: zestyping at gmail.com (Ka-Ping Yee) Date: Wed, 6 Mar 2019 02:29:27 -0800 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> <20190305161100.GE4465@ando.pearwood.info> <5C7EFAE6.8000105@canterbury.ac.nz> <8812767a-1010-0923-36d4-22af4d2665ec@brice.xyz> Message-ID: len(dict1 + dict2) does not equal len(dict1) + len(dict2), so using the + operator is nonsense. len(dict1 + dict2) cannot even be computed by any expression involving +. Using len() to test the semantics of the operation is not arbitrary; the fact that the sizes do not add is a defining quality of a merge. This is a merge, not an addition. The proper analogy is to sets, not lists. The operators should be |, &, and -, exactly as for sets, and the behaviour defined with just three rules: 1. The keys of dict1 [op] dict2 are the elements of dict1.keys() [op] dict2.keys(). 2. The values of dict2 take priority over the values of dict1. 3. When either operand is a set, it is treated as a dict whose values are None. This yields many useful operations and, most importantly, is simple to explain. "sets and dicts can |, &, -" takes up less space in your brain than "sets can |, &, - but dicts can only + and -, where dict + is like set |". 
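The three rules above are compact enough to sketch as plain functions. This is a hedged sketch of the *proposed* semantics for `|`, `&`, and `-` on dicts — none of it is an existing dict API, and the function names are made up for illustration:

```python
def _as_dict(operand):
    # Rule 3: a set operand is treated as a dict whose values are all None.
    if isinstance(operand, (set, frozenset)):
        return {key: None for key in operand}
    return dict(operand)

def dict_or(d1, d2):
    # Rule 1: result keys are d1.keys() | d2.keys().
    # Rule 2: where both operands have a key, d2's value wins.
    d1, d2 = _as_dict(d1), _as_dict(d2)
    result = dict(d1)
    result.update(d2)
    return result

def dict_and(d1, d2):
    # Rule 1: result keys are d1.keys() & d2.keys(); rule 2: d2's values win.
    d1, d2 = _as_dict(d1), _as_dict(d2)
    return {key: d2[key] for key in d1.keys() & d2.keys()}

def dict_sub(d1, d2):
    # Rule 1: result keys are d1.keys() - d2.keys(); values come from d1.
    d1, d2 = _as_dict(d1), _as_dict(d2)
    return {key: value for key, value in d1.items() if key not in d2}

assert dict_or({'a': 1, 'b': 2}, {'b': 3, 'c': 4}) == {'a': 1, 'b': 3, 'c': 4}
assert dict_and({'a': 1, 'b': 2}, {'b': 3, 'c': 4}) == {'b': 3}
assert dict_sub({'a': 1, 'b': 2}, {'b': 3, 'c': 4}) == {'a': 1}
assert dict_or({'a': 1, 'b': 2}, {'b', 'c'}) == {'a': 1, 'b': None, 'c': None}
```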
merge and update some items: {'a': 1, 'b': 2} | {'b': 3, 'c': 4} => {'a': 1, 'b': 3, 'c': 4} pick some items: {'a': 1, 'b': 2} & {'b': 3, 'c': 4} => {'b': 3} remove some items: {'a': 1, 'b': 2} - {'b': 3, 'c': 4} => {'a': 1} reset values of some keys: {'a': 1, 'b': 2} | {'b', 'c'} => {'a': 1, 'b': None, 'c': None} ensure certain keys are present: {'b', 'c'} | {'a': 1, 'b': 2} => {'a': 1, 'b': 2, 'c': None} pick some items: {'b', 'c'} & {'a': 1, 'b': 2} => {'b': 2} remove some items: {'a': 1, 'b': 2} - {'b', 'c'} => {'a': 1} On Wed, Mar 6, 2019 at 1:51 AM Rémi Lapeyre wrote: > On 6 March 2019 at 10:26:15, Brice Parent > (contact at brice.xyz(mailto:contact at brice.xyz)) wrote: > > > > > On 05/03/2019 at 23:40, Greg Ewing wrote: > > > Steven D'Aprano wrote: > > >> The question is, is [recursive merge] behaviour useful enough and > > > > common enough to be built into dict itself? > > > > > > I think not. It seems like just one possible way of merging > > > values out of many.
I think it would be better to provide > > > a merge function or method that lets you specify a function > > > for merging values. > > > > > That's what this conversation led me to. I'm not against the addition > > for the most general usage (and current PEP's describes the behaviour I > > would expect before reading the doc), but for all other more specific > > usages, where we intend any special or not-so-common behaviour, I'd go > > with modifying Dict.update like this: > > > > foo.update(bar, on_collision=updator) # Although I'm not a fan of the > > keyword I used > > This won't be possible, update() already takes keyword arguments: > > >>> foo = {} > >>> bar = {'a': 1} > >>> foo.update(bar, on_collision=lambda e: e) > >>> foo > {'a': 1, 'on_collision': <function <lambda> at 0x10b8df598>} > > > `updator` being a simple function like this one: > > > > def updator(updated, updator, key) -> Any: > > if key == "related": > > return updated[key].update(updator[key]) > > > > if key == "tags": > > return updated[key] + updator[key] > > > > if key in ["a", "b", "c"]: # Those > > return updated[key] > > > > return updator[key] > > > > There's nothing here that couldn't be made today by using a custom > > update function, but leaving the burden of checking for values that are > > in both and actually inserting the new values to Python's language, and > > keeping on our side only the parts that are specific to our use case, > > makes in my opinion the code more readable, with fewer possible bugs and > > possibly better optimization.
> > > > > > _______________________________________________ > > Python-ideas mailing list > > Python-ideas at python.org > > https://mail.python.org/mailman/listinfo/python-ideas > > Code of Conduct: http://python.org/psf/codeofconduct/ > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From rhodri at kynesim.co.uk Wed Mar 6 06:52:26 2019 From: rhodri at kynesim.co.uk (Rhodri James) Date: Wed, 6 Mar 2019 11:52:26 +0000 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> <20190305161100.GE4465@ando.pearwood.info> <5C7EFAE6.8000105@canterbury.ac.nz> <8812767a-1010-0923-36d4-22af4d2665ec@brice.xyz> Message-ID: <10b1cb61-3198-5234-284a-ddb4e2978525@kynesim.co.uk> On 06/03/2019 10:29, Ka-Ping Yee wrote: > len(dict1 + dict2) does not equal len(dict1) + len(dict2), so using the + > operator is nonsense. I'm sorry, but you're going to have to justify why this identity is important.
Making assumptions about length where any dictionary manipulations are concerned seems unwise to me, which makes a nonsense of your claim that this is nonsense :-) -- Rhodri James *-* Kynesim Ltd From contact at brice.xyz Wed Mar 6 06:54:11 2019 From: contact at brice.xyz (Brice Parent) Date: Wed, 6 Mar 2019 12:54:11 +0100 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> <20190305161100.GE4465@ando.pearwood.info> <5C7EFAE6.8000105@canterbury.ac.nz> <8812767a-1010-0923-36d4-22af4d2665ec@brice.xyz> Message-ID: <773bc98b-c9fb-02f4-422d-33b65f639c29@brice.xyz> On 06/03/2019 at 10:50, Rémi Lapeyre wrote: >> On 05/03/2019 at 23:40, Greg Ewing wrote: >>> Steven D'Aprano wrote: >>>> The question is, is [recursive merge] behaviour useful enough and >>>> common enough to be built into dict itself? >>> I think not. It seems like just one possible way of merging >>> values out of many. I think it would be better to provide >>> a merge function or method that lets you specify a function >>> for merging values. >>> >> That's what this conversation led me to. I'm not against the addition >> for the most general usage (and current PEP's describes the behaviour I >> would expect before reading the doc), but for all other more specific >> usages, where we intend any special or not-so-common behaviour, I'd go >> with modifying Dict.update like this: >> >> foo.update(bar, on_collision=updator) # Although I'm not a fan of the >> keyword I used > This won't be possible, update() already takes keyword arguments: > >>>> foo = {} >>>> bar = {'a': 1} >>>> foo.update(bar, on_collision=lambda e: e) >>>> foo > {'a': 1, 'on_collision': <function <lambda> at 0x10b8df598>} I don't see that as a problem at all.
Having a function's signature containing a **kwargs doesn't prevent having explicit keyword arguments at the same time: `def foo(bar="baz", **kwargs):` is perfectly valid, as well as `def spam(ham: Dict, eggs="blah", **kwargs):`, so `update(other, on_collision=None, **added)` is too, no? The major implication to such a modification of the Dict.update method, is that when you're using it with keyword arguments (as opposed to passing another dict/iterable as positional), you're making a small non-backward compatible change in that if in some code, someone was already using the keyword that would be chosen (here "on_collision"), their code would be broken by the new feature. I had never tried to pass a dict and kw arguments together, as it seemed to me that it wasn't supported (I would even have expected an exception to be raised), but it's probably my level of English that isn't high enough to get it right, or this part of the doc that doesn't describe well the full possible usage of the method (see here: https://docs.python.org/3/library/stdtypes.html#dict.update). Anyway, if the keyword is selected wisely, the collision case will almost never happen, and be quite easy to correct if it ever happened. From jfine2358 at gmail.com Wed Mar 6 07:17:40 2019 From: jfine2358 at gmail.com (Jonathan Fine) Date: Wed, 6 Mar 2019 12:17:40 +0000 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> <20190305161100.GE4465@ando.pearwood.info> <5C7EFAE6.8000105@canterbury.ac.nz> <8812767a-1010-0923-36d4-22af4d2665ec@brice.xyz> Message-ID: Ka-Ping Yee wrote: > > len(dict1 + dict2) does not equal len(dict1) + len(dict2), so using the + operator is nonsense. > > len(dict1 + dict2) cannot even be computed by any expression involving +.
Using len() to test the semantics of the operation is not arbitrary; the fact that the sizes do not add is a defining quality of a merge. This is a merge, not an addition. The proper analogy is to sets, not lists. For me, this comment is excellent. It neatly expresses the central concern about this proposal. I think most us will agree that the proposal is to use '+' to express a merge operation, namely update. (There are other merge operations, when there are two values to combine, such as taking the min or max of the two values.) Certainly, many of the posts quite naturally use the word merge. Indeed PEP 584 writes "This PEP suggests adding merge '+' and difference '-' operators to the built-in dict class." We would all agree that it would be obviously wrong to suggest adding merge '-' and difference '+' operators. (Note: I've swapped '+' and '-'.) And why? Because it is obviously wrong to use '-' to denote merge, etc. Some of us are also upset by the use of '+' to denote merge. By the way, there is already a widespread symbol for merge. It appears on many road signs. It looks like an upside down 'Y'. It even has merge left and merge right versions. Python already has operator symbols '+', '-', '*', '/' and so on. See https://docs.python.org/3/reference/lexical_analysis.html#operators Perhaps we should add a merge or update symbol to this list, so that we don't overload to breaking point the humble '+' operator. Although that would make Python a bit more like APL. By the way, Pandas already has a merge operation, called merge, that takes many parameters. I've only glanced at it. 
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.merge.html -- Jonathan From shadowranger+pythonideas at gmail.com Wed Mar 6 07:20:50 2019 From: shadowranger+pythonideas at gmail.com (Josh Rosenberg) Date: Wed, 6 Mar 2019 12:20:50 +0000 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <10b1cb61-3198-5234-284a-ddb4e2978525@kynesim.co.uk> References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> <20190305161100.GE4465@ando.pearwood.info> <5C7EFAE6.8000105@canterbury.ac.nz> <8812767a-1010-0923-36d4-22af4d2665ec@brice.xyz> <10b1cb61-3198-5234-284a-ddb4e2978525@kynesim.co.uk> Message-ID: On Wed, Mar 6, 2019 at 11:52 AM Rhodri James wrote: > On 06/03/2019 10:29, Ka-Ping Yee wrote: > > len(dict1 + dict2) does not equal len(dict1) + len(dict2), so using the + > > operator is nonsense. > > I'm sorry, but you're going to have to justify why this identity is > important. Making assumptions about length where any dictionary > manipulations are concerned seems unwise to me, which makes a nonsense > of your claim that this is nonsense :-) > It's not "nonsense" per se. If we were inventing programming languages in a vacuum, you could say + can mean "arbitrary combination operator" and it would be fine. But we're not in a vacuum; every major language that uses + with general purpose containers uses it to mean element-wise addition or concatenation, not just "merge". Concatenation is what imposes that identity (and all the others people are defending, like no loss of input values); you're taking a sequence of things, and shoving another sequence of things on the end of it, preserving order and all values. The argument here isn't that you *can't* make + do arbitrary merges that don't adhere to these semantics. 
It's that adding yet a third meaning to + (and it is a third meaning; it has no precedent in any existing type in Python, nor in any other major language; even in the minor languages that allow it, they use + for sets as well, so Python using + is making Python itself internally inconsistent with the operators used for set) is hard to justify, for limited benefit. - Josh Rosenberg From jfine2358 at gmail.com Wed Mar 6 07:31:13 2019 From: jfine2358 at gmail.com (Jonathan Fine) Date: Wed, 6 Mar 2019 12:31:13 +0000 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <10b1cb61-3198-5234-284a-ddb4e2978525@kynesim.co.uk> References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> <20190305161100.GE4465@ando.pearwood.info> <5C7EFAE6.8000105@canterbury.ac.nz> <8812767a-1010-0923-36d4-22af4d2665ec@brice.xyz> <10b1cb61-3198-5234-284a-ddb4e2978525@kynesim.co.uk> Message-ID: Rhodri James wrote: > Making assumptions about length where any dictionary > manipulations are concerned seems unwise to me I think you're a bit hasty here. Some assumptions are sensible. Suppose

    a = len(d1)
    b = len(d2)
    c = len(d1 + d2)  # Using the suggested syntax.

Then we know

    max(a, b) <= c <= a + b

And this is, in broad terms, characteristic of merge operations.
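The bound above is easy to check today, with the existing `{**d1, **d2}` spelling standing in for the proposed `d1 + d2`:

```python
d1 = {"a": 1, "b": 2}
d2 = {"b": 3, "c": 4, "d": 5}

merged = {**d1, **d2}  # stand-in for the proposed d1 + d2
a, b, c = len(d1), len(d2), len(merged)

# Every key of either operand survives exactly once, so the merged size
# is the size of the key-set union: at least max(a, b), at most a + b.
assert c == len(d1.keys() | d2.keys())
assert max(a, b) <= c <= a + b
```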
-- Jonathan From mertz at gnosis.cx Wed Mar 6 07:52:17 2019 From: mertz at gnosis.cx (David Mertz) Date: Wed, 6 Mar 2019 07:52:17 -0500 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> <20190305161100.GE4465@ando.pearwood.info> <5C7EFAE6.8000105@canterbury.ac.nz> <8812767a-1010-0923-36d4-22af4d2665ec@brice.xyz> Message-ID: I strongly agree with Ka-Ping. '+' is intuitively concatenation not merging. The behavior is overwhelmingly more similar to the '|' operator in sets (whether or not a user happens to know the historical implementation overlap). I think growing the full collection of set operations world be a pleasant addition to dicts. I think shoe-horning in plus would always be jarring to me. On Wed, Mar 6, 2019, 5:30 AM Ka-Ping Yee wrote: > len(dict1 + dict2) does not equal len(dict1) + len(dict2), so using the + > operator is nonsense. > > len(dict1 + dict2) cannot even be computed by any expression involving +. > Using len() to test the semantics of the operation is not arbitrary; the > fact that the sizes do not add is a defining quality of a merge. This is a > merge, not an addition. The proper analogy is to sets, not lists. > > The operators should be |, &, and -, exactly as for sets, and the > behaviour defined with just three rules: > > 1. The keys of dict1 [op] dict2 are the elements of dict1.keys() [op] > dict2.keys(). > > 2. The values of dict2 take priority over the values of dict1. > > 3. When either operand is a set, it is treated as a dict whose values are > None. > > This yields many useful operations and, most importantly, is simple to > explain. "sets and dicts can |, &, -" takes up less space in your brain > than "sets can |, &, - but dicts can only + and -, where dict + is like set > |". 
> > merge and update some items: > > {'a': 1, 'b': 2} | {'b': 3, 'c': 4} => {'a': 1, 'b': 3, 'c': 4} > > pick some items: > > {'a': 1, 'b': 2} & {'b': 3, 'c': 4} => {'b': 3} > > remove some items: > > {'a': 1, 'b': 2} - {'b': 3, 'c': 4} => {'a': 1} > > reset values of some keys: > > {'a': 1, 'b': 2} | {'b', 'c'} => {'a': 1, 'b': None, 'c': None} > > ensure certain keys are present: > > {'b', 'c'} | {'a': 1, 'b': 2} => {'a': 1, 'b': 2, 'c': None} > > pick some items: > > {'b', 'c'} | {'a': 1, 'b': 2} => {'b': 2} > > remove some items: > > {'a': 1, 'b': 2} - {'b', 'c'} => {'a': 1} > > On Wed, Mar 6, 2019 at 1:51 AM R?mi Lapeyre wrote: > >> Le 6 mars 2019 ? 10:26:15, Brice Parent >> (contact at brice.xyz(mailto:contact at brice.xyz)) a ?crit: >> >> > >> > Le 05/03/2019 ? 23:40, Greg Ewing a ?crit : >> > > Steven D'Aprano wrote: >> > >> The question is, is [recursive merge] behaviour useful enough and >> > > > common enough to be built into dict itself? >> > > >> > > I think not. It seems like just one possible way of merging >> > > values out of many. I think it would be better to provide >> > > a merge function or method that lets you specify a function >> > > for merging values. >> > > >> > That's what this conversation led me to. I'm not against the addition >> > for the most general usage (and current PEP's describes the behaviour I >> > would expect before reading the doc), but for all other more specific >> > usages, where we intend any special or not-so-common behaviour, I'd go >> > with modifying Dict.update like this: >> > >> > foo.update(bar, on_collision=updator) # Although I'm not a fan of the >> > keyword I used >> >> Le 6 mars 2019 ? 10:26:15, Brice Parent >> (contact at brice.xyz(mailto:contact at brice.xyz)) a ?crit: >> >> > >> > Le 05/03/2019 ? 23:40, Greg Ewing a ?crit : >> > > Steven D'Aprano wrote: >> > >> The question is, is [recursive merge] behaviour useful enough and >> > > > common enough to be built into dict itself? 
>> > > >> > > I think not. It seems like just one possible way of merging >> > > values out of many. I think it would be better to provide >> > > a merge function or method that lets you specify a function >> > > for merging values. >> > > >> > That's what this conversation led me to. I'm not against the addition >> > for the most general usage (and current PEP's describes the behaviour I >> > would expect before reading the doc), but for all other more specific >> > usages, where we intend any special or not-so-common behaviour, I'd go >> > with modifying Dict.update like this: >> > >> > foo.update(bar, on_collision=updator) # Although I'm not a fan of the >> > keyword I used >> >> This won?t be possible update() already takes keyword arguments: >> >> >>> foo = {} >> >>> bar = {'a': 1} >> >>> foo.update(bar, on_collision=lambda e: e) >> >>> foo >> {'a': 1, 'on_collision': at 0x10b8df598>} >> >> > `updator` being a simple function like this one: >> > >> > def updator(updated, updator, key) -> Any: >> > if key == "related": >> > return updated[key].update(updator[key]) >> > >> > if key == "tags": >> > return updated[key] + updator[key] >> > >> > if key in ["a", "b", "c"]: # Those >> > return updated[key] >> > >> > return updator[key] >> > >> > There's nothing here that couldn't be made today by using a custom >> > update function, but leaving the burden of checking for values that are >> > in both and actually inserting the new values to Python's language, and >> > keeping on our side only the parts that are specific to our use case, >> > makes in my opinion the code more readable, with fewer possible bugs and >> > possibly better optimization. 
>> > >> > >> > _______________________________________________ >> > Python-ideas mailing list >> > Python-ideas at python.org >> > https://mail.python.org/mailman/listinfo/python-ideas >> > Code of Conduct: http://python.org/psf/codeofconduct/ >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Wed Mar 6 07:53:47 2019 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 6 Mar 2019 23:53:47 +1100 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <773bc98b-c9fb-02f4-422d-33b65f639c29@brice.xyz> References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> <20190305161100.GE4465@ando.pearwood.info> <5C7EFAE6.8000105@canterbury.ac.nz> <8812767a-1010-0923-36d4-22af4d2665ec@brice.xyz> <773bc98b-c9fb-02f4-422d-33b65f639c29@brice.xyz> Message-ID: On Wed, Mar 6, 2019 at 11:18 PM Brice Parent wrote: > The major implication to such a > modification of the Dict.update method, is that when you're using it > with keyword arguments (by opposition to passing another dict/iterable > as positional), you're making a small non-backward compatible change in > that if in some code, someone was already using the keyword that would > be chosen (here "on_collision"), their code would be broken by the new > feature.
> Anyway, if > the keyword is selected wisely, the collision case will almost never > happen, and be quite easy to correct if it ever happened. You can make it unlikely, yes, but I'd dispute "easy to correct". Let's suppose that someone had indeed used the chosen keyword (and remember, the more descriptive the argument name, the more likely that it'll be useful elsewhere and therefore have a collision). How would they discover this? If they're really lucky, there MIGHT be an exception (if on_collision accepts only a handful of keywords, and the collision isn't one of them), but if your new feature is sufficiently flexible, that might not happen. There'll just be incorrect behaviour. As APIs go, using specific keyword args at the same time as **kw is a bit odd. Consider: button_options.update(button_info, on_click=frobnicate, style="KDE", on_collision="replace") It's definitely not obvious which of those will end up in the dictionary and which won't. Big -1 from me on that change. ChrisA From contact at brice.xyz Wed Mar 6 08:40:00 2019 From: contact at brice.xyz (Brice Parent) Date: Wed, 6 Mar 2019 14:40:00 +0100 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> <20190305161100.GE4465@ando.pearwood.info> <5C7EFAE6.8000105@canterbury.ac.nz> <8812767a-1010-0923-36d4-22af4d2665ec@brice.xyz> <773bc98b-c9fb-02f4-422d-33b65f639c29@brice.xyz> Message-ID: Le 06/03/2019 à
13:53, Chris Angelico a écrit : > On Wed, Mar 6, 2019 at 11:18 PM Brice Parent wrote: >> The major implication to such a >> modification of the Dict.update method, is that when you're using it >> with keyword arguments (by opposition to passing another dict/iterable >> as positional), you're making a small non-backward compatible change in >> that if in some code, someone was already using the keyword that would >> be chosen (here "on_collision"), their code would be broken by the new >> feature. >> Anyway, if >> the keyword is selected wisely, the collision case will almost never >> happen, and be quite easy to correct if it ever happened. > You can make it unlikely, yes, but I'd dispute "easy to correct". > Let's suppose that someone had indeed used the chosen keyword (and > remember, the more descriptive the argument name, the more likely that > it'll be useful elsewhere and therefore have a collision). How would > they discover this? If they're really lucky, there MIGHT be an > exception (if on_collision accepts only a handful of keywords, and the > collision isn't one of them), but if your new feature is sufficiently > flexible, that might not happen. There'll just be incorrect behaviour. > > As APIs go, using specific keyword args at the same time as **kw is a > bit odd. Consider: > > button_options.update(button_info, on_click=frobnicate, style="KDE", > on_collision="replace") > > It's definitely not obvious which of those will end up in the > dictionary and which won't. Big -1 from me on that change. That's indeed a good point. Even if the correction is quite easy to make in most cases.
With keyword-only changes: button_options.update(dict(on_click=frobnicate, style="KDE", on_collision="replace")) # or button_options.update(dict(on_collision="replace"), on_click=frobnicate, style="KDE") In the exact case you proposed, it could become a 2-liner: button_options.update(button_info) button_options.update(dict(on_click=frobnicate, style="KDE", on_collision="replace")) In my code, I would probably make it into 2 lines, to make clear that we have 2 levels of data merging, one that is general (the first), and one that is specific to this use-case (as it's hard-coded), but not everyone cares about the number of lines. But for the other part of your message, I 100% agree with you. The main problem with such a change is not (to me) that it can break some edge cases, but that it would potentially break them silently. And that, I agree, is worth a big -1 I guess. From michael.lee.0x2a at gmail.com Wed Mar 6 08:58:46 2019 From: michael.lee.0x2a at gmail.com (Michael Lee) Date: Wed, 6 Mar 2019 05:58:46 -0800 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> <20190305161100.GE4465@ando.pearwood.info> <5C7EFAE6.8000105@canterbury.ac.nz> <8812767a-1010-0923-36d4-22af4d2665ec@brice.xyz> Message-ID: > > I strongly agree with Ka-Ping. '+' is intuitively concatenation not > merging. The behavior is overwhelmingly more similar to the '|' operator in > sets (whether or not a user happens to know the historical implementation > overlap). I think the behavior proposed in the PEP makes sense whether you think of "+" as meaning "concatenation" or "merging".
If your instinct is to assume "+" means "concatenation", then it would be natural to assume that {"a": 1, "b": 2} + {"c": 3, "b": 4} would be identical to {"a": 1, "b": 2, "c": 3, "b": 4} -- literally concat the key-value pairs into a new dict. But of course, you can't have duplicate keys in Python. So, you would either recall or look up how duplicate keys are handled when constructing a dict and learn that the rule is that the right-most key wins. So the natural conclusion is that "+" would follow this existing rule -- and you end up with exactly the behavior described in the PEP. This also makes explaining the behavior of "d1 + d2" slightly easier than explaining "d1 | d2". For the former, you can just say "d1 + d2 means we concat the two dicts together" and stop there. You almost don't need to explain the merging/right-most key wins behavior at all, since that behavior is the only one consistent with the existing language rules. In contrast, you *would* need to explain this with "d1 | d2": I would mentally translate this expression to mean "take the union of these two dicts" and there's no real way to deduce which key-value pair ends up in the final dict given that framing. Why is it that key-value pairs in d2 win over pairs in d1 here? That choice seems pretty arbitrary when you think of this operation in terms of unions, rather than either concat or merge. Using "|" would also violate an important existing property of unions: the invariant "d1 | d2 == d2 | d1" is no longer true. As far as I'm aware, the union operation is always taken to be commutative in math, and so I think it's important that we preserve that property in Python. At the very least, I think it's far more important to preserve commutativity of unions than it is to preserve some of the invariants I've seen proposed above, like "len(d1 + d2) == len(d1) + len(d2)".
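Both halves of this argument can be checked with today's unpacking syntax. In the sketch below, `dict_add` is only a stand-in for the PEP's proposed operator, which exists in no released Python:

```python
# A sketch of the PEP's proposed dict "+", simulated with a plain
# function over today's {**d1, **d2} unpacking syntax.
def dict_add(d1, d2):
    """Concatenate two dicts; on duplicate keys, the right-most wins."""
    return {**d1, **d2}

d1 = {"a": 1, "b": 2}
d2 = {"c": 3, "b": 4}

# "Literal concatenation" and the right-most-wins rule agree:
assert dict_add(d1, d2) == {"a": 1, "b": 4, "c": 3}

# The operation is not commutative, unlike a set union:
assert dict_add(d2, d1) == {"a": 1, "b": 2, "c": 3}
assert dict_add(d1, d2) != dict_add(d2, d1)

# And len(d1 + d2) need not equal len(d1) + len(d2):
assert len(dict_add(d1, d2)) == 3 != len(d1) + len(d2)
```

Reversing the operands changes which value survives for "b", which is exactly the non-commutativity discussed above.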
Personally, I don't really have a strong opinion on this PEP, or the other one I've seen proposed where we add a "d1.merge(d2, d3, ...)". But I do know that I'm a strong -1 on adding set operations to dicts: it's not possible to preserve the existing semantics of union (and intersection) with dicts, and I think expressions like "d1 | d2" and "d1 & d2" would just be confusing and misleading to encounter in the wild. -- Michael On Wed, Mar 6, 2019 at 4:53 AM David Mertz wrote: > I strongly agree with Ka-Ping. '+' is intuitively concatenation not > merging. The behavior is overwhelmingly more similar to the '|' operator in > sets (whether or not a user happens to know the historical implementation > overlap). > > I think growing the full collection of set operations would be a pleasant > addition to dicts. I think shoe-horning in plus would always be jarring to > me. > > On Wed, Mar 6, 2019, 5:30 AM Ka-Ping Yee wrote: > >> len(dict1 + dict2) does not equal len(dict1) + len(dict2), so using the + >> operator is nonsense. >> >> len(dict1 + dict2) cannot even be computed by any expression >> involving +. Using len() to test the semantics of the operation is not >> arbitrary; the fact that the sizes do not add is a defining quality of a >> merge. This is a merge, not an addition. The proper analogy is to sets, >> not lists. >> >> The operators should be |, &, and -, exactly as for sets, and the >> behaviour defined with just three rules: >> >> 1. The keys of dict1 [op] dict2 are the elements of dict1.keys() [op] >> dict2.keys(). >> >> 2. The values of dict2 take priority over the values of dict1. >> >> 3. When either operand is a set, it is treated as a dict whose values are >> None. >> >> This yields many useful operations and, most importantly, is simple to >> explain. "sets and dicts can |, &, -" takes up less space in your brain >> than "sets can |, &, - but dicts can only + and -, where dict + is like set >> |".
>> >> merge and update some items: >> >> {'a': 1, 'b': 2} | {'b': 3, 'c': 4} => {'a': 1, 'b': 3, 'c': 4} >> >> pick some items: >> >> {'a': 1, 'b': 2} & {'b': 3, 'c': 4} => {'b': 3} >> >> remove some items: >> >> {'a': 1, 'b': 2} - {'b': 3, 'c': 4} => {'a': 1} >> >> reset values of some keys: >> >> {'a': 1, 'b': 2} | {'b', 'c'} => {'a': 1, 'b': None, 'c': None} >> >> ensure certain keys are present: >> >> {'b', 'c'} | {'a': 1, 'b': 2} => {'a': 1, 'b': 2, 'c': None} >> >> pick some items: >> >> {'b', 'c'} | {'a': 1, 'b': 2} => {'b': 2} >> >> remove some items: >> >> {'a': 1, 'b': 2} - {'b', 'c'} => {'a': 1} >> >> On Wed, Mar 6, 2019 at 1:51 AM Rémi Lapeyre >> wrote: >>> Le 6 mars 2019 à 10:26:15, Brice Parent >>> (contact at brice.xyz(mailto:contact at brice.xyz)) a écrit : >>> >>> > >>> > Le 05/03/2019 à 23:40, Greg Ewing a écrit : >>> > > Steven D'Aprano wrote: >>> > >> The question is, is [recursive merge] behaviour useful enough and >>> > > > common enough to be built into dict itself? >>> > > >>> > > I think not. It seems like just one possible way of merging >>> > > values out of many. I think it would be better to provide >>> > > a merge function or method that lets you specify a function >>> > > for merging values. >>> > > >>> > That's what this conversation led me to. I'm not against the addition >>> > for the most general usage (and the current PEP describes the behaviour I >>> > would expect before reading the doc), but for all other more specific >>> > usages, where we intend any special or not-so-common behaviour, I'd go >>> > with modifying Dict.update like this: >>> > >>> > foo.update(bar, on_collision=updator) # Although I'm not a fan of the >>> > keyword I used >>> >>> This won't be possible, update() already takes keyword arguments: >>> >>> >>> foo = {} >>> >>> bar = {'a': 1} >>> >>> foo.update(bar, on_collision=lambda e: e) >>> >>> foo >>> {'a': 1, 'on_collision': <function <lambda> at 0x10b8df598>} >>> >>> > `updator` being a simple function like this one: >>> > >>> > def updator(updated, updator, key) -> Any: >>> > if key == "related": >>> > return updated[key].update(updator[key]) >>> > >>> > if key == "tags": >>> > return updated[key] + updator[key] >>> > >>> > if key in ["a", "b", "c"]: # Those >>> > return updated[key] >>> > >>> > return updator[key] >>> > >>> > There's nothing here that couldn't be made today by using a custom >>> > update function, but leaving the burden of checking for values that are >>> > in both and actually inserting the new values to Python's language, and >>> > keeping on our side only the parts that are specific to our use case, >>> > makes in my opinion the code more readable, with fewer possible bugs >>> and >>> > possibly better optimization.
>>> > >>> > >>> > _______________________________________________ >>> > Python-ideas mailing list >>> > Python-ideas at python.org >>> > https://mail.python.org/mailman/listinfo/python-ideas >>> > Code of Conduct: http://python.org/psf/codeofconduct/ >>> _______________________________________________ >>> Python-ideas mailing list >>> Python-ideas at python.org >>> https://mail.python.org/mailman/listinfo/python-ideas >>> Code of Conduct: http://python.org/psf/codeofconduct/ >>> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Wed Mar 6 09:05:05 2019 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 7 Mar 2019 01:05:05 +1100 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> <20190305161100.GE4465@ando.pearwood.info> <5C7EFAE6.8000105@canterbury.ac.nz> <8812767a-1010-0923-36d4-22af4d2665ec@brice.xyz> Message-ID: On Thu, Mar 7, 2019 at 12:59 AM Michael Lee wrote: > If your instinct is to assume "+" means "concatenation", then it would be natural to assume that {"a": 1, "b": 2} + {"c": 3, "b": 4} would be identical to {"a": 1, "b": 2, "c": 3, "b": 4} -- literally concat the key-value pairs into a new dict. > > But of course, you can't have duplicate keys in Python. 
So, you would either recall or look up how duplicate keys are handled when constructing a dict and learn that the rule is that the right-most key wins. So the natural conclusion is that "+" would follow this existing rule -- and you end up with exactly the behavior described in the PEP. > Which, by the way, is also consistent with assignment: d = {}; d["a"] = 1; d["b"] = 2; d["c"] = 3; d["b"] = 4 Rightmost one wins. It's the most logical behaviour. ChrisA From jfine2358 at gmail.com Wed Mar 6 09:49:40 2019 From: jfine2358 at gmail.com (Jonathan Fine) Date: Wed, 6 Mar 2019 14:49:40 +0000 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> <20190305161100.GE4465@ando.pearwood.info> <5C7EFAE6.8000105@canterbury.ac.nz> <8812767a-1010-0923-36d4-22af4d2665ec@brice.xyz> Message-ID: Michael Lee wrote: > If your instinct is to assume "+" means "concatenation", then it would be natural to assume that {"a": 1, "b": 2} + {"c": 3, "b": 4} would be identical to {"a": 1, "b": 2, "c": 3, "b": 4} -- literally concat the key-value pairs into a new dict. > But of course, you can't have duplicate keys in Python. So, you would either recall or look up how duplicate keys are handled when constructing a dict and learn that the rule is that the right-most key wins. So the natural conclusion is that "+" would follow this existing rule -- and you end up with exactly the behavior described in the PEP. This is a nice argument. And well presented. And it gave me a surprise that taught me something. Here goes: >>> {'a': 0} {'a': 0} >>> {'a': 0, 'a': 0} {'a': 0} >>> {'a': 0, 'a': 1} {'a': 1} >>> {'a': 1, 'a': 0} {'a': 0} This surprised me quite a bit. I was expecting to get an exception.
However >>> dict(a=0) {'a': 0} >>> dict(a=0, a=0) SyntaxError: keyword argument repeated does give an exception. I wonder, is this behaviour of {'a': 0, 'a': 1} documented (or tested) anywhere? I didn't find it in these URLs: https://docs.python.org/3/library/stdtypes.html#mapping-types-dict https://docs.python.org/3/tutorial/datastructures.html#dictionaries I think this behaviour might give rise to gotchas. For example, if we define inverse_f by >>> inverse_f = { f(a): a, f(b): b } then is the next statement always true (assuming a != b)? >>> inverse_f[ f(a) ] == a Well, it's not true with these values >>> a, b = 1, 2 >>> def f(n): pass # There's a bug here, f(n) should be a bijection. A quick check that len(inverse_f) == 2 would provide a sanity check. Or perhaps better, len(inverse_f) == len({a, b}). (I don't have an example of this bug appearing 'in the wild'.) Once again, I thank Michael for his nice, instructive and well-presented example. -- Jonathan From contact at brice.xyz Wed Mar 6 11:16:41 2019 From: contact at brice.xyz (Brice Parent) Date: Wed, 6 Mar 2019 17:16:41 +0100 Subject: [Python-ideas] Suggestions: dict.flow_update and dict.__add__ In-Reply-To: References: Message-ID: Why not simply propose an external lib with FluentDict and other Fluent[Anything] already packaged? I don't know if I'd use it personally, but it definitely could have some users. Le 05/03/2019 à 09:48, Jonathan Fine a écrit : > SUMMARY > Instead of using dict + dict, perhaps use dict.flow_update. Here, > flow_update is just like update, except that it returns self. > > BACKGROUND > There's a difference between a sorted copy of a list, and sorting the > list in place. > > >>> items = [2, 0, 1, 9] > >>> sorted(items), items > ([0, 1, 2, 9], [2, 0, 1, 9]) > >>> items.sort(), items > (None, [0, 1, 2, 9]) > > In Python, mutating methods generally return None.
Here, this prevents > beginners thinking their code has produced a sorted copy of a list, > when in fact it has done an in-place sort on the list. If they write > >>> aaa = my_list.sort() > they'll get a None error when they use aaa. > > The same goes for dict.update. This is a useful feature, particularly > for beginners. It helps them think clearly, and express themselves > clearly. > > THE PROBLEM > This returning None can be a nuisance, sometimes. Suppose we have a > dictionary of default values, and a dictionary of user-supplied > options. We wish to combine the two dictionaries, say into a new > combined dictionary. > > One way to do this is: > > combined = defaults.copy() > combined.update(options) > > But this is awkward when you're in the middle of calling a function: > > call_big_method( > # lots of arguments, one to a line, with comments > arg = combined, # Look up to see what combined is. > # more arguments > ) > > USING + > There's a suggestion, that instead one extends Python so that this works: > arg = defaults + options # What does '+' mean here? > > USING flow_update > Here's another suggestion. Instead write: > dict_arg = defaults.copy().flow_update(options) # Is this clearer? > > IMPLEMENTATION > Here's an implementation, as a subclass of dict. > > class mydict(dict): > > def flow_update(self, *argv, **kwargs): > self.update(*argv, **kwargs) > return self > > def copy(self): > return self.__class__(self) > > A DIRTY HACK > Not tested, using an assignment expression. > dict_arg = (tmp := defaults.copy(), tmp.update(options))[0] > Not recommended.
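Jonathan's `mydict` sketch quoted above runs as posted. A minimal check of the chaining it enables (the `defaults`/`options` names follow his email; the values are made up for illustration):

```python
# Jonathan's sketch, reproduced so the chaining can be exercised:
# flow_update is update() that returns self, and copy() preserves
# the subclass so the result of the chain is still a mydict.
class mydict(dict):

    def flow_update(self, *argv, **kwargs):
        self.update(*argv, **kwargs)
        return self

    def copy(self):
        return self.__class__(self)

defaults = mydict({"colour": "red", "size": 10})
options = {"size": 12}

combined = defaults.copy().flow_update(options)
assert combined == {"colour": "red", "size": 12}
assert defaults == {"colour": "red", "size": 10}  # original untouched
assert isinstance(combined, mydict)  # copy() preserved the subclass
```

This gives the one-expression form `defaults.copy().flow_update(options)` that the email proposes as an alternative to the `+` operator.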
> From songofacandy at gmail.com Wed Mar 6 11:28:44 2019 From: songofacandy at gmail.com (Inada Naoki) Date: Thu, 7 Mar 2019 01:28:44 +0900 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> <20190305161100.GE4465@ando.pearwood.info> <5C7EFAE6.8000105@canterbury.ac.nz> <8812767a-1010-0923-36d4-22af4d2665ec@brice.xyz> Message-ID: On Wed, Mar 6, 2019 at 10:59 PM Michael Lee wrote: > > I think the behavior proposed in the PEP makes sense whether you think of "+" as meaning "concatenation" or "merging". > > If your instinct is to assume "+" means "concatenation", then it would be natural to assume that {"a": 1, "b": 2} + {"c": 3, "b": 4} would be identical to {"a": 1, "b": 2, "c": 3, "b": 4} -- literally concat the key-value pairs into a new dict. > Nice explanation. You reduced my opposition to `+` with "literally concat". Better example, {"a": 1, "b": 2} + {"c": 4, "b": 3} == {"a": 1, "b": 2, "c": 4, "b": 3} == {"a": 1, "b": 3, "c": 4} On the other hand, union of set is also "literally concat". If we use this "literally concat" metaphor, I still think set should have `+` as alias to `|` for consistency. > > Using "|" would also violate an important existing property of unions: the invariant "d1 | d2 == d2 | d1" is no longer true. As far as I'm aware, the union operation is always taken to be commutative in math, and so I think it's important that we preserve that property in Python. At the very least, I think it's far more important to preserve commutativity of unions than it is to preserve some of the invariants I've seen proposed above, like "len(d1 + d2) == len(d1) + len(d2)". > I think both rules are "rather a coincidence than a conscious decision". I think "|" keeps commutativity only because it is more minor than `+`.
An easy operator is abused more easily than a minor one. And I think even "coincidence" rules are important. They make understanding Python easy. People "discover" rules and consistency while learning the language. This is a matter of balance. There is no right answer. Someone *feels* rule A is more important than B. Someone else feels the opposite. > But I do know that I'm a strong -1 on adding set operations to dicts: it's not possible to preserve the existing semantics of union (and intersection) with dicts, and I think expressions like "d1 | d2" and "d1 & d2" would just be confusing and misleading to encounter in the wild. Hmm. The PEP proposed dict - dict, which is similar to set - set (difference). To me, {"a": 1, "b": 2} - {"b": 3} = {"a": 1} is more confusing than {"a": 1, "b": 2} - {"b"} = {"a": 1}. So I think borrowing some semantics from set is a good idea. Both `dict - set` and `dict & set` make sense to me. * `dict - set` can be used to remove private keys by "blacklist". * `dict & set` can be used to choose public keys by "whitelist". -- Inada Naoki From jfine2358 at gmail.com Wed Mar 6 11:58:30 2019 From: jfine2358 at gmail.com (Jonathan Fine) Date: Wed, 6 Mar 2019 16:58:30 +0000 Subject: [Python-ideas] dict literal allows duplicate keys In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> <20190305161100.GE4465@ando.pearwood.info> <5C7EFAE6.8000105@canterbury.ac.nz> <8812767a-1010-0923-36d4-22af4d2665ec@brice.xyz> Message-ID: SUMMARY: The outcome of a search for: python dict literal duplicate keys. No conclusions (so far). BACKGROUND In the thread "PEP: Dict addition and subtraction" I wrote > >>> {'a': 0, 'a': 1} > {'a': 1} > I wonder, is this behaviour of {'a': 0, 'a': 1} documented (or tested)
I didn't find it in these URLs: > https://docs.python.org/3/library/stdtypes.html#mapping-types-dict > https://docs.python.org/3/tutorial/datastructures.html#dictionaries LINKS I've since found some relevant URLs. [1] https://stackoverflow.com/questions/34539772/is-a-dict-literal-containing-repeated-keys-well-defined [2] https://help.semmle.com/wiki/display/PYTHON/Duplicate+key+in+dict+literal [3] https://bugs.python.org/issue26910 [4] https://bugs.python.org/issue16385 [5] https://realpython.com/python-dicts/ ANALYSIS [1] gives a reference to [6], which correctly states the behaviour of {'a':0, 'a':1}, although without giving an example. (Aside: Sometimes one example is worth 50 or more words.) [2] is from Semmle, who provide an automated code review tool, called LGTM. The page [2] appears to be part of the documentation for LGTM. This page provides a useful link to [7]. [3] is a re-opening of [4]. It was rapidly closed by David Murray, who recommended reopening the discussion on python-ideas. [4] was raised by Albert Ferras, based on his real-world experience. In particular, a configuration file that contains a long dict literal. This was closed by Benjamin Peterson, who said that raising an error was "out of the question for compatibility isssues". Given few use case and little support on python-ideas,Terry Ready supported the closure. Raymond Hettinger supported the closure. [5] is from RealPython, who provide online tutorials. This page contains the statement "a given key can appear in a dictionary only once. Duplicate keys are not allowed." Note that {'a': 0, 'a': 1} can reasonably be thought of as a dictionary with duplicate keys. NOTE As I recall SGML (this shows my age) allows multiple entity declarations, as in And as I recall, in SGML the first value "original" is the one that is in effect. This is what happens with the LaTeX command \providecommand. 
FURTHER LINKS [6] https://docs.python.org/3/reference/expressions.html#dictionary-displays [7] https://cwe.mitre.org/data/definitions/561.html # CWE-561: Dead Code -- Jonathan From pythonchb at gmail.com Wed Mar 6 12:01:39 2019 From: pythonchb at gmail.com (Christopher Barker) Date: Wed, 6 Mar 2019 09:01:39 -0800 Subject: [Python-ideas] Suggestions: dict.flow_update and dict.__add__ In-Reply-To: References: Message-ID: Do go read the recent thread about this - there is a lot there! Titled something like "fluent programming" Sorry - on a phone, kinda hard to check now. -CHB -- Christopher Barker, PhD Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython -------------- next part -------------- An HTML attachment was scrubbed... URL: From jfine2358 at gmail.com Wed Mar 6 12:07:40 2019 From: jfine2358 at gmail.com (Jonathan Fine) Date: Wed, 6 Mar 2019 17:07:40 +0000 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> <20190305161100.GE4465@ando.pearwood.info> <5C7EFAE6.8000105@canterbury.ac.nz> <8812767a-1010-0923-36d4-22af4d2665ec@brice.xyz> Message-ID: I wrote: > I wonder, is this behaviour of {'a': 0, 'a': 1} documented (or tested) > anywhere? I've answered my own question here: [Python-ideas] dict literal allows duplicate keys https://mail.python.org/pipermail/python-ideas/2019-March/055717.html Finally, Christopher Barker wrote: > Yes, and had already been brought up in this thread (I think by Guido). (Maybe not well documented, but certainly well understood and deliberate) Thank you for this, Christopher.
-- Jonathan From songofacandy at gmail.com Wed Mar 6 12:18:41 2019 From: songofacandy at gmail.com (Inada Naoki) Date: Thu, 7 Mar 2019 02:18:41 +0900 Subject: [Python-ideas] dict literal allows duplicate keys In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> <20190305161100.GE4465@ando.pearwood.info> <5C7EFAE6.8000105@canterbury.ac.nz> <8812767a-1010-0923-36d4-22af4d2665ec@brice.xyz> Message-ID: https://docs.python.org/3/reference/expressions.html#dictionary-displays > If a comma-separated sequence of key/datum pairs is given, they are evaluated from left to right to define the entries of the dictionary: each key object is used as a key into the dictionary to store the corresponding datum. This means that you can specify the same key multiple times in the key/datum list, and the final dictionary's value for that key will be the last one given. On Thu, Mar 7, 2019 at 2:09 AM Jonathan Fine wrote: > > SUMMARY: The outcome of a search for: python dict literal duplicate > keys. No conclusions (so far). > > BACKGROUND > In the thread "PEP: Dict addition and subtraction" I wrote > > > >>> {'a': 0, 'a': 1} > > {'a': 1} > > > I wonder, is this behaviour of {'a': 0, 'a': 1} documented (or tested) > > anywhere? I didn't find it in these URLs: > > https://docs.python.org/3/library/stdtypes.html#mapping-types-dict > > https://docs.python.org/3/tutorial/datastructures.html#dictionaries > > LINKS > I've since found some relevant URLs.
> > [1] https://stackoverflow.com/questions/34539772/is-a-dict-literal-containing-repeated-keys-well-defined > [2] https://help.semmle.com/wiki/display/PYTHON/Duplicate+key+in+dict+literal > [3] https://bugs.python.org/issue26910 > [4] https://bugs.python.org/issue16385 > [5] https://realpython.com/python-dicts/ > > ANALYSIS > [1] gives a reference to [6], which correctly states the behaviour of > {'a':0, 'a':1}, although without giving an example. (Aside: Sometimes > one example is worth 50 or more words.) > > [2] is from Semmle, who provide an automated code review tool, called > LGTM. The page [2] appears to be part of the documentation for LGTM. > This page provides a useful link to [7]. > > [3] is a re-opening of [4]. It was rapidly closed by David Murray, who > recommended reopening the discussion on python-ideas. > [4] was raised by Albert Ferras, based on his real-world experience. > In particular, a configuration file that contains a long dict literal. > This was closed by Benjamin Peterson, who said that raising an error > was "out of the question for compatibility issues". Given few use > cases and little support on python-ideas, Terry Reedy supported the > closure. Raymond Hettinger supported the closure. > > [5] is from RealPython, who provide online tutorials. This page > contains the statement "a given key can appear in a dictionary only > once. Duplicate keys are not allowed." Note that > {'a': 0, 'a': 1} > can reasonably be thought of as a dictionary with duplicate keys. > > NOTE > As I recall SGML (this shows my age) allows multiple entity declarations, as in > > > > And as I recall, in SGML the first value "original" is the one that is > in effect. This is what happens with the LaTeX command > \providecommand.
> > FURTHER LINKS > [6] https://docs.python.org/3/reference/expressions.html#dictionary-displays > [7] https://cwe.mitre.org/data/definitions/561.html # CWE-561: Dead Code > > -- > Jonathan > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -- Inada Naoki From jfine2358 at gmail.com Wed Mar 6 12:43:44 2019 From: jfine2358 at gmail.com (Jonathan Fine) Date: Wed, 6 Mar 2019 17:43:44 +0000 Subject: [Python-ideas] dict literal allows duplicate keys In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> <20190305161100.GE4465@ando.pearwood.info> <5C7EFAE6.8000105@canterbury.ac.nz> <8812767a-1010-0923-36d4-22af4d2665ec@brice.xyz> Message-ID: SUMMARY: Off-thread-topic comment on examples and words in documentation. Inada Naoki quoted (from doc.python ref [6] in my original post): > > If a comma-separated sequence of key/datum pairs is given, they are evaluated from left to right to define the entries of the dictionary: each key object is used as a key into the dictionary to store the corresponding datum. This means that you can specify the same key multiple times in the key/datum list, and the final dictionary's value for that key will be the last one given. Indeed. Although off-topic, I think >>> {'a': 0, 'a': 1} == {'a': 1} True is much better than "This means that you can specify the same key multiple times in the key/datum list, and the final dictionary's value for that key will be the last one given." By the way, today I think we'd say key/value pairs.
And I've read https://www.theguardian.com/guardian-observer-style-guide-d data takes a singular verb (like agenda), though strictly a plural; you come across datum, the singular of data, about as often as you hear about an agendum Oh, and "the final dictionary's value" should I think be "the dictionary's final value" or perhaps just "the dictionary's value" But now we're far from the thread topic. I'm happy to join in on a thread on improving documentation (by using simpler language and good examples). -- Jonathan From rhodri at kynesim.co.uk Wed Mar 6 13:12:15 2019 From: rhodri at kynesim.co.uk (Rhodri James) Date: Wed, 6 Mar 2019 18:12:15 +0000 Subject: [Python-ideas] OT: In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> <20190305161100.GE4465@ando.pearwood.info> <5C7EFAE6.8000105@canterbury.ac.nz> <8812767a-1010-0923-36d4-22af4d2665ec@brice.xyz> Message-ID: <7f66b841-b9ea-6188-1503-750a630cdad5@kynesim.co.uk> On 06/03/2019 17:43, Jonathan Fine wrote: > Indeed. Although off-topic, I think > >>>> {'a': 0, 'a': 1} == {'a': 1} > True > > is much better than "This means that you can specify the same key > multiple times in the key/datum list, and the final dictionary's value > for that key will be the last one given." I disagree. An example is an excellent thing, but the words are definitive and must be there.
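Rhodri's point that the words carry detail a single example cannot is itself easy to illustrate: the reference text also pins down *which key object* survives a collision, something `{'a': 0, 'a': 1} == {'a': 1}` does not show. A small sketch of that documented behaviour (plain CPython, nothing assumed beyond the quoted paragraph):

```python
# Values come from the *last* duplicate, but the *first* key object is kept,
# because each key is "used as a key into the dictionary" left to right.
d = {1: 'a', 1.0: 'b'}   # 1 == 1.0, so the two keys collide

assert d == {1: 'b'}                  # the last value wins...
assert type(next(iter(d))) is int     # ...but the surviving key is the int 1
```

This is exactly the kind of corner where the definitive wording earns its keep.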
-- Rhodri James *-* Kynesim Ltd From rhodri at kynesim.co.uk Wed Mar 6 13:13:23 2019 From: rhodri at kynesim.co.uk (Rhodri James) Date: Wed, 6 Mar 2019 18:13:23 +0000 Subject: [Python-ideas] OT: Dictionary display documentation In-Reply-To: <7f66b841-b9ea-6188-1503-750a630cdad5@kynesim.co.uk> References: <20190301162645.GM4465@ando.pearwood.info> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> <20190305161100.GE4465@ando.pearwood.info> <5C7EFAE6.8000105@canterbury.ac.nz> <8812767a-1010-0923-36d4-22af4d2665ec@brice.xyz> <7f66b841-b9ea-6188-1503-750a630cdad5@kynesim.co.uk> Message-ID: <69b1b846-a47c-bd3a-8d2a-11200fca8f67@kynesim.co.uk> On 06/03/2019 18:12, Rhodri James wrote: > On 06/03/2019 17:43, Jonathan Fine wrote: >> Indeed. Although off-topic, I think >> >>>>> {'a': 0, 'a': 1} == {'a': 1} >> True >> >> is much better than "This means that you can specify the same key >> multiple times in the key/datum list, and the final dictionary's value >> for that key will be the last one given." > > I disagree. An example is an excellent thing, but the words are > definitive and must be there. Sigh. I hit SEND before I finished changing the title. Sorry, folks. -- Rhodri James *-* Kynesim Ltd From michael.lee.0x2a at gmail.com Wed Mar 6 13:22:54 2019 From: michael.lee.0x2a at gmail.com (Michael Lee) Date: Wed, 6 Mar 2019 10:22:54 -0800 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> <20190305161100.GE4465@ando.pearwood.info> <5C7EFAE6.8000105@canterbury.ac.nz> <8812767a-1010-0923-36d4-22af4d2665ec@brice.xyz> Message-ID: > > If we use this "literally concat" metaphor, I still think set should have > `+` as alias to `|` for consistency. > I agree. I think "|" keeps commutativity only because it's minor than `+`.
> I suppose that's true, fair point. I guess I would be ok with | no longer always implying commutativity if we were repurposing it for some radically different purpose. But dicts and sets are similar enough that I think having them both use similar but ultimately different definitions of "|" is going to have non-zero cost, especially when reading or modifying future code that makes heavy use of both data structures. Maybe that cost is worth it. I'm personally not convinced, but I do think it should be taken into account. Hmm. The PEP proposed dict - dict, which is similar to set - set > (difference). > Now that you point it out, I think I also dislike `d1 - d2` for the same reasons I listed earlier: it's not consistent with set semantics. One other objection I overlooked is that the PEP currently requires both operands to be dicts when doing "d1 - d2". So doing {"a": 1, "b": 2, "c": 3} - ["a", "b"] is currently disallowed (though doing d1 -= ["a", "b"] is apparently ok). I can sympathize: allowing "d1 - some_iter" feels a little too magical to me. But it's unfortunately restrictive -- I suspect removing keys stored within a list or something would be just as common of a use-case if not more so than removing keys stored in another dict. I propose that we instead add methods like "d1.without_keys(...)" and "d1.remove_keys(...)" that can accept any iterable of keys. These two methods would replace "d1.__sub__(...)" and "d1.__isub__(...)" respectively. The exact method names and semantics could probably do with a little more bikeshedding, but I think this idea would remove a false symmetry between "d1 + d2" and "d1 - d2" that doesn't actually really exist while being more broadly useful. Or I guess we could just remove that restriction: "it feels too magical" isn't a great objection on my part. Either way, that part of the PEP could use some more refinement, I think.
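The two proposed methods are easy to prototype as plain functions, which may help with the bikeshedding. Note that `without_keys` and `remove_keys` are the hypothetical names from the message above, not existing dict API:

```python
def without_keys(d, keys):
    """Sketch of the proposed d1.without_keys(...): return a new dict
    omitting the given keys (any iterable, not just another dict)."""
    drop = set(keys)
    return {k: v for k, v in d.items() if k not in drop}

def remove_keys(d, keys):
    """Sketch of the proposed d1.remove_keys(...): remove keys in
    place, silently ignoring any that are absent."""
    for k in set(keys):
        d.pop(k, None)

print(without_keys({"a": 1, "b": 2, "c": 3}, ["a", "b"]))  # {'c': 3}
```

Either spelling accepts a list, a set, or another dict (iterating a dict yields its keys), which is the flexibility the restriction on `d1 - d2` gives up.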
-- Michael On Wed, Mar 6, 2019 at 8:29 AM Inada Naoki wrote: > On Wed, Mar 6, 2019 at 10:59 PM Michael Lee > wrote: > > > > > I think the behavior proposed in the PEP makes sense whether you think > of "+" as meaning "concatenation" or "merging". > > > > If your instinct is to assume "+" means "concatenation", then it would > be natural to assume that {"a": 1, "b": 2} + {"c": 3, "b": 4} would be > identical to {"a": 1, "b": 2, "c": 3, "b": 4} -- literally concat the > key-value pairs into a new dict. > > > > Nice explanation. You reduced my opposite to `+` by "literally concat". > Better example, {"a": 1, "b": 2} + {"c": 4, "b": 3} == {"a": 1, "b": > 2, "c": 4, "b": 3} == {"a": 1, "b": 3, "c": 4} > > On the other hand, union of set is also "literally concat". If we use > this "literally concat" metaphor, > I still think set should have `+` as alias to `|` for consistency. > > > > > Using "|" would also violate an important existing property of unions: > the invariant "d1 | d2 == d2 | d1" is no longer true. As far as I'm aware, > the union operation is always taken to be commutative in math, and so I > think it's important that we preserve that property in Python. At the very > least, I think it's far more important to preserve commutativity of unions > then it is to preserve some of the invariants I've seen proposed above, > like "len(d1 + d2) == len(d1) + len(d2)". > > > > I think both rule are "rather a coincidence than a conscious decision". > > I think "|" keeps commutativity only because it's minor than `+`. Easy > operator > is abused easily more than minor operator. > > And I think every "coincidence" rules are important. They makes > understanding Python easy. > Every people "discover" rules and consistency while learning language. > > This is a matter of balance. There are no right answer. Someone > *feel* rule A is important than B. > Someone feel opposite. 
> > > But I do know that I'm a strong -1 on adding set operations to dicts: > it's not possible to preserve the existing semantics of union (and > intersection) with dict and think expressions like "d1 | d2" and "d1 & d2" > would just be confusing and misleading to encounter in the wild. > > Hmm. The PEP proposed dict - dict, which is similar to set - set > (difference). > To me, {"a": 1, "b": 2} - {"b": 3} = {"a": 1} is confusing than {"a": > 1, "b": 2} - {"b"} = {"a": 1}. > > So I think borrow some semantics from set is good idea. > Both of `dict - set` and `dict & set` makes sense to me. > > * `dict - set` can be used to remove private keys by "blacklist". > * `dict & set` can be used to choose public keys by "whitelist". > > -- > Inada Naoki > -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Wed Mar 6 13:54:04 2019 From: guido at python.org (Guido van Rossum) Date: Wed, 6 Mar 2019 10:54:04 -0800 Subject: [Python-ideas] OT: about hasty posts from phones In-Reply-To: References: Message-ID: On Wed, Mar 6, 2019 at 9:12 AM Christopher Barker wrote: > [...] > Sorry - on a phone, kinda hard to check now. > A point of order: if you're away from a real keyboard/screen, maybe it's better to wait. The conversation isn't real-time, and you don't win points by answering first. We could all be reminded of the goal for StackOverflow: the intent there is to create a useful artifact. While it's not quite the same for python-ideas, in the end we're trying to get to insights and (tentative) decisions, which should be long-lasting. Fewer posts may be better. NOTE: I'm not picking on you specifically Chris! I see this a lot, and I do it myself too (and regularly regret it). -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed...
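For comparison, the blacklist/whitelist filtering Inada describes can already be spelled with comprehensions today; this is only a sketch of the intended semantics, not the PEP's proposed operators:

```python
d = {"name": "spam", "_internal": 42, "size": 3}
blacklist = {"_internal"}
whitelist = {"name", "size"}

# roughly what `dict - set` would mean: drop blacklisted keys
public = {k: v for k, v in d.items() if k not in blacklist}

# roughly what `dict & set` would mean: keep only whitelisted keys
chosen = {k: v for k, v in d.items() if k in whitelist}

print(public)  # {'name': 'spam', 'size': 3}
print(chosen)  # {'name': 'spam', 'size': 3}
```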
URL: From guido at python.org Wed Mar 6 13:58:48 2019 From: guido at python.org (Guido van Rossum) Date: Wed, 6 Mar 2019 10:58:48 -0800 Subject: [Python-ideas] dict literal allows duplicate keys In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> <20190305161100.GE4465@ando.pearwood.info> <5C7EFAE6.8000105@canterbury.ac.nz> <8812767a-1010-0923-36d4-22af4d2665ec@brice.xyz> Message-ID: Would it shut down this particular subthread if (as the language's designer, if not its BDFL) I declared that this was an explicit design decision that I made nearly 30 years ago? I should perhaps blog about the background of this decision, but it was quite a conscious one. There really is no point in thinking that this is an accident of implementation or could be changed. On Wed, Mar 6, 2019 at 9:10 AM Jonathan Fine wrote: > SUMMARY: The outcome of a search for: python dict literal duplicate > keys. No conclusions (so far). > > BACKGROUND > In the thread "PEP: Dict addition and subtraction" I wrote > > > >>> {'a': 0, 'a': 1} > > {'a': 1} > > > I wonder, is this behaviour of {'a': 0, 'a': 1} documented (or tested) > > anywhere? I didn't find it in these URLs: > > https://docs.python.org/3/library/stdtypes.html#mapping-types-dict > > https://docs.python.org/3/tutorial/datastructures.html#dictionaries > > LINKS > I've since found some relevant URLs. > > [1] > https://stackoverflow.com/questions/34539772/is-a-dict-literal-containing-repeated-keys-well-defined > [2] > https://help.semmle.com/wiki/display/PYTHON/Duplicate+key+in+dict+literal > [3] https://bugs.python.org/issue26910 > [4] https://bugs.python.org/issue16385 > [5] https://realpython.com/python-dicts/ > > ANALYSIS > [1] gives a reference to [6], which correctly states the behaviour of > {'a':0, 'a':1}, although without giving an example. 
(Aside: Sometimes > one example is worth 50 or more words.) > > [2] is from Semmle, who provide an automated code review tool, called > LGTM. The page [2] appears to be part of the documentation for LGTM. > This page provides a useful link to [7]. > > [3] is a re-opening of [4]. It was rapidly closed by David Murray, who > recommended reopening the discussion on python-ideas. > [4] was raised by Albert Ferras, based on his real-world experience. > In particular, a configuration file that contains a long dict literal. > This was closed by Benjamin Peterson, who said that raising an error > was "out of the question for compatibility issues". Given few use > cases and little support on python-ideas, Terry Reedy supported the > closure. Raymond Hettinger supported the closure. > > [5] is from RealPython, who provide online tutorials. This page > contains the statement "a given key can appear in a dictionary only > once. Duplicate keys are not allowed." Note that > {'a': 0, 'a': 1} > can reasonably be thought of as a dictionary with duplicate keys. > > NOTE > As I recall SGML (this shows my age) allows multiple entity declarations, > as in > > > > And as I recall, in SGML the first value "original" is the one that is > in effect. This is what happens with the LaTeX command > \providecommand. > > FURTHER LINKS > [6] > https://docs.python.org/3/reference/expressions.html#dictionary-displays > [7] https://cwe.mitre.org/data/definitions/561.html # CWE-561: Dead Code > > -- > Jonathan > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed...
URL: From arj.python at gmail.com Wed Mar 6 14:11:44 2019 From: arj.python at gmail.com (Abdur-Rahmaan Janhangeer) Date: Wed, 6 Mar 2019 23:11:44 +0400 Subject: [Python-ideas] OT: about hasty posts from phones In-Reply-To: References: Message-ID: Sorry about adding a few words here, i know you are all more 'advanced' programmers than me. I just wanted to ask the list to keep threads informative. Just today i decided to take the bulls by the horn and read the add dictionaries by using the + operator. Midway, i asked myself if i was getting value from reading each mail one by one. I don't think i can use the lists as a hands-on reference (was not meant to but could've been), it fits more as an nlp data sci project with data cleaning and all. I don't want an exact SO line though in the sense of absolutely no crap for beginners would be scared away. The talking of its members sometimes do keep things lively. But for technical posts i think we can keep it to technical points and avoid the like of >> if we do this some trajic things might happen > what do you mean by "trajic"? and so on. Those create diversions in the road to understanding a topic. For discussions posts like CoC update or whatever things off code, i think there can be a little leniency. I have only been using python since 3.4, i have a lot to catch up and learn from you all. yours, Abdur-Rahmaan Janhangeer http://www.pythonmembers.club | https://github.com/Abdur-rahmaanJ Mauritius -------------- next part -------------- An HTML attachment was scrubbed... URL: From mcepl at cepl.eu Wed Mar 6 15:12:52 2019 From: mcepl at cepl.eu (=?UTF-8?Q?Mat=C4=9Bj?= Cepl) Date: Wed, 06 Mar 2019 21:12:52 +0100 Subject: [Python-ideas] unittest: 0 tests pass means failure of the testsuite Message-ID: Hi, I am a lead maintainer of Python packages in OpenSUSE and I can see the pattern of many packagers adding blindly python setup.py test to %check section of our SPEC file.
The problem is that if the package doesn't use unittest (it actually uses nose, pytest or something), it could lead to zero found tests, which pass and Python returns exit code 0 (success) even though nothing has been tested. It seems from the outside that everything is all right, package is being tested on every build, but actually it is a lie. Would it be possible to change unittest runner, so that when 0 tests pass, whole test suite would end up failing? Thank you for considering this, Matěj -- https://matej.ceplovi.cz/blog/, Jabber: mcepl at ceplovi.cz GPG Finger: 3C76 A027 CA45 AD70 98B5 BC1D 7920 5802 880B C9D8 Never ascribe to malice that which is adequately explained by stupidity. -- Napoleon Bonaparte (or many other people to whom this quote is ascribed) -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part URL: From python at mrabarnett.plus.com Wed Mar 6 15:39:21 2019 From: python at mrabarnett.plus.com (MRAB) Date: Wed, 6 Mar 2019 20:39:21 +0000 Subject: [Python-ideas] unittest: 0 tests pass means failure of the testsuite In-Reply-To: References: Message-ID: <06ac2362-c034-4d31-4f79-5d259d45e7c7@mrabarnett.plus.com> On 2019-03-06 20:12, Matěj Cepl wrote: > Hi, > > I am a lead maintainer of Python packages in OpenSUSE and I can > see the pattern of many packagers adding blindly > > python setup.py test > > to %check section of our SPEC file. The problem is that if the > package doesn't use unittest (it actually uses nose, pytest or > something), it could lead to zero found tests, which pass and > Python returns exit code 0 (success) even though nothing has been > tested. It seems from the outside that everything is all right, > package is being tested on every build, but actually it is a lie. > > Would it be possible to change unittest runner, so that when 0 > tests pass, whole test suite would end up failing?
> > Thank you for considering this, > Strictly speaking, it's not a lie because none of the tests have failed. From jfine2358 at gmail.com Wed Mar 6 15:48:14 2019 From: jfine2358 at gmail.com (Jonathan Fine) Date: Wed, 6 Mar 2019 20:48:14 +0000 Subject: [Python-ideas] dict literal allows duplicate keys In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> <20190305161100.GE4465@ando.pearwood.info> <5C7EFAE6.8000105@canterbury.ac.nz> <8812767a-1010-0923-36d4-22af4d2665ec@brice.xyz> Message-ID: Hi Guido You wrote: > Would it shut down this particular subthread if (as the language's designer, if not its BDFL) I declared that this was an explicit design decision that I made nearly 30 years ago? I should perhaps blog about the background of this decision, but it was quite a conscious one. There really is no point in thinking that this is an accident of implementation or could be changed. Thank you for sharing this with us. I'd be fascinated to hear about the background to this conscious decision, and I think it would help me and others understand better what makes Python what it is. And it might help persuade me that my surprise at {'a': 0, 'a': 1} is misplaced, or at least exaggerated and one-sided. Do you want menial help writing the blog? Perhaps if you share your recollections, others will find the traces in the source code. For example, I've found the first dictobject.c, dating back to 1994. https://github.com/python/cpython/blob/956640880da20c20d5320477a0dcaf2026bd9426/Objects/dictobject.c I'm a great fan of your Python conversation (with Biancuzzi and Warden) in http://shop.oreilly.com/product/9780596515171.do # Masterminds of Programming I've read this article several times, and have wished that it was more widely available. 
My personal view is that putting a copy of this article in docs.python.org would provide more benefit to the community than you blogging on why dict literals allow duplicate keys. However, it need not be either/or. Perhaps someone could ask the PSF to talk with O'Reilly about getting copyright clearance to do this. Finally, some personal remarks. I've got a long training as a pure mathematician. Consistency and the application of simple basic principles are important to me. And also the discovery of basic principles. In your interview, you say (paraphrased and without context) that most Python code is written simply to get a job done. And that pragmatism, rather than being hung up on theoretical concepts, is the fundamental quality in being proficient in developing with Python. Thank you for inventing Python, and designing the language. It's a language popular both with pure mathematicians, and also pragmatic people who want to get things done. That's quite an achievement, which has drawn people like me into your community. with best regards Jonathan From guido at python.org Wed Mar 6 15:59:17 2019 From: guido at python.org (Guido van Rossum) Date: Wed, 6 Mar 2019 12:59:17 -0800 Subject: [Python-ideas] unittest: 0 tests pass means failure of the testsuite In-Reply-To: References: Message-ID: I would just file a bug and add a PR. On Wed, Mar 6, 2019 at 12:14 PM Matěj Cepl wrote: > Hi, > > I am a lead maintainer of Python packages in OpenSUSE and I can > see the pattern of many packagers adding blindly > > python setup.py test > > to %check section of our SPEC file. The problem is that if the > package doesn't use unittest (it actually uses nose, pytest or > something), it could lead to zero found tests, which pass and > Python returns exit code 0 (success) even though nothing has been > tested. It seems from the outside that everything is all right, > package is being tested on every build, but actually it is a lie.
> > Would it be possible to change unittest runner, so that when 0 > tests pass, whole test suite would end up failing? > > Thank you for considering this, > > Matěj > > -- > https://matej.ceplovi.cz/blog/, Jabber: mcepl at ceplovi.cz > GPG Finger: 3C76 A027 CA45 AD70 98B5 BC1D 7920 5802 880B C9D8 > > Never ascribe to malice that which is adequately explained by > stupidity. > -- Napoleon Bonaparte (or many other people to whom this > quote is ascribed) > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg.ewing at canterbury.ac.nz Wed Mar 6 17:30:32 2019 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 07 Mar 2019 11:30:32 +1300 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> <20190305161100.GE4465@ando.pearwood.info> <5C7EFAE6.8000105@canterbury.ac.nz> <8812767a-1010-0923-36d4-22af4d2665ec@brice.xyz> Message-ID: <5C804A08.9060902@canterbury.ac.nz> Ka-Ping Yee wrote: > len(dict1 + dict2) does not equal len(dict1) + len(dict2), so using the > + operator is nonsense. You might as well say that using the + operator on vectors is nonsense, because len(v1 + v2) is not in general equal to len(v1) + len(v2). Yet mathematicians are quite happy to talk about "addition" of vectors.
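Greg's distinction can be made concrete: under concatenation lengths add, under vector-style addition the length is fixed, and a dict merge follows neither rule. A small illustration, with tuples standing in for vectors:

```python
# Concatenation: lengths add.
assert len([1, 2] + [3]) == len([1, 2]) + len([3])

# Vector addition: element-wise, length unchanged.
v1, v2 = (1, 2), (3, 4)
vsum = tuple(a + b for a, b in zip(v1, v2))  # (4, 6)
assert len(vsum) == len(v1) == 2

# A dict merge is neither: its length depends on how many keys collide.
assert len({**{"a": 1}, **{"a": 2, "b": 3}}) == 2
```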
-- Greg From shadowranger+pythonideas at gmail.com Wed Mar 6 18:51:13 2019 From: shadowranger+pythonideas at gmail.com (Josh Rosenberg) Date: Wed, 6 Mar 2019 23:51:13 +0000 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <5C804A08.9060902@canterbury.ac.nz> References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> <20190305161100.GE4465@ando.pearwood.info> <5C7EFAE6.8000105@canterbury.ac.nz> <8812767a-1010-0923-36d4-22af4d2665ec@brice.xyz> <5C804A08.9060902@canterbury.ac.nz> Message-ID: On Wed, Mar 6, 2019 at 10:31 PM Greg Ewing wrote: > > You might as well say that using the + operator on vectors is > nonsense, because len(v1 + v2) is not in general equal to > len(v1) + len(v2). > > Yet mathematicians are quite happy to talk about "addition" > of vectors. > > Vector addition is *actual* addition, not concatenation. You're so busy loosening the definition of + as relates to , to make it make sense for dicts that you've forgotten that + is, first and foremost, about addition in the mathematical sense, where vector addition is just one type of addition. Concatenation is already a minor abuse of +, but one commonly accepted by programmers, thanks to it having some similarities to addition and a single, unambiguous set of semantics to avoid confusion. You're defending + on dicts because vector addition isn't concatenation already, which only shows how muddled things get when you try to use + to mean multiple concepts that are at best loosely related. The closest I can come to a thorough definition of what + does in Python (and most languages) right now is that: 1. Returns a new thing of the same type (or a shared coerced type for number weirdness) 2. That combines the information of the input operands 3. Is associative ((a + b) + c produces the same thing as a + (b + c)) (modulo floating point weirdness) 4.
Is "reversible": Knowing the end result and *one* of the inputs is sufficient to determine the value of the other input; that is, for c = a + b, knowing any two of a, b and c allows you to determine a single unambiguous value for the remaining value (numeric coercion and floating point weirdness make this not 100%, but you can at least know a value equal to other value; e.g. for c = a + b, knowing c is 5.0 and a is 1.0 is sufficient to say that b is equal to 4, even if it's not necessarily an int or float). For numbers, reversal is done with -; for sequences, it's done by slicing c using the length of a or b to "subtract" the elements that came from a/b. 5. (Actual addition only) Is commutative (modulo floating point weirdness); a + b == b + a 6. (Concatenation only) Is order preserving (really a natural consequence of #4, but a property that people expect) Note that these rules are consistent across most major languages that allow + to mean combine collections (the few that disagree, like Pascal, don't support | as a union operator). Concatenation is missing element #5, but otherwise aligns with actual addition. dict merges (and set unions for that matter) violate #4 and #6; for c = a + b, knowing c and either a or b still leaves a literally infinite set of possible inputs for the other input (it's not infinite for sets, where the options would be a subset of the result, but for dicts, there would be no such limitation; keys from b could exist with any possible value in a). dicts order preserving aspect *almost* satisfies #6, but not quite (if 'x' comes after 'y' in b, there is no guarantee that it will do so in c, because a gets first say on ordering, and b gets the final word on value). Allowing dicts to get involved in + means: 1. Fewer consistent rules apply to +; 2. The particular idiosyncrasies of Python dict ordering and "which value wins" rules are now tied to +. 
for concatenation, there is only one set of possible rules AFAICT so every language naturally agrees on behavior, but dict merging obviously has many possible rules that would be unlikely to match the exact rules of any other language except by coincidence). a winning on order and b winning on value is a historical artifact of how Python's dict developed; I doubt any other language would intentionally choose to split responsibility like that if they weren't handcuffed by history. Again, there's nothing wrong with making dict merges easier. But it shouldn't be done by (further) abusing +. -Josh Rosenberg -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Wed Mar 6 18:58:02 2019 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 7 Mar 2019 10:58:02 +1100 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> <20190305161100.GE4465@ando.pearwood.info> <5C7EFAE6.8000105@canterbury.ac.nz> <8812767a-1010-0923-36d4-22af4d2665ec@brice.xyz> <5C804A08.9060902@canterbury.ac.nz> Message-ID: On Thu, Mar 7, 2019 at 10:52 AM Josh Rosenberg wrote: > The closest I can come to a thorough definition of what + does in Python (and most languages) right now is that: > > 1. Returns a new thing of the same type (or a shared coerced type for number weirdness) > 2. That combines the information of the input operands > 3. Is associative ((a + b) + c produces the same thing as a + (b + c)) (modulo floating point weirdness) > 4. 
Is "reversible": Knowing the end result and *one* of the inputs is sufficient to determine the value of the other input; that is, for c = a + b, knowing any two of a, b and c allows you to determine a single unambiguous value for the remaining value (numeric coercion and floating point weirdness make this not 100%, but you can at least know a value equal to other value; e.g. for c = a + b, knowing c is 5.0 and a is 1.0 is sufficient to say that b is equal to 4, even if it's not necessarily an int or float). For numbers, reversal is done with -; for sequences, it's done by slicing c using the length of a or b to "subtract" the elements that came from a/b. > 5. (Actual addition only) Is commutative (modulo floating point weirdness); a + b == b + a > 6. (Concatenation only) Is order preserving (really a natural consequence of #4, but a property that people expect) > > Allowing dicts to get involved in + means: > > 1. Fewer consistent rules apply to +; > 2. The particular idiosyncrasies of Python dict ordering and "which value wins" rules are now tied to +. for concatenation, there is only one set of possible rules AFAICT so every language naturally agrees on behavior, but dict merging obviously has many possible rules that would be unlikely to match the exact rules of any other language except by coincidence). a winning on order and b winning on value is a historical artifact of how Python's dict developed; I doubt any other language would intentionally choose to split responsibility like that if they weren't handcuffed by history. > > Again, there's nothing wrong with making dict merges easier. But it shouldn't be done by (further) abusing +. Lots of words that basically say: Stuff wouldn't be perfectly pure. But adding dictionaries is fundamentally *useful*. It is expressive. It will, in pretty much all situations, do exactly what someone would expect, based on knowledge of how Python works in other areas. 
The semantics for edge cases have to be clearly defined, but they'll only come into play on rare occasions; most of the time, for instance, we don't have to worry about identity vs equality in dictionary keys. If you tell people "adding two dictionaries combines them, with the right operand winning collisions", it won't matter that this isn't how lists or floats work; it'll be incredibly useful as it is. Practicality. Let's have some. ChrisA From pylang3 at gmail.com Wed Mar 6 19:37:29 2019 From: pylang3 at gmail.com (pylang) Date: Wed, 6 Mar 2019 19:37:29 -0500 Subject: [Python-ideas] Allow creation of polymorph function (async function executable syncronously) In-Reply-To: References: Message-ID: Jorropo states: Polymorph function work exacly like async function BUT they assure of the > ability to execute syncronously. - sic > Async functions can call sync functions, but not vice versa. Consider a third party solution - trio , that allows sync functions to call async functions. >From the docs : async def async_double(x): return 2 * x trio.run(async_double, 3) # returns 6 --- If you want a stdlib solution, let's revisit Nathaniel Smith's example: > def maybe_async(fn): > @functools.wraps(fn) > def wrapper(*args, **kwargs): > coro = fn(*args, **kwargs) > if asyncio.get_running_loop() is not None: > return coro > else: > return await coro > I was unable to run his example as-is (in Python 3.6 at least) since the `await` keyword is only permitted inside an `async def` function. However, the idea is intriguing and can be adapted. See the example below. 
Code:

import asyncio
import functools
import time

def maybe_async(fn):
    async def _process(fn, *args, **kwargs):
        coro_fn = fn(*args, **kwargs)
        if asyncio.iscoroutinefunction(fn):
            return await coro_fn
        else:
            return coro_fn

    @functools.wraps(fn)
    def wrapper(*args, **kwarg):
        loop = asyncio.get_event_loop()
        res = loop.run_until_complete(_process(fn, *args, **kwarg))
        return res
    return wrapper

Demo:

@maybe_async
async def agreet(delay):
    print("hello")
    await asyncio.sleep(delay)
    print("world")

@maybe_async
def greet(delay):
    print("hello")
    time.sleep(delay)
    print("world")

agreet(2)  # prints hello world after 2 seconds
greet(1)   # prints hello world after 1 second

Now you can call either sync or async functions like regular functions. Hope this helps.

---

On Wed, Mar 6, 2019 at 12:54 AM Nathaniel Smith wrote:
> Defining a single polymorphic function is easy at the library level.
> For example, with asyncio:
>
> ----
>
> def maybe_async(fn):
>     @functools.wraps(fn)
>     def wrapper(*args, **kwargs):
>         coro = fn(*args, **kwargs)
>         if asyncio.get_running_loop() is not None:
>             return coro
>         else:
>             return await coro
>
> @maybe_async
> async def my_func(...):
>     ... use asyncio freely in here ...
>
> ----
>
> You can't do it at the language level though (e.g. with your proposed
> 'polymorph' keyword), because the language doesn't know whether an
> event loop is running or not.
>
> Extending this from a single function to a whole library API is
> substantially more complex, because you have to wrap every function
> and method, deal with __iter__ versus __aiter__, etc.
>
> -n
>
> On Tue, Mar 5, 2019 at 8:02 PM Jorropo . wrote:
> >
> > I was doing some async networking and I wondered, why I have to use 2 different api for making the same things in async or sync regime.
> > Even if we make 2 perfectly identical api (except function will be sync and async), it will still 2 different code).
> > > > So I first thinked to allow await in syncronous function but that create > some problems (ex: an async function calling async.create_task) so if we > allow that we have to asume to allways be in async regime (like js). > > > > Or we can differentiate async function wich can be awaited in syncronous > regime, maybe with a new keyword (here I will use polymorph due to a lack > of imagination but I find that one too long) ? > > > > So a polymorph function can be awaited in a syncronous function, and a > polymorph function can only await polymorph functions. > > > > Polymorph function work exacly like async function BUT they assure of > the ability to execute syncronously. > > And in a syncronous regime if an await need to wait (like async.sleep or > network operation), just wait (like the equivalent of this function in > syncronous way). > > > > So why made that ? > > To provide the same api for async and sync regime when its possible, > example http api. > > This allow to code less librairy. > > Syncronous users can just use the librairy like any other sync lib (with > the keyword await for executing but, personally, I think that is worth). > > And asyncronous users can run multiples tasks using the same lib. > > Moving from a regime to an other is simpler, source code size is reduced > (doesn't need to create 2 api for the same lib), gain of time for the same > reason. > > > > Also why it need to await in syncronous function, why not just execute > polymorph function like any sync function while called in a sync function ? > > Because we need to create runnable objects for async.run, ... > > > > So I would have your though about that, what can be improved, a better > name for polymorph ? > > _______________________________________________ > > Python-ideas mailing list > > Python-ideas at python.org > > https://mail.python.org/mailman/listinfo/python-ideas > > Code of Conduct: http://python.org/psf/codeofconduct/ > > > > -- > Nathaniel J. 
Smith -- https://vorpus.org > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Wed Mar 6 21:17:43 2019 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 6 Mar 2019 18:17:43 -0800 Subject: [Python-ideas] Allow creation of polymorph function (async function executable syncronously) In-Reply-To: References: Message-ID: On Wed, Mar 6, 2019 at 4:37 PM pylang wrote:
>> def maybe_async(fn):
>>     @functools.wraps(fn)
>>     def wrapper(*args, **kwargs):
>>         coro = fn(*args, **kwargs)
>>         if asyncio.get_running_loop() is not None:
>>             return coro
>>         else:
>>             return await coro
>
> I was unable to run his example as-is (in Python 3.6 at least) since the `await` keyword is only permitted inside an `async def` function.

Oh yeah, that was a brain fart. I meant to write:

def maybe_async(fn):
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        coro = fn(*args, **kwargs)
        if asyncio.get_running_loop() is not None:
            return coro
        else:
            return asyncio.run(coro)

-n -- Nathaniel J. Smith -- https://vorpus.org From tjreedy at udel.edu Wed Mar 6 21:55:21 2019 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 6 Mar 2019 21:55:21 -0500 Subject: [Python-ideas] unittest: 0 tests pass means failure of the testsuite In-Reply-To: References: Message-ID: On 3/6/2019 3:12 PM, Matěj Cepl wrote: > Hi, > > I am a lead maintainer of Python packages in OpenSUSE and I can > see the pattern of many packagers adding blindly > > python setup.py test > > to %check section of our SPEC file. I am not familiar with setup.py, so I don't know how this affects the presence and contents of any particular files.
> The problem is that if the > package doesn't use unittest (it actually uses nose, pytest or > something), it could lead to zero found tests, Hence I don't know how unittest might be invoked in the situation you describe nor what output you see and whether you mean 0 test file(s) found or 0 test methods found or 0 lines of test code executed. > which pass and > Python returns exit code 0 (success) even though nothing has been > tested. 0 test methods does not mean 0 code executed in the tested module. Here is a possible minimal test file test_mod that is better than nothing.

import mod
import unittest

class MinTest(unittest.TestCase):
    def setUp(self):
        self.instance = mod.MainClass()

> It seems from the outside that everything is all right, > package is being tested on every build, but actually it is lie. Unless a test covers 100% of both lines *and* logic, 'success' never means 'everything is all right'. > Would it be possible to change unittest runner, so that when 0 > tests pass, whole test suite would end up failing? Yes, but unless a change were very narrow, and only affected the particular situation presented, it would be a bad idea. The unittest system is premised on 'success' rather than 'failure' being the default. 1. A test file may do better-than-nothing testing without running a test method. See above. Calling a minimal pass a 'fail' would be wrong. 2. A test file should skip everything when running on a system that cannot run the tests. Several stdlib modules are OS-specific; their test modules skip all tests on some OS. There is no OS that can run every file in the Python test suite. Skipped test modules must not fail the test suite. IDLE and tkinter require graphics hardware and are then optional. IDLE depends on idlelib and tkinter. Tkinter depends on _tkinter and tcl/tk. Tk depends on having a graphic system, which servers and, in particular, *nix buildbots, generally lack.
Again, skipped IDLE and tkinter test.test_x files must not fail a test suite. I agree that labeling the result of running a single test file can be problematical. The following could be either a 'SUCCESS' or 'FAIL', depending on what one wanted and expected. So one should read the detail and judge for oneself.

0:00:00 [1/1] test_idle
test_idle skipped -- No module named 'idlelib'  # or tkinter or ...
test_idle skipped

== Tests result: SUCCESS ==

1 test skipped:
    test_idle

Total duration: 109 ms
Tests result: SUCCESS

Unittest effectively assumes the context 'test file in test suite'. -- Terry Jan Reedy From wes.turner at gmail.com Wed Mar 6 22:15:09 2019 From: wes.turner at gmail.com (Wes Turner) Date: Wed, 6 Mar 2019 22:15:09 -0500 Subject: [Python-ideas] Allow creation of polymorph function (async function executable syncronously) In-Reply-To: References: Message-ID: Here's syncer/syncer.py: https://github.com/miyakogi/syncer/blob/master/syncer.py I think the singledispatch is pretty cool.
```python
#!/usr/bin/env python3
# -*- coding: utf-8 -*-

import sys
from functools import singledispatch, wraps
import asyncio
import inspect
import types
from typing import Any, Callable, Generator

PY35 = sys.version_info >= (3, 5)


def _is_awaitable(co: Generator[Any, None, Any]) -> bool:
    if PY35:
        return inspect.isawaitable(co)
    else:
        return (isinstance(co, types.GeneratorType) or
                isinstance(co, asyncio.Future))


@singledispatch
def sync(co: Any):
    raise TypeError('Called with unsupported argument: {}'.format(co))


@sync.register(asyncio.Future)
@sync.register(types.GeneratorType)
def sync_co(co: Generator[Any, None, Any]) -> Any:
    if not _is_awaitable(co):
        raise TypeError('Called with unsupported argument: {}'.format(co))
    return asyncio.get_event_loop().run_until_complete(co)


@sync.register(types.FunctionType)
@sync.register(types.MethodType)
def sync_fu(f: Callable[..., Any]) -> Callable[..., Any]:
    if not asyncio.iscoroutinefunction(f):
        raise TypeError('Called with unsupported argument: {}'.format(f))

    @wraps(f)
    def run(*args, **kwargs):
        return asyncio.get_event_loop().run_until_complete(f(*args, **kwargs))
    return run

if PY35:
    sync.register(types.CoroutineType)(sync_co)
```

On Wed, Mar 6, 2019 at 9:20 PM Nathaniel Smith wrote:
> On Wed, Mar 6, 2019 at 4:37 PM pylang wrote:
> >> def maybe_async(fn):
> >>     @functools.wraps(fn)
> >>     def wrapper(*args, **kwargs):
> >>         coro = fn(*args, **kwargs)
> >>         if asyncio.get_running_loop() is not None:
> >>             return coro
> >>         else:
> >>             return await coro
> >
> > I was unable to run his example as-is (in Python 3.6 at least) since the `await` keyword is only permitted inside an `async def` function.
>
> Oh yeah, that was a brain fart. I meant to write:
>
> def maybe_async(fn):
>     @functools.wraps(fn)
>     def wrapper(*args, **kwargs):
>         coro = fn(*args, **kwargs)
>         if asyncio.get_running_loop() is not None:
>             return coro
>         else:
>             return asyncio.run(coro)
>
> -n
>
> -- > Nathaniel J.
Smith -- https://vorpus.org > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Wed Mar 6 22:21:10 2019 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 6 Mar 2019 19:21:10 -0800 Subject: [Python-ideas] unittest: 0 tests pass means failure of the testsuite In-Reply-To: References: Message-ID: On Wed, Mar 6, 2019 at 12:13 PM Matěj Cepl wrote: > > Hi, > > I am a lead maintainer of Python packages in OpenSUSE and I can > see the pattern of many packagers adding blindly > > python setup.py test > > to %check section of our SPEC file. The problem is that if the > package doesn't use unittest (it actually uses nose, pytest or > something), it could lead to zero found tests, which pass and > Python returns exit code 0 (success) even though nothing has been > tested. It seems from the outside that everything is all right, > package is being tested on every build, but actually it is lie. > > Would it be possible to change unittest runner, so that when 0 > tests pass, whole test suite would end up failing? You probably want to file a bug on the setuptools tracker: https://github.com/pypa/setuptools It's maintained by different people than Python itself, and is responsible for defining 'setup.py test'. -n -- Nathaniel J. Smith -- https://vorpus.org From storchaka at gmail.com Thu Mar 7 00:14:33 2019 From: storchaka at gmail.com (Serhiy Storchaka) Date: Thu, 7 Mar 2019 07:14:33 +0200 Subject: [Python-ideas] unittest: 0 tests pass means failure of the testsuite In-Reply-To: References: Message-ID: 06.03.19 22:12, Matěj Cepl wrote: > I am a lead maintainer of Python packages in OpenSUSE and I can > see the pattern of many packagers adding blindly > > python setup.py test > > to %check section of our SPEC file.
The problem is that if the > package doesn't use unittest (it actually uses nose, pytest or > something), it could lead to zero found tests, which pass and > Python returns exit code 0 (success) even though nothing has been > tested. It seems from the outside that everything is all right, > package is being tested on every build, but actually it is lie. > > Would it be possible to change unittest runner, so that when 0 > tests pass, whole test suite would end up failing? There was a related issue: https://bugs.python.org/issue34279. It may be worth making that warning more visible. Or just making setup.py more pedantic. From mcepl at cepl.eu Thu Mar 7 01:51:35 2019 From: mcepl at cepl.eu (=?UTF-8?Q?Mat=C4=9Bj?= Cepl) Date: Thu, 07 Mar 2019 07:51:35 +0100 Subject: [Python-ideas] unittest: 0 tests pass means failure of the testsuite In-Reply-To: References: Message-ID: Nathaniel Smith wrote on Wed, 06. 03. 2019 at 19:21 -0800: > You probably want to file a bug on the setuptools tracker: > https://github.com/pypa/setuptools > > It's maintained by different people than Python itself, and is > responsible for defining 'setup.py test'. I think we have two bugs (or deficiencies) here:

1. setup.py tries too hard and it pretends to have collected a zero-test suite even when it failed
2. unittest claims everything is OK when zero tests passed (and zero were skipped; that's a good point by Terry Reedy, a completely skipped test suite is legitimate in some situations).

And no, https://bugs.python.org/issue34279 is not exactly it; it is about regrtest, not unittest, but it could probably be adapted. Matěj -- https://matej.ceplovi.cz/blog/, Jabber: mcepl at ceplovi.cz GPG Finger: 3C76 A027 CA45 AD70 98B5 BC1D 7920 5802 880B C9D8 Give your heartache to him. (1Pt 5,7; Mt 11:28-30) -------------- next part -------------- A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part URL: From jamtlu at gmail.com Thu Mar 7 08:10:20 2019 From: jamtlu at gmail.com (James Lu) Date: Thu, 7 Mar 2019 08:10:20 -0500 Subject: [Python-ideas] =?utf-8?q?Make_Python_2=2E7=E2=80=99s_online_docs?= =?utf-8?q?_optionally_redirect_to__Python_3_online_docs?= Message-ID: <7DC0A967-5F8D-4756-8D35-E86253DE6D41@gmail.com> Rationale: When I use a search engine to google a Python question, I frequently get a link to a page of the Python 2.7 documentation that shows before the Python 3 documentation link. This is annoying and slows down my programming. I propose: That we add a setting to Python's online documentation that will, given that certain conditions are met, optionally redirect the user to the corresponding Python 3 documentation entry. The conditions:

- The Referer header is set to a URL of a major search engine (the Referer is the URL of the last page whose link was clicked on to reach the documentation)
- The user has opted in to this behavior.

(Conceptually this should be a user script, but for the collective conscience of all python developers, a doc option would be better.
) I understand that some core devs might just have documentation downloaded and just use that, but a large portion of Python users primarily use online documentation James Lu From andre.roberge at gmail.com Thu Mar 7 08:36:00 2019 From: andre.roberge at gmail.com (Andre Roberge) Date: Thu, 7 Mar 2019 09:36:00 -0400 Subject: [Python-ideas] =?utf-8?q?Make_Python_2=2E7=E2=80=99s_online_docs?= =?utf-8?q?_optionally_redirect_to_Python_3_online_docs?= In-Reply-To: <7DC0A967-5F8D-4756-8D35-E86253DE6D41@gmail.com> References: <7DC0A967-5F8D-4756-8D35-E86253DE6D41@gmail.com> Message-ID: On Thu, Mar 7, 2019 at 9:10 AM James Lu wrote: > Rationale: When I use a search engine to google a Python question, I > frequently get a link to a page of the Python 2.7 documentation that shows > before the Python 3 documentation link. > There exist browser extensions that do this: https://addons.mozilla.org/en-US/firefox/addon/py3direct/ https://chrome.google.com/webstore/detail/py3redirect/codfjigcljdnlklcaopdciclmmdandig?hl=en André Roberge > > This is annoying and slows down my programming. > > I propose: That we add a setting to Python?s online documentation that > will optionally given that certain conditions are met, we redirect the user > to the corresponding Python 3 documentation entry. The conditions: > - The Referer header is set to a URL of a major search engine (the Referer > is the URL of the last page that whose link was clicked on to reach the > documentation) > - The user has opted-in to this behavior. > > (Conceptually this should be user script, but for the collective > conscience of all python developers, a doc option would be better.
) > > I understand that some core devs might just have documentation downloaded > and just use that, but a large portion of Python users primarily use > online documentation > > James Lu > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Thu Mar 7 08:56:06 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 8 Mar 2019 00:56:06 +1100 Subject: [Python-ideas] =?utf-8?q?Make_Python_2=2E7=E2=80=99s_online_docs?= =?utf-8?q?_optionally_redirect_to__Python_3_online_docs?= In-Reply-To: <7DC0A967-5F8D-4756-8D35-E86253DE6D41@gmail.com> References: <7DC0A967-5F8D-4756-8D35-E86253DE6D41@gmail.com> Message-ID: <20190307135606.GM4465@ando.pearwood.info> On Thu, Mar 07, 2019 at 08:10:20AM -0500, James Lu wrote: > Rationale: When I use a search engine to google a Python question, I > frequently get a link to a page of the Python 2.7 documentation that > shows before the Python 3 documentation link. > > This is annoying and slows down my programming. Please see https://bugs.python.org/issue35435 and related links from that issue. I've found that the search engines are getting better at linking to the more recent docs. For example, all of these: https://duckduckgo.com/?q=python+docs+random https://search.yahoo.com/yhs/search?p=python+docs+itertools https://www.bing.com/search?q=python+docs+netrc https://www.startpage.com/do/search?q=python+docs+array https://www.dogpile.com/serp?q=python+docs+shutil give me Python 3 first and Python 2 second. 
Even the comparatively obscure "sndhdr" module gets Python 3 first: https://www.google.com/search?q=python+docs+sndhdr However these give Python 2 first: https://www.startpage.com/do/search?q=python+docs+netrc https://www.dogpile.com/serp?q=python+docs+fileinput But note that the docs do include a drop down menu to select the version, so it shouldn't be that difficult to swap from old versions to the most recent. (Unless you're looking at *really* old versions like 1.5.) -- Steven From zestyping at gmail.com Thu Mar 7 12:36:46 2019 From: zestyping at gmail.com (Ka-Ping Yee) Date: Thu, 7 Mar 2019 09:36:46 -0800 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> <20190305161100.GE4465@ando.pearwood.info> <5C7EFAE6.8000105@canterbury.ac.nz> <8812767a-1010-0923-36d4-22af4d2665ec@brice.xyz> <5C804A08.9060902@canterbury.ac.nz> Message-ID: On Wed, Mar 6, 2019 at 4:01 PM Chris Angelico wrote: > On Thu, Mar 7, 2019 at 10:52 AM Josh Rosenberg > wrote: > > > > Allowing dicts to get involved in + means: > > Lots of words that basically say: Stuff wouldn't be perfectly pure. > > But adding dictionaries is fundamentally *useful*. It is expressive. > It is useful. It's just that + is the wrong name. Filtering and subtracting from dictionaries are also useful! Those are operations we do all the time. It would be useful if & and - did these things too -- and if we have & and -, it's going to be even more obvious that the merge operator should have been |. Josh Rosenberg wrote: > If we were inventing programming languages in a vacuum, you could say + > can mean "arbitrary combination operator" and it would be fine.
But we're > not in a vacuum; every major language that uses + with general purpose > containers uses it to mean element-wise addition or concatenation, not just > "merge". If we were inventing Python from scratch, we could have decided that we always use "+" to combine collections. Sets would combine with + and then it would make sense that dictionaries also combine with +. But that is not Python. Lists combine with + and sets combine with |. Why? Because lists add (put both collections together and keep everything), but sets merge (put both collections together and keep some). So, Python already has a merge operator. The merge operator is "|". For lists, += is shorthand for list.extend(). For sets, |= is shorthand for set.update(). Is dictionary merge more like extend() or more like update()? Python already took a position on that when it was decided to name the dictionary method update(). That ship sailed a long time ago. -- Ping -------------- next part -------------- An HTML attachment was scrubbed... URL: From ricocotam at gmail.com Thu Mar 7 15:26:12 2019 From: ricocotam at gmail.com (Adrien Ricocotam) Date: Thu, 7 Mar 2019 21:26:12 +0100 Subject: [Python-ideas] =?utf-8?q?Make_Python_2=2E7=E2=80=99s_online_docs?= =?utf-8?q?_optionally_redirect_to__Python_3_online_docs?= In-Reply-To: <20190307135606.GM4465@ando.pearwood.info> References: <7DC0A967-5F8D-4756-8D35-E86253DE6D41@gmail.com> <20190307135606.GM4465@ando.pearwood.info> Message-ID: <27C25822-9609-4633-8CCC-402BE5A42A4F@gmail.com> The way search engines work is "the more it's clicked, the higher it is".
In order to have python3 on top of the results, just hit the Python3 result :) From tjreedy at udel.edu Thu Mar 7 17:53:58 2019 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 7 Mar 2019 17:53:58 -0500 Subject: [Python-ideas] =?utf-8?q?Make_Python_2=2E7=E2=80=99s_online_docs?= =?utf-8?q?_optionally_redirect_to_Python_3_online_docs?= In-Reply-To: <20190307135606.GM4465@ando.pearwood.info> References: <7DC0A967-5F8D-4756-8D35-E86253DE6D41@gmail.com> <20190307135606.GM4465@ando.pearwood.info> Message-ID: On 3/7/2019 8:56 AM, Steven D'Aprano wrote: > On Thu, Mar 07, 2019 at 08:10:20AM -0500, James Lu wrote: > >> Rationale: When I use a search engine to google a Python question, I >> frequently get a link to a page of the Python 2.7 documentation that >> shows before the Python 3 documentation link. >> >> This is annoying and slows down my programming. > > Please see > > https://bugs.python.org/issue35435 > > and related links from that issue. > > I've found that the search engines are getting better at linking to the > more recent docs. For example, all of these: > give me Python 3 first and Python 2 second. > https://duckduckgo.com/?q=python+docs+random > > https://search.yahoo.com/yhs/search?p=python+docs+itertools > > https://www.bing.com/search?q=python+docs+netrc > > https://www.startpage.com/do/search?q=python+docs+array Ditto for me: /3/ before /2/. > https://www.dogpile.com/serp?q=python+docs+shutil I get /2/ before /3/ Even the comparatively > obscure "sndhdr" module gets Python 3 first: > https://www.google.com/search?q=python+docs+sndhdr Ditto, but no /2/ on first page. > However these gives Python 2 first: > > https://www.startpage.com/do/search?q=python+docs+netrc /2/ followed by /3.1.5/. No /3/ on first page, so no option to influence better placement of /3/. > https://www.dogpile.com/serp?q=python+docs+fileinput I get /3/ before /2/. Does order depend on country? 
(AU versus US) > But note that the docs do include a drop down menu to select the > version, so it shouldn't be that difficult to swap from old versions to > the most recent. > > (Unless you're looking at *really* old versions like 1.5.) > (Or 3.1.5 ;-) -- Terry Jan Reedy From greg.ewing at canterbury.ac.nz Thu Mar 7 18:46:31 2019 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 08 Mar 2019 12:46:31 +1300 Subject: [Python-ideas] =?windows-1252?q?Make_Python_2=2E7=92s_online_doc?= =?windows-1252?q?s_optionally_redirect_to__Python_3_online_docs?= In-Reply-To: <20190307135606.GM4465@ando.pearwood.info> References: <7DC0A967-5F8D-4756-8D35-E86253DE6D41@gmail.com> <20190307135606.GM4465@ando.pearwood.info> Message-ID: <5C81AD57.4090600@canterbury.ac.nz> Steven D'Aprano wrote: > I've found that the search engines are getting better at linking to the > more recent docs. Likely this is simply due to the fact that Python 3 is being used more than it was, so more of its doc pages are getting linked to. If that's true, then things should continue to improve over time. -- Greg From steve at pearwood.info Thu Mar 7 18:47:18 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 8 Mar 2019 10:47:18 +1100 Subject: [Python-ideas] =?utf-8?q?Make_Python_2=2E7=E2=80=99s_online_docs?= =?utf-8?q?_optionally_redirect_to_Python_3_online_docs?= In-Reply-To: References: <7DC0A967-5F8D-4756-8D35-E86253DE6D41@gmail.com> <20190307135606.GM4465@ando.pearwood.info> Message-ID: <20190307234718.GN4465@ando.pearwood.info> On Thu, Mar 07, 2019 at 05:53:58PM -0500, Terry Reedy wrote: > On 3/7/2019 8:56 AM, Steven D'Aprano wrote: [...] > >I've found that the search engines are getting better at linking to the > >more recent docs. For example, all of these: > >give me Python 3 first and Python 2 second. [...] > I get /2/ before /3/ Sorry, I forgot to say "Your mileage may vary."
Google is well-known for tracking users (even if they aren't logged into a google account at the time) and filtering their search results. As far as I know, only DuckDuckGo promises that all users will see unfiltered results, with everyone seeing the same results from identical searches. So it is quite likely that any other search engine may give different results for identical search terms, according to who you are, whether you are signed into a google account, the country you or your ISP is based in, and the kinds of links you have followed in the past. Not just clicked search links -- Google in particular has an extensive web of tracking bugs throughout the WWW, so they can track you even when you aren't logged in. (Again, YMMV -- those taking active countermeasures may avoid some tracking, and I understand that in the EU Google has legal restrictions on what they collect and what they do with it.) [...] > /2/ followed by /3.1.5/. No /3/ on first page, so no option to > influence better placement of /3/. You could click through to the second page of search results :-) -- Steven From rosuav at gmail.com Thu Mar 7 18:51:00 2019 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 8 Mar 2019 10:51:00 +1100 Subject: [Python-ideas] =?utf-8?q?Make_Python_2=2E7=E2=80=99s_online_docs?= =?utf-8?q?_optionally_redirect_to_Python_3_online_docs?= In-Reply-To: <20190307234718.GN4465@ando.pearwood.info> References: <7DC0A967-5F8D-4756-8D35-E86253DE6D41@gmail.com> <20190307135606.GM4465@ando.pearwood.info> <20190307234718.GN4465@ando.pearwood.info> Message-ID: On Fri, Mar 8, 2019 at 10:48 AM Steven D'Aprano wrote: > > /2/ followed by /3.1.5/. No /3/ on first page, so no option to > > influence better placement of /3/. > > You could click through to the second page of search results :-) Obligatory XKCD: https://xkcd.com/1334/ It's unclear whether clicking a link on the second page actually trains the search engine, though. 
Clicks from the first page are (a) easier to track, and (b) more likely to be useful signals from a user, than clicks from subsequent pages are. But then, we have no real information about what DOES train the search engine, so take it all with a grain of salt. ChrisA From jamtlu at gmail.com Thu Mar 7 18:29:26 2019 From: jamtlu at gmail.com (James Lu) Date: Thu, 7 Mar 2019 18:29:26 -0500 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> <20190305161100.GE4465@ando.pearwood.info> <5C7EFAE6.8000105@canterbury.ac.nz> <8812767a-1010-0923-36d4-22af4d2665ec@brice.xyz> <5C804A08.9060902@canterbury.ac.nz> Message-ID: Now, this belongs as a separate PEP, and I probably will write one, but I propose: d1 << d2 makes a copy of d1 and merges d2 into it, and when the keys conflict, d2 takes priority. (Works like copy/update.) d1 + d2 makes a new dictionary, taking keys from d1 and d2. If d1 and d2 have a different value for same key, a KeyError is thrown. From turnbull.stephen.fw at u.tsukuba.ac.jp Fri Mar 8 00:01:20 2019 From: turnbull.stephen.fw at u.tsukuba.ac.jp (Stephen J. Turnbull) Date: Fri, 8 Mar 2019 14:01:20 +0900 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> <20190305161100.GE4465@ando.pearwood.info> <5C7EFAE6.8000105@canterbury.ac.nz> <8812767a-1010-0923-36d4-22af4d2665ec@brice.xyz> <5C804A08.9060902@canterbury.ac.nz> Message-ID: <23681.63264.157314.461467@turnbull.sk.tsukuba.ac.jp> Ka-Ping Yee writes: > On Wed, Mar 6, 2019 at 4:01 PM Chris Angelico wrote: > > But adding dictionaries is fundamentally *useful*. It is expressive. 
> > It is useful. It's just that + is the wrong name.

First, let me say that I prefer ?!'s position here, so my bias is made apparent. I'm also aware that I have biases so I'm sympathetic to those who take a different position. Rather than say it's "wrong", let me instead point out that I think it's pragmatically troublesome to use "+". I can think of at least four interpretations of "d1 + d2"

1. update
2. multiset (~= Collections.Counter addition)
3. addition of functions into the same vector space (actually, a semigroup will do ;-), and this is the implementation of Collections.Counter
4. "fiberwise" addition (ie, assembling functions into relations)

and I'm very jet-lagged so I may be missing some. Since "|" (especially "|=") *is* suitable for "update", I think we should reserve "+" for some alternative future commutative extension, of which there are several possible (all of 2, 3, 4 are commutative). Again in the spirit of full disclosure, of those above, 2 is already implemented and widely used, so we don't need to use "+" for that. I've never seen 4 except in the mathematical literature (union of relations is not the same thing). 3, however, is very common both for mappings with small domain and sparse representation of mappings with a default value (possibly computed then cached), and "|" is not suitable for expressing that sort of addition (I'm willing to say it's "wrong" :-). There's also the fact that the operations denoted by "|" and "||" are often implemented as "short-circuiting", and therefore not commutative, while "+" usually is (and that's reinforced for mathematicians, who are trained to think of "+" as the operator for Abelian groups, while "*" is a (possibly) non-commutative operator). I know commutativity of "+" has been mentioned before, but the non-commutativity of "|" -- and so unsuitability for many kinds of dict combination -- hasn't been emphasized before IIRC.
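Of the interpretations listed above, number 2 already exists in the stdlib: collections.Counter implements a commutative, value-adding +:

```python
from collections import Counter

c1 = Counter({"a": 1, "b": 2})
c2 = Counter({"b": 3, "c": 4})

# Values for shared keys are added, not replaced.
total = c1 + c2
print(total == Counter({"a": 1, "b": 5, "c": 4}))  # True
print(c1 + c2 == c2 + c1)  # True: this + really is commutative
```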
Steve From turnbull.stephen.fw at u.tsukuba.ac.jp Fri Mar 8 00:11:13 2019 From: turnbull.stephen.fw at u.tsukuba.ac.jp (Stephen J. Turnbull) Date: Fri, 8 Mar 2019 14:11:13 +0900 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> <20190305161100.GE4465@ando.pearwood.info> <5C7EFAE6.8000105@canterbury.ac.nz> <8812767a-1010-0923-36d4-22af4d2665ec@brice.xyz> <5C804A08.9060902@canterbury.ac.nz> Message-ID: <23681.63857.133445.657998@turnbull.sk.tsukuba.ac.jp> Ka-Ping Yee writes: > On Wed, Mar 6, 2019 at 4:01 PM Chris Angelico wrote: > > But adding dictionaries is fundamentally *useful*. It is expressive. > > It is useful. It's just that + is the wrong name. First, let me say that I prefer ?!'s position here, so my bias is made apparent. I'm also aware that I have biases so I'm sympathetic to those who take a different position. Rather than say it's "wrong", let me instead point out that I think it's pragmatically troublesome to use "+". I can think of at least four interpretations of "d1 + d2" 1. update 2. multiset (~= Collections.Counter addition) 3. addition of functions into the same vector space (actually, a semigroup will do ;-), and this is the implementation of Collections.Counter 4. "fiberwise" set addition (ie, of functions into relations) and I'm very jet-lagged so I may be missing some. There's also the fact that the operations denoted by "|" and "||" are often implemented as "short-circuiting", and therefore not commutative, while "+" usually is (and that's reinforced for mathematicians who are trained to think of "+" as the operator for Abelian groups, while "*" is a (possibly) non-commutative operator. 
I know commutativity of "+" has been mentioned before, but the non-commutativity of "|" -- and so unsuitability for many kinds of dict combination -- hasn't been emphasized before IIRC. Since "|" (especially "|=") *is* suitable for "update", I think we should reserve "+" for some future commutative extension. In the spirit of full disclosure: Of these, 2 is already implemented and widely used, so we don't need to use dict.__add__ for that. I've never seen 4 in the mathematical literature (union of relations is not the same thing). 3, however, is very common both for mappings with small domain and sparse representation of mappings with a default value (possibly computed then cached), and "|" is not suitable for expressing that sort of addition (I'm willing to say it's "wrong" :-).

Steve

From jfine2358 at gmail.com Fri Mar 8 04:48:58 2019 From: jfine2358 at gmail.com (Jonathan Fine) Date: Fri, 8 Mar 2019 09:48:58 +0000 Subject: [Python-ideas] Suggestions: dict.flow_update and dict.__add__ In-Reply-To: References: Message-ID:

I've just learnt something new. Look at

>>> from operator import iadd
>>> lst = [1, 2, 3]
>>> iadd(lst, 'hi')
[1, 2, 3, 'h', 'i']
>>> lst
[1, 2, 3, 'h', 'i']

This shows that the proposals dict.flow_update and dict.__iadd__ are basically the same. (I think this is quite important for understanding the attraction of fluent programming. We ALREADY like and use it, in the form of augmented assignment of mutables.)
This also shows that

combined = defaults.copy()
combined.update(options)

could, if the proposal is accepted, be written as

defaults.copy().__iadd__(options)

I got the idea from the withdrawn PEP (thank you, Nick Coghlan, for writing it): PEP 577 -- Augmented Assignment Expressions https://www.python.org/dev/peps/pep-0577/

-- Jonathan

From sorcio at gmail.com Fri Mar 8 08:06:37 2019 From: sorcio at gmail.com (Davide Rizzo) Date: Fri, 8 Mar 2019 14:06:37 +0100 Subject: [Python-ideas] Dict joining using + and += In-Reply-To: References: <0433166f-4f88-93d6-327b-ed1708c17fa7@gmail.com> <378B4AB0-A949-462F-ADCE-10D40E3969CC@killingar.net> Message-ID:

> Counter also uses +/__add__ for a similar behavior.
>
> >>> c = Counter(a=3, b=1)
> >>> d = Counter(a=1, b=2)
> >>> c + d # add two counters together: c[x] + d[x]
> Counter({'a': 4, 'b': 3})
>
> At first I worried that changing base dict would cause confusion for the subclass, but Counter seems to share the idea that update and + are synonyms.

Counter is a moot analogy. Counter's + and - operators follow the rules of numeric addition and subtraction:

>>> c = Counter({"a": 1})
>>> c + Counter({"a": 5})
Counter({'a': 6})
>>> c + Counter({"a": 5}) - Counter({"a": 4})
Counter({'a': 2})

Which also means that in most cases (c1 + c2) - c2 == c1, which is not something you would expect with the suggested "dictionary addition" operation. As a side note, this is not true in general for Counters because of how subtraction handles 0. E.g.
>>> c0 = Counter({"a": 0})
>>> c1 = Counter({"a": 1})
>>> (c0 + c1) - c1
Counter()
>>> (c0 + c1) - c1 == c0
False

---

The current intuitions about how + and - work don't apply literally to this suggestion:

1) numeric types are their own story
2) most built-in sequences imply concatenation for + and have no subtraction
3) numpy-like arrays behave closer to numbers
4) Counters mimic numbers in some ways, and while their addition is reminiscent of concatenation (though order is not relevant) they also have subtraction
5) sets have difference, which is probably the closest to what you would expect from dict subtraction, but no + operator

---

I understand the arguments against a | operator for dicts but I don't entirely agree with them. dict is obviously a different type of object than all the others I've mentioned, even mathematically, and there is no clear precedent. If sets happened to maintain insertion order, like dicts after 3.6/3.7, I would expect the union operator to also preserve the order. Before 3.6 we probably would have seen dicts as closer to sets from that point of view, and this suggested addition as closer to set union.

The question of symmetry ({"a": 1} + {"a": 2}) is an important one, and I would consider not enforcing one resolution in PEP 584, instead leaving this undefined (i.e. in the resulting dict, the value could be either 1 or 2, or just completely undefined, to also be compatible with Counter-like semantics in the same PEP). This is something to consider carefully if the plan is to make the new operators part of Mapping. It's not obvious that all mappings should implement this the same way, and a survey of what is done by other implementations of Mapping would be useful. On the other hand, leaving it undefined might make it harder to standardize later, once other implementations have defined their own behavior. This question is probably on its own a valid argument against the proposal.
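[The symmetry question can be made concrete; a sketch of the three resolutions being discussed, where strict_merge is a hypothetical helper, not something proposed in this thread:]

```python
from collections import Counter

d1 = {"a": 1}
d2 = {"a": 2}

# Resolution A: last one wins (what {**d1, **d2} does today).
assert {**d1, **d2} == {"a": 2}
assert {**d2, **d1} == {"a": 1}

# Resolution B: refuse to choose -- raise on duplicate keys.
# (strict_merge is a hypothetical helper, not part of any proposal.)
def strict_merge(x, y):
    common = x.keys() & y.keys()
    if common:
        raise KeyError(f"duplicate keys: {common}")
    return {**x, **y}

try:
    strict_merge(d1, d2)
except KeyError:
    pass  # expected: "a" appears in both
else:
    raise AssertionError("strict_merge should have raised")

# Resolution C: Counter-like -- combine the conflicting values.
assert Counter(d1) + Counter(d2) == Counter({"a": 3})
```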
When it comes to dicts (and not Mappings in general) {**d1, **d2} or d.update() already have clearly-defined semantics. The new proposal for a merge() operation might be more useful. The added value would be the ability to add two mappings regardless of concrete type. But it's with Mappings in general that this proposal is the most problematic. On the other hand the subtraction operator is probably less controversial and immediately useful (the idiom to remove keys from a dictionary is not obvious). From jcrmatos at gmail.com Fri Mar 8 11:24:19 2019 From: jcrmatos at gmail.com (=?UTF-8?Q?Jo=c3=a3o_Matos?=) Date: Fri, 8 Mar 2019 16:24:19 +0000 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <20190305233604.GG4465@ando.pearwood.info> References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> <9C42F773-A486-479E-BD9F-14EB588E5093@gmail.com> <20190304162543.GU4465@ando.pearwood.info> <20190305233604.GG4465@ando.pearwood.info> Message-ID: <620bcb7b-cf8b-4992-402a-d3918aab20f8@gmail.com> An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: smime.p7s Type: application/pkcs7-signature Size: 3988 bytes Desc: S/MIME Cryptographic Signature URL: From guido at python.org Fri Mar 8 11:55:43 2019 From: guido at python.org (Guido van Rossum) Date: Fri, 8 Mar 2019 08:55:43 -0800 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <23681.63857.133445.657998@turnbull.sk.tsukuba.ac.jp> References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> <20190305161100.GE4465@ando.pearwood.info> <5C7EFAE6.8000105@canterbury.ac.nz> <8812767a-1010-0923-36d4-22af4d2665ec@brice.xyz> <5C804A08.9060902@canterbury.ac.nz> <23681.63857.133445.657998@turnbull.sk.tsukuba.ac.jp> Message-ID: On Thu, Mar 7, 2019 at 9:12 PM Stephen J. Turnbull < turnbull.stephen.fw at u.tsukuba.ac.jp> wrote: > Ka-Ping Yee writes: > > On Wed, Mar 6, 2019 at 4:01 PM Chris Angelico wrote: > > > > But adding dictionaries is fundamentally *useful*. It is expressive. > > > > It is useful. It's just that + is the wrong name. > > First, let me say that I prefer ?!'s position here, so my bias is made > apparent. I'm also aware that I have biases so I'm sympathetic to > those who take a different position. > TBH, I am warming up to "|" as well. > Rather than say it's "wrong", let me instead point out that I think > it's pragmatically troublesome to use "+". I can think of at least > four interpretations of "d1 + d2" > > 1. update > 2. multiset (~= Collections.Counter addition) > I guess this explains the behavior of removing results <= 0; it makes sense as multiset subtraction, since in a multiset a negative count makes little sense. (Though the name Counter certainly doesn't seem to imply multiset.) > 3. addition of functions into the same vector space (actually, a > semigroup will do ;-), and this is the implementation of > Collections.Counter > 4. 
"fiberwise" set addition (ie, of functions into relations) > > and I'm very jet-lagged so I may be missing some. > > There's also the fact that the operations denoted by "|" and "||" are > often implemented as "short-circuiting", and therefore not > commutative, while "+" usually is (and that's reinforced for > mathematicians who are trained to think of "+" as the operator for > Abelian groups, while "*" is a (possibly) non-commutative operator. I > know commutativity of "+" has been mentioned before, but the > non-commutativity of "|" -- and so unsuitability for many kinds of > dict combination -- hasn't been emphasized before IIRC. > I've never heard of single "|" being short-circuiting. ("||" of course is infamous for being that in C and most languages derived from it.) And "+" is of course used for many non-commutative operations in Python (e.g. adding two lists/strings/tuples together). It is only *associative*, a weaker requirement that just says (A + B) + C == A + (B + C). (This is why we write A + B + C, since the grouping doesn't matter for the result.) Anyway, while we're discussing mathematical properties, and since SETL was briefly mentioned, I found an interesting thing in math. For sets, union and intersection are distributive over each other. I can't type the operators we learned in high school, so I'll use Python's set operations. We find that A | (B & C) == (A | B) & (A | C). We also find that A & (B | C) == (A & B) | (A & C). Note that this is *not* the case for + and * when used with (mathematical) numbers: * distributes over +: a * (b + c) == (a * b) + (a * c), but + does not distribute over *: a + (b * c) != (a + b) * (a + c). So in a sense, SETL (which uses + and * for union and intersection) got the operators wrong. Note that in Python, + and * for sequences are not distributive this way, since (A + B) * n is not the same as (A * n) + (B * n). OTOH A * (n + m) == A * n + A * m. 
(Assuming A and B are sequences of the same type, and n and m are positive integers.) If we were to use "|" and "&" for dict "union" and "intersection", the mutual distributive properties will hold.

> Since "|" (especially "|=") *is* suitable for "update", I think we
> should reserve "+" for some future commutative extension.

One argument is that sets have an update() method aliased to "|=", so this makes it more reasonable to do the same for dicts, which also have an update() method, with similar behavior (not surprising, since sets were modeled after dicts).

> In the spirit of full disclosure:
> Of these, 2 is already implemented and widely used, so we don't need
> to use dict.__add__ for that. I've never seen 4 in the mathematical
> literature (union of relations is not the same thing). 3, however, is
> very common both for mappings with small domain and sparse
> representation of mappings with a default value (possibly computed
> then cached), and "|" is not suitable for expressing that sort of
> addition (I'm willing to say it's "wrong" :-).

-- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL:

From samuel.wgx at gmail.com Fri Mar 8 11:55:58 2019 From: samuel.wgx at gmail.com (Samuel Li) Date: Fri, 8 Mar 2019 11:55:58 -0500 Subject: [Python-ideas] Attribute-Getter Syntax Proposal Message-ID:

Don't know if this has been suggested before. Instead of writing something like

>>> map(lambda x: x.upper(), ['a', 'b', 'c'])

I suggest this syntax:

>>> map(.upper(), ['a', 'b', 'c'])

This would also work for attributes:

>>> map(.real, [1j, 2, 3+4j])

Internally, this would require translating

.attribute -> lambda x: x.attribute

and

.method(*args, **kwargs) -> lambda x: x.method(*args, **kwargs)

This translation should only take place where a "normal" attribute lookup makes no sense (throws a SyntaxError); i.e.
foo.bar works as before, foo(.bar) would previously throw a SyntaxError, so the new syntax applies and the .bar is interpreted as an attrgetter. This is of course only a cosmetic improvement over operator.attrgetter and operator.methodcaller, but I think it's nice enough to warrant consideration. If you like this idea or think it's utter garbage, feel free to discuss. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jfine2358 at gmail.com Fri Mar 8 13:07:21 2019 From: jfine2358 at gmail.com (Jonathan Fine) Date: Fri, 8 Mar 2019 18:07:21 +0000 Subject: [Python-ideas] Attribute-Getter Syntax Proposal In-Reply-To: References: Message-ID: Hi Samuel Interesting idea, and certainly addresses a real problem, if you find yourself creating lots of lambda expressions. But in my first opinion, not so useful that it merits adding to the syntax of Python. (Even if I never use it, it puts an extra burden on me when scanning Python code. Something that used to look like a syntax error is now valid. That's more work for me.) However, you can already achieve something similar, and perhaps more expressive. It is possible to define an object 'magic' such that fn = magic.upper fn = lambda x: x.upper() are effectively equivalent. And this can be done now. No need for a PEP and a new version of Python. And available for those who have to use some fixed already existing Python versions. I hope you'd be interesting in coding this up yourself. I'd have a limited amount of time to help you, but it would put you on a good learning curve, for fundamentals of the Python object model. -- Jonathan From brett at python.org Fri Mar 8 13:37:48 2019 From: brett at python.org (Brett Cannon) Date: Fri, 8 Mar 2019 10:37:48 -0800 Subject: [Python-ideas] Attribute-Getter Syntax Proposal In-Reply-To: References: Message-ID: On Fri, Mar 8, 2019 at 8:57 AM Samuel Li wrote: > Don't know if this has been suggested before. 
Instead of writing something > like > > >>> map(lambda x: x.upper(), ['a', 'b', 'c']) > > I suggest this syntax: > >>> map(.upper(), ['a', 'b', 'c']) > Do note you get the same results with `map(str.upper, ['a', 'b', 'c'])`. > > This would also work for attributes: > >>> map(.real, [1j, 2, 3+4j]) > > Internally, this would require translating > > .attribute -> lambda x: x.attribute > > and > > .method(*args, **kwargs) -> lambda x: x.method(*args, **kwargs) > > This translation should only take place where a "normal" attribute lookup > makes no sense (throws a SyntaxError); i.e. foo.bar works as before, > foo(.bar) would previously throw a SyntaxError, so the new syntax applies > and the .bar is interpreted as an attrgetter. > > This is of course only a cosmetic improvement over operator.attrgetter and > operator.methodcaller, but I think it's nice enough to warrant > consideration. > > If you like this idea or think it's utter garbage, feel free to discuss. > Sorry, I'm personally not a fan as it looks like you have a typo in your code, e.g. you left of 'x' or something before the dot. -------------- next part -------------- An HTML attachment was scrubbed... URL: From python at mrabarnett.plus.com Fri Mar 8 14:19:17 2019 From: python at mrabarnett.plus.com (MRAB) Date: Fri, 8 Mar 2019 19:19:17 +0000 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> <20190305161100.GE4465@ando.pearwood.info> <5C7EFAE6.8000105@canterbury.ac.nz> <8812767a-1010-0923-36d4-22af4d2665ec@brice.xyz> <5C804A08.9060902@canterbury.ac.nz> <23681.63857.133445.657998@turnbull.sk.tsukuba.ac.jp> Message-ID: <59dbf1c3-a574-2e9a-5255-9012547ab79d@mrabarnett.plus.com> On 2019-03-08 16:55, Guido van Rossum wrote: [snip] > If we were to use "|" and "&" for dict "union" and "intersection", the > mutual distributive properties will hold. 
> > Since "|" (especially "|=") *is* suitable for "update", I think we > should reserve "+" for some future commutative extension. > > > One argument is that sets have an update() method aliased to "|=", so > this makes it more reasonable to do the same for dicts, which also have > a. update() method, with similar behavior (not surprising, since sets > were modeled after dicts). > [snip] One way to think of it is that a dict is like a set, except that each of its members has an additional associated value. From francismb at email.de Fri Mar 8 14:48:18 2019 From: francismb at email.de (francismb) Date: Fri, 8 Mar 2019 20:48:18 +0100 Subject: [Python-ideas] Left arrow and right arrow operators In-Reply-To: References: <4fd9518d-3707-7f9c-751a-b2eefa172634@email.de> Message-ID: Hi Todd, On 3/4/19 2:18 PM, Todd wrote: > What is the operator supposed to do? this should depend on what you want to do, the type, the context. How to you would want to use it ? do you see a context where the symbols make meaning to you? Thanks in advance! --francis From francismb at email.de Fri Mar 8 14:51:48 2019 From: francismb at email.de (francismb) Date: Fri, 8 Mar 2019 20:51:48 +0100 Subject: [Python-ideas] Left arrow and right arrow operators In-Reply-To: References: <4fd9518d-3707-7f9c-751a-b2eefa172634@email.de> Message-ID: <8bb1e7a5-2154-2d5d-b068-3a02050b9636@email.de> Hi Calvin, On 3/4/19 2:09 PM, Calvin Spealman wrote: > I don't like the idea of arrows in both directions when you can just swap > the operands instead Well you saw just to examples of contexts (dict and bool). Could you imagine a context where swapping cannot be done and thus there is a need for left- and right arrow? Thanks in advance! 
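[For dict merging, at least, both directions are already expressible by swapping the operands of the existing unpacking merge; a quick sketch with made-up names:]

```python
defaults = {"color": "red", "size": 1}
options = {"color": "blue"}

# "merge to the right": the right operand's values win on conflict ...
assert {**defaults, **options} == {"color": "blue", "size": 1}

# ... and the "other direction" is just the same merge with the
# operands swapped, so a single asymmetric operator would suffice.
assert {**options, **defaults} == {"color": "red", "size": 1}
```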
--francis

From francismb at email.de Fri Mar 8 14:56:07 2019 From: francismb at email.de (francismb) Date: Fri, 8 Mar 2019 20:56:07 +0100 Subject: [Python-ideas] Left arrow and right arrow operators In-Reply-To: <7b267f9b.48c7.169470f8936.Coremail.fhsxfhsx@126.com> References: <4fd9518d-3707-7f9c-751a-b2eefa172634@email.de> <7b267f9b.48c7.169470f8936.Coremail.fhsxfhsx@126.com> Message-ID:

Hi fhsxfhsx, On 3/4/19 5:56 AM, fhsxfhsx wrote: > Could you explain why do you prefer this operator than `+`? Well yes, because of the asymmetric operation done underneath (merging dicts is not symmetric). The asymmetry is explicit in the symbol, not implicit in the documentation you would need to read for + (in the case proposed for dictionaries). Regards, --francis

From francismb at email.de Fri Mar 8 15:02:58 2019 From: francismb at email.de (francismb) Date: Fri, 8 Mar 2019 21:02:58 +0100 Subject: [Python-ideas] Left arrow and right arrow operators In-Reply-To: <20190303150638.zv24yqah625uiypj@phdru.name> References: <4fd9518d-3707-7f9c-751a-b2eefa172634@email.de> <20190303150638.zv24yqah625uiypj@phdru.name> Message-ID:

Hi Oleg, On 3/3/19 4:06 PM, Oleg Broytman wrote:
> You cannot create operator ``<-`` because it's currently valid
> syntax:
>
> 3 <- 2
>
> is equivalent to
>
> 3 < -2

Yes, it's a good point, but to me '<-' and '< -' are not the same, because of the blank(s) in between. That may be how it is now, but does that mean it always needs to be like this? Isn't Python already blank/indentation aware? Or is it just a grammar no-go? Thanks in advance!
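[The current tokenization is easy to verify; a quick experiment with the ast module, not from the original mail:]

```python
import ast

# "3<-2" parses as a comparison, 3 < (-2), regardless of spacing:
tree = ast.parse("3<-2", mode="eval").body
assert isinstance(tree, ast.Compare)
assert isinstance(tree.ops[0], ast.Lt)
assert isinstance(tree.comparators[0], ast.UnaryOp)

# The spaced and unspaced forms evaluate identically today.
assert eval("3<-2") is False
assert eval("3 < -2") is False
```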
--francis From rosuav at gmail.com Fri Mar 8 15:07:45 2019 From: rosuav at gmail.com (Chris Angelico) Date: Sat, 9 Mar 2019 07:07:45 +1100 Subject: [Python-ideas] Left arrow and right arrow operators In-Reply-To: References: <4fd9518d-3707-7f9c-751a-b2eefa172634@email.de> <20190303150638.zv24yqah625uiypj@phdru.name> Message-ID: On Sat, Mar 9, 2019 at 7:05 AM francismb wrote: > > Hi Oleg, > > On 3/3/19 4:06 PM, Oleg Broytman wrote: > > You cannot create operator ``<-`` because it's currently valid > > syntax: > > > > 3 <- 2 > > > > is equivalent to > > > > 3 < -2 > > Yes, its a good point, but for me it's not the same '<-' and '< -' due > (n)blanks in between. It is may be how now it is, but means that it > needs to be always like this? Isn't Python not already > blank(s)/indentation aware? or it's just a grammar NO GO? > Python permits "3<-2", so this is indeed a no-go. You can easily test this at the interactive interpreter. ChrisA From mrbm74 at gmail.com Fri Mar 8 16:16:02 2019 From: mrbm74 at gmail.com (Martin Bammer) Date: Fri, 8 Mar 2019 22:16:02 +0100 Subject: [Python-ideas] Preallocated tuples and dicts for function calls Message-ID: Hi, what about the idea that the interpreter preallocates and preinitializes the tuples and dicts for function calls where possible when loading a module? Before calling a function then the interpreter would just need to update the items which are dynamic and then call the function. Some examples: msgpack.unpackb(b'\x93\x01\x02\x03', use_list=False, raw=False) The above function call needs a tuple with 1 entry and a dict with 2 entries. All entries are constant. So in this case the interpreter can immediately execute the function call. Without the optimization the interpreter would need to: - create new tuple (allocate memory) - write constant into first tuple index. 
- create dict (allocate memory)
- add key+value
- add key+value
- call function

Another example:

foo(bar, 3, 5, arg1=bar1, arg2=True)

The above needs a tuple with 3 entries, 2 of them constant, and a dict with 2 entries, 1 of them constant.

With the optimization:
- write bar into first tuple index.
- replace first key+value pair in the dict.
- call function

Without the optimization:
- create new tuple (allocate memory)
- write bar into first tuple index.
- write constant into second tuple index.
- write constant into third tuple index.
- create dict (allocate memory)
- add key+value
- add key+value
- call function

If this idea is possible to implement I assume the function calls would receive a great speed improvement.

Best regards, Martin

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From 1benediktwerner at gmail.com Fri Mar 8 17:08:52 2019 From: 1benediktwerner at gmail.com (Benedikt Werner) Date: Fri, 8 Mar 2019 23:08:52 +0100 Subject: [Python-ideas] Attribute-Getter Syntax Proposal In-Reply-To: References: Message-ID: <12598c24-d507-7f10-28db-5798834ab890@gmail.com>

This was actually quite interesting to code, thanks for the idea Jonathan!

You can even support "magic.upper()" and "magic.real" at the same time as well as "magic[0]":

class MagicClass:
    NO_ARG = object()

    @staticmethod
    def __getattribute__(attr):
        def method(x=MagicClass.NO_ARG):
            if x is MagicClass.NO_ARG:
                return lambda x: getattr(x, attr)()
            return getattr(x, attr)
        return method

    @staticmethod
    def __getitem__(attr):
        return lambda x: x[attr]

magic = MagicClass()

print(list(map(magic.upper(), ["abc", "def"])))  # ['ABC', 'DEF']
print(list(map(magic.real, [1j, 2, 3+4j])))      # [0.0, 2, 3.0]
print(list(map(magic[0], ["abc", "def"])))      
# ['a', 'd']

You could also use None instead of that NO_ARG thingy, because you most likely won't want to get any attributes of None objects, but that wouldn't produce proper errors in case you do anyway. With metaclasses you probably could also make it work directly on the class without the need of a magic instance.

Benedikt

Am 08.03.2019 um 19:07 schrieb Jonathan Fine:
> Hi Samuel
>
> Interesting idea, and certainly addresses a real problem, if you find
> yourself creating lots of lambda expressions. But in my first opinion,
> not so useful that it merits adding to the syntax of Python.
>
> (Even if I never use it, it puts an extra burden on me when scanning
> Python code. Something that used to look like a syntax error is now
> valid. That's more work for me.)
>
> However, you can already achieve something similar, and perhaps more
> expressive. It is possible to define an object 'magic' such that
> fn = magic.upper
> fn = lambda x: x.upper()
> are effectively equivalent.
>
> And this can be done now. No need for a PEP and a new version of
> Python. And available for those who have to use some fixed already
> existing Python versions.
>
> I hope you'd be interesting in coding this up yourself. I'd have a
> limited amount of time to help you, but it would put you on a good
> learning curve, for fundamentals of the Python object model.
>

From rosuav at gmail.com Fri Mar 8 17:12:41 2019 From: rosuav at gmail.com (Chris Angelico) Date: Sat, 9 Mar 2019 09:12:41 +1100 Subject: [Python-ideas] Attribute-Getter Syntax Proposal In-Reply-To: <12598c24-d507-7f10-28db-5798834ab890@gmail.com> References: <12598c24-d507-7f10-28db-5798834ab890@gmail.com> Message-ID:

On Sat, Mar 9, 2019 at 9:09 AM Benedikt Werner <1benediktwerner at gmail.com> wrote: > > This was actually quite interesting to code, thanks for the idea Jonathan!
> > You can even support "magic.upper()" and "magic.real" at the same time > as well as "magic[0]": > > class MagicClass: > NO_ARG = object() > > @staticmethod > def __getattribute__(attr): > def method(x=MagicClass.NO_ARG): > if x is MagicClass.NO_ARG: > return lambda x: getattr(x, attr)() > return getattr(x, attr) > return method > > @staticmethod > def __getitem__(attr): > return lambda x: x[attr] > > magic = MagicClass() > > print(list(map(magic.upper(), ["abc", "def"]))) # ['ABC', 'DEF'] > print(list(map(magic.real, [1j, 2, 3+4j]))) # [0.0, 2, 3.0] > print(list(map(magic[0], ["abc", "def"]))) # ['a', 'd'] > > You could also use None instead of that NO_ARG thingy, because you most > likely won't want to get any attributes of None objects, but that > wouldn't produce proper errors incase you do anyways. > > With metaclasses you propably could also make it work directly on the > class without the need of a magic instance. Rather than using map in this way, I would recommend a list comprehension: print([x.upper() for x in ["abc", "def"]]) print([x.real for x in [1j, 2, 3+4j]]) print([x[0] for x in ["abc", "def"]]) No magic needed. ChrisA From 1benediktwerner at gmail.com Fri Mar 8 17:21:56 2019 From: 1benediktwerner at gmail.com (Benedikt Werner) Date: Fri, 8 Mar 2019 23:21:56 +0100 Subject: [Python-ideas] Attribute-Getter Syntax Proposal In-Reply-To: <12598c24-d507-7f10-28db-5798834ab890@gmail.com> References: <12598c24-d507-7f10-28db-5798834ab890@gmail.com> Message-ID: <7871ae21-f359-461f-bc67-0f1f3453a9e6@gmail.com> I just realized it doesn't work properly if the method takes some arguments, so you would actually have to use two different magic objects or something like that, but I guess the point is clear. Am 08.03.2019 um 23:08 schrieb Benedikt Werner: > This was actually quite interesting to code, thanks for the idea > Jonathan! > > You can even support "magic.upper()" and "magic.real" at the same time > as well as "magic[0]": > > class MagicClass: > ??? 
NO_ARG = object()
>
>     @staticmethod
>     def __getattribute__(attr):
>         def method(x=MagicClass.NO_ARG):
>             if x is MagicClass.NO_ARG:
>                 return lambda x: getattr(x, attr)()
>             return getattr(x, attr)
>         return method
>
>     @staticmethod
>     def __getitem__(attr):
>         return lambda x: x[attr]
>
> magic = MagicClass()
>
> print(list(map(magic.upper(), ["abc", "def"])))  # ['ABC', 'DEF']
> print(list(map(magic.real, [1j, 2, 3+4j])))      # [0.0, 2, 3.0]
> print(list(map(magic[0], ["abc", "def"])))      
>> From greg.ewing at canterbury.ac.nz Fri Mar 8 18:32:15 2019 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 09 Mar 2019 12:32:15 +1300 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> <20190305161100.GE4465@ando.pearwood.info> <5C7EFAE6.8000105@canterbury.ac.nz> <8812767a-1010-0923-36d4-22af4d2665ec@brice.xyz> <5C804A08.9060902@canterbury.ac.nz> <23681.63857.133445.657998@turnbull.sk.tsukuba.ac.jp> Message-ID: <5C82FB7F.4000800@canterbury.ac.nz> Guido van Rossum wrote: > I guess this explains the behavior of removing results <= 0; it makes > sense as multiset subtraction, since in a multiset a negative count > makes little sense. (Though the name Counter certainly doesn't seem to > imply multiset.) It doesn't even behave consistently as a multiset, since c[k] -= n is happy to let the value go negative. > For sets, > union and intersection are distributive over each other. > Note that this is *not* the case for + and * when used with > (mathematical) numbers... So in a sense, SETL (which uses + and * > for union and intersection got the operators wrong. But in another sense, it didn't. In Boolean algebra, "and" and "or" (which also distribute over each other) are often written using the same notations as multiplication and addition. There's no rule in mathematics saying that these notations must be distributive in one direction but not the other. -- Greg From chris.barker at noaa.gov Fri Mar 8 18:43:10 2019 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Fri, 8 Mar 2019 15:43:10 -0800 Subject: [Python-ideas] Attribute-Getter Syntax Proposal In-Reply-To: References: <12598c24-d507-7f10-28db-5798834ab890@gmail.com> Message-ID: > > Rather than using map in this way, I would recommend a list comprehension: Exactly! 
I really don't get why folks want to use map() so much when the comprehension syntax is often cleaner and easier. It was added for a reason :-) -CHB From greg.ewing at canterbury.ac.nz Fri Mar 8 18:53:06 2019 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 09 Mar 2019 12:53:06 +1300 Subject: [Python-ideas] Attribute-Getter Syntax Proposal In-Reply-To: References: Message-ID: <5C830062.4030609@canterbury.ac.nz> Samuel Li wrote: > .attribute -> lambda x: x.attribute > > .method(*args, **kwargs) -> lambda x: x.method(*args, **kwargs) Leading dots can be hard to spot when reading code. Also, I'm not convinced that use cases for this are frequent enough to warrant new syntax. Something akin to this can already be done in simple cases: map(string.upper, some_list) Anything more complicated, such as passing arguments, is probably better expressed with a comprehension. -- Greg From guido at python.org Fri Mar 8 19:07:45 2019 From: guido at python.org (Guido van Rossum) Date: Fri, 8 Mar 2019 16:07:45 -0800 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <5C82FB7F.4000800@canterbury.ac.nz> References: <20190301162645.GM4465@ando.pearwood.info> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> <20190305161100.GE4465@ando.pearwood.info> <5C7EFAE6.8000105@canterbury.ac.nz> <8812767a-1010-0923-36d4-22af4d2665ec@brice.xyz> <5C804A08.9060902@canterbury.ac.nz> <23681.63857.133445.657998@turnbull.sk.tsukuba.ac.jp> <5C82FB7F.4000800@canterbury.ac.nz> Message-ID: On Fri, Mar 8, 2019 at 3:33 PM Greg Ewing wrote: > Guido van Rossum wrote: > > I guess this explains the behavior of removing results <= 0; it makes > > sense as multiset subtraction, since in a multiset a negative count > > makes little sense. (Though the name Counter certainly doesn't seem to > > imply multiset.) > > It doesn't even behave consistently as a multiset, since c[k] -= n > is happy to let the value go negative.
> > > For sets, > > union and intersection are distributive over each other. > > > Note that this is *not* the case for + and * when used with > > (mathematical) numbers... So in a sense, SETL (which uses + and * > > for union and intersection) got the operators wrong. > > But in another sense, it didn't. In Boolean algebra, "and" and "or" > (which also distribute over each other) are often written using the > same notations as multiplication and addition. There's no rule in > mathematics saying that these notations must be distributive in one > direction but not the other. > I guess everybody's high school math(s) class was different. I don't ever recall seeing + and * for boolean OR/AND; we used ∨ and ∧. I learned | and & for set operations only after I learned programming; I think it was in PL/1. But of course it stuck because of C bitwise operators (which are also boolean OR/AND and set operations). This table suggests there's a lot of variety in how these operators are spelled: https://en.wikipedia.org/wiki/List_of_logic_symbols -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg.ewing at canterbury.ac.nz Fri Mar 8 19:42:16 2019 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 09 Mar 2019 13:42:16 +1300 Subject: [Python-ideas] Left arrow and right arrow operators In-Reply-To: References: <4fd9518d-3707-7f9c-751a-b2eefa172634@email.de> <20190303150638.zv24yqah625uiypj@phdru.name> Message-ID: <5C830BE8.1030606@canterbury.ac.nz> francismb wrote: > It may be how it is now, but does that mean it > needs to always be like this? Yes, as long as you care about not breaking existing code. While you may be in the habit of always leaving a space between '<' and '-', others may have different styles. Do you really want to tell them that all their code is now wrong?
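Greg's point can be checked directly: `<-` already tokenizes as two operators today, so giving it a new meaning would silently change the value of existing expressions. A small illustration in plain current Python (nothing proposed, just what the parser does now):

```python
# `x <- y` is parsed today as `x < (-y)`: "x less than minus y".
# Any new `<-` operator would therefore break code written without
# a space between `<` and `-`.
x, y = 5, 3
assert (x <- y) == (x < -y)   # the same expression, spelled two ways
assert (x <- y) is False      # because 5 < -3 is False
```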
-- Greg From mertz at gnosis.cx Fri Mar 8 19:51:45 2019 From: mertz at gnosis.cx (David Mertz) Date: Fri, 8 Mar 2019 19:51:45 -0500 Subject: [Python-ideas] Attribute-Getter Syntax Proposal In-Reply-To: References: Message-ID: You could use the time machine: https://docs.python.org/3/library/operator.html On Fri, Mar 8, 2019, 11:57 AM Samuel Li wrote: > Don't know if this has been suggested before. Instead of writing something > like > > >>> map(lambda x: x.upper(), ['a', 'b', 'c']) > > I suggest this syntax: > >>> map(.upper(), ['a', 'b', 'c']) > > This would also work for attributes: > >>> map(.real, [1j, 2, 3+4j]) > > Internally, this would require translating > > .attribute -> lambda x: x.attribute > > and > > .method(*args, **kwargs) -> lambda x: x.method(*args, **kwargs) > > This translation should only take place where a "normal" attribute lookup > makes no sense (throws a SyntaxError); i.e. foo.bar works as before, > foo(.bar) would previously throw a SyntaxError, so the new syntax applies > and the .bar is interpreted as an attrgetter. > > This is of course only a cosmetic improvement over operator.attrgetter and > operator.methodcaller, but I think it's nice enough to warrant > consideration. > > If you like this idea or think it's utter garbage, feel free to discuss. > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mertz at gnosis.cx Fri Mar 8 19:53:55 2019 From: mertz at gnosis.cx (David Mertz) Date: Fri, 8 Mar 2019 19:53:55 -0500 Subject: [Python-ideas] Attribute-Getter Syntax Proposal In-Reply-To: References: Message-ID: I'm really old ... I remember thinking how clever attrgetter() was when it was added in Python 2.4.
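For reference, the "time machine" David points to already provides named equivalents for all three spellings in Samuel's proposal, with no new syntax; a quick stdlib-only sketch:

```python
from operator import attrgetter, itemgetter, methodcaller

# .upper()  would be  methodcaller('upper')
assert list(map(methodcaller('upper'), ['a', 'b', 'c'])) == ['A', 'B', 'C']

# .real  would be  attrgetter('real')
assert list(map(attrgetter('real'), [1j, 2, 3+4j])) == [0.0, 2, 3.0]

# [0]  would be  itemgetter(0)
assert list(map(itemgetter(0), ['abc', 'def'])) == ['a', 'd']
```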
On Fri, Mar 8, 2019, 7:51 PM David Mertz wrote: > You could use the time machine: > https://docs.python.org/3/library/operator.html > > On Fri, Mar 8, 2019, 11:57 AM Samuel Li wrote: > >> Don't know if this has been suggested before. Instead of writing >> something like >> >> >>> map(lambda x: x.upper(), ['a', 'b', 'c']) >> >> I suggest this syntax: >> >>> map(.upper(), ['a', 'b', 'c']) >> >> This would also work for attributes: >> >>> map(.real, [1j, 2, 3+4j]) >> >> Internally, this would require translating >> >> .attribute -> lambda x: x.attribute >> >> and >> >> .method(*args, **kwargs) -> lambda x: x.method(*args, **kwargs) >> >> This translation should only take place where a "normal" attribute lookup >> makes no sense (throws a SyntaxError); i.e. foo.bar works as before, >> foo(.bar) would previously throw a SyntaxError, so the new syntax applies >> and the .bar is interpreted as an attrgetter. >> >> This is of course only a cosmetic improvement over operator.attrgetter >> and operator.methodcaller, but I think it's nice enough to warrant >> consideration. >> >> If you like this idea or think it's utter garbage, feel free to discuss. >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg.ewing at canterbury.ac.nz Fri Mar 8 20:02:53 2019 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 09 Mar 2019 14:02:53 +1300 Subject: [Python-ideas] Preallocated tuples and dicts for function calls In-Reply-To: References: Message-ID: <5C8310BD.7040300@canterbury.ac.nz> Martin Bammer wrote: > what about the idea that the interpreter preallocates and preinitializes > the tuples and dicts for function calls where possible when loading a module? This would not be thread-safe. 
Locking would be needed around uses of the preallocated objects, and that might take away some or all of the gain. It would also introduce unexpected interactions between threads. -- Greg From greg.ewing at canterbury.ac.nz Fri Mar 8 20:22:26 2019 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 09 Mar 2019 14:22:26 +1300 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> <20190305161100.GE4465@ando.pearwood.info> <5C7EFAE6.8000105@canterbury.ac.nz> <8812767a-1010-0923-36d4-22af4d2665ec@brice.xyz> <5C804A08.9060902@canterbury.ac.nz> <23681.63857.133445.657998@turnbull.sk.tsukuba.ac.jp> <5C82FB7F.4000800@canterbury.ac.nz> Message-ID: <5C831552.4020604@canterbury.ac.nz> Guido van Rossum wrote: > I guess everybody's high school math(s) class was different. I don't > ever recall seeing + and * for boolean OR/AND; we used ∨ and ∧. Boolean algebra was only touched on briefly in my high school years. I can't remember exactly what notation was used, but it definitely wasn't ∨ and ∧ -- I didn't encounter those until much later. However, I've definitely seen texts on boolean algebra in relation to logic circuits that write 'A and B' as 'AB', and 'A or B' as 'A + B'. (And also use an overbar for negation instead of the mathematical ¬). Maybe it depends on whether you're a mathematician or an engineer? The multiplication-addition notation seems a lot more readable when you have a complicated boolean expression, so I can imagine it being favoured by pragmatic engineering type people.
-- Greg From steve at pearwood.info Fri Mar 8 21:21:31 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 9 Mar 2019 13:21:31 +1100 Subject: [Python-ideas] Preallocated tuples and dicts for function calls In-Reply-To: References: Message-ID: <20190309022131.GF12502@ando.pearwood.info> On Fri, Mar 08, 2019 at 10:16:02PM +0100, Martin Bammer wrote: > Hi, > > what about the idea that the interpreter preallocates and > preinitializes the tuples and dicts for function calls where possible > when loading a module? That's an implementation detail. CPython may or may not use tuples and dicts to call functions, but I don't think that's specified by the language. So we're talking about a potential optimization of one interpreter, not a language change. If the idea survives cursory discussion here, the Python-Dev mailing list is probably a better place to discuss it further. > Before calling a function then the interpreter would just need to update > the items which are dynamic and then call the function. As Greg points out, that would be unsafe when using threads. Let's say you have two threads, A and B, and both call function spam(). A wants to call spam(1, 2) and B wants to call spam(3, 4). Because of the unpredictable order that threaded code runs, we might have: A sets the argument tuple to (1, 2) B sets the argument tuple to (3, 4) B calls spam() A calls spam() # Oops! and mysterious, difficult to reproduce errors occur. It may be possible to solve this with locks, but that would probably slow code down horribly. [...] > Without the optimization the interpreter would need to: > > - create new tuple (allocate memory) > - write constant into first tuple index. > - create dict (allocate memory) > - add key+value > - add key+value > - call function Sure, and that happens at runtime, just before the function is called. But the same series of allocations would have to occur under your idea too, it would just happen when the module loads.
And then the pre-allocated tuples and dicts would hang around forever, wasting memory. Even if it turns out that the function never actually gets called: for x in sequence: if condition(x): # always returns False! function(...) the compiler will have pre-allocated the memory to call it. So I suspect this is going to be very memory hungry. Trading off memory for speed might be worthwhile, but it is a trade-off that will make certain things worse rather than better. > If this idea is possible to implement I assume the function calls would > receive a great speed improvement. Well, it might decrease the overhead of calling a function, but that's usually only a small proportion of the total time to make function calls. So it might not help as much as you expect, except in the case where you have lots and lots of function calls each of which do only a tiny amount of work. But that has to be balanced against the slowdown that occurs when the module loads, when the same memory allocations (but not deallocations) would occur. Starting up Python is already pretty slow compared to other languages, this would probably make it worse. Even if it became a nett win for some applications, for others it would likely be a nett loss. My guess is that it would probably hurt the cases which are already uncomfortably slow, while benefitting the cases that don't need much optimization. But that's just a guess, and not an especially educated guess at that. -- Steven From steve at pearwood.info Fri Mar 8 21:29:51 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 9 Mar 2019 13:29:51 +1100 Subject: [Python-ideas] Attribute-Getter Syntax Proposal In-Reply-To: References: <12598c24-d507-7f10-28db-5798834ab890@gmail.com> Message-ID: <20190309022950.GG12502@ando.pearwood.info> On Fri, Mar 08, 2019 at 03:43:10PM -0800, Chris Barker - NOAA Federal via Python-ideas wrote: > > > > Rather than using map in this way, I would recommend a list comprehension: > > Exactly!
I really don't get why folks want to use map() so much when > the comprehension syntax is often cleaner and easier. It was added for > a reason :-) Comprehensions are great for avoiding the need to write verbose lambdas before calling map: map(lambda x: x + 1, numbers) (x + 1 for x in numbers) but you typically only save a few characters, and you don't even save that when the function already exists: map(str.upper, strings) (s.upper() for s in strings) So horses for courses. In my opinion, map() looks nicer when you are calling a pre-existing named function, and comprehensions look nicer when you have an expression involving operators which would otherwise require a lambda. -- Steven From greg.ewing at canterbury.ac.nz Fri Mar 8 23:19:50 2019 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 09 Mar 2019 17:19:50 +1300 Subject: [Python-ideas] Attribute-Getter Syntax Proposal In-Reply-To: <20190309022950.GG12502@ando.pearwood.info> References: <12598c24-d507-7f10-28db-5798834ab890@gmail.com> <20190309022950.GG12502@ando.pearwood.info> Message-ID: <5C833EE6.20901@canterbury.ac.nz> If we were going to add a syntax for abbreviating lambdas, I would rather see something more generally useful, e.g. x -> x.method() as an abbreviation for lambda x: x.method() -- Greg From steve at pearwood.info Fri Mar 8 23:48:36 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 9 Mar 2019 15:48:36 +1100 Subject: [Python-ideas] Attribute-Getter Syntax Proposal In-Reply-To: <5C833EE6.20901@canterbury.ac.nz> References: <12598c24-d507-7f10-28db-5798834ab890@gmail.com> <20190309022950.GG12502@ando.pearwood.info> <5C833EE6.20901@canterbury.ac.nz> Message-ID: <20190309044833.GI12502@ando.pearwood.info> On Sat, Mar 09, 2019 at 05:19:50PM +1300, Greg Ewing wrote: > If we were going to add a syntax for abbreviating lambdas, I would > rather see something more generally useful, e.g. > > x -> x.method() > > as an abbreviation for > > lambda x: x.method() Coconut does this!
https://coconut.readthedocs.io/en/latest/DOCS.html#lambdas It allows any arbitrary expression and parameter list, rather than being limited to a single special case: lambda x, y=None, *args, **kw: spam(x)+y.eggs()-len(args)+kw['foo'] (x, y=None, *args, **kw) -> spam(x)+y.eggs()-len(args)+kw['foo'] # Saves an entire three columns! *wink* (I believe this is similar to Haskell's syntax.) Given that we have lambda though, and many people don't like anonymous functions, I doubt there's enough advantage to justify adding new syntax. I prefer the look of -> to lambda, but given that we have lambda already I wouldn't want to waste that nice looking arrow operator on something we already can do. I'd rather see it saved for something more interesting, like a hypothetical cascading (fluent) method call syntax, or pattern matching. -- Steven From jheiv at jheiv.com Fri Mar 8 23:49:28 2019 From: jheiv at jheiv.com (James Edwards) Date: Fri, 8 Mar 2019 23:49:28 -0500 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <620bcb7b-cf8b-4992-402a-d3918aab20f8@gmail.com> References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> <9C42F773-A486-479E-BD9F-14EB588E5093@gmail.com> <20190304162543.GU4465@ando.pearwood.info> <20190305233604.GG4465@ando.pearwood.info> <620bcb7b-cf8b-4992-402a-d3918aab20f8@gmail.com> Message-ID: On Fri, Mar 8, 2019 at 11:25 AM João Matos wrote: > I've just read your PEP 585 draft and have some questions. > When you say > " > > Like the merge operator and list concatenation, the difference operator > requires both operands to be dicts, while the augmented version allows any > iterable of keys. > > >>> d - {'spam', 'parrot'} > Traceback (most recent call last): > ...
> TypeError: cannot take the difference of dict and set > > >>> d -= {'spam', 'parrot'} > >>> print(d) > {'eggs': 2, 'cheese': 'cheddar'} > > >>> d -= [('spam', 999)] > >>> print(d) > {'spam': 999, 'eggs': 2, 'cheese': 'cheddar', 'aardvark': 'Ethel'} > > > " > > The option d -= {'spam', 'parrot'} where parrot does not exist in the d > dict, will raise an exception (e.g. KeyNotFound) or be silent? > > The option d -= [('spam', 999)] should remove the pair from the dict, > correct? But the print that follows still shows it there. Is it a mistake, or > am I missing something? > My understanding is that: - (Q1) Attempting to discard a key not in the target of the augmented assignment would *not* raise a KeyError (or any Exception for that matter). This is analogous to how the - operator works on sets and is consistent with the pure python implementation towards the bottom of the PEP. - (Q2) This one got me as well while implementing the proposal in cpython, but there is a difference in what "part" of the RHS the operators "care about" if the RHS isn't a dict. The += operator expects 2-tuples and will treat them as (key, value) pairs. The -= operator doesn't attempt to unpack the RHS's elements as += does and expects keys. So d -= [('spam', 999)] treated the tuple as a *key* and attempted to discard it. IOW, d = { 'spam': 999, ('spam', 999): True } d -= [('spam', 999)] Would discard the *key* ('spam', 999) and corresponding value True.
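The asymmetry James describes can be tried out in pure Python. The following is only an illustration of the draft semantics as explained above, not the PEP's actual implementation (the class name and its exact shape are mine):

```python
class SketchDict(dict):
    def __iadd__(self, other):
        # += accepts a dict or an iterable of (key, value) pairs
        self.update(other)
        return self

    def __isub__(self, other):
        # -= accepts an iterable of *keys*; absent keys are ignored
        for key in other:
            self.pop(key, None)
        return self

d = SketchDict({'spam': 999, ('spam', 999): True})
d -= [('spam', 999)]       # the tuple is treated as one key, not a pair
assert d == {'spam': 999}  # the key ('spam', 999) was discarded
d -= ['parrot']            # absent key: silently ignored, no KeyError
assert d == {'spam': 999}
```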
Which highlights a possibly surprising incongruence between the operators: d = {} update = [(1,1), (2,2), (3,3)] d += update d -= update assert d == {} # will raise, as d still has 3 items Similarly, d = {} update = {1:1, 2:2, 3:3} d += update.items() d -= update.items() assert d == {} # will raise, for the same reason d -= update.keys() assert d == {} # would pass without issue That being said, I (personally) wouldn't consider it a deal-breaker and still would very much appreciate the added functionality (regardless of the choice of operator). - Jim -------------- next part -------------- An HTML attachment was scrubbed... URL: From mrbm74 at gmail.com Sat Mar 9 03:21:33 2019 From: mrbm74 at gmail.com (Martin Bammer) Date: Sat, 9 Mar 2019 09:21:33 +0100 Subject: [Python-ideas] Preallocated tuples and dicts for function calls In-Reply-To: <5C8310BD.7040300@canterbury.ac.nz> References: <5C8310BD.7040300@canterbury.ac.nz> Message-ID: Thread safety is not a problem here because of the GIL. On Sat., 9 March 2019 at 02:03, Greg Ewing < greg.ewing at canterbury.ac.nz> wrote: > Martin Bammer wrote: > > > what about the idea that the interpreter preallocates and preinitializes > > the tuples and dicts for function calls where possible when loading a > module? > > This would not be thread-safe. Locking would be needed around uses > of the preallocated objects, and that might take away some or all > of the gain. It would also introduce unexpected interactions > between threads. > > -- > Greg > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed...
URL: From mrbm74 at gmail.com Sat Mar 9 03:29:26 2019 From: mrbm74 at gmail.com (Martin Bammer) Date: Sat, 9 Mar 2019 09:29:26 +0100 Subject: [Python-ideas] Preallocated tuples and dicts for function calls In-Reply-To: <20190309022131.GF12502@ando.pearwood.info> References: <20190309022131.GF12502@ando.pearwood.info> Message-ID: OK, right. There are some details of the idea which need to be modified: - Thread safety: Instead of locking, the thread id could be saved in the object and then checked when the object is used. If the thread id is wrong then a new object must be created. I think there is no additional locking necessary because of the GIL. - Startup time and memory waste: This idea can be improved if lazy object initialization is used. So that the tuples and dicts are only created when the function is called the first time and then these objects are kept in memory as long as the module is not unloaded. This would not hurt the startup time and save memory. One more detail which needs to be handled is recursive calling of the function. This can be easily handled by the reference counter. To keep the implementation simple and to not get too memory hungry this optimization should support just the first call level and not iterative calls. Regards, Martin On Sat., 9 March 2019 at 03:23, Steven D'Aprano < steve at pearwood.info> wrote: > On Fri, Mar 08, 2019 at 10:16:02PM +0100, Martin Bammer wrote: > > Hi, > > > > what about the idea that the interpreter preallocates and > > preinitializes the tuples and dicts for function calls where possible > > when loading a module? > > That's an implementation detail. CPython may or may not use tuples and > dicts to call functions, but I don't think that's specified by the > language. So we're talking about a potential optimization of one > interpreter, not a language change. > > If the idea survives cursory discussion here, the Python-Dev mailing > list is probably a better place to discuss it further.
> > > > Before calling a function then the interpreter would just need to update > > the items which are dynamic and then call the function. > > As Greg points out, that would be unsafe when using threads. Let's say > you have two threads, A and B, and both call function spam(). A wants to > call spam(1, 2) and B wants to call spam(3, 4). Because of the > unpredictable order that threaded code runs, we might have: > > A sets the argument tuple to (1, 2) > B sets the argument tuple to (2, 3) > B calls spam() > A calls spam() # Oops! > > and mysterious, difficult to reproduce errors occur. > > It may be possible to solve this with locks, but that would probably > slow code down horribly. > > [...] > > Without the optimization the interpreter would need to: > > > > - create new tuple (allocate memory) > > - write constant into first tuple index. > > - create dict (allocate memory) > > - add key+value > > - add key+value > > - call function > > Sure, and that happens at runtime, just before the function is called. > But the same series of allocations would have to occur under your idea > too, it would just happen when the module loads. And then the pre- > allocated tuples and dicts would hang around forever, wasting memory. > Even if it turns out that the function never actually gets called: > > for x in sequence: > if condition(x): # always returns False! > function(...) > > the compiler will have pre-allocated the memory to call it. > > So I suspect this is going to be very memory hungry. Trading off memory > for speed might be worthwhile, but it is a trade-off that will make > certain things worse rather than better. > > > > If this idea is possible to implement I assume the function calls would > > receive a great speed improvment. > > Well, it might decrease the overhead of calling a function, but that's > usually only a small proportion of the total time to make function > calls. 
So it might not help as much as you expect, except in the case > where you have lots and lots of function calls each of which do only a > tiny amount of work. > > But that has to be balanced against the slowdown that occurs when the > module loads, when the same memory allocations (but not deallocations) > would occur. Starting up Python is already pretty slow compared to other > languages, this would probably make it worse. > > Even if it became a nett win for some applications, for others it would > likely be a nett loss. My guess is that it would probably hurt the cases > which are already uncomfortably slow, while benefitting the cases that > don't need much optimization. > > But that's just a guess, and not an especially educated guess at that. > > > -- > Steven > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From storchaka at gmail.com Sat Mar 9 04:18:49 2019 From: storchaka at gmail.com (Serhiy Storchaka) Date: Sat, 9 Mar 2019 11:18:49 +0200 Subject: [Python-ideas] Preallocated tuples and dicts for function calls In-Reply-To: <5C8310BD.7040300@canterbury.ac.nz> References: <5C8310BD.7040300@canterbury.ac.nz> Message-ID: 09.03.19 03:02, Greg Ewing wrote: > Martin Bammer wrote: > >> what about the idea that the interpreter preallocates and >> preinitializes the tuples and dicts for function calls where possible >> when loading a module? > > This would not be thread-safe. Locking would be needed around uses > of the preallocated objects, and that might take away some or all > of the gain. It would also introduce unexpected interactions > between threads. Thread safety is not a problem (because of GIL). The problems are the reentrancy, functions which save references to args and kwargs, and the garbage collector.
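Serhiy's second point, callees that keep a reference to their arguments, is easy to demonstrate, and shows why one reused argument tuple can't work in general. This is an illustrative sketch with made-up names, not interpreter code:

```python
saved = []

def remember(*args):
    # A callee may legitimately keep its argument tuple alive.
    saved.append(args)

remember(1, 2)
remember(3, 4)

# Each call must receive a fresh tuple: if the interpreter had reused a
# single preallocated tuple for both calls, saved[0] would have been
# overwritten in place and would now compare equal to (3, 4).
assert saved == [(1, 2), (3, 4)]
```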
From storchaka at gmail.com Sat Mar 9 04:19:14 2019 From: storchaka at gmail.com (Serhiy Storchaka) Date: Sat, 9 Mar 2019 11:19:14 +0200 Subject: [Python-ideas] Preallocated tuples and dicts for function calls In-Reply-To: References: Message-ID: 08.03.19 23:16, Martin Bammer wrote: > what about the idea that the interpreter preallocates and preinitializes the > tuples and dicts for function calls where possible when loading a module? > > Before calling a function then the interpreter would just need to update the > items which are dynamic and then call the function. Something like this has already been implemented. Tuples and dicts use free lists, so when you allocate a small tuple or an empty dict, you usually do not use expensive memory allocating functions, but take a preallocated object from a free list. Although, this still has some overhead. This is why the property object had an attached preallocated tuple for passing the self argument to the getter function. But this was complex and error-prone code. There were at least three attempts to fix it, and new flaws were found months later after every attempt. Finally these micro-optimizations were removed, and named tuples will use a special type for getters of their attributes in 3.8. Internally, CPython uses the private "fast" calling convention for many builtin functions and methods. It allows avoiding the creation of an intermediate tuple and dict at all. In future this convention will be exposed publicly. Cython already uses it, so third-party extensions written in Cython have an advantage of using it. From jfine2358 at gmail.com Sat Mar 9 10:33:25 2019 From: jfine2358 at gmail.com (Jonathan Fine) Date: Sat, 9 Mar 2019 15:33:25 +0000 Subject: [Python-ideas] The @update operator for dictionaries Message-ID: I've been thinking that it might be easier, in the long term, to make a big step and allow >>> a @update= b as valid Python. What do you think? (I hope it will look nicer once syntax highlighted.)
For clarity, this would proceed via a.__iat_update__(b), and (a @update b) would be similarly defined. A major disadvantage, of course, will be that >>> guido@python.org would no longer be valid Python! And also we might have fewer animated discussions. -- Jonathan From boxed at killingar.net Sat Mar 9 10:57:50 2019 From: boxed at killingar.net (Anders Hovmöller) Date: Sat, 9 Mar 2019 16:57:50 +0100 Subject: [Python-ideas] The @update operator for dictionaries In-Reply-To: References: Message-ID: <4F971693-C568-4589-9FBE-EEA23FBD5C70@killingar.net> I don't understand what you mean. Can you provide examples that show the state of the dicts before and after and what the syntax would be the equivalent of in current python? > On 9 Mar 2019, at 16:33, Jonathan Fine wrote: > > I've been thinking that it might be easier, in the long term, to make > a big step and allow >>>> a @update= b > as valid Python. What do you think? (I hope it will look nicer once > syntax highlighted.) > > For clarity, this would proceed via a.__iat_update__(b), and (a > @update b) would be similarly defined. > > A major disadvantage, of course, will be that >>>> guido@python.org > would no longer be valid Python! > > And also we might have fewer animated discussions. > -- > Jonathan > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/
What do you think? (I hope it will look nicer once > syntax highlighted.) Are we supposed to know what that does? > For clarity, this would proceed via a.__iat_update__(b), and (a > @update b) would be similarly defined. For an explanation to be clear, you actually have to give an explanation. You can start with explaining the difference between the two different examples you give: a at update=b a at update b and what the default __iat_update__ does. Is this supposed to be unique to update, or will there be an infinite number of special dunder methods for arbitrary method names? a at spam=b # calls __iat_spam__ Does this apply to only dicts, or is it applicable to every object? > As major disadvantage, of course, will be that > >>> guido at python.org > would no longer be valid Python! New syntax which breaks existing code is not likely to be accepted without a *really* good reason. -- Steven From jfine2358 at gmail.com Sat Mar 9 11:12:26 2019 From: jfine2358 at gmail.com (Jonathan Fine) Date: Sat, 9 Mar 2019 16:12:26 +0000 Subject: [Python-ideas] The @update operator for dictionaries In-Reply-To: <4F971693-C568-4589-9FBE-EEA23FBD5C70@killingar.net> References: <4F971693-C568-4589-9FBE-EEA23FBD5C70@killingar.net> Message-ID: Anders Hovm?ller wrote: > I don't understand what you mean. Can you provide examples that show the state of the dicts before and after and what the syntax would be the equivalent of in current python? If a.__radd__ exists, then a += b is equivalent to a = a.__radd__(b) Similarly, if a.__iat_update__ exists then a @update= b would be equivalent to a = a.__iat_update__(b) Here's an implementation def __iat_update__(self, other): self.update(other) return self Thus, 'b' would be unchanged, and 'a' would be the same dictionary as before, but updated with 'b'. I hope this helps. 
-- Jonathan From rosuav at gmail.com Sat Mar 9 11:33:47 2019 From: rosuav at gmail.com (Chris Angelico) Date: Sun, 10 Mar 2019 03:33:47 +1100 Subject: [Python-ideas] The @update operator for dictionaries In-Reply-To: References: <4F971693-C568-4589-9FBE-EEA23FBD5C70@killingar.net> Message-ID: On Sun, Mar 10, 2019 at 3:16 AM Jonathan Fine wrote: > > Anders Hovmöller wrote: > > > I don't understand what you mean. Can you provide examples that show the state of the dicts before and after and what the syntax would be the equivalent of in current python? > > If a.__iadd__ exists, then > a += b > is equivalent to > a = a.__iadd__(b) > > Similarly, if a.__iat_update__ exists then > a @update= b > would be equivalent to > a = a.__iat_update__(b) > > Here's an implementation > def __iat_update__(self, other): > self.update(other) > return self > > Thus, 'b' would be unchanged, and 'a' would be the same dictionary as > before, but updated with 'b'. With something this long, how is it better from just writing: a = a.update_with(b) ? What's the point of an operator, especially if - by your own statement - it will backward-incompatibly change the language grammar (in ways that I've yet to understand, since you haven't really been clear on that)? ChrisA From jfine2358 at gmail.com Sat Mar 9 11:34:01 2019 From: jfine2358 at gmail.com (Jonathan Fine) Date: Sat, 9 Mar 2019 16:34:01 +0000 Subject: [Python-ideas] The @update operator for dictionaries In-Reply-To: <20190309160700.GJ12502@ando.pearwood.info> References: <20190309160700.GJ12502@ando.pearwood.info> Message-ID: Steven D'Aprano asked me to explain the difference between > a@update=b > a@update b It is the same as the difference between a += b a + b The one uses a.__iadd__(b), a.__add__(b) and so on. For the other, replace 'add' by 'at_update', and '+' by '@update'. By the way, the expressions a@update b a @update b are to be equivalent, but a @ update b is something else (and not valid Python).
> Is this supposed to be unique to update, or will there be an infinite > number of special dunder methods for arbitrary method names? There are (many) numbers between 1 and infinity. If a programmer defines __at_python__ on type(guido) then guido at python will have semantics. Steve wrote: > New syntax which breaks existing code is not likely to be > accepted without a *really* good reason. I'd like to see some real-world examples of code that would be broken. As I recall, most or all of the code examples in the python-ideas thread on the '@' operator actually write ' @ '. So they would be good. https://mail.python.org/pipermail/python-ideas/2014-March/027053.html And if otherwise a good idea, we can use the from __future__ trick to maintain compatibility. -- Jonathan From rosuav at gmail.com Sat Mar 9 11:43:53 2019 From: rosuav at gmail.com (Chris Angelico) Date: Sun, 10 Mar 2019 03:43:53 +1100 Subject: [Python-ideas] The @update operator for dictionaries In-Reply-To: References: <20190309160700.GJ12502@ando.pearwood.info> Message-ID: On Sun, Mar 10, 2019 at 3:37 AM Jonathan Fine wrote: > I'd like to see some real-world examples of code that would be broken. > As I recall, most or all of the code examples in the python-ideas > thread on the '@' operator actually write ' @ '. So they would be > good. > https://mail.python.org/pipermail/python-ideas/2014-March/027053.html > Can you start by actually defining the change to the grammar? You've casually thrown out the comment that there'll be breakage, without saying exactly what you're proposing to change. 
Currently, "x @ y" is defined as an operator, with the same precedence as other multiplication/division operators: https://github.com/python/cpython/blob/master/Grammar/Grammar#L106 If the actual Grammar file is too hard to work with, define in high level terms what you're trying to change, perhaps by referencing this table: https://docs.python.org/3/reference/expressions.html#operator-precedence ChrisA From jfine2358 at gmail.com Sat Mar 9 11:43:18 2019 From: jfine2358 at gmail.com (Jonathan Fine) Date: Sat, 9 Mar 2019 16:43:18 +0000 Subject: [Python-ideas] The @update operator for dictionaries In-Reply-To: References: <4F971693-C568-4589-9FBE-EEA23FBD5C70@killingar.net> Message-ID: Chris Angelico suggested that a = a.update_with(b) would be better than a @update= b One of the key points of += is that parent.child['toy'].wheel[3].speed += 1 increases the speed of that wheel by 1, without having to write parent.child['toy'].wheel[3].speed = parent.child['toy'].wheel[3].speed + 1 To answer Chris's other points. It is not me, but Chris and Steve, who want to bind dict.update to an operator, namely '+'. I'm suggesting that if you do that, why not call the operator 'update'. Finally, we don't yet have any real idea how much difficulty the grammar change would cause. -- Jonathan From steve at pearwood.info Sat Mar 9 11:49:14 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 10 Mar 2019 03:49:14 +1100 Subject: [Python-ideas] The @update operator for dictionaries In-Reply-To: References: <20190309160700.GJ12502@ando.pearwood.info> Message-ID: <20190309164913.GK12502@ando.pearwood.info> On Sat, Mar 09, 2019 at 04:34:01PM +0000, Jonathan Fine wrote: > There are (many) numbers between 1 and infinity. If a programmer > defines __at_python__ on type(guido) then guido at python will have > semantics. It already has meaning: it calls the @ operator with operands "guido" and "python".
> Steve wrote: > > New syntax which breaks existing code is not likely to be > > accepted without a *really* good reason. > > I'd like to see some real-world examples of code that would be broken. > As I recall, most or all of the code examples in the python-ideas > thread on the '@' operator actually write ' @ '. So they would be > good. The interpreter doesn't distinguish between "a @ b" and "a at b". Spaces around operators are always optional. The last thing we're going to do is repeat Ruby's design mistake of making code dependent on spaces around operators. Define a function in Ruby with a default value: def a(x=4) x+2 end and then evaluate the expressions: a + 1 a+ 1 a+1 a +1 The results you get will be 7, 7, 7 and 3. -- Steven From boxed at killingar.net Sat Mar 9 11:50:24 2019 From: boxed at killingar.net (=?utf-8?Q?Anders_Hovm=C3=B6ller?=) Date: Sat, 9 Mar 2019 17:50:24 +0100 Subject: [Python-ideas] The @update operator for dictionaries In-Reply-To: References: <4F971693-C568-4589-9FBE-EEA23FBD5C70@killingar.net> Message-ID: > Thus, 'b' would be unchanged, and 'a' would be the same dictionary as > before, but updated with 'b'. > > I hope this helps. It didn't. I feel just as confused as before. What does "iat" mean? Is "update" an arbitrary symbol (if so then you could have used "foo" instead of "update" to make it a lot clearer). And you didn't show before, after and equivalent functionality in current python. 
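For reference, the before/after states being asked about here can be written out with the merge idioms current Python already has:

```python
# The behaviour under discussion, shown with current Python only.
d1 = {"a": 1, "b": 2}
d2 = {"b": 20, "c": 3}

merged = {**d1, **d2}        # new dict; right operand wins on duplicate keys
print(merged)                # {'a': 1, 'b': 20, 'c': 3}
print(d1)                    # {'a': 1, 'b': 2} -- both inputs unchanged

d1.update(d2)                # in-place; returns None and mutates d1
print(d1)                    # {'a': 1, 'b': 20, 'c': 3}
```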
/ Anders From rosuav at gmail.com Sat Mar 9 11:51:49 2019 From: rosuav at gmail.com (Chris Angelico) Date: Sun, 10 Mar 2019 03:51:49 +1100 Subject: [Python-ideas] The @update operator for dictionaries In-Reply-To: References: <4F971693-C568-4589-9FBE-EEA23FBD5C70@killingar.net> Message-ID: On Sun, Mar 10, 2019 at 3:46 AM Jonathan Fine wrote: > > Chris Angelico suggested that > a = a.update_with(b) > would be better than > a @update= b > > One of the key points of += is that > parent.child['toy'].wheel[3].speed += 1 > increases the speed that that wheel by 1, without having to write > parent.child['toy'].wheel[3].speed = parent.child['toy'].wheel[3].speed + 1 > > To answer Chris's other points. It not me, but Chris and Steve who > want to bind dict.update to an operator, namely '+'. I'm suggested > that if you do that, why not call the operator 'update'. > > Finally, we don't yet have any real idea how much difficulty the > grammar change would cause. No, we don't, because you have yet to say what the grammar change would BE. Changing language grammar is a big deal. You don't just say "oh, we should do this, and hey, it's gonna break code". Steven's proposal (not mine, btw, unless you meant some other Chris?) involves giving meaning to "x + y" for different values of x and y, but doesn't change the grammar at all. If you're proposing a completely new meaning for completely new syntax, *be clear* about what you are proposing. ChrisA From steve at pearwood.info Sat Mar 9 11:55:14 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 10 Mar 2019 03:55:14 +1100 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <20190301162645.GM4465@ando.pearwood.info> References: <20190301162645.GM4465@ando.pearwood.info> Message-ID: <20190309165514.GA29550@ando.pearwood.info> Thanks to everyone who has contributed to the discussion, I have been reading all the comments even if I haven't responded. 
I'm currently working on an update to the PEP which will, I hope, improve some of the failings of the current draft. -- Steven From jfine2358 at gmail.com Sat Mar 9 12:15:23 2019 From: jfine2358 at gmail.com (Jonathan Fine) Date: Sat, 9 Mar 2019 17:15:23 +0000 Subject: [Python-ideas] The @update operator for dictionaries In-Reply-To: <20190309164913.GK12502@ando.pearwood.info> References: <20190309160700.GJ12502@ando.pearwood.info> <20190309164913.GK12502@ando.pearwood.info> Message-ID: Steven D'Aprano wrote > The last thing we're going to do is repeat Ruby's design mistake of > making code dependent on spaces around operators. I'd say that Ruby's mistake was encouraging programmers to write code that was hard for human beings to read accurately and quickly. As we say in Python, readability counts. And that's part of the PEP process. Let me clarify the additions to the grammar. 'foo' is a valid Python variable name '@foo=' is to be a valid incremental assignment operator '@foo' is to be a valid Python binary operator For clarity, we keep '@' and '@=' as binary and incremental assignment operators. The worst possible example of ambiguity and incompatibility is perhaps a@b+1 which is valid Python both before and after, but with different syntax a @ (b + 1) # Before a @b (+1) # After To return to Steve's point. A natural example of code using the extended '@' syntax, which is hard to read accurately and quickly, would probably be fatal to this suggestion. Finally, in the discussion of the '@' operator (which Steve was part of), the point was made many times that using the '@' operator made the matrix code easier to read and understand. This was a major force in the proposal. A major reason for this was its alignment with standard mathematical notation. https://mail.python.org/pipermail/python-ideas/2014-March/027053.html I'm suggesting that the grammar allow us, if we wish, to write c = a @cup b for the union of two sets.
And len(A @cup B) == len(A) + len(B) - len(A @cap B) is the very useful https://en.wikipedia.org/wiki/Inclusion%E2%80%93exclusion_principle Here, 'cup' and 'cap' are the visual names for the union and intersection operators for sets. -- Jonathan From rosuav at gmail.com Sat Mar 9 12:53:14 2019 From: rosuav at gmail.com (Chris Angelico) Date: Sun, 10 Mar 2019 04:53:14 +1100 Subject: [Python-ideas] The @update operator for dictionaries In-Reply-To: References: <20190309160700.GJ12502@ando.pearwood.info> <20190309164913.GK12502@ando.pearwood.info> Message-ID: On Sun, Mar 10, 2019 at 4:22 AM Jonathan Fine wrote: > > Steven D'Aprano wrote > > > The last thing we're going to do is repeat Ruby's design mistake of > > making code dependent on spaces around operators. > > I'd say that Ruby's mistake was encouraging programmers to write code > that was hard for human beings to read accurately and quickly. As we > say in Python, readability counts. And that's part of the PEP process. > > Let me clarify the additions to the grammar. > > 'foo' is a valid Python variable name > '@foo=' is to be a valid incremental assignment operator > '@foo' is to be a valid Python binary operator > > For clarity, we keep '@' and '@=' as binary and incremental assignment > operators. > > The worst possible example of ambiguity and incompatibility is perhaps > a at b+1 > which is valid Python both before and after, but with different syntax > a @ (b + 1) # Before > a @b (+1) # After Actually, due to operator precedence, the current interpretation is: (a @ b) + 1 That's a massive compatibility break. You're making it so the presence of whitespace around an operator not just changes its precedence, but actually changes "b" from a value to a token. There's a huge difference between: x.upper and x+upper One of them looks up the name "upper" as an attribute of whatever object 'x' is, and the other evaluates "upper" in the current context (looking for a local or global variable, or a built-in). 
Atoms and values are fundamentally different; you can replace a simple name with an expression (since they're both values), but you can't do that with an atom: x+(dispatch["upper"]) # can do exactly the same thing as x+upper x.(dispatch["upper"]) # SyntaxError You're proposing to change the @ symbol from being like the first example to being like the second... but ONLY if there's the right pattern of whitespace. I hope that this has 0% chance of happening. You'll do better to pick some other symbol, such that you're giving meaning to something that is currently an error. At least that way, there won't be code that behaves drastically differently on 3.8 and 3.9. ChrisA From paal.drange at gmail.com Sat Mar 9 13:19:44 2019 From: paal.drange at gmail.com (=?UTF-8?B?UMOlbCBHcsO4bsOlcyBEcmFuZ2U=?=) Date: Sat, 9 Mar 2019 19:19:44 +0100 Subject: [Python-ideas] The @update operator for dictionaries In-Reply-To: References: <20190309160700.GJ12502@ando.pearwood.info> <20190309164913.GK12502@ando.pearwood.info> Message-ID: On Sat, 9 Mar 2019 at 18:22, Jonathan Fine wrote: > > I'm suggesting that the grammar allow us, if we wish, to write > c = a @cup b > for the union of two sets. And > len(A @cup B) == len(A) + len(B) - len(A @cap B) You can use the infix module for that, which allows you to write - a @cup@ b - a &cup& b - a <> b - a ^cup^ b - a **cup** b - a /cup/ b - a %cup% b - a |cup| b Use that for a while, and if you like it, you can help the authors spread the word. - Pål -------------- next part -------------- An HTML attachment was scrubbed...
URL: From m at meitham.com Sat Mar 9 13:44:23 2019 From: m at meitham.com (Meitham Jamaa) Date: Sat, 9 Mar 2019 18:44:23 +0000 Subject: [Python-ideas] The @update operator for dictionaries In-Reply-To: References: <4F971693-C568-4589-9FBE-EEA23FBD5C70@killingar.net> Message-ID: <20190309184423.w3u6qrc5nehmibrl@florence.meitham.com> It might also be worth considering YAML's own dict merge operator, the "<<" operator, as in https://yaml.org/type/merge.html as this is the existing Python's shift operator added to dict and will require no change to the syntax:: a = a << b Meitham On 03/10, Chris Angelico wrote: > On Sun, Mar 10, 2019 at 3:16 AM Jonathan Fine wrote: > > > > Anders Hovmöller wrote: > > > > > I don't understand what you mean. Can you provide examples that show the state of the dicts before and after and what the syntax would be the equivalent of in current python? > > > > If a.__radd__ exists, then > > > a += b > > > is equivalent to > > > a = a.__radd__(b) > > > > > > Similarly, if a.__iat_update__ exists then > > > a @update= b > > > would be equivalent to > > > a = a.__iat_update__(b) > > > > > > Here's an implementation > > > def __iat_update__(self, other): > > > self.update(other) > > > return self > > > > > > Thus, 'b' would be unchanged, and 'a' would be the same dictionary as > > > before, but updated with 'b'. > > > > With something this long, how is it better from just writing: > > > > a = a.update_with(b) > > > > ? What's the point of an operator, especially if - by your own > > statement - it will backward-incompatibly change the language grammar > > (in ways that I've yet to understand, since you haven't really been > > clear on that)? > > > > ChrisA > > -- Meitham Jamaa http://meitham.com GPG Fingerprint: 3934D0B2 -------------- next part -------------- A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: not available URL: From jfine2358 at gmail.com Sat Mar 9 14:03:56 2019 From: jfine2358 at gmail.com (Jonathan Fine) Date: Sat, 9 Mar 2019 19:03:56 +0000 Subject: [Python-ideas] The @update operator for dictionaries In-Reply-To: References: <20190309160700.GJ12502@ando.pearwood.info> <20190309164913.GK12502@ando.pearwood.info> Message-ID: SUMMARY Acknowledgement of an error, and clarification of behaviour of '@b' vs '@ b'. I claim that once '@b' is accepted as an operator, the behaviour is perfectly natural and obvious. AN ERROR In a previous post, I misspoke. I should have written a@b+1 is valid Python before and after, but with different syntax. (a @ b) + 1 # Before - operator is '@'. a @b (+1) # After - operator is '@b'. Chris Angelico kindly pointed out that my Before value was wrong. Thank you, Chris. WHITE SPACE AND OPERATORS Chris also correctly points out that '@ b' parses as the '@' operator followed by the identifier 'b' '@b' parses as above (BEFORE) '@b' parses as the '@b' operator (AFTER) He then correctly says that in my proposal the lack of whitespace after an operator can cause the operator to absorb a following identifier. However, something similar ALREADY happens in Python. >>> a = nota = True >>> not a False >>> nota True Today, whenever a Python operator ends in a letter, and is followed by an identifier, white space or some other delimiter is required between the two. Python, rightly, refuses to guess that 'notary' might be 'not ary'. Here is another example >>> e = note = None >>> e is not e False >>> e is note True This is not quite what's happening with '@b'. With 'is not e' the following identifier 'e' absorbs the 'not' from the operator to create 'note'.
And finally >>> False is not None True >>> False is (not None) False The 'natural language' operators appear in https://docs.python.org/3/reference/expressions.html#operator-precedence In my suggestion, '@' consumes for as long as it can, first a letter, and then name characters. This is exactly the same as with 'a' or 'b'. I think this is a problem, but not nearly so bad as Chris suggests. Some people have argued that the proposed semantics for dict + dict are natural and obvious, once the behaviour of Python elsewhere is understood. I claim the same for '@b' and '@ b', once we allow '@b' as an operator (which was the whole purpose of the proposal). By the way, it's likely that most users won't know that '@' by itself is an operator, until they come to use matrices. -- Jonathan From shoyer at gmail.com Sat Mar 9 14:39:39 2019 From: shoyer at gmail.com (Stephan Hoyer) Date: Sat, 9 Mar 2019 11:39:39 -0800 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <20190309165514.GA29550@ando.pearwood.info> References: <20190301162645.GM4465@ando.pearwood.info> <20190309165514.GA29550@ando.pearwood.info> Message-ID: Would __iadd__ and __isub__ be added to collections.abc.MutableMapping? This would be consistent with other infix operations on mutable ABCs, but could potentially break backwards compatibility for anyone who has defined a MutableMapping subclass that implements __add__ but not __iadd__. On Sat, Mar 9, 2019 at 8:55 AM Steven D'Aprano wrote: > Thanks to everyone who has contributed to the discussion, I have been > reading all the comments even if I haven't responded. > > I'm currently working on an update to the PEP which will, I hope, > improve some of the failings of the current draft.
> > > -- > Steven > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From boxed at killingar.net Sat Mar 9 15:13:37 2019 From: boxed at killingar.net (=?utf-8?Q?Anders_Hovm=C3=B6ller?=) Date: Sat, 9 Mar 2019 21:13:37 +0100 Subject: [Python-ideas] The @update operator for dictionaries In-Reply-To: References: <20190309160700.GJ12502@ando.pearwood.info> <20190309164913.GK12502@ando.pearwood.info> Message-ID: <7ECF5CF6-13B8-4D9B-AE7D-46D5C9597ABD@killingar.net> Why are you continuing to insist on using the symbol "@" after all these problems have been pointed out I wonder? Just suggest $ or something else that is a syntax error today and we can move on with discussing the merits of the underlying idea itself? / Anders > On 9 Mar 2019, at 20:03, Jonathan Fine wrote: > > SUMMARY > Acknowledgement of an error, and clarification of behaviour of '@b' vs > '@ b'. I claim that once '@b' is accepted as an operator, the > behaviour is perfectly natural and obvious. > > AN ERROR > In a previous post, I mispoke. I should have written > a at b+1 > is valid Python before and after, but with different syntax. > (a @ b) + 1 # Before - operator is '@'. > a @b (+1) # After - operator is '@b'. > > Chris Angelico kindly pointed out that my Before value was wrong. > Thank you, Chris. > > WHITE SPACE AND OPERATORS > Chris also correctly points that > '@ b' is parses as the '@' operator followed by the identifier 'b' > '@b' parses as above (BEFORE) > '@b' parses as the '@b' operator (AFTER) > > He then correctly says that in my proposal the lack of whitespace > after an operator can cause the operator to absorb a following > identifier. > > However, something similar ALREADY happens in Python. 
>>>> a = nota = True >>>> not a > False >>>> nota > True > > Today, whenever a Python operator ends in a letter, and is followed by > an identifier, white space is or some other delimiter is required > between the two. Python, rightly, refuse to guess that 'notary' might > be 'not ary'. > > Here is another example >>>> e = note = None >>>> e is not e > False >>>> e is note > True > > This is not quite what's happening with '@b'. With 'is not e' the > following identifier 'e' absorbs the 'not' from the operator to create > 'note'. > > And finally >>>> False is not None > True >>>> False is (not None) > False > > The 'natural language' operators appear in > https://docs.python.org/3/reference/expressions.html#operator-precedence > > In my suggestion, '@' consumes for as long it can, first a letter, and > then name characters. This is exactly the same as with 'a' or 'b'. > > I think this is a problem, but not nearly so bad as Chris suggests. > Some people have argued that the proposed semantics for dict + dict > are natural and obvious, once the behaviour of Python elsewhere is > understood. I claim the same for '@b' and '@ b', once we allow '@b' as > an operator (which was the whole purpose of the proposal. > > By the way, it's likely that most users won't know that '@' by itself > is an operator, until they come to use matrices. 
> > -- > Jonathan > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From rosuav at gmail.com Sat Mar 9 15:41:42 2019 From: rosuav at gmail.com (Chris Angelico) Date: Sun, 10 Mar 2019 07:41:42 +1100 Subject: [Python-ideas] The @update operator for dictionaries In-Reply-To: References: <20190309160700.GJ12502@ando.pearwood.info> <20190309164913.GK12502@ando.pearwood.info> Message-ID: On Sun, Mar 10, 2019 at 6:07 AM Jonathan Fine wrote: > He then correctly says that in my proposal the lack of whitespace > after an operator can cause the operator to absorb a following > identifier. > > However, something similar ALREADY happens in Python. > >>> a = nota = True > >>> not a > False > >>> nota > True > > Today, whenever a Python operator ends in a letter, and is followed by > an identifier, white space is or some other delimiter is required > between the two. Python, rightly, refuse to guess that 'notary' might > be 'not ary'. > > Here is another example > >>> e = note = None > >>> e is not e > False > >>> e is note > True > > This is not quite what's happening with '@b'. With 'is not e' the > following identifier 'e' absorbs the 'not' from the operator to create > 'note'. Python's grammar is defined in terms of tokens. There is a specific token 'not' which can be used in three ways: either 'not' followed by a valid expression, or as part of the operators 'not in' and 'is not', both of which are then followed (and preceded) by expressions. If the parser sees 'note', it doesn't see the token 'not'; it sees a NAME token. That's not the case with '@'. There is no way that '@foo' could be a single NAME token, because an at sign cannot be part of a NAME. 
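The tokenizer behaviour described here is easy to check with the standard tokenize module. A short sketch: at this level keywords such as 'not' are reported as NAME tokens, while '@' is always a separate OP token, with or without surrounding whitespace:

```python
# Checking the tokenizer's view: 'note' is a single NAME, while '@b'
# is always two tokens (OP then NAME), whatever the whitespace.
import io
import tokenize

def toks(source):
    # Keep only NAME and OP tokens, dropping NEWLINE/ENDMARKER noise.
    return [
        (tokenize.tok_name[t.type], t.string)
        for t in tokenize.generate_tokens(io.StringIO(source).readline)
        if t.type in (tokenize.NAME, tokenize.OP)
    ]

print(toks("note"))    # [('NAME', 'note')]
print(toks("not e"))   # [('NAME', 'not'), ('NAME', 'e')]
print(toks("a @b"))    # [('NAME', 'a'), ('OP', '@'), ('NAME', 'b')]
print(toks("a @ b"))   # same three tokens -- the whitespace is irrelevant
```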
There is a VAST difference, both to humans and to the parser, between "note" and "@e", and the fact that "not e" is different from "note" does not mean that "@e" can be different from "@ e". As Anders says, pick something that's not currently valid syntax and then you won't be up against this problem. ChrisA From robertve92 at gmail.com Sat Mar 9 15:49:10 2019 From: robertve92 at gmail.com (Robert Vanden Eynde) Date: Sat, 9 Mar 2019 23:49:10 +0300 Subject: [Python-ideas] Attribute-Getter Syntax Proposal In-Reply-To: References: Message-ID: You can do: I suggest this syntax: > >>> map(.upper(), ['a', 'b', 'c']) > map(dot('upper'), 'a b c'.split()) map(dot('replace', 'x', 'y'), 'xo do ox'.split()) def dot(name, *args, **kwargs): return lambda self: getattr(self, name)(*args, **kwargs) > This would also work for attributes: > >>> map(.real, [1j, 2, 3+4j]) > from operator import attrgetter map(attrgetter('real'), [...]) Also, check out my package funcoperators on pip for neat functional programming syntaxes https://pypi.org/project/funcoperators/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From jfine2358 at gmail.com Sat Mar 9 16:13:25 2019 From: jfine2358 at gmail.com (Jonathan Fine) Date: Sat, 9 Mar 2019 21:13:25 +0000 Subject: [Python-ideas] The $update operator for dictionaries In-Reply-To: References: <20190309160700.GJ12502@ando.pearwood.info> <20190309164913.GK12502@ando.pearwood.info> Message-ID: I'm adopting an idea suggested by Anders and Chris. To allow us better to focus on the main idea and purpose, I've replaced '@' by '$' in the initial suggestion. And if the main idea is accepted, we can if needed have a secondary discussion regarding the details of the syntax. Here's the restatement. I've also changed the subject line. I've been thinking that it might be easier, in the long term, to make a big step and allow >>> a $update= b as valid Python.
What do you think? (I hope it will look nicer once syntax highlighted.) For clarity, this would proceed via something like a.__idl_update__(b), and (a $update b) would be similarly defined. (Here, 'idl' stands for 'incremented dollar'.) -- Jonathan From jfine2358 at gmail.com Sat Mar 9 16:47:13 2019 From: jfine2358 at gmail.com (Jonathan Fine) Date: Sat, 9 Mar 2019 21:47:13 +0000 Subject: [Python-ideas] The $update operator for dictionaries In-Reply-To: References: <20190309160700.GJ12502@ando.pearwood.info> <20190309164913.GK12502@ando.pearwood.info> Message-ID: A good starting point for discussing the main idea is: PEP 465 -- A dedicated infix operator for matrix multiplication https://www.python.org/dev/peps/pep-0465 Matrix multiplication is one of many special binary mathematical operators. PEP 465 successfully argues the merits of introducing a special operator for matrix multiplication. This thread starts from a discussion of the merits of binding dict.update to an operator. (For update, '+', '|' and '<<' are the leading candidate symbols.) Matrices and linear algebra are not the only part of mathematics that is usually expressed with infix operators. Thus, I suggest that the main questions are: 1. In practice, how important are additional infix operators to the Python community? 2. Can we harmoniously extend Python to accommodate these new operators? Here, from PEP 465, are some highlights of the benefits. Infix @ dramatically improves matrix code usability at all stages of programmer interaction. A large proportion of scientific code is written by people who are experts in their domain, but are not experts in programming. For these kinds of users, whose programming knowledge is fragile, the existence of a transparent mapping between formulas and code often means the difference between succeeding and failing to write that code at all.
Most mathematical and scientific formulas can be written in LaTeX notation, which gives standard names for the infix operators mathematicians use. There is no transparent and obvious mapping from the present operators to those used in mathematics. https://docs.python.org/3/reference/lexical_analysis.html?#operators Using Unicode symbols for the math operators is probably unwise. Better, I suggest, is to use the LaTeX names. There is some evidence (the wish to bind dict.update to an infix operator) that outside of mathematics there is a demand for custom infix operators. -- Jonathan From steve at pearwood.info Sat Mar 9 18:24:27 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 10 Mar 2019 10:24:27 +1100 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <20190309165514.GA29550@ando.pearwood.info> Message-ID: <20190309232426.GL12502@ando.pearwood.info> On Sat, Mar 09, 2019 at 11:39:39AM -0800, Stephan Hoyer wrote: > Would __iadd__ and __isub__ be added to collections.abc.MutableMapping? No, that will not be part of the PEP. The proposal is only to change dict itself. If people want to add this to MutableMapping, that could be considered separately. -- Steven From greg.ewing at canterbury.ac.nz Sat Mar 9 19:13:24 2019 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 10 Mar 2019 13:13:24 +1300 Subject: [Python-ideas] The @update operator for dictionaries In-Reply-To: References: <4F971693-C568-4589-9FBE-EEA23FBD5C70@killingar.net> Message-ID: <5C8456A4.8030808@canterbury.ac.nz> Jonathan Fine wrote: > It not me, but Chris and Steve who > want to bind dict.update to an operator, namely '+'. I'm suggested > that if you do that, why not call the operator 'update'. One reason would be that '+' is short, whereas 'update' is long. A large part of the reason that common operations are written using infix operators is that the operator symbols used are very compact.
That benefit disappears if your operator is an entire word. -- Greg From ian at feete.org Sat Mar 9 19:39:33 2019 From: ian at feete.org (Ian Foote) Date: Sun, 10 Mar 2019 00:39:33 +0000 Subject: [Python-ideas] The @update operator for dictionaries In-Reply-To: <20190309184423.w3u6qrc5nehmibrl@florence.meitham.com> References: <4F971693-C568-4589-9FBE-EEA23FBD5C70@killingar.net> <20190309184423.w3u6qrc5nehmibrl@florence.meitham.com> Message-ID: > It might also be worth considering YAML's own dict merge operator, the > "<<" operator, as in https://yaml.org/type/merge.html as this is the > existing Python's shift operator added to dict and will require no > change to the synatx:: > > a = a << b I really like this suggestion. It captures the asymmetry, since we could have a = a >> b to merge with the other dictionary's keys taking precedence. My instinct is that a = a << b would take b's values when keys collide and a = a >> b would take a's values when keys collide. I'd be very interested to know if this matches most people's intuitions. On Sat, 9 Mar 2019 at 18:44, Meitham Jamaa wrote: > It might also be worth considering YAML's own dict merge operator, the > "<<" operator, as in https://yaml.org/type/merge.html as this is the > existing Python's shift operator added to dict and will require no > change to the synatx:: > > a = a << b > > Meitham > > > On 03/10, Chris Angelico wrote: > > On Sun, Mar 10, 2019 at 3:16 AM Jonathan Fine > wrote: > > > > > > Anders Hovmöller wrote: > > > > > > > I don't understand what you mean. Can you provide examples that show > the state of the dicts before and after and what the syntax would be the > equivalent of in current python?
> > > > > > If a.__iadd__ exists, then > > > a += b > > > is equivalent to > > > a = a.__iadd__(b) > > > > > > Similarly, if a.__iat_update__ exists then > > > a @update= b > > > would be equivalent to > > > a = a.__iat_update__(b) > > > > > > Here's an implementation > > > def __iat_update__(self, other): > > > self.update(other) > > > return self > > > > > > Thus, 'b' would be unchanged, and 'a' would be the same dictionary as > > > before, but updated with 'b'. > > > > With something this long, how is it better than just writing: > > > > a = a.update_with(b) > > > > ? What's the point of an operator, especially if - by your own > > statement - it will backward-incompatibly change the language grammar > > (in ways that I've yet to understand, since you haven't really been > > clear on that)? > > > > ChrisA > > > > -- > Meitham Jamaa > > http://meitham.com > GPG Fingerprint: 3934D0B2 > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From benrudiak at gmail.com Sat Mar 9 19:42:40 2019 From: benrudiak at gmail.com (Ben Rudiak-Gould) Date: Sat, 9 Mar 2019 16:42:40 -0800 Subject: [Python-ideas] The @update operator for dictionaries In-Reply-To: <5C8456A4.8030808@canterbury.ac.nz> References: <4F971693-C568-4589-9FBE-EEA23FBD5C70@killingar.net> <5C8456A4.8030808@canterbury.ac.nz> Message-ID: On Sat, Mar 9, 2019 at 4:14 PM Greg Ewing wrote: > > A large part of the reason that common operations are written > using infix operators is that the operator symbols used are very > compact. That benefit disappears if your operator is an entire > word. I suppose people bring up Haskell too much, but it does work in Haskell. People write things like (item `notElem` list) all the time and it's readable enough.
In Haskell, though, it's sugar for (notElem item list), or notElem(item, list) in Pythonish syntax. In Python, it'd in most cases be sugar for a method call, in which the method name already appears in infix position, so the benefit is less clear. Given that Python's so-called augmented assignments are really mutating operations in disguise anyway (x op= y is not equivalent to x = x op y when x is mutable), I don't see any advantage of a new assignment syntax over the existing mutating methods. I.e., instead of x @update= y, you can just write x.update(y). From mertz at gnosis.cx Sat Mar 9 21:06:03 2019 From: mertz at gnosis.cx (David Mertz) Date: Sat, 9 Mar 2019 21:06:03 -0500 Subject: [Python-ideas] The @update operator for dictionaries In-Reply-To: References: <4F971693-C568-4589-9FBE-EEA23FBD5C70@killingar.net> <20190309184423.w3u6qrc5nehmibrl@florence.meitham.com> Message-ID: Maybe it's just the C++ IO piping that makes me like it, but these actually seem intuitive to me, whereas `+` or even `|` leaves me queasy. On Sat, Mar 9, 2019 at 7:40 PM Ian Foote wrote: > > It might also be worth considering YAML's own dict merge operator, the > > "<<" operator, as in https://yaml.org/type/merge.html as this is the > > existing Python's shift operator added to dict and will require no > > change to the syntax:: > > > > a = a << b > > I really like this suggestion. It captures the asymmetry, since we could > have a = a >> b to merge with the other dictionary's keys taking precedence. > > My instinct is that a = a << b would take b's values when keys collide and > a = a >> b would take a's values when keys collide. I'd be very interested > to know if this matches most people's intuitions.
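A minimal sketch of the << and >> semantics Ian describes, using a hypothetical dict subclass (illustration only, not part of any proposal's reference implementation):

```python
class MergeDict(dict):
    """Dict subclass illustrating the proposed << and >> merge operators."""

    def __lshift__(self, other):
        # a << b: a copy of a updated with b, so b's values win on collisions
        result = MergeDict(self)
        result.update(other)
        return result

    def __rshift__(self, other):
        # a >> b: a copy of b updated with a, so a's values win on collisions
        result = MergeDict(other)
        result.update(self)
        return result

a = MergeDict(x=1, y=2)
b = MergeDict(y=20, z=30)
print(a << b)  # {'x': 1, 'y': 20, 'z': 30}
print(a >> b)  # {'y': 2, 'z': 30, 'x': 1}
```

Either way the operands are left untouched and a new mapping of the same type is returned, which is the asymmetry-preserving behaviour under discussion.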
> > On Sat, 9 Mar 2019 at 18:44, Meitham Jamaa wrote: >> It might also be worth considering YAML's own dict merge operator, the >> "<<" operator, as in https://yaml.org/type/merge.html as this is the >> existing Python's shift operator added to dict and will require no >> change to the syntax:: >> >> a = a << b >> >> Meitham >> >> >> On 03/10, Chris Angelico wrote: >> > On Sun, Mar 10, 2019 at 3:16 AM Jonathan Fine >> wrote: >> > > >> > > Anders Hovmöller wrote: >> > > >> > > > I don't understand what you mean. Can you provide examples that >> show the state of the dicts before and after and what the syntax would be >> the equivalent of in current python? >> > > >> > > If a.__iadd__ exists, then >> > > a += b >> > > is equivalent to >> > > a = a.__iadd__(b) >> > > >> > > Similarly, if a.__iat_update__ exists then >> > > a @update= b >> > > would be equivalent to >> > > a = a.__iat_update__(b) >> > > >> > > Here's an implementation >> > > def __iat_update__(self, other): >> > > self.update(other) >> > > return self >> > > >> > > Thus, 'b' would be unchanged, and 'a' would be the same dictionary as >> > > before, but updated with 'b'. >> > >> > With something this long, how is it better than just writing: >> > >> > a = a.update_with(b) >> > >> > ? What's the point of an operator, especially if - by your own >> > statement - it will backward-incompatibly change the language grammar >> > (in ways that I've yet to understand, since you haven't really been >> > clear on that)?
>> > >> > ChrisA >> > >> >> -- >> Meitham Jamaa >> >> http://meitham.com >> GPG Fingerprint: 3934D0B2 >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th. -------------- next part -------------- An HTML attachment was scrubbed... URL: From benrudiak at gmail.com Sun Mar 10 15:08:36 2019 From: benrudiak at gmail.com (Ben Rudiak-Gould) Date: Sun, 10 Mar 2019 12:08:36 -0700 Subject: [Python-ideas] Preallocated tuples and dicts for function calls In-Reply-To: <20190309022131.GF12502@ando.pearwood.info> References: <20190309022131.GF12502@ando.pearwood.info> Message-ID: On Fri, Mar 8, 2019 at 6:23 PM Steven D'Aprano wrote: > A sets the argument tuple to (1, 2) > B sets the argument tuple to (2, 3) > B calls spam() > A calls spam() # Oops! I'm pretty sure the idea was to have constant tuples (1, 2) and (3, 4) in the module instead of LOAD_CONST 1/2/3/4 instructions in the bytecode. There's no setting of a hidden global variable involved. The kwargs dicts are a harder problem. I suppose they would have to be copy-on-write which would add too much complexity, or the language would have to be changed to allow/require kwargs to be a frozendict. > And then the pre- > allocated tuples and dicts would hang around forever, wasting memory. 
> Even if it turns out that the function never actually gets called: > > for x in sequence: > if condition(x): # always returns False! > function(...) > > the compiler will have pre-allocated the memory to call it. The bytecode for "function(...)" already hangs around forever even if it's never run. There is no need for tuple constants because you can generate LOAD_CONST a; LOAD_CONST b; ...; BUILD_TUPLE n at each usage point instead, but CPython has tuple constants, so they must have some space and/or speed benefit that was considered significant enough to be worth implementing them. It seems like args constants for function calls can be justified on similar grounds. From pythonchb at gmail.com Sun Mar 10 16:18:11 2019 From: pythonchb at gmail.com (Christopher Barker) Date: Sun, 10 Mar 2019 13:18:11 -0700 Subject: [Python-ideas] The @update operator for dictionaries In-Reply-To: References: <4F971693-C568-4589-9FBE-EEA23FBD5C70@killingar.net> <20190309184423.w3u6qrc5nehmibrl@florence.meitham.com> Message-ID: Are you all REALLY proposing more operators? Adding @ made sense because there was an important use case for which there was no existing operator to use. But in this case, we have + and | available, both of which are pretty good options. Finally, while dicts are a very important use case, do we want to add an operator just for that? What would it mean for sets, for instance? I have to say, the whole discussion seems to me to be a massive bike-shedding exercise -- the original proposal was to simply define + for dicts. It's totally reasonable to not like that idea at all, or to propose that | is a better option, but this has really gone off the rails!
I guess I say that because this wasn't started with a critical use-case that really needed a solution, but rather: "the + operator isn't being used for dicts, why not make a semi-common operation easily available" So my opinion, which I'm re-stating: using + to merge dicts is simple, non-disruptive, and unlikely to really confuse anyone - so why not? ( | would be OK, too, though I think a tad less accessible to newbies) But I don't think that having an operator to merge dicts is a critical use-case worthy of adding a new operator or new syntax to the language. -CHB -- Christopher Barker, PhD Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython -------------- next part -------------- An HTML attachment was scrubbed... URL: From simon.bordeyne at gmail.com Sun Mar 10 18:01:46 2019 From: simon.bordeyne at gmail.com (Simon) Date: Sun, 10 Mar 2019 23:01:46 +0100 Subject: [Python-ideas] New use for the 'in' keyword. Message-ID: Python's 'in' keyword already has several use cases, whether it's for testing inclusion in a set, or to iterate over that set; nevertheless, I think we could add one more function to that keyword. It's not uncommon to see star imports in some sources. The reasons that people use star imports are mainly the following: - ease of use of the module's function - overwriting parts of a module with another module's similarly named functions. Obviously, there are plenty of cases where you don't want your module's functions to be overwritten, but sometimes, you do. I'm suggesting that the in keyword would be used to import a module in another module's namespace, with the following syntax: import numpy as np import math in np This has three advantages over star imports: 1/ You keep a namespace.
For maintainability reasons, it's best to use namespaces 2/ You can overwrite parts of a module with another one 3/ IDEs would know which modules have been imported and which functions are defined there, which they can't have with star imports. Additionally, the following syntax would also be valid: import math import numpy in math -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Sun Mar 10 18:09:22 2019 From: rosuav at gmail.com (Chris Angelico) Date: Mon, 11 Mar 2019 09:09:22 +1100 Subject: [Python-ideas] New use for the 'in' keyword. In-Reply-To: References: Message-ID: On Mon, Mar 11, 2019 at 9:02 AM Simon wrote: > > Python's 'in' keyword already has several use cases, whether it's for testing inclusion in a set, or to iterate over that set; nevertheless, I think we could add one more function to that keyword. > > It's not uncommon to see star imports in some sources. The reasons that people use star imports are mainly the following: > > - ease of use of the module's function > - overwriting parts of a module with another module's similarly named functions. > > Obviously, there are plenty of cases where you don't want your module's functions to be overwritten, but sometimes, you do. I'm suggesting that the in keyword would be used to import a module in another module's namespace, with the following syntax: > > import numpy as np > import math in np > > This has three advantages over star imports: > > 1/ You keep a namespace. For maintainability reasons, it's best to use namespaces > 2/ You can overwrite parts of a module with another one > 3/ IDEs would know which modules have been imported and which functions are defined there, which they can't have with star imports. I'm not entirely sure what the effect of the second statement would be. Is it like doing "np.sin = math.sin; np.sqrt = math.sqrt" for every name in the math module? That seems like a massive amount of monkeypatching.
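For concreteness, that first reading — copying every public name from math into the numpy namespace — would amount to roughly the following. This is a sketch against a stand-in namespace, since `import math in np` is not real syntax and patching the real numpy module would be destructive:

```python
import math
from types import SimpleNamespace

# Stand-in for the target module, so the sketch doesn't monkeypatch numpy.
np = SimpleNamespace()

# Rough desugaring of the proposed "import math in np":
# bind every public name from math as an attribute of np.
for name in dir(math):
    if not name.startswith('_'):
        setattr(np, name, getattr(math, name))

print(np.sqrt(16.0))  # 4.0 -- math.sqrt, reachable through the np namespace
```

Done against the real numpy, the loop would silently shadow numpy's own sin, cos, sqrt, etc. with the math versions, which is exactly the monkeypatching concern raised here.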
Or is it simply "np.math = math", so you could then write "np.math.sqrt"? Less problematic, but also not all that useful. Or is it something else? Clarify please? ChrisA From jamtlu at gmail.com Sun Mar 10 22:36:24 2019 From: jamtlu at gmail.com (James Lu) Date: Sun, 10 Mar 2019 22:36:24 -0400 Subject: [Python-ideas] The $update operator for dictionaries In-Reply-To: References: <20190309160700.GJ12502@ando.pearwood.info> <20190309164913.GK12502@ando.pearwood.info> Message-ID: This is a horrible idea. I proposed to Mr. Fine earlier that we adopt a << operator. d1 << d2 merges d2 into a copy of d1 and returns it, with keys from d2 overriding keys from d1. On Sat, Mar 9, 2019 at 4:50 PM Jonathan Fine wrote: > A good starting point for discussing the main idea is: > PEP 465 -- A dedicated infix operator for matrix multiplication > https://www.python.org/dev/peps/pep-0465 > > Matrix multiplication is one of many special binary mathematical > operators. PEP 465 successfully argues the merits of introducing a > special operator for matrix multiplication. This thread starts from a > discussion of the merits of binding dict.update to an operator. (For > update, '+', '|' and '<<' are the leading candidate symbols.) > > Matrices and linear algebra are not the only part of mathematics that > is usually expressed with infix operators. Thus, I suggest that the > main questions are: > > 1. In practice, how important are additional infix operators to the > Python community? > 2. Can we harmoniously extend Python to accommodate these new operators? > > Here, from PEP 465, are some highlights of the benefits. > > > Infix @ dramatically improves matrix code usability at all stages of > programmer interaction. > > A large proportion of scientific code is written by people who are > experts in their domain, but are not experts in programming.
> > For these kinds of users, whose programming knowledge is fragile, the > existence of a transparent mapping between formulas and code often > means the difference between succeeding and failing to write that code > at all. > > > Most mathematical and scientific formulas can be written in LaTeX > notation, which gives standard names for the infix operators > mathematicians use. > > There is no transparent and obvious mapping from the present operators > to those used in mathematics. > https://docs.python.org/3/reference/lexical_analysis.html#operators > > Using Unicode symbols for the math operators is probably unwise. > Better, I suggest, is to use the LaTeX names. > > There is some evidence (the wish to bind dict.update to an infix > operator) that outside of mathematics there is a demand for custom > infix operators. > > -- > Jonathan > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jfine2358 at gmail.com Mon Mar 11 06:27:29 2019 From: jfine2358 at gmail.com (Jonathan Fine) Date: Mon, 11 Mar 2019 10:27:29 +0000 Subject: [Python-ideas] New use for the 'in' keyword. In-Reply-To: References: Message-ID: Hi Simon You've suggested allowing import numpy as np import math in np as an improvement on from somewhere import * But you can get a similar result already, by writing in your package from . import aaa as np and within mypackage/aaa.py writing from numpy import * from math import * When you write import numpy as np the value of the identifier becomes the numpy module. Your proposal aims to add to that module additional key-value pairs. (A module is a bit like a dict.) And perhaps clobber some existing resources. It's generally best to treat a module as a read-only object. That way, you don't get side-effects.
This is why Chris was talking about monkey-patching. -- Jonathan From jamtlu at gmail.com Sun Mar 10 23:48:59 2019 From: jamtlu at gmail.com (James Lu) Date: Sun, 10 Mar 2019 23:48:59 -0400 Subject: [Python-ideas] from __future__ import runtime_default_kwargs Message-ID: When from __future__ import runtime_default_kwargs is run, def a(b=1, c=b+2, d=[]): pass behaves as (if the peephole optimizer didn't exist) def a(b=None, c=None, d=None): if b is None: b = 1 if c is None: c = b + 2 if d is None: d = [] i.e. the keyword expression is evaluated at runtime. Perhaps a restriction on 'literals only' can be made so people don't abuse this. -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Mon Mar 11 07:33:26 2019 From: rosuav at gmail.com (Chris Angelico) Date: Mon, 11 Mar 2019 22:33:26 +1100 Subject: [Python-ideas] from __future__ import runtime_default_kwargs In-Reply-To: References: Message-ID: On Mon, Mar 11, 2019 at 10:28 PM James Lu wrote: > > When > > from __future__ import runtime_default_kwargs > > > > is run, > > def a(b=1, c=b+2, d=[]): > pass > > behaves as (if the peephole optimizer didn't exist) > > def a(b=None, c=None, d=None): > if b is None: > b = 1 > if c is None: > c = b + 2 > if d is None: > d = [] > > i.e. the keyword expression is evaluated at runtime. > > Perhaps a restriction on 'literals only' can be made so people don't abuse this. A future directive is appropriate only if the language is planned to be changed. Is that what you're actually asking for? If so, ask for that (and then the future directive becomes a transition tool). As an alternative, consider tagging the specific function, not the whole module (future directives always apply to the entire module). The best way to tag a function would probably be a decorator.
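A per-function decorator along those lines might look like the sketch below. The name `late_defaults` and the zero-argument-factory convention are invented for illustration; it handles only defaults supplied by keyword and doesn't attempt inter-parameter defaults like c=b+2:

```python
import functools

_MISSING = object()  # sentinel, so that None stays usable as a real argument

def late_defaults(**default_factories):
    """Re-evaluate the given zero-argument factories on every call."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for name, factory in default_factories.items():
                if kwargs.get(name, _MISSING) is _MISSING:
                    kwargs[name] = factory()  # fresh value per call
            return func(*args, **kwargs)
        return wrapper
    return decorate

@late_defaults(d=list)
def append_to(item, d=_MISSING):
    d.append(item)
    return d

print(append_to(1))  # [1] -- a fresh list on every call
print(append_to(2))  # [2], not [1, 2]
```

This keeps the language's early binding untouched while letting one tagged function opt into late-bound defaults.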
ChrisA From jfine2358 at gmail.com Mon Mar 11 07:34:20 2019 From: jfine2358 at gmail.com (Jonathan Fine) Date: Mon, 11 Mar 2019 11:34:20 +0000 Subject: [Python-ideas] from __future__ import runtime_default_kwargs In-Reply-To: References: Message-ID: Thank you, James, for your idea. For the benefit of those who may not know, please explain the problem you wish to solve. That way we could suggest, discuss and compare other solutions. -- Jonathan From rhodri at kynesim.co.uk Mon Mar 11 07:41:00 2019 From: rhodri at kynesim.co.uk (Rhodri James) Date: Mon, 11 Mar 2019 11:41:00 +0000 Subject: [Python-ideas] The $update operator for dictionaries In-Reply-To: References: <20190309160700.GJ12502@ando.pearwood.info> <20190309164913.GK12502@ando.pearwood.info> Message-ID: <6a7d6d6b-aa23-6e3d-cadf-5c3bc7ac1916@kynesim.co.uk> On 09/03/2019 21:13, Jonathan Fine wrote: > I'm adopting an idea suggested by Anders and Chris. > > To allow us better to focus on the main idea and purpose, I've > replaced '@' by '$' in the initial suggestion. And if the main idea is > accepted, we can if needed have an secondary discussion regarding the > details of the syntax. > > Here's the restatement. I've also changed the subject line. > > I've been thinking that it might be easier, in the long term, to make > a big step and allow > >>> a $update= b > as valid Python. What do you think? (I hope it will look nicer once > syntax highlighted.) I think it's horribly Perl-like, but I'll reserve judgement until you tell us exactly what you expect this to achieve. So far you appear to be trying to create random operators to no great purpose. What is the problem you are seeking to solve here, and why is it worse than the problems you are creating? 
-- Rhodri James *-* Kynesim Ltd From steve at pearwood.info Mon Mar 11 07:58:17 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 11 Mar 2019 22:58:17 +1100 Subject: [Python-ideas] from __future__ import runtime_default_kwargs In-Reply-To: References: Message-ID: <20190311115817.GN12502@ando.pearwood.info> Hi James, Some weeks ago, you started a discussion here about "Clearer Communication". Here's another suggestion to help: don't expect your readers to either guess, or infer from the code, what your proposal means. As the Zen of Python says: Explicit is better than implicit. Looking at your sample code, I'm guessing that you want support for late binding of function parameter defaults. Python uses early binding, see the FAQs: https://docs.python.org/3/faq/programming.html#why-are-default-values-shared-between-objects Am I correct? If not, can you please explain what it is that you are actually suggesting. Note that early binding is not a bug to be fixed, it is a design choice which is sometimes useful and sometimes not useful. Mutable defaults being shared is sometimes a good feature to have, and for immutable defaults, early binding is more efficient. I don't think that changing the behaviour will be acceptable, I know I would argue strongly in favour of keeping early binding. (If the language defaults to early binding, it is easy to implement late binding in the body of the function; but if the language uses late binding, it is difficult and annoying to get early binding when you want it. We should stick to the status quo.) But having a nice, clean way to get late binding semantics as an alternative might be acceptable, if we can agree on acceptable syntax and a mechanism.
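The contrast Steven describes — early binding by default, late binding via the usual sentinel idiom in the function body — is easy to demonstrate:

```python
def shared(item, bucket=[]):
    # The default list is evaluated once, at def time, and shared by all calls.
    bucket.append(item)
    return bucket

def fresh(item, bucket=None):
    # Sentinel idiom: late binding, implemented in the body of the function.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket

print(shared(1), shared(2))  # [1, 2] [1, 2] -- both names refer to the one list
print(fresh(1), fresh(2))    # [1] [2] -- a new list per call
```

This is exactly the easy direction Steven mentions: with early binding in the language, opting into late binding takes two lines in the body; the reverse would be much harder to express.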
-- Steven From jfine2358 at gmail.com Mon Mar 11 08:17:25 2019 From: jfine2358 at gmail.com (Jonathan Fine) Date: Mon, 11 Mar 2019 12:17:25 +0000 Subject: [Python-ideas] from __future__ import runtime_default_kwargs In-Reply-To: <20190311115817.GN12502@ando.pearwood.info> References: <20190311115817.GN12502@ando.pearwood.info> Message-ID: Steven D'Aprano wrote: > Some weeks ago, you started a discussion here about "Clearer > Communication". Here's another suggestion to help: don't expect your > readers to either guess, or infer from the code, what your proposal > means. As the Zen of Python says: > > Explicit is better than implicit. For me, the canonical guidelines for the use of this list are [1] http://python.org/psf/codeofconduct/ Summary: Open, Considerate, Respectful [2] https://mail.python.org/mailman/listinfo/python-ideas This list is to contain discussion of speculative language ideas for Python for possible inclusion into the language. If an idea gains traction it can then be discussed and honed to the point of becoming a solid proposal to put to python-dev as appropriate. [3] https://devguide.python.org/coredev/#responsibilities As a core developer, there are certain things that are expected of you. First and foremost, be a good person. This might sound melodramatic, but you are now a member of the Python project and thus represent the project and your fellow core developers whenever you discuss Python with anyone. We have a reputation for being a very nice group of people and we would like to keep it that way. Core developers responsibilities include following the PSF Code of Conduct. ASIDE The system puts the first two URLs at the foot of every email it sends out. It might help if it also added https://devguide.python.org/ I'll suggest that to the forum moderators. 
-- Jonathan From rhodri at kynesim.co.uk Mon Mar 11 08:28:06 2019 From: rhodri at kynesim.co.uk (Rhodri James) Date: Mon, 11 Mar 2019 12:28:06 +0000 Subject: [Python-ideas] OT: Respectful behaviour In-Reply-To: References: <20190311115817.GN12502@ando.pearwood.info> Message-ID: On 11/03/2019 12:17, Jonathan Fine wrote: > Steven D'Aprano wrote: > >> Some weeks ago, you started a discussion here about "Clearer >> Communication". Here's another suggestion to help: don't expect your >> readers to either guess, or infer from the code, what your proposal >> means. As the Zen of Python says: >> >> Explicit is better than implicit. > > For me, the canonical guidelines for the use of this list are > > [1] http://python.org/psf/codeofconduct/ > Summary: Open, Considerate, Respectful May I suggest that hijacking a thread to discuss something unrelated at length is neither considerate nor respectful? > ASIDE > The system puts the first two URLs at the foot of every email it sends > out. It might help if it also added > https://devguide.python.org/ > > I'll suggest that to the forum moderators. That makes it sound like we're all core developers, which would be a considerable discouragement to discussion. I would expect that the core devs actually know this already, and reminding them at every turn is rather insulting. -- Rhodri James *-* Kynesim Ltd From jfine2358 at gmail.com Mon Mar 11 11:29:29 2019 From: jfine2358 at gmail.com (Jonathan Fine) Date: Mon, 11 Mar 2019 15:29:29 +0000 Subject: [Python-ideas] OT: Respectful behaviour In-Reply-To: References: <20190311115817.GN12502@ando.pearwood.info> Message-ID: Someone made a proposal whose purpose was not clear. A second person criticised the first person for this. A third person (me) referred to the public guidelines for the use of this list. A fourth person, in a new thread, accused the third person of hijacking the thread. The third person (me) responded, with the previous remarks. Silence, sometimes, is golden. 
Or at least better than the alternative. -- Jonathan From rosuav at gmail.com Mon Mar 11 11:37:46 2019 From: rosuav at gmail.com (Chris Angelico) Date: Tue, 12 Mar 2019 02:37:46 +1100 Subject: [Python-ideas] OT: Respectful behaviour In-Reply-To: References: <20190311115817.GN12502@ando.pearwood.info> Message-ID: On Tue, Mar 12, 2019 at 2:30 AM Jonathan Fine wrote: > > Someone made a proposal whose purpose was not clear. A second person > criticised the first person for this. A third person (me) referred to > the public guidelines for the use of this list. A fourth person, in a > new thread, accused the third person of hijacking the thread. The > third person (me) responded, with the previous remarks. TBH your link to the public guidelines was not quite a response to the second post, as the second post was talking about *content* and you were talking about *behaviour*. It's perfectly possible to remain entirely within the Code of Conduct, but still not provide enough context for the post; it's also entirely possible to make a post that has all sorts of useful information, but is caustic, rude, racist, sexist, or in any other way violates the CoC. The former is perfectly legitimate content, but will result in an on-topic response asking for more details; the latter might get you banned from the list. That said, though - I don't think Rhodri's response was really necessary here. Calling your post "unrelated" is stretching it a bit, and it wasn't an inconsiderate post, just not quite a direct response. *shrug* ChrisA From rhodri at kynesim.co.uk Mon Mar 11 11:45:38 2019 From: rhodri at kynesim.co.uk (Rhodri James) Date: Mon, 11 Mar 2019 15:45:38 +0000 Subject: [Python-ideas] OT: Respectful behaviour In-Reply-To: References: <20190311115817.GN12502@ando.pearwood.info> Message-ID: On 11/03/2019 15:37, Chris Angelico wrote: > On Tue, Mar 12, 2019 at 2:30 AM Jonathan Fine wrote: >> >> Someone made a proposal whose purpose was not clear.
A second person >> criticised the first person for this. A third person (me) referred to >> the public guidelines for the use of this list. A fourth person, in a >> new thread, accused the third person of hijacking the thread. The >> third person (me) responded, with the previous remarks. > > TBH your link to the public guidelines was not quite a response to the > second post, as the second post was talking about *content* and you > were talking about *behaviour*. It's perfectly possible to remain > entirely within the Code of Conduct, but still not provide enough > context for the post; it's also entirely possible to make a post that > has all sorts of useful information, but is caustic, rude, racist, > sexist, or in any other way violates the CoC. The former is perfectly > legitimate content, but will result in an on-topic response asking for > more details; the latter might get you banned from the list. > > That said, though - I don't think Rhodri's response was really > necessary here. Calling your post "unrelated" is stretching it a bit, > and it wasn't an inconsiderate post, just not quite a direct response. > *shrug* Had Jonathan retitled his post to be clear he was not addressing the original issue (the first unclear post), I wouldn't have had an issue with it. However he didn't, and his post wasn't on the subject (or attempting to clarify the subject) of the first post. In a post talking about respect and good conduct, that's quite a failing. In my opinion, obviously.
-- Rhodri James *-* Kynesim Ltd From francismb at email.de Mon Mar 11 15:58:03 2019 From: francismb at email.de (francismb) Date: Mon, 11 Mar 2019 20:58:03 +0100 Subject: [Python-ideas] Left arrow and right arrow operators In-Reply-To: <5C830BE8.1030606@canterbury.ac.nz> References: <4fd9518d-3707-7f9c-751a-b2eefa172634@email.de> <20190303150638.zv24yqah625uiypj@phdru.name> <5C830BE8.1030606@canterbury.ac.nz> Message-ID: <6d8e6422-9635-fd4a-439d-7529fbca7c4a@email.de> Hi Greg, On 3/9/19 1:42 AM, Greg Ewing wrote: > Do you really want > to tell them that all their code is now wrong? Of course not, at least not so promptly. But would it still be a problem if the update to a new version (let's say from 3.X to next(3.X)) is done through some kind of updater/re-writer/evolver? In that case the evolver could just add the blanks. What do you think? Could it work? Thanks in advance! --francis From francismb at email.de Mon Mar 11 16:38:21 2019 From: francismb at email.de (francismb) Date: Mon, 11 Mar 2019 21:38:21 +0100 Subject: [Python-ideas] Code version evolver Message-ID: Hi, I would like to discuss the idea of a code (minor) version evolver/re-writer (or at least a change indicator). Let's say one wants to add a feature in the next version and some small grammar change is needed; then the script first upgrades/evolves the current code, and then the new version can be installed/used. Something like: >> new-code = python-next('3.7', current-code) >> is-python-code('3.8', new-code) >> True How hard is that? Or what is possible and what not? Where should it happen? On the installer? Thanks in advance!
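The validity-checking half of this idea is already expressible today for the running interpreter, using compile(); checking code against a *different* version's grammar would require that version's parser. A minimal sketch:

```python
def is_python_code(source):
    """Check whether source parses under the running interpreter's grammar."""
    try:
        compile(source, '<string>', 'exec')
        return True
    except SyntaxError:
        return False

print(is_python_code('a = a << b'))         # True: parses fine
print(is_python_code('import math in np'))  # False: no such import form
```

The harder half — rewriting code from one version's grammar to the next — is the 2to3-style translation step discussed below in the replies.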
--francis From p.f.moore at gmail.com Mon Mar 11 19:21:52 2019 From: p.f.moore at gmail.com (Paul Moore) Date: Mon, 11 Mar 2019 23:21:52 +0000 Subject: [Python-ideas] Code version evolver In-Reply-To: References: Message-ID: On Mon, 11 Mar 2019 at 20:39, francismb wrote: > > Hi, > I would like to discuss on the idea of a code (minor) version > evolver/re-writer (or at least a change indicator). Let's see one wants > to add a feature on the next version and some small grammar change is > needed, then the script upgrades/evolves first the current code and then > the new version can be installed/used. That sounds very similar to 2to3, which seemed like a good approach to the Python 2 to Python 3 transition, but fell into disuse because people who have to support multiple versions of Python in their code found it *far* easier to do so with a single codebase that worked with both versions, rather than needing to use a translator. Paul From steve at pearwood.info Mon Mar 11 19:25:03 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 12 Mar 2019 10:25:03 +1100 Subject: [Python-ideas] Code version evolver In-Reply-To: References: Message-ID: <20190311232501.GP12502@ando.pearwood.info> On Mon, Mar 11, 2019 at 09:38:21PM +0100, francismb wrote: > Hi, > I would like to discuss on the idea of a code (minor) version > evolver/re-writer (or at least a change indicator). Let's see one wants > to add a feature on the next version and some small grammar change is > needed, then the script upgrades/evolves first the current code and then > the new version can be installed/used. What you want this "version evolver" to do might be clear to you, but it certainly isn't clear to me. I don't know what you mean by evolving the current code, but I'm guessing you don't mean genetic programming. 
https://en.wikipedia.org/wiki/Genetic_programming

I don't know who you expect is using this: the Python core developers responsible for adding new language features and changing the grammar, or Python programmers.

I don't know what part of the current code (current code of *what*?) is supposed to be upgraded or evolved, or what you mean by that.

Do you mean using this to add new grammatical features to the interpreter? Do you mean something like 2to3? Something which transforms source code written in Python?

https://docs.python.org/2/library/2to3.html

> Something like:
>
> >> new-code = python-next('3.7', current-code)
> >> is-python-code('3.8', new-code)
> >> True
>
> How hard is that?

How hard is it to get the behaviour shown? Easy!

def python_next(version, ignoreme):
    x = float(version)
    return "%.1f" % (x + 0.1)

def is_python_code(target, version):
    return target == version

How hard is it to get the behaviour not shown? I have no idea, since I can't guess what these functions do that you don't show. If you want these functions to do more than the behaviour you show, don't expect us to guess what they do.

> or what is possible and what not?
> where should it happen? on the installer?

Absolutely no idea.

-- 
Steven

From sylvain.marie at se.com Tue Mar 12 05:36:41 2019 From: sylvain.marie at se.com (Sylvain MARIE) Date: Tue, 12 Mar 2019 09:36:41 +0000 Subject: [Python-ideas] Problems (and solutions?) in writing decorators In-Reply-To: <5BD23DE1.1000909@canterbury.ac.nz> References: <20181024203512.GI3817@ando.pearwood.info> <5BD23DE1.1000909@canterbury.ac.nz> Message-ID:

Dear python enthusiasts,

Writing python decorators is indeed quite a tedious process, in particular when you wish to add arguments, and in particular in two cases: all optional arguments, and one mandatory argument. Indeed in these two cases there is a need to disambiguate between no-parenthesis and with-parenthesis usage.
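The no-parenthesis vs with-parenthesis ambiguity can be seen in the usual hand-rolled workaround — a minimal sketch (this is not decopatch's implementation; the names are illustrative):

```python
import functools

def deco(func=None, *, option="default"):
    """Usable both as @deco and as @deco(option=...)."""
    def wrap(f):
        @functools.wraps(f)
        def wrapper(*args, **kwargs):
            # 'option' is available here in both usage styles
            return f(*args, **kwargs)
        return wrapper
    if func is not None:
        # Bare @deco: Python passed us the decorated function directly.
        return wrap(func)
    # @deco(...): return the real decorator, which will receive the function.
    return wrap

@deco
def one():
    return 1

@deco(option="x")
def two():
    return 2
```

Note that this trick breaks down when the decorator's first argument may itself be a callable — exactly the one-mandatory-argument case described above.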
After having struggled with this pattern for two years in various open source and industrial projects, I ended up writing a library to hopefully solve this once and for all: https://smarie.github.io/python-decopatch/ . It is extensively tested (203 tests) against many combinations of signature/calls. I would gladly appreciate any feedback!

Please note that there is a "PEP proposal draft" in the project page because I believe that the best a library can do will always be a poor workaround, where the interpreter or stdlib could really fix it properly. Sorry for not providing too many implementation details in that page, my knowledge of the python interpreter is unfortunately quite limited.

--

Finally there is an additional topic around decorators: people tend to believe that decorators and function wrappers are the same, which is absolutely not the case. I used the famous `decorator` lib in many projects but I was not satisfied because it was solving both issues at the same time, maintaining the confusion. I therefore proposed https://smarie.github.io/python-makefun/ . In particular it provides an equivalent of `@functools.wraps` that is truly signature-preserving (based on the same recipe as `decorator`).
Once again, any feedback would be gladly appreciated!

Kind regards

Sylvain

-----Original Message-----
From: Python-ideas On Behalf Of Greg Ewing
Sent: Friday 26 October 2018 00:04
To: python-ideas
Subject: Re: [Python-ideas] Problems (and solutions?) in writing decorators

[External email: Use caution with links and attachments]

________________________________

Jonathan Fine wrote:
> I also find writing decorators a bit
> hard. It seems to be something I have to learn anew each time I do it.
> Particularly for the pattern
>
> @deco(arg1, arg2)
> def fn(arg3, arg4):
>     # function body
>
> Perhaps doing something with partial might help here. Anyone here
> interested in exploring this?
>

I can't think of a way that partial would help.
But would you find it easier if you could do something like this?

class deco(Decorator):

    def __init__(self, arg1, arg2):
        self.arg1 = arg1
        self.arg2 = arg2

    def invoke(self, func, arg3, arg4):
        # function body

Implementation:

class Decorator:

    def __call__(self, func):
        self._wrapped_function = func
        return self._wrapper

    def _wrapper(self, *args, **kwds):
        return self.invoke(self._wrapped_function, *args, **kwds)

-- 
Greg
_______________________________________________
Python-ideas mailing list
Python-ideas at python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

From mertz at gnosis.cx Tue Mar 12 07:19:47 2019 From: mertz at gnosis.cx (David Mertz) Date: Tue, 12 Mar 2019 07:19:47 -0400 Subject: [Python-ideas] Problems (and solutions?) in writing decorators In-Reply-To: References: <20181024203512.GI3817@ando.pearwood.info> <5BD23DE1.1000909@canterbury.ac.nz> Message-ID:

What advantage do you perceive decopatch to have over wrapt?
( https://github.com/GrahamDumpleton/wrapt )

On Tue, Mar 12, 2019, 5:37 AM Sylvain MARIE via Python-ideas < python-ideas at python.org> wrote:
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From steve at pearwood.info Tue Mar 12 07:30:10 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 12 Mar 2019 22:30:10 +1100 Subject: [Python-ideas] Problems (and solutions?) in writing decorators In-Reply-To: References: <20181024203512.GI3817@ando.pearwood.info> <5BD23DE1.1000909@canterbury.ac.nz> Message-ID: <20190312113010.GQ12502@ando.pearwood.info>

On Tue, Mar 12, 2019 at 09:36:41AM +0000, Sylvain MARIE via Python-ideas wrote:
> I therefore proposed
> https://smarie.github.io/python-makefun/ . In particular it provides
> an equivalent of `@functools.wraps` that is truly signature-preserving

Tell us more about that please. I'm very interested in getting decorators preserve the original signature.

-- 
Steven

From sylvain.marie at se.com Tue Mar 12 09:52:41 2019 From: sylvain.marie at se.com (Sylvain MARIE) Date: Tue, 12 Mar 2019 13:52:41 +0000 Subject: [Python-ideas] Problems (and solutions?) in writing decorators In-Reply-To: <20190312113010.GQ12502@ando.pearwood.info> References: <20181024203512.GI3817@ando.pearwood.info> <5BD23DE1.1000909@canterbury.ac.nz> <20190312113010.GQ12502@ando.pearwood.info> Message-ID:

David, Steven,

Thanks for your interest!

As you probably know, decorators and function wrappers are *completely different concepts*. A decorator can directly return the decorated function (or class), it does not have to return a wrapper. Even more, it can entirely replace the decorated item with something else (not even a function or class!).
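A toy sketch of that last point — a decorator that replaces the function outright (my example, not from decopatch):

```python
def replace_with_answer(func):
    # A decorator is just a callable applied to the decorated item;
    # nothing requires it to return a function (or a callable at all).
    return 42

@replace_with_answer
def compute():
    return "never used"
```

After decoration, the name `compute` is simply bound to the integer 42.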
Try it: it is possible to write a decorator to replace a function with an integer, even though it is probably not quite useful :)

`decopatch` helps you write decorators, whatever they are. It "just" solves the annoying issue of having to handle the no-parenthesis and with-parenthesis calls. In addition, as a 'goodie', it proposes two development styles: *nested* (you have to return a function) and *flat* (you directly write what will happen when the decorator is applied to something).

--

Now about creating signature-preserving function wrappers (in a decorator, or outside a decorator - again, that's not related). That use case is supposed to be covered by functools.wraps. Unfortunately, as explained here https://stackoverflow.com/questions/308999/what-does-functools-wraps-do/55102697#55102697 this is not the case, because with functools.wraps:

- the wrapper code will execute even when the provided arguments are invalid.
- the wrapper code cannot easily access an argument using its name, from the received *args, **kwargs.

Indeed one would have to handle all cases (positional, keyword, default) and therefore use something like Signature.bind(). For this reason I proposed a replacement in `makefun`: https://smarie.github.io/python-makefun/#signature-preserving-function-wrappers

--

Now bridging the gap. Of course a very interesting use case for decorators is to create decorators that create a signature-preserving wrapper. It is possible to combine decopatch and makefun for this: https://smarie.github.io/python-decopatch/#3-creating-function-wrappers . Decopatch even proposes a "double-flat" development style where you directly write the wrapper body, as explained in the doc.

Did I answer your questions? Thanks again for the quick feedback!

Best,

Sylvain

-----Original Message-----
From: Python-ideas On Behalf Of Steven D'Aprano
Sent: Tuesday 12 March 2019 12:30
To: python-ideas at python.org
Subject: Re: [Python-ideas] Problems (and solutions?)
in writing decorators

[External email: Use caution with links and attachments]

________________________________

On Tue, Mar 12, 2019 at 09:36:41AM +0000, Sylvain MARIE via Python-ideas wrote:
> I therefore proposed
> https://smarie.github.io/python-makefun/ . In particular it
> provides an equivalent of `@functools.wraps` that is truly
> signature-preserving

Tell us more about that please. I'm very interested in getting decorators preserve the original signature.

-- 
Steven
_______________________________________________
Python-ideas mailing list
Python-ideas at python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

From sylvain.marie at se.com Tue Mar 12 09:54:07 2019 From: sylvain.marie at se.com (Sylvain MARIE) Date: Tue, 12 Mar 2019 13:54:07 +0000 Subject: [Python-ideas] Problems (and solutions?)
in writing decorators In-Reply-To: References: <20181024203512.GI3817@ando.pearwood.info> <5BD23DE1.1000909@canterbury.ac.nz> <20190312113010.GQ12502@ando.pearwood.info> Message-ID:

David

I realize that you were pointing at the 'wrapt' library not the functools.wrapt library.

From what I have tried with that library, it does not solve the issue solved with decopatch.

Sylvain

From remi.lapeyre at henki.fr Tue Mar 12 10:16:18 2019 From: remi.lapeyre at henki.fr (Rémi Lapeyre) Date: Tue, 12 Mar 2019 15:16:18 +0100 Subject: [Python-ideas] Add method unittest.TestProgram to add custom arguments Message-ID: <20190312141617.38160-1-remi.lapeyre@henki.fr>

Hi everybody,

I would like to add a new method to unittest.TestProgram that would let the user add its own arguments to TestProgram so its behavior can be changed from the command line, and make them accessible from a TestCase. This would allow for more customization when running the tests. Currently there is no way to do this, so users either have to rely on environment variables or parse sys.argv themselves like what is proposed in https://stackoverflow.com/a/46061210.
This approach has several disadvantages:

- it's more work to check and parse the argument
- it's not discoverable from the help
- everyone has their own way of doing this

Here are two cases where this would be useful for me:

- in one project, I have an option to run tests in "record mode". In this mode, all tests are executed but the results are saved to files instead of checking their validity. In subsequent runs, results are compared to those files to make sure they are correct:

def _test_payload(obj, filename):
    import json, os
    filename = f"test/results/(unknown)"
    if "RECORD_PAYLOADS" in os.environ:
        with open(filename, "w") as f:
            json.dump(obj, f, indent=2)
    else:
        with open(filename) as f:
            assert obj == json.load(f)

This lets us update tests very quickly and just check they are correct by looking at the diff. This is very useful for testing an API with many endpoints. As you can see, we use an environment variable to change TestCase behavior, but it is awkward to run tests as `RECORD_PAYLOADS=1 python test_api.py -v` instead of a more straightforward `python test_api.py -v --record-payloads`.

- in https://bugs.python.org/issue18765 (unittest needs a way to launch pdb.post_mortem or other debug hooks), Gregory P. Smith and Michael Foord propose to make TestCase call pdb.post_mortem() on failure. This also requires a way to change the behavior, as this would not be wanted in CI. Just subclassing TestProgram would not be sufficient, as the result of the parsing would not be accessible from the TestCase.

One argument against making such a change is that it would require adding a new attribute to TestCase, which would not be backward compatible and could break existing code, but I think this can be managed if we choose an appropriate name for it.

I attached an implementation of this feature (also available at https://github.com/remilapeyre/cpython/tree/unittest-custom-arguments) and look forward to your comments.

What do you think about this?
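The mechanics the patch proposes — custom flags parsed into an argparse Namespace that each TestCase can read as `command_line_arguments` — can be modelled standalone. In this sketch the wiring is done by hand; the patch would have TestProgram/TestLoader thread the Namespace through automatically, and nothing here exists in today's unittest:

```python
import argparse
import unittest

class PayloadTest(unittest.TestCase):
    # The patch would inject this per-instance; a class attribute
    # stands in for it in this hand-wired sketch.
    command_line_arguments = argparse.Namespace(record_payloads=False)

    def test_payload(self):
        # Behaviour switches on a command-line flag, not on os.environ.
        if self.command_line_arguments.record_payloads:
            mode = "record"
        else:
            mode = "check"
        self.assertIn(mode, ("record", "check"))

def run_with(record_payloads):
    # Simulate what a TestProgram subclass would do after parsing argv.
    PayloadTest.command_line_arguments = argparse.Namespace(
        record_payloads=record_payloads)
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(PayloadTest)
    result = unittest.TestResult()
    suite.run(result)
    return result.testsRun, result.wasSuccessful()
```

Running `run_with(True)` and `run_with(False)` exercises both modes of the same test without touching environment variables, which is the ergonomic point of the proposal.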
--- Lib/unittest/case.py | 7 +- Lib/unittest/loader.py | 73 +++++---- Lib/unittest/main.py | 49 +++++- Lib/unittest/test/test_discovery.py | 225 ++++++++++++++++++++++++---- Lib/unittest/test/test_program.py | 34 ++++- 5 files changed, 331 insertions(+), 57 deletions(-) diff --git a/Lib/unittest/case.py b/Lib/unittest/case.py index a157ae8a14..5698e3a640 100644 --- a/Lib/unittest/case.py +++ b/Lib/unittest/case.py @@ -1,6 +1,7 @@ """Test case implementation""" import sys +import argparse import functools import difflib import logging @@ -416,7 +417,7 @@ class TestCase(object): _class_cleanups = [] - def __init__(self, methodName='runTest'): + def __init__(self, methodName='runTest', command_line_arguments=None): """Create an instance of the class that will use the named test method when executed. Raises a ValueError if the instance does not have a method with the specified name. @@ -436,6 +437,10 @@ class TestCase(object): self._testMethodDoc = testMethod.__doc__ self._cleanups = [] self._subtest = None + if command_line_arguments is None: + self.command_line_arguments = argparse.Namespace() + else: + self.command_line_arguments = command_line_arguments # Map types to custom assertEqual functions that will compare # instances of said type in more detail to generate a more useful diff --git a/Lib/unittest/loader.py b/Lib/unittest/loader.py index ba7105e1ad..d5f0a97213 100644 --- a/Lib/unittest/loader.py +++ b/Lib/unittest/loader.py @@ -5,6 +5,7 @@ import re import sys import traceback import types +import itertools import functools import warnings @@ -81,7 +82,7 @@ class TestLoader(object): # avoid infinite re-entrancy. 
self._loading_packages = set() - def loadTestsFromTestCase(self, testCaseClass): + def loadTestsFromTestCase(self, testCaseClass, command_line_arguments=None): """Return a suite of all test cases contained in testCaseClass""" if issubclass(testCaseClass, suite.TestSuite): raise TypeError("Test cases should not be derived from " @@ -90,12 +91,21 @@ class TestLoader(object): testCaseNames = self.getTestCaseNames(testCaseClass) if not testCaseNames and hasattr(testCaseClass, 'runTest'): testCaseNames = ['runTest'] - loaded_suite = self.suiteClass(map(testCaseClass, testCaseNames)) + + # keep backward compatibility for subclasses that override __init__ + def instanciate_testcase(testCaseClass, testCaseName): + try: + return testCaseClass(testCaseName, command_line_arguments) + except TypeError: + return testCaseClass(testCaseName) + loaded_suite = self.suiteClass( + map(instanciate_testcase, itertools.repeat(testCaseClass), testCaseNames) + ) return loaded_suite # XXX After Python 3.5, remove backward compatibility hacks for # use_load_tests deprecation via *args and **kws. See issue 16662. - def loadTestsFromModule(self, module, *args, pattern=None, **kws): + def loadTestsFromModule(self, module, *args, pattern=None, command_line_arguments=None, **kws): """Return a suite of all test cases contained in the given module""" # This method used to take an undocumented and unofficial # use_load_tests argument. 
For backward compatibility, we still @@ -121,7 +131,7 @@ class TestLoader(object): for name in dir(module): obj = getattr(module, name) if isinstance(obj, type) and issubclass(obj, case.TestCase): - tests.append(self.loadTestsFromTestCase(obj)) + tests.append(self.loadTestsFromTestCase(obj, command_line_arguments)) load_tests = getattr(module, 'load_tests', None) tests = self.suiteClass(tests) @@ -135,7 +145,7 @@ class TestLoader(object): return error_case return tests - def loadTestsFromName(self, name, module=None): + def loadTestsFromName(self, name, module=None, command_line_arguments=None): """Return a suite of all test cases given a string specifier. The name may resolve either to a module, a test case class, a @@ -188,9 +198,9 @@ class TestLoader(object): return error_case if isinstance(obj, types.ModuleType): - return self.loadTestsFromModule(obj) + return self.loadTestsFromModule(obj, command_line_arguments) elif isinstance(obj, type) and issubclass(obj, case.TestCase): - return self.loadTestsFromTestCase(obj) + return self.loadTestsFromTestCase(obj, command_line_arguments) elif (isinstance(obj, types.FunctionType) and isinstance(parent, type) and issubclass(parent, case.TestCase)): @@ -213,11 +223,11 @@ class TestLoader(object): else: raise TypeError("don't know how to make test from: %s" % obj) - def loadTestsFromNames(self, names, module=None): + def loadTestsFromNames(self, names, module=None, command_line_arguments=None): """Return a suite of all test cases found using the given sequence of string specifiers. See 'loadTestsFromName()'. 
""" - suites = [self.loadTestsFromName(name, module) for name in names] + suites = [self.loadTestsFromName(name, module, command_line_arguments) for name in names] return self.suiteClass(suites) def getTestCaseNames(self, testCaseClass): @@ -239,7 +249,7 @@ class TestLoader(object): testFnNames.sort(key=functools.cmp_to_key(self.sortTestMethodsUsing)) return testFnNames - def discover(self, start_dir, pattern='test*.py', top_level_dir=None): + def discover(self, start_dir, pattern='test*.py', top_level_dir=None, command_line_arguments=None): """Find and return all test modules from the specified start directory, recursing into subdirectories to find them and return all tests found within them. Only test files that match the pattern will @@ -322,9 +332,12 @@ class TestLoader(object): self._top_level_dir = \ (path.split(the_module.__name__ .replace(".", os.path.sep))[0]) - tests.extend(self._find_tests(path, - pattern, - namespace=True)) + tests.extend(self._find_tests( + path, + pattern, + namespace=True, + command_line_arguments=command_line_arguments + )) elif the_module.__name__ in sys.builtin_module_names: # builtin module raise TypeError('Can not use builtin modules ' @@ -346,7 +359,7 @@ class TestLoader(object): raise ImportError('Start directory is not importable: %r' % start_dir) if not is_namespace: - tests = list(self._find_tests(start_dir, pattern)) + tests = list(self._find_tests(start_dir, pattern, command_line_arguments=command_line_arguments)) return self.suiteClass(tests) def _get_directory_containing_module(self, module_name): @@ -381,7 +394,7 @@ class TestLoader(object): # override this method to use alternative matching strategy return fnmatch(path, pattern) - def _find_tests(self, start_dir, pattern, namespace=False): + def _find_tests(self, start_dir, pattern, namespace=False, command_line_arguments=None): """Used by discovery. 
Yields test suites it loads.""" # Handle the __init__ in this package name = self._get_name_from_path(start_dir) @@ -391,7 +404,7 @@ class TestLoader(object): # name is in self._loading_packages while we have called into # loadTestsFromModule with name. tests, should_recurse = self._find_test_path( - start_dir, pattern, namespace) + start_dir, pattern, namespace, command_line_arguments) if tests is not None: yield tests if not should_recurse: @@ -403,7 +416,7 @@ class TestLoader(object): for path in paths: full_path = os.path.join(start_dir, path) tests, should_recurse = self._find_test_path( - full_path, pattern, namespace) + full_path, pattern, namespace, command_line_arguments) if tests is not None: yield tests if should_recurse: @@ -411,11 +424,16 @@ class TestLoader(object): name = self._get_name_from_path(full_path) self._loading_packages.add(name) try: - yield from self._find_tests(full_path, pattern, namespace) + yield from self._find_tests( + full_path, + pattern, + namespace, + command_line_arguments=command_line_arguments + ) finally: self._loading_packages.discard(name) - def _find_test_path(self, full_path, pattern, namespace=False): + def _find_test_path(self, full_path, pattern, namespace=False, command_line_arguments=None): """Used by discovery. Loads tests from a single file, or a directories' __init__.py when @@ -457,7 +475,8 @@ class TestLoader(object): "%r. 
Is this module globally installed?") raise ImportError( msg % (mod_name, module_dir, expected_dir)) - return self.loadTestsFromModule(module, pattern=pattern), False + return self.loadTestsFromModule(module, pattern=pattern, + command_line_arguments=command_line_arguments), False elif os.path.isdir(full_path): if (not namespace and not os.path.isfile(os.path.join(full_path, '__init__.py'))): @@ -480,7 +499,11 @@ class TestLoader(object): # Mark this package as being in load_tests (possibly ;)) self._loading_packages.add(name) try: - tests = self.loadTestsFromModule(package, pattern=pattern) + tests = self.loadTestsFromModule( + package, + pattern=pattern, + command_line_arguments=command_line_arguments + ) if load_tests is not None: # loadTestsFromModule(package) has loaded tests for us. return tests, False @@ -507,11 +530,11 @@ def getTestCaseNames(testCaseClass, prefix, sortUsing=util.three_way_cmp, testNa return _makeLoader(prefix, sortUsing, testNamePatterns=testNamePatterns).getTestCaseNames(testCaseClass) def makeSuite(testCaseClass, prefix='test', sortUsing=util.three_way_cmp, - suiteClass=suite.TestSuite): + suiteClass=suite.TestSuite, command_line_arguments=None): return _makeLoader(prefix, sortUsing, suiteClass).loadTestsFromTestCase( - testCaseClass) + testCaseClass, command_line_arguments) def findTestCases(module, prefix='test', sortUsing=util.three_way_cmp, - suiteClass=suite.TestSuite): + suiteClass=suite.TestSuite, command_line_arguments=None): return _makeLoader(prefix, sortUsing, suiteClass).loadTestsFromModule(\ - module) + module, command_line_arguments=command_line_arguments) diff --git a/Lib/unittest/main.py b/Lib/unittest/main.py index e62469aa2a..7adc558e5f 100644 --- a/Lib/unittest/main.py +++ b/Lib/unittest/main.py @@ -51,6 +51,8 @@ def _convert_select_pattern(pattern): pattern = '*%s*' % pattern return pattern +_options = ('verbosity', 'tb_locals', 'failfast', 'catchbreak', 'buffer', 'tests', + 'testNamePatterns', 'tests', 'start', 
'pattern', 'top', 'exit')
 
 class TestProgram(object):
     """A command-line program that runs a set of tests; this is primarily
@@ -100,6 +102,37 @@ class TestProgram(object):
         self.parseArgs(argv)
         self.runTests()
 
+    def __setattr__(self, name, value):
+        if name in _options:
+            setattr(self.command_line_arguments, name, value)
+        else:
+            super().__setattr__(name, value)
+
+    def __getattribute__(self, name):
+        if name in _options:
+            try:
+                return getattr(self.command_line_arguments, name)
+            except AttributeError:
+                pass
+
+        try:
+            return super().__getattribute__(name)
+        except AttributeError:
+            if name == 'command_line_arguments':
+                namespace = argparse.Namespace()
+                # preload command_line_arguments with class arguments
+                # this is useful for subclasses of TestProgram that override __init__
+                for name in _options:
+                    try:
+                        value = super().__getattribute__(name)
+                        setattr(namespace, name, value)
+                    except AttributeError:
+                        pass
+                self.command_line_arguments = namespace
+                return namespace
+            else:
+                raise
+
     def usageExit(self, msg=None):
         if msg:
             print(msg)
@@ -123,14 +156,14 @@ class TestProgram(object):
         if len(argv) > 1 and argv[1].lower() == 'discover':
             self._do_discovery(argv[2:])
             return
-        self._main_parser.parse_args(argv[1:], self)
+        self._main_parser.parse_args(argv[1:], self.command_line_arguments)
         if not self.tests:
             # this allows "python -m unittest -v" to still work for
             # test discovery.
self._do_discovery([]) return else: - self._main_parser.parse_args(argv[1:], self) + self._main_parser.parse_args(argv[1:], self.command_line_arguments) if self.tests: self.testNames = _convert_names(self.tests) @@ -151,7 +184,12 @@ class TestProgram(object): self.testLoader.testNamePatterns = self.testNamePatterns if from_discovery: loader = self.testLoader if Loader is None else Loader() - self.test = loader.discover(self.start, self.pattern, self.top) + self.test = loader.discover( + self.start, + self.pattern, + self.top, + self.command_line_arguments + ) elif self.testNames is None: self.test = self.testLoader.loadTestsFromModule(self.module) else: @@ -196,8 +234,13 @@ class TestProgram(object): help='Only run tests which match the given substring') self.testNamePatterns = [] + self.addCustomArguments(parser) + return parser + def addCustomArguments(self, parser): + pass + def _getMainArgParser(self, parent): parser = argparse.ArgumentParser(parents=[parent]) parser.prog = self.progName diff --git a/Lib/unittest/test/test_discovery.py b/Lib/unittest/test/test_discovery.py index 204043b493..3a2112eb7e 100644 --- a/Lib/unittest/test/test_discovery.py +++ b/Lib/unittest/test/test_discovery.py @@ -6,6 +6,7 @@ import types import pickle from test import support import test.test_importlib.util +from argparse import Namespace import unittest import unittest.mock @@ -19,7 +20,6 @@ class TestableTestProgram(unittest.TestProgram): verbosity = 1 progName = '' testRunner = testLoader = None - def __init__(self): pass @@ -72,9 +72,13 @@ class TestDiscovery(unittest.TestCase): loader._get_module_from_name = lambda path: path + ' module' orig_load_tests = loader.loadTestsFromModule - def loadTestsFromModule(module, pattern=None): + def loadTestsFromModule(module, pattern=None, command_line_arguments=None): # This is where load_tests is called. 
- base = orig_load_tests(module, pattern=pattern) + base = orig_load_tests( + module, + pattern=pattern, + command_line_arguments=command_line_arguments + ) return base + [module + ' tests'] loader.loadTestsFromModule = loadTestsFromModule loader.suiteClass = lambda thing: thing @@ -118,9 +122,13 @@ class TestDiscovery(unittest.TestCase): loader._get_module_from_name = lambda path: path + ' module' orig_load_tests = loader.loadTestsFromModule - def loadTestsFromModule(module, pattern=None): + def loadTestsFromModule(module, pattern=None, command_line_arguments=None): # This is where load_tests is called. - base = orig_load_tests(module, pattern=pattern) + base = orig_load_tests( + module, + pattern=pattern, + command_line_arguments=command_line_arguments + ) return base + [module + ' tests'] loader.loadTestsFromModule = loadTestsFromModule loader.suiteClass = lambda thing: thing @@ -173,9 +181,13 @@ class TestDiscovery(unittest.TestCase): loader._get_module_from_name = lambda name: Module(name) orig_load_tests = loader.loadTestsFromModule - def loadTestsFromModule(module, pattern=None): + def loadTestsFromModule(module, pattern=None, command_line_arguments=None): # This is where load_tests is called. - base = orig_load_tests(module, pattern=pattern) + base = orig_load_tests( + module, + pattern=pattern, + command_line_arguments=command_line_arguments + ) return base + [module.path + ' module tests'] loader.loadTestsFromModule = loadTestsFromModule loader.suiteClass = lambda thing: thing @@ -247,9 +259,13 @@ class TestDiscovery(unittest.TestCase): loader._get_module_from_name = lambda name: Module(name) orig_load_tests = loader.loadTestsFromModule - def loadTestsFromModule(module, pattern=None): + def loadTestsFromModule(module, pattern=None, command_line_arguments=None): # This is where load_tests is called. 
- base = orig_load_tests(module, pattern=pattern) + base = orig_load_tests( + module, + pattern=pattern, + command_line_arguments=command_line_arguments + ) return base + [module.path + ' module tests'] loader.loadTestsFromModule = loadTestsFromModule loader.suiteClass = lambda thing: thing @@ -395,7 +411,7 @@ class TestDiscovery(unittest.TestCase): self.addCleanup(restore_isdir) _find_tests_args = [] - def _find_tests(start_dir, pattern, namespace=None): + def _find_tests(start_dir, pattern, namespace=None, command_line_arguments=None): _find_tests_args.append((start_dir, pattern)) return ['tests'] loader._find_tests = _find_tests @@ -633,75 +649,214 @@ class TestDiscovery(unittest.TestCase): class Loader(object): args = [] - def discover(self, start_dir, pattern, top_level_dir): - self.args.append((start_dir, pattern, top_level_dir)) + def discover(self, start_dir, pattern, top_level_dir, command_line_arguments): + self.args.append((start_dir, pattern, top_level_dir, command_line_arguments)) return 'tests' program.testLoader = Loader() program._do_discovery(['-v']) - self.assertEqual(Loader.args, [('.', 'test*.py', None)]) + self.assertEqual(Loader.args, [( + '.', + 'test*.py', + None, + Namespace( + buffer=False, + catchbreak=False, + failfast=False, + pattern='test*.py', + start='.', + tb_locals=False, + testNamePatterns=[], + top=None, + verbosity=2 + ) + )]) def test_command_line_handling_do_discovery_calls_loader(self): program = TestableTestProgram() class Loader(object): args = [] - def discover(self, start_dir, pattern, top_level_dir): - self.args.append((start_dir, pattern, top_level_dir)) + def discover(self, start_dir, pattern, top_level_dir, command_line_arguments): + self.args.append((start_dir, pattern, top_level_dir, command_line_arguments)) return 'tests' program._do_discovery(['-v'], Loader=Loader) self.assertEqual(program.verbosity, 2) self.assertEqual(program.test, 'tests') - self.assertEqual(Loader.args, [('.', 'test*.py', None)]) + 
self.assertEqual(Loader.args, [('.', 'test*.py', None, Namespace( + buffer=False, + catchbreak=False, + exit=True, + failfast=False, + pattern='test*.py', + start='.', + tb_locals=False, + testNamePatterns=[], + top=None, + verbosity=2 + ))]) Loader.args = [] program = TestableTestProgram() program._do_discovery(['--verbose'], Loader=Loader) self.assertEqual(program.test, 'tests') - self.assertEqual(Loader.args, [('.', 'test*.py', None)]) + self.assertEqual(Loader.args, [('.', 'test*.py', None, Namespace( + buffer=False, + catchbreak=False, + exit=True, + failfast=False, + pattern='test*.py', + start='.', + tb_locals=False, + testNamePatterns=[], + top=None, + verbosity=2 + ))]) Loader.args = [] program = TestableTestProgram() program._do_discovery([], Loader=Loader) self.assertEqual(program.test, 'tests') - self.assertEqual(Loader.args, [('.', 'test*.py', None)]) + self.assertEqual(Loader.args, [('.', 'test*.py', None, Namespace( + buffer=False, + catchbreak=False, + exit=True, + failfast=False, + pattern='test*.py', + start='.', + tb_locals=False, + testNamePatterns=[], + top=None, + verbosity=1 + ))]) Loader.args = [] program = TestableTestProgram() program._do_discovery(['fish'], Loader=Loader) self.assertEqual(program.test, 'tests') - self.assertEqual(Loader.args, [('fish', 'test*.py', None)]) + self.assertEqual(Loader.args, [('fish', 'test*.py', None, Namespace( + buffer=False, + catchbreak=False, + exit=True, + failfast=False, + pattern='test*.py', + start='fish', + tb_locals=False, + testNamePatterns=[], + top=None, + verbosity=1 + ))]) Loader.args = [] program = TestableTestProgram() program._do_discovery(['fish', 'eggs'], Loader=Loader) self.assertEqual(program.test, 'tests') - self.assertEqual(Loader.args, [('fish', 'eggs', None)]) + self.assertEqual(Loader.args, [( + 'fish', + 'eggs', + None, + Namespace( + buffer=False, + catchbreak=False, + exit=True, + failfast=False, + pattern='eggs', + start='fish', + tb_locals=False, + testNamePatterns=[], + 
top=None, + verbosity=1 + ) + )]) Loader.args = [] program = TestableTestProgram() program._do_discovery(['fish', 'eggs', 'ham'], Loader=Loader) self.assertEqual(program.test, 'tests') - self.assertEqual(Loader.args, [('fish', 'eggs', 'ham')]) + self.assertEqual(Loader.args, [( + 'fish', + 'eggs', + 'ham', + Namespace( + buffer=False, + catchbreak=False, + exit=True, + failfast=False, + pattern='eggs', + start='fish', + tb_locals=False, + testNamePatterns=[], + top='ham', + verbosity=1 + ) + )]) Loader.args = [] program = TestableTestProgram() program._do_discovery(['-s', 'fish'], Loader=Loader) self.assertEqual(program.test, 'tests') - self.assertEqual(Loader.args, [('fish', 'test*.py', None)]) + self.assertEqual(Loader.args, [( + 'fish', + 'test*.py', + None, + Namespace( + buffer=False, + catchbreak=False, + exit=True, + failfast=False, + pattern='test*.py', + start='fish', + tb_locals=False, + testNamePatterns=[], + top=None, + verbosity=1 + ) + )]) Loader.args = [] program = TestableTestProgram() program._do_discovery(['-t', 'fish'], Loader=Loader) self.assertEqual(program.test, 'tests') - self.assertEqual(Loader.args, [('.', 'test*.py', 'fish')]) + self.assertEqual(Loader.args, [( + '.', + 'test*.py', + 'fish', + Namespace( + buffer=False, + catchbreak=False, + exit=True, + failfast=False, + pattern='test*.py', + start='.', + tb_locals=False, + testNamePatterns=[], + top='fish', + verbosity=1 + ) + )]) Loader.args = [] program = TestableTestProgram() program._do_discovery(['-p', 'fish'], Loader=Loader) self.assertEqual(program.test, 'tests') - self.assertEqual(Loader.args, [('.', 'fish', None)]) + self.assertEqual(Loader.args, [( + '.', + 'fish', + None, + Namespace( + buffer=False, + catchbreak=False, + exit=True, + failfast=False, + pattern='fish', + start='.', + tb_locals=False, + testNamePatterns=[], + top=None, + verbosity=1 + ) + )]) self.assertFalse(program.failfast) self.assertFalse(program.catchbreak) @@ -710,7 +865,23 @@ class 
TestDiscovery(unittest.TestCase): program._do_discovery(['-p', 'eggs', '-s', 'fish', '-v', '-f', '-c'], Loader=Loader) self.assertEqual(program.test, 'tests') - self.assertEqual(Loader.args, [('fish', 'eggs', None)]) + self.assertEqual(Loader.args, [( + 'fish', + 'eggs', + None, + Namespace( + buffer=False, + catchbreak=True, + exit=True, + failfast=True, + pattern='eggs', + start='fish', + tb_locals=False, + testNamePatterns=[], + top=None, + verbosity=2 + ) + )]) self.assertEqual(program.verbosity, 2) self.assertTrue(program.failfast) self.assertTrue(program.catchbreak) @@ -785,7 +956,7 @@ class TestDiscovery(unittest.TestCase): expectedPath = os.path.abspath(os.path.dirname(unittest.test.__file__)) self.wasRun = False - def _find_tests(start_dir, pattern, namespace=None): + def _find_tests(start_dir, pattern, namespace=None, command_line_arguments=None): self.wasRun = True self.assertEqual(start_dir, expectedPath) return tests @@ -833,7 +1004,7 @@ class TestDiscovery(unittest.TestCase): return package _find_tests_args = [] - def _find_tests(start_dir, pattern, namespace=None): + def _find_tests(start_dir, pattern, namespace=None, command_line_arguments=None): _find_tests_args.append((start_dir, pattern)) return ['%s/tests' % start_dir] diff --git a/Lib/unittest/test/test_program.py b/Lib/unittest/test/test_program.py index 4a62ae1b11..a6873089f2 100644 --- a/Lib/unittest/test/test_program.py +++ b/Lib/unittest/test/test_program.py @@ -6,6 +6,7 @@ import subprocess from test import support import unittest import unittest.test +from argparse import Namespace class Test_TestProgram(unittest.TestCase): @@ -17,7 +18,7 @@ class Test_TestProgram(unittest.TestCase): expectedPath = os.path.abspath(os.path.dirname(unittest.test.__file__)) self.wasRun = False - def _find_tests(start_dir, pattern): + def _find_tests(start_dir, pattern, command_line_arguments=None): self.wasRun = True self.assertEqual(start_dir, expectedPath) return tests @@ -437,6 +438,37 @@ class 
TestCommandLineArgs(unittest.TestCase): self.assertIn('Ran 7 tests', run_unittest(['-k', '*test_warnings.*Warning*', t])) self.assertIn('Ran 1 test', run_unittest(['-k', '*test_warnings.*warning*', t])) + def testCustomCommandLineArguments(self): + class FakeTP(unittest.TestProgram): + def addCustomArguments(self, parser): + parser.add_argument('--foo', action="store_true", help='foo help') + def runTests(self, *args, **kw): pass + + fp = FakeTP(argv=["testprogram"]) + self.assertEqual(fp.command_line_arguments, Namespace( + buffer=False, + catchbreak=False, + exit=True, + failfast=False, + foo=False, + tb_locals=False, + testNamePatterns=[], + tests=[], + verbosity=1 + )) + fp = FakeTP(argv=["testprogram", "--foo"]) + self.assertEqual(fp.command_line_arguments, Namespace( + buffer=False, + catchbreak=False, + exit=True, + failfast=False, + foo=True, + tb_locals=False, + testNamePatterns=[], + tests=[], + verbosity=1 + )) + if __name__ == '__main__': unittest.main() -- 2.20.1 From mertz at gnosis.cx Tue Mar 12 10:18:33 2019 From: mertz at gnosis.cx (David Mertz) Date: Tue, 12 Mar 2019 10:18:33 -0400 Subject: [Python-ideas] Problems (and solutions?) in writing decorators In-Reply-To: References: <20181024203512.GI3817@ando.pearwood.info> <5BD23DE1.1000909@canterbury.ac.nz> <20190312113010.GQ12502@ando.pearwood.info> Message-ID: The wrapt module I linked to (not funtools.wraps) provides all the capabilities you mention since 2013. It allows mixed use of decorators as decorator factories. It has a flat style. There are some minor API difference between your libraries and wrapt, but the concept is very similar. Since yours is something new, I imagine you perceive some win over what wrapt does. On Tue, Mar 12, 2019, 9:52 AM Sylvain MARIE wrote: > David, Steven, > > Thanks for your interest ! > > As you probably know, decorators and function wrappers are *completely > different concepts*. 
A decorator can directly return the decorated function > (or class), it does not have to return a wrapper. Even more, it can > entirely replace the decorated item with something else (not even a > function or class!). Try it: it is possible to write a decorator to replace > a function with an integer, even though it is probably not quite useful :) > > `decopatch` helps you write decorators, whatever they are. It "just" > solves the annoying issue of having to handle the no-parenthesis and > with-parenthesis calls. In addition as a 'goodie', it proposes two > development styles: *nested* (you have to return a function) and *flat* > (you directly write what will happen when the decorator is applied to > something). > -- > Now about creating signature-preserving function wrappers (in a decorator, > or outside a decorator - again, that's not related). That use case is > supposed to be covered by functools.wraps. Unfortunately as explained here > https://stackoverflow.com/questions/308999/what-does-functools-wraps-do/55102697#55102697 > this is not the case because with functools.wraps: > - the wrapper code will execute even when the provided arguments are > invalid. > - the wrapper code cannot easily access an argument using its name, from > the received *args, **kwargs. Indeed one would have to handle all cases > (positional, keyword, default) and therefore to use something like > Signature.bind(). > > For this reason I proposed a replacement in `makefun`: > https://smarie.github.io/python-makefun/#signature-preserving-function-wrappers > -- > Now bridging the gap. Of course a very interesting use case for > decorators is to create decorators that create a signature-preserving > wrapper. It is possible to combine decopatch and makefun for this: > https://smarie.github.io/python-decopatch/#3-creating-function-wrappers . > Decopatch even proposes a "double-flat" development style where you > directly write the wrapper body, as explained in the doc.
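[Editor's note] The first limitation listed above (with a plain functools.wraps wrapper, the wrapper body runs even when the call could never bind to the wrapped function's signature) can be reproduced with a short, self-contained sketch; the decorator and function names here are invented for illustration:

```python
import functools

# A minimal functools.wraps-based decorator (hypothetical example,
# not library code from the thread).
def noisy(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        print("wrapper called")  # executes before any signature check
        return func(*args, **kwargs)
    return wrapper

@noisy
def function(a, b):
    return a + b

try:
    function(1)  # 'b' is missing, yet the wrapper body still ran
except TypeError:
    print("TypeError was only raised by the inner call")
```

The TypeError comes from calling `func`, not from calling `function`, which is exactly the point being made: the wrapper itself accepts any arguments.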
> > Did I answer your questions ? > Thanks again for the quick feedback ! > Best, > > Sylvain > > -----Original Message----- > From : Python-ideas > On behalf of Steven D'Aprano > Sent : Tuesday, 12 March 2019 12:30 > To : python-ideas at python.org > Subject : Re: [Python-ideas] Problems (and solutions?) in writing decorators > > [External email: Use caution with links and attachments] > > ________________________________ > > > > On Tue, Mar 12, 2019 at 09:36:41AM +0000, Sylvain MARIE via Python-ideas > wrote: > > > I therefore proposed > > https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fsma > > rie.github.io%2Fpython-makefun%2F&data=02%7C01%7Csylvain.marie%40s > > e.com%7C579232e7e10e475314c708d6a6de9d23%7C6e51e1adc54b4b39b5980ffe9ae > > 68fef%7C0%7C0%7C636879872385158085&sdata=nB9p9V%2BJ7gk%2Fsc%2BA5%2 > > Fekk35bnYGvmEFJyCXaLDyLm9I%3D&reserved=0 . In particular it > > provides an equivalent of `@functools.wraps` that is truly > > signature-preserving > > Tell us more about that please. I'm very interested in getting decorators > to preserve the original signature.
> > -- > Steven > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > > https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmail.python.org%2Fmailman%2Flistinfo%2Fpython-ideas&data=02%7C01%7Csylvain.marie%40se.com%7C579232e7e10e475314c708d6a6de9d23%7C6e51e1adc54b4b39b5980ffe9ae68fef%7C0%7C0%7C636879872385158085&sdata=XcYfEginmDF7kIpGGA0XxDZKpUn9e4p2zPFk7UAruYg%3D&reserved=0 > Code of Conduct: > https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fpython.org%2Fpsf%2Fcodeofconduct%2F&data=02%7C01%7Csylvain.marie%40se.com%7C579232e7e10e475314c708d6a6de9d23%7C6e51e1adc54b4b39b5980ffe9ae68fef%7C0%7C0%7C636879872385158085&sdata=20ZrtVQZbpQ54c96veSXIOfEK7rKy0ggj0omTZg3ri8%3D&reserved=0 > > ______________________________________________________________________ > This email has been scanned by the Symantec Email Security.cloud service. > ______________________________________________________________________ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mertz at gnosis.cx Tue Mar 12 10:29:57 2019 From: mertz at gnosis.cx (David Mertz) Date: Tue, 12 Mar 2019 10:29:57 -0400 Subject: [Python-ideas] Problems (and solutions?) in writing decorators In-Reply-To: References: <20181024203512.GI3817@ando.pearwood.info> <5BD23DE1.1000909@canterbury.ac.nz> <20190312113010.GQ12502@ando.pearwood.info> Message-ID: The documentation for wrapt mentions: Decorators With Optional Arguments Although opinion can be mixed about whether the pattern is a good one, if the decorator arguments all have default values, it is also possible to implement decorators which have optional arguments. As Graham hints in his docs, I think repurposing decorator factories as decorators is an antipattern. Explicit is better than implicit. While I *do* understand that what decopatch and makefun do are technically independent, I'm not sure I ever want them independently in practice.
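[Editor's note] The "decorator with optional arguments" pattern being debated can be sketched in pure stdlib terms. This is a hypothetical decorator (the name `log` and its parameters are invented); it mirrors the shape of the pattern quoted from the wrapt docs, where the no-parenthesis call must be handled explicitly:

```python
import functools

def log(func=None, *, prefix="LOG"):
    # Called as @log (no parentheses): func is the decorated function.
    # Called as @log(prefix=...): func is None, so return a factory.
    if func is None:
        return functools.partial(log, prefix=prefix)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        print(prefix, func.__name__)
        return func(*args, **kwargs)
    return wrapper

@log                  # bare decorator
def a():
    return 1

@log(prefix=">>")     # decorator factory with a keyword-only argument
def b():
    return 2
```

Note the restriction the wrapt docs also call out: the optional arguments must be keyword-only, otherwise `@log(something)` would be ambiguous with `@log` applied to `something`.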
I did write the book _Functional Programming in Python_, so I'm not entirely unfamiliar with function wrappers. On Tue, Mar 12, 2019, 10:18 AM David Mertz wrote: > The wrapt module I linked to (not functools.wraps) provides all the > capabilities you mention since 2013. It allows mixed use of decorators as > decorator factories. It has a flat style. > > There are some minor API differences between your libraries and wrapt, but > the concept is very similar. Since yours is something new, I imagine you > perceive some win over what wrapt does. > > On Tue, Mar 12, 2019, 9:52 AM Sylvain MARIE wrote: > >> David, Steven, >> >> Thanks for your interest ! >> >> As you probably know, decorators and function wrappers are *completely >> different concepts*. A decorator can directly return the decorated function >> (or class), it does not have to return a wrapper. Even more, it can >> entirely replace the decorated item with something else (not even a >> function or class!). Try it: it is possible to write a decorator to replace >> a function with an integer, even though it is probably not quite useful :) >> >> `decopatch` helps you write decorators, whatever they are. It "just" >> solves the annoying issue of having to handle the no-parenthesis and >> with-parenthesis calls. In addition as a 'goodie', it proposes two >> development styles: *nested* (you have to return a function) and *flat* >> (you directly write what will happen when the decorator is applied to >> something). >> -- >> Now about creating signature-preserving function wrappers (in a >> decorator, or outside a decorator - again, that's not related). That use >> case is supposed to be covered by functools.wraps. Unfortunately as >> explained here >> https://stackoverflow.com/questions/308999/what-does-functools-wraps-do/55102697#55102697 >> this is not the case because with functools.wraps: >> - the wrapper code will execute even when the provided arguments are >> invalid.
>> - the wrapper code cannot easily access an argument using its name, from >> the received *args, **kwargs. Indeed one would have to handle all cases >> (positional, keyword, default) and therefore to use something like >> Signature.bind(). >> >> For this reason I proposed a replacement in `makefun`: >> https://smarie.github.io/python-makefun/#signature-preserving-function-wrappers >> -- >> Now bridging the gap. Of course a very interesting use cases for >> decorators is to create decorators that create a signature-preserving >> wrapper. It is possible to combine decopatch and makefun for this: >> https://smarie.github.io/python-decopatch/#3-creating-function-wrappers . >> Decopatch even proposes a "double-flat" development style where you >> directly write the wrapper body, as explained in the doc. >> >> Did I answer your questions ? >> Thanks again for the quick feedback ! >> Best, >> >> Sylvain >> >> -----Message d'origine----- >> De : Python-ideas >> De la part de Steven D'Aprano >> Envoy? : mardi 12 mars 2019 12:30 >> ? : python-ideas at python.org >> Objet : Re: [Python-ideas] Problems (and solutions?) in writing decorators >> >> [External email: Use caution with links and attachments] >> >> ________________________________ >> >> >> >> On Tue, Mar 12, 2019 at 09:36:41AM +0000, Sylvain MARIE via Python-ideas >> wrote: >> >> > I therefore proposed >> > https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fsma >> > rie.github.io%2Fpython-makefun%2F&data=02%7C01%7Csylvain.marie%40s >> > e.com%7C579232e7e10e475314c708d6a6de9d23%7C6e51e1adc54b4b39b5980ffe9ae >> > 68fef%7C0%7C0%7C636879872385158085&sdata=nB9p9V%2BJ7gk%2Fsc%2BA5%2 >> > Fekk35bnYGvmEFJyCXaLDyLm9I%3D&reserved=0 . In particular it >> > provides an equivalent of `@functools.wraps` that is truly >> > signature-preserving >> >> Tell us more about that please. I'm very interested in getting decorators >> preserve the original signature. 
>> >> >> -- >> Steven >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> >> https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmail.python.org%2Fmailman%2Flistinfo%2Fpython-ideas&data=02%7C01%7Csylvain.marie%40se.com%7C579232e7e10e475314c708d6a6de9d23%7C6e51e1adc54b4b39b5980ffe9ae68fef%7C0%7C0%7C636879872385158085&sdata=XcYfEginmDF7kIpGGA0XxDZKpUn9e4p2zPFk7UAruYg%3D&reserved=0 >> Code of Conduct: >> https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fpython.org%2Fpsf%2Fcodeofconduct%2F&data=02%7C01%7Csylvain.marie%40se.com%7C579232e7e10e475314c708d6a6de9d23%7C6e51e1adc54b4b39b5980ffe9ae68fef%7C0%7C0%7C636879872385158085&sdata=20ZrtVQZbpQ54c96veSXIOfEK7rKy0ggj0omTZg3ri8%3D&reserved=0 >> >> ______________________________________________________________________ >> This email has been scanned by the Symantec Email Security.cloud service. >> ______________________________________________________________________ >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From prometheus235 at gmail.com Tue Mar 12 10:57:58 2019 From: prometheus235 at gmail.com (Nick Timkovich) Date: Tue, 12 Mar 2019 09:57:58 -0500 Subject: [Python-ideas] Left arrow and right arrow operators In-Reply-To: <6d8e6422-9635-fd4a-439d-7529fbca7c4a@email.de> References: <4fd9518d-3707-7f9c-751a-b2eefa172634@email.de> <20190303150638.zv24yqah625uiypj@phdru.name> <5C830BE8.1030606@canterbury.ac.nz> <6d8e6422-9635-fd4a-439d-7529fbca7c4a@email.de> Message-ID: In general, there is lots of code out in the wild that can't be updated for whatever reason, e.g. the person that knows Python left and it needs to continue to work. Weak argument, but cost-benefit I think it comes out ahead. In your example there isn't a reason I can tell why swapping the operands isn't what should be done as Calvin mentioned. 
The onus is on you to positively demonstrate you require both directions, not him to negatively demonstrate it's never required. I suggest you confine your proposal to `->` only, as it's currently illegal syntax. You would also want the reflected `__r*__` equivalent of `__arrow__` or `__rarrow__` (`__rrarrow__` if you also need the left-arrow...) Perhaps broadening the use of it, functions may be able to use it as a pipe operator, e.g. Elixir: https://elixir-lang.org/getting-started/enumerables-and-streams.html#the-pipe-operator On Mon, Mar 11, 2019 at 2:58 PM francismb wrote: > Hi Greg, > > On 3/9/19 1:42 AM, Greg Ewing wrote: > > Do you really want > > to tell them that all their code is now wrong? > Of course not, at least not so promptly. But, would it be still a > problem if the update to a new version (let say from 3.X to next(3.X)) > is done through some kind of updater/re-writer/evolver. In that case the > evolver could just add the blanks. What do you think ? Could it work? > > Thanks in advance! > --francis > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From J.Demeyer at UGent.be Tue Mar 12 11:20:34 2019 From: J.Demeyer at UGent.be (Jeroen Demeyer) Date: Tue, 12 Mar 2019 16:20:34 +0100 Subject: [Python-ideas] Preallocated tuples and dicts for function calls In-Reply-To: References: Message-ID: <5C87CE42.2090507@UGent.be> On 2019-03-08 22:16, Martin Bammer wrote: > Hi, > > what about the idea that the interpreter preallocates and preinitializes > the > > tuples and dicts for function calls where possible when loading a module? The basic premise here is wrong: function calls using the METH_FASTCALL convention don't need to allocate any temporary tuple/dict for function calls. 
Of course, not all function calls use METH_FASTCALL, but most of them where performance matters do. From pythonchb at gmail.com Tue Mar 12 11:34:18 2019 From: pythonchb at gmail.com (Christopher Barker) Date: Tue, 12 Mar 2019 08:34:18 -0700 Subject: [Python-ideas] Dict joining using + and += In-Reply-To: References: <0433166f-4f88-93d6-327b-ed1708c17fa7@gmail.com> <378B4AB0-A949-462F-ADCE-10D40E3969CC@killingar.net> Message-ID: > This question is probably on its own a valid argument against the > proposal. When it comes to dicts (and not Mappings in general) {**d1, > **d2} or d.update() already have clearly-defined semantics. Actually, in my mind, this is an argument for an operator (or method) -- besides being obtuse, the {**d1,**d2} syntax only creates actual dicts. If we had an operator defined for mappings in general, it would be easier to duck type dicts. I think this is pretty compelling, actually. And also an argument for having the operation return the type it was invoked on, rather than always a dict. I can't find the latest draft of the PEP, so I'm not sure if this is discussed there. But it should be. -CHB -- Christopher Barker, PhD Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython -------------- next part -------------- An HTML attachment was scrubbed... URL: From pythonchb at gmail.com Tue Mar 12 11:59:44 2019 From: pythonchb at gmail.com (Christopher Barker) Date: Tue, 12 Mar 2019 08:59:44 -0700 Subject: [Python-ideas] Suggestions: dict.flow_update and dict.__add__ In-Reply-To: References: Message-ID: On Fri, Mar 8, 2019 at 1:52 AM Jonathan Fine wrote: > I've just learnt something new. Look at > > >>> from operator import iadd > >>> lst = [1, 2, 3] > >>> iadd(lst, 'hi') > [1, 2, 3, 'h', 'i'] > >>> lst > [1, 2, 3, 'h', 'i'] > > This shows that the proposals dict.flow_update and dict.__iadd__ are > basically the same.
(I think this is quite important for understanding > the attraction of fluent programming. We ALREADY like and use it, in > the form of augmented assignment of mutables.) > well, no -- the fact that __iadd__ acts like it does is essentially an accident of implementation, and calling __iadd__ directly is frowned upon (and I'm not totally sure if it is guaranteed to keep working that way by the language spec). And, in fact, it DOESN'T act like flow_merge method -- as it both mutates the original object, and returns itself -- which I think is a no-no in fluent programming, yes? (certainly in functional programming) In [10]: list1 = [1,2,3] In [11]: list2 = [4,5,6] In [12]: list3 = list1.__iadd__(list2) In [13]: list3 Out[13]: [1, 2, 3, 4, 5, 6] In [14]: list1 Out[14]: [1, 2, 3, 4, 5, 6] In [15]: list1 is list3 Out[15]: True This also shows that > combined = defaults.copy() > combined.update(options) > could, if the proposal is accepted, be written as > defaults.copy().__iadd__(options) > did you mean: combined = defaults.copy().__iadd__(options) because the way you wrote it, you are making a copy, mutating it, and then throwing it away... in which case, yes it could, but it would not be recommended, and I can't see the advantage of it over: combined = defaults + options or even, if you REALLY want to use __ methods: combined = defaults.__add__(options) In [17]: list3 = list1.__add__(list2) In [18]: list1 Out[18]: [1, 2, 3] In [19]: list3 Out[19]: [1, 2, 3, 4, 5, 6] > I got the idea from the withdrawn PEP (thank you, Nick Coghlan, for > writing it): > PEP 577 -- Augmented Assignment Expressions > https://www.python.org/dev/peps/pep-0577/ > Interestingly (to me) it was withdrawn for different reasons than what I would think -- mutating and assigning at once is dangerous. 
-CHB -- Christopher Barker, PhD Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython -------------- next part -------------- An HTML attachment was scrubbed... URL: From sylvain.marie at se.com Tue Mar 12 12:39:10 2019 From: sylvain.marie at se.com (Sylvain MARIE) Date: Tue, 12 Mar 2019 16:39:10 +0000 Subject: [Python-ideas] Problems (and solutions?) in writing decorators In-Reply-To: References: <20181024203512.GI3817@ando.pearwood.info> <5BD23DE1.1000909@canterbury.ac.nz> <20190312113010.GQ12502@ando.pearwood.info> Message-ID: Thanks David, > I did write the book _Functional Programming in Python_, so I'm not entirely unfamiliar with function wrappers. Nice ! I did not realize ; good job here, congrats! ;) -- I carefully read the documentation you pointed at, https://wrapt.readthedocs.io/en/latest/decorators.html#decorators-with-optional-arguments This is the example shown by the author: def with_optional_arguments(wrapped=None, myarg1=1, myarg2=2): if wrapped is None: return functools.partial(with_optional_arguments, myarg1=myarg1, myarg2=myarg2) @wrapt.decorator def wrapper(wrapped, instance, args, kwargs): return wrapped(*args, **kwargs) return wrapper(wrapped) As you can see: * the developer has to explicitly handle the no-parenthesis case (the first two lines of code). * And in the next lines of the doc you see his recommendations: "For this to be used in this way, it is a requirement that the decorator arguments be supplied as keyword arguments. If using Python 3, the requirement to use keyword only arguments can again be enforced using the keyword only argument syntax." * Finally, but this is just a comment: this is not "flat" mode but nested mode (the code returns a decorator that returns a function wrapper) So if I'm not mistaken, the problem is not really solved.
Or at least, not the way I would like the problem to be solved: it is solved here (a) only if the developer takes extra care and (b) it reduces the ways the decorator can be used (no positional args). This is precisely because I was frustrated by all these limitations, which depend on the desired signature, that I wrote decopatch. As a developer I do not want to care about which trick to use in which situation (mandatory args, optional args, var-positional args...). My decorators may change signature during the development cycle, and having to change trick every time I changed the signature would be a bit tiring.
--
Concerning creation of signature-preserving wrappers: @wrapt.decorator is not signature-preserving, I just checked. You can verify it with the following experiment:

    def dummy(wrapped):
        @wrapt.decorator
        def wrapper(wrapped, instance, args, kwargs):
            print("wrapper called")
            return wrapped(*args, **kwargs)
        return wrapper(wrapped)

    @dummy
    def function(a, b):
        pass

If you call

    function(1)

you will see that "wrapper called" is displayed before the TypeError is raised...

The signature-preserving equivalent of @wrapt.decorator, @decorator.decorator, is the source of inspiration for makefun. You can see `makefun` as a generalization of the core of `decorator`.
--
> I'm not sure I ever want them (decopatch and makefun) independently in practice

I totally understand. But some projects actually need makefun and not decopatch because their need is different: they just want to create a function dynamically. This is low-level tooling, really. So at least now there is a clear separation of concerns (and dedicated issue management/roadmaps, which is also quite convenient. Not to mention readability!). To cover your concern: decopatch depends on makefun, so both come at the same time when you install decopatch, and decopatch by default relies on makefun when you use it in "double-flat"
mode to create wrappers, as explained here: https://smarie.github.io/python-decopatch/#even-simpler
--
Thanks again for this discussion! It is challenging but it is necessary, to make sure I did not answer a non-existent need ;)

Kind regards
--
Sylvain

De : David Mertz
Envoyé : mardi 12 mars 2019 15:30
À : Sylvain MARIE
Cc : Steven D'Aprano ; python-ideas
Objet : Re: [Python-ideas] Problems (and solutions?) in writing decorators

[External email: Use caution with links and attachments]
________________________________

The documentation for wrapt mentions:

Decorators With Optional Arguments

Although opinion can be mixed about whether the pattern is a good one, if the decorator arguments all have default values, it is also possible to implement decorators which have optional arguments.

As Graham hints in his docs, I think repurposing decorator factories as decorators is an antipattern. Explicit is better than implicit.

While I *do* understand that what decopatch and makefun do are technically independent, I'm not sure I ever want them independently in practice. I did write the book _Functional Programming in Python_, so I'm not entirely unfamiliar with function wrappers.

On Tue, Mar 12, 2019, 10:18 AM David Mertz wrote:

The wrapt module I linked to (not functools.wraps) has provided all the capabilities you mention since 2013. It allows mixed use of decorators as decorator factories. It has a flat style.

There are some minor API differences between your libraries and wrapt, but the concept is very similar. Since yours is something new, I imagine you perceive some win over what wrapt does.

On Tue, Mar 12, 2019, 9:52 AM Sylvain MARIE wrote:

David, Steven,

Thanks for your interest! As you probably know, decorators and function wrappers are *completely different concepts*. A decorator can directly return the decorated function (or class); it does not have to return a wrapper.
Even more, it can entirely replace the decorated item with something else (not even a function or class!). Try it: it is possible to write a decorator that replaces a function with an integer, even though that is probably not quite useful :)

`decopatch` helps you write decorators, whatever they are. It "just" solves the annoying issue of having to handle the no-parenthesis and with-parenthesis calls. In addition, as a 'goodie', it proposes two development styles: *nested* (you have to return a function) and *flat* (you directly write what will happen when the decorator is applied to something).
--
Now about creating signature-preserving function wrappers (in a decorator, or outside a decorator -- again, that's not related). That use case is supposed to be covered by functools.wraps. Unfortunately, as explained here https://stackoverflow.com/questions/308999/what-does-functools-wraps-do/55102697#55102697 this is not the case, because with functools.wraps:

- the wrapper code will execute even when the provided arguments are invalid.
- the wrapper code cannot easily access an argument using its name, from the received *args, **kwargs. Indeed one would have to handle all cases (positional, keyword, default) and therefore use something like Signature.bind().

For this reason I proposed a replacement in `makefun`: https://smarie.github.io/python-makefun/#signature-preserving-function-wrappers
--
Now bridging the gap. Of course a very interesting use case for decorators is to create decorators that create a signature-preserving wrapper. It is possible to combine decopatch and makefun for this: https://smarie.github.io/python-decopatch/#3-creating-function-wrappers . Decopatch even proposes a "double-flat" development style where you directly write the wrapper body, as explained in the doc.

Did I answer your questions? Thanks again for the quick feedback!

Best,
Sylvain

-----Message d'origine-----
De : Python-ideas De la part de Steven D'Aprano
Envoyé
: mardi 12 mars 2019 12:30
À : python-ideas at python.org
Objet : Re: [Python-ideas] Problems (and solutions?) in writing decorators

[External email: Use caution with links and attachments]
________________________________

On Tue, Mar 12, 2019 at 09:36:41AM +0000, Sylvain MARIE via Python-ideas wrote:

> I therefore proposed https://smarie.github.io/python-makefun/ .
> In particular it provides an equivalent of `@functools.wraps` that is
> truly signature-preserving

Tell us more about that please. I'm very interested in getting decorators to preserve the original signature.

--
Steven
_______________________________________________
Python-ideas mailing list
Python-ideas at python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/
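The "truly signature-preserving" distinction discussed in this thread can be demonstrated with the standard library alone: a plain functools.wraps wrapper *reports* the original signature but does not *enforce* it, while binding the declared signature up front does. This is only a sketch of the idea; `naive_wraps` and `strict_wraps` are hypothetical names, not makefun's actual API:

```python
import functools
import inspect

calls = []

def naive_wraps(wrapped):
    """Plain functools.wraps wrapper: its body runs even on invalid calls."""
    @functools.wraps(wrapped)
    def wrapper(*args, **kwargs):
        calls.append("naive")           # executes before any argument check
        return wrapped(*args, **kwargs)
    return wrapper

def strict_wraps(wrapped):
    """Sketch of early validation: bind the declared signature first,
    so a bad call raises TypeError before the wrapper body runs."""
    sig = inspect.signature(wrapped)
    @functools.wraps(wrapped)
    def wrapper(*args, **kwargs):
        sig.bind(*args, **kwargs)       # raises TypeError on invalid arguments
        calls.append("strict")
        return wrapped(*args, **kwargs)
    return wrapper

def function(a, b):
    return a + b

naive = naive_wraps(function)
strict = strict_wraps(function)

# Both *report* the original signature, because functools.wraps sets
# __wrapped__ and inspect.signature follows it...
assert str(inspect.signature(naive)) == "(a, b)"
assert str(inspect.signature(strict)) == "(a, b)"

# ...but only the strict variant enforces it before running its body.
for f in (naive, strict):
    try:
        f(1)                            # missing argument 'b'
    except TypeError:
        pass

assert calls == ["naive"]               # naive body ran; strict body did not
assert naive(1, 2) == 3 and strict(1, 2) == 3
```

The `sig.bind()` call is the same mechanism Sylvain mentions (`Signature.bind()`); libraries like makefun additionally generate a wrapper whose *actual* signature matches, rather than a generic `*args, **kwargs` one.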
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From mertz at gnosis.cx  Tue Mar 12 14:15:04 2019
From: mertz at gnosis.cx (David Mertz)
Date: Tue, 12 Mar 2019 14:15:04 -0400
Subject: [Python-ideas] Problems (and solutions?) in writing decorators
In-Reply-To: References: <20181024203512.GI3817@ando.pearwood.info> <5BD23DE1.1000909@canterbury.ac.nz> <20190312113010.GQ12502@ando.pearwood.info>
Message-ID:

One of the nice things in wrapt is that Dumpleton lets you use the same decorator for functions, regular methods, static methods, and class methods. Does yours handle that sort of "polymorphism"?

FWIW, thanks for the cool work with your libraries! I don't think I will want the specific with-or-without parens feature, since it feels too implicit. Typing `@deco_factory()` really isn't too much work for me to use the two characters extra. But given that I feel the option is an antipattern, I don't want to add core language features to make the pattern easier. Both you and Graham Dumpleton have found workarounds to get that behavior when it is wanted, but I don't want it to be "too easy."

FWIW... I think I'd be tempted to use a metaclass approach so that both the class and instance are callable. The class would be called with a single function argument (i.e. a decorator), but if called with any other signature it would manufacture a callable instance that was parameterized by the initialization arguments (i.e. a decorator factory). Actually, I haven't looked at your actual code, maybe that's what you do.

Best, David...
;) > > > > -- > > I carefully read the documentation you pointed at, > https://wrapt.readthedocs.io/en/latest/decorators.html#decorators-with-optional-arguments > > This is the example shown by the author: > > > > def with_optional_arguments(wrapped=None, myarg1=1, myarg2=2): > > if wrapped is None: > > return functools.partial(with_optional_arguments, > > myarg1=myarg1, myarg2=myarg2) > > > > @wrapt.decorator > > def wrapper(wrapped, instance, args, kwargs): > > return wrapped(*args, **kwargs) > > > > return wrapper(wrapped) > > > > As you can see: > > - the developer has to explicitly handle the no-parenthesis case (the > first two lines of code). > - And in the next lines of the doc you see his recommendations ?For > this to be used in this way, it is a requirement that the decorator > arguments be supplied as keyword arguments. If using Python 3, the > requirement to use keyword only arguments can again be enforced using the > keyword only argument syntax.? > - Finally, but this is just a comment: this is not ?flat? mode but > nested mode (the code returns a decorator that returns a function wrapper) > > > > So if I?m not misleading, the problem is not really solved. Or at least, > not the way I would like the problem to be solved : it is solved here (a) > only if the developer takes extra care and (b) reduces the way the > decorator can be used (no positional args). This is precisely because I was > frustrated by all these limitations that depend on the desired signature > that I wrote decopatch. As a developer I do not want to care about which > trick to use in which situation (mandatory args, optional args, > var-positional args..). My decorators may change signature during the > development cycle, and if I frequently had to change trick during > development as I changed the signature - that is a bit tiring. > > > > -- > > Concerning creation of signature-preserving wrappers: @wrapt.decorator is > not signature preserving, I just checked it. 
You can check it with the > following experiment: > > > > *def *dummy(wrapped): > @wrapt.decorator > *def *wrapper(wrapped, instance, args, kwargs): > print(*"wrapper called"*) > *return *wrapped(*args, **kwargs) > *return *wrapper(wrapped) > > > > @dummy > *def *function(a, b): > *pass* > > > > If you call > > > > function(1) > > > > you will see that ?wrapper called? is displayed before the TypeError is > raised? > > > > The signature-preserving equivalent of @wrapt.decorator, > @decorator.decorator, is the source of inspiration for makefun. You can > see `makefun` as a generalization of the core of `decorator`. > > > > -- > > > I'm not sure I ever want them (decopatch and makefun) independently in > practice > > > > I totally understand. > > But some projects actually need makefun and not decopatch because their > need is different: they just want to create a function dynamically. This is > low-level tooling, really. > > So at least now there is a clear separation of concerns (and dedicated > issues management/roadmap, which is also quite convenient. Not to mention > readability !). > > To cover your concern: decopatch depends on makefun, so both come at the > same time when you install decopatch, and decopatch by default relies on > makefun when you use it in ?double-flat? mode to create wrappers as > explained here https://smarie.github.io/python-decopatch/#even-simpler > > -- > > > > Thanks again for this discussion! It is challenging but it is necessary, > to make sure I did not answer a non-existent need ;) > > Kind regards > > > > -- > > Sylvain > > > > *De :* David Mertz > *Envoy? :* mardi 12 mars 2019 15:30 > *? :* Sylvain MARIE > *Cc :* Steven D'Aprano ; python-ideas < > python-ideas at python.org> > *Objet :* Re: [Python-ideas] Problems (and solutions?) 
in writing > decorators > > > > [External email: Use caution with links and attachments] > ------------------------------ > > > > The documentation for wrapt mentions: > > > Decorators With Optional Arguments > > Although opinion can be mixed about whether the pattern is a good one, if > the decorator arguments all have default values, it is also possible to > implement decorators which have optional arguments. > > As Graham hints in his docs, I think repurposing decorator factories as > decorators is an antipattern. Explicit is better than implicit. > > > > While I *do* understands that what decotools and makefun do are > technically independent, I'm not sure I ever want them independently in > practice. I did write the book _Functional Programming in Python_, so I'm > not entirely unfamiliar with function wrappers. > > On Tue, Mar 12, 2019, 10:18 AM David Mertz wrote: > > The wrapt module I linked to (not funtools.wraps) provides all the > capabilities you mention since 2013. It allows mixed use of decorators as > decorator factories. It has a flat style. > > > > There are some minor API difference between your libraries and wrapt, but > the concept is very similar. Since yours is something new, I imagine you > perceive some win over what wrapt does. > > On Tue, Mar 12, 2019, 9:52 AM Sylvain MARIE wrote: > > David, Steven, > > Thanks for your interest ! > > As you probably know, decorators and function wrappers are *completely > different concepts*. A decorator can directly return the decorated function > (or class), it does not have to return a wrapper. Even more, it can > entirely replace the decorated item with something else (not even a > function or class!). Try it: it is possible to write a decorator to replace > a function with an integer, even though it is probably not quite useful :) > > `decopatch` helps you write decorators, whatever they are. It "just" > solves the annoying issue of having to handle the no-parenthesis and > with-parenthesis calls. 
In addition as a 'goodie', it proposes two > development styles: *nested* (you have to return a function) and *flat* > (you directly write what will happen when the decorator is applied to > something). > -- > Now about creating signature-preserving function wrappers (in a decorator, > or outside a decorator - again, that's not related). That use case is > supposed to be covered by functools.wrapt. Unfortunately as explained here > https://stackoverflow.com/questions/308999/what-does-functools-wraps-do/55102697#55102697 > > this is not the case because with functools.wrapt: > - the wrapper code will execute even when the provided arguments are > invalid. > - the wrapper code cannot easily access an argument using its name, from > the received *args, **kwargs. Indeed one would have to handle all cases > (positional, keyword, default) and therefore to use something like > Signature.bind(). > > For this reason I proposed a replacement in `makefun`: > https://smarie.github.io/python-makefun/#signature-preserving-function-wrappers > > -- > Now bridging the gap. Of course a very interesting use cases for > decorators is to create decorators that create a signature-preserving > wrapper. It is possible to combine decopatch and makefun for this: > https://smarie.github.io/python-decopatch/#3-creating-function-wrappers > > . > Decopatch even proposes a "double-flat" development style where you > directly write the wrapper body, as explained in the doc. > > Did I answer your questions ? > Thanks again for the quick feedback ! > Best, > > Sylvain > > -----Message d'origine----- > De : Python-ideas > De la part de Steven D'Aprano > Envoy? : mardi 12 mars 2019 12:30 > ? : python-ideas at python.org > Objet : Re: [Python-ideas] Problems (and solutions?) 
in writing decorators > > [External email: Use caution with links and attachments] > > ________________________________ > > > > On Tue, Mar 12, 2019 at 09:36:41AM +0000, Sylvain MARIE via Python-ideas > wrote: > > > I therefore proposed > > https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fsma > > rie.github.io > > %2Fpython-makefun%2F&data=02%7C01%7Csylvain.marie%40s > > e.com > > %7C579232e7e10e475314c708d6a6de9d23%7C6e51e1adc54b4b39b5980ffe9ae > > 68fef%7C0%7C0%7C636879872385158085&sdata=nB9p9V%2BJ7gk%2Fsc%2BA5%2 > > Fekk35bnYGvmEFJyCXaLDyLm9I%3D&reserved=0 . In particular it > > provides an equivalent of `@functools.wraps` that is truly > > signature-preserving > > Tell us more about that please. I'm very interested in getting decorators > preserve the original signature. > > > -- > Steven > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > > https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmail.python.org%2Fmailman%2Flistinfo%2Fpython-ideas&data=02%7C01%7Csylvain.marie%40se.com%7C579232e7e10e475314c708d6a6de9d23%7C6e51e1adc54b4b39b5980ffe9ae68fef%7C0%7C0%7C636879872385158085&sdata=XcYfEginmDF7kIpGGA0XxDZKpUn9e4p2zPFk7UAruYg%3D&reserved=0 > > Code of Conduct: > https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fpython.org%2Fpsf%2Fcodeofconduct%2F&data=02%7C01%7Csylvain.marie%40se.com%7C579232e7e10e475314c708d6a6de9d23%7C6e51e1adc54b4b39b5980ffe9ae68fef%7C0%7C0%7C636879872385158085&sdata=20ZrtVQZbpQ54c96veSXIOfEK7rKy0ggj0omTZg3ri8%3D&reserved=0 > > > ______________________________________________________________________ > This email has been scanned by the Symantec Email Security.cloud service. > ______________________________________________________________________ > > > ______________________________________________________________________ > This email has been scanned by the Symantec Email Security.cloud service. 
--
Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From chris.barker at noaa.gov  Tue Mar 12 19:40:20 2019
From: chris.barker at noaa.gov (Chris Barker)
Date: Tue, 12 Mar 2019 16:40:20 -0700
Subject: [Python-ideas] PEP: Dict addition and subtraction
In-Reply-To: <20190309232426.GL12502@ando.pearwood.info>
References: <20190301162645.GM4465@ando.pearwood.info> <20190309165514.GA29550@ando.pearwood.info> <20190309232426.GL12502@ando.pearwood.info>
Message-ID:

Just in case I'm not the only one who had a hard time finding the latest version of this PEP, here it is in the PEPs repo: https://github.com/python/peps/blob/master/pep-0584.rst

-CHB
--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From dkteresi at gmail.com  Wed Mar 13 14:44:51 2019
From: dkteresi at gmail.com (David Teresi)
Date: Wed, 13 Mar 2019 14:44:51 -0400
Subject: [Python-ideas] Left arrow and right arrow operators
In-Reply-To: References: <4fd9518d-3707-7f9c-751a-b2eefa172634@email.de> <20190303150638.zv24yqah625uiypj@phdru.name> <5C830BE8.1030606@canterbury.ac.nz> <6d8e6422-9635-fd4a-439d-7529fbca7c4a@email.de>
Message-ID:

`->` would not be ambiguous in the proposed cases, but it does already mean something elsewhere in the language as of 3.5:

    def concat(a: str, b: str) -> str:
        return a + b

This could potentially cause confusion (as with the % operator being used for modulo as well as string formatting).

On Tue, Mar 12, 2019 at 10:58 AM Nick Timkovich wrote:

> In general, there is lots of code out in the wild that can't be updated for whatever reason, e.g. the person that knows Python left and it needs to continue to work. Weak argument, but cost-benefit I think it comes out ahead. In your example there isn't a reason I can tell why swapping the operands isn't what should be done, as Calvin mentioned. The onus is on you to positively demonstrate you require both directions, not him to negatively demonstrate it's never required.
>
> I suggest you confine your proposal to `->` only, as it's currently illegal syntax. You would also want the reflected `__r*__` equivalent of `__arrow__` or `__rarrow__` (`__rrarrow__` if you also need the left-arrow...)
>
> Perhaps broadening the use of it, functions may be able to use it as a pipe operator, e.g. Elixir: https://elixir-lang.org/getting-started/enumerables-and-streams.html#the-pipe-operator
>
> On Mon, Mar 11, 2019 at 2:58 PM francismb wrote:
>>
>> Hi Greg,
>>
>> On 3/9/19 1:42 AM, Greg Ewing wrote:
>> > Do you really want
>> > to tell them that all their code is now wrong?
>> Of course not, at least not so promptly.
>> But, would it be still a problem if the update to a new version (let's
>> say from 3.X to next(3.X)) is done through some kind of
>> updater/re-writer/evolver. In that case the evolver could just add the
>> blanks. What do you think? Could it work?
>>
>> Thanks in advance!
>> --francis
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> https://mail.python.org/mailman/listinfo/python-ideas
>> Code of Conduct: http://python.org/psf/codeofconduct/
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

From 2QdxY4RzWzUUiLuE at potatochowder.com  Wed Mar 13 14:54:17 2019
From: 2QdxY4RzWzUUiLuE at potatochowder.com (Dan Sommers)
Date: Wed, 13 Mar 2019 13:54:17 -0500
Subject: [Python-ideas] Left arrow and right arrow operators
In-Reply-To: References: <4fd9518d-3707-7f9c-751a-b2eefa172634@email.de> <20190303150638.zv24yqah625uiypj@phdru.name> <5C830BE8.1030606@canterbury.ac.nz> <6d8e6422-9635-fd4a-439d-7529fbca7c4a@email.de>
Message-ID: <5e24ce0d-2f35-8f2c-5240-7dc9fd14d7e2@potatochowder.com>

On 3/13/19 1:44 PM, David Teresi wrote:
> `->` would not be ambiguous in the proposed cases, but it does already
> mean something elsewhere in the language as of 3.5:
>
>     def concat(a: str, b: str) -> str:
>         return a + b
>
> This could potentially cause confusion (as with the % operator being
> used for modulo as well as string formatting).

But by that logic, the colon is also ambiguous: the colon is used to indicate a dictionary entry, as in {a : str}. Given the radically different contexts in which the tokens in question occur, I don't think that this is an issue (then again, I've never designed a language as widely consumed as Python).
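For reference on the existing meaning of `->` being debated here: the arrow merely records an annotation on the function object and enforces nothing at call time, as David's example can be extended to show:

```python
def concat(a: str, b: str) -> str:
    return a + b

# The arrow stores the return annotation alongside the parameter
# annotations in __annotations__ ...
assert concat.__annotations__ == {"a": str, "b": str, "return": str}

# ... but annotations are not enforced, so non-str arguments still work.
assert concat(1, 2) == 3
```

So the "confusion" concern is about human readers seeing `->` in two contexts, not about any runtime ambiguity.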
From sylvain.marie at se.com  Thu Mar 14 13:56:19 2019
From: sylvain.marie at se.com (Sylvain MARIE)
Date: Thu, 14 Mar 2019 17:56:19 +0000
Subject: [Python-ideas] Problems (and solutions?) in writing decorators
In-Reply-To: References: <20181024203512.GI3817@ando.pearwood.info> <5BD23DE1.1000909@canterbury.ac.nz> <20190312113010.GQ12502@ando.pearwood.info>
Message-ID:

Thanks David

Sorry for not getting back to you earlier, I made a bunch of releases of makefun in the meantime. In particular I fixed a few bugs and added an equivalent of `functools.partial`.

> One of the nice things in wrapt is that Dumpleton lets you use the same decorator for functions, regular methods, static methods, and class methods. Does yours handle that sort of "polymorphism"?

It should, but I will check it thoroughly -- I'll let you know.

> I don't think I will want the specific with-or-without parens feature, since it feels too implicit. Typing `@deco_factory()` really isn't too much work for me to use the two characters extra.

I totally understand your point of view. However, on the other hand, many very popular open source projects out there have the opposite point of view and provide decorators that can seamlessly be used with and without arguments (pytest, attrs, click, etc.). So after a while users get used to this behavior and expect it from all libraries. Making it easy to implement is therefore quite important, so that developers do not have to spend time on this useless "feature".

Kind regards

Sylvain
FWIW, thanks for the cool work with your libraries! I don't think I will want the specific with-or-without parens feature, since it feels too implicit. Typing `@deco_factory()` really isn't too much work for me to use the two characters extra. But given that I feel the option is an antipattern, I don't want to add core language features to make the pattern easier. Both you and Graham Dumpleton have found workarounds to get that behavior when it is wanted, but I don't want it to be "too easy." FWIW... I think I'd be tempted to use a metaclass approach so that both the class and instance are callable. The class would be called with a single function argument (i.e. a decorator), but if called with any other signature it would manufacture a callable instance that was parameterized by the initialization arguments (i.e. a decorator factory). Actually, I haven't looked at your actual code, maybe that's what you do. Best, David... On Tue, Mar 12, 2019 at 12:44 PM Sylvain MARIE > wrote: Thanks David, > I did write the book _Functional Programming in Python_, so I'm not entirely unfamiliar with function wrappers. Nice ! I did not realize ; good job here, congrats! ;) -- I carefully read the documentation you pointed at, https://wrapt.readthedocs.io/en/latest/decorators.html#decorators-with-optional-arguments This is the example shown by the author: def with_optional_arguments(wrapped=None, myarg1=1, myarg2=2): if wrapped is None: return functools.partial(with_optional_arguments, myarg1=myarg1, myarg2=myarg2) @wrapt.decorator def wrapper(wrapped, instance, args, kwargs): return wrapped(*args, **kwargs) return wrapper(wrapped) As you can see: * the developer has to explicitly handle the no-parenthesis case (the first two lines of code). * And in the next lines of the doc you see his recommendations ?For this to be used in this way, it is a requirement that the decorator arguments be supplied as keyword arguments. 
If using Python 3, the requirement to use keyword only arguments can again be enforced using the keyword only argument syntax.? * Finally, but this is just a comment: this is not ?flat? mode but nested mode (the code returns a decorator that returns a function wrapper) So if I?m not misleading, the problem is not really solved. Or at least, not the way I would like the problem to be solved : it is solved here (a) only if the developer takes extra care and (b) reduces the way the decorator can be used (no positional args). This is precisely because I was frustrated by all these limitations that depend on the desired signature that I wrote decopatch. As a developer I do not want to care about which trick to use in which situation (mandatory args, optional args, var-positional args..). My decorators may change signature during the development cycle, and if I frequently had to change trick during development as I changed the signature - that is a bit tiring. -- Concerning creation of signature-preserving wrappers: @wrapt.decorator is not signature preserving, I just checked it. You can check it with the following experiment: def dummy(wrapped): @wrapt.decorator def wrapper(wrapped, instance, args, kwargs): print("wrapper called") return wrapped(*args, **kwargs) return wrapper(wrapped) @dummy def function(a, b): pass If you call function(1) you will see that ?wrapper called? is displayed before the TypeError is raised? The signature-preserving equivalent of @wrapt.decorator, @decorator.decorator, is the source of inspiration for makefun. You can see `makefun` as a generalization of the core of `decorator`. -- > I'm not sure I ever want them (decopatch and makefun) independently in practice I totally understand. But some projects actually need makefun and not decopatch because their need is different: they just want to create a function dynamically. This is low-level tooling, really. 
So at least now there is a clear separation of concerns (and dedicated issues management/roadmap, which is also quite convenient. Not to mention readability !). To cover your concern: decopatch depends on makefun, so both come at the same time when you install decopatch, and decopatch by default relies on makefun when you use it in ?double-flat? mode to create wrappers as explained here https://smarie.github.io/python-decopatch/#even-simpler -- Thanks again for this discussion! It is challenging but it is necessary, to make sure I did not answer a non-existent need ;) Kind regards -- Sylvain De : David Mertz > Envoy? : mardi 12 mars 2019 15:30 ? : Sylvain MARIE > Cc : Steven D'Aprano >; python-ideas > Objet : Re: [Python-ideas] Problems (and solutions?) in writing decorators [External email: Use caution with links and attachments] ________________________________ The documentation for wrapt mentions: Decorators With Optional Arguments Although opinion can be mixed about whether the pattern is a good one, if the decorator arguments all have default values, it is also possible to implement decorators which have optional arguments. As Graham hints in his docs, I think repurposing decorator factories as decorators is an antipattern. Explicit is better than implicit. While I *do* understands that what decotools and makefun do are technically independent, I'm not sure I ever want them independently in practice. I did write the book _Functional Programming in Python_, so I'm not entirely unfamiliar with function wrappers. On Tue, Mar 12, 2019, 10:18 AM David Mertz > wrote: The wrapt module I linked to (not funtools.wraps) provides all the capabilities you mention since 2013. It allows mixed use of decorators as decorator factories. It has a flat style. There are some minor API difference between your libraries and wrapt, but the concept is very similar. Since yours is something new, I imagine you perceive some win over what wrapt does. 
On Tue, Mar 12, 2019, 9:52 AM Sylvain MARIE > wrote:

David, Steven,

Thanks for your interest! As you probably know, decorators and function wrappers are *completely different concepts*. A decorator can directly return the decorated function (or class); it does not have to return a wrapper. Even more, it can entirely replace the decorated item with something else (not even a function or class!). Try it: it is possible to write a decorator to replace a function with an integer, even though it is probably not quite useful :)

`decopatch` helps you write decorators, whatever they are. It "just" solves the annoying issue of having to handle the no-parenthesis and with-parenthesis calls. In addition, as a 'goodie', it proposes two development styles: *nested* (you have to return a function) and *flat* (you directly write what will happen when the decorator is applied to something).

--

Now about creating signature-preserving function wrappers (in a decorator, or outside a decorator - again, that's not related). That use case is supposed to be covered by functools.wraps. Unfortunately, as explained here https://stackoverflow.com/questions/308999/what-does-functools-wraps-do/55102697#55102697 this is not the case, because with functools.wraps:

- the wrapper code will execute even when the provided arguments are invalid.
- the wrapper code cannot easily access an argument using its name, from the received *args, **kwargs. Indeed one would have to handle all cases (positional, keyword, default) and therefore to use something like Signature.bind().

For this reason I proposed a replacement in `makefun`: https://smarie.github.io/python-makefun/#signature-preserving-function-wrappers

--

Now bridging the gap. Of course, a very interesting use case for decorators is to create decorators that create a signature-preserving wrapper. It is possible to combine decopatch and makefun for this: https://smarie.github.io/python-decopatch/#3-creating-function-wrappers .
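The Signature.bind() workaround Sylvain alludes to can be sketched with the stdlib alone. `log_argument_a` below is a hypothetical helper (not part of makefun): it resolves the parameter "a" by name however the caller passed it, and it rejects invalid calls before the wrapper body does any work:

```python
import functools
import inspect

seen = []

def log_argument_a(func):
    # Hypothetical helper: resolve the parameter "a" by name no matter
    # how the caller passed it (positionally, by keyword, or by default).
    sig = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)  # raises TypeError on bad args
        bound.apply_defaults()
        seen.append(bound.arguments["a"])
        return func(*args, **kwargs)
    return wrapper

@log_argument_a
def function(a, b=10):
    return a + b

function(1)     # positional
function(a=2)   # keyword
print(seen)     # → [1, 2]
```

Calling `function()` with no arguments raises TypeError at `sig.bind`, before anything is appended, which addresses both limitations listed above at the cost of a per-call binding step.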
Decopatch even proposes a "double-flat" development style where you directly write the wrapper body, as explained in the doc. Did I answer your questions? Thanks again for the quick feedback!

Best,

Sylvain

-----Message d'origine----- De : Python-ideas > De la part de Steven D'Aprano Envoyé : mardi 12 mars 2019 12:30 À : python-ideas at python.org Objet : Re: [Python-ideas] Problems (and solutions?) in writing decorators [External email: Use caution with links and attachments] ________________________________

On Tue, Mar 12, 2019 at 09:36:41AM +0000, Sylvain MARIE via Python-ideas wrote:

> I therefore proposed
> https://smarie.github.io/python-makefun/ . In particular it
> provides an equivalent of `@functools.wraps` that is truly
> signature-preserving

Tell us more about that please. I'm very interested in getting decorators to preserve the original signature.
-- Steven _______________________________________________ Python-ideas mailing list Python-ideas at python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/ -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From francismb at email.de Thu Mar 14 16:33:21 2019 From: francismb at email.de (francismb) Date: Thu, 14 Mar 2019 21:33:21 +0100 Subject: [Python-ideas] Code version evolver In-Reply-To: <20190311232501.GP12502@ando.pearwood.info> References: <20190311232501.GP12502@ando.pearwood.info> Message-ID: <13143e69-2a4a-382b-ea59-cf26bdceb5d1@email.de> Hi Steven, On 3/12/19 12:25 AM, Steven D'Aprano wrote: > I don't know who you expect is using this: the Python core developers > responsible for adding new language features and changing the grammar, > or Python programmers. Python core devs should write the 'python_next' and 'is_python_code' parts that move source code from the current version to the next if a backwards incompatible grammar change is needed. Python programmers may use the helpers to upgrade to the next version. > I don't know what part of the current code (current code of *what*?) is > supposed to be upgraded or evolved, or what you mean by that. Do you > mean using this to add new grammatical features to the interpreter? > > Do you mean something like 2to3? Something which transforms source code > written in Python? > Yes, a source transformer, but to be applied to some 3.x version to move it to the next 3.x+1, and so on ... (instead of '2to3' a kind of 'nowtonext', aka 'python_next') Couldn't that relax the tension on doing 'backward compatibility changes' a bit? 
Thanks, --francis From rosuav at gmail.com Thu Mar 14 16:47:51 2019 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 15 Mar 2019 07:47:51 +1100 Subject: [Python-ideas] Code version evolver In-Reply-To: <13143e69-2a4a-382b-ea59-cf26bdceb5d1@email.de> References: <20190311232501.GP12502@ando.pearwood.info> <13143e69-2a4a-382b-ea59-cf26bdceb5d1@email.de> Message-ID: On Fri, Mar 15, 2019 at 7:39 AM francismb wrote: > > Hi Steven, > > On 3/12/19 12:25 AM, Steven D'Aprano wrote: > > I don't know who you expect is using this: the Python core developers > > responsible for adding new language features and changing the grammar, > > or Python programmers. > Python core devs should write the 'python_next' and 'is_python_code' > parts that moves source code from the current version to the next if a > backwards incompatible grammar change is needed. > > Python programmers may use the helpers to upgrade to the next version. > > > > I don't know what part of the current code (current code of *what*?) is > > supposed to be upgraded or evolved, or what you mean by that. Do you > > mean using this to add new grammatical features to the interpreter? > > > > Do you mean something like 2to3? Something which transforms source code > > written in Python? > > > Yes a source transformer, but to be applied to some 3.x version to move > it to the next 3.x+1, and so on ... (instead of '2to3' a kind of > 'nowtonext', aka 'python_next') > > Couldn't that relax the tension on doing 'backward compatibility > changes' a bit ? What happens when someone wants to support multiple Python versions? "Requires Python 3.5 or newer" is easy. Forcing people to install the correct one for each version isn't. 
ChrisA From francismb at email.de Thu Mar 14 16:58:46 2019 From: francismb at email.de (francismb) Date: Thu, 14 Mar 2019 21:58:46 +0100 Subject: [Python-ideas] Code version evolver In-Reply-To: References: Message-ID: <23357464-8b15-ff45-824e-08ad0970ab03@email.de> Hi Paul, On 3/12/19 12:21 AM, Paul Moore wrote: > That sounds very similar to 2to3, which seemed like a good approach to > the Python 2 to Python 3 transition, but fell into disuse because > people who have to support multiple versions of Python in their code > found it *far* easier to do so with a single codebase that worked with > both versions, rather than needing to use a translator. Yes, the 2to3 idea is what was meant, but for translations inside the 3 series (from 3.x to 3.x+1). Trying to keep a single code base for 2/3 seems like a good idea (maybe the developer just cannot change to 3 fast due to how big the step was) but it also has the limitation of how far you can go using new features. Once you're just on the 3 series, couldn't such a 2to3 concept also help to speed things up? (due to the 'backwards-compatibility' issue) Thanks, --francis From tudorache.vlad at gmail.com Thu Mar 14 17:00:03 2019 From: tudorache.vlad at gmail.com (Vlad Tudorache) Date: Thu, 14 Mar 2019 22:00:03 +0100 Subject: [Python-ideas] HTML Wrapper Message-ID: Hello, I'd like to know if there is a basic HTML wrapper for Python, like TextWrapper but allowing the generation of HTML from strings or iterables of strings. Like:

    make_select = HTMLWrapper(tag='select class="eggs"', indent='  ')
    make_option = HTMLWrapper(tag='option')

Applying this like:

    s = make_select([make_option('Option %d' % (i + 1), \
                     escape=False, strip=False) for i in range(3)])

should return s like (when printed): Vlad Tudorache -------------- next part -------------- An HTML attachment was scrubbed... 
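There is no HTMLWrapper class in the standard library; the expected output was lost with the scrubbed HTML attachment. A minimal sketch of the behaviour Vlad seems to describe (with the escape and strip options omitted, and all names hypothetical) could look like this:

```python
class HTMLWrapper:
    # Hypothetical class: not part of the stdlib. Wraps string content
    # in an HTML tag, indenting the body by a configurable prefix.
    def __init__(self, tag, indent=''):
        self.tag = tag                  # e.g. 'select class="eggs"'
        self.name = tag.split()[0]      # closing tag uses the bare name
        self.indent = indent

    def __call__(self, content):
        if not isinstance(content, str):
            content = '\n'.join(content)
        body = '\n'.join(self.indent + line for line in content.splitlines())
        return f'<{self.tag}>\n{body}\n</{self.name}>'

make_select = HTMLWrapper(tag='select class="eggs"', indent='  ')
make_option = HTMLWrapper(tag='option')

s = make_select([make_option('Option %d' % (i + 1)) for i in range(3)])
print(s)
```

Printed, `s` is a `<select class="eggs">` element containing three indented `<option>` elements.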
URL: From steve at pearwood.info Thu Mar 14 18:37:26 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 15 Mar 2019 09:37:26 +1100 Subject: [Python-ideas] HTML Wrapper In-Reply-To: References: Message-ID: <20190314223725.GB12502@ando.pearwood.info> Hi Vlad, and welcome! On Thu, Mar 14, 2019 at 10:00:03PM +0100, Vlad Tudorache wrote: > Hello, > > I'd like to know if there is a basic HTML wrapper for Python, like > TextWrapper but allowing the generation of HTML from strings or iterables > of strings. This list is for proposing and discussing ideas for new syntax or functionality for the Python language, not for asking basic support questions. Are you proposing that Python gets a HTML wrapper? If so, it is up to you to do your research first, so that you know the answer to your question before you propose the idea. You should be able to tell us what options are available as language features or third-party libraries. If you don't know the answer, there are many places you can ask, starting with Google and other search engines: https://duckduckgo.com/?q=python+html+generator and others such as Reddit's /r/learnpython subreddit, Stackoverflow, the Python-List mailing list, the Python IRC channel, and more. https://mail.python.org/mailman/listinfo/python-list news:comp.lang.python https://www.reddit.com/r/learnpython/ https://www.python.org/community/irc/ If you still have a proposal after doing your research, we're happy to hear it. Regards, Steven From steve at pearwood.info Thu Mar 14 19:02:40 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 15 Mar 2019 10:02:40 +1100 Subject: [Python-ideas] Code version evolver In-Reply-To: <13143e69-2a4a-382b-ea59-cf26bdceb5d1@email.de> References: <20190311232501.GP12502@ando.pearwood.info> <13143e69-2a4a-382b-ea59-cf26bdceb5d1@email.de> Message-ID: <20190314230239.GD12502@ando.pearwood.info> On Thu, Mar 14, 2019 at 09:33:21PM +0100, francismb wrote: [...] > > Do you mean something like 2to3? 
Something which transforms source code > > written in Python? > > > Yes a source transformer, but to be applied to some 3.x version to move > it to the next 3.x+1, and so on ... (instead of '2to3' a kind of > 'nowtonext', aka 'python_next') > > Couldn't that relax the tension on doing 'backward compatibility > changes' a bit ? Perhaps, but probably not. The core-developers are already overworked, and don't have time to add all the features we want. Making them responsible for writing this source code transformer for every backwards incompatible change will increase the amount of work they do, not decrease it, and probably make backwards-incompatible changes even less popular. For example: version 3.8 will include a backwards incompatible change made to the statistics.mode function. Currently, mode() raises an exception if the data contains more than one "most frequent" value. Starting from 3.8, it will return the first such value found. If we had to write some sort of source code translator to deal with this change, I doubt that we could automate this. And if we could, writing that translator would probably be *much* more work than making the change itself. Besides, I think it was Paul who pointed this out, in practice we found that 2to3 wasn't as useful as people expected. It turns out that for most people, writing version-independent code that supports the older and newer versions of Python is usually simpler than keeping two versions and using a translator to move from one to the other. But if you feel that this feature may be useful, I encourage you to experiment with writing your own version and putting it on PyPI for others to use. If it is successful, then we could some day bring it into the standard library. -- Steven From turnbull.stephen.fw at u.tsukuba.ac.jp Thu Mar 14 23:54:56 2019 From: turnbull.stephen.fw at u.tsukuba.ac.jp (Stephen J. 
Turnbull) Date: Fri, 15 Mar 2019 12:54:56 +0900 Subject: [Python-ideas] Code version evolver In-Reply-To: <23357464-8b15-ff45-824e-08ad0970ab03@email.de> References: <23357464-8b15-ff45-824e-08ad0970ab03@email.de> Message-ID: <23691.8720.682485.692756@turnbull.sk.tsukuba.ac.jp> francismb writes:

> Trying to keep a single code base for 2/3 seems like a good idea
> (maybe the developer just cannot change to 3 fast due to how big the
> step was) but that also has the limitation on how far you can go
> using new features.

This doesn't work very well: you can't use 3 at all until you can use 3 everywhere. So the evolutionary path is

0. Python 2-only code base
1. Part Python 2-only, part Python 2/3 code base (no new features anywhere, since everything has to run in Python 2 -- may require new test harnesses for component testing under Python 3, and system test has to wait for step 2)
2. Complete Python 2/3 code base
3a. Users use their preferred Python and developers have 2/3 4ever! (All limitations apply in this case. :-( )
3b. Project moves to Python 3-only.

So what most applications did is branch 2 vs. 3, do almost all new development on 3 (bugfixing on both 2 and 3 of course, and maybe occasionally backporting a few new features to 2), and eventually (often as soon as there's a complete implementation for Python 3!) stop supporting Python 2. Only when there was strong demand for Step 3a (typically for popular libraries) did it make sense to spend effort satisfying the constraints of a 2/3 code base.

> Once you're just on the 3 series couldn't such a 2to3 concept also
> help to speed things up? (due to the 'backwards-compatibility' issue)

Not really. For example, addition of syntax like "async" and "yield" fundamentally changes the meaning of "def", in ways that *could not* be fully emulated in earlier Pythons. The semantics simply were impossible to produce -- that's why syntax extensions were necessary. 
What 2to3 does is to handle a lot of automatic conversions, such as flipping the identifiers from str to bytes and unicode to str. It was necessary to have some such tool because of the very large amount of such menial work needed to change a 2 code base to a 3 code base. But even so, there were things that 2to3 couldn't do, and it often exposed bugs or very poor practice (decode applied to unicode objects, encode applied to bytes) that had to be reworked by the developer anyway. The thing about "within 3" upgrades is that that kind of project-wide annoyance is going to be minimal, because the language is mostly growing in power, not changing the semantics of existing syntax. Such changes are very rare, and considered extremely carefully for implications for existing code. In a very few cases it's possible to warn about dangerous use of obsolete syntax whose meaning has changed, but that's very rare too. In some cases *pure additions* to the core will be available via "from __future__ import A", which covers many of the cases of "I wish I could use feature A in version X.Y". But this kind of thing is constrained by core developer time, and developing a 3.x to 3.y utility is (IMO, somebody else is welcome to prove me wrong! :-) way past the point of zero marginal returns to developer effort. It's an interesting idea, but I think practically it won't have the benefits you hope for, at least not enough to persuade core developers to work on it. Steve -- Associate Professor Division of Policy and Planning Science http://turnbull.sk.tsukuba.ac.jp/ Faculty of Systems and Information Email: turnbull at sk.tsukuba.ac.jp University of Tsukuba Tel: 029-853-5175 Tennodai 1-1-1, Tsukuba 305-8573 JAPAN From turnbull.stephen.fw at u.tsukuba.ac.jp Thu Mar 14 23:55:49 2019 From: turnbull.stephen.fw at u.tsukuba.ac.jp (Stephen J. Turnbull) Date: Fri, 15 Mar 2019 12:55:49 +0900 Subject: [Python-ideas] Problems (and solutions?) 
in writing decorators In-Reply-To: References: <20181024203512.GI3817@ando.pearwood.info> <5BD23DE1.1000909@canterbury.ac.nz> <20190312113010.GQ12502@ando.pearwood.info> Message-ID: <23691.8773.311344.738472@turnbull.sk.tsukuba.ac.jp> Sylvain MARIE via Python-ideas writes:

> I totally understand your point of view. However on the other hand,
> many very popular open source projects out there have the opposite
> point of view and provide decorators that can seamlessly be used
> with and without arguments (pytest, attrs, click, etc.). So after a
> while users get used to this behavior and expect it from all
> libraries. Making it easy to implement is therefore something quite
> important for developers not to spend time on this useless
> 'feature'.

That doesn't follow. You can also take it that "educating users to know the difference between a decorator and a decorator factory is therefore something quite important for developers not to spend time on this useless 'feature'." I'm not a fan of either position. I don't see why developers of libraries who want to provide this to their users shouldn't have "an easy way to do it", but I also don't see a good reason to encourage syntactic ambiguity by providing it in the standard library. I think this is a feature that belongs in the area of "you *could* do it, but *should* you?" If the answer is "maybe", IMO PyPI is the right solution for distribution. 
Steve -- Associate Professor Division of Policy and Planning Science http://turnbull.sk.tsukuba.ac.jp/ Faculty of Systems and Information Email: turnbull at sk.tsukuba.ac.jp University of Tsukuba Tel: 029-853-5175 Tennodai 1-1-1, Tsukuba 305-8573 JAPAN From greg.ewing at canterbury.ac.nz Fri Mar 15 00:33:24 2019 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 15 Mar 2019 17:33:24 +1300 Subject: [Python-ideas] Code version evolver In-Reply-To: <13143e69-2a4a-382b-ea59-cf26bdceb5d1@email.de> References: <20190311232501.GP12502@ando.pearwood.info> <13143e69-2a4a-382b-ea59-cf26bdceb5d1@email.de> Message-ID: <5C8B2B14.801@canterbury.ac.nz> francismb wrote: > Yes a source transformer, but to be applied to some 3.x version to move > it to the next 3.x+1, and so on ... (instead of '2to3' a kind of > 'nowtonext', aka 'python_next') > > Couldn't that relax the tension on doing 'backward compatibility > changes' a bit ? Not really. Having to translate all your source every time a minor version update occurs would be a huge hassle. -- Greg From pythonchb at gmail.com Fri Mar 15 01:20:22 2019 From: pythonchb at gmail.com (Christopher Barker) Date: Thu, 14 Mar 2019 22:20:22 -0700 Subject: [Python-ideas] HTML Wrapper In-Reply-To: <20190314223725.GB12502@ando.pearwood.info> References: <20190314223725.GB12502@ando.pearwood.info> Message-ID: This is very much the kind of thing that would belong in a library. There's probably more than one out there right now. In fact, way back when I started learning Python (almost 20 yrs ago!), there was such a lib -- I think it was called HTMLgen. However, since then, most people have decided that templating is the way to accomplish this -- write the html with bits of code in it, and have it generate the final html. There are many template engines for/with Python. 
Having said that, I actually use an OO html generator as an assignment in my training: https://uwpce-pythoncert.github.io/PythonCertDevel/exercises/html_renderer.html

So you can write it yourself... Using the approach in that assignment, you would write your example as:

    selector = Selector(_class="eggs")
    for i in range(3):
        selector.append(Option(f"Option {i}"))
    selector.render()

-CHB

On Thu, Mar 14, 2019 at 3:43 PM Steven D'Aprano wrote: > Hi Vlad, and welcome! > > On Thu, Mar 14, 2019 at 10:00:03PM +0100, Vlad Tudorache wrote: > > Hello, > > > > I'd like to know if there is a basic HTML wrapper for Python, like > > TextWrapper but allowing the generation of HTML from strings or iterables > > of strings. > > This list is for proposing and discussing ideas for new syntax or > functionality for the Python language, not for asking basic support > questions. > > Are you proposing that Python gets a HTML wrapper? > > If so, it is up to you to do your research first, so that you know the > answer to your question before you propose the idea. You should be able > to tell us what options are available as language features or > third-party libraries. > > If you don't know the answer, there are many places you can ask, > starting with Google and other search engines: > > https://duckduckgo.com/?q=python+html+generator > > and others such as Reddit's /r/learnpython subreddit, Stackoverflow, the > Python-List mailing list, the Python IRC channel, and more. > > https://mail.python.org/mailman/listinfo/python-list > news:comp.lang.python > https://www.reddit.com/r/learnpython/ > https://www.python.org/community/irc/ > > If you still have a proposal after doing your research, we're happy to > hear it. 
> > > Regards, > > > > Steven > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- Christopher Barker, PhD Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython -------------- next part -------------- An HTML attachment was scrubbed... URL: From tudorache.vlad at gmail.com Fri Mar 15 02:59:05 2019 From: tudorache.vlad at gmail.com (Vlad Tudorache) Date: Fri, 15 Mar 2019 07:59:05 +0100 Subject: [Python-ideas] HTML Wrapper In-Reply-To: <20190314223725.GB12502@ando.pearwood.info> References: <20190314223725.GB12502@ando.pearwood.info> Message-ID: Hello, Steven, This wasn't a question asking for support. The answers I found when searching were different from what I needed, that's why I'm using my own. But I understand the point. Regards, Vlad Le jeu. 14 mars 2019 à 23:43, Steven D'Aprano a écrit : > Hi Vlad, and welcome! > > On Thu, Mar 14, 2019 at 10:00:03PM +0100, Vlad Tudorache wrote: > > Hello, > > > > I'd like to know if there is a basic HTML wrapper for Python, like > > TextWrapper but allowing the generation of HTML from strings or iterables > > of strings. > > This list is for proposing and discussing ideas for new syntax or > functionality for the Python language, not for asking basic support > questions. > > Are you proposing that Python gets a HTML wrapper? > > If so, it is up to you to do your research first, so that you know the > answer to your question before you propose the idea. You should be able > to tell us what options are available as language features or > third-party libraries. 
> > If you don't know the answer, there are many places you can ask, > starting with Google and other search engines: > > https://duckduckgo.com/?q=python+html+generator > > and others such as Reddit's /r/learnpython subreddit, Stackoverflow, the > Python-List mailing list, the Python IRC channel, and more. > > https://mail.python.org/mailman/listinfo/python-list > news:comp.lang.python > https://www.reddit.com/r/learnpython/ > https://www.python.org/community/irc/ > > If you still have a proposal after doing your research, we're happy to > hear it. > > > Regards, > > > > Steven > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Fri Mar 15 07:20:21 2019 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 15 Mar 2019 12:20:21 +0100 Subject: [Python-ideas] PEP: Dict addition and subtraction References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <98F8539E-D119-4299-B289-1EDA729FF10B@gmail.com> <20190305231453.GF4465@ando.pearwood.info> Message-ID: <20190315122021.16e8cca8@fsol> On Wed, 6 Mar 2019 00:46:57 +0000 Josh Rosenberg wrote: > > Overloading + lacks the clear descriptive aspect of update that describes > the goal of the operation, and contradicts conventions (in Python and > elsewhere) about how + works (addition or concatenation, and a lot of > people don't even like it doing the latter, though I'm not that pedantic). 
> > A couple "rules" from C++ on overloading are "*Whenever the meaning of an > operator is not obviously clear and undisputed, it should not be > overloaded.* *Instead, provide a function with a well-chosen name.*" > and "*Always > stick to the operator's well-known semantics".* (Source: > https://stackoverflow.com/a/4421708/364696 , though the principle is > restated in many other places). Agreed with this. What is so useful exactly in this new dict operator that it hasn't been implemented, say, 20 years ago? I rarely find myself merging dicts and, when I do, calling dict.update() is entirely acceptable (I think the "{**d}" notation was already a mistake, making a perfectly readable operation more cryptic simply for the sake of saving a few keystrokes). Built-in operations should be added with regard to actual user needs (such as: a first-class notation for matrix multiplication, making formulas easier to read and understand), not a mere "hmm this might sometimes be useful". Besides, if I have two dicts with e.g. lists as values, I *really* dislike the fact that the + operator will clobber the values rather than concatenate them. It's a recipe for confusion. Regards Antoine. From solipsis at pitrou.net Fri Mar 15 07:21:23 2019 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 15 Mar 2019 12:21:23 +0100 Subject: [Python-ideas] PEP: Dict addition and subtraction References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> Message-ID: <20190315122123.4684f0d2@fsol> On Mon, 4 Mar 2019 16:02:06 +0100 Stefan Behnel wrote: > INADA Naoki schrieb am 04.03.19 um 11:15: > > Why statement is not enough? > > I'm not sure I understand why you're asking this, but a statement is "not > enough" because it's a statement and not an expression. This is an argument for Perl 6, not for Python. Regards Antoine. 
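Antoine's clobbering point is easy to verify with the merge spellings that exist today: in both, the right-hand value wins wholesale, and list values are replaced rather than concatenated:

```python
d1 = {"a": [1, 2], "b": [3]}
d2 = {"a": [4]}

merged = {**d1, **d2}     # unpacking form
updated = d1.copy()
updated.update(d2)        # copy/update form

# In both spellings the right-hand value replaces the left-hand one;
# the lists are not concatenated.
print(merged)   # → {'a': [4], 'b': [3]}
print(updated)  # → {'a': [4], 'b': [3]}
```

Presumably a `+` operator would follow the same last-writer-wins rule, which is exactly the behaviour being objected to.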
From solipsis at pitrou.net Fri Mar 15 07:25:22 2019 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 15 Mar 2019 12:25:22 +0100 Subject: [Python-ideas] PEP: Dict addition and subtraction References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> <01ce6a83-534c-2e27-acc9-f3be4601b559@kynesim.co.uk> Message-ID: <20190315122522.2756381f@fsol> On Mon, 4 Mar 2019 15:57:38 -0800 Guido van Rossum wrote: > > > Those two points make me uncomfortable with "+=" strictly behaving > > like ".update()". > > And yet that's how it works for lists. (Note that dict.update() still has > capabilities beyond +=, since you can also invoke it with keyword args.) Yeah, well.... I do think "+=" for lists was a mistake. I *still* have trouble remembering the exact difference between "list +=" and "list.extend" (yes, there is one: one accepts more types than the other... which one it is, and why, I never remember; and, of course, there might be the obscure performance difference because of CPython's execution details). I should not have to remember whether I want to use "list +=" or "list.extend" every time I need to extend a list. There is a virtue to """There should be one-- and preferably only one --obvious way to do it""" and we shouldn't break it more than we already did. Regards Antoine. 
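The difference Antoine has trouble remembering can be pinned down with a quick check: augmented assignment (like `extend()`) accepts any iterable, while binary `+` insists on another list:

```python
l = [1]
l += (2, 3)        # augmented assignment accepts any iterable, like extend()
l.extend("ab")     # extend() also accepts any iterable

print(l)  # → [1, 2, 3, 'a', 'b']

try:
    l = l + (4,)   # binary + is stricter: a list only concatenates with a list
    concatenated = True
except TypeError:
    concatenated = False

print(concatenated)  # → False
```

So it is binary `+` that is the odd one out; `l += iterable` and `l.extend(iterable)` behave alike for any iterable right-hand side.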
From solipsis at pitrou.net Fri Mar 15 07:34:45 2019 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 15 Mar 2019 12:34:45 +0100 Subject: [Python-ideas] PEP: Dict addition and subtraction References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> <20190305161100.GE4465@ando.pearwood.info> <5C7EFAE6.8000105@canterbury.ac.nz> <8812767a-1010-0923-36d4-22af4d2665ec@brice.xyz> <5C804A08.9060902@canterbury.ac.nz> Message-ID: <20190315123445.22d1284b@fsol> On Thu, 7 Mar 2019 10:58:02 +1100 Chris Angelico wrote: > > Lots of words that basically say: Stuff wouldn't be perfectly pure. Chris, please learn to think twice before contributing what is essentially a trivialization of someone else's arguments. You're not doing anything useful here, and are just sounding like an asshole who wants to shut people up. Regards Antoine. From steve at pearwood.info Fri Mar 15 10:41:59 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 16 Mar 2019 01:41:59 +1100 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <20190315122021.16e8cca8@fsol> References: <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <98F8539E-D119-4299-B289-1EDA729FF10B@gmail.com> <20190305231453.GF4465@ando.pearwood.info> <20190315122021.16e8cca8@fsol> Message-ID: <20190315144158.GF12502@ando.pearwood.info> On Fri, Mar 15, 2019 at 12:20:21PM +0100, Antoine Pitrou wrote: > Agreed with this. What is so useful exactly in this new dict operator > that it hasn't been implemented, say, 20 years ago? One could say the same thing about every new feature. Since Python 1.5 was so perfect, why add Unicode, decorators, matrix multiplication, async, descriptors, Decimal, iterators, ... 
Matrix multiplication is a perfect example: adding the @ operator could have been done in Python 0.1 if anyone had thought of it, but it took 15 years of numerical folk "whinging" about the lack until it happened: https://mail.python.org/pipermail/python-ideas/2014-March/027053.html In some ways, it is often easier to get community buy-in for *big* changes, provided they are backwards compatible. With a big change, people often either want it, or don't care one way or another. (Sometimes because the big change is too big or complicated or difficult for them to understand -- I feel that way about async. Some day I'll deal with it, but right now it's so far down my list of priorities that I have no opinion on anything to do with async.) But *little* changes are easy enough for everyone to understand, and so they trigger the impulse to bike-shed. Everyone has an opinion on whether or not dicts should support an update operator, and whether to spell it + or | or <- or << or something else. Or the infamous := operator, which ultimately is a useful but minor syntactic and semantic change but generated a huge amount of debate, argument and negativity. A far smaller change to the language than adding type hinting, but it generated far more argument. I still remember being told in no uncertain terms by the core devs that adding a clear() method to lists was a waste of time because there was already a perfectly good way to spell it with slicing. And then ABCs came along and now lists have a clear method. So opinions change too. Things happen when they happen, because if they had happened earlier we wouldn't still be arguing about them. > I rarely find > myself merging dicts and, when I do, calling dict.update() is entirely > acceptable The code we write is shaped by the operators and methods that exist. 
You use dict.update() because *it exists* so when you want a new dict
merged with another, you write the code that is possible today:

    new = spam.copy()
    new.update(eggs)
    process(new)

and you are content because you "rarely find myself merging dicts".

But perhaps those who *frequently* merge dicts have a different opinion,
and would prefer to write one line rather than three and avoid naming
something that doesn't need a name:

    process(spam + eggs)  # or spam|eggs if you prefer

> (I think the "{**d}" notation was already a mistake, making
> a perfectly readable operation more cryptic simply for the sake of
> saving a few keystrokes).

I don't know if it was a mistake, but dissatisfaction with its lack of
readability and discoverability is one of the motivations of this PEP.

[...]

> Besides, if I have two dicts with e.g. lists as values, I *really*
> dislike the fact that the + operator will clobber the values rather than
> concatenate them. It's a recipe for confusion.

Are you confused that the update method clobbers list values rather than
concatenate them? I doubt that you are. So why would it be confusing to
say that + does a copy-and-update?

(In any case, popular opinion may be shifting towards preferring the |
operator over + so perhaps confusion over concatenation may not be an
issue in the future.)
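For what it's worth, the proposed copy-and-update semantics can be sketched today as a dict subclass. (This is just an illustration, not the implementation the PEP will propose; the class name is made up.)

```python
# Hypothetical sketch: copy-and-update semantics for "+", written as a
# dict subclass so it can be tried on current Python.
class MergeDict(dict):
    def __add__(self, other):
        if not isinstance(other, dict):
            return NotImplemented
        new = MergeDict(self)  # copy the left operand...
        new.update(other)      # ...then let the right operand win on duplicates
        return new

spam = MergeDict(breakfast="spam", drink="tea")
eggs = MergeDict(drink="coffee", side="eggs")
print(spam + eggs)  # {'breakfast': 'spam', 'drink': 'coffee', 'side': 'eggs'}
print(spam)         # unchanged: {'breakfast': 'spam', 'drink': 'tea'}
```

Note that neither operand is mutated; the one-liner replaces the three-line copy/update/process dance above.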
-- Steven From andre.roberge at gmail.com Fri Mar 15 10:54:51 2019 From: andre.roberge at gmail.com (Andre Roberge) Date: Fri, 15 Mar 2019 11:54:51 -0300 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <20190315144158.GF12502@ando.pearwood.info> References: <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <98F8539E-D119-4299-B289-1EDA729FF10B@gmail.com> <20190305231453.GF4465@ando.pearwood.info> <20190315122021.16e8cca8@fsol> <20190315144158.GF12502@ando.pearwood.info> Message-ID: On Fri, Mar 15, 2019 at 11:42 AM Steven D'Aprano wrote: > [snip] > > I still remember being told in no uncertain terms by the core devs that > adding a clear() method to lists was a waste of time because there was > already a perfectly good way to spell it with slicing. And then ABCs > came along and now lists have a clear method. So opinions change too. > > I agree with the opinions expressed in the (partially) quoted message but I don't think that this is how this particular change happened. https://mail.python.org/pipermail/python-ideas/2009-April/003897.html ;-) ;-) Andr? Roberge -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Fri Mar 15 10:59:07 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 16 Mar 2019 01:59:07 +1100 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <20190315122522.2756381f@fsol> References: <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> <01ce6a83-534c-2e27-acc9-f3be4601b559@kynesim.co.uk> <20190315122522.2756381f@fsol> Message-ID: <20190315145907.GG12502@ando.pearwood.info> On Fri, Mar 15, 2019 at 12:25:22PM +0100, Antoine Pitrou wrote: > Yeah, well.... I do think "+=" for lists was a mistake. I *still* have > trouble remembering the exact difference between "list +=" and > "list.extend" (yes, there is one: one accepts more types than the > other... 
> which one it is, and why, I never remember;

Both accept arbitrary iterables, and the documentation suggests that they
are the same:

https://docs.python.org/3/library/stdtypes.html#mutable-sequence-types

Perhaps you are thinking of the difference between list + list versus
list += iterable?

[...]

> There is a virtue to
>
> """There should be one-- and preferably only one --obvious way to do
> it"""

"It" here refers to two different things:

"I want to update a dict in place": The Obvious Way is to use the update
method; the fact that += works as well is just a side-effect of the way
augmented assignments are defined.

"I want a new dict that merges two existing dicts": The Obvious Way is to
use the merge operator (possibly spelled + but that's not written in
stone yet).

-- Steven

From steve at pearwood.info Fri Mar 15 11:21:48 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 16 Mar 2019 02:21:48 +1100 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190302035224.GO4465@ando.pearwood.info> <98F8539E-D119-4299-B289-1EDA729FF10B@gmail.com> <20190305231453.GF4465@ando.pearwood.info> <20190315122021.16e8cca8@fsol> <20190315144158.GF12502@ando.pearwood.info> Message-ID: <20190315152147.GH12502@ando.pearwood.info> On Fri, Mar 15, 2019 at 11:54:51AM -0300, Andre Roberge wrote: > On Fri, Mar 15, 2019 at 11:42 AM Steven D'Aprano > wrote: > > > [snip] > > > > I still remember being told in no uncertain terms by the core devs that > > adding a clear() method to lists was a waste of time because there was > > already a perfectly good way to spell it with slicing. And then ABCs > > came along and now lists have a clear method. So opinions change too. > > > > I agree with the opinions expressed in the (partially) quoted message > but I don't think that this is how this particular change happened.
> > https://mail.python.org/pipermail/python-ideas/2009-April/003897.html You proposed that in April 2009, but there was nothing added to the bug tracker for 18 months until it was finally added by Terry Reedy in November 2010, based on discussion in a completely different thread (one about sets!): https://mail.python.org/pipermail/python-ideas/2010-November/008722.html Contrariwise, a few years earlier the same request had been roundly dismissed by core-devs and Python luminaries as "inane", "redundant" and "trivial". https://mail.python.org/pipermail/python-list/2006-April/356236.html People can change their mind -- something that is dismissed one year may be accepted some years later on. -- Steven From steve at pearwood.info Fri Mar 15 12:44:02 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 16 Mar 2019 03:44:02 +1100 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <20190315123445.22d1284b@fsol> References: <20190305161100.GE4465@ando.pearwood.info> <5C7EFAE6.8000105@canterbury.ac.nz> <8812767a-1010-0923-36d4-22af4d2665ec@brice.xyz> <5C804A08.9060902@canterbury.ac.nz> <20190315123445.22d1284b@fsol> Message-ID: <20190315164401.GB29550@ando.pearwood.info> On Fri, Mar 15, 2019 at 12:34:45PM +0100, Antoine Pitrou wrote: > On Thu, 7 Mar 2019 10:58:02 +1100 > Chris Angelico wrote: > > > > Lots of words that basically say: Stuff wouldn't be perfectly pure. > > Chris, please learn to think twice before contributing what is > essentially a trivialization of someone else's arguments. You're not > doing anything useful here, and are just sounding like an asshole who > wants to shut people up. I don't think you are being fair here, and I'd rather avoid getting into unhelpful arguments about tone and whether Chris is "trivializing" (a pejorative term) or "simplifying" (a more neutral term) Josh's position. But if you feel that Chris (and I) have missed parts of Josh's argument, then by all means point out what we missed.
Josh, the same applies to you: I do want to give your objections a fair hearing in the updated PEP, so if you think I've missed something, please point it out. In context, I think Chris' response was valid: he was responding to a post by Josh whose entire argument was that using + for dict merging is an abuse of the + symbol because it isn't like numeric addition. If there is more to Josh's argument, can you point out to me what I have missed please? That's a genuine request, not a rhetorical question. Here's Josh's argument: https://mail.python.org/pipermail/python-ideas/2019-March/055733.html and for context, here is Chris' dismissal of Josh's argument: https://mail.python.org/pipermail/python-ideas/2019-March/055734.html and his explanation of why he is dismissing it. Chris is well within his right to dismiss an argument that doesn't impress him, which he did by summarizing it as "Stuff wouldn't be perfectly pure". (Pure in the sense of being like numeric addition.) I think that's pretty much an accurate summary: Josh apparently doesn't like using + for anything that isn't purely like + for real numbers. He calls using + for concatenation a "minor abuse" of the operator and argues that it would be bad for dict merging to use + because merging has different properties to numeric addition. (He has also criticised the use of + for concatenation in at least one other post.) He even gives qualified support for a dict merge operator: "there's nothing wrong with making dict merges easier" but just doesn't like the choice of + as the operator. He's entitled to his opinion, and Chris is entitled to dismiss it. (Aside: your email appears to have broken threading. I'm not sure why, your other emails seem to be threaded okay.)
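To make the disputed property concrete: the order-dependence Josh objects to is easy to demonstrate with the unpacking syntax that already exists (the example dicts here are made up):

```python
# Merging (spelled {**a, **b} today) is not commutative: the right-hand
# operand wins whenever a key appears in both dicts.
d1 = {"colour": "red", "size": 10}
d2 = {"colour": "blue", "shape": "square"}

print({**d1, **d2})  # {'colour': 'blue', 'size': 10, 'shape': 'square'}
print({**d2, **d1})  # {'colour': 'red', 'shape': 'square', 'size': 10}
```

Of course, + on strings and lists is not commutative either, which is essentially Chris's point.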
-- Steven From tjreedy at udel.edu Fri Mar 15 13:34:05 2019 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 15 Mar 2019 13:34:05 -0400 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <20190315152147.GH12502@ando.pearwood.info> References: <20190302035224.GO4465@ando.pearwood.info> <98F8539E-D119-4299-B289-1EDA729FF10B@gmail.com> <20190305231453.GF4465@ando.pearwood.info> <20190315122021.16e8cca8@fsol> <20190315144158.GF12502@ando.pearwood.info> <20190315152147.GH12502@ando.pearwood.info> Message-ID: On 3/15/2019 11:21 AM, Steven D'Aprano wrote: > On Fri, Mar 15, 2019 at 11:54:51AM -0300, Andre Roberge wrote: >> On Fri, Mar 15, 2019 at 11:42 AM Steven D'Aprano >> wrote: >> >>> [snip] >>> >>> I still remember being told in no uncertain terms by the core devs that >>> adding a clear() method to lists was a waste of time because there was >>> already a perfectly good way to spell it with slicing. And then ABCs >>> came along and now lists have a clear method. So opinions change too. >>> >>> I agree with the opinions expressed in the (partially) quoted message >> but I don't think that this is how this particular change happened. >> >> https://mail.python.org/pipermail/python-ideas/2009-April/003897.html > > You proposed that in April 2009, but there was nothing added to the bug > tracker for 18 months until it was finally added by Terry Reedy in Actually, I opened the tracker issue with a succinct message, after the discussion and Guido's approval changed my mind. https://bugs.python.org/issue10516 However, Eli Bendersky wrote the patch with help from others and then merged it. 
> November 2010, based on discussion in a completely different thread (one > about sets!): > > https://mail.python.org/pipermail/python-ideas/2010-November/008722.html -- Terry Jan Reedy From brett at python.org Fri Mar 15 13:34:45 2019 From: brett at python.org (Brett Cannon) Date: Fri, 15 Mar 2019 10:34:45 -0700 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <20190315123445.22d1284b@fsol> References: <20190301162645.GM4465@ando.pearwood.info> <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> <20190305161100.GE4465@ando.pearwood.info> <5C7EFAE6.8000105@canterbury.ac.nz> <8812767a-1010-0923-36d4-22af4d2665ec@brice.xyz> <5C804A08.9060902@canterbury.ac.nz> <20190315123445.22d1284b@fsol> Message-ID: On Fri, Mar 15, 2019 at 4:36 AM Antoine Pitrou wrote: > On Thu, 7 Mar 2019 10:58:02 +1100 > Chris Angelico wrote: > > > > Lots of words that basically say: Stuff wouldn't be perfectly pure. > > Chris, please learn to think twice before contributing what is > essentially a trivialization of someone else's arguments. You're not > doing anything useful here, and are just sounding like an asshole who > wants to shut people up. > Watch the tone please. -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Fri Mar 15 13:51:11 2019 From: guido at python.org (Guido van Rossum) Date: Fri, 15 Mar 2019 10:51:11 -0700 Subject: [Python-ideas] Why operators are useful Message-ID: There's been a lot of discussion about an operator to merge two dicts. I participated in the beginning but quickly felt overwhelmed by the endless repetition, so I muted most of the threads. But I have been thinking about the reason (some) people like operators, and a discussion I had with my mentor Lambert Meertens over 30 years ago came to mind. For mathematicians, operators are essential to how they think. 
Take a simple operation like adding two numbers, and try exploring some of its behavior. add(x, y) == add(y, x) (1) Equation (1) expresses the law that addition is commutative. It's usually written using an operator, which makes it more concise: x + y == y + x (1a) That feels like a minor gain. Now consider the associative law: add(x, add(y, z)) == add(add(x, y), z) (2) Equation (2) can be rewritten using operators: x + (y + z) == (x + y) + z (2a) This is much less confusing than (2), and leads to the observation that the parentheses are redundant, so now we can write x + y + z (3) without ambiguity (it doesn't matter whether the + operator binds tighter to the left or to the right). Many other laws are also written more easily using operators. Here's one more example, about the identity element of addition: add(x, 0) == add(0, x) == x (4) compare to x + 0 == 0 + x == x (4a) The general idea here is that once you've learned this simple notation, equations written using them are easier to *manipulate* than equations written using functional notation -- it is as if our brains grasp the operators using different brain machinery, and this is more efficient. I think that the fact that formulas written using operators are more easily processed *visually* has something to do with it: they engage the brain's visual processing machinery, which operates largely subconsciously, and tells the conscious part what it sees (e.g. "chair" rather than "pieces of wood joined together"). The functional notation must take a different path through our brain, which is less subconscious (it's related to reading and understanding what you read, which is learned/trained at a much later age than visual processing). The power of visual processing really becomes apparent when you combine multiple operators. 
For example, consider the distributive law: mul(n, add(x, y)) == add(mul(n, x), mul(n, y)) (5) That was painful to write, and I believe that at first you won't see the pattern (or at least you wouldn't have immediately seen it if I hadn't mentioned this was the distributive law). Compare to: n * (x + y) == n * x + n * y (5a) Notice how this also uses relative operator priorities. Often mathematicians write this even more compact: n(x+y) == nx + ny (5b) but alas, that currently goes beyond the capacities of Python's parser. Another very powerful aspect of operator notation is that it is convenient to apply them to objects of different types. For example, laws (1) through (5) also work when n, x, y and z are same-size vectors (substituting a vector of zeros for the literal "0"), and also if x, y and z are matrices (note that n has to be a scalar). And you can do this with objects in many different domains. For example, the above laws (1) through (5) apply to functions too (n being a scalar again). By choosing the operators wisely, mathematicians can employ their visual brain to help them do math better: they'll discover new interesting laws sooner because sometimes the symbols on the blackboard just jump at you and suggest a path to an elusive proof. Now, programming isn't exactly the same activity as math, but we all know that Readability Counts, and this is where operator overloading in Python comes in. Once you've internalized the simple properties which operators tend to have, using + for string or list concatenation becomes more readable than a pure OO notation, and (2) and (3) above explain (in part) why that is. Of course, it's definitely possible to overdo this -- then you get Perl. 
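To make the cross-domain point concrete in Python terms, here is a toy vector type (purely illustrative; the class is made up for this example) for which the laws above read exactly as they do for numbers:

```python
# A toy 2-D vector: "+" is vector addition, "*" is scalar multiplication.
# The algebraic laws (1a), (2a), (4a) and (5a) then read the same as for
# numbers, which is the point about operator notation generalizing.
class Vec:
    def __init__(self, a, b):
        self.a, self.b = a, b

    def __add__(self, other):
        return Vec(self.a + other.a, self.b + other.b)

    def __rmul__(self, n):  # n * v, with the scalar on the left
        return Vec(n * self.a, n * self.b)

    def __eq__(self, other):
        return (self.a, self.b) == (other.a, other.b)

x, y, z = Vec(1, 2), Vec(3, 4), Vec(5, 6)
n, zero = 7, Vec(0, 0)

assert x + y == y + x                  # (1a) commutativity
assert x + (y + z) == (x + y) + z      # (2a) associativity
assert x + zero == zero + x == x       # (4a) identity element
assert n * (x + y) == n * x + n * y    # (5a) distributive law
```

The same four assertions pass unchanged if you replace the vectors with plain ints, which is exactly the visual economy being described.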
But I think that the folks who point out "there is already a way to do this" are missing the point that it really is easier to grasp the meaning of this:

    d = d1 + d2

compared to this:

    d = d1.copy()
    d.update(d2)

and it is not just a matter of fewer lines of code: the first form allows us to use our visual processing to help us see the meaning quicker -- and without distracting other parts of our brain (which might already be occupied by keeping track of the meaning of d1 and d2, for example). Of course, everything comes at a price. You have to learn the operators, and you have to learn their properties when applied to different object types. (This is true in math too -- for numbers, x*y == y*x, but this property does not apply to functions or matrices; OTOH x+y == y+x applies to all, as does the associative law.) "But what about performance?" I hear you ask. Good question. IMO, readability comes first, performance second. And in the basic example (d = d1 + d2) there is no performance loss compared to the two-line version using update, and a clear win in readability. I can think of many situations where performance difference is irrelevant but readability is of utmost importance, and for me this is the default assumption (even at Dropbox -- our most performance critical code has already been rewritten in ugly Python or in Go). For the few cases where performance concerns are paramount, it's easy to transform the operator version to something else -- *once you've confirmed it's needed* (probably by profiling). -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed...
URL: From steve at pearwood.info Fri Mar 15 14:14:28 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 16 Mar 2019 05:14:28 +1100 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190305161100.GE4465@ando.pearwood.info> <5C7EFAE6.8000105@canterbury.ac.nz> <8812767a-1010-0923-36d4-22af4d2665ec@brice.xyz> <5C804A08.9060902@canterbury.ac.nz> <20190315123445.22d1284b@fsol> Message-ID: <20190315181428.GI12502@ando.pearwood.info> On Fri, Mar 15, 2019 at 10:34:45AM -0700, Brett Cannon wrote: > Watch the tone please. Brett, you might have missed my comment about wanting to avoid unhelpful arguments about tone, but if you are going to complain about people's tone, the considerate thing to do is to say what it is that you're objecting to. Otherwise we're left guessing as to what it is and whether or not you are making an implied threat to apply the CoC. I responded to Antoine's post earlier, but thought that it was a respectful disagreement. Do you think that's not the case? -- Steven From brett at python.org Fri Mar 15 14:31:14 2019 From: brett at python.org (Brett Cannon) Date: Fri, 15 Mar 2019 11:31:14 -0700 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <20190315181428.GI12502@ando.pearwood.info> References: <20190305161100.GE4465@ando.pearwood.info> <5C7EFAE6.8000105@canterbury.ac.nz> <8812767a-1010-0923-36d4-22af4d2665ec@brice.xyz> <5C804A08.9060902@canterbury.ac.nz> <20190315123445.22d1284b@fsol> <20190315181428.GI12502@ando.pearwood.info> Message-ID: On Fri, Mar 15, 2019 at 11:15 AM Steven D'Aprano wrote: > On Fri, Mar 15, 2019 at 10:34:45AM -0700, Brett Cannon wrote: > > > Watch the tone please. > > Brett, you might have missed my comment about wanting to avoid unhelpful > arguments about tone, but if you are going to complain about people's > tone, the considerate thing to do is to say what it is that you're > objecting to. 
> The phrasing of "just sounding like an asshole who wants to shut people up" is unnecessary. > > Otherwise we're left guessing as to what it is and whether or not you > are making an implied threat to apply the CoC. > No implied "threat". If it was an official warning then I would have said so. > > I responded to Antoine's post earlier, but thought that it was a > respectful disagreement. Do you think that's not the case? > I think it skirts the edge of being disrespectful, hence the request to please be aware of how one comes across. -------------- next part -------------- An HTML attachment was scrubbed... URL: From francismb at email.de Fri Mar 15 14:42:55 2019 From: francismb at email.de (francismb) Date: Fri, 15 Mar 2019 19:42:55 +0100 Subject: [Python-ideas] Code version evolver In-Reply-To: References: <20190311232501.GP12502@ando.pearwood.info> <13143e69-2a4a-382b-ea59-cf26bdceb5d1@email.de> Message-ID: <1ece0df9-120a-21fd-e319-834505b7ba72@email.de> On 3/14/19 9:47 PM, Chris Angelico wrote: > What happens when someone wants to support multiple Python versions? > "Requires Python 3.5 or newer" is easy. Forcing people to install the > correct one for each version isn't. What are the reasons why people want to support multiple Python versions in the 3 series? Do they really want to, or do they need to (maybe)? And for how many versions, "from 3.5 or newer" ... forever? Will that be reasonably possible? IMHO, the more versions to support, the harder the support. Regards, --francis From francismb at email.de Fri Mar 15 14:51:05 2019 From: francismb at email.de (francismb) Date: Fri, 15 Mar 2019 19:51:05 +0100 Subject: [Python-ideas] Code version evolver In-Reply-To: <5C8B2B14.801@canterbury.ac.nz> References: <20190311232501.GP12502@ando.pearwood.info> <13143e69-2a4a-382b-ea59-cf26bdceb5d1@email.de> <5C8B2B14.801@canterbury.ac.nz> Message-ID: <65e0e5fa-83fe-d040-8bae-6936d24ef569@email.de> Hi Greg, On 3/15/19 5:33 AM, Greg Ewing wrote: > Not really.
> Having to translate all your source every time a minor version update occurs would be a huge hassle. PythonUAAS: upload the sources (zipped or packaged) and get them updated back ;-) Regards, --francis From raymond.hettinger at gmail.com Fri Mar 15 14:54:03 2019 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Fri, 15 Mar 2019 11:54:03 -0700 Subject: [Python-ideas] Why operators are useful In-Reply-To: References: Message-ID: <723509C3-6FC3-440D-A67E-0E4B9A582DBA@gmail.com> > On Mar 15, 2019, at 10:51 AM, Guido van Rossum wrote: > > The general idea here is that once you've learned this simple notation, equations written using them are easier to *manipulate* than equations written using functional notation -- it is as if our brains grasp the operators using different brain machinery, and this is more efficient. There is no question that sometimes operators can be easier to manipulate and reason about than equivalent methods. The use of "+" and "*" is a major win for numeric and sequence types. There is also no question that sometimes method names are better than operators (otherwise, we wouldn't use method names at all). APL is an extreme example of a rich set of operators being both powerful and opaque. So, we have to ask whether we're stretching too far from "operators are good" to "we need this operator". Here are some considerations: Frequency of usage: Math provides ∑ and ∏ because they are common. It doesn't provide a special operator for sqrt(c**2 - b**2) because the latter is less fundamental and less common. To me, f=d.copy() followed by f.update(e) arises so rarely that an operator isn't warranted. The existing code is already concise, clear, and rare. Familiarity: We know about + because we use it a lot in addition and concatenation contexts. However, a symbol like ⊕ is more opaque unless we're using it every day for a particular purpose. To me, the "+" operator implies "add/extend" semantics rather than "replace" semantics.
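The add/extend versus replace contrast can be seen directly with today's Python (illustrative values only):

```python
# "+" on sequences extends: nothing already present is lost.
log = ["line 1"] + ["line 2"]
print(log)  # ['line 1', 'line 2']

# dict.update() replaces: the old value for a duplicate key is discarded.
settings = {"retries": 3, "timeout": 10}
settings.update({"timeout": 60})
print(settings)  # {'retries': 3, 'timeout': 60}
```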
Successive applications of "+" are never idempotent unless one operand is an identity element. So for me, "+" isn't familiar for dict merges. Loosely put, it isn't "plus-like". I think this is why so many other languages decided not to use "+" for dict merges even when that would have been a trivially easy implementation choice. Obviousness: When working with "+" on numeric types, it is obvious it should be commutative. When using "+" with sequence types, it is obvious that concatenation is non-commutative. When using "+" for mapping types, it is not obvious that it isn't commutative. Likewise, it isn't obvious that "+" is a destructive operation for mappings (consider that adding to a log file never destroys existing log entries, while updating a dict will overwrite existing values). Harmony: The operators on dict views use "|" but regular dicts would use "+". That doesn't seem harmonious. Impact: When a class in the standard library adds a method or operator, the reverberations are felt only locally. In contrast, the dict API is fundamental. Changing it will reverberate for years. It will be felt in the ABCs, typeshed, and every mapping-like object. IMO such an impactful change should only be made if it adds significant new functionality rather than providing a slightly shorter spelling of something we already have. Raymond From remi.lapeyre at henki.fr Fri Mar 15 14:57:38 2019 From: remi.lapeyre at henki.fr (=?UTF-8?Q?R=C3=A9mi_Lapeyre?=) Date: Fri, 15 Mar 2019 14:57:38 -0400 Subject: [Python-ideas] Code version evolver In-Reply-To: <1ece0df9-120a-21fd-e319-834505b7ba72@email.de> References: <20190311232501.GP12502@ando.pearwood.info> <13143e69-2a4a-382b-ea59-cf26bdceb5d1@email.de> <1ece0df9-120a-21fd-e319-834505b7ba72@email.de> Message-ID: Le 15 mars 2019 à 19:44:15, francismb (francismb at email.de(mailto:francismb at email.de)) a écrit: > > > On 3/14/19 9:47 PM, Chris Angelico wrote: > > What happens when someone wants to support multiple Python versions?
> "Requires Python 3.5 or newer" is easy. Forcing people to install the > correct one for each version isn't. > What are the reasons why people want to support multiple Python > versions, on the 3 series? do they really want? or they need to (may > be)? and for how many versions, "from 3.5 or newer" ... forever? will be > reasonable possible? IMHO more versions to support, the harder to support. I think it's pretty much a requirement for any respectable library: when a library drops support for a Python version, its usefulness drops significantly. > Regards, > --francis > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From jfine2358 at gmail.com Fri Mar 15 15:05:29 2019 From: jfine2358 at gmail.com (Jonathan Fine) Date: Fri, 15 Mar 2019 19:05:29 +0000 Subject: [Python-ideas] Why operators are useful In-Reply-To: References: Message-ID: Guido wrote: > There's been a lot of discussion about an operator to merge two dicts. I participated in the beginning but quickly felt overwhelmed by the endless repetition, so I muted most of the threads. > But I have been thinking about the reason (some) people like operators, and a discussion I had with my mentor Lambert Meertens over 30 years ago came to mind. > For mathematicians, operators are essential to how they think. I agree about the endless repetition. I hope Steven D'A is making good progress with the revised PEP. I think that could help us focus discussion. A few days ago, I drafted but did not send a post on binary operators. Prompted by Guido's helpful post, I'm appending it below. My approach and opinions are not the same as Guido's, but have much in common. Perhaps later, I'll clarify where I agree with Guido, and where my opinions differ. Certainly, I think we have in common an emphasis on usability and in particular readability of code.
====================================================

SUBJECT: Naming things: would having more binary operators help?

SUMMARY
I'm refocusing our earlier discussion on binary operators. I suggest we
discuss the question: Providing more binary operators. When would this
make naming things easier? And when harder?

THE PROBLEM
Naming things is hard. For example
https://hilton.org.uk/blog/why-naming-things-is-hard

"Naming is communication. Bad names prevent code from clearly
communicating its intent, which is why code with obfuscated names is
spectacularly hard to understand. The compiler might not care, but the
humans benefit from naming that communicates effectively."

AN EXAMPLE
One person wrote: using + to merge dicts is simple, non-disruptive, and
unlikely to really confuse anyone - so why not?

Another person proposed: d1 << d2 merges d2 into a copy of d1 and
returns it, with keys from d2 overriding keys from d1.

A third person wrote: "|" (especially "|=") *is* suitable for "update"
[So] reserve "+" for some alternative future commutative extension

A fourth person provided a '+' operator on a subclass of dict, that
merged items using a special addition on numeric values.

A fifth person suggested adding arrow operators, with symbols '->' and '<-'.

A sixth person said that '<-' would break valid code such as '-2<-1'.

A seventh person noted that annotations already use '->' as a symbol.

An eighth person said the infix module allows you to write a @cup@ b

A ninth person (me) will soon suggest that we add dict.gapfill

    current.update(changes)    # If conflict, prefer value in changes.
    options.gapfill(defaults)  # If conflict, prefer value in options.

(and so '+' or '|' for update not so clear).

BENEFITS OF BINARY OPERATORS
Binary operators, such as '+', allow us to write:

    c = a + b    # Infix notation
    a += x       # mutation or augmented assignment
    a[key] += x  # as above, but 'in place'

At school, when we learn arithmetic, we learn it using infix notation.
Two plus two is four.
Seven times eight is fifty-six. I think the above indicates the benefits
of binary operators, particularly when the binary operation does not
mutate the operands.

DIFFICULTIES
Sometimes, having few names to choose from makes naming things easier.
That's obvious. Sometimes, having a wider choice makes naming things
easier. Think Unicode's visually similar characters.

At present, Python has 20 operators, the majority being binary
evaluation operators.
https://docs.python.org/3/reference/lexical_analysis.html#operators

    +     -     *     **    /     //    %     @
    <<    >>    &     |     ^     ~
    <     >     <=    >=    ==    !=

The last row gives the (binary) comparison operators. The symbol '~' is
a unary operator (and '^' is the binary xor operator). For clarity, I'm
thinking of the binary evaluation operators, or in other words '+'
through to '|'.

Aside: '+' and '-' can be used as binary and unary operators.

    >>> 5 + -- +++ ---- + ------ - 4
    1

ONE SUGGESTION
The twelve binary evaluation operators sounds a lot, but perhaps some
users will need more. It might even be nice if the same symbol didn't
have too many different meanings. Python2 used '/' for both float and
integer division. To reduce cognitive overload, Python3 introduced '//'
for integer division.

    >>> 4.3 // 2.1
    2.0

For example
https://oeis.org/wiki/List_of_LaTeX_mathematical_symbols#Arrows
lists 10 types of horizontal arrow, and 6 types of vertical arrow.
Providing more binary operators is the motivation for my proposal

    # https://en.wikipedia.org/wiki/Inclusion%E2%80%93exclusion_principle
    len( A @cup B ) == len( A ) + len( B ) - len( A @cap B )

(By the way, we might prefer 'union' and 'intersection' to 'cup' and
'cap'. Also there are alternatives, such as using $ instead of @, or
using Unicode Math Characters.)

If there is a shared wish to have more binary operators, it might then
be useful to discuss how.

DISCUSSION QUESTION
Please discuss: Providing more binary operators. When would this make
naming things easier? And when harder?

By the way, naming things is mainly a human usability issue.
The computer doesn't care about this. Even if we use only the current binary operators, this discussion should help us better understand naming and usability. -- Jonathan From francismb at email.de Fri Mar 15 15:10:58 2019 From: francismb at email.de (francismb) Date: Fri, 15 Mar 2019 20:10:58 +0100 Subject: [Python-ideas] Code version evolver In-Reply-To: <23691.8720.682485.692756@turnbull.sk.tsukuba.ac.jp> References: <23357464-8b15-ff45-824e-08ad0970ab03@email.de> <23691.8720.682485.692756@turnbull.sk.tsukuba.ac.jp> Message-ID: <664d25df-c57f-dec3-f1d5-0fbdc4306807@email.de> On 3/15/19 4:54 AM, Stephen J. Turnbull wrote: > Not really. For example, addition of syntax like "async" and "yield" > fundamentally changes the meaning of "def", in ways that *could not* > be fully emulated in earlier Pythons. The semantics simply were > impossible to produce -- that's why syntax extensions were necessary. But here, the code for versions before that change (e.g. async) also worked on the new versions; there was no need to translate anything to the new version as it was a backward compatible change. To use the new feature you have to explicitly use that feature. Is that so far correct? Thanks, --francis From remi.lapeyre at henki.fr Fri Mar 15 15:20:35 2019 From: remi.lapeyre at henki.fr (=?UTF-8?Q?R=C3=A9mi_Lapeyre?=) Date: Fri, 15 Mar 2019 15:20:35 -0400 Subject: [Python-ideas] Why operators are useful In-Reply-To: References: Message-ID: Le 15 mars 2019 à 18:52:51, Guido van Rossum (guido at python.org(mailto:guido at python.org)) a écrit: > The power of visual processing really becomes apparent when you combine multiple operators. For example, consider the distributive law: > > mul(n, add(x, y)) == add(mul(n, x), mul(n, y)) (5) > > That was painful to write, and I believe that at first you won't see the pattern (or at least you wouldn't have immediately seen it if I hadn't mentioned this was the distributive law).
> > Compare to: > > n * (x + y) == n * x + n * y (5a) Thanks for the insight. I think this omits a very important property of mathematical equations though: maths is a very strongly typed language, which can be a significant improvement for readability. For example, a mathematician working within the space of linear maps over a vector space will easily recognize the meaning of every symbol in: f(a * x + y) = a * f(x) + f(y) and know that the + in the above expression is very different from the meaning of + in: x = a * y + z when he is working over the complex field C. For example, he instinctively knows what 1 / z means in the second case but that 1 / f in the first is completely bogus. In Python there is not that much contextual information, but we can use explicit names to overcome this, for example if I wrote: o = d1 + d2 + d3 you would have no idea what this is but: options = defaults + environment_variables + command_line_arguments is meaningful. ... > Of course, it's definitely possible to overdo this -- then you get Perl. But I think that the folks who point out "there is already a way to do this" are missing the point that it really is easier to grasp the meaning of this: > > d = d1 + d2 > > compared to this: > > d = d1.copy() > d.update(d2) Of course. I may have missed something but I don't understand why INADA Naoki's proposal does not get more attention. It is not binary, and we could use names to convey the meaning of the operation: options = dict.merge(defaults, environment_variables, command_line_arguments) His alternative options = defaults.merge(environment_variables, command_line_arguments) could also be used if preferred. Is there really something wrong with this? It would do exactly what most proponents of + want but could be more readable. I agree that the argument of performance may not be very strong as most of the time, the updated dict might be small, but it would also solve this elegantly.
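To make the intent concrete, here is a rough sketch of the semantics I have in mind (hypothetical -- no such method exists today, so it is spelled as a plain function, and all the names are illustrative only):

```python
# Hypothetical sketch of the proposed merge semantics; dict.merge() does
# not exist in Python today.
def merge(*mappings):
    """Return a new dict combining the given mappings; later values win."""
    result = {}
    for mapping in mappings:
        result.update(mapping)
    return result

defaults = {"color": "auto", "verbose": 0}
environment_variables = {"verbose": 1}
command_line_arguments = {"color": "never"}

options = merge(defaults, environment_variables, command_line_arguments)
print(options)  # {'color': 'never', 'verbose': 1}
```

The point being that the call site names the operation, and it extends naturally to more than two operands without building an intermediate copy per step.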
I'm sorry if what I'm saying is not clear or I'm not able to convey my thoughts clearly, as English is not my mother tongue; many others are better suited than me to discuss this proposal on this list, but I don't understand why this possibility is not more discussed. Rémi From francismb at email.de Fri Mar 15 15:22:25 2019 From: francismb at email.de (francismb) Date: Fri, 15 Mar 2019 20:22:25 +0100 Subject: [Python-ideas] Code version evolver In-Reply-To: <23691.8720.682485.692756@turnbull.sk.tsukuba.ac.jp> References: <23357464-8b15-ff45-824e-08ad0970ab03@email.de> <23691.8720.682485.692756@turnbull.sk.tsukuba.ac.jp> Message-ID: <949f979b-f6ef-c9e8-a7de-b148d437dc12@email.de> On 3/15/19 4:54 AM, Stephen J. Turnbull wrote: > What 2to3 does is to handle a lot of automatic conversions, such as > flipping the identifiers from str to bytes and unicode to str. It was > necessary to have some such tool because of the very large amount of > such menial work needed to change a 2 code base to a 3 code base. But > even so, there were things that 2to3 couldn't do, and it often exposed > bugs or very poor practice (decode applied to unicode objects, encode > applied to bytes) that had to be reworked by the developer anyway. Very interesting from the 2/3 transition experience point of view. But that's not yet the past; IMHO it will be only after 2020, ... around 2025 :-) Could one also say that, on balance, it *improved* the code (by exposing bugs and bad practices)? Could a first step be to just *flag* those behaviors/changes?
Regards, --francis From francismb at email.de Fri Mar 15 15:34:00 2019 From: francismb at email.de (francismb) Date: Fri, 15 Mar 2019 20:34:00 +0100 Subject: [Python-ideas] Code version evolver In-Reply-To: <23691.8720.682485.692756@turnbull.sk.tsukuba.ac.jp> References: <23357464-8b15-ff45-824e-08ad0970ab03@email.de> <23691.8720.682485.692756@turnbull.sk.tsukuba.ac.jp> Message-ID: <1924bffb-6899-91c6-f601-025915adf68a@email.de> On 3/15/19 4:54 AM, Stephen J. Turnbull wrote: > The thing about "within 3" upgrades is that that kind of project-wide > annoyance is going to be minimal, because the language is mostly > growing in power, not changing the semantics of existing syntax. Such > changes are very rare, and considered extremely carefully for > implications for existing code. I understand that no one really wants to annoy the language users by breaking the code and that's why those changes are considered carefully. Is that maybe because there is no easy way to write a translator, or because there is no translator to help the transition? > In a very few cases it's possible to > warn about dangerous use of obsolete syntax whose meaning has changed, > but that's very rare too. Ok, it's a starting point. --francis From 2QdxY4RzWzUUiLuE at potatochowder.com Fri Mar 15 15:44:21 2019 From: 2QdxY4RzWzUUiLuE at potatochowder.com (Dan Sommers) Date: Fri, 15 Mar 2019 14:44:21 -0500 Subject: [Python-ideas] Code version evolver In-Reply-To: <1924bffb-6899-91c6-f601-025915adf68a@email.de> References: <23357464-8b15-ff45-824e-08ad0970ab03@email.de> <23691.8720.682485.692756@turnbull.sk.tsukuba.ac.jp> <1924bffb-6899-91c6-f601-025915adf68a@email.de> Message-ID: <1963363c-88c8-4371-a433-d7465f19729f@potatochowder.com> On 3/15/19 2:34 PM, francismb wrote: > I understand that no one really wants to annoy the language users by > breaking the code and that's why those changes are considered carefully. > > Is that may be because there is no easy way to write a translator?
or > there is no translator to help transition? Translating existing code is a small problem when the language changes backwards-incompatibly. The larger problem is unlearning what I used to know and learning a new language. If that happens enough, I'll stop using a language. One of Python's strengths is that it evolves very slowly, and knowledge I work hard to accumulate now remains useful for a long time. From jfine2358 at gmail.com Fri Mar 15 15:49:29 2019 From: jfine2358 at gmail.com (Jonathan Fine) Date: Fri, 15 Mar 2019 19:49:29 +0000 Subject: [Python-ideas] Why operators are useful In-Reply-To: <723509C3-6FC3-440D-A67E-0E4B9A582DBA@gmail.com> References: <723509C3-6FC3-440D-A67E-0E4B9A582DBA@gmail.com> Message-ID: Raymond Hettinger wrote: > Frequency of usage: Math provides ? and ? because they are common. It doesn't provide a special operator for sqrt(c**2 - b**2) because the latter is less fundamental and less common. Here's some more information. Below is an example of an area where sqrt(c**2 - b**2) is both fundamental and common, and where it might be helpful for Python to provide a (named) function for this operation. Whether or not, or how, a symbolic expression should be provided is another question. This one example by itself does not refute Raymond's argument. I certainly think caution is required, in promoting the needs of one group of users at the expense of another. Best avoided, if possible. GORY DETAILS Don Knuth, in METAFONT, implemented special '++' and '+-+' operators, that he called Pythagorean addition and subtraction. The latter is precisely Raymond's sqrt(c**2 - b**2), but calculated more efficiently and accurately. This is described on page 66 of Don Knuth's METAFONT Book. https://ctan.org/tex-archive/systems/knuth/dist/mf/mfbook.tex The `^|++|' operation is called {\sl^{Pythagorean addition}\/}; $a\pyth+b$ is the same thing as $\sqrt{\stt a^2+b^2}$.
Most of the ^{square root} operations in computer programs could probably be avoided if $++$ were more widely available, because people seem to want square roots primarily when they are computing distances. Notice that $a\pyth+b\pyth+c= \sqrt{\stt a^2+b^2+c^2}$; we have the identity $(a\pyth+b)\pyth+c=a\pyth+( b\pyth+c)$ as well as $a\pyth+b=b\pyth+a$. It is better to use Pythagorean addition than to calculate $\sqrt{\stt a^2+b^2}$, because the computation of $a^2$ and $b^2$ might produce numbers that are too large even when $a\pyth+b$ is rather small. There's also an inverse operation, ^{Pythagorean subtraction}, which is denoted by `^|+-+|'; the quantity $a\mathbin{+{-}+}b$ is equal to $\sqrt{\stt a^2-b^2}$. ASIDE - wikipedia In https://en.wikipedia.org/wiki/Pythagorean_addition, Wikipedia uses the symbol \oplus for Pythagorean addition, and does not mention Pythagorean subtraction. ASIDE- \pyth and Python Don Knuth uses \pyth as a macro (shorthand) for Pythagorean. It's got nothing to do with Python. The METAFONT book goes back to 1986, which predates Python by about 5 years. That said, Pythagoras was the founder of a new way of life, and Python is a new way of programming. -- Jonathan From rosuav at gmail.com Fri Mar 15 15:56:22 2019 From: rosuav at gmail.com (Chris Angelico) Date: Sat, 16 Mar 2019 06:56:22 +1100 Subject: [Python-ideas] Code version evolver In-Reply-To: <1ece0df9-120a-21fd-e319-834505b7ba72@email.de> References: <20190311232501.GP12502@ando.pearwood.info> <13143e69-2a4a-382b-ea59-cf26bdceb5d1@email.de> <1ece0df9-120a-21fd-e319-834505b7ba72@email.de> Message-ID: On Sat, Mar 16, 2019 at 5:43 AM francismb wrote: > > On 3/14/19 9:47 PM, Chris Angelico wrote: > > What happens when someone wants to support multiple Python versions? > > "Requires Python 3.5 or newer" is easy. Forcing people to install the > > correct one for each version isn't. > What are the reasons why people want to support multiple Python > versions, on the 3 series?
do they really want? or they need to (may > be)? and for how many versions, "from 3.5 or newer" ... forever? will be > reasonable possible? IMHO more versions to support, the harder to support. > People who care about backward compatibility will usually have some definition of what they support, such as "this app will run on any Python version shipped by a currently-supported Debian release" (which at the moment means supporting Python 3.4, shipped by Debian Jessie), or "we support back as far as isn't too much of a pain" (which usually means committing to support everything starting from the version that introduced some crucial feature). Either way, there's not usually a "forever", but potentially quite a few versions' worth of support. The same is true of books that discuss the language, blog posts giving tips and tricks, Stack Overflow answers, and everything else that incorporates code that people might want to copy and paste. What version of Python do you need? What's the oldest that it still works on, and what's the newest before something breaks it? Backward-incompatible changes make that EXTREMELY hard. Backward-compatible changes make it only a little bit harder, as they set a minimum but not a maximum. You want to see how bad it can be? Go try to find out how to do something slightly unusual with React.js. Stack Overflow answers sometimes have three, four, or even more different code blocks, saying "this if you're on this version, that for some other version". 
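Stating and enforcing the floor itself is the easy part; a minimal sketch of the usual guard (the message text and helper name are just examples):

```python
import sys

def meets_floor(minimum, version=None):
    """Return True if `version` (default: the running interpreter) is at
    least `minimum`, e.g. meets_floor((3, 5)) for "3.5 or newer"."""
    if version is None:
        version = sys.version_info
    return tuple(version)[:len(minimum)] >= tuple(minimum)

if not meets_floor((3, 5)):
    sys.exit("This program requires Python 3.5 or newer")
```

Note there is deliberately no ceiling in the check: declaring only a minimum works as a support statement precisely because later versions are expected to stay backward compatible.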
ChrisA From francismb at email.de Fri Mar 15 16:02:45 2019 From: francismb at email.de (francismb) Date: Fri, 15 Mar 2019 21:02:45 +0100 Subject: [Python-ideas] Left arrow and right arrow operators In-Reply-To: References: <4fd9518d-3707-7f9c-751a-b2eefa172634@email.de> <20190303150638.zv24yqah625uiypj@phdru.name> <5C830BE8.1030606@canterbury.ac.nz> <6d8e6422-9635-fd4a-439d-7529fbca7c4a@email.de> Message-ID: <375e1160-d475-7324-8d41-88df11e5a3fd@email.de> On 3/13/19 7:44 PM, David Teresi wrote: > `->` would not be ambiguous in the proposed cases, but it does already > mean something elsewhere in the language as of 3.5: > > def concat(a: str, b: str) -> str: > return a + b > > This could potentially cause confusion (as with the % operator being > used for modulo as well as string formatting). IMHO in that context the asymmetry is still there: (a: str, b: str) -> str And the operator is the function. Regards, --francis From rhodri at kynesim.co.uk Fri Mar 15 15:28:16 2019 From: rhodri at kynesim.co.uk (Rhodri James) Date: Fri, 15 Mar 2019 19:28:16 +0000 Subject: [Python-ideas] Why operators are useful In-Reply-To: <723509C3-6FC3-440D-A67E-0E4B9A582DBA@gmail.com> References: <723509C3-6FC3-440D-A67E-0E4B9A582DBA@gmail.com> Message-ID: On 15/03/2019 18:54, Raymond Hettinger wrote: > So, we have to ask whether we're stretching too far from "operators are good" to "we need this operator". Here are some considerations: > > Frequency of usage: Math provides ? and ? because they are common. It doesn't provide a special operator for sqrt(c**2 - b**2) because the latter is less fundamental and less common. To me, f=d.copy() followed by f.update(e) arises so rarely that an operator isn't warranted. The existing code is already concise, clear, and rare. I think the "less fundamental" in your argument is more relevant than the "less common". 
Mathematicians will cheerfully invent operators for whatever is fundamental to their field and then use them twice in a paper, but still write out common combinations in full. I would suggest that merging is a fairly fundamental operation for dictionaries, so it is a good candidate for an operator. The combination "f=d.copy(); f.update(e)" is rare in my code. I suspect that's partly because it doesn't occur to me that I can do it. Guido's argument about recognisability is strong here. I know that dict.update() exists, and that I can destructively merge dictionaries. The extra step of doing the copy first for a non-destructive merge makes for a much less memorable pattern, to the point where I just don't think of it unless it would be more than usually useful. "f = d | e" (however it gets spelled) is much easier to remember the existence of. > Familiarity: We know about + because we use it a lot in addition and concatenation contexts. However, a symbol like ? is more opaque unless we're using it every day for a particular purpose. To me, the "+" operator implies "add/extend" semantics rather than "replace" semantics. Successive applications of "+" are never idempotent unless one operand is an identity element. So for me, "+" isn't familiar for dict merges. Loosely put, it isn't "plus-like". I think this is why so many other languages decided not to use "+" for dict merges even when that would have been a trivially easy implementation choice. I'm beginning to be swayed by the arguments that merging is more "or-like" and the right analogy is with set union. Personally I don't find "|" for set union at all obvious, but that argument was lost long ago, and like I said it's just personal. I don't have the same problem you have with the semantics of "+", but when I was a maths student I was used to using "+" as an entirely generic operator not necessarily meaning addition, so it's probably just me.
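For what it's worth, the key views already spell union as "|" today, though that gives back a set and throws the values away; the value-preserving non-destructive merge needs the unpacking form (all of this is existing behaviour, not new syntax):

```python
d = {"a": 1, "b": 2}
e = {"b": 20, "c": 30}

# Existing behaviour: dict key views support "|", but the result is a
# plain set of keys, so the values are lost.
print(d.keys() | e.keys())  # a set such as {'a', 'b', 'c'}

# The non-destructive merge we can write today:
f = {**d, **e}
print(f)  # {'a': 1, 'b': 20, 'c': 30}
print(d)  # {'a': 1, 'b': 2} -- the operands are untouched
```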
> Obviousness: When working with "+" on numeric types, it is obvious it should be commutative. When using "+" with sequence types, it is obvious that concatenation is non-commutative. When using "+" for mapping types, it is not obvious that it isn't commutative. Likewise, it isn't obvious that "+" is a destructive operation for mappings (consider that adding to a log file never destroys existing log entries, while updating a dict will overwrite existing values). I suspect this is a bit personal; I had sufficiently evil lecturers in my university Algebra course that I still don't automatically take the commutativity of "+" over a particular group as a given :-) Nothing is obvious unless you already know it. (There is a probably apocryphal tale of a lecturer in full flow saying "It is obvious that..." and pausing. He then turned to the blackboard and scribbled furiously in one corner for five minutes. "I was right," he said triumphantly, "it is obvious!") > Harmony: The operators on dict views use "|" but regular dicts would use "+". That doesn't seem harmonious. Yes, that's probably the killer argument against "+", damn it. > Impact: When a class in the standard library adds a method or operator, the reverberations are felt only locally. In contrast, the dict API is fundamental. Changing it will reverberate for years. It will be felt in the ABCs, typeshed, and every mapping-like object. IMO such an impactful change should only be made if it adds significant new functionality rather than providing a slightly shorter spelling of something we already have. I am inclined to think that adding significant new utility (which this does) is also a good enough reason to make such an impactful change.
-- Rhodri James *-* Kynesim Ltd From francismb at email.de Fri Mar 15 16:34:58 2019 From: francismb at email.de (francismb) Date: Fri, 15 Mar 2019 21:34:58 +0100 Subject: [Python-ideas] Code version evolver In-Reply-To: References: <20190311232501.GP12502@ando.pearwood.info> <13143e69-2a4a-382b-ea59-cf26bdceb5d1@email.de> <1ece0df9-120a-21fd-e319-834505b7ba72@email.de> Message-ID: Thanks! On 3/15/19 8:56 PM, Chris Angelico wrote: > The same is true of books that discuss the language, blog posts giving > tips and tricks, Stack Overflow answers, and everything else that > incorporates code that people might want to copy and paste. What > version of Python do you need? What's the oldest that it still works > on, and what's the newest before something breaks it? > Backward-incompatible changes make that EXTREMELY hard. > Backward-compatible changes make it only a little bit harder, as they > set a minimum but not a maximum. >... that seems to be a use case for a function like, e.g. "is-python-code(version-to-ask-for, code-snippet)" ;-) (Wanna search the min. and max. python working points/ranges? Loop over e.g. 3.0 .. 3.X) Regards, --francis From rosuav at gmail.com Fri Mar 15 18:09:41 2019 From: rosuav at gmail.com (Chris Angelico) Date: Sat, 16 Mar 2019 09:09:41 +1100 Subject: [Python-ideas] Code version evolver In-Reply-To: References: <20190311232501.GP12502@ando.pearwood.info> <13143e69-2a4a-382b-ea59-cf26bdceb5d1@email.de> <1ece0df9-120a-21fd-e319-834505b7ba72@email.de> Message-ID: On Sat, Mar 16, 2019 at 7:35 AM francismb wrote: > > Thanks! > On 3/15/19 8:56 PM, Chris Angelico wrote: > > The same is true of books that discuss the language, blog posts giving > > tips and tricks, Stack Overflow answers, and everything else that > > incorporates code that people might want to copy and paste. What > > version of Python do you need? What's the oldest that it still works > > on, and what's the newest before something breaks it?
> > Backward-incompatible changes make that EXTREMELY hard. > > Backward-compatible changes make it only a little bit harder, as they > > set a minimum but not a maximum. > >... that seems to be a use case for a function like, e.g. > "is-python-code(version-to-ask-for, code-snipped)" ;-) (Wanna search the > min. and max. python working points/ranges ? loop over e.g. 3.0 .. 3.X) > Python 3.5 introduced the modulo operator for bytes objects. How are you going to write a function that determines whether or not a piece of code depends on this? And, are you going to run this function on every single code snippet before you try it? I don't think this is possible, AND it's most definitely not a substitute for backward compatibility. ChrisA From python at mrabarnett.plus.com Fri Mar 15 18:53:31 2019 From: python at mrabarnett.plus.com (MRAB) Date: Fri, 15 Mar 2019 22:53:31 +0000 Subject: [Python-ideas] Why operators are useful In-Reply-To: References: Message-ID: On 2019-03-15 19:05, Jonathan Fine wrote:> Guido wrote: > >> There's been a lot of discussion about an operator to merge two dicts. I participated in the beginning but quickly felt overwhelmed by the endless repetition, so I muted most of the threads. > >> But I have been thinking about the reason (some) people like operators, and a discussion I had with my mentor Lambert Meertens over 30 years ago came to mind. > >> For mathematicians, operators are essential to how they think. > > I agree about the endless repetition. I hope Steven D'A is making good > progress with the revised PEP. I think that could help us focus > discussion. > > A few days ago, I drafted but did not send a post on binary operators. > Prompted by Guido's helpful post, I'm appending it below. My approach > and opinions are not the same as Guido's, but have much in common. > Perhaps later, I'll clarify where I agree with Guido, and where my > opinions differ. 
> > Certainly, I think we have in common an emphasis on usability and in > particular readability of code. > > ==================================================== > SUBJECT: Naming things: would having more binary operators help? > > SUMMARY > I'm refocusing our earlier discussion on binary operators. I suggest > we discuss the question: > Providing more binary operators. When would this make naming things > this easier? And when harder? > > THE PROBLEM > Naming things is hard. > > For example https://hilton.org.uk/blog/why-naming-things-is-hard > "Naming is communication. Bad names prevent code from clearly > communicating its intent, which is why code with obfuscated names is > spectacularly hard to understand. The compiler might not care, but the > humans benefit from naming that communicates effectively." > > AN EXAMPLE > One person wrote: > using + to merge dicts is simple, non-disruptive, and unlikely to > really confuse anyone - so why not? > > Another person proposed: > d1 << d2 merges d2 into a copy of d1 and returns it, with keys from d2 > overriding keys from d2. > > A third person wrote: > "|" (especially "|=") *is* suitable for "update" > [So] reserve "+" for some alternative future commutative extension > [snip] There was also the suggestion of having both << and >>. Actually, now that dicts are ordered, that would provide a use-case, because you would then be able to choose which values were overwritten whilst maintaining the order of the dict on the LHS. 
From steve at pearwood.info Fri Mar 15 21:27:49 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 16 Mar 2019 12:27:49 +1100 Subject: [Python-ideas] Code version evolver In-Reply-To: <664d25df-c57f-dec3-f1d5-0fbdc4306807@email.de> References: <23357464-8b15-ff45-824e-08ad0970ab03@email.de> <23691.8720.682485.692756@turnbull.sk.tsukuba.ac.jp> <664d25df-c57f-dec3-f1d5-0fbdc4306807@email.de> Message-ID: <20190316012747.GJ12502@ando.pearwood.info> On Fri, Mar 15, 2019 at 08:10:58PM +0100, francismb wrote: > On 3/15/19 4:54 AM, Stephen J. Turnbull wrote: > > Not really. For example, addition of syntax like "async" and "yield" > > fundamentally changes the meaning of "def", in ways that *could not* > > be fully emulated in earlier Pythons. The semantics simply were > > impossible to produce -- that's why syntax extensions were necessary. > But here, the code for versions before that change (e.g. aync) also > worked on the new versions? there was not need to translate anything to > the new version as it was a backward compatible change. To use the new > feature you have to explicitly use that feature. If that so far correct? No, it is not a backwards compatible change. Any code using async as a name will fail. 
py> sys.version '3.8.0a2+ (heads/pr_12089:5fcd3b8, Mar 11 2019, 12:39:33) \n[GCC 4.1.2 20080704 (Red Hat 4.1.2-55)]' py> async = 1 File "", line 1 async = 1 ^ SyntaxError: invalid syntax -- Steven From rosuav at gmail.com Fri Mar 15 21:37:36 2019 From: rosuav at gmail.com (Chris Angelico) Date: Sat, 16 Mar 2019 12:37:36 +1100 Subject: [Python-ideas] Code version evolver In-Reply-To: <20190316012747.GJ12502@ando.pearwood.info> References: <23357464-8b15-ff45-824e-08ad0970ab03@email.de> <23691.8720.682485.692756@turnbull.sk.tsukuba.ac.jp> <664d25df-c57f-dec3-f1d5-0fbdc4306807@email.de> <20190316012747.GJ12502@ando.pearwood.info> Message-ID: On Sat, Mar 16, 2019 at 12:28 PM Steven D'Aprano wrote: > > On Fri, Mar 15, 2019 at 08:10:58PM +0100, francismb wrote: > > On 3/15/19 4:54 AM, Stephen J. Turnbull wrote: > > > Not really. For example, addition of syntax like "async" and "yield" > > > fundamentally changes the meaning of "def", in ways that *could not* > > > be fully emulated in earlier Pythons. The semantics simply were > > > impossible to produce -- that's why syntax extensions were necessary. > > But here, the code for versions before that change (e.g. aync) also > > worked on the new versions? there was not need to translate anything to > > the new version as it was a backward compatible change. To use the new > > feature you have to explicitly use that feature. If that so far correct? > > No, it is not a backwards compatible change. Any code using async as a > name will fail. > > py> sys.version > '3.8.0a2+ (heads/pr_12089:5fcd3b8, Mar 11 2019, 12:39:33) \n[GCC 4.1.2 20080704 (Red Hat 4.1.2-55)]' > py> async = 1 > File "", line 1 > async = 1 > ^ > SyntaxError: invalid syntax Though that particular case is a little complicated. 
Python 3.4.4 (default, Apr 17 2016, 16:02:33) >>> async def foo(): File "", line 1 async def foo(): ^ SyntaxError: invalid syntax Python 3.5.3 (default, Sep 27 2018, 17:25:39) and Python 3.6.5 (default, Apr 1 2018, 05:46:30) >>> async def foo(): ... pass ... >>> async = 1 >>> Python 3.7.0a4+ (heads/master:95e4d58913, Jan 27 2018, 06:21:05) >>> async = 1 File "", line 1 async = 1 ^ SyntaxError: invalid syntax So at what point do you call it a backward-incompatible change? And if you have some sort of automated translation tool to "fix" this, when should it rename something that was called "async"? ChrisA From raymond.hettinger at gmail.com Fri Mar 15 21:39:33 2019 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Fri, 15 Mar 2019 18:39:33 -0700 Subject: [Python-ideas] Why operators are useful In-Reply-To: References: <723509C3-6FC3-440D-A67E-0E4B9A582DBA@gmail.com> Message-ID: <49BFF38E-7E96-4219-9EFB-C3FDD8A01F11@gmail.com> > On Mar 15, 2019, at 12:28 PM, Rhodri James wrote: > > I suspect this is a bit personal; I had sufficiently evil lecturers in my university Algebra course that I still don't automatically take the commutativity of "+" over a particular group as a given :-) Nothing is obvious unless you already know it. We don't design Python for ourselves. We design it for everyday users. Telling them that they can assume nothing is an anti-pattern. People do rely quite a bit on their intuitions. They also rely on implicit patterns already present in the language (i.e. in no other place is + idempotent, in no other place is + a destructive rather than concatenative or accumulative operator). As for commutativity, + would be obviously commutative for numeric types and obviously noncommutative for sequence concatenation, but for dicts the non-commutativity isn't obvious at all. And since the "|" operator is already used for mapping views, the + operator for merging would be unexpected. 
What is missing from the discussion is that we flat out don't need an operator for this. Use of explicit method names, update() or merge(), is already clear and already brief. Also, if we're honest with ourselves, most of us would use this less than once a year. So why make a pervasive change for this? Today, at least one PEP was rejected that had a stronger case than this proposal. We should consider asking why other major languages haven't gone down this path. The most likely reasons are 1) insufficient need, 2) the "+" operator doesn't make sense, and 3) there are already clean ways to do it. Also, it seems like the efficiency concerns were dismissed with hand-waving. But usually, coping and updating aren't the desired behavior. When teaching Python, I like to talk about how the design of the language nudges you towards fast, clear, correct code. The principle is that things that are good for you are put within easy reach. Things that require more thought are placed a little further away. That is the usual justification for copy() and deepcopy() having to be imported rather than being builtins. Copying is an obvious thing to do; it is also not usually good for you; so, we have you do one extra step to get to it. Raymond From rosuav at gmail.com Fri Mar 15 21:49:17 2019 From: rosuav at gmail.com (Chris Angelico) Date: Sat, 16 Mar 2019 12:49:17 +1100 Subject: [Python-ideas] Why operators are useful In-Reply-To: <49BFF38E-7E96-4219-9EFB-C3FDD8A01F11@gmail.com> References: <723509C3-6FC3-440D-A67E-0E4B9A582DBA@gmail.com> <49BFF38E-7E96-4219-9EFB-C3FDD8A01F11@gmail.com> Message-ID: On Sat, Mar 16, 2019 at 12:40 PM Raymond Hettinger wrote: > Also, it seems like the efficiency concerns were dismissed with hand-waving. But usually, coping and updating aren't the desired behavior. When teaching Python, I like to talk about how the design of the language nudges you towards fast, clear, correct code. 
The principle is that things that are good for you are put within easy reach. Things that require more thought are placed a little further away. That is the usual justification for copy() and deepcopy() having to be imported rather than being builtins. Copying is an obvious thing to do; it is also not usually good for you; so, we have you do one extra step to get to it. > I'm not sure I understand this argument. Are you saying that d1+d2 is bad code because it will copy the dictionary, and therefore it shouldn't be done? Because the exact same considerations apply to the addition of two lists, which already exists in the language. Is it bad to add lists together instead of using extend()? ChrisA From raymond.hettinger at gmail.com Fri Mar 15 22:27:34 2019 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Fri, 15 Mar 2019 19:27:34 -0700 Subject: [Python-ideas] Why operators are useful In-Reply-To: References: <723509C3-6FC3-440D-A67E-0E4B9A582DBA@gmail.com> <49BFF38E-7E96-4219-9EFB-C3FDD8A01F11@gmail.com> Message-ID: > On Mar 15, 2019, at 6:49 PM, Chris Angelico wrote: > > On Sat, Mar 16, 2019 at 12:40 PM Raymond Hettinger > wrote: >> Also, it seems like the efficiency concerns were dismissed with hand-waving. But usually, coping and updating aren't the desired behavior. When teaching Python, I like to talk about how the design of the language nudges you towards fast, clear, correct code. The principle is that things that are good for you are put within easy reach. Things that require more thought are placed a little further away. That is the usual justification for copy() and deepcopy() having to be imported rather than being builtins. Copying is an obvious thing to do; it is also not usually good for you; so, we have you do one extra step to get to it. >> > > I'm not sure I understand this argument. Are you saying that d1+d2 is > bad code because it will copy the dictionary, and therefore it > shouldn't be done? 
Because the exact same considerations apply to the > addition of two lists, which already exists in the language. Is it bad > to add lists together instead of using extend()? Yes, that exactly. Consider a table in a database. Usually what people want/need/ought-to-do is an SQL UPDATE rather than copy and update which would double the memory requirement and be potentially many times slower. The same applies to Python lists. Unless you actually have a requirement for three distinct lists (c = a + b), it is almost always better to extend in place. Adding lists rather than extending them is a recipe for poor performance (especially if it occurs in a loop): Raymond ---- Performant version ---- s = socket.socket() try: s.connect((host, port)) s.send(request) blocks = [] while True: block = s.recv(4096) if not block: break blocks += [block] # Normally done with append() page = b''.join(blocks) print(page.replace(b'\r\n', b'\n').decode()) finally: s.close() ---- Catastrophic version ---- s = socket.socket() try: s.connect((host, port)) s.send(request) blocks = [] while True: block = s.recv(4096) if not block: break blocks = blocks + [block] # Not good for you. page = b''.join(blocks) print(page.replace(b'\r\n', b'\n').decode()) finally: s.close() From rosuav at gmail.com Fri Mar 15 22:40:14 2019 From: rosuav at gmail.com (Chris Angelico) Date: Sat, 16 Mar 2019 13:40:14 +1100 Subject: [Python-ideas] Why operators are useful In-Reply-To: References: <723509C3-6FC3-440D-A67E-0E4B9A582DBA@gmail.com> <49BFF38E-7E96-4219-9EFB-C3FDD8A01F11@gmail.com> Message-ID: On Sat, Mar 16, 2019 at 1:27 PM Raymond Hettinger wrote: > > > On Mar 15, 2019, at 6:49 PM, Chris Angelico wrote: > > > > On Sat, Mar 16, 2019 at 12:40 PM Raymond Hettinger > > wrote: > >> Also, it seems like the efficiency concerns were dismissed with hand-waving. But usually, coping and updating aren't the desired behavior. 
When teaching Python, I like to talk about how the design of the language nudges you towards fast, clear, correct code. The principle is that things that are good for you are put within easy reach. Things that require more thought are placed a little further away. That is the usual justification for copy() and deepcopy() having to be imported rather than being builtins. Copying is an obvious thing to do; it is also not usually good for you; so, we have you do one extra step to get to it. > >> > > > > I'm not sure I understand this argument. Are you saying that d1+d2 is > > bad code because it will copy the dictionary, and therefore it > > shouldn't be done? Because the exact same considerations apply to the > > addition of two lists, which already exists in the language. Is it bad > > to add lists together instead of using extend()? > > Yes, that exactly. > Okay, fair. Though that doesn't necessarily push people towards operators. Your example from below: > blocks += [block] # Normally done with append() > blocks = blocks + [block] # Not good for you. contrasts two different ways of using operators, not operators vs methods (and as you say, the "good" example is more usually spelled with a method anyway). So I'm not sure what this means in terms of dictionary merging. I'm in favour of having both "merge to new" and "merge into this" operations (spelled as either + and +=, or | and |=, and I'm not fussed which of those is picked). As with everything else, "x += y" can be assumed to be the better option over "x = x + y", but the difference between copy/update and in-place update is the job of augmented assignment, not an operator/method distinction. > Consider a table in a database. Usually what people want/need/ought-to-do is an SQL UPDATE rather than copy and update which would double the memory requirement and be potentially many times slower. > (Heh. 
Funny you mention that example, because PostgreSQL actually implements updates by copying a row and then marking the old one as "will be deleted by transaction X". But that's unrelated to this, as it's a design decision for concurrency.) So in terms of "design pushing people to the performant option", the main takeaway is that, if dict addition is implemented, augmented addition should also be implemented. I don't think that's really under dispute. The question is, should addition (or bitwise-or, same diff) be implemented at all? Performance shouldn't kill that. ChrisA From arj.python at gmail.com Fri Mar 15 22:43:52 2019 From: arj.python at gmail.com (Abdur-Rahmaan Janhangeer) Date: Sat, 16 Mar 2019 06:43:52 +0400 Subject: [Python-ideas] Why operators are useful In-Reply-To: References: <723509C3-6FC3-440D-A67E-0E4B9A582DBA@gmail.com> <49BFF38E-7E96-4219-9EFB-C3FDD8A01F11@gmail.com> Message-ID: Despite my poor python skills, i don't think i'd ever use this one. blocks = blocks + [block] # Not good for you. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From songofacandy at gmail.com Sat Mar 16 00:18:49 2019 From: songofacandy at gmail.com (Inada Naoki) Date: Sat, 16 Mar 2019 13:18:49 +0900 Subject: [Python-ideas] Why operators are useful In-Reply-To: References: Message-ID: On Sat, Mar 16, 2019 at 2:51 AM Guido van Rossum wrote: > > But I think that the folks who point out "there is already a way to do this" are missing the point that it really is easier to grasp the meaning of this: > > d = d1 + d2 > > compared to this: > > d = d1.copy() > d = d1.update(d2) > > and it is not just a matter of fewer lines of code: the first form allows us to use our visual processing to help us see the meaning quicker -- and without distracting other parts of our brain (which might already be occupied by keeping track of the meaning of d1 and d2, for example). It seems this example is bit unfair. 
It is not just method vs operator, because dict doesn't provide outer place version of update() method. In case of set, `s = s1 | s2` can be compared to `s = s1.union(s2)`. So dict example doesn't explain "why add operator instead of method?" > Of course, everything comes at a price. You have to learn the operators, and you have to learn their properties when applied to different object types. (This is true in math too -- for numbers, x*y == y*x, but this property does not apply to functions or matrices; OTOH x+y == y+x applies to all, as does the associative law.) I think behavior is more important than properties. When we learn operator's behavior, its property is obvious. So main point of using operator or not is consistency. Same operator should be used for same thing as possible. I prefer | to + because the behavior of dict.update() looks similar set.union() rather than list.extend(). Another option I like is add + operator to not only dict, but also set. In this case, + is used to join containers by the way most natural to the container's type. That's what Kotlin and Scala does. (Although Scala used ++ instead of +). ref: https://discuss.python.org/t/pep-584-survey-of-other-languages-operator-overload/977 Regards, -- Inada Naoki From guido at python.org Sat Mar 16 01:29:01 2019 From: guido at python.org (Guido van Rossum) Date: Fri, 15 Mar 2019 22:29:01 -0700 Subject: [Python-ideas] Why operators are useful In-Reply-To: References: Message-ID: On Fri, Mar 15, 2019 at 9:19 PM Inada Naoki wrote: > On Sat, Mar 16, 2019 at 2:51 AM Guido van Rossum wrote: > > > > But I think that the folks who point out "there is already a way to do > this" are missing the point that it really is easier to grasp the meaning > of this: > > > > d = d1 + d2 > > > > compared to this: > > > > d = d1.copy() > > d = d1.update(d2) > [Note that I made a typo in the last line. It should be `d.update(d2)`, no assignment.] 
> > and it is not just a matter of fewer lines of code: the first form > allows us to use our visual processing to help us see the meaning quicker > -- and without distracting other parts of our brain (which might already be > occupied by keeping track of the meaning of d1 and d2, for example). > > It seems this example is bit unfair. It is not just method vs operator, > because dict doesn't provide outer place version of update() method. > Actually most of my post was exactly about why operators can in some cases be better than functions (which includes methods). > In case of set, `s = s1 | s2` can be compared to `s = s1.union(s2)`. > > So dict example doesn't explain "why add operator instead of method?" > Correct, since most of the post was already explaining it. :-) > > Of course, everything comes at a price. You have to learn the operators, > and you have to learn their properties when applied to different object > types. (This is true in math too -- for numbers, x*y == y*x, but this > property does not apply to functions or matrices; OTOH x+y == y+x applies > to all, as does the associative law.) > > I think behavior is more important than properties. > When we learn operator's behavior, its property is obvious. > So main point of using operator or not is consistency. Same operator > should be used for same thing as possible. > > I prefer | to + because the behavior of dict.update() looks similar > set.union() > rather than list.extend(). > That's a separate topic and I did not mean to express an opinion on it in this post. I simply used + because it's the simplest of all operators, and it makes it easier for everyone to follow the visual argument. > Another option I like is add + operator to not only dict, but also set. > In this case, + is used to join containers by the way most natural to the > container's type. > > That's what Kotlin and Scala does. (Although Scala used ++ instead of +). 
> ref: > https://discuss.python.org/t/pep-584-survey-of-other-languages-operator-overload/977 This probably belongs in another thread (though IIRC it has been argued to death already). -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjreedy at udel.edu Sat Mar 16 04:40:44 2019 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 16 Mar 2019 04:40:44 -0400 Subject: [Python-ideas] Why operators are useful In-Reply-To: <49BFF38E-7E96-4219-9EFB-C3FDD8A01F11@gmail.com> References: <723509C3-6FC3-440D-A67E-0E4B9A582DBA@gmail.com> <49BFF38E-7E96-4219-9EFB-C3FDD8A01F11@gmail.com> Message-ID: I agree with Guido's general comments on operators. Modern arithmetic and algebra really took off with the introduction of operators. On the other hand, I have seen condensed blocks of 'higher math', dense with operators, that I could hardly read, and that reminded me of APL or Perl. On 3/15/2019 9:39 PM, Raymond Hettinger wrote: > We don't design Python for ourselves. We design it for everyday users. Telling them that they can assume nothing is an anti-pattern. People do rely quite a bit on their intuitions. They also rely on implicit patterns already present in the language (i.e. in no other place is + idempotent, in no other place is + a destructive rather than concatenative or accumulative operator). As for commutativity, + would be obviously commutative for numeric types and obviously noncommutative for sequence concatenation, but for dicts the non-commutativity isn't obvious at all. And since the "|" operator is already used for mapping views, the + operator for merging would be unexpected. I agree with this argument in favor of '|' over '+'. > What is missing from the discussion is that we flat out don't need an operator for this. I grepped idlelib's 60 modules for '.update('.
Ignoring the tkinter .update() calls, there are 3 uses of copy-update, to create a namespace for eval or exec, that could use the new operator. There are 3 other uses, to update-mutate an existing dict, which would not. If someone took a similar look at stdlib modules, I missed it. So I looked at non-package top-level modules in /lib (no recursion). The following likely has a few mis-classification mistakes, but most were clear.

35  dict mutate updates
 7  set updates
 8  dict copy-updates that could use '|' (assuming not set updates)
    # I did not think of set possibility until I had seen most of these
 4  copy, intervening try or if, update
    # these either could not use '|' or only with code contortion
 5  tk widget updates
10  other update methods (a few 'dict' updates might belong here)
10? 'update's in docstrings and comments
--
79 hits

-- Terry Jan Reedy From greg.ewing at canterbury.ac.nz Sat Mar 16 04:04:22 2019 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 16 Mar 2019 21:04:22 +1300 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <20190315164401.GB29550@ando.pearwood.info> References: <20190305161100.GE4465@ando.pearwood.info> <5C7EFAE6.8000105@canterbury.ac.nz> <8812767a-1010-0923-36d4-22af4d2665ec@brice.xyz> <5C804A08.9060902@canterbury.ac.nz> <20190315123445.22d1284b@fsol> <20190315164401.GB29550@ando.pearwood.info> Message-ID: <5C8CAE06.4060304@canterbury.ac.nz> Another random thought about this: Mathematicians use addition as a metaphor for quite a range of different things, but they tend to only use the symbols ∪ and ∩ for actual sets, or things that are very set-like. So maybe that's an argument for using '+' rather than '|' for dict merging.
-- Greg From greg.ewing at canterbury.ac.nz Sat Mar 16 04:39:05 2019 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 16 Mar 2019 21:39:05 +1300 Subject: [Python-ideas] Why operators are useful In-Reply-To: References: Message-ID: <5C8CB629.9040303@canterbury.ac.nz> Rémi Lapeyre wrote: > I think this omit a very important property of > mathematic equations thought, maths is a very strongly typed language > which can be a significant improvement for readability. Python is very strongly typed too, so I don't really see how maths is different. > For example, a > mathematician working within the space of linear maps over a vector > space will easily recognize the meaning of every symbol in: > > f(a * x + y) = a * f(x) + f(y) Yes, but he has to remember what types are associated with the variables -- nothing at their point of use indicates that. Likewise, the reader of a Python program has to remember what type of object each name is expected to be bound to. If he can remember that, he will know what all the operators do. -- Greg From skrah at bytereef.org Sat Mar 16 05:38:59 2019 From: skrah at bytereef.org (Stefan Krah) Date: Sat, 16 Mar 2019 10:38:59 +0100 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <5C8CAE06.4060304@canterbury.ac.nz> References: <5C7EFAE6.8000105@canterbury.ac.nz> <8812767a-1010-0923-36d4-22af4d2665ec@brice.xyz> <5C804A08.9060902@canterbury.ac.nz> <20190315123445.22d1284b@fsol> <20190315164401.GB29550@ando.pearwood.info> <5C8CAE06.4060304@canterbury.ac.nz> Message-ID: <20190316093859.GA3405@bytereef.org> On Sat, Mar 16, 2019 at 09:04:22PM +1300, Greg Ewing wrote: > Another random thought about this: Mathematicians use addition as a > metaphor for quite a range of different things, but they tend to only > use the symbols ∪ and ∩ for actual sets, or things that are very > set-like. So maybe that's an argument for using '+' rather than '|' > for dict merging.
If one views an ordered dict as an assoc list, '+' would mean prepending the new values to the existing ones. If one views an unordered dict as a set of ordered pairs, '|' would make sense. Stefan Krah From storchaka at gmail.com Sat Mar 16 06:29:08 2019 From: storchaka at gmail.com (Serhiy Storchaka) Date: Sat, 16 Mar 2019 12:29:08 +0200 Subject: [Python-ideas] Not all operators are useful (was Why operators are useful) In-Reply-To: References: Message-ID: 15.03.19 19:51, Guido van Rossum wrote: > There's been a lot of discussion about an operator to merge two dicts. I > participated in the beginning but quickly felt overwhelmed by the > endless repetition, so I muted most of the threads. > > But I have been thinking about the reason (some) people like operators, > and a discussion I had with my mentor Lambert Meertens over 30 years ago > came to mind. Operators are useful because they are used for common operations. And the meaning is roughly the same in most programming languages and not only. It is very inconvenient to write any calculations using add(x, y) or x.add(y) (if you use big integers or decimals in Java you need to do this). Concatenating strings is common enough operation too. Although Python have now many other ways to perform it ('%s%s' % (x, y), f'{x}{y}', ''.join((x, y)), etc), so using the plus operator is not strongly necessary. But this is a history. Also, the "+" operator works well in pair with the "*" operator. But how much times you need to merge dicts not in-place? My main objection against adding an operator for merging dicts is that this is very uncommon operation. It adds complexity to the language, adds more special cases (because "+" for dicts do not satisfy some properties of "+" for numbers and sequences), adds potential conflicts (with Counter), but the usefulness of it is minor. Operators are useful, but not all operators are always useful in all cases.
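For readers following the thread, the existing out-of-place merge spellings being weighed here can be put side by side in a minimal sketch (the dict values are invented for illustration; nothing below requires the proposed operator):

```python
# The merge spellings discussed in this thread, side by side.
from collections import ChainMap

d1 = {"a": 1, "b": 2}
d2 = {"b": 3, "c": 4}

# 1. Unpacking (Python 3.5+); the right-hand operand wins on duplicate keys.
merged_unpack = {**d1, **d2}

# 2. copy() then update(); same result, spelled as two statements.
merged_update = d1.copy()
merged_update.update(d2)

# 3. ChainMap resolves lookups left to right (left map wins), so it is
#    built in reversed order here to match the other two spellings.
chained = dict(ChainMap(d2, d1))

assert merged_unpack == merged_update == chained == {"a": 1, "b": 3, "c": 4}
```

All three give last-seen-wins semantics for duplicate keys once ChainMap's argument order is reversed; ChainMap additionally offers a non-copying view when no new dict is needed.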
From steve at pearwood.info Sat Mar 16 06:33:31 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 16 Mar 2019 21:33:31 +1100 Subject: [Python-ideas] Why operators are useful In-Reply-To: References: Message-ID: <20190316103330.GL12502@ando.pearwood.info> On Fri, Mar 15, 2019 at 10:53:31PM +0000, MRAB wrote: > There was also the suggestion of having both << and >>. > > Actually, now that dicts are ordered, that would provide a use-case, > because you would then be able to choose which values were overwritten > whilst maintaining the order of the dict on the LHS. Is that common enough that it needs to be built-in to dict itself? If it is uncommon, then the conventional solution is to subclass dict, overriding the merge operator to use first-seen semantics. The question this PEP is trying to answer is not "can we support every use-case imaginable for a merge operator?" but "can we support the most typical use-case?", which I believe is a version of:

new = a.copy()
new.update(b)
# do something with new

-- Steven From solipsis at pitrou.net Sat Mar 16 06:39:22 2019 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 16 Mar 2019 11:39:22 +0100 Subject: [Python-ideas] PEP: Dict addition and subtraction References: <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <98F8539E-D119-4299-B289-1EDA729FF10B@gmail.com> <20190305231453.GF4465@ando.pearwood.info> <20190315122021.16e8cca8@fsol> <20190315144158.GF12502@ando.pearwood.info> Message-ID: <20190316113922.36c79378@fsol> On Sat, 16 Mar 2019 01:41:59 +1100 Steven D'Aprano wrote: > > Matrix multiplication is a perfect example: adding the @ operator could > have been done in Python 0.1 if anyone had thought of it, but it took 15 > years of numerical folk "whinging" about the lack until it happened: Not so perfect, as the growing use of Python for scientific computing has made it much more useful to promote a dedicated matrix multiplication operator than, say, 15 or 20
years ago. This is precisely why I worded my question this way: what has changed in the last 20 years that make a "+" dict operator more compelling today than it was? Do we merge dicts much more frequently than we did? I don't think so. > Or the infamous := operator, which ultimately is a useful but minor > syntactic and semantic change but generated a huge amount of debate, > argument and negativity. ... and is likely to be a mistake as well. Justifying future mistakes with past mistakes doesn't sound very reasonable ;-) > I still remember being told in no uncertain terms by the core devs that > adding a clear() method to lists was a waste of time because there was > already a perfectly good way to spell it with slicing. And then ABCs > came along and now lists have a clear method. So opinions change too. Not really the same problem. The "+" dict operator is not intuitively obvious in its meaning, while a "clear()" method on lists is. I wouldn't mind the new operator if its meaning was clear-cut. But here we have potential for confusion, both for writers and readers of code. > > Besides, if I have two dicts with e.g. lists as values, I *really* > > dislike the fact that the + operator will clobber the values rather than > > concatenate them. It's a recipe for confusion. > > Are you confused that the update method clobbers list values rather than > concatenate them? I doubt that you are. > > So why would it be confusing to say that + does a copy-and-update? Because it's named "+" precisely. You know, names are important. ;-) Regards Antoine. 
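The clobbering behaviour debated in this exchange is easy to demonstrate with a minimal sketch (the example dicts are invented; the Counter contrast at the end is the conflict raised elsewhere in the thread):

```python
# Both existing merge spellings replace values wholesale -- they never
# concatenate them -- which is the source of the confusion discussed here.
d1 = {"k": [1, 2]}
d2 = {"k": [3, 4]}

assert {**d1, **d2} == {"k": [3, 4]}   # right-hand value wins outright

merged = d1.copy()
merged.update(d2)
assert merged == {"k": [3, 4]}         # update() clobbers the same way

# By contrast, collections.Counter already gives "+" an elementwise
# meaning, so a clobbering dict "+" would coexist with an additive one.
from collections import Counter
c1 = Counter(a=1, b=2)
c2 = Counter(b=3)
assert c1 + c2 == Counter(a=1, b=5)
```

Whatever the spelling, the semantics under discussion are those of update(): last write wins per key, with no recursion into the values.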
From solipsis at pitrou.net Sat Mar 16 06:40:51 2019 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 16 Mar 2019 11:40:51 +0100 Subject: [Python-ideas] PEP: Dict addition and subtraction References: <20190302035224.GO4465@ando.pearwood.info> <8e57c9b2-7973-8317-e7ce-c33abcbe4425@netc.fr> <01ce6a83-534c-2e27-acc9-f3be4601b559@kynesim.co.uk> <20190315122522.2756381f@fsol> <20190315145907.GG12502@ando.pearwood.info> Message-ID: <20190316114051.6d760987@fsol> On Sat, 16 Mar 2019 01:59:07 +1100 Steven D'Aprano wrote: > On Fri, Mar 15, 2019 at 12:25:22PM +0100, Antoine Pitrou wrote: > > > Yeah, well.... I do think "+=" for lists was a mistake. I *still* have > > trouble remembering the exact difference between "list +=" and > > "list.extend" (yes, there is one: one accepts more types than the > > other... which one it is, and why, I never remember; > > Both accept arbitrary iterables, and the documentation suggests that > they are the same: > > https://docs.python.org/3/library/stdtypes.html#mutable-sequence-types > > Perhaps you are thinking of the difference between list + list versus > list += iterable? Hmm, it looks like I misremembered indeed. Thanks for correcting this. Regards Antoine. From solipsis at pitrou.net Sat Mar 16 06:49:15 2019 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 16 Mar 2019 11:49:15 +0100 Subject: [Python-ideas] PEP: Dict addition and subtraction References: <20190305161100.GE4465@ando.pearwood.info> <5C7EFAE6.8000105@canterbury.ac.nz> <8812767a-1010-0923-36d4-22af4d2665ec@brice.xyz> <5C804A08.9060902@canterbury.ac.nz> <20190315123445.22d1284b@fsol> <20190315164401.GB29550@ando.pearwood.info> Message-ID: <20190316114915.5759b352@fsol> On Sat, 16 Mar 2019 03:44:02 +1100 Steven D'Aprano wrote: > On Fri, Mar 15, 2019 at 12:34:45PM +0100, Antoine Pitrou wrote: > > On Thu, 7 Mar 2019 10:58:02 +1100 > > Chris Angelico wrote: > > > > > > Lots of words that basically say: Stuff wouldn't be perfectly pure. 
> > > > Chris, please learn to think twice before contributing what is > > essentially a trivialization of someone else's arguments. You're not > > doing anything useful here, and are just sounding like an asshole who > > wants to shut people up. > > I don't think you are being fair here, and I'd rather avoid getting into > unhelpful arguments about tone and whether Chris is "trivializing" (a > perjorative term) or "simplifying" (a more neutral term) Josh's > position. But if you feel that Chris (and I) have missed parts of Josh's > argument, then by all means point out what we missed. When someone posts an elaborate argument (regardless of whether they are right or not) and someone else responds a one-liner that reduces it to "lots of words" and claims to rephrase it as a short caricatural statement, then it seems fair to me to characterize it as "trivializing". But if you feel that I missed a subtlety in Chris' position, and if you feel he was more respectful of the OP than I felt he was, then by all means point out what I missed. Regards Antoine. From solipsis at pitrou.net Sat Mar 16 06:54:41 2019 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 16 Mar 2019 11:54:41 +0100 Subject: [Python-ideas] Why operators are useful References: Message-ID: <20190316115441.64117152@fsol> On Fri, 15 Mar 2019 10:51:11 -0700 Guido van Rossum wrote: > Of course, everything comes at a price. You have to learn the operators, > and you have to learn their properties when applied to different object > types. That's not the only price, though. If "+" is added to dicts, then we're overloading an already heavily used operator. It makes reading code more difficult. In mathematics, this is not a problem as the "types" of the "variables" are explicitly given. Not in (usual) Python. Regards Antoine. 
From steve at pearwood.info Sat Mar 16 07:00:59 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 16 Mar 2019 22:00:59 +1100 Subject: [Python-ideas] Why operators are useful In-Reply-To: References: <723509C3-6FC3-440D-A67E-0E4B9A582DBA@gmail.com> <49BFF38E-7E96-4219-9EFB-C3FDD8A01F11@gmail.com> Message-ID: <20190316110054.GM12502@ando.pearwood.info> On Sat, Mar 16, 2019 at 06:43:52AM +0400, Abdur-Rahmaan Janhangeer wrote: > Despite my poor python skills, i don't think i'd ever use this one. > > blocks = blocks + [block] # Not good for you. Neither would I. But I would use:

result = process(blocks + [block])

in preference to:

temp = blocks[:]
temp.append(block)
result = process(temp)
del temp  # don't pollute the global namespace

Can I make it clear that the dict addition PEP does not propose deprecating or removing the update method? If you need to update a dict in place, the update method remains the preferred One Obvious Way to do so, just as list.append remains the One Obvious Way to append to a list. -- Steven From Richard at Damon-Family.org Sat Mar 16 07:17:10 2019 From: Richard at Damon-Family.org (Richard Damon) Date: Sat, 16 Mar 2019 07:17:10 -0400 Subject: [Python-ideas] Why operators are useful In-Reply-To: <5C8CB629.9040303@canterbury.ac.nz> References: <5C8CB629.9040303@canterbury.ac.nz> Message-ID: <195c22b5-5a56-87a7-69e9-b376f51c5e14@Damon-Family.org> On 3/16/19 4:39 AM, Greg Ewing wrote: > Rémi Lapeyre wrote: >> I think this omit a very important property of >> mathematic equations thought, maths is a very strongly typed language >> which can be a significant improvement for readability. > > Python is very strongly typed too, so I don't really see how > maths is different. 'Strongly Typed Language' can have slightly different meaning to different people. In Python, an object have a very definite type which strongly defines what you can do with that object, while other languages are less definitive in that aspect.
But in Python, names are NOT that strongly typed, as a name can be rebound to any sort of object with a wide variety of types, compared to other languages where before using (or at first use) a variable you need to declare the 'type' that will be stored in it, and that type is all that it can hold. Rémi, I believe, is assuming in their example that by defining the field of mathematics being used, there is at least an implicit definition (if not actually explicit, as such a statement would typically be preceded by definitions) of the types of the variables. This is part of the rigors of the language of mathematics. Python on the other hand, while it allows providing a 'Type Hint' for the type of a variable, doesn't demand such a thing, so when looking at a piece of code you don't necessarily know the types of the objects being used (which can also be a strength). -- Richard Damon From gjcarneiro at gmail.com Sat Mar 16 08:01:51 2019 From: gjcarneiro at gmail.com (Gustavo Carneiro) Date: Sat, 16 Mar 2019 12:01:51 +0000 Subject: [Python-ideas] Why operators are useful In-Reply-To: <20190316103330.GL12502@ando.pearwood.info> References: <20190316103330.GL12502@ando.pearwood.info> Message-ID: On Sat, 16 Mar 2019 at 10:33, Steven D'Aprano wrote: > On Fri, Mar 15, 2019 at 10:53:31PM +0000, MRAB wrote: > > > There was also the suggestion of having both << and >>. > > > > Actually, now that dicts are ordered, that would provide a use-case, > > because you would then be able to choose which values were overwritten > > whilst maintaining the order of the dict on the LHS. > > Is that common enough that it needs to be built-in to dict itself? > > If it is uncommon, then the conventional solution is to subclass dict, > overriding the merge operator to use first-seen semantics. > > The question this PEP is trying to answer is not "can we support every > use-case imaginable for a merge operator?"
but "can we support the most > typical use-case?", which I believe is a version of: > > new = a.copy() > new.update(b) > # do something with new > Already been said, but might have been forgotten, but the new proposed syntax: new = a + b has to compete with the already existing syntax: new = {**a, **b} The existing syntax is not exactly an operator in the mathematical sense (or is it?...), but my intuition is that it already triggers the visual processing part of the brain, similarly to operators. The only argument for "a + b" in detriment of "{**a, **b}" is that "a + b" is more easy to discover, while not many programmers are familiar with "{**a, **b}". I wonder if this is only a matter of time, and over time programmers will become more accustomed to "{**a, **b}", thereby reducing the relative benefit of "a + b"? Especially as more and more developers migrate code bases from Python 2 to Python 3... -- Gustavo J. A. M. Carneiro Gambit Research "The universe is always one step beyond logic." -- Frank Herbert -------------- next part -------------- An HTML attachment was scrubbed... URL: From 2QdxY4RzWzUUiLuE at potatochowder.com Sat Mar 16 08:14:19 2019 From: 2QdxY4RzWzUUiLuE at potatochowder.com (Dan Sommers) Date: Sat, 16 Mar 2019 07:14:19 -0500 Subject: [Python-ideas] Why operators are useful In-Reply-To: <195c22b5-5a56-87a7-69e9-b376f51c5e14@Damon-Family.org> References: <5C8CB629.9040303@canterbury.ac.nz> <195c22b5-5a56-87a7-69e9-b376f51c5e14@Damon-Family.org> Message-ID: <8aefee76-79f4-7b70-6c67-d286730a0f8e@potatochowder.com> On 3/16/19 6:17 AM, Richard Damon wrote: > On 3/16/19 4:39 AM, Greg Ewing wrote: >> Rémi Lapeyre wrote: >>> I think this omit a very important property of >>> mathematic equations thought, maths is a very strongly typed language >>> which can be a significant improvement for readability. >> >> Python is very strongly typed too, so I don't really see how >> maths is different.
> > 'Strongly Typed Language' can have slightly different meaning to > different people. In Python, an object have a very definite type which > strongly defines what you can do with that object, while other languages > are less definitive in that aspect. But in Python, names are NOT that > strongly typed, as a name can be rebound to any sort of object with a > wide variety of types, compared to other languages where before using > (or at first use) a variable you need to declare the 'type' that will be > stored in it, and that type is all that it can hold. That's not strong vs. weak typing, that's dynamic vs. static typing. That said, I agree that different people get this wrong. :-) From remi.lapeyre at henki.fr Sat Mar 16 08:41:45 2019 From: remi.lapeyre at henki.fr (=?UTF-8?Q?R=C3=A9mi_Lapeyre?=) Date: Sat, 16 Mar 2019 05:41:45 -0700 Subject: [Python-ideas] Why operators are useful In-Reply-To: <8aefee76-79f4-7b70-6c67-d286730a0f8e@potatochowder.com> References: <5C8CB629.9040303@canterbury.ac.nz> <195c22b5-5a56-87a7-69e9-b376f51c5e14@Damon-Family.org> <8aefee76-79f4-7b70-6c67-d286730a0f8e@potatochowder.com> Message-ID: Le 16 mars 2019 ? 13:15:37, Dan Sommers (2qdxy4rzwzuuilue at potatochowder.com(mailto:2qdxy4rzwzuuilue at potatochowder.com)) a ?crit: > On 3/16/19 6:17 AM, Richard Damon wrote: > > On 3/16/19 4:39 AM, Greg Ewing wrote: > >> R?mi Lapeyre wrote: > >>> I think this omit a very important property of > >>> mathematic equations thought, maths is a very strongly typed language > >>> which can be a significant improvement for readability. > >> > >> Python is very strongly typed too, so I don't really see how > >> maths is different. > > > > 'Strongly Typed Language' can have slightly different meaning to > > different people. In Python, an object have a very definite type which > > strongly defines what you can do with that object, while other languages > > are less definitive in that aspect. 
But in Python, names are NOT that > > strongly typed, as a name can be rebound to any sort of object with a > > wide variety of types, compared to other languages where before using > > (or at first use) a variable you need to declare the 'type' that will be > > stored in it, and that type is all that it can hold. > > That's not strong vs. weak typing, that's dynamic vs. static typing. > > That said, I agree that different people get this wrong. :-) Yes, I?m dumb. I should have wrote ??maths is a static typed language??. This together with the fact that it is nearly purely functional means that the overhead to know what type a given symbol is is much smaller. If I say ? let f an automorphism over E??, I can write three pages of equations and f will still be the same automorphism and E its associated vector space. I don?t have to very carefully read each intermediate result to make sure I did not bind f to something else. In Python, if I write three pages of code f could be something else so to know its type, I must look at all intermediate lines, including the called functions to know what the operator will refer too. This means that the overhead to track what a given symbol is in Python and much larger than it is in math. It?s already the case in a given function, but it gets worse when some of the names come from arguments, then you have to look in the caller context which may have been written at completely another time, by another team, increasing again the overhead. 
> _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From remi.lapeyre at henki.fr Sat Mar 16 08:41:45 2019 From: remi.lapeyre at henki.fr (=?UTF-8?Q?R=C3=A9mi_Lapeyre?=) Date: Sat, 16 Mar 2019 05:41:45 -0700 Subject: [Python-ideas] Why operators are useful In-Reply-To: <8aefee76-79f4-7b70-6c67-d286730a0f8e@potatochowder.com> References: <5C8CB629.9040303@canterbury.ac.nz> <195c22b5-5a56-87a7-69e9-b376f51c5e14@Damon-Family.org> <8aefee76-79f4-7b70-6c67-d286730a0f8e@potatochowder.com> Message-ID: Le 16 mars 2019 ? 13:15:37, Dan Sommers (2qdxy4rzwzuuilue at potatochowder.com(mailto:2qdxy4rzwzuuilue at potatochowder.com)) a ?crit: > On 3/16/19 6:17 AM, Richard Damon wrote: > > On 3/16/19 4:39 AM, Greg Ewing wrote: > >> R?mi Lapeyre wrote: > >>> I think this omit a very important property of > >>> mathematic equations thought, maths is a very strongly typed language > >>> which can be a significant improvement for readability. > >> > >> Python is very strongly typed too, so I don't really see how > >> maths is different. > > > > 'Strongly Typed Language' can have slightly different meaning to > > different people. In Python, an object have a very definite type which > > strongly defines what you can do with that object, while other languages > > are less definitive in that aspect. But in Python, names are NOT that > > strongly typed, as a name can be rebound to any sort of object with a > > wide variety of types, compared to other languages where before using > > (or at first use) a variable you need to declare the 'type' that will be > > stored in it, and that type is all that it can hold. > > That's not strong vs. weak typing, that's dynamic vs. static typing. > > That said, I agree that different people get this wrong. :-) Yes, I?m dumb. I should have wrote ??maths is a static typed language??. 
This, together with the fact that it is nearly purely functional, means that the overhead to know what type a given symbol is is much smaller. If I say "let f be an automorphism over E", I can write three pages of equations and f will still be the same automorphism and E its associated vector space. I don't have to very carefully read each intermediate result to make sure I did not bind f to something else. In Python, if I write three pages of code, f could be something else, so to know its type I must look at all intermediate lines, including the called functions, to know what the operator will refer to. This means that the overhead to track what a given symbol is is much larger in Python than it is in math. It's already the case within a given function, but it gets worse when some of the names come from arguments: then you have to look in the caller context, which may have been written at a completely different time, by another team, increasing the overhead again. > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From remi.lapeyre at henki.fr Sat Mar 16 08:57:09 2019 From: remi.lapeyre at henki.fr (=?UTF-8?Q?R=C3=A9mi_Lapeyre?=) Date: Sat, 16 Mar 2019 05:57:09 -0700 Subject: [Python-ideas] Why operators are useful In-Reply-To: <5C8CB629.9040303@canterbury.ac.nz> References: <5C8CB629.9040303@canterbury.ac.nz> Message-ID: On 16 March 2019 at 10:02:31, Greg Ewing (greg.ewing at canterbury.ac.nz) wrote: > Rémi Lapeyre wrote: > > I think this omits a very important property of > > mathematic equations though: maths is a very strongly typed language, > > which can be a significant improvement for readability. > > Python is very strongly typed too, so I don't really see how > maths is different. Sorry, this should have read "maths is a statically typed language".
For example, in Python I can write: def inverse(x): return x ** (-1) But this would never be accepted in maths; I should say one of: f: R -> R, x -> x ** (-1); f: R+* -> R, x -> x ** (-1); f: [1; +oo[ -> R, x -> x ** (-1); f: GLn(K) -> GLn(K), x -> x ** (-1). And in all those examples, ** would have meant something very different, and the resulting objects f are very different. For example, the third one is Lipschitz continuous but not the first. On the other hand, I know nothing regarding the inverse function in Python. Knowing nothing about `inverse` means that every time I use it I must determine what it means in the given context. > > For example, a > > mathematician working within the space of linear maps over a vector > > space will easily recognize the meaning of every symbol in: > > > > f(a * x + y) = a * f(x) + f(y) > > Yes, but he has to remember what types are associated with > the variables -- nothing at their point of use indicates that. > Likewise, the reader of a Python program has to remember what > type of object each name is expected to be bound to. If he > can remember that, he will know what all the operators do. The overhead to track the associated type for a given name in maths is far lower since it is a functional language. In maths, I can just make a mental note of it and be done with it; in Python, you can never be sure the type of the bound object did not change unexpectedly. > -- > Greg > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From kirillbalunov at gmail.com Sat Mar 16 09:02:16 2019 From: kirillbalunov at gmail.com (Kirill Balunov) Date: Sat, 16 Mar 2019 16:02:16 +0300 Subject: [Python-ideas] Why operators are useful In-Reply-To: References: Message-ID: Thank you for this very thoughtful message!
It reminded me of my first experience with the old Fortran code. You probably know that earlier in Fortran there were no cryptic shortcuts for writing relational operators: instead of `A >= B`, you had to write `A .GE. B`, or as many often wrote this without spaces `A.GE.B`. Even after a decent time, I still mentally stop and linger on these places. It's amazing that, before your message, I never thought about the difference in perception between `>=` and `.GE.`. It seems to me that, from the perception point of view, the main difference is that `A .GE. B` consists of readable characters and therefore we try to read them, while `A >= B` is perceived as a single structure (picture) due to unreadable `>=`. And our brain is much better at pattern matching than at reading. The same is true in Python in the difference between the operator and method forms: `a >= b` and `a.__ge__(b)`. If we draw an analogy for dictionaries between: a | b # (my preference) over `a + b` (1) and d = d1.copy() (2) d = d.update(d2) The (1) is perceived as a picture, while (2) is perceived as a short story. And you have to read it, and spend some extra time, and spend some extra energy. English is not my mother tongue, so I'm not sure that my words correctly convey the meaning of the analogy. Offtopic: To be honest, the idea of `+` operator overloading for something non-numeric still does not fully fit in my numerically oriented mind. If I started from the beginning, I would introduce a special dunder for concatenation (__concat__) with the corresponding operator, something like seq1 .. seq2 or seq1 ~ seq2. But that ship has long sailed. With kind regards, -gdg On Fri, 15 Mar 2019 at 20:52, Guido van Rossum wrote: > There's been a lot of discussion about an operator to merge two dicts. I > participated in the beginning but quickly felt overwhelmed by the endless > repetition, so I muted most of the threads.
> > But I have been thinking about the reason (some) people like operators, > and a discussion I had with my mentor Lambert Meertens over 30 years ago > came to mind. > > For mathematicians, operators are essential to how they think. Take a > simple operation like adding two numbers, and try exploring some of its > behavior. > > add(x, y) == add(y, x) (1) > > Equation (1) expresses the law that addition is commutative. It's usually > written using an operator, which makes it more concise: > > x + y == y + x (1a) > > That feels like a minor gain. > > Now consider the associative law: > > add(x, add(y, z)) == add(add(x, y), z) (2) > > Equation (2) can be rewritten using operators: > > x + (y + z) == (x + y) + z (2a) > > This is much less confusing than (2), and leads to the observation that > the parentheses are redundant, so now we can write > > x + y + z (3) > > without ambiguity (it doesn't matter whether the + operator binds tighter > to the left or to the right). > > Many other laws are also written more easily using operators. Here's one > more example, about the identity element of addition: > > add(x, 0) == add(0, x) == x (4) > > compare to > > x + 0 == 0 + x == x (4a) > > The general idea here is that once you've learned this simple notation, > equations written using them are easier to *manipulate* than equations > written using functional notation -- it is as if our brains grasp the > operators using different brain machinery, and this is more efficient. > > I think that the fact that formulas written using operators are more > easily processed *visually* has something to do with it: they engage the > brain's visual processing machinery, which operates largely subconsciously, > and tells the conscious part what it sees (e.g. "chair" rather than "pieces > of wood joined together"). 
The functional notation must take a different > path through our brain, which is less subconscious (it's related to reading > and understanding what you read, which is learned/trained at a much later > age than visual processing). > > The power of visual processing really becomes apparent when you combine > multiple operators. For example, consider the distributive law: > > mul(n, add(x, y)) == add(mul(n, x), mul(n, y)) (5) > > That was painful to write, and I believe that at first you won't see the > pattern (or at least you wouldn't have immediately seen it if I hadn't > mentioned this was the distributive law). > > Compare to: > > n * (x + y) == n * x + n * y (5a) > > Notice how this also uses relative operator priorities. Often > mathematicians write this even more compact: > > n(x+y) == nx + ny (5b) > > but alas, that currently goes beyond the capacities of Python's parser. > > Another very powerful aspect of operator notation is that it is convenient > to apply them to objects of different types. For example, laws (1) through > (5) also work when n, x, y and z are same-size vectors (substituting a > vector of zeros for the literal "0"), and also if x, y and z are matrices > (note that n has to be a scalar). > > And you can do this with objects in many different domains. For example, > the above laws (1) through (5) apply to functions too (n being a scalar > again). > > By choosing the operators wisely, mathematicians can employ their visual > brain to help them do math better: they'll discover new interesting laws > sooner because sometimes the symbols on the blackboard just jump at you and > suggest a path to an elusive proof. > > Now, programming isn't exactly the same activity as math, but we all know > that Readability Counts, and this is where operator overloading in Python > comes in. 
Once you've internalized the simple properties which operators > tend to have, using + for string or list concatenation becomes more > readable than a pure OO notation, and (2) and (3) above explain (in part) > why that is. > > Of course, it's definitely possible to overdo this -- then you get Perl. > But I think that the folks who point out "there is already a way to do > this" are missing the point that it really is easier to grasp the meaning > of this: > > d = d1 + d2 > > compared to this: > > d = d1.copy() > d = d1.update(d2) > > and it is not just a matter of fewer lines of code: the first form allows > us to use our visual processing to help us see the meaning quicker -- and > without distracting other parts of our brain (which might already be > occupied by keeping track of the meaning of d1 and d2, for example). > > Of course, everything comes at a price. You have to learn the operators, > and you have to learn their properties when applied to different object > types. (This is true in math too -- for numbers, x*y == y*x, but this > property does not apply to functions or matrices; OTOH x+y == y+x applies > to all, as does the associative law.) > > "But what about performance?" I hear you ask. Good question. IMO, > readability comes first, performance second. And in the basic example (d = > d1 + d2) there is no performance loss compared to the two-line version > using update, and a clear win in readability. I can think of many > situations where performance difference is irrelevant but readability is of > utmost importance, and for me this is the default assumption (even at > Dropbox -- our most performance critical code has already been rewritten in > ugly Python or in Go). For the few cases where performance concerns are > paramount, it's easy to transform the operator version to something else -- > *once you've confirmed it's needed* (probably by profiling). 
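To make the semantics under debate concrete, here is a minimal sketch of a dict subclass whose + behaves like the copy-then-update idiom described above (right-hand values win on duplicate keys). This is purely illustrative -- no such operator exists on built-in dicts at this point in the thread, and the class name is made up:

```python
class MergeDict(dict):
    """Illustrative only: a dict whose + mirrors d = d1.copy(); d.update(d2)."""

    def __add__(self, other):
        if not isinstance(other, dict):
            return NotImplemented
        new = MergeDict(self)  # shallow copy of the left operand
        new.update(other)      # values from the right operand win
        return new

d1 = MergeDict(a=1, b=2)
d2 = MergeDict(b=3, c=4)
print(d1 + d2)  # {'a': 1, 'b': 3, 'c': 4}
```

Note that, as Serhiy points out earlier in the thread, a real operator would have to decide how to construct a result of the right type for arbitrary mappings; the sketch sidesteps that question by always returning its own class.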
> > -- > --Guido van Rossum (python.org/~guido) > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kirillbalunov at gmail.com Sat Mar 16 09:29:43 2019 From: kirillbalunov at gmail.com (Kirill Balunov) Date: Sat, 16 Mar 2019 16:29:43 +0300 Subject: [Python-ideas] Why operators are useful In-Reply-To: References: Message-ID: On Sat, 16 Mar 2019 at 16:02, Kirill Balunov wrote: > Thank you for this very thoughtful message! It reminded me of my first > experience with the old Fortran code. You probably know that earlier in > Fortran there were no cryptic shortcuts for writing relational operators: > instead of `A >= B`, you had to write `A .GE. B`, or as many often wrote > this without spaces `A.GE.B`. Even after a decent time, I still mentally > stop and linger on these places. It's amazing that, before your message, I > never thought about the difference in perception between `>=` and `.GE.`. > It seems to me that, from the perception point of view, the main difference > is that `A .GE. B` consists of readable characters and therefore we try > to read them, while `A >= B` is perceived as a single structure (picture) > due to unreadable `>=`. And our brain is much better at pattern > matching than at reading. The same is true in Python in the difference > between the operator and method forms: `a >= b` and `a.__ge__(b)`. If we > draw an analogy for dictionaries between: > > a | b # (my preference) over `a + b` (1) > > and > > d = d1.copy() (2) > d = d.update(d2) > > of course d = d1.copy() (2) d.update(d2) just copy-pasted your example without any thought :) > The (1) is perceived as a picture, while (2) is perceived as a short > story. And you have to read it, and spend some extra time, and spend some > extra energy.
English is not my mother tongue, so I'm not sure that my > words correctly convey the meaning of the analogy. > > Offtopic: To be honest, the idea of `+` operator overloading for > something non-numeric still does not fully fit in my numerically oriented > mind. If I started from the beginning, I would introduce a special dunder > for concatenation (__concat__) with the corresponding operator, something > like seq1 .. seq2 or seq1 ~ seq2. But that ship has long sailed. > > With kind regards, > -gdg > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Richard at Damon-Family.org Sat Mar 16 13:51:07 2019 From: Richard at Damon-Family.org (Richard Damon) Date: Sat, 16 Mar 2019 13:51:07 -0400 Subject: [Python-ideas] Why operators are useful In-Reply-To: <8aefee76-79f4-7b70-6c67-d286730a0f8e@potatochowder.com> References: <5C8CB629.9040303@canterbury.ac.nz> <195c22b5-5a56-87a7-69e9-b376f51c5e14@Damon-Family.org> <8aefee76-79f4-7b70-6c67-d286730a0f8e@potatochowder.com> Message-ID: <2e231486-ce78-2a53-2e91-900115526f24@Damon-Family.org> On 3/16/19 8:14 AM, Dan Sommers wrote: > On 3/16/19 6:17 AM, Richard Damon wrote: >> On 3/16/19 4:39 AM, Greg Ewing wrote: >>> Rémi Lapeyre wrote: >>>> I think this omits a very important property of >>>> mathematic equations though: maths is a very strongly typed language, >>>> which can be a significant improvement for readability. >>> >>> Python is very strongly typed too, so I don't really see how >>> maths is different. >> >> 'Strongly Typed Language' can have slightly different meanings to >> different people. In Python, an object has a very definite type which >> strongly defines what you can do with that object, while other languages >> are less definitive in that aspect.
But in Python, names are NOT that >> strongly typed, as a name can be rebound to any sort of object with a >> wide variety of types, compared to other languages where before using >> (or at first use) a variable you need to declare the 'type' that will be >> stored in it, and that type is all that it can hold. > > That's not strong vs. weak typing, that's dynamic vs. static typing. > > That said, I agree that different people get this wrong. :-) As I said, different meanings to different people. Some consider that dynamic typing implies not entirely strong typing (since the name doesn't have a well-known type). -- Richard Damon From greg at krypto.org Sat Mar 16 14:02:54 2019 From: greg at krypto.org (Gregory P. Smith) Date: Sat, 16 Mar 2019 11:02:54 -0700 Subject: [Python-ideas] Why operators are useful In-Reply-To: References: <20190316103330.GL12502@ando.pearwood.info> Message-ID: On Sat, Mar 16, 2019 at 5:02 AM Gustavo Carneiro wrote:
but "can we support the most >> typical use-case?", which I believe is a version of: >> >> new = a.copy() >> new.update(b) >> # do something with new >> > > Already been said, but might have been forgotten, but the new proposed > syntax: > > new = a + b > > has to compete with the already existing syntax: > > new = {**a, **b} > > The existing syntax is not exactly an operator in the mathematical sense > (or is it?...), but my intuition is that it already triggers the visual > processing part of the brain, similarly to operators. > > The only argument for "a + b" in detriment of "{**a, **b}" is that "a + b" > is more easy to discover, while not many programmers are familiar with > "{**a, **b}". > > I wonder if this is only a matter of time, and over time programmers will > become more accustomed to "{**a, **b}", thereby reducing the relative > benefit of "a + b"? Especially as more and more developers migrate code > bases from Python 2 to Python 3... > FWIW, even as a core developer I had forgotten that the {**a, **b} syntax existed, thanks for the reminder! :) But that's more likely because I rarely write code that needs to update and merge a dict or when i do it's still 2and3 compatible. Antoine said: > If "+" is added to dicts, then we're overloading an already heavily used operator. It makes reading code more difficult. This really resonated with me. Reading code gives you a feel for what possible types something could be. The set of possibilities for + is admittedly already quite large in Python. But making an existing core type start supporting + *reduces the information given to the reader* by that one line of code. They now have more possibilities to consider and must seek hints from more surrounding code. For type inferencers such as us humans or tools like pytype , it means we need to consider which version Python's dict the code may be running under in order to infer what it may mean from the code's context. 
For tooling, that's just a flag and a matter of conditionally changing code defining dict, but for humans they need to carry the possibility of that flag with them everywhere. We should just promote the use of {**d1, **d2} syntax for anyone who wants an inline updated copy. Why? (1) It already exists. (insert zen of python quote here) (2) Copying via the + operator encourages inefficient code (already true for bytes/str/list). A single + is fine. But the natural human extension to that when people want to merge a bunch of things is to use a string of multiple operators, because that is how we're taught math. No matter what type we're talking about, in Python this is an efficiency antipattern. z = a + b + c + d + e That's four __add__ calls. Each of which is a copy+update/extend/append operation for a dict/list/str respectively. We already tell people not to do that with bytes and lists, instead using b''.join(a,b,c,d) or z = []; z.extend(X)... calls or something from itertools. Given dict addition, it'd always be more efficient to join the "don't use tons of + operators" club (a good lint warning) and write that as z = {**a, **b, **c, **d, **e}. Unless the copy+update concept is an extremely common operation, having more than one way to do it feels like it'll cause more cognitive harm than good. Now (2) could *also* be used as an argument that Python should detect chains of operators and allow those to be optimized. That'd be a PEP of its own and is complicated to do; technically a semantic change given how dynamic Python is, as we do name lookups at time of use and each __add__ call could potentially have side effects changing the results of future name lookups (the a+b could change the meaning of c). 
Yes, that is horrible and people writing code that does that deserve very bad things, but those are the semantics we'd be breaking if we tried to detect and support a mythical new construct like __chained_add__ being invoked when the types of all elements being added sequentially are identical (how identical? do subtypes count? see, complicated). a + b + c: 0 LOAD_GLOBAL 0 (a) 2 LOAD_GLOBAL 1 (b) 4 BINARY_ADD 6 LOAD_GLOBAL 2 (c) 8 BINARY_ADD -gps -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjreedy at udel.edu Sat Mar 16 19:13:04 2019 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 16 Mar 2019 19:13:04 -0400 Subject: [Python-ideas] Why operators are useful In-Reply-To: References: <20190316103330.GL12502@ando.pearwood.info> Message-ID: On 3/16/2019 8:01 AM, Gustavo Carneiro wrote: > On Sat, 16 Mar 2019 at 10:33, Steven D'Aprano > > wrote: > The question this PEP is trying to answer is not "can we support every > use-case imaginable for a merge operator?" but "can we support the most > typical use-case?", which I believe is a version of: > > ? ? new = a.copy() > ? ? new.update(b) > ? ? # do something with new In my census of the stdlib, already posted and noted as subject to error, this was twice as common as all other non-update-in-place constructions (8 to 4) and about 1/4 as common as update in place (8 to 35). > Already been said, but might have been forgotten, but the new proposed > syntax: > ? ? new = a?+ b > has to compete with the already existing syntax: > ? ? new = {**a, **b} Thank you and whoever mentioned it first on this thread. I will look at using this in idlelib. There is one place where .update is called 3 times on the same initial dict in multiple lines. 
> I wonder if this is only a matter of time, and over time programmers > will become more accustomed to "{**a, **b}" I never paid this much attention as I did not know of any immediate use in my personal work (and there has not been yet) and I did not think about looking at idlelib. -- Terry Jan Reedy From greg.ewing at canterbury.ac.nz Sat Mar 16 19:45:19 2019 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 17 Mar 2019 12:45:19 +1300 Subject: [Python-ideas] Why operators are useful In-Reply-To: <195c22b5-5a56-87a7-69e9-b376f51c5e14@Damon-Family.org> References: <5C8CB629.9040303@canterbury.ac.nz> <195c22b5-5a56-87a7-69e9-b376f51c5e14@Damon-Family.org> Message-ID: <5C8D8A8F.2040104@canterbury.ac.nz> Richard Damon wrote: > R?mi, I believe, is assuming in their example that by defining the field > of mathematics being used, there is at least an implicit definition (if > not actually explicit as such a statement would typically be preceded by > definitions) definition of the types of the variables. In Python, we have such implicit definitions in the form of comments, and inferences from the way things are used. > when looking at a piece of code you > don't necessarily know the types of the objects being used And if you look at an equation from a mathematics text without the context in which it appears, you won't always know what it means either. -- Greg From francismb at email.de Sun Mar 17 08:01:48 2019 From: francismb at email.de (francismb) Date: Sun, 17 Mar 2019 13:01:48 +0100 Subject: [Python-ideas] Left arrow and right arrow operators In-Reply-To: References: <4fd9518d-3707-7f9c-751a-b2eefa172634@email.de> <20190303150638.zv24yqah625uiypj@phdru.name> <5C830BE8.1030606@canterbury.ac.nz> <6d8e6422-9635-fd4a-439d-7529fbca7c4a@email.de> Message-ID: Hi Nick, On 3/12/19 3:57 PM, Nick Timkovich wrote: > The onus is on you > to positively demonstrate you require both directions, not him to > negatively demonstrate it's never required. 
>From Calvin I just wanted to have some examples where he sees a use for swapping operands (nothing to be demonstrated :-) ). But I really just wanted to talk some *visual asymmetric form* that could be used as operator for potentially asymmetric operations and thought that the arrow could be one of this. So you're correct one should discuss with *form* could potentially be wider accepted/work. The debate on this is going on the thread: "Why operators are useful". Regards, --francis From francismb at email.de Sun Mar 17 08:08:08 2019 From: francismb at email.de (francismb) Date: Sun, 17 Mar 2019 13:08:08 +0100 Subject: [Python-ideas] Left arrow and right arrow operators In-Reply-To: <375e1160-d475-7324-8d41-88df11e5a3fd@email.de> References: <4fd9518d-3707-7f9c-751a-b2eefa172634@email.de> <20190303150638.zv24yqah625uiypj@phdru.name> <5C830BE8.1030606@canterbury.ac.nz> <6d8e6422-9635-fd4a-439d-7529fbca7c4a@email.de> <375e1160-d475-7324-8d41-88df11e5a3fd@email.de> Message-ID: On 3/15/19 9:02 PM, francismb wrote: > And the operator is the function.exactly, function application/call From francismb at email.de Sun Mar 17 10:08:30 2019 From: francismb at email.de (francismb) Date: Sun, 17 Mar 2019 15:08:30 +0100 Subject: [Python-ideas] Code version evolver In-Reply-To: References: <20190311232501.GP12502@ando.pearwood.info> <13143e69-2a4a-382b-ea59-cf26bdceb5d1@email.de> <1ece0df9-120a-21fd-e319-834505b7ba72@email.de> Message-ID: <80e850ba-7516-a6a8-4a56-34a4357b5959@email.de> On 3/15/19 11:09 PM, Chris Angelico wrote: > Python 3.5 introduced the modulo operator for bytes objects. How are > you going to write a function that determines whether or not a piece > of code depends on this? I'm not sure I understand the question. Isn't *a piece of code* that does a modulo operation on a bytes type object *at least* 3.5 python code ? or needs *at least* that version to run ? 
Thanks, --francis From rosuav at gmail.com Sun Mar 17 10:13:29 2019 From: rosuav at gmail.com (Chris Angelico) Date: Mon, 18 Mar 2019 01:13:29 +1100 Subject: [Python-ideas] Code version evolver In-Reply-To: <80e850ba-7516-a6a8-4a56-34a4357b5959@email.de> References: <20190311232501.GP12502@ando.pearwood.info> <13143e69-2a4a-382b-ea59-cf26bdceb5d1@email.de> <1ece0df9-120a-21fd-e319-834505b7ba72@email.de> <80e850ba-7516-a6a8-4a56-34a4357b5959@email.de> Message-ID: On Mon, Mar 18, 2019 at 1:09 AM francismb wrote: > > On 3/15/19 11:09 PM, Chris Angelico wrote: > > Python 3.5 introduced the modulo operator for bytes objects. How are > > you going to write a function that determines whether or not a piece > > of code depends on this? > I'm not sure I understand the question. Isn't *a piece of code* that > does a modulo operation on a bytes type object *at least* 3.5 python > code ? or needs *at least* that version to run ? > Yes, it will. Can you determine whether some code does this? Can you recognize what kind of object is on the left of a percent sign? Remember, it quite possibly won't be a literal. ChrisA From francismb at email.de Sun Mar 17 10:21:14 2019 From: francismb at email.de (francismb) Date: Sun, 17 Mar 2019 15:21:14 +0100 Subject: [Python-ideas] Code version evolver In-Reply-To: References: <20190311232501.GP12502@ando.pearwood.info> <13143e69-2a4a-382b-ea59-cf26bdceb5d1@email.de> <1ece0df9-120a-21fd-e319-834505b7ba72@email.de> Message-ID: <4dcebd08-e471-425d-a056-6499f64fc243@email.de> On 3/15/19 11:09 PM, Chris Angelico wrote: > And, are you going to run this function on every single code snippet > before you try it? If just trying, may be not. But yes, if I care to know where the applicability limits are (interpreter versions) before integrating it. IMHO I don't think it's a good practice to integrate a snippet of code without knowing it, so I would use that function (or may be the service :-) ) if the possibility existed. 
Regards, --francis From stephie.maths at gmail.com Sun Mar 17 12:35:29 2019 From: stephie.maths at gmail.com (Savant Of Illusions) Date: Sun, 17 Mar 2019 12:35:29 -0400 Subject: [Python-ideas] New Data Structure - Non Well-Founded Dict Message-ID: I am in desperate need of a dict similar structure that allows sets and/or dicts as keys *and* values. My application is NLP conceptual plagiarism detection. Dealing with infinite grammars communicating illogical concepts. Would be even better if keys could nest the same data structure, e.g. set(s) or dict(s) in set(s) or dict(s) of the set(s) or dict(s) as key(s). In order to detect conceptual plagiarism, I need to populate a data structure with if/then equivalents as a decision tree. But my equivalents have potentially infinite ways of arranging them syntactically* and* semantically. A dict having keys with identical set values treats each key as a distinct element. I am dealing with semantics or elemental equivalents and many different statements treated as equivalent statements involving if/then (key/value) or a implies b, where a and/or b can be an element or an if/then as an element. Modeling the syntactic equivalences of such claims is paramount, and in order to do that, I need the data structure. Hello, I am Stephanie. I have never contributed to any open source. I am about intermediate at python and I am a self-directed learner/hobbyist. I am trying to prove with my code that a particular very famous high profile pop debate intellectual is plagiarizing Anders Breivik. I can show it via observation, but his dishonesty is dispersed among many different talks/lectures. I am dealing with a large number of speaking hours as transcripts containing breadcrumbs that are very difficult for a human to piece together as having come from the manifesto which is 1515 pages and about half copied from other sources. The concepts stolen are rearrangements and reorganizations of the same identical claims and themes. 
He occasionally uses literal string plagiarism but not very much at once. He is very good at elaboration which makes it even more difficult. Thank you, for your time, Stephanie -------------- next part -------------- An HTML attachment was scrubbed... URL: From turnbull.stephen.fw at u.tsukuba.ac.jp Sun Mar 17 13:30:29 2019 From: turnbull.stephen.fw at u.tsukuba.ac.jp (Stephen J. Turnbull) Date: Mon, 18 Mar 2019 02:30:29 +0900 Subject: [Python-ideas] Code version evolver In-Reply-To: <949f979b-f6ef-c9e8-a7de-b148d437dc12@email.de> References: <23357464-8b15-ff45-824e-08ad0970ab03@email.de> <23691.8720.682485.692756@turnbull.sk.tsukuba.ac.jp> <949f979b-f6ef-c9e8-a7de-b148d437dc12@email.de> Message-ID: <23694.33845.412411.797340@turnbull.sk.tsukuba.ac.jp> francismb writes: > On 3/15/19 4:54 AM, Stephen J. Turnbull wrote: > > What 2to3 does is to handle a lot of automatic conversions, such as > > flipping the identifiers from str to bytes and unicode to str. It was > > necessary to have some such tool because of the very large amount of > > such menial work needed to change a 2 code base to a 3 code base. But > > even so, there were things that 2to3 couldn't do, and it often exposed > > bugs or very poor practice (decode applied to unicode objects, encode > > applied to bytes) that had to be reworked by the developer anyway. > Very interesting from the 2/3 transition experience point of view. But > that's not still the past, IMHO that will be after 2020,... around 2025 :-) Yeah, I did a lightning talk on that about 3 years ago (whenever the 2020 Olympics was awarded to Tokyo, which was pretty much simultaneous with the start of the EOL-for-Python-2 clock -- the basic fantasy was that "Python 2 is the common official business-oriented language of the Tokyo Olympics and Paralympics", and the punch line was "no gold for programmers, just job security"). But so what? My point is that 2to3 development itself is the past. I don't think anybody's working on it at all now. 
The question you asked is "we have 2to3, why not 3.Xto3.Y?" and my answer is "here's why 2to3 was worth the effort, 3.X upgrades are quite different and it's not worth it". > Could one also say that under the line that it *improved* the code? > (by exposing bugs, bad practices) could be a first step to just > *flag* those behaviors/changes ? Probably not. So many lines of code need to be changed to go from 2 to 3 that most likely the first release after conversion is a pile of dungbeetles. Remember, some Python 2 code uses str as more or less opaque bytes, other code use it as "I don't need no stinkin' Unicode" text (works fine for monolingual environments with 8-bit encodings, after all). So it doesn't even do a great job for 'str' vs 'unicode' vs 'bytes'. No automatic conversion could do more than a 50% job for most medium-size projects, and every line of code changed has some probability of introducing a bug. If there were a lot of bugs to start with, that probability goes up -- and a lot of lines change, implying a lot of *new* bugs. It's hard for a syntax-based tool to find enough old bugs to keep up with the proliferation of new ones. You really should have given up on this by now. It's not that it's a bad idea: 2to3 wasn't just a good idea, it was a necessary idea in its context. But the analogy for within-3 upgrades doesn't hold, and it's not hard to see why it doesn't once you have the basic facts (conservative policy toward backwards compatibility, even across major versions). I could be wrong, but I don't think there's much for you to learn by prolonging the thread. Unless you actually code an upgrade tool yourself -- then you'll learn a *ton*. That's not my idea of fun, though. 
:-) Steve From mertz at gnosis.cx Sun Mar 17 16:23:45 2019 From: mertz at gnosis.cx (David Mertz) Date: Sun, 17 Mar 2019 16:23:45 -0400 Subject: [Python-ideas] New Data Structure - Non Well-Founded Dict In-Reply-To: References: Message-ID: This is an interesting challenge you have. However, this list is for proposing ideas for changes in the Python language itself, in particular the CPython reference implementation. Python-list or some discussion site dealing with machine learning or natural language processing would be appropriate for the task you are trying to figure out. I suspect that third party libraries contain the data structures you need, but I cannot recommend anything specific from my experience. On Sun, Mar 17, 2019, 12:39 PM Savant Of Illusions wrote: > I am in desperate need of a dict similar structure that allows sets and/or > dicts as keys *and* values. My application is NLP conceptual plagiarism > detection. Dealing with infinite grammars communicating illogical > concepts. Would be even better if keys could nest the same data structure, > e.g. set(s) or dict(s) in set(s) or dict(s) of the set(s) or dict(s) as > key(s). > > In order to detect conceptual plagiarism, I need to populate a data > structure with if/then equivalents as a decision tree. But my equivalents > have potentially infinite ways of arranging them syntactically* and* > semantically. > > A dict having keys with identical set values treats each key as a distinct > element. I am dealing with semantics or elemental equivalents and many > different statements treated as equivalent statements involving if/then > (key/value) or a implies b, where a and/or b can be an element or an > if/then as an element. Modeling the syntactic equivalences of such claims > is paramount, and in order to do that, I need the data structure. > > Hello, I am Stephanie. I have never contributed to any open source. I am > about intermediate at python and I am a self-directed learner/hobbyist. 
I > am trying to prove with my code that a particular very famous high profile > pop debate intellectual is plagiarizing Anders Breivik. I can show it via > observation, but his dishonesty is dispersed among many different > talks/lectures. I am dealing with a large number of speaking hours as > transcripts containing breadcrumbs that are very difficult for a human to > piece together as having come from the manifesto which is 1515 pages and > about half copied from other sources. The concepts stolen are > rearrangements and reorganizations of the same identical claims and themes. > He occasionally uses literal string plagiarism but not very much at once. > He is very good at elaboration which makes it even more difficult. > > Thank you, for your time, > Stephanie > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Sun Mar 17 18:34:01 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 18 Mar 2019 09:34:01 +1100 Subject: [Python-ideas] Code version evolver In-Reply-To: References: <20190311232501.GP12502@ando.pearwood.info> <13143e69-2a4a-382b-ea59-cf26bdceb5d1@email.de> <1ece0df9-120a-21fd-e319-834505b7ba72@email.de> <80e850ba-7516-a6a8-4a56-34a4357b5959@email.de> Message-ID: <20190317223401.GO12502@ando.pearwood.info> On Mon, Mar 18, 2019 at 01:13:29AM +1100, Chris Angelico wrote: [...] > Yes, it will. Can you determine whether some code does this? Can you > recognize what kind of object is on the left of a percent sign? > Remember, it quite possibly won't be a literal. I don't understand whether your question is asking if Francis *personally* can do this, or if it is possible in principle. 
If the latter, then inferring the type of expressions is precisely the sort of thing that mypy (and others) do. -- Steven From rosuav at gmail.com Sun Mar 17 19:59:25 2019 From: rosuav at gmail.com (Chris Angelico) Date: Mon, 18 Mar 2019 10:59:25 +1100 Subject: [Python-ideas] Code version evolver In-Reply-To: <20190317223401.GO12502@ando.pearwood.info> References: <20190311232501.GP12502@ando.pearwood.info> <13143e69-2a4a-382b-ea59-cf26bdceb5d1@email.de> <1ece0df9-120a-21fd-e319-834505b7ba72@email.de> <80e850ba-7516-a6a8-4a56-34a4357b5959@email.de> <20190317223401.GO12502@ando.pearwood.info> Message-ID: On Mon, Mar 18, 2019 at 9:34 AM Steven D'Aprano wrote: > > On Mon, Mar 18, 2019 at 01:13:29AM +1100, Chris Angelico wrote: > [...] > > Yes, it will. Can you determine whether some code does this? Can you > > recognize what kind of object is on the left of a percent sign? > > Remember, it quite possibly won't be a literal. > > I don't understand whether your question is asking if Francis > *personally* can do this, or if it is possible in principle. > > If the latter, then inferring the type of expressions is precisely the > sort of thing that mypy (and others) do. Kinda somewhere between. Francis keeps saying "oh, just make a source code rewriter", and I'm trying to point out that (1) that is NOT an easy thing to do - sure, there are easy cases, but there are also some extremely hard ones; and (2) even if it could magically be made to work, it would still have (and cause) problems. ChrisA From cs at cskk.id.au Sun Mar 17 22:16:51 2019 From: cs at cskk.id.au (Cameron Simpson) Date: Mon, 18 Mar 2019 13:16:51 +1100 Subject: [Python-ideas] New Data Structure - Non Well-Founded Dict In-Reply-To: References: Message-ID: <20190318021651.GA39702@cskk.homeip.net> And in case it wasn't clear, "python-list" is here: python-list at python.org Please try posting the same question there instead.
Cheers, Cameron Simpson On 17Mar2019 16:23, David Mertz wrote: >This is an interesting challenge you have. However, this list is for >proposing ideas for changes in the Python language itself, in particular >the CPython reference implementation. > >Python-list or some discussion site dealing with machine learning or >natural language processing would be appropriate for the task you are >trying to figure out. I suspect that third party libraries contain the data >structures you need, but I cannot recommend anything specific from my >experience. > >On Sun, Mar 17, 2019, 12:39 PM Savant Of Illusions >wrote: > >> I am in desperate need of a dict similar structure that allows sets and/or >> dicts as keys *and* values. My application is NLP conceptual plagiarism >> detection. Dealing with infinite grammars communicating illogical >> concepts. Would be even better if keys could nest the same data structure, >> e.g. set(s) or dict(s) in set(s) or dict(s) of the set(s) or dict(s) as >> key(s). >> >> In order to detect conceptual plagiarism, I need to populate a data >> structure with if/then equivalents as a decision tree. But my equivalents >> have potentially infinite ways of arranging them syntactically* and* >> semantically. >> >> A dict having keys with identical set values treats each key as a distinct >> element. I am dealing with semantics or elemental equivalents and many >> different statements treated as equivalent statements involving if/then >> (key/value) or a implies b, where a and/or b can be an element or an >> if/then as an element. Modeling the syntactic equivalences of such claims >> is paramount, and in order to do that, I need the data structure. >> >> Hello, I am Stephanie. I have never contributed to any open source. I am >> about intermediate at python and I am a self-directed learner/hobbyist. I >> am trying to prove with my code that a particular very famous high profile >> pop debate intellectual is plagiarizing Anders Breivik. 
I can show it via >> observation, but his dishonesty is dispersed among many different >> talks/lectures. I am dealing with a large number of speaking hours as >> transcripts containing breadcrumbs that are very difficult for a human to >> piece together as having come from the manifesto which is 1515 pages and >> about half copied from other sources. The concepts stolen are >> rearrangements and reorganizations of the same identical claims and themes. >> He occasionally uses literal string plagiarism but not very much at once. >> He is very good at elaboration which makes it even more difficult. >> >> Thank you, for your time, >> Stephanie From ja.py at farowl.co.uk Mon Mar 18 03:14:01 2019 From: ja.py at farowl.co.uk (Jeff Allen) Date: Mon, 18 Mar 2019 07:14:01 +0000 Subject: [Python-ideas] New Data Structure - Non Well-Founded Dict In-Reply-To: References: Message-ID: <6a08638f-6534-30e3-5a28-96343be9b06c@farowl.co.uk> Stephanie: Welcome. The "Python idea" here is to allow a broader range of types as keys to a dictionary. The gap appears to be that certain types (like set) "don't work" as keys (or rather their identities not values work), but this is a misunderstanding. A set is mutable: it is as if, in an ordinary dictionary (lexically sorted), one were to allow changes to the spelling of a word while keeping the definition. It's not unreasonable to do, but the entry is now potentially in the wrong place and ought to be re-inserted so someone can find it. Others have rightly suggested python-list as a place you could explore how to construct the data structure you need, using existing features of Python. However, I'll just mention that frozenset is worth a look. Jeff Allen On 17/03/2019 16:35, Savant Of Illusions wrote: > I am in desperate need of a dict similar structure that allows sets > and/or dicts as keys *and* values. My application is NLP conceptual > plagiarism detection.
Dealing with infinite grammars communicating > illogical concepts. Would be even better if keys could nest the same > data structure, e.g. set(s) or dict(s) in set(s) or dict(s) of the > set(s) or dict(s) as key(s). > > In order to detect conceptual plagiarism, I need to populate a data > structure with if/then equivalents as a decision tree. But my > equivalents have potentially infinite ways of arranging them > syntactically *and* semantically. > > A dict having keys with identical set values treats each key as a > distinct element. I am dealing with semantics or elemental equivalents > and many different statements treated as equivalent statements > involving if/then (key/value) or a implies b, where a and/or b can be > an element or an if/then as an element. Modeling the syntactic > equivalences of such claims > is paramount, and in order to do that, I > need the data structure. > > Hello, I am Stephanie. I have never contributed to any open source. I > am about intermediate at python and I am a self-directed > learner/hobbyist. I am trying to prove with my code that a particular > very famous high profile pop debate intellectual is plagiarizing > Anders Breivik. I can show it via observation, but his dishonesty is > dispersed among many different talks/lectures. I am dealing with a > large number of speaking hours as transcripts containing breadcrumbs > that are very difficult for a human to piece together as having come > from the manifesto which is 1515 pages and about half copied from > other sources. The concepts stolen are rearrangements and > reorganizations of the same identical claims and themes. He > occasionally uses literal string plagiarism but not very much at once. > He is very good at elaboration which makes it even more difficult.
> > Thank you, for your time, > Stephanie > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From apalala at gmail.com Mon Mar 18 07:13:31 2019 From: apalala at gmail.com (=?UTF-8?Q?Juancarlo_A=C3=B1ez?=) Date: Mon, 18 Mar 2019 07:13:31 -0400 Subject: [Python-ideas] True and False are singletons Message-ID: It came to my attention that: In the original PEP True and False are said to be singletons https://www.python.org/dev/peps/pep-0285/, but it's not in the Data Model https://docs.python.org/3/reference/datamodel.html This came to my attention by code wanting to own the valid values in a dict's key: if settings[MY_KEY] is True: ... If True and False are singletons in the spec (and not only in the CPython implementation), it should be prominent and well known. Cheers, -- Juancarlo *Añez* -------------- next part -------------- An HTML attachment was scrubbed... URL: From remi.lapeyre at henki.fr Mon Mar 18 07:21:43 2019 From: remi.lapeyre at henki.fr (=?UTF-8?Q?R=C3=A9mi_Lapeyre?=) Date: Mon, 18 Mar 2019 04:21:43 -0700 Subject: [Python-ideas] True and False are singletons In-Reply-To: References: Message-ID: Le 18 mars 2019 à 12:15:05, Juancarlo Añez (apalala at gmail.com(mailto:apalala at gmail.com)) a écrit : > It came to my attention that: > > > In the original PEP True and False are said to be singletons https://www.python.org/dev/peps/pep-0285/, but it's not in the Data Model https://docs.python.org/3/reference/datamodel.html > > This came to my attention by code wanting to own the valid values in a dict's key: > > > if settings[MY_KEY] is True: > > ... > > > > > If True and False are singletons in the spec (and not only in the CPython implementation), it should be prominent and well known.
I think it's what "The two objects representing the values False and True are the only Boolean objects." means. Rémi > Cheers, > > -- > Juancarlo Añez _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From rosuav at gmail.com Mon Mar 18 07:32:38 2019 From: rosuav at gmail.com (Chris Angelico) Date: Mon, 18 Mar 2019 22:32:38 +1100 Subject: [Python-ideas] True and False are singletons In-Reply-To: References: Message-ID: On Mon, Mar 18, 2019 at 10:14 PM Juancarlo Añez wrote: > > It came to my attention that: > > In the original PEP True and False are said to be singletons https://www.python.org/dev/peps/pep-0285/, but it's not in the Data Model https://docs.python.org/3/reference/datamodel.html > > > This came to my attention by code wanting to own the valid values in a dict's key: > > if settings[MY_KEY] is True: > ... > > > If True and False are singletons in the spec (and not only in the CPython implementation), it should be prominent and well known. > "Singleton" technically means that there is only one such object. 'None' is a singleton, by language specification; if type(x) is type(None), you can safely assume that x is None. Booleans are a bit more tricky; there will only ever be those two, but they're two. IMO the PEP is minorly inaccurate to use the word "singleton" there, but it's no big deal. As Remi says, the two built-in ones are the only two instances of that type. ChrisA From greg.ewing at canterbury.ac.nz Mon Mar 18 07:27:04 2019 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 19 Mar 2019 00:27:04 +1300 Subject: [Python-ideas] True and False are singletons In-Reply-To: References: Message-ID: <5C8F8088.4050108@canterbury.ac.nz> Juancarlo Añez wrote: > if settings[MY_KEY] is True: > ...
If I saw code like this, it would take a really good argument to convince me that it shouldn't be just if settings[MY_KEY]: ... -- Greg From Richard at Damon-Family.org Mon Mar 18 08:17:35 2019 From: Richard at Damon-Family.org (Richard Damon) Date: Mon, 18 Mar 2019 08:17:35 -0400 Subject: [Python-ideas] True and False are singletons In-Reply-To: References: Message-ID: <690dd9f6-aaf8-98b4-8e57-9f7327940bb9@Damon-Family.org> On 3/18/19 7:32 AM, Chris Angelico wrote: > On Mon, Mar 18, 2019 at 10:14 PM Juancarlo Añez wrote: >> It came to my attention that: >> >> In the original PEP True and False are said to be singletons https://www.python.org/dev/peps/pep-0285/, but it's not in the Data Model https://docs.python.org/3/reference/datamodel.html >> >> >> This came to my attention by code wanting to own the valid values in a dict's key: >> >> if settings[MY_KEY] is True: >> ... >> >> >> If True and False are singletons in the spec (and not only in the CPython implementation), it should be prominent and well known. >> > "Singleton" technically means that there is only one such object. > 'None' is a singleton, by language specification; if type(x) is > type(None), you can safely assume that x is None. Booleans are a bit > more tricky; there will only ever be those two, but they're two. IMO > the PEP is minorly inaccurate to use the word "singleton" there, but > it's no big deal. As Remi says, the two built-in ones are the only two > instances of that type. > > ChrisA Which says that the type of True or False isn't a singleton, but those particular values are. There may be other objects with values that are Truthy or Falsey, but if the value actually IS True or False, the object WILL be those particular objects. As a comparison, if the tuple (1,2) was described as a Singleton, then any computation that generates that value would all need to return the exact same object, but they don't (the language could make that a true statement).
When converting a Truthy or Falsey value to True or False, Python will ALWAYS grab those particular objects, and not create a new object with the same value, so they are Singletons, even if the type itself isn't. Yes, in many languages the term Singleton is used for Types and not values, and in part that difference is due to the fact that in Python ALL values are objects, and names are just bound to some object, so the idea of Singleton primitive values actually makes sense. -- Richard Damon From Richard at Damon-Family.org Mon Mar 18 08:19:09 2019 From: Richard at Damon-Family.org (Richard Damon) Date: Mon, 18 Mar 2019 08:19:09 -0400 Subject: [Python-ideas] True and False are singletons In-Reply-To: <5C8F8088.4050108@canterbury.ac.nz> References: <5C8F8088.4050108@canterbury.ac.nz> Message-ID: On 3/18/19 7:27 AM, Greg Ewing wrote: > Juancarlo Añez wrote: > >> if settings[MY_KEY] is True: >> ... > > If I saw code like this, it would take a really good argument to > convince me that it shouldn't be just > > if settings[MY_KEY]: > ... > That means something VERY different. The first asks if the item is specifically the True value, while the second just asks if the value is Truthy; it would be satisfied also for values like 1. -- Richard Damon From phd at phdru.name Mon Mar 18 08:08:22 2019 From: phd at phdru.name (Oleg Broytman) Date: Mon, 18 Mar 2019 13:08:22 +0100 Subject: [Python-ideas] True and False are singletons In-Reply-To: <5C8F8088.4050108@canterbury.ac.nz> References: <5C8F8088.4050108@canterbury.ac.nz> Message-ID: <20190318120822.t36352yku5qzuu3d@phdru.name> On Tue, Mar 19, 2019 at 12:27:04AM +1300, Greg Ewing wrote: > Juancarlo Añez wrote: > > > if settings[MY_KEY] is True: > > ... > > If I saw code like this, it would take a really good argument to > convince me that it shouldn't be just > > if settings[MY_KEY]: > ... Three-way (tri state) checkbox.
You have to distinguish False and None if the possible values are None, False and True. > -- > Greg Oleg. -- Oleg Broytman https://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN. From remi.lapeyre at henki.fr Mon Mar 18 08:51:08 2019 From: remi.lapeyre at henki.fr (=?UTF-8?Q?R=C3=A9mi_Lapeyre?=) Date: Mon, 18 Mar 2019 05:51:08 -0700 Subject: [Python-ideas] Why operators are useful In-Reply-To: <5C8D8A8F.2040104@canterbury.ac.nz> References: <5C8CB629.9040303@canterbury.ac.nz> <195c22b5-5a56-87a7-69e9-b376f51c5e14@Damon-Family.org> <5C8D8A8F.2040104@canterbury.ac.nz> Message-ID: Le 17 mars 2019 à 02:01:51, Greg Ewing (greg.ewing at canterbury.ac.nz(mailto:greg.ewing at canterbury.ac.nz)) a écrit : > Richard Damon wrote: > > Rémi, I believe, is assuming in their example that by defining the field > > of mathematics being used, there is at least an implicit definition (if > > not actually explicit as such a statement would typically be preceded by > > definitions) definition of the types of the variables. > > In Python, we have such implicit definitions in the form > of comments, and inferences from the way things are used. Yes, exactly. You can make "inferences from the way things are used". But the comparison with maths stops here: you don't make such inferences because your object must be well defined before you start using it. You can track types with comments but you need to comment each line. There are also no definitions if no comment was written. In maths, a given object is not a duck because it quacks and walks like a duck; it's either part of the set of all ducks, or not. Python's typing is implicit, Maths' typing is explicit, so you don't need to spend brain cycles to determine them. Python is imperative, Maths is functional. So once you know the type or the value of an object in maths, you don't have to check all the time to make sure they did not change and spend precious brain cycles tracking it.
I would argue that those two differences are really important when using an operator. When doing maths, you are always acutely aware of the context. > > when looking at a piece of code you > > don't necessarily know the types of the objects being used > > And if you look at an equation from a mathematics text without > the context in which it appears, you won't always know what > it means either. But the equation is only meaningful in a given context. Asking whether f: x -> 1/x is differentiable is only meaningful if we know whether x is in R, C, [1; +oo[... > -- > Greg > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From rhodri at kynesim.co.uk Mon Mar 18 10:02:52 2019 From: rhodri at kynesim.co.uk (Rhodri James) Date: Mon, 18 Mar 2019 14:02:52 +0000 Subject: [Python-ideas] True and False are singletons In-Reply-To: References: <5C8F8088.4050108@canterbury.ac.nz> Message-ID: <961c6268-4872-8b7c-b4b3-d9745954e303@kynesim.co.uk> On 18/03/2019 12:19, Richard Damon wrote: > On 3/18/19 7:27 AM, Greg Ewing wrote: >> Juancarlo Añez wrote: >> >>> if settings[MY_KEY] is True: >>> ... >> >> If I saw code like this, it would take a really good argument to >> convince me that it shouldn't be just >> >> if settings[MY_KEY]: >> ... >> > That means something VERY different. The first asks if the item is > specifically the True value, while the second just asks if the value is > Truthy; it would be satisfied also for values like 1. Yes. And the latter is what people almost always mean.
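[Editor's note: the distinction being debated here — truthiness versus identity with the True object — can be illustrated with a short sketch; the sample values below are arbitrary illustrations, not code from any of the posts.]

```python
# Two different tests: "is this value truthy?" versus
# "is this value the singleton True object itself?"
candidates = [True, 1, 1.0, "yes", [0]]  # hypothetical sample values

truthy = [x for x in candidates if x]             # like: if settings[MY_KEY]:
identical = [x for x in candidates if x is True]  # like: if settings[MY_KEY] is True:

print(truthy)     # every candidate is truthy
print(identical)  # only the actual True object passes the identity test
```

All five candidates pass the truthiness test, but only the first one passes the identity test, which is exactly why the two spellings of the check are not interchangeable.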
-- Rhodri James *-* Kynesim Ltd From rhodri at kynesim.co.uk Mon Mar 18 10:06:53 2019 From: rhodri at kynesim.co.uk (Rhodri James) Date: Mon, 18 Mar 2019 14:06:53 +0000 Subject: [Python-ideas] Why operators are useful In-Reply-To: References: <20190316103330.GL12502@ando.pearwood.info> Message-ID: On 16/03/2019 12:01, Gustavo Carneiro wrote: > Already been said, but might have been forgotten, but the new proposed > syntax: > > new = a + b > > has to compete with the already existing syntax: > > new = {**a, **b} > That's easy. Whether it's spelt with "+" or "|" or pretty much anything else, the operator version is clearer and cleaner. "{**a, **b}" is a combination of operators and literal (display) syntax, and following Guido's reasoning that makes it inherently harder to interpret. It's also ugly IMHO, but that's me. -- Rhodri James *-* Kynesim Ltd From solipsis at pitrou.net Mon Mar 18 10:12:52 2019 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 18 Mar 2019 15:12:52 +0100 Subject: [Python-ideas] Why operators are useful References: <20190316103330.GL12502@ando.pearwood.info> Message-ID: <20190318151252.6bf20587@fsol> On Mon, 18 Mar 2019 14:06:53 +0000 Rhodri James wrote: > On 16/03/2019 12:01, Gustavo Carneiro wrote: > > Already been said, but might have been forgotten, but the new proposed > > syntax: > > > > new = a + b > > > > has to compete with the already existing syntax: > > > > new = {**a, **b} > > > > That's easy. Whether it's spelt with "+" or "|" or pretty much anything > else, the operator version is clearer and cleaner. "{**a, **b}" is a > combination of operators and literal (display) syntax, and following > Guido's reasoning that makes it inherently harder to interpret. It's > also ugly IMHO, but that's me. The question is whether it's too hard or ugly for the use cases. In other words: where are the use cases where it's frequent enough to merge dicts that a nicer syntax is required? 
(also, don't forget you can still use the copy() + update() method) Regards Antoine. From ericfahlgren at gmail.com Mon Mar 18 11:10:03 2019 From: ericfahlgren at gmail.com (Eric Fahlgren) Date: Mon, 18 Mar 2019 08:10:03 -0700 Subject: [Python-ideas] True and False are singletons In-Reply-To: <961c6268-4872-8b7c-b4b3-d9745954e303@kynesim.co.uk> References: <5C8F8088.4050108@canterbury.ac.nz> <961c6268-4872-8b7c-b4b3-d9745954e303@kynesim.co.uk> Message-ID: On Mon, Mar 18, 2019 at 7:04 AM Rhodri James wrote: > On 18/03/2019 12:19, Richard Damon wrote: > > On 3/18/19 7:27 AM, Greg Ewing wrote: > >> Juancarlo Añez wrote: > >> > >>> if settings[MY_KEY] is True: > >>> ... > >> > >> If I saw code like this, it would take a really good argument to > >> convince me that it shouldn't be just > >> > >> if settings[MY_KEY]: > >> ... > >> > > That means something VERY different. The first asks if the item is > > specifically the True value, while the second just asks if the value is > > Truthy; it would be satisfied also for values like 1. > > Yes. And the latter is what people almost always mean. > No, it depends heavily on the context. In GUI code, Oleg's example (tri-state checkbox) is a pervasive idiom. There's lots of code that says "if x is True" or "if x is False" or "if x is None" and that's a very clear indicator that you are dealing with these "booleans that can also be 'unset'". -------------- next part -------------- An HTML attachment was scrubbed... URL: From ijkl at netc.fr Mon Mar 18 11:07:11 2019 From: ijkl at netc.fr (Jimmy Girardet) Date: Mon, 18 Mar 2019 16:07:11 +0100 Subject: [Python-ideas] Why operators are useful In-Reply-To: <20190318151252.6bf20587@fsol> References: <20190316103330.GL12502@ando.pearwood.info> <20190318151252.6bf20587@fsol> Message-ID: <872f3573-90f1-6227-6f22-74d8b1608d0e@netc.fr> Hi, Please let me share my story as an inexperienced Python programmer. Last year I wanted to merge three dicts for config stuff.
I found very quickly the answer : a = {**b, **c, **d} Sadly I was working on Python 3.3 and it was not possible to use this syntax. I don't remember what I did next : use chain, ChainMap, some comprehension or some a.update(), but I was missing the "unpacking syntax". The syntax {**b, **c} wasn't hard to remember. That wasn't something known by mathematicians, experienced programmers or some artist at first look, maybe. But it's a clear syntax, easy to remember. Easy because two asterisks `**` in Python is a well-known syntax due to `**kwargs` in many functions. And easy because at the end it's idiomatic. Many things are not straightforward in Python depending where you come from : if __name__ == '__main__': # Ugly len(collection) and not collection.len() # Ugly depending on your programming background item in collection instead of collection.contains(i) # same thing. list/dict comprehensions... At the end, only a few things are straightforward at the beginning, so the fact that d1+d2 fails isn't a big deal, since you will easily remember, after a quick initial search, the idiom {**d1, **d2} Jimmy Le 18/03/2019 à 15:12, Antoine Pitrou a écrit : > On Mon, 18 Mar 2019 14:06:53 +0000 > Rhodri James wrote: >> On 16/03/2019 12:01, Gustavo Carneiro wrote: >>> Already been said, but might have been forgotten, but the new proposed >>> syntax: >>> >>> new = a + b >>> >>> has to compete with the already existing syntax: >>> >>> new = {**a, **b} >>> >> That's easy. Whether it's spelt with "+" or "|" or pretty much anything >> else, the operator version is clearer and cleaner. "{**a, **b}" is a >> combination of operators and literal (display) syntax, and following >> Guido's reasoning that makes it inherently harder to interpret. It's >> also ugly IMHO, but that's me. > The question is whether it's too hard or ugly for the use cases. In > other words: where are the use cases where it's frequent enough to > merge dicts that a nicer syntax is required?
> > (also, don't forget you can still use the copy() + update() method) > > Regards > > Antoine. > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From rhodri at kynesim.co.uk Mon Mar 18 11:26:04 2019 From: rhodri at kynesim.co.uk (Rhodri James) Date: Mon, 18 Mar 2019 15:26:04 +0000 Subject: [Python-ideas] True and False are singletons In-Reply-To: References: <5C8F8088.4050108@canterbury.ac.nz> <961c6268-4872-8b7c-b4b3-d9745954e303@kynesim.co.uk> Message-ID: On 18/03/2019 15:10, Eric Fahlgren wrote: > On Mon, Mar 18, 2019 at 7:04 AM Rhodri James wrote: > >> On 18/03/2019 12:19, Richard Damon wrote: >>> On 3/18/19 7:27 AM, Greg Ewing wrote: >>>> Juancarlo Añez wrote: >>>> >>>>> if settings[MY_KEY] is True: >>>>> ... >>>> >>>> If I saw code like this, it would take a really good argument to >>>> convince me that it shouldn't be just >>>> >>>> if settings[MY_KEY]: >>>> ... >>>> >>> That means something VERY different. The first asks if the item is >>> specifically the True value, while the second just asks if the value is >>> Truthy; it would be satisfied also for values like 1. >> >> Yes. And the latter is what people almost always mean. >> > > No, it depends heavily on the context. In GUI code, Oleg's example > (tri-state checkbox) is a pervasive idiom. There's lots of code that says > "if x is True" or "if x is False" or "if x is None" and that's a very clear > indicator that you are dealing with these "booleans that can also be > 'unset'". I would still contend that even in that case, testing "x is True" is asking to be hit with subtle bugs.
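[Editor's note: the tri-state idiom Oleg and Eric describe can be sketched as follows; the checkbox labels and the `checkbox_label` helper are hypothetical illustrations, not code from any of the posts.]

```python
def checkbox_label(state):
    """Render a tri-state value, where None means 'unset'/'indeterminate'."""
    if state is True:
        return "checked"
    if state is False:
        return "unchecked"
    if state is None:
        return "indeterminate"
    raise ValueError(f"not a tri-state value: {state!r}")

# A plain truthiness test would collapse False and None into a single
# branch, losing exactly the distinction the tri-state idiom needs.
labels = [checkbox_label(s) for s in (True, False, None)]
print(labels)
```

This is where the identity tests earn their keep: `is True` / `is False` / `is None` keep the three states apart, at the cost of silently falling through (or raising, as here) for other truthy values such as 1.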
-- Rhodri James *-* Kynesim Ltd From g.rodola at gmail.com Mon Mar 18 11:41:34 2019 From: g.rodola at gmail.com (Giampaolo Rodola') Date: Mon, 18 Mar 2019 16:41:34 +0100 Subject: [Python-ideas] Add subprocess.Popen suspend() and resume() Message-ID: Hello, I've had these 2 implemented in psutil for a long time. On POSIX these are convenience functions using os.kill() + SIGSTOP / SIGCONT (the same as CTRL+Z / "fg"). On Windows they use the undocumented NtSuspendProcess and NtResumeProcess Windows APIs available since XP. The same approach is used by ProcessHacker and - I suppose - pssuspend.exe, both from the Sysinternals team. It must be noted that there are 3 different ways to do this on Windows (https://stackoverflow.com/a/11010508/376587) but NtSuspend/ResumeProcess appears to be the best choice. Possible use case: <> Thoughts? -- Giampaolo - http://grodola.blogspot.com From solipsis at pitrou.net Mon Mar 18 11:46:34 2019 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 18 Mar 2019 16:46:34 +0100 Subject: [Python-ideas] Add subprocess.Popen suspend() and resume() References: Message-ID: <20190318164634.6a01ca49@fsol> Seems reasonable to me. Regards Antoine. On Mon, 18 Mar 2019 16:41:34 +0100 "Giampaolo Rodola'" wrote: > Hello, > I've had these 2 implemented in psutil for a long time. On > POSIX these are convenience functions using os.kill() + SIGSTOP / > SIGCONT (the same as CTRL+Z / "fg"). On Windows they use the undocumented > NtSuspendProcess and NtResumeProcess Windows APIs available since XP. > The same approach is used by ProcessHacker and - I suppose - > pssuspend.exe, both from the Sysinternals team. It must be noted that there > are 3 different ways to do this on Windows > (https://stackoverflow.com/a/11010508/376587) but > NtSuspend/ResumeProcess appears to be the best choice. Possible use > case: > > < system, which is desirable in cases where a process is consuming a > resource (e.g.
network, CPU or disk) that you want to allow different > processes to use. Rather than kill the process that's consuming the > resource, suspending permits you to let it continue operation at some > later point in time.>> > > Thoughts? > From wes.turner at gmail.com Mon Mar 18 16:52:53 2019 From: wes.turner at gmail.com (Wes Turner) Date: Mon, 18 Mar 2019 16:52:53 -0400 Subject: [Python-ideas] True and False are singletons In-Reply-To: References: Message-ID: 'True' is a keyword. (Which is now immutable in Python 3.X?) >>> True = 1 File "<stdin>", line 1 SyntaxError: can't assign to keyword https://docs.python.org/3/reference/datamodel.html#the-standard-type-hierarchy https://docs.python.org/3/search.html?q=singleton - "Since None is a singleton, testing for object identity (using == in C) is sufficient. There is no PyNone_Check() function for the same reason." https://docs.python.org/3/c-api/none.html?highlight=singleton - "Using a trailing comma for a singleton tuple: a, or (a,)" (?) https://docs.python.org/3/library/stdtypes.html?highlight=singleton#tuple https://docs.python.org/3/library/stdtypes.html#boolean-values https://docs.python.org/3/library/stdtypes.html#truth-value-testing In Python 2: >>> True True >>> True is True True >>> True = 1 >>> True is 1 True >>> True is None False >>> True = None >>> True is None True On Mon, Mar 18, 2019 at 7:34 AM Chris Angelico wrote: > On Mon, Mar 18, 2019 at 10:14 PM Juancarlo Añez wrote: > > > > It came to my attention that: > > > > In the original PEP True and False are said to be singletons > https://www.python.org/dev/peps/pep-0285/, but it's not in the Data Model > https://docs.python.org/3/reference/datamodel.html > > > > > > This came to my attention by code wanting to own the valid values in a > dict's key: > > > > if settings[MY_KEY] is True: > > ... > > > > > > If True and False are singletons in the spec (and not only in the > CPython implementation), it should be prominent and well known.
> > > > "Singleton" technically means that there is only one such object. > 'None' is a singleton, by language specification; if type(x) is > type(None), you can safely assume that x is None. Booleans are a bit > more tricky; there will only ever be those two, but they're two. IMO > the PEP is minorly inaccurate to use the word "singleton" there, but > it's no big deal. As Remi says, the two built-in ones are the only two > instances of that type. > > ChrisA > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Mon Mar 18 16:59:56 2019 From: rosuav at gmail.com (Chris Angelico) Date: Tue, 19 Mar 2019 07:59:56 +1100 Subject: [Python-ideas] True and False are singletons In-Reply-To: References: Message-ID: On Tue, Mar 19, 2019 at 7:53 AM Wes Turner wrote: > > 'True' is a keyword. (Which is now immutable in Python 3.X?) > > >>> True = 1 > File "", line 1 > SyntaxError: can't assign to keyword In Python 3, the source code token "True" is a keyword literal that always represents the bool value True. > In Python 2: > > >>> True > True > >>> True is True > True > >>> True = 1 > >>> True is 1 > True > >>> True is None > False > >>> True = None > >>> True is None > True In Python 2, the source code token "True" is simply a name, and there is a built-in of that name. Before it became a built-in, it was common for scripts to have their own definitions of True and False [1], so to avoid unnecessary breakage, they were made assignable in the normal way. Python 3 simplifies this by making them keywords. But either way, the *values* True and False are special, and are the only two instances of the bool type that will ever exist. ChrisA [1] Note that I learned about this in history class; it was before my time. 
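[Editorial example — a short, runnable illustration of the behaviour Chris describes: True and False are the only two instances of bool, equality follows int semantics, but identity ("is") distinguishes them from merely truthy values. The names (describe, the "unset"/"on"/"off" states) are illustrative, not taken from any message in the thread.]

```python
# The two bool values compare equal to the ints 0 and 1,
# but identity tests distinguish them from other truthy values.
one = 1
assert isinstance(True, int)            # bool is a subclass of int
assert one == True and one is not True  # equal, yet a different object

def describe(setting):
    # Tri-state setting: None means "unset"; identity tests pick out
    # exactly the two bool instances, rejecting other truthy values.
    if setting is None:
        return "unset"
    if setting is True:
        return "on"
    if setting is False:
        return "off"
    raise TypeError(f"expected True, False or None, got {setting!r}")

print([describe(s) for s in (None, True, False)])   # ['unset', 'on', 'off']
```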
From greg.ewing at canterbury.ac.nz Mon Mar 18 16:58:55 2019 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 19 Mar 2019 09:58:55 +1300 Subject: [Python-ideas] True and False are singletons In-Reply-To: <20190318120822.t36352yku5qzuu3d@phdru.name> References: <5C8F8088.4050108@canterbury.ac.nz> <20190318120822.t36352yku5qzuu3d@phdru.name> Message-ID: <5C90068F.9040006@canterbury.ac.nz> Oleg Broytman wrote: > Three-way (tri state) checkbox. You have to distinguish False and > None if the possible values are None, False and True. In that case the conventional way to write it would be if settings[MY_KEY] == True: ... It's not a major issue, but I get nervous when I see code that assumes True and False are unique, because things weren't always that way. -- Greg From greg.ewing at canterbury.ac.nz Mon Mar 18 17:12:09 2019 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 19 Mar 2019 10:12:09 +1300 Subject: [Python-ideas] True and False are singletons In-Reply-To: References: <5C8F8088.4050108@canterbury.ac.nz> Message-ID: <5C9009A9.8090106@canterbury.ac.nz> Richard Damon wrote: > On 3/18/19 7:27 AM, Greg Ewing wrote: > >> if settings[MY_KEY]: >> ... > > That means something VERY different. Yes, but there needs to be justification for why the difference matters and why this particular way is the best way to deal with it. Whenever you write 'x is True' or 'x == True', you are putting a burden on all code that assigns to x to ensure that the value is actually an instance of bool rather than just a truthy or falsy value. That's an unusual requirement that can lead to obscure bugs. In the tri-state example, the way I would do it is to guard uses of it with 'if x is not None' and then treat the other values as truthy or falsy.
-- Greg From mertz at gnosis.cx Mon Mar 18 17:47:17 2019 From: mertz at gnosis.cx (David Mertz) Date: Mon, 18 Mar 2019 17:47:17 -0400 Subject: [Python-ideas] True and False are singletons In-Reply-To: <5C90068F.9040006@canterbury.ac.nz> References: <5C8F8088.4050108@canterbury.ac.nz> <20190318120822.t36352yku5qzuu3d@phdru.name> <5C90068F.9040006@canterbury.ac.nz> Message-ID: It was a VERY long time ago when True and False were not singletons. I don't think we should still try to write code based on rules that stopped applying more than a decade ago. On Mon, Mar 18, 2019, 5:42 PM Greg Ewing wrote: > Oleg Broytman wrote: > > Three-way (tri state) checkbox. You have to distinguish False and > > None if the possible values are None, False and True. > > In that case the conventional way to write it would be > > if settings[MY_KEY] == True: > ... > > It's not a major issue, but I get nervous when I see code > that assumes True and False are unique, because things > weren't always that way. > > -- > Greg > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mertz at gnosis.cx Mon Mar 18 17:52:45 2019 From: mertz at gnosis.cx (David Mertz) Date: Mon, 18 Mar 2019 17:52:45 -0400 Subject: [Python-ideas] True and False are singletons In-Reply-To: <5C9009A9.8090106@canterbury.ac.nz> References: <5C8F8088.4050108@canterbury.ac.nz> <5C9009A9.8090106@canterbury.ac.nz> Message-ID: There are few cases where I would approve of 'if x is True'. However, the names used in the example suggest it could be one of those rare cases. Settings of True/False/None (i.e. not set) seem like a reasonable pattern. In fact, in code like that, merely "truthy" values are probably a bug that should not pass silently.
Obviously this depends on the surrounding code to decide. On Mon, Mar 18, 2019, 5:44 PM Greg Ewing wrote: > Richard Damon wrote: > > On 3/18/19 7:27 AM, Greg Ewing wrote: > > > >> if settings[MY_KEY]: > >> ... > > > > That means something VERY different. > > Yes, but there needs to be justification for why the difference > matters and why this particular way is the best way to deal > with it. > > Whenever you write 'x is True' or 'x == True', you are putting > a burden on all code that assigns to x to ensure that the > value is actually an instance of bool rather than just a > truthy or falsy value. That's an unusual requirement that > can lead to obscure bugs. > > In the tri-state example, the way I would do it is to guard > uses of it with 'if x is not None' and then treat the other > values as truthy or falsy. > > -- > Greg > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From timothy.c.delaney at gmail.com Mon Mar 18 18:09:13 2019 From: timothy.c.delaney at gmail.com (Tim Delaney) Date: Tue, 19 Mar 2019 09:09:13 +1100 Subject: [Python-ideas] True and False are singletons In-Reply-To: <5C90068F.9040006@canterbury.ac.nz> References: <5C8F8088.4050108@canterbury.ac.nz> <20190318120822.t36352yku5qzuu3d@phdru.name> <5C90068F.9040006@canterbury.ac.nz> Message-ID: On Tue, 19 Mar 2019 at 08:42, Greg Ewing wrote: > Oleg Broytman wrote: > > Three-way (tri state) checkbox. You have to distinguish False and > > None if the possible values are None, False and True. > > In that case the conventional way to write it would be > > if settings[MY_KEY] == True: > ... > > It's not a major issue, but I get nervous when I see code > that assumes True and False are unique, because things > weren't always that way.
I would argue the opposite - the use of "is" shows a clear knowledge that True and False are each a singleton and the author explicitly intended to use them that way. Use of == in the same context is more likely to indicate a programmer who is unfamiliar with Python's truth rules. Tim Delaney -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Mon Mar 18 18:15:04 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 19 Mar 2019 09:15:04 +1100 Subject: [Python-ideas] True and False are singletons In-Reply-To: <5C90068F.9040006@canterbury.ac.nz> References: <5C8F8088.4050108@canterbury.ac.nz> <20190318120822.t36352yku5qzuu3d@phdru.name> <5C90068F.9040006@canterbury.ac.nz> Message-ID: <20190318221503.GQ12502@ando.pearwood.info> On Tue, Mar 19, 2019 at 09:58:55AM +1300, Greg Ewing wrote: > Oleg Broytman wrote: > > Three-way (tri state) checkbox. You have to distinguish False and > >None if the possible valuse are None, False and True. > > In that case the conventional way to write it would be > > if settings[MY_KEY] == True: > ... For a tri-state setting, I would always check for None (or whatever third state was used) first: setting = settings[MY_KEY] if setting is None: # handle third state elif setting: # handle true state else: # handle false state If for some strange reason I required the flags to be precisely True or False rather than arbitrary truthy values, that's a *four* state flag where the fourth state is an error condition. setting = settings[MY_KEY] if setting is None: # handle third state if not isinstance(setting, bool): raise TypeError("not a bool! (but why do I care???)") if setting: # handle true state else: # handle false state > It's not a major issue, but I get nervous when I see code > that assumes True and False are unique, because things > weren't always that way. Do you also guard against True and False not being defined at all? 
As long as True and False have been builtins, it has been a language guarantee that they will be unique. -- Steven From greg.ewing at canterbury.ac.nz Mon Mar 18 17:49:41 2019 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 19 Mar 2019 10:49:41 +1300 Subject: [Python-ideas] Why operators are useful In-Reply-To: References: <5C8CB629.9040303@canterbury.ac.nz> <195c22b5-5a56-87a7-69e9-b376f51c5e14@Damon-Family.org> <5C8D8A8F.2040104@canterbury.ac.nz> Message-ID: <5C901275.8030101@canterbury.ac.nz> Rémi Lapeyre wrote: > You can make "inferences from the way things are used". But the > comparison with maths stops here, you don't make such inferences because your > object must be well defined before you start using it. In maths texts it's very common to see things like 'Let y = f(x)...' where it's been made clear beforehand (either explicitly or implicitly) what type f returns. That's completely analogous to inferring the type bound to a Python name from an assignment statement. > You can track types with > comments but you need to comment each line. No, you don't, because most lines in most programs allow types to be inferred. The reason that things like MyPy are possible and useful is that Python programs in practice are usually well-typed. > In maths, a given object is not a duck because it quacks and walks like > a duck, it's > either part of the set of all ducks, or not. But there's no restriction on how you define the set of all ducks. It could be "the set of all things that quack". Duck typing is totally possible in mathematics, even common. For example, in linear algebra the ducks are "anything you can apply a linear operator to". That can cover a surprisingly large variety of things. > But the equation is only meaningful in a given context. Asking whether > f: x -> 1/x > is differentiable is only meaningful if we know whether x is in R, C, > [1; +oo[... That depends on what you want to say.
You can say "let f be a differentiable function" and then go on to talk about things that depend only on the differentiability of f, without knowing exactly what types f operates on. Then later you can substitute any function you know to be differentiable, and all of those things will be true for it. Mathematicians abstract things this way all the time. Groups, fields, vector spaces, etc. are all very abstract concepts, defined only as a set of objects having certain properties. If that's not duck typing, I don't know what is. -- Greg From greg.ewing at canterbury.ac.nz Mon Mar 18 18:32:56 2019 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 19 Mar 2019 11:32:56 +1300 Subject: [Python-ideas] True and False are singletons In-Reply-To: References: <5C8F8088.4050108@canterbury.ac.nz> <20190318120822.t36352yku5qzuu3d@phdru.name> <5C90068F.9040006@canterbury.ac.nz> Message-ID: <5C901C98.6090101@canterbury.ac.nz> Tim Delaney wrote: > I would argue the opposite - the use of "is" shows a clear knowledge > that True and False are each a singleton and the author explicitly > intended to use them that way. I don't think you can infer that. It could equally well be someone who's *not* familiar with Python truth rules and really just meant "if x". Or someone who's unfamiliar with booleans in general and thinks that every "if" statement has to have a comparison in it. -- Greg From steve at pearwood.info Mon Mar 18 18:48:42 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 19 Mar 2019 09:48:42 +1100 Subject: [Python-ideas] True and False are singletons In-Reply-To: References: Message-ID: <20190318224842.GR12502@ando.pearwood.info> On Mon, Mar 18, 2019 at 10:32:38PM +1100, Chris Angelico wrote: > "Singleton" technically means that there is only one such object. True, but it is common to abuse the term to mean only a fixed, small number of such objects, since "duoton" (for two) and "tripleton" (for three) have never caught on.
-- Steven From shoyer at gmail.com Mon Mar 18 18:52:33 2019 From: shoyer at gmail.com (Stephan Hoyer) Date: Mon, 18 Mar 2019 15:52:33 -0700 Subject: [Python-ideas] True and False are singletons In-Reply-To: <5C901C98.6090101@canterbury.ac.nz> References: <5C8F8088.4050108@canterbury.ac.nz> <20190318120822.t36352yku5qzuu3d@phdru.name> <5C90068F.9040006@canterbury.ac.nz> <5C901C98.6090101@canterbury.ac.nz> Message-ID: On Mon, Mar 18, 2019 at 3:42 PM Greg Ewing wrote: > Tim Delaney wrote: > > I would argue the opposite - the use of "is" shows a clear knowledge > > that True and False are each a singleton and the author explicitly > > intended to use them that way. > > I don't think you can infer that. It could equally well be someone who's > *not* familiar with Python truth rules and really just meant "if x". > Or someone who's unfamiliar with booleans in general and thinks that > every "if" statement has to have a comparison in it. > Regardless of whether it's idiomatic Python code or not, this pattern ("is True:") can be found all over Python code in the wild. If CPython ever broke this guarantee, quite a few popular libraries on pypi would be broken, including pandas, sqlalchemy, attrs and even Python's own standard library: https://github.com/python/cpython/blob/c183444f7e2640b054956474d71aae6e8d31a543/Lib/textwrap.py#L175 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From steve at pearwood.info Mon Mar 18 18:54:26 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 19 Mar 2019 09:54:26 +1100 Subject: [Python-ideas] True and False are singletons In-Reply-To: <5C901C98.6090101@canterbury.ac.nz> References: <5C8F8088.4050108@canterbury.ac.nz> <20190318120822.t36352yku5qzuu3d@phdru.name> <5C90068F.9040006@canterbury.ac.nz> <5C901C98.6090101@canterbury.ac.nz> Message-ID: <20190318225425.GS12502@ando.pearwood.info> On Tue, Mar 19, 2019 at 11:32:56AM +1300, Greg Ewing wrote: > Tim Delaney wrote: > >I would argue the opposite - the use of "is" shows a clear knowledge > >that True and False are each a singleton and the author explicitly > >intended to use them that way. > > I don't think you can infer that. It could equally well be someone who's > *not* familiar with Python truth rules and really just meant "if x". > Or someone who's unfamiliar with booleans in general and thinks that > every "if" statement has to have a comparison in it. This! Writing "if some_bool = true" in statically typed languages is pretty common. I used to see it a lot in my Pascal days. In fairness that was because I used to write some of it myself :-( For some reason it rarely seems to happen when the flag being tested is itself a boolean expression: if x > 0: # this if (x > 0) is True: # but never this which gives credence to your idea that people are (consciously or unconsciously) expecting every if to include a comparison. -- Steven From elazarg at gmail.com Mon Mar 18 18:56:56 2019 From: elazarg at gmail.com (Elazar) Date: Tue, 19 Mar 2019 00:56:56 +0200 Subject: [Python-ideas] Why operators are useful In-Reply-To: <5C901275.8030101@canterbury.ac.nz> References: <5C8CB629.9040303@canterbury.ac.nz> <195c22b5-5a56-87a7-69e9-b376f51c5e14@Damon-Family.org> <5C8D8A8F.2040104@canterbury.ac.nz> <5C901275.8030101@canterbury.ac.nz> Message-ID: On Tue, 19 Mar 2019 at 0:41,
Greg Ewing <greg.ewing at canterbury.ac.nz> wrote: > Rémi Lapeyre wrote: > > > You can make "inferences from the way things are used". But the > > comparison with maths stops here, you don't make such inferences because > your > > object must be well defined before you start using it. > > In maths texts it's very common to see things like 'Let y = f(x)...' > where it's been made clear beforehand (either explicitly or implicitly) > what type f returns. > > That's completely analogous to inferring the type bound to a Python > name from an assignment statement. > > > You can track types with > > comments but you need to comment each line. > > No, you don't, because most lines in most programs allow types to > be inferred. The reason that things like MyPy are possible and > useful is that Python programs in practice are usually well-typed. > > > In maths, a given object is not a duck because it quacks and walks like > > a duck, it's > > either part of the set of all ducks, or not. > > But there's no restriction on how you define the set of all ducks. > It could be "the set of all things that quack". Duck typing is > totally possible in mathematics, even common. > > For example, in linear algebra the ducks are "anything you can > apply a linear operator to". That can cover a surprisingly large > variety of things. > > > But the equation is only meaningful in a given context. Asking whether > > f: x -> 1/x > > is differentiable is only meaningful if we know whether x is in R, C, > > [1; +oo[... > > That depends on what you want to say. You can say "let f be a > differentiable function" and then go on to talk about things that > depend only on the differentiability of f, without knowing > exactly what types f operates on. Then later you can substitute > any function you know to be differentiable, and all of those > things will be true for it. > > Mathematicians abstract things this way all the time. Groups, > fields, vector spaces, etc.
are all very abstract concepts, > defined only as a set of objects having certain properties. > If that's not duck typing, I don't know what is. > Technically, that's structural typing. Duck typing is only relevant when there is some kind of control flow, and things need not always have all the properties in question. But I don't think this difference is that important in the context. Elazar -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Mon Mar 18 19:08:46 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 19 Mar 2019 10:08:46 +1100 Subject: [Python-ideas] Why operators are useful In-Reply-To: <20190318151252.6bf20587@fsol> References: <20190316103330.GL12502@ando.pearwood.info> <20190318151252.6bf20587@fsol> Message-ID: <20190318230846.GT12502@ando.pearwood.info> On Mon, Mar 18, 2019 at 03:12:52PM +0100, Antoine Pitrou wrote: > (also, don't forget you can still use the copy() + update() method) If we had fluent method calls, we could write: process(mapping.copy().update(other)) but we don't, so we use a pointless temporary variable: temp = mapping.copy() temp.update(other) process(temp) del temp # don't pollute the global namespace turning what ought to be a simple expression into an extra two or three statements. -- Steven From steve at pearwood.info Mon Mar 18 19:16:14 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 19 Mar 2019 10:16:14 +1100 Subject: [Python-ideas] Why operators are useful In-Reply-To: <872f3573-90f1-6227-6f22-74d8b1608d0e@netc.fr> References: <20190316103330.GL12502@ando.pearwood.info> <20190318151252.6bf20587@fsol> <872f3573-90f1-6227-6f22-74d8b1608d0e@netc.fr> Message-ID: <20190318231613.GU12502@ando.pearwood.info> On Mon, Mar 18, 2019 at 04:07:11PM +0100, Jimmy Girardet wrote: > The syntax {**b,**c} wasn't hard to remember. [...] > And easy because at the end it's idiomatic. 
It is only idiomatic if moderately experienced Python programmers can automatically read it and write it without thinking about what it means. That's not yet the case. Since it is new, not many people know it at all, and those who do often can't use it because they have to use older versions of Python where the syntax is not allowed. I don't recall the discussion over whether to allow dict unpacking literals. Was there a PEP? Did it occur here on Python-Ideas? -- Steven From steve at pearwood.info Mon Mar 18 19:18:32 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 19 Mar 2019 10:18:32 +1100 Subject: [Python-ideas] Why operators are useful In-Reply-To: References: <20190316103330.GL12502@ando.pearwood.info> Message-ID: <20190318231831.GV12502@ando.pearwood.info> On Sat, Mar 16, 2019 at 07:13:04PM -0400, Terry Reedy wrote: > > ? ? new = a.copy() > > ? ? new.update(b) > > ? ? # do something with new > > In my census of the stdlib, already posted and noted as subject to > error, this was twice as common as all other non-update-in-place > constructions (8 to 4) and about 1/4 as common as update in place (8 to > 35). Thank you Terry for doing a survey of the stdlib. 
-- Steven From 2QdxY4RzWzUUiLuE at potatochowder.com Mon Mar 18 19:34:48 2019 From: 2QdxY4RzWzUUiLuE at potatochowder.com (Dan Sommers) Date: Mon, 18 Mar 2019 18:34:48 -0500 Subject: [Python-ideas] Why operators are useful In-Reply-To: <20190318230846.GT12502@ando.pearwood.info> References: <20190316103330.GL12502@ando.pearwood.info> <20190318151252.6bf20587@fsol> <20190318230846.GT12502@ando.pearwood.info> Message-ID: <8526dfc3-2e8b-611d-7b99-25ca80f9c761@potatochowder.com> On 3/18/19 6:08 PM, Steven D'Aprano wrote: > On Mon, Mar 18, 2019 at 03:12:52PM +0100, Antoine Pitrou wrote: > >> (also, don't forget you can still use the copy() + update() method) > > If we had fluent method calls, we could write: > > process(mapping.copy().update(other)) > > but we don't, so we use a pointless temporary variable: > > temp = mapping.copy() > temp.update(other) > process(temp) > del temp # don't pollute the global namespace > > > turning what ought to be a simple expression into an extra two or three > statements. So how many of you got tired of those three statements and added something like the following function to your private collection of useful functions: def merged_mappings(mapping, other): temp = mapping.copy() temp.update(other) return temp # no need to del temp here! turning two or three statements into a simple expression? I sure didn't. From steve at pearwood.info Mon Mar 18 19:58:03 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 19 Mar 2019 10:58:03 +1100 Subject: [Python-ideas] Why operators are useful In-Reply-To: References: <5C8CB629.9040303@canterbury.ac.nz> <195c22b5-5a56-87a7-69e9-b376f51c5e14@Damon-Family.org> <5C8D8A8F.2040104@canterbury.ac.nz> Message-ID: <20190318235802.GW12502@ando.pearwood.info> On Mon, Mar 18, 2019 at 05:51:08AM -0700, Rémi Lapeyre wrote: > Maths' typing is explicit so you don't need to spend brain cycles to > determine them. Surely that depends on how formal you are being?
Maths can vary hugely in formality, even at a professional level. It is a terrible overgeneralisation to state that maths is always explicitly typed, unless your definition of mathematics is defined so narrowly as to exclude the majority of maths done in the world. In my own personal experience, there is a lot of mathematics done using implicit typing. I've never seen anyone explicitly declare that the i, j or k loop variables in a sum or product are an element of ℕ, they just use them: ∑_{i=0} expression Likewise it is very common to assume that n is an integer, x and y are Reals, and z is a Complex. Perhaps not in formal papers, but in less formal contexts, it is very common to assume specific conventions used in the field rather than spell them out fully. For example: https://en.wikipedia.org/wiki/Volume_of_an_n-ball You might not give Wikipedia much credence, but I trust you won't object to John Baez and Terry Tao as examples of actual practicing mathematicians: https://johncarlosbaez.wordpress.com/2019/03/15/algebraic-geometry/ https://terrytao.wordpress.com/2019/02/19/on-the-universality-of-the-incompressible-euler-equation-on-compact-manifolds-ii-non-rigidity-of-euler-flows/ Similarly, I've never seen anyone explicitly declare the type of a variable used for a change in variable. Even if we've explicitly stated that x is a Real, we might write something like: let u = x^2 + 3x in order to apply the chain rule, without explicitly stating that u is also a Real. Why would you need to? It's not like mathematics has a compiler which can flag type errors. We declare types only when needed. The rest of the time, we can use convention, domain-knowledge or inference to determine types.
-- Steven From steve at pearwood.info Mon Mar 18 20:12:52 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 19 Mar 2019 11:12:52 +1100 Subject: [Python-ideas] Why operators are useful In-Reply-To: <8526dfc3-2e8b-611d-7b99-25ca80f9c761@potatochowder.com> References: <20190316103330.GL12502@ando.pearwood.info> <20190318151252.6bf20587@fsol> <20190318230846.GT12502@ando.pearwood.info> <8526dfc3-2e8b-611d-7b99-25ca80f9c761@potatochowder.com> Message-ID: <20190319001252.GX12502@ando.pearwood.info> On Mon, Mar 18, 2019 at 06:34:48PM -0500, Dan Sommers wrote: > So how many of you got tired of those three statements and > added something like the following function to your private > collection of useful functions: > > def merged_mappings(mapping, other): > temp = mapping.copy() > temp.update(other) > return temp # no need to del temp here! > > turning two or three statements into a simple expression? > > I sure didn't. I did, only I called it "updated()". As tends to happen, what started as a three line function quickly became more complex. E.g. docstrings, doctests, taking an arbitrary number of dicts to merge, keyword arguments. The latest version of my updated() function is 12 lines of code and a 13 line docstring, plus blank lines. And then I found I could never guarantee that my private toolbox was available, on account of it being, you know, *private*. So I found myself writing: try: from toolbox import updated except ImportError: # Fall back to a basic, no-frills version. def updated(d1, d2): ... which then means I can't use the extra frills in my private version. So why have the private version when I can't use it? Unless you are only writing for yourself, never to share your code with anyone else, "just put it in your private toolbox" can be impractical. 
-- Steven From 2QdxY4RzWzUUiLuE at potatochowder.com Mon Mar 18 21:14:28 2019 From: 2QdxY4RzWzUUiLuE at potatochowder.com (Dan Sommers) Date: Mon, 18 Mar 2019 20:14:28 -0500 Subject: [Python-ideas] Why operators are useful In-Reply-To: <20190319001252.GX12502@ando.pearwood.info> References: <20190316103330.GL12502@ando.pearwood.info> <20190318151252.6bf20587@fsol> <20190318230846.GT12502@ando.pearwood.info> <8526dfc3-2e8b-611d-7b99-25ca80f9c761@potatochowder.com> <20190319001252.GX12502@ando.pearwood.info> Message-ID: <5beb7a79-0b5e-1a33-77a2-3e3c1d86e68c@potatochowder.com> On 3/18/19 7:12 PM, Steven D'Aprano wrote: > On Mon, Mar 18, 2019 at 06:34:48PM -0500, Dan Sommers wrote: > >> So how many of you got tired of those three statements and >> added something like the following function to your private >> collection of useful functions: >> >> def merged_mappings(mapping, other): >> temp = mapping.copy() >> temp.update(other) >> return temp # no need to del temp here! >> >> turning two or three statements into a simple expression? >> >> I sure didn't. > > I did, only I called it "updated()". > > As tends to happen, what started as a three line function quickly > became more complex. E.g. docstrings, doctests, taking an arbitrary > number of dicts to merge, keyword arguments. The latest version of my > updated() function is 12 lines of code and a 13 line docstring, plus > blank lines. Yeah, that happens. :-) > And then I found I could never guarantee that my private toolbox was > available, on account of it being, you know, *private*. So I found > myself writing: > > try: > from toolbox import updated > except ImportError: > # Fall back to a basic, no-frills version. > def updated(d1, d2): > ... > > which then means I can't use the extra frills in my private > version. So why have the private version when I can't use it? 
The fact that you went to the trouble of writing (and testing and documenting and maintaining) that function, if nothing else, says that you perform this operation enough that repeating those three lines of code started to bother you. That's real evidence that merging mappings is *not* a once-a-year problem. To get back to Antoine's question, what are the use cases for your well-honed library function? Do you use it in every new project, or only projects of certain kinds (GUIs, daemons, etc.)? > Unless you are only writing for yourself, never to share your code > with anyone else, "just put it in your private toolbox" can be > impractical. Depending on the nature of the project, my private toolbox is open in both directions. Obviously, I'm not for stealing code from elsewhere and calling it my own, but I've certainly taken outside ideas and incorporated them into my private toolbox. I've also "contributed" bits and pieces from my private toolbox into other projects; in some ways, that's just passing folklore and tribal knowledge to the next generation of programmers. From storchaka at gmail.com Tue Mar 19 03:56:21 2019 From: storchaka at gmail.com (Serhiy Storchaka) Date: Tue, 19 Mar 2019 09:56:21 +0200 Subject: [Python-ideas] True and False are singletons In-Reply-To: <5C90068F.9040006@canterbury.ac.nz> References: <5C8F8088.4050108@canterbury.ac.nz> <20190318120822.t36352yku5qzuu3d@phdru.name> <5C90068F.9040006@canterbury.ac.nz> Message-ID: 18.03.19 22:58, Greg Ewing wrote: > Oleg Broytman wrote: >> Three-way (tri-state) checkbox. You have to distinguish False and >> None if the possible values are None, False and True. > > In that case the conventional way to write it would be > > if settings[MY_KEY] == True: > ... > > It's not a major issue, but I get nervous when I see code > that assumes True and False are unique, because things > weren't always that way. "x == True" looks more dubious to me than "x is True".
The latter can be intentional (see for example the JSON serializer); the former was likely written by a newbie and contains a bug. For example, 1 and 1.0 will pass this test, but 2 and 1.2 will not. Python 3.8 will emit a syntax warning for "x is 1" but not for "x is True", because the latter is well defined and has valid use cases. From storchaka at gmail.com Tue Mar 19 04:01:57 2019 From: storchaka at gmail.com (Serhiy Storchaka) Date: Tue, 19 Mar 2019 10:01:57 +0200 Subject: [Python-ideas] True and False are singletons In-Reply-To: References: Message-ID: 18.03.19 22:52, Wes Turner wrote: > >>> True = 1 > File "", line 1 > SyntaxError: can't assign to keyword The error message will be changed in 3.8. >>> True = 1 File "", line 1 SyntaxError: cannot assign to True From solipsis at pitrou.net Tue Mar 19 09:00:16 2019 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 19 Mar 2019 14:00:16 +0100 Subject: [Python-ideas] Why operators are useful References: <5C8CB629.9040303@canterbury.ac.nz> <195c22b5-5a56-87a7-69e9-b376f51c5e14@Damon-Family.org> <5C8D8A8F.2040104@canterbury.ac.nz> <5C901275.8030101@canterbury.ac.nz> Message-ID: <20190319140016.58437f52@fsol> On Tue, 19 Mar 2019 10:49:41 +1300 Greg Ewing wrote: > Rémi Lapeyre wrote: > > > You can make "inferences from the way things are used". But the > > comparison with maths stops here, you don't make such inferences because your > > object must be well defined before you start using it. > > In maths texts it's very common to see things like 'Let y = f(x)...' > where it's been made clear beforehand (either explicitly or implicitly) > what type f returns. It's made clear because, when f was defined, it was explicitly spelled out what the source and destination domains are (not sure that's the right terminology). That's part of how you define a function in maths. There's no such thing in Python, unless you enforce typing hints and/or comprehensive docstrings.
> No, you don't, because most lines in most programs allow types to > be inferred. The reason that things like MyPy are possible and > useful is that Python programs in practice are usually well-typed. You are being idealistic here. MyPy relies on typing hints being available, and sufficiently precise. Regards Antoine. From sylvain.marie at se.com Tue Mar 19 09:33:15 2019 From: sylvain.marie at se.com (Sylvain MARIE) Date: Tue, 19 Mar 2019 13:33:15 +0000 Subject: [Python-ideas] Problems (and solutions?) in writing decorators In-Reply-To: <23691.8773.311344.738472@turnbull.sk.tsukuba.ac.jp> References: <20181024203512.GI3817@ando.pearwood.info> <5BD23DE1.1000909@canterbury.ac.nz> <20190312113010.GQ12502@ando.pearwood.info> <23691.8773.311344.738472@turnbull.sk.tsukuba.ac.jp> Message-ID: Stephen > If the answer is "maybe", IMO PyPI is the right solution for distribution. Very wise words; I understand this point. However, as of today it is *not* possible to write such a library in a complete way without an additional tool from the language itself. Trust me, I tried very hard :). Indeed, `my_decorator(foo)` when foo is a callable will always look like `@my_decorator` applied to function foo, because that's how the language is designed. So there is always one remaining ambiguous case to cover, and I have to rely on some ugly tricks such as asking users for a custom disambiguator. So if the decision is to let community-provided libraries like `decopatch` solve this problem, would you please consider providing us with the missing information? For example, a new function in the `inspect` package could be `inspect.is_decorator_call(frame)`, which would return True if and only if the given frame is a decorator usage call as in `@my_decorator`. That piece would be enough - I would gladly take care of the rest in `decopatch`. Thanks for the feedback! Sylvain -----Original Message----- From: Stephen J.
Turnbull Sent: Friday, 15 March 2019 04:56 To: Sylvain MARIE Cc: David Mertz ; python-ideas Subject: Re: [Python-ideas] Problems (and solutions?) in writing decorators [External email: Use caution with links and attachments] ________________________________ Sylvain MARIE via Python-ideas writes: > I totally understand your point of view. However on the other hand, > many very popular open source projects out there have the opposite > point of view and provide decorators that can seamlessly be used > with and without arguments (pytest, attrs, click, etc.). So after a > while users get used to this behavior and expect it from all > libraries. Making it easy to implement is therefore something quite > important for developers not to spend time on this useless > 'feature'. That doesn't follow. You can also take it that "educating users to know the difference between a decorator and a decorator factory is therefore something quite important for developers not to spend time on this useless 'feature'." I'm not a fan of either position. I don't see why developers of libraries who want to provide this to their users shouldn't have "an easy way to do it", but I also don't see a good reason to encourage syntactic ambiguity by providing it in the standard library. I think this is a feature that belongs in the area of "you *could* do it, but *should* you?" If the answer is "maybe", IMO PyPI is the right solution for distribution.
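For illustration, the common (and imperfect) workaround for the ambiguity discussed in this thread — a decorator usable both bare and with arguments — tests whether its first positional argument is callable. This is a generic sketch with hypothetical names (`log_calls`, `prefix`), not decopatch's actual approach:

```python
import functools

def log_calls(func=None, *, prefix="call"):
    """Usable both as @log_calls and as @log_calls(prefix=...)."""
    def decorate(f):
        @functools.wraps(f)
        def wrapper(*args, **kwargs):
            print(f"{prefix}: {f.__name__}")
            return f(*args, **kwargs)
        return wrapper
    if func is not None and callable(func):
        return decorate(func)   # used bare: @log_calls
    return decorate             # used as a factory: @log_calls(prefix=...)

@log_calls
def f():
    return 1

@log_calls(prefix="tracing")
def g():
    return 2
```

Note that this is exactly the ambiguity Sylvain describes: a plain call `log_calls(some_function)` is indistinguishable from the bare-decorator usage, which is why keyword-only arguments (Chris's suggestion) make the pattern safer.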
Steve -- Associate Professor Division of Policy and Planning Science http://turnbull.sk.tsukuba.ac.jp/ Faculty of Systems and Information Email: turnbull at sk.tsukuba.ac.jp University of Tsukuba Tel: 029-853-5175 Tennodai 1-1-1, Tsukuba 305-8573 JAPAN ______________________________________________________________________ This email has been scanned by the Symantec Email Security.cloud service. ______________________________________________________________________ From rosuav at gmail.com Tue Mar 19 09:41:18 2019 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 20 Mar 2019 00:41:18 +1100 Subject: [Python-ideas] Problems (and solutions?) in writing decorators In-Reply-To: References: <20181024203512.GI3817@ando.pearwood.info> <5BD23DE1.1000909@canterbury.ac.nz> <20190312113010.GQ12502@ando.pearwood.info> <23691.8773.311344.738472@turnbull.sk.tsukuba.ac.jp> Message-ID: On Wed, Mar 20, 2019 at 12:37 AM Sylvain MARIE via Python-ideas wrote: > > Stephen > > > If the answer is "maybe", IMO PyPI is the right solution for distribution. > > Very wise words, I understand this point. > However as of today it is *not* possible to write such a library in a complete way, without an additional tool from the language itself. Trust me, I tried very hard :). Indeed `my_decorator(foo)` when foo is a callable will always look like `@my_decorator` applied to function foo, because that's how the language is designed. So there is always one remaining ambiguous case to cover, and I have to rely on some ugly tricks such as asking users for a custom disambiguator. > Fair point.
Though the custom disambiguator could be as simple as using a keyword argument - "my_decorator(x=foo)" is not going to look like "@my_decorator \n def foo". From tim.mitchell.chch at gmail.com Tue Mar 19 17:24:46 2019 From: tim.mitchell.chch at gmail.com (Tim Mitchell) Date: Wed, 20 Mar 2019 10:24:46 +1300 Subject: [Python-ideas] function annotation enhancement Message-ID: I would like to propose an enhancement to function annotations. Here is the motivating use case: When using callbacks I would like to declare the signature once as a type alias and use it to type hint both the function accepting the callback and the callbacks themselves. Currently I can declare the function signature CallbackType = Callable[[int, str], Any] and use it for the function/method accepting the callback def my_func(callback: CallbackType): pass however it does not work for the callback itself, I have to repeat myself def my_callback(a: int, b: str) -> None: pass and if I change the signature in CallbackType the typechecker has to know that my_callback will be passed to my_func in order to detect the error. I propose to add a new syntax that declares the type of the function after the function name. def my_callback: CallbackType(a, b): pass any further parameter annotations would be disallowed: def my_callback: CallbackType(a: int, b: str): # Syntax error - duplicate annotations pass If the function parameters do not match the type signature, type hinters would flag this as a type mismatch. def my_callback: CallbackType(a): # type mismatch pass My original thought was that if CallbackType was not a Callable this would also be a type error. However I have since realized this could be used for declaring the return type of properties: For example class MyClass(object): @property def my_prop: int(self): return 10 c = MyClass() Then c.my_prop would be type hinted as an integer. What do people think? Cheers Tim -------------- next part -------------- An HTML attachment was scrubbed...
URL: From jelle.zijlstra at gmail.com Tue Mar 19 17:32:43 2019 From: jelle.zijlstra at gmail.com (Jelle Zijlstra) Date: Tue, 19 Mar 2019 14:32:43 -0700 Subject: [Python-ideas] function annotation enhancement In-Reply-To: References: Message-ID: Your proposed syntax is hard to implement, because it would require invasive syntax changes to the language itself. That's probably not worth it. However, there are other ways to achieve what you're looking for that don't require changing the language itself. This issue has some proposals: https://github.com/python/mypy/issues/1641. On Tue, 19 Mar 2019 at 14:25, Tim Mitchell (< tim.mitchell.chch at gmail.com>) wrote: > I would like to propose an enhancement to function annotations. Here is > the motivating use case: > When using callbacks I would like to declare the signature once as a type > alias and use it to type hint both the function accepting the callback and > the callbacks themselves. > > Currently I can declare the function signature > > CallbackType = Callable[[int, str], Any] > > and use it for the function/method accepting the callback > > def my_func(callback: CallbackType): > pass > > however it does not work for the callback itself, I have to repeat myself > > def my_callback(a: int, b: str) -> None: > pass > > and if I change the signature in CallbackType the typechecker has to know > that my_callback will be passed to my_func in order to detect the error. > > I propose to add a new syntax that declares the type of the function after > the function name. > > def my_callback: CallbackType(a, b): > pass > > any further parameter annotations would be disallowed: > > def my_callback: CallbackType(a: int, b: str): # Syntax error - > duplicate annotations > pass > > > If the function parameters do not match the type signature, type hinters > would flag this as a type mismatch.
> > def my_callback: CallbackType(a): # type mismatch > pass > > > My original thought was that if CallbackType was not a Callable this would > also be a type error. > However I have since realized this could be used for declaring the return > type of properties: > For example > > class MyClass(object): > @property > def my_prop: int(self) > return 10 > > c = MyClass() > Then c.my_prop would be type hinted as an integer. > > What do people think? > Cheers > Tim > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg.ewing at canterbury.ac.nz Tue Mar 19 18:05:41 2019 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 20 Mar 2019 11:05:41 +1300 Subject: [Python-ideas] Why operators are useful In-Reply-To: <20190319140016.58437f52@fsol> References: <5C8CB629.9040303@canterbury.ac.nz> <195c22b5-5a56-87a7-69e9-b376f51c5e14@Damon-Family.org> <5C8D8A8F.2040104@canterbury.ac.nz> <5C901275.8030101@canterbury.ac.nz> <20190319140016.58437f52@fsol> Message-ID: <5C9167B5.6090404@canterbury.ac.nz> Antoine Pitrou wrote: > You are being idealistic here. MyPy relies on typing hints being > available, and sufficiently precise. Yes, but it doesn't require type hints for *everything*. Given enough starting points, it can figure out the rest. Mathematicians rely heavily on their readers being able to do the same thing. -- Greg From greg.ewing at canterbury.ac.nz Tue Mar 19 18:18:21 2019 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 20 Mar 2019 11:18:21 +1300 Subject: [Python-ideas] Problems (and solutions?) 
in writing decorators In-Reply-To: References: <5BD23DE1.1000909@canterbury.ac.nz> <20190312113010.GQ12502@ando.pearwood.info> <23691.8773.311344.738472@turnbull.sk.tsukuba.ac.jp> Message-ID: <5C916AAD.6070509@canterbury.ac.nz> Sylvain MARIE via Python-ideas wrote: > `my_decorator(foo)` when foo is a callable will always look like > `@my_decorator` applied to function foo, because that's how the language is > designed. I don't think it's worth doing anything to change that. Everywhere else in the language, there's a very clear distinction between 'foo' and 'foo()', and you confuse them at your peril. I don't see why decorators should be any different. -- Greg From cs at cskk.id.au Tue Mar 19 21:02:39 2019 From: cs at cskk.id.au (Cameron Simpson) Date: Wed, 20 Mar 2019 12:02:39 +1100 Subject: [Python-ideas] True and False are singletons In-Reply-To: References: Message-ID: <20190320010239.GA12727@cskk.homeip.net> On 18Mar2019 08:10, Eric Fahlgren wrote: >On Mon, Mar 18, 2019 at 7:04 AM Rhodri James wrote: >> On 18/03/2019 12:19, Richard Damon wrote: >> > On 3/18/19 7:27 AM, Greg Ewing wrote: >> >> Juancarlo Añez wrote: >> >> >> >>> if settings[MY_KEY] is True: >> >>> ... >> >> >> >> If I saw code like this, it would take a really good argument to >> >> convince me that it shouldn't be just >> >> >> >> if settings[MY_KEY]: >> >> ... >> >> >> > That means something VERY different. The first asks if the item is >> > specifically the True value, while the second just asks if the value is >> > Truthy, it would be satisfied also for values like 1. >> >> Yes. And the latter is what people almost always mean. >No, it depends heavily on the context. In GUI code, Oleg's example >(tri-state checkbox) is a pervasive idiom. There's lots of code that says >"if x is True" or "if x is False" or "if x is None" and that's a very clear >indicator that you are dealing with these "booleans that can also be >'unset'".
Yeah, but on a personal basis I would generally write such an idiom as "if x is None: ... elif x: truthy-action else: falsey-action" i.e. only rely on a singleton for the "not set" sentinel (usually None, occasionally something else). Cheers, Cameron Simpson From pythonchb at gmail.com Wed Mar 20 01:50:00 2019 From: pythonchb at gmail.com (Christopher Barker) Date: Tue, 19 Mar 2019 19:50:00 -1000 Subject: [Python-ideas] Problems (and solutions?) in writing decorators In-Reply-To: <5C916AAD.6070509@canterbury.ac.nz> References: <5BD23DE1.1000909@canterbury.ac.nz> <20190312113010.GQ12502@ando.pearwood.info> <23691.8773.311344.738472@turnbull.sk.tsukuba.ac.jp> <5C916AAD.6070509@canterbury.ac.nz> Message-ID: Also: @something def fun(): ... Is exactly the same as: def fun(): ... fun = something(fun) So you can't make a distinction based on whether a given usage is as a decoration. -CHB On Tue, Mar 19, 2019 at 12:26 PM Greg Ewing wrote: > Sylvain MARIE via Python-ideas wrote: > > `my_decorator(foo)` when foo is a callable will always look like > > `@my_decorator` applied to function foo, because that's how the language > is > > designed. > > I don't think it's worth doing anything to change that. Everywhere > else in the language, there's a very clear distinction between > 'foo' and 'foo()', and you confuse them at your peril. I don't see > why decorators should be any different. > > -- > Greg > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- Christopher Barker, PhD Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython -------------- next part -------------- An HTML attachment was scrubbed...
URL: From sylvain.marie at se.com Wed Mar 20 12:30:39 2019 From: sylvain.marie at se.com (Sylvain MARIE) Date: Wed, 20 Mar 2019 16:30:39 +0000 Subject: [Python-ideas] Problems (and solutions?) in writing decorators In-Reply-To: References: <5BD23DE1.1000909@canterbury.ac.nz> <20190312113010.GQ12502@ando.pearwood.info> <23691.8773.311344.738472@turnbull.sk.tsukuba.ac.jp> <5C916AAD.6070509@canterbury.ac.nz> Message-ID: All of your answers are true. > (Chris) > "my_decorator(x=foo)" is not going to look like "@my_decorator \n def foo". That's one of the many ways that `decopatch` uses to perform the disambiguation, indeed. But that's already the user helping the lib, not the other way round. > (Christopher) > @something applied on def fun is exactly the same as something(fun) True. However, applying a decorator manually is already for advanced users, so users of a decopatch-ed decorator will not mind calling something()(fun). In fact it is consistent with when they use it with arguments: something(arg)(fun). Another criterion is: how easy would it be to implement an inspect.is_decorator_call(frame) function returning True if and only if frame is the @ statement? If that's fairly easy, well, I'm pretty sure that this is good stuff. To a naïve user, not accustomed to the long history of this language, it is very strange that an operator such as @ (the decorator one, not the other one) is completely not detectable by code, while there are so many hooks available in python for all other operators (+, -, etc.). Eventually that's obviously your call; I'm just there to give feedback from what I see of the python libs development community. -- Sylvain From: Python-ideas On Behalf Of Christopher Barker Sent: Wednesday, 20 March 2019 06:50 To: Greg Ewing Cc: python-ideas Subject: Re: [Python-ideas] Problems (and solutions?) in writing decorators [External email: Use caution with links and attachments] ________________________________ Also: @something def fun(): ...
Is exactly the same as: def fun(): ... fun = something(fun) So you can't make a distinction based on whether a given usage is as a decoration. -CHB On Tue, Mar 19, 2019 at 12:26 PM Greg Ewing > wrote: Sylvain MARIE via Python-ideas wrote: > `my_decorator(foo)` when foo is a callable will always look like > `@my_decorator` applied to function foo, because that's how the language is > designed. I don't think it's worth doing anything to change that. Everywhere else in the language, there's a very clear distinction between 'foo' and 'foo()', and you confuse them at your peril. I don't see why decorators should be any different. -- Greg _______________________________________________ Python-ideas mailing list Python-ideas at python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/ -- Christopher Barker, PhD Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython -------------- next part -------------- An HTML attachment was scrubbed... URL: From eryksun at gmail.com Wed Mar 20 18:19:02 2019 From: eryksun at gmail.com (eryk sun) Date: Wed, 20 Mar 2019 17:19:02 -0500 Subject: [Python-ideas] Add subprocess.Popen suspend() and resume() In-Reply-To: References: Message-ID: On 3/18/19, Giampaolo Rodola' wrote: > > I've been having these 2 implemented in psutil for a long time. On > POSIX these are convenience functions using os.kill() + SIGSTOP / > SIGCONT (the same as CTRL+Z / "fg"). On Windows they use > undocumented NtSuspendProcess and NtResumeProcess Windows > APIs available since XP. Currently, Windows Python only calls documented C runtime-library and Windows API functions.
It doesn't directly call NT runtime-library and system functions. Maybe it could in the case of documented functions, but calling undocumented functions in the standard library should be avoided. Unfortunately, without NtSuspendProcess and NtResumeProcess, I don't see a way to reliably implement this feature for Windows. I'm CC'ing Steve Dower. He might say it's okay in this case, or know of another approach. DebugActiveProcess, the other simple approach mentioned in the linked SO answer [1], is unreliable and has the wrong semantics. A process only has a single debug port, so DebugActiveProcess will fail the PID as an invalid parameter if another debugger is already attached to the process. (The underlying NT call, DbgUiDebugActiveProcess, fails with STATUS_PORT_ALREADY_SET.) Additionally, the semantics that I expect here, at least for Windows, is that each call to suspend() will require a corresponding call to resume(), since it's incrementing the suspend count on the threads; however, a debugger can't reattach to the same process. Also, if the Python process exits while it's attached as a debugger, the system will terminate the debuggee as well, unless we call DebugSetProcessKillOnExit(0), but that interferes with the Python process acting as a debugger normally, as does this entire wonky idea. Also, the debugging system creates a thread in the debuggee that calls NT DbgUiRemoteBreakin, which executes a breakpoint. This thread is waiting, but it's not suspended, so the process will never actually appear as suspended in Task Manager or Process Explorer. That leaves enumerating threads in a snapshot and calling OpenThread and SuspendThread on each thread that's associated with the process. In comparison, let's take an abridged look at the guts of NtSuspendProcess.

nt!NtSuspendProcess:
    ...
    mov     r8,qword ptr [nt!PsProcessType]
    ...
    call    nt!ObpReferenceObjectByHandleWithTag
    ...
    call    nt!PsSuspendProcess
    ...
    mov     ebx,eax
    call    nt!ObfDereferenceObjectWithTag
    mov     eax,ebx
    ...
    ret

nt!PsSuspendProcess:
    ...
    call    nt!ExAcquireRundownProtection
    cmp     al,1
    jne     nt!PsSuspendProcess+0x74
    ...
    call    nt!PsGetNextProcessThread
    xor     ebx,ebx
    jmp     nt!PsSuspendProcess+0x62
nt!PsSuspendProcess+0x4d:
    ...
    call    nt!PsSuspendThread
    ...
    call    nt!PsGetNextProcessThread
nt!PsSuspendProcess+0x62:
    ...
    test    rax,rax
    jne     nt!PsSuspendProcess+0x4d
    ...
    call    nt!ExReleaseRundownProtection
    jmp     nt!PsSuspendProcess+0x79
nt!PsSuspendProcess+0x74:
    mov     ebx,0C000010Ah (STATUS_PROCESS_IS_TERMINATING)
nt!PsSuspendProcess+0x79:
    ...
    mov     eax,ebx
    ...
    ret

This code repeatedly calls PsGetNextProcessThread to walk the non-terminated threads of the process in creation order (based on a linked list in the process object) and suspends each thread via PsSuspendThread. In contrast, a Tool-Help thread snapshot is unreliable since it won't include threads created after the snapshot is created. The alternative is to use a different undocumented system call, NtGetNextThread [2], which is implemented via PsGetNextProcessThread. But that's slightly worse than calling NtSuspendProcess.
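For contrast with the Windows internals above, the POSIX side that Giampaolo describes needs nothing undocumented. A minimal sketch with hypothetical `suspend`/`resume` helpers (not psutil's actual code):

```python
import os
import signal
import subprocess

def suspend(proc):
    """Stop the child process; SIGSTOP cannot be caught or ignored."""
    os.kill(proc.pid, signal.SIGSTOP)

def resume(proc):
    os.kill(proc.pid, signal.SIGCONT)

proc = subprocess.Popen(["sleep", "60"])
suspend(proc)

# WUNTRACED lets the parent observe the stop without reaping the child.
_, status = os.waitpid(proc.pid, os.WUNTRACED)
assert os.WIFSTOPPED(status)

# A stopped process is not terminated; poll() still reports it running.
assert proc.poll() is None

resume(proc)
proc.terminate()  # SIGTERM, delivered now that the process is running again
proc.wait()
```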
[1]: https://stackoverflow.com/a/11010508 [2]: https://github.com/processhacker/processhacker/blob/v2.39/phnt/include/ntpsapi.h#L848 From pythonchb at gmail.com Wed Mar 20 21:46:24 2019 From: pythonchb at gmail.com (Christopher Barker) Date: Wed, 20 Mar 2019 15:46:24 -1000 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <20190316113922.36c79378@fsol> References: <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <98F8539E-D119-4299-B289-1EDA729FF10B@gmail.com> <20190305231453.GF4465@ando.pearwood.info> <20190315122021.16e8cca8@fsol> <20190315144158.GF12502@ando.pearwood.info> <20190316113922.36c79378@fsol> Message-ID: On Sat, Mar 16, 2019 at 12:39 AM Antoine Pitrou wrote: > On Sat, 16 Mar 2019 01:41:59 +1100 > Steven D'Aprano wrote: > > > Matrix multiplication is a perfect example: adding the @ operator could > > have been done in Python 0.1 if anyone had thought of it, but it took 15 > > years of numerical folk "whinging" about the lack until it happened: > > Not so perfect, as the growing use of Python for scientific computing > has made it much more useful to promote a dedicated matrix > multiplication operator than, say, 15 or 20 years ago. > There's more to it than that, really, but not really relevant here... > This is precisely why I worded my question this way: what has changed > in the last 20 years that make a "+" dict operator more compelling > today than it was? Do we merge dicts much more frequently than we > did? The analogy doesn't hold because @ was a new operator -- a MUCH bigger change than simply defining the use of + (or | ) for dicts. I wouldn't mind the new operator if its meaning was clear-cut. But > here we have potential for confusion, both for writers and readers of > code.
but it's NOT a new operator, it is making use of an existing one, and sure you could guess at a couple meanings, but the merge one is probably one of the most obvious to guess, and one quick test and you know -- I really can't see it being an ongoing source of confusion. -CHB -- Christopher Barker, PhD Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython -------------- next part -------------- An HTML attachment was scrubbed...
> > > > but it's NOT a new operator, it is making use of an existing one, and sure > you could guess at a couple meanings, but the merge one is probably one of > the most obvious to guess, and one quick test and you know -- I really > can't see it being a ongoing source of confusion. Did you actually read what I said? The problem is not to understand what dict.__add__ does. It's to understand what code using the + operator does, without knowing upfront whether the inputs are dicts. Regards Antoine. From rosuav at gmail.com Thu Mar 21 08:35:36 2019 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 21 Mar 2019 23:35:36 +1100 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <20190321123437.2c66f3ab@fsol> References: <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <98F8539E-D119-4299-B289-1EDA729FF10B@gmail.com> <20190305231453.GF4465@ando.pearwood.info> <20190315122021.16e8cca8@fsol> <20190315144158.GF12502@ando.pearwood.info> <20190316113922.36c79378@fsol> <20190321123437.2c66f3ab@fsol> Message-ID: On Thu, Mar 21, 2019 at 10:35 PM Antoine Pitrou wrote: > > but it's NOT a new operator, it is making use of an existing one, and sure > > you could guess at a couple meanings, but the merge one is probably one of > > the most obvious to guess, and one quick test and you know -- I really > > can't see it being a ongoing source of confusion. > > Did you actually read what I said? The problem is not to understand > what dict.__add__ does. It's to understand what code using the + > operator does, without knowing upfront whether the inputs are dicts. The + operator adds two things together. I don't understand the issue here. You can add integers: 1 + 2 == 3 You can add floats: 0.5 + 1.25 == 1.75 You can add lists: [1,2] + [3,4] == [1,2,3,4] You can add strings: "a" + "b" == "ab" And soon you'll be able to add dictionaries. 
The exact semantics need to be defined, but it's not fundamentally changing how you interpret the + operator. I don't understand the panic here - or rather, I don't understand why it's happening NOW, not back when lists got the ability to be added (if that wasn't in the very first release). Conversely, if it's the | operator, it's a matter of merging, and the same is true. You can merge integers, treating them as bit sets. You can merge sets. And now you'll be able to merge dictionaries. Same same. ChrisA From solipsis at pitrou.net Thu Mar 21 08:44:49 2019 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 21 Mar 2019 13:44:49 +0100 Subject: [Python-ideas] PEP: Dict addition and subtraction References: <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <98F8539E-D119-4299-B289-1EDA729FF10B@gmail.com> <20190305231453.GF4465@ando.pearwood.info> <20190315122021.16e8cca8@fsol> <20190315144158.GF12502@ando.pearwood.info> <20190316113922.36c79378@fsol> <20190321123437.2c66f3ab@fsol> Message-ID: <20190321134449.35021172@fsol> On Thu, 21 Mar 2019 23:35:36 +1100 Chris Angelico wrote: > On Thu, Mar 21, 2019 at 10:35 PM Antoine Pitrou wrote: > > > but it's NOT a new operator, it is making use of an existing one, and sure > > > you could guess at a couple meanings, but the merge one is probably one of > > > the most obvious to guess, and one quick test and you know -- I really > > > can't see it being a ongoing source of confusion. > > > > Did you actually read what I said? The problem is not to understand > > what dict.__add__ does. It's to understand what code using the + > > operator does, without knowing upfront whether the inputs are dicts. > > The + operator adds two things together. I don't understand the issue here. I'm not expecting you to understand, either. Regards Antoine. 
From rosuav at gmail.com Thu Mar 21 08:51:12 2019 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 21 Mar 2019 23:51:12 +1100 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <20190321134449.35021172@fsol> References: <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <98F8539E-D119-4299-B289-1EDA729FF10B@gmail.com> <20190305231453.GF4465@ando.pearwood.info> <20190315122021.16e8cca8@fsol> <20190315144158.GF12502@ando.pearwood.info> <20190316113922.36c79378@fsol> <20190321123437.2c66f3ab@fsol> <20190321134449.35021172@fsol> Message-ID: On Thu, Mar 21, 2019 at 11:45 PM Antoine Pitrou wrote: > > On Thu, 21 Mar 2019 23:35:36 +1100 > Chris Angelico wrote: > > On Thu, Mar 21, 2019 at 10:35 PM Antoine Pitrou wrote: > > > > but it's NOT a new operator, it is making use of an existing one, and sure > > > > you could guess at a couple meanings, but the merge one is probably one of > > > > the most obvious to guess, and one quick test and you know -- I really > > > > can't see it being a ongoing source of confusion. > > > > > > Did you actually read what I said? The problem is not to understand > > > what dict.__add__ does. It's to understand what code using the + > > > operator does, without knowing upfront whether the inputs are dicts. > > > > The + operator adds two things together. I don't understand the issue here. > > I'm not expecting you to understand, either. > ... then, in the interests of productive discussion, could you please explain? What is it about dict addition that makes it harder to understand than other addition? 
ChrisA From solipsis at pitrou.net Thu Mar 21 08:56:44 2019 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 21 Mar 2019 13:56:44 +0100 Subject: [Python-ideas] PEP: Dict addition and subtraction References: <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <98F8539E-D119-4299-B289-1EDA729FF10B@gmail.com> <20190305231453.GF4465@ando.pearwood.info> <20190315122021.16e8cca8@fsol> <20190315144158.GF12502@ando.pearwood.info> <20190316113922.36c79378@fsol> <20190321123437.2c66f3ab@fsol> <20190321134449.35021172@fsol> Message-ID: <20190321135644.32ef4d8d@fsol> On Thu, 21 Mar 2019 23:51:12 +1100 Chris Angelico wrote: > On Thu, Mar 21, 2019 at 11:45 PM Antoine Pitrou wrote: > > > > On Thu, 21 Mar 2019 23:35:36 +1100 > > Chris Angelico wrote: > > > On Thu, Mar 21, 2019 at 10:35 PM Antoine Pitrou wrote: > > > > > but it's NOT a new operator, it is making use of an existing one, and sure > > > > > you could guess at a couple meanings, but the merge one is probably one of > > > > > the most obvious to guess, and one quick test and you know -- I really > > > > > can't see it being a ongoing source of confusion. > > > > > > > > Did you actually read what I said? The problem is not to understand > > > > what dict.__add__ does. It's to understand what code using the + > > > > operator does, without knowing upfront whether the inputs are dicts. > > > > > > The + operator adds two things together. I don't understand the issue here. > > > > I'm not expecting you to understand, either. > > > > ... then, in the interests of productive discussion, could you please > explain? What is it about dict addition that makes it harder to > understand than other addition? "Productive discussion" is something that requires mutual implication. Asking me to repeat exactly what I spelled out above (and that you even quoted) is not productive. Regards Antoine. 
From storchaka at gmail.com Thu Mar 21 09:16:44 2019 From: storchaka at gmail.com (Serhiy Storchaka) Date: Thu, 21 Mar 2019 15:16:44 +0200 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <98F8539E-D119-4299-B289-1EDA729FF10B@gmail.com> <20190305231453.GF4465@ando.pearwood.info> <20190315122021.16e8cca8@fsol> <20190315144158.GF12502@ando.pearwood.info> <20190316113922.36c79378@fsol> <20190321123437.2c66f3ab@fsol> <20190321134449.35021172@fsol> Message-ID: 21.03.19 14:51, Chris Angelico wrote: > ... then, in the interests of productive discussion, could you please > explain? What is it about dict addition that makes it harder to > understand than other addition? Currently the + operator has 2 meanings for builtin types (both are widely used), after adding it for dicts it will have 3 meanings. 3 > 2, is not? From rosuav at gmail.com Thu Mar 21 09:24:58 2019 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 22 Mar 2019 00:24:58 +1100 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <98F8539E-D119-4299-B289-1EDA729FF10B@gmail.com> <20190305231453.GF4465@ando.pearwood.info> <20190315122021.16e8cca8@fsol> <20190315144158.GF12502@ando.pearwood.info> <20190316113922.36c79378@fsol> <20190321123437.2c66f3ab@fsol> <20190321134449.35021172@fsol> Message-ID: On Fri, Mar 22, 2019 at 12:17 AM Serhiy Storchaka wrote: > > 21.03.19 14:51, Chris Angelico wrote: > > ... then, in the interests of productive discussion, could you please > > explain? What is it about dict addition that makes it harder to > > understand than other addition? > > Currently the + operator has 2 meanings for builtin types (both are > widely used), after adding it for dicts it will have 3 meanings. > > 3 > 2, is not?
I suppose you could call it two (numeric addition and sequence concatenation), but there are subtleties to the way that lists concatenate that don't apply to strings (esp since lists are mutable), so I'd call it at least three already. And what about non-builtin types? You can add two numpy arrays and it does pairwise addition, quite different from how lists add together. But in every case, the + operator means "add these things together". It will be the same with dicts: you add the dicts together. Antoine has stated that the problem is NOT understanding what dict.__add__ does, so I am at a loss as to what the problem IS. We *already* have many different definitions of "add", according to the data types involved. That is exactly what polymorphism is for. Why is it such a bad thing for a dict? Now, my own opinion is that | would be a better operator than +, but it's only a weak preference, and I'd be happy with either. Also, to my understanding, the concerns about "what does addition mean" apply identically to "what does Or mean", but as we've already seen, my understanding doesn't currently extend as far as comprehending this issue. Hence asking. 
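[Editorial note: the point about operand types can be made concrete without pulling in NumPy. The `Vector` class below is a hypothetical stand-in for an array-like type; it shows the same + symbol dispatching to quite different operations.]

```python
class Vector:
    """Toy stand-in for an array-like type whose + is element-wise."""
    def __init__(self, values):
        self.values = list(values)

    def __add__(self, other):
        # Pairwise addition, like NumPy arrays -- not concatenation.
        return Vector(a + b for a, b in zip(self.values, other.values))

print([1, 2] + [3, 4])                           # [1, 2, 3, 4] (concatenation)
print((Vector([1, 2]) + Vector([3, 4])).values)  # [4, 6] (element-wise)
```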
ChrisA From rhodri at kynesim.co.uk Thu Mar 21 09:01:55 2019 From: rhodri at kynesim.co.uk (Rhodri James) Date: Thu, 21 Mar 2019 13:01:55 +0000 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <20190321123437.2c66f3ab@fsol> References: <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <98F8539E-D119-4299-B289-1EDA729FF10B@gmail.com> <20190305231453.GF4465@ando.pearwood.info> <20190315122021.16e8cca8@fsol> <20190315144158.GF12502@ando.pearwood.info> <20190316113922.36c79378@fsol> <20190321123437.2c66f3ab@fsol> Message-ID: <14b2e442-26ab-4f8d-4a5b-b967ae12d208@kynesim.co.uk> On 21/03/2019 11:34, Antoine Pitrou wrote: > On Wed, 20 Mar 2019 15:46:24 -1000 > Christopher Barker > wrote: >> >>> This is precisely why I worded my question this way: what has changed >>> in the last 20 years that make a "+" dict operator more compelling >>> today than it was? Do we merge dicts much more frequently than we >>> did? >> >> The analogy doesn't hold because @ was a new operator -- a MUCH bigger >> change than dimply defining the use of + (or | ) for dicts. > > But it's less disruptive when reading code, because "x @ y" is > unambiguous: it's a matrix multiplication. "x + y" can be many > different things, and now it can be one more thing. "x @ y" is unambiguous once you know what it means. Until then, it's just mysterious. >> I wouldn't mind the new operator if its meaning was clear-cut. But >>> here we have potential for confusion, both for writers and readers of >>> code. >>> >> >> but it's NOT a new operator, it is making use of an existing one, and sure >> you could guess at a couple meanings, but the merge one is probably one of >> the most obvious to guess, and one quick test and you know -- I really >> can't see it being a ongoing source of confusion. > > Did you actually read what I said? The problem is not to understand > what dict.__add__ does. 
It's to understand what code using the + > operator does, without knowing upfront whether the inputs are dicts. Welcome to polymorphism. -- Rhodri James *-* Kynesim Ltd From neatnate at gmail.com Thu Mar 21 09:32:21 2019 From: neatnate at gmail.com (Nathan Schneider) Date: Thu, 21 Mar 2019 09:32:21 -0400 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <98F8539E-D119-4299-B289-1EDA729FF10B@gmail.com> <20190305231453.GF4465@ando.pearwood.info> <20190315122021.16e8cca8@fsol> <20190315144158.GF12502@ando.pearwood.info> <20190316113922.36c79378@fsol> <20190321123437.2c66f3ab@fsol> <20190321134449.35021172@fsol> Message-ID: On Thu, Mar 21, 2019 at 9:17 AM Serhiy Storchaka wrote: > 21.03.19 14:51, Chris Angelico ????: > > ... then, in the interests of productive discussion, could you please > > explain? What is it about dict addition that makes it harder to > > understand than other addition? > > Currently the + operator has 2 meanings for builtin types (both are > widely used), after adding it for dicts it will have 3 meanings. > > 3 > 2, is not? > It depends how abstractly you define the "meanings". If you define + as "arithmetic addition" and "sequence concatenation", then yes, there are 2. But novices have to learn that the same concatenation operator applies to strings as well as lists/tuples. And when reading x + y, it is probably relevant whether x and y are numbers, strings, or sequence containers like lists. The proposal would generalize "sequence concatenation" to something like "asymmetric sequence/collection combination". (Asymmetric because d1 + d2 may not equal d2 + d1.) It seems a natural extension to me, though the | alternative is also reasonable (interpreted as taking the OR of keys in the two dicts; but unlike unioning two sets, the dict-merge operator would be asymmetric). 
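[Editorial note: the asymmetry mentioned above is easy to demonstrate with today's unpacking syntax, which is what the proposed operator is modelled on.]

```python
d1 = {"a": 1, "b": 2}
d2 = {"b": 30, "c": 40}

# The right-hand operand wins on duplicate keys, so order matters:
print({**d1, **d2})  # {'a': 1, 'b': 30, 'c': 40}
print({**d2, **d1})  # {'b': 2, 'c': 40, 'a': 1}

# Contrast with set union, which is symmetric on the keys:
print(set(d1) | set(d2) == set(d2) | set(d1))  # True
```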
The third proposed alternative, <<, has no "baggage" from an existing use as a combination operator, but at the same time it is a more obscure choice. -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Thu Mar 21 10:43:34 2019 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 21 Mar 2019 15:43:34 +0100 Subject: [Python-ideas] dict.merge(d1, d2, ...) (Counter proposal for PEP 584) References: Message-ID: <20190321154334.70fc245a@fsol> On Tue, 5 Mar 2019 16:39:40 +0900 INADA Naoki wrote: > I think some people in favor of PEP 584 just want > single expression for merging dicts without in-place update. > > But I feel it's abuse of operator overload. I think functions > and methods are better than operator unless the operator > has good math metaphor, or very frequently used as concatenate > strings. > > This is why function and methods are better: > > * Easy to search. > * Name can describe it's behavior better than abused operator. > * Simpler lookup behavior. (e.g. subclass and __iadd__) > > Then, I propose `dict.merge` method. It is outer-place version > of `dict.update`, but accepts multiple dicts. (dict.update() > can be updated to accept multiple dicts, but it's not out of scope). > > * d = d1.merge(d2) # d = d1.copy(); d.update(d2) One should also be able to write `d = dict.merge(d1, d2, ...)` If dict merging is important enough to get a new spelling, then I think this proposal is the best: explicit, unambiguous, immediately understandable and easy to remember. Regards Antoine. 
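[Editorial note: a sketch of the proposed behaviour as a plain function. The name `merge` and its variadic signature are part of the proposal under discussion, not an existing dict API.]

```python
def merge(*dicts):
    """Return a new dict combining the arguments, later ones winning."""
    result = {}
    for d in dicts:
        result.update(d)  # accepts any mapping, like dict.update
    return result

print(merge({"a": 1}, {"a": 2, "b": 3}, {"c": 4}))
# {'a': 2, 'b': 3, 'c': 4}
```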
From storchaka at gmail.com Thu Mar 21 10:44:01 2019 From: storchaka at gmail.com (Serhiy Storchaka) Date: Thu, 21 Mar 2019 16:44:01 +0200 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <98F8539E-D119-4299-B289-1EDA729FF10B@gmail.com> <20190305231453.GF4465@ando.pearwood.info> <20190315122021.16e8cca8@fsol> <20190315144158.GF12502@ando.pearwood.info> <20190316113922.36c79378@fsol> <20190321123437.2c66f3ab@fsol> <20190321134449.35021172@fsol> Message-ID: 21.03.19 15:24, Chris Angelico wrote: > On Fri, Mar 22, 2019 at 12:17 AM Serhiy Storchaka wrote: >> >> 21.03.19 14:51, Chris Angelico wrote: >>> ... then, in the interests of productive discussion, could you please >>> explain? What is it about dict addition that makes it harder to >>> understand than other addition? >> >> Currently the + operator has 2 meanings for builtin types (both are >> widely used), after adding it for dicts it will have 3 meanings. >> >> 3 > 2, is not? > > I suppose you could call it two (numeric addition and sequence > concatenation), but there are subtleties to the way that lists > concatenate that don't apply to strings (esp since lists are mutable), > so I'd call it at least three already. I do not understand what these subtleties are that make you treat list concatenation differently from string concatenation. Could you please explain? In any case, it does not matter how you count meanings, n + 1 > n. From guido at python.org Thu Mar 21 12:11:18 2019 From: guido at python.org (Guido van Rossum) Date: Thu, 21 Mar 2019 09:11:18 -0700 Subject: [Python-ideas] dict.merge(d1, d2, ...)
(Counter proposal for PEP 584) In-Reply-To: <20190321154334.70fc245a@fsol> References: <20190321154334.70fc245a@fsol> Message-ID: On Thu, Mar 21, 2019 at 7:45 AM Antoine Pitrou wrote: > On Tue, 5 Mar 2019 16:39:40 +0900 > INADA Naoki > wrote: > > I think some people in favor of PEP 584 just want > > single expression for merging dicts without in-place update. > > > > But I feel it's abuse of operator overload. I think functions > > and methods are better than operator unless the operator > > has good math metaphor, or very frequently used as concatenate > > strings. > > > > This is why function and methods are better: > > > > * Easy to search. > > * Name can describe it's behavior better than abused operator. > > * Simpler lookup behavior. (e.g. subclass and __iadd__) > > > > Then, I propose `dict.merge` method. It is outer-place version > > of `dict.update`, but accepts multiple dicts. (dict.update() > > can be updated to accept multiple dicts, but it's not out of scope). > > > > * d = d1.merge(d2) # d = d1.copy(); d.update(d2) > > One should also be able to write `d = dict.merge(d1, d2, ...)` > > If dict merging is important enough to get a new spelling, then I think > this proposal is the best: explicit, unambiguous, immediately > understandable and easy to remember. > I don't find it easy to understand or remember that d1.update(d2) modifies d1 in place, while d1.merge(d2) first copies d1. Maybe the name can indicate the copying stronger? Like we did with sorting: l.sort() sorts in-place, while sorted(l) returns a sorted copy. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From J.Demeyer at UGent.be Thu Mar 21 12:13:20 2019 From: J.Demeyer at UGent.be (Jeroen Demeyer) Date: Thu, 21 Mar 2019 17:13:20 +0100 Subject: [Python-ideas] dict.merge(d1, d2, ...) 
(Counter proposal for PEP 584) In-Reply-To: References: <20190321154334.70fc245a@fsol> Message-ID: <5C93B820.8030905@UGent.be> On 2019-03-21 17:11, Guido van Rossum wrote: > I don't find it easy to understand or remember that d1.update(d2) > modifies d1 in place, while d1.merge(d2) first copies d1. That would be an advantage with + versus += (or | versus |=). From steve at pearwood.info Thu Mar 21 12:14:05 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 22 Mar 2019 03:14:05 +1100 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190315122021.16e8cca8@fsol> <20190315144158.GF12502@ando.pearwood.info> <20190316113922.36c79378@fsol> <20190321123437.2c66f3ab@fsol> <20190321134449.35021172@fsol> Message-ID: <20190321161405.GN12502@ando.pearwood.info> On Thu, Mar 21, 2019 at 03:16:44PM +0200, Serhiy Storchaka wrote: > 21.03.19 14:51, Chris Angelico wrote: > >... then, in the interests of productive discussion, could you please > >explain? What is it about dict addition that makes it harder to > >understand than other addition? > > Currently the + operator has 2 meanings for builtin types (both are > widely used), after adding it for dicts it will have 3 meanings. Just two meanings? I get at least eight among the builtins:

- int addition;
- float addition;
- complex addition;
- string concatenation;
- list concatenation;
- tuple concatenation;
- bytes concatenation;
- bytearray concatenation.

I suppose if you cover one eye and focus on the "big picture", ignoring vital factors like "you can't add a list to a string" and "float addition and int addition aren't precisely the same", we might pretend that this is just two operations:

- numeric addition;
- sequence concatenation.

But in practice, when reading code, it's usually not enough to know that some use of the + operator means "concatenation", you need to know *what* is being concatenated. There's no point trying to add a tuple if a bytearray is required.
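[Editorial note: both halves of this argument are easy to verify interactively: the numeric additions interoperate with each other, while concatenation refuses to cross type families.]

```python
# Numeric additions interoperate across int/float/complex:
print(1 + 0.5)   # 1.5
print(1 + 2j)    # (1+2j)

# Concatenation does not cross type families:
try:
    (1, 2) + bytearray(b"ab")
except TypeError as exc:
    print(type(exc).__name__)  # TypeError

# Even bytes and str, both "string-like", refuse to mix:
try:
    b"a" + "b"
except TypeError as exc:
    print(type(exc).__name__)  # TypeError
```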
> 3 > 2, is not? Okay, but how does this make it harder to determine what a piece of code using + does? Antoine insists that *if we allow dict addition*, then we won't be able to tell what spam + eggs # for example does unless we know what spam and eggs are. This is very true. But it is *equally true today*, right now, and its been equally true going back to Python 1.0 or before. This proposed change doesn't add any uncertainty that doesn't already exist, nor will it make code that is clear today less clear tomorrow.^1 And don't forget that Python allows us to create non-builtin types that overload operators. If you don't know what spam and eggs are, you can't assume they are builtins. With operator overloading, any operator can mean literally anything at all. In practice though, this rarely becomes a serious problem. Is there a significant increase in difficulty between the current situation: # Is this addition or concatenation or something else? spam + eggs versus the proposed: # Is this addition or concatenation or merge or something else? spam + eggs Obviously there's *one more builtin* to consider, but I don't think that changes the process of understanding the meaning of the operation. I think that the problem you and Antoine fear ("dict.__add__ will make it harder to read code") requires a process that goes something like this: 1. Here's a mysterious "spam + eggs" operation we need to understand. 2. For each operation in ("numeric addition", "concatenation"): 3. assume + represents that operation; 4. if we understand the spam+eggs expression now, break If that's how we read code, then adding one more operation would make it harder to understand. We'd have to loop three times, not twice: 2. For each operation in ("numeric addition", "concatenation", "dict merging"): Three is greater than two, so we may have to do more work to understand the code. But I don't think that's how people actually read code. I think they do this: 1. 
Here's a mysterious "spam + eggs" operation we need to understand. 2. Read the code to find out what spam and eggs are. 3. Knowing what they are (tuples, lists, floats, etc) immediately tells you what the plus operator does; at worst, a programmer unfamiliar with the type may need to read the docs. Adding dict.__add__ doesn't make it any harder to work out what the operands spam and eggs are. The process we go through to determine what the operands are remains the same: - if one of operands is a literal, that gives you a strong hint that the other is the same type; - the names or context may make it clear ("header + text" probably isn't doing numeric addition); - read back through the code looking for where the variables are defined; etc. That last bit isn't always easy. People can write obfuscated, complex code using poor or misleading names. But allowing dict.__add__ doesn't make it more obfuscated or more complex. Usually the naming and context will make it clear. Most code is not terrible. At worst, there will be a transition period where people have a momentary surprise: "Wait, what, these are dicts??? How can you add dicts???" but then they will read the docs (or ask StackOverflow) and the second time they see it, it shouldn't be a surprise. ^1 That's my assertion, but if anyone has a concrete example of actual code which is self-evident today but will become ambiguous if this proposal goes ahead, please show me! -- Steven From phd at phdru.name Thu Mar 21 12:16:47 2019 From: phd at phdru.name (Oleg Broytman) Date: Thu, 21 Mar 2019 17:16:47 +0100 Subject: [Python-ideas] dict.merge(d1, d2, ...) (Counter proposal for PEP 584) In-Reply-To: References: <20190321154334.70fc245a@fsol> Message-ID: <20190321161647.sahzctujdvryvlc7@phdru.name> On Thu, Mar 21, 2019 at 09:11:18AM -0700, Guido van Rossum wrote: > I don't find it easy to understand or remember that d1.update(d2) modifies > d1 in place, while d1.merge(d2) first copies d1. 
> > Maybe the name can indicate the copying stronger? Like we did with sorting: > l.sort() sorts in-place, while sorted(l) returns a sorted copy. Then shouldn't it be a function (not a method)? dictutils.merge()? > --Guido van Rossum (python.org/~guido) Oleg. -- Oleg Broytman https://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN. From steve at pearwood.info Thu Mar 21 12:21:23 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 22 Mar 2019 03:21:23 +1100 Subject: [Python-ideas] dict.merge(d1, d2, ...) (Counter proposal for PEP 584) In-Reply-To: References: <20190321154334.70fc245a@fsol> Message-ID: <20190321162123.GO12502@ando.pearwood.info> On Thu, Mar 21, 2019 at 09:11:18AM -0700, Guido van Rossum wrote: > I don't find it easy to understand or remember that d1.update(d2) modifies > d1 in place, while d1.merge(d2) first copies d1. > > Maybe the name can indicate the copying stronger? Like we did with sorting: > l.sort() sorts in-place, while sorted(l) returns a sorted copy. How about dict.merged(*args, **kw)? Or dict.updated()? That would eliminate some of the difficulties with an operator, such as the difference between + which requires both operands to be a dict but += which can take any mapping or (key,value) iterable. -- Steven From steve at pearwood.info Thu Mar 21 12:42:00 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 22 Mar 2019 03:42:00 +1100 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <20190301162645.GM4465@ando.pearwood.info> References: <20190301162645.GM4465@ando.pearwood.info> Message-ID: <20190321164159.GC29550@ando.pearwood.info> I'd like to make a plea to people: I get it, there is now significant opposition to using the + symbol for this proposed operator. At the time I wrote the first draft of the PEP, there was virtually no opposition to it, and the | operator had very little support. This has clearly changed. 
At this point I don't think it is productive to keep making subjective claims that + will be more confusing or surprising. You've made your point that you don't like it, and the next draft^1 of the PEP will make that clear. But if you have *concrete examples* of code that currently is easy to understand, but will be harder to understand if we add dict.__add__, then please do show me! For those who oppose the + operator, it will help me if you made it clear whether it is *just* the + symbol you dislike, and would accept the | operator instead, or whether you hate the whole operator concept regardless of how it is spelled. And to those who support this PEP, code examples where a dict merge operator will help are most welcome! ^1 Coming Real Soon Now™. -- Steven From solipsis at pitrou.net Thu Mar 21 13:06:36 2019 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 21 Mar 2019 18:06:36 +0100 Subject: [Python-ideas] PEP: Dict addition and subtraction References: <20190301162645.GM4465@ando.pearwood.info> <20190321164159.GC29550@ando.pearwood.info> Message-ID: <20190321180636.41498b1b@fsol> On Fri, 22 Mar 2019 03:42:00 +1100 Steven D'Aprano wrote: > > For those who oppose the + operator, it will help me if you made it > clear whether it is *just* the + symbol you dislike, and would accept > the | operator instead, or whether you hate the whole operator concept > regardless of how it is spelled. I'd rather see a method. Dict merging just doesn't occur often enough that an operator is desirable for it. > And to those who support this PEP, code examples where a dict merge > operator will help are most welcome! Yes, I still have no idea why this operator would supposedly be useful. How many dict merges do you write per month? Regards Antoine.
From remi.lapeyre at henki.fr Thu Mar 21 13:26:04 2019 From: remi.lapeyre at henki.fr (Rémi Lapeyre) Date: Thu, 21 Mar 2019 10:26:04 -0700 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <20190321164159.GC29550@ando.pearwood.info> References: <20190301162645.GM4465@ando.pearwood.info> <20190321164159.GC29550@ando.pearwood.info> Message-ID: On 21 March 2019 at 17:43:31, Steven D'Aprano (steve at pearwood.info(mailto:steve at pearwood.info)) wrote: > I'd like to make a plea to people: > > I get it, there is now significant opposition to using the + symbol for > this proposed operator. At the time I wrote the first draft of the PEP, > there was virtually no opposition to it, and the | operator had very > little support. This has clearly changed. > > At this point I don't think it is productive to keep making subjective > claims that + will be more confusing or surprising. You've made your > point that you don't like it, and the next draft^1 of the PEP will make > that clear. > > But if you have *concrete examples* of code that currently is easy to > understand, but will be harder to understand if we add dict.__add__, > then please do show me! > > For those who oppose the + operator, it will help me if you made it > clear whether it is *just* the + symbol you dislike, and would accept > the | operator instead, or whether you hate the whole operator concept > regardless of how it is spelled. Thanks for the work you are doing on this PEP and for debunking my misconceptions regarding types, I'm currently learning a lot about them. I don't know if it matters but I'm in favor of the method > And to those who support this PEP, code examples where a dict merge > operator will help are most welcome! No matter the notation you end up choosing, I think this code: https://github.com/jpadilla/pyjwt/blob/master/jwt/utils.py#L71-L81 which is part of a widely used library to validate JWTs would greatly benefit from a new merge to merge dicts.
(This package is 78 on https://hugovk.github.io/top-pypi-packages/) Rémi > ^1 Coming Real Soon Now™. > > > -- > Steven > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From jfine2358 at gmail.com Thu Mar 21 13:42:15 2019 From: jfine2358 at gmail.com (Jonathan Fine) Date: Thu, 21 Mar 2019 17:42:15 +0000 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <20190321164159.GC29550@ando.pearwood.info> References: <20190301162645.GM4465@ando.pearwood.info> <20190321164159.GC29550@ando.pearwood.info> Message-ID: Steven D'Aprano wrote: > But if you have *concrete examples* of code that currently is easy to > understand, but will be harder to understand if we add dict.__add__, > then please do show me!

# What does this do?
>>> items.update(points)

# And what does this do?
>>> items += points

What did you get? Here's one possible context.

>>> Point = namedtuple('Point', ['x', 'y'])
>>> p, q, r = Point(1,2), Point(3, 4), Point(5, 6)
>>> points = set([p, q, r])
>>> points
{Point(x=1, y=2), Point(x=5, y=6), Point(x=3, y=4)}
>>> items = dict(a=4, b=8)

-- Jonathan From p.f.moore at gmail.com Thu Mar 21 13:43:26 2019 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 21 Mar 2019 17:43:26 +0000 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <20190321164159.GC29550@ando.pearwood.info> Message-ID: On Thu, 21 Mar 2019 at 17:27, Rémi Lapeyre wrote: > > On 21 March 2019 at 17:43:31, Steven D'Aprano > (steve at pearwood.info(mailto:steve at pearwood.info)) wrote: > > > I'd like to make a plea to people: > > > > I get it, there is now significant opposition to using the + symbol for > > this proposed operator.
At the time I wrote the first draft of the PEP, > > there was virtually no opposition to it, and the | operator had very > > little support. This has clearly changed. > > > > At this point I don't think it is productive to keep making subjective > > claims that + will be more confusing or surprising. You've made your > > point that you don't like it, and the next draft^1 of the PEP will make > > that clear. > > > > But if you have *concrete examples* of code that currently is easy to > > understand, but will be harder to understand if we add dict.__add__, > > then please do show me! > > > > For those who oppose the + operator, it will help me if you made it > > clear whether it is *just* the + symbol you dislike, and would accept > > the | operator instead, or whether you hate the whole operator concept > > regardless of how it is spelled. > > Thanks for the work you are doing on this PEP and for debunking my > misconceptions regarding types, I'm currently learning a lot about them. > > I don't know if it matters but I'm in favor of the method > > > And to those who support this PEP, code examples where a dict merge > > operator will help are most welcome! > > No matter the notation you end up choosing, I think this code: > https://github.com/jpadilla/pyjwt/blob/master/jwt/utils.py#L71-L81 > > which is part of a widely used library to validate JWTs would greatly > benefit from a new merge to merge dicts. (This package is 78 on > https://hugovk.github.io/top-pypi-packages/) It's already got a function that does the job. How much benefit is there *really* from being able to replace it with d1 + d2 once you drop support for Python < 3.8? But point taken that new code would have been able to avoid the function in the first place. ... or would it?

def merge_dict(original, updates):
    if not updates:
        return original

With the + operator, d1 + None will fail with an error. With your code, updates=None means "return the original unchanged."
Does that matter with your current code? The point is that in many real world cases, you'd write a function *anyway*, to handle corner cases, and a new operator doesn't make much difference at that point. Having said all of that, I'm mostly indifferent to the idea of having a built in "dictionary merge" capability - I doubt I'd use it *much*, but if it were there I'm sure I'd find it useful on the odd occasion. I'm somewhat against an operator, I really don't see why this couldn't be a method (although the asymmetry in d1.merge(d2) makes me have a mild preference for a class method or standalone function). I can't form an opinion between + and |, I find | significantly uglier (I tend to avoid using it for sets, in favour of the union method) but I am mildly uncomfortable with more overloading of +. Serious suggestion - why not follow the lead of sets, and have *both* an operator and a method? And if you think that's a bad idea, it would be worth considering *why* it's a bad idea for dictionaries, when it's OK for sets (and "well, I didn't like it when sets did it" isn't sufficient ;-)) And having said that, I'll go back to lurking and not really caring one way or the other. Paul From brandtbucher at gmail.com Thu Mar 21 13:51:44 2019 From: brandtbucher at gmail.com (Brandt Bucher) Date: Thu, 21 Mar 2019 10:51:44 -0700 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <20190321164159.GC29550@ando.pearwood.info> References: <20190301162645.GM4465@ando.pearwood.info> <20190321164159.GC29550@ando.pearwood.info> Message-ID: > > And to those who support this PEP, code examples where a dict merge > operator will help are most welcome! I would definitely include the example you alluded to in the operators thread:

Before:

    tmp = keep.copy()
    tmp.update(separate)
    result = function(param=tmp)
    del tmp

After:

    result = f(param=keep+separate)

Thanks for drafting the PEP for this.
There seems to be a bit of an echo in these 5+ threads, and your commentary has definitely been more constructive/original than most. Looking forward to the next revision! Brandt On Thu, Mar 21, 2019 at 9:42 AM Steven D'Aprano wrote: > I'd like to make a plea to people: > > I get it, there is now significant opposition to using the + symbol for > this proposed operator. At the time I wrote the first draft of the PEP, > there was virtually no opposition to it, and the | operator had very > little support. This has clearly changed. > > At this point I don't think it is productive to keep making subjective > claims that + will be more confusing or surprising. You've made your > point that you don't like it, and the next draft^1 of the PEP will make > that clear. > > But if you have *concrete examples* of code that currently is easy to > understand, but will be harder to understand if we add dict.__add__, > then please do show me! > > For those who oppose the + operator, it will help me if you made it > clear whether it is *just* the + symbol you dislike, and would accept > the | operator instead, or whether you hate the whole operator concept > regardless of how it is spelled. > > And to those who support this PEP, code examples where a dict merge > operator will help are most welcome! > > > > > ^1 Coming Real Soon Now?. > > > -- > Steven > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan_ml at behnel.de Thu Mar 21 13:53:35 2019 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 21 Mar 2019 18:53:35 +0100 Subject: [Python-ideas] dict.merge(d1, d2, ...) 
(Counter proposal for PEP 584) In-Reply-To: <20190321162123.GO12502@ando.pearwood.info> References: <20190321154334.70fc245a@fsol> <20190321162123.GO12502@ando.pearwood.info> Message-ID: Steven D'Aprano schrieb am 21.03.19 um 17:21: > On Thu, Mar 21, 2019 at 09:11:18AM -0700, Guido van Rossum wrote: > >> I don't find it easy to understand or remember that d1.update(d2) modifies >> d1 in place, while d1.merge(d2) first copies d1. >> >> Maybe the name can indicate the copying stronger? Like we did with sorting: >> l.sort() sorts in-place, while sorted(l) returns a sorted copy. > > How about dict.merged(*args, **kw)? Or dict.updated()? And then users would accidentally type d.updated(items) and lack the tests to detect that this didn't do anything (except wasting some time and memory). Stefan From jfine2358 at gmail.com Thu Mar 21 14:01:14 2019 From: jfine2358 at gmail.com (Jonathan Fine) Date: Thu, 21 Mar 2019 18:01:14 +0000 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <20190321164159.GC29550@ando.pearwood.info> Message-ID: R?mi Lapeyre wrote: > Not matter the notation you end up choosing, I think this code: > https://github.com/jpadilla/pyjwt/blob/master/jwt/utils.py#L71-L81 > [...] would greatly benefit from a new merge to merge dicts. I've looked at the merge_dict defined in this code. It's similar to def gapfill(self, other): # See also: https://cobrapy.readthedocs.io/en/latest/gapfilling.html # Cobra's gapfill adds items to a model, to meet a requirement. for key in other.keys(): if key not in self: self[key] = other[key] (This is code I've written, that's not yet on PyPi.) The usage is different. Instead of writing one of aaa = merge_dict(aaa, bbb) ccc = merge_dict(aaa, bbb) you write one of gapfill(aaa, bbb) aaa.gapfill(bbb) # If gapfill added to dict methods. With merge_dict, you never really know if ccc is the same object as aaa, or a different one. Sometimes this is important. 
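Run as a plain function over dicts (passing the target as `self`), the quoted `gapfill` behaves like this, with illustrative values:

```python
def gapfill(self, other):
    # Add only the keys that self is missing; existing entries win,
    # the mirror image of dict.update.
    for key in other.keys():
        if key not in self:
            self[key] = other[key]

model = {"glucose": 1.0}
defaults = {"glucose": 9.9, "oxygen": 0.5}
gapfill(model, defaults)  # in place, like dict.update, returns None
assert model == {"glucose": 1.0, "oxygen": 0.5}
```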
With gapfill, you get the same behaviour as the already familiar and loved dict.update. But of course with a different merge rule. -- Jonathan From jfine2358 at gmail.com Thu Mar 21 14:06:33 2019 From: jfine2358 at gmail.com (Jonathan Fine) Date: Thu, 21 Mar 2019 18:06:33 +0000 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <20190321164159.GC29550@ando.pearwood.info> Message-ID: Brandt Bucher wrote: > Before: > > tmp = keep.copy() > tmp.update(separate) > result = function(param=tmp) > del tmp > After: > > result = f(param=keep+separate) I'd rewrite this example as: Before: fn(param={**keep, **separate}) After: fn(param=keep + separate) -- Jonathan From turnbull.stephen.fw at u.tsukuba.ac.jp Thu Mar 21 14:10:52 2019 From: turnbull.stephen.fw at u.tsukuba.ac.jp (Stephen J. Turnbull) Date: Fri, 22 Mar 2019 03:10:52 +0900 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <9DE357FD-8512-4B97-B141-8A3896813161@gmail.com> <20190302035224.GO4465@ando.pearwood.info> <98F8539E-D119-4299-B289-1EDA729FF10B@gmail.com> <20190305231453.GF4465@ando.pearwood.info> <20190315122021.16e8cca8@fsol> <20190315144158.GF12502@ando.pearwood.info> <20190316113922.36c79378@fsol> <20190321123437.2c66f3ab@fsol> <20190321134449.35021172@fsol> Message-ID: <23699.54188.855656.947611@turnbull.sk.tsukuba.ac.jp> Chris Angelico writes: > ... then, in the interests of productive discussion, could you please > explain? What is it about dict addition that makes it harder to > understand than other addition? Antoine didn't say what dict addition does is harder to understand than other addition. He said he wants to understand it without knowing what it does. I can't say for sure what he means precisely, but I take it that he wants dict "+" to obey certain regularities that other instances of "+" do, possibly including outside of Python. 
As you'll see I find it hard to make this precise, but it's a pretty strong feeling for me, as well. To me, those regularities include associativity (satisfied in Python except for floats) and commutativity where possible (integers and I believe floats do satisfy it, while strings cannot and other sequences in Python in general do not, although they very often do in mathematics). For mappings, the mathematical meaning of "+" is usually pointwise. This wouldn't make sense for strings (interpreted as mappings from a prefix of the natural numbers) at all except that by accident in Python s1[n] + s2[n] does make sense, but not pointwise (because the length of the result is 2, not 1, for each n). For sequences in general pointwise doesn't make sense (there's no restriction to homogeneous sequences, and if there were, like strings it's not clear that the elements would be summable in an appropriate sense). But concatenation always makes sense, especially by analogy to the somehow (IMO) canonical case of strings. For sets, the only plausible interpretation of "addition" is union, but in fact Python used .add asymmetrically as "add to", not "add together" (self is a set, argument is a generic object), and the union operator is "|", not "+". For dictionaries, neither pointwise addition nor concatenation makes sense in general, and update is "too asymmetric" for my taste, and has no analog in the usual algebras of mappings. In some sense string concatenation, though noncommutative, doesn't lose information, and it does obey a sort of antisymmetry in that a + b == reversed(reversed(b) + reversed(a)). Dictionary update does lose the original settings. 
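The regularities listed above can be checked in current Python (illustrative values):

```python
# String concatenation: noncommutative, but it obeys the "antisymmetry"
# a + b == reversed(reversed(b) + reversed(a)), written here with slicing.
a, b = "spam", "eggs"
assert a + b != b + a
assert a + b == (b[::-1] + a[::-1])[::-1]

# Dict merging: the right-hand values win, so it loses information and
# is likewise noncommutative.
d1 = {"EDITOR": "ed"}
d2 = {"EDITOR": "vi", "PAGER": "less"}
assert {**d1, **d2} != {**d2, **d1}
assert {**d1, **d2}["EDITOR"] == "vi"  # d1's original setting is lost
```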
If people really think it's so important to spell d = d0.copy() d.update(d1) as "d0 + d1" despite the noncommutativity (and the availability of "{**d0, **d1}" for "true" dicts), and by extension the redundant "d |= d1" for "d.update(d1)", I won't get terribly upset, but I will be sad because it offends my sense of "beautiful code" (including TOOWTDI, where "+" for dicts would violate both the "obvious" and the parenthetical "only one" conditions IMO). I would consider it a wart in the same way that many people consider str.join a wart, as it breaks even more of the regularities I associate with "+" than string concatenation does. Again, I don't know what Antoine meant, but I might say the same kind of thing in the same words, and the above is what I would mean. Steve From 2QdxY4RzWzUUiLuE at potatochowder.com Thu Mar 21 14:13:58 2019 From: 2QdxY4RzWzUUiLuE at potatochowder.com (Dan Sommers) Date: Thu, 21 Mar 2019 13:13:58 -0500 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <20190321164159.GC29550@ando.pearwood.info> Message-ID: <4bb9ac16-5aac-d70f-8be1-3fd573e7376e@potatochowder.com> On 3/21/19 1:01 PM, Jonathan Fine wrote: > R?mi Lapeyre wrote: > >> Not matter the notation you end up choosing, I think this code: >> https://github.com/jpadilla/pyjwt/blob/master/jwt/utils.py#L71-L81 >> [...] would greatly benefit from a new merge to merge dicts. > > I've looked at the merge_dict defined in this code. It's similar to > > def gapfill(self, other): > > # See also: https://cobrapy.readthedocs.io/en/latest/gapfilling.html > # Cobra's gapfill adds items to a model, to meet a requirement. > > for key in other.keys(): > if key not in self: > self[key] = other[key] > > (This is code I've written, that's not yet on PyPi.) The usage is > different. 
Instead of writing one of > aaa = merge_dict(aaa, bbb) > ccc = merge_dict(aaa, bbb) > you write one of > gapfill(aaa, bbb) > aaa.gapfill(bbb) # If gapfill added to dict methods. > > With merge_dict, you never really know if ccc is the same object as > aaa, or a different one. Sometimes this is important. > > With gapfill, you get the same behaviour as the already familiar and > loved dict.update. But of course with a different merge rule. With gapfill, I can never remember whether it's gapfill(aaa, bbb) or gapfill(bbb, aaa). This is always important. :-) At least with aaa.gapfill(bbb), I have some sense of the "direction" of the asymmetry, or I would if I had some frame of reference into which to put the "gapfill" operation. (With the proposed + or | operator syntax, that gets lost.) From rhodri at kynesim.co.uk Thu Mar 21 13:59:41 2019 From: rhodri at kynesim.co.uk (Rhodri James) Date: Thu, 21 Mar 2019 17:59:41 +0000 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <20190321180636.41498b1b@fsol> References: <20190301162645.GM4465@ando.pearwood.info> <20190321164159.GC29550@ando.pearwood.info> <20190321180636.41498b1b@fsol> Message-ID: <33e49cbf-0e40-9140-f5e0-5374865bca66@kynesim.co.uk> On 21/03/2019 17:06, Antoine Pitrou wrote: > On Fri, 22 Mar 2019 03:42:00 +1100 > Steven D'Aprano wrote: >> >> For those who oppose the + operator, it will help me if you made it >> clear whether it is *just* the + symbol you dislike, and would accept >> the | operator instead, or whether you hate the whole operator concept >> regardless of how it is spelled. > > I'd rather see a method. Dict merging just doesn't occur often enough > that an operator is desirable for it. Analogous to the relationship between list.sort() and sorted(), I can't help but think that a dict.merge() method would be a terrible idea. A merged() function is more defensible. >> And to those who support this PEP, code examples where a dict merge >> operator will help are most welcome!
I don't use Python often enough to have much to offer, I'm afraid. The sort of occasion I would use dict merging is passing modified environments to subcommands. Something like: def process(): if time_to_do_thing1(): thing1(base_env + thing1_env_stuff + env_tweaks) if time_to_do_thing2(): thing2(base_env + thing2_env_stuff + env_tweaks) ...and so on. The current syntax for doing this is a tad verbose: def process(): if time_to_do_thing1(): env = base_env.copy() env.update(thing1_env_stuff) env.update(env_tweaks) thing1(env) del env if time_to_do_thing2(): env = base_env.copy() env.update(thing2_env_stuff) env.update(env_tweaks) thing2(env) del env -- Rhodri James *-* Kynesim Ltd From jcgoble3 at gmail.com Thu Mar 21 14:17:41 2019 From: jcgoble3 at gmail.com (Jonathan Goble) Date: Thu, 21 Mar 2019 14:17:41 -0400 Subject: [Python-ideas] dict.merge(d1, d2, ...) (Counter proposal for PEP 584) In-Reply-To: References: <20190321154334.70fc245a@fsol> <20190321162123.GO12502@ando.pearwood.info> Message-ID: On Thu, Mar 21, 2019 at 1:54 PM Stefan Behnel wrote: > Steven D'Aprano schrieb am 21.03.19 um 17:21: > > On Thu, Mar 21, 2019 at 09:11:18AM -0700, Guido van Rossum wrote: > > > >> I don't find it easy to understand or remember that d1.update(d2) > modifies > >> d1 in place, while d1.merge(d2) first copies d1. > >> > >> Maybe the name can indicate the copying stronger? Like we did with > sorting: > >> l.sort() sorts in-place, while sorted(l) returns a sorted copy. > > > > How about dict.merged(*args, **kw)? Or dict.updated()? > > And then users would accidentally type > > d.updated(items) > > and lack the tests to detect that this didn't do anything (except wasting > some time and memory). > > Stefan > Generally when I call a method named with a verb on an instance of something mutable, I expect it to do something on that instance and return None. So merged() or updated() feels more like a built-in or a function to import from somewhere, akin to sorted(). 
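The naming convention being invoked is the existing list one: the verb method mutates in place and returns None, while the participle built-in returns a new object.

```python
values = [3, 1, 2]

# Verb method: in-place mutation, returns None.
assert values.sort() is None
assert values == [1, 2, 3]

# Participle function: leaves its argument alone, returns a new list.
original = [3, 1, 2]
assert sorted(original) == [1, 2, 3]
assert original == [3, 1, 2]
```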
Perhaps dict.union(d2) could be considered? Three points in favor: 1) Not a verb, therefore makes it clearer that it returns something new. 2) Not confusable with existing dict methods. 3) It matches the name and behavior of set.union (modulo value conflicts), so will be easier to grok. -------------- next part -------------- An HTML attachment was scrubbed... URL: From brandtbucher at gmail.com Thu Mar 21 14:43:42 2019 From: brandtbucher at gmail.com (Brandt Bucher) Date: Thu, 21 Mar 2019 11:43:42 -0700 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <20190321164159.GC29550@ando.pearwood.info> Message-ID: I'd also like to add what I consider to be another point in favor of an operator: Throughout all of these related threads, I have seen many typos and misspellings of current dict merging idioms, from messing up the number of asterisks in "{**a, **b}", to even Guido(!) accidentally writing the common copy/update idiom as d = d1.copy() d = d1.update(d2) in a thoughtful email... and it was then copied-and-pasted (unquoted and verbatim) by others! I still have yet to see somebody (even those who claim to be confused by it) mess up the PEP's current definition of "+" or "+=" in this context. Brandt -------------- next part -------------- An HTML attachment was scrubbed... URL: From rhodri at kynesim.co.uk Thu Mar 21 14:26:49 2019 From: rhodri at kynesim.co.uk (Rhodri James) Date: Thu, 21 Mar 2019 18:26:49 +0000 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <33e49cbf-0e40-9140-f5e0-5374865bca66@kynesim.co.uk> References: <20190301162645.GM4465@ando.pearwood.info> <20190321164159.GC29550@ando.pearwood.info> <20190321180636.41498b1b@fsol> <33e49cbf-0e40-9140-f5e0-5374865bca66@kynesim.co.uk> Message-ID: <31e63144-a4e1-6a2e-7d9f-8585978652a9@kynesim.co.uk> On 21/03/2019 17:59, Rhodri James wrote: > def process(): > ??? if time_to_do_thing1(): > ??????? 
thing1(base_env + thing1_env_stuff + env_tweaks) > if time_to_do_thing2(): > thing2(base_env + thing2_env_stuff + env_tweaks) > > ...and so on. The current syntax for doing this is a tad verbose: > > def process(): > if time_to_do_thing1(): > env = base_env.copy() > env.update(thing1_env_stuff) > env.update(env_tweaks) > thing1(env) > del env > if time_to_do_thing2(): > env = base_env.copy() > env.update(thing2_env_stuff) > env.update(env_tweaks) > thing2(env) > del env Of course I forgot: def process(): if time_to_do_thing1(): thing1({**base_env, **thing1_env_stuff, **env_tweaks}) if time_to_do_thing2(): thing2({**base_env, **thing2_env_stuff, **env_tweaks}) ...which says something about how memorable that syntax is. -- Rhodri James *-* Kynesim Ltd From solipsis at pitrou.net Thu Mar 21 15:21:51 2019 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 21 Mar 2019 20:21:51 +0100 Subject: [Python-ideas] PEP: Dict addition and subtraction References: <20190301162645.GM4465@ando.pearwood.info> <20190321164159.GC29550@ando.pearwood.info> <20190321180636.41498b1b@fsol> <33e49cbf-0e40-9140-f5e0-5374865bca66@kynesim.co.uk> Message-ID: <20190321202151.6ca7e96a@fsol> On Thu, 21 Mar 2019 17:59:41 +0000 Rhodri James wrote: > > >> And to those who support this PEP, code examples where a dict merge > >> operator will help are most welcome! > > I don't use Python often enough to have much to offer, I'm afraid. The > sort of occasion I would use dict merging is passing modified > environments to subcommands. Something like: > > def process(): > if time_to_do_thing1(): > thing1(base_env + thing1_env_stuff + env_tweaks) > if time_to_do_thing2(): > thing2(base_env + thing2_env_stuff + env_tweaks) > > ...and so on. The current syntax for doing this is a tad verbose: > > def process(): > if time_to_do_thing1(): > env = base_env.copy() > env.update(thing1_env_stuff) > env.update(env_tweaks) > thing1(env) > del env > if time_to_do_thing2(): > env = base_env.copy() > env.update(thing2_env_stuff) > env.update(env_tweaks) > thing2(env) > del env Ah, you convinced me there is a use case indeed (though `del env` isn't necessary above). I would still prefer something that's not an operator, but I agree there is potential to improve the current state of affairs.
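That environment-passing pattern can also be packaged as a small helper in today's Python (a sketch; the `merged` name and sample values are illustrative):

```python
def merged(*dicts):
    # Later dicts win on key conflicts, like successive update() calls.
    result = {}
    for d in dicts:
        result.update(d)
    return result

base_env = {"PATH": "/usr/bin", "TERM": "xterm"}
thing1_env_stuff = {"THING1_OPT": "on"}
env_tweaks = {"TERM": "vt100"}

env = merged(base_env, thing1_env_stuff, env_tweaks)
assert env == {"PATH": "/usr/bin", "TERM": "vt100", "THING1_OPT": "on"}
assert base_env["TERM"] == "xterm"  # inputs are untouched
```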
Note that, if you're able to live with a third-party dependency, the `toolz` package has what you need (and lots of other things too): https://toolz.readthedocs.io/en/latest/api.html#toolz.dicttoolz.merge Regards Antoine. From hardik11989 at gmail.com Thu Mar 21 15:31:30 2019 From: hardik11989 at gmail.com (Hardik Patel) Date: Thu, 21 Mar 2019 15:31:30 -0400 Subject: [Python-ideas] Report an issue of Python Message-ID: Hello, Can you please help me to contact a core team if it is possible. I would like to report an issue. Thank you, Hardik Patel -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Thu Mar 21 15:36:15 2019 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 21 Mar 2019 20:36:15 +0100 Subject: [Python-ideas] Report an issue of Python References: Message-ID: <20190321203615.706e1e2b@fsol> Hello, On Thu, 21 Mar 2019 15:31:30 -0400 Hardik Patel wrote: > Hello, > Can you please help me to contact a core team if it is possible. > I would like to report an issue. Issues can be reported at https://bugs.python.org/ Regards Antoine. From jfine2358 at gmail.com Thu Mar 21 17:44:01 2019 From: jfine2358 at gmail.com (Jonathan Fine) Date: Thu, 21 Mar 2019 21:44:01 +0000 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <20190321202151.6ca7e96a@fsol> References: <20190301162645.GM4465@ando.pearwood.info> <20190321164159.GC29550@ando.pearwood.info> <20190321180636.41498b1b@fsol> <33e49cbf-0e40-9140-f5e0-5374865bca66@kynesim.co.uk> <20190321202151.6ca7e96a@fsol> Message-ID: Antoine Pitrou wrote: > Note that, if you're able to live with a third-party dependency, the > `toolz` package has what you need (and lots of other things too): > https://toolz.readthedocs.io/en/latest/api.html#toolz.dicttoolz.merge I suggest that the supporters of dict + dict make (and put up on PyPi) a pure-Python subclass of dict that has the desired properties. This would 1. 
Clarify and document the syntax and semantics. 2. Help with exploration and testing. 3. Provide a 'back-port' mechanism to current Python. 4. Give the proposal the benefit of practical experience. I find this last very important, when we can do it. And we can, in this case. Language changes are 'cast in stone' and hard to reverse. And afterwards, on this list, we're sometime told that we've 'missed the boat' for a particular change. Let's take the benefit of a reference pure Python implementation, when we can. Steven D'A. Please would you include or respond to this suggestion, in the next revision of the PEP. -- Jonathan From rosuav at gmail.com Thu Mar 21 17:55:41 2019 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 22 Mar 2019 08:55:41 +1100 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <20190321164159.GC29550@ando.pearwood.info> <20190321180636.41498b1b@fsol> <33e49cbf-0e40-9140-f5e0-5374865bca66@kynesim.co.uk> <20190321202151.6ca7e96a@fsol> Message-ID: On Fri, Mar 22, 2019 at 8:44 AM Jonathan Fine wrote: > > Antoine Pitrou wrote: > > Note that, if you're able to live with a third-party dependency, the > > `toolz` package has what you need (and lots of other things too): > > https://toolz.readthedocs.io/en/latest/api.html#toolz.dicttoolz.merge > > I suggest that the supporters of dict + dict make (and put up on PyPi) > a pure-Python subclass of dict that has the desired properties. This > would > > 1. Clarify and document the syntax and semantics. > 2. Help with exploration and testing. > 3. Provide a 'back-port' mechanism to current Python. > 4. Give the proposal the benefit of practical experience. > The trouble with that is that you can't always use a dict subclass (or a non-subclass MutableMapping implementation, etc, etc, etc). 
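For concreteness, a minimal version of the suggested pure-Python subclass might look like this. It is a sketch of the PEP's proposed semantics only, not the PEP's reference implementation, and `AddableDict` is just an illustrative name:

```python
class AddableDict(dict):
    """dict subclass where + merges (right side wins) and - removes keys."""

    def __add__(self, other):
        if not isinstance(other, dict):
            return NotImplemented
        result = AddableDict(self)
        result.update(other)
        return result

    def __sub__(self, other):
        if not isinstance(other, dict):
            return NotImplemented
        return AddableDict((k, v) for k, v in self.items() if k not in other)

d1 = AddableDict({"a": 1, "b": 2})
d2 = AddableDict({"b": 3, "c": 4})
assert d1 + d2 == {"a": 1, "b": 3, "c": 4}
assert d1 - d2 == {"a": 1}
```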
There are MANY situations in which Python will give you an actual real dict, and it defeats the purpose if you then have to construct an AddableDict out of it just so you can add something to it. Not every proposed change makes sense on PyPI, and it definitely won't get a fair representation in "practical experience". If someone's proposing adding a new module to the standard library, then by all means, propose PyPI. But changes to core types can't be imported from other modules. Python is not Ruby. ChrisA From mertz at gnosis.cx Thu Mar 21 18:02:05 2019 From: mertz at gnosis.cx (David Mertz) Date: Thu, 21 Mar 2019 18:02:05 -0400 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <20190321164159.GC29550@ando.pearwood.info> References: <20190301162645.GM4465@ando.pearwood.info> <20190321164159.GC29550@ando.pearwood.info> Message-ID: I dislike the symbol '+' to mean "dictionary merging with value updates." I have no objection to, and mildly support, adding '|' with this meaning. It's not really possible to give "that one example" where + for meeting makes code less clear... In my eyes it would be EVERY such use. Every example presented in this thread or in the PEP feels wrong to me. I know about operator overloading and dunder methods and custom classes. My intuition about '+' from math, other programming languages, and Python, simply does not lead me to expect the proposed meaning. On Thu, Mar 21, 2019, 12:43 PM Steven D'Aprano wrote: > I'd like to make a plea to people: > > I get it, there is now significant opposition to using the + symbol for > this proposed operator. At the time I wrote the first draft of the PEP, > there was virtually no opposition to it, and the | operator had very > little support. This has clearly changed. > > At this point I don't think it is productive to keep making subjective > claims that + will be more confusing or surprising. 
You've made your > point that you don't like it, and the next draft^1 of the PEP will make > that clear. > > But if you have *concrete examples* of code that currently is easy to > understand, but will be harder to understand if we add dict.__add__, > then please do show me! > > For those who oppose the + operator, it will help me if you made it > clear whether it is *just* the + symbol you dislike, and would accept > the | operator instead, or whether you hate the whole operator concept > regardless of how it is spelled. > > And to those who support this PEP, code examples where a dict merge > operator will help are most welcome! > > > > > ^1 Coming Real Soon Now?. > > > -- > Steven > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From brandtbucher at gmail.com Thu Mar 21 18:10:48 2019 From: brandtbucher at gmail.com (Brandt Bucher) Date: Thu, 21 Mar 2019 15:10:48 -0700 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <20190321164159.GC29550@ando.pearwood.info> <20190321180636.41498b1b@fsol> <33e49cbf-0e40-9140-f5e0-5374865bca66@kynesim.co.uk> <20190321202151.6ca7e96a@fsol> Message-ID: For anyone interested in "trying it out": if you're not against cloning and compiling CPython yourself, here is a PEP 584 C implementation I have PR'd against master right now. I'm keeping it in sync with the draft PEP as it changes, so subtraction performance is not overly optimized yet, but it will show you the *exact* behavior outlined in the PEP on the dict builtin and its subclasses. The relevant branch is called "addiction". 
You can clone it from: https://github.com/brandtbucher/cpython.git :) -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Thu Mar 21 18:54:58 2019 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 22 Mar 2019 09:54:58 +1100 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <20190321164159.GC29550@ando.pearwood.info> References: <20190301162645.GM4465@ando.pearwood.info> <20190321164159.GC29550@ando.pearwood.info> Message-ID: On Fri, Mar 22, 2019 at 3:42 AM Steven D'Aprano wrote: > And to those who support this PEP, code examples where a dict merge > operator will help are most welcome! Since Python examples don't really exist yet, I'm reaching for another language that DOES have this feature. Pike's mappings (broadly equivalent to Python's dicts) can be added (actually, both + and | are supported), with semantics equivalent to PEP 584's. Translated into Python syntax, here's a section from the implementation of Process.run(): def run(cmd, modifiers={}): ... ... p = Process(cmd, modifiers + { "stdout": mystdout->pipe(), "stderr": mystderr->pipe(), "stdin": mystdin->pipe(), }) In Val.TimeTZ, a subclass that adds a timezone attribute overrides a mapping-returning method to incorporate the timezone in the result mapping. Again, translated into Python syntax: def tm(self): return super().tm() + {"timezone": self.timezone} To spawn a subprocess with a changed environment variable: //from the Process.create_process example Process.create_process(({ "/usr/bin/env" }), (["env" : getenv() + (["TERM":"vt100"]) ])); # equivalent Python code subprocess.Popen("/usr/bin/env", env=os.environ + {"TERM": "vt100"}) All of these examples could be done with the double-star syntax, as they all use simple literals. But addition looks a lot cleaner IMO, and even more so if you're combining multiple variables rather than using literals. 
ChrisA From steve at pearwood.info Thu Mar 21 19:46:59 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 22 Mar 2019 10:46:59 +1100 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <20190321164159.GC29550@ando.pearwood.info> Message-ID: <20190321234659.GQ12502@ando.pearwood.info> On Thu, Mar 21, 2019 at 06:02:05PM -0400, David Mertz wrote: > I dislike the symbol '+' to mean "dictionary merging with value updates." I > have no objection to, and mildly support, adding '|' with this meaning. > > It's not really possible to give "that one example" where + for meeting > makes code less clear... In my eyes it would be EVERY such use. I suspect that I may not have explained myself properly. Sorry. Let me try to explain again. A number of people including Antoine and Serhiy seem to have taken the position that merely adding dict.__add__ will make existing code using + harder to understand, as you will need to consider not just numeric addition and concatenation, but also merging, when reading code. *If this were true* it would be an excellent argument against using + for dict merges. But is it true? Would you agree that this example of + is perfectly clear today? for digit in digits: num = num*10 + digit By both the naming (digit, num) and presence of multiplication by the literal 10, it should be pretty obvious that this is probably doing integer addition. (I suppose it is conceivable that this is doing sequence repetition and concatenation, but given the names that interpretation would be rather unexpected.) We shouldn't find it hard to understand that code, using nothing more than *local* context. There's no need to search the global context to find out what num and digits are. (Although in the specific example I copied that snippet from, that information is only two or three lines away. 
But in principle, we might have needed to search an arbitrarily large code base to determine what they were.) Adding dict.__add__ isn't going to make that example harder to understand. If it did, that would be a big blow to the + proposal. Antoine and Serhiy seem to worry that there are existing uses of + which are currently easy to understand but will become less so if dict.__add__ is added. I respect that worry, even if I doubt that they are correct. If someone can demonstrate that their fear is well-founded, that would be an excellent counter-argument to the PEP's proposal to use +. What *doesn't* count as a demonstration: 1. Toy examples using generic names don't count. With generic, meaningless names, they're not meaningful now and so adding dict.__add__ won't make them *less* meaningful: # is this concatenation or numeric addition? who can tell? for spam in spammy_macspamface: eggs += spam Regardless of whether dicts support + or not, we would still have to search the global context to work out what eggs and spam are. Adding dict.__add__ doesn't make this harder. 2. Purely opinion-based subjective statements, since they basically boil down to "I don't like the use of + for dict merging." That point has been made, no need to keep beating that drum. 3. Arguments based on unfamiliarity to the new operator: preferences += {'EDITOR': 'ed', 'PAGESIZE': 'A4'} might give you a bit of a double-take the first time you see it, but it surely won't still be surprising you in five years time. I realise that this is a high bar to reach, but if somebody does reach it, and demonstrates that Antoine and Serhiy's fears are well-founded, that would be a very effective and convincing argument. > Every > example presented in this thread or in the PEP feels wrong to me. I know > about operator overloading and dunder methods and custom classes. My > intuition about '+' from math, other programming languages, and Python, > simply does not lead me to expect the proposed meaning. 
And your subjective feeling is well-noted :-) -- Steven From steve at pearwood.info Thu Mar 21 19:55:08 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 22 Mar 2019 10:55:08 +1100 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <20190321164159.GC29550@ando.pearwood.info> <20190321180636.41498b1b@fsol> <33e49cbf-0e40-9140-f5e0-5374865bca66@kynesim.co.uk> <20190321202151.6ca7e96a@fsol> Message-ID: <20190321235508.GR12502@ando.pearwood.info> On Thu, Mar 21, 2019 at 03:10:48PM -0700, Brandt Bucher wrote: > For anyone interested in "trying it out": if you're not against cloning and > compiling CPython yourself, here is a PEP 584 C implementation I have PR'd > against master right now. I'm keeping it in sync with the draft PEP as it > changes, so subtraction performance is not overly optimized yet, but it > will show you the *exact* behavior outlined in the PEP on the dict builtin > and its subclasses. The relevant branch is called "addiction". You can > clone it from: That's great, thank you! For the sake of comparisons, could you support | as an alias? That will allow people to get a feel for whether a+b or a|b looks nicer. (For the record, the PEP isn't set in stone in regards to the choice of operator.)
> https://github.com/brandtbucher/cpython.git -- Steven From mertz at gnosis.cx Thu Mar 21 20:13:01 2019 From: mertz at gnosis.cx (David Mertz) Date: Thu, 21 Mar 2019 20:13:01 -0400 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <20190321234659.GQ12502@ando.pearwood.info> References: <20190301162645.GM4465@ando.pearwood.info> <20190321164159.GC29550@ando.pearwood.info> <20190321234659.GQ12502@ando.pearwood.info> Message-ID: On Thu, Mar 21, 2019, 7:48 PM Steven D'Aprano wrote: > A number of people including Antoine and Serhiy seem to have taken the > position that merely adding dict.__add__ will make existing code using + > harder to understand, as you will need to consider not just numeric > addition and concatenation, but also merging, when reading code. > > Would you agree that this example of + is perfectly clear today? > > for digit in digits: > num = num*10 + digit > > By both the naming (digit, num) and presence of multiplication by the > literal 10, it should be pretty obvious that this is probably doing > integer addition. > Yep. This is clear and will not become less clear if some more objects grow an .__add__() method. Already, it is POSSIBLE that `num` and `digit` mean something other than numbers. Bad naming of variables if so, but not prohibited. For example, NumPy uses '+' and '*' for elementwise operations, often with broadcasting across different array shapes. Maybe that's code dealing with vectorised arrays... But probably not. Holoviews uses '+' and '*' to combine elements of graphs. E.g. labelled = low_freq * high_freq * linpoints overlay + labelled + labelled.Sinusoid.Low_Frequency ggplot in R has similar behavior. Maybe your loop is composing a complex graph... But probably not. Nonetheless, if I see `dict1 + dict2` the meaning you intend in the PEP does not jump out as the obvious behavior. Nor even as the most useful behavior. Of course I could learn it and teach it, but it will always feel like a wart in the language.
In contrast, once you tell me about the special object "vectorised arrays", `arr1 + arr2` does exactly what I expect in NumPy. > And your subjective feeling is well-noted :-) This is more than "merely subjective." I teach Python. I write books about Python. I've had tens of millions of readers of articles I've written about Python. I'm not the only person in this discussion with knowledge of learners and programmers and scientists... But the opinions I'm expressing ARE on their behalf too (as I perceive likely surprise and likely bugs). I like most of the design of Python. Almost all, even. But there are a few warts in it. This would be a wart. -------------- next part -------------- An HTML attachment was scrubbed... URL: From 2QdxY4RzWzUUiLuE at potatochowder.com Thu Mar 21 20:21:50 2019 From: 2QdxY4RzWzUUiLuE at potatochowder.com (Dan Sommers) Date: Thu, 21 Mar 2019 19:21:50 -0500 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <20190321234659.GQ12502@ando.pearwood.info> References: <20190301162645.GM4465@ando.pearwood.info> <20190321164159.GC29550@ando.pearwood.info> <20190321234659.GQ12502@ando.pearwood.info> Message-ID: On 3/21/19 6:46 PM, Steven D'Aprano wrote: > Antoine and Serhiy seem to worry that there are existing uses of + which > are currently easy to understand but will become less so if dict.__add__ > is added. I respect that worry, even if I doubt that they are correct. > > If someone can demonstrate that their fear is well-founded, that would > be an excellent counter-argument to the PEP's proposal to use +. https://docs.python.org/3.8/library/collections.html has some examples using collections.Counter, which is clearly described as being a subclass of dict. Amongst the examples: c + d # add two counters together: c[x] + d[x] That's the + operator operating on two dicts (don't make me quote the Liskov Substitution Principle), but doing something really different than the base operator.
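For concreteness, the Counter behaviour described in that documentation snippet can be reproduced in a few lines (the values here are illustrative):

```python
from collections import Counter

# Counter is a dict subclass whose + adds counts key by key.
c = Counter({"a": 2, "b": 1})
d = Counter({"a": 3, "c": 5})
print(c + d)  # Counter({'a': 5, 'c': 5, 'b': 1})

# Mixing a Counter with a plain dict raises TypeError today:
# Counter.__add__ only accepts another Counter.
try:
    c + {"a": 1}
except TypeError as exc:
    print("TypeError:", exc)
```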
So if I know that c and d (or worse, that one of them) is a dict, then interpreting c + d becomes much more interesting, but arguably no worse than c.update(d). Yes, it's "just" polymorphism, but IMO it violates the Principle of Least Surprise. My apologies if this is covered elsewhere in this thread, or it doesn't meet the bar Steven set. From tjreedy at udel.edu Thu Mar 21 21:36:20 2019 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 21 Mar 2019 21:36:20 -0400 Subject: [Python-ideas] dict.merge(d1, d2, ...) (Counter proposal for PEP 584) In-Reply-To: References: <20190321154334.70fc245a@fsol> Message-ID: On 3/21/2019 12:11 PM, Guido van Rossum wrote: > On Thu, Mar 21, 2019 at 7:45 AM Antoine Pitrou >> One should also be able to write `d = dict.merge(d1, d2, ...)` > > If dict merging is important enough to get a new spelling, then I think > this proposal is the best: explicit, unambiguous, immediately > understandable and easy to remember. > > > I don't find it easy to understand or remember that d1.update(d2) > modifies d1 in place, while d1.merge(d2) first copies d1. > > Maybe the name can indicate the copying stronger? Like we did with > sorting: l.sort() sorts in-place, while sorted(l) returns a sorted copy. I counted what I believe to be 10 instances of copy-update in the top level of /lib. Do either of you consider this to be enough that any addition would be worthwhile. There are 3 in idlelib that I plan to replace with {**a, **b} and be done with the issue. I did not check any other packages. -- Terry Jan Reedy From songofacandy at gmail.com Thu Mar 21 22:06:08 2019 From: songofacandy at gmail.com (Inada Naoki) Date: Fri, 22 Mar 2019 11:06:08 +0900 Subject: [Python-ideas] dict.merge(d1, d2, ...) 
(Counter proposal for PEP 584) In-Reply-To: <20190321162123.GO12502@ando.pearwood.info> References: <20190321154334.70fc245a@fsol> <20190321162123.GO12502@ando.pearwood.info> Message-ID: On Fri, Mar 22, 2019 at 1:21 AM Steven D'Aprano wrote: > > > How about dict.merged(*args, **kw)? Or dict.updated()? > +1 on "merged". I feel the word "update" indicates mutation, and it's difficult to distinguish between "update" and "updated". > That would eliminate some of the difficulties with an operator, such as > the difference between + which requires both operands to be a dict > but += which can take any mapping or (key,value) iterable. > > -- > Steven -- Inada Naoki From steve at pearwood.info Thu Mar 21 22:14:51 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 22 Mar 2019 13:14:51 +1100 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <20190321164159.GC29550@ando.pearwood.info> <20190321234659.GQ12502@ando.pearwood.info> Message-ID: <20190322021451.GS12502@ando.pearwood.info> On Thu, Mar 21, 2019 at 08:13:01PM -0400, David Mertz wrote: > On Thu, Mar 21, 2019, 7:48 PM Steven D'Aprano wrote: [...] > Nonetheless, if I see `dict1 + dict2` the meaning you intend in the PEP > does not jump out as the obvious behavior. Nor even as the most useful > behavior. What would be the most useful behaviour for dict "addition" in your opinion? > Of course I could learn it and teach it, but it will always feel > like a wart in the language. Would that wartness be lessened if it were spelled | or << instead? > In contrast, once you tell me about the special object "vectorised arrays", > `arr1 + arr2` does exactly what I expect in NumPy. I don't know Numpy well enough to know whether that is elementwise addition or concatenation or something else, so that example doesn't resonate with me.
I can't guess what you expect, and I have no confidence that my guess (matrix addition of equal-sized arrays, an exception if unequal) will be what Numpy does. > > And your subjective feeling is well-noted :-) > > This is more than "merely subjective." If it is more than subjective, then there must be an objective test that anyone, or a computer program, could do to tell whether or not the + operator on dicts will be ... um, what? A wart? Ugly? Both of those are subjective value judgements, so I'm not sure what objective claim you believe you are making which is "more than" subjective. The point is, I'm not *discounting* the subjective claims that + on dicts is ugly. I've acknowledged them, and the next draft of the PEP will do so too. But repetition doesn't make a subjective value judgement objective. It might boil down to a subjective preference for + over | or visa versa, or another operator, or no operator at all. That's fine: language design is partly subjective. But I'd like to see more comments based on objective reasons we can agree on, and fewer arguments that boil down to "I just don't like it". -- Steven From pythonchb at gmail.com Thu Mar 21 22:19:06 2019 From: pythonchb at gmail.com (Christopher Barker) Date: Thu, 21 Mar 2019 16:19:06 -1000 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <20190321164159.GC29550@ando.pearwood.info> <20190321234659.GQ12502@ando.pearwood.info> Message-ID: > > https://docs.python.org/3.8/library/collections.html has some > examples using collections.Counter, which is clearly described > as being a subclass of dict. Amongst the examples: > > c + d # add two counters together: c[x] + d[x] > > That's the + operator operating on two dicts (don't make me > quote the Liskov Substitution Principle), but doing something > really different than the base operator. 
> > So if I know that c and d (or worse, that one of them) is a > dict, then interpreting c + d becomes much more interesting, Killing a use of a common operator with a very common built-in data type because the operator is used in a different way by a specialized object in the stdlib seems a bit backwards to me. Frankly, I think considering Counter as a dict subclass is the mistake here, even if it is true. -CHB -- Christopher Barker, PhD Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython -------------- next part -------------- An HTML attachment was scrubbed... URL: From 2QdxY4RzWzUUiLuE at potatochowder.com Thu Mar 21 22:40:52 2019 From: 2QdxY4RzWzUUiLuE at potatochowder.com (Dan Sommers) Date: Thu, 21 Mar 2019 21:40:52 -0500 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <20190321164159.GC29550@ando.pearwood.info> <20190321234659.GQ12502@ando.pearwood.info> Message-ID: <3cffb9e2-6b60-b38c-8119-228c1622c007@potatochowder.com> On 3/21/19 9:19 PM, Christopher Barker wrote: >> >> https://docs.python.org/3.8/library/collections.html has some >> examples using collections.Counter, which is clearly described >> as being a subclass of dict. Amongst the examples: >> >> c + d # add two counters together: c[x] + d[x] >> >> That's the + operator operating on two dicts (don't make me >> quote the Liskov Substitution Principle), but doing something >> really different than the base operator. >> >> So if I know that c and d (or worse, that one of them) is a >> dict, then interpreting c + d becomes much more interesting, > > > Killing a use of a common operator with a very common built-in data type > because the operator is used in a different way by a specialized object in > the stdlib seems a bit backwards to me. Perhaps.
Note that Counter also uses | and & for other operations that probably wouldn't make much sense on base dicts. >> Frankly, I think considering Counter as a dict subclass is the mistake >> here, even if it is true. > I had the same thought that Counter is misdesigned in one > way or another, but (a) that ship has long sailed, and (b) > I didn't want to run off on that tangent. My point remains: because Counter is a subclass of dict, and Counter uses the + operator for something that doesn't apply to base dicts, adding + to dicts *may* cause confusion that wasn't there before. Presently, +, -, |, and & all raise an exception when given a Counter and a dict. This (raising an exception) is probably still the Right Thing to do in that case, even with a + operator on dicts, but that violates the LSP and IMO the PLS. From mertz at gnosis.cx Thu Mar 21 22:57:33 2019 From: mertz at gnosis.cx (David Mertz) Date: Thu, 21 Mar 2019 22:57:33 -0400 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <20190322021451.GS12502@ando.pearwood.info> References: <20190301162645.GM4465@ando.pearwood.info> <20190321164159.GC29550@ando.pearwood.info> <20190321234659.GQ12502@ando.pearwood.info> <20190322021451.GS12502@ando.pearwood.info> Message-ID: On Thu, Mar 21, 2019, 10:15 PM Steven D'Aprano wrote: > What would be the most useful behaviour for dict "addition" in your > opinion? > Probably what I would use most often would be a "lossless" merging in which duplicate keys resulted in the corresponding value becoming a set containing all the merged values. E.g. >>> d1 = {1: 55, 2: 77, 3: 88} >>> d2 = {3: 99, 4: 22} >>> add(d1, d2) {1: 55, 2: 77, 3: {88, 99}, 4: 22} I'm sure most users would hate this too. It changes the type of values between a thing and a set of things, and that has to be sorted out downstream. But it is lossless in a similar way to Counter or sequence addition. I can write what I want perfectly well. Perhaps using defaultdict as a shortcut to get there.
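A rough sketch of that "lossless" merge is only a few lines; the function name `add` and the exact collision policy below are my own reading of the example above, not anything specified by the PEP:

```python
def add(d1, d2):
    """Merge two dicts; on key collision, collect both values in a set."""
    result = dict(d1)
    for key, value in d2.items():
        if key in result:
            existing = result[key]
            # Widen the value to a set on the first collision, then keep
            # adding.  (Edge cases -- e.g. values that are already sets,
            # or unhashable values -- are deliberately left unspecified,
            # as in the email above.)
            if isinstance(existing, set):
                existing.add(value)
            else:
                result[key] = {existing, value}
        else:
            result[key] = value
    return result

d1 = {1: 55, 2: 77, 3: 88}
d2 = {3: 99, 4: 22}
print(add(d1, d2))  # {1: 55, 2: 77, 3: {88, 99}, 4: 22}
```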
And I know there are some behaviors I have not specified here, but my function can do whatever I want in the edge cases. If we were to see 'd1 + d2' for the first time without having followed this discussion, my guess would be behavior similar to what I show. > Of course I could learn it and teach it, but it will always feel > like a wart in the language. > > Would that wartness be lessened if it were spelled | or << instead? > Yes, definitely. Both those spellings feel pretty natural to me. They don't have the misleading associations '+' carries. I'm kinda fond of '<<' because it visually resembles an arrow that I can think of as "put the stuff here into there". > In contrast, once you tell me about the special object "vectorised > arrays", > `arr1 + arr2` does exactly what I expect in NumPy. > > I don't know Numpy well enough to know whether that is elementwise > addition or concatenation or something else, so that example doesn't > resonate with me. I can't guess what you expect, and I have no confidence > that my guess (matrix addition of equal-sized arrays, an exception if > unequal) will be what Numpy does > Fair enough. I've worked with NumPy long enough that perhaps I forget what my first intuition was. I accept that it's non-obvious to many users. FWIW, I really love NumPy behavior, but it's a shift in thinking vs lists. E.g. >>> a = array([1, 2, 3]) >>> b = array([[10, 11, 12], [100, 200, 300]]) >>> a + b [[ 11 13 15] [101 202 303]] This is "broadcasting" of compatible shapes. -------------- next part -------------- An HTML attachment was scrubbed...
URL: From python at mrabarnett.plus.com Thu Mar 21 23:31:46 2019 From: python at mrabarnett.plus.com (MRAB) Date: Fri, 22 Mar 2019 03:31:46 +0000 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <3cffb9e2-6b60-b38c-8119-228c1622c007@potatochowder.com> References: <20190301162645.GM4465@ando.pearwood.info> <20190321164159.GC29550@ando.pearwood.info> <20190321234659.GQ12502@ando.pearwood.info> <3cffb9e2-6b60-b38c-8119-228c1622c007@potatochowder.com> Message-ID: On 2019-03-22 02:40, Dan Sommers wrote: > On 3/21/19 9:19 PM, Christopher Barker wrote: >>> >>> https://docs.python.org/3.8/library/collections.html has some >>> examples using collections.Counter, which is clearly described >>> as being a subclass of dict. Amongst the examples: >>> >>> c + d # add two counters together: c[x] + d[x] >>> >>> That's the + operator operating on two dicts (don't make me >>> quote the Liskov Substitution Principle), but doing something >>> really different than the base operator. >>> >>> So if I know that c and d (or worse, that one of them) is a >>> dict, then interpreting c + d becomes much more interesting, >> >> >> Killing a use of a common operator with a very common built in data type >> because the operator is used in a different way by a specialized object in >> the stdlib seems a bit backwards to me. > > Perhaps. Note that Counter also uses | and & for other > operations that probably wouldn't make much sense on base > dicts. > >> Frankly, I think considering Counter as a dict subclass is the mistake >> here, even if it is true. > > I had the same thought that Counter is misdesigned in one > way or another, but (a) that ship has long sailed, and (b) > I didn't want to run off on that tangent. > [snip] Counter is trying to provide the functionality of 2 kinds of container: 1. A counting container. 2. A multi-set. + makes sense for counting (sum); | makes sense for multi-sets (union). 
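The two roles MRAB describes are easy to see side by side (illustrative values):

```python
from collections import Counter

c = Counter(a=3, b=1)
d = Counter(a=1, b=2)

print(c + d)  # Counter({'a': 4, 'b': 3}) - counts summed per key
print(c | d)  # Counter({'a': 3, 'b': 2}) - per-key maximum (multiset union)
print(c & d)  # Counter({'a': 1, 'b': 1}) - per-key minimum (intersection)
```

Neither the maximum nor the minimum reading carries over naturally to arbitrary dict values, which is part of why the choice of operator for plain dicts is contested.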
From steve at pearwood.info Fri Mar 22 00:53:51 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 22 Mar 2019 15:53:51 +1100 Subject: [Python-ideas] dict.merge(d1, d2, ...) (Counter proposal for PEP 584) In-Reply-To: References: <20190321154334.70fc245a@fsol> Message-ID: <20190322045351.GT12502@ando.pearwood.info> On Thu, Mar 21, 2019 at 09:36:20PM -0400, Terry Reedy wrote: > I counted what I believe to be 10 instances of copy-update in the top > level of /lib. Do either of you consider this to be enough that any > addition would be worthwhile. I think you're referring to Guido and Antoine? But for what it's worth, I think that's a good indication that there are uses for a merge operator. > There are 3 in idlelib that I plan to replace with {**a, **b} and be > done with the issue. I did not check any other packages. If a+b already worked for dicts, would you still prefer {**a, **b}? How about if it were spelled a|b? -- Steven From tjreedy at udel.edu Fri Mar 22 03:42:38 2019 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 22 Mar 2019 03:42:38 -0400 Subject: [Python-ideas] dict.merge(d1, d2, ...) (Counter proposal for PEP 584) In-Reply-To: <20190322045351.GT12502@ando.pearwood.info> References: <20190321154334.70fc245a@fsol> <20190322045351.GT12502@ando.pearwood.info> Message-ID: On 3/22/2019 12:53 AM, Steven D'Aprano wrote: > On Thu, Mar 21, 2019 at 09:36:20PM -0400, Terry Reedy wrote: > >> I counted what I believe to be 10 instances of copy-update in the top >> level of /lib. Do either of you consider this to be enough that any >> addition would be worthwhile. > > I think you're referring to Guido and Antoine? Yes, those were the two (core-devs) I quoted, and perhaps had missed my post, while you already thanked me for collecting some date. > But for what it's worth, > I think that's a good indication that there are uses for a merge > operator. Some, yes. Enough for new syntax? What is a reasonable standard? 
Are there existing syntax features so sparsely used? What is the bar for something that adds no new function, but saves 6 chars and is easier to understand for at least some? In the past, 'Would this be used in the stdlib?' has been asked of feature proposals. But I never paid attention past == 0 or > 0. When Guido approved ':=', what threashhold of usefulness did he use? How many uses of ':=' does he anticipate, or consider enough to justify the addition? >> There are 3 in idlelib that I plan to replace with {**a, **b} and be >> done with the issue. I did not check any other packages. > > If a+b already worked for dicts, would you still prefer {**a, **b}? Example: {**sys.modules, **globals()} Aside from the fact that I can patch *and* backport to 3.7 *now*, I think so. The latter clearly (to me) maps mappings to a dict. > How about if it were spelled a|b? As in sys.modules | globals() or (sys.modules | globals())? Closer. -- Terry Jan Reedy From jfine2358 at gmail.com Fri Mar 22 03:47:01 2019 From: jfine2358 at gmail.com (Jonathan Fine) Date: Fri, 22 Mar 2019 07:47:01 +0000 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <20190321164159.GC29550@ando.pearwood.info> <20190321180636.41498b1b@fsol> <33e49cbf-0e40-9140-f5e0-5374865bca66@kynesim.co.uk> <20190321202151.6ca7e96a@fsol> Message-ID: Chris Angelico wrote: > The trouble with that is that you can't always use a dict subclass (or > a non-subclass MutableMapping implementation, etc, etc, etc). There > are MANY situations in which Python will give you an actual real dict, > and it defeats the purpose if you then have to construct an > AddableDict out of it just so you can add something to it. Not every > proposed change makes sense on PyPI, and it definitely won't get a > fair representation in "practical experience". 
Chris seems to accept that sometimes you can use a dict subclass, and that my proposal will give some representation of "practical experience". Even if not perfect, such benefits are I think worth having. And Chris gives no evidence (or examples) beyond his own assertions that my proposal would not produce a fair representation of practical experience. Why don't we just try it and see? This would engage us with the users. And it would, as I suggested, clarify and document the syntax and semantics. And provide backporting to current versions of Python. By the way, in "Masterminds of Programming" [page 20], Guido gives four lines of defence against the unwise addition of a "favorite feature" to the language. They are [1] Explain to people that they can already do what they want. [2] Tell them to write their own module or class to encapsulate the feature. [3] Accept the feature, as pure Python, in the standard library. [4] Accept the feature as a C-Python extension standard. And [4] is, in Guido's words > the last line of defense before we have to admit [...] > this is so useful [...] so we'll have to change the language I think the pure Python implementation is important. If the supporters of this proposal are not willing to provide this, then I will (along with anyone else who volunteers).
http://shop.oreilly.com/product/9780596515171.do # Masterminds of Programming -- Jonathan From rosuav at gmail.com Fri Mar 22 03:49:12 2019 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 22 Mar 2019 18:49:12 +1100 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <20190321164159.GC29550@ando.pearwood.info> <20190321234659.GQ12502@ando.pearwood.info> <20190322021451.GS12502@ando.pearwood.info> Message-ID: On Fri, Mar 22, 2019 at 1:58 PM David Mertz wrote: > > On Thu, Mar 21, 2019, 10:15 PM Steven D'Aprano wrote: >> > Of course I could learn it and teach it, but it will always feel >> > like a wart in the language. >> >> Would that wartness be lessened if it were spelled | or << instead? > > Yes, definitely. Both those spellings feel pretty natural to me. They don't have the misleading associations '+' carries. I'm kinda fond of '<<' because it visually resembles an arrow that I can think of as "put the stuff here into there". > Please no. The "cuteness" value of abusing the operator to indicate information flow got old shortly after C++ did it, and it doesn't help. With normal operator overloading, you can say "the + operator means addition", and then define "addition" for different types. Perhaps that ship has sailed, since we already have division between path objects, but at least in that example it is VERY closely related. There's no use of "<<" inside string literals with dictionaries the way there's "/foo/bar/spam" in paths. Dictionary merging is a form of addition. It's also related to set union, which is well known as part of the pipe operator. Either of those is far better than abusing left shift.
ChrisA From rosuav at gmail.com Fri Mar 22 03:51:49 2019 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 22 Mar 2019 18:51:49 +1100 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <20190321164159.GC29550@ando.pearwood.info> <20190321180636.41498b1b@fsol> <33e49cbf-0e40-9140-f5e0-5374865bca66@kynesim.co.uk> <20190321202151.6ca7e96a@fsol> Message-ID: On Fri, Mar 22, 2019 at 6:47 PM Jonathan Fine wrote: > > Chris Angelico wrote: > > > The trouble with that is that you can't always use a dict subclass (or > > a non-subclass MutableMapping implementation, etc, etc, etc). There > > are MANY situations in which Python will give you an actual real dict, > > and it defeats the purpose if you then have to construct an > > AddableDict out of it just so you can add something to it. Not every > > proposed change makes sense on PyPI, and it definitely won't get a > > fair representation in "practical experience". > > Chris seems to accept that sometimes you can use a dict subclass, and > that my proposal will give some representation of "practical > experience". I said "definitely won't", not "will give some". So, no. ChrisA From jfine2358 at gmail.com Fri Mar 22 03:59:03 2019 From: jfine2358 at gmail.com (Jonathan Fine) Date: Fri, 22 Mar 2019 07:59:03 +0000 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <20190321235508.GR12502@ando.pearwood.info> References: <20190301162645.GM4465@ando.pearwood.info> <20190321164159.GC29550@ando.pearwood.info> <20190321180636.41498b1b@fsol> <33e49cbf-0e40-9140-f5e0-5374865bca66@kynesim.co.uk> <20190321202151.6ca7e96a@fsol> <20190321235508.GR12502@ando.pearwood.info> Message-ID: Steven D'Aprano wrote: > (For the record, the PEP isn't set in stone in regards to the choice of > operator. Steven: Please say what parts of the PEP you consider to be set in stone. This will allow discussion to focus on essentials rather than details. 
-- Jonathan From jfine2358 at gmail.com Fri Mar 22 04:36:12 2019 From: jfine2358 at gmail.com (Jonathan Fine) Date: Fri, 22 Mar 2019 08:36:12 +0000 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <20190321234659.GQ12502@ando.pearwood.info> References: <20190301162645.GM4465@ando.pearwood.info> <20190321164159.GC29550@ando.pearwood.info> <20190321234659.GQ12502@ando.pearwood.info> Message-ID: Python fits well in the mind, because (1) by design it reduces cognitive load, and (2) it encourages its users to reduce cognitive load, and (3) we have a culture of reading code, taking pride in our code. Readability counts. https://en.wikipedia.org/wiki/Cognitive_load Steven D'Aprano says that examples such as below don't help us discuss the cognitive load associated with dict + dict. > 1. Toy examples using generic names don't count. > eggs += spam I assume he's referring to my example >>> items.update(points) >>> items += points In this example, items.update gives useful additional information. We expect, from duck typing and sensible naming, that points can be iterated to give key value pairs. In Python, when >>> a + b gives one of TypeError: unsupported operand type(s) for +: 'int' and 'str' TypeError: Can't convert 'int' object to str implicitly we get a very strong hint to write instead something like a + int(b) str(a) + b so that the nature of the addition is made clear to the next person who reads the code (who might be ourselves, in a crisis, in ten years' time.) (JavaScript does implicit conversion. This makes the code easier to write, harder to read, and harder to maintain.) For certain values of dct and lst we get >>> lst += dct >>> lst [('a', 1), ('b', 2), 'c', 'd'] For the same values of dct and lst (if proposal allowed) >>> dct += lst >>> dct {'a': 1, 'b': 2, 'c': 3, 'd': 4} In these examples, dct is a dict, and lst is a list.
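Both snippets are easy to check. The first runs today as-is; the second is not legal today, but dict.update already accepts an iterable of (key, value) pairs, which is the behaviour the proposed += would build on. (The values below are chosen separately for each snippet so the printed results come out as shown.)

```python
# What happens today: list += dict extends the list with the dict's KEYS.
dct = {'c': 3, 'd': 4}
lst = [('a', 1), ('b', 2)]
lst += dct
print(lst)  # [('a', 1), ('b', 2), 'c', 'd']

# The proposed dict += list is simulated here with dict.update, which
# already accepts an iterable of (key, value) pairs:
dct = {'a': 1, 'b': 2}
lst = [('c', 3), ('d', 4)]
dct.update(lst)
print(dct)  # {'a': 1, 'b': 2, 'c': 3, 'd': 4}
```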
This behaviour is something Python users will have to learn, and have in their mind, whenever they see '+=' in unfamiliar code. I find this as much an unwelcome cognitive load as that produced by JavaScript's > 2 * "8" 16 > 2 + "8" "28" To be fair, this may in part be a problem with our expectations about +=. -- Jonathan From steve at pearwood.info Fri Mar 22 05:53:22 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 22 Mar 2019 20:53:22 +1100 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <20190321164159.GC29550@ando.pearwood.info> <20190321180636.41498b1b@fsol> <33e49cbf-0e40-9140-f5e0-5374865bca66@kynesim.co.uk> <20190321202151.6ca7e96a@fsol> <20190321235508.GR12502@ando.pearwood.info> Message-ID: <20190322095322.GV12502@ando.pearwood.info> On Fri, Mar 22, 2019 at 07:59:03AM +0000, Jonathan Fine wrote: > Steven D'Aprano wrote: > > > (For the record, the PEP isn't set in stone in regards to the choice of > > operator. > > Steven: Please say what parts of the PEP you consider to be set in > stone. This will allow discussion to focus on essentials rather than > details. The PEP is primarily about making a merge operator, so that stays, regardless of whether it's spelled +, | or something else. Otherwise there's no point to the PEP. If there is demand for a merged() method/function, that can go into a competing PEP, but it won't be part of this PEP. If anyone wants to propose syntax for chained method calls (fluent programming) so we can write d.copy().update(), that won't be in this PEP either. Likewise for new syntax to turn method calls into operators. Feel free to propose a competing PEP (and I might even support yours, if it makes a good enough case). A grey area is the "last wins" merge behaviour matching update(). In theory, if somebody made an absolutely brilliant case for some other behaviour, I could change my mind, but it would have to be pretty amazing.
In the absence of such, I'm going to use my prerogative as PEP author to choose the behaviour I prefer to see, and leave alternatives to subclasses. -- Steven From p.f.moore at gmail.com Fri Mar 22 06:09:57 2019 From: p.f.moore at gmail.com (Paul Moore) Date: Fri, 22 Mar 2019 10:09:57 +0000 Subject: [Python-ideas] dict.merge(d1, d2, ...) (Counter proposal for PEP 584) In-Reply-To: References: <20190321154334.70fc245a@fsol> <20190322045351.GT12502@ando.pearwood.info> Message-ID: On Fri, 22 Mar 2019 at 07:46, Terry Reedy wrote: > On 3/22/2019 12:53 AM, Steven D'Aprano wrote: > > If a+b already worked for dicts, would you still prefer {**a, **b}? > > Example: {**sys.modules, **globals()} > > Aside from the fact that I can patch *and* backport to 3.7 *now*, I > think so. The latter clearly (to me) maps mappings to a dict. > > > How about if it were spelled a|b? > > As in sys.modules | globals()? Closer. Adding a comment here because it's new information (to me, about my subjective preferences, at least). I accept that it's "just" more comment on the whole point about what people subjectively prefer, but at some point the *amount* of subjective preference has to be considered, not everything can be decided purely on objective grounds, so hopefully it's still a useful data point. This is probably the first example of "real world" code written using {**d1, **d2} notation alongside d1+d2 and d1|d2 notation that has caught my attention (I've been skimming, I may have missed some). And I have to say that I find {**d1, **d2} (when used with real values rather than d1 and d2) *far* more obvious in context than either of the operator notations. I wouldn't have expected that - my intuition was that {**d1, **d2} is too punctuation-heavy and "perlish". But surprisingly that's not the case at all. If I ever needed side effect free dictionary merging as an expression, I'd now definitely prefer {**d1, **d2} to any operator form.
Paul From storchaka at gmail.com Fri Mar 22 09:49:13 2019 From: storchaka at gmail.com (Serhiy Storchaka) Date: Fri, 22 Mar 2019 15:49:13 +0200 Subject: [Python-ideas] Dict joining using + and += In-Reply-To: References: <0433166f-4f88-93d6-327b-ed1708c17fa7@gmail.com> <378B4AB0-A949-462F-ADCE-10D40E3969CC@killingar.net> <20190301104427.GH4465@ando.pearwood.info> Message-ID: 04.03.19 15:43, Serhiy Storchaka wrote: > 01.03.19 12:44, Steven D'Aprano wrote: >> On Fri, Mar 01, 2019 at 08:47:36AM +0200, Serhiy Storchaka wrote: >>> Also, if the custom dict subclass implemented the plus operator with >>> different semantics which support the addition with a dict, this change >>> will break it, because dict + CustomDict will call dict.__add__ instead >>> of CustomDict.__radd__. >> >> That's not how operators work in Python or at least that's not how they >> worked the last time I looked: if the behaviour has changed without >> discussion, that's a breaking change that should be reverted. > > You are right. Actually there is still a problem if the first argument is an instance of dict subclass that does not implement __add__. From brandtbucher at gmail.com Fri Mar 22 13:26:07 2019 From: brandtbucher at gmail.com (Brandt Bucher) Date: Fri, 22 Mar 2019 10:26:07 -0700 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: <20190321235508.GR12502@ando.pearwood.info> References: <20190301162645.GM4465@ando.pearwood.info> <20190321164159.GC29550@ando.pearwood.info> <20190321180636.41498b1b@fsol> <33e49cbf-0e40-9140-f5e0-5374865bca66@kynesim.co.uk> <20190321202151.6ca7e96a@fsol> <20190321235508.GR12502@ando.pearwood.info> Message-ID: >> On Mar 21, 2019, at 16:55, Steven D'Aprano wrote: >> >> On Thu, Mar 21, 2019 at 03:10:48PM -0700, Brandt Bucher wrote: >> For anyone interested in "trying it out": if you're not against cloning and >> compiling CPython yourself, here is a PEP 584 C implementation I have PR'd >> against master right now.
I'm keeping it in sync with the draft PEP as it >> changes, so subtraction performance is not overly optimized yet, but it >> will show you the *exact* behavior outlined in the PEP on the dict builtin >> and its subclasses. The relevant branch is called "addiction". You can >> clone it from: > > That's great, thank you! > > For the sake of comparisons, could you support | as an alias? That will > allow people to get a feel for whether a+b or a|b looks nicer. > > > (For the record, the PEP isn't set in stone in regards to the choice of > operator. > >> https://github.com/brandtbucher/cpython.git Great idea. I just added this, and all tests are passing. For reference, here's the PR (it's linked to the BPO, too): https://github.com/python/cpython/pull/12088 -------------- next part -------------- An HTML attachment was scrubbed... URL: From pythonchb at gmail.com Fri Mar 22 18:28:37 2019 From: pythonchb at gmail.com (Christopher Barker) Date: Fri, 22 Mar 2019 12:28:37 -1000 Subject: [Python-ideas] PEP: Dict addition and subtraction In-Reply-To: References: <20190301162645.GM4465@ando.pearwood.info> <20190321164159.GC29550@ando.pearwood.info> <20190321180636.41498b1b@fsol> <33e49cbf-0e40-9140-f5e0-5374865bca66@kynesim.co.uk> <20190321202151.6ca7e96a@fsol> Message-ID: Oops, meant for the list: > >> > >> > Chris seems to accept that sometimes you can use a dict subclass, and > >> > that my proposal will give some representation of "practical > >> > experience". > >> > >> I said "definitely won't", not "will give some". So, no. > > > > > > Jonathan: Chris A. is right. But if you think an example implementation > is useful, go ahead and make one - it's actually quite easy, and I'm pretty > sure code has already been posted here as well.
> > -- Christopher Barker, PhD Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython -------------- next part -------------- An HTML attachment was scrubbed... URL: From andre.roberge at gmail.com Sat Mar 23 10:36:20 2019 From: andre.roberge at gmail.com (Andre Roberge) Date: Sat, 23 Mar 2019 11:36:20 -0300 Subject: [Python-ideas] Enabling / disabling optional type hinting Message-ID: Consider the following example [1]:

Python 3.7.0 (v3.7.0:1bf9cc5093...
>>> d = {
...     "injury": "flesh wound"
... }
>>> d["answer"]: 42
>>> if "answer" in d:
...     print("Don't panic!")
... else:
...     print("Sorry, I can't help you.")
...
Sorry, I can't help you.

= = No SyntaxError raised (which would have been the case before version 3.5?) and yet what could be a very unexpected result occurred. Of course, the problem is with the line d["answer"]: 42 which is not an assignment but is an "optional" type hint. I think it would be very useful to have a way to turn off completely type hinting and flag any use of code like the above as a SyntaxError. My preference would be if type hinting would be something that is enabled per file with a top-level comment pragma, something like # type: enable Failing that, having a top-level comment pragma like # type: disable might be acceptable as it would not require any change to any existing file. However, the other option would be more in line with type hinting being "optional" and something that should not trip beginners who are not aware of its existence. André Roberge [1] This example was inspired by a blog post I read yesterday and which I cannot find; I apologize to the author of that blog post.
(Counter proposal for PEP 584) In-Reply-To: References: <20190321154334.70fc245a@fsol> <20190322045351.GT12502@ando.pearwood.info> Message-ID: > > I think that's a good indication that there are uses for a merge > > operator. > > Some, yes. Enough for new syntax? Let's be clear here - this would not be new syntax - the operator(s) already exist and are commonly used and overloaded already. This would be a minor change to the dictionary class (and maybe the Mapping ABC), not a change to the language. Are > there existing syntax features so sparsely used? I wonder how often + is used with lists in the stdlib... What is the bar for > something that adds no new function, but saves 6 chars and is easier to > understand for at least some? The "height of the bar" depends not just on how it would be used, but also by how disruptive it is. As this is not nearly as disruptive as, say :=, I think the bar is pretty low. But others seem to think it would add great confusion, which would raise the bar a lot. By the way, if it isn't used much, that also means it wouldn't be very disruptive. :-) I'm coming down on the side of "not worth the argument" -CHB -- Christopher Barker, PhD Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg at krypto.org Sat Mar 23 13:37:43 2019 From: greg at krypto.org (Gregory P. Smith) Date: Sat, 23 Mar 2019 10:37:43 -0700 Subject: [Python-ideas] Enabling / disabling optional type hinting In-Reply-To: References: Message-ID: On Sat, Mar 23, 2019 at 7:37 AM Andre Roberge wrote: > Consider the following example [1]: > > Python 3.7.0 (v3.7.0:1bf9cc5093... > >>> d = { > ... "injury": "flesh wound" > ... } > >>> d["answer"]: 42 > >>> if "answer" in d: > ... print("Don't panic!") > ... else: > ... print("Sorry, I can't help you.") > ... > Sorry, I can't help you.
> > = = > No SyntaxError raised (which would have been the case before version 3.5?) > and yet what could be a very unexpected result occurred. > > Of course, the problem is with the line > > d["answer"]: 42 > > which is not an assignment but is an "optional" type hint. > A useless statement like that isn't likely to be typed. I've never seen anyone do that. Sure, someone is going to typo and omit the = from a := assignment in 3.8 but the walrus is unlikely to be used outside of a conditional or loop test context so this seems like a made-up problem. (if anything, this possibility encourages people not to use the walrus in unintended places). Someone might also typo it by meaning to use a ; for multiple statements but enter : instead. Again, very rare. Because ; is frowned upon. A linter (running live as part of an editor/ide these days) can flag most meaningless annotations quite easily. I think it would be very useful to have a way to turn off completely type > hinting and flag any use of code like the above as a SyntaxError. > > My preference would be if type hinting would be something that is enabled > per file with a top-level comment pragma, something like > > # type: enable > > Failing that, having a top-level comment pragma like > > # type: disable > Too late. Requiring something like that would break existing code. might be acceptable as it would not require any change to any existing > file. However, the other option would be more in line with type hinting > being "optional" and something that should not trip beginners who are not > aware of its existence. > What evidence is there that this frequently trips up beginners? Enhancing the dev environments beginners use to automatically lint would be better and would automate some learning, handling cases way beyond this one. -gps -------------- next part -------------- An HTML attachment was scrubbed...
URL: From skrah at bytereef.org Sat Mar 23 13:59:19 2019 From: skrah at bytereef.org (Stefan Krah) Date: Sat, 23 Mar 2019 18:59:19 +0100 Subject: [Python-ideas] Enabling / disabling optional type hinting In-Reply-To: References: Message-ID: <20190323175919.GA14413@bytereef.org> On Sat, Mar 23, 2019 at 10:37:43AM -0700, Gregory P. Smith wrote: > A useless statement like that isn't likely to be typed. I've never seen > anyone do that. Unlikely yes, but ideally type annotations should not alter program behavior:

>>> d = {}
>>> try: d["x"]
... except KeyError: print("KeyError")
...
KeyError
>>>
>>> d = {}
>>> try: d["x"] : int
... except KeyError: print("KeyError")
...

Stefan Krah From greg at krypto.org Sat Mar 23 15:44:29 2019 From: greg at krypto.org (Gregory P. Smith) Date: Sat, 23 Mar 2019 12:44:29 -0700 Subject: [Python-ideas] Enabling / disabling optional type hinting In-Reply-To: <20190323175919.GA14413@bytereef.org> References: <20190323175919.GA14413@bytereef.org> Message-ID: On Sat, Mar 23, 2019 at 11:00 AM Stefan Krah wrote: > On Sat, Mar 23, 2019 at 10:37:43AM -0700, Gregory P. Smith wrote: > > A useless statement like that isn't likely to be typed. I've never seen > > anyone do that. > > Unlikely yes, but ideally type annotations should not alter program > behavior: > > >>> d = {} > >>> try: d["x"] > ... except KeyError: print("KeyError") > ... > KeyError > >>> > >>> d = {} > >>> try: d["x"] : int > ... except KeyError: print("KeyError") > ... > Unfortunately that isn't what PEP 526 said: https://www.python.org/dev/peps/pep-0526/#annotating-expressions -gps -------------- next part -------------- An HTML attachment was scrubbed...
URL: From skrah at bytereef.org Sat Mar 23 15:52:29 2019 From: skrah at bytereef.org (Stefan Krah) Date: Sat, 23 Mar 2019 20:52:29 +0100 Subject: [Python-ideas] Enabling / disabling optional type hinting In-Reply-To: References: <20190323175919.GA14413@bytereef.org> Message-ID: <20190323195229.GA27383@bytereef.org> On Sat, Mar 23, 2019 at 12:44:29PM -0700, Gregory P. Smith wrote: > Unfortunately that isn't what PEP 526 said: > https://www.python.org/dev/peps/pep-0526/#annotating-expressions Which part though? I'd understand ... (x): int # Annotates x with int, (x) treated as expression by compiler. ... to mean that the expression is also evaluated if no assignment takes place. Stefan Krah From ned at nedbatchelder.com Sat Mar 23 17:25:57 2019 From: ned at nedbatchelder.com (Ned Batchelder) Date: Sat, 23 Mar 2019 17:25:57 -0400 Subject: [Python-ideas] Enabling / disabling optional type hinting In-Reply-To: References: Message-ID: <30cf14b5-fa1c-a029-d7cd-30e879b5bf48@nedbatchelder.com> On 3/23/19 1:37 PM, Gregory P. Smith wrote: > Sure, someone is going to typo and omit the = from a := assignment in > 3.8 but the walrus is unlikely to be used outside of a conditional or > loop test context so this seems like a made up problem. Walruses aren't allowed as a top-level expression anyway:

Python 3.8.0a2 (default, Feb 25 2019, 17:15:37)
[Clang 10.0.0 (clang-1000.10.44.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> d["answer"] := 42
  File "<stdin>", line 1
    d["answer"] := 42
                ^
SyntaxError: invalid syntax

--Ned.
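To make the behaviour at issue in this thread concrete, here is a minimal runnable sketch (the dict and key names follow the example from the original post):

```python
d = {"injury": "flesh wound"}

# The trailing ": 42" makes this an annotation statement (PEP 526),
# not an assignment, so no item is ever stored in the dict.
d["answer"]: 42
print("answer" in d)   # False

# The intended statement uses "=", not ":".
d["answer"] = 42
print("answer" in d)   # True
```

Running the first half reproduces the "Sorry, I can't help you." branch of the original example, since the key is never added.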
From andre.roberge at gmail.com Sat Mar 23 17:41:12 2019 From: andre.roberge at gmail.com (Andre Roberge) Date: Sat, 23 Mar 2019 18:41:12 -0300 Subject: [Python-ideas] Enabling / disabling optional type hinting In-Reply-To: <30cf14b5-fa1c-a029-d7cd-30e879b5bf48@nedbatchelder.com> References: <30cf14b5-fa1c-a029-d7cd-30e879b5bf48@nedbatchelder.com> Message-ID: On Sat, Mar 23, 2019 at 6:26 PM Ned Batchelder wrote: > On 3/23/19 1:37 PM, Gregory P. Smith wrote: > > Sure, someone is going to typo and omit the = from a := assignment in > > 3.8 but the walrus is unlikely to be used outside of a conditional or > > loop test context so this seems like a made up problem. > My original message was referring to someone writing ":" instead of "=" by mistake -- nothing to do with the walrus assignment, but rather using the same notation to assign a value to a key as they would when defining a dict. André Roberge > > Walruses aren't allowed as a top-level expression anyway: > > Python 3.8.0a2 (default, Feb 25 2019, 17:15:37) > [Clang 10.0.0 (clang-1000.10.44.4)] on darwin > Type "help", "copyright", "credits" or "license" for more information. > >>> d["answer"] := 42 > File "<stdin>", line 1 > d["answer"] := 42 > ^ > SyntaxError: invalid syntax > > > --Ned. > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed...
URL: From guido at python.org Sat Mar 23 19:20:40 2019 From: guido at python.org (Guido van Rossum) Date: Sat, 23 Mar 2019 16:20:40 -0700 Subject: [Python-ideas] Enabling / disabling optional type hinting In-Reply-To: References: <30cf14b5-fa1c-a029-d7cd-30e879b5bf48@nedbatchelder.com> Message-ID: On Sat, Mar 23, 2019 at 2:43 PM Andre Roberge wrote: > My original message was referring to someone writing ":" instead of "=" by > mistake -- nothing to do with the walrus assignment, but rather using the > same notation to assign a value to a key as they would when defining a dict. > OK, I read your Original Post for this thread, about accidentally writing `d["answer"]: 42` instead of `d["answer"] = 42`. My reaction is that this was a user mistake of the same kind as accidentally writing `x + 1` instead of `x += 1`. That's just going to happen, very occasionally. (Though why? The ':' and '=' keys are not that close together.) Read your code carefully, or in an extreme case step through it in a debugger, and you'll notice the mistake. It's not a reason to pick on that particular syntax, and not much of a reason to try and introduce a mechanism to disable type hints. Sorry. PS. This particular syntax was introduced by PEP 526, and introduced in Python 3.6. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From apalala at gmail.com Sat Mar 23 19:35:25 2019 From: apalala at gmail.com (=?UTF-8?Q?Juancarlo_A=C3=B1ez?=) Date: Sat, 23 Mar 2019 19:35:25 -0400 Subject: [Python-ideas] Why not ['a','b','c'].join(',') ? Message-ID: I know it has been discussed endlessly, so just a gentle reminder about the final arguments would be good. I think I remember it was discussed recently, mentioning that join() doesn't convert elements to strings? 
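For readers wanting the detail being half-remembered here: join() is a method of the separator string, and it does not convert items to strings. A quick sketch of both points:

```python
# join() is called on the separator, with the iterable as argument:
print(",".join(["a", "b", "c"]))   # a,b,c

# It does not stringify elements; a non-str item raises TypeError:
try:
    ",".join(["a", 1])
except TypeError:
    print("join() rejects non-string items")

# The usual workaround is an explicit conversion:
print(",".join(map(str, [1, 2, 3])))   # 1,2,3
```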
This came up while reading this speculative article about how programmers migrate from one programming language to the other (I call it speculative because there's no hard data, but the conclusions and comments pretty much match my experience and my observations over decades [I would add an arrow from Python3 to Rust]). https://apenwarr.ca/log/20190318 In that sense, I think it wouldn't hurt to make Python more familiar/friendly to people coming from other languages, even if it breaks "There should be one-- and preferably only one --obvious way to do it." -- Juancarlo *Añez* -------------- next part -------------- An HTML attachment was scrubbed... URL: From mertz at gnosis.cx Sat Mar 23 19:36:16 2019 From: mertz at gnosis.cx (David Mertz) Date: Sat, 23 Mar 2019 19:36:16 -0400 Subject: [Python-ideas] Enabling / disabling optional type hinting In-Reply-To: References: <30cf14b5-fa1c-a029-d7cd-30e879b5bf48@nedbatchelder.com> Message-ID: I agree with Guido. Yes, there are sequences of symbols that used to be syntax errors but that now have some meaning in Python. The type annotation colon is one of them. There are moderately many other constructs this can be said of. I can vaguely imagine accidentally using a colon rather than an equal sign because the colon is similar to an assignment in a dict display. I don't think anyone would do that as a slip of the finger, but more as a brain glitch. However, this feels like a very uncommon hazard, and just one among dozens of similar ones. I don't think anything should be "fixed" at a language level. On Sat, Mar 23, 2019, 7:21 PM Guido van Rossum wrote: > On Sat, Mar 23, 2019 at 2:43 PM Andre Roberge > wrote: > >> My original message was referring to someone writing ":" instead of "=" >> by mistake -- nothing to do with the walrus assignment, but rather using >> the same notation to assign a value to a key as they would when defining a >> dict.
>> > > OK, I read your Original Post for this thread, about accidentally writing > `d["answer"]: 42` instead of `d["answer"] = 42`. > > My reaction is that this was a user mistake of the same kind as > accidentally writing `x + 1` instead of `x += 1`. That's just going to > happen, very occasionally. (Though why? The ':' and '=' keys are not that > close together.) Read your code carefully, or in an extreme case step > through it in a debugger, and you'll notice the mistake. > > It's not a reason to pick on that particular syntax, and not much of a > reason to try and introduce a mechanism to disable type hints. Sorry. > > PS. This particular syntax was introduced by PEP 526, and introduced in > Python 3.6. > > -- > --Guido van Rossum (python.org/~guido) > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjreedy at udel.edu Sat Mar 23 23:19:50 2019 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 23 Mar 2019 23:19:50 -0400 Subject: [Python-ideas] Why not ['a','b','c'].join(',') ? In-Reply-To: References: Message-ID: On 3/23/2019 7:35 PM, Juancarlo Añez wrote: > I know it has been discussed endlessly, So please read any of the endless discussions either on this list or python-list. I myself have answered multiple times. -- Terry Jan Reedy From storchaka at gmail.com Sun Mar 24 02:53:10 2019 From: storchaka at gmail.com (Serhiy Storchaka) Date: Sun, 24 Mar 2019 08:53:10 +0200 Subject: [Python-ideas] dict.merge(d1, d2, ...) (Counter proposal for PEP 584) In-Reply-To: References: <20190321154334.70fc245a@fsol> <20190322045351.GT12502@ando.pearwood.info> Message-ID: 23.03.19 18:24, Christopher Barker wrote: > I wonder how often + is used with lists in the stdlib...
Searching for "+ [" shows that even concatenating with the string display and comprehensions is several times more common than merging dicts. And there should be cases not covered by this simple search, and concatenating of tuples and other sequences. Also, using + for sequences is a generalization of using it for strings and bytes objects, which are even more common. From greg at krypto.org Sun Mar 24 03:50:40 2019 From: greg at krypto.org (Gregory P. Smith) Date: Sun, 24 Mar 2019 00:50:40 -0700 Subject: [Python-ideas] Add subprocess.Popen suspend() and resume() In-Reply-To: References: Message-ID: I don't think this belongs in subprocess. It isn't related to process creation. A module on PyPI with the Windows code would make more sense. On Wed, Mar 20, 2019 at 3:19 PM eryk sun wrote: > On 3/18/19, Giampaolo Rodola' wrote: > > > > I've been having these 2 implemented in psutil for a long time. On > > POSIX these are convenience functions using os.kill() + SIGSTOP / > > SIGCONT (the same as CTRL+Z / "fg"). On Windows they use > > undocumented NtSuspendProcess and NtResumeProcess Windows > > APIs available since XP. > > Currently, Windows Python only calls documented C runtime-library and > Windows API functions. It doesn't directly call NT runtime-library and > system functions. Maybe it could in the case of documented functions, > but calling undocumented functions in the standard library should be > avoided. Unfortunately, without NtSuspendProcess and NtResumeProcess, > I don't see a way to reliably implement this feature for Windows. I'm > CC'ing Steve Dower. He might say it's okay in this case, or know of > another approach. > > DebugActiveProcess, the other simple approach mentioned in the linked > SO answer [1], is unreliable and has the wrong semantics. A process > only has a single debug port, so DebugActiveProcess will fail the PID > as an invalid parameter if another debugger is already attached to the > process.
(The underlying NT call, DbgUiDebugActiveProcess, fails with > STATUS_PORT_ALREADY_SET.) Additionally, the semantics that I expect > here, at least for Windows, is that each call to suspend() will > require a corresponding call to resume(), since it's incrementing the > suspend count on the threads; however, a debugger can't reattach to > the same process. Also, if the Python process exits while it's > attached as a debugger, the system will terminate the debugee as well, > unless we call DebugSetProcessKillOnExit(0), but that interferes with > the Python process acting as a debugger normally, as does this entire > wonky idea. Also, the debugging system creates a thread in the debugee > that calls NT DbgUiRemoteBreakin, which executes a breakpoint. This > thread is waiting, but it's not suspended, so the process will never > actually appear as suspended in Task Manager or Process Explorer. > > That leaves enumerating threads in a snapshot and calling OpenThread > and SuspendThread on each thread that's associated with the process. > In comparison, let's take an abridged look at the guts of > NtSuspendProcess. > > nt!NtSuspendProcess: > ... > mov r8,qword ptr [nt!PsProcessType] > ... > call nt!ObpReferenceObjectByHandleWithTag > ... > call nt!PsSuspendProcess > ... > mov ebx,eax > call nt!ObfDereferenceObjectWithTag > mov eax,ebx > ... > ret > > nt!PsSuspendProcess: > ... > call nt!ExAcquireRundownProtection > cmp al,1 > jne nt!PsSuspendProcess+0x74 > ... > call nt!PsGetNextProcessThread > xor ebx,ebx > jmp nt!PsSuspendProcess+0x62 > > nt!PsSuspendProcess+0x4d: > ... > call nt!PsSuspendThread > ... > call nt!PsGetNextProcessThread > > nt!PsSuspendProcess+0x62: > ... > test rax,rax > jne nt!PsSuspendProcess+0x4d > ... > call nt!ExReleaseRundownProtection > jmp nt!PsSuspendProcess+0x79 > > nt!PsSuspendProcess+0x74: > mov ebx,0C000010Ah (STATUS_PROCESS_IS_TERMINATING) > > nt!PsSuspendProcess+0x79: > ... > mov eax,ebx > ... 
> ret > > This code repeatedly calls PsGetNextProcessThread to walk the > non-terminated threads of the process in creation order (based on a > linked list in the process object) and suspends each thread via > PsSuspendThread. In contrast, a Tool-Help thread snapshot is > unreliable since it won't include threads created after the snapshot > is created. The alternative is to use a different undocumented system > call, NtGetNextThread [2], which is implemented via > PsGetNextProcessThread. But that's slightly worse than calling > NtSuspendProcess. > > [1]: https://stackoverflow.com/a/11010508 > [2]: > https://github.com/processhacker/processhacker/blob/v2.39/phnt/include/ntpsapi.h#L848 > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From evrial at gmail.com Sun Mar 24 04:42:39 2019 From: evrial at gmail.com (Alex Grigoryev) Date: Sun, 24 Mar 2019 10:42:39 +0200 Subject: [Python-ideas] New explicit methods to trim strings Message-ID: <7D84D131-65B6-4EF7-9C43-51957F9DFAA9@getmailspring.com> Following the discussion here (https://link.getmailspring.com/link/7D84D131-65B6-4EF7-9C43-51957F9DFAA9 at getmailspring.com/0?redirect=https%3A%2F%2Fbugs.python.org%2Fissue36410&recipient=cHl0aG9uLWlkZWFzQHB5dGhvbi5vcmc%3D) I propose to add 3 new string methods: str.trim, str.ltrim, str.rtrim Another option would be to change API for str.split method to work correctly with sequences. 
In [1]: def ltrim(s, seq): ...: return s[len(seq):] if s.startswith(seq) else s ...: In [2]: def rtrim(s, seq): ...: return s[:-len(seq)] if s.endswith(seq) else s ...: In [3]: def trim(s, seq): ...: return ltrim(rtrim(s, seq), seq) ...: In [4]: s = 'mailto:maria at gmail.com' In [5]: ltrim(s, 'mailto:') Out[5]: 'maria at gmail.com' In [6]: rtrim(s, 'com') Out[6]: 'mailto:maria at gmail.' In [7]: trim(s, 'm') Out[7]: 'ailto:maria at gmail.co' -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Sun Mar 24 05:04:20 2019 From: rosuav at gmail.com (Chris Angelico) Date: Sun, 24 Mar 2019 20:04:20 +1100 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: <7D84D131-65B6-4EF7-9C43-51957F9DFAA9@getmailspring.com> References: <7D84D131-65B6-4EF7-9C43-51957F9DFAA9@getmailspring.com> Message-ID: On Sun, Mar 24, 2019 at 7:43 PM Alex Grigoryev wrote: > > Following the discussion here I propose to add 3 new string methods: str.trim, str.ltrim, str.rtrim > Another option would be to change API for str.split method to work correctly with sequences. > > In [1]: def ltrim(s, seq): > ...: return s[len(seq):] if s.startswith(seq) else s > ...: > [corresponding functions snipped] > You may need to clarify here one of two options: either ltrim accepts *only and precisely* a string, not an arbitrary sequence (as your parameter naming suggests); or that it accepts an arbitrary sequence, but with different semantics to your example. With str.startswith, any sequence can be accepted, and if the string starts with *any* of the strings, it will return True: >>> "abcd".startswith(("ab", "qw", "12")) True Your simple one-liner would take the length of the tuple (3) and remove that many characters. 
From the BPO discussion, I suspect you actually just want to use a single string here, but I could be wrong, especially with the suggestion to make str.split work with sequences; do you mean that you want to be able to split on any string in the sequence, or split arbitrary sequences, or something else? (Another option here - since an email address won't usually contain a colon - would be to use s.replace("mailto:", "") to remove the prefix. Technically it IS valid and possible, but it's not something I see in the wild, so you're unlikely to break anyone's address by removing "mailto:" out of the middle of it.) For complicated string matching and replacement work, you may need to reach for the 're' module. Yes, I'm aware that then you'll have two problems, but it's in the stdlib for a reason. ChrisA From jfine2358 at gmail.com Sun Mar 24 05:11:40 2019 From: jfine2358 at gmail.com (Jonathan Fine) Date: Sun, 24 Mar 2019 09:11:40 +0000 Subject: [Python-ideas] Why not ['a','b','c'].join(',') ? In-Reply-To: References: Message-ID: I'm willing to provide some useful information, if you're willing to write it up into a good blog post. -- Jonathan -------------- next part -------------- An HTML attachment was scrubbed... URL: From boxed at killingar.net Sun Mar 24 05:34:49 2019 From: boxed at killingar.net (=?utf-8?Q?Anders_Hovm=C3=B6ller?=) Date: Sun, 24 Mar 2019 10:34:49 +0100 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: <7D84D131-65B6-4EF7-9C43-51957F9DFAA9@getmailspring.com> References: <7D84D131-65B6-4EF7-9C43-51957F9DFAA9@getmailspring.com> Message-ID: <7F3E2818-17C1-4987-A61A-BDED8CF5ADB6@killingar.net> I don't see what trim() is good for but I know I've written ltrim() hundreds of times easy. I propose naming them strip_prefix() and strip_suffix() and just skip the one that does both sides since it makes no sense to me. Trim is generally a bad name because what is called strip() in python is called trim() in other languages. 
This would be needlessly confusing. > On 24 Mar 2019, at 09:42, Alex Grigoryev wrote: > > Following the discussion here I propose to add 3 new string methods: str.trim, str.ltrim, str.rtrim > Another option would be to change API for str.split method to work correctly with sequences. > > In [1]: def ltrim(s, seq): > > ...: return s[len(seq):] if s.startswith(seq) else s > > ...: > > > > In [2]: def rtrim(s, seq): > > ...: return s[:-len(seq)] if s.endswith(seq) else s > > ...: > > > > In [3]: def trim(s, seq): > > ...: return ltrim(rtrim(s, seq), seq) > > ...: > > > > In [4]: s = 'mailto:maria at gmail.com' > > > > In [5]: ltrim(s, 'mailto:') > > Out[5]: 'maria at gmail.com' > > > > In [6]: rtrim(s, 'com') > > Out[6]: 'mailto:maria at gmail.' > > > > In [7]: trim(s, 'm') > > Out[7]: 'ailto:maria at gmail.co' > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From evrial at gmail.com Sun Mar 24 05:46:56 2019 From: evrial at gmail.com (Alex Grigoryev) Date: Sun, 24 Mar 2019 11:46:56 +0200 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: <7F3E2818-17C1-4987-A61A-BDED8CF5ADB6@killingar.net> References: <7F3E2818-17C1-4987-A61A-BDED8CF5ADB6@killingar.net> Message-ID: <5181B0DB-3B10-4202-90D6-1365AEF19654@getmailspring.com> Yeah good idea with names because php ltrim does the same as lstrip in python. Normally I'd expect strip to behave as I proposed, not like input a string as mask of characters, which is more rare use case and confusing in some scenarios. On ??? ? 24 2019, at 11:34 ????, Anders Hovm?ller wrote: > I don't see what trim() is good for but I know I've written ltrim() hundreds of times easy. 
> > I propose naming them strip_prefix() and strip_suffix() and just skip the one that does both sides since it makes no sense to me. > > Trim is generally a bad name because what is called strip() in python is called trim() in other languages. This would be needlessly confusing. > > On 24 Mar 2019, at 09:42, Alex Grigoryev wrote: > > Following the discussion here (https://link.getmailspring.com/link/5181B0DB-3B10-4202-90D6-1365AEF19654 at getmailspring.com/1?redirect=https%3A%2F%2Flink.getmailspring.com%2Flink%2F7D84D131-65B6-4EF7-9C43-51957F9DFAA9%40getmailspring.com%2F0%3Fredirect%3Dhttps%253A%252F%252Fbugs.python.org%252Fissue36410%26recipient%3DcHl0aG9uLWlkZWFzQHB5dGhvbi5vcmc%253D&recipient=cHl0aG9uLWlkZWFzQHB5dGhvbi5vcmc%3D) I propose to add 3 new string methods: str.trim, str.ltrim, str.rtrim > > Another option would be to change API for str.split method to work correctly with sequences. > > > > In [1]: def ltrim(s, seq): > > ...: return s[len(seq):] if s.startswith(seq) else s > > ...: > > > > In [2]: def rtrim(s, seq): > > ...: return s[:-len(seq)] if s.endswith(seq) else s > > ...: > > > > In [3]: def trim(s, seq): > > ...: return ltrim(rtrim(s, seq), seq) > > ...: > > > > In [4]: s = 'mailto:maria at gmail.com (https://link.getmailspring.com/link/5181B0DB-3B10-4202-90D6-1365AEF19654 at getmailspring.com/2?redirect=mailto%3Amaria%40gmail.com&recipient=cHl0aG9uLWlkZWFzQHB5dGhvbi5vcmc%3D)' > > > > In [5]: ltrim(s, 'mailto:') > > Out[5]: 'maria at gmail.com (https://link.getmailspring.com/link/5181B0DB-3B10-4202-90D6-1365AEF19654 at getmailspring.com/3?redirect=mailto%3Amaria%40gmail.com&recipient=cHl0aG9uLWlkZWFzQHB5dGhvbi5vcmc%3D)' > > > > In [6]: rtrim(s, 'com') > > Out[6]: 'mailto:maria at gmail.' 
> > > > In [7]: trim(s, 'm') > > Out[7]: 'ailto:maria at gmail.co' > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From g.rodola at gmail.com Sun Mar 24 06:49:50 2019 From: g.rodola at gmail.com (Giampaolo Rodola') Date: Sun, 24 Mar 2019 11:49:50 +0100 Subject: [Python-ideas] Add subprocess.Popen suspend() and resume() In-Reply-To: References: Message-ID: On Wed, Mar 20, 2019 at 11:19 PM eryk sun wrote: > > On 3/18/19, Giampaolo Rodola' wrote: > > > > I've been having these 2 implemented in psutil for a long time. On > > POSIX these are convenience functions using os.kill() + SIGSTOP / > > SIGCONT (the same as CTRL+Z / "fg"). On Windows they use > > undocumented NtSuspendProcess and NtResumeProcess Windows > > APIs available since XP. > > Currently, Windows Python only calls documented C runtime-library and > Windows API functions. It doesn't directly call NT runtime-library and > system functions.
Maybe it could in the case of documented functions, > but calling undocumented functions in the standard library should be > avoided. Unfortunately, without NtSuspendProcess and NtResumeProcess, > I don't see a way to reliably implement this feature for Windows. I'm > CC'ing Steve Dower. He might say it's okay in this case, or know of > another approach. > > DebugActiveProcess, the other simple approach mentioned in the linked > SO answer [1], is unreliable and has the wrong semantics. A process > only has a single debug port, so DebugActiveProcess will fail the PID > as an invalid parameter if another debugger is already attached to the > process. (The underlying NT call, DbgUiDebugActiveProcess, fails with > STATUS_PORT_ALREADY_SET.) Additionally, the semantics that I expect > here, at least for Windows, is that each call to suspend() will > require a corresponding call to resume(), since it's incrementing the > suspend count on the threads; however, a debugger can't reattach to > the same process. Also, if the Python process exits while it's > attached as a debugger, the system will terminate the debugee as well, > unless we call DebugSetProcessKillOnExit(0), but that interferes with > the Python process acting as a debugger normally, as does this entire > wonky idea. Also, the debugging system creates a thread in the debugee > that calls NT DbgUiRemoteBreakin, which executes a breakpoint. This > thread is waiting, but it's not suspended, so the process will never > actually appear as suspended in Task Manager or Process Explorer. > > That leaves enumerating threads in a snapshot and calling OpenThread > and SuspendThread on each thread that's associated with the process. > In comparison, let's take an abridged look at the guts of > NtSuspendProcess. > > nt!NtSuspendProcess: > ... > mov r8,qword ptr [nt!PsProcessType] > ... > call nt!ObpReferenceObjectByHandleWithTag > ... > call nt!PsSuspendProcess > ... 
> mov ebx,eax > call nt!ObfDereferenceObjectWithTag > mov eax,ebx > ... > ret > > nt!PsSuspendProcess: > ... > call nt!ExAcquireRundownProtection > cmp al,1 > jne nt!PsSuspendProcess+0x74 > ... > call nt!PsGetNextProcessThread > xor ebx,ebx > jmp nt!PsSuspendProcess+0x62 > > nt!PsSuspendProcess+0x4d: > ... > call nt!PsSuspendThread > ... > call nt!PsGetNextProcessThread > > nt!PsSuspendProcess+0x62: > ... > test rax,rax > jne nt!PsSuspendProcess+0x4d > ... > call nt!ExReleaseRundownProtection > jmp nt!PsSuspendProcess+0x79 > > nt!PsSuspendProcess+0x74: > mov ebx,0C000010Ah (STATUS_PROCESS_IS_TERMINATING) > > nt!PsSuspendProcess+0x79: > ... > mov eax,ebx > ... > ret Thanks for chiming in with useful info as usual. I agree with your rationale after all. I've been dealing with undocumented Windows APIs in psutil for a long time and they have always been in a sort of grey area where despite they stayed "stable" since forever, the lack of an official stand from Microsoft probably makes this addition inappropriate for the stdlib. > This code repeatedly calls PsGetNextProcessThread to walk the > non-terminated threads of the process in creation order (based on a > linked list in the process object) and suspends each thread via > PsSuspendThread. In contrast, a Tool-Help thread snapshot is > unreliable since it won't include threads created after the snapshot > is created. The alternative is to use a different undocumented system > call, NtGetNextThread [2], which is implemented via > PsGetNextProcessThread. But that's slightly worse than calling > NtSuspendProcess. > > [1]: https://stackoverflow.com/a/11010508 > [2]: https://github.com/processhacker/processhacker/blob/v2.39/phnt/include/ntpsapi.h#L848 FWIW older psutil versions relied on Thread32Next / OpenThread / SuspendThread / ResumeThread, which appear similar to these Ps* counterparts (and I assume have the same drawbacks). 
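The POSIX side of what Giampaolo describes above (os.kill() with SIGSTOP/SIGCONT, as psutil does) fits in a very small Popen subclass. This is only a sketch: the class name is invented, and the /proc status check at the end is Linux-specific. The hard part, per eryk sun's analysis, is entirely on the Windows side.

```python
import os
import signal
import subprocess
import time

class SuspendablePopen(subprocess.Popen):
    """Popen plus POSIX-only suspend()/resume(), per the thread above."""

    def suspend(self):
        # SIGSTOP cannot be caught or ignored; the child stops until SIGCONT.
        os.kill(self.pid, signal.SIGSTOP)

    def resume(self):
        os.kill(self.pid, signal.SIGCONT)

p = SuspendablePopen(['sleep', '60'])
p.suspend()
# Poll /proc/<pid>/stat (Linux-specific): the state field reads 'T' when stopped.
state = '?'
for _ in range(100):
    with open('/proc/%d/stat' % p.pid) as f:
        state = f.read().rsplit(')', 1)[1].split()[0]
    if state == 'T':
        break
    time.sleep(0.01)
print(state)
p.resume()
p.terminate()
p.wait()
```

Because SIGSTOP is handled in the kernel, this cannot be defeated by the child, but it also offers no Windows story -- which is exactly the gap discussed in this thread.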
-- Giampaolo - http://grodola.blogspot.com From apalala at gmail.com Sun Mar 24 07:28:52 2019 From: apalala at gmail.com (Juancarlo Añez) Date: Sun, 24 Mar 2019 07:28:52 -0400 Subject: [Python-ideas] Why not ['a','b','c'].join(',') ? In-Reply-To: References: Message-ID: On Sun, Mar 24, 2019 at 5:11 AM Jonathan Fine wrote: > I'm willing to provide some useful information, if you're willing to write > it up into a good blog post. > ... or a PEP for rejection. Deal! -- Juancarlo *Añez* -------------- next part -------------- An HTML attachment was scrubbed... URL: From g.rodola at gmail.com Sun Mar 24 08:15:48 2019 From: g.rodola at gmail.com (Giampaolo Rodola') Date: Sun, 24 Mar 2019 13:15:48 +0100 Subject: [Python-ideas] Improve os.times() resolution Message-ID: It turns out we could use resource.getrusage() which provides microseconds (tested on Linux and macOS): import os, resource for x in range(10000000): # warm up pass for x in range(5): a = os.times() b = resource.getrusage(resource.RUSAGE_SELF) print(a.user, a.system) print(b.ru_utime, b.ru_stime) ...it prints: 0.39 0.01 0.394841 0.011963999999999999 0.39 0.01 0.394899 0.011966 0.39 0.01 0.394908 0.011966 0.39 0.01 0.394936 0.011967 0.39 0.01 0.394963 0.011968 getrusage(RUSAGE_CHILDREN) can be used to calculate "children_user" and "children_system". I see 2 possibilities here: 1) doc fix, mentioning that resource.getrusage provides a better resolution 2) if available (it should always be as it's a POSIX standard), just use getrusage in Modules/posixmodule.c. It seems we can check availability by reusing HAVE_SYS_RESOURCE_H and HAVE_SYS_TIME_H definitions which are already in place. I'm not sure what's best to do as os.* functions usually expose the original C function with the same name, but given that "elapsed" field is not part of times(2) struct and that on Windows "elapsed", "children_user" and "children_system" are set to 0 it appears there may be some space for flexibility here.
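As a sketch of what option 2 would give callers, here is a pure-Python, getrusage()-based stand-in for os.times(). The function and tuple names are invented for illustration; the field names follow the ones discussed above.

```python
import os
import resource
from collections import namedtuple

# Hypothetical stand-in for a higher-resolution os.times().
HiResTimes = namedtuple('HiResTimes',
                        'user system children_user children_system elapsed')

def hires_times():
    self_ru = resource.getrusage(resource.RUSAGE_SELF)
    child_ru = resource.getrusage(resource.RUSAGE_CHILDREN)
    return HiResTimes(
        user=self_ru.ru_utime,
        system=self_ru.ru_stime,
        children_user=child_ru.ru_utime,
        children_system=child_ru.ru_stime,
        # getrusage() has no wall-clock field, so reuse os.times() for it.
        elapsed=os.times().elapsed,
    )

t = hires_times()
print(t.user, t.system)
```

The missing "elapsed" field is exactly the wrinkle noted above: it has no counterpart in the rusage struct, so any getrusage()-backed implementation still needs a second source for it.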
Thoughts? -- Giampaolo - http://grodola.blogspot.com From jfine2358 at gmail.com Sun Mar 24 09:27:42 2019 From: jfine2358 at gmail.com (Jonathan Fine) Date: Sun, 24 Mar 2019 13:27:42 +0000 Subject: [Python-ideas] Why not ['a','b','c'].join(',') ? In-Reply-To: References: Message-ID: Disclaimer: I've not recently read any discussions of this topic. And everything I say is my own opinion. SUMMARY The syntax of Python's string join operator is a consequence of Python's core design decisions and coding principles, together with the semantics of join. I'll explain this in a series of posts. Why str.join? Wouldn't list.join be better? =============================== Python variables don't have types. Python objects do have types. When writing Python, don't test for the type of an object, unless you have to. Instead, test to see if the object has the methods you need. Eg help(dict.update) update(...) D.update([E, ]**F) -> None. Update D from dict/iterable E and F. If E is present and has a .keys() method, then does: for k in E: D[k] = E[k] If E is present and lacks a .keys() method, then does: for k, v in E: D[k] = v In either case, this is followed by: for k in F: D[k] = F[k] Now for the semantics of string join. By Python principles From a str and an iterable The string join operator Produces a str Your question amounts to this. Should the signature be (str, iterable) -> str # Informal type signature of join. or should it be (iterable, str) -> str At first glance, there's no way of choosing. But in Python we prefer aaa.join(bbb) to from somewhere import join join(aaa, bbb) So the question now becomes: Choose between str.join(iterable) iterable.join(str) There is some sense in having list.join. But in Python, we can't join the elements of a list without getting an iterable from the list (unless there's something like very special short-cut-semantics built into Python).
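The (str, iterable) -> str signature above can be seen concretely: the single join method on str consumes any iterable of strings, so no container type needs a join of its own. A short illustration:

```python
words = ['a', 'b', 'c']
print(','.join(words))                     # 'a,b,c' from a list
print(','.join(w.upper() for w in words))  # 'A,B,C' from a generator
print(','.join({'x': 1, 'y': 2}))          # 'x,y' -- joins the dict's keys
# A string is itself an iterable of its characters, so str.join accepts it too:
print('-'.join('abc'))                     # 'a-b-c'
```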
So in Python the choice is between str.join(iterable) iterable.join(str) Now str.join looks more attractive. But I said, without thinking ahead, that the syntax of Python's string join operator is a consequence of Python's core design decisions and coding principles, together with the semantics of join. I'm not quite there yet, there's a gap to fill. I'll pause for a day or two now, just in case someone else wants to JOIN in the discussion, and perhaps fill the gap. -- Jonathan -------------- next part -------------- An HTML attachment was scrubbed... URL: From boxed at killingar.net Sun Mar 24 09:29:42 2019 From: boxed at killingar.net (Anders Hovmöller) Date: Sun, 24 Mar 2019 14:29:42 +0100 Subject: [Python-ideas] Improve os.times() resolution In-Reply-To: References: Message-ID: <678BB8D2-B0BF-4005-B352-F4110C329077@killingar.net> Have you checked how much overhead the two functions have? That seems like an obvious way this proposal could go south. > On 24 Mar 2019, at 13:15, Giampaolo Rodola' wrote: > > It turns out we could use resource.getrusage() which provides micro > seconds (tested on Linux and macOS): > [...]
> > I'm not sure what's best to do as os.* functions usually expose the > original C function with the same name, but given that "elapsed" field > is not part of times(2) struct and that on Windows "elapsed", > "children_user" and "children_system" are set to 0 it appears there > may be some space for flexibility here. > > Thoughts? > > -- > Giampaolo - http://grodola.blogspot.com > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From eryksun at gmail.com Sun Mar 24 11:39:16 2019 From: eryksun at gmail.com (eryk sun) Date: Sun, 24 Mar 2019 10:39:16 -0500 Subject: [Python-ideas] Add subprocess.Popen suspend() and resume() In-Reply-To: References: Message-ID: On 3/24/19, Giampaolo Rodola' wrote: > On Wed, Mar 20, 2019 at 11:19 PM eryk sun wrote: > >> This code repeatedly calls PsGetNextProcessThread to walk the >> non-terminated threads of the process in creation order (based on a >> linked list in the process object) and suspends each thread via >> PsSuspendThread. In contrast, a Tool-Help thread snapshot is >> unreliable since it won't include threads created after the snapshot >> is created. The alternative is to use a different undocumented system >> call, NtGetNextThread [2], which is implemented via >> PsGetNextProcessThread. But that's slightly worse than calling >> NtSuspendProcess. >> >> [1]: https://stackoverflow.com/a/11010508 >> [2]: https://github.com/processhacker/processhacker/blob/v2.39/ >> phnt/include/ntpsapi.h#L848 > > FWIW older psutil versions relied on Thread32Next / OpenThread / > SuspendThread / ResumeThread, which appear similar to these Ps* > counterparts (and I assume have the same drawbacks). This is the toolhelp snapshot I was talking about, which is an unreliable way to pause a process since it doesn't include threads created after the snapshot. 
For TH32CS_SNAPTHREAD, it's based on calling NtQuerySystemInformation: SystemProcessInformation to take a snapshot of all running processes and threads at the time. This buffer gets written to a shared section, and the section handle is returned as the snapshot handle. Thread32First and Thread32Next are called to walk the buffer a record at a time by temporarily mapping the section with NtMapViewOfSection and NtUnmapViewOfSection. In contrast, NtSuspendProcess is based on PsGetNextProcessThread, which walks a linked list of the non-terminated threads in the process. Unlike a snapshot, this won't miss threads created after we start, since new threads are appended to the list. To implement this in user mode with SuspendThread would require the NtGetNextThread system call that's implemented via PsGetNextProcessThread. But that's just trading one undocumented system call for another at the expense of a more complicated implementation. From rosuav at gmail.com Sun Mar 24 12:01:45 2019 From: rosuav at gmail.com (Chris Angelico) Date: Mon, 25 Mar 2019 03:01:45 +1100 Subject: [Python-ideas] Why not ['a','b','c'].join(',') ? In-Reply-To: References: Message-ID: On Mon, Mar 25, 2019 at 12:28 AM Jonathan Fine wrote: > > Disclaimer: I've not recently read any discussions of this topic. And everything I say is my own opinion. > > SUMMARY > The syntax of Python's string join operator are a consequence of Python's core design decisions and coding principles, together with the semantics of join. I'll explain this in a series of posts. > > Why str.join? Wouldn't list.join be better? > =============================== > Python variables don't have types. Python objects do have types. When writing Python, don't test for the type of an object, unless you have to. Instead, test to see if the object has the methods you need. > > Eg help(dict.update) > update(...) > D.update([E, ]**F) -> None. Update D from dict/iterable E and F. 
> [...] > I'm not quite there yet, there's a gap to fill. I'll pause for a day or two now, just in case someone else wants to JOIN in the discussion, and perhaps fill the gap. It's way WAY simpler than all this. "Iterable" isn't a type, it's a protocol; in fact, "iterable" just means "has an __iter__ method". Adding a method to iterables means adding that method to every single object that wants to be iterable, but adding a method to strings just means adding it to the str type. And, this is a topic for python-list, not python-ideas, unless someone is proposing a change. ChrisA From greg at krypto.org (Gregory P.
Smith) Date: Sun, 24 Mar 2019 09:24:38 -0700 Subject: [Python-ideas] Improve os.times() resolution In-Reply-To: References: Message-ID: On Sun, Mar 24, 2019 at 5:16 AM Giampaolo Rodola' wrote: > It turns out we could use resource.getrusage() which provides micro > seconds (tested on Linux and macOS): > [...] > Thoughts? I'd just document that resource.getrusage() provides better data. That is what man pages for times(3) have done as well. It is good to keep the os module close to the underlying library/system calls and leave it to higher level code to abstract and choose as deemed appropriate. -gps -------------- next part -------------- An HTML attachment was scrubbed...
URL: From jfine2358 at gmail.com Sun Mar 24 13:16:22 2019 From: jfine2358 at gmail.com (Jonathan Fine) Date: Sun, 24 Mar 2019 17:16:22 +0000 Subject: [Python-ideas] META: Is a PEP a good place to record Python's core design decisions and coding principles? In-Reply-To: References: Message-ID: SUMMARY I think we're about to have a discussion of what's appropriate to have on this list, so I've started a new thread. I formulate the question as: Is this list an appropriate place for the discovery, discussion and application of Python's core design decisions and coding principles? Or in other words, is a PEP an appropriate place to record the outcome of such activities? If you want to discuss further, this thread I suggest is the place to do it. But I'd rather you suspended such contributions, until I have elsewhere by example shown what I mean by the discovery etc of Python's core principles. BACKGROUND In "Why not ['a','b','c'].join(',') ?" Chris Angelico wrote: > this is a topic for python-list, not python-ideas, unless someone > is proposing a change. In response, I made a bold statement: namely, that the Python syntax for string join is a consequence of Python's core design decisions and coding principles, together with the semantics of join. When I made this statement, I was confident it was true. ABOUT PYTHON PRINCIPLES After further reflection, I realised that it applies more widely. I've also discovered, applying the principle, a gap in the Python string module. Right now and here is not a good time to talk about it, but YES, someone will be proposing a change. I've said recently on this list, at least once, that I'm a pure mathematician. And that I'm trained to find a small, simple and elegant system of rules which determine the behaviour of a large number of examples. AN EXAMPLE - Roman and Arabic numbers The Hindu-Arabic numerals 1, 2, 3, 4, 5, ..., 9, 10, 11, ... were developed in the 1st to 4th centuries by Indian mathematicians.
Addition and multiplication of numbers, using this numeral system is much simpler than using the earlier Roman numeral system I, II, III, IV, V, ..., IX, X, XI, ... . This is part of the story of how the discovery and introduction of new concepts made addition and multiplication much easier. By the way, from about 500 to 630 a symbol that we now would call zero was introduced, and understood. And today zero is mathematics for children, if not exactly child's play. While writing this, I consulted https://en.wikipedia.org/wiki/Hindu%E2%80%93Arabic_numeral_system#History FACTS, AXIOMS and THEOREM Chris Angelico wrote: > It's way WAY simpler than all this. "Iterable" isn't a type, it's a > protocol; in fact, "iterable" just means "has an __iter__ method". I think that for Chris this is a FACT about Python. This is the way Python is. My mathematical approach is to find simple AXIOMS of which this FACT is a logical consequence, or in other words a THEOREM. (Also we want the axioms not to have wrong statements as a logical consequence.) Here's an example to show how FACTS, AXIOMS and THEOREMS fit together. For most of us, at grade school statements such as 2 + 2 = 4 and 7 * 8 = 56 are FACTS when summoned from memory. And 1 + 2 + 3 + 4 + 5 = 15 is a THEOREM that arises from knowing how to add numbers (which for most students is a collection of FACTS). Now consider X = 1 + 2 + 3 + 4 + 5 + .... + 99 + 100 That X == 5050 is a THEOREM based on the FACT of addition, together with a laborious calculation. Once, a grade school teacher gave the calculation of X as work for his students to do. To the teacher's surprise, one of the students very soon came up to his desk with the correct answer 5050. This grade school student had discovered for himself the THEOREM, based on the fundamental properties of counting numbers, that 1 + 2 + 3 + ... + (N-1) + N == N (N + 1) / 2 This student went on to be the greatest pure mathematician and theoretical physicist of his day.
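The closed form in the anecdote is easy to check mechanically:

```python
def gauss_sum(n):
    # 1 + 2 + ... + n == n * (n + 1) / 2, here with exact integer division
    return n * (n + 1) // 2

print(gauss_sum(100))                        # 5050
assert gauss_sum(100) == sum(range(1, 101))  # the laborious calculation agrees
```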
https://en.wikipedia.org/wiki/Carl_Friedrich_Gauss#Early_years ASIDE Mathematics is a special skill. All of us have our own special skills and experiences, and most of us are not mathematicians. Our community code of conduct encourages collaboration, so that all of our experiences and skill sets can contribute "to the whole of our efforts". With good will, we can overcome friction and misunderstanding between the X and non-X communities, for the benefit of all, for every special skill X. Or in other words, many hands make light work, and many eyes find many more bugs. SUMMARY I have argued that Python's core design decisions and coding principles can, at least in part, be reduced to a system of AXIOMS that can be usefully applied. I have argued mainly based on analogy with the Hindu-Arabic numeral system, and the life and work of Gauss. -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Sun Mar 24 13:19:51 2019 From: guido at python.org (Guido van Rossum) Date: Sun, 24 Mar 2019 10:19:51 -0700 Subject: [Python-ideas] META: Is a PEP a good place to record Python's core design decisions and coding principles? In-Reply-To: References: Message-ID: I think this belongs in a personal blog, not on python-ideas and definitely not in a PEP. On Sun, Mar 24, 2019 at 10:18 AM Jonathan Fine wrote: > SUMMARY > I think we're about to have a discussion of what's appropriate to have on > this list, so I've started a new thread. > > I formulate the question as: Is this list an appropriate place for the > discovery, discussion and application of Python's core design decisions and > coding principles? Or in other words, is a PEP an appropriate place to > record the outcome of such activities? > > If you want to discuss further, this thread I suggest is the place to do > it. But I'd rather you suspended such contributions, until I have elsewhere > by example shown what I mean by the discovery etc of Python's core > principles.
> [...]
> > SUMMARY > > I have argued that Python's core design decisions and coding principles > can, at least in part, be reduced to a system of AXIOMS that can be useful > applied. I have argued mainly based on analogy with the Hindu-Arabic > numeral system, and the life and work of Gauss. > > > > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Sun Mar 24 13:23:36 2019 From: rosuav at gmail.com (Chris Angelico) Date: Mon, 25 Mar 2019 04:23:36 +1100 Subject: [Python-ideas] META: Is a PEP a good place to record Python's core design decisions and coding principles? In-Reply-To: References: Message-ID: On Mon, Mar 25, 2019 at 4:16 AM Jonathan Fine wrote: > FACTS, AXIOMS and THEOREM > > Chris Angelico wrote: >> >> It's way WAY simpler than all this. "Iterable" isn't a type, it's a >> protocol; in fact, "iterable" just means "has an __iter__ method". > > > I think that for Chris this is a FACT about Python. This is the way Python is. > "For me"? No. It is, pure and simply, a fact about Python. That is how the language is defined. It's not "for me" a fact, as if facts might not be facts for other people. That isn't how facts work, and it isn't how Python works. Here's some documentation on the iterator protocol, and if you want to discuss further, python-list hasn't had a long and rambling thread on "why are iterators the way they are" for a while... have fun. https://docs.python.org/3/library/stdtypes.html#typeiter ChrisA From pythonchb at gmail.com Sun Mar 24 13:23:39 2019 From: pythonchb at gmail.com (Christopher Barker) Date: Sun, 24 Mar 2019 10:23:39 -0700 Subject: [Python-ideas] Why not ['a','b','c'].join(',') ? 
In-Reply-To: References: Message-ID: I think it was a couple years ago that someone on this list suggested a 'commonly suggested and rejected ideas' PEP. I don't know that it should be a PEP, but it would be a good idea to have such a document in an 'official' place. We could start with this one. Interestingly (to me), Chris's explanation is a good one, but I don't think historically accurate. IIRC, the join method was added to strings pretty early on, when most of the other methods were added. Back in the day (1.5 anyway), strings were pretty simple objects, and you did most string manipulation with functions in the string module, e.g. import string a_string = string.upper(a_string) The string module had a join() function -- lists did not have a join method. Strings themselves had (I think) no methods -- the only "functionality" they had was what other sequences had (slicing, "in", etc), and the % formatting operator. Which makes another point -- strings ARE sequences, so if sequences had a join() method, then strings should have a join() method, too, but it would join the items in the sequence -- i.e. characters -- not really that useful ;-) When the string object was introduced (expanded) to have a full set of methods, it grew most of the functions in the string module, so naturally, join() was one of them. ( https://docs.python.org/3/whatsnew/2.0.html#string-methods). NOTE that a primary motivator for making string operations methods was that then unicode and "Old style" strings could have the same interface. As it turns out, str.join() works with any iterable, which is really nice, but back in the day, Python was more about sequences than iterables [1], and it still made sense that join really belonged with str, as it is inherently a string operation -- sure, any python object can be stringified, but not all objects actually produce a useful string representation, so a "stringify_and_join" method on every sequence is pretty out of place.
And, as Chris points out, impossible for every iterable. [1] String methods were introduced in 2.0, and the iteration protocol was introduced in 2.1: https://www.python.org/dev/peps/pep-0234/ -CHB On Sun, Mar 24, 2019 at 9:02 AM Chris Angelico wrote: > On Mon, Mar 25, 2019 at 12:28 AM Jonathan Fine > wrote: > > > > Disclaimer: I've not recently read any discussions of this topic. And > everything I say is my own opinion. > > > > SUMMARY > > The syntax of Python's string join operator is a consequence of > Python's core design decisions and coding principles, together with the > semantics of join. I'll explain this in a series of posts. > > > > Why str.join? Wouldn't list.join be better? > > =============================== > > Python variables don't have types. Python objects do have types. When > writing Python, don't test for the type of an object, unless you have to. > Instead, test to see if the object has the methods you need. > > > > Eg help(dict.update) > > update(...) > > D.update([E, ]**F) -> None. Update D from dict/iterable E and F. > > If E is present and has a .keys() method, then does: for k in E: > D[k] = E[k] > > If E is present and lacks a .keys() method, then does: for k, v in > E: D[k] = v > > In either case, this is followed by: for k in F: D[k] = F[k] > > > > Now for the semantics of string join. By Python principles > > From a str and an iterable > > The string join operator > > Produces a str > > > > Your question amounts to this. Should the signature be > > (str, iterable) -> str # Informal type signature of join. > > or should it be > > (iterable, str) -> str > > > > At first glance, there's no way of choosing. But in Python we prefer > > aaa.join(bbb) > > to > > from somewhere import join > > join(aaa, bbb) > > > > So the question now becomes: Choose between > > str.join(iterable) > > iterable.join(str) > > > > There is some sense in having list.join.
But in Python, we can't join > the elements of a list without getting an iterable from the list (unless > there's something like very special short-cut-semantics built into Python). > > > > So in Python the choice is between > > str.join(iterable) > > iterable.join(str) > > > > Now str.join looks more attractive. But I said, without thinking ahead, > that the syntax of Python's string join operator is a consequence of > Python's core design decisions and coding principles, together with the > semantics of join. > > > > I'm not quite there yet, there's a gap to fill. I'll pause for a day or > two now, just in case someone else wants to JOIN in the discussion, and > perhaps fill the gap. > > > > It's way WAY simpler than all this. "Iterable" isn't a type, it's a > protocol; in fact, "iterable" just means "has an __iter__ method". > Adding a method to iterables means adding that method to every single > object that wants to be iterable, but adding a method to strings just > means adding it to the str type. > > And, this is a topic for python-list, not python-ideas, unless someone > is proposing a change. > > ChrisA > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jfine2358 at gmail.com Sun Mar 24 13:32:34 2019 From: jfine2358 at gmail.com (Jonathan Fine) Date: Sun, 24 Mar 2019 17:32:34 +0000 Subject: [Python-ideas] META: Is a PEP a good place to record Python's core design decisions and coding principles? In-Reply-To: References: Message-ID: Guido van Rossum wrote: > I think this belongs in a personal blog, not on python-ideas and > definitely not in a PEP. > I don't agree, but I will accept that judgement, as if Guido still had BDFL status. 
-- Jonathan -------------- next part -------------- An HTML attachment was scrubbed... URL: From pythonchb at gmail.com Sun Mar 24 13:34:23 2019 From: pythonchb at gmail.com (Christopher Barker) Date: Sun, 24 Mar 2019 10:34:23 -0700 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: <5181B0DB-3B10-4202-90D6-1365AEF19654@getmailspring.com> References: <7F3E2818-17C1-4987-A61A-BDED8CF5ADB6@killingar.net> <5181B0DB-3B10-4202-90D6-1365AEF19654@getmailspring.com> Message-ID: On Sun, Mar 24, 2019 at 2:47 AM Alex Grigoryev wrote: > Yeah good idea with names because php ltrim does the same as lstrip in > python. > Normally I'd expect strip to behave as I proposed, not to take the input > as a mask of characters, which is a rarer use case and confusing in some > scenarios. > I agree -- I actually wrote buggy code in a PyPi published package that incorrectly used strip(*) in this way: i.e. I expected. My bad for not reading the docs carefully and writing crappy tests (yes, there were tests -- shows you how meaningless 100% coverage is) So +1 for some version of "remove exactly this substring from the left or right of a string" I agree that the "either end" option is unlikely to be useful, and in the rare case you want it, you can call both. I'll let others bikeshed on the name. And this really is simple enough that I don't want to reach for regex's for it. That is, I'd write it by hand rather than mess with that. -CHB -- Christopher Barker, PhD Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython -------------- next part -------------- An HTML attachment was scrubbed... URL: From arj.python at gmail.com Sun Mar 24 13:39:31 2019 From: arj.python at gmail.com (Abdur-Rahmaan Janhangeer) Date: Sun, 24 Mar 2019 21:39:31 +0400 Subject: [Python-ideas] Why not ['a','b','c'].join(',') ?
In-Reply-To: References: Message-ID: I'm up for writing it, in fact i'm planning on a series of posts/mini books for the threads, too many, i repeat again too many gems are hidden away in the arc-hive. Abdur-Rahmaan Janhangeer http://www.pythonmembers.club | https://github.com/Abdur-rahmaanJ Mauritius On Sun, 24 Mar 2019, 13:12 Jonathan Fine, wrote: > I'm willing to provide some useful information, if you're willing to write > it up into a good blog post. > -- > Jonathan > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pythonchb at gmail.com Sun Mar 24 13:44:51 2019 From: pythonchb at gmail.com (Christopher Barker) Date: Sun, 24 Mar 2019 10:44:51 -0700 Subject: [Python-ideas] META: Is a PEP a good place to record Python's core design decisions and coding principles? In-Reply-To: References: Message-ID: Jonathan, This is the glory of open source projects -- if you have a great idea, you can simply do it: - Start a document that describes Python's Core design principles - Put it up somewhere (gitHub would be good) where others can contribute to it - If it becomes a wonderful thing, then propose that it be published somewhere "official" -- as a meta-PEP or whatever. (you can do something similar with a package that you may think should be in the stdlib, or a proposal that you might think should be a PEP, or...) In short, you don't need approval for an idea ahead of time -- if you demonstrate its worth, then it may be included later. If not, then maybe it wasn't a great idea, or it is a great idea that can live on outside the official project. -CHB On Sun, Mar 24, 2019 at 10:33 AM Jonathan Fine wrote: > Guido van Rossum wrote: > >> I think this belongs in a personal blog, not on python-ideas and >> definitely not in a PEP. >> > > I don't agree, but I will accept that judgement, as if Guido still had > BDFL status. 
> -- > Jonathan > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- Christopher Barker, PhD Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython -------------- next part -------------- An HTML attachment was scrubbed... URL: From pythonchb at gmail.com Sun Mar 24 13:49:57 2019 From: pythonchb at gmail.com (Christopher Barker) Date: Sun, 24 Mar 2019 10:49:57 -0700 Subject: [Python-ideas] Why not ['a','b','c'].join(',') ? In-Reply-To: References: Message-ID: On Sun, Mar 24, 2019 at 10:40 AM Abdur-Rahmaan Janhangeer < arj.python at gmail.com> wrote: > I'm up for writing it, > I encourage you to look in the archives of this list for the previous discussion -- there are some good starting points there. Also -- rather than a series of posts, a community-written document on gitHub or something would be good. I suspect you'll get contributions. -CHB -- Christopher Barker, PhD Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython -------------- next part -------------- An HTML attachment was scrubbed... URL: From arj.python at gmail.com Sun Mar 24 13:52:36 2019 From: arj.python at gmail.com (Abdur-Rahmaan Janhangeer) Date: Sun, 24 Mar 2019 21:52:36 +0400 Subject: [Python-ideas] Why not ['a','b','c'].join(',') ? In-Reply-To: References: Message-ID: in markdown i guess, @vstinner and @matrixise initiative to rewrite a tuto for cpython beginners is really awesome according to me, yes, that's the right idea! 
Abdur-Rahmaan Janhangeer http://www.pythonmembers.club | https://github.com/Abdur-rahmaanJ Mauritius On Sun, 24 Mar 2019, 21:50 Christopher Barker, wrote: > > > On Sun, Mar 24, 2019 at 10:40 AM Abdur-Rahmaan Janhangeer < > arj.python at gmail.com> wrote: > >> I'm up for writing it, >> > > I encourage you to look in the archives of this list for the previous > discussion -- there are some good starting points there. > > Also -- rather than a series of posts, a community-written document on > gitHub or something would be good. I suspect you'll get contributions. > > -CHB > -------------- next part -------------- An HTML attachment was scrubbed... URL: From python at mrabarnett.plus.com Sun Mar 24 14:39:11 2019 From: python at mrabarnett.plus.com (MRAB) Date: Sun, 24 Mar 2019 18:39:11 +0000 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: <7D84D131-65B6-4EF7-9C43-51957F9DFAA9@getmailspring.com> References: <7D84D131-65B6-4EF7-9C43-51957F9DFAA9@getmailspring.com> Message-ID: <3646bf05-fab4-9edb-2c9a-1e84dd86c060@mrabarnett.plus.com> On 2019-03-24 08:42, Alex Grigoryev wrote: > Following the discussion here > > I propose to add 3 new string methods: str.trim, str.ltrim, str.rtrim > Another option would be to change API for str.split method to work > correctly with sequences. > > In [1]: def ltrim(s, seq): > > ...:     return s[len(seq):] if s.startswith(seq) else s > > ...: > > This has a subtle bug: > > In [2]: def rtrim(s, seq): > > ...:     return s[:-len(seq)] if s.endswith(seq) else s > > ...: > If len(seq) == 0, then rtrim will return ''. It needs to be: def rtrim(s, seq): return s[ : len(s) - len(seq)] if s.endswith(seq) else s From arj.python at gmail.com Sun Mar 24 16:55:35 2019 From: arj.python at gmail.com (Abdur-Rahmaan Janhangeer) Date: Mon, 25 Mar 2019 00:55:35 +0400 Subject: [Python-ideas] Why not ['a','b','c'].join(',') ?
In-Reply-To: References: Message-ID: skeleton here: https://github.com/Abdur-rahmaanJ/py-mailing-list-summary On Sun, Mar 24, 2019 at 9:50 PM Christopher Barker wrote: > > I encourage you to look in the archives of this list for the previous > discussion -- there are some good starting points there. > > Also -- rather than a series of posts, a community-written document on > gitHub or something would be good. I suspect you'll get contributions. > > -CHB > -- Abdur-Rahmaan Janhangeer http://www.pythonmembers.club | https://github.com/Abdur-rahmaanJ Mauritius -------------- next part -------------- An HTML attachment was scrubbed... URL: From g.rodola at gmail.com Sun Mar 24 17:19:53 2019 From: g.rodola at gmail.com (Giampaolo Rodola') Date: Sun, 24 Mar 2019 22:19:53 +0100 Subject: [Python-ideas] Improve os.times() resolution In-Reply-To: <678BB8D2-B0BF-4005-B352-F4110C329077@killingar.net> References: <678BB8D2-B0BF-4005-B352-F4110C329077@killingar.net> Message-ID: On Sun, Mar 24, 2019 at 2:29 PM Anders Hovmöller wrote: > > Have you checked how much overhead the two functions have? That seems like an obvious way this proposal could go south.
Without patch: $ ./python -m timeit -s "import os" "os.times()" 500000 loops, best of 5: 546 nsec per loop With patch: $ ./python -m timeit -s "import os" "os.times()" 200000 loops, best of 5: 1.23 usec per loop The patch: diff --git a/Modules/posixmodule.c b/Modules/posixmodule.c index 3f76018357..ad91ed702a 100644 --- a/Modules/posixmodule.c +++ b/Modules/posixmodule.c @@ -8035,6 +8035,14 @@ os_times_impl(PyObject *module) #else /* MS_WINDOWS */ { +#if defined(HAVE_SYS_RESOURCE_H) + struct rusage ruself; + struct rusage ruchildren; + if (getrusage(RUSAGE_SELF, &ruself) == -1) + return posix_error(); + if (getrusage(RUSAGE_CHILDREN, &ruchildren) == -1) + return posix_error(); +#endif struct tms t; clock_t c; @@ -8043,10 +8051,18 @@ os_times_impl(PyObject *module) if (c == (clock_t) -1) return posix_error(); return build_times_result( + +#if defined(HAVE_SYS_RESOURCE_H) + doubletime(ruself.ru_utime), + doubletime(ruself.ru_stime), + doubletime(ruchildren.ru_utime), + doubletime(ruchildren.ru_stime), +#else (double)t.tms_utime / ticks_per_second, (double)t.tms_stime / ticks_per_second, (double)t.tms_cutime / ticks_per_second, (double)t.tms_cstime / ticks_per_second, +#endif (double)c / ticks_per_second); } #endif /* MS_WINDOWS */ -- Giampaolo - http://grodola.blogspot.com From cs at cskk.id.au Sun Mar 24 19:45:49 2019 From: cs at cskk.id.au (Cameron Simpson) Date: Mon, 25 Mar 2019 10:45:49 +1100 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: <3646bf05-fab4-9edb-2c9a-1e84dd86c060@mrabarnett.plus.com> References: <3646bf05-fab4-9edb-2c9a-1e84dd86c060@mrabarnett.plus.com> Message-ID: <20190324234549.GA79811@cskk.homeip.net> On 24Mar2019 18:39, MRAB wrote: >On 2019-03-24 08:42, Alex Grigoryev wrote: >>Following the discussion here >This has a subtle bug: >> >>In [2]: def rtrim(s, seq): >> ...:     return s[:-len(seq)] if s.endswith(seq) else s >> ...: >> >If len(seq) == 0, then rtrim will return ''.
> >It needs to be: > >def rtrim(s, seq): > return s[ : len(s) - len(seq)] if s.endswith(seq) else s Or: return s[:-len(seq)] if seq and s.endswith(seq) else s which I think is more readable. For the record, like others, I suspect I've written ltrim/rtrim code many times. I'm +0.9 on the idea: it feels like a very common operation and as shown above rtrim at least is fairly easily miscoded. (I think most of my own situations were with strings I know are not empty, often literals, but that doesn't really detract.) Like others I'm against the name 'trim' itself because of PHP's homonym which means what "strip" means in Python (and therefore doesn't mean what "trim" is proposed to mean here). "clip"? I'm +0.9 rather than +1 entirely because the operation feels so... trivial, which usually trips the "not everything needs a method" argument. But it is also very common. Cheers, Cameron Simpson From ja.py at farowl.co.uk Sun Mar 24 18:28:16 2019 From: ja.py at farowl.co.uk (Jeff Allen) Date: Sun, 24 Mar 2019 22:28:16 +0000 Subject: [Python-ideas] META: Is a PEP a good place to record Python's core design decisions and coding principles? In-Reply-To: References: Message-ID: On 24/03/2019 17:44, Christopher Barker wrote: > Jonathan, > > This is the glory of open source projects -- if you have a great idea, > you can simply do it: > > - Start a document that describes Python's Core design principles > - Put it up somewhere (gitHub would be good) where others can > contribute to it > - If it becomes a wonderful thing, then propose that it be published > somewhere "official" -- as a meta-PEP or whatever. > And of course people have. I'd like to thank Victor Stinner and Eli Bendersky for their articles about aspects of Python and its implementation. I've found both useful to go back to.
Others may like to bookmark these: https://pythondev.readthedocs.io/ https://eli.thegreenplace.net/tag/python I looked briefly for an article on .join(), but amusingly all I found was further evidence that the question that started this thread is a recurring one (https://eli.thegreenplace.net/2008/06/06/python-impressions). Jeff Allen -------------- next part -------------- An HTML attachment was scrubbed... URL: From 2QdxY4RzWzUUiLuE at potatochowder.com Sun Mar 24 20:20:17 2019 From: 2QdxY4RzWzUUiLuE at potatochowder.com (Dan Sommers) Date: Sun, 24 Mar 2019 19:20:17 -0500 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: <20190324234549.GA79811@cskk.homeip.net> References: <3646bf05-fab4-9edb-2c9a-1e84dd86c060@mrabarnett.plus.com> <20190324234549.GA79811@cskk.homeip.net> Message-ID: On 3/24/19 6:45 PM, Cameron Simpson wrote: > Like others I'm against the name 'trim" itself because of PHP's homonym > which means what "strip" means in Python (and therefore doesn't mean > what "trim" is proposed to mean here). "clip"? > > I'm +0.9 rather than +1 entirely because the operation feels so... > trivial, which usually trips the "not everything needs a method" > argument. But it is also very common. strip, trim, chop, chomp, clip, left, right, and various permutations with leading "l"s and "r"s. Is the "other" argument a character, a string, or a list of characters, or a list of strings, or a regex? Argh. Maybe I use too many languages and I don't do enough string processing, but I always have to look this stuff up every time I use it, or I just write my own. No, I don't have a solution, but matching or mismatching any particular language only makes sense if you happen to be familiar with that language's string functions. And then someone will fall into a trap because "their" language handles newlines and returns, or spaces and tabs, or some other detail completely differently. 
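The character-set behaviour is exactly the trap that keeps coming up in this thread, and it is easy to demonstrate. A minimal sketch, where strip_prefix is a hypothetical helper using one of the names proposed here, not an existing str method:

```python
# str.lstrip treats its argument as a SET of characters to remove,
# not as a literal prefix -- so it also eats the 'm' and 'a' of 'maria'.
s = "mailto:maria@gmail.com"
print(s.lstrip("mailto:"))  # ria@gmail.com

# A hypothetical strip_prefix helper (name from this thread) that
# removes the prefix at most once, and only if actually present:
def strip_prefix(text, prefix):
    return text[len(prefix):] if prefix and text.startswith(prefix) else text

print(strip_prefix("mailto:maria@gmail.com", "mailto:"))  # maria@gmail.com
print(strip_prefix("maria@gmail.com", "mailto:"))  # unchanged
```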
That said, I'm all for more library functions, especially in cases like this that are easy to get wrong. From jfine2358 at gmail.com Mon Mar 25 06:22:53 2019 From: jfine2358 at gmail.com (Jonathan Fine) Date: Mon, 25 Mar 2019 10:22:53 +0000 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: References: <3646bf05-fab4-9edb-2c9a-1e84dd86c060@mrabarnett.plus.com> <20190324234549.GA79811@cskk.homeip.net> Message-ID: Instead of naming these operations, we could use '+' and '-', with semantics: # Set the values of the variables. >>> a = 'hello ' >>> b = 'world' >>> c = 'hello world' # Some relations between the variables. >>> a + b == c True >>> a == c - b True >>> b == -a + c True # Just like numbers except. >>> a + b == b + a False This approach has both attractions and problems. And also decisions. The main issue, I think, comes to this. Suppose we have a, A = ('a', -'a') b, B = ('b', -'b') a + A == A + a == '' b + B == B + b == '' A + '' == '' + A == A B + '' == '' + B == B together with unrestricted addition of a, A, b, B then we have what mathematicians call the free group on 2 letters, which is an enormous object. If you want the math, look at https://en.wikipedia.org/wiki/Free_group#Examples We've made a big mistake, I think, if we allow Python programmers to accidentally encounter this free group. One way to look at this, is that we want to cut the free group down to a useful size. One way is >>> 'hello ' - 'world' == 'hello' # I like to call this truncation. True Another way is >>> 'hello' - 'world' # I like to call this subtraction. ValueError: string s1 does not end with s2, so can't be subtracted I hope this little discussion helps with naming things. I think this is enough for now. -- Jonathan -------------- next part -------------- An HTML attachment was scrubbed...
URL: From rosuav at gmail.com Mon Mar 25 06:58:52 2019 From: rosuav at gmail.com (Chris Angelico) Date: Mon, 25 Mar 2019 21:58:52 +1100 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: References: <3646bf05-fab4-9edb-2c9a-1e84dd86c060@mrabarnett.plus.com> <20190324234549.GA79811@cskk.homeip.net> Message-ID: On Mon, Mar 25, 2019 at 9:24 PM Jonathan Fine wrote: > > Instead of naming these operations, we could use '+' and '-', with semantics: > > # Set the values of the variables. > >>> a = 'hello ' > >>> b = 'world' > >>> c = 'hello world' > > # Some relations between the variables. > >>> a + b == c > True > >>> a == c - b > True > >>> b == -a + c > True > The semantics are rather underdefined here. What *exactly* does string subtraction do? Is a-b equivalent to a.replace(b, "") or something else? Also.... you imply that it's possible to negate a string and then add it, but... what does a negative string look like? *confused* ChrisA From jfine2358 at gmail.com Mon Mar 25 08:01:21 2019 From: jfine2358 at gmail.com (Jonathan Fine) Date: Mon, 25 Mar 2019 12:01:21 +0000 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: References: <3646bf05-fab4-9edb-2c9a-1e84dd86c060@mrabarnett.plus.com> <20190324234549.GA79811@cskk.homeip.net> Message-ID: Chris Angelico asked: what does a negative string look like? This is a very good question. It looks a bit like a negative number. >>> 2 + 2 4 >>> len('aa' + 'bb') 4 >>> len(-'bb') -2 # Odd, I must confess. >>> 5 + (-1) 4 >>> len('hello') 5 >>> len(-'o') -1 >>> 'hello' + (-'o') 'hell' >>> len('hello' + (-'o')) 4 Grade school: How can I possibly have -3 apples in my bag. University: How can I possibly be overdrawn in my bank account. Negative strings are similar to negative numbers except: For numbers a + b == b + a For strings a + b != b + a It is the non-commuting that makes negative strings difficult. This is a bit like computer programming.
It's not enough to have the correct lines of code (or notes). They also have to be put in the right order. I hope this helps. I do this sort of math all the time, and enjoy it. Your experience may be different. -- Jonathan -------------- next part -------------- An HTML attachment was scrubbed... URL: From jfine2358 at gmail.com Mon Mar 25 08:19:35 2019 From: jfine2358 at gmail.com (Jonathan Fine) Date: Mon, 25 Mar 2019 12:19:35 +0000 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: References: <3646bf05-fab4-9edb-2c9a-1e84dd86c060@mrabarnett.plus.com> <20190324234549.GA79811@cskk.homeip.net> Message-ID: More on negative strings. They are easier, if they only use one character. Red Queen: What's one and one and one and one and one and one and one and one and one and one and one and one and one? Alice: I don't know. I lost count. Red Queen: She can't do arithmetic. 3 --> 'aaa' 2 --> 'aa' 1 --> 'a' 0 --> '' -1 -> -'a' -2 -> -'aa' -3 -> -'aaa' Negative strings are easier if we can rearrange the order of the letters. Like anagrams. >>> ''.join(sorted('forty five')) ' effiortvy' >>> ''.join(sorted('over fifty')) ' effiortvy' Instead of counting (positively and negatively) just the letter 'a', we do the whole alphabet. But when order matters, we get an enormous free group, which Python programmers would see by accident. I hope this helps. -- Jonathan -------------- next part -------------- An HTML attachment was scrubbed... URL: From 2QdxY4RzWzUUiLuE at potatochowder.com Mon Mar 25 08:23:28 2019 From: 2QdxY4RzWzUUiLuE at potatochowder.com (Dan Sommers) Date: Mon, 25 Mar 2019 07:23:28 -0500 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: References: <3646bf05-fab4-9edb-2c9a-1e84dd86c060@mrabarnett.plus.com> <20190324234549.GA79811@cskk.homeip.net> Message-ID: <27c7395c-cf34-6344-c600-94a5b6df009c@potatochowder.com> On 3/25/19 7:01 AM, Jonathan Fine wrote: > Chris Angelico asked: what does a negative string look like?
> > This is a very good question. It looks a bit like a negative number. > > >>> 2 + 2 > 4 > >>> len('aa' + 'bb') > 4 > >>> len(-'bb') > -2 # Odd, I must confess. > >>> 5 + (-1) > 4 > >>> len('hello') > 5 > >>> len(-'o') > -1 > >>> 'hello' + (-'o') > 'hell' > >>> len('hello' + (-'o')) > 4 > > Grade school: How can I possibly have -3 apples in my bag. > University: How can I possibly be overdrawn in my bank account. > > Negative strings are similar to negative numbers except: > For numbers a + b == b + a > For strings a + b != b + a > > It is the non-commuting that makes negative strings difficult. This is a bit > like computer programming. It's not enough to have the correct lines of > code (or notes). They also have to be put in the right order. In the abstract, I believe I understand what Jonathan is saying, and in the concrete, I understand Chris's objection. Ridding a string of some of the graphemes from one end, or the other, or both, or elsewhere, is one or more different operations on the same underlying data type. We just went through this with dictionaries. So what is "hello" - "world"? "hello" because it doesn't end in "world"? "hello" because it doesn't begin with "world"? "he" because that's "hello" with all the graphemes also in "world" removed? "he" because that's "hello" with all the graphemes also in "world" removed from the end? "hello" because that's "hello" with all the graphemes also in "world" removed from the beginning?" And once we pick one of those results, what operator(s) produce the others and don't lead to perl or APL? And no matter how much Python I learn, I still can't divide by zero or by an empty string.
;-) From boxed at killingar.net Mon Mar 25 08:24:47 2019 From: boxed at killingar.net (=?utf-8?Q?Anders_Hovm=C3=B6ller?=) Date: Mon, 25 Mar 2019 13:24:47 +0100 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: References: <3646bf05-fab4-9edb-2c9a-1e84dd86c060@mrabarnett.plus.com> <20190324234549.GA79811@cskk.homeip.net> Message-ID: I think this is a terrible idea. I also think it's a mistake that python uses + for string concatenation and * for string repeat. You end up with type errors far from the first place you could have had the crash! That ship has obviously sailed but we shouldn't make even more mistakes in the same vein because we have some Stockholm syndrome with the current state of the language, or for a misplaced ideal of consistency. > On 25 Mar 2019, at 11:22, Jonathan Fine wrote: > > Instead of naming these operations, we could use '+' and '-', with semantics: > > # Set the values of the variables. > >>> a = 'hello ' > >>> b = 'world' > >>> c = 'hello world' > > # Some relations between the variables. > >>> a + b == c > True > >>> a == c - b > True > >>> b == -a + c > True > > # Just like numbers except. > >>> a + b == b + a > False > > This approach has both attractions and problems. And also decisions. The main issue, I think, comes to this. Suppose we have > a, A = ('a', -'a') > b, B = ('b', -'b') > a + A == A + a == '' > b + B == B + b == '' > A + '' == '' + A == A > B + '' == '' + B == B > together with unrestricted addition of a, A, b, B then we have what mathematicians call the free group on 2 letters, which is an enormous object. If you want the math, look at https://en.wikipedia.org/wiki/Free_group#Examples > > We've made a big mistake, I think, if we allow Python programmers to accidentally encounter this free group. One way to look at this, is that we want to cut the free group down to a useful size. One way is > >>> 'hello ' - 'world' == 'hello' # I like to call this truncation.
> True > Another way is > >>> 'hello' - 'world' # I like to call this subtraction. > ValueError: string s1 does not end with s2, so can't be subtracted > > I hope this little discussion helps with naming things. I think this is enough for now. > > -- > Jonathan > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From jfine2358 at gmail.com Mon Mar 25 09:51:31 2019 From: jfine2358 at gmail.com (Jonathan Fine) Date: Mon, 25 Mar 2019 13:51:31 +0000 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: References: <3646bf05-fab4-9edb-2c9a-1e84dd86c060@mrabarnett.plus.com> <20190324234549.GA79811@cskk.homeip.net> Message-ID: Here, concisely, is my view of the situation and my preferences. Mostly, I won't give supporting arguments or evidence. We can TRUNCATE either PRE or the POST, and similarly SUBTRACT. SUBTRACT can raise a ValueError. TRUNCATE always returns a value. Interactive examples (not tested) >>> from somewhere import post_subtract >>> sub_ed = post_subtract('ed') >>> sub_ed('fred') 'fr' >>> sub_ed('lead') ValueError Similarly >>> trunc_ed('fred') 'fr' >>> trunc_ed('lead') 'lead' Can be 'combined into one' >>> pre_truncate('app')('applet') 'let' >>> pre_truncate('app')('paper') 'paper' Possibly 1. Allow pre_truncate('app', 'applet'), perhaps with different spelling. 2. Allow '-' as a symbol for subtract. (Likely to be controversial.) I'm not particularly attached to the names. But I definitely think 3. None of these are string methods. (So pure Python implementation automatically backports.) 4. Encourage a 'two-step' process. This allows separation of concerns, and encourages good names. Supporting argument. When we write pre_subtract(prefix, s) the prefix has a special meaning.
For example, it's the header. So in one module define and test a routine remove_header. And in another module use remove_header. That way, the user of remove_header only needs to know the business purpose of the command. And the implementer needs to know only the value of the header. If the specs change, and the implementer needs to use regular expressions, then this does not affect the user of remove_header. I hope this helps. Maybe others would like to express their preferences. -- Jonathan -------------- next part -------------- An HTML attachment was scrubbed... URL: From rhodri at kynesim.co.uk Mon Mar 25 09:50:02 2019 From: rhodri at kynesim.co.uk (Rhodri James) Date: Mon, 25 Mar 2019 13:50:02 +0000 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: References: <3646bf05-fab4-9edb-2c9a-1e84dd86c060@mrabarnett.plus.com> <20190324234549.GA79811@cskk.homeip.net> Message-ID: <4995acce-dbd8-6214-c887-4c598996dbe8@kynesim.co.uk> On 25/03/2019 12:01, Jonathan Fine wrote: > Chris Angelico asked: what does a negative string look like? > > This is a very good question. It looks a bit like a negative number. They really don't. Negative numbers are well defined in terms of being the additive inverse of natural numbers. String concatenation doesn't have a well-defined inverse, as you demonstrated by not actually trying to define it. It strikes me that following this line of reasoning is at best a category error. -- Rhodri James *-* Kynesim Ltd From jfine2358 at gmail.com Mon Mar 25 10:33:49 2019 From: jfine2358 at gmail.com (Jonathan Fine) Date: Mon, 25 Mar 2019 14:33:49 +0000 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: References: <3646bf05-fab4-9edb-2c9a-1e84dd86c060@mrabarnett.plus.com> <20190324234549.GA79811@cskk.homeip.net> Message-ID: Earlier, Anders wrote: I propose naming them strip_prefix() and strip_suffix() and just skip the one that does both sides since it makes no sense to me.
This is good, except I prefer subtract_prefix(a, b), truncate_suffix etc. And for the two step process prefix_subtractor(a)(b) etc. -- Jonathan -------------- next part -------------- An HTML attachment was scrubbed... URL: From boxed at killingar.net Mon Mar 25 10:40:44 2019 From: boxed at killingar.net (=?utf-8?Q?Anders_Hovm=C3=B6ller?=) Date: Mon, 25 Mar 2019 15:40:44 +0100 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: References: <3646bf05-fab4-9edb-2c9a-1e84dd86c060@mrabarnett.plus.com> <20190324234549.GA79811@cskk.homeip.net> Message-ID: > Earlier, Anders wrote: > I propose naming them strip_prefix() and strip_suffix() and just skip the one that does both sides since it makes no sense to me. > > This is good, except I prefer subtract_prefix(a, b), truncate_suffix etc. And for the two step process prefix_subtractor(a)(b) etc. I don't understand the logic for "subtract". That's not a thing for non-numbers. If you don't think "strip" is good, then I suggest "remove". Or one could also consider "without" since we're talking about something that removes /if present/ (making subtract even worse! Subtract doesn't stop at zero). So "without_prefix()". From jfine2358 at gmail.com Mon Mar 25 10:41:59 2019 From: jfine2358 at gmail.com (Jonathan Fine) Date: Mon, 25 Mar 2019 14:41:59 +0000 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: <4995acce-dbd8-6214-c887-4c598996dbe8@kynesim.co.uk> References: <3646bf05-fab4-9edb-2c9a-1e84dd86c060@mrabarnett.plus.com> <20190324234549.GA79811@cskk.homeip.net> <4995acce-dbd8-6214-c887-4c598996dbe8@kynesim.co.uk> Message-ID: Rhodri James wrote: > They really don't. Negative numbers are well defined in terms of being > the additive inverse of natural numbers. 
String concatenation doesn't > have a well-defined inverse, > In an earlier post I showed (assuming some knowledge of group theory) that for strings in the two letters 'a' and 'b', allowing negative strings give rise to what mathematicians call the free group on 2 letters, which is an enormous object. If you want the math, look at https://en.wikipedia.org/wiki/Free_group#Construction [Except previously I linked to the wrong part of the page.] Free groups are a difficult concept, usually introduced at post-graduate level. If you can tell me you understand that concept, I'm happy on that basis to explain how it provides string concatenation with a well-defined inverse. -- Jonathan -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Mon Mar 25 10:43:27 2019 From: guido at python.org (Guido van Rossum) Date: Mon, 25 Mar 2019 07:43:27 -0700 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: <4995acce-dbd8-6214-c887-4c598996dbe8@kynesim.co.uk> References: <3646bf05-fab4-9edb-2c9a-1e84dd86c060@mrabarnett.plus.com> <20190324234549.GA79811@cskk.homeip.net> <4995acce-dbd8-6214-c887-4c598996dbe8@kynesim.co.uk> Message-ID: On Mon, Mar 25, 2019 at 7:30 AM Rhodri James wrote: > On 25/03/2019 12:01, Jonathan Fine wrote: > > Chris Angelico asked: what does a negative string look like? > > > > This is a very good question. It looks a bit like a negative number. > > They really don't. Negative numbers are well defined in terms of being > the additive inverse of natural numbers. String concatenation doesn't > have a well-defined inverse, as you demonstrated by not actually trying > to define it. It strikes me that following this line of reasoning is at > best a category error. > I assume the whole proposal was a pastiche of the proposal to add a + operator for dictionaries. Jonathan needs to come clean before more people waste their time discussing this. 
-- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From evrial at gmail.com Mon Mar 25 10:43:50 2019 From: evrial at gmail.com (Alex Grigoryev) Date: Mon, 25 Mar 2019 16:43:50 +0200 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: References: Message-ID: <1E2D83BC-326E-4278-A46C-7FE4F7FE2560@getmailspring.com> strip_prefix and strip_suffix I think are the best names from all and work perfectly with auto completion. Common use case: " mailto:maria at gmail.com".strip().strip_prefix("mailto:") On Mar 25 2019, at 4:40 pm, Anders Hovmöller wrote: > > > Earlier, Anders wrote: > > I propose naming them strip_prefix() and strip_suffix() and just skip the one that does both sides since it makes no sense to me. > > > > This is good, except I prefer subtract_prefix(a, b), truncate_suffix etc. And for the two step process prefix_subtractor(a)(b) etc. > I don't understand the logic for "subtract". That's not a thing for non-numbers. > If you don't think "strip" is good, then I suggest "remove". Or one could also consider "without" since we're talking about something that removes /if present/ (making subtract even worse! Subtract doesn't stop at zero). So "without_prefix()". > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mertz at gnosis.cx Mon Mar 25 10:55:02 2019 From: mertz at gnosis.cx (David Mertz) Date: Mon, 25 Mar 2019 10:55:02 -0400 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: <1E2D83BC-326E-4278-A46C-7FE4F7FE2560@getmailspring.com> References: <1E2D83BC-326E-4278-A46C-7FE4F7FE2560@getmailspring.com> Message-ID: All of this would be well served by a 3rd party library on PyPI.
Strings already have plenty of methods (probably too many). Having `stringtools` would be nice to import a bunch of simple functions from. On Mon, Mar 25, 2019 at 10:45 AM Alex Grigoryev wrote: > strip_prefix and strip_suffix I think are the best names from all and work > perfectly with auto completion. Common use case: > > " mailto:maria at gmail.com".strip().strip_prefix("mailto:") > > > On Mar 25 2019, at 4:40 pm, Anders Hovmöller wrote: > > > Earlier, Anders wrote: > I propose naming them strip_prefix() and strip_suffix() and just skip the > one that does both sides since it makes no sense to me. > > This is good, except I prefer subtract_prefix(a, b), truncate_suffix etc. > And for the two step process prefix_subtractor(a)(b) etc. > > > I don't understand the logic for "subtract". That's not a thing for > non-numbers. > > If you don't think "strip" is good, then I suggest "remove". Or one could > also consider "without" since we're talking about something that removes > /if present/ (making subtract even worse! Subtract doesn't stop at zero). > So "without_prefix()". > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th. -------------- next part -------------- An HTML attachment was scrubbed...
URL: From jfine2358 at gmail.com Mon Mar 25 12:09:07 2019 From: jfine2358 at gmail.com (Jonathan Fine) Date: Mon, 25 Mar 2019 16:09:07 +0000 Subject: [Python-ideas] I'm saying goodbye for a bit Message-ID: Hi I've been active recently in some threads that have become a bit heated. To help things cool down, I won't be posting for a while. I don't know how long, but certainly not until Wednesday 3 April. You can of course contact me off-list if you want. -- Jonathan -------------- next part -------------- An HTML attachment was scrubbed... URL: From mikhailwas at gmail.com Mon Mar 25 12:32:26 2019 From: mikhailwas at gmail.com (Mikhail V) Date: Mon, 25 Mar 2019 19:32:26 +0300 Subject: [Python-ideas] A directive for indentation type, stricter indentation parsing. Message-ID: Not a proposal yet, but some thoughts: I think it would help in a longer perspective if a user could include a directive in the header of the source code file that defines indentation character(s) for this source file. So this source would be parsed strictly by this char (or sequence). E.g.: # indent "\t" ... Would force the Python parser to use exactly 1 tab for 1 indent level. # indent "    " ... Would accordingly force the parser to use exactly 4 spaces for 1 indent level. Frankly I don't have much proof in hand for that will be a good addition, but intuitively I suppose it would help with some possible future features and in general, ease the development of source processors. One possible example: if a potential future feature would require a statement, and moreover require it to be indentation-aware? Lets take e.g. a multi-line string: s = """ Hello world """ print (s) >>> Hello world Here it is not so intuitive (unless you already know) how the string would be parsed (given that Python blocks are actually indentation-based). So if one would want to try introduce a new indent-aware string type and look into possible parsing disambiguation scenarios - it will be not an easy task. E.g.
say one proposes a syntax for auto-unindented string block: s = !!! Hello world print (s) >>> Hello world (no leading newline, no leading indent in resulting string, which is a bit more expected result IMO). Then it seems one can define the parsing rule unambiguously _only_ if one has a strictly defined character sequence for the indent level (e.g. 1 tab or 4 spaces, but not both). Thus it seems such a directive would be a prerequisite for such feature. And in general, I think it could help to make automatic conversions from one type of indentation to other easier. Mikhail From tjreedy at udel.edu Mon Mar 25 13:35:55 2019 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 25 Mar 2019 13:35:55 -0400 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: References: <3646bf05-fab4-9edb-2c9a-1e84dd86c060@mrabarnett.plus.com> <20190324234549.GA79811@cskk.homeip.net> Message-ID: On 3/25/2019 6:22 AM, Jonathan Fine wrote: > Instead of naming these operations, we could use '+' and '-', with > semantics: > >     # Set the values of the variables. >     >>> a = 'hello ' >     >>> b = 'world' >     >>> c = 'hello world' > >     # Some values between the variables. >     >>> a + b == c >     True >     >>> a == c - b >     True >     >>> b == -a + c >     True Summary: using '-' for trimming works well for postfixes, badly for prefixes, and not at all for infixes. Clever but not too practical since trimming prefixes seems to be more common than trimming postfixes. -- Terry Jan Reedy From tjreedy at udel.edu Mon Mar 25 13:39:24 2019 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 25 Mar 2019 13:39:24 -0400 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: References: <1E2D83BC-326E-4278-A46C-7FE4F7FE2560@getmailspring.com> Message-ID: On 3/25/2019 10:55 AM, David Mertz wrote: > All of this would be well served by a 3rd party library on PyPI. > Strings already have plenty of methods (probably too many).
Having > `stringtools` would be nice to import a bunch of simple functions from. I agree. -- Terry Jan Reedy From boxed at killingar.net Mon Mar 25 13:47:58 2019 From: boxed at killingar.net (=?utf-8?Q?Anders_Hovm=C3=B6ller?=) Date: Mon, 25 Mar 2019 18:47:58 +0100 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: References: <1E2D83BC-326E-4278-A46C-7FE4F7FE2560@getmailspring.com> Message-ID: <669F88F5-3CB2-4B33-A451-621307BDBD0F@killingar.net> > All of this would be well served by a 3rd party library on PyPI. Strings already have plenty of methods (probably too many). Having `stringtools` would be nice to import a bunch of simple functions from. I respectfully disagree. This isn't javascript where we are OK with millions of tiny dependencies. Python is batteries included and that's a great thing. This is just a tiny battery that was overlooked :) / Anders From brett at python.org Mon Mar 25 14:27:02 2019 From: brett at python.org (Brett Cannon) Date: Mon, 25 Mar 2019 11:27:02 -0700 Subject: [Python-ideas] META: Is a PEP a good place to record Python's core design decisions and coding principles? In-Reply-To: References: Message-ID: On Sun, Mar 24, 2019 at 10:34 AM Jonathan Fine wrote: > Guido van Rossum wrote: > >> I think this belongs in a personal blog, not on python-ideas and >> definitely not in a PEP. >> > > I don't agree, but I will accept that judgement, as if Guido still had > BDFL status. > To help add more weight to what Guido said, it doesn't belong here and it only belongs in a PEP if that PEP is justifying the feature to begin with. IOW we don't need a PEP justifying every design decision that we have prior to the PEP process. 
-Brett > -- > Jonathan > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Mon Mar 25 14:29:43 2019 From: brett at python.org (Brett Cannon) Date: Mon, 25 Mar 2019 11:29:43 -0700 Subject: [Python-ideas] A directive for indentation type, stricter indentation parsing. In-Reply-To: References: Message-ID: On Mon, Mar 25, 2019 at 9:33 AM Mikhail V wrote: > Not a proposal yet, but some thoughts: > I think it would help in a longer perspective if a user could > include a directive in the header of the source code file that > defines indentation character(s) for this source file. So this > source would be parsed strictly by this char (or sequence). > This is more of a linter thing than a language thing, so I would propose you take it to the code-quality mailing list. -Brett > > E.g.: > > # indent "\t" > ... > > Would force the Python parser to use exactly 1 tab for 1 indent level. > > # indent " " > ... > > Would accordingly force the parser to use exactly 4 spaces for > 1 indent level. > > Frankly I don't have much proof in hand for that will be a good > addition, but intuitively I suppose it would help with some possible > future features and in general, ease the development of source > processors. > > One possible example: if a potential future feature would require > a statement, and moreover require it to be indentation-aware? > Lets take e.g. a multi-line string: > > s = """ > Hello > world > """ > print (s) > > >>> > > Hello > world > > > Here it is not so intuitive (unless you already know) how the string would > be parsed (given that Python blocks are actually indentation-based). 
> So if one would want to try introduce a new indent-aware string type and > look into possible parsing disambiguation scenarios - it will be not an > easy task. > E.g. say one proposes a syntax for auto-unindented string block: > > s = !!! > Hello > world > print (s) > >>> > Hello > world > > (no leading newline, no leading indent in resulting string, which is a bit > more > expected result IMO). > > Then it seems one can define the parsing rule unambiguously _only_ > if one has a strictly defined character sequence for the indent level > (e.g. 1 tab or 4 spaces, but not both). > Thus it seems such a directive would be a prerequisite for such feature. > > And in general, I think it could help to make automatic conversions from > one > type of indentation to other easier. > > > > Mikhail > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.f.moore at gmail.com Mon Mar 25 14:31:35 2019 From: p.f.moore at gmail.com (Paul Moore) Date: Mon, 25 Mar 2019 18:31:35 +0000 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: <669F88F5-3CB2-4B33-A451-621307BDBD0F@killingar.net> References: <1E2D83BC-326E-4278-A46C-7FE4F7FE2560@getmailspring.com> <669F88F5-3CB2-4B33-A451-621307BDBD0F@killingar.net> Message-ID: On Mon, 25 Mar 2019 at 17:49, Anders Hovmöller wrote: > > > All of this would be well served by a 3rd party library on PyPI. Strings already have plenty of methods (probably too many). Having `stringtools` would be nice to import a bunch of simple functions from. > > I respectfully disagree. This isn't javascript where we are OK with millions of tiny dependencies. Python is batteries included and that's a great thing.
This is just a tiny battery that was overlooked :) While batteries included is a very good principle (and one I've argued for strongly in the past) it's also important to remember that Python is a mature language, and the days of being able to assume that "most people" will be on a recent version are gone. Adding these functions to the stdlib would mean that *only* people using Python 3.8+ would have access to them (and in particular, library authors wouldn't be able to use them until they drop support for all versions older than 3.8). Having the functions as an external library makes them accessible to *every* Python user. As with everything, it's a trade-off. IMO, in this case the balance is in favour of a 3rd party library (at least initially - it's perfectly possible to move the library into the stdlib later if it becomes popular). Paul From ethan at stoneleaf.us Mon Mar 25 16:56:08 2019 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 25 Mar 2019 13:56:08 -0700 Subject: [Python-ideas] META: Is a PEP a good place to record Python's core design decisions and coding principles? In-Reply-To: References: Message-ID: <7f18735d-1620-3718-c27c-3ddbc08960d7@stoneleaf.us> Chris replied > Jonathan Fine opined: >> Chris Angelico stated: Chris ----- >>> It's way WAY simpler than all this. "Iterable" isn't a type, it's a >>> protocol; in fact, "iterable" just means "has an __iter__ method". Jonathan -------- >> I think that for Chris this is a FACT about Python. This is the way Python is. Jonathan, you just lost some serious credibility. You really should do more research before posting. Chris ----- > "For me"? No. It is, pure and simply, a fact about Python. That is how > the language is defined. It's not "for me" a fact, as if facts might > not be facts for other people. That isn't how facts work, and it isn't > how Python works. 
And if you want some examples, try here: https://stackoverflow.com/a/7542261/208880 -- ~Ethan~ From greg.ewing at canterbury.ac.nz Mon Mar 25 18:27:42 2019 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 26 Mar 2019 11:27:42 +1300 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: <27c7395c-cf34-6344-c600-94a5b6df009c@potatochowder.com> References: <3646bf05-fab4-9edb-2c9a-1e84dd86c060@mrabarnett.plus.com> <20190324234549.GA79811@cskk.homeip.net> <27c7395c-cf34-6344-c600-94a5b6df009c@potatochowder.com> Message-ID: <5C9955DE.5040804@canterbury.ac.nz> Dan Sommers wrote: > So what it is "hello" - "world"? If we were to implement the entire group, it would be an element that can't be written in any simpler form. We could do that by representing a string as a sequence of signed substrings, and performing cancellations whereever possible during concatenation. But that would be a huge amount of machinery just to provide a cute notation for removing a prefix or suffix, with little in the way of other obvious applications. -- Greg From fhsxfhsx at 126.com Tue Mar 26 00:03:33 2019 From: fhsxfhsx at 126.com (fhsxfhsx) Date: Tue, 26 Mar 2019 12:03:33 +0800 (CST) Subject: [Python-ideas] A directive for indentation type, stricter indentation parsing. In-Reply-To: References: Message-ID: <6f926584.3be1.169b82ae72e.Coremail.fhsxfhsx@126.com> Just as to your example, you can try `textwrap.dedent` At 2019-03-26 00:32:26, "Mikhail V" wrote: >Not a proposal yet, but some thoughts: >I think it would help in a longer perspective if a user could >include a directive in the header of the source code file that >defines indentation character(s) for this source file. So this >source would be parsed strictly by this char (or sequence). > >E.g.: > > # indent "\t" > ... > >Would force the Python parser to use exactly 1 tab for 1 indent level. > > # indent " " > ... > >Would accordingly force the parser to use exactly 4 spaces for >1 indent level. 
> >Frankly I don't have much proof in hand for that will be a good >addition, but intuitively I suppose it would help with some possible >future features and in general, ease the development of source >processors. > >One possible example: if a potential future feature would require >a statement, and moreover require it to be indentation-aware? >Lets take e.g. a multi-line string: > > s = """ > Hello > world > """ > print (s) > > >>> > > Hello > world > > >Here it is not so intuitive (unless you already know) how the string would >be parsed (given that Python blocks are actually indentation-based). >So if one would want to try introduce a new indent-aware string type and >look into possible parsing disambiguation scenarios - it will be not an >easy task. >E.g. say one proposes a syntax for auto-unindented string block: > > s = !!! > Hello > world > print (s) > >>> > Hello > world > >(no leading newline, no leading indent in resulting string, which is a bit more >expected result IMO). > >Then it seems one can define the parsing rule unambiguously _only_ >if one has a strictly defined character sequence for the indent level >(e.g. 1 tab or 4 spaces, but not both). >Thus it seems such a directive would be a prerequisite for such feature. > >And in general, I think it could help to make automatic conversions from one >type of indentation to other easier. > > > >Mikhail >_______________________________________________ >Python-ideas mailing list >Python-ideas at python.org >https://mail.python.org/mailman/listinfo/python-ideas >Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From arj.python at gmail.com Tue Mar 26 01:07:15 2019 From: arj.python at gmail.com (Abdur-Rahmaan Janhangeer) Date: Tue, 26 Mar 2019 09:07:15 +0400 Subject: [Python-ideas] The Mailing List Digest Project Message-ID: As proposed on python-ideas, i setup a repo to turn mail threads into articles. 
here is the repo https://github.com/Abdur-rahmaanJ/py-mailing-list-summary i included a script to build .md to .html (with syntax highlighting) here is the index https://abdur-rahmaanj.github.io/py-mailing-list-summary/ included 3 articles as a start if you want to contribute an article, just follow existing .md format and put it in the .md folder planning to go across ideas, list and dev i can tell you, it's a really enjoyable experience. psst. we can enhance some html later -- Abdur-Rahmaan Janhangeer Mauritius -------------- next part -------------- An HTML attachment was scrubbed... URL: From richard.whitehead at ieee.org Tue Mar 26 05:27:18 2019 From: richard.whitehead at ieee.org (Richard Whitehead) Date: Tue, 26 Mar 2019 09:27:18 -0000 Subject: [Python-ideas] Simpler thread synchronization using "Sticky Condition" Message-ID: <005301d4e3b6$1c1099b0$5431cd10$@ieee.org> Problem: Using Python's Condition class is error-prone and difficult. For example, the following pseudo-code would be quite typical of its use: condition = threading.Condition() def sender(): while alive(): wait_for_my_data_from_hardware() with condition: send_data_to_receiver() condition.notify() def receiver(): while alive(): with condition: condition.wait() receive_all_data_from_sender() process_data() Main code will then run sender() and receiver() in separate threads. (I know that in a simple case like this, I could just use a Queue. I've been working with code that needs to service several events, where polling queues would introduce overhead and latency so a Condition must be used, but I've simplified the above as much as possible to illustrate the issues even in this simple case). There are two issues even with the above case: 1. Raising a condition only has any effect if the condition is already being waited upon; a condition does not "remember" that it has been raised.
So in the example above, if the receiver starts after the sender, it will wait on the condition even though data is already available for it. This issue can be solved by rearranging the code, waiting at the end of the loop rather than the start; but this is counter-intuitive and lots of examples online look like the above. 2. In the receiver, the condition has to be kept locked (be inside the "with" statement") all the time. The lock is automatically released while the condition is being waited upon, but otherwise it must be locked, otherwise we risk missing the condition being raised. The result is that process_data() is called with the lock held - and so this will prevent the sender going round its loop. The sending thread becomes a slave to the processing thread. This may have a huge performance penalty, losing the advantage of loose coupling that threading should provide. You might think that using an Event, rather than a Condition, would solve things, since an Event does "remember" that is has been set. But there is no atomic way to wait on an event and clear it afterwards. Solution: The solution is very simple: to create something that might be called a StickyCondition, or an AutoResetEvent. This is a Condition that does remember it has been raised; or if you prefer, it is an Event that resets itself after it has been waited upon. The example then becomes: auto_event = threading.AutoResetEvent() def sender(): while alive(): wait_for_my_data_from_hardware() send_data_to_receiver() auto_event.set() def receiver(): while alive(): auto_event.wait() receive_all_data_from_sender() process_data() This solves both of the issues described: the receiver will not wait if data is already available, and the sender is not blocked by a lock that is being held by the receiver. The event doesn't need to be locked at all, because of its internal memory. Implementation is trivial, involving a boolean and a Condition in the cross-thread case. 
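A sketch of what such a class could look like, assuming exactly the semantics just described: set() remembers the signal, and a successful wait() consumes it. The name AutoResetEvent follows the proposal; it is not an existing threading class.

```python
import threading


class AutoResetEvent:
    """Event that clears itself after each successful wait."""

    def __init__(self):
        self._cond = threading.Condition()
        self._flag = False

    def set(self):
        with self._cond:
            self._flag = True
            self._cond.notify()

    def wait(self, timeout=None):
        with self._cond:
            # wait_for re-checks the predicate in a loop, so
            # spurious wakeups are handled here automatically.
            signalled = self._cond.wait_for(lambda: self._flag, timeout)
            if signalled:
                self._flag = False  # auto-reset: consume the signal
            return signalled
```

Because the flag is checked and cleared under the condition's lock, a set() that happens before the wait() is not lost, and the waiter holds no lock while it processes data.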
I presume it would be more involved for the cross-process case, but I can see no reason it would be impossible. Please let me know if you think this would be useful. -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Tue Mar 26 06:38:14 2019 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 26 Mar 2019 11:38:14 +0100 Subject: [Python-ideas] Simpler thread synchronization using "Sticky Condition" References: <005301d4e3b6$1c1099b0$5431cd10$@ieee.org> Message-ID: <20190326113814.4abe0332@fsol> On Tue, 26 Mar 2019 09:27:18 -0000 "Richard Whitehead" wrote: > Problem: > > Using Python's Condition class is error-prone and difficult. For example, > the following pseudo-code would be quite typical of its use: [...] Nowadays, I would recommend to always use `Condition.wait_for()` rather than `Condition.wait()`. A Condition, unlike what the name suggests, is just a means of synchronization. It doesn't have a boolean state per se. You have to manage your boolean state (or any other kind of state) separately, hence the usefulness of `wait_for()`. As for auto-reset events, the Windows API has them, here's an insightful writeup about them: https://devblogs.microsoft.com/oldnewthing/?p=30773 But, yes, perhaps auto-reset events would be a good addition regardless. Either through a flag to Event, or as a separate class. Regards Antoine. From mikhailwas at gmail.com Tue Mar 26 06:48:48 2019 From: mikhailwas at gmail.com (Mikhail V) Date: Tue, 26 Mar 2019 13:48:48 +0300 Subject: [Python-ideas] A directive for indentation type, stricter indentation parsing. 
In-Reply-To: <6f926584.3be1.169b82ae72e.Coremail.fhsxfhsx@126.com> References: <6f926584.3be1.169b82ae72e.Coremail.fhsxfhsx@126.com> Message-ID: On Tue, Mar 26, 2019 at 7:04 AM fhsxfhsx wrote: > > Just as to your example, you can try `textwrap.dedent` > Procedural removal is not cool, because it does not know the parent indentation of the statement that contains the text block, thus it can't resolve automatically string indents that were intentionally indented to include extra space. E.g. this, where "s=" line is already inside an indented block: s = """ Hello world""" Say I use 1 tab for 1 indent - what I want here is to remove 1 tab from each string line AND 1 tab that belongs to code formatting (current indent of the "s=" line). And if you try to guess it from the string itself - it is impossible to resolve all cases, because you still need some criteria - for example you could use criteria "minimal indent inside the string is the indent" but this will again fail if you want extra shift on same level inside the string. E.g. this: s = """ Hello world""" Here I do not want to remove all indents but only as in above - only 1 level inside string and 1 from parent line. Do you get it? So in other words if I want my string blocks aligned within containing blocks, it becomes impossible to resolve all un-indenting cases automatically. Mikhail From rosuav at gmail.com Tue Mar 26 07:01:04 2019 From: rosuav at gmail.com (Chris Angelico) Date: Tue, 26 Mar 2019 22:01:04 +1100 Subject: [Python-ideas] A directive for indentation type, stricter indentation parsing. In-Reply-To: References: <6f926584.3be1.169b82ae72e.Coremail.fhsxfhsx@126.com> Message-ID: On Tue, Mar 26, 2019 at 9:49 PM Mikhail V wrote: > Procedural removal is not cool, because it does not know the parent indentation > of the statement that contains the text block, thus it can't resolve > automatically > string indents that were intentionally indented to include extra space. > > E.g. 
this, where "s=" line is already inside an indented block: > s = """ > Hello > world""" > > Say I use 1 tab for 1 indent - what I want here is to remove 1 tab from > each string line AND 1 tab that belongs to code formatting (current indent > of the "s=" line). And if you try to guess it from the string itself - > it is impossible to resolve all cases, because you still need some criteria > - for example you could use criteria "minimal indent inside the string > is the indent" but > this will again fail if you want extra shift on same level inside the string. > E.g. this: > s = """ > Hello > world""" > > Here I do not want to remove all indents but only as in above - only 1 > level inside string > and 1 from parent line. > Do you get it? > > So in other words if I want my string blocks aligned within containing > blocks, it becomes > impossible to resolve all un-indenting cases automatically. > This is true if you put your closing quotes on the same line as the last line of text, because then there's no information available. If, instead, you put the final triple-quote delimiter on its own line, you then have the indentation preserved, and can remove just that much indentation from each line. So, yes, you CAN unindent automatically. Point of note: PEP 257 recommends this for docstrings. https://www.python.org/dev/peps/pep-0257/#handling-docstring-indentation ChrisA From richard.whitehead at ieee.org Tue Mar 26 07:14:57 2019 From: richard.whitehead at ieee.org (Richard Whitehead) Date: Tue, 26 Mar 2019 11:14:57 -0000 Subject: [Python-ideas] Simpler thread synchronization using "Sticky Condition" In-Reply-To: References: Message-ID: <17C5C1AA7EBD4C5FAF3FCB5F738FFAEA@HomePC2> Antoine, Thanks for a couple of very useful comments. 
The usefulness of wait_for(), especially in avoiding the problem of spurious-waits, is something I'd forgotten - this is really just syntactic sugar, but I agree it is very useful - as illustrated by the fact that I didn't remember to put my wait() call in its own loop in my example! On the other hand, the lack of locking makes this new thing more like an Event than a Condition. Also, since it would only make sense for one thread to wait on this object, rather than many, that again makes it more like an Event. And Event only has wait(), not wait_for() - it makes no sense to "loop" on an object that is "sticky", you would just get livelock. So in summary, this new class (or behaviour) would still only have wait() and not wait_for(). Does that devalue it, do you think? I know that C# has AutoResetEvent, but I didn't know it was native to the Windows API. I'm pretty sure it is not native to Linux (there are a couple of other useful features, such as "Wait on multiple events", that are in Windows but not Linux, and therefore are also not in Python). Does this affect implementation - would you expect an AutoResetEvent to be implemented natively on Windows, rather than being composed out of pure Python using a boolean and a Condition? Regarding the link you sent, I don't entirely agree with the opinion expressed: if you try to use a Semaphore for this purpose you will soon find that it is "the wrong way round", it is intended to protect resources from multiple accesses, not to synchronize those multiple accesses. 
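For reference, the wait_for() pattern under discussion can be sketched like this, with the boolean state kept beside the Condition that guards it (the function names are illustrative, not an existing API):

```python
import threading

condition = threading.Condition()
data_ready = False  # shared state, guarded by the condition's lock


def signal_data_ready():
    global data_ready
    with condition:
        data_ready = True
        condition.notify()


def wait_for_data(timeout=None):
    global data_ready
    with condition:
        # wait_for re-checks the predicate in a loop, so spurious
        # wakeups need no explicit handling at the call site.
        if condition.wait_for(lambda: data_ready, timeout):
            data_ready = False  # consume the signal under the lock
            return True
        return False
```

A signal raised before the wait is not lost here either: if data_ready is already True, wait_for() returns immediately without blocking.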
Thanks, Richard -----Original Message----- From: python-ideas-request at python.org Sent: Tuesday, March 26, 2019 10:49 AM To: python-ideas at python.org Subject: Python-ideas Digest, Vol 148, Issue 143 Send Python-ideas mailing list submissions to python-ideas at python.org To subscribe or unsubscribe via the World Wide Web, visit https://mail.python.org/mailman/listinfo/python-ideas or, via email, send a message with subject or body 'help' to python-ideas-request at python.org You can reach the person managing the list at python-ideas-owner at python.org When replying, please edit your Subject line so it is more specific than "Re: Contents of Python-ideas digest..." Today's Topics: 1. The Mailing List Digest Project (Abdur-Rahmaan Janhangeer) 2. Simpler thread synchronization using "Sticky Condition" (Richard Whitehead) 3. Re: Simpler thread synchronization using "Sticky Condition" (Antoine Pitrou) 4. Re: A directive for indentation type, stricter indentation parsing. (Mikhail V) ---------------------------------------------------------------------- Message: 1 Date: Tue, 26 Mar 2019 09:07:15 +0400 From: Abdur-Rahmaan Janhangeer To: Python , python-ideas Subject: [Python-ideas] The Mailing List Digest Project Message-ID: Content-Type: text/plain; charset="utf-8" As proposed on python-ideas, i setup a repo to turn mail threads into articles. here is the repo https://github.com/Abdur-rahmaanJ/py-mailing-list-summary i included a script to build .md to .html (with syntax highlighting) here is the index https://abdur-rahmaanj.github.io/py-mailing-list-summary/ included 3 articles as a start if you want to contribute an article, just follow existing .md format and put it in the .md folder planning to go across ideas, list and dev i can tell you, it's a really enjoyable experience. psst. we can enhance some html later -- Abdur-Rahmaan Janhangeer Mauritius -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

------------------------------

Message: 2
Date: Tue, 26 Mar 2019 09:27:18 -0000
From: "Richard Whitehead"
To:
Subject: [Python-ideas] Simpler thread synchronization using "Sticky Condition"
Message-ID: <005301d4e3b6$1c1099b0$5431cd10$@ieee.org>
Content-Type: text/plain; charset="us-ascii"

Problem:

Using Python's Condition class is error-prone and difficult. For example, the following pseudo-code would be quite typical of its use:

condition = threading.Condition()

def sender():
    while alive():
        wait_for_my_data_from_hardware()
        with condition:
            send_data_to_receiver()
            condition.notify()

def receiver():
    while alive():
        with condition:
            condition.wait()
            receive_all_data_from_sender()
            process_data()

Main code will then run sender() and receiver() in separate threads.

(I know that in a simple case like this, I could just use a Queue. I've been working with code that needs to service several events, where polling queues would introduce overhead and latency, so a Condition must be used; but I've simplified the above as much as possible to illustrate the issues even in this simple case).

There are two issues even with the above case:

1. Notifying a condition only has any effect if the condition is already being waited upon; a condition does not "remember" that it has been raised. So in the example above, if the receiver starts after the sender, it will wait on the condition even though data is already available for it. This issue can be solved by rearranging the code, waiting at the end of the loop rather than the start; but this is counter-intuitive and lots of examples online look like the above.

2. In the receiver, the condition has to be kept locked (be inside the "with" statement) all the time. The lock is automatically released while the condition is being waited upon, but otherwise it must be locked, otherwise we risk missing the condition being raised.
The result is that process_data() is called with the lock held - and so this will prevent the sender going round its loop. The sending thread becomes a slave to the processing thread. This may have a huge performance penalty, losing the advantage of loose coupling that threading should provide.

You might think that using an Event, rather than a Condition, would solve things, since an Event does "remember" that it has been set. But there is no atomic way to wait on an event and clear it afterwards.

Solution:

The solution is very simple: to create something that might be called a StickyCondition, or an AutoResetEvent. This is a Condition that does remember it has been raised; or if you prefer, it is an Event that resets itself after it has been waited upon. The example then becomes:

auto_event = threading.AutoResetEvent()

def sender():
    while alive():
        wait_for_my_data_from_hardware()
        send_data_to_receiver()
        auto_event.set()

def receiver():
    while alive():
        auto_event.wait()
        receive_all_data_from_sender()
        process_data()

This solves both of the issues described: the receiver will not wait if data is already available, and the sender is not blocked by a lock that is being held by the receiver. The event doesn't need to be locked at all, because of its internal memory.

Implementation is trivial, involving a boolean and a Condition in the cross-thread case. I presume it would be more involved for the cross-process case, but I can see no reason it would be impossible.

Please let me know if you think this would be useful.
-------------- next part --------------
An HTML attachment was scrubbed...
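The proposal says the implementation is trivial, involving a boolean and a Condition; a minimal cross-thread sketch along those lines (the class details and single-waiter assumption are mine - an illustration of the idea, not a reference implementation) might look like this:

```python
import threading

class AutoResetEvent:
    """Sketch of the proposed "sticky" event: set() is remembered even if
    nobody is waiting yet, and wait() atomically consumes the flag.
    Assumes a single waiting thread, as in the post."""

    def __init__(self):
        self._cond = threading.Condition()
        self._flag = False

    def set(self):
        with self._cond:
            self._flag = True          # remember the signal
            self._cond.notify()

    def wait(self, timeout=None):
        with self._cond:
            # wait_for() re-checks the predicate, so a set() that happened
            # before wait() is seen immediately; clearing under the same
            # lock makes wait-and-clear atomic.
            signalled = self._cond.wait_for(lambda: self._flag, timeout)
            self._flag = False
            return signalled
```

With this, a set() that happens before wait() is not lost, and there is no window between waking and clearing in which a signal can slip through - the two problems the post describes.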
URL:

------------------------------

Message: 3
Date: Tue, 26 Mar 2019 11:38:14 +0100
From: Antoine Pitrou
To: python-ideas at python.org
Subject: Re: [Python-ideas] Simpler thread synchronization using "Sticky Condition"
Message-ID: <20190326113814.4abe0332 at fsol>
Content-Type: text/plain; charset=US-ASCII

On Tue, 26 Mar 2019 09:27:18 -0000
"Richard Whitehead" wrote:
> Problem:
>
> Using Python's Condition class is error-prone and difficult. For example,
> the following pseudo-code would be quite typical of its use:
[...]

Nowadays, I would recommend always using `Condition.wait_for()` rather than `Condition.wait()`. A Condition, unlike what the name suggests, is just a means of synchronization. It doesn't have a boolean state per se. You have to manage your boolean state (or any other kind of state) separately, hence the usefulness of `wait_for()`.

As for auto-reset events, the Windows API has them; here's an insightful writeup about them:
https://devblogs.microsoft.com/oldnewthing/?p=30773

But, yes, perhaps auto-reset events would be a good addition regardless. Either through a flag to Event, or as a separate class.

Regards

Antoine.

------------------------------

Message: 4
Date: Tue, 26 Mar 2019 13:48:48 +0300
From: Mikhail V
Cc: python-ideas
Subject: Re: [Python-ideas] A directive for indentation type, stricter indentation parsing.
Message-ID:
Content-Type: text/plain; charset="UTF-8"

On Tue, Mar 26, 2019 at 7:04 AM fhsxfhsx wrote:
>
> Just as to your example, you can try `textwrap.dedent`
>

Procedural removal is not cool, because it does not know the parent indentation of the statement that contains the text block, thus it can't automatically resolve string indents that were intentionally indented to include extra space. E.g.
this, where the "s =" line is already inside an indented block:

    s = """
        Hello
        world"""

Say I use 1 tab for 1 indent - what I want here is to remove 1 tab from each string line AND 1 tab that belongs to code formatting (the current indent of the "s =" line). And if you try to guess it from the string itself, it is impossible to resolve all cases, because you still need some criteria - for example you could use the criterion "the minimal indent inside the string is the indent", but this will again fail if you want an extra shift on the same level inside the string. E.g. this:

    s = """
        Hello
    world"""

Here I do not want to remove all indents, but only as above - only 1 level inside the string and 1 from the parent line. Do you get it? So in other words, if I want my string blocks aligned within containing blocks, it becomes impossible to resolve all un-indenting cases automatically.

Mikhail

------------------------------

Subject: Digest Footer

_______________________________________________
Python-ideas mailing list
Python-ideas at python.org
https://mail.python.org/mailman/listinfo/python-ideas

------------------------------

End of Python-ideas Digest, Vol 148, Issue 143
**********************************************

From mikhailwas at gmail.com Tue Mar 26 08:44:59 2019
From: mikhailwas at gmail.com (Mikhail V)
Date: Tue, 26 Mar 2019 15:44:59 +0300
Subject: [Python-ideas] A directive for indentation type, stricter indentation parsing.
In-Reply-To:
References: <6f926584.3be1.169b82ae72e.Coremail.fhsxfhsx@126.com>
Message-ID:

On Tue, Mar 26, 2019 at 2:02 PM Chris Angelico wrote:
>
> On Tue, Mar 26, 2019 at 9:49 PM Mikhail V wrote:
> > Procedural removal is not cool, because it does not know the parent indentation
> > [...]
> > E.g. this:
> > s = """
> >     Hello
> > world"""
> >
> > Here I do not want to remove all indents but only as in above - only 1
> > level inside string
> > and 1 from parent line.
>
> This is true if you put your closing quotes on the same line as the
> last line of text, because then there's no information available. If,
> instead, you put the final triple-quote delimiter on its own line, you
> then have the indentation preserved, and can remove just that much
> indentation from each line. So, yes, you CAN unindent automatically.
>
> Point of note: PEP 257 recommends this for docstrings.
>
> https://www.python.org/dev/peps/pep-0257/#handling-docstring-indentation
>
> ChrisA

Yes, you are right. If the closing triple quote is on a separate line, I can use the last line's contents and strip it from the previous lines. I should have remembered that trick, and it is cool actually. So technically I can do it.

So my point was that it would be cool to have a dedicated statement for this, so that it works at the parser level; plus it would look cleaner (no closing quotes), and there would be no need to care about escaping of the """ sequence if it appears inside the string. But I think that would be impossible without the parser knowing the indent method in advance.

From robertve92 at gmail.com Tue Mar 26 09:06:05 2019
From: robertve92 at gmail.com (Robert Vanden Eynde)
Date: Tue, 26 Mar 2019 14:06:05 +0100
Subject: [Python-ideas] New explicit methods to trim strings
In-Reply-To:
References: <7F3E2818-17C1-4987-A61A-BDED8CF5ADB6@killingar.net> <5181B0DB-3B10-4202-90D6-1365AEF19654@getmailspring.com>
Message-ID:

> And this really is simple enough that I don't want to reach for regex's
> for it. That is, I'd write it by hand rather than mess with that.
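The PEP 257 trick Chris points to above - put the closing triple quote on its own line, so its leading whitespace records the code-level indentation - can be sketched as a small helper (the function name is mine):

```python
def dedent_by_last_line(s):
    """Strip the indentation of the closing-quote line from every line.

    A sketch of the trick discussed above: assumes the string came from a
    triple-quoted literal whose final delimiter sat alone on the last
    line, so that last line is pure whitespace.
    """
    lines = s.split("\n")
    indent = lines[-1]  # the whitespace that preceded the closing quotes
    return "\n".join(
        line[len(indent):] if line.startswith(indent) else line
        for line in lines[:-1]
    )
```

Unlike textwrap.dedent(), which strips the longest common leading whitespace (the "minimal indent" criterion objected to above), this removes exactly the code-level indentation, so an extra shift inside the string survives.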
>

Well, with re.escape it's not messy at all:

import re

def trim_mailto(s):
    regex = re.compile("^" + re.escape("mailto:"))
    return regex.sub('', s)

Which literally means "if you have mailto: at the beginning, replace it with the empty string".

You could do a ltrim function in one line:

def ltrim(s, x):
    return re.sub("^" + re.escape(x), '', s)

Escape will take care of escaping special characters, so the regex escape(x) matches exactly the string "x".
-------------- next part --------------
An HTML attachment was scrubbed...

URL:

From boxed at killingar.net Tue Mar 26 10:18:39 2019
From: boxed at killingar.net (=?utf-8?Q?Anders_Hovm=C3=B6ller?=)
Date: Tue, 26 Mar 2019 15:18:39 +0100
Subject: [Python-ideas] New explicit methods to trim strings
In-Reply-To:
References: <7F3E2818-17C1-4987-A61A-BDED8CF5ADB6@killingar.net> <5181B0DB-3B10-4202-90D6-1365AEF19654@getmailspring.com>
Message-ID: <58F83F48-0C3E-44BC-8B93-57BBE871D632@killingar.net>

>> And this really is simple enough that I don't want to reach for regex's for it. That is, I'd write it by hand rather than mess with that.
>
> Well, with re.escape it's not messy at all:
>
> import re
> def trim_mailto(s):
>     regex = re.compile("^" + re.escape("mailto:"))
>     return regex.sub('', s)
>
> Which literally means "if you have mailto: at the beginning, replace it with the empty string"
>
> You could do a ltrim function in one line:
>
> def ltrim(s, x):
>     return re.sub("^" + re.escape(x), '', s)
>
> Escape will take care of escaping special characters, so the regex escape(x) matches exactly the string "x".

I think

    re.sub("^" + re.escape(x), '', s)

is a lot more messy and hard to read than

    s[len(prefix):] if s.startswith(prefix) else s

it's also roughly an order of magnitude slower.

/ Anders
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From pythonchb at gmail.com Tue Mar 26 11:14:25 2019
From: pythonchb at gmail.com (Christopher Barker)
Date: Tue, 26 Mar 2019 08:14:25 -0700
Subject: [Python-ideas] New explicit methods to trim strings
In-Reply-To: <58F83F48-0C3E-44BC-8B93-57BBE871D632@killingar.net>
References: <7F3E2818-17C1-4987-A61A-BDED8CF5ADB6@killingar.net> <5181B0DB-3B10-4202-90D6-1365AEF19654@getmailspring.com> <58F83F48-0C3E-44BC-8B93-57BBE871D632@killingar.net>
Message-ID:

>> And this really is simple enough that I don't want to reach for regex's
>> for it. That is, I'd write it by hand rather than mess with that.
>
> Well, with re.escape it's not messy at all:
>
> You could do a ltrim function in one line:
>
> def ltrim(s, x):
>     return re.sub("^" + re.escape(x), '', s)
>
> I think
>
> re.sub("^" + re.escape(x), '', s)
>
> is a lot more messy and hard to read than
>
> s[len(prefix):] if s.startswith(prefix) else s
>
> it's also roughly an order of magnitude slower.

I agree, but I said I wouldn't choose to "mess" with regex, not that the resulting code would be messy.

There is a substantial cognitive load in working with regex - they are another language. If you aren't familiar with them (I'm not) it would take some time, and googling, to find that solution. If you are familiar with them, you still need to import another module and end up with an arguably less readable and slower solution.

Python was designed from the beginning not to rely on regex for simple string processing, opting for a fairly full-featured set of string methods. These two simple methods fit well into that approach.

-CHB

-- 
Christopher Barker, PhD

Python Language Consulting
- Teaching
- Scientific Software Development
- Desktop GUI and Web Development
- wxPython, numpy, scipy, Cython
-------------- next part --------------
An HTML attachment was scrubbed...
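For comparison, the two spellings from this exchange side by side (the function names are mine):

```python
import re

def ltrim_re(s, prefix):
    # The regex version discussed above: re.escape() makes the prefix
    # match literally, and "^" anchors it to the start of the string.
    return re.sub("^" + re.escape(prefix), "", s)

def ltrim_slice(s, prefix):
    # The plain string-method version: slice off the prefix if present.
    return s[len(prefix):] if s.startswith(prefix) else s
```

(For the record: Python 3.9 eventually added str.removeprefix() and str.removesuffix() for exactly this, via PEP 616.)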
URL:

From pythonchb at gmail.com Tue Mar 26 11:18:25 2019
From: pythonchb at gmail.com (Christopher Barker)
Date: Tue, 26 Mar 2019 08:18:25 -0700
Subject: [Python-ideas] A directive for indentation type, stricter indentation parsing.
In-Reply-To:
References: <6f926584.3be1.169b82ae72e.Coremail.fhsxfhsx@126.com>
Message-ID:

> So my point was that it would be cool to have a dedicated statement for
> this

Look in the archives of this list - there was a rejected proposal for dedented strings as a language feature fairly recently.

-CHB

-- 
Christopher Barker, PhD

Python Language Consulting
- Teaching
- Scientific Software Development
- Desktop GUI and Web Development
- wxPython, numpy, scipy, Cython
-------------- next part --------------
An HTML attachment was scrubbed...

URL:

From pythonchb at gmail.com Tue Mar 26 11:28:14 2019
From: pythonchb at gmail.com (Christopher Barker)
Date: Tue, 26 Mar 2019 08:28:14 -0700
Subject: [Python-ideas] The Mailing List Digest Project
In-Reply-To:
References:
Message-ID:

On Mon, Mar 25, 2019 at 10:01 PM Abdur-Rahmaan Janhangeer < arj.python at gmail.com> wrote:

> As proposed on python-ideas, i setup a repo to turn mail threads into
> articles.
>

Thanks for doing this - I find myself frequently telling people about past relevant threads on this list; it will be great to have a single place to point people. It can be hard to find stuff in the archives if you're not sure what to search for.

> here is the repo
>
> https://github.com/Abdur-rahmaanJ/py-mailing-list-summary
>
> i included a script to build .md to .html

Maybe Sphinx and RST instead? For consistency with other Python docs? But markup is far less important than content.

-CHB

> (with syntax highlighting)
>
> here is the index
>
> https://abdur-rahmaanj.github.io/py-mailing-list-summary/
>
> included 3 articles as a start
>
> if you want to contribute an article, just follow existing .md format and
> put it in the .md folder
>
> planning to go across ideas, list and dev
>
> i can tell you, it's a really enjoyable experience.
>
> psst. we can enhance some html later
>
> --
> Abdur-Rahmaan Janhangeer
> Mauritius

-- 
Christopher Barker, PhD

Python Language Consulting
- Teaching
- Scientific Software Development
- Desktop GUI and Web Development
- wxPython, numpy, scipy, Cython
-------------- next part --------------
An HTML attachment was scrubbed...

URL:

From arj.python at gmail.com Tue Mar 26 11:32:38 2019
From: arj.python at gmail.com (Abdur-Rahmaan Janhangeer)
Date: Tue, 26 Mar 2019 19:32:38 +0400
Subject: [Python-ideas] The Mailing List Digest Project
In-Reply-To:
References:
Message-ID:

Great! will see sphinx but if i find the html hard to customise, i'll drop it. Search feature and tags coming.

also, currently i'm formatting the mails rather than an article, i don't know if a real summary of the topic is preferable ...

Abdur-Rahmaan Janhangeer
Mauritius
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From njs at pobox.com Tue Mar 26 12:24:40 2019
From: njs at pobox.com (Nathaniel Smith)
Date: Tue, 26 Mar 2019 09:24:40 -0700
Subject: [Python-ideas] Simpler thread synchronization using "Sticky Condition"
In-Reply-To: <005301d4e3b6$1c1099b0$5431cd10$@ieee.org>
References: <005301d4e3b6$1c1099b0$5431cd10$@ieee.org>
Message-ID:

These kinds of low-level synchronization primitives are notoriously tricky, yeah, and I'm all in favor of having better higher-level tools. But I'm not sure that AutoResetEvent adds enough to be worth it.

AFAICT, you can get this behavior with an Event just fine - using your pseudocode:

def sender():
    while alive():
        wait_for_my_data_from_hardware()
        send_data_to_receiver()
        auto_event.set()

def receiver():
    while alive():
        auto_event.wait()
        auto_event.clear()  # <-- this line added
        receive_all_data_from_sender()
        process_data()

It's true that if we use a regular Event then the .clear() doesn't happen atomically with the wakeup, but that doesn't matter. If we call auto_event.set() and then have new data arrive, then there are two cases:

1) the new data arrives early enough to be seen by the current call to receive_all_data_from_sender(): this is fine, the new data will be processed in this call

2) the new data arrives too late to be seen by the current call to receive_all_data_from_sender(): that means the new data arrived after the call to receive_all_data_from_sender() started, which means it arrived after auto_event.clear(), which means that the call to auto_event.set() will successfully re-arm the event and another call to receive_all_data_from_sender() will happen immediately

That said, this is still very tricky. It requires careful analysis, and it's not very general (for example, if we want to support multiple receivers then we need to throw out the whole approach and do something entirely different).
In Trio we've actually discussed removing Event.clear(), since it's so difficult to use correctly: https://github.com/python-trio/trio/issues/637

You said your original problem is that you have multiple event sources, and the receiver needs to listen to all of them. And based on your approach, I guess you only have one receiver, and that it's OK to couple all the event sources directly to this receiver (i.e., you're OK with passing them all a Condition object to use). Under these circumstances, wouldn't it make more sense to use a single Queue, pass it to all the sources, and have them each do queue.put((source_id, event))? That's simple to implement, hard to mess up, and can easily be extended to multiple receivers.

If you want to further decouple the sources from the receiver, then one approach would be to have each source expose its own Queue independently, and then define some kind of 'select' operation (like in Golang/CSP/concurrent ML) to let the receiver read from multiple Queues simultaneously. This is non-trivial to do, but in return you get a very general and powerful construct. There's some more links and discussion here: https://github.com/python-trio/trio/issues/242

> Regarding the link you sent, I don't entirely agree with the opinion
> expressed: if you try to use a Semaphore for this purpose you will soon
> find that it is "the wrong way round", it is intended to protect resources
> from multiple accesses, not to synchronize those multiple accesses

Semaphores are extremely generic primitives - there are a lot of different ways to use them. I think the blog post is correct that an AutoResetEvent is equivalent to a semaphore whose value is clamped so that it can't exceed 1. Your 'auto_event.set()' would be implemented as 'sem.release()', and 'auto_event.wait()' would be 'sem.acquire()'.
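The single-Queue pattern suggested above can be sketched like this (the helper names are mine):

```python
import queue
import threading

events = queue.Queue()  # one queue shared by every event source

def source(source_id, payload):
    # Each source tags its events so the receiver can tell them apart.
    events.put((source_id, payload))

def receive(n):
    # A single consumer drains events from every source; get() blocks
    # until something arrives, so no Condition juggling is needed.
    return [events.get() for _ in range(n)]
```

Because queue.Queue is thread-safe, the sources need no shared lock of their own, and adding more receivers is just a matter of calling get() from more threads.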
I guess technically the semantics might be slightly different when there are multiple waiters: the semaphore wakes up exactly one waiter, while I'm not sure what your AutoResetEvent would do. But I can't see any way to use AutoResetEvent reliably with multiple waiters anyway.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org

From richard.whitehead at ieee.org Tue Mar 26 12:50:20 2019
From: richard.whitehead at ieee.org (Richard Whitehead)
Date: Tue, 26 Mar 2019 16:50:20 -0000
Subject: [Python-ideas] Simpler thread synchronization using "Sticky Condition"
In-Reply-To:
References: <005301d4e3b6$1c1099b0$5431cd10$@ieee.org>
Message-ID:

Nathaniel,

Thanks very much for taking the time to comment.

Clearing the event after waiting for it will introduce a race condition: if the sender has gone around its loop again and set the semaphore after we have woken but before we've cleared it. As you said, this stuff is tricky! The only safe way is to make the wait-and-clear atomic, which can be done with a lock; and this comes essentially back to what I'm proposing.

I realise this is not a fundamental new primitive - if it was, I wouldn't be able to build it in pure Python - but I've found it extremely useful in our generic threading and processing library.

You're right about what you say regarding queues; I didn't want to go into the full details of the multi-threading and multi-processing situation at hand, but I will say that we have a pipeline of tasks that can run as either threads or processes, and we want to make it easy to construct this pipeline, "wiring" it as necessary; combining command queues with data queues just gets a real mess.
Richard

-----Original Message-----
From: Nathaniel Smith
Sent: Tuesday, March 26, 2019 4:24 PM
To: richard.whitehead at ieee.org
Cc: Python-Ideas
Subject: Re: [Python-ideas] Simpler thread synchronization using "Sticky Condition"

These kinds of low-level synchronization primitives are notoriously tricky, yeah, and I'm all in favor of having better higher-level tools. But I'm not sure that AutoResetEvent adds enough to be worth it.

AFAICT, you can get this behavior with an Event just fine - using your pseudocode:

def sender():
    while alive():
        wait_for_my_data_from_hardware()
        send_data_to_receiver()
        auto_event.set()

def receiver():
    while alive():
        auto_event.wait()
        auto_event.clear()  # <-- this line added
        receive_all_data_from_sender()
        process_data()

It's true that if we use a regular Event then the .clear() doesn't happen atomically with the wakeup, but that doesn't matter. If we call auto_event.set() and then have new data arrive, then there are two cases:

1) the new data arrives early enough to be seen by the current call to receive_all_data_from_sender(): this is fine, the new data will be processed in this call

2) the new data arrives too late to be seen by the current call to receive_all_data_from_sender(): that means the new data arrived after the call to receive_all_data_from_sender() started, which means it arrived after auto_event.clear(), which means that the call to auto_event.set() will successfully re-arm the event and another call to receive_all_data_from_sender() will happen immediately

That said, this is still very tricky. It requires careful analysis, and it's not very general (for example, if we want to support multiple receivers then we need to throw out the whole approach and do something entirely different).
In Trio we've actually discussed removing Event.clear(), since it's so difficult to use correctly: https://github.com/python-trio/trio/issues/637

You said your original problem is that you have multiple event sources, and the receiver needs to listen to all of them. And based on your approach, I guess you only have one receiver, and that it's OK to couple all the event sources directly to this receiver (i.e., you're OK with passing them all a Condition object to use). Under these circumstances, wouldn't it make more sense to use a single Queue, pass it to all the sources, and have them each do queue.put((source_id, event))? That's simple to implement, hard to mess up, and can easily be extended to multiple receivers.

If you want to further decouple the sources from the receiver, then one approach would be to have each source expose its own Queue independently, and then define some kind of 'select' operation (like in Golang/CSP/concurrent ML) to let the receiver read from multiple Queues simultaneously. This is non-trivial to do, but in return you get a very general and powerful construct. There's some more links and discussion here: https://github.com/python-trio/trio/issues/242

> Regarding the link you sent, I don't entirely agree with the opinion
> expressed: if you try to use a Semaphore for this purpose you will soon
> find that it is "the wrong way round", it is intended to protect resources
> from multiple accesses, not to synchronize those multiple accesses

Semaphores are extremely generic primitives - there are a lot of different ways to use them. I think the blog post is correct that an AutoResetEvent is equivalent to a semaphore whose value is clamped so that it can't exceed 1. Your 'auto_event.set()' would be implemented as 'sem.release()', and 'auto_event.wait()' would be 'sem.acquire()'.
I guess technically the semantics might be slightly different when there are multiple waiters: the semaphore wakes up exactly one waiter, while I'm not sure what your AutoResetEvent would do. But I can't see any way to use AutoResetEvent reliably with multiple waiters anyway.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org

From njs at pobox.com Tue Mar 26 13:10:42 2019
From: njs at pobox.com (Nathaniel Smith)
Date: Tue, 26 Mar 2019 10:10:42 -0700
Subject: [Python-ideas] Simpler thread synchronization using "Sticky Condition"
In-Reply-To:
References: <005301d4e3b6$1c1099b0$5431cd10$@ieee.org>
Message-ID:

On Tue, Mar 26, 2019, 09:50 Richard Whitehead wrote:

> Nathaniel,
>
> Thanks very much for taking the time to comment.
>
> Clearing the event after waiting for it will introduce a race condition: if
> the sender has gone around its loop again and set the semaphore after we
> have woken but before we've cleared it.

Sounds fine to me. Why is that a problem? Can you write down an example of how two threads could be interleaved to produce incorrect results?

> As you said, this stuff is tricky!
> The only safe way is to make the wait-and-clear atomic, which can be done
> with a lock; and this comes essentially back to what I'm proposing.
>
> I realise this is not a fundamental new primitive - if it was, I wouldn't be
> able to build it in pure Python - but I've found it extremely useful in our
> generic threading and processing library.
>
> You're right about what you say regarding queues; I didn't want to go into
> the full details of the multi-threading and multi-processing situation at
> hand, but I will say that we have a pipeline of tasks that can run as either
> threads or processes, and we want to make it easy to construct this
> pipeline, "wiring" it as necessary; combining command queues with data
> queues just gets a real mess.
>

But you're effectively implementing a multi-producer single-consumer Queue anyway, so without any details it's hard to guess why using a Queue would be messier. I know you don't want to get into too many details, but if the whole motivation for your proposal is based on some details then it's usually a good idea to explain them :-).

-n

-------------- next part --------------
An HTML attachment was scrubbed...

URL:

From richard.whitehead at ieee.org Tue Mar 26 15:56:18 2019
From: richard.whitehead at ieee.org (Richard Whitehead)
Date: Tue, 26 Mar 2019 19:56:18 -0000
Subject: [Python-ideas] Simpler thread synchronization using "Sticky Condition"
In-Reply-To:
References: <005301d4e3b6$1c1099b0$5431cd10$@ieee.org>
Message-ID: <242371E9E2EC4B129F92C3B2C437DD0B@HomePC2>

Nathaniel,

You're quite right, I can't work out the race condition myself now... Is it possible that an AutoResetEvent is just an Event with a two-line wait() overload???

Richard

From: Nathaniel Smith
Sent: Tuesday, March 26, 2019 5:10 PM
To: Richard Whitehead
Cc: Python-Ideas
Subject: Re: [Python-ideas] Simpler thread synchronization using "Sticky Condition"

On Tue, Mar 26, 2019, 09:50 Richard Whitehead wrote:

Nathaniel,

Thanks very much for taking the time to comment.

Clearing the event after waiting for it will introduce a race condition: if the sender has gone around its loop again and set the semaphore after we have woken but before we've cleared it.

Sounds fine to me. Why is that a problem? Can you write down an example of how two threads could be interleaved to produce incorrect results?

As you said, this stuff is tricky!

The only safe way is to make the wait-and-clear atomic, which can be done with a lock; and this comes essentially back to what I'm proposing.

I realise this is not a fundamental new primitive - if it was, I wouldn't be able to build it in pure Python - but I've found it extremely useful in our generic threading and processing library.
You're right about what you say regarding queues; I didn't want to go into the full details of the multi-threading and multi-processing situation at hand, but I will say that we have a pipeline of tasks that can run as either threads or processes, and we want to make it easy to construct this pipeline, "wiring" it as necessary; combining command queues with data queues just gets a real mess.

But you're effectively implementing a multi-producer single-consumer Queue anyway, so without any details it's hard to guess why using a Queue would be messier. I know you don't want to get into too many details, but if the whole motivation for your proposal is based on some details then it's usually a good idea to explain them :-).

-n

-------------- next part --------------
An HTML attachment was scrubbed...

URL:

From tim.mitchell.chch at gmail.com Tue Mar 26 16:05:49 2019
From: tim.mitchell.chch at gmail.com (Tim Mitchell)
Date: Wed, 27 Mar 2019 09:05:49 +1300
Subject: [Python-ideas] singledispatch for methods
Message-ID:

Hi All,

functools.singledispatch does not work on methods. There are 2 packages that do this for methods: one on GitHub only (https://gist.github.com/adamnew123456/9218f99ba35da225ca11) and my pypi package (https://pypi.org/project/methoddispatch/). There are also a couple of stack overflow posts about it (https://stackoverflow.com/questions/24601722/how-can-i-use-functools-singledispatch-with-instance-methods , https://stackoverflow.com/questions/24063788/python3-singledispatch-in-class-how-to-dispatch-self-type/24064102 ) with about 60 votes between them.

Is it time to add singledispatch for methods to the core library? If so, would the methoddispatch implementation suffice, or are there changes you would like made?

Thanks for your time.
-------------- next part --------------
An HTML attachment was scrubbed...
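For reference, functools later gained exactly this capability in Python 3.8 as singledispatchmethod, which dispatches on the type of the first argument after self; a minimal sketch of its use (class and method names are mine):

```python
from functools import singledispatchmethod

class Formatter:
    # The base implementation is the fallback for unregistered types.
    @singledispatchmethod
    def format(self, arg):
        raise NotImplementedError(f"cannot format {type(arg).__name__}")

    # register() picks up the dispatch type from the annotation.
    @format.register
    def _(self, arg: int):
        return f"int: {arg}"

    @format.register
    def _(self, arg: str):
        return f"str: {arg!r}"
```

Plain functools.singledispatch fails on methods because it dispatches on the first positional argument, which for a method is always self; singledispatchmethod skips over it.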
URL:

From benrudiak at gmail.com Tue Mar 26 18:36:08 2019
From: benrudiak at gmail.com (Ben Rudiak-Gould)
Date: Tue, 26 Mar 2019 15:36:08 -0700
Subject: [Python-ideas] Simpler thread synchronization using "Sticky Condition"
In-Reply-To: <005301d4e3b6$1c1099b0$5431cd10$@ieee.org>
References: <005301d4e3b6$1c1099b0$5431cd10$@ieee.org>
Message-ID:

On Tue, Mar 26, 2019 at 2:40 AM Richard Whitehead wrote:
> Using Python's Condition class is error-prone and difficult.

After looking at your example, I think the problem is just that you aren't using condition variables the way they're intended to be used. To write correct condition-variable code, first write code that contains polling loops like

    with lock:
        ...
        while not some_boolean_condition():
            lock.release()
            time.sleep(0)
            lock.acquire()
        ...

Then (1) for each boolean condition that you polled on, introduce a corresponding condition variable; (2) whenever you do something that causes the truth of that condition to flip from false to true, notify the condition variable; and (3) replace the polling loop with

    while not some_boolean_condition():
        cvar_associated_with_that_condition.wait()

You should only call Condition.wait() inside a while loop that tests the associated condition (or use wait_for()). Implementations of condition variables don't necessarily guarantee that the condition is true when wait() returns, even if you do everything else correctly. See https://en.wikipedia.org/wiki/Spurious_wakeup .

If the condition variable is notified when you aren't inside wait(), you will miss the notification. That isn't a problem because as long as the boolean condition itself remains true, you will exit the while loop immediately upon testing it. Or, if the condition has become false again, you will wait and you will get the next false-to-true notification. Condition variables are not meant to substitute for polling the actual condition; they just make it more efficient.
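Applying the three-step recipe above to a tiny producer/consumer (the names are illustrative) gives:

```python
import threading

items = []
items_available = threading.Condition()  # one cvar per boolean condition

def produce(item):
    with items_available:
        items.append(item)
        # The condition "items is non-empty" may have just flipped from
        # false to true, so notify (step 2 of the recipe).
        items_available.notify()

def consume():
    with items_available:
        # Step 3: wait inside a while loop that re-tests the condition,
        # so spurious wakeups and missed notifications are harmless.
        while not items:
            items_available.wait()
        return items.pop(0)
```

The while loop is exactly what wait_for() wraps up for you: consume() could equally say items_available.wait_for(lambda: items).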
From pythonchb at gmail.com Tue Mar 26 21:00:24 2019 From: pythonchb at gmail.com (Christopher Barker) Date: Tue, 26 Mar 2019 18:00:24 -0700 Subject: [Python-ideas] The Mailing List Digest Project In-Reply-To: References: Message-ID: On Tue, Mar 26, 2019 at 8:32 AM Abdur-Rahmaan Janhangeer < arj.python at gmail.com> wrote: > Great! will see sphinx but if i find the html hard to customise, i'll drop > it. Sphinx has theming support, plus you can do custom CSS if you want. But I highly discourage you from worrying about formatting -- decent structure is good enough, and content is what matters. > Search feature and tags coming. Sphinx has search built in. > also, currently i'm formatting the mails rather than an article, i don't > know if a real summary of the topic preferable ... These mailing lists are really big, and the threads are long and scattered, and they are archived and searchable already. So I think the real value would be article-style summaries (with links to the threads). For Python-Ideas, I'm thinking kind of a mini rejected PEP ... -CHB > > Abdur-Rahmaan Janhangeer > Mauritius > -- Christopher Barker, PhD Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython -------------- next part -------------- An HTML attachment was scrubbed... URL: From arj.python at gmail.com Tue Mar 26 22:22:32 2019 From: arj.python at gmail.com (Abdur-Rahmaan Janhangeer) Date: Wed, 27 Mar 2019 06:22:32 +0400 Subject: [Python-ideas] The Mailing List Digest Project In-Reply-To: References: Message-ID: #agree -------------- next part -------------- An HTML attachment was scrubbed...
URL: From mike at selik.org Tue Mar 26 23:35:11 2019 From: mike at selik.org (Michael Selik) Date: Tue, 26 Mar 2019 20:35:11 -0700 Subject: [Python-ideas] singledispatch for methods In-Reply-To: References: Message-ID: On Tue, Mar 26, 2019, 1:09 PM Tim Mitchell wrote: > Is it time to add singledispatch for methods to the core library? > What's the motivation for it, beyond the fact that it's possible? Regarding jargon, aren't Python's instance methods already single-dispatch, because they receive the instance as the first argument? -------------- next part -------------- An HTML attachment was scrubbed... URL: From arj.python at gmail.com Wed Mar 27 02:29:18 2019 From: arj.python at gmail.com (Abdur-Rahmaan Janhangeer) Date: Wed, 27 Mar 2019 10:29:18 +0400 Subject: [Python-ideas] Allow not in lambda expressions Message-ID: Suppose i have (lambda x: x if x != None else '')(someVar) returning an empty string if none but, if "not" was allowed (lambda x: x if x not None else '')(someVar) it might have been more elegant PROPOSAL: Allow "not" in lambda expressions -- Abdur-Rahmaan Janhangeer Mauritius -------------- next part -------------- An HTML attachment was scrubbed... URL: From songofacandy at gmail.com Wed Mar 27 02:28:43 2019 From: songofacandy at gmail.com (Inada Naoki) Date: Wed, 27 Mar 2019 15:28:43 +0900 Subject: [Python-ideas] Allow not in lambda expressions In-Reply-To: References: Message-ID: Do you mean "is not"? On Wed, Mar 27, 2019 at 3:24 PM Abdur-Rahmaan Janhangeer < arj.python at gmail.com> wrote: > Suppose i have > > (lambda x: x if x != None else '')(someVar) > > returning an empty string if none > > but, if "not" was allowed > > (lambda x: x if x not None else '')(someVar) > > it might have been more elegant > > PROPOSAL: Allow "not" in lambda expressions > > -- > Abdur-Rahmaan Janhangeer > Mauritius > > > Garanti > sans virus.
www.avast.com > > <#m_2517025322762206080_DAB4FAD8-2DD7-40BB-A1B8-4E2AA1F9FDF2> > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- Inada Naoki -------------- next part -------------- An HTML attachment was scrubbed... URL: From arj.python at gmail.com Wed Mar 27 02:29:29 2019 From: arj.python at gmail.com (Abdur-Rahmaan Janhangeer) Date: Wed, 27 Mar 2019 10:29:29 +0400 Subject: [Python-ideas] Allow not in lambda expressions In-Reply-To: References: Message-ID: please discard, it should maybe be proposed under "is not" and "not" directly -------------- next part -------------- An HTML attachment was scrubbed... URL: From brenbarn at brenbarn.net Wed Mar 27 02:27:58 2019 From: brenbarn at brenbarn.net (Brendan Barnwell) Date: Tue, 26 Mar 2019 23:27:58 -0700 Subject: [Python-ideas] Allow not in lambda expressions In-Reply-To: References: Message-ID: <5C9B17EE.1030508@brenbarn.net> On 2019-03-26 23:29, Abdur-Rahmaan Janhangeer wrote: > Suppose i have > > (lambda x: x if x != None else '')(someVar) > > returning an empty string if none > > but, if "not" was allowed > > (lambda x: x if x not None else '')(someVar) > > it might have been more elegant > > PROPOSAL: Allow "not" in lambda expressions What you describe is already possible, you just have to use the "is not" operator, just as you would in any other expression. -- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown From arj.python at gmail.com Wed Mar 27 03:48:21 2019 From: arj.python at gmail.com (Abdur-Rahmaan Janhangeer) Date: Wed, 27 Mar 2019 11:48:21 +0400 Subject: [Python-ideas] Allow not in lambda expressions In-Reply-To: <5C9B17EE.1030508@brenbarn.net> References: <5C9B17EE.1030508@brenbarn.net> Message-ID: @Brendan unfortunately i've realised it. 
py sounded suddenly english. -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.mitchell.chch at gmail.com Wed Mar 27 04:28:21 2019 From: tim.mitchell.chch at gmail.com (Tim Mitchell) Date: Wed, 27 Mar 2019 21:28:21 +1300 Subject: [Python-ideas] singledispatch for methods In-Reply-To: References: Message-ID: The motivation is the same as for functools.singledispatch ( https://www.python.org/dev/peps/pep-0443/) - provide generic methods. These are useful for visitor implementations that are cleaner than the visit_ approach typically used today (e.g. https://github.com/mbr/visitor). It is also useful for writing serialisers and encoders. The motivation is also that people keep using the method name munging approach (e.g. mypy) because functools.singledispatch only works on functions and not class methods. Cheers Tim On Wed, Mar 27, 2019 at 4:33 PM Michael Selik wrote: > On Tue, Mar 26, 2019, 1:09 PM Tim Mitchell > wrote: > >> Is it time to add singledispatch for methods to the core library? >> > > What's the motivation for it, beyond the fact that it's possible? > > Regarding jargon, aren't Python's instance methods are already > single-dispatch, because they receive the instance as the first argument? > >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From songofacandy at gmail.com Wed Mar 27 04:33:33 2019 From: songofacandy at gmail.com (Inada Naoki) Date: Wed, 27 Mar 2019 17:33:33 +0900 Subject: [Python-ideas] singledispatch for methods In-Reply-To: References: Message-ID: On Wed, Mar 27, 2019 at 5:09 AM Tim Mitchell wrote: > > Is it time to add singledispatch for methods to the core library? > If so, would the methoddispatch implementation suffice or are there changes you would like made? > singledispatchmethod will be added in Python 3.8. Please try Python 3.8a3! 
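For reference, usage on 3.8 looks roughly like this (the `Formatter` class below is an illustrative sketch, not an example taken from the stdlib docs or from methoddispatch):

```python
from functools import singledispatchmethod

class Formatter:
    @singledispatchmethod
    def format(self, arg):
        # fallback for unregistered types
        return repr(arg)

    @format.register
    def _(self, arg: int):
        # dispatch is inferred from the annotation on the
        # first argument after self
        return f"int: {arg}"

    @format.register
    def _(self, arg: list):
        # recurse, dispatching per element
        return "[" + ", ".join(self.format(x) for x in arg) + "]"

f = Formatter()
print(f.format(42))        # int: 42
print(f.format([1, "a"]))  # [int: 1, 'a']
```

As with plain singledispatch, the implementation registered for the type of the argument after `self` is chosen, and the undecorated base method is the fallback.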
https://docs.python.org/3.8/library/functools.html#functools.singledispatchmethod https://www.python.org/downloads/release/python-380a3/ -- Inada Naoki From richard.whitehead at ieee.org Wed Mar 27 05:39:26 2019 From: richard.whitehead at ieee.org (Richard Whitehead) Date: Wed, 27 Mar 2019 09:39:26 -0000 Subject: [Python-ideas] Simpler thread synchronization using "Sticky Condition" In-Reply-To: References: <005301d4e3b6$1c1099b0$5431cd10$@ieee.org> Message-ID: <001301d4e480$f8030000$e8090000$@ieee.org> Ben, Thanks very much for your reply. Everything you say is true, and the system can certainly be made to work (and without the performance problem I described) using only Condition variables. I think the correct version of my pseudo-code becomes:

    def sender():
        while alive():
            wait_for_my_data_from_hardware()
            with condition:
                send_data_to_receiver()  # I think this has to be done inside the lock on the Condition
                condition.notify()

    def receiver():
        while alive():
            while data_available():
                receive_all_data_from_sender()
                process_data()
            with condition:  # only sleep when we have nothing to do
                condition.wait()

From richard.whitehead at ieee.org Wed Mar 27 05:43:45 2019 From: richard.whitehead at ieee.org (Richard Whitehead) Date: Wed, 27 Mar 2019 09:43:45 -0000 Subject: [Python-ideas] Simpler thread synchronization using "Sticky Condition" In-Reply-To: References: <005301d4e3b6$1c1099b0$5431cd10$@ieee.org> Message-ID: <001601d4e481$92a359c0$b7ea0d40$@ieee.org> Thank you everyone who has commented on this issue, I've been overwhelmed by the helpfulness and thoughtfulness of the responses. In summary: * Condition variables work without problems provided you use them correctly, but using them correctly is quite tricky. * Using an event to achieve the same thing is trivially easy. * Having the event auto-resetting isn't even necessary.
My conclusions: * The very simplest client code, which is almost impossible to get wrong or for someone else to mess up during maintenance, uses an auto-resetting event, and so that's what I'm going to do. * It doesn't sound like there is any appetite for this auto-reset-event to be part of the Python standard library, so I'm not going to push for a PEP to be raised on it, but I will certainly use it in my company's base library code. Thanks again, Richard From malincns at 163.com Wed Mar 27 09:46:44 2019 From: malincns at 163.com (Ma Lin) Date: Wed, 27 Mar 2019 21:46:44 +0800 Subject: [Python-ideas] Unified style of cache management API Message-ID: re module [1] and struct module [2] have module-level cache for compiled stuffs. Other third-party modules may also need cache for something. Do we need an unified cache management API like this? I suppose it's not mandatory, but welcome each module to use this API.

    module.cache_get_capacity()     # return current capacity
    module.cache_set_capacity(100)  # set capacity
    module.cache_clear()            # clear cache

Moreover, add these API to sys module, then the users can manage system wide cache easily:

    sys.cache_register(f)  # register a .cache_clear() function
    sys.cache_clear()      # call all registered .cache_clear()

[1] re module policy: FIFO capacity: 512 (default), changeable implementation: Python https://github.com/python/cpython/blob/v3.8.0a3/Lib/re.py#L268-L295 [2] struct module clear entire cache when full capacity: 100, unchangeable implementation: C https://github.com/python/cpython/blob/v3.8.0a3/Modules/_struct.c#L2071-L2126 From brett at python.org Wed Mar 27 13:09:57 2019 From: brett at python.org (Brett Cannon) Date: Wed, 27 Mar 2019 10:09:57 -0700 Subject: [Python-ideas] Unified style of cache management API In-Reply-To: References: Message-ID: On Wed, Mar 27, 2019 at 6:47 AM Ma Lin wrote: > re module [1] and struct module [2] have module-level cache for compiled > stuffs.
> Other third-party modules may also need cache for something. > > Do we need an unified cache management API like this? > Need? No. Nice to have? Maybe. > I suppose it's not mandatory, but welcome each module to use this API. > > module.cache_get_capacity() # return current capacity > module.cache_set_capacity(100) # set capacity > module.cache_clear() # clear cache > The only thing might be a cache-clearing function and I would name that module._clear_cache() or something like importlib.invalidate_caches() (which isn't a module-level cache, but it's still a cache ;) . > > Moreover, add these API to sys module, then the users can manage system > wide cache easily: > > sys.cache_register(f) # register a .cache_clear() function > sys.cache_clear() # call all registered .cache_clear() > That's not necessary if you standardize on the name as all you're asking to do is:

    for module in sys.modules.values():
        if hasattr(module, '_clear_cache'):
            module._clear_cache()

-Brett > > [1] re module > policy: FIFO > capacity: 512 (default), changeable > implementation: Python > https://github.com/python/cpython/blob/v3.8.0a3/Lib/re.py#L268-L295 > > [2] struct module > clear entire cache when full > capacity: 100, unchangeable > implementation: C > > https://github.com/python/cpython/blob/v3.8.0a3/Modules/_struct.c#L2071-L2126 > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From storchaka at gmail.com Wed Mar 27 13:33:24 2019 From: storchaka at gmail.com (Serhiy Storchaka) Date: Wed, 27 Mar 2019 19:33:24 +0200 Subject: [Python-ideas] Unified style of cache management API In-Reply-To: References: Message-ID: 27.03.19 15:46, Ma Lin wrote: > re module [1] and struct module [2] have module-level cache for compiled > stuffs.
> Other third-party modules may also need cache for something. > > Do we need an unified cache management API like this? > I suppose it's not mandatory, but welcome each module to use this API. > > module.cache_get_capacity() # return current capacity > module.cache_set_capacity(100) # set capacity > module.cache_clear() # clear cache > > Moreover, add these API to sys module, then the users can manage system > wide cache easily: > > sys.cache_register(f) # register a .cache_clear() function > sys.cache_clear() # call all registered .cache_clear() I proposed a similar idea in 2015. https://mail.python.org/pipermail/python-ideas/2015-April/032836.html From pythonchb at gmail.com Thu Mar 28 01:31:25 2019 From: pythonchb at gmail.com (Christopher Barker) Date: Wed, 27 Mar 2019 22:31:25 -0700 Subject: [Python-ideas] New Project to Capture summaries from this list Message-ID: Hi all, Inspired by Abdur-Rahmaan Janhangeer and some previous discussions on this list, I've started a project to capture some of the common threads from this list (and others). We often find ourselves saying things like "this was discussed on this list a year or so ago" when someone brings up a common topic, but it can be hard to find those threads, and often those threads are very, very long and hard to follow. I'm hoping you all will contribute to this effort to better document the history of Python-ideas. The project is here: https://github.com/PythonCHB/PythonListsSummaries And is published here: https://pythonchb.github.io/PythonListsSummaries If it catches on, we can move it to maybe a new GitHub org, or somewhere less associated with me -- I'm hoping it will be a community effort. Contributions Encouraged!
I've kicked it off with a summary of the whole ``list.join()`` question: https://pythonchb.github.io/PythonListsSummaries/python_ideas/join_as_a_list_method.html -Chris -- Christopher Barker, PhD Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython -------------- next part -------------- An HTML attachment was scrubbed... URL: From arj.python at gmail.com Thu Mar 28 02:01:58 2019 From: arj.python at gmail.com (Abdur-Rahmaan Janhangeer) Date: Thu, 28 Mar 2019 10:01:58 +0400 Subject: [Python-ideas] The Mailing List Digest Project In-Reply-To: References: Message-ID: continuing a better effort here: https://github.com/PythonCHB/PythonListsSummaries moving to an impersonal repo later! -------------- next part -------------- An HTML attachment was scrubbed... URL: From boxed at killingar.net Thu Mar 28 11:07:42 2019 From: boxed at killingar.net (Anders Hovmöller) Date: Thu, 28 Mar 2019 16:07:42 +0100 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: References: <1E2D83BC-326E-4278-A46C-7FE4F7FE2560@getmailspring.com> <669F88F5-3CB2-4B33-A451-621307BDBD0F@killingar.net> Message-ID: >>> All of this would be well served by a 3rd party library on PyPI. Strings already have plenty of methods (probably too many). Having `stringtools` would be nice to import a bunch of simple functions from. >> >> I respectfully disagree. This isn't JavaScript where we are OK with millions of tiny dependencies. Python is batteries included and that's a great thing. This is just a tiny battery that was overlooked :) > > While batteries included is a very good principle (and one I've argued > for strongly in the past) it's also important to remember that Python > is a mature language, and the days of being able to assume that "most > people" will be on a recent version are gone.
It's much more true now than it has been in over a decade. People have largely moved away from python 2.7 and after that it's pretty easy to keep pace. There's a lag, but it's no longer decades. > Adding these functions > to the stdlib would mean that *only* people using Python 3.8+ would > have access to them (and in particular, library authors wouldn't be > able to use them until they drop support for all versions older than > 3.8). Having the functions as an external library makes them > accessible to *every* Python user. Sure. And if library authors want to support older versions they'll have to vendor this into their own code, just like always. This seems totally irrelevant to the discussion. And it's of course irrelevant to all the end users that aren't writing libraries but are using python directly. > As with everything, it's a trade-off. IMO, in this case the balance is > in favour of a 3rd party library (at least initially - it's perfectly > possible to move the library into the stdlib later if it becomes > popular). Putting it in a library virtually guarantees it will never become popular. And because we are talking about new methods on str, a library that monkey-patches two new methods onto str won't become popular for obvious reasons. Plus it's actually impossible:

    >>> str.foo = 1
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: can't set attributes of built-in/extension type 'str'

So this can't really be moved into the standard library or be implemented by a library in a nice way. / Anders From richard.whitehead at ieee.org Thu Mar 28 11:25:34 2019 From: richard.whitehead at ieee.org (Richard Whitehead) Date: Thu, 28 Mar 2019 15:25:34 -0000 Subject: [Python-ideas] New Project to Capture summaries from this Message-ID: <001101d4e57a$7d4dd290$77e977b0$@ieee.org>
Please can I make a more radical suggestion, though: Drop the mailing list. How about a GitHub repo - a specific one (with no code), specifically for early ideas? Then, if an idea was accepted and turned into an issue to be implemented, it could link back to that original discussion. GitHub is easily searchable. It can email you if someone comments on an issue you have raised, etc. An alternative might be a StackOverflow section, but that wouldn't provide such tight integration in the case of an issue being raised. The new work you're doing would be a good way to populate the repo with its initial content. Richard From lisandrosnik at gmail.com Thu Mar 28 11:33:22 2019 From: lisandrosnik at gmail.com (Lysandros Nikolaou) Date: Thu, 28 Mar 2019 16:33:22 +0100 Subject: [Python-ideas] New Project to Capture summaries from this In-Reply-To: <001101d4e57a$7d4dd290$77e977b0$@ieee.org> References: <001101d4e57a$7d4dd290$77e977b0$@ieee.org> Message-ID: Hi Richard, there's been lots of discussion about that in the past. You can check out https://discuss.python.org, which is the current most popular idea about an alternative. It has not been decided if the mailing lists will be dropped or not though, as far as I know. Lys On Thu, Mar 28, 2019 at 4:30 PM Richard Whitehead < richard.whitehead at ieee.org> wrote: > Chris, > > As a new member to this list, I can tell you that searching for relevant > old > content was effectively impossible, so I'm all for some way of doing that. > > Please can I make a more radical suggestion, though: Drop the mailing list. > How about a GitHub repo - a specific one (with no code), specifically for > early ideas? Then, if an idea was accepted and turned into an issue to be > implemented, it could link back to that original discussion. GitHub is > easily searchable. It can email you if someone comments on an issue you > have > raised, etc. 
> > An alternative might be a StackOverflow section, but that wouldn't provide > such tight integration in the case of an issue being raised. > > The new work you're doing would be a good way to populate the repo with its > initial content. > > Richard > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pythonchb at gmail.com Thu Mar 28 11:45:42 2019 From: pythonchb at gmail.com (Christopher Barker) Date: Thu, 28 Mar 2019 08:45:42 -0700 Subject: [Python-ideas] New Project to Capture summaries from this In-Reply-To: References: <001101d4e57a$7d4dd290$77e977b0$@ieee.org> Message-ID: > > there's been lots of discussion about that in the past. > Indeed. Personally, I?m all for keeping the mailing list. See past discussions for the advantage. I do think it might be a good idea to move to a gitHub repo once a topic gets past the vague idea stage to the hammer out a proposal stage ? which indeed often happens once someone starts drafting a PEP. But a freewheeling discussion in a gitHub issue isn?t any easier to navigate than a list discussion. This latest effort will only help if it grows to critical mass ? but newbies can help! For example, if you see a post here that says something to the effect of: ?this has been discussed in the past on this list?, you can search and find hat discussion, and summarize the results for future readers. Hmm, maybe the ?we should use something other than a list? discussion is one of those ;-) -CHB You can check out https://discuss.python.org, which is the current most > popular idea about an alternative. It has not been decided if the mailing > lists will be dropped or not though, as far as I know. 
> > Lys > > On Thu, Mar 28, 2019 at 4:30 PM Richard Whitehead < > richard.whitehead at ieee.org> wrote: > >> Chris, >> >> As a new member to this list, I can tell you that searching for relevant >> old >> content was effectively impossible, so I'm all for some way of doing that. >> >> Please can I make a more radical suggestion, though: Drop the mailing >> list. >> How about a GitHub repo - a specific one (with no code), specifically for >> early ideas? Then, if an idea was accepted and turned into an issue to be >> implemented, it could link back to that original discussion. GitHub is >> easily searchable. It can email you if someone comments on an issue you >> have >> raised, etc. >> >> An alternative might be a StackOverflow section, but that wouldn't provide >> such tight integration in the case of an issue being raised. >> >> The new work you're doing would be a good way to populate the repo with >> its >> initial content. >> >> Richard >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- Christopher Barker, PhD Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From rhodri at kynesim.co.uk Thu Mar 28 13:12:06 2019 From: rhodri at kynesim.co.uk (Rhodri James) Date: Thu, 28 Mar 2019 17:12:06 +0000 Subject: [Python-ideas] New Project to Capture summaries from this In-Reply-To: <001101d4e57a$7d4dd290$77e977b0$@ieee.org> References: <001101d4e57a$7d4dd290$77e977b0$@ieee.org> Message-ID: <26f787ae-ceb9-a984-51ea-75a75f4f648f@kynesim.co.uk> On 28/03/2019 15:25, Richard Whitehead wrote: > Chris, > > As a new member to this list, I can tell you that searching for relevant old > content was effectively impossible, so I'm all for some way of doing that. > > Please can I make a more radical suggestion, though: Drop the mailing list. > How about a GitHub repo - a specific one (with no code), specifically for > early ideas? Then, if an idea was accepted and turned into an issue to be > implemented, it could link back to that original discussion. GitHub is > easily searchable. It can email you if someone comments on an issue you have > raised, etc. Github is more searchable than a mailing list (at least until the mailing list archives are made searchable), but is not designed for and is not as good at discussion as a mailing list. Threads of discussion branch and interweave easily on a mailing list, so you don't lose lines of thought. That lack is the big failing of forum-style interfaces. > An alternative might be a StackOverflow section, but that wouldn't provide > such tight integration in the case of an issue being raised. I think I'd go so far as "Hell, no!" here. > The new work you're doing would be a good way to populate the repo with its > initial content. Having to do new work would certainly discourage some of the less technically well-founded ideas.
I don't think that's what you meant, though :-) -- Rhodri James *-* Kynesim Ltd From brett at python.org Thu Mar 28 13:45:09 2019 From: brett at python.org (Brett Cannon) Date: Thu, 28 Mar 2019 10:45:09 -0700 Subject: [Python-ideas] Unified style of cache management API In-Reply-To: References: Message-ID: On Wed, Mar 27, 2019 at 10:34 AM Serhiy Storchaka wrote: > 27.03.19 15:46, Ma Lin wrote: > > re module [1] and struct module [2] have module-level cache for compiled > > stuffs. > > Other third-party modules may also need cache for something. > > > > Do we need an unified cache management API like this? > > I suppose it's not mandatory, but welcome each module to use this API. > > > > module.cache_get_capacity() # return current capacity > > module.cache_set_capacity(100) # set capacity > > module.cache_clear() # clear cache > > > > Moreover, add these API to sys module, then the users can manage system > > wide cache easily: > > > > sys.cache_register(f) # register a .cache_clear() function > > sys.cache_clear() # call all registered .cache_clear() > > I proposed a similar idea in 2015. > > https://mail.python.org/pipermail/python-ideas/2015-April/032836.html So I would say that a cache-clearing function convention would be a reasonable starting point. If that turns out to not be enough for folks we can talk about expanding it, but I think we should start small and grow from there as needed. So what name would people want: clear_cache() or _clear_cache()? I personally prefer the latter since clearing the cache shouldn't be something people should typically need to do and thus the function is an implementation detail. -------------- next part -------------- An HTML attachment was scrubbed...
URL: From storchaka at gmail.com Thu Mar 28 13:52:22 2019 From: storchaka at gmail.com (Serhiy Storchaka) Date: Thu, 28 Mar 2019 19:52:22 +0200 Subject: [Python-ideas] Unified style of cache management API In-Reply-To: References: Message-ID: 28.03.19 19:45, Brett Cannon wrote: > So I would say that a cache-clearing function convention would be a > reasonable starting point. If that turns out to not be enough for folks > we can talk about expanding it, but I think we should start small and > grow from there as needed. > > So what name would people want. clear_cache() or _clear_cache()? I > personally prefer the latter since clearing the cache shouldn't be > something people should typically need to do and thus the function is an > implementation detail. This is an interesting idea. I think it should be a dunder name: __clearcache__() or __clear_cache__(). The disadvantage is that this method is slower in the case of a large number of imported modules. From steve at pearwood.info Thu Mar 28 19:49:33 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 29 Mar 2019 10:49:33 +1100 Subject: [Python-ideas] New Project to Capture summaries from this In-Reply-To: <001101d4e57a$7d4dd290$77e977b0$@ieee.org> References: <001101d4e57a$7d4dd290$77e977b0$@ieee.org> Message-ID: <20190328234933.GN31406@ando.pearwood.info> On Thu, Mar 28, 2019 at 03:25:34PM -0000, Richard Whitehead wrote: > Chris, > > As a new member to this list, I can tell you that searching for relevant old > content was effectively impossible, so I'm all for some way of doing that. "Effectively impossible" is a gross exaggeration. The old mailman built-in search functionality is not fantastic, but it's not useless either, and more importantly, Google does a great job of indexing the archives. (I haven't tried it, but I expect Bing does too.)
Very occasionally I find that Google's indexes may be off: in the past, if a post was deleted (a rare occurrence, but it did happen from time to time) it would force the remaining posts for that month to get new URLs. But I don't think that's a problem now, and as Google crawls the archives again that will slowly correct itself. Whereas for *me*, three quarters of the functionality on Github doesn't work at all. -- Steven From mertz at gnosis.cx Thu Mar 28 20:54:53 2019 From: mertz at gnosis.cx (David Mertz) Date: Thu, 28 Mar 2019 20:54:53 -0400 Subject: [Python-ideas] New Project to Capture summaries from this In-Reply-To: <001101d4e57a$7d4dd290$77e977b0$@ieee.org> References: <001101d4e57a$7d4dd290$77e977b0$@ieee.org> Message-ID: Dropping the mailing list is another topic that often comes up, and is always a terrible idea. Every suggester had a different platform in mind, only consistent in all being vastly worse than email for this purpose That said, if someone writes a FAQ about this mailing list, the first answer can be "We are not moving discussion to GitHub / Slack / Discuss / Reddit / StackOverflow / MediaWiki / graffiti on popular buildings / etc" On Thu, Mar 28, 2019, 11:29 AM Richard Whitehead wrote: > Chris, > > As a new member to this list, I can tell you that searching for relevant > old > content was effectively impossible, so I'm all for some way of doing that. > > Please can I make a more radical suggestion, though: Drop the mailing list. > How about a GitHub repo - a specific one (with no code), specifically for > early ideas? Then, if an idea was accepted and turned into an issue to be > implemented, it could link back to that original discussion. GitHub is > easily searchable. It can email you if someone comments on an issue you > have > raised, etc. > > An alternative might be a StackOverflow section, but that wouldn't provide > such tight integration in the case of an issue being raised. 
> > The new work you're doing would be a good way to populate the repo with its > initial content. > > Richard > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Thu Mar 28 21:11:10 2019 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 29 Mar 2019 12:11:10 +1100 Subject: [Python-ideas] New Project to Capture summaries from this In-Reply-To: References: <001101d4e57a$7d4dd290$77e977b0$@ieee.org> Message-ID: On Fri, Mar 29, 2019 at 11:56 AM David Mertz wrote: > > That said, if someone writes a FAQ about this mailing list, the first answer can be "We are not moving discussion to GitHub / Slack / Discuss / Reddit / StackOverflow / MediaWiki / > graffiti on popular buildings / etc" > That last one reminds me of XKCD 1810 with its references to "wall (Unix)" and "wall (bathroom)"... General principle: People who complain about email are using suboptimal clients, people who complain about non-email systems are using suboptimal services. And there's nothing that's truly optimal in either case (some email clients do come close, but only for their current users, not for new users). 
ChrisA From njs at pobox.com Thu Mar 28 21:43:39 2019 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 28 Mar 2019 18:43:39 -0700 Subject: [Python-ideas] New Project to Capture summaries from this In-Reply-To: <20190328234933.GN31406@ando.pearwood.info> References: <001101d4e57a$7d4dd290$77e977b0$@ieee.org> <20190328234933.GN31406@ando.pearwood.info> Message-ID: On Thu, Mar 28, 2019 at 4:52 PM Steven D'Aprano wrote: > On Thu, Mar 28, 2019 at 03:25:34PM -0000, Richard Whitehead wrote: > > Chris, > > > > As a new member to this list, I can tell you that searching for relevant old > > content was effectively impossible, so I'm all for some way of doing that. > > "Effectively impossible" is a gross exaggeration. > > The old mailman built-in search functionality is not fantastic, but it's > not useless either, and more importantly, Google does a great job of > indexing the archives. It really doesn't. I often need to look up specific emails in the mail.python.org archives, that I remember seeing or writing, in order to link to them. IME, Google never works for this. For whatever reason, most pages on mail.python.org are not included in Google's index. For example, here's a post of yours from a few weeks ago: https://mail.python.org/pipermail/python-ideas/2019-March/055911.html AFAICT, it is not possible to find that post with Google. For example, doing a site-restricted search with an exact quote from your email says that there are no pages that match: https://www.google.com/search?q="Is+that+common+enough+that+it+needs+to+be+built-in+to+dict+itself%3F"+site%3Amail.python.org (I also just tried a few variants of that search on Bing and DuckDuckGo, and they both failed as well.) 
The only reliable way that I know of to find emails on mail.python.org is to (a) find the email in my MUA's archives, (b) note author and the date it was sent, (c) navigate through the mailman archives 'date' index to narrow things down, and then click around manually until I find the post I'm looking for. I don't think this proves we should switch to using Github issues or something instead. But I do think we should listen when people say that they're struggling with something, instead of dismissing their concerns. -n -- Nathaniel J. Smith -- https://vorpus.org From turnbull.stephen.fw at u.tsukuba.ac.jp Thu Mar 28 23:06:25 2019 From: turnbull.stephen.fw at u.tsukuba.ac.jp (Stephen J. Turnbull) Date: Fri, 29 Mar 2019 12:06:25 +0900 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: References: <1E2D83BC-326E-4278-A46C-7FE4F7FE2560@getmailspring.com> <669F88F5-3CB2-4B33-A451-621307BDBD0F@killingar.net> Message-ID: <23709.35761.793748.866930@turnbull.sk.tsukuba.ac.jp> Anders Hovmöller writes: > Sure. And if library authors want to support older versions they'll > have to vendor this into their own code, You (indirectly) argue below that they can't, as a reason for including the change. You can't have it both ways. > just like always. This seems totally irrelevant to the > discussion. And it's of course irrelevant to all the end users that > aren't writing libraries but are using python directly. No, it's not "irrelevant". I wish we all would stop using that word, and trying to exclude others' arguments in this way. We are balancing equities here. We have a plethora of changes, on the one side taken by itself each of which is an improvement, but on the other taken as a group they greatly increase the difficulty of learning to read Python programs fluently. So we set a bar that the change must clear, and the ability of the change to clear it depends on the balance of equities.
In this case, where it requires C support and is not possible to "from __future__", the fact that library maintainers can't use it until they drop support for past versions of Python weakens the argument for the change by excluding important bodies of code from using it. > Putting it in a library virtually guarantees it will never become > popular. Factually, you're wrong. Many libraries have moved from PyPI to the stdlib, often very quickly as they prove their worth in a deliberate test. Also, here "popular" has a special meaning. It doesn't mean millions of downloads. It means people say they like it in blogs, recommend it to others, and start to post to Python development channels saying how much it improves their code and posting examples of how it does so. > And because we are talking about new methods on str, a > library that monkey patches on two new method on str won't become > popular for obvious reasons [specifically, it's impossible]. This is a valid point. But it doesn't need to be a monkey patch. Note that decimal was introduced with no literal syntax and is quite useful and used. If this change is going to prove it's tall enough to ride the stdlib ride, using a constructor for a derived class rather than str literal syntax shouldn't be too big a barrier to judging popularity (accounting for the annoyance of a constructor). Alternatively, the features could be introduced using functions. Steve From turnbull.stephen.fw at u.tsukuba.ac.jp Thu Mar 28 23:07:39 2019 From: turnbull.stephen.fw at u.tsukuba.ac.jp (Stephen J. Turnbull) Date: Fri, 29 Mar 2019 12:07:39 +0900 Subject: [Python-ideas] New Project to Capture summaries from this In-Reply-To: References: <001101d4e57a$7d4dd290$77e977b0$@ieee.org> Message-ID: <23709.35835.380175.588224@turnbull.sk.tsukuba.ac.jp> Chris Angelico writes: > General principle: People who complain about email are using > suboptimal clients, people who complain about non-email systems are > using suboptimal services. 
And there's nothing that's truly optimal in > either case (some email clients do come close, but only for their > current users, not for new users). There's a big difference, though. Email users choose their own email clients. If you choose GMail, well, "sorry, you chose GMail." (I understand that avoiding GMail on Apple handhelds is kinda hard, AppleMail sucking amazingly and all. But writing technical discussion on a phone is strictly for disasters, anyway, at least IME.) We don't have such a choice about the non-email service *at all*, except to use the inevitably sucky email interface. Theoretically one could use the API that clients-in-the-browser use and design and write a new client, but they're often not documented and often don't make backward compatibility promises. Practically, it's a massive undertaking. Not to mention that those alternative clients don't yet exist, whereas email clients that one can modify (or libraries to build one) exist in pretty much every language. Regards, Steve From rosuav at gmail.com Thu Mar 28 23:23:52 2019 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 29 Mar 2019 14:23:52 +1100 Subject: [Python-ideas] New Project to Capture summaries from this In-Reply-To: <23709.35835.380175.588224@turnbull.sk.tsukuba.ac.jp> References: <001101d4e57a$7d4dd290$77e977b0$@ieee.org> <23709.35835.380175.588224@turnbull.sk.tsukuba.ac.jp> Message-ID: On Fri, Mar 29, 2019 at 2:07 PM Stephen J. Turnbull wrote: > > Chris Angelico writes: > > > General principle: People who complain about email are using > > suboptimal clients, people who complain about non-email systems are > > using suboptimal services. And there's nothing that's truly optimal in > > either case (some email clients do come close, but only for their > > current users, not for new users). > > There's a big difference, though. Email users choose their own email > clients. If you choose GMail, well, "sorry, you chose GMail." 
(I > understand that avoiding GMail on Apple handhelds is kinda hard, > AppleMail sucking amazingly and all. But writing technical discussion > on a phone is strictly for disasters, anyway, at least IME.) That's half of my point (the distinction between "suboptimal clients" and "suboptimal services"), but the other half is that every time someone says "sorry, you chose Gmail", there's a lengthy discussion that ends up NOT showcasing any sort of perfect alternative - and often not even any *better* alternatives. Have you ever actually convinced someone to move off Gmail onto some other client? ChrisA From steve at pearwood.info Fri Mar 29 02:04:24 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 29 Mar 2019 17:04:24 +1100 Subject: [Python-ideas] New Project to Capture summaries from this In-Reply-To: References: <001101d4e57a$7d4dd290$77e977b0$@ieee.org> <20190328234933.GN31406@ando.pearwood.info> Message-ID: <20190329060424.GP31406@ando.pearwood.info> On Thu, Mar 28, 2019 at 06:43:39PM -0700, Nathaniel Smith wrote: > For example, here's a post of yours from a few weeks ago: > https://mail.python.org/pipermail/python-ideas/2019-March/055911.html > > AFAICT, it is not possible to find that post with Google. > > For example, > doing a site-restricted search with an exact quote from your email > says that there are no pages that match: > https://www.google.com/search?q="Is+that+common+enough+that+it+needs+to+be+built-in+to+dict+itself%3F"+site%3Amail.python.org *shrug* Maybe Google hasn't crawled Python-Ideas for March yet. Or maybe they are continuing their ambition to dumb everything down to the lowest common denominator by applying it to search now too. For many years now I've been disappointed at Google search's lack of *precision*: it is really good at finding the same super-popular pages, but not so good at finding *specific* pages even when given exact substrings from that page. Nevertheless, Google is not the only search engine in town. 
This was the first search I tried: https://duckduckgo.com/?q=site%3Ahttps%3A%2F%2Fmail.python.org+python-ideas+steven+d%27aprano+common+enough+builtin (second link is the desired page). Here's a second set of search terms which leads to the right page, without even using a site-specific search: https://duckduckgo.com/?q=steven+d%27aprano+%22common+enough%22+built-in+dict+python-ideas (sixth link) Google does seem worse at this though. Nevertheless, I take your point: the ability of search engines to hit a *specific* page containing an exact phrase seems pretty poor. (As opposed to finding a *popular* page related to the phrase.) But in context, we're not asking people to find a specific page, so this test is totally irrelevant. What we want people to do is find previous discussions. A link to any part of the thread would be sufficient. https://www.google.com/search?q=site%3Ahttps%3A%2F%2Fmail.python.org+python-ideas+dict+addition links to at least three relevant discussions on the first page. -- Steven From p.f.moore at gmail.com Fri Mar 29 04:11:20 2019 From: p.f.moore at gmail.com (Paul Moore) Date: Fri, 29 Mar 2019 08:11:20 +0000 Subject: [Python-ideas] New Project to Capture summaries from this In-Reply-To: References: <001101d4e57a$7d4dd290$77e977b0$@ieee.org> <23709.35835.380175.588224@turnbull.sk.tsukuba.ac.jp> Message-ID: Bah. Tablet client defaults to reply, not reply to all... On Fri, 29 Mar 2019 at 08:10, Paul Moore wrote: > On Fri, 29 Mar 2019 at 03:25, Chris Angelico wrote: > >> On Fri, Mar 29, 2019 at 2:07 PM Stephen J. Turnbull >> wrote: >> > >> > There's a big difference, though. Email users choose their own email >> > clients. If you choose GMail, well, "sorry, you chose GMail." (I >> > understand that avoiding GMail on Apple handhelds is kinda hard, >> > AppleMail sucking amazingly and all. But writing technical discussion >> > on a phone is strictly for disasters, anyway, at least IME.) 
>> >> That's half of my point (the distinction between "suboptimal clients" >> and "suboptimal services"), but the other half is that every time >> someone says "sorry, you chose Gmail", there's a lengthy discussion >> that ends up NOT showcasing any sort of perfect alternative - and >> often not even any *better* alternatives. Have you ever actually >> convinced someone to move off Gmail onto some other client? > > > As someone who uses gmail (the web interface) this is a good point. When > people say that there are all sorts of better alternative clients, no-one > has ever been able to offer one that actually satisfies my specific > requirements. Having said that, *in spite of having to use gmail* I still > strongly prefer mailing lists. > > I can't easily articulate why, but certainly one aspect of it is the fact > that there is no universally accepted alternative that gets proposed. One > time it's discourse, then it's github, then something else I've never heard > of... And every other project that gets quoted as having "successfully > switched" seems to use something different. So in my mind the question > isn't about mail or a particular alternative (that I can look at and form > an opinion about over time, as these discussions reoccur) but rather about > mail or "not mail" with an ever changing alternative that I have to > consider and re-assess from scratch each time. > > For me, mail wins as the stable alternative. And for something I spend so > much time on, stability is essential. > > Paul > -------------- next part -------------- An HTML attachment was scrubbed...
URL: From steve at pearwood.info Fri Mar 29 05:40:27 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 29 Mar 2019 20:40:27 +1100 Subject: [Python-ideas] New Project to Capture summaries from this In-Reply-To: References: <001101d4e57a$7d4dd290$77e977b0$@ieee.org> <23709.35835.380175.588224@turnbull.sk.tsukuba.ac.jp> Message-ID: <20190329094026.GQ31406@ando.pearwood.info> Without getting into the pros and cons of mailing lists versus github versus discourse versus Stackoverflow versus ... On Fri, Mar 29, 2019 at 02:23:52PM +1100, Chris Angelico wrote: > Have you ever actually > convinced someone to move off Gmail onto some other client? Gmail is an email provider with a web interface. As a provider, it is available on any email client, so long as it speaks POP3 or IMAP. It's even officially supported: https://support.google.com/mail/answer/7126229?hl=en (Perhaps I was a bit hasty earlier when I accused Google of trying to dumb *everything* down. Credit where credit is due.) I've never tried this myself, but I know of people who have. -- Steven From rosuav at gmail.com Fri Mar 29 07:20:41 2019 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 29 Mar 2019 22:20:41 +1100 Subject: [Python-ideas] New Project to Capture summaries from this In-Reply-To: <20190329094026.GQ31406@ando.pearwood.info> References: <001101d4e57a$7d4dd290$77e977b0$@ieee.org> <23709.35835.380175.588224@turnbull.sk.tsukuba.ac.jp> <20190329094026.GQ31406@ando.pearwood.info> Message-ID: On Fri, Mar 29, 2019 at 8:41 PM Steven D'Aprano wrote: > On Fri, Mar 29, 2019 at 02:23:52PM +1100, Chris Angelico wrote: > > > Have you ever actually > > convinced someone to move off Gmail onto some other client? > > Gmail is an email provider with a web interface. As a provider, it is > available on any email client, so long as it speaks POP3 or IMAP. In those terms, I'm talking about convincing someone to move off the _Gmail client_, whether or not they continue using the Gmail service. 
Have you ever convinced someone to stop using the Gmail web interface and start using some other client, as a means of improving their use of the mailing list (eg to resolve some complaints they were having)? ChrisA From steve at pearwood.info Fri Mar 29 07:46:27 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 29 Mar 2019 22:46:27 +1100 Subject: [Python-ideas] New Project to Capture summaries from this In-Reply-To: References: <001101d4e57a$7d4dd290$77e977b0$@ieee.org> <23709.35835.380175.588224@turnbull.sk.tsukuba.ac.jp> <20190329094026.GQ31406@ando.pearwood.info> Message-ID: <20190329114626.GR31406@ando.pearwood.info> On Fri, Mar 29, 2019 at 10:20:41PM +1100, Chris Angelico wrote: > In those terms, I'm talking about convincing someone to move off the > _Gmail client_, whether or not they continue using the Gmail service. > Have you ever convinced someone to stop using the Gmail web interface > and start using some other client, as a means of improving their use > of the mailing list (eg to resolve some complaints they were having)? Specifically because they complained about *mailing lists*? No. Because they didn't like the Gmail web interface in general, or preferred to use some other email client (such as Outlook)? No, because they didn't need convincing, they already wanted to keep their email client. Make of that what you will :-) -- Steven From brett at python.org Fri Mar 29 14:52:32 2019 From: brett at python.org (Brett Cannon) Date: Fri, 29 Mar 2019 11:52:32 -0700 Subject: [Python-ideas] Unified style of cache management API In-Reply-To: References: Message-ID: On Thu, Mar 28, 2019 at 10:53 AM Serhiy Storchaka wrote: > 28.03.19 19:45, Brett Cannon wrote: > > So I would say that a cache-clearing function convention would be a > > reasonable starting point. If that turns out to not be enough for folks > > we can talk about expanding it, but I think we should start small and > > grow from there as needed. > > > > So what name would people want.
clear_cache() or _clear_cache()? I > personally prefer the latter since clearing the cache shouldn't be > something people should typically need to do and thus the function is an > implementation detail. > > This is an interesting idea. I think it should be a dunder name: > __clearcache__() or __clear_cache__(). > Between those two then I would go with __clearcache__(), but I was talking about a naming scheme for a function in each module, not a new built-in in case you thought I meant that. -Brett > The disadvantage is that this method is slower in the case of large > number of imported modules. > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pythonchb at gmail.com Fri Mar 29 19:05:55 2019 From: pythonchb at gmail.com (Christopher Barker) Date: Fri, 29 Mar 2019 16:05:55 -0700 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: <23709.35761.793748.866930@turnbull.sk.tsukuba.ac.jp> References: <1E2D83BC-326E-4278-A46C-7FE4F7FE2560@getmailspring.com> <669F88F5-3CB2-4B33-A451-621307BDBD0F@killingar.net> <23709.35761.793748.866930@turnbull.sk.tsukuba.ac.jp> Message-ID: Could we try to keep the discussion about the topic at hand? There are a broad set of considerations that apply to any change, but they don't all apply equally to all proposals. The proposal at hand is to add two fairly straightforward methods to string. So: > > > We are balancing equities here. We have a plethora of changes, on the > one side taken by itself each of which is an improvement, but on the > other taken as a group they greatly increase the difficulty of > learning to read Python programs fluently.
But tiny. So ?irrelevant? may be appropriate here. So we set a bar that the > change must clear, and the ability of the change to clear it depends > on the balance of equities. Exactly ? small impact, low bar. In this case, where it requires C support and is not possible to "from > __future__", the fact that library maintainers can't use it until they > drop support for past versions of Python weakens the argument for the > change by excluding important bodies of code from using it. But there is no need for __future__ ? it?s not a breaking change. It could be back ported to any version we want. Same as a __future__ import. > Putting it in a library virtually guarantees it will never become > > popular. > > Factually, you're wrong. I don?t think he is, and I made the same point earlier. He did not say that no PyPi libs become popular or are brought into the stdlib, He said that this particular proposal is not suited to that. Do you really think a lib with two (or a few, though no one yet has suggested anymore) almost trivial string functions will gain any traction?? > > Note that decimal was introduced with no literal syntax and is quite > useful and used. But Decimal isn?t a float with a couple extra handy methods. If it didn?t provide significant extra functionality, no one would use it. And many strings in our code are created from other code ? so then you?d need to wrap MySpecialString() around every function call that produced a string. Again? it?s not going to happen. Alternatively, the features could be introduced using functions. Better than a custom class, but still too awkward to bother with for this ? see a previous post of mine for more detail. This proposal would provide a minor gain for an even more minor disruption. 
-CHB -- Christopher Barker, PhD Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Fri Mar 29 21:37:36 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 30 Mar 2019 12:37:36 +1100 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: References: <1E2D83BC-326E-4278-A46C-7FE4F7FE2560@getmailspring.com> <669F88F5-3CB2-4B33-A451-621307BDBD0F@killingar.net> <23709.35761.793748.866930@turnbull.sk.tsukuba.ac.jp> Message-ID: <20190330013736.GU31406@ando.pearwood.info> On Fri, Mar 29, 2019 at 04:05:55PM -0700, Christopher Barker wrote: > This proposal would provide a minor gain for an even more minor disruption. I don't think that is correct. I think you are underestimating the gain and exaggerating the disruption :-) Cutting a prefix or suffix from a string is a common task, and there is no obvious "battery" in the std lib available for it. And there is a long history of people mistaking strip() and friends as that battery. The problem is that it seems to work: py> "something.zip".rstrip(".zip") 'something' until it doesn't: py> "something.jpg".rstrip(".jpg") 'somethin' It is *very common* for people to trip over this and think they have found a bug: https://duckduckgo.com/?q=python+bug+in+strip I would guesstimate that for every person who thinks that they found a bug, there are probably a hundred who trip over this and then realise their error without ever going public. I believe this is a real pain point for people doing string processing. I know it has bitten me once or twice. The correct solution is a verbose statement: if string.startswith("spam"): string = string[len("spam"):] which repeats itself (*two* references to the prefix being removed, *three* references to the string being cut).
The expression form is no better: process(a, b, string[len("spam"):] if string.startswith("spam") else string, c) and heaven help you if you need to cut from both ends. To make that practical, you really need a helper function. Now that's fine as far as it goes, but why do we make people re-invent the wheel over and over again? A pair of "cut" methods (cut prefix, cut suffix) fills a real need, and will avoid a lot of mistaken bug reports/questions. As for the disruption, I don't see that this will cause *any* disruption at all, beyond bike-shedding the method names and doing an initial implementation. It is a completely backwards compatible change. Since we can't monkey-patch builtins, this isn't going to break anyone's use of str. Any subclasses of str which define the same methods will still work. I've sometimes said in the past that any change will break *someone's* code, and so we should be risk-averse. I still stand by that, but we shouldn't be *so risk-averse* that we're paralysed. Breaking users' code is a cost, but there is also the uncounted opportunity cost of *not* adding this useful battery. If we don't add these new methods, how many hundreds of users over the next decade will we condemn to repeating the same old misuse of strip() that has occurred so often in the past? How much developer time will be wasted writing, and then closing, bug reports like this? https://bugs.python.org/issue5318 Inaction has costs too. I can only think of one scenario where this change might break someone's code: - we decide on method names (let's say) lcut and rcut; - somebody else already has a class with lcut and rcut; - which does something completely different; - and they use hasattr() to decide whether to call those methods, rather than isinstance: if hasattr(myobj, 'lcut'): print(myobj.lcut(1, 2, 3, 4)) else: # do something else - and they sometimes pass strings into this code. In 3.7 and older, ordinary strings will take the second path.
If we add these methods, they will take the first path. But the chances of this actually being more than a trivially small problem for anyone in real life is so small that I don't know why I even raise it. This isn't a minor disruption. It's a small possibility of a minor disruption to a tiny set of users who can fix the breakage easily. The functionality is clear, meets a real need, is backwards compatible, and has no significant downsides. The only hard part is bikeshedding names for the methods: lcut rcut cutprefix cutsuffix ltrim rtrim prestrip poststrip etc. Am I wrong about any of these statements? -- Steven From cs at cskk.id.au Fri Mar 29 22:19:02 2019 From: cs at cskk.id.au (Cameron Simpson) Date: Sat, 30 Mar 2019 13:19:02 +1100 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: <20190330013736.GU31406@ando.pearwood.info> References: <20190330013736.GU31406@ando.pearwood.info> Message-ID: <20190330021902.GA91102@cskk.homeip.net> On 30Mar2019 12:37, Steven D'Aprano wrote: >On Fri, Mar 29, 2019 at 04:05:55PM -0700, Christopher Barker wrote: >> This proposal would provide a minor gain for an even more minor >> disruption. > >I don't think that is correct. I think you are underestimating the gain >and exaggerating the disruption :-) > >Cutting a prefix or suffix from a string is a common task, and there is >no obvious "battery" in the std lib available for it. And there is a >long history of people mistaking strip() and friends as that battery. >The problem is that it seems to work: > >py> "something.zip".rstrip(".zip") >'something' > >until it doesn't: > >py> "something.jpg".rstrip(".jpg") >'somethin' Yeah, this is a very common mistake. I don't think I've made it myself (not really sure why, except that I use strip a lot to remove whitespace so I don't think about the file extension thing for it). But I've seen people make this mistake.
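To spell out why this mistake is so easy to make: rstrip()'s argument is a *set* of characters to remove from the end, not a literal suffix. A small sketch of the trap and the verbose workaround discussed above:

```python
# rstrip() removes a trailing run of characters drawn from the given
# set, not a literal suffix -- which is exactly the trap above.
assert "something.zip".rstrip(".zip") == "something"  # happens to look right
assert "something.jpg".rstrip(".jpg") == "somethin"   # trailing 'g' is in the set {'.', 'j', 'p', 'g'}

# The safe (if verbose) way to cut a literal suffix:
name = "something.jpg"
suffix = ".jpg"
if name.endswith(suffix):
    name = name[:-len(suffix)]
assert name == "something"
```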
And personally I strip prefixes or suffixes from strings a lot and the "measure the suffix and get s[:-len(suffix)]" shuffle is tedious. Also I need to decode that shuffle in my head every time I see it _and_ debug it because in the file extension case I'm always concerned as to whether it gets the "." separator or not. With .cutsuffix('.foo') it is really obvious and unambiguous. Also, I'm curious - how often do people use strip() to strip stuff other than whitespace? It is rare or unknown for myself. So I am a data point for the individually small but common gain. [...adding a method to str is only going to break quite weird code...] >The functionality is clear, meets a real need, is backwards compatible, >and has no significant downsides. The only hard part is bikeshedding >names for the methods: > > lcut rcut > cutprefix cutsuffix > ltrim rtrim > prestrip poststrip > etc. > >Am I wrong about any of these statements? I do not think so. I agree with everything you've said, anyway. For the shed: I'm a big -1 on ltrim and rtrim because of confusion with the VERY well known PHP trim function which does something else. I like lcut/rcut as succinct and reminiscent of the UNIX "cut" command. I like cutprefix and cutsuffix even more, as having similar heft as the startswith and endswith methods. I dislike prestrip and poststrip because of their similarity to strip, which like the PHP trim does something else. -1 here too.
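For concreteness, the semantics being bikeshedded here can be sketched as plain functions; the names cutprefix/cutsuffix are just the candidates from this thread, not an existing API:

```python
def cutprefix(s: str, prefix: str) -> str:
    """Return s with prefix removed if present, else s unchanged."""
    return s[len(prefix):] if prefix and s.startswith(prefix) else s

def cutsuffix(s: str, suffix: str) -> str:
    """Return s with suffix removed if present, else s unchanged."""
    # The `suffix and` guard matters: s[:-len("")] is s[:0], i.e. "",
    # so an unguarded version would eat the whole string for an empty suffix.
    return s[:-len(suffix)] if suffix and s.endswith(suffix) else s

assert cutsuffix("something.jpg", ".jpg") == "something"
assert cutsuffix("something.zip", ".zip") == "something"
assert cutprefix("spam and eggs", "spam ") == "and eggs"
assert cutsuffix("no match here", ".jpg") == "no match here"
```

Note that, unlike strip(), a miss leaves the string untouched rather than nibbling characters off the end.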
Cheers, Cameron Simpson From pythonchb at gmail.com Sat Mar 30 03:15:00 2019 From: pythonchb at gmail.com (Christopher Barker) Date: Sat, 30 Mar 2019 00:15:00 -0700 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: <20190330013736.GU31406@ando.pearwood.info> References: <1E2D83BC-326E-4278-A46C-7FE4F7FE2560@getmailspring.com> <669F88F5-3CB2-4B33-A451-621307BDBD0F@killingar.net> <23709.35761.793748.866930@turnbull.sk.tsukuba.ac.jp> <20190330013736.GU31406@ando.pearwood.info> Message-ID: On Fri, Mar 29, 2019 at 6:38 PM Steven D'Aprano wrote: > On Fri, Mar 29, 2019 at 04:05:55PM -0700, Christopher Barker wrote: > > > This proposal would provide a minor gain for an even more minor > disruption. > > I don't think that is correct. I think you are underestimating the gain > and exaggerating the disruption :-) I am very confused - I made that statement in response to a post of yours (unless I got the attribution wrong) in which you seemed to be arguing against the proposal. But if I goaded you into making a strong case that I completely agree with -- great! And for the record: I have put that very bug into production code. And I do use strip() with things other than white space -- though never multiple characters (at least not on purpose) -CHB -- Christopher Barker, PhD Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython -------------- next part -------------- An HTML attachment was scrubbed...
URL: From steve at pearwood.info Sat Mar 30 04:29:25 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 30 Mar 2019 19:29:25 +1100 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: References: <1E2D83BC-326E-4278-A46C-7FE4F7FE2560@getmailspring.com> <669F88F5-3CB2-4B33-A451-621307BDBD0F@killingar.net> <23709.35761.793748.866930@turnbull.sk.tsukuba.ac.jp> <20190330013736.GU31406@ando.pearwood.info> Message-ID: <20190330082925.GY31406@ando.pearwood.info> On Sat, Mar 30, 2019 at 12:15:00AM -0700, Christopher Barker wrote: > On Fri, Mar 29, 2019 at 6:38 PM Steven D'Aprano wrote: > > > On Fri, Mar 29, 2019 at 04:05:55PM -0700, Christopher Barker wrote: > > > > > This proposal would provide a minor gain for an even more minor > > disruption. > > > > I don't think that is correct. I think you are underestimating the gain > > and exaggerating the disruption :-) > > > I am very confused - I made that statement in response to a post of yours ( > unless I got the attribution wrong) in which you seemed to be arguing > against the proposal. That was the other Steven, the one who spells his name with a PH instead of a V :-) -- Steven From p.f.moore at gmail.com Sat Mar 30 06:21:23 2019 From: p.f.moore at gmail.com (Paul Moore) Date: Sat, 30 Mar 2019 10:21:23 +0000 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: References: <1E2D83BC-326E-4278-A46C-7FE4F7FE2560@getmailspring.com> <669F88F5-3CB2-4B33-A451-621307BDBD0F@killingar.net> <23709.35761.793748.866930@turnbull.sk.tsukuba.ac.jp> Message-ID: On Fri, 29 Mar 2019 at 23:07, Christopher Barker wrote: > The proposal at hand is to add two fairly straightforward methods to string. So: Some of what you are calling digressions are actually questioning the design choice behind that proposal. Specifically, there's no particular justification given for making these methods rather than standalone functions. But OK, let's stick to the points you want to make here. 
>> > We are balancing equities here. We have a plethora of changes, on the >> one side taken by itself each of which is an improvement, but on the >> other taken as a group they greatly increase the difficulty of >> learning to read Python programs fluently. > > Unless the methods are really poorly named, then this will make them maybe a tiny bit more readable, not less. But tiny. So "irrelevant" may be appropriate here. And how do we decide if they are poorly named, given that it's *very* hard to get real-world usage experience for a core Python change before it's released (essentially no-one uses pre-releases for anything other than testing that the release doesn't break their code). Note that the proposed name (trim) is IMO "poorly named", because a number of languages in my experience use that name for what Python calls "strip", so there would be continual confusion (for me at least) over which name meant which behaviour... >> So we set a bar that the >> change must clear, and the ability of the change to clear it depends >> on the balance of equities. > > Exactly -- small impact, low bar. If we accept your statement that it's a small impact. I contend that the confusion that this would cause between strip and trim is not small. It's not *huge*, but it's not small... We can agree to differ, but if we do then don't expect me to agree to your statement that the bar can be low, you need to persuade me to agree that the impact is low if you want me to agree on the matter of the bar. >> In this case, where it requires C support and is not possible to "from >> __future__", the fact that library maintainers can't use it until they >> drop support for past versions of Python weakens the argument for the >> change by excluding important bodies of code from using it. > > But there is no need for __future__ -- it's not a breaking change. It could be back ported to any version we want. Same as a __future__ import.
OTOH, it's a new feature, so it won't be acceptable for backporting. Sorry, but those are the rules. What "we" want isn't relevant here, unless the "we" in question is the Python core devs, and the core devs have established the "no backports of new features" rule over many years, and won't be likely to change it for something that you yourself are describing as "small impact". Paul From evrial at gmail.com Sat Mar 30 06:29:57 2019 From: evrial at gmail.com (Alex Grigoryev) Date: Sat, 30 Mar 2019 12:29:57 +0200 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: References: Message-ID: <061FBF45-02FD-4C3E-B792-EB09C05ED98E@getmailspring.com> To me it is really surprising that a 28-year-old language has some weird methods like str.swapcase(), but none to cut a string from the left or right, and the two that do exist only accept a string mask. On Mar 30 2019, at 12:21 PM, Paul Moore wrote: > On Fri, 29 Mar 2019 at 23:07, Christopher Barker wrote: > > The proposal at hand is to add two fairly straightforward methods to string. So: > > > Some of what you are calling digressions are actually questioning the > design choice behind that proposal. Specifically, there's no > particular justification given for making these methods rather than > standalone functions. But OK, let's stick to the points you want to > make here. > > > > > We are balancing equities here. We have a plethora of changes, on the > > > one side taken by itself each of which is an improvement, but on the > > > other taken as a group they greatly increase the difficulty of > > > learning to read Python programs fluently. > > > > > > Unless the methods are really poorly named, then this will make them maybe a tiny bit more readable, not less. But tiny. So “irrelevant” may be appropriate here.
> And how do we decide if they are poorly named, given that it's *very* > hard to get real-world usage experience for a core Python change > before it's released (essentially no-one uses pre-releases for > anything other than testing that the release doesn't break their > code). > > Note that the proposed name (trim) is IMO "poorly named", because a > number of languages in my experience use that name for what Python > calls "strip", so there would be continual confusion (for me at least) > over which name meant which behaviour... > > > > So we set a bar that the > > > change must clear, and the ability of the change to clear it depends > > > on the balance of equities. > > > > > > Exactly — small impact, low bar. > If we accept your statement that it's a small impact. I contend that > the confusion that this would cause between strip and trim is not > small. It's not *huge*, but it's not small... We can agree to differ, > but if we do then don't expect me to agree to your statement that the > bar can be low, you need to persuade me to agree that the impact is > low if you want me to agree on the matter of the bar. > > > > In this case, where it requires C support and is not possible to "from > > > __future__", the fact that library maintainers can't use it until they > > > drop support for past versions of Python weakens the argument for the > > > change by excluding important bodies of code from using it. > > > > > > But there is no need for __future__ — it's not a breaking change. It could be backported to any version we want. Same as a __future__ import. > OTOH, it's a new feature, so it won't be acceptable for backporting. > Sorry, but those are the rules. What "we" want isn't relevant here, > unless the "we" in question is the Python core devs, and the core devs > have established the "no backports of new features" rule over many > years, and won't be likely to change it for something that you > yourself are describing as "small impact".
> > Paul > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.f.moore at gmail.com Sat Mar 30 07:40:18 2019 From: p.f.moore at gmail.com (Paul Moore) Date: Sat, 30 Mar 2019 11:40:18 +0000 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: <061FBF45-02FD-4C3E-B792-EB09C05ED98E@getmailspring.com> References: <061FBF45-02FD-4C3E-B792-EB09C05ED98E@getmailspring.com> Message-ID: On Sat, 30 Mar 2019 at 10:29, Alex Grigoryev wrote: > To me it is really surprising that a 28-year-old language has some weird > methods like str.swapcase(), but none to cut a string from the left or right, and > the two that do exist only accept a string mask. > As someone who was programming 28 years ago, I can confirm that the things that were "obviously useful" that long ago are vastly different from the things that are "obviously useful" today. Requirements evolve, use cases evolve, languages evolve. The good thing about general purpose languages like Python (as opposed to languages like SQL, which I use in my day job) is that you can easily handle new requirements by writing your own functions and utilities, which takes the pressure off the language design to keep up with every change in requirements and trends. The bad thing about it is that it's sometimes difficult to distinguish between significant improvements (which genuinely warrant language/stdlib changes) and insignificant ones (which can be handled by "write your own function"). Things like str.swapcase are a good example of that experience. It probably seemed like a useful little function at the time, not much overhead, maybe useful, people coming from C had something like this and found it helpful, so why not?
But then Unicode came along, and there was a chunk of maintenance work needed to update swapcase. And there were probably bugs that got fixed. And as you point out, the function is probably barely ever used nowadays. So was it worth the effort invested in adding it, and maintaining it all those years? "It's only a simple addition of a straightforward string method". Is str.trim like str.swapcase, or like str.split? Who knows, at this point? The best any of us with the experience of seeing proposals like this come up regularly can do, is to push back, make the proposer justify the suggestion, try to make the proposer consider whether while his idea seems great right now, will it feel more like str.swapcase in a few years? And sometimes that pushback *is* too conservative, and an idea is good. But it still needs someone to implement it, document it, and integrate it into the language - the proposer isn't always able (or willing) to do that, so again there's a question of who does the work? In the case of str.swapcase, the "proposer" was probably the person implementing the str class, and so they did the work and it was very little extra to do. Nowadays the str class is a lot more complex, and Unicode rules are far less straightforward than ASCII was 28 years ago - so maybe now they wouldn't have bothered[1]. Sorry, that went a lot further than I originally intended - hopefully it's useful background, though. Paul [1] One thing I don't know (apologies if it's been answered earlier in the thread). Are you expecting to implement this change yourself (you'd need to know C, so it's perfectly OK if the answer is no), and if so, have you tried to do so? Personally, I don't have any feel for how complex the proposed new methods would be to implement, but I'd be much more willing to accept the word of someone who's written the code that it's a "simple change". How easy it is to implement isn't the whole story (as I mentioned above) but it is relevant. 
-------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Sat Mar 30 07:39:45 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 30 Mar 2019 22:39:45 +1100 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: <23709.35761.793748.866930@turnbull.sk.tsukuba.ac.jp> References: <1E2D83BC-326E-4278-A46C-7FE4F7FE2560@getmailspring.com> <669F88F5-3CB2-4B33-A451-621307BDBD0F@killingar.net> <23709.35761.793748.866930@turnbull.sk.tsukuba.ac.jp> Message-ID: <20190330113944.GA6059@ando.pearwood.info> On Fri, Mar 29, 2019 at 12:06:25PM +0900, Stephen J. Turnbull wrote: > Anders Hovmöller writes: [...] > > just like always. This seems totally irrelevant to the > > discussion. And it's of course irrelevant to all the end users that > > aren't writing libraries but are using python directly. > > No, it's not "irrelevant". I wish we all would stop using that word, > and trying to exclude others' arguments in this way. I won't comment on Anders' claim that this issue is irrelevant to the discussion, but I think he is correct about it being irrelevant to "all the end users that aren't writing libraries but are using python directly" -- or at least those on the cutting edge of 3.8. There are lots of people who will soon be using nothing older than 3.8, and they will no more care that 3.7 lacks this feature than they will care that Python 1.5 lacks Unicode, iterators, and new-style classes. More power to them :-) For the sake of the argument, I'll grant your point that libraries which support older versions of Python cannot use the new feature[1]. But those libraries, and their users, are no worse off by adding a string method which they can't yet use. They will simply continue doing whatever it is that they already do, which will remain backward compatible to 3.3 or 2.7 or however far back they go.
And some day they will have dropped support for 3.7 and older, and will be able to use all the new shiny features in 3.8. After all, if "libraries that support old versions can't use this feature" was a reason to reject new features, we would never have added *any* new feature past those available in Python 1.0. New features are added for the benefit of the present and the future, not for the past. > We are balancing equities here. Indeed, and a certain level of caution is justified -- but not so much as to cause paralysis and stagnation. There's a word for a language which has stopped changing: "dead". > We have a plethora of changes, on the > one side taken by itself each of which is an improvement, but on the > other taken as a group they greatly increase the difficulty of > learning to read Python programs fluently. "Greatly"? Is it truly that hard to go help(str.cutprefix) at the interactive interpreter, or look it up in the docs? I mean, if a simple string method causes a developer that much confusion, imagine how badly they will cope with async! You can't read Python programs fluently unless you understand the custom functions and classes in that program. Compared to that, I don't think that it is especially difficult to learn what a couple of new methods do. Especially if their name is self-documenting. [...] > > Putting it in a library virtually guarantees it will never become > > popular. > > Factually, you're wrong. Many libraries have moved from PyPI to the > stdlib, often very quickly as they prove their worth in a deliberate > test. The Python community is not the Javascript community, we don't tend to download tiny one-or-two line libraries. 
And that is a good thing: https://medium.com/commitlog/the-internet-is-at-the-mercy-of-a-handful-of-people-73fac4bc5068 Putting aside all those who are prohibited from using unapproved third-party libraries -- and there are a lot of them, from students using locked-down machines to corporate and government users where downloading unapproved software is grounds for instant dismissal -- I think most people simply couldn't be bothered installing and importing a package that offered something as simple as a couple of "cut" functions. While it's true that not every two-line function needs to be in the stdlib, it's often better to have it in the stdlib than expect ten thousand people to write the same two-line function over and over again. > Note that decimal was introduced with no literal syntax and is quite > useful and used. It was also added straight into the stdlib without being forced to go through the "third-party library" stage first, and with minimal discussion: https://mail.python.org/pipermail/python-dev/2003-October/thread.html If there was ever a module which *could* have proven itself as a third-party library on PyPI, it was probably Decimal. It adds an entire new numeric class, one with significant advantages (and some disadvantages) over binary floats, not just a couple of lines of code. Re-inventing the wheel is impractical: few people have the numeric know-how to duplicate that wheel, and for those who can, it would take a massive amount of effort: the Python version is over 6000 lines (including blanks and comments). If you want a Decimal type, it isn't practical to write one yourself. > If this change is going to prove it's tall enough to > ride the stdlib ride, using a constructor for a derived class rather > than str literal syntax shouldn't be too big a barrier to judging > popularity (accounting for the annoyance of a constructor).
There's little difference between writing MyDecimal("1.2") versus Decimal("1.2"), but there's a huge annoyance factor in having to write MyString("hello world") instead of "hello world". Especially when all you want is to add a single new method and instead you have to override a dozen or more methods to return instances of MyString. And then you pass it to some function or library, and it returns a regular string again. So you're constantly playing whack-a-mole trying to discover why your MyString subclass objects are turning into regular built-in strings when you least expect it. Forget it. That's a serious PITA. > Alternatively, the features could be introduced using functions. We specifically added a str class with methods to get away from the functions in the string module, and you want to bring them back? I think the bar for adding string functions into the string module should be much higher than adding a couple of lightweight methods. [1] Actually, they can. As we know from the transition from 2 to 3, there is often a perfectly viable solution for libraries that want to support old versions. Here is some actual code taken from one of my modules which works back to Python 2.4:

    try:
        casefold = str.casefold  # Added in 3.3 (I think).
    except AttributeError:
        # Fall back version is not as good, but is good enough.
        casefold = str.lower

So even libraries that support Python 2 can get the advantage of an accelerated C method by using this technique with a fallback to whatever they are currently using:

    try:
        lcut = str.cutprefix
    except AttributeError:
        # Fall back to pure Python version.
        def lcut(astring, prefix):
            ...
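Filled out, that fallback pattern is runnable as-is (str.cutprefix is the hypothetical method under discussion, so on any current Python it is the except branch that actually executes):

```python
try:
    # Hypothetical: would bind the C-accelerated method if it ever lands.
    lcut = str.cutprefix
except AttributeError:
    # Fall back to a pure-Python version with the proposed semantics.
    def lcut(astring, prefix):
        if prefix and astring.startswith(prefix):
            return astring[len(prefix):]
        return astring

print(lcut("spam-and-eggs", "spam-"))  # -> and-eggs
print(lcut("eggs", "spam-"))           # -> eggs (no prefix, returned unchanged)
```

As with the casefold example, the fast path costs nothing on interpreters that lack the method: the fallback simply takes over.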
-- Steven (the other one) From boxed at killingar.net Sat Mar 30 07:46:59 2019 From: boxed at killingar.net (Anders Hovmöller) Date: Sat, 30 Mar 2019 12:46:59 +0100 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: References: <1E2D83BC-326E-4278-A46C-7FE4F7FE2560@getmailspring.com> <669F88F5-3CB2-4B33-A451-621307BDBD0F@killingar.net> <23709.35761.793748.866930@turnbull.sk.tsukuba.ac.jp> Message-ID: <2BB1EE45-84FE-44F4-8666-A3F52ADEA06E@killingar.net> > On 30 Mar 2019, at 11:21, Paul Moore > > Note that the proposed name (trim) is IMO "poorly named", because a > number of languages in my experience use that name for what Python > calls "strip", so there would be continual confusion (for me at least) > over which name meant which behaviour... That isn't the proposal as it stands now. The consensus among those supporting the idea is "strip_prefix" and "strip_suffix". Let's debate that. / Anders From storchaka at gmail.com Sat Mar 30 08:04:59 2019 From: storchaka at gmail.com (Serhiy Storchaka) Date: Sat, 30 Mar 2019 14:04:59 +0200 Subject: [Python-ideas] Unified style of cache management API In-Reply-To: References: Message-ID: 29.03.19 20:52, Brett Cannon wrote: > On Thu, Mar 28, 2019 at 10:53 AM Serhiy Storchaka > > wrote: > > 28.03.19 19:45, Brett Cannon wrote: > > So I would say that a cache-clearing function convention would be a > > reasonable starting point. If that turns out to not be enough for folks > > we can talk about expanding it, but I think we should start small and > > grow from there as needed. > > > > So what name would people want. clear_cache() or _clear_cache()? I > > personally prefer the latter since clearing the cache shouldn't be > > something people should typically need to do and thus the function is an > > implementation detail. > > This is an interesting idea. I think it should be a dunder name: > __clearcache__() or __clear_cache__().
> > > Between those two then I would go with __clearcache__(), but I was > talking about a naming scheme for a function in each module, not a new > built-in in case you thought I meant that. Then I understood you correctly. Only now I saw that your words could have been understood differently. Opened https://bugs.python.org/issue36485 for implementing this idea. From steve at pearwood.info Sat Mar 30 08:14:39 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 30 Mar 2019 23:14:39 +1100 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: References: <1E2D83BC-326E-4278-A46C-7FE4F7FE2560@getmailspring.com> <669F88F5-3CB2-4B33-A451-621307BDBD0F@killingar.net> <23709.35761.793748.866930@turnbull.sk.tsukuba.ac.jp> Message-ID: <20190330121439.GB31406@ando.pearwood.info> On Sat, Mar 30, 2019 at 10:21:23AM +0000, Paul Moore wrote: > On Fri, 29 Mar 2019 at 23:07, Christopher Barker wrote: > > The proposal at hand is to add two fairly straightforward methods to string. So: > > Some of what you are calling digressions are actually questioning the > design choice behind that proposal. Specifically, there's no > particular justification given for making these methods rather than > standalone functions. Strings are objects and Python is an object-oriented language. Surely the default presumption ought to be that string functionality goes into the string object as methods, not a separate function, unless they're so specialised, or so large and unwieldy, that they ought to go into a module. There's a cost to moving functionality into a separate module. It's harder to discover functions buried in a module when most of the string functionality is in str itself, and it's rather a nuisance to write (e.g.): unicodedata.name('x') instead of 'x'.name(). I use unicodedata *a lot* and there is never a time I didn't wish it was built into str instead. Once upon a time all the useful str methods were functions in the string module.
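For concreteness, the module-function spelling being complained about looks like this (a small demo of the existing stdlib API, not part of the proposal):

```python
import unicodedata

# Character names live in a separate module rather than on str...
print(unicodedata.name("x"))  # -> LATIN SMALL LETTER X
print(unicodedata.name("ñ"))  # -> LATIN SMALL LETTER N WITH TILDE

# ...while most other string functionality is spelled as a method:
print("x".upper())            # -> X
```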
I don't think we should be re-introducing that annoyance. [...] > And how do we decide if they are poorly named, given that it's *very* > hard to get real-world usage experience for a core Python change > before it's released (essentially no-one uses pre-releases for > anything other than testing that the release doesn't break their > code). By common sense. s.xyahezgnfspwq(prefix) s.lt(prefix) s.remove_the_prefix_but_only_if_it_exists_on_the_left(prefix) would all be poorly named. We surely don't need real-world usage experience to know that. Eliminate the obviously bad names, and you're left with names which ought to be at least reasonable. > Note that the proposed name (trim) is IMO "poorly named", because a > number of languages in my experience use that name for what Python > calls "strip", so there would be continual confusion (for me at least) > over which name meant which behaviour... Well there you go, you've just answered your own question about how to tell if the name is poor. That's why we have this list :-) Personally, I don't mind "ltrim" and "rtrim". We're not obliged to present the precise same method names as other languages, any more than they're obliged to consider Python's names. But since you and some others may be confused, I'm happy to bike-shed alternatives. I especially like: cutprefix, cutsuffix lcut, rcut strip_prefix, strip_suffix but presumably somebody will object to them too :-) We didn't need a lot of real-world experience before deciding on async etc, and the disruption risked by adding new keywords is *much* more serious than adding a couple of string methods. [...] > OTOH, it's a new feature, so it won't be acceptable for backporting. Indeed. I'm not sure why we're talking about backporting. It isn't going to happen, so let's just move on. 
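Whichever spelling wins the bike-shedding, the proposed behaviour is small enough to pin down as a pure-Python sketch (the names here are just two of the candidates listed above):

```python
def cutprefix(s, prefix):
    """Return s without prefix if it is present, else s unchanged."""
    return s[len(prefix):] if prefix and s.startswith(prefix) else s

def cutsuffix(s, suffix):
    """Return s without suffix if it is present, else s unchanged."""
    return s[:-len(suffix)] if suffix and s.endswith(suffix) else s

print(cutsuffix("photo.jpg", ".jpg"))  # -> photo
print(cutsuffix("photo.png", ".jpg"))  # -> photo.png (no match, unchanged)
print(cutprefix("__init__", "__"))     # -> init__
```

Note the empty-affix guard in cutsuffix: without it, `s[:-0]` would wrongly return the empty string.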
-- Steven From steve at pearwood.info Sat Mar 30 08:41:06 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 30 Mar 2019 23:41:06 +1100 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: References: <061FBF45-02FD-4C3E-B792-EB09C05ED98E@getmailspring.com> Message-ID: <20190330124105.GC31406@ando.pearwood.info> On Sat, Mar 30, 2019 at 11:40:18AM +0000, Paul Moore wrote: > Is str.trim like str.swapcase, or like str.split? Who knows, at this point? I think you are making a rhetorical point, but not a very good one. I think we all know, or at least *should* know, that this proposal is much closer to split than swapcase. Most of us have had to cut a prefix or a suffix from a string, often a file extension. It's not as common as, say, stripping whitespace, but it happens often enough. Stack Overflow and the mailing lists and even the bug tracker are full of people asking about "bugs" in str.[lr]strip because they've tried to use those methods to cut prefixes and suffixes, so we know that this functionality is needed far more often than swapcase, and is easy to get wrong. We can't say the same about swapcase. Even in Python 1.5, it was a gimmick. I can only think of a single use-case for it: "I've typed a whole lot of text without noticing that Caps Lock was on, so it looks like 'hELLO wORLD' by mistake." [...] > try to make the proposer consider whether while his idea seems > great right now, will it feel more like str.swapcase in a few years? And > sometimes that pushback *is* too conservative, and an idea is good. But it > still needs someone to implement it, document it, and integrate it into the > language - the proposer isn't always able (or willing) to do that, so again > there's a question of who does the work? And that's a very good point -- if there's no volunteer willing and able to do the work, even the best ideas can languish, sometimes for years. But that's not a reason to *reject* an idea.
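The kind of mistake being described is easy to reproduce with the existing methods: strip() and friends treat their argument as a set of characters rather than a substring, so the wrong idiom sometimes looks correct:

```python
# rstrip(".jpg") removes trailing characters drawn from the set
# {'.', 'j', 'p', 'g'}, not the literal suffix ".jpg":
print("photo.jpg".rstrip(".jpg"))    # -> photo (the right answer, by luck)
print("hopping.jpg".rstrip(".jpg"))  # -> hoppin (the basename's own 'g' is eaten too)
```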
-- Steven From mertz at gnosis.cx Sat Mar 30 09:03:26 2019 From: mertz at gnosis.cx (David Mertz) Date: Sat, 30 Mar 2019 09:03:26 -0400 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: <20190330124105.GC31406@ando.pearwood.info> References: <061FBF45-02FD-4C3E-B792-EB09C05ED98E@getmailspring.com> <20190330124105.GC31406@ando.pearwood.info> Message-ID: On Sat, Mar 30, 2019, 8:42 AM Steven D'Aprano wrote: > Most of us have had to cut a prefix or a suffix from a string, often a > file extension. It's not as common as, say, stripping whitespace, but it > happens often enough. I do this all the time! I never really thought about wanting a method though. I just spell it like this without much thought: basename = fname.split(".ext")[0] But I suppose a method would be helpful. If we have one, PLEASE no variation of 'trim' in the name. I still forget whether it's .lstrip() or .ltrim() or .stripl() or etc. after 20 years using Python. Lots of languages use trim for Python's strip, so having both with subtly different meanings is a bug magnet. One thing I love about .startswith() and .endswith() is matching multiple options. It's a little funny the multiple options must be a tuple exactly (not a list, not a set, not an iterator), but whatever. It would be a shame to lack that symmetry in the .cut_suffix() method. E.g. now: if fname.endswith(('.jpg', '.png', '.gif')): ... I'd expect to be able to do: basename = fname.cut_suffix(('.jpg', '.png', '.gif')) -------------- next part -------------- An HTML attachment was scrubbed...
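The symmetry David asks for could look something like this sketch (cut_suffix is the hypothetical method name from his email; the tuple handling mirrors str.endswith()):

```python
def cut_suffix(s, suffixes):
    """Remove the first matching suffix. Accepts a single string or,
    like str.endswith(), a tuple of candidate suffixes (hypothetical API)."""
    if isinstance(suffixes, str):
        suffixes = (suffixes,)
    for suffix in suffixes:
        if suffix and s.endswith(suffix):
            return s[:-len(suffix)]
    return s

print(cut_suffix("banner.png", (".jpg", ".png", ".gif")))  # -> banner
print(cut_suffix("notes.txt", (".jpg", ".png", ".gif")))   # -> notes.txt
```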
URL: From 2QdxY4RzWzUUiLuE at potatochowder.com Sat Mar 30 09:29:15 2019 From: 2QdxY4RzWzUUiLuE at potatochowder.com (Dan Sommers) Date: Sat, 30 Mar 2019 09:29:15 -0400 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: References: <061FBF45-02FD-4C3E-B792-EB09C05ED98E@getmailspring.com> <20190330124105.GC31406@ando.pearwood.info> Message-ID: On 3/30/19 9:03 AM, David Mertz wrote: > On Sat, Mar 30, 2019, 8:42 AM Steven D'Aprano wrote: > >> Most of us have had to cut a prefix or a suffix from a string, often a >> file extension. Its not as common as, say, stripping whitespace, but it >> happens often enough. > > > I do this all the time! I never really thought about wanting a method > though. I just spell it like this without much thought: > > basename = fname.split(".ext")[0] This one also works until it doesn't: basename = 'special.extensions.ext'.split(".ext")[0] basename = 'food.pyramid.py'.split(".py")[0] basename = 'build.classes.c'.split(".c")[0] Safer is fname.rsplit('.ext', 1)[0]. There's always os.path.splitext. ;-) Dan From brett at python.org Sat Mar 30 12:52:04 2019 From: brett at python.org (Brett Cannon) Date: Sat, 30 Mar 2019 09:52:04 -0700 Subject: [Python-ideas] New Project to Capture summaries from this In-Reply-To: References: <001101d4e57a$7d4dd290$77e977b0$@ieee.org> <23709.35835.380175.588224@turnbull.sk.tsukuba.ac.jp> <20190329094026.GQ31406@ando.pearwood.info> Message-ID: I think this thread has gone off-topic as this mailing list is not about Gmail (the client or the service). ;) On Fri, Mar 29, 2019 at 4:21 AM Chris Angelico wrote: > On Fri, Mar 29, 2019 at 8:41 PM Steven D'Aprano > wrote: > > On Fri, Mar 29, 2019 at 02:23:52PM +1100, Chris Angelico wrote: > > > > > Have you ever actually > > > convinced someone to move off Gmail onto some other client? > > > > Gmail is an email provider with a web interface. As a provider, it is > > available on any email client, so long as it speaks POP3 or IMAP. 
> > In those terms, I'm talking about convincing someone to move off the > _Gmail client_, whether or not they continue using the Gmail service. > Have you ever convinced someone to stop using the Gmail web interface > and start using some other client, as a means of improving their use > of the mailing list (eg to resolve some complaints they were having)? > > ChrisA > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bitsink at gmail.com Sat Mar 30 13:56:56 2019 From: bitsink at gmail.com (Nam Nguyen) Date: Sat, 30 Mar 2019 10:56:56 -0700 Subject: [Python-ideas] Built-in parsing library Message-ID: Hello list, What do you think of a universal parsing library in the stdlib mainly for use by other libraries in the stdlib? Throughout the years we have had many issues with protocol parsing. Some have even introduced security bugs. The main cause of these issues is the use of simple regular expressions. Having a universal parsing library in the stdlib would help cut down these issues. Such a library should be minimal yet encompassing, and whole parse trees should be entirely expressible in code. I am thinking of combinatoric parsing as the main candidate that fits this bill. What do you say? Thanks! Nam -------------- next part -------------- An HTML attachment was scrubbed... URL: From turnbull.stephen.fw at u.tsukuba.ac.jp Sat Mar 30 14:03:03 2019 From: turnbull.stephen.fw at u.tsukuba.ac.jp (Stephen J.
Turnbull) Date: Sun, 31 Mar 2019 03:03:03 +0900 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: <2BB1EE45-84FE-44F4-8666-A3F52ADEA06E@killingar.net> References: <1E2D83BC-326E-4278-A46C-7FE4F7FE2560@getmailspring.com> <669F88F5-3CB2-4B33-A451-621307BDBD0F@killingar.net> <23709.35761.793748.866930@turnbull.sk.tsukuba.ac.jp> <2BB1EE45-84FE-44F4-8666-A3F52ADEA06E@killingar.net> Message-ID: <23711.44887.752496.496451@turnbull.sk.tsukuba.ac.jp> Anders Hovmöller writes: > That isn't the proposal as it stands now. The consensus among those > supporting the idea is "strip_prefix" and "strip_suffix". Let's > debate that. IMO, the "prefix" and "suffix" parts are necessary to fully clarify the intent (and do that well), so any of the verbs "strip", "trim", or "cut" work for me. I prefer Steven d'A's "cutprefix" and "cutsuffix" as the shortest, and by analogy to startswith/endswith, I'd drop the underscore (PEP 8 notwithstanding). From turnbull.stephen.fw at u.tsukuba.ac.jp Sat Mar 30 14:05:59 2019 From: turnbull.stephen.fw at u.tsukuba.ac.jp (Stephen J. Turnbull) Date: Sun, 31 Mar 2019 03:05:59 +0900 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: <20190330013736.GU31406@ando.pearwood.info> References: <1E2D83BC-326E-4278-A46C-7FE4F7FE2560@getmailspring.com> <669F88F5-3CB2-4B33-A451-621307BDBD0F@killingar.net> <23709.35761.793748.866930@turnbull.sk.tsukuba.ac.jp> <20190330013736.GU31406@ando.pearwood.info> Message-ID: <23711.45063.758826.321466@turnbull.sk.tsukuba.ac.jp> Steven D'Aprano writes: > The correct solution is a verbose statement: > > if string.startswith("spam"): > string = string[:len("spam")] This is harder to write than I thought! (The slice should be 'len("spam"):'.)
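Concretely, the two slices pick opposite halves of the string:

```python
string = "spam and eggs"
prefix = "spam"
if string.startswith(prefix):
    print(string[:len(prefix)])  # -> spam (the typo: keeps only the prefix)
    print(string[len(prefix):])  # -> " and eggs" (the intended remainder)
```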
But s/The/A/: string = re.sub("^spam", "", string) And a slightly incorrect solution (unless you really do want to remove all spam, which most people do, but might not apply to "tooth"): string = string.replace("spam", "") > A pair of "cut" methods (cut prefix, cut suffix) fills a real need, But do they, really? Do we really need multiple new methods to replace a dual-use one-liner, which also handles outfile = re.sub("\\.bmp$", ".jpg", infile) in one line? I concede that the same argument was made against startswith/endswith, and they cleared the bar. Python is a lot more complex now, though, and I think the predicates are more frequently useful. > and will avoid a lot of mistaken bug reports/questions. That depends on analogies to other languages. Coming from Emacs, I'm not at all surprised that .strip takes a character class as an argument and strips until it runs into a character not in the class. Evidently others have different intuition. If that's from English, and they know about cutprefix/cutsuffix, yeah, they won't make the mistake. If it's from another programming language they know, or they don't know about cutprefix, they may just write "string.strip('.jpg')" without thinking about it and it (sometimes) works, then they report a bug when it doesn't. Remember, these folks are not understanding the docs, and very likely not reading them. > As for the disruption, The word is "complexity". Where do you get "disruption" from? > code is a cost, but there is also the uncounted opportunity cost of > *not* adding this useful battery. Obviously some people think it's useful. Nobody denies that. The problem is *measuring* the opportunity cost of not having the battery, or the "usefulness" of the battery, as well as measuring the cost of complexity. Please stop caricaturing those who oppose the change as Luddites. > I can only think of one scenario where this change might > break someone's code: Again, who claimed it would break code? 
> The functionality is clear, meets a real need, is backwards compatible, > and has no significant downsides. The only hard part is bikeshedding > names for the methods: > > lcut rcut > cutprefix cutsuffix > ltrim rtrim > prestrip poststrip > etc. > > Am I wrong about any of these statements? It's not obvious to me from the names that the startswith/endswith test is included in the method, although on reflection it would be weird if it wasn't. Still, I wouldn't be surprised to see if string.startswith("spam"): string.cutprefix("spam") in a new user's code. You're wrong about "no significant downsides," in the sense that that's the wrong criterion. The right criterion is "if we add a slew of features that clear the same bar, does the total added benefit from that set exceed the cost?" The answer to that question is not a trivial extrapolation from the question you did ask, because the benefits will increase approximately linearly in the number of such features, but the cost of additional complexity is generally superlinear. I also disagree they meet a real need, as explained above. They're merely convenient. And the bikeshedding isn't hard. In the list above, cutprefix/ cutsuffix are far and away the best. From turnbull.stephen.fw at u.tsukuba.ac.jp Sat Mar 30 14:06:47 2019 From: turnbull.stephen.fw at u.tsukuba.ac.jp (Stephen J. Turnbull) Date: Sun, 31 Mar 2019 03:06:47 +0900 Subject: [Python-ideas] New Project to Capture summaries from this In-Reply-To: References: <001101d4e57a$7d4dd290$77e977b0$@ieee.org> <23709.35835.380175.588224@turnbull.sk.tsukuba.ac.jp> Message-ID: <23711.45111.68592.566439@turnbull.sk.tsukuba.ac.jp> Chris Angelico writes: > Have you ever actually convinced someone to move off Gmail onto > some other client? No, but then, I never tried. I have gotten a couple score people to seriously try about a dozen different MUAs over the last three decades though. It's not impossible. But that's really not relevant. 
I'm a mail person, I develop Mailman. I'm well aware that the answer is never the logical, obvious, and invariably effective when tried "get a better client", it's always "impose my preferences on everybody I might correspond with." Nobody is contesting that. What's relevant is that sticking to a crappy mail client is a personal choice. Sticking to the user interface of a web forum is not. > That's half of my point (the distinction between "suboptimal > clients" and "suboptimal services"), but the other half is that > every time someone says "sorry, you chose Gmail", there's a lengthy > discussion that ends up NOT showcasing any sort of perfect > alternative - and often not even any *better* alternatives. "Perfect alternative" is a strawman. There's no perfect alternative. People use email in different ways; different MUAs are suited to different user habits and different mail streams. As for "better" alternatives, there as many MUAs better than GMail as there are programming languages better than original Dartmouth BASIC. They're just not GMail, and will require jumping through hoops to get personal archives moved over, or setting up GMail as an IMAP server, or using different clients for different purposes, which aren't acceptable to most people. Still, that's their choice. And their life won't be worse than it is now. Those of us who have exercised choice and invested in productive use of email will lose both productivity and choice, for questionable net benefit to the project, even assuming that those who prefer Discourse or Zulip or whatever get the benefits they expect. 
Steve From python at mrabarnett.plus.com Sat Mar 30 15:26:25 2019 From: python at mrabarnett.plus.com (MRAB) Date: Sat, 30 Mar 2019 19:26:25 +0000 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: References: <061FBF45-02FD-4C3E-B792-EB09C05ED98E@getmailspring.com> <20190330124105.GC31406@ando.pearwood.info> Message-ID: <506c1b16-a665-45a9-0848-880f87d43b3e@mrabarnett.plus.com> On 2019-03-30 13:03, David Mertz wrote: > On Sat, Mar 30, 2019, 8:42 AM Steven D'Aprano > wrote: > > Most of us have had to cut a prefix or a suffix from a string, often a > file extension. It's not as common as, say, stripping whitespace, but > it happens often enough. > > > I do this all the time! I never really thought about wanting a method > though. I just spell it like this without much thought: > > basename = fname.split(".ext")[0] > > But I suppose a method would be helpful. If we have one, PLEASE no > variation of 'trim' in the name. I still forget whether it's .lstrip() > or .ltrim() or .stripl() or etc. after 20 years using Python. Lots of > languages use trim for Python's strip, so having both with subtly > different meanings is a bug magnet. > > One thing I love about .startswith() and .endswith() is matching > multiple options. It's a little funny that the multiple options must be a > tuple exactly (not a list, not a set, not an iterator), but whatever. It > would be a shame to lack that symmetry in the .cut_suffix() method. > > E.g. now: > > if fname.endswith(('.jpg', '.png', '.gif')): ... > > I'd expect to be able to do: > > basename = fname.cut_suffix(('.jpg', '.png', '.gif')) > I'd much prefer .lcut/.rcut to .cut_prefix/.cut_suffix, to match .lstrip/.rstrip.
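A quick illustration of why the split()-based spelling quoted above is fragile (a sketch; the filename is made up): split() cuts at every occurrence of the suffix text anywhere in the string, not just at the end:

```python
fname = "my.extra.file.ext"

# split() cuts at *every* occurrence of ".ext", including mid-string:
print(fname.split(".ext")[0])   # 'my' -- the wanted result was 'my.extra.file'

# An explicit endswith() check only touches a trailing occurrence:
base = fname[:-len(".ext")] if fname.endswith(".ext") else fname
print(base)                     # 'my.extra.file'
```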
From brandtbucher at gmail.com Sat Mar 30 20:27:48 2019 From: brandtbucher at gmail.com (Brandt Bucher) Date: Sat, 30 Mar 2019 17:27:48 -0700 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: References: <061FBF45-02FD-4C3E-B792-EB09C05ED98E@getmailspring.com> <20190330124105.GC31406@ando.pearwood.info> Message-ID: > One thing I love about .startswith() and .endswith() is matching multiple options. It's a little funny that the multiple options must be a tuple exactly (not a list, not a set, not an iterator), but whatever. It would be a shame to lack that symmetry in the .cut_suffix() method. > > E.g. now: > > if fname.endswith(('.jpg', '.png', '.gif')): ... > > I'd expect to be able to do: > > basename = fname.cut_suffix(('.jpg', '.png', '.gif')) An idea worth considering: one can think of the "strip" family of methods as currently taking an iterable of strings as an argument (since a string is itself a sequence of strings): >>> "abcd".rstrip("dc") 'ab' It would not be a huge logical leap to allow them to take any iterable. Backward compatible, no new methods: >>> fname.rstrip(('.jpg', '.png', '.gif')) It even, in my opinion, can clarify "classic" strip/rstrip/lstrip usage: >>> "abcd".rstrip(("d", "c")) 'ab' Maybe I'm missing a breaking case though, or this isn't as clear for others. Thoughts? Brandt From mertz at gnosis.cx Sat Mar 30 21:15:11 2019 From: mertz at gnosis.cx (David Mertz) Date: Sat, 30 Mar 2019 21:15:11 -0400 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: References: <061FBF45-02FD-4C3E-B792-EB09C05ED98E@getmailspring.com> <20190330124105.GC31406@ando.pearwood.info> Message-ID: I like this idea quite a lot. I cannot think of anything it breaks at first consideration. On Sat, Mar 30, 2019, 8:28 PM Brandt Bucher wrote: > > > One thing I love about .startswith() and .endswith() is matching > multiple options.
It's a little funny that the multiple options must be a tuple > exactly (not a list, not a set, not an iterator), but whatever. It would be > a shame to lack that symmetry in the .cut_suffix() method. > > > > E.g. now: > > > > if fname.endswith(('.jpg', '.png', '.gif')): ... > > > > I'd expect to be able to do: > > > > basename = fname.cut_suffix(('.jpg', '.png', '.gif')) > > An idea worth considering: one can think of the "strip" family of methods > as currently taking an iterable of strings as an argument (since a string > is itself a sequence of strings): > > >>> "abcd".rstrip("dc") > 'ab' > > It would not be a huge logical leap to allow them to take any iterable. > Backward compatible, no new methods: > > >>> fname.rstrip(('.jpg', '.png', '.gif')) > > It even, in my opinion, can clarify "classic" strip/rstrip/lstrip usage: > > >>> "abcd".rstrip(("d", "c")) > 'ab' > > Maybe I'm missing a breaking case though, or this isn't as clear for > others. Thoughts? > > Brandt > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > From steve at pearwood.info Sun Mar 31 00:43:25 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 31 Mar 2019 15:43:25 +1100 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: <23711.45063.758826.321466@turnbull.sk.tsukuba.ac.jp> References: <1E2D83BC-326E-4278-A46C-7FE4F7FE2560@getmailspring.com> <669F88F5-3CB2-4B33-A451-621307BDBD0F@killingar.net> <23709.35761.793748.866930@turnbull.sk.tsukuba.ac.jp> <20190330013736.GU31406@ando.pearwood.info> <23711.45063.758826.321466@turnbull.sk.tsukuba.ac.jp> Message-ID: <20190331044323.GB6059@ando.pearwood.info> On Sun, Mar 31, 2019 at 03:05:59AM +0900, Stephen J.
Turnbull wrote: > Steven D'Aprano writes: > > > The correct solution is a verbose statement: > > > > if string.startswith("spam"): > > string = string[:len("spam")] > > This is harder to write than I thought! (The slice should be > 'len("spam"):'.) But s/The/A/: > > string = re.sub("^spam", "", string) Indeed, you're right that there can be other solutions, but whether they are "correct" depends on how one defines correct :-) I don't consider something that pulls in the heavy bulldozer of regexes to crack this peanut to be the right way to solve the problem, but YMMV. But for what it's worth, a regex solution is likely to be significantly slower -- see below. > And a slightly incorrect solution (unless you really do want to remove > all spam, which most people do, but might not apply to "tooth"): > > string = string.replace("spam", "") Sorry, that's not "slightly" incorrect, that is completely incorrect, for precisely the reason you state: it replaces *all* matching substrings, not just the leading prefix. I don't see a way to easily use replace to implement a prefix cut. I suppose one might do: string = string[:-len(suffix)] + string[-len(suffix):].replace(suffix, '') but I haven't tried it and it sure isn't what I would call easy or obvious. > > A pair of "cut" methods (cut prefix, cut suffix) fills a real need, > > But do they, really? Do we really need multiple new methods to > replace a dual-use one-liner, which also handles > > outfile = re.sub("\\.bmp$", ".jpg", infile) Solutions based on regexes are far less discoverable: - all those people who have reported "bugs" in lstrip() and rstrip() could have thought of using a regex instead but didn't; - they involve reading what is effectively another programming language which uses cryptic symbols like "$" instead of words like "suffix".
We aren't the Perl community where regexes are the first hammer we reach for every time we need to drive a screw :-) I had to read your re.sub() call twice before I convinced myself that it only replaced a suffix. And we also have to deal with the case where we want to delete a substring containing metacharacters: # Ouch! re.sub(r'\\\.\$$', '', string) # cut literal \.$ suffix Additionally, a regex solution is likely to be slower than even a pure-Python solution, let alone a string method. On my computer, regexes are three times slower than a Python function: $ python3.5 -m timeit -s "import re" "re.sub('eese$', '', 'spam eggs cheese')" 100000 loops, best of 3: 3.75 usec per loop $ python3.5 -m timeit -s "def rcut(suff, s): return s[:-len(suff)] if s.endswith(suff) else s" "rcut('eese', 'spam eggs cheese')" 1000000 loops, best of 3: 1.22 usec per loop > in one line? I concede that the same argument was made against > startswith/endswith, and they cleared the bar. Python is a lot more > complex now, though, and I think the predicates are more frequently > useful. > > > and will avoid a lot of mistaken bug reports/questions. > > That depends on analogies to other languages. I don't think it matters that much. Of course it doesn't help if you come to Python from a language where strip() deletes a prefix or suffix, but even if you don't, as I don't, there's something about the pattern: string = string.lstrip("spam") which looks like it ought to remove a prefix rather than a set of characters. I've fallen for that error myself. > Coming from Emacs, I'm > not at all surprised that .strip takes a character class as an > argument and strips until it runs into a character not in the class. And neither am I... until I forget, and get surprised that it doesn't work that way. This post is already too long, so in the interest of brevity and my dignity I'll skip the anecdote about the time I too blundered publicly about the "bug" in [lr]strip. 
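As an aside on the metacharacter problem mentioned above: re.escape() can take some of the pain out of the regex spelling, though it is still less direct than a dedicated method would be (a small sketch; the suffix and string are illustrative):

```python
import re

suffix = r"\.$"        # the literal three-character suffix: backslash, dot, dollar
string = r"data\.$"

# Escape the suffix so its metacharacters match literally, then anchor at the end.
cleaned = re.sub(re.escape(suffix) + "$", "", string)
print(cleaned)         # 'data'
```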
> Evidently others have different intuition. If that's from English, > and they know about cutprefix/cutsuffix, yeah, they won't make the > mistake. If it's from another programming language they know, or they > don't know about cutprefix, they may just write "string.strip('.jpg')" > without thinking about it and it (sometimes) works, then they report a > bug when it doesn't. Remember, these folks are not understanding the > docs, and very likely not reading them. It's not reasonable to hold the failure of the proposed new methods to prevent *all* erroneous uses of [lr]strip against them. Short of breaking backwards compatibility and changing strip() to remove_characters_from_a_set_not_a_substring_read_the_docs_before_reporting_any_bugs() there's always going to be *someone* who makes a mistake. But with an easily discoverable alternative available, the number of such errors should plummet as people gradually migrate to 3.8 or above. > > As for the disruption, > > The word is "complexity". Where do you get "disruption" from? If you had read the text I quoted before trimming it, you would have seen that it was from Chris Barker: On Fri, Mar 29, 2019 at 04:05:55PM -0700, Christopher Barker wrote: > This proposal would provide a minor gain for an even more minor disruption. I try very hard to provide enough context that my comments are understandable, and I don't always succeed, but the reader has to meet me part way by at least skimming the quoted text for context before questioning me :-) > > code is a cost, but there is also the uncounted opportunity cost of > *not* adding this useful battery. > > Obviously some people think it's useful. Nobody denies that. Well, further on you do question whether it meets a real need, so there is at least one :-) > The > problem is *measuring* the opportunity cost of not having the battery, > or the "usefulness" of the battery, as well as measuring the cost of > complexity.
We have never objectively measured these things before, because they can't be. We don't even have a good, objective measurement of complexity of the language -- but if we did, I'm pretty sure that adding a pair of fairly simple, self-explanatory methods to the str class would not increase it by much. We're on steadier ground if we talk about complexity of the user's code. In that case, whether we measure the complexity of a program by lines of code or number of functions or some other more complicated measurement, it ought to be self-evident that being able to replace a helper function with a built-in will slightly reduce complexity. For the sake of the argument, if we can decrease the complexity of a thousand user programs by 1 LOC each, at the cost of increasing the complexity of the interpreter by 100 LOC, isn't that a cost worth paying? I think it is. > Please stop caricaturing those who oppose the change as > Luddites. That's a grossly unjust misrepresentation of my arguments. Nothing I have said can be fairly read as a caricature of the opposing point, let alone as attacks on others for being Luddites. On the contrary: *twice* I have acknowledged that a level of caution about adding new features is justified. My argument is that in *this* case, the cost-benefit analysis falls firmly on the "benefit" side, not that any opposition is misguided. Whereas your attack on me comes perilously close to poisoning the well: "oh, pay no attention to his arguments, he is the sort of person who caricatures those who disagree as Luddites". > > I can only think of one scenario where this change might > > break someone's code: > > Again, who claimed it would break code? Any addition of a new feature has the risk of breaking code, and we ought to consider that possibility. [...] > It's not obvious to me from the names that the startswith/endswith > test is included in the method, although on reflection it would be > weird if it wasn't. Agreed. 
We can't be completely explicit about everything, it isn't practical: math.trigonometric_sine_where_the_angle_is_measured_in_radians(x) > Still, I wouldn't be surprised to see > > if string.startswith("spam"): > string.cutprefix("spam") > > in a new user's code. That's the sort of inefficient code newbies often write, and the fix for that is experience and education. I'm not worried about that, just as I'm not worried about newbies writing: if string.startswith(" "): string = string.lstrip(" ") > You're wrong about "no significant downsides," in the sense that > that's the wrong criterion. The right criterion is "if we add a slew > of features that clear the same bar, does the total added benefit from > that set exceed the cost?" The answer to that question is not a > trivial extrapolation from the question you did ask, because the > benefits will increase approximately linearly in the number of such > features, but the cost of additional complexity is generally > superlinear. I disagree that the benefits of new features scale linearly. There's a certain benefit to having (say) str.strip, and a certain benefit of having (say) string slicing, and a certain benefit of having (say) str.upper, but being able to do *all three* is much more powerful than just being able to do one or another. And I have no idea about the "additional complexity" (of what? the language? the interpreter?) because we don't really have a good way of measuring complexity of a language. > I also disagree they meet a real need, as explained above. They're > merely convenient. I don't understand how you can question whether or not people need to cut prefixes and suffixes in the face of people writing code to cut prefixes and suffixes. (Sometimes *wrong* code.) 
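For concreteness, the helper that people keep re-writing (and sometimes getting wrong) is tiny — a sketch of the semantics under discussion, with the names still subject to bikeshedding:

```python
def cutprefix(s, prefix):
    """Return s without prefix if it starts with prefix, else s unchanged."""
    return s[len(prefix):] if prefix and s.startswith(prefix) else s

def cutsuffix(s, suffix):
    """Return s without suffix if it ends with suffix, else s unchanged."""
    return s[:-len(suffix)] if suffix and s.endswith(suffix) else s

print(cutprefix("spam eggs", "spam "))   # 'eggs'
print(cutsuffix("infile.bmp", ".bmp"))   # 'infile'
print(cutsuffix("tooth", "spam"))        # 'tooth' -- no match, unchanged
```

Note the guard against an empty prefix/suffix: without it, `s[:-0]` would return the empty string instead of leaving s alone.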
We have had a few people on this list explicitly state that they cut prefixes and suffixes, there's the evidence of the dozens of people who misused strip() to cut prefixes and suffixes, and there's history of people asking how to do it: https://stackoverflow.com/questions/599953/how-to-remove-the-left-part-of-a-string https://stackoverflow.com/questions/16891340/remove-a-prefix-from-a-string https://stackoverflow.com/questions/1038824/how-do-i-remove-a-substring-from-the-end-of-a-string-in-python https://codereview.stackexchange.com/questions/33817/remove-prefix-and-remove-suffix-functions https://www.quora.com/Whats-the-best-way-to-remove-a-suffix-of-a-string-in-Python https://stackoverflow.com/questions/3663450/python-remove-substring-only-at-the-end-of-string This same question comes up time and time again, and you're questioning whether people need to do it. Contrast to a hypothetical suggested feature which doesn't meet a real need (or at least nobody has yet suggested one, as yet): Jonathon Fine's suggestion that we define a generalised "string subtraction" operator. Jonathon explained that this is well-defined within the bounds of free groups and category theory. That's great, but being well-defined is only the first step. What would we use a generalised string subtraction for? What need does it meet? There are easy cases: "abcd" - "d" # remove a suffix -"a" + "abcd" # remove a prefix but in the full generality, it isn't clear what "abcd" - "z" would be useful for. Lacking a use-case for full string subtraction, we can reject adding it as a builtin feature or even a stdlib module. > And the bikeshedding isn't hard. In the list above, cutprefix/ > cutsuffix are far and away the best. 
Well I'm glad we agree on that, even if nothing else :-) -- Steven From rosuav at gmail.com Sun Mar 31 01:48:36 2019 From: rosuav at gmail.com (Chris Angelico) Date: Sun, 31 Mar 2019 16:48:36 +1100 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: <20190331044323.GB6059@ando.pearwood.info> References: <1E2D83BC-326E-4278-A46C-7FE4F7FE2560@getmailspring.com> <669F88F5-3CB2-4B33-A451-621307BDBD0F@killingar.net> <23709.35761.793748.866930@turnbull.sk.tsukuba.ac.jp> <20190330013736.GU31406@ando.pearwood.info> <23711.45063.758826.321466@turnbull.sk.tsukuba.ac.jp> <20190331044323.GB6059@ando.pearwood.info> Message-ID: On Sun, Mar 31, 2019 at 3:44 PM Steven D'Aprano wrote: > Of course it doesn't help if you come to Python from a language where > strip() deletes a prefix or suffix, but even if you don't, as I don't, > there's something about the pattern: > > string = string.lstrip("spam") > > which looks like it ought to remove a prefix rather than a set of > characters. I've fallen for that error myself. I think it will be far less confusing once there's parallel functions for prefix/suffix removal. Actually, this is an argument in favour of matching that pattern; if people see lstrip() and lcut() as well as rstrip() and rcut(), it's obvious that they are similar methods, and you can nip over to the docs to check which one you want ("oh right, that makes sense, cut snips off a word but strip takes off a set of letters"). But even if it's called cutprefix/cutsuffix (to match hasprefix/hassuffix... oh wait, I mean startswith/endswith), there's at least a _somewhat_ better chance that people will grab the right tool. Plus, it'll be easy to deal with the problems when they come up - "hey, strip has a weird bug" :: "ah, you want cut instead". If we're bikeshedding the actual method names, I think it would be good to have a list of viable options. 
A quick skim through the thread gives me these: * cut_prefix/cut_suffix * strip_prefix/strip_suffix * cut_start/cut_end * Any of the above with the underscore removed * lcut/rcut * ltrim/rtrim (and maybe trim) * truncate (end only, no from-start equivalent) Of them, I think cutprefix/cutsuffix (no underscore) and lcut/rcut are the strongest contenders, but that's just my opinion. Have I missed anyone's favourite spelling? Is there a name that parallels startswith/endswith? Regardless of the method name, IMO the functions should accept a tuple of test strings, as startswith/endswith do. That's a feature that can't easily be spelled in a one-liner. (Though stacked suffixes shouldn't all be removed - "asdf.jpg.png".cutsuffix((".jpg", ".png")) should return "asdf.jpg", not "asdf".) ChrisA From boxed at killingar.net Sun Mar 31 01:53:53 2019 From: boxed at killingar.net (=?utf-8?Q?Anders_Hovm=C3=B6ller?=) Date: Sun, 31 Mar 2019 07:53:53 +0200 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: <20190331044323.GB6059@ando.pearwood.info> References: <1E2D83BC-326E-4278-A46C-7FE4F7FE2560@getmailspring.com> <669F88F5-3CB2-4B33-A451-621307BDBD0F@killingar.net> <23709.35761.793748.866930@turnbull.sk.tsukuba.ac.jp> <20190330013736.GU31406@ando.pearwood.info> <23711.45063.758826.321466@turnbull.sk.tsukuba.ac.jp> <20190331044323.GB6059@ando.pearwood.info> Message-ID: <277B8B91-ACE5-4BDF-B4EA-BA86BA4284BA@killingar.net> > Steven D'Aprano: > > Stephen J. Turnbull: > >> And the bikeshedding isn't hard. In the list above, cutprefix/ >> cutsuffix are far and away the best. > > Well I'm glad we agree on that, even if nothing else :-) I prefer "strip_prefix" because of the analogy to strip() which doesn't do anything if the characters aren't present. Introducing a new word "cut" seems unnecessary and confusing and I'd wager it will increase the probability of: if s.startswith("foo"): s = s.cutprefix("foo") Obviously this is a guess!
I also don't understand why not using the underscore is preferable. It seems just to be poor form. / Anders From steve at pearwood.info Sun Mar 31 04:35:22 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 31 Mar 2019 19:35:22 +1100 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: References: <669F88F5-3CB2-4B33-A451-621307BDBD0F@killingar.net> <23709.35761.793748.866930@turnbull.sk.tsukuba.ac.jp> <20190330013736.GU31406@ando.pearwood.info> <23711.45063.758826.321466@turnbull.sk.tsukuba.ac.jp> <20190331044323.GB6059@ando.pearwood.info> Message-ID: <20190331083522.GE31406@ando.pearwood.info> On Sun, Mar 31, 2019 at 04:48:36PM +1100, Chris Angelico wrote: > Regardless of the method name, IMO the functions should accept a tuple > of test strings, as startswith/endswith do. That's a feature that can't > easily be spelled in a one-liner. (Though stacked suffixes shouldn't > all be removed - "asdf.jpg.png".cutsuffix((".jpg", ".png")) should > return "asdf.jpg", not "asdf".) There's a slight problem with that: what happens if more than one suffix matches? E.g. given: "musical".lcut(('al', 'ical')) should the suffix "al" be removed, leaving "music"? (First match wins.) Or should the suffix "ical" be removed, leaving "mus"? (Longest match wins.) I don't think we can decide which is better, and I'm not keen on a keyword argument to choose one or the other, so I suggest we stick to the 90% solution of only supporting a single suffix. We can always revisit that in the future.
-- Steven From kirillbalunov at gmail.com Sun Mar 31 04:45:58 2019 From: kirillbalunov at gmail.com (Kirill Balunov) Date: Sun, 31 Mar 2019 11:45:58 +0300 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: <20190331083522.GE31406@ando.pearwood.info> References: <669F88F5-3CB2-4B33-A451-621307BDBD0F@killingar.net> <23709.35761.793748.866930@turnbull.sk.tsukuba.ac.jp> <20190330013736.GU31406@ando.pearwood.info> <23711.45063.758826.321466@turnbull.sk.tsukuba.ac.jp> <20190331044323.GB6059@ando.pearwood.info> <20190331083522.GE31406@ando.pearwood.info> Message-ID: On Sun, 31 Mar 2019 at 11:36, Steven D'Aprano wrote: > There's a slight problem with that: what happens if more than one suffix > matches? E.g. given: > > "musical".lcut(('al', 'ical')) > > should the suffix "al" be removed, leaving "music"? (First match wins.) > > Or should the suffix "ical" be removed, leaving "mus"? (Longest match > wins.) > > I think you should choose "First match wins", because in this case you can make "Longest match wins" as `"musical".lcut(tuple(sorted(('al', 'ical'))))`. But if you choose "Longest match wins" there is no chance to achieve "First match wins" behaviour. with kind regards, -gfg From kirillbalunov at gmail.com Sun Mar 31 04:53:52 2019 From: kirillbalunov at gmail.com (Kirill Balunov) Date: Sun, 31 Mar 2019 11:53:52 +0300 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: References: <669F88F5-3CB2-4B33-A451-621307BDBD0F@killingar.net> <23709.35761.793748.866930@turnbull.sk.tsukuba.ac.jp> <20190330013736.GU31406@ando.pearwood.info> <23711.45063.758826.321466@turnbull.sk.tsukuba.ac.jp> <20190331044323.GB6059@ando.pearwood.info> <20190331083522.GE31406@ando.pearwood.info> Message-ID: On Sun, 31 Mar 2019 at 11:45, Kirill Balunov wrote: > > > On Sun, 31 Mar 2019 at 11:36, Steven D'Aprano wrote: >> >> There's a slight problem with that: what happens if more than one suffix >> matches? E.g. given: >> >> "musical".lcut(('al', 'ical')) >> >> should the suffix "al" be removed, leaving "music"? (First match wins.) >> >> Or should the suffix "ical" be removed, leaving "mus"? (Longest match >> wins.) >> >> > I think you should choose "First match wins", because in this case you > can make "Longest match wins" as `"musical".lcut(tuple(sorted(('al', > 'ical'))))`. But if you choose "Longest match wins" there is no chance to > achieve "First match wins" behaviour. > > with kind regards, > -gfg > Sorry, it should be `.rcut` instead of `.lcut` in `"musical".*rcut*(tuple(sorted(('al', 'ical'))))` in the first place. I would prefer the names `.lstrip` and `.rstrip` instead of the *cut versions. with kind regards, -gdg From steve at pearwood.info Sun Mar 31 04:56:57 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 31 Mar 2019 19:56:57 +1100 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: <20190331083522.GE31406@ando.pearwood.info> References: <669F88F5-3CB2-4B33-A451-621307BDBD0F@killingar.net> <23709.35761.793748.866930@turnbull.sk.tsukuba.ac.jp> <20190330013736.GU31406@ando.pearwood.info> <23711.45063.758826.321466@turnbull.sk.tsukuba.ac.jp> <20190331044323.GB6059@ando.pearwood.info> <20190331083522.GE31406@ando.pearwood.info> Message-ID: <20190331085657.GF31406@ando.pearwood.info> On Sun, Mar 31, 2019 at 07:35:22PM +1100, Steven D'Aprano wrote: > "musical".lcut(('al', 'ical')) Oops, typo, I was thinking rcut and wrote lcut :-( -- Steven From evrial at gmail.com Sun Mar 31 04:58:45 2019 From: evrial at gmail.com (Alex Grigoryev) Date: Sun, 31 Mar 2019 11:58:45 +0300 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: <20190331085657.GF31406@ando.pearwood.info> References: <20190331085657.GF31406@ando.pearwood.info>
Message-ID: <3E6365B7-E7E9-4B54-8D15-70B9E2AD01BC@getmailspring.com> That's why strip_prefix(suffix) is a better name: you can't double-think it. On Mar 31 2019, at 11:56 AM, Steven D'Aprano wrote: > On Sun, Mar 31, 2019 at 07:35:22PM +1100, Steven D'Aprano wrote: > > > "musical".lcut(('al', 'ical')) > Oops, typo, I was thinking rcut and wrote lcut :-( > -- > Steven From rosuav at gmail.com Sun Mar 31 05:03:11 2019 From: rosuav at gmail.com (Chris Angelico) Date: Sun, 31 Mar 2019 20:03:11 +1100 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: <20190331083522.GE31406@ando.pearwood.info> References: <669F88F5-3CB2-4B33-A451-621307BDBD0F@killingar.net> <23709.35761.793748.866930@turnbull.sk.tsukuba.ac.jp> <20190330013736.GU31406@ando.pearwood.info> <23711.45063.758826.321466@turnbull.sk.tsukuba.ac.jp> <20190331044323.GB6059@ando.pearwood.info> <20190331083522.GE31406@ando.pearwood.info> Message-ID: On Sun, Mar 31, 2019 at 7:36 PM Steven D'Aprano wrote: > > On Sun, Mar 31, 2019 at 04:48:36PM +1100, Chris Angelico wrote: > > > Regardless of the method name, IMO the functions should accept a tuple > > of test strings, as startswith/endswith do. That's a feature that can't > > easily be spelled in a one-liner. (Though stacked suffixes shouldn't > > all be removed - "asdf.jpg.png".cutsuffix((".jpg", ".png")) should > > return "asdf.jpg", not "asdf".) > > There's a slight problem with that: what happens if more than one suffix > matches? E.g. given: > > "musical".lcut(('al', 'ical')) > > should the suffix "al" be removed, leaving "music"? (First match wins.) > > Or should the suffix "ical" be removed, leaving "mus"? (Longest match > wins.)
> I don't think we can decide which is better, and I'm not keen on a > keyword argument to choose one or the other, so I suggest we stick to > the 90% solution of only supporting a single suffix. > > We can always revisit that in the future. The only way there could be multiple independent matches is if one is a strict suffix of another (as in your example here). In most cases, this will require semantics at the control of the programmer, so I would say "first match wins" is the only sane definition (as it permits the programmer to order the cuttables to define the desired semantics). The overwhelming majority of use cases won't be affected by this decision, so first-wins won't hurt them. ChrisA From kirillbalunov at gmail.com Sun Mar 31 05:22:23 2019 From: kirillbalunov at gmail.com (Kirill Balunov) Date: Sun, 31 Mar 2019 12:22:23 +0300 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: References: <669F88F5-3CB2-4B33-A451-621307BDBD0F@killingar.net> <23709.35761.793748.866930@turnbull.sk.tsukuba.ac.jp> <20190330013736.GU31406@ando.pearwood.info> <23711.45063.758826.321466@turnbull.sk.tsukuba.ac.jp> <20190331044323.GB6059@ando.pearwood.info> <20190331083522.GE31406@ando.pearwood.info> Message-ID: Sorry one more time (it is early morning and I should drink a cup of coffee first before posting here). Of course it should be `tuple(sorted(('al', 'ical'), key=len, reverse=True))`; I hope it was obvious to everyone from the very beginning. with kind regards, -gdg On Sun, 31 Mar 2019 at 11:53, Kirill Balunov wrote: > > > On Sun, 31 Mar 2019 at 11:45, Kirill Balunov wrote: > >> >> >> On Sun, 31 Mar 2019 at 11:36, Steven D'Aprano wrote: >> >>> There's a slight problem with that: what happens if more than one suffix >>> matches? E.g. given: >>> >>> "musical".lcut(('al', 'ical')) >>> >>> should the suffix "al" be removed, leaving "music"? (First match wins.) >>> >>> Or should the suffix "ical" be removed, leaving "mus"? (Longest match >>> wins.) >>> >>> >> I think you should choose "First match wins", because in this case you >> can make "Longest match wins" as `"musical".lcut(tuple(sorted(('al', >> 'ical'))))`. But if you choose "Longest match wins" there is no chance to >> achieve "First match wins" behaviour. >> >> with kind regards, >> -gfg >> > > Sorry, it should be `.rcut` instead of `.lcut` in `"musical".*rcut*(tuple(sorted(('al', > 'ical'))))` in the first place. I would prefer the names `.lstrip` and `.rstrip` > instead of the *cut versions. > > with kind regards, > -gdg > From 2QdxY4RzWzUUiLuE at potatochowder.com Sun Mar 31 05:54:08 2019 From: 2QdxY4RzWzUUiLuE at potatochowder.com (Dan Sommers) Date: Sun, 31 Mar 2019 05:54:08 -0400 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: References: <1E2D83BC-326E-4278-A46C-7FE4F7FE2560@getmailspring.com> <669F88F5-3CB2-4B33-A451-621307BDBD0F@killingar.net> <23709.35761.793748.866930@turnbull.sk.tsukuba.ac.jp> <20190330013736.GU31406@ando.pearwood.info> <23711.45063.758826.321466@turnbull.sk.tsukuba.ac.jp> <20190331044323.GB6059@ando.pearwood.info> Message-ID: <07170fd4-0195-f2d4-d2d7-77f7272e263b@potatochowder.com> On 3/31/19 1:48 AM, Chris Angelico wrote: > If we're bikeshedding the actual method names, I think it would be > good to have a list of viable options. A quick skim through the thread > gives me these: > > * cut_prefix/cut_suffix > * strip_prefix/strip_suffix > * cut_start/cut_end > * Any of the above with the underscore removed > * lcut/rcut > * ltrim/rtrim (and maybe trim) > * truncate (end only, no from-start equivalent) > > Of them, I think cutprefix/cutsuffix (no underscore) and lcut/rcut are > the strongest contenders, but that's just my opinion. Have I missed > anyone's favourite spelling? Is there a name that parallels > startswith/endswith?
without_prefix without_suffix They're a little longer, but IMO "without" helps reenforce the immutability of the underlying string. None of these functions actually remove part of the original string, but rather they return a new string that's the original string without some piece of it. From mertz at gnosis.cx Sun Mar 31 11:48:39 2019 From: mertz at gnosis.cx (David Mertz) Date: Sun, 31 Mar 2019 11:48:39 -0400 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: <20190331083522.GE31406@ando.pearwood.info> References: <669F88F5-3CB2-4B33-A451-621307BDBD0F@killingar.net> <23709.35761.793748.866930@turnbull.sk.tsukuba.ac.jp> <20190330013736.GU31406@ando.pearwood.info> <23711.45063.758826.321466@turnbull.sk.tsukuba.ac.jp> <20190331044323.GB6059@ando.pearwood.info> <20190331083522.GE31406@ando.pearwood.info> Message-ID: The only reason I would support the idea would be to allow multiple suffixes (or prefixes). Otherwise, it just does too little for a new method. But adding that capability of startswith/endswith makes the cut off something easy to get wrong and non-trivial to implement. That said, I really like Brandt's ideas of expanding the signature of .lstrip/.rstrip instead. mystring.rstrip("abcd") # remove any of these single character suffixes mystring.rstrip(('foo', 'bar', 'baz')) # remove any of these suffixes Yes, the semantics or removals where one is a substring of another would need to be decided. As long as it's documented, any behavior would be fine. Most of the time the issue would be moot. On Sun, Mar 31, 2019, 4:36 AM Steven D'Aprano wrote: > On Sun, Mar 31, 2019 at 04:48:36PM +1100, Chris Angelico wrote: > > > Regardless of the method name, IMO the functions should accept a tuple > > of test strings, as startswith/endwith do. That's a feature that can't > > easily be spelled in a one-liner. (Though stacked suffixes shouldn't > > all be removed - "asdf.jpg.png".cutsuffix((".jpg", ".png")) should > > return "asdf.jpg", not "asdf".) 
> > There's a slight problem with that: what happens if more than one suffix > matches? E.g. given: > > "musical".lcut(('al', 'ical')) > > should the suffix "al" be removed, leaving "music"? (First match wins.) > > Or should the suffix "ical" be removed, leaving "mus"? (Longest match > wins.) > > I don't think we can decide which is better, and I'm not keen on a > keyword argument to choose one or the other, so I suggest we stick to > the 90% solution of only supporting a single suffix. > > We can always revisit that in the future. > > > -- > Steven > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From python at mrabarnett.plus.com Sun Mar 31 12:08:00 2019 From: python at mrabarnett.plus.com (MRAB) Date: Sun, 31 Mar 2019 17:08:00 +0100 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: References: <669F88F5-3CB2-4B33-A451-621307BDBD0F@killingar.net> <23709.35761.793748.866930@turnbull.sk.tsukuba.ac.jp> <20190330013736.GU31406@ando.pearwood.info> <23711.45063.758826.321466@turnbull.sk.tsukuba.ac.jp> <20190331044323.GB6059@ando.pearwood.info> <20190331083522.GE31406@ando.pearwood.info> Message-ID: <73791019-c93a-5638-05fe-96094df18fda@mrabarnett.plus.com> On 2019-03-31 16:48, David Mertz wrote: > The only reason I would support the idea would be to allow multiple > suffixes (or prefixes). Otherwise, it just does too little for a new > method. But adding that capability of startswith/endswith makes the cut > off something easy to get wrong and non-trivial to implement. > > That said, I really like Brandt's ideas of expanding the signature of > .lstrip/.rstrip instead. > > mystring.rstrip("abcd") # remove any of these single character suffixes > It removes _all_ of the single character suffixes. 
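[Editorial note: MRAB's caveat is the long-standing pitfall behind this whole thread — the existing strip methods treat their argument as a *set of characters*, not as a literal suffix, so they can eat more than intended. A minimal demonstration:]

```python
# str.rstrip's argument is a set of characters, not a suffix string:
# every trailing character drawn from the set is removed.
print("tomato".rstrip("to"))      # toma  ('o' and 't' stripped, stops at 'a')
print("next.txt".rstrip(".txt"))  # ne    (the 't' and 'x' of "next" go too!)
```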
> mystring.rstrip(('foo', 'bar', 'baz')) # remove any of these suffixes > In keeping with the current behaviour, it would strip _all_ of these suffixes. > Yes, the semantics or removals where one is a substring of another would > need to be decided. As long as it's documented, any behavior would be > fine. Most of the time the issue would be moot. > > On Sun, Mar 31, 2019, 4:36 AM Steven D'Aprano > wrote: > > On Sun, Mar 31, 2019 at 04:48:36PM +1100, Chris Angelico wrote: > > > Regardless of the method name, IMO the functions should accept a > tuple > > of test strings, as startswith/endwith do. That's a feature that > can't > > easily be spelled in a one-liner. (Though stacked suffixes shouldn't > > all be removed - "asdf.jpg.png".cutsuffix((".jpg", ".png")) should > > return "asdf.jpg", not "asdf".) > > There's a slight problem with that: what happens if more than one > suffix > matches? E.g. given: > > ? ? "musical".lcut(('al', 'ical')) > > should the suffix "al" be removed, leaving "music"? (First match wins.) > > Or should the suffix "ical" be removed, leaving "mus"? (Longest match > wins.) > > I don't think we can decide which is better, and I'm not keen on a > keyword argument to choose one or the other, so I suggest we stick to > the 90% solution of only supporting a single suffix. > > We can always revisit that in the future. 
> From mertz at gnosis.cx Sun Mar 31 12:17:44 2019 From: mertz at gnosis.cx (David Mertz) Date: Sun, 31 Mar 2019 12:17:44 -0400 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: <73791019-c93a-5638-05fe-96094df18fda@mrabarnett.plus.com> References: <669F88F5-3CB2-4B33-A451-621307BDBD0F@killingar.net> <23709.35761.793748.866930@turnbull.sk.tsukuba.ac.jp> <20190330013736.GU31406@ando.pearwood.info> <23711.45063.758826.321466@turnbull.sk.tsukuba.ac.jp> <20190331044323.GB6059@ando.pearwood.info> <20190331083522.GE31406@ando.pearwood.info> <73791019-c93a-5638-05fe-96094df18fda@mrabarnett.plus.com> Message-ID: On Sun, Mar 31, 2019 at 12:09 PM MRAB wrote: > > That said, I really like Brandt's ideas of expanding the signature of > > .lstrip/.rstrip instead. > > > > mystring.rstrip("abcd") # remove any of these single character suffixes > > It removes _all_ of the single character suffixes. > > > mystring.rstrip(('foo', 'bar', 'baz')) # remove any of these suffixes > > In keeping with the current behaviour, it would strip _all_ of these > suffixes. > Yes, the exact behavior would need to be documented. The existing case indeed removes *ALL* of the single letter suffixes. Clearly that behavior cannot be changed (nor would I want to, that behavior is useful). It's a decision about whether passing a tuple of substrings would remove all of them (perhaps repeatedly) or only one of them. And if only one, is it "longest wins" or "first wins." As I say, any choice of the semantics would be fine with me if it were documented... since this edge case will be uncommon in most uses (certainly in almost all of my uses). E.g. basename = fname.rstrip(('.jpg', '.png', 'gif')) This is rarely ambiguous, and does something concretely useful that I've coded many times. But what if: fname = 'silly.jpg.png.gif.png.jpg.gif.jpg' I'm honestly not sure what behavior would be useful most often for this oddball case. 
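[Editorial note: the two candidate semantics for a tuple argument can be pinned down with throwaway helpers — `cut_suffix_once` and `cut_suffix_all` are illustrative names here, not proposed API:]

```python
def cut_suffix_once(s, suffixes):
    """Remove the first suffix that matches, at most once."""
    for suf in suffixes:
        if suf and s.endswith(suf):
            return s[:-len(suf)]
    return s

def cut_suffix_all(s, suffixes):
    """Repeatedly remove matching suffixes until none remain."""
    trimmed = cut_suffix_once(s, suffixes)
    while trimmed != s:
        s, trimmed = trimmed, cut_suffix_once(trimmed, suffixes)
    return s

fname = 'silly.jpg.png.gif.png.jpg.gif.jpg'
print(cut_suffix_once(fname, ('.jpg', '.png', '.gif')))
# silly.jpg.png.gif.png.jpg.gif
print(cut_suffix_all(fname, ('.jpg', '.png', '.gif')))
# silly
```

For David's oddball filename, "remove once" peels a single extension while "remove all" strips the name down to the bare stem — which is exactly the ambiguity the thread is weighing.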
For the suffixes, I think "remove them all" is probably the best; that is consistent with thinking of the string passed in the existing signature of .rstrip() as an iterable of characters. But even if the decision was made to "only remove the single thing at end", I'd still find the enhancement useful. Sure, once in a while someone might trip over the choice of semantics in this edge case, but if it were documented, no big deal. -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th. -------------- next part -------------- An HTML attachment was scrubbed... URL: From python at mrabarnett.plus.com Sun Mar 31 13:31:47 2019 From: python at mrabarnett.plus.com (MRAB) Date: Sun, 31 Mar 2019 18:31:47 +0100 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: References: <669F88F5-3CB2-4B33-A451-621307BDBD0F@killingar.net> <23709.35761.793748.866930@turnbull.sk.tsukuba.ac.jp> <20190330013736.GU31406@ando.pearwood.info> <23711.45063.758826.321466@turnbull.sk.tsukuba.ac.jp> <20190331044323.GB6059@ando.pearwood.info> <20190331083522.GE31406@ando.pearwood.info> <73791019-c93a-5638-05fe-96094df18fda@mrabarnett.plus.com> Message-ID: On 2019-03-31 17:17, David Mertz wrote: > On Sun, Mar 31, 2019 at 12:09 PM MRAB > wrote: > > > That said, I really like Brandt's ideas of expanding the > signature of > > .lstrip/.rstrip instead. > > > > mystring.rstrip("abcd") # remove any of these single character > suffixes > > It removes _all_ of the single character suffixes. > > > mystring.rstrip(('foo', 'bar', 'baz')) # remove any of these > suffixes > > In keeping with the current behaviour, it would strip _all_ of these > suffixes. > > > Yes, the exact behavior would need to be documented.? 
The existing > case indeed removes *ALL* of the single letter suffixes.? Clearly that > behavior cannot be changed (nor would I want to, that behavior is useful). > > It's a decision about whether passing a tuple of substrings would > remove all of them (perhaps repeatedly) or only one of them.? And if > only one, is it "longest wins" or "first wins."? As I say, any choice > of the semantics would be fine with me if it were documented... since > this edge case will be uncommon in most uses (certainly in almost all > of my uses). > > E.g. > > ? ? basename = fname.rstrip(('.jpg', '.png', 'gif')) > > This is rarely ambiguous, and does something concretely useful that > I've coded many times.? But what if: > > ? ? fname = 'silly.jpg.png.gif.png.jpg.gif.jpg' > > I'm honestly not sure what behavior would be useful most often for > this oddball case.? For the suffixes, I think "remove them all" is > probably the best; that is consistent with thinking of the string > passed in the existing signature of .rstrip() as an iterable of > characters. > > But even if the decision was made to "only remove the single thing at > end", I'd still find the enhancement useful. Sure, once in a while > someone might trip over the choice of semantics in this edge case, but > if it were documented, no big deal. > Could/should we borrow from .replace, which accepts a replace count? From prometheus235 at gmail.com Sun Mar 31 13:59:52 2019 From: prometheus235 at gmail.com (Nick Timkovich) Date: Sun, 31 Mar 2019 12:59:52 -0500 Subject: [Python-ideas] Built-in parsing library In-Reply-To: References: Message-ID: What does it mean to be a universal parser? In my mind, to be universal you should be able to parse anything, so you'd need something as versatile as any Turing language, so one could stick with the one we already have (Python). I'm vaguely aware of levels of grammar (regular, context-free?, etc.), and how things like XML can't/shouldn't be parsed with regex [1]. 
Most protocols probably aren't *completely* free to do whatever and probably fit into some level of the hierarchy, what level would this putative parser perform at? Doing something like this from-scratch is a very tall order, are there candidate libraries that you'd want to see included in the stdlib? There is an argument for trying to "promote" a library that would security into the standard library over others that would just add features: trying to make the "one obvious way to do it" also the safe way. However, all things equal, more used libraries tend to be more secure. I think suggestions of this form need to pose a library that a) exists, b) is well used and regarded, c) stable (once in the the stdlib things are hard to change), and d) has maintainers that are amenable to inclusion. Nick [1]: https://stackoverflow.com/a/1732454/194586 On Sat, Mar 30, 2019 at 12:57 PM Nam Nguyen wrote: > Hello list, > > What do you think of a universal parsing library in the stdlib mainly for > use by other libraries in the stdlib? > > Through out the years we have had many issues with protocol parsing. Some > have even introduced security bugs. The main cause of these issues is the > use of simple regular expressions. > > Having a universal parsing library in the stdlib would help cut down these > issues. Such a library should be minimal yet encompassing, and whole parse > trees should be entirely expressible in code. I am thinking of combinatoric > parsing as the main candidate that fits this bill. > > What do you say? > > Thanks! > Nam > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bitsink at gmail.com Sun Mar 31 14:58:35 2019 From: bitsink at gmail.com (Nam Nguyen) Date: Sun, 31 Mar 2019 11:58:35 -0700 Subject: [Python-ideas] Built-in parsing library In-Reply-To: References: Message-ID: On Sun, Mar 31, 2019 at 11:00 AM Nick Timkovich wrote: > What does it mean to be a universal parser? In my mind, to be universal > you should be able to parse anything, so you'd need something as versatile > as any Turing language, > I'm not aware of, nor looking for, such Turing-complete parsers. Parsing algorithms such as Earley's, Generalized LL/LR, parser combinators, often are universal in the sense that they can work with all context-free grammars. I do not know if they are Turing complete. so one could stick with the one we already have (Python). > One of the reasons why the parser should be "coded" in and not declared (e.g. in the sense of eBNF). Combinatoric parsers are usually glued together with functions which can act based on the current parse tree. > I'm vaguely aware of levels of grammar (regular, context-free?, etc.), and > how things like XML can't/shouldn't be parsed with regex [1]. Most > protocols probably aren't *completely* free to do whatever and probably > fit into some level of the hierarchy, what level would this putative parser > perform at? > I'd say any context-free grammars should be supported. But given the immediate use case (to help with other libraries in the stdlib), this could start small (but complete and correct). I am talking about simple parsing needs such as email validation, HTTP cookie format, URL parsing, well-known date formats. In fact, I would expect this parsing library to only offer primitives like parse any character, parse a character matching a predicate, parse a string, etc. > > Doing something like this from-scratch is a very tall order, are there > candidate libraries that you'd want to see included in the stdlib? 
There is > an argument for trying to "promote" a library that would security into the > standard library over others that would just add features: trying to make > the "one obvious way to do it" also the safe way. However, all things > equal, more used libraries tend to be more secure. I think suggestions of > this form need to pose a library that a) exists, b) is well used and > regarded, c) stable (once in the the stdlib things are hard to change), and > d) has maintainers that are amenable to inclusion. > This email wasn't to promote or consider any library in particular. I'm more interested in finding out which way the consensus is with respect to the need. Implementation-wise, I'm thinking of this paper ~25 years ago and a very bare-bone pyparsing. http://www.cs.nott.ac.uk/~pszgmh/monparsing.pdf Cheers, Nam > > Nick > > [1]: https://stackoverflow.com/a/1732454/194586 > > On Sat, Mar 30, 2019 at 12:57 PM Nam Nguyen wrote: > >> Hello list, >> >> What do you think of a universal parsing library in the stdlib mainly for >> use by other libraries in the stdlib? >> >> Through out the years we have had many issues with protocol parsing. Some >> have even introduced security bugs. The main cause of these issues is the >> use of simple regular expressions. >> >> Having a universal parsing library in the stdlib would help cut down >> these issues. Such a library should be minimal yet encompassing, and whole >> parse trees should be entirely expressible in code. I am thinking of >> combinatoric parsing as the main candidate that fits this bill. >> >> What do you say? >> >> Thanks! >> Nam >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mertz at gnosis.cx Sun Mar 31 15:09:35 2019 From: mertz at gnosis.cx (David Mertz) Date: Sun, 31 Mar 2019 15:09:35 -0400 Subject: [Python-ideas] Built-in parsing library In-Reply-To: References: Message-ID: There are about a half dozen widely used parsing libraries for Python. Each one of them takes a dramatically different approach to the defining a grammar. Each one has been debugged for over a decade. While I can imagine proposing one for inclusion in the standard library, you'd have to choose one (or write a new one) and explain why that one is better for everyone (or at least a better starting point) than all the others are. You're also have to explain why it needs to be in the standard library rather than installed by 'pip install someparser'. On Sat, Mar 30, 2019, 1:58 PM Nam Nguyen wrote: > Hello list, > > What do you think of a universal parsing library in the stdlib mainly for > use by other libraries in the stdlib? > > Through out the years we have had many issues with protocol parsing. Some > have even introduced security bugs. The main cause of these issues is the > use of simple regular expressions. > > Having a universal parsing library in the stdlib would help cut down these > issues. Such a library should be minimal yet encompassing, and whole parse > trees should be entirely expressible in code. I am thinking of combinatoric > parsing as the main candidate that fits this bill. > > What do you say? > > Thanks! > Nam > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mertz at gnosis.cx Sun Mar 31 15:12:49 2019 From: mertz at gnosis.cx (David Mertz) Date: Sun, 31 Mar 2019 15:12:49 -0400 Subject: [Python-ideas] Built-in parsing library In-Reply-To: References: Message-ID: I just found this nice summary. It's not complete, but it looks well written. https://tomassetti.me/parsing-in-python/ On Sun, Mar 31, 2019, 3:09 PM David Mertz wrote: > There are about a half dozen widely used parsing libraries for Python. > Each one of them takes a dramatically different approach to the defining a > grammar. Each one has been debugged for over a decade. > > While I can imagine proposing one for inclusion in the standard library, > you'd have to choose one (or write a new one) and explain why that one is > better for everyone (or at least a better starting point) than all the > others are. You're also have to explain why it needs to be in the standard > library rather than installed by 'pip install someparser'. > > On Sat, Mar 30, 2019, 1:58 PM Nam Nguyen wrote: > >> Hello list, >> >> What do you think of a universal parsing library in the stdlib mainly for >> use by other libraries in the stdlib? >> >> Through out the years we have had many issues with protocol parsing. Some >> have even introduced security bugs. The main cause of these issues is the >> use of simple regular expressions. >> >> Having a universal parsing library in the stdlib would help cut down >> these issues. Such a library should be minimal yet encompassing, and whole >> parse trees should be entirely expressible in code. I am thinking of >> combinatoric parsing as the main candidate that fits this bill. >> >> What do you say? >> >> Thanks! >> Nam >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From pythonchb at gmail.com Sun Mar 31 15:32:25 2019 From: pythonchb at gmail.com (Christopher Barker) Date: Sun, 31 Mar 2019 12:32:25 -0700 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: <07170fd4-0195-f2d4-d2d7-77f7272e263b@potatochowder.com> References: <1E2D83BC-326E-4278-A46C-7FE4F7FE2560@getmailspring.com> <669F88F5-3CB2-4B33-A451-621307BDBD0F@killingar.net> <23709.35761.793748.866930@turnbull.sk.tsukuba.ac.jp> <20190330013736.GU31406@ando.pearwood.info> <23711.45063.758826.321466@turnbull.sk.tsukuba.ac.jp> <20190331044323.GB6059@ando.pearwood.info> <07170fd4-0195-f2d4-d2d7-77f7272e263b@potatochowder.com> Message-ID: > without_prefix > without_suffix > > They're a little longer, but IMO "without" helps > reenforce the immutability of the underlying string. None > of these functions actually remove part of the original > string, but rather they return a new string that's the > original string without some piece of it. Which is the case for EVERY string method? we do need to get all wordy for just these two. -CHB -- Christopher Barker, PhD Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From boxed at killingar.net Sun Mar 31 15:38:59 2019 From: boxed at killingar.net (=?utf-8?Q?Anders_Hovm=C3=B6ller?=) Date: Sun, 31 Mar 2019 21:38:59 +0200 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: References: <1E2D83BC-326E-4278-A46C-7FE4F7FE2560@getmailspring.com> <669F88F5-3CB2-4B33-A451-621307BDBD0F@killingar.net> <23709.35761.793748.866930@turnbull.sk.tsukuba.ac.jp> <20190330013736.GU31406@ando.pearwood.info> <23711.45063.758826.321466@turnbull.sk.tsukuba.ac.jp> <20190331044323.GB6059@ando.pearwood.info> <07170fd4-0195-f2d4-d2d7-77f7272e263b@potatochowder.com> Message-ID: <33F95C62-97E6-42BF-BB2D-3166ECC1C2D7@killingar.net> > On 31 Mar 2019, at 21:32, Christopher Barker wrote: > > > without_prefix > without_suffix > > They're a little longer, but IMO "without" helps > reenforce the immutability of the underlying string. None > of these functions actually remove part of the original > string, but rather they return a new string that's the > original string without some piece of it. > > Which is the case for EVERY string method? we do need to get all wordy for just these two. Agreed! Let?s not remake the mistakes of the past in order to try to keep consistency. / Anders -------------- next part -------------- An HTML attachment was scrubbed... URL: From cs at cskk.id.au Sun Mar 31 17:47:10 2019 From: cs at cskk.id.au (Cameron Simpson) Date: Mon, 1 Apr 2019 08:47:10 +1100 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: <20190331083522.GE31406@ando.pearwood.info> References: <20190331083522.GE31406@ando.pearwood.info> Message-ID: <20190331214710.GA77434@cskk.homeip.net> On 31Mar2019 19:35, Steven D'Aprano wrote: >On Sun, Mar 31, 2019 at 04:48:36PM +1100, Chris Angelico wrote: >> Regardless of the method name, IMO the functions should accept a >> tuple of test strings, as startswith/endwith do. I did not know that! >> That's a feature that can't >> easily be spelled in a one-liner. 
(Though stacked suffixes shouldn't >> all be removed - "asdf.jpg.png".cutsuffix((".jpg", ".png")) should >> return "asdf.jpg", not "asdf".) > >There's a slight problem with that: what happens if more than one suffix >matches? E.g. given: > > "musical".lcut(('al', 'ical')) > >should the suffix "al" be removed, leaving "music"? (First match wins.) > >Or should the suffix "ical" be removed, leaving "mus"? (Longest match >wins.) > >I don't think we can decide which is better, and I'm not keen on a >keyword argument to choose one or the other, so I suggest we stick to >the 90% solution of only supporting a single suffix. This is easy to decide: first match wins. That is (a) simple and (b) very predictable for users. You can easily get longest-match behaviour from this by sorting the suffixes. The reverse does not hold. If anyone opposes my reasoning I can threaten them with my partner's story about Netscape proxy, where match rules rules were not processed in the order in the config file but by longest regexp pattern: yes, the longest regexp itself, nothing to do with what it matched. Config stupidity ensues. Do things in the order supplied: that way the user has control. Doing things by length is imposing policy which can't be circumvented. 
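[Editorial note: Cameron's argument — that first-match-wins lets the caller recover longest-match-wins, but not the reverse — checks out with a quick sketch; `cut_suffix` is a hypothetical helper matching the semantics under discussion:]

```python
def cut_suffix(s, suffixes):
    # First match wins, in exactly the order the caller supplies.
    for suf in suffixes:
        if suf and s.endswith(suf):
            return s[:-len(suf)]
    return s

print(cut_suffix("musical", ("al", "ical")))  # music  (first match: 'al')

# Longest-match is just first-match over suffixes sorted by length:
longest_first = sorted(("al", "ical"), key=len, reverse=True)
print(cut_suffix("musical", longest_first))   # mus    (matches 'ical')
```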
Cheers, Cameron Simpson From greg.ewing at canterbury.ac.nz Sun Mar 31 19:02:25 2019 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 01 Apr 2019 12:02:25 +1300 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: <07170fd4-0195-f2d4-d2d7-77f7272e263b@potatochowder.com> References: <1E2D83BC-326E-4278-A46C-7FE4F7FE2560@getmailspring.com> <669F88F5-3CB2-4B33-A451-621307BDBD0F@killingar.net> <23709.35761.793748.866930@turnbull.sk.tsukuba.ac.jp> <20190330013736.GU31406@ando.pearwood.info> <23711.45063.758826.321466@turnbull.sk.tsukuba.ac.jp> <20190331044323.GB6059@ando.pearwood.info> <07170fd4-0195-f2d4-d2d7-77f7272e263b@potatochowder.com> Message-ID: <5CA14701.7090504@canterbury.ac.nz> Dan Sommers wrote: > without_prefix > without_suffix > > They're a little longer, but IMO "without" helps > reenforce the immutability of the underlying string. We don't seem to worry about that distinction for other string methods, such as lstrip and rstrip. -- Greg From steve at pearwood.info Sun Mar 31 19:59:50 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 1 Apr 2019 10:59:50 +1100 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: <5CA14701.7090504@canterbury.ac.nz> References: <23709.35761.793748.866930@turnbull.sk.tsukuba.ac.jp> <20190330013736.GU31406@ando.pearwood.info> <23711.45063.758826.321466@turnbull.sk.tsukuba.ac.jp> <20190331044323.GB6059@ando.pearwood.info> <07170fd4-0195-f2d4-d2d7-77f7272e263b@potatochowder.com> <5CA14701.7090504@canterbury.ac.nz> Message-ID: <20190331235950.GJ31406@ando.pearwood.info> On Mon, Apr 01, 2019 at 12:02:25PM +1300, Greg Ewing wrote: > Dan Sommers wrote: > > >without_prefix > >without_suffix > > > >They're a little longer, but IMO "without" helps > >reenforce the immutability of the underlying string. > > We don't seem to worry about that distinction for other > string methods, such as lstrip and rstrip. Perhaps we ought to. 
In the spirit of today's date, let me propose renaming existing string methods to be more explicit, e.g.: str.new_string_in_uppercase str.new_string_with_substrings_replaced str.new_string_filled_to_the_given_length_with_zeroes_on_the_left str.new_string_with_character_translations_not_natural_language_translations The best thing is that there will no longer be any confusion as to whether you are looking at a Unicode string or a byte-string: a = a.new_string_trimmed_on_the_left() a = a.new_bytes_trimmed_on_the_left() *wink* -- Steven From rosuav at gmail.com Sun Mar 31 20:08:38 2019 From: rosuav at gmail.com (Chris Angelico) Date: Mon, 1 Apr 2019 11:08:38 +1100 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: <20190331235950.GJ31406@ando.pearwood.info> References: <23709.35761.793748.866930@turnbull.sk.tsukuba.ac.jp> <20190330013736.GU31406@ando.pearwood.info> <23711.45063.758826.321466@turnbull.sk.tsukuba.ac.jp> <20190331044323.GB6059@ando.pearwood.info> <07170fd4-0195-f2d4-d2d7-77f7272e263b@potatochowder.com> <5CA14701.7090504@canterbury.ac.nz> <20190331235950.GJ31406@ando.pearwood.info> Message-ID: On Mon, Apr 1, 2019 at 11:00 AM Steven D'Aprano wrote: > > On Mon, Apr 01, 2019 at 12:02:25PM +1300, Greg Ewing wrote: > > Dan Sommers wrote: > > > > >without_prefix > > >without_suffix > > > > > >They're a little longer, but IMO "without" helps > > >reenforce the immutability of the underlying string. > > > > We don't seem to worry about that distinction for other > > string methods, such as lstrip and rstrip. > > Perhaps we ought to. In the spirit of today's date, let me propose > renaming existing string methods to be more explicit, e.g.: > > str.new_string_in_uppercase > str.new_string_with_substrings_replaced > str.new_string_filled_to_the_given_length_with_zeroes_on_the_left > str.new_string_with_character_translations_not_natural_language_translations Excellent! Love it. Add that to the feature list for Python 2.8. 
But for those of us still discussing the 3.x line, do we need to put together a PEP about this? There seems to be a lot of bikeshedding, a lot of broad support, and a small amount of "bah, don't need it, use regex/long expression/etc". Who wants to champion the proposal? Do we have a core dev who's interested in sponsoring it? ChrisA From steve at pearwood.info Sun Mar 31 20:08:37 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 1 Apr 2019 11:08:37 +1100 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: <7D84D131-65B6-4EF7-9C43-51957F9DFAA9@getmailspring.com> References: <7D84D131-65B6-4EF7-9C43-51957F9DFAA9@getmailspring.com> Message-ID: <20190401000837.GK31406@ando.pearwood.info> Thank you, this is a simple, unambiguous proposal which meets a real need and will help prevent a lot of wasted developer time misusing [lr]strip and then reporting it as a bug: remove a single prefix or suffix. This is a useful string primitive provided by other modern languages and libraries, including Go, Ruby, Kotlin, and the Apache StringUtils Java library: https://golang.org/pkg/strings/#TrimPrefix https://golang.org/pkg/strings/#TrimSuffix https://ruby-doc.org/core-2.5.1/String.html#method-i-delete_prefix https://ruby-doc.org/core-2.5.1/String.html#method-i-delete_suffix https://kotlinlang.org/api/latest/jvm/stdlib/kotlin.text/remove-prefix.html https://kotlinlang.org/api/latest/jvm/stdlib/kotlin.text/remove-suffix.html https://commons.apache.org/proper/commons-lang/apidocs/org/apache/commons/lang3/StringUtils.html#removeStart-java.la$ https://commons.apache.org/proper/commons-lang/apidocs/org/apache/commons/lang3/StringUtils.html#removeEnd-java.lang$ Regarding later proposals to add support for multiple affixes, to recursively delete the affix repeatedly, and to take an additional argument to limit how many affixes will be removed: YAGNI. Let's not over-engineer this to be something which is ambigious and complex. 
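[Editorial note: the single-affix primitive Steven describes — the analogue of Go's strings.TrimPrefix/TrimSuffix and Ruby's delete_prefix/delete_suffix linked above — fits in a couple of lines; the names below are placeholders, not a settled API:]

```python
def cut_prefix(s, prefix):
    """Return s without prefix if it starts with it, else s unchanged."""
    return s[len(prefix):] if prefix and s.startswith(prefix) else s

def cut_suffix(s, suffix):
    """Return s without suffix if it ends with it, else s unchanged."""
    return s[:-len(suffix)] if suffix and s.endswith(suffix) else s

print(cut_prefix("test_timer", "test_"))  # timer
print(cut_suffix("musical", "ical"))      # mus
print(cut_suffix("musical", ".txt"))      # musical  (no match: unchanged)
```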
We can add more complexity later, if and when practical experience suggests: (1) that multiple affixes actually is useful in practice, not just a "Wouldn't It Be Cool???" feature; and (2) a consensus as to how to handle ambiguous cases. Until then, let's keep it simple: methods to remove a *single* prefix or suffix, precisely as given. Anything else is YAGNI and is best left for the individual programmer. -- Steven From mertz at gnosis.cx Sun Mar 31 20:23:05 2019 From: mertz at gnosis.cx (David Mertz) Date: Sun, 31 Mar 2019 20:23:05 -0400 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: <20190401000837.GK31406@ando.pearwood.info> References: <7D84D131-65B6-4EF7-9C43-51957F9DFAA9@getmailspring.com> <20190401000837.GK31406@ando.pearwood.info> Message-ID: On Sun, Mar 31, 2019, 8:11 PM Steven D'Aprano wrote: > Regarding later proposals to add support for multiple affixes, to > recursively delete the affix repeatedly, and to take an additional > argument to limit how many affixes will be removed: YAGNI. > That's simply not true, and I think it's clearly illustrated by the example I gave a few times. Not just conceivably, but FREQUENTLY I write code to accomplish the effect of the suggested: basename = fname.rstrip(('.jpg', '.gif', '.png')) I probably do this MORE OFTEN than removing a single suffix. Obviously I *can* achieve this result now. I probably take a slightly different approach as the mood strikes me, with three or four different styles I've used. Actually, I've probably never done it in a way that wouldn't be subtly wrong for cases like 'base.jpg.gif.png.jpg.gif'. > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From steve at pearwood.info Sun Mar 31 21:34:27 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 1 Apr 2019 12:34:27 +1100 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: References: <7D84D131-65B6-4EF7-9C43-51957F9DFAA9@getmailspring.com> <20190401000837.GK31406@ando.pearwood.info> Message-ID: <20190401013425.GC6059@ando.pearwood.info> On Sun, Mar 31, 2019 at 08:23:05PM -0400, David Mertz wrote: > On Sun, Mar 31, 2019, 8:11 PM Steven D'Aprano wrote: > > > Regarding later proposals to add support for multiple affixes, to > > recursively delete the affix repeatedly, and to take an additional > > argument to limit how many affixes will be removed: YAGNI. > > > > That's simply not true, and I think it's clearly illustrated by the example > I gave a few times. Not just conceivably, but FREQUENTLY I write code to > accomplish the effect of the suggested: > > basename = fname.rstrip(('.jpg', '.gif', '.png')) > > I probably do this MORE OFTEN than removing a single suffix. Okay. Yesterday, you stated that you didn't care what the behaviour was for the multiple affix case. You made it clear that "any" semantics would be okay with you so long as it was documented. You seemed to feel so strongly about your indifference that you mentioned it in two separate emails. That doesn't sound like someone who has a clear use-case in mind. If you're doing this frequently, then surely one of the following two alternatives apply: (1) One specific behaviour makes sense for all or a majority of your use-cases, in which case you would prefer that behaviour rather than something that you can't use. (2) Or there is no single useful behaviour that you want, perhaps all or a majority of your use-cases are different, and you'll usually need to write your own helper function to suit your own usage, no matter what the builtin behaviour is. Hence you don't care what the builtin behaviour is. 
Since you have no preferred behaviour, either you don't do this often enough to care (but above you say differently), or you are going to have to write your own helpers because the behaviour you need won't match the behaviour of the builtin. And you clearly don't mind this, because you stated twice that you don't care what the builtin behaviour is. So why rush to handle the multiple argument case? "YAGNI" is a misnomer, because it doesn't actually mean "you aren't (ever) going to need it". It means (generic) you don't need it *now*, but when you do, you can come back and revisit the design with concrete use-cases in mind. That's all I'm saying. For 29 years, we've done without this string primitive, and as a consequence the forums are full of examples of people misusing strip and getting it wrong. There's a clear case for the single argument version, and fixing that is the 90% solution. In comparison, we've been discussing this multiple affix feature for, what, a week? Lacking a good set of semantics for removing multiple affixes at once, we shouldn't rush to guess what people want. You don't even know what behaviour YOU want, let alone what the community as a whole needs. You won't be any worse off than you are now. You'll probably be better off, because you can use the single-affix version as the basic primitive, and build on top of that, instead of the incorrect version you currently use in an ad hoc manner: basename = fname.split(".ext")[0] # replace with fname.cut_suffix(".ext") Others have already pointed out why the split version is incorrect. For the use-case of stripping a single file extension out of a set of such extensions, while leaving all others, there's an obvious solution: if fname.endswith(('.jpg', '.png', '.gif')): basename = os.path.splitext(fname)[0] else: # Any other extension stays with the base. # (Presumably to be handled separately?) 
basename = fname But a more general solution needs to decide on two issues: - given two affixes where one is an affix of the other, which wins? e.g. "abcd".cut_prefix(("a", "ab")) # should this return "bcd" or "cd"? - once you remove an affix, should you stop processing or continue? "ab".cut_prefix(("a", "b")) # should this return "b" or ""? The startswith and endswith methods don't suffer from this problem, for obvious reasons. We shouldn't add a problematic, ambiguous feature just for consistency with methods where it is not problematic or ambiguous. I posted links to prior art. Unless I missed something, not one of those languages or libraries supports multiple affixes in the one call. Don't let the perfect be the enemy of the good. In this case, a 90% solution will let us fix real problems and meet real needs, and we can always revisit the multiple affix case once we have more experience and have time to build a consensus based on actual use-cases. -- Steven From mertz at gnosis.cx Sun Mar 31 22:58:24 2019 From: mertz at gnosis.cx (David Mertz) Date: Sun, 31 Mar 2019 22:58:24 -0400 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: <20190401013425.GC6059@ando.pearwood.info> References: <7D84D131-65B6-4EF7-9C43-51957F9DFAA9@getmailspring.com> <20190401000837.GK31406@ando.pearwood.info> <20190401013425.GC6059@ando.pearwood.info> Message-ID: On Sun, Mar 31, 2019, 9:35 PM Steven D'Aprano wrote: > > That's simply not true, and I think it's clearly illustrated by the > example I gave a few times. Not just conceivably, but FREQUENTLY I write > code to accomplish the effect of the suggested: > > > > basename = fname.rstrip(('.jpg', '.gif', '.png')) > > > > I probably do this MORE OFTEN than removing a single suffix. > > Okay. > > Yesterday, you stated that you didn't care what the behaviour was for the > multiple affix case. You made it clear that "any" semantics would be okay > with you so long as it was documented. 
You seemed to feel so strongly about > your indifference that you mentioned it in two separate emails. > Yes. Because the multiple affix is an edge case that will rarely affect any of my code. I.e. I don't care much when a single string has multiple candidate affixes, because that's just not a common situation. That doesn't mean I'm indifferent to the core purpose that I need frequently. Any of the several possible behaviors in the edge case will not affect my desired usage whatsoever. That doesn't sound like someone who has a clear use-case in mind. If you're > doing this frequently, then surely one of the following two alternatives > apply: > I don't think I've ever written code that cares about the edge case you focus on. Ok, I guess technically the code I've written is all buggy in the sense that it would behave in a manner I haven't thought through when presented with weird input. Perhaps I should always have been more careful about those edges. There simply is no "majority of the time" for a situation I've never specifically coded for. The rest gets more and more sophistical. I'm sure most people here have written code similar to this (maybe structured differently, but same purpose): for fname in filenames: basename, ext = fname.rsplit('.', 1) if ext in {'jpg', 'gif', 'png'}: do_stuff(basename) In all the times I've written things close to that, I've never thought about files named 'silly.jpg.gif.png.gif.jpg'. The sophistry is insistently asking "but what about...?" of this edge case. For 29 years, we've done without this string primitive, and as a > consequence the forums are full of examples of people misusing strip and > getting it wrong. It's interesting that you keep raising this error. I've made a whole lot of silly mistakes in Python (and other languages). I have never for a moment been tempted to think .rstrip() would remove a suffix rather than a character class. 
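[Editor's note: the [lr]strip confusion both posters are referring to is easy to demonstrate. rstrip's argument is a *set of characters* to strip, not a suffix, so it only looks right by accident.]

```python
# rstrip strips any trailing run of characters drawn from {'.', 'j', 'p', 'g'};
# it does not remove the literal suffix ".jpg".
print("sponge.jpg".rstrip(".jpg"))   # -> sponge   (right, but only by luck)
print("jogging.jpg".rstrip(".jpg"))  # -> joggin   (the basename's trailing 'g'
                                     #    is eaten too)
```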
I did write the book Text Processing in Python a very long time ago, so I've thought a bit about text processing in Python. Maybe it's just that I'm comfortable enough with regexen that thinking of a character class doesn't feel strange to me. There's a clear case for the single argument version, and fixing that is > the 90% solution. > I think there's very little case for a single argument version. At best, it's a 10% solution. Lacking a good set of semantics for removing multiple affixes at once, we > shouldn't rush to guess what people want. You don't even know what > behaviour YOU want, let alone what the community as a whole needs. > This is both dumb and dishonest. There are basically two choices, both completely clear. I think the more obvious one is to treat several prefixes or suffixes as substring class, much as .[rl]strip() does character class. But another choice indeed is to remove at most one of the affixes. I think that's a little bit less good for the edge case. But it would be fine also... and as I keep writing, the difference would almost always be moot, it just needs to be documented. > the use-case of stripping a single file extension out of a set of > such extensions, while leaving all others, there's an obvious solution: > > if fname.endswith(('.jpg', '.png', '.gif')): > basename = os.path.splitext(fname)[0] > I should probably use os.path.splitext() more than I do. But that's just an example. Another is, e.g. 'if url.startswith(('http://', 'sftp://', 's3://')): ...'. And lots of similar things that aren't addressed by os.path.splitext(). E.g. 'if logline.startswith(('WARNING', 'ERROR')): ...' I posted links to prior art. Unless I missed something, not one of those > languages or libraries supports multiple affixes in the one call. > Also, none of those languages support the amazingly useful signature of str.startswith(tuple). Well, they do in the sense they support regexen. But not as a standard method or function on strings. 
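[Editor's note: the tuple form David wants can already be built on top of `str.endswith`. The sketch below removes at most one suffix, first match in tuple order -- one of the candidate semantics in this thread, not a settled choice, and `cut_suffix_any` is a made-up name, not a stdlib API.]

```python
def cut_suffix_any(s, suffixes):
    """Remove the first matching suffix (tuple order), at most once."""
    for suf in suffixes:
        if suf and s.endswith(suf):
            return s[:-len(suf)]
    return s


print(cut_suffix_any("photo.png", ('.jpg', '.gif', '.png')))  # -> photo
print(cut_suffix_any("notes.txt", ('.jpg', '.gif', '.png')))  # -> notes.txt
# At most one removal, so stacked extensions do not collapse:
print(cut_suffix_any("silly.jpg.gif", ('.jpg', '.gif', '.png')))  # -> silly.jpg
```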
I don't even know if PHP with its 5000 string functions had this great convenience. -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Sun Mar 31 23:29:44 2019 From: rosuav at gmail.com (Chris Angelico) Date: Mon, 1 Apr 2019 14:29:44 +1100 Subject: [Python-ideas] New explicit methods to trim strings In-Reply-To: <20190401013425.GC6059@ando.pearwood.info> References: <7D84D131-65B6-4EF7-9C43-51957F9DFAA9@getmailspring.com> <20190401000837.GK31406@ando.pearwood.info> <20190401013425.GC6059@ando.pearwood.info> Message-ID: On Mon, Apr 1, 2019 at 12:35 PM Steven D'Aprano wrote: > > On Sun, Mar 31, 2019 at 08:23:05PM -0400, David Mertz wrote: > > On Sun, Mar 31, 2019, 8:11 PM Steven D'Aprano wrote: > > > > > Regarding later proposals to add support for multiple affixes, to > > > recursively delete the affix repeatedly, and to take an additional > > > argument to limit how many affixes will be removed: YAGNI. > > > > > > > That's simply not true, and I think it's clearly illustrated by the example > > I gave a few times. Not just conceivably, but FREQUENTLY I write code to > > accomplish the effect of the suggested: > > > > basename = fname.rstrip(('.jpg', '.gif', '.png')) > > > > I probably do this MORE OFTEN than removing a single suffix. > > Okay. > > Yesterday, you stated that you didn't care what the behaviour was for > the multiple affix case. You made it clear that "any" semantics would be > okay with you so long as it was documented. You seemed to feel so > strongly about your indifference that you mentioned it in two separate > emails. 
The multiple affix case has exactly two forms: 1) Tearing multiple affixes off (eg stripping "asdf.jpg.png" down to just "asdf"), which most people are saying "no, don't do that, it doesn't make sense and isn't needed" 2) Removing one of several options, which implies that one option is a strict subpiece of another (eg stripping off "test" and "st") If anyone is advocating for #1, I would agree with saying YAGNI. But #2 is an extremely unlikely edge case, and whatever semantics are chosen for it, *normal* usage will not be affected. In the example that David gave, there is no way for "first wins" or "longest wins" or anything like that to make any difference, because it's impossible for there to be multiple candidates. Since this would be going into the language as a feature, the semantics have to be clearly defined (with "first match wins", "longest match wins", and "raise exception" being probably the most plausible options), but most of us aren't going to care which one is picked. > That doesn't sound like someone who has a clear use-case in mind. If > you're doing this frequently, then surely one of the following two > alternatives apply: > > (1) One specific behaviour makes sense for all or a majority of your > use-cases, in which case you would prefer that behaviour rather than > something that you can't use. > > (2) Or there is no single useful behaviour that you want, perhaps all or > a majority of your use-cases are different, and you'll usually need to > write your own helper function to suit your own usage, no matter what > the builtin behaviour is. Hence you don't care what the builtin > behaviour is. Or all the behaviours actually do the same thing anyway. > Lacking a good set of semantics for removing multiple affixes at once, > we shouldn't rush to guess what people want. You don't even know what > behaviour YOU want, let alone what the community as a whole needs. We're basically debating collision semantics here. 
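[Editor's note: the collision semantics being debated can be pinned down with two throwaway helpers. Both names are hypothetical; neither semantic is what the thread has agreed on.]

```python
def cut_prefix_first(s, prefixes):
    """Remove at most one prefix; first match in tuple order wins."""
    for p in prefixes:
        if p and s.startswith(p):
            return s[len(p):]
    return s


def cut_prefix_longest(s, prefixes):
    """Remove at most one prefix; the longest match wins, order ignored."""
    for p in sorted(prefixes, key=len, reverse=True):
        if p and s.startswith(p):
            return s[len(p):]
    return s


# Steven's ambiguous case from earlier in the thread:
print(cut_prefix_first("abcd", ("a", "ab")))    # -> bcd
print(cut_prefix_longest("abcd", ("a", "ab")))  # -> cd
```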
It's on par with asking "how should statistics.mode() cope with multiple modes?". Should the introduction of statistics.mode() have been delayed pending a thorough review of use-cases, or is it okay to make it do what most people want, and then be prepared to revisit its edge-case handling? (For those who don't know, mode() was changed in 3.8 to return the first mode encountered, in contrast to previous behaviour where it would raise an exception.) > For the use-case of stripping a single file extension out of a set of > such extensions, while leaving all others, there's an obvious solution: > > if fname.endswith(('.jpg', '.png', '.gif')): > basename = os.path.splitext(fname)[0] > else: > # Any other extension stays with the base. > # (Presumably to be handled separately?) > basename = fname Sure, but I've often wanted to do something like "strip off a prefix of http:// or https://", or something else that doesn't have a semantic that's known to the stdlib. Also, this is still fairly verbose, and a lot of people are going to reach for a regex, just because it can be done in one line of code. > I posted links to prior art. Unless I missed something, not one of those > languages or libraries supports multiple affixes in the one call. And they don't support multiple affixes in startswith/endswith either, but we're very happy to have that in Python. The parallel is strong. You ask if it has a prefix, you remove the prefix. You ask if it has multiple prefixes, you remove any one of those prefixes. We don't have to worry about edge cases that are unlikely to come up in real-world code, just as long as the semantics ARE defined somewhere. ChrisA
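[Editor's note: Chris's http/https case shows the asymmetry nicely -- the membership test is a single startswith call, but the removal still needs a hand-rolled loop. A sketch, with a hypothetical helper name:]

```python
def strip_scheme(url, schemes=("http://", "https://")):
    """Remove a known URL scheme prefix if present (illustrative only)."""
    for scheme in schemes:
        if url.startswith(scheme):
            return url[len(scheme):]
    return url


print(strip_scheme("https://example.com/page"))  # -> example.com/page
print(strip_scheme("ftp://example.com"))         # -> ftp://example.com (unchanged)
```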