From greg.ewing at canterbury.ac.nz  Fri Jun  1 01:54:11 2018
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 01 Jun 2018 17:54:11 +1200
Subject: [Python-ideas] exception instantiation philosophy and practice
 [was: Let try-except check the exception instance]
In-Reply-To: <5B100B96.6050900@stoneleaf.us>
References: <5B100B96.6050900@stoneleaf.us>
Message-ID: <5B10DF83.9080404@canterbury.ac.nz>

Ethan Furman wrote:
> Why is this?  Doesn't the exception have to be instantiated at some
> point, even if just to print to stderr?

If it gets caught by an except clause without an else clause,
in theory there's no need to instantiate it.

However, Python doesn't currently seem to take advantage of that:

>>> class E(Exception):
...     def __init__(self, *args):
...         Exception.__init__(self, *args)
...         print("E got instantiated!")
...
>>> try:
...     print("Trying")
...     raise E
... except E:
...     print("Caught an E")
...
Trying
E got instantiated!
Caught an E

--
Greg

From solipsis at pitrou.net  Fri Jun  1 04:10:34 2018
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 1 Jun 2018 10:10:34 +0200
Subject: [Python-ideas] Let try-except check the exception instance
References:
Message-ID: <20180601101034.3607cbf3@fsol>

On Thu, 31 May 2018 14:00:24 -0400
Alexander Belopolsky wrote:
>
> Is this really true?  Consider the following simple code
>
> class E(Exception):
>     def __init__(self):
>         print("instantiated")
>
> try:
>     raise E
> except E:
>     pass
>
> Is it truly necessary to instantiate E() in this case?  Yet when I run
> it, I see "instantiated" printed on the console.

I don't think it's truly necessary, but there's enough complication
nowadays in the exception subsystem (e.g. with causes and contexts)
that at some point we (perhaps I) decided it made things less hairy to
always instantiate it in an "except" clause.

Let me stress, though: this happens when catching exceptions in
*Python*.  When in C you call PyErr_ExceptionMatches, the exception
should not get instantiated.

Regards

Antoine.

From ericfahlgren at gmail.com  Fri Jun  1 09:47:45 2018
From: ericfahlgren at gmail.com (Eric Fahlgren)
Date: Fri, 1 Jun 2018 06:47:45 -0700
Subject: [Python-ideas] exception instantiation philosophy and practice
 [was: Let try-except check the exception instance]
In-Reply-To: <5B10DF83.9080404@canterbury.ac.nz>
References: <5B100B96.6050900@stoneleaf.us>
 <5B10DF83.9080404@canterbury.ac.nz>
Message-ID:

On Thu, May 31, 2018 at 10:55 PM Greg Ewing wrote:

> Ethan Furman wrote:
>
> > Why is this?  Doesn't the exception have to be instantiated at some
> > point, even if just to print to stderr?
>
> If it gets caught by an except clause without an else clause,
> in theory there's no need to instantiate it.
>
> However, Python doesn't currently seem to take advantage of
> that:
>
> >>> class E(Exception):
> ...     def __init__(self, *args):
> ...         Exception.__init__(self, *args)
> ...         print("E got instantiated!")
> ...
> >>> try:
> ...     print("Trying")
> ...     raise E
> ... except E:
> ...     print("Caught an E")
> ...
> Trying
> E got instantiated!
> Caught an E

I don't think it's possible to avoid instantiating the exception at the
user level; what would sys.exc_info() do about its second return value?

I believe the only cases where it's possible to avoid instantiation are
inside the interpreter itself, where the exception never propagates up
to user visibility.
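For concreteness, here is a stdlib-only sketch of what that second
return value is once an except block is entered (the asserts are just
illustrations, not part of any proposal):

    import sys

    try:
        raise ValueError("boom")
    except ValueError:
        exc_type, exc_value, tb = sys.exc_info()
        assert exc_type is ValueError
        assert exc_value.args == ("boom",)  # the instantiated exception object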
From greg.ewing at canterbury.ac.nz  Fri Jun  1 18:41:27 2018
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 02 Jun 2018 10:41:27 +1200
Subject: [Python-ideas] exception instantiation philosophy and practice
 [was: Let try-except check the exception instance]
In-Reply-To:
References: <5B100B96.6050900@stoneleaf.us>
 <5B10DF83.9080404@canterbury.ac.nz>
Message-ID: <5B11CB97.8030808@canterbury.ac.nz>

Eric Fahlgren wrote:
> I don't think it's possible to avoid instantiating the exception at the
> user level; what would sys.exc_info() do about its second return value?

The exception could be instantiated if and when sys.exc_info()
gets called.

--
Greg

From paal.drange at gmail.com  Sat Jun  2 08:21:50 2018
From: paal.drange at gmail.com (=?UTF-8?B?UMOlbCBHcsO4bsOlcyBEcmFuZ2U=?=)
Date: Sat, 2 Jun 2018 14:21:50 +0200
Subject: [Python-ideas] datetime.timedelta literals
Message-ID:

Elevator pitch:

    (2.5h - 14min + 9300ms).total_seconds()
    # 8169.3

    from datetime import datetime as dt
    start = dt.now()
    end = dt.now()
    (end-start) < 5s
    # True

chrono::duration:

In C++ 14 the std::chrono::duration was introduced, which corresponds
somewhat to datetime.timedelta.

C++ 14 introduced so-called chrono literals[1], which are literals
specified as [number][h|min|s|ms|us|ns], e.g.

* 2.5h
* 14min
* 9300ms

These literals should correspond to

* datetime.timedelta(0, 9000)       # 2.5h = 2.5*3600 = 9000 seconds
* datetime.timedelta(0, 840)        # 14min = 14*60 = 840 seconds
* datetime.timedelta(0, 9, 300000)  # 9300ms = 9 seconds + 3*10^5 microseconds

If a literal was interpreted as a datetime.timedelta, the following
would work out of the box:

    2.5h - 14min + 9300ms * 2

which would correspond to

    from datetime import timedelta as D

    D(hours=2.5) - D(minutes=14) + D(milliseconds=9300) * 2
    # datetime.timedelta(0, 8178, 600000)  # (*2 precedes, so that's to be expected)

    (D(hours=2.5) - D(minutes=14) + D(milliseconds=9300)) * 2
    # datetime.timedelta(0, 16338, 600000)

Notes:

* C++ uses `min` instead of `m`.  `min` is a keyword in Python.
* In C++, `1d` means the first day of a month [2].
* In C++, `1990y` means the year 1990 (in the Proleptic Gregorian
  calendar) [3].
* C++ uses signed integer types and not floats, so 2.5h would not be
  valid.
* My apologies if this has been discussed before; my search-fu gave me
  nothing.

References:

[1] std::literals::chrono_literals::operator""min
    http://en.cppreference.com/w/cpp/chrono/operator%22%22min
[2] http://en.cppreference.com/w/cpp/chrono/day
[3] http://en.cppreference.com/w/cpp/chrono/year

Best regards,
Pål Grønås Drange

From robertve92 at gmail.com  Sat Jun  2 09:08:51 2018
From: robertve92 at gmail.com (Robert Vanden Eynde)
Date: Sat, 2 Jun 2018 15:08:51 +0200
Subject: [Python-ideas] datetime.timedelta literals
In-Reply-To:
References:
Message-ID:

What about

    2.5*h - 14*min + 9300*ms * 2

where:

    h = timedelta(hours=1)
    min = timedelta(minutes=1)
    ms = timedelta(milliseconds=1)

By the way, "min" isn't a keyword, it's a standard function, so it can
be used as a variable name.

However, why be limited to time units?  One would want in certain
applications to define other units, like meter.  Would we want a
literal for that?

For units like those, the "*" operator works well, and if one wants
more support (like, mix minutes and seconds) one could write simple
classes or use libraries, for example to avoid adding time and
distance units.

Le sam. 2 juin 2018 à 14:21, Pål Grønås Drange a écrit :

> Elevator pitch:
>
>     (2.5h - 14min + 9300ms).total_seconds()
>     # 8169.3
>
>     from datetime import datetime as dt
>     start = dt.now()
>     end = dt.now()
>     (end-start) < 5s
>     # True
>
> chrono::duration:
>
> In C++ 14 the std::chrono::duration was introduced, which corresponds
> somewhat to datetime.timedelta.
>
> C++ 14 introduced so-called chrono literals[1], which are literals
> specified as [number][h|min|s|ms|us|ns], e.g.
>
> * 2.5h
> * 14min
> * 9300ms
>
> These literals should correspond to
>
> * datetime.timedelta(0, 9000)       # 2.5h = 2.5*3600 = 9000 seconds
> * datetime.timedelta(0, 840)        # 14min = 14*60 = 840 seconds
> * datetime.timedelta(0, 9, 300000)  # 9300ms = 9 seconds + 3*10^5 microseconds
>
> If a literal was interpreted as a datetime.timedelta, the following
> would work out of the box:
>
>     2.5h - 14min + 9300ms * 2
>
> which would correspond to
>
>     from datetime import timedelta as D
>
>     D(hours=2.5) - D(minutes=14) + D(milliseconds=9300) * 2
>     # datetime.timedelta(0, 8178, 600000)  # (*2 precedes, so that's to be expected)
>
>     (D(hours=2.5) - D(minutes=14) + D(milliseconds=9300)) * 2
>     # datetime.timedelta(0, 16338, 600000)
>
> Notes:
>
> * C++ uses `min` instead of `m`.  `min` is a keyword in Python.
> * In C++, `1d` means the first day of a month [2].
> * In C++, `1990y` means the year 1990 (in the Proleptic Gregorian
>   calendar) [3].
> * C++ uses signed integer types and not floats, so 2.5h would not be
>   valid.
> * My apologies if this has been discussed before; my search-fu gave me
>   nothing.
>
> References:
>
> [1] std::literals::chrono_literals::operator""min
>     http://en.cppreference.com/w/cpp/chrono/operator%22%22min
> [2] http://en.cppreference.com/w/cpp/chrono/day
> [3] http://en.cppreference.com/w/cpp/chrono/year
>
> Best regards,
> Pål Grønås Drange

From paal.drange at gmail.com  Sun Jun  3 06:53:19 2018
From: paal.drange at gmail.com (=?UTF-8?B?UMOlbCBHcsO4bsOlcyBEcmFuZ2U=?=)
Date: Sun, 3 Jun 2018 12:53:19 +0200
Subject: [Python-ideas] datetime.timedelta literals
In-Reply-To:
References:
Message-ID:

> What about
>
>     2.5*h - 14*min + 9300*ms * 2

That doesn't seem feasible to implement; however, that is essentially
how the Pint [1] module works:

    import pint
    u = pint.UnitRegistry()
    (2.5*u.hour - 14*u.min + 9300*u.ms) * 2
    #

    ((2.5*u.hour - 14*u.min + 9300*u.ms) * 2).to('sec')
    #

> However, why be limited to time units?  One would want in certain
> applications to define other units, like meter.  Would we want a
> literal for that?

Pint works with all units imaginable:

    Q = u.Quantity
    Q(u.c, (u.m/u.s)).to('km / hour')
    #

However, the idea was just the six (h|min|s|ms|us|ns) time literals; I
believe time units are used more often than other units, e.g. in
constructs like

    while end - start < 1min:
        poll()
        sleep(1s)                   # TypeError
        sleep(1s.total_seconds())   # works, but ugly
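For comparison, a stdlib-only sketch of how that loop has to be spelled
today (poll is a stand-in for real work):

    import time
    from datetime import datetime, timedelta

    def poll():
        pass  # stand-in

    deadline = datetime.now() + timedelta(minutes=1)
    while datetime.now() < deadline:
        poll()
        time.sleep(1)  # plain seconds; sleep() does not accept a timedelta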
[1] https://pypi.org/project/Pint/

Best regards,
Pål Grønås Drange

From skreft at gmail.com  Sun Jun  3 11:26:12 2018
From: skreft at gmail.com (Sebastian Kreft)
Date: Sun, 3 Jun 2018 17:26:12 +0200
Subject: [Python-ideas] datetime.timedelta literals
In-Reply-To:
References:
Message-ID:

Please read the discussions on
https://mail.python.org/pipermail//python-ideas/2016-August/041890.html,
https://mail.python.org/pipermail//python-ideas/2016-August/041939.html,
https://mail.python.org/pipermail//python-ideas/2016-August/042028.html,
https://mail.python.org/pipermail//python-ideas/2016-August/041899.html

They are more general than your proposal but they still cover pitfalls
that may affect yours. It would be better if you could expand your
proposal to address the concerns raised on those threads.

On Sun, Jun 3, 2018 at 12:53 PM, Pål Grønås Drange wrote:

> > What about
> >
> >     2.5*h - 14*min + 9300*ms * 2
>
> That doesn't seem feasible to implement; however, that is essentially
> how the Pint [1] module works:
>
>     import pint
>     u = pint.UnitRegistry()
>     (2.5*u.hour - 14*u.min + 9300*u.ms) * 2
>     #
>
>     ((2.5*u.hour - 14*u.min + 9300*u.ms) * 2).to('sec')
>     #
>
> > However, why be limited to time units?  One would want in certain
> > applications to define other units, like meter.  Would we want a
> > literal for that?
>
> Pint works with all units imaginable:
>
>     Q = u.Quantity
>     Q(u.c, (u.m/u.s)).to('km / hour')
>     #
>
> However, the idea was just the six (h|min|s|ms|us|ns) time literals; I
> believe time units are used more often than other units, e.g. in
> constructs like
>
>     while end - start < 1min:
>         poll()
>         sleep(1s)                   # TypeError
>         sleep(1s.total_seconds())   # works, but ugly
>
> [1] https://pypi.org/project/Pint/
>
> Best regards,
> Pål Grønås Drange

--
Sebastian Kreft

From paal.drange at gmail.com  Sun Jun  3 17:03:43 2018
From: paal.drange at gmail.com (=?UTF-8?B?UMOlbCBHcsO4bsOlcyBEcmFuZ2U=?=)
Date: Sun, 3 Jun 2018 23:03:43 +0200
Subject: [Python-ideas] datetime.timedelta literals
In-Reply-To:
References:
Message-ID:

> They are more general than your proposal but they still cover pitfalls
> that may affect yours. It would be better if you could expand your
> proposal to address the concerns raised on those threads.

After reading that thread (this summary should be taken with a grain of
salt), the first half is concerned with dimensional analysis not
(satisfactorily) performed.  The second half of the discussion is
dedicated to the major issue with 1E1 meaning 10 and not 1 exa, and
it's also pointed out that it is somewhat confusing that 1m means 0.001
and not 1 meter.

There is an alternative proposal from that thread, namely that we write
2_m, 2_k, etc. to mean milli and kilo, respectively.  That could also
be used in my case.

However, the suggested idea with _all the SI units_ is a completely new
construct over any programming language in existence, and quite frankly
a much bigger change than mine.  My suggestion, which is not mine at
all, really, has an implementation in C++, with experiences we can
learn from.  The "timedelta literals" idea has _six_ new literals,
whereas Ken Kundert's idea has I don't know how many.

I want to highlight one comment I found enlightening, and that is one
from Paul Moore:

    [Python-ideas] SI scale factors in Python
    Paul Moore p.f.moore at gmail.com
    Thu Aug 25 16:03:32 EDT 2016

> Python has a track record of being open to adding syntactic support if
> it demonstrably helps 3rd party tools (for example, the matrix
> multiplication operator was added specifically to help the numeric
> Python folks address a long-standing issue they had), so this is a
> genuine possibility - but such proposals need support from the groups
> they are intended to help.

I can understand that a lack of support from people using timedelta
will be a blocker.

Now, please don't take this as a dismissal of your suggestion that I
can learn from the referenced discussion; I did learn a great deal, but
I also felt that much of the discussion was around subjects that are
less relevant for the "timedelta literals" suggestion.

Best regards,
Pål Grønås Drange

From p.f.moore at gmail.com  Sun Jun  3 17:29:56 2018
From: p.f.moore at gmail.com (Paul Moore)
Date: Sun, 3 Jun 2018 22:29:56 +0100
Subject: [Python-ideas] datetime.timedelta literals
In-Reply-To:
References:
Message-ID:

On 3 June 2018 at 22:03, Pål Grønås Drange wrote:
> I want to highlight one comment I found enlightening, and that is one
> from Paul Moore:
>
>     [Python-ideas] SI scale factors in Python
>     Paul Moore p.f.moore at gmail.com
>     Thu Aug 25 16:03:32 EDT 2016
>
>> Python has a track record of being open to adding syntactic support if
>> it demonstrably helps 3rd party tools (for example, the matrix
>> multiplication operator was added specifically to help the numeric
>> Python folks address a long-standing issue they had), so this is a
>> genuine possibility - but such proposals need support from the groups
>> they are intended to help.
>
> I can understand that a lack of support from people using timedelta
> will be a blocker.

I'm not entirely sure what point you're trying to make here, but you
quoted that section somewhat out of context. The very next sentence in
the same post (full post is at
https://mail.python.org/pipermail//python-ideas/2016-August/041878.html)
was

"""
At the moment, I'm not even aware of a particular "dimensional analysis
with Python" community, or any particular "best of breed" package in
this area that might lead such a proposal - and a language change of
this nature probably does need that sort of backing.
"""

That seems directly relevant here. I'm not aware of a "timedelta users"
community, nor is there a particular package (or set of packages),
other than the stdlib datetime module, that constitutes "best of breed"
practice when handling timedeltas. So taking my full quote, I'd have to
say that you seem to have undermined your own proposal here.

For what it's worth, I use the datetime module and timedeltas
regularly, and I would have no use for timedelta literals. Even if the
proposal were for a complete set of datetime, date, time and timedelta
literals, I still wouldn't have a use for it. Whether that's useful
data I don't know, but you seem to be looking for "support from people
using timedelta", so I thought I'd clearly state that I, personally,
don't support the proposal.

Paul

From paal.drange at gmail.com  Mon Jun  4 02:44:26 2018
From: paal.drange at gmail.com (=?UTF-8?B?UMOlbCBHcsO4bsOlcyBEcmFuZ2U=?=)
Date: Mon, 4 Jun 2018 08:44:26 +0200
Subject: [Python-ideas] datetime.timedelta literals
In-Reply-To:
References:
Message-ID:

> I'm not entirely sure what point you're trying to make here, but you
> quoted that section somewhat out of context.
>
> [...]
>
> That seems directly relevant here. I'm not aware of a "timedelta
> users" community, nor is there a particular package (or set of
> packages), other than the stdlib datetime module, that constitutes
> "best of breed" practice when handling timedeltas. So taking my full
> quote, I'd have to say that you seem to have undermined your own
> proposal here.

Yes, that was why I quoted it; I thought I should bring the relevant
parts of the discussion into this, even if they were against this
proposal.

> For what it's worth, I use the datetime module and timedeltas
> regularly, and I [...] don't support the proposal.

That's good to hear, but if you don't mind my asking, is your lack of
support because you use timedelta "programmatically" instead of with
hard-coded time units, or is there a different (or more) reason(s)?

(I'm ready to yield, now I'm just curious.)

Best regards,
Pål Grønås Drange

From p.f.moore at gmail.com  Mon Jun  4 03:58:23 2018
From: p.f.moore at gmail.com (Paul Moore)
Date: Mon, 4 Jun 2018 08:58:23 +0100
Subject: [Python-ideas] datetime.timedelta literals
In-Reply-To:
References:
Message-ID:

On 4 June 2018 at 07:44, Pål Grønås Drange wrote:
>> I'm not entirely sure what point you're trying to make here, but you
>> quoted that section somewhat out of context.
>>
>> [...]
>>
>> That seems directly relevant here. I'm not aware of a "timedelta
>> users" community, nor is there a particular package (or set of
>> packages), other than the stdlib datetime module, that constitutes
>> "best of breed" practice when handling timedeltas. So taking my full
>> quote, I'd have to say that you seem to have undermined your own
>> proposal here.
>
> Yes, that was why I quoted it; I thought I should bring the relevant
> parts of the discussion into this, even if they were against this
> proposal.
>
>> For what it's worth, I use the datetime module and timedeltas
>> regularly, and I [...] don't support the proposal.
>
> That's good to hear, but if you don't mind my asking, is your lack of
> support because you use timedelta "programmatically" instead of with
> hard-coded time units, or is there a different (or more) reason(s)?
>
> (I'm ready to yield, now I'm just curious.)

I don't know what you mean by "programmatically instead of with
hard-coded time units". In my code, I've never felt the need to be able
to write something like "5min" rather than "timedelta(minutes=5)". Far
from it - I find the former jarring, whereas the latter is perfectly
clear, so even if the literal form were available I probably wouldn't
use it much. I don't see what problem they solve. I've *never* heard
anyone claim that using the constructor to create a timedelta object
was a problem - even in your proposal you don't explicitly claim that
(although I presume you do think that, otherwise why bother making the
proposal?).
The point of my original comment that you quoted was more to say that
*if* a specialist community with expertise in a particular area exists,
and they claim that there is benefit to a notation, then their expert
knowledge can override the more general view (which is typically pretty
conservative, for good reasons). In this case, there's no such expert
community, so there's no reason not to go with the default position
(which in this case could be summarised as "if none of decimals, scale
suffixes or physical units managed to make a case for dedicated literal
syntax, timedelta doesn't stand a chance").

Paul

From turnbull.stephen.fw at u.tsukuba.ac.jp  Mon Jun  4 04:26:01 2018
From: turnbull.stephen.fw at u.tsukuba.ac.jp (Stephen J. Turnbull)
Date: Mon, 4 Jun 2018 17:26:01 +0900
Subject: [Python-ideas] datetime.timedelta literals
In-Reply-To:
References:
Message-ID: <23316.63385.45839.753182@turnbull.sk.tsukuba.ac.jp>

Pål Grønås Drange writes:

> That's good to hear, but if you don't mind my asking, is your lack of
> support because you use timedelta "programmatically" instead of with
> hard-coded time units, or is there a different (or more) reason(s)?

Speaking for myself, not for Paul, I guess my big objection would be
that there would be too many collisions with other interpretations.
Eg, does "5m" mean "5 minutes", "5 meters", or "the number 0.005"?  Or
perhaps "5 minutes of arc"?

If you want something close to a literal, you can take the approach
used by Decimal of initializing from a string:

    def make_timedelta(s):
        if s[-1] == "m":
            return timedelta(0, 60*int(s[:-1]), 0)
        # and so on

    TD = make_timedelta    # alias for brevity
    td = TD("5m")

This is more to type, and less readable (if you have a dominant
interpretation for the "unit"!), but more flexible, and more readable
(if you have multiple dimensioned types around -- if you forget what
the unit means in this context, at least you still know the type of
the object).

It occurs to me that when we have more experience with typed
variables, it might be worth revisiting this, because then you could
write

    td: datetime.timedelta
    td = 5m

I personally would still be against it because there's too much
potential for collision and I don't see why we'd privilege time units
over other units or SI multipliers.  I'm also sure that you'd get some
degree of opposition from the people who really want types to be
optional and fear any other feature that depends on typing as a
"Trojan horse" to turn Python into a strictly-typed language.

Steve

From paal.drange at gmail.com  Mon Jun  4 04:44:59 2018
From: paal.drange at gmail.com (=?UTF-8?B?UMOlbCBHcsO4bsOlcyBEcmFuZ2U=?=)
Date: Mon, 4 Jun 2018 10:44:59 +0200
Subject: [Python-ideas] datetime.timedelta literals
In-Reply-To: <23316.63385.45839.753182@turnbull.sk.tsukuba.ac.jp>
References: <23316.63385.45839.753182@turnbull.sk.tsukuba.ac.jp>
Message-ID:

> Speaking for myself, not for Paul, I guess my big objection would be
> that there would be too many collisions with other interpretations.
> Eg, does "5m" mean "5 minutes", "5 meters", or "the number 0.005"?  Or
> perhaps "5 minutes of arc"?

Just to clarify, the proposition was actually `5min`, and that it would
mean `datetime.timedelta(minutes=5)`.

- Pål

From p.f.moore at gmail.com  Mon Jun  4 04:58:08 2018
From: p.f.moore at gmail.com (Paul Moore)
Date: Mon, 4 Jun 2018 09:58:08 +0100
Subject: [Python-ideas] datetime.timedelta literals
In-Reply-To:
References: <23316.63385.45839.753182@turnbull.sk.tsukuba.ac.jp>
Message-ID:

On 4 June 2018 at 09:44, Pål Grønås Drange wrote:
>> Speaking for myself, not for Paul, I guess my big objection would be
>> that there would be too many collisions with other interpretations.
>> Eg, does "5m" mean "5 minutes", "5 meters", or "the number 0.005"?  Or
>> perhaps "5 minutes of arc"?
>
> Just to clarify, the proposition was actually `5min`, and that it would
> mean `datetime.timedelta(minutes=5)`.

... and the fact that it's easy to forget that it's 5min rather than 5m
(is it 1h or 1hr, 2s or 2sec?) is a problem with the proposal. Of
course there's a similar problem with timedelta(minutes=5) or
timedelta(minute=5) (timedelta uses minutes, datetime uses minute, I
had to look in the docs to check), so it's not unique to the literal
notation, but it is an issue nevertheless.

Paul

From paal.drange at gmail.com  Mon Jun  4 06:59:22 2018
From: paal.drange at gmail.com (=?UTF-8?B?UMOlbCBHcsO4bsOlcyBEcmFuZ2U=?=)
Date: Mon, 4 Jun 2018 12:59:22 +0200
Subject: [Python-ideas] datetime.timedelta literals
In-Reply-To:
References: <23316.63385.45839.753182@turnbull.sk.tsukuba.ac.jp>
Message-ID:

One thing that could solve both this proposal and the aforementioned
SI-proposition by Ken Kundert could be supporting user-defined
literals.  Suppose that __litXXX__ would make XXX a literal one could
use as a suffix for numbers and strings (and lists, dicts, sets?).

A user-defined literal could be defined as __lit__, though I don't know
how to import it.  Anything without a leading underscore could be
reserved for the standard library:

    # In the standard library:
    def __litj__(x):
        return complex(0, x)
    # The above example would make 2+3j work as expected

    # In the datetime module
    def __lit_h__(x):
        return timedelta(hours=x)

    def __lit_min__(x):
        return timedelta(minutes=x)

    # In the Pint (units) module for dimensional analysis
    def __lit_km__(x):
        return x * pint.UnitRegistry().km

    # It wouldn't be limited to numbers
    def __lit_up__(x):
        return x.upper()

    s = 'literal'_up  # s = LITERAL

    # The _(x) from Django could be written
    def __lit_T__(x):
        return translate(x)

    s = 'literal'_T   # s = translate('literal')

    # Heck, it could even be written as (abusing(?) notation)
    def __lit___(x):
        return translate(x)

    s = 'literal'_    # s = translate('literal')

If we want to abuse the literals more, one could make

    def __lit_s__(lst):
        """Makes a list into a sorted list"""
        return sortedlist(lst)

    heap = []_s  # heap is of type sortedlist

- Pål

From p.f.moore at gmail.com  Mon Jun  4 07:38:03 2018
From: p.f.moore at gmail.com (Paul Moore)
Date: Mon, 4 Jun 2018 12:38:03 +0100
Subject: [Python-ideas] datetime.timedelta literals
In-Reply-To:
References: <23316.63385.45839.753182@turnbull.sk.tsukuba.ac.jp>
Message-ID:

On 4 June 2018 at 11:59, Pål Grønås Drange wrote:
> One thing that could solve both this proposal and the aforementioned
> SI-proposition by Ken Kundert could be supporting user-defined
> literals.  Suppose that __litXXX__ would make XXX a literal one could
> use as a suffix for numbers and strings (and lists, dicts, sets?).
>
> A user-defined literal could be defined as __lit__, though I don't
> know how to import it.

The killer would be putting together a full proposal, though.
Unless you mean something very different from the norm when you say
"literal", literals are evaluated very early in the parsing process,
long before user-defined functions are accessible.

If what you actually mean is a specialised function calling syntax
(where NNN_suf is parsed as a "literal call" of "suf" with the number
NNN as an argument, so it translates to a call to (something like)
__lit_suf__(NNN) at runtime), then that's probably possible, but it's
extremely unclear why that function-call syntax has any significant
advantage over the standard syntax suf(NNN). "readability" is
notoriously difficult to argue, and "allows construction of
domain-specific languages" is pretty much an anti-goal in Python. So
what's left to justify this?

Paul

From g.rodola at gmail.com  Mon Jun  4 09:08:09 2018
From: g.rodola at gmail.com (Giampaolo Rodola')
Date: Mon, 4 Jun 2018 15:08:09 +0200
Subject: [Python-ideas] datetime.timedelta literals
In-Reply-To:
References:
Message-ID:

On Sat, Jun 2, 2018 at 2:21 PM, Pål Grønås Drange wrote:

> Elevator pitch:
>
>     (2.5h - 14min + 9300ms).total_seconds()
>     # 8169.3
>
>     from datetime import datetime as dt
>     start = dt.now()
>     end = dt.now()
>     (end-start) < 5s
>     # True
>
> chrono::duration:
>
> In C++ 14 the std::chrono::duration was introduced, which corresponds
> somewhat to datetime.timedelta.
>
> C++ 14 introduced so-called chrono literals[1], which are literals
> specified as [number][h|min|s|ms|us|ns], e.g.
>
> * 2.5h
> * 14min
> * 9300ms
>
> These literals should correspond to
>
> * datetime.timedelta(0, 9000)       # 2.5h = 2.5*3600 = 9000 seconds
> * datetime.timedelta(0, 840)        # 14min = 14*60 = 840 seconds
> * datetime.timedelta(0, 9, 300000)  # 9300ms = 9 seconds + 3*10^5 microseconds
>
> If a literal was interpreted as a datetime.timedelta, the following
> would work out of the box:
>
>     2.5h - 14min + 9300ms * 2
>
> which would correspond to
>
>     from datetime import timedelta as D
>
>     D(hours=2.5) - D(minutes=14) + D(milliseconds=9300) * 2
>     # datetime.timedelta(0, 8178, 600000)  # (*2 precedes, so that's to be expected)
>
>     (D(hours=2.5) - D(minutes=14) + D(milliseconds=9300)) * 2
>     # datetime.timedelta(0, 16338, 600000)
>
> Notes:
>
> * C++ uses `min` instead of `m`.  `min` is a keyword in Python.
> * In C++, `1d` means the first day of a month [2].
> * In C++, `1990y` means the year 1990 (in the Proleptic Gregorian
>   calendar) [3].
> * C++ uses signed integer types and not floats, so 2.5h would not be
>   valid.
> * My apologies if this has been discussed before; my search-fu gave me
>   nothing.
>
> References:
>
> [1] std::literals::chrono_literals::operator""min
>     http://en.cppreference.com/w/cpp/chrono/operator%22%22min
> [2] http://en.cppreference.com/w/cpp/chrono/day
> [3] http://en.cppreference.com/w/cpp/chrono/year
>
> Best regards,
> Pål Grønås Drange

IMO datetimes are not common enough to deserve their own literals. It
would make the language more complex and harder to learn for a
relatively little benefit. This would probably make more sense as a
third party lib:

    >>> import datetimeutils
    >>> datetimeutils.interpretstr("2.5h - 14min + 9300ms")
    datetime(...)

Both the string and the possibility to specify function arguments would
give you way more expressiveness than language literals.
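A minimal sketch of what such a helper could look like (the name
parse_timedelta and its mini-grammar are invented for illustration;
only the h/min/s/ms/us suffixes are handled):

    import re
    from datetime import timedelta

    _UNITS = {"h": "hours", "min": "minutes", "s": "seconds",
              "ms": "milliseconds", "us": "microseconds"}

    def parse_timedelta(expr):
        """Parse strings like '2.5h - 14min + 9300ms' into a timedelta."""
        total = timedelta(0)
        for sign, number, unit in re.findall(
                r"([+-]?)\s*(\d+(?:\.\d+)?)(ms|us|min|h|s)", expr):
            delta = timedelta(**{_UNITS[unit]: float(number)})
            total += -delta if sign == "-" else delta
        return total

    assert parse_timedelta("2.5h - 14min + 9300ms").total_seconds() == 8169.3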
--
Giampaolo - http://grodola.blogspot.com

From clint.hepner at gmail.com  Mon Jun  4 09:17:51 2018
From: clint.hepner at gmail.com (Clint Hepner)
Date: Mon, 4 Jun 2018 09:17:51 -0400
Subject: [Python-ideas] datetime.timedelta literals
In-Reply-To:
References:
Message-ID: <2167FD19-52D9-4AEB-B230-A45DABC5ECC1@gmail.com>

> On 2018 Jun 4, at 9:08 am, Giampaolo Rodola' wrote:
>
> IMO datetimes are not common enough to deserve their own literals. It
> would make the language more complex and harder to learn for a
> relatively little benefit. This would probably make more sense as a
> third party lib:
>
>     >>> import datetimeutils
>     >>> datetimeutils.interpretstr("2.5h - 14min + 9300ms")
>     datetime(...)
>
> Both the string and the possibility to specify function arguments
> would give you way more expressiveness than language literals.

Agreed.  I'll add that interpretstr probably isn't necessary; the
constructor for timedelta already lets you write

    >>> datetime.timedelta(hours=2.5, minutes=-14, milliseconds=9300)
    datetime.timedelta(0, 8169, 300000)

Further, I'd argue that such involved timedelta instances are rarely
instantiated explicitly, resulting instead from datetime arithmetic.

--
Clint

From klahnakoski at mozilla.com  Mon Jun  4 14:50:55 2018
From: klahnakoski at mozilla.com (Kyle Lahnakoski)
Date: Mon, 4 Jun 2018 14:50:55 -0400
Subject: [Python-ideas] datetime.timedelta literals
In-Reply-To:
References:
Message-ID:

Pål Grønås Drange,

I do like the idea of literals typed with scientific units, but I often
get short variable names mixed up, so I am not sure if I could use them
without a cheat sheet.  Formatting datetime is a good example of how
confusing a collection of short names can get:  Is month %m or %M?  Is
minute %m or %i?

https://docs.python.org/2/library/datetime.html#strftime-and-strptime-behavior

In case you are thinking, "Kyle, how can you even *think* "%i" means
minutes?!", please see
https://dev.mysql.com/doc/refman/5.7/en/date-and-time-functions.html#function_date-format
:)

I made a class, with magic methods, to get close to what you want:

    (2.5*HOUR - 14*MINUTE + 9300*MILLISECOND).total_seconds()

I used full names for less confusion, but you can do the same with
shorter names:

    (2.5*h - 14*min + 9300*ms).total_seconds()

Maybe the Python parser can be made to add an implied multiplication
between a-number-followed-directly-by-a-variable-name.  If so, then I
could write:

    (2.5HOUR - 14MINUTE + 9300MILLISECOND).total_seconds()

You can define your short, domain specific, suffixes:

    (2.5h - 14m + 9300ms).total_seconds()

On 2018-06-02 08:21, Pål Grønås Drange wrote:
> Elevator pitch:
>
>     (2.5h - 14min + 9300ms).total_seconds()
>     # 8169.3
>
>     from datetime import datetime as dt
>     start = dt.now()
>     end = dt.now()
>     (end-start) < 5s
>     # True
>
> chrono::duration:
>
> In C++ 14 the std::chrono::duration was introduced, which corresponds
> somewhat to datetime.timedelta.
>
> C++ 14 introduced so-called chrono literals[1], which are literals
> specified as [number][h|min|s|ms|us|ns], e.g.
>
> * 2.5h
> * 14min
> * 9300ms
>
> These literals should correspond to
>
> * datetime.timedelta(0, 9000)       # 2.5h = 2.5*3600 = 9000 seconds
> * datetime.timedelta(0, 840)        # 14min = 14*60 = 840 seconds
> * datetime.timedelta(0, 9, 300000)  # 9300ms = 9 seconds + 3*10^5 microseconds
>
> If a literal was interpreted as a datetime.timedelta, the following
> would work out of the box:
>
>     2.5h - 14min + 9300ms * 2
>
> which would correspond to
>
>     from datetime import timedelta as D
>
>     D(hours=2.5) - D(minutes=14) + D(milliseconds=9300) * 2
>     # datetime.timedelta(0, 8178, 600000)  # (*2 precedes, so that's to be expected)
>
>     (D(hours=2.5) - D(minutes=14) + D(milliseconds=9300)) * 2
>     # datetime.timedelta(0, 16338, 600000)
>
> Notes:
>
> * C++ uses `min` instead of `m`.  `min` is a keyword in Python.
> * In C++, `1d` means the first day of a month [2].
> * In C++, `1990y` means the year 1990 (in the Proleptic Gregorian
>   calendar) [3].
> * C++ uses signed integer types and not floats, so 2.5h would not be
>   valid.
> * My apologies if this has been discussed before; my search-fu gave me
>   nothing.
>
> References:
>
> [1] std::literals::chrono_literals::operator""min
>     http://en.cppreference.com/w/cpp/chrono/operator%22%22min
> [2] http://en.cppreference.com/w/cpp/chrono/day
> [3] http://en.cppreference.com/w/cpp/chrono/year
>
> Best regards,
> Pål Grønås Drange

From chris.barker at noaa.gov  Mon Jun  4 16:41:13 2018
From: chris.barker at noaa.gov (Chris Barker)
Date: Mon, 4 Jun 2018 13:41:13 -0700
Subject: [Python-ideas] datetime.timedelta literals
In-Reply-To:
References:
Message-ID:

On Mon, Jun 4, 2018 at 12:58 AM, Paul Moore wrote:

> > That's good to hear, but if you don't mind my asking, is your lack of
> > support because you use timedelta "programmatically" instead of with
> > hard-coded time units, or is there a different (or more) reason(s)?
> >
> > (I'm ready to yield, now I'm just curious.)
>
> I don't know what you mean by "programmatically instead of with
> hard-coded time units".

I think he means essentially: Do you use timedelta with literal
arguments? -- as opposed to having it be the result of a calculation or
read from a file or ...

> In my code, I've never felt the need to be able to write
> something like "5min" rather than "timedelta(minutes=5)". Far from it
> - I find the former jarring, whereas the latter is perfectly clear, so
> even if the literal form were available I probably wouldn't use it
> much.

I'm the opposite - I use timedelta a fair bit, and find writing:

    timedelta(hours=24)

pretty awkward. To the point that I make part of my "scripting" API
take integer seconds (or floating point hours, or...) rather than
timedelta objects, to save my users from having to do:

    from datetime import timedelta

    call_a_function(...
                    timestep = timedelta(seconds=10)
                    ...
                    )

rather than:

    call_a_function(...
                    timestep = 10
                    ...
                    )

The former of which requires more typing, an extra import, and, most
importantly, a knowledge of datetime and the timedelta API.

So yeah -- I have a need for it.

All that being said, there are a number of things one might want a
literal for, and adding a huge pile of them is a bad idea, so I'm -1 on
this proposal anyway.

It does make me think that I may want to add my own utilities to make
this easier:

    def seconds(s):
        return timedelta(seconds=s)

etc. Then I could add them to my "scripting" library, and my users
could do:

    call_a_function(...
                    timestep = seconds(10)
                    ...
                    )

Not so bad.
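Fully spelled out, those utilities are one-liners (call_a_function here
is just a dummy stand-in for the real scripting API):

    from datetime import timedelta

    def seconds(s):
        return timedelta(seconds=s)

    def minutes(m):
        return timedelta(minutes=m)

    def hours(h):
        return timedelta(hours=h)

    def call_a_function(timestep):
        # stand-in; the real API just wants a timedelta
        assert isinstance(timestep, timedelta)

    call_a_function(timestep=seconds(10))
    call_a_function(timestep=minutes(5) + seconds(30))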
In fact, maybe it would be a good idea to add such utilities to the
datetime module...

ANOTHER NOTE: The interface to timedelta is hard to discover because
the docstring is incomplete:

    In [2]: timedelta?
    Init signature: timedelta(self, /, *args, **kwargs)
    Docstring:      Difference between two datetime values.
    File:           ~/miniconda2/envs/py3/lib/python3.6/datetime.py
    Type:           type

Ouch! no idea what keyword arguments it takes!

[but that's an argument for better docstrings, not adding literals...]

-CHB

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From paal.drange at gmail.com  Mon Jun  4 16:59:34 2018
From: paal.drange at gmail.com (=?UTF-8?B?UMOlbCBHcsO4bsOlcyBEcmFuZ2U=?=)
Date: Mon, 4 Jun 2018 22:59:34 +0200
Subject: [Python-ideas] datetime.timedelta literals
In-Reply-To:
References:
Message-ID:

For the general user-defined literals, here are some example use cases:

Here is how to expose literals:

    __literals__ = ['__literal_h__', '__literal_min__', '__literal_s__']

And here is how to import them:

    from mymodule import _*
    # imports all literals from mymodule
    # could also be *_ or ** or ~ or __literals__

    90_deg                  # 1.57 in rad
    20_°C                   # 293.15 Kelvin
    units.to(20.0_mi, 'km')

    e = 1_X + 4_G + 13_M + 180_k  # SI scale factors can easily be implemented
    p = 3_x - 2_y + 1_z           # Coordinate in 3D
    M = 4_I                       # The 4x4 Identity Matrix
    p * 2_pi                      # because it's simpler than 2 * np.pi
    p * 11_?                      # why not
    d = 2_sqrt                    # sqrt(2), or simply 2**(1/2), not a killer argument

    h1 = 30_px
    h2 = 4_em               # sizes for typography
    c = 'ffff00'_color      # the color yellow
    h = 'My header'_h3      # renders <h3>My header</h3> or something
    g = 9.81_m_div_ss       # that was ugly ...

    'More weird examples'_info  # log.info(msg)
    '2018-06-04'_AD         # is a date
    '192.168.0.42'_ip4      # why not?
    'USER'_os               # = os.environ.get(x, '')

    # Can have state?
    initialize(80)          # set default string width to 80 chars
    'my string'_c           # will now be centralized in 80 chars
    'my string'_l           # will now be left aligned in 80 chars
    'my string'_r           # will now be right aligned in 80 chars

If we used, e.g. tilde, we could even use it on function calls

    y = fun(x)~dx  # differentiations! Like decorators, but on "call time"

I accept that introducing a new symbol has a very high threshold, and
will not go through.  I just wanted to mention it.

Yes, we could write all these as easily as function calls,

    deg(90)
    celsius(20)
    center('my string')  # or 'my string'.center(80)

But somehow it seems nicer to write 42_km than
42 * pint.UnitRegistry().km

- Pål

From benrudiak at gmail.com  Mon Jun  4 17:22:29 2018
From: benrudiak at gmail.com (Ben Rudiak-Gould)
Date: Mon, 4 Jun 2018 14:22:29 -0700
Subject: [Python-ideas] Add dict.append and dict.extend
Message-ID:

I'd like to propose adding `append` and `extend` methods to dicts
which behave like `__setitem__` and `update` respectively, except that
they raise an exception (KeyError?) instead of overwriting preexisting
entries.

Very often I expect that the key I'm adding to a dict isn't already in
it. If I want to verify that, I have to expand my single-line
assignment statement to 3-5 lines (depending on whether the dict and
key are expressions that I now need to assign to local variables). If
I don't verify it, I may overwrite a dict entry and produce silently
wrong output.

The names `append` and `extend` make sense now that dicts are defined
to preserve insertion order: they try to append the new entries, and
if that can't be done because it would duplicate a key, they raise an
exception.

In case of error, `extend` should probably leave successfully appended
entries in the dict, since that's consistent with list.extend and
dict.update.

The same methods would also be useful on sets. Unfortunately, the
names make less sense.

-- Ben

From chris.barker at noaa.gov  Mon Jun  4 17:50:43 2018
From: chris.barker at noaa.gov (Chris Barker)
Date: Mon, 4 Jun 2018 14:50:43 -0700
Subject: [Python-ideas] datetime.timedelta literals
In-Reply-To:
References:
Message-ID:

On Mon, Jun 4, 2018 at 1:59 PM, Pål Grønås Drange wrote:

> For the general user-defined literals, here are some example use
> cases:

I kind of like the idea of user-defined literals, but:

> Yes, we could write all these as easily as function calls,
>
>     deg(90)
>     celsius(20)
>     center('my string')  # or 'my string'.center(80)
>
> But somehow it seems nicer to write 42_km than
> 42 * pint.UnitRegistry().km

how about?

    from pint import km

    42*km

still not as nice as 42_km, though only by a bit....

So maybe you could propose adding:

    seconds
    minutes
    hours
    days

to the datetime module, and then we could write:

    60*seconds == 1*minutes

Without any changes to the language at all.
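A sketch of those names, using only arithmetic that timedelta already
supports (the four constants are hypothetical, not something datetime
provides today):

    from datetime import timedelta

    seconds = timedelta(seconds=1)
    minutes = timedelta(minutes=1)
    hours = timedelta(hours=1)
    days = timedelta(days=1)

    assert 60*seconds == 1*minutes
    assert 60*minutes == 1*hours
    assert 24*hours == 1*days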
-CHB

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From p.f.moore at gmail.com  Mon Jun  4 18:21:01 2018
From: p.f.moore at gmail.com (Paul Moore)
Date: Mon, 4 Jun 2018 23:21:01 +0100
Subject: [Python-ideas] datetime.timedelta literals
In-Reply-To:
References:
Message-ID:

On 4 June 2018 at 22:50, Chris Barker via Python-ideas wrote:
> So maybe you could propose adding:
>
>     seconds
>     minutes
>     hours
>     days
>
> to the datetime module, and then we could write:
>
>     60*seconds == 1*minutes
>
> Without any changes to the language at all.

This strikes me as more of a personal/team style issue, so probably
fits better as local definitions in a project's "utilities" module.
Putting it in the stdlib implies a certain level of approval of this as
a standard idiom, and I'm not sure it's common enough practice to
warrant that.

Paul

From benrudiak at gmail.com  Mon Jun  4 18:23:07 2018
From: benrudiak at gmail.com (Ben Rudiak-Gould)
Date: Mon, 4 Jun 2018 15:23:07 -0700
Subject: [Python-ideas] Allow popping of slices
Message-ID:

The `pop` method of built-in sequences is basically an atomic version
of

    val = self[pos]
    del self[pos]
    return val

If this behavior was extended to the case where `pos` is a slice, you
could write things like:

    def cut_deck(deck, pos):
        deck.extend(deck.pop(slice(0, pos)))

    def bfs(roots):
        depth, frontier = 0, list(roots)
        while frontier:
            depth += 1
            for item in frontier.pop(slice(None)):
                ...
                frontier.append(...)
                ...

Similar functionality is found in many other languages (e.g. Perl and
JavaScript's `splice`). I think it's useful not just because it's more
concise, but because it's linear/reversible: it moves data rather than
duplicating and then destroying it, which makes it less prone to bugs.

The syntax is a bit odd since you have to construct the slice by hand.
Here are three solutions for that, from least to most extravagant:

1. Don't worry about it. It's still useful, and the syntax, though
verbose, makes sense. (The "reference implementation" of pop is
literally unchanged.)

2. Give pop methods a __getitem__ that does the same thing as
__call__, so you can write xs.pop[-1] or xs.pop[:].

3. Promote del statements to expressions that return the same values
as the underlying __delitem__, __delattr__, etc., and make those
methods of built-in types return the thing that was deleted. (Or
introduce __popitem__, __popattr__, etc. which return a value.)

-- Ben

From waksman at gmail.com  Mon Jun  4 18:57:32 2018
From: waksman at gmail.com (George Leslie-Waksman)
Date: Mon, 4 Jun 2018 15:57:32 -0700
Subject: [Python-ideas] Add dict.append and dict.extend
In-Reply-To:
References:
Message-ID:

Semantically, I'm not sure append and extend would be universally
understood to mean don't overwrite.

This can be accomplished with a custom subclass for your use case:

```
import collections


class OverwriteGuardedDict(collections.UserDict):
    def append(self, key, value):
        if key in self.data:
            raise KeyError(key)
        self.data[key] = value

    def extend(self, other):
        overlap = self.data.keys() & other.keys()
        if overlap:
            raise KeyError(','.join(overlap))
        self.data.update(other)
```

On Mon, Jun 4, 2018 at 2:24 PM Ben Rudiak-Gould wrote:

> I'd like to propose adding `append` and `extend` methods to dicts
> which behave like `__setitem__` and `update` respectively, except that
> they raise an exception (KeyError?) instead of overwriting preexisting
> entries.
>
> Very often I expect that the key I'm adding to a dict isn't already in
> it. If I want to verify that, I have to expand my single-line
> assignment statement to 3-5 lines (depending on whether the dict and
> key are expressions that I now need to assign to local variables). If
> I don't verify it, I may overwrite a dict entry and produce silently
> wrong output.
>
> The names `append` and `extend` make sense now that dicts are defined
> to preserve insertion order: they try to append the new entries, and
> if that can't be done because it would duplicate a key, they raise an
> exception.
>
> In case of error, `extend` should probably leave successfully appended
> entries in the dict, since that's consistent with list.extend and
> dict.update.
>
> The same methods would also be useful on sets. Unfortunately, the
> names make less sense.
>
> -- Ben

From ubershmekel at gmail.com  Mon Jun  4 19:02:19 2018
From: ubershmekel at gmail.com (Yuval Greenfield)
Date: Mon, 4 Jun 2018 16:02:19 -0700
Subject: [Python-ideas] Add dict.append and dict.extend
In-Reply-To:
References:
Message-ID:

On Mon, Jun 4, 2018 at 3:58 PM George Leslie-Waksman wrote:

> Semantically, I'm not sure append and extend would be universally
> understood to mean don't overwrite.

The proposed meanings surprised me too. My initial instinct for
`dict.append` was that it would always succeed, much like `list.append`
always succeeds.

From rob.cliffe at btinternet.com  Mon Jun  4 19:13:36 2018
From: rob.cliffe at btinternet.com (Rob Cliffe)
Date: Tue, 5 Jun 2018 00:13:36 +0100
Subject: [Python-ideas] datetime.timedelta literals
In-Reply-To:
References:
Message-ID: <4e9f0202-b0c2-37b5-5981-916922c17eb5@btinternet.com>

On 04/06/2018 19:50, Kyle Lahnakoski wrote:
>
> Maybe the Python parser can be made to add an implied multiplication
> between a-number-followed-directly-by-a-variable-name. If so, then I
> could write:
>
>     (2.5HOUR - 14MINUTE + 9300MILLISECOND).total_seconds()

This strikes me as quite a nifty idea, if the implied multiplication
calls (by default) __rmul__ on the second operand.  A ridiculously
simple example:

    >>> import datetime
    >>> class D(object):
            def __rmul__(self, LHS):
                return datetime.timedelta(days=LHS)

    >>> # Possibly some magic to make D a singleton class
    >>> d=D()
    >>> 2*d    # Works now
    datetime.timedelta(2)
    >>> 2d     # Does not work now
    datetime.timedelta(2)

There would, sadly, be a conflict with

    floating literals such as "2e3"
    hex literals such as 0XB
    complex literals such as 4j
    numeric literals such as 1_234
    any others I haven't thought of

So the parser would have to give priority to such existing, valid
forms.

Rob Cliffe

From chris.barker at noaa.gov  Mon Jun  4 19:22:43 2018
From: chris.barker at noaa.gov (Chris Barker)
Date: Mon, 4 Jun 2018 16:22:43 -0700
Subject: [Python-ideas] datetime.timedelta literals
In-Reply-To:
References:
Message-ID:

On Mon, Jun 4, 2018 at 3:21 PM, Paul Moore wrote:

> > So maybe you could propose adding:
> >
> >     seconds
> >     minutes
> >     hours
> >     days
> >
> > to the datetime module, and then we could write:
> >
> >     60*seconds == 1*minutes
> >
> > Without any changes to the language at all.
>
> This strikes me as more of a personal/team style issue, so probably
> fits better as local definitions in a project's "utilities" module.

probably, yes -- that's why I'm not going to push for it.

But the point is that we can "solve" this problem in Python by:

* Doing nothing, and letting people roll their own simple solution

or

* Adding a few names to the datetime module

Which is a pretty small lift, at least compared to adding new literal
syntax.

-CHB

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From steve at pearwood.info  Mon Jun  4 20:11:57 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 5 Jun 2018 10:11:57 +1000
Subject: [Python-ideas] Allow popping of slices
In-Reply-To:
References:
Message-ID: <20180605001157.GT12683@ando.pearwood.info>

On Mon, Jun 04, 2018 at 03:23:07PM -0700, Ben Rudiak-Gould wrote:

> The `pop` method of built-in sequences is basically an atomic version
> of
>
>     val = self[pos]
>     del self[pos]
>     return val

Aside from the atomicness, for testing we can subclass list:

    # Untested
    class MyList(list):
        def pop(self, pos):
            if isinstance(pos, slice):
                temp = self[pos]
                del self[pos]
                return temp
            return super().pop(pos)

Is that what you have in mind?

> If this behavior was extended to the case where `pos` is a slice, you
> could write things like:
>
>     def cut_deck(deck, pos):
>         deck.extend(deck.pop(slice(0, pos)))

I'm not sure that's an advantage over:

    deck[:] = deck[pos:] + deck[:pos]

but I suppose one might be faster or slower than the other. But I think
the version with slices is much more clear.

>     def bfs(roots):
>         depth, frontier = 0, list(roots)
>         while frontier:
>             depth += 1
>             for item in frontier.pop(slice(None)):
>                 ...
>                 frontier.append(...)
>                 ...

If that's a breadth-first search, I've never seen it written like that
before. The classic bfs algorithm is at Wikipedia (conveniently written
in Python):

https://en.wikipedia.org/wiki/Breadth-first_search#Pseudocode

and yours is very different. I'm not saying yours is wrong, but it's
not obviously right either. It might help your argument if you show
equivalent (but working) code that doesn't rely on popping a slice.

> Similar functionality is found in many other languages (e.g. Perl and
> JavaScript's `splice`).

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/splice

Do you have a link for the Perl version?

> I think it's useful not just because it's more
> concise, but because it's linear/reversible: it moves data rather than
> duplicating and then destroying it, which makes it less prone to bugs.
I don't see how that is possible. You still have to move the data out of the sequence into a new sequence before deleting it from the original. Maybe there are optimizations if the slice is at the end of the sequence, but in general you have to make a copy of the items. (Well, not the items themselves, just the references to them.) > The syntax is a bit odd since you have to construct the slice by hand. > Here are three solutions for that from least to most extravagant: > > 1. Don't worry about it. It's still useful, and the syntax, though > verbose, makes sense. (The "reference implementation" of pop is > literally unchanged.) Indeed. If this functionality can be justified, we could start with this, and think about a neater syntax later (if required). > 2. Give pop methods a __getitem__ that does the same thing as > __call__, so you can write xs.pop[-1] or xs.pop[:]. That's interesting. But it would mean that pop, and only pop, would allow pop(n) and pop[n] to be the same thing. That's going to confuse people who wonder why they can't call other methods like that: mylist.pop[0] # okay mylist.append[item] # fails and why they can't use slice syntax in the round-bracket call syntax: mylist.pop[1:-1] # okay mylist.pop(1:-1) # not okay While I'm intrigued by this, I think it will be too confusing. > 3. Promote del statements to expressions that return the same values > as the underlying __delitem__, __delattr__, etc., and make those > methods of built-in types return the thing that was deleted. (Or > introduce __popitem__, __popattr__, etc. which return a value.) I don't get how this allows us to pass slices to pop. You missed one, allow slice literals: https://mail.python.org/pipermail/python-ideas/2015-June/034086.html -- Steve From steve at pearwood.info Mon Jun 4 20:25:56 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 5 Jun 2018 10:25:56 +1000 Subject: [Python-ideas] Add dict.append and dict.extend In-Reply-To: References: Message-ID: <20180605002556.GU12683@ando.pearwood.info> On Mon, Jun 04, 2018 at 02:22:29PM -0700, Ben Rudiak-Gould wrote: > I'd like to propose adding `append` and `extend` methods to dicts > which behave like `__setitem__` and `update` respectively, except that > they raise an exception (KeyError?) instead of overwriting preexisting > entries. > > Very often I expect that the key I'm adding to a dict isn't already in > it. Can you give some examples of when you want to do that? I'm having difficulty in thinking of any. The only example I thought of is when you have a "preferences" or "settings" dict, and you want to add default settings but only if the user hasn't provided them. But the way to do that is to go in the opposite direction, starting with the defaults, and unconditionally add user settings, overriding the default. # wrong way (yes, I've actually done this :-( settings = get_user_prefs() for key, value in get_default_prefs(): if key not in settings: settings[key] = value # right way settings = get_default_prefs() settings.update(get_user_prefs()) So I'm afraid I don't see the value of this. > If I want to verify that, I have to expand my single-line > assignment statement to 3-5 lines (depending on whether the dict and > key are expressions that I now need to assign to local variables). I'm sorry, I can't visualise how it would take you up to five lines to check and update a key. It shouldn't take more than two: if key not in d: d[key] = value Can you give an real-life example of the five line version? 
> The names `append` and `extend` make sense now that dicts are defined > to preserve insertion order: they try to append the new entries, and > if that can't be done because it would duplicate a key, they raise an > exception. I don't see any connection between "append" and "fail if the key already exists". That's not what it means with lists. -- Steve From python at mrabarnett.plus.com Mon Jun 4 20:59:37 2018 From: python at mrabarnett.plus.com (MRAB) Date: Tue, 5 Jun 2018 01:59:37 +0100 Subject: [Python-ideas] Add dict.append and dict.extend In-Reply-To: <20180605002556.GU12683@ando.pearwood.info> References: <20180605002556.GU12683@ando.pearwood.info> Message-ID: On 2018-06-05 01:25, Steven D'Aprano wrote: > On Mon, Jun 04, 2018 at 02:22:29PM -0700, Ben Rudiak-Gould wrote: >> I'd like to propose adding `append` and `extend` methods to dicts >> which behave like `__setitem__` and `update` respectively, except that >> they raise an exception (KeyError?) instead of overwriting preexisting >> entries. >> >> Very often I expect that the key I'm adding to a dict isn't already in >> it. > [snip] >> If I want to verify that, I have to expand my single-line >> assignment statement to 3-5 lines (depending on whether the dict and >> key are expressions that I now need to assign to local variables). > > I'm sorry, I can't visualise how it would take you up to five lines to > check and update a key. It shouldn't take more than two: > > if key not in d: > d[key] = value > It shouldn't take more than one: d.setdefault(key, value) :-) [snip] From chris.barker at noaa.gov Mon Jun 4 21:06:59 2018 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Mon, 4 Jun 2018 21:06:59 -0400 Subject: [Python-ideas] Add dict.append and dict.extend In-Reply-To: References: <20180605002556.GU12683@ando.pearwood.info> Message-ID: > d.setdefault(key, value) I though the OP wanted an error if the key already existed. This is close, as it won?t change the dict if the key is already there, but it will add it if it?s not. @OP Maybe post those five lines so we know exactly what you want ? maybe there is already a good solution. I know I spent years thinking ?there should be an easy way to do this? before I found setdefault(). -CHB > > :-) > > [snip] > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From mertz at gnosis.cx Mon Jun 4 19:04:06 2018 From: mertz at gnosis.cx (David Mertz) Date: Mon, 4 Jun 2018 19:04:06 -0400 Subject: [Python-ideas] Add dict.append and dict.extend In-Reply-To: References: Message-ID: I don't think I'd ever guess the intended semantics from the names in a million years. They seem like horribly misnamed methods, made worse by the false suggestion of similarity to list operations. In 20 years of using Python, moreover, I don't think I've ever wanted the described behavior under any spelling. On Mon, Jun 4, 2018, 6:58 PM George Leslie-Waksman wrote: > Semantically, I'm not sure append and extend would be universally > understood to mean don't overwrite. 
>
> This can be accomplished with a custom subclass for your use case:
>
> ```
> import collections
>
>
> class OverwriteGuardedDict(collections.UserDict):
>     def append(self, key, value):
>         if key in self.data:
>             raise KeyError(key)
>         self.data[key] = value
>
>     def extend(self, other):
>         overlap = self.data.keys() & other.keys()
>         if overlap:
>             raise KeyError(','.join(overlap))
>         self.data.update(other)
> ```
>
> On Mon, Jun 4, 2018 at 2:24 PM Ben Rudiak-Gould wrote:
>
>> I'd like to propose adding `append` and `extend` methods to dicts
>> which behave like `__setitem__` and `update` respectively, except that
>> they raise an exception (KeyError?) instead of overwriting preexisting
>> entries.
>>
>> Very often I expect that the key I'm adding to a dict isn't already in
>> it. If I want to verify that, I have to expand my single-line
>> assignment statement to 3-5 lines (depending on whether the dict and
>> key are expressions that I now need to assign to local variables). If
>> I don't verify it, I may overwrite a dict entry and produce silently
>> wrong output.
>>
>> The names `append` and `extend` make sense now that dicts are defined
>> to preserve insertion order: they try to append the new entries, and
>> if that can't be done because it would duplicate a key, they raise an
>> exception.
>>
>> In case of error, `extend` should probably leave successfully appended
>> entries in the dict, since that's consistent with list.extend and
>> dict.update.
>>
>> The same methods would also be useful on sets. Unfortunately, the
>> names make less sense.
>>
>> -- Ben

From benrudiak at gmail.com  Tue Jun  5 02:27:35 2018
From: benrudiak at gmail.com (Ben Rudiak-Gould)
Date: Mon, 4 Jun 2018 23:27:35 -0700
Subject: [Python-ideas] Add dict.append and dict.extend
In-Reply-To: <20180605002556.GU12683@ando.pearwood.info>
References: <20180605002556.GU12683@ando.pearwood.info>
Message-ID:

On Mon, Jun 4, 2018 at 4:02 PM, Yuval Greenfield wrote:
> The proposed meanings surprised me too. My initial instinct for
> `dict.append` was that it would always succeed, much like `list.append`
> always succeeds.

Many of the methods named `append` in the standard library fail if
adding the item would violate a constraint of the data structure.
`list.append` is an exception because it stores uninterpreted object
references, but, e.g., bytearray().append(-1) raises ValueError. Also,
`dict.__setitem__` and `dict.update` fail if a key is unhashable, which
is another dict-specific constraint.

Regardless, I'm not too attached to those names. I want the underlying
functionality, and the names made sense to me. I'd be okay with
`unique_setitem` and `unique_update`.

On Mon, Jun 4, 2018 at 5:25 PM, Steven D'Aprano wrote:
> On Mon, Jun 04, 2018 at 02:22:29PM -0700, Ben Rudiak-Gould wrote:
>> Very often I expect that the key I'm adding to a dict isn't already in
>> it.
>
> Can you give some examples of when you want to do that? I'm having
> difficulty in thinking of any.
One example (or family of examples) is any situation where you would
have a UNIQUE constraint on an indexed column in a database. If the
values in a column should always be distinct, like the usernames in a
table of user accounts, you can declare that column UNIQUE (or PRIMARY
KEY) and any attempt to add a record with a duplicate username will
fail.

People often use Python dicts to look up objects by some property of the
object, which is similar to indexing a database column. When the values
aren't necessarily unique (like a zip code), you have to use something
like defaultdict(list) for the index, because Python doesn't have a
dictionary that supports duplicate keys like C++'s std::multimap. When
the values should be unique (like a username), the best data type for
the index is dict, but there's no method on dicts that has the desired
behavior of refusing to add a record with a duplicate key. I think this
is a frequent enough use case to deserve standard library support.

Of course you can implement the same functionality in other ways, but
that's as true of databases as it is of Python. If SQL didn't have
UNIQUE, every client of the database would have its own code for
checking and enforcing the constraint. They'd all have different names,
and slightly different implementations. The uniqueness property that
they're supposed to guarantee would probably be documented only in
comments if at all. Some implementations would probably have bugs. You
can't offload all of your programming needs onto the database
developer, but I think UNIQUE is a useful enough feature to merit
inclusion in SQL. And that's my argument for Python as well.

Another example is keyword arguments. f(**{'a': 1}, **{'a': 2}) could
mean f(a=1) or f(a=2), but actually it's an error. I think that was a
good design decision: it's consistent with Python's general philosophy
of raising exceptions when things look dodgy, which makes it much
easier to find bugs.

Compare this to JavaScript, where if you pass four arguments to a
function that expected three, the fourth is just discarded; and if the
actual incorrect argument was the first, not the fourth, then all of
the arguments will be bound to the wrong variables. If an argument that
was supposed to be a number gets a value of some other type as a
consequence, and the function tries to add 1 to it, it still won't
fail, but will produce some silly result like "[object Object]1", which
then propagates through more of the code, until finally you get a wrong
answer or a failure in code that's unrelated to the actually erroneous
code.

I'm thankful that Python doesn't do that, and I wish it didn't do it
even more than it already doesn't. Methods that raise an exception on
duplicated keys, instead of silently discarding the old or new value,
are an example of the sort of fail-safe operations that I'd like to see
more of.

For overridable options with defaults, `__setitem__` and `update` do
the right thing - I certainly don't think they're useless.

> I'm sorry, I can't visualise how it would take you up to five lines to
> check and update a key. It shouldn't take more than two:
>
>     if key not in d:
>         d[key] = value
>
> Can you give a real-life example of the five line version?

The three lines I was thinking of were something like

    if k in d:
        raise KeyError(k)
    d[k] = ...

The five lines were something like

    d = get_mapping()
    k = get_key()
    if k in d:
        raise KeyError(k)
    d[k] = ...

as a checked version of

    get_mapping()[get_key()] = ...
(or in general, any situation where you can't or don't want to duplicate
the expressions that produce the mapping and the key).

> I don't see any connection between "append" and "fail if the key already
> exists". That's not what it means with lists.

If Python had built-in dictionaries with no unique-key constraint, and
you started with multidict({'a': 1, 'b': 2}) and appended 'a': 3 to
that, you'd get multidict({'a': 1, 'b': 2, 'a': 3}), just as if you'd
appended ('a', 3) to [('a', 1), ('b', 2)], except that this "list" is
indexed on the first half of each element.

If you try to append 'a': 3 to the actual Python dict {'a': 1, 'b': 2},
it should fail because {'a': 1, 'b': 2, 'a': 3} violates the unique-key
constraint of that data structure. The failure isn't the point, as
such. It just means the method can't do what it's meant to do, which is
add something to the dict while leaving everything that's already there
alone.

-- Ben

From j.van.dorp at deonet.nl  Tue Jun  5 02:57:26 2018
From: j.van.dorp at deonet.nl (Jacco van Dorp)
Date: Tue, 5 Jun 2018 08:57:26 +0200
Subject: [Python-ideas] datetime.timedelta literals
In-Reply-To:
References:
Message-ID:

I'd say seconds(), minutes() etc. functions in the datetime module
might be good. Same for better docstrings. I don't really think it's
worth adding literals, though.

@Pål You can't import literals. They're syntax, not just bound names.
(Unless we're counting future imports, but those aren't real imports.)

From benrudiak at gmail.com  Tue Jun  5 03:18:16 2018
From: benrudiak at gmail.com (Ben Rudiak-Gould)
Date: Tue, 5 Jun 2018 00:18:16 -0700
Subject: [Python-ideas] Allow popping of slices
In-Reply-To: <20180605001157.GT12683@ando.pearwood.info>
References: <20180605001157.GT12683@ando.pearwood.info>
Message-ID:

On Mon, Jun 4, 2018 at 5:11 PM, Steven D'Aprano wrote:
> class MyList(list):
>     def pop(self, pos):
>         if isinstance(pos, slice):
>             temp = self[pos]
>             del self[pos]
>             return temp
>         return super().pop(pos)
>
> Is that what you have in mind?

Yes. (I used almost exactly that to test my examples.)

>> def cut_deck(deck, pos):
>>     deck.extend(deck.pop(slice(0, pos)))
>
> I'm not sure that's an advantage over:
>
>     deck[:] = deck[pos:] + deck[:pos]
>
> but I suppose one might be faster or slower than the other. But I think
> the version with slices is much more clear.

That's fair. I didn't spend as long creating examples as I probably
should've.

> It might help your argument if you show equivalent (but working) code
> that doesn't rely on popping a slice.

When I have a collection of items and I want to consume them, process
them, and produce a new collection of items, I often find myself
writing something along the lines of

    items2 = []
    for item in items:
        ...
        items2.append(...)
        ...
    items = items2
    del items2

The last two statements aren't strictly necessary, but omitting them is
a bug factory in my experience; it's too easy to use the wrong variable
in subsequent code.

When the loop is simple enough I can write

    items = [... for item in items]

and when it's complicated enough it probably makes sense to split it
into a separate function. But I've many times wished that I could write

    for item in items.pop_all():
        ...
        items.append(...)
        ...

This proposal uses pop(slice(None)) instead, because it's a natural
extension of the existing meaning of that method.

My bfs function was a breadth-first search. The outer loop runs once
for each depth, and the inner loop once for each item at the current
depth (the frontier).
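For concreteness, here is a minimal sketch of what such a function can
look like, assuming the MyList subclass quoted above (this is only an
illustration, not the actual bfs; the neighbours argument is a made-up
callable returning adjacent nodes):

    def bfs(start, neighbours):
        # maps each reachable node to its distance from `start`
        depths = {start: 0}
        frontier = MyList([start])  # all nodes at the current depth
        depth = 0
        while frontier:
            depth += 1
            # pop the whole frontier in one move; anything appended
            # during the iteration becomes the next depth's frontier
            for node in frontier.pop(slice(None)):
                for n in neighbours(node):
                    if n not in depths:
                        depths[n] = depth
                        frontier.append(n)
        return depths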
The code in the Wikipedia article uses a FIFO queue instead and has
just one loop (but doesn't explicitly track the depth).

> https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/splice
>
> Do you have a link for the Perl version?

https://perldoc.perl.org/functions/splice.html

It's slightly more versatile: it essentially does

    old = ARRAY[OFFSET:OFFSET+LENGTH]
    ARRAY[OFFSET:OFFSET+LENGTH] = LIST
    return old

The special case LIST = [] is what I'm suggesting for the pop method.

>> I think it's useful not just because it's more
>> concise, but because it's linear/reversible: it moves data rather than
>> duplicating and then destroying it, which makes it less prone to bugs.
>
> I don't see how that is possible. You still have to move the data out of
> the sequence into a new sequence before deleting it from the original.

Under the hood it copies data because that's how von Neumann machines
work, but from the Python programmer's perspective it moves it.
(Similarly, std::swap(x, y) in C++ probably does { t = x; x = y; y = t; }
internally, but it looks like a swap from the programmer's perspective.)

>> 3. Promote del statements to expressions that return the same values
>> as the underlying __delitem__, __delattr__, etc., and make those
>> methods of built-in types return the thing that was deleted. (Or
>> introduce __popitem__, __popattr__, etc. which return a value.)
>
> I don't get how this allows us to pass slices to pop.

It doesn't; it basically makes del the new pop. It's almost pop
already, except that it doesn't return the deleted value. I wasn't
seriously proposing this, although I do like the idea. I don't think it
reaches the usefulness threshold for new syntax. Also, del foo[:]
would be slower if it had to move the deleted items into a new list
whether or not the caller had any interest in them. That's why I
suggested that del expressions call __popXXX__ instead, while del
statements continue to call __delXXX__; but now it's getting
complicated. Oh well.

-- Ben

From paal.drange at gmail.com  Tue Jun  5 04:08:32 2018
From: paal.drange at gmail.com (=?UTF-8?B?UMOlbCBHcsO4bsOlcyBEcmFuZ2U=?=)
Date: Tue, 5 Jun 2018 10:08:32 +0200
Subject: [Python-ideas] datetime.timedelta literals
In-Reply-To:
References:
Message-ID:

> You can't import literals. They're syntax, not just bound names.

I'm way out of my comfort zone now, but the parser could, for

    123.45_f

give

    __literal_f__(123.45)

and then that function should be imported.

I'm sure this idea has many shortcomings that I don't see, but that was
the reason why I wanted to import stuff.

Pål

From j.van.dorp at deonet.nl  Tue Jun  5 04:22:20 2018
From: j.van.dorp at deonet.nl (Jacco van Dorp)
Date: Tue, 5 Jun 2018 10:22:20 +0200
Subject: [Python-ideas] datetime.timedelta literals
In-Reply-To:
References:
Message-ID:

2018-06-05 10:08 GMT+02:00 Pål Grønås Drange:
>> You can't import literals. They're syntax, not just bound names.
>
> I'm way out of my comfort zone now, but the parser could, for
> `123.45_f`, give
> `__literal_f__(123.45)`
> and then that function should be imported.
>
> I'm sure this idea has many shortcomings that I don't see, but that was
> the reason why I wanted to import stuff.
>
> Pål

Before your code is executed, Python scans your entire file for syntax
errors. Since 123.45_f is currently not a valid literal, it'll just
print a syntax error without even looking at your imports.
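A tiny, deliberately broken snippet makes the point (hypothetical file
contents, not runnable by design):

    import datetime        # never executes: the file fails to compile
    duration = 123.45_f    # SyntaxError: the parser rejects the literal

The whole file is compiled before its first statement runs, so no
imported name could ever give the literal a meaning.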
To change that, the very core of Python would need to look completely
different. It'd be a metric fuckton of work for a whole lot of people.
I'm not a core dev myself or anything, but I'm pretty confident that
this isn't going to happen for a rather minor need like this.

From robertve92 at gmail.com  Tue Jun  5 06:48:55 2018
From: robertve92 at gmail.com (Robert Vanden Eynde)
Date: Tue, 5 Jun 2018 12:48:55 +0200
Subject: [Python-ideas] datetime.timedelta literals
In-Reply-To:
References:
Message-ID:

second, minute, hour (singular) timedelta objects in the module are a
good idea; one could do 5 * minute to get a timedelta, or one could do
value / minute to get a float.

    a = datetime.now()
    b = datetime(2018, 2, 3) + 5 * minute
    print((a - b).total_seconds())
    print((a - b) / minute)

On Tue, 5 Jun 2018 at 10:23, Jacco van Dorp wrote:

> 2018-06-05 10:08 GMT+02:00 Pål Grønås Drange:
> >> You can't import literals. They're syntax, not just bound names.
> >
> > I'm way out of my comfort zone now, but the parser could, for
> > `123.45_f`, give
> > `__literal_f__(123.45)`
> > and then that function should be imported.
> >
> > I'm sure this idea has many shortcomings that I don't see, but that was
> > the reason why I wanted to import stuff.
> >
> > Pål
>
> Before your code is executed, Python scans your entire file for syntax
> errors. Since 123.45_f is currently not a valid literal, it'll just
> print a syntax error without even looking at your imports.
>
> To change that, the very core of Python would need to look completely
> different. It'd be a metric fuckton of work for a whole lot of people.
> I'm not a core dev myself or anything, but I'm pretty confident that
> this isn't going to happen for a rather minor need like this.

From j.van.dorp at deonet.nl  Tue Jun  5 07:56:21 2018
From: j.van.dorp at deonet.nl (Jacco van Dorp)
Date: Tue, 5 Jun 2018 13:56:21 +0200
Subject: [Python-ideas] datetime.timedelta literals
In-Reply-To:
References:
Message-ID:

It'd also be pretty simple to implement.... Just list:

    minute = timedelta(minutes=1)
    hour = timedelta(hours=1)

etc., and you could import and use them like that. Or if you really
want to write 5*m, then just

    from datetime import minute as m

From desmoulinmichel at gmail.com  Tue Jun  5 08:24:12 2018
From: desmoulinmichel at gmail.com (Michel Desmoulin)
Date: Tue, 5 Jun 2018 14:24:12 +0200
Subject: [Python-ideas] Make asyncio.get_event_loop a builtin
In-Reply-To:
References:
Message-ID:

Well, that doesn't matter anyway. With breakpoint() we choose the
debugger implementation, so we could do the same here.

However, this would be addressing the wrong problem: the problem is
verbosity. The asyncio devs are already aware of that, since they are
working on providing asyncio.run() in 3.7. Now they also added
asyncio.get_running_loop(), which will soon be used everywhere instead
of get_event_loop(). That should have a shorter alias like
asyncio.loop([check_running=True]).

Currently the shortest we can do is:

    import asyncio as aio
    aio.get_event_loop()

This gets old very fast. It's annoying enough that we have to deal
manually with a very low-level event loop every time we go from JS/Go
to Python. It would be nice to have more helpers to do so.

On 25/05/2018 at
02:42, Ken Hilton wrote:
> On Tue May 22 22:08:40 (-0400), Chris Barker wrote:
>> while asyncio is in the standard library, it is not intended to be THE
>> async event loop implementation
>
> I'm surprised this is true - with dedicated syntax like async def/await,
> it's still not THE async event loop implementation? As far as I know,
> "async def" is a shorthand for
>
>     @asyncio.coroutine
>     def
>
> and "await" is short for "yield from".
>
> Sincerely,
> Ken Hilton

From desmoulinmichel at gmail.com  Tue Jun  5 08:47:44 2018
From: desmoulinmichel at gmail.com (Michel Desmoulin)
Date: Tue, 5 Jun 2018 14:47:44 +0200
Subject: [Python-ideas] Add hooks to asyncio lifecycle
In-Reply-To:
References:
Message-ID:

After years of playing with asyncio, I'm still having a harder time
using it than any other async architecture around. There are a lot of
different reasons for it, but this mail wants to address one particular
one:

The event loop and policy can be tweaked at any time, by anyone.

Now, it's hard enough to have to deal, manually, with a low-level event
loop. But having it exposed that much, and it being that flexible,
means any code can just do whatever it wants with it, and make a mess.

Several things in particular come to mind:

- Changing the event loop policy
- Changing the event loop
- Spawning a new loop
- Starting the loop
- Stopping the loop
- Closing the loop

Now, if you want to make any serious project with it, you currently
have to guard against all of those, especially if you want to have
proper cleanup code, good error messages and a decent debugging
experience.

I tried to do it for one year, and currently, it's very hard. You have
a lot of checks to make, redundantly in a lot of places. Some things
can only be done by providing a custom event policy/loop yourself, and,
of course, expecting (aka documenting and praying) that it's used.

For a lot of things, when it breaks, the people that haven't read the
doc in depth will have a hard time understanding the problem after the
fact.

Sometimes, it's just that your code uses somebody else's code, and that
somebody is not here to read your doc anymore. Now you have to check
their code to understand what they are doing that breaks your
expectations about the loop / policy or workflow.

Barring the creation of an entire higher-level framework that everybody
will agree on using and that makes messing up way harder, we can
improve this situation by adding hooks to those events.

I hence propose to add:

- asyncio.on_change_policy(cb:Callable[[EventLoopPolicy, EventLoopPolicy], EventLoopPolicy])
- asyncio.on_set_event_loop(cb:Callable[[EventLoop, EventLoop], EventLoop])
- asyncio.on_create_event_loop(cb:Callable[[EventLoop], EventLoop])
- EventLoop.on_start(cb:Callable[EventLoop])
- EventLoop.on_stop(cb:Awaitable[EventLoop])
- EventLoop.on_close(cb:Callable[EventLoop])
- EventLoop.on_set_debug_mode(cb:Callable[[loop]])

This would allow implementing safer, more robust and easier-to-debug
code.
E.g.:

- you can raise a warning stating that if somebody changes the event
policy, it must inherit from your custom one or deal with disabled
features

- you can raise an exception on loop swap and forbid it, saying that
your small script doesn't support it yet, so that it's easy to
understand the limit of your code

- you can hook on the event loop life cycle to automatically get on
board, or run cleanup code, start logging, warn that you were supposed
to start the loop yourself, etc.

From desmoulinmichel at gmail.com  Tue Jun  5 09:30:35 2018
From: desmoulinmichel at gmail.com (Michel Desmoulin)
Date: Tue, 5 Jun 2018 15:30:35 +0200
Subject: [Python-ideas] Making Path() a built in.
In-Reply-To:
References: <598EB1E5-AD1E-47C6-ADBC-A09036CFFA11@python.org>
Message-ID:

There are very few programs that never use any path operation.

Opening a file is such a common one we have a built-in for it with
open(), but you usually need to do some manipulation to get the file
path in the first place.

We have __file__, but the most common usage is to get the parent dir,
with os or pathlib.

Websites open static files and configuration files.

GUIs open files to be processed.

Data processing opens data source files.

Terminal apps often pass files as parameters.

All those paths you may resolve, turn absolute, check against and so
on. So much that pathlib.Path is one of the things I always put in a
PYTHONSTARTUP, since you need it so often.

I think Path fits the bill for being a built-in; I feel it's used more
often than any/all or zip, and maybe enumerate.

Besides, it would help to make people use it, as I regularly meet devs
that keep importing os.path because of habits, tutorials, books, docs,
etc.

From desmoulinmichel at gmail.com  Tue Jun  5 09:33:30 2018
From: desmoulinmichel at gmail.com (Michel Desmoulin)
Date: Tue, 5 Jun 2018 15:33:30 +0200
Subject: [Python-ideas] Runtime assertion with no overhead when not active
In-Reply-To:
References:
Message-ID: <81f63746-7786-7422-603d-4b044971efe7@gmail.com>

Maybe somebody already answered this, but what's the difference between
this and the keyword "assert", which basically can be stripped off
using "python -O"?

On 05/05/2018 at 10:04, Eloi Gaudry wrote:
>
> Hi folks,
>
> I intend to add a runtime assertion feature in python. Before
> submitting a PEP, I am sending a draft to this mailing list to
> discuss whether it would make sense (above this message). I have
> actually been using it for the last 2 years and it has proven to be
> robust and has achieved its primary goal.
>
> Briefly, the idea is to add a new assert that can be switched on/off
> depending on some variable/mechanism at runtime. The whole point of
> this assert is that it should not bring any overhead when off, i.e.
> by avoiding evaluating the expression enclosed in the runtime
> assert. It thus relies on Python grammar.
>
> Thanks for your feedback.
>
> Eloi
>
> Abstract
>
> This PEP aims at offering a runtime assert functionality, extending
> the compile-time assert already available.
>
> Rationale
>
> There is no runtime assert relying on Python grammar available. For
> diagnostics and/or debugging reasons, it would be valuable to add
> such a feature.
>
> A runtime assert makes sense when extra checks would be needed in a
> production environment (where non-debug builds are used).
>
> By extending the current Python grammar, it would be possible to
> limit the overhead induced by those runtime asserts when running in a
> non "assertive" mode (-ed).
> The idea here is to avoid evaluating the expression when the runtime
> assertion is not active.
>
> A brief example why avoiding evaluating the expression is needed to
> avoid any overhead in case the runtime assert should be ignored:
>
>     runtime_assert( 999 in { i:None for i in range( 10000000 ) } )
>
> Usage
>
>     runtime_assert( expr )
>
>     # would result in:
>     if runtime_assert_active and not expr:
>         print(RuntimeAssertionError())
>
> Implementation details
>
> There is already an implementation available, robust enough for
> production. The implementation is mainly grammar based, and thus the
> parser and the grammar need to be updated:
>
> * Python/graminit.c
> * Python/ast.c
> * Python/symtable.c
> * Python/compile.c
> * Python/Python-ast.c
> * Python/sysmodule.c
> * Modules/parsermodule.c
> * Include/Python-ast.h
> * Include/graminit.h
> * Include/pydebug.h
> * Grammar/Grammar
> * Parser/Python.asdl
> * Lib/lib2to3/Grammar.txt
> * Lib/compiler/ast.py
> * Lib/compiler/pycodegen.py
> * Lib/compiler/transformer.py
> * Lib/symbol.py
>
> References
>
> [1] PEP 306, How to Change Python's Grammar
> (http://www.python.org/dev/peps/pep-0306)
>
> Copyright
>
> This document has been placed in the public domain.

From yselivanov.ml at gmail.com  Tue Jun  5 10:17:05 2018
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Tue, 5 Jun 2018 10:17:05 -0400
Subject: [Python-ideas] Add hooks to asyncio lifecycle
In-Reply-To:
References:
Message-ID:

Hi Michel,

Yes, theoretically, it's possible to try to change an event loop policy
while an event loop is running, but I've yet to see a library (or user
code) that tries to do that (it's pointless anyways). There are
libraries like uvloop that tell their users to explicitly install a
special policy *before* they run their code, but that's about it. The
only use case for policies is allowing to plug in a custom event loop
implementation.

The current policies implementation, as well as the "get_event_loop()"
and "get_running_loop()" functions, is already very complex. For
example, I tried to add rudimentary support for "os.fork()" in 3.7 and
failed, because there are too many things that can go wrong.

Therefore without a *very* clear case for adding hooks, I'm -1 on
further extending and complicating the policies API (or any related
APIs). I actually want to propose to reduce the policies API surface by
deprecating and then removing "set_child_watcher()" methods. Besides,
there are many other higher-priority To-Do items for asyncio in 3.8,
like implementing Trio's nursery-like objects and cancellation scopes
or fixing tracebacks in Tasks.

That said, the above is my "IMO". And in your email you haven't
actually provided clear scenarios that could be solved by adding
"event loop hooks" to asyncio. So I have a few questions for you:

- Do you have real-life examples of libraries that abuse policies in
some weird ways?
- Are those libraries popular?
- What's the actual problem they try to solve by using policies?
- What problem are you trying to solve in your code that uses policies?
- Why do you think this isn't a documentation/tutorial issue?
- Can you list 2-3 clear examples where having hooks would benefit an
average asyncio user?
Thank you,
Yury

On Tue, Jun 5, 2018 at 8:48 AM Michel Desmoulin wrote:
>
> After years of playing with asyncio, I'm still having a harder time
> using it than any other async architecture around. There are a lot of
> different reasons for it, but this mail wants to address one
> particular one:
>
> The event loop and policy can be tweaked at any time, by anyone.
>
> Now, it's hard enough to have to deal, manually, with a low-level
> event loop. But having it exposed that much, and it being that
> flexible, means any code can just do whatever it wants with it, and
> make a mess.
>
> Several things in particular come to mind:
>
> - Changing the event loop policy
> - Changing the event loop
> - Spawning a new loop
> - Starting the loop
> - Stopping the loop
> - Closing the loop
>
> Now, if you want to make any serious project with it, you currently
> have to guard against all of those, especially if you want to have
> proper cleanup code, good error messages and a decent debugging
> experience.
>
> I tried to do it for one year, and currently, it's very hard.
E.G: > > - you can raise a warning stating that if somebody changes the event > policy, it must inherit from your custom one or deal with disabled features > > - you can raise an exception on loop swap and forbid it, saying that > your small script doesn't support it yet so that it's easy to understand > the limit of your code > > - you can hook on the event loop life cycle to automatically get on > board, or run clean up code, starting logging, warn that you were > supposed to start the loop yourself, etc > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -- Yury From Eloi.Gaudry at fft.be Tue Jun 5 11:28:21 2018 From: Eloi.Gaudry at fft.be (Eloi Gaudry) Date: Tue, 5 Jun 2018 15:28:21 +0000 Subject: [Python-ideas] Runtime assertion with no overhead when not active In-Reply-To: <81f63746-7786-7422-603d-4b044971efe7@gmail.com> References: , <81f63746-7786-7422-603d-4b044971efe7@gmail.com> Message-ID: They are basically the same thing, with one difference being that runtime_assert would be used for extension mainly, and switchable on/off without using -o flag on the command line. Eloi ________________________________ From: Python-ideas on behalf of Michel Desmoulin Sent: Tuesday, June 5, 2018 3:33:30 PM To: python-ideas at python.org Subject: Re: [Python-ideas] Runtime assertion with no overhead when not active Maybe somebody already answered this, but what's the difference between this and the keyword "assert", which basically can be stripped of using "python -o" ? Le 05/05/2018 ? 10:04, Eloi Gaudry a ?crit : > > Hi folks, > > > I intend to add a runtime assertion feature in python. Before > submitting a PEP, I am sending a draft to this mailing list to > discuss whether it would make sense (above this message). I have > actually been using it for the last 2 years and it has proven to be > robust and has achieved its primary goal. > > > Briefly, the idea is to add a new assert that can be switch on/off > depending on some variable/mechanism at runtime. The whole point of > this assert is that it should not bring any overhead when off, i.e. > by avoiding evaluating the expression enclosed in the runtime > assert. It thus relies on Python grammar. > > > Thanks for your feedback. > > > Eloi > > > > > > > > > > > > Abstract > > This PEP aims at offering a runtime assert functionnality, extending the > compiletime assert already available. > > > Rationale > > There is no runtime assert relying on python grammar available. For > diagnostics and/or debugging reasons, it would be valuable to add such a > feature. > > A runtime assert makes sense when extra checks would be needed in a > production environment (where non-debug builds are used). > > By extending the current python grammar, it would be possible to limit > the overhead induces by those runtime asserts when running in a non > "assertive" mode (-ed). The idea here is to avoid evaluating the > expression when runtime assertion is not active. > > A brief example why avoiding evaluating the expression is needed to > avoid any overhead in case the runtime assert should be ignored. 
>     runtime_assert( 999 in { i:None for i in range( 10000000 ) } )
>
> Usage
>
>     runtime_assert( expr )
>
>     # would result in:
>     if runtime_assert_active and not expr:
>         print(RuntimeAssertionError())
>
> Implementation details
>
> There is already an implementation available, robust enough for
> production. The implementation is mainly grammar based, and thus the
> parser and the grammar need to be updated:
>
> * Python/graminit.c
> * Python/ast.c
> * Python/symtable.c
> * Python/compile.c
> * Python/Python-ast.c
> * Python/sysmodule.c
> * Modules/parsermodule.c
> * Include/Python-ast.h
> * Include/graminit.h
> * Include/pydebug.h
> * Grammar/Grammar
> * Parser/Python.asdl
> * Lib/lib2to3/Grammar.txt
> * Lib/compiler/ast.py
> * Lib/compiler/pycodegen.py
> * Lib/compiler/transformer.py
> * Lib/symbol.py
>
> References
>
> [1] PEP 306, How to Change Python's Grammar
> (http://www.python.org/dev/peps/pep-0306)
>
> Copyright
>
> This document has been placed in the public domain.

From chris.barker at noaa.gov  Tue Jun  5 11:44:16 2018
From: chris.barker at noaa.gov (Chris Barker - NOAA Federal)
Date: Tue, 5 Jun 2018 08:44:16 -0700
Subject: [Python-ideas] Making Path() a built in.
In-Reply-To:
References: <598EB1E5-AD1E-47C6-ADBC-A09036CFFA11@python.org>
Message-ID:

Sorry for the top-post -- iPhone email sucks.

But: in regard to the whole "what paths to use to find resource files"
issue:

The "current working directory" concept can be very helpful. You put
your files in a directory tree somewhere -- could be inside the
package, could be anywhere else. Then all paths in your app are
relative to the root of that location. So all your app needs to do is
set the cwd on startup, and you're good to go. So you may use __file__
once at startup (or not, depending on configuration settings).

Alternatively, in the simple web server example, you have a root path
that gets tacked on automatically in your app, so again, you use
relative paths everywhere.

The concept of non-python-code resources being accessible within a
package is really a separate issue from generic data files, etc. that
you may want to access and serve a different way. In short, if you have
a collection of files that you want to access from Python, and also
might want to serve up with another application, you don't want to use
a python resources system.

Now I'm a bit confused about the topic of the thread, but I do like the
idea of putting Path in a more accessible place (though a bit concerned
about startup time if it were a built-in).

-CHB

Sent from my iPhone

> On Jun 5, 2018, at 6:30 AM, Michel Desmoulin wrote:
>
> There are very few programs that never use any path operation.
>
> Opening a file is such a common one we have a built-in for it with
> open(), but you usually need to do some manipulation to get the file
> path in the first place.
>
> We have __file__, but the most common usage is to get the parent dir,
> with os or pathlib.
>
> Websites open static files and configuration files.
>
> GUIs open files to be processed.
>
> Data processing opens data source files.
>
> Terminal apps often pass files as parameters.
>
> All those paths you may resolve, turn absolute, check against and so
> on. So much that pathlib.Path is one of the things I always put in a
> PYTHONSTARTUP, since you need it so often.
>
> I think Path fits the bill for being a built-in; I feel it's used more
> often than any/all or zip, and maybe enumerate.
>
> Besides, it would help to make people use it, as I regularly meet devs
> that keep importing os.path because of habits, tutorials, books, docs,
> etc.

From chris.barker at noaa.gov  Tue Jun  5 12:44:59 2018
From: chris.barker at noaa.gov (Chris Barker)
Date: Tue, 5 Jun 2018 09:44:59 -0700
Subject: [Python-ideas] Add dict.append and dict.extend
Message-ID:

I think your proposal got a bit confused by the choice of names, and
that you were proposing two things, one of which I think already exists
(setdefault).

So, I _think_ what you are proposing is that there be a method
something like:

    def exclusive_add(self, key, value):
        if key in self:
            raise KeyError("the key: {} already exists in this dict".format(key))
        self[key] = value

Which would make sense if that is a common use case, and you make a
pretty good case for that, with the analogy to UNIQUE in database
tables. I will point out that the DB case is quite different, because
python dicts have a way of spelling it that's really pretty
straightforward, performant and robust.

However, .setdefault() is essentially a replacement for:

    if key in a_dict:
        value = a_dict[key]
    else:
        a_dict[key] = default_value
        value = default_value

or a similar try: except block:

    try:
        value = a_dict[key]
    except KeyError:
        a_dict[key] = default_value
        value = default_value

And I'm glad it's there.

So -- if this is a fairly common use case, then maybe it's worth
adding. Note that this would be adding a new method to not just dict,
but to the entire mapping protocol (ABC) -- so maybe a heavier lift
than you're imagining.

-CHB

PS: making sure this is what you are suggesting:

    In [12]: class my_dict(dict):
        ...:     def exclusive_add(self, key, value):
        ...:         if key in self:
        ...:             raise KeyError("the key: {} already exists in this dict".format(key))
        ...:         self[key] = value
        ...:

    In [13]: d = my_dict()

    In [14]: d.exclusive_add('this', 5)

    In [15]: d.exclusive_add('that', 6)

    In [16]: d.exclusive_add('this', 7)
    ---------------------------------------------------------------------------
    KeyError                          Traceback (most recent call last)
    <ipython-input> in <module>()
    ----> 1 d.exclusive_add('this', 7)

    <ipython-input> in exclusive_add(self, key, value)
          2     def exclusive_add(self, key, value):
          3         if key in self:
    ----> 4             raise KeyError("the key: {} already exists in this dict".format(key))
          5         self[key] = value
          6

    KeyError: 'the key: this already exists in this dict'

--

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

Chris.Barker at noaa.gov
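To make the "mapping protocol" point above concrete: if the method
lived on the MutableMapping ABC, every class inheriting from it would
pick it up as a mixin method, roughly as in this sketch (hypothetical
names; note that dict itself is only registered with the ABC, so it
would still need its own C implementation):

    from collections.abc import MutableMapping

    def exclusive_add(self, key, value):
        # same semantics as the my_dict version above
        if key in self:
            raise KeyError("the key: {} already exists in this mapping".format(key))
        self[key] = value

    # inherited by MutableMapping subclasses, but not by dict, which
    # is merely registered and doesn't inherit the mixin methods
    MutableMapping.exclusive_add = exclusive_add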
From klahnakoski at mozilla.com  Tue Jun  5 13:25:45 2018
From: klahnakoski at mozilla.com (Kyle Lahnakoski)
Date: Tue, 5 Jun 2018 13:25:45 -0400
Subject: [Python-ideas] Runtime assertion with no overhead when not active
In-Reply-To: <64D2E0DB-8176-4968-99F4-1403261CEB42@barrys-emacs.org>
References: <64D2E0DB-8176-4968-99F4-1403261CEB42@barrys-emacs.org>
Message-ID:

I currently use the form

    <flag> and log_function( <args> )

where <flag> is some module variable, usually "DEBUG". I do this
because it is one line, and it ensures the log_function parameters are
not evaluated.

*IF* runtime assertions had a switch so they have no overhead when not
active, how much faster can it get? How expensive is the check?

On 2018-05-10 03:55, Barry Scott wrote:
>
> My logging example would be
>
>     log( control_flag, msg_expr )
>
> expanding to:
>
>     if <control_flag>:
>         log_function( <msg_expr> )
>
> Barry
>
> This idea requires the same sort of machinery in python that I was
> hoping for to implement the short-circuit logging.

From njs at pobox.com  Tue Jun  5 14:07:14 2018
From: njs at pobox.com (Nathaniel Smith)
Date: Tue, 5 Jun 2018 11:07:14 -0700
Subject: [Python-ideas] Add hooks to asyncio lifecycle
In-Reply-To:
References:
Message-ID:

Twisted's reactor API has some lifecycle hooks:

https://twistedmatrix.com/documents/18.4.0/api/twisted.internet.interfaces.IReactorCore.html#addSystemEventTrigger

My impression is that this is actually pretty awkward for
twisted/asyncio interoperability, because if you're trying to use a
twisted library on top of an asyncio loop then there's no reliable way
to implement these methods. And twisted uses these internally for
things like managing its thread pool.

There is some subtlety, though, because in twisted, reactors can't
transition from the stopped state back into the running state, which
implies an invariant where the start and shutdown hooks can be called
at most once.

Anyway, I'm not a twisted expert, but wanted to flag this so you all
know that if you're talking about adding lifecycle hooks then you know
to go talk to them and get the details.

(Trio does have some sorta-kinda analogous functionality. Specifically,
it has a concept of "system tasks" that are automatically cancelled
when the main task exits, so they have a chance to do any cleanup at
that point. But trio's lifecycle model is so different that I'm not
sure how helpful this is.)

-n

On Tue, Jun 5, 2018, 05:48 Michel Desmoulin wrote:

> After years of playing with asyncio, I'm still having a harder time
> using it than any other async architecture around. There are a lot of
> different reasons for it, but this mail wants to address one
> particular one:
>
> The event loop and policy can be tweaked at any time, by anyone.
>
> Now, it's hard enough to have to deal, manually, with a low-level
> event loop. But having it exposed that much, and it being that
> flexible, means any code can just do whatever it wants with it, and
> make a mess.
>
> Several things in particular come to mind:
>
> - Changing the event loop policy
> - Changing the event loop
> - Spawning a new loop
> - Starting the loop
> - Stopping the loop
> - Closing the loop
>
> Now, if you want to make any serious project with it, you currently
> have to guard against all of those, especially if you want to have
> proper cleanup code, good error messages and a decent debugging
> experience.
>
> I tried to do it for one year, and currently, it's very hard.
> You have a lot of checks to make, redundantly in a lot of places.
> Some things can only be done by providing a custom event policy/loop
> yourself, and, of course, expecting (aka documenting and praying)
> that it's used.
>
> For a lot of things, when it breaks, the people that haven't read the
> doc in depth will have a hard time understanding the problem after
> the fact.
>
> Sometimes, it's just that your code uses somebody else's code, and
> that somebody is not here to read your doc anymore. Now you have to
> check their code to understand what they are doing that breaks your
> expectations about the loop / policy or workflow.
>
> Barring the creation of an entire higher-level framework that
> everybody will agree on using and that makes messing up way harder,
> we can improve this situation by adding hooks to those events.
>
> I hence propose to add:
>
> - asyncio.on_change_policy(cb:Callable[[EventLoopPolicy, EventLoopPolicy], EventLoopPolicy])
> - asyncio.on_set_event_loop(cb:Callable[[EventLoop, EventLoop], EventLoop])
> - asyncio.on_create_event_loop(cb:Callable[[EventLoop], EventLoop])
> - EventLoop.on_start(cb:Callable[EventLoop])
> - EventLoop.on_stop(cb:Awaitable[EventLoop])
> - EventLoop.on_close(cb:Callable[EventLoop])
> - EventLoop.on_set_debug_mode(cb:Callable[[loop]])
>
> This would allow implementing safer, more robust and easier-to-debug
> code. E.g.:
>
> - you can raise a warning stating that if somebody changes the event
> policy, it must inherit from your custom one or deal with disabled
> features
>
> - you can raise an exception on loop swap and forbid it, saying that
> your small script doesn't support it yet, so that it's easy to
> understand the limit of your code
>
> - you can hook on the event loop life cycle to automatically get on
> board, or run cleanup code, start logging, warn that you were
> supposed to start the loop yourself, etc.

From python at mrabarnett.plus.com  Tue Jun  5 14:40:10 2018
From: python at mrabarnett.plus.com (MRAB)
Date: Tue, 5 Jun 2018 19:40:10 +0100
Subject: [Python-ideas] Runtime assertion with no overhead when not active
In-Reply-To:
References: <64D2E0DB-8176-4968-99F4-1403261CEB42@barrys-emacs.org>
Message-ID: <40fd905e-7cfe-7a37-0374-3893de0d83c1@mrabarnett.plus.com>

On 2018-06-05 18:25, Kyle Lahnakoski wrote:
>
> I currently use the form
>
>     <flag> and log_function( <args> )
>
> where <flag> is some module variable, usually "DEBUG". I do this
> because it is one line, and it ensures the log_function parameters
> are not evaluated.
>
You'd get the same result with:

    if <flag>: log_function( <args> )

> *IF* runtime assertions had a switch so they have no overhead when
> not active, how much faster can it get? How expensive is the check?
>
> On 2018-05-10 03:55, Barry Scott wrote:
>>
>> My logging example would be
>>
>>     log( control_flag, msg_expr )
>>
>> expanding to:
>>
>>     if <control_flag>:
>>         log_function( <msg_expr> )
>>
>> Barry
>>
>> This idea requires the same sort of machinery in python that I was
>> hoping for to implement the short-circuit logging.
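To make the two guard spellings concrete, a small self-contained sketch
(DEBUG, log_function and expensive_message are stand-in names, not from
any of the emails above):

    DEBUG = False  # module-level switch

    def log_function(msg):
        print(msg)

    def expensive_message():
        # imagine something genuinely costly to build
        return "state: %d items" % len(list(range(10**6)))

    # both forms skip evaluating the argument when the flag is off
    DEBUG and log_function(expensive_message())
    if DEBUG:
        log_function(expensive_message())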
From steve at pearwood.info  Tue Jun  5 19:10:34 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 6 Jun 2018 09:10:34 +1000
Subject: [Python-ideas] Add dict.append and dict.extend
In-Reply-To:
References:
Message-ID: <20180605231034.GW12683@ando.pearwood.info>

On Tue, Jun 05, 2018 at 09:44:59AM -0700, Chris Barker via Python-ideas wrote:

> I think your proposal got a bit confused by the choice of names, and that
> you were proposing two things, one of which I think already exists
> (setdefault).

Ben's very first paragraph in this thread says:

    I'd like to propose adding `append` and `extend` methods to dicts
    which behave like `__setitem__` and `update` respectively, except
    that they raise an exception (KeyError?) instead of overwriting
    preexisting entries.

setdefault does not raise an exception. So while your enthusiasm does
you credit (setdefault is an excellent little method that is not known
anywhere near as well as it ought to be), I don't think it is the least
bit relevant here.

Ben's proposal is for something like your exclusive_add below (although
written in C so it would be atomic), and then given that, something
like:

    def exclusive_update(self, other):
        # simplified version
        for key in other:
            self.exclusive_add(key, other[key])

except that it isn't clear to me what should happen if a key matches:

1) should the update have an "all or nothing" guarantee? (either the
entire update succeeds, or none of it)

2) if not, what happens on partial success? are the results dependent
on the order of the attempted update?

> So, I _think_ what you are proposing is that there be a method something
> like:
>
>     def exclusive_add(self, key, value):
>         if key in self:
>             raise KeyError("the key: {} already exists in this dict".format(key))
>         self[key] = value
>
> Which would make sense if that is a common use case, and you make a pretty
> good case for that, with the analogy to UNIQUE in database tables.

I'm not sure that it is so common. I think I've wanted the opposite
behaviour more often: aside from initialisation, only allow setting
existing keys, and raise if the key doesn't exist.

Nor do I think the DB analogy is very convincing, since we have no
transactions with ACID guarantees for dicts. Pretty much the only
similarity between a dict and a DB is that they both map a key to a
value.

> I will point out that the DB case is quite different, because python dicts
> have a way of spelling it that's really pretty straightforward, performant
> and robust.

I'm confused... first you say that Ben makes a good case for this
functionality with the DB analogy, and then one sentence later, you say
the DB case is very different. So not a good case? I don't understand.

And what is this way of spelling "it" (what is it?) that's
straightforward and robust? You've completely lost me, sorry.

-- Steve

From steve at pearwood.info  Tue Jun  5 19:42:27 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 6 Jun 2018 09:42:27 +1000
Subject: [Python-ideas] Making Path() a built in.
In-Reply-To:
References: <598EB1E5-AD1E-47C6-ADBC-A09036CFFA11@python.org>
Message-ID: <20180605234227.GX12683@ando.pearwood.info>

On Tue, Jun 05, 2018 at 03:30:35PM +0200, Michel Desmoulin wrote:

> There are very few programs that never use any path operation.

On the contrary, there are many programs that never use any path
operations. I have many programs which take input and provide output
and no files are involved at all.

Of course path manipulation is common.
But hyperbole about how common it is does not help your case.

> Opening a file is such a common one we have a built-in for it with
> open(), but you usually need to do some manipulation to get the file
> path in the first place.

> We have __file__, but the most common usage is to get the parent dir,
> with os or pathlib.

Parent directory of what? Are you talking about the parent directory of
the script? I almost never care about the script directory. I sometimes
care about file names passed in by the user, and maybe ten percent of
the time I care about the parent directory of those file names. I
sometimes care about the current working directory. But I can't think
of the last time I've cared about __file__. In my experience, that's an
uncommon need.

You keep making absolute claims about what is "most common". What is
your evidence for these absolute claims? Have you done a survey of all
the Python software in existence? Or is what you mean that this is
*your* most common usage? Because it isn't *my* most common usage.

[...]

> So much that pathlib.Path is one of the things I always put in a
> PYTHONSTARTUP since you need it so often.

Please don't speak for me. I don't need it at all, and even if I did,
putting it in *your* startup file doesn't help me.

> I think Path fits the bill for being a built-in, I feel it's used more
> often than any/all or zip, and maybe enumerate.

This is a quick and dirty survey of my code:

    [steve at ando python]$ grep Path *.py */*.py */*/*.py | wc -l
    21
    [steve at ando python]$ grep "enumerate(" *.py */*.py */*/*.py | wc -l
    307
    [steve at ando python]$ grep "zip(" *.py */*.py */*/*.py | wc -l
    499
    [steve at ando python]$ grep "any(" *.py */*.py */*/*.py | wc -l
    96
    [steve at ando python]$ grep "all(" *.py */*.py */*/*.py | wc -l
    224

So I would say that Path is used about 25 times less often than zip,
and I wouldn't consider zip to be an essential builtin. I use math.sqrt
about 15 times more often than Path.

> Besides, it would help to make people use it, as I regularly meet devs
> that keep importing os.path because of habits, tutorials, books, docs, etc.

Why do you want to *make* people use it? Why shouldn't people use
os.path if it meets their needs?

-- Steve

From chris.barker at noaa.gov  Wed Jun  6 02:18:34 2018
From: chris.barker at noaa.gov (Chris Barker)
Date: Tue, 5 Jun 2018 23:18:34 -0700
Subject: [Python-ideas] Add dict.append and dict.extend
In-Reply-To: <20180605231034.GW12683@ando.pearwood.info>
References: <20180605231034.GW12683@ando.pearwood.info>
Message-ID:

On Tue, Jun 5, 2018 at 4:10 PM, Steven D'Aprano wrote:

> I'm confused... first you say that Ben makes a good case for this
> functionality with the DB analogy, and then one sentence later, you say
> the DB case is very different. So not a good case? I don't understand.

I wasn't trying to make a case either way -- on the one hand, there is
a good analogy to DB UNIQUE; on the other hand, dicts are really pretty
different from DBs.

> And what is this way of spelling "it" (what is it?) that's
> straightforward and robust? You've completely lost me, sorry.

    if key in dict:
        raise KeyError

If you had to do that with a DB before adding a record, it could be a
pretty expensive operation....
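For contrast, this is what that constraint looks like on the DB side --
a small sqlite3 sketch with made-up table and column names:

    import sqlite3

    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE users (name TEXT UNIQUE)")
    con.execute("INSERT INTO users VALUES ('ben')")
    try:
        con.execute("INSERT INTO users VALUES ('ben')")  # duplicate key
    except sqlite3.IntegrityError as e:
        print(e)  # UNIQUE constraint failed: users.name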
Thinking on this a bit more, I'm pretty -1 -- the main reason being
that if we had a dict.exclusive_add() method, when you wanted to use
it, you'd have to catch the KeyError and do something:

    try:
        my_dict.exclusive_add(key, val)
    except KeyError:
        do_something_else_probably_with(my_dict, val)

Since you'd have to catch it, you aren't really simplifying the code
much anyway:

    if key in my_dict:
        do_something_else
    else:
        my_dict[key] = val

That's the same amount of code, and I think the explicit check is
clearer....

Also -- seems kind of odd to raise a KeyError when the key IS there?!?

-CHB

--

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

Chris.Barker at noaa.gov

From chris.barker at noaa.gov  Wed Jun  6 02:23:55 2018
From: chris.barker at noaa.gov (Chris Barker)
Date: Tue, 5 Jun 2018 23:23:55 -0700
Subject: [Python-ideas] Making Path() a built in.
In-Reply-To: <20180605234227.GX12683@ando.pearwood.info>
References: <598EB1E5-AD1E-47C6-ADBC-A09036CFFA11@python.org>
 <20180605234227.GX12683@ando.pearwood.info>
Message-ID:

On Tue, Jun 5, 2018 at 4:42 PM, Steven D'Aprano wrote:

> This is a quick and dirty survey of my code:
>
> [steve at ando python]$ grep Path *.py */*.py */*/*.py | wc -l
> 21
> [steve at ando python]$ grep "enumerate(" *.py */*.py */*/*.py | wc -l
> 307
> [steve at ando python]$ grep "zip(" *.py */*.py */*/*.py | wc -l
> 499
> [steve at ando python]$ grep "any(" *.py */*.py */*/*.py | wc -l
> 96
> [steve at ando python]$ grep "all(" *.py */*.py */*/*.py | wc -l
> 224

I'm not saying I agree with the OP, but this is not a fair comparison
at all -- Path is pretty new, and even newer is it being functional
with most of the stdlib.

I do a lot of path manipulations in my code, but hardly ever use Path --
only brand new code uses it.

So I think you'd need to grep for os.path (and probably shutil, too) to
get a meaningful answer.

But the key here is that there is no consensus that Path is the new
"obvious way to do it", and adding it to builtins would be essentially
making that statement.

-CHB

--

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

Chris.Barker at noaa.gov

From steve at pearwood.info  Wed Jun  6 02:53:32 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 6 Jun 2018 16:53:32 +1000
Subject: [Python-ideas] Making Path() a built in.
In-Reply-To:
References: <598EB1E5-AD1E-47C6-ADBC-A09036CFFA11@python.org>
 <20180605234227.GX12683@ando.pearwood.info>
Message-ID: <20180606065331.GZ12683@ando.pearwood.info>

On Tue, Jun 05, 2018 at 11:23:55PM -0700, Chris Barker wrote:

> On Tue, Jun 5, 2018 at 4:42 PM, Steven D'Aprano wrote:
>
> > This is a quick and dirty survey of my code:

[snip grepping]

> I'm not saying I agree with the OP, but this is not a fair comparison at
> all -- Path is pretty new, and even newer is it being functional with most
> of the stdlib.
>
> I do a lot of path manipulations in my code, but hardly ever use Path --
> only brand new code uses it.
>
> so I think you'd need to grep for os.path (and probably shutil, too) to get
> a meaningful answer.

Why? The OP isn't asking for os.path and shutil to be builtins.

The OP's statement wasn't "file manipulations of any sort, using any
technique including Path, os.path, shutil and string processing, are
more common than enumerate etc". (For *my own code* I'd disagree with
that claim too, but others' experience may vary.) It was specifically
that Path was more common than enumerate. Maybe it is for him, but that
isn't a universal fact.

> But key here is that there is no consensus that Path is the new "obvious
> way to do it", and adding it to builtins would be essentially making that
> statement.

Indeed. I think there are at least three hurdles to overcome before
Path could become a builtin:

- consensus, or at least a BDFL ruling, that path manipulation is
important enough to be a builtin (if we're voting, I'd rather have sqrt
as a builtin, but maybe that's just me :-);

- agreement that Path is the One Obvious Way that should be officially
promoted over os.path;

- and determination that making Path a builtin would not cause an
excessive or onerous burden on the core developers, or a serious
regression in interpreter startup. (pathlib is a reasonably big
library, over 1000 LOC, which relies on over a dozen other modules.)

-- 
Steve

From rosuav at gmail.com Wed Jun 6 04:34:20 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Wed, 6 Jun 2018 18:34:20 +1000
Subject: [Python-ideas] Add dict.append and dict.extend
In-Reply-To: 
References: <20180605231034.GW12683 at ando.pearwood.info>
Message-ID: 

On Wed, Jun 6, 2018 at 4:18 PM, Chris Barker via Python-ideas wrote:
>
> Also -- seems kind of odd to raise a KeyError when the key IS there?!?

class DuplicateKeyError(LookupError): pass

Problem solved :)

ChrisA

From j.van.dorp at deonet.nl Wed Jun 6 05:51:43 2018
From: j.van.dorp at deonet.nl (Jacco van Dorp)
Date: Wed, 6 Jun 2018 11:51:43 +0200
Subject: [Python-ideas] Making Path() a built in.
In-Reply-To: <20180606065331.GZ12683 at ando.pearwood.info>
References: <598EB1E5-AD1E-47C6-ADBC-A09036CFFA11 at python.org>
 <20180605234227.GX12683 at ando.pearwood.info>
 <20180606065331.GZ12683 at ando.pearwood.info>
Message-ID: 

For the startup time, you could keep it around as builtin but save the
import time until someone actually uses it.

While I agree sqrt should be a builtin as well, I think there's a good
argument to be made for Path too. I just switched to it this past
month, and I'm liking it a lot over constructs like (real code
example):

os.path.join(os.path.dirname(os.path.abspath(__file__)), "..", "filename")

(could probably be improved by changing to __path__ and removing the
dirname, but the current version works...)

sqrt isn't as much used in situations I've been in - and when it was, I
generally got a giant heap of data to process and was doing that with
numpy anyway.

From desmoulinmichel at gmail.com Wed Jun 6 06:40:25 2018
From: desmoulinmichel at gmail.com (Michel Desmoulin)
Date: Wed, 6 Jun 2018 12:40:25 +0200
Subject: [Python-ideas] Add hooks to asyncio lifecycle
In-Reply-To: 
References: 
Message-ID: <8d13eb02-42a2-93a3-1387-8a16bf54c9fc at gmail.com>

Hi Yuri,

>
> I actually want to propose to reduce policies API surface by
> deprecating and then removing "set_child_watcher()" methods.
Besides,
> there are many other higher priority To-Do items for asyncio in 3.8,
> like implementing Trio's nursery-like objects and cancellation scopes
> or fixing tracebacks in Tasks.
>
> That said, the above is my "IMO". And in your email you haven't
> actually provided clear scenarios that could be solved by adding
> "event loop hooks" to asyncio. So I have a few questions for you:
>
> - Do you have real-life examples of libraries that abuse policies in
> some weird ways?

It's not abuse, it's just regular use of a documented feature, but a
very potent feature with deep consequences.

> - Are those libraries popular?

aiohttp comes to mind. But the fact is, I think people explicitly avoid
using policies and custom loops because they know that they can be
swapped, and you can't make it a hard requirement since they can be
swapped under your nose.

> - What's the actual problem they try to solve by using policies?

e.g:

https://github.com/aio-libs/aiohttp/blob/53828229a13dc72137f430abc7a0c469678f3dd6/aiohttp/worker.py

> - What problem are you trying to solve in your code that uses policies?

I have tried to create an abstraction that creates an uvloop if it
exists, or a regular loop otherwise, or integrates with the twisted
reactor or the trio event loop if it's loaded.

I want to abstract that from the user, so I tried to put that in a
policy. But that's dangerous since it can be changed at any time, so I
gave up on it and made it explicit. Of course, if the user misses that
in the doc (hopefully, it's company-internal code so they should be
trained), it will be a bummer to debug.

Another example is the proof of concept of nurseries we talked about on
twitter:

https://0bin.net/paste/V5KyhAg-2i5EOyoK#dzBvhdCVeFy8Q2xNcxXyqwtyQFgkxlKI3u5QG0buIcT

Yet another one, with a custom loop this time:

I want to provide a fail-fast mode for code using loop.run_forever. The
goal is to make it crash when an uncaught exception occurs, instead of
just logging it in the console, to ease debugging on a local machine.

Of course it would be best to use run_until_complete instead, but you
don't always get to choose.

So we set a task factory on the loop. But, of course, you lose
everything if something changes the loop on the fly, which my code has
no idea has happened.

Another use case is to just log that the loop / policy has changed in
debug mode. It's always something I want to know anyway, because it has
consequences on my entire program since those are pretty fundamental
components.

> - Why do you think this isn't a documentation/tutorial issue?
> - Can you list 2-3 clear examples where having hooks would benefit an
> average asyncio user?

The best documentation is the one you don't need to write. But if I'm
being honest, I'd like to have it despite having a good warning in the
documentation.

When you run Django's manage.py runserver, it will check your code for a
lot of common issues and raise a warning or an exception, letting you
know what to do.

With those hooks, I could check if a policy or a loop is changed, and if
some things in my code depend on a policy or a loop, I can do the same
and raise a warning or an exception.

This benefits the users because they get the info exactly in context,
and don't just rely on reading the doc, understanding it, remembering it
and applying it.

And that benefits the framework writers because that's less support and
fewer bug reports to deal with.

Using asyncio is scary and mysterious enough for a lot of people, so I
want to make the experience as natural as possible.
I don't want people to have to read my doc and learn what event loops
and policies are for basic usage. It's too much. But on the other hand,
I do want them to be able to debug a problem if the loop or policy is
swapped.

>
> Thank you,
> Yury

Thank you too

From rosuav at gmail.com Wed Jun 6 08:51:08 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Wed, 6 Jun 2018 22:51:08 +1000
Subject: [Python-ideas] Making Path() a built in.
In-Reply-To: 
References: <598EB1E5-AD1E-47C6-ADBC-A09036CFFA11 at python.org>
 <20180605234227.GX12683 at ando.pearwood.info>
 <20180606065331.GZ12683 at ando.pearwood.info>
Message-ID: 

On Wed, Jun 6, 2018 at 7:51 PM, Jacco van Dorp wrote:
> For the startup time, you could keep it around as builtin but save the
> import time until someone actually uses it.

That would mean creating a system of lazy imports, which is an
entirely separate proposal.

ChrisA

From j.van.dorp at deonet.nl Wed Jun 6 09:32:41 2018
From: j.van.dorp at deonet.nl (Jacco van Dorp)
Date: Wed, 6 Jun 2018 15:32:41 +0200
Subject: [Python-ideas] Making Path() a built in.
In-Reply-To: 
References: <598EB1E5-AD1E-47C6-ADBC-A09036CFFA11 at python.org>
 <20180605234227.GX12683 at ando.pearwood.info>
 <20180606065331.GZ12683 at ando.pearwood.info>
Message-ID: 

2018-06-06 14:51 GMT+02:00 Chris Angelico :
> On Wed, Jun 6, 2018 at 7:51 PM, Jacco van Dorp wrote:
>> For the startup time, you could keep it around as builtin but save the
>> import time until someone actually uses it.
>
> That would mean creating a system of lazy imports, which is an
> entirely separate proposal.
>
> ChrisA

It's that complicated? I know it's not exactly properties on a class,
but I thought there were other cases, even if I couldn't name one.
Don't mind me, then.

From andrew.svetlov at gmail.com Wed Jun 6 09:46:16 2018
From: andrew.svetlov at gmail.com (Andrew Svetlov)
Date: Wed, 6 Jun 2018 16:46:16 +0300
Subject: [Python-ideas] Add hooks to asyncio lifecycle
In-Reply-To: <8d13eb02-42a2-93a3-1387-8a16bf54c9fc at gmail.com>
References: 
 <8d13eb02-42a2-93a3-1387-8a16bf54c9fc at gmail.com>
Message-ID: 

Hold on. aiohttp doesn't suffer from the absence of hooks. Moreover, I
don't see how these hooks could be utilized by aiohttp. Gunicorn
workers are not imported and instantiated by user code; they are
imported by gunicorn using a command line parameter.

Please choose a different use case as the proof of your request.

On Wed, Jun 6, 2018 at 1:41 PM Michel Desmoulin wrote:

>
> Hi Yuri,
>
> >
> > I actually want to propose to reduce policies API surface by
> > deprecating and then removing "set_child_watcher()" methods. Besides,
> > there are many other higher priority To-Do items for asyncio in 3.8,
> > like implementing Trio's nursery-like objects and cancellation scopes
> > or fixing tracebacks in Tasks.
> >
> > That said, the above is my "IMO". And in your email you haven't
> > actually provided clear scenarios that could be solved by adding
> > "event loop hooks" to asyncio. So I have a few questions for you:
> >
> > - Do you have real-life examples of libraries that abuse policies in
> > some weird ways?
>
> It's not abuse, it's just regular use of a documented feature, but a
> very potent feature with deep consequences.
>
> > - Are those libraries popular?
>
> aiohttp comes to mind. But the fact is, I think people explicitly avoid
> using policies and custom loops because they know that they can be
> swapped, and you can't make it a hard requirement since they can be
> swapped under your nose.
>
> > - What's the actual problem they try to solve by using policies?
>
> e.g:
>
> https://github.com/aio-libs/aiohttp/blob/53828229a13dc72137f430abc7a0c469678f3dd6/aiohttp/worker.py
>
> > - What problem are you trying to solve in your code that uses policies?
>
> I have tried to create an abstraction that creates an uvloop if it
> exists, or a regular loop otherwise, or integrates with the twisted
> reactor or the trio event loop if it's loaded.
>
> I want to abstract that from the user, so I tried to put that in a
> policy. But that's dangerous since it can be changed at any time, so I
> gave up on it and made it explicit. Of course, if the user misses that
> in the doc (hopefully, it's company-internal code so they should be
> trained), it will be a bummer to debug.
>
> Another example is the proof of concept of nurseries we talked about on
> twitter:
>
> https://0bin.net/paste/V5KyhAg-2i5EOyoK#dzBvhdCVeFy8Q2xNcxXyqwtyQFgkxlKI3u5QG0buIcT
>
> Yet another one, with a custom loop this time:
>
> I want to provide a fail-fast mode for code using loop.run_forever. The
> goal is to make it crash when an uncaught exception occurs, instead of
> just logging it in the console, to ease debugging on a local machine.
>
> Of course it would be best to use run_until_complete instead, but you
> don't always get to choose.
>
> So we set a task factory on the loop. But, of course, you lose
> everything if something changes the loop on the fly, which my code has
> no idea has happened.
>
> Another use case is to just log that the loop / policy has changed in
> debug mode. It's always something I want to know anyway, because it has
> consequences on my entire program since those are pretty fundamental
> components.
>
> > - Why do you think this isn't a documentation/tutorial issue?
> > - Can you list 2-3 clear examples where having hooks would benefit an
> > average asyncio user?
>
> The best documentation is the one you don't need to write. But if I'm
> being honest, I'd like to have it despite having a good warning in the
> documentation.
>
> When you run Django's manage.py runserver, it will check your code for a
> lot of common issues and raise a warning or an exception, letting you
> know what to do.
>
> With those hooks, I could check if a policy or a loop is changed, and if
> some things in my code depend on a policy or a loop, I can do the same
> and raise a warning or an exception.
>
> This benefits the users because they get the info exactly in context,
> and don't just rely on reading the doc, understanding it, remembering it
> and applying it.
>
> And that benefits the framework writers because that's less support and
> fewer bug reports to deal with.
>
> Using asyncio is scary and mysterious enough for a lot of people, so I
> want to make the experience as natural as possible. I don't want people
> to have to read my doc and learn what event loops and policies are for
> basic usage. It's too much. But on the other hand, I do want them to be
> able to debug a problem if the loop or policy is swapped.
>
> >
> > Thank you,
> > Yury
>
> Thank you too
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
-- 
Thanks,
Andrew Svetlov
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From chris.barker at noaa.gov Wed Jun 6 12:04:12 2018
From: chris.barker at noaa.gov (Chris Barker - NOAA Federal)
Date: Wed, 6 Jun 2018 09:04:12 -0700
Subject: [Python-ideas] Making Path() a built in.
In-Reply-To: 
References: <598EB1E5-AD1E-47C6-ADBC-A09036CFFA11 at python.org>
 <20180605234227.GX12683 at ando.pearwood.info>
 <20180606065331.GZ12683 at ando.pearwood.info>
Message-ID: 

>>> For the startup time, you could keep it around as builtin but save the
>>> import time until someone actually uses it.
>>
>> That would mean creating a system of lazy imports, which is an
>> entirely separate proposal.
>
> It's that complicated? I know it's not exactly properties on a class,
> but I thought there were other cases, even if I couldn't name one.
> Don't mind me, then.

It wouldn't be THAT hard to write lazy-import code for pathlib. But
there has been a lot of discussion lately about Python startup time.
One approach is to create a lazy-import system that could be generally
used to help startup time. So I expect that an expensive-to-import
builtin will not get added unless that problem is generically solved.

And as for Steven's other points:

There has been a fair bit of discussion here and on Python-dev about
pathlib. The fact is that it is still not ready to be a full-featured
replacement for os.path, etc. And a number of core devs aren't all that
interested in it becoming the "one obvious way". So I think we are
nowhere near it becoming a built in.

But if you like it, you can help the efforts to make it even more
useful, which would be good in itself, but is also the Path (pun
intended) to making it the "one obvious way". If it's useful enough,
people will use it, even if they have to import it.

There was a recent thread about adding functionality to the Path object
that seems to have petered out; maybe contribute to that effort?

One more point: A major step in making pathlib useful was adding the
__fspath__ protocol, and then adding support for it in most (all) of the
standard library. Another step would be to make any paths in the stdlib
(such as __file__) Path objects (as suggested in this thread) but that
would bring up the startup costs problem. I wonder if a Path-lite with
the core functionality, but less startup cost, would be useful here?

-CHB

From Eloi.Gaudry at fft.be Wed Jun 6 04:34:13 2018
From: Eloi.Gaudry at fft.be (Eloi Gaudry)
Date: Wed, 6 Jun 2018 08:34:13 +0000
Subject: [Python-ideas] Runtime assertion with no overhead when not active
In-Reply-To: 
References: <64D2E0DB-8176-4968-99F4-1403261CEB42 at barrys-emacs.org>,
Message-ID: 

The check is made against a boolean value in the C extension; I don't
think that it offers a significant speed-up over the pure Python code,
but it offers a simpler (reduced, global) assertion syntax though.

________________________________
From: Python-ideas on behalf of Kyle Lahnakoski
Sent: Tuesday, June 5, 2018 7:25:45 PM
To: python-ideas at python.org
Subject: Re: [Python-ideas] Runtime assertion with no overhead when not active

I currently use the form <flag> and log_function(<args>) where <flag>
is some module variable, usually "DEBUG". I do this because it is one
line, and it ensures the log_function parameters are not evaluated.

*IF* runtime assertions had a switch so they have no overhead when not
active, how much faster can it get? How expensive is the check?
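(A minimal runnable sketch of that pattern -- the flag and function
names here are illustrative, assuming a module-level DEBUG switch:)

DEBUG = False

def log_function(msg):
    print("LOG:", msg)

def expensive_message():
    # Stands in for a message expression that is costly to build.
    return ", ".join(str(n) for n in range(10000))

# Because "and" short-circuits, expensive_message() is never
# evaluated while DEBUG is false.
DEBUG and log_function(expensive_message())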
On 2018-05-10 03:55, Barry Scott wrote:

My logging example would be

log( control_flag, msg_expr )

expanding to:

if <control_flag>: log_function( <msg_expr> )

Barry

This idea requires the same sort of machinery in Python that I was
hoping for to implement the short-circuit logging.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From barry at barrys-emacs.org Wed Jun 6 14:05:35 2018
From: barry at barrys-emacs.org (Barry Scott)
Date: Wed, 6 Jun 2018 19:05:35 +0100
Subject: [Python-ideas] Making Path() a built in.
In-Reply-To: 
References: <598EB1E5-AD1E-47C6-ADBC-A09036CFFA11 at python.org>
 <20180605234227.GX12683 at ando.pearwood.info>
 <20180606065331.GZ12683 at ando.pearwood.info>
Message-ID: <31883CB4-1F13-4208-8A31-07F59241B704 at barrys-emacs.org>

I assume the idea is that everybody has Path available without the need
to do the import dance first.

If it's for personal convenience you can always do this trick, which is
used by gettext to make _ a builtin.

import pathlib
import builtins
builtins.__dict__['Path'] = pathlib.Path

Now Path *is* a builtin for the rest of the code.

Barry

From paal.drange at gmail.com Thu Jun 7 07:34:58 2018
From: paal.drange at gmail.com (=?UTF-8?B?UMOlbCBHcsO4bsOlcyBEcmFuZ2U=?=)
Date: Thu, 7 Jun 2018 13:34:58 +0200
Subject: [Python-ideas] datetime.timedelta literals
In-Reply-To: 
References: 
Message-ID: 

For closure, I've added a package, timeliterals

(env) [pgdr at hostname ~]$ pip install timeliterals
(env) [pgdr at hostname ~]$ python
>>> from timeliterals import *
>>> 3*hours
datetime.timedelta(0, 10800)
>>> 3*minutes
datetime.timedelta(0, 180)
>>> 3*seconds
datetime.timedelta(0, 3)

The source code is at https://github.com/pgdr/timeliterals

I'm not going to submit a patch to datetime at this time, but I will if
people would be interested.

- Pål

On 5 Jun 2018 13:56, "Jacco van Dorp" wrote:

> i'd also be pretty simple to implement....
>
> Just list:
> minute = timedelta(minutes=1)
> hour = timedelta(hours=1)
> etc...
>
> and you could import and use them like that. Or if you really want to
> write 5*m, the just from datetime import minute as m
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From robertvandeneynde at hotmail.com Thu Jun 7 08:33:29 2018
From: robertvandeneynde at hotmail.com (Robert Vanden Eynde)
Date: Thu, 7 Jun 2018 12:33:29 +0000
Subject: [Python-ideas] Trigonometry in degrees
Message-ID: 

I suggest adding degrees versions of the trigonometric functions in the
math module.

- Useful in teaching and replacing calculators by Python; importing
something is seen by the young students as much easier than defining a
function.

- Special values could be treated, aka when the angle is a multiple of
90; young students are often surprised to see that cos(pi/2) != 0

Testing for a special value isn't very costly (x % 90 == 0) but it could
be pointed out that there is a small overhead using the "degrees"
equivalent of a trig function because of the radians-to-degrees
conversion and the special-values testing.

- Standard names will be chosen so that everyone will use the same name
convention. I suggest adding a "d" like sind, cosd, tand, acosd, asind,
atand, atan2d.

Another option would be to add "deg" or prepend "d" or "deg", however
the name should be short.

sind, dsin, sindeg or degsin ?
We can look in other languages what they chose.

Creating a new package like 'from math.degrees import cos' however I
would not recommend that because "cos" in the source code would mean to
lookup the import to know if it's in degrees or radians (and that leads
to very filthy bugs). Also "degrees" is already taken, so the name of
the package would have to change.

- Also in the cmath module. Even though the radians make more sense in
the complex plane. The same functions sin cos tan, asin acos atan, along
with phase and polar.

Here's my current implementation:

def cosd(x):
    if x % 90 == 0:
        return (1, 0, -1, 0)[int(x // 90) % 4]
    else:
        return cos(radians(x))

def sind(x):
    if x % 90 == 0:
        return (0, 1, 0, -1)[int(x // 90) % 4]
    else:
        return sin(radians(x))

def tand(x):
    if x % 90 == 0:
        return (0, float('inf'), 0, float('-inf'))[int(x // 90) % 4]
    else:
        return tan(radians(x))

The infinity being positive or negative is debatable, however; here I've
chosen the convention lim tan(x) as x approaches ±90° from 0

def acosd(x):
    if x == 1: return 0
    if x == 0: return 90
    if x == -1: return 180
    return degrees(acos(x))

def asind(x):
    if x == 1: return 90
    if x == 0: return 0
    if x == -1: return -90
    return degrees(asin(x))

However, currently [degrees(acos(x)) for x in (1,0,-1)] == [0, 90, 180]
on my machine so maybe the test isn't necessary.

Testing for special values of abs(x) == 0.5 could be an idea but I don't
think the overhead is worth the effort.

Probably this has already been discussed but I don't know how to check
that.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From rymg19 at gmail.com Thu Jun 7 16:21:56 2018
From: rymg19 at gmail.com (Ryan Gonzalez)
Date: Thu, 07 Jun 2018 15:21:56 -0500
Subject: [Python-ideas] Trigonometry in degrees
In-Reply-To: 
References: 
Message-ID: <163dbe9b2a0.27a3.db5b03704c129196a4e9415e55413ce6 at gmail.com>

You could always do e.g. math.sin(math.degress(radians)) and so forth...

On June 7, 2018 3:07:21 PM Robert Vanden Eynde wrote:

> I suggest adding degrees versions of the trigonometric functions in the
> math module.
>
> - Useful in teaching and replacing calculators by Python; importing
> something is seen by the young students as much easier than defining a
> function.
>
> - Special values could be treated, aka when the angle is a multiple of
> 90; young students are often surprised to see that cos(pi/2) != 0
>
> Testing for a special value isn't very costly (x % 90 == 0) but it could
> be pointed out that there is a small overhead using the "degrees"
> equivalent of a trig function because of the radians-to-degrees
> conversion and the special-values testing.
>
> - Standard names will be chosen so that everyone will use the same name
> convention. I suggest adding a "d" like sind, cosd, tand, acosd, asind,
> atand, atan2d.
>
> Another option would be to add "deg" or prepend "d" or "deg", however
> the name should be short.
>
> sind, dsin, sindeg or degsin ?
>
> We can look in other languages what they chose.
>
> Creating a new package like 'from math.degrees import cos' however I
> would not recommend that because "cos" in the source code would mean to
> lookup the import to know if it's in degrees or radians (and that leads
> to very filthy bugs). Also "degrees" is already taken, so the name of
> the package would have to change.
>
> - Also in the cmath module. Even though the radians make more sense in
> the complex plane. The same functions sin cos tan, asin acos atan, along
> with phase and polar.
> Here's my current implementation:
>
> def cosd(x):
>     if x % 90 == 0:
>         return (1, 0, -1, 0)[int(x // 90) % 4]
>     else:
>         return cos(radians(x))
>
> def sind(x):
>     if x % 90 == 0:
>         return (0, 1, 0, -1)[int(x // 90) % 4]
>     else:
>         return sin(radians(x))
>
> def tand(x):
>     if x % 90 == 0:
>         return (0, float('inf'), 0, float('-inf'))[int(x // 90) % 4]
>     else:
>         return tan(radians(x))
>
> The infinity being positive or negative is debatable, however; here I've
> chosen the convention lim tan(x) as x approaches ±90° from 0
>
> def acosd(x):
>     if x == 1: return 0
>     if x == 0: return 90
>     if x == -1: return 180
>     return degrees(acos(x))
>
> def asind(x):
>     if x == 1: return 90
>     if x == 0: return 0
>     if x == -1: return -90
>     return degrees(asin(x))
>
> However, currently [degrees(acos(x)) for x in (1,0,-1)] == [0, 90, 180]
> on my machine so maybe the test isn't necessary.
>
> Testing for special values of abs(x) == 0.5 could be an idea but I don't
> think the overhead is worth the effort.
>
> Probably this has already been discussed but I don't know how to check
> that.
>
> ----------
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From rspeer at luminoso.com Thu Jun 7 16:45:42 2018
From: rspeer at luminoso.com (Rob Speer)
Date: Thu, 7 Jun 2018 16:45:42 -0400
Subject: [Python-ideas] Trigonometry in degrees
In-Reply-To: <163dbe9b2a0.27a3.db5b03704c129196a4e9415e55413ce6 at gmail.com>
References: <163dbe9b2a0.27a3.db5b03704c129196a4e9415e55413ce6 at gmail.com>
Message-ID: 

You meant math.radians(degrees), and Robert already mentioned the
problem with this:

>>> math.cos(math.radians(90))
6.123233995736766e-17

On Thu, 7 Jun 2018 at 16:22 Ryan Gonzalez wrote:

> You could always do e.g. math.sin(math.degress(radians)) and so forth...
>
> On June 7, 2018 3:07:21 PM Robert Vanden Eynde <
> robertvandeneynde at hotmail.com> wrote:
>
>> I suggest adding degrees versions of the trigonometric functions in the
>> math module.
>>
>> - Useful in teaching and replacing calculators by Python; importing
>> something is seen by the young students as much easier than defining a
>> function.
>>
>> - Special values could be treated, aka when the angle is a multiple of
>> 90; young students are often surprised to see that cos(pi/2) != 0
>>
>> Testing for a special value isn't very costly (x % 90 == 0) but it could
>> be pointed out that there is a small overhead using the "degrees"
>> equivalent of a trig function because of the radians-to-degrees
>> conversion and the special-values testing.
>>
>> - Standard names will be chosen so that everyone will use the same name
>> convention. I suggest adding a "d" like sind, cosd, tand, acosd, asind,
>> atand, atan2d.
>>
>> Another option would be to add "deg" or prepend "d" or "deg", however
>> the name should be short.
>>
>> sind, dsin, sindeg or degsin ?
>>
>> We can look in other languages what they chose.
>>
>> Creating a new package like 'from math.degrees import cos' however I
>> would not recommend that because "cos" in the source code would mean to
>> lookup the import to know if it's in degrees or radians (and that leads
>> to very filthy bugs). Also "degrees" is already taken, so the name of
>> the package would have to change.
>>
>> - Also in the cmath module.
>> Even though the radians make more sense in the complex plane. The same
>> functions sin cos tan, asin acos atan, along with phase and polar.
>>
>> Here's my current implementation:
>>
>> def cosd(x):
>>     if x % 90 == 0:
>>         return (1, 0, -1, 0)[int(x // 90) % 4]
>>     else:
>>         return cos(radians(x))
>>
>> def sind(x):
>>     if x % 90 == 0:
>>         return (0, 1, 0, -1)[int(x // 90) % 4]
>>     else:
>>         return sin(radians(x))
>>
>> def tand(x):
>>     if x % 90 == 0:
>>         return (0, float('inf'), 0, float('-inf'))[int(x // 90) % 4]
>>     else:
>>         return tan(radians(x))
>>
>> The infinity being positive or negative is debatable, however; here I've
>> chosen the convention lim tan(x) as x approaches ±90° from 0
>>
>> def acosd(x):
>>     if x == 1: return 0
>>     if x == 0: return 90
>>     if x == -1: return 180
>>     return degrees(acos(x))
>>
>> def asind(x):
>>     if x == 1: return 90
>>     if x == 0: return 0
>>     if x == -1: return -90
>>     return degrees(asin(x))
>>
>> However, currently [degrees(acos(x)) for x in (1,0,-1)] == [0, 90, 180]
>> on my machine so maybe the test isn't necessary.
>>
>> Testing for special values of abs(x) == 0.5 could be an idea but I don't
>> think the overhead is worth the effort.
>>
>> Probably this has already been discussed but I don't know how to check
>> that.
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> https://mail.python.org/mailman/listinfo/python-ideas
>> Code of Conduct: http://python.org/psf/codeofconduct/
>>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ubershmekel at gmail.com Thu Jun 7 17:01:52 2018
From: ubershmekel at gmail.com (Yuval Greenfield)
Date: Thu, 7 Jun 2018 14:01:52 -0700
Subject: [Python-ideas] Trigonometry in degrees
In-Reply-To: 
References: 
Message-ID: 

On Thu, Jun 7, 2018 at 1:07 PM Robert Vanden Eynde <
robertvandeneynde at hotmail.com> wrote:

> I suggest adding degrees versions of the trigonometric functions in the
> math module.
>
You can create a pypi package that suits your needs. If it becomes
popular it could be considered for inclusion in the standard library.

Would that work for you?

Yuval
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From hugo.fisher at gmail.com Thu Jun 7 18:17:02 2018
From: hugo.fisher at gmail.com (Hugh Fisher)
Date: Fri, 8 Jun 2018 08:17:02 +1000
Subject: [Python-ideas] Trigonometry in degrees
Message-ID: 

> Date: Thu, 7 Jun 2018 12:33:29 +0000
> From: Robert Vanden Eynde
> To: python-ideas
> Subject: [Python-ideas] Trigonometry in degrees
> Message-ID:
>
> I suggest adding degrees versions of the trigonometric functions in the
> math module.
>
> - Useful in teaching and replacing calculators by Python; importing
> something is seen by the young students as much easier than defining a
> function.

I agree that degrees are useful for teaching. They are also very useful
for graphics programming, especially with my favourite OpenGL API. But I
think that the use of radians in programming language APIs is more
prevalent, so the initial advantage of easy learning will be outweighed
by the long term inconvenience of adjusting to what everyone else is
doing.

Writing degrees(x) and radians(x) is a little inconvenient, but it does
make it clear what units are being used. And even if your proposal is
adopted, there is still going to be a lot of code around that uses the
older math routines. With the current API it is at least safe to assume
that angles are radians unless stated otherwise.
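(For example, a small sketch using only the existing math module -- the
explicit conversions keep the units visible at each call site:)

import math

angle_deg = 60.0
height = math.sin(math.radians(angle_deg))    # degrees in, radians passed to sin()
angle_back = math.degrees(math.asin(height))  # radians out, converted back to degrees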
> - Special values could be treated, aka when the angle is a multiple of
> 90; young students are often surprised to see that cos(pi/2) != 0
>
> Testing for a special value isn't very costly (x % 90 == 0) but it could
> be pointed out that there is a small overhead using the "degrees"
> equivalent of a trig function because of the radians-to-degrees
> conversion and the special-values testing.

Not just young students :-) I agree with this, but I would prefer the
check to be in the implementation of the existing functions as well. Any
sin/cos very close to 0 becomes 0, any close to 1 becomes 1.

> - Standard names will be chosen so that everyone will use the same name
> convention. I suggest adding a "d" like sind, cosd, tand, acosd, asind,
> atand, atan2d.

Not "d". In the OpenGL 3D API, and many 3D languages/APIs since,
appending "d" means "double precision". It's even sort of implied by the
C math library which has sinf and friends for single precision.

> Creating a new package like 'from math.degrees import cos' however I
> would not recommend that because "cos" in the source code would mean to
> lookup the import to know if it's in degrees or radians (and that leads
> to very filthy bugs). Also "degrees" is already taken, so the name of
> the package would have to change.

Agree, not a good idea.

-- 

cheers,
Hugh Fisher
_______________________________________________
Python-ideas mailing list
Python-ideas at python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

From robertve92 at gmail.com Thu Jun 7 19:08:52 2018
From: robertve92 at gmail.com (Robert Vanden Eynde)
Date: Fri, 8 Jun 2018 01:08:52 +0200
Subject: [Python-ideas] Trigonometry in degrees
In-Reply-To: 
References: 
Message-ID: 

- I didn't know there was sinf in C (that's new since C99); I was aware
of the 'd' postfix in OpenGL.

So yeah, sind would be a bad idea, but sindeg or degsin would be too
long, hmm, and I can settle for the pre- or postfix. sindeg(90) and
degsin(90) are both pretty, the first emphasizes the "degree" part and
the second the "sin(90)" part. I feel I prefer sindeg, cosdeg, atandeg,
atan2deg, phasedeg, rectdeg hmhm
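(A sketch of what the sindeg/cosdeg naming would amount to as thin
wrappers -- these do not exist in the math module today, and this
version skips the special-value handling discussed earlier:)

import math

def sindeg(x):
    return math.sin(math.radians(x))

def cosdeg(x):
    return math.cos(math.radians(x))

def atan2deg(y, x):
    return math.degrees(math.atan2(y, x))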
- Integers are more convenient than float, you could add 1 degree every frame at 60fps to a counter and after 60 frames you'll do a full turn, adding tau/360 doesn't add so well (floating point representation). Having exact representation for multiple of 90 degrees is a big plus. Another advantage is also being able to check if the angle is particular (multiple of 30 or 90 for example). Especially python Integers with infinite precision. - Everyone knows degrees, whereas radians are known only by people that had math in 10th grade. I know it's easy to say "just convert" but trust me, not everyone is confident with unit conversions, when you ask "what's the unit of angle ?" people will think of degrees. - Changing the behavior for current cos/sin function to have cos(pi/2) being exact is a bad idea in my opinion, the good old sin/cos from C exist for a long time and people rely on the behaviors. That would break too much existing code for no much trouble. And would slow Current applications relying on the efficiency of the C functions. - I totally agree writing degrees(...) and radians(...) make it clear and explicit. That's why I strongly discourage people defining their own "sin" function that'd take degrees, therefore I look for a new function name (sindeg). Le ven. 8 juin 2018 ? 00:17, Hugh Fisher a ?crit : > > Date: Thu, 7 Jun 2018 12:33:29 +0000 > > From: Robert Vanden Eynde > > To: python-ideas > > Subject: [Python-ideas] Trigonometry in degrees > > Message-ID: > > > > > I suggest adding degrees version of the trigonometric functions in the > math module. > > > > - Useful in Teaching and replacing calculators by python, importing > something is seen by the young students much more easy than to define a > function. > > I agree that degrees are useful for teaching. They are also very > useful for graphics > programming, especially with my favourite OpenGL API. But I think that > the use of > radians in programming language APIs is more prevalent, so the initial > advantage > of easy learning will be outweighed by the long term inconvenience of > adjusting to > what everyone else is doing. > > Writing degrees(x) and radians(x) is a little inconvenient, but it > does make it clear > what units are being used. And even if your proposal is adopted, there > is still going > to be a lot of code around that uses the older math routines. With the > current API > it is a least safe to assume that angles are radians unless stated > otherwise. > > > - Special values could be treated, aka when the angle is a multiple of > 90, young students are often surprise to see that cos(pi/2) != 0 > > > > Testing for a special value Isn't very costly (x % 90 == 0) but it could > be pointed out that there is a small overhead using the "degrees" > equivalent of trig function because of the radians to degrees conversion > And the special values testing. > > Not just young students :-) I agree with this, but I would prefer the > check to be in > the implementation of the existing functions as well. Any sin/cos very > close to 0 > becomes 0, any close to 1 becomes 1. > > > - Standard names will be chosen so that everyone will use the same name > convention. I suggest adding a "d" like sind, cosd, tand, acosd, asind, > atand, atan2d. > > Not "d". In the OpenGL 3D API, and many 3D languages/APIs since, appending > "d" > means "double precision". It's even sort of implied by the C math > library which has > sinf and friends for single precision. 
> >
> > Creating a new package like 'from math.degrees import cos' however I
> > would not recommend that because "cos" in the source code would mean to
> > lookup the import to know if it's in degrees or radians (and that leads
> > to very filthy bugs). Also "degrees" is already taken, so the name of
> > the package would have to change.
>
> Agree, not a good idea.
>
> --
>
> cheers,
> Hugh Fisher
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From Richard at Damon-Family.org Thu Jun 7 22:39:06 2018
From: Richard at Damon-Family.org (Richard Damon)
Date: Thu, 7 Jun 2018 22:39:06 -0400
Subject: [Python-ideas] Trigonometry in degrees
In-Reply-To: 
References: 
Message-ID: 

On 6/7/18 7:08 PM, Robert Vanden Eynde wrote:
> - I didn't know there was sinf in C (that's new since C99); I was aware
> of the 'd' postfix in OpenGL.
>
> So yeah, sind would be a bad idea, but sindeg or degsin would be too
> long, hmm, and I can settle for the pre- or postfix. sindeg(90) and
> degsin(90) are both pretty, the first emphasizes the "degree" part and
> the second the "sin(90)" part. I feel I prefer sindeg, cosdeg, atandeg,
> atan2deg, phasedeg, rectdeg hmhm
>
> By the way I've seen a stackoverflow answer using Sin and Cos with a
> capital letter, doesn't seem very explicit to me.
>
> - I could do a pypi for it for sure, I didn't know it was that easy to
> create a repo actually. degreesmath (and degreesmath.cmath?) would be
> a good package name but again I don't want to name the functions sin,
> cos. People could rename them on import anyway (let the fools be fools
> as long as they don't hurt anyone).
>
> - I agree radians should be the default, but is it especially because
> sin/cos must be in radians? And because it's more efficient? The
> problem arises when mixing units in the same program.
>
> However, should everyone use m³ and not liters because they're the SI
> units? That's more a problem of "mixing units and not sticking to one
> convention". I've seen a lot of libraries using degrees (and not just
> good old glRotate).
>
> Let's notice there are some libraries that wrap units so that one can
> mix them safely (and avoid adding time and distance units).
>
> Let's be honest, radians are useful only when converting arc length,
> areas or dealing with derivatives, signals, or complex numbers
> (engineering stuff), and the efficiency of sin/cos implementations. When
> doing simple 2D/3D applications, angles are just angles and nobody
> needs to know that the derivative of sin(ax) is a·cos(ax) if x is in
> radians.
>
> - Integers are more convenient than floats: you could add 1 degree
> every frame at 60fps to a counter and after 60 frames you'll do a full
> turn; adding tau/360 doesn't add up so well (floating point
> representation). Having an exact representation for multiples of 90
> degrees is a big plus. Another advantage is also being able to check
> if the angle is particular (a multiple of 30 or 90 for example).
> Especially Python integers, with infinite precision.
>
> - Everyone knows degrees, whereas radians are known only by people
> that had math in 10th grade. I know it's easy to say "just convert"
> but trust me, not everyone is confident with unit conversions; when
> you ask "what's the unit of angle?" people will think of degrees.
> - Changing the behavior of the current cos/sin functions to have
> cos(pi/2) be exact is a bad idea in my opinion; the good old sin/cos
> from C have existed for a long time and people rely on their behavior.
> That would break too much existing code for not much gain, and would
> slow current applications relying on the efficiency of the C functions.
>
> - I totally agree writing degrees(...) and radians(...) makes it clear
> and explicit. That's why I strongly discourage people from defining
> their own "sin" function that'd take degrees; therefore I look for a new
> function name (sindeg).

First I feel the need to point out that radians are actually fairly
fundamental in trigonometry, so there are good reasons for the base
functions to be based on radians. The fact that the arc length of the
angle on the unit circle is the angle in radians actually turns out to
be a fairly basic property.

To make it so that sindeg/cosdeg of multiples of 90 come out exact is
probably easiest to do by doing the angle reduction in degrees (so the
nice precise angles stay as nice precise angles) and then either adjust
the final computation formulas for degrees, or convert the angle to
radians and let the fundamental routine do the small angle computation.

While we are at it, it might be worth thinking if it might make sense to
also define a set of functions using circles as a unit (90 degrees =
0.25, one whole revolution = 1)

-- 
Richard Damon

From greg.ewing at canterbury.ac.nz Fri Jun 8 01:26:28 2018
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 08 Jun 2018 17:26:28 +1200
Subject: [Python-ideas] Trigonometry in degrees
In-Reply-To: 
References: 
Message-ID: <5B1A1384.4020905 at canterbury.ac.nz>

Richard Damon wrote:
> First I feel the need to point out that radians are actually fairly
> fundamental in trigonometry,

Even more so in calculus, since the derivative of sin(x) is cos(x) if
and only if x is in radians.

-- 
Greg

From turnbull.stephen.fw at u.tsukuba.ac.jp Fri Jun 8 01:37:33 2018
From: turnbull.stephen.fw at u.tsukuba.ac.jp (Stephen J. Turnbull)
Date: Fri, 8 Jun 2018 14:37:33 +0900
Subject: [Python-ideas] Trigonometry in degrees
In-Reply-To: 
References: 
Message-ID: <23322.5661.552082.91600 at turnbull.sk.tsukuba.ac.jp>

Richard Damon writes:

> To make it so that sindeg/cosdeg of multiples of 90 come out exact is
> probably easiest to do by doing the angle reduction in degrees (so the
> nice precise angles stay as nice precise angles) and then either adjust
> the final computation formulas for degrees, or convert the angle to
> radians and let the fundamental routine do the small angle
> computation.
> While we are at it, it might be worth thinking if it might make sense to > also define a set of functions using circles as a unit (90 degrees = > 0.25, one whole revolution = 1) While 1/4 is no problem, 1/6 is not exactly representable as a binary floating point number, and that's kind of an important angle for high school trigonometry (which is presumably what we're talking about here -- a symbolic math program would not represent Pi by a floating point number, but rather as a symbol with special properties as an argument to a trigonometric function!) My bias is that people who want to program this kind of thing just need to learn about floating point numbers and be aware that they're going to have to accept that >>> from math import cos, radians >>> cos(radians(90)) 6.123233995736766e-17 >>> is good enough for government work, including at the local public high school. Of course, I admit that's a bias, not a scientific fact. :-) -- Associate Professor Division of Policy and Planning Science http://turnbull.sk.tsukuba.ac.jp/ Faculty of Systems and Information Email: turnbull at sk.tsukuba.ac.jp University of Tsukuba Tel: 029-853-5175 Tennodai 1-1-1, Tsukuba 305-8573 JAPAN From ubershmekel at gmail.com Fri Jun 8 01:44:16 2018 From: ubershmekel at gmail.com (Yuval Greenfield) Date: Thu, 7 Jun 2018 22:44:16 -0700 Subject: [Python-ideas] Trigonometry in degrees In-Reply-To: <23322.5661.552082.91600@turnbull.sk.tsukuba.ac.jp> References: <23322.5661.552082.91600@turnbull.sk.tsukuba.ac.jp> Message-ID: On Thu, Jun 7, 2018 at 10:38 PM Stephen J. Turnbull < turnbull.stephen.fw at u.tsukuba.ac.jp> wrote: > > 6.123233995736766e-17 > >>> > > is good enough for government work, including at the local public high > school. > > There probably is room for a library like "fractions" that represents multiples of pi or degrees precisely. I'm not sure how complicated or valuable of an endeavor that would be. But while I agree that floating point is good enough, we probably can do better. -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Fri Jun 8 01:45:31 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 8 Jun 2018 15:45:31 +1000 Subject: [Python-ideas] Trigonometry in degrees In-Reply-To: References: Message-ID: <20180608054530.GC12683@ando.pearwood.info> On Fri, Jun 08, 2018 at 08:17:02AM +1000, Hugh Fisher wrote: > But I think that the use of > radians in programming language APIs is more prevalent, so the initial advantage > of easy learning will be outweighed by the long term inconvenience of > adjusting to what everyone else is doing. But why would you need to? If we had a degrees API, why wouldn't people just use it? > Writing degrees(x) and radians(x) is a little inconvenient, but it > does make it clear what units are being used. I never know what conversion function to use. I always expect something like deg2rad and rad2deg. I never remember whether degrees(x) expects an angle in degrees or returns an angle in degrees. So I would disagree that it is clear. > And even if your proposal is adopted, there is still going > to be a lot of code around that uses the older math routines. Why would that be a problem? When the iterator protocol was introduced, that didn't require lists and sequences to be removed. > Not just young students :-) I agree with this, but I would prefer the > check to be in the implementation of the existing functions as well. > Any sin/cos very close to 0 becomes 0, any close to 1 becomes 1. Heavens no! 
That's a terrible idea -- that means that functions which *ought* to
return 0.9999987 (say) will suddenly become horribly inaccurate and
return 1.

The existing trig functions are as close to accurate as is practical to
expect with floating point maths. (Although some platforms' maths
libraries are less accurate than others.) We shouldn't make them *less*
accurate just because some people don't care for more than three
decimal places.

> > - Standard names will be chosen so that everyone will use the same
> > name convention. I suggest adding a "d" like sind, cosd, tand,
> > acosd, asind, atand, atan2d.
>
> Not "d". In the OpenGL 3D API, and many 3D languages/APIs since, appending "d"
> means "double precision".

Python floats are already double precision. What advantage is there in
reserving a prefix/suffix because some utterly unrelated framework in
another language uses it for a completely different purpose?

Like mathematicians, we use the "h" suffix for hyperbolic sin, cos and
tan; should we have done something different because C uses ".h" for
header files, or because the struct module uses "h" as the format code
for short ints?

Julia provides a full set of trigonometric functions in both radians
and degrees:

https://docs.julialang.org/en/release-0.4/manual/mathematical-operations/#trigonometric-and-hyperbolic-functions

They use sind, cosd, tand etc for the variants expecting degrees. I
think that's much more relevant than OpenGL.

Although personally I prefer the look of d as a prefix:

dsin, dcos, dtan

That's more obviously pronounced "d(egrees) sin" etc rather than
"sined" "tanned" etc.

-- 
Steve

From rosuav at gmail.com Fri Jun 8 01:55:34 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Fri, 8 Jun 2018 15:55:34 +1000
Subject: [Python-ideas] Trigonometry in degrees
In-Reply-To: <20180608054530.GC12683 at ando.pearwood.info>
References: <20180608054530.GC12683 at ando.pearwood.info>
Message-ID: 

On Fri, Jun 8, 2018 at 3:45 PM, Steven D'Aprano wrote:
> Although personally I prefer the look of d as a prefix:
>
> dsin, dcos, dtan
>
> That's more obviously pronounced "d(egrees) sin" etc rather than "sined"
> "tanned" etc.

Having it as a suffix does have one advantage. The math module would
need a hyperbolic sine function which accepts an argument in degrees;
and then, like Charles Napier [1], Python would finally be able to say
"I have sindh".

ChrisA

[1] Apocryphally, alas.

From steve at pearwood.info Fri Jun 8 02:11:39 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Fri, 8 Jun 2018 16:11:39 +1000
Subject: [Python-ideas] Trigonometry in degrees
In-Reply-To: 
References: 
Message-ID: <20180608061139.GD12683 at ando.pearwood.info>

On Thu, Jun 07, 2018 at 10:39:06PM -0400, Richard Damon wrote:

> First I feel the need to point out that radians are actually fairly
> fundamental in trigonometry, so there are good reasons for the base
> functions to be based on radians. The fact that the arc length of the
> angle on the unit circle is the angle in radians actually turns out to
> be a fairly basic property.

People managed to use trigonometry for *literally* millennia before
radians were invented and named by James Thomson in 1873. Just because
they are, *in some sense*, mathematically fundamental doesn't mean we
ought to be using them for measurements.

We don't write large numbers using powers of e instead of powers of 10,
just because exponentiation to base e is in some sense more fundamental
than other powers.
Even the fact that we talk about sine, cosine and tangent as distinct
functions is mathematically unnecessary, since both cosine and tangent
can be expressed in terms of sine.

> While we are at it, it might be worth thinking if it might make sense to
> also define a set of functions using circles as a unit (90 degrees =
> 0.25, one whole revolution = 1)

Hardly anyone still uses grads, and even fewer people use revolutions
as the unit of angles. But if you did need revolutions, conveniently
many simple fractions of a revolution come out to be whole numbers of
degrees, thanks to 360 having lots of factors. All of these fractions
of a revolution are exact whole numbers of degrees:

1/2, 1/3, 1/4, 1/5, 1/6, 1/8, 1/9, 1/10, 1/12, 1/15, 1/18

so I don't believe we need a third set of trig functions for
revolutions.

-- 
Steve

From steve at pearwood.info Fri Jun 8 02:22:07 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Fri, 8 Jun 2018 16:22:07 +1000
Subject: [Python-ideas] Trigonometry in degrees
In-Reply-To: <23322.5661.552082.91600 at turnbull.sk.tsukuba.ac.jp>
References: <23322.5661.552082.91600 at turnbull.sk.tsukuba.ac.jp>
Message-ID: <20180608062206.GE12683 at ando.pearwood.info>

On Fri, Jun 08, 2018 at 02:37:33PM +0900, Stephen J. Turnbull wrote:

> My bias is that people who want to program this kind of thing just
> need to learn about floating point numbers and be aware that they're
> going to have to accept that
>
> >>> from math import cos, radians
> >>> cos(radians(90))
> 6.123233995736766e-17
> >>>
>
> is good enough for government work, including at the local public high
> school.

In Australia, most secondary schools recommend or require CAS
calculators from about Year 10, sometimes even from Year 9. Most (all?)
state curricula for Year 11 and 12 mandate CAS calculators. Even
old-school scientific calculators without the fancy CAS symbolic maths
are capable of having cos(90) return zero in degree mode.

It is quite common for high school students to expect cos(90°) to come
out as exactly zero. And why not? It's the 21st century, not 1972 when
four-function calculators were considered advanced technology :-)

To my mind, the question is not "should we have trig functions that
take angles in degrees" -- that's a no-brainer, of course we should.
The only questions in my mind are whether or not such a library is (1)
appropriate for the stdlib and (2) ready for the stdlib.

-- 
Steve

From greg.ewing at canterbury.ac.nz Fri Jun 8 02:34:25 2018
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 08 Jun 2018 18:34:25 +1200
Subject: [Python-ideas] Trigonometry in degrees
In-Reply-To: <23322.5661.552082.91600 at turnbull.sk.tsukuba.ac.jp>
References: <23322.5661.552082.91600 at turnbull.sk.tsukuba.ac.jp>
Message-ID: <5B1A2371.6080608 at canterbury.ac.nz>

Stephen J. Turnbull wrote:
> Since Pi is irrational, Pi/4 is too, so it
> definitely cannot be represented. Making a correction to a number
> that "looks like" Pi/4 is against this philosophy.

I'm not sure what all the fuss is about:

>>> from math import pi, sin
>>> sin(pi/2)
1.0
>>> sin(pi/2 + 2 * pi)
1.0
>>> sin(pi/2 + 4 * pi)
1.0
>>> sin(pi/2 + 8 * pi)
1.0
>>> sin(pi/2 + 16 * pi)
1.0
>>> sin(pi/2 + 32 * pi)
1.0

Seems to be more than good enough for most angle ranges that your
average schoolkid is going to be plugging into it.
In fact you have to go quite a long way before expectations start to
break down:

>>> sin(pi/2 + 10000000 * pi)
1.0
>>> sin(pi/2 + 100000000 * pi)
0.9999999999999984

-- 
Greg

From greg.ewing at canterbury.ac.nz Fri Jun 8 02:40:48 2018
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 08 Jun 2018 18:40:48 +1200
Subject: [Python-ideas] Trigonometry in degrees
In-Reply-To: 
References: <20180608054530.GC12683 at ando.pearwood.info>
Message-ID: <5B1A24F0.9010409 at canterbury.ac.nz>

Chris Angelico wrote:
> The math module would
> need a hyperbolic sine function which accepts an argument in degrees;

Except that the argument to hyperbolic trig functions is not an angle
in any normal sense of the word, so expressing it in degrees makes
little sense.

(However I do like the idea of a function called "tanhd" and pronounced
"tanned hide". :-)

-- 
Greg

From greg.ewing at canterbury.ac.nz Fri Jun 8 02:45:15 2018
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 08 Jun 2018 18:45:15 +1200
Subject: [Python-ideas] Trigonometry in degrees
In-Reply-To: <20180608062206.GE12683 at ando.pearwood.info>
References: <23322.5661.552082.91600 at turnbull.sk.tsukuba.ac.jp>
 <20180608062206.GE12683 at ando.pearwood.info>
Message-ID: <5B1A25FB.4090408 at canterbury.ac.nz>

Steven D'Aprano wrote:
> Even
> old-school scientific calculators without the fancy CAS symbolic maths
> are capable of having cos(90) return zero in degree mode.

FWIW, my Casio fx-100 (over 30 years old) produces exactly 1 for both
sin(90°) and sin(pi/2) for its version of pi.

-- 
Greg

From steve at pearwood.info Fri Jun 8 02:59:03 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Fri, 8 Jun 2018 16:59:03 +1000
Subject: [Python-ideas] Trigonometry in degrees
In-Reply-To: 
References: <20180608054530.GC12683 at ando.pearwood.info>
Message-ID: <20180608065903.GF12683 at ando.pearwood.info>

On Fri, Jun 08, 2018 at 03:55:34PM +1000, Chris Angelico wrote:
> On Fri, Jun 8, 2018 at 3:45 PM, Steven D'Aprano wrote:
> > Although personally I prefer the look of d as a prefix:
> >
> > dsin, dcos, dtan
> >
> > That's more obviously pronounced "d(egrees) sin" etc rather than "sined"
> > "tanned" etc.
>
> Having it as a suffix does have one advantage. The math module would
> need a hyperbolic sine function which accepts an argument in degrees;
> and then, like Charles Napier [1], Python would finally be able to say
> "I have sindh".

Ha ha, nice pun, but no, the hyperbolic trig functions never take
arguments in degrees. Or radians for that matter.
Don't ruin a good story with facts ;-)

-- Steve

From j.van.dorp at deonet.nl Fri Jun 8 03:05:45 2018
From: j.van.dorp at deonet.nl (Jacco van Dorp)
Date: Fri, 8 Jun 2018 09:05:45 +0200
Subject: [Python-ideas] Trigonometry in degrees
In-Reply-To: <5B1A24F0.9010409@canterbury.ac.nz>
References: <20180608054530.GC12683@ando.pearwood.info> <5B1A24F0.9010409@canterbury.ac.nz>
Message-ID:

Or when students get values like 1e-17 out of a function, you teach them what floating point numbers are and what gotchas they can expect. The simple version is "the value is stored as a float, and a float gets rounding errors below about 1e-16", or for the more inquisitive minds you give them nice places like https://docs.python.org/3/tutorial/floatingpoint.html .

If they're really going to use what they learn, they're going to run into it sooner or later. So having a bit of base knowledge about floats is a lot more useful than having to google "why does sin() return weird values python". At the very least, they'll instead google "float limitations", which is going to get them a lot closer to the real information a lot faster.

That said, I wouldn't be that opposed to a dedicated type to remember things about pi. Let's say:

import numbers

class pi(numbers.Number):
    """Represents numbers that are some multiple of pi"""
    def __init__(self, mul=1):
        self.multiplier = mul

    def __mul__(self, other):
        if isinstance(other, numbers.Number):
            return self.__class__(self.multiplier * other)

(similar with the other special methods)

(Please consider the idea, not the exact code. I don't even know if I spelled the numeric superclass right. Let alone making this type hashable and immutable, which it should be.)

It's probably not a good idea to use that for performance-critical parts, but for the more trivial applications, it could allow for more clarity. Also, for a lot of common angles, it'd be far easier to actually recognize special cases. You could also make its __repr__ something like f"Pi*{self.multiplier}", so you get a neat exact answer if you print it.

From drekin at gmail.com Fri Jun 8 04:53:34 2018
From: drekin at gmail.com (Adam Bartoš)
Date: Fri, 8 Jun 2018 10:53:34 +0200
Subject: [Python-ideas] Trigonometry in degrees
Message-ID:

Wouldn't sin(45 * DEG) where DEG = 2 * math.pi / 360 be better than sind(45)? This way we wouldn't have to introduce new functions. (The problem with nonexact results for nice angles is a separate issue.)

Regards, Adam Bartoš

From steve at pearwood.info Fri Jun 8 07:49:42 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Fri, 8 Jun 2018 21:49:42 +1000
Subject: [Python-ideas] Trigonometry in degrees
In-Reply-To:
References:
Message-ID: <20180608114941.GG12683@ando.pearwood.info>

On Fri, Jun 08, 2018 at 10:53:34AM +0200, Adam Bartoš wrote:
> Wouldn't sin(45 * DEG) where DEG = 2 * math.pi / 360 be better than
> sind(45)? This way we wouldn't have to introduce new functions. (The problem
> with nonexact results for nice angles is a separate issue.)

But that's not a separate issue, that's precisely one of the motives for having dedicated trig functions for degrees.
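To make that concrete, here is a minimal sketch of the kind of special-casing I have in mind. It is purely illustrative -- the name, the tiny lookup table and the fallback are mine, not a concrete API proposal -- and a real implementation would cover the full set of exact cases:

import math

# Closest-float values for a few "nice" angles only; a real version
# would handle the other special angles (45, 60, 120, ... degrees) too.
_EXACT_SIN = {0.0: 0.0, 30.0: 0.5, 90.0: 1.0, 150.0: 0.5,
              180.0: 0.0, 210.0: -0.5, 270.0: -1.0, 330.0: -0.5}

def sindeg(angle):
    angle = math.fmod(angle, 360.0)  # exact, unlike reduction modulo 2*pi
    if angle < 0.0:
        angle += 360.0
    return _EXACT_SIN.get(angle, math.sin(math.radians(angle)))

Angles that aren't in the table simply fall back to the radians path, so nothing becomes less accurate than it is today.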
sind(45) (or dsin(45), as I would prefer) could (in principle) return the closest possible float to sqrt(2)/2, which sin(45*DEG) does not do: py> DEG = 2 * math.pi / 360 py> math.sin(45*DEG) == math.sqrt(2)/2 False Likewise, we'd expect cosd(90) to return zero, not something not-quite zero: py> math.cos(90*DEG) 6.123031769111886e-17 That's how it works in Julia: julia> sind(45) == sqrt(2)/2 true julia> cosd(90) 0.0 and I'd expect no less here. If we can't do that, there probably wouldn't be much point in the exercise. -- Steve From steve at pearwood.info Fri Jun 8 08:25:13 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 8 Jun 2018 22:25:13 +1000 Subject: [Python-ideas] Trigonometry in degrees In-Reply-To: <5B1A2371.6080608@canterbury.ac.nz> References: <23322.5661.552082.91600@turnbull.sk.tsukuba.ac.jp> <5B1A2371.6080608@canterbury.ac.nz> Message-ID: <20180608122513.GH12683@ando.pearwood.info> On Fri, Jun 08, 2018 at 06:34:25PM +1200, Greg Ewing wrote: > I'm not sure what all the fuss is about: > > >>> from math import pi, sin > >>> sin(pi/2) > 1.0 Try cos(pi/2) or sin(pi/6). Or try: sin(pi/4) == sqrt(2)/2 tan(pi/4) == 1 tan(pi/3) == sqrt(3) And even tan(pi/2), which ought to be an error, but isn't. These are, of course, limitations due to the finite precision of floats. But Julia gets the equivalent degree-based calculations all correct, except for tan(90) where it returns Inf (NAN would be better, as the limit from below and the limit from above are different). -- Steve From hugo.fisher at gmail.com Fri Jun 8 09:19:00 2018 From: hugo.fisher at gmail.com (Hugh Fisher) Date: Fri, 8 Jun 2018 23:19:00 +1000 Subject: [Python-ideas] Trigonometry in degrees Message-ID: > Date: Fri, 8 Jun 2018 15:45:31 +1000 > From: Steven D'Aprano > To: python-ideas at python.org > Subject: Re: [Python-ideas] Trigonometry in degrees > Message-ID: <20180608054530.GC12683 at ando.pearwood.info> > Content-Type: text/plain; charset=us-ascii > > On Fri, Jun 08, 2018 at 08:17:02AM +1000, Hugh Fisher wrote: > >> But I think that the use of >> radians in programming language APIs is more prevalent, so the initial advantage >> of easy learning will be outweighed by the long term inconvenience of >> adjusting to what everyone else is doing. > > But why would you need to? > > If we had a degrees API, why wouldn't people just use it? Is this going to be backported all the way to Python 2.7? More generally, there is a huge body of code in C, C++, Java, JavaScript, etc etc where angles are always passed as radians. Python programmers will almost certainly have to read, and often write, such code. If everybody else is doing something in a particular way then there is a strong case for doing the same thing. It's not as if Python programmers cannot use degrees. The built in conversion functions make it easier to do so than most other languages. > I never know what conversion function to use. I always expect something > like deg2rad and rad2deg. I never remember whether degrees(x) expects an > angle in degrees or returns an angle in degrees. > > So I would disagree that it is clear. The degrees and radian functions follow the Python idiom for converting values, eg str(x) is interpreted as converting x into a str. However the analogy breaks down because str(x) is a NOP if x is already a string, while degrees(x) can't tell whether x is already in degrees or not. Maybe rad2deg would have been better, but the current solution is good enough - and as noted above, much better than what you get in C or JavaScript. 
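For instance (my example, showing why the conversion direction has to be memorised):

>>> import math
>>> math.radians(90)           # degrees in, radians out
1.5707963267948966
>>> math.degrees(math.pi / 2)  # radians in, degrees out
90.0
>>> math.degrees(90)           # already degrees? no error, just garbage
5156.620156177409

The function cannot know the unit of its argument, so it will happily "convert" a value that is already in degrees.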
>> And even if your proposal is adopted, there is still going >> to be a lot of code around that uses the older math routines. > > Why would that be a problem? See above. Why do Python subscripts start from zero? Because most programmers expect them to. >> Not just young students :-) I agree with this, but I would prefer the >> check to be in the implementation of the existing functions as well. >> Any sin/cos very close to 0 becomes 0, any close to 1 becomes 1. > > Heavens no! That's a terrible idea -- that means that functions which > *ought to return 0.9999987 (say) will suddenly become horribly > inaccurate and return 1. > > The existing trig functions are as close to accurate as is practical to > expect with floating point maths. (Although some platform's maths > libraries are less accurate than others.) We shouldn't make them *less* > accurate just because some people don't care for more than three decimal > places. But I want them to be more accurate. I didn't make myself clear. Like you, I want cos(90 degrees) to be 0, not some small number. Other people have pointed out the problem with trying to guess the result from the argument value, so I am suggesting that the functions should instead look at the calculated result and if it is sufficiently close to 0.0 or 1.0, assume that the argument value was 90 degrees or some multiple thereof. >> Not "d". In the OpenGL 3D API, and many 3D languages/APIs since, appending "d" >> means "double precision". > > Python floats are already double precision. > > What advantage is there for reserving a prefix/suffix because some > utterly unrelated framework in another language uses it for a completely > different purpose? Well I program using the OpenGL API in Python, so there's at least one person who will find the d suffix confusing for that reason. And the d suffix is used for types and functions in OpenGL shading language, C, ARM/Intel assembler. The intersection with Python programmers may not be very large, but again it is at least 1. So no they are not utterly unrelated. > Like mathematicians, we use the "h" suffix for hyperbolic sin, cos and > tan; should we have done something different because C uses ".h" for > header files, or because the struct module uses "h" as the format code > for short ints? Is d used as a suffix by mathematicians though? The h works because the context makes it clear which sense is being used, mathematics, C, or struct module. Here we are discussing functions in the same module. Whether to use d or deg is an arbitrary choice for mathematicians (AFAIK), so either would work equally well. Since d can be confusing for others, to me that would make deg preferable. But see below where I change my mind. > Julia provides a full set of trigonometric functions in both radians and > degrees: > > https://docs.julialang.org/en/release-0.4/manual/mathematical-operations/#trigonometric-and-hyperbolic-functions > > They use sind, cosd, tand etc for the variants expecting degrees. I > think that's much more relevant than OpenGL. OK, that's interesting, I did not know that. And a quick google shows that Matlab also has sind and similar variants for degrees. > Although personally I prefer the look of d as a prefix: > > dsin, dcos, dtan > > That's more obviously pronounced "d(egrees) sin" etc rather than "sined" > "tanned" etc. If Julia and Matlab are sufficiently well known, I would prefer d as suffix rather than prefix. 
-- cheers, Hugh Fisher

From klahnakoski at mozilla.com Fri Jun 8 09:44:40 2018
From: klahnakoski at mozilla.com (Kyle Lahnakoski)
Date: Fri, 8 Jun 2018 09:44:40 -0400
Subject: [Python-ideas] Trigonometry in degrees
In-Reply-To:
References: <23322.5661.552082.91600@turnbull.sk.tsukuba.ac.jp>
Message-ID: <9fcb214b-a89d-e777-bc44-b0425c3aa0f2@mozilla.com>

On 2018-06-08 01:44, Yuval Greenfield wrote:
> On Thu, Jun 7, 2018 at 10:38 PM Stephen J. Turnbull wrote:
>
> > 6.123233995736766e-17
> >>>
> > is good enough for government work, including at the local public high
> > school.
>
> There probably is room for a library like "fractions" that represents
> multiples of pi or degrees precisely. I'm not sure how complicated or
> valuable of an endeavor that would be. But while I agree that floating
> point is good enough, we probably can do better.

Yes, I agree with making a module (called `rational_trig`?) that defines some Angle constants, and defines trig functions that accept Angle objects. Using angle objects will prevent the explosion of unit-specific variations on the trig functions (sin, sindeg, singrad, etc.). Like mentioned above, the Angle object is probably best implemented as a Rational of 2*pi, which will allow our favorite angles to be represented without floating point error. We can define `degrees` and `radians` constants which can be used as units; then trig looks something like:

from rational_trig import cos

if cos(90 * degrees) == 0:
    print("yay!")

It is probably slow as molasses, but maybe good enough for a teaching environment?

From j.van.dorp at deonet.nl Fri Jun 8 09:55:00 2018
From: j.van.dorp at deonet.nl (Jacco van Dorp)
Date: Fri, 8 Jun 2018 15:55:00 +0200
Subject: [Python-ideas] Trigonometry in degrees
In-Reply-To:
References:
Message-ID:

2018-06-08 15:19 GMT+02:00 Hugh Fisher:
>> Julia provides a full set of trigonometric functions in both radians and
>> degrees:
>>
>> https://docs.julialang.org/en/release-0.4/manual/mathematical-operations/#trigonometric-and-hyperbolic-functions
>>
>> They use sind, cosd, tand etc for the variants expecting degrees. I
>> think that's much more relevant than OpenGL.
>
> OK, that's interesting, I did not know that. And a quick google shows that
> Matlab also has sind and similar variants for degrees.
>
>> Although personally I prefer the look of d as a prefix:
>>
>> dsin, dcos, dtan
>>
>> That's more obviously pronounced "d(egrees) sin" etc rather than "sined"
>> "tanned" etc.
>
> If Julia and Matlab are sufficiently well known, I would prefer d as suffix
> rather than prefix.

I graduated less than a year ago - Matlab at the very least is quite well-known; we got lessons about it (although I was the one kid who used Python for his Matlab assignments... you can make do with matplotlib, Google, OpenCV, and NumPy). I also believe they give free/heavily discounted licenses to schools, hoping that after graduation those students are used to Matlab and will try to make their employers buy full-priced versions. I don't know about Julia, though.

From randiaz95 at gmail.com Fri Jun 8 10:12:01 2018
From: randiaz95 at gmail.com (Randy Diaz)
Date: Fri, 8 Jun 2018 10:12:01 -0400
Subject: [Python-ideas] Fwd: New suggested built in keyword: do
In-Reply-To:
References:
Message-ID:

I think that the keyword do would solve problems that occur when people want a simple way to run a command over an iterable but they don't want to store the data.
example:

do print(x) for x in range(50)
   ---------

The above command will not return anything and will just run the underlined command over a generator, thus running a "command comprehension" or do comprehension. This will stop people from using a list comprehension to run an iterable through a function when they don't want to return anything (specifically if memory is something we would want to conserve, such as in multithreaded web applications).

From jw14896 at my.bristol.ac.uk Fri Jun 8 10:15:35 2018
From: jw14896 at my.bristol.ac.uk (Jamie Willis)
Date: Fri, 8 Jun 2018 15:15:35 +0100
Subject: [Python-ideas] Fwd: New suggested built in keyword: do
In-Reply-To:
References:
Message-ID: <1528467311.local-24277bd1-a355-v1.2.2-96fb3a99@getmailspring.com>

I don't see how this is different to just:

for x in range(50): print(x)

Can you elaborate further?

Jamie

On Jun 8 2018, at 3:12 pm, Randy Diaz wrote:
>
> I think that the keyword do would solve problems that occur when people want a simple way to run a command over an iterable but they don't want to store the data.
> example:
>
> do print(x) for x in range(50)
>    ---------
> The above command will not return anything and will just run the underlined command over a generator, thus running a command comprehension or do comprehension. This will stop people from using a list comprehension to run an iterable through a function when they don't want to return anything (specifically if memory is something we would want to conserve, such as in multithreaded web applications).

From python at mrabarnett.plus.com Fri Jun 8 10:41:37 2018
From: python at mrabarnett.plus.com (MRAB)
Date: Fri, 8 Jun 2018 15:41:37 +0100
Subject: [Python-ideas] Fwd: New suggested built in keyword: do
In-Reply-To:
References:
Message-ID: <599dd5ea-8410-2308-28fc-e3433df11d47@mrabarnett.plus.com>

On 2018-06-08 15:12, Randy Diaz wrote:
> I think that the keyword do would solve problems that occur when people
> want a simple way to run a command over an iterable but they don't want
> to store the data.
>
> example:
>
> do print(x) for x in range(50)
>    ---------
> The above command will not return anything and will just run the
> underlined command over a generator...

How is that better than:

for x in range(50): print(x)

(if you want it to be a single line)?

From j.van.dorp at deonet.nl Fri Jun 8 10:58:19 2018
From: j.van.dorp at deonet.nl (Jacco van Dorp)
Date: Fri, 8 Jun 2018 16:58:19 +0200
Subject: [Python-ideas] Fwd: New suggested built in keyword: do
In-Reply-To: <599dd5ea-8410-2308-28fc-e3433df11d47@mrabarnett.plus.com>
References: <599dd5ea-8410-2308-28fc-e3433df11d47@mrabarnett.plus.com>
Message-ID:

Given that an exhaust method for generators didn't make it, I don't think a keyword has more chance.
For reference, that exhaust method would have looked like:

(print(x) for x in range(50)).exhaust()

and be defined as:

def exhaust(self):
    for _ in self:
        pass

Also, there was a more involved method that did it in fewer characters, using a queue of length 0 -- collections.deque(gen, maxlen=0), which consumes an iterator without storing anything.

From jw14896.2014 at my.bristol.ac.uk Fri Jun 8 11:08:00 2018
From: jw14896.2014 at my.bristol.ac.uk (Jamie Willis)
Date: Fri, 8 Jun 2018 16:08:00 +0100
Subject: [Python-ideas] Fwd: New suggested built in keyword: do
In-Reply-To:
References: <599dd5ea-8410-2308-28fc-e3433df11d47@mrabarnett.plus.com>
Message-ID:

What about just supporting filtering syntax at the top level?

for x in range(50) if x % 2: print(x)

It's a minor syntactic change which is very flexible and consistent with the existing comprehension syntax. It could even extend to allow multiple iterators as per a comprehension.

On Fri, 8 Jun 2018, 15:59 Jacco van Dorp, wrote:
> Given that an exhaust method for generators didn't make it, I don't
> think a keyword has more chance. For reference, that exhaust method
> would have looked like:
>
> (print(x) for x in range(50)).exhaust()
>
> and be defined as:
>
> def exhaust(self):
>     for _ in self:
>         pass
>
> Also, there was a more involved method that did it in fewer
> characters, using a queue of length 0.

From toddrjen at gmail.com Fri Jun 8 11:39:22 2018
From: toddrjen at gmail.com (Todd)
Date: Fri, 8 Jun 2018 11:39:22 -0400
Subject: [Python-ideas] Fwd: New suggested built in keyword: do
In-Reply-To:
References:
Message-ID:

On Fri, Jun 8, 2018, 10:12 Randy Diaz wrote:
> I think that the keyword do would solve problems that occur when people
> want a simple way to run a command over an iterable but they don't want to
> store the data.
>
> example:
>
> do print(x) for x in range(50)
>    ---------
> The above command will not return anything and will just run the command
> over a generator, thus running a command comprehension
> or do comprehension. This will stop people from using a list
> comprehension to run an iterable through a function when they don't want to
> return anything (specifically if memory is something we would want to
> conserve, such as in multithreaded web applications).

I would prefer some syntax for consuming iterators in a general way, such as

* = (x for x in y)

From steve at pearwood.info Fri Jun 8 12:01:31 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 9 Jun 2018 02:01:31 +1000
Subject: [Python-ideas] Fwd: New suggested built in keyword: do
In-Reply-To:
References:
Message-ID: <20180608160131.GJ12683@ando.pearwood.info>

On Fri, Jun 08, 2018 at 10:12:01AM -0400, Randy Diaz wrote:
> I think that the keyword do would solve problems that occur when people
> want a simple way to run a command over an iterable but they don't want to
> store the data.

Why does it have to be a keyword?
I like this pair of functions: def do(func, iterable, **kwargs): for x in iterable: func(x, **kwargs) def star(func, iterable, **kwargs): for x in iterable: func(*x, **kwargs) do.star = star del star Here's an example in use: py> do(print, [(1, 2), (3, 4), (5, 6, 7, 8)], sep='-', end='*\n') (1, 2)* (3, 4)* (5, 6, 7, 8)* py> do.star(print, [(1, 2), (3, 4), (5, 6, 7, 8)], sep='-', end='*\n') 1-2* 3-4* 5-6-7-8* Customize to your taste, and put them in your own personal toolbox. -- Steve From dkteresi at gmail.com Fri Jun 8 14:41:54 2018 From: dkteresi at gmail.com (David Teresi) Date: Fri, 8 Jun 2018 14:41:54 -0400 Subject: [Python-ideas] A "within" keyword Message-ID: One of the features I miss from languages such as C# is namespaces that work across files - it makes it a lot easier to organize code IMO. Here's an idea I had - it might not be the best idea, just throwing this out there: a "within" keyword that lets you execute code inside a namespace. For example: # A.py import types cool_namespace = types.SimpleNamespace() within cool_namespace: def foo(): print("foo run") #B.py import A within A.cool_namespace: foo() # prints "foo run" Thoughts? -------------- next part -------------- An HTML attachment was scrubbed... URL: From rhodri at kynesim.co.uk Fri Jun 8 15:01:36 2018 From: rhodri at kynesim.co.uk (Rhodri James) Date: Fri, 8 Jun 2018 20:01:36 +0100 Subject: [Python-ideas] A "within" keyword In-Reply-To: References: Message-ID: <987b0301-9ed6-bffa-5d65-bb3b84dfc5b2@kynesim.co.uk> On 08/06/18 19:41, David Teresi wrote: > One of the features I miss from languages such as C# is namespaces that > work across files - it makes it a lot easier to organize code IMO. > > Here's an idea I had - it might not be the best idea, just throwing this > out there: a "within" keyword that lets you execute code inside a > namespace. For example: > > # A.py > import types > cool_namespace = types.SimpleNamespace() > > within cool_namespace: > def foo(): > print("foo run") > > #B.py > import A > within A.cool_namespace: > foo() # prints "foo run" New keywords have a fairly high barrier to get over. Do you have a convincing use case? I don't personally consider this particularly convincing, not when it's pretty much equivalent to: >>> import A >>> foo = A.cool_namespace.foo >>> foo() To be honest I wouldn't even bother doing that, I'd just type A.cool_namespace.foo() when I wanted it. Explicit is better than implicit, after all. -- Rhodri James *-* Kynesim Ltd From drekin at gmail.com Fri Jun 8 17:11:09 2018 From: drekin at gmail.com (=?UTF-8?B?QWRhbSBCYXJ0b8Wh?=) Date: Fri, 8 Jun 2018 23:11:09 +0200 Subject: [Python-ideas] Trigonometry in degrees Message-ID: Steven D'Aprano wrote: > On Fri, Jun 08, 2018 at 10:53:34AM +0200, Adam Barto? wrote: >> Wouldn't sin(45 * DEG) where DEG = 2 * math.pi / 360 be better that >> sind(45)? This way we woudn't have to introduce new functions. (The problem >> with nonexact results for nice angles is a separate issue.) > > But that's not a separate issue, that's precisely one of the motives for > having dedicated trig functions for degrees. 
> > sind(45) (or dsin(45), as I would prefer) could (in principle) return > the closest possible float to sqrt(2)/2, which sin(45*DEG) does not do: > > py> DEG = 2 * math.pi / 360 > py> math.sin(45*DEG) == math.sqrt(2)/2 > False > > Likewise, we'd expect cosd(90) to return zero, not something not-quite > zero: > > py> math.cos(90*DEG) > 6.123031769111886e-17 > > > > That's how it works in Julia: > > julia> sind(45) == sqrt(2)/2 > true > > julia> cosd(90) > 0.0 > > > and I'd expect no less here. If we can't do that, there probably > wouldn't be much point in the exercise. But if there are both sin and dsin, and you ask about the difference between them, the obvious answer would be that one takes radians and the other takes degrees. The point that the degrees version is additionally exact on special values is an extra benefit. It would be nice to also fix the original sin, or more precisely to provide a way to give it a fractional multiple of pi. How about a special class PiMultiple that would represent a fractional multiple of pi? PI = PiMultiple(1) assert PI / 2 == PiMultiple(1, 2) assert cos(PI / 2) == 0 DEG = 2 * PI / 360 assert sin(45 * DEG) == sqrt(2) / 2 Best regards, Adam Barto? -------------- next part -------------- An HTML attachment was scrubbed... URL: From mike at selik.org Fri Jun 8 18:07:28 2018 From: mike at selik.org (Michael Selik) Date: Fri, 8 Jun 2018 15:07:28 -0700 Subject: [Python-ideas] A "within" keyword In-Reply-To: References: Message-ID: You can use ``eval`` to run an expression, swapping in a different globals and/or locals namespace. Will this serve your purpose? In [1]: import types In [2]: ns = types.SimpleNamespace(a=1) In [3]: eval('a', ns.__dict__) Out[3]: 1 https://docs.python.org/3/library/functions.html#eval On Fri, Jun 8, 2018 at 11:43 AM David Teresi wrote: > One of the features I miss from languages such as C# is namespaces that > work across files - it makes it a lot easier to organize code IMO. > > Here's an idea I had - it might not be the best idea, just throwing this > out there: a "within" keyword that lets you execute code inside a > namespace. For example: > > # A.py > import types > cool_namespace = types.SimpleNamespace() > > within cool_namespace: > def foo(): > print("foo run") > > #B.py > import A > within A.cool_namespace: > foo() # prints "foo run" > > Thoughts? > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mike at selik.org Fri Jun 8 19:06:46 2018 From: mike at selik.org (Michael Selik) Date: Fri, 8 Jun 2018 16:06:46 -0700 (PDT) Subject: [Python-ideas] A real life example of "given" In-Reply-To: References: Message-ID: <1fec9e22-802b-4717-b7db-56c91f11e80e@googlegroups.com> What's wrong with making this two lines? In [1]: import random In [2]: xs = [10, 20, 30] In [3]: def foo(x): ...: return [x + i for i in range(3)] ...: ...: In [4]: def bar(y): ...: if random.random() < 0.3: ...: return None ...: return str(y) ...: ...: In [5]: ys = ((y, bar(y)) for x in xs for y in foo(x)) In [6]: {y: result for y, result in ys if result is not None} Out[6]: {10: '10', 11: '11', 20: '20', 21: '21', 22: '22', 30: '30', 32: '32'} -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mike at selik.org Fri Jun 8 19:38:05 2018 From: mike at selik.org (Michael Selik) Date: Fri, 8 Jun 2018 16:38:05 -0700 (PDT) Subject: [Python-ideas] Add dict.append and dict.extend In-Reply-To: References: <20180605002556.GU12683@ando.pearwood.info> Message-ID: <7ac500c6-201d-4b7c-83fb-3dd951802770@googlegroups.com> On Monday, June 4, 2018 at 11:29:15 PM UTC-7, Ben Rudiak-Gould wrote: > > One example (or family of examples) is any situation where you would > have a UNIQUE constraint on an indexed column in a database. If the > values in a column should always be distinct, like the usernames in a > table of user accounts, you can declare that column UNIQUE (or PRIMARY > KEY) and any attempt to add a record with a duplicate username will > fail. > This might do the trick for you: class InsertOnlyDict(dict): ''' Supports item inserts, but not updates. ''' def __init__(self, *args, **kwds): self.update(*args, **kwds) def __setitem__(self, key, value): if key in self: raise KeyError(f'Duplicate key, {key!r}') super().__setitem__(key, value) def update(self, *args, **kwds): for k, v in dict(*args, **kwds).items(): self[k] = v If you're using a dict-like as an interface to a database table with a unique key constraint, I think your database will appropriately raise IntegrityError when you accidentally try to update instead of insert. -------------- next part -------------- An HTML attachment was scrubbed... URL: From mike at selik.org Fri Jun 8 20:03:05 2018 From: mike at selik.org (Michael Selik) Date: Fri, 8 Jun 2018 17:03:05 -0700 (PDT) Subject: [Python-ideas] Allow popping of slices In-Reply-To: References: <20180605001157.GT12683@ando.pearwood.info> Message-ID: <20aac3bd-723e-42b9-b25c-901cf7e81d21@googlegroups.com> On Tuesday, June 5, 2018 at 12:19:53 AM UTC-7, Ben Rudiak-Gould wrote: > > When the loop is simple enough I can write > > items = [... for item in items] > > and when it's complicated enough it probably makes sense to split it > into a separate function. But I've many times wished that I could > write > > for item in items.pop_all(): > ... > items.append(...) > ... > Isn't it equally easy to write: def foo(x): ... items = [foo(x) for x in items] I don't understand the desire to write this as a loop over a ``items.pop(slice(None))``. You can simply use the same loop body, replacing the ``for x in items:`` with ``def blah(x):`` and ``items.append(y)`` with ``return y``. Low typing effort. -------------- next part -------------- An HTML attachment was scrubbed... URL: From Richard at Damon-Family.org Fri Jun 8 20:04:51 2018 From: Richard at Damon-Family.org (Richard Damon) Date: Fri, 8 Jun 2018 20:04:51 -0400 Subject: [Python-ideas] Trigonometry in degrees In-Reply-To: References: Message-ID: On 6/8/18 5:11 PM, Adam Barto? wrote: > Steven D'Aprano wrote: > > On Fri, Jun 08, 2018 at 10:53:34AM +0200, Adam Barto? wrote: > >> Wouldn't sin(45 * DEG) where DEG = 2 * math.pi / 360 be better that > >> sind(45)? This way we woudn't have to introduce new functions. (The > problem > >> with nonexact results for nice angles is a separate issue.) > > > > But that's not a separate issue, that's precisely one of the motives > for > > having dedicated trig functions for degrees. 
> > > > sind(45) (or dsin(45), as I would prefer) could (in principle) return > > the closest possible float to sqrt(2)/2, which sin(45*DEG) does not do: > > > > py> DEG = 2 * math.pi / 360 > > py> math.sin(45*DEG) == math.sqrt(2)/2 > > False > > > > Likewise, we'd expect cosd(90) to return zero, not something not-quite > > zero: > > > > py> math.cos(90*DEG) > > 6.123031769111886e-17 > > > > > > > > That's how it works in Julia: > > > > julia> sind(45) == sqrt(2)/2 > > true > > > > julia> cosd(90) > > 0.0 > > > > > > and I'd expect no less here. If we can't do that, there probably > > wouldn't be much point in the exercise. > > But if there are both sin and dsin, and you ask about the difference > between them, the obvious answer would be that one takes radians and > the other takes degrees. The point that the degrees version is > additionally exact on special values is an extra benefit. It would be > nice to also fix the original sin, or more precisely to provide a way > to give it a fractional multiple of pi. How about a special class > PiMultiple that would represent a fractional multiple of pi? > > PI = PiMultiple(1) > assert PI / 2 == PiMultiple(1, 2) > assert cos(PI / 2) == 0 > DEG = 2 * PI / 360 > assert sin(45 * DEG) == sqrt(2) / 2 > > Best regards, > Adam Barto? > In one sense that is why I suggest a Circle based version of the trig functions. In effect that is a multiple of Tau (= 2*pi) routine. Richard Damon From robertvandeneynde at hotmail.com Fri Jun 8 04:45:54 2018 From: robertvandeneynde at hotmail.com (Robert Vanden Eynde) Date: Fri, 8 Jun 2018 08:45:54 +0000 Subject: [Python-ideas] Trigonometry in degrees In-Reply-To: <20180608065903.GF12683@ando.pearwood.info> References: <20180608054530.GC12683@ando.pearwood.info> <20180608065903.GF12683@ando.pearwood.info> Message-ID: - Thanks for pointing out a language (Julia) that already had a name convention. Interestingly they don't have a atan2d function. Choosing the same convention as another language is a big plus. - Adding trig function using floats between 0 and 1 is nice, currently one needs to do sin(tau * t) which is not so bad (from math import tau, tau sounds like turn). - Julia has sinpi for sin(pi*x), one could have sintau(x) for sin(tau*x) or sinturn(x). Grads are in the idea of turns but with more problems, as you guys said, grads are used by noone, but turns are more useful. sin(tau * t) For The Win. - Even though people mentionned 1/6 not being exact, so that advantage over radians isn't that obvious ? from math import sin, tau from fractions import Fraction sin(Fraction(1,6) * tau) sindeg(Fraction(1,6) * 360) These already work today by the way. - As you guys pointed out, using radians implies knowing a little bit about floating point arithmetic and its limitations. Integer are more simple and less error prone. Of course it's useful to know about floats but in many case it's not necessary to learn about it right away, young students just want their player in the game move in a straight line when angle = 90. - sin(pi/2) == 1 but cos(pi/2) != 0 and sin(3*pi/2) != 1 so sin(pi/2) is kind of an exception. Le ven. 8 juin 2018 ? 09:11, Steven D'Aprano > a ?crit : On Fri, Jun 08, 2018 at 03:55:34PM +1000, Chris Angelico wrote: > On Fri, Jun 8, 2018 at 3:45 PM, Steven D'Aprano > wrote: > > Although personally I prefer the look of d as a prefix: > > > > dsin, dcos, dtan > > > > That's more obviously pronounced "d(egrees) sin" etc rather than "sined" > > "tanned" etc. > > Having it as a suffix does have one advantage. 
The math module would > need a hyperbolic sine function which accepts an argument in; and > then, like Charles Napier [1], Python would finally be able to say "I > have sindh". Ha ha, nice pun, but no, the hyperbolic trig functions never take arguments in degrees. Or radians for that matter. They are "hyperbolic angles", which some electrical engineering text books refer to as "hyperbolic radians", but all the maths text books I've seen don't call them anything other than a real number. (Or sometimes a complex number.) But for what it's worth, there is a correspondence of a sort between the hyperbolic angle and circular angles. The circular angle going between 0 to 45? corresponds to the hyperbolic angle going from 0 to infinity. https://en.wikipedia.org/wiki/Hyperbolic_angle https://en.wikipedia.org/wiki/Hyperbolic_function > [1] Apocryphally, alas. Don't ruin a good story with facts ;-) -- Steve _______________________________________________ Python-ideas mailing list Python-ideas at python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Fri Jun 8 21:04:40 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 9 Jun 2018 11:04:40 +1000 Subject: [Python-ideas] Trigonometry in degrees In-Reply-To: References: <20180608054530.GC12683@ando.pearwood.info> <20180608065903.GF12683@ando.pearwood.info> Message-ID: <20180609010440.GK12683@ando.pearwood.info> On Fri, Jun 08, 2018 at 08:45:54AM +0000, Robert Vanden Eynde wrote: > from math import sin, tau > from fractions import Fraction > sin(Fraction(1,6) * tau) > sindeg(Fraction(1,6) * 360) > > These already work today by the way. You obviously have a different understanding of the words "already work" than I do: py> sindeg(Fraction(1,6) * 360) Traceback (most recent call last): File "", line 1, in NameError: name 'sindeg' is not defined Since tau is a float, writing Fraction(1,6) * tau instead of tau/6 is a waste of time. Also, Fraction(1,6) * 360 is also a waste of time, since 360/6 is not only exact, but can be done at compile-time. -- Steve From steve at pearwood.info Fri Jun 8 21:12:35 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 9 Jun 2018 11:12:35 +1000 Subject: [Python-ideas] Trigonometry in degrees In-Reply-To: References: Message-ID: <20180609011234.GL12683@ando.pearwood.info> On Fri, Jun 08, 2018 at 11:11:09PM +0200, Adam Barto? wrote: > But if there are both sin and dsin, and you ask about the difference > between them, the obvious answer would be that one takes radians and the > other takes degrees. The point that the degrees version is additionally > exact on special values is an extra benefit. No, that's not an extra benefit, it is the only benefit! If we can't make it exact for the obvious degree angles, there would be no point in doing this. We'd just tell people to write their own two-line functions: def sindeg(angle): return math.sin(math.radians(angle)) The only reason to even consider making this a standard library function is if we can do better than that. > It would be nice to also fix the original sin, The sin function is not broken and does not need fixing. (Modulo quirks of individual platform maths libraries.) > or more precisely to provide a way to give it a > fractional multiple of pi. How about a special class PiMultiple that would > represent a fractional multiple of pi? What is the point of that? 
When you pass it to math.sin, it still needs to be converted to a float before sin can operate on it. Unless you are proposing a series of dunder methods __sin__ __cos__ and __tan__ to allow arbitrary classes to be passed to sin, cos and tan, the following cannot work: > PI = PiMultiple(1) > assert PI / 2 == PiMultiple(1, 2) > assert cos(PI / 2) == 0 Without a __cos__ dunder method that allows PiMultiple objects to customise the result of cos(), that last line has to fail, because cos(math.pi/2) == 0 fails. > DEG = 2 * PI / 360 > assert sin(45 * DEG) == sqrt(2) / 2 Likewise. -- Steve From greg.ewing at canterbury.ac.nz Fri Jun 8 21:43:33 2018 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 09 Jun 2018 13:43:33 +1200 Subject: [Python-ideas] Fwd: New suggested built in keyword: do In-Reply-To: References: <599dd5ea-8410-2308-28fc-e3433df11d47@mrabarnett.plus.com> Message-ID: <5B1B30C5.5020808@canterbury.ac.nz> Jamie Willis wrote: > What about just supporting filtering syntax at the top level? > > for x range(50) if x % 2: print(x) It would be more general and probably easier to support this: for x in range(50): if x % 2: print(x) i.e. just relax the requirement that a statement following on the same line after a colon must be a simple statement. -- Greg From wes.turner at gmail.com Fri Jun 8 22:09:01 2018 From: wes.turner at gmail.com (Wes Turner) Date: Fri, 8 Jun 2018 22:09:01 -0400 Subject: [Python-ideas] Trigonometry in degrees In-Reply-To: References: <20180608054530.GC12683@ando.pearwood.info> <20180608065903.GF12683@ando.pearwood.info> Message-ID: # Python, NumPy, SymPy, mpmath, sage trigonometric functions https://en.wikipedia.org/wiki/Trigonometric_functions ## Python math module https://docs.python.org/3/library/math.html#trigonometric-functions - degrees(radians): Float degrees - radians(degrees): Float degrees ## NumPy https://docs.scipy.org/doc/numpy/reference/routines.math.html#trigonometric-functions - degrees(radians) : List[float] degrees - rad2deg(radians): List[float] degrees - radians(degrees) : List[float] radians - deg2rad(degrees): List[float] radians https://docs.scipy.org/doc/numpy/reference/generated/numpy.sin.html ## SymPy http://docs.sympy.org/latest/modules/functions/elementary.html#sympy-functions-elementary-trigonometric http://docs.sympy.org/latest/modules/functions/elementary.html#trionometric-functions - sympy.mpmath.degrees(radians): Float degrees - sympy.mpmath.radians(degrees): Float radians - https://stackoverflow.com/questions/31072815/cosd-and-sind-with-sympy - cosd, sind - https://stackoverflow.com/questions/31072815/cosd-and-sind-with-sympy#comment50176770_31072815 > Let x, theta, phi, etc. be Symbols representing quantities in radians. Keep a list of these symbols: angles = [x, theta, phi]. Then, at the very end, use y.subs([(angle, angle*pi/180) for angle in angles]) to change the meaning of the symbols to degrees" ## mpmath http://mpmath.org/doc/current/functions/trigonometric.html - sympy.mpmath.degrees(radians): Float degrees - sympy.mpmath.radians(degrees): Float radians ## Sage https://doc.sagemath.org/html/en/reference/functions/sage/functions/trig.html On Friday, June 8, 2018, Robert Vanden Eynde wrote: > - Thanks for pointing out a language (Julia) that already had a name > convention. Interestingly they don't have a atan2d function. Choosing the > same convention as another language is a big plus. 
> > - Adding trig function using floats between 0 and 1 is nice, currently one > needs to do sin(tau * t) which is not so bad (from math import tau, tau > sounds like turn). > > - Julia has sinpi for sin(pi*x), one could have sintau(x) for sin(tau*x) > or sinturn(x). > > Grads are in the idea of turns but with more problems, as you guys said, > grads are used by noone, but turns are more useful. sin(tau * t) For The > Win. > > - Even though people mentionned 1/6 not being exact, so that advantage > over radians isn't that obvious ? > > from math import sin, tau > from fractions import Fraction > sin(Fraction(1,6) * tau) > sindeg(Fraction(1,6) * 360) > > These already work today by the way. > > - As you guys pointed out, using radians implies knowing a little bit > about floating point arithmetic and its limitations. Integer are more > simple and less error prone. Of course it's useful to know about floats but > in many case it's not necessary to learn about it right away, young > students just want their player in the game move in a straight line when > angle = 90. > > - sin(pi/2) == 1 but cos(pi/2) != 0 and sin(3*pi/2) != 1 so sin(pi/2) is > kind of an exception. > > > > > Le ven. 8 juin 2018 ? 09:11, Steven D'Aprano a > ?crit : > >> On Fri, Jun 08, 2018 at 03:55:34PM +1000, Chris Angelico wrote: >> > On Fri, Jun 8, 2018 at 3:45 PM, Steven D'Aprano >> wrote: >> > > Although personally I prefer the look of d as a prefix: >> > > >> > > dsin, dcos, dtan >> > > >> > > That's more obviously pronounced "d(egrees) sin" etc rather than >> "sined" >> > > "tanned" etc. >> > >> > Having it as a suffix does have one advantage. The math module would >> > need a hyperbolic sine function which accepts an argument in; and >> > then, like Charles Napier [1], Python would finally be able to say "I >> > have sindh". >> >> Ha ha, nice pun, but no, the hyperbolic trig functions never take >> arguments in degrees. Or radians for that matter. They are "hyperbolic >> angles", which some electrical engineering text books refer to as >> "hyperbolic radians", but all the maths text books I've seen don't call >> them anything other than a real number. (Or sometimes a complex number.) >> >> But for what it's worth, there is a correspondence of a sort between the >> hyperbolic angle and circular angles. The circular angle going between 0 >> to 45? corresponds to the hyperbolic angle going from 0 to infinity. >> >> https://en.wikipedia.org/wiki/Hyperbolic_angle >> >> https://en.wikipedia.org/wiki/Hyperbolic_function >> >> >> > [1] Apocryphally, alas. >> >> Don't ruin a good story with facts ;-) >> >> >> >> -- >> Steve >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjreedy at udel.edu Fri Jun 8 23:13:19 2018 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 8 Jun 2018 23:13:19 -0400 Subject: [Python-ideas] A "within" keyword In-Reply-To: References: Message-ID: On 6/8/2018 6:07 PM, Michael Selik wrote: > You can use ``eval`` to run an expression, swapping in a different > globals and/or locals namespace. Will this serve your purpose? > > In [1]: import types > In [2]: ns = types.SimpleNamespace(a=1) > In [3]: eval('a', ns.__dict__) > Out[3]: 1 > > https://docs.python.org/3/library/functions.html#eval Or exec to run statements in namespaces. 
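For example, a quick sketch:

import types

ns = types.SimpleNamespace()
exec("def foo():\n    return 'foo run'", vars(ns))
print(ns.foo())  # prints: foo run

The def statement executed by exec binds 'foo' in the namespace's dict rather than in the caller's globals.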
This is how IDLE, and I am sure, at least some other IDEs, execute user code in a simulated __main__ namespace. -- Terry Jan Reedy From tritium-list at sdamon.com Fri Jun 8 23:27:58 2018 From: tritium-list at sdamon.com (Alex Walters) Date: Fri, 8 Jun 2018 23:27:58 -0400 Subject: [Python-ideas] A "within" keyword In-Reply-To: References: Message-ID: <48c9801d3ffa1$ddb01b80$99105280$@sdamon.com> > -----Original Message----- > From: Python-ideas list=sdamon.com at python.org> On Behalf Of David Teresi > Sent: Friday, June 8, 2018 2:42 PM > To: python-ideas at python.org > Subject: [Python-ideas] A "within" keyword > > One of the features I miss from languages such as C# is namespaces that > work across files - it makes it a lot easier to organize code IMO. > > Here's an idea I had - it might not be the best idea, just throwing this out > there: a "within" keyword that lets you execute code inside a namespace. > For example: > > # A.py > import types > cool_namespace = types.SimpleNamespace() > > > within cool_namespace: > def foo(): > print("foo run") > > > #B.py > import A > within A.cool_namespace: > foo() # prints "foo run" > > > Thoughts? Why not... cool_namespace = SomeNamespaceContextManager() with cool_namespace: def foo(): pass advantage being it introduces no new keyword. The 'disadvantage' is it would change semantics of the with statement (as would be required to get the names defined in the suite of the context manager) From robert.kern at gmail.com Fri Jun 8 23:43:15 2018 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 8 Jun 2018 20:43:15 -0700 Subject: [Python-ideas] Trigonometry in degrees In-Reply-To: References: <20180608054530.GC12683@ando.pearwood.info> <20180608065903.GF12683@ando.pearwood.info> Message-ID: On 6/8/18 01:45, Robert Vanden Eynde wrote: > - Thanks for pointing out a language (Julia) that already had a name convention. > Interestingly they don't have a atan2d function. Choosing the same convention as > another language is a big plus. For what it's worth, scipy calls them sindg, cosdg, tandg, cotdg. https://docs.scipy.org/doc/scipy-1.1.0/reference/special.html -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From wes.turner at gmail.com Fri Jun 8 23:53:42 2018 From: wes.turner at gmail.com (Wes Turner) Date: Fri, 8 Jun 2018 23:53:42 -0400 Subject: [Python-ideas] Trigonometry in degrees In-Reply-To: References: <20180608054530.GC12683@ando.pearwood.info> <20180608065903.GF12683@ando.pearwood.info> Message-ID: On Fri, Jun 8, 2018 at 11:44 PM Robert Kern wrote: > On 6/8/18 01:45, Robert Vanden Eynde wrote: > > - Thanks for pointing out a language (Julia) that already had a name > convention. > > Interestingly they don't have a atan2d function. Choosing the same > convention as > > another language is a big plus. > > For what it's worth, scipy calls them sindg, cosdg, tandg, cotdg. > > https://docs.scipy.org/doc/scipy-1.1.0/reference/special.html ## SciPy https://docs.scipy.org/doc/scipy-1.1.0/reference/special.html#convenience-functions - https://docs.scipy.org/doc/scipy-1.1.0/reference/generated/scipy.special.sindg.html#scipy.special.sindg > > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma > that is made terrible by our own mad attempt to interpret it as though > it had > an underlying truth." 
> -- Umberto Eco > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jelle.zijlstra at gmail.com Sat Jun 9 00:46:45 2018 From: jelle.zijlstra at gmail.com (Jelle Zijlstra) Date: Fri, 8 Jun 2018 21:46:45 -0700 Subject: [Python-ideas] A "within" keyword In-Reply-To: <48c9801d3ffa1$ddb01b80$99105280$@sdamon.com> References: <48c9801d3ffa1$ddb01b80$99105280$@sdamon.com> Message-ID: 2018-06-08 20:27 GMT-07:00 Alex Walters : > Why not... > > cool_namespace = SomeNamespaceContextManager() > > with cool_namespace: > def foo(): > pass > > advantage being it introduces no new keyword. The 'disadvantage' is it > would change semantics of the with statement (as would be required to get > the names defined in the suite of the context manager) > > Actually, this is probably doable now. You can get the globals of the calling code by doing sys._getframe(), then check which names are added while the context manager is active. > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Sat Jun 9 01:07:39 2018 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 9 Jun 2018 15:07:39 +1000 Subject: [Python-ideas] A "within" keyword In-Reply-To: References: <48c9801d3ffa1$ddb01b80$99105280$@sdamon.com> Message-ID: On 9 June 2018 at 14:46, Jelle Zijlstra wrote: > > > 2018-06-08 20:27 GMT-07:00 Alex Walters : > >> Why not... >> >> cool_namespace = SomeNamespaceContextManager() >> >> with cool_namespace: >> def foo(): >> pass >> >> advantage being it introduces no new keyword. The 'disadvantage' is it >> would change semantics of the with statement (as would be required to get >> the names defined in the suite of the context manager) >> >> Actually, this is probably doable now. You can get the globals of the > calling code by doing sys._getframe(), then check which names are added > while the context manager is active. > It's doable without code generation hacks by using class statements instead of with statements. The withdrawn PEP 422 shows how to use a custom metaclass to support a "namespace" keyword argument in the class header that redirects all writes in the body to the given dict: https://www.python.org/dev/peps/pep-0422/#new-ways-of-using-classes https://www.python.org/dev/peps/pep-0422/#extending-a-class even shows how to further use that to extend existing classes with new attributes. We're not likely to actively encourage that approach though - while they do enable some handy things, they also encourage hard to navigate programs with a lot of "action at a distance" side effects that make it tricky to reason locally about the code you're currently looking at (if anything, we've been pushing more in the other direction: encouraging the use of features like checked type hints to better *enable* reasoning locally about a piece of code). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From greg.ewing at canterbury.ac.nz Sat Jun 9 01:13:30 2018 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 09 Jun 2018 17:13:30 +1200 Subject: [Python-ideas] A "within" keyword In-Reply-To: References: Message-ID: <5B1B61FA.6060409@canterbury.ac.nz> David Teresi wrote: > One of the features I miss from languages such as C# is namespaces that > work across files - it makes it a lot easier to organize code IMO. Maybe for the writer, but not for the reader. I like the fact that in Python I can usually tell which file a given class or function is implemented in by looking at the import statements. -- Greg From mike at selik.org Sat Jun 9 02:27:08 2018 From: mike at selik.org (Michael Selik) Date: Fri, 8 Jun 2018 23:27:08 -0700 (PDT) Subject: [Python-ideas] Fwd: New suggested built in keyword: do In-Reply-To: References: Message-ID: The benefit of list, dict, and set comprehensions and generator expressions is that they evaluate, as opposed to simply exec. The purpose of making them one-liners is to allow them to be assigned to a variable or passed as an argument. If you're not assigning or passing, then why not use a newline character? "Sparse is better than dense." On Friday, June 8, 2018 at 7:13:07 AM UTC-7, Randy Diaz wrote: > > I think that the keyword do would solve problems that occur when people > want a simple way to run a command over an iterable but they dont want to > store the data. > > example: > do print(x) for x in range(50) > -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Sat Jun 9 04:21:23 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 9 Jun 2018 18:21:23 +1000 Subject: [Python-ideas] A "within" keyword In-Reply-To: References: Message-ID: <20180609082123.GM12683@ando.pearwood.info> On Fri, Jun 08, 2018 at 03:07:28PM -0700, Michael Selik wrote: > You can use ``eval`` to run an expression, swapping in a different globals > and/or locals namespace. Will this serve your purpose? > > In [1]: import types > In [2]: ns = types.SimpleNamespace(a=1) > In [3]: eval('a', ns.__dict__) > Out[3]: 1 The public API for getting an object namespace is vars(ns). But why would we write eval('a', vars(ns)) instead of getattr(ns, 'a') or even better just ns.a? Is your Python code too fast and you need to slow it down? *wink* eval and exec are useful when the code you want to run needs to be constructed at runtime. Its not generally useful when you know what you want ahead of time as in your example above. -- Steve From desmoulinmichel at gmail.com Sat Jun 9 04:58:00 2018 From: desmoulinmichel at gmail.com (Michel Desmoulin) Date: Sat, 9 Jun 2018 10:58:00 +0200 Subject: [Python-ideas] Making Path() a built in. In-Reply-To: References: <598EB1E5-AD1E-47C6-ADBC-A09036CFFA11@python.org> <20180605234227.GX12683@ando.pearwood.info> <20180606065331.GZ12683@ando.pearwood.info> <31883CB4-1F13-4208-8A31-07F59241B704@barrys-emacs.org> Message-ID: <00a160d0-7fd5-1f8a-8c38-c5c0dec8ebf1@gmail.com> Creating built in dynamically is not a good idea. Tools complain, new comers wonder where it comes from, it sets a precedent for adding more or debating about it. Better have the debate once here, make it official or decline it officially, and have a clean result. Le 08/06/2018 ? 21:28, Barry a ?crit?: > I think you forgot to to reply to the list. > Barry > > >> On 8 Jun 2018, at 13:16, Michel Desmoulin wrote: >> >> Creating builtin dynamically is not a good idea. 
tools complains, new comers wonder where it comes from, it sets a precedent for adding more or debating about it. >> >> Better have the debate once here, make it official or decline it officially, and have a clean result. >> >>> Le 06/06/2018 ? 20:05, Barry Scott a ?crit : >>> I assume the the idea is that everybody has Path available without the need to do the import dance first. >>> >>> If its for personal convenience you can always do this trick, that is used by gettext to make _ a builtin. >>> >>> import pathlib >>> import builtings >>> >>> builtins.__dict__['Path'] = pathlib.Path >>> >>> Now Path *is* a builtin for the rest of the code. >>> >>> Barry >>> >>> >>> >>> >>> _______________________________________________ >>> Python-ideas mailing list >>> Python-ideas at python.org >>> https://mail.python.org/mailman/listinfo/python-ideas >>> Code of Conduct: http://python.org/psf/codeofconduct/ >>> >> > From steve at pearwood.info Sat Jun 9 05:03:55 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 9 Jun 2018 19:03:55 +1000 Subject: [Python-ideas] A "within" keyword In-Reply-To: References: Message-ID: <20180609090354.GN12683@ando.pearwood.info> On Fri, Jun 08, 2018 at 02:41:54PM -0400, David Teresi wrote: > One of the features I miss from languages such as C# is namespaces that > work across files - it makes it a lot easier to organize code IMO. I too have often wanted a sub-module namespace without the need to separate code into seperate files. Something like a package, but existing inside a single physical file. In pseudo-code: a = 1 def spam(): return a namespace A: # conceptually, this is like a module within a module a = 100 def spam(): return a # later spam() + A.spam() => returns 101 If that looks similar to the class statement, except there's no need to create an instance, that's deliberate. > Here's an idea I had - it might not be the best idea, just throwing this > out there: a "within" keyword that lets you execute code inside a > namespace. For example: [...] > within cool_namespace: > def foo(): > print("foo run") [...] > within A.cool_namespace: > foo() # prints "foo run" This sort of thing is similar to a old FAQ: https://docs.python.org/3/faq/design.html#why-doesn-t-python-have-a-with-statement-for-attribute-assignments so it isn't unambiguously clear what foo would mean: is it the current namespace foo, the surrounding namespace, or the module namespace? Its not necessarily undoable: currently Python has what the "Learning Python" book calls the LEGB scoping rule: L - local E - enclosing function(s) or class G - global (module) B - builtins It could be conceivable to add a rule to slot a namespace in there somewhere, but the scoping rules already are fairly hairy (e.g. classes and functions work a little differently, we have a sub-local scope for comprehensions) and I'd be cautious about making them hairier with something like your "within"/"with" statement. The question I have is what is your motive for this? What problem does this solve for you? 
-- 
Steve

From robertve92 at gmail.com  Sat Jun 9 05:11:16 2018
From: robertve92 at gmail.com (Robert Vanden Eynde)
Date: Sat, 9 Jun 2018 11:11:16 +0200
Subject: [Python-ideas] A "within" keyword
In-Reply-To: <20180609082123.GM12683@ando.pearwood.info>
References: <20180609082123.GM12683@ando.pearwood.info>
Message-ID:

Classes already provide some features of a namespace:

class cool_namespace:
    A = 8

    @staticmethod
    def f():
        return "yo"

    @staticmethod
    def g():
        return (1 + cool_namespace.A) * cool_namespace.f()

And if you're tired of writing @staticmethod, you can write a class
decorator "namespace":

@namespace
class cool_namespace:
    A = 8

    def f():
        return "yo"

    def g():
        return (1 + cool_namespace.A) * cool_namespace.f()

And I think this decorator already exists somewhere.

On Sat, 9 Jun 2018 at 10:21, Steven D'Aprano wrote:

> On Fri, Jun 08, 2018 at 03:07:28PM -0700, Michael Selik wrote:
>
> > You can use ``eval`` to run an expression, swapping in a different globals
> > and/or locals namespace. Will this serve your purpose?
> >
> > In [1]: import types
> > In [2]: ns = types.SimpleNamespace(a=1)
> > In [3]: eval('a', ns.__dict__)
> > Out[3]: 1
>
> The public API for getting an object namespace is vars(ns).
>
> But why would we write eval('a', vars(ns)) instead of getattr(ns, 'a')
> or even better just ns.a? Is your Python code too fast and you need to
> slow it down? *wink*
>
> eval and exec are useful when the code you want to run needs to be
> constructed at runtime. It's not generally useful when you know what
> you want ahead of time as in your example above.
>
> --
> Steve

From desmoulinmichel at gmail.com  Sat Jun 9 05:17:05 2018
From: desmoulinmichel at gmail.com (Michel Desmoulin)
Date: Sat, 9 Jun 2018 11:17:05 +0200
Subject: [Python-ideas] Allow callables in slices
Message-ID: <26992e67-1ba9-73df-34ce-81db9bd133c0@gmail.com>

Such as that:

def starting_when(element):
    ...

a_list[starting_when:]

Is equivalent to:

from itertools import dropwhile

def starting_when(element):
    ...

list(dropwhile(lambda x: not starting_when(x), a_list))

And

def ending_when(element):
    ...

a_list[:ending_when]

To:

from itertools import takewhile

def ending_when(element):
    ...

list(takewhile(lambda x: not ending_when(x), a_list))

From steve at pearwood.info  Sat Jun 9 05:19:50 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 9 Jun 2018 19:19:50 +1000
Subject: [Python-ideas] A "within" keyword
In-Reply-To:
References: <48c9801d3ffa1$ddb01b80$99105280$@sdamon.com>
Message-ID: <20180609091949.GO12683@ando.pearwood.info>

On Sat, Jun 09, 2018 at 03:07:39PM +1000, Nick Coghlan wrote:

> It's doable without code generation hacks by using class statements instead
> of with statements.
>
> The withdrawn PEP 422 shows how to use a custom metaclass to support a
> "namespace" keyword argument in the class header that redirects all writes
> in the body to the given dict:
> https://www.python.org/dev/peps/pep-0422/#new-ways-of-using-classes
>
> https://www.python.org/dev/peps/pep-0422/#extending-a-class even shows how
> to further use that to extend existing classes with new attributes.

That's awesome! I've been poking away at similar ideas for a long time.
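(For reference, a minimal version of such a "namespace" decorator could
look like the sketch below; its exact behaviour is an assumption for
illustration, not a known library:)

import types

def namespace(cls):
    # Wrap every plain function defined in the class body in
    # staticmethod(), so the class can be used as a bag of functions.
    for name, value in list(vars(cls).items()):
        if isinstance(value, types.FunctionType):
            setattr(cls, name, staticmethod(value))
    return cls

@namespace
class cool_namespace:
    A = 8
    def f():
        return "yo"
    def g():
        return (1 + cool_namespace.A) * cool_namespace.f()

print(cool_namespace.g())   # "yo" repeated 9 times

(In Python 3 the staticmethod wrapping is mostly redundant when calling
through the class itself, but it keeps the functions callable through
instances and subclasses as well.)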
-- 
Steve

From drekin at gmail.com  Sat Jun 9 05:21:50 2018
From: drekin at gmail.com (=?UTF-8?B?QWRhbSBCYXJ0b8Wh?=)
Date: Sat, 9 Jun 2018 11:21:50 +0200
Subject: [Python-ideas] Trigonometry in degrees
Message-ID:

Steven D'Aprano wrote:

> On Fri, Jun 08, 2018 at 11:11:09PM +0200, Adam Bartoš wrote:
>
>> But if there are both sin and dsin, and you ask about the difference
>> between them, the obvious answer would be that one takes radians and the
>> other takes degrees. The point that the degrees version is additionally
>> exact on special values is an extra benefit.
>
> No, that's not an extra benefit, it is the only benefit!
>
> If we can't make it exact for the obvious degree angles, there would be
> no point in doing this. We'd just tell people to write their own
> two-line functions:
>
> def sindeg(angle):
>     return math.sin(math.radians(angle))
>
> The only reason to even consider making this a standard library function
> is if we can do better than that.

I agree completely, I just think it doesn't look obvious.

>> It would be nice to also fix the original sin,
>
> The sin function is not broken and does not need fixing.
>
> (Modulo quirks of individual platform maths libraries.)
>
>> or more precisely to provide a way to give it a
>> fractional multiple of pi. How about a special class PiMultiple that would
>> represent a fractional multiple of pi?
>
> What is the point of that? When you pass it to math.sin, it still needs
> to be converted to a float before sin can operate on it.
>
> Unless you are proposing a series of dunder methods __sin__ __cos__ and
> __tan__ to allow arbitrary classes to be passed to sin, cos and tan, the
> following cannot work.

The idea was that the functions could handle the PiMultiple instances in
a special way and fall back to float only when a special value is not
detected. It would be like the proposed dsin functionality, but with a
magic class instead of a new set of functions, and without a particular
choice of granularity (360 degrees).

But maybe it isn't worth it. Also what about acos(0)? Should it return
PiMultiple(1, 2) and confuse people, or just 1.5707963267948966 and lose
exactness?

Best regards,
Adam Bartoš

From desmoulinmichel at gmail.com  Sat Jun 9 05:20:00 2018
From: desmoulinmichel at gmail.com (Michel Desmoulin)
Date: Sat, 9 Jun 2018 11:20:00 +0200
Subject: [Python-ideas] Allow callable in slices
Message-ID:

Given 2 callables checking when a condition arises and returning True:

def starting_when(element):
    ...

def ending_when(element):
    ...

Allow:

a_list[starting_when:]

To be equivalent to:

from itertools import dropwhile

list(dropwhile(lambda x: not starting_when(x), a_list))

And:

a_list[:ending_when]

To:

from itertools import takewhile

list(takewhile(lambda x: not ending_when(x), a_list))

From steve at pearwood.info  Sat Jun 9 05:47:57 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 9 Jun 2018 19:47:57 +1000
Subject: [Python-ideas] Allow callables in slices
In-Reply-To: <26992e67-1ba9-73df-34ce-81db9bd133c0@gmail.com>
References: <26992e67-1ba9-73df-34ce-81db9bd133c0@gmail.com>
Message-ID: <20180609094757.GP12683@ando.pearwood.info>

On Sat, Jun 09, 2018 at 11:17:05AM +0200, Michel Desmoulin wrote:

> Such as that:
>
> def starting_when(element):
>     ...
>
> a_list[starting_when:]

> Is equivalent to:
[...]
> list(dropwhile(lambda x: not starting_when(x), a_list))

That looks like a slice from an index to the end of the list.
Things which are similar should look similar, but things which are different should NOT look similar. What would: alist[callable:callable:callable] do? How about this one? alist[7:callable:-1] If there are not meaningful interpretations of callables as part of general slice notation, then we shouldn't use slice notation as a shortcut for dropwhile. Rather than give this syntactic support, I'd rather add a new function to itertools that composes takewhile and dropwhile: def between(iterable, startcondition, endcondition): it = iter(iterable) return takewhile(lambda x: not endcondition(x), dropwhile(lambda x: not startcondition(x), it) ) If the caller wants to convert to a list (or some other sequence), they can, otherwise they can keep it as an iterator. -- Steve From steve at pearwood.info Sat Jun 9 08:40:25 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 9 Jun 2018 22:40:25 +1000 Subject: [Python-ideas] Making Path() a built in. In-Reply-To: <31883CB4-1F13-4208-8A31-07F59241B704@barrys-emacs.org> References: <20180605234227.GX12683@ando.pearwood.info> <20180606065331.GZ12683@ando.pearwood.info> <31883CB4-1F13-4208-8A31-07F59241B704@barrys-emacs.org> Message-ID: <20180609124024.GQ12683@ando.pearwood.info> On Wed, Jun 06, 2018 at 07:05:35PM +0100, Barry Scott wrote: > I assume the the idea is that everybody has Path available without the need to do the import dance first. > > If its for personal convenience you can always do this trick, that is used by gettext to make _ a builtin. > > import pathlib > import builtings > > builtins.__dict__['Path'] = pathlib.Path The public API for getting the namespace of an object is vars(): vars(builtins)['Path'] but since builtins is just a module, the best way to add a new attribute to it is: builtins.Path = pathlib.Path -- Steve From apalala at gmail.com Sat Jun 9 08:43:47 2018 From: apalala at gmail.com (=?UTF-8?Q?Juancarlo_A=C3=B1ez?=) Date: Sat, 9 Jun 2018 08:43:47 -0400 Subject: [Python-ideas] A PEP on introducing variables on 'if' and 'while' Message-ID: Hello @here, Is there a guide about writing (and publishing) PEPs? I'd like to write one on `while expre as v: ...` using the context semantics of `with expr as v` (not `except E as e`). Cheers, -------------- next part -------------- An HTML attachment was scrubbed... URL: From J.Demeyer at UGent.be Sat Jun 9 08:47:10 2018 From: J.Demeyer at UGent.be (Jeroen Demeyer) Date: Sat, 9 Jun 2018 14:47:10 +0200 Subject: [Python-ideas] A PEP on introducing variables on 'if' and 'while' In-Reply-To: References: Message-ID: <5B1BCC4E.1010508@UGent.be> On 2018-06-09 14:43, Juancarlo A?ez wrote: > Is there a guide about writing (and publishing) PEPs? https://www.python.org/dev/peps/pep-0001/ From steve at pearwood.info Sat Jun 9 08:57:21 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 9 Jun 2018 22:57:21 +1000 Subject: [Python-ideas] A PEP on introducing variables on 'if' and 'while' In-Reply-To: References: Message-ID: <20180609125721.GR12683@ando.pearwood.info> On Sat, Jun 09, 2018 at 08:43:47AM -0400, Juancarlo A?ez wrote: > Hello @here, > > Is there a guide about writing (and publishing) PEPs? https://www.python.org/dev/peps/pep-0001/ > I'd like to write one on `while expre as v: ...` using the context > semantics of `with expr as v` (not `except E as e`). Do you mean the context manager semantics of with statements? As in, calling the __enter__ and __exit__ method? Please make sure you are very familiar with PEP 572 before you do, and expect to have your PEP compared to it. 
Be especially prepared to be challenged with the question what is so
special about while and if that they, and they alone, are permitted to
use assignment expressions.

(You don't have to answer that now provided it is answered in the PEP.)

-- 
Steve

From apalala at gmail.com  Sat Jun 9 09:17:37 2018
From: apalala at gmail.com (=?UTF-8?Q?Juancarlo_A=C3=B1ez?=)
Date: Sat, 9 Jun 2018 09:17:37 -0400
Subject: [Python-ideas] A PEP on introducing variables on 'if' and 'while'
In-Reply-To: <20180609125721.GR12683@ando.pearwood.info>
References: <20180609125721.GR12683@ando.pearwood.info>
Message-ID:

> Do you mean the context manager semantics of with statements? As in,
> calling the __enter__ and __exit__ method?

No. Just the scope of the variables introduced, which is different in
`with as` and `except as`.

> Please make sure you are very familiar with PEP 572 before you do, and
> expect to have your PEP compared to it.

My intention would be to make the two proposals orthogonal, if possible,
so both/any can be accepted or rejected in their own timeline.

I'm certain that both can live together.

-- 
Juancarlo *Añez*

From desmoulinmichel at gmail.com  Sat Jun 9 09:28:11 2018
From: desmoulinmichel at gmail.com (Michel Desmoulin)
Date: Sat, 9 Jun 2018 15:28:11 +0200
Subject: [Python-ideas] Allow callables in slices
In-Reply-To: <20180609094757.GP12683@ando.pearwood.info>
References: <26992e67-1ba9-73df-34ce-81db9bd133c0@gmail.com> <20180609094757.GP12683@ando.pearwood.info>
Message-ID:

On 09/06/2018 at 11:47, Steven D'Aprano wrote:

> On Sat, Jun 09, 2018 at 11:17:05AM +0200, Michel Desmoulin wrote:
>> Such as that:
>>
>> def starting_when(element):
>>     ...
>>
>> a_list[starting_when:]
>
>> Is equivalent to:
> [...]
>> list(dropwhile(lambda x: not starting_when(x), a_list))
>
> That looks like a slice from an index to the end of the list. Things
> which are similar should look similar, but things which are different
> should NOT look similar.

The semantic is [start_condition:stop_condition:step]. Here a condition
can be an index or something more complex.

Just like you can do dictionary[key], but key can be a complex object
with a custom __hash__ executing weird computation. Just like you can
call sorted() on natural values or pass a callable as a key. Just like
you can call re.sub() with a string or a function.

> What would:
>
>     alist[callable:callable:callable]
>
> do? How about this one?
>
>     alist[7:callable:-1]

ValueError("Step cannot be used when callables are part of a slice")

However:

alist[7:callable]

Would be:

list(takewhile(lambda x: not ending_when(x), islice(alist, 7, None)))

Example: open a file, load all lines in memory, skip the first line,
then take all the lines until the first comment:

import itertools

def is_commented(line):
    return line.startswith('#')

def lines():
    with open('/etc/fstab') as f:
        lines = f.readlines()[1:]
        return list(itertools.takewhile(
            lambda line: not is_commented(line), lines))

Becomes:

def is_commented(line):
    return line.startswith('#')

def lines():
    with open('/etc/fstab') as f:
        return f.readlines()[1:is_commented]

It's not about how much shorter it is, but it is very nice to read. Of
course I'd prefer to have slicing on generators, since here we load the
entire file in memory. But I already suggested that several times on
python-ideas and the dictator killed it. For files like fstab it's ok
since they are small.
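(For the record, a lazy version of the same example is already
expressible with itertools, without loading the whole file -- a sketch,
not part of the proposal:)

from itertools import islice, takewhile

def is_commented(line):
    return line.startswith('#')

def lines():
    # Skip the first line, then take lines until the first comment,
    # reading the file lazily instead of calling readlines().
    with open('/etc/fstab') as f:
        rest = islice(f, 1, None)
        return list(takewhile(lambda line: not is_commented(line), rest))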
> > If there are not meaningful interpretations of callables as part of
> > general slice notation, then we shouldn't use slice notation as a
> > shortcut for dropwhile.

It's not syntactic support. You can already pass callables and the
interpreter accepts them fine. But the underlying types raise TypeError
because they don't know how to use them. Just like you can pass tuples,
but CPython can't use them, while numpy can.

> > def between(iterable, startcondition, endcondition):
> >     it = iter(iterable)
> >     return takewhile(lambda x: not endcondition(x),
> >         dropwhile(lambda x: not startcondition(x), it)
> >         )

Or accept callables in islice.

From desmoulinmichel at gmail.com  Sat Jun 9 09:31:08 2018
From: desmoulinmichel at gmail.com (Michel Desmoulin)
Date: Sat, 9 Jun 2018 15:31:08 +0200
Subject: [Python-ideas] Add hooks to asyncio lifecycle
In-Reply-To:
References: <8d13eb02-42a2-93a3-1387-8a16bf54c9fc@gmail.com> <18942f8f-679a-5481-6015-01ed22c08278@gmail.com>
Message-ID: <2e0b61b4-d2a2-b7cc-483c-e51ef9b4377f@gmail.com>

On 09/06/2018 at 12:33, Andrew Svetlov wrote:

> If we consistently apply the idea of hooks for internal python structure
> modification too many things should be changed. Import
> machinery, tracemalloc, profilers/tracers, name it.
> If your code (I still don't see the real-life example) wants to check a
> policy change -- just do it.
>
> # on initialization
> policy = asyncio.get_event_loop_policy()
>
> # somewhere in code
> if policy is not asyncio.get_event_loop_policy():
>     raise RuntimeError("Policy was changed")

You need to repeat this check everywhere, because nothing guarantees
that some code hasn't changed the policy after the check.

A callback lets you catch it every time. You can then either raise,
warn, monkey patch, disable features, etc.

From gjcarneiro at gmail.com  Sat Jun 9 09:46:43 2018
From: gjcarneiro at gmail.com (Gustavo Carneiro)
Date: Sat, 9 Jun 2018 14:46:43 +0100
Subject: [Python-ideas] Add hooks to asyncio lifecycle
In-Reply-To: <2e0b61b4-d2a2-b7cc-483c-e51ef9b4377f@gmail.com>
References: <8d13eb02-42a2-93a3-1387-8a16bf54c9fc@gmail.com> <18942f8f-679a-5481-6015-01ed22c08278@gmail.com> <2e0b61b4-d2a2-b7cc-483c-e51ef9b4377f@gmail.com>
Message-ID:

On Sat, 9 Jun 2018 at 14:31, Michel Desmoulin wrote:

> You need to repeat this check everywhere, because nothing guarantees
> that some code hasn't changed the policy after the check.
>
> A callback lets you catch it every time. You can then either raise,
> warn, monkey patch, disable features, etc.

IMHO, it is not any framework's job to check for this. It is a
programmer error. You don't need to babysit programmers so much. Not if
the cost is adding new APIs.

I agree with Andrew: if we set this precedent, next thing we know,
Python has to provide callbacks for any internal state changes. Why not
callbacks for when modules are imported? Etc. etc.
It leads to API noise. Doesn't seem to be justified in this case, since I would guess almost all applications only change event loop policy once, during startup, and never again. -- Gustavo J. A. M. Carneiro Gambit Research "The universe is always one step beyond logic." -- Frank Herbert -------------- next part -------------- An HTML attachment was scrubbed... URL: From mike at selik.org Sat Jun 9 10:39:45 2018 From: mike at selik.org (Michael Selik) Date: Sat, 9 Jun 2018 07:39:45 -0700 Subject: [Python-ideas] Allow callables in slices In-Reply-To: References: <26992e67-1ba9-73df-34ce-81db9bd133c0@gmail.com> <20180609094757.GP12683@ando.pearwood.info> Message-ID: On Sat, Jun 9, 2018 at 6:28 AM Michel Desmoulin wrote: > Example, open this files, load all lines in memory, skip the first line, > then get all the line until the first comment: > > import itertools > > def is_commented(line): > return lines.startwith('#') > > def lines(): > with open('/etc/fstab'): > lines = f.readlines()[1:] > return list(itertools.dropwhile(lines, is_commented) > > Becomes: > > def is_commented(line): > return lines.startwith('#') > > def lines(): > with open('/etc/fstab'): > return f.readlines()[1:is_commented] > > It's not about how much shorter is is, but it is very nice to read. > If you're going to put it in a function anyway, why something like this? def lines(): with open('/etc/fstab'): f.readline() for line in f: if line.startswith('#'): break yield line -------------- next part -------------- An HTML attachment was scrubbed... URL: From mike at selik.org Sat Jun 9 10:46:34 2018 From: mike at selik.org (Michael Selik) Date: Sat, 9 Jun 2018 07:46:34 -0700 Subject: [Python-ideas] Trigonometry in degrees In-Reply-To: References: Message-ID: On Sat, Jun 9, 2018 at 2:22 AM Adam Barto? wrote: > The idea was that the functions could handle the PiMultiple instances in a special way and fall back to float only when a special value is not detected. It would be like the proposed dsin functionality, but with a magic class instead of a new set of functions, and without a particular choice of granularity (360 degrees). > > But maybe it isn't worth it. Also what about acos(0)? Should it return PiMultiple(1, 2) and confuse people or just 1.5707963267948966 and loose exactness? > > That'd be the only module in the standard library with such a specialized class and behavior. You could argue that pathlib creates a sort of Path preference with str fallback, but the Path type has a large collection of methods. In contrast, this PiMultiple type would be only used as an input. That's very unusual style for Python. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Sat Jun 9 11:06:42 2018 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 10 Jun 2018 01:06:42 +1000 Subject: [Python-ideas] Allow callable in slices In-Reply-To: References: Message-ID: On 9 June 2018 at 19:20, Michel Desmoulin wrote: > Given 2 callables checking when a condition arises and returning True: > > def starting_when(element): > ... > > def ending_when(element: > ... > Allow: > > a_list[starting_when:] > > To be equivalent to: > > from itertools import dropwhile > > list(dropwhile(lambda x: not starting_when(x), a_list)) > Custom container implementations can already do this if they're so inclined, as slice objects don't type check their inputs: >>> class MyContainer: ... def __getitem__(self, key): ... return key ... 
>>> mc = MyContainer()
>>> mc[:bool]
slice(None, <class 'bool'>, None)
>>> mc[bool:]
slice(<class 'bool'>, None, None)
>>> mc[list:tuple:range]
slice(<class 'list'>, <class 'tuple'>, <class 'range'>)

It's only slice.indices() that needs start/stop/step to adhere to the
Optional[int] type hint.

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From robertvandeneynde at hotmail.com  Sat Jun 9 02:18:16 2018
From: robertvandeneynde at hotmail.com (Robert Vanden Eynde)
Date: Sat, 9 Jun 2018 06:18:16 +0000
Subject: [Python-ideas] Trigonometry in degrees
In-Reply-To:
References: <20180608054530.GC12683@ando.pearwood.info> <20180608065903.GF12683@ando.pearwood.info>
Message-ID:

Indeed, what we need for exact math for multiples of 90 (and 30) is
ideas from the symbolic libraries (sympy, sage).

Of course the symbolic lib can do more, like:

sage: k = var('k', domain='integer')
sage: cos(1 + 2*k*pi)
cos(1)
sage: cos(k*pi)
cos(pi*k)
sage: cos(pi/3 + 2*k*pi)
1/2

But that would concern symbolic libs only, I think.

For the naming convention, scipy using sindg (therefore neither sind
nor sindeg) makes the sind choice less obvious. However, if Matlab and
Julia choose sind, that's a good path to go; Matlab is pretty popular,
as others pointed out, with universities giving "free" licences and
stuff. In that regard, given that scipy wants to "be a replacement to
Matlab in python and open source", it's interesting that they chose
sindg and not the Matlab name sind.

As for "d" as a suffix, it could be read as "d" for "double", like in
OpenGL. Well, let's remember that in Python there's only one floating
type, it's a double, and it's called float... So python programmers
will not think "sind means it uses a python float and not a python
float32 that C99 sinf would". Python programmers would be like "sin
takes a float in radians, sind takes a float in degrees or an int,
because an int can be converted to a float when there's no overflow".

On Sat, 9 Jun 2018 at 04:09, Wes Turner wrote:

# Python, NumPy, SymPy, mpmath, sage trigonometric functions
https://en.wikipedia.org/wiki/Trigonometric_functions

## Python math module
https://docs.python.org/3/library/math.html#trigonometric-functions
- degrees(radians): Float degrees
- radians(degrees): Float radians

## NumPy
https://docs.scipy.org/doc/numpy/reference/routines.math.html#trigonometric-functions
- degrees(radians) : List[float] degrees
- rad2deg(radians): List[float] degrees
- radians(degrees) : List[float] radians
- deg2rad(degrees): List[float] radians

https://docs.scipy.org/doc/numpy/reference/generated/numpy.sin.html

## SymPy
http://docs.sympy.org/latest/modules/functions/elementary.html#sympy-functions-elementary-trigonometric
http://docs.sympy.org/latest/modules/functions/elementary.html#trionometric-functions
- sympy.mpmath.degrees(radians): Float degrees
- sympy.mpmath.radians(degrees): Float radians

- https://stackoverflow.com/questions/31072815/cosd-and-sind-with-sympy
  - cosd, sind
- https://stackoverflow.com/questions/31072815/cosd-and-sind-with-sympy#comment50176770_31072815

> Let x, theta, phi, etc. be Symbols representing quantities in radians.
Keep a list of these symbols: angles = [x, theta, phi].
Then, at the very end, use y.subs([(angle, angle*pi/180) for angle in angles]) to change the meaning of the symbols to degrees" ## mpmath http://mpmath.org/doc/current/functions/trigonometric.html - sympy.mpmath.degrees(radians): Float degrees - sympy.mpmath.radians(degrees): Float radians ## Sage https://doc.sagemath.org/html/en/reference/functions/sage/functions/trig.html On Friday, June 8, 2018, Robert Vanden Eynde > wrote: - Thanks for pointing out a language (Julia) that already had a name convention. Interestingly they don't have a atan2d function. Choosing the same convention as another language is a big plus. - Adding trig function using floats between 0 and 1 is nice, currently one needs to do sin(tau * t) which is not so bad (from math import tau, tau sounds like turn). - Julia has sinpi for sin(pi*x), one could have sintau(x) for sin(tau*x) or sinturn(x). Grads are in the idea of turns but with more problems, as you guys said, grads are used by noone, but turns are more useful. sin(tau * t) For The Win. - Even though people mentionned 1/6 not being exact, so that advantage over radians isn't that obvious ? from math import sin, tau from fractions import Fraction sin(Fraction(1,6) * tau) sindeg(Fraction(1,6) * 360) These already work today by the way. - As you guys pointed out, using radians implies knowing a little bit about floating point arithmetic and its limitations. Integer are more simple and less error prone. Of course it's useful to know about floats but in many case it's not necessary to learn about it right away, young students just want their player in the game move in a straight line when angle = 90. - sin(pi/2) == 1 but cos(pi/2) != 0 and sin(3*pi/2) != 1 so sin(pi/2) is kind of an exception. Le ven. 8 juin 2018 ? 09:11, Steven D'Aprano > a ?crit : On Fri, Jun 08, 2018 at 03:55:34PM +1000, Chris Angelico wrote: > On Fri, Jun 8, 2018 at 3:45 PM, Steven D'Aprano > wrote: > > Although personally I prefer the look of d as a prefix: > > > > dsin, dcos, dtan > > > > That's more obviously pronounced "d(egrees) sin" etc rather than "sined" > > "tanned" etc. > > Having it as a suffix does have one advantage. The math module would > need a hyperbolic sine function which accepts an argument in; and > then, like Charles Napier [1], Python would finally be able to say "I > have sindh". Ha ha, nice pun, but no, the hyperbolic trig functions never take arguments in degrees. Or radians for that matter. They are "hyperbolic angles", which some electrical engineering text books refer to as "hyperbolic radians", but all the maths text books I've seen don't call them anything other than a real number. (Or sometimes a complex number.) But for what it's worth, there is a correspondence of a sort between the hyperbolic angle and circular angles. The circular angle going between 0 to 45? corresponds to the hyperbolic angle going from 0 to infinity. https://en.wikipedia.org/wiki/Hyperbolic_angle https://en.wikipedia.org/wiki/Hyperbolic_function > [1] Apocryphally, alas. 
Don't ruin a good story with facts ;-)

-- 
Steve

From desmoulinmichel at gmail.com  Sat Jun 9 17:59:30 2018
From: desmoulinmichel at gmail.com (Michel Desmoulin)
Date: Sat, 9 Jun 2018 23:59:30 +0200
Subject: [Python-ideas] Add hooks to asyncio lifecycle
In-Reply-To:
References: <8d13eb02-42a2-93a3-1387-8a16bf54c9fc@gmail.com> <18942f8f-679a-5481-6015-01ed22c08278@gmail.com> <2e0b61b4-d2a2-b7cc-483c-e51ef9b4377f@gmail.com>
Message-ID:

> IMHO, it is not any framework's job to check for this. It is a
> programmer error.

Not clearing memory is a programmer error too, but the gc helps. Not
closing a file, either, but using `with` helps. Not passing the proper
type is a programmer error, but we have type hints. We even have
asserts, which can be disabled, to check that the arguments of a
function match a test when debugging.

> You don't need to babysit programmers so much.

Helping to debug is not babysitting. It's an important part of the user
experience, and it's especially needed in asyncio, one of the hardest
parts of the stdlib to code with.

> Not if the cost is adding new APIs.

The cost of bugs is greater than the cost of an API. The current API
gives no sane way to prevent the bug. You can't expect anyone to check
whether the policy / loop / task factory has changed, every single
time, in all the code that depends on it.

> I agree with Andrew, if we open this precedent, next thing we know,
> Python has to provide callbacks for any internal state changes.

It's not internal, it's a public API.

> Why not callbacks for when modules are imported?

Good example. We have import hooks to run code every time a module is
imported:

https://www.python.org/dev/peps/pep-0302/#specification-part-2-registering-hooks

> Etc. etc. It leads to API noise.

Every feature is either useful or noise. I argue that this one is
useful.

> Doesn't seem to be justified in this case, since I would guess
> almost all applications only change event loop policy once, during
> startup, and never again.

Yes, but when is startup? E.g., I'm currently working on a large Qt API
on an old 2.7 code base. It has threads to avoid blocking the UI, Qt
event loops of course, and the Tornado event loop providing a web API.
The orchestration of that very complex app requires careful
initialization for a startup time of about 7s, and we had to code a
wrapper and document it as the only entry point, to make sure that if a
junior breaks things the error message is very clear.

RuntimeError('The event loop should not be set when...') is way easier
to debug than an obscure down-the-road error on some data structure
that is incorrectly accessed.

What I'm proposing is to make that easy to implement by just letting
anyone put a check in there.

Overriding policies, loops or task factories is usually done for
critical parts of the system. The errors emerging from a bug in there
are very cryptic.

Asyncio's design made the choice to expose very low level things. You
literally don't have this problem in languages like JS because nobody
can change those.
Now it's here, it's a footgun, and it would be nice to provide a way to
put it in a holster.

From steve at pearwood.info  Sat Jun 9 19:50:30 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 10 Jun 2018 09:50:30 +1000
Subject: [Python-ideas] A PEP on introducing variables on 'if' and 'while'
In-Reply-To:
References: <20180609125721.GR12683@ando.pearwood.info>
Message-ID: <20180609235030.GT12683@ando.pearwood.info>

On Sat, Jun 09, 2018 at 09:17:37AM -0400, Juancarlo Añez wrote:

> > Do you mean the context manager semantics of with statements? As in,
> > calling the __enter__ and __exit__ method?
>
> No. Just the scope of the variables introduced, which is different in
> `with as` and `except as`.

They aren't. They are the same scope: both `with` and `except` bind to
a local variable. The only difference is that the `except` block
implicitly unbinds the variable when the block ends.

py> err = "something"
py> try:
...     None + 1
... except TypeError as err:
...     pass
...
py> err
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'err' is not defined

> > Please make sure you are very familiar with PEP 572 before you do, and
> > expect to have your PEP compared to it.
>
> My intention would be to make the two proposals orthogonal, if possible,
> so both/any can be accepted or rejected in their own timeline.
>
> I'm certain that both can live together.

Seems redundant...

while (condition := expression) as flag:
    ...

Accepting "while/if as name" would remove much (but not all) of the
motivation for assignment expressions, while accepting assignment
expressions would make a dedicated while/if as name syntax unnecessary.

Like it or not, I expect that they will be seen as competing PEPs, not
independent ones.

-- 
Steve

From ncoghlan at gmail.com  Sun Jun 10 01:04:33 2018
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 10 Jun 2018 15:04:33 +1000
Subject: [Python-ideas] Add hooks to asyncio lifecycle
In-Reply-To:
References: <8d13eb02-42a2-93a3-1387-8a16bf54c9fc@gmail.com> <18942f8f-679a-5481-6015-01ed22c08278@gmail.com> <2e0b61b4-d2a2-b7cc-483c-e51ef9b4377f@gmail.com>
Message-ID:

On 10 June 2018 at 07:59, Michel Desmoulin wrote:

> What I'm proposing is to make that easy to implement by just letting
> anyone put a check in there.
>
> Overriding policies, loops or task factories is usually done for
> critical parts of the system. The errors emerging from a bug in there
> are very cryptic.
>
> Asyncio's design made the choice to expose very low level things. You
> literally don't have this problem in languages like JS because nobody
> can change those.
>
> Now it's here, it's a footgun, and it would be nice to provide a way to
> put it in a holster.

With the API need framed that way, perhaps all that asyncio is currently
missing is an "asyncio.lock_policy(unlock_token, err_callback)" API such
that your application can declare that initialisation is completed and
no further event loop policy changes should be allowed?

(The "unlock_token" would be an arbitrary app-provided object that must
also be passed to the corresponding "unlock_policy" call - that way
libraries couldn't unlock the policy after the application locks it,
since they won't have a reference to the app-specific unlock token.)

Adding further callback hooks for more events seems like it will just
push the problem back another level, and you'll have the potential for
conflicts between callbacks registered with the new hooks, and an even
harder to understand overall system.
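For concreteness, here is a rough sketch of how such a lock could be
layered on top of today's API from the outside. The names lock_policy
and unlock_policy come from the proposal above; everything else,
including the monkey patching of set_event_loop_policy, is an
illustrative assumption rather than an actual asyncio API:

import asyncio

_lock = None  # (unlock_token, err_callback) while locked, else None
_original_set_policy = asyncio.set_event_loop_policy

def lock_policy(unlock_token, err_callback):
    global _lock
    _lock = (unlock_token, err_callback)

def unlock_policy(unlock_token):
    global _lock
    if _lock is not None and _lock[0] is unlock_token:
        _lock = None

def _guarded_set_policy(policy):
    # While locked, divert policy changes to the app's error callback.
    if _lock is not None:
        _lock[1](policy)
    else:
        _original_set_policy(policy)

asyncio.set_event_loop_policy = _guarded_set_policy

# Application startup:
def _deny(policy):
    raise RuntimeError("event loop policy is locked")

_TOKEN = object()
lock_policy(_TOKEN, _deny)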
By contrast, the above would be amenable to doing something like: 1. Per-process setup code establishes a particular event loop policy, and then locks it 2. Until the policy gets unlocked again, attempts to change it will call the err_callback (so the app can raise a custom access denied exception) 3. get_event_loop(), set_event_loop(), and new_event_loop() are all already managed by the event loop policy, so shouldn't need new hooks 4. stop(), close(), set_debug(), set_task_factory(), etc are all already managed by the event loop (and hence by the event loop policy), so shouldn't need new hooks Right now, the weak link in that chain is that there's no way for the application to keep a library from switching out the event policy with a new one, and subsequently bypassing all of the app's control over how it expects event loops to be managed. Given a way to close that loophole, the application should already have the ability to enforce everything else that it wants to enforce via the existing event loop and event loop policy APIs. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrew.svetlov at gmail.com Sun Jun 10 03:02:25 2018 From: andrew.svetlov at gmail.com (Andrew Svetlov) Date: Sun, 10 Jun 2018 10:02:25 +0300 Subject: [Python-ideas] Add hooks to asyncio lifecycle In-Reply-To: References: <8d13eb02-42a2-93a3-1387-8a16bf54c9fc@gmail.com> <18942f8f-679a-5481-6015-01ed22c08278@gmail.com> <2e0b61b4-d2a2-b7cc-483c-e51ef9b4377f@gmail.com> Message-ID: Policy locking is a viable idea at first glance. On Sun, Jun 10, 2018 at 8:05 AM Nick Coghlan wrote: > On 10 June 2018 at 07:59, Michel Desmoulin > wrote: > >> What I'm proposing is to make that easy to implement by just letting >> anyone put a check in there. >> >> Overriding policy, loops or tasks factories are usually down for >> critical parts of the system. The errors emerging from a bug in there >> are very cryptic. >> >> Asyncio design made the choice to expose very low level things. You >> literally don't have this problem in languages like JS because nobody >> can change those. >> >> Now it's here, it's a footgun, and it would be nice to provide a way to >> put it in a holster. >> > > With the API need framed that way, perhaps all that asyncio is currently > missing is an "asyncio.lock_policy(unlock_token, err_callback)" API such > that your application can declare that initialisation is completed and no > further event loop policy changes should be allowed? > > (The "unlock_token" would be an arbitrary app-provided object that must > also be passed to the corresponding "unlock_policy" call - that way > libraries couldn't unlock the policy after the application locks it, since > they won't have a reference to the app-specific unlock token). > > Adding further callback hooks for more events seems like it will just push > the problem back another level, and you'll have the potential for conflicts > between callbacks registered with the new hooks, and an even harder to > understand overall system. > > By contrast, the above would be amenable to doing something like: > > 1. Per-process setup code establishes a particular event loop policy, > and then locks it > 2. Until the policy gets unlocked again, attempts to change it will > call the err_callback (so the app can raise a custom access denied > exception) > 3. 
get_event_loop(), set_event_loop(), and new_event_loop() are all > already managed by the event loop policy, so shouldn't need new hooks > 4. stop(), close(), set_debug(), set_task_factory(), etc are all > already managed by the event loop (and hence by the event loop policy), so > shouldn't need new hooks > > Right now, the weak link in that chain is that there's no way for the > application to keep a library from switching out the event policy with a > new one, and subsequently bypassing all of the app's control over how it > expects event loops to be managed. Given a way to close that loophole, the > application should already have the ability to enforce everything else that > it wants to enforce via the existing event loop and event loop policy APIs. > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- Thanks, Andrew Svetlov -------------- next part -------------- An HTML attachment was scrubbed... URL: From mistersheik at gmail.com Sun Jun 10 06:50:45 2018 From: mistersheik at gmail.com (Neil Girdhar) Date: Sun, 10 Jun 2018 03:50:45 -0700 (PDT) Subject: [Python-ideas] Trigonometry in degrees In-Reply-To: References: <20180608054530.GC12683@ando.pearwood.info> <20180608065903.GF12683@ando.pearwood.info> Message-ID: <2c8a547b-d224-49e2-9f94-c6ff59f6151c@googlegroups.com> I think this suggestion should result in a library on PyPi, which can then be considered for the standard library if it sees a lot of use. Also, modern OpenGL does this just like Python does: all of the trigonometric functions take radians and a "radians" function is provided. Best, Neil On Saturday, June 9, 2018 at 2:55:12 PM UTC-4, Robert Vanden Eynde wrote: > > Indeed what we need for exact math for multiple of 90 (and 30) is ideas > from the symbolic libraries (sympy, sage). > > Of course the symbolic lib can do more like : > > sage: k = var('k', domain='integer') > sage: cos(1 + 2*k*pi) > cos(1) > sage: cos(k*pi) > cos(pi*k) > sage: cos(pi/3 + 2*k*pi) > 1/2 > > But that would concern symbolic lib only I think. > > For the naming convention, scipy using sindg (therefore Nor sind nor > sindeg) will make the sind choice less obvious. However if Matlab and Julia > chooses sind that's a good path to go, Matlab is pretty popular, as other > pointed out, with Universities giving "free" licences and stuff. With that > regards, scipy wanting to "be a replacement to Matlab in python and open > source" it's interesting they chose sindg and not the Matlab name sind. > > For the "d" as suffix that would mean "d" as "double" like in opengl. > Well, let's remember that in Python there's only One floating type, that's > a double, and it's called float... So python programmers will not think > "sind means it uses a python float and not a python float32 that C99 sinf > would". Python programmers would be like "sin takes float in radians, sind > takes float in degrees or int, because int can be converted to float when > there's no overflow". > > Le sam. 9 juin 2018 ? 
04:09, Wes Turner > > a ?crit : > >> # Python, NumPy, SymPy, mpmath, sage trigonometric functions >> https://en.wikipedia.org/wiki/Trigonometric_functions >> >> ## Python math module >> https://docs.python.org/3/library/math.html#trigonometric-functions >> - degrees(radians): Float degrees >> - radians(degrees): Float degrees >> >> ## NumPy >> >> https://docs.scipy.org/doc/numpy/reference/routines.math.html#trigonometric-functions >> - degrees(radians) : List[float] degrees >> - rad2deg(radians): List[float] degrees >> - radians(degrees) : List[float] radians >> - deg2rad(degrees): List[float] radians >> >> https://docs.scipy.org/doc/numpy/reference/generated/numpy.sin.html >> >> >> ## SymPy >> >> http://docs.sympy.org/latest/modules/functions/elementary.html#sympy-functions-elementary-trigonometric >> >> http://docs.sympy.org/latest/modules/functions/elementary.html#trionometric-functions >> >> - sympy.mpmath.degrees(radians): Float degrees >> - sympy.mpmath.radians(degrees): Float radians >> >> - https://stackoverflow.com/questions/31072815/cosd-and-sind-with-sympy >> - cosd, sind >> - >> https://stackoverflow.com/questions/31072815/cosd-and-sind-with-sympy#comment50176770_31072815 >> >> > Let x, theta, phi, etc. be Symbols representing quantities in >> radians. Keep a list of these symbols: angles = [x, theta, phi]. Then, at >> the very end, use y.subs([(angle, angle*pi/180) for angle in angles]) to >> change the meaning of the symbols to degrees" >> >> >> ## mpmath >> http://mpmath.org/doc/current/functions/trigonometric.html >> - sympy.mpmath.degrees(radians): Float degrees >> - sympy.mpmath.radians(degrees): Float radians >> >> >> ## Sage >> >> https://doc.sagemath.org/html/en/reference/functions/sage/functions/trig.html >> >> >> >> On Friday, June 8, 2018, Robert Vanden Eynde > > wrote: >> >>> - Thanks for pointing out a language (Julia) that already had a name >>> convention. Interestingly they don't have a atan2d function. Choosing the >>> same convention as another language is a big plus. >>> >>> - Adding trig function using floats between 0 and 1 is nice, currently >>> one needs to do sin(tau * t) which is not so bad (from math import tau, tau >>> sounds like turn). >>> >>> - Julia has sinpi for sin(pi*x), one could have sintau(x) for sin(tau*x) >>> or sinturn(x). >>> >>> Grads are in the idea of turns but with more problems, as you guys said, >>> grads are used by noone, but turns are more useful. sin(tau * t) For The >>> Win. >>> >>> - Even though people mentionned 1/6 not being exact, so that advantage >>> over radians isn't that obvious ? >>> >>> from math import sin, tau >>> from fractions import Fraction >>> sin(Fraction(1,6) * tau) >>> sindeg(Fraction(1,6) * 360) >>> >>> These already work today by the way. >>> >>> - As you guys pointed out, using radians implies knowing a little bit >>> about floating point arithmetic and its limitations. Integer are more >>> simple and less error prone. Of course it's useful to know about floats but >>> in many case it's not necessary to learn about it right away, young >>> students just want their player in the game move in a straight line when >>> angle = 90. >>> >>> - sin(pi/2) == 1 but cos(pi/2) != 0 and sin(3*pi/2) != 1 so sin(pi/2) is >>> kind of an exception. >>> >>> >>> >>> >>> Le ven. 8 juin 2018 ? 
09:11, Steven D'Aprano >> > a ?crit : >>> >>>> On Fri, Jun 08, 2018 at 03:55:34PM +1000, Chris Angelico wrote: >>>> > On Fri, Jun 8, 2018 at 3:45 PM, Steven D'Aprano >>> > wrote: >>>> > > Although personally I prefer the look of d as a prefix: >>>> > > >>>> > > dsin, dcos, dtan >>>> > > >>>> > > That's more obviously pronounced "d(egrees) sin" etc rather than >>>> "sined" >>>> > > "tanned" etc. >>>> > >>>> > Having it as a suffix does have one advantage. The math module would >>>> > need a hyperbolic sine function which accepts an argument in; and >>>> > then, like Charles Napier [1], Python would finally be able to say "I >>>> > have sindh". >>>> >>>> Ha ha, nice pun, but no, the hyperbolic trig functions never take >>>> arguments in degrees. Or radians for that matter. They are "hyperbolic >>>> angles", which some electrical engineering text books refer to as >>>> "hyperbolic radians", but all the maths text books I've seen don't call >>>> them anything other than a real number. (Or sometimes a complex number.) >>>> >>>> But for what it's worth, there is a correspondence of a sort between >>>> the >>>> hyperbolic angle and circular angles. The circular angle going between >>>> 0 >>>> to 45? corresponds to the hyperbolic angle going from 0 to infinity. >>>> >>>> https://en.wikipedia.org/wiki/Hyperbolic_angle >>>> >>>> https://en.wikipedia.org/wiki/Hyperbolic_function >>>> >>>> >>>> > [1] Apocryphally, alas. >>>> >>>> Don't ruin a good story with facts ;-) >>>> >>>> >>>> >>>> -- >>>> Steve >>>> _______________________________________________ >>>> Python-ideas mailing list >>>> Python... at python.org >>>> https://mail.python.org/mailman/listinfo/python-ideas >>>> Code of Conduct: http://python.org/psf/codeofconduct/ >>>> >>> _______________________________________________ >> Python-ideas mailing list >> Python... at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From apalala at gmail.com Sun Jun 10 08:08:13 2018 From: apalala at gmail.com (=?UTF-8?Q?Juancarlo_A=C3=B1ez?=) Date: Sun, 10 Jun 2018 08:08:13 -0400 Subject: [Python-ideas] A PEP on introducing variables on 'if' and 'while' In-Reply-To: <20180609235030.GT12683@ando.pearwood.info> References: <20180609125721.GR12683@ando.pearwood.info> <20180609235030.GT12683@ando.pearwood.info> Message-ID: > The only difference is that the `except` block implicitly unbinds the > variable when the block ends. > Mmmm. Good to know, because that means that the semantics are the same, except... > > while (condition := expression) as flag: > ... > Ah! Are the parenthesis necessary there? Accepting "while/if as name" would remove much (but not all) of the > motivation for assignment expressions, while accepting assignment > expressions would make a dedicated while/if as name syntax unnecessary. > Most of the arguments in favor of ':=' have been through examples of generators in which the introduced name is used within, with the if/while case often forgotten. There can be cases in which combining both syntaxes is useful: x = None while compute(x := v for v in next_series_value(x)) as comp: ... x = comp > Like it or not, I expect that they will be seen as competing PEPs, not > independent ones. Finding a real-world example of something like the above synthetic example would be in favor of the orthogonality. 
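(For comparison, the bare assignment-expression spelling from PEP 572 --
as proposed, not yet accepted -- already covers the common while case
without any extra keyword. A sketch, where process() is just a stand-in
for real work:)

def process(chunk):
    print(len(chunk))   # stand-in for real work

with open('/etc/fstab', 'rb') as f:
    # the loop header both assigns and tests in one place
    while chunk := f.read(8192):
        process(chunk)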
-- Juancarlo *A?ez* -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Sun Jun 10 08:55:06 2018 From: rosuav at gmail.com (Chris Angelico) Date: Sun, 10 Jun 2018 22:55:06 +1000 Subject: [Python-ideas] A PEP on introducing variables on 'if' and 'while' In-Reply-To: References: <20180609125721.GR12683@ando.pearwood.info> <20180609235030.GT12683@ando.pearwood.info> Message-ID: On Sun, Jun 10, 2018 at 10:08 PM, Juancarlo A?ez wrote: > > There can be cases in which combining both syntaxes is useful: > > x = None > while compute(x := v for v in next_series_value(x)) as comp: > ... > x = comp Why not just: while comp := compute(x := v for v in next_series_value(x)): Why have two syntaxes, one of which is a shackled version of the other? If we have :=, there's no point having as. ChrisA From steve at pearwood.info Sun Jun 10 09:19:58 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 10 Jun 2018 23:19:58 +1000 Subject: [Python-ideas] A PEP on introducing variables on 'if' and 'while' In-Reply-To: References: <20180609125721.GR12683@ando.pearwood.info> <20180609235030.GT12683@ando.pearwood.info> Message-ID: <20180610131958.GV12683@ando.pearwood.info> On Sun, Jun 10, 2018 at 08:08:13AM -0400, Juancarlo A?ez wrote: > > while (condition := expression) as flag: > > ... > > > > Ah! Are the parenthesis necessary there? It's your PEP, you tell us. > > Accepting "while/if as name" would remove much (but not all) of the > > motivation for assignment expressions, while accepting assignment > > expressions would make a dedicated while/if as name syntax unnecessary. > > Most of the arguments in favor of ':=' have been through examples of > generators in which the introduced name is used within, with the if/while > case often forgotten. Indeed. I think that the generator/comprehension use-cases for assignment expressions are not the most compelling. They're important, but at the point you need to use assignment inside a comprehension, there's a good argument that it *may* be time to refactor to a loop. I think the while/if examples are even more important. > There can be cases in which combining both syntaxes is useful: > > x = None > while compute(x := v for v in next_series_value(x)) as comp: > ... > x = comp But if you have assignment expressions, you don't need "as". x = None while comp := compute(x := v for v in next_series_value(x)): ... x = comp > Finding a real-world example of something like the above synthetic example > would be in favor of the orthogonality. As the PEP author, that's your job. -- Steve From stephanh42 at gmail.com Sun Jun 10 10:44:03 2018 From: stephanh42 at gmail.com (Stephan Houben) Date: Sun, 10 Jun 2018 16:44:03 +0200 Subject: [Python-ideas] Trigonometry in degrees In-Reply-To: References: <20180608054530.GC12683@ando.pearwood.info> <20180608065903.GF12683@ando.pearwood.info> Message-ID: 2018-06-09 8:18 GMT+02:00 Robert Vanden Eynde : > For the naming convention, scipy using sindg (therefore Nor sind nor > sindeg) will make the sind choice less obvious. However if Matlab and Julia > chooses sind that's a good path to go, Matlab is pretty popular, as other > pointed out, with Universities giving "free" licences and stuff. With that > regards, scipy wanting to "be a replacement to Matlab in python and open > source" it's interesting they chose sindg and not the Matlab name sind. 
> I would suggest that compatibility with a major Python library such as SciPy is more important than compatibility with other programming languages. I would go even further and argue that scipy.special.sindg and its friends cosdg and tandg can serve as the reference implementation for this proposal. Stephan > > For the "d" as suffix that would mean "d" as "double" like in opengl. > Well, let's remember that in Python there's only One floating type, that's > a double, and it's called float... So python programmers will not think > "sind means it uses a python float and not a python float32 that C99 sinf > would". Python programmers would be like "sin takes float in radians, sind > takes float in degrees or int, because int can be converted to float when > there's no overflow". > > Le sam. 9 juin 2018 ? 04:09, Wes Turner a ?crit : > >> # Python, NumPy, SymPy, mpmath, sage trigonometric functions >> https://en.wikipedia.org/wiki/Trigonometric_functions >> >> ## Python math module >> https://docs.python.org/3/library/math.html#trigonometric-functions >> - degrees(radians): Float degrees >> - radians(degrees): Float degrees >> >> ## NumPy >> https://docs.scipy.org/doc/numpy/reference/routines.math.htm >> l#trigonometric-functions >> - degrees(radians) : List[float] degrees >> - rad2deg(radians): List[float] degrees >> - radians(degrees) : List[float] radians >> - deg2rad(degrees): List[float] radians >> >> https://docs.scipy.org/doc/numpy/reference/generated/numpy.sin.html >> >> >> ## SymPy >> http://docs.sympy.org/latest/modules/functions/elementary.ht >> ml#sympy-functions-elementary-trigonometric >> http://docs.sympy.org/latest/modules/functions/elementary.ht >> ml#trionometric-functions >> >> - sympy.mpmath.degrees(radians): Float degrees >> - sympy.mpmath.radians(degrees): Float radians >> >> - https://stackoverflow.com/questions/31072815/cosd-and-sind-with-sympy >> - cosd, sind >> - https://stackoverflow.com/questions/31072815/cosd-and-sind >> -with-sympy#comment50176770_31072815 >> >> > Let x, theta, phi, etc. be Symbols representing quantities in >> radians. Keep a list of these symbols: angles = [x, theta, phi]. Then, at >> the very end, use y.subs([(angle, angle*pi/180) for angle in angles]) to >> change the meaning of the symbols to degrees" >> >> >> ## mpmath >> http://mpmath.org/doc/current/functions/trigonometric.html >> - sympy.mpmath.degrees(radians): Float degrees >> - sympy.mpmath.radians(degrees): Float radians >> >> >> ## Sage >> https://doc.sagemath.org/html/en/reference/functions/sage/fu >> nctions/trig.html >> >> >> >> On Friday, June 8, 2018, Robert Vanden Eynde < >> robertvandeneynde at hotmail.com> wrote: >> >>> - Thanks for pointing out a language (Julia) that already had a name >>> convention. Interestingly they don't have a atan2d function. Choosing the >>> same convention as another language is a big plus. >>> >>> - Adding trig function using floats between 0 and 1 is nice, currently >>> one needs to do sin(tau * t) which is not so bad (from math import tau, tau >>> sounds like turn). >>> >>> - Julia has sinpi for sin(pi*x), one could have sintau(x) for sin(tau*x) >>> or sinturn(x). >>> >>> Grads are in the idea of turns but with more problems, as you guys said, >>> grads are used by noone, but turns are more useful. sin(tau * t) For The >>> Win. >>> >>> - Even though people mentionned 1/6 not being exact, so that advantage >>> over radians isn't that obvious ? 
>>> >>> from math import sin, tau >>> from fractions import Fraction >>> sin(Fraction(1,6) * tau) >>> sindeg(Fraction(1,6) * 360) >>> >>> These already work today by the way. >>> >>> - As you guys pointed out, using radians implies knowing a little bit >>> about floating point arithmetic and its limitations. Integer are more >>> simple and less error prone. Of course it's useful to know about floats but >>> in many case it's not necessary to learn about it right away, young >>> students just want their player in the game move in a straight line when >>> angle = 90. >>> >>> - sin(pi/2) == 1 but cos(pi/2) != 0 and sin(3*pi/2) != 1 so sin(pi/2) is >>> kind of an exception. >>> >>> >>> >>> >>> Le ven. 8 juin 2018 ? 09:11, Steven D'Aprano a >>> ?crit : >>> >>>> On Fri, Jun 08, 2018 at 03:55:34PM +1000, Chris Angelico wrote: >>>> > On Fri, Jun 8, 2018 at 3:45 PM, Steven D'Aprano >>>> wrote: >>>> > > Although personally I prefer the look of d as a prefix: >>>> > > >>>> > > dsin, dcos, dtan >>>> > > >>>> > > That's more obviously pronounced "d(egrees) sin" etc rather than >>>> "sined" >>>> > > "tanned" etc. >>>> > >>>> > Having it as a suffix does have one advantage. The math module would >>>> > need a hyperbolic sine function which accepts an argument in; and >>>> > then, like Charles Napier [1], Python would finally be able to say "I >>>> > have sindh". >>>> >>>> Ha ha, nice pun, but no, the hyperbolic trig functions never take >>>> arguments in degrees. Or radians for that matter. They are "hyperbolic >>>> angles", which some electrical engineering text books refer to as >>>> "hyperbolic radians", but all the maths text books I've seen don't call >>>> them anything other than a real number. (Or sometimes a complex number.) >>>> >>>> But for what it's worth, there is a correspondence of a sort between >>>> the >>>> hyperbolic angle and circular angles. The circular angle going between >>>> 0 >>>> to 45? corresponds to the hyperbolic angle going from 0 to infinity. >>>> >>>> https://en.wikipedia.org/wiki/Hyperbolic_angle >>>> >>>> https://en.wikipedia.org/wiki/Hyperbolic_function >>>> >>>> >>>> > [1] Apocryphally, alas. >>>> >>>> Don't ruin a good story with facts ;-) >>>> >>>> >>>> >>>> -- >>>> Steve >>>> _______________________________________________ >>>> Python-ideas mailing list >>>> Python-ideas at python.org >>>> https://mail.python.org/mailman/listinfo/python-ideas >>>> Code of Conduct: http://python.org/psf/codeofconduct/ >>>> >>> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From apalala at gmail.com Sun Jun 10 14:35:54 2018 From: apalala at gmail.com (=?UTF-8?Q?Juancarlo_A=C3=B1ez?=) Date: Sun, 10 Jun 2018 14:35:54 -0400 Subject: [Python-ideas] A PEP on introducing variables on 'if' and 'while' In-Reply-To: <20180610131958.GV12683@ando.pearwood.info> References: <20180609125721.GR12683@ando.pearwood.info> <20180609235030.GT12683@ando.pearwood.info> <20180610131958.GV12683@ando.pearwood.info> Message-ID: > As the PEP author, that's your job. 
> 

I started writing the PEP, and I found an interesting example:

    if not (m := re.match(r'^(\d+)-(\d+)$', identifier)):
        raise ValueError(f'{identifier} is not a valid identifier')
    print(f'first part is {m.group(1)}')
    print(f'second part is {m.group(2)}')

That's fairly easy to understand, and not something that can be resolved with `as` if it's part of the `if` and `while` statement, rather than a different syntax for the `:=` semantics. That one would have to be written as it is done now:

    m = re.match(r'^(\d+)-(\d+)$', identifier)
    if not m:
        raise ValueError(f'{identifier} is not a valid identifier')
    print(f'first part is {m.group(1)}')
    print(f'second part is {m.group(2)}')

-- Juancarlo *Añez* -------------- next part -------------- An HTML attachment was scrubbed... URL: 

From robertvandeneynde at hotmail.com Sun Jun 10 14:26:29 2018 From: robertvandeneynde at hotmail.com (Robert Vanden Eynde) Date: Sun, 10 Jun 2018 18:26:29 +0000 Subject: [Python-ideas] Fwd: Trigonometry in degrees In-Reply-To: References: <20180608054530.GC12683@ando.pearwood.info> <20180608065903.GF12683@ando.pearwood.info> Message-ID: 

I agree that a big python library is closer to the standard python lib than matlab. However, helping transition from matlab is a great concern in the python scientific community, because matlab is used in a lot of engineering classes at University. That's a tough call hmmm. I'll look at the implementation of scipy.special.sindg and friends to see if/how they have optimisations for exact values. 

On Sun, Jun 10, 2018 at 16:44, Stephan Houben > wrote: 2018-06-09 8:18 GMT+02:00 Robert Vanden Eynde >: For the naming convention, scipy using sindg (therefore neither sind nor sindeg) will make the sind choice less obvious. However if Matlab and Julia choose sind, that's a good path to go; Matlab is pretty popular, as others pointed out, with Universities giving "free" licences and stuff. In that regard, scipy wanting to "be a replacement to Matlab in python and open source", it's interesting they chose sindg and not the Matlab name sind. I would suggest that compatibility with a major Python library such as SciPy is more important than compatibility with other programming languages. I would go even further and argue that scipy.special.sindg and its friends cosdg and tandg can serve as the reference implementation for this proposal. Stephan For the "d" as suffix that would mean "d" as "double" like in opengl. Well, let's remember that in Python there's only One floating type, that's a double, and it's called float... So python programmers will not think "sind means it uses a python float and not a python float32 that C99 sinf would". Python programmers would be like "sin takes float in radians, sind takes float in degrees or int, because int can be converted to float when there's no overflow". On Sat, Jun 9, 2018 at
04:09, Wes Turner > wrote: 

# Python, NumPy, SymPy, mpmath, sage trigonometric functions
https://en.wikipedia.org/wiki/Trigonometric_functions

## Python math module
https://docs.python.org/3/library/math.html#trigonometric-functions
- degrees(radians): Float degrees
- radians(degrees): Float radians

## NumPy
https://docs.scipy.org/doc/numpy/reference/routines.math.html#trigonometric-functions
- degrees(radians) : List[float] degrees
- rad2deg(radians): List[float] degrees
- radians(degrees) : List[float] radians
- deg2rad(degrees): List[float] radians

https://docs.scipy.org/doc/numpy/reference/generated/numpy.sin.html

## SymPy
http://docs.sympy.org/latest/modules/functions/elementary.html#sympy-functions-elementary-trigonometric
http://docs.sympy.org/latest/modules/functions/elementary.html#trionometric-functions

- sympy.mpmath.degrees(radians): Float degrees
- sympy.mpmath.radians(degrees): Float radians

- https://stackoverflow.com/questions/31072815/cosd-and-sind-with-sympy
- cosd, sind
- https://stackoverflow.com/questions/31072815/cosd-and-sind-with-sympy#comment50176770_31072815

> Let x, theta, phi, etc. be Symbols representing quantities in radians. Keep a list of these symbols: angles = [x, theta, phi]. Then, at the very end, use y.subs([(angle, angle*pi/180) for angle in angles]) to change the meaning of the symbols to degrees"

## mpmath
http://mpmath.org/doc/current/functions/trigonometric.html
- sympy.mpmath.degrees(radians): Float degrees
- sympy.mpmath.radians(degrees): Float radians

## Sage
https://doc.sagemath.org/html/en/reference/functions/sage/functions/trig.html

On Friday, June 8, 2018, Robert Vanden Eynde > wrote: 

- Thanks for pointing out a language (Julia) that already had a name convention. Interestingly they don't have an atan2d function. Choosing the same convention as another language is a big plus. 

- Adding trig functions using floats between 0 and 1 is nice, currently one needs to do sin(tau * t) which is not so bad (from math import tau, tau sounds like turn). 

- Julia has sinpi for sin(pi*x), one could have sintau(x) for sin(tau*x) or sinturn(x). 

Grads are in the idea of turns but with more problems, as you guys said, grads are used by no one, but turns are more useful. sin(tau * t) For The Win. 

- Even though people mentioned 1/6 not being exact, so that advantage over radians isn't that obvious? 

from math import sin, tau
from fractions import Fraction
sin(Fraction(1,6) * tau)
sindeg(Fraction(1,6) * 360)

These already work today by the way. 

- As you guys pointed out, using radians implies knowing a little bit about floating point arithmetic and its limitations. Integers are more simple and less error prone. Of course it's useful to know about floats but in many cases it's not necessary to learn about it right away; young students just want their player in the game to move in a straight line when angle = 90. 

- sin(pi/2) == 1 but cos(pi/2) != 0 and sin(3*pi/2) != 1 so sin(pi/2) is kind of an exception. 

On Fri, Jun 8, 2018 at 09:11, Steven D'Aprano > wrote: On Fri, Jun 08, 2018 at 03:55:34PM +1000, Chris Angelico wrote: > On Fri, Jun 8, 2018 at 3:45 PM, Steven D'Aprano > wrote: > > Although personally I prefer the look of d as a prefix: > > > > dsin, dcos, dtan > > > > That's more obviously pronounced "d(egrees) sin" etc rather than "sined" > > "tanned" etc. > > Having it as a suffix does have one advantage.
The math module would > need a hyperbolic sine function which accepts an argument in degrees; and > then, like Charles Napier [1], Python would finally be able to say "I > have sindh". Ha ha, nice pun, but no, the hyperbolic trig functions never take arguments in degrees. Or radians for that matter. They are "hyperbolic angles", which some electrical engineering text books refer to as "hyperbolic radians", but all the maths text books I've seen don't call them anything other than a real number. (Or sometimes a complex number.) But for what it's worth, there is a correspondence of a sort between the hyperbolic angle and circular angles. The circular angle going between 0 to 45° corresponds to the hyperbolic angle going from 0 to infinity. 

https://en.wikipedia.org/wiki/Hyperbolic_angle 

https://en.wikipedia.org/wiki/Hyperbolic_function 

> [1] Apocryphally, alas. 

Don't ruin a good story with facts ;-) 

-- Steve _______________________________________________ Python-ideas mailing list Python-ideas at python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/ _______________________________________________ Python-ideas mailing list Python-ideas at python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/ _______________________________________________ Python-ideas mailing list Python-ideas at python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/ _______________________________________________ Python-ideas mailing list Python-ideas at python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: 

From tjreedy at udel.edu Sun Jun 10 17:37:03 2018 From: tjreedy at udel.edu (Terry Reedy) Date: Sun, 10 Jun 2018 17:37:03 -0400 Subject: [Python-ideas] Trigonometry in degrees In-Reply-To: References: <20180608054530.GC12683@ando.pearwood.info> <20180608065903.GF12683@ando.pearwood.info> Message-ID: 

On 6/10/2018 10:44 AM, Stephan Houben wrote: > I would suggest that compatibility with a major Python library such as > SciPy is more important than compatibility > with other programming languages. > > I would go even further and argue that scipy.special.sindg and its > friends cosdg and tandg > can serve as the reference implementation for this proposal. 

Or we could decide that we don't need to duplicate scipy. 

-- Terry Jan Reedy 

From mistersheik at gmail.com Sun Jun 10 19:53:50 2018 From: mistersheik at gmail.com (Neil Girdhar) Date: Sun, 10 Jun 2018 19:53:50 -0400 Subject: [Python-ideas] Trigonometry in degrees In-Reply-To: References: <20180608054530.GC12683@ando.pearwood.info> <20180608065903.GF12683@ando.pearwood.info> Message-ID: 

On Sun, Jun 10, 2018 at 5:41 PM Terry Reedy wrote: > On 6/10/2018 10:44 AM, Stephan Houben wrote: > > > I would suggest that compatibility with a major Python library such as > > SciPy is more important than compatibility > > with other programming languages. > > > > I would go even further and argue that scipy.special.sindg and its > > friends cosdg and tandg > > can serve as the reference implementation for this proposal. > > Or we could decide that we don't need to duplicate scipy. > > Copying scipy's and numpy's interface makes vectorizing code a lot simpler.
tensorflow also initially made the mistake of varying from numpy's interface only to then deprecate the variations and adopt the standard. > > -- > Terry Jan Reedy > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > -- > > --- > You received this message because you are subscribed to a topic in the > Google Groups "python-ideas" group. > To unsubscribe from this topic, visit > https://groups.google.com/d/topic/python-ideas/-NauPA0ZckE/unsubscribe. > To unsubscribe from this group and all its topics, send an email to > python-ideas+unsubscribe at googlegroups.com. > For more options, visit https://groups.google.com/d/optout. > -------------- next part -------------- An HTML attachment was scrubbed... URL: 

From tjreedy at udel.edu Mon Jun 11 00:10:14 2018 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 11 Jun 2018 00:10:14 -0400 Subject: [Python-ideas] Trigonometry in degrees In-Reply-To: References: <20180608054530.GC12683@ando.pearwood.info> <20180608065903.GF12683@ando.pearwood.info> Message-ID: 

On 6/10/2018 7:53 PM, Neil Girdhar wrote: > > > On Sun, Jun 10, 2018 at 5:41 PM Terry Reedy > > wrote: > > On 6/10/2018 10:44 AM, Stephan Houben wrote: > > > I would suggest that compatibility with a major Python library > such as > > SciPy is more important than compatibility > > with other programming languages. > > > > I would go even further and argue that scipy.special.sindg and its > > friends cosdg and tandg > > can serve as the reference implementation for this proposal. > > Or we could decide that we don't need to duplicate scipy. > > > Copying scipy's and numpy's interface makes vectorizing code a lot > simpler. tensorflow also initially made the mistake of varying from > numpy's interface only to then deprecate the variations and adopt the > standard. 

What I meant is to not add functions that already exist in scipy. Core devs do not need the burden of keeping up with whatever improvements are made to the scipy functions. 

-- Terry Jan Reedy 

From chris.barker at noaa.gov Mon Jun 11 01:01:09 2018 From: chris.barker at noaa.gov (Chris Barker) Date: Sun, 10 Jun 2018 22:01:09 -0700 Subject: [Python-ideas] Fwd: Trigonometry in degrees In-Reply-To: References: <20180608054530.GC12683@ando.pearwood.info> <20180608065903.GF12683@ando.pearwood.info> Message-ID: 

On Sun, Jun 10, 2018 at 11:26 AM, Robert Vanden Eynde < robertvandeneynde at hotmail.com> wrote: > I agree that a big python library is closer to the standard python lib > than matlab. However, helping transition from matlab is a great concern in > the python scientific community, because matlab is used in a lot of > engineering classes at University. > > That's a tough call hmmm. > 

not really -- if you are moving from matlab to python, you are going to be using numpy and scipy -- we really don't need to spell similar functionality differently than scipy does. In regard to the "special values", and exact results -- a good math lib should return results that are "exact" in all but maybe the last digit stored. So you could check inputs and outputs with, e.g. math.isclose() to give people the "exact" results. -- and keep it all in floating point. 

-CHB -- Christopher Barker, Ph.D.
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: 

From steve at pearwood.info Mon Jun 11 01:48:31 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 11 Jun 2018 15:48:31 +1000 Subject: [Python-ideas] Fwd: Trigonometry in degrees In-Reply-To: References: <20180608054530.GC12683@ando.pearwood.info> <20180608065903.GF12683@ando.pearwood.info> Message-ID: <20180611054831.GW12683@ando.pearwood.info> 

On Sun, Jun 10, 2018 at 10:01:09PM -0700, Chris Barker via Python-ideas wrote: > In regard to the "special values", and exact results -- a good math lib > should return results that are "exact" in all but maybe the last digit > stored. So you could check inputs and outputs with, e.g. math.isclose() to > give people the "exact" results. -- and keep it all in floating point. 

I wish Uncle Timmy or Mark Dickinson were around to give a definite answer, but in their absence I'll have a go. I'm reasonably sure that's wrong. The problem with trig functions is that they suffer from "the table maker's dilemma", so it is very hard to guarantee a correctly rounded result without going to ludicrous extremes: 

http://perso.ens-lyon.fr/jean-michel.muller/Intro-to-TMD.htm 

So I think that there's no guarantee given for transcendental functions like sine, cosine etc. But even if they were, using isclose() is the wrong solution. Suppose sin(x) returns some number y, such that isclose(y, 0.0) say. You have no way of knowing that y is an inaccurate result that ought to be zero, or whether the answer should be non-zero and y is correct. You cannot assume that "y is close to zero, therefore it ought to be zero". It's not just zero, the same applies for any value. That's just moving rounding errors from one input to a slightly different input. 

# current situation
sine of x returns y, but the mathematical exact result is exactly z

# suggested "fix"
sine of x ± a tiny bit returns exactly z, but ought to return y

Guessing what sin or cos "ought to" return based on either the inexact input or inexact output is not a good approach. Remember, because π is irrational, we cannot actually call sin or cos on any rational multiple of π. We can only operate on multiples of pi, which is *close to* but not the same as π. That's why it is okay that tan(pi/2) returns a huge number instead of infinity or NAN. That's because the input is ever so slightly smaller than π/2. That's exactly the behaviour you want when x is ever so slightly smaller than π/2. 

-- Steve 

From j.van.dorp at deonet.nl Mon Jun 11 02:45:25 2018 From: j.van.dorp at deonet.nl (Jacco van Dorp) Date: Mon, 11 Jun 2018 08:45:25 +0200 Subject: [Python-ideas] A "within" keyword In-Reply-To: <20180609091949.GO12683@ando.pearwood.info> References: <48c9801d3ffa1$ddb01b80$99105280$@sdamon.com> <20180609091949.GO12683@ando.pearwood.info> Message-ID: 

I'd use a class or simply import a 5-line module if I wanted a namespace like that. Seems like a cleaner solution to me. If you feel that your module namespace has too much content for a simple namespace, putting it in a separate module should be the way to go anyway.
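A minimal sketch of the two alternatives mentioned there, using only stdlib features -- the names (Settings, retries, timeout) are invented for illustration and are not from any proposal:

    from types import SimpleNamespace

    # Option 1: a class used purely as a namespace, never instantiated.
    class Settings:
        retries = 3
        timeout = 5.0

    # Option 2: an ad-hoc namespace object from the standard library.
    settings = SimpleNamespace(retries=3, timeout=5.0)

    print(Settings.retries)   # 3
    print(settings.timeout)   # 5.0

Either spelling groups related names under one dotted prefix today, with no new syntax required.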
From j.van.dorp at deonet.nl Mon Jun 11 03:18:22 2018 From: j.van.dorp at deonet.nl (Jacco van Dorp) Date: Mon, 11 Jun 2018 09:18:22 +0200 Subject: [Python-ideas] Fwd: Trigonometry in degrees In-Reply-To: <20180611054831.GW12683@ando.pearwood.info> References: <20180608054530.GC12683@ando.pearwood.info> <20180608065903.GF12683@ando.pearwood.info> <20180611054831.GW12683@ando.pearwood.info> Message-ID: 

> Remember, because π is irrational, we cannot actually call sin or cos on > any rational multiple of π. We can only operate on multiples of pi, > which is *close to* but not the same as π. That's why it is okay that > tan(pi/2) returns a huge number instead of infinity or NAN. That's > because the input is ever so slightly smaller than π/2. That's exactly > the behaviour you want when x is ever so slightly smaller than π/2. 

This would basically be the reason for a PiMultiple class - you can special case it. You'd know sin(PiMultiple(0.5)) == 0. You'd know cos(PiMultiple(0.5)) == -1 and tan(PiMultiple(0.5)) == nan. This could let you turn as many angles as possible into multiples of pi, and as long as you're in multiples of pi, you're exact. PiMultiple(Fraction(1, 6)) would be exact and could give the right sin() and cos() behaviour. 

And because it'd be a numeric type, you could still use it with all other numeric types and add/multiply it, etc. When you add and subtract it with another numeric type, it'd lose the special status, but even with multiples and divisions you can preserve its specialness. 

And if it gets weird values, you can always fall back on converting it to a float, therefore never giving worse results. 

It also gives a reason -against- degrees. If you have PiMultiple or TauMultiple, it's rather easy to give common angles, and students can learn to properly learn radians for angles as they should (because, let's be honest, they're objectively better measures of angles than degrees, or even *shiver* grads). 

We SHOULD make it easy to code exact and the right way, and I think a PiMultiple class could help that a lot. That said, it does need a better name. 

From ronaldoussoren at mac.com Mon Jun 11 04:00:17 2018 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Mon, 11 Jun 2018 10:00:17 +0200 Subject: [Python-ideas] Fwd: Trigonometry in degrees In-Reply-To: References: <20180608054530.GC12683@ando.pearwood.info> <20180608065903.GF12683@ando.pearwood.info> <20180611054831.GW12683@ando.pearwood.info> Message-ID: <4630676A-D766-4663-9A9C-753CF5A05145@mac.com> 

On 11 Jun 2018, at 09:18, Jacco van Dorp wrote: 

>> Remember, because π is irrational, we cannot actually call sin or cos on >> any rational multiple of π. We can only operate on multiples of pi, >> which is *close to* but not the same as π. That's why it is okay that >> tan(pi/2) returns a huge number instead of infinity or NAN. That's >> because the input is ever so slightly smaller than π/2. That's exactly >> the behaviour you want when x is ever so slightly smaller than π/2. 

> This would basically be the reason for a PiMultiple class - you can > special case it. You'd know sin(PiMultiple(0.5)) == 0. You'd know > cos(PiMultiple(0.5)) == -1 and tan(PiMultiple(0.5)) == nan. This could > let you turn as many angles as possible into multiples of pi, and > as long as you're in multiples of pi, you're exact. > PiMultiple(Fraction(1, 6)) would be exact and could give the right > sin() and cos() behaviour.
> 
> And because it'd be a numeric type, you could still use it with all > other numeric types and add/multiply it, etc. When you add and subtract > it with another numeric type, it'd lose the special status, but even > with multiples and divisions you can preserve its specialness. > > And if it gets weird values, you can always fall back on converting it > to a float, therefore never giving worse results. > > It also gives a reason -against- degrees. If you have PiMultiple or > TauMultiple, it's rather easy to give common angles, and students can > learn to properly learn radians for angles as they should (because, > let's be honest, they're objectively better measures of angles than > degrees, or even *shiver* grads). > > We SHOULD make it easy to code exact and the right way, and I think a > PiMultiple class could help that a lot. That said, it does need a > better name. 

What is the real-world advantage of such a class? So far I've only seen examples where the current behavior is said to be confusing for students. In most cases where I have used math.sin the angle wasn't a constant and wasn't an exact multiple of pi. 

Ronald 

> _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ 

From desmoulinmichel at gmail.com Mon Jun 11 04:23:18 2018 From: desmoulinmichel at gmail.com (Michel Desmoulin) Date: Mon, 11 Jun 2018 10:23:18 +0200 Subject: [Python-ideas] Add hooks to asyncio lifecycle In-Reply-To: References: <8d13eb02-42a2-93a3-1387-8a16bf54c9fc@gmail.com> <18942f8f-679a-5481-6015-01ed22c08278@gmail.com> <2e0b61b4-d2a2-b7cc-483c-e51ef9b4377f@gmail.com> Message-ID: <94436b26-22e0-802b-e709-cde7d66e89dc@gmail.com> 

I like it. 

First, it solves the issue for policies, and let people decide how they want to deal with the problem (drop the lib, subclass the policy/factory, etc). 

But it also solves the problem for loops, because loops are set by the task factory, and so you can easily check somebody is changing your loop from you locked policy and do whatever you want. 

This also solves the problem of: 

- task factories 
- event loop life cycle hooks 

Indeed, if somebody needs those, he/she can implement a custom loop, which can be safe guarded by the policy, which is locked. 

It doesn't have the drawback of my proposal of being overly general, and is quite simple to implement. But it does let people get creative with the stack. 

From j.van.dorp at deonet.nl Mon Jun 11 06:08:07 2018 From: j.van.dorp at deonet.nl (Jacco van Dorp) Date: Mon, 11 Jun 2018 12:08:07 +0200 Subject: [Python-ideas] Fwd: Trigonometry in degrees In-Reply-To: <4630676A-D766-4663-9A9C-753CF5A05145@mac.com> References: <20180608054530.GC12683@ando.pearwood.info> <20180608065903.GF12683@ando.pearwood.info> <20180611054831.GW12683@ando.pearwood.info> <4630676A-D766-4663-9A9C-753CF5A05145@mac.com> Message-ID: 

2018-06-11 10:00 GMT+02:00 Ronald Oussoren : >> [my suggestion of a PiMultiple class] > > What is the real-world advantage of such a class? So far I've only seen examples where the current behavior is said to be confusing for students. In most cases where I have used math.sin the angle wasn't a constant and wasn't an exact multiple of pi.
> > Ronald 

I'm assuming the current math.pi would be converted to PiMultiple(1). When learning, it's rather easy to write and read the following: 

>>> from math import sin, pi, asin, cos
>>> myangle = pi / 2
>>> sin(myangle)
1
>>> asin(1)
"0.5π"  # Currently: 1.5707963267948966
>>> cos(pi / 2)
0  # Currently: 6.123233995736766e-17

It helps clarity and understanding when you're coming to python from a math background. In universities, half of your values when working with angles are written as some multiple of pi (or some weird fraction involving it). Also, for the common angles, we'll gain accuracy (and perhaps we can avoid the infinite series for certain computations; that would be a win). 

Also, you're countering the "confusing to students" part with your own production environment experience. Those aren't alike. And if this was done, your float-based production code wouldn't get slower or care any other way. What you implement and test with PiMultiple would work perfectly fine with any random float as well, just without the extra advantages. 

From steve at pearwood.info Mon Jun 11 06:45:24 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 11 Jun 2018 20:45:24 +1000 Subject: [Python-ideas] Fwd: Trigonometry in degrees In-Reply-To: References: <20180608065903.GF12683@ando.pearwood.info> <20180611054831.GW12683@ando.pearwood.info> Message-ID: <20180611104522.GX12683@ando.pearwood.info> 

On Mon, Jun 11, 2018 at 09:18:22AM +0200, Jacco van Dorp wrote: > > Remember, because π is irrational, we cannot actually call sin or cos on > > any rational multiple of π. We can only operate on multiples of pi, > > which is *close to* but not the same as π. That's why it is okay that > > tan(pi/2) returns a huge number instead of infinity or NAN. That's > > because the input is ever so slightly smaller than π/2. That's exactly > > the behaviour you want when x is ever so slightly smaller than π/2. > > > This would basically be the reason for a PiMultiple class - you can > special case it. You'd know sin(PiMultiple(0.5)) == 0. 

Not when I went to school it wasn't. sin(π/2) = sin(90°) = 1. Perhaps you meant cos? 

In any case, the sin function doesn't work that way. Unless we add a special dunder method to defer to, it cannot be expected to magically know how to deal with these "PiMultiple" objects, except by converting them to floats. You don't suddenly get accurate results by waving a magic wand over the float 0.5 and saying "You're a multiple of pi". You still have to code a separate algorithm for this, and that's hard work. (Why do you think the decimal module doesn't support trig functions?) Is every function that takes float arguments now supposed to recognise PiMultiple objects and treat them specially? How do they integrate in the numeric tower and interact with ordinary floats? But let's say you do this: 

def sin(arg):
    if isinstance(arg, PiMultiple):
        # does this mean it needs to be a builtin class?
        call internal sinpi(arg) function
    else:
        call regular sin(arg) function

This isn't Java, and not everything needs to be a class. If we go to the trouble of writing separate sinpi() etc implementations, why hide one of them behind a class (and not even in a proper object-oriented interface) when we can just call the functions directly? 

sinpi(1.5)
sin(PiMultiple(1.5))

I know which I'd rather use. 

> It also gives a reason -against- degrees. If you have PiMultiple or > TauMultiple, it's rather easy to give common angles, 

What about the uncommon angles?
The whole point of this proposal is to make it easy to give angles in degrees without the need to convert to radians, introducing rounding errors over and above those introduced by the trig function itself. 

> and students can > learn to properly learn radians for angles as they should (because, > let's be honest, they're objectively better measures of angles than > degrees, or even *shiver* grads). 

No, they are not objectively better measures of angles. With radians, you have to divide a right angle into an irrational number of radians, one which cannot be expressed in a finite number of decimal places. In a very real sense, it is impossible to measure exactly 1 radian. I know that in practical terms, this makes no difference, we can get close enough, but physical measurements are limited to rational numbers. A measurement system based on irrational numbers, especially one as difficult as π, is not objectively better. 

It's not just because of tradition that nobody uses radians in civil engineering, astronomy, architecture, etc. Radians shine when we're doing pure maths and some branches of physics, but they're a PITA to use in most practical circumstances. E.g. the tip of your little finger at arm's length is close enough to 1° or 0.017 radian. Who wants to measure angles in multiples of 0.017? 

https://www.timeanddate.com/astronomy/measuring-the-sky-by-hand.html 

Using radians, these heuristics stink. 

-- Steve 

From steve at pearwood.info Mon Jun 11 07:38:57 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 11 Jun 2018 21:38:57 +1000 Subject: [Python-ideas] Fwd: Trigonometry in degrees In-Reply-To: References: <20180611054831.GW12683@ando.pearwood.info> <4630676A-D766-4663-9A9C-753CF5A05145@mac.com> Message-ID: <20180611113857.GY12683@ando.pearwood.info> 

On Mon, Jun 11, 2018 at 12:08:07PM +0200, Jacco van Dorp wrote: > >>> asin(1) > "0.5π" # Currently: 1.5707963267948966 

I think that if you expect the stdlib math library to change to symbolic maths for trig functions, you are going to be extremely disappointed. Aside from everything else, this is such a massive backward-compatibility break that it probably wouldn't even have been allowed in Python 3.0, let alone in 3.8. 

> It helps clarity and understanding when you're coming to python from a > math background. 

What about the other 95% of Python programmers who don't have a maths background and don't give two hoots about the mathematical elegance of being able to differentiate sin(x) without a multiplicative factor? 

-- Steve 

From ncoghlan at gmail.com Mon Jun 11 08:11:19 2018 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 11 Jun 2018 22:11:19 +1000 Subject: [Python-ideas] A PEP on introducing variables on 'if' and 'while' In-Reply-To: References: <20180609125721.GR12683@ando.pearwood.info> <20180609235030.GT12683@ando.pearwood.info> <20180610131958.GV12683@ando.pearwood.info> Message-ID: 

On 11 June 2018 at 04:35, Juancarlo Añez wrote: > > As the PEP author, that's your job. >> > > I started writing the PEP, and I found an interesting example: > > if not (m := re.match(r'^(\d+)-(\d+)$', identifier)): > raise ValueError(f'{identifier} is not a valid identifier') > print(f'first part is {m.group(1)}') > print(f'second part is {m.group(2)}') > > > That's fairly easy to understand, and not something that can be resolved > with `as` if it's part of the `if` and `while` statement, rather than a > different syntax for the `:=` semantics. > 

Yep, the "What about cases where you only want to capture part of the conditional expression?"
question is the rock on which every "only capture the entire conditional expression" proposal has foundered. PEP 572 arose from Chris deciding to take on the challenge of seriously asking the question "Well, what if we *did* allow capturing of arbitrary subexpressions with inline assignments?". Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From yselivanov.ml at gmail.com Mon Jun 11 12:23:21 2018 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Mon, 11 Jun 2018 12:23:21 -0400 Subject: [Python-ideas] Add hooks to asyncio lifecycle In-Reply-To: <94436b26-22e0-802b-e709-cde7d66e89dc@gmail.com> References: <8d13eb02-42a2-93a3-1387-8a16bf54c9fc@gmail.com> <18942f8f-679a-5481-6015-01ed22c08278@gmail.com> <2e0b61b4-d2a2-b7cc-483c-e51ef9b4377f@gmail.com> <94436b26-22e0-802b-e709-cde7d66e89dc@gmail.com> Message-ID: > I want to abstract that from the user, so I tried to put that in a policy. But that's dangerous since it can be changed at any time, so I gave up on it and made it explicit. Of course, if the user misses that in the doc (hopefully, it's an company internal code so they should be trained), it will be a bummer to debug. I still don't get it... If you have a framework you presumably have an entry point. Why can't you set up your policy in that entrypoint? Why would a user attempt to change the policy at runtime (you haven't listed examples of libraries that do this)? I see a lot of "I want to protect users from ..." arguments but I haven't yet seen "this and that happened in production and we haven't been able to debug what happened for a while". Do you handle cases when people install a blocking logging handler in their async application? Do you handle cases when a malfunctioning sys.excepthook is installed? What about cases when users accidentally import gevent somewhere in their asyncio application and it monkeypatches the 'socket' module (this is a real horror story, by the way)? My point is that there are so many things that users can do that will break any framework, be it asyncio or django or trio. This sounds like "if something can happen it will happen" kind of thing, but I haven't yet seen good examples of real code that suffers from non-locked policies. Using the nurseries example doesn't count, as this is something that we want to have as a builtin functionality in 3.8. Locking policies can lead to more predictable user experience; OTOH what happens if, say, aiohttp decides to lock its policy to use uvloop and thus make it impossible for its users to use tokio or some other loop implementation? Yury On Mon, Jun 11, 2018 at 4:23 AM Michel Desmoulin wrote: > > I like it. > > First, it solves the issue for policies, and let people decide how they > want to deal with the problem (drop the lib, subclass the > policy/factory, etc). > > But it also solves the problem for loops, because loops are set by the > task factory, and so you can easily check somebody is changing your loop > from you locked policy and do whatever you want. > > This also solves the problem of: > > - task factories > - event loop life cycle hooks > > Indeed, if somebody needs those, he/she can implement a custom loop, > which can be safe guarded by the policy, which is locked. > > It doesn't have the drawback of my proposal of being overly general, and > is quite simple to implement. But it does let people get creative with > the stack. 
> > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -- Yury From andrew.svetlov at gmail.com Mon Jun 11 12:35:30 2018 From: andrew.svetlov at gmail.com (Andrew Svetlov) Date: Mon, 11 Jun 2018 19:35:30 +0300 Subject: [Python-ideas] Add hooks to asyncio lifecycle In-Reply-To: References: <8d13eb02-42a2-93a3-1387-8a16bf54c9fc@gmail.com> <18942f8f-679a-5481-6015-01ed22c08278@gmail.com> <2e0b61b4-d2a2-b7cc-483c-e51ef9b4377f@gmail.com> <94436b26-22e0-802b-e709-cde7d66e89dc@gmail.com> Message-ID: In my mind aiohttp doesn't depend on On Mon, Jun 11, 2018, 19:24 Yury Selivanov wrote: > > I want to abstract that from the user, so I tried to put that in a > policy. But that's dangerous since it can be changed at any time, so I > gave up on it and made it explicit. Of course, if the user misses that > in the doc (hopefully, it's an company internal code so they should be > trained), it will be a bummer to debug. > > I still don't get it... If you have a framework you presumably have an > entry point. Why can't you set up your policy in that entrypoint? Why > would a user attempt to change the policy at runtime (you haven't > listed examples of libraries that do this)? I see a lot of "I want to > protect users from ..." arguments but I haven't yet seen "this and > that happened in production and we haven't been able to debug what > happened for a while". > > Do you handle cases when people install a blocking logging handler in > their async application? Do you handle cases when a malfunctioning > sys.excepthook is installed? What about cases when users accidentally > import gevent somewhere in their asyncio application and it > monkeypatches the 'socket' module (this is a real horror story, by the > way)? My point is that there are so many things that users can do > that will break any framework, be it asyncio or django or trio. > > This sounds like "if something can happen it will happen" kind of > thing, but I haven't yet seen good examples of real code that suffers > from non-locked policies. Using the nurseries example doesn't count, > as this is something that we want to have as a builtin functionality > in 3.8. > > Locking policies can lead to more predictable user experience; OTOH > what happens if, say, aiohttp decides to lock its policy to use uvloop > and thus make it impossible for its users to use tokio or some other > loop implementation? > > Yury > > > > On Mon, Jun 11, 2018 at 4:23 AM Michel Desmoulin > wrote: > > > > I like it. > > > > First, it solves the issue for policies, and let people decide how they > > want to deal with the problem (drop the lib, subclass the > > policy/factory, etc). > > > > But it also solves the problem for loops, because loops are set by the > > task factory, and so you can easily check somebody is changing your loop > > from you locked policy and do whatever you want. > > > > This also solves the problem of: > > > > - task factories > > - event loop life cycle hooks > > > > Indeed, if somebody needs those, he/she can implement a custom loop, > > which can be safe guarded by the policy, which is locked. > > > > It doesn't have the drawback of my proposal of being overly general, and > > is quite simple to implement. But it does let people get creative with > > the stack. 
> > > > > > > > > > _______________________________________________ > > Python-ideas mailing list > > Python-ideas at python.org > > https://mail.python.org/mailman/listinfo/python-ideas > > Code of Conduct: http://python.org/psf/codeofconduct/ > > > > -- > Yury > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- Thanks, Andrew Svetlov -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrew.svetlov at gmail.com Mon Jun 11 12:39:48 2018 From: andrew.svetlov at gmail.com (Andrew Svetlov) Date: Mon, 11 Jun 2018 19:39:48 +0300 Subject: [Python-ideas] Add hooks to asyncio lifecycle In-Reply-To: References: <8d13eb02-42a2-93a3-1387-8a16bf54c9fc@gmail.com> <18942f8f-679a-5481-6015-01ed22c08278@gmail.com> <2e0b61b4-d2a2-b7cc-483c-e51ef9b4377f@gmail.com> <94436b26-22e0-802b-e709-cde7d66e89dc@gmail.com> Message-ID: Sorry, smartphone is not my preferred tool. aiohttp doesn't depend on event loop implementation but uses public API only. aiohttp test suite allows to check against asyncio, uvloop, and tokio but it is another story. On Mon, Jun 11, 2018 at 7:35 PM Andrew Svetlov wrote: > In my mind aiohttp doesn't depend on > > On Mon, Jun 11, 2018, 19:24 Yury Selivanov > wrote: > >> > I want to abstract that from the user, so I tried to put that in a >> policy. But that's dangerous since it can be changed at any time, so I >> gave up on it and made it explicit. Of course, if the user misses that >> in the doc (hopefully, it's an company internal code so they should be >> trained), it will be a bummer to debug. >> >> I still don't get it... If you have a framework you presumably have an >> entry point. Why can't you set up your policy in that entrypoint? Why >> would a user attempt to change the policy at runtime (you haven't >> listed examples of libraries that do this)? I see a lot of "I want to >> protect users from ..." arguments but I haven't yet seen "this and >> that happened in production and we haven't been able to debug what >> happened for a while". >> >> Do you handle cases when people install a blocking logging handler in >> their async application? Do you handle cases when a malfunctioning >> sys.excepthook is installed? What about cases when users accidentally >> import gevent somewhere in their asyncio application and it >> monkeypatches the 'socket' module (this is a real horror story, by the >> way)? My point is that there are so many things that users can do >> that will break any framework, be it asyncio or django or trio. >> >> This sounds like "if something can happen it will happen" kind of >> thing, but I haven't yet seen good examples of real code that suffers >> from non-locked policies. Using the nurseries example doesn't count, >> as this is something that we want to have as a builtin functionality >> in 3.8. >> >> Locking policies can lead to more predictable user experience; OTOH >> what happens if, say, aiohttp decides to lock its policy to use uvloop >> and thus make it impossible for its users to use tokio or some other >> loop implementation? >> >> Yury >> >> >> >> On Mon, Jun 11, 2018 at 4:23 AM Michel Desmoulin >> wrote: >> > >> > I like it. >> > >> > First, it solves the issue for policies, and let people decide how they >> > want to deal with the problem (drop the lib, subclass the >> > policy/factory, etc). 
>> > >> > But it also solves the problem for loops, because loops are set by the >> > task factory, and so you can easily check somebody is changing your loop >> > from you locked policy and do whatever you want. >> > >> > This also solves the problem of: >> > >> > - task factories >> > - event loop life cycle hooks >> > >> > Indeed, if somebody needs those, he/she can implement a custom loop, >> > which can be safe guarded by the policy, which is locked. >> > >> > It doesn't have the drawback of my proposal of being overly general, and >> > is quite simple to implement. But it does let people get creative with >> > the stack. >> > >> > >> > >> > >> > _______________________________________________ >> > Python-ideas mailing list >> > Python-ideas at python.org >> > https://mail.python.org/mailman/listinfo/python-ideas >> > Code of Conduct: http://python.org/psf/codeofconduct/ >> >> >> >> -- >> Yury >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > -- > Thanks, > Andrew Svetlov > -- Thanks, Andrew Svetlov -------------- next part -------------- An HTML attachment was scrubbed... URL: From yselivanov.ml at gmail.com Mon Jun 11 12:50:14 2018 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Mon, 11 Jun 2018 12:50:14 -0400 Subject: [Python-ideas] Add hooks to asyncio lifecycle In-Reply-To: References: <8d13eb02-42a2-93a3-1387-8a16bf54c9fc@gmail.com> <18942f8f-679a-5481-6015-01ed22c08278@gmail.com> <2e0b61b4-d2a2-b7cc-483c-e51ef9b4377f@gmail.com> <94436b26-22e0-802b-e709-cde7d66e89dc@gmail.com> Message-ID: > aiohttp doesn't depend on event loop implementation but uses public API only. Yeah, I understand. I was using it as an example of what happens if a popular library like aiohttp decides to lock the policy for whatever reason. To add to my point: event loop policies should only be used to inject a custom event loop implementation, like uvloop. They shouldn't be used to add some framework- or library-specific functionality. That's why I think that locking policies does not make a lot of sense. Yury On Mon, Jun 11, 2018 at 12:40 PM Andrew Svetlov wrote: > > Sorry, smartphone is not my preferred tool. > aiohttp doesn't depend on event loop implementation but uses public API only. > aiohttp test suite allows to check against asyncio, uvloop, and tokio but it is another story. > > On Mon, Jun 11, 2018 at 7:35 PM Andrew Svetlov wrote: >> >> In my mind aiohttp doesn't depend on >> >> On Mon, Jun 11, 2018, 19:24 Yury Selivanov wrote: >>> >>> > I want to abstract that from the user, so I tried to put that in a >>> policy. But that's dangerous since it can be changed at any time, so I >>> gave up on it and made it explicit. Of course, if the user misses that >>> in the doc (hopefully, it's an company internal code so they should be >>> trained), it will be a bummer to debug. >>> >>> I still don't get it... If you have a framework you presumably have an >>> entry point. Why can't you set up your policy in that entrypoint? Why >>> would a user attempt to change the policy at runtime (you haven't >>> listed examples of libraries that do this)? I see a lot of "I want to >>> protect users from ..." arguments but I haven't yet seen "this and >>> that happened in production and we haven't been able to debug what >>> happened for a while". 
>>> >>> Do you handle cases when people install a blocking logging handler in >>> their async application? Do you handle cases when a malfunctioning >>> sys.excepthook is installed? What about cases when users accidentally >>> import gevent somewhere in their asyncio application and it >>> monkeypatches the 'socket' module (this is a real horror story, by the >>> way)? My point is that there are so many things that users can do >>> that will break any framework, be it asyncio or django or trio. >>> >>> This sounds like "if something can happen it will happen" kind of >>> thing, but I haven't yet seen good examples of real code that suffers >>> from non-locked policies. Using the nurseries example doesn't count, >>> as this is something that we want to have as a builtin functionality >>> in 3.8. >>> >>> Locking policies can lead to more predictable user experience; OTOH >>> what happens if, say, aiohttp decides to lock its policy to use uvloop >>> and thus make it impossible for its users to use tokio or some other >>> loop implementation? >>> >>> Yury >>> >>> >>> >>> On Mon, Jun 11, 2018 at 4:23 AM Michel Desmoulin >>> wrote: >>> > >>> > I like it. >>> > >>> > First, it solves the issue for policies, and let people decide how they >>> > want to deal with the problem (drop the lib, subclass the >>> > policy/factory, etc). >>> > >>> > But it also solves the problem for loops, because loops are set by the >>> > task factory, and so you can easily check somebody is changing your loop >>> > from you locked policy and do whatever you want. >>> > >>> > This also solves the problem of: >>> > >>> > - task factories >>> > - event loop life cycle hooks >>> > >>> > Indeed, if somebody needs those, he/she can implement a custom loop, >>> > which can be safe guarded by the policy, which is locked. >>> > >>> > It doesn't have the drawback of my proposal of being overly general, and >>> > is quite simple to implement. But it does let people get creative with >>> > the stack. >>> > >>> > >>> > >>> > >>> > _______________________________________________ >>> > Python-ideas mailing list >>> > Python-ideas at python.org >>> > https://mail.python.org/mailman/listinfo/python-ideas >>> > Code of Conduct: http://python.org/psf/codeofconduct/ >>> >>> >>> >>> -- >>> Yury >>> _______________________________________________ >>> Python-ideas mailing list >>> Python-ideas at python.org >>> https://mail.python.org/mailman/listinfo/python-ideas >>> Code of Conduct: http://python.org/psf/codeofconduct/ >> >> -- >> Thanks, >> Andrew Svetlov > > -- > Thanks, > Andrew Svetlov -- Yury From andrew.svetlov at gmail.com Mon Jun 11 12:55:39 2018 From: andrew.svetlov at gmail.com (Andrew Svetlov) Date: Mon, 11 Jun 2018 19:55:39 +0300 Subject: [Python-ideas] Add hooks to asyncio lifecycle In-Reply-To: References: <8d13eb02-42a2-93a3-1387-8a16bf54c9fc@gmail.com> <18942f8f-679a-5481-6015-01ed22c08278@gmail.com> <2e0b61b4-d2a2-b7cc-483c-e51ef9b4377f@gmail.com> <94436b26-22e0-802b-e709-cde7d66e89dc@gmail.com> Message-ID: Well, if we need something -- locking is better than hooks. But yes, we need a real life example. While the example is absent let's postpone the solution. On Mon, Jun 11, 2018 at 7:50 PM Yury Selivanov wrote: > > aiohttp doesn't depend on event loop implementation but uses public API > only. > > Yeah, I understand. I was using it as an example of what happens if a > popular library like aiohttp decides to lock the policy for whatever > reason. 
> > To add to my point: event loop policies should only be used to inject > a custom event loop implementation, like uvloop. They shouldn't be > used to add some framework- or library-specific functionality. That's > why I think that locking policies does not make a lot of sense. > > Yury > On Mon, Jun 11, 2018 at 12:40 PM Andrew Svetlov > wrote: > > > > Sorry, smartphone is not my preferred tool. > > aiohttp doesn't depend on event loop implementation but uses public API > only. > > aiohttp test suite allows to check against asyncio, uvloop, and tokio > but it is another story. > > > > On Mon, Jun 11, 2018 at 7:35 PM Andrew Svetlov > wrote: > >> > >> In my mind aiohttp doesn't depend on > >> > >> On Mon, Jun 11, 2018, 19:24 Yury Selivanov > wrote: > >>> > >>> > I want to abstract that from the user, so I tried to put that in a > >>> policy. But that's dangerous since it can be changed at any time, so I > >>> gave up on it and made it explicit. Of course, if the user misses that > >>> in the doc (hopefully, it's an company internal code so they should be > >>> trained), it will be a bummer to debug. > >>> > >>> I still don't get it... If you have a framework you presumably have an > >>> entry point. Why can't you set up your policy in that entrypoint? Why > >>> would a user attempt to change the policy at runtime (you haven't > >>> listed examples of libraries that do this)? I see a lot of "I want to > >>> protect users from ..." arguments but I haven't yet seen "this and > >>> that happened in production and we haven't been able to debug what > >>> happened for a while". > >>> > >>> Do you handle cases when people install a blocking logging handler in > >>> their async application? Do you handle cases when a malfunctioning > >>> sys.excepthook is installed? What about cases when users accidentally > >>> import gevent somewhere in their asyncio application and it > >>> monkeypatches the 'socket' module (this is a real horror story, by the > >>> way)? My point is that there are so many things that users can do > >>> that will break any framework, be it asyncio or django or trio. > >>> > >>> This sounds like "if something can happen it will happen" kind of > >>> thing, but I haven't yet seen good examples of real code that suffers > >>> from non-locked policies. Using the nurseries example doesn't count, > >>> as this is something that we want to have as a builtin functionality > >>> in 3.8. > >>> > >>> Locking policies can lead to more predictable user experience; OTOH > >>> what happens if, say, aiohttp decides to lock its policy to use uvloop > >>> and thus make it impossible for its users to use tokio or some other > >>> loop implementation? > >>> > >>> Yury > >>> > >>> > >>> > >>> On Mon, Jun 11, 2018 at 4:23 AM Michel Desmoulin > >>> wrote: > >>> > > >>> > I like it. > >>> > > >>> > First, it solves the issue for policies, and let people decide how > they > >>> > want to deal with the problem (drop the lib, subclass the > >>> > policy/factory, etc). > >>> > > >>> > But it also solves the problem for loops, because loops are set by > the > >>> > task factory, and so you can easily check somebody is changing your > loop > >>> > from you locked policy and do whatever you want. > >>> > > >>> > This also solves the problem of: > >>> > > >>> > - task factories > >>> > - event loop life cycle hooks > >>> > > >>> > Indeed, if somebody needs those, he/she can implement a custom loop, > >>> > which can be safe guarded by the policy, which is locked. 
> >>> > > >>> > It doesn't have the drawback of my proposal of being overly general, > and > >>> > is quite simple to implement. But it does let people get creative > with > >>> > the stack. > >>> > > >>> > > >>> > > >>> > > >>> > _______________________________________________ > >>> > Python-ideas mailing list > >>> > Python-ideas at python.org > >>> > https://mail.python.org/mailman/listinfo/python-ideas > >>> > Code of Conduct: http://python.org/psf/codeofconduct/ > >>> > >>> > >>> > >>> -- > >>> Yury > >>> _______________________________________________ > >>> Python-ideas mailing list > >>> Python-ideas at python.org > >>> https://mail.python.org/mailman/listinfo/python-ideas > >>> Code of Conduct: http://python.org/psf/codeofconduct/ > >> > >> -- > >> Thanks, > >> Andrew Svetlov > > > > -- > > Thanks, > > Andrew Svetlov > > > > -- > Yury > -- Thanks, Andrew Svetlov -------------- next part -------------- An HTML attachment was scrubbed... URL: 

From chris.barker at noaa.gov Mon Jun 11 13:05:10 2018 From: chris.barker at noaa.gov (Chris Barker) Date: Mon, 11 Jun 2018 10:05:10 -0700 Subject: [Python-ideas] Fwd: Trigonometry in degrees In-Reply-To: <4630676A-D766-4663-9A9C-753CF5A05145@mac.com> References: <20180608054530.GC12683@ando.pearwood.info> <20180608065903.GF12683@ando.pearwood.info> <20180611054831.GW12683@ando.pearwood.info> <4630676A-D766-4663-9A9C-753CF5A05145@mac.com> Message-ID: 

On Mon, Jun 11, 2018 at 1:00 AM, Ronald Oussoren wrote: > > What is the real-world advantage of such a class? So far I've only seen > examples where the current behavior is said to be confusing for students.
(though maybe I'd change my mind if it saw really wide use) However simiply adding a few names like: sindeg, cosdeg, etc, to save folks from having to type: math.sin(math.degree(something)) is a fine idea -- and it may make it a bit more clear, when people go looking for the "Sine" function, that they don't want to use degrees with the regular one... -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From mike at selik.org Mon Jun 11 13:24:42 2018 From: mike at selik.org (Michael Selik) Date: Mon, 11 Jun 2018 10:24:42 -0700 Subject: [Python-ideas] Fwd: Trigonometry in degrees In-Reply-To: References: <20180608054530.GC12683@ando.pearwood.info> <20180608065903.GF12683@ando.pearwood.info> <20180611054831.GW12683@ando.pearwood.info> <4630676A-D766-4663-9A9C-753CF5A05145@mac.com> Message-ID: Would sind and cosd make Euler's formula work correctly? sind(x) + i * sind(x) == math.e ** (i * x) I suspect that adding these functions is kind of like those cartoons where the boat is springing leaks and the character tried to plug them with their fingers. Floating point is a leaky abstraction. Perhaps you'd prefer an enhancement to the fractions module that provides real (not float) math? -------------- next part -------------- An HTML attachment was scrubbed... URL: From mike at selik.org Mon Jun 11 13:33:57 2018 From: mike at selik.org (Michael Selik) Date: Mon, 11 Jun 2018 10:33:57 -0700 Subject: [Python-ideas] Fwd: Trigonometry in degrees In-Reply-To: References: <20180608054530.GC12683@ando.pearwood.info> <20180608065903.GF12683@ando.pearwood.info> <20180611054831.GW12683@ando.pearwood.info> <4630676A-D766-4663-9A9C-753CF5A05145@mac.com> Message-ID: Whoops, it turns out Euler's formula does work! I expected imprecision, but at least one test matched. x = 42 cos(x) + 1j * sin(x) == e ** (1j * x) I suppose that's because it's radians. On Mon, Jun 11, 2018, 10:24 AM Michael Selik wrote: > Would sind and cosd make Euler's formula work correctly? > > sind(x) + i * sind(x) == math.e ** (i * x) > > I suspect that adding these functions is kind of like those cartoons where > the boat is springing leaks and the character tried to plug them with their > fingers. Floating point is a leaky abstraction. > > Perhaps you'd prefer an enhancement to the fractions module that provides > real (not float) math? > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stephanh42 at gmail.com Mon Jun 11 14:04:31 2018 From: stephanh42 at gmail.com (Stephan Houben) Date: Mon, 11 Jun 2018 20:04:31 +0200 Subject: [Python-ideas] Fwd: Trigonometry in degrees In-Reply-To: References: <20180608054530.GC12683@ando.pearwood.info> <20180608065903.GF12683@ando.pearwood.info> <20180611054831.GW12683@ando.pearwood.info> <4630676A-D766-4663-9A9C-753CF5A05145@mac.com> Message-ID: 2018-06-11 19:33 GMT+02:00 Michael Selik : > Whoops, it turns out Euler's formula does work! I expected imprecision, > but at least one test matched. > > x = 42 > cos(x) + 1j * sin(x) == e ** (1j * x) > I think you will find it holds for any x (except inf, -inf and nan). The boat is less leaky than you think; IEEE floating-point arithmetic goes out of its way to produce exact answers whenever possible. 
(To great consternation of hardware designers who felt that requiring
1.0*x == x was too expensive.)

> I suppose that's because it's radians.

Well, the formula obviously only holds in exact arithmetic if cos and sin
are the versions taking radians.

Stephan

> On Mon, Jun 11, 2018, 10:24 AM Michael Selik wrote:
>
>> Would sind and cosd make Euler's formula work correctly?
>>
>> cosd(x) + i * sind(x) == math.e ** (i * x)
>>
>> I suspect that adding these functions is kind of like those cartoons
>> where the boat is springing leaks and the character tries to plug them
>> with their fingers. Floating point is a leaky abstraction.
>>
>> Perhaps you'd prefer an enhancement to the fractions module that
>> provides real (not float) math?

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
From steve at pearwood.info Mon Jun 11 14:38:07 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 12 Jun 2018 04:38:07 +1000
Subject: [Python-ideas] Fwd: Trigonometry in degrees
In-Reply-To: 
References: <20180611054831.GW12683@ando.pearwood.info>
 <4630676A-D766-4663-9A9C-753CF5A05145@mac.com>
Message-ID: <20180611183806.GZ12683@ando.pearwood.info>

On Mon, Jun 11, 2018 at 10:24:42AM -0700, Michael Selik wrote:
> Would sind and cosd make Euler's formula work correctly?
>
> cosd(x) + i * sind(x) == math.e ** (i * x)

No, using degrees makes Euler's identity *not* work correctly, unless you
add in a conversion factor from degrees to radians:

https://math.stackexchange.com/questions/1368049/eulers-identity-in-degrees

Euler's Identity works fine in radians:

py> from cmath import exp
py> exp(1j*math.pi)
(-1+1.2246063538223773e-16j)

which is close enough to -1 given the usual rounding issues with floats.

(Remember, math.pi is not π, but a number close to it. There is no way to
represent the irrational number π in less than an infinite amount of
memory without symbolic maths.)

[...]
> Perhaps you'd prefer an enhancement to the fractions module that provides
> real (not float) math?

I should think not.

Niven's Theorem tells us that for rational angles between 0° and 90°
(that is, angles which can be represented as fractions), there are only
THREE for which sine (and cosine) are themselves rational:

https://en.wikipedia.org/wiki/Niven's_theorem

Every value of sin(x) except for those three angles is an irrational
number, which means they cannot be represented exactly as fractions or in
a finite number of decimal places.

What that means is that if we tried to implement real (not float)
trigonometric functions on fractions, we'd need symbolic maths capable of
returning ever-more complicated expressions involving surds.

For example, the exact value of sin(15/2°) involves a triple nested
square root:

1/2 sqrt(2 - sqrt(2 + sqrt(3)))

and that's one of the relatively pretty ones. sin(3°) is:

-1/2 (-1)^(29/60) ((-1)^(1/60) - 1) (1 + (-1)^(1/60))

http://www.wolframalpha.com/input/?i=exact+value+of+sine%2815%2F2+degrees%29
http://www.wolframalpha.com/input/?i=exact+value+of+sine%283+degrees%29

This proposal was supposed to *simplify* the trig functions for
non-mathematicians, not make them mind-bogglingly complicated.
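(For the morbidly curious, SymPy -- a third party package, and not
something I'm proposing for the stdlib -- already does this kind of
symbolic trigonometry; the exact output format may vary by version:

py> from sympy import sin, pi
py> sin(pi/6)
1/2
py> sin(pi/12)
-sqrt(2)/4 + sqrt(6)/4

Now imagine those expressions for arbitrary rational angles.)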
--
Steve
From cpitclaudel at gmail.com Mon Jun 11 15:50:01 2018
From: cpitclaudel at gmail.com (=?UTF-8?Q?Cl=c3=a9ment_Pit-Claudel?=)
Date: Mon, 11 Jun 2018 15:50:01 -0400
Subject: [Python-ideas] Fwd: Trigonometry in degrees
In-Reply-To: 
References: <20180608054530.GC12683@ando.pearwood.info>
 <20180611054831.GW12683@ando.pearwood.info>
 <4630676A-D766-4663-9A9C-753CF5A05145@mac.com>
Message-ID: 

On 2018-06-11 14:04, Stephan Houben wrote:
> 2018-06-11 19:33 GMT+02:00 Michael Selik:
>
>> Whoops, it turns out Euler's formula does work! I expected imprecision,
>> but at least one test matched.
>>
>> x = 42
>> cos(x) + 1j * sin(x) == e ** (1j * x)
>
> I think you will find it holds for any x (except inf, -inf and nan).
> The boat is less leaky than you think; IEEE floating-point arithmetic
> goes out of its way to produce exact answers whenever possible.
> (To great consternation of hardware designers who felt that
> requiring 1.0*x == x was too expensive.)

In fact, 1.0*x == x is almost all that this test exercises. If I'm
looking in the right place, this is the C implementation of a ** b,
omitting a few special cases:

vabs = hypot(a.real, a.imag);
len = pow(vabs, b.real);
at = atan2(a.imag, a.real);
phase = at*b.real;
if (b.imag != 0.0) {
    len /= exp(at*b.imag);
    phase += b.imag*log(vabs);
}
r.real = len*cos(phase);
r.imag = len*sin(phase);

This means that (e ** ...) is essentially implemented in terms of the
formula above. Indeed, in the special case of e ** (1j * x), we have
a.real = e, a.imag = 0.0, b.real = 0.0, and b.imag = 1.0, so concretely
the code simplifies to this:

vabs = e
len = 1.0
at = 0.0
phase = 0.0
if (b.imag != 0.0) {
    len = 1.0;
    phase = x;  // requires log(e) == 1.0 and x * 1.0 == x
}
r.real = cos(phase);  // requires 1.0 * x == x
r.imag = sin(phase);

Thus, it shouldn't be too surprising that the formula holds :)

Clément.
From chris.barker at noaa.gov Mon Jun 11 16:18:10 2018
From: chris.barker at noaa.gov (Chris Barker)
Date: Mon, 11 Jun 2018 13:18:10 -0700
Subject: [Python-ideas] Fwd: Trigonometry in degrees
In-Reply-To: 
References: <20180608054530.GC12683@ando.pearwood.info>
 <20180608065903.GF12683@ando.pearwood.info>
 <20180611054831.GW12683@ando.pearwood.info>
 <4630676A-D766-4663-9A9C-753CF5A05145@mac.com>
Message-ID: 

On Mon, Jun 11, 2018 at 10:24 AM, Michael Selik wrote:

> Would sind and cosd make Euler's formula work correctly?

Not trying to pick on you, but this question shows a key misunderstanding:

There is nothing inherently more accurate in using degrees rather than
radians for trigonometry. It's nice that handy values like "one quarter
of a circle" can be exactly represented, but that's really only an
aesthetic thing.

And every computer math lib I've ever seen uses floating point radians for
trig functions, so unless you're really going to implement trig from
degrees from scratch, then you are going to go to floating point radians
(and floating point pi) anyway.

Oh, and radians are the more "natural" units (in fact unitless) for math,
and the only way that things like the Euler identity work. Which is why
computational math libs use them.

So there are two orthogonal ideas on the table here:

1) Have trig functions that take degrees for convenience for when folks
are working in degrees already.

2) Have trig functions that produce exact values (i.e. what is "expected")
for the special cases.

It seems the OP is interested in a package that combines both of these --
which is a fine idea as a third party lib.
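A rough sketch of what such a package might offer -- hypothetical names,
untested, just to make the idea concrete:

import math

# the handful of "nice" angles with rational sines (Niven's theorem)
_EXACT_SIND = {0: 0.0, 30: 0.5, 90: 1.0, 150: 0.5,
               180: 0.0, 210: -0.5, 270: -1.0, 330: -0.5}

def sind(degrees):
    d = math.fmod(degrees, 360.0)  # exact, unlike float %
    if d < 0:
        d += 360.0
    if d in _EXACT_SIND:
        return _EXACT_SIND[d]
    return math.sin(math.radians(d))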
> Perhaps you'd prefer an enhancement to the fractions module that provides
> real (not float) math?

Isn't that exactly what the fractions module does? Or are you suggesting
that it be extended with trig functions?

-CHB

--

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

Chris.Barker at noaa.gov
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
From mike at selik.org Mon Jun 11 16:53:25 2018
From: mike at selik.org (Michael Selik)
Date: Mon, 11 Jun 2018 13:53:25 -0700
Subject: [Python-ideas] Fwd: Trigonometry in degrees
In-Reply-To: 
References: <20180608054530.GC12683@ando.pearwood.info>
 <20180608065903.GF12683@ando.pearwood.info>
 <20180611054831.GW12683@ando.pearwood.info>
 <4630676A-D766-4663-9A9C-753CF5A05145@mac.com>
Message-ID: 

On Mon, Jun 11, 2018, 1:18 PM Chris Barker wrote:

> On Mon, Jun 11, 2018 at 10:24 AM, Michael Selik wrote:
>
>> Would sind and cosd make Euler's formula work correctly?
>
> There is nothing inherently more accurate in using degrees rather than
> radians for trigonometry.

That's actually what I was trying to say. Shouldn't have tried to be
round-about. Not only did the point get muddled, but I wrote something
false as well!

>> Perhaps you'd prefer an enhancement to the fractions module that
>> provides real (not float) math?
>
> Isn't that exactly what the fractions module does? Or are you suggesting
> that it be extended with trig functions?

The latter. However, to Steven's point about irrationals, perhaps this
should be an entirely separate module designed to handle various
irrationalities accurately. ... Like SymPy.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
From chris.barker at noaa.gov Mon Jun 11 17:12:06 2018
From: chris.barker at noaa.gov (Chris Barker)
Date: Mon, 11 Jun 2018 14:12:06 -0700
Subject: [Python-ideas] Fwd: Trigonometry in degrees
In-Reply-To: <20180611054831.GW12683@ando.pearwood.info>
References: <20180608054530.GC12683@ando.pearwood.info>
 <20180608065903.GF12683@ando.pearwood.info>
 <20180611054831.GW12683@ando.pearwood.info>
Message-ID: 

On Sun, Jun 10, 2018 at 10:48 PM, Steven D'Aprano wrote:

> > In regard to the "special values", and exact results -- a good math lib
> > should return results that are "exact" in all but maybe the last digit
> > stored. So you could check inputs and outputs with, e.g. math.isclose()
> > to give people the "exact" results. -- and keep it all in floating
> > point.
>
> I wish Uncle Timmy or Mark Dickinson were around to give a definite
> answer, but in their absence I'll have a go. I'm reasonably sure
> that's wrong.

hmm -- I'm no numerical analyst, but I could have sworn I learned (from
Kahan himself) that the trig functions could (and were, at least in the HP
calculators :-) ) be computed to within one digit of accuracy. He even
proved how many digits of pi you'd have to store to do that (though I
can't say I understood the proof) -- I think you needed all those digits
of pi because the trig functions are defined on the range 0 -- pi/2, and
any larger value needs to be mapped to that domain -- if someone asks for
the sin(1e100), you need to know pretty exactly what x % pi/4 is.
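(A quick way to see this in action, assuming mpmath is installed -- and
note that what the platform libm does here may vary:

import math
import mpmath

mpmath.mp.prec = 500   # plenty of working bits to reduce 1e100 mod pi
print(math.sin(1e100))                           # platform libm result
print(float(mpmath.sin(mpmath.mpf(10) ** 100)))  # high-precision reference

If the libm does its argument reduction with enough digits of pi, the two
lines agree.)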
> The problem with trig functions is that they suffer from "the
> table maker's dilemma", so it is very hard to guarantee a correctly
> rounded result without going to ludicrous extremes:
>
> http://perso.ens-lyon.fr/jean-michel.muller/Intro-to-TMD.htm
>
> So I think that there's no guarantee given for transcendental functions
> like sine, cosine etc.

so -- if that's the case, I still think we know that while the last digit
may not be the best rounded value to the real one, the second to last
digit is correct. And if not, then there's nothing we could do other than
implement the math lib :-)

> But even if they were, using isclose() is the wrong solution. Suppose
> sin(x) returns some number y, such that isclose(y, 0.0) say. You have no
> way of knowing that y is an inaccurate result that ought to be zero, or
> whether the answer should be non-zero and y is correct. You cannot
> assume that "y is close to zero, therefore it ought to be zero".

No, but you can say "y is as close to zero as I care about". We are
already restricted to not knowing the distinction within less than an eps
-- so making the "effective eps" a bit larger would result in more
aesthetically pleasing results.

> It's not just zero, the same applies for any value. That's just moving
> rounding errors from one input to a slightly different input.
>
> # current situation
> sine of x returns y, but the mathematical exact result is exactly z
>
> # suggested "fix"
> sine of x ± a tiny bit returns exactly z, but ought to return y
>
> Guessing what sin or cos "ought to" return based on either the inexact
> input or inexact output is not a good approach.

I don't think that's what it would be -- rather, it would be returning a
bit less precision in exchange for more aesthetically pleasing results :-)

Note that there is no way I would advocate using this for the stdlib trig
functions -- only for a special-purpose library.

I'm also suggesting that it would give equally good results to what is
being proposed: using integer degrees, or pi- or tau-based units. If you
used integer degrees, then you'd have exactly, say, pi (180 degrees), but
a precision of only pi/180 -- much less than the 15 digits or so you'd
get if you rounded the regular floating point results. And if you used
tau-based units, you'd be back to the same thing -- 0.5 tau could be
exact, but what would you do for a bit bigger or smaller than that? Use
FP :-)

> We can only operate on multiples of pi,
> which is *close to* but not the same as π. That's why it is okay that
> tan(pi/2) returns a huge number instead of infinity or NAN. That's
> because the input is ever so slightly smaller than π/2. That's exactly
> the behavior you want when x is ever so slightly smaller than π/2.

I suppose so, but is there a guarantee that the FP representation of π/2
is a tiny bit less than the exact value, rather than a tiny bit more?
Which would result in VERY different answers:

In [10]: math.tan(math.pi / 2.0)
Out[10]: 1.633123935319537e+16

In [11]: math.tan(math.pi / 2.0 + 2e-16)
Out[11]: -6218431163823738.0

(though equally "correct")

Also -- 1.6e+16 is actually pretty darn small compared to FP range.
So a library that wants to produce "expected" results may want to do
something with that -- something like:

In [118]: import math
     ...: from math import pi

In [119]: def pretty_tan(x):
     ...:     tol = 3e-16
     ...:     diff = x % (pi / 2)
     ...:     if abs(diff) < tol:
     ...:         return float("-Inf")
     ...:     elif (pi / 2) - diff < tol:
     ...:         return float("Inf")
     ...:     return math.tan(x)

In [120]: x = pi / 2 - 5e-16

In [121]: for i in range(10):
     ...:     val = x + i * 1e-16
     ...:     print val, pretty_tan(val)

1.57079632679 1.9789379661e+15
1.57079632679 1.9789379661e+15
1.57079632679 inf
1.57079632679 inf
1.57079632679 -inf
1.57079632679 -inf
1.57079632679 -inf
1.57079632679 -inf
1.57079632679 -2.61194216074e+15
1.57079632679 -2.61194216074e+15

You'd want to tweak that tolerance value to be as small as possible, and
do something to make it more symmetric, but you get the idea.

The goal is that if you have an input that is about as close as you can
get to pi/2, you get inf or -inf as a result.

This does mean you are tossing away a tiny bit of precision -- you could
get a "correct" value for those values really close to pi/2, but it would
be prettier...

-CHB

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

Chris.Barker at noaa.gov
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
From greg.ewing at canterbury.ac.nz Mon Jun 11 18:48:27 2018
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 12 Jun 2018 10:48:27 +1200
Subject: [Python-ideas] Fwd: Trigonometry in degrees
In-Reply-To: 
References: <20180608054530.GC12683@ando.pearwood.info>
 <20180608065903.GF12683@ando.pearwood.info>
 <20180611054831.GW12683@ando.pearwood.info>
 <4630676A-D766-4663-9A9C-753CF5A05145@mac.com>
Message-ID: <5B1EFC3B.6020800@canterbury.ac.nz>

Michael Selik wrote:
> Whoops, it turns out Euler's formula does work! I expected imprecision,
> but at least one test matched.

That might be because the implementation of e ** x where x is complex is
using Euler's formula...

--
Greg
From greg.ewing at canterbury.ac.nz Mon Jun 11 18:57:18 2018
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 12 Jun 2018 10:57:18 +1200
Subject: [Python-ideas] Fwd: Trigonometry in degrees
In-Reply-To: <20180611183806.GZ12683@ando.pearwood.info>
References: <20180611054831.GW12683@ando.pearwood.info>
 <4630676A-D766-4663-9A9C-753CF5A05145@mac.com>
 <20180611183806.GZ12683@ando.pearwood.info>
Message-ID: <5B1EFE4E.1050204@canterbury.ac.nz>

Steven D'Aprano wrote:
> sin(3°) is:
>
> -1/2 (-1)^(29/60) ((-1)^(1/60) - 1) (1 + (-1)^(1/60))
>
> This proposal was supposed to *simplify* the trig functions for
> non-mathematicians, not make them mind-bogglingly complicated.

I don't think anyone is going to complain about sin(3°) not being exact,
whatever units are being used. This discussion is only about the rational
values.

I wonder whether another solution would be to provide a set of "newbie
math" functions that round their results.

>>> round(cos(pi/2), 15)
0.0
>>> round(sin(pi/6), 15)
0.5

Yes, I know, this just pushes the surprises somewhere else, but so does
every solution.
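A minimal sketch of what I mean, with made-up names (and an arbitrary
choice of 15 places):

import math

def nsin(x):
    # "newbie" sine: round away the last couple of noisy digits
    return round(math.sin(x), 15)

def ncos(x):
    return round(math.cos(x), 15)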
--
Greg
From steve at pearwood.info Mon Jun 11 20:48:36 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 12 Jun 2018 10:48:36 +1000
Subject: [Python-ideas] Fwd: Trigonometry in degrees
In-Reply-To: <5B1EFE4E.1050204@canterbury.ac.nz>
References: <20180611054831.GW12683@ando.pearwood.info>
 <4630676A-D766-4663-9A9C-753CF5A05145@mac.com>
 <20180611183806.GZ12683@ando.pearwood.info>
 <5B1EFE4E.1050204@canterbury.ac.nz>
Message-ID: <20180612004835.GA12683@ando.pearwood.info>

On Tue, Jun 12, 2018 at 10:57:18AM +1200, Greg Ewing wrote:
> Steven D'Aprano wrote:
> >sin(3°) is:
> >
> >-1/2 (-1)^(29/60) ((-1)^(1/60) - 1) (1 + (-1)^(1/60))
> >
> >This proposal was supposed to *simplify* the trig functions for
> >non-mathematicians, not make them mind-bogglingly complicated.
>
> I don't think anyone is going to complain about sin(3°) not
> being exact, whatever units are being used. This discussion
> is only about the rational values.

Precisely. They are the values I'm talking about.

If you're serious about only supporting the rational values, then the
implementation is trivial and we can support both degrees and radians
easily.

def sin(angle):  # radians
    if angle == 0:
        return 0
    raise ValueError("sorry, no rational sine for that angle")

def sind(angle):  # degrees
    angle %= 360
    rational_values = {
        0: 0, 30: 0.5, 90: 1, 150: 0.5,
        180: 0, 210: -0.5, 270: -1, 330: -0.5,
        }
    if angle in rational_values:
        return rational_values[angle]
    raise ValueError("sorry, no rational sine for that angle")

Exceedingly simple, and exceedingly useless. Which was my point:
supporting only the rational values is pointless, because there are only
a handful of them. We either have to support full-blown symbolic results,
or we have rational APPROXIMATIONS to the true value.

I'm responding to a proposal that explicitly suggested using fractions to
do "real (not float) math", which I read as "no approximations". I
imagine that Michael thought that by using fractions, we can calculate
exact rational results for sine etc without floating point rounding
errors. No we cannot, except for the values above.

If "not float" is serious, then with the exception of the above values,
*none* of the values will be rational and we either can't return a value
(except for the above) or we have to use symbolic maths.

Using fractions is not a magic panacea that lets you calculate exact
answers just by swapping floats to fractions.

> I wonder whether another solution would be to provide a
> set of "newbie math" functions that round their results.
[...]
> Yes, I know, this just pushes the surprises somewhere
> else, but so does every solution.

No, that's really not true of every solution.

The initial proposal is fine: a separate set of trig functions that take
their arguments in degrees would have no unexpected surprises (only the
expected ones). With a decent implementation i.e. not this one:

# don't do this
def sind(angle):
    return math.sin(math.radians(angle))

and equivalent for cos, we ought to be able to get correctly rounded
values for nearly all the "interesting" angles of the circle, without
those pesky rounding issues caused by π not being exactly representable
as a float.

There will still be rounding errors, because we're dealing with numbers
like sqrt(2) etc, but with IEEE-754 maths, they'll be correctly rounded.
--
Steve
From steve at pearwood.info Mon Jun 11 21:09:25 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 12 Jun 2018 11:09:25 +1000
Subject: [Python-ideas] Fwd: Trigonometry in degrees
In-Reply-To: 
References: <20180611054831.GW12683@ando.pearwood.info>
 <4630676A-D766-4663-9A9C-753CF5A05145@mac.com>
Message-ID: <20180612010924.GB12683@ando.pearwood.info>

On Mon, Jun 11, 2018 at 01:18:10PM -0700, Chris Barker via Python-ideas
wrote:
> On Mon, Jun 11, 2018 at 10:24 AM, Michael Selik wrote:
>
> > Would sind and cosd make Euler's formula work correctly?
>
> Not trying to pick on you, but this question shows a key misunderstanding:
>
> There is nothing inherently more accurate in using degrees rather than
> radians for trigonometry.

Actually there is: using radians, the only "nice" angle that can be
represented exactly is 0. With degrees, we can represent a whole lot of
"nice" angles exactly. "Nice" is, of course, subjective, but most of us
would recognise that 36° represented as exactly 36.0 is nice but being
*approximately* represented as 0.6283185307179586 is not.

Using radians, we have two sources of rounding error:

- π cannot be represented exactly as a float, so we have to use a number
  pi which is ever-so-slightly off;

- plus the usual round-off error in the algorithm;

while using degrees, we only have the second one (since 180° *can* be
represented exactly, as the float 180.0). And with the degrees
implementation, we should be able to use correctly rounded roots for many
of our "nice" angles.

> And every computer math lib I've ever seen uses floating point radians
> for trig functions, so unless you're really going to implement trig from
> degrees from scratch

Well that's the whole point of the discussion.

> Oh, and radians are the more "natural" units (in fact unitless) for math,

Degrees are unit-less too. 180° = π radians. That's just a scaling factor
difference.

Unless you're doing symbolic maths, differentiating or integrating trig
functions, or certain geometric formulae which are "neater" in radians
than in degrees, there's no real advantage to radians. Degrees are simply
a much more practical unit of angle for practical work.

> and the only way that things like the Euler identity work.

It's not the *only* way.

https://math.stackexchange.com/questions/1368049/eulers-identity-in-degrees

> Which is why computational math libs use them.

Actually, more maths libraries than you might guess offer trig functions
in degrees, or scaled by pi, e.g.:

https://docs.oracle.com/cd/E19957-01/806-3568/ncg_lib.html

Julia and Matlab provide sind etc, and although I cannot find a reference
right now, I seem to recall the latest revision to the IEEE 754 standard
suggesting them as optional functions.

--
Steve
From steve at pearwood.info Mon Jun 11 21:17:07 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 12 Jun 2018 11:17:07 +1000
Subject: [Python-ideas] Fwd: Trigonometry in degrees
In-Reply-To: 
References: <20180608065903.GF12683@ando.pearwood.info>
 <20180611054831.GW12683@ando.pearwood.info>
Message-ID: <20180612011707.GC12683@ando.pearwood.info>

On Mon, Jun 11, 2018 at 02:12:06PM -0700, Chris Barker wrote:

> no, but you can say "y is as close to zero as I care about"

Of course you can. But we (the std lib) should not make that decision for
everybody. For some people 0.001 is "close enough to zero". For others,
1e-16 is not. We're not in the position to decide for everyone.
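If a caller wants that decision made, it is already a one-liner, e.g.:

py> import math
py> y = math.sin(math.pi)  # about 1.2e-16
py> math.isclose(y, 0.0, abs_tol=1e-12)  # *this* caller's tolerance
True

The tolerance belongs in the application, not in the library.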
--
Steve
From tim.peters at gmail.com Tue Jun 12 01:50:56 2018
From: tim.peters at gmail.com (Tim Peters)
Date: Tue, 12 Jun 2018 00:50:56 -0500
Subject: [Python-ideas] Fwd: Trigonometry in degrees
In-Reply-To: <20180612004835.GA12683@ando.pearwood.info>
References: <20180611054831.GW12683@ando.pearwood.info>
 <4630676A-D766-4663-9A9C-753CF5A05145@mac.com>
 <20180611183806.GZ12683@ando.pearwood.info>
 <5B1EFE4E.1050204@canterbury.ac.nz>
 <20180612004835.GA12683@ando.pearwood.info>
Message-ID: 

[Steven D'Aprano]
> ...
> The initial proposal is fine: a separate set of trig functions that take
> their arguments in degrees would have no unexpected surprises (only the
> expected ones). With a decent implementation i.e. not this one:
>
> # don't do this
> def sind(angle):
>     return math.sin(math.radians(angle))

But that's good enough for almost all real purposes. Indeed, except for
argument reduction, it's essentially how scipy's sindg and cosdg _are_
implemented:

https://github.com/scipy/scipy/blob/master/scipy/special/cephes/sindg.c

> and equivalent for cos, we ought to be able to get correctly rounded
> values for nearly all the "interesting" angles of the circle, without
> those pesky rounding issues caused by π not being exactly representable
> as a float.

If people are overly ;-) worried about tiny rounding errors, just compute
things with some extra bits of precision to absorb them. For example,
install `mpmath` and use this:

def sindg(d):
    import math, mpmath
    d = math.fmod(d, 360.0)
    if abs(d) == 180.0:
        return 0.0
    with mpmath.extraprec(12):
        return float(mpmath.sin(mpmath.radians(d)))

Then, e.g.,

>>> for x in (0, 30, 90, 150, 180, 210, 270, 330, 360):
...     print(x, sindg(x))
0 0.0
30 0.5
90 1.0
150 0.5
180 0.0
210 -0.5
270 -1.0
330 -0.5
360 0.0

Notes:

1. Python's float "%" is unsuitable for argument reduction; e.g.,

>>> -1e-14 % 360.0
360.0

`math.fmod` is suitable, because it's exact:

>>> math.fmod(-1e-14, 360.0)
-1e-14

2. Using a dozen extra bits of precision makes it very likely you'll get
the correctly rounded 53-bit result; it will almost certainly (barring
bugs in `mpmath`) always be good to less than 1 ULP.

3. Except for +-180. No matter how many bits of float precision
(including the number of bits used to approximate pi) are used,
converting that to radians can never yield the mathematical `pi`; and
sin(pi+x) is approximately equal to -x for tiny |x|; e.g., here with a
thousand bits:

>>> mpmath.mp.prec = 1000
>>> float(mpmath.sin(mpmath.radians(180)))
1.2515440597544546e-301

So +-180 is special-cased. For cosdg, +-{90, 270} would need to be
special-cased for the same reason.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
From greg.ewing at canterbury.ac.nz Tue Jun 12 02:40:23 2018
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 12 Jun 2018 18:40:23 +1200
Subject: [Python-ideas] Fwd: Trigonometry in degrees
In-Reply-To: 
References: <20180611054831.GW12683@ando.pearwood.info>
 <4630676A-D766-4663-9A9C-753CF5A05145@mac.com>
 <20180611183806.GZ12683@ando.pearwood.info>
 <5B1EFE4E.1050204@canterbury.ac.nz>
 <20180612004835.GA12683@ando.pearwood.info>
Message-ID: <5B1F6AD7.3050302@canterbury.ac.nz>

Tim Peters wrote:

> 1. Python's float "%" is unsuitable for argument reduction; e.g.,
>
> >>> -1e-14 % 360.0
> 360.0
>
> `math.fmod` is suitable, because it's exact:
>
> >>> math.fmod(-1e-14, 360.0)
> -1e-14

So why doesn't float % use math.fmod?
--
Greg
From marcidy at gmail.com Tue Jun 12 02:43:33 2018
From: marcidy at gmail.com (Matt Arcidy)
Date: Mon, 11 Jun 2018 23:43:33 -0700
Subject: [Python-ideas] Fwd: Trigonometry in degrees
In-Reply-To: 
References: <20180611054831.GW12683@ando.pearwood.info>
 <4630676A-D766-4663-9A9C-753CF5A05145@mac.com>
 <20180611183806.GZ12683@ando.pearwood.info>
 <5B1EFE4E.1050204@canterbury.ac.nz>
 <20180612004835.GA12683@ando.pearwood.info>
Message-ID: 

Sorry for top posting, but these aren't really opinions for the debate,
just information. I haven't seen them mentioned, and none grouped nicely
under someone's reply.

Number representation:

IEEE-754 doubles cannot represent pi correctly at any bit-depth (I mean,
obviously, but more seriously).

input pi  = 3.14159265358979323846264338327
output pi = 3.14159265358979311599796346854

The cutoff is the value everyone uses, which is what is in math.pi among
other places: 3.141592653589793

Any conversion using this value of pi is already wrong, and the input to
the function is already wrong. The function doesn't matter. It's always
underestimating if using pi. I'm clarifying this because it's separate
from any discussion of functions. More accurate or different functions
cannot use a better input pi.

Calculator: http://www.binaryconvert.com/convert_double.html
input pi source: https://www.piday.org/million/

Libraries:

Boost implements a cos_pi function; its algorithm will probably be useful
to look at.

glibc implements cos/sin as a look-up table, which is most likely where
any other implementation will end up, as it is a common endpoint. I've
seen this on different architectures and libraries. (sincostab.h if you
want to google glibc's table). Maybe there is a hilarious index lookup
off-by-1 there, I didn't look.

Tables are the basis for the calculation in many libraries on many
architectures, so I wanted to point that out to any function designers. A
new implementation may come down to calculating the right index
locations.

For a degree-based implementation, the same algorithm can be used with a
different table input that is calibrated to degrees. Likewise for a
pi-based implementation, the same but for the pi scale factor input.
Nothing needs to be designed other than a new lookup table calibrated for
the "base" of a degree, radian, or pi. It will have the same output
precision issues calculating index values, but at least the input will be
cleaner.

This is not me saying what should be done, just giving information that
may hopefully be useful.

Small note about python's math: the python math library does not
implement algorithms, it exposes the C functions. You can see C/C++ has
this exact issue performing the calculation there. As that is "the spec,"
the spec is defined with the error. The math library is technically
correct given its stated purpose and result. Technically correct, the
best kind of correct.

https://www.youtube.com/watch?v=hou0lU8WMgo

On Mon, Jun 11, 2018 at 10:53 PM Tim Peters wrote:
> [...]
From rosuav at gmail.com Tue Jun 12 02:50:56 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Tue, 12 Jun 2018 16:50:56 +1000
Subject: [Python-ideas] Fwd: Trigonometry in degrees
In-Reply-To: <5B1F6AD7.3050302@canterbury.ac.nz>
References: <20180611054831.GW12683@ando.pearwood.info>
 <4630676A-D766-4663-9A9C-753CF5A05145@mac.com>
 <20180611183806.GZ12683@ando.pearwood.info>
 <5B1EFE4E.1050204@canterbury.ac.nz>
 <20180612004835.GA12683@ando.pearwood.info>
 <5B1F6AD7.3050302@canterbury.ac.nz>
Message-ID: 

On Tue, Jun 12, 2018 at 4:40 PM, Greg Ewing wrote:
> Tim Peters wrote:
>
>> 1. Python's float "%" is unsuitable for argument reduction; e.g.,
>>
>> >>> -1e-14 % 360.0
>> 360.0
>>
>> `math.fmod` is suitable, because it's exact:
>>
>> >>> math.fmod(-1e-14, 360.0)
>> -1e-14
>
> So why doesn't float % use math.fmod?

https://docs.python.org/3/reference/expressions.html#binary-arithmetic-operations
https://docs.python.org/3/reference/expressions.html#id17
https://docs.python.org/3/reference/expressions.html#id18

(the latter two being footnotes from the section in the first link)

With real numbers, divmod (and thus the // and % operators) would always
return values such that:

div, mod = divmod(x, y):
1) div*y + mod == x
2) sign(mod) == sign(y)
3) 0 <= abs(mod) < abs(y)

But with floats, you can't guarantee all three of these.
The divmod function focuses on the first, guaranteeing the fundamental
arithmetic equality, but to do so, it sometimes has to bend the third one
and return mod==y.

There are times when it's better to sacrifice one than the other, and
there are other times when it's the other way around. We get the two
options.

ChrisA
From stephanh42 at gmail.com Tue Jun 12 03:02:43 2018
From: stephanh42 at gmail.com (Stephan Houben)
Date: Tue, 12 Jun 2018 09:02:43 +0200
Subject: [Python-ideas] Fwd: Trigonometry in degrees
In-Reply-To: 
References: <20180611054831.GW12683@ando.pearwood.info>
 <4630676A-D766-4663-9A9C-753CF5A05145@mac.com>
 <20180611183806.GZ12683@ando.pearwood.info>
 <5B1EFE4E.1050204@canterbury.ac.nz>
 <20180612004835.GA12683@ando.pearwood.info>
 <5B1F6AD7.3050302@canterbury.ac.nz>
Message-ID: 

Hi all,

I wrote a possible implementation of sindg:

https://gist.github.com/stephanh42/336d54a53b31104b97e46156c7deacdd

This code first reduces the angle to the [0,90] interval. After doing so,
it can be observed that the simple implementation
math.sin(math.radians(angle)) produces exact results for 0 and 90, and a
result already rounded to nearest for 60.

For 30 and 45, this simple implementation is one ulp too low. So I
special-case those to return the correct/correctly-rounded value instead.
Note that this does not affect monotonicity around those values.

So I am still unsure if this belongs in the stdlib, but if so, this is
how it could be done.

Stephan

2018-06-12 8:50 GMT+02:00 Chris Angelico:
> [...]

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
From desmoulinmichel at gmail.com Tue Jun 12 05:33:56 2018
From: desmoulinmichel at gmail.com (Michel Desmoulin)
Date: Tue, 12 Jun 2018 11:33:56 +0200
Subject: [Python-ideas] Add hooks to asyncio lifecycle
In-Reply-To: 
References: <8d13eb02-42a2-93a3-1387-8a16bf54c9fc@gmail.com>
 <18942f8f-679a-5481-6015-01ed22c08278@gmail.com>
 <2e0b61b4-d2a2-b7cc-483c-e51ef9b4377f@gmail.com>
 <94436b26-22e0-802b-e709-cde7d66e89dc@gmail.com>
Message-ID: <69f6425b-361a-85fa-d49c-4830d43a3aea@gmail.com>

> I still don't get it... If you have a framework you presumably have an
> entry point. Why can't you set up your policy in that entrypoint? Why
> would a user attempt to change the policy at runtime (you haven't
> listed examples of libraries that do this)?

Not at run time. But several libraries have several entry points that you
must use in the proper order and set up correctly. Locking helps you find
out when you have done an improper setup, without having to read the
complete documentation for every lib to make sure you didn't miss a
corner case.

It will also help project owners, because people will come to the bug
tracker saying "I have LockedPolicyError" when they do that, which is a
lot easier to solve than "I have an AttributeError on a None object".

> I see a lot of "I want to
> protect users from ..." arguments but I haven't yet seen "this and
> that happened in production and we haven't been able to debug what
> happened for a while".

It's like with asyncio.get_event_loop. We hadn't seen the problem for a
long time. Then eventually, people started to use several loops in
several threads, and they said that it caused a problem. It's the same
here. I'm advising that we don't wait for people to have the problem to
solve it.

> Do you handle cases when people install a blocking logging handler in
> their async application?

We can't prevent that without a great cost. It's a terribly complicated
problem, especially since logging is blocking and thread safe by design.
There is not even good documentation on the best practices for using
logging with asyncio. Most people have no idea how to do it (although in
my experience, most devs use the logging module incorrectly outside of
asyncio too).

Twisted rewrote an entire logging system to solve that problem.

Locking the policy is not a great cost. And it's about creating a user
friendly API.

> Do you handle cases when a malfunctioning
> sys.excepthook is installed?

When I override sys.excepthook, I take great care to back up the old hook
and call it from my hook. And I really wish the stdlib had something
built in to do that cleanly, because I'm pretty sure the libs overriding
sys.excepthook all do it in a different way, if they do it at all. Every
time I use something to help with stack traces (coloring, auto logging,
etc.), I have to read the source code to check they are compatible. I
should not have to do that.

> What about cases when users accidentally
> import gevent somewhere in their asyncio application and it
> monkeypatches the 'socket' module (this is a real horror story, by the
> way)?

Monkey patching is not officially supported by anything, anywhere, so
this argument is moot. But it does open another door for my argument:
currently the only way to make sure nobody erases our policy is to monkey
patch the function that sets the policy. Do we really want monkey
patching to be the only solution to this problem?

> My point is that there are so many things that users can do
> that will break any framework, be it asyncio or django or trio.
Yes, eventually the question is how easily and cleanly we can provide a
solution to avoid the problem, or help to debug it. Django is a good
example: it runs sanity checks for common errors on your models at
startup, to ease your life as a dev, and the lives of the maintainers and
their bug tracker.

> This sounds like "if something can happen it will happen" kind of
> thing,

The reason I bring get_event_loop to the table is that I knew years ago,
when I read that source code, that it would come back to bite us. And I
said nothing, because I assumed the core devs would have thought of that
and that I was mistaken.

But even if I had spoken up then, my experience with Python-ideas is that
most of the time, people would have told me "no". They would have told me
that "nobody is going to do that". I'm getting used to it.

It took a long time for "path should inherit from strings" to finally get
enough people involved that we got a solution to the problem (which ended
up being __fspath__). At the beginning, the answer was "no, there is no
problem".

So basically, now I speak up, knowing that people will say "no". But at
least I said it. Maybe in 2 years we will go back to that and implement
it, or another solution to this problem.

> but I haven't yet seen good examples of real code that suffers
> from non-locked policies. Using the nurseries example doesn't count,
> as this is something that we want to have as a builtin functionality
> in 3.8.

So we will be able to use that in 2 years.

> Locking policies can lead to more predictable user experience; OTOH
> what happens if, say, aiohttp decides to lock its policy to use uvloop
> and thus make it impossible for its users to use tokio or some other
> loop implementation?

That would be very, very good. It would make explicit that aiohttp is:

- using a custom policy
- assuming it's the only one
- not allowing another lib to use one
- relying on it to work
- providing no mechanism to let policies cohabit

This would mean users would learn very quickly and easily about those
issues, report them to the bug tracker if they matter, and allow a debate
and a solution. It's not the case for aiohttp though.

Another benefit is that this is a universal process. Once we get locking
in place, all libs will quickly realize if they do something wrong, while
right now we just can't know.

Technically, Go and JS already lock those... by not providing access to
any of the machinery. So they don't have the problem. I wish the event
loop never had a Python API in the first place, so that the only way to
extend it would be in C. But this ship has sailed.

Trio's nursery is a good example of defensive programming too: can you
prove that a lib is calling ensure_future() without properly managing its
lifecycle? Yes, but it requires a lot of work. Can people use
ensure_future correctly? Yes, but we know some will mess up. So it's
easier to provide a proper way to do things.

Locking the policy is providing a clean and easy way to do things.
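To be concrete, this is the kind of behavior I have in mind -- a rough
sketch, with placeholder names, and today you can only approximate it by
monkey patching asyncio yourself, which is exactly my point:

import asyncio

_policy_locked = False

def lock_policy():
    # hypothetical API: after this call, changing the policy raises
    global _policy_locked
    _policy_locked = True

_original_set_policy = asyncio.set_event_loop_policy

def _checked_set_policy(policy):
    if _policy_locked:
        # stand-in for the LockedPolicyError mentioned above
        raise RuntimeError("the event loop policy has been locked")
    _original_set_policy(policy)

asyncio.set_event_loop_policy = _checked_set_policy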
From njs at pobox.com Tue Jun 12 06:41:11 2018 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 12 Jun 2018 03:41:11 -0700 Subject: [Python-ideas] Fwd: Trigonometry in degrees In-Reply-To: References: <20180611054831.GW12683@ando.pearwood.info> <4630676A-D766-4663-9A9C-753CF5A05145@mac.com> <20180611183806.GZ12683@ando.pearwood.info> <5B1EFE4E.1050204@canterbury.ac.nz> <20180612004835.GA12683@ando.pearwood.info> <5B1F6AD7.3050302@canterbury.ac.nz> Message-ID: On Tue, Jun 12, 2018, 00:03 Stephan Houben wrote: > Hi all, > > I wrote a possible implementation of sindg: > > https://gist.github.com/stephanh42/336d54a53b31104b97e46156c7deacdd > > This code first reduces the angle to the [0,90] interval. > After doing so, it can be observed that the simple implementation > math.sin(math.radians(angle)) > produces exact results for 0 and 90, and a result already rounded to > nearest for > 60. > You observed this on your system, but math.sin uses the platform libm, which might do different things on other people's systems. > For 30 and 45, this simple implementation is one ulp too low. > So I special-case those to return the correct/correctly-rounded value > instead. > Note that this does not affect monotonicity around those values. > Again, monotonicity is preserved on your system, but it might not be on others. It's not clear that this matters, but then it's not clear that any of this matters... -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From wes.turner at gmail.com Tue Jun 12 08:27:36 2018 From: wes.turner at gmail.com (Wes Turner) Date: Tue, 12 Jun 2018 08:27:36 -0400 Subject: [Python-ideas] Trigonometry in degrees In-Reply-To: References: <20180608054530.GC12683@ando.pearwood.info> <20180608065903.GF12683@ando.pearwood.info> Message-ID: Sym: SymPy, SymEngine, PySym, SymCXX, Diofant (re: \pi, symbolic computation and trigonometry instead of surprisingly useful piecewise optimizations) On Fri, Jun 8, 2018 at 10:09 PM Wes Turner wrote: > # Python, NumPy, SymPy, mpmath, sage trigonometric functions > https://en.wikipedia.org/wiki/Trigonometric_functions > > ## Python math module > https://docs.python.org/3/library/math.html#trigonometric-functions > - degrees(radians): Float degrees > - radians(degrees): Float degrees > > ## NumPy > > https://docs.scipy.org/doc/numpy/reference/routines.math.html#trigonometric-functions > - degrees(radians) : List[float] degrees > - rad2deg(radians): List[float] degrees > - radians(degrees) : List[float] radians > - deg2rad(degrees): List[float] radians > > https://docs.scipy.org/doc/numpy/reference/generated/numpy.sin.html > > # Symbolic computation > > ## SymPy > > http://docs.sympy.org/latest/modules/functions/elementary.html#sympy-functions-elementary-trigonometric > > http://docs.sympy.org/latest/modules/functions/elementary.html#trionometric-functions > > - sympy.mpmath.degrees(radians): Float degrees > - sympy.mpmath.radians(degrees): Float radians > > - https://stackoverflow.com/questions/31072815/cosd-and-sind-with-sympy > - cosd, sind > - > https://stackoverflow.com/questions/31072815/cosd-and-sind-with-sympy#comment50176770_31072815 > > > Let x, theta, phi, etc. be Symbols representing quantities in > radians. Keep a list of these symbols: angles = [x, theta, phi]. 
Then, at > the very end, use y.subs([(angle, angle*pi/180) for angle in angles]) to > change the meaning of the symbols to degrees" > http://docs.sympy.org/latest/tutorial/simplification.html#trigonometric-simplification https://github.com/sympy/sympy/blob/master/sympy/functions/elementary/trigonometric.py https://github.com/sympy/sympy/blob/master/sympy/functions/elementary/tests/test_trigonometric.py# https://github.com/sympy/sympy/blob/master/sympy/simplify/trigsimp.py https://github.com/sympy/sympy/blob/master/sympy/simplify/tests/test_trigsimp.py https://github.com/sympy/sympy/blob/master/sympy/integrals/trigonometry.py https://github.com/sympy/sympy/blob/master/sympy/integrals/tests/test_trigonometry.py https://github.com/sympy/sympy/blob/master/sympy/utilities/tests/test_wester.py https://github.com/sympy/sympy/blob/master/sympy/utilities/tests/test_wester.py#L593 (I. Trigonometry) ## Sym Src: https://github.com/bjodah/sym PyPI: https://pypi.org/project/sym/ > sym provides a unified wrapper to some symbolic manipulation libraries in Python. It ## SymEngine - Src: https://github.com/symengine/symengine - Src: https://github.com/symengine/symengine.py - Docs: https://github.com/symengine/symengine/blob/master/doc/design.md - SymEngine / SymPy compatibility tests: https://github.com/symengine/symengine.py/blob/master/symengine/tests/test_sympy_compat.py ## Diofant Src: https://github.com/diofant/diofant https://diofant.readthedocs.io/en/latest/tutorial/intro.html https://diofant.readthedocs.io/en/latest/tutorial/basics.html#substitution https://diofant.readthedocs.io/en/latest/tutorial/simplification.html#trigonometric-functions from diofant import symbols, pi x,y,z,_pi = symbols('x y z _pi') expr = pi**x # TODO: see diofant/tests/test_wester.py#L511 expr.subs(x, 1e11) print(operator.sub( expr.subs(pi, 3.14), expr.subs(pi, 3.14159265))) assert expr.subs(pi, 3.14) != expr.subs(pi, 3.14159265) print(expr.subs(pi, 3.14159).evalf(70)) - CAS capability tests: https://github.com/diofant/diofant/blob/master/diofant/tests/test_wester.py > """ Tests from Michael Wester's 1999 paper "Review of CAS mathematical > capabilities". > http://www.math.unm.edu/~wester/cas/book/Wester.pdf > See also http://math.unm.edu/~wester/cas_review.html for detailed output of > each tested system. """ https://github.com/diofant/diofant/blob/79ae584e949a08/diofant/tests/test_wester.py#L511 # I. 
> @pytest.mark.xfail
> def test_I1():
>     assert tan(7*pi/10) == -sqrt(1 + 2/sqrt(5))
>
> @pytest.mark.xfail
> def test_I2():
>     assert sqrt((1 + cos(6))/2) == -cos(3)
>
> def test_I3():
>     assert cos(n*pi) + sin((4*n - 1)*pi/2) == (-1)**n - 1
>
> def test_I4():
>     assert cos(pi*cos(n*pi)) + sin(pi/2*cos(n*pi)) == (-1)**n - 1
>
> @pytest.mark.xfail
> def test_I5():
>     assert sin((n**5/5 + n**4/2 + n**3/3 - n/30) * pi) == 0

diofant.sin.eval() has a number of interesting conditionals in there:
https://github.com/diofant/diofant/blob/master/diofant/functions/elementary/trigonometric.py#L200

The tests for diofant.functions.elementary.trigonometric likely have a
number of helpful tests for implementing methods dealing with pi and
trigonometric identities:

https://github.com/diofant/diofant/blob/master/diofant/functions/elementary/tests/test_trigonometric.py
https://github.com/diofant/diofant/blob/master/diofant/simplify/trigsimp.py
https://github.com/diofant/diofant/blob/master/diofant/simplify/tests/test_trigsimp.py
https://github.com/diofant/diofant/blob/master/diofant/integrals/tests/test_trigonometry.py
https://github.com/diofant/diofant/blob/master/diofant/functions/elementary/tests/test_trigonometric.py

## mpmath

> http://mpmath.org/doc/current/functions/trigonometric.html
> - sympy.mpmath.degrees(radians): Float degrees
> - sympy.mpmath.radians(degrees): Float radians

## Sage

https://doc.sagemath.org/html/en/reference/functions/sage/functions/trig.html

On Friday, June 8, 2018, Robert Vanden Eynde <
robertvandeneynde at hotmail.com> wrote:

>> - Thanks for pointing out a language (Julia) that already had a naming
>> convention. Interestingly, they don't have an atan2d function. Choosing
>> the same convention as another language is a big plus.
>>
>> - Adding trig functions using floats between 0 and 1 is nice; currently
>> one needs to do sin(tau * t), which is not so bad (from math import tau;
>> tau sounds like turn).
>>
>> - Julia has sinpi for sin(pi*x); one could have sintau(x) for
>> sin(tau*x) or sinturn(x).
>>
>> Grads are in the idea of turns but with more problems; as you guys
>> said, grads are used by no one, but turns are more useful. sin(tau * t)
>> For The Win.
>>
>> - Even though people mentioned 1/6 not being exact, so that advantage
>> over radians isn't that obvious?
>>
>> from math import sin, tau
>> from fractions import Fraction
>> sin(Fraction(1,6) * tau)
>> sindeg(Fraction(1,6) * 360)
>>
>> These already work today by the way.
>>
>> - As you guys pointed out, using radians implies knowing a little bit
>> about floating point arithmetic and its limitations. Integers are more
>> simple and less error prone. Of course it's useful to know about
>> floats, but in many cases it's not necessary to learn about it right
>> away; young students just want their player in the game to move in a
>> straight line when angle = 90.
>>
>> - sin(pi/2) == 1 but cos(pi/2) != 0 and sin(3*pi/2) != 1, so sin(pi/2)
>> is kind of an exception.
>>
>> On Fri, Jun 8, 2018 at 09:11, Steven D'Aprano wrote:
>>
>>> On Fri, Jun 08, 2018 at 03:55:34PM +1000, Chris Angelico wrote:
>>> > On Fri, Jun 8, 2018 at 3:45 PM, Steven D'Aprano wrote:
>>> > > Although personally I prefer the look of d as a prefix:
>>> > >
>>> > > dsin, dcos, dtan
>>> > >
>>> > > That's more obviously pronounced "d(egrees) sin" etc rather than
>>> > > "sined" "tanned" etc.
>>> >
>>> > Having it as a suffix does have one advantage.
>>> > The math module would need a hyperbolic sine function which accepts
>>> > an argument in degrees; and then, like Charles Napier [1], Python
>>> > would finally be able to say "I have sindh".
>>>
>>> Ha ha, nice pun, but no, the hyperbolic trig functions never take
>>> arguments in degrees. Or radians for that matter. They are "hyperbolic
>>> angles", which some electrical engineering text books refer to as
>>> "hyperbolic radians", but all the maths text books I've seen don't
>>> call them anything other than a real number. (Or sometimes a complex
>>> number.)
>>>
>>> But for what it's worth, there is a correspondence of a sort between
>>> the hyperbolic angle and circular angles. The circular angle going
>>> from 0 to 45° corresponds to the hyperbolic angle going from 0 to
>>> infinity.
>>>
>>> https://en.wikipedia.org/wiki/Hyperbolic_angle
>>> https://en.wikipedia.org/wiki/Hyperbolic_function
>>>
>>> > [1] Apocryphally, alas.
>>>
>>> Don't ruin a good story with facts ;-)
>>>
>>> --
>>> Steve

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
From mikhailwas at gmail.com Tue Jun 12 10:54:43 2018
From: mikhailwas at gmail.com (Mikhail V)
Date: Tue, 12 Jun 2018 17:54:43 +0300
Subject: [Python-ideas] Operator for inserting an element into a list
Message-ID: 

I think it would be logical to have the insert operator for lists.
Similar to the list extend operator +=, it could use one of the augmented
assignment operators, e.g. /=.

L = ["aa"]

L[0] /= "bb"

-> ["bb", "aa"]

L[0] /= [1,2]

-> [[1,2], "aa"]

etc.

Without index it would work like append():

L /= "bb"

#-> ["aa", "bb"]

As for possible spellings I like this one as well:

L[i] ^= e

The proposed solution is meant to have insert() method semantics, plus it
would cover the append() method nicely.

Insert and append are very frequent operations, so I wonder if there was
already a related suggestion? Is there some technical problem with
implementing this?

Note that there is a trick to 'insert' an element with slicing syntax,
e.g.:

L[0:0] = [[1,2]]

-> [[1,2], "aa"]

L[0:0] = ["bb"]

-> ["bb", "aa"]

The trick is to put brackets around the element and so it works as
insert(). Though the additional brackets look really confusing for this
purpose, so I don't feel like using this seriously.

M
From mike at selik.org Tue Jun 12 11:00:27 2018
From: mike at selik.org (Michael Selik)
Date: Tue, 12 Jun 2018 08:00:27 -0700
Subject: [Python-ideas] Operator for inserting an element into a list
In-Reply-To: 
References: 
Message-ID:
> > Without index it would work like append(): > > L /= "bb" > > #-> ["aa", "bb"] > > > As for possible spellings I like this one as well: > > L[i] ^= e > > The proposed solution is meant to have insert() method semantics, > plus it would cover append() method nicely. > > Insert and append are very frequent operations, so I wonder > if there was already related suggestion? Is there some technical > problem with implementing this? > > > Note that there is a trick to 'insert' an element with slicing syntax, > e.g.: > > L[0:0] = [[1,2]] > > -> [[1,2], "aa"] > > > L[0:0] = ["bb"] > > -> ["bb", "aa"] > > The trick is to put brackets around the element and so it works as > insert(). > Though additional brackets look really confusing for this purpose, so I > don't > feel like using this seriously. > > > M > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mikhailwas at gmail.com Tue Jun 12 11:17:04 2018 From: mikhailwas at gmail.com (Mikhail V) Date: Tue, 12 Jun 2018 18:17:04 +0300 Subject: [Python-ideas] Operator for inserting an element into a list In-Reply-To: References: Message-ID: On Tue, Jun 12, 2018 at 5:54 PM, Mikhail V wrote: > I think it would be logical to have the insert operator for lists. > Similar to list extend operator += , it could use one of augmented > assignment operators, e,g, /=. > > L = ["aa"] > > L[0] /= "bb" > > -> ["bb", "aa"] > > L[0] /= [1,2] > > -> [[1,2], "aa"] > > etc. > > Without index it would work like append(): Oops Sorry for confusion, I inserted wrong examples here: The examples should be : L[0:0] /= "bb" L[0:0] /= [1,2] ... L[i:i] ^= e Of course. Because L[i] /= is an operation on list element, thus already working syntax. From storchaka at gmail.com Tue Jun 12 11:29:37 2018 From: storchaka at gmail.com (Serhiy Storchaka) Date: Tue, 12 Jun 2018 18:29:37 +0300 Subject: [Python-ideas] Operator for inserting an element into a list In-Reply-To: References: Message-ID: 12.06.18 17:54, Mikhail V ????: > I think it would be logical to have the insert operator for lists. > Similar to list extend operator += , it could use one of augmented > assignment operators, e,g, /=. > > L = ["aa"] > > L[0] /= "bb" > > -> ["bb", "aa"] > > L[0] /= [1,2] > > -> [[1,2], "aa"] > > etc. > > Without index it would work like append(): > > L /= "bb" > > #-> ["aa", "bb"] > > > As for possible spellings I like this one as well: > > L[i] ^= e > > The proposed solution is meant to have insert() method semantics, > plus it would cover append() method nicely. > > Insert and append are very frequent operations, so I wonder > if there was already related suggestion? Is there some technical > problem with implementing this? Nice idea for your language. From tim.peters at gmail.com Tue Jun 12 11:51:37 2018 From: tim.peters at gmail.com (Tim Peters) Date: Tue, 12 Jun 2018 10:51:37 -0500 Subject: [Python-ideas] Fwd: Trigonometry in degrees In-Reply-To: References: <20180611054831.GW12683@ando.pearwood.info> <4630676A-D766-4663-9A9C-753CF5A05145@mac.com> <20180611183806.GZ12683@ando.pearwood.info> <5B1EFE4E.1050204@canterbury.ac.nz> <20180612004835.GA12683@ando.pearwood.info> <5B1F6AD7.3050302@canterbury.ac.nz> Message-ID: > > > [Tim] >> 1. 
Python's float "%" is unsuitable for argument reduction; e.g., > >> > >> >>> -1e-14 % 360.0 > >> 360.0 > >> > >> `math.fmod` is suitable, because it's exact: > >> > >> >>> math.fmod(-1e-14, 360.0) > >> -1e-14 > > [Greg Ewing] > So why doesn't float % use math.fmod? > [Chris Angelico] > > https://docs.python.org/3/reference/expressions.html#binary-arithmetic-operations > https://docs.python.org/3/reference/expressions.html#id17 > https://docs.python.org/3/reference/expressions.html#id18 > > (the latter two being footnotes from the section in the first link) > > With real numbers, divmod (and thus the // and % operators) would > always return values such that: > > div, mod = divmod(x, y): > 1) div*y + mod == x > 2) sign(mod) == sign(y) > 3) 0 <= abs(mod) < abs(y) > > But with floats, you can't guarantee all three of these. The divmod > function focuses on the first, guaranteeing the fundamental arithmetic > equality, but to do so, it sometimes has to bend the third one and > return mod==y. > > It's more that #2 is viewed as fundamental (because that's most useful for positive integer y), and _given that_ sometimes results are fiddled to keep #1 approximately true, and strict inequality in #3 may be sacrificed. For `fmod`, sign(mod) == sign(x) instead. >>> -2 % 3 1 >>> -2.0 % 3.0 1.0 >>> math.fmod(-2.0, 3.0) -2.0 All mod functions, m(x, y), strive to return a result that's mathematically exactly equal to x-n*y (for some mathematical integer `n` that may not even be representable in the programming language). `fmod()` is exact in that sense, but Python's floating "%" may not be. and no float scheme such that sign(m(x, y)) = sign(y) can be (see the original example at the top: the only mathematical integer `n` such that the mathematical -1e-14 - n*360.0 is exactly representable as a double is n==0). The most useful mod function for floats _as floats_ would actually satisfy abs(m(x, y)) <= abs(y) / 2 That can be done exactly too - but then the sign of the result has approximately nothing to do with the signs of the arguments. -------------- next part -------------- An HTML attachment was scrubbed... URL: From clint.hepner at gmail.com Tue Jun 12 12:42:24 2018 From: clint.hepner at gmail.com (Clint Hepner) Date: Tue, 12 Jun 2018 12:42:24 -0400 Subject: [Python-ideas] Operator for inserting an element into a list In-Reply-To: References: Message-ID: > On 2018 Jun 12 , at 10:54 a, Mikhail V wrote: > > I think it would be logical to have the insert operator for lists. > Similar to list extend operator += , it could use one of augmented > assignment operators, e,g, /=. > > L = ["aa"] > > L[0] /= "bb" > > -> ["bb", "aa"] > > L[0] /= [1,2] > > -> [[1,2], "aa"] -1. There's not much about this that is logical, no matter how much you want an insertion operator. Even if L[0] /= "bb" worked, then logically so should L[0] = L[0] / "bb". However, there is no sense in which L[0] / "bb" by itself has any meaning, and what would L[1] = L[0] / "bb" mean? And finally, L[0] /= x (and really, every other augmented operator) *already has* a meaning: >>> L = [10] >>> L[0] /= 2 >>> L [5] > > Note that there is a trick to 'insert' an element with slicing syntax, e.g.: > > L[0:0] = [[1,2]] > > -> [[1,2], "aa"] > > > L[0:0] = ["bb"] > > -> ["bb", "aa"] > > The trick is to put brackets around the element and so it works as insert(). > Though additional brackets look really confusing for this purpose, so I don't > feel like using this seriously. 
It's no more confusing than co-opting an unrelated operator to do the
same thing.

--
Clint

From mikhailwas at gmail.com  Tue Jun 12 14:08:22 2018
From: mikhailwas at gmail.com (Mikhail V)
Date: Tue, 12 Jun 2018 21:08:22 +0300
Subject: [Python-ideas] Operator for inserting an element into a list
In-Reply-To:
References:
Message-ID:

On Tue, Jun 12, 2018 at 7:42 PM, Clint Hepner wrote:
>
>> On 2018 Jun 12 , at 10:54 a, Mikhail V wrote:
>>
>> I think it would be logical to have the insert operator for lists.
>> Similar to list extend operator += , it could use one of augmented
>> assignment operators, e,g, /=.
>>
>> L = ["aa"]
>>
>> L[0] /= "bb"
>>
>> -> ["bb", "aa"]
>>
>> L[0] /= [1,2]
>>
>> -> [[1,2], "aa"]
>
> -1. There's not much about this that is logical, no matter how much
> you want an insertion operator. Even if L[0] /= "bb" worked, then logically
> so should L[0] = L[0] / "bb". However, there is no sense in which L[0] / "bb"
> by itself has any meaning, and what would L[1] = L[0] / "bb" mean?
>
> And finally, L[0] /= x (and really, every other augmented operator) *already has* a meaning:
>

Hi Clint, (and others),

I must say it is a misunderstanding due to the wrong examples that I
pasted into the original post (I had two cloned texts in my text editor
and copied the wrong one). I posted a correction just 10 minutes after
the original post; see that post. Sorry for the confusion!

So the idea was about an insert/append operator, which would use an
augmented operator. The operator may be
/= or ^=. (I like ^= more, so I'll use it in the examples here).

L = [1,2,3]

L[0:0] ^= 0

-> [0,1,2,3]

L[0:0] ^= -1

-> [-1, 0, 1, 2, 3]

L ^= 4    (without index, it works as append() )

-> [-1, 0, 1, 2, 3, 4]

As for your question, what would:

List1[a:b] = List1[c:d] ^ var

mean? (is that what you asked?)
Well, I think this would mean simply:
- first append var to List1[c:d],
- then replace the List1[a:b] part with the result.

So at least L ^= 4 would make sense as L = L ^ 4.
Actually the current semantics of += for lists:

L += var
and
L = L + var

are different, so it seems to me they were not meant to be
bound together.

>> Note that there is a trick to 'insert' an element with slicing syntax, e.g.:
>>
>> L[0:0] = [[1,2]]
>>
>> -> [[1,2], "aa"]
>>
>> L[0:0] = ["bb"]
>>
>> -> ["bb", "aa"]
>>
>> The trick is to put brackets around the element and so it works as insert().
>> Though additional brackets look really confusing for this purpose, so I don't
>> feel like using this seriously.
>
> It's no more confusing than co-opting an unrelated operator to do the same thing.

If the intention is to insert an element, indeed it is confusing, for
me at least:

L[0:0] = "bb"
-> ["b","b","aa"]

From mike at selik.org  Tue Jun 12 15:25:30 2018
From: mike at selik.org (Michael Selik)
Date: Tue, 12 Jun 2018 12:25:30 -0700
Subject: [Python-ideas] Operator for inserting an element into a list
In-Reply-To:
References:
Message-ID:

On Tue, Jun 12, 2018 at 11:08 AM Mikhail V wrote:

> On Tue, Jun 12, 2018 at 7:42 PM, Clint Hepner
> wrote:
>
> So the idea was about an insert/append operator, which would use an
> augmented operator. The operator may be
> /= or ^=. (I like ^= more, so I'll use it in the examples here).
>

The "/" operator means divide. How do you divide a list?
The "^" operator means exclusive-or. Again, strange concept for a list.

Yes, there are examples of Python (mis)using operators like "%" for
string interpolation. Note the difficulties that caused; we now have 3
ways to interpolate. It looked cool at the time, but happily we have
f-strings now.
> L += var
> and
> L = L + var
>
> are different, so it seems to me they were not meant to be
> bound together.
>

One is mutation, the other isn't, but aside from that, the result is
equivalent.

> If the intention is to insert an element, indeed it is confusing, for
> me at least:
>
> L[0:0] = "bb"
> -> ["b","b","aa"]
>

That's, like, just your opinion, Man.

Kidding aside, I think you'll find it's natural once you get used to
slice assignment. The slice assignment iterates over the right-hand
argument and inserts each element. It overwrites the existing slice if
any.

In this example, you've mixed issues. It looks like you've gotten
confused by string iteration. When you loop a string, you get length-1
strings.

From tjreedy at udel.edu  Tue Jun 12 15:26:22 2018
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 12 Jun 2018 15:26:22 -0400
Subject: [Python-ideas] Operator for inserting an element into a list
In-Reply-To:
References:
Message-ID:

On 6/12/2018 10:54 AM, Mikhail V wrote:
> I think it would be logical to have the insert operator for lists.
> Similar to list extend operator += , it could use one of augmented
> assignment operators, e,g, /=.
...
> Note that there is a trick to 'insert' an element with slicing syntax, e.g.:

This is not a 'trick'. It is a particular case of a general operation:
replacing a length m slice of a list with a sequence of length n. Both
m and n can be 0. The replacement sequence can be any iterable.

>>> l = [1,2,3]
>>> l[0:0] = 'abc'
>>> l
['a', 'b', 'c', 1, 2, 3]

> L[0:0] = [[1,2]]
>
> -> [[1,2], "aa"]
>
> L[0:0] = ["bb"]
>
> -> ["bb", "aa"]

In these examples, m and n are 0 and 1.

> The trick is to put brackets around the element and so it works as insert().

Again, not a trick. Putting brackets around the element makes it a
sequence of length 1. To possibly be less confusing, you could use (,)

>>> l[0:0] = ([1,2],)
>>> l
[[1, 2], 'aa']

> Though additional brackets look really confusing for this purpose,
> so I don't feel like using this seriously.

Learning about lists means learning about slice assignment: replace a
sublist with another sequence.

--
Terry Jan Reedy

From nas-python-ideas at arctrix.com  Tue Jun 12 17:46:31 2018
From: nas-python-ideas at arctrix.com (Neil Schemenauer)
Date: Tue, 12 Jun 2018 15:46:31 -0600
Subject: [Python-ideas] Link accepted PEPs to their whatsnew section?
Message-ID: <20180612214631.sje6xkm4po4m2zts@python.ca>

I'm testing "Data Classes" for Python 3.7. Awesome new feature,
BTW. The PEP is the first search result when I look up "dataclass
python". Given that the PEP is not the best documentation for an
end user, I wonder if we should have a link in the header section of
the PEP that goes to better documentation. We could link accepted
PEPs to their section of the whatsnew. Or, link to the language
documentation that describes the feature.

From rosuav at gmail.com  Tue Jun 12 18:51:57 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Wed, 13 Jun 2018 08:51:57 +1000
Subject: [Python-ideas] Operator for inserting an element into a list
In-Reply-To:
References:
Message-ID:

On Wed, Jun 13, 2018 at 5:25 AM, Michael Selik wrote:
> On Tue, Jun 12, 2018 at 11:08 AM Mikhail V wrote:
>>
>> On Tue, Jun 12, 2018 at 7:42 PM, Clint Hepner
>> wrote:
>>
>> So the idea was about an insert/append operator, which would use an
>> augmented operator. The operator may be
>> /= or ^=. (I like ^= more, so I'll use it in the examples here).
>
> The "/" operator means divide. How do you divide a list?
> The "^" operator means exclusive-or. Again, strange concept for a list.

I agree about XORing a list, but dividing a list could conceivably be
implemented to split a list into parts. For instance:

low, high = list(range(10)) / 2

But it wouldn't mean "insert". Also, I can't imagine an augmented
division operator being useful, but others may disagree.

ChrisA

From mikhailwas at gmail.com  Tue Jun 12 19:09:16 2018
From: mikhailwas at gmail.com (Mikhail V)
Date: Wed, 13 Jun 2018 02:09:16 +0300
Subject: [Python-ideas] Operator for inserting an element into a list
In-Reply-To:
References:
Message-ID:

On Tue, Jun 12, 2018 at 10:25 PM, Michael Selik wrote:
> On Tue, Jun 12, 2018 at 11:08 AM Mikhail V wrote:
>>
>> On Tue, Jun 12, 2018 at 7:42 PM, Clint Hepner
>> wrote:
>>
>> So the idea was about an insert/append operator, which would use an
>> augmented operator. The operator may be
>> /= or ^=. (I like ^= more, so I'll use it in the examples here).
>
> The "/" operator means divide. How do you divide a list?
> The "^" operator means exclusive-or. Again, strange concept for a list.
>

The allowed standard characters are like that - they already mean
something, and there are so few of them (even fewer if you throw away
the ugly-looking ones).

For me the plus character + means sum of numbers. So for an array:

A += 1

means for me, unequivocally, increment each element of the array by 1.
And this:

A += B

means for me increment each element of A by the corresponding values
of B. BTW that is how the plus operator works for Numpy arrays.

So I don't think how logically this or that character suits an
operation is a precise thing. Everybody understands that overloading
of operators is for convenience and it's just a shortcut for some
frequent usage.

As I understand, overloading some operators for a certain object type
(list in this case) has relatively low cost in terms of implementation.

As for me - I'm fine with the append() method and use it all the time.
Frankly speaking, I was working on another syntax idea and it turned
out that actually augmented operators may be very useful for that
particular syntax design.

>
>> L += var
>> and
>> L = L + var
>>
>> are different, so it seems to me they were not meant to be
>> bound together.
>
> One is mutation, the other isn't, but aside from that, the result is
> equivalent.

You're right of course. I was just confused by the fact that

L += "aa"      # works
L = L + "aa"   # gives TypeError

But for consistent types both should work the same.

> I think you'll find it's
> natural once you get used to slice assignment.

I use slice assignment all the time with Numpy arrays, though for list
element appending I prefer the append() method.

M

From greg.ewing at canterbury.ac.nz  Tue Jun 12 19:15:07 2018
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 13 Jun 2018 11:15:07 +1200
Subject: [Python-ideas] Operator for inserting an element into a list
In-Reply-To:
References:
Message-ID: <5B2053FB.7020500@canterbury.ac.nz>

Mikhail V wrote:

> L[0:0] = ["bb"]
>
> -> ["bb", "aa"]
>
> The trick is to put brackets around the element and so it works as insert().
> Though additional brackets look really confusing for this purpose, so I don't
> feel like using this seriously.

I don't think it's all that confusing. It looks a bit cluttered when
the thing being inserted is itself a list literal, but that seems like
a rare case of something that's not all that common in the first place.
My feeling is that inserting is not a frequent enough operation
to warrant having its own operator, especially not when there
is already a syntax that does the same thing.

> Is there some technical problem with implementing this?

Yes, it doesn't fit into the current scheme for augmented
assignment operators. There are no special methods for
combining augmented assignments with slicing -- you couldn't
implement this just by adding an __ixor__ method to the list
type. There would need to be a new special method for "in-place
xor with slice", and the compiler would have to recognise this
combination and emit special bytecode for it.

That would raise the question of why ^= is getting this
special treatment but not any of the other augmented
assignments, and why not "in-place operation with
attribute" as well, and potentially we would end up with
two new entire sets of special methods for different
flavours of augmented assignments.

I really don't think we want to go there.

--
Greg

From mikhailwas at gmail.com  Tue Jun 12 19:42:47 2018
From: mikhailwas at gmail.com (Mikhail V)
Date: Wed, 13 Jun 2018 02:42:47 +0300
Subject: [Python-ideas] Operator for inserting an element into a list
In-Reply-To:
References:
Message-ID:

On Tue, Jun 12, 2018 at 10:26 PM, Terry Reedy wrote:
> On 6/12/2018 10:54 AM, Mikhail V wrote:
>> Though additional brackets look really confusing for this purpose,
>> so I don't feel like using this seriously.
>
> Learning about lists means learning about slice assignment: replace a
> sublist with another sequence.
>

Yes I see, actually that is what I am saying - slice assignment has
_replace_ semantics.
If that were a list method, I suppose it would work something like:

L = L.replace(begin, end, L2)

So that makes perfect sense.

But returning to append/insert.
So appending, of course, is an extremely frequent operation;
inserting at the beginning of the list may also be not rare.
Concatenation, in my opinion, might be less frequent.

Writing values from one array (or slice) to another is a quite frequent
operation - that means slices of the same size.
As for _replacing_ slices of a different size, I personally have never
used it. OTOH it smells like a generalization.

Anyway, I prefer to look at it through 'syntax glass'.
Here is something ubiquitous:

L.append("string1")
L.append("string2")
L.append("string3")
L.append([1,2,3])

Could be written:

L ^= "string1"
L ^= "string2"
L ^= "string3"
L ^= [1,2,3]

This kind of thing is in basically every second Python script.
I'm ok with append() actually, but it just asks for a shortcut.
For the ^ character - yes it looks strange.

And writing it like this:

L += ["string1"]
L += ["string2"]
L += ["string3"]
L += [[1,2,3]]

Sorry, no, that would be too much for me, also this needs some real
mental training so as not to misinterpret these.

Append() is just fine, though it's a pity there is no shortcut operator.

M

From cs at cskk.id.au  Tue Jun 12 20:24:15 2018
From: cs at cskk.id.au (Cameron Simpson)
Date: Wed, 13 Jun 2018 10:24:15 +1000
Subject: [Python-ideas] Operator for inserting an element into a list
In-Reply-To:
References:
Message-ID: <20180613002415.GA69171@cskk.homeip.net>

On 13Jun2018 02:42, Mikhail V wrote:
>On Tue, Jun 12, 2018 at 10:26 PM, Terry Reedy wrote:
>> On 6/12/2018 10:54 AM, Mikhail V wrote:
>>> Though additional brackets look really confusing for this purpose,
>>> so I don't feel like using this seriously.
>>
>> Learning about lists means learning about slice assignment: replace a
>> sublist with another sequence.
>
>Yes I see, actually that is what I am saying - slice assignment has
>_replace_ semantics.

Yes, but note that replacing an _empty_ part of the list _is_ an insert!

Cheers,
Cameron Simpson

From mike at selik.org  Tue Jun 12 21:04:41 2018
From: mike at selik.org (Michael Selik)
Date: Tue, 12 Jun 2018 18:04:41 -0700
Subject: [Python-ideas] Operator for inserting an element into a list
In-Reply-To:
References:
Message-ID:

On Tue, Jun 12, 2018 at 4:43 PM Mikhail V wrote:

> inserting at the beginning of the list may also be not rare.
>

Inserting at the beginning of a list is *slow* and should be rare. If
you want to append to the left side of a list, you should use a deque.
Check out ``collections.deque`` for your insert(0, x) needs.

From mike at selik.org  Tue Jun 12 21:06:41 2018
From: mike at selik.org (Michael Selik)
Date: Tue, 12 Jun 2018 18:06:41 -0700
Subject: [Python-ideas] Link accepted PEPs to their whatsnew section?
In-Reply-To: <20180612214631.sje6xkm4po4m2zts@python.ca>
References: <20180612214631.sje6xkm4po4m2zts@python.ca>
Message-ID:

Google will probably fix this problem for you after dataclasses become
popular. The docs will gain a bunch of inbound links and the issue will
(probably) solve itself as time passes.

On Tue, Jun 12, 2018 at 2:48 PM Neil Schemenauer <
nas-python-ideas at arctrix.com> wrote:

> I'm testing "Data Classes" for Python 3.7. Awesome new feature,
> BTW. The PEP is the first search result when I look up "dataclass
> python". Given that the PEP is not the best documentation for an
> end user, I wonder if we should have a link in the header section of
> the PEP that goes to better documentation. We could link accepted
> PEPs to their section of the whatsnew. Or, link to the language
> documentation that describes the feature.
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

From rymg19 at gmail.com  Tue Jun 12 22:39:41 2018
From: rymg19 at gmail.com (Ryan Gonzalez)
Date: Tue, 12 Jun 2018 21:39:41 -0500
Subject: [Python-ideas] Operator for inserting an element into a list
In-Reply-To:
References:
Message-ID: <163f70355c8.27a3.db5b03704c129196a4e9415e55413ce6@gmail.com>

^ is also used in regexes for matching the *beginning* of a string...

Realistically, I don't think this proposal would be added, but if it
were, ^ would be a horrible choice.

That being said, I do understand the feeling of half your code being
calls to .append or .extend. You could always do:

L += 'string1',

The only thing extra is the comma at the end; it converts the single
value to a tuple, which is then added to the list.

As for the use of +, I think that's mostly opinion. Many other
languages use + for concatenation, so Python's hardly on its own here.

On June 12, 2018 6:43:18 PM Mikhail V wrote:

> On Tue, Jun 12, 2018 at 10:26 PM, Terry Reedy wrote:
>> On 6/12/2018 10:54 AM, Mikhail V wrote:
>>> Though additional brackets look really confusing for this purpose,
>>> so I don't feel like using this seriously.
>>
>> Learning about lists means learning about slice assignment: replace a
>> sublist with another sequence.
>>
> If that would be a list method, I suppose it would work something like: > > L = L.replace(begin, end, L2) > > So that makes perfect sense. > > > But returning to append/insert. > So appending of course is extremely frequent operation, > inserting in the beginning of the list may be also not rare. > Concatenation in my opinion might be not so frequent. > > Writing values from one array (or slice) to another is quite frequent > operation, that means slices of the same size. > As for _replacing_ slices, of different size, I personally never > used it. OTOH it smells like generalization. > > > Anyway, I prefer to look at it through 'syntax glass'. > Here is something ubiquitous: > > L = L.append("string1") > L = L.append("string2") > L = L.append("string3") > L = L.append([1,2,3]) > > Could be written: > > L ^= "string1" > L ^= "string2" > L ^= "string3" > L ^= [1,2,3] > > These kind of things are basically in every second python script. > I'm ok with append() actually, but it just asks for a shortcut. > For the ^ character - yes it looks strange. > > And writing it like this: > > L += ["string1"] > L += ["string2"] > L += ["string3"] > L += [[1,2,3]] > > Sorry, no, that would be too much for me, also this needs some real > mental training so as not to misinterpret these. > > Append() is just fine, though it's a pity there is no shortcut operator. > > > > M > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From robertvandeneynde at hotmail.com Mon Jun 11 13:32:37 2018 From: robertvandeneynde at hotmail.com (Robert Vanden Eynde) Date: Mon, 11 Jun 2018 17:32:37 +0000 Subject: [Python-ideas] Fwd: Trigonometry in degrees In-Reply-To: References: <20180608054530.GC12683@ando.pearwood.info> <20180608065903.GF12683@ando.pearwood.info> <20180611054831.GW12683@ando.pearwood.info> <4630676A-D766-4663-9A9C-753CF5A05145@mac.com> Message-ID: As mentioned, with complex numbers the radians make more sense and of course cmath.sind(x) + 1j * cmath.sind(x) != cmath.exp(1j * x). However, adding degrees version for cmath (import cmath) is still useful, cmath.rectd, cmath.phased, cmath.polard etc. 2018-06-11 19:24 GMT+02:00 Michael Selik >: Would sind and cosd make Euler's formula work correctly? sind(x) + i * sind(x) == math.e ** (i * x) I suspect that adding these functions is kind of like those cartoons where the boat is springing leaks and the character tried to plug them with their fingers. Floating point is a leaky abstraction. Perhaps you'd prefer an enhancement to the fractions module that provides real (not float) math? _______________________________________________ Python-ideas mailing list Python-ideas at python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From stephanh42 at gmail.com Wed Jun 13 03:51:04 2018 From: stephanh42 at gmail.com (Stephan Houben) Date: Wed, 13 Jun 2018 09:51:04 +0200 Subject: [Python-ideas] Fwd: Trigonometry in degrees In-Reply-To: References: <20180611054831.GW12683@ando.pearwood.info> <4630676A-D766-4663-9A9C-753CF5A05145@mac.com> <20180611183806.GZ12683@ando.pearwood.info> <5B1EFE4E.1050204@canterbury.ac.nz> <20180612004835.GA12683@ando.pearwood.info> <5B1F6AD7.3050302@canterbury.ac.nz> Message-ID: Op di 12 jun. 
Op di 12 jun. 2018 12:41 schreef Nathaniel Smith :

> On Tue, Jun 12, 2018, 00:03 Stephan Houben wrote:
>
>> Hi all,
>>
>> I wrote a possible implementation of sindg:
>>
>> https://gist.github.com/stephanh42/336d54a53b31104b97e46156c7deacdd
>>
>> This code first reduces the angle to the [0,90] interval.
>> After doing so, it can be observed that the simple implementation
>> math.sin(math.radians(angle))
>> produces exact results for 0 and 90, and a result already rounded to
>> nearest for
>> 60.
>
> You observed this on your system, but math.sin uses the platform libm,
> which might do different things on other people's systems.

Ok, I updated the code to treat all the values 0, 30, 45, 60 and 90
specially.

Stephan

>> For 30 and 45, this simple implementation is one ulp too low.
>> So I special-case those to return the correct/correctly-rounded value
>> instead.
>> Note that this does not affect monotonicity around those values.
>
> Again, monotonicity is preserved on your system, but it might not be on
> others. It's not clear that this matters, but then it's not clear that any
> of this matters...
>
> -n

From robertve92 at gmail.com  Wed Jun 13 06:00:09 2018
From: robertve92 at gmail.com (Robert Vanden Eynde)
Date: Wed, 13 Jun 2018 12:00:09 +0200
Subject: [Python-ideas] Fwd: Trigonometry in degrees
In-Reply-To:
References: <20180611054831.GW12683@ando.pearwood.info>
 <4630676A-D766-4663-9A9C-753CF5A05145@mac.com>
 <20180611183806.GZ12683@ando.pearwood.info> <5B1EFE4E.1050204@canterbury.ac.nz>
 <20180612004835.GA12683@ando.pearwood.info> <5B1F6AD7.3050302@canterbury.ac.nz>
Message-ID:

What was wrong with my initial implementation with a lookup table ? :D

def sind(x):
    if x % 90 == 0:
        return (0, 1, 0, -1)[int(x // 90) % 4]
    else:
        return sin(radians(x))

If you want to support multiples of 30, you can do % 30 and // 30.

Le mer. 13 juin 2018 à 09:51, Stephan Houben a écrit :

> Op di 12 jun. 2018 12:41 schreef Nathaniel Smith :
>
>> On Tue, Jun 12, 2018, 00:03 Stephan Houben wrote:
>>
>>> Hi all,
>>>
>>> I wrote a possible implementation of sindg:
>>>
>>> https://gist.github.com/stephanh42/336d54a53b31104b97e46156c7deacdd
>>>
>>> This code first reduces the angle to the [0,90] interval.
>>> After doing so, it can be observed that the simple implementation
>>> math.sin(math.radians(angle))
>>> produces exact results for 0 and 90, and a result already rounded to
>>> nearest for
>>> 60.
>>
>> You observed this on your system, but math.sin uses the platform libm,
>> which might do different things on other people's systems.
>
> Ok, I updated the code to treat all the values 0, 30, 45, 60 and 90
> specially.
>
> Stephan
>
>>> For 30 and 45, this simple implementation is one ulp too low.
>>> So I special-case those to return the correct/correctly-rounded value
>>> instead.
>>> Note that this does not affect monotonicity around those values.
>>
>> Again, monotonicity is preserved on your system, but it might not be on
>> others. It's not clear that this matters, but then it's not clear that any
>> of this matters...
>>
>> -n
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
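[For readers following along: the two ideas in this exchange - special-case
the "nice" angles, but reduce the argument first - can be combined. The
following is a minimal runnable sketch, not Stephan's actual gist; the table
of exact values and the reduction strategy are illustrative assumptions, and
behaviour exactly at the period boundary (tiny negative angles) is not
handled carefully here.]

from math import fmod, radians, sin, sqrt

# Exact first-quadrant sines at the commonly special-cased angles.
_EXACT = {0.0: 0.0, 30.0: 0.5, 45.0: sqrt(2) / 2, 60.0: sqrt(3) / 2, 90.0: 1.0}

def sind(x):
    """Sine of x degrees, reducing in degrees so huge angles stay exact."""
    x = fmod(x, 360.0)        # exact; result keeps the sign of x
    if x < 0.0:
        x += 360.0            # now roughly 0 <= x <= 360
    sign = 1.0
    if x > 180.0:
        x, sign = x - 180.0, -1.0   # sin(x + 180) == -sin(x)
    if x > 90.0:
        x = 180.0 - x               # sin(180 - x) == sin(x)
    exact = _EXACT.get(x)
    return sign * (exact if exact is not None else sin(radians(x)))

# Example: sind(30.0) == 0.5 and sind(390.0) == 0.5 exactly.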
From stephanh42 at gmail.com  Wed Jun 13 06:07:42 2018
From: stephanh42 at gmail.com (Stephan Houben)
Date: Wed, 13 Jun 2018 12:07:42 +0200
Subject: [Python-ideas] Fwd: Trigonometry in degrees
In-Reply-To:
References: <20180611054831.GW12683@ando.pearwood.info>
 <4630676A-D766-4663-9A9C-753CF5A05145@mac.com>
 <20180611183806.GZ12683@ando.pearwood.info> <5B1EFE4E.1050204@canterbury.ac.nz>
 <20180612004835.GA12683@ando.pearwood.info> <5B1F6AD7.3050302@canterbury.ac.nz>
Message-ID:

2018-06-13 12:00 GMT+02:00 Robert Vanden Eynde :

> What was wrong with my initial implementation with a lookup table ? :D
>
> def sind(x):
>     if x % 90 == 0:
>         return (0, 1, 0, -1)[int(x // 90) % 4]
>     else:
>         return sin(radians(x))
>

I kinda missed it, but now you ask:

1. It's better to reduce the angle while still in degrees since one of the
advantages of degrees is that the reduction can be done exactly. Converting
very large angles first to radians and then taking the sine can introduce
a large error.

2. I used fmod instead of % on advice in this thread.

3. I also wanted to special-case 30, 45, and 60.

>
> If you want to support multiples of 30, you can do % 30 and // 30.
>

Sure, but I also wanted to special-case 45.

Stephan

>
> Le mer. 13 juin 2018 à 09:51, Stephan Houben a
> écrit :
>
>> Op di 12 jun. 2018 12:41 schreef Nathaniel Smith :
>>
>>> On Tue, Jun 12, 2018, 00:03 Stephan Houben wrote:
>>>
>>>> Hi all,
>>>>
>>>> I wrote a possible implementation of sindg:
>>>>
>>>> https://gist.github.com/stephanh42/336d54a53b31104b97e46156c7deacdd
>>>>
>>>> This code first reduces the angle to the [0,90] interval.
>>>> After doing so, it can be observed that the simple implementation
>>>> math.sin(math.radians(angle))
>>>> produces exact results for 0 and 90, and a result already rounded to
>>>> nearest for
>>>> 60.
>>>
>>> You observed this on your system, but math.sin uses the platform libm,
>>> which might do different things on other people's systems.
>>
>> Ok, I updated the code to treat all the values 0, 30, 45, 60 and 90
>> specially.
>>
>> Stephan
>>
>>>> For 30 and 45, this simple implementation is one ulp too low.
>>>> So I special-case those to return the correct/correctly-rounded value
>>>> instead.
>>>> Note that this does not affect monotonicity around those values.
>>>
>>> Again, monotonicity is preserved on your system, but it might not be on
>>> others. It's not clear that this matters, but then it's not clear that any
>>> of this matters...
>>>
>>> -n
>>
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> https://mail.python.org/mailman/listinfo/python-ideas
>> Code of Conduct: http://python.org/psf/codeofconduct/
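[To make point 1 above concrete, here is a small self-contained
demonstration. The constant is chosen so the huge angle is exactly
representable as a double and exactly reducible mod 360; the printed digits
will vary with the platform libm, and the ulp estimate is an assumption
about IEEE-754 doubles.]

from math import fmod, radians, sin

big = 360.0 * 2.0 ** 43 + 30.0   # exactly representable; equals 30 (mod 360)

# Reducing in degrees first: fmod is exact, so this is sin of exactly 30 deg.
print(sin(radians(fmod(big, 360.0))))   # ~0.5

# Converting to radians first: radians(big) is ~5.5e13, where one ulp is
# ~0.008 radians, so the effective angle -- and hence the sine -- is far off.
print(sin(radians(big)))                # visibly wrong in the low digits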
From robertve92 at gmail.com  Wed Jun 13 06:08:45 2018
From: robertve92 at gmail.com (Robert Vanden Eynde)
Date: Wed, 13 Jun 2018 12:08:45 +0200
Subject: [Python-ideas] Fwd: Trigonometry in degrees
In-Reply-To:
References: <20180611054831.GW12683@ando.pearwood.info>
 <4630676A-D766-4663-9A9C-753CF5A05145@mac.com>
 <20180611183806.GZ12683@ando.pearwood.info> <5B1EFE4E.1050204@canterbury.ac.nz>
 <20180612004835.GA12683@ando.pearwood.info> <5B1F6AD7.3050302@canterbury.ac.nz>
Message-ID:

Then if you also want 45, you could do % 15 ? :D

Le mer. 13 juin 2018 à 12:07, Stephan Houben a écrit :

> 2018-06-13 12:00 GMT+02:00 Robert Vanden Eynde :
>
>> What was wrong with my initial implementation with a lookup table ? :D
>>
>> def sind(x):
>>     if x % 90 == 0:
>>         return (0, 1, 0, -1)[int(x // 90) % 4]
>>     else:
>>         return sin(radians(x))
>>
>
> I kinda missed it, but now you ask:
>
> 1. It's better to reduce the angle while still in degrees since one of the
> advantages of degrees is that the reduction can be done exactly. Converting
> very large angles first to radians and then taking the sine can introduce
> a large error.
>
> 2. I used fmod instead of % on advice in this thread.
>
> 3. I also wanted to special-case 30, 45, and 60.
>
>>
>> If you want to support multiples of 30, you can do % 30 and // 30.
>>
>
> Sure, but I also wanted to special-case 45.
>
> Stephan
>
>>
>> Le mer. 13 juin 2018 à 09:51, Stephan Houben a
>> écrit :
>>
>>> Op di 12 jun. 2018 12:41 schreef Nathaniel Smith :
>>>
>>>> On Tue, Jun 12, 2018, 00:03 Stephan Houben wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> I wrote a possible implementation of sindg:
>>>>>
>>>>> https://gist.github.com/stephanh42/336d54a53b31104b97e46156c7deacdd
>>>>>
>>>>> This code first reduces the angle to the [0,90] interval.
>>>>> After doing so, it can be observed that the simple implementation
>>>>> math.sin(math.radians(angle))
>>>>> produces exact results for 0 and 90, and a result already rounded to
>>>>> nearest for
>>>>> 60.
>>>>
>>>> You observed this on your system, but math.sin uses the platform libm,
>>>> which might do different things on other people's systems.
>>>
>>> Ok, I updated the code to treat all the values 0, 30, 45, 60 and 90
>>> specially.
>>>
>>> Stephan
>>>
>>>>> For 30 and 45, this simple implementation is one ulp too low.
>>>>> So I special-case those to return the correct/correctly-rounded value
>>>>> instead.
>>>>> Note that this does not affect monotonicity around those values.
>>>>
>>>> Again, monotonicity is preserved on your system, but it might not be on
>>>> others. It's not clear that this matters, but then it's not clear that any
>>>> of this matters...
>>>>
>>>> -n
>>>
>>> _______________________________________________
>>> Python-ideas mailing list
>>> Python-ideas at python.org
>>> https://mail.python.org/mailman/listinfo/python-ideas
>>> Code of Conduct: http://python.org/psf/codeofconduct/
>>
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> https://mail.python.org/mailman/listinfo/python-ideas
>> Code of Conduct: http://python.org/psf/codeofconduct/
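[If the special cases were extended to every multiple of 15 degrees, as
suggested above, the first-quadrant table would only need two more
closed-form entries. A sketch, with the function and table names invented
for illustration:]

from math import sqrt

# sin of the first-quadrant multiples of 15 degrees, in closed form:
# sin 15 = (sqrt(6) - sqrt(2)) / 4 and sin 75 = (sqrt(6) + sqrt(2)) / 4.
SIN_15 = {
    0: 0.0,
    15: (sqrt(6) - sqrt(2)) / 4,
    30: 0.5,
    45: sqrt(2) / 2,
    60: sqrt(3) / 2,
    75: (sqrt(6) + sqrt(2)) / 4,
    90: 1.0,
}

def sind15(x):
    """Sine for integer multiples of 15 degrees only (illustration)."""
    x = int(x) % 360
    if x > 180:
        return -sind15(x - 180)   # sin(x + 180) == -sin(x)
    if x > 90:
        x = 180 - x               # sin(180 - x) == sin(x)
    return SIN_15[x]

# Example: sind15(105) returns sin 75 degrees exactly as computed above.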
From stephanh42 at gmail.com  Wed Jun 13 06:20:34 2018
From: stephanh42 at gmail.com (Stephan Houben)
Date: Wed, 13 Jun 2018 12:20:34 +0200
Subject: [Python-ideas] Fwd: Trigonometry in degrees
In-Reply-To:
References: <20180611054831.GW12683@ando.pearwood.info>
 <4630676A-D766-4663-9A9C-753CF5A05145@mac.com>
 <20180611183806.GZ12683@ando.pearwood.info> <5B1EFE4E.1050204@canterbury.ac.nz>
 <20180612004835.GA12683@ando.pearwood.info> <5B1F6AD7.3050302@canterbury.ac.nz>
Message-ID:

2018-06-13 12:08 GMT+02:00 Robert Vanden Eynde :

> Then if you also want 45, you could do % 15 ? :D
>

Sure, but how the lookup is done in the Python reference code is
ultimately not so important, since it will need to be rewritten in C if
it is to be included in the math package (math is C-only). And then
we'll probably end up with a bunch of if-checks against the common
values.

Stephan

>
> Le mer. 13 juin 2018 à 12:07, Stephan Houben a écrit :
>
>> 2018-06-13 12:00 GMT+02:00 Robert Vanden Eynde :
>>
>>> What was wrong with my initial implementation with a lookup table ? :D
>>>
>>> def sind(x):
>>>     if x % 90 == 0:
>>>         return (0, 1, 0, -1)[int(x // 90) % 4]
>>>     else:
>>>         return sin(radians(x))
>>>
>>
>> I kinda missed it, but now you ask:
>>
>> 1. It's better to reduce the angle while still in degrees since one of
>> the advantages of degrees is that the reduction can be done exactly.
>> Converting very large angles first to radians and then taking the sine
>> can introduce a large error.
>>
>> 2. I used fmod instead of % on advice in this thread.
>>
>> 3. I also wanted to special-case 30, 45, and 60.
>>
>>>
>>> If you want to support multiples of 30, you can do % 30 and // 30.
>>>
>>
>> Sure, but I also wanted to special-case 45.
>>
>> Stephan
>>
>>>
>>> Le mer. 13 juin 2018 à 09:51, Stephan Houben a
>>> écrit :
>>>
>>>> Op di 12 jun. 2018 12:41 schreef Nathaniel Smith :
>>>>
>>>>> On Tue, Jun 12, 2018, 00:03 Stephan Houben wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I wrote a possible implementation of sindg:
>>>>>>
>>>>>> https://gist.github.com/stephanh42/336d54a53b31104b97e46156c7deacdd
>>>>>>
>>>>>> This code first reduces the angle to the [0,90] interval.
>>>>>> After doing so, it can be observed that the simple implementation
>>>>>> math.sin(math.radians(angle))
>>>>>> produces exact results for 0 and 90, and a result already rounded to
>>>>>> nearest for
>>>>>> 60.
>>>>>
>>>>> You observed this on your system, but math.sin uses the platform libm,
>>>>> which might do different things on other people's systems.
>>>>
>>>> Ok, I updated the code to treat all the values 0, 30, 45, 60 and 90
>>>> specially.
>>>>
>>>> Stephan
>>>>
>>>>>> For 30 and 45, this simple implementation is one ulp too low.
>>>>>> So I special-case those to return the correct/correctly-rounded value
>>>>>> instead.
>>>>>> Note that this does not affect monotonicity around those values.
>>>>>
>>>>> Again, monotonicity is preserved on your system, but it might not be
>>>>> on others. It's not clear that this matters, but then it's not clear that
>>>>> any of this matters...
>>>>>
>>>>> -n
>>>>
>>>> _______________________________________________
>>>> Python-ideas mailing list
>>>> Python-ideas at python.org
>>>> https://mail.python.org/mailman/listinfo/python-ideas
>>>> Code of Conduct: http://python.org/psf/codeofconduct/
>>>
>>> _______________________________________________
>>> Python-ideas mailing list
>>> Python-ideas at python.org
>>> https://mail.python.org/mailman/listinfo/python-ideas
>>> Code of Conduct: http://python.org/psf/codeofconduct/

From kenlhilton at gmail.com  Wed Jun 13 07:06:09 2018
From: kenlhilton at gmail.com (Ken Hilton)
Date: Wed, 13 Jun 2018 19:06:09 +0800
Subject: [Python-ideas] Give regex operations more sugar
Message-ID:

Hi all,

Regexes are really useful in many places, and to me it's sad to see the
builtin "re" module having to resort to requiring a source string as an
argument. It would be much more elegant to simply do "s.search(pattern)"
than "re.search(pattern, s)".
I suggest building all regex operations into the str class itself, as
well as a new syntax for regular expressions.

Thus a "findall" for any lowercase letter in a string would look like this:

    >>> "1a3c5e7g9i".findall(!%[a-z]%)
    ['a', 'c', 'e', 'g', 'i']

A "findall" for any letter, case insensitive:

    >>> "1A3c5E7g9I".findall(!%[a-z]%i)
    ['A', 'c', 'E', 'g', 'I']

A substitution of any letter for the string " WOOF WOOF ":

    >>> "1a3c5e7g9i".sub(!%[a-z]% WOOF WOOF %)
    '1 WOOF WOOF 3 WOOF WOOF 5 WOOF WOOF 7 WOOF WOOF 9 WOOF WOOF '

A substitution of any letter, case insensitive, for the string "hovercraft":

    >>> "1A3c5E7g9I".sub(!%[a-z]%hovercraft%i)
    '1hovercraft3hovercraft5hovercraft7hovercraft9hovercraft'

You may wonder why I chose the regex delimiters as

    "!%" ... "%" [ ... "%" ] ...

The choice of "%" was purely arbitrary; I just thought of it since there
seems to be a convention to use "%" in PHP regex patterns. The "!" is in
front to disambiguate it from the "%" modulo operator or the "%" string
formatting operator, and because "!" is currently not used in Python.

Another potential idea is to simply use "!" to denote the start of a
regex, and use the character immediately following it to delimit the
regex. Thus all of the following would be regexes matching a single
lowercase letter:

    !%[a-z]%
    !#[a-z]#
    !?[a-z]?
    !/[a-z]/

And all of the following would be substitution regexes replacing a
single case-insensitive letter with "@":

    !%[a-z]%@%i
    !#[a-z]#@#i
    !?[a-z]?@?i
    !/[a-z]/@/i

Some examples of how to use this:

    >>> "pneumonoultramicroscopicsilicovolcanokoniosis".findall(!%[aeiou]+%)
    ['eu', 'o', 'ou', 'a', 'i', 'o', 'o', 'i', 'i', 'i', 'o', 'o', 'a', 'o', 'o', 'io', 'i']
    >>> "GMzKqtnnyGdqIQNlQSLidbDlqpdhoRbHrrUAgyhMgkZKYVhQuI".search(!%[^A-Z][A-Z]{3}([a-z])[A-Z]{3}[^A-Z]%)
    >>> "My name is Joanne.".findall(!%[A-Z][a-z]+%)
    ['My', 'Joanne']

Thoughts?

Sincerely,
Ken;
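[The delimiter syntax aside, the method half of this proposal can be
prototyped today with a str subclass that delegates to re. A minimal
sketch; the class name and its method set are invented for illustration:]

import re

class S(str):
    """str with the regex operations from the proposal as methods."""

    def search(self, pattern, flags=0):
        return re.search(pattern, self, flags)

    def findall(self, pattern, flags=0):
        return re.findall(pattern, self, flags)

    def sub(self, pattern, repl, flags=0):
        return re.sub(pattern, repl, self, flags=flags)

# The examples above, without any new syntax:
print(S("1a3c5e7g9i").findall(r"[a-z]"))          # ['a', 'c', 'e', 'g', 'i']
print(S("1A3c5E7g9I").findall(r"[a-z]", re.I))    # ['A', 'c', 'E', 'g', 'I']
print(S("1a3c5e7g9i").sub(r"[a-z]", " WOOF WOOF "))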
From Richard at Damon-Family.org  Wed Jun 13 07:12:06 2018
From: Richard at Damon-Family.org (Richard Damon)
Date: Wed, 13 Jun 2018 07:12:06 -0400
Subject: [Python-ideas] Fwd: Trigonometry in degrees
In-Reply-To:
References: <4630676A-D766-4663-9A9C-753CF5A05145@mac.com>
 <20180611183806.GZ12683@ando.pearwood.info> <5B1EFE4E.1050204@canterbury.ac.nz>
 <20180612004835.GA12683@ando.pearwood.info> <5B1F6AD7.3050302@canterbury.ac.nz>
Message-ID:

My first comment is that special casing values like this can lead to
some very undesirable properties when you use the function for numerical
analysis. Suddenly your sind is no longer continuous (sind(x) is no
longer the limit of sind(x+d) as d goes to 0).

As I stated in my initial comment on this, if you are going to create a
sind function with the idea that you want 'nice' angles to return
'exact' results, then what you need to do is have the degree-based trig
routines do the angle reduction in degrees, and only when you have a
small enough angle, either use the radians version on the small angle or
directly include an expansion in degrees.

Angle reduction would be based on the identity that sin(x+y) = sin(x) *
cos(y) + cos(x) * sin(y) and cos(x+y) = cos(x)*cos(y) - sin(x) * sin(y).

If you want to find sin(z) for an arbitrary value z, you can reduce it
to an x+y where x is some multiple of say 15 degrees, and y is in the
range -7.5 to 7.5 degrees. You can have stored exact values of sin/cos
of the 15 degree increments (and only really need them between 0 and 90)
and then compute the sin and cos of the y value.

On 6/13/18 6:07 AM, Stephan Houben wrote:
> 2018-06-13 12:00 GMT+02:00 Robert Vanden Eynde >:
>
>     What was wrong with my initial implementation with a lookup table
>     ? :D
>
>     def sind(x):
>         if x % 90 == 0:
>             return (0, 1, 0, -1)[int(x // 90) % 4]
>         else:
>             return sin(radians(x))
>
>
> I kinda missed it, but now you ask:
>
> 1. It's better to reduce the angle while still in degrees since one of
> the advantages
>    of degrees is that the reduction can be done exactly. Converting
> very large angles
>    first to radians and then taking the sine can introduce a large error.
> > > Again, monotonicity is preserved on your system, but it > might not be on others. It's not clear that this matters, > but then it's not clear that any of this matters... > > -n > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > > Code of Conduct: http://python.org/psf/codeofconduct/ > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > > Code of Conduct: http://python.org/psf/codeofconduct/ > > > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -- Richard Damon From stephanh42 at gmail.com Wed Jun 13 07:21:02 2018 From: stephanh42 at gmail.com (Stephan Houben) Date: Wed, 13 Jun 2018 13:21:02 +0200 Subject: [Python-ideas] Fwd: Trigonometry in degrees In-Reply-To: References: <4630676A-D766-4663-9A9C-753CF5A05145@mac.com> <20180611183806.GZ12683@ando.pearwood.info> <5B1EFE4E.1050204@canterbury.ac.nz> <20180612004835.GA12683@ando.pearwood.info> <5B1F6AD7.3050302@canterbury.ac.nz> Message-ID: Op wo 13 jun. 2018 13:12 schreef Richard Damon : > My first comment is that special casing values like this can lead to > some very undesirable properties when you use the function for numerical > analysis. Suddenly your sind is no longer continuous (sind(x) is no > longer the limit of sind(x+d) as d goes to 0). > The deviations introduced by the special casing are on the order of one ulp. At that level of detail the sin wasn't continuous to begin with. > > As I stated in my initial comment on this, if you are going to create a > sind function with the idea that you want 'nice' angles to return > 'exact' results, then what you need to do is have the degree based trig > routines do the angle reduction in degrees, and only when you have a > small enough angle, either use the radians version on the small angle or > directly include an expansion in degrees. > Yes that is what my code does. It reduces degrees to [0,90]. > > Angle reduction would be based on the identity that sin(x+y) = sin(x) * > cos(y) + cos(x) * sin(y) and cos(x+y) = cos(x)*cos(y) - sin(x) * sin(y). > > If you want to find sin(z) for an arbitrary value z, you can reduce it > to and x+y where x is some multiple of say 15 degrees, and y is in the > range -7.5 to 7.5 degrees. You can have stored exact values of sin/cos > of the 15 degree increments (and only really need them between 0 and 90) > and then compute the sin and cos of the y value. This is not how sine functions are calculated. They are calculated by reducing angle to some interval, then evaluating a polynomial which approximates the true sine within that interval. Stephan > On 6/13/18 6:07 AM, Stephan Houben wrote: > > 2018-06-13 12:00 GMT+02:00 Robert Vanden Eynde > >: > > > > What was wrong with my initial implementation with a lookup table > > ? :D > > > > def sind(x): > > if x % 90 == 0: > > return (0, 1, 0, -1)[int(x // 90) % 4] > > else: > > return sin(radians(x)) > > > > > > I kinda missed it, but now you ask: > > > > 1. It's better to reduce the angle while still in degrees since one of > > the advantages > > of degrees is that the reduction can be done exactly. 
Converting > > very large angles > > first to radians and then taking the sine can introduce a large error, > > > > 2. I used fmod instead of % on advice in this thread. > > > > 3. I also wanted to special case, 30, 45, and 60. > > > > > > > > If you want to support multiples of 30, you can do % 30 and // 30. > > > > > > Sure, but I also wanted to special-case 45. > > > > Stephan > > > > > > > > Le mer. 13 juin 2018 ? 09:51, Stephan Houben > > a ?crit : > > > > Op di 12 jun. 2018 12:41 schreef Nathaniel Smith > > >: > > > > On Tue, Jun 12, 2018, 00:03 Stephan Houben > > > wrote: > > > > Hi all, > > > > I wrote a possible implementation of sindg: > > > > > https://gist.github.com/stephanh42/336d54a53b31104b97e46156c7deacdd > > < > https://gist.github.com/stephanh42/336d54a53b31104b97e46156c7deacdd> > > > > This code first reduces the angle to the [0,90] interval. > > After doing so, it can be observed that the simple > > implementation > > math.sin(math.radians(angle)) > > produces exact results for 0 and 90, and a result > > already rounded to nearest for > > 60. > > > > > > You observed this on your system, but math.sin uses the > > platform libm, which might do different things on other > > people's systems. > > > > > > > > Ok, I updated the code to treat all the values 0, 30, 45, 60 > > and 90 specially. > > > > Stephan > > > > > > > > For 30 and 45, this simple implementation is one ulp > > too low. > > So I special-case those to return the > > correct/correctly-rounded value instead. > > Note that this does not affect monotonicity around > > those values. > > > > > > Again, monotonicity is preserved on your system, but it > > might not be on others. It's not clear that this matters, > > but then it's not clear that any of this matters... > > > > -n > > > > _______________________________________________ > > Python-ideas mailing list > > Python-ideas at python.org > > https://mail.python.org/mailman/listinfo/python-ideas > > > > Code of Conduct: http://python.org/psf/codeofconduct/ > > > > > > > > _______________________________________________ > > Python-ideas mailing list > > Python-ideas at python.org > > https://mail.python.org/mailman/listinfo/python-ideas > > > > Code of Conduct: http://python.org/psf/codeofconduct/ > > > > > > > > > > > > _______________________________________________ > > Python-ideas mailing list > > Python-ideas at python.org > > https://mail.python.org/mailman/listinfo/python-ideas > > Code of Conduct: http://python.org/psf/codeofconduct/ > > > -- > Richard Damon > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rhodri at kynesim.co.uk Wed Jun 13 07:46:35 2018 From: rhodri at kynesim.co.uk (Rhodri James) Date: Wed, 13 Jun 2018 12:46:35 +0100 Subject: [Python-ideas] Give regex operations more sugar In-Reply-To: References: Message-ID: On 13/06/18 12:06, Ken Hilton wrote: > Hi all, > > Regexes are really useful in many places, and to me it's sad to see the > builtin "re" module having to resort to requiring a source string as an > argument. It would be much more elegant to simply do "s.search(pattern)" > than "re.search(pattern, s)". > I suggest building all regex operations into the str class itself, as well > as a new syntax for regular expressions. [snip] > Thoughts? 
My first, most obvious thought is that Python is not Perl, and does not encourage people to reach for regular expressions at every opportunity. That said, I don't see how having a special delimiter syntax for something that could just as well be a string is a help. -- Rhodri James *-* Kynesim Ltd From clint.hepner at gmail.com Wed Jun 13 08:38:47 2018 From: clint.hepner at gmail.com (Clint Hepner) Date: Wed, 13 Jun 2018 08:38:47 -0400 Subject: [Python-ideas] Give regex operations more sugar In-Reply-To: References: Message-ID: > On 2018 Jun 13 , at 7:06 a, Ken Hilton wrote: > > Hi all, > > Regexes are really useful in many places, and to me it's sad to see the builtin "re" module having to resort to requiring a source string as an argument. It would be much more elegant to simply do "s.search(pattern)" than "re.search(pattern, s)". > I suggest building all regex operations into the str class itself, as well as a new syntax for regular expressions. I think you'll have to be more specific about why it is sad to pass strings to a function. There already is a class that has all the methods you want, although with the roles of regex and string reversed from what you want. >>> x = re.compile(r'[a-z]') # x = !%a-z%!, or what have you >>> type(x) >>> x.findall("1a3c5e7g9i") ['a', 'c', 'e', 'g', 'i'] Strictly speaking, a regular expression is just a string that encodes of a (non)deterministic finite automata. A "regex" (the thing that supports all sorts of extensions that make the expression decidedly non-regular) is a string that encodes ... some class of Turing machines. [Aside: A discussion of just what they match can be found at http://nikic.github.io/2012/06/15/The-true-power-of-regular-expressions.html, which suggests they can match context-free languages, some (but possibly not all) context-sensitive languages, and maybe some languages that context-sensitive grammars cannot not. Suffice it to say, they are powerful and complex.] I don't think it is the job of the str class to build, store, and run the resulting machines, solely for the sake of some perceived syntactic benefit. I don't see any merit in adding regex literals beyond making Python look more like Perl. -- Clint From phd at phdru.name Wed Jun 13 09:01:53 2018 From: phd at phdru.name (Oleg Broytman) Date: Wed, 13 Jun 2018 15:01:53 +0200 Subject: [Python-ideas] Give regex operations more sugar In-Reply-To: References: Message-ID: <20180613130153.iqcxy7dyjp2l5duq@phdru.name> On Wed, Jun 13, 2018 at 07:06:09PM +0800, Ken Hilton wrote: > Regexes are really useful in many places, and to me it's sad to see the > builtin "re" module having to resort to requiring a source string as an > argument. It would be much more elegant to simply do "s.search(pattern)" > than "re.search(pattern, s)". pat_compiled = re.compile(pattern) pat_compiled.search(s) > I suggest building all regex operations into the str class itself, as well > as a new syntax for regular expressions. There are many different regular expression implementation (regex, re2). How to make ``s.search(pattern)`` work with all of them? > Thoughts? > Sincerely, > Ken; Oleg. -- Oleg Broytman https://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN. From desmoulinmichel at gmail.com Wed Jun 13 09:33:52 2018 From: desmoulinmichel at gmail.com (Michel Desmoulin) Date: Wed, 13 Jun 2018 15:33:52 +0200 Subject: [Python-ideas] Give regex operations more sugar In-Reply-To: References: Message-ID: Le 13/06/2018 ? 
Le 13/06/2018 à 13:06, Ken Hilton a écrit :
> Hi all,
>
> Regexes are really useful in many places, and to me it's sad to see the
> builtin "re" module having to resort to requiring a source string as an
> argument. It would be much more elegant to simply do "s.search(pattern)"
> than "re.search(pattern, s)".
> I suggest building all regex operations into the str class itself, as
> well as a new syntax for regular expressions.
>
> Thus a "findall" for any lowercase letter in a string would look like this:
>
>     >>> "1a3c5e7g9i".findall(!%[a-z]%)
>     ['a', 'c', 'e', 'g', 'i']
>
> A "findall" for any letter, case insensitive:
>
>     >>> "1A3c5E7g9I".findall(!%[a-z]%i)
>     ['A', 'c', 'E', 'g', 'I']
>
> A substitution of any letter for the string " WOOF WOOF ":
>
>     >>> "1a3c5e7g9i".sub(!%[a-z]% WOOF WOOF %)
>     '1 WOOF WOOF 3 WOOF WOOF 5 WOOF WOOF 7 WOOF WOOF 9 WOOF WOOF '
>
> A substitution of any letter, case insensitive, for the string "hovercraft":
>
>     >>> "1A3c5E7g9I".sub(!%[a-z]%hovercraft%i)
>     '1hovercraft3hovercraft5hovercraft7hovercraft9hovercraft'
>

I often wished for findall and sub to be string methods, so +1 on that.

But there is really no need for a literal. A string pattern is plenty.

From desmoulinmichel at gmail.com  Wed Jun 13 09:40:18 2018
From: desmoulinmichel at gmail.com (Michel Desmoulin)
Date: Wed, 13 Jun 2018 15:40:18 +0200
Subject: [Python-ideas] Give regex operations more sugar
In-Reply-To: <20180613130153.iqcxy7dyjp2l5duq@phdru.name>
References: <20180613130153.iqcxy7dyjp2l5duq@phdru.name>
Message-ID:

>> I suggest building all regex operations into the str class itself, as well
>> as a new syntax for regular expressions.
>
> There are many different regular expression implementations (regex,
> re2). How to make ``s.search(pattern)`` work with all of them?
>

You don't; they work standalone anyway. Besides, nobody is proposing to
retire the re module either.

But if it's really important, you can make hooks to provide the
implementation like we did with breakpoint().

I really wish, however, that we separate the issue of adding the methods
on the str object, and making literals.

I know that literals are going to be rejected, "python is not perl",
etc. But the str methods are quite an interesting idea.

From ncoghlan at gmail.com  Wed Jun 13 09:48:13 2018
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 13 Jun 2018 23:48:13 +1000
Subject: [Python-ideas] Link accepted PEPs to their whatsnew section?
In-Reply-To:
References: <20180612214631.sje6xkm4po4m2zts@python.ca>
Message-ID:

On 13 June 2018 at 11:06, Michael Selik wrote:

> Google will probably fix this problem for you after dataclasses become
> popular. The docs will gain a bunch of inbound links and the issue will
> (probably) solve itself as time passes.
>

Sometimes when reading a PEP it isn't especially clear exactly which
version it landed in, or whether or not there were significant changes
post-acceptance based on issues discovered during the beta period,
though.

So the idea of a "Release-Note" header that points to the version
specific What's New entry seems like a decent idea to me (and may
actually help the What's New section supplant the PEP in search
results).

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
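[A rough illustration of the breakpoint()-style hook Michel mentions above:
a module-level engine hook that str methods could consult. Everything here -
the hook name, the setter, the function - is hypothetical, not an existing
or proposed API:]

import re

_regex_engine = re   # hypothetical default; swappable for regex, re2, ...

def set_regex_engine(module):
    """Install a different engine, loosely analogous to sys.breakpointhook."""
    global _regex_engine
    _regex_engine = module

def str_findall(s, pattern, flags=0):
    """What a built-in str.findall could do under the hood (sketch)."""
    return _regex_engine.findall(pattern, s, flags)

print(str_findall("1a3c5e7g9i", r"[a-z]"))   # ['a', 'c', 'e', 'g', 'i']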
URL: 

From mikhailwas at gmail.com  Wed Jun 13 10:04:08 2018
From: mikhailwas at gmail.com (Mikhail V)
Date: Wed, 13 Jun 2018 17:04:08 +0300
Subject: [Python-ideas] Operator for inserting an element into a list
In-Reply-To: <5B2053FB.7020500@canterbury.ac.nz>
References: <5B2053FB.7020500@canterbury.ac.nz>
Message-ID: 

On Wed, Jun 13, 2018 at 2:15 AM, Greg Ewing wrote:
> Mikhail V wrote:
>>
> My feeling is that inserting is not a frequent enough operation
> to warrant having its own operator, especially not when there
> is already a syntax that does the same thing.

Depends on what you count as 'insert' - append is one case of insert ;)
(logically speaking)

Sorry for repeating myself, the idea was that the default meaning is append(),
i.e. normal operator usage on list:

L1 = L2 ^ item - would be the same as
L1 = L2.append(item)

But hope you get the point - if an operator for append was added, then it
would be a bit sad that it cannot be used for inserting by slicing, namely
these two forms:

L ^= item        #append(item)
L[i:j] ^= item   #insert(i, item) instead of i:j items

would be IMO nice to have and it'd cover both insert and append.
But if you say that special-casing of [i:j] here would be hard to implement,
then maybe the insert() idea should be dropped.

> That
> would raise the question of why ^= is getting this
> special treatment but not any of the other augmented
> assignments, and why not "in-place operation with
> attribute" as well

As said, I don't insist on the ^ operator. I would find ">>" or "<<" or "|"
ok as well:

L <<= "foo"
L1 = L2 << "foo"

L |= "foo"
L1 = L2 | "foo"

But it would raise the same questions. As for a relation 'by sense', that may
be too opinion-based. Vertical bar "|" may be somewhat related - IIRC in some
contexts it is used as an element separator.

Just hope the judgement is not like "the symbol looks strange - therefore
the feature is not needed".

M

From rosuav at gmail.com  Wed Jun 13 10:13:01 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Thu, 14 Jun 2018 00:13:01 +1000
Subject: [Python-ideas] Operator for inserting an element into a list
In-Reply-To: 
References: <5B2053FB.7020500@canterbury.ac.nz>
Message-ID: 

On Thu, Jun 14, 2018 at 12:04 AM, Mikhail V wrote:
> On Wed, Jun 13, 2018 at 2:15 AM, Greg Ewing wrote:
>> Mikhail V wrote:
>>>
>> My feeling is that inserting is not a frequent enough operation
>> to warrant having its own operator, especially not when there
>> is already a syntax that does the same thing.
>
> Depends on what you count as 'insert' - append is one case of insert ;)
> (logically speaking)
>
> Sorry for repeating myself, the idea was that the default meaning is append(),
> i.e. normal operator usage on list:
>
> L1 = L2 ^ item - would be the same as
> L1 = L2.append(item)

Not sure exactly what your intention here is, because list.append
mutates the list and returns None. Does "L2 ^ item" mutate L2 in
place, or does it construct a new list? If it mutates in place, does
it return the same list? Or if it doesn't, how is it different from "L2 +
[item]", which is a much more logical spelling of list addition?
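
For reference, here's how the two existing spellings behave (a quick
interactive check, nothing controversial):

>>> L2 = [1, 2]
>>> L2 + ["foo"]       # builds a brand new list; L2 is untouched
[1, 2, 'foo']
>>> L2
[1, 2]
>>> L2.append("foo")   # mutates in place and returns None
>>> L2
[1, 2, 'foo']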
ChrisA

From mikhailwas at gmail.com  Wed Jun 13 10:40:09 2018
From: mikhailwas at gmail.com (Mikhail V)
Date: Wed, 13 Jun 2018 17:40:09 +0300
Subject: [Python-ideas] Operator for inserting an element into a list
In-Reply-To: 
References: <5B2053FB.7020500@canterbury.ac.nz>
Message-ID: 

On Wed, Jun 13, 2018 at 5:13 PM, Chris Angelico wrote:
> On Thu, Jun 14, 2018 at 12:04 AM, Mikhail V wrote:
>> On Wed, Jun 13, 2018 at 2:15 AM, Greg Ewing wrote:
>>> Mikhail V wrote:

>> Sorry for repeating myself, the idea was that the default meaning is append(),
>> i.e. normal operator usage on list:
>>
>> L1 = L2 ^ item - would be the same as
>> L1 = L2.append(item)
>
> Not sure exactly what your intention here is, because list.append
> mutates the list and returns None. Does "L2 ^ item" mutate L2 in
> place, or does it construct a new list? If it mutates in place, does
> it return the same list? Or if it doesn't, how is it different from "L2 +
> [item]", which is a much more logical spelling of list addition?

I made wrong example again. So

L1 = L2 ^ item
is
L1 = L2 + [item]

and
L ^= item
is
L.append(item)
or
L += [item]

From rosuav at gmail.com  Wed Jun 13 10:46:40 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Thu, 14 Jun 2018 00:46:40 +1000
Subject: [Python-ideas] Operator for inserting an element into a list
In-Reply-To: 
References: <5B2053FB.7020500@canterbury.ac.nz>
Message-ID: 

On Thu, Jun 14, 2018 at 12:40 AM, Mikhail V wrote:
> On Wed, Jun 13, 2018 at 5:13 PM, Chris Angelico wrote:
>> On Thu, Jun 14, 2018 at 12:04 AM, Mikhail V wrote:
>>> On Wed, Jun 13, 2018 at 2:15 AM, Greg Ewing wrote:
>>>> Mikhail V wrote:
>
>>> Sorry for repeating myself, the idea was that the default meaning is append(),
>>> i.e. normal operator usage on list:
>>>
>>> L1 = L2 ^ item - would be the same as
>>> L1 = L2.append(item)
>>
>> Not sure exactly what your intention here is, because list.append
>> mutates the list and returns None. Does "L2 ^ item" mutate L2 in
>> place, or does it construct a new list? If it mutates in place, does
>> it return the same list? Or if it doesn't, how is it different from "L2 +
>> [item]", which is a much more logical spelling of list addition?
>
> I made wrong example again. So
>
> L1 = L2 ^ item
> is
> L1 = L2 + [item]
>
> and
> L ^= item
> is
> L.append(item)
> or
> L += [item]

Okay. Now it all is coherent and makes perfect sense... but you're
offering alternative spellings for what we can already do. The only
improvement compared to the + operator is that you don't need to
surround the operand in brackets; in return, it's less general, being
unable to add multiple elements to the list. The only improvement
compared to .append is that it's an operator. There's no connection to
exclusive-or, and there's not a lot of "but it's intuitive" here (cf
Path division).

ChrisA

From J.Demeyer at UGent.be  Wed Jun 13 11:15:24 2018
From: J.Demeyer at UGent.be (Jeroen Demeyer)
Date: Wed, 13 Jun 2018 17:15:24 +0200
Subject: [Python-ideas] Meta-PEP about C functions
Message-ID: <5B21350C.9040803@UGent.be>

I have finished my "meta-PEP" for issues with built-in (implemented in 
C) functions and methods. This is meant to become an "informational" 
(not standards track) PEP for other PEPs to refer to. You can read the 
full text at

https://github.com/jdemeyer/PEP-functions-meta

I also give brief ideas of solutions for the various issues. The main 
idea is a new PyTypeObject field tp_ccalloffset giving an offset in the 
object structure for a new PyCCallDef struct.
This new struct replaces PyMethodDef for calling functions/methods and defines a new "C call" protocol. Comparing with PEP 575, one could say that the base_function class has been replaced by PyCCallDef. This is even more general than PEP 575 and it should be easier to support this new protocol in existing classes. I plan to submit this as PEP in the next days, but I wanted to check for some early feedback first. Jeroen. From python-ideas at mgmiller.net Wed Jun 13 13:11:23 2018 From: python-ideas at mgmiller.net (Mike Miller) Date: Wed, 13 Jun 2018 10:11:23 -0700 Subject: [Python-ideas] Give regex operations more sugar In-Reply-To: References: Message-ID: <27b67d11-b37a-9a1d-0d85-f041c38dc6d2@mgmiller.net> On 2018-06-13 06:33, Michel Desmoulin wrote: > > I often wished for findall and sub to be string methods, so +1 on that. > Agreed, and there are a few string functions that could be extended (to take a sequence) to handle more cases that push folks to regex, perhaps earlier than they should. Some string functions accept sequences to match on, some don't, if memory serves. -Mike From marcidy at gmail.com Wed Jun 13 13:34:50 2018 From: marcidy at gmail.com (Matt Arcidy) Date: Wed, 13 Jun 2018 10:34:50 -0700 Subject: [Python-ideas] Link accepted PEPs to their whatsnew section? In-Reply-To: References: <20180612214631.sje6xkm4po4m2zts@python.ca> Message-ID: On Wed, Jun 13, 2018, 06:51 Nick Coghlan wrote: > On 13 June 2018 at 11:06, Michael Selik wrote: > >> Google will probably fix this problem for you after dataclasses become >> popular. The docs will gain a bunch of inbound links and the issue will >> (probably) solve itself as time passes. >> > > Sometimes when reading a PEP it isn't especially clear exactly which > version it landed in, or whether or not there were significant changes > post-acceptance based on issues discovered during the beta period, though. > > So the idea of a "Release-Note" header that points to the version specific > What's New entry seems like a decent idea to me (and may actually help the > What's New section supplant the PEP in search results). > perhaps some machine readable format of the release tag on the PEP which is then read when generating the notes, and pulls it in? include the link to the PEP in the notes as well. just tag the PEP with release and forget about it. no idea if that fits into the process, just a thought. > Cheers, > Nick. > > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Wed Jun 13 15:37:05 2018 From: tim.peters at gmail.com (Tim Peters) Date: Wed, 13 Jun 2018 14:37:05 -0500 Subject: [Python-ideas] Fwd: Trigonometry in degrees In-Reply-To: References: <4630676A-D766-4663-9A9C-753CF5A05145@mac.com> <20180611183806.GZ12683@ando.pearwood.info> <5B1EFE4E.1050204@canterbury.ac.nz> <20180612004835.GA12683@ando.pearwood.info> <5B1F6AD7.3050302@canterbury.ac.nz> Message-ID: [Richard Damon] > My first comment is that special casing values like this can lead to > some very undesirable properties when you use the function for numerical > analysis. Suddenly your sind is no longer continuous (sind(x) is no > longer the limit of sind(x+d) as d goes to 0). 
> > As I stated in my initial comment on this, if you are going to create a > sind function with the idea that you want 'nice' angles to return > 'exact' results, then what you need to do is have the degree based trig > routines do the angle reduction in degrees, and only when you have a > small enough angle, either use the radians version on the small angle or > directly include an expansion in degrees. > ... > Either way, it's necessary to get the effect of working in greater than output precision, if it's desired that the best possible result be returned for cases well beyond just the handful of "nice integer inputs" people happen to be focused on today. So I'll say again that the easiest way to do that is to use `mpmath` to get extra precision directly. The following does that for sindg, cosdg, and tandg. - There are no special cases. Although tandg(90 + i*180) dies with ZeroDivisionError inside mpmath, and that could/should be fiddled to return an infinity instead. - Apart from that, all functions appear to give the best possible double-precision result for all representable-as-a-double integer degree inputs (sindg(30), cosdg(-100000), doesn't matter). - And for all representable inputs of the form `integer + j/32` for j in range(32). - But not for all of the form `integer + j/64` for j in range(1, 64, 2). A few of those suffer greater than 1/2 ULP error. Setting EXTRAPREC to 16 is enough to repair those - but why bother? ;-) - Consider the largest representable double less than 90: >>> x 89.99999999999999 >>> x.hex() '0x1.67fffffffffffp+6' The code below gives the best possible tangent: >>> tandg(x) 4031832051015932.0 Native precision is waaaaay off: >>> math.tan(math.radians(x)) 3530114321217157.5 It's not really the extra precision that saves the code below, but allowing argument reduction to reduce to the range [-pi/4, pi/4] radians, followed by exploiting trigonometric identities. In this case, exploiting that tan(pi/2 + z) = -1/tan(z). Then even native precision is good enough: >>> -1 / math.tan(math.radians(x - 90)) 4031832051015932.0 Here's the code: import mpmath from math import fmod # Return (n, x) such that: # 1. d degrees is equivalent to x + n*(pi/2) radians. # 2. x is an mpmath float in [-pi/4, pi/4]. # 3. n is an integer in range(4). # There is one potential rounding error, when mpmath.radians() is # used to convert a number of degrees between -45 and 45. This is # done using the current mpmath precision. def treduce(d): d = fmod(d, 360.0) n = round(d / 90.0) assert -4 <= n <= 4 d -= n * 90.0 assert -45.0 <= d <= 45.0 return n & 3, mpmath.radians(d) EXTRAPREC = 14 def sindg(d): with mpmath.extraprec(EXTRAPREC): n, x = treduce(d) if n & 1: x = mpmath.cos(x) else: x = mpmath.sin(x) if n >= 2: x = -x return float(x) def cosdg(d): with mpmath.extraprec(EXTRAPREC): n, x = treduce(d) if n & 1: x = mpmath.sin(x) else: x = mpmath.cos(x) if 1 <= n <= 2: x = -x return float(x) def tandg(d): with mpmath.extraprec(EXTRAPREC): n, x = treduce(d) x = mpmath.tan(x) if n & 1: x = -1.0 / x return float(x) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 

From mikhailwas at gmail.com  Wed Jun 13 16:21:39 2018
From: mikhailwas at gmail.com (Mikhail V)
Date: Wed, 13 Jun 2018 23:21:39 +0300
Subject: [Python-ideas] Operator for inserting an element into a list
In-Reply-To: 
References: <5B2053FB.7020500@canterbury.ac.nz>
Message-ID: 

On Wed, Jun 13, 2018 at 5:46 PM, Chris Angelico wrote:
> On Thu, Jun 14, 2018 at 12:40 AM, Mikhail V wrote:
>> L1 = L2 ^ item
>> is
>> L1 = L2 + [item]
>>
>> and
>> L ^= item
>> is
>> L.append(item)
>> or
>> L += [item]
>
> Okay. Now it all is coherent and makes perfect sense... but you're
> offering alternative spellings for what we can already do. The only
> improvement compared to the + operator is that you don't need to
> surround the operand in brackets; in return, it's less general, being
> unable to add multiple elements to the list. The only improvement
> compared to .append is that it's an operator. There's no connection to
> exclusive-or, and there's not a lot of "but it's intuitive" here (cf
> Path division).

Exactly, IMO it's important to make a distinction for the item append.
Not only for practical reasons, but for the semantic distinction, just
like the append() vs extend() distinction.
Otherwise if += is used as append(), it creates an illusion that
+= _is_ append().

From desmoulinmichel at gmail.com  Wed Jun 13 16:43:43 2018
From: desmoulinmichel at gmail.com (Michel Desmoulin)
Date: Wed, 13 Jun 2018 22:43:43 +0200
Subject: [Python-ideas] Give regex operations more sugar
In-Reply-To: <27b67d11-b37a-9a1d-0d85-f041c38dc6d2@mgmiller.net>
References: <27b67d11-b37a-9a1d-0d85-f041c38dc6d2@mgmiller.net>
Message-ID: 

Le 13/06/2018 à 19:11, Mike Miller a écrit :
> 
> On 2018-06-13 06:33, Michel Desmoulin wrote:
>>
>> I often wished for findall and sub to be string methods, so +1 on that.
>>
> 
> Agreed, and there are a few string functions that could be extended (to
> take a sequence) to handle more cases that push folks to regex, perhaps
> earlier than they should.

str.replace comes to mind. It's annoying to have to chain it 5 times
when we could optionally pass a tuple.

several startswith() and endswith() require a loop, but we could make
them accept *args.

Also, we don't have to saturate the str namespace with all the re
functions. We could decide to go for `str.re.stuff`.

From clint.hepner at gmail.com  Wed Jun 13 16:49:42 2018
From: clint.hepner at gmail.com (Clint Hepner)
Date: Wed, 13 Jun 2018 16:49:42 -0400
Subject: [Python-ideas] Give regex operations more sugar
In-Reply-To: 
References: <27b67d11-b37a-9a1d-0d85-f041c38dc6d2@mgmiller.net>
Message-ID: <40D7A384-5B43-4A5A-965D-9BB922151513@gmail.com>

> On 2018 Jun 13 , at 4:43 p, Michel Desmoulin wrote:
> 
> 
> 
> Le 13/06/2018 à 19:11, Mike Miller a écrit :
>> 
>> On 2018-06-13 06:33, Michel Desmoulin wrote:
>>> 
>>> I often wished for findall and sub to be string methods, so +1 on that.
>>> 
>> 
>> Agreed, and there are a few string functions that could be extended (to
>> take a sequence) to handle more cases that push folks to regex, perhaps
>> earlier than they should.
> 
> str.replace comes to mind. It's annoying to have to chain it 5 times
> when we could optionally pass a tuple.
> 
> several startswith() and endswith() require a loop, but we could make
> them accept *args.

Both accept a tuple. For example:

    >>> "foo".startswith(("f", "b"))
    True
    >>> "bar".startswith(("f", "b"))
    True

> 
> Also, we don't have to saturate the str namespace with all the re
> functions. We could decide to go for `str.re.stuff`.
Attaching an entire module to a type is probably worse than
adding a slew of extra methods to the type.

-- 
Clint

From rosuav at gmail.com  Wed Jun 13 16:52:58 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Thu, 14 Jun 2018 06:52:58 +1000
Subject: [Python-ideas] Give regex operations more sugar
In-Reply-To: 
References: <27b67d11-b37a-9a1d-0d85-f041c38dc6d2@mgmiller.net>
Message-ID: 

On Thu, Jun 14, 2018 at 6:43 AM, Michel Desmoulin wrote:
>
>
> Le 13/06/2018 à 19:11, Mike Miller a écrit :
>>
>> On 2018-06-13 06:33, Michel Desmoulin wrote:
>>>
>>> I often wished for findall and sub to be string methods, so +1 on that.
>>>
>>
>> Agreed, and there are a few string functions that could be extended (to
>> take a sequence) to handle more cases that push folks to regex, perhaps
>> earlier than they should.
>
> str.replace comes to mind. It's annoying to have to chain it 5 times
> when we could optionally pass a tuple.

That would be handy. Either pass two sequences of equal length
(replace each with the corresponding), or one sequence and one string
(replace any with that). (And yes, I know that a string IS a
sequence.) This would want to be semantically different from chained
calls, in that a single replace([x,y,z], q) would avoid re-replacing;
but for many situations, it'll be functionally identical.

> several startswith() and endswith() require a loop, but we could make
> them accept *args.

Not without breaking other code: they already accept two optional
parameters. It'd have to be accepting a tuple of strings. Which...
they already do. :)

startswith(...) method of builtins.str instance
    S.startswith(prefix[, start[, end]]) -> bool

    Return True if S starts with the specified prefix, False otherwise.
    With optional start, test S beginning at that position.
    With optional end, stop comparing S at that position.
    prefix can also be a tuple of strings to try.

ChrisA

From desmoulinmichel at gmail.com  Wed Jun 13 16:59:34 2018
From: desmoulinmichel at gmail.com (Michel Desmoulin)
Date: Wed, 13 Jun 2018 22:59:34 +0200
Subject: [Python-ideas] Give regex operations more sugar
In-Reply-To: <40D7A384-5B43-4A5A-965D-9BB922151513@gmail.com>
References: <27b67d11-b37a-9a1d-0d85-f041c38dc6d2@mgmiller.net>
<40D7A384-5B43-4A5A-965D-9BB922151513@gmail.com>
Message-ID: <013d2473-4eca-dbd9-d7ac-9fc29a4a30c9@gmail.com>

> 
> Both accept a tuple. For example:
> 
>     >>> "foo".startswith(("f", "b"))
>     True
>     >>> "bar".startswith(("f", "b"))
>     True
> 

Nice. Now let's do that for str.replace.

>> Also, we don't have to saturate the str namespace with all the re
>> functions. We could decide to go for `str.re.stuff`.
> 
> Attaching an entire module to a type is probably worse than
> adding a slew of extra methods to the type.
> 

Not my point.

str.re would not be the re module, just a namespace in which to group all
regex-related string methods.

From desmoulinmichel at gmail.com  Wed Jun 13 16:59:41 2018
From: desmoulinmichel at gmail.com (Michel Desmoulin)
Date: Wed, 13 Jun 2018 22:59:41 +0200
Subject: [Python-ideas] Give regex operations more sugar
In-Reply-To: 
References: <27b67d11-b37a-9a1d-0d85-f041c38dc6d2@mgmiller.net>
Message-ID: 

Le 13/06/2018 à 22:53, David Mertz a écrit :
> On Wed, Jun 13, 2018, 4:44 PM Michel Desmoulin
> > wrote:
> 
>     several startswith() and endswith() require a loop, but we could
>     make them accept *args.
> 
> 
> You mean something like:
> 
> "Lorem ipsum".startswith(('Lo', 'Hi', 'Foo'))
> 
> You might want to check the time machine.
> 

Sweet.
Now let's do that for replace :)

From mertz at gnosis.cx  Wed Jun 13 16:53:07 2018
From: mertz at gnosis.cx (David Mertz)
Date: Wed, 13 Jun 2018 16:53:07 -0400
Subject: [Python-ideas] Give regex operations more sugar
In-Reply-To: 
References: <27b67d11-b37a-9a1d-0d85-f041c38dc6d2@mgmiller.net>
Message-ID: 

On Wed, Jun 13, 2018, 4:44 PM Michel Desmoulin wrote:

> several startswith() and endswith() require a loop, but we could make
> them accept *args.
>

You mean something like:

"Lorem ipsum".startswith(('Lo', 'Hi', 'Foo'))

You might want to check the time machine.

>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From yselivanov.ml at gmail.com  Wed Jun 13 18:01:49 2018
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Wed, 13 Jun 2018 18:01:49 -0400
Subject: [Python-ideas] Add hooks to asyncio lifecycle
In-Reply-To: <69f6425b-361a-85fa-d49c-4830d43a3aea@gmail.com>
References: <8d13eb02-42a2-93a3-1387-8a16bf54c9fc@gmail.com>
<18942f8f-679a-5481-6015-01ed22c08278@gmail.com>
<2e0b61b4-d2a2-b7cc-483c-e51ef9b4377f@gmail.com>
<94436b26-22e0-802b-e709-cde7d66e89dc@gmail.com>
<69f6425b-361a-85fa-d49c-4830d43a3aea@gmail.com>
Message-ID: 

On Tue, Jun 12, 2018 at 5:34 AM Michel Desmoulin wrote:
[..]
> But even if I have spoken up then, my experience with Python-idea is that most of the time, people would have told me "no". They would have told me that "nobody is going to do that".
> I'm getting used to it. It took me a lot of time of "path should inherit from strings" to finally have enough people involved that we get a solution to the problem (which ended up being __fspath__). At the beginning, the answer was "no, there is no problem".
> So basically, now I speak up, knowing that people will say "no". But at least I said it. Maybe in 2 years we will go back to that and implement it, or another solution to this problem.
[..]
> Locking the policy is providing a clean and easy way to do things.

Michel, I know you're an avid asyncio user.  I've seen your kind
comments about asyncio and uvloop on reddit, twitter, etc.  So I
really don't want to discourage you from posting here and proposing
ideas.  But arguments like 'people would have told me "no" ... I'm
getting used to it.' aren't helpful.

I've repeatedly asked you in this thread to provide a couple of good
and easy to follow examples so that we can make an informed decision
about whether we should add a policy locking mechanism or not.

Example: We want to create library "A" because it will work only with
event loop "B".  Checking the event loop once in A's APIs doesn't work
[because ...].

Another example: We use asyncio in production and we see that people
change policies at runtime all the time; we need to guard against
that somehow.

Unless you can clearly demonstrate that locking solves real world
problems we are not going to implement it, because get_event_loop()
and policies are *already* too complicated.  You say that Django
performs a bunch of sanity checks, but I don't see a good explanation
of why you can't just call
`isinstance(asyncio.get_event_loop_policy(), ...)` in your
framework/application.

I also keep repeating that libraries, in general, shouldn't be
hardcoded to work with some specific policies, and they should not
provide alternative policies unless they provide an alternative event
loop implementation.  And I don't want to add locking to somehow
encourage libraries to use policies.

I use a bunch of asyncio libraries in my code and none of them
(except aiohttp) uses policies.
And aiohttp provides policies only to help you run HTTP servers under
gunicorn etc, it's up to the user if they want to use them or not.

I'm currently prototyping nurseries/cancel scopes for asyncio and I
don't need to use policies to make them work.

So I honestly don't see a clear case to add locking so far. Maybe
having a PEP would allow us to build a compelling case for adding
policy locking.

Yury

From python at mrabarnett.plus.com  Wed Jun 13 18:54:34 2018
From: python at mrabarnett.plus.com (MRAB)
Date: Wed, 13 Jun 2018 23:54:34 +0100
Subject: [Python-ideas] Give regex operations more sugar
In-Reply-To: 
References: <27b67d11-b37a-9a1d-0d85-f041c38dc6d2@mgmiller.net>
Message-ID: <78fc8f4b-e4e0-4c36-c043-49254cdd3ac7@mrabarnett.plus.com>

On 2018-06-13 21:52, Chris Angelico wrote:
> On Thu, Jun 14, 2018 at 6:43 AM, Michel Desmoulin
> wrote:
>>
>>
>> Le 13/06/2018 à 19:11, Mike Miller a écrit :
>>>
>>> On 2018-06-13 06:33, Michel Desmoulin wrote:
>>>>
>>>> I often wished for findall and sub to be string methods, so +1 on that.
>>>>
>>>
>>> Agreed, and there are a few string functions that could be extended (to
>>> take a sequence) to handle more cases that push folks to regex, perhaps
>>> earlier than they should.
>>
>> str.replace comes to mind. It's annoying to have to chain it 5 times
>> when we could optionally pass a tuple.
>
> That would be handy. Either pass two sequences of equal length
> (replace each with the corresponding), or one sequence and one string
> (replace any with that). (And yes, I know that a string IS a
> sequence.) This would want to be semantically different from chained
> calls, in that a single replace([x,y,z], q) would avoid re-replacing;
> but for many situations, it'll be functionally identical.
>
Would it check first-to-last or longest-to-shortest? I think that
longest-to-shortest would be the most useful.

>>> old = ('cat', 'cats')
>>> new = ('mouse', 'mice')
>>>
>>> # First-to-last.
>>> 'cats'.replace(old, new)
'mouses'
>>>
>>> # Longest-to-shortest.
>>> 'cats'.replace(old, new)
'mice'

[snip]

From rosuav at gmail.com  Wed Jun 13 18:58:32 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Thu, 14 Jun 2018 08:58:32 +1000
Subject: [Python-ideas] Give regex operations more sugar
In-Reply-To: <78fc8f4b-e4e0-4c36-c043-49254cdd3ac7@mrabarnett.plus.com>
References: <27b67d11-b37a-9a1d-0d85-f041c38dc6d2@mgmiller.net>
<78fc8f4b-e4e0-4c36-c043-49254cdd3ac7@mrabarnett.plus.com>
Message-ID: 

On Thu, Jun 14, 2018 at 8:54 AM, MRAB wrote:
> On 2018-06-13 21:52, Chris Angelico wrote:
>>
>> On Thu, Jun 14, 2018 at 6:43 AM, Michel Desmoulin
>> wrote:
>>>
>>>
>>>
>>> Le 13/06/2018 à 19:11, Mike Miller a écrit :
>>>>
>>>>
>>>> On 2018-06-13 06:33, Michel Desmoulin wrote:
>>>>>
>>>>>
>>>>> I often wished for findall and sub to be string methods, so +1 on that.
>>>>>
>>>>
>>>> Agreed, and there are a few string functions that could be extended (to
>>>> take a sequence) to handle more cases that push folks to regex, perhaps
>>>> earlier than they should.
>>>
>>> str.replace comes to mind. It's annoying to have to chain it 5 times
>>> when we could optionally pass a tuple.
>>
>> That would be handy. Either pass two sequences of equal length
>> (replace each with the corresponding), or one sequence and one string
>> (replace any with that). (And yes, I know that a string IS a
>> sequence.)
>> This would want to be semantically different from chained
>> calls, in that a single replace([x,y,z], q) would avoid re-replacing;
>> but for many situations, it'll be functionally identical.
>>
> Would it check first-to-last or longest-to-shortest? I think that
> longest-to-shortest would be the most useful.
>
>>>> old = ('cat', 'cats')
>>>> new = ('mouse', 'mice')
>>>>
>>>> # First-to-last.
>>>> 'cats'.replace(old, new)
> 'mouses'
>>>>
>>>> # Longest-to-shortest.
>>>> 'cats'.replace(old, new)
> 'mice'

I'd go first-to-last, personally. You can always sort them by length
if you want that behaviour. But that's a bikeshed where I'm not too
picky about the colour.

ChrisA

From ericfahlgren at gmail.com  Wed Jun 13 18:59:29 2018
From: ericfahlgren at gmail.com (Eric Fahlgren)
Date: Wed, 13 Jun 2018 15:59:29 -0700
Subject: [Python-ideas] Give regex operations more sugar
In-Reply-To: <78fc8f4b-e4e0-4c36-c043-49254cdd3ac7@mrabarnett.plus.com>
References: <27b67d11-b37a-9a1d-0d85-f041c38dc6d2@mgmiller.net>
<78fc8f4b-e4e0-4c36-c043-49254cdd3ac7@mrabarnett.plus.com>
Message-ID: 

On Wed, Jun 13, 2018 at 3:54 PM MRAB wrote:

> Would it check first-to-last or longest-to-shortest? I think that
> longest-to-shortest would be the most useful.
>
> >>> old = ('cat', 'cats')
> >>> new = ('mouse', 'mice')
> >>>
> >>> # First-to-last.
> >>> 'cats'.replace(old, new)
> 'mouses'
> >>>
> >>> # Longest-to-shortest.
> >>> 'cats'.replace(old, new)
> 'mice'
>

I would expect left-to-right, and leave the programmer to get their mouse
in order.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From greg.ewing at canterbury.ac.nz  Wed Jun 13 19:32:43 2018
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 14 Jun 2018 11:32:43 +1200
Subject: [Python-ideas] Give regex operations more sugar
In-Reply-To: 
References: 
Message-ID: <5B21A99B.6060508@canterbury.ac.nz>

Clint Hepner wrote:
> Strictly speaking, a regular expression is just a string that encodes
> a (non)deterministic finite automaton.

More strictly speaking, regular expressions themselves are
agnostic about determinism vs. non-determinism, since for
any NFA you can always find an equivalent DFA.

-- 
Greg

From greg.ewing at canterbury.ac.nz  Wed Jun 13 19:47:29 2018
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 14 Jun 2018 11:47:29 +1200
Subject: [Python-ideas] Operator for inserting an element into a list
In-Reply-To: 
References: <5B2053FB.7020500@canterbury.ac.nz>
Message-ID: <5B21AD11.3010409@canterbury.ac.nz>

Mikhail V wrote:
> But if you say that special-casing of [i:j] here would be hard to implement,
> then maybe the insert() idea should be dropped.

Since I wrote that I realised that it's not true -- given an
infix ^ operator like you propose, the in-place version of it
would actually give the desired result.

However, I still don't think that either insert or append are
frequent enough operations to warrant having their own operators.
It's true that append is a lot more common than insert, but
usually it's a mutating append that you want, not creating a
new list with one more item on the end, which is what your
^ operator does.
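
The difference is observable as soon as two names share the list,
since += mutates while + rebinds -- a quick sketch:

>>> a = b = [1, 2]
>>> a += [3]           # in-place: b sees the change
>>> b
[1, 2, 3]
>>> a = a + [4]        # new list: b is left alone
>>> b
[1, 2, 3]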
-- 
Greg

From greg.ewing at canterbury.ac.nz  Wed Jun 13 19:58:21 2018
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 14 Jun 2018 11:58:21 +1200
Subject: [Python-ideas] Operator for inserting an element into a list
In-Reply-To: 
References: <5B2053FB.7020500@canterbury.ac.nz>
Message-ID: <5B21AF9D.2080301@canterbury.ac.nz>

Mikhail V wrote:
> L ^= item
> is
> L.append(item)
> or
> L += [item]

Okay, that achieves an in-place append, but it's not exactly
obvious to the unenlightened what it does, whereas append()
is pretty self-explanatory.

Also, using the slice version to do an insert

L[i:i] ^= item

is not as efficient as it looks like it should be, because it
creates an empty list, appends the item to it and then splices
that back into the list. And you have to write the index twice.
Whereas

L.insert(i, item)

doesn't have any of those problems, and again is mostly
self-explanatory.

Python is not Perl. Not every operation that you use more than
once in a blue moon needs to have its own operator.

-- 
Greg

From greg.ewing at canterbury.ac.nz  Wed Jun 13 20:09:35 2018
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 14 Jun 2018 12:09:35 +1200
Subject: [Python-ideas] Give regex operations more sugar
In-Reply-To: 
References: <27b67d11-b37a-9a1d-0d85-f041c38dc6d2@mgmiller.net>
Message-ID: <5B21B23F.2080504@canterbury.ac.nz>

Michel Desmoulin wrote:
> Also, we don't have to saturate the str namespace with all the re
> functions. We could decide to go for `str.re.stuff`.

However, note that this is not as simple as just adding a
class attribute to str that references the re module, since
presumably you want to be able to write

mystring.re.match(pattern)

instead of having to write

mystring.re.match(pattern, mystring)

or

str.re.match(pattern, mystring)

which would mostly defeat the purpose.

-- 
Greg

From greg.ewing at canterbury.ac.nz  Wed Jun 13 20:15:17 2018
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 14 Jun 2018 12:15:17 +1200
Subject: [Python-ideas] Give regex operations more sugar
In-Reply-To: 
References: <27b67d11-b37a-9a1d-0d85-f041c38dc6d2@mgmiller.net>
Message-ID: <5B21B395.4010601@canterbury.ac.nz>

Chris Angelico wrote:
> This would want to be semantically different from chained
> calls, in that a single replace([x,y,z], q) would avoid re-replacing;

+1, this would be REALLY handy!

It's easy to trip yourself up with chained replacements
if you're not careful -- like I did once when escaping
things using &xxx; sequences in XML. If you don't do it
in the right order, you end up escaping some of the &s
you just inserted. :-(

-- 
Greg

From greg.ewing at canterbury.ac.nz  Wed Jun 13 20:23:56 2018
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 14 Jun 2018 12:23:56 +1200
Subject: [Python-ideas] Fwd: Trigonometry in degrees
In-Reply-To: 
References: <20180611183806.GZ12683@ando.pearwood.info>
<5B1EFE4E.1050204@canterbury.ac.nz> <20180612004835.GA12683@ando.pearwood.info>
<5B1F6AD7.3050302@canterbury.ac.nz>
Message-ID: <5B21B59C.6020509@canterbury.ac.nz>

Stephan Houben wrote:
> Yes that is what my code does.
> It reduces degrees to [0,90].

Only for the special angles, though. Richard's point is that
you need to do that for *all* angles to avoid discontinuities
with large angles.

> This is not how sine functions are calculated. They are calculated by
> reducing angle to some interval, then evaluating a polynomial which
> approximates the true sine within that interval.

That's what he suggested, with the interval being -7.5 to
7.5 degrees.
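
A minimal sketch of that kind of degree-based reduction (folding at
+/-90 rather than +/-7.5, purely to keep it short; only the final
conversion to radians rounds):

from math import fmod, radians, sin

def sind(x):
    # Range-reduce in degrees, where this arithmetic is exact,
    # so sind(d + 360.0) == sind(d) for any representable d.
    d = fmod(x, 360.0)           # d in (-360, 360)
    if d > 180.0:
        d -= 360.0               # now d in [-180, 180]
    elif d < -180.0:
        d += 360.0
    if d > 90.0:                 # fold using sin(180 - d) == sin(d)
        d = 180.0 - d
    elif d < -90.0:
        d = -180.0 - d
    return sin(radians(d))       # round to radians only once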
-- 
Greg

From leewangzhong+python at gmail.com  Wed Jun 13 20:59:58 2018
From: leewangzhong+python at gmail.com (Franklin? Lee)
Date: Wed, 13 Jun 2018 20:59:58 -0400
Subject: [Python-ideas] Give regex operations more sugar
In-Reply-To: <5B21B395.4010601@canterbury.ac.nz>
References: <27b67d11-b37a-9a1d-0d85-f041c38dc6d2@mgmiller.net>
<5B21B395.4010601@canterbury.ac.nz>
Message-ID: 

On Wed, Jun 13, 2018 at 8:15 PM, Greg Ewing wrote:
> Chris Angelico wrote:
>>
>> This would want to be semantically different from chained
>> calls, in that a single replace([x,y,z], q) would avoid re-replacing;
>
> +1, this would be REALLY handy!
>
> It's easy to trip yourself up with chained replacements
> if you're not careful -- like I did once when escaping
> things using &xxx; sequences in XML. If you don't do it
> in the right order, you end up escaping some of the &s
> you just inserted. :-(
>

In a thread earlier this year, I suggested allowing a dict:
https://mail.python.org/pipermail/python-ideas/2018-February/048875.html

For example:
    txt.replace({
        '&': '&amp;',
        '"': '&quot;',
        "'": '&#39;',
        ...
    })

Tuples of strings can be dict keys, so it's also possible to allow
several options to be replaced with a single thing.

One use I had for multi-replace was to parse a file that was almost
CSV, but not close enough to be parsed by `import csv`. I had to be
careful to get the order right so that old replacements wouldn't
cause newer ones.
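
A single pass would also pin the ordering down. A rough sketch of one
possible semantics (leftmost match wins, ties broken by whichever
target is listed first, replacements never rescanned; this relies on
dicts keeping insertion order):

    import re

    def replace_multi(s, mapping):
        # Build one alternation; re tries branches left to right,
        # so earlier keys take priority at any given position.
        pattern = '|'.join(re.escape(old) for old in mapping)
        return re.sub(pattern, lambda m: mapping[m.group(0)], s)

With the earlier cat/cats example, replace_multi('cats', {'cat':
'mouse', 'cats': 'mice'}) gives 'mouses', and listing 'cats' first
gives 'mice'.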
From mikhailwas at gmail.com  Wed Jun 13 21:45:04 2018
From: mikhailwas at gmail.com (Mikhail V)
Date: Thu, 14 Jun 2018 04:45:04 +0300
Subject: [Python-ideas] Operator for inserting an element into a list
In-Reply-To: <5B21AF9D.2080301@canterbury.ac.nz>
References: <5B2053FB.7020500@canterbury.ac.nz>
<5B21AF9D.2080301@canterbury.ac.nz>
Message-ID: 

On Thu, Jun 14, 2018 at 2:58 AM, Greg Ewing wrote:
> Mikhail V wrote:
>>
>> L ^= item
>> is
>> L.append(item)
>> or
>> L += [item]
>
>
> Okay, that achieves an in-place append, but it's not exactly
> obvious to the unenlightened what it does, whereas append()
> is pretty self-explanatory.
>

Sure, I use append() only and obviously would recommend to do so.
I am not sure though everything's so simple here. Yet by writing
examples here I've made typos several times, like L = L.append() -
so strong is the influence of functional programming.

I don't like cryptic operators, although such cases make me think
in-place operations need some specialty in their syntax and augmented
assignment could provide something positive in this regard.

Another point is that people do like augmented operators much, and for
the append there is so much advice like: hey, use L += [item] !
But IMO this just makes things worse from both the practical and the
semantic POV (still I find it hard to explain why exactly, that's just
some gut feeling).

Not to argue, since I personally wouldn't benefit much from the idea,
but it just seems important, more than just 'cosmetics'.

> Also, using the slice version to do an insert
>
> L[i:i] ^= item
>
> is not as efficient as it looks like it should be, because it
> creates an empty list, appends the item to it and then splices
> that back into the list. And you have to write the index twice.

Efficiency is important ... where it is important.

From steve at pearwood.info  Thu Jun 14 01:29:52 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Thu, 14 Jun 2018 15:29:52 +1000
Subject: [Python-ideas] Give regex operations more sugar
In-Reply-To: <013d2473-4eca-dbd9-d7ac-9fc29a4a30c9@gmail.com>
References: <27b67d11-b37a-9a1d-0d85-f041c38dc6d2@mgmiller.net>
<40D7A384-5B43-4A5A-965D-9BB922151513@gmail.com>
<013d2473-4eca-dbd9-d7ac-9fc29a4a30c9@gmail.com>
Message-ID: <20180614052952.GE12683@ando.pearwood.info>

On Wed, Jun 13, 2018 at 10:59:34PM +0200, Michel Desmoulin wrote:

> > Attaching an entire module to a type is probably worse than
> > adding a slew of extra methods to the type.
> > 
> 
> Not my point.
> 
> str.re would not be the re module, just a namespace in which to group all
> regex-related string methods.

That's what a module is :-)

How would this work? If I say:

"My string".re.match(...)

if str.re is "just a namespace" how will the match function know the 
string it is to operate on?

-- 
Steve

From brenbarn at brenbarn.net  Wed Jun 13 13:31:21 2018
From: brenbarn at brenbarn.net (Brendan Barnwell)
Date: Wed, 13 Jun 2018 10:31:21 -0700
Subject: [Python-ideas] Link accepted PEPs to their whatsnew section?
In-Reply-To: 
References: <20180612214631.sje6xkm4po4m2zts@python.ca>
Message-ID: <5B2154E9.504@brenbarn.net>

On 2018-06-13 06:48, Nick Coghlan wrote:
> On 13 June 2018 at 11:06, Michael Selik > wrote:
> 
>     Google will probably fix this problem for you after dataclasses
>     become popular. The docs will gain a bunch of inbound links and the
>     issue will (probably) solve itself as time passes.
> 
> 
> Sometimes when reading a PEP it isn't especially clear exactly which
> version it landed in, or whether or not there were significant changes
> post-acceptance based on issues discovered during the beta period, though.
> 
> So the idea of a "Release-Note" header that points to the version
> specific What's New entry seems like a decent idea to me (and may
> actually help the What's New section supplant the PEP in search results).

	I think that is a great idea. I have definitely sometimes found 
myself bouncing back and forth between the main documentation and the 
PEPs, trying to cobble together an understanding of what the actual 
behavior is.

-- 
Brendan Barnwell
"Do not follow where the path may lead. Go, instead, where there is no 
path, and leave a trail."
--author unknown

From steve at pearwood.info  Thu Jun 14 02:08:00 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Thu, 14 Jun 2018 16:08:00 +1000
Subject: [Python-ideas] Give regex operations more sugar
In-Reply-To: 
References: <27b67d11-b37a-9a1d-0d85-f041c38dc6d2@mgmiller.net>
Message-ID: <20180614060758.GF12683@ando.pearwood.info>

On Wed, Jun 13, 2018 at 10:43:43PM +0200, Michel Desmoulin wrote:

> str.replace comes to mind. It's annoying to have to chain it 5 times
> when we could optionally pass a tuple.

It's not so simple. Multiple replacements underspecify the behaviour.

The simplest behaviour is to have

astring.replace((spam, eggs, cheese), new)

be simply syntactic sugar for:

astring.replace(spam, new).replace(eggs, new).replace(cheese, new)

which is nice and simple to explain and nice and simple to implement 
(it's just a loop calling the method for each argument in the tuple), 
but it's probably not the most useful solution:

# replace any of "salad", "cheese" or "ham" with "cheesecake".
s = "Lunch course are cheese & coffee, salad & cream, or ham & peas" s.replace("salad", "cheesecake").replace("cheese", "cheesecake").replace("ham", "cheesecake") => 'Lunch course are cheesecake & coffee, cheesecakecake & cream, or cheesecake & peas' which is highly unlikely to be what anyone wants. But it isn't clear what people *will* want. So we need to decide what replace with multiple targets actually means. Here are some suggestions: - the order of targets ought to be irrelevant: replace((a, b) ...) and replace((b, a) ...) ought to mean the same thing; - should targets match longest first or shortest first? or a flag to choose which you want? - what if you have multiple targets and you need to give some longer ones priority, and some shorter ones? - there ought to be a single pass through the string, not multiple passes -- this is not just syntactic sugar for calling replace in a loop! - the replacement string should be skipped and not scanned. -- Steve From brenbarn at brenbarn.net Thu Jun 14 02:12:43 2018 From: brenbarn at brenbarn.net (Brendan Barnwell) Date: Wed, 13 Jun 2018 23:12:43 -0700 Subject: [Python-ideas] Give regex operations more sugar In-Reply-To: <20180614052952.GE12683@ando.pearwood.info> References: <27b67d11-b37a-9a1d-0d85-f041c38dc6d2@mgmiller.net> <40D7A384-5B43-4A5A-965D-9BB922151513@gmail.com> <013d2473-4eca-dbd9-d7ac-9fc29a4a30c9@gmail.com> <20180614052952.GE12683@ando.pearwood.info> Message-ID: <5B22075B.5090007@brenbarn.net> On 2018-06-13 22:29, Steven D'Aprano wrote: > On Wed, Jun 13, 2018 at 10:59:34PM +0200, Michel Desmoulin wrote: > >> > Attaching an entire module to a type is probably worse than >> > adding a slew of extra methods to the type. >> > >> >> Not my point. >> >> str.re would not be the re module, just a namespace where to group all >> regex related string methods. > > That's what a module is :-) > > How would this work? If I say: > > "My string".re.match(...) > > if str.re is "just a namespace" how will the match function know the > string it is to operate on? str.re can be a descriptor object which "knows" which string instance it is bound to. This kind of thing is common in many libraries. Pandas for example has all kinds of things like df.loc[1:3], df.column.str.startswith('blah'), etc. The "loc" and "str" attributes give objects which are bound (in the sense that bound methods are bound) to the objects on which they are accessed, so when you use these attributes to do things, the effect takes account of on the "root" object on which you accessed the attribute. Personally I think this is a great way to reduce namespace clutter and group related functionality without having to worry about using up all the short or "good" names at the top level. I'm not sure I agree with the specific proposal here for allowing regex operations on strings, but if we do do it, this would be a good way to do it. -- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown From greg.ewing at canterbury.ac.nz Thu Jun 14 02:33:14 2018 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 14 Jun 2018 18:33:14 +1200 Subject: [Python-ideas] Give regex operations more sugar In-Reply-To: <20180614060758.GF12683@ando.pearwood.info> References: <27b67d11-b37a-9a1d-0d85-f041c38dc6d2@mgmiller.net> <20180614060758.GF12683@ando.pearwood.info> Message-ID: <5B220C2A.2010108@canterbury.ac.nz> Steven D'Aprano wrote: > - should targets match longest first or shortest first? 
> to choose which you want?
>
> - what if you have multiple targets and you need to give some longer
> ones priority, and some shorter ones?

I think the suggestion made earlier is reasonable: match
them in the order they're given. Then the user gets
complete control over the priorities.

-- 
Greg

From rosuav at gmail.com  Thu Jun 14 02:37:02 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Thu, 14 Jun 2018 16:37:02 +1000
Subject: [Python-ideas] Give regex operations more sugar
In-Reply-To: <5B22075B.5090007@brenbarn.net>
References: <27b67d11-b37a-9a1d-0d85-f041c38dc6d2@mgmiller.net>
<40D7A384-5B43-4A5A-965D-9BB922151513@gmail.com>
<013d2473-4eca-dbd9-d7ac-9fc29a4a30c9@gmail.com>
<20180614052952.GE12683@ando.pearwood.info>
<5B22075B.5090007@brenbarn.net>
Message-ID: 

On Thu, Jun 14, 2018 at 4:12 PM, Brendan Barnwell wrote:
> On 2018-06-13 22:29, Steven D'Aprano wrote:
>>
>> On Wed, Jun 13, 2018 at 10:59:34PM +0200, Michel Desmoulin wrote:
>>
>>> > Attaching an entire module to a type is probably worse than
>>> > adding a slew of extra methods to the type.
>>> >
>>>
>>> Not my point.
>>>
>>> str.re would not be the re module, just a namespace in which to group all
>>> regex-related string methods.
>>
>>
>> That's what a module is :-)
>>
>> How would this work? If I say:
>>
>> "My string".re.match(...)
>>
>> if str.re is "just a namespace" how will the match function know the
>> string it is to operate on?
>
>
>         str.re can be a descriptor object which "knows" which string
> instance it is bound to. This kind of thing is common in many libraries.
> Pandas for example has all kinds of things like df.loc[1:3],
> df.column.str.startswith('blah'), etc. The "loc" and "str" attributes give
> objects which are bound (in the sense that bound methods are bound) to the
> objects on which they are accessed, so when you use these attributes to do
> things, the effect takes account of the "root" object on which you
> accessed the attribute.
>
>         Personally I think this is a great way to reduce namespace clutter
> and group related functionality without having to worry about using up all
> the short or "good" names at the top level. I'm not sure I agree with the
> specific proposal here for allowing regex operations on strings, but if we
> do do it, this would be a good way to do it.
>

How is this materially different from:

"some string".re_match(...)

? It's not a grouped namespace in any technical sense, but to any
human, a set of methods that start with a clear prefix is functionally
a group.

That said, though, I don't think any of them need to be methods. The
're' module is there to be imported.

ChrisA

From steve at pearwood.info  Thu Jun 14 03:10:28 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Thu, 14 Jun 2018 17:10:28 +1000
Subject: [Python-ideas] Give regex operations more sugar
In-Reply-To: <5B22075B.5090007@brenbarn.net>
References: <27b67d11-b37a-9a1d-0d85-f041c38dc6d2@mgmiller.net>
<40D7A384-5B43-4A5A-965D-9BB922151513@gmail.com>
<013d2473-4eca-dbd9-d7ac-9fc29a4a30c9@gmail.com>
<20180614052952.GE12683@ando.pearwood.info>
<5B22075B.5090007@brenbarn.net>
Message-ID: <20180614071028.GG12683@ando.pearwood.info>

On Wed, Jun 13, 2018 at 11:12:43PM -0700, Brendan Barnwell wrote:

> >How would this work? If I say:
> >
> >"My string".re.match(...)
> >
> >if str.re is "just a namespace" how will the match function know the 
> >string it is to operate on?
> 
> 	str.re can be a descriptor object which "knows" which string 
> instance it is bound to.
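
Presumably something like this sketch -- a non-data descriptor whose
__get__ hands back a little wrapper object holding the string (all
names here invented for illustration, and shown on a str subclass
since str itself can't be patched):

    import re

    class _BoundRe:
        def __init__(self, s):
            self._s = s
        def match(self, pattern, flags=0):
            # ... and similarly for search, findall, etc.
            return re.compile(pattern, flags).match(self._s)

    class _ReNamespace:
        # Non-data descriptor: attribute access on an instance
        # returns a wrapper bound to that instance.
        def __get__(self, instance, owner=None):
            if instance is None:
                return self
            return _BoundRe(instance)

    class MyStr(str):
        re = _ReNamespace()

    MyStr("My string").re.match(r"My \w+")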
Obviously, but then it's not "just a namespace".

This idea might be common in libraries like pandas, but I don't like 
it. Common is not necessarily good. Unless str.re is something 
meaningful on its own, what purpose does it hold? If str.re doesn't 
carry its own weight as a meaningful object, then it shouldn't exist.

Particularly since we're only talking about a handful of new methods. 
Looking at re, we have these public functions:

- match
- fullmatch
- search

match and fullmatch are redundant; they're the same as calling search 
with a pattern that matches "start of string" and "end of string". 
str.find() could easily take a pattern object instead of needing a 
separate search object, particularly if we have dedicated syntax for 
regexes.

- sub
- subn

sub is redundant since it does the same as subn; or the str.replace() 
method could take a pattern object.

- split

Likewise str.split() could take a pattern object.

- findall
- finditer

findall is just list(finditer); search, match and fullmatch are just 
next(finditer).

The re module API is full of redundancies. That's okay; I'm not 
proposing we "fix" that. But we don't have to duplicate that in string 
objects. Rather than add eight new methods, we could allow the existing 
string methods to take pattern objects as arguments. That gives us 
potentially:

count, endswith, find, index, lstrip, partition, replace, rfind, 
rindex, rpartition, rsplit, rstrip, split, startswith, strip

(15 methods) that support regex pattern objects, pretty much covering 
all the functionality of:

match, fullmatch, search, split, sub, subn

and then some. re.findall is redundant. That leaves (potentially) only a 
single re function to turn into a string method: finditer.

How do you get the pattern object? We have three possible tactics:

- import re and call re.compile;

- add a compile method to str;

- add special regex syntax, let's say /pattern/ for the sake of the 
argument.

With pattern literals, we can do this with a single new string method, 
finditer. (Or whatever name we choose.) Without pattern literals, it 
won't be so convenient, but we could do this with just one more method: 
compile.

Or we could simply require people to import re to compile their 
patterns, which would be even less convenient, but it would work. (But 
maybe that's a good thing, to encourage people to think before reaching 
for a regular expression, not to encourage them to see every problem as 
a nail and regexes as the hammer.)

-- 
Steve

From brenbarn at brenbarn.net  Thu Jun 14 02:12:43 2018
From: brenbarn at brenbarn.net (Brendan Barnwell)
Date: Wed, 13 Jun 2018 23:12:43 -0700
Subject: [Python-ideas] Give regex operations more sugar
In-Reply-To: 
References: <27b67d11-b37a-9a1d-0d85-f041c38dc6d2@mgmiller.net>
<40D7A384-5B43-4A5A-965D-9BB922151513@gmail.com>
<013d2473-4eca-dbd9-d7ac-9fc29a4a30c9@gmail.com>
<20180614052952.GE12683@ando.pearwood.info>
<5B22075B.5090007@brenbarn.net>
Message-ID: <5B221562.5090004@brenbarn.net>

On 2018-06-13 23:37, Chris Angelico wrote:
>> str.re can be a descriptor object which "knows" which string
>> instance it is bound to. This kind of thing is common in many libraries.
>> Pandas for example has all kinds of things like df.loc[1:3],
>> df.column.str.startswith('blah'), etc. The "loc" and "str" attributes give
>> objects which are bound (in the sense that bound methods are bound) to the
>> objects on which they are accessed, so when you use these attributes to do
>> things, the effect takes account of the "root" object on which you
>> accessed the attribute.
>>
>> Personally I think this is a great way to reduce namespace clutter
>> and group related functionality without having to worry about using up all
>> the short or "good" names at the top level. I'm not sure I agree with the
>> specific proposal here for allowing regex operations on strings, but if we
>> do do it, this would be a good way to do it.
>>
>
> How is this materially different from:
>
> "some string".re_match(...)
>
> ? It's not a grouped namespace in any technical sense, but to any
> human, a set of methods that start with a clear prefix is functionally
> a group.

	Do you really mean that? :-)

	As far as I can see, by the same argument, there is no need for 
modules. Instead of math.sin and math.cos, we can just have math_sin 
and math_cos. Instead of os.path.join we can just have os_path_join. 
And so on. Just one big namespace for everything. But as we all know, 
namespaces are one honking great idea!

	Now, of course there are other advantages to modules (such as being 
able to save the time of loading things you don't need), and likewise 
there are other advantages to this descriptor mechanism in some cases. 
(For instance, sometimes the sub-object may want to hold state if it is 
going to be passed around and used later, rather than just having a 
method called and being thrown away immediately.) But I think it's 
clear that in both cases the namespacing is also nice.

-- 
Brendan Barnwell
"Do not follow where the path may lead. Go, instead, where there is no 
path, and leave a trail."
--author unknown

From brenbarn at brenbarn.net  Thu Jun 14 03:22:45 2018
From: brenbarn at brenbarn.net (Brendan Barnwell)
Date: Thu, 14 Jun 2018 00:22:45 -0700
Subject: [Python-ideas] Give regex operations more sugar
In-Reply-To: <20180614071028.GG12683@ando.pearwood.info>
References: <27b67d11-b37a-9a1d-0d85-f041c38dc6d2@mgmiller.net>
<40D7A384-5B43-4A5A-965D-9BB922151513@gmail.com>
<013d2473-4eca-dbd9-d7ac-9fc29a4a30c9@gmail.com>
<20180614052952.GE12683@ando.pearwood.info>
<5B22075B.5090007@brenbarn.net>
<20180614071028.GG12683@ando.pearwood.info>
Message-ID: <5B2217C5.1090700@brenbarn.net>

On 2018-06-14 00:10, Steven D'Aprano wrote:
> Rather than add eight new methods, we could allow the existing string
> methods to take pattern objects as arguments. That gives us potentially:
>
> count, endswith, find, index, lstrip, partition, replace, rfind,
> rindex, rpartition, rsplit, rstrip, split, startswith, strip
>
> (15 methods) that support regex pattern objects, pretty much covering
> all the functionality of:
>
> match, fullmatch, search, split, sub, subn
>
> and then some. re.findall is redundant. That leaves (potentially) only a
> single re function to turn into a string method: finditer.
>
> How do you get the pattern object? We have three possible tactics:
>
> - import re and call re.compile;
>
> - add a compile method to str;
>
> - add special regex syntax, let's say /pattern/ for the sake of the
> argument.

	Unless a special regex syntax is added, I don't see that there's 
much benefit to allowing a compiled object as the argument. (And I 
don't support adding special regex syntax!) The point is to be able to 
easily type regular expressions. If using a pattern argument still 
requires you to import re and call functions in there, it's not worth 
it. In order for there to be any gain in convenience, you need to be 
able to pass the actual regex directly to the string method.
But there is another way to do this beyond the ones you listed: give 
.find() (or whatever methods we decide should support regexes) an extra 
boolean "regex" argument that specifies whether to interpret the target 
string as a literal string or a regex.

	I'm not sure why I'm arguing this point, though. :-) Because I 
actually agree with you (and others on this thread) that there is no 
real need to make regexes more convenient. I think importing the re 
module and using the functions therein is fine. If anything, I think 
the name "re" is too short and cryptic and should be made longer!

-- 
Brendan Barnwell
"Do not follow where the path may lead. Go, instead, where there is no 
path, and leave a trail."
--author unknown

From steve at pearwood.info  Thu Jun 14 03:29:03 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Thu, 14 Jun 2018 17:29:03 +1000
Subject: [Python-ideas] Give regex operations more sugar
In-Reply-To: <5B220C2A.2010108@canterbury.ac.nz>
References: <27b67d11-b37a-9a1d-0d85-f041c38dc6d2@mgmiller.net>
<20180614060758.GF12683@ando.pearwood.info>
<5B220C2A.2010108@canterbury.ac.nz>
Message-ID: <20180614072902.GH12683@ando.pearwood.info>

On Thu, Jun 14, 2018 at 06:33:14PM +1200, Greg Ewing wrote:
> Steven D'Aprano wrote:
> >- should targets match longest first or shortest first? or a flag
> > to choose which you want?
> >
> >- what if you have multiple targets and you need to give some longer
> > ones priority, and some shorter ones?
> 
> I think the suggestion made earlier is reasonable: match
> them in the order they're given. Then the user gets
> complete control over the priorities.

"Explicit is better than implicit" -- the problem with having the order 
be meaningful is that it opens us up to silent errors when we neglect 
to consider the order.

replace((spam, eggs, cheese) ...)

*seems* like it simply means "replace any of spam, eggs or cheese" and 
it is easy to forget that the order of replacement is *sometimes* 
meaningful. But not always. So this is a bug magnet in waiting.

So I'd rather have to explicitly specify the order with a parameter 
rather than implicitly according to how I happen to have built the 
tuple.

# remove duplicates
targets = tuple(set(targets))
newstring = mystring.replace(targets, replacement)

That's buggy, but it doesn't look buggy, and you could test it until 
the cows come home and never notice the bug.

-- 
Steve

From steve at pearwood.info  Thu Jun 14 04:02:58 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Thu, 14 Jun 2018 18:02:58 +1000
Subject: [Python-ideas] Give regex operations more sugar
In-Reply-To: <5B2217C5.1090700@brenbarn.net>
References: <27b67d11-b37a-9a1d-0d85-f041c38dc6d2@mgmiller.net>
<40D7A384-5B43-4A5A-965D-9BB922151513@gmail.com>
<013d2473-4eca-dbd9-d7ac-9fc29a4a30c9@gmail.com>
<20180614052952.GE12683@ando.pearwood.info>
<5B22075B.5090007@brenbarn.net>
<20180614071028.GG12683@ando.pearwood.info>
<5B2217C5.1090700@brenbarn.net>
Message-ID: <20180614080257.GI12683@ando.pearwood.info>

On Thu, Jun 14, 2018 at 12:22:45AM -0700, Brendan Barnwell wrote:

> Unless a special regex syntax is added, I don't see that there's
> much benefit to allowing a compiled object as the argument.

Fair enough -- I'm not convinced that this proposal is either desirable 
or necessary either, I'm just suggesting what we *could* do if we 
choose to.

But I'll admit that I'm biased: I find all but the simplest regexes 
virtually unreadable and I'm very antipathetic to anything which 
encourages the use of regexes.
(Even though I intellectually know that they're just a tool and we shouldn't blame regexes for the abuses some people put them too.) [...] > In order for there to be any gain in convenience, you need to be > able to pass the actual regex directly to the string method. But there is > another way to do this beyond the ones you listed: give .find() (or > whatever methods we decide should support regexes) an extra boolean > "regex" argument that specifies whether to interpret the target string > as a literal string or a regex. Guido has a guideline (one I agree with): no constant bool arguments. If you have a method or function that takes a flag that swaps between two modes, and in practice the flag is only ever (or almost only ever) going to be given as a literal, then it is better to split the function into two distinctly named functions and forego the flag. *Especially* if the flag simply swaps between two distinct implementations with little or nothing in common. > I'm not sure why I'm arguing this point, though. :-) Because I > actually agree with you (and others on this thread) that there is no > real need to make regexes more convenient. I think importing the re > module and using the functions therein is fine. If anything, I think > the name "re" is too short and cryptic and should be made longer! Heh, even I don't go that far :-) -- Steve From steve at pearwood.info Thu Jun 14 04:21:51 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 14 Jun 2018 18:21:51 +1000 Subject: [Python-ideas] Give regex operations more sugar In-Reply-To: <5B221562.5090004@brenbarn.net> References: <27b67d11-b37a-9a1d-0d85-f041c38dc6d2@mgmiller.net> <40D7A384-5B43-4A5A-965D-9BB922151513@gmail.com> <013d2473-4eca-dbd9-d7ac-9fc29a4a30c9@gmail.com> <20180614052952.GE12683@ando.pearwood.info> <5B22075B.5090007@brenbarn.net> <5B221562.5090004@brenbarn.net> Message-ID: <20180614082151.GJ12683@ando.pearwood.info> On Thu, Jun 14, 2018 at 12:12:34AM -0700, Brendan Barnwell wrote: > On 2018-06-13 23:37, Chris Angelico wrote: [...] > >How is this materially different from: > > > >"some string".re_match(...) > > > >? It's not a grouped namespace in any technical sense, but to any > >human, a set of methods that start with a clear prefix is functionally > >a group. > > Do you really mean that? :-) > > As far as I can see, by the same argument, there is no need for > modules. Instead of math.sin and math.cos, we can just have math_sin > and math_cos. Instead of os.path.join we can just have os_path_join. > And so on. Just one big namespace for everything. But as we all know, > namespaces are one honking great idea! I'm not Chris, but I'll try to give an answer... Visually, there shouldn't be any difference between using . as a namespace separator and using _ instead. Whether we type math.sin or math_sin makes little difference beyond familiarity. But it does make a difference in whether we can treat math as a distinct object without the .sin part, and whether we can treat namespaces as real values or not. So math.sin is little different from math_sin, but the fact that math alone is a module, a first-class object, and not just a prefix of the name, makes a big difference. As you say: > Now, of course there are other advantages to modules (such as being > able to save the time of loading things you don't need), Loading on demand is one such advantage. Organising source code is another. 
Being able to pass the math object around as a first-class value, to
call getattr() and setattr() or vars() or use introspection on it. You
can't do that if it's just a name prefix.

> and likewise
> there are other advantages to this descriptor mechanism in some cases.
> (For instance, sometimes the sub-object may want to hold state if it is
> going to be passed around and used later, rather than just having a
> method called and being thrown away immediately.)

We can get that from making the regex method a method directly on the
string object. The question I have is, what benefit does the str.re
intermediate object bring? Does it carry its own weight?

In his refactoring books, Martin Fowler makes it clear that objects
ought to carry their own weight. When an object grows too big, you
ought to split out functionality and state into intermediate objects.
But if those intermediate objects do too little, the extra complexity
they bring isn't justified by their usefulness.

class Count:
    def __init__(self, start=0):
        self.counter = start

    def __iadd__(self, value):
        self.counter += value
        return self

Would you use that class, or say it simply adds a needless level of
indirection?

If the re namespace doesn't do something to justify itself beyond
simply adding a namespace, then Chris is right: we might as well just
use re_ as a prefix and use a de facto namespace, and save the extra
mental complexity and the additional indirection by dropping this
intermediate descriptor object.

-- 
Steve

From rosuav at gmail.com Thu Jun 14 04:59:24 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Thu, 14 Jun 2018 18:59:24 +1000
Subject: [Python-ideas] Give regex operations more sugar
In-Reply-To: <20180614082151.GJ12683@ando.pearwood.info>
References: <27b67d11-b37a-9a1d-0d85-f041c38dc6d2@mgmiller.net>
 <40D7A384-5B43-4A5A-965D-9BB922151513@gmail.com>
 <013d2473-4eca-dbd9-d7ac-9fc29a4a30c9@gmail.com>
 <20180614052952.GE12683@ando.pearwood.info>
 <5B22075B.5090007@brenbarn.net> <5B221562.5090004@brenbarn.net>
 <20180614082151.GJ12683@ando.pearwood.info>
Message-ID: 

On Thu, Jun 14, 2018 at 6:21 PM, Steven D'Aprano wrote:
> On Thu, Jun 14, 2018 at 12:12:34AM -0700, Brendan Barnwell wrote:
>> On 2018-06-13 23:37, Chris Angelico wrote:
> [...]
>> >How is this materially different from:
>> >
>> >"some string".re_match(...)
>> >
>> >? It's not a grouped namespace in any technical sense, but to any
>> >human, a set of methods that start with a clear prefix is functionally
>> >a group.
>>
>> Do you really mean that? :-)
>>
>> As far as I can see, by the same argument, there is no need for
>> modules. Instead of math.sin and math.cos, we can just have math_sin
>> and math_cos. Instead of os.path.join we can just have os_path_join.
>> And so on. Just one big namespace for everything. But as we all know,
>> namespaces are one honking great idea!
>
> I'm not Chris, but I'll try to give an answer...
>
> Visually, there shouldn't be any difference between using . as a
> namespace separator and using _ instead. Whether we type math.sin or
> math_sin makes little difference beyond familiarity.
>
> But it does make a difference in whether we can treat math as a distinct
> object without the .sin part, and whether we can treat namespaces as
> real values or not.
>
> So math.sin is little different from math_sin, but the fact that math
> alone is a module, a first-class object, and not just a prefix of the
> name, makes a big difference.

Yep. That's pretty much what I meant. There are many different types
of namespace in Python.
Some are actual first-class objects (modules, classes, etc). Others are not, but (to a programmer) are very similar (classes that end "Error", the various constants in the stat module, etc). Sometimes it's useful to query a collection - you can say "show me all the methods and attributes of float" or "give me all the builtins that end with Error" - and as groups or collections, both types of namespace are reasonably functional. But there is a very real *thing* that collects up all the float methods, and that is the type . That's a thing, and it has an identity. What is the thing that gathers together all Errors (as opposed to, say, all subclasses of Exception, which can be queried from the Exception type)? Sometimes the line is blurry. What's the true identity of the math module, other than "the collection of all things mathy"? It'd be plausible to have a "trig" module that has sin/cos/tan etc, and it'd also be plausible to say "from math import Fraction". But when there is no strong identity to the actual thing, and there's a technical and technological reason to avoid giving it an arbitrary identity (what is "spam".re and just how magical is it?), there's basically no reason to do it. Python gives us multiple tools, and there are good reasons to use all of them. In this case, yes, I most definitely *am* saying that <"spam".re_> is a valid human-readable namespace, but one which has no intrinsic identity. ChrisA From gadgetsteve at live.co.uk Thu Jun 14 05:27:05 2018 From: gadgetsteve at live.co.uk (Steve Barnes) Date: Thu, 14 Jun 2018 09:27:05 +0000 Subject: [Python-ideas] Allow filtered dir built in Message-ID: Currently when working with interactive sessions using the dir() or dir(module) built in is incredibly useful for exploring what functionality is available in a module. (Especially the regrettable libraries or modules that add really valuable functionality but have no or limited docstrings). However I often find that when a module adds a lot of functions I need to filter those entries to be able to find the one that I need, e.g.: >>> import mpmath >>> dir(mpmath) # This produces 390+ lines of output but >>> for name in dir(mpmath): ... if 'sin' in name: ... print(name) # gives me a mere 13 to consider as candidates What I would really like to do is: >>> dir(mpmath.*sin*) However, I know that the interpreter will hit problems with one or more operators being embedded in the module name. What I would like to suggest is extending the dir built-in to allow an optional filter parameter that takes fnmatch type wild card as an optional filter. Then I could use: >>> dir(mpmath, "*sin*") To narrow down the candidates. Ideally, this could have a recursive variant that would also include listing, (and filtering), any sub-packages. -- Steve (Gadget) Barnes Any opinions in this message are my personal opinions and do not reflect those of my employer. --- This email has been checked for viruses by AVG. 
https://www.avg.com From phd at phdru.name Thu Jun 14 05:45:14 2018 From: phd at phdru.name (Oleg Broytman) Date: Thu, 14 Jun 2018 11:45:14 +0200 Subject: [Python-ideas] Give regex operations more sugar In-Reply-To: <5B221562.5090004@brenbarn.net> References: <27b67d11-b37a-9a1d-0d85-f041c38dc6d2@mgmiller.net> <40D7A384-5B43-4A5A-965D-9BB922151513@gmail.com> <013d2473-4eca-dbd9-d7ac-9fc29a4a30c9@gmail.com> <20180614052952.GE12683@ando.pearwood.info> <5B22075B.5090007@brenbarn.net> <5B221562.5090004@brenbarn.net> Message-ID: <20180614094514.3dyl5m6qf6p6zk52@phdru.name> On Thu, Jun 14, 2018 at 12:12:34AM -0700, Brendan Barnwell wrote: > as we all know, namespaces are one > honking great idea! Flat is better than nested, so additional string.re subnamespace is not needed. > -- > Brendan Barnwell Oleg. -- Oleg Broytman https://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN. From phd at phdru.name Thu Jun 14 05:49:32 2018 From: phd at phdru.name (Oleg Broytman) Date: Thu, 14 Jun 2018 11:49:32 +0200 Subject: [Python-ideas] Give regex operations more sugar In-Reply-To: <5B2217C5.1090700@brenbarn.net> References: <27b67d11-b37a-9a1d-0d85-f041c38dc6d2@mgmiller.net> <40D7A384-5B43-4A5A-965D-9BB922151513@gmail.com> <013d2473-4eca-dbd9-d7ac-9fc29a4a30c9@gmail.com> <20180614052952.GE12683@ando.pearwood.info> <5B22075B.5090007@brenbarn.net> <20180614071028.GG12683@ando.pearwood.info> <5B2217C5.1090700@brenbarn.net> Message-ID: <20180614094932.jluqypyydytynzbq@phdru.name> On Thu, Jun 14, 2018 at 12:22:45AM -0700, Brendan Barnwell wrote: > If anything, I think the name "re" is too short > and cryptic and should be made longer! import re as regular_expressions_operations > -- > Brendan Barnwell Oleg. -- Oleg Broytman https://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN. From kenlhilton at gmail.com Thu Jun 14 07:01:11 2018 From: kenlhilton at gmail.com (Ken Hilton) Date: Thu, 14 Jun 2018 19:01:11 +0800 Subject: [Python-ideas] Multiple replacement in one call [was: Give regex operations more sugar] In-Reply-To: References: Message-ID: Just changing the subject line here, to keep things on topic Sincerely, Ken; ---------- Forwarded message --------- Date: Thu, 14 Jun 2018 17:29:03 +1000 From: Steven D'Aprano To: python-ideas at python.org ?? Subject: Re: [Python-ideas] Give regex operations more sugar Message-ID: <20180614072902.GH12683 at ando.pearwood.info> Content-Type: text/plain; charset=us-ascii On Thu, Jun 14, 2018 at 06:33:14PM +1200, Greg Ewing wrote: > Steven D'Aprano wrote: > >- should targets match longest first or shortest first? or a flag > > to choose which you want? > > > >- what if you have multiple targets and you need to give some longer > > ones priority, and some shorter ones? > > I think the suggestion made earlier is reasonable: match > them in the order they're given. Then the user gets > complete control over the priorities. "Explicit is better than implicit" -- the problem with having the order be meaningful is that it opens us up to silent errors when we neglect to consider the order. replace((spam, eggs, cheese) ...) *seems* like it simply means "replace any of spam, eggs or cheese" and it is easy to forget that that the order of replacement is *sometimes* meaningful. But not always. So this is a bug magnet in waiting. So I'd rather have to explicitly specify the order with a parameter rather than implicitly according to how I happen to have built the tuple. 
# remove duplicates
targets = tuple(set(targets))
newstring = mystring.replace(targets, replacement)

That's buggy, but it doesn't look buggy, and you could test it until
the cows come home and never notice the bug.

-- 
Steve

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From daniel.sanchez.fabregas at xunta.gal Thu Jun 14 07:03:37 2018
From: daniel.sanchez.fabregas at xunta.gal (=?UTF-8?Q?Daniel_S=c3=a1nchez_F=c3=a1bregas?=)
Date: Thu, 14 Jun 2018 13:03:37 +0200
Subject: [Python-ideas] Check type hints in stack trace printing
Message-ID: 

My idea consists of:
Adding a method to perform type checking in traceback objects.
When printing stack traces, search for mistyped arguments and warn the
user about them.

Don't know if it is on the roadmap, but it seems to have a good
cost/benefit ratio to me.

From greg.ewing at canterbury.ac.nz Thu Jun 14 08:08:48 2018
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 15 Jun 2018 00:08:48 +1200
Subject: [Python-ideas] Give regex operations more sugar
In-Reply-To: <20180614082151.GJ12683@ando.pearwood.info>
References: <27b67d11-b37a-9a1d-0d85-f041c38dc6d2@mgmiller.net>
 <40D7A384-5B43-4A5A-965D-9BB922151513@gmail.com>
 <013d2473-4eca-dbd9-d7ac-9fc29a4a30c9@gmail.com>
 <20180614052952.GE12683@ando.pearwood.info>
 <5B22075B.5090007@brenbarn.net> <5B221562.5090004@brenbarn.net>
 <20180614082151.GJ12683@ando.pearwood.info>
Message-ID: <5B225AD0.7010501@canterbury.ac.nz>

Steven D'Aprano wrote:
> So math.sin is little different from math_sin, but the fact that math
> alone is a module, a first-class object, and not just a prefix of the
> name, makes a big difference.

This is important because it provides ways of referring to things in
the module without having to write out the whole module name every
time, e.g.

import math as m
y = m.sin(x)

Would it be useful to pull out mystring.re and use it this way? I
don't know. Maybe sometimes, the same way that extracting bound
methods is sometimes useful.

-- 
Greg

From greg.ewing at canterbury.ac.nz Thu Jun 14 08:12:26 2018
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 15 Jun 2018 00:12:26 +1200
Subject: [Python-ideas] Give regex operations more sugar
In-Reply-To: <20180614094932.jluqypyydytynzbq@phdru.name>
References: <27b67d11-b37a-9a1d-0d85-f041c38dc6d2@mgmiller.net>
 <40D7A384-5B43-4A5A-965D-9BB922151513@gmail.com>
 <013d2473-4eca-dbd9-d7ac-9fc29a4a30c9@gmail.com>
 <20180614052952.GE12683@ando.pearwood.info>
 <5B22075B.5090007@brenbarn.net> <20180614071028.GG12683@ando.pearwood.info>
 <5B2217C5.1090700@brenbarn.net> <20180614094932.jluqypyydytynzbq@phdru.name>
Message-ID: <5B225BAA.8050604@canterbury.ac.nz>

Oleg Broytman wrote:
> import re as regular_expressions_operations

Personally I would use

import re as ridiculously_enigmatic_operations

-- 
Greg

From steve at pearwood.info Thu Jun 14 08:33:50 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Thu, 14 Jun 2018 22:33:50 +1000
Subject: [Python-ideas] Allow filtered dir built in
In-Reply-To: 
References: 
Message-ID: <20180614123350.GK12683@ando.pearwood.info>

On Thu, Jun 14, 2018 at 09:27:05AM +0000, Steve Barnes wrote:

[...]
> What I would like to suggest is extending the dir built-in to allow an
> optional filter parameter that takes fnmatch type wild card as an
> optional filter. Then I could use:
>
> >>> dir(mpmath, "*sin*")
>
> To narrow down the candidates.

I have exactly that in my Python startup file. It monkey-patches the
builtin dir with a custom wrapper function that has signature:

edir( [object, [glob='',]] *, meta=False, dunder=True, private=True)

For the glob, I support the following metacharacters:

- Reverse matching: if the glob begins with '!' or '!=', the
  sense of the match is reversed to "don't match".

- Case-sensitive matching: if the glob begins with '=' or '!=',
  perform a case-sensitive match. Otherwise filters are case-
  insensitive by default.

- Wildcards: '?' to match a single character, '*' to match
  zero or more characters.

- Character sets: e.g. '[abc]' to match any of 'a', 'b', 'c',
  or '[!abc]' to match any character except 'a', 'b', 'c'.

If the glob argument contains no metacharacters apart from the ! and =
flags, a straight substring match is performed. So there's no need to
match on a glob "*foo*", I can just use "foo".

If there is interest in this, I will clean it up for public consumption,
and publish it somewhere as a third-party module, and for consideration
for the stdlib.

-- 
Steve

From steve at pearwood.info Thu Jun 14 08:37:05 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Thu, 14 Jun 2018 22:37:05 +1000
Subject: [Python-ideas] Check type hints in stack trace printing
In-Reply-To: 
References: 
Message-ID: <20180614123705.GL12683@ando.pearwood.info>

On Thu, Jun 14, 2018 at 01:03:37PM +0200, Daniel Sánchez Fábregas wrote:
> My idea consists of:
> Adding a method to perform type checking in traceback objects.
> When printing stack traces, search for mistyped arguments and warn the
> user about them.

Can you give a concrete example of how this would work?

-- 
Steve

From gadgetsteve at live.co.uk Thu Jun 14 09:32:28 2018
From: gadgetsteve at live.co.uk (Steve Barnes)
Date: Thu, 14 Jun 2018 13:32:28 +0000
Subject: [Python-ideas] Allow filtered dir built in
In-Reply-To: <20180614123350.GK12683@ando.pearwood.info>
References: 
 <20180614123350.GK12683@ando.pearwood.info>
Message-ID: 

On 14/06/2018 13:33, Steven D'Aprano wrote:
> On Thu, Jun 14, 2018 at 09:27:05AM +0000, Steve Barnes wrote:
>
> [...]
>> What I would like to suggest is extending the dir built-in to allow an
>> optional filter parameter that takes fnmatch type wild card as an
>> optional filter. Then I could use:
>>
>> >>> dir(mpmath, "*sin*")
>>
>> To narrow down the candidates.
>
> I have exactly that in my Python startup file. It monkey-patches the
> builtin dir with a custom wrapper function that has signature:
>
> edir( [object, [glob='',]] *, meta=False, dunder=True, private=True)
>
> For the glob, I support the following metacharacters:
>
> - Reverse matching: if the glob begins with '!' or '!=', the
>   sense of the match is reversed to "don't match".
>
> - Case-sensitive matching: if the glob begins with '=' or '!=',
>   perform a case-sensitive match. Otherwise filters are case-
>   insensitive by default.
>
> - Wildcards: '?' to match a single character, '*' to match
>   zero or more characters.
>
> - Character sets: e.g. '[abc]' to match any of 'a', 'b', 'c',
>   or '[!abc]' to match any character except 'a', 'b', 'c'.
>
> If the glob argument contains no metacharacters apart from the ! and =
> flags, a straight substring match is performed. So there's no need to
> match on a glob "*foo*", I can just use "foo".
>
> If there is interest in this, I will clean it up for public consumption,
> and publish it somewhere as a third-party module, and for consideration
> for the stdlib.
>
>
Obviously, I personally would be interested (probably goes without saying).

-- 
Steve (Gadget) Barnes
Any opinions in this message are my personal opinions and do not reflect
those of my employer.

---
This email has been checked for viruses by AVG.
https://www.avg.com

From ericfahlgren at gmail.com Thu Jun 14 10:10:42 2018
From: ericfahlgren at gmail.com (Eric Fahlgren)
Date: Thu, 14 Jun 2018 07:10:42 -0700
Subject: [Python-ideas] Check type hints in stack trace printing
In-Reply-To: 
References: 
Message-ID: 

On Thu, Jun 14, 2018 at 4:03 AM Daniel Sánchez Fábregas wrote:

> My idea consists of:
> Adding a method to perform type checking in traceback objects.
> When printing stack traces, search for mistyped arguments and warn the
> user about them.
>

Isn't it faster and far more reliable to run your code through mypy (or
whatever) and detect these problems statically, rather than wait and hope
that you run into the problem dynamically?

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From chris.barker at noaa.gov Thu Jun 14 12:40:19 2018
From: chris.barker at noaa.gov (Chris Barker)
Date: Thu, 14 Jun 2018 09:40:19 -0700
Subject: [Python-ideas] Operator for inserting an element into a list
In-Reply-To: 
References: <5B2053FB.7020500@canterbury.ac.nz>
 <5B21AF9D.2080301@canterbury.ac.nz>
Message-ID: 

On Wed, Jun 13, 2018 at 6:45 PM, Mikhail V wrote:

> Another point is that people do like augmented operators much and for the
> append - there is so much advice like: hey, use L += [item] !
>

another data point -- in teaching, a number of newbie students do exactly
that. Actually, they do:

a_list += an_item

and find it does not do what they want, and then they get confused, and I
show that they need:

a_list += [an_item]

or

a_list.append(an_item)

(this gets particularly confusing when an_item is, itself, a sequence; a
string is really common).

So it would be nice to have an operator version of append -- but given the
limited number of operators, and their usual uses, I suspect it would cause
even more confusion...

But throwing it out there, how about (ab)using the mat_mul operator:

a_list @= an_item

Another note:

One of the major motivations for augmented assignment was being able to
support in-place operations for numpy:

an_array += something

which was MUCH nicer notation than what was required:

np.add(an_array, something, out=an_array)

That is a much bigger win than going from .append() to an operator.

-CHB

-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From mikhailwas at gmail.com Thu Jun 14 13:37:08 2018
From: mikhailwas at gmail.com (Mikhail V)
Date: Thu, 14 Jun 2018 20:37:08 +0300
Subject: [Python-ideas] Operator for inserting an element into a list
In-Reply-To: 
References: <5B2053FB.7020500@canterbury.ac.nz>
 <5B21AF9D.2080301@canterbury.ac.nz>
Message-ID: 

On Thu, Jun 14, 2018 at 7:40 PM, Chris Barker wrote:
> On Wed, Jun 13, 2018 at 6:45 PM, Mikhail V wrote:
>>
> So it would be nice to have an operator version of append -- but given the
> limited number of operators, and their usual uses, I suspect it would cause
> even more confusion...
> But throwing it out there, how about (ab)using the mat_mul operator:
>
> a_list @= an_item
>

Yes, none of the operators fits well by sense. But which symbol would
fit here if you could choose *any* symbol?

I've picked the caret ^ for two reasons:

1. It is originally associated with the insertion mark.
From wikipedia (https://en.wikipedia.org/wiki/Caret):

"The caret was originally used, and continues to be, in handwritten
form as a proofreading mark to indicate where a punctuation mark, word,
or phrase should be inserted in a document."

So there is some relation with "insert" operation. But of course this
makes sense mainly for people who have some editorial or typography
background, (such as myself). So for me it makes perfect sense.

2. The symbol ^ itself looks quite 'gentle' and to my eye, is not too
distracting (at least less than some other operators). So e.g. I find
that vertical bar | causes some bad eye straining effect.

> Another note:
>
> One of the major motivations for augmented assignment was being able to
> support in-place operations for numpy:
>
> an_array += something
>
> which was MUCH nicer notation than what was required:
>
> np.add(an_array, something, out=an_array)
>
> That is a much bigger win than going from .append() to an operator.
>

Agreed.

From tim.peters at gmail.com Thu Jun 14 14:44:34 2018
From: tim.peters at gmail.com (Tim Peters)
Date: Thu, 14 Jun 2018 13:44:34 -0500
Subject: [Python-ideas] Fwd: Trigonometry in degrees
In-Reply-To: 
References: <4630676A-D766-4663-9A9C-753CF5A05145@mac.com>
 <20180611183806.GZ12683@ando.pearwood.info>
 <5B1EFE4E.1050204@canterbury.ac.nz>
 <20180612004835.GA12683@ando.pearwood.info> <5B1F6AD7.3050302@canterbury.ac.nz>
Message-ID: 

I should note that numeric code "that works" is often much subtler than
it appears at first glance.  So, for educational purposes, I'll point
out some of what _wasn't_ said about this crucial function:

[Tim]
> import mpmath
> from math import fmod
> # Return (n, x) such that:
> # 1. d degrees is equivalent to x + n*(pi/2) radians.
> # 2. x is an mpmath float in [-pi/4, pi/4].
> # 3. n is an integer in range(4).
> # There is one potential rounding error, when mpmath.radians() is
> # used to convert a number of degrees between -45 and 45.  This is
> # done using the current mpmath precision.
> def treduce(d):
>     d = fmod(d, 360.0)
>     n = round(d / 90.0)
>     assert -4 <= n <= 4
>     d -= n * 90.0
>     assert -45.0 <= d <= 45.0
>     return n & 3, mpmath.radians(d)
>
> How do we know there is at most one rounding error in that?

No, it's not obvious.

That `fmod` is exact is guaranteed by the relevant standards, but most
people who write a libm don't get it right at first.  There is no
"cheap" way to implement it correctly.  It requires getting the effect
of doing exact integer division on, potentially, multi-thousand bit
integers.  Assuming x > y > 0, a correct implementation of fmod(x, y)
ends up in a loop that goes around a number of times roughly equal to
log2(x/y), simulating one-bit-at-a-time long division.  For example,
here's glibc's implementation:

https://github.com/bminor/glibc/blob/master/sysdeps/ieee754/dbl-64/e_fmod.c

Don't expect that to be easy to follow either ;-)

Then how do we know that `d -= n * 90.0` is exact?  That's not obvious
either.  It follows from the "Sterbenz lemma", one version of which:  if x
and y are non-zero floats of the same sign within a factor of 2 of each
other,

    1/2 <= x/y <= 2 (mathematically)

then x-y is exactly representable as a float too.
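A quick concrete check of the lemma, using exact rational arithmetic as
the referee (a minimal sketch):

    from fractions import Fraction

    x = 45.00000000000001   # 45 + 2**-47, within a factor of 2 of y
    y = 45.0
    # Exact subtraction, computed in unbounded rational arithmetic:
    exact = Fraction(x) - Fraction(y)
    # The float subtraction agrees exactly, as the lemma promises:
    assert Fraction(x - y) == exact
    print(x - y)   # 7.105427357601002e-15, i.e. exactly 2**-47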
The lemma is true regardless of the floating-point base, or of rounding
mode in use.  In IEEE-754, it doesn't even need weasel words to exempt
underflowing cases.

That lemma needs to be applied by cases, for each of the possible
values of (the integer) `n`.  It gets closest to failing for |n| = 1.
For example, if d is a tiny bit larger than 45, n is 1, and then d/90
is (mathematically) very close to 1/2.

Which is another thing that needs to be shown:  "if d is a tiny bit
larger than 45, n is 1".  Why?  It's certainly true if we were using
infinite precision, but we're not.

The smallest representable double > 45 is 45 + 2**-47:

>>> d = 45 + 2**-47
>>> d
45.00000000000001
>>> _.hex()
'0x1.6800000000001p+5'

Then d/90.0 (the argument to round()) is, with infinite precision,

    (45 + 2**-47)/90 = 0.5 + 2**-47/90

One ULP with respect to 0.5 is 2**-53, so that in turn is equal to

    0.5 + 2**-53/(90/64) =
    0.5 + (64/90)*2**-53 =
    0.5 + 0.71111111111... * 2**-53

Because the tail (0.711...) is greater than 0.5 ULP, it rounds up under
nearest-even rounding, to

    0.5 + 2**-53

>>> d / 90
0.5000000000000001
>>> 0.5 + 2**-53
0.5000000000000001

and so Python's round() rounds it up to 1:

>>> round(_)
1

Note that it would _not_ be true if truncating "rounding" were done, so
round-nearest is a hidden assumption in the code.

Similar analysis needs to be done at values near the boundaries around
all possible values of `n`.

That `assert -45.0 <= d <= 45.0` can't fail then follows from all of
that.

In all, a proof that the code is correct is much longer than the code
itself.  That's typical.  Alas, it's also typical that math library
sources rarely point out the subtleties.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From rob.cliffe at btinternet.com Thu Jun 14 17:27:48 2018
From: rob.cliffe at btinternet.com (Rob Cliffe)
Date: Thu, 14 Jun 2018 22:27:48 +0100
Subject: [Python-ideas] Allow filtered dir built in
In-Reply-To: 
References: 
Message-ID: <0feb4ef0-fa62-8181-0178-845df0f5aa99@btinternet.com>

On 14/06/2018 10:27, Steve Barnes wrote:
> Currently when working with interactive sessions using the dir() or
> dir(module) built in is incredibly useful for exploring what
> functionality is available in a module. (Especially the regrettable
> libraries or modules that add really valuable functionality but have no
> or limited docstrings).
>
> However I often find that when a module adds a lot of functions I need
> to filter those entries to be able to find the one that I need, e.g.:
>
> >>> import mpmath
> >>> dir(mpmath)  # This produces 390+ lines of output but
> >>> for name in dir(mpmath):
> ...     if 'sin' in name:
> ...         print(name)  # gives me a mere 13 to consider as candidates
>
> What I would really like to do is:
> >>> dir(mpmath.*sin*)

I have also hit this use case.  But it never seemed like much of a
hardship to write

    [x for x in dir(SomeObject) if ...]

Rob Cliffe

From mike at selik.org Thu Jun 14 18:04:49 2018
From: mike at selik.org (Michael Selik)
Date: Thu, 14 Jun 2018 15:04:49 -0700
Subject: [Python-ideas] Operator for inserting an element into a list
In-Reply-To: 
References: <5B2053FB.7020500@canterbury.ac.nz>
 <5B21AF9D.2080301@canterbury.ac.nz>
Message-ID: 

There's nothing wrong with your ideas if you were designing a language
from scratch. However, Python has a long history and many tools and uses
for the same operators you are considering. And it even has a current
"insert" operator (slice assignment), as shown below.
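For instance, insertion via slice assignment looks like this:

    lst = [1, 2, 4, 5]
    lst[2:2] = [3]       # insert a single element at index 2
    print(lst)           # [1, 2, 3, 4, 5]
    lst[5:5] = [6, 7]    # the same syntax splices in several items
    print(lst)           # [1, 2, 3, 4, 5, 6, 7]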
When adding a new feature, you need to consider if you're duplicating
current functionality and if a 2nd "obvious" way is worth breaking the
Zen.

Also, consider whether your "intuitive" design is following or breaking
a de-facto standard established elsewhere in the language. For example,
if you're making a file format decoder, you should create functions
"load" and "dump" to behave like every other file format decoder
module. Don't re-invent or we'll end up with too many "standards" and
the language will become hard to remember.

In case I need to clarify:
1. You're duplicating current clear and more flexible syntax.
2. Your proposed operators are confusing when compared with their
meanings elsewhere.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From mikhailwas at gmail.com Thu Jun 14 22:24:29 2018
From: mikhailwas at gmail.com (Mikhail V)
Date: Fri, 15 Jun 2018 05:24:29 +0300
Subject: [Python-ideas] Operator for inserting an element into a list
In-Reply-To: 
References: <5B2053FB.7020500@canterbury.ac.nz>
 <5B21AF9D.2080301@canterbury.ac.nz>
Message-ID: 

On Fri, Jun 15, 2018 at 1:04 AM, Michael Selik wrote:
[..]
> In case I need to clarify:
> 1. You're duplicating current clear and more flexible syntax.
> 2. Your proposed operators are confusing when compared with their meanings
> elsewhere.

what haven't we repeated in this thread yet? Motivation was explained.

About confusion probably haven't discussed: so Xor ^ works on sets,
IIRC for finding union without common elements. That's one point for
potential confusion - probably expect it to work with list. Sounds
probable? Other operator, e.g. bitshift <<, I don't think it has
potential for confusion.

@ operator - currently 'free'. But maybe there are some reserved plans
for lists and these operators as well - I don't know. IIRC some time
ago Steven D'Aprano proposed something with sets and some of the
operators (sorry in advance if that is wrong info, but I'm almost sure
there was a related mathematical discussion with sets involved).

From mike at selik.org Thu Jun 14 22:40:44 2018
From: mike at selik.org (Michael Selik)
Date: Thu, 14 Jun 2018 19:40:44 -0700
Subject: [Python-ideas] Operator for inserting an element into a list
In-Reply-To: 
References: <5B2053FB.7020500@canterbury.ac.nz>
 <5B21AF9D.2080301@canterbury.ac.nz>
Message-ID: 

On Thu, Jun 14, 2018, 7:24 PM Mikhail V wrote:

> what haven't we repeated in this thread yet? Motivation was explained.
>

You have repeated your explanations a few times. It isn't convincing.

It seems to me that your main complaint is that strings are iterable,
though you haven't expressed it as such. During slice assignment, you're
surprised that the string is a sequence of characters. I understand, but I
don't find that confusion a compelling reason to add a new operator.

> About confusion probably haven't discussed: so Xor ^ works on sets,
> IIRC for finding
> union without common elements.
>

I wrote about this in an earlier email, which you didn't reply to. You're
correct that xor doesn't make sense with lists, especially not between a
list and a single element.

Again, though other operators have been given different meanings in
special contexts, those decisions should be seen as abnormal, not an
excuse to keep creating more special cases.

>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From mike at selik.org Thu Jun 14 22:51:24 2018 From: mike at selik.org (Michael Selik) Date: Thu, 14 Jun 2018 19:51:24 -0700 Subject: [Python-ideas] Operator for inserting an element into a list In-Reply-To: References: <5B2053FB.7020500@canterbury.ac.nz> <5B21AF9D.2080301@canterbury.ac.nz> Message-ID: Sorry, I forgot that you dropped the suggestion to make it an insert operator and are only asking for an append operator. I see no benefit to this, because += already is an elegant way to extend a list, which is more flexible than append. Yes, if the right-hand is an iterable and should be appended as a single element, you'll need to enclose it in a single-element container. This is true for strings, lists, sets, whatever. It's natural and is not a "trick". If you would like to prove the need for this operator, one piece of evidence you can provide is a count of the number of times someone writes "list.append" for an iterable vs "+=" and encloses a str or other type in a throw-away list to effectively append. If the latter habit is common, that's evidence that the language may need to be improved. Why don't you search GitHub projects to collect some statistics? On Thu, Jun 14, 2018, 7:40 PM Michael Selik wrote: > > > On Thu, Jun 14, 2018, 7:24 PM Mikhail V wrote: > >> what haven't we repeated in this thread yet? Motivation was explained. >> > > You have repeated your explanations a few times. It isn't convincing. > > It seems to me that your main complaint is that strings are iterable, > though you haven't expressed it as such. During slice assignment, you're > surprised that the string is a sequence of characters. I understand, but I > don't find that confusion a compelling reason to add a new operator. > > About confusion probably haven't discussed : so Xor ^ works on sets, >> IIRC for finding >> union without common elements. >> > > I wrote about this in an earlier email, which you didn't reply to. You're > correct that xor doesn't make sense with lists, especially not between a > list and a single element. > > Again, though other operators have been given different meanings in > special contexts, those decisions should be seen as abnormal, not an excuse > to keep creating more special cases. > >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From leewangzhong+python at gmail.com Fri Jun 15 02:22:12 2018 From: leewangzhong+python at gmail.com (Franklin? Lee) Date: Fri, 15 Jun 2018 02:22:12 -0400 Subject: [Python-ideas] Give regex operations more sugar In-Reply-To: <5B22075B.5090007@brenbarn.net> References: <27b67d11-b37a-9a1d-0d85-f041c38dc6d2@mgmiller.net> <40D7A384-5B43-4A5A-965D-9BB922151513@gmail.com> <013d2473-4eca-dbd9-d7ac-9fc29a4a30c9@gmail.com> <20180614052952.GE12683@ando.pearwood.info> <5B22075B.5090007@brenbarn.net> Message-ID: On Thu, Jun 14, 2018 at 2:12 AM, Brendan Barnwell wrote: > On 2018-06-13 22:29, Steven D'Aprano wrote: >> >> On Wed, Jun 13, 2018 at 10:59:34PM +0200, Michel Desmoulin wrote: >> >>> > Attaching an entire module to a type is probably worse than >>> > adding a slew of extra methods to the type. >>> > >>> >>> Not my point. >>> >>> str.re would not be the re module, just a namespace where to group all >>> regex related string methods. >> >> >> That's what a module is :-) >> >> How would this work? If I say: >> >> "My string".re.match(...) >> >> if str.re is "just a namespace" how will the match function know the >> string it is to operate on? 
> > > str.re can be a descriptor object which "knows" which string > instance it is bound to. This kind of thing is common in many libraries. > Pandas for example has all kinds of things like df.loc[1:3], > df.column.str.startswith('blah'), etc. The "loc" and "str" attributes give > objects which are bound (in the sense that bound methods are bound) to the > objects on which they are accessed, so when you use these attributes to do > things, the effect takes account of on the "root" object on which you > accessed the attribute. > > Personally I think this is a great way to reduce namespace clutter > and group related functionality without having to worry about using up all > the short or "good" names at the top level. I'm not sure I agree with the > specific proposal here for allowing regex operations on strings, but if we > do do it, this would be a good way to do it. It's a clever idea, but it's a completely new (at least to standard Python) way to call a function that acts on a given argument. That means more to learn. We already have foo.bar(...) and bar(foo): "Hello!".count("o") len("Hello!") Nesting is hiding. Hiding can be good or bad. Adding `foo.b.ar()` will make it harder to discover. It's also magical: To understand what `foo.b.ar()` does, you can't think of `foo.b` as a (semantic) property of the object, or a method of the object, but as a descriptor trick which holds more methods of that object. I mainly use Python on a REPL. When I'm on IPython, I can ask what properties and methods an object has. When I'm on the basic Python REPL, I use `dir`, or a function which filters and prints `dir` in a nicer way. Nested method namespaces will be harder to navigate through. I would not be able to programmatically tell whether a property is just a property. I'd need to manually inspect each oddly-named property, to make sure it's not hiding more methods of the object (and that would only work if the docstrings are maintained and clear enough for me). I don't see any advantage of using `foo.b.ar()` over `foo.b_ar()`. In either case, you'd need to spell out the whole name each time (unlike with import statements), unless you save the bound method, which you can do in both cases. P.S.: Is there any way of guessing what proportion of Python programs use `re`, either explicitly or implicitly? How many programs will, at some point in their runtime, load the `re` module? From mike at selik.org Fri Jun 15 02:32:09 2018 From: mike at selik.org (Michael Selik) Date: Thu, 14 Jun 2018 23:32:09 -0700 Subject: [Python-ideas] Give regex operations more sugar In-Reply-To: References: <27b67d11-b37a-9a1d-0d85-f041c38dc6d2@mgmiller.net> <40D7A384-5B43-4A5A-965D-9BB922151513@gmail.com> <013d2473-4eca-dbd9-d7ac-9fc29a4a30c9@gmail.com> <20180614052952.GE12683@ando.pearwood.info> <5B22075B.5090007@brenbarn.net> Message-ID: On Thu, Jun 14, 2018, 11:22 PM Franklin? Lee wrote: > P.S.: Is there any way of guessing what proportion of Python programs > use `re`, either explicitly or implicitly? How many programs will, at > some point in their runtime, load the `re` module? > GitHub posts it's data to Google BigQuery. It's a biased sample, but it's the largest open repository of code I'm aware of. Hmm. Better search now before it gets moved to Azure :-/ > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 

From j.van.dorp at deonet.nl Fri Jun 15 02:45:55 2018
From: j.van.dorp at deonet.nl (Jacco van Dorp)
Date: Fri, 15 Jun 2018 08:45:55 +0200
Subject: [Python-ideas] Give regex operations more sugar
In-Reply-To: 
References: <27b67d11-b37a-9a1d-0d85-f041c38dc6d2@mgmiller.net>
 <40D7A384-5B43-4A5A-965D-9BB922151513@gmail.com>
 <013d2473-4eca-dbd9-d7ac-9fc29a4a30c9@gmail.com>
 <20180614052952.GE12683@ando.pearwood.info>
 <5B22075B.5090007@brenbarn.net>
Message-ID: 

From a lurker's perspective, why not just implement str.compile() as a
new method, and have the methods where it's relevant support its result
as an argument?

That's a small change in additions, and the other methods in the normal
case just do the same as now. It's also pretty clear what things like
"whatever".replace("regex".compile(), "otherstring") should do in that
case.

From desmoulinmichel at gmail.com Fri Jun 15 03:49:35 2018
From: desmoulinmichel at gmail.com (Michel Desmoulin)
Date: Fri, 15 Jun 2018 09:49:35 +0200
Subject: [Python-ideas] Give regex operations more sugar
In-Reply-To: <20180614052952.GE12683@ando.pearwood.info>
References: <27b67d11-b37a-9a1d-0d85-f041c38dc6d2@mgmiller.net>
 <40D7A384-5B43-4A5A-965D-9BB922151513@gmail.com>
 <013d2473-4eca-dbd9-d7ac-9fc29a4a30c9@gmail.com>
 <20180614052952.GE12683@ando.pearwood.info>
Message-ID: 

On 14/06/2018 at 07:29, Steven D'Aprano wrote:
> On Wed, Jun 13, 2018 at 10:59:34PM +0200, Michel Desmoulin wrote:
>
>>> Attaching an entire module to a type is probably worse than
>>> adding a slew of extra methods to the type.
>>>
>>
>> Not my point.
>>
>> str.re would not be the re module, just a namespace where to group all
>> regex related string methods.
>
> That's what a module is :-)
>
> How would this work? If I say:
>
> "My string".re.match(...)
>
> if str.re is "just a namespace" how will the match function know the
> string it is to operate on?

There are a lot of ways to do that. One possible way:

import re

class re_proxy:
    def __init__(self, string):
        self.string = string

    def match(self, pattern, flags=0):
        return re.match(pattern, self.string, flags)

    ...

@property
def re(self):
    return re_proxy(self)

From desmoulinmichel at gmail.com Fri Jun 15 04:10:19 2018
From: desmoulinmichel at gmail.com (Michel Desmoulin)
Date: Fri, 15 Jun 2018 10:10:19 +0200
Subject: [Python-ideas] Allow filtered dir built in
In-Reply-To: 
References: 
Message-ID: <46e5178a-2c16-4e1d-4ae8-fe58a97b8f89@gmail.com>

On 14/06/2018 at 11:27, Steve Barnes wrote:
> Currently when working with interactive sessions using the dir() or
> dir(module) built in is incredibly useful for exploring what
> functionality is available in a module. (Especially the regrettable
> libraries or modules that add really valuable functionality but have no
> or limited docstrings).
>
> However I often find that when a module adds a lot of functions I need
> to filter those entries to be able to find the one that I need, e.g.:
>
> >>> import mpmath
> >>> dir(mpmath)  # This produces 390+ lines of output but
> >>> for name in dir(mpmath):
> ...     if 'sin' in name:
> ...         print(name)  # gives me a mere 13 to consider as candidates
>
> What I would really like to do is:
> >>> dir(mpmath.*sin*)
>
> However, I know that the interpreter will hit problems with one or more
> operators being embedded in the module name.
>
> What I would like to suggest is extending the dir built-in to allow an
> optional filter parameter that takes fnmatch type wild card as an
> optional filter. Then I could use:
>
> >>> dir(mpmath, "*sin*")
>
> To narrow down the candidates.
>
> Ideally, this could have a recursive variant that would also include
> listing, (and filtering), any sub-packages.
>

Fantastic idea. Would this make sense on vars() too? It's not exactly
the same usage context.

From steve at pearwood.info Fri Jun 15 06:40:47 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Fri, 15 Jun 2018 20:40:47 +1000
Subject: [Python-ideas] Allow filtered dir built in
In-Reply-To: <46e5178a-2c16-4e1d-4ae8-fe58a97b8f89@gmail.com>
References: 
 <46e5178a-2c16-4e1d-4ae8-fe58a97b8f89@gmail.com>
Message-ID: <20180615104047.GP12683@ando.pearwood.info>

On Fri, Jun 15, 2018 at 10:10:19AM +0200, Michel Desmoulin wrote:

> Fantastic idea. Would this make sense on vars() too? It's not exactly
> the same usage context.

No. The point of vars() is to return the actual namespace dict used by
an object. It's not primarily an introspection tool, it is the public
interface for accessing __dict__ without using the dunder directly.

vars() and dir() have completely different purposes, and they do very
different things.

-- 
Steve

From Richard at Damon-Family.org Fri Jun 15 09:33:44 2018
From: Richard at Damon-Family.org (Richard Damon)
Date: Fri, 15 Jun 2018 09:33:44 -0400
Subject: [Python-ideas] Fwd: Trigonometry in degrees
In-Reply-To: 
References: <20180611183806.GZ12683@ando.pearwood.info>
 <5B1EFE4E.1050204@canterbury.ac.nz> <20180612004835.GA12683@ando.pearwood.info>
 <5B1F6AD7.3050302@canterbury.ac.nz>
Message-ID: 

On 6/13/18 7:21 AM, Stephan Houben wrote:
>
>
> On Wed, 13 Jun 2018 at 13:12, Richard Damon wrote:
>
>     My first comment is that special casing values like this can lead to
>     some very undesirable properties when you use the function for
>     numerical
>     analysis. Suddenly your sind is no longer continuous (sind(x) is no
>     longer the limit of sind(x+d) as d goes to 0).
>
>
> The deviations introduced by the special casing are on the order of
> one ulp.
>
> At that level of detail the sin wasn't continuous to begin with.

I would say the change isn't one ulp, changing a non-zero number to
zero is not one ulp (unless maybe you are on the verge of underflow).
It may be one ulp of 'full scale', but we aren't near the full scale
point. It might be the right answer for a less than one ulp change in
the INPUT, but if we thought that way we wouldn't have minded the
non-zero result in the first place.

The fundamental motivation is that for 'nice angles' we want the 'nice
result' when possible, but the issue is that most of the 'nice angles'
in radians are not representable exactly, so it isn't surprising that
we don't get the nice results out.

One property that we like to preserve in functional calculation is
that the following pseudo code

xp = x + delta
derivative = ( f(xp) - f(x) ) / (xp - x)

(or variations where you subtract delta or work at x+delta and
x-delta) should approximate well the derivative of the function,
(which for sin in radians should be cos), and that this improves as
delta gets very small until we hit the rounding error in the
computation of f(x). (Note, I don't divide by delta, but xp-x to
remove the round off error in computing xp which isn't the fault of
the function f). Changing a point because it is the closest to the
nice number will cause this calculation to spike due to the single
point perturbation.
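A minimal runnable version of that check, assuming the radian-based sin
(the quotient should approach cos(x)):

    from math import sin, cos

    x = 1.0
    for delta in (1e-3, 1e-6, 1e-9):
        xp = x + delta
        derivative = (sin(xp) - sin(x)) / (xp - x)
        # The error shrinks with delta until rounding error in sin()
        # starts to dominate.
        print(delta, abs(derivative - cos(x)))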
Yes, this calculation may start to 'blow up' when f(xp) - f(x) is very
small compared to f(x) and we start to measure the round off error in
the computation of the function, but near a zero of the function (which
is common if we are root finding) we can do quite well.

>
>     As I stated in my initial comment on this, if you are going to
>     create a
>     sind function with the idea that you want 'nice' angles to return
>     'exact' results, then what you need to do is have the degree based
>     trig
>     routines do the angle reduction in degrees, and only when you have a
>     small enough angle, either use the radians version on the small
>     angle or
>     directly include an expansion in degrees.
>
>
> Yes that is what my code does.
> It reduces degrees to [0,90].
>
>
>     Angle reduction would be based on the identity that sin(x+y) =
>     sin(x) *
>     cos(y) + cos(x) * sin(y) and cos(x+y) = cos(x)*cos(y) - sin(x) *
>     sin(y).
>
>     If you want to find sin(z) for an arbitrary value z, you can reduce it
>     to an x+y where x is some multiple of say 15 degrees, and y is in the
>     range -7.5 to 7.5 degrees. You can have stored exact values of sin/cos
>     of the 15 degree increments (and only really need them between 0
>     and 90)
>     and then compute the sin and cos of the y value.
>
>
> This is not how sine functions are calculated. They are calculated by
> reducing angle to some interval, then evaluating a polynomial which
> approximates the true sine within that interval.
>
> Stephan
>

And that is what my method did (as others have said). Virtually all
methods of computing sin and cos use the angle addition formula (and
even quadrant reduction is using it for the special case of using one
of the angles where sin/cos are valued in -1, 0, 1). The methods that
least use it that I know of reduce the angle to a quadrant (or octant)
and then select one of a number of expansions good for a limited range,
or table interpolation (but even that sort of uses it with the
approximation of sin(x) ~ x and cos(x) ~ 1 for very small x)

-- 
Richard Damon

From mikhailwas at gmail.com Fri Jun 15 11:48:58 2018
From: mikhailwas at gmail.com (Mikhail V)
Date: Fri, 15 Jun 2018 18:48:58 +0300
Subject: [Python-ideas] Operator for inserting an element into a list
In-Reply-To: 
References: <5B2053FB.7020500@canterbury.ac.nz>
 <5B21AF9D.2080301@canterbury.ac.nz>
Message-ID: 

On Fri, Jun 15, 2018 at 5:51 AM, Michael Selik wrote:

> If you would like to prove the need for this operator, one piece of evidence
> you can provide is a count of the number of times someone writes
> "list.append" for an iterable vs "+=" and encloses a str or other type in a
> throw-away list to effectively append.

That's a strange idea - there is no doubt that one would use
list.append() and most probably
it is the case statistically.
So the question would be "what is wrong with list.append()?"
And as said many times, there is nothing wrong, but a lot of people
seem to want an in-place
operator for this purpose. And I can understand this, because:

1. append() is _ubiquitous_
2. in-place assignment form makes some emphasis on mutating, in
contrast to method call.

That's it. So instead of a method call one gets a clean element on the
right-hand side and (hopefully) emphasis on the in-place nature of the
operation.

A quick google search shows some tendency:

https://stackoverflow.com/a/2022044/4157407
https://stackoverflow.com/a/28119966/4157407

So you shouldn't explain it to _me_ - I don't see other significant
convincing points, and unless this gets support here - I am not
interested in continuing.
> > I see no benefit to this, because += already is an elegant way to extend a > list, which is more flexible than append. Yes, if the right-hand is an > iterable and should be appended as a single element, you'll need to enclose > it in a single-element container. This is true for strings, lists, sets, > whatever. It's natural and is not a "trick". > encouraging to mimic append() via += operator is bad practice. In the above links to SO, people try to tell the same, but that is not something that can be easily explained on paper. Now add here the constant wish for using in-place operator - and there will be on-going confusion. Regardless of how natural or general it is, it's something that should be avoided. Only in this discussion thread I had 3 different advices to use: L += ["aa"] L += ("aa",) L += "aa", But you should not give it to _me_ - i use the correct form only: L.append("aa") From rosuav at gmail.com Fri Jun 15 11:54:24 2018 From: rosuav at gmail.com (Chris Angelico) Date: Sat, 16 Jun 2018 01:54:24 +1000 Subject: [Python-ideas] Operator for inserting an element into a list In-Reply-To: References: <5B2053FB.7020500@canterbury.ac.nz> <5B21AF9D.2080301@canterbury.ac.nz> Message-ID: On Sat, Jun 16, 2018 at 1:48 AM, Mikhail V wrote: > On Fri, Jun 15, 2018 at 5:51 AM, Michael Selik wrote: > >> If you would like to prove the need for this operator, one piece of evidence >> you can provide is a count of the number of times someone writes >> "list.append" for an iterable vs "+=" and encloses a str or other type in a >> throw-away list to effectively append. > > That's strange idea - there is no doubt that one would use > list.append() and most probably > it is the case statistically. > So the question would be "what is wrong with list.append()?" > And as said many times, there is nothing wrong, but a lot of people > seem to want an in-place > operator for this purpose. And I can understand this, because: > > 1. append() is _ubiquitous_ > 2. in-place assignment form makes some emphasis on mutating, in > contrast to method call. How so? You can write "x += 1" with integers, and that doesn't mutate; but if you write "x.some_method()", doing nothing with the return value, it's fairly obvious that it's going to have side effects, most likely to mutate the object. Augmented assignment is no better than a method at that. ChrisA From mikhailwas at gmail.com Fri Jun 15 13:25:36 2018 From: mikhailwas at gmail.com (Mikhail V) Date: Fri, 15 Jun 2018 20:25:36 +0300 Subject: [Python-ideas] Operator for inserting an element into a list In-Reply-To: References: <5B2053FB.7020500@canterbury.ac.nz> <5B21AF9D.2080301@canterbury.ac.nz> Message-ID: On Fri, Jun 15, 2018 at 6:54 PM, Chris Angelico wrote: > On Sat, Jun 16, 2018 at 1:48 AM, Mikhail V wrote: >> On Fri, Jun 15, 2018 at 5:51 AM, Michael Selik wrote: >> >>> If you would like to prove the need for this operator, one piece of evidence >>> you can provide is a count of the number of times someone writes >>> "list.append" for an iterable vs "+=" and encloses a str or other type in a >>> throw-away list to effectively append. >> >> That's strange idea - there is no doubt that one would use >> list.append() and most probably >> it is the case statistically. >> So the question would be "what is wrong with list.append()?" >> And as said many times, there is nothing wrong, but a lot of people >> seem to want an in-place >> operator for this purpose. And I can understand this, because: >> >> 1. append() is _ubiquitous_ >> 2. 
in-place assignment form makes some emphasis on mutating, in
>> contrast to method call.
>
> How so? You can write "x += 1" with integers, and that doesn't mutate;
> but if you write "x.some_method()", doing nothing with the return
> value, it's fairly obvious that it's going to have side effects, most
> likely to mutate the object. Augmented assignment is no better than a
> method at that.
>

How would it be obvious unless you test it or already have learned by
heart that x.some_method() is in-place? For a list variable you might
expect it probably, and if you are already aware of mutability, etc.

It's just very uncommon to see standalone statements like:
x.method()

For me it became a habit to think that it lacks the left-hand part and
the =. Of course augmented assignment is not a panacea because it is
limited only to one operation, and the appeal of the operator itself is
under question.

As for x+=1 it is implementation detail - the historical idea of such
operators was mutating, so at least visually it's not like a returning
expression. And I am not sure about the x.method() form - was it meant
to hint to the user about anything? It seemed to me so when I started
to learn Python, but it's not.

From andre.roberge at gmail.com Fri Jun 15 13:38:22 2018
From: andre.roberge at gmail.com (Andre Roberge)
Date: Fri, 15 Jun 2018 14:38:22 -0300
Subject: [Python-ideas] Approximately equal operator
Message-ID: 

I have a suggestion to make inspired by the current discussion about
trigonometric functions in degrees, and the desire to have them show
"exact" values in some special cases.

I suggest that it would be useful to have operators for performing
**approximate** comparisons. I believe that such operators would be
useful both for students learning Python and for experts doing
numerical computations.

For discussion purposes, I will use ~= as representing an operator
testing for approximate equality. (I will show some sample usage
below.)

When teaching students, the availability of both == and ~= would give
the opportunity to discuss the fact that numerical computations using
floats are approximate, while having the possibility to write code that
is readable using the approximate equality operator instead of the
strict equality operator when needed.

Before I started writing this email, I had noticed that numpy includes
at least two functions (isclose and allclose) whose purpose is to
perform such approximate comparisons. [1]

I had completely missed the fact that Python added the function
isclose() in the math module in version 3.5, as described in PEP 485
[0]. I would suggest that the possibility of using operators instead of
explicit function calls could make programs easier to write and read,
both for beginners and experts alike. I note that PEP 485 makes no
mention of introducing operators as a possibility.

In addition to an approximate equality operator, it would be natural to
include two additional operators, greater than or approximately equal,
and lesser than or approximately equal. These could be written
respectively as >~= and <~=.

I did consider using some relevant utf-8 symbol instead of a
combination of ascii characters, but I think that it would be easier to
write programs if one does not require characters not found on any
normal keyboard.

Some time ago, I created a toy module [2] to enable easy experiments
with some syntactic additions to Python. Using this module, I created a
very poor and limited implementation that shows what using these
proposed operators might look like [3] ...
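For comparison, the same checks can already be spelled with
math.isclose from PEP 485 (a short sketch; rel_tol here plays the role
of the tolerance settings described below):

    from math import isclose

    print(isclose(0.1 + 0.2, 0.3))                  # True
    print(isclose(2 ** 0.5, 1.414))                 # False
    print(isclose(2 ** 0.5, 1.414, rel_tol=0.001))  # True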
My current implementation is slightly different from either Numpy's or
Python's math.isclose() function. [This may no longer be the case soon,
as I plan to change it to use Python's version instead.] As is the case
for isclose(), there are two parameters to be set to determine if the
values are close enough to be considered approximately equal: an
absolute tolerance and a relative one. Given that one cannot pass
parameters to an operator, my implementation includes a function which
can change the values of these parameters for a given session. If these
new operators were to be added to Python, such a function would either
have to be added as a builtin or as a special function in the math
module.

Here's a sample session for demonstration purposes...

$ python -m experimental
experimental console version 0.9.5. [Python version: 3.6.1]

~~> 0.1 + 0.2
0.30000000000000004
~~> 0.1 + 0.2 == 0.3
False
~~> from __experimental__ import approx
~~> 0.1 + 0.2 ~= 0.3    # use approximately equal operator
True
~~> 0.1 + 0.2 <~= 0.3
True
~~> 0.1 + 0.2 >~= 0.3
True
~~> 2 ** 0.5
1.4142135623730951
~~> 2**0.5 ~= 1.414
False
~~> set_tols(0.001, 0.001)
~~> 2**0.5 ~= 1.414
True

André Roberge

[0] https://www.python.org/dev/peps/pep-0485/

[1] See for example
https://docs.scipy.org/doc/numpy-1.14.0/reference/generated/numpy.isclose.html

[2] https://github.com/aroberge/experimental

[3] https://github.com/aroberge/experimental/blob/master/experimental/transformers/readme.md#approxpy
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From mike at selik.org  Fri Jun 15 13:39:32 2018
From: mike at selik.org (Michael Selik)
Date: Fri, 15 Jun 2018 10:39:32 -0700
Subject: [Python-ideas] Operator for inserting an element into a list
In-Reply-To: 
References: <5B2053FB.7020500@canterbury.ac.nz>
 <5B21AF9D.2080301@canterbury.ac.nz>
Message-ID: 

On Fri, Jun 15, 2018 at 10:25 AM Mikhail V wrote:

> It's just very uncommon to see standalone statements like: x.method()
>

Python has many such mutation methods. It sounds like you're judging the
frequency of code patterns across all languages instead of just Python.
Even then, I don't think that's true. All OO languages that come to mind
have that pattern frequently.

> As for x += 1, that is an implementation detail - the historical idea
> of such operators was mutating, so at least visually it's not like a
> returning expression.
>

Incorrect. The += operator was meant as an alias for ``x = x + 1``. The
fact that it mutates a list is somewhat of a surprise. Some other
languages make no distinction between mutation and reassignment. Perhaps
you're thinking of one of those other languages.

> And I am not sure about the x.method() form - was it meant to hint to
> the user about anything? It seemed so to me when I started to learn
> Python, but it's not the case.
>

Yes, it returns None to emphasize that it's a mutation. This is different
from the so-called "fluent" design pattern where all mutation methods also
return the original object, causing confusion about whether the return
value is a copy or not.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From solipsis at pitrou.net  Fri Jun 15 13:53:41 2018
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 15 Jun 2018 19:53:41 +0200
Subject: [Python-ideas] Approximately equal operator
References: 
Message-ID: <20180615195341.68a36912@fsol>

On Fri, 15 Jun 2018 14:38:22 -0300
Andre Roberge wrote:
>
> Here's a sample session for demonstration purposes...
>
> $ python -m experimental
> experimental console version 0.9.5. [Python version: 3.6.1]
>
> ~~> 0.1 + 0.2
> 0.30000000000000004
> ~~> 0.1 + 0.2 == 0.3
> False
> ~~> from __experimental__ import approx
> ~~> 0.1 + 0.2 ~= 0.3    # use approximately equal operator
> True
> ~~> 0.1 + 0.2 <~= 0.3
> True
> ~~> 0.1 + 0.2 >~= 0.3
> True
> ~~> 2 ** 0.5
> 1.4142135623730951
> ~~> 2**0.5 ~= 1.414
> False
> ~~> set_tols(0.001, 0.001)
> ~~> 2**0.5 ~= 1.414
> True

On the one hand, this matches the corresponding math notation quite
pleasantly. On the other hand, it doesn't seem so useful that it
deserves to be a builtin operator. I'm also not sure we want to
encourage its use for anything other than experimenting at the prompt
and writing unit tests.

Being able to set a global tolerance setting is an anti-pattern IMHO,
regardless of the operator proposal.

Regards

Antoine.

From Richard at Damon-Family.org  Fri Jun 15 13:55:40 2018
From: Richard at Damon-Family.org (Richard Damon)
Date: Fri, 15 Jun 2018 13:55:40 -0400
Subject: [Python-ideas] Operator for inserting an element into a list
In-Reply-To: 
References: <5B21AF9D.2080301@canterbury.ac.nz>
Message-ID: <88a6fa85-56ff-99b5-3a9e-e44457c46ea5@Damon-Family.org>

On 6/15/18 1:25 PM, Mikhail V wrote:
> [snip - the full exchange is quoted above]
>
> It's just very uncommon to see standalone statements like:
> x.method()
>
> For me it became a habit to think that such a statement lacks the
> left-hand part and =.

For me, if I see foo.bar(), my assumption is that bar() is likely going
to mutate foo, especially if it doesn't produce a return value; that is
the natural way to define something that mutates something.
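For instance, a tiny sketch of that reading (the names are made up):

    items = []
    items.append(3)       # bare statement, return value ignored: reads as a mutation
    items.extend([4, 5])  # likewise: called purely for its side effect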
If it returns something, it might be just an accessor, or it might be a
bit of both - easily a mutation that returns a status about the
success/failure of the mutation.

foo = bar would seem to never be a 'mutation', but always a rebinding
of foo, even if 'bar' is an expression using foo.

foo += bar conceptually reads as foo = foo + bar, so the first
impression is that it isn't going to mutate but rebind; yet it is
possible that it does an in-place rebind, so to me THAT is the case
where I need to dig into the rules to see what it really does. This
means that at a quick glance, given:

list1 = list2
list1 += value

it is unclear if list2 would have been changed by the +=, while a
statement like list1.append(value) is more clearly an in-place
mutation, so other names bound to the same object are expected to see
the changes.

--
Richard Damon

From Richard at Damon-Family.org  Fri Jun 15 14:06:26 2018
From: Richard at Damon-Family.org (Richard Damon)
Date: Fri, 15 Jun 2018 14:06:26 -0400
Subject: [Python-ideas] Approximately equal operator
In-Reply-To: 
References: 
Message-ID: <28f2d18c-6f0b-b602-2f8a-5bbed1d70d58@Damon-Family.org>

On 6/15/18 1:38 PM, Andre Roberge wrote:
> I have a suggestion to make inspired by the current discussion about
> trigonometric functions in degrees, and the desire to have them show
> "exact" values in some special cases.
>
> I suggest that it would be useful to have operators for performing
> **approximate** comparisons.
> [snip - the full proposal is quoted above]

The big issue with 'approximately equals' is that the definition of it
is hard to come up with. This is especially true for small values. In
particular, how small does a number need to be to be ~= 0? You also get
inconsistencies: you can have a ~= b, and b ~= c, but not a ~= c. You
can have a + b ~= c but not a ~= c - b. Should 0.0000001 ~= 0.00000011?

To properly handle this sort of thing you need that tolerance
parameter, but the values you give should be carefully thought about
for THAT comparison; having a 'global' value is apt to lead to laziness
and wrong answers.

--
Richard Damon

From andre.roberge at gmail.com  Fri Jun 15 18:56:43 2018
From: andre.roberge at gmail.com (Andre Roberge)
Date: Fri, 15 Jun 2018 19:56:43 -0300
Subject: [Python-ideas] Approximately equal operator
In-Reply-To: <28f2d18c-6f0b-b602-2f8a-5bbed1d70d58@Damon-Family.org>
References: 
 <28f2d18c-6f0b-b602-2f8a-5bbed1d70d58@Damon-Family.org>
Message-ID: 

On Fri, Jun 15, 2018 at 3:12 PM Richard Damon wrote:

> The big issue with 'approximately equals' is that the definition of it
> is hard to come up with.
> [snip]
> To properly handle this sort of thing you need that tolerance
> parameter, but the values you give should be carefully thought about
> for THAT comparison; having a 'global' value is apt to lead to
> laziness and wrong answers.

Both you and Antoine Pitrou made an excellent point about relying on
non-explicitly defined values for the tolerance parameters. I've made
some changes to my proof-of-concept model [1] so that it requires the
user to define the two tolerance parameters (relative and absolute) in
a way that they are accessible within the scope where the operators are
used. I also use the existing isclose() function in the math module
instead of my previous version - and use the same names for the two
parameters. Here's a sample session:

> python -m experimental
experimental console version 0.9.6. [Python version: 3.6.1]

~~> from __experimental__ import approx
~~> 0.1 + 0.2
0.30000000000000004
~~> 0.1 + 0.2 == 0.3
False
~~> # Attempt to use approximate comparison without defining tolerances
~~> 0.1 + 0.2 ~= 0.3
Traceback (most recent call last):
  File "<console>", line 1, in <module>
NameError: name 'rel_tol' is not defined
~~> rel_tol = abs_tol = 1e-8
~~> 0.1 + 0.2 ~= 0.3
True
~~> 2**0.5 ~= 1.414
False
~~> abs_tol = 0.001
~~> 2**0.5 ~= 1.414
True

I would predict that most of the usage would be in some if or while
statements, rather than simple comparisons like those illustrated
above. With this change, would using these operators (as "syntactic
sugar") be a worthwhile addition for:

* people doing heavy numerical work and wanting code as readable as
  possible
* teaching mostly beginners about finite precision for floating point
  arithmetic
* people wishing to have trigonometric functions with arguments in
  degrees, as in a current discussion on this forum?

[1] available on github as mentioned previously, or using
"pip install experimental"

André Roberge
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From mikhailwas at gmail.com  Fri Jun 15 19:42:00 2018
From: mikhailwas at gmail.com (Mikhail V)
Date: Sat, 16 Jun 2018 02:42:00 +0300
Subject: [Python-ideas] Operator for inserting an element into a list
In-Reply-To: <88a6fa85-56ff-99b5-3a9e-e44457c46ea5@Damon-Family.org>
References: <5B21AF9D.2080301@canterbury.ac.nz>
 <88a6fa85-56ff-99b5-3a9e-e44457c46ea5@Damon-Family.org>
Message-ID: 

Now I have a slightly different idea. How about special-casing this as
a shortcut for append:

L[] = item

Namely, just use the fact that an empty slice is a SyntaxError now.

I understand this is a totally different approach from operator
overloading, and maybe hard to implement, but I feel like it looks
really appealing. And it is quite intuitive, imo. For me the syntax
reads like: "add a new empty element, and this element will be 'item'".

No?

From rosuav at gmail.com  Fri Jun 15 20:02:12 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Sat, 16 Jun 2018 10:02:12 +1000
Subject: [Python-ideas] Operator for inserting an element into a list
In-Reply-To: 
References: <5B21AF9D.2080301@canterbury.ac.nz>
 <88a6fa85-56ff-99b5-3a9e-e44457c46ea5@Damon-Family.org>
Message-ID: 

On Sat, Jun 16, 2018 at 9:42 AM, Mikhail V wrote:
> Now I have a slightly different idea. How about special-casing this
> as a shortcut for append:
>
> L[] = item
>
> Namely, just use the fact that an empty slice is a SyntaxError now.
>
> I understand this is a totally different approach from operator
> overloading, and maybe hard to implement, but I feel like it looks
> really appealing. And it is quite intuitive, imo. For me the syntax
> reads like: "add a new empty element, and this element will be 'item'".
>
> No?

Yes, if this were PHP.

I still haven't seen any compelling argument against the append
method. -1 on introducing a new way to spell append.

ChrisA

From mikhailwas at gmail.com  Fri Jun 15 20:23:49 2018
From: mikhailwas at gmail.com (Mikhail V)
Date: Sat, 16 Jun 2018 03:23:49 +0300
Subject: [Python-ideas] Operator for inserting an element into a list
In-Reply-To: 
References: <5B21AF9D.2080301@canterbury.ac.nz>
 <88a6fa85-56ff-99b5-3a9e-e44457c46ea5@Damon-Family.org>
Message-ID: 

On Sat, Jun 16, 2018 at 3:02 AM, Chris Angelico wrote:
> On Sat, Jun 16, 2018 at 9:42 AM, Mikhail V wrote:
>> Now I have a slightly different idea. How about special-casing this
>> as a shortcut for append:
>>
>> L[] = item
>> [snip]
>
> Yes, if this were PHP.

Is it like that in PHP?

> I still haven't seen any compelling argument against the append
> method. -1 on introducing a new way to spell append.

For me, there is just nothing against the append() method. But I see
from various posts on the web and SO that people just want to spell it
compactly, and keep the 'item' part clean of brackets. And that kind
of makes sense.

From mike at selik.org  Fri Jun 15 20:26:46 2018
From: mike at selik.org (Michael Selik)
Date: Fri, 15 Jun 2018 17:26:46 -0700
Subject: [Python-ideas] Operator for inserting an element into a list
In-Reply-To: 
References: <5B21AF9D.2080301@canterbury.ac.nz>
 <88a6fa85-56ff-99b5-3a9e-e44457c46ea5@Damon-Family.org>
Message-ID: 

On Fri, Jun 15, 2018, 5:24 PM Mikhail V wrote:

> there is just nothing against the append() method.
>

Then why break the Zen: there should be only one obvious way?

> But I see from various posts on the web and SO that people just want
> to spell it compactly, and keep the 'item' part clean of brackets.
>

If you're going to cite things, please link to them.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From njs at pobox.com  Fri Jun 15 20:31:57 2018
From: njs at pobox.com (Nathaniel Smith)
Date: Fri, 15 Jun 2018 17:31:57 -0700
Subject: [Python-ideas] Approximately equal operator
In-Reply-To: 
References: 
 <28f2d18c-6f0b-b602-2f8a-5bbed1d70d58@Damon-Family.org>
Message-ID: 

On Fri, Jun 15, 2018 at 3:56 PM, Andre Roberge wrote:
> * people doing heavy numerical work and wanting code as readable as possible

IME serious numerical work doesn't use approximate equality tests at
all, except in test assertions.
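For instance, a minimal sketch of that pattern using the stdlib
function from PEP 485 (the computed value here is just a stand-in for
a real numerical result):

    import math

    def test_square_root():
        computed = 2 ** 0.5
        # tolerances chosen explicitly for this one comparison
        assert math.isclose(computed, 1.41421356237, rel_tol=1e-9, abs_tol=0.0)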
> * teaching mostly beginners about finite precision for floating point
> arithmetic

Given that approximate equality tests are almost never the right
solution, I would be worried that emphasizing them to beginners would
send them down the wrong path. This is already a common source of
confusion and a trap for non-experts.

> * people wishing to have trigonometric functions with arguments in
> degrees, as in a current discussion on this forum?

AFAICT approximate equality checks aren't really useful for that, no.
(I also don't understand why people in that argument are so worried
about exact precision for 90° and 30° when it's impossible for all the
other angles.)

Python is *very* stingy with adding new operators; IIRC only 3 have
been added over the last ~30 years (**, //, @). I don't think ~= is
going to make it.

-n

--
Nathaniel J. Smith -- https://vorpus.org

From mikhailwas at gmail.com  Fri Jun 15 20:40:15 2018
From: mikhailwas at gmail.com (Mikhail V)
Date: Sat, 16 Jun 2018 03:40:15 +0300
Subject: [Python-ideas] Operator for inserting an element into a list
In-Reply-To: 
References: <5B21AF9D.2080301@canterbury.ac.nz>
 <88a6fa85-56ff-99b5-3a9e-e44457c46ea5@Damon-Family.org>
Message-ID: 

On Sat, Jun 16, 2018 at 3:26 AM, Michael Selik wrote:
>
> On Fri, Jun 15, 2018, 5:24 PM Mikhail V wrote:
>>
>> there is just nothing against the append() method.
>
> Then why break the Zen: there should be only one obvious way?

I think that question could be applied to 99% of proposals.

>> But I see from various posts on the web and SO that people just want
>> to spell it compactly, and keep the 'item' part clean of brackets.
>
> If you're going to cite things, please link to them.

Want to force me to work? :)

I already made a pair of links in previous posts - not directly stating
concrete wishes, but they show the picture approximately.

Something more concrete, maybe:

https://stackoverflow.com/a/3653314/4157407
https://stackoverflow.com/q/13818992/4157407

From greg.ewing at canterbury.ac.nz  Fri Jun 15 20:42:21 2018
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 16 Jun 2018 12:42:21 +1200
Subject: [Python-ideas] Operator for inserting an element into a list
In-Reply-To: 
References: <5B21AF9D.2080301@canterbury.ac.nz>
Message-ID: <5B245CED.2010605@canterbury.ac.nz>

Mikhail V wrote:
> It's just very uncommon to see standalone statements like:
> x.method()
>
> For me it became a habit to think that such a statement lacks the
> left-hand part and =.

You must be looking at a very limited and non-typical corpus of Python
code. Mutating method calls are extremely common in most Python code
I've seen.

You seem to be trying to reason about Python as though it were intended
to be used in a functional style, but it's not. Making guesses about it
based on how something would be done in a functional language will get
you nowhere.

> I am not sure about the x.method() form - was it meant to hint to the
> user about anything? It seemed so to me when I started to learn
> Python, but it's not the case.

The fact that something is a method does not, and was never intended
to, imply anything about whether it is mutating.

However, there *is* a fairly easy way to tell, most of the time, when
you're *reading* code. There's a convention that mutating methods don't
return anything other than None, so mutating method calls reveal
themselves by the fact that the result is not used.

Conversely, a method call whose result *is* used is most likely
non-mutating.
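The list API itself is a handy illustration of that convention:

    >>> words = ["pear", "apple", "fig"]
    >>> words.sort()        # mutator: returns None by convention
    >>> words
    ['apple', 'fig', 'pear']
    >>> sorted(words)       # non-mutating counterpart: the result is used
    ['apple', 'fig', 'pear']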
(It's possible that it could both return a value and have a side
effect, but that's frowned upon, and you'll pretty much never see it in
any builtin or stdlib API.)

As for writing code, you just have to rely on memory and documentation,
just like you do with most aspects of any language and its libraries.

From mike at selik.org  Fri Jun 15 20:47:22 2018
From: mike at selik.org (Michael Selik)
Date: Fri, 15 Jun 2018 17:47:22 -0700
Subject: [Python-ideas] Operator for inserting an element into a list
In-Reply-To: 
References: <5B21AF9D.2080301@canterbury.ac.nz>
 <88a6fa85-56ff-99b5-3a9e-e44457c46ea5@Damon-Family.org>
Message-ID: 

One of those links was discussing extend, not append. The other wanted
a repeated append, best solved by a list comprehension. Neither makes
a good case for your suggestion.

On Fri, Jun 15, 2018, 5:40 PM Mikhail V wrote:
> Something more concrete, maybe:
>
> https://stackoverflow.com/a/3653314/4157407
> https://stackoverflow.com/q/13818992/4157407
> [snip]
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From greg.ewing at canterbury.ac.nz  Fri Jun 15 21:06:45 2018
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 16 Jun 2018 13:06:45 +1200
Subject: [Python-ideas] Operator for inserting an element into a list
In-Reply-To: 
References: <5B21AF9D.2080301@canterbury.ac.nz>
Message-ID: <5B2462A5.8020907@canterbury.ac.nz>

Michael Selik wrote:
> The += operator was meant as an alias for ``x = x + 1``. The
> fact that it mutates a list is somewhat of a surprise.

That's very much a matter of opinion. For every person who thinks this
is a surprise, you can find another who thinks it's obvious that +=
should mutate a list, and is surprised by the fact that it works on
immutable types at all.

> Some other languages make no distinction between mutation and
> reassignment. Perhaps you're thinking of one of those other languages.

Most languages, actually. I'm not aware of any other language with a
+= operator that has this dual interpretation of its meaning -- which
is probably why almost everyone gets surprised by at least one of its
meanings. :-)

Maybe we should call it the "Spanish Inquisition operator".
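For the record, both meanings side by side in one (hypothetical)
session:

    >>> t = (1, 2)
    >>> u = t
    >>> t += (3,)     # immutable operand: rebinds t, u is untouched
    >>> u
    (1, 2)
    >>> a = [1, 2]
    >>> b = a
    >>> a += [3]      # mutable operand: mutates in place, b sees the change
    >>> b
    [1, 2, 3]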
--
Greg

From tjreedy at udel.edu  Fri Jun 15 21:15:19 2018
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 15 Jun 2018 21:15:19 -0400
Subject: [Python-ideas] Operator for inserting an element into a list
In-Reply-To: <5B245CED.2010605@canterbury.ac.nz>
References: <5B21AF9D.2080301@canterbury.ac.nz>
 <5B245CED.2010605@canterbury.ac.nz>
Message-ID: 

On 6/15/2018 8:42 PM, Greg Ewing wrote:
> [snip]
> Conversely, a method call whose result *is* used is most likely
> non-mutating. (It's possible that it could both return a value and
> have a side effect, but that's frowned upon,

Except for set/list.pop and dict.popitem.

> and you'll pretty much never see it in any builtin or stdlib API.)

--
Terry Jan Reedy

From mikhailwas at gmail.com  Fri Jun 15 21:18:06 2018
From: mikhailwas at gmail.com (Mikhail V)
Date: Sat, 16 Jun 2018 04:18:06 +0300
Subject: [Python-ideas] Operator for inserting an element into a list
In-Reply-To: 
References: <5B21AF9D.2080301@canterbury.ac.nz>
 <88a6fa85-56ff-99b5-3a9e-e44457c46ea5@Damon-Family.org>
Message-ID: 

On Sat, Jun 16, 2018 at 3:47 AM, Michael Selik wrote:

> One of those links was discussing extend, not append.
Though append works just fine, I feel it is less clear than: mylist[i] = 2*math.pi*radius*math.cos(phi[i]) """" So your claim this ^ is not relevant?! Seriously I am starting to get tired of that style of conversation. I provided you links - you are not pleased again. > > On Fri, Jun 15, 2018, 5:40 PM Mikhail V wrote: >> >> On Sat, Jun 16, 2018 at 3:26 AM, Michael Selik wrote: >> > >> > >> > On Fri, Jun 15, 2018, 5:24 PM Mikhail V wrote: >> >> >> >> there is just nothing against append() method. >> > >> > >> > Then why break the Zen: there should be only one obvious way? >> >> I think the question could be applied to 99% proposals >> >> >> But I see from various posts in the web and SO - people just want to >> >> spell >> >> it compact, and keep the 'item' part clean of brackets. >> > >> > >> > If you're going to cite things, please link to them. >> >> Want to force me to work? :) >> >> I made pair of links already in previous posts, not directly >> telling concrete wishes, but shows approximately the picture. >> >> >> Something more concrete maybe: >> >> https://stackoverflow.com/a/3653314/4157407 >> https://stackoverflow.com/q/13818992/4157407 From greg.ewing at canterbury.ac.nz Fri Jun 15 21:30:03 2018 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 16 Jun 2018 13:30:03 +1200 Subject: [Python-ideas] Operator for inserting an element into a list In-Reply-To: References: <88a6fa85-56ff-99b5-3a9e-e44457c46ea5@Damon-Family.org> Message-ID: <5B24681B.7060306@canterbury.ac.nz> Mikhail V wrote: > But I see from various posts in the web and SO - people > just want to spell it compact, and keep the 'item' part clean of > brackets. Where have you seen these posts? -- Greg From mike at selik.org Fri Jun 15 21:38:36 2018 From: mike at selik.org (Michael Selik) Date: Fri, 15 Jun 2018 18:38:36 -0700 Subject: [Python-ideas] Operator for inserting an element into a list In-Reply-To: References: <5B21AF9D.2080301@canterbury.ac.nz> <88a6fa85-56ff-99b5-3a9e-e44457c46ea5@Damon-Family.org> Message-ID: On Fri, Jun 15, 2018, 6:18 PM Mikhail V wrote: > On Sat, Jun 16, 2018 at 3:47 AM, Michael Selik wrote: > > > One of those links was discussing extend, not append. > > Yes and so what? ... What is different with append? Luckily for extend, it's similar to the "obvious" semantics of ``a += b`` which is ``a = a + b``. Unfortunately for append, there's nothing quite like it among the operators. > The other wanted a > > repeated append, best solved by a list comprehension. > > I think yuo should reread that one more thoroughly - I'll even paste here > the text: > Though append works just fine, I feel it is less clear than: mylist[i] = 2*math.pi*radius*math.cos(phi[i]) > > So your claim this ^ is not relevant?! > I read that sentence. No, it was not relevant, because the best answer was to teach that person about list comprehensions, not to offer a new syntax. Seriously I am starting to get tired of that style of conversation. > To be honest, I'm getting a little tired myself. I am trying to politely suggest ways to strengthen your proposal even though I disagree with it. I provided you links - you are not pleased again. > Are you aware that modifying the language is difficult, time consuming, and the folks that do it aren't paid for their work? Further, any change is likely to increase the maintenance burden on these same volunteers. Even worse, tens of thousands of teachers will need to add more time to their lesson plans to explain new features. 
Book authors will need to issue errata and revised versions. If you add that all together, in a sense, changing the parser to expand the syntax would cost millions of dollars. Is that worth the change? > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cs at cskk.id.au Fri Jun 15 21:44:22 2018 From: cs at cskk.id.au (Cameron Simpson) Date: Sat, 16 Jun 2018 11:44:22 +1000 Subject: [Python-ideas] Operator for inserting an element into a list In-Reply-To: References: Message-ID: <20180616014422.GA37699@cskk.homeip.net> On 16Jun2018 02:42, Mikhail V wrote: >Now I have slightly different idea. How is about special-casing of this >as a shortcut for append: > >L[] = item > >Namely just use the fact that empty slice is SyntaxError now. Now we're just making typing errors into working code. Also, that isn't an empty slice. That's a _missing_ slice. An empty slice has zero length. While genuinely new syntax needs to land in such a gap (because otherwise it will break working code), new syntax needs a much highly value than new meanings for existing operators. Some things _should_ be syntax errors. Particularly things which may be typing errors. Suppose I'd meant to type: L[0] = item Silent breakage, requiring runtime debugging. >I understand this is totally different approach than operator >overloading and maybe >hard to implement, but I feel like it looks really appealing. >And it is quite intuitive imo. For me the syntax reads like: >"add new empty element and this element will be "item". The term "new empty element" is a nonsense term to me. If you mean "replace an empty slice at the end of the list with a new element", that can already be written: L[len(L):len(L)]=[9] Cumbersome, I accept. But I've got a .append method. Cheers, Cameron Simpson From brianvanderburg2 at aim.com Fri Jun 15 23:54:42 2018 From: brianvanderburg2 at aim.com (Brian Allen Vanderburg II) Date: Fri, 15 Jun 2018 23:54:42 -0400 Subject: [Python-ideas] Python Decorator Improvement Idea Message-ID: Just a small idea that could possibly be useful for python decorators. An idea I had is that it could be possible for a decorator function to declare a parameter which, when the function is called as a decorator, the runtime can fill in various information in the parameters for the decorator to use.? Some of the information would be available in all contexts, while other information may only be available in certain contexts.The parameter's value cannot be explicitly specified, defaults to Null except when called as a decorator, and can only be specified once in the function's parameter list. Called as a decorator implies: @decorator def decorated(): ??? pass These are not called as a decorator.? The first item's return value would be called as a decorator, thus if it has a decorator parameter, it would be used. @decorator(...) def decorated(): ??? pass def decorated(): ??? pass decorated = decorator(decorated) Any declared callable (function, class instance with __call__, etc) can have an explicitly declared parameter called a decorator information parameter. The syntax could be as follows.? "info" is just a name I've chosen and could be any legal name. def decorator(..., @info): pass def wrapper(..., @info): def decorator(obj, @info): ... return obj if info: called directly as decorator, do something else: return decorator Rules: 1. It is not possible for the parameter's value to be directly specified. You can't call fn(info=...) 2. 
The parameters value is Null except in the cases where it is invoked (the callable called a a decorator).? If used in a partial, the decorator parameter would be Null. etc. Information that could be contained in the parameters for all contexts: Variable name Module object declared in Module globals (useful for @export/@public style decorators) Etc Using the decorator in a class context, pass the class object.? While the class object hasn't been fully created yet, this could allow accessing attributes of the class (like a registry or such) def decorator(fn, @info): ??? if hasattr(info, "class_obj"): ??? ??? registry = info.class_obj.__dict__.setdefault("_registry", []) ??? registry.append(fn) return fn class MyClass(Base): # Add "method" to the MyClass._registry list ??? @decorator ??? def method(...): ??? ??? pass def call_all(cls): for method in self._registry: method(self) This could also make it possible to use decorators on assignments. Information could be passed to the decorator by the runtime: ??? script_vars = {} ??? def expose(obj, @info): ??? ??? script_vars[info.name] = obj ??? ??? return obj ??? def load(filename): ??? ??? # load script and compile ??? ??? exec(code, script_vars)? ??? @expose ??? def script_function(...): ??? ??? pass ??? # This will call the decorator passing in 200 as the object, as well as info.name as the variable being assigned. ??? @expose ??? SCRIPT_CONSTANT = 200 ??? # If stacked, only the first would be used as info.name, in this case SCRIPT_CONSTANT2 ??? @expose ??? SCRIPT_CONSTANT2 = A_VAR = 300 def default(obj, @info): if info.name in info.ctx.vars: return info.ctx.vars[info.name] if info.name in info.globals: return info.globals[info.name] return obj @default X = 12 @default X = 34 print(X) # print 12 since during the second call, X existed and it's value was returned instead of 34, and was assigned to X The two potential benefits I see from this are: 1. The runtime can pass certain information to the decorator, some information in all contexts, and some information in specific contexts such as when decorating a class member, decorating a function defined within another function, etc 2. It would be possible to decorate values directly, as the runtime can pass relevant information such as the variables name This was just an idea I had that may have some use. Thanks, Brian Vanderburg II -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: OpenPGP digital signature URL: From steve at pearwood.info Sat Jun 16 01:22:10 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 16 Jun 2018 15:22:10 +1000 Subject: [Python-ideas] Python Decorator Improvement Idea In-Reply-To: References: Message-ID: <20180616052209.GC14437@ando.pearwood.info> On Fri, Jun 15, 2018 at 11:54:42PM -0400, Brian Allen Vanderburg II via Python-ideas wrote: > An idea I had is that it could be possible for a decorator function to > declare a parameter which, when the function is called as a decorator, > the runtime can fill in various information in the parameters for the > decorator to use. We can already do this, by writing a decorator factory: @decorator(any parameters you care to pass) def spam(): ... "Explicit is better than implicit" -- it is better to explicitly pass the parameters you want, than to hope that "the runtime" (do you mean the interpreter?) will guess which parameters you need. 
> Some of the information would be available in all > contexts, while other information may only be available in certain > contexts.The parameter's value cannot be explicitly specified, defaults > to Null except when called as a decorator, and can only be specified > once in the function's parameter list. Do you mean None? Why do you think it is a good idea to have the same function, the decorator, behave differently when called using decorator syntax and standard function call syntax? To me, that sounds like a terrible idea. What advantage do you see? [...] > Rules: > > 1. It is not possible for the parameter's value to be directly > specified. You can't call fn(info=...) That sounds like a recipe for confusion to me. How would you explain this to a beginner? Aside from the confusion that something that looks like a parameter isn't an actual parameter, but something magical, it is also very limiting. It makes it more difficult to use the decorator, since now it only works using @ syntax. > 2. The parameters value is Null except in the cases where it is invoked > (the callable called a a decorator).? If used in a partial, the > decorator parameter would be Null. etc. You keep saying Null. What's Null? > Information that could be contained in the parameters for all contexts: > > Variable name > Module object declared in > Module globals (useful for @export/@public style decorators) > Etc The variable name is just the name of the function or class, the first parameter received by the decorator. You can get it with func.__name__. The module globals is already available in globals(). You can either pass it directly as an argument to the decorator, or the decorator can call it itself. (Assuming the decorator is used in the same module it is defined in.) If the decorator is in the same module as the globals you want to access, the decorator can just call globals(). Or use the global keyword. If the decorator is contained in another module, the caller can pass the global namespace as an argument to the decorator: @decorate(globals()) def func(): ... Not the neatest solution in the world, but it works now. > Using the decorator in a class context, pass the class object. The decorator already receives the class object as the first parameter. Why pass it again? > While the class object hasn't been fully created yet, What makes you say that? > this could allow > accessing attributes of the class (like a registry or such) > > def decorator(fn, @info): > ??? if hasattr(info, "class_obj"): > ??? ??? registry = info.class_obj.__dict__.setdefault("_registry", []) > > ??? registry.append(fn) > return fn Writing "hasattr(info, whatever)" is an anti-pattern. By the way, the public interface for accessing objects' __dict__ is to call the vars() function: vars(info.class_obj).set_default(...) > This could also make it possible to use decorators on assignments. We already can: result = decorator(obj) is equivalent to: @decorator def obj(): ... or @decorator class obj: ... except that we can use the decorator on anything we like, not just a function or class. [...] > ??? # This will call the decorator passing in 200 as the object, as > # well as info.name as the variable being assigned. > @expose > SCRIPT_CONSTANT = 200 That would require a change to syntax, and would have to be a separate discussion. If there were a way to get the left hand side of assignments as a parameter, that feature would be *far* to useful to waste on just decorators. 
For instance, we could finally do something about:

name = namedtuple("name", fields)

> The two potential benefits I see from this are:
>
> 1. The runtime can pass certain information to the decorator, some
> information in all contexts, and some information in specific contexts
> such as when decorating a class member, decorating a function defined
> within another function, etc.
>
> 2. It would be possible to decorate values directly, as the runtime
> can pass relevant information such as the variable's name.

No, that would require a second, independent change. We could, if
desired, allow decorator syntax like this:

@decorate
value = 1

but it seems pretty pointless since that's the same as:

value = decorate(1)

The reason we have @decorator syntax is not to be a second way to call
functions, using two lines instead of a single expression, but to avoid
having to repeat the name of the function three times:

# Repeat the function name three times:
def function():
    ...
function = decorate(function)

# Versus only once:
@decorate
def function():
    ...

--
Steve

From koyukukan at gmail.com  Sat Jun 16 01:49:37 2018
From: koyukukan at gmail.com (Rin Arakaki)
Date: Fri, 15 Jun 2018 22:49:37 -0700 (PDT)
Subject: [Python-ideas] Loosen 'as' assignment
Message-ID: <514e7505-abf0-4410-af36-a971f2e5d91c@googlegroups.com>

Hi,
I'm wondering if it's possible and consistent to loosen 'as'
assignment, for example:

>>> import psycopg2 as pg
>>> import psycopg2.extensions as pg.ex

You can't currently assign to an attribute in an 'as' statement - are
there some reasons for that? To be honest, I'll be satisfied if the
statement above becomes valid, but I'm also interested in the general
design decisions about 'as' functionality. I mean, it could be
applicable to any expression that can be on the left side of '=', such
as a 'list[n]' one, and also to statements other than 'import', such
as 'with'.

Thanks,

From ncoghlan at gmail.com  Sat Jun 16 04:04:38 2018
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 16 Jun 2018 18:04:38 +1000
Subject: [Python-ideas] Loosen 'as' assignment
In-Reply-To: <514e7505-abf0-4410-af36-a971f2e5d91c@googlegroups.com>
References: <514e7505-abf0-4410-af36-a971f2e5d91c@googlegroups.com>
Message-ID: 

On 16 June 2018 at 15:49, Rin Arakaki wrote:

> I'm wondering if it's possible and consistent to loosen 'as'
> assignment, for example:
>
> >>> import psycopg2 as pg
> >>> import psycopg2.extensions as pg.ex
>
> [snip]

This is essentially monkeypatching the psycopg2 module to alias the
"extensions" submodule as the "ex" submodule. You can already do that
today as:

>>> import psycopg2 as pg
>>> import psycopg2.extensions
>>> pg.ex = pg.extensions

Monkeypatching other modules at runtime is a questionable enough
practice that we're unlikely to add syntax that actively encourages it.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From steve at pearwood.info  Sat Jun 16 06:51:30 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 16 Jun 2018 20:51:30 +1000
Subject: [Python-ideas] Approximately equal operator
In-Reply-To: 
References: <28f2d18c-6f0b-b602-2f8a-5bbed1d70d58@Damon-Family.org>
Message-ID: <20180616105130.GD14437@ando.pearwood.info>

On Fri, Jun 15, 2018 at 05:31:57PM -0700, Nathaniel Smith wrote:
> On Fri, Jun 15, 2018 at 3:56 PM, Andre Roberge wrote:
> > * people doing heavy numerical work and wanting code as readable as possible
>
> IME serious numerical work doesn't use approximate equality tests at
> all, except in test assertions.

I wouldn't go that far. It is quite common to write abs(x - y) < e or
similar, to see whether x and y are within a certain distance of each
other. APL even made their equals operator an "approximate equality"
operator. So I don't think it is completely crazy to want an operator
for this. But I think that falls short of "a good idea for Python".

> > * teaching mostly beginners about finite precision for floating point
> > arithmetic
>
> Given that approximate equality tests are almost never the right
> solution, I would be worried that emphasizing them to beginners would
> send them down the wrong path. This is already a common source of
> confusion and a trap for non-experts.

Certainly it is an area that is rife with superstition, like the idea
that you should "never" compare two floats for equality, or folklore
about what tolerance you should use.

https://randomascii.wordpress.com/2012/02/25/comparing-floating-point-numbers-2012-edition/

> > * people wishing to have trigonometric functions with arguments in
> > degrees, as in a current discussion on this forum?
>
> AFAICT approximate equality checks aren't really useful for that, no.

Indeed. The last thing a maths library should be doing is making
arbitrary choices that some value is "close enough" and returning "the
value we think you want". "Hi, the number you gave is pretty close to
60°, so I'm going to round the answer off whether you want me to or
not."

It's okay for people to make their own determination of what "close
enough" means, and for many purposes 0.99 could be close enough to 1.
But the library shouldn't.

[...]
> Python is *very* stingy with adding new operators; IIRC only 3 have
> been added over the last ~30 years (**, //, @). I don't think ~= is
> going to make it.

Exponentiation ** goes back to Python 1.5, so I think that's only two
new operators :-)

--
Steve

From steve at pearwood.info  Sat Jun 16 06:59:24 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 16 Jun 2018 20:59:24 +1000
Subject: [Python-ideas] [issue33865] [EASY] Missing code page aliases:
 "unknown encoding: 874"
In-Reply-To: <1529135518.17.0.56676864532.issue33865@psf.upfronthosting.co.za>
References: <1529023151.32.0.947875510639.issue33865@psf.upfronthosting.co.za>
 <1529135518.17.0.56676864532.issue33865@psf.upfronthosting.co.za>
Message-ID: <20180616105924.GE14437@ando.pearwood.info>

> It is easy to test it. Encoding/decoding with '874' should give the
> same result as with 'cp874'.

I know it is too late to remove that feature, but why do we support
digit-only IDs for encodings? They can be ambiguous.
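For what it's worth, one can at least check which codec a given name
actually selects (a quick sketch using only documented calls):

    import codecs

    # CodecInfo.name reveals the canonical codec behind an alias
    print(codecs.lookup("cp874").name)    # 'cp874'
    print(codecs.lookup("latin-1").name)  # 'iso8859-1'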
If Wikipedia is correct, cp874 (also known as ibm874) and Windows-874
(also known as cp1162) are different:

https://en.wikipedia.org/wiki/ISO/IEC_8859-11#Code_page_874
https://en.wikipedia.org/wiki/ISO/IEC_8859-11#Code_page_1162

--
Steve

From steve at pearwood.info  Sat Jun 16 07:09:06 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 16 Jun 2018 21:09:06 +1000
Subject: [Python-ideas] Operator for inserting an element into a list
In-Reply-To: <5B2462A5.8020907@canterbury.ac.nz>
References: <5B2462A5.8020907@canterbury.ac.nz>
Message-ID: <20180616110906.GF14437@ando.pearwood.info>

On Sat, Jun 16, 2018 at 01:06:45PM +1200, Greg Ewing wrote:
> Michael Selik wrote:
> > The += operator was meant as an alias for ``x = x + 1``. The
> > fact that it mutates a list is somewhat of a surprise.
>
> That's very much a matter of opinion. For every person who thinks this
> is a surprise, you can find another who thinks it's obvious that +=
> should mutate a list, and is surprised by the fact that it works on
> immutable types at all.

Given the ubiquity of += in C, where it works on numbers but not lists,
and the general difficulty many people have in dealing with the
difference between assignment and mutation, I think the ratio would be
closer to 20:1 than 1:1.

But regardless, I'm pretty sure that nobody expects this:

py> t = ([], None)
py> t[0] += [1]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment
py> print(t)
([1], None)

> surprised by at least one of its meanings. :-)
>
> Maybe we should call it the "Spanish Inquisition operator".

:-)

--
Steve

From rosuav at gmail.com  Sat Jun 16 08:00:21 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Sat, 16 Jun 2018 22:00:21 +1000
Subject: [Python-ideas] Approximately equal operator
In-Reply-To: <20180616105130.GD14437@ando.pearwood.info>
References: <28f2d18c-6f0b-b602-2f8a-5bbed1d70d58@Damon-Family.org>
 <20180616105130.GD14437@ando.pearwood.info>
Message-ID: 

On Sat, Jun 16, 2018 at 8:51 PM, Steven D'Aprano wrote:
>> Python is *very* stingy with adding new operators; IIRC only 3 have
>> been added over the last ~30 years (**, //, @). I don't think ~= is
>> going to make it.
>
> Exponentiation ** goes back to Python 1.5, so I think that's only two
> new operators :-)

I'm not sure if they count or not, but 'await' and 'yield' are kinda
like unary operators. But yeah, the language is deliberately stingy
there.

ChrisA

From rosuav at gmail.com  Sat Jun 16 08:04:20 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Sat, 16 Jun 2018 22:04:20 +1000
Subject: [Python-ideas] Operator for inserting an element into a list
In-Reply-To: <20180616110906.GF14437@ando.pearwood.info>
References: <5B2462A5.8020907@canterbury.ac.nz>
 <20180616110906.GF14437@ando.pearwood.info>
Message-ID: 

On Sat, Jun 16, 2018 at 9:09 PM, Steven D'Aprano wrote:
> On Sat, Jun 16, 2018 at 01:06:45PM +1200, Greg Ewing wrote:
>> Michael Selik wrote:
>> > The += operator was meant as an alias for ``x = x + 1``. The
>> > fact that it mutates a list is somewhat of a surprise.
>>
>> That's very much a matter of opinion. For every person who thinks this
>> is a surprise, you can find another who thinks it's obvious that +=
>> should mutate a list, and is surprised by the fact that it works on
>> immutable types at all.
>
> Given the ubiquity of += in C, where it works on numbers but not lists,
> and the general difficulty many people have in dealing with the
> difference between assignment and mutation, I think the ratio would be
> closer to 20:1 than 1:1.

The nearest you'd get to "adding to a list/array" in C would be adding to a pointer. But I don't see people writing code like this in Python:

>>> x = [10, 20, 30, 40]
>>> x += 1
>>> assert x == [20, 30, 40]

:)

> But regardless, I'm pretty sure that nobody expects this:
>
> py> t = ([], None)
> py> t[0] += [1]
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: 'tuple' object does not support item assignment
> py> print(t)
> ([1], None)

Agreed, although I can't remember this ever coming up outside of interactive work. When you do something at the interactive prompt, you can kinda feel that it should "rollback" on exception, and yet some part of it has happened. But in most scripts, that TypeError is going to pull you all the way out of the scope where 't' exists - frequently, it'll just straight-up terminate the script - so you won't often see this.

ChrisA

From steve at pearwood.info Sat Jun 16 09:27:24 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 16 Jun 2018 23:27:24 +1000
Subject: [Python-ideas] Fwd: Trigonometry in degrees
In-Reply-To:
References: <5B1F6AD7.3050302@canterbury.ac.nz>
Message-ID: <20180616132723.GG14437@ando.pearwood.info>

On Thu, Jun 14, 2018 at 01:44:34PM -0500, Tim Peters wrote:
> I should note that numeric code "that works" is often much subtler than it
> appears at first glance. So, for educational purposes, I'll point out some
> of what _wasn't_ said about this crucial function:
[...]

Thanks Tim!

Reading your digressions on the minutiae of floating point maths is certainly an education. It makes algebra and real-valued mathematics seem easy in comparison.

I still haven't got over Mark Dickinson's demonstration a few years back that under Decimal floating point, but not binary, it is possible for the ordinary arithmetic average (x+y)/2 to be outside of the range [x, y]:

py> from decimal import getcontext, Decimal
py> getcontext().prec = 3
py> x = Decimal('0.516')
py> y = Decimal('0.518')
py> (x + y) / 2
Decimal('0.515')

-- Steve

From brianvanderburg2 at aim.com Sat Jun 16 12:07:31 2018
From: brianvanderburg2 at aim.com (Brian Allen Vanderburg II)
Date: Sat, 16 Jun 2018 12:07:31 -0400
Subject: [Python-ideas] Python Decorator Improvement Idea
In-Reply-To: <20180616052209.GC14437@ando.pearwood.info>
References: <20180616052209.GC14437@ando.pearwood.info>
Message-ID:

On 06/16/2018 01:22 AM, Steven D'Aprano wrote:
> Some of the information would be available in all
>> contexts, while other information may only be available in certain
>> contexts. The parameter's value cannot be explicitly specified, defaults
>> to Null except when called as a decorator, and can only be specified
>> once in the function's parameter list.
>
> Do you mean None?

Yes, I meant None instead of Null.

> [...]
>> Rules:
>>
>> 1. It is not possible for the parameter's value to be directly
>> specified. You can't call fn(info=...)
>
> That sounds like a recipe for confusion to me. How would you explain
> this to a beginner?
>
> Aside from the confusion that something that looks like a parameter
> isn't an actual parameter, but something magical, it is also very
> limiting. It makes it more difficult to use the decorator, since now it
> only works using @ syntax.

That was just an initial idea.
However, there would be no reason that the parameter could not be passed directly. Actually, if creating one decorator that wraps another decorator, being able to pass the parameter on could be needed.

Also, the decorator would still work in normal syntax, only with that parameter set to None.

>> Information that could be contained in the parameters for all contexts:
>>
>> Variable name
>> Module object declared in
>> Module globals (useful for @export/@public style decorators)
>> Etc
>
> The variable name is just the name of the function or class, the first
> parameter received by the decorator. You can get it with func.__name__.

This works with functions and classes but not other values that may not have __name__.

>> Using the decorator in a class context, pass the class object.
>
> The decorator already receives the class object as the first parameter.
> Why pass it again?
>
>> While the class object hasn't been fully created yet,
>
> What makes you say that?

What I mean is used inside the body of a class to decorate a class member:

    class MyClass(object):
        @decorator
        def method(self):
            pass

Following "explicit is better than implicit":

    class MyClass(object):
        @decorator(MyClass, ...)
        def method(self):
            pass

However, right now that does not work as MyClass does not exist when the decorator is called. I'm not sure how Python works on this under the hood as it's been a long time since I've looked through the source code. If Python gathers everything under MyClass first before it even begins to create the MyClass object, then it may not be possible, but if Python has already created a class object, and just not yet assigned it to the MyClass name in the module, then perhaps there could be some way to pass that class object to the decorator.

I have seen some examples that decorate the class and members to achieve something similar:

    @outerdecorator
    class MyClass:
        @decorator
        def method(self):
            pass

>>     # This will call the decorator passing in 200 as the object, as
>>     # well as info.name as the variable being assigned.
>>     @expose
>>     SCRIPT_CONSTANT = 200
>
> That would require a change to syntax, and would have to be a separate
> discussion.
>
> If there were a way to get the left hand side of assignments as a
> parameter, that feature would be *far* too useful to waste on just
> decorators. For instance, we could finally do something about:
>
>     name = namedtuple("name", fields)

Agreed it would be a change in syntax. Using the decorator syntax I've mentioned, the name being assigned would be passed to that extra info parameter. Python would treat anything in the form of:

    @decorator
    NAME = (expression)

as a decorator as well:

    _tmp = (expression)
    NAME = decorator(_tmp)

Right now, there's little use as it is just as easy to say directly:

    NAME = decorator(expression)

With this idea, it could be possible to do something like this:

    def NamedTuple(obj @info):
        return namedtuple(info.name, obj)

    @NamedTuple
    Point3 = ["x", "y", "z"]

>> The two potential benefits I see from this are:
>>
>> 1. The runtime can pass certain information to the decorator, some
>> information in all contexts, and some information in specific contexts
>> such as when decorating a class member, decorating a function defined
>> within another function, etc
>>
>> 2. It would be possible to decorate values directly, as the runtime can
>> pass relevant information such as the variable's name
>
> No, that would require a second, independent change.
>
> We could, if desired, allow decorator syntax like this:
>
>     @decorate
>     value = 1
>
> but it seems pretty pointless since that's the same as:
>
>     value = decorator(1)
>
> The reason we have @decorator syntax is not to be a second way to call
> functions, using two lines instead of a single expression, but to avoid
> having to repeat the name of the function three times:
>
>     # Repeat the function name three times:
>     def function():
>         ...
>     function = decorate(function)
>
>     # Versus only once:
>     @decorate
>     def function():
>         ...

The two main use cases I had of this idea were basically assignment decorators, pointless as it can just be name = decorator(value), but my idea was to pass to the decorator some metadata such as the name being assigned, and as class member decorators to receive information of the instance of the class object the member is being declared under.

A more general idea could be to allow a function call to receive a meta parameter that provides some context information of the call. This parameter is not part of a parameter list, but a special __variable__, or perhaps could be retrieved via a function call.

Such contexts could be:

1) Assignment (includes decorators since they are just sugar for name = decorator(name)). The meta attribute assignname would contain the name being assigned to:

    def fn(v):
        print(__callinfo__.assignname)
        return v

    # prints X
    X = fn(12)

    # prints MyClass
    @fn
    class MyClass:
        pass

    # Should assignname receive the left-most assignment result or the rightmost othervar?
    # Perhaps assignname could be a tuple of names being assigned to.
    result = othervar = fn(12)

    # assignname would be myothervar in this augmented assignment
    result = [myothervar := fn(12)]

    # Should expressions be allowed, or would assignname be None?
    result = 1 + fn(12)

With something like this,

    name = namedtuple("name", ...)

could become:

    def NamedTuple(*args):
        return namedtuple(__callinfo__.assignname, args)

    Point2 = NamedTuple("x", "y")
    Point3 = NamedTuple("x", "y", "z")
    etc.

2) Class context. A classobj parameter could contain the class object it is called under. This would be a raw object initially, as __init__ would not have been called, but it would allow the decorator to add attributes to a class:

    def fn(v):
        print(__callinfo__.classobj)  # classobj is None except when the function is called in the body of a class declaration
        print(__callinfo__.assignname)
        if __callinfo__.classobj:
            data = vars(__callinfo__.classobj).setdefault("_registry", {})
            data[__callinfo__.assignname] = v
        return v

    class MyClass:
        # prints __main__.MyClass (probably something else since __init__ has not yet been called; may just be a bare class object at that time)
        # prints X
        # sets MyClass._registry["X"]
        X = fn(12)

        # prints __main__.MyClass
        # prints method
        # sets MyClass._registry["method"]
        @fn
        def method(self):
            pass

    # prints None
    # prints Y
    Y = fn(12)

In this case it's no longer a decorator idea but more of an idea for a called function to be able to retrieve certain meta information about its call.
In the examples above, I used __callinfo__ with attributes, but direct names would work the same:

    def fn(v):
        print(__assignname__)  # May be None if there is no assignment, e.g. otherfunc(fn(value))
        print(__classobj__)  # Will be None unless fn is called directly under a class body

There may be other contexts and use cases, and better ways. Just an idea.

From mikhailwas at gmail.com Sat Jun 16 13:21:42 2018
From: mikhailwas at gmail.com (Mikhail V)
Date: Sat, 16 Jun 2018 20:21:42 +0300
Subject: [Python-ideas] Operator for inserting an element into a list
In-Reply-To: <20180616014422.GA37699@cskk.homeip.net>
References: <20180616014422.GA37699@cskk.homeip.net>
Message-ID:

On Sat, Jun 16, 2018 at 4:44 AM, Cameron Simpson wrote:
> On 16Jun2018 02:42, Mikhail V wrote:
>>
> Some things _should_ be syntax errors. Particularly things which may be
> typing errors. Suppose I'd meant to type:
>
>     L[0] = item
>
> Silent breakage, requiring runtime debugging.

Not sure that it's different from any other situation: e.g. if by mistake I will write:

    L[i+1] instead of
    L[i+2]

The visual difference between

    L[] =
    L[1] =

is big enough to reduce the typing error. But maybe I did not understand your example case.

FWIW in general, claims about possible typing errors by introducing this or that syntax are speculative. The main source of typos usually is high similarity of spellings and characters, or initial obscurity of spelling - e.g. excessive punctuation. Does it apply here?

>> I understand this is totally different approach than operator
>> overloading and maybe
>> hard to implement, but I feel like it looks really appealing.
>> And it is quite intuitive imo. For me the syntax reads like:
>> "add new empty element and this element will be "item".
>
> The term "new empty element" is a nonsense term to me.
>
> If you mean "replace an empty slice at the end of the list with a new
> element", that can already be written:

I just say that I find this syntax more intuitive than the idea with an operator. Not sure that we need to start semantics nitpicking.

For example, such code:

    L = []
    L[] = x
    L[] = y

imo has more chance to be understood correctly than e.g.:

    L = []
    L ^= x
    L ^= y

By L[] there is some mnemonic hint because [] is used to create a new empty list. Plus it does not introduce overloading of the operator. And overloading has a weakness in this - e.g. " var1 += var2 " does not have mnemonics, other than the + character (it could be two integers as well). So if L[] is found somewhere far from initialisation - it may be a good aid. It makes it more clear what is happening, compared to an augmented operator.

From mike at selik.org Sat Jun 16 15:04:00 2018
From: mike at selik.org (Michael Selik)
Date: Sat, 16 Jun 2018 12:04:00 -0700
Subject: [Python-ideas] Operator for inserting an element into a list
In-Reply-To:
References: <20180616014422.GA37699@cskk.homeip.net>
Message-ID:

On Sat, Jun 16, 2018, 10:22 AM Mikhail V wrote:
> Plus it does not introduce overloading of the operator.

Now you're criticizing duck typing.

> And overloading has a weakness in this - e.g. " var1 += var2 " does not
> have mnemonics, other than the + character (it could be two integers as well).

That's an advantage, not a disadvantage.
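To make the dispute concrete, a minimal illustration of how += dispatches today, with no new behaviour assumed: augmented assignment uses __iadd__ when the type defines it, and otherwise falls back to __add__ plus rebinding.

    xs = [1, 2]
    ys = xs
    xs += [3]        # lists define __iadd__, so this mutates in place
    print(ys)        # [1, 2, 3] - ys sees the change, same object

    n = 1
    m = n
    n += 1           # ints have no __iadd__, so this falls back to n = n + 1
    print(m, n)      # 1 2 - m is untouched, only n was rebound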
From cs at cskk.id.au Sat Jun 16 18:25:20 2018
From: cs at cskk.id.au (Cameron Simpson)
Date: Sun, 17 Jun 2018 08:25:20 +1000
Subject: [Python-ideas] Operator for inserting an element into a list
In-Reply-To:
References:
Message-ID: <20180616222520.GA49050@cskk.homeip.net>

On 16Jun2018 20:21, Mikhail V wrote:
>On Sat, Jun 16, 2018 at 4:44 AM, Cameron Simpson wrote:
>> On 16Jun2018 02:42, Mikhail V wrote:
>> Some things _should_ be syntax errors. Particularly things which may be
>> typing errors. Suppose I'd meant to type:
>>
>>     L[0] = item
>>
>> Silent breakage, requiring runtime debugging.
>
>Not sure that it's different from any other situation:

Well, many typing errors are invalid syntax. When that happens you find out immediately.

>e.g. if by mistake I will write:
>    L[i+1] instead of
>    L[i+2]

This is true: many other typing errors are not syntax errors.

>The visual difference between
>    L[] =
>    L[1] =
>
>is big enough to reduce the typing error. But maybe I did not
>understand your example case.

No, you understand the example; we differ in our estimation of likelihoods and costs to mistakes. Particularly, I dislike "silent breakage", which can be much harder to fix because it shows as incorrect behaviour far from the error (eg counters being slightly wrong leading to misbehaviour in things depending on the counter). Assuming we are lucky enough for the misbehaviour to be obvious.

But there is also the point that _every_ new piece of syntax reduces the surface of "invalid syntax that can catch simple mistakes". So new syntax tends to require a higher perceived benefit than, say, a new feature on a class/type. Such as your suggestions about having lists support more operators, e.g. "^" to mediate list insertion.

>FWIW in general, claims about possible typing errors by
>introducing this or that syntax are speculative.

Certainly.

>The main source of typos usually is high similarity of spellings and
>characters, or initial obscurity of spelling - e.g. excessive
>punctuation. Does it apply here?

Unsure, depends on the programmer.

>>> I understand this is totally different approach than operator
>>> overloading and maybe
>>> hard to implement, but I feel like it looks really appealing.
>>> And it is quite intuitive imo. For me the syntax reads like:
>>> "add new empty element and this element will be "item".
>>
>> The term "new empty element" is a nonsense term to me.
>>
>> If you mean "replace an empty slice at the end of the list with a new
>> element", that can already be written:
>
>I just say that I find this syntax more intuitive than the idea
>with an operator. Not sure that we need to start semantics nitpicking.
>
>For example, such code:
>
>    L = []
>    L[] = x
>    L[] = y

Well, as someone else pointed out, PHP has this notation for append. It is quite convenient. (In fact, it is one of the very few convenient things in PHP.)

However, PHP is generally not a winning source of ideas in the Python world. Certainly I was _very_ glad to not be suffering with PHP once I left my previous job. There were many many things to like about my previous job, but PHP was not one of them.
Cheers, Cameron Simpson

From steve at pearwood.info Sat Jun 16 19:52:16 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 17 Jun 2018 09:52:16 +1000
Subject: [Python-ideas] Operator for inserting an element into a list
In-Reply-To:
References: <20180616014422.GA37699@cskk.homeip.net>
Message-ID: <20180616235216.GI14437@ando.pearwood.info>

On Sat, Jun 16, 2018 at 08:21:42PM +0300, Mikhail V wrote:
> For example, such code:
>
>     L = []
>     L[] = x
>     L[] = y

Should be written as L = [x, y].

> imo has more chance to be understood correctly than e.g.:
>
>     L = []
>     L ^= x
>     L ^= y

I disagree. The first syntax L[] = x looks so similar to L[:] assignment that I keep reading it as "set the list L to a single item x". It certainly doesn't look like an append operation. The second at least looks like a mutation on L.

> By L[] there is some mnemonic hint because [] is used to create
> a new empty list.

How is that a hint? What is the connection between "append an item" and "create a new empty list"?

> So if L[] is found somewhere far from initialisation - it may be a good aid.
> It makes it more clear what is happening, compared to an augmented operator.

I don't think so.

Here is a radical thought... why don't we give lists a method that inserts items at the end of the list? We could call it something like "append", and then instead of hoping people guess what the syntax does, they can just look up the name of the method?

L.append(x) might work. *wink*

-- Steve

From mike at selik.org Sat Jun 16 20:22:19 2018
From: mike at selik.org (Michael Selik)
Date: Sat, 16 Jun 2018 17:22:19 -0700
Subject: [Python-ideas] Python Decorator Improvement Idea
In-Reply-To:
References: <20180616052209.GC14437@ando.pearwood.info>
Message-ID:

The idea of having a dunder to introspect the bound variable name has been discussed before. You can find the past discussions in the mailing list archive. If I recall correctly, there were very few use cases beyond namedtuple. With dataclasses available in 3.7, there may be even less interest than before.

On Sat, Jun 16, 2018, 9:04 AM Brian Allen Vanderburg II via Python-ideas <python-ideas at python.org> wrote:
> [...]
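For reference, a minimal sketch of that dataclasses point (assuming Python 3.7+); the class name is written once, and nothing needs to introspect the bound name:

    from dataclasses import dataclass

    @dataclass
    class Point3:
        x: float
        y: float
        z: float

    # The generated __init__ and __repr__ reuse the name given in the
    # class statement, removing the namedtuple-style name repetition:
    print(Point3(1.0, 2.0, 3.0))  # Point3(x=1.0, y=2.0, z=3.0)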
From eric at trueblade.com Sat Jun 16 20:51:35 2018
From: eric at trueblade.com (Eric V. Smith)
Date: Sat, 16 Jun 2018 20:51:35 -0400
Subject: [Python-ideas] Python Decorator Improvement Idea
In-Reply-To:
References: <20180616052209.GC14437@ando.pearwood.info>
Message-ID: <361e6814-d073-c3e2-52d7-b5b26252775e@trueblade.com>

On 6/16/2018 8:22 PM, Michael Selik wrote:
> The idea of having a dunder to introspect the bound variable name has
> been discussed before. You can find the past discussions in the mailing
> list archive. If I recall correctly, there were very few use cases
> beyond namedtuple. With dataclasses available in 3.7, there may be even
> less interest than before.

One such thread is here:
https://mail.python.org/pipermail/python-ideas/2011-March/009250.html

Eric

> [...]
From tim.peters at gmail.com Sun Jun 17 01:57:24 2018
From: tim.peters at gmail.com (Tim Peters)
Date: Sun, 17 Jun 2018 00:57:24 -0500
Subject: [Python-ideas] Fwd: Trigonometry in degrees
In-Reply-To: <20180616132723.GG14437@ando.pearwood.info>
References: <5B1F6AD7.3050302@canterbury.ac.nz> <20180616132723.GG14437@ando.pearwood.info>
Message-ID:

[Steven D'Aprano]
> Thanks Tim!

You're welcome ;-)

> Reading your digressions on the minutiae of floating point maths is
> certainly an education. It makes algebra and real-valued mathematics
> seem easy in comparison.

Hard to say, really. The problem with floating point is that it's so God-awful lumpy - special cases all over the place. Signaling and quiet NaNs; signed infinities; signed zeroes; normal finites all with the same number of bits, but where the gap between numbers changes abruptly at power-of-2 boundaries; subnormals where the gap remains the same across power-of-2 boundaries, but the number of _bits_ changes abruptly; all "the rules" break down when you get too close to overflow or underflow; four rounding modes to worry about; and a whole pile of technically defined exceptional conditions and related traps & flags.

Ignoring all that, though, it's pretty easy ;-)

754 was dead serious about requiring results act as if a single rounding is done to the infinitely precise result, and that actually allows great simplification in reasoning.

The trend these days appears to be using automated theorem-proving systems to keep track of the mountain of interacting special cases. Those have advanced enough that we may even be on the edge of getting provably-correctly-rounded transcendental functions with reasonable speed. Although it's not clear people will be able to understand the proofs ;-)

> I still haven't got over Mark Dickinson's demonstration a few years
> back that under Decimal floating point, but not binary, it is possible
> for the ordinary arithmetic average (x+y)/2 to be outside of the
> range [x, y]:
>
> py> from decimal import getcontext, Decimal
> py> getcontext().prec = 3
> py> x = Decimal('0.516')
> py> y = Decimal('0.518')
> py> (x + y) / 2
> Decimal('0.515')

Ya, decimal fp doesn't really solve anything except the shallow surprise that decimal fractions generally aren't exactly representable as binary fractions. Which is worth a whole lot for casual users, but doesn't address any of the deep problems (to the contrary, it makes those a bit worse).

I like to illustrate the above with 1-digit decimal fp, because it makes it more apparent at once that - unlike in binary fp - multiplication and division by 2 may _not_ be exact in decimal fp. We can't even average a number "with itself" reliably:

>>> import decimal
>>> decimal.getcontext().prec = 1
>>> x = y = decimal.Decimal(8); (x+y)/2  # 10 is much bigger than 8
Decimal('1E+1')
>>> x = y = decimal.Decimal(7); (x+y)/2  # 5 is much smaller than 7
Decimal('5')

But related things _can_ happen in binary fp too! You have to be near the edge of representable non-zero finites though:

>>> x = y = 1e308
>>> x
1e+308
>>> (x+y)/2
inf

Oops.
So rewrite it:

>>> x/2 + y/2
1e+308

Better! But then:

>>> x = y = float.fromhex("3p-1074")
>>> x
1.5e-323
>>> x/2 + y/2
2e-323

Oops.

A math library has to deal with everything "correctly". Believe it or not, this paper

"How do you compute the midpoint of an interval?"
https://hal.archives-ouvertes.fr/hal-00576641v1/document

is solely concerned with computing the average of two IEEE doubles, yet runs to 29(!) pages. Almost everything you try fails for _some_ goofy cases.

I personally write it as (x+y)/2 anyway ;-)

From mikhailwas at gmail.com Sun Jun 17 07:43:16 2018
From: mikhailwas at gmail.com (Mikhail V)
Date: Sun, 17 Jun 2018 14:43:16 +0300
Subject: [Python-ideas] Operator for inserting an element into a list
In-Reply-To: <20180616235216.GI14437@ando.pearwood.info>
References: <20180616014422.GA37699@cskk.homeip.net> <20180616235216.GI14437@ando.pearwood.info>
Message-ID:

On Sun, Jun 17, 2018 at 2:52 AM, Steven D'Aprano wrote:
> On Sat, Jun 16, 2018 at 08:21:42PM +0300, Mikhail V wrote:
>
>> By L[] there is some mnemonic hint because [] is used to create
>> a new empty list.
>
> How is that a hint? What is the connection between "append an item" and
> "create a new empty list"?

Where did I say it has a _direct_ connection? It has some associative connection - 'new item', 'special index case'. L = [] is a new list which is supposed to be filled with something. And it has references to existing syntax, e.g. slice assignment, or adding a dictionary item, which can be written as:

    mydict[key] = value

so:

    mylist[] = item

is not THAT far. I would even say it's very close - but I'm pretty sure you can find something against this as well.

> Here is a radical thought... why don't we give lists a method that
> inserts items at the end of the list? We could call it something like
> "append", and then instead of hoping people guess what the syntax does,
> they can just look up the name of the method?
>
> L.append(x)

Exercising in wit?

How about: let's assume most people can't understand even the simplest new feature. So instead of giving clean compact assignment syntax for a frequent operation, let's force a less readable method call everywhere, which also might be harder to remember than one index case []. *wink*

Maybe force .extend() for lists and .update() for dicts everywhere? To protect poor new users from 'unintuitive' assignment syntax. But how would you do that?

From turnbull.stephen.fw at u.tsukuba.ac.jp Sun Jun 17 08:02:02 2018
From: turnbull.stephen.fw at u.tsukuba.ac.jp (Stephen J. Turnbull)
Date: Sun, 17 Jun 2018 21:02:02 +0900
Subject: [Python-ideas] [issue33865] [EASY] Missing code page aliases: "unknown encoding: 874"
In-Reply-To: <20180616105924.GE14437@ando.pearwood.info>
References: <1529023151.32.0.947875510639.issue33865@psf.upfronthosting.co.za> <1529135518.17.0.56676864532.issue33865@psf.upfronthosting.co.za> <20180616105924.GE14437@ando.pearwood.info>
Message-ID: <23334.19898.962713.160394@turnbull.sk.tsukuba.ac.jp>

Folks. There are standards. "1252" *is not* an alias for "windows-1252" according to the IANA, while "866" *is* an alias for "IBM866" according to the same authority. Most 3-digit "IBMxxx" ARE aliased to both "cpxxx" and just "xxx", but not all. None of "IBM874", "874", or "cp874" exists according to the IANA.
https://www.iana.org/assignments/character-sets/character-sets.xhtml

For the reasons Steven gave, I would say omit the digits-only aliases, but if we must use them because "there's a standard" (or backward compatibility), we should stick to those defined by that standard, and only those. If we're following other standards that I'm unaware of, fine, but let's cite them rather than randomly introduce a plethora of aliases because they "look like" an existing (and unfortunate) standard.

There's also some other weirdness with "windows-874", see below. We (somebody) should check other "windows-xxx" character sets to make sure they're not misnamed "cpxxx".

Steven D'Aprano writes:

> > It is easy to test it. Encoding/decoding with '874' should give the
> > same result as with 'cp874'.
>
> I know it is too late to remove that feature, but why do we support
> digit-only IDs for encodings? They can be ambiguous.

According to the IANA, they're not necessarily ambiguous. Here is the entry for IBM866:

    Name:      IBM866
    MIBenum:   2086
    Source:    IBM NLDG Volume 2 (SE09-8002-03) August 1994
    Aliases:   cp866, 866, csIBM866
    Reference: [Rick_Pond]

where the aliases column shows the registered aliases. There are at least a dozen IBMxxx character sets with 'xxx' aliases.

I don't understand what's with "cp874", though. We can surely take that one back, although we'd better hurry if it's in 3.7rc. We might want to add "windows-874" (which doesn't seem to be present in Python 3.6), since that's the standard character set name per IANA. The confusion between cp874 and windows-874 may be because in VENDORS/MICSFT/WINDOWS it's in CP874.TXT (as are all the code pages there).

> https://en.wikipedia.org/wiki/ISO/IEC_8859-11#Code_page_874
>
> https://en.wikipedia.org/wiki/ISO/IEC_8859-11#Code_page_1162

I don't know where Wikipedia's information comes from, but it's not the IANA.

--
Associate Professor              Division of Policy and Planning Science
http://turnbull.sk.tsukuba.ac.jp/     Faculty of Systems and Information
Email: turnbull at sk.tsukuba.ac.jp                   University of Tsukuba
Tel: 029-853-5175                 Tennodai 1-1-1, Tsukuba 305-8573 JAPAN

From ncoghlan at gmail.com Sun Jun 17 08:06:29 2018
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 17 Jun 2018 22:06:29 +1000
Subject: [Python-ideas] Meta-PEP about C functions
In-Reply-To: <5B21350C.9040803@UGent.be>
References: <5B21350C.9040803@UGent.be>
Message-ID:

On 14 June 2018 at 01:15, Jeroen Demeyer wrote:
> I have finished my "meta-PEP" for issues with built-in (implemented in C)
> functions and methods. This is meant to become an "informational" (not
> standards track) PEP for other PEPs to refer to.
>
> You can read the full text at
> https://github.com/jdemeyer/PEP-functions-meta
>
> I also give brief ideas of solutions for the various issues. The main idea
> is a new PyTypeObject field tp_ccalloffset giving an offset in the object
> structure for a new PyCCallDef struct. This new struct replaces PyMethodDef
> for calling functions/methods and defines a new "C call" protocol. Comparing
> with PEP 575, one could say that the base_function class has been replaced
> by PyCCallDef. This is even more general than PEP 575 and it should be
> easier to support this new protocol in existing classes.
>
> I plan to submit this as PEP in the next days, but I wanted to check for
> some early feedback first.

This looks like a nice overview of the problem space to me, thanks for putting it together!
It will probably make sense to publish it as a PEP just before you publish the first draft of your CCall protocol PEP, so that the two PEPs get assigned consecutive numbers (it doesn't really matter if they're non-consecutive, but at the same time, I don't think there's any specific urgency in getting this one published before the CCall PEP needs to reference it as background information).

Cheers, Nick.

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From steve at pearwood.info Sun Jun 17 09:38:23 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 17 Jun 2018 23:38:23 +1000
Subject: [Python-ideas] Operator for inserting an element into a list
In-Reply-To:
References: <20180616014422.GA37699@cskk.homeip.net> <20180616235216.GI14437@ando.pearwood.info>
Message-ID: <20180617133823.GJ14437@ando.pearwood.info>

On Sun, Jun 17, 2018 at 02:43:16PM +0300, Mikhail V wrote:
> On Sun, Jun 17, 2018 at 2:52 AM, Steven D'Aprano wrote:
> > On Sat, Jun 16, 2018 at 08:21:42PM +0300, Mikhail V wrote:
> >
> >> By L[] there is some mnemonic hint because [] is used to create
> >> a new empty list.
> >
> > How is that a hint? What is the connection between "append an item" and
> > "create a new empty list"?
>
> Where did I say it has a _direct_ connection?

I didn't mention "direct" connection. You said there is a mnemonic from "the empty list" to "L[] is used for insert". I don't know what that mnemonic is, and your description below doesn't make sense to me.

Insert/append doesn't just work on empty lists, so the connection between "empty list" and inserting/appending is pretty tenuous. Aside from the presence of a list, I don't see any connection at all. If we think about this example:

    L = ["starting", "values", "go", "here"]
    for item in extra_values:
        L[] = item

do you still see a connection between a non-empty list and L[] used for append?

> It has some associative connection - 'new item', 'special index case'.
> L = [] is a new list which is supposed to be filled with something.

I don't see any connection between "new item" and "special index case" either.

> And it has references to existing syntax, e.g. slice assignment, or
> adding a dictionary item, which can be written as:
>
>     mydict[key] = value
>
> so:
>
>     mylist[] = item
>
> is not THAT far.

I would expect mylist[] = value to mean that the contents of mylist are replaced with a single item. And then I would wonder why this is so special that it needs dedicated syntax, when we can already do it with existing syntax:

    mylist[:] = [item]

The idea that assignment to mylist[] means "append to the end of the list" would never cross my mind in a million years.

> I would even say it's very close - but I'm pretty
> sure you can find something against this as well.
>
> > Here is a radical thought... why don't we give lists a method that
> > inserts items at the end of the list? We could call it something like
> > "append", and then instead of hoping people guess what the syntax does,
> > they can just look up the name of the method?
> >
> > L.append(x)
>
> Exercising in wit?
>
> How about: let's assume most people can't understand even the simplest
> new feature.
> So instead of giving clean compact assignment syntax for a frequent operation,

Compact it might be:

    L[] = x        # seven characters, including spaces
    L.append(x)    # eleven characters

but I think that calling it "clean" is inappropriate. Slice notation is already one of the trickier things for beginners to learn, and this adds extra complexity for not much benefit.
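For reference, slicing already provides assignment-statement spellings for both readings, with no new syntax; a minimal sketch:

    L = [1, 2, 3]
    L[len(L):] = [4]   # "replace the empty slice at the end": appends 4
    print(L)           # [1, 2, 3, 4]
    L[:] = [9]         # the replace-the-contents reading of mylist[] = value
    print(L)           # [9]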
As for "frequent operation", there are lots of frequent operations in Python. Does every one of them deserve special syntax to make it clean? I just opened one of my modules at random, and I don't have a single append in that module, but I have 14 calls to string.startswith and nine calls to kwargs.pop. Appending to a list might be common, but I don't see that it is either common enough or important enough to add additional syntax. > let's force less readable method call everywhere, I think that describing a short, self-descriptive method call like append as "less readable" demonstrates a deep and fundamental gulf between the style you prefer and what the rest of us prefer. Mikhail, sometimes I wonder if you would be happier using Perl rather than Python. Like the Perl community, you seem to have a desire for syntax which most of us see as terse and cryptic over self-descriptive method names. But even Perl doesn't (so far as I know) give us special syntax to append to an array. As far as I know, the standard way to append to an array @arr in Perl is: push(@arr, item); -- Steve From paal.drange at gmail.com Sun Jun 17 09:54:11 2018 From: paal.drange at gmail.com (=?UTF-8?B?UMOlbCBHcsO4bsOlcyBEcmFuZ2U=?=) Date: Sun, 17 Jun 2018 15:54:11 +0200 Subject: [Python-ideas] Operator for inserting an element into a list In-Reply-To: <20180617133823.GJ14437@ando.pearwood.info> References: <20180616014422.GA37699@cskk.homeip.net> <20180616235216.GI14437@ando.pearwood.info> <20180617133823.GJ14437@ando.pearwood.info> Message-ID: Mikhail, this thread is getting quite long, and difficult to follow. It's quite clear that a new operator won't be introduced for list insertion. Furthermore, this thread has now become a hard-to-follow and impossible-to-participate-in debate. If you have other ideas, could you maybe formulate them unambiguously and post them in a new thread? With examples. I'm interested in reading about new ideas, but not rhetorics and argumentations. Best wishes, P?l Gr?n?s Drange -------------- next part -------------- An HTML attachment was scrubbed... URL: From mikhailwas at gmail.com Sun Jun 17 13:01:09 2018 From: mikhailwas at gmail.com (Mikhail V) Date: Sun, 17 Jun 2018 20:01:09 +0300 Subject: [Python-ideas] Alternative spelling for list.append() Message-ID: [by request I've made new subject and summary of proposal] The idea is to introduce new syntax for the list.append() method. Syntax: Variant 1. Use special case of index, namely omitted index: mylist[] = item Examples: mylist = [1, 2, 3] --> [1, 2, 3] mylist[] = x --> [1, 2, 3, x] mylist[] = [foo, bar] --> [1, 2, 3, [foo, bar]] instead of current: mylist = [1, 2, 3] mylist.append(x) mylist.append([foo, bar]) Variant 2. Use one of the augmented assignment operators. For example using ^= operator: mylist = [1, 2, 3] mylist ^= x mylist ^= [foo, bar] For example using >>= operator: mylist = [1, 2, 3] mylist >>= x mylist >>= [foo, bar] Other operators may be considerd as well. Motivation ----------- 1. Assignment form reduces text amount in statements and makes the right-hand part, namely the item, clean of additional brackets, which is improtant especially by more complex items which can also contain brackets or quotes. For example: mylist.append([[foo, bar], [] ]) Could be written as: mylist[] = [[foo, bar], [] ] Which preserves the original item form and has generally more balanced look. 2. Method form has one general issue, especially by longer variable names. Example from https://docs.python.org/3/tutorial/datastructures.html ... 
    for row in matrix:
        transposed_row.append(row[i])
    transposed.append(transposed_row)

It becomes hard to read because of the lack of spacing between variable names and list names. With the new syntax it could be written:

    for row in matrix:
        transposed_row[] = row[i]
    transposed[] = transposed_row

3. Item appending is a very frequent operation. In the current syntax, the extend() method has dedicated syntax:

    mylist1 += mylist2

One of the important aspects of the proposal is that it should discourage usage of += for appending an item. Namely, the idea is to discourage this form:

    mylist += [item]

because it is clunky and may cause additional confusion. E.g. adding brackets is often used and understood as an operation that increases the dimension of a list - so in the case of a variable (not a literal) the immediate reaction to this form is that there might be an intention to increase the dimension of the list, and not just create a list from, say, an integer. But of course, increasing the dimension may also be the intention, so this construct causes extra brain load.

Also, in current syntax it is possible to add an item to a dictionary with an assignment statement:

    mydict[key] = item

even though it is also possible via the update() method. In both cases the assignment form gained popularity and is used very often.

From mike at selik.org Sun Jun 17 13:11:53 2018
From: mike at selik.org (Michael Selik)
Date: Sun, 17 Jun 2018 10:11:53 -0700
Subject: [Python-ideas] Alternative spelling for list.append()
In-Reply-To:
References:
Message-ID:

On Sun, Jun 17, 2018, 10:01 AM Mikhail V wrote:
> The idea is to introduce new syntax for the list.append() method.

While you have summarized your proposal, you haven't included a summary of the criticism.

Also, one thing that's very common for proposals to change syntax and create new uses for operators is to show real code from major libraries that benefit from the change.

From robertve92 at gmail.com Sun Jun 17 13:27:28 2018
From: robertve92 at gmail.com (Robert Vanden Eynde)
Date: Sun, 17 Jun 2018 19:27:28 +0200
Subject: [Python-ideas] Alternative spelling for list.append()
In-Reply-To:
References:
Message-ID:

Some APIs (in C++ at least) use "<<" for appending:

    A = [1, 2, 7, 2]
    A <<= 5
    A == [1, 2, 7, 2, 5]

The A[] = syntax has its benefits, being used in PHP (and I think some other languages).

On Sun, 17 Jun 2018 at 19:12, Michael Selik wrote:
> On Sun, Jun 17, 2018, 10:01 AM Mikhail V wrote:
>> The idea is to introduce new syntax for the list.append() method.
>
> While you have summarized your proposal, you haven't included a summary of
> the criticism.
>
> Also, one thing that's very common for proposals to change syntax and
> create new uses for operators is to show real code from major libraries
> that benefit from the change.
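For what it's worth, the "<<=" spelling can already be prototyped today with operator overloading and no new syntax; a minimal sketch (PushList is a made-up name):

    class PushList(list):
        """A list whose <<= operator appends a single item."""
        def __ilshift__(self, item):
            self.append(item)
            return self  # augmented assignment rebinds to the returned object

    A = PushList([1, 2, 7, 2])
    A <<= 5
    print(A)  # [1, 2, 7, 2, 5]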
URL: 

From ronaldoussoren at mac.com  Sun Jun 17 14:29:41 2018
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Sun, 17 Jun 2018 20:29:41 +0200
Subject: [Python-ideas] [issue33865] [EASY] Missing code page aliases:
 "unknown encoding: 874"
In-Reply-To: <23334.19898.962713.160394@turnbull.sk.tsukuba.ac.jp>
References: <1529023151.32.0.947875510639.issue33865@psf.upfronthosting.co.za>
 <1529135518.17.0.56676864532.issue33865@psf.upfronthosting.co.za>
 <20180616105924.GE14437@ando.pearwood.info>
 <23334.19898.962713.160394@turnbull.sk.tsukuba.ac.jp>
Message-ID: <64CAF5C0-E717-49B0-BA0D-D282D5A4039C@mac.com>

> On 17 Jun 2018, at 14:02, Stephen J. Turnbull wrote:
>
> Folks.  There are standards.  "1252" *is not* an alias for
> "windows-1252" according to the IANA, while "866" *is* an alias for
> "IBM866" according to the same authority.  Most 3-digit "IBMxxx" ARE
> aliased to both "cpxxx" and just "xxx", but not all.  None of
> "IBM874", "874", or "cp874" exists according to the IANA.

Sure, but for at least one user Python 3.6 fails to start because
initialising the sys.std* streams fails due to not finding a '874'
encoding.

The user sadly enough didn't provide more information on his machine,
other than that it is running some version of Windows.

BTW, 'cp874' does exist according to the Unicode consortium:
https://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP874.TXT,
and appears to be a codepage for a (the?) Thai language. The user might
therefore be running Windows with a Thai locale.

Ronald
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From rosuav at gmail.com  Sun Jun 17 16:09:06 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Mon, 18 Jun 2018 06:09:06 +1000
Subject: [Python-ideas] Operator for inserting an element into a list
In-Reply-To: <20180617133823.GJ14437@ando.pearwood.info>
References: <20180616014422.GA37699@cskk.homeip.net>
 <20180616235216.GI14437@ando.pearwood.info>
 <20180617133823.GJ14437@ando.pearwood.info>
Message-ID: 

On Sun, Jun 17, 2018 at 11:38 PM, Steven D'Aprano wrote:
> As for "frequent operation", there are lots of frequent operations in
> Python. Does every one of them deserve special syntax to make it
> clean? I just opened one of my modules at random, and I don't have a
> single append in that module, but I have 14 calls to string.startswith
> and nine calls to kwargs.pop.

kwargs.pop("some_key") could plausibly be spelled del
kwargs["some_key"] if del were (like yield) upgraded to expression.
Whether that is an improvement or not, I don't know, but at least it's
logical.

ChrisA

From rosuav at gmail.com  Sun Jun 17 16:18:49 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Mon, 18 Jun 2018 06:18:49 +1000
Subject: [Python-ideas] Alternative spelling for list.append()
In-Reply-To: 
References: 
Message-ID: 

On Mon, Jun 18, 2018 at 3:01 AM, Mikhail V wrote:
> The idea is to introduce new syntax for the list.append() method.
>
>
> Syntax:
>
> Variant 1.
> Use a special case of index, namely an omitted index:
>
> mylist[] = item

Creation of syntax cannot be done for just one type. So what would this
mean (a) for other core data types, and (b) in the protocols? What
dunder will be called, and with what arguments? For example:

class Foo:
    def __setitem__(self, key, val):
        print("Setting", key, "to", val)

x = Foo()
x[] = 1

What should be printed? Will it come through __setitem__ or are you
requiring a completely different dunder method?
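(For what it's worth, the proposed spelling is currently rejected
outright by the parser, so this is a grammar change and not merely a
new protocol hook - a quick check, traceback abridged:

>>> mylist = []
>>> mylist[] = 1
  File "<stdin>", line 1
SyntaxError: invalid syntax
)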
Regardless, I am still a strong -1 on introducing another way to spell
list.append().

ChrisA

From jelle.zijlstra at gmail.com  Sun Jun 17 16:52:30 2018
From: jelle.zijlstra at gmail.com (Jelle Zijlstra)
Date: Sun, 17 Jun 2018 13:52:30 -0700
Subject: [Python-ideas] Operator for inserting an element into a list
In-Reply-To: 
References: <20180616014422.GA37699@cskk.homeip.net>
 <20180616235216.GI14437@ando.pearwood.info>
 <20180617133823.GJ14437@ando.pearwood.info>
Message-ID: 

2018-06-17 13:09 GMT-07:00 Chris Angelico:
>
> kwargs.pop("some_key") could plausibly be spelled del
> kwargs["some_key"] if del were (like yield) upgraded to expression.
> Whether that is an improvement or not, I don't know, but at least it's
> logical.

That already works. It calls the __delitem__ magic method.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From rosuav at gmail.com  Sun Jun 17 17:00:21 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Mon, 18 Jun 2018 07:00:21 +1000
Subject: [Python-ideas] Operator for inserting an element into a list
In-Reply-To: 
References: <20180616014422.GA37699@cskk.homeip.net>
 <20180616235216.GI14437@ando.pearwood.info>
 <20180617133823.GJ14437@ando.pearwood.info>
Message-ID: 

On Mon, Jun 18, 2018 at 6:52 AM, Jelle Zijlstra wrote:
>
> 2018-06-17 13:09 GMT-07:00 Chris Angelico:
>>
>> kwargs.pop("some_key") could plausibly be spelled del
>> kwargs["some_key"] if del were (like yield) upgraded to expression.
>> Whether that is an improvement or not, I don't know, but at least
>> it's logical.
>
> That already works. It calls the __delitem__ magic method.

Yes, but it's a statement. The point of kwargs.pop("some_key") is to
get the value of it.

x = del kwargs["some_key"] # doesn't work

ChrisA

From clint.hepner at gmail.com  Sun Jun 17 19:36:54 2018
From: clint.hepner at gmail.com (Clint Hepner)
Date: Sun, 17 Jun 2018 19:36:54 -0400
Subject: [Python-ideas] Alternative spelling for list.append()
In-Reply-To: 
References: 
Message-ID: <75E1D78E-9C80-412E-B6CC-483FF4B1EF3F@gmail.com>

> On Jun 17, 2018, at 4:18 PM, Chris Angelico wrote:
>
>> On Mon, Jun 18, 2018 at 3:01 AM, Mikhail V wrote:
>> The idea is to introduce new syntax for the list.append() method.
>>
>>
>> Syntax:
>>
>> Variant 1.
>> Use a special case of index, namely an omitted index:
>>
>> mylist[] = item
>
> Creation of syntax cannot be done for just one type.

That's false. @ was added solely for matrix multiplication.

> Regardless, I am still a strong -1 on introducing another way to spell
> list.append().

-1 as well. Has anyone but the proposer shown any support for it yet?

--
Clint

From levkivskyi at gmail.com  Sun Jun 17 19:50:40 2018
From: levkivskyi at gmail.com (Ivan Levkivskyi)
Date: Mon, 18 Jun 2018 00:50:40 +0100
Subject: [Python-ideas] Check type hints in stack trace printing
In-Reply-To: 
References: 
Message-ID: 

On 14 June 2018 at 12:03, Daniel Sánchez Fábregas
<daniel.sanchez.fabregas at xunta.gal> wrote:

> My idea consists in:
> Adding a method to perform type checking in traceback objects
> When printing stack traces search for mistyped arguments and warn
> about them to the user.
>
> Don't know if it is in the roadmap, but it seems to have a good
> cost/benefit ratio to me.

It seems to me too that this is rather a job for a static type checker
like mypy.
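For instance, a checker flags a mistyped argument without ever running
the program (a small illustration; the exact wording of the diagnostic
varies between mypy versions):

def greet(name: str) -> str:
    return "Hello " + name

greet(42)  # mypy: error: Argument 1 to "greet" has incompatible type "int"; expected "str"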
There is a misconception that runtime objects can be attributed (or checked w.r.t.) static types, but this is true only in simple cases. Also such runtime check will be not able to correctly catch many type errors that can be detected statically, like a wrong assignment or a Liskov violation. -- Ivan -------------- next part -------------- An HTML attachment was scrubbed... URL: From abrault at mapgears.com Sun Jun 17 19:53:38 2018 From: abrault at mapgears.com (Alexandre Brault) Date: Sun, 17 Jun 2018 19:53:38 -0400 Subject: [Python-ideas] Alternative spelling for list.append() In-Reply-To: <75E1D78E-9C80-412E-B6CC-483FF4B1EF3F@gmail.com> References: <75E1D78E-9C80-412E-B6CC-483FF4B1EF3F@gmail.com> Message-ID: On 2018-06-17 7:36 PM, Clint Hepner wrote: > >> On Jun 17, 2018, at 4:18 PM, Chris Angelico wrote: >> >>> On Mon, Jun 18, 2018 at 3:01 AM, Mikhail V wrote: >>> The idea is to introduce new syntax for the list.append() method. >>> >>> >>> Syntax: >>> >>> Variant 1. >>> Use special case of index, namely omitted index: >>> >>> mylist[] = item >> Creation of syntax cannot be done for just one type. > That?s false. @ was added solely for matrix multiplication. > It was added for matrix multiplication, but any type that wants to use the feature can by implementing __matmul__ et al. It's not clear from Mikhail's proposal how types would opt into the append syntax. Alex From rosuav at gmail.com Sun Jun 17 20:07:07 2018 From: rosuav at gmail.com (Chris Angelico) Date: Mon, 18 Jun 2018 10:07:07 +1000 Subject: [Python-ideas] Alternative spelling for list.append() In-Reply-To: <75E1D78E-9C80-412E-B6CC-483FF4B1EF3F@gmail.com> References: <75E1D78E-9C80-412E-B6CC-483FF4B1EF3F@gmail.com> Message-ID: On Mon, Jun 18, 2018 at 9:36 AM, Clint Hepner wrote: > > >> On Jun 17, 2018, at 4:18 PM, Chris Angelico wrote: >> >>> On Mon, Jun 18, 2018 at 3:01 AM, Mikhail V wrote: >>> The idea is to introduce new syntax for the list.append() method. >>> >>> >>> Syntax: >>> >>> Variant 1. >>> Use special case of index, namely omitted index: >>> >>> mylist[] = item >> >> Creation of syntax cannot be done for just one type. > > That?s false. @ was added solely for matrix multiplication. Ah, confusing bit of language there. The @ operator was created for the benefit of a small number of types (not just one, I think), but it MUST be available to all types, including custom classes. >>> class Foo: ... def __matmul__(self, other): ... print("Me @", other) ... >>> Foo() @ 5 Me @ 5 And that's what I was talking about. If "lst[] = X" is to be syntactically valid, there needs to be a protocol that implements it (as with "__matmul__" for @). (It's also worth noting that the @ operator is unique in being created solely for the benefit of third-party types. Every other operator is supported by the core types - usually by many of them. Support for a new operator (or a new form of assignment) would be far greater if multiple use-cases can be shown. So even interpreted the way I hadn't intended, my statement isn't completely out of left field; it just weakens from "cannot" to "is not generally". But syntactically, "cannot" is still true.) > -1 as well. Has any one but the proposer shown any support for it yet? Not to my knowledge. However, this would be far from the first time that a much-hated-on proposal leads to one that has actual support, perhaps by restricting it, or maybe by generalizing it, or something. 
ChrisA

From steve at pearwood.info  Sun Jun 17 20:34:46 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 18 Jun 2018 10:34:46 +1000
Subject: [Python-ideas] [issue33865] [EASY] Missing code page aliases:
 "unknown encoding: 874"
In-Reply-To: <64CAF5C0-E717-49B0-BA0D-D282D5A4039C@mac.com>
References: <1529023151.32.0.947875510639.issue33865@psf.upfronthosting.co.za>
 <1529135518.17.0.56676864532.issue33865@psf.upfronthosting.co.za>
 <20180616105924.GE14437@ando.pearwood.info>
 <23334.19898.962713.160394@turnbull.sk.tsukuba.ac.jp>
 <64CAF5C0-E717-49B0-BA0D-D282D5A4039C@mac.com>
Message-ID: <20180618003446.GK14437@ando.pearwood.info>

> Sure, but for at least one user Python 3.6 fails to start because
> initialising the sys.std* streams fails due to not finding a '874'
> encoding.

That doesn't mean that the bug is best fixed by adding an alias.

If the error was failing to find encoding "ltain-1", would we add an
alias or fix the spelling? If 874 is not an official alias, we should
consider it a misspelling and fix the misspelling, not add an alias.

But either way, the point Stephen is making is that even if 874 is a
legitimate alias, that shouldn't give us carte blanche to add numeric
aliases for every encoding.

From greg.ewing at canterbury.ac.nz  Sun Jun 17 19:28:29 2018
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 18 Jun 2018 11:28:29 +1200
Subject: [Python-ideas] Operator for inserting an element into a list
In-Reply-To: 
References: <20180616014422.GA37699@cskk.homeip.net>
 <20180616235216.GI14437@ando.pearwood.info>
 <20180617133823.GJ14437@ando.pearwood.info>
Message-ID: <5B26EE9D.3030104@canterbury.ac.nz>

Chris Angelico wrote:

> kwargs.pop("some_key") could plausibly be spelled del
> kwargs["some_key"] if del were (like yield) upgraded to expression.

Except that "delete" is a really misleading name for such
an operation!

-- Greg

From rosuav at gmail.com  Sun Jun 17 20:47:16 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Mon, 18 Jun 2018 10:47:16 +1000
Subject: [Python-ideas] Operator for inserting an element into a list
In-Reply-To: <5B26EE9D.3030104@canterbury.ac.nz>
References: <20180616014422.GA37699@cskk.homeip.net>
 <20180616235216.GI14437@ando.pearwood.info>
 <20180617133823.GJ14437@ando.pearwood.info>
 <5B26EE9D.3030104@canterbury.ac.nz>
Message-ID: 

On Mon, Jun 18, 2018 at 9:28 AM, Greg Ewing wrote:
> Chris Angelico wrote:
>
>> kwargs.pop("some_key") could plausibly be spelled del
>> kwargs["some_key"] if del were (like yield) upgraded to expression.
>
> Except that "delete" is a really misleading name for such
> an operation!

Is it? It's removing one element from the dictionary. Is that really
so misleading?

ChrisA

From steve at pearwood.info  Sun Jun 17 20:50:29 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 18 Jun 2018 10:50:29 +1000
Subject: [Python-ideas] Alternative spelling for list.append()
In-Reply-To: 
References: <75E1D78E-9C80-412E-B6CC-483FF4B1EF3F@gmail.com>
Message-ID: <20180618005029.GL14437@ando.pearwood.info>

On Mon, Jun 18, 2018 at 10:07:07AM +1000, Chris Angelico wrote:
> On Mon, Jun 18, 2018 at 9:36 AM, Clint Hepner wrote:
[...]
> > That's false. @ was added solely for matrix multiplication.
>
> Ah, confusing bit of language there.

It certainly is. You are talking about *types* and Clint is talking
about *semantics*. As you point out below, you are correct: the @
operator works for any type which defines the correct dunder method:

> >>> class Foo:
> ...     def __matmul__(self, other):
> ...         print("Me @", other)
> ...
> >>> Foo() @ 5 > Me @ 5 Clint's point that the *motivation* was a single use-case, matrix multiplication in numpy, is a separate issue. Mikhail's motivation might solely be appending to lists, but we would expect this to be a protocol with a dunder method that any type could opt into. [...] > (It's also worth noting that the @ operator is unique in being created > solely for the benefit of third-party types. Every other operator is > supported by the core types - usually by many of them. That's not quite correct: although it isn't strictly speaking an operator, extended slice notation x[a:b:c] was invented for numpy, and for a while (Python 1.4 I think?) no core type supported it. Similarly for using Ellipsis ... in slices. As far as I know, there is still no core type which supports that. -- Steve From rosuav at gmail.com Sun Jun 17 21:12:43 2018 From: rosuav at gmail.com (Chris Angelico) Date: Mon, 18 Jun 2018 11:12:43 +1000 Subject: [Python-ideas] Alternative spelling for list.append() In-Reply-To: <20180618005029.GL14437@ando.pearwood.info> References: <75E1D78E-9C80-412E-B6CC-483FF4B1EF3F@gmail.com> <20180618005029.GL14437@ando.pearwood.info> Message-ID: On Mon, Jun 18, 2018 at 10:50 AM, Steven D'Aprano wrote: >> (It's also worth noting that the @ operator is unique in being created >> solely for the benefit of third-party types. Every other operator is >> supported by the core types - usually by many of them. > > That's not quite correct: although it isn't strictly speaking an > operator, extended slice notation x[a:b:c] was invented for numpy, and > for a while (Python 1.4 I think?) no core type supported it. > > Similarly for using Ellipsis ... in slices. As far as I know, there is > still no core type which supports that. Ellipsis is now just a special form of literal, so it's no longer magical in any way (there's no difference between x[...] and x(...) in the syntax). Extended slice notation - interesting that no core type supported it originally. But, again, both of them are implemented using a standard protocol: an object represents the entire thing between the brackets, and that object is passed to __getitem__. So this might need a new object meaning "emptiness", or else it is defined that x[]=y is the same as x[None]=y, which would have confusing implications. Actually, maybe the problem here is that there's no easy way to represent "-0" in a slice. Consider: >>> items = ["spam", "ham", "foo", "bar", "quux"] >>> items[0] # take first item 'spam' >>> items[1] # index from start, not at start 'ham' >>> items[-1] # take last item 'quux' >>> items[-2] # index from end, not last 'bar' Indexing is perfectly parallel, as long as you understand that "-2" mirrors "1". In fact, we could write these using boolean Not, if we wanted to. >>> items[~0] # last item 'quux' >>> items[~1] # second-last item 'bar' Slicing from the beginning works tidily too. >>> items[:1] # slice from start to position ['spam'] >>> items[:2] # slice from start to position ['spam', 'ham'] >>> items[:0] # useless in retrieval [] >>> items[:0] = ["shim"] # insert at beginning >>> items ['shim', 'spam', 'ham', 'foo', 'bar', 'quux'] >>> del items[0] Great. Now let's try slicing from the end: >>> items = ["spam", "ham", "foo", "bar", "quux"] >>> items[-1:] # slice from position to end ['quux'] >>> items[-0:] # failed parallel ['spam', 'ham', 'foo', 'bar', 'quux'] So you use -1 in slices to parallel 1 (unlike using ~1 as with indexing), and everything works *except zero*. 
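(The append *can* still be spelled today with an explicit length - a
quick check continuing the same session:

>>> items[len(items):] = ["new"]
>>> items
['spam', 'ham', 'foo', 'bar', 'quux', 'new']

but that spelling loses the symmetry with the other slice forms.)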
Which means that the slice-assignment form of insert is easy to write,
but the slice-assignment form of append isn't.

Mikhail, if it were possible to append using slice assignment, would
that meet the case?

ChrisA

From steve at pearwood.info  Sun Jun 17 21:25:03 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 18 Jun 2018 11:25:03 +1000
Subject: [Python-ideas] Alternative spelling for list.append()
In-Reply-To: 
References: 
Message-ID: <20180618012502.GM14437@ando.pearwood.info>

On Sun, Jun 17, 2018 at 08:01:09PM +0300, Mikhail V wrote:

> The idea is to introduce new syntax for the list.append() method.

Before trying to justify any specific syntax, you need to justify the
idea of using syntax in the first place.

> Motivation
> -----------
>
> 1. The assignment form reduces the amount of text in statements and
> keeps the right-hand part, namely the item, clean of additional
> brackets, which is important especially for more complex items which
> can themselves contain brackets or quotes.

Reducing human-readable words like "append" in favour of cryptic
symbols like "[] =" is not a motivation that I agree with. I think this
syntax will make a simple statement like

mylist.append(x)

harder to read, harder to teach, and harder to get right:

mylist[] = x

Using a *named method* is a Good Thing. Replacing named methods with
syntax needs to be carefully justified, not just assumed that our
motive should be to reduce the number of words. On the contrary: we
should be trying to keep the amount of symbols fairly small. Not zero,
but each new symbol and each new syntactic form using symbols needs to
be justified.

Why should this be syntax if a method will work?

> For example:
>
> mylist.append([[foo, bar], [] ])
>
> Could be written as:
>
> mylist[] = [[foo, bar], [] ]
>
> Which preserves the original item form and has a generally more
> balanced look.

The original version preserves the original item form too, and I
disagree that the replacement looks "more balanced".

> 2. The method form has one general issue, especially with longer
> variable names. Example from
> https://docs.python.org/3/tutorial/datastructures.html
>
> for row in matrix:
>     transposed_row.append (row[i])
> transposed.append (transposed_row)

There's nothing wrong with that example except you have stuck a space
between the method name and the opening parentheses.

> It becomes hard to read because of the lack of spacing between
> variable names and list names.

That's your opinion. I think that's unjustified, but even if it were
justified, there are *hundreds* or *thousands* of method calls and
function calls where you might make the same claim. Should we invent
syntax for every single method call?

> 3. Item appending is a very frequent operation.

Not that frequent.

> In the current syntax, the extend() method already has dedicated
> syntax:
>
> mylist1 += mylist2

No, that is wrong. The += syntax applies to *any* type, *any* value
which supports the plus operator. It is NOT dedicated syntax for the
extend method.

> One of the important aspects of the proposal is that it should
> discourage usage of += for appending an item. Namely, the idea is to
> discourage this form:
>
> mylist += [item]
>
> Because it is clunky and may cause additional confusion.

That should be discouraged because it uses special syntax instead of a
self-explanatory method call. Again, I disagree with both this
motivation and the supposed solution.

> E.g. adding brackets is often used and understood as an operation of
> increasing the dimension of a list
Can you provide an example of somebody who made this error, or did you
just make it up?

Mikhail, I disagree with your motivation, and I disagree with your
supposed solution. As far as I am concerned:

- I disagree that the list.append method is a problem to be fixed;

- but even if it were a problem that needs fixing, your suggested
syntax would be *worse* than the problem.

-- Steve

From robertvandeneynde at hotmail.com  Sun Jun 17 19:58:52 2018
From: robertvandeneynde at hotmail.com (Robert Vanden Eynde)
Date: Sun, 17 Jun 2018 23:58:52 +0000
Subject: [Python-ideas] Alternative spelling for list.append()
In-Reply-To: <75E1D78E-9C80-412E-B6CC-483FF4B1EF3F@gmail.com>
References: <75E1D78E-9C80-412E-B6CC-483FF4B1EF3F@gmail.com>
Message-ID: 

I understand the view from the poster: most basic list operations use
brackets, i.e. reading and writing with [] and deleting with del
L[...] - so why not append? And since appending is used so extensively,
the missing bracket form can feel annoying.
And yes, += [] is more "concise" than .append(), so some people would
think it's clearer because "it's smaller" (they'd be wrong, as the OP
mentioned).

But it would break the "There is only one Obvious way to do it"
principle. People would take time to figure out "okay, what should I
write? What's the most Pythonic?"

If someone wants to use their own list class, it's doable with the
current syntax:

class List(list):
    def __lshift__(self, x):
        self.append(x)
        return self

a = List([1,2,7,2])
a = a << 1 << 5
a <<= 0

2018-06-18 1:36 GMT+02:00 Clint Hepner:
>
> > On Jun 17, 2018, at 4:18 PM, Chris Angelico wrote:
> >
> >> On Mon, Jun 18, 2018 at 3:01 AM, Mikhail V wrote:
> >> The idea is to introduce new syntax for the list.append() method.
> >>
> >>
> >> Syntax:
> >>
> >> Variant 1.
> >> Use a special case of index, namely an omitted index:
> >>
> >> mylist[] = item
> >
> > Creation of syntax cannot be done for just one type.
>
> That's false. @ was added solely for matrix multiplication.
>
> > Regardless, I am still a strong -1 on introducing another way to
> > spell list.append().
>
> -1 as well. Has anyone but the proposer shown any support for it yet?
>
> --
> Clint
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From paddy3118 at gmail.com  Mon Jun 18 06:49:35 2018
From: paddy3118 at gmail.com (Paddy3118)
Date: Mon, 18 Jun 2018 03:49:35 -0700 (PDT)
Subject: [Python-ideas] Remember the Vasa
Message-ID: 

I thought it might be helpful for the Python community to be aware of
the growth issues that the C++ community has/is discussing at the
moment. I am *not* saying we have those same issues, but we might learn
to avoid similar issues in our future?

Here's the letter:
http://open-std.org/JTC1/SC22/WG21/docs/papers/2018/p0977r0.pdf

And an article on a later interview:
http://www.theregister.co.uk/2018/06/18/bjarne_stroustrup_c_plus_plus/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From greg.ewing at canterbury.ac.nz  Mon Jun 18 09:58:33 2018
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 19 Jun 2018 01:58:33 +1200
Subject: [Python-ideas] Remember the Vasa
In-Reply-To: 
References: 
Message-ID: <5B27BA89.6030607@canterbury.ac.nz>

Paddy3118 wrote:
> Here's the letter:
> http://open-std.org/JTC1/SC22/WG21/docs/papers/2018/p0977r0.pdf

This is why it's important for a language to have a BDFL.

-- Greg

From tir.karthi at gmail.com  Mon Jun 18 11:07:45 2018
From: tir.karthi at gmail.com (Karthikeyan)
Date: Mon, 18 Jun 2018 08:07:45 -0700 (PDT)
Subject: [Python-ideas] [issue33865] [EASY] Missing code page aliases:
 "unknown encoding: 874"
In-Reply-To: <64CAF5C0-E717-49B0-BA0D-D282D5A4039C@mac.com>
References: <1529023151.32.0.947875510639.issue33865@psf.upfronthosting.co.za>
 <1529135518.17.0.56676864532.issue33865@psf.upfronthosting.co.za>
 <20180616105924.GE14437@ando.pearwood.info>
 <23334.19898.962713.160394@turnbull.sk.tsukuba.ac.jp>
 <64CAF5C0-E717-49B0-BA0D-D282D5A4039C@mac.com>
Message-ID: <3c552aad-8972-4c4f-958e-699b218c2fa2@googlegroups.com>

> BTW, 'cp874' does exist according to the Unicode consortium:
> https://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP874.TXT,
> and appears to be a codepage for a (the?) Thai language. The user
> might therefore be running Windows with a Thai locale.

This page also lists 874, along with windows-874, as a .NET name
belonging to the Thai language, and doesn't mention cp-874. I don't
have knowledge of .NET but just wanted to add this as a reference.

Another disadvantage of patching the search function (or adding an
alias for a digit-only encoding, assuming cpXXXX) is that it prepends
"cp", and it also assumes that aliases.py, which takes precedence,
doesn't resolve the name. Since some of the digit-only encodings, like
'936', which corresponds to 'gbk', are already added in aliases.py,
they don't get resolved as 'cp936' for now. But if new digit-only,
non-cp encodings are added in the future, then they have to be added to
the file so that precedence works, instead of always resolving to the
cpXXXX encoding. I think this is noted at
https://bugs.python.org/issue33865#msg319617.

It would be nice if the original poster provided some more context or
environment details to reproduce it than the screenshot, which has
limited information. I am keeping the search_function.patch aside and
look forward to the OP replying in the issue.

Thanks

PS: This is my first mailing list post. Kindly ignore if I am using the
wrong quoting mechanism.

On Monday, June 18, 2018 at 12:01:01 AM UTC+5:30, Ronald Oussoren wrote:
>
> On 17 Jun 2018, at 14:02, Stephen J. Turnbull <
> turnbull.... at u.tsukuba.ac.jp> wrote:
>
> Folks.  There are standards.  "1252" *is not* an alias for
> "windows-1252" according to the IANA, while "866" *is* an alias for
> "IBM866" according to the same authority.  Most 3-digit "IBMxxx" ARE
> aliased to both "cpxxx" and just "xxx", but not all.  None of
> "IBM874", "874", or "cp874" exists according to the IANA.
>
> Sure, but for at least one user Python 3.6 fails to start because
> initialising the sys.std* streams fails due to not finding a '874'
> encoding.
>
> The user sadly enough didn't provide more information on his machine,
> other than that it is running some version of Windows.
>
> BTW, 'cp874' does exist according to the Unicode consortium:
> https://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP874.TXT,
> and appears to be a codepage for a (the?) Thai language. The user
> might therefore be running Windows with a Thai locale.
>
> Ronald
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ronaldoussoren at mac.com  Mon Jun 18 09:35:32 2018
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Mon, 18 Jun 2018 15:35:32 +0200
Subject: [Python-ideas] [issue33865] [EASY] Missing code page aliases:
 "unknown encoding: 874"
In-Reply-To: <20180618003446.GK14437@ando.pearwood.info>
References: <1529023151.32.0.947875510639.issue33865@psf.upfronthosting.co.za>
 <1529135518.17.0.56676864532.issue33865@psf.upfronthosting.co.za>
 <20180616105924.GE14437@ando.pearwood.info>
 <23334.19898.962713.160394@turnbull.sk.tsukuba.ac.jp>
 <64CAF5C0-E717-49B0-BA0D-D282D5A4039C@mac.com>
 <20180618003446.GK14437@ando.pearwood.info>
Message-ID: 

> On 18 Jun 2018, at 02:34, Steven D'Aprano wrote:
>
>> Sure, but for at least one user Python 3.6 fails to start because
>> initialising the sys.std* streams fails due to not finding a '874'
>> encoding.
>
> That doesn't mean that the bug is best fixed by adding an alias.

I agree. I've mentioned in the issue that I'd like to understand why
Python looks for an encoding with this name.

> If the error was failing to find encoding "ltain-1", would we add an
> alias or fix the spelling? If 874 is not an official alias, we should
> consider it a misspelling and fix the misspelling, not add an alias.

That depends: if a major platform ships with locales where the encoding
is misspelled, we have little choice but to add an alias. To state it
too bluntly: standards are fine until they conflict with reality.

> But either way, the point Stephen is making is that even if 874 is a
> legitimate alias, that shouldn't give us carte blanche to add numeric
> aliases for every encoding.

Possibly just for the "cp" encodings, but IMHO only if we confirm that
the code that looks for the preferred encoding returns a codepage
number on Windows, and that changing that code leads to worse results
than adding numeric aliases for the "cp" encodings.

Ronald

> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

From solipsis at pitrou.net  Mon Jun 18 11:30:56 2018
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 18 Jun 2018 17:30:56 +0200
Subject: [Python-ideas] Remember the Vasa
References: 
Message-ID: <20180618173056.3fda5c04@fsol>

On Mon, 18 Jun 2018 03:49:35 -0700 (PDT)
Paddy3118 wrote:
> I thought it might be helpful for the Python community to be aware of
> the growth issues that the C++ community has/is discussing at the
> moment. I am *not* saying we have those same issues, but we might
> learn to avoid similar issues in our future?
>
> Here's the letter:
> http://open-std.org/JTC1/SC22/WG21/docs/papers/2018/p0977r0.pdf
>
> And an article on a later interview:
> http://www.theregister.co.uk/2018/06/18/bjarne_stroustrup_c_plus_plus/

Interesting quote:

"Adding anything new, however minor, carries a cost, such as
implementation, teaching, tools upgrades. Major features are those
that change the way we think about programming. Those are the ones we
must concentrate on."

Regards

Antoine.
From mikhailwas at gmail.com  Mon Jun 18 15:51:59 2018
From: mikhailwas at gmail.com (Mikhail V)
Date: Mon, 18 Jun 2018 22:51:59 +0300
Subject: [Python-ideas] Alternative spelling for list.append()
In-Reply-To: 
References: <75E1D78E-9C80-412E-B6CC-483FF4B1EF3F@gmail.com>
 <20180618005029.GL14437@ando.pearwood.info>
Message-ID: 

On Mon, Jun 18, 2018 at 4:12 AM, Chris Angelico wrote:
> On Mon, Jun 18, 2018 at 10:50 AM, Steven D'Aprano wrote:
>
>>>> items[-0:] # failed parallel
> ['spam', 'ham', 'foo', 'bar', 'quux']
>
> So you use -1 in slices to parallel 1 (unlike using ~1 as with
> indexing), and everything works *except zero*. Which means that the
> slice-assignment form of insert is easy to write, but the
> slice-assignment form of append isn't.
>
> Mikhail, if it were possible to append using slice assignment, would
> that meet the case?

How? Like this:

items[-0:] = [item]

One of the main motivations is actually to have just 'item' on the
right-hand side. So your idea is to have special syntax for 'last item'
so as to avoid something like:

c = len(L)
L[c:c] = [...]

But IIUC you still have an _iterable_ on the right-hand part?

> However, this would be far from the first time
> that a much-hated-on proposal leads to one that has actual support,
> perhaps by restricting it, or maybe by generalizing it, or something.

I think it is currently quite restricted, so IMO it is rather
generalizing that could make the change. Namely, the idea for
'L[] = item' seems plausible for the append() method only, and for the
types that have this method. The Python array type has a similar
method set to lists. Numpy arrays have also append() and insert()
methods, but they do not change arrays in-place.

So just to keep it more or less focused - assume we're considering the
'L[] = item' approach, say for mutable types.

So one possible syntax generalization is towards the insert() method,
although IMO it could be made into nice syntax only by introducing
some symbol into the index notation. For example:

L[] = item      ->  append(item)
L[^] = item     ->  appendleft(item)
L[^0] = item    ->  insert(0, item)
...
L[^i] = item    ->  insert(i, item)

Note: I cannot vouch for the adequacy of the technical part, since it
may not be technically plausible. So just speculating here: if that
were added, could it be linked to the corresponding methods? For
example, for the collections.deque type:

deq[] = item    ->  deq.append(item)
deq[^] = item   ->  deq.appendleft(item)
...
deq[^i] = item  ->  deq.insert(i, item)

So syntactically I would say it makes a more or less complete picture,
because insert() and append() come hand in hand semantically. And with
an operator-only variant, or without a special symbol, it seems not to
be possible (at least not with bearable index notation).
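For reference, the corresponding current method spellings behave like
this (a quick interactive check):

>>> from collections import deque
>>> deq = deque([1, 2, 3])
>>> deq.append(4)
>>> deq.appendleft(0)
>>> deq.insert(2, 9)
>>> deq
deque([0, 1, 9, 2, 3, 4])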
From rosuav at gmail.com  Mon Jun 18 16:03:05 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Tue, 19 Jun 2018 06:03:05 +1000
Subject: [Python-ideas] Alternative spelling for list.append()
In-Reply-To: 
References: <75E1D78E-9C80-412E-B6CC-483FF4B1EF3F@gmail.com>
 <20180618005029.GL14437@ando.pearwood.info>
Message-ID: 

On Tue, Jun 19, 2018 at 5:51 AM, Mikhail V wrote:
>> So you use -1 in slices to parallel 1 (unlike using ~1 as with
>> indexing), and everything works *except zero*. Which means that the
>> slice-assignment form of insert is easy to write, but the
>> slice-assignment form of append isn't.
>>
>> Mikhail, if it were possible to append using slice assignment, would
>> that meet the case?
>
> How? Like this:
>
> items[-0:] = [item]

Well, yes, except for the part where -0 is indistinguishable from 0.

> One of the main motivations is actually to have just 'item' on the
> right-hand side. So your idea is to have special syntax for 'last
> item' so as to avoid something like:
>
> c = len(L)
> L[c:c] = [...]
>
> But IIUC you still have an _iterable_ on the right-hand part?

Yes, for consistency with other slice assignment. I don't think a
dedicated syntax for slotting a single item into it has any chance of
being accepted. (PLEASE NOTE: I am not Guido. [citation needed]) But
I'm trying to find some germ of an idea inside what you're asking for,
something that IS plausible.

ChrisA

From jsbueno at python.org.br  Mon Jun 18 16:07:45 2018
From: jsbueno at python.org.br (Joao S. O. Bueno)
Date: Mon, 18 Jun 2018 17:07:45 -0300
Subject: [Python-ideas] Alternative spelling for list.append()
In-Reply-To: 
References: <75E1D78E-9C80-412E-B6CC-483FF4B1EF3F@gmail.com>
Message-ID: 

Yes - maybe a proposal to have the MutableSequence protocol define "<<"
as "append" would be something with more traction than the original
proposal here.

No changes needed to the language core, and

a = [1, 2, 3]
a <<= 4

resulting in a == [1, 2, 3, 4] is quite readable, syntactically valid,
unambiguous (and can be implemented in two LoC for a new Sequence
class).

I know that at least Brython used the "<<=" augmented assignment
operator to manipulate the HTML DOM, and it often results in quite
compact code when compared with the equivalent JavaScript
manipulations.

On Mon, 18 Jun 2018 at 01:32, Robert Vanden Eynde wrote:
>
> I understand the view from the poster: most basic list operations use
> brackets, i.e. reading and writing with [] and deleting with del
> L[...] - so why not append? And since appending is used so
> extensively, the missing bracket form can feel annoying.
> And yes, += [] is more "concise" than .append(), so some people would
> think it's clearer because "it's smaller" (they'd be wrong, as the OP
> mentioned).
>
> But it would break the "There is only one Obvious way to do it"
> principle. People would take time to figure out "okay, what should I
> write? What's the most Pythonic?"
>
> If someone wants to use their own list class, it's doable with the
> current syntax:
>
> class List(list):
>     def __lshift__(self, x):
>         self.append(x)
>         return self
>
> a = List([1,2,7,2])
> a = a << 1 << 5
> a <<= 0
>
> 2018-06-18 1:36 GMT+02:00 Clint Hepner:
>>
>> > On Jun 17, 2018, at 4:18 PM, Chris Angelico wrote:
>> >
>> >> On Mon, Jun 18, 2018 at 3:01 AM, Mikhail V wrote:
>> >> The idea is to introduce new syntax for the list.append() method.
>> >>
>> >> Syntax:
>> >>
>> >> Variant 1.
>> >> Use a special case of index, namely an omitted index:
>> >>
>> >> mylist[] = item
>> >
>> > Creation of syntax cannot be done for just one type.
>>
>> That's false. @ was added solely for matrix multiplication.
>>
>> > Regardless, I am still a strong -1 on introducing another way to
>> > spell list.append().
>>
>> -1 as well. Has anyone but the proposer shown any support for it yet?
>>
>> --
>> Clint >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From mike at selik.org Mon Jun 18 16:43:08 2018 From: mike at selik.org (Michael Selik) Date: Mon, 18 Jun 2018 13:43:08 -0700 Subject: [Python-ideas] Alternative spelling for list.append() In-Reply-To: References: <75E1D78E-9C80-412E-B6CC-483FF4B1EF3F@gmail.com> Message-ID: On Mon, Jun 18, 2018 at 12:56 PM Mikhail V wrote: > Numpy arrays have also append() and insert() methods, > In [2]: np.arange(1).append(2) AttributeError: 'numpy.ndarray' object has no attribute 'append' In [3]: np.arange(1).insert AttributeError: 'numpy.ndarray' object has no attribute 'insert' So one syntax generalization possible is towards insert() method. > Why would you even want to encourage inserting into a list? It's slow and should be *discouraged*. Although IMO it could be made into nice syntax only by introducing > some symbol into the index notation. For example: > > L[] = item append(item) > L[^] = item ? appendleft(item) > L[^0] = item insert (0, item) > ... > L[^i] = item insert(i, x) > > Note, I am not responsible for the adequacy of the technical > part - since it may be not plausible technically. > Indeed, changing the syntax to add a special meaning to caret operator inside of index assignment is a *major* change. Not worthwhile for such dubious benefit. On Mon, Jun 18, 2018 at 1:08 PM Joao S. O. Bueno wrote: > MutableSequence protocol to define "<<" as "append" No one has demonstrated with any realistic examples why code would look better with ``a <<= b`` instead of ``a.append(b)``. Perhaps a language in its infancy could toy with new spellings, but Python is old now and should be more cautious with change. I know that at least Brython used the "<<=" enhanced assignement > operator to manipulate HTML DOM, and it often results in > quite compact code when compared with javascript-equivalente manipulations. > That sounds great for a dedicated HTML DOM manipulation tool, but not for the core list type. On Sun, Jun 17, 2018 at 6:12 PM Chris Angelico wrote: > Actually, maybe the problem here is that there's no easy way to > represent "-0" in a slice. > >>> items[:0] = ["shim"] # insert at beginning > >>> items[-0:] # failed parallel > The fact that signed 2's complement integers don't support negative zero does cause some problems for both indexing and slicing. However, to me it seems this is a fundamental human problem with zero. Thus the ever-present off-by-one error. I can't think of a solution that doesn't cause more harm than good. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 

From mikhailwas at gmail.com  Mon Jun 18 17:55:04 2018
From: mikhailwas at gmail.com (Mikhail V)
Date: Tue, 19 Jun 2018 00:55:04 +0300
Subject: [Python-ideas] Alternative spelling for list.append()
In-Reply-To: 
References: <75E1D78E-9C80-412E-B6CC-483FF4B1EF3F@gmail.com>
Message-ID: 

On Mon, Jun 18, 2018 at 11:43 PM, Michael Selik wrote:
> On Mon, Jun 18, 2018 at 12:56 PM Mikhail V wrote:
>>
>> Numpy arrays have also append() and insert() methods,
>
> In [2]: np.arange(1).append(2)
> AttributeError: 'numpy.ndarray' object has no attribute 'append'

https://docs.scipy.org/doc/numpy/reference/generated/numpy.append.html

>> So one possible syntax generalization is towards the insert() method.
>
> Why would you even want to encourage inserting into a list? It's slow
> and should be *discouraged*.

I don't - but I have seen it in real projects and modules. And there
is deque.appendleft().

> On Mon, Jun 18, 2018 at 1:08 PM Joao S. O. Bueno wrote:
>>
>> MutableSequence protocol to define "<<" as "append"
>
> No one has demonstrated with any realistic examples why code would
> look better with ``a <<= b`` instead of ``a.append(b)``.

So you have two separate inquiries in one: explaining why, and where
the example is. Why it is so - I have tried to explain several times,
and also in the summary, with a small example (see the first post in
this thread - the 'transposed_row' example).

As for examples - below is one example from the 'pyparsing' module.
But I don't advise thinking about "why"; rather, just relax and try to
'traverse' the code back and forth several times. (And sometimes it's
better to treat things just as advice if you doubt you can figure it
out by yourself - that's not addressed to you, but is just a general
life observation.)

---------------- with <<=

out = []
NL = '\n'
out <<= indent + _ustr(self.asList())
if full:
    if self.haskeys():
        items = sorted((str(k), v) for k,v in self.items())
        for k,v in items:
            if out:
                out <<= NL
            out <<= "%s%s- %s: " % (indent,(' '*depth), k)
            if isinstance (v,ParseResults):
                if v:
                    out <<= v.dump(indent,depth+1)
                else:
                    out <<= _ustr(v)
            else:
                out <<= repr(v)

----------------- with .append()

out = []
NL = '\n'
out.append( indent+_ustr(self.asList()) )
if full:
    if self.haskeys():
        items = sorted((str(k), v) for k,v in self.items())
        for k,v in items:
            if out:
                out.append(NL)
            out.append( "%s%s- %s: " % (indent,(' '*depth), k) )
            if isinstance(v,ParseResults):
                if v:
                    out.append( v.dump(indent,depth+1) )
                else:
                    out.append(_ustr(v))
            else:
                out.append(repr(v))

From steve at pearwood.info  Mon Jun 18 20:23:33 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 19 Jun 2018 10:23:33 +1000
Subject: [Python-ideas] Alternative spelling for list.append()
In-Reply-To: 
References: <75E1D78E-9C80-412E-B6CC-483FF4B1EF3F@gmail.com>
Message-ID: <20180619002333.GP14437@ando.pearwood.info>

On Mon, Jun 18, 2018 at 05:07:45PM -0300, Joao S. O. Bueno wrote:

> a = [1, 2, 3]
> a <<= 4
> resulting in a == [1, 2, 3, 4] is quite readable

Not as readable as a.append(4), which works today and doesn't require
the user to memorise the special case that <<= works on lists but
(presumably) << doesn't.

Or perhaps it would. What would a bare << mean for lists?

b = a << 4

could mean the same as

b = a + [4]

but we already have a spelling for that, do we really need another one?
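(Remember too that << already has a long-established meaning for the
core numeric types - a quick check:

>>> 5 << 2
20

so giving the same operator a second, unrelated "append" meaning for
sequences is exactly the sort of incongruity we should be wary of.)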
-- Steve

From apalala at gmail.com  Mon Jun 18 20:52:14 2018
From: apalala at gmail.com (=?UTF-8?Q?Juancarlo_A=C3=B1ez?=)
Date: Mon, 18 Jun 2018 20:52:14 -0400
Subject: [Python-ideas] Alternative spelling for list.append()
In-Reply-To: 
References: 
Message-ID: 

> The idea is to introduce new syntax for the list.append() method.
>
>
> Syntax:
>
> Variant 1.
> Use a special case of index, namely an omitted index:
>
> mylist[] = item

For all practical purposes, it would be enough to define that the
expression:

mylist += [item]

gets optimized to mylist.append(item).

--
Juancarlo *Añez*
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From mike at selik.org  Mon Jun 18 20:56:34 2018
From: mike at selik.org (Michael Selik)
Date: Mon, 18 Jun 2018 17:56:34 -0700
Subject: [Python-ideas] Alternative spelling for list.append()
In-Reply-To: 
References: 
Message-ID: 

On Mon, Jun 18, 2018 at 5:52 PM Juancarlo Añez wrote:

>> The idea is to introduce new syntax for the list.append() method.
>>
>> mylist[] = item
>
> For all practical purposes, it would be enough to define that the
> expression:
>
> mylist += [item]
>
> gets optimized to mylist.append(item).

Unfortunately, that would create yet another special case of operators
breaking the rules. Most operators invoke magic methods. This would
prevent ``+=`` from invoking ``__iadd__`` for lists, since the
right-hand side would need to be compiled differently. It's similar to
why the ``and`` and ``or`` keywords can't have magic methods.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From apalala at gmail.com  Mon Jun 18 21:20:25 2018
From: apalala at gmail.com (=?UTF-8?Q?Juancarlo_A=C3=B1ez?=)
Date: Mon, 18 Jun 2018 21:20:25 -0400
Subject: [Python-ideas] Alternative spelling for list.append()
In-Reply-To: 
References: 
Message-ID: 

>> For all practical purposes, it would be enough to define that the
>> expression:
>>
>> mylist += [item]
>>
>> gets optimized to mylist.append(item).
>
> Unfortunately, that would create yet another special case of operators
> breaking the rules. Most operators invoke magic methods. This would
> prevent ``+=`` from invoking ``__iadd__`` for lists, since the
> right-hand side would need to be compiled differently. It's similar to
> why the ``and`` and ``or`` keywords can't have magic methods.

It seems that the optimization is already in place:

import timeit

COUNT = 10000
REPS = 10000

def using_append():
    result = []
    for i in range(COUNT):
        result.append(i)
    return result

def using_concat():
    result = []
    for i in range(COUNT):
        result += [i]
    return result

def using_iadd():
    result = []
    for i in range(COUNT):
        result.__iadd__([i])
    return result

def main():
    print(timeit.timeit('using_append()', globals=globals(), number=REPS))
    print(timeit.timeit('using_concat()', globals=globals(), number=REPS))
    print(timeit.timeit('using_iadd()', globals=globals(), number=REPS))

if __name__ == '__main__':
    main()

--
Juancarlo *Añez*
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From mike at selik.org Mon Jun 18 21:59:45 2018 From: mike at selik.org (Michael Selik) Date: Mon, 18 Jun 2018 18:59:45 -0700 Subject: [Python-ideas] Alternative spelling for list.append() In-Reply-To: References: <75E1D78E-9C80-412E-B6CC-483FF4B1EF3F@gmail.com> Message-ID: On Mon, Jun 18, 2018 at 2:55 PM Mikhail V wrote: > On Mon, Jun 18, 2018 at 11:43 PM, Michael Selik wrote: > > On Mon, Jun 18, 2018 at 12:56 PM Mikhail V wrote: > >> Numpy arrays have also append() and insert() methods, > > In [2]: np.arange(1).append(2) > > AttributeError: 'numpy.ndarray' object has no attribute 'append' > https://docs.scipy.org/doc/numpy/reference/generated/numpy.append.html Perhaps NumPy chose not to provide an append method to discourage repeated appends, since it's terribly slow. It's a good design philosophy: make inefficient code look appropriately ugly. > On Mon, Jun 18, 2018 at 1:08 PM Joao S. O. Bueno > As for examples - below is one example from 'pyparsing' module. > But I don't advise to think about "why" but rather just relax and > try to 'traverse' the code back and forth several times. > That's interesting advice. I contrast, I find it's best to aggressively ask "Why?" and to rewrite the code in order to fully understand it. I think I found the chunk of code you're referring to. https://github.com/pyparsing/pyparsing/blob/master/pyparsing.py#L848 First, I'll note that ``dump`` usually indicates dumping to a file, while ``dumps`` is dumping to a str. Regardless, I think this ``dump`` function could be improved a bit, and a revision reduces the benefit of a dedicated ``append`` operator. if not full: return indent + _ustr(self.asList()) lines = [_ustr(self.asList())] if self.haskeys(): fmt = ' ' * depth + '- %s: %s' for k, v in sorted((str(k), v) for k, v in self.items()): if isinstance(v, ParseResults): if v: s = v.dump(indent, depth + 1) else: s = _ustr(v) else: s = repr(v) lines.append(fmt % (k, s)) elif any(isinstance(v, ParseResults) for v in self): for i, v in enumerate(self): lines.append(' ' * depth + '[%d]:' % i) if isinstance(v, ParseResults): s = v.dump(indent, depth + 1) else: s = _ustr(v) lines.append(' ' * (depth + 1) + s) return ('\n' + indent).join(lines) -------------- next part -------------- An HTML attachment was scrubbed... URL: From mike at selik.org Mon Jun 18 22:14:41 2018 From: mike at selik.org (Michael Selik) Date: Mon, 18 Jun 2018 19:14:41 -0700 Subject: [Python-ideas] Alternative spelling for list.append() In-Reply-To: References: Message-ID: On Mon, Jun 18, 2018 at 6:20 PM Juancarlo A?ez wrote: > For all practical purpose, it would be enough to define that the >>> expression: >>> >>> mylist += [item] >>> >>> gets optimized to mylist.append(item). >>> >> >> Unfortunately, that would create yet another special case of operators >> breaking the rules. Most operators invoke magic methods. This would prevent >> ``+=`` from invoking ``__iadd__`` for lists, since the right-hand side >> would need to be compiled differently. It's similar to why ``and`` and >> ``or`` keywords can't have magic methods. >> > > It seems that the optimization is already in place: > > > def main(): > print(timeit.timeit('using_append()', globals=globals(), number=REPS)) > print(timeit.timeit('using_concat()', globals=globals(), number=REPS)) > print(timeit.timeit('using_iadd()', globals=globals(), number=REPS)) > > I'm not intimately familiar with the opcodes, but I believe that any code involving the expression ``[x]`` will build a list. 
In [2]: dis.dis("a += [x]")
  1           0 LOAD_NAME                0 (a)
              2 LOAD_NAME                1 (x)
              4 BUILD_LIST               1
              6 INPLACE_ADD
              8 STORE_NAME               0 (a)
             10 LOAD_CONST               0 (None)
             12 RETURN_VALUE

I didn't run the timings, but I wouldn't be surprised if building a
one-element list is faster than looking up an attribute. Or vice-versa.

I thought you meant that you wanted to change the syntax such that
``a += [x]`` would, if the left-hand side is a list, use different
opcodes.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From chris.barker at noaa.gov  Mon Jun 18 22:23:50 2018
From: chris.barker at noaa.gov (Chris Barker)
Date: Mon, 18 Jun 2018 19:23:50 -0700
Subject: [Python-ideas] Fwd: Trigonometry in degrees
In-Reply-To: 
References: <5B1F6AD7.3050302@canterbury.ac.nz>
 <20180616132723.GG14437@ando.pearwood.info>
Message-ID: 

On Sat, Jun 16, 2018 at 10:57 PM, Tim Peters wrote:

> Ya, decimal fp doesn't really solve anything except the shallow
> surprise that decimal fractions generally aren't exactly representable
> as binary fractions. Which is worth a whole lot for casual users, but
> doesn't address any of the deep problems (to the contrary, it makes
> those a bit worse).

It's my suspicion that the story is the same with "degree-based" trig
:-)

Which is why, if you want "nice-looking" results, it seems one could
simply accept a decimal digit or so less precision, and use the
"regular" FP trig functions, rounding to 14 or so decimal digits.

Though if someone really wants to implement trig in native degrees --
more power to 'em.

However -- if this is really such a good idea -- wouldn't someone have
made a C lib that does it? Or has someone? Anyone looked?

-CHB

--

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov
-------------- next part --------------
An HTML attachment was scrubbed...
Which is worth a whole lot for casual users, but doesn't >> address any of the deep problems (to the contrary, it makes those a bit >> worse). >> > > It's my suspicion that the story is the same with "degree-based" trig :-) > > Which is why, if you want "nice-looking" results, it seems one could > simply accept a decimal digit or so less precision, and use the "regular" > FP trig functions, rounding to 14 or so decimal digits. > > Though if someone really wants to implement trig in native degrees -- more > power to 'em. > > However -- if this is really such a good idea -- wouldn't someone have > make a C lib that does it? Or has someone? Anyone looked? > quite a few in fact, including cos(n*pi) https://www.boost.org/doc/libs/1_52_0/boost/units/cmath.hpp > -CHB > > > -- > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Tue Jun 19 02:36:43 2018 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 18 Jun 2018 23:36:43 -0700 Subject: [Python-ideas] Fwd: Trigonometry in degrees In-Reply-To: References: <5B1F6AD7.3050302@canterbury.ac.nz> <20180616132723.GG14437@ando.pearwood.info> Message-ID: On 6/18/18 19:23, Chris Barker via Python-ideas wrote: > On Sat, Jun 16, 2018 at 10:57 PM, Tim Peters > > wrote: > > Ya, decimal fp doesn't really solve anything except the shallow surprise > that decimal fractions generally aren't exactly representable as binary > fractions.? Which is worth a whole lot for casual users, but doesn't address > any of the deep problems (to the contrary, it makes those a bit worse). > > > It's my suspicion that the story is the same with "degree-based" trig :-) > > Which is why, if you want "nice-looking" results, it seems one could simply > accept a decimal digit or so less precision, and use the "regular" FP trig > functions, rounding to 14 or so decimal digits. > > Though if someone really wants to implement trig in native degrees -- more power > to 'em. > > However -- if this is really such a good idea -- wouldn't someone have make a C > lib that does it? Or has someone? Anyone looked? Certainly! scipy.special uses the functions implemented in the Cephes C library. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From steve at pearwood.info Tue Jun 19 06:00:42 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 19 Jun 2018 20:00:42 +1000 Subject: [Python-ideas] Alternative spelling for list.append() In-Reply-To: References: Message-ID: <20180619100041.GS14437@ando.pearwood.info> On Mon, Jun 18, 2018 at 09:20:25PM -0400, Juancarlo A?ez wrote: > >> For all practical purpose, it would be enough to define that the > >> expression: > >> > >> mylist += [item] > >> > >> gets optimized to mylist.append(item). What if mylist doesn't have an append method? Just because the variable is *called* "mylist" doesn't mean it actually is a list. 
Since Python is dynamically typed, the compiler has no clue ahead of
time whether mylist is a list, so the best it could do is compile the
equivalent of:

    if builtins.type(mylist) is builtins.list:
        call mylist.append(item)
    else:
        call mylist.__iadd__([item])

which I suppose is possible, but that's the sort of special-case
optimization which CPython has avoided. (And when it has been tried, it
has often been very disappointing.)

Better to move the optimization into list.__iadd__, but doing that
still pays the cost of making a one-item list. Now that's likely to be
fast, but if the aim is to avoid making that one-item list (can you say
"premature optimization"?) then it is a failure.

Besides, it's hardly worthwhile: += is already virtually as fast as
calling append.

    [steve at ando ~]$ python3.5 -m timeit -s "L = []" "L.append(1)"
    1000000 loops, best of 3: 0.21 usec per loop
    [steve at ando ~]$ python3.5 -m timeit -s "L = []" "L += [1]"
    1000000 loops, best of 3: 0.294 usec per loop

[...]

> It seems that the optimization is already in place:

Not in 3.5 it isn't.

    [steve at ando ~]$ python3.5 -c "import dis; dis.dis('L.append(1)')"
      1           0 LOAD_NAME                0 (L)
                  3 LOAD_ATTR                1 (append)
                  6 LOAD_CONST               0 (1)
                  9 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
                 12 RETURN_VALUE
    [steve at ando ~]$ python3.5 -c "import dis; dis.dis('L += [1]')"
      1           0 LOAD_NAME                0 (L)
                  3 LOAD_CONST               0 (1)
                  6 BUILD_LIST               1
                  9 INPLACE_ADD
                 10 STORE_NAME               0 (L)
                 13 LOAD_CONST               1 (None)
                 16 RETURN_VALUE

> import timeit
[...]

That times a large amount of irrelevant code that has nothing to do
with either += or append.

--
Steve

From steve at pearwood.info  Tue Jun 19 06:05:30 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 19 Jun 2018 20:05:30 +1000
Subject: [Python-ideas] Fwd: Trigonometry in degrees
In-Reply-To: 
References: <20180616132723.GG14437@ando.pearwood.info>
Message-ID: <20180619100530.GT14437@ando.pearwood.info>

On Mon, Jun 18, 2018 at 07:23:50PM -0700, Chris Barker wrote:

> Though if someone really wants to implement trig in native degrees --
> more power to 'em.
>
> However -- if this is really such a good idea -- wouldn't someone have
> made a C lib that does it? Or has someone? Anyone looked?

No, there's nothing magical about C. You can do it in pure Python. It
isn't as fast of course, but it works well enough. When I get a Round
Tuit, I'll pop the code up on PyPI.

--
Steve

From mikhailwas at gmail.com  Tue Jun 19 09:57:20 2018
From: mikhailwas at gmail.com (Mikhail V)
Date: Tue, 19 Jun 2018 16:57:20 +0300
Subject: [Python-ideas] Alternative spelling for list.append()
In-Reply-To: 
References: 
Message-ID: 

On Tue, Jun 19, 2018 at 3:52 AM, Juancarlo Añez wrote:
>
>> The idea is to introduce new syntax for the list.append() method.
>>
>> Syntax:
>>
>> Variant 1.
>> Use special case of index, namely omitted index:
>>
>>     mylist[] = item
>
> For all practical purposes, it would be enough to define that the
> expression:
>
>     mylist += [item]
>
> gets optimized to mylist.append(item).

From what I've read on SO about +=, there is not much penalty in
comparison to append() when using one item. And besides, if your idea
is to promote the += [item] spelling instead of the .append() method -
then you might have a totally different idea than mine - so probably
start a new discussion thread. I suspect though it makes little sense,
because such spelling is imo nothing but obfuscation in terms of
readability, and thankfully I haven't seen a lot of such usage.
(and this all was many times discussed here too, so please not again)

> --
> Juancarlo Añez

From mrbm74 at gmail.com  Tue Jun 19 10:47:46 2018
From: mrbm74 at gmail.com (Martin Bammer)
Date: Tue, 19 Jun 2018 16:47:46 +0200
Subject: [Python-ideas] POPT (Python Object Provider Threads)
Message-ID: 

Hello,

because Python is a very dynamic language, the memory management is
heavily used. A lot of time is spent creating objects (reserving memory
and filling the object structure with data) and destroying them.
Because of this, and because of the discussions about the GIL, I was
wondering if there isn't a solution to get Python code really executed
in parallel, without the need to create several processes and without a
huge overhead.

And here comes the idea for POPT. With this idea the Python interpreter
has several threads running in the background (one thread for each
object type) which manage a set of objects as an object cache. Each
object in the cache is already preconfigured by the object provider
thread, so only the part of the object structure which is individual
has to be initialized. This saves a lot of processing time for the main
thread, and the memory management has much less to do, because
temporarily unused objects can be reused immediately.

Another advantage is that every Python program would use several CPU
cores in parallel, even if it is a single-threaded application, without
the need to change the Python code.

If this idea is well implemented I expect a big performance improvement
for all Python applications.

Best regards,

Martin

From solipsis at pitrou.net  Tue Jun 19 10:54:34 2018
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 19 Jun 2018 16:54:34 +0200
Subject: [Python-ideas] POPT (Python Object Provider Threads)
References: 
Message-ID: <20180619165434.0b099354@fsol>

On Tue, 19 Jun 2018 16:47:46 +0200
Martin Bammer wrote:
> Hello,
>
> because Python is a very dynamic language, the memory management is
> heavily used. A lot of time is spent creating objects (reserving memory
> and filling the object structure with data) and destroying them.

Do you have numbers about that? One modus operandi would be to collect
profiling data using Linux "perf" on a real Python workload you care
about.
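For example, something along these lines (a sketch -- exact options
depend on your perf version, and "workload.py" stands in for whatever
script you care about):

    # record call stacks for the whole run
    perf record -g -- python workload.py

    # then look at where the time actually goes; object allocation
    # shows up in symbols like PyObject_Malloc / _PyObject_New
    perf report

If object creation and destruction don't dominate such a profile, a
scheme like POPT can't win much.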
> And here comes the idea for POPT. With this idea the Python interpreter
> has several threads running in the background (one thread for each
> object type) which manage a set of objects as an object cache.
> [...]

How does the main thread (or, rather, the multiple application threads)
communicate with the background object threads? What is the
communication and synchronization overhead in this scheme?

Regards

Antoine.

From steve at pearwood.info  Tue Jun 19 12:13:59 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 20 Jun 2018 02:13:59 +1000
Subject: [Python-ideas] POPT (Python Object Provider Threads)
In-Reply-To: 
References: 
Message-ID: <20180619161358.GV14437@ando.pearwood.info>

On Tue, Jun 19, 2018 at 04:47:46PM +0200, Martin Bammer wrote:

> And here comes the idea for POPT. With this idea the Python interpreter
> has several threads running in the background (one thread for each
> object type)

The builtins module alone has 47 exception types. I hope you don't mean
that each of those gets its own thread.

Then there are the other builtins: str, bytes, int, float, set,
frozenset, list, tuple, dict, type, bytearray, property, staticmethod,
classmethod, range objects, zip objects, filter objects, slice objects,
map objects, MethodType, FunctionType, ModuleType and probably more I
forgot. That's another 20+ threads there. Don't forget things like code
objects, DictProxies, BuiltinMethodOrFunction, etc.

When I run threaded code on my computer, I find that about 6 or 8
threads is optimal, and more than that and the code slows down. You
want to use 25-30 threads just for memory management. Why do you hate
me? *wink*

> which manage a set of objects as an object cache. Each object in the
> cache is already preconfigured by the object provider thread, so only
> the part of the object structure which is individual has to be
> initialized.

For many objects, wouldn't that be close enough to "all of it"? (Apart
from a couple of fields which never change.)

> This saves a lot of processing time for the main thread

Do you know this for a fact or are you just hoping?

> and the memory management has much less to do, because temporarily
> unused objects can be reused immediately.

Or, unused objects can sit around for a long, long time, locking up
memory in a cache that would be better allocated towards *used*
objects.

> Another advantage is that every Python program would use several CPU
> cores in parallel, even if it is a single-threaded application, without
> the need to change the Python code.

How do you synchronise these threaded calls? Suppose I write this:

    x = (3.5, 4.5, "a", "b", {}, [], 1234567890, b"abcd", "c", 5.5)

That has to synchronise 10 pieces of output from six threads before it
can construct the tuple. How much overhead does that have?

> If this idea is well implemented I expect a big performance improvement
> for all Python applications.

What are your reasons for this expectation? Do other interpreters do
this? Have you tried an implementation and got promising results?

--
Steve

From jheiv at jheiv.com  Tue Jun 19 15:18:17 2018
From: jheiv at jheiv.com (James Edwards)
Date: Tue, 19 Jun 2018 15:18:17 -0400
Subject: [Python-ideas] "Exposing" `__min__` and `__max__`
Message-ID: 

I've only recently looked for these special methods, so that in and of
itself may be the reason these methods aren't exposed, but I could
think of objects that may wish to implement __min__ and __max__
themselves, for efficiency. For example:

    # A "self-sorted" list object
    class AlwaysSortedListObject:
        def __min__(self): return self.lst[0]
        def __max__(self): return self.lst[-1]

    # An object that maintains indices of extrema (e.g. for complex
    # comparisons)
    class KeepsTrackOfExtrema:
        def __init__(self):
            self.backer = []
            self.min_index = None
            self.max_index = None

        def append(self, obj):
            new_index = len(self.backer)
            self.backer.append(obj)

            if (self.max_index is None) or (obj > self.backer[self.max_index]):
                self.max_index = new_index

            if (self.min_index is None) or (obj < self.backer[self.min_index]):
                self.min_index = new_index

        def __min__(self): return self.backer[self.min_index]
        def __max__(self): return self.backer[self.max_index]

These methods would be called via the single-argument calls to
`max(obj)` and `min(obj)`.

If it's not clear, it'd be similar to the way __len__ is called (when
defined) via len(obj).
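The dispatch could look roughly like this (a sketch of the idea as a
plain wrapper function with a made-up name, not the actual C-level
change):

    def smart_min(iterable, *args, **kwargs):
        # hypothetical: prefer a type-provided __min__, else fall
        # back to the ordinary builtin behaviour
        tp = type(iterable)
        if hasattr(tp, '__min__'):
            return tp.__min__(iterable)
        return min(iterable, *args, **kwargs)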
My solution was to implement a .min() method, but that caused some ugly
special casing when the object could also be a regular list (where I'd
want to iterate over all of the items).

I searched the list, but has this been discussed before? Is there any
merit in it?

From mike at selik.org  Tue Jun 19 15:33:15 2018
From: mike at selik.org (Michael Selik)
Date: Tue, 19 Jun 2018 12:33:15 -0700
Subject: [Python-ideas] "Exposing" `__min__` and `__max__`
In-Reply-To: 
References: 
Message-ID: 

Do you mind sharing an example usage in a realistic context? There
might be a good solution that doesn't require adding magic methods.

On Tue, Jun 19, 2018, 12:24 PM James Edwards wrote:

> I've only recently looked for these special methods, so that in and of
> itself may be the reason these methods aren't exposed, but I could
> think of objects that may wish to implement __min__ and __max__
> themselves, for efficiency.
> [...]

From paal.drange at gmail.com  Tue Jun 19 17:25:06 2018
From: paal.drange at gmail.com (=?UTF-8?B?UMOlbCBHcsO4bsOlcyBEcmFuZ2U=?=)
Date: Tue, 19 Jun 2018 23:25:06 +0200
Subject: [Python-ideas] "Exposing" `__min__` and `__max__`
In-Reply-To: 
References: 
Message-ID: 

Well, numpy implements ndarray.min(). It would be very nice if
min(np.array) worked as expected.
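To make the contrast concrete (a tiny sketch; it assumes numpy is
installed):

    import numpy as np

    arr = np.arange(1_000_000)
    arr.min()   # runs in C over the raw buffer -- fast
    min(arr)    # generic protocol: iterates the array element by
                # element at Python speed, allocating a scalar object
                # at every step -- much slower, same answer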
Pål

On 19 Jun 2018 21:33, "Michael Selik" wrote:

> Do you mind sharing an example usage in a realistic context? There
> might be a good solution that doesn't require adding magic methods.
> [...]

From ffaristocrat at gmail.com  Tue Jun 19 18:54:27 2018
From: ffaristocrat at gmail.com (=?UTF-8?Q?Miche=C3=A1l_Keane?=)
Date: Tue, 19 Jun 2018 23:54:27 +0100
Subject: [Python-ideas] Copy (and/or pickle) generators
Message-ID: 

Add a function to generator objects to copy the entire state of it:

Proposed example code:

    game1 = complicated_game_type_thing()

    # Progress the game to the first decision point
    choices = game1.send(None)

    # Choose something
    response = get_a_response(choices)

    # Copy the game generator
    game2 = game1.copy()

    # send the same response to each game
    x = game1.send(response)
    y = game2.send(response)

    # verify the new set of choices is the same
    assert x == y

History:

I found this stackoverflow Q&A which among other things linked to an
in-depth explanation of why generators could not be pickled, and this
enhancement request for 2.6 on the bugtracker. All the reasons given
there are perfectly valid.... but they were also given nearly 10 years
ago. It may be time to revisit the issue.

I couldn't turn up any previous threads here related to this, so I'm
throwing it out for discussion.

Use case:

My work involves Monte Carlo Tree Searches of games, eventually in
combination with tensorflow. MCTS involves repeatedly copying the state
of a simulation to explore the potential outcomes of various choices in
depth.

If you're doing a game like Chess or Go, a game state is dead simple to
summarize - you have a list of board positions with which pieces they
have and whose turn it is.

If you're doing complex games that don't have an easily summarized
state at any given moment, you start running into problems.
Think something along the lines of Magic the Gathering, with complex
turn sequences between players and effect resolutions being done in
certain orders that are dependent on choices made by players, etc.

Generators are an ideal way to run these types of simulations, but the
inability to copy the state of a generator makes it impossible to do
this in MCTS.

As Python is being increasingly used for data science, this use case
will be increasingly common. Being able to copy generators will save a
lot of work.

Keep in mind, I don't necessarily propose that generators should be
fully picklable; there are obviously a number of concerns and problems
there. Just being able to duplicate the generator's state within the
interpreter would be enough for my use case.

Workarounds:

The obvious choice is to refactor the simulation as an iterator that
stores each state as something that's easily copied/pickled. It's
probably possible, but it'll require a lot of thought and code for each
type of simulation.

There's a Python2 package from 2009 called generator_tools that
purports to do this. I haven't tried it yet to see if it still works in
2.x, and it appears beyond my skill level to port to 3.x.

PyPy & Stackless Python apparently support this within certain limits?

Thoughts?

Washington, DC USA
ffaristocrat at gmail.com

From guido at python.org  Tue Jun 19 19:25:40 2018
From: guido at python.org (Guido van Rossum)
Date: Tue, 19 Jun 2018 16:25:40 -0700
Subject: [Python-ideas] Copy (and/or pickle) generators
In-Reply-To: 
References: 
Message-ID: 

The state of a generator is not much more than a single Python stack
frame plus an integer indicating where in the bytecode the resume point
is. But copying/pickling a stack frame is complicated -- it's not just
all the locals but also the try/except stack and the expression
evaluation stack. Have a look here:
https://github.com/python/cpython/blob/master/Include/frameobject.h.
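(To get a feel for that state, a quick sketch you can try at the
prompt:)

    def gen():
        x = 1
        yield x
        yield x + 1

    g = gen()
    next(g)
    f = g.gi_frame       # the suspended frame object
    print(f.f_lasti)     # bytecode offset of the resume point
    print(f.f_locals)    # {'x': 1} -- the locals that would need copying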
I'm not sure that I want to sign up for making all that stuff copyable
(pickling will be an even harder challenge). But perhaps you (and/or
another fearless hacker) are interested in trying?

Or were you just trying to see if the core dev team has spare cycles to
implement this for you?

--Guido

On Tue, Jun 19, 2018 at 3:56 PM Micheál Keane wrote:

> Add a function to generator objects to copy the entire state of it:
> [...]

--
--Guido van Rossum (python.org/~guido)

From njs at pobox.com  Tue Jun 19 19:36:18 2018
From: njs at pobox.com (Nathaniel Smith)
Date: Tue, 19 Jun 2018 16:36:18 -0700
Subject: [Python-ideas] Copy (and/or pickle) generators
In-Reply-To: 
References: 
Message-ID: 

You might find this useful, either to use directly or as a source of
inspiration: https://github.com/llllllllll/cloudpickle-generators

-n

On Tue, Jun 19, 2018, 15:55 Micheál Keane wrote:

> Add a function to generator objects to copy the entire state of it:
> [...]
From levkivskyi at gmail.com  Tue Jun 19 19:38:28 2018
From: levkivskyi at gmail.com (Ivan Levkivskyi)
Date: Wed, 20 Jun 2018 00:38:28 +0100
Subject: [Python-ideas] Copy (and/or pickle) generators
In-Reply-To: 
References: 
Message-ID: 

> [snip]
>
> As Python is being increasingly used for data science, this use case
> will be increasingly common. Being able to copy generators will save a
> lot of work.
>
> Keep in mind, I don't necessarily propose that generators should be
> fully picklable; there are obviously a number of concerns and problems
> there. Just being able to duplicate the generator's state within the
> interpreter would be enough for my use case.
>
> [snip]
>
> Thoughts?

I also remember wanting this feature a few times in the past, so I like
the idea. The problem is who will implement this? If you have time and
energy, then you can just try (but beware, it may be harder than it
looks).

--
Ivan

From steve at pearwood.info  Tue Jun 19 23:19:19 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 20 Jun 2018 13:19:19 +1000
Subject: [Python-ideas] "Exposing" `__min__` and `__max__`
In-Reply-To: 
References: 
Message-ID: <20180620031919.GY14437@ando.pearwood.info>

On Tue, Jun 19, 2018 at 12:33:15PM -0700, Michael Selik wrote:

> Do you mind sharing an example usage in a realistic context?
> There might be a good solution that doesn't require adding magic
> methods.

You have some sort of binary search tree that is iterated over in some
arbitrary order. Calling min(tree) iterates over the entire tree, even
if the tree knows how to find the minimum much more efficiently.

Iterating over the entire tree is O(N), where N = number of nodes. A
more efficient min is typically O(D), where D = depth, which is
typically about log_2 N if the tree is balanced.

I know that for many purposes, we use dicts as they are more convenient
and easier to use than trees, but there are still plenty of realistic
use-cases for trees.
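For instance, a toy (unbalanced, purely illustrative) tree might
support it like this:

    class Tree:
        # nodes are [value, left, right]
        def __init__(self, values=()):
            self.root = None
            for value in values:
                self.add(value)

        def add(self, value):
            if self.root is None:
                self.root = [value, None, None]
                return
            node = self.root
            while True:
                i = 2 if value >= node[0] else 1
                if node[i] is None:
                    node[i] = [value, None, None]
                    return
                node = node[i]

        def __min__(self):
            if self.root is None:
                raise ValueError("empty tree")
            # walk down the left spine: O(depth), not O(N)
            node = self.root
            while node[1] is not None:
                node = node[1]
            return node[0]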
--
Steve

From storchaka at gmail.com  Wed Jun 20 00:05:19 2018
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Wed, 20 Jun 2018 07:05:19 +0300
Subject: [Python-ideas] "Exposing" `__min__` and `__max__`
In-Reply-To: 
References: 
Message-ID: 

19.06.18 22:18, James Edwards wrote:
> I've only recently looked for these special methods, so that in and of
> itself may be the reason these methods aren't exposed, but I could
> think of objects that may wish to implement __min__ and __max__
> themselves, for efficiency.

There are two questions.

1. What to do with the additional min() and max() arguments: key and
default.

2. Is the need for this feature large enough? Will the benefit for
special cases exceed the drawback of increasing implementation
complexity and slowing down common cases?

If support for the key and default arguments is not needed for you, you
can implement your own functions and use them instead of min() and
max().

From mrbm74 at gmail.com  Wed Jun 20 00:07:18 2018
From: mrbm74 at gmail.com (Martin Bammer)
Date: Wed, 20 Jun 2018 06:07:18 +0200
Subject: [Python-ideas] POPT (Python Object Provider Threads) (Antoine Pitrou)
In-Reply-To: 
References: 
Message-ID: 

On 2018-06-19 at 18:00, python-ideas-request at python.org wrote:
> Re: POPT (Python Object Provider Threads) (Antoine Pitrou)

Currently it's just an idea. I didn't have the time yet to create a
prototype implementation.

From steve at pearwood.info  Wed Jun 20 03:00:57 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 20 Jun 2018 17:00:57 +1000
Subject: [Python-ideas] "Exposing" `__min__` and `__max__`
In-Reply-To: 
References: 
Message-ID: <20180620070056.GZ14437@ando.pearwood.info>

On Wed, Jun 20, 2018 at 07:05:19AM +0300, Serhiy Storchaka wrote:

> 1. What to do with the additional min() and max() arguments: key and
> default.

Since there are no reflected versions of min/max, there is no trouble
with extra arguments. Just pass them through to the dunder:

    min(obj, key=x, default=y) => type(obj).__min__(obj, key=x, default=y)

> 2. Is the need for this feature large enough? Will the benefit for
> special cases exceed the drawback of increasing implementation
> complexity and slowing down common cases?

Reasonable questions, but I don't think that the cost of testing:

    if hasattr(type(obj), '__min__')  # or equivalent

is going to be very large. Amortized over O(N) comparisons, that's
practically free :-)

More important, I think, is the increase in API complexity. That's two
more dunders to learn about.

The first part is critical: is this useful enough to justify two more
dunders? I think the answer is a definite Maybe. Or perhaps Maybe Not.

I think that without at least one use-case in the standard library,
perhaps we should hold off on this. Unless numpy arrays are important
enough to justify this on their own?

Are there any builtins or std library classes that offer their own
min()/max() methods? If so, that would be good evidence that making
this a dunder-based protocol has stdlib use-cases.

--
Steve

From J.Demeyer at UGent.be  Wed Jun 20 05:56:05 2018
From: J.Demeyer at UGent.be (Jeroen Demeyer)
Date: Wed, 20 Jun 2018 11:56:05 +0200
Subject: [Python-ideas] staticmethod and classmethod should be callable
Message-ID: <5B2A24B5.7050900@UGent.be>

While working on PEP 579 and friends, I noticed one oddity with
classmethods: for Python classes, the object stored in the class
__dict__ is of type "classmethod". For extension types, the type is
"classmethod_descriptor". It turns out that the latter is callable
itself, unlike staticmethod or classmethod instances:

    >>> fromhex = float.__dict__["fromhex"]
    >>> type(fromhex)
    <class 'classmethod_descriptor'>
    >>> fromhex(float, "0xff")
    255.0
    >>> @classmethod
    ... def f(cls): pass
    >>> f(float)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: 'classmethod' object is not callable

Since it makes sense to merge the classes "classmethod" and
"classmethod_descriptor" (PEP 579, issue 8), one of the above behaviors
should be changed. Given that adding features is less likely to break
stuff, I would argue that classmethod instances should become callable.

This would also make classmethod more analogous to function: you can
see both "function" and "classmethod" as unbound methods. The only
thing that is different is the binding behavior (binding to the
instance vs. the class).

Finally, function decorators typically turn functions into a different
kind of callable. I find it counter-intuitive that @classmethod doesn't
do that.

And for consistency, also staticmethod instances should be callable.

Are there any reasons to *not* make staticmethod and classmethod
callable?
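(A quick illustration of the same asymmetry for staticmethod, and what
would change -- the last line shows the proposed new behavior:)

    sm = staticmethod(len)

    # works today, via the descriptor protocol:
    sm.__get__(None, object)([1, 2, 3])   # -> 3

    # today this raises TypeError: 'staticmethod' object is not
    # callable; under the proposal it would also return 3:
    sm([1, 2, 3])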
Jeroen.

From daniel.sanchez.fabregas at xunta.gal  Wed Jun 20 05:43:08 2018
From: daniel.sanchez.fabregas at xunta.gal (=?UTF-8?Q?Daniel_S=c3=a1nchez_F=c3=a1bregas?=)
Date: Wed, 20 Jun 2018 11:43:08 +0200
Subject: [Python-ideas] Check type hints in stack trace printing
In-Reply-To: <20180614123705.GL12683@ando.pearwood.info>
References: <20180614123705.GL12683@ando.pearwood.info>
Message-ID: <892db451-588e-f371-050e-bc55fc7b3b90@xunta.gal>

On 14/06/18 at 14:37, Steven D'Aprano wrote:
> On Thu, Jun 14, 2018 at 01:03:37PM +0200, Daniel Sánchez Fábregas wrote:
>> My idea consists in:
>> Adding a method to perform type checking in traceback objects
>> When printing stack traces search for mistyped arguments and warn
>> about them to the user.
> Can you give a concrete example of how this would work?

Example of how this should work from the user point of view:

~~~ python

def one(arg: str) -> str:
    return two(arg) + "1"
def two(arg: str) -> str:
    return three(arg) * 2
def three(arg: str) -> str:
    return "{}({}) ".format(arg, len(arg))

print(one("test"))
print(one(0))

~~~

Intended output:

~~~

test(4) test(4) 1
Traceback (most recent call last):
  File "test.py", line 9, in <module>
    print(one(0))
  Warning: TypeMismatch argument 'arg' of type 'int' is declared as 'str'
  File "test.py", line 2, in one
    return two(arg) + "1"
  Warning: TypeMismatch argument 'arg' of type 'int' is declared as 'str'
  File "test.py", line 4, in two
    return three(arg) * 2
  Warning: TypeMismatch argument 'arg' of type 'int' is declared as 'str'
  File "test.py", line 6, in three
    return "{}({}) ".format(arg, len(arg))
TypeError: object of type 'int' has no len()

~~~

How could it be achieved? I don't know enough Python to answer this. I
suppose that it could be done.

From j.van.dorp at deonet.nl  Wed Jun 20 06:13:06 2018
From: j.van.dorp at deonet.nl (Jacco van Dorp)
Date: Wed, 20 Jun 2018 12:13:06 +0200
Subject: [Python-ideas] Check type hints in stack trace printing
In-Reply-To: <892db451-588e-f371-050e-bc55fc7b3b90@xunta.gal>
References: <20180614123705.GL12683@ando.pearwood.info>
	<892db451-588e-f371-050e-bc55fc7b3b90@xunta.gal>
Message-ID: 

2018-06-20 11:43 GMT+02:00 Daniel Sánchez Fábregas:
> Example of how this should work from the user point of view:
> [...]

Looks like you could do this with a decorator, I think

    import functools
    import typing
    import warnings

    def type_check_on_traceback(func):
        ann = typing.get_type_hints(func)
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except:
                # on the way out of a failing call, warn about any
                # keyword argument that doesn't match its annotation
                for key, value in kwargs.items():
                    if key in ann and not isinstance(value, ann[key]):
                        warnings.warn(
                            "argument %r of type %r is declared as %r"
                            % (key, type(value).__name__, ann[key]))
                raise
        return wrapper

would be close, give or take some special cases. Not entirely sure how
this'd work with Optional[whatever] types, so you'd have to test that.
This has the limitation of not checking positional args, but if you put
some effort in, the inspect module can probably let you check those
too.
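Something along these lines, maybe (untested sketch; the helper name is
made up):

    import inspect
    import warnings

    def check_args(func, ann, args, kwargs):
        # bind positional arguments to their parameter names, so
        # they can be checked against the annotations as well
        bound = inspect.signature(func).bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            expected = ann.get(name)
            if isinstance(expected, type) and not isinstance(value, expected):
                warnings.warn(
                    "argument %r of type %r is declared as %r"
                    % (name, type(value).__name__, expected.__name__))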
From j.van.dorp at deonet.nl  Wed Jun 20 06:15:45 2018
From: j.van.dorp at deonet.nl (Jacco van Dorp)
Date: Wed, 20 Jun 2018 12:15:45 +0200
Subject: [Python-ideas] Check type hints in stack trace printing
In-Reply-To: 
References: <20180614123705.GL12683@ando.pearwood.info>
	<892db451-588e-f371-050e-bc55fc7b3b90@xunta.gal>
Message-ID: 

To clarify some more, you'd then have to use the decorator like:

    @type_check_on_traceback
    def three(arg: str) -> str:
        return "{}({}) ".format(arg, len(arg))

on every function where you want this behaviour.

Note that this will also emit warnings on tracebacks of exceptions that
are later silenced, and afaik there's no easy way around that.

From ffaristocrat at gmail.com  Wed Jun 20 06:33:24 2018
From: ffaristocrat at gmail.com (=?UTF-8?Q?Miche=C3=A1l_Keane?=)
Date: Wed, 20 Jun 2018 11:33:24 +0100
Subject: [Python-ideas] Copy (and/or pickle) generators
In-Reply-To: 
References: 
Message-ID: 

I wanted to sound out a couple things.

First, I couldn't find any real discussion about it after 2011, so I
had no idea if the reasons it was ruled unfeasible with Python 2 still
held nearly 10 years later with Python 3. I was mainly wondering if all
the recent asynchronous work had changed things significantly.
Apparently not?

Secondly, one SO comment had included the suggestion that it be posted
to this list - my searching couldn't find it ever having been done, so
here it is.

Finally, another comment made the point that there wasn't a strong use
case given for it. With the data science libraries that have sprung up
around Python in the intervening years, I believe there now is one.

Washington, DC USA
ffaristocrat at gmail.com

On Wed, Jun 20, 2018 at 12:25 AM, Guido van Rossum wrote:

> The state of a generator is not much more than a single Python stack
> frame plus an integer indicating where in the bytecode the resume point
> is. But copying/pickling a stack frame is complicated.
> [...]
From ncoghlan at gmail.com  Wed Jun 20 07:29:22 2018
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 20 Jun 2018 21:29:22 +1000
Subject: [Python-ideas] POPT (Python Object Provider Threads)
In-Reply-To: 
References: 
Message-ID: 

On 20 June 2018 at 00:47, Martin Bammer wrote:
> If this idea is well implemented I expect a big performance improvement
> for all Python applications.

Given the free lists already maintained for several builtin types in
the reference implementation, I suspect you may be disappointed on that
front :)

(While object creation overhead certainly isn't trivial, the
interpreter's already pretty aggressive about repurposing previously
allocated and initialised memory for new instances)
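(You can sometimes see those free lists at work -- a rough sketch, and
the result is a CPython implementation detail that may vary:)

    x = 255.5
    addr = id(x)
    del x            # the float object goes back on the free list
    y = 511.25       # ...and its memory slot is typically handed
                     # straight back out for the next float
    print(id(y) == addr)   # often True on CPython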
Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From steve at pearwood.info  Wed Jun 20 07:43:36 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 20 Jun 2018 21:43:36 +1000
Subject: [Python-ideas] staticmethod and classmethod should be callable
In-Reply-To: <5B2A24B5.7050900@UGent.be>
References: <5B2A24B5.7050900@UGent.be>
Message-ID: <20180620114335.GA14437@ando.pearwood.info>

On Wed, Jun 20, 2018 at 11:56:05AM +0200, Jeroen Demeyer wrote:
[...]
> Since it makes sense to merge the classes "classmethod" and
> "classmethod_descriptor" (PEP 579, issue 8), one of the above behaviors
> should be changed. Given that adding features is less likely to break
> stuff, I would argue that classmethod instances should become callable.
[...]
> Are there any reasons to *not* make staticmethod and classmethod
> callable?

(The classes themselves are callable -- you're talking about the
instances.)

+1 yes please!

The fact that classmethod and especially staticmethod instances aren't
callable has been a long-running niggling pain for me. Occasionally I
want to do something like this:

    class Spam:
        @staticmethod
        def utility(arg):
            # something which is conceptually related to the Spam class
            # but doesn't need a cls/self argument.
            ...

        value = utility(arg)

but it doesn't work, as staticmethod objects aren't callable until
after they've gone through the descriptor protocol.

I'm not the only one bitten by this:

https://stackoverflow.com/questions/45375944/python-static-method-is-not-always-callable
https://mail.python.org/pipermail/python-list/2011-November/615069.html

Part of that thread, see links and discussion here:

https://mail.python.org/pipermail/python-list/2011-November/615077.html

I thought I had raised a bug report for this on the tracker, but my
google-fu is failing me and I can't find it. But my recollection is
that the simple fix is to make staticmethod.__call__ simply delegate to
the underlying decorated function. And similar for classmethod.

(Of course calling classmethod instances directly won't work unless you
provide the class argument. But that's just a simple matter of bound
versus unbound methods.)

--
Steve

From kenlhilton at gmail.com  Wed Jun 20 07:51:46 2018
From: kenlhilton at gmail.com (Ken Hilton)
Date: Wed, 20 Jun 2018 19:51:46 +0800
Subject: [Python-ideas] Dedicated string concatenation operator
Message-ID: 

Hi all, just another wild idea.

I've been working a lot with PHP lately (boo!), and one of the things I
have a love-hate relationship with in it is its string concatenation
operator: `.` (a dot). Counter-intuitively to Python, `.` is used to
concatenate strings; `"str1" . "str2"` evaluates to `"str1str2"`.

Even though it's kind of weird, I find the separation between addition
and concatenation useful. In PHP, `"str1" + "str2"` evaluates to 0;
`"str1" . "str2"` evaluates to `"str1str2"`. Obviously, in Python,
`"str1" + "str2"` could not evaluate to 0, it should instead raise a
TypeError. But it's more clear what is going on here:

    $content .= "foobar";
    $i += 1;

than here:

    content += "foobar"
    i += 1

I propose adding a dedicated operator for string concatenation. I don't
propose `.` as that operator - that would break too much. My initial
idea is to abuse the @ operator introduced for matrix multiplication to
work for string concatenation when applied to strings. This would be an
example result of that:

    >>> from numpy import matrix
    >>> matrix('1 2; 3 4') @ matrix('4 3; 2 1')
    [[ 8  5]
     [20 13]]
    >>> "str1" @ "str2"
    "str1str2"
    >>> "str1" @ 56  # str() is called on 56 before concatenating
    "str156"
    >>> 56 @ "str1"  # str would also have __rmatmul__
    "56str1"
    >>> content = "foobar"
    >>> content @= "bazbang"
    >>> content @= "running out of ideas"
    >>> content
    'foobarbazbangrunning out of ideas'
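(For what it's worth, these semantics are easy to prototype on a str
subclass today -- a toy sketch, since the real builtin str can't be
patched from Python:)

    class S(str):
        def __matmul__(self, other):
            return S(str(self) + str(other))
        def __rmatmul__(self, other):
            return S(str(other) + str(self))

    print(S("str1") @ "str2")   # str1str2
    print(S("str1") @ 56)       # str156
    print(56 @ S("str1"))       # 56str1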
However, the operator does not necessarily have to be @ - I merely
picked that because of its lack of use outside matrix math.

What are your thoughts?

Sincerely,
Ken Hilton

From jheiv at jheiv.com  Wed Jun 20 08:23:08 2018
From: jheiv at jheiv.com (James Edwards)
Date: Wed, 20 Jun 2018 08:23:08 -0400
Subject: [Python-ideas] "Exposing" `__min__` and `__max__`
In-Reply-To: <20180620070056.GZ14437@ando.pearwood.info>
References: <20180620070056.GZ14437@ando.pearwood.info>
Message-ID: 

> Are there any builtins or std library classes that offer their own
> min()/max() methods?

My first instinct was heapq[1], since the way to access the min value
is simply heap[0] (and I thought it could benefit from __min__) -- it's
almost the perfect motivating example. But as it stands, the module
uses functions to operate directly on a standard list, so even if
__min__ were exposed, min(heap) would still iterate over the entire
list.

That being said, a heap *class* could take advantage of this, and
provide a semantically consistent optimization.

I'm not sure how many examples will be found in the stdlib, as I expect
this optimization to be restricted to specialized container types like
heaps, but I'll keep searching.

[1] https://docs.python.org/3.6/library/heapq.html

On Wed, Jun 20, 2018 at 3:00 AM, Steven D'Aprano wrote:

> Are there any builtins or std library classes that offer their own
> min()/max() methods? If so, that would be good evidence that making
> this a dunder-based protocol has stdlib use-cases.
> [...]

From steve at pearwood.info  Wed Jun 20 08:48:22 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 20 Jun 2018 22:48:22 +1000
Subject: [Python-ideas] Dedicated string concatenation operator
In-Reply-To: 
References: 
Message-ID: <20180620124821.GB14437@ando.pearwood.info>

On Wed, Jun 20, 2018 at 07:51:46PM +0800, Ken Hilton wrote:

> I propose adding a dedicated operator for string concatenation.

Guido's time machine strikes again:

    py> "Hello" + "World"
    'HelloWorld'

(This has been in the language since at least version 1.5 and probably
back even further to 1.0 and beyond.)
--
Steve

From j.van.dorp at deonet.nl  Wed Jun 20 10:03:17 2018
From: j.van.dorp at deonet.nl (Jacco van Dorp)
Date: Wed, 20 Jun 2018 16:03:17 +0200
Subject: [Python-ideas] Dedicated string concatenation operator
In-Reply-To: <20180620124821.GB14437@ando.pearwood.info>
References: <20180620124821.GB14437@ando.pearwood.info>
Message-ID: 

For changes that break this much previous code, you need a really,
really, really good reason. "Even though it's kind of weird, I find the
separation between addition and concatenation useful." does not
qualify.

From guido at python.org  Wed Jun 20 11:24:21 2018
From: guido at python.org (Guido van Rossum)
Date: Wed, 20 Jun 2018 08:24:21 -0700
Subject: [Python-ideas] "Exposing" `__min__` and `__max__`
In-Reply-To: 
References: <20180620070056.GZ14437@ando.pearwood.info>
Message-ID: 

Finding more realistic use cases is key -- the actual spec is pretty
obvious and doesn't worry me in terms of added language or
implementation complexity. I think just finding a data structure that
should implement its own min/max functionality (or maybe one of these,
like heapq) is not enough motivation. You have to find code where such
a data structure (let's say a Tree) is passed to some function that
also accepts, say, a list. Then that function would benefit from being
able to call just `min(x)` rather than `x.min() if isinstance(x, Tree)
else min(x)`. If whenever you have a Tree you know that you have a Tree
(because it has other unique methods) then there's no burden for the
user to call x.min().

--Guido

On Wed, Jun 20, 2018 at 5:24 AM James Edwards wrote:

> My first instinct was heapq[1], since the way to access the min value
> is simply heap[0] (and I thought it could benefit from __min__) -- it's
> almost the perfect motivating example.
> [...]

From yselivanov.ml at gmail.com  Wed Jun 20 12:15:18 2018
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Wed, 20 Jun 2018 12:15:18 -0400
Subject: [Python-ideas] Copy (and/or pickle) generators
In-Reply-To: 
References: 
Message-ID: 

On Wed, Jun 20, 2018 at 6:34 AM Micheál Keane wrote:
[..]
> First, I couldn't find any real discussion about it after 2011, so I
> had no idea if the reasons it was ruled unfeasible with Python 2 still
> held nearly 10 years later with Python 3. I was mainly wondering if all
> the recent asynchronous work had changed things significantly.
> Apparently not?

No, as message passing works well enough. The code is certainly more
readable when you don't have some "global state pickling" kind of
magic.

> Finally, another comment made the point that there wasn't a strong use
> case given for it. With the data science libraries that have sprung up
> around Python in the intervening years, I believe there now is one.

As Guido has pointed out, pickling generators would require proper
pickling of the entire frame stack (otherwise generators that use
"global" or "nonlocal" won't unpickle correctly). Ideally we should
also pickle thread locals and contextvars.

Even if Python supported that, pickling and unpickling generators would
be a slow operation, to the point of being impracticable (and JIT-based
Python implementations would probably use the slowest path for any code
that involves frame pickling).

Instead you should try to encapsulate your state in a dedicated object
that is easy to pickle and unpickle. Your generators can then work with
that state object instead of storing the state implicitly in their
local variables (this is similar to your workaround #1 but still allows
you to work with generators; just don't use local variables). The state
(or parts of it) can be an immutable collection/object, which will make
it easier to copy/pass it by reference at any point. While this
approach requires more work than just encapsulating the state in
generators, in the long run it should make your code simpler and more
scalable.

Yury

From guido at python.org  Wed Jun 20 12:20:38 2018
From: guido at python.org (Guido van Rossum)
Date: Wed, 20 Jun 2018 09:20:38 -0700
Subject: [Python-ideas] staticmethod and classmethod should be callable
In-Reply-To: <20180620114335.GA14437@ando.pearwood.info>
References: <5B2A24B5.7050900@UGent.be>
	<20180620114335.GA14437@ando.pearwood.info>
Message-ID: 

+1 -- when we introduced these we didn't see the use case so clearly,
but it definitely exists.

On Wed, Jun 20, 2018 at 4:44 AM Steven D'Aprano wrote:

> +1 yes please!
>
> The fact that classmethod and especially staticmethod instances aren't
> callable has been a long-running niggling pain for me.
> [...]

From solipsis at pitrou.net  Wed Jun 20 12:27:24 2018
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 20 Jun 2018 18:27:24 +0200
Subject: [Python-ideas] Copy (and/or pickle) generators
References: 
Message-ID: <20180620182724.727f49d1@fsol>

On Wed, 20 Jun 2018 12:15:18 -0400
Yury Selivanov wrote:
>
> As Guido has pointed out, pickling generators would require proper
> pickling of the entire frame stack (otherwise generators that use
> "global" or "nonlocal" won't unpickle correctly).

Depends what level of automatic (magic?) correctness you're expecting.
A generator is conceptually an iterator expressed in a different
syntax. If you define an iterator object, it will probably get pickling
for free, yet pickling it won't bother serializing the global variables
that are accessed from its __next__() and send() methods. A generator
needn't be different: you mainly have to be careful to serialize its
module's __name__, so that you can look up the frame's global dict by
module name when the generator is recreated.

By contrast, closure variables would be an issue. But a first
implementation could simply refuse to pickle generators that access an
enclosing local state.

Regards

Antoine.

From storchaka at gmail.com  Wed Jun 20 12:30:03 2018
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Wed, 20 Jun 2018 19:30:03 +0300
Subject: [Python-ideas] staticmethod and classmethod should be callable
In-Reply-To: 
References: <5B2A24B5.7050900@UGent.be>
	<20180620114335.GA14437@ando.pearwood.info>
Message-ID: 

20.06.18 19:20, Guido van Rossum wrote:
> +1 -- when we introduced these we didn't see the use case so clearly,
> but it definitely exists.

How would you call a classmethod descriptor in this case?

From guido at python.org  Wed Jun 20 12:37:04 2018
From: guido at python.org (Guido van Rossum)
Date: Wed, 20 Jun 2018 09:37:04 -0700
Subject: [Python-ideas] staticmethod and classmethod should be callable
In-Reply-To: 
References: <5B2A24B5.7050900@UGent.be>
	<20180620114335.GA14437@ando.pearwood.info>
Message-ID: 

On Wed, Jun 20, 2018 at 9:31 AM Serhiy Storchaka wrote:

> How would you call a classmethod descriptor in this case?

With an extra first argument that's a class -- it should just call the
wrapped function with whatever args are presented to the descriptor.

--
--Guido van Rossum (python.org/~guido)

From jheiv at jheiv.com  Wed Jun 20 12:53:58 2018
From: jheiv at jheiv.com (James Edwards)
Date: Wed, 20 Jun 2018 12:53:58 -0400
Subject: [Python-ideas] "Exposing" `__min__` and `__max__`
In-Reply-To: 
References: <20180620070056.GZ14437@ando.pearwood.info>
Message-ID: 

I do think there is some merit in being able to transition as easily as
possible from something being, say, list-backed to a domain-specific
data structure that's more efficient. If one wanted to go from a list
to a minheap, it'd be nice just to change `x = []` to `x = Heap()` and
have everything else "just work", without having to change all
`min(x)`s to `x.min()`s, etc.

But trying to find places in production code where there is the special
casing you described is (now) at the top of my priorities.

Outside of invariant-maintaining data structures though, I think the
most compelling case I've come up with so far is that it could be
implemented on certain types of generators, without having to generate
the whole sequence. (IMHO, they're neat side effects, but not very
compelling by themselves):

- __min__ and __max__ could be implemented on the range object, as
  calculating these is straightforward, and you wouldn't need to
  generate the entire sequence. Passing an "un-listified" range to a
  function that accepts lists is somewhat common, I think (sketch
  below).
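For the range case the whole computation is O(1) -- a sketch as
standalone helpers, since range itself can't grow dunders from here:

    def range_min(r):
        # min of a range without iterating it
        if len(r) == 0:
            raise ValueError("empty range")
        return r[0] if r.step > 0 else r[-1]

    def range_max(r):
        # max of a range without iterating it
        if len(r) == 0:
            raise ValueError("empty range")
        return r[-1] if r.step > 0 else r[0]

    assert range_min(range(10, 0, -2)) == 2
    assert range_max(range(10, 0, -2)) == 10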
Yury From guido at python.org Wed Jun 20 12:20:38 2018 From: guido at python.org (Guido van Rossum) Date: Wed, 20 Jun 2018 09:20:38 -0700 Subject: [Python-ideas] staticmethod and classmethod should be callable In-Reply-To: <20180620114335.GA14437@ando.pearwood.info> References: <5B2A24B5.7050900@UGent.be> <20180620114335.GA14437@ando.pearwood.info> Message-ID: +1 -- when we introduced these we didn't see the use case so clearly, but it definitely exists. On Wed, Jun 20, 2018 at 4:44 AM Steven D'Aprano wrote: > On Wed, Jun 20, 2018 at 11:56:05AM +0200, Jeroen Demeyer wrote: > [...] > > Since it makes sense to merge the classes "classmethod" and > > "classmethod_descriptor" (PEP 579, issue 8), one of the above behaviors > > should be changed. Given that adding features is less likely to break > > stuff, I would argue that classmethod instances should become callable. > [...] > > Are there any reasons to *not* make staticmethod and classmethod > callable? > > (The classes themselves are callable -- you're talking about the > instances.) > > +1 yes please! > > The fact that classmethods and especially staticmethod instances aren't > callable has been a long-running niggling pain for me. Occasionally I > want to do something like this: > > class Spam: > @staticmethod > def utility(arg): > # something which is conceptually related to the Spam class > # but doesn't need a cls/self argument. > ... > > value = utility(arg) > > but it doesn't work as staticmethod objects aren't callable until after > they've gone through the descriptor protocol. > > > I'm not the only one bitten by this: > > > https://stackoverflow.com/questions/45375944/python-static-method-is-not-always-callable > > https://mail.python.org/pipermail/python-list/2011-November/615069.html > > Part of that thread, see links and discussion here: > > https://mail.python.org/pipermail/python-list/2011-November/615077.html > > > I thought I had raised a bug report for this on the tracker, but my > google-fu is failing me and I can't find it. But my recollection is that > the simple fix is to make staticmethod.__call__ simply delegate to the > underlying decorated function. And similar for classmethod. > > (Of course calling classmethod instances directly won't work unless you > provide the class argument. But that's just a simple matter of bound > versus unbound methods.) > > > -- > Steve > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Wed Jun 20 12:27:24 2018 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 20 Jun 2018 18:27:24 +0200 Subject: [Python-ideas] Copy (and/or pickle) generators References: Message-ID: <20180620182724.727f49d1@fsol> On Wed, 20 Jun 2018 12:15:18 -0400 Yury Selivanov wrote: > > > Finally, another comment made the point that there wasn't a strong use case given for it. With the data science libraries that have sprung up around Python in the intervening years, I believe there now is one. > > As Guido has pointed out, pickling generators would require proper > pickling of the entire frame stack (otherwise generators that use > "global" or "nonlocal" won't unpickle correctly). Depends what level of automatic (magic?) correctness you're expecting. 
A generator is conceptually an iterator expressed in a different syntax. If you define an iterator object, it will probably get pickling for free, yet pickling it won't bother serializing the global variables that are accessed from its __next__() and send() methods. A generator needn't be different: you mainly have to be careful to serialize its module's __name__, so that you can look up the frame's global dict by module name when the generator is recreated. By contrast, closure variables would be an issue. But a first implementation could simply refuse to pickle generators that access an enclosing local state. Regards Antoine. From storchaka at gmail.com Wed Jun 20 12:30:03 2018 From: storchaka at gmail.com (Serhiy Storchaka) Date: Wed, 20 Jun 2018 19:30:03 +0300 Subject: [Python-ideas] staticmethod and classmethod should be callable In-Reply-To: References: <5B2A24B5.7050900@UGent.be> <20180620114335.GA14437@ando.pearwood.info> Message-ID: 20.06.18 19:20, Guido van Rossum wrote: > +1 -- when we introduced these we didn't see the use case so clearly, > but it definitely exists. How would you call a classmethod descriptor in this case? From guido at python.org Wed Jun 20 12:37:04 2018 From: guido at python.org (Guido van Rossum) Date: Wed, 20 Jun 2018 09:37:04 -0700 Subject: [Python-ideas] staticmethod and classmethod should be callable In-Reply-To: References: <5B2A24B5.7050900@UGent.be> <20180620114335.GA14437@ando.pearwood.info> Message-ID: On Wed, Jun 20, 2018 at 9:31 AM Serhiy Storchaka wrote: > 20.06.18 19:20, Guido van Rossum wrote: > > +1 -- when we introduced these we didn't see the use case so clearly, > > but it definitely exists. > > How would you call a classmethod descriptor in this case? > With an extra first argument that's a class -- it should just call the wrapped function with whatever args are presented to the descriptor. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From jheiv at jheiv.com Wed Jun 20 12:53:58 2018 From: jheiv at jheiv.com (James Edwards) Date: Wed, 20 Jun 2018 12:53:58 -0400 Subject: [Python-ideas] "Exposing" `__min__` and `__max__` In-Reply-To: References: <20180620070056.GZ14437@ando.pearwood.info> Message-ID: I do think there is some merit in being able to transition as easily as possible from something being, say, list-backed to a domain-specific data structure that's more efficient. If one wanted to go from a list to a minheap, it'd be nice just to change `x = []` to `x = Heap()` and have everything else "just work", without having to change all `min(x)`s to `x.min()`s, etc. But trying to find places in production code where there is the special casing you described is (now) at the top of my priorities. Outside of invariant-maintaining data structures though, I think the most compelling case I've come up with so far is that it could be implemented on certain types of generators, without having to generate the whole sequence. (IMHO, they're neat side effects, but not very compelling by themselves): - __min__ and __max__ could be implemented on the range object, as calculating these is straightforward, and you wouldn't need to generate the entire sequence. Passing an "un-listified" range to a function that accepts lists is somewhat common, I think.
The inefficiency is compounded if you generate the sequence multiple times, e.g.: def get_err(iterable): return max(iterable) - min(iterable) x = range(10000) get_err(x) # vs x = list(range(10000)) get_err(x) Here, implementing __min__ and __max__ would make passing a range not just as fast as passing a "listified" range, but significantly faster. But I think this is just a nice coincidence of exposing the special methods and by no means motivating by itself. - There are also a few infinite generators in itertools where calling min() or max() on the generator will run forever, despite at least one of these being a clearly defined: from itertools import count, cycle gen = count(start=0, step=1) # min=0, no max # or gen = cycle([1,2,3]) # min=1, max=3 print(min(gen)) # Will never terminate I'm even less sure about how often this would actually help than I am the range example. I don't envision many places where people are passing infinite generators to things expecting standard lists -- especially in light of the fact that calling min() or max() on them will prevent further execution. Thanks for all the feedback, the search will continue :) On Wed, Jun 20, 2018 at 11:24 AM, Guido van Rossum wrote: > Finding more realistic use cases is key -- the actual spec is pretty > obvious and doesn't worry me in terms of added language or implementation > complexity. > > I think just finding a data structure that should implement its own > min/max funtionality (or maybe one of these, like heapq) is not enough > motivation. You have to find code where such a data structure (let's say a > Tree) is passed to some function that also accepts, say, a list. Then that > function would benefit from being able to call just `min(x)` rather than > `x.min() if isinstance(x, Tree) else min(x)`. If whenever you have a Tree > you know that you have a Tree (because it has other unique methods) then > there's no burden for the user to call x.min(). > > --Guido > > On Wed, Jun 20, 2018 at 5:24 AM James Edwards wrote: > >> > Are there any builtins or std library classes that offer their own >> min()/max() methods? >> >> My first instinct was heapq[1], since the way to access the min value is >> simply heap[0] (and I thought it could benefit from __min__) -- it's almost >> the perfect motivating example. But as it stands, the module uses >> functions to operate directly on a standard list, so even if __min__ were >> exposed, min(heap) would still iterate over the entire list. >> >> That being said, a heap *class* could take advantage of this, and >> provide a semantically consistent optimization. >> >> I'm not sure how many examples will be found in stdlib, as I expect this >> optimization to be restricted to specialized container types like heaps, >> but I'll keep searching. >> >> [1] https://docs.python.org/3.6/library/heapq.html >> >> On Wed, Jun 20, 2018 at 3:00 AM, Steven D'Aprano >> wrote: >> >>> On Wed, Jun 20, 2018 at 07:05:19AM +0300, Serhiy Storchaka wrote: >>> > 19.06.18 22:18, James Edwards ????: >>> > >I've only recently looked for these special methods, so that in and >>> of >>> > >itself may be the reason these methods aren't exposed, but I could >>> think >>> > >of objects that may wish to implement __min__ and __max__ themselves, >>> > >for efficiency. >>> > >>> > There are two questions. >>> > >>> > 1. What to do with additional min() and max() arguments: key and >>> default. >>> >>> Since there are no reflected versions of min/max, there is no trouble >>> with extra arguments. 
Just pass them through to the dunder: >>> >>> min(obj, key=x, default=y) => type(obj).__min__(key=x, default=y) >>> >>> >>> > 2. Is the need of this feature large enough? Will the benefit for >>> > special cases exceed the drawback of increasing implementation >>> > complexity and slowing down common cases? >>> >>> Reasonable questions, but I don't think that the cost of testing: >>> >>> if hasattr(type(obj), '__min__') >>> # or equivalent >>> >>> is going to be very large. Amortized over O(N) comparisons, that's >>> practically free :-) >>> >>> More important, I think, is the increase in API complexity. That's two >>> more dunders to learn about. >>> >>> The first part is critical: is this useful enough to justify two more >>> dunders? I think the answer is a definite Maybe. Or perhaps Maybe Not. >>> >>> I think that without at least one use-case in the standard library, >>> perhaps we should hold off on this. Unless numpy arrays are important >>> enough to justify this on their own? >>> >>> Are there any builtins or std library classes that offer their own >>> min()/max() methods? If so, that would be good evidence that making this >>> a dunder-based protocol has stdlib use-cases. >>> >>> >>> >>> -- >>> Steve >>> _______________________________________________ >>> Python-ideas mailing list >>> Python-ideas at python.org >>> https://mail.python.org/mailman/listinfo/python-ideas >>> Code of Conduct: http://python.org/psf/codeofconduct/ >>> >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > > > -- > --Guido van Rossum (python.org/~guido) > -------------- next part -------------- An HTML attachment was scrubbed... URL: From storchaka at gmail.com Wed Jun 20 13:01:35 2018 From: storchaka at gmail.com (Serhiy Storchaka) Date: Wed, 20 Jun 2018 20:01:35 +0300 Subject: [Python-ideas] staticmethod and classmethod should be callable In-Reply-To: References: <5B2A24B5.7050900@UGent.be> <20180620114335.GA14437@ando.pearwood.info> Message-ID: <7a02a413-794e-8b88-1ac1-3b39bb775243@gmail.com> 20.06.18 19:37, Guido van Rossum ????: > On Wed, Jun 20, 2018 at 9:31 AM Serhiy Storchaka > > wrote: > > 20.06.18 19:20, Guido van Rossum ????: > > +1 -- when we introduced these we didn't see the use case so > clearly, > > but it definitely exists. > > How would you call a classmethod descriptor in this case? > > > With an extra first argument that's a class -- it should just call the > wrapped function with whatever args are presented to the descriptior. This differs from calling a class method outside of the class definition body. And in the class definition body the class is not defined still. class Spam: @classmethod def utility(arg): ... 
value = utility(???, arg) From guido at python.org Wed Jun 20 13:07:54 2018 From: guido at python.org (Guido van Rossum) Date: Wed, 20 Jun 2018 10:07:54 -0700 Subject: [Python-ideas] staticmethod and classmethod should be callable In-Reply-To: <7a02a413-794e-8b88-1ac1-3b39bb775243@gmail.com> References: <5B2A24B5.7050900@UGent.be> <20180620114335.GA14437@ando.pearwood.info> <7a02a413-794e-8b88-1ac1-3b39bb775243@gmail.com> Message-ID: On Wed, Jun 20, 2018 at 10:03 AM Serhiy Storchaka wrote: > 20.06.18 19:37, Guido van Rossum wrote: > > On Wed, Jun 20, 2018 at 9:31 AM Serhiy Storchaka > > > > wrote: > > > > 20.06.18 19:20, Guido van Rossum wrote: > > > +1 -- when we introduced these we didn't see the use case so > > clearly, > > > but it definitely exists. > > > > How would you call a classmethod descriptor in this case? > > > > > > With an extra first argument that's a class -- it should just call the > > wrapped function with whatever args are presented to the descriptor. > > This differs from calling a class method outside of the class definition > body. And in the class definition body the class is not defined still. > > class Spam: > @classmethod > def utility(arg): > ... > > value = utility(???, arg) > Maybe we're misunderstanding each other? I would think that calling the classmethod object directly would just call the underlying function, so this should have to call utility() with a single arg. This is really the only option, since the descriptor doesn't have any context. In any case it should probably `def utility(cls)` in that example to clarify that the first arg to a class method is a class. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From storchaka at gmail.com Wed Jun 20 13:17:37 2018 From: storchaka at gmail.com (Serhiy Storchaka) Date: Wed, 20 Jun 2018 20:17:37 +0300 Subject: [Python-ideas] "Exposing" `__min__` and `__max__` In-Reply-To: <20180620070056.GZ14437@ando.pearwood.info> References: <20180620070056.GZ14437@ando.pearwood.info> Message-ID: 20.06.18 10:00, Steven D'Aprano wrote: > On Wed, Jun 20, 2018 at 07:05:19AM +0300, Serhiy Storchaka wrote: >> 1. What to do with additional min() and max() arguments: key and default. > > Since there are no reflected versions of min/max, there is no trouble > with extra arguments. Just pass them through to the dunder: > > min(obj, key=x, default=y) => type(obj).__min__(key=x, default=y) The devil is in the details. And you will see this when you try to implement min() and __min__(). 1) There is no default value for default. This makes handling it in Python code hard. 2) The two original examples don't work with the key function. You will need to add complex caches for supporting different key functions, and this will add new problems. In the future we may add new parameters for min() and max(). This is not a closed protocol, as len() or `+` are. >> 2. Is the need of this feature large enough? Will the benefit for >> special cases exceed the drawback of increasing implementation >> complexity and slowing down common cases? > > Reasonable questions, but I don't think that the cost of testing: > > if hasattr(type(obj), '__min__') > # or equivalent > > is going to be very large. Amortized over O(N) comparisons, that's > practically free :-) N may be small. And I suppose that for most calls it may be <10 or even <5. Note that the cost will be much larger than for __len__ or __add__, because new dunder methods will not have slots.
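For concreteness, the kind of dispatch being discussed could look roughly like this (a sketch only, with a toy heap container; not a real patch, and it sidesteps the key-function caching problems mentioned above):

import heapq

_sentinel = object()

def smart_min(obj, **kwargs):
    # Hypothetical dispatch: use a type-level __min__ when one exists,
    # passing key/default straight through, else fall back to builtin min().
    special = getattr(type(obj), '__min__', None)
    if special is not None:
        return special(obj, **kwargs)
    return min(obj, **kwargs)

class MinHeap:
    # Toy heap container: the smallest element is always self._data[0].
    def __init__(self, items=()):
        self._data = list(items)
        heapq.heapify(self._data)
    def __iter__(self):
        return iter(self._data)
    def __min__(self, key=None, default=_sentinel):
        if key is None and self._data:
            return self._data[0]          # O(1) instead of O(n)
        if not self._data and default is not _sentinel:
            return default
        return min(iter(self._data), key=key) if key else min(iter(self._data))

h = MinHeap([5, 1, 9])
assert smart_min(h) == 1          # hits MinHeap.__min__, no full scan
assert smart_min([5, 1, 9]) == 1  # plain lists take the min() fallback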
From joejev at gmail.com Wed Jun 20 13:24:32 2018 From: joejev at gmail.com (Joseph Jevnik) Date: Wed, 20 Jun 2018 13:24:32 -0400 Subject: [Python-ideas] Copy (and/or pickle) generators In-Reply-To: <20180620182724.727f49d1@fsol> References: <20180620182724.727f49d1@fsol> Message-ID: This was already posted in the thread, but https://github.com/llllllllll/cloudpickle-generators is just an extension to the standard pickle machinery and is able to support closures, nonlocals, and globals: https://github.com/llllllllll/cloudpickle-generators/blob/master/cloudpickle_generators/tests/test_cloudpickle_generators.py. It can even support the exotic case of a generator closing over itself. The state that needs to be serialized for a generator is: 1. the frame's locals 2. the frame's globals 3. the closure cells 4. the lasti of the frame 5. the frame's data stack 6. the frame's block stack 7. the frame's suspended exception* The frame's suspended exception is the exception that is stored when you have code like: try: raise ValueError() except Exception: yield value raise The frame stores the (type, value, traceback) so that it can make the raise statement work after the yield. You need to be careful to check for recursion in the globals and closure because the generator instance may get stored there. You also need to check the locals because the generator instance could be sent back into itself and stored in a local. You also need to check the data stack for recursion because the instance could be sent into itself and then left on the stack between yields, like if you use a yield expression in the middle of a tuple creation like: a = (the_generator_instance, (yield)) Extracting the lasti, data stack, block stack and held exception require a little C, the rest can be pulled from pure Python. On Wed, Jun 20, 2018 at 12:27 PM, Antoine Pitrou wrote: > On Wed, 20 Jun 2018 12:15:18 -0400 > Yury Selivanov > wrote: >> >> > Finally, another comment made the point that there wasn't a strong use case given for it. With the data science libraries that have sprung up around Python in the intervening years, I believe there now is one. >> >> As Guido has pointed out, pickling generators would require proper >> pickling of the entire frame stack (otherwise generators that use >> "global" or "nonlocal" won't unpickle correctly). > > Depends what level of automatic (magic?) correctness you're expecting. > A generator is conceptually an iterator expressed in a different syntax. > If you define an iterator object, it will probably get pickling for > free, yet pickling it won't bother serializing the global variables that > are accessed from its __next__() and send() methods. > > A generator needn't be different: you mainly have to be careful to > serialize its module's __name__, so that you can lookup the frame's > global dict by module name when the generator is recreated. > > By contrast, closure variables would be an issue. But a first > implementation could simply refuse to pickle generators that access an > enclosing local state. > > Regards > > Antoine. 
> > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From storchaka at gmail.com Wed Jun 20 13:27:17 2018 From: storchaka at gmail.com (Serhiy Storchaka) Date: Wed, 20 Jun 2018 20:27:17 +0300 Subject: [Python-ideas] staticmethod and classmethod should be callable In-Reply-To: References: <5B2A24B5.7050900@UGent.be> <20180620114335.GA14437@ando.pearwood.info> <7a02a413-794e-8b88-1ac1-3b39bb775243@gmail.com> Message-ID: 20.06.18 20:07, Guido van Rossum wrote: > Maybe we're misunderstanding each other? I would think that calling the > classmethod object directly would just call the underlying function, so > this should have to call utility() with a single arg. This is really the > only option, since the descriptor doesn't have any context. > > In any case it should probably `def utility(cls)` in that example to > clarify that the first arg to a class method is a class. Sorry, I missed the cls parameter in the definition of utility(). class Spam: @classmethod def utility(cls, arg): ... value = utility(???, arg) What should be passed as the first argument to utility() if the Spam class (as well as its subclasses) is not yet defined? Maybe there is a use case for calling the staticmethod descriptor. Although in this rare case I would apply the staticmethod decorator after using the function in the class body. class Spam: # @staticmethod def utility(arg): ... value = utility(arg) utility = staticmethod(utility) But I don't see a use case for calling the classmethod descriptor. From storchaka at gmail.com Wed Jun 20 13:33:00 2018 From: storchaka at gmail.com (Serhiy Storchaka) Date: Wed, 20 Jun 2018 20:33:00 +0300 Subject: [Python-ideas] staticmethod and classmethod should be callable In-Reply-To: <5B2A24B5.7050900@UGent.be> References: <5B2A24B5.7050900@UGent.be> Message-ID: 20.06.18 12:56, Jeroen Demeyer wrote: > Are there any reasons to *not* make staticmethod and classmethod callable? There were no reasons to make staticmethod and classmethod callable. Just "for consistency" is not considered a good reason. From guido at python.org Wed Jun 20 14:16:20 2018 From: guido at python.org (Guido van Rossum) Date: Wed, 20 Jun 2018 11:16:20 -0700 Subject: [Python-ideas] staticmethod and classmethod should be callable In-Reply-To: References: <5B2A24B5.7050900@UGent.be> Message-ID: Maybe. Though it has surprised me occasionally that pulling a classmethod or staticmethod out of the class dict (like in Jeroen's original example) doesn't work. On Wed, Jun 20, 2018 at 10:34 AM Serhiy Storchaka wrote: > 20.06.18 12:56, Jeroen Demeyer wrote: > > Are there any reasons to *not* make staticmethod and classmethod > callable? > > There were no reasons to make staticmethod and classmethod callable. > Just "for consistency" is not considered a good reason. > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed...
URL: From mrbm74 at gmail.com Wed Jun 20 16:15:22 2018 From: mrbm74 at gmail.com (Martin Bammer) Date: Wed, 20 Jun 2018 22:15:22 +0200 Subject: [Python-ideas] POPT (Python Ob ject Provider Threads) In-Reply-To: References: Message-ID: <7cc5257e-0be9-4b54-343c-692197c50e09@gmail.com> Hi, I saw this free lists implementation this morning in the floatobject and listobject sources, which already handles the reuse of objects. I must admit I like this implementation. It's pretty smart. And yes I'm a little bit disappointed because it reduces the benefit of my idea a lot. Regards, Martin On 2018-06-20 13:29, Nick Coghlan wrote: > On 20 June 2018 at 00:47, Martin Bammer wrote: >> If this idea is well implemented I expect a big performance improvement for >> all Python applications. > Given the free lists already maintained for several builtin types in > the reference implementation, I suspect you may be disappointed on > that front :) > > (While object creation overhead certainly isn't trivial, the > interpreter's already pretty aggressive about repurposing previously > allocated and initialised memory for new instances) > > Cheers, > Nick. > From mrbm74 at gmail.com Wed Jun 20 16:52:55 2018 From: mrbm74 at gmail.com (Martin Bammer) Date: Wed, 20 Jun 2018 22:52:55 +0200 Subject: [Python-ideas] POPT (Python Ob ject Provider Threads) In-Reply-To: References: <7cc5257e-0be9-4b54-343c-692197c50e09@gmail.com> Message-ID: Of course I'm happy that Python is so much optimized. I'm only disappointed that I couldn't help to improve the speed with my idea ;-) On 2018-06-20 22:50, Guido van Rossum wrote: > On Wed, Jun 20, 2018 at 1:16 PM Martin Bammer > wrote: > > I saw this free lists implementation this morning in the > floatobject and > listobject sources, > > which already handles the reuse of objects. I must admit I like this > implementation. It's pretty smart. > > And yes I'm a little bit disappointed because it reduces the > benefit of > my idea a lot. > > > Why would you be disappointed? Aren't you happy that you get much of > the benefit without having to do any work? > > -- > --Guido van Rossum (python.org/~guido ) -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Wed Jun 20 16:50:19 2018 From: guido at python.org (Guido van Rossum) Date: Wed, 20 Jun 2018 13:50:19 -0700 Subject: [Python-ideas] POPT (Python Ob ject Provider Threads) In-Reply-To: <7cc5257e-0be9-4b54-343c-692197c50e09@gmail.com> References: <7cc5257e-0be9-4b54-343c-692197c50e09@gmail.com> Message-ID: On Wed, Jun 20, 2018 at 1:16 PM Martin Bammer wrote: > I saw this free lists implementation this morning in the floatobject and > listobject sources, > > which already handles the reuse of objects. I must admit I like this > implementation. It's pretty smart. > > And yes I'm a little bit disappointed because it reduces the benefit of > my idea a lot. > Why would you be disappointed? Aren't you happy that you get much of the benefit without having to do any work? -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From turnbull.stephen.fw at u.tsukuba.ac.jp Thu Jun 21 03:17:24 2018 From: turnbull.stephen.fw at u.tsukuba.ac.jp (Stephen J. 
Turnbull) Date: Thu, 21 Jun 2018 16:17:24 +0900 Subject: [Python-ideas] [issue33865] [EASY] Missing code page aliases: "unknown encoding: 874" In-Reply-To: References: <1529023151.32.0.947875510639.issue33865@psf.upfronthosting.co.za> <1529135518.17.0.56676864532.issue33865@psf.upfronthosting.co.za> <20180616105924.GE14437@ando.pearwood.info> <23334.19898.962713.160394@turnbull.sk.tsukuba.ac.jp> <64CAF5C0-E717-49B0-BA0D-D282D5A4039C@mac.com> <20180618003446.GK14437@ando.pearwood.info> Message-ID: <23339.20740.892027.849859@turnbull.sk.tsukuba.ac.jp> Ronald Oussoren writes: > Possibly just for the "cp..." encodings, but IMHO only if we confirm > that the code to look for the preferred encoding returns a codepage > number on Windows and changing that code leads to worse results > than adding numeric aliases for the "cp..." encodings. Almost all of the CPxxx encodings have multiple aliases[1], so I just don't see the point unless numeric-only code page designations are baked into default "locales"[2] in official releases by major OS vendors. And probably not even then, since it should be easy enough to provide a proper "locale" and/or PYTHONIOENCODING setting. Of course we should help the reporter figure out what's going on and help them fix it with appropriate system configuration. If that doesn't work, then (and *only then*) we could think about doing a stupid thing. Footnotes: [1] Granted, "874" only has "windows-874" registered with the IANA, so it's kind of salient. Still, if numeric-only aliases were a "thing", surely we'd have heard about it by now---I first encountered Thai encodings in 1990 (ok, that was TIS 620, but windows-874 is basically TIS plus Microsoft punctuation extensions IIRC); Thais do use computers in their native language a lot. [2] Scare quotes to refer to appropriate platform facilities, as neither Windows nor Mac OS is strictly conformant to POSIX on this. From J.Demeyer at UGent.be Thu Jun 21 03:45:08 2018 From: J.Demeyer at UGent.be (Jeroen Demeyer) Date: Thu, 21 Jun 2018 09:45:08 +0200 Subject: [Python-ideas] staticmethod and classmethod should be callable In-Reply-To: References: <5B2A24B5.7050900@UGent.be> Message-ID: <5B2B5784.5000700@UGent.be> On 2018-06-20 19:33, Serhiy Storchaka wrote: > 20.06.18 12:56, Jeroen Demeyer wrote: >> Are there any reasons to *not* make staticmethod and classmethod callable? > > There were no reasons to make staticmethod and classmethod callable. You have to compare the advantages of making them callable vs. the advantages of *not* making them callable. I think that consistency *is* good to have, so I consider that one reason to make them callable. Are there any reasons for *not* making them callable? From storchaka at gmail.com Thu Jun 21 04:33:24 2018 From: storchaka at gmail.com (Serhiy Storchaka) Date: Thu, 21 Jun 2018 11:33:24 +0300 Subject: [Python-ideas] staticmethod and classmethod should be callable In-Reply-To: <5B2B5784.5000700@UGent.be> References: <5B2A24B5.7050900@UGent.be> <5B2B5784.5000700@UGent.be> Message-ID: 21.06.18 10:45, Jeroen Demeyer wrote: > On 2018-06-20 19:33, Serhiy Storchaka wrote: >> 20.06.18 12:56, Jeroen Demeyer wrote: >>> Are there any reasons to *not* make staticmethod and classmethod >>> callable? >> >> There were no reasons to make staticmethod and classmethod callable. > > You have to compare the advantages of making them callable vs. the > advantages of *not* making them callable. You also have to weigh the disadvantages of making them callable and the cost of making them callable.
> I think that consistency *is* good to have, so I consider that one > reason to make them callable. Are there any reasons for *not* making > them callable? Status quo wins. From J.Demeyer at UGent.be Thu Jun 21 04:40:31 2018 From: J.Demeyer at UGent.be (Jeroen Demeyer) Date: Thu, 21 Jun 2018 10:40:31 +0200 Subject: [Python-ideas] staticmethod and classmethod should be callable In-Reply-To: <94fa6702321c4436994b92b16c9e0242@xmail101.UGent.be> References: <5B2A24B5.7050900@UGent.be> <5B2B5784.5000700@UGent.be> <94fa6702321c4436994b92b16c9e0242@xmail101.UGent.be> Message-ID: <5B2B647F.1050701@UGent.be> On 2018-06-21 10:33, Serhiy Storchaka wrote: > Status quo wins. Well, I'm already planning to make changes to staticmethod/classmethod (not right now, but it's on my post-PEP-580 roadmap). So the "status quo" argument doesn't apply. My question is really: assuming that we redesign staticmethod/classmethod anyway, should we make them callable? From songofacandy at gmail.com Thu Jun 21 05:00:14 2018 From: songofacandy at gmail.com (INADA Naoki) Date: Thu, 21 Jun 2018 18:00:14 +0900 Subject: [Python-ideas] staticmethod and classmethod should be callable In-Reply-To: <5B2B647F.1050701@UGent.be> References: <5B2A24B5.7050900@UGent.be> <5B2B5784.5000700@UGent.be> <94fa6702321c4436994b92b16c9e0242@xmail101.UGent.be> <5B2B647F.1050701@UGent.be> Message-ID: > > > My question is really: assuming that we redesign > staticmethod/classmethod anyway, should we make them callable? > I think so. staticmethod and classmethod should affect descriptor behavior, and they should behave as normal functions. >>> @classmethod ... def foo(cls): ... print(cls) ... >>> @staticmethod ... def bar(arg): ... print(arg) ... >>> foo(int) # this should work Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: 'classmethod' object is not callable >>> bar(42) # this should work too Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: 'staticmethod' object is not callable When Python 4 comes, I think we can even throw away the classmethod and staticmethod objects. PyFunction can have a binding flag instead, like METH_CLASS and METH_STATIC for PyCFunction. classmethod and staticmethod would then just be functions which modify the flag. But I'm not sure. Calling in Python is too complicated to fully understand. Regards, -- INADA Naoki -------------- next part -------------- An HTML attachment was scrubbed... URL: From J.Demeyer at UGent.be Thu Jun 21 05:08:10 2018 From: J.Demeyer at UGent.be (Jeroen Demeyer) Date: Thu, 21 Jun 2018 11:08:10 +0200 Subject: [Python-ideas] staticmethod and classmethod should be callable In-Reply-To: References: <5B2A24B5.7050900@UGent.be> <5B2B5784.5000700@UGent.be> <94fa6702321c4436994b92b16c9e0242@xmail101.UGent.be> <5B2B647F.1050701@UGent.be> Message-ID: <5B2B6AFA.9050002@UGent.be> On 2018-06-21 11:00, INADA Naoki wrote: > When Python 4 comes, I think we can even throw away the classmethod and > staticmethod objects. > PyFunction can have a binding flag instead, like METH_CLASS and > METH_STATIC for PyCFunction. > classmethod and staticmethod would then just be functions which modify the flag. One issue with that idea is that staticmethod and classmethod can actually wrap arbitrary objects, not only Python functions. In fact, even this object can be created: >>> staticmethod(42) So in that sense, they behave more like "method" which can also wrap arbitrary callables (in this case, callability is checked).
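To make that asymmetry concrete (interactive session abridged):

>>> import types
>>> staticmethod(42)                 # accepted: no callability check
<staticmethod object at 0x...>
>>> types.MethodType(42, object())   # rejected: "method" does check
Traceback (most recent call last):
  ...
TypeError: first argument must be callable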
So I'm vaguely thinking of putting "method", "staticmethod" and "classmethod" on top of a common base class for things wrapping callables. Jeroen. From Eloi.Gaudry at fft.be Thu Jun 21 11:26:37 2018 From: Eloi.Gaudry at fft.be (Eloi Gaudry) Date: Thu, 21 Jun 2018 15:26:37 +0000 Subject: [Python-ideas] Allow mutable builtin types (optionally) In-Reply-To: <1525764405.24469.1.camel@fft.be> References: <1525707472.12114.1.camel@fft.be> , <1525764405.24469.1.camel@fft.be> Message-ID: This request didn't have a lot of traction, but I still consider this is something that would need to be supported (2 lines of code to be changed; no regression so far with python 2 and python 3). My main points are: - HEAP_TYPE is not really used (as anyone being using it ?) - HEAP_TYPE serves other purposes - extension would benefit for allowing direct access to any of its type attributes Petr, what do you think ? Eloi ________________________________ From: Python-ideas on behalf of Eloi Gaudry Sent: Tuesday, May 8, 2018 9:26:47 AM To: encukou at gmail.com; python-ideas at python.org Subject: Re: [Python-ideas] Allow mutable builtin types (optionally) On Mon, 2018-05-07 at 15:23 -0400, Petr Viktorin wrote: > On 05/07/18 11:37, Eloi Gaudry wrote: > > I mean, to my knowledge, there is no reason why a type should be > > allocated on the heap (https://docs.python.org/2/c-api/typeobj.html > > ) to > > be able to change its attributes at Python level. > > One reason is sub-interpreter support: you can have multiple > interpreters per process, and those shouldn't influence each other. > (see https://docs.python.org/3/c-api/init.html#sub-interpreter-suppor > t) > > With heap types, each sub-interpreter can have its own copy of the > type > object. But with builtins, changes done in one interpreter would be > visible in all the others. Yes, this could be a reason, but if you don't rely on such a feature neither implicitly nor explicitly ? I mean, our types are built-in and should be considered as immutable across interpreters. And we (as most users I guess) are only running one interpreter. In case several intepreters are used, it would make sense to have a non-heap type that would be seen as a singleton across all of them, no ? _______________________________________________ Python-ideas mailing list Python-ideas at python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From brenbarn at brenbarn.net Thu Jun 21 14:05:48 2018 From: brenbarn at brenbarn.net (Brendan Barnwell) Date: Thu, 21 Jun 2018 11:05:48 -0700 Subject: [Python-ideas] staticmethod and classmethod should be callable In-Reply-To: References: <5B2A24B5.7050900@UGent.be> <5B2B5784.5000700@UGent.be> Message-ID: <5B2BE8FC.1090301@brenbarn.net> On 2018-06-21 01:33, Serhiy Storchaka wrote: > 21.06.18 10:45, Jeroen Demeyer ????: >> >On 2018-06-20 19:33, Serhiy Storchaka wrote: >>> >>20.06.18 12:56, Jeroen Demeyer ????: >>>> >>>Are there any reasons to*not* make staticmethod and classmethod >>>> >>>callable? >>> >> >>> >>There were no reasons to make staticmethod and classmethod callable. >> > >> >You have to compare the advantages of making them callable vs. the >> >advantages of*not* making them callable. > You have also to weight the disadvantages of making them callable and > the cost of making them callable. That's what the OP is trying to do. You were just asked if there were any disadvantages or costs. 
Are there or not? If so, what are they? -- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown From songofacandy at gmail.com Fri Jun 22 07:08:11 2018 From: songofacandy at gmail.com (INADA Naoki) Date: Fri, 22 Jun 2018 20:08:11 +0900 Subject: [Python-ideas] String and bytes bitwise operations In-Reply-To: References: Message-ID: Bitwise xor is used for "masking" code like these: https://github.com/PyMySQL/PyMySQL/blob/37eba60439039eff17b32ef1a63b45c25ea28cec/pymysql/connections.py#L139-L146 https://github.com/tornadoweb/tornado/blob/0b2b055061eb4754c80a8d6bc28614b86954e336/tornado/util.py#L470-L471 https://github.com/tornadoweb/tornado/blob/master/tornado/speedups.c#L5 I think implementing it in C is really helpful for protocol library authors. On Thu, May 17, 2018 at 7:54 PM Ken Hilton wrote: > Hi all, > > We all know the bitwise operators: & (and), | (or), ^ (xor), and ~ (not). > We know how they work with numbers: > > 420 ^ 502 > > 110100100 > 111110110 > == XOR == > 001010010 > = 82 > > But it might be useful in some cases to (let's say) xor a string (or > bytestring): > > HELLO ^ world > > 01001000 01000101 01001100 01001100 01001111 > 01110111 01101111 01110010 01101100 01100100 > =================== XOR ==================== > 00111111 00101010 00111110 00100000 00101011 > = ?*> + > > Currently, that's done with this expression for strings: > > >>> ''.join(chr(ord(a) ^ ord(b)) for a, b in zip('HELLO', 'world')) > '?*> +' > > and this expression for bytestrings: > > >>> bytes(a ^ b for a, b in zip(b'HELLO', b'world')) > b'?*> +' > > It would be much more convenient, however, to allow a simple xor of a > string: > > >>> 'HELLO' ^ 'world' > '?*> +' > > or bytestring: > > >>> b'HELLO' ^ b'world' > b'?*> +' > > (All of this applies to other bitwise operators, of course.) > Compatibility issues are a no-brainer - currently, bitwise operators for > strings raise TypeErrors. > > Thanks. > > Suggesting, > Ken > ? Hilton? > ; > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- INADA Naoki -------------- next part -------------- An HTML attachment was scrubbed... URL: From ronaldoussoren at mac.com Fri Jun 22 06:57:23 2018 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Fri, 22 Jun 2018 12:57:23 +0200 Subject: [Python-ideas] [issue33865] [EASY] Missing code page aliases: "unknown encoding: 874" In-Reply-To: <23339.20740.892027.849859@turnbull.sk.tsukuba.ac.jp> References: <1529023151.32.0.947875510639.issue33865@psf.upfronthosting.co.za> <1529135518.17.0.56676864532.issue33865@psf.upfronthosting.co.za> <20180616105924.GE14437@ando.pearwood.info> <23334.19898.962713.160394@turnbull.sk.tsukuba.ac.jp> <64CAF5C0-E717-49B0-BA0D-D282D5A4039C@mac.com> <20180618003446.GK14437@ando.pearwood.info> <23339.20740.892027.849859@turnbull.sk.tsukuba.ac.jp> Message-ID: <5CC091AA-00BC-4A6A-B7F6-0D2C35D05C86@mac.com> An HTML attachment was scrubbed... 
URL: From ncoghlan at gmail.com Fri Jun 22 08:21:44 2018 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 22 Jun 2018 22:21:44 +1000 Subject: [Python-ideas] staticmethod and classmethod should be callable In-Reply-To: References: <5B2A24B5.7050900@UGent.be> <20180620114335.GA14437@ando.pearwood.info> <7a02a413-794e-8b88-1ac1-3b39bb775243@gmail.com> Message-ID: On 21 June 2018 at 03:27, Serhiy Storchaka wrote: > 20.06.18 20:07, Guido van Rossum ????: >> >> Maybe we're misunderstanding each other? I would think that calling the >> classmethod object directly would just call the underlying function, so this >> should have to call utility() with a single arg. This is really the only >> option, since the descriptor doesn't have any context. >> >> In any case it should probably `def utility(cls)` in that example to >> clarify that the first arg to a class method is a class. > > > Sorry, I missed the cls parameter in the definition of utility(). > > class Spam: > @classmethod > def utility(cls, arg): > ... > > value = utility(???, arg) > > What should be passed as the first argument to utility() if the Spam class > (as well as its subclasses) is not defined still? That would depend on the definition of `utility` (it may simply not be useful to call it in the class body, which is also the case with most instance methods). The more useful symmetry improvement is to the consistency of behaviour between instance methods on class instances and the behaviour of class methods on classes themselves. So I don't think this is a huge gain in expressiveness, but I do think it's a low cost consistency improvement that should make it easier to start unifying more of the descriptor handling logic internally. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From tjreedy at udel.edu Fri Jun 22 12:25:53 2018 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 22 Jun 2018 12:25:53 -0400 Subject: [Python-ideas] String and bytes bitwise operations In-Reply-To: References: Message-ID: On 6/22/2018 7:08 AM, INADA Naoki wrote: > Bitwise xor is used for "masking" code like these: > > https://github.com/PyMySQL/PyMySQL/blob/37eba60439039eff17b32ef1a63b45c25ea28cec/pymysql/connections.py#L139-L146 This points to a function _my_crypt that is O(n*n) because of using bytes.append. Using bytearray.append makes it O(n). 
--- import random import struct import timeit range_type = range def _my_crypt(message1, message2): length = len(message1) result = b'' for i in range_type(length): x = (struct.unpack('B', message1[i:i+1])[0] ^ struct.unpack('B', message2[i:i+1])[0]) result += struct.pack('B', x) return result def _my_crypt2(message1, message2): length = len(message1) result = bytearray() for i in range_type(length): x = (struct.unpack('B', message1[i:i+1])[0] ^ struct.unpack('B', message2[i:i+1])[0]) result += struct.pack('B', x) return bytes(result) def make(n): result = bytearray() for i in range(n): result.append(random.randint(0, 255)) return result for m in (10, 100, 1000, 10_000, 100_000, 1000_000): m1 = make(m) m2 = make(m) n = 1000_000 // m print(f'bytes len {m}, timeit reps {n}') print('old ', timeit.timeit('_my_crypt(m1, m2)', number = n, globals=globals())) print('new ', timeit.timeit('_my_crypt2(m1, m2)', number = n, globals=globals())) --- prints bytes len 10, timeit reps 100000 old 1.2277594129999998 new 1.2174212309999999 bytes len 100, timeit reps 10000 old 1.145566423 new 1.0924002120000003 bytes len 1000, timeit reps 1000 old 1.2860306190000002 new 1.1168685839999999 bytes len 10000, timeit reps 100 old 1.6543344650000003 new 1.118191714 bytes len 100000, timeit reps 10 old 4.2568492110000005 new 1.1266137560000011 bytes len 1000000, timeit reps 1 old 60.651238144000004 new 1.1315020199999992 I tried to submit this to https://github.com/PyMySQL/PyMySQL/issues/new but [Submit] does not work for me. -- Terry Jan Reedy From songofacandy at gmail.com Fri Jun 22 13:27:12 2018 From: songofacandy at gmail.com (INADA Naoki) Date: Sat, 23 Jun 2018 02:27:12 +0900 Subject: [Python-ideas] String and bytes bitwise operations In-Reply-To: References: Message-ID: Hi Terry, Thanks, but I didn't care because my password is not so long. I just want to illustrate real world bytes xor usage. BTW, New MySQL auth methods (sha256 and caching_sha2) use bytes xor too. For performance point of view, websocket masking is performance critical. Tornado uses extension module only for it. If bytearray ^= bytes is supported, websocket frame masking may look like: frame ^= mask * ((len(frame)+3)//4) # mask is 4 bytes long On Sat, Jun 23, 2018 at 1:26 AM Terry Reedy wrote: > On 6/22/2018 7:08 AM, INADA Naoki wrote: > > Bitwise xor is used for "masking" code like these: > > > > > https://github.com/PyMySQL/PyMySQL/blob/37eba60439039eff17b32ef1a63b45c25ea28cec/pymysql/connections.py#L139-L146 > > This points to a function _my_crypt that is O(n*n) because of using > bytes.append. Using bytearray.append makes it O(n). 
> --- > import random > import struct > import timeit > > range_type = range > > def _my_crypt(message1, message2): > length = len(message1) > result = b'' > for i in range_type(length): > x = (struct.unpack('B', message1[i:i+1])[0] ^ > struct.unpack('B', message2[i:i+1])[0]) > result += struct.pack('B', x) > return result > > def _my_crypt2(message1, message2): > length = len(message1) > result = bytearray() > for i in range_type(length): > x = (struct.unpack('B', message1[i:i+1])[0] ^ > struct.unpack('B', message2[i:i+1])[0]) > result += struct.pack('B', x) > return bytes(result) > > def make(n): > result = bytearray() > for i in range(n): > result.append(random.randint(0, 255)) > return result > > for m in (10, 100, 1000, 10_000, 100_000, 1000_000): > m1 = make(m) > m2 = make(m) > > n = 1000_000 // m > print(f'bytes len {m}, timeit reps {n}') > print('old ', timeit.timeit('_my_crypt(m1, m2)', number = n, > globals=globals())) > print('new ', timeit.timeit('_my_crypt2(m1, m2)', number = n, > globals=globals())) > --- > prints > > bytes len 10, timeit reps 100000 > old 1.2277594129999998 > new 1.2174212309999999 > bytes len 100, timeit reps 10000 > old 1.145566423 > new 1.0924002120000003 > bytes len 1000, timeit reps 1000 > old 1.2860306190000002 > new 1.1168685839999999 > bytes len 10000, timeit reps 100 > old 1.6543344650000003 > new 1.118191714 > bytes len 100000, timeit reps 10 > old 4.2568492110000005 > new 1.1266137560000011 > bytes len 1000000, timeit reps 1 > old 60.651238144000004 > new 1.1315020199999992 > > I tried to submit this to https://github.com/PyMySQL/PyMySQL/issues/new > but [Submit] does not work for me. > > -- > Terry Jan Reedy > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- INADA Naoki -------------- next part -------------- An HTML attachment was scrubbed... URL: From random832 at fastmail.com Fri Jun 22 16:30:56 2018 From: random832 at fastmail.com (Random832) Date: Fri, 22 Jun 2018 16:30:56 -0400 Subject: [Python-ideas] staticmethod and classmethod should be callable In-Reply-To: References: <5B2A24B5.7050900@UGent.be> <5B2B5784.5000700@UGent.be> <94fa6702321c4436994b92b16c9e0242@xmail101.UGent.be> <5B2B647F.1050701@UGent.be> Message-ID: <1529699456.1328270.1417373760.7CB03599@webmail.messagingengine.com> On Thu, Jun 21, 2018, at 05:00, INADA Naoki wrote: > When Python 4, I think we can even throw away classmethod and staticmethod > object. > PyFunction can have binding flag instead, like METH_CLASS and METH_STATIC > for PyCFunction. > classmethod and staticmethod is just a function which modify the flag. I can't remember the details, but I remember once having a reason to need to use staticmethod to store an attribute which happened to be a function. From qlixed at gmail.com Fri Jun 22 20:31:38 2018 From: qlixed at gmail.com (Ezequiel Brizuela [aka EHB or qlixed]) Date: Fri, 22 Jun 2018 21:31:38 -0300 Subject: [Python-ideas] Secure string disposal (maybe other inmutable seq types too?) Message-ID: As all the string in python are immutable, is impossible to overwrite the value or to make a "secure disposal" (overwrite-then-free) of a string using something like: >>> a = "something to hide" >>> a = "x"*len(a) This will lead on the process memory "something to hide" and "x" repeated len(a) times. - Who cares? Why is this relevant? 
Well, if you handle sensitive information like credit card numbers, passwords, PINs, or other kinds of data, you want to minimize the chance of leaking any of it. - How can this "leak" happen? If you get a core/memory dump of an app handling sensitive information, all the information in that core is exposed! - Well, so what can we do about this? I propose to make the required changes to the string objects to add an option to overwrite the underlying buffer. To do so: * Add a read-only wiped attribute that is set when the string is overwritten. * Add a wipe() method that overwrites the internal string buffer. So this will work like this: >>> pwd = getpass.getpass('Set your password:') # could be other sensitive data. >>> encrypted_pwd = crypt.crypt(pwd) # crypt() just as an example. >>> pwd.wiped # Check if pwd was wiped. False >>> pwd.wipe() # Overwrite the underlying buffer >>> pwd.wiped # Check if pwd was wiped. True >>> print(pwd) # Print noise (or empty str?) >>> del pwd # Now it is in the hands of the GC. The wipe() method immediately overwrites the underlying string buffer and sets wiped to True, so if the string is used further this can be checked to confirm that the change was made by a wipe and not by some other procedure. Initially the idea is to overwrite the string with the Unicode NULL code point, but this could be changed by letting the user parametrize it through the wipe() method. An alternative is to add a new exception, "WipedError", that would be thrown when the string is accessed again, but I find that too disruptive for the normal/standard string workflow. Quick & Dirty FAQ: - You're doing it wrong! The correct way to do that securely is: >>> pwd = crypt.crypt(getpass.getpass('Set your password')) Don't you know that, fool? Well, no: the code still generates a temporary string in memory to pass to crypt(). But now this string is lying there and can't be reached for an overwrite with wipe(). - Why not create a new type, like in C# or Java? That tends to disrupt the usual string workflow. Also, the idea here is not to offer secure storage of strings in memory, because there are already a few mechanisms to achieve that with the current Python base. I just want to have the ability to overwrite the buffer. - Why not use one of the standard overwrite algorithms, like DoD 5220 or MIL-STD-414? Standards of this kind are usually oriented toward persistent storage, especially magnetic media, where the data could be "easily" recovered. But this could be an option, implemented by allowing a user-supplied function to do the overwrite work inside the wipe() method. - This is far beyond the almost implementation-agnostic definition of the Python language. How about making a module with this functionality and leaving the language as is? Well, I already did: https://github.com/qlixed/python-memwiper/ But I hit a lot of problems on the road. I have been working on this in my free time over the last year and made it "almost" work, but that is not relevant to the proposal. I think that this kind of security concern needs to be tackled from within the language itself, especially when the language has a GC. I firmly believe that security and protections need to be part of the "batteries included" offer of Python, and I think that this is one little thing that could help a lot to secure our apps. Let me know what you think!
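For reference, the kind of CPython-specific trick a module has to resort to today looks roughly like the sketch below (illustrative only, not necessarily how python-memwiper is implemented; it assumes CPython's compact ASCII string layout and will corrupt interned strings if misused):

import ctypes
import sys

def unsafe_wipe(s):
    # Overwrite the internal buffer of a compact ASCII str in place.
    # CPython-only and wildly unsafe: never use it on interned strings
    # (literals, identifiers), since those are shared everywhere.
    if not isinstance(s, str) or any(ord(c) > 127 for c in s):
        raise ValueError("sketch only handles ASCII strings")
    # For a compact ASCII string the character data sits at the end of
    # the object, followed by a single NUL terminator.
    offset = sys.getsizeof(s) - len(s) - 1
    ctypes.memset(id(s) + offset, 0, len(s))

pwd = "".join(["s", "3", "cr", "3", "t"])  # built at runtime to avoid a shared literal
unsafe_wipe(pwd)
assert pwd == "\x00" * 6  # the original characters are gone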
~ Ezequiel (Ezekiel) Brizuela [ aka Qlixed ] ~ -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Fri Jun 22 20:45:50 2018 From: rosuav at gmail.com (Chris Angelico) Date: Sat, 23 Jun 2018 10:45:50 +1000 Subject: [Python-ideas] Secure string disposal (maybe other inmutable seq types too?) In-Reply-To: References: Message-ID: On Sat, Jun 23, 2018 at 10:31 AM, Ezequiel Brizuela [aka EHB or qlixed] wrote: > I propose to make the required changes on the string objects to add an > option to overwrite the underlying buffer. To do so: > > * Add a wiped as an attribute that is read-only to be set when the string > is overwrited. > * Add a wipe() method that overwrite the internal string buffer. Since strings are immutable, it's entirely possible for them to be shared in various ways. Having the string be wiped while still existing seems to be a risky approach. > So this will work like this: > >>>> pwd =getpass.getpass('Set your password:') # could be other sensitive >>>> data. >>>> encrypted_pwd = crypt.crypt(pwd) # crypt() just as example. >>>> pwd.wiped # Check if pwd was wiped. > False >>>> pwd.wipe() # Overwrite the underlying buffer >>>> pwd.wiped # Check if pwd was wiped. > True >>>> print(pwd) # Print noise (or empty str?) >>>> del pwd # Now is in hands of the GC. Would it suffice to flag the string as "this contains sensitive data, please overwrite its buffer when it gets deallocated"? The only difference, in your example, would be that the last print would show the original data, and the wipe would happen afterwards. Advantages of this approach include that getpass can automatically flag the string as sensitive, and the "sensitive" flag can infect other strings (so <> would be automatically flagged to be wiped). Downside: You can't say "I'm done with this string, destroy it immediately". ChrisA From guido at python.org Fri Jun 22 21:30:35 2018 From: guido at python.org (Guido van Rossum) Date: Fri, 22 Jun 2018 18:30:35 -0700 Subject: [Python-ideas] Secure string disposal (maybe other inmutable seq types too?) In-Reply-To: References: Message-ID: A wipe() method that mutates a string while it can still be referenced elsewhere is unacceptable -- it breaks an abstraction that is widely assumed. Chris's proposal can be implemented, it would set a hidden flag. Hopefully there's room for the flag without increasing the object header size. On Fri, Jun 22, 2018 at 5:46 PM Chris Angelico wrote: > On Sat, Jun 23, 2018 at 10:31 AM, Ezequiel Brizuela [aka EHB or > qlixed] wrote: > > I propose to make the required changes on the string objects to add an > > option to overwrite the underlying buffer. To do so: > > > > * Add a wiped as an attribute that is read-only to be set when the > string > > is overwrited. > > * Add a wipe() method that overwrite the internal string buffer. > > Since strings are immutable, it's entirely possible for them to be > shared in various ways. Having the string be wiped while still > existing seems to be a risky approach. > > > So this will work like this: > > > >>>> pwd =getpass.getpass('Set your password:') # could be other sensitive > >>>> data. > >>>> encrypted_pwd = crypt.crypt(pwd) # crypt() just as example. > >>>> pwd.wiped # Check if pwd was wiped. > > False > >>>> pwd.wipe() # Overwrite the underlying buffer > >>>> pwd.wiped # Check if pwd was wiped. > > True > >>>> print(pwd) # Print noise (or empty str?) > >>>> del pwd # Now is in hands of the GC. 
> > Would it suffice to flag the string as "this contains sensitive data, > please overwrite its buffer when it gets deallocated"? The only > difference, in your example, would be that the last print would show > the original data, and the wipe would happen afterwards. Advantages of > this approach include that getpass can automatically flag the string > as sensitive, and the "sensitive" flag can infect other strings (so > <> would be automatically flagged to be wiped). Downside: > You can't say "I'm done with this string, destroy it immediately". > > ChrisA > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjreedy at udel.edu Fri Jun 22 21:32:42 2018 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 22 Jun 2018 21:32:42 -0400 Subject: [Python-ideas] Secure string disposal (maybe other inmutable seq types too?) In-Reply-To: References: Message-ID: On 6/22/2018 8:31 PM, Ezequiel Brizuela [aka EHB or qlixed] wrote: > As all the string in python are immutable, is impossible to overwrite > the value Not if one uses ctypes. Is that what you did? > ? Well I already do it: > > https://github.com/qlixed/python-memwiper/ > But i hit a lot of problems in the road, I was working on me free time > over the last year on this and make it "almost" work, but that is not > relevant to the proposal. I think it is. A very small fraction of Python users need such wiping. And I doubt that it can be complete. For instance, I suspect that a password entered into getpass, for instance, first exists in OS form before being copied into a Python string objects. Wiping the Python string would not wipe the original copy. So this really should be attacked at the OS level, not the language level. I have read that phones use separate memory for critical data to try to protect critical data. -- Terry Jan Reedy From greg.ewing at canterbury.ac.nz Fri Jun 22 21:33:59 2018 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 23 Jun 2018 13:33:59 +1200 Subject: [Python-ideas] Secure string disposal (maybe other inmutable seq types too?) In-Reply-To: References: Message-ID: <5B2DA387.8030405@canterbury.ac.nz> Chris Angelico wrote: > Downside: > You can't say "I'm done with this string, destroy it immediately". Also it would be hard to be sure there wasn't another copy of the data somewhere from a time before you got around to marking the string as sensitive, e.g. in a file buffer. -- Greg From rosuav at gmail.com Fri Jun 22 21:35:48 2018 From: rosuav at gmail.com (Chris Angelico) Date: Sat, 23 Jun 2018 11:35:48 +1000 Subject: [Python-ideas] Secure string disposal (maybe other inmutable seq types too?) In-Reply-To: References: Message-ID: On Sat, Jun 23, 2018 at 11:30 AM, Guido van Rossum wrote: > Chris's proposal can be implemented, it would set a hidden flag. Hopefully > there's room for the flag without increasing the object header size. If I'm reading the include file correctly, the 'state' bitstruct has eight bits with defined meanings, and then 24 of padding to ensure alignment. Allocating one of those bits to say "sensitive" should be 100% backward-compatible. 
From steve at pearwood.info  Fri Jun 22 21:45:47 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 23 Jun 2018 11:45:47 +1000
Subject: [Python-ideas] Secure string disposal (maybe other immutable seq types too?)
In-Reply-To: <5B2DA387.8030405@canterbury.ac.nz>
References: <5B2DA387.8030405@canterbury.ac.nz>
Message-ID: <20180623014547.GE14437@ando.pearwood.info>

On Sat, Jun 23, 2018 at 01:33:59PM +1200, Greg Ewing wrote:
> Chris Angelico wrote:
> > Downside:
> > You can't say "I'm done with this string, destroy it immediately".
>
> Also it would be hard to be sure there wasn't another copy of the data
> somewhere from a time before you got around to marking the string as
> sensitive, e.g. in a file buffer.

Don't let the perfect be the enemy of the good.

We know there's at least one place that a string could leak private
information. Just because there could hypothetically be other such
places doesn't make it useless to wipe that known potential leak.

Attackers are not always omniscient. Even if an application leaks
private data in ten places, some attacker may only know of, or be
capable of, attacking *one* leak. If we can, we ought to plug it, and
leave those hypothetical other leaks for another day.

(Burglars can lift the tiles off my roof, climb into the ceiling, and
hence down into my house. Nevertheless I still lock my front door.)

--
Steve

From tjreedy at udel.edu  Sat Jun 23 00:00:55 2018
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat, 23 Jun 2018 00:00:55 -0400
Subject: [Python-ideas] Secure string disposal (maybe other immutable seq types too?)
In-Reply-To:
References:
Message-ID:

On 6/22/2018 8:45 PM, Chris Angelico wrote:
> Would it suffice to flag the string as "this contains sensitive data,
> please overwrite its buffer when it gets deallocated"? [...]
> Downside:
> You can't say "I'm done with this string, destroy it immediately".

But one can be careful about creating references, and in current
CPython, deleting the last reference does mean destroy, and possibly
wipe, immediately.

--
Terry Jan Reedy

From rosuav at gmail.com  Sat Jun 23 00:08:01 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Sat, 23 Jun 2018 14:08:01 +1000
Subject: [Python-ideas] Secure string disposal (maybe other immutable seq types too?)
In-Reply-To:
References:
Message-ID:

On Sat, Jun 23, 2018 at 2:00 PM, Terry Reedy wrote:
> But one can be careful about creating references, and in current
> CPython, deleting the last reference does mean destroy, and possibly
> wipe, immediately.

Yes, you can, for the most part. It's certainly possible to get stung
(e.g. exceptions retaining locals), but mostly it should be fine. How
will other Pythons handle this?

ChrisA

From guido at python.org  Sat Jun 23 00:42:42 2018
From: guido at python.org (Guido van Rossum)
Date: Fri, 22 Jun 2018 21:42:42 -0700
Subject: [Python-ideas] Secure string disposal (maybe other immutable seq types too?)
In-Reply-To:
References:
Message-ID:

On Fri, Jun 22, 2018 at 9:11 PM Chris Angelico wrote:
> How will other Pythons handle this?

It could be optional behavior. ISTR that in Jython, strings are pretty
much just Java strings. Does Java have such a feature? If not, do Java
apps worry about this? If not, perhaps Python needn't either.

--
--Guido van Rossum (python.org/~guido)

From njs at pobox.com  Sat Jun 23 01:21:47 2018
From: njs at pobox.com (Nathaniel Smith)
Date: Fri, 22 Jun 2018 22:21:47 -0700
Subject: [Python-ideas] Secure string disposal (maybe other immutable seq types too?)
In-Reply-To: <20180623014547.GE14437@ando.pearwood.info>
References: <5B2DA387.8030405@canterbury.ac.nz> <20180623014547.GE14437@ando.pearwood.info>
Message-ID:

On Fri, Jun 22, 2018 at 6:45 PM, Steven D'Aprano wrote:
> On Sat, Jun 23, 2018 at 01:33:59PM +1200, Greg Ewing wrote:
> > Also it would be hard to be sure there wasn't another copy of the
> > data somewhere from a time before you got around to marking the
> > string as sensitive, e.g. in a file buffer.
>
> Don't let the perfect be the enemy of the good.

That's true, but for security features it's important to have a proper
analysis of the threat and when the mitigation will and won't work;
otherwise, you don't know whether it's even "good", and you don't know
how to educate people on what they need to do to make effective use of
it (or where it's not worth bothering).

Another issue: I believe it'd be impossible for this proposal to work
correctly on implementations with a compacting GC (e.g., PyPy), because
with a compacting GC strings might get copied around in memory during
their lifetime. And crucially, this might have already happened before
the interpreter was told that a particular string object contained
sensitive data. I'm guessing this is part of why Java and C# use a
separate type.

There's a lot of prior art on this in other languages/environments, and
a lot of experts who've thought hard about it. Python-{ideas,dev}
doesn't have a lot of security experts, so I'd very much want to see
some review of that work before we go running off designing something
ad hoc.

The PyCA cryptography library has some discussion in their docs:
https://cryptography.io/en/latest/limitations/

One possible way to move the discussion forward would be to ask the
pyca devs what kind of API they'd like to see in the interpreter, if
any.

-n

--
Nathaniel J. Smith -- https://vorpus.org

From gadgetsteve at live.co.uk  Sat Jun 23 03:31:19 2018
From: gadgetsteve at live.co.uk (Steve Barnes)
Date: Sat, 23 Jun 2018 07:31:19 +0000
Subject: [Python-ideas] Secure string disposal (maybe other immutable seq types too?)
In-Reply-To:
References: <5B2DA387.8030405@canterbury.ac.nz> <20180623014547.GE14437@ando.pearwood.info>
Message-ID:

On 23/06/2018 06:21, Nathaniel Smith wrote:
> That's true, but for security features it's important to have a proper
> analysis of the threat and when the mitigation will and won't work
> [...]

All good points. I would think that for this to be effective the
string, or secure string, would need to be marked at create time, and
all operations on it would have to honour the wipe-before-free flag and
carry it forward into any copies made. This needs to be implemented at
a very low level, so that, e.g., adding to a string (which makes a copy
if the string is growing beyond the current allocation) will have to
check for the flag, set it on the new string, copy the expanded
contents, and then wipe the old buffer before freeing it. Any normal
string which is being added to a secure string should probably get the
flag added automatically as well, and adding or assigning a secure
string to a normal string should automatically make the target string
secure too.

This sounds like a lot of overhead to be adding to every string
operation, secure or not. That being the case, it probably makes a lot
of sense to use a separate base class - while this will result in a
certain amount of bloat in software that makes use of it, it avoids the
overhead of checking the flag in the vast majority of software which
does not.

I do know that I have heard in the past of security breaches, in both C
and Pascal strings, where the problem was tracked down to the "delete"
mechanism being just setting the first byte to 0x00 (which would report
a length of 0 for a Pascal string while leaving its contents untouched,
and would lose only the first character of a C string).

--
Steve (Gadget) Barnes
Any opinions in this message are my personal opinions and do not
reflect those of my employer.
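Steve's NUL-truncation anecdote is easy to reproduce. A small Python
emulation of the C behaviour he describes, using a bytearray as a
stand-in for a mutable C buffer:

    secret = bytearray(b"hunter2")
    secret[0] = 0             # the naive "delete": C's strlen() now reports 0
    print(bytes(secret))      # b'\x00unter2' -- six of the seven bytes survive
    print(bytes(secret[1:]))  # b'unter2': trivially recovered from a dump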
From p.f.moore at gmail.com  Sat Jun 23 07:13:32 2018
From: p.f.moore at gmail.com (Paul Moore)
Date: Sat, 23 Jun 2018 12:13:32 +0100
Subject: [Python-ideas] Secure string disposal (maybe other immutable seq types too?)
In-Reply-To:
References:
Message-ID:

On 23 June 2018 at 01:31, Ezequiel Brizuela [aka EHB or qlixed] wrote:
> As all the strings in Python are immutable, it is impossible to
> overwrite the value or to make a "secure disposal"
> (overwrite-then-free) of a string using something like:
[...]
> I propose to make the required changes on the string objects to add an
> option to overwrite the underlying buffer. To do so:

Is there any reason this could not be implemented as a 3rd party class
(implemented in C, of course) which subclasses str? So you'd do

    from safestring import SafeStr

    a = SafeStr("my secret data")
    ... work with a as if it were a string
    del a

When the refcount of a goes to zero, before releasing the memory, the
custom class wipes that memory. There are obvious questions around

    theres_a_copy_here = "prefix " + a + " suffix"

which will copy the secure data, but those issues will be just as much
of a problem with a change to the builtin string, unless you propose
some mechanism for propagating "secureness" from one value to another.
And then you get questions like: is a[0] still "secret"? What about
sha256(a)?

Having a mechanism for handling this seems like a good idea, but my
feeling is that even with a mechanism, handling secure data needs care
and specialised knowledge from the programmer, and supporting that is
better done with a dedicated class rather than having the language
runtime try to solve the problem automatically (which runs the risk
that a naive programmer expects the language to do the job, and then
*doesn't* think about the risks).

Paul

From p.f.moore at gmail.com  Sat Jun 23 07:16:43 2018
From: p.f.moore at gmail.com (Paul Moore)
Date: Sat, 23 Jun 2018 12:16:43 +0100
Subject: [Python-ideas] Secure string disposal (maybe other immutable seq types too?)
In-Reply-To:
References:
Message-ID:

On 23 June 2018 at 12:13, Paul Moore wrote:
> On 23 June 2018 at 01:31, Ezequiel Brizuela [aka EHB or qlixed] wrote:
> > As all the strings in Python are immutable, it is impossible to
> > overwrite the value or to make a "secure disposal"
> > (overwrite-then-free) of a string using something like:

By the way, Perl has a concept of "tainted strings" which track string
values (in Perl's case, whether they came from "external input") in a
similar way. Anyone intending to take this proposal forward should
almost certainly research that case - my recollection is that
taintedness was a mixed success, in that it at best only partially
solved the problems and was quite complex to implement and document.
But it's probably 15 years or more since I looked at Perl's taint
mechanism, so don't trust my recollection without checking :-)

Paul
From mal at egenix.com  Sat Jun 23 08:11:15 2018
From: mal at egenix.com (M.-A. Lemburg)
Date: Sat, 23 Jun 2018 14:11:15 +0200
Subject: [Python-ideas] Secure string disposal (maybe other immutable seq types too?)
In-Reply-To:
References:
Message-ID: <9922e45f-f949-51b3-711d-2d26cdf6aca8@egenix.com>

On 23.06.2018 02:45, Chris Angelico wrote:
> Would it suffice to flag the string as "this contains sensitive data,
> please overwrite its buffer when it gets deallocated"? [...] the
> "sensitive" flag can infect other strings (so <> would be
> automatically flagged to be wiped). Downside: You can't say "I'm done
> with this string, destroy it immediately".

I think the flag is an excellent idea.

I'm not so sure about the automatic propagation of the flag, though.
If a string gets interned with the flag set, this could lead to a lot
of other strings receiving the flag without intent.

Then again, you will probably not want such strings to be interned in
the first place.

--
Marc-Andre Lemburg
eGenix.com

From stephanh42 at gmail.com  Sat Jun 23 09:57:50 2018
From: stephanh42 at gmail.com (Stephan Houben)
Date: Sat, 23 Jun 2018 15:57:50 +0200
Subject: [Python-ideas] Secure string disposal (maybe other immutable seq types too?)
In-Reply-To:
References:
Message-ID:

Would it not be much simpler and more secure to just disable core
dumps? See /etc/security/limits.conf on Linux.

If the attacker can cause and read a core dump, the game seems over
anyway, since sooner or later he will catch the core dump at a time
when the string was not yet deleted.

Stephan

On Sat, Jun 23, 2018 at 02:32 Ezequiel Brizuela [aka EHB or qlixed]
<qlixed at gmail.com> wrote:
> As all the strings in Python are immutable, it is impossible to
> overwrite the value or to make a "secure disposal"
> (overwrite-then-free) of a string using something like:
>
> >>> a = "something to hide"
> >>> a = "x"*len(a)
>
> This will leave in the process memory "something to hide" and "x"
> repeated len(a) times.
>
> - Who cares? Why is this relevant?
>   Well, if you handle sensitive information like CC numbers,
> passwords, PINs, or other kinds of data, you want to minimize the
> chance of leaking any of it.
>
> - How can this "leak" happen?
>   If you get a core/memory dump of an app handling sensitive
> information, you will get all the information in that core exposed!
>
> - Well, so what can we do about this?
>   I propose to make the required changes on the string objects to add
> an option to overwrite the underlying buffer. To do so:
>
>   * Add a read-only "wiped" attribute that is set when the string is
>     overwritten.
>   * Add a wipe() method that overwrites the internal string buffer.
>
> So this will work like this:
>
> >>> pwd = getpass.getpass('Set your password:')  # could be other sensitive data.
> >>> encrypted_pwd = crypt.crypt(pwd)  # crypt() just as an example.
> >>> pwd.wiped  # Check if pwd was wiped.
> False
> >>> pwd.wipe()  # Overwrite the underlying buffer.
> >>> pwd.wiped  # Check if pwd was wiped.
> True
> >>> print(pwd)  # Prints noise (or an empty str?)
> >>> del pwd  # Now it is in the hands of the GC.
>
> The wipe method immediately overwrites the underlying string buffer,
> setting wiped to True for reference, so if the string is further used
> this can be checked to confirm that the change was made by a wipe and
> not by another procedure. Also, initially the idea is to use the
> unicode NULL code point to overwrite the string, but this could be
> changed to let the user parametrize it via the wipe() method.
> An alternative to this is to add a new "WipedError" exception that
> could be thrown wherever the string is accessed again, but I find that
> approach too disruptive for a normal/standard string workflow.
>
> Quick & Dirty FAQ:
>
> - You're doing it wrong! The correct code to do that in a secure way
>   is:
> >>> pwd = crypt.crypt(getpass.getpass('Set your password'))
> Don't you know that, fool?
>
>   Well, no: the code still generates a temporary string in memory to
> pass to crypt. But now this string is lying there and can't be
> accessed for an overwrite with wipe().
>
> - Why not create a new type like in C# or Java?
>
>   I find that this tends to disrupt the usual workflow of string
> usage. Also, the idea here is not to offer secure storage of strings
> in memory, because there are already a few mechanisms to achieve that
> with the current Python base. I just want to have the ability to
> overwrite the buffer.
>
> - Why not use one of the standard overwrite algorithms like DoD 5220
>   or MIL-STD-414?
>
>   Those kinds of standards are usually oriented towards persistent
> storage, especially magnetic media, where the data could be "easily"
> recovered. But this could be an option, implemented by letting the
> user plug a function that does the overwrite work into the wipe
> method.
>
> - This is far beyond the almost implementation-agnostic definition of
>   the Python language. How about you make a module with this
>   functionality and leave the language as is?
>
>   Well, I already did:
>
>   https://github.com/qlixed/python-memwiper/
>
>   But I hit a lot of problems on the road. I was working in my free
> time over the last year on this and made it "almost" work, but that is
> not relevant to the proposal.
>   I think that this kind of security concern needs to be tackled from
> within the language itself, especially when the language has GC. I
> firmly believe that security and protections need to be part of the
> "with batteries" offer of Python, and I think that this is one little
> thing that could help a lot to secure our apps.
>   Let me know what you think!
>
> ~ Ezequiel (Ezekiel) Brizuela [ aka Qlixed ] ~
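For what it's worth, Stephan's mitigation can also be applied from
inside the process, without editing limits.conf, via the standard
resource module. A Unix-only sketch -- this guards against accidental
core files, not against an attacker who can already read the live
process:

    import resource

    # Ask the kernel never to write a core file for this process
    # (children that inherit the limit are covered too).
    resource.setrlimit(resource.RLIMIT_CORE, (0, 0))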
From qlixed at gmail.com  Sat Jun 23 15:11:04 2018
From: qlixed at gmail.com (Ezequiel Brizuela [aka EHB or qlixed])
Date: Sat, 23 Jun 2018 16:11:04 -0300
Subject: [Python-ideas] Secure string disposal (maybe other immutable seq types too?)
In-Reply-To:
References:
Message-ID:

On Fri, Jun 22, 2018 at 21:46 Chris Angelico wrote:
> Since strings are immutable, it's entirely possible for them to be
> shared in various ways. Having the string be wiped while still
> existing seems to be a risky approach.
> (...)
> Downside:
> You can't say "I'm done with this string, destroy it immediately".

That is the main issue with this approach. The proposed one is
immediate, but I understand that it is risky.

From christian at python.org  Sat Jun 23 15:28:02 2018
From: christian at python.org (Christian Heimes)
Date: Sat, 23 Jun 2018 21:28:02 +0200
Subject: [Python-ideas] Secure string disposal (maybe other immutable seq types too?)
In-Reply-To:
References:
Message-ID:

On 2018-06-23 15:57, Stephan Houben wrote:
> Would it not be much simpler and more secure to just disable core
> dumps?
>
> /etc/security/limits.conf on Linux.
>
> If the attacker can cause and read a core dump, the game seems over
> anyway, since sooner or later he will catch the core dump at a time
> when the string was not yet deleted.

That's not sufficient. You'd also need to ensure that the memory page
is never paged to disk or visible to gdb, ptrace, or any other kind of
debugger. POSIX has mprotect(), but it doesn't necessarily work with
malloc()ed memory and requires mmap() memory.

Christian

From christian at python.org  Sat Jun 23 15:54:43 2018
From: christian at python.org (Christian Heimes)
Date: Sat, 23 Jun 2018 21:54:43 +0200
Subject: [Python-ideas] Secure string disposal (maybe other immutable seq types too?)
In-Reply-To:
References: <5B2DA387.8030405@canterbury.ac.nz> <20180623014547.GE14437@ando.pearwood.info>
Message-ID:

On 2018-06-23 07:21, Nathaniel Smith wrote:
> There's a lot of prior art on this in other languages/environments,
> and a lot of experts who've thought hard about it. [...]
> One possible way to move the discussion forward would be to ask the
> pyca devs what kind of API they'd like to see in the interpreter, if
> any.

A while ago, I spent a good amount of time investigating memory wiping
for the hashlib and hmac modules. Although I was only interested in
performing memory wiping in C code [1], I eventually gave up. It was
too annoying to create a platform- and architecture-independent
implementation. Because compilers do funny things and memset_s() isn't
universally available yet, it requires code like

    static void * (* const volatile __memset_vp)(void *, int, size_t) = (memset);

or assembler code like

    asm volatile("" : : "r"(s) : "memory");

just to work around compiler optimization. This doesn't even handle CPU
architecture, virtual memory, paging, core dumps, debuggers or other
things that can read memory or dump memory to disk.

I honestly believe that memory wiping with the current standard memory
allocator won't do the trick. It might be possible to implement a 90%
solution with a special memory allocator. Said allocator would use a
specially configured, mmap'ed memory arena and perform wiping on
realloc() and free(). The secure area can be prevented from swapping
with mlock(), protected with mprotect(), and possibly
hardware-encrypted with pkey_mprotect().

It's just a 90% secure solution, because the data will eventually land
in public buffers. If you need to protect sensitive data like private
keys, then don't load them into the memory of your current process.
It's that simple. :) Bugs like Heartbleed were an issue because private
keys were in the same process space as the TLS/SSL code. Solutions like
gpg-agent, ssh-agent, TPM, HSM, Linux's keyring and AF_ALG sockets all
aim to offload operations with private key material into a secure
subprocess, kernel space or special hardware.

[1] https://bugs.python.org/issue17405

From qlixed at gmail.com  Sat Jun 23 15:55:07 2018
From: qlixed at gmail.com (Ezequiel Brizuela [aka EHB or qlixed])
Date: Sat, 23 Jun 2018 16:55:07 -0300
Subject: [Python-ideas] Secure string disposal (maybe other immutable seq types too?)
In-Reply-To:
References:
Message-ID:

On Sat, Jun 23, 2018 at 10:58 Stephan Houben wrote:
> Would it not be much simpler and more secure to just disable core
> dumps?
>
> /etc/security/limits.conf on Linux.
>
> If the attacker can cause and read a core dump, the game seems over
> anyway, since sooner or later he will catch the core dump at a time
> when the string was not yet deleted.

The thing is that this could be leaked in other ways, not just in a
core. Additionally, there is the case when you need a core to debug an
issue: you could be sharing sensitive info without knowing it. Also,
disabling core generation is not always an option.
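The locked-arena direction Christian sketches can be prototyped from
Python on POSIX systems. A rough, Unix-only illustration -- not the
allocator he describes, just the mlock()/wipe building blocks, with
libc loaded via ctypes:

    import ctypes
    import ctypes.util
    import mmap

    libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
    libc.mlock.argtypes = [ctypes.c_void_p, ctypes.c_size_t]
    libc.munlock.argtypes = [ctypes.c_void_p, ctypes.c_size_t]

    size = mmap.PAGESIZE
    buf = mmap.mmap(-1, size)  # anonymous, page-aligned mapping
    window = ctypes.c_char.from_buffer(buf)
    addr = ctypes.addressof(window)

    if libc.mlock(addr, size) != 0:  # pin the page so it cannot be swapped out
        raise OSError(ctypes.get_errno(), "mlock failed (check RLIMIT_MEMLOCK)")

    buf[:7] = b"hunter2"           # work with the secret in the pinned page...
    buf[:size] = b"\x00" * size    # ...and zero it before releasing
    libc.munlock(addr, size)
    del window                     # drop the buffer export before closing
    buf.close()

Anything that touched the secret before it reached the locked page
(socket buffers, ordinary str objects, and so on) is of course still
unprotected -- which is Christian's point.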
From rosuav at gmail.com  Sat Jun 23 15:55:28 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Sun, 24 Jun 2018 05:55:28 +1000
Subject: [Python-ideas] Secure string disposal (maybe other immutable seq types too?)
In-Reply-To: <9922e45f-f949-51b3-711d-2d26cdf6aca8@egenix.com>
References: <9922e45f-f949-51b3-711d-2d26cdf6aca8@egenix.com>
Message-ID:

On Sat, Jun 23, 2018 at 10:11 PM, M.-A. Lemburg wrote:
> I think the flag is an excellent idea.
>
> I'm not so sure about the automatic propagation of the flag, though.
> If a string gets interned with the flag set, this could lead to a lot
> of other strings receiving the flag without intent.
>
> Then again, you will probably not want such strings to be interned in
> the first place.

Yeah, I'm not entirely sure about the semantics of infection. There
might need to be a special case, such as "an empty string is never
sensitive", to prevent absolutely EVERYTHING from being infected. What
do other languages do there?

But even if the rules are extremely simple to start with, I think this
will be of value.

ChrisA

From qlixed at gmail.com  Sat Jun 23 16:02:30 2018
From: qlixed at gmail.com (Ezequiel Brizuela [aka EHB or qlixed])
Date: Sat, 23 Jun 2018 17:02:30 -0300
Subject: [Python-ideas] Secure string disposal (maybe other immutable seq types too?)
In-Reply-To:
References:
Message-ID:

On Fri, Jun 22, 2018 at 22:33 Terry Reedy wrote:
> On 6/22/2018 8:31 PM, Ezequiel Brizuela [aka EHB or qlixed] wrote:
> > As all the strings in Python are immutable, it is impossible to
> > overwrite the value
>
> Not if one uses ctypes. Is that what you did?

No. I was using exclusively Python string functions from the C API.

> I think it is. A very small fraction of Python users need such wiping,
> and I doubt that it can be complete. I suspect that a password entered
> into getpass, for instance, first exists in OS form before being
> copied into a Python string object. Wiping the Python string would not
> wipe the original copy.

Agreed. There might be more places to search.

> So this really should be attacked at the OS level, not the language
> level.

This needs to be tackled from all sides, ensuring the minimal attack
surface possible for everyone.

From greg at krypto.org  Sat Jun 23 16:04:01 2018
From: greg at krypto.org (Gregory P. Smith)
Date: Sat, 23 Jun 2018 13:04:01 -0700
Subject: [Python-ideas] Secure string disposal (maybe other immutable seq types too?)
In-Reply-To:
References: <5B2DA387.8030405@canterbury.ac.nz> <20180623014547.GE14437@ando.pearwood.info>
Message-ID:

On Sat, Jun 23, 2018 at 12:57 PM Christian Heimes wrote:
> If you need to protect sensitive data like private keys, then don't
> load them into the memory of your current process. It's that simple.
> :) Bugs like Heartbleed were an issue because private keys were in the
> same process space as the TLS/SSL code. Solutions like gpg-agent,
> ssh-agent, TPM, HSM, Linux's keyring and AF_ALG sockets all aim to
> offload operations with private key material into a secure subprocess,
> kernel space or special hardware.

+10

It is fundamentally impossible for a Python VM (certainly CPython) to
implement any sort of guaranteed erasure of data, and/or control over
data to prevent copying, once it is ever stored in a Python object.
This is not unique to Python. All interpreted and JITted VMs share this
trait, as do most languages with garbage collection, e.g. Java, Ruby,
Go, etc.

Trying to pretend we could offer tracking and wiping of sensitive data
in-process is harmful at best, as it cannot be guaranteed and thus
gives the wrong impression and will lead to misuse by people who ignore
that.

-gps

From christian at python.org  Sat Jun 23 16:43:29 2018
From: christian at python.org (Christian Heimes)
Date: Sat, 23 Jun 2018 22:43:29 +0200
Subject: [Python-ideas] Secure string disposal (maybe other immutable seq types too?)
In-Reply-To:
References:
Message-ID:

On 2018-06-23 21:55, Ezequiel Brizuela [aka EHB or qlixed] wrote:
> The thing is that this could be leaked in other ways, not just in a
> core. Additionally, there is the case when you need a core to debug an
> issue: you could be sharing sensitive info without knowing it. Also,
> disabling core generation is not always an option.

If you have core dumps enabled, then memory wiping will not help
against accidental leakage of sensitive data.

From chris.barker at noaa.gov  Sat Jun 23 17:20:31 2018
From: chris.barker at noaa.gov (Chris Barker - NOAA Federal)
Date: Sat, 23 Jun 2018 17:20:31 -0400
Subject: [Python-ideas] Fwd: Trigonometry in degrees
In-Reply-To: <20180619100530.GT14437@ando.pearwood.info>
References: <20180616132723.GG14437@ando.pearwood.info> <20180619100530.GT14437@ando.pearwood.info>
Message-ID:

> > However -- if this is really such a good idea -- wouldn't someone
> > have made a C lib that does it? Or has someone? Anyone looked?
>
> No, there's nothing magical about C. You can do it in pure Python.

Sure, but there are a number of FP subtleties around the edge cases, so
wrapping (or translating) an existing, well-thought-out lib might be an
easier way to go.

-CHB

From jab at math.brown.edu  Sat Jun 23 17:41:15 2018
From: jab at math.brown.edu (jab at math.brown.edu)
Date: Sat, 23 Jun 2018 17:41:15 -0400
Subject: [Python-ideas] Replacing Infinite while Loops with an Iterator: async edition
Message-ID:

I first learned the ``for chunk in iter(lambda: sock.recv(N), b'')``
trick from page 138 of Dave Beazley's fantastic Python Cookbook
("§4.16. Replacing Infinite while Loops with an Iterator"), and never
looked back.

When I started to play with consuming sockets asynchronously, it
occurred to me that it would be nice if we could write an async version
of this, as in ``async for chunk in aiter(...)``.

Is anyone working on adding that to Python, or at least interested too?

Thanks,
Josh

From greg.ewing at canterbury.ac.nz  Sat Jun 23 20:14:19 2018
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 24 Jun 2018 12:14:19 +1200
Subject: [Python-ideas] Secure string disposal (maybe other immutable seq types too?)
In-Reply-To:
References:
Message-ID: <5B2EE25B.9070107@canterbury.ac.nz>

Paul Moore wrote:
> a = SafeStr("my secret data")
> ... work with a as if it were a string
> del a

But in order to create the SafeStr, you need to first have the data in
the form of an ordinary non-safe string. How do you dispose of that
safely?
--
Greg

From greg.ewing at canterbury.ac.nz  Sat Jun 23 20:26:12 2018
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 24 Jun 2018 12:26:12 +1200
Subject: [Python-ideas] Secure string disposal (maybe other immutable seq types too?)
In-Reply-To:
References:
Message-ID: <5B2EE524.8020703@canterbury.ac.nz>

Christian Heimes wrote:
> You'd also need to ensure that the memory page is never paged to disk
> or visible to gdb, ptrace, or any other kind of debugger.

If the attacker can attach a debugger to your process, they can already
do a lot worse than snoop on your secret strings.

--
Greg

From greg.ewing at canterbury.ac.nz  Sat Jun 23 20:31:09 2018
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 24 Jun 2018 12:31:09 +1200
Subject: [Python-ideas] Secure string disposal (maybe other immutable seq types too?)
In-Reply-To:
References: <5B2DA387.8030405@canterbury.ac.nz> <20180623014547.GE14437@ando.pearwood.info>
Message-ID: <5B2EE64D.3030307@canterbury.ac.nz>

Christian Heimes wrote:
> It's just a 90% secure solution, because the data will eventually land
> in public buffers.

Seems like the only completely foolproof solution would have to involve
some kind of quantum storage that can't be copied without destroying
it.

--
Greg

From greg.ewing at canterbury.ac.nz  Sat Jun 23 20:58:41 2018
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 24 Jun 2018 12:58:41 +1200
Subject: [Python-ideas] Replacing Infinite while Loops with an Iterator: async edition
In-Reply-To:
References:
Message-ID: <5B2EECC1.1090602@canterbury.ac.nz>

jab at math.brown.edu wrote:
> it would be nice if we could write an async version of this, as in
> ``async for chunk in aiter(...)``.

The time machine seems to have taken care of this:

https://docs.python.org/3.6/reference/compound_stmts.html#the-async-for-statement

--
Greg

From njs at pobox.com  Sat Jun 23 21:11:31 2018
From: njs at pobox.com (Nathaniel Smith)
Date: Sat, 23 Jun 2018 18:11:31 -0700
Subject: [Python-ideas] Replacing Infinite while Loops with an Iterator: async edition
In-Reply-To: <5B2EECC1.1090602@canterbury.ac.nz>
References: <5B2EECC1.1090602@canterbury.ac.nz>
Message-ID:

On Sat, Jun 23, 2018 at 5:58 PM, Greg Ewing wrote:
> jab at math.brown.edu wrote:
> > it would be nice if we could write an async version of this, as in
> > ``async for chunk in aiter(...)``.
>
> The time machine seems to have taken care of this:
>
> https://docs.python.org/3.6/reference/compound_stmts.html#the-async-for-statement

He's asking for an async version of the 'iter' builtin, presumably
something like:

    async def aiter(async_callable, sentinel):
        while True:
            value = await async_callable()
            if value == sentinel:
                break
            yield value

-n

--
Nathaniel J. Smith -- https://vorpus.org
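To make the intended use concrete, here is how Nathaniel's sketch would
drive an asyncio stream -- the async analogue of Josh's ``iter(lambda:
sock.recv(N), b'')`` idiom. A hypothetical example assuming the aiter()
generator defined above (Python 3.6+):

    import asyncio

    async def fetch(host):
        reader, writer = await asyncio.open_connection(host, 80)
        writer.write(b"GET / HTTP/1.0\r\nHost: " + host.encode() + b"\r\n\r\n")
        chunks = []
        # Read fixed-size chunks until read() returns b'' at EOF.
        async for chunk in aiter(lambda: reader.read(4096), b""):
            chunks.append(chunk)
        writer.close()
        return b"".join(chunks)

    body = asyncio.get_event_loop().run_until_complete(fetch("example.com"))
    print(body[:80])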
From steve at pearwood.info  Sat Jun 23 22:04:10 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 24 Jun 2018 12:04:10 +1000
Subject: [Python-ideas] Secure string disposal (maybe other immutable seq types too?)
In-Reply-To:
References: <5B2DA387.8030405@canterbury.ac.nz> <20180623014547.GE14437@ando.pearwood.info>
Message-ID: <20180624020410.GK14437@ando.pearwood.info>

On Sat, Jun 23, 2018 at 09:54:43PM +0200, Christian Heimes wrote:
> If you need to protect sensitive data like private keys, then don't
> load them into the memory of your current process. It's that simple.
> :)

How do ordinary Python programmers, like me, who want to do the Right
Thing but without thinking too hard about it (or years of study), do
this in a more-or-less platform-independent way?

We have the secrets module that is supposed to be the "batteries
included" solution for sensitive data. Should it be involved?

--
Steve

From tjreedy at udel.edu  Sat Jun 23 22:44:05 2018
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat, 23 Jun 2018 22:44:05 -0400
Subject: [Python-ideas] Secure string disposal (maybe other immutable seq types too?)
In-Reply-To: <5B2EE25B.9070107@canterbury.ac.nz>
References: <5B2EE25B.9070107@canterbury.ac.nz>
Message-ID:

On 6/23/2018 8:14 PM, Greg Ewing wrote:
> Paul Moore wrote:
>
> > a = SafeStr("my secret data")
> > ... work with a as if it were a string
> > del a
>
> But in order to create the SafeStr, you need to first have the data in
> the form of an ordinary non-safe string. How do you dispose of that
> safely?

getpass could return a SafeStr (or SafeBytes?). SafeStr could be
initialized from a sequence of ints.

--
Terry Jan Reedy
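Paul's class and Terry's from-ints initialisation can be combined into
a rough pure-Python sketch. Hypothetical code, not a real API: unlike
Paul's C version it does not subclass str (so it is not transparently
string-like), trading that for a buffer that can actually be zeroed,
and it relies on CPython's refcounting for a prompt __del__:

    class SafeStr:
        """Hold secret text in a mutable bytearray so it can be zeroed
        explicitly, and zero it as a last resort on deallocation."""

        def __init__(self, ints):
            # Built from a sequence of code points (ASCII here), so no
            # throwaway ordinary str of the secret has to exist first.
            self._buf = bytearray(ints)
            self.wiped = False

        def reveal(self):
            # Every call creates an ordinary, unwipeable str copy --
            # exactly the propagation hazard discussed in this thread.
            return self._buf.decode("ascii")

        def wipe(self):
            for i in range(len(self._buf)):
                self._buf[i] = 0
            self.wiped = True

        def __del__(self):
            self.wipe()

    s = SafeStr([104, 117, 110, 116, 101, 114, 50])  # "hunter2"
    s.wipe()        # explicit and immediate
    print(s.wiped)  # True

Note the sketch does nothing about copies made by slicing,
concatenation or re-encoding once reveal() has been called.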
From klahnakoski at mozilla.com  Sun Jun 24 15:20:11 2018
From: klahnakoski at mozilla.com (Kyle Lahnakoski)
Date: Sun, 24 Jun 2018 15:20:11 -0400
Subject: [Python-ideas] Secure string disposal (maybe other immutable seq types too?)
In-Reply-To:
References:
Message-ID:

Ezequiel (Ezekiel) Brizuela,

How is the secret "password" getting into a Python variable? Is it
coming from disk, or network? Do the buffers of those systems have a
copy? How about methods that operate on the secrets? Do they internally
decrypt secrets to perform the necessary operations?

I had this problem, and the only solution was a hardware security
module (HSM): private keys do not leave the module; encryption,
decryption and verification are all done on the module. Passwords enter
the secure system via hardware keypads, which encrypt the password
before transmitting bytes to the local computer.

I do not think you can trust a network-connected machine to have
private keys; all private keys end their life stolen, lost or expired.

On 2018-06-22 20:31, Ezequiel Brizuela [aka EHB or qlixed] wrote:
> As all the strings in Python are immutable, it is impossible to
> overwrite the value or to make a "secure disposal"
> (overwrite-then-free) of a string using something like:
>
> >>> a = "something to hide"
> >>> a = "x"*len(a)
>
> This will leave in the process memory "something to hide" and "x"
> repeated len(a) times.
> [...]
> ~ Ezequiel (Ezekiel) Brizuela [ aka Qlixed ] ~

From jab at math.brown.edu  Sun Jun 24 15:30:58 2018
From: jab at math.brown.edu (jab at math.brown.edu)
Date: Sun, 24 Jun 2018 15:30:58 -0400
Subject: [Python-ideas] Fwd: Replacing Infinite while Loops with an Iterator: async edition
In-Reply-To: <59344505-2604-46A6-B1B6-92A1BB6645BC@gmail.com>
References: <5B2EECC1.1090602@canterbury.ac.nz> <59344505-2604-46A6-B1B6-92A1BB6645BC@gmail.com>
Message-ID:

On Jun 23, 2018, at 21:11, Nathaniel Smith wrote:
> He's asking for an async version of the 'iter' builtin, presumably
> something like:
>
>     async def aiter(async_callable, sentinel):
>         while True:
>             value = await async_callable()
>             if value == sentinel:
>                 break
>             yield value
>
> -n

Yes, exactly (thanks, Nathaniel). Wouldn't that be a useful built-in?

(Greg, I too would be surprised if this were the first time this idea
has been raised, but I looked before posting and couldn't immediately
find prior discussion.)

From jelle.zijlstra at gmail.com  Sun Jun 24 15:34:21 2018
From: jelle.zijlstra at gmail.com (Jelle Zijlstra)
Date: Sun, 24 Jun 2018 12:34:21 -0700
Subject: [Python-ideas] Fwd: Replacing Infinite while Loops with an Iterator: async edition
In-Reply-To:
References: <5B2EECC1.1090602@canterbury.ac.nz> <59344505-2604-46A6-B1B6-92A1BB6645BC@gmail.com>
Message-ID:

2018-06-24 12:30 GMT-07:00 jab at math.brown.edu:
> Yes, exactly (thanks, Nathaniel). Wouldn't that be a useful built-in?

There is an open issue for this: https://bugs.python.org/issue31861. It
proposes adding aiter() and anext() as builtins.

From jab at math.brown.edu  Sun Jun 24 15:40:16 2018
From: jab at math.brown.edu (jab at math.brown.edu)
Date: Sun, 24 Jun 2018 15:40:16 -0400
Subject: [Python-ideas] Fwd: Replacing Infinite while Loops with an Iterator: async edition
In-Reply-To:
References: <5B2EECC1.1090602@canterbury.ac.nz> <59344505-2604-46A6-B1B6-92A1BB6645BC@gmail.com>
Message-ID:

On Sun, Jun 24, 2018 at 3:34 PM Jelle Zijlstra wrote:
> There is an open issue for this: https://bugs.python.org/issue31861.
> It proposes adding aiter() and anext() as builtins.

Oh, great to see that, thanks! I'll follow along on that issue, and
maybe even contribute a PR if I can.
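For readers following along, one plausible shape for the anext() half
of that issue, written as a hypothetical pure-Python equivalent that
mirrors how next() treats its optional default (this is an
illustration, not the actual patch on the issue):

    _MISSING = object()

    async def anext(aiterator, default=_MISSING):
        # Await the next item; on exhaustion return the default if one
        # was supplied, otherwise let StopAsyncIteration propagate.
        try:
            return await aiterator.__anext__()
        except StopAsyncIteration:
            if default is _MISSING:
                raise
            return default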
From p.f.moore at gmail.com  Sun Jun 24 16:10:18 2018
From: p.f.moore at gmail.com (Paul Moore)
Date: Sun, 24 Jun 2018 21:10:18 +0100
Subject: [Python-ideas] Secure string disposal (maybe other immutable seq types too?)
In-Reply-To:
References: <5B2EE25B.9070107@canterbury.ac.nz>
Message-ID:

On 24 June 2018 at 03:44, Terry Reedy wrote:
> On 6/23/2018 8:14 PM, Greg Ewing wrote:
> > Paul Moore wrote:
> >
> > > a = SafeStr("my secret data")
> > > ... work with a as if it were a string
> > > del a
> >
> > But in order to create the SafeStr, you need to first have the data
> > in the form of an ordinary non-safe string. How do you dispose of
> > that safely?
>
> getpass could return a SafeStr (or SafeBytes?). SafeStr could be
> initialized from a sequence of ints.

That's certainly a possibility. It's basically what the .NET
SecureString class does. But the initialisation problem is definitely a
big flaw in the idea that I hadn't thought of :-(

The moral of this is probably for me to leave security design to the
experts :-)

Paul

From greg.ewing at canterbury.ac.nz  Sun Jun 24 19:47:28 2018
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 25 Jun 2018 11:47:28 +1200
Subject: [Python-ideas] Fwd: Replacing Infinite while Loops with an Iterator: async edition
In-Reply-To:
References: <5B2EECC1.1090602@canterbury.ac.nz> <59344505-2604-46A6-B1B6-92A1BB6645BC@gmail.com>
Message-ID: <5B302D90.3040506@canterbury.ac.nz>

jab at math.brown.edu wrote:
> On Jun 23, 2018, at 21:11, Nathaniel Smith wrote:
> > He's asking for an async version of the 'iter' builtin [...]
>
> Yes, exactly (thanks, Nathaniel). Wouldn't that be a useful built-in?

Ah, sorry, I misunderstood. I'm surprised this doesn't exist already --
it seems like an obvious thing to have along with the other async
features.

--
Greg

From turnbull.stephen.fw at u.tsukuba.ac.jp  Mon Jun 25 10:50:02 2018
From: turnbull.stephen.fw at u.tsukuba.ac.jp (Stephen J.
Turnbull)
Date: Mon, 25 Jun 2018 23:50:02 +0900
Subject: [Python-ideas] [issue33865] [EASY] Missing code page aliases: "unknown encoding: 874"
In-Reply-To: <5CC091AA-00BC-4A6A-B7F6-0D2C35D05C86@mac.com>
References: <1529023151.32.0.947875510639.issue33865@psf.upfronthosting.co.za> <1529135518.17.0.56676864532.issue33865@psf.upfronthosting.co.za> <20180616105924.GE14437@ando.pearwood.info> <23334.19898.962713.160394@turnbull.sk.tsukuba.ac.jp> <64CAF5C0-E717-49B0-BA0D-D282D5A4039C@mac.com> <20180618003446.GK14437@ando.pearwood.info> <23339.20740.892027.849859@turnbull.sk.tsukuba.ac.jp> <5CC091AA-00BC-4A6A-B7F6-0D2C35D05C86@mac.com>
Message-ID: <23345.282.356677.829848@turnbull.sk.tsukuba.ac.jp>

Ronald Oussoren writes:

> The user shouldn't have to do anything other than install Python.
> IMHO we're doing something wrong when the Python interpreter doesn't
> start up with a default system configuration

There's no evidence in the issue that I can see that suggests that the
user installed Python into the default system configuration. I see a
bunch of Python developers who have no access to the OP's system
configuration demonstrating that something that shouldn't work and
never has worked doesn't work, then providing a patch to make it work
-- this despite the fact that the OP hasn't provided any configuration
details that would confirm this is a system default setting.

I wouldn't object to making it work if there were any evidence that it
is a real problem that other users will encounter. But there isn't any
such evidence yet, it's a non-standard alias according to Microsoft's
own IANA registration, and Steven d'Aprano's argument that such aliases
may be ambiguous is plausible, though I haven't seen confirmation it
would be a problem in practice.

> (when the user explicitly sets a bogus PYTHONIOENCODING or locale all
> bets are off,

I'm assuming that is the case, based on the fact that none of my two
;-) Thai students ever had this problem, nor have I seen a report of
this problem for any encoding in either Emacs or Python contexts since
about 1990, nor has the OP posted anything about his/her configuration.

> although even then warning about and then ignoring bad settings would
> be more user-friendly than the current behavior)

If Python is told to talk YTREWQ and it doesn't know how to talk
YTREWQ, ignoring the problem is not possible if any input or output in
YTREWQ is required. The program will crash with a much harder to
understand error message describing "undecodable input" in an encoding
the user doesn't expect. My own experience is that soldiering on is the
least user-friendly thing to do, as typically there's a trivial change
the user can make to resolve the problem optimally.

The obvious thing to do is to fall back to ASCII, which almost
certainly is compatible with the terminal, the log files, and the
user's eyes and brain, emit a warning, and quit. That is what we do.
The warning seems OK: the OP also diagnosed the missing alias, likely
with little trouble.

Steve

From robertvandeneynde at hotmail.com  Mon Jun 25 19:03:54 2018
From: robertvandeneynde at hotmail.com (Robert Vanden Eynde)
Date: Mon, 25 Jun 2018 23:03:54 +0000
Subject: [Python-ideas] datetime.timedelta literals
In-Reply-To:
References:
Message-ID:

I found it fun to be able to write minutes(50) alongside 50 * minutes,
so I did this:

    from datetime import date, time, datetime, timedelta

    class CallableTimedelta(timedelta):
        def __call__(self, x):
            return self * x

    seconds, milliseconds, microseconds, days, hours, minutes, weeks = (
        CallableTimedelta(**{x: 1})
        for x in ('seconds', 'milliseconds', 'microseconds', 'days',
                  'hours', 'minutes', 'weeks'))

    print(minutes(50) / seconds)        # 3000.0
    print(50 * minutes / seconds)       # 3000.0
    print(minutes(50).total_seconds())  # 3000.0

2018-06-07 13:34 GMT+02:00 Pål Grønås Drange:
> For closure, I've added a package, timeliterals:
>
>     (env) [pgdr at hostname ~]$ pip install timeliterals
>     (env) [pgdr at hostname ~]$ python
>     >>> from timeliterals import *
>     >>> 3*hours
>     datetime.timedelta(0, 10800)
>     >>> 3*minutes
>     datetime.timedelta(0, 180)
>     >>> 3*seconds
>     datetime.timedelta(0, 3)
>
> The source code is at https://github.com/pgdr/timeliterals
>
> I'm not going to submit a patch to datetime at this time, but I will
> if people would be interested.
>
> - Pål
>
> On 5 Jun 2018 13:56, "Jacco van Dorp" wrote:
> > It'd also be pretty simple to implement... Just list:
> >
> >     minute = timedelta(minutes=1)
> >     hour = timedelta(hours=1)
> >
> > etc., and you could import and use them like that. Or if you really
> > want to write 5*m, then just
> >
> >     from datetime import minute as m

From jsbueno at python.org.br  Tue Jun 26 08:41:44 2018
From: jsbueno at python.org.br (Joao S. O. Bueno)
Date: Tue, 26 Jun 2018 09:41:44 -0300
Subject: [Python-ideas] datetime.timedelta literals
In-Reply-To:
References:
Message-ID:

I like the feel of it.

    In [1]: import pint
    In [2]: reg = pint.UnitRegistry()
    In [3]: from extradict import MapGetter
    In [4]: with MapGetter(reg):
       ...:     from reg import cm, min, hour, km
       ...:
    In [5]: km
    Out[5]:
    In [6]: 10 * km / hour
    Out[6]:

As for the request that started the thread - there was a thread about
this not long ago, less than a year, for sure. Please, whoever intends
to support it, check the arguments there.

On Sun, 3 Jun 2018 at 07:53, Pål Grønås Drange wrote:
> > What about
> >
> >     2.5*h - 14*min + 9300*ms * 2
>
> That doesn't seem feasible to implement; however, that is essentially
> how the Pint [1] module works:
>
>     import pint
>     u = pint.UnitRegistry()
>     (2.5*u.hour - 14*u.min + 9300*u.ms) * 2
>     ((2.5*u.hour - 14*u.min + 9300*u.ms) * 2).to('sec')
>
> > However, why be limited to time units? One would want in certain
> > applications to define other units, like meter. Would we want a
> > literal for that?
>
> Pint works with all units imaginable:
>
>     Q = u.Quantity
>     Q(u.c, (u.m/u.s)).to('km / hour')
>
> However, the idea was just the six (h|min|s|ms|us|ns) time literals; I
> believe time units are used more often than other units, e.g. in
> constructs like
>
>     while end - start < 1min:
>         poll()
>         sleep(1s)  # TypeError
>         sleep(1s.total_seconds())  # works, but ugly
>
> [1] https://pypi.org/project/Pint/
>
> Best regards,
> Pål Grønås Drange

From Eloi.Gaudry at fft.be  Tue Jun 26 10:45:23 2018
From: Eloi.Gaudry at fft.be (Eloi Gaudry)
Date: Tue, 26 Jun 2018 14:45:23 +0000
Subject: [Python-ideas] Allow mutable builtin types (optionally)
In-Reply-To:
References: <1525707472.12114.1.camel@fft.be> <1525764405.24469.1.camel@fft.be>
Message-ID:

The origin of this feature disappearing for built-in types:
http://bugs.jython.org/issue1058

'''
object.__set/delattr__ allow modification of built-in types; this is
known as the Carlo Verre hack:

    Jython 2.3a0+ (trunk:4630:4631M, Jun 14 2008, 20:07:38)
    [Java HotSpot(TM) Client VM (Apple Inc.)] on java1.5.0_13
    Type "help", "copyright", "credits" or "license" for more information.
    >>> object.__setattr__(str, 'lower', str.upper)
    >>> 'dammit Carlo!'.lower()
    'DAMMIT CARLO!'
'''

But I do not see any reason why having an explicit flag for Python
extensions written in C to declare their types as static structs, and
still be able to change their __setattr__, __getattr__, etc. slots,
would not make sense. Extensions and core types do not have the same
constraints and purposes; this should be reflected in the capabilities
the former would have somewhere.

________________________________
From: Eloi Gaudry
Sent: Tuesday, June 26, 2018 4:27:18 PM
To: python-ideas at python.org
Subject: Re: [Python-ideas] Allow mutable builtin types (optionally)

Some literature:
https://mail.python.org/pipermail/python-dev/2008-February/077180.html
https://mail.python.org/pipermail/python-dev/2008-February/077169.html
where it is stated that Python C struct types should not be able to
have their attributes changed -- but the needs of extensions are
clearly not taken into account.

________________________________
From: Python-ideas on behalf of Eloi Gaudry
Sent: Thursday, June 21, 2018 5:26:37 PM
To: python-ideas at python.org; encukou at gmail.com
Subject: Re: [Python-ideas] Allow mutable builtin types (optionally)

This request didn't have a lot of traction, but I still consider this
is something that would need to be supported (2 lines of code to be
changed; no regression so far with Python 2 and Python 3).

My main points are:
- HEAP_TYPE is not really used (is anyone using it?)
- HEAP_TYPE serves other purposes
- extensions would benefit from allowing direct access to any of their
  type attributes

Petr, what do you think?

Eloi

________________________________
From: Python-ideas on behalf of Eloi Gaudry
Sent: Tuesday, May 8, 2018 9:26:47 AM
To: encukou at gmail.com; python-ideas at python.org
Subject: Re: [Python-ideas] Allow mutable builtin types (optionally)

On Mon, 2018-05-07 at 15:23 -0400, Petr Viktorin wrote:
> On 05/07/18 11:37, Eloi Gaudry wrote:
> > I mean, to my knowledge, there is no reason why a type should be
> > allocated on the heap
> > (https://docs.python.org/2/c-api/typeobj.html) to be able to change
> > its attributes at Python level.
>
> One reason is sub-interpreter support: you can have multiple
> interpreters per process, and those shouldn't influence each other.
> (see https://docs.python.org/3/c-api/init.html#sub-interpreter-support)
>
> With heap types, each sub-interpreter can have its own copy of the
> type object. But with builtins, changes done in one interpreter would
> be visible in all the others.

Yes, this could be a reason, but what if you don't rely on such a
feature, either implicitly or explicitly? I mean, our types are
built-in and should be considered immutable across interpreters, and
we (like most users, I guess) are only running one interpreter. In
case several interpreters are used, it would make sense to have a
non-heap type that would be seen as a singleton across all of them,
no?

From leewangzhong+python at gmail.com  Tue Jun 26 11:34:50 2018
From: leewangzhong+python at gmail.com (Franklin? Lee)
Date: Tue, 26 Jun 2018 11:34:50 -0400
Subject: [Python-ideas] "Exposing" `__min__` and `__max__`
In-Reply-To:
References:
Message-ID:

On Wed, Jun 20, 2018, 00:05 Serhiy Storchaka wrote:
> 19.06.18 22:18, James Edwards wrote:
> > I've only recently looked for these special methods, so that in and
> > of itself may be the reason these methods aren't exposed, but I
> > could think of objects that may wish to implement __min__ and
> > __max__ themselves, for efficiency.
>
> There are two questions.
>
> 1. What to do with additional min() and max() arguments: key and
> default.

Neither should be passed to a dunder.

It is not possible to handle `key` without figuring out whether a
function is monotonic (an undecidable problem in general) or
anti-monotonic (if that is a real term), so you MUST fall back on full
iteration if a key is provided.

`default` is only used in the case of an empty collection. The only
question is: who has responsibility for detecting an empty collection,
and how?

Caller detects: The caller checks the length before calling the
dunder. If there is no dunder, it doesn't check. Are there real-world
cases where length is not defined on an iterable collection?

Dunder detects: Right now, `max` detects emptiness by watching for
StopIteration, which can no longer be a false positive (StopIterations
from a deeper scope are wrapped). If the dunder throws an error to
signal emptiness, it should not be thrown otherwise. I think that's
impossible to guarantee.

From guido at python.org  Tue Jun 26 13:19:40 2018
From: guido at python.org (Guido van Rossum)
Date: Tue, 26 Jun 2018 10:19:40 -0700
Subject: [Python-ideas] Allow mutable builtin types (optionally)
In-Reply-To:
References: <1525707472.12114.1.camel@fft.be> <1525764405.24469.1.camel@fft.be>
Message-ID:

Hey Eloi,

I think you need to just give up on this. Nobody here seems to support
or understand your use case. At this point you are repeating yourself
(again claiming there is no good reason for the prohibition and that
it's only a few lines of code to change), and you can be assured that
the response will also be the same.
Steve From robertvandeneynde at hotmail.com Mon Jun 25 19:03:54 2018 From: robertvandeneynde at hotmail.com (Robert Vanden Eynde) Date: Mon, 25 Jun 2018 23:03:54 +0000 Subject: [Python-ideas] datetime.timedelta literals In-Reply-To: References: Message-ID: I found it fun to be able to write minutes(50) alongside with 50 * minutes so I did that : from datetime import date, time, datetime, timedelta class CallableTimedelta(timedelta): def __call__(self, x): return self * x seconds, milliseconds, microseconds, days, hours, minutes, weeks = (CallableTimedelta(**{x:1}) for x in ('seconds', 'milliseconds', 'microseconds', 'days', 'hours', 'minutes', 'weeks')) print(minutes(50) / seconds) # 3000.0 print(50 * minutes / seconds) # 3000.0 print(minutes(50).total_seconds()) # 3000.0 2018-06-07 13:34 GMT+02:00 P?l Gr?n?s Drange >: For closure, I've added a package, timeliterals (env) [pgdr at hostname ~]$ pip install timeliterals (env) [pgdr at hostname ~]$ python >>> from timeliterals import * >>> 3*hours datetime.timedelta(0, 10800) >>> 3*minutes datetime.timedelta(0, 180) >>> 3*seconds datetime.timedelta(0, 3) The source code is at https://github.com/pgdr/timeliterals I'm not going to submit a patch to datetime at this time, but I will if people would be interested. - P?l On 5 Jun 2018 13:56, "Jacco van Dorp" > wrote: i'd also be pretty simple to implement.... Just list: minute = timedelta(minutes=1) hour = timedelta(hours=1) etc... and you could import and use them like that. Or if you really want to write 5*m, the just from datetime import minute as m _______________________________________________ Python-ideas mailing list Python-ideas at python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/ _______________________________________________ Python-ideas mailing list Python-ideas at python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From jsbueno at python.org.br Tue Jun 26 08:41:44 2018 From: jsbueno at python.org.br (Joao S. O. Bueno) Date: Tue, 26 Jun 2018 09:41:44 -0300 Subject: [Python-ideas] datetime.timedelta literals In-Reply-To: References: Message-ID: I like the feel of it. In [1]: import pint In [2]: reg = pint.UnitRegistry() In [3]: from extradict import MapGetter In [4]: with MapGetter(reg): ...: from reg import cm, min, hour, km ...: In [5]: km Out[5]: In [6]: 10 * km / hour Out[6]: As for the request that started the thread - there was a thread about this not long ago - less than 1 year, for sure. Please, whoever intend to support it, check the arguments there. On Sun, 3 Jun 2018 at 07:53, P?l Gr?n?s Drange wrote: > > > What about > > > > 2.5*h - 14*min + 9300*ms * 2 > > That doesn't seem feasible to implement, however, that is essentially how the > Pint [1] module works: > > import pint > u = pint.UnitRegistry() > (2.5*u.hour - 14*u.min + 9300*u.ms) * 2 > # > > ((2.5*u.hour - 14*u.min + 9300*u.ms) * 2).to('sec') > # > > > However why be limited to time units ? One would want in certain > > application to define other units, like meter ? Would we want a litteral > > for that ? > > Pint works with all units imaginable: > > Q = u.Quantity > Q(u.c, (u.m/u.s)).to('km / hour') > # > > > However, the idea was just the six (h|min|s|ms|us|ns) time literals; I believe > time units are used more often than other units, e.g. 
From jsbueno at python.org.br Tue Jun 26 08:41:44 2018
From: jsbueno at python.org.br (Joao S. O. Bueno)
Date: Tue, 26 Jun 2018 09:41:44 -0300
Subject: [Python-ideas] datetime.timedelta literals
In-Reply-To:
References:
Message-ID:

I like the feel of it.

    In [1]: import pint
    In [2]: reg = pint.UnitRegistry()
    In [3]: from extradict import MapGetter
    In [4]: with MapGetter(reg):
       ...:     from reg import cm, min, hour, km
       ...:
    In [5]: km
    Out[5]:
    In [6]: 10 * km / hour
    Out[6]:

As for the request that started the thread - there was a thread about
this not long ago - less than 1 year, for sure. Please, whoever intends
to support it, check the arguments there.

On Sun, 3 Jun 2018 at 07:53, Pål Grønås Drange wrote:
>
> > What about
> >
> >     2.5*h - 14*min + 9300*ms * 2
>
> That doesn't seem feasible to implement; however, that is essentially
> how the Pint [1] module works:
>
>     import pint
>     u = pint.UnitRegistry()
>     (2.5*u.hour - 14*u.min + 9300*u.ms) * 2
>     ((2.5*u.hour - 14*u.min + 9300*u.ms) * 2).to('sec')
>
> > However why be limited to time units ? One would want in certain
> > application to define other units, like meter ? Would we want a
> > litteral for that ?
>
> Pint works with all units imaginable:
>
>     Q = u.Quantity
>     Q(u.c, (u.m/u.s)).to('km / hour')
>
> However, the idea was just the six (h|min|s|ms|us|ns) time literals;
> I believe time units are used more often than other units, e.g. in
> constructs like:
>
>     while end - start < 1min:
>         poll()
>         sleep(1s)                   # TypeError
>         sleep(1s.total_seconds())   # works, but ugly
>
> [1] https://pypi.org/project/Pint/
>
> Best regards,
> Pål Grønås Drange

From Eloi.Gaudry at fft.be Tue Jun 26 10:45:23 2018
From: Eloi.Gaudry at fft.be (Eloi Gaudry)
Date: Tue, 26 Jun 2018 14:45:23 +0000
Subject: [Python-ideas] Allow mutable builtin types (optionally)
In-Reply-To:
References: <1525707472.12114.1.camel@fft.be> <1525764405.24469.1.camel@fft.be>
Message-ID:

The origin of this feature disappearing for built-in types:
http://bugs.jython.org/issue1058

'''
object.__set/delattr__ allow modification of built in types, this is
known as the Carlo Verre hack:

Jython 2.3a0+ (trunk:4630:4631M, Jun 14 2008, 20:07:38)
[Java HotSpot(TM) Client VM (Apple Inc.)] on java1.5.0_13
Type "help", "copyright", "credits" or "license" for more information.
> (see https://docs.python.org/3/c-api/init.html#sub-interpreter-suppor > t) > > With heap types, each sub-interpreter can have its own copy of the > type > object. But with builtins, changes done in one interpreter would be > visible in all the others. Yes, this could be a reason, but if you don't rely on such a feature neither implicitly nor explicitly ? I mean, our types are built-in and should be considered as immutable across interpreters. And we (as most users I guess) are only running one interpreter. In case several intepreters are used, it would make sense to have a non-heap type that would be seen as a singleton across all of them, no ? _______________________________________________ Python-ideas mailing list Python-ideas at python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From leewangzhong+python at gmail.com Tue Jun 26 11:34:50 2018 From: leewangzhong+python at gmail.com (Franklin? Lee) Date: Tue, 26 Jun 2018 11:34:50 -0400 Subject: [Python-ideas] "Exposing" `__min__` and `__max__` In-Reply-To: References: Message-ID: On Wed, Jun 20, 2018, 00:05 Serhiy Storchaka wrote: > 19.06.18 22:18, James Edwards ????: > > I've only recently looked for these special methods, so that in and of > > itself may be the reason these methods aren't exposed, but I could think > > of objects that may wish to implement __min__ and __max__ themselves, > > for efficiency. > > There are two questions. > > 1. What to do with additional min() and max() arguments: key and default. > Neither should be passed to a dunder. It is not possible to handle `key` without figuring out if a function is monotonic (a Turing-complete problem in general) or anti-monotonic (if that is a real term), so you MUST fall back on full iteration if a key is provided. `default` is only used in case of an empty collection. The only question is, who has responsibility for detecting an empty collection, and how? Caller detects: The caller checks length before calling the dunder. If there is no dunder, it doesn't check. Are there real-world cases where length is not defined on an iterable collection? Dunder detects: Right now, `max` detects empty by watching for StopIteration, which can no longer be a false positive. StopIterations from a deeper scope are wrapped. If the dunder throws an error to signal emptiness, it should not be thrown otherwise. I think that's impossible to guarantee. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Tue Jun 26 13:19:40 2018 From: guido at python.org (Guido van Rossum) Date: Tue, 26 Jun 2018 10:19:40 -0700 Subject: [Python-ideas] Allow mutable builtin types (optionally) In-Reply-To: References: <1525707472.12114.1.camel@fft.be> <1525764405.24469.1.camel@fft.be> Message-ID: Hey Eloi, I think you need to just give up on this. Nobody here seems to support or understand your use case. At this point you are repeating yourself (again claiming there is no good reason for the prohibition and that it's only a few lines of code to change) and you can be assured that the response will also be the same. 
From guido at python.org Tue Jun 26 13:19:40 2018
From: guido at python.org (Guido van Rossum)
Date: Tue, 26 Jun 2018 10:19:40 -0700
Subject: [Python-ideas] Allow mutable builtin types (optionally)
In-Reply-To:
References: <1525707472.12114.1.camel@fft.be> <1525764405.24469.1.camel@fft.be>
Message-ID:

Hey Eloi,

I think you need to just give up on this. Nobody here seems to support
or understand your use case. At this point you are repeating yourself
(again claiming there is no good reason for the prohibition and that
it's only a few lines of code to change), and you can be assured that
the response will also be the same.

--Guido

On Tue, Jun 26, 2018 at 8:00 AM Eloi Gaudry wrote:
> the origin of this feature disappearing for built-in types:
> http://bugs.jython.org/issue1058
> [...]

--
--Guido van Rossum (python.org/~guido)

From brett at python.org Tue Jun 26 14:53:27 2018
From: brett at python.org (Brett Cannon)
Date: Tue, 26 Jun 2018 15:53:27 -0300
Subject: [Python-ideas] Allow mutable builtin types (optionally)
In-Reply-To:
References: <1525707472.12114.1.camel@fft.be> <1525764405.24469.1.camel@fft.be>
Message-ID:

On Thu, Jun 21, 2018, 12:27 Eloi Gaudry wrote:
> This request didn't have a lot of traction, but I still consider this
> is something that would need to be supported

Please be careful about using the word "need" as it comes off as
demanding instead of as a suggestion.

-Brett

> [...]
From leewangzhong+python at gmail.com Tue Jun 26 17:49:23 2018
From: leewangzhong+python at gmail.com (Franklin? Lee)
Date: Tue, 26 Jun 2018 17:49:23 -0400
Subject: [Python-ideas] "Exposing" `__min__` and `__max__`
In-Reply-To:
References:
Message-ID:

On Tue, Jun 26, 2018 at 11:34 AM, Franklin? Lee wrote:
> [...]
> `default` is only used in case of an empty collection. The only
> question is, who has responsibility for detecting an empty
> collection, and how?
> [...]

There's an argument that you DO want to pass one thing to the dunder:
`last=True`. It's not currently part of `min` and `max`.

Currently, if there are multiple items that are maximum, `max` will
return the first one. In the future, a `last:bool` param could be
added, and a dunder for `max` would want to handle it.
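A sketch of what that `last` behavior means in plain Python (the name
max_last and the `>=` trick are illustrative only, not an existing
API):

    # Return the LAST maximal element instead of the first.
    def max_last(iterable, key=lambda x: x):
        it = iter(iterable)
        best = next(it)  # raises StopIteration on an empty iterable
        best_key = key(best)
        for item in it:
            k = key(item)
            if k >= best_key:  # >= keeps the latest of equal maxima
                best, best_key = item, k
        return best

    assert max_last([(1, 'a'), (1, 'b')], key=lambda t: t[0]) == (1, 'b')
    assert max([(1, 'a'), (1, 'b')], key=lambda t: t[0]) == (1, 'a')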
From Eloi.Gaudry at fft.be Tue Jun 26 10:27:18 2018
From: Eloi.Gaudry at fft.be (Eloi Gaudry)
Date: Tue, 26 Jun 2018 14:27:18 +0000
Subject: [Python-ideas] Allow mutable builtin types (optionally)
In-Reply-To:
References: <1525707472.12114.1.camel@fft.be> <1525764405.24469.1.camel@fft.be>
Message-ID:

Some literature:

https://mail.python.org/pipermail/python-dev/2008-February/077180.html
https://mail.python.org/pipermail/python-dev/2008-February/077169.html

where it is stated that Python C struct types should not be able to
have their attributes changed, but the needs of extensions are clearly
not taken into account.

________________________________
[...]

From Eloi.Gaudry at fft.be Tue Jun 26 15:49:13 2018
From: Eloi.Gaudry at fft.be (Eloi Gaudry)
Date: Tue, 26 Jun 2018 19:49:13 +0000
Subject: [Python-ideas] Allow mutable builtin types (optionally)
In-Reply-To:
References: <1525707472.12114.1.camel@fft.be> <1525764405.24469.1.camel@fft.be>
Message-ID:

Hi Guido,

I would like to be sure that the lack of support is not the result of
my inability to sum up my use case. This is why I gave some links to
illustrate:

- the reason why the behavior was changed a decade ago
- that such a possibility was actually needed by other extension
  developers (whose built-in types would benefit from being able to
  redefine some methods dynamically)
- that Python core developers and Python extension developers can have
  different needs and objectives (which was the main reason why I was
  submitting this to the mailing list again)

I feel sorry if that only resulted in looking like I was repeating
myself.

Have a good day,
Eloi

From: Guido van Rossum
Sent: Tuesday, June 26, 2018 7:20 PM
> Hey Eloi, I think you need to just give up on this. [...]
From Richard at Damon-Family.org Tue Jun 26 20:30:11 2018
From: Richard at Damon-Family.org (Richard Damon)
Date: Tue, 26 Jun 2018 20:30:11 -0400
Subject: [Python-ideas] "Exposing" `__min__` and `__max__`
In-Reply-To:
References:
Message-ID:

On 6/26/18 11:34 AM, Franklin? Lee wrote:
> It is not possible to handle `key` without figuring out if a function
> is monotonic (a Turing-complete problem in general) or anti-monotonic
> (if that is a real term), so you MUST fall back on full iteration if
> a key is provided.

Monotonic (in this sense) just means never changing direction: it can
be increasing and never decreasing, or decreasing and never increasing,
so we don't need the 'anti-' version.

--
Richard Damon

From abedillon at gmail.com Tue Jun 26 20:36:51 2018
From: abedillon at gmail.com (Abe Dillon)
Date: Tue, 26 Jun 2018 17:36:51 -0700 (PDT)
Subject: [Python-ideas] random.sample should work better with iterators
Message-ID: <6933138b-85f0-4b29-97ca-fac0ffce1212@googlegroups.com>

The docs on random.sample indicate that it works with iterators:

> To choose a sample from a range of integers, use a range() object as
> an argument. This is especially fast and space efficient for sampling
> from a large population: sample(range(10000000), k=60).

However, when I try to use iterators other than range, like so:

    random.sample(itertools.product(range(height), range(width)),
                  0.5 * height * width)

I get:

    TypeError: Population must be a sequence or set.  For dicts, use list(d).

I don't know if Python-ideas is the right channel for this, but this
seems overly constrained. The inability to handle dictionaries is
especially puzzling. Randomly sampling from some population is often
done because the entire population is impractically large, which is
also a motivation for using iterators, so it seems natural that one
would be able to sample from an iterator. A naive implementation could
use a heap queue:

    import heapq
    import random

    def stream():
        while True:
            yield random.random()

    def sample(population, size):
        q = [tuple()] * size
        for el in zip(stream(), population):
            if el > q[0]:
                heapq.heapreplace(q, el)
        return [el[1] for el in q if el]

It would also be helpful to add a ratio version of the function:

    def sample(population, size=None, *, ratio=None):
        assert None in (size, ratio), "can't specify both sample size and ratio"
        if ratio:
            return [el for el in population if random.random() < ratio]
        ...

From steve at pearwood.info Tue Jun 26 21:05:26 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 27 Jun 2018 11:05:26 +1000
Subject: [Python-ideas] random.sample should work better with iterators
In-Reply-To: <6933138b-85f0-4b29-97ca-fac0ffce1212@googlegroups.com>
References: <6933138b-85f0-4b29-97ca-fac0ffce1212@googlegroups.com>
Message-ID: <20180627010526.GW14437@ando.pearwood.info>

On Tue, Jun 26, 2018 at 05:36:51PM -0700, Abe Dillon wrote:
> The docs on random.sample indicate that it works with iterators:
>
> > To choose a sample from a range of integers, use a range() object
> > as an argument. This is especially fast and space efficient for
> > sampling from a large population: sample(range(10000000), k=60).

That doesn't mention anything about iterators.

> However, when I try to use iterators other than range, like so:

range is not an iterator. Thinking it is is a very common error, but it
certainly is not. It is a lazily-generated *sequence*, not an iterator.
The definition of an iterator is that the object must have an __iter__
method returning *itself*, and a __next__ method (the "iterator
protocol"):

    py> obj = range(100)
    py> hasattr(obj, '__next__')
    False
    py> obj.__iter__() is obj
    False

However, it is a sequence:

    py> import collections
    py> isinstance(obj, collections.Sequence)
    True

(Aside: I'm surprised there's no inspect.isiterator and .isiterable
functions.)

> random.sample(itertools.product(range(height), range(width)),
>               0.5 * height * width)
>
> I get:
>
> TypeError: Population must be a sequence or set.  For dicts, use list(d).
>
> I don't know if Python-ideas is the right channel for this, but this
> seems overly constrained. The inability to handle dictionaries is
> especially puzzling.

Puzzling in what way?

If sample() supported dicts, should it return the keys or the values or
both?

Also consider this: https://bugs.python.org/issue33098

> Randomly sampling from some population is often done because the
> entire population is impractically large, which is also a motivation
> for using iterators, so it seems natural that one would be able to
> sample from an iterator. A naive implementation could use a heap
> queue: [...]

Is that an improvement over:

    sample(list(itertools.islice(population, size)))

and if so, please explain.

> It would also be helpful to add a ratio version of the function:
> [...]

Helpful under what circumstances?

Don't let the source speak for itself. Explain what it means. I
understand what sample(population, size=100) does. What would
sample(population, ratio=0.25) do?

(That's not a rhetorical question, I genuinely don't understand the
semantics of this proposed ratio argument.)

--
Steve
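The aside about missing inspect helpers is a two-liner with
collections.abc. A sketch - the names isiterator/isiterable are the
wished-for helpers, not an existing inspect API:

    from collections.abc import Iterable, Iterator

    def isiterable(obj):
        return isinstance(obj, Iterable)

    def isiterator(obj):
        return isinstance(obj, Iterator)

    assert isiterable(range(100)) and not isiterator(range(100))
    assert isiterator(iter(range(100)))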
From turnbull.stephen.fw at u.tsukuba.ac.jp Tue Jun 26 23:07:58 2018
From: turnbull.stephen.fw at u.tsukuba.ac.jp (Stephen J. Turnbull)
Date: Wed, 27 Jun 2018 12:07:58 +0900
Subject: [Python-ideas] random.sample should work better with iterators
In-Reply-To: <20180627010526.GW14437@ando.pearwood.info>
References: <6933138b-85f0-4b29-97ca-fac0ffce1212@googlegroups.com> <20180627010526.GW14437@ando.pearwood.info>
Message-ID: <23346.65422.88882.4017@turnbull.sk.tsukuba.ac.jp>

Steven D'Aprano writes:

 > > I don't know if Python-ideas is the right channel for this, but
 > > this seems overly constrained. The inability to handle
 > > dictionaries is especially puzzling.
 >
 > Puzzling in what way?

Same misconception, I suppose.

 > If sample() supported dicts, should it return the keys or the values
 > or both?

I argue below that *if* we were going to make the change, it should be
to consistently try list() on non-sequences. But "not every one-liner"
and EIBTI:

    >>> d = {'a': 1, 'b': 2}
    >>> sample(d.keys(), 1)
    ['a']
    >>> sample(d.items(), 1)
    [('a', 1)]

But this is weird:

    >>> sample(d.values(), 1)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/opt/local/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/random.py", line 314, in sample
        raise TypeError("Population must be a sequence or set. For dicts, use list(d).")
    TypeError: Population must be a sequence or set. For dicts, use list(d).

Oh, I see. Key views are "set-like", item views *may* be set-like, but
value views are *not* set-like. Since views are all listable, why not
try list() on them?

In general, I would think it makes sense to define this as "Population
must be a sequence or convertible to a sequence using list()." And for
most of the applications I can think of in my own use, sample(list(d))
is not particularly useful because it's a sample of keys. I usually
want sample(list(d.values())).

The ramifications are unclear to me, but I guess it's too late to
change this because of the efficiency implications Tim describes in
issue33098 (so EIBTI; thanks for the reference!). On the other hand,
that issue says sets can't be sampled efficiently, so the current
behavior seems to *promote* inefficient usage?

I would definitely change the error message. I think "Use list(d)" is
bad advice because I believe it's not even "almost always" what you'll
want, and if keys and values are of the same type, it won't be obvious
from the output that you're *not* getting a sample from d.values() if
that's what you wanted and thought you were getting.

 > Don't let the source speak for itself. Explain what it means. I
 > understand what sample(population, size=100) does. What would
 > sample(population, ratio=0.25) do?

I assume sample(pop, ratio=0.25) == sample(pop, size=0.25*len(pop)).

From lizheao940510 at gmail.com Tue Jun 26 23:59:22 2018
From: lizheao940510 at gmail.com (李者璈)
Date: Tue, 26 Jun 2018 20:59:22 -0700 (PDT)
Subject: [Python-ideas] Add an optional type file for Type Annotation
Message-ID:

I'm inspired by TypeScript.

TypeScript allows people to add *.d.ts files to annotate ECMAScript's
types, which expands the scenarios in which TypeScript can be used.

So I think we could add an optional mechanism that allows people to add
an extra file to annotate existing code.

Type annotations can only be used from Python 3.5 on, but many
libs/frameworks have to be compatible with Python 3.0-3.4, so they
can't use annotations. If we had this mechanism, the lib/framework
could continue to focus on its functionality, and a third party could
add the extra type annotations for it.

I think it should be OK.

From tim.peters at gmail.com Wed Jun 27 00:52:55 2018
From: tim.peters at gmail.com (Tim Peters)
Date: Tue, 26 Jun 2018 23:52:55 -0500
Subject: [Python-ideas] random.sample should work better with iterators
In-Reply-To: <20180627010526.GW14437@ando.pearwood.info>
References: <6933138b-85f0-4b29-97ca-fac0ffce1212@googlegroups.com> <20180627010526.GW14437@ando.pearwood.info>
Message-ID:

[Abe Dillon]
> Randomly sampling from some population is often done because the
> entire population is impractically large, which is also a motivation
> for using iterators, so it seems natural that one would be able to
> sample from an iterator. A naive implementation could use a heap
> queue: [...]

[Steven D'Aprano]
> Is that an improvement over:
>     sample(list(itertools.islice(population, size)))
> and if so, please explain.

Different things entirely. Your spelling is missing sample's required
second argument, and the difference should be clear if it's supplied:

    sample(list(itertools.islice(population, size)), size)

That is, it merely returns some permutation of the _initial_ `size`
items in the iterable. The rest of the population is ignored.

In Python today, the easiest way to spell Abe's intent is, e.g.,

    >>> from heapq import nlargest  # or nsmallest - doesn't matter
    >>> from random import random
    >>> nlargest(4, (i for i in range(100000)), key=lambda x: random())
    [75260, 45880, 99486, 13478]
    >>> nlargest(4, (i for i in range(100000)), key=lambda x: random())
    [31732, 72288, 26584, 72672]
    >>> nlargest(4, (i for i in range(100000)), key=lambda x: random())
    [14180, 86084, 22639, 2004]

That also arranges to preserve `sample()`'s promise that all sub-slices
of the result are valid random samples too (because `nlargest` sorts by
the randomly generated keys before returning the list).

However, it does _not_ preserve - and nothing can preserve for
arbitrary iterables - `sample()`'s promise to "[leave] the original
population unchanged". We can't examine an arbitrary iterable's
population at all without exhausting the iterable, and that can be
destructive. So while this can indeed be useful, it would require
changing `sample()` to break that promise in some cases.

BTW, using a heap for this is uncommon. Search on "reservoir sampling"
for more-common ways. Most common is probably Vitter's "Algorithm R",
which runs in O(len(iterable)) time (no additional log factor for a
heap - it doesn't use a heap).

I'd prefer to leave `sample()` alone, and introduce some spelling of
`possibly_destructive_sample()` for arbitrary iterables - if that's
wanted enough for someone to do the work ;-)
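For reference, a sketch of the Algorithm R reservoir sampling Tim
mentions - a plain-Python illustration, not the stdlib's code:

    # Vitter's Algorithm R: one pass, O(n) time, uniform sample of k
    # items from an iterable of unknown length.
    import itertools
    import random

    def reservoir_sample(iterable, k):
        it = iter(iterable)
        reservoir = list(itertools.islice(it, k))  # first k items
        if len(reservoir) < k:
            raise ValueError("sample larger than population")
        for i, item in enumerate(it, start=k):
            # item enters the reservoir with probability k / (i + 1)
            j = random.randrange(i + 1)
            if j < k:
                reservoir[j] = item
        return reservoir

Note that, per Tim's point about nlargest, the reservoir is not sorted
by random key, so sub-slices of this result are not themselves
guaranteed to be valid random samples.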
Your spelling is missing sample's required second argument, and the difference should be clear if it's supplied: sample(list(itertools.slice(population, size)). size) That is, it merely returns some permutation of the _initial_ `size` items in the iterable. The rest of the population is ignored. In Python today, the easiest way to spell Abe's intent is, e.g., >>> from heapq import nlargest # or nsmallest - doesn't matter >>> from random import random >>> nlargest(4, (i for i in range(100000)), key=lambda x: random()) [75260, 45880, 99486, 13478] >>> nlargest(4, (i for i in range(100000)), key=lambda x: random()) [31732, 72288, 26584, 72672] >>> nlargest(4, (i for i in range(100000)), key=lambda x: random()) [14180, 86084, 22639, 2004] That also arranges to preserve `sample()'s promise that all sub-slices of the result are valid random samples too (because `nlargest` sorts by the randomly generated keys before returning the list). However, it does _not_ preserve - and nothing can preserve for arbitrary iterables - `sample()`'s promise to "[leave] the original population unchanged". We can't examine an arbitrary iterable's population at all without exhausting the iterable, and that can be destructive. So while this can indeed be useful, it would require changing `sample()` to break that promise in some cases. BTW, using a heap for this is uncommon. Search on "reservoir sampling" for more-common ways Most common is probably Vitter's "Algorithm R", which runs in O(len(iterable)) time (no additional log factor for a heap - it doesn't use a heap). I'd prefer to leave `sample()` alone, and introduce some spelling of `possibly_destructive_sample()` for arbitrary iterables - if that's wanted enough for someone to do the work ;-) -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Wed Jun 27 01:08:31 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 27 Jun 2018 15:08:31 +1000 Subject: [Python-ideas] Add an optional type file for Type Annotation In-Reply-To: References: Message-ID: <20180627050831.GY14437@ando.pearwood.info> On Tue, Jun 26, 2018 at 08:59:22PM -0700, ??? wrote: > I'm inspired by TypeScript. > > TypeScript allows people to add **.d.ts* *to annotate ECMAScript's type. So > this can expand the TypeScript's scenes to be used? I'm sorry, I do not understand what you mean by this. Unless you mean stub files? Stub files are already supported. https://www.python.org/dev/peps/pep-0484/ > So I think can we add an optional mechanism to allow people add the extra > file to annotate the existing code. > > Type Annotation can be used after Python 3.5. but many lib/framework has to > be compatible for the Python 3.0-3.4, they can't use annotation. Function annotations work in Python 3.0 onwards. Using comments for annotations work for any version of Python: x = [] # type: List[Employee] This is also discussed in the PEP. -- Steve From j.van.dorp at deonet.nl Wed Jun 27 02:43:25 2018 From: j.van.dorp at deonet.nl (Jacco van Dorp) Date: Wed, 27 Jun 2018 08:43:25 +0200 Subject: [Python-ideas] "Exposing" `__min__` and `__max__` In-Reply-To: References: Message-ID: 2018-06-26 17:34 GMT+02:00 Franklin? Lee : > Caller detects: The caller checks length before calling the dunder. If there > is no dunder, it doesn't check. Are there real-world cases where length is > not defined on an iterable collection? Generators dont have a __len__ method. And they might have min/max that can be calculated without iterating over the entire thing. 
From j.van.dorp at deonet.nl Wed Jun 27 02:43:25 2018
From: j.van.dorp at deonet.nl (Jacco van Dorp)
Date: Wed, 27 Jun 2018 08:43:25 +0200
Subject: [Python-ideas] "Exposing" `__min__` and `__max__`
In-Reply-To:
References:
Message-ID:

2018-06-26 17:34 GMT+02:00 Franklin? Lee:
> Caller detects: the caller checks the length before calling the
> dunder. If there is no dunder, it doesn't check. Are there real-world
> cases where length is not defined on an iterable collection?

Generators don't have a __len__ method, and they might have a min/max
that can be calculated without iterating over the entire thing. The
builtin range() is an example (but also an exception, since it does
have a __len__ attribute; this is specific to range and not generators
in general, though).

However, range() is an example where the dunders could be valuable:
max(range(int(1e7))) already takes noticeable time here, while it's
rather easy to figure the maximum out from start, stop and step, just
like len now does for it.
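Computing it from start/stop/step really is a few lines. A sketch of
what a hypothetical range.__max__ could do, written here as a plain
function:

    # Max of a range in O(1), without iterating.
    def range_max(r):
        if len(r) == 0:
            raise ValueError("max() arg is an empty sequence")
        if r.step > 0:
            return r[-1]   # ascending: last element reached is largest
        return r.start     # descending: the first element is largest

    assert range_max(range(10)) == max(range(10)) == 9
    assert range_max(range(10, 0, -3)) == max(range(10, 0, -3)) == 10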
From Roman.Fiedler at ait.ac.at Wed Jun 27 03:04:35 2018
From: Roman.Fiedler at ait.ac.at (Fiedler Roman)
Date: Wed, 27 Jun 2018 07:04:35 +0000
Subject: [Python-ideas] Correct way for writing Python code without causing interpreter crashes due to parser stack overflow
Message-ID: <4f8556a254eb4df9812c1f684dcbfbb1@ait.ac.at>

Hello List,

Context: we are conducting machine learning experiments that generate
some kind of nested decision trees. As the trees include specific
decision elements (which require custom code to evaluate), we decided
to store the decision tree (the result of the analysis) as generated
Python code. The decision tree can thus be transferred to sensor nodes
(detectors) that will then filter data according to the decision tree
when executing the given code.

Tracking down a crash when executing that generated code, we arrived at
the following simplified reproducer that will cause the interpreter to
crash (on Python 2/3) when loading the code, before execution even
starts:

    #!/usr/bin/python2 -BEsStt
    A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A(None)])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])

The error message is:

    s_push: parser stack overflow
    MemoryError

Despite the machine having 16GB of RAM, the code cannot be loaded.
Splitting it into two lines using an intermediate variable is the
current workaround to still get it running after manual adaptation.

As discussed on the Python security list, crashes when loading such
decision trees or also mathematical formulas (see bug report [1])
should not be a security problem. Even though it is not directly
covered in the Python security model documentation [2], this case comes
too close to "arbitrary code execution", where Python does not attempt
to provide any protection. There might be only some border cases of
affected software, e.g. Python sandbox systems like Zope/Plone or maybe
even Python-based smart contract blockchains like Ethereum (I do not
know if/where they use work derived from the default Python
interpreter). But in both cases they would also be too close to
violating the security model, thus no changes to Python are required
from this side. Python security therefore suggested that the discussion
be continued on this list.

Even with no security problem involved, the crash is still quite an
annoyance. Development of code generators can be a tedious task. It is
somehow frustrating when your generated code is not accepted by the
interpreter, even when you do not feel like you are getting close to
any system-relevant limits, e.g. 50 elements in a line like the one
above on a 16GB machine. You may adapt the generator, but as the error
does not include any information about which limit you really violated
(number of brackets, function calls, list definitions?) you can only
experiment or look at the Python compiler code to figure that out. Even
when you fix it, you have no guarantee that you will not hit some other
obscure limit the next day, or that those limits change from one Python
minor version to the next, causing regressions.

Questions:

* Do you deem it possible/sensible to even attempt to write a Python
language code generator that will produce non-malicious, syntactically
valid decision tree code/mathematical formulas and still have a
sufficiently high probability that the Python interpreter will also run
that code now and in the near future (regressions)?

* Assuming yes to the question above: when generating code, what is the
maximal nesting depth a code generator can always expect to be compiled
on Python 2.7 and 3.5? Are there any other similar restrictions that
need to be considered by the code generator? Or is generating code that
way not the preferred solution anyway - should the code generator
generate e.g. binary Python code immediately? Note: in the end the
exact same logic will run as a Python process; it seems it is only
about how the code is loaded into the Python interpreter.

* If not possible/recommended/sensible, we might generate Java bytecode
or native x86 code instead, where the likelihood of the (virtual) CPU
really executing code that is compliant with the language specification
(even with CPU errata like the FDIV bug et al.) might be magnitudes
higher than with the Python interpreter.

Any feedback appreciated!

Roman

[1] https://bugs.python.org/issue3971
[2] http://python-security.readthedocs.io/security.html#security-model

From solipsis at pitrou.net Wed Jun 27 03:11:26 2018
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 27 Jun 2018 09:11:26 +0200
Subject: [Python-ideas] random.sample should work better with iterators
References: <6933138b-85f0-4b29-97ca-fac0ffce1212@googlegroups.com> <20180627010526.GW14437@ando.pearwood.info>
Message-ID: <20180627091126.7dfcaefa@fsol>

On Tue, 26 Jun 2018 23:52:55 -0500 Tim Peters wrote:
> In Python today, the easiest way to spell Abe's intent is, e.g.,
> [...]
> That also arranges to preserve `sample()`'s promise that all
> sub-slices of the result are valid random samples too (because
> `nlargest` sorts by the randomly generated keys before returning the
> list).

How could slicing return an invalid random sample?

Regards

Antoine.
> > However, range() is an example where the dunders could be valuable - > max(range(1e7)) already takes noticable time here, while it's rather > easy to figure it out from start stop and step, just like len now does > for it. > Have you ever written ``max(range(x))`` in production code? > -------------- next part -------------- An HTML attachment was scrubbed... URL: From j.van.dorp at deonet.nl Wed Jun 27 03:59:11 2018 From: j.van.dorp at deonet.nl (Jacco van Dorp) Date: Wed, 27 Jun 2018 09:59:11 +0200 Subject: [Python-ideas] "Exposing" `__min__` and `__max__` In-Reply-To: References: Message-ID: 2018-06-27 9:36 GMT+02:00 Michael Selik : > > > On Tue, Jun 26, 2018, 11:43 PM Jacco van Dorp wrote: >> >> 2018-06-26 17:34 GMT+02:00 Franklin? Lee : >> > Caller detects: The caller checks length before calling the dunder. If >> > there >> > is no dunder, it doesn't check. Are there real-world cases where length >> > is >> > not defined on an iterable collection? >> >> Generators dont have a __len__ method. And they might have min/max >> that can be calculated without iterating over the entire thing. The >> builtin range() is an example. (but also an exception, since it does >> have a __len__ attribute. This is specifically part of range and not >> generators in general, though.). >> >> However, range() is an example where the dunders could be valuable - >> max(range(1e7)) already takes noticable time here, while it's rather >> easy to figure it out from start stop and step, just like len now does >> for it. > > > Have you ever written ``max(range(x))`` in production code? Have you ever written len(range(x)) in production code ? From gregory.lielens at gmail.com Wed Jun 27 05:04:12 2018 From: gregory.lielens at gmail.com (Greg) Date: Wed, 27 Jun 2018 11:04:12 +0200 Subject: [Python-ideas] Allow mutable builtin types (optionally) Message-ID: As I introduced (a long time ago) this demand, let me add my grain of salt here. The use case is pretty simple, and somewhat common when writing manually C extension class: The reason to write extension class is usually performance, or link into an existing library. When doing this manually (instead of using automatic python-wrapping tools like boost, swig,...) you try to wrap the minimum amount of methods/accessors/...to your underlying c/c++ class, and replicate non-critical methods in python. Moreover, extending your class by adding new methods is usually much more easy in Python, especially if it involve complex but not performance-bounded python-data manipulation. Problem is to make those python-implemented methods avaible to instances of your extension class, especially when those instances are returned by the C layer of your extension. The solution we choose was to change the __class__ of each extension type instance to the python derived newclass implementing all those extra-methods.Not too difficult, a simple encapsulation of all methods returning extension-class instances is enough, and can be automated. This solution is quite common I think, it translate something you do for python-class instances, but then you get the __class__ assignment: only for heap types error. The argument about sub-interpreters is a good one, but not really applicable for this use case: we really want to have one extension type (or a hierarchy of it) shared across all interpreter importing the extension, it just happen that instead of being implemented in pure C/C++, the extension is implemented in C/C++ and Python. 
The fact that the Python parts will be seen everywhere is a feature,
not a problem: you expect the replacement of C-implemented methods by
Python-implemented methods to be as transparent as possible.

Alternatives would be to use a heap type for our C extension classes
(we need to check what that would imply, but it may be quite painless)
or to use some form of delegation instead of assigning to __class__.
The latter is not really painless, AFAIK, in terms of coding complexity
and possibly performance (extra lookup steps are needed). If there are
other solutions, or if delegation can be made as simple/efficient as
the __class__ mechanism, it would be good to know; that is, I think,
valuable information for many people writing extension classes.

Anyway, my personal position on this has not changed in ten years and
is in line with Eloi's: I think that being a heap type and allowing
assignment to the __class__ attribute of instances are indeed quite
orthogonal.
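The pattern Greg describes, shown at Python level with plain classes
standing in for the extension type - for a true static (non-heap) C
type, the final assignment is exactly what raises the "only for heap
types" TypeError:

    # Pure-Python illustration of the __class__-swapping pattern.
    class CPoint:                    # stands in for the C extension type
        def __init__(self, x, y):
            self.x, self.y = x, y

    class Point(CPoint):             # Python subclass adding extra methods
        def manhattan(self):
            return abs(self.x) + abs(self.y)

    p = CPoint(3, -4)                # imagine this came from the C layer
    p.__class__ = Point              # fine here (CPoint is a heap type);
                                     # TypeError for a static C type
    print(p.manhattan())             # 8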
From Eloi.Gaudry at fft.be Tue Jun 26 15:38:29 2018
From: Eloi.Gaudry at fft.be (Eloi Gaudry)
Date: Tue, 26 Jun 2018 19:38:29 +0000
Subject: [Python-ideas] Allow mutable builtin types (optionally)
In-Reply-To:
References: <1525707472.12114.1.camel@fft.be> <1525764405.24469.1.camel@fft.be>
Message-ID:

Hi Brett,

Sorry about that, I did not mean to be rude.

What I wanted to say is:
1. that I relied on such a feature
2. that other people on this mailing list have already asked for
   something similar on several occasions
3. that HEAPTYPE would not always be a solution
4. and that I therefore thought this was something that would indeed
   need more discussion and might get acceptance if discussed once
   again.

Eloi

From: Brett Cannon
Sent: Tuesday, June 26, 2018 8:53 PM
> Please be careful about using the word "need" as it comes off as
> demanding instead of as a suggestion.
> [...]

From solipsis at pitrou.net Wed Jun 27 05:20:35 2018
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 27 Jun 2018 11:20:35 +0200
Subject: [Python-ideas] Have a "j" format option for lists
References:
Message-ID: <20180627112035.40483487@fsol>

On Wed, 9 May 2018 09:39:08 -0300 Facundo Batista wrote:
> This way, I could do:
>
>     >>> authors = ["John", "Mary", "Estela"]
>     >>> "Authors: {:, j}".format(authors)
>     'Authors: John, Mary, Estela'
>
> In this case the join can be made in the format, yes, but this
> proposal would be very useful when the info to format comes inside a
> structure together with other stuff, like...
>
>     >>> info = {
>     ...     'title': "A book",
>     ...     'price': Decimal("2.34"),
>     ...     'authors': ["John", "Mary", "Estela"],
>     ... }
>     ...
>     >>> print("{title!r} (${price}) by {authors:, j}".format(**info))
>     "A book" ($2.34) by John, Mary, Estela
>
> What do you think?

-1.

I hate that the format language is slowly becoming more and more
crufty, leading to unreadable code. Yes, typing `", ".join(...)` is
more work, but at least the end result is readable.

Regards

Antoine.
From steve at pearwood.info Wed Jun 27 06:01:02 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 27 Jun 2018 20:01:02 +1000
Subject: [Python-ideas] "Exposing" `__min__` and `__max__`
In-Reply-To:
References:
Message-ID: <20180627100101.GE14437@ando.pearwood.info>

On Wed, Jun 27, 2018 at 12:36:12AM -0700, Michael Selik wrote:
> On Tue, Jun 26, 2018, 11:43 PM Jacco van Dorp wrote:
> > Generators don't have a __len__ method, and they might have a
> > min/max that can be calculated without iterating over the entire
> > thing. The builtin range() is an example (but also an exception,
> > since it does have a __len__ attribute; this is specific to range
> > and not generators in general, though).

range is not a generator.

> > However, range() is an example where the dunders could be valuable:
> > max(range(int(1e7))) already takes noticeable time here, while it's
> > rather easy to figure the maximum out from start, stop and step,
> > just like len now does for it.
>
> Have you ever written ``max(range(x))`` in production code?

I have never written that.

But I have written ``max(iterable)`` dozens of times, where iterable
could be a range object.

--
Steve

From mike at selik.org Wed Jun 27 09:52:14 2018
From: mike at selik.org (Michael Selik)
Date: Wed, 27 Jun 2018 06:52:14 -0700
Subject: [Python-ideas] "Exposing" `__min__` and `__max__`
In-Reply-To: <20180627100101.GE14437@ando.pearwood.info>
References: <20180627100101.GE14437@ando.pearwood.info>
Message-ID:

On Wed, Jun 27, 2018, 3:06 AM Steven D'Aprano wrote:
> I have never written that.
>
> But I have written ``max(iterable)`` dozens of times, where iterable
> could be a range object.

My intent was to ask where a range was in fact passed into max, not
merely where it could be. It'd be enlightening to see a complete,
realistic example.
Imagine that Python's len() always walked the entire iterable, from start to end, to count the length. Now suppose that you proposed adding a __len__ protocol so that objects that know their own length can report it quickly, and in response I argued that

    len(range(x))

was unrealistic and that there is no need for a __len__ method because we could just say

    range(x).stop

instead. I don't think you would find that argument very persuasive, would you?

-- 
Steve

From ethan at stoneleaf.us  Wed Jun 27 10:46:23 2018
From: ethan at stoneleaf.us (Ethan Furman)
Date: Wed, 27 Jun 2018 07:46:23 -0700
Subject: [Python-ideas] Should nested classes in an Enum be Enum members?
Message-ID: <5B33A33F.30207@stoneleaf.us>

Consider the following Enum definition:

     class Color(Enum):
         RED = 1
         GREEN = 2
         BLUE = 3
         @property
         def lower(self):
             return self.name.lower()
         def spam(self):
             return "I like %s eggs and spam!" % self.lower
         class SomeClass:
             pass

Which of the above Color attributes are enums, and which aren't?

.

.

.

Answer:

- RED, GREEN, and BLUE are members

- lower and spam() are not

- SomeClass /is/ a member (but not its instances)

Question:

Should `SomeClass` be an enum member?  When would it be useful to have an embedded class in an Enum be an enum member?

The only example I have seen so far of nested classes in an Enum is when folks want to make an Enum of Enums, and the nested Enum should not itself be an enum member.  Since the counter-example already works I haven't seen any requests for it.  ;)

So I'm asking the community:  What real-world examples can you offer for either behavior?  Cases where nested classes should be enum members, and cases where nested classes should not be members.

Thanks!

-- 
~Ethan~

From guido at python.org  Wed Jun 27 11:04:06 2018
From: guido at python.org (Guido van Rossum)
Date: Wed, 27 Jun 2018 08:04:06 -0700
Subject: [Python-ideas] Correct way for writing Python code without causing interpreter crashes due to parser stack overflow
In-Reply-To: <4f8556a254eb4df9812c1f684dcbfbb1@ait.ac.at>
References: <4f8556a254eb4df9812c1f684dcbfbb1@ait.ac.at>
Message-ID: 

I consider this a bug -- a violation of Python's (informal) promise to the user that when CPython segfaults it is not the user's fault.

Given typical Python usage patterns, I don't consider this an important bug, but maybe someone is interested in trying to fix it.

As far as your application is concerned, I'm not sure that generating code like that is the right approach. Why don't you generate a data structure and a little engine that walks the data structure?

On Wed, Jun 27, 2018 at 12:05 AM Fiedler Roman wrote:

> Hello List,
>
> Context: we are conducting machine learning experiments that generate some kind of nested decision trees. As the tree includes specific decision elements (which require custom code to evaluate), we decided to store the decision tree (result of the analysis) as generated Python code. Thus the decision tree can be transferred to sensor nodes (detectors) that will then filter data according to the decision tree when executing the given code.
>
> Tracking down a crash when executing that generated code, we came to the following simplified reproducer that will cause the interpreter to crash (on Python 2/3) when loading the code before execution is started:
>
> #!/usr/bin/python2 -BEsStt
> A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A(None)])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])
>
> The error message is:
>
> s_push: parser stack overflow
> MemoryError
>
> Despite the machine having 16GB of RAM, the code cannot be loaded. Splitting it into two lines using an intermediate variable is the current workaround to still get it running after manual adapting.
>
> As discussed on the Python security list, crashes when loading such decision trees or also mathematical formulas (see bug report [1]) should not be a security problem. Even when not directly covered in the Python security model documentation [2], this case comes too close to "arbitrary code execution", where Python does not attempt to provide any protection. There might be only some border cases of affected software, e.g. Python sandbox systems like Zope/Plone or maybe even Python-based smart contract blockchains like Ethereum (I do not know if/where they use or derive work from the default Python interpreter). But in both cases they would also be too close to violating the security model, thus no changes to Python are required from this side. Thus Python security suggested that the discussion should be continued on this list.
>
> Even when no security problem is involved, the crash is still quite an annoyance. Development of code generators can be a tedious task. It is then somehow frustrating when your generated code is not accepted by the interpreter, even when you do not feel like getting close to some system-relevant limits, e.g. 50 elements in a line like above on a 16GB machine. You may adapt the generator, but as the error does not include any information about which limit you really violated (number of brackets, function calls, list definitions?) you can only do experiments or look at the Python compiler code to figure that out. Even when you fix it, you have no guarantee that you will not hit some other obscure limit the next day, or that those limits change from one Python minor version to the next, causing regressions.
>
> Questions:
>
> * Do you deem it possible/sensible to even attempt to write a Python language code generator that will produce non-malicious, syntactically valid decision tree code/mathematical formulas and still have a sufficiently high probability that the Python interpreter will also run that code now and in the near future (regressions)?
>
> * Assuming yes to the question above, when generating code, what should be the maximal nesting depth a code generator can always expect to be compiled on Python 2.7 and 3.5? Are there any other similar restrictions that need to be considered by the code generator? Or is generating code that way not the preferred solution anyway - should the code generator generate e.g. binary Python code immediately? Note: in the end the exact same logic code will run as a Python process; it seems it is only about how it is loaded into the Python interpreter.
>
> * If not possible/recommended/sensible, we might generate Java bytecode or native x86 code instead, where the likelihood of the (virtual) CPU really executing code that is compliant with the language specification (even with CPU errata like the FDIV bug et al.) might be magnitudes higher than with the Python interpreter.
>
> Any feedback appreciated!
>
> Roman
>
> [1] https://bugs.python.org/issue3971
> [2] http://python-security.readthedocs.io/security.html#security-model
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

-- 
--Guido van Rossum (python.org/~guido)

From mike at selik.org  Wed Jun 27 11:05:19 2018
From: mike at selik.org (Michael Selik)
Date: Wed, 27 Jun 2018 08:05:19 -0700
Subject: [Python-ideas] "Exposing" `__min__` and `__max__`
In-Reply-To: <20180627142701.GH14437@ando.pearwood.info>
References: <20180627100101.GE14437@ando.pearwood.info> <20180627142701.GH14437@ando.pearwood.info>
Message-ID: 

On Wed, Jun 27, 2018 at 7:30 AM Steven D'Aprano wrote:
> On Wed, Jun 27, 2018 at 06:52:14AM -0700, Michael Selik wrote:
> > My intent was to ask where a range was in fact passed into max, not merely
> > where it could be. It'd be enlightening to see a complete, realistic
> > example.
>
> A complete, realistic example is as I said: you call max() on some
> object which you don't control, the caller does. You could be
> passed a list, or a set, or a bitset, a binary search tree, a range
> object, whatever the caller happens to pass to you.

This is not a complete, realistic example. You're describing what an example might be, but not providing a concrete one with context.

Quoting Guido from earlier in the thread: "I think just finding a data structure that should implement its own min/max functionality (or maybe one of these, like heapq) is not enough motivation. You have to find code where such a data structure (let's say a Tree) is passed to some function that also accepts, say, a list."

> Imagine that Python's len() always walked the entire iterable, from
> start to end, to count the length. Now suppose that you proposed
> adding a __len__ protocol so that objects that know their own length
> can report it quickly, and in response I argued that
>     len(range(x))
> was unrealistic and that there is no need for a __len__ method
> because we could just say
>     range(x).stop
> instead. I don't think you would find that argument very persuasive,
> would you?

I would, actually. The range object is particularly unpersuasive as a motivation for magic methods, because of its unusual usage. The hypothetical Tree object discussed earlier was much more interesting.
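To make the "hypothetical Tree" concrete, here is a sketch of the kind of code Guido's criterion asks for: one function that accepts either a list or a Tree. The Tree class and the __min__/__max__ hooks are assumptions for illustration, not an existing protocol.

    from bisect import insort

    class Tree:
        """Toy sorted container standing in for a binary search tree."""
        def __init__(self, items=()):
            self._items = sorted(items)

        def add(self, item):
            insort(self._items, item)

        def __iter__(self):
            return iter(self._items)

        def __min__(self):
            return self._items[0]   # O(1)

        def __max__(self):
            return self._items[-1]  # O(1)

    def value_range(data):
        """Accepts a list, a Tree, or any iterable of comparables."""
        lo = getattr(type(data), "__min__", None)
        hi = getattr(type(data), "__max__", None)
        if lo is not None and hi is not None:
            return lo(data), hi(data)   # no full scan needed
        return min(data), max(data)     # O(n) fallback for list, set, ...

    print(value_range([3, 1, 4, 1, 5]))        # (1, 5)
    print(value_range(Tree([3, 1, 4, 1, 5])))  # (1, 5), without scanning

The point of the sketch is that value_range() itself stays generic: today it would have to scan the Tree item by item, while the hooks let the container answer from what it already knows.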
From solipsis at pitrou.net  Wed Jun 27 11:12:03 2018
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 27 Jun 2018 17:12:03 +0200
Subject: [Python-ideas] Correct way for writing Python code without causing interpreter crashes due to parser stack overflow
References: <4f8556a254eb4df9812c1f684dcbfbb1@ait.ac.at>
Message-ID: <20180627171203.2e57a41c@fsol>

The OP says "crash" (implying some kind of segfault) but here the snippet raises a mere exception:

Python 2.7.12 (default, Dec  4 2017, 14:50:18)
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A(None)])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])
s_push: parser stack overflow
MemoryError
>>>

Regards

Antoine.

On Wed, 27 Jun 2018 08:04:06 -0700
Guido van Rossum wrote:
> I consider this is a bug -- a violation of Python's (informal) promise to
> the user that when CPython segfaults it is not the user's fault.
>
> Given typical Python usage patterns, I don't consider this an important
> bug, but maybe someone is interested in trying to fix it.
>
> As far as your application is concerned, I'm not sure that generating code
> like that is the right approach. Why don't you generate a data structure
> and a little engine that walks the data structure?
>
> On Wed, Jun 27, 2018 at 12:05 AM Fiedler Roman wrote:
>
> > Hello List,
> >
> > Context: we are conducting machine learning experiments that generate some
> > kind of nested decision trees. As the tree includes specific decision
> > elements (which require custom code to evaluate), we decided to store the
> > decision tree (result of the analysis) as generated Python code. Thus the
> > decision tree can be transferred to sensor nodes (detectors) that will then
> > filter data according to the decision tree when executing the given code.
> >
> > Tracking down a crash when executing that generated code, we came to
> > following simplified reproducer that will cause the interpreter to crash
> > (on Python 2/3) when loading the code before execution is started:
> >
> > #!/usr/bin/python2 -BEsStt
> >
> > A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A(None)])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])
> >
> > The error message is:
> >
> > s_push: parser stack overflow
> > MemoryError
> >
> > Despite the machine having 16GB of RAM, the code cannot be loaded.
> > Splitting it into two lines using an intermediate variable is the current
> > workaround to still get it running after manual adapting.
> >
> > As discussed on Python security list, crashes when loading such decision
> > trees or also mathematical formulas (see bug report [1]) should not be a
> > security problem. Even when not directly covered in the Python security
> > model documentation [2], this case comes too close to "arbitrary code
> > execution", where Python does not attempt to provide any protection. There
> > might be only some border cases of affected software, e.g. Python sandbox
> > systems like Zope/Plone or maybe even Python based smart contract
> > blockchains like Etherereum (do not know if/where the use/derived work from
> > the default Python interpreter for their use).
But in both cases they would > > also be too close violating the security model, thus no changes to Python > > required from this side. Thus Python security suggested that the discussion > > should be continued on this list. > > > > > > Even when no security problem involved, the crash is still quite an > > annoyance. Development of code generators can be a tedious tasks. It is > > then somehow frustrating, when your generated code is not accepted by the > > interpreter, even when you do not feel like getting close to some > > system-relevant limits, e.g. 50 elements in a line like above on a 16GB > > machine. You may adapt the generator, but as the error does not include any > > information, which limit you really violated (number of brackets, function > > calls, list definitions?) you can only do experiments or look on the Python > > compiler code to figure that out. Even when you fix it, you have no > > guarantee to hit some other obscure limit the next day or that those limits > > change from one Python minor version to the next causing regressions. > > > > Questions: > > > > * Do you deem it possible/sensible to even attempt to write a Python > > language code generator that will produce non-malicious, syntactically > > valid decision tree code/mathematical formulas and still having a > > sufficiently high probability that the Python interpreter will also run > > that code now and in near future (regressions)? > > > > * Assuming yes to the question above, when generating code, what should be > > the maximal nesting depth a code generator can always expect to be compiled > > on Python 2.7 and 3.5 on? Are there any other similar restrictions that > > need to be considered by the code generator? Or is generating code that way > > not the preferred solution anyway - the code generator should generate e.g. > > binary python code immediately? Note: in the end the exact same logic code > > will run as Python process, it seems it is only about how it is loaded into > > the Python interpreter. > > > > * If not possible/recommended/sensible, we might generate Java-bytecode or > > native x86-code instead, where the likelihood of the (virtual) CPU really > > executing code that is compliant to the language specification (even with > > CPU errata like FDIV-bug et al) might be magnitudes higher than with the > > Python interpreter. > > > > Any feedback appreciated! > > > > Roman > > > > [1] https://bugs.python.org/issue3971) > > [2] http://python-security.readthedocs.io/security.html#security-model > > _______________________________________________ > > Python-ideas mailing list > > Python-ideas at python.org > > https://mail.python.org/mailman/listinfo/python-ideas > > Code of Conduct: http://python.org/psf/codeofconduct/ > > > > From leewangzhong+python at gmail.com Wed Jun 27 11:16:06 2018 From: leewangzhong+python at gmail.com (Franklin? Lee) Date: Wed, 27 Jun 2018 11:16:06 -0400 Subject: [Python-ideas] "Exposing" `__min__` and `__max__` In-Reply-To: <20180627142701.GH14437@ando.pearwood.info> References: <20180627100101.GE14437@ando.pearwood.info> <20180627142701.GH14437@ando.pearwood.info> Message-ID: On Wed, Jun 27, 2018, 10:31 Steven D'Aprano wrote: > On Wed, Jun 27, 2018 at 06:52:14AM -0700, Michael Selik wrote: > > > > > Have you ever written ``max(range(x))`` in production code? > > > > > > I have never written that. > > > > > > But I have written ``max(iterable)`` dozens of times, where iterable > > > could be a range object. 
> > My intent was to ask where a range was in fact passed into max, not merely
> > where it could be. It'd be enlightening to see a complete, realistic
> > example.
>
> A complete, realistic example is as I said: you call max() on some
> object which you don't control, the caller does. You could be
> passed a list, or a set, or a bitset, a binary search tree, a range
> object, whatever the caller happens to pass to you.
>
> If you control your own input, then this doesn't sound too interesting.
> You know when you will get a list, and you can call max() on it, and you
> know when you are passing yourself a tree, and you can give your own
> tree a max() method and call that.
>
> But if you don't control your input, then this is a good way to delegate
> back to the unknown object of an unknown class which knows itself.
>
> Currently max() and min() have no choice but to walk the entire data
> structure, even if the object already knows its own maximum and minimum
> values. That's wasteful.
>
> An analogy: in Ruby, the equivalent of the len() built-in falls back on
> iteration as a last resort for any object which doesn't define a len
> method:
>
>     size = 0
>     for x in obj:
>         size += 1
>     return size
>
> By this analogy, max() and min() currently are like that last-resort
> version of len(), except they do it for *every* object that can be
> iterated over even when there's no need.
>
> Imagine that Python's len() always walked the entire iterable, from
> start to end, to count the length. Now suppose that you proposed
> adding a __len__ protocol so that objects that know their own bounds
> can report it quickly, and in response I argued that
>     len(range(x))
> was unrealistic and that there is no need for a __len__ method
> because we could just say
>     range(x).stop
> instead. I don't think you would find that argument very persuasive,
> would you?

Let's just assume Michael wants to know, and isn't making an argument against the proposal.

`range` is more a sequence (a collection) than a generator. It's not Python 2's xrange. You can iterate through one multiple times. If `range` did not have a length, we would eventually propose that it should.

I can't think of a real iterable which has no length but knows what its max is. An iterator or generator can't generally know its max without consuming everything. I have some vague thoughts about collections which have known bounds but not known sizes, but nothing concrete.

It's bikeshedding, anyway. How isn't as important as whether it's useful.

By the way, range(1e7) fails with a type error.

From ncoghlan at gmail.com  Wed Jun 27 11:17:52 2018
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 28 Jun 2018 01:17:52 +1000
Subject: [Python-ideas] Correct way for writing Python code without causing interpreter crashes due to parser stack overflow
In-Reply-To: <4f8556a254eb4df9812c1f684dcbfbb1@ait.ac.at>
References: <4f8556a254eb4df9812c1f684dcbfbb1@ait.ac.at>
Message-ID: 

On 27 June 2018 at 17:04, Fiedler Roman wrote:
> Hello List,
>
> Context: we are conducting machine learning experiments that generate some kind of nested decision trees. As the tree includes specific decision elements (which require custom code to evaluate), we decided to store the decision tree (result of the analysis) as generated Python code.
Thus the decision tree can be transferred to sensor nodes (detectors) that will then filter data according to the decision tree when executing the given code. > > Tracking down a crash when executing that generated code, we came to following simplified reproducer that will cause the interpreter to crash (on Python 2/3) when loading the code before execution is started: > > #!/usr/bin/python2 -BEsStt > A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A(None)])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])]) > > The error message is: > > s_push: parser stack overflow > MemoryError > > Despite the machine having 16GB of RAM, the code cannot be loaded. Splitting it into two lines using an intermediate variable is the current workaround to still get it running after manual adapting. This seems like it may indicate a potential problem in the pgen2 parser generator, since the compilation is failing at the original parse step, but checking the largest version of this that CPython can parse on my machine gives a syntax tree of only ~77kB: >>> tree = parser.expr("A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A(None)])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])") >>> sys.getsizeof(tree) 77965 Attempting to print that hints more closely at the potential problem: >>> tree.tolist() Traceback (most recent call last): File "", line 1, in RecursionError: maximum recursion depth exceeded while getting the repr of an object As far as I'm aware, the CPython parser is using the actual C stack for recursion, and is hence throwing MemoryError because it ran out of stack space to recurse into, not because it ran out of memory in general (RecursionError would be a more accurate exception). Trying your original example in PyPy (which uses a different parser implementation) suggests you may want to try using that as your execution target before resorting to switching languages entirely: >>>> tree2 = parser.expr("A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A(None)])])])])])])])])])]]))])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])") >>>> len(tree2.tolist()) 5 Alternatively, you could explore mimicking the way that scikit-learn saves its trained models (which I believe is a variation on "use pickle", but I've never actually gone and checked for sure). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From mike at selik.org Wed Jun 27 11:23:18 2018 From: mike at selik.org (Michael Selik) Date: Wed, 27 Jun 2018 08:23:18 -0700 Subject: [Python-ideas] Correct way for writing Python code without causing interpreter crashes due to parser stack overflow In-Reply-To: <4f8556a254eb4df9812c1f684dcbfbb1@ait.ac.at> References: <4f8556a254eb4df9812c1f684dcbfbb1@ait.ac.at> Message-ID: On Wed, Jun 27, 2018 at 12:04 AM Fiedler Roman wrote: > Context: we are conducting machine learning experiments that generate some > kind of nested decision trees. As the tree includes specific decision > elements (which require custom code to evaluate), we decided to store the > decision tree (result of the analysis) as generated Python code. 
Thus the
> decision tree can be transferred to sensor nodes (detectors) that will then
> filter data according to the decision tree when executing the given code.

How do you write tests for the sensor nodes? Do they use code as data for test cases?

From chris.barker at noaa.gov  Wed Jun 27 11:23:34 2018
From: chris.barker at noaa.gov (Chris Barker - NOAA Federal)
Date: Wed, 27 Jun 2018 08:23:34 -0700
Subject: [Python-ideas] "Exposing" `__min__` and `__max__`
In-Reply-To: 
References: 
Message-ID: 

I don't think anyone would dispute that there are use cases for __max__ and __min__ special methods. However, there is substantial overhead to adding new magic methods, so the question is not whether it would be useful in some special cases, but whether it would be useful enough in common enough cases to be worth the overhead.

For example, we had a discussion on this list about adding a __sort_key__ magic method, so that classes could make themselves efficiently sortable. That was not deemed generally useful enough to be worth it.

In this case, I think numpy arrays are a good example to think about (as opposed to range objects :-) )

Numpy arrays can certainly find their max and min more efficiently than the generic functions, and are "proper" sequences that can be used in generic code. But they also have a particular signature for max and min (axis parameter) so really don't map well to a dunder. They also have their own ways of doing all sorts of things that are different than a "typical" iterable. I expect that is the case for other special data structures like trees, etc.

So I don't think this rises to the level of generally universal enough for a magic method.

-CHB

From Roman.Fiedler at ait.ac.at  Wed Jun 27 11:33:25 2018
From: Roman.Fiedler at ait.ac.at (Fiedler Roman)
Date: Wed, 27 Jun 2018 15:33:25 +0000
Subject: [Python-ideas] Correct way for writing Python code without causing interpreter crashes due to parser stack overflow
In-Reply-To: 
References: <4f8556a254eb4df9812c1f684dcbfbb1@ait.ac.at>
Message-ID: <472f306d91bb4149b640a76528e68571@ait.ac.at>

> Von: Guido van Rossum [mailto:guido at python.org]
>
> I consider this is a bug -- a violation of Python's (informal) promise to the user
> that when CPython segfaults it is not the user's fault.

Strictly it is not a segfault, just a parser exception that cannot be caught (at least I failed to catch it in a quick test). Seems that the catch block is parsed after parsing the problematic code, so any "except" in the code itself is useless. Apart from that: even when caught, what to do? Your program partially refuses to load - the only benefit is that you can die gracefully.

> Given typical Python usage patterns, I don't consider this an important bug,
> but maybe someone is interested in trying to fix it.

Acknowledged: I do not know of any software where this has high relevance, but my knowledge is quite limited, so I asked PSRT before to be sure.

> As far as your application is concerned, I'm not sure that generating code like
> that is the right approach. Why don't you generate a data structure and a little
> engine that walks the data structure?

That's what I told the colleague asking me to assist in analysis of the crash too. I guess that the "simple generator" was just easier to write, thus used as a starting point. And now by chance a model was generated hitting the Python limit of 50 instantiations/lists per statement or whatever.
So there is not much "why" to be explained, it just happened. Kind regards, Roman > On Wed, Jun 27, 2018 at 12:05 AM Fiedler Roman > wrote: > > > Hello List, > > Context: we are conducting machine learning experiments that > generate some kind of nested decision trees. As the tree includes specific > decision elements (which require custom code to evaluate), we decided to > store the decision tree (result of the analysis) as generated Python code. Thus > the decision tree can be transferred to sensor nodes (detectors) that will then > filter data according to the decision tree when executing the given code. > > Tracking down a crash when executing that generated code, we came > to following simplified reproducer that will cause the interpreter to crash (on > Python 2/3) when loading the code before execution is started: > > #!/usr/bin/python2 -BEsStt > A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A > ([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A(No > ne)])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])]) > > The error message is: > > s_push: parser stack overflow > MemoryError > > Despite the machine having 16GB of RAM, the code cannot be loaded. > Splitting it into two lines using an intermediate variable is the current > workaround to still get it running after manual adapting. > > As discussed on Python security list, crashes when loading such > decision trees or also mathematical formulas (see bug report [1]) should not > be a security problem. Even when not directly covered in the Python security > model documentation [2], this case comes too close to "arbitrary code > execution", where Python does not attempt to provide any protection. There > might be only some border cases of affected software, e.g. Python sandbox > systems like Zope/Plone or maybe even Python based smart contract > blockchains like Etherereum (do not know if/where the use/derived work > from the default Python interpreter for their use). But in both cases they > would also be too close violating the security model, thus no changes to > Python required from this side. Thus Python security suggested that the > discussion should be continued on this list. > > > Even when no security problem involved, the crash is still quite an > annoyance. Development of code generators can be a tedious tasks. It is then > somehow frustrating, when your generated code is not accepted by the > interpreter, even when you do not feel like getting close to some system- > relevant limits, e.g. 50 elements in a line like above on a 16GB machine. You > may adapt the generator, but as the error does not include any information, > which limit you really violated (number of brackets, function calls, list > definitions?) you can only do experiments or look on the Python compiler > code to figure that out. Even when you fix it, you have no guarantee to hit > some other obscure limit the next day or that those limits change from one > Python minor version to the next causing regressions. > > Questions: > > * Do you deem it possible/sensible to even attempt to write a Python > language code generator that will produce non-malicious, syntactically valid > decision tree code/mathematical formulas and still having a sufficiently high > probability that the Python interpreter will also run that code now and in near > future (regressions)? 
> * Assuming yes to the question above, when generating code, what
> should be the maximal nesting depth a code generator can always expect to
> be compiled on Python 2.7 and 3.5 on? Are there any other similar
> restrictions that need to be considered by the code generator? Or is
> generating code that way not the preferred solution anyway - the code
> generator should generate e.g. binary python code immediately? Note: in the
> end the exact same logic code will run as Python process, it seems it is only
> about how it is loaded into the Python interpreter.
>
> * If not possible/recommended/sensible, we might generate Java-
> bytecode or native x86-code instead, where the likelihood of the (virtual) CPU
> really executing code that is compliant to the language specification (even
> with CPU errata like FDIV-bug et al) might be magnitudes higher than with the
> Python interpreter.
>
> Any feedback appreciated!
>
> Roman
>
> [1] https://bugs.python.org/issue3971
> [2] http://python-security.readthedocs.io/security.html#security-model
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
> --
> --Guido van Rossum (python.org/~guido)

From mike at selik.org  Wed Jun 27 11:30:25 2018
From: mike at selik.org (Michael Selik)
Date: Wed, 27 Jun 2018 08:30:25 -0700
Subject: [Python-ideas] "Exposing" `__min__` and `__max__`
In-Reply-To: 
References: <20180627100101.GE14437@ando.pearwood.info> <20180627142701.GH14437@ando.pearwood.info>
Message-ID: 

On Wed, Jun 27, 2018 at 8:16 AM Franklin? Lee wrote:
> On Wed, Jun 27, 2018, 10:31 Steven D'Aprano wrote:
>> On Wed, Jun 27, 2018 at 06:52:14AM -0700, Michael Selik wrote:
>> > My intent was to ask where a range was in fact passed into max, not
>> > merely
>> > where it could be. It'd be enlightening to see a complete, realistic
>> > example.
>>
>> A complete, realistic example is as I said: you call max() on some
>> object which you don't control, the caller does. You could be
>> passed a list, or a set, or a bitset, a binary search tree, a range
>> object, whatever the caller happens to pass to you.
>
> Let's just assume Michael wants to know, and isn't making an argument
> against the proposal.

I do want to know, but it's also an argument against the proposal -- that no one has contributed in-context usage to demonstrate the value. I'd want to see code that currently uses ``if isinstance`` to switch between ``max(x)`` and ``x.max()``. Chris Barker explained the issue well.

From Roman.Fiedler at ait.ac.at  Wed Jun 27 11:47:15 2018
From: Roman.Fiedler at ait.ac.at (Fiedler Roman)
Date: Wed, 27 Jun 2018 15:47:15 +0000
Subject: [Python-ideas] Correct way for writing Python code without causing interpreter crashes due to parser stack overflow
In-Reply-To: 
References: <4f8556a254eb4df9812c1f684dcbfbb1@ait.ac.at>
Message-ID: <382f306d91bb4149b640a76528e68571@ait.ac.at>
Thus the > decision tree can be transferred to sensor nodes (detectors) that will then > filter data according to the decision tree when executing the given code. > > > > Tracking down a crash when executing that generated code, we came to > following simplified reproducer that will cause the interpreter to crash (on > Python 2/3) when loading the code before execution is started: > > > > #!/usr/bin/python2 -BEsStt > > > A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([ > A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A(None)])])]) > ])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])]) > > > > The error message is: > > > > s_push: parser stack overflow > > MemoryError > > > > Despite the machine having 16GB of RAM, the code cannot be loaded. > Splitting it into two lines using an intermediate variable is the current > workaround to still get it running after manual adapting. > > This seems like it may indicate a potential problem in the pgen2 > parser generator, since the compilation is failing at the original > parse step, but checking the largest version of this that CPython can > parse on my machine gives a syntax tree of only ~77kB: > > >>> tree = > parser.expr("A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A( > [A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A(None)])])] > )])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])") > >>> sys.getsizeof(tree) > 77965 > > Attempting to print that hints more closely at the potential problem: > > >>> tree.tolist() > Traceback (most recent call last): > File "", line 1, in > RecursionError: maximum recursion depth exceeded while getting the > repr of an object > > As far as I'm aware, the CPython parser is using the actual C stack > for recursion, and is hence throwing MemoryError because it ran out of > stack space to recurse into, not because it ran out of memory in > general (RecursionError would be a more accurate exception). That seems conclusive. Knowing the cause but fearing regressions, maybe the code should not be changed regarding the limits (thus opening a can of worms) but something like that might be nice: * Raise RecursionError('Maximum supported compile time parser recursion depth of [X] exceeded, see [docuref]') * With the python-warn-all flag, issue a warning if a file reaches half or 75% of the limit during parsing? > Trying your original example in PyPy (which uses a different parser > implementation) suggests you may want to try using that as your > execution target before resorting to switching languages entirely: > > >>>> tree2 = > parser.expr("A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A( > [A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([ > A(None)])])])])])])])])])]]))])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])]) > ])") > >>>> len(tree2.tolist()) > 5 > > Alternatively, you could explore mimicking the way that scikit-learn > saves its trained models (which I believe is a variation on "use > pickle", but I've never actually gone and checked for sure). Thank you for your very informative post, both solutions/workaround seem appropriate. Apart from that, the "scikit-learn" might also have the advantage to use something more "standardizes", thus easing cooperation in scientific community. I will pass this information on to my colleague. 
From Roman.Fiedler at ait.ac.at  Wed Jun 27 12:18:54 2018
From: Roman.Fiedler at ait.ac.at (Fiedler Roman)
Date: Wed, 27 Jun 2018 16:18:54 +0000
Subject: [Python-ideas] Correct way for writing Python code without causing interpreter crashes due to parser stack overflow
In-Reply-To: 
References: <4f8556a254eb4df9812c1f684dcbfbb1@ait.ac.at>
Message-ID: 

> Von: Michael Selik [mailto:mike at selik.org]
>
> On Wed, Jun 27, 2018 at 12:04 AM Fiedler Roman wrote:
> > Context: we are conducting machine learning experiments that generate some kind of nested decision trees. As the tree includes specific decision elements (which require custom code to evaluate), we decided to store the decision tree (result of the analysis) as generated Python code. Thus the decision tree can be transferred to sensor nodes (detectors) that will then filter data according to the decision tree when executing the given code.
>
> How do you write tests for the sensor nodes? Do they use code as data for test cases?

We have two approaches for test data generation: as we are processing log data, we may use adaptive, self-learning log data generators that can then be spiked with anomalies. In other tests we used armored zero day exploits on production-like test systems to get more realistic data.

The big picture: When finally everything is working, distributed sensor nodes shall pre-process machine log data streams for security analysis in real time and report findings back to a central instance. Findings also include data that does not make sense to the sensor node (cannot be classified). This central instance updates its internal model, attempting to learn how to classify the new data, and then creates new model-evaluation code (the code that caused the crash) that is sent to the sensors again. The sensor replaces the model with the generated code, thus altering the log data analysis behaviour.

The current implementation uses https://packages.debian.org/search?keywords=logdata-anomaly-miner to run the sensor nodes; the central instance is experimental code creating configuration for the nodes. When the detection methods get more mature, the way of model distribution is likely to change to a more robust scheme.

We try to apply those mining approaches to various domains, e.g. for attack detection based on log data without known structure (proprietary systems, no SIEM-regexes available yet, no rules), but also e.g. for detecting vulnerable code before it is exploited (zero-day discovery of LXC container escape vulnerabilities) and for detecting execution of zero-day exploits themselves, which we wrote for demonstration purposes. See https://itsecx.fhstp.ac.at/wp-content/uploads/2016/11/06_RomanFiedler_SyscallAuditLogMining-V1.pdf (sorry, German slides only)

From leewangzhong+python at gmail.com  Wed Jun 27 12:58:14 2018
From: leewangzhong+python at gmail.com (Franklin? Lee)
Date: Wed, 27 Jun 2018 12:58:14 -0400
Subject: [Python-ideas] random.sample should work better with iterators
In-Reply-To: <20180627091126.7dfcaefa@fsol>
References: <6933138b-85f0-4b29-97ca-fac0ffce1212@googlegroups.com> <20180627010526.GW14437@ando.pearwood.info> <20180627091126.7dfcaefa@fsol>
Message-ID: 

On Wed, Jun 27, 2018 at 3:11 AM, Antoine Pitrou wrote:
> On Tue, 26 Jun 2018 23:52:55 -0500
> Tim Peters wrote:
>>
>> In Python today, the easiest way to spell Abe's intent is, e.g.,
>>
>> >>> from heapq import nlargest  # or nsmallest - doesn't matter
>> >>> from random import random
>> >>> nlargest(4, (i for i in range(100000)), key=lambda x: random())
>> [75260, 45880, 99486, 13478]
>> >>> nlargest(4, (i for i in range(100000)), key=lambda x: random())
>> [31732, 72288, 26584, 72672]
>> >>> nlargest(4, (i for i in range(100000)), key=lambda x: random())
>> [14180, 86084, 22639, 2004]
>>
>> That also arranges to preserve `sample()'s promise that all sub-slices of
>> the result are valid random samples too (because `nlargest` sorts by the
>> randomly generated keys before returning the list).
>
> How could slicing return an invalid random sample?

If the sample isn't randomly ordered.

    from random import shuffle

    def sample(population, k):
        population = list(population)
        shuffle(population)
        return sorted(population[:k])  # No, don't sort!

From leewangzhong+python at gmail.com  Wed Jun 27 12:45:34 2018
From: leewangzhong+python at gmail.com (Franklin? Lee)
Date: Wed, 27 Jun 2018 12:45:34 -0400
Subject: [Python-ideas] "Exposing" `__min__` and `__max__`
In-Reply-To: 
References: <20180627100101.GE14437@ando.pearwood.info> <20180627142701.GH14437@ando.pearwood.info>
Message-ID: 

On Wed, Jun 27, 2018 at 11:30 AM, Michael Selik wrote:
> On Wed, Jun 27, 2018 at 8:16 AM Franklin? Lee wrote:
>> On Wed, Jun 27, 2018, 10:31 Steven D'Aprano wrote:
>>> On Wed, Jun 27, 2018 at 06:52:14AM -0700, Michael Selik wrote:
>>> > My intent was to ask where a range was in fact passed into max, not
>>> > merely
>>> > where it could be. It'd be enlightening to see a complete, realistic
>>> > example.
>>>
>>> A complete, realistic example is as I said: you call max() on some
>>> object which you don't control, the caller does. You could be
>>> passed a list, or a set, or a bitset, a binary search tree, a range
>>> object, whatever the caller happens to pass to you.
>>
>> Let's just assume Michael wants to know, and isn't making an argument
>> against the proposal.
>
> I do want to know, but it's also an argument against the proposal -- that no
> one has contributed in-context usage to demonstrate the value. I'd want to
> see code that currently uses ``if isinstance`` to switch between ``max(x)``
> and ``x.max()``. Chris Barker explained the issue well.

Then maybe you shouldn't have picked on the verbatim case of `len(range(...))`. You won't find many examples deciding between `max(x)` and `x.max()`. `np.nanmax(x)` will optimize the case where you care about efficiency. (By the way, Numpy checks for `x.max` if `x` isn't an `ndarray`, so it already kind of implements the proposal.)

I honestly can't imagine a case where you need to know the max of some possibly-unsorted collection, but don't eventually iterate through the whole thing anyway (no difference in asymptotic time). You might have some savings sometimes, but the overall gain is unpredictable, and shouldn't be depended on.

A case more realistic to me is when you know it's definitely a collection that knows its max/min, but you don't know what kind of collection it is.
But a sorted collection is pretty much a sequence, and you can usually get the first or last element using c[0] and c[-1], unless it's a SortedDict. A heap or a Young tableau knows its min, and only has to search its leaves/boundary for its max (or vice versa).

For a sorted collection, you may want to check against min and max before inserting, deleting, or searching for an element. However, these checks can be implemented by the functions doing the insert/delete/search.

I expect existing uses for dunder max will appear in generic algorithm libraries, rather than in concrete uses. If you're switching to a sorted collection, it suggests that you want efficiency, and I imagine you'll modify your code to fit the new data structure.

From tim.peters at gmail.com  Wed Jun 27 13:35:28 2018
From: tim.peters at gmail.com (Tim Peters)
Date: Wed, 27 Jun 2018 12:35:28 -0500
Subject: [Python-ideas] random.sample should work better with iterators
In-Reply-To: <20180627091126.7dfcaefa@fsol>
References: <6933138b-85f0-4b29-97ca-fac0ffce1212@googlegroups.com> <20180627010526.GW14437@ando.pearwood.info> <20180627091126.7dfcaefa@fsol>
Message-ID: 

[Tim]
> In Python today, the easiest way to spell Abe's intent is, e.g.,
>
> >>> from heapq import nlargest  # or nsmallest - doesn't matter
> >>> from random import random
> >>> nlargest(4, (i for i in range(100000)), key=lambda x: random())
> [75260, 45880, 99486, 13478]
> >>> nlargest(4, (i for i in range(100000)), key=lambda x: random())
> [31732, 72288, 26584, 72672]
> >>> nlargest(4, (i for i in range(100000)), key=lambda x: random())
> [14180, 86084, 22639, 2004]
>
> That also arranges to preserve `sample()'s promise that all sub-slices of
> the result are valid random samples too (because `nlargest` sorts by the
> randomly generated keys before returning the list).

[Antoine Pitrou]
> How could slicing return an invalid random sample?

For example, consider random.sample(range(2), 2). As a set, there is only one possible output, {0, 1}. But it doesn't return a set, it returns a list. So there are two possible outputs:

    [0, 1]
    [1, 0]

random.sample() promises to return each of those about equally often, so that, e.g., result[0:1] and result[1:2] are also random 1-samples. If it always returned, say, [0, 1], that's "a random" 2-sample, but its 1-slices are as far from random 1-samples as is possible to get.

From guido at python.org  Wed Jun 27 13:46:32 2018
From: guido at python.org (Guido van Rossum)
Date: Wed, 27 Jun 2018 10:46:32 -0700
Subject: [Python-ideas] Allow mutable builtin types (optionally)
In-Reply-To: 
References: 
Message-ID: 

So the question remains -- why not use a heap type?

On Wed, Jun 27, 2018 at 2:05 AM Greg wrote:
> As I introduced (a long time ago) this demand, let me add my grain of salt here.
>
> The use case is pretty simple, and somewhat common when manually writing C extension classes: The reason to write an extension class is usually performance, or linking into an existing library. When doing this manually (instead of using automatic python-wrapping tools like boost, swig, ...) you try to wrap the minimum number of methods/accessors/... of your underlying C/C++ class, and replicate non-critical methods in Python. Moreover, extending your class by adding new methods is usually much easier in Python, especially if it involves complex but not performance-bound Python-data manipulation.
> Problem is to make those Python-implemented methods available to instances of your extension class, especially when those instances are returned by the C layer of your extension.
>
> The solution we chose was to change the __class__ of each extension type instance to the Python-derived newclass implementing all those extra methods. Not too difficult; a simple encapsulation of all methods returning extension-class instances is enough, and can be automated. This solution is quite common, I think; it translates something you do for Python-class instances, but then you get the "__class__ assignment: only for heap types" error.
>
> The argument about sub-interpreters is a good one, but not really applicable for this use case: we really want to have one extension type (or a hierarchy of it) shared across all interpreters importing the extension; it just happens that instead of being implemented in pure C/C++, the extension is implemented in C/C++ and Python. The fact that the Python parts will be seen everywhere is a feature, not a problem: you expect the replacement of C-implemented methods by Python-implemented methods to be as transparent as possible.
>
> Alternatives would be to use a heap type for our C extension classes (we need to check what it would imply, but it may be quite painless) or use some form of delegation instead of assigning to __class__. The latter is not really painless, AFAIK, in terms of coding complexity and possibly performance (extra lookup steps needed).
>
> If there are other solutions or if delegation can be made as simple/efficient as the __class__ mechanism, it would be good to know, and it is I think valuable info for many people writing extension classes. Anyway, my personal position on this has not changed in 10y and is in line with Eloi: I think that being a heap type and allowing assignment to the __class__ attribute of instances is indeed quite orthogonal.
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

-- 
--Guido van Rossum (python.org/~guido)

From guido at python.org  Wed Jun 27 14:52:50 2018
From: guido at python.org (Guido van Rossum)
Date: Wed, 27 Jun 2018 11:52:50 -0700
Subject: [Python-ideas] Should nested classes in an Enum be Enum members?
In-Reply-To: <5B33A33F.30207@stoneleaf.us>
References: <5B33A33F.30207@stoneleaf.us>
Message-ID: 

Sounds to me really strange that the nested class would become a member. Probably because everything becomes a member unless it's a function (maybe decorated)?

On Wed, Jun 27, 2018 at 7:47 AM Ethan Furman wrote:
> Consider the following Enum definition:
>
>      class Color(Enum):
>          RED = 1
>          GREEN = 2
>          BLUE = 3
>          @property
>          def lower(self):
>              return self.name.lower()
>          def spam(self):
>              return "I like %s eggs and spam!" % self.lower
>          class SomeClass:
>              pass
>
> Which of the above Color attributes are enums, and which aren't?
>
> .
>
> .
>
> .
>
> Answer:
>
> - RED, GREEN, and BLUE are members
>
> - lower and spam() are not
>
> - SomeClass /is/ a member (but not its instances)
>
> Question:
>
> Should `SomeClass` be an enum member?  When would it be useful to have
> > > The only example I have seen so far of nested classes in an Enum is when > folks want to make an Enum of Enums, and the > nested Enum should not itself be an enum member. Since the > counter-example already works I haven't seen any requests > for it. ;) > > So I'm asking the community: What real-world examples can you offer for > either behavior? Cases where nested classes > should be enum members, and cases where nested classes should not be > members. > > Thanks! > > -- > ~Ethan~ > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Wed Jun 27 15:24:03 2018 From: ethan at stoneleaf.us (Ethan Furman) Date: Wed, 27 Jun 2018 12:24:03 -0700 Subject: [Python-ideas] Should nested classes in an Enum be Enum members? In-Reply-To: References: <5B33A33F.30207@stoneleaf.us> Message-ID: <5B33E453.5080203@stoneleaf.us> On 06/27/2018 11:52 AM, Guido van Rossum wrote: > Sounds to me really strange that the nested class would become a member. Probably because everything becomes a member > unless it's a function (maybe decorated)? Pretty much. __dunders__, _sunders_, and descriptors do not get transformed. Everything else does. -- ~Ethan~ From ethan at stoneleaf.us Wed Jun 27 15:25:58 2018 From: ethan at stoneleaf.us (Ethan Furman) Date: Wed, 27 Jun 2018 12:25:58 -0700 Subject: [Python-ideas] Should nested classes in an Enum be Enum members? In-Reply-To: References: <5B33A33F.30207@stoneleaf.us> Message-ID: <5B33E4C6.5070401@stoneleaf.us> On 06/27/2018 12:04 PM, Elazar wrote: > ?????? ??? ??, 27 ????? 2018, 11:59, ??? Guido van Rossum: >> Sounds to me really strange that the nested class would become a member. >> Probably because everything becomes a member unless it's a function >> (maybe decorated)? > > People working with sum types might expect the instances of the nested > class to be instances of the enclosing class. So if the nested class is > a namedtuple, you get a sum type. The only problem is that there's no > way to express this subtype relationship in code. I have no idea what you just said. :( Is there a link you can share that might explain it? -- ~Ethan~ From rymg19 at gmail.com Wed Jun 27 15:35:55 2018 From: rymg19 at gmail.com (Ryan Gonzalez) Date: Wed, 27 Jun 2018 14:35:55 -0500 Subject: [Python-ideas] Should nested classes in an Enum be Enum members? In-Reply-To: <5B33E4C6.5070401@stoneleaf.us> References: <5B33A33F.30207@stoneleaf.us> <5B33E4C6.5070401@stoneleaf.us> Message-ID: <16442bec178.27a3.db5b03704c129196a4e9415e55413ce6@gmail.com> I *think* he's referring to something like this: class A(enum.Enum): class Inner(NamedTuple): ... isinstance(A.Inner(), A()) # True I *think* that's it. On June 27, 2018 2:26:23 PM Ethan Furman wrote: > On 06/27/2018 12:04 PM, Elazar wrote: > > ?????? ??? ??, 27 ????? 2018, 11:59, ??? Guido van Rossum: > > >> Sounds to me really strange that the nested class would become a member. > >> Probably because everything becomes a member unless it's a function > >> (maybe decorated)? > > >> People working with sum types might expect the instances of the nested > > class to be instances of the enclosing class. So if the nested class is > > a namedtuple, you get a sum type. 
The only problem is that there's no
> > way to express this subtype relationship in code.
>
> I have no idea what you just said. :(  Is there a link you can share that
> might explain it?
>
> --
> ~Ethan~
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

From elazarg at gmail.com  Wed Jun 27 15:04:01 2018
From: elazarg at gmail.com (Elazar)
Date: Wed, 27 Jun 2018 12:04:01 -0700
Subject: [Python-ideas] Should nested classes in an Enum be Enum members?
In-Reply-To: 
References: <5B33A33F.30207@stoneleaf.us>
Message-ID: 

People working with sum types might expect the instances of the nested class to be instances of the enclosing class. So if the nested class is a namedtuple, you get a sum type. The only problem is that there's no way to express this subtype relationship in code.

Elazar

On Wed, Jun 27, 2018, 11:59 Guido van Rossum wrote:
> Sounds to me really strange that the nested class would become a member.
> Probably because everything becomes a member unless it's a function (maybe
> decorated)?
>
> On Wed, Jun 27, 2018 at 7:47 AM Ethan Furman wrote:
>
>> Consider the following Enum definition:
>>
>>      class Color(Enum):
>>          RED = 1
>>          GREEN = 2
>>          BLUE = 3
>>          @property
>>          def lower(self):
>>              return self.name.lower()
>>          def spam(self):
>>              return "I like %s eggs and spam!" % self.lower
>>          class SomeClass:
>>              pass
>>
>> Which of the above Color attributes are enums, and which aren't?
>>
>> .
>>
>> .
>>
>> .
>>
>> Answer:
>>
>> - RED, GREEN, and BLUE are members
>>
>> - lower and spam() are not
>>
>> - SomeClass /is/ a member (but not its instances)
>>
>> Question:
>>
>> Should `SomeClass` be an enum member?  When would it be useful to have
>> an embedded class in an Enum be an enum member?
>>
>> The only example I have seen so far of nested classes in an Enum is when
>> folks want to make an Enum of Enums, and the
>> nested Enum should not itself be an enum member. Since the
>> counter-example already works I haven't seen any requests
>> for it. ;)
>>
>> So I'm asking the community: What real-world examples can you offer for
>> either behavior? Cases where nested classes
>> should be enum members, and cases where nested classes should not be
>> members.
>>
>> Thanks!
>>
>> --
>> ~Ethan~
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> https://mail.python.org/mailman/listinfo/python-ideas
>> Code of Conduct: http://python.org/psf/codeofconduct/
>
> --
> --Guido van Rossum (python.org/~guido)
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

From elazarg at gmail.com  Wed Jun 27 16:06:24 2018
From: elazarg at gmail.com (Elazar)
Date: Wed, 27 Jun 2018 13:06:24 -0700
Subject: [Python-ideas] Should nested classes in an Enum be Enum members?
In-Reply-To: <16442bec178.27a3.db5b03704c129196a4e9415e55413ce6@gmail.com>
References: <5B33A33F.30207@stoneleaf.us> <5B33E4C6.5070401@stoneleaf.us> <16442bec178.27a3.db5b03704c129196a4e9415e55413ce6@gmail.com>
Message-ID: 

Yes, Ryan. I mean a way to express something like sealed classes in scala/kotlin.
From elazarg at gmail.com  Wed Jun 27 16:06:24 2018
From: elazarg at gmail.com (Elazar)
Date: Wed, 27 Jun 2018 13:06:24 -0700
Subject: [Python-ideas] Should nested classes in an Enum be Enum members?
In-Reply-To: <16442bec178.27a3.db5b03704c129196a4e9415e55413ce6@gmail.com>
References: <5B33A33F.30207@stoneleaf.us> <5B33E4C6.5070401@stoneleaf.us>
 <16442bec178.27a3.db5b03704c129196a4e9415e55413ce6@gmail.com>
Message-ID: 

Yes, Ryan.  I mean a way to express something like sealed classes in
scala/kotlin.  The Enum class defines a finite region in which subclasses
can be defined, thus allowing one to verify that "elif" cases are
exhaustive, for example.  It is mostly helpful for static type checking,
but it also helps readability and is a natural way to describe ASTs.

    class Expr(Enum):
        class BinOp(NamedTuple):  # ideally should subclass Expr
            left: Expr
            right: Expr
            op: str
        class UnOp(NamedTuple):
            operator: Expr
            operand: str
        ...

It's one of the (rejected) ideas here:
https://github.com/python/mypy/issues/2464

Not the best link out there but it explains:
https://antonioleiva.com/sealed-classes-kotlin/

Elazar

On Wed, Jun 27, 2018 at 12:49 PM Ryan Gonzalez wrote:

> I *think* he's referring to something like this:
>
>     class A(enum.Enum):
>         class Inner(NamedTuple):
>             ...
>
>     isinstance(A.Inner(), A)  # True
>
> I *think* that's it.
>
> On June 27, 2018 2:26:23 PM Ethan Furman wrote:
>> On 06/27/2018 12:04 PM, Elazar wrote:
>>> On Wed, Jun 27, 2018 at 11:59, Guido van Rossum wrote:
>>>> Sounds to me really strange that the nested class would become a member.
>>>> Probably because everything becomes a member unless it's a function
>>>> (maybe decorated)?
>>> People working with sum types might expect the instances of the nested
>>> class to be instances of the enclosing class.  So if the nested class is
>>> a namedtuple, you get a sum type.  The only problem is that there's no
>>> way to express this subtype relationship in code.
>>
>> I have no idea what you just said.  :(  Is there a link you can share
>> that might explain it?
>>
>> --
>> ~Ethan~

From abedillon at gmail.com  Wed Jun 27 16:36:26 2018
From: abedillon at gmail.com (Abe Dillon)
Date: Wed, 27 Jun 2018 13:36:26 -0700 (PDT)
Subject: [Python-ideas] collections.Counter should implement fromkeys
Message-ID: <30549395-be27-458d-8ec4-4002019a1435@googlegroups.com>

Consider the following update function for conway's game of life:

    from collections import Counter

    def update(live: Set[Tuple[Integer, Integer]]):
        counts = Counter.fromkeys(live, 0) + \
            Counter(itertools.chain(*(neighbors(*cell) for cell in live)))
        flip = {cell for cell, count in counts.items()
                if (cell in live and not 1 < count < 4)
                or (cell not in live and count == 3)}
        live ^= flip

    around = frozenset(filter(any, itertools.product(range(-1,2), range(-1,2))))
    def neighbors(r: Integer, c: Integer):
        return (((r+dr)%height, (c+dc)%width) for dr, dc in around)

The problem is, Counter.fromkeys isn't implemented.  I propose that it work
exactly as it does for dict, otherwise it's difficult to add items to a
Counter when you want them to start off at zero or some other count.

The best solution I came up with is to, more confusingly, count live cells
once extra and adjust the rules accordingly:

    def update(live: Set[Tuple[Integer, Integer]]):
        counts = Counter(itertools.chain(live, *(neighbors(*cell) for cell in live)))
        flip = {cell for cell, count in counts.items()
                if (cell in live and not 2 < count < 5)
                or (cell not in live and count == 3)}
        live ^= flip

    around = frozenset(filter(any, itertools.product(range(-1,2), range(-1,2))))
    def neighbors(r: Integer, c: Integer):
        return (((r+dr)%height, (c+dc)%width) for dr, dc in around)

From andrei.kucharavy at gmail.com  Wed Jun 27 17:20:01 2018
From: andrei.kucharavy at gmail.com (Andrei Kucharavy)
Date: Wed, 27 Jun 2018 17:20:01 -0400
Subject: [Python-ideas] Add a __cite__ method for scientific packages
Message-ID: 

Over the last 10 years, Python has slowly inched towards becoming the most
popular scientific computing language, beating or seriously challenging
Matlab, R, Mathematica and many specialized languages (S, SAS, ...) in
numerous applications.

A large part of this growth is driven by amazing community packages, such
as numpy, scipy, scikits-learn, scikits-image, seaborn or pandas, just to
name a few.  Development of such packages represents a significant time
investment by people working in academic environments.
To be able to justify the investment of time into such package development and support, the developers usually associated them with a scientific article. The number of citations of those articles are considered as measures of the usefulness of articles and are required to justify the time spent on them. Unfortunately, as of now, a significant issue is that such packages are not cited despite being extensively used. Part of this is due to the difficulties with compiling the list of proper citations for each module (and, for libraries associated with multiple update publications, selecting the relevant citation). Part of this is due to users not realizing which of the modules they are using have associated publications and should be cited. To remediate to that situation, I suggest a __citation__ method associated to each package installation and import. Called from the __main__, __citation__() would scan __citation__ of all imported packages and return the list of all relevant top-level citations associated to the packages. As a scientific package developer working in academia, the problem is quite serious, and the solution seems relatively straightforward. What does Python core team think about addition and long-term maintenance of such a feature to the import and setup mechanisms? What do other users and scientific package developers think of such a mechanism for citations retrieval? Best, *Andrei Kucharavy*Post-Doc @ *Joel S. Bader* * Lab*Johns Hopkins University, Baltimore, USA. -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Wed Jun 27 18:49:35 2018 From: guido at python.org (Guido van Rossum) Date: Wed, 27 Jun 2018 15:49:35 -0700 Subject: [Python-ideas] Add a __cite__ method for scientific packages In-Reply-To: References: Message-ID: While I'm not personally in need of citations (and never felt I was) I can easily understand the point -- sometimes citations can make or break a career and having written a popular software package should be acknowledged. Are there other languages or software communities that do something like this? It would be nice not to have to invent this wheel. Eventually a PEP and an implementation should be presented, but first the idea needs to be explored more. --Guido On Wed, Jun 27, 2018 at 3:30 PM Andrei Kucharavy wrote: > Over the last 10 years, Python has slowly inched towards becoming the most > popular scientific computing language, beating or seriously challenging > Matlab, R, Mathematica and many specialized languages (S, SAS, ...) in > numerous applications. > > A large part of this growth is driven by amazing community packages, such > as numpy, scipy, scikits-learn, scikits-image, seaborn or pandas, just to > name a few. Development of such packages represents a significant time > investment by people working in academic environments. To be able to > justify the investment of time into such package development and support, > the developers usually associated them with a scientific article. The > number of citations of those articles are considered as measures of the > usefulness of articles and are required to justify the time spent on them. > > Unfortunately, as of now, a significant issue is that such packages are > not cited despite being extensively used. Part of this is due to the > difficulties with compiling the list of proper citations for each module > (and, for libraries associated with multiple update publications, selecting > the relevant citation). 
Part of this is due to users not realizing which of > the modules they are using have associated publications and should be cited. > > To remediate to that situation, I suggest a __citation__ method associated > to each package installation and import. Called from the __main__, > __citation__() would scan __citation__ of all imported packages and return > the list of all relevant top-level citations associated to the packages. > > As a scientific package developer working in academia, the problem is > quite serious, and the solution seems relatively straightforward. > > What does Python core team think about addition and long-term maintenance > of such a feature to the import and setup mechanisms? What do other users > and scientific package developers think of such a mechanism for citations > retrieval? > > Best, > > > *Andrei Kucharavy*Post-Doc @ *Joel S. Bader* > * Lab*Johns Hopkins University, Baltimore, USA. > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From nathan12343 at gmail.com Wed Jun 27 19:00:36 2018 From: nathan12343 at gmail.com (Nathan Goldbaum) Date: Wed, 27 Jun 2018 18:00:36 -0500 Subject: [Python-ideas] Add a __cite__ method for scientific packages In-Reply-To: References: Message-ID: This is an interesting proposal. Speaking as a developer of scientific software packages it would be really cool to have support for something like this in the language itself. The software sustainability institute in the UK have written several blog posts advocating the use of CITATION files containing this sort of metadata: https://software.ac.uk/blog/2017-12-12-standard-format-citation-files A github code search for __citation__ also gets 127 hits that mostly seem to be research software that are using this attribute more or less as suggested here: https://github.com/search?q=__citation__&type=Code It's also worth pointing out http://citeas.org/ which is sort of a citation search engine for software projects. It uses a number of heuristics to figure out what the appropriate citation for a piece of software is. On Wed, Jun 27, 2018 at 5:49 PM, Guido van Rossum wrote: > While I'm not personally in need of citations (and never felt I was) I can > easily understand the point -- sometimes citations can make or break a > career and having written a popular software package should be acknowledged. > > Are there other languages or software communities that do something like > this? It would be nice not to have to invent this wheel. Eventually a PEP > and an implementation should be presented, but first the idea needs to be > explored more. > > --Guido > > On Wed, Jun 27, 2018 at 3:30 PM Andrei Kucharavy < > andrei.kucharavy at gmail.com> wrote: > >> Over the last 10 years, Python has slowly inched towards becoming the >> most popular scientific computing language, beating or seriously >> challenging Matlab, R, Mathematica and many specialized languages (S, SAS, >> ...) in numerous applications. >> >> A large part of this growth is driven by amazing community packages, such >> as numpy, scipy, scikits-learn, scikits-image, seaborn or pandas, just to >> name a few. Development of such packages represents a significant time >> investment by people working in academic environments. 
To be able to >> justify the investment of time into such package development and support, >> the developers usually associated them with a scientific article. The >> number of citations of those articles are considered as measures of the >> usefulness of articles and are required to justify the time spent on them. >> >> Unfortunately, as of now, a significant issue is that such packages are >> not cited despite being extensively used. Part of this is due to the >> difficulties with compiling the list of proper citations for each module >> (and, for libraries associated with multiple update publications, selecting >> the relevant citation). Part of this is due to users not realizing which of >> the modules they are using have associated publications and should be cited. >> >> To remediate to that situation, I suggest a __citation__ method >> associated to each package installation and import. Called from the >> __main__, __citation__() would scan __citation__ of all imported packages >> and return the list of all relevant top-level citations associated to the >> packages. >> >> As a scientific package developer working in academia, the problem is >> quite serious, and the solution seems relatively straightforward. >> >> What does Python core team think about addition and long-term maintenance >> of such a feature to the import and setup mechanisms? What do other users >> and scientific package developers think of such a mechanism for citations >> retrieval? >> >> Best, >> >> >> *Andrei Kucharavy*Post-Doc @ *Joel S. Bader* >> * Lab*Johns Hopkins University, Baltimore, USA. >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > > > -- > --Guido van Rossum (python.org/~guido) > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From abedillon at gmail.com Wed Jun 27 15:50:41 2018 From: abedillon at gmail.com (Abe Dillon) Date: Wed, 27 Jun 2018 12:50:41 -0700 (PDT) Subject: [Python-ideas] random.sample should work better with iterators In-Reply-To: <20180627010526.GW14437@ando.pearwood.info> References: <6933138b-85f0-4b29-97ca-fac0ffce1212@googlegroups.com> <20180627010526.GW14437@ando.pearwood.info> Message-ID: <318b680d-55c1-4b11-b30c-3ef2b71f5190@googlegroups.com> Let me start off by saying I agree with Tim Peters that it would be best to implement these changes in a new function (if ever). On Tuesday, June 26, 2018 at 8:06:35 PM UTC-5, Steven D'Aprano wrote: > > range is not an iterator. > My misunderstanding of the details of range objects was, indeed, a huge contributing factor to my confusion. I assumed range was more like a generator function when I initially discovered that random.sample doesn't permit iterators, however; the reason I'm proposing a version of random.sample that accepts iterators is still sound. It's even in the text for why one should use a range object: To choose a sample from a range of integers, use a range() > object as an > argument. *This is especially fast and space efficient for sampling from > a large population* As I claimed before: A major use-case for sampling is to avoid working with an impractically large population. 
This is also a major use-case for iterators.

On Tuesday, June 26, 2018 at 8:06:35 PM UTC-5, Steven D'Aprano wrote:
>> this seems overly constrained.  The inability to handle dictionaries is
>> especially puzzling.
>
> Puzzling in what way?
>
> If sample() supported dicts, should it return the keys or the values or
> both?

Like in all other contexts where a dictionary is treated as a collection,
it should be treated as a collection of keys.  There are plenty of
precedents for this:

    d = dict(zip(names, ages))
    chronological_names = sorted(d, key=d.get)
    name_list, name_set = list(d), set(d)
    print(*d)

On Tuesday, June 26, 2018 at 8:06:35 PM UTC-5, Steven D'Aprano wrote:
> Also consider this:
>
> https://bugs.python.org/issue33098

I respectfully disagree with the conclusion of that issue.  It goes
against the "consenting adults" ethos of Python.  As long as the
performance implications are expressly documented, and maybe even a
warning thrown, I don't see a reason to prevent people from using a useful
function.  You can't protect programmers from writing inefficient
programs.

Also, it seems like the dict interface could expose a way to get a
sequence view of the keys.  This would be very efficient given the current
implementation of dictionaries in CPython.  So, it's not like it's
fundamentally impossible for random.choice to work efficiently with dicts;
it's more of an implementation detail.

On Tuesday, June 26, 2018 at 8:06:35 PM UTC-5, Steven D'Aprano wrote:
>> Randomly sampling from some population is often done because the entire
>> population is impractically large which is also a motivation for using
>> iterators, so it seems natural that one would be able to sample from an
>> iterator.  A naive implementation could use a heap queue:
>>
>>     import heapq
>>     import random
>>
>>     def stream():
>>         while True: yield random.random()
>>
>>     def sample(population, size):
>>         q = [tuple()]*size
>>         for el in zip(stream(), population):
>>             if el > q[0]: heapq.heapreplace(q, el)
>>         return [el[1] for el in q if el]
>
> Is that an improvement over:
>
>     sample(list(itertools.slice(population, size)))
>
> and if so, please explain.

Do you mean: sample(list(itertools.islice(population, size)), size)?  If
so, then I'll refer you to Tim Peters' response; otherwise, please clarify
what you meant.

On Tuesday, June 26, 2018 at 8:06:35 PM UTC-5, Steven D'Aprano wrote:
>> It would also be helpful to add a ratio version of the function:
>>
>>     def sample(population, size=None, *, ratio=None):
>>         assert None in (size, ratio), "can't specify both sample size and ratio"
>>         if ratio:
>>             return [el for el in population if random.random() < ratio]
>>         ...
>
> Helpful under what circumstances?

I wasn't aware of the linear-time reservoir sampling algorithms that Tim
Peters suggested.  Those make the ratio proposal less helpful.  As you can
see from the implementation I proposed, the ratio would be able to work
with iterators of undetermined size in linear time, however, it wouldn't
satisfy the valid subsampling criteria (unless you shuffle the output) and
it would only return *roughly* ratio*len(population) elements instead of
an exact number.
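(For readers who haven't seen it, a minimal sketch of that linear-time
technique, Algorithm R, which draws a uniform sample of k items from an
iterable of unknown length in one pass:)

    import random

    def reservoir_sample(iterable, k):
        reservoir = []
        for i, item in enumerate(iterable):
            if i < k:
                reservoir.append(item)        # fill the reservoir first
            else:
                j = random.randrange(i + 1)   # replace with decreasing probability
                if j < k:
                    reservoir[j] = item
        return reservoir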
On Tuesday, June 26, 2018 at 8:06:35 PM UTC-5, Steven D'Aprano wrote:
> Don't let the source speak for itself.  Explain what it means.  I
> understand what sample(population, size=100) does.  What would
> sample(population, ratio=0.25) do?

It would return a sample of roughly 25% of the population.

[Stephen J. Turnbull]
> I argue below that *if* we were going to make the change, it should be
> to consistently try list() on non-sequences.  But "not every
> one-liner" and EIBTI:

Converting the input to a list is exactly what I'm trying to avoid.  I'd
like to sample from an enormous file that won't fit in memory, or populate
5% of a large game-of-life grid without using up gigabytes of memory:

    width, height, ratio = 100000, 100000, 0.05
    live_set = {*random.sample(itertools.product(range(height), range(width)),
                               int(ratio * width * height))}

From njs at pobox.com  Wed Jun 27 20:19:35 2018
From: njs at pobox.com (Nathaniel Smith)
Date: Wed, 27 Jun 2018 17:19:35 -0700
Subject: [Python-ideas] Add a __cite__ method for scientific packages
In-Reply-To: 
References: 
Message-ID: 

On Wed, Jun 27, 2018 at 2:20 PM, Andrei Kucharavy wrote:
> To remediate to that situation, I suggest a __citation__ method associated
> to each package installation and import. Called from the __main__,
> __citation__() would scan __citation__ of all imported packages and return
> the list of all relevant top-level citations associated to the packages.
>
> As a scientific package developer working in academia, the problem is quite
> serious, and the solution seems relatively straightforward.
>
> What does Python core team think about addition and long-term maintenance of
> such a feature to the import and setup mechanisms? What do other users and
> scientific package developers think of such a mechanism for citations
> retrieval?

This is indeed a serious problem.  I suspect python-ideas isn't the best
venue for addressing it though - there's nothing here that needs changes
to the Python interpreter itself (I think), and the people who understand
this problem the best and who are most affected by it, mostly aren't here.

You'll want to check out the duecredit project:
https://github.com/duecredit/duecredit
One of the things they've thought about is the ability to track citation
information in a more fine-grained way than per-package - for example,
there might be a paper that should be cited by anyone who calls a
particular method (or even passes a specific argument to some specific
method, when that turns on some fancy algorithm).

The R world also has some prior art -- in particular I know they have
citations as part of the standard metadata in every package.

I'd actually like to see a more general solution that isn't restricted to
any one language, because multi-language analysis pipelines are very
common.  For example, we could standardize a convention where if a certain
environment variable is set, then the software writes out citation
information to a certain location, and then implement libraries that do
this in multiple languages.  Of course, that's a "dynamic" solution that
requires running the software -- which is probably necessary if you want
to do fine-grained citations, but it might be useful to also have static
metadata, e.g. as part of the package metadata that goes into sdists,
wheels, and on PyPI.  That would be a discussion for the distutils-sig
mailing list, which manages that metadata.

One challenge in standardizing this kind of thing is choosing a standard
way to represent citation information.  Maybe CSL-JSON?  There's a lot of
complexity as you dig into this, though of course one shouldn't let the
perfect be the enemy of the good...

-n

--
Nathaniel J. Smith -- https://vorpus.org
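As a very rough sketch of the environment-variable convention Nathaniel
describes, seen from the library side (the variable name, file name, and
schema below are made up purely for illustration):

    import json
    import os

    CITATION = {
        "title": "Example Package",
        "authors": ["A. Author"],
        "year": 2018,
    }

    def maybe_report_citation():
        outdir = os.environ.get("CITATION_REPORT_DIR")   # hypothetical variable
        if outdir:
            path = os.path.join(outdir, "example_package.json")
            with open(path, "w") as f:
                json.dump(CITATION, f, indent=2)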
From greg.ewing at canterbury.ac.nz  Wed Jun 27 20:22:16 2018
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 28 Jun 2018 12:22:16 +1200
Subject: [Python-ideas] Should nested classes in an Enum be Enum members?
In-Reply-To: 
References: <5B33A33F.30207@stoneleaf.us>
Message-ID: <5B342A38.9030002@canterbury.ac.nz>

Guido van Rossum wrote:
> Sounds to me really strange that the nested class would become a member.
> Probably because everything becomes a member unless it's a function
> (maybe decorated)?

Maybe it would have been better if Enums got told what type their members
are supposed to be, and only treated things of that type as members.

    class Color(Enum):
        __type__ = int
        RED = 1
        GREEN = 2
        BLUE = 3
        i_get_left_alone = 4.2

Or perhaps this could be made to work somehow:

    class Color(Enum(int)):
        RED = 1
        GREEN = 2
        BLUE = 3
        i_get_left_alone = 4.2

--
Greg

From jheiv at jheiv.com  Wed Jun 27 21:45:05 2018
From: jheiv at jheiv.com (James Edwards)
Date: Wed, 27 Jun 2018 21:45:05 -0400
Subject: [Python-ideas] "Exposing" `__min__` and `__max__`
In-Reply-To: 
References: <20180627100101.GE14437@ando.pearwood.info>
 <20180627142701.GH14437@ando.pearwood.info>
Message-ID: 

> I'd want to see code that currently uses ``if isinstance`` to switch
> between ``max(x)`` and ``x.max()``.

I understand that, and I've spent a few hours searching github with
less-than-stellar results due to github's search syntax ignoring '.' and
'('.  (That being said, there are a number of projects on github that seem
to expect __min__ to work (example), but it's true that some are using the
dunder for their own purposes.)

However, there are many modules that implement objects that could make
good use of exposing __min__ (e.g. bintrees, sortedcontainers).  bintrees
provides `min_item()` and `max_item()` accessors.  sortedcontainers
doesn't seem to explicitly provide similar methods, but maintains the
sortedness of the objects so min and max can be accessed via index.  And I
can't stress enough the value in being able to switch to one of these
classes from a standard iterable and have the rest of your code (e.g.
`min()`s) "just work".

Also, the desire for custom implementations is there, but the proposed
solutions rely on (IMO) ugly hacks that would break other things.

The search for the perfect `x.min() if isinstance(...) else min(x)` still
continues, however.

On Wed, Jun 27, 2018 at 11:30 AM, Michael Selik wrote:

> On Wed, Jun 27, 2018 at 8:16 AM Franklin? Lee wrote:
>
>> On Wed, Jun 27, 2018, 10:31 Steven D'Aprano wrote:
>>
>>> On Wed, Jun 27, 2018 at 06:52:14AM -0700, Michael Selik wrote:
>>> > My intent was to ask where a range was in fact passed into max, not
>>> > merely where it could be. It'd be enlightening to see a complete,
>>> > realistic example.
>>>
>>> A complete, realistic example is as I said: you call max() on some
>>> object which you don't control, the caller does.  You could be passed a
>>> list, or a set, or a bitset, a binary search tree, a range object,
>>> whatever the caller happens to pass to you.
>>
>> Let's just assume Michael wants to know, and isn't making an argument
>> against the proposal.
>
> I do want to know, but it's also an argument against the proposal -- that
> no one has contributed in-context usage to demonstrate the value.  I'd
> want to see code that currently uses ``if isinstance`` to switch between
> ``max(x)`` and ``x.max()``.  Chris Barker explained the issue well.
From storchaka at gmail.com  Wed Jun 27 23:57:45 2018
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Thu, 28 Jun 2018 06:57:45 +0300
Subject: [Python-ideas] Should nested classes in an Enum be Enum members?
In-Reply-To: <5B33A33F.30207@stoneleaf.us>
References: <5B33A33F.30207@stoneleaf.us>
Message-ID: 

On 27.06.18 17:46, Ethan Furman wrote:
> Question:
>
>   Should `SomeClass` be an enum member?  When would it be useful to
> have an embedded class in an Enum be an enum member?
>
> The only example I have seen so far of nested classes in an Enum is when
> folks want to make an Enum of Enums, and the nested Enum should not
> itself be an enum member.  Since the counter-example already works I
> haven't seen any requests for it.  ;)
>
> So I'm asking the community:  What real-world examples can you offer for
> either behavior?  Cases where nested classes should be enum members, and
> cases where nested classes should not be members.

What would be the benefit of making a class nested if it is not to be an
enum member?  Nested functions become methods, but there are no relations
between a nested class or its instances and the outer class or its
instances.

The current behavior looks understandable to me.  Functions are
descriptors, but classes are not.  By making a nested class a member you
don't lose anything, because you can always make it not-nested if you
don't want it to be a member.  But if a nested class were not a member,
you would lose the possibility of making it a member (and changing this
now may break existing code).

From j.van.dorp at deonet.nl  Thu Jun 28 02:48:19 2018
From: j.van.dorp at deonet.nl (Jacco van Dorp)
Date: Thu, 28 Jun 2018 08:48:19 +0200
Subject: [Python-ideas] Delivery Status Notification (Failure)
In-Reply-To: <000000000000347129056fae1768@google.com>
References: <30549395-be27-458d-8ec4-4002019a1435@googlegroups.com>
 <000000000000347129056fae1768@google.com>
Message-ID: 

Have you tried just instantiating a Counter() instead?  All missing keys
are considered to be 0 in a fresh counter.  So for example:

    >>> c = Counter()
    >>> c["a"] += 1
    >>> c
    Counter({'a': 1})
    >>> c["b"]
    0

works exactly this way.  Which means there's no difference between what
you're suggesting Counter.fromkeys(key-list-or-set, 0) should do and what
Counter() actually does.

(Resend because somehow gmail tried to reply to the wrong python-ideas)

2018-06-27 22:36 GMT+02:00 Abe Dillon:
> Consider the following update function for conway's game of life:
>
>     from collections import Counter
>
>     def update(live: Set[Tuple[Integer, Integer]]):
>         counts = Counter.fromkeys(live, 0) + \
>             Counter(itertools.chain(*(neighbors(*cell) for cell in live)))
>         flip = {cell for cell, count in counts.items()
>                 if (cell in live and not 1 < count < 4)
>                 or (cell not in live and count == 3)}
>         live ^= flip
>
>     around = frozenset(filter(any, itertools.product(range(-1,2), range(-1,2))))
>     def neighbors(r: Integer, c: Integer):
>         return (((r+dr)%height, (c+dc)%width) for dr, dc in around)
>
> The problem is, Counter.fromkeys isn't implemented.  I propose that it
> work exactly as it does for dict, otherwise it's difficult to add items
> to a Counter when you want them to start off at zero or some other count.
>
> The best solution I came up with is to, more confusingly, count live
> cells once extra and adjust the rules accordingly:
>
>     def update(live: Set[Tuple[Integer, Integer]]):
>         counts = Counter(itertools.chain(live, *(neighbors(*cell) for cell in live)))
>         flip = {cell for cell, count in counts.items()
>                 if (cell in live and not 2 < count < 5)
>                 or (cell not in live and count == 3)}
>         live ^= flip
>
>     around = frozenset(filter(any, itertools.product(range(-1,2), range(-1,2))))
>     def neighbors(r: Integer, c: Integer):
>         return (((r+dr)%height, (c+dc)%width) for dr, dc in around)
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
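The wrinkle, for Abe's use case, is that the two behave differently once
you iterate: a fresh Counter *reads* missing keys as zero but does not
store them, so they never show up in items().  A quick illustration:

    from collections import Counter

    c = Counter()
    c["a"] += 1
    print(c["b"])        # 0 -- missing keys read as zero...
    print(list(c))       # ['a'] -- ...but "b" is not stored, so it is skipped when iterating

    seeded = Counter(dict.fromkeys(["a", "b"], 0))   # one way to pre-seed keys today
    print(list(seeded))  # ['a', 'b']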
From j.van.dorp at deonet.nl  Thu Jun 28 03:07:03 2018
From: j.van.dorp at deonet.nl (Jacco van Dorp)
Date: Thu, 28 Jun 2018 09:07:03 +0200
Subject: [Python-ideas] Should nested classes in an Enum be Enum members?
In-Reply-To: <5B342A38.9030002@canterbury.ac.nz>
References: <5B33A33F.30207@stoneleaf.us> <5B342A38.9030002@canterbury.ac.nz>
Message-ID: 

> Greg
> Or perhaps this could be made to work somehow:
>
>     class Color(Enum(int)):
>         RED = 1
>         GREEN = 2
>         BLUE = 3
>         i_get_left_alone = 4.2

Enum already is callable - it creates Enum subclasses, e.g.

    Color = Enum("Color", ("RED", "GREEN", "BLUE"))

(or something similar, I didn't check the docs.)

In general, it sounds to me like you can already avoid them becoming
members if you want - simply by making them a property.  Therefore,
letting them be members by default doesn't sound weird to me.

As for embedded classes - well, they cannot be subclasses because the
outer class is not instantiated.  You could perhaps build a class
decorator that injects the outer class in the inner class' MRO, but that
doesn't sound like code I'd want to debug.
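A pure-Python sketch of the __class__-reassignment pattern described in
the next message: here both classes are ordinary heap types, so the
assignment is allowed, whereas a static extension type in the base
position raises the "__class__ assignment: only for heap types" error
mentioned below (class names are purely illustrative):

    class CBacked:                    # stand-in for the C extension type
        def fast_op(self):
            return 42

    class Enriched(CBacked):          # Python-side convenience additions
        def describe(self):
            return "fast_op() -> %d" % self.fast_op()

    obj = CBacked()                   # imagine this instance came from the C layer
    obj.__class__ = Enriched          # fine: both are compatible heap types
    print(obj.describe())             # fast_op() -> 42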
From gregory.lielens at gmail.com  Thu Jun 28 03:43:55 2018
From: gregory.lielens at gmail.com (Greg)
Date: Thu, 28 Jun 2018 09:43:55 +0200
Subject: [Python-ideas] Allow mutable builtin types (optionally)
In-Reply-To: 
References: 
Message-ID: 

Mostly historical reasons, I guess: we started with static types because
most class-extension examples were using them, and they worked for all we
did at the time (including the __class__ assignment trick).  We then got
hit by the change, and solved the issue by patching Python.  Now, keeping
our own patched CPython is still perfectly viable, but there are
disadvantages to not using the off-the-shelf interpreter: it prevents
interacting with other proprietary libs within a single interpreter.  So
that would be plan C (BTW, having a common interpreter is a necessary
condition, but far from enough...).

Plan A was to push the idea that extension-type behavior could be more
tunable: at the moment you choose between static and heap types, and that
choice comes with a bunch of other non-obvious differences whose
combination is not necessarily the best for your use case or preferences
(see for example
http://grokbase.com/t/python/python-dev/097twacntz/py-tpflags-heaptype-too-overloaded).

I still think it's a good idea to allow for more tunable behavior, and
certainly not to use heap type as a marker for things not obviously
related, but as it did not gather traction 10 years ago, nor this time
either, I will not push for plan A.  Time has taught me it is better to
wait for the proposal to come back from another source than to invest too
much time in it if support, or at least discussion, is not strong from
the start.  It will probably come back and pass later, in one form or
another (PEP 225/465 ;-p).

Plan B is to see how much additional work it will be to use heap types,
and whether we suffer from their other properties.  If we end up requiring
another patch, bye bye plan B and back to the original patch.

On Wed, Jun 27, 2018 at 19:46, Guido van Rossum wrote:

> So the question remains -- why not use a heap type?
>
> On Wed, Jun 27, 2018 at 2:05 AM Greg wrote:
>
>> As I introduced (a long time ago) this demand, let me add my grain of
>> salt here.
>>
>> The use case is pretty simple, and somewhat common when writing C
>> extension classes manually: the reason to write an extension class is
>> usually performance, or linking into an existing library.
>> When doing this manually (instead of using automatic python-wrapping
>> tools like boost, swig, ...) you try to wrap the minimum number of
>> methods/accessors/... of your underlying C/C++ class, and replicate
>> non-critical methods in Python.
>> Moreover, extending your class by adding new methods is usually much
>> easier in Python, especially if it involves complex but not
>> performance-bound manipulation of Python data.
>> The problem is to make those Python-implemented methods available to
>> instances of your extension class, especially when those instances are
>> returned by the C layer of your extension.
>>
>> The solution we chose was to change the __class__ of each extension
>> type instance to the Python-derived newclass implementing all those
>> extra methods.  Not too difficult - a simple encapsulation of all
>> methods returning extension-class instances is enough, and can be
>> automated.  This solution is quite common, I think; it translates
>> something you do for Python-class instances, but then you get the
>> "__class__ assignment: only for heap types" error.
>>
>> The argument about sub-interpreters is a good one, but not really
>> applicable for this use case: we really want to have one extension type
>> (or a hierarchy of them) shared across all interpreters importing the
>> extension; it just happens that instead of being implemented in pure
>> C/C++, the extension is implemented in C/C++ and Python.  The fact that
>> the Python parts will be seen everywhere is a feature, not a problem:
>> you expect the replacement of C-implemented methods by
>> Python-implemented methods to be as transparent as possible.
>>
>> Alternatives would be to use a heap type for our C extension classes
>> (we need to check what it would imply, but it may be quite painless)
>> or to use some form of delegation instead of assigning to __class__.
>> The latter is not really painless, AFAIK, in terms of coding complexity
>> and possibly performance (extra lookup steps needed).
>>
>> If there are other solutions, or if delegation can be made as
>> simple/efficient as the __class__ mechanism, it would be good to know,
>> and it is, I think, valuable info for many people writing extension
>> classes.  Anyway, my personal position on this has not changed in 10
>> years and is in line with Eloi's: I think that being a heap type and
>> allowing assignment to the __class__ attribute of instances are indeed
>> quite orthogonal.
>
> --
> --Guido van Rossum (python.org/~guido)

-------------- next part --------------
An HTML attachment was scrubbed...
URL: From solipsis at pitrou.net Thu Jun 28 04:03:41 2018 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 28 Jun 2018 10:03:41 +0200 Subject: [Python-ideas] Add a __cite__ method for scientific packages References: Message-ID: <20180628100341.68e85b6f@fsol> On Wed, 27 Jun 2018 17:19:35 -0700 Nathaniel Smith wrote: > On Wed, Jun 27, 2018 at 2:20 PM, Andrei Kucharavy > wrote: > > To remediate to that situation, I suggest a __citation__ method associated > > to each package installation and import. Called from the __main__, > > __citation__() would scan __citation__ of all imported packages and return > > the list of all relevant top-level citations associated to the packages. > > > > As a scientific package developer working in academia, the problem is quite > > serious, and the solution seems relatively straightforward. > > > > What does Python core team think about addition and long-term maintenance of > > such a feature to the import and setup mechanisms? What do other users and > > scientific package developers think of such a mechanism for citations > > retrieval? > > This is indeed a serious problem. I suspect python-ideas isn't the > best venue for addressing it though ? there's nothing here that needs > changes to the Python interpreter itself (I think), and the people who > understand this problem the best and who are most affected by it, > mostly aren't here. > > You'll want to check out the duecredit project: > https://github.com/duecredit/duecredit > One of the things they've thought about is the ability to track > citation information at a more fine-grained way than per-package ? for > example, there might be a paper that should be cited by anyone who > calls a particular method (or even passes a specific argument to some > specific method, when that turns on some fancy algorithm). > > The R world also has some prior art -- in particular I know they have > citations as part of the standard metadata in every package. > > I'd actually like to see a more general solution that isn't restricted > to any one language, because multi-language analysis pipelines are > very common. Perhaps a dedicated CPU instruction? Regards Antoine. From steve at pearwood.info Thu Jun 28 04:43:15 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 28 Jun 2018 18:43:15 +1000 Subject: [Python-ideas] Add a __cite__ method for scientific packages In-Reply-To: References: Message-ID: <20180628084313.GK14437@ando.pearwood.info> On Wed, Jun 27, 2018 at 05:20:01PM -0400, Andrei Kucharavy wrote: [...] > To remediate to that situation, I suggest a __citation__ method associated > to each package installation and import. Called from the __main__, > __citation__() would scan __citation__ of all imported packages and return > the list of all relevant top-level citations associated to the packages. Why does this have to be a dunder method? In general, application code shouldn't be calling dunders directly, they're reserved for Python. I think your description of what this method should do is not really coherent. On the one hand, you have __citation__() be a method that you call (how?) but on the other hand you have it being a data field __citation__ that you scan. Which is it? I do think you have identified an important feature, but I think this is a *tool*, not a *language feature*. My spur of the moment thought is: - we could have a script (a third party script? or in the std lib?) which the user calls, giving the name of their module or package as argument e.g. 
"python -m cite myapplication.py" - this script knows how to analyse myapplication.py for a list of dependencies, perhaps filtering out standard library packages; - it interrogates myapplication, and each dependency, for a citation; - this might involve reserving a standard __citation__ data field in each module, or a __citation__.xml file in the package, or some other protocol; - or perhaps the cite script nows how to generate the appropriate citation itself, from any of the standard formatted data fields found in many common modules, like __author__, __version__ etc. - either way, the script would generate a list of packages and modules used by myapplication, plus citations for them. Presumably you would need to be able to specify which citation style to use. The point is, the *grunt work* of generating the citations is just a script. It isn't a language feature. It might not even be in the std lib (although perhaps we could ship it as a standard Python script, like the compileall module and a few other tools, starting in version 3.8). The protocol of how the script works out the citations can be developed. Perhaps we could reserve a __citation__ dunder as a de facto standard data field, like people already use __author__ and __version__ and similar. Or it could look for a separate XML or TXT file in the package directory. > As a scientific package developer working in academia, the problem is quite > serious, and the solution seems relatively straightforward. > > What does Python core team think about addition and long-term maintenance > of such a feature to the import and setup mechanisms? What does this have to do with either import or setup? > What do other users > and scientific package developers think of such a mechanism for citations > retrieval? A long time ago, I added a feature request for a page in the documentation to show how to cite Python in various formats: https://bugs.python.org/issue26597 I don't believe there has been any progress on this. (I certainly don't know the right way to cite software.) Perhaps this can be merged with your idea. Should Python have a standard sys.__citation__ field that provides the relevant detail in some format-independent, machine-readable object like a named tuple? Then this hypothetical cite.py tool could read the tuple and format it according to any citation style. -- Steve From gadgetsteve at live.co.uk Thu Jun 28 06:10:59 2018 From: gadgetsteve at live.co.uk (Steve Barnes) Date: Thu, 28 Jun 2018 10:10:59 +0000 Subject: [Python-ideas] Add a __cite__ method for scientific packages In-Reply-To: References: Message-ID: On 28/06/2018 00:00, Nathan Goldbaum wrote: > This is an interesting proposal. Speaking as a developer of scientific > software packages it would be really cool to have support for something > like this in the language itself. > > The software sustainability institute in the UK have written several > blog posts advocating the use of CITATION files containing this sort of > metadata: > > https://software.ac.uk/blog/2017-12-12-standard-format-citation-files > > A github code search for __citation__ also gets 127 hits that mostly > seem to be research software that are using this attribute more or less > as suggested here: > > https://github.com/search?q=__citation__&type=Code > > It's also worth pointing out http://citeas.org/ which is sort of a > citation search engine for software projects. It uses a number of > heuristics to figure out what the appropriate citation for a piece of > software is. 
From gadgetsteve at live.co.uk  Thu Jun 28 06:10:59 2018
From: gadgetsteve at live.co.uk (Steve Barnes)
Date: Thu, 28 Jun 2018 10:10:59 +0000
Subject: [Python-ideas] Add a __cite__ method for scientific packages
In-Reply-To: 
References: 
Message-ID: 

On 28/06/2018 00:00, Nathan Goldbaum wrote:
> This is an interesting proposal.  Speaking as a developer of scientific
> software packages it would be really cool to have support for something
> like this in the language itself.
>
> The software sustainability institute in the UK have written several
> blog posts advocating the use of CITATION files containing this sort of
> metadata:
>
> https://software.ac.uk/blog/2017-12-12-standard-format-citation-files
>
> A github code search for __citation__ also gets 127 hits that mostly
> seem to be research software that are using this attribute more or less
> as suggested here:
>
> https://github.com/search?q=__citation__&type=Code
>
> It's also worth pointing out http://citeas.org/ which is sort of a
> citation search engine for software projects.  It uses a number of
> heuristics to figure out what the appropriate citation for a piece of
> software is.

I just thought that it might be worth pointing out that this should
actually work both ways, i.e. if a specific package, module or function is
inspired by, or directly implements, the methods included in a specific
publication, then any __citation__ entries within it should also cite
that/those or allow references to them to be recovered.  The general
principle is: if you are expecting to be cited, you also have to cite.

--
Steve (Gadget) Barnes
Any opinions in this message are my personal opinions and do not reflect
those of my employer.

From nicolas.rolin at tiime.fr  Thu Jun 28 11:25:44 2018
From: nicolas.rolin at tiime.fr (Nicolas Rolin)
Date: Thu, 28 Jun 2018 17:25:44 +0200
Subject: [Python-ideas] Allow a group by operation for dict comprehension
Message-ID: 

Hi,

I use list and dict comprehension a lot, and a problem I often have is to
do the equivalent of a group_by operation (to use SQL terminology).

For example if I have a list of tuples (student, school) and I want to
have the list of students by school, the only option I'm left with is to
write

    student_by_school = defaultdict(list)
    for student, school in student_school_list:
        student_by_school[school].append(student)

What I would expect would be a syntax with comprehension allowing me to
write something along the lines of:

    student_by_school = {group_by(school): student
                         for student, school in student_school_list}

or any other syntax that allows me to regroup items from an iterable.

Small FAQ:

Q: Why include something in comprehensions when you can do it in a small
number of lines?

A: A really appreciable part of the list and dict comprehension is the
fact that it allows the developer to be really explicit about what he
wants to do at a given line.  If you see a comprehension, you know that
the developer wanted to have an iterable and not have any side effect
other than depleting the iterator (if he respects reasonable code
guidelines).  Initializing an object and doing a for loop to construct it
is both too long and not explicit enough about what is intended.  It
should be reserved for intrinsically complex operations, not one of the
base operations one can want to do with lists and dicts.

Q: Why group by in particular?

A: If we take SQL queries
(https://en.wikipedia.org/wiki/SQL_syntax#Queries) as a reasonable way of
seeing how people need to manipulate data on a day-to-day basis, we can
see that dict comprehensions already cover most of the base operations,
the only missing operations being group by and having.

Q: Why not use it on lists, with syntax such as

    student_by_school = [
        school, student
        for school, student in student_school_list
        group by school
    ]

?

A: It would create either a discrepancy with iterators or a perhaps
misleading semantic (the one from itertools.groupby, which requires the
iterable to be sorted in order to be useful).  Having the option to do it
with a dict removes any ambiguity and should be enough to cover most
"group by" applications.
Examples:

    edible_list = [('fruit', 'orange'), ('meat', 'eggs'), ('meat', 'spam'),
                   ('fruit', 'apple'), ('vegetable', 'fennel'),
                   ('fruit', 'pineapple'), ('fruit', 'pineapple'),
                   ('vegetable', 'carrot')]
    edible_list_by_food_type = {group_by(food_type): edible
                                for food_type, edible in edible_list}

    print(edible_list_by_food_type)
    {'fruit': ['orange', 'apple', 'pineapple', 'pineapple'],
     'meat': ['eggs', 'spam'], 'vegetable': ['fennel', 'carrot']}

    bank_transactions = [200.0, -357.0, -9.99, -15.6, 4320.0, -1200.0]
    split_bank_transactions = {group_by('credit' if amount > 0 else 'debit'):
                               amount for amount in bank_transactions}

    print(split_bank_transactions)
    {'credit': [200.0, 4320.0], 'debit': [-357.0, -9.99, -15.6, -1200.0]}

--
Nicolas Rolin

From mike at selik.org  Thu Jun 28 11:38:10 2018
From: mike at selik.org (Michael Selik)
Date: Thu, 28 Jun 2018 08:38:10 -0700
Subject: [Python-ideas] Allow a group by operation for dict comprehension
In-Reply-To: 
References: 
Message-ID: 

On Thu, Jun 28, 2018 at 8:25 AM Nicolas Rolin wrote:

> I use list and dict comprehension a lot, and a problem I often have is
> to do the equivalent of a group_by operation (to use SQL terminology).
>
> For example if I have a list of tuples (student, school) and I want to
> have the list of students by school, the only option I'm left with is to
> write
>
>     student_by_school = defaultdict(list)
>     for student, school in student_school_list:
>         student_by_school[school].append(student)

Thank you for bringing this up.  I've been drafting a proposal for a
better grouping / group-by operation for a little while.  I'm not quite
ready to share it, as I'm still researching use cases.

I'm +1 that this task needs improvement, but -1 on this particular
solution.
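For comparison, the minimal shape such a grouping helper usually takes
today - a sketch only, not the forthcoming proposal Michael mentions:

    from collections import defaultdict

    def grouped(iterable, key):
        groups = defaultdict(list)
        for item in iterable:
            groups[key(item)].append(item)
        return dict(groups)

    pairs = [('Fred', 'SchoolA'), ('Bob', 'SchoolB'), ('Mary', 'SchoolA')]
    print(grouped(pairs, key=lambda pair: pair[1]))
    # {'SchoolA': [('Fred', 'SchoolA'), ('Mary', 'SchoolA')],
    #  'SchoolB': [('Bob', 'SchoolB')]}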
From chris.barker at noaa.gov  Thu Jun 28 11:18:36 2018
From: chris.barker at noaa.gov (Chris Barker - NOAA Federal)
Date: Thu, 28 Jun 2018 11:18:36 -0400
Subject: [Python-ideas] Add a __cite__ method for scientific packages
In-Reply-To: 
References: 
Message-ID: 

I think this is a fine idea, but could be achieved by convention, like
__version__, rather than by fiat.  And it's certainly not a language
feature.

So Nathaniel's right - the thing to do now is work out the convention, and
then advocate for it.

-CHB

From chris.barker at noaa.gov  Thu Jun 28 12:26:28 2018
From: chris.barker at noaa.gov (Chris Barker)
Date: Thu, 28 Jun 2018 09:26:28 -0700
Subject: [Python-ideas] list configuration
Message-ID: 

Hey all,

I've been replying to messages lately, and getting a bounce back:

"""
Hello chris.barker at noaa.gov,

We're writing to let you know that the group you tried to contact
(python-ideas) may not exist, or you may not have permission to post
messages to the group.  A few more details on why you weren't able to
post:
"""

And it's not quite clear to me if the message actually got through.

IIUC, this is a Mailman list -- so it must be getting mirrored through
google groups, and at least with some people's posts, the reply-to header
is getting messed up.

Anyone know what's going on?  It would be nice to fix this...

-CHB

PS: I've seen a couple other notes about this -- I'm not the only one.

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From chris.barker at noaa.gov  Thu Jun 28 13:01:00 2018
From: chris.barker at noaa.gov (Chris Barker)
Date: Thu, 28 Jun 2018 10:01:00 -0700
Subject: [Python-ideas] Allow a group by operation for dict comprehension
In-Reply-To: 
References: 
Message-ID: 

On Thu, Jun 28, 2018 at 8:25 AM, Nicolas Rolin wrote:

> I use list and dict comprehension a lot, and a problem I often have is
> to do the equivalent of a group_by operation (to use SQL terminology).

I don't know from SQL, so "group by" doesn't mean anything to me, but
this:

> For example if I have a list of tuples (student, school) and I want to
> have the list of students by school, the only option I'm left with is to
> write
>
>     student_by_school = defaultdict(list)
>     for student, school in student_school_list:
>         student_by_school[school].append(student)

seems to me that the issue here is that there is no way to have a
"defaultdict comprehension".

I can't think of a syntactically clean way to make that possible, though.

Could itertools.groupby help here?  It seems to work, but boy! it's ugly:

    In [45]: student_school_list
    Out[45]:
    [('Fred', 'SchoolA'),
     ('Bob', 'SchoolB'),
     ('Mary', 'SchoolA'),
     ('Jane', 'SchoolB'),
     ('Nancy', 'SchoolC')]

    In [46]: {a: [t[0] for t in b]
              for a, b in groupby(sorted(student_school_list,
                                         key=lambda t: t[1]),
                                  key=lambda t: t[1])}
    Out[46]: {'SchoolA': ['Fred', 'Mary'], 'SchoolB': ['Bob', 'Jane'],
              'SchoolC': ['Nancy']}

-CHB

--
Christopher Barker, Ph.D.
Oceanographer
Chris.Barker at noaa.gov
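A slightly tidier spelling of the same sort-then-group approach, using
operator.itemgetter instead of the repeated lambdas (equivalent in
behaviour):

    from itertools import groupby
    from operator import itemgetter

    student_school_list = [('Fred', 'SchoolA'), ('Bob', 'SchoolB'),
                           ('Mary', 'SchoolA'), ('Jane', 'SchoolB'),
                           ('Nancy', 'SchoolC')]

    by_school = {school: [student for student, _ in grp]
                 for school, grp in groupby(sorted(student_school_list,
                                                   key=itemgetter(1)),
                                            key=itemgetter(1))}
    print(by_school)
    # {'SchoolA': ['Fred', 'Mary'], 'SchoolB': ['Bob', 'Jane'], 'SchoolC': ['Nancy']}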
From rob.cliffe at btinternet.com  Thu Jun 28 13:21:02 2018
From: rob.cliffe at btinternet.com (Rob Cliffe)
Date: Thu, 28 Jun 2018 18:21:02 +0100
Subject: [Python-ideas] Allow a group by operation for dict comprehension
In-Reply-To: 
References: 
Message-ID: <0a12a07f-3801-08b4-f9ab-e4f94bad60ff@btinternet.com>

Why not write a helper function?  Something like

    def group_by(iterable, groupfunc, itemfunc=lambda x:x,
                 sortfunc=lambda x:x):  # Python 2 & 3 compatible!
        D = {}
        for x in iterable:
            group = groupfunc(x)
            D[group] = D.get(group, []) + [itemfunc(x)]
        if sortfunc is not None:
            for group in D:
                D[group] = sorted(D[group], key=sortfunc)
        return D

Then:

    student_list = [('james', 'Dublin'), ('jim', 'Cork'), ('mary', 'Cork'),
                    ('fred', 'Dublin')]
    student_by_school = group_by(student_list,
                                 lambda stu_sch: stu_sch[1],
                                 lambda stu_sch: stu_sch[0])
    print(student_by_school)
    {'Dublin': ['fred', 'james'], 'Cork': ['jim', 'mary']}

Regards
Rob Cliffe

On 28/06/2018 16:25, Nicolas Rolin wrote:
> Hi,
>
> I use list and dict comprehension a lot, and a problem I often have is
> to do the equivalent of a group_by operation (to use SQL terminology).
>
> For example if I have a list of tuples (student, school) and I want to
> have the list of students by school, the only option I'm left with is
> to write
>
>     student_by_school = defaultdict(list)
>     for student, school in student_school_list:
>         student_by_school[school].append(student)
>
> What I would expect would be a syntax with comprehension allowing me
> to write something along the lines of:
>
>     student_by_school = {group_by(school): student
>                          for student, school in student_school_list}
>
> or any other syntax that allows me to regroup items from an iterable.
>
> Small FAQ:
>
> Q: Why include something in comprehensions when you can do it in a
> small number of lines?
>
> A: A really appreciable part of the list and dict comprehension is the
> fact that it allows the developer to be really explicit about what he
> wants to do at a given line.
> If you see a comprehension, you know that the developer wanted to have
> an iterable and not have any side effect other than depleting the
> iterator (if he respects reasonable code guidelines).
> Initializing an object and doing a for loop to construct it is both
> too long and not explicit enough about what is intended.
> It should be reserved for intrinsically complex operations, not one of
> the base operations one can want to do with lists and dicts.
>
> Q: Why group by in particular?
>
> A: If we take SQL queries
> (https://en.wikipedia.org/wiki/SQL_syntax#Queries) as a reasonable way
> of seeing how people need to manipulate data on a day-to-day basis, we
> can see that dict comprehensions already cover most of the base
> operations, the only missing operations being group by and having.
>
> Q: Why not use it on lists, with syntax such as
>     student_by_school = [
>         school, student
>         for school, student in student_school_list
>         group by school
>     ]
> ?
>
> A: It would create either a discrepancy with iterators or a perhaps
> misleading semantic (the one from itertools.groupby, which requires
> the iterable to be sorted in order to be useful).
> Having the option to do it with a dict removes any ambiguity and
> should be enough to cover most "group by" applications.
>
> Examples:
>
>     edible_list = [('fruit', 'orange'), ('meat', 'eggs'), ('meat',
> 'spam'), ('fruit', 'apple'), ('vegetable', 'fennel'), ('fruit',
> 'pineapple'), ('fruit', 'pineapple'), ('vegetable', 'carrot')]
>     edible_list_by_food_type = {group_by(food_type): edible for
> food_type, edible in edible_list}
>
>     print(edible_list_by_food_type)
>     {'fruit': ['orange', 'apple', 'pineapple', 'pineapple'],
>      'meat': ['eggs', 'spam'], 'vegetable': ['fennel', 'carrot']}
>
>     bank_transactions = [200.0, -357.0, -9.99, -15.6, 4320.0, -1200.0]
>     split_bank_transactions = {group_by('credit' if amount > 0 else
> 'debit'): amount for amount in bank_transactions}
>
>     print(split_bank_transactions)
>     {'credit': [200.0, 4320.0], 'debit': [-357.0, -9.99, -15.6, -1200.0]}
>
> --
> Nicolas Rolin

From ericfahlgren at gmail.com  Thu Jun 28 13:31:00 2018
From: ericfahlgren at gmail.com (Eric Fahlgren)
Date: Thu, 28 Jun 2018 10:31:00 -0700
Subject: [Python-ideas] list configuration
In-Reply-To: 
References: 
Message-ID: 

I've been getting those, too, but from the wxPython-dev group.  I concur
that they look like googlegroups bounces (although I can't confirm that,
as I've been deleting them without much inspection).
On Thu, Jun 28, 2018 at 9:35 AM Chris Barker via Python-ideas < python-ideas at python.org> wrote: > Hey all, > > I've been replying to messages lately, and getting a bounce back: > > """ > Hello chris.barker at noaa.gov, > > We're writing to let you know that the group you tried to contact > (python-ideas) may not exist, or you may not have permission to post > messages to the group. A few more details on why you weren't able to post: > """ > > And it's not quite clar to me if the message actually got through. > > IIUC, this is a Mailman list -- so it must be getting mirrored through > google groups, and at least with some people's posts, the reply-to header > is getting messed up. > > Anyone know what's going on? It would be nice to fix this... > > -CHB > > PS: I've seen a couple other notes about this -- I'm not the only one. > > > > > > > > -- > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From random832 at fastmail.com Thu Jun 28 13:43:56 2018 From: random832 at fastmail.com (Random832) Date: Thu, 28 Jun 2018 13:43:56 -0400 Subject: [Python-ideas] Should nested classes in an Enum be Enum members? In-Reply-To: References: <5B33A33F.30207@stoneleaf.us> Message-ID: <1530207836.3337143.1423737512.2B2DB1C5@webmail.messagingengine.com> On Wed, Jun 27, 2018, at 15:04, Elazar wrote: > People working with sum types might expect the instances of the nested > class to be instances of the enclosing class. So if the nested class is a > namedtuple, you get a sum type. The only problem is that there's no way to > express this subtype relationship in code. I bet you could get around it with a custom __build_class__. (As for preventing the nested class from being an enum member, @staticmethod works to get around that) From mike at selik.org Thu Jun 28 14:23:49 2018 From: mike at selik.org (Michael Selik) Date: Thu, 28 Jun 2018 11:23:49 -0700 Subject: [Python-ideas] Allow a group by operation for dict comprehension In-Reply-To: <0a12a07f-3801-08b4-f9ab-e4f94bad60ff@btinternet.com> References: <0a12a07f-3801-08b4-f9ab-e4f94bad60ff@btinternet.com> Message-ID: On Thu, Jun 28, 2018 at 10:24 AM Rob Cliffe via Python-ideas < python-ideas at python.org> wrote: > def group_by(iterable, groupfunc, itemfunc=lambda x:x, sortfunc=lambda > x:x): # Python 2 & 3 compatible! > > D = {} > for x in iterable: > group = groupfunc(x) > D[group] = D.get(group, []) + [itemfunc(x)] > if sortfunc is not None: > for group in D: > D[group] = sorted(D[group], key=sortfunc) > return D > The fact that you didn't use ``setdefault`` here, opting for repeatedly constructing new lists via concatenation, demonstrates the need for a built-in or standard library tool that is easier to use. I'll submit a proposal for your review soon. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From rhodri at kynesim.co.uk Thu Jun 28 13:07:47 2018 From: rhodri at kynesim.co.uk (Rhodri James) Date: Thu, 28 Jun 2018 18:07:47 +0100 Subject: [Python-ideas] Allow a group by operation for dict comprehension In-Reply-To: References: Message-ID: <1c92d785-dfba-a033-b912-737aa6bd8023@kynesim.co.uk> On 28/06/18 16:25, Nicolas Rolin wrote: > Hi, > > I use list and dict comprehension a lot, and a problem I often have is to > do the equivalent of a group_by operation (to use sql terminology). > > For example if I have a list of tuples (student, school) and I want to have > the list of students by school the only option I'm left with is to write > > student_by_school = defaultdict(list) > for student, school in student_school_list: > student_by_school[school].append(student) > > What I would expect would be a syntax with comprehension allowing me to > write something along the lines of: > > student_by_school = {group_by(school): student for school, student in > student_school_list} > > or any other syntax that allows me to regroup items from an iterable. > Sorry, I don't like the extra load on comprehensions here. You are doing something inherently somewhat complicated and then attempting to hide the magic. Worse, you are hiding it by pretending to be something else (an ordinary comprehension), which will break people's intuition about what is being produced. -- Rhodri James *-* Kynesim Ltd From wes.turner at gmail.com Thu Jun 28 15:47:11 2018 From: wes.turner at gmail.com (Wes Turner) Date: Thu, 28 Jun 2018 15:47:11 -0400 Subject: [Python-ideas] Allow a group by operation for dict comprehension In-Reply-To: References: Message-ID: PyToolz, Pandas, Dask .groupby() toolz.itertoolz.groupby does this succinctly without any new/magical/surprising syntax. https://toolz.readthedocs.io/en/latest/api.html#toolz.itertoolz.groupby >From https://github.com/pytoolz/toolz/blob/master/toolz/itertoolz.py : """ def groupby(key, seq): """ Group a collection by a key function >>> names = ['Alice', 'Bob', 'Charlie', 'Dan', 'Edith', 'Frank'] >>> groupby(len, names) # doctest: +SKIP {3: ['Bob', 'Dan'], 5: ['Alice', 'Edith', 'Frank'], 7: ['Charlie']} >>> iseven = lambda x: x % 2 == 0 >>> groupby(iseven, [1, 2, 3, 4, 5, 6, 7, 8]) # doctest: +SKIP {False: [1, 3, 5, 7], True: [2, 4, 6, 8]} Non-callable keys imply grouping on a member. >>> groupby('gender', [{'name': 'Alice', 'gender': 'F'}, ... {'name': 'Bob', 'gender': 'M'}, ... 
{'name': 'Charlie', 'gender': 'M'}]) # doctest:+SKIP {'F': [{'gender': 'F', 'name': 'Alice'}], 'M': [{'gender': 'M', 'name': 'Bob'}, {'gender': 'M', 'name': 'Charlie'}]} See Also: countby """ if not callable(key): key = getter(key) d = collections.defaultdict(lambda: [].append) for item in seq: d[key(item)](item) rv = {} for k, v in iteritems(d): rv[k] = v.__self__ return rv """ If you're willing to install Pandas (and NumPy, and ...), there's pandas.DataFrame.groupby: https://pandas.pydata.org/pandas-docs/stable/generated/ pandas.DataFrame.groupby.html https://github.com/pandas-dev/pandas/blob/v0.23.1/pandas/ core/generic.py#L6586-L6659 Dask has a different groupby implementation: https://gist.github.com/darribas/41940dfe7bf4f987eeaa# file-pandas_dask_test-ipynb https://dask.pydata.org/en/latest/dataframe-api.html#dask.dataframe.DataFrame.groupby On Thursday, June 28, 2018, Chris Barker via Python-ideas < python-ideas at python.org> wrote: > On Thu, Jun 28, 2018 at 8:25 AM, Nicolas Rolin > wrote: >> >> I use list and dict comprehension a lot, and a problem I often have is to >> do the equivalent of a group_by operation (to use sql terminology). >> > > I don't know from SQL, so "group by" doesn't mean anything to me, but this: > > >> For example if I have a list of tuples (student, school) and I want to >> have the list of students by school the only option I'm left with is to >> write >> >> student_by_school = defaultdict(list) >> for student, school in student_school_list: >> student_by_school[school].append(student) >> > > seems to me that the issue here is that there is not way to have a > "defaultdict comprehension" > > I can't think of syntactically clean way to make that possible, though. > > Could itertools.groupby help here? It seems to work, but boy! it's ugly: > > In [*45*]: student_school_list > > Out[*45*]: > > [('Fred', 'SchoolA'), > > ('Bob', 'SchoolB'), > > ('Mary', 'SchoolA'), > > ('Jane', 'SchoolB'), > > ('Nancy', 'SchoolC')] > > > In [*46*]: {a:[t[0] *for* t *in* b] *for* a,b *in* groupby(sorted > (student_school_list, key=*lambda* t: t[1]), key=*lambda* t: t[ > > ...: 1])} > > ...: > > ...: > > ...: > > ...: > > ...: > > ...: > > ...: > > Out[*46*]: {'SchoolA': ['Fred', 'Mary'], 'SchoolB': ['Bob', 'Jane'], > 'SchoolC': ['Nancy']} > > > -CHB > > > -- > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mertz at gnosis.cx Thu Jun 28 16:34:30 2018 From: mertz at gnosis.cx (David Mertz) Date: Thu, 28 Jun 2018 16:34:30 -0400 Subject: [Python-ideas] Allow a group by operation for dict comprehension In-Reply-To: References: Message-ID: I agree with these recommendations. There are excellent 3rd party tools that do what you want. This is way too much to try to shoehorn into a comprehension. I'd add one more option. You want something that behaves like SQL. Right in the standard library is sqlite3, and you can create an in-memory DB to hope the data you expect to group. On Thu, Jun 28, 2018, 3:48 PM Wes Turner wrote: > PyToolz, Pandas, Dask .groupby() > > toolz.itertoolz.groupby does this succinctly without any > new/magical/surprising syntax. 
> > https://toolz.readthedocs.io/en/latest/api.html#toolz.itertoolz.groupby > > From https://github.com/pytoolz/toolz/blob/master/toolz/itertoolz.py : > > """ > def groupby(key, seq): > """ Group a collection by a key function > >>> names = ['Alice', 'Bob', 'Charlie', 'Dan', 'Edith', 'Frank'] > >>> groupby(len, names) # doctest: +SKIP > {3: ['Bob', 'Dan'], 5: ['Alice', 'Edith', 'Frank'], 7: ['Charlie']} > >>> iseven = lambda x: x % 2 == 0 > >>> groupby(iseven, [1, 2, 3, 4, 5, 6, 7, 8]) # doctest: +SKIP > {False: [1, 3, 5, 7], True: [2, 4, 6, 8]} > Non-callable keys imply grouping on a member. > >>> groupby('gender', [{'name': 'Alice', 'gender': 'F'}, > ... {'name': 'Bob', 'gender': 'M'}, > ... {'name': 'Charlie', 'gender': 'M'}]) # > doctest:+SKIP > {'F': [{'gender': 'F', 'name': 'Alice'}], > 'M': [{'gender': 'M', 'name': 'Bob'}, > {'gender': 'M', 'name': 'Charlie'}]} > See Also: > countby > """ > if not callable(key): > key = getter(key) > d = collections.defaultdict(lambda: [].append) > for item in seq: > d[key(item)](item) > rv = {} > for k, v in iteritems(d): > rv[k] = v.__self__ > return rv > """ > > If you're willing to install Pandas (and NumPy, and ...), there's > pandas.DataFrame.groupby: > > > https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.groupby.html > > > https://github.com/pandas-dev/pandas/blob/v0.23.1/pandas/core/generic.py#L6586-L6659 > > > Dask has a different groupby implementation: > > https://gist.github.com/darribas/41940dfe7bf4f987eeaa#file-pandas_dask_test-ipynb > > > https://dask.pydata.org/en/latest/dataframe-api.html#dask.dataframe.DataFrame.groupby > > > On Thursday, June 28, 2018, Chris Barker via Python-ideas < > python-ideas at python.org> wrote: > >> On Thu, Jun 28, 2018 at 8:25 AM, Nicolas Rolin >> wrote: >>> >>> I use list and dict comprehension a lot, and a problem I often have is >>> to do the equivalent of a group_by operation (to use sql terminology). >>> >> >> I don't know from SQL, so "group by" doesn't mean anything to me, but >> this: >> >> >>> For example if I have a list of tuples (student, school) and I want to >>> have the list of students by school the only option I'm left with is to >>> write >>> >>> student_by_school = defaultdict(list) >>> for student, school in student_school_list: >>> student_by_school[school].append(student) >>> >> >> seems to me that the issue here is that there is not way to have a >> "defaultdict comprehension" >> >> I can't think of syntactically clean way to make that possible, though. >> >> Could itertools.groupby help here? It seems to work, but boy! it's ugly: >> >> In [*45*]: student_school_list >> >> Out[*45*]: >> >> [('Fred', 'SchoolA'), >> >> ('Bob', 'SchoolB'), >> >> ('Mary', 'SchoolA'), >> >> ('Jane', 'SchoolB'), >> >> ('Nancy', 'SchoolC')] >> >> >> In [*46*]: {a:[t[0] *for* t *in* b] *for* a,b *in* groupby(sorted(student_school_list, >> key=*lambda* t: t[1]), key=*lambda* t: t[ >> >> ...: 1])} >> >> ...: >> >> ...: >> >> ...: >> >> ...: >> >> ...: >> >> ...: >> >> ...: >> >> Out[*46*]: {'SchoolA': ['Fred', 'Mary'], 'SchoolB': ['Bob', 'Jane'], >> 'SchoolC': ['Nancy']} >> >> >> -CHB >> >> >> -- >> >> Christopher Barker, Ph.D. 
>> Oceanographer
>>
>> Emergency Response Division
>> NOAA/NOS/OR&R            (206) 526-6959   voice
>> 7600 Sand Point Way NE   (206) 526-6329   fax
>> Seattle, WA  98115       (206) 526-6317   main reception
>>
>> Chris.Barker at noaa.gov
>>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From pylang3 at gmail.com  Thu Jun 28 17:08:11 2018
From: pylang3 at gmail.com (pylang)
Date: Thu, 28 Jun 2018 17:08:11 -0400
Subject: [Python-ideas] Add a __cite__ method for scientific packages
In-Reply-To:
References:
Message-ID:

> Are there other languages or software communities that do something like
> this? It would be nice not to have to invent this wheel.

While I do not use R regularly, I understand their community is largely
academic-driven, and citations are strongly encouraged as seen in their
documentation:
https://stat.ethz.ch/R-manual/R-devel/library/utils/html/citation.html

Here is an example use of their `citation()` function:
http://www.blopig.com/blog/2013/07/citing-r-packages-in-your-thesispaperassignments/

> citation()

To cite R in publications use:

  R Core Team (2013). R: A language and environment for statistical
  computing. R Foundation for Statistical Computing, Vienna, Austria.
  URL http://www.R-project.org/.

A BibTeX entry for LaTeX users is

  @Manual{,
    title = {R: A Language and Environment for Statistical Computing},
    author = {{R Core Team}},
    organization = {R Foundation for Statistical Computing},
    address = {Vienna, Austria},
    year = {2013},
    url = {http://www.R-project.org/},
  }

Calling the `citation()` function generates BibTeX output
(http://www.bibtex.org/), which is one of the most common citation
conventions.

For reference, I believe this is the source code:
https://github.com/wch/r-source/blob/c3f7d32c842ca61fa23a25d4240d6caf980fe2ee/src/library/tools/R/citation.R
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From andrei.kucharavy at gmail.com  Thu Jun 28 17:25:00 2018
From: andrei.kucharavy at gmail.com (Andrei Kucharavy)
Date: Thu, 28 Jun 2018 17:25:00 -0400
Subject: [Python-ideas] Add a __cite__ method for scientific packages
In-Reply-To:
References:
Message-ID:

That's a lot of responses, thanks for the interest and the suggestions!

> Are there other languages or software communities that do something like
> this? It would be nice not to have to invent this wheel. Eventually a PEP
> and an implementation should be presented, but first the idea needs to be
> explored more.

To my knowledge, R is the only language that implements such a feature.
Package developers add a CITATION text file containing citation text in
whatever format they choose. A specialized citation() built-in function
can be called from the REPL to return a citation for R itself, including
a BibTeX entry for LaTeX users. When citation() is called on a package
instead, it returns the contents of CITATION for that package
specifically (e.g. citation("ggplot2")), or alternatively uses package
metadata to build a sane citation.

Given that most work in R is done within a REPL and packages are
installed/imported with commands such as
install.packages("ggplot2")/library("ggplot2"), this approach makes sense
in that context. This, however, didn't feel terribly Pythonic to me.
As for PEP and a reference implementation, I will gladly take care of them if the idea gets enough traction, but there seems to be already a PEP draft as well as an attempt at implementation by one of the AstroPy/AstroML maintainers, using the __citation__ field and citation() function to unpack it: https://github.com/adrn/CitationPEP There also seem some packages in the community using __bibtex__ rather than __citation__ to store BibTeX entries but I haven't found yet any large project implementing it or PEP drafts associated to it. The software sustainability institute in the UK have written several blog > posts advocating the use of CITATION files containing this sort of metadata: > https://software.ac.uk/blog/2017-12-12-standard-format-citation-files Yes, that's the R approach I presented above. It is viable, especially if hooked to something accessible from the REPL directly, such as __cite__ or __citation__ attribute/method for modules. I would, however, advocate for a more structured approach - perhaps JSON or BibTeX that would get parsed and converted to suitable citation format by the __cite__, if it was implemented as a method. A github code search for __citation__ also gets 127 hits that mostly seem > to be research software that are using this attribute more or less as > suggested here: > https://github.com/search?q=__citation__&type=Code Most of them are from the AstroPy universe or from the CitationPEP draft I've referenced above. This is indeed a serious problem. I suspect python-ideas isn't the > best venue for addressing it though ? there's nothing here that needs > changes to the Python interpreter itself (I think), and the people who > understand this problem the best and who are most affected by it, > mostly aren't here. There has been localized discussion popping up among the large scientific package maintainers and some attempts to solve the problem at the local level. Until now they seemed to be winding down due to a lack of a large-scale citation mechanism and a discussion about what is concretely doable at the scale of the language is likely to finalize As for the list, reserving a __citation__/__cite__ for packages at the same level as __version__ is now reserved and adding a citation()/cite() function to the standard library seemed large enough modifications to warrant searching a buy-in from the maintainers and the community at large. You'll want to check out the duecredit project: > https://github.com/duecredit/duecredit > One of the things they've thought about is the ability to track > citation information at a more fine-grained way than per-package ? for > example, there might be a paper that should be cited by anyone who > calls a particular method (or even passes a specific argument to some > specific method, when that turns on some fancy algorithm). Due credit looks amazing - I will definitely check it out. The idea was, however, to bring the barrier for adoption and usage as low as possible. In my experience, the vast majority of Python users in academic environment who aren't citing the packages properly are beginners. As such they are unlikely to search for third-party libraries beyond those they've found and used to solve their specific problem. who just assembled a pipeline based on widely-used libraries and would need to generate a citation list for it to pass on to their colleagues responsible for the paper assembly and submission. 
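To make the convention concrete, the package author's side of it could be
as small as one module-level attribute. The following is only a sketch of
the proposed convention -- the package name and the entry are invented
for illustration, nothing here is a settled standard:

# mypackage/__init__.py -- illustrative only
__version__ = "1.2.0"

# Proposed convention: a dict mapping citation format names to
# citation strings, so tools can ask for a format they can ingest.
__citation__ = {
    "bibtex": """@article{doe2018mypackage,
  title   = {MyPackage: A Hypothetical Scientific Package},
  author  = {Doe, Jane},
  journal = {Journal of Hypothetical Results},
  year    = {2018},
}""",
}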
I'd actually like to see a more general solution that isn't restricted > to any one language, because multi-language analysis pipelines are > very common. For example, we could standardize a convention where if a > certain environment variable is set, then the software writes out > citation information to a certain location, and then implement > libraries that do this in multiple languages. Of course, that's a > "dynamic" solution that requires running the software -- which is > probably necessary if you want to do fine-grained citations, but it > might be useful to also have static metadata, e.g. as part of the > package metadata that goes into sdists, wheels, and on PyPI. That > would be a discussion for the distutils-sig mailing list, which > manages that metadata. Thanks for the reference to the distutils-sig list. I will talk to them if the idea gets traction here I am not entirely convinced for the multi-language pipelines. In bioinformatics, often the heavy lifting is done by a single package (for instance bowtie for RNA-seq alignment) and the output is piped to the custom script, mostly in R or Python. The citations for the library doing the heavy-lifting is often well-known and widely cited and the issues arise in the custom scripts importing and using libraries that should be cited without citing them. One challenge in standardizing this kind of thing is choosing a > standard way to represent citation information. Maybe CSL-JSON? > There's a lot of complexity as you dig into this, though of course one > shouldn't let the perfect be the enemy of the good... CLS-JSON represented as a dict to be supplied to the setup file is definitely one way of doing it. I was, however, thinking more about the BibTeX format, given that CLS-JSON is more closely affiliated with Mendeley Why does this have to be a dunder method? In general, application code shouldn't be calling dunders directly, they're reserved for Python. I was under the impression that sometimes the dunders are used to store relevant information that would not be of use to the most users, such as __version__ and sometimes to better control the execution flow (for instance the if __name__== "main") I think your description of what this method should do is not > really coherent. On the one hand, you have __citation__() be a method > that you call (how?) but on the other hand you have it being a data > field __citation__ that you scan. My initial idea was to have a __cite__ method embedded in the import mechanism that would parse data from config and upon a call on a package, return the citation developers want to see associated to the current package version in the format user needs. (for instance numpy.__cite__('bibtex') would return a citation for the current numpy version in BibTeX format). If called on the script itself __cite__('bibtex') would iterate through all the imported modules and retrieve their citations one by one, at least for those that modules that have associated citation. After reading the feedback in this thread, I believe that a __citation__ reserved field that pulls the data from the setup script and a cite() script in the standard library would be a better approach. In the end, I believe the best would be to implement both of them and see which one feels more pythonic. I do think you have identified an important feature, but I think this is > a *tool*, not a *language feature*. My spur of the moment thought is: > - we could have a script (a third party script? or in the std lib?) 
> which the user calls, giving the name of their module or package as > argument > e.g. "python -m cite myapplication.py" > - this script knows how to analyse myapplication.py for a list of > dependencies, perhaps filtering out standard library packages; > - it interrogates myapplication, and each dependency, for a citation; > - this might involve reserving a standard __citation__ data field > in each module, or a __citation__.xml file in the package, or > some other protocol; > - or perhaps the cite script nows how to generate the appropriate > citation itself, from any of the standard formatted data fields > found in many common modules, like __author__, __version__ etc. > - either way, the script would generate a list of packages and > modules used by myapplication, plus citations for them. Yes, that's the idea! The biggest reason for me to send the discussion to this list is to check if it would be acceptable to reserve the __citation__ data field in each module and include the cite() script in the standard library. Presumably you would need to be able to specify which citation style to > use. Yes, but to avoid building a configurable citation engine for the thousands of formats there are in the wild, it would take a couple of standard formats and interchangeable formats, such as bibtex or EndNote xref - both text formats that are simple to use. I was thinking about the approach taken by Google Scholar from that perspective. > What does Python core team think about addition and long-term maintenance > > of such a feature to the import and setup mechanisms? > What does this have to do with either import or setup? The implementation I was thinking about would have required __citation__/__cite__ dunder reservation or implementation of a function that would be injected into installed packages. For setup I was thinking about adding the citation field to the distutils setup. I was not really aware of the distutils-sig discussion list that would be more appropriate with that regards. A long time ago, I added a feature request for a page in the > documentation to show how to cite Python in various formats: > https://bugs.python.org/issue26597 > I don't believe there has been any progress on this. (I certainly don't > know the right way to cite software.) Perhaps this can be merged with > your idea. That's a good point. Unfortunately, I have not thought about how to cite code that would not have an associated publication. From what I see by checking google scholar, as of now people are citing the Python language reference manual if they want to cite Python itself in a scientific publication. GVM didn't seem interested in citations for Python and from what I understand the vast majority of non-scientific package developer, given citations are not essential for their career advancement. Should Python have a standard sys.__citation__ field that provides the > relevant detail in some format-independent, machine-readable object like > a named tuple? Then this hypothetical cite.py tool could read the tuple > and format it according to any citation style. The idea for Python itself seems good! However, rather than using a named tuple, I was thinking about using a dict consistent with CSL-JSON or BibTeX. 
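As a minimal sketch of the consuming side -- every name here is
illustrative, nothing like this exists in the standard library today, and
it simply assumes the dict-based __citation__ convention sketched above:

import sys

def cite(fmt="bibtex"):
    """Collect citation entries from all imported top-level modules.

    Sketch only: assumes each participating module exposes a
    __citation__ dict mapping format names (e.g. "bibtex") to
    citation strings.
    """
    entries = []
    for name, module in sorted(sys.modules.items()):
        if module is None or "." in name:
            continue  # cite each top-level package only once
        citation = getattr(module, "__citation__", None)
        if isinstance(citation, dict) and fmt in citation:
            entries.append(citation[fmt])
    return "\n\n".join(entries)

Calling cite() at the end of an analysis script would then emit one
BibTeX block per imported package that opted into the convention.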
And writing a citation-generating engine that would be consistent with
hundreds if not thousands of journal-specific formats is a bit out of the
scope of the proposal for now - most of the time people just want
something their citation/bibliography engine can ingest and generate the
citation from there in their Word/LaTeX documents. BibTeX/EndNote export
formats are perfect for that task in my experience.

> just thought that it might be worth pointing out that this should
> actually work both ways i.e. if a specific package, module or function
> is inspired by or directly implements the methods included in a specific
> publication then any __citation__ entries within it should also cite
> that/those or allow references to them to be recovered.
> The general principle is if you are expecting to be cited you also have
> to cite.

The general convention is to cite the top-level publication. While some
methods definitely deserve a citation on their own (such as the Sobel
filter in scikit-image), the documentation usually provides a link to the
relevant citation, and users would normally cite it alongside the master
publication. That's definitely an idea to look at, but I don't see a
straightforward way of implementing it so far.

> I think this is a fine idea, but could be achieved by convention, like
> __version__, rather than by fiat.
> And it's certainly not a language feature.
> So Nathaniel's right -- the thing to do now is work out the convention,
> and then advocate for it.

This already seems to be an idea floating in the air - AstroPy is inching
towards that implementation. The idea is to modify the language to make
citing as straightforward as possible and create a universal mechanism
for that.

Best,

Andrei Kucharavy

Post-Doc @ Joel S. Bader Lab
Johns Hopkins University, Baltimore, USA.

On Thu, Jun 28, 2018 at 11:48 AM Chris Barker - NOAA Federal via
Python-ideas wrote:

> I think this is a fine idea, but could be achieved by convention, like
> __version__, rather than by fiat.
>
> And it's certainly not a language feature.
>
> So Nathaniel's right -- the thing to do now is work out the convention,
> and then advocate for it.
>
> -CHB
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From guido at python.org  Thu Jun 28 17:41:24 2018
From: guido at python.org (Guido van Rossum)
Date: Thu, 28 Jun 2018 14:41:24 -0700
Subject: [Python-ideas] Add a __cite__ method for scientific packages
In-Reply-To:
References:
Message-ID:

One more thing. There's precedent for this: when you start an interactive
Python interpreter it tells you how to get help, but also how to get
copyright, credits and license information:

$ python3
Python 3.6.6 (v3.6.6:4cf1f54eb7, Jun 26 2018, 19:50:54)
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> credits
    Thanks to CWI, CNRI, BeOpen.com, Zope Corporation and a cast of thousands
    for supporting Python development.  See www.python.org for more
    information.
>>>

It makes total sense to add citations/references to this list (and those
should probably print a reference for Python followed by instructions on
how to get references for other packages and how to properly add a
reference to your own code).
-- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Thu Jun 28 18:19:44 2018 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 28 Jun 2018 15:19:44 -0700 Subject: [Python-ideas] Allow a group by operation for dict comprehension In-Reply-To: References: Message-ID: On Thu, Jun 28, 2018 at 3:17 PM, Chris Barker wrote: > There are also packages designed to make DB-style queries easier. > > Here's one I found with a quick google. > opps -- hit send too soon: http://178.62.194.22/ https://github.com/pythonql/pythonql -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Thu Jun 28 18:17:01 2018 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 28 Jun 2018 15:17:01 -0700 Subject: [Python-ideas] Allow a group by operation for dict comprehension In-Reply-To: References: Message-ID: On Thu, Jun 28, 2018 at 1:34 PM, David Mertz wrote: > I'd add one more option. You want something that behaves like SQL. Right > in the standard library is sqlite3, and you can create an in-memory DB to > hope the data you expect to group. > There are also packages designed to make DB-style queries easier. Here's one I found with a quick google. -CHB > On Thu, Jun 28, 2018, 3:48 PM Wes Turner wrote: > >> PyToolz, Pandas, Dask .groupby() >> >> toolz.itertoolz.groupby does this succinctly without any >> new/magical/surprising syntax. >> >> https://toolz.readthedocs.io/en/latest/api.html#toolz.itertoolz.groupby >> >> From https://github.com/pytoolz/toolz/blob/master/toolz/itertoolz.py : >> >> """ >> def groupby(key, seq): >> """ Group a collection by a key function >> >>> names = ['Alice', 'Bob', 'Charlie', 'Dan', 'Edith', 'Frank'] >> >>> groupby(len, names) # doctest: +SKIP >> {3: ['Bob', 'Dan'], 5: ['Alice', 'Edith', 'Frank'], 7: ['Charlie']} >> >>> iseven = lambda x: x % 2 == 0 >> >>> groupby(iseven, [1, 2, 3, 4, 5, 6, 7, 8]) # doctest: +SKIP >> {False: [1, 3, 5, 7], True: [2, 4, 6, 8]} >> Non-callable keys imply grouping on a member. >> >>> groupby('gender', [{'name': 'Alice', 'gender': 'F'}, >> ... {'name': 'Bob', 'gender': 'M'}, >> ... 
{'name': 'Charlie', 'gender': 'M'}]) # >> doctest:+SKIP >> {'F': [{'gender': 'F', 'name': 'Alice'}], >> 'M': [{'gender': 'M', 'name': 'Bob'}, >> {'gender': 'M', 'name': 'Charlie'}]} >> See Also: >> countby >> """ >> if not callable(key): >> key = getter(key) >> d = collections.defaultdict(lambda: [].append) >> for item in seq: >> d[key(item)](item) >> rv = {} >> for k, v in iteritems(d): >> rv[k] = v.__self__ >> return rv >> """ >> >> If you're willing to install Pandas (and NumPy, and ...), there's >> pandas.DataFrame.groupby: >> >> https://pandas.pydata.org/pandas-docs/stable/generated/ >> pandas.DataFrame.groupby.html >> >> https://github.com/pandas-dev/pandas/blob/v0.23.1/pandas/ >> core/generic.py#L6586-L6659 >> >> >> Dask has a different groupby implementation: >> https://gist.github.com/darribas/41940dfe7bf4f987eeaa# >> file-pandas_dask_test-ipynb >> >> https://dask.pydata.org/en/latest/dataframe-api.html# >> dask.dataframe.DataFrame.groupby >> >> >> On Thursday, June 28, 2018, Chris Barker via Python-ideas < >> python-ideas at python.org> wrote: >> >>> On Thu, Jun 28, 2018 at 8:25 AM, Nicolas Rolin >>> wrote: >>>> >>>> I use list and dict comprehension a lot, and a problem I often have is >>>> to do the equivalent of a group_by operation (to use sql terminology). >>>> >>> >>> I don't know from SQL, so "group by" doesn't mean anything to me, but >>> this: >>> >>> >>>> For example if I have a list of tuples (student, school) and I want to >>>> have the list of students by school the only option I'm left with is to >>>> write >>>> >>>> student_by_school = defaultdict(list) >>>> for student, school in student_school_list: >>>> student_by_school[school].append(student) >>>> >>> >>> seems to me that the issue here is that there is not way to have a >>> "defaultdict comprehension" >>> >>> I can't think of syntactically clean way to make that possible, though. >>> >>> Could itertools.groupby help here? It seems to work, but boy! it's ugly: >>> >>> In [*45*]: student_school_list >>> >>> Out[*45*]: >>> >>> [('Fred', 'SchoolA'), >>> >>> ('Bob', 'SchoolB'), >>> >>> ('Mary', 'SchoolA'), >>> >>> ('Jane', 'SchoolB'), >>> >>> ('Nancy', 'SchoolC')] >>> >>> >>> In [*46*]: {a:[t[0] *for* t *in* b] *for* a,b *in* groupby(sorted >>> (student_school_list, key=*lambda* t: t[1]), key=*lambda* t: t[ >>> >>> ...: 1])} >>> >>> ...: >>> >>> ...: >>> >>> ...: >>> >>> ...: >>> >>> ...: >>> >>> ...: >>> >>> ...: >>> >>> Out[*46*]: {'SchoolA': ['Fred', 'Mary'], 'SchoolB': ['Bob', 'Jane'], >>> 'SchoolC': ['Nancy']} >>> >>> >>> -CHB >>> >>> >>> -- >>> >>> Christopher Barker, Ph.D. >>> Oceanographer >>> >>> Emergency Response Division >>> NOAA/NOS/OR&R (206) 526-6959 voice >>> 7600 Sand Point Way NE >>> >>> (206) 526-6329 fax >>> Seattle, WA 98115 (206) 526-6317 main reception >>> >>> Chris.Barker at noaa.gov >>> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From greg.ewing at canterbury.ac.nz Thu Jun 28 19:23:52 2018 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 29 Jun 2018 11:23:52 +1200 Subject: [Python-ideas] Allow a group by operation for dict comprehension In-Reply-To: References: Message-ID: <5B356E08.5050906@canterbury.ac.nz> Nicolas Rolin wrote: > student_by_school = {group_by(school): student for school, student > in student_school_list} In the spirit of making the target expression look like a template for the generated elements, {school: [student...] for school, student in student_school_list} -- Greg From chris.barker at noaa.gov Thu Jun 28 19:33:26 2018 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 28 Jun 2018 16:33:26 -0700 Subject: [Python-ideas] Allow a group by operation for dict comprehension In-Reply-To: <5B356E08.5050906@canterbury.ac.nz> References: <5B356E08.5050906@canterbury.ac.nz> Message-ID: On Thu, Jun 28, 2018 at 4:23 PM, Greg Ewing wrote: > Nicolas Rolin wrote: > >> student_by_school = {group_by(school): student for school, student in >> student_school_list} >> > > In the spirit of making the target expression look like > a template for the generated elements, > > {school: [student...] for school, student in student_school_list} hmm -- this seems a bit non-general -- would this only work for a list? maybe you would want a set, or??? so could be get a defaultdict comprehension with something like: { school: (default_factory=list, student) for school, student in student_school_list } But I can't think of an reasonable syntax to make that work. -CHB > > -- > Greg > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From mike at selik.org Thu Jun 28 19:38:21 2018 From: mike at selik.org (Michael Selik) Date: Thu, 28 Jun 2018 16:38:21 -0700 Subject: [Python-ideas] Allow a group by operation for dict comprehension In-Reply-To: References: <5B356E08.5050906@canterbury.ac.nz> Message-ID: On Thu, Jun 28, 2018 at 4:34 PM Chris Barker via Python-ideas < python-ideas at python.org> wrote: > On Thu, Jun 28, 2018 at 4:23 PM, Greg Ewing > wrote: > >> Nicolas Rolin wrote: >> >>> student_by_school = {group_by(school): student for school, student >>> in student_school_list} >>> >> >> In the spirit of making the target expression look like >> a template for the generated elements, >> >> {school: [student...] for school, student in student_school_list} > > > hmm -- this seems a bit non-general -- would this only work for a list? > maybe you would want a set, or??? > > so could be get a defaultdict comprehension with something like: > > { school: (default_factory=list, student) for school, student in > student_school_list } > > But I can't think of an reasonable syntax to make that work. > Many languages with a group-by or grouping function choose to return a mapping of sequences, requiring any reduction, aggregation, or transformation of those sequences to be performed after the grouping. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From steve at pearwood.info Thu Jun 28 19:59:28 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 29 Jun 2018 09:59:28 +1000 Subject: [Python-ideas] Allow a group by operation for dict comprehension In-Reply-To: References: Message-ID: <20180628235927.GM14437@ando.pearwood.info> Can I make a plea for people to not post code with source highlighting as HTML please? It is rendered like this for some of us: On Thu, Jun 28, 2018 at 10:01:00AM -0700, Chris Barker via Python-ideas wrote: In [*46*]: {a:[t[0] *for* t *in* b] *for* a,b *in* groupby(sorted(student_school_list, key=*lambda* t: t[1]), key=*lambda* t: t[ ... (Aside from the iPython prompt, the rest ought to be legal Python but isn't because of the extra asterisks added.) And in the archives: https://mail.python.org/pipermail/python-ideas/2018-June/051723.html Gmail, I believe, has a "Paste As Plain Text" command in the right-click menu. Or possibly find a way to copy the text without formatting in the first case. Thanks, -- Steve From steve at pearwood.info Thu Jun 28 20:01:39 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 29 Jun 2018 10:01:39 +1000 Subject: [Python-ideas] Allow a group by operation for dict comprehension In-Reply-To: References: <0a12a07f-3801-08b4-f9ab-e4f94bad60ff@btinternet.com> Message-ID: <20180629000139.GN14437@ando.pearwood.info> On Thu, Jun 28, 2018 at 11:23:49AM -0700, Michael Selik wrote: > The fact that you didn't use ``setdefault`` here, opting for repeatedly > constructing new lists via concatenation, demonstrates the need for a > built-in or standard library tool that is easier to use. That would be setdefault :-) What it indicates to me is the need for people to learn to use setdefault, rather than new syntax :-) -- Steve From chris.barker at noaa.gov Thu Jun 28 20:03:57 2018 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 28 Jun 2018 17:03:57 -0700 Subject: [Python-ideas] Allow a group by operation for dict comprehension In-Reply-To: <20180628235927.GM14437@ando.pearwood.info> References: <20180628235927.GM14437@ando.pearwood.info> Message-ID: On Thu, Jun 28, 2018 at 4:59 PM, Steven D'Aprano wrote: > Can I make a plea for people to not post code with source highlighting > as HTML please? It is rendered like this for some of us: > > On Thu, Jun 28, 2018 at 10:01:00AM -0700, Chris Barker via Python-ideas > wrote: > > In [*46*]: {a:[t[0] *for* t *in* b] *for* a,b *in* > groupby(sorted(student_school_list, > key=*lambda* t: t[1]), key=*lambda* t: t[ > Oh god -- yeach!! -- sorry about that -- that was copy an pasted from iPython -- I was assuming it would strip out the formatting and give reasonable plain text -- but apparently not. I'll stop that. -CHB -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Thu Jun 28 20:11:30 2018 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 28 Jun 2018 17:11:30 -0700 Subject: [Python-ideas] Allow a group by operation for dict comprehension In-Reply-To: References: Message-ID: Hold the phone! 
On Thu, Jun 28, 2018 at 8:25 AM, Nicolas Rolin wrote: > student_by_school = defaultdict(list) > for student, school in student_school_list: > student_by_school[school].append(student) > > What I would expect would be a syntax with comprehension allowing me to > write something along the lines of: > > student_by_school = {group_by(school): student for school, student in > student_school_list} > OK -- I agreed that this could/should be easier, and pretty much like using setdefault, but did like the single expression thing, so went to "there should be a way to make a defaultdict comprehension" -- and played with itertools.groupby (which is really really awkward for this), but then light dawned on Marblehead: I've noticed (and taught) that dict comprehensions are kinda redundant with the dict() constructor, and _think_, in fact, that they were added before the current dict() constructor was added. so, if you think "dict constructor" rather than dict comprehensions, you realize that defaultdict takes the same arguments as the dict(), so the above is: defaultdict(list, student_by_school) which really couldn't be any cleaner and neater..... Here it is in action: In [97]: student_school_list Out[97]: [('Fred', 'SchoolA'), ('Bob', 'SchoolB'), ('Mary', 'SchoolA'), ('Jane', 'SchoolB'), ('Nancy', 'SchoolC')] In [98]: result = defaultdict(list, student_by_school) In [99]: result.items() Out[99]: dict_items([('SchoolA', ['Fred', 'Mary']), ('SchoolB', ['Bob', 'Jane']), ('SchoolC', ['Nancy'])]) So: never mind -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From mike at selik.org Thu Jun 28 20:22:03 2018 From: mike at selik.org (Michael Selik) Date: Thu, 28 Jun 2018 17:22:03 -0700 Subject: [Python-ideas] Allow a group by operation for dict comprehension In-Reply-To: References: Message-ID: On Thu, Jun 28, 2018 at 5:12 PM Chris Barker via Python-ideas < python-ideas at python.org> wrote: > In [97]: student_school_list > Out[97]: > [('Fred', 'SchoolA'), > ('Bob', 'SchoolB'), > ('Mary', 'SchoolA'), > ('Jane', 'SchoolB'), > ('Nancy', 'SchoolC')] > > In [98]: result = defaultdict(list, student_by_school) > > In [99]: result.items() > Out[99]: dict_items([('SchoolA', ['Fred', 'Mary']), ('SchoolB', ['Bob', > 'Jane']), ('SchoolC', ['Nancy'])]) > Wait, wha... In [1]: from collections import defaultdict In [2]: students = [('Fred', 'SchoolA'), ...: ('Bob', 'SchoolB'), ...: ('Mary', 'SchoolA'), ...: ('Jane', 'SchoolB'), ...: ('Nancy', 'SchoolC')] ...: In [3]: defaultdict(list, students) Out[3]: defaultdict(list, {'Fred': 'SchoolA', 'Bob': 'SchoolB', 'Mary': 'SchoolA', 'Jane': 'SchoolB', 'Nancy': 'SchoolC'}) In [4]: defaultdict(list, students).items() Out[4]: dict_items([('Fred', 'SchoolA'), ('Bob', 'SchoolB'), ('Mary', 'SchoolA'), ('Jane', 'SchoolB'), ('Nancy', 'SchoolC')]) I think you accidentally swapped variables there: student_school_list vs student_by_school -------------- next part -------------- An HTML attachment was scrubbed... URL: From mertz at gnosis.cx Thu Jun 28 20:24:07 2018 From: mertz at gnosis.cx (David Mertz) Date: Thu, 28 Jun 2018 20:24:07 -0400 Subject: [Python-ideas] Allow a group by operation for dict comprehension In-Reply-To: References: Message-ID: I think you cheated a little in your cut-and-paste. 
`student_by_school` is not defined in the code you've shown. What you
**did** define, `student_school_list`, doesn't give you what you want if
you use `defaultdict(list, student_school_list)`.

I thought for a moment I might just use:

    [(b, a) for a, b in student_school_list]

But that's wrong for reasons that are probably obvious to everyone else.
I'm not really sure what `student_by_school` could possibly be to make
this work as shown.

On Thu, Jun 28, 2018 at 8:13 PM Chris Barker via Python-ideas <
python-ideas at python.org> wrote:

> In [97]: student_school_list
> Out[97]:
> [('Fred', 'SchoolA'),
>  ('Bob', 'SchoolB'),
>  ('Mary', 'SchoolA'),
>  ('Jane', 'SchoolB'),
>  ('Nancy', 'SchoolC')]
>
> In [98]: result = defaultdict(list, student_by_school)
>
> In [99]: result.items()
> Out[99]: dict_items([('SchoolA', ['Fred', 'Mary']), ('SchoolB', ['Bob',
> 'Jane']), ('SchoolC', ['Nancy'])])
>
> So: never mind
>
> -CHB
>
> --
>
> Christopher Barker, Ph.D.
> Oceanographer
>
> Emergency Response Division
> NOAA/NOS/OR&R            (206) 526-6959   voice
> 7600 Sand Point Way NE   (206) 526-6329   fax
> Seattle, WA  98115       (206) 526-6317   main reception
>
> Chris.Barker at noaa.gov
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

--
Keeping medicines from the bloodstreams of the sick; food
from the bellies of the hungry; books from the hands of the
uneducated; technology from the underdeveloped; and putting
advocates of freedom in prisons.  Intellectual property is
to the 21st century what the slave trade was to the 16th.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From chris.barker at noaa.gov  Thu Jun 28 20:30:04 2018
From: chris.barker at noaa.gov (Chris Barker - NOAA Federal)
Date: Thu, 28 Jun 2018 17:30:04 -0700
Subject: [Python-ideas] Allow a group by operation for dict comprehension
In-Reply-To:
References:
Message-ID:

> I think you accidentally swapped variables there:
> student_school_list
> vs student_by_school

Oops, yeah. That's what I get for whipping out a message before
catching a bus. (And on a phone now.)

But maybe you could wrap the defaultdict constructor around a
generator expression that transforms the list first. That would get
the keys right, though it still wouldn't call append for you.

So maybe a solution is an accumulator special case of defaultdict --
one that uses a list by default and appends by default.

Almost like Counter...

-CHB

From chris.barker at noaa.gov  Thu Jun 28 20:37:47 2018
From: chris.barker at noaa.gov (Chris Barker - NOAA Federal)
Date: Thu, 28 Jun 2018 17:37:47 -0700
Subject: [Python-ideas] Allow a group by operation for dict comprehension
In-Reply-To:
References:
Message-ID:

> On Jun 28, 2018, at 5:30 PM, Chris Barker - NOAA Federal wrote:
>
> So maybe a solution is an accumulator special case of defaultdict --
> one that uses a list by default and appends by default.
>
> Almost like Counter...

Which, of course, is pretty much what your proposal is.

Which makes me think -- a new classmethod on the builtin dict is a
pretty heavy lift compared to a new type of dict in the collections
module.
-CHB From steve at pearwood.info Thu Jun 28 20:17:25 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 29 Jun 2018 10:17:25 +1000 Subject: [Python-ideas] Add a __cite__ method for scientific packages In-Reply-To: References: Message-ID: <20180629001724.GO14437@ando.pearwood.info> On Thu, Jun 28, 2018 at 05:25:00PM -0400, Andrei Kucharavy wrote: > As for the list, reserving a __citation__/__cite__ for packages at the same > level as __version__ is now reserved and adding a citation()/cite() > function to the standard library seemed large enough modifications to > warrant searching a buy-in from the maintainers and the community at large. I think that an approach similar to help/quit/exit is warranted. The cite()/citation() function need not be *literally* built into the language, it could be an external function written in Python and added to builtins by the site.py module. -- Steve From steve at pearwood.info Thu Jun 28 20:25:08 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 29 Jun 2018 10:25:08 +1000 Subject: [Python-ideas] Should nested classes in an Enum be Enum members? In-Reply-To: References: <5B33A33F.30207@stoneleaf.us> Message-ID: <20180629002507.GP14437@ando.pearwood.info> On Thu, Jun 28, 2018 at 06:57:45AM +0300, Serhiy Storchaka wrote: > Making a nested class a member you > don't lost anything, because you always can make it not-nested if you > don't want it be a member. You lose the ability to have Colors.RED.NestedClass() # returns something useful # similar to Colors.RED.method() for the (dubious?) advantage of having Colors.NestedClass treated as a colour enum. > But when a nested class is not a member, you > would lost the possibility of making it a member (and this may break > existing code). I must admit I'm still perplexed why I might want NestedClass to be an enum member. -- Steve From pylang3 at gmail.com Thu Jun 28 21:02:06 2018 From: pylang3 at gmail.com (pylang) Date: Thu, 28 Jun 2018 21:02:06 -0400 Subject: [Python-ideas] Allow a group by operation for dict comprehension In-Reply-To: References: Message-ID: There are a few tools that can accomplish these map-reduce/transformation tasks. See Options A, B, C below. # Given >>> import itertools as it >>> import collections as ct >>> import more_itertools as mit >>> student_school_list = [ ... ("Albert", "Prospectus"), ("Max", "Smallville"), ("Nikola", "Shockley"), ("Maire", "Excelsior"), ... ("Neils", "Smallville"), ("Ernest", "Tabbicage"), ("Michael", "Shockley"), ("Stephen", "Prospectus") ... ] >>> kfunc = lambda x: x[1] >>> vfunc = lambda x: x[0] >>> sorted_iterable = sorted(student_school_list, key=kfunc) # Example (see OP) >>> student_by_school = ct.defaultdict(list) >>> for student, school in student_school_list: ... 
student_by_school[school].append(student) >>> student_by_school defaultdict(list, {'Prospectus': ['Albert', 'Stephen'], 'Smallville': ['Max', 'Neils'], 'Shockley': ['Nikola', 'Michael'], 'Excelsior': ['Maire'], 'Tabbicage': ['Ernest']}) --- # Options # A: itertools.groupby >>> {k: [x[0] for x in v] for k, v in it.groupby(sorted_iterable, key=kfunc)} {'Excelsior': ['Maire'], 'Prospectus': ['Albert', 'Stephen'], 'Shockley': ['Nikola', 'Michael'], 'Smallville': ['Max', 'Neils'], 'Tabbicage': ['Ernest']} # B: more_itertools.groupby_transform >>> {k: list(v) for k, v in mit.groupby_transform(sorted_iterable, keyfunc=kfunc, valuefunc=vfunc)} {'Excelsior': ['Maire'], 'Prospectus': ['Albert', 'Stephen'], 'Shockley': ['Nikola', 'Michael'], 'Smallville': ['Max', 'Neils'], 'Tabbicage': ['Ernest']} # C: more_itertools.map_reduce >>> mit.map_reduce(student_school_list, keyfunc=kfunc, valuefunc=vfunc) defaultdict(None, {'Prospectus': ['Albert', 'Stephen'], 'Smallville': ['Max', 'Neils'], 'Shockley': ['Nikola', 'Michael'], 'Excelsior': ['Maire'], 'Tabbicage': ['Ernest']}) --- # Summary - Option A: standard library, sorted iterable, some manual value transformations (via list comprehension) - Option B: third-party tool, sorted iterable, accepts a value transformation function - Option C: third-party tool, any iterable, accepts transformation function(s) I have grown to like `itertools.groupby`, but I understand it can be odd at first. Perhaps something like the `map_reduce` tool (or approach) may help? It's simple, does not require a sorted iterable as in A and B, and you have control over how you want your keys, values and aggregated/reduced values to be (see docs for more details). # Documentation - Option A: https://docs.python.org/3/library/itertools.html#itertools.groupby - Option B: https://more-itertools.readthedocs.io/en/stable/api.html#more_itertools.groupby_transform - Option C: https://more-itertools.readthedocs.io/en/stable/api.html#more_itertools.map_reduce On Thu, Jun 28, 2018 at 8:37 PM, Chris Barker - NOAA Federal via Python-ideas wrote: > > On Jun 28, 2018, at 5:30 PM, Chris Barker - NOAA Federal < > chris.barker at noaa.gov> wrote: > > > > So maybe a solution is an accumulator special case of defaultdict ? it > uses a list be default and appends by default. > > > > Almost like counter... > > Which, of course, is pretty much what your proposal is. > > Which makes me think ? a new classmethod on the builtin dict is a > pretty heavy lift compared to a new type of dict in the collections > module. > > -CHB > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wes.turner at gmail.com Thu Jun 28 21:37:10 2018 From: wes.turner at gmail.com (Wes Turner) Date: Thu, 28 Jun 2018 21:37:10 -0400 Subject: [Python-ideas] Allow a group by operation for dict comprehension In-Reply-To: <20180628235927.GM14437@ando.pearwood.info> References: <20180628235927.GM14437@ando.pearwood.info> Message-ID: Ctrl-Shift-V pastes without HTML formatting. On Thursday, June 28, 2018, Steven D'Aprano wrote: > Can I make a plea for people to not post code with source highlighting > as HTML please? 
It is rendered like this for some of us: > > On Thu, Jun 28, 2018 at 10:01:00AM -0700, Chris Barker via Python-ideas > wrote: > > In [*46*]: {a:[t[0] *for* t *in* b] *for* a,b *in* > groupby(sorted(student_school_list, > key=*lambda* t: t[1]), key=*lambda* t: t[ > ... > > (Aside from the iPython prompt, the rest ought to be legal Python but > isn't because of the extra asterisks added.) > > And in the archives: > > https://mail.python.org/pipermail/python-ideas/2018-June/051723.html > > Gmail, I believe, has a "Paste As Plain Text" command in the > right-click menu. Or possibly find a way to copy the text without > formatting in the first case. > > > Thanks, > > > -- > Steve > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From nicolas.rolin at tiime.fr Thu Jun 28 21:45:50 2018 From: nicolas.rolin at tiime.fr (Nicolas Rolin) Date: Fri, 29 Jun 2018 03:45:50 +0200 Subject: [Python-ideas] Allow a group by operation for dict comprehension In-Reply-To: References: Message-ID: 2018-06-28 22:34 GMT+02:00 David Mertz : > I agree with these recommendations. There are excellent 3rd party tools > that do what you want. This is way too much to try to shoehorn into a > comprehension. > There are actually no 3rd party tools that can "do what I want", because if I wanted to have a function to do a group by, I would have taken the 5 minutes and 7 lines necessary to do so (or don't use a function and do my 3 liner). My main point is that comprehensions in python are very powerful and you can do pretty much any basic data manipulation that you want with it EXCEPT when you want to "split" a list in sublists, in which case you have either to use functions or a for loop. You can note that with list comprehension you can flatten an iterable (from sublists to a single list) with the [a for b in c for a in b] syntax, but doing the inverse operation is impossible. The questions I should have asked In my original post was : - Is splitting lists into sublists (by grouping elements) a high level enough construction to be worthy of a nice integration in the comprehension syntax ? - In which case, is there a way to find a simple syntax that is not too confusing ? My personal answer would be respectively "yes" and "maybe I don't know". I was hoping to have some views on the topic, and it seemed to have a bit sidetracked :) -- Nicolas Rolin -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Thu Jun 28 22:14:33 2018 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 28 Jun 2018 19:14:33 -0700 Subject: [Python-ideas] Add a __cite__ method for scientific packages In-Reply-To: References: Message-ID: On Thu, Jun 28, 2018 at 2:25 PM, Andrei Kucharavy wrote: >> This is indeed a serious problem. I suspect python-ideas isn't the >> best venue for addressing it though ? there's nothing here that needs >> changes to the Python interpreter itself (I think), and the people who >> understand this problem the best and who are most affected by it, >> mostly aren't here. > > There has been localized discussion popping up among the large scientific > package maintainers and some attempts to solve the problem at the local > level. 
Until now they seemed to be winding down due to a lack of a > large-scale citation mechanism and a discussion about what is concretely > doable at the scale of the language is likely to finalize Those are the people with the most motivation and expertise to solve this, and whose buy-in you'll need on any solution. If they haven't solved it yet themselves, then there are basically two reasons why that happens: either because they're busy and no-one's had enough time to work on it, or else because they're uncertain about the best path forward. Neither of these is a problem that python-ideas can help with. If you want to be effective here, you need to talk to them to figure out how you can help them move forward. If I were you, I'd try organizing a birds-of-a-feather at the next SciPy conference, or start getting in touch with others working on this (duecredit devs, the folks listed on that citationPEP thing, etc.), and go from there. (Feel free to CC me if you do start up some effort like this.) > As for the list, reserving a __citation__/__cite__ for packages at the same > level as __version__ is now reserved and adding a citation()/cite() function > to the standard library seemed large enough modifications to warrant > searching a buy-in from the maintainers and the community at large. There isn't actually any formal method for registering special names like __version__, and they aren't treated specially by the language. They're just variables that happen to have a funny name. You shouldn't start using them willy-nilly, but you don't actually have to ask permission or anything. And it's not very likely that someone else will come along and propose using the name __citation__ for something that *isn't* a citation :-). >> You'll want to check out the duecredit project: >> https://github.com/duecredit/duecredit >> One of the things they've thought about is the ability to track >> citation information at a more fine-grained way than per-package ? for >> example, there might be a paper that should be cited by anyone who >> calls a particular method (or even passes a specific argument to some >> specific method, when that turns on some fancy algorithm). > > > Due credit looks amazing - I will definitely check it out. The idea was, > however, to bring the barrier for adoption and usage as low as possible. In > my experience, the vast majority of Python users in academic environment who > aren't citing the packages properly are beginners. As such they are unlikely > to search for third-party libraries beyond those they've found and used to > solve their specific problem. > > who just assembled a pipeline based on widely-used libraries and would need > to generate a citation list for it to pass on to their colleagues > responsible for the paper assembly and submission. The way to do this is to first get your solution implemented as a third-party library and adopted by the scientific packages, and then start thinking about whether it would make sense to move the library into the standard library. It's relatively easy to move things into the standard library. The hard part is making sure that you implemented the right thing in the first place, and that's MUCH more likely if you start out as a third-party package. >> I'd actually like to see a more general solution that isn't restricted >> to any one language, because multi-language analysis pipelines are >> very common. 
For example, we could standardize a convention where if a >> certain environment variable is set, then the software writes out >> citation information to a certain location, and then implement >> libraries that do this in multiple languages. Of course, that's a >> "dynamic" solution that requires running the software -- which is >> probably necessary if you want to do fine-grained citations, but it >> might be useful to also have static metadata, e.g. as part of the >> package metadata that goes into sdists, wheels, and on PyPI. That >> would be a discussion for the distutils-sig mailing list, which >> manages that metadata. > > > Thanks for the reference to the distutils-sig list. I will talk to them if > the idea gets traction here I think you misunderstand how these lists work :-). (Which is fine -- it's actually pretty opaque and confusing if you don't already know!) Generally, distutils-sig operates totally independently from python-{ideas,dev} -- if you have a packaging proposal, it goes there and not here; if you have a language proposal, it goes here and not there. *If* what you want to do is add some static metadata to Python packages through setup.py, then python-ideas is irrelevant and distutils-sig is who you'll have to convince. (But they'll also want to see that your proposal has buy-in from established packages, because they don't understand the intricacies of software citation and will want people they trust to tell them whether the proposal makes sense.) > I am not entirely convinced by the multi-language pipelines. In > bioinformatics, often the heavy lifting is done by a single package (for > instance bowtie for RNA-seq alignment) and the output is piped to the custom > script, mostly in R or Python. The citations for the library doing the > heavy lifting are often well-known and widely cited, and the issues arise in > the custom scripts importing and using libraries that should be cited > without citing them. And often the custom scripts are a mix of R and Python, and maybe some Fortran, ... Plus, if it works for multiple languages, it means you get to share part of the work with other ecosystems, instead of everyone reinventing the wheel. Also, if you want to go down the dynamic route (which is the only way to get accurate fine-grained citations), then it's just as easy to solve the problem in a language-independent way. >> One challenge in standardizing this kind of thing is choosing a >> standard way to represent citation information. Maybe CSL-JSON? >> There's a lot of complexity as you dig into this, though of course one >> shouldn't let the perfect be the enemy of the good... > > > CSL-JSON represented as a dict to be supplied to the setup file is > definitely one way of doing it. I was, however, thinking more about the > BibTeX format, given that CSL-JSON is more closely affiliated with > Mendeley Huh, is it? I only know it from Zotero. -n -- Nathaniel J. Smith -- https://vorpus.org
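To make the environment-variable convention sketched in that message concrete: everything below is illustrative rather than an existing API - the variable name PYTHON_CITATION_LOG and the helper report_citation() are hypothetical, and a real proposal would still have to agree on the record format.

    import json
    import os

    def report_citation(record):
        # Hypothetical convention: if PYTHON_CITATION_LOG names a file,
        # append one JSON record per citation to it; otherwise do nothing.
        path = os.environ.get('PYTHON_CITATION_LOG')
        if path is not None:
            with open(path, 'a') as log:
                log.write(json.dumps(record) + '\n')

A library would call report_citation({'citation': '<BibTeX or CSL-JSON record here>'}) from the functions that implement a published algorithm, and equivalent helpers could be written for R or Fortran tooling against the same variable.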
From mike at selik.org Thu Jun 28 22:57:27 2018 From: mike at selik.org (Michael Selik) Date: Thu, 28 Jun 2018 19:57:27 -0700 Subject: [Python-ideas] Allow a group by operation for dict comprehension In-Reply-To: References: Message-ID: On Thu, Jun 28, 2018, 6:46 PM Nicolas Rolin wrote: > The questions I should have asked in my original post were: > - Is splitting lists into sublists (by grouping elements) a high-level > enough construction to be worthy of a nice integration in the comprehension > syntax? >
My intuition is no, it's not important enough to alter the syntax, despite being an important task. - In which case, is there a way to find a simple syntax that is not too > confusing? > If you'd like to give it a shot, try to find something which is currently invalid syntax, but does not break compatibility. The latter criterion means no new keywords. The syntax should look nice as a single line with reasonably verbose variable names. One issue is that Python code is mostly 1-dimensional, characters in a line, and you're trying to express something which is 2-dimensional, in a sense. There's only so much you can do without newlines and indentation. -------------- next part -------------- An HTML attachment was scrubbed... URL: From tritium-list at sdamon.com Thu Jun 28 23:57:07 2018 From: tritium-list at sdamon.com (Alex Walters) Date: Thu, 28 Jun 2018 23:57:07 -0400 Subject: [Python-ideas] Add a __cite__ method for scientific packages In-Reply-To: <20180629001724.GO14437@ando.pearwood.info> References: <20180629001724.GO14437@ando.pearwood.info> Message-ID: <067d01d40f5d$403ea710$c0bbf530$@sdamon.com> Why not scipy.cite() or scipy.citation()? I don't see any reason for these functions to ship with standard Python at all. > -----Original Message----- > From: Python-ideas list=sdamon.com at python.org> On Behalf Of Steven D'Aprano > Sent: Thursday, June 28, 2018 8:17 PM > To: python-ideas at python.org > Subject: Re: [Python-ideas] Add a __cite__ method for scientific packages > > On Thu, Jun 28, 2018 at 05:25:00PM -0400, Andrei Kucharavy wrote: > > > As for the list, reserving a __citation__/__cite__ for packages at the same > > level as __version__ is now reserved and adding a citation()/cite() > > function to the standard library seemed large enough modifications to > > warrant searching a buy-in from the maintainers and the community at > large. > > I think that an approach similar to help/quit/exit is warranted. The > cite()/citation() function need not be *literally* built into the > language, it could be an external function written in Python and added > to builtins by the site.py module. > > > > > -- > Steve > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From mertz at gnosis.cx Fri Jun 29 00:14:10 2018 From: mertz at gnosis.cx (David Mertz) Date: Fri, 29 Jun 2018 00:14:10 -0400 Subject: [Python-ideas] Fwd: Allow a group by operation for dict comprehension In-Reply-To: References: Message-ID: Mike Selik asked for my opinion on a draft PEP along these lines. I proposed a slight modification to his idea that is now reflected in his latest edits. With some details fleshed out, I think this is a promising idea. I like the collections class better, of course, but a dict classmethod is still a much smaller change than a new comprehension syntax. On Thu, Jun 28, 2018, 8:15 PM David Mertz wrote: > I see the utility, but I would prefer a slightly different approach than > you suggest; I think my suggestion will have a lower barrier to acceptance > as well. > > Rather than add a new classmethod dict.grouper(), I'd like to have a new > dict subclass collections.Grouper. The name is subject to bikeshedding, of > course. I think of this class as a "big sister" of collections.Counter, in > a way. > > There is behavior that I believe would be useful beyond constructing a new > base dictionary.
However, I think that construction from an iterable would > be a common use pattern. Oh, I'd also recommend following toolz.groupby() > in keeping a list rather than a set. It's easy enough to convert a list to > a set if wanted, but order and repetitions are preserved in SQL or Pandas > 'groupby' operations, and that seems more general. > > For example (this typed without testing, forgive any typos or thinkos): > > >>> from collections import Grouper # i.e. in Python 3.8+ > >>> grouped = Grouper(range(7), key=mod_2) > >>> grouped > Grouper({0: [0, 2, 4, 6], 1: [1, 3, 5]}) > >>> grouped.update([2, 10, 12, 13], key=mod_2) > >>> grouped > Grouper({0: [0, 2, 4, 6, 2, 10, 12], 1: [1, 3, 5, 13]}) > >>> # Updating with no key function groups by identity > >>> # ... is there a better idea for the default key function? > >>> grouped.update([0, 1, 2]) > >>> grouped > Grouper({0: [0, 2, 4, 6, 2, 10, 12, 0], 1: [1, 3, 5, 13, 1], 2: [2]}) > >>> # Maybe do a different style of update if passed a dict subclass > >>> # - Does a key function make sense here? > >>> grouped.update({0: 88, 1: 77}) > >>> grouped > Grouper({0: [0, 2, 4, 6, 2, 10, 12, 0, 88], > 1: [1, 3, 5, 13, 1, 77], > 2: [2]}) > >>> # Avoiding duplicates might sometimes be useful > >>> grouped.make_unique() # better name? .no_dup()? > >>> grouped > Grouper({0: [0, 2, 4, 6, 10, 12, 88], > 1: [1, 3, 5, 13, 77], > 2: [2]}) > > I think that most of the methods of Counter make sense to include here in > appropriately adjusted versions. Converting to a plain dictionary should > probably just be `dict(grouped)`, but it's possible we'd want > `grouped.as_dict()` or something. > > One thing that *might* be useful is a way to keep using the same key > function across updates. Even with no explicit provision, we *could* > spell it like this: > > >>> grouped.key_func = mod_2 > >>> grouped.update([55, 44, 22, 111], key=grouped.key_func) > > Perhaps some more official API for doing that would be useful though. > > > > > > On Thu, Jun 28, 2018 at 7:35 PM David Mertz wrote: > >> Thanks... Looking now. I'll comment soon. >> >> On Thu, Jun 28, 2018 at 7:05 PM Michael Selik wrote: >> >>> Hi David, >>> >>> We talked about this in Seattle about a year ago at a conference. Would >>> you do me a favor and critique this PEP I've drafted? I'd like to get >>> private feedback before sharing with the group. >>> >>> https://github.com/selik/peps/blob/master/pep-9999.rst >>> >>> Thank you, >>> -- Michael >>> >>> >>> On Thu, Jun 28, 2018 at 1:35 PM David Mertz wrote: >>> >>>> I agree with these recommendations. There are excellent 3rd party tools >>>> that do what you want. This is way too much to try to shoehorn into a >>>> comprehension. >>>> >>>> I'd add one more option. You want something that behaves like SQL. >>>> Right in the standard library is sqlite3, and you can create an in-memory >>>> DB to hope the data you expect to group. >>>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From adrianmpw at gmail.com Fri Jun 29 00:16:08 2018 From: adrianmpw at gmail.com (Adrian Price-Whelan) Date: Thu, 28 Jun 2018 23:16:08 -0500 Subject: [Python-ideas] Add a __cite__ method for scientific packages In-Reply-To: <067d01d40f5d$403ea710$c0bbf530$@sdamon.com> References: <20180629001724.GO14437@ando.pearwood.info> <067d01d40f5d$403ea710$c0bbf530$@sdamon.com> Message-ID: For me, it's about setting a standard that is endorsed by the language, and setting expectations for users. 
There currently is no standard, which is why packages use __citation__, __cite__, __bibtex__, etc., and as a user I don't immediately know where to look for citation information (without going to the source). My feeling is that adopting __citation__ or some dunder name could be implemented on classes, functions, etc. with less of a chance of naming conflicts, but I am open to discussion. I have some notes here about various ideas for more advanced functionality that would support automatically keeping track of citation information for imported packages, classes, functions: https://github.com/adrn/CitationPEP/blob/master/NOTES.md On Thu, Jun 28, 2018 at 10:57 PM, Alex Walters wrote: > Why not scipy.cite() or scipy.citation()? I don't see any reason for these > functions to ship with standard python at all. > >> -----Original Message----- >> From: Python-ideas >> list=sdamon.com at python.org> On Behalf Of Steven D'Aprano >> Sent: Thursday, June 28, 2018 8:17 PM >> To: python-ideas at python.org >> Subject: Re: [Python-ideas] Add a __cite__ method for scientific packages >> >> On Thu, Jun 28, 2018 at 05:25:00PM -0400, Andrei Kucharavy wrote: >> >> > As for the list, reserving a __citation__/__cite__ for packages at the > same >> > level as __version__ is now reserved and adding a citation()/cite() >> > function to the standard library seemed large enough modifications to >> > warrant searching a buy-in from the maintainers and the community at >> large. >> >> I think that an approach similar to help/quit/exit is warranted. The >> cite()/citation() function need not be *literally* built into the >> language, it could be an external function written in Python and added >> to builtins by the site.py module. >> >> >> >> >> -- >> Steve >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -- Adrian M. Price-Whelan Lyman Spitzer, Jr. Postdoctoral Fellow Princeton University http://adrn.github.io From tritium-list at sdamon.com Fri Jun 29 00:26:42 2018 From: tritium-list at sdamon.com (Alex Walters) Date: Fri, 29 Jun 2018 00:26:42 -0400 Subject: [Python-ideas] Add a __cite__ method for scientific packages In-Reply-To: References: <20180629001724.GO14437@ando.pearwood.info> <067d01d40f5d$403ea710$c0bbf530$@sdamon.com> Message-ID: <069001d40f61$62103260$26309720$@sdamon.com> But don't all the users who care about citing modules already use the scientific Python packages, with scipy itself at its center? Wouldn't those engaging in science or in academia be better stewards of this than systems programmers? Since you're not asking for anything that can't be done in a third party module, and there is a third party module that most of the target audience of this standard would already have, there is zero reason to take up four names in the Python runtime to serve those users.
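Whichever namespace it ends up in, the helper under discussion is small. A minimal sketch, assuming only the proposed __citation__ convention - the function itself is hypothetical and could live in scipy, a third-party package, or be injected by site.py:

    import sys

    def cite():
        # Print citation info for every currently-imported module that
        # declares the proposed __citation__ attribute.
        for name, module in sorted(sys.modules.items()):
            citation = getattr(module, '__citation__', None)
            if citation is not None:
                print('{}: {}'.format(name, citation))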
> -----Original Message----- > From: Adrian Price-Whelan > Sent: Friday, June 29, 2018 12:16 AM > To: Alex Walters > Cc: Steven D'Aprano ; python-ideas at python.org > Subject: Re: [Python-ideas] Add a __cite__ method for scientific packages > > For me, it's about setting a standard that is endorsed by the > language, and setting expectations for users. There currently is no > standard, which is why packages use __citation__, __cite__, > __bibtex__, etc., and as a user I don't immediately know where to look > for citation information (without going to the source). My feeling is > that adopting __citation__ or some dunder name could be implemented on > classes, functions, etc. with less of a chance of naming conflicts, > but am open to discussion. > > I have some notes here about various ideas for more advanced > functionality that would support automatically keeping track of > citation information for imported packages, classes, functions: > https://github.com/adrn/CitationPEP/blob/master/NOTES.md > > On Thu, Jun 28, 2018 at 10:57 PM, Alex Walters > wrote: > > Why not scipy.cite() or scipy.citation()? I don't see any reason for these > > functions to ship with standard python at all. > > > >> -----Original Message----- > >> From: Python-ideas >> list=sdamon.com at python.org> On Behalf Of Steven D'Aprano > >> Sent: Thursday, June 28, 2018 8:17 PM > >> To: python-ideas at python.org > >> Subject: Re: [Python-ideas] Add a __cite__ method for scientific packages > >> > >> On Thu, Jun 28, 2018 at 05:25:00PM -0400, Andrei Kucharavy wrote: > >> > >> > As for the list, reserving a __citation__/__cite__ for packages at the > > same > >> > level as __version__ is now reserved and adding a citation()/cite() > >> > function to the standard library seemed large enough modifications to > >> > warrant searching a buy-in from the maintainers and the community at > >> large. > >> > >> I think that an approach similar to help/quit/exit is warranted. The > >> cite()/citation() function need not be *literally* built into the > >> language, it could be an external function written in Python and added > >> to builtins by the site.py module. > >> > >> > >> > >> > >> -- > >> Steve > >> _______________________________________________ > >> Python-ideas mailing list > >> Python-ideas at python.org > >> https://mail.python.org/mailman/listinfo/python-ideas > >> Code of Conduct: http://python.org/psf/codeofconduct/ > > > > _______________________________________________ > > Python-ideas mailing list > > Python-ideas at python.org > > https://mail.python.org/mailman/listinfo/python-ideas > > Code of Conduct: http://python.org/psf/codeofconduct/ > > > > -- > Adrian M. Price-Whelan > Lyman Spitzer, Jr. Postdoctoral Fellow > Princeton University > http://adrn.github.io From greg.ewing at canterbury.ac.nz Fri Jun 29 01:39:51 2018 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 29 Jun 2018 17:39:51 +1200 Subject: [Python-ideas] Allow a group by operation for dict comprehension In-Reply-To: <20180628235927.GM14437@ando.pearwood.info> References: <20180628235927.GM14437@ando.pearwood.info> Message-ID: <5B35C627.3040307@canterbury.ac.nz> Steven D'Aprano wrote: > > On Thu, Jun 28, 2018 at 10:01:00AM -0700, Chris Barker via Python-ideas wrote: > > In [*46*]: {a:[t[0] *for* t *in* b] *for* a,b *in* > groupby(sorted(student_school_list, > key=*lambda* t: t[1]), key=*lambda* t: t[ > ... > > the rest ought to be legal Python but isn't We should *make* it legal Python code! 
Then there would be no difficulty with adding new keywords! -- Greg From j.van.dorp at deonet.nl Fri Jun 29 03:03:11 2018 From: j.van.dorp at deonet.nl (Jacco van Dorp) Date: Fri, 29 Jun 2018 09:03:11 +0200 Subject: [Python-ideas] list configuration In-Reply-To: References: Message-ID: I've had it to, bounces when attempting to reply or reply all, and it tried to send to some google groups version. 2018-06-28 19:31 GMT+02:00 Eric Fahlgren : > I've been getting those, too, but from the wxPython-dev group. I concur > that they look like googlegroups bounces (although I can't confirm that as > I've been deleting them without much inspection). > > On Thu, Jun 28, 2018 at 9:35 AM Chris Barker via Python-ideas > wrote: >> >> Hey all, >> >> I've been replying to messages lately, and getting a bounce back: >> >> """ >> Hello chris.barker at noaa.gov, >> >> We're writing to let you know that the group you tried to contact >> (python-ideas) may not exist, or you may not have permission to post >> messages to the group. A few more details on why you weren't able to post: >> """ >> >> And it's not quite clar to me if the message actually got through. >> >> IIUC, this is a Mailman list -- so it must be getting mirrored through >> google groups, and at least with some people's posts, the reply-to header is >> getting messed up. >> >> Anyone know what's going on? It would be nice to fix this... >> >> -CHB >> >> PS: I've seen a couple other notes about this -- I'm not the only one. >> >> >> >> >> >> >> >> -- >> >> Christopher Barker, Ph.D. >> Oceanographer >> >> Emergency Response Division >> NOAA/NOS/OR&R (206) 526-6959 voice >> 7600 Sand Point Way NE (206) 526-6329 fax >> Seattle, WA 98115 (206) 526-6317 main reception >> >> Chris.Barker at noaa.gov >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > From turnbull.stephen.fw at u.tsukuba.ac.jp Fri Jun 29 03:16:22 2018 From: turnbull.stephen.fw at u.tsukuba.ac.jp (Stephen J. Turnbull) Date: Fri, 29 Jun 2018 16:16:22 +0900 Subject: [Python-ideas] list configuration In-Reply-To: References: Message-ID: <23349.56518.550630.883368@turnbull.sk.tsukuba.ac.jp> I've cc'd python-ideas-owner, the folks who can actually do something about it. Chris Barker via Python-ideas writes: > I've been replying to messages lately, and getting a bounce back: [...] > And it's not quite clar to me if the message actually got through. In my experience it does, because when this happens to me it's due to my reply-all inserting both python.org and googlegroups addresses in the To and Cc fields. > IIUC, this is a Mailman list -- so it must be getting mirrored > through google groups, and at least with some people's posts, the > reply-to header is getting messed up. > > Anyone know what's going on? I'm not sure about the process in your case. What I've experienced is that there are a couple of people who use googlegroups to read, and when they post for some reason both the python.org and the googlegroups addresses end up as addressees. 
This might be Google trying to monopolize mailing lists by putting the googlegroups address in Reply-To, or it might be a poster error of using reply-all and not cleaning out the googlegroups address (which doesn't bother them because they're subscribed at googlegroups and googlegroups deduplicates). For historical reasons I use reply-all, so both addresses end up in the addressees, and it bounces from googlegroups if I don't clean it up. > It would be nice to fix this... My personal approach would be to blackhole posts containing googlegroups addresses in any header field, but that probably won't fly. ;-) I think it's probably not that hard to add some code to some handler in Mailman's pipeline to strip out googlegroups addresses from the list of addressees in outgoing posts, and if that makes googlegroups unreliable, so be it. It shouldn't, though, because the googlegroup is subscribed to the list at python.org. I don't see why this shouldn't be done globally for all lists at python.org, for all googlegroups addresses. AFAIK there are no non-list mailboxes at googlegroups that would want to receive mail there. Steve From nicolas.rolin at tiime.fr Fri Jun 29 05:04:20 2018 From: nicolas.rolin at tiime.fr (Nicolas Rolin) Date: Fri, 29 Jun 2018 11:04:20 +0200 Subject: [Python-ideas] Allow a group by operation for dict comprehension In-Reply-To: References: Message-ID: A syntax that would work (which atm is a syntax error, and requires no new keyword) would be student_by_school = {school: [student] for school, student in student_school_list, grouped=True} with grouped=True being a modifier on the dict comprehension so that at each loop iteration current_dict[key] = value if key not in current_dict else current_dict[key] + value This is an extremely borderline syntax (as it is perfectly legal to put **{'grouped': True} in a dict comprehension), but it works. It even keeps the extremely important "should look like a template of the final object" property. But it doesn't require me to define 2 lambda functions just to do the job of a comprehension. -- Nicolas Rolin 2018-06-29 4:57 GMT+02:00 Michael Selik : > On Thu, Jun 28, 2018, 6:46 PM Nicolas Rolin > wrote: > >> The questions I should have asked in my original post were: >> - Is splitting lists into sublists (by grouping elements) a high-level >> enough construction to be worthy of a nice integration in the comprehension >> syntax? >> > > My intuition is no, it's not important enough to alter the syntax, despite > being an important task. > > - In which case, is there a way to find a simple syntax that is not too >> confusing? >> > > If you'd like to give it a shot, try to find something which is currently > invalid syntax, but does not break compatibility. The latter criterion means > no new keywords. The syntax should look nice as a single line with > reasonably verbose variable names. > > One issue is that Python code is mostly 1-dimensional, characters in a > line, and you're trying to express something which is 2-dimensional, in a > sense. There's only so much you can do without newlines and indentation. > -- -- *Nicolas Rolin* | Data Scientist + 33 631992617 - nicolas.rolin at tiime.fr *15 rue Auber, **75009 Paris* *www.tiime.fr * -------------- next part -------------- An HTML attachment was scrubbed...
URL: From brett at python.org Fri Jun 29 09:12:32 2018 From: brett at python.org (Brett Cannon) Date: Fri, 29 Jun 2018 10:12:32 -0300 Subject: [Python-ideas] list configuration In-Reply-To: <23349.56518.550630.883368@turnbull.sk.tsukuba.ac.jp> References: <23349.56518.550630.883368@turnbull.sk.tsukuba.ac.jp> Message-ID: And I've taken owners off because I don't know how to solve this short of removing Google Groups somehow or getting off of email and switching to Zulip or Discourse. If someone has a solution that doesn't require dropping email then let me know. On Fri, Jun 29, 2018, 04:17 Stephen J. Turnbull, < turnbull.stephen.fw at u.tsukuba.ac.jp> wrote: > I've cc'd python-ideas-owner, the folks who can actually do something > about it. > > Chris Barker via Python-ideas writes: > > > I've been replying to messages lately, and getting a bounce back: > [...] > > And it's not quite clar to me if the message actually got through. > > In my experience it does, because when this happens to me it's due to > my reply-all inserting both python.org and googlegroups addresses in > the To and Cc fields. > > > IIUC, this is a Mailman list -- so it must be getting mirrored > > through google groups, and at least with some people's posts, the > > reply-to header is getting messed up. > > > > Anyone know what's going on? > > I'm not sure about the process in your case. What I've experienced is > that there are a couple of people who use googlegroups to read, and > when they post for some reason both the python.org and the > googlegroups addresses end up as addressees. This might be Google > trying to monopolize mailing lists by putting the googlegroups address > in Reply-To, or it might be a poster error of using reply-all and not > cleaning out the googlegroups address (which doesn't bother them > because they're subscribed at googlegroups and googlegroups > deduplicates). > > For historical reasons I use reply-all, so both addresses end up in > the addressees, and it bounces from googlegroups if I don't clean it > up. > > > It would be nice to fix this... > > My personal approach would be to blackhole posts containing > googlegroups addresses in any header field, but that probably won't > fly. ;-) > > I think it's probably not that hard to add some code to some handler > in Mailman's pipeline to strip out googlegroups addresses from the > list of addressees in outgoing posts, and if that makes googlegroups > unreliable, so be it. It shouldn't, though, because the googlegroup > is subscribed to the list at python.org. > > I don't see why this shouldn't be done globally for all lists at > python.org, for all googlegroups addresses. AFAIK there are no > non-list mailboxes at googlegroups that would want to receive mail > there. > > Steve > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From nathan12343 at gmail.com Fri Jun 29 10:50:38 2018 From: nathan12343 at gmail.com (Nathan Goldbaum) Date: Fri, 29 Jun 2018 09:50:38 -0500 Subject: [Python-ideas] Add a __cite__ method for scientific packages In-Reply-To: <069001d40f61$62103260$26309720$@sdamon.com> References: <20180629001724.GO14437@ando.pearwood.info> <067d01d40f5d$403ea710$c0bbf530$@sdamon.com> <069001d40f61$62103260$26309720$@sdamon.com> Message-ID: On Thu, Jun 28, 2018 at 11:26 PM, Alex Walters wrote: > But don't all the users who care about citing modules already use the > scientific python packages, with scipy itself at it's center? Wouldn't > those engaging in science or in academia be better stewards of this than > systems programmers? Since you're not asking for anything that can't be > done in a third party module, and there is a third party module that most > of the target audience of this standard would already have, there is zero > reason to take up four names in the python runtime to serve those users. > Not all scientific software in Python depends on scipy or even numpy. However, it does all depend on Python. Although perhaps that argues for a cross-language solution :) I still think it would be very nice to have an official standard for citation information in Python packages as codified in a PEP. That would reduce ambiguity and make it much easier for tool-writers who want to parse citation information. > -----Original Message----- > > From: Adrian Price-Whelan > > Sent: Friday, June 29, 2018 12:16 AM > > To: Alex Walters > > Cc: Steven D'Aprano ; python-ideas at python.org > > Subject: Re: [Python-ideas] Add a __cite__ method for scientific packages > > > > For me, it's about setting a standard that is endorsed by the > > language, and setting expectations for users. There currently is no > > standard, which is why packages use __citation__, __cite__, > > __bibtex__, etc., and as a user I don't immediately know where to look > > for citation information (without going to the source). My feeling is > > that adopting __citation__ or some dunder name could be implemented on > > classes, functions, etc. with less of a chance of naming conflicts, > > but am open to discussion. > > > > I have some notes here about various ideas for more advanced > > functionality that would support automatically keeping track of > > citation information for imported packages, classes, functions: > > https://github.com/adrn/CitationPEP/blob/master/NOTES.md > > > > On Thu, Jun 28, 2018 at 10:57 PM, Alex Walters > > wrote: > > > Why not scipy.cite() or scipy.citation()? I don't see any reason for > these > > > functions to ship with standard python at all. > > > > > >> -----Original Message----- > > >> From: Python-ideas > >> list=sdamon.com at python.org> On Behalf Of Steven D'Aprano > > >> Sent: Thursday, June 28, 2018 8:17 PM > > >> To: python-ideas at python.org > > >> Subject: Re: [Python-ideas] Add a __cite__ method for scientific > packages > > >> > > >> On Thu, Jun 28, 2018 at 05:25:00PM -0400, Andrei Kucharavy wrote: > > >> > > >> > As for the list, reserving a __citation__/__cite__ for packages at > the > > > same > > >> > level as __version__ is now reserved and adding a citation()/cite() > > >> > function to the standard library seemed large enough modifications > to > > >> > warrant searching a buy-in from the maintainers and the community at > > >> large. > > >> > > >> I think that an approach similar to help/quit/exit is warranted. 
The > > >> cite()/citation() function need not be *literally* built into the > > >> language, it could be an external function written in Python and added > > >> to builtins by the site.py module. > > >> > > >> > > >> > > >> > > >> -- > > >> Steve > > >> _______________________________________________ > > >> Python-ideas mailing list > > >> Python-ideas at python.org > > >> https://mail.python.org/mailman/listinfo/python-ideas > > >> Code of Conduct: http://python.org/psf/codeofconduct/ > > > > > > _______________________________________________ > > > Python-ideas mailing list > > > Python-ideas at python.org > > > https://mail.python.org/mailman/listinfo/python-ideas > > > Code of Conduct: http://python.org/psf/codeofconduct/ > > > > > > > > -- > > Adrian M. Price-Whelan > > Lyman Spitzer, Jr. Postdoctoral Fellow > > Princeton University > > http://adrn.github.io > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From python at mrabarnett.plus.com Fri Jun 29 13:01:20 2018 From: python at mrabarnett.plus.com (MRAB) Date: Fri, 29 Jun 2018 18:01:20 +0100 Subject: [Python-ideas] Fwd: Allow a group by operation for dict comprehension In-Reply-To: References: Message-ID: <98cb9bf4-0be9-a75c-d04e-a87364eb6ef0@mrabarnett.plus.com> On 2018-06-29 05:14, David Mertz wrote: > > Mike Selik asked for my opinion on a draft PEP along these lines. I > proposed a slight modification to his idea that is now reflected in his > latest edits. With some details fleshed out, I think this is a promising > idea. I like the a collections class better, of course, but a dict > classmethod is still a lot smaller change than new syntax change in > comprehension. > > On Thu, Jun 28, 2018, 8:15 PM David Mertz > wrote: > [snip] > For example (this typed without testing, forgive any typos or thinkos): > > >>> from collections import Grouper # i.e. in Python 3.8+ > >>> grouped = Grouper(range(7), key=mod_2) > >>> grouped > Grouper({0: [0, 2, 4, 6], 1: [1, 3, 5]}) > >>> grouped.update([2, 10, 12, 13], key=mod_2) > >>> grouped > Grouper({0: [0, 2, 4, 6, 2, 10, 12], 1: [1, 3, 5, 13]}) > >>> # Updating with no key function groups by identity > >>> # ... is there a better idea for the default key function? > >>> grouped.update([0, 1, 2]) > >>> grouped > Grouper({0: [0, 2, 4, 6, 2, 10, 12, 0], 1: [1, 3, 5, 13, 1], 2: [2]}) I think that if a Grouper instance is created with a key function, then that key function should be used by the .update method. You _could_ possibly override that key function by providing a new one when updating, but, OTOH, why would you want to? You'd be mixing different kinds of groupings! So -1 on that. > >>> # Maybe do a different style of update if passed a dict subclass > >>> # - Does a key function make sense here? > >>> grouped.update({0: 88, 1: 77}) > >>> grouped > Grouper({0: [0, 2, 4, 6, 2, 10, 12, 0, 88], > 1: [1, 3, 5, 13, 1, 77], > 2: [2]}) > >>> # Avoiding duplicates might sometimes be useful > >>> grouped.make_unique() # better name? .no_dup()? > >>> grouped > Grouper({0: [0, 2, 4, 6, 10, 12, 88], > 1: [1, 3, 5, 13, 77], > 2: [2]}) > If you want to avoid duplicates, maybe the grouper should be created with 'set' as the default factory (see 'defaultdict'). However, there's the problem that 'list' has .append but 'set' has .add... 
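One way around the .append/.add asymmetry is to look the insertion method up on the container type once, so list and set share one code path. A rough sketch - the function grouped() and its signature are illustrative, not the API proposed in the draft PEP:

    def grouped(iterable, key, container=list):
        # list exposes .append and set exposes .add; fetch whichever
        # unbound method the container type provides.
        insert = getattr(container, 'append', None) or container.add
        groups = {}
        for item in iterable:
            insert(groups.setdefault(key(item), container()), item)
        return groups

    >>> grouped(range(7), key=lambda n: n % 2, container=set)
    {0: {0, 2, 4, 6}, 1: {1, 3, 5}}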
> I think that most of the methods of Counter make sense to include > here in appropriately adjusted versions. Converting to a plain > dictionary should probably just be `dict(grouped)`, but it's > possible we'd want `grouped.as_dict()` or something. > > One thing that *might* be useful is a way to keep using the same key > function across updates. Even with no explicit provision, we *could* > spell it like this: > > >>> grouped.key_func = mod_2 > >>> grouped.update([55, 44, 22, 111], key=grouped.key_func) > > Perhaps some more official API for doing that would be useful though. > [snip] From mike at selik.org Fri Jun 29 13:53:34 2018 From: mike at selik.org (Michael Selik) Date: Fri, 29 Jun 2018 10:53:34 -0700 Subject: [Python-ideas] grouping / dict of lists In-Reply-To: References: Message-ID: Hello, I've drafted a PEP for an easier way to construct groups of elements from a sequence. https://github.com/selik/peps/blob/master/pep-9999.rst As a teacher, I've found that grouping is one of the most awkward tasks for beginners to learn in Python. While this proposal requires understanding a key-function, in my experience that's easier to teach than the nuances of setdefault or defaultdict. Defaultdict requires passing a factory function or class, similar to a key-function. Setdefault is awkwardly named and requires a discussion of references and mutability. Those topics are important and should be covered, but I'd like to let them sink in gradually. Grouping often comes up as a question on the first or second day, especially for folks transitioning from Excel. I've tested this proposal on actual students (no students were harmed during experimentation) and found that the majority appreciate it. Some are even able to guess what it does (would do) without any priming. Thanks for your time, -- Michael On Thu, Jun 28, 2018 at 8:38 AM Michael Selik wrote: > On Thu, Jun 28, 2018 at 8:25 AM Nicolas Rolin > wrote: > >> I use list and dict comprehension a lot, and a problem I often have is to >> do the equivalent of a group_by operation (to use sql terminology). >> >> For example if I have a list of tuples (student, school) and I want to >> have the list of students by school the only option I'm left with is to >> write >> >> student_by_school = defaultdict(list) >> for student, school in student_school_list: >> student_by_school[school].append(student) >> > > Thank you for bringing this up. I've been drafting a proposal for a better > grouping / group-by operation for a little while. I'm not quite ready to > share it, as I'm still researching use cases. > > I'm +1 that this task needs improvement, but -1 on this particular > solution. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mike at selik.org Fri Jun 29 14:00:59 2018 From: mike at selik.org (Michael Selik) Date: Fri, 29 Jun 2018 11:00:59 -0700 Subject: [Python-ideas] Fwd: Allow a group by operation for dict comprehension In-Reply-To: <98cb9bf4-0be9-a75c-d04e-a87364eb6ef0@mrabarnett.plus.com> References: <98cb9bf4-0be9-a75c-d04e-a87364eb6ef0@mrabarnett.plus.com> Message-ID: I created a separate thread to continue this discussion: "grouping / dict of lists" https://github.com/selik/peps/blob/master/pep-9999.rst In my proposal, the update offers a key-function in case the new elements don't follow the same pattern as the existing ones. I can understand the view that the class should retain the key-function from initialization. The issue of the group type -- list, set, Counter, etc. 
-- is handled by offering a Grouping.aggregate method. The Grouping class creates lists, which are passed to the aggregate function. I included examples of constructing sets and Counters. On Fri, Jun 29, 2018 at 10:04 AM MRAB wrote: > On 2018-06-29 05:14, David Mertz wrote: > > > > Mike Selik asked for my opinion on a draft PEP along these lines. I > > proposed a slight modification to his idea that is now reflected in his > > latest edits. With some details fleshed out, I think this is a promising > > idea. I like the a collections class better, of course, but a dict > > classmethod is still a lot smaller change than new syntax change in > > comprehension. > > > > On Thu, Jun 28, 2018, 8:15 PM David Mertz > > wrote: > > > [snip] > > For example (this typed without testing, forgive any typos or > thinkos): > > > > >>> from collections import Grouper # i.e. in Python 3.8+ > > >>> grouped = Grouper(range(7), key=mod_2) > > >>> grouped > > Grouper({0: [0, 2, 4, 6], 1: [1, 3, 5]}) > > >>> grouped.update([2, 10, 12, 13], key=mod_2) > > >>> grouped > > Grouper({0: [0, 2, 4, 6, 2, 10, 12], 1: [1, 3, 5, 13]}) > > >>> # Updating with no key function groups by identity > > >>> # ... is there a better idea for the default key function? > > >>> grouped.update([0, 1, 2]) > > >>> grouped > > Grouper({0: [0, 2, 4, 6, 2, 10, 12, 0], 1: [1, 3, 5, 13, 1], 2: > [2]}) > > I think that if a Grouper instance is created with a key function, then > that key function should be used by the .update method. > > You _could_ possibly override that key function by providing a new one > when updating, but, OTOH, why would you want to? You'd be mixing > different kinds of groupings! So -1 on that. > > > >>> # Maybe do a different style of update if passed a dict subclass > > >>> # - Does a key function make sense here? > > >>> grouped.update({0: 88, 1: 77}) > > >>> grouped > > Grouper({0: [0, 2, 4, 6, 2, 10, 12, 0, 88], > > 1: [1, 3, 5, 13, 1, 77], > > 2: [2]}) > > >>> # Avoiding duplicates might sometimes be useful > > >>> grouped.make_unique() # better name? .no_dup()? > > >>> grouped > > Grouper({0: [0, 2, 4, 6, 10, 12, 88], > > 1: [1, 3, 5, 13, 77], > > 2: [2]}) > > > If you want to avoid duplicates, maybe the grouper should be created > with 'set' as the default factory (see 'defaultdict'). However, there's > the problem that 'list' has .append but 'set' has .add... > > I think that most of the methods of Counter make sense to include > > here in appropriately adjusted versions. Converting to a plain > > dictionary should probably just be `dict(grouped)`, but it's > > possible we'd want `grouped.as_dict()` or something. > > > > One thing that *might* be useful is a way to keep using the same key > > function across updates. Even with no explicit provision, we *could* > > spell it like this: > > > > >>> grouped.key_func = mod_2 > > >>> grouped.update([55, 44, 22, 111], key=grouped.key_func) > > > > Perhaps some more official API for doing that would be useful though. > > > [snip] > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From guido at python.org Fri Jun 29 17:42:59 2018 From: guido at python.org (Guido van Rossum) Date: Fri, 29 Jun 2018 14:42:59 -0700 Subject: [Python-ideas] grouping / dict of lists In-Reply-To: References: Message-ID: On a quick skim I see nothing particularly objectionable or controversial in your PEP, except I'm unclear why it needs to be a class method on `dict`. Adding something to a builtin like this is rather heavy-handed. Is there a really good reason why it can't be a function in `itertools`? (I don't think that it's relevant that it doesn't return an iterator -- it takes in an iterator.) Also, your pure-Python implementation appears to be O(N log N) if key is None but O(N) otherwise; and the version for key is None uses an extra temporary array of size N. Is that intentional? Finally, the first example under "Group and Aggregate" is described as a dict of sets but it actually returns a dict of (sorted) lists. On Fri, Jun 29, 2018 at 10:54 AM Michael Selik wrote: > Hello, > > I've drafted a PEP for an easier way to construct groups of elements from > a sequence. https://github.com/selik/peps/blob/master/pep-9999.rst > > As a teacher, I've found that grouping is one of the most awkward tasks > for beginners to learn in Python. While this proposal requires > understanding a key-function, in my experience that's easier to teach than > the nuances of setdefault or defaultdict. Defaultdict requires passing a > factory function or class, similar to a key-function. Setdefault is > awkwardly named and requires a discussion of references and mutability. > Those topics are important and should be covered, but I'd like to let them > sink in gradually. Grouping often comes up as a question on the first or > second day, especially for folks transitioning from Excel. > > I've tested this proposal on actual students (no students were harmed > during experimentation) and found that the majority appreciate it. Some are > even able to guess what it does (would do) without any priming. > > Thanks for your time, > -- Michael > > > > > > > On Thu, Jun 28, 2018 at 8:38 AM Michael Selik wrote: > >> On Thu, Jun 28, 2018 at 8:25 AM Nicolas Rolin >> wrote: >> >>> I use list and dict comprehension a lot, and a problem I often have is >>> to do the equivalent of a group_by operation (to use sql terminology). >>> >>> For example if I have a list of tuples (student, school) and I want to >>> have the list of students by school the only option I'm left with is to >>> write >>> >>> student_by_school = defaultdict(list) >>> for student, school in student_school_list: >>> student_by_school[school].append(student) >>> >> >> Thank you for bringing this up. I've been drafting a proposal for a >> better grouping / group-by operation for a little while. I'm not quite >> ready to share it, as I'm still researching use cases. >> >> I'm +1 that this task needs improvement, but -1 on this particular >> solution. >> >> _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mike at selik.org Fri Jun 29 18:23:20 2018 From: mike at selik.org (Michael Selik) Date: Fri, 29 Jun 2018 15:23:20 -0700 Subject: [Python-ideas] grouping / dict of lists In-Reply-To: References: Message-ID: On Fri, Jun 29, 2018 at 2:43 PM Guido van Rossum wrote: > On a quick skim I see nothing particularly objectionable or controversial > in your PEP, except I'm unclear why it needs to be a class method on `dict`. > Since it constructs a basic dict, I thought it belongs best as a dict constructor like dict.fromkeys. It seemed to match other classmethods like datetime.now. > Adding something to a builtin like this is rather heavy-handed. > I included an alternate solution of a new class, collections.Grouping, which has some advantages. In addition to having less of that "heavy-handed" feel to it, the class can have a few utility methods that help handle more use cases. > Is there a really good reason why it can't be a function in `itertools`? > (I don't think that it's relevant that it doesn't return an iterator -- it > takes in an iterator.) > I considered placing it in the itertools module, but decided against because it doesn't return an iterator. I'm open to that if that's the consensus. > Also, your pure-Python implementation appears to be O(N log N) if key is > None but O(N) otherwise; and the version for key is None uses an extra > temporary array of size N. Is that intentional? > Unintentional. I've been drafting pieces of this over the last year and wasn't careful enough with proofreading. I'll fix that momentarily... > Finally, the first example under "Group and Aggregate" is described as a > dict of sets but it actually returns a dict of (sorted) lists. > Doctest complained at the set ordering, so I sorted for printing. You're not the only one to make that point, so I'll use sets for the example and ignore doctest. Thanks for reading! -- Michael PS. I just pushed an update to the GitHub repo, as per these comments. > On Fri, Jun 29, 2018 at 10:54 AM Michael Selik wrote: > >> Hello, >> >> I've drafted a PEP for an easier way to construct groups of elements from >> a sequence. https://github.com/selik/peps/blob/master/pep-9999.rst >> >> As a teacher, I've found that grouping is one of the most awkward tasks >> for beginners to learn in Python. While this proposal requires >> understanding a key-function, in my experience that's easier to teach than >> the nuances of setdefault or defaultdict. Defaultdict requires passing a >> factory function or class, similar to a key-function. Setdefault is >> awkwardly named and requires a discussion of references and mutability. >> Those topics are important and should be covered, but I'd like to let them >> sink in gradually. Grouping often comes up as a question on the first or >> second day, especially for folks transitioning from Excel. >> >> I've tested this proposal on actual students (no students were harmed >> during experimentation) and found that the majority appreciate it. Some are >> even able to guess what it does (would do) without any priming. >> >> Thanks for your time, >> -- Michael >> >> >> >> >> >> >> On Thu, Jun 28, 2018 at 8:38 AM Michael Selik wrote: >> >>> On Thu, Jun 28, 2018 at 8:25 AM Nicolas Rolin >>> wrote: >>> >>>> I use list and dict comprehension a lot, and a problem I often have is >>>> to do the equivalent of a group_by operation (to use sql terminology). 
>>>>
>>>> For example if I have a list of tuples (student, school) and I want to >>>> have the list of students by school the only option I'm left with is to >>>> write >>>> >>>> student_by_school = defaultdict(list) >>>> for student, school in student_school_list: >>>> student_by_school[school].append(student) >>>> >>> >>> Thank you for bringing this up. I've been drafting a proposal for a >>> better grouping / group-by operation for a little while. I'm not quite >>> ready to share it, as I'm still researching use cases. >>> >>> I'm +1 that this task needs improvement, but -1 on this particular >>> solution. >>> >>> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > > -- > --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From abedillon at gmail.com Fri Jun 29 18:19:12 2018 From: abedillon at gmail.com (Abe Dillon) Date: Fri, 29 Jun 2018 15:19:12 -0700 (PDT) Subject: [Python-ideas] collections.Counter should implement fromkeys In-Reply-To: References: <30549395-be27-458d-8ec4-4002019a1435@googlegroups.com> Message-ID: [Michael Selik] > You need an iterable of the keys you're interested in to pass to the > hypothetical ``fromkeys`` method. Why not iterate over that same iterable > paired with zeros instead of passing it into ``fromkeys``? > Because, as in my original example code, the values could be zero or they could be more; I just want to make sure the keys are in the counter when I iterate. I'm not having any trouble finding a workaround. I'm having trouble understanding why I need to find a workaround when Counter already inherits from dict, dict.fromkeys is perfectly well defined, and there's not really any other *obvious* best way to initialize the value for a set of keys. Counter(dict.fromkeys(keys, value)) works just fine, but it feels wrong. I'm using the copy-constructor because I know Counter is a subclass of dict. I'm using fromkeys because I know how that class method works. So why does the subclass lack functionality that the superclass has? Because programmers wouldn't be able to wrap their heads around it? I don't buy it. This feels like nanny-design trumping SOLID design. -------------- next part -------------- An HTML attachment was scrubbed... URL:
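For what it's worth, the workaround Abe describes does behave as he intends - the keys are present with a count of zero before any tallying happens:

    >>> from collections import Counter
    >>> keys = ['a', 'b', 'c']
    >>> tally = Counter(dict.fromkeys(keys, 0))
    >>> tally.update('aac')
    >>> [(k, tally[k]) for k in keys]
    [('a', 2), ('b', 0), ('c', 1)]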
From tim.peters at gmail.com Fri Jun 29 18:56:02 2018 From: tim.peters at gmail.com (Tim Peters) Date: Fri, 29 Jun 2018 17:56:02 -0500 Subject: [Python-ideas] Fwd: collections.Counter should implement fromkeys In-Reply-To: References: <30549395-be27-458d-8ec4-4002019a1435@googlegroups.com> Message-ID: Reposting because the original got bounced from Google Groups. ---------- Forwarded message --------- From: Tim Peters Date: Fri, Jun 29, 2018 at 5:54 PM Subject: Re: [Python-ideas] collections.Counter should implement fromkeys To: Cc: python-ideas [Abe Dillon ] > ... > I'm using the copy-constructor because I know Counter is a subclass of dict. > I'm using fromkeys because I know how that class method works. > So why does the subclass lack functionality that the superclass has? > Because programmers wouldn't be able to wrap their heads around it? > I don't buy it. This feels like nanny-design trumping SOLID design. More because Counter.fromkeys() could be incoherent. From the implementation (in your Lib/collections/__init__.py):

    @classmethod
    def fromkeys(cls, iterable, v=None):
        # There is no equivalent method for counters because setting v=1
        # means that no element can have a count greater than one.
        raise NotImplementedError(
            'Counter.fromkeys() is undefined. Use Counter(iterable) instead.')

For a dict, a value appearing multiple times in the iterable doesn't matter. But a fundamental use case for Counters is to tally the _number_ of times duplicate keys appear. So, e.g., someone will be unpleasantly surprised no matter what Counter.fromkeys("aaaaa", 2) returned. "It should set key 'a' to 2! that's what I said it should do!" "No! It should set key 'a' to 10! that's what a Counter _always_ does - sums the values associated with duplicate keys!" "You're both right - and wrong! It should raise an exception if there's a duplicate key, because there's no compelling answer to what it should do!" I expect Raymond called it NotImplementedError instead so he could release the code instead of waiting 3 years for that debate to end ;-) -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrei.kucharavy at gmail.com Fri Jun 29 20:13:06 2018 From: andrei.kucharavy at gmail.com (Andrei Kucharavy) Date: Fri, 29 Jun 2018 20:13:06 -0400 Subject: [Python-ideas] Add a __cite__ method for scientific packages In-Reply-To: References: <20180629001724.GO14437@ando.pearwood.info> <067d01d40f5d$403ea710$c0bbf530$@sdamon.com> <069001d40f61$62103260$26309720$@sdamon.com> Message-ID: > > One more thing. There's precedent for this: when you start an interactive > Python interpreter it tells you how to get help, but also how to get > copyright, credits and license information: > > $ python3 > Python 3.6.6 (v3.6.6:4cf1f54eb7, Jun 26 2018, 19:50:54) > [GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)] on darwin > Type "help", "copyright", "credits" or "license" for more information. > >>> credits > Thanks to CWI, CNRI, BeOpen.com, Zope Corporation and a cast of > thousands > for supporting Python development. See www.python.org for more > information. > >>> > > It makes total sense to add citations/references to this list (and those > should probably print a reference for Python followed by instructions on > how to get references for other packages and how to properly add a > reference to your own code). > > If that's possible, that would be great! > I think that an approach similar to help/quit/exit is warranted. The > cite()/citation() function need not be *literally* built into the > language, it could be an external function written in Python and added > to builtins by the site.py module. I was not aware this was a possibility - it does seem like a good option! If I were you, I'd try organizing a birds-of-a-feather at the next > SciPy conference, or start getting in touch with others working on > this (duecredit devs, the folks listed on that citationPEP thing, > etc.), and go from there. (Feel free to CC me if you do start up some > effort like this.) Not all packages are within the numpy/scipy universe - Pandas and Seaborn are notable examples. I brought this thread to the attention of some major scientific package maintainers as well as the main citationPEP author. I am not entirely sure where this conversation could be moved outside python-ideas given we are talking about something universal across packages, but would gladly take any suggestions.
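The site.py route quoted above can in fact be prototyped today without touching the interpreter: site.py imports a sitecustomize module at startup if one is present, and that hook can inject names into builtins the same way help, quit and exit are provided. A sketch, again assuming the proposed __citation__ convention (the cite() function itself is hypothetical):

    # sitecustomize.py -- imported automatically by site.py at startup
    import builtins

    def cite(module):
        # Look up the proposed (not yet standard) __citation__ attribute.
        citation = getattr(module, '__citation__', None)
        if citation is None:
            print('no citation information for', module.__name__)
        else:
            print(citation)

    builtins.cite = cite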
There isn't actually any formal method for registering special names > like __version__, and they aren't treated specially by the language. > They're just variables that happen to have a funny name. You shouldn't > start using them willy-nilly, but you don't actually have to ask > permission or anything. And it's not very likely that someone else > will come along and propose using the name __citation__ for something > that *isn't* a citation :-). Thanks for the explanation - Python development and maintenance do seem to be a complex process from the outside, and these kinds of subtleties are not always easy to distinguish :). The way to do this is to first get your solution implemented as a > third-party library and adopted by the scientific packages, and then > start thinking about whether it would make sense to move the library > into the standard library. It's relatively easy to move things into > the standard library. The hard part is making sure that you > implemented the right thing in the first place, and that's MUCH more > likely if you start out as a third-party package. Got it. I think you misunderstand how these lists work :-). (Which is fine -- > it's actually pretty opaque and confusing if you don't already know!) > Generally, distutils-sig operates totally independently from > python-{ideas,dev} -- if you have a packaging proposal, it goes there > and not here; if you have a language proposal, it goes here and not > there. *If* what you want to do is add some static metadata to Python > packages through setup.py, then python-ideas is irrelevant and > distutils-sig is who you'll have to convince. (But they'll also want > to see that your proposal has buy-in from established packages, > because they don't understand the intricacies of software citation and > will want people they trust to tell them whether the proposal makes > sense.) > Got it as well - that does indeed seem a reasonable way of doing things, although I believe there have been precedents where GvR implemented a feature from scratch after studying existing libraries (I am thinking notably about asyncio, which is orders of magnitude more complex and involved than anything we are talking about here). And often the custom scripts are a mix of R and Python, and maybe some > Fortran, ... Plus, if it works for multiple languages, it means you > get to share part of the work with other ecosystems, instead of > everyone reinventing the wheel. > > Also, if you want to go down the dynamic route (which is the only way > to get accurate fine-grained citations), then it's just as easy to > solve the problem in a language independent way. > In my experience, people tend to go with either one or the other, or use Julia. I am not very familiar with the Fortran ecosystem - as far as I've seen, those are extremely efficient libraries that get wrapped and used in most modern scientific computing languages, but very rarely directly. In addition to that, while I see how granular citations could be implemented in Python, I have a bit more trouble understanding how calls to R, Python, Perl, C, C++ or Fortran from command-line scripts can be analyzed on the fly to get metadata about citations. I have even more trouble imagining how it would be possible to bring developers across all the separate language communities to agree on a single standard. > CSL-JSON represented as a dict to be supplied to the setup file is > definitely one way of doing it.
> > I was, however, thinking more about the BibTeX format, given that
> > CSL-JSON is more closely affiliated with Mendeley
>
> Huh, is it? I only know it from Zotero.

Hm - was not aware Zotero uses it as well - it's definitely a good sign
and I will have to look into CSL-JSON more in depth.

> Why not scipy.cite() or scipy.citation()? I don't see any reason for
> these functions to ship with standard python at all.

There are packages that do not depend on scipy, and even for those that
do - most users writing analysis pipelines for scientific packages are
unaware that they are using scipy/numpy underneath the packages that do
what they want at the highest level.

> I don't think that this is a very useful idea, because most people that
> I've encountered that don't cite software think that it's not
> important, not because they don't know what the right citation is.
> The problem is social and not technological. I don't want to spend time
> on a technical solution to it.

Thanks for your opinion Gael - as maintainer of scikit-learn you have
more experience with this issue than most of us.

In my field (computational biology in molecular biology labs) the
situation is somewhat different - most of the custom scripts are
implemented by people who have often learned Python, or programming at
all, only in the last couple of years. Most of the time they get asked by
the corresponding author to provide 1-5 citations for their analytical
pipeline and to describe what they did in the supplementary material, and
I had several junior developers in my labs come forward to me asking what
they were supposed to cite and where to find the citations.

We aren't likely to convince everyone to cite code overnight, but making
citing as easy as possible does seem like a step in the right direction
to me.

> I still think it would be very nice to have an official standard for
> citation information in Python packages as codified in a PEP. That would
> reduce ambiguity and make it much easier for tool-writers who want to
> parse citation information.

That's my opinion as well.

To summarize the conversation until now, it seems that a __citation__
data field and a cite() script are the preferred option. If the proposal
gets traction and is accepted, the citation for Python as well as the
instructions to get the citation for a package can be added as a
top-level command, similar to credits, copyright or license.

As of now, it seems like the next steps would be to:

- draft a PEP (or complete the existing one) and implement the cite()
  script as well as a show-case package using __citation__
- talk to major package maintainers to see if they have any objections
  to the method or suggestions with regards to pep/implementation
- talk to the distutils-sig list to see if we could add the __citation__
  metadata to setup.py
- submit a proper PEP (Would a pull request to
  https://github.com/python/peps be an acceptable way of doing it?)

Is there something I might be missing so far?

Best,

Andrei Kucharavy

Post-Doc @ Joel S. Bader Lab

Johns Hopkins University, Baltimore, USA.

On Fri, Jun 29, 2018 at 10:51 AM Nathan Goldbaum wrote:

> On Thu, Jun 28, 2018 at 11:26 PM, Alex Walters wrote:
>
>> But don't all the users who care about citing modules already use the
>> scientific python packages, with scipy itself at its center? Wouldn't
>> those engaging in science or in academia be better stewards of this than
>> systems programmers?
Since you're not asking for anything that can't be >> done in a third party module, and there is a third party module that most >> of the target audience of this standard would already have, there is zero >> reason to take up four names in the python runtime to serve those users. >> > > > Not all scientific software in Python depends on scipy or even numpy. > However, it does all depend on Python. > > Although perhaps that argues for a cross-language solution :) > > I still think it would be very nice to have an official standard for > citation information in Python packages as codified in a PEP. That would > reduce ambiguity and make it much easier for tool-writers who want to parse > citation information. > > > -----Original Message----- >> > From: Adrian Price-Whelan >> > Sent: Friday, June 29, 2018 12:16 AM >> > To: Alex Walters >> > Cc: Steven D'Aprano ; python-ideas at python.org >> > Subject: Re: [Python-ideas] Add a __cite__ method for scientific >> packages >> > >> > For me, it's about setting a standard that is endorsed by the >> > language, and setting expectations for users. There currently is no >> > standard, which is why packages use __citation__, __cite__, >> > __bibtex__, etc., and as a user I don't immediately know where to look >> > for citation information (without going to the source). My feeling is >> > that adopting __citation__ or some dunder name could be implemented on >> > classes, functions, etc. with less of a chance of naming conflicts, >> > but am open to discussion. >> > >> > I have some notes here about various ideas for more advanced >> > functionality that would support automatically keeping track of >> > citation information for imported packages, classes, functions: >> > https://github.com/adrn/CitationPEP/blob/master/NOTES.md >> > >> > On Thu, Jun 28, 2018 at 10:57 PM, Alex Walters > > >> > wrote: >> > > Why not scipy.cite() or scipy.citation()? I don't see any reason for >> these >> > > functions to ship with standard python at all. >> > > >> > >> -----Original Message----- >> > >> From: Python-ideas > > >> list=sdamon.com at python.org> On Behalf Of Steven D'Aprano >> > >> Sent: Thursday, June 28, 2018 8:17 PM >> > >> To: python-ideas at python.org >> > >> Subject: Re: [Python-ideas] Add a __cite__ method for scientific >> packages >> > >> >> > >> On Thu, Jun 28, 2018 at 05:25:00PM -0400, Andrei Kucharavy wrote: >> > >> >> > >> > As for the list, reserving a __citation__/__cite__ for packages at >> the >> > > same >> > >> > level as __version__ is now reserved and adding a citation()/cite() >> > >> > function to the standard library seemed large enough modifications >> to >> > >> > warrant searching a buy-in from the maintainers and the community >> at >> > >> large. >> > >> >> > >> I think that an approach similar to help/quit/exit is warranted. The >> > >> cite()/citation() function need not be *literally* built into the >> > >> language, it could be an external function written in Python and >> added >> > >> to builtins by the site.py module. 
>> > >> >> > >> >> > >> >> > >> >> > >> -- >> > >> Steve >> > >> _______________________________________________ >> > >> Python-ideas mailing list >> > >> Python-ideas at python.org >> > >> https://mail.python.org/mailman/listinfo/python-ideas >> > >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > > >> > > _______________________________________________ >> > > Python-ideas mailing list >> > > Python-ideas at python.org >> > > https://mail.python.org/mailman/listinfo/python-ideas >> > > Code of Conduct: http://python.org/psf/codeofconduct/ >> > >> > >> > >> > -- >> > Adrian M. Price-Whelan >> > Lyman Spitzer, Jr. Postdoctoral Fellow >> > Princeton University >> > http://adrn.github.io >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From abedillon at gmail.com Fri Jun 29 20:32:54 2018 From: abedillon at gmail.com (Abe Dillon) Date: Fri, 29 Jun 2018 17:32:54 -0700 (PDT) Subject: [Python-ideas] Fwd: collections.Counter should implement fromkeys In-Reply-To: References: <30549395-be27-458d-8ec4-4002019a1435@googlegroups.com> Message-ID: [Tim Peters] > a fundamental use case for Counters is to tally the _number_ of times > duplicate keys appear. > Yes, that's why the default constructor already does just that. [Tim Peters] > So, e.g., someone will be unpleasantly surprised no matter what Sure, but in Hettinger's own words "whenever you have a constructor war, everyone should get their wish". People that want a counting constructor have that, people that want the ability to initialize values don't have that. [Tim Peters] > Counter.fromkeys("aaaaa", 2) > > returned. "It should set key 'a' to 2! that's what I said it should > do!" "No! It should set key 'a' to 10! that's what a Counter _always_ > does - sums the values associated with duplicate keys!" > I'm tempted to indulge in the meta argument which you're obviously striving to avoid, but I will say this: "that's what a Counter _always_ does" makes no sense. It's *almost* tantamount to saying that all constructors have to do exactly the same thing, which makes multiple constructors useless. Technically, there is no constructor for counting by X, but if enough people really wanted that, I suppose a third constructor would be in order. -------------- next part -------------- An HTML attachment was scrubbed... URL: From abedillon at gmail.com Fri Jun 29 20:55:44 2018 From: abedillon at gmail.com (Abe Dillon) Date: Fri, 29 Jun 2018 17:55:44 -0700 (PDT) Subject: [Python-ideas] collections.Counter should implement fromkeys In-Reply-To: References: <30549395-be27-458d-8ec4-4002019a1435@googlegroups.com> Message-ID: [Michael Selik] > You might be pursuing a local optimum of obviousness. If you step back > from the idea of "initialize the counts with all interesting keys" and > looping over them, you might find a better overall solution. > That's a distinct possibility, but this is far from the first time I've wanted for a better way to initialize a Counter. It's just a simple example for which there are many many alternative approaches. 
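To spell the workaround out (a sketch of the status quo; the last line
shows the spelling this thread asks for, commented out because
Counter.fromkeys deliberately raises NotImplementedError today):

    from collections import Counter

    keys = ['a', 'b', 'ab']

    # What works today: route through dict.fromkeys and the
    # Counter copy-constructor.
    zeroed = Counter(dict.fromkeys(keys, 0))
    print(zeroed)   # Counter({'a': 0, 'b': 0, 'ab': 0})

    # The proposed spelling (hypothetical, currently disabled):
    # zeroed = Counter.fromkeys(keys, 0)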
If you're interested, I have a more complete implementation of the game
of life here.

[Michael Selik]
> That happens more frequently than my OOP professor seemed to think. I end
> up with platypi (platypuses?) all too often.

Yes, I always try to caution my own students against strict adherence to
any ideology, methodology, or cult. "Always do this" and "Never do that"
rules are useful crutches for novices trying to navigate the complex
world of coding for the first time, but pros are the ones who are
experienced enough to know when it's OK to break the rules. When GoTo
isn't harmful. When to break the rule of thirds. Obviously, Python breaks
SOLID principles successfully all over the place for pragmatic reasons.
I don't think this is one of those cases.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From mertz at gnosis.cx  Fri Jun 29 20:57:32 2018
From: mertz at gnosis.cx (David Mertz)
Date: Fri, 29 Jun 2018 20:57:32 -0400
Subject: [Python-ideas] Add a __cite__ method for scientific packages
In-Reply-To:
References: <20180629001724.GO14437@ando.pearwood.info>
	<067d01d40f5d$403ea710$c0bbf530$@sdamon.com>
	<069001d40f61$62103260$26309720$@sdamon.com>
Message-ID:

On Fri, Jun 29, 2018, 8:14 PM Andrei Kucharavy wrote:

> Not all packages are within the numpy/scipy universe - Pandas and
> Seaborn are notable examples.

Huh?! Pandas is a thin wrapper around NumPy. To be fair, it is a wrapper
that adds a huge number of wrapping methods and classes. Seaborn in turn
has at least a soft dependency on Pandas (some of the charts really need
a DataFrame to work from).

I like the idea of standardizing citation information. But it has little
to do with Python itself. Getting the authors of scientific packages to
agree on conventions is what's needed, and doing that requires accurately
determining their needs, not some mandate from Python itself.

Nothing in the language needs to change to agree on some certain
collection of names (perhaps dunders, perhaps not), and some certain
formats for the data that might live inside them. Down the road, if there
gets to be widespread acceptance of these conventions, the Python
standard library might include a function or two to work with them. But
the horse should go before the cart.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From marcidy at gmail.com  Fri Jun 29 21:58:20 2018
From: marcidy at gmail.com (Matt Arcidy)
Date: Fri, 29 Jun 2018 18:58:20 -0700
Subject: [Python-ideas] Add a __cite__ method for scientific packages
In-Reply-To:
References: <20180629001724.GO14437@ando.pearwood.info>
	<067d01d40f5d$403ea710$c0bbf530$@sdamon.com>
	<069001d40f61$62103260$26309720$@sdamon.com>
Message-ID:

On Fri, Jun 29, 2018, 17:14 Andrei Kucharavy wrote:

> > One more thing. There's precedent for this: when you start an
> > interactive Python interpreter it tells you how to get help, but also
> > how to get copyright, credits and license information:
> >
> > $ python3
> > Python 3.6.6 (v3.6.6:4cf1f54eb7, Jun 26 2018, 19:50:54)
> > [GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)] on darwin
> > Type "help", "copyright", "credits" or "license" for more information.
> > >>> credits
> >     Thanks to CWI, CNRI, BeOpen.com, Zope Corporation and a cast of
> >     thousands for supporting Python development.  See www.python.org
> >     for more information.
> > >>>
> >
> > It makes total sense to add citations/references to this list (and
> > those should probably print a reference for Python followed by
> > instructions on how to get references for other packages and how to
> > properly add a reference to your own code).

This is thin justification to add something to core.
It seems like the very small percentage of academic users whose careers
depend on this cannot resolve the political issue of forming a standards
body. I don't see how externalizing the standard development will help.
Kudos for shortcutting the process in a practical way to just get it
done, but this just puts core devs in the middle of silly academic spats.

A language-endorsed citation method isn't a 'correct' method, and without
broad consensus - which currently doesn't exist - this becomes _your_
method, a picked winner but ultimately a lightning rod for bored tenured
professors with personal axes to grind. If this were about implementing
an existing correct method, I'm sure a grad student would be tasked with
it for an afternoon.

This is insanely easy to implement in docstrings, or a standard import,
or a mandatory include, or a decorator, or anywhere else: it's just a
parsing protocol. I believe 3.7 now exposes docstrings in the AST,
meaning a simple static analyzer can handle all of PyPI, giving you crazy
granularity if citations existed. Don't you want to cite the exact
algorithm used in an imported method, not just lump them all into one
call? Heck, I bet you could use type annotations.

This really feels like you've got an amazing multi-tool but you want to
turn the world, not the screw. This isn't a tool the majority of people
will use, even if the citations exist. Don't get me wrong, I love
designing standards and protocols, but this is pretty niche.

I assume it won't be mandatory so I'm tilting at windmills, but then if
it's not mandatory, what's the point of putting it in core? Just create a
JSTOR-style git server where obeying the citation protocol is mandatory.
Of course, enforcing a missing citation is impossible, but it does mean
citations can be generated by parsing imports. This is how it will evolve
over time, by employing core devs on that server framework.
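To make the docstring route concrete, a minimal static scanner along
these lines would work without importing the scanned code at all
(ast.get_docstring is long-standing stdlib API; the "Citation:" marker
is purely an assumed convention for this sketch, not a standard):

    import ast

    def extract_citations(path):
        """Yield (name, citation block) pairs from one Python source file."""
        with open(path, encoding='utf-8') as f:
            tree = ast.parse(f.read(), filename=path)
        for node in ast.walk(tree):
            if isinstance(node, (ast.Module, ast.ClassDef,
                                 ast.FunctionDef, ast.AsyncFunctionDef)):
                doc = ast.get_docstring(node)
                if doc and 'Citation:' in doc:
                    block = doc.split('Citation:', 1)[1].strip()
                    yield getattr(node, 'name', '<module>'), block

Pointed at a package checkout, something like this gives exactly the
per-function granularity described above.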
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From steve at pearwood.info  Fri Jun 29 22:37:33 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 30 Jun 2018 12:37:33 +1000
Subject: [Python-ideas] Fwd: collections.Counter should implement fromkeys
In-Reply-To:
References: <30549395-be27-458d-8ec4-4002019a1435@googlegroups.com>
Message-ID: <20180630023732.GR14437@ando.pearwood.info>

On Fri, Jun 29, 2018 at 05:32:54PM -0700, Abe Dillon wrote:

> Sure, but in Hettinger's own words "whenever you have a constructor
> war, everyone should get their wish". People that want a counting
> constructor have that, people that want the ability to initialize
> values don't have that.

*scratches head*

I can initialise a Counter just fine.

py> Counter({'a': 0, 'b': 0, 'ab': 2})
Counter({'ab': 2, 'a': 0, 'b': 0})

The supported API for setting initial values of a counter is to either
count the supplied keys:

    Counter(['a', 'b', 'ab'])

or supply initial counts in a dict:

    Counter({'a': 0, 'b': 0, 'ab': 2})

In the case where all the initial counts are zero, the obvious API is to
call the dict fromkeys method:

    Counter(dict.fromkeys(['a', 'b', 'ab'], 0))

So what you're really asking for is a convenience method to bypass the
need to create a temporary dict first:

    Counter.fromkeys(['a', 'b', 'ab'], 0)

Presumably the initial value will default to 0 rather than None, and
take any integer value.

I'm sympathetic to the idea of this as a convenience, but I don't think
it's an obvious feature to have. Tim's point about duplicate keys is
valid. Should it raise an exception, silently swallow duplicates, or
count them?

The dict constructors, both the standard dict() and dict.fromkeys(),
silently swallow duplicates. As they should. But Counter() does not,
and should not.

There's a discrepancy if Counter() doesn't and Counter.fromkeys() does,
and it requires a value judgement to decide whether that discrepancy is
sufficiently unimportant.

[...]
> Technically, there is no constructor for counting by X, but if enough
> people really wanted that, I suppose a third constructor would be in
> order.
How about a fourth constructor? A fifth? A fiftieth? How many
constructors is too many before the class becomes unwieldy?

Not every way you might count with a counter needs to be a constructor
method. You can always just count:

    c = Counter()
    for key in keys:
        c[key] += X

I think you make a *reasonable* case for Counter.fromkeys to silently
ignore duplicates, as a convenience method for

    Counter(dict.fromkeys(keys, 0))

but it's not (in my opinion) a *compelling* argument. I think it comes
down to the taste of the designer.

You can always subclass it. Or even monkey-patch it.

py> def fromkeys(cls, seq, value=0):
...     c = cls()
...     for key in seq:
...         c[key] = value
...     return c
...
py> from collections import Counter
py> Counter.fromkeys = classmethod(fromkeys)
py> Counter.fromkeys(['a', 'b', 'ab', 'a', 'b', 'c'])
Counter({'a': 0, 'ab': 0, 'b': 0, 'c': 0})

(Subclassing is safer :-)

--
Steve

From tim.peters at gmail.com  Fri Jun 29 22:51:38 2018
From: tim.peters at gmail.com (Tim Peters)
Date: Fri, 29 Jun 2018 21:51:38 -0500
Subject: [Python-ideas] Fwd: collections.Counter should implement fromkeys
In-Reply-To:
References: <30549395-be27-458d-8ec4-4002019a1435@googlegroups.com>
	<20180630023732.GR14437@ando.pearwood.info>
Message-ID:

[ Note to repliers to Abe (and others recently): replies to Google Groups
posts are broken for now on this list, so be sure to replace
python-ideas at googlegroups.com with python-ideas at python.org in your
reply. Else the mailing list (neither Google Groups nor the python.org
archive) won't get it. ]

[Tim]
>> So, e.g., someone will be unpleasantly surprised no matter what

[Abe Dillon]
> Sure, but in Hettinger's own words "whenever you have a constructor war,
> everyone should get their wish". People that want a counting constructor
> have that, people that want the ability to initialize values don't have
> that.

I think the missing bit here is that there weren't any "constructor wars"
for Counter. In all this time, I don't believe I've heard anyone say they
wanted a Counter.fromkeys() before. For that matter, I'd bet a dollar
that most Python programmers don't know that dict.fromkeys() exists,
despite that it was added in Python 2.3.

As I recall, the primary motivation for adding dict.fromkeys() was to
make using dicts to mimic sets a little easier, by providing a
constructor that threw away duplicates and didn't really care about the
values (so no value was required, and nobody cared that it defaulted to
the _seemingly_ insane `None` - using `None` values for
sets-implemented-as-dicts was a de facto informal standard at the time).
But one release later (2.4) a set type was added too, so the primary
motivation for fromkeys() went away.

15 years later you're jumping up & down about Counter.fromkeys() not
being there, and that's why nobody much cares ;-)

> ...
> I'm tempted to indulge in the meta argument which you're obviously
> striving to avoid,

And succeeding! I can't be sucked into it :-)

FWIW, fine by me if Counter.fromkeys() is added, doing exactly what you
want. Raymond may have a different judgment about that, though. I don't
believe he reads python-ideas anymore, so opening an enhancement request
on bugs.python.org is the way to get his attention.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
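To illustrate the set-mimicry described above (a quick interpreter
sketch; the ordering shown assumes a modern insertion-ordered dict):

    >>> dict.fromkeys(['a', 'b', 'a', 'c'])
    {'a': None, 'b': None, 'c': None}

Duplicates collapse and the None values never mattered for that use -
exactly the job the set type took over in 2.4.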
From raymond.hettinger at gmail.com  Fri Jun 29 23:45:48 2018
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Fri, 29 Jun 2018 20:45:48 -0700
Subject: [Python-ideas] Fwd: collections.Counter should implement fromkeys
In-Reply-To:
References: <30549395-be27-458d-8ec4-4002019a1435@googlegroups.com>
Message-ID: <02653E44-8002-4A03-91F5-AAFEAA17F79F@gmail.com>

On Jun 29, 2018, at 5:32 PM, Abe Dillon wrote:
>
> Sure, but in Hettinger's own words "whenever you have a constructor
> war, everyone should get their wish". People that want a counting
> constructor have that, people that want the ability to initialize
> values don't have that.

Sorry Abe, but you're twisting my words and pushing very hard for a
proposal that doesn't make sense and isn't necessary.

* Counts initialized to zero:  This isn't necessary. The whole point of
counters is that counts default to zero without pre-initialization.

* Counts initialized to one:  This is already done by the regular
constructor. Use "Counter(keys)" if the keys are known to be unique and
"Counter(set(keys))" to ignore duplicates.

    >>> Counter('abc')
    Counter({'a': 1, 'b': 1, 'c': 1})
    >>> Counter(set('abbacac'))
    Counter({'a': 1, 'b': 1, 'c': 1})

* Counts initialized to some other value:  That would be an unusual
thing to do but would be easy with the current API.

    >>> Counter(dict.fromkeys('abc', 21))
    Counter({'a': 21, 'b': 21, 'c': 21})

* Note, the reason that fromkeys() is disabled is that it has nonsensical
or surprising interpretations:

    >>> Counter.fromkeys('aaabbc', 2)
    # What should this do that doesn't surprise at least some users?

* That reason is already shown in the source code.

    @classmethod
    def fromkeys(cls, iterable, v=None):
        # There is no equivalent method for counters because setting v=1
        # means that no element can have a count greater than one.
        raise NotImplementedError(
            'Counter.fromkeys() is undefined. Use Counter(iterable) instead.')

> Obviously, Python breaks SOLID principles successfully all over the
> place for pragmatic reasons.
> I don't think this is one of those cases.

No amount of citing generic design principles will justify adding an API
that doesn't make sense. Besides, any possible use cases already have
reasonable solutions using the existing API. That is likely why no one
has ever requested this behavior before.

Based on what I've read in this thread, I see nothing that would change
the long-standing decision not to have a fromkeys() method for
collections.Counter. The original reasoning still holds.

Raymond

From abedillon at gmail.com  Fri Jun 29 23:50:09 2018
From: abedillon at gmail.com (Abe Dillon)
Date: Fri, 29 Jun 2018 22:50:09 -0500
Subject: [Python-ideas] Fwd: collections.Counter should implement fromkeys
In-Reply-To: <20180630023732.GR14437@ando.pearwood.info>
References: <30549395-be27-458d-8ec4-4002019a1435@googlegroups.com>
	<20180630023732.GR14437@ando.pearwood.info>
Message-ID:

[Steven D'Aprano]
> In the case where all the initial counts are zero, the obvious API is
> to call the dict fromkeys method:
> Counter(dict.fromkeys(['a', 'b', 'ab'], 0))

Yes, I've discussed this, but since my replies have been misaddressed,
it may have gotten lost. I'll quote it below:

[Abe Dillon]
> Counter(dict.fromkeys(keys, value)) works just fine, but it feels wrong.
> I'm using the copy-constructor because I know Counter is a subclass of
> dict.
> I'm using fromkeys because I know how that class method works.
> So why does the subclass lack functionality that the superclass has?
> Because programmers wouldn't be able to wrap their heads around it?
> I don't buy it. This feels like nanny-design trumping SOLID design.

[Steven D'Aprano]
> So what you're really asking for is a convenience method to bypass the
> need to create a temporary dict first

I'm not asking for anything all that new. Just that the existing
.fromkeys inherited from dict not be disabled.

[Steven D'Aprano]
> Presumably the initial value will default to 0 rather than None, and
> take any integer value.

Yes. I think that would make the most sense. 0 or 1. As long as it's
documented it doesn't matter to me.

[Steven D'Aprano]
> Tim's point about duplicate keys is valid. Should it raise an
> exception, silently swallow duplicates, or count them?

It should do exactly what dict.fromkeys does (except with a numeric
default): ignore duplicates.

[Steven D'Aprano]
> The dict constructors, both the standard dict() and dict.fromkeys(),
> silently swallow duplicates. As they should. But Counter() does not,
> and should not.

That's fine. I don't think that's confusing.

[Steven D'Aprano]
> How about a fourth constructor? A fifth? A fiftieth? How many
> constructors is too many before the class becomes unwieldy?

I think this is a little overboard on the slippery-slope, no? I'm asking
for a constructor that already exists, but was deliberately disabled. As
far as I can tell, the only people pointing out that others will
complain are playing devil's advocate. I can't tell if there are any
people that actually believe that Counter.fromkeys should have a
multiplier effect. I wouldn't expect the campaign for the third type of
constructor to get very far. Especially if Counter multiplication gets
accepted.

[Tim]
> I think the missing bit here is that there weren't any "constructor
> wars" for Counter... 15 years later you're jumping up & down about
> Counter.fromkeys() not being there, and that's why nobody much cares
> ;-)

I haven't been part of the conversation for 15 years, but most of the
argument against the idea (yours especially) seems to focus on the
prospect of a constructor war and to imply that was the original
motivation behind actively disabling the fromkeys method in Counters.

I don't mean to give the impression that I'm fanatical about this. It
really is a minor inconvenience. It doesn't irk me nearly as much as
other minor things, like the fact that all the functions in the heapq
package begin with the redundant word 'heap'.

[Tim]
> Raymond may have a different judgment about that, though. I don't
> believe he reads python-ideas anymore

He actually did reply a few comments back!

I think I'm having more fun chatting with people that I deeply respect
than "jumping up and down". I'm sorry if I'm coming off as an asshole.
We can kill this thread if everyone thinks I'm wasting their time. It
doesn't look like anyone else shares my minor annoyance. Thanks for
indulging me!
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From guido at python.org  Sat Jun 30 02:25:03 2018
From: guido at python.org (Guido van Rossum)
Date: Fri, 29 Jun 2018 23:25:03 -0700
Subject: [Python-ideas] grouping / dict of lists
In-Reply-To:
References:
Message-ID:

On Fri, Jun 29, 2018 at 3:23 PM Michael Selik wrote:

> On Fri, Jun 29, 2018 at 2:43 PM Guido van Rossum wrote:
>
>> On a quick skim I see nothing particularly objectionable or
>> controversial in your PEP, except I'm unclear why it needs to be a
>> class method on `dict`.
>
> Since it constructs a basic dict, I thought it belongs best as a dict
> constructor like dict.fromkeys. It seemed to match other classmethods
> like datetime.now.

It doesn't strike me as important enough. Surely not every stdlib
function that returns a fresh dict needs to be a class method on dict!
> Adding something to a builtin like this is rather heavy-handed. >> > > I included an alternate solution of a new class, collections.Grouping, > which has some advantages. In addition to having less of that > "heavy-handed" feel to it, the class can have a few utility methods that > help handle more use cases. > Hm, this actually feels heavier to me. But then again I never liked or understood the need for Counter -- I prefer basic data types and helper functions over custom abstractions. (Also your description doesn't do it justice, you describe a class using a verb phrase, "consume a sequence and construct a Mapping". The key to Grouping seems to me that it is a dict subclass with a custom constructor. But you don't explain why a subclass is needed, and in that sense I like the other approach better. But I still think it is much better off as a helper function in itertools. > Is there a really good reason why it can't be a function in `itertools`? >> (I don't think that it's relevant that it doesn't return an iterator -- it >> takes in an iterator.) >> > > I considered placing it in the itertools module, but decided against > because it doesn't return an iterator. I'm open to that if that's the > consensus. > You'll never get consensus on anything here, but you have my blessing for this without consensus. > Also, your pure-Python implementation appears to be O(N log N) if key is >> None but O(N) otherwise; and the version for key is None uses an extra >> temporary array of size N. Is that intentional? >> > > Unintentional. I've been drafting pieces of this over the last year and > wasn't careful enough with proofreading. I'll fix that momentarily... > Such are the dangers of premature optimization. :-) > Finally, the first example under "Group and Aggregate" is described as a >> dict of sets but it actually returns a dict of (sorted) lists. >> > > Doctest complained at the set ordering, so I sorted for printing. You're > not the only one to make that point, so I'll use sets for the example and > ignore doctest. > > Thanks for reading! > -- Michael > > PS. I just pushed an update to the GitHub repo, as per these comments. > Good luck with your PEP. If it is to go into itertools the biggest hurdle will be convincing Raymond, and I'm not going to overrule him on this: you and he are the educators here so hopefully you two can agree. --Guido > > >> On Fri, Jun 29, 2018 at 10:54 AM Michael Selik wrote: >> >>> Hello, >>> >>> I've drafted a PEP for an easier way to construct groups of elements >>> from a sequence. https://github.com/selik/peps/blob/master/pep-9999.rst >>> >>> As a teacher, I've found that grouping is one of the most awkward tasks >>> for beginners to learn in Python. While this proposal requires >>> understanding a key-function, in my experience that's easier to teach than >>> the nuances of setdefault or defaultdict. Defaultdict requires passing a >>> factory function or class, similar to a key-function. Setdefault is >>> awkwardly named and requires a discussion of references and mutability. >>> Those topics are important and should be covered, but I'd like to let them >>> sink in gradually. Grouping often comes up as a question on the first or >>> second day, especially for folks transitioning from Excel. >>> >>> I've tested this proposal on actual students (no students were harmed >>> during experimentation) and found that the majority appreciate it. Some are >>> even able to guess what it does (would do) without any priming. 
>>> >>> Thanks for your time, >>> -- Michael >>> >>> >>> >>> >>> >>> >>> On Thu, Jun 28, 2018 at 8:38 AM Michael Selik wrote: >>> >>>> On Thu, Jun 28, 2018 at 8:25 AM Nicolas Rolin >>>> wrote: >>>> >>>>> I use list and dict comprehension a lot, and a problem I often have is >>>>> to do the equivalent of a group_by operation (to use sql terminology). >>>>> >>>>> For example if I have a list of tuples (student, school) and I want to >>>>> have the list of students by school the only option I'm left with is to >>>>> write >>>>> >>>>> student_by_school = defaultdict(list) >>>>> for student, school in student_school_list: >>>>> student_by_school[school].append(student) >>>>> >>>> >>>> Thank you for bringing this up. I've been drafting a proposal for a >>>> better grouping / group-by operation for a little while. I'm not quite >>>> ready to share it, as I'm still researching use cases. >>>> >>>> I'm +1 that this task needs improvement, but -1 on this particular >>>> solution. >>>> >>>> _______________________________________________ >>> Python-ideas mailing list >>> Python-ideas at python.org >>> https://mail.python.org/mailman/listinfo/python-ideas >>> Code of Conduct: http://python.org/psf/codeofconduct/ >>> >> >> >> -- >> --Guido van Rossum (python.org/~guido) >> > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Sat Jun 30 03:44:34 2018 From: tim.peters at gmail.com (Tim Peters) Date: Sat, 30 Jun 2018 02:44:34 -0500 Subject: [Python-ideas] Fwd: collections.Counter should implement fromkeys In-Reply-To: References: <30549395-be27-458d-8ec4-4002019a1435@googlegroups.com> <20180630023732.GR14437@ando.pearwood.info> Message-ID: [Abe Dillon] > I haven't been part of the conversation for 15 years, but most of the argument > against the idea (yours especially) seem to focus on the prospect of a > constructor war and imply that was the original motivation behind actively > disabling the fromkeys method in Counters. I quoted the source code verbatim - its comment said fromkeys() didn't make sense for Counters. From which it's an easy inference that it makes more than one _kind_ of sense, hence "constructor wars". Not that it matters. Giving some of the history was more a matter of giving a plausible reason for why you weren't getting all that much feedback: it's quite possible that most readers of this list didn't even remember that `dict.fromkeys()` is a thing. > I don't mean to give the impression that I'm fanatical about this. It really > is a minor inconvenience. It doesn't irk me nearly as much as other minor > things, like that the fact that all the functions in the heapq package begin > with the redundant word 'heap'. You have to blame Guido for that one, which is even more futile than arguing with Raymond ;-) It never much bothered me, but I do recall doing this once: from heapq import heappush as push, heappop as pop # etc >> Raymond may have a different judgment about that, though. I don't believe >> he reads python-ideas anymore > He actually did reply a few comments back! Ya, I saw that! He's always trying to make me look bad ;-) > I think I'm having more fun chatting with people that I deeply respect > than "jumping up and down". I'm sorry if I'm coming off as an asshole. Not at all! I've enjoyed your messages. They have tended to more on the side of forceful advocacy than questioning, though, which may grate after a few more years. 
As to my "jumping up and down", I do a lot of leg-pulling. I'm old. It's not meant to offend, but I'm too old to care if it does :-) > We can kill this thread if everyone thinks I'm wasting their time. It doesn't > look like anyone else shares my minor annoyance. Thanks for indulging me! Raymond's reply didn't leave any hope for adding Counter.fromkeys(), so in the absence of a killer argument that hasn't yet been made, ya, it would be prudent to move on. Unless people want to keep talking about it, knowing that Raymond won't buy it in the end. Decisions, decisions ;-) -------------- next part -------------- An HTML attachment was scrubbed... URL: From storchaka at gmail.com Sat Jun 30 03:57:03 2018 From: storchaka at gmail.com (Serhiy Storchaka) Date: Sat, 30 Jun 2018 10:57:03 +0300 Subject: [Python-ideas] grouping / dict of lists In-Reply-To: References: Message-ID: 30.06.18 00:42, Guido van Rossum ????: > On a quick skim I see nothing particularly objectionable or > controversial in your PEP, except I'm unclear why it needs to be a class > method on `dict`. Adding something to a builtin like this is rather > heavy-handed. Is there a really good reason why it can't be a function > in `itertools`? (I don't think that it's relevant that it doesn't return > an iterator -- it takes in an iterator.) > > Also, your pure-Python implementation appears to be O(N log N) if key is > None but O(N) otherwise; and the version for key is None uses an extra > temporary array of size N. Is that intentional? And it adds a requirement to keys be orderable. I think there should be two functions with different requirements: for hashable and orderable keys. The latter should return a list of pairs or a sorted dict if they be supported by the stdlib. I'm not sure they fit well for the itertools module. Maybe the purposed algorithms module would be a better place. Or maybe just keep them as recipes in the documentation (they are just few lines). Concrete implementation can be simpler than the general implementation. From storchaka at gmail.com Sat Jun 30 04:15:31 2018 From: storchaka at gmail.com (Serhiy Storchaka) Date: Sat, 30 Jun 2018 11:15:31 +0300 Subject: [Python-ideas] Should nested classes in an Enum be Enum members? In-Reply-To: <20180629002507.GP14437@ando.pearwood.info> References: <5B33A33F.30207@stoneleaf.us> <20180629002507.GP14437@ando.pearwood.info> Message-ID: 29.06.18 03:25, Steven D'Aprano ????: > On Thu, Jun 28, 2018 at 06:57:45AM +0300, Serhiy Storchaka wrote: > >> Making a nested class a member you >> don't lost anything, because you always can make it not-nested if you >> don't want it be a member. > > You lose the ability to have > > Colors.RED.NestedClass() # returns something useful > # similar to Colors.RED.method() Since NestedClass is an enum member, you should use Colors.RED.NestedClass.value() or just Colors.NestedClass.value() for calling its value. If you don't want it be a member, just don't initialize it in the enum body. class Color(Enum): RED = 1 class NestedClass: pass Color.RED.NestedClass = NestedClass Now Color.RED.NestedClass() can return something useful. 
From ncoghlan at gmail.com Sat Jun 30 04:47:01 2018 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 30 Jun 2018 18:47:01 +1000 Subject: [Python-ideas] Add a __cite__ method for scientific packages In-Reply-To: References: Message-ID: On 29 June 2018 at 12:14, Nathaniel Smith wrote: > On Thu, Jun 28, 2018 at 2:25 PM, Andrei Kucharavy > wrote: >> As for the list, reserving a __citation__/__cite__ for packages at the same >> level as __version__ is now reserved and adding a citation()/cite() function >> to the standard library seemed large enough modifications to warrant >> searching a buy-in from the maintainers and the community at large. > > There isn't actually any formal method for registering special names > like __version__, and they aren't treated specially by the language. > They're just variables that happen to have a funny name. You shouldn't > start using them willy-nilly, but you don't actually have to ask > permission or anything. The one caveat on dunder names is that we expressly exempt them from our usual backwards compatibility guarantees, so it's worth getting some level of "No, we're not going to do anything that would conflict with your proposed convention" at the language design level. > And it's not very likely that someone else > will come along and propose using the name __citation__ for something > that *isn't* a citation :-). Aye, in this case I think you can comfortably assume that we'll happily leave the "__citation__" and "__cite__" dunder names alone unless/until there's a clear consensus in the scientific Python community to use them a particular way. And even then, it would likely be Python package installers like pip, Python environment managers like pipenv, and data analysis environment managers like conda that would handle the task of actually consuming that metadata (in whatever form it may appear). Having your citation management support depend on which version of Python you were using seems like it would be mostly a source of pain rather than beneficial. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Sat Jun 30 05:01:51 2018 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 30 Jun 2018 19:01:51 +1000 Subject: [Python-ideas] grouping / dict of lists In-Reply-To: References: Message-ID: On 30 June 2018 at 16:25, Guido van Rossum wrote: > On Fri, Jun 29, 2018 at 3:23 PM Michael Selik wrote: >> I included an alternate solution of a new class, collections.Grouping, >> which has some advantages. In addition to having less of that "heavy-handed" >> feel to it, the class can have a few utility methods that help handle more >> use cases. > > > Hm, this actually feels heavier to me. But then again I never liked or > understood the need for Counter -- I prefer basic data types and helper > functions over custom abstractions. (Also your description doesn't do it > justice, you describe a class using a verb phrase, "consume a sequence and > construct a Mapping". The key to Grouping seems to me that it is a dict > subclass with a custom constructor. But you don't explain why a subclass is > needed, and in that sense I like the other approach better. 
I'm not sure if the draft was updated since you looked at it, but it does mention that one benefit of the collections.Grouping approach is being able to add native support for mapping a callable across every individual item in the collection (ignoring the group structure), as well as for applying aggregate functions to reduce the groups to single values in a standard dict. Delegating those operations to the container API that way then means that other libraries can expose classes that implement the grouping API, but with a completely different backend storage model. > But I still think it is much better off as a helper function in itertools. I thought we actually had an open enhancement proposal for adding a "defaultdict.freeze" operation that switched it over to raising KeyError the same way a normal dict does, but I can't seem to find it now. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
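For reference, a minimal sketch of the helper-function flavour being
discussed. The grouping name, its signature, and the sorted variant for
orderable keys are illustrative guesses at the draft PEP's shape, not an
accepted API:

    from collections import defaultdict
    from itertools import groupby

    def grouping(iterable, key=None):
        """Group items into a plain dict of lists, keyed by key(item)."""
        groups = defaultdict(list)
        for item in iterable:
            groups[item if key is None else key(item)].append(item)
        # "Freeze" back to an ordinary dict, so missing keys raise KeyError.
        return dict(groups)

    def grouping_sorted(iterable, key=None):
        """Variant for orderable keys: a sorted list of (key, group) pairs."""
        items = sorted(iterable, key=key)
        return [(k, list(g)) for k, g in groupby(items, key=key)]

    pairs = [('alice', 'MIT'), ('bob', 'CMU'), ('carol', 'MIT')]
    by_school = grouping(pairs, key=lambda p: p[1])
    # {'MIT': [('alice', 'MIT'), ('carol', 'MIT')], 'CMU': [('bob', 'CMU')]}

The freeze step at the end is the copy that the "defaultdict.freeze"
idea above would make unnecessary.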