From eric at trueblade.com Fri Apr 1 01:46:36 2016 From: eric at trueblade.com (Eric V. Smith) Date: Fri, 1 Apr 2016 01:46:36 -0400 Subject: [Python-ideas] Make parenthesis optional in parameterless functions definitions In-Reply-To: <20160401004329.GW12526@ando.pearwood.info> References: <56FD8D6B.9040000@mail.de> <20160401004329.GW12526@ando.pearwood.info> Message-ID: <56FE0B3C.3080907@trueblade.com> On 3/31/2016 8:43 PM, Steven D'Aprano wrote: > This matter boils down to a question of taste. You apparently don't like > the look of "def spam()", I do. I think my experience supports the > current requirement, you think that it hurts readability, I don't. > Unless you can give some objective evidence that it hurts readability, > you aren't going to convince me. In addition to all of Steven's points, it's just way, way too late for this change. I don't see us ever changing Python to allow (or require!) you to omit the parens on a function definition with no parameters. For backward compatibility, you'd have to allow the parens. And then what's the point in having two ways to do the same thing? You'd just be creating confusion. Eric. From stefan at bytereef.org Fri Apr 1 04:01:56 2016 From: stefan at bytereef.org (Stefan Krah) Date: Fri, 1 Apr 2016 08:01:56 +0000 (UTC) Subject: [Python-ideas] Make parenthesis optional in parameterless functions definitions References: <56FD8D6B.9040000@mail.de> <20160401004329.GW12526@ando.pearwood.info> <56FDDD15.40901@mgmiller.net> Message-ID: Mike Miller writes: > The fact that the class definition doesn't require empty parentheses works > against a few of these arguments. I like Mahan's idea and don't find it > confusing, though I agree it isn't a big pain point either way. There is a difference. Classes can be used like C-structs: >>> class the_struct: ... x = 10 ... y = 20 ... 
>>> the_struct.x 10 >>> the_struct.y 20 >>> In order to use equational reasoning, functions need the empty argument list in the definition: >>> def f(): return 10 * 20 ... >>> f() == 10 * 20 True >>> f == 10 * 20 False Read: We define f, when applied to the empty argument list, to equal 10 * 20. f by itself does not equal 10 * 20. It is a function symbol. Stefan Krah From srkunze at mail.de Fri Apr 1 05:25:24 2016 From: srkunze at mail.de (Sven R. Kunze) Date: Fri, 1 Apr 2016 11:25:24 +0200 Subject: [Python-ideas] Make parenthesis optional in parameterless functions definitions In-Reply-To: <20160401004329.GW12526@ando.pearwood.info> References: <56FD8D6B.9040000@mail.de> <20160401004329.GW12526@ando.pearwood.info> Message-ID: <56FE3E84.7030303@mail.de> On 01.04.2016 02:43, Steven D'Aprano wrote: > (where the parameter-list might have one parameter, or ten, or zero). > With your proposal the reader will *nearly always* see the consistent > pattern: > > def name ( parameter-list ) : > > but very occasionally, maybe one time in a hundred functions, or a > thousand, see: > > def name : > > and be surprised. Even if it is just for a millisecond, that doesn't > help readability, it hurts it. I disagree. I remember this "issue" with decorators as well. @property def foo.... @lru_cache() # really? def bar... All decorators created with context_manager have this behavior. The proposal is about removing () for function definitions, so this differs. However, decorators have a very declarative style as do function definitions. So what do those special characters serve? IMHO It's just visual clutter. Best, Sven From srkunze at mail.de Fri Apr 1 05:32:00 2016 From: srkunze at mail.de (Sven R. 
Kunze) Date: Fri, 1 Apr 2016 11:32:00 +0200 Subject: [Python-ideas] Fwd: Make parenthesis optional in parameterless functions definitions In-Reply-To: <20160401002722.GV12526@ando.pearwood.info> References: <56FD8D6B.9040000@mail.de> <20160401002722.GV12526@ando.pearwood.info> Message-ID: <56FE4010.2080109@mail.de> On 01.04.2016 02:27, Steven D'Aprano wrote: > On Thu, Mar 31, 2016 at 10:49:47PM +0200, Sven R. Kunze wrote: >> On 31.03.2016 20:06, Terry Reedy wrote: >>>> def greet: # note the missing parenthesis >>>> print('hello') >>> -1 This will lead people to think even more often than they do now >>> that they can omit () in the call. >> Interesting that you mentioned it. Doesn't Ruby handle it this way? >> >> Let's see how this would look like in Python: >> >> >> def distance of point1, point2: >> # Pythagoras >> >> point1 = (3, 1) >> point2 = (1, 4) >> print distance of point1, point2 > I don't know that Ruby allows function calls like that. It doesn't work > in Ruby 1.8, which is the most recent version I have installed: > > steve at orac:~$ irb > irb(main):001:0> def foo(x) > irb(main):002:1> return x+1 > irb(main):003:1> end > => nil > irb(main):004:0> foo(7) > => 8 > irb(main):005:0> foo of 7 > NoMethodError: undefined method `of' for main:Object > from (irb):5 > from :0 > irb(main):006:0> The 'of' was just off the top of my head. You shouldn't take that too literally. http://www.howtogeek.com/howto/programming/ruby/ruby-function-method-syntax/ It works without the 'of'. > However, Hypertalk, and other similar "XTalk" languages, do. Function > calls in Hypertalk generally have a long form and a short form. The long > form will be something like: > > total = the sum of field "expenses" > > while the short form is the more familiar: > > total = sum(field "expenses") > > > Although Hypercard only allowed the long form if there was exactly one > argument. > > > >> Hmmm. Although I like the lightness of this, it's somewhat confusing, isn't? 
> In Hypertalk, it worked very well. But I wouldn't think it would be a > good fit to Python. Interesting. I think I will have a look at Hypertalk. :) Why do you think it would not fit into Python? > In a previous email, Sven also wrote: > >> I think the keystrokes are less important than the visual clutter >> one have with those parentheses. > I don't think they introduce "visual clutter" to the function. I think > they make it explicit and clear that the function has no arguments. > That's not clutter. > > When I was learning Python, I found it hard to remember when I needed parens > and when I didn't, because the inconsistency between class and def > confused me. I would write `class X()` or `def spam` and get a syntax > error (this was back in Python 1.5). My own experience tells me strongly > that the inconsistency between: > > class X: # must not use parens in Python 1.5 > > class Y(X): # must use parens > > and the inconsistency between class and def was harmful. If I could, I'd > make parens mandatory for both. So I think that making parens optional > for functions doesn't simplify the syntax so much as make the syntax > *more complicated* and therefore harder to learn. > > And I do not agree that the empty parens are "clutter" or make the > function definition harder to read. Interesting view. It never occurred to me that this might be a problem. But I started with a later Python version. So, it was just not a problem because both are possible now. Best, Sven From srkunze at mail.de Fri Apr 1 05:35:55 2016 From: srkunze at mail.de (Sven R. Kunze) Date: Fri, 1 Apr 2016 11:35:55 +0200 Subject: [Python-ideas] Make parenthesis optional in parameterless functions definitions In-Reply-To: <56FE0B3C.3080907@trueblade.com> References: <56FD8D6B.9040000@mail.de> <20160401004329.GW12526@ando.pearwood.info> <56FE0B3C.3080907@trueblade.com> Message-ID: <56FE40FB.2000202@mail.de> On 01.04.2016 07:46, Eric V.
Smith wrote: > And then what's the point in having two ways to do the same thing? You'd just be creating confusion. Class definitions allow this, function definitions don't? I'd call this confusing. Best, Sven From tjreedy at udel.edu Fri Apr 1 05:52:31 2016 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 1 Apr 2016 05:52:31 -0400 Subject: [Python-ideas] Make parenthesis optional in parameterless functions definitions In-Reply-To: <56FE3E84.7030303@mail.de> References: <56FD8D6B.9040000@mail.de> <20160401004329.GW12526@ando.pearwood.info> <56FE3E84.7030303@mail.de> Message-ID: On 4/1/2016 5:25 AM, Sven R. Kunze wrote: > I remember this "issue" with decorators as well. > > @property > def foo.... > > @lru_cache() # really? > def bar... There is a semantic difference between the two examples. The ()s are not optional. 'property' is a decorator that is applied to foo after it is defined. 'lru_cache()' is a function call that returns a decorator. Quite different. -- Terry Jan Reedy From srkunze at mail.de Fri Apr 1 06:10:52 2016 From: srkunze at mail.de (Sven R. Kunze) Date: Fri, 1 Apr 2016 12:10:52 +0200 Subject: [Python-ideas] Make parenthesis optional in parameterless functions definitions In-Reply-To: References: <56FD8D6B.9040000@mail.de> <20160401004329.GW12526@ando.pearwood.info> <56FE3E84.7030303@mail.de> Message-ID: <56FE492C.90302@mail.de> On 01.04.2016 11:52, Terry Reedy wrote: > On 4/1/2016 5:25 AM, Sven R. Kunze wrote: > >> I remember this "issue" with decorators as well. >> >> @property >> def foo.... >> >> @lru_cache() # really? >> def bar... > > There is a semantic difference between the two examples. The ()s are > not optional. 'property' is a decorator that is applied to foo after > it is defined. 'lru_cache()' is a function call that returns a > decorator. Quite different. @Terry Thanks for snipping away the relevant explanation and not addressing my point. >.< @others Never said they were optional. 
Decorating and function definitions have a declarative style in common. The point was that these () don't serve anything in case of declarations. Best, Sven -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Fri Apr 1 07:05:53 2016 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 1 Apr 2016 22:05:53 +1100 Subject: [Python-ideas] Fwd: Make parenthesis optional in parameterless functions definitions In-Reply-To: <56FE4010.2080109@mail.de> References: <56FD8D6B.9040000@mail.de> <20160401002722.GV12526@ando.pearwood.info> <56FE4010.2080109@mail.de> Message-ID: <20160401110552.GX12526@ando.pearwood.info> On Fri, Apr 01, 2016 at 11:32:00AM +0200, Sven R. Kunze wrote: > >However, Hypertalk, and other similar "XTalk" languages, do. Function > >calls in Hypertalk generally have a long form and a short form. The long > >form will be something like: > > > > total = the sum of field "expenses" > > > >while the short form is the more familiar: > > > > total = sum(field "expenses") > > > > > >Although Hypercard only allowed the long form if there was exactly one > >argument. [...] > >In Hypertalk, it worked very well. But I wouldn't think it would be a > >good fit to Python. > > Interesting. I think I will have a look at Hypertalk. :) Unfortunately, Hypertalk is long dead. Apple never quite understood why it was popular, or what to do with it. But it influenced the design of the WWW and Javascript, and it lives on in a couple of languages such as OpenXion and LiveCode: https://github.com/kreativekorp/openxion https://livecode.com/download/ (LiveCode has a booming user community, OpenXion is all but dead, but it works and lets you experiment with the language.) If you have an old Classic Mac capable of running System 6 through 9 (pre OS X), or an emulator for the same, then you might be able to run Hypercard, which was a sort of combined software development kit, Rolodex application, and IDE. 
Hypercard was the GUI to the Hypertalk language, and Hypertalk was the scripting language that controlled the Hypercard GUI. https://en.wikipedia.org/wiki/HyperCard > Why do you think it would not fit into Python? Hypertalk's execution model, data model and syntax are all very different from Python's. Hypertalk was also linked very heavily to the GUI, which makes it a relatively weak fit with less specialised languages like Python. But mostly, Python already has a standard syntax for calling functions: value = function(arg) There's no need to add a more verbose "the function of arg" syntax. -- Steve From contrebasse at gmail.com Fri Apr 1 07:42:08 2016 From: contrebasse at gmail.com (Joseph Martinot-Lagarde) Date: Fri, 1 Apr 2016 11:42:08 +0000 (UTC) Subject: [Python-ideas] Working with Path objects: p-strings? References: <56F5B3F7.40502@gmail.com> <56FA33C0.4000902@mail.de> <56FA4E31.3010908@gmail.com> <56FAECD8.3030605@mail.de> <56FB922A.6090607@mail.de> <-6058951228755932973@unknownmsgid> Message-ID: Paul Moore writes: > > People want paths to be a strings so that they will work with all the code > > that already works with strings. > > Correct. That's the prime motivation. But you then say > > > But whatever happened to duck typing? Paths don't need to BE strings. > > Rather, everything that needs a path needs to accept anything that > > acts like a path. > > But "all the code that already works with strings" doesn't do that. If > we're allowed to change that code then a simple > > patharg = getattr(patharg, 'path', patharg) > > is sufficient to work with path objects or strings. > I think you have it backwards here: the problem is not updating code that you can change, it's using any external library that uses paths as strings. When I use a library from pypi, I won't add a new line in every function using a string representing a path. When if I use path.py, I can directly pass paths to any library without changing its code. 
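[The `getattr(patharg, 'path', patharg)` idiom quoted in the message above can be made concrete. A minimal sketch, assuming a path.py-style object with a `.path` attribute; `FakePath` and `name_of` are illustrative names, not part of any real library:]

```python
import os.path

class FakePath:
    """Stand-in for a path.py-style object exposing a .path attribute."""
    def __init__(self, path):
        self.path = path

def name_of(patharg):
    # Accept either a plain string or a path-like object: unwrap the
    # .path attribute if present, otherwise use the argument as-is.
    patharg = getattr(patharg, 'path', patharg)
    return os.path.basename(patharg)

print(name_of('/tmp/data.csv'))              # data.csv
print(name_of(FakePath('/tmp/data.csv')))    # data.csv
```

[Python 3.6 later standardized exactly this unwrapping as the `__fspath__` protocol and `os.fspath()` (PEP 519), so stdlib functions now do it themselves.]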
From eric at trueblade.com Fri Apr 1 07:48:01 2016 From: eric at trueblade.com (Eric V. Smith) Date: Fri, 1 Apr 2016 07:48:01 -0400 Subject: [Python-ideas] Make parenthesis optional in parameterless functions definitions In-Reply-To: <56FE40FB.2000202@mail.de> References: <56FD8D6B.9040000@mail.de> <20160401004329.GW12526@ando.pearwood.info> <56FE0B3C.3080907@trueblade.com> <56FE40FB.2000202@mail.de> Message-ID: <56FE5FF1.1030706@trueblade.com> On 4/1/2016 5:35 AM, Sven R. Kunze wrote: > On 01.04.2016 07:46, Eric V. Smith wrote: >> And then what's the point in having two ways to do the same thing? >> You'd just be creating confusion. > > Class definitions allow this, function definitions don't? I'd call this > confusing. Even if there was agreement on that, it remains a bad idea to add more confusion. Eric. From desmoulinmichel at gmail.com Fri Apr 1 07:49:03 2016 From: desmoulinmichel at gmail.com (Michel Desmoulin) Date: Fri, 1 Apr 2016 13:49:03 +0200 Subject: [Python-ideas] Fwd: Make parenthesis optional in parameterless functions definitions In-Reply-To: References: <20160331175741.GU12526@ando.pearwood.info> Message-ID: <56FE602F.90002@gmail.com> Le 01/04/2016 04:37, Joao S. O. Bueno a écrit : > On 31 March 2016 at 14:57, Steven D'Aprano wrote: >> On Thu, Mar 31, 2016 at 09:29:36PM +0500, Mahan Marwat wrote: >> >>> I have an idea of making parenthesis optional for functions having no >>> parameters. i.e >>> >>> def greet: # note the missing parenthesis >>> print('hello') >> >> -1 >> >> I don't think that the benefit (two fewer characters to type) is worth >> the effort of learning the special case. Right now, the rule is simple: >> the def keyword ALWAYS needs parentheses after the name of the function, >> regardless of whether there is one argument, two arguments, twenty >> arguments, or zero arguments. Why treat zero as special? > > Because class definitions already do so? 
Yes, and because after more than a decade of Python, I still forget to type out the parenthesis some time, then go back and realize that it's silly that I have to since I don't with classes. That, and: - it's a common student error; - really it wouldn't hurt anyone. Is there really a strong case against it than just "it's not pure" ? I've seen of lot of this argument on the list lately and I find it counter productive. There are dozen of good way to oppose an idea, just saying "we got a moral stand to not do it" is not convincing. Espacially in a language with so many compromised like len(foo) instead of foo.len, functional paradigme and Poo and immutability and mutability, etc. Python has an history of making things to get out of the way: - no {} for indentation; - optional parentheses for tuples; - optional parenthesis for classes; If this changes does not hurt readability, ability to debug and doesn't make your code/program any worst than it was but does't help even a little, why not ? > > So, if it is possible to omit parentheses when inheriting from the > default object when declaring a class, not needing parenthesis for > default parameterless functions would not be an exception - it would > be generalizing the idea of "Python admits less clutter". > > > For that, I'd think of this a good idea - but I don't like changing > the idea syntax in such a fundamental way - so I am +0 on this thing. > > I think this can be an interesting discussion - but I dislike people > taking a ride on this to suggest omitting parentheses on function > calls as well - that is totally broken. :-) +1. This is for another thread (that hope will die). 
> > js > -><- > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > From p.f.moore at gmail.com Fri Apr 1 08:37:14 2016 From: p.f.moore at gmail.com (Paul Moore) Date: Fri, 1 Apr 2016 13:37:14 +0100 Subject: [Python-ideas] Fwd: Make parenthesis optional in parameterless functions definitions In-Reply-To: <56FE602F.90002@gmail.com> References: <20160331175741.GU12526@ando.pearwood.info> <56FE602F.90002@gmail.com> Message-ID: On 1 April 2016 at 12:49, Michel Desmoulin wrote: > Is there really a strong case against it than just "it's not pure" ? > I've seen of lot of this argument on the list lately and I find it > counter productive. > > There are dozen of good way to oppose an idea, just saying "we got a > moral stand to not do it" is not convincing. Espacially in a language > with so many compromised like len(foo) instead of foo.len, functional > paradigme and Poo and immutability and mutability, etc. You seem to have missed the argument that's been made a couple of times, which is that you would then have two ways of doing the same thing (empty parentheses would need to still be allowed for backward compatibility) and Python has a strong tradition of "there should only be one way of doing things". (Yes, it's not always followed 100%, as with any real life situation not everything is perfect). You may prefer to allow people to choose their own style and have options. But as a maintainer, I can confirm that I personally prefer code that I have to maintain to have a consistent style, and Python's lack of multiple ways to say the same thing is a benefit in ensuring that happens. So for me, your proposal would result in extra work, and no benefit (I would continue adding "()" as I find that style more readable and consistent). This is not a "moral stand". 
The "one way to do things" principle is a highly practical decision based on experience of different programming environments - Perl in particular allowed many ways of doing the same thing ("there's more than one way to do it" was a catch phrase in the Perl community) and it's an acknowledged fact that Perl code is hard to maintain as a consequence (unless strict style guidelines are imposed). As someone proposing a change to the Python language, the onus is on you to argue the benefits of your change, not on others to argue against it. If no-one does anything, Python won't change so you have to convince people. At the moment your argument is little more than "it looks neater", and opinions on that are clearly divided. As there's a *huge* amount of material that would need changing as a result of the change (documentation, training courses, style guides, IDEs, ...) you need a much better argument - and complaining that the people pointing out that your argument isn't strong enough are being "counter productive" is not helpful. If we're discussing "I've seen a lot of it on this list lately", then I would argue that there has been an awful lot of ideas proposed here recently which don't take into account the significant and genuine costs involved in *any* change to Python, and as a result don't even try to offer a justification for the proposal that is in proportion to the impact of the change. It's perfectly possible to propose a change here, have it supported, and get it implemented - but it needs *work*, and a lot of the discussions here are pointless because no-one is willing to do any work (not even the amount of work needed to convince the core developers that their proposal is worthwhile). It's very easy to accuse people of not being willing to listen to your proposals - but I challenge you to try to get a change like this included into C++ or Java, if you think that about Python! 
Paul From ncoghlan at gmail.com Fri Apr 1 09:07:07 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 1 Apr 2016 23:07:07 +1000 Subject: [Python-ideas] Package reputation system In-Reply-To: <4DD01140-B429-4FD0-8164-C965210A2282@selik.org> References: <56F5B3F7.40502@gmail.com> <56FA33C0.4000902@mail.de> <56FA4E31.3010908@gmail.com> <56FA5F7E.3070001@gmail.com> <56FA656B.5070707@mail.de> <56FAB7C4.6060904@gmail.com> <4DD01140-B429-4FD0-8164-C965210A2282@selik.org> Message-ID: On 30 March 2016 at 06:51, Michael Selik wrote: > In days of yore, before package managers, people used to download source code and read it. Maybe not all of it, but enough to feel not terribly scared when running that code. In modern times, with centralized package repositories and convenient installer tools, we want a better way to know "what's the right package to use for this task?" It's a "first-world problem" in a sense. There are too many products in my supermarket aisle! On a personal note, I have on occasion spent twenty minutes choosing a toothpaste at Target. > > If I care enough, I'll take a moment to look at how many downloads have been counted recently, how many issues there are (usually on GitHub), how many contributors, etc. I'll read the docs. I might even poke around in the source. I'll also check Google rankings to see if people are chatting about the module and linking to it. > > I'm not sure if there's a good centralized solution to this problem, but it's a question many people are asking: How do I know which non-stdlib module to use? > > Back at Georgia Tech, my professor [0] once told me that the way to get rich is to invent an index. He was referring to Richard Florida's "Creative Class" book and the subsequent "Creativity Index" consulting that Florida provided to various municipalities. People who score high on the index pay you to speak. People who score low on the index pay you to consult. 
> > There are a few companies who sell a Python package reputation service, along with some distribution tools. Continuum's Anaconda, Enthought's Canopy, and ActiveState's ActivePython come to mind. There's clearly value in helping people answer this question. And so do Linux distributions. There's a reason there are so many of the latter: "good" choices depend a great deal on what you're doing, which means there's no point in trying to come up with the "one true reputation system" for software components. djangopackages.com does a relatively good job for the Django ecosystem, but the simple fact of using Django puts you far enough down the path towards solving a particular kind of problem (web service design with a rich user permission system backed by a relational database) that a community driven rating system can work. By contrast, I think Python itself covers too many domains for a common rating system to be feasible - "good for education" is not the same as "good for sysadmin tasks" is not the same as "good for data analysis" is not the same as "good for network service development", etc. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From desmoulinmichel at gmail.com Fri Apr 1 09:43:14 2016 From: desmoulinmichel at gmail.com (Michel Desmoulin) Date: Fri, 1 Apr 2016 15:43:14 +0200 Subject: [Python-ideas] Fwd: Make parenthesis optional in parameterless functions definitions In-Reply-To: References: <20160331175741.GU12526@ando.pearwood.info> <56FE602F.90002@gmail.com> Message-ID: <56FE7AF2.60902@gmail.com> Le 01/04/2016 14:37, Paul Moore a écrit : > On 1 April 2016 at 12:49, Michel Desmoulin wrote: >> Is there really a strong case against it than just "it's not pure" ? >> I've seen of lot of this argument on the list lately and I find it >> counter productive. >> >> There are dozen of good way to oppose an idea, just saying "we got a >> moral stand to not do it" is not convincing. 
Espacially in a language >> with so many compromised like len(foo) instead of foo.len, functional >> paradigme and Poo and immutability and mutability, etc. > > You seem to have missed the argument that's been made a couple of > times, which is that you would then have two ways of doing the same > thing (empty parentheses would need to still be allowed for backward > compatibility) and Python has a strong tradition of "there should only > be one way of doing things". (Yes, it's not always followed 100%, as > with any real life situation not everything is perfect). You may > prefer to allow people to choose their own style and have options. But > as a maintainer, I can confirm that I personally prefer code that I > have to maintain to have a consistent style, and Python's lack of > multiple ways to say the same thing is a benefit in ensuring that > happens. So for me, your proposal would result in extra work, and no > benefit (I would continue adding "()" as I find that style more > readable and consistent). > > This is not a "moral stand". The "one way to do things" principle is a > highly practical decision based on experience of different programming > environments - Perl in particular allowed many ways of doing the same > thing ("there's more than one way to do it" was a catch phrase in the > Perl community) and it's an acknowledged fact that Perl code is hard > to maintain as a consequence (unless strict style guidelines are > imposed). > > As someone proposing a change to the Python language, the onus is on > you to argue the benefits of your change, not on others to argue > against it. If no-one does anything, Python won't change so you have > to convince people. At the moment your argument is little more than > "it looks neater", and opinions on that are clearly divided. As > there's a *huge* amount of material that would need changing as a > result of the change (documentation, training courses, style guides, > IDEs, ...) 
you need a much better argument - and complaining that the > people pointing out that your argument isn't strong enough are being > "counter productive" is not helpful. > > If we're discussing "I've seen a lot of it on this list lately", then > I would argue that there has been an awful lot of ideas proposed here > recently which don't take into account the significant and genuine > costs involved in *any* change to Python, and as a result don't even > try to offer a justification for the proposal that is in proportion to > the impact of the change. It's perfectly possible to propose a change > here, have it supported, and get it implemented - but it needs *work*, > and a lot of the discussions here are pointless because no-one is > willing to do any work (not even the amount of work needed to convince > the core developers that their proposal is worthwhile). It's very easy > to accuse people of not being willing to listen to your proposals - > but I challenge you to try to get a change like this included into C++ > or Java, if you think that about Python! > > Paul > This is actually a very convincing one :) From ncoghlan at gmail.com Fri Apr 1 09:44:37 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 1 Apr 2016 23:44:37 +1000 Subject: [Python-ideas] `to_file()` method for strings In-Reply-To: <56F9514F.2090304@gmail.com> References: <7DDCB676-2AA3-42AB-807C-D7C1F0BA293C@yahoo.com> <-7249072436132249471@unknownmsgid> <56F9514F.2090304@gmail.com> Message-ID: On 29 March 2016 at 01:44, Michel Desmoulin wrote: > > > Le 28/03/2016 17:30, Chris Barker - NOAA Federal a ?crit : >>> On Mar 24, 2016, at 7:22 PM, Nick Coghlan : what if we had a JSON-based save builtin that wrote >>> UTF-8 encoded files based on json.dump()? >> >> I've been think about this for a while, but would rather have a >> "pyson" format -- I.e. Python literals, rather than JSON. 
This would >> preserve the tuple vs list and integer vs float distinction, and allow >> more options for dictionary keys.(and sets?). >> >> Granted, you'd lose the interoperability, but for the quick saving and >> loading of data, it'd be pretty nice. >> >> There is also JSON pickle: >> >> https://jsonpickle.github.io >> >> Though as I understand it, it has the same security issues as pickle. >> >> But could we make a not-quite-as-complete pickle-like protocol that >> could save and load arbitrary objects, without ever running arbitrary >> code? > > If it's for quick data saving, the security is not an issue since the > data will never comes from an attacker if you do a quick script. "These files will never be supplied or altered by an attacker" is the kind of assumption that has graced the world with such things as MS Office macro viruses. That means that as Python makes more inroads into the traditional territory of MS Excel and other spreadsheets, ensuring we encourage a clear distinction between code (which is always dangerous to trust) and data (which *should* be safe to read, aside from processing capacity limits) becomes increasingly important. If we ever did something like this, then Chris's suggestion of a Python-specific format that can be loaded from a string via ast.literal_eval() rather than using JSON likely makes sense [1], but it would also be appropriate to revisit that idea first as a project outside the standard library for ad hoc data persistence, before proposing it for standard library inclusion. Cheers, Nick. [1] https://code.google.com/archive/p/pyon/ is a project from several years ago aimed at that task. 
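[A minimal sketch of the safe "Python literals" persistence discussed above: write the data with repr() and read it back with ast.literal_eval(), which parses only literal syntax and never executes code. The file name and sample data here are illustrative, not from the pyson project.]

```python
import ast
import os
import tempfile

# Sample data: note the tuples and the float, which JSON would not
# round-trip faithfully (JSON turns tuples into lists).
data = {'points': [(3, 1), (1, 4)], 'scale': 2.5}

path = os.path.join(tempfile.gettempdir(), 'data.pyson')
with open(path, 'w', encoding='utf-8') as f:
    f.write(repr(data))

with open(path, encoding='utf-8') as f:
    # literal_eval accepts only strings, numbers, tuples, lists, dicts,
    # sets, booleans and None -- a hostile file can at worst fail to
    # parse; it cannot run arbitrary code the way unpickling can.
    restored = ast.literal_eval(f.read())

print(restored == data)                      # True
print(type(restored['points'][0]).__name__)  # tuple
```

[This only covers objects whose repr() is a literal; arbitrary class instances would still need an explicit to-literal/from-literal conversion, which is where the pickle-versus-safety trade-off Nick describes comes in.]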
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From steve at pearwood.info Fri Apr 1 09:54:46 2016 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 2 Apr 2016 00:54:46 +1100 Subject: [Python-ideas] Fwd: Make parenthesis optional in parameterless functions definitions In-Reply-To: <56FE602F.90002@gmail.com> References: <20160331175741.GU12526@ando.pearwood.info> <56FE602F.90002@gmail.com> Message-ID: <20160401135446.GY12526@ando.pearwood.info> On Fri, Apr 01, 2016 at 01:49:03PM +0200, Michel Desmoulin wrote: > There are dozen of good way to oppose an idea, just saying "we got a > moral stand to not do it" is not convincing. Nobody has made that argument. > Espacially in a language > with so many compromised like len(foo) instead of foo.len, len(foo) isn't a compromise, it is an intentional feature. > functional paradigme and Poo and immutability and mutability, etc. "Paradigm". You might not be aware that "poo" is an English euphemism for excrement, normally used for and by children. So I'm completely confused by what you mean by "and Poo". > Python has an history of making things to get out of the way: There are many people who would say that Python's case sensitivity and significant indentation "get in the way". > - no {} for indentation; > - optional parentheses for tuples; No. Parentheses have nothing to do with tuples (except the empty tuple). Parentheses are used for *grouping*. Parens don't make tuples, and they aren't "optional" any more than parens are "optional" in addition because you can write `result = (a+b)`. The parens here have nothing to do with addition, and it would be misleading to say "optional parentheses for addition". Writing (1, 2, 3) is similar to writing ([1, 2, 3]) or ("abc") or (123). Apart from nested tuples, it's almost never needed. > - optional parenthesis for classes; Needed for backwards compatibility. Let's not copy that misfeature into future misfeatures. 
> If this changes does not hurt readability, ability to debug and doesn't
> make your code/program any worst than it was but does't help even a
> little, why not ?

Who says that it doesn't hurt readability? My personal experience tells me
that it DOES hurt readability, at least a little, and adds confusion to the
rules of what needs parens when and what doesn't.

You might not agree with my personal experience, but you shouldn't just
dismiss it or misrepresent it as a "moral stand". My argument cuts right to
the core of the argument that making parens optional helps -- my experience
is that it *doesn't help*, it actually HURTS.

-- Steve

From boekewurm at gmail.com Fri Apr 1 10:18:14 2016
From: boekewurm at gmail.com (Matthias welp)
Date: Fri, 1 Apr 2016 16:18:14 +0200
Subject: [Python-ideas] Decorators for variables
Message-ID:

tldr: Using three method declarations or chaining method calls is ugly, why
not allow variables and attributes to be decorated too?

Currently the way to create variables with custom get/set/deleters is to
use the @property decorator or use property(get, set, del, doc?), and this
must be repeated per variable. If I were able to decorate multiple
properties with decorators like @not_none or something similar, it would
take away a lot of currently used confusing code. Feedback is appreciated.

-----------

The current ways to define the getter, setter and deleter methods for a
variable or attribute are the following:

@property
def name():
    """ docstring """
    ... code

@name.setter
def name():
    ... code

@name.deleter
def name():
    ... code

and

var = property(getter, setter, deleter, docstring)

These two methods are doable when you only need to change access behaviour
on only one variable or property, but the more variables you want to change
access to, the longer and more bloated the code will get.
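Spelled out in full for a single managed attribute, the pattern looks like
this (the Angle class and its normalisation rule are invented purely to
make the sketch runnable):

```python
class Angle:
    """One managed attribute, written out with the full property pattern."""

    def __init__(self, degrees):
        self.degrees = degrees          # routed through the setter below

    @property
    def degrees(self):
        """Angle in degrees, normalised to [0, 360)."""
        return self._degrees

    @degrees.setter
    def degrees(self, value):
        self._degrees = value % 360     # normalise on every assignment

    @degrees.deleter
    def degrees(self):
        del self._degrees

a = Angle(400)
print(a.degrees)    # 40
```

Every additional managed attribute repeats the three decorated methods,
which is the bloat being described.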
Adding multiple restrictions on a variable will for instance look like
this:

var = decorator_a(decorator_b(property(value)))

or

@property
def var(self):
    return decorator_a.getter(decorator_b.getter(self._value))
... etc

or even this

@decorator_a
@decorator_b
def var(self):
    pass

I propose the following syntax, essentially equal to the syntax of
function decorators:

@decorator
var = some_value

which would be the same as

var = decorator(some_value)

and can be chained as well:

@decorator
@decorator_2
var = some_value

which would be

var = decorator(decorator_2(some_value))

or similarly

var = decorator(decorator_2())
var = some_value

The main idea behind the proposal is that you can use the decorator as a
standardized way to create variables that have the same behaviour, instead
of having to do that using methods. I think that a lot can be gained by
specifying a decorator that can decorate variables or properties.

Note that many arguments will be the same as for function decorators
(PEP 0318), but then applied to variable/property/attribute declaration.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From rymg19 at gmail.com Fri Apr 1 10:19:04 2016
From: rymg19 at gmail.com (Ryan Gonzalez)
Date: Fri, 1 Apr 2016 09:19:04 -0500
Subject: [Python-ideas] The future of Python: fixing broken error handling in Python 8
Message-ID:

Python's exception handling system is currently badly brokeTypeError:
unsupported operand type(s) for +: 'NoneType' and 'NoneType'n. Therefore,
with the recent news of the joyous release of Python 8
(https://mail.python.org/pipermail/python-dev/2016-March/143603.html), I
have decided to propose a revolutionary idea: safe mock objects.

A "safe" mock object (qualified name
`_frozensafemockobjectimplementation.SafeMockObjectThatIsIncludedWithPython8`;
Java-style naming was adopted for readability purposes; comments are now no
longer necessary) is a magic object that supports everything and returns
itself.
Since examples speak more words than are in the Python source code, here
are some (examples, not words in the Python source code):

a = 1
b = None
c = a + b  # Returns a _frozensafemockobjectimplementation.SafeMockObjectThatIsIncludedWithPython8
print(c)  # Prints the empty string.
d = c+1  # All operations on `_frozensafemockobjectimplementation.SafeMockObjectThatIsIncludedWithPython8`'s return a new one.
e = d.xyz(1, 2, 3)  # `e` is now a `_frozensafemockobjectimplementation.SafeMockObjectThatIsIncludedWithPython8`.

def f():
    assert 0  # Causes the function to return a `_frozensafemockobjectimplementation.SafeMockObjectThatIsIncludedWithPython8`.
    raise 123  # Does the same thing.

print(L)  # L is undefined, so it becomes a `_frozensafemockobjectimplementation.SafeMockObjectThatIsIncludedWithPython8`.

Safe mock objects are obviously the Next Error Handling Revolution™.
Unicode errors now simply disappear and return more
`_frozensafemockobjectimplementation.SafeMockObjectThatIsIncludedWithPython8`s.
As for `try` and `catch` (protest the naming of `except`!!) statements,
they will be completely ignored. The `try`, `except`, and `finally` bodies
will all be executed in sequence, except that printing and returning values
with an `except` statement does nothing:

try:
    xyz = None.a  # `xyz` becomes a `_frozensafemockobjectimplementation.SafeMockObjectThatIsIncludedWithPython8`.
except:
    print(123)  # Does nothing.
    return None  # Does nothing.
finally:
    return xyz  # Returns a `_frozensafemockobjectimplementation.SafeMockObjectThatIsIncludedWithPython8`.

Aggressive error handling (as shown in PanicSort [https://xkcd.com/1185/])
that does destructive actions (such as `rm -rf /`) will always execute the
destructive code, encouraging more honest development. In addition, due to
errors simply being ignored, nothing can ever quite go wrong.
All discussions about a safe navigation operator can now be immediately
halted, since any undefined attributes will simply return a
`_frozensafemockobjectimplementation.SafeMockObjectThatIsIncludedWithPython8`.

Although I have not yet destroy--I mean, improved CPython to allow for
this amazing idea, I have created a primitive implementation of the
`_frozensafemockobjectimplementation` module:
https://github.com/kirbyfan64/_frozensafemockobjectimplementation

I hope you will all realize that this new idea is a drastic improvement
over current technologies and therefore support it, because we can Make
Python Great Again™.

--
Ryan
[ERROR]: Your autotools build scripts are 200 lines longer than your
program. Something's wrong. http://kirbyfan64.github.io/
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From rosuav at gmail.com Fri Apr 1 10:27:48 2016
From: rosuav at gmail.com (Chris Angelico)
Date: Sat, 2 Apr 2016 01:27:48 +1100
Subject: [Python-ideas] The future of Python: fixing broken error handling in Python 8
In-Reply-To:
References:
Message-ID:

On Sat, Apr 2, 2016 at 1:19 AM, Ryan Gonzalez wrote:
> Safe mock objects are obviously the Next Error Handling Revolution™.
> Unicode errors now simply disappear and return more
> `_frozensafemockobjectimplementation.SafeMockObjectThatIsIncludedWithPython8`s.

Finally. It's about time.

For some reason, technology deals with the hard problems but not the easy
ones. I mean, we put a man on the moon, but we can't cure the common cold
OR cancer. We make aeroplanes that fly us around the world any time we
like, but can't get through airport security in less than an hour. And we
design programming languages that eliminate memory allocation problems,
but can't make it so bytes and text magically work together.

At last, a language that lets me express myself in code without having to
think about any languages other than my own parochial subset of English.
ChrisA

From tritium-list at sdamon.com Fri Apr 1 10:36:27 2016
From: tritium-list at sdamon.com (Alexander Walters)
Date: Fri, 1 Apr 2016 10:36:27 -0400
Subject: [Python-ideas] Fwd: Make parenthesis optional in parameterless functions definitions
In-Reply-To: <8537r69qrj.fsf@benfinney.id.au>
References: <56FD5EC7.7060909@sdamon.com> <8537r69qrj.fsf@benfinney.id.au>
Message-ID: <56FE876B.2020009@sdamon.com>

On 3/31/2016 19:22, Ben Finney wrote:
> No mention of keystrokes. I'm endlessly disappointed that discussions of
> "too much noise in the code" are mis-interpreted as *only* about
> writing, not about reading.

Because, presumably, the people making those comments read a lot of code
and don't see a problem with the two characters?

>>> def foo():
...     pass
...

There are four pieces of information in the first line of that definition.
It starts with the keyword at the beginning of the line. Readers of
languages that are left to right will instantly know they are in a
function definition. The next piece of information is the identifier that
the function will be assigned to, and that is clearly defined right there.
The proposal would not change this. The third is the argument list, in
this case an empty one. The fourth is the 'block delimiter' for lack of
anything better for me to call it.

As this example sits, that line should be read as "define a function named
foo that explicitly takes no arguments with the following code." Omitting
the parens changes the way it reads to something along the lines of
"define a function named foo with the following code." The proposal has
removed vital visual information.

I said earlier that if this suggestion was made in 1991 it should have
been accepted to make functions more consistent with classes. I have
changed my mind; in 1991 classes should have been corrected to always
require parens.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ian.g.kelly at gmail.com Fri Apr 1 10:37:32 2016
From: ian.g.kelly at gmail.com (Ian Kelly)
Date: Fri, 1 Apr 2016 08:37:32 -0600
Subject: [Python-ideas] Decorators for variables
In-Reply-To:
References:
Message-ID:

On Fri, Apr 1, 2016 at 8:18 AM, Matthias welp wrote:
> Currently the way to create variables with custom get/set/deleters is to use
> the @property decorator or use property(get, set, del, doc?), and this must
> be repeated per variable. If I were able to decorate multiple properties
> with decorators like @not_none or something similar, it would take away a
> lot of currently used confusing code.

I don't see the relationship between this paragraph and the rest of your
proposal. How does a decorator in place of an explicit function call
prevent repetition?

> I propose the following syntax, essentially equal to the syntax of function
> decorators:
>
> @decorator
> var = some_value
>
> which would be the same as
>
> var = decorator(some_value)
>
> and can be chained as well:
>
> @decorator
> @decorator_2
> var = some_value
>
> which would be
>
> var = decorator(decorator_2(some_value))
>
> or similarly
>
> var = decorator(decorator_2())
> var = some_value

What about augmented assignment? Should this work?

@float
var += 20

And would that be equivalent to:

@float
var = var + 20

Or:

var = var + float(20)

Also, what about attributes and items?

@decorator
x.attr = value

@another_decorator
d['foo'] = value

> The main idea behind the proposal is that you can use the decorator as a
> standardized way to create variables that have the same behaviour, instead
> of having to do that using methods. I think that a lot can be gained by
> specifying a decorator that can decorate variables or properties.

By "methods" you mean "function composition", right? Otherwise I don't
understand what methods have got to do with this.
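For reference, everything the proposed syntax would cover is already
expressible as an ordinary call today, which is the bar any new sugar has
to clear -- a small sketch (the not_none validator is invented):

```python
def not_none(value):
    """An invented 'variable decorator': identity, but rejects None."""
    if value is None:
        raise ValueError("value may not be None")
    return value

# the proposed
#     @not_none
#     var = 42
# is today simply:
var = not_none(42)
assert var == 42

# and plain calls already work for attributes and items, where the
# proposed syntax raises the questions above:
d = {}
d['foo'] = not_none('bar')
assert d['foo'] == 'bar'
```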
From ethan at stoneleaf.us Fri Apr 1 10:47:02 2016
From: ethan at stoneleaf.us (Ethan Furman)
Date: Fri, 01 Apr 2016 07:47:02 -0700
Subject: [Python-ideas] Make parenthesis optional in parameterless functions definitions
In-Reply-To: <56FE492C.90302@mail.de>
References: <56FD8D6B.9040000@mail.de> <20160401004329.GW12526@ando.pearwood.info> <56FE3E84.7030303@mail.de> <56FE492C.90302@mail.de>
Message-ID: <56FE89E6.3030008@stoneleaf.us>

On 04/01/2016 03:10 AM, Sven R. Kunze wrote:
> On 01.04.2016 11:52, Terry Reedy wrote:
>> On 4/1/2016 5:25 AM, Sven R. Kunze wrote:
>>
>>> I remember this "issue" with decorators as well.
>>>
>>> @property
>>> def foo....
>>>
>>> @lru_cache() # really?
>>> def bar...
>>
>> There is a semantic difference between the two examples. The ()s are
>> not optional. 'property' is a decorator that is applied to foo after
>> it is defined. 'lru_cache()' is a function call that returns a
>> decorator. Quite different.
>
> @Terry
> Thanks for snipping away the relevant explanation and not addressing my
> point. >.<

Your "point" appears to be that ()s are optional in the case of
decorators, and they are not -- decorators don't need them, and can't have
them*, while functions that return a decorator do need them and must have
them.

If you meant something else please offer a better explanation -- no need
to be snide with Terry.

--
~Ethan~

* Okay, it is possible to write a decorator that works either with or
without parens, but it's a pain, not general-purpose, and can be confusing
to use.
From tritium-list at sdamon.com Fri Apr 1 10:42:30 2016
From: tritium-list at sdamon.com (Alexander Walters)
Date: Fri, 1 Apr 2016 10:42:30 -0400
Subject: [Python-ideas] Package reputation system
In-Reply-To: References: <56F5B3F7.40502@gmail.com> <56FA33C0.4000902@mail.de> <56FA4E31.3010908@gmail.com> <56FA5F7E.3070001@gmail.com> <56FA656B.5070707@mail.de> <56FAB7C4.6060904@gmail.com> <4DD01140-B429-4FD0-8164-C965210A2282@selik.org>
Message-ID: <56FE88D6.7080004@sdamon.com>

On 4/1/2016 09:07, Nick Coghlan wrote:
> And so do Linux distributions.
>
> There's a reason there are so many of the latter: "good" choices
> depend a great deal on what you're doing, which means there's no point
> in trying to come up with the "one true reputation system" for
> software components. djangopackages.com does a relatively good job for
> the Django ecosystem, but the simple fact of using Django puts you far
> enough down the path towards solving a particular kind of problem (web
> service design with a rich user permission system backed by a
> relational database) that a community driven rating system can work.
>
> By contrast, I think Python itself covers too many domains for a
> common rating system to be feasible - "good for education" is not the
> same as "good for sysadmin tasks" is not the same as "good for data
> analysis" is not the same as "good for network service development",
> etc.
>
> Cheers,
> Nick.

Not that this was the original proposal, but there can be such a thing as
a universal 'bad' package, though. So about the only thing that a
universal package rating system can do effectively is shame developers. I
don't think we want that.
From ericfahlgren at gmail.com Fri Apr 1 11:07:21 2016
From: ericfahlgren at gmail.com (Eric Fahlgren)
Date: Fri, 1 Apr 2016 08:07:21 -0700
Subject: [Python-ideas] The next major Python version will be Python 8
In-Reply-To:
References:
Message-ID: <010a01d18c28$31d0e810$9572b830$@gmail.com>

Victor Stinner wrote:
> Sent: Thursday, March 31, 2016 15:27
> To: python-ideas
> The PSF is happy to announce that the new Python release will be Python 8!

Victor,

Excellent work! Good to see people thinking about the future. I've been
working on some similar enhancements myself.

My principal development platform is the Apple ][, which sports a highly
ergonomic 40-column display (with supremely aesthetic shades of grey in
lieu of useless colors). This leads me to propose an enhancement to PEP 8,
which I'm calling PEP 8.1 (since as all Windows users know, 8.1 is much
better than 8).

One of the major changes in PEP 8.1 is the reduction of the allowed line
length from 79 to 39, with a suggested maximum of 36 characters. It would
be great if Python 8.1 would implement this, and disallow the use of any
source code with characters beyond the magic boundary. This clearly makes
sense, since as everyone knows that if the 79-column limit is good in this
age of ubiquitous 1080, nay 4k, monitors, then 39 is surely better!

Eric

From victor.stinner at gmail.com Fri Apr 1 11:34:14 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Fri, 1 Apr 2016 17:34:14 +0200
Subject: [Python-ideas] The next major Python version will be Python 8
In-Reply-To: <010a01d18c28$31d0e810$9572b830$@gmail.com>
References: <010a01d18c28$31d0e810$9572b830$@gmail.com>
Message-ID:

2016-04-01 17:07 GMT+02:00 Eric Fahlgren :
> Excellent work! Good to see people thinking about the future. I've been
> working on some similar enhancements myself.
> (...)
> One of the major changes in PEP 8.1 is the reduction of the allowed line
> length from 79 to 39, with a suggested maximum of 36 characters.
> It would be great if Python 8.1 would implement this, and disallow the
> use of any source code with characters beyond the magic boundary. This
> clearly makes sense, since as everyone knows that if the 79-column limit
> is good in this age of ubiquitous 1080, nay 4k, monitors, then 39 is
> surely better!

The rules of the pep8 module *must* change at each Python release to
ensure that each release is backward-incompatible!

Victor

From boekewurm at gmail.com Fri Apr 1 11:46:26 2016
From: boekewurm at gmail.com (Matthias welp)
Date: Fri, 1 Apr 2016 17:46:26 +0200
Subject: [Python-ideas] Decorators for variables
In-Reply-To:
References:
Message-ID:

> > Currently the way to create variables with custom get/set/deleters is to
> > use the @property decorator or use property(get, set, del, doc?), and
> > this must be repeated per variable. If I were able to decorate multiple
> > properties with decorators like @not_none or something similar, it
> > would take away a lot of currently used confusing code.
>
> How does this prevent repetition?

If you were to define a variable you currently could use the @property for
each variable, which could take up to 3 declarations of the same name per
use of the pattern. Using a decorator might take that down to only 1 extra
line.

> What about augmented assignment? Should this work?

The steps it would go through were these:

1. the value of the statement is calculated, e.g. val + 20 in the first
   case given.
2. the decorator is applied on that value.
3. the return value from the decorator is then assigned to the variable.

This is, again, very similar to the way function decorators work, and a
shorthand way to make property access more transparent to the programmer.

> Also, what about attributes and items?

I have not yet thought about that, as they are not direct scope variables.
The idea was to decorate the attribute or variable at the moment it would
get defined, not per se after definition.
> > instead of having to do that using methods
>
> By "methods" you mean "function composition", right?

Sorry for my terminology, I meant function calls, but function composition
is indeed what would happen effectively.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ian.g.kelly at gmail.com Fri Apr 1 11:55:02 2016
From: ian.g.kelly at gmail.com (Ian Kelly)
Date: Fri, 1 Apr 2016 09:55:02 -0600
Subject: [Python-ideas] Decorators for variables
In-Reply-To:
References:
Message-ID:

On Fri, Apr 1, 2016 at 9:46 AM, Matthias welp wrote:
>> > Currently the way to create variables with custom get/set/deleters is to
>> > use the @property decorator or use property(get, set, del, doc?), and
>> > this must be repeated per variable. If I were able to decorate multiple
>> > properties with decorators like @not_none or something similar, it
>> > would take away a lot of currently used confusing code.
>>
>> How does this prevent repetition?
>
> If you were to define a variable you currently could use the @property for
> each variable, which could take up to 3 declarations of the same name per
> use of the pattern. Using a decorator might take that down to only 1 extra
> line.

An example of the transformation that you intend would help here. If
you're intending this as a @property replacement then as far as I can see
you still need to write up to three functions that define the property's
behavior.
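For what it's worth, the three-function repetition can already be factored
out today without new syntax, using a small property factory -- a sketch
with invented names (checked, not_none, Node):

```python
def checked(name, check):
    """Build a property that stores to '_<name>' and validates on assignment."""
    attr = '_' + name

    def getter(self):
        return getattr(self, attr)

    def setter(self, value):
        check(value)                    # raises if the value is unacceptable
        setattr(self, attr, value)

    def deleter(self):
        delattr(self, attr)

    return property(getter, setter, deleter)


def not_none(value):
    if value is None:
        raise ValueError("may not be None")


class Node:
    # one line per managed attribute instead of three decorated methods
    parent = checked('parent', not_none)
    label = checked('label', not_none)

n = Node()
n.label = 'root'
assert n.label == 'root'
```

Each managed attribute then costs one line in the class body instead of
three decorated methods.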
From rosuav at gmail.com Fri Apr 1 12:45:18 2016
From: rosuav at gmail.com (Chris Angelico)
Date: Sat, 2 Apr 2016 03:45:18 +1100
Subject: [Python-ideas] Decorators for variables
In-Reply-To:
References:
Message-ID:

On Sat, Apr 2, 2016 at 1:18 AM, Matthias welp wrote:
> I propose the following syntax, essentially equal to the syntax of function
> decorators:
>
> @decorator
> var = some_value
>
> which would be the same as
>
> var = decorator(some_value)

The whole point of decorator syntax for classes and functions is that
their definitions take many lines, and the decoration belongs as part of
the function signature or class definition. At the top of a function block
is a line which specifies the function name and any arguments, and then
you have the docstring. Similarly with classes - name, superclasses,
metaclass, docstring. All up the top. Placing the decorator above that
allows for an extremely convenient declarative syntax that keeps all that
information together.

Also, the decorator syntax replaces the redundant names:

def functionname():
    ...
functionname = decorator(functionname)

where the function first gets defined using its name, and then gets
rebound (which involves looking up the name and then assigning the result
back) - three separate uses of the name.

In contrast, you're asking for syntax to help you modify an expression.
Expressions already don't need the decorator syntax, because we can
replace this:

var = some_value
var = decorator(var)

with this:

var = decorator(some_value)

as in your example. Decorator syntax buys us nothing above this.

ChrisA

From boekewurm at gmail.com Fri Apr 1 13:14:41 2016
From: boekewurm at gmail.com (Matthias welp)
Date: Fri, 1 Apr 2016 19:14:41 +0200
Subject: [Python-ideas] Decorators for variables
Message-ID:

> An example of the transformation would help here

An example that detects cycles in a graph, and doesn't do an update if the
graph has cycles:
class prevent_cycles(property):
    """ This uses nodes that point to only one other node: if there is a
    cycle in the current subgraph, it will detect that within O(n) """

    def __init__(self, value):
        super().__init__()
        self._value = None
        self.__set__(value)

    def getter(self):
        return self._value

    def setter(self, value):
        if not turtle_and_hare(value):
            self._value = value
        else:
            raise Exception("cycle detected, shutting down")

    def deleter(self):
        del self._value

    def turtle_and_hare(self, other):
        """ Generic turtle and hare implementation. true if cycle, false
        if not. Returns True if cyclic from this point, false if it is
        not. """
        turtle = other
        hare = other
        fieldname = self.__name__
        # this assuming that properties have access to their name, but that
        # would also be the same as a function.
        while True:
            if hare is None:
                return False
            hare = getattr(hare, fieldname)
            if hare is None:
                return False
            if hare is turtle:
                return True
            hare = getattr(hare, fieldname)
            turtle = getattr(turtle, fieldname)
            if hare is turtle:
                return True


class A(object):
    def __init__(self, parent):
        @prevent_cycles
        self.parent = parent

This would prevent cycles from being created in this object A, and would
make some highly reusable code. The same can be done for @not_none, etc,
to prevent some states which may be unwanted.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From mike at selik.org Fri Apr 1 13:32:57 2016
From: mike at selik.org (Michael Selik)
Date: Fri, 01 Apr 2016 17:32:57 +0000
Subject: [Python-ideas] Decorators for variables
In-Reply-To:
References:
Message-ID:

On Fri, Apr 1, 2016 at 10:18 AM Matthias welp wrote:
> tldr: Using three method declarations or chaining method calls is ugly,
> why not allow variables and attributes to be decorated too?

Focusing on attributes, not variables in general. Check out how Django
dealt with this
(https://docs.djangoproject.com/en/1.9/topics/db/models/#quick-example).
And SQLAlchemy (http://flask-sqlalchemy.pocoo.org/2.1/models/). Do their
solutions satisfy?
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ian.g.kelly at gmail.com Fri Apr 1 13:33:15 2016
From: ian.g.kelly at gmail.com (Ian Kelly)
Date: Fri, 1 Apr 2016 11:33:15 -0600
Subject: [Python-ideas] Decorators for variables
In-Reply-To:
References:
Message-ID:

(Resending to correct list. Sorry about that.)

On Fri, Apr 1, 2016 at 11:25 AM, Ian Kelly wrote:
> On Fri, Apr 1, 2016 at 11:14 AM, Matthias welp wrote:
>>> An example of the transformation would help here
>>
>> An example, that detects cycles in a graph, and doesn't do an update if
>> the graph has cycles.
>
> Thanks.
>
>> class A(object):
>>     def __init__(self, parent):
>>         @prevent_cycles
>>         self.parent = parent
>
> I think you'll find that this doesn't work. Properties are members of
> the class, not of instances of the class.
>
>> This would prevent cycles from being created in this object A, and would
>> make some highly reusable code. The same can be done for @not_none, etc,
>> to prevent some states which may be unwanted.
>
> But you could accomplish the same thing with "self.parent =
> prevent_cycles(parent)". So I'm still not seeing how the use of the
> decorator syntax eliminates repetition.

From desmoulinmichel at gmail.com Fri Apr 1 13:41:13 2016
From: desmoulinmichel at gmail.com (Michel Desmoulin)
Date: Fri, 1 Apr 2016 19:41:13 +0200
Subject: [Python-ideas] Add __main__ for uuid, random and urandom
In-Reply-To:
References:
Message-ID: <56FEB2B9.4020405@gmail.com>

It's dangerous to talk about a new feature the 1st of April, so I'll start
by saying this one is not a joke.

I read recently a proposal to allow md5 hashing by doing
python -m hashlib md5 filename.
I think it's a great idea, and we could extend this kind of API to other
parts of Python, such as:

python -m random randint 0 10 => print(random.randint(0, 10))
python -m random urandom 10 => print(os.urandom(10))
python -m uuid uuid4 => print(uuid.uuid4().hex)
python -m uuid uuid3 => print(uuid.uuid3().hex)

From desmoulinmichel at gmail.com Fri Apr 1 13:44:21 2016
From: desmoulinmichel at gmail.com (Michel Desmoulin)
Date: Fri, 1 Apr 2016 19:44:21 +0200
Subject: [Python-ideas] Decorators for variables
In-Reply-To:
References:
Message-ID: <56FEB375.7010609@gmail.com>

Le 01/04/2016 19:33, Ian Kelly a écrit :
> (Resending to correct list. Sorry about that.)
>
> On Fri, Apr 1, 2016 at 11:25 AM, Ian Kelly wrote:
>> On Fri, Apr 1, 2016 at 11:14 AM, Matthias welp wrote:
>>>> An example of the transformation would help here
>>>
>>> An example, that detects cycles in a graph, and doesn't do an update if
>>> the graph has cycles.
>>
>> Thanks.
>>
>>> class A(object):
>>>     def __init__(self, parent):
>>>         @prevent_cycles
>>>         self.parent = parent
>>
>> I think you'll find that this doesn't work. Properties are members of
>> the class, not of instances of the class.
>>
>>> This would prevent cycles from being created in this object A, and would
>>> make some highly reusable code. The same can be done for @not_none, etc,
>>> to prevent some states which may be unwanted.
>>
>> But you could accomplish the same thing with "self.parent =
>> prevent_cycles(parent)". So I'm still not seeing how the use of the
>> decorator syntax eliminates repetition.

Not saying I like the proposal, but you can argue against regular
decorators the same way:

@foo
def bar():
    pass

Is just:

bar = foo(bar)

But, I think the benefit of @decorator on functions is mainly because a
function body is big, and this way we can read the decorator next to the
function signature, while on a variable this just adds another way to call
a function on a variable.
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>

From desmoulinmichel at gmail.com Fri Apr 1 13:47:42 2016
From: desmoulinmichel at gmail.com (Michel Desmoulin)
Date: Fri, 1 Apr 2016 19:47:42 +0200
Subject: [Python-ideas] Fwd: Make parenthesis optional in parameterless functions definitions
In-Reply-To: <20160401135446.GY12526@ando.pearwood.info>
References: <20160331175741.GU12526@ando.pearwood.info> <56FE602F.90002@gmail.com> <20160401135446.GY12526@ando.pearwood.info>
Message-ID: <56FEB43E.40103@gmail.com>

Le 01/04/2016 15:54, Steven D'Aprano a écrit :
> On Fri, Apr 01, 2016 at 01:49:03PM +0200, Michel Desmoulin wrote:
>
>> There are dozen of good way to oppose an idea, just saying "we got a
>> moral stand to not do it" is not convincing.
>
> Nobody has made that argument.
>
>> Espacially in a language
>> with so many compromised like len(foo) instead of foo.len,
>
> len(foo) isn't a compromise, it is an intentional feature.
>
>> functional paradigme and Poo and immutability and mutability, etc.
>
> "Paradigm".
>
> You might not be aware that "poo" is an English euphemism for excrement,
> normally used for and by children. So I'm completely confused by what
> you mean by "and Poo".

Those are both French mistakes: "paradigme" has an "e" in French, while
OOP is POO. Wrote the email too fast.

>> Python has an history of making things to get out of the way:
>
> There are many people who would say that Python's case sensitivity and
> significant indentation "get in the way".
>
>> - no {} for indentation;
>> - optional parentheses for tuples;
>
> No. Parentheses have nothing to do with tuples (except the empty tuple).
> Parentheses are used for *grouping*.
> Parens don't make tuples, and they
> aren't "optional" any more than parens are "optional" in addition
> because you can write `result = (a+b)`. The parens here have nothing to
> do with addition, and it would be misleading to say "optional
> parentheses for addition".
>
> Writing (1, 2, 3) is similar to writing ([1, 2, 3]) or ("abc") or (123).
> Apart from nested tuples, it's almost never needed.
>
>> - optional parenthesis for classes;
>
> Needed for backwards compatibility. Let's not copy that misfeature into
> future misfeatures.
>
>> If this changes does not hurt readability, ability to debug and doesn't
>> make your code/program any worst than it was but does't help even a
>> little, why not ?
>
> Who says that it doesn't hurt readability? My personal experience tells
> me that it DOES hurt readability, at least a little, and adds confusion
> to the rules of what needs parens when and what doesn't.
>
> You might not agree with my personal experience, but you shouldn't just
> dismiss it or misrepresent it as a "moral stand". My argument cuts right
> to the core of the argument that making parens optional helps -- my
> experience is that it *doesn't help*, it actually HURTS.

But as I said earlier, I now agree with you.

From boekewurm at gmail.com Fri Apr 1 13:49:41 2016
From: boekewurm at gmail.com (Matthias welp)
Date: Fri, 1 Apr 2016 19:49:41 +0200
Subject: [Python-ideas] Decorators for variables
In-Reply-To:
References:
Message-ID:

> > tldr
> Check out how Django dealt with this. And SQLAlchemy
> Do their solutions satisfy

The thing I'm missing in those solutions is that they aren't chainable. If
I wanted something that uses access logging (like what Django and
SQLAlchemy are doing) plus some 'mixin' for that variable to provide cycle
protection, that would be hard. The only other way is using function
composition, and that can lead to statements that are too long to read
comfortably.
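One existing way to keep such composition chains readable is a tiny
compose() helper (not part of the standard library; sketched here under
that assumption):

```python
from functools import reduce

def compose(*funcs):
    """compose(f, g)(x) == f(g(x)) -- right-to-left function composition."""
    return reduce(lambda f, g: lambda x: f(g(x)), funcs)

# e.g. chaining two 'variable decorators' without nested parentheses:
clean = compose(str.strip, str.lower)
assert clean("  HELLO  ") == "hello"
```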
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From boekewurm at gmail.com Fri Apr 1 13:53:32 2016
From: boekewurm at gmail.com (Matthias welp)
Date: Fri, 1 Apr 2016 19:53:32 +0200
Subject: [Python-ideas] Decorators for variables
In-Reply-To: <56FEB375.7010609@gmail.com>
References: <56FEB375.7010609@gmail.com>
Message-ID:

> But, I think the benefit of @decorator on functions is mainly because a
> function body is big, and this way we can read the decorator next to the
> function signature, while on a variable this just adds another way to
> call a function on a variable.

Yes, it is, but I proposed it to give a visual indicator that it does not
change the value that is assigned to the identifier, but rather changes
the behaviour of the identifier, just like what most function decorators
do.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From rosuav at gmail.com Fri Apr 1 13:57:22 2016
From: rosuav at gmail.com (Chris Angelico)
Date: Sat, 2 Apr 2016 04:57:22 +1100
Subject: [Python-ideas] Decorators for variables
In-Reply-To: References: <56FEB375.7010609@gmail.com>
Message-ID:

On Sat, Apr 2, 2016 at 4:53 AM, Matthias welp wrote:
>> But, I think the benefit of @decorator on functions is mainly because a
>> function body is big, and this way we can read the decorator next to the
>> function signature, while on a variable this just adds another way to
>> call a function on a variable.
>
> Yes, it is, but I proposed it to give a visual indicator that it does not
> change the value that is assigned to the identifier, but rather changes the
> behaviour of the identifier, just like what most function decorators do.

Wait, what? Function decorators are simply higher-order functions: they
take a function as an argument, and return a function [1]. They can't
change the behaviour of the name, only the value it's bound to.

ChrisA

[1] Usually.
Nothing's stopping them from returning non-callables, except that it'd confuse the living daylights out of people. From ethan at stoneleaf.us Fri Apr 1 13:58:09 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Fri, 01 Apr 2016 10:58:09 -0700 Subject: [Python-ideas] Decorators for variables In-Reply-To: References: <56FEB375.7010609@gmail.com> Message-ID: <56FEB6B1.9050204@stoneleaf.us> On 04/01/2016 10:53 AM, Matthias welp wrote: > > But, I think the benefit for @decorator on functions is mainly because a > > function body is big, and this way we can read the decorator next to the > > function signature while on a variable, this just add another way to > > call a function on a variable. > > Yes, it is, but I proposed it to give a visual indicator that it does > not change the value that is assigned to the identifier, but rather > changes the behaviour of the identifier, just like what most function > decorators do. So instead of a = Char(length=10, value='empty') you want @Char(length=10) a = 'empty' ? -- ~Ethan~ From desmoulinmichel at gmail.com Fri Apr 1 14:02:10 2016 From: desmoulinmichel at gmail.com (Michel Desmoulin) Date: Fri, 1 Apr 2016 20:02:10 +0200 Subject: [Python-ideas] Provide __main__ for datetime and time In-Reply-To: References: <7DDCB676-2AA3-42AB-807C-D7C1F0BA293C@yahoo.com> <-7249072436132249471@unknownmsgid> <56F9514F.2090304@gmail.com> Message-ID: <56FEB7A2.5020902@gmail.com> As with my previous email about __main__ for random, uuid and os, I wish to suggest a similar __main__ for datetime. The target audience may not be the same, so I'm making it a different proposal.
E.g.: python -m datetime now => print(str(datetime.datetime.now())) python -m datetime utcnow => print(str(datetime.datetime.utcnow())) python -m time epoch => print(time.time()) python -m datetime now "%d/%m/%Y" => print(str(datetime.datetime.now().strftime("%d/%m/%Y"))) python -m datetime utcnow "%d/%m/%Y" => print(str(datetime.datetime.utcnow().strftime("%d/%m/%Y"))) From boekewurm at gmail.com Fri Apr 1 14:08:54 2016 From: boekewurm at gmail.com (Matthias welp) Date: Fri, 1 Apr 2016 20:08:54 +0200 Subject: [Python-ideas] Decorators for variables In-Reply-To: References: <56FEB375.7010609@gmail.com> Message-ID: > Function decorators There are decorators that return a callable that does not simply call the function that was given as an argument, but also does some other things, and therefore changes the behaviour of that function. > So instead of > > a = Char(length=10, value='empty') > > you want > > @Char(length=10) > a = 'empty' > > ? If possible, yes. So that there is a standardized way to access changing variables, or to put limits on the content of the variable, similar to the @accepts and @produces decorators that are seen here ( https://wiki.python.org/moin/PythonDecoratorLibrary#Type_Enforcement_.28accepts.2Freturns.29 ) -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Fri Apr 1 14:33:03 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Fri, 01 Apr 2016 11:33:03 -0700 Subject: [Python-ideas] Decorators for variables In-Reply-To: References: <56FEB375.7010609@gmail.com> Message-ID: <56FEBEDF.6070303@stoneleaf.us> On 04/01/2016 11:08 AM, Matthias welp wrote: > Even earlier, Ethan Furman wrote: >> So instead of >> >> a = Char(length=10, value='empty') >> >> you want >> >> @Char(length=10) >> a = 'empty' >> >> ? > > If possible, yes.
So that there is a standardized way to access changing > variables, or to put limits on the content of the variable, similar to > the @accepts and @produces decorators that are seen here > (https://wiki.python.org/moin/PythonDecoratorLibrary#Type_Enforcement_.28accepts.2Freturns.29) I don't see it happening. Making that change would be a lot of work, and the advantages (if any) of the second method over the first do not warrant it. -- ~Ethan~ From mike at selik.org Fri Apr 1 14:33:34 2016 From: mike at selik.org (Michael Selik) Date: Fri, 01 Apr 2016 18:33:34 +0000 Subject: [Python-ideas] Decorators for variables In-Reply-To: References: <56FEB375.7010609@gmail.com> Message-ID: On Fri, Apr 1, 2016 at 2:09 PM Matthias welp wrote: > > Function decorators > > There are decorators that return a callable that does not simply call the function > that was given as an argument, but also does some other things, and therefore > changes the behaviour of that function. > > > So instead of > > > > a = Char(length=10, value='empty') > > > > you want > > > > @Char(length=10) > > a = 'empty' > > > > ? > > If possible, yes. So that there is a standardized way to access changing > variables, or to put limits on the content of the variable, similar to the > @accepts and @produces decorators that are seen here ( > https://wiki.python.org/moin/PythonDecoratorLibrary#Type_Enforcement_.28accepts.2Freturns.29 > ) > There is a standardized way. You can extend ``property`` or mimic its implementation. Then instead of class Foo: a = property(getter, no_cycles) You can write class Foo: a = NoCycles() I haven't linked to a how-to for doing this, because I think it's unnecessary for most small projects. Every so often someone asks for a more pleasant syntax for specifying a property, getter and setter. Guido seems to consistently reply that he thinks our current situation is good enough. I'd dig up a link to the email archive for you, but Google wasn't being very kind to me.
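A minimal sketch of the approach Michael describes is below — a descriptor that mimics ``property``. The ``NoCycles`` name comes from the thread; the stored-slot scheme and the toy validation rule are illustrative assumptions, not an existing API:

```python
# Illustrative descriptor mimicking ``property``: it intercepts assignment
# and validates the value.  The "no cycles" rule shown here (rejecting a
# direct self-reference) is a stand-in for whatever check a real
# implementation would perform.
class NoCycles:
    def __init__(self, name):
        self.name = name                  # key used in the instance dict

    def __get__(self, obj, objtype=None):
        if obj is None:                   # accessed on the class itself
            return self
        return obj.__dict__[self.name]

    def __set__(self, obj, value):
        if value is obj:                  # crude cycle check
            raise ValueError("value may not reference the owner")
        obj.__dict__[self.name] = value

class Foo:
    a = NoCycles("a")

foo = Foo()
foo.a = 42
print(foo.a)    # 42
```

Because ``NoCycles`` defines ``__set__``, it is a data descriptor, so it keeps intercepting reads and writes of ``foo.a`` even though the value is stored in the instance dict under the same name.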
-------------- next part -------------- An HTML attachment was scrubbed... URL: From mike at selik.org Fri Apr 1 14:46:11 2016 From: mike at selik.org (Michael Selik) Date: Fri, 01 Apr 2016 18:46:11 +0000 Subject: [Python-ideas] Add __main__ for uuid, random and urandom In-Reply-To: <56FEB2B9.4020405@gmail.com> References: <56FEB2B9.4020405@gmail.com> Message-ID: There's a large category of these features that is solved by ``python -c``. What's special about these particular tasks? On Fri, Apr 1, 2016 at 1:41 PM Michel Desmoulin wrote: > It's dangerous to talk about a new feature the 1st of April, so I'll > start by saying this one is not a joke. > > I read recently a proposal to allow md5 hashing doing python -m hashlib > md5 filename. > > I think it's a great idea, and that can extend this kind of API to other > parts of Python such as: > > python -m random randint 0 10 => print(random.randint(0, 10)) > python -m random urandom 10 => print(os.urandom(10)) > python -m uuid uuid4 => print(uuid.uuid4().hex) > python -m uuid uuid3 => print(uuid.uuid3().hex) > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From random832 at fastmail.com Fri Apr 1 15:04:40 2016 From: random832 at fastmail.com (Random832) Date: Fri, 01 Apr 2016 15:04:40 -0400 Subject: [Python-ideas] Add __main__ for uuid, random and urandom In-Reply-To: <56FEB2B9.4020405@gmail.com> References: <56FEB2B9.4020405@gmail.com> Message-ID: <1459537480.2664635.566052546.51DEA248@webmail.messagingengine.com> On Fri, Apr 1, 2016, at 13:41, Michel Desmoulin wrote: > It's dangerous to talk about a new feature the 1st of April, so I'll > start by saying this one is not a joke.
> > I read recently a proposal to allow md5 hashing doing python -m hashlib > md5 filename. > > I think it's a great idea, and that can extend this kind of API to other > parts of Python such as: > > python -m random randint 0 10 => print(random.randint(0, 10)) > python -m random urandom 10 => print(os.urandom(10)) > python -m uuid uuid4 => print(uuid.uuid4().hex) > python -m uuid uuid3 => print(uuid.uuid3().hex) Bikeshedding a bit, but how about just python -m uuid for print(str(uuid.uuid4()))? From victor.stinner at gmail.com Fri Apr 1 15:07:41 2016 From: victor.stinner at gmail.com (Victor Stinner) Date: Fri, 1 Apr 2016 21:07:41 +0200 Subject: [Python-ideas] Add __main__ for uuid, random and urandom In-Reply-To: <56FEB2B9.4020405@gmail.com> References: <56FEB2B9.4020405@gmail.com> Message-ID: Hi, 2016-04-01 19:41 GMT+02:00 Michel Desmoulin : > I read recently a proposal to allow md5 hashing doing python -m hashlib > md5 filename. > > I think it's a great idea, and that can extend this kind of API to other > parts of Python such as: > > python -m random randint 0 10 => print(random.randint(0, 10)) How many times did you need this feature recently? I don't recall having to generate a random number in a range. > python -m random urandom 10 => print(os.urandom(10)) I don't understand the use case. Can you elaborate? > python -m uuid uuid4 => print(uuid.uuid4().hex) FYI On Linux, you can use "cat /proc/sys/kernel/random/uuid" ;-) I agree that -c is enough here: ./python -c 'import uuid; print(uuid.uuid4())' > python -m uuid uuid3 => print(uuid.uuid3().hex) I don't know UUID3. It looks like you need more parameters. IMHO these use cases are not popular enough to justify a CLI. hashlib CLI is inspired by existing tools: md5sum, sha1sum, etc. Same rationale for Python tarfile CLI. What are the existing commands which inspired your CLI?
Victor From tjreedy at udel.edu Fri Apr 1 15:31:59 2016 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 1 Apr 2016 15:31:59 -0400 Subject: [Python-ideas] Fwd: Make parenthesis optional in parameterless functions definitions In-Reply-To: <56FE602F.90002@gmail.com> References: <20160331175741.GU12526@ando.pearwood.info> <56FE602F.90002@gmail.com> Message-ID: On 4/1/2016 7:49 AM, Michel Desmoulin wrote: > Yes, and because after more than a decade of Python, I still forget to > type out the parenthesis some time, Only in function definitions, or also in function calls? then go back and realize that it's > silly that I have to since I don't with classes. The parallel is invalid. To repeat what I said in my initial response and what others have said: the header in a def statement shows how to call the function (the signature). The header in a class statement does no such thing. The () in a def statement represent the call operator. The () in a class statement do not. They serve a visual grouping and subordination purpose. "class mystring(str)" does not say anything about how to use mystring as a callable. > Is there really a strong case against it Yes, as has been presented before, but ignored by proponents. For one: I believe that omitting () in def will encourage people to even more often omit () in calls, when needed, and that is BAD. I consider this a killer argument against it. > than just "it's not pure" ? I have seen this too often. Practical arguments against a proposal are either ignored or wrongly dismissed as 'purity arguments'. To me, this makes the discussion useless.
-- Terry Jan Reedy From tjreedy at udel.edu Fri Apr 1 15:55:38 2016 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 1 Apr 2016 15:55:38 -0400 Subject: [Python-ideas] Provide __main__ for datetime and time In-Reply-To: <56FEB7A2.5020902@gmail.com> References: <7DDCB676-2AA3-42AB-807C-D7C1F0BA293C@yahoo.com> <-7249072436132249471@unknownmsgid> <56F9514F.2090304@gmail.com> <56FEB7A2.5020902@gmail.com> Message-ID: Note: new subjects should be posted as new threads. This was posted as a response in the unrelated "`to_file()` method for strings" thread and will not be seen by anyone who has killed that thread or who has threads collapsed. On 4/1/2016 2:02 PM, Michel Desmoulin wrote: > As with my previous email about __main__ for random, uuid and os, I wish > to suggest a similar __main__ for datetime. File __main__ is for packages. I believe none of the modules you mention are packages, so I presume you mean adding within the module something like def main(args): ... if __name__ == '__main__': from sys import argv main(argv[1:]) or the equivalent in C. > The target audience may not > be the same, so I'm making it a different proposal. > > E.g.: > > python -m datetime now => print(str(datetime.datetime.now())) > python -m datetime utcnow => print(str(datetime.datetime.utcnow())) > python -m time epoch => print(time.time()) > > python -m datetime now "%d/%m/%Y" => > print(str(datetime.datetime.now().strftime("%d/%m/%Y"))) > python -m datetime utcnow "%d/%m/%Y" => > print(str(datetime.datetime.utcnow().strftime("%d/%m/%Y"))) What is the particular motivation for this package? Should we add a command line interface to math? So that python -m math func arg => print(func(arg)) ? I am wondering what is or should be the general policy on the subject. How easy is this for C-coded modules?
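Filled out, the pattern above might look like the following for datetime. The subcommands mirror Michel's examples; this is only an illustrative sketch, not an actual stdlib interface:

```python
# Hypothetical module-level CLI for datetime, following the
# "def main(args) ... if __name__ == '__main__'" pattern above.
import sys
import datetime

def main(args):
    command = args[0] if args else "now"
    if command == "now":
        value = datetime.datetime.now()
    elif command == "utcnow":
        value = datetime.datetime.utcnow()
    else:
        sys.exit("unknown command: %r" % command)
    if len(args) > 1:                 # optional strftime format string
        print(value.strftime(args[1]))
    else:
        print(value)

if __name__ == '__main__':
    main(sys.argv[1:])
```

Invoked as ``python -m datetime now "%d/%m/%Y"``, this would print the current date in the requested format.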
-- Terry Jan Reedy From tjreedy at udel.edu Fri Apr 1 15:57:49 2016 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 1 Apr 2016 15:57:49 -0400 Subject: [Python-ideas] Add __main__ for uuid, random and urandom In-Reply-To: <56FEB2B9.4020405@gmail.com> References: <56FEB2B9.4020405@gmail.com> Message-ID: On 4/1/2016 1:41 PM, Michel Desmoulin wrote: > It's dangerous to talk about a new feature the 1st of April, so I'll > start by saying this one is not a joke. You should not have posted this as a followup on the joke thread '-). See my response to your separate similar proposal. -- Terry Jan Reedy From jsbueno at python.org.br Fri Apr 1 16:35:25 2016 From: jsbueno at python.org.br (Joao S. O. Bueno) Date: Fri, 1 Apr 2016 17:35:25 -0300 Subject: [Python-ideas] Add __main__ for uuid, random and urandom In-Reply-To: References: <56FEB2B9.4020405@gmail.com> Message-ID: On 1 April 2016 at 16:07, Victor Stinner wrote: > Hi, > > 2016-04-01 19:41 GMT+02:00 Michel Desmoulin : >> I read recently a proposal to allow md5 hashing doing python -m hashlib >> md5 filename. >> >> I think it's a great idea, and that can extend this kind of API to other >> parts of Python such as: >> >> python -m random randint 0 10 => print(random.randint(0, 10)) > > How many times did you need this feature recently? I don't recall > having to generate a random number in a range. Well...actually, really recently (Tuesday), I did python3 -c "__import__('calendar').calendar(2016)" And just when this thread started here, I recalled it was possible to do just python -m calendar --------- That is an anecdote related to the features in question, and I myself can't decide if it is a point for or against them :-) I suppose they'd be "nice to haves". But maybe, instead of sprinkling 4 - 5 lines of code in "__main__" files everywhere, would it not make sense to put all of it in a single place? Maybe shutil itself?
So that python -m shutil.uuid, python -m shutil.randint, and so on would each do their thing? js -><- From ethan at stoneleaf.us Fri Apr 1 16:42:35 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Fri, 01 Apr 2016 13:42:35 -0700 Subject: [Python-ideas] Add __main__ for uuid, random and urandom In-Reply-To: References: <56FEB2B9.4020405@gmail.com> Message-ID: <56FEDD3B.5020907@stoneleaf.us> On 04/01/2016 01:35 PM, Joao S. O. Bueno wrote: > Well...actually, really recently (Tuesday), I did > python3 -c "__import__('calendar').calendar(2016)" Wouldn't have python3 -c "import calendar; calendar(2016)" been clearer? and easier to type? :) -- ~Ethan~ From breamoreboy at yahoo.co.uk Fri Apr 1 18:44:33 2016 From: breamoreboy at yahoo.co.uk (Mark Lawrence) Date: Fri, 1 Apr 2016 23:44:33 +0100 Subject: [Python-ideas] Provide __main__ for datetime and time In-Reply-To: <56FEB7A2.5020902@gmail.com> References: <7DDCB676-2AA3-42AB-807C-D7C1F0BA293C@yahoo.com> <-7249072436132249471@unknownmsgid> <56F9514F.2090304@gmail.com> <56FEB7A2.5020902@gmail.com> Message-ID: On 01/04/2016 19:02, Michel Desmoulin wrote: > As with my previous email about __main__ for random, uuid and os, I wish > to suggest a similar __main__ for datetime. The target audience may not > be the same, so I'm making it a different proposal. > > E.g.: > > python -m datetime now => print(str(datetime.datetime.now())) > python -m datetime utcnow => print(str(datetime.datetime.utcnow())) > python -m time epoch => print(time.time()) > > python -m datetime now "%d/%m/%Y" => > print(str(datetime.datetime.now().strftime("%d/%m/%Y"))) > python -m datetime utcnow "%d/%m/%Y" => > print(str(datetime.datetime.utcnow().strftime("%d/%m/%Y"))) Not very funny for 1st April. Would you care to have another go? -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language.
Mark Lawrence From breamoreboy at yahoo.co.uk Fri Apr 1 18:47:44 2016 From: breamoreboy at yahoo.co.uk (Mark Lawrence) Date: Fri, 1 Apr 2016 23:47:44 +0100 Subject: [Python-ideas] Add __main__ for uuid, random and urandom In-Reply-To: <56FEDD3B.5020907@stoneleaf.us> References: <56FEB2B9.4020405@gmail.com> <56FEDD3B.5020907@stoneleaf.us> Message-ID: On 01/04/2016 21:42, Ethan Furman wrote: > On 04/01/2016 01:35 PM, Joao S. O. Bueno wrote: > >> Well...actually, really recently (Tuesday), I did >> python3 -c "__import__('calendar').calendar(2016)" > > Wouldn't have > > python3 -c "import calendar; calendar(2016)" > > been clearer? and easier to type? :) > > -- > ~Ethan~ > My understanding is that this kind of construction does not make you as much money when you are a consultant, as opposed to a mere programmer or similar. -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence From jsbueno at python.org.br Fri Apr 1 19:10:51 2016 From: jsbueno at python.org.br (Joao S. O. Bueno) Date: Fri, 1 Apr 2016 20:10:51 -0300 Subject: [Python-ideas] Add __main__ for uuid, random and urandom In-Reply-To: References: <56FEB2B9.4020405@gmail.com> <56FEDD3B.5020907@stoneleaf.us> Message-ID: On 1 April 2016 at 19:47, Mark Lawrence via Python-ideas wrote: > On 01/04/2016 21:42, Ethan Furman wrote: >> >> On 04/01/2016 01:35 PM, Joao S. O. Bueno wrote: >> >>> Well...actually, really recently (Tuesday), I did >>> python3 -c "__import__('calendar').calendar(2016)" >> >> >> Wouldn't have >> >> python3 -c "import calendar; calendar(2016)" >> >> been clearer? and easier to type? :) >> >> -- >> ~Ethan~ >> It would, but python3 -c "import calendar; calendar.calendar(2016)" not so much easier to type. 
The reason, however, is that I usually think of "I have the right to use one single expression when using python -c" and not "a single line" > > My understanding is that this kind of construction does not make you as much > money when you are a consultant, as opposed to a mere programmer or similar. Or that ^ :-) Anyway - back to the thread - it does not seem a bad idea to me at all. > > -- > My fellow Pythonistas, ask not what our language can do for you, ask > what you can do for our language. > > Mark Lawrence > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From greg.ewing at canterbury.ac.nz Fri Apr 1 20:27:48 2016 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 02 Apr 2016 13:27:48 +1300 Subject: [Python-ideas] Decorators for variables In-Reply-To: References: Message-ID: <56FF1204.4090404@canterbury.ac.nz> Chris Angelico wrote: > var = some_value > var = decorator(var) > > with this: > > var = decorator(some_value) > > as in your example. Decorator syntax buys us nothing above this. It would if it made the name of the variable available to the decorator: @decorator var = value becomes var = decorator("var", value) -- Greg From alexander.belopolsky at gmail.com Fri Apr 1 20:49:25 2016 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Fri, 1 Apr 2016 20:49:25 -0400 Subject: [Python-ideas] Provide __main__ for datetime and time In-Reply-To: <56FEB7A2.5020902@gmail.com> References: <7DDCB676-2AA3-42AB-807C-D7C1F0BA293C@yahoo.com> <-7249072436132249471@unknownmsgid> <56F9514F.2090304@gmail.com> <56FEB7A2.5020902@gmail.com> Message-ID: On Fri, Apr 1, 2016 at 2:02 PM, Michel Desmoulin wrote: > As with my previous email about __main__ for random, uuid and os, I wish > to suggest a similar __main__ for datetime.
The target audience may not > be the same, so I'm making it a different proposal. > > E.g.: > > python -m datetime now => print(str(datetime.datetime.now())) > python -m datetime utcnow => print(str(datetime.datetime.utcnow())) > +1 In fact, I would not mind seeing a fairly complete GNU date utility [1] reimplementation as datetime.__main__. [1] http://www.gnu.org/software/coreutils/manual/html_node/date-invocation.html#date-invocation -------------- next part -------------- An HTML attachment was scrubbed... URL: From stephen at xemacs.org Sat Apr 2 01:09:30 2016 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Sat, 2 Apr 2016 14:09:30 +0900 Subject: [Python-ideas] Add __main__ for uuid, random and urandom In-Reply-To: References: <56FEB2B9.4020405@gmail.com> <56FEDD3B.5020907@stoneleaf.us> Message-ID: <22271.21514.582250.114454@turnbull.sk.tsukuba.ac.jp> Joao S. O. Bueno writes: > Anyway - back to the thread - it does not seem a bad idea to me at all. I think it's a waste of time: an invitation to endless bikeshedding and a slippery slope to very half-baked attempts at full-featured utilities in the module's "__main__" section. It won't be consistent, as many modules already have test code in there, although I suppose you could move that functionality to a "test" command (at the expense of breaking existing "install and test" scripts). Finally, to the extent that the feature itself is pretty consistently present, people are going to demand that their favorite options be one-linable, and that pretty much every module grow one-line-ability. Whatever happened to "not every 3-liner need be builtin"? The suggested scripts also seem, uh, of incredibly marginal usefulness. Except for calendar, but there, the desirable set of options is huge (did you know that many Japanese like their weeks to start with Monday, for example? and that some people care about the Mayan calendar? or of more relevance, different base years -- in Japan it's Year 28 of the Heisei Emperor).
Alexander was right as to the ultimate goal (full implementations of POSIX date or GNU calendar plus whatever extensions might seem useful). But if you do more than one or two of those, you're going to end up with a confusing mishmash of CLI syntaxes, with a long script invocation: "python -m calendar ...". That said, your earlier idea taken one step farther seems like the best way to go: create a module oneline[1], put the functionality in there, put it on PyPI, and you can start lobbying for stdlib inclusion in 2018. Or better yet, create a pysh script[2] so you can omit the "-m". An advantage to this is that you can create an API, a module-level variable such as "one_line_subcommands", which would contain a dict of name-function pairs that pysh could look for and run. You could also have a global directory in pysh so that pysh can override those stubborn module maintainers who won't add your favorite one-liner to __main__. The best idea of all, perhaps, is # echo >> /etc/shells /usr/bin/python # chsh -s /usr/bin/python (Sorry, Windows users, I don't know the equivalent there.) Or make that ipython for even more shellshocking convenience. Footnotes: [1] Too bad it can't be called "1line". "oneln" is a little shorter. [2] ISTR that name is already taken but can't think of a better one. "py1line" maybe? From random832 at fastmail.com Sat Apr 2 01:17:29 2016 From: random832 at fastmail.com (Random832) Date: Sat, 02 Apr 2016 01:17:29 -0400 Subject: [Python-ideas] Add __main__ for uuid, random and urandom In-Reply-To: <22271.21514.582250.114454@turnbull.sk.tsukuba.ac.jp> References: <56FEB2B9.4020405@gmail.com> <56FEDD3B.5020907@stoneleaf.us> <22271.21514.582250.114454@turnbull.sk.tsukuba.ac.jp> Message-ID: <1459574249.722574.566373482.1F56B9A6@webmail.messagingengine.com> On Sat, Apr 2, 2016, at 01:09, Stephen J. Turnbull wrote: > [1] Too bad it can't be called "1line". "oneln" is a little shorter. 
It's inconvenient to import a module starting with a digit, but not impossible (and it works fine with -m) From ncoghlan at gmail.com Sat Apr 2 01:29:19 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 2 Apr 2016 15:29:19 +1000 Subject: [Python-ideas] Fwd: Make parenthesis optional in parameterless functions definitions In-Reply-To: <56FE876B.2020009@sdamon.com> References: <56FD5EC7.7060909@sdamon.com> <8537r69qrj.fsf@benfinney.id.au> <56FE876B.2020009@sdamon.com> Message-ID: On 2 April 2016 at 00:36, Alexander Walters wrote: > I said earlier that if this suggestion was made in 1991 it should have been > accepted to make functions more consistent with classes. I have changed my > mind; in 1991 classes should have been corrected to always require parens. In fact, the one change made in this area since then was to *permit* empty parens on class definitions back in Python 2.5: https://hg.python.org/cpython/rev/a0d3f773543d Prior to that, the empty parens were mandatory for functions and explicitly disallowed for classes. Making them mandatory for classes would break too much code for not enough benefit, while making them optional for functions would introduce an additional stylistic choice with no demonstrable benefit to code maintainability, and a clear disadvantage in creating a new opportunity for style inconsistencies. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From k7hoven at gmail.com Sat Apr 2 02:37:39 2016 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Sat, 2 Apr 2016 09:37:39 +0300 Subject: [Python-ideas] Add __main__ for uuid, random and urandom In-Reply-To: <56FEB2B9.4020405@gmail.com> References: <56FEB2B9.4020405@gmail.com> Message-ID: On Fri, Apr 1, 2016 at 8:41 PM, Michel Desmoulin wrote: > It's dangerous to talk about a new feature the 1st of April, so I'll > start by saying this one is not a joke. > > I read recently a proposal to allow md5 hashing doing python -m hashlib > md5 filename. 
> > I think it's a great idea, and that can extend this kind of API to other > parts of Python such as: > > python -m random randint 0 10 => print(random.randint(0, 10)) > python -m random urandom 10 => print(os.urandom(10)) > python -m uuid uuid4 => print(uuid.uuid4().hex) > python -m uuid uuid3 => print(uuid.uuid3().hex) > > This brings back memories :). Back when I wasn't using Python yet, I ended up needing some stuff in a shell script that was available in Python. Then, as I basically did not know any Python, I had to (1) figure out how to do that in Python, and (2) how the heck to get the result cleanly out of there. Especially step 2 may have been a waste of time, even if I did eventually find out that I could do python -c and that I could do stuff like "import foo; print bar(foo)". I ended up writing a tiny bash script that did the "import foo; print " part for me so I only had to do the `bar(foo)` part. If I had had to install a package, I probably would have just found another solution. How about something like python --me module expression_in_module_namespace python --me random "randint(0,10)" or python --me module expression_assuming_module_is_imported python --me random "random.randint(0,10)" Or even python -e "random.randint(0,10)" which would automatically import stdlib modules if their names appear in the expression. - Koos -------------- next part -------------- An HTML attachment was scrubbed...
URL: From random832 at fastmail.com Sat Apr 2 04:22:15 2016 From: random832 at fastmail.com (Random832) Date: Sat, 02 Apr 2016 04:22:15 -0400 Subject: [Python-ideas] Add __main__ for uuid, random and urandom In-Reply-To: References: <56FEB2B9.4020405@gmail.com> Message-ID: <1459585335.1521491.566423210.2C35B7FD@webmail.messagingengine.com> On Sat, Apr 2, 2016, at 02:37, Koos Zevenhoven wrote: > python -e "random.randint(0,10)"

#!/usr/bin/env python3
import sys

class magicdict(dict):
    def __getitem__(self, x):
        try:
            return super().__getitem__(x)
        except KeyError:
            try:
                mod = __import__(x)
                self[x] = mod
                return mod
            except ImportError:
                raise KeyError

g = magicdict()
for arg in sys.argv[1:]:
    try:
        p, obj = True, eval(arg, g)
    except SyntaxError:
        p = False
        exec(arg, g)
    if p:
        sys.displayhook(obj)

Handling modules inside packages is left as an exercise for the reader. From pavol.lisy at gmail.com Sat Apr 2 05:16:08 2016 From: pavol.lisy at gmail.com (Pavol Lisy) Date: Sat, 2 Apr 2016 11:16:08 +0200 Subject: [Python-ideas] Package reputation system In-Reply-To: <56FE88D6.7080004@sdamon.com> References: <56F5B3F7.40502@gmail.com> <56FA33C0.4000902@mail.de> <56FA4E31.3010908@gmail.com> <56FA5F7E.3070001@gmail.com> <56FA656B.5070707@mail.de> <56FAB7C4.6060904@gmail.com> <4DD01140-B429-4FD0-8164-C965210A2282@selik.org> <56FE88D6.7080004@sdamon.com> Message-ID: 2016-03-29 22:51 GMT+02:00, Michael Selik : [...] > Back at Georgia Tech, my professor [0] once told me that the way to get rich > is to invent an index. 2016-04-01 16:42 GMT+02:00, Alexander Walters : > On 4/1/2016 09:07, Nick Coghlan wrote: [...] >> By contrast, I think Python itself covers too many domains for a >> common rating system to be feasible - "good for education" is not the >> same as "good for sysadmin tasks" is not the same as "good for data >> analysis" is not the same as "good for network service development", >> etc. >> >> Cheers, >> Nick.
> Not that this was the original proposal, but there can be such a thing > as a universal 'bad' package, though. So about the only thing that a > universal package rating system can do effectively is shame developers. > I don't think we want that. 1. There is not only a good-bad dimension: beauty-ugly, simple-complex, flat-nested (and others from PEP 20); trusted-untrusted is just one among them. 2. Art critics do not exist to shame artists. Constructive criticism could help authors too. It does not benefit only "customers". One who wants to invent an index to get rich could probably invent this job (Python critic) too. 3. PyPI has numbers of downloads per day, week and month.
If there is a public API to the data, one could make graphs, trends, etc. and publish them. If they are nice, they could probably be included on the PyPI site too. From rosuav at gmail.com Sat Apr 2 08:28:20 2016 From: rosuav at gmail.com (Chris Angelico) Date: Sat, 2 Apr 2016 23:28:20 +1100 Subject: [Python-ideas] Package reputation system In-Reply-To: References: <56F5B3F7.40502@gmail.com> <56FA33C0.4000902@mail.de> <56FA4E31.3010908@gmail.com> <56FA5F7E.3070001@gmail.com> <56FA656B.5070707@mail.de> <56FAB7C4.6060904@gmail.com> <4DD01140-B429-4FD0-8164-C965210A2282@selik.org> <56FE88D6.7080004@sdamon.com> Message-ID: On Sat, Apr 2, 2016 at 8:16 PM, Pavol Lisy wrote: > 2. Art critics are not to shame artists. Constructive criticism could > help authors too. It is not benefit only for "customers". One who > wants invent index to get rich could probably invent this job (Python > critic) too. A reputation system is simply a score, though. If you are low on the rankings, you don't automatically know why - and you can't look at the high ranked results to figure out what to do better. That's why the inventor of the index gets a lot of money - the consultancy fees to help people get higher on the index will be gladly paid, because there's no other way to learn. > 3. pypi has number of downloads for day, week and month. If there is > public api to data - one could make graphs, trends, etc and public > them. If they will be nice then they probably could be included on > pypi site too. That's popularity, which is different again from reputation. It suffers from several brutal limitations: 1) By definition, it cannot count *usage*. If I download something only to find that it completely fails me, it's still a download; if I download something and use it every day for years, it's still only one download. 2) It can only count downloads from the ultimate origin. If a Python package gets included in a downstream collection such as the Debian repositories, people who apt-get the package won't show up in the stats. 3) Installations cascade into their dependencies. Does that count? What if you 'pip freeze' and now explicitly list all those dependencies in your requirements.txt - should they NOW count? 4) Even if you solve all of those problems, popularity begets popularity in many ways. Two competing packages may fight for a while, but one of them will "win", and the other languish - and it's virtually impossible for a new competitor to get to the point where it can be properly respected. Choosing which thing to use based on how many other people use it is sometimes important (the network effect - using Python rather than Bob's Obscure Scripting Language means it's a lot easier to find collaborators, no matter how technically superior BOSL might be), but it's almost completely orthogonal to quality. ChrisA From tjreedy at udel.edu Sat Apr 2 15:13:24 2016 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 2 Apr 2016 15:13:24 -0400 Subject: [Python-ideas] Add __main__ for uuid, random and urandom In-Reply-To: References: <56FEB2B9.4020405@gmail.com> Message-ID: On 4/2/2016 2:37 AM, Koos Zevenhoven wrote: > python -e "random.randint(0,10)" > > which would automatically import stdlib if their names appear in the > expression.
I like this much better than cluttering an arbitrary selection of modules with main functions with an arbitrary selection of functions whose values are printed. -- Terry Jan Reedy From steve at pearwood.info Sat Apr 2 23:09:21 2016 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 3 Apr 2016 13:09:21 +1000 Subject: [Python-ideas] Decorators for variables In-Reply-To: <56FEB375.7010609@gmail.com> References: <56FEB375.7010609@gmail.com> Message-ID: <20160403030920.GZ12526@ando.pearwood.info> On Fri, Apr 01, 2016 at 07:44:21PM +0200, Michel Desmoulin wrote: > Not saying I like the proposal, but you can argue against regular > decorators the same way: > > @foo > def bar(): > pass > > Is just: > > bar = foo(bar) Not quite. It's actually: def bar(): pass bar = foo(bar) which is *three* uses of the same name, two of which may be a long way from the first. As you say: > But, I think the benefit for @decorator on functions is mainly because a > function body is big, That is certainly part of the justification for @ syntax. It might help to remember that decorators themselves have been possible in Python going all the way back to Python 1.5, at least, if not earlier, and the first three decorators in the standard library (property, classmethod and staticmethod) were added a few releases before decorator syntax using @. (If I remember correctly, property etc were added in 2.2 but @ syntax wasn't added until 2.4.) So the decorator concept had many years to prove itself before being given special syntax. The two reasons for adding extra syntax were: (1) to keep the decoration near the function declaration; and (2) to avoid having to repeat the function name three times. > and this way we can read the decorator next to the > function signature while on a variable, this just add another way to > call a function on a variable. I'm afraid I don't understand what you mean by this. What does "while on a variable" mean here?
-- Steve From steve at pearwood.info Sun Apr 3 00:02:15 2016 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 3 Apr 2016 14:02:15 +1000 Subject: [Python-ideas] Decorators for variables In-Reply-To: References: <56FEB375.7010609@gmail.com> Message-ID: <20160403040214.GA12526@ando.pearwood.info> On Fri, Apr 01, 2016 at 08:08:54PM +0200, Matthias welp wrote: > > So instead of > > > > a = Char(length=10, value='empty') > > > > you want > > > > @Char(length=10) > > a = 'empty' > > > > ? > > If possible, yes. So that there is a standardized way to access changing > variables, or to put limits on the content of the variable, But decorating a *name binding* isn't going to do that. All it will do is limit the *initial value*, exactly as the function call does. After calling a = Char(length=10, value='empty') the name "a" is bound to the result of Char(...), whatever that happens to be. But that doesn't change the behaviour of the *name* "a", it only sets the value it is bound to. Nothing stops anyone from saying: a = 42 and re-binding the name to another value which is no longer a Char. Python has no default support for running custom code when binding arbitrary names to a value. To get this sort of thing to work, you are limited to attributes, using the descriptor protocol. I'm not going to explain descriptors now, you can google them, but property is a descriptor. So let's imagine that Python allows the @ syntax as you request, and go through the cases to see what that would imply. For local variables inside functions, or global top-level module variables, it doesn't give you any interesting power at all. All you have is an alternative syntax: @spam @eggs @cheese x = 999 is exactly the same as x = spam(eggs(cheese(999))) right now. You don't even save any characters: 28 (including newlines) for both. But this doesn't give us anything new and exciting, since x is now just a regular variable that can be replaced with some other value.
So in this scenario, this sounds boring -- it gives you nothing you don't already have. Inside a class, we have the power of descriptors available to us, so we can use them for computed attributes. (That's how property works, among many others.) So suppose we have: class MyClass(object): @decorate x = 999 instance = MyClass() Now instance.x can be a descriptor, which means that it can enforce type and value validation rules, or logging, or whatever amazing functionality you want to add. This is good. But you can already do this. You just have to write: class MyClass(object): x = decorate(999) instead. If decorate() returns a descriptor, it returns a descriptor whatever syntax you use. What benefit do you gain? Well, in the function and class decorator case, you gain the benefit that the decoration is close to the class or function header, and you don't have to write the name three times: class MyClass(object): def method(self): [... many, many, many lines of code ...] method = decorate(method) But this isn't an advantage in the case of the variable: class MyClass(object): x = decorate(value) You only have to write the name once, and the decoration is not going to be far away. So just like the global variable case, there's no advantage to @decorate syntax, regardless of whether you are in a class or not. -- Steve From steve at pearwood.info Sun Apr 3 00:19:22 2016 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 3 Apr 2016 14:19:22 +1000 Subject: [Python-ideas] Decorators for variables In-Reply-To: References: Message-ID: <20160403041922.GB12526@ando.pearwood.info> On Fri, Apr 01, 2016 at 07:49:41PM +0200, Matthias welp wrote: > > > tldr > > Check out how Django dealt with this. And SQLAlchemy > > Do their solutions satisfy > > The thing I'm missing in those solutions is how it isn't chainable. 
If I > would want something that uses access logging (like what Django and > SQLAlchemy are doing), and some 'mixin' for that variable to prevent cycles, > then that would be hard. The only other way is using function > composition, and that can lead to statements that are too long to read > comfortably. I'm not sure how function composition is harder to read than decorator syntax: @logging @require(int) @ensure(str) @validate @spam @eggs x = 999 versus: x = logging( require(int)( ensure(str)( validate( spam( eggs( 999 )))))) is not that different. Sure, you have a few extra brackets, but you have fewer @ symbols. And if you're going to do that sort of thing a lot, you just need a couple of helper functions: def compose(f, g): # Return a new function which applies f(g(args)). def composed(*args, **kwargs): return f(g(*args, **kwargs)) return composed def helper(use_logging=False, requires=None, ensures=None, use_validate=False, use_spam=False, use_eggs=False): funcs = [] if use_logging: funcs.append(logging) if requires: funcs.append(require(requires)) if ensures: funcs.append(ensure(ensures)) if use_validate: funcs.append(validate) # likewise for spam and eggs if not funcs: # Identity function returns whatever it is given.
return lambda arg: arg else: f = funcs[0] for g in funcs[1:]: f = compose(f, g) return f all_validation = helper(True, int, str, True, True, True) x = all_validation(999) -- Steve From steve at pearwood.info Sun Apr 3 00:22:25 2016 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 3 Apr 2016 14:22:25 +1000 Subject: [Python-ideas] Fwd: Make parenthesis optional in parameterless functions definitions In-Reply-To: <56FEB43E.40103@gmail.com> References: <20160331175741.GU12526@ando.pearwood.info> <56FE602F.90002@gmail.com> <20160401135446.GY12526@ando.pearwood.info> <56FEB43E.40103@gmail.com> Message-ID: <20160403042224.GC12526@ando.pearwood.info> On Fri, Apr 01, 2016 at 07:47:42PM +0200, Michel Desmoulin wrote: > >> functional paradigme and Poo and immutability and mutability, etc. > > > > "Paradigm". > > > > You might not be aware that "poo" is an English euphemism for excrement, > > normally used for and by children. So I'm completely confused by what > > you mean by "and Poo". > > > > Those are both French mistakes. > > paradigme has an "e" in French, while OOP is POO. I wrote the email too fast. Today I learned something new! Thank you. > > You might not agree with my personal experience, but you shouldn't just > > dismiss it or misrepresent it as a "moral stand". My argument cuts right > > to the core of the argument that making parens optional helps -- my > > experience is that it *doesn't help*, it actually HURTS. > > But as I said earlier, I now agree with you. I had missed that post. Thanks for the comments.
-- Steve From ethan at stoneleaf.us Sun Apr 3 01:47:26 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Sat, 02 Apr 2016 22:47:26 -0700 Subject: [Python-ideas] Decorators for variables In-Reply-To: <20160403040214.GA12526@ando.pearwood.info> References: <56FEB375.7010609@gmail.com> <20160403040214.GA12526@ando.pearwood.info> Message-ID: <5700AE6E.4090206@stoneleaf.us> On 04/02/2016 09:02 PM, Steven D'Aprano wrote: > On Fri, Apr 01, 2016 at 08:08:54PM +0200, Matthias welp wrote: > >>> So instead of >>> >>> a = Char(length=10, value='empty') >>> >>> you want >>> >>> @Char(length=10) >>> a = 'empty' >>> >>> ? >> >> If possible, yes. So that there is a standardized way to access changing >> variables, or to put limits on the content of the variable, Steven, the above was a short cut of a name-binding inside a class definition. Sorry for the confusion. -- ~Ethan~ From boekewurm at gmail.com Sun Apr 3 02:05:36 2016 From: boekewurm at gmail.com (Matthias welp) Date: Sun, 3 Apr 2016 08:05:36 +0200 Subject: [Python-ideas] Decorators for variables In-Reply-To: <20160403040214.GA12526@ando.pearwood.info> References: <56FEB375.7010609@gmail.com> <20160403040214.GA12526@ando.pearwood.info> Message-ID: > But that doesn't change the behaviour of the *name* "a", it > only sets the value it is bound to. Nothing stops anyone from saying: > > a = 42 > > and re-binding the name to another value which is no longer a Char. > Python has no default support for running custom code when binding > arbitrary names to a value. To get this sort of thing to work, you are > limited to attributes, using the descriptor protocol. This is exactly my point. When they implemented descriptors (2.2) they did that to add to the new-style class system. I think they did a great job, but namespaces were overlooked in my opinion. Why should we be limited to class attributes when using descriptors? Any namespace which contains a '.__dict__' content mapping should be able to hold descriptors in my opinion. 
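For example (a quick sketch, with a made-up `Ten` descriptor, not from the thread): the exact same descriptor object is live as a class attribute but completely inert in a module or local namespace:

```python
class Ten:
    # A minimal descriptor: __get__ is only consulted by the
    # attribute-lookup machinery on classes and instances.
    def __get__(self, obj, objtype=None):
        return 10

class C:
    x = Ten()   # class attribute: access goes through __get__

c = C()
print(c.x)      # 10 -- the descriptor fired

x = Ten()       # module/function namespace: just a name bound to an object
print(x)        # prints the Ten instance itself; __get__ never runs
```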
If this has been discussed before, then please link me to the relevant discussion, I'd love to read the points made. > But this isn't an advantage in the case of the variable Assigning variables from function results is fairly common. When chaining function calls to get the desired behaviour of the variable, it will get confusing which part is the 'value' part, and which part is the 'behaviour' part, apart from namings: var = logging( require(int)( factorize( 4 ))) would be clearer if you wrote it this way: @logging @require(int) var = factorize(4) Something I just thought about: currently you can use the property() call to make an attribute have descriptor properties. This may be somewhat controversial, but maybe limit this to decorators only? While keeping the '.__dict__' override method open, the way things work currently won't change that much, but it will make the assignment of attributes or variables with descriptor properties a lot more intuitive, as you do not set a *name* to a variable and then can undo that just a while later: var = property(setter, getter, deleter, docs) var = 20 currently changes behaviour depending on what kind of scope it is located in (class description, any other scope), while decorators (for functions at least) work in every scope I can think of. I think that is strange, and that it should just be the same everywhere. Using decorators here could be a very nice and interesting choice. -------------- next part -------------- An HTML attachment was scrubbed...
URL: From rosuav at gmail.com Sun Apr 3 02:32:56 2016 From: rosuav at gmail.com (Chris Angelico) Date: Sun, 3 Apr 2016 16:32:56 +1000 Subject: [Python-ideas] Decorators for variables In-Reply-To: References: <56FEB375.7010609@gmail.com> <20160403040214.GA12526@ando.pearwood.info> Message-ID: On Sun, Apr 3, 2016 at 4:05 PM, Matthias welp wrote: > While keeping the '.__dict__' override method open, the way things work > currently won't change that much, but it will make the assignment of > attributes or variables with descriptor properties a lot more intuitive, as > you do not set a *name* to a variable and then can undo that just a while > later: > > var = property(setter, getter, deleter, docs) > > var = 20 > > currently changes behaviour depending on what kind of scope it is located in > (class description, any other scope), while decorators (for functions at > least) work in every scope I can think of. I think that is strange, and that > it should just be the same everywhere. Can you explain - or, preferably, demonstrate - the difference you're talking about here? ChrisA From boekewurm at gmail.com Sun Apr 3 02:53:10 2016 From: boekewurm at gmail.com (Matthias welp) Date: Sun, 3 Apr 2016 08:53:10 +0200 Subject: [Python-ideas] Decorators for variables In-Reply-To: References: <56FEB375.7010609@gmail.com> <20160403040214.GA12526@ando.pearwood.info> Message-ID: > > var = property(setter, getter, deleter, docs) > > var = 20 > > Can you explain - or, preferably, demonstrate - the difference you're > talking about here? Sorry, that was untested code. My expectations of class definitions were wrong, as it does not actually change behaviour inside its own scope. I thought that, when you are defining a class, assigning a property value to an attribute would directly change that attribute's behaviour to include the descriptor properties of the property object assigned. My mistake.
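Concretely (a small sketch of what actually happens): even inside the class body, the name is just bound to the property object; the descriptor machinery only fires on attribute access through an instance (or the class) after the class is built:

```python
class C:
    p = property(lambda self: 'via descriptor')
    # Inside the class body itself, p is simply the property object:
    assert isinstance(p, property)

print(C().p)   # 'via descriptor' -- instance attribute access fires __get__
print(C.p)     # class access just returns the property object itself
```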
On 3 April 2016 at 08:32, Chris Angelico wrote: > On Sun, Apr 3, 2016 at 4:05 PM, Matthias welp wrote: > > While keeping the '.__dict__' override method open, the way things work > > currently won't change that much, but it will make the assignment of > > attributes or variables with descriptor properties a lot more intuitive, > as > > you do not set a *name* to a variable and then can undo that just a while > > later: > > > > var = property(setter, getter, deleter, docs) > > var = 20 > > > > currently changes behaviour depending on what kind of scope it is > located in > > (class description, any other scope), while decorators (for functions at > > least) work in every scope I can think of. I think that is strange, and > that > > it should just be the same everywhere. > > Can you explain - or, preferably, demonstrate - the difference you're > talking about here? > > ChrisA > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Sun Apr 3 03:20:13 2016 From: rosuav at gmail.com (Chris Angelico) Date: Sun, 3 Apr 2016 17:20:13 +1000 Subject: [Python-ideas] Decorators for variables In-Reply-To: References: <56FEB375.7010609@gmail.com> <20160403040214.GA12526@ando.pearwood.info> Message-ID: On Sun, Apr 3, 2016 at 4:53 PM, Matthias welp wrote: >> > var = property(setter, getter, deleter, docs) >> > var = 20 >> >> Can you explain - or, preferably, demonstrate - the difference you're >> talking about here? > > Sorry, that was untested code. My expectations of class definitions was > wrong, as it does not actually change behaviour inside it's own scope. 
I > thought that when you are defining a class, that when you assign a property > value to an attribute, that the attribute 'name value' will directly change > it's behaviour to include the descriptor properties of the property object > assigned. My mistake. > Ah. There is a significant difference between assignment within a class definition and assignment from a function _inside_ that class definition, but in any given scope, double assignment always does the same thing: last one wins. Which is a good thing, when it comes to the @property decorator: class LifeAndUniverse: @property def answer(self): return 42 @answer.setter def answer(self, value): print("No fair changing the answer!") @answer.deleter def answer(self): print("You just deleted.... everything.") Each function definition overwrites the previous "answer" with a new one, which (thanks to the way setter and deleter are implemented) incorporates the previous code, but nothing in Python mandates that. So is there anything left of the assignment-decorator proposal, or is it completely withdrawn? (I always like to read over even the bad proposals - there's often something good in them, Martin Farquhar Tupper's "Proverbial Philosophy" aside.) ChrisA From boekewurm at gmail.com Sun Apr 3 04:02:55 2016 From: boekewurm at gmail.com (Matthias welp) Date: Sun, 3 Apr 2016 10:02:55 +0200 Subject: [Python-ideas] Decorators for variables In-Reply-To: References: <56FEB375.7010609@gmail.com> <20160403040214.GA12526@ando.pearwood.info> Message-ID: > So is there anything left of the assignment-decorator proposal, or is > it completely withdrawn? I think this sums the current open ends up: - Namespace variables decoration was dismissed by one of Steven's posts, but is actually something that might be wanted (via being able to put descriptors into namespaces that have a __dict__ accessor (e.g. 
modules)) - Variable decoration can be more clear about descriptor/value difference at assignment - Giving property objects access to their variable's name (e.g. via __name__) like functions have would open up quite a bit of possibilities, and would mean decorators would get quite a bit more power than what they have. Something that I had said earlier, but what went on a sidepath - Decorators may directly *deep* set the behaviour of the variable, and with it set the further behaviour of the variable (in the same scope). Such that @decorator var = 20 var = 40 will not reset var to 40, but the var = 40 goes through the descriptor (if applied). -------------- next part -------------- An HTML attachment was scrubbed... URL: From stephen at xemacs.org Sun Apr 3 05:24:22 2016 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Sun, 3 Apr 2016 18:24:22 +0900 Subject: [Python-ideas] Decorators for variables In-Reply-To: <20160403030920.GZ12526@ando.pearwood.info> References: <56FEB375.7010609@gmail.com> <20160403030920.GZ12526@ando.pearwood.info> Message-ID: <22272.57670.441884.891709@turnbull.sk.tsukuba.ac.jp> Steven D'Aprano writes: > > and this way we can read the decorator next to the function > > signature while on a variable, this just add another way to call > > a function on a variable. > > I'm afraid I don't understand what you mean by this. What does > "while on a variable" mean here? Missing punctuation (see substring in square brackets): and this way we can read the decorator next to the function signature[. W]hile on a variable, this just add another way to call a function on a variable. So, it appears that he's proposing pure syntactic sugar with no semantic change at all. The syntactic effects are that decorator syntax separates function calls from the values they act on (-1 on that), and makes it appear that in Python a function can change the value of its formal argument (-1 on that, too). The latter characteristic may need some expansion. 
Of course Python has mutable objects which can be changed "inside" a function and that effect propagates to "outside" the function. But in the containing scope, while the object has mutated, the variable (more precisely, the name-object binding) has not. It seems to me that everything he's asked for can be accomplished by defining a descriptor class and assigning an instance of that class. I don't yet understand why that isn't good enough, unless it's just a matter of taste about the sweetness of syntax. From rosuav at gmail.com Sun Apr 3 05:57:46 2016 From: rosuav at gmail.com (Chris Angelico) Date: Sun, 3 Apr 2016 19:57:46 +1000 Subject: [Python-ideas] Decorators for variables In-Reply-To: References: <56FEB375.7010609@gmail.com> <20160403040214.GA12526@ando.pearwood.info> Message-ID: On Sun, Apr 3, 2016 at 6:02 PM, Matthias welp wrote: >> So is there anything left of the assignment-decorator proposal, or is >> it completely withdrawn? > > I think this sums the current open ends up: > > - Namespace variables decoration was dismissed by one of Steven's posts, but > is actually something that might be wanted (via being able to put > descriptors into namespaces that have a __dict__ accessor (e.g. modules)) > - Variable decoration can be more clear about descriptor/value difference at > assignment > - Giving property objects access to their variable's name (e.g. via > __name__) like functions have would open up quite a bit of possibilities, > and would mean decorators would get quite a bit more power than what they > have. > > Something that I had said earlier, but what went on a sidepath > - Decorators may directly *deep* set the behaviour of the variable, and with > it set the further behaviour of the variable (in the same scope). Such that > > @decorator > var = 20 > > var = 40 > > will not reset var to 40, but the var = 40 goes through the descriptor > (if applied). 
None of this is possible with decorators *per se*, as they simply mutate an object as it goes past. The magic you're seeing (and yearning for) comes from two places: functions have names (so the bit in "def SOMETHING(...)" actually affects the object constructed, not just the name it's bound to), and attribute access and the descriptor protocol. The latter is what makes @property work, but you don't need decorators to use descriptor protocol - in fact, it's crucial to the way bound methods are created. So, let's take your proposals one by one. > - Namespace variables decoration was dismissed by one of Steven's posts, but > is actually something that might be wanted (via being able to put > descriptors into namespaces that have a __dict__ accessor (e.g. modules)) You'll need to elaborate on exactly what this can and can't do. > - Variable decoration can be more clear about descriptor/value difference at > assignment All Python names are in namespaces, so I'm not sure how this is different from the above. > - Giving property objects access to their variable's name (e.g. via > __name__) like functions have would open up quite a bit of possibilities, > and would mean decorators would get quite a bit more power than what they > have. This is fundamentally not possible in the general case; however, you could easily make a special form of the @property decorator which *requires* that the attribute name be the same as the function name that implements it, and then it would work (because function names are attributes of function objects, while "the name this is bound to" is not an attribute of anything). It's already accessible, as ClassName.property_name.fget.__name__; you could make this more accessible or more discoverable by subclassing property. In fact, you can even go meta: >>> class property(property): ... @property ... def name(self): return self.fget.__name__ ... >>> class X: ... @property ... def foo(self): return 42 ...
>>> X.foo.name 'foo' However, this is vulnerable to the same thing that any other "what is my name" system has: you can rebind anything. >>> X.bar = X.foo >>> X.foo = 42 >>> X().foo 42 >>> X().bar 42 >>> X.foo.name Traceback (most recent call last): File "<stdin>", line 1, in <module> AttributeError: 'int' object has no attribute 'name' >>> X.bar.name 'foo' So there's fundamentally no way to say "what name is bound to me"; but you can quite effectively say "what is my canonical name", depending on a function to provide that name. ChrisA From steve at pearwood.info Sun Apr 3 06:47:53 2016 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 3 Apr 2016 20:47:53 +1000 Subject: [Python-ideas] Decorators for variables In-Reply-To: References: <56FEB375.7010609@gmail.com> <20160403040214.GA12526@ando.pearwood.info> Message-ID: <20160403104751.GD12526@ando.pearwood.info> On Sun, Apr 03, 2016 at 10:02:55AM +0200, Matthias welp wrote: > > So is there anything left of the assignment-decorator proposal, or is > > it completely withdrawn? > > I think this sums the current open ends up: > > - Namespace variables decoration was dismissed by one of Steven's posts, > but is actually something that might be wanted (via being able to put > descriptors into namespaces that have a __dict__ accessor (e.g. modules)) It's not that I dismissed the idea, but that such a thing is not possible today. Maybe you could hack up something tricky by replacing the module __dict__ with a subclass, or replacing the entire module object with some custom class instance, but that's messy and fragile, and I'm not sure that it would work for (say) globals inside functions. I think the idea of "namespace descriptors" is promising. But it probably needs a PEP and a significant amount of work to determine what actually is possible now and what needs support from the compiler. > - Variable decoration can be more clear about descriptor/value difference > at assignment I'm not sure what you mean by that.
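For what it's worth, the "replace the entire module object" half of that hack can be sketched today by assigning a ModuleType subclass to a module's __class__. It covers attribute access on the module object, though not bare-name globals inside the module itself; the names here are made up and this is only a toy:

```python
from types import ModuleType

class MagicModule(ModuleType):
    @property
    def answer(self):        # the descriptor lives on the module's *class*
        return 42

mod = ModuleType("demo")     # stand-in for a real imported module
mod.__class__ = MagicModule  # swap in the subclass after creation
print(mod.answer)            # 42 -- recomputed by the property on each access
```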
> - Giving property objects access to their variable's name (e.g. via > __name__) like functions have would open up quite a bit of possibilities, Such as what? > and would mean decorators would get quite a bit more power than what they > have. What extra power are you thinking of? > Something that I had said earlier, but what went on a sidepath > - Decorators may directly *deep* set the behaviour of the variable, and > with it set the further behaviour of the variable (in the same scope). Such > that > > @decorator > var = 20 > > var = 40 > > will not reset var to 40, but the var = 40 goes through the descriptor > (if applied). This functionality has nothing to do with decorators. If it were possible, if namespaces (other than classes) supported descriptors, then the decorator syntax doesn't add anything to this. You could write: var = magic(20) I'm not calling it "decorator" because the whole @ decorator syntax is irrelevant to this. With respect Matthias, I think you are conflating and mixing up the power of descriptors as applied to classes, and the separate and distinct powers of decorators. You don't need one to have the other. -- Steve From ericfahlgren at gmail.com Sun Apr 3 10:14:32 2016 From: ericfahlgren at gmail.com (Eric Fahlgren) Date: Sun, 3 Apr 2016 07:14:32 -0700 Subject: [Python-ideas] Decorators for variables In-Reply-To: References: <56FEB375.7010609@gmail.com> <20160403040214.GA12526@ando.pearwood.info> Message-ID: <0dcc01d18db3$25785300$7068f900$@gmail.com> On Saturday, April 02, 2016 23:33, Chris Angelico wrote: > On Sun, Apr 3, 2016 at 4:05 PM, Matthias welp wrote: > > currently changes behaviour depending on what kind of scope it is > > located in (class description, any other scope), while decorators (for > > functions at > > least) work in every scope I can think of. I think that is strange, > > and that it should just be the same everywhere. > > Can you explain - or, preferably, demonstrate - the difference you're talking about here? 
I can sort of see it, like this? class C(): def __init__(self): self.x = 99 @property def f(self): return self.x x = 101 @property def f(namespace): return namespace.x c = C() print(c.x) 99 print(c.f) 99 print(C.f.fget(c)) 99 print(x) 101 Here's the inconsistency: print(f) should implicitly bind to a "global namespace object". In other words, something like this: class GNS: def __getattr__(self, name): prop = globals().get(name) if isinstance(prop, property): return prop.fget(self) return prop global_namespace = GNS() print(x is global_namespace.x, "should be 'True'") True should be 'True' print(global_namespace.f) 101 From chris.barker at noaa.gov Sun Apr 3 18:43:21 2016 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Sun, 3 Apr 2016 18:43:21 -0400 Subject: [Python-ideas] `to_file()` method for strings In-Reply-To: References: <7DDCB676-2AA3-42AB-807C-D7C1F0BA293C@yahoo.com> <-7249072436132249471@unknownmsgid> <56F9514F.2090304@gmail.com> Message-ID: <8076709334203319565@unknownmsgid> > [1] https://code.google.com/archive/p/pyon/ is a project from several > years ago aimed at that task. Thanks for the link -- I'll check it out.
- Chris From python at 2sn.net Sun Apr 3 19:10:54 2016 From: python at 2sn.net (Alexander Heger) Date: Mon, 4 Apr 2016 09:10:54 +1000 Subject: [Python-ideas] Add __main__ for uuid, random and urandom In-Reply-To: References: <56FEB2B9.4020405@gmail.com> Message-ID: > FYI On Linux, you can use "cat /proc/sys/kernel/random/uuid" ;-) uuidgen -r From mike at selik.org Mon Apr 4 02:37:51 2016 From: mike at selik.org (Michael Selik) Date: Mon, 04 Apr 2016 06:37:51 +0000 Subject: [Python-ideas] Provide __main__ for datetime and time In-Reply-To: References: <7DDCB676-2AA3-42AB-807C-D7C1F0BA293C@yahoo.com> <-7249072436132249471@unknownmsgid> <56F9514F.2090304@gmail.com> <56FEB7A2.5020902@gmail.com> Message-ID: On Sat, Apr 2, 2016 at 1:49 AM Alexander Belopolsky < alexander.belopolsky at gmail.com> wrote: > > On Fri, Apr 1, 2016 at 2:02 PM, Michel Desmoulin < > desmoulinmichel at gmail.com> wrote: >> As with my previous email about __main__ for random, uuid and os, I wish >> to suggest a similar __main__ for datetime. The target audience may not >> be the same, so I'm making it a different proposal. >> >> E.G: >> >> python -m datetime now => print(str(datetime.datetime.now())) >> python -m datetime utcnow => print(str(datetime.datetime.utcnow())) >> > > +1 > > In fact, I would not mind seeing a fairly complete GNU date utility [1] > reimplementation as datetime.__main__. > > [1] > http://www.gnu.org/software/coreutils/manual/html_node/date-invocation.html#date-invocation > Seems like Microsoft is solving this problem for you. http://www.hanselman.com/blog/DevelopersCanRunBashShellAndUsermodeUbuntuLinuxBinariesOnWindows10.aspx -------------- next part -------------- An HTML attachment was scrubbed...
URL: From joshua.morton13 at gmail.com Wed Apr 6 05:38:35 2016 From: joshua.morton13 at gmail.com (Joshua Morton) Date: Wed, 06 Apr 2016 09:38:35 +0000 Subject: [Python-ideas] Dictionary views are not entirely 'set like' Message-ID: Let's start with a quick quiz: what is the result of each of the following (on Python3.5.x)? {}.keys() | set() # 1 set() | [] # 2 {}.keys() | [] # 3 set().union(set()) # 4 set().union([]) # 5 {}.keys().union(set()) # 6 If your answer was set([]), TypeError, set([]), set([]), set([]), AttributeError, then you were correct. That, to me, is incredibly unintuitive. Next up: {}.keys() == {}.keys() # 7 {}.items() == {}.items() # 8 {}.values() == {}.values() # 9 d = {}; d.values() == d.values() # 10 True, True, False, False. Numbers 1, 2, 4, 5 are expected behavior. 3 and 6 are not, and 7-10 are up for debate.[1] First things first, the behavior exhibited by #3 is a bug (or at least it probably should be), and one I'd be happy to fix. However, before doing that I felt it would be good to propose some questions and suggestions for how that might be done. There are, as far as I can tell, two reasons to use a MappingView: memory efficiency or auto-updating (a view will continue to mirror changes in the underlying object). I'll focus on the first because the second can conceivably be solved in other ways. Currently, if I want to union a dictionary's keys with a non-set iterable, I can do `{}.keys() | []`. This is a bug[2]: the set or-operator should only work on another set. That said, fixing this bug removes the ability to efficiently or keys with another iterable, leaving `set({}.keys()).update([])` for efficiency, or `set({}.keys()).union([])` for clarity. Fixing this is simply a matter of adding a `.union` method to the KeysView (and possibly to the Set abc as well), although that may not be something that is wanted.
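To make the asymmetry concrete, here is a small runnable demonstration (nothing beyond builtins):

```python
d = {'a': 1}

print(d.keys() | ['b'])        # {'a', 'b'} -- a view's | accepts any iterable
print(d.keys() & ['a', 'c'])   # {'a'} -- so do the other view set-operators

try:
    {'a'} | ['b']              # a real set is stricter:
except TypeError as err:
    print('set | list ->', err)  # its operators demand another set
```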
The issue, as far as I can tell, is whether we want people converting from MappingViews to "primitives" as soon as possible, or if we want to encourage people to use the views for as long as possible. There are arguments on both sides: encouraging people to use the views causes these objects to become more complex, introducing more surface area for bugs, more code to be maintained, etc. Further, there is currently one obvious way to do things: convert to a primitive if you're doing any complex action. On the other hand, making MappingViews more like the primitives they represent has positives for performance, simplifies user code, and would probably make testing these objects easier, since many tests could be stolen from test_set.py. My opinion is that the operators on MappingViews should be no more permissive than their primitive counterparts. A KeysView is in various ways more restrictive than a set, so having it be also occasionally less restrictive is surprising and in my opinion bad. This causes the loss of an efficient way to union a dict's keys with a list (among other methods). I'd then add .union, .intersection, etc. to remedy this. This solution would bring the existing objects more in line with their primitive counterparts, while still allowing efficient actions on large dictionaries. In short: - Is #3 intended behavior? - Should it (and the others) be? - As a related aside, should .union and other frozen ops be added to the Set interface? - If so, should the fix solely be a bugfix, should I do what I proposed, or something else entirely? - More generally, should there be a guiding principle when it comes to MappingViews and similar special case objects? [1]: There's some good conversation in this prior thread on this issue https://mail.python.org/pipermail/python-ideas/2015-December/037472.html.
The consensus seemed to be that making ValuesViews comparable by value is technically infeasible (O(n^2) worst case), while making them comparable based on the underlying dictionary is a possibility. This would be for OrderedDict, although many of the same arguments apply for a normal dictionary. [2]: Well, it probably should be a bug, it's explicitly tested for ( https://github.com/python/cpython/blob/master/Lib/test/test_dictviews.py#L109), whereas sets are explicitly tested for the opposite functionality ( https://github.com/python/cpython/blob/master/Lib/test/test_set.py#L92) Thanks, I'm looking forward to the feedback, Josh -------------- next part -------------- An HTML attachment was scrubbed... URL: From random832 at fastmail.com Wed Apr 6 08:49:53 2016 From: random832 at fastmail.com (Random832) Date: Wed, 06 Apr 2016 08:49:53 -0400 Subject: [Python-ideas] Dictionary views are not entirely 'set like' In-Reply-To: References: Message-ID: <1459946993.779576.570657809.7B7DA7F0@webmail.messagingengine.com> On Wed, Apr 6, 2016, at 05:38, Joshua Morton wrote: > {}.keys() == {}.keys() # 7 > {}.items() == {}.items() # 8 > {}.values() == {}.values() # 9 > d = {}; d.values() == d.values() # 10 > > True, True, False, False. > > Numbers 1, 2, 4, 5 are expected behavior. 3 and 6 are not, and 7-10 is up > for debate.[1] Last time this came up, the conclusion was that making values views comparable was intractable due to the fact that they're unordered but the values themselves aren't hashable. Then the discussion got sidetracked into a discussion of whether the justification for not having them be hashable (Java does just fine with everything being hashable and content-based hashes for mutable objects) makes sense in a "consenting-adults" world.
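The behavior behind quiz items #9 and #10 can be seen in miniature: value views define no content-based `__eq__`, so comparison falls back to object identity. A short illustration:

```python
d = {"x": [1, 2]}          # list values: unordered views, unhashable contents

# Two calls create two distinct view objects, so == is False (quiz #10):
assert (d.values() == d.values()) is False

# The very same view object compares equal to itself (identity fallback):
v = d.values()
assert (v == v) is True

# Content comparison has to be spelled out explicitly:
assert list(d.values()) == list(d.values())
```

This is exactly why the earlier thread found value-based comparison intractable: without hashable values, comparing two unordered views by content degenerates to pairwise matching.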
From rob.cliffe at btinternet.com Wed Apr 6 09:03:04 2016 From: rob.cliffe at btinternet.com (Rob Cliffe) Date: Wed, 6 Apr 2016 14:03:04 +0100 Subject: [Python-ideas] Dictionary views are not entirely 'set like' In-Reply-To: References: Message-ID: <57050908.9060408@btinternet.com> Something has changed then, because in Python 2.7.10 I get ##1,3 TypeError ##9,10 True I can't see anything wrong with the actual behaviour in any of the examples. Rob Cliffe On 06/04/2016 10:38, Joshua Morton wrote: > Let's start with a quick quiz: what is the result of each of the > following (on Python3.5.x)? > > {}.keys() | set() # 1 > set() | [] # 2 > {}.keys() | [] # 3 > set().union(set()) # 4 > set().union([]) # 5 > {}.keys().union(set()) # 6 > > If your answer was set([]), TypeError, set([]), set([]), set([]), > AttributeError, then you were correct. That, to me, is incredibly > unintuitive. > > Next up: > > {}.keys() == {}.keys() # 7 > {}.items() == {}.items() # 8 > {}.values() == {}.values() # 9 > d = {}; d.values() == d.values() # 10 > > True, True, False, False. > > Numbers 1, 2, 4, 5 are expected behavior. 3 and 6 are not, and 7-10 is > up for debate.[1] > From rosuav at gmail.com Wed Apr 6 09:15:24 2016 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 6 Apr 2016 23:15:24 +1000 Subject: [Python-ideas] Dictionary views are not entirely 'set like' In-Reply-To: <57050908.9060408@btinternet.com> References: <57050908.9060408@btinternet.com> Message-ID: On Wed, Apr 6, 2016 at 11:03 PM, Rob Cliffe wrote: > Something has changed then, because in Python 2.7.10 I get > ##1,3 TypeError > ##9,10 True > I can't see anything wrong with the actual behaviour in any of the examples. Python 2.7 has very different handling of .keys() and .values() on dictionaries - they return lists instead of views. You can ignore 2.7 for the purposes of this discussion - it's not "set-like" in the way that's being considered here. 
ChrisA From guettliml at thomas-guettler.de Wed Apr 6 10:16:49 2016 From: guettliml at thomas-guettler.de (=?UTF-8?Q?Thomas_G=c3=bcttler?=) Date: Wed, 6 Apr 2016 16:16:49 +0200 Subject: [Python-ideas] Control Flow - Never Executed Loop Body In-Reply-To: <56F42793.6070503@mail.de> References: <56EEE82A.6050505@mail.de> <56F3F19E.4090604@thomas-guettler.de> <56F42793.6070503@mail.de> Message-ID: <57051A51.7040205@thomas-guettler.de> On 24.03.2016 at 18:44, Sven R. Kunze wrote: > On 24.03.2016 14:54, Thomas Güttler wrote: >> for item in my_iterator: >> # do per item >> on empty: >> # this code gets executed if iterator was empty >> on break: >> # this code gets executed if the iteration was left by a "break" >> on notempty: >> # ... >> > > Hmm, interesting. "on" would indeed be a distinguishing keyword to "except". So, "except" handles exceptions and "on" > handles internal control flow (the iter protocol). Nice idea actually. I thank you, because you had the idea to extend the loop syntax :-) But I guess it is impossible to get "on empty" implemented. Regards, Thomas Güttler -- Thomas Guettler http://www.thomas-guettler.de/ From desmoulinmichel at gmail.com Wed Apr 6 10:49:05 2016 From: desmoulinmichel at gmail.com (Michel Desmoulin) Date: Wed, 6 Apr 2016 16:49:05 +0200 Subject: [Python-ideas] Control Flow - Never Executed Loop Body In-Reply-To: <57051A51.7040205@thomas-guettler.de> References: <56EEE82A.6050505@mail.de> <56F3F19E.4090604@thomas-guettler.de> <56F42793.6070503@mail.de> <57051A51.7040205@thomas-guettler.de> Message-ID: <570521E1.4000905@gmail.com> What about: >>> for item in my_iterable as iterator: >>> # do per item >>> if iterator.empty: # empty >>> # this code gets executed if iterator was empty >>> else: # not empty The empty marker would be set to True by the for loop and set to False when iterating on the loop. The perf cost of this is only added if you explicitly require it with the "as" keyword.
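The `iterator.empty` idea can be approximated today, without new syntax, by a small wrapper generator. This is just a sketch with an invented name (`call_if_empty`), not anything proposed on the thread:

```python
def call_if_empty(iterable, on_empty):
    # Yield everything from iterable; if it yields nothing at all,
    # call the on_empty() callback instead.
    it = iter(iterable)
    try:
        first = next(it)
    except StopIteration:
        on_empty()
        return
    yield first
    yield from it

events = []
for item in call_if_empty([], lambda: events.append("empty")):
    events.append(item)
assert events == ["empty"]

events.clear()
for item in call_if_empty([1, 2], lambda: events.append("empty")):
    events.append(item)
assert events == [1, 2]
```

The cost of the one-item lookahead is only paid when the wrapper is used, which mirrors the opt-in "as" semantics Michel describes.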
You don't handle break, which is handled by else anyway. And maybe having access to the iterator could open the door to more features. On 06/04/2016 at 16:16, Thomas Güttler wrote: > > > On 24.03.2016 at 18:44, Sven R. Kunze wrote: >> On 24.03.2016 14:54, Thomas Güttler wrote: >>> for item in my_iterator: >>> # do per item >>> on empty: >>> # this code gets executed if iterator was empty >>> on break: >>> # this code gets executed if the iteration was left by a "break" >>> on notempty: >>> # ... >>> >> >> Hmm, interesting. "on" would indeed be a distinguishing keyword to >> "except". So, "except" handles exceptions and "on" >> handles internal control flow (the iter protocol). Nice idea actually. > > I thank you, because you had the idea to extend the loop syntax :-) > > But I guess it is impossible to get "on empty" implemented. > > > Regards, > Thomas Güttler > From k7hoven at gmail.com Wed Apr 6 12:04:48 2016 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Wed, 6 Apr 2016 19:04:48 +0300 Subject: [Python-ideas] Making pathlib paths inherit from str Message-ID: Hi all, Recently, I have spent quite a lot of time thinking about what should be done to improve the situation of pathlib. Last week, after all kinds of wild ideas and trying hard every night to prove to myself that duck-typing-like compatibility with other path-representing objects is sufficient, I noticed that I had failed to prove that and that I had gone around a full circle back to where I started: "why can't paths subclass ``str``?" In the "Working with Path objects: p-strings?" thread, I said I was working on a proposal. Since it's been several days already, I think I should post it here and get some feedback before going any further. Maybe I should have done that even earlier. Anyway, there are some rough edges, and I will need to add links to references etc.
So, do not hesitate to give feedback or criticism, which is especially appreciated if you take the time to read through the whole thing first :). You can read a github-rendered version here: https://github.com/k7hoven/strpathlib/blob/master/proposal.rst And the raw rst is here below: Proposal: Making path objects inherit from ``str``. =================================================== Abstract -------- This proposal addresses issues that limit the usability of the object-oriented filesystem path objects, most notably those of ``pathlib``, introduced in the standard library in Python 3.4 via PEP 0428. One issue is that the path objects are not directly compatible with nearly any libraries, including the standard library. A further goal of this proposal is to provide a smooth transition into a Python with better Path handling, while keeping backwards compatibility concerns to a minimum. The approach involves making the path classes in ``pathlib`` (and optionally also DirEntry) subclasses of ``str``, but takes further measures to avoid problems and unnecessary additions. Introduction ------------ Filesystem paths are strings that give instructions for traversing a directory tree. In Python, they have traditionally been represented as byte strings, and more recently, unicode strings. However, Python now has ``pathlib`` in the standard library, which is an object-oriented library for dealing with objects specialized in representing a path and working with it. In this proposal, such objects are generally referred to as *path objects*, or sometimes, in the specific context of instances of the ``pathlib`` path classes, they are explicitly referred to as ``pathlib`` objects. In ``pathlib`` there is a hierarchy of path classes with a common base class ``PurePath``. It has a subclass ``Path`` which essentially assumes the path is intended to represent a path on the current system.
However, both of these classes, when called, instantiate a subclass of the ``Windows`` or ``Posix`` flavor, which have slightly different behavior. In total, there are thus six public classes: ``PurePath``, ``PurePosixPath``, ``PureWindowsPath``, ``Path``, ``PosixPath`` and ``WindowsPath``. Since Python 3.5 and the introduction of ``os.scandir``, the family of path classes has a new member, ``DirEntry``, which is a performance-oriented path object with significant duck-typing compatibility with ``pathlib`` objects. The adoption of the different types of path objects is still quite low, which is perhaps unsurprising, because they were only introduced very recently. However, it can also be inconvenient to work with these objects, because they usually need to be explicitly converted into strings before passing them to functions, and path strings returned by functions need to be explicitly converted into path objects. Especially the latter issue is difficult in terms of backwards compatibility of APIs. While many things were recently discussed on Python ideas regarding the future of path-like objects, this proposal has a much more limited scope, to provide first steps in the right direction. However, the last part of this proposal considers possible future directions that this may optionally lead to. Rationale --------- Filesystem paths (or comparable things like URIs) are strings of characters that represent information needed to access a file or directory (or other resource). In other words, they form a subset of strings, involving specialized functionality such as joining absolute and relative paths together, accessing different parts of the path or file name, and even accessing the resources the path points to. In Python terms, for a path ``path``, one would have ``isinstance(path, str)``. It is also clear that not all strings are paths.
On the one hand, this would make an ideal case for making all path-representing objects inherit from ``str``; while Python tries not to over-emphasize object-oriented programming and inheritance, it should not try to avoid class hierarchies when they are appropriate in terms of both purity and practicality. Regarding practicality, making specialized *path objects* also instances of ``str`` would make almost any stdlib or third-party function accept path objects as path arguments, assuming that they accept any instance of ``str``. Furthermore, functions now returning instances of ``str`` to represent paths could in future versions return path objects, with only minor backwards-incompatibility worries. On the other hand, strings are a very general concept, and the Python ``str`` class provides a large variety of methods to manipulate and work with them, including ``.split()``, ``.find()``, ``.isnumeric()`` and ``.join()``. These operations may be defined just as well for a string that represents a path than for any other string. In fact, this is the status quo in Python, as the adoption of ``pathlib`` is still quite limited and paths are in most cases represented as strings (sometimes byte strings). But while the string operations are *defined* on path-representing strings, the results of these operations may not be of any use in most cases, even if in some cases, they may be. While it is not the responsibility of the programming language to prevent doing things that are not useful, it may be practical in some cases. For instance, the string method ``.find()`` could be mistaken to mean finding files on the file system, while it in fact searches for a substring. String concatenation, in turn, can be a perfectly reasonable thing to do: ``show_msg("Data saved to file: " + file_path)``. The result of the concatenation of a string and a path is not a path, but a general string. 
Directly concatenating two path objects together as strings, however, likely has no sensible use cases. There is prior art in subclassing the Python ``str`` type to build a path object type. Packages on PyPI (TODO: list more?) that do this include ``path.py`` and ``antipathy``. The latter also supports ``bytes``-based paths by instantiating a different class, a subclass of ``bytes``. Since these libraries have existed for several years, experience from them is available for evaluating the potential benefits and weaknesses of this proposal (as well as other aspects regarding ``pathlib``). However, this proposal goes a step further to avoid potential problems and to provide a smooth transition plan that, if desired, can be followed to painlessly move towards a Python with a clear distinction between strings and paths. An optional long-term goal that this proposal facilitates may be to gradually move away from using strings (or even their subclasses) as paths. Specification of standard library changes ----------------------------------------- Making ``pathlib`` classes subclasses of ``str`` ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Assuming the present class hierarchy in ``pathlib``, inheritance from ``str`` will be introduced by making the base class ``pathlib.PurePath`` a subclass of ``str``. Methods will further be overridden in ``PurePath`` as described in the following. Overriding all ``str``-specific methods ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Since most of the ``str`` methods are not of any use on paths and can be confusing, leading to undesired behavior, *most* ``str`` methods (including magic methods, but excluding methods listed below) are overridden in ``PurePath`` with methods that by default raise ``TypeError("str method '<name>' is not available for paths.")``. This will help programmers to immediately notice when they are using the wrong method.
The perhaps unusual practice of disabling most base-class methods can be regarded as being conservative in adding ``str`` functionality to path objects. All methods, including double-underscore methods, are overridden, except for the following, which are *not* overridden: - Methods of the ``str`` or ``object`` types that are already overridden by ``PurePath`` - Methods of the ``object`` type that are not overridden by ``str`` - ``__getattribute__`` - ``__len__`` (this could be debated, but not having it might be weird for a str instance) - ``encode`` - ``startswith`` and ``endswith`` (TODO: override these with case-insensitive behavior on the Windows flavor) - ``__add__`` will be overridden separately, as described in later subsections. This will allow ``open(...)`` as well as most ``os`` and ``os.path`` functionality to work immediately, although there are cases that need special handling. Later, if shown to be desirable, some additional string methods may be enabled on paths. Overriding ``.__add__`` to disable adding two path objects together ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Overloading of the ``+`` operator in ``str`` will be overridden with a version which disables concatenation of two path objects together while allowing other type combinations (TODO: consider also fully disabling +):

.. code:: python

    def __add__(self, other):
        if isinstance(other, PurePath):
            raise TypeError("Operator + for two paths is not defined; use / for joining paths.")
        return str.__add__(self, other)

Optional enabling of string methods ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Since many APIs currently have functions or methods that return paths as strings, existing code may expect to have all string functionality available on the returned objects. While most users are unlikely to use much of the ``str`` functionality, a library function may want to explicitly allow these operations on a path object that it returns.
Therefore, the overridden ``str`` methods can be enabled by setting a ``._enable_str_functionality`` attribute on a path object as follows: - ``pathobj._enable_str_functionality = True`` -- Enable ``str`` methods - ``pathobj._enable_str_functionality = 'warn'`` -- Enable ``str`` methods, but emit a ``FutureWarning`` with the message ``"str method '<name>' may be disabled on paths in future versions."`` The warning will help the API users notice that the return value is no longer a plain path.

.. code:: python

    def _make_overriding_method(name):
        # Factory building the overriding method for a given str method name.
        def method(self, *args, **kwargs):
            """Method of str, not for use with pathlib path objects."""
            try:
                enable = self._enable_str_functionality
            except AttributeError:
                raise TypeError("str method '{}' is not available for paths."
                                .format(name)) from None
            if enable == 'warn':
                warnings.warn("str method '{}' may be disabled on paths in future versions."
                              .format(name), FutureWarning, stacklevel=2)
            elif enable is True:
                pass
            else:
                raise ValueError("_enable_str_functionality can be True or 'warn'")
            return getattr(str, name)(self, *args, **kwargs)
        return method

New APIs, however, do not need to enable ``str`` functionality and may return default path objects. Helping interactive python tools and IDEs ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Interactive Python tools such as Jupyter are growing in popularity. When they use ``dir(...)`` to give suggestions for code completion, it is harmful to have all the disabled ``str`` methods show up in the list, even if they typically would raise exceptions. Therefore, the ``__dir__`` method should be overridden on ``PurePath`` to only show the methods that are meaningful for paths. Some tools used for code completion, such as ``rope`` and ``jedi``, may need some changes for optimal code completion. This in fact includes also the standard Python REPL, which currently does not respect ``__dir__`` in tab completion.
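The opt-in mechanism can be illustrated with a toy subclass. ``ToyPath`` and its single disabled method are invented for this sketch; the real proposal disables nearly all ``str`` methods on ``PurePath``:

```python
class ToyPath(str):
    # Toy illustration of the proposal: one str method, upper(),
    # is disabled unless _enable_str_functionality is set.
    def upper(self):
        if getattr(self, "_enable_str_functionality", None) is True:
            return str.upper(self)
        raise TypeError("str method 'upper' is not available for paths.")

p = ToyPath("/tmp/file.txt")
try:
    p.upper()                       # disabled by default
except TypeError as e:
    assert "not available" in str(e)

p._enable_str_functionality = True  # explicit opt-in
assert p.upper() == "/TMP/FILE.TXT"
assert isinstance(p, str)           # still passes isinstance checks
```

Because ``ToyPath`` is still a ``str``, it can be passed to ``open(...)`` and ``os.path`` functions unchanged, which is the core compatibility argument of the proposal.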
Changes needed to other stdlib modules ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ In stdlib modules other than ``pathlib``, mainly ``os``, ``ntpath`` and ``posixpath``, the functions that use the methods/functionality listed below on path or file names will be modified to explicitly convert the name ``name`` to a plain string first, e.g., using ``getattr(name, 'path', name)``, which also works for ``DirEntry`` but may return ``bytes``: - ``split`` - ``find`` - ``rfind`` - ``partition`` - ``__iter__`` - ``__getitem__`` (However, if ``DirEntry`` is not made to subclass ``str``, the idiom ``getattr(name, 'path', name)``, which is already supported in the development version, should be implemented in stdlib functions to accept not only ``str`` and path objects, but also DirEntry.) Guidelines for third-party package maintainers ---------------------------------------------- Libraries that take paths as arguments or return them ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Since all of the standard library will accept path objects as path arguments, most third-party libraries will automatically do so. However, those that directly manipulate or examine the path name using ``str`` methods may not work. Those libraries will not immediately be ``pathlib``-compatible. To achieve full ``pathlib``-compatibility, the libraries are advised to: 1. Make sure they do not explicitly check the ``type(...)`` of arguments, but use ``isinstance(...)`` instead, if needed. 2. See if their functions use disabled ``str``/``bytes`` methods on paths that they take as arguments. If so, they should either: - change their code to achieve the same using ``os.path`` functions (*this is the preferred option*), or - convert the argument first using ``name = getattr(name, 'path', name)``, which does not require importing pathlib. 3. Consider, when returning a path or file name, converting it to a path object first if a ``str``-subclassing ``pathlib`` is available.
During a transition period, the attribute ``._enable_str_functionality = 'warn'`` should be set before returning the object. For an even softer transition period, it is also possible to set ``._enable_str_functionality = True``, which enables ``str`` methods with no warnings. Pathlib-compatible or near-compatible libraries ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ To have the best level of compatibility, all path-like objects should preferably behave similarly to pathlib objects regarding subclassing ``str``. However, for the best level of *compatibility*, the safest option is to subclass ``str`` and *not* disable ``str`` functionality (which is already done by some known libraries). However, they may want to further disable methods of ``str`` to achieve the additional clarity that ``pathlib`` has regarding: - Having a ``.path`` attribute/property which gives a ``str`` (or ``bytes``) instance representing the path Older Python versions ~~~~~~~~~~~~~~~~~~~~~ The ``pathlib2`` module, which provides ``pathlib`` to pre-3.4 Python versions, can also subclass ``str``, but it should by default have ``._enable_str_functionality = 'warn'`` or ``._enable_str_functionality = True``, because the stdlib in the older Python versions is not compatible with paths that have ``str`` functionality disabled. Transition plans and future possibilities for long-term consideration --------------------------------------------------------------------- ``DirEntry`` ~~~~~~~~~~~~ ``DirEntry`` should also undergo a similar transition, which was, at first, part of this proposal, but it was removed to limit the scope (it could be added back, of course, if desired). Since ``DirEntry`` focuses on performance, it is important not to cause any significant performance drops. It would, however, simplify things if ``DirEntry`` did the same as ``pathlib`` regarding subclassing and disabling methods.
A slight complication, however, arises from the fact that ``DirEntry`` may represent a path using ``bytes``, making the ``.path`` attribute also an instance of ``bytes`` instead of ``str``. This issue could be solved by at least two different approaches: 1. Make ``bytes``-kind DirEntry instances, interpreted as ``str`` instances, equivalent to ``os.fsdecode(direntry.path)``. 2. Instantiate a different ``DirEntry`` class for ``bytes`` paths, perhaps in a way similar to how the ``antipathy`` path library instantiates ``bPath`` when the ``bytes`` type is used. The future of plain string paths and ``os.path``? ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ It is possible to imagine having both ``os.path`` and ``pathlib`` coexist, as long as they co-operate well. Potentially, things like ``open("filename.txt")`` with a plain string argument will always be accepted. However, if regardless of what people use Python for, they slowly adopt path objects as the way to represent a path, the support for plain string paths may be deprecated and eventually dropped. On the one hand, to support the former situation, ``os.path`` functions can choose their return type to match the type of the arguments; with multiple different types in the arguments, ``pathlib`` might 'win' because it is already imported. On the other hand, to support the latter, all path-returning functions in the stdlib can begin to return pathlib objects, at first with ``str`` methods enabled with or without warning, and eventually, with ``str`` methods disabled. Literal syntax for paths: p-strings? ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Should Python choose the *path* towards not allowing plain strings as paths, a convenient way to instantiate a path is desperately needed. As discussed in the recent python-ideas thread "Working with path objects: p-strings?", one possibility would be a new syntax like ``p"/path/to/file.ext"``, which would instantiate a path object. 
Another way of turning a string into a path could be to have a ``.path`` property on ``str`` objects that instantiates and returns a path object. It can be debated whether this is 'Pythonic' or not. See also the next section. The ``.path`` attribute on path-like objects ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ``DirEntry`` already had the ``.path`` attribute when it was introduced to the standard library in Python 3.5. It represents the absolute or relative path as a whole as a ``str`` or ``bytes`` instance. However, several people have raised the concern that the word ``path`` not referring to an actual path object may be misleading. However, if path objects are instances of str, the ``.path`` may in the future shift to mean the path object. In the case of ``pathlib`` paths, it could be implemented as a property that returns ``self``, or during a transition phase, a path object with ``str`` functions enabled:

.. code:: python

    @property
    def path(self):
        path = type(self)(self)
        path._enable_str_functionality = 'warn'
        return path

``DirEntry`` objects, on the other hand, could be converted to pathlib objects using the ``.path`` property. Similarly, ``str`` objects could have a similar property for conversion into a pathlib object (see previous section). Possibilities for making ``pathlib`` more lightweight ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ If path objects were to become the norm in handling paths and file names, there may be a need for optimizations in terms of the speed and memory usage of path objects as well as the import time and memory footprint. Dependencies that are not always used by pathlib objects could also be imported lazily. Another base class for path-like objects ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Python already has multiple types that can represent path-like objects. There could be one common base class for all of them, which would (at least at first) inherit from ``str``.
``DirEntry`` and ``PurePath`` would both be subclasses of this class. One would, however, need to answer the questions of what this class would be called, what it would look like, and what module it would be in (if not builtin). For now, let us call it ``PyRL`` for Python (Pyniversal?-) Resource Locator. This could also be a base class for URLs/URIs. Generalized Resource Locator addresses: a-strings? l-strings? ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ... not to mention g-strings! A generalized concept may be valuable in the future, because the distinction between local and remote is getting more and more vague. As discussed in the python-ideas thread "URLs/URIs + pathlib.Path + literal syntax = ?", it is possible to quite reliably distinguish common types of URLs from filesystem paths. If this became the norm, many Python-written programs could 'magically' accept URLs as input file paths by simply calling the ``PyRL(...)``, which could be equivalent to some literal syntax for use in a scripting, testing or interactive setting, or when loading config files from fixed locations. From ethan at stoneleaf.us Wed Apr 6 13:05:47 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Wed, 06 Apr 2016 10:05:47 -0700 Subject: [Python-ideas] Making pathlib paths inherit from str In-Reply-To: References: Message-ID: <570541EB.9070607@stoneleaf.us> On 04/06/2016 09:04 AM, Koos Zevenhoven wrote: > Recently, I have spent quite a lot of time thinking about what should > be done to improve the situation of pathlib. Last week, after all > kinds of wild ideas and trying hard every night to prove to myself > that duck-typing-like compatibility with other path-representing > objects is sufficient, I noticed that I had failed to prove that and > that I had gone around a full circle back to where I started: "why > can't paths subclass ``str``?" > > In the "Working with Path objects: p-strings?" thread, I said I was > working on a proposal.
Since it's been several days already, I think i > should post it here and get some feedback before going any further. > Maybe I should have done that even earlier. Anyway, there are some > rough edges, and I will need to add links to references etc. > > So, do not hesitate to give feedback or criticism, which is especially > appreciated it you take the time to read through the whole thing first > :). Fair enough, and done! :) Overall: excellent research and interesting ideas. However, before spending too much more time on this realize that the fate of pathlib in the stdlib is being discussed on Python-Dev, so you may want to move your efforts over there. > Proposal: Making path objects inherit from ``str``. > =================================================== > Rationale > --------- > There is prior art in subclassing the Python ``str`` type to build a > path object type. Packages on PyPI (TODO: list more?) that do this > include ``path.py`` and ``antipathy``. An honorable mention for antipathy! :) > Specification of standard library changes > ----------------------------------------- > Overriding all ``str``-specific methods > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Have to be careful here -- the os module uses some of those string methods to work with string-paths. > Overriding ``.__add__`` to disable adding two path objects together > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ See above. > Optional enabling of string methods > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > Since many APIs currently have functions or methods that return paths as > strings, existing code may expect to have all string functionality > available on the returned objects. While most users are unlikely to use > much of the ``str`` functionality, a library function may want to > explicitly allow these operations on a path object that it returns. 
> Therefore, the overridden ``str`` methods can be enabled by setting a > ``._enable_str_functionality`` attribute on a path object as follows: > > - ``pathobj._enable_str_functionality = True #`` -- Enable ``str`` > methods > - ``pathobj._enable_str_functionality = 'warn' #`` -- Enable ``str`` > methods, but emit a ``FutureWarning`` with the message > ``"str method '' may be disabled on paths in future versions."`` > > The warning will help API users notice that the return value is no > longer a plain path. This would be good for the pathlib backport, which I think should inherit from str/unicode. > Helping interactive python tools and IDEs > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > Interactive Python tools such as Jupyter are growing in popularity. When > they use ``dir(...)`` to give suggestions for code completion, it is > harmful to have all the disabled ``str`` methods show up in the list, > even if they typically would raise exceptions. Therefore, the > ``__dir__`` method should be overridden on ``PurePath`` to only show the > methods that are meaningful for paths. Good idea. > Older Python versions > ~~~~~~~~~~~~~~~~~~~~~ > > The ``pathlib2`` module, which provides ``pathlib`` to pre-3.4 Python > versions, can also subclass ``str``, but it should by default have > ``._enable_str_functionality = 'warn'`` or > ``._enable_str_functionality = True``, because the stdlib in the older > Python versions is not compatible with paths that have ``str`` > functionality disabled. The 'warn' setting would be preferable.
However, if regardless of what people use Python for, they > slowly adopt path objects as the way to represent a path, the support > for plain string paths may be deprecated and eventually dropped. I don't see the stdlib ever /not/ accepting string paths. > Literal syntax for paths: p-strings? > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ I don't see this as necessary. > Another way of turning a string into a path could be to have a ``.path`` > property on ``str`` objects that instantiates and returns a path object. And definitely not this. > Generalized Resource Locator addresses: a-strings? l-strings? > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > A generalized concept may be valuable in the future, because the > distinction between local and remote is getting more and more vague. As > discussed in the python-ideas thread "URLs/URIs + pathlib.Path + literal > syntax = ?", it is possible to quite reliably distinguish common types > of URLs from filesystem paths. This was disproved. Just about anything can be a posix-path. -- ~Ethan~ From ian.g.kelly at gmail.com Wed Apr 6 13:04:44 2016 From: ian.g.kelly at gmail.com (Ian Kelly) Date: Wed, 6 Apr 2016 11:04:44 -0600 Subject: [Python-ideas] Dictionary views are not entirely 'set like' In-Reply-To: <1459946993.779576.570657809.7B7DA7F0@webmail.messagingengine.com> References: <1459946993.779576.570657809.7B7DA7F0@webmail.messagingengine.com> Message-ID: On Wed, Apr 6, 2016 at 6:49 AM, Random832 wrote: > Last time this came up, the conclusion was that making values views > comparable was intractable due to the fact that they're unordered but > the values themselves aren't hashable. Then the discussion got > sidetracked into a discussion of whether the justification for not > having them be hashable (Java does just fine with everything being > hashable and content-based hashes for mutable objects) makes sense in a > "consenting-adults" world. 
At risk of repeating the derailment: while that's true, I'm also under the impression that interest in deeply immutable objects is growing in the Java community. From joshua.morton13 at gmail.com Wed Apr 6 13:25:15 2016 From: joshua.morton13 at gmail.com (Joshua Morton) Date: Wed, 06 Apr 2016 17:25:15 +0000 Subject: [Python-ideas] Dictionary views are not entirely 'set like' In-Reply-To: <1459946993.779576.570657809.7B7DA7F0@webmail.messagingengine.com> References: <1459946993.779576.570657809.7B7DA7F0@webmail.messagingengine.com> Message-ID: Indeed, I noted that (in a not-obvious endnote). Perhaps I shouldn't have mentioned the issue at all as its relatively secondary, but on the other hand while my main points have to do specifically with KeysView, it would be worth bundling all of the inconsistencies in all MappingViews at once, if only for pragmatic reasons. In the prior discussion, guido also seemed open to making values equivalent on dictionary identity equality (#10), which I think makes more sense than current behavior and doesn't suffer performance issues. In any case, I would consider that a secondary concern. On Wed, Apr 6, 2016 at 8:50 AM Random832 wrote: > On Wed, Apr 6, 2016, at 05:38, Joshua Morton wrote: > > {}.keys() == {}.keys() # 7 > > {}.items() == {}.items() # 8 > > {}.values() == {}.values() # 9 > > d = {}; d.values() == d.values() # 10 > > > > True, True, False, False. > > > > Numbers 1, 2, 4, 5 are expected behavior. 3 and 6 are not, and 7-10 is up > > for debate.[1] > > Last time this came up, the conclusion was that making values views > comparable was intractable due to the fact that they're unordered but > the values themselves aren't hashable. Then the discussion got > sidetracked into a discussion of whether the justification for not > having them be hashable (Java does just fine with everything being > hashable and content-based hashes for mutable objects) makes sense in a > "consenting-adults" world. 
> _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Wed Apr 6 13:27:31 2016 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 7 Apr 2016 03:27:31 +1000 Subject: [Python-ideas] Making pathlib paths inherit from str In-Reply-To: References: Message-ID: On Thu, Apr 7, 2016 at 2:04 AM, Koos Zevenhoven wrote: > Rationale > --------- > Furthermore, functions now > returning instances of ``str`` to represent paths could in future > versions return path objects, with only minor backwards-incompatibility > worries. > > Making ``pathlib`` classes subclasses of ``str`` > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > Overriding all ``str``-specific methods > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > The perhaps unusual practice of disabling most base-class > methods can be regarded as being conservative in adding ``str`` > functionality to path objects. I'm not sure I entirely understand what's going on here. Your rationale is "it should be possible to use a Path as a str", and that's supported by your proposal to subclass str; but then you want to override a bunch of methods to force users to be aware that a Path is *not* a str. Why subclass only to force people to distinguish? If you want to make a Path act "just a little bit" like a str, I'd expect to go the other way: don't subclass str, and add in a specific set of methods to provide str-like functionality. Or am I missing something here? 
ChrisA From joshua.morton13 at gmail.com Wed Apr 6 13:48:14 2016 From: joshua.morton13 at gmail.com (Joshua Morton) Date: Wed, 06 Apr 2016 17:48:14 +0000 Subject: [Python-ideas] Making pathlib paths inherit from str In-Reply-To: References: Message-ID: On Wed, Apr 6, 2016 at 1:28 PM Chris Angelico wrote: > I'm not sure I entirely understand what's going on here. Your > rationale is "it should be possible to use a Path as a str", and > that's supported by your proposal to subclass str; but then you want > to override a bunch of methods to force users to be aware that a Path > is *not* a str. Why subclass only to force people to distinguish? > If you want to make a Path act "just a little bit" like a str, I'd > expect to go the other way: don't subclass str, and add in a specific > set of methods to provide str-like functionality. Or am I missing > something here? Indeed this was my thought as well. Originally (this was discussed on the python subreddit to some extent as well), my objection was that a simple program like say, left_pad.py would break depending on the implementation. This addresses that issue, but in my opinion creates even deeper problems. Its not hard to conceive some formatting or pretty printing library that at some point contains the code def concat(user_list: List[String]) -> String: return functools.reduce(lambda x, y: x + y, user_list) where user_list is given from some sort of user input. This change would lead to some nasty bugs, where this action will throw an error when any two adjacent args are Paths. What's worse is that type annotations actually exacerbates this issue since the objects disobey the typechecker. Java subclasses aren't allowed to just unimplement a parent's methods, they shouldn't in python either. 
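Joshua's concat scenario can be made concrete with a minimal sketch. Here ``FakePath`` is a hypothetical stand-in for the proposed path class (not pathlib's actual implementation), and plain ``str`` replaces the ``String`` placeholder from his annotation:

```python
from functools import reduce

class FakePath(str):
    """Hypothetical stand-in for a str-subclassing path type that
    'unimplements' inherited methods, as the proposal describes."""
    def __add__(self, other):
        raise TypeError("str method '__add__' is not available for paths.")

def concat(user_list):
    # Joshua's duck-typed library function: reduce the list with +.
    return reduce(lambda x, y: x + y, user_list)

assert concat(["a", "b", "c"]) == "abc"     # plain strings: fine
assert isinstance(FakePath("/tmp"), str)    # passes every isinstance check...
# ...yet the same call breaks when two adjacent items are paths:
# concat([FakePath("/tmp"), FakePath("/var")]) raises TypeError
```

That is exactly the failure mode the type checker cannot warn about: the objects satisfy the ``str`` annotation while rejecting ``str`` behavior at runtime.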
This argument seems to boil to "well a string is one valid representation of a path, so a Path is a string" but by that definition a Path is also a node in a tree, and so we should create a Tree class and have Path subclass TreeNode, so that we could find out its children and depth from the root. There are so many surprising results of a change like this that I can't see a reason to do this. -Josh -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.f.moore at gmail.com Wed Apr 6 14:31:57 2016 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 6 Apr 2016 19:31:57 +0100 Subject: [Python-ideas] Making pathlib paths inherit from str In-Reply-To: References: Message-ID: On 6 April 2016 at 17:04, Koos Zevenhoven wrote: > In the "Working with Path objects: p-strings?" thread, I said I was > working on a proposal. Since it's been several days already, I think i > should post it here and get some feedback before going any further. > Maybe I should have done that even earlier. Anyway, there are some > rough edges, and I will need to add links to references etc. Thanks for putting this together. I don't agree with much of it, but it's good to have the proposal stated so clearly. > So, do not hesitate to give feedback or criticism, which is especially > appreciated it you take the time to read through the whole thing first > :). While I've read the whole proposal, there's a lot to digest, and honestly I don't have the time to spend on this right now - so my apologies if I missed anything relevant. Hopefully my comments will make sense anyway :-) > Filesystem paths are strings that give instructions for traversing a > directory tree. In Python, they have traditionally been represented as > byte strings, and more recently, unicode string. However, Python now has > ``pathlib`` in the standard library, which is an object-oriented library > for dealing with objects specialized in representing a path and working > with it. 
In this proposal, such objects are generally referred to as > *path objects*, or sometimes, in the specific context of instances of > the ``pathlib`` path classes, they are explicitly referred to as > ``pathlib`` objects. I'm not sure I agree with this. To me, "filesystem paths" are a things which define the location of a file in a filesystem. They are not strings, even though they can be represented by strings (actually, they can't, technically - POSIX allows nearly arbitrary bytestrings for for paths, whereas Python strings are Unicode). Saying a path is a string is no more true than saying that integers are strings that represent whole numbers. Traditionally, people haven't thought of paths as objects because not many languages provide *any* sort of abstraction of paths - doing so in a cross-platform way is *hard* and most languages duck the issue. Python is exceptional in providing good path manipulation functions (even os.path is streets ahead of what many other languages offer). > Filesystem paths (or comparable things like URIs) are strings of > characters that represent information needed to access a file or > directory (or other resource). In other words, they form a subset of > strings, involving specialized functionality such as joining absolute > and relative paths together, accessing different parts of the path or > file name, and even accessing the resources the path points to. In > Python terms, for a path ``path``, one would have > ``isinstance(path, str)``. It is also clear that not all strings are > paths. As noted above, this makes no sense to me. By this argument "integers are strings of characters that represent numbers". The string representation of an object is *not* the object. 
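[The abstraction Paul refers to can be seen in that ``posixpath`` and ``ntpath`` implement the same logical operations purely on strings, without touching any real filesystem -- a small illustration:]

```python
import posixpath
import ntpath

# Pure string manipulation: nothing here touches a real filesystem.
assert posixpath.join("dir", "sub", "file.txt") == "dir/sub/file.txt"
assert posixpath.splitext("dir/sub/file.txt") == ("dir/sub/file", ".txt")

# The same logical operations under Windows conventions:
assert ntpath.join("dir", "sub", "file.txt") == "dir\\sub\\file.txt"
assert ntpath.splitdrive("C:\\dir\\file.txt") == ("C:", "\\dir\\file.txt")
```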
> On the one hand, this would make an ideal case for making all > path-representing objects inherit from ``str``; while Python tries not > to over-emphasize object-oriented programming and inheritance, it should > not try to avoid class hierarchies when they are appropriate in terms of > both purity and practicality. Regarding practicality, making specialized > *path objects* also instances of ``str`` would make almost any stdlib or > third-party function accept path objects as path arguments, assuming > that they accept any instance of ``str``. Furthermore, functions now > returning instances of ``str`` to represent paths could in future > versions return path objects, with only minor backwards-incompatibility > worries. You mention both practicality and purity here but only offer "practical" arguments. The practical arguments are fair, and as far as I can see are the crux of any proposal to make Path objects subclass str. You should focus on this, and not try to argue that subclassing str is "right" in any purity sense. > On the other hand, strings are a very general concept, and the Python > ``str`` class provides a large variety of methods to manipulate and work > with them, including ``.split()``, ``.find()``, ``.isnumeric()`` and > ``.join()``. These operations may be defined just as well for a string > that represents a path than for any other string. In fact, this is the > status quo in Python, as the adoption of ``pathlib`` is still quite > limited and paths are in most cases represented as strings (sometimes > byte strings). But while the string operations are *defined* on > path-representing strings, the results of these operations may not be of > any use in most cases, even if in some cases, they may be. This seems to me to be a key point - if (many) of the operations that are part of the interface of a string don't make sense for a filesystem path, doesn't that very clearly make the point that filesystem paths are *not* strings? 
> There is prior art in subclassing the Python ``str`` type to build a > path object type. Packages on PyPI (TODO: list more?) that do this pylib's path.local object (used in pytest in particular) is another. > include ``path.py`` and ``antipathy``. The latter also supports > ``bytes``-based paths by instantiating a different class, a subclass of > ``bytes``. Since these libraries have existed for several years, > experience from them is available for evaluating the potential benefits > and weaknesses of this proposal (as well as other aspects regarding > ``pathlib``). I don't think there's been any attempt made to collect or quantify that experience, though. All I've ever seen is hearsay "I've not heard of anyone reporting problems" evidence. While anecdotal evidence is a lot better than nothing, it's of limited value. Apart from anything else, there's a self-selection issue - people who *did* have problems may simply have stopped using the libraries. > Overriding all ``str``-specific methods > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > Since most of the ``str`` methods are not of any use on paths and can be > confusing, leading to undesired behavior, *most* ``str`` methods > (including magic methods, but excluding methods listed below) are > overridden in ``PurePath`` with methods that by default raise > ``TypeError("str method '' is not available for paths."``. This > will help programmers to immediately notice when they are using the > wrong method. The perhaps unusual practice of disabling most base-class > methods can be regarded as being conservative in adding ``str`` > functionality to path objects. This seems to me to be the biggest issue. You're proposing that Path objects will subclass strings, but code written to expect a string may fail if passed a Path object. Presumably though that code works if passed str(the_path_object) - as it works correctly right now. Maybe it's doing "string-like" things, but equally, it's presumably intended to. 
Consider a "make path uppercase" function that simply does .upper() on its argument. You are proposing a class that is a subclass of str, but calling str() on an instance gives an object that behaves differently. That's bizarre at best, and realistically I'd describe it as fundamentally broken. I don't want to argue type-theory here, but I'm pretty sure that violates most people's intuition of what inheritance means. > Optional enabling of string methods > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > Since many APIs currently have functions or methods that return paths as > strings, existing code may expect to have all string functionality > available on the returned objects. While most users are unlikely to use > much of the ``str`` functionality, a library function may want to > explicitly allow these operations on a path object that it returns. > Therefore, the overridden ``str`` methods can be enabled by setting a > ``._enable_str_functionality`` method on a path object as follows: > > - ``pathobj._enable_str_functionality = True #`` -- Enable ``str`` > methods > - ``pathobj._enable_str_functionality = 'warn' #`` -- Enable ``str`` > methods, but emit a ``FutureWarning`` with the message > ``"str method '' may be disabled on paths in future versions."`` This is a huge chunk of extra complexity, both in terms of implementation, and even more so in terms of understanding. If someone wants a "real" string, just call cast using str() or use the .path attribute. This whole section of the proposal says to me that you haven't actually solved the problem you're trying to solve - you still expect people to have problems passing Path objects to functions that aren't expecting them, and you've had to consider how to work round that. The fact that you came up with (in effect) a "configuration flag" on an immutable object like a Path rather than just using the existing "give me a real string" options on Path, implies that your proposal is not well thought through in this area. 
Here's some questions for you (but IMO this section is unfixable - no matter what answers you give, I still consider this whole mechanism as a non-starter). * Are Path objects hashable, given they now have a mutable attribute? * If you change the _enable_str_functionality flag, does the object's hash change? * If it doesn't, what happens when you add 2 identical paths with different _enable_str_functionality flags to a set? * If you enable str methods do they return str or Path objects? If the latter, what is the flag set to on these objects? Basically, you broke a fundamental property of both Path and string objects - they are immutable. > Changes needed to other stdlib modules > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > In stdlib modules other than ``pathlib``, mainly ``os``, ``ntpath`` and > ``posixpath``, The stdlib functions in modules that use the > methods/functionality listed below on path or file names, will be > modified to explicitly convert the name ``name`` to a plain string > first, e.g., using ``getattr(name, 'path', name)``, which also works for > ``DirEntry`` but may return ``bytes``: > > - ``split`` > - ``find`` > - ``rfind`` > - ``partition`` > - ``__iter__`` > - ``__getitem__`` This can be done with the current Path objects (and should). It is unrelated to this proposal. And it doesn't need to be restricted to "if overridden string functions are used". Just do it regardless, and all existing functions work immediately. The only issue is functions that *return* paths. And they are no harder under current Pathlib than under your proposal - a decision on what type to return has to be made either way. > Guidelines for third-party package maintainers > ---------------------------------------------- > > Libraries that take paths as arguments or return them > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > Since all of the standard library will accept path objects as path > arguments, most third-party libraries will automatically do so. 
However, > those that directly manipulate or examine the path name using ``str`` > methods may not work. Those libraries will not immediately be > ``pathlib``-compatible. Overcomplicated. If you accept paths, just do getattr(patharg, 'path', patharg) and you're fine. If you return paths, do nothing (or if you prefer, think about your API and make a more considered decision). Your proposal means that library authors have to actually consider whether the new path objects will cause subtle failures, because the string-like objects will not fail quickly, leading to bugs propogating into unrelated code. Overall, I'm a strong -1. If we subclass str, we should just do it and not over-complicate like this. I'm still not convinced we should do so, but your proposal *has* convinced me that any attempt to compromise is going to end up being worse than either option. Sorry I can't be more positive - but again, thanks for the thorough write-up. Paul From random832 at fastmail.com Wed Apr 6 14:50:54 2016 From: random832 at fastmail.com (Random832) Date: Wed, 06 Apr 2016 14:50:54 -0400 Subject: [Python-ideas] Dictionary views are not entirely 'set like' In-Reply-To: References: <1459946993.779576.570657809.7B7DA7F0@webmail.messagingengine.com> Message-ID: <1459968654.867659.571034033.15260ABE@webmail.messagingengine.com> On Wed, Apr 6, 2016, at 13:25, Joshua Morton wrote: > In the prior discussion, guido also seemed open to making values > equivalent > on dictionary identity equality (#10), which I think makes more sense > than > current behavior and doesn't suffer performance issues. In any case, I > would consider that a secondary concern. You could probably handle a lot of common cases by: - Eliminating any items which are present in both collections by identity. - Attempting to sort the lists of remaining items. 
But, yeah, the way to do it in O(N) requires an "ephemeral hash" operation which Python doesn't have and can't grow _now_, no matter what the justification for not always having had it. That he won't change it even after he builds the time machine doesn't really mean anything when it comes down to it. From wes.turner at gmail.com Wed Apr 6 16:31:01 2016 From: wes.turner at gmail.com (Wes Turner) Date: Wed, 6 Apr 2016 15:31:01 -0500 Subject: [Python-ideas] Dictionary views are not entirely 'set like' In-Reply-To: <1459946993.779576.570657809.7B7DA7F0@webmail.messagingengine.com> References: <1459946993.779576.570657809.7B7DA7F0@webmail.messagingengine.com> Message-ID: On Apr 6, 2016 7:50 AM, "Random832" wrote: > > On Wed, Apr 6, 2016, at 05:38, Joshua Morton wrote: > > {}.keys() == {}.keys() # 7 > > {}.items() == {}.items() # 8 > > {}.values() == {}.values() # 9 > > d = {}; d.values() == d.values() # 10 > > > > True, True, False, False. > > > > Numbers 1, 2, 4, 5 are expected behavior. 3 and 6 are not, and 7-10 is up > > for debate.[1] > > Last time this came up, the conclusion was that making values views > comparable was intractable due to the fact that they're unordered but > the values themselves aren't hashable. Then the discussion got > sidetracked into a discussion of whether the justification for not > having them be hashable (Java does just fine with everything being > hashable and content-based hashes for mutable objects) makes sense in a > "consenting-adults" world. here's a related discussion: [Python-ideas] Fwd: Why do equality tests between OrderedDict keys/values views behave not as expected? 
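The two steps above can be sketched as a helper (the function name is mine, nothing like this exists in the stdlib, and the worst case is still quadratic):

```python
def unordered_equal(a, b):
    """Compare two unordered collections of possibly-unhashable values."""
    a, b = list(a), list(b)
    if len(a) != len(b):
        return False
    # Step 1: eliminate any items which are present in both, by identity.
    rest_a = []
    for item in a:
        for i, other in enumerate(b):
            if item is other:
                del b[i]
                break
        else:
            rest_a.append(item)
    # Step 2: attempt to sort the lists of remaining items and compare.
    try:
        return sorted(rest_a) == sorted(b)
    except TypeError:
        # Unsortable leftovers: fall back to quadratic equality matching.
        for item in rest_a:
            for i, other in enumerate(b):
                if item == other:
                    del b[i]
                    break
            else:
                return False
        return not b

d1 = {"x": [1, 2], "y": [3]}
d2 = {"p": [3], "q": [1, 2]}
assert unordered_equal(d1.values(), d2.values())  # value-based, order-free
```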
https://mail.python.org/pipermail/python-ideas/2015-December/037472.html > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From wes.turner at gmail.com Wed Apr 6 16:33:29 2016 From: wes.turner at gmail.com (Wes Turner) Date: Wed, 6 Apr 2016 15:33:29 -0500 Subject: [Python-ideas] Dictionary views are not entirely 'set like' In-Reply-To: References: Message-ID: On Apr 6, 2016 4:39 AM, "Joshua Morton" wrote: > > Let's start with a quick quiz: what is the result of each of the following (on Python3.5.x)? > > {}.keys() | set() # 1 > set() | [] # 2 > {}.keys() | [] # 3 > set().union(set()) # 4 > set().union([]) # 5 > {}.keys().union(set()) # 6 > > If your answer was set([]), TypeError, set([]), set([]), set([]), AttributeError, then you were correct. That, to me, is incredibly unintuitive. > > Next up: > > {}.keys() == {}.keys() # 7 > {}.items() == {}.items() # 8 > {}.values() == {}.values() # 9 > d = {}; d.values() == d.values() # 10 > > True, True, False, False. > > Numbers 1, 2, 4, 5 are expected behavior. 3 and 6 are not, and 7-10 is up for debate.[1] > > First thing first, the behavior exhibited by #3 is a bug (or at least it probably should be, and one I'd be happy to fix. However, before doing that I felt it would be good to propose some questions and suggestions for how that might be done. > > There are, as far as I can tell, two reasons to use a MappingView, memory efficiency or auto-updating (a view will continue to mirror changes in the underlying object). a third reason: Mapping and MutableMapping do not assume that all of the data is buffered into RAM. * e.g. 
on top of a DB * https://docs.python.org/3/library/collections.abc.html#collections.abc.MutableMapping > I'll focus on the first because the second can conceivably be solved in other ways. > > Currently, if I want to union a dictionary's keys with a non-set iterable, I can do `{}.keys() | []`. This is a bug[2]: the set or-operator should only work on another set. That said, fixing this bug removes the ability to efficiently or the keys with another iterable, leaving `set({}.keys()).update([])` for efficiency, or `set({}.keys()).union([])` for clarity. > > Fixing this is simply a matter of adding a `.union` method to the KeysView (and possibly to the Set abc as well), although that may not be something that is wanted. The issue, as far as I can tell, is whether we want people converting from MappingViews to "primitives" as soon as possible, or if we want to encourage people to use the views for as long as possible.
> > This solution would bring the existing objects more in line with their primitive counterparts, while still allowing efficient actions on large dictionaries. > > In short: > > - Is #3 intended behavior? > - Should it (and the others be)? > - As a related aside, should .union and other frozen ops be added to the Set interface? > - If so, should the fix solely be a bugfix, should I do what I proposed, or something else entirely? > - More generally, should there be a guiding principle when it comes to MappingViews and similar special case objects? > > > [1]: There's some good conversation in this prior thread on this issue https://mail.python.org/pipermail/python-ideas/2015-December/037472.html. The consensus seemed to be that making ValuesViews comparable by value is technically infeasible (O(n^2) worst case), while making it comparable based on the underlying dictionary is a possibility. This would be for OrderedDict, although many of the same arguments apply for a normal dictionary. > > [2]: Well, it probably should be a bug, its explicitly tested for ( https://github.com/python/cpython/blob/master/Lib/test/test_dictviews.py#L109), whereas sets are explicitly tested for the opposite functionality ( https://github.com/python/cpython/blob/master/Lib/test/test_set.py#L92) > > > Thanks, I'm looking forward to the feedback, > Josh > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... 
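[The asymmetry and the proposed named-method remedy can be sketched like so; `keys_union` is a hypothetical illustration, since dict views have no `.union` today:]

```python
d = {"a": 1, "b": 2}

# Today: the | operator on a keys view accepts an arbitrary iterable...
assert d.keys() | ["b", "c"] == {"a", "b", "c"}
# ...even though set's named union method is the one meant for iterables:
assert set(d.keys()).union(["b", "c"]) == {"a", "b", "c"}

def keys_union(mapping, *iterables):
    """Hypothetical named method: set.union semantics for a keys view."""
    result = set(mapping.keys())
    for it in iterables:
        result.update(it)
    return result

assert keys_union(d, ["b", "c"], ("x",)) == {"a", "b", "c", "x"}
```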
URL: From wes.turner at gmail.com Wed Apr 6 16:40:01 2016 From: wes.turner at gmail.com (Wes Turner) Date: Wed, 6 Apr 2016 15:40:01 -0500 Subject: [Python-ideas] Dictionary views are not entirely 'set like' In-Reply-To: <1459946993.779576.570657809.7B7DA7F0@webmail.messagingengine.com> References: <1459946993.779576.570657809.7B7DA7F0@webmail.messagingengine.com> Message-ID: On Apr 6, 2016 7:50 AM, "Random832" wrote: > > On Wed, Apr 6, 2016, at 05:38, Joshua Morton wrote: > > {}.keys() == {}.keys() # 7 > > {}.items() == {}.items() # 8 > > {}.values() == {}.values() # 9 > > d = {}; d.values() == d.values() # 10 > > > > True, True, False, False. > > > > Numbers 1, 2, 4, 5 are expected behavior. 3 and 6 are not, and 7-10 is up > > for debate.[1] > > Last time this came up, the conclusion was that making values views > comparable was intractable due to the fact that they're unordered but > the values themselves aren't hashable. Then the discussion got > sidetracked into a discussion of whether the justification for not > having them be hashable (Java does just fine with everything being > hashable and content-based hashes for mutable objects) makes sense in a > "consenting-adults" world. With e.g. OrderedDict, one practical solution is to subclass and wrap .keys() in a set and .values() in a list. > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From python at 2sn.net Wed Apr 6 18:26:39 2016 From: python at 2sn.net (Alexander Heger) Date: Thu, 7 Apr 2016 08:26:39 +1000 Subject: [Python-ideas] unpacking in index expression Message-ID: I was trying to unpack an array or tuple in index expression (python 3.5.1) but this failed. I think for consistency of language use it would be desirable to allow that as well. 
Below an example of my attempt indexing a numpy array, x: >>> import numpy as np >>> x = np.arange(12).reshape(3,-1) >>> x[*[0,1]] x[*[0,1]] ^ SyntaxError: invalid syntax >>> x[*(0,1)] x[*(0,1)] ^ SyntaxError: invalid syntax for the case of numpy, the latter works as intended w/o unpacking >>> x[(0,1)] 1 which is equivalent to the desired behaviour of unpacking >>> x[0,1] 1 but for the list case the behaviour is different (as intended by design) >>> x[[0,1]] array([[0, 1, 2, 3], [4, 5, 6, 7]]) My proposal hence is to allow unpacking in index expressions as well. I do not have a use case for dictionary / keyword unpacking; this would look awkward anyway. -Alexander From storchaka at gmail.com Thu Apr 7 00:39:54 2016 From: storchaka at gmail.com (Serhiy Storchaka) Date: Thu, 7 Apr 2016 07:39:54 +0300 Subject: [Python-ideas] unpacking in index expression In-Reply-To: References: Message-ID: On 07.04.16 01:26, Alexander Heger wrote: > I was trying to unpack an array or tuple in index expression (python > 3.5.1) but this failed. > I think for consistency of language use it would be desirable to allow > that as well. Below an example of my attempt indexing a numpy array, > x: > >>>> import numpy as np >>>> x = np.arange(12).reshape(3,-1) >>>> x[*[0,1]] > x[*[0,1]] > ^ > SyntaxError: invalid syntax x[0, 1] is a syntax sugar for x[(0, 1)]. Thus you need just convert your list index to tuple index: x[tuple([0, 1])]. Unpacking arguments for function is different thing. 
From solipsis at pitrou.net Thu Apr 7 03:46:18 2016 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 7 Apr 2016 09:46:18 +0200 Subject: [Python-ideas] Changing the meaning of bool.__invert__ Message-ID: <20160407094618.5c01847d@fsol> Hello, Booleans currently have reasonable overrides for the bitwise binary operators: >>> True | False True >>> True & False False >>> True ^ False True However, the same cannot be said of bitwise unary complement, which returns rather useless integer values: >>> ~False -1 >>> ~True -2 Numpy's boolean type does the more useful (and more expected) thing: >>> ~np.bool_(True) False How about changing the behaviour of bool.__invert__ to make it in line with the Numpy boolean? (i.e. bool.__invert__ == operator.not_) Regards Antoine. From dickinsm at gmail.com Thu Apr 7 04:00:36 2016 From: dickinsm at gmail.com (Mark Dickinson) Date: Thu, 7 Apr 2016 09:00:36 +0100 Subject: [Python-ideas] Changing the meaning of bool.__invert__ In-Reply-To: <20160407094618.5c01847d@fsol> References: <20160407094618.5c01847d@fsol> Message-ID: On Thu, Apr 7, 2016 at 8:46 AM, Antoine Pitrou wrote: > Booleans currently have reasonable overrides for the bitwise binary > operators: > >>>> True | False > True >>>> True & False > False >>>> True ^ False > True All those are consistent with bool being a subclass of int, in the sense that (for example) `int(True | False)` is identical to `int(True) | int(False)`. Redefining ~True to be False wouldn't preserve that: int(~True) == ~int(True) would become invalid. In short, the proposal would break the Liskov Substitution Principle. (The obvious fix is of course to make True have value -1 rather than 1. Then everything's consistent. No, I'm not seriously suggesting this - the amount of breakage would be insane.) NumPy has the luxury that numpy.bool_ is *not* a subclass of any integer type. 
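The substitutability argument can be checked mechanically (a quick stdlib-only sketch):

```python
# int(a op b) agrees with int(a) op int(b) for the overridden binary
# operators; ~ stays consistent only because bool inherits it from int.
for a in (False, True):
    for b in (False, True):
        assert int(a | b) == int(a) | int(b)
        assert int(a & b) == int(a) & int(b)
        assert int(a ^ b) == int(a) ^ int(b)

# Redefining ~True as False would break this last agreement:
assert int(~True) == ~int(True) == -2
```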
Mark From solipsis at pitrou.net Thu Apr 7 04:15:18 2016 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 7 Apr 2016 10:15:18 +0200 Subject: [Python-ideas] Changing the meaning of bool.__invert__ References: <20160407094618.5c01847d@fsol> Message-ID: <20160407101518.7d6997b3@fsol> On Thu, 7 Apr 2016 09:00:36 +0100 Mark Dickinson wrote: > On Thu, Apr 7, 2016 at 8:46 AM, Antoine Pitrou wrote: > > Booleans currently have reasonable overrides for the bitwise binary > > operators: > > > >>>> True | False > > True > >>>> True & False > > False > >>>> True ^ False > > True > > All those are consistent with bool being a subclass of int, in the > sense that (for example) `int(True | False)` is identical to > `int(True) | int(False)`. However, the return type (bool in one place, int in another one) is inconsistent, and the user-visible semantics are confusing... Apparently someone went to the trouble of overriding __and__, __or__ and __xor__ for booleans, which is why it looks unexpected to leave __invert__ alone. Regards Antoine. From steve at pearwood.info Thu Apr 7 07:04:25 2016 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 7 Apr 2016 21:04:25 +1000 Subject: [Python-ideas] Changing the meaning of bool.__invert__ In-Reply-To: <20160407094618.5c01847d@fsol> References: <20160407094618.5c01847d@fsol> Message-ID: <20160407110425.GK12526@ando.pearwood.info> On Thu, Apr 07, 2016 at 09:46:18AM +0200, Antoine Pitrou wrote: > > Hello, > > Booleans currently have reasonable overrides for the bitwise binary > operators: > > >>> True | False > True > >>> True & False > False > >>> True ^ False > True Substitute 1 for True and 0 for False, and these results are exactly the same as the bitwise operations on ints. And that works since True is defined to equal 1 and False to equal 0. 
> However, the same cannot be said of bitwise unary complement, which > returns rather useless integer values: > > >>> ~False > -1 > >>> ~True > -2 Substitute 0 for False and 1 for True, and you get exactly the same results. What else did you expect from bitwise-not? > Numpy's boolean type does the more useful (and more expected) thing: > > >>> ~np.bool_(True) > False Expected by whom? I wouldn't expect bitwise-not to be the same as binary not. If I want binary not, I'll spell it `not`. > How about changing the behaviour of bool.__invert__ to make it in line > with the Numpy boolean? > (i.e. bool.__invert__ == operator.not_) Why? What problem does this solve? We already have a perfectly good way of spelling binary not, why break backwards compatibility to get a second way to spell it? It also breaks a fundamental property of most mathematical relations: if a == b, then f(a) == f(b) (assuming f(a) and f(b) are defined for the type of both a and b). That is currently true for bools: py> (True == 1) and (~True == ~1) True py> (False==0) and (~False == ~0) True You want ~b to return `not b`: py> (True == 1) and (False == ~1) False py> (False==0) and (True == ~0) False I see no upside and a serious downside to this proposal. 
-- Steve From oscar.j.benjamin at gmail.com Thu Apr 7 08:19:38 2016 From: oscar.j.benjamin at gmail.com (Oscar Benjamin) Date: Thu, 7 Apr 2016 13:19:38 +0100 Subject: [Python-ideas] Changing the meaning of bool.__invert__ In-Reply-To: <20160407094618.5c01847d@fsol> References: <20160407094618.5c01847d@fsol> Message-ID: On 7 April 2016 at 08:46, Antoine Pitrou wrote: > > Hello, > > Booleans currently have reasonable overrides for the bitwise binary > operators: > >>>> True | False > True >>>> True & False > False >>>> True ^ False > True > > However, the same cannot be said of bitwise unary complement, which > returns rather useless integer values: > >>>> ~False > -1 >>>> ~True > -2 This is all just a consequence of the (unfortunate IMO) design that treats True and False as being equivalent to 1 and 0. > Numpy's boolean type does the more useful (and more expected) thing: > >>>> ~np.bool_(True) > False This is a consequence of another unfortunate design by numpy. The reason for this is that numpy uses Python's bitwise operators to do element-wise logical operations. This is because it is not possible to overload Python's 'and', 'or', and 'not' operators. So if I write: >>> import numpy as np >>> a = np.array([1, 2, 3, 4, 5, 6]) >>> a array([1, 2, 3, 4, 5, 6]) >>> 1 < a array([False, True, True, True, True, True], dtype=bool) >>> a < 4 array([ True, True, True, False, False, False], dtype=bool) >>> (1 < a) & (a < 4) array([False, True, True, False, False, False], dtype=bool) >>> (1 < a) and (a < 4) Traceback (most recent call last): File "<stdin>", line 1, in <module> ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all() The problem is that Python tries to shortcut this expression and so it calls bool(1 < a): >>> bool(1 < a) Traceback (most recent call last): File "<stdin>", line 1, in <module> ValueError: The truth value of an array with more than one element is ambiguous.
Use a.any() or a.all() Returning a non-bool from __bool__ is prohibited which implicitly prevents numpy from overloading an expression like `not a`: >>> class boolish: ... def __bool__(self): ... return "true" ... >>> bool(boolish()) Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: __bool__ should return bool, returned str >>> not boolish() Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: __bool__ should return bool, returned str Because of this numpy uses &,|,~ in place of and,or,not for numpy arrays and by extension this occurs for numpy scalars as you showed. I think the numpy project really wanted to use the normal Python operators but no mechanism was available to do it. It would have been nice to use &,|,~ for genuine bitwise operations (on e.g. unsigned int arrays). It also means that chained relations don't work e.g.: >>> 1 < a < 4 Traceback (most recent call last): File "<stdin>", line 1, in <module> ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all() The recommended way is (1 < a) & (a < 4) which is not as nice and requires factoring a out if a is actually an expression like sin(x)**2 or something. -- Oscar From rosuav at gmail.com Thu Apr 7 09:04:56 2016 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 7 Apr 2016 23:04:56 +1000 Subject: [Python-ideas] Dunder method to make object str-like Message-ID: This is a spin-off from the __fspath__ discussion on python-dev, in which a few people said that a more general approach would be better. Proposal: Objects should be allowed to declare that they are "string-like" by creating a dunder method (analogously to __index__ for integers) which implies a loss-less conversion to str. This could supersede the __fspath__ "give me the string for this path" protocol, or could stand in parallel with it. Obviously str will have this dunder method, returning self. Most other core types (notably 'object') will not define it.
Absence of this method implies that the object cannot be treated as a string. String methods will be defined as accepting string-like objects. For instance, "hello"+foo will succeed if foo is string-like. Downside: Two string-like objects may behave unexpectedly - foo+bar will concatenate strings if either is an actual string, but if both are other string-like objects, depends on the implementation of those objects. Bikeshedding: 1) What should the dunder method be named? __str_coerce__? __stringlike__? 2) Which standard types are sufficiently string-like to be used thus? 3) Should there be a bytes equivalent? 4) Should there be a format string "coerce to str"? "{}".format(x) is equivalent to str(x), but it might be nice to be able to assert that something's stringish already. Open season! From robert.kern at gmail.com Thu Apr 7 09:08:44 2016 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 7 Apr 2016 14:08:44 +0100 Subject: [Python-ideas] Changing the meaning of bool.__invert__ In-Reply-To: References: <20160407094618.5c01847d@fsol> Message-ID: On 2016-04-07 13:19, Oscar Benjamin wrote: > On 7 April 2016 at 08:46, Antoine Pitrou wrote: >> Numpy's boolean type does the more useful (and more expected) thing: >> >>>>> ~np.bool_(True) >> False > > This is a consequence of another unfortunate design by numpy. The > reason for this is that numpy uses Python's bitwise operators to do > element-wise logical operations. This is not correct. & | ^ ~ are all genuine bitwise operations on numpy arrays. >>> i = np.arange(5) >>> ~i array([-1, -2, -3, -4, -5]) What you are seeing with bool arrays is that bool arrays are *not* just uint8 arrays. Each element happens to take up a single 8-bit byte, but only one of those bits contributes to its value; the other 7 bits are mere padding. The bitwise & | ^ ~ operators all work on that single bit correctly. They do not operate on the padding bits as they are not part of the bool's value. 
-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From antoine at python.org Thu Apr 7 09:17:57 2016 From: antoine at python.org (Antoine Pitrou) Date: Thu, 7 Apr 2016 13:17:57 +0000 (UTC) Subject: [Python-ideas] =?utf-8?q?Changing_the_meaning_of_bool=2E=5F=5Fin?= =?utf-8?b?dmVydF9f?= References: <20160407094618.5c01847d@fsol> <20160407110425.GK12526@ando.pearwood.info> Message-ID: Steven D'Aprano writes: > > > Numpy's boolean type does the more useful (and more expected) thing: > > > > >>> ~np.bool_(True) > > False > > Expected by whom? By anyone who takes booleans at face value (that is, takes booleans as representing a truth value and expects operations on booleans to reflect the semantics of useful operations on truth values, not some arbitrary side-effect of the internal representation of a boolean...). But I'm not surprised by such armchair commenting and pointless controversy on python-ideas, since that's what the list is for.... Regards Antoine. From phd at phdru.name Thu Apr 7 09:28:45 2016 From: phd at phdru.name (Oleg Broytman) Date: Thu, 7 Apr 2016 15:28:45 +0200 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: References: Message-ID: <20160407132845.GA15898@phdru.name> On Thu, Apr 07, 2016 at 11:04:56PM +1000, Chris Angelico wrote: > 1) What should the dunder method be named? __str_coerce__? __stringlike__? __tostring__ Oleg. -- Oleg Broytman http://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN. 
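For concreteness, the protocol being discussed might look roughly like this (a sketch only; `__tostring__` is just one name suggested in this thread, and `as_string`/`FakePath` are invented here for illustration):

```python
def as_string(obj):
    """Return obj as a str only if its type declares itself string-like."""
    if isinstance(obj, str):
        return obj
    hook = getattr(type(obj), "__tostring__", None)  # hypothetical dunder
    if hook is None:
        raise TypeError(f"{type(obj).__name__!r} object is not string-like")
    result = hook(obj)
    if not isinstance(result, str):  # mirror operator.index's return-type check
        raise TypeError("__tostring__ must return str")
    return result

class FakePath:
    """Toy path-like object that opts in to the hypothetical protocol."""
    def __init__(self, *parts):
        self.parts = parts
    def __tostring__(self):
        return "/".join(self.parts)
    __str__ = __tostring__  # keep str(obj) and the hook consistent

assert as_string(FakePath("usr", "lib")) == "usr/lib"
assert as_string("abc") == "abc"
```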
From oscar.j.benjamin at gmail.com Thu Apr 7 09:28:27 2016 From: oscar.j.benjamin at gmail.com (Oscar Benjamin) Date: Thu, 7 Apr 2016 14:28:27 +0100 Subject: [Python-ideas] Changing the meaning of bool.__invert__ In-Reply-To: References: <20160407094618.5c01847d@fsol> Message-ID: On 7 April 2016 at 14:08, Robert Kern wrote: >> This is a consequence of another unfortunate design by numpy. The >> reason for this is that numpy uses Python's bitwise operators to do >> element-wise logical operations. > > > This is not correct. & | ^ ~ are all genuine bitwise operations on numpy > arrays. > >>>> i = np.arange(5) >>>> ~i > array([-1, -2, -3, -4, -5]) > > What you are seeing with bool arrays is that bool arrays are *not* just > uint8 arrays. Each element happens to take up a single 8-bit byte, but only > one of those bits contributes to its value; the other 7 bits are mere > padding. The bitwise & | ^ ~ operators all work on that single bit > correctly. They do not operate on the padding bits as they are not part of > the bool's value. I stand corrected. -- Oscar From p.f.moore at gmail.com Thu Apr 7 09:41:18 2016 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 7 Apr 2016 14:41:18 +0100 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: References: Message-ID: On 7 April 2016 at 14:04, Chris Angelico wrote: > Proposal: Objects should be allowed to declare that they are > "string-like" by creating a dunder method (analogously to __index__ > for integers) which implies a loss-less conversion to str. What would this be used for? Other than Path, what types would be viable candidates for being "string like"? I can't think of a use case for this feature. For that matter, what constitutes a "lossless conversion to str"? If you mean "doesn't lose information", then integers could easily be said to have such a conversion - but that doesn't seem right (we don't want things to start auto-converting numbers to strings). 
I'm not sure I understand the point. Paul From vgr255 at live.ca Thu Apr 7 09:46:52 2016 From: vgr255 at live.ca (=?iso-8859-1?Q?=C9manuel_Barry?=) Date: Thu, 7 Apr 2016 09:46:52 -0400 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: <20160407132845.GA15898@phdru.name> References: <20160407132845.GA15898@phdru.name> Message-ID: +1 for me. On Thu, Apr 07, 2016 at 11:04:56PM +1000, Chris Angelico wrote: > 1) What should the dunder method be named? __str_coerce__? __stringlike__? The equivalent for ints is __index__, which is one subset of the vast possibilities of integers, yet one that will mostly be served by int-like objects. By that reasoning, the str-like method should correspond to a large - yet wide - use case. The first use case for strings that comes to mind is dict (namespace) keys. I'd rule out __keys__ as not explicit enough. I was thinking __item__, but then again probably not explicit enough. I don't have "the obvious choice" in mind, but I'd like to be one word only. > 2) Which standard types are sufficiently string-like to be used thus? No built-in types, but I'd like the feature for a third-party app, for example, that pretends to be a string. > 3) Should there be a bytes equivalent? This would probably fall in the "nice to have, but no use case backing it up" category. +0 for me. > 4) Should there be a format string "coerce to str"? "{}".format(x) is equivalent to str(x), but it might be nice to be able to assert that something's stringish already. __index__ is meant to return the same thing as __int__, and I think the same restrictions should apply here - we have an object pretending to be a string, so there should *not* be a difference between use_as_string(obj) and use_as_string(str(obj)) - in the same sense that there should not be a difference between use_as_int(obj) and use_as_int(int(obj)). 
-Emanuel From rosuav at gmail.com Thu Apr 7 09:53:41 2016 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 7 Apr 2016 23:53:41 +1000 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: References: <20160407132845.GA15898@phdru.name> Message-ID: On Thu, Apr 7, 2016 at 11:46 PM, Émanuel Barry wrote: >> 4) Should there be a format string "coerce to str"? "{}".format(x) is > equivalent to str(x), but it might be nice to be able to assert that > something's stringish already. > > __index__ is meant to return the same thing as __int__, and I think the same > restrictions should apply here - we have an object pretending to be a > string, so there should *not* be a difference between use_as_string(obj) and > use_as_string(str(obj)) - in the same sense that there should not be a > difference between use_as_int(obj) and use_as_int(int(obj)). Absolutely agreed; however, I was thinking of the same restriction: "{!must_be_str}".format(x) would raise an exception if use_as_str(x) raises. But if it succeeds, yes, it's the same as simple str() formatting. ChrisA From mal at egenix.com Thu Apr 7 10:10:00 2016 From: mal at egenix.com (M.-A. Lemburg) Date: Thu, 7 Apr 2016 16:10:00 +0200 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: References: Message-ID: <57066A38.7050408@egenix.com> On 07.04.2016 15:04, Chris Angelico wrote: > This is a spin-off from the __fspath__ discussion on python-dev, in > which a few people said that a more general approach would be better. > > Proposal: Objects should be allowed to declare that they are > "string-like" by creating a dunder method (analogously to __index__ > for integers) which implies a loss-less conversion to str. I must be missing something... we already have a method for this: .__str__() -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Experts (#1, Apr 07 2016) >>> Python Projects, Coaching and Consulting ...
http://www.egenix.com/ >>> Python Database Interfaces ... http://products.egenix.com/ >>> Plone/Zope Database Interfaces ... http://zope.egenix.com/ ________________________________________________________________________ ::: We implement business ideas - efficiently in both time and costs ::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ http://www.malemburg.com/ From python at mrabarnett.plus.com Thu Apr 7 10:11:09 2016 From: python at mrabarnett.plus.com (MRAB) Date: Thu, 7 Apr 2016 15:11:09 +0100 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: References: Message-ID: <57066A7D.6000308@mrabarnett.plus.com> On 2016-04-07 14:04, Chris Angelico wrote: > This is a spin-off from the __fspath__ discussion on python-dev, in > which a few people said that a more general approach would be better. > > Proposal: Objects should be allowed to declare that they are > "string-like" by creating a dunder method (analogously to __index__ > for integers) which implies a loss-less conversion to str. > > This could supersede the __fspath__ "give me the string for this path" > protocol, or could stand in parallel with it. > > Obviously str will have this dunder method, returning self. Most other > core types (notably 'object') will not define it. Absence of this > method implies that the object cannot be treated as a string. > > String methods will be defined as accepting string-like objects. For > instance, "hello"+foo will succeed if foo is string-like. > > Downside: Two string-like objects may behave unexpectedly - foo+bar > will concatenate strings if either is an actual string, but if both > are other string-like objects, depends on the implementation of those > objects. > > Bikeshedding: > > 1) What should the dunder method be named? __str_coerce__? __stringlike__? 
> > 2) Which standard types are sufficiently string-like to be used thus? > > 3) Should there be a bytes equivalent? > > 4) Should there be a format string "coerce to str"? "{}".format(x) is > equivalent to str(x), but it might be nice to be able to assert that > something's stringish already. > > Open season! > __as_str__? From random832 at fastmail.com Thu Apr 7 10:07:06 2016 From: random832 at fastmail.com (Random832) Date: Thu, 07 Apr 2016 10:07:06 -0400 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: References: Message-ID: <1460038026.1126929.571843177.7BC2EE4C@webmail.messagingengine.com> On Thu, Apr 7, 2016, at 09:41, Paul Moore wrote: > On 7 April 2016 at 14:04, Chris Angelico wrote: > > Proposal: Objects should be allowed to declare that they are > > "string-like" by creating a dunder method (analogously to __index__ > > for integers) which implies a loss-less conversion to str. > > What would this be used for? Other than Path, what types would be > viable candidates for being "string like"? I can't think of a use case > for this feature. How about ASCII bytes strings? ;) > For that matter, what constitutes a "lossless conversion to str"? What's __index__ for? From random832 at fastmail.com Thu Apr 7 10:22:23 2016 From: random832 at fastmail.com (Random832) Date: Thu, 07 Apr 2016 10:22:23 -0400 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: <57066A38.7050408@egenix.com> References: <57066A38.7050408@egenix.com> Message-ID: <1460038943.1130439.571861105.1FAD3CFF@webmail.messagingengine.com> On Thu, Apr 7, 2016, at 10:10, M.-A. Lemburg wrote: > I must be missing something... we already have a method for this: > .__str__() The point is to have a method that objects that "shouldn't" be used as strings in some contexts *won't* have, as floats (let alone strings) don't have __index__, even though they do have __int__. 
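The float/__index__ asymmetry referred to above is easy to confirm (a stdlib-only sketch):

```python
import operator

assert int(3.4) == 3                    # lossy __int__ conversion is allowed
assert not hasattr(float, "__index__")  # but float refuses to act index-like
assert operator.index(7) == 7           # true ints pass straight through

try:
    [10, 20, 30][3.4]                   # sequence subscripts demand __index__
except TypeError:
    pass
else:
    raise AssertionError("float must not be usable as an index")
```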
From rymg19 at gmail.com Thu Apr 7 10:42:32 2016 From: rymg19 at gmail.com (Ryan Gonzalez) Date: Thu, 7 Apr 2016 09:42:32 -0500 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: References: Message-ID: I can see this going horribly wrong when it comes to things like concatenation. I've used JS far too many times to be excited for things like this. -- Ryan [ERROR]: Your autotools build scripts are 200 lines longer than your program. Something's wrong. http://kirbyfan64.github.io/ On Apr 7, 2016 8:05 AM, "Chris Angelico" wrote: > This is a spin-off from the __fspath__ discussion on python-dev, in > which a few people said that a more general approach would be better. > > Proposal: Objects should be allowed to declare that they are > "string-like" by creating a dunder method (analogously to __index__ > for integers) which implies a loss-less conversion to str. > > This could supersede the __fspath__ "give me the string for this path" > protocol, or could stand in parallel with it. > > Obviously str will have this dunder method, returning self. Most other > core types (notably 'object') will not define it. Absence of this > method implies that the object cannot be treated as a string. > > String methods will be defined as accepting string-like objects. For > instance, "hello"+foo will succeed if foo is string-like. > > Downside: Two string-like objects may behave unexpectedly - foo+bar > will concatenate strings if either is an actual string, but if both > are other string-like objects, depends on the implementation of those > objects. > > Bikeshedding: > > 1) What should the dunder method be named? __str_coerce__? __stringlike__?
From njs at pobox.com Thu Apr 7 10:56:35 2016 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 7 Apr 2016 07:56:35 -0700 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: References: Message-ID: Why strings and not, say, floats or bools or lists or bytes? Is it because strings have a uniquely compelling use case, or...? (For context from the numpy side of things, since IIUC numpy's fixed-width integer types were the impetus for __index__: numpy actually has a shadow type system that has versions of all of those types above except list. It also has a general conversion/casting system with a concept of different levels of safety, so for us __index__ is like a "safe" cast to int, and __int__ is like an "unsafe" one, and we have versions of this for all pairs of internal types; if you care about this in general then having a single parametrized concept of safe casting might make more sense than adding new methods one by one. I'm not aware of any particular pain points triggered by numpy's strings not being real strings, though, at least in numpy's current design.) -n On Apr 7, 2016 6:05 AM, "Chris Angelico" wrote: This is a spin-off from the __fspath__ discussion on python-dev, in which a few people said that a more general approach would be better. Proposal: Objects should be allowed to declare that they are "string-like" by creating a dunder method (analogously to __index__ for integers) which implies a loss-less conversion to str. This could supersede the __fspath__ "give me the string for this path" protocol, or could stand in parallel with it. Obviously str will have this dunder method, returning self.
Most other core types (notably 'object') will not define it. Absence of this method implies that the object cannot be treated as a string. String methods will be defined as accepting string-like objects. For instance, "hello"+foo will succeed if foo is string-like. Downside: Two string-like objects may behave unexpectedly - foo+bar will concatenate strings if either is an actual string, but if both are other string-like objects, depends on the implementation of those objects. Bikeshedding: 1) What should the dunder method be named? __str_coerce__? __stringlike__? 2) Which standard types are sufficiently string-like to be used thus? 3) Should there be a bytes equivalent? 4) Should there be a format string "coerce to str"? "{}".format(x) is equivalent to str(x), but it might be nice to be able to assert that something's stringish already. Open season! From python at mrabarnett.plus.com Thu Apr 7 10:58:12 2016 From: python at mrabarnett.plus.com (MRAB) Date: Thu, 7 Apr 2016 15:58:12 +0100 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: <57066A38.7050408@egenix.com> References: <57066A38.7050408@egenix.com> Message-ID: <57067584.1070900@mrabarnett.plus.com> On 2016-04-07 15:10, M.-A. Lemburg wrote: > On 07.04.2016 15:04, Chris Angelico wrote: >> This is a spin-off from the __fspath__ discussion on python-dev, in >> which a few people said that a more general approach would be better. >> >> Proposal: Objects should be allowed to declare that they are >> "string-like" by creating a dunder method (analogously to __index__ >> for integers) which implies a loss-less conversion to str. > > I must be missing something...
we already have a method for this: > .__str__() > It's for making string-like objects into strings. __str__ isn't suitable because ints, for example, have a __str__ method, but they aren't string-like. From ethan at stoneleaf.us Thu Apr 7 11:00:44 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 07 Apr 2016 08:00:44 -0700 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: References: <20160407132845.GA15898@phdru.name> Message-ID: <5706761C.3070907@stoneleaf.us> On 04/07/2016 06:46 AM, Émanuel Barry wrote: > __index__ is meant to return the same thing as __int__, and I think the same > restrictions should apply here Do you mean "the same thing" as in both return 'int', or "the same thing" as in __int__(3.4) == __index__(3.4)? Because that last one is False. > - we have an object pretending to be a > string, so there should *not* be a difference between use_as_string(obj) and > use_as_string(str(obj)) Hmmm -- so maybe you are saying that if the results of __str__ and __tostring__ are the same we're fine, just like if the results of __int__ and __index__ are the same we're fine? Otherwise an error is raised. Sounds reasonable. -- ~Ethan~ From p.f.moore at gmail.com Thu Apr 7 11:01:12 2016 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 7 Apr 2016 16:01:12 +0100 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: <1460038026.1126929.571843177.7BC2EE4C@webmail.messagingengine.com> References: <1460038026.1126929.571843177.7BC2EE4C@webmail.messagingengine.com> Message-ID: On 7 April 2016 at 15:07, Random832 wrote: > On Thu, Apr 7, 2016, at 09:41, Paul Moore wrote: >> On 7 April 2016 at 14:04, Chris Angelico wrote: >> > Proposal: Objects should be allowed to declare that they are >> > "string-like" by creating a dunder method (analogously to __index__ >> > for integers) which implies a loss-less conversion to str. >> >> What would this be used for?
Other than Path, what types would be viable candidates for being "string like"? I can't think of a use case for this feature. > > How about ASCII bytes strings? ;) OK, I guess. I can't see why it's a significant enough use case to justify a new protocol, though. >> For that matter, what constitutes a "lossless conversion to str"? > > What's __index__ for? I don't follow. It's for indexing, which requires an integer. How does that relate to the question of what constitutes a lossless conversion to str? Paul From ethan at stoneleaf.us Thu Apr 7 11:03:29 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 07 Apr 2016 08:03:29 -0700 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: <1460038026.1126929.571843177.7BC2EE4C@webmail.messagingengine.com> References: <1460038026.1126929.571843177.7BC2EE4C@webmail.messagingengine.com> Message-ID: <570676C1.3020808@stoneleaf.us> On 04/07/2016 07:07 AM, Random832 wrote: > What's __index__ for? __index__ is a way to get an int from an int-like object without losing information; so it fails with values like 3.4, but should succeed with values like Fraction(4, 2). __int__ is a way to convert the value to an int, so 3.4 becomes 3 (and the 4/10's is lost).
-- ~Ethan~ From vgr255 at live.ca Thu Apr 7 11:05:45 2016 From: vgr255 at live.ca (=?iso-8859-1?Q?=C9manuel_Barry?=) Date: Thu, 7 Apr 2016 11:05:45 -0400 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: <5706761C.3070907@stoneleaf.us> References: <20160407132845.GA15898@phdru.name> <5706761C.3070907@stoneleaf.us> Message-ID: > From: Ethan Furman > Sent: Thursday, April 07, 2016 11:01 AM > To: python-ideas at python.org > Subject: Re: [Python-ideas] Dunder method to make object str-like > > On 04/07/2016 06:46 AM, ?manuel Barry wrote: > > > __index__ is meant to return the same thing as __int__, and I think the > same > > restrictions should apply here > > Do you mean "the same thing" as in both return 'int', or "the same > thing" is in __int__(3.4) == __index__(3.4) ? Because that last one is > False. I badly phrased that, let me try again. In the cases that __index__ is defined (and does not raise), it should return the same thing as __int__. > > Hmmm -- so maybe you are saying that if the results of __str__ and > __tostring__ are the same we're fine, just like if the results of > __int__ and __index__ are the same we're fine? Otherwise an error is > raised. > Pretty much; as above, if __tostring__ (in absence of a better name) is defined and does not raise, it should return the same as __str__. 
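The analogous rule for `__index__` can be demonstrated with a small integer-valued class (a hypothetical example, not from the thread):

```python
import operator

class Two:
    """Integer-valued object: where __index__ exists it must agree with __int__."""
    def __int__(self):
        return 2
    def __index__(self):
        return 2

x = Two()
assert int(x) == operator.index(x) == 2
assert [10, 20, 30, 40][x] == 30  # usable wherever a real int index is
```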
-Emanuel From ethan at stoneleaf.us Thu Apr 7 11:15:59 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 07 Apr 2016 08:15:59 -0700 Subject: [Python-ideas] Changing the meaning of bool.__invert__ In-Reply-To: <20160407094618.5c01847d@fsol> References: <20160407094618.5c01847d@fsol> Message-ID: <570679AF.7020502@stoneleaf.us> On 04/07/2016 12:46 AM, Antoine Pitrou wrote: > Booleans currently have reasonable overrides for the bitwise binary > operators: > > --> True | False > True > --> True & False > False > --> True ^ False > True > > However, the same cannot be said of bitwise unary complement, which > returns rather useless integer values: > > --> ~False > -1 > --> ~True > -2 > > Numpy's boolean type does the more useful (and more expected) thing: > >>>> ~np.bool_(True) > False > > How about changing the behaviour of bool.__invert__ to make it in line > with the Numpy boolean? > (i.e. bool.__invert__ == operator.not_) No. bool is a subclass of int, and changing that now would be a serious breach of backward-compatibility, not to mention breaking existing code for no good reason. Anyone who wants to can create their own Boolean class that doesn't subclass int and then declare the behaviour they want. If bool had been its own thing from the start this wouldn't have been a problem, but it is far too late to change that now. You would be better off suggesting a new Logical type instead (it could even support unknown values). > Apparently someone went to the trouble of overriding __and__, __or__ > and __xor__ for booleans, which is why it looks unexpected to leave > __invert__ alone. __and__, __or__, and __xor__'s results are within bool's domain (right word?) so keeping them in the bool subtype makes sense; the result of __invert__ is not. > But I'm not surprised by such armchair commenting and pointless controversy > on python-ideas, since that's what the list is for.... If you aren't going to be civil, don't bother coming back. 
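A standalone truth-value type of the kind Ethan suggests might look like this minimal sketch (the name Logical and its API are hypothetical, not an existing class):

```python
class Logical:
    """A truth value that is deliberately NOT an int subclass, so
    ~ can mean logical negation without disturbing int semantics."""

    def __init__(self, value):
        self._value = bool(value)

    def __invert__(self):
        return Logical(not self._value)

    def __and__(self, other):
        return Logical(self._value and bool(other))

    def __or__(self, other):
        return Logical(self._value or bool(other))

    def __xor__(self, other):
        return Logical(self._value != bool(other))

    def __bool__(self):
        return self._value

    def __repr__(self):
        return "Logical(%r)" % self._value


assert bool(~Logical(True)) is False          # ~ is logical negation here
assert bool(Logical(True) ^ Logical(True)) is False
assert bool(Logical(True) | Logical(False)) is True
```

Supporting Ethan's "unknown values" aside would mean storing a third state (say None) and defining the Kleene truth tables, which is exactly why such a type can't be an int subclass.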
-- ~Ethan~ From p.f.moore at gmail.com Thu Apr 7 11:26:26 2016 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 7 Apr 2016 16:26:26 +0100 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: References: <20160407132845.GA15898@phdru.name> <5706761C.3070907@stoneleaf.us> Message-ID: On 7 April 2016 at 16:05, ?manuel Barry wrote: >> Hmmm -- so maybe you are saying that if the results of __str__ and >> __tostring__ are the same we're fine, just like if the results of >> __int__ and __index__ are the same we're fine? Otherwise an error is >> raised. >> > > Pretty much; as above, if __tostring__ (in absence of a better name) is > defined and does not raise, it should return the same as __str__. OK, that makes sense as a definition of "what __tostring__ does". But I'm struggling to think of an example of code that would legitimately want to *use* it (unlike __index__ where the obvious motivating case was indexing a sequence). And even with a motivating example of use, I don't know that I can think of an example of something other than a string that might *provide* the method. The problem with the "ASCII byte string" example is that if a type provides __tostring__ I'd expect it to work for all values of that type. I'm not clear if "ASCII byte string" is intended to mean "a value of type bytes that only contains ASCII characters", or "a type that subclasses bytes to refuse to allow non-ASCII". The former would imply __tostring__ working sometimes, but not always. The latter seems like a contrived example (although I'm open to someone explaining a real-world use case where it's important). Paul From mal at egenix.com Thu Apr 7 11:26:54 2016 From: mal at egenix.com (M.-A. 
Lemburg) Date: Thu, 7 Apr 2016 17:26:54 +0200 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: <57067584.1070900@mrabarnett.plus.com> References: <57066A38.7050408@egenix.com> <57067584.1070900@mrabarnett.plus.com> Message-ID: <57067C3E.9030808@egenix.com> On 07.04.2016 16:58, MRAB wrote: > On 2016-04-07 15:10, M.-A. Lemburg wrote: >> On 07.04.2016 15:04, Chris Angelico wrote: >>> This is a spin-off from the __fspath__ discussion on python-dev, in >>> which a few people said that a more general approach would be better. >>> >>> Proposal: Objects should be allowed to declare that they are >>> "string-like" by creating a dunder method (analogously to __index__ >>> for integers) which implies a loss-less conversion to str. >> >> I must be missing something... we already have a method for this: >> .__str__() >> > It's for making string-like objects into strings. > > __str__ isn't suitable because, ints, for example, have a __str__ > method, but they aren't string-like. Depends on what you define as "string-like", I guess :-) We have abstract base classes for such tests, but there's nothing which would define "string-like" as ABC. Before trying to define a test via a special method, I think it's better to define what exactly you mean by "string-like". Something along the lines of numbers.Number, but for strings. To make an object string-like, you then test for the ABC and then call .__str__() to get the string representation as string. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Experts (#1, Apr 07 2016) >>> Python Projects, Coaching and Consulting ... http://www.egenix.com/ >>> Python Database Interfaces ... http://products.egenix.com/ >>> Plone/Zope Database Interfaces ... 
http://zope.egenix.com/ ________________________________________________________________________ ::: We implement business ideas - efficiently in both time and costs ::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ http://www.malemburg.com/ From rosuav at gmail.com Thu Apr 7 11:34:23 2016 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 8 Apr 2016 01:34:23 +1000 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: References: Message-ID: On Fri, Apr 8, 2016 at 12:56 AM, Nathaniel Smith wrote: > Why strings and not, say, floats or bools or lists or bytes? Is it because > strings have a uniquely compelling use case, or...? > > (For context from the numpy side of things, since IIUC numpy's fixed-width > integer types were the impetus for __index__: numpy actually has a shadow > type system that has versions of all of those types above except list. It > also has a general conversion/casting system with a concept of different > levels of safety, so for us __index__ is like a "safe" cast to int, and > __int__ is like a "unsafe", and we have versions of this for all pairs of > internal types, which if you care about this in general then having a single > parametrized concept of safe casting might make more sense then adding new > methods one by one. > > I'm not aware of any particular pain points triggered by numpy's strings not > being real strings, though, at least in numpy's current design.) I'm not against there being more such "treat as" dunder methods, but I'm not sure which types need them. We have __index__/__int__, and I'm proposing adding __something__/__str__; bytes was mentioned in the "okay folks, start bikeshedding" section. With bool, I'm not sure there needs to be anything. What object would need to say "I'm functionally a boolean!" 
in a way that's stronger than truthiness/falsiness? In what context would it be appropriate to use True and numpy.bool8(1) identically, but *not* use 42 or "spam" or object() to also mean true? If there's a use-case for that, then sure, that can be added to the proposal, and maybe something generic should be added. +0. Similar with float. I don't know of any use-cases myself, but I'd guess numpy is the best place to look. Remember, though, this would convert the other object into a float, not the float into a numpy object; if you want that, stick with the existing dunder methods and implement them yourself - numpy.float64(1.0)+2.0 is an instance of numpy.float64, NOT float. +0. With lists, though, I'm -1. The normal way to say "I function as a list" is to implement sequence protocol. If you want to be able to accept a list or a "thing like a list", what you usually want is a Sequence. So are you suggesting that we should instead have a single "safe cast" dunder method?

class object:
    def __interpret_as__(self, t):
        """Return the same value as self, as an instance of t.

        If self cannot losslessly be interpreted as t, raise TypeError.
        """
        if self.__class__ is t:
            return self
        if t is int and hasattr(self, "__index__"):
            return self.__index__()
        raise TypeError("'%s' object cannot be interpreted as %s"
                        % (self.__class__.__name__, t.__name__))

class Path:
    def __interpret_as__(self, t):
        if t is str:
            return str(self)
        return super().__interpret_as__(t)

Maybe that's the way things should be looking. 
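The sketch above hangs the fallback off `object`, which real Python won't allow (methods can't be added to built-in types), but the same idea runs today as a plain helper function — `__interpret_as__` remains a hypothetical protocol name, and FakePath stands in for a real Path:

```python
import operator

def interpret_as(obj, t):
    """Safe-cast helper: return obj as an instance of t, or raise TypeError."""
    if type(obj) is t:
        return obj
    hook = getattr(type(obj), "__interpret_as__", None)  # hypothetical protocol
    if hook is not None:
        return hook(obj, t)
    if t is int:
        return operator.index(obj)  # reuse the existing lossless-int protocol
    raise TypeError("'%s' object cannot be interpreted as %s"
                    % (type(obj).__name__, t.__name__))

class FakePath:
    def __init__(self, s):
        self._s = s
    def __interpret_as__(self, t):
        if t is str:
            return self._s
        raise TypeError("FakePath cannot be interpreted as %s" % t.__name__)

assert interpret_as(FakePath("/tmp/x"), str) == "/tmp/x"
assert interpret_as(True, int) == 1   # bool qualifies via __index__
```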
ChrisA From rosuav at gmail.com Thu Apr 7 11:37:49 2016 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 8 Apr 2016 01:37:49 +1000 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: References: <20160407132845.GA15898@phdru.name> <5706761C.3070907@stoneleaf.us> Message-ID: On Fri, Apr 8, 2016 at 1:26 AM, Paul Moore wrote: > On 7 April 2016 at 16:05, ?manuel Barry wrote: >>> Hmmm -- so maybe you are saying that if the results of __str__ and >>> __tostring__ are the same we're fine, just like if the results of >>> __int__ and __index__ are the same we're fine? Otherwise an error is >>> raised. >>> >> >> Pretty much; as above, if __tostring__ (in absence of a better name) is >> defined and does not raise, it should return the same as __str__. > > OK, that makes sense as a definition of "what __tostring__ does". But > I'm struggling to think of an example of code that would legitimately > want to *use* it (unlike __index__ where the obvious motivating case > was indexing a sequence). And even with a motivating example of use, I > don't know that I can think of an example of something other than a > string that might *provide* the method. > > The problem with the "ASCII byte string" example is that if a type > provides __tostring__ I'd expect it to work for all values of that > type. I'm not clear if "ASCII byte string" is intended to mean "a > value of type bytes that only contains ASCII characters", or "a type > that subclasses bytes to refuse to allow non-ASCII". The former would > imply __tostring__ working sometimes, but not always. The latter seems > like a contrived example (although I'm open to someone explaining a > real-world use case where it's important). The original use-case was Path objects, which stringify as the human-readable representation of the file system path. 
If you need to pass a string-or-Path to something that requires a string, you need to convert the Path to a string, but *not* convert other arbitrary objects to strings; calling str(x) would happily convert the integer 1234 into the string "1234", and then you'd go trying to create that file in the current directory. Python does not let you append non-strings to strings unless you write __radd__ manually; this proposal would allow objects to declare to the interpreter "hey, I'm basically a string here", and allow Paths and strings to interoperate. ChrisA From rosuav at gmail.com Thu Apr 7 11:42:07 2016 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 8 Apr 2016 01:42:07 +1000 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: <57067C3E.9030808@egenix.com> References: <57066A38.7050408@egenix.com> <57067584.1070900@mrabarnett.plus.com> <57067C3E.9030808@egenix.com> Message-ID: On Fri, Apr 8, 2016 at 1:26 AM, M.-A. Lemburg wrote: > On 07.04.2016 16:58, MRAB wrote: >> On 2016-04-07 15:10, M.-A. Lemburg wrote: >>> On 07.04.2016 15:04, Chris Angelico wrote: >>>> This is a spin-off from the __fspath__ discussion on python-dev, in >>>> which a few people said that a more general approach would be better. >>>> >>>> Proposal: Objects should be allowed to declare that they are >>>> "string-like" by creating a dunder method (analogously to __index__ >>>> for integers) which implies a loss-less conversion to str. >>> >>> I must be missing something... we already have a method for this: >>> .__str__() >>> >> It's for making string-like objects into strings. >> >> __str__ isn't suitable because, ints, for example, have a __str__ >> method, but they aren't string-like. > > Depends on what you define as "string-like", I guess :-) > > We have abstract base classes for such tests, but there's nothing > which would define "string-like" as ABC. 
Before trying to define > a test via a special method, I think it's better to define what > exactly you mean by "string-like". > > Something along the lines of numbers.Number, but for strings. > > To make an object string-like, you then test for the ABC and > then call .__str__() to get the string representation as string. The trouble with the ABC is that it implies a large number of methods; to declare that your object is string-like, you have to define a boatload of interactions with other types, and then decide whether str+Path yields a Path or a string. By defining __tostring__ (or whatever it gets called), you effectively say "hey, go ahead and cast me to str any time you need a str". Instead of defining all those methods, you simply define one thing, and then you can basically just treat it as a string thereafter. While you could do this by simply creating a whole lot of methods and making them all return strings, you'd still have the fundamental problem that you can't be sure that this is "safe". How do you distinguish between "object that can be treated as a string" and "object that knows how to add itself to a string"? Hence, this. ChrisA From ethan at stoneleaf.us Thu Apr 7 11:43:31 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 07 Apr 2016 08:43:31 -0700 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: References: <20160407132845.GA15898@phdru.name> <5706761C.3070907@stoneleaf.us> Message-ID: <57068023.4080006@stoneleaf.us> On 04/07/2016 08:37 AM, Chris Angelico wrote: > On Fri, Apr 8, 2016 at 1:26 AM, Paul Moore wrote: >> OK, that makes sense as a definition of "what __tostring__ does". But >> I'm struggling to think of an example of code that would legitimately >> want to *use* it (unlike __index__ where the obvious motivating case >> was indexing a sequence). 
And even with a motivating example of use, I >> don't know that I can think of an example of something other than a >> string that might *provide* the method. > > The original use-case was Path objects, which stringify as the > human-readable representation of the file system path. If you need to > pass a string-or-Path to something that requires a string, you need to > convert the Path to a string, but *not* convert other arbitrary > objects to strings; calling str(x) would happily convert the integer > 1234 into the string "1234", and then you'd go trying to create that > file in the current directory. Python does not let you append > non-strings to strings unless you write __radd__ manually; this > proposal would allow objects to declare to the interpreter "hey, I'm > basically a string here", and allow Paths and strings to interoperate. The problem with that is that Paths are conceptually not strings, they just serialize to strings, and we have a lot of infrastructure in place to deal with that serialized form. Is there anything, anywhere in the Python ecosystem, that would also benefit from something like an __as_str__ method/attribute? -- ~Ethan~ From tritium-list at sdamon.com Thu Apr 7 11:46:14 2016 From: tritium-list at sdamon.com (Alexander Walters) Date: Thu, 7 Apr 2016 11:46:14 -0400 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: <57068023.4080006@stoneleaf.us> References: <20160407132845.GA15898@phdru.name> <5706761C.3070907@stoneleaf.us> <57068023.4080006@stoneleaf.us> Message-ID: <570680C6.8000407@sdamon.com> On 4/7/2016 11:43, Ethan Furman wrote: > The problem with that is that Paths are conceptually not strings, they > just serialize to strings, and we have a lot of infrastructure in > place to deal with that serialized form. Does any OS expose access to paths as anything other than the serialized form? 
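The serialized-form relationship is easy to see with pathlib itself; the `__fspath__` discussion this thread spun off from eventually produced exactly such a conversion, `os.fspath()` (added in Python 3.6):

```python
import os
import pathlib

p = pathlib.PurePosixPath("/tmp/data.txt")

# The Path object is not a str, but it serializes to one.
assert not isinstance(p, str)
assert str(p) == "/tmp/data.txt"

# os.fspath performs the "Path or str, but nothing else" conversion
# discussed here: unlike str(), it rejects arbitrary objects.
assert os.fspath(p) == "/tmp/data.txt"
assert os.fspath("already a str") == "already a str"
try:
    os.fspath(1234)   # str(1234) would happily return "1234"
except TypeError:
    pass
else:
    raise AssertionError("os.fspath should reject non-path objects")
```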
From p.f.moore at gmail.com Thu Apr 7 11:47:57 2016 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 7 Apr 2016 16:47:57 +0100 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: References: <20160407132845.GA15898@phdru.name> <5706761C.3070907@stoneleaf.us> Message-ID: On 7 April 2016 at 16:37, Chris Angelico wrote: > On Fri, Apr 8, 2016 at 1:26 AM, Paul Moore wrote: >> On 7 April 2016 at 16:05, ?manuel Barry wrote: >>>> Hmmm -- so maybe you are saying that if the results of __str__ and >>>> __tostring__ are the same we're fine, just like if the results of >>>> __int__ and __index__ are the same we're fine? Otherwise an error is >>>> raised. >>>> >>> >>> Pretty much; as above, if __tostring__ (in absence of a better name) is >>> defined and does not raise, it should return the same as __str__. >> >> OK, that makes sense as a definition of "what __tostring__ does". But >> I'm struggling to think of an example of code that would legitimately >> want to *use* it (unlike __index__ where the obvious motivating case >> was indexing a sequence). And even with a motivating example of use, I >> don't know that I can think of an example of something other than a >> string that might *provide* the method. >> >> The problem with the "ASCII byte string" example is that if a type >> provides __tostring__ I'd expect it to work for all values of that >> type. I'm not clear if "ASCII byte string" is intended to mean "a >> value of type bytes that only contains ASCII characters", or "a type >> that subclasses bytes to refuse to allow non-ASCII". The former would >> imply __tostring__ working sometimes, but not always. The latter seems >> like a contrived example (although I'm open to someone explaining a >> real-world use case where it's important). > > The original use-case was Path objects, which stringify as the > human-readable representation of the file system path. 
If you need to > pass a string-or-Path to something that requires a string, you need to > convert the Path to a string, but *not* convert other arbitrary > objects to strings; calling str(x) would happily convert the integer > 1234 into the string "1234", and then you'd go trying to create that > file in the current directory. Python does not let you append > non-strings to strings unless you write __radd__ manually; this > proposal would allow objects to declare to the interpreter "hey, I'm > basically a string here", and allow Paths and strings to interoperate. But the proposal for paths is to have a *specific* method that says "give me a string representing a filesystem path from this object". An "interpret this object as a string" wouldn't be appropriate for the cases where I'd want to do "give me a string representing a filesystem path". And that's where I get stuck, as I can't think of an example where I *would* want the more general option. For a number of reasons: 1. I can't think of a real-world example of when I'd *use* such a facility 2. I can't think of a real-world example of a type that might *provide* such a facility 3. I can't see how something so general would be of benefit. The __index__ and __fspath__ special methods have clear, focused reasons for existing. The proposed __tostring__ protocol seems too general to have a purpose. Paul From rosuav at gmail.com Thu Apr 7 11:49:15 2016 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 8 Apr 2016 01:49:15 +1000 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: <57068023.4080006@stoneleaf.us> References: <20160407132845.GA15898@phdru.name> <5706761C.3070907@stoneleaf.us> <57068023.4080006@stoneleaf.us> Message-ID: On Fri, Apr 8, 2016 at 1:43 AM, Ethan Furman wrote: > The problem with that is that Paths are conceptually not strings, they just > serialize to strings, and we have a lot of infrastructure in place to deal > with that serialized form. 
> > Is there anything, anywhere in the Python ecosystem, that would also benefit > from something like an __as_str__ method/attribute? I'd like to see this used in 2/3 compatibility code. You can make an Ascii class which subclasses bytes, but can be treated as str-compatible in both versions. By restricting its contents to [0,128) it can easily be "safe" in both byte and text contexts, and it'll cover 99%+ of use cases. So if you try to getattr(x, Ascii("foo")), it'll work fine in Py2 (because Ascii is a subclass of str) and in Py3 (because Ascii.__tostring__ returns a valid string), and there's a guarantee that it'll behave the same way (because the string must be ASCII-only). ChrisA From contrebasse at gmail.com Thu Apr 7 12:00:20 2016 From: contrebasse at gmail.com (Joseph Martinot-Lagarde) Date: Thu, 7 Apr 2016 16:00:20 +0000 (UTC) Subject: [Python-ideas] =?utf-8?q?Changing_the_meaning_of_bool=2E=5F=5Fin?= =?utf-8?b?dmVydF9f?= References: <20160407094618.5c01847d@fsol> <570679AF.7020502@stoneleaf.us> Message-ID: Ethan Furman writes: > > --> ~False > > -1 > > --> ~True > > -2 > > > No. bool is a subclass of int, and changing that now would be a serious > breach of backward-compatibility, not to mention breaking existing code > for no good reason. I get that "technically" it would be a backward incompatible change, but any code that relies on `~True == -2` has other problems than backward compatibility. 
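The values Joseph mentions follow mechanically from bool being an int subclass: ~ is plain two's-complement inversion, for which ~x == -x - 1 on any int:

```python
assert True == 1 and False == 0       # bool values *are* the ints 1 and 0
assert ~True == -2 and ~False == -1   # ~x == -x - 1, exactly as for int
assert ~True == ~1                    # consistent: a == b implies f(a) == f(b)
assert (not True) is False            # logical negation is spelled `not`
assert True + True == 2               # e.g. sum(map(bool, ...)) counts truths
```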
From p.f.moore at gmail.com Thu Apr 7 12:00:40 2016 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 7 Apr 2016 17:00:40 +0100 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: <570680C6.8000407@sdamon.com> References: <20160407132845.GA15898@phdru.name> <5706761C.3070907@stoneleaf.us> <57068023.4080006@stoneleaf.us> <570680C6.8000407@sdamon.com> Message-ID: On 7 April 2016 at 16:46, Alexander Walters wrote: > On 4/7/2016 11:43, Ethan Furman wrote: >> >> The problem with that is that Paths are conceptually not strings, they >> just serialize to strings, and we have a lot of infrastructure in place to >> deal with that serialized form. > > Does any OS expose access to paths as anything other than the serialized > form? Does any OS have an object oriented API? At the OS level, you're lucky to get any sort of structured access, so I don't think that "what the OS provides" is the level to look at this at. Languages are what provide abstractions. Lisp provides a path object, I believe. Perl does (although it may well be a 3rd party CPAN library as with a lot of things in Perl). A path abstraction isn't commonly provided, but that doesn't mean it's not useful. But this is of course offtopic for this thread. Paul From contrebasse at gmail.com Thu Apr 7 12:02:37 2016 From: contrebasse at gmail.com (Joseph Martinot-Lagarde) Date: Thu, 7 Apr 2016 16:02:37 +0000 (UTC) Subject: [Python-ideas] =?utf-8?q?Changing_the_meaning_of_bool=2E=5F=5Fin?= =?utf-8?b?dmVydF9f?= References: <20160407094618.5c01847d@fsol> <20160407110425.GK12526@ando.pearwood.info> Message-ID: Steven D'Aprano writes: > It also breaks a fundamental property of most mathematical relations: > > if a == b, then f(a) == f(b) > > (assuming f(a) and f(b) are defined for the type of both a and b). 
> > That is currently true for bools: > > py> (True == 1) and (~True == ~1) > True > py> (False==0) and (~False == ~0) > True > > You want ~b to return `not b`: > > py> (True == 1) and (False == ~1) > False > py> (False==0) and (True == ~0) > False How about the str function ? str(True) != str(1) str(False) != str(0) True == 1 is not a mathematical relation, it is python's history. From p.f.moore at gmail.com Thu Apr 7 12:11:25 2016 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 7 Apr 2016 17:11:25 +0100 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: References: <20160407132845.GA15898@phdru.name> <5706761C.3070907@stoneleaf.us> <57068023.4080006@stoneleaf.us> Message-ID: On 7 April 2016 at 16:49, Chris Angelico wrote: > On Fri, Apr 8, 2016 at 1:43 AM, Ethan Furman wrote: >> The problem with that is that Paths are conceptually not strings, they just >> serialize to strings, and we have a lot of infrastructure in place to deal >> with that serialized form. >> >> Is there anything, anywhere in the Python ecosystem, that would also benefit >> from something like an __as_str__ method/attribute? > > I'd like to see this used in 2/3 compatibility code. You can make an > Ascii class which subclasses bytes, but can be treated as > str-compatible in both versions. By restricting its contents to > [0,128) it can easily be "safe" in both byte and text contexts, and > it'll cover 99%+ of use cases. So if you try to getattr(x, > Ascii("foo")), it'll work fine in Py2 (because Ascii is a subclass of > str) and in Py3 (because Ascii.__tostring__ returns a valid string), > and there's a guarantee that it'll behave the same way (because the > string must be ASCII-only). OK, that makes sense. But when you say "it'll work fine in Python 3" how will that happen? What code needs to call __fromstring__ to make this happen? You mention getattr. Would you expect every builtin and stdlib function that takes a string to be modified to try __fromstring__? 
That sounds like a pretty big performance hit, as strings are very critical to the interpreter. Even worse, what should open() do? It takes a string as an argument. To support patthlib, it needs to also call __fspath__. I presume you'd also want it to call __fromstring__ so that your Ascii class could be used as an argument to open as well. This is starting to seem incredibly messy to solve a problem that's basically about extending support for Python 2, which is explicitly not something the Python 3 core should be doing... Paul From random832 at fastmail.com Thu Apr 7 12:16:31 2016 From: random832 at fastmail.com (Random832) Date: Thu, 07 Apr 2016 12:16:31 -0400 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: References: <1460038026.1126929.571843177.7BC2EE4C@webmail.messagingengine.com> Message-ID: <1460045791.1159102.571987969.5CC10DF6@webmail.messagingengine.com> On Thu, Apr 7, 2016, at 11:01, Paul Moore wrote: > >> For that matter, what constitutes a "lossless conversion to str"? > > > > What's __index__ for? > > I don't follow. It's for indexing, which requires an integer. Sure, but why isn't int() good enough? For the same reason you only want the kinds of objects that implement __index__ (and not, say, a float or a string that happens to be numeric) for indexing, you only want the kinds of objects that implement this method for certain purposes. From random832 at fastmail.com Thu Apr 7 12:23:42 2016 From: random832 at fastmail.com (Random832) Date: Thu, 07 Apr 2016 12:23:42 -0400 Subject: [Python-ideas] Changing the meaning of bool.__invert__ In-Reply-To: References: <20160407094618.5c01847d@fsol> <570679AF.7020502@stoneleaf.us> Message-ID: <1460046222.1160371.571991561.59F02A3F@webmail.messagingengine.com> On Thu, Apr 7, 2016, at 12:00, Joseph Martinot-Lagarde wrote: > Ethan Furman writes: > > > > --> ~False > > > -1 > > > --> ~True > > > -2 > > > > > No. 
bool is a subclass of int, and changing that now would be a serious > > breach of backward-compatibility, not to mention breaking existing code > > for no good reason. > > I get that "technically" it would be a backward incompatible change, but > any > code that relies on `~True == -2` has other problems than backward > compatibility. It's more a combination of other things that code may rely on: True == 1 False == 0 False + 1, False + True # truthy True + True == 2 # imagine implementing count_if_true() as sum(map(bool, ...)) From ethan at stoneleaf.us Thu Apr 7 12:25:52 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 07 Apr 2016 09:25:52 -0700 Subject: [Python-ideas] Changing the meaning of bool.__invert__ In-Reply-To: References: <20160407094618.5c01847d@fsol> <570679AF.7020502@stoneleaf.us> Message-ID: <57068A10.3020904@stoneleaf.us> On 04/07/2016 09:00 AM, Joseph Martinot-Lagarde wrote: > Ethan Furman writes: > >>> --> ~False >>> -1 >>> --> ~True >>> -2 >> >> No. bool is a subclass of int, and changing that now would be a serious >> breach of backward-compatibility, not to mention breaking existing code >> for no good reason. > > I get that "technically" it would be a backward incompatible change, but any > code that relies on `~True == -2` has other problems than backward > compatibility. Really? Code that relies on correct behavior (~1 == -2) has problems? Well, it may, but that's not it. -- ~Ethan~ From mal at egenix.com Thu Apr 7 12:27:30 2016 From: mal at egenix.com (M.-A. Lemburg) Date: Thu, 7 Apr 2016 18:27:30 +0200 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: References: <57066A38.7050408@egenix.com> <57067584.1070900@mrabarnett.plus.com> <57067C3E.9030808@egenix.com> Message-ID: <57068A72.4090708@egenix.com> On 07.04.2016 17:42, Chris Angelico wrote: > On Fri, Apr 8, 2016 at 1:26 AM, M.-A. Lemburg wrote: >> On 07.04.2016 16:58, MRAB wrote: >>> On 2016-04-07 15:10, M.-A. 
Lemburg wrote: >>>> On 07.04.2016 15:04, Chris Angelico wrote: >>>>> This is a spin-off from the __fspath__ discussion on python-dev, in >>>>> which a few people said that a more general approach would be better. >>>>> >>>>> Proposal: Objects should be allowed to declare that they are >>>>> "string-like" by creating a dunder method (analogously to __index__ >>>>> for integers) which implies a loss-less conversion to str. >>>> >>>> I must be missing something... we already have a method for this: >>>> .__str__() >>>> >>> It's for making string-like objects into strings. >>> >>> __str__ isn't suitable because, ints, for example, have a __str__ >>> method, but they aren't string-like. >> >> Depends on what you define as "string-like", I guess :-) >> >> We have abstract base classes for such tests, but there's nothing >> which would define "string-like" as ABC. Before trying to define >> a test via a special method, I think it's better to define what >> exactly you mean by "string-like". >> >> Something along the lines of numbers.Number, but for strings. >> >> To make an object string-like, you then test for the ABC and >> then call .__str__() to get the string representation as string. > > The trouble with the ABC is that it implies a large number of methods; > to declare that your object is string-like, you have to define a > boatload of interactions with other types, and then decide whether > str+Path yields a Path or a string. Not necessarily. In fact, a string.String ABC could have only a single method: .__str__() defined. > By defining __tostring__ (or > whatever it gets called), you effectively say "hey, go ahead and cast > me to str any time you need a str". Instead of defining all those > methods, you simply define one thing, and then you can basically just > treat it as a string thereafter. As I said: it's all a matter of defining what a "string-like" object is supposed to mean. With the above definition of string.String, you'd have exactly what you want. 
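Marc-Andre's single-method ABC could be sketched as follows (string.String is his hypothetical name; no such ABC exists in the stdlib):

```python
import abc

class String(abc.ABC):
    """Marker ABC: subclasses promise that __str__ is a lossless
    conversion to str (hypothetical, per Marc-Andre's suggestion)."""

    @abc.abstractmethod
    def __str__(self):
        ...

class TinyPath(String):
    def __init__(self, *parts):
        self.parts = parts

    def __str__(self):
        return "/".join(self.parts)

String.register(str)   # plain strings trivially qualify

p = TinyPath("usr", "local")
assert isinstance(p, String)
assert str(p) == "usr/local"
assert isinstance("plain", String)
assert not isinstance(1234, String)   # ints have __str__ but don't opt in
```

This is the ABC-as-marker design: consumers test `isinstance(x, String)` and then call `str(x)`, rather than probing for a new dunder method.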
> While you could do this by simply creating a whole lot of methods and > making them all return strings, you'd still have the fundamental > problem that you can't be sure that this is "safe". How do you > distinguish between "object that can be treated as a string" and > "object that knows how to add itself to a string"? Hence, this. I suppose the .__add__() method of Path objects would implement the necessary check isinstance(other_obj, string.String) to find out whether it's safe to assume that other_obj.__str__() returns the str version of the "string-like" object other_obj. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Experts (#1, Apr 07 2016) >>> Python Projects, Coaching and Consulting ... http://www.egenix.com/ >>> Python Database Interfaces ... http://products.egenix.com/ >>> Plone/Zope Database Interfaces ... http://zope.egenix.com/ ________________________________________________________________________ ::: We implement business ideas - efficiently in both time and costs ::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ http://www.malemburg.com/ From random832 at fastmail.com Thu Apr 7 12:31:23 2016 From: random832 at fastmail.com (Random832) Date: Thu, 07 Apr 2016 12:31:23 -0400 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: References: <20160407132845.GA15898@phdru.name> <5706761C.3070907@stoneleaf.us> <57068023.4080006@stoneleaf.us> Message-ID: <1460046683.1162090.571998529.6552D34B@webmail.messagingengine.com> On Thu, Apr 7, 2016, at 12:11, Paul Moore wrote: > OK, that makes sense. But when you say "it'll work fine in Python 3" > how will that happen? What code needs to call __fromstring__ to make > this happen? You mention getattr. 
Would you expect every builtin and > stdlib function that takes a string to be modified to try > __fromstring__? That sounds like a pretty big performance hit, as > strings are very critical to the interpreter. Isn't it only a performance hit on something that's an exception now? Like, if PyString_Check fails, then call it. I wonder how much could be done in a "blanket" way without having to change individual methods. Like, put the necessary machinery in PyArg_ParseTuple. Or does that borrow a reference? ---- Taking a step back, can someone explain to me in plain english why io and os shouldn't directly support pathlib? All this "well maybe make it a subclass, well maybe make a special protocol stuff can implement" stuff is dancing around that. From ethan at stoneleaf.us Thu Apr 7 12:32:42 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 07 Apr 2016 09:32:42 -0700 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: References: <20160407132845.GA15898@phdru.name> <5706761C.3070907@stoneleaf.us> <57068023.4080006@stoneleaf.us> Message-ID: <57068BAA.5090105@stoneleaf.us> On 04/07/2016 08:49 AM, Chris Angelico wrote: > On Fri, Apr 8, 2016 at 1:43 AM, Ethan Furman wrote: >> The problem with that is that Paths are conceptually not strings, they just >> serialize to strings, and we have a lot of infrastructure in place to deal >> with that serialized form. >> >> Is there anything, anywhere in the Python ecosystem, that would also benefit >> from something like an __as_str__ method/attribute? > > I'd like to see this used in 2/3 compatibility code. You can make an > Ascii class which subclasses bytes, but can be treated as > str-compatible in both versions. By restricting its contents to > [0,128) it can easily be "safe" in both byte and text contexts, and > it'll cover 99%+ of use cases. 
So if you try to getattr(x, > Ascii("foo")), it'll work fine in Py2 (because Ascii is a subclass of > str) and in Py3 (because Ascii.__tostring__ returns a valid string), > and there's a guarantee that it'll behave the same way (because the > string must be ASCII-only). Make the Ascii class subclass bytes in 2.x and str in 3.x; the 2.x version's __str__ returns bytes, while its __unicode__ returns unicode, and in 3.x __str__ returns str, and __unicode__ doesn't exist. Problem solved, no other special methods needed.* ;) -- ~Ethan~ * crosses fingers hoping to not have missed something obvious ;) From guido at python.org Thu Apr 7 12:38:25 2016 From: guido at python.org (Guido van Rossum) Date: Thu, 7 Apr 2016 09:38:25 -0700 Subject: [Python-ideas] Changing the meaning of bool.__invert__ In-Reply-To: <57068A10.3020904@stoneleaf.us> References: <20160407094618.5c01847d@fsol> <570679AF.7020502@stoneleaf.us> <57068A10.3020904@stoneleaf.us> Message-ID: Honestly I think that the OP has a point, and I don't think we have to bend over backwards to preserve int compatibility. After all str(True) != str(1), and surely there are other examples. -- --Guido van Rossum (python.org/~guido) From ethan at stoneleaf.us Thu Apr 7 12:43:28 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 07 Apr 2016 09:43:28 -0700 Subject: [Python-ideas] Changing the meaning of bool.__invert__ In-Reply-To: References: <20160407094618.5c01847d@fsol> <20160407110425.GK12526@ando.pearwood.info> Message-ID: <57068E30.1020702@stoneleaf.us> On 04/07/2016 09:02 AM, Joseph Martinot-Lagarde wrote: > How about the str function ? > str(True) != str(1) > str(False) != str(0) bools are a subclass of int, and the string representation of an int is not integral to its value: class fraction(int): ...
fraction('4/2') == 2 str(fraction('4/2')) != str(2) -- ~Ethan~ From rosuav at gmail.com Thu Apr 7 12:44:17 2016 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 8 Apr 2016 02:44:17 +1000 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: References: <20160407132845.GA15898@phdru.name> <5706761C.3070907@stoneleaf.us> <57068023.4080006@stoneleaf.us> Message-ID: On Fri, Apr 8, 2016 at 2:11 AM, Paul Moore wrote: > Even worse, what should open() do? It takes a string as an argument. > To support patthlib, it needs to also call __fspath__. I presume you'd > also want it to call __fromstring__ so that your Ascii class could be > used as an argument to open as well. This is starting to seem > incredibly messy to solve a problem that's basically about extending > support for Python 2, which is explicitly not something the Python 3 > core should be doing... This would replace __fspath__. There'd be no need for a Path-specific dunder if there's a generic "this can be treated as a string" dunder. ChrisA From rosuav at gmail.com Thu Apr 7 12:46:32 2016 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 8 Apr 2016 02:46:32 +1000 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: <57068A72.4090708@egenix.com> References: <57066A38.7050408@egenix.com> <57067584.1070900@mrabarnett.plus.com> <57067C3E.9030808@egenix.com> <57068A72.4090708@egenix.com> Message-ID: On Fri, Apr 8, 2016 at 2:27 AM, M.-A. Lemburg wrote: > Not necessarily. In fact, a string.String ABC could have only > a single method: .__str__() defined. That wouldn't be very useful; object.__str__ exists and is functional, so EVERY object would count as a string. The point of "string-like" is that it can be treated as a string, not just that it can be converted to one. This is exactly parallel to the difference between __index__ and __int__; floats can be converted to int (and will truncate), but cannot be *treated* as ints. 
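The __index__/__int__ distinction is easy to demonstrate (the class names here are made up):

```python
class Ratio:
    """Convertible to int via __int__, but not usable *as* an int."""
    def __init__(self, n, d):
        self.n, self.d = n, d
    def __int__(self):
        return self.n // self.d

class Nat:
    """__index__ declares: it is safe to treat me as an int."""
    def __init__(self, n):
        self.n = n
    def __index__(self):
        return self.n

items = ['a', 'b', 'c']
print(int(Ratio(3, 2)))   # 1 -- explicit conversion is allowed
print(items[Nat(1)])      # b -- indexing accepts anything with __index__
try:
    items[Ratio(3, 2)]    # ...but __int__ alone is rejected
except TypeError as exc:
    print("TypeError:", exc)
```

The proposed string dunder would play the same role for str that __index__ plays for int.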
ChrisA From rosuav at gmail.com Thu Apr 7 13:02:30 2016 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 8 Apr 2016 03:02:30 +1000 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: <1460046683.1162090.571998529.6552D34B@webmail.messagingengine.com> References: <20160407132845.GA15898@phdru.name> <5706761C.3070907@stoneleaf.us> <57068023.4080006@stoneleaf.us> <1460046683.1162090.571998529.6552D34B@webmail.messagingengine.com> Message-ID: On Fri, Apr 8, 2016 at 2:31 AM, Random832 wrote: > On Thu, Apr 7, 2016, at 12:11, Paul Moore wrote: >> OK, that makes sense. But when you say "it'll work fine in Python 3" >> how will that happen? What code needs to call __fromstring__ to make >> this happen? You mention getattr. Would you expect every builtin and >> stdlib function that takes a string to be modified to try >> __fromstring__? That sounds like a pretty big performance hit, as >> strings are very critical to the interpreter. > > Isn't it only a performance hit on something that's an exception now? > Like, if PyString_Check fails, then call it. I like this idea; it can be supported by a very small semantic change, namely that this method is not guaranteed to be called on strings or subclasses of strings. For cross-implementation compatibility, I would require that str.__name_needed__() return self, and that well-behaved subclasses not override this; that way, there's no semantic difference. Consider this like the "x is y implies x == y" optimization that crops up here and there; while it's legal to have x.__eq__(x) return False (eg NaN), there are some places where it's assumed to be found anyway (or, to be technically correct, where something looks for "x is y or x == y" rather than simply "x == y"). So this would be "self if isinstance(self, str) else self.__thingy__()". 
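A rough pure-Python sketch of that check, keeping Chris's __thingy__ placeholder name (this is not a real protocol; it only illustrates the proposed fast path plus fallback):

```python
def as_str(obj):
    # Fast path: genuine strings (and well-behaved subclasses) pass through.
    if isinstance(obj, str):
        return obj
    # Failure-mode fallback: ask the type for a string presentation.
    hook = getattr(type(obj), '__thingy__', None)
    if hook is None:
        raise TypeError("Can't convert %r object to str implicitly"
                        % type(obj).__name__)
    result = hook(obj)
    if not isinstance(result, str):
        raise TypeError("__thingy__ must return a str")
    return result

class FakePath:                      # illustrative; not the real pathlib API
    def __init__(self, p):
        self._p = p
    def __thingy__(self):
        return self._p

print(as_str("spam"))            # spam
print(as_str(FakePath("/tmp")))  # /tmp
```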
Deep inside CPython, it would simply be failure-mode handling: instead of directly raising TypeError, attempt a treat-as-string conversion, which will raise TypeError if it's not possible. A cursory inspection of the code suggests that the only place that needs to be changed is PyUnicode_FromObject in unicodeobject.c. It currently checks if the object is exactly an instance of PyUnicode (aka 'str'), then for a subclass of same, then raises TypeError "Can't convert '%.100s' object to str implicitly". Even the wording of that message leaves it open to more types being implicitly convertible; the only change here is that you can make your type convertible without actually subclassing str. ChrisA From desmoulinmichel at gmail.com Thu Apr 7 13:03:35 2016 From: desmoulinmichel at gmail.com (Michel Desmoulin) Date: Thu, 7 Apr 2016 19:03:35 +0200 Subject: [Python-ideas] Boundaries for unpacking Message-ID: <570692E7.9080904@gmail.com> Python is a lot about iteration, and I often have to get values from an iterable. For this, unpacking is fantastic: a, b, c = iterable One problem that arises is that you don't know whether the iterable will contain 3 items. In that case, this beautiful code becomes: iterator = iter(iterable) a = next(iterator, "default value") b = next(iterator, "default value") c = next(iterator, "default value") More often than not, I wish there were a way to specify a default value for unpacking. This would also come in handy to get the first or last item of an iterable that can be empty. Indeed: a, *b = iterable *a, b = iterable Would fail if the iterable's iterator cannot yield at least one item. I don't have a clean syntax in mind, so I hope some people will get inspired by it and will make suggestions. I add a couple of ideas: a, b, c with "default value" = iterable a, b, c except "default value" = iterable a, b, c | "default value" = iterable But none of them are great.
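(The default-value behavior is already expressible today with itertools, at the cost of remembering an incantation -- the helper name unpack is invented here:)

```python
from itertools import chain, islice, repeat

def unpack(iterable, n, default="default value"):
    # Pad the iterable with defaults, then cut it down to exactly n items.
    return islice(chain(iterable, repeat(default)), n)

a, b, c = unpack([1, 2], 3)
print(a, b, c)  # 1 2 default value
```

Because islice stops after n items, this also works lazily on large generators.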
Another problem is that you can't use unpacking on a generator yielding 1000000 values: a, b = iterable # error a, b, *c = iterable # consume the whole thing, and load it in memory So if I want the first 2 items, I have to go back to: iterator = iter(iterable) a = next(iterator) b = next(iterator) Now, I made a suggestion to allow slicing on any iterable (or at least on any iterator), but I'm not going to hold my breath on it: a, b = iterable[:2] # allow that on generators or a, b = iter(iterable)[:2] # allow that Some have suggested to add a new syntax: a, b = iterable(:2) But even just a way to specify that you just want the first 2 items without unpacking the whole iterable would work for me. Ideally though, the 2 options should be able to be used together. From p.f.moore at gmail.com Thu Apr 7 13:16:56 2016 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 7 Apr 2016 18:16:56 +0100 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: <1460045791.1159102.571987969.5CC10DF6@webmail.messagingengine.com> References: <1460038026.1126929.571843177.7BC2EE4C@webmail.messagingengine.com> <1460045791.1159102.571987969.5CC10DF6@webmail.messagingengine.com> Message-ID: On 7 April 2016 at 17:16, Random832 wrote: > On Thu, Apr 7, 2016, at 11:01, Paul Moore wrote: >> >> For that matter, what constitutes a "lossless conversion to str"? >> > >> > What's __index__ for? >> >> I don't follow. It's for indexing, which requires an integer. > > Sure, but why isn't int() good enough? For the same reason you only want > the kinds of objects that implement __index__ (and not, say, a float or > a string that happens to be numeric) for indexing, you only want the > kinds of objects that implement this method for certain purposes. You're making my point here. A "lossless conversion to str" isn't a good enough definition of which things should implement the new protocol.
I'm trying to get someone to tell me what criteria I should use to decide if my type should implement the new protocol. At the moment, I have no idea. Paul From ethan at stoneleaf.us Thu Apr 7 13:17:57 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 07 Apr 2016 10:17:57 -0700 Subject: [Python-ideas] Changing the meaning of bool.__invert__ In-Reply-To: References: <20160407094618.5c01847d@fsol> <570679AF.7020502@stoneleaf.us> <57068A10.3020904@stoneleaf.us> Message-ID: <57069645.8040509@stoneleaf.us> On 04/07/2016 09:38 AM, Guido van Rossum wrote: > Honestly I think that the OP has a point, and I don't think we have to > bend over backwards to preserve int compatibility. After all str(True) > != str(1), and surely there are other examples. I think the str() of a value, while possibly being the most interesting piece of information (IntEnum, anyone?), is hardly the most intrinsic. If we do make this change, besides needing a couple major versions to make it happen, will anything else be different? - no longer subclass int? - add an "unknown" value? - how will indexing work? - or any of the other operations? - don't bother with any of the other mathematical operations? - counting True's is not the same as adding True's I'm not firmly opposed, I just don't see a major issue here -- I've needed an Unknown value far more often than I've needed ~True to be False. -- ~Ethan~ From p.f.moore at gmail.com Thu Apr 7 13:18:54 2016 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 7 Apr 2016 18:18:54 +0100 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: References: <20160407132845.GA15898@phdru.name> <5706761C.3070907@stoneleaf.us> <57068023.4080006@stoneleaf.us> Message-ID: On 7 April 2016 at 17:44, Chris Angelico wrote: > On Fri, Apr 8, 2016 at 2:11 AM, Paul Moore wrote: >> Even worse, what should open() do? It takes a string as an argument. >> To support patthlib, it needs to also call __fspath__.
I presume you'd >> also want it to call __fromstring__ so that your Ascii class could be >> used as an argument to open as well. This is starting to seem >> incredibly messy to solve a problem that's basically about extending >> support for Python 2, which is explicitly not something the Python 3 >> core should be doing... > > This would replace __fspath__. There'd be no need for a Path-specific > dunder if there's a generic "this can be treated as a string" dunder. So the only things that should implement the new protocol would be paths. Otherwise, they could be passed to things that expect a path. Once again, I'm confused. Can someone please explain to me how to decide whether my type should provide the new protocol. And whether my code should check the new protocol. At the moment, I can't answer those questions with the information given in this thread. Paul From ian.g.kelly at gmail.com Thu Apr 7 13:19:10 2016 From: ian.g.kelly at gmail.com (Ian Kelly) Date: Thu, 7 Apr 2016 11:19:10 -0600 Subject: [Python-ideas] Changing the meaning of bool.__invert__ In-Reply-To: References: <20160407094618.5c01847d@fsol> <570679AF.7020502@stoneleaf.us> <57068A10.3020904@stoneleaf.us> Message-ID: On Thu, Apr 7, 2016 at 10:38 AM, Guido van Rossum wrote: > Honestly I think that the OP has a point, and I don't think we have to > bend over backwards to preserve int compatibility. After all str(True) > != str(1), and surely there are other examples. I can see it going either way: if we treat the domain of bool as that of the integers, then ~True == ~1 == -2. If on the other hand we treat it as the integers modulo 2, then it makes sense that ~True == ~1 == 0. But this would also imply that True + True == False, which would definitely break existing code. I note that if you add an explicit modulo division by 2, then it works out: py> ~True % 2 0 py> ~False % 2 1 The salient point to me is that there's no strong justification for making the change. 
As has been pointed out elsewhere in the thread, if you want binary not, just use not. From guido at python.org Thu Apr 7 13:23:59 2016 From: guido at python.org (Guido van Rossum) Date: Thu, 7 Apr 2016 10:23:59 -0700 Subject: [Python-ideas] Changing the meaning of bool.__invert__ In-Reply-To: <57069645.8040509@stoneleaf.us> References: <20160407094618.5c01847d@fsol> <570679AF.7020502@stoneleaf.us> <57068A10.3020904@stoneleaf.us> <57069645.8040509@stoneleaf.us> Message-ID: Nothing else is on the table. Seriously. Stop hijacking the thread. --Guido (mobile) On Apr 7, 2016 10:18 AM, "Ethan Furman" wrote: > On 04/07/2016 09:38 AM, Guido van Rossum wrote: > > Honestly I think that the OP has a point, and I don't think we have to >> bend over backwards to preserve int compatibility. After all str(True) >> != str(1), and surely there are other examples. >> > > I think the str() of a value, while possibly being the most interesting > piece of information (IntEnum, anyone?), is hardly the most intrinsic. > > If we do make this change, besides needing a couple major versions to make > it happen, will anything else be different? > > - no longer subclass int? > - add an "unknown" value? > - how will indexing work? > - or any of the other operations? > - don't bother with any of the other mathematical operations? > - counting True's is not the same as adding True's > > I'm not firmly opposed, I just don't see a major issue here -- I've needed > an Unknown value for more often that I've needed ~True to be False. > > -- > ~Ethan~ > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ethan at stoneleaf.us Thu Apr 7 13:30:43 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 07 Apr 2016 10:30:43 -0700 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: <1460046683.1162090.571998529.6552D34B@webmail.messagingengine.com> References: <20160407132845.GA15898@phdru.name> <5706761C.3070907@stoneleaf.us> <57068023.4080006@stoneleaf.us> <1460046683.1162090.571998529.6552D34B@webmail.messagingengine.com> Message-ID: <57069943.8050008@stoneleaf.us> On 04/07/2016 09:31 AM, Random832 wrote: > Taking a step back, can someone explain to me in plain english why io > and os shouldn't directly support pathlib? All this "well maybe make it > a subclass, well maybe make a special protocol stuff can implement" > stuff is dancing around that. The protocol is "the how" of supporting pathlib, and, interestingly enough, the easiest way to do so (avoids circular imports, etc., etc,.). -- ~Ethan~ From brett at python.org Thu Apr 7 13:31:19 2016 From: brett at python.org (Brett Cannon) Date: Thu, 07 Apr 2016 17:31:19 +0000 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: References: <20160407132845.GA15898@phdru.name> <5706761C.3070907@stoneleaf.us> <57068023.4080006@stoneleaf.us> Message-ID: On Thu, 7 Apr 2016 at 08:58 Chris Angelico wrote: > On Fri, Apr 8, 2016 at 1:43 AM, Ethan Furman wrote: > > The problem with that is that Paths are conceptually not strings, they > just > > serialize to strings, and we have a lot of infrastructure in place to > deal > > with that serialized form. > > > > Is there anything, anywhere in the Python ecosystem, that would also > benefit > > from something like an __as_str__ method/attribute? > > I'd like to see this used in 2/3 compatibility code. You can make an > Ascii class which subclasses bytes, but can be treated as > str-compatible in both versions. 
By restricting its contents to > [0,128) it can easily be "safe" in both byte and text contexts, and > it'll cover 99%+ of use cases. So if you try to getattr(x, > Ascii("foo")), it'll work fine in Py2 (because Ascii is a subclass of > str) and in Py3 (because Ascii.__tostring__ returns a valid string), > and there's a guarantee that it'll behave the same way (because the > string must be ASCII-only). > But couldn't you also just define a str subclass that checks its argument(s) are only valid ASCII values? What you're proposing is to potentially change all places that operate with a string to now check for a special method which would be a costly change potentially to performance as well as propagating this concept everywhere a string-like object is expected. I think it's important to realize that the main reason we are considering this special method concept is to make it easier to introduce in third-party code which doesn't have pathlib or for people who don't want to import pathlib just to convert a pathlib.PurePath object to a string. Otherwise we would simply have pathlib.fspath() or something that checked if its argument was a subclass of pathlib.PurePath and if so call str() on it, else double-check that argument was an instance of str and keep it simple. But instead we are trying to do the practical thing and come up with a common method name that people can be sure will exist on pathlib.PurePath and any other third-party path library. I think trying to generalize to "string-like but not __str__()" is a premature optimization that has not exposed enough use-cases to warrant such a deep change to the language. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mike at selik.org Thu Apr 7 13:39:56 2016 From: mike at selik.org (Michael Selik) Date: Thu, 7 Apr 2016 18:39:56 +0100 Subject: [Python-ideas] Changing the meaning of bool.__invert__ In-Reply-To: References: <20160407094618.5c01847d@fsol> <570679AF.7020502@stoneleaf.us> <57068A10.3020904@stoneleaf.us> <57069645.8040509@stoneleaf.us> Message-ID: > On Apr 7, 2016, at 6:23 PM, Guido van Rossum wrote: > Nothing else is on the table. Seriously. Stop hijacking the thread. To clarify, the proposal is: ``~True == False`` but every other operation on ``True`` remains the same, including ``True * 42 == 42``. Correct? From brett at python.org Thu Apr 7 13:40:23 2016 From: brett at python.org (Brett Cannon) Date: Thu, 07 Apr 2016 17:40:23 +0000 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: References: <20160407132845.GA15898@phdru.name> <5706761C.3070907@stoneleaf.us> <57068023.4080006@stoneleaf.us> Message-ID: On Thu, 7 Apr 2016 at 10:19 Paul Moore wrote: > On 7 April 2016 at 17:44, Chris Angelico wrote: > > On Fri, Apr 8, 2016 at 2:11 AM, Paul Moore wrote: > >> Even worse, what should open() do? It takes a string as an argument. > >> To support patthlib, it needs to also call __fspath__. I presume you'd > >> also want it to call __fromstring__ so that your Ascii class could be > >> used as an argument to open as well. This is starting to seem > >> incredibly messy to solve a problem that's basically about extending > >> support for Python 2, which is explicitly not something the Python 3 > >> core should be doing... > > > > This would replace __fspath__. There'd be no need for a Path-specific > > dunder if there's a generic "this can be treated as a string" dunder. > > So the only things that should implement the new protocol would be > paths. Otherwise, they could be passed to things that expect a path. > > Once again, I'm confused. > > Can someone please explain to me how to decide whether my type should > provide the new protocol. 
And whether my code should check the new > protocol. At the moment, I can't answer those questions with the > information given in this thread. > I've reached the same conclusion/point myself. The attempt to come up with a pure solution is wreaking havoc with the practicality side of my brain that doesn't understand the rules it's supposed to follow in order to make this work. -------------- next part -------------- An HTML attachment was scrubbed... URL: From toddrjen at gmail.com Thu Apr 7 13:44:50 2016 From: toddrjen at gmail.com (Todd) Date: Thu, 7 Apr 2016 13:44:50 -0400 Subject: [Python-ideas] Boundaries for unpacking In-Reply-To: <570692E7.9080904@gmail.com> References: <570692E7.9080904@gmail.com> Message-ID: On Thu, Apr 7, 2016 at 1:03 PM, Michel Desmoulin wrote: > Python is a lot about iteration, and I often have to get values from an > iterable. For this, unpacking is fantastic: > > a, b, c = iterable > > One that problem arises is that you don't know when iterable will > contain 3 items. > > In that case, this beautiful code becomes: > > iterator = iter(iterable) > a = next(iterator, "default value") > b = next(iterator, "default value") > c = next(iterator, "default value") > I think rather than adding a new syntax, it would be better to just make one of these work: a, b, c, *d = itertools.chain(iterable, itertools.repeat('default value')) a, b, c = itertools.chain(iterable, itertools.repeat('default value')) a, b, c = itertools.chain.from_iterable(iterable, default='default value') -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mike at selik.org Thu Apr 7 13:53:40 2016 From: mike at selik.org (Michael Selik) Date: Thu, 7 Apr 2016 18:53:40 +0100 Subject: [Python-ideas] Dictionary views are not entirely 'set like' In-Reply-To: References: Message-ID: <4E0BE43A-2B95-4AE0-87E2-731DCF44859F@selik.org> > On Apr 6, 2016, at 10:38 AM, Joshua Morton wrote: > set() | [] # 2 > {}.keys() | [] # 3 Looks like this should be standardized. Either both raise TypeError, or both return a set. My preference would be TypeError, but that might be worse for backwards-compatibility. > {}.keys().union(set()) # 6 Seems to me that the pipe operator is staying on MappingView, so it's reasonable to add a corresponding ``.union`` to mimic sets. And intersection, etc. > {}.values() == {}.values() # 9 > d = {}; d.values() == d.values() # 10 It's weird, but float('nan') != float('nan'). I'm not particularly bothered by this. From ethan at stoneleaf.us Thu Apr 7 13:56:04 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 07 Apr 2016 10:56:04 -0700 Subject: [Python-ideas] Changing the meaning of bool.__invert__ In-Reply-To: References: <20160407094618.5c01847d@fsol> <570679AF.7020502@stoneleaf.us> <57068A10.3020904@stoneleaf.us> <57069645.8040509@stoneleaf.us> Message-ID: <57069F34.3080102@stoneleaf.us> On 04/07/2016 10:23 AM, Guido van Rossum wrote: > Nothing else is on the table. Okay. > Seriously. Whew, I thought you were joking. > Stop hijacking the thread. Bite me. > --Guido (mobile) --Ethan (pissed) From guido at python.org Thu Apr 7 14:08:38 2016 From: guido at python.org (Guido van Rossum) Date: Thu, 7 Apr 2016 11:08:38 -0700 Subject: [Python-ideas] Changing the meaning of bool.__invert__ In-Reply-To: References: <20160407094618.5c01847d@fsol> <570679AF.7020502@stoneleaf.us> <57068A10.3020904@stoneleaf.us> <57069645.8040509@stoneleaf.us> Message-ID: On Thu, Apr 7, 2016 at 10:39 AM, Michael Selik wrote: > >> On Apr 7, 2016, at 6:23 PM, Guido van Rossum wrote: >> Nothing else is on the table. 
Seriously. Stop hijacking the thread. > > To clarify, the proposal is: ``~True == False`` but every other operation on ``True`` remains the same, including ``True * 42 == 42``. Correct? Yes. To be more precise, there are some "arithmetic" operations (+, -, *, /, **) and they all treat bools as ints and always return ints; there are also some "bitwise" operations (&, |, ^, ~) and they should all treat bools as bools and return a bool. Currently the only exception to this idea is that ~ returns an int, so the proposal is to fix that. (There are also some "boolean" operations (and, or, not) and they are also unchanged.) -- --Guido van Rossum (python.org/~guido) From brett at python.org Thu Apr 7 14:10:32 2016 From: brett at python.org (Brett Cannon) Date: Thu, 07 Apr 2016 18:10:32 +0000 Subject: [Python-ideas] Changing the meaning of bool.__invert__ In-Reply-To: <57069F34.3080102@stoneleaf.us> References: <20160407094618.5c01847d@fsol> <570679AF.7020502@stoneleaf.us> <57068A10.3020904@stoneleaf.us> <57069645.8040509@stoneleaf.us> <57069F34.3080102@stoneleaf.us> Message-ID: On Thu, 7 Apr 2016 at 10:55 Ethan Furman wrote: > On 04/07/2016 10:23 AM, Guido van Rossum wrote: > > > Nothing else is on the table. > > Okay. > > > Seriously. > > Whew, I thought you were joking. > > > Stop hijacking the thread. > > Bite me. > > > --Guido (mobile) > > --Ethan (pissed) > OK, that's enough. You already snapped at Antoine over his negative dig at python-ideas -- which I don't condone either -- and now this. Please consider stepping away from the keyboard until you've calmed down. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From desmoulinmichel at gmail.com Thu Apr 7 14:13:16 2016 From: desmoulinmichel at gmail.com (Michel Desmoulin) Date: Thu, 7 Apr 2016 20:13:16 +0200 Subject: [Python-ideas] Boundaries for unpacking In-Reply-To: References: <570692E7.9080904@gmail.com> Message-ID: <5706A33C.6080100@gmail.com> Ignoring the fact that it's inelegant and verbose (not even counting the import), the biggest problem is that there is no way I will ever remember it. Which means every time I want to do it (probably twice a week), I will have to look it up in the doc, just like I do with itertools.groupby or the recipe to iterate on a window of values. Le 07/04/2016 19:44, Todd a écrit : > a, b, c, *d = itertools.chain(iterable, itertools.repeat('default value')) From mike at selik.org Thu Apr 7 14:15:26 2016 From: mike at selik.org (Michael Selik) Date: Thu, 7 Apr 2016 19:15:26 +0100 Subject: [Python-ideas] Boundaries for unpacking In-Reply-To: References: <570692E7.9080904@gmail.com> Message-ID: > On Apr 7, 2016, at 6:44 PM, Todd wrote: > > On Thu, Apr 7, 2016 at 1:03 PM, Michel Desmoulin wrote: > Python is a lot about iteration, and I often have to get values from an > iterable. For this, unpacking is fantastic: > > a, b, c = iterable > > One that problem arises is that you don't know when iterable will > contain 3 items. > > In that case, this beautiful code becomes: > > iterator = iter(iterable) > a = next(iterator, "default value") > b = next(iterator, "default value") > c = next(iterator, "default value") I actually don't think that's so ugly. Looks fairly clear to me. What about these alternatives?
>>> from itertools import repeat, islice, chain >>> it = range(2) >>> a, b, c = islice(chain(it, repeat('default')), 3) >>> a, b, c (0, 1, 'default') If you want to stick with builtins: >>> it = iter(range(2)) >>> a, b, c = [next(it, 'default') for i in range(3)] >>> a, b, c (0, 1, 'default') If your iterable is sliceable and sizeable: >>> it = list(range(2)) >>> a, b, c = it[:3] + ['default'] * (3 - len(it)) >>> a, b, c (0, 1, 'default') Perhaps add a recipe to itertools, or change the "take" recipe? def unpack(n, iterable, default=None): "Slice the first n items of the iterable, padding with a default" padding = repeat(default) return islice(chain(iterable, padding), n) From desmoulinmichel at gmail.com Thu Apr 7 14:24:23 2016 From: desmoulinmichel at gmail.com (Michel Desmoulin) Date: Thu, 7 Apr 2016 20:24:23 +0200 Subject: [Python-ideas] Boundaries for unpacking In-Reply-To: References: <570692E7.9080904@gmail.com> Message-ID: <5706A5D7.9020407@gmail.com> Le 07/04/2016 20:15, Michael Selik a écrit : > >> On Apr 7, 2016, at 6:44 PM, Todd wrote: >> >> On Thu, Apr 7, 2016 at 1:03 PM, Michel Desmoulin wrote: >> Python is a lot about iteration, and I often have to get values from an >> iterable. For this, unpacking is fantastic: >> >> a, b, c = iterable >> >> One that problem arises is that you don't know when iterable will >> contain 3 items. >> >> In that case, this beautiful code becomes: >> >> iterator = iter(iterable) >> a = next(iterator, "default value") >> b = next(iterator, "default value") >> c = next(iterator, "default value") > I actually don't think that's so ugly. Looks fairly clear to me. What about these alternatives? Well, there is nothing wrong with: a = mylist[0] b = mylist[1] c = mylist[2] But we still prefer unpacking. And this is way more verbose.
> >>>> from itertools import repeat, islice, chain >>>> it = range(2) >>>> a, b, c = islice(chain(it, repeat('default')), 3) >>>> a, b, c > (0, 1, 'default') > > > If you want to stick with builtins: > >>>> it = iter(range(2)) >>>> a, b, c = [next(it, 'default') for i in range(3)] >>>> a, b, c > (0, 1, 'default') They all work, but they are impossible to remember, plus you will need a comment everytime you use them outside of the shell. > > > If your utterable is sliceable and sizeable: > >>>> it = list(range(2)) >>>> a, b, c = it[:3] + ['default'] * (3 - len(it)) >>>> a, b, c > (0, 1, 'default') Same problem, and as you said, you can forget about generators. > > > > Perhaps add a recipe to itertools, or change the "take" recipe? > > def unpack(n, iterable, default=None): > "Slice the first n items of the iterable, padding with a default" > padding = repeat(default) > return islice(chain(iterable, padding), n)) Not a bad idea. Built in would be better, but I can live with itertools. I import it so often I'm wondering if itertools shouldn't be in __builtins__ :) > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > From rosuav at gmail.com Thu Apr 7 14:28:35 2016 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 8 Apr 2016 04:28:35 +1000 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: References: <20160407132845.GA15898@phdru.name> <5706761C.3070907@stoneleaf.us> <57068023.4080006@stoneleaf.us> Message-ID: On Fri, Apr 8, 2016 at 3:31 AM, Brett Cannon wrote: > But couldn't you also just define a str subclass that checks its argument(s) > are only valid ASCII values? 
What you're proposing is to potentially change > all places that operate with a string to now check for a special method > which would be a costly change potentially to performance as well as > propagating this concept everywhere a string-like object is expected. Fair enough. That was a spur-of-the-moment thought. To be honest, this proposal is a massive generalization from, ultimately, a single use-case. > I think it's important to realize that the main reason we are considering > this special method concept is to make it easier to introduce in third-party > code which doesn't have pathlib or for people who don't want to import > pathlib just to convert a pathlib.PurePath object to a string. And this should make it easier for third-party code to be functional without even being aware of the Path object. There are two basic things that code will be doing with paths: passing them unchanged to standard library functions (eg open()), and combining them with strings. The first will work by definition; the second will if paths can implicitly upcast to strings. In contrast, a function or method to convert a path to a string requires conscious effort on the part of any function that needs to do such manipulation, which correspondingly means version compatibility checks. Libraries will need to release a new version that's path-compatible, even though they don't actually gain any functionality. ChrisA From mal at egenix.com Thu Apr 7 14:42:46 2016 From: mal at egenix.com (M.-A. Lemburg) Date: Thu, 7 Apr 2016 20:42:46 +0200 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: References: <57066A38.7050408@egenix.com> <57067584.1070900@mrabarnett.plus.com> <57067C3E.9030808@egenix.com> <57068A72.4090708@egenix.com> Message-ID: <5706AA26.1090504@egenix.com> On 07.04.2016 18:46, Chris Angelico wrote: > On Fri, Apr 8, 2016 at 2:27 AM, M.-A. Lemburg wrote: >> Not necessarily. 
In fact, a string.String ABC could have only >> a single method: .__str__() defined. > > That wouldn't be very useful; object.__str__ exists and is functional, > so EVERY object would count as a string. No, only those objects that register with the ABC would be considered string-like, not all objects implementing the .__str__() method. In regular Python, only str() objects would register with strings.String. Path objects could also register to be treated as "string-like" object. Just like numeric types are only considered part of the ABCs under numbers, if they register with these. Perhaps the name strings.String sounds confusing, so perhaps strings.StringLike or strings.FineToConvertToAStringIfNeeded would be better :-) > The point of "string-like" is that it can be treated as a string, not > just that it can be converted to one. This is exactly parallel to the > difference between __index__ and __int__; floats can be converted to > int (and will truncate), but cannot be *treated* as ints. Right, and that's what you can define via an ABC. Those abstract base classes are not to be confused with introspecting methods on objects - they help optimize this kind of test, but most importantly help express the combination of providing an interface by exposing methods with the semantics of how those methods are expected to be used. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Experts (#1, Apr 07 2016) >>> Python Projects, Coaching and Consulting ... http://www.egenix.com/ >>> Python Database Interfaces ... http://products.egenix.com/ >>> Plone/Zope Database Interfaces ... http://zope.egenix.com/ ________________________________________________________________________ ::: We implement business ideas - efficiently in both time and costs ::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. 
Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611
http://www.egenix.com/company/contact/
http://www.malemburg.com/

From random832 at fastmail.com Thu Apr 7 14:43:40 2016
From: random832 at fastmail.com (Random832)
Date: Thu, 07 Apr 2016 14:43:40 -0400
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <57069943.8050008@stoneleaf.us>
References: <20160407132845.GA15898@phdru.name> <5706761C.3070907@stoneleaf.us> <57068023.4080006@stoneleaf.us> <1460046683.1162090.571998529.6552D34B@webmail.messagingengine.com> <57069943.8050008@stoneleaf.us>
Message-ID: <1460054620.1192014.572127913.4C432246@webmail.messagingengine.com>

On Thu, Apr 7, 2016, at 13:30, Ethan Furman wrote:
> The protocol is "the how" of supporting pathlib, and, interestingly
> enough, the easiest way to do so (avoids circular imports, etc., etc.).

If the problem is importing pathlib, what about a "pathlib lite" that
can check if an object is a path without importing pathlib? This could
be a recipe, a separate module, or part of os.

def is_Path(x):
    return 'pathlib' in sys.modules and isinstance(x, sys.modules['pathlib'].Path)

def path_str(x):
    if isinstance(x, str):
        return x
    if is_Path(x):
        return x.path
    raise TypeError

# convenience methods for modules that want to support returning a Path
# but don't import it
def to_Path(x):
    import pathlib
    return pathlib.Path(x)

# most common case, return a path iff an input argument is a path.
def to_Path_maybe(value, *args):
    if any(is_Path(arg) for arg in args):
        return to_Path(value)
    else:
        return value

All this other stuff seems to have an ambition of making these things
fully general, such that some other library can be dropped in instead of
pathlib without subclassing either str or Path, and I'm not sure what
the use case for that is. Pathlib is the battery that's included.
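[Editor's note: a minimal self-contained sketch of the "pathlib lite" idea above. Two deliberate changes from the message's code, both assumptions of this sketch: it falls back to `str(x)` instead of the proposed `.path` attribute, and it checks `PurePath` so pure paths qualify too.]

```python
import sys

def is_path(x):
    # A path object only counts if pathlib has actually been imported
    # somewhere in the process; otherwise nothing can be a pathlib path.
    pathlib = sys.modules.get('pathlib')
    return pathlib is not None and isinstance(x, pathlib.PurePath)

def path_str(x):
    # Accept plain strings and pathlib paths; reject everything else
    # rather than silently str()-ing arbitrary objects.
    if isinstance(x, str):
        return x
    if is_path(x):
        return str(x)
    raise TypeError('expected str or path object, got %s' % type(x).__name__)

print(path_str('spam/eggs'))  # spam/eggs

import pathlib
print(path_str(pathlib.PurePosixPath('spam', 'eggs')))  # spam/eggs
```

The `sys.modules` probe is the trick that lets a module support path objects without paying for the `pathlib` import itself.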
From chris.barker at noaa.gov Thu Apr 7 14:45:36 2016
From: chris.barker at noaa.gov (Chris Barker)
Date: Thu, 7 Apr 2016 11:45:36 -0700
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To:
References: <20160407132845.GA15898@phdru.name> <5706761C.3070907@stoneleaf.us> <57068023.4080006@stoneleaf.us>
Message-ID:

On Thu, Apr 7, 2016 at 11:28 AM, Chris Angelico wrote:

> Fair enough. That was a spur-of-the-moment thought. To be honest, this
> proposal is a massive generalization from, ultimately, a single
> use-case.

yes, but now that you say:

...

> And this should make it easier for third-party code to be functional
> without even being aware of the Path object. There are two basic
> things that code will be doing with paths: passing them unchanged to
> standard library functions (eg open()), and combining them with
> strings. The first will work by definition; the second will if paths
> can implicitly upcast to strings.

so this is really a way to make the whole path!=string thing easier --
we don't want to simply call str() on anything that might be a path,
because that will work on anything, whether it's the least bit pathlike
at all, but this would introduce a new kind of __str__, so that only
things for which it makes sense would "Just work":

str + something

Would only concatenate to a new string if something was an object that
could "losslessly" be considered a string?

but if we use the __index__ example, that is specifically "an integer
that can be used as an index", not "something that can be losslessly
converted to an integer".

so why not the __fspath__ protocol anyway?

Unless there are all sorts of other use cases you have in mind?

-CHB

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From tjreedy at udel.edu Thu Apr 7 14:48:55 2016
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu, 7 Apr 2016 14:48:55 -0400
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <1460045791.1159102.571987969.5CC10DF6@webmail.messagingengine.com>
References: <1460038026.1126929.571843177.7BC2EE4C@webmail.messagingengine.com> <1460045791.1159102.571987969.5CC10DF6@webmail.messagingengine.com>
Message-ID:

On 4/7/2016 12:16 PM, Random832 wrote:
> On Thu, Apr 7, 2016, at 11:01, Paul Moore wrote:
>> Someone (lost in quotes)
>>> What's __index__ for?
>>
>> I don't follow. It's for indexing, which requires an integer.
>
> Sure, but why isn't int() good enough?

Good question. The reason to add __index__ was to allow indexing with
some things other than ints while not allowing just anything that can be
converted to int (with or without loss). The reason for the restriction
is to prevent subtle bugs. Expanding the domain of indexing with
__index__ instead of int() was a judgment call, not a logical necessity.

> For the same reason you only want
> the kinds of objects that implement __index__ (and not, say, a float or
> a string that happens to be numeric) for indexing, you only want the
> kinds of objects that implement this method for certain purposes.

I understand this, but I am going to challenge the analogy. An index
really is an int -- a count of items of the sequence from either end
(where the right end can be thought of as an invisible End_of_Sequence
item). A path, on the other hand, is not really a string. A path is a
sequence of nodes in the filesystem graph. Converting structured data,
in this case paths, to strings is just a least-common-denominator way of
communicating structured data between different languages.

To me, the default proposal to expand the domain of open and other path
functions is to call str on the path arg, either always or as needed. We
should then ask "why isn't str() good enough"?
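[Editor's note: the strict-versus-permissive split Terry describes is easy to see interactively — `int()` happily performs lossy or string conversions, while `operator.index()` (which invokes `__index__`) only passes through genuine integers. That is the same split being debated here for `str()` versus a dedicated path protocol:]

```python
import operator

print(int(3.4))           # 3 -- lossy conversion, silently accepted
print(int('4'))           # 4 -- string-to-number, also accepted
print(operator.index(3))  # 3 -- a real integer passes through unchanged

# Anything that is not really an integer is rejected outright:
for bad in (3.4, '4'):
    try:
        operator.index(bad)
    except TypeError:
        print('rejected:', bad)
```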
Most bad args for open will immediately result in a file-not-found exception. But the os.path functions would not. Do the possible bugs and violation of python philosophy out-weigh the simplicity of the proposal? -- Terry Jan Reedy From brett at python.org Thu Apr 7 14:51:31 2016 From: brett at python.org (Brett Cannon) Date: Thu, 07 Apr 2016 18:51:31 +0000 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: References: <20160407132845.GA15898@phdru.name> <5706761C.3070907@stoneleaf.us> <57068023.4080006@stoneleaf.us> Message-ID: On Thu, 7 Apr 2016 at 11:29 Chris Angelico wrote: > On Fri, Apr 8, 2016 at 3:31 AM, Brett Cannon wrote: > > But couldn't you also just define a str subclass that checks its > argument(s) > > are only valid ASCII values? What you're proposing is to potentially > change > > all places that operate with a string to now check for a special method > > which would be a costly change potentially to performance as well as > > propagating this concept everywhere a string-like object is expected. > > Fair enough. That was a spur-of-the-moment thought. To be honest, this > proposal is a massive generalization from, ultimately, a single > use-case. > > > I think it's important to realize that the main reason we are considering > > this special method concept is to make it easier to introduce in > third-party > > code which doesn't have pathlib or for people who don't want to import > > pathlib just to convert a pathlib.PurePath object to a string. > > And this should make it easier for third-party code to be functional > without even being aware of the Path object. There are two basic > things that code will be doing with paths: passing them unchanged to > standard library functions (eg open()), and combining them with > strings. The first will work by definition; the second will if paths > can implicitly upcast to strings. 
> > > In contrast, a function or method to convert a path to a string > requires conscious effort on the part of any function that needs to do > such manipulation, which correspondingly means version compatibility > checks. > Libraries will need to release a new version that's > path-compatible, even though they don't actually gain any > functionality. > But they will with yours as well. At best you could get this into Python 3.6 and then propagate this implicit string-like conversion functionality throughout the language and stdlib, but any implicitness won't be backported either as it's implicit. So while you're saying you don't like the explicitness required to backport this, your solution doesn't help with that either as you will still need to do the exact same method lookup for any code that wants to work pre-3.6. And if using path objects takes off and new APIs come up that don't take a string for paths, then the explicit conversion can be left out -- and perhaps removed in converted APIs -- while your implicit conversion will forever be baked into Python itself. -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Thu Apr 7 14:53:34 2016 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 8 Apr 2016 04:53:34 +1000 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: <5706AA26.1090504@egenix.com> References: <57066A38.7050408@egenix.com> <57067584.1070900@mrabarnett.plus.com> <57067C3E.9030808@egenix.com> <57068A72.4090708@egenix.com> <5706AA26.1090504@egenix.com> Message-ID: On Fri, Apr 8, 2016 at 4:42 AM, M.-A. Lemburg wrote: > On 07.04.2016 18:46, Chris Angelico wrote: >> On Fri, Apr 8, 2016 at 2:27 AM, M.-A. Lemburg wrote: >>> Not necessarily. In fact, a string.String ABC could have only >>> a single method: .__str__() defined. >> >> That wouldn't be very useful; object.__str__ exists and is functional, >> so EVERY object would count as a string. 
> > No, only those objects that register with the ABC would be considered > string-like, not all objects implementing the .__str__() method. > In regular Python, only str() objects would register with > strings.String. Oh, gotcha. In that case, it would still be pretty much the same as the __fspath__ proposal, except that instead of creating a dunder method (to implement a protocol), you register with an ABC. ChrisA From ethan at stoneleaf.us Thu Apr 7 14:59:41 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 07 Apr 2016 11:59:41 -0700 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: References: <20160407132845.GA15898@phdru.name> <5706761C.3070907@stoneleaf.us> <57068023.4080006@stoneleaf.us> Message-ID: <5706AE1D.7030605@stoneleaf.us> On 04/07/2016 11:45 AM, Chris Barker wrote: > On Thu, Apr 7, 2016 at 11:28 AM, Chris Angelico wrote: >> And this should make it easier for third-party code to be functional >> without even being aware of the Path object. There are two basic >> things that code will be doing with paths: passing them unchanged to >> standard library functions (eg open()), and combining them with >> strings. The first will work by definition; the second will if paths >> can implicitly upcast to strings. > > so this is really a way to make the whole path!=string thing easier -- > we don't want to simply call str() on anything that might be a path, > because that will work on anything, whether it's the least bit pathlike > at all, but this would introduce a new kind of __str__, so that only > things for which it makes sense would "Just work": > > str + something > > Would only concatenate to a new string if somethign was an object that > couuld "losslessly" be considered a string? Which is mildly attractive. > but if we use the __index__ example, that is specifically "an integer > that can be ued as an index", not "something that can be losslessly > converted to an integer". 
__index__ was originally created to support indexing, but has morphed over time to mean "something that can be losslessly converted to an integer". -- ~Ethan~ From brett at python.org Thu Apr 7 14:59:20 2016 From: brett at python.org (Brett Cannon) Date: Thu, 07 Apr 2016 18:59:20 +0000 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: <5706AA26.1090504@egenix.com> References: <57066A38.7050408@egenix.com> <57067584.1070900@mrabarnett.plus.com> <57067C3E.9030808@egenix.com> <57068A72.4090708@egenix.com> <5706AA26.1090504@egenix.com> Message-ID: On Thu, 7 Apr 2016 at 11:43 M.-A. Lemburg wrote: > On 07.04.2016 18:46, Chris Angelico wrote: > > On Fri, Apr 8, 2016 at 2:27 AM, M.-A. Lemburg wrote: > >> Not necessarily. In fact, a string.String ABC could have only > >> a single method: .__str__() defined. > > > > That wouldn't be very useful; object.__str__ exists and is functional, > > so EVERY object would count as a string. > > No, only those objects that register with the ABC would be considered > string-like, not all objects implementing the .__str__() method. > In regular Python, only str() objects would register with > strings.String. > > Path objects could also register to be treated as "string-like" > object. > > Just like numeric types are only considered part of the ABCs > under numbers, if they register with these. > > Perhaps the name strings.String sounds confusing, so > perhaps strings.StringLike or strings.FineToConvertToAStringIfNeeded > would be better :-) > > > The point of "string-like" is that it can be treated as a string, not > > just that it can be converted to one. This is exactly parallel to the > > difference between __index__ and __int__; floats can be converted to > > int (and will truncate), but cannot be *treated* as ints. > > Right, and that's what you can define via an ABC. 
Those abstract > base classes are not to be confused with introspecting methods > on objects - they help optimize this kind of test, but most > importantly help express the combination of providing an interface > by exposing methods with the semantics of how those > methods are expected to be used. > To make MAL's proposal concrete: class StringLike(abc.ABC): @abstractmethod def __str__(self): """Return the string representation of something.""" StringLike.register(pathlib.PurePath) # Any 3rd-party library can do the same. You could also call the class StringablePath or something and get the exact same concept across where you are using the registration abilities of ABCs to semantically delineate when a class's __str__() returns a usable file path. The drawback is that this isn't easily backported like `path.__ospath__() if hasattr(path, '__ospath__') else path` for libraries that don't necessarily have access to pathlib but want to be compatible with accepting path objects. -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Thu Apr 7 14:59:09 2016 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 7 Apr 2016 11:59:09 -0700 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: References: <1460038026.1126929.571843177.7BC2EE4C@webmail.messagingengine.com> <1460045791.1159102.571987969.5CC10DF6@webmail.messagingengine.com> Message-ID: On Thu, Apr 7, 2016 at 11:48 AM, Terry Reedy wrote: > To me, the default proposal to expand the domain of open and other path > functions is to call str on the path arg, either always or as needed. We > should then ask "why isn't str() good enough"? Most bad args for open will > immediately result in a file-not-found exception. But the os.path > functions would not. Do the possible bugs and violation of python > philosophy out-weigh the simplicity of the proposal? Thanks -- this is exactly the question at hand. 
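[Editor's note: one concrete reading of the ABC-registration idea discussed above — string-likeness becomes opt-in, so merely having a `__str__` method (which every object does) is not enough by itself. A sketch using a hypothetical stand-in class rather than `pathlib.PurePath`:]

```python
import abc

class StringLike(abc.ABC):
    @abc.abstractmethod
    def __str__(self):
        """Return the string representation of something."""

class FakePath:
    # Hypothetical stand-in for pathlib.PurePath.
    def __init__(self, s):
        self._s = s
    def __str__(self):
        return self._s

StringLike.register(FakePath)  # explicit opt-in, as a library would do

print(isinstance(FakePath('/tmp/x'), StringLike))  # True -- registered
print(isinstance(3.4, StringLike))  # False -- has __str__, never opted in
```

Because `StringLike` defines no `__subclasshook__`, only registered or inheriting classes pass the `isinstance` check, which answers the "EVERY object would count as a string" objection.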
it's gotten a bit caught up in all the discussion of what to call the
magic method, should there be a top-level function, etc., but this is
the only question on the table that isn't just bikeshedding.

Personally, I think not -- but I'm very close to the fence.

Though no, most calls to open() would not fail -- most calls to
open(path, 'r') would fail, but most calls to open(path, 'w') would
succeed and produce some really weird filenames -- but so what?

that's the question -- are these subtle and hard to find bugs we want to
prevent?

-CHB

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From chris.barker at noaa.gov Thu Apr 7 15:02:16 2016 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 7 Apr 2016 12:02:16 -0700 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: References: <57066A38.7050408@egenix.com> <57067584.1070900@mrabarnett.plus.com> <57067C3E.9030808@egenix.com> <57068A72.4090708@egenix.com> <5706AA26.1090504@egenix.com> Message-ID: On Thu, Apr 7, 2016 at 11:59 AM, Brett Cannon wrote: > class StringLike(abc.ABC): > > @abstractmethod > def __str__(self): > """Return the string representation of something.""" > > StringLike.register(pathlib.PurePath) # Any 3rd-party library can do > the same. > > You could also call the class StringablePath or something and get the > exact same concept across where you are using the registration abilities of > ABCs to semantically delineate when a class's __str__() returns a usable > file path. > > The drawback is that this isn't easily backported like `path.__ospath__() > if hasattr(path, '__ospath__') else path` for libraries that don't > necessarily have access to pathlib but want to be compatible with accepting > path objects. > and a plus is that it's compatible with type hinting -- is that the future of Python??? -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ethan at stoneleaf.us Thu Apr 7 15:06:40 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 07 Apr 2016 12:06:40 -0700 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: References: <1460038026.1126929.571843177.7BC2EE4C@webmail.messagingengine.com> <1460045791.1159102.571987969.5CC10DF6@webmail.messagingengine.com> Message-ID: <5706AFC0.30604@stoneleaf.us> On 04/07/2016 11:59 AM, Chris Barker wrote: > Though no, most calls to open() would not fail -- most calls to > open(path, 'r') would but most calls to open(path, 'w') would succeed > and produce some really weird filenames -- but so what? > > that's the question -- are these subtle and hard to find bugs we want to > prevent? If we are trying to fix issues, why would we leave the door open to other, more subtle, bugs? -- ~Ethan~ From rosuav at gmail.com Thu Apr 7 15:06:31 2016 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 8 Apr 2016 05:06:31 +1000 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: References: <20160407132845.GA15898@phdru.name> <5706761C.3070907@stoneleaf.us> <57068023.4080006@stoneleaf.us> Message-ID: On Fri, Apr 8, 2016 at 4:45 AM, Chris Barker wrote: > On Thu, Apr 7, 2016 at 11:28 AM, Chris Angelico wrote: >> And this should make it easier for third-party code to be functional >> without even being aware of the Path object. There are two basic >> things that code will be doing with paths: passing them unchanged to >> standard library functions (eg open()), and combining them with >> strings. The first will work by definition; the second will if paths >> can implicitly upcast to strings. 
> > > so this is really a way to make the whole path!=string thing easier -- we > don't want to simply call str() on anything that might be a path, because > that will work on anything, whether it's the least bit pathlike at all, but > this would introduce a new kind of __str__, so that only things for which it > makes sense would "Just work": > > str + something > > Would only concatenate to a new string if somethign was an object that > couuld "losslessly" be considered a string? > > but if we use the __index__ example, that is specifically "an integer that > can be ued as an index", not "something that can be losslessly converted to > an integer". That's two ways of describing the same thing. When you call __index__, you either get back an integer with the exact same meaning as the original object, or you get an exception. In contrast, calling __int__ may result in an integer which is similar, but not equal, to the original - for instance, int(3.2) is 3. That's a lossy conversion, which __index__ will refuse to do. Similarly, str() can be quite lossy and non-strict on arbitrary objects. With this conversion, it will always either return an exactly-equivalent string, or raise. > so why not the __fspath__ protocol anyway? > > Unless there are all sorts of other use cases you have in mind? Not actually in mind, but there have been multiple people raising concerns that __fspath__ is too specific for what it achieves. ChrisA From tjreedy at udel.edu Thu Apr 7 15:06:47 2016 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 7 Apr 2016 15:06:47 -0400 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: <570676C1.3020808@stoneleaf.us> References: <1460038026.1126929.571843177.7BC2EE4C@webmail.messagingengine.com> <570676C1.3020808@stoneleaf.us> Message-ID: On 4/7/2016 11:03 AM, Ethan Furman wrote: > On 04/07/2016 07:07 AM, Random832 wrote: > >> What's __index__ for? 
> > __index__ is a way to get an int from an int-like object without losing > information; so it fails with values like 3.4, but should succeed with > values like Fraction(4, 2). > > __int__ is a way to convert the value to an int, so 3.4 becomes 3 (and > and the 4/10's is lost). Why is that a problem? Loss is not the issue. And as someone else pointed out, int('4') does not lose information, but seq['1'] is prohibited. Why is passing integer strings as indexes bad? The answer to the latter, I believe, is that Guido is against treating numbers and string representations of numbers as interchangable. And the answer to both, I believe, is that the downside of flexibility is ease of creating buggy code, especially code where bugs do not immediately raise something, or ease of creating confusing code that is hard to maintain. -- Terry Jan Reedy From ethan at stoneleaf.us Thu Apr 7 15:10:22 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 07 Apr 2016 12:10:22 -0700 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: References: <57066A38.7050408@egenix.com> <57067584.1070900@mrabarnett.plus.com> <57067C3E.9030808@egenix.com> <57068A72.4090708@egenix.com> <5706AA26.1090504@egenix.com> Message-ID: <5706B09E.8090603@stoneleaf.us> On 04/07/2016 11:59 AM, Brett Cannon wrote: > To make MAL's proposal concrete: > > class StringLike(abc.ABC): > > @abstractmethod > def __str__(self): > """Return the string representation of something.""" > > StringLike.register(pathlib.PurePath) # Any 3rd-party library can do > the same. > > You could also call the class StringablePath or something and get the > exact same concept across where you are using the registration abilities > of ABCs to semantically delineate when a class's __str__() returns a > usable file path. I think I might like this better than a new magic method. 
> The drawback is that this isn't easily backported like > `path.__ospath__() if hasattr(path, '__ospath__') else path` for > libraries that don't necessarily have access to pathlib but want to be > compatible with accepting path objects. I don't understand. -- ~Ethan~ From brett at python.org Thu Apr 7 15:10:00 2016 From: brett at python.org (Brett Cannon) Date: Thu, 07 Apr 2016 19:10:00 +0000 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: References: <57066A38.7050408@egenix.com> <57067584.1070900@mrabarnett.plus.com> <57067C3E.9030808@egenix.com> <57068A72.4090708@egenix.com> <5706AA26.1090504@egenix.com> Message-ID: On Thu, 7 Apr 2016 at 12:03 Chris Barker wrote: > > On Thu, Apr 7, 2016 at 11:59 AM, Brett Cannon wrote: > >> class StringLike(abc.ABC): >> >> @abstractmethod >> def __str__(self): >> """Return the string representation of something.""" >> >> StringLike.register(pathlib.PurePath) # Any 3rd-party library can do >> the same. >> >> You could also call the class StringablePath or something and get the >> exact same concept across where you are using the registration abilities of >> ABCs to semantically delineate when a class's __str__() returns a usable >> file path. >> >> The drawback is that this isn't easily backported like `path.__ospath__() >> if hasattr(path, '__ospath__') else path` for libraries that don't >> necessarily have access to pathlib but want to be compatible with accepting >> path objects. >> > > and a plus is that it's compatible with type hinting -- is that the future > of Python??? > So the other way to do this is to combine the proposals: class BasePath(abc.ABC): @abstractmethod def __ospath__(self): """Return the file system path, serialized as a string.""" Then pathlib.PurePath can inherit from this ABC and anyone else can as well or be registered as doing so. 
Then in typing.py you can have: class Path(extra=pathlib.BasePath): __slots__ = () PathLike = Union[str, Path] Then any third-party library can register with the ABC and get the typing correctly (assuming I didn't botch the type specification). Guido also had some protocol proposal a while back that I think he floated here, but I don't think the discussion really went anywhere as it was an early idea. -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Thu Apr 7 15:10:33 2016 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 8 Apr 2016 05:10:33 +1000 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: References: <1460038026.1126929.571843177.7BC2EE4C@webmail.messagingengine.com> <570676C1.3020808@stoneleaf.us> Message-ID: On Fri, Apr 8, 2016 at 5:06 AM, Terry Reedy wrote: > On 4/7/2016 11:03 AM, Ethan Furman wrote: >> >> On 04/07/2016 07:07 AM, Random832 wrote: >> >>> What's __index__ for? >> >> >> __index__ is a way to get an int from an int-like object without losing >> information; so it fails with values like 3.4, but should succeed with >> values like Fraction(4, 2). >> >> __int__ is a way to convert the value to an int, so 3.4 becomes 3 (and >> and the 4/10's is lost). > > > Why is that a problem? Loss is not the issue. And as someone else pointed > out, int('4') does not lose information, but seq['1'] is prohibited. Why is > passing integer strings as indexes bad? > > The answer to the latter, I believe, is that Guido is against treating > numbers and string representations of numbers as interchangable. And the > answer to both, I believe, is that the downside of flexibility is ease of > creating buggy code, especially code where bugs do not immediately raise > something, or ease of creating confusing code that is hard to maintain. Hence my wording of "string-like". 
Anything can be converted to a string, but only certain objects are sufficiently string-like to be implicitly treated as strings. But ultimately it's all the same concept. ChrisA From chris.barker at noaa.gov Thu Apr 7 15:10:24 2016 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 7 Apr 2016 12:10:24 -0700 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: <5706AFC0.30604@stoneleaf.us> References: <1460038026.1126929.571843177.7BC2EE4C@webmail.messagingengine.com> <1460045791.1159102.571987969.5CC10DF6@webmail.messagingengine.com> <5706AFC0.30604@stoneleaf.us> Message-ID: On Thu, Apr 7, 2016 at 12:06 PM, Ethan Furman wrote: > that's the question -- are these subtle and hard to find bugs we want to >> prevent? >> > > If we are trying to fix issues, why would we leave the door open to other, > more subtle, bugs? > we're not trying to fix issues -- we're trying to make Path compatible with the stdlib and other libs that use strings as path. Or do you mean that not having Path subclass str is trying to fix issues, in which case, yes, I suppose, but you need to stop somewhere, or we'll have a statically typed language... -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Thu Apr 7 15:12:18 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 07 Apr 2016 12:12:18 -0700 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: References: <1460038026.1126929.571843177.7BC2EE4C@webmail.messagingengine.com> <570676C1.3020808@stoneleaf.us> Message-ID: <5706B112.7040701@stoneleaf.us> On 04/07/2016 12:06 PM, Terry Reedy wrote: > On 4/7/2016 11:03 AM, Ethan Furman wrote: >> On 04/07/2016 07:07 AM, Random832 wrote: >>> What's __index__ for? 
>> >> __index__ is a way to get an int from an int-like object without losing >> information; so it fails with values like 3.4, but should succeed with >> values like Fraction(4, 2). >> >> __int__ is a way to convert the value to an int, so 3.4 becomes 3 (and >> the 4/10's is lost). > > Why is that a problem? It isn't. I was explaining the difference between __int__ and __index__. -- ~Ethan~ From joshua.morton13 at gmail.com Thu Apr 7 15:11:49 2016 From: joshua.morton13 at gmail.com (Joshua Morton) Date: Thu, 07 Apr 2016 19:11:49 +0000 Subject: [Python-ideas] Boundaries for unpacking In-Reply-To: <5706A5D7.9020407@gmail.com> References: <570692E7.9080904@gmail.com> <5706A5D7.9020407@gmail.com> Message-ID: If I had to make a syntax suggestion, I think it would be a, b, c = iterable or defaults although making "or" both pseudo None-coalescing and pseudo error-coalescing might be a bit too much sugar. However, I think it clearly expresses the idea. That said, I have to ask what the use case is for dealing with fixed-length unpacking where you might not have the fixed length of items. That feels smelly to me. Why are you trying to name values from a variable-length list? -Josh On Thu, Apr 7, 2016 at 2:24 PM Michel Desmoulin wrote: > > > Le 07/04/2016 20:15, Michael Selik a écrit : > > > >> On Apr 7, 2016, at 6:44 PM, Todd wrote: > >> > >> On Thu, Apr 7, 2016 at 1:03 PM, Michel Desmoulin < > desmoulinmichel at gmail.com> wrote: > >> Python is a lot about iteration, and I often have to get values from an > >> iterable. For this, unpacking is fantastic: > >> > >> a, b, c = iterable > >> > >> One problem that arises is that you don't know when iterable will > >> contain 3 items. > >> > >> In that case, this beautiful code becomes: > >> > >> iterator = iter(iterable) > >> a = next(iterator, "default value") > >> b = next(iterator, "default value") > >> c = next(iterator, "default value") > > > > I actually don't think that's so ugly. Looks fairly clear to me.
What > about these alternatives? > > Well, there is nothing wrong with: > > a = mylist[0] > > b = mylist[1] > > c = mylist[2] > > But we still prefer unpacking. And this is way more verbose. > > > > > > >>>> from itertools import repeat, islice, chain > >>>> it = range(2) > >>>> a, b, c = islice(chain(it, repeat('default')), 3) > >>>> a, b, c > > (0, 1, 'default') > > > > > > If you want to stick with builtins: > > > >>>> it = iter(range(2)) > >>>> a, b, c = [next(it, 'default') for i in range(3)] > >>>> a, b, c > > (0, 1, 'default') > > They all work, but they are impossible to remember, plus you will need a > comment every time you use them outside of the shell. > > > > > > > If your iterable is sliceable and sizeable: > > > >>>> it = list(range(2)) > >>>> a, b, c = it[:3] + ['default'] * (3 - len(it)) > >>>> a, b, c > > (0, 1, 'default') > > Same problem, and as you said, you can forget about generators. > > > > > > > > > Perhaps add a recipe to itertools, or change the "take" recipe? > > > > def unpack(n, iterable, default=None): > > "Slice the first n items of the iterable, padding with a default" > > padding = repeat(default) > > return islice(chain(iterable, padding), n) > > Not a bad idea. Built-in would be better, but I can live with itertools. > I import it so often I'm wondering if itertools shouldn't be in > __builtins__ :) > > > _______________________________________________ > > Python-ideas mailing list > > Python-ideas at python.org > > https://mail.python.org/mailman/listinfo/python-ideas > > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed...
URL: From ethan at stoneleaf.us Thu Apr 7 15:15:54 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 07 Apr 2016 12:15:54 -0700 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: References: <1460038026.1126929.571843177.7BC2EE4C@webmail.messagingengine.com> <1460045791.1159102.571987969.5CC10DF6@webmail.messagingengine.com> <5706AFC0.30604@stoneleaf.us> Message-ID: <5706B1EA.6040703@stoneleaf.us> On 04/07/2016 12:10 PM, Chris Barker wrote: > On Thu, Apr 7, 2016 at 12:06 PM, Ethan Furman wrote: >> If we are trying to fix issues, why would we leave the door open to >> other, more subtle, bugs? > > we're not trying to fix issues -- we're trying to make Path compatible > with the stdlib and other libs that use strings as path. Yeah, that's the issue. ;) -- ~Ethan~ From brett at python.org Thu Apr 7 15:17:26 2016 From: brett at python.org (Brett Cannon) Date: Thu, 07 Apr 2016 19:17:26 +0000 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: <5706B09E.8090603@stoneleaf.us> References: <57066A38.7050408@egenix.com> <57067584.1070900@mrabarnett.plus.com> <57067C3E.9030808@egenix.com> <57068A72.4090708@egenix.com> <5706AA26.1090504@egenix.com> <5706B09E.8090603@stoneleaf.us> Message-ID: On Thu, 7 Apr 2016 at 12:11 Ethan Furman wrote: > On 04/07/2016 11:59 AM, Brett Cannon wrote: > > > To make MAL's proposal concrete: > > > > class StringLike(abc.ABC): > > > > @abstractmethod > > def __str__(self): > > """Return the string representation of something.""" > > > > StringLike.register(pathlib.PurePath) # Any 3rd-party library can do > > the same. > > > > You could also call the class StringablePath or something and get the > > exact same concept across where you are using the registration abilities > > of ABCs to semantically delineate when a class's __str__() returns a > > usable file path. > > I think I might like this better than a new magic method. 
> > > The drawback is that this isn't easily backported like > > `path.__ospath__() if hasattr(path, '__ospath__') else path` for > > libraries that don't necessarily have access to pathlib but want to be > > compatible with accepting path objects. > > I don't understand. > How do you make Python 3.3 code work with this when the ABC will simply not be available to you unless you're running Python 3.4.3, 3.5.2, or 3.6.0 (under the assumption that the ABC is put in pathlib and backported thanks to its provisional status)? The ternary operator one-liner is backwards-compatible while the ABC is only forward-compatible. -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjreedy at udel.edu Thu Apr 7 15:18:59 2016 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 7 Apr 2016 15:18:59 -0400 Subject: [Python-ideas] Boundaries for unpacking In-Reply-To: <570692E7.9080904@gmail.com> References: <570692E7.9080904@gmail.com> Message-ID: On 4/7/2016 1:03 PM, Michel Desmoulin wrote: > Python is a lot about iteration, and I often have to get values from an > iterable. For this, unpacking is fantastic: > > a, b, c = iterable > > One problem that arises is that you don't know when iterable will > contain 3 items. You need to augment if needed to get at least 3 items. You need to chop if needed to get at most 3 items. The following does exactly this.

import itertools as it
a, b, c = it.islice(it.chain(iterable, it.repeat(default, 3)), 3)

You can wrap this if you want with def exactly(iterable, n, default). This might already be a recipe in the itertools doc recipe section.
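A runnable sketch of the padded-unpacking recipe being discussed (the function name `unpack` follows Michael Selik's suggestion earlier in the thread; it is illustrative, not an actual itertools recipe):

```python
from itertools import chain, islice, repeat

def unpack(iterable, n, default=None):
    """Yield exactly n items: the first n of iterable, padded with default."""
    return islice(chain(iterable, repeat(default)), n)

a, b, c = unpack(range(2), 3, default='default')
print(a, b, c)  # 0 1 default

x, y, z = unpack(range(10), 3)  # extra items are simply not yielded
print(x, y, z)  # 0 1 2
```

Because islice stops after n items, this works for infinite generators as well as short lists.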
-- Terry Jan Reedy From random832 at fastmail.com Thu Apr 7 15:35:57 2016 From: random832 at fastmail.com (Random832) Date: Thu, 07 Apr 2016 15:35:57 -0400 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: References: <57066A38.7050408@egenix.com> <57067584.1070900@mrabarnett.plus.com> <57067C3E.9030808@egenix.com> <57068A72.4090708@egenix.com> <5706AA26.1090504@egenix.com> <5706B09E.8090603@stoneleaf.us> Message-ID: <1460057757.1204370.572180449.21C09305@webmail.messagingengine.com> On Thu, Apr 7, 2016, at 15:17, Brett Cannon wrote: > How do you make Python 3.3 code work with this when the ABC will simply > not > be available to you unless you're running Python 3.4.3, 3.5.2, or 3.6.0 > (under the assumption that the ABC is put in pathlib and backported > thanks > to its provisional status)? The ternary operator one-liner is > backwards-compatible while the ABC is only forward-compatible. If it's not available to you, then it's not available to anyone to register either, so you obviously act as if no objects are StringLike if you get an ImportError when trying to use it. Isn't this just the standard dance for using *any* function that's new to a new version of Python?

try:
    from pathlib import StringLike  # if it's in pathlib why is it called StringLike?
    def is_StringLike(x): return isinstance(x, StringLike)
except ImportError:
    def is_StringLike(x): return False

From donald at stufft.io Thu Apr 7 15:37:03 2016 From: donald at stufft.io (Donald Stufft) Date: Thu, 7 Apr 2016 15:37:03 -0400 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: References: <57066A38.7050408@egenix.com> <57067584.1070900@mrabarnett.plus.com> <57067C3E.9030808@egenix.com> <57068A72.4090708@egenix.com> <5706AA26.1090504@egenix.com> <5706B09E.8090603@stoneleaf.us> Message-ID: > On Apr 7, 2016, at 3:17 PM, Brett Cannon wrote: > > The ternary operator one-liner is backwards-compatible while the ABC is only forward-compatible.
I like the idea of doing both. Make a __fspath__ method and pathlib.fspath that uses it, and then have an ABC that checks for the existence of __fspath__. ----------------- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 842 bytes Desc: Message signed with OpenPGP using GPGMail URL: From ethan at stoneleaf.us Thu Apr 7 15:43:27 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 07 Apr 2016 12:43:27 -0700 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: References: <57066A38.7050408@egenix.com> <57067584.1070900@mrabarnett.plus.com> <57067C3E.9030808@egenix.com> <57068A72.4090708@egenix.com> <5706AA26.1090504@egenix.com> <5706B09E.8090603@stoneleaf.us> Message-ID: <5706B85F.6080606@stoneleaf.us> On 04/07/2016 12:17 PM, Brett Cannon wrote: > How do you make Python 3.3 code work with this when the ABC will simply > not be available to you unless you're running Python 3.4.3, 3.5.2, or > 3.6.0 (under the assumption that the ABC is put in pathlib and > backported thanks to its provisional status)? The ternary operator > one-liner is backwards-compatible while the ABC is only forward-compatible. __os_path__ (or whatever it's called) also won't be available on those earlier versions -- so I'm not seeing that the ABC route as any worse. Am I missing something? -- ~Ethan~ From njs at pobox.com Thu Apr 7 15:47:14 2016 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 7 Apr 2016 12:47:14 -0700 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: References: <20160407132845.GA15898@phdru.name> <5706761C.3070907@stoneleaf.us> Message-ID: On Apr 7, 2016 8:57 AM, "Paul Moore" wrote: [...] 
> But the proposal for paths is to have a *specific* method that says > "give me a string representing a filesystem path from this object". An > "interpret this object as a string" wouldn't be appropriate for the > cases where I'd want to do "give me a string representing a filesystem > path". And that's where I get stuck, as I can't think of an example > where I *would* want the more general option. For a number of reasons: > > 1. I can't think of a real-world example of when I'd *use* such a facility > 2. I can't think of a real-world example of a type that might > *provide* such a facility > 3. I can't see how something so general would be of benefit. Numpy and friends would implement this if it existed, for things like converting numpy strings to python strings. But I can't think of the cases where this would be useful either. The reason that it's useful to have __index__, and that it would be useful to have __fspath__, is that in both cases there are a bunch of interfaces defined as part of the core language that need to implement the consumer side of the protocol, so a core language protocol is the only real way to go. I'm not thinking of a lot of analogous APIs in core python take specifically-strings? Most of the important ones are str methods, so regular self-dispatch already handles it. Or I guess str.__add__(someobj), but that's already handled by binop dispatch. -n -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From random832 at fastmail.com Thu Apr 7 16:04:50 2016 From: random832 at fastmail.com (Random832) Date: Thu, 07 Apr 2016 16:04:50 -0400 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: References: <20160407132845.GA15898@phdru.name> <5706761C.3070907@stoneleaf.us> Message-ID: <1460059490.1211724.572202217.7F06A07F@webmail.messagingengine.com> On Thu, Apr 7, 2016, at 15:47, Nathaniel Smith wrote: > The reason that it's useful to have __index__, and that it would be > useful > to have __fspath__, is that in both cases there are a bunch of interfaces > defined as part of the core language that need to implement the consumer > side of the protocol, so a core language protocol is the only real way to > go. I'm not thinking of a lot of analogous APIs in core python take > specifically-strings? Well, there's getattr and the like. I'm not sure why you'd want to pass a not-really-a-string (though, an ASCII bytes is the one thing I _can_ think of*, and didn't the inability to pass unicode strings to a bunch of APIs cause no end of trouble for Python 2 users?), but then I'm not sure why you'd want to pass anything but an int (or a python 2 int/long) to list indexing. *Maybe a RUE-string that subclasses bytes and uses UTF-8. From tjreedy at udel.edu Thu Apr 7 16:05:19 2016 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 7 Apr 2016 16:05:19 -0400 Subject: [Python-ideas] Changing the meaning of bool.__invert__ In-Reply-To: References: <20160407094618.5c01847d@fsol> <570679AF.7020502@stoneleaf.us> <57068A10.3020904@stoneleaf.us> <57069645.8040509@stoneleaf.us> Message-ID: On 4/7/2016 2:08 PM, Guido van Rossum wrote: > On Thu, Apr 7, 2016 at 10:39 AM, Michael Selik wrote: >> To clarify, the proposal is: ``~True == False`` but every other operation on ``True`` remains the same, including ``True * 42 == 42``. Correct? > > Yes. 
To be more precise, there are some "arithmetic" operations (+, -, > *, /, **) and they all treat bools as ints and always return ints; > there are also some "bitwise" operations (&, |, ^, ~) and they should > all treat bools as bools and return a bool. Currently the only > exception to this idea is that ~ returns an int, so the proposal is to > fix that. (There are also some "boolean" operations (and, or, not) and > they are also unchanged.) When the proposal is expressed as "Make bools consistently follow this simple rule -- Logical and 'bitwise' operations on bools return the expected bool, while arithmetic operations treat bools as 0 or 1 and return ints.", it makes sense to me. Given that ~bool hardly make any sense currently, I would not expect it to be in much use now. Hence not much to break. -- Terry Jan Reedy From random832 at fastmail.com Thu Apr 7 16:08:37 2016 From: random832 at fastmail.com (Random832) Date: Thu, 07 Apr 2016 16:08:37 -0400 Subject: [Python-ideas] Changing the meaning of bool.__invert__ In-Reply-To: References: <20160407094618.5c01847d@fsol> <570679AF.7020502@stoneleaf.us> <57068A10.3020904@stoneleaf.us> <57069645.8040509@stoneleaf.us> Message-ID: <1460059717.1212592.572215041.7353712D@webmail.messagingengine.com> On Thu, Apr 7, 2016, at 16:05, Terry Reedy wrote: > Given that ~bool hardly make any sense currently, I would not expect it > to be in much use now. Hence not much to break. I suspect the fear is of one being passed into a place that expects an int, and staying alive as a bool (i.e. not being converted to an int by an arithmetic operation) long enough to confuse code that is trying to do ~int. 
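To make the behavior under discussion concrete, here is what ~ currently does on bools, plus an illustrative int subclass showing the proposed semantics (the subclass is a sketch for demonstration only, not the actual patch being proposed for bool itself):

```python
# Current behavior: bool is an int subclass, so ~ does two's-complement
# inversion and escapes the boolean domain.
assert ~True == -2   # i.e. ~1
assert ~False == -1  # i.e. ~0

# The bitwise binary operators already stay within bool:
assert (True & False) is False
assert (True | False) is True

# Sketch of the proposal via a subclass that overrides only __invert__:
class Bool(int):
    def __invert__(self):
        return Bool(not self)
    def __repr__(self):
        return 'True' if self else 'False'
    __str__ = __repr__

print(~Bool(True))      # False
print(~Bool(False))     # True
print(Bool(True) * 42)  # 42 -- arithmetic still treats it as the int 1
```

This matches Guido's framing above: bitwise operations stay in the boolean domain, while arithmetic operations continue to treat the value as 0 or 1.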
From p.f.moore at gmail.com Thu Apr 7 16:21:28 2016 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 7 Apr 2016 21:21:28 +0100 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: <5706B85F.6080606@stoneleaf.us> References: <57066A38.7050408@egenix.com> <57067584.1070900@mrabarnett.plus.com> <57067C3E.9030808@egenix.com> <57068A72.4090708@egenix.com> <5706AA26.1090504@egenix.com> <5706B09E.8090603@stoneleaf.us> <5706B85F.6080606@stoneleaf.us> Message-ID: On 7 April 2016 at 20:43, Ethan Furman wrote: > On 04/07/2016 12:17 PM, Brett Cannon wrote: > >> How do you make Python 3.3 code work with this when the ABC will simply >> not be available to you unless you're running Python 3.4.3, 3.5.2, or >> 3.6.0 (under the assumption that the ABC is put in pathlib and >> backported thanks to its provisional status)? The ternary operator >> one-liner is backwards-compatible while the ABC is only >> forward-compatible. > > > __os_path__ (or whatever it's called) also won't be available on those > earlier versions -- so I'm not seeing that the ABC route as any worse. > > Am I missing something? You can check for __os_path__ even if it doesn't exist (hasattr takes the name as a string), but you can't check for the ABC if it doesn't exist (isinstance takes the actual ABC as the argument). 
Paul From p.f.moore at gmail.com Thu Apr 7 16:26:38 2016 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 7 Apr 2016 21:26:38 +0100 Subject: [Python-ideas] Changing the meaning of bool.__invert__ In-Reply-To: <1460059717.1212592.572215041.7353712D@webmail.messagingengine.com> References: <20160407094618.5c01847d@fsol> <570679AF.7020502@stoneleaf.us> <57068A10.3020904@stoneleaf.us> <57069645.8040509@stoneleaf.us> <1460059717.1212592.572215041.7353712D@webmail.messagingengine.com> Message-ID: On 7 April 2016 at 21:08, Random832 wrote: > On Thu, Apr 7, 2016, at 16:05, Terry Reedy wrote: >> Given that ~bool hardly make any sense currently, I would not expect it >> to be in much use now. Hence not much to break. > > I suspect the fear is of one being passed into a place that expects an > int, and staying alive as a bool (i.e. not being converted to an int by > an arithmetic operation) long enough to confuse code that is trying to > do ~int. That is indeed the only place likely to hit problems. But I'd be surprised if it was sufficiently common to be a major problem. I don't think the backward compatibility constraints on a minor release would preclude a change like this. Personally, I'm +0 on the proposal. It seems like a more useful behaviour, but it's one I'm never likely to need personally. 
Paul From ethan at stoneleaf.us Thu Apr 7 16:28:57 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 07 Apr 2016 13:28:57 -0700 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: References: <57066A38.7050408@egenix.com> <57067584.1070900@mrabarnett.plus.com> <57067C3E.9030808@egenix.com> <57068A72.4090708@egenix.com> <5706AA26.1090504@egenix.com> <5706B09E.8090603@stoneleaf.us> <5706B85F.6080606@stoneleaf.us> Message-ID: <5706C309.8030108@stoneleaf.us> On 04/07/2016 01:21 PM, Paul Moore wrote: > On 7 April 2016 at 20:43, Ethan Furman wrote: >> On 04/07/2016 12:17 PM, Brett Cannon wrote: >> >>> How do you make Python 3.3 code work with this when the ABC will simply >>> not be available to you unless you're running Python 3.4.3, 3.5.2, or >>> 3.6.0 (under the assumption that the ABC is put in pathlib and >>> backported thanks to its provisional status)? The ternary operator >>> one-liner is backwards-compatible while the ABC is only >>> forward-compatible. >> >> __os_path__ (or whatever it's called) also won't be available on those >> earlier versions -- so I'm not seeing that the ABC route as any worse. >> >> Am I missing something? > > You can check for __os_path__ even if it doesn't exist (hasattr takes > the name as a string), but you can't check for the ABC if it doesn't > exist (isinstance takes the actual ABC as the argument). True, but: - Python 3.3 isn't going to check for __os_path__ - Python 3.3 isn't going to check for an ABC In other words, Python 3.3* simply isn't going to work with pathlib, so I continue to be confused with why Brett brought it up. -- ~Ethan~ * In case there's any confusion: by "not work" I mean the stdlib is not going to correctly interpret a pathlib.Path in 3.3 and earlier. 
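Spelling out the protocol approach under discussion, using the thread's working name `__ospath__` (the final dunder name was still undecided at this point; the `FakePath` class below is purely illustrative):

```python
def ospath(path):
    """Return a plain string path, accepting str or any object with __ospath__()."""
    return path.__ospath__() if hasattr(path, '__ospath__') else path

# A hypothetical third-party path type opting in to the protocol:
class FakePath:
    def __init__(self, *parts):
        self._parts = parts
    def __ospath__(self):
        return '/'.join(self._parts)

print(ospath('/tmp/spam'))                  # plain strings pass through unchanged
print(ospath(FakePath('', 'tmp', 'spam')))  # '/tmp/spam' via the protocol
```

The hasattr check is what makes this backportable: it works on any Python version, with or without pathlib, which is the point Brett makes below.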
From p.f.moore at gmail.com Thu Apr 7 16:46:11 2016 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 7 Apr 2016 21:46:11 +0100 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: <5706C309.8030108@stoneleaf.us> References: <57066A38.7050408@egenix.com> <57067584.1070900@mrabarnett.plus.com> <57067C3E.9030808@egenix.com> <57068A72.4090708@egenix.com> <5706AA26.1090504@egenix.com> <5706B09E.8090603@stoneleaf.us> <5706B85F.6080606@stoneleaf.us> <5706C309.8030108@stoneleaf.us> Message-ID: On 7 April 2016 at 21:28, Ethan Furman wrote: > In case there's any confusion: by "not work" I mean the stdlib is not going > to correctly interpret a pathlib.Path in 3.3 and earlier. I think the confusion is over *who* will be checking the protocol or the ABC. You're correct that the stdlib will not do so in earlier versions. I've been assuming that is obvious (and I suspect Brett has too). Talk about using the check for older versions is basically around the possibility that 3rd party libraries might do so. I think it's unlikely that they'll bother (but I'm commenting on the suggestion because I'd like it to be easy to do if anyone does want to bother, against my expectations). I have the feeling Brett might think that it's somewhat more likely. If all we're thinking about is a way for the stdlib to work with strings and pathlib objects seamlessly, while also allowing 3rd party path classes to register to be treated the same way, then yes, there's no real difference between an ABC and a protocol (well, to register with the ABC, 3rd party code would need to conditionally import the ABC and make sure not to fail if the ABC can't be found, but that's just some boilerplate). Frankly, if it wasn't for the fact that you have stated that you'll add support for the protocol to your path library, I'd be surprised if *any* 3rd party code changed as a result of this discussion. 
There's been no comment from the authors of path.py or pylib (the only other 2 path objects I know of). And the only comments I've heard from authors of libraries that consume paths is "I don't see any reason why I'd bother". So as long as you're happy with the final form of the proposal, I see little reason to worry about how other 3rd party code might use it. Paul From brett at python.org Thu Apr 7 17:26:52 2016 From: brett at python.org (Brett Cannon) Date: Thu, 07 Apr 2016 21:26:52 +0000 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: References: <57066A38.7050408@egenix.com> <57067584.1070900@mrabarnett.plus.com> <57067C3E.9030808@egenix.com> <57068A72.4090708@egenix.com> <5706AA26.1090504@egenix.com> <5706B09E.8090603@stoneleaf.us> <5706B85F.6080606@stoneleaf.us> <5706C309.8030108@stoneleaf.us> Message-ID: On Thu, 7 Apr 2016 at 13:46 Paul Moore wrote: > On 7 April 2016 at 21:28, Ethan Furman wrote: > > In case there's any confusion: by "not work" I mean the stdlib is not > going > > to correctly interpret a pathlib.Path in 3.3 and earlier. > > I think the confusion is over *who* will be checking the protocol or > the ABC. You're correct that the stdlib will not do so in earlier > versions. I've been assuming that is obvious (and I suspect Brett has > too). Yep, you're right. I was never worried about making the stdlib work since that's a fully controlled environment that we can update at once. What I'm worried about is any third-party library that has an API that takes a path as an argument that may need to be updated to support pathlib -- or any other path library -- and doesn't want to directly rely on Python 3.6 for support. > Talk about using the check for older versions is basically > around the possibility that 3rd party libraries might do so. I think > it's unlikely that they'll bother (but I'm commenting on the > suggestion because I'd like it to be easy to do if anyone does want to > bother, against my expectations). 
I have the feeling Brett might think > that it's somewhat more likely. > Yes, that's my hope as third-party libraries are the ones that have older Python version compatibility to care about (the stdlib is obviously always latest-and-greatest). > > If all we're thinking about is a way for the stdlib to work with > strings and pathlib objects seamlessly, while also allowing 3rd party > path classes to register to be treated the same way, then yes, there's > no real difference between an ABC and a protocol (well, to register > with the ABC, 3rd party code would need to conditionally import the > ABC and make sure not to fail if the ABC can't be found, but that's > just some boilerplate). > My point is the boilerplate is minimized for third-party libraries in the instance of the magic method vs the ABC, but otherwise they accomplish the same thing. > > Frankly, if it wasn't for the fact that you have stated that you'll > add support for the protocol to your path library, I'd be surprised if > *any* 3rd party code changed as a result of this discussion. There's > been no comment from the authors of path.py or pylib (the only other 2 > path objects I know of). And the only comments I've heard from authors > of libraries that consume paths is "I don't see any reason why I'd > bother". So as long as you're happy with the final form of the > proposal, I see little reason to worry about how other 3rd party code > might use it. > I'm trying to be a bit more optimistic on the uptake. :) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From njs at pobox.com Thu Apr 7 17:50:30 2016 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 7 Apr 2016 14:50:30 -0700 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: References: <57066A38.7050408@egenix.com> <57067584.1070900@mrabarnett.plus.com> <57067C3E.9030808@egenix.com> <57068A72.4090708@egenix.com> <5706AA26.1090504@egenix.com> <5706B09E.8090603@stoneleaf.us> <5706B85F.6080606@stoneleaf.us> <5706C309.8030108@stoneleaf.us> Message-ID: On Apr 7, 2016 2:32 PM, "Brett Cannon" wrote: > [...] >> Frankly, if it wasn't for the fact that you have stated that you'll >> add support for the protocol to your path library, I'd be surprised if >> *any* 3rd party code changed as a result of this discussion. There's >> been no comment from the authors of path.py or pylib (the only other 2 >> path objects I know of). And the only comments I've heard from authors >> of libraries that consume paths is "I don't see any reason why I'd >> bother". So as long as you're happy with the final form of the >> proposal, I see little reason to worry about how other 3rd party code >> might use it. > > > I'm trying to be a bit more optimistic on the uptake. :) If data points are useful, numpy just merged a PR for supporting pathlib paths: https://github.com/numpy/numpy/pull/6660 It's a bit awkward given the current api, but someone did care enough to take the trouble. -n -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From brett at python.org Thu Apr 7 17:50:53 2016 From: brett at python.org (Brett Cannon) Date: Thu, 07 Apr 2016 21:50:53 +0000 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: <1460057757.1204370.572180449.21C09305@webmail.messagingengine.com> References: <57066A38.7050408@egenix.com> <57067584.1070900@mrabarnett.plus.com> <57067C3E.9030808@egenix.com> <57068A72.4090708@egenix.com> <5706AA26.1090504@egenix.com> <5706B09E.8090603@stoneleaf.us> <1460057757.1204370.572180449.21C09305@webmail.messagingengine.com> Message-ID: On Thu, 7 Apr 2016 at 12:36 Random832 wrote: > On Thu, Apr 7, 2016, at 15:17, Brett Cannon wrote: > > How do you make Python 3.3 code work with this when the ABC will simply > > not > > be available to you unless you're running Python 3.4.3, 3.5.2, or 3.6.0 > > (under the assumption that the ABC is put in pathlib and backported > > thanks > > to its provisional status)? The ternary operator one-liner is > > backwards-compatible while the ABC is only forward-compatible. > > If it's not available to you, then it's not available to anyone to > register either, so you obviously act as if no objects are StringLike if > you get an ImportError when trying to use it. Isn't this just the > standard dance for using *any* function that's new to a new version of > Python? > Yes, but the lack of a magic method is not as severe as a lack of an ABC you will be using in an isinstance() check. > > try: > from pathlib import StringLike # if it's in pathlib why is it called > StringLike? > I called it StringLike because I was replying to Chris' proposal of a generic string-like protocol (which wouldn't live in pathlib). > def is_StringLike(x): return isinstance(x, StringLike) > except ImportError: > def is_StringLike(x): return False > That would cut out all third-party libraries no matter what their Python version support was. 
My point is that if I wanted this to work in Python 3.3.x, Python 3.4.2, or Python 3.5.1 then the ABC solution is out, as the ABC won't exist. The magic method, though, would still work with the one-liner all the way back to Python 2.5, when conditional expressions were added to the language. For instance, let's say I'm the author of a library that uses file paths and that wants to support Python 3.3 and newer. How do I add support for using pathlib.Path and Ethan's path library? With the magic method solution I can use:

def ospath(path):
    return path.__ospath__() if hasattr(path, '__ospath__') else path

If I really wanted to I could just embed that wherever I want to work with paths. Now how about the ABC?

# In pathlib.
class StringPath(abc.ABC):
    @abstractmethod
    def __str__(self):
        ...

StringPath.register(pathlib.PurePath)
StringPath.register(str)  # Maybe not cover str?

# In my library trying to support Python 3.3 and newer,
# pathlib and Ethan's path library.
try:
    from pathlib import StringPath
except ImportError:
    StringPath = None

def ospath(path):
    if StringPath is not None:
        if isinstance(path, StringPath):
            return str(path)
    # What am I supposed to do here?

Now you could set `StringPath = object`, but that starts to negate the point of not subclassing strings as you're now accepting anything that defines __str__() in that case unless you're on a version of Python "new" enough to have pathlib w/ the ABC defined. And if you go some other route, what would you want to do if StringPath wasn't available? So the ABC vs magic method discussion comes down to whether we think third-party libraries will use whatever approach is decided upon and whether they care about how good the support is for Python 3.3.x, 3.4.2, and 3.5.1. -------------- next part -------------- An HTML attachment was scrubbed...
URL: From ethan at stoneleaf.us Thu Apr 7 18:09:01 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 07 Apr 2016 15:09:01 -0700 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: References: <57066A38.7050408@egenix.com> <57067584.1070900@mrabarnett.plus.com> <57067C3E.9030808@egenix.com> <57068A72.4090708@egenix.com> <5706AA26.1090504@egenix.com> <5706B09E.8090603@stoneleaf.us> <5706B85F.6080606@stoneleaf.us> <5706C309.8030108@stoneleaf.us> Message-ID: <5706DA7D.9010105@stoneleaf.us> On 04/07/2016 02:26 PM, Brett Cannon wrote: > Yep, you're right. I was never worried about making the stdlib work > since that's a fully controlled environment that we can update at once. > What I'm worried about is any third-party library that has an API that > takes a path as an argument that may need to be updated to support > pathlib -- or any other path library -- and doesn't want to directly > rely on Python 3.6 for support. Ah, I understand (finally!). Okay, let's go with the protocol then. If it makes sense to also have an ABC I'm fine with that. -- ~Ethan~ From eric at trueblade.com Thu Apr 7 20:09:12 2016 From: eric at trueblade.com (Eric V. Smith) Date: Thu, 7 Apr 2016 20:09:12 -0400 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: References: <20160407132845.GA15898@phdru.name> <5706761C.3070907@stoneleaf.us> <57068023.4080006@stoneleaf.us> <5706AE1D.7030605@stoneleaf.us> Message-ID: <5706F6A8.5000200@trueblade.com> On 4/7/2016 3:00 PM, Chris Barker wrote: > On Thu, Apr 7, 2016 at 11:59 AM, Ethan Furman > wrote: > > __index__ was originally created to support indexing, but has > morphed over time to mean "something that can be losslessly > converted to an integer". > > > Ahh! very good precedent, then -- where else is this used? It's used by hex, oct, and bin, at least: >>> class Foo: ... def __index__(self): return 42 ... 
>>> hex(Foo())
'0x2a'
>>> oct(Foo())
'0o52'
>>> bin(Foo())
'0b101010'
>>>

From k7hoven at gmail.com Thu Apr 7 20:27:42 2016 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Fri, 8 Apr 2016 03:27:42 +0300 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: <5706DA7D.9010105@stoneleaf.us> References: <57066A38.7050408@egenix.com> <57067584.1070900@mrabarnett.plus.com> <57067C3E.9030808@egenix.com> <57068A72.4090708@egenix.com> <5706AA26.1090504@egenix.com> <5706B09E.8090603@stoneleaf.us> <5706B85F.6080606@stoneleaf.us> <5706C309.8030108@stoneleaf.us> <5706DA7D.9010105@stoneleaf.us> Message-ID:

I like this double-underscore path attribute better than my suggestion almost two weeks ago in the other thread, which did not get any reactions:

On Sun, Mar 27, 2016 at 5:40 PM, Koos Zevenhoven wrote: > On Sun, Mar 27, 2016 at 1:23 AM, Koos Zevenhoven wrote: >> >> I assume you meant to type pathlib.Path.path, so that Path("...").path == >> str(Path("...")). That's a good start, and I'm looking forward to Serhiy's >> patch for making the stdlib accept Paths. But if Path will not subclass str, >> we also need new stdlib functions that *return* Paths. > > Actually, now that .path is not out yet, would it make sense to call it > Path.str or Path.strpath instead, and introduce the same thing on DirEntry > and guarantee a str (instead of str or bytes as DirEntry.path now does)? > Maybe that would lead to fewer broken implementations in third-party > libraries too?

But maybe it could be called `__pathname__`, since 'names' are commonly thought of as strings. This, implemented across the stdlib, would obviously be better than the status quo. Anyway, you are now discussing things that remind me a lot of my thoughts from last week.
I hope everyone realizes that, while this would still require effort from the maintainers of every library that deals with paths, this would not allow libraries to start returning pathlib objects from functions without either breaking backwards compatibility of their APIs or adding duplicate functions. We will end up with a mess of some libraries accepting path objects and some not, and some that screw up their DirEntry compatibility regarding bytestring paths or that break their existing pure bytestring compatibility that they didn't know of. All this could take a while and give people an impression of inconsistency in Python. -Koos From brenbarn at brenbarn.net Thu Apr 7 21:32:58 2016 From: brenbarn at brenbarn.net (Brendan Barnwell) Date: Thu, 07 Apr 2016 18:32:58 -0700 Subject: [Python-ideas] Changing the meaning of bool.__invert__ In-Reply-To: <570679AF.7020502@stoneleaf.us> References: <20160407094618.5c01847d@fsol> <570679AF.7020502@stoneleaf.us> Message-ID: <57070A4A.3000606@brenbarn.net> On 2016-04-07 08:15, Ethan Furman wrote: > On 04/07/2016 12:46 AM, Antoine Pitrou wrote: >> >> How about changing the behaviour of bool.__invert__ to make it in line >> with the Numpy boolean? >> (i.e. bool.__invert__ == operator.not_) > > No. bool is a subclass of int, and changing that now would be a serious > breach of backward-compatibility, not to mention breaking existing code > for no good reason. Let's not forget that subclasses don't have to exactly duplicate all the behavior of their superclasses. That's why there's such a thing as overriding. Bool could remain a subclass of int, and still change its __invert__ behavior by overriding __invert__. It's true that this would be a backwards incompatible change, but behavior like ~True==-2 doesn't seem like something a lot of people are relying on. It would be worth looking into how much code actually does rely on it. -- Brendan Barnwell "Do not follow where the path may lead. 
Go, instead, where there is no path, and leave a trail." --author unknown

From steve at pearwood.info Thu Apr 7 21:40:30 2016 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 8 Apr 2016 11:40:30 +1000 Subject: [Python-ideas] Changing the meaning of bool.__invert__ In-Reply-To: References: <20160407094618.5c01847d@fsol> <570679AF.7020502@stoneleaf.us> <57068A10.3020904@stoneleaf.us> <57069645.8040509@stoneleaf.us> Message-ID: <20160408014030.GL12526@ando.pearwood.info>

On Thu, Apr 07, 2016 at 11:08:38AM -0700, Guido van Rossum wrote: > To be more precise, there are some "arithmetic" operations (+, -, > *, /, **) and they all treat bools as ints and always return ints; > there are also some "bitwise" operations (&, |, ^, ~) and they should > all treat bools as bools and return a bool.

You missed two: >> and <<. What are we to do with (True << 1)? Honestly, I cannot even imagine what it means to say "shift a truth value N bits". I think the idea that bitwise operations on bools are actually boolean operations in disguise is not a well-formed idea. Sometimes it happens to work out (& | ^), and sometimes it doesn't (<< and ~). And I'm not sure what to make of >> as an operation on bools. It doesn't *mean* anything, you can't shift a truth value, the very concept is meaningless, but if it did mean something it would surely return False. So >> could go into either category. But ultimately, ~ has meant bitwise-not for 25 years, and it's never caused a problem before, not back in the days when people used to write TRUE, FALSE = 1, 0 and not now. If you want to perform a boolean "not" on a truth value, you use `not`. Nobody cared enough to "fix" this (if it is a problem that needs fixing, which I doubt) when bools were first introduced, and nobody cared when Python 3 came out. So why are we talking about rushing a backwards-incompatible semantic change into a point release? Even if we "fix" this, surely we should go through the usual deprecation process?
This isn't a critical security bug that needs fixing, it's a semantic change to something that has worked this way for 25 years, and it's going to break something somewhere. There are just far too many people that expect that bools are ints. After all, notwithstanding their fancy string representation, they behave like ints and actually are ints. -- Steve

From steve at pearwood.info Thu Apr 7 21:53:44 2016 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 8 Apr 2016 11:53:44 +1000 Subject: [Python-ideas] Changing the meaning of bool.__invert__ In-Reply-To: References: <20160407094618.5c01847d@fsol> <20160407110425.GK12526@ando.pearwood.info> Message-ID: <20160408015344.GM12526@ando.pearwood.info>

On Thu, Apr 07, 2016 at 01:17:57PM +0000, Antoine Pitrou wrote: > Steven D'Aprano writes: > > > > > Numpy's boolean type does the more useful (and more expected) thing: > > > > > > >>> ~np.bool_(True) > > > False > > > > Expected by whom? > > By anyone who takes booleans at face value (that is, takes booleans as > representing a truth value and expects operations on booleans to reflect > the semantics of useful operations on truth values, not some arbitrary > side-effect of the internal representation of a boolean...).

Bools in Python have *always* been integers, so who are these people taking booleans at face value? Beginners? If so, say so. That's a motive I can understand. But I think that people who expect bools to be real truth values, like in Pascal, probably won't expect BITWISE operations to operate on them at all and will use the BOOLEAN operators and, or, not. Bitwise operators operate on a sequence of bits, not a single truth value. What do these naive "bools are truth values" people think True << 3 should return? I don't think we need a second way to spell "not bool". Bools have always been ints in Python, and apart from their fancy string representation they behave like ints. I don't think it helps to make ~ a special case where they don't.
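For the record, here is how the full set of operators in question behaves on bools today; every line below is current CPython behaviour, not a proposal:

```python
# bool is an int subclass; some operators are overridden, some are not.
assert issubclass(bool, int)

# &, |, ^ are overridden on bool and return bools:
assert (True & False) is False
assert (True | False) is True
assert (True ^ True) is False

# ~ is NOT overridden: it falls back to int.__invert__, so ~True == ~1 == -2
assert ~True == -2
assert ~False == -1

# The shift operators likewise widen to int:
assert (True << 3) == 8
assert (True >> 1) == 0

# Boolean negation is spelled `not`:
assert (not True) is False
```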
> But I'm not surprised by such armchair commenting and pointless controversy > on python-ideas, since that's what the list is for.... Thanks for your feedback, I'll give it the due consideration it deserves. -- Steve

From ethan at stoneleaf.us Thu Apr 7 21:56:23 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 07 Apr 2016 18:56:23 -0700 Subject: [Python-ideas] Hijacking threads [was: Changing the meaning of bool.__invert__] Message-ID: <57070FC7.3010803@stoneleaf.us>

In the recent thread about changing the meaning of __invert__ on bools I was accused of an attempted hijack. Can anybody please enlighten me as to what, exactly, I did wrong? Original post follows: > I think the str() of a value, while possibly being the most > interesting piece of information (IntEnum, anyone?), is hardly the > most intrinsic. > > If we do make this change, besides needing a couple major versions to > make it happen, will anything else be different? > > - no longer subclass int? > - add an "unknown" value? > - how will indexing work? > - or any of the other operations? > - don't bother with any of the other mathematical operations? > - counting True's is not the same as adding True's > > I'm not firmly opposed, I just don't see a major issue here -- I've > needed an Unknown value far more often than I've needed ~True to be > False. -- ~Ethan~

From guido at python.org Thu Apr 7 22:13:29 2016 From: guido at python.org (Guido van Rossum) Date: Thu, 7 Apr 2016 19:13:29 -0700 Subject: [Python-ideas] Hijacking threads [was: Changing the meaning of bool.__invert__] In-Reply-To: <57070FC7.3010803@stoneleaf.us> References: <57070FC7.3010803@stoneleaf.us> Message-ID:

You know full well that none of the things you brought up are up for discussion. Honestly I don't care any more and I am going to mute this thread and any others in the same vein.
On Thu, Apr 7, 2016 at 6:56 PM, Ethan Furman wrote: > In the recent thread about changing the meaning of __invert__ on bools I was > accused of an attempted hijack. > > Can anybody please enlighten me as to what, exactly, I did wrong? > > Original post follows: > >> I think the str() of a value, while possibly being the most >> interesting piece of information (IntEnum, anyone?), is hardly the >> most intrinsic. >> >> If we do make this change, besides needing a couple major versions to >> make it happen, will anything else be different? >> >> - no longer subclass int? >> - add an "unknown" value? >> - how will indexing work? >> - or any of the other operations? >> - don't bother with any of the other mathematical operations? >> - counting True's is not the same as adding True's >> >> I'm not firmly opposed, I just don't see a major issue here -- I've >> needed an Unknown value far more often than I've needed ~True to be >> False. > > -- > ~Ethan~ > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -- --Guido van Rossum (python.org/~guido)

From alexander.belopolsky at gmail.com Thu Apr 7 22:17:16 2016 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Thu, 7 Apr 2016 22:17:16 -0400 Subject: [Python-ideas] Hijacking threads [was: Changing the meaning of bool.__invert__] In-Reply-To: <57070FC7.3010803@stoneleaf.us> References: <57070FC7.3010803@stoneleaf.us> Message-ID:

On Thu, Apr 7, 2016 at 9:56 PM, Ethan Furman wrote: > In the recent thread about changing the meaning of __invert__ on bools I > was accused of an attempted hijack. > > Can anybody please enlighten me as to what, exactly, I did wrong? > You greatly expanded the scope of the discussion. The original thread was focused on a single feature, and a not so widely used one. You asked 6-7 (rhetorical?)
questions below that have nothing to do with the original topic. > > Original post follows: > > > I think the str() of a value, while possibly being the most > > interesting piece of information (IntEnum, anyone?), is hardly the > > most intrinsic. > > > > If we do make this change, besides needing a couple major versions to > > make it happen, will anything else be different? > > > > - no longer subclass int? > > - add an "unknown" value? > > - how will indexing work? > > - or any of the other operations? > > - don't bother with any of the other mathematical operations? > > - counting True's is not the same as adding True's > > > > I'm not firmly opposed, I just don't see a major issue here -- I've > > needed an Unknown value far more often than I've needed ~True to be > > False. > > -- > ~Ethan~ > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL:

From rian at thelig.ht Thu Apr 7 22:17:33 2016 From: rian at thelig.ht (Rian Hunter) Date: Thu, 7 Apr 2016 19:17:33 -0700 (PDT) Subject: [Python-ideas] Consistent programming error handling idiom Message-ID:

I like that in Python all errors are exceptions. This enables elegant code and provides a default error idiom for all Python code. An important distinction between exceptions arises when handling an exception in a top-level exception handler that doesn't have the context to properly handle it. In these situations some exceptions can be ignored (and hopefully logged) and some exceptions should terminate/reset the program (not necessarily literally). I'd call exceptions that should terminate the program "programming errors" or "bugs."
It's practically impossible to gracefully recover from a programming error when encountered (you can potentially hot reload a bug fix, but I digress). Examples of current exceptions that definitely represent programming errors include AssertionError, NameError and SyntaxError. In the case of AssertionError, it's very important the program terminates or the state is reset, lest you run the risk of corrupting persistent data. I'd argue that other exceptions like AttributeError, ValueError, and TypeError can also represent programming errors depending on where they are caught. If one of these exceptions is caught near the top of the call stack where nothing useful can be done, it's very likely a programming error. If it's caught locally where it can be predictably handled, no hard reset is necessary, e.g.:

    try:
        foo = val.bar
    except AttributeError:
        foo = 0

Contrast with the following code that never makes sense (and is why I said that NameError definitely signifies a programming error):

    try:
        foo = bar
    except NameError:
        foo = 0

Toy examples aside, this problem arises in real programs, like an extensible HTTP server:

    try:
        response = client_request_handler(global_state, connection_state, request)
    except Exception:
        response = create_500_response()
        # TODO: should the server be reset?
        # is global_state invalid? is connection_state invalid?

In this case I think it's polite to always send a "500" error (the HTTP status code for "internal server error"), but the question remains as to whether or not the server should reset its global state or close the connection. Something smells here, and I don't think Python currently has a solution to this problem. I don't think it should be ambiguous by the time an exception is caught whether the program has a bug in it or whether it has simply run into an external error. I don't know what the solution to this should be. I've seen proposals like a new exception hierarchy for "bad code"
but I don't think a good solution necessarily requires changes to the language. I think the solution could be as simple as a universally accepted PEP (like PEP 8) that discusses some of these issues, and related ones, and presents correct / pythonic ways of dealing with them. Maybe the answer is recommending that top-level exception handlers should only be used with extreme care and, unless you know what you're doing, it's best to let your program die (or affected state reset) and bias towards more fine-grained exception handling. Thoughts? Thanks for reading. Rian

From rosuav at gmail.com Thu Apr 7 23:28:20 2016 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 8 Apr 2016 13:28:20 +1000 Subject: [Python-ideas] Consistent programming error handling idiom In-Reply-To: References: Message-ID:

On Fri, Apr 8, 2016 at 12:17 PM, Rian Hunter wrote: > Contrast with the following code that never makes sense (and is why I > said that NameError definitely signifies a programming error): > > try: > foo = bar > except NameError: > foo = 0

This is exactly the idiom used to cope with builtins that may or may not exist. If you want to support Python 2 as well as 3, you might use something like this:

    try:
        input = raw_input
    except NameError:
        raw_input = input

After this, you're guaranteed that both names exist and refer to the non-evaluating input function. So it doesn't *definitely* signify a bug; like every other exception, you can declare that it's a known and expected situation by try/excepting. An *uncaught* NameError is legitimately a bug - but then, so would most uncaught exceptions be. StopIteration gets raised and caught all the time when you iterate over things, but if one leaks out, it's a bug somewhere.
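That last point is easy to make concrete: the very same exception is routine when it is expected and handled, and only a bug when it leaks out uncaught (all current behaviour, the names are just for illustration):

```python
it = iter([1, 2])
assert next(it) == 1
assert next(it) == 2

# Expected and handled: supply a default rather than letting it escape.
assert next(it, 'done') == 'done'

# Expected and caught: a known situation, not a bug.
try:
    next(it)
except StopIteration:
    outcome = 'exhausted'
assert outcome == 'exhausted'

# The iteration protocol itself catches StopIteration internally all the time:
assert [x * 2 for x in [1, 2]] == [2, 4]
```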
> Toy examples aside, this problem arises in real programs, like an > extensible HTTP server: > > try: > response = client_request_handler(global_state, connection_state, > request) > except Exception: > response = create_500_response() > # TODO: should the server be reset? > # is global_state invalid? is connection_state invalid?

This is what I'd call a boundary location. You have "outer" code and "inner" code. Any uncaught exception in the inner code should get logged rather than aborting the outer code. I'd spell it like this:

    try:
        response = ...
    except BaseException as e:
        logging.exception(...)
        response = create_500_response()

Notably, the exception should be *logged* in some way that the author of the inner code can find it. (The outer and inner code needn't be the same 'thing', and needn't have the same author, although they might.) At this kind of boundary, you basically catch-and-log *all* exceptions, handling them the same way. Doesn't matter whether it's ValueError, SyntaxError, NameError, RuntimeError, GeneratorExit, or SystemExit - they all get logged, the client gets a 500, and you go back and look for another request. As to resetting stuff: I wouldn't bother; your functions should already not mess with global state. The only potential mess you should consider dealing with is a database rollback; and actually, my personal recommendation is to do that with a context manager inside the inner code, rather than a reset in the exception handler in the outer code. I don't think there's anything to be codified here; all you have is the basic principle that uncaught exceptions are bugs, modified by boundary locations. > Maybe the > answer is recommending that top-level exception handlers should only > be used with extreme care and, unless you know what you're doing, it's > best to let your program die (or affected state reset) and bias > towards more fine-grained exception handling. This should already be the recommendation.
The only time you should ever catch an exception is when you can actually do something useful with it; at a boundary location, you log all exceptions and return to some sort of main loop, and everywhere else, you catch stuff because you can usefully cope with it. This is exactly how structured exception handling should normally be used; most programs have no boundaries in them, so you simply catch what you can handle and let the rest hit the console. ChrisA From stephen at xemacs.org Fri Apr 8 01:51:08 2016 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Fri, 8 Apr 2016 14:51:08 +0900 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: <1460038026.1126929.571843177.7BC2EE4C@webmail.messagingengine.com> References: <1460038026.1126929.571843177.7BC2EE4C@webmail.messagingengine.com> Message-ID: <22279.18124.303055.108794@turnbull.sk.tsukuba.ac.jp> Random832 writes: > How about ASCII bytes strings [as a use case]? ;) Very good point, and not at all funny. In fact, this is pretty well useless except to try to propagate the Python 2 model of str as maybe-encoded-maybe-not into Python 3 "when it's safe". But that's way less than "99%+" of use cases. Anything where you combine it with bytes has to result in bytes, or prepare for the combining operation to raise if the bytes contain any non-ASCII. Ditto str. Either way, you've not really gained much safety for the programmer, although maybe the user would prefer a crashed program to corrupt data or mojibake on screen. But they'd rather the program worked as intended, so the programmer has to do pretty much the same amount of thinking and coding around the semantic differences between Python 2 and Python 3 AFAICS. The only place I can see this StringableASCIIBytes *frequently* being really "safe" is scripting where you pass a string literal representing a path to an os function, or print a string literal. But guess what, that already works if you just use a str literal! 
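For concreteness, the kind of type under discussion might be sketched like this; this is purely illustrative (`Ascii` is a made-up class, not code anyone has proposed on the list), but it shows the property being debated: rejection of non-ASCII happens at construction, so later combination with either str or bytes cannot fail on encoding:

```python
class Ascii(str):
    """Hypothetical ASCII-only text type: rejects non-ASCII at construction."""
    def __new__(cls, value):
        value.encode('ascii')  # raises UnicodeEncodeError if non-ASCII
        return super().__new__(cls, value)

    def __bytes__(self):
        return self.encode('ascii')

a = Ascii('hello')
assert a + ' world' == 'hello world'   # combines cleanly with str
assert bytes(a) + b'!' == b'hello!'    # and converts cleanly to bytes

try:
    Ascii('h\u00e9llo')
except UnicodeEncodeError:
    rejected = True                    # non-ASCII is refused up front
assert rejected
```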
I'm also not really sure that, given the flexibility and ubiquity of string representations, there really are a lot of use cases whose use of __lossless_i_promise_str__ is mutually compatible. Eg, if we use this for pathlib.Path and for urllib_tng.URL, are those really compatible in the sense that we're happy handing an urllib_tng.URL to open() and have it try to operate on that? If not, we need separate dunders for this purpose. And ditto for all those hypothetical future uses (which are unlikely to be as compatible as filesystem paths and URIs, or URI paths).

From stephen at xemacs.org Fri Apr 8 01:52:45 2016 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Fri, 8 Apr 2016 14:52:45 +0900 Subject: [Python-ideas] Changing the meaning of bool.__invert__ In-Reply-To: <20160408014030.GL12526@ando.pearwood.info> References: <20160407094618.5c01847d@fsol> <570679AF.7020502@stoneleaf.us> <57068A10.3020904@stoneleaf.us> <57069645.8040509@stoneleaf.us> <20160408014030.GL12526@ando.pearwood.info> Message-ID: <22279.18221.103226.654215@turnbull.sk.tsukuba.ac.jp>

Steven D'Aprano writes: > On Thu, Apr 07, 2016 at 11:08:38AM -0700, Guido van Rossum wrote: > > > To be more precise, there are some "arithmetic" operations (+, -, > > *, /, **) and they all treat bools as ints and always return ints; > > there are also some "bitwise" operations (&, |, ^, ~) and they should > > all treat bools as bools and return a bool.

I'm with Steven on this. Knuth would call these operations "seminumerical". I would put the emphasis on "numerical", *expecting* True and False to be one-bit representations of the (mathematical) integers 1 and 0. If numerical operations widen bools to int and then operate, I would *expect* seminumerical operations to do so as well. In fact, I was startled by Antoine's post. I even have a couple of lines of code using "^" as a *logical* operator on known bools, carefully labeled "# Hack! works only on true bools."
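That hack is easy to demonstrate: `^` agrees with logical exclusive-or only when both operands are genuine bools, which is presumably why the label is needed (a small illustration of current behaviour, not code from the original mail):

```python
# On true bools, ^ coincides with logical exclusive-or:
assert (True ^ False) is True
assert (True ^ True) is False

# On arbitrary truthy ints it is bitwise, not logical:
assert (2 ^ 3) == 1   # both operands truthy, yet the result is truthy too
assert (1 ^ 2) == 3   # logical XOR of two truthy values "should" be falsey

# The usual guard is to normalise to bool first:
def bool_xor(a, b):
    return bool(a) ^ bool(b)

assert bool_xor(1, 2) is False
assert bool_xor(0, 2) is True
```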
That said, I'm not Dutch, and if treating bool as "not actually int" here is the right thing to do, then I would think the easiest thing to do would be to interpret the bitwise operations as performed on (mythical) C "itty-bitty ints".[1] Then ~ does the right thing and True << 1 == True >> 1 == False << 1 == False >> 1 == 0 giving us four new ways to spell 0 as a bonus! > After all, not withstanding their fancy string representation, I guess "fancy string representation" was the original motivation for the overrides. If the intent was really to make operator versions of logical operators (but only for true bools!), they would have fixed ~ too. > they behave like ints and actually are ints. I can't fellow-travel all the way to "actually are", though. bools are what we decide to make them. I just don't see why the current behaviors of &|^ are particularly useful, since you'll have to guard all bitwise expressions against non-bool truthies and falsies. Footnotes: [1] "itty-bitty" almost reads "1 bit" in Japanese! From rosuav at gmail.com Fri Apr 8 02:03:42 2016 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 8 Apr 2016 16:03:42 +1000 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: <22279.18124.303055.108794@turnbull.sk.tsukuba.ac.jp> References: <1460038026.1126929.571843177.7BC2EE4C@webmail.messagingengine.com> <22279.18124.303055.108794@turnbull.sk.tsukuba.ac.jp> Message-ID: On Fri, Apr 8, 2016 at 3:51 PM, Stephen J. Turnbull wrote: > Anything > where you combine it with bytes has to result in bytes, or prepare for > the combining operation to raise if the bytes contain any non-ASCII. More helpfully, the Ascii object would raise *before* that - it would raise on construction if it had any non-ASCII in it. Combining an Ascii with a bytes results in a perfectly well-formed bytes; combining an Ascii with a unicode results in a perfectly well-formed unicode. But I'm not sure how (a) useful and (b) feasible this is. 
The only real use-case I know of is Path objects, and there may be similar tasks around numpy or pandas, but I don't know for sure. ChrisA From ethan at stoneleaf.us Fri Apr 8 02:08:46 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 07 Apr 2016 23:08:46 -0700 Subject: [Python-ideas] Hijacking threads [was: Changing the meaning of bool.__invert__] In-Reply-To: References: <57070FC7.3010803@stoneleaf.us> Message-ID: <57074AEE.2080300@stoneleaf.us> On 04/07/2016 07:17 PM, Alexander Belopolsky wrote: > You greatly expanded the scope of the discussion. The original thread > was focused on a single feature, and a not so widely used one. You > asked 6-7 (rhetorical?) questions below that have nothing to do with the > original topic. Thank you, Alexander. I appreciate the feedback. It's entirely possible I'm talking to myself, but in case anyone is listening: Guido, it felt to me that you were being a jerk. I was definitely an ass in response. For that I apologize. As for broadening the scope of the discussion I will say this: So far I have authored or helped with three successful PEPs; The first was to amend an incomplete feature (raise from None), and the third was to add back a feature that had been ripped out (%-interpolation with bytes). Why were those two even necessary? I suspect because the folks involved were only thinking about their own needs and/or didn't have the relevant experience as to why those features were useful. Perhaps I am only flattering myself, but I think I try hard to see all sides of every issue, and the only way I can do that is by asking questions of those with more or different experience than I have. I think the current pathlib discussions are a fair indicator: I don't particularly care for it, and might never use it -- but I hate to see it tossed and Antoine's work and effort lost; so I'm working to find a reasonable way to keep it, and not just in asking questions and offering ideas -- I volunteered my time to write the code. 
I'm starting to ramble so let me close with this: I'm not sorry for asking questions and trying to look at the broader issues, and I'm not going to stop doing that -- but I will stop pursuing any particular issue when asked to do so... but please don't be insulting about it. -- ~Ethan~ From greg.ewing at canterbury.ac.nz Fri Apr 8 02:26:34 2016 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 08 Apr 2016 18:26:34 +1200 Subject: [Python-ideas] Changing the meaning of bool.__invert__ In-Reply-To: References: <20160407094618.5c01847d@fsol> <570679AF.7020502@stoneleaf.us> <57068A10.3020904@stoneleaf.us> <57069645.8040509@stoneleaf.us> Message-ID: <57074F1A.20009@canterbury.ac.nz> Michael Selik wrote: > To clarify, the proposal is: ``~True == False`` but every other operation on > ``True`` remains the same, including ``True * 42 == 42``. Correct? Seems to me things are fine as they are. The justification for & and | on bools returning bools is that the result remains within the domain of bools, even when they are interpreted as int operations. But ~ on a bool-interpreted-as-an-int doesn't have that property, so ~True is more in the realm of True * 42 in that regard. -- Greg From greg.ewing at canterbury.ac.nz Fri Apr 8 02:54:33 2016 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 08 Apr 2016 18:54:33 +1200 Subject: [Python-ideas] Changing the meaning of bool.__invert__ In-Reply-To: References: <20160407094618.5c01847d@fsol> <570679AF.7020502@stoneleaf.us> <57068A10.3020904@stoneleaf.us> <57069645.8040509@stoneleaf.us> Message-ID: <570755A9.5050104@canterbury.ac.nz> Terry Reedy wrote: > Given that ~bool hardly make any sense currently, I would not expect it > to be in much use now. Hence not much to break. But conversely, any code that *is* using ~bool instead of "not bool" is probably doing it precisely because it *does* want the integer interpretation. 
Why break that code, when "not bool" is available as the obvious way of getting a logically negated bool? -- Greg From greg.ewing at canterbury.ac.nz Fri Apr 8 03:06:33 2016 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 08 Apr 2016 19:06:33 +1200 Subject: [Python-ideas] Hijacking threads [was: Changing the meaning of bool.__invert__] In-Reply-To: <57070FC7.3010803@stoneleaf.us> References: <57070FC7.3010803@stoneleaf.us> Message-ID: <57075879.1050609@canterbury.ac.nz> Ethan Furman wrote: > Can anybody please enlighten me as to what, exactly, I did wrong? I think Guido was objecting to talk about things such as making bool no longer subclass int, which is not only going a long way beyond the original proposal, but is almost certainly never going to happen, so any discussion of it could be seen as wasting people's time. -- Greg From njs at pobox.com Fri Apr 8 03:34:06 2016 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 8 Apr 2016 00:34:06 -0700 Subject: [Python-ideas] Changing the meaning of bool.__invert__ In-Reply-To: <570755A9.5050104@canterbury.ac.nz> References: <20160407094618.5c01847d@fsol> <570679AF.7020502@stoneleaf.us> <57068A10.3020904@stoneleaf.us> <57069645.8040509@stoneleaf.us> <570755A9.5050104@canterbury.ac.nz> Message-ID: On Thu, Apr 7, 2016 at 11:54 PM, Greg Ewing wrote: > Terry Reedy wrote: >> >> Given that ~bool hardly make any sense currently, I would not expect it to >> be in much use now. Hence not much to break. > > > But conversely, any code that *is* using ~bool instead of > "not bool" is probably doing it precisely because it > *does* want the integer interpretation. It would be interesting actually if anyone has any idea of how to get some empirical data on this -- it isn't actually clear to me whether this is true or not. 
The reason I'm uncertain is that in numpy code, using operations like ~ on booleans is *very* common, because the whole idea of numpy is that it gives you a way to write code that works the same on either a single value or on an array of values, and when you're working with booleans then this means you have to use '~': '~' works on arrays and 'not' doesn't. And, for numpy bools or arrays of bools, ~ does logical negation:

In [1]: ~np.bool_(True)
Out[1]: False

So you can write code like:

    # Contrived function that doubles 'value' if do_double is true
    # and otherwise halves it
    def double_or_halve(value, do_double):
        value = np.asarray(value)
        value[do_double] *= 2
        value[~do_double] *= 0.5
        return value

and then this works correctly if 'do_double' is a numpy bool or array of bools:

In [16]: double_or_halve(np.arange(3, dtype=float), np.array([True, False, True]))
Out[16]: array([ 0. ,  0.5,  4. ])

In [21]: double_or_halve(5.0, np.bool_(False))
Out[21]: array(2.5)

But if you pass in a regular Python bool then the attempt to index by ~do_double turns into negative integer indexing and blows up:

In [23]: double_or_halve(5.0, False)
IndexError: too many indices for array

Of course this is a totally contrived function, and anyway it has a bug -- the user should have said 'do_double = np.asarray(do_double)' at the top of the function, and that would fix the problem. This is definitely not some massive problem afflicting numerical users, and I don't have any strong opinion on Antoine's proposal. But, it is the only case where I can imagine someone intentionally writing ~bool, so it actually strikes me as plausible that the majority of existing code that writes ~bool is like this: doing it by mistake and expecting it to be the same as 'not'. FWIW. -n -- Nathaniel J.
Smith -- https://vorpus.org From niki.spahiev at gmail.com Fri Apr 8 03:46:36 2016 From: niki.spahiev at gmail.com (Niki Spahiev) Date: Fri, 8 Apr 2016 10:46:36 +0300 Subject: [Python-ideas] Changing the meaning of bool.__invert__ In-Reply-To: <57074F1A.20009@canterbury.ac.nz> References: <20160407094618.5c01847d@fsol> <570679AF.7020502@stoneleaf.us> <57068A10.3020904@stoneleaf.us> <57069645.8040509@stoneleaf.us> <57074F1A.20009@canterbury.ac.nz> Message-ID: On 8.04.2016 09:26, Greg Ewing wrote: > Michael Selik wrote: >> To clarify, the proposal is: ``~True == False`` but every other >> operation on >> ``True`` remains the same, including ``True * 42 == 42``. Correct? > > Seems to me things are fine as they are. > > The justification for & and | on bools returning bools > is that the result remains within the domain of bools, > even when they are interpreted as int operations. > > But ~ on a bool-interpreted-as-an-int doesn't have > that property, so ~True is more in the realm of > True * 42 in that regard. What should be the result of +True? 1 or True? regards, Niki From k7hoven at gmail.com Fri Apr 8 08:28:08 2016 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Fri, 8 Apr 2016 15:28:08 +0300 Subject: [Python-ideas] Changing the meaning of bool.__invert__ In-Reply-To: <20160408014030.GL12526@ando.pearwood.info> References: <20160407094618.5c01847d@fsol> <570679AF.7020502@stoneleaf.us> <57068A10.3020904@stoneleaf.us> <57069645.8040509@stoneleaf.us> <20160408014030.GL12526@ando.pearwood.info> Message-ID: On Fri, Apr 8, 2016 at 4:40 AM, Steven D'Aprano wrote: > On Thu, Apr 07, 2016 at 11:08:38AM -0700, Guido van Rossum wrote: > >> To be more precise, there are some "arithmetic" operations (+, -, >> *, /, **) and they all treat bools as ints and always return ints; >> there are also some "bitwise" operations (&, |, ^, ~) and they should >> all treat bools as bools and return a bool. > > You missed two: >> and <<. What are we to do with (True << 1)? 
> Not everyone considers bit shifts 'bitwise', as they don't act at the level of individual bit positions: https://en.wikipedia.org/wiki/Bitwise_operation > Honestly, I cannot even imagine what it means to say "shift a truth > value N bits". I think the idea that bitwise operations on bools are > actually boolean operations in disguise is not a well-formed idea. > Sometimes it happens to work out (& | ^), and sometimes it doesn't (<< > and ~). One point of view is that bitwise operations should stay within bool, while shifts return ints, the left-shift operations actually being much more useful than "left-pad" ;). The main point of >> can be seen as consistency, although perhaps useless. That said, I don't really have an opinion on the OP's suggestion. -Koos From songofacandy at gmail.com Fri Apr 8 09:25:50 2016 From: songofacandy at gmail.com (INADA Naoki) Date: Fri, 8 Apr 2016 22:25:50 +0900 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: <57067C3E.9030808@egenix.com> References: <57066A38.7050408@egenix.com> <57067584.1070900@mrabarnett.plus.com> <57067C3E.9030808@egenix.com> Message-ID: > > > We have abstract base classes for such tests, but there's nothing > which would define "string-like" as ABC. Before trying to define > a test via a special method, I think it's better to define what > exactly you mean by "string-like". > > Something along the lines of numbers.Number, but for strings. > > To make an object string-like, you then test for the ABC and > then call .__str__() to get the string representation as string. > > Is ABC easy to use from C? There should be an easy way to (1) define "string-like" class in C and (2) use "string-like" object from C. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From desmoulinmichel at gmail.com Fri Apr 8 10:15:10 2016 From: desmoulinmichel at gmail.com (Michel Desmoulin) Date: Fri, 8 Apr 2016 16:15:10 +0200 Subject: [Python-ideas] Boundaries for unpacking In-Reply-To: References: <570692E7.9080904@gmail.com> Message-ID: <5707BCEE.1090901@gmail.com> Le 07/04/2016 21:18, Terry Reedy a écrit : > On 4/7/2016 1:03 PM, Michel Desmoulin wrote: >> Python is a lot about iteration, and I often have to get values from an >> iterable. For this, unpacking is fantastic: >> >> a, b, c = iterable >> >> One problem that arises is that you don't know when iterable will >> contain 3 items. > > You need to augment if needed to get at least 3 items. > You need to chop if needed to get at most 3 items. The following does > exactly this. > > import itertools as it > a, b, c = it.islice(it.chain(iterable, it.repeat(default, 3)), 3) > > You can wrap this if you want with def exactly(iterable, n, default). > This might already be a recipe in the itertools doc recipe section. > Yes and you can also do that for regular slicing on list. But you don't, because you have regular slicing, which is cleaner, and easier to read and remember. This is not only hard to remember, but needs a comment. And an import. From p.f.moore at gmail.com Fri Apr 8 10:33:16 2016 From: p.f.moore at gmail.com (Paul Moore) Date: Fri, 8 Apr 2016 15:33:16 +0100 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: <57067C3E.9030808@egenix.com> References: <57066A38.7050408@egenix.com> <57067584.1070900@mrabarnett.plus.com> <57067C3E.9030808@egenix.com> Message-ID: On 7 April 2016 at 16:26, M.-A. Lemburg wrote: > We have abstract base classes for such tests, but there's nothing > which would define "string-like" as ABC. Before trying to define > a test via a special method, I think it's better to define what > exactly you mean by "string-like". 
As a more general point here, if the point is simply to group a set of types together for use by "user" code (whether application code or 3rd party libraries) then that's precisely what the ABC machinery is for. It doesn't require a core change or a PEP or anything other than agreement between the users of the facility to define an ABC for *anything*, even something as vague as "string-like" - if all parties concerned agree that something is "string-like" when it is an instance of the StringLike ABC, then that's a done deal. You can add a "to_string()" method to the ABC, which can be as simple as def to_string(obj): return str(obj) and let types for which that's not appropriate override it. The only real reason for a new "protocol" (a dunder method) is if the core interpreter or low-level parts of the stdlib need to participate. In that case, the ABC mechanisms may not be available/appropriate or may introduce unacceptable overhead (e.g., in something like dict lookup) But in general, we have a mechanism for things like this, why are people inventing home-grown solutions rather than using them? (In the case of pathlib and __fspath__, it's because of the low-level implications of adding support to open() and places like importlib - but that's *also* why the solution should remain focused on the specific problem, and not become over-generalised). Paul From guido at python.org Fri Apr 8 10:54:16 2016 From: guido at python.org (Guido van Rossum) Date: Fri, 8 Apr 2016 07:54:16 -0700 Subject: [Python-ideas] Hijacking threads [was: Changing the meaning of bool.__invert__] In-Reply-To: <57074AEE.2080300@stoneleaf.us> References: <57070FC7.3010803@stoneleaf.us> <57074AEE.2080300@stoneleaf.us> Message-ID: On Thu, Apr 7, 2016 at 11:08 PM, Ethan Furman wrote: > On 04/07/2016 07:17 PM, Alexander Belopolsky wrote: > >> You greatly expanded the scope of the discussion. The original thread >> was focused on a single feature, and a not so widely used one. 
You >> asked 6-7 (rhetorical?) questions below that have nothing to do with the >> original topic. > > > Thank you, Alexander. I appreciate the feedback. > > It's entirely possible I'm talking to myself, but in case anyone is > listening: > > Guido, it felt to me that you were being a jerk. I was definitely an ass in > response. For that I apologize. Thanks for that. I didn't mean to be a jerk, just to prevent the thread from derailing way out of scope. And I was on my cell phone, where I can't be as eloquent as when I have a real keyboard. So I apologize for sounding like a jerk. > As for broadening the scope of the discussion I will say this: > > So far I have authored or helped with three successful PEPs; The first was > to amend an incomplete feature (raise from None), and the third was to add > back a feature that had been ripped out (%-interpolation with bytes). > > Why were those two even necessary? I suspect because the folks involved > were only thinking about their own needs and/or didn't have the relevant > experience as to why those features were useful. > > Perhaps I am only flattering myself, but I think I try hard to see all sides > of every issue, and the only way I can do that is by asking questions of > those with more or different experience than I have. > > I think the current pathlib discussions are a fair indicator: I don't > particularly care for it, and might never use it -- but I hate to see it > tossed and Antoine's work and effort lost; so I'm working to find a > reasonable way to keep it, and not just in asking questions and offering > ideas -- I volunteered my time to write the code. > > I'm starting to ramble so let me close with this: I'm not sorry for asking > questions and trying to look at the broader issues, and I'm not going to > stop doing that -- but I will stop pursuing any particular issue when asked > to do so... but please don't be insulting about it. 
Starting a new thread with a broader (or different) scope is always totally fine. Bringing up a whole bunch of contentious things that distract from a relatively simple issue is not. Thanks for asking for feedback! -- --Guido van Rossum (python.org/~guido) From guido at python.org Fri Apr 8 11:00:27 2016 From: guido at python.org (Guido van Rossum) Date: Fri, 8 Apr 2016 08:00:27 -0700 Subject: [Python-ideas] Changing the meaning of bool.__invert__ In-Reply-To: <20160408014030.GL12526@ando.pearwood.info> References: <20160407094618.5c01847d@fsol> <570679AF.7020502@stoneleaf.us> <57068A10.3020904@stoneleaf.us> <57069645.8040509@stoneleaf.us> <20160408014030.GL12526@ando.pearwood.info> Message-ID: On Thu, Apr 7, 2016 at 6:40 PM, Steven D'Aprano wrote: > You missed two: >> and <<. What are we to do with (True << 1)? [...] Those are indeed ambiguous -- they are defined as multiplication or floor division with a power of two, e.g. x<<n is x*2**n and x>>n is x//2**n (for integral x and nonnegative n). The point of this thread seems to be to see whether some operations can be made more useful by staying in the bool domain -- I don't think making both of these return a bool would be more useful, so let's keep them unchanged. > Even if we "fix" this, surely we should go through the usual deprecation > process? This isn't a critical security bug that needs fixing, it's a > semantic change to something that has worked this way for 25 years, and > it's going to break something somewhere. There are just far too many > people that expect that bools are ints. After all, notwithstanding > their fancy string representation, they behave like ints and actually > are ints. The thing here is, this change is too small to warrant a __future__ import. So we're either going to introduce it in 3.6 and tell people about it in case their code might break, or we're never going to do it. I'm honestly on the fence, but I feel this is a rarely used operator so changing its meaning is not likely to break a lot of code. 
-- --Guido van Rossum (python.org/~guido) From random832 at fastmail.com Fri Apr 8 11:27:13 2016 From: random832 at fastmail.com (Random832) Date: Fri, 08 Apr 2016 11:27:13 -0400 Subject: [Python-ideas] Changing the meaning of bool.__invert__ In-Reply-To: References: <20160407094618.5c01847d@fsol> <570679AF.7020502@stoneleaf.us> <57068A10.3020904@stoneleaf.us> <57069645.8040509@stoneleaf.us> <20160408014030.GL12526@ando.pearwood.info> Message-ID: <1460129233.1465771.572986385.6254AC76@webmail.messagingengine.com> On Fri, Apr 8, 2016, at 11:00, Guido van Rossum wrote: > The thing here is, this change is too small to warrant a __future__ > import. So we're either going to introduce it in 3.6 and tell people > about it in case their code might break, or we're never going to do > it. I'm honestly on the fence, but I feel this is a rarely used > operator so changing its meaning is not likely to break a lot of code. What about just having a DeprecationWarning, but no __future__ import? From guido at python.org Fri Apr 8 11:42:52 2016 From: guido at python.org (Guido van Rossum) Date: Fri, 8 Apr 2016 08:42:52 -0700 Subject: [Python-ideas] Changing the meaning of bool.__invert__ In-Reply-To: <1460129233.1465771.572986385.6254AC76@webmail.messagingengine.com> References: <20160407094618.5c01847d@fsol> <570679AF.7020502@stoneleaf.us> <57068A10.3020904@stoneleaf.us> <57069645.8040509@stoneleaf.us> <20160408014030.GL12526@ando.pearwood.info> <1460129233.1465771.572986385.6254AC76@webmail.messagingengine.com> Message-ID: DeprecationWarning every time you use ~ on a bool? That would still be too big a burden on using it the new way. On Fri, Apr 8, 2016 at 8:27 AM, Random832 wrote: > > > On Fri, Apr 8, 2016, at 11:00, Guido van Rossum wrote: >> The thing here is, this change is too small to warrant a __future__ >> import. So we're either going to introduce it in 3.6 and tell people >> about it in case their code might break, or we're never going to do >> it. 
I'm honestly on the fence, but I feel this is a rarely used >> operator so changing its meaning is not likely to break a lot of code. > > What about just having a DeprecationWarning, but no __future__ import? > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -- --Guido van Rossum (python.org/~guido) From rian at thelig.ht Fri Apr 8 12:40:15 2016 From: rian at thelig.ht (Rian Hunter) Date: Fri, 8 Apr 2016 09:40:15 -0700 Subject: [Python-ideas] Consistent programming error handling idiom In-Reply-To: References: Message-ID: <5F6ABBCD-92B3-44D4-8358-28FE593C16A2@thelig.ht> > From: Chris Angelico >> On Fri, Apr 8, 2016 at 12:17 PM, Rian Hunter wrote: >> Contrast with the following code that never makes sense (and is why I >> said that NameError definitely signifies a programming error): >> >> try: >> foo = bar >> except NameError: >> foo = 0 > > This is exactly the idiom used to cope with builtins that may or may > not exist. If you want to support Python 2 as well as 3, you might use > something like this: > > try: > input = raw_input > except NameError: > raw_input = input Oops, good call. That was a bad example. Maybe this one is a bit better: try: assert no_bugs() except AssertionError: # bugs are okay pass > This is what I'd call a boundary location. You have "outer" code and > "inner" code. Any uncaught exception in the inner code should get > logged rather than aborting the outer code. I'd spell it like this: > > try: > response = ... > except BaseException as e: > logging.exception(...) > response = create_500_response() > > [snip] > > At this kind of boundary, you basically catch-and-log *all* > exceptions, handling them the same way. 
Doesn't matter whether it's > ValueError, SyntaxError, NameError, RuntimeError, GeneratorStop, or > SystemExit - they all get logged, the client gets a 500, and you go > back and look for another request. I'm very familiar with this pattern in Python and I've used it myself countless times. Unfortunately I've seen instances where it can lead to disastrous behavior, I think we all have. It seems to have limited usefulness in production and more usefulness in development. The point of my original email was precisely to put the legitimacy and usefulness of code like that into question. > As to resetting stuff: I wouldn't bother; your functions should > already not mess with global state. The only potential mess you should > consider dealing with is a database rollback; and actually, my > personal recommendation is to do that with a context manager inside > the inner code, rather than a reset in the exception handler in the > outer code. So I agree this pattern works if you assume all code is exception-safe (context managers will clean up intermediate state) and there are no programming errors. There are lots of things that good code should do but as we all well know good code doesn't always do those things. My point is that except-log loops are dangerous and careless in the face of programming errors. Programming errors are unavoidable. In a large system they are a daily fact of life. When there is a bug in production it's very dangerous to re-enter a code block that has demonstrated itself to be buggy, else you risk corrupting data. 
For example: def random_code(state): assert is_valid(state.data) def user_code_a(state): state.data = "bad data" # the following throws random_code(state) def user_code_b(state): state.db.write(state.data) def main_loop(): state = State() loop = [user_code_a, user_code_b] for fn in loop: try: fn(state) except Exception: log_exception() This code allows user_code_b() to execute and corrupt data even though random_code() was lucky enough to be called and detect bad state early on. You may say the fix is to assert correct data before writing to the database and, yes, that would fix the problem for future executions in this instance. That's not the point, the point is that incorrect buggy code is running in production today and it's imperative to have multiple safeguards to limit its damage. For example, you wouldn't have an except-log loop in an airplane control system. Sometimes an error is just an error but sometimes an error signifies the running system itself is in a bad state. It would be nice to distinguish between the two in a consistent way across all Python event loops. Halting on any escaped exception is inconvenient, but continuing after any escaped exception is dangerous. Rian From rosuav at gmail.com Fri Apr 8 12:53:21 2016 From: rosuav at gmail.com (Chris Angelico) Date: Sat, 9 Apr 2016 02:53:21 +1000 Subject: [Python-ideas] Consistent programming error handling idiom In-Reply-To: <5F6ABBCD-92B3-44D4-8358-28FE593C16A2@thelig.ht> References: <5F6ABBCD-92B3-44D4-8358-28FE593C16A2@thelig.ht> Message-ID: On Sat, Apr 9, 2016 at 2:40 AM, Rian Hunter wrote: > >> From: Chris Angelico >> This is exactly the idiom used to cope with builtins that may or may >> not exist. If you want to support Python 2 as well as 3, you might use >> something like this: >> >> try: >> input = raw_input >> except NameError: >> raw_input = input > Oops, good call. That was a bad example. 
Maybe this one is a bit better: > > try: > assert no_bugs() > except AssertionError: > # bugs are okay > pass Not sure I understand this example. It's obviously a toy, because you'd never put an 'assert' just to catch and discard the exception - the net result is that you call no_bugs() if you're in debug mode and don't if you're not, which is more cleanly spelled "if __debug__: no_bugs()". >> This is what I'd call a boundary location. You have "outer" code and >> "inner" code. Any uncaught exception in the inner code should get >> logged rather than aborting the outer code. > > I'm very familiar with this pattern in Python and I've used it myself countless times. Unfortunately I've seen instances where it can lead to disastrous behavior, I think we all have. It seems to have limited usefulness in production and more usefulness in development. The point of my original email was precisely to put the legitimacy and usefulness of code like that into question. > When does this lead to disaster? Is it because someone creates a boundary that shouldn't exist, or because the code maintains global state in bad ways? The latter is a problem even without exceptions; imagine a programming error that doesn't cause an exception, but just omits some crucial "un-modify global state" call. You have no way of detecting that in the outer code, and your state is messed up. >> As to resetting stuff: I wouldn't bother; your functions should >> already not mess with global state. The only potential mess you should >> consider dealing with is a database rollback; and actually, my >> personal recommendation is to do that with a context manager inside >> the inner code, rather than a reset in the exception handler in the >> outer code. > > So I agree this pattern works if you assume all code is exception-safe (context managers will clean up intermediate state) and there are no programming errors. 
There are lots of things that good code should do but as we all well know good code doesn't always do those things. My point is that except-log loops are dangerous and careless in the face of programming errors. > I don't think the except-log loop is the problem here. The problem is the code that can go in part way, come out again, and leave itself in a mess. > Programming errors are unavoidable. In a large system they are a daily fact of life. When there is a bug in production it's very dangerous to re-enter a code block that has demonstrated itself to be buggy, else you risk corrupting data. For example: > > def random_code(state): > assert is_valid(state.data) > > def user_code_a(state): > state.data = "bad data" > # the following throws > random_code(state) > > def user_code_b(state): > state.db.write(state.data) > > def main_loop(): > state = State() > loop = [user_code_a, > user_code_b] > for fn in loop: > try: > fn() > except Exception: > log_exception() > > This code allows user_code_b() to execute and corrupt data even though random_code() was lucky enough to be called and detect bad state early on. Can you give a non-toy example that has this kind of mutable state at top level? I suspect it's bad design. If it's truly necessary, use a context manager to guarantee the reset: def user_code_a(state): with state.set_data("bad data"): random_code(state) > You may say the fix is to assert correct data before writing to the database and, yes, that would fix the problem for future executions in this instance. That's not the point, the point is that incorrect buggy code is running in production today and it's imperative to have multiple safeguards to limit its damage. For example, you wouldn't have an except-log loop in an airplane control system. > Actually, yes I would. The alternative that you're suggesting is to have any error immediately shut down the whole system. Is that really better? To have the entire control system disabled? 
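[The context-manager reset suggested above can be made concrete. A minimal, runnable sketch -- the State class and its set_data method here are hypothetical stand-ins matching the thread's toy example, not code from any post:]

```python
from contextlib import contextmanager

class State:
    """Hypothetical stand-in for the shared mutable state in the toy example."""
    def __init__(self):
        self.data = "good data"

    @contextmanager
    def set_data(self, value):
        # Install the new value, and guarantee the old one is restored
        # even if the with-block raises.
        old = self.data
        self.data = value
        try:
            yield self
        finally:
            self.data = old

state = State()
try:
    with state.set_data("bad data"):
        # Stand-in for random_code(state) detecting the bad state:
        raise AssertionError("is_valid(state.data) failed")
except AssertionError:
    pass  # the except-log boundary swallows the error...

# ...but the finally clause has already undone the modification, so a
# later user_code_b(state) would not see the bad data:
print(state.data)  # -> good data
```

[Even when the except-log boundary swallows the AssertionError, the cleanup has already run -- which is the guarantee the context-manager approach is pointing at.]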
> Sometimes an error is just an error but sometimes an error signifies the running system itself is in a bad state. It would be nice to distinguish between the two in a consistent way across all Python event loops. Halting on any escaped exception is inconvenient, but continuing after any escaped exception is dangerous. > There's no way for Python to be able to fix this for you. The tools exist - most notably context managers - so the solution is to use them. I don't think we're in python-ideas territory here. What I see here is a perfect subject for a blog post or other scholarly article on "Python exception handling best practice", plus possibly an internal style guide for your company/organization. The blog post I would definitely read with interest; the style guide ought to be stating the obvious (as most style guides should). ChrisA From brett at python.org Fri Apr 8 13:39:21 2016 From: brett at python.org (Brett Cannon) Date: Fri, 08 Apr 2016 17:39:21 +0000 Subject: [Python-ideas] Changing the meaning of bool.__invert__ In-Reply-To: References: <20160407094618.5c01847d@fsol> <570679AF.7020502@stoneleaf.us> <57068A10.3020904@stoneleaf.us> <57069645.8040509@stoneleaf.us> <20160408014030.GL12526@ando.pearwood.info> <1460129233.1465771.572986385.6254AC76@webmail.messagingengine.com> Message-ID: On Fri, 8 Apr 2016 at 08:44 Guido van Rossum wrote: > DeprecationWarning every time you use ~ on a bool? That would still be > too big a burden on using it the new way. > I think the proposal would be a DeprecationWarning to flush out/remove all current uses of ~bool with Python 3.6, and then in Python 3.7 introduce the new semantics. -Brett > > On Fri, Apr 8, 2016 at 8:27 AM, Random832 wrote: > > > > > > On Fri, Apr 8, 2016, at 11:00, Guido van Rossum wrote: > >> The thing here is, this change is too small to warrant a __future__ > >> import. 
So we're either going to introduce it in 3.6 and tell people > >> about it in case their code might break, or we're never going to do > >> it. I'm honestly on the fence, but I feel this is a rarely used > >> operator so changing its meaning is not likely to break a lot of code. > > > > What about just having a DeprecationWarning, but no __future__ import? > > _______________________________________________ > > Python-ideas mailing list > > Python-ideas at python.org > > https://mail.python.org/mailman/listinfo/python-ideas > > Code of Conduct: http://python.org/psf/codeofconduct/ > > > > -- > --Guido van Rossum (python.org/~guido) > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Fri Apr 8 13:42:24 2016 From: rosuav at gmail.com (Chris Angelico) Date: Sat, 9 Apr 2016 03:42:24 +1000 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: References: Message-ID: On Thu, Apr 7, 2016 at 11:04 PM, Chris Angelico wrote: > This is a spin-off from the __fspath__ discussion on python-dev, in > which a few people said that a more general approach would be better. > > Proposal: Objects should be allowed to declare that they are > "string-like" by creating a dunder method (analogously to __index__ > for integers) which implies a loss-less conversion to str. > > This could supersede the __fspath__ "give me the string for this path" > protocol, or could stand in parallel with it. In the light of all the arguments put forward in this thread and elsewhere, I'm withdrawing this proposal in favour of __fspath__. If additional use-cases are found for a "string-like object", this can be revived, but otherwise, it's too general and insufficiently useful. 
ChrisA From rian at thelig.ht Fri Apr 8 14:24:15 2016 From: rian at thelig.ht (rian at thelig.ht) Date: Fri, 08 Apr 2016 11:24:15 -0700 Subject: [Python-ideas] Consistent programming error handling idiom In-Reply-To: References: Message-ID: <575a5516d7b439e487ea551e3a4c07d1@thelig.ht> >> Oops, good call. That was a bad example. Maybe this one is a bit >> better: >> >> try: >> assert no_bugs() >> except AssertionError: >> # bugs are okay >> pass > > Not sure I understand this example. It's obviously a toy, because > you'd never put an 'assert' just to catch and discard the exception - > the net result is that you call no_bugs() if you're in debug mode and > don't if you're not, which is more cleanly spelled "if __debug__: > no_bugs()". The example is removed from its original context. The point was to show that there is no reason to locally catch AssertionError (contrast with AttributeError, ValueError, TypeError, etc.). AssertionError is *always* indicative of a bug (unlike AttributeError which only may be indicative of a bug). >>> This is what I'd call a boundary location. You have "outer" code and >>> "inner" code. Any uncaught exception in the inner code should get >>> logged rather than aborting the outer code. >> >> I'm very familiar with this pattern in Python and I've used it myself >> countless times. Unfortunately I've seen instances where it can lead >> to disastrous behavior, I think we all have. It seems to have limited >> usefulness in production and more usefulness in development. The point >> of my original email was precisely to put the legitimacy and >> usefulness of code like that into question. >> > > When does this lead to disaster? Is it because someone creates a > boundary that shouldn't exist, or because the code maintains global > state in bad ways? The latter is a problem even without exceptions; > imagine a programming error that doesn't cause an exception, but just > omits some crucial "un-modify global state" call. 
You have no way of > detecting that in the outer code, and your state is messed up. This leads to disaster in the case of buggy code. This leads to a greater disaster when the boundary allows the code to continue running. It's not a black and white thing, there is no 100% way to detect for buggy code but my argument is that you shouldn't ignore an assert when you're lucky enough to get one. >>> As to resetting stuff: I wouldn't bother; your functions should >>> already not mess with global state. The only potential mess you >>> should >>> consider dealing with is a database rollback; and actually, my >>> personal recommendation is to do that with a context manager inside >>> the inner code, rather than a reset in the exception handler in the >>> outer code. >> >> So I agree this pattern works if you assume all code is exception-safe >> (context managers will clean up intermediate state) and there are no >> programming errors. There are lots of things that good code should do >> but as we all well know good code doesn't always do those things. My >> point is that except-log loops are dangerous and careless in the face >> of programming errors. >> > > I don't think the except-log loop is the problem here. The problem is > the code that can go in part way, come out again, and leave itself in > a mess. I'm not saying except-log is to blame for buggy code. I'm saying that when an exception occurs, there's no way to tell whether it was caused by an internal bug or an external error. Except-log should normally be limited to exceptions caused by external errors in production code. >> Programming errors are unavoidable. In a large system they are a daily >> fact of life. When there is a bug in production it's very dangerous to >> re-enter a code block that has demonstrated itself to be buggy, else >> you risk corrupting data. 
For example: >> >> def random_code(state): >> assert is_valid(state.data) >> >> def user_code_a(state): >> state.data = "bad data" >> # the following throws >> random_code(state) >> >> def user_code_b(state): >> state.db.write(state.data) >> >> def main_loop(): >> state = State() >> loop = [user_code_a, >> user_code_b] >> for fn in loop: >> try: >> fn() >> except Exception: >> log_exception() >> >> This code allows user_code_b() to execute and corrupt data even though >> random_code() was lucky enough to be called and detect bad state early >> on. > > Can you give a non-toy example that has this kind of mutable state at > top level? I suspect it's bad design. If it's truly necessary, use a > context manager to guarantee the reset: > > def user_code_a(state): > with state.set_data("bad data"): > random_code(state) > Yes it may be bad design or the code may have bugs but that's precisely the point. By the time the exception hits your catch-all, there's no universal way of determining whether or not the exception was due to an internal bug (or bad design or whatever) or an external error. >> You may say the fix is to assert correct data before writing to the >> database and, yes, that would fix the problem for future executions in >> this instance. That's not the point, the point is that incorrect buggy >> code is running in production today and it's imperative to have >> multiple safeguards to limit its damage. For example, you wouldn't >> have an except-log loop in an airplane control system. >> > > Actually, yes I would. The alternative that you're suggesting is to > have any error immediately shut down the whole system. Is that really > better? To have the entire control system disabled? The alternative I'm suggesting is to reset that flight computer and switch to the backup one (potentially written by a different team). >> Sometimes an error is just an error but sometimes an error signifies >> the running system itself is in a bad state. 
>> It would be nice to
>> distinguish between the two in a consistent way across all Python
>> event loops. Halting on any escaped exception is inconvenient, but
>> continuing after any escaped exception is dangerous.
>
> There's no way for Python to be able to fix this for you. The tools
> exist - most notably context managers - so the solution is to use
> them.

Context managers don't allow me to determine whether an exception is
caused by an internal bug or an external error, so they're not a solution
to this problem. I want to live in a world where I can do this:

    while cbs:
        cb = cbs.pop()
        try:
            cb()
        except Exception as e:
            logging.exception("In main loop")
            if is_a_bug(e):
                raise SystemExit() from e

Python may be able to do something here. One possible thing is a new
exception hierarchy; there may be other solutions. This may be sufficient
but I doubt it:

    def is_a_bug(e):
        return isinstance(e, AssertionError)

The reason I am polling python-ideas is that I can't be the only one who
has ever encountered this deficiency in Python.

Rian

From rosuav at gmail.com  Fri Apr  8 14:28:47 2016
From: rosuav at gmail.com (Chris Angelico)
Date: Sat, 9 Apr 2016 04:28:47 +1000
Subject: [Python-ideas] Consistent programming error handling idiom
In-Reply-To: <575a5516d7b439e487ea551e3a4c07d1@thelig.ht>
References: <575a5516d7b439e487ea551e3a4c07d1@thelig.ht>
Message-ID:

On Sat, Apr 9, 2016 at 4:24 AM, wrote:
> Python may be able to do something here. One possible thing is a new
> exception hierarchy, there may be other solutions. This may be sufficient
> but I doubt it:
>
>     def is_a_bug(e):
>         return isinstance(e, AssertionError)
>
> The reason I am polling python-ideas is that I can't be the only one who
> has ever encountered this deficiency in Python.

How about this:

    def is_a_bug(e):
        return True

ANY uncaught exception is a bug. Any caught exception is not a bug.
ChrisA

From steve at pearwood.info  Fri Apr  8 15:01:19 2016
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 9 Apr 2016 05:01:19 +1000
Subject: [Python-ideas] Consistent programming error handling idiom
In-Reply-To:
References: <575a5516d7b439e487ea551e3a4c07d1@thelig.ht>
Message-ID: <20160408190115.GO12526@ando.pearwood.info>

On Sat, Apr 09, 2016 at 04:28:47AM +1000, Chris Angelico wrote:

> ANY uncaught exception is a bug.

What, even SystemExit and KeyboardInterrupt?

> Any caught exception is not a bug.

Even:

    try:
        ...
    except:
        # Look, bug free programming!
        pass

I am sure that any attempt to make universal rules about what is or
isn't a bug will be doomed to failure. Remember that people can write
code like this:

    # Writing a BASIC interpreter in Python.
    if not isinstance(command[0], int):
        raise SyntaxError('expected line number')

So, no, SyntaxError does not necessarily mean a bug in your code. *ALL*
exceptions have to be understood in context, you can't just make a
sweeping generalisation that exception A is always a bug, exception B is
always recoverable. It depends on the context.

--
Steve

From chris.barker at noaa.gov  Fri Apr  8 15:02:16 2016
From: chris.barker at noaa.gov (Chris Barker)
Date: Fri, 8 Apr 2016 12:02:16 -0700
Subject: [Python-ideas] Consistent programming error handling idiom
In-Reply-To: <575a5516d7b439e487ea551e3a4c07d1@thelig.ht>
References: <575a5516d7b439e487ea551e3a4c07d1@thelig.ht>
Message-ID:

On Fri, Apr 8, 2016 at 11:24 AM, wrote:

> AssertionError is *always* indicative of a bug

I don't think there is any way to know that -- it all depends on how it's
used. For my part, assertions are only for testing -- and are actually
turned off in optimized mode. And I think we can say the same thing about
ALL exceptions -- even a syntax error may not be a "stop everything" error
in a system that runs user-written scripts.

I agree with Chris A's point:

Any unhandled Exception is a bug. Simple as that.
Any other interpretation would be a style issue, decided for your group/application. And if you are going to go there, I would do: if not_a_bug() instead. Consistent with the principle of good Python code -- only handle the Exceptions you know to handle. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Fri Apr 8 15:08:08 2016 From: rosuav at gmail.com (Chris Angelico) Date: Sat, 9 Apr 2016 05:08:08 +1000 Subject: [Python-ideas] Consistent programming error handling idiom In-Reply-To: <20160408190115.GO12526@ando.pearwood.info> References: <575a5516d7b439e487ea551e3a4c07d1@thelig.ht> <20160408190115.GO12526@ando.pearwood.info> Message-ID: On Sat, Apr 9, 2016 at 5:01 AM, Steven D'Aprano wrote: > On Sat, Apr 09, 2016 at 04:28:47AM +1000, Chris Angelico wrote: > >> ANY uncaught exception is a bug. > > What, even SystemExit and KeyboardInterrupt? In terms of a boundary location? Yes, probably. If you're writing a web server that allows web applications to be plugged into it, you probably want to let the console halt processing of one request, then go back and handle another. Although it is contextual; you might choose the other way, in which case it's "any uncaught subclass of Exception is a bug", or "anything other than KeyboardInterrupt is a bug", or some other definition. I'm on the fence about SystemExit; in any context where there's this kind of boundary, sys.exit() simply wouldn't be used. So it could viably be called a bug, or it could validly be used to terminate the entire server. >> Any caught exception is not a bug. > > Even: > > try: > ... > except: > # Look, bug free programming! > pass As far as the boundary location's concerned, yes. 
The inner code can be as ridiculous as it likes, but the boundary will
never do the "log and resume" if all exceptions are caught. (It won't
even get an opportunity to.)

> I am sure that any attempt to make universal rules about what is or
> isn't a bug will be doomed to failure. Remember that people can write
> code like this:
>
>     # Writing a BASIC interpreter in Python.
>     if not isinstance(command[0], int):
>         raise SyntaxError('expected line number')
>
> So, no, SyntaxError does not necessarily mean a bug in your code. *ALL*
> exceptions have to be understood in context, you can't just make a
> sweeping generalisation that exception A is always a bug, exception B is
> always recoverable. It depends on the context.

If one escapes to a boundary, then yes it does. That's the context we're
talking about here.

ChrisA

From steve at pearwood.info  Fri Apr  8 15:30:13 2016
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 9 Apr 2016 05:30:13 +1000
Subject: [Python-ideas] Consistent programming error handling idiom
In-Reply-To: <575a5516d7b439e487ea551e3a4c07d1@thelig.ht>
References: <575a5516d7b439e487ea551e3a4c07d1@thelig.ht>
Message-ID: <20160408193013.GP12526@ando.pearwood.info>

On Fri, Apr 08, 2016 at 11:24:15AM -0700, rian at thelig.ht wrote:

> I want to live in a world where I can do this:
>
>     while cbs:
>         cb = cbs.pop()
>         try:
>             cb()
>         except Exception as e:
>             logging.exception("In main loop")
>             if is_a_bug(e):
>                 raise SystemExit() from e

And I want a pony :-)

So long as Python allows people to write code like this:

    # A deliberately silly example
    if len(seq) == 0:
        raise ImportError("assertion fails")

you cannot expect to automatically know what an exception means without
any context of where it came from and why it happened.
The above example is silly, but it doesn't take much effort to come up
with more serious ones:

- an interpreter written in Python may raise SyntaxError, which is not
  a bug in the interpreter;

- a test framework may raise AssertionError for a failed test, which is
  not a bug in the framework;

- a function may raise MemoryError if the call *would* run out of
  memory, but without actually running out of memory; consequently it
  is not a fatal error, while another function may raise the same
  MemoryError because it actually did fatally run out of memory.

Effectively, you want the compiler to Do What I Mean when it comes to
exceptions. DWIM may, occasionally, be a good idea in applications, but
I maintain it is never a good idea in a programming language.

http://www.catb.org/jargon/html/D/DWIM.html

I'm afraid that there's no hope for it: you're going to have to actually
understand where an exception came from, and why it happened, before
deciding whether or not it can be recovered from. The interpreter can't
do that for you.

--
Steve

From rian at thelig.ht  Fri Apr  8 15:47:13 2016
From: rian at thelig.ht (Rian Hunter)
Date: Fri, 8 Apr 2016 12:47:13 -0700
Subject: [Python-ideas] Consistent programming error handling idiom
In-Reply-To:
References: <575a5516d7b439e487ea551e3a4c07d1@thelig.ht>
Message-ID: <2E13D03D-2F55-4CA8-947A-0D28CB82CB28@thelig.ht>

> On Apr 8, 2016, at 12:02 PM, Chris Barker wrote:
>
> I agree with Chris A's point:
>
> Any unhandled Exception is a bug. Simple as that.

I'm happy with that interpretation. If that was codified in a style
document accessible to newbies I think that would help achieve a more
consistent approach to exceptions.

Yet something dark and hideous inside me tells me except-log loops will
continue to be pervasive and bugs will continue to be ignored in a large
number of Python programs.
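[Editor's note: the opt-in idiom Rian is asking for can be sketched concretely. The names `BUG_TYPES`, `is_a_bug`, and `run_callbacks` below are hypothetical, and which exception types count as "bugs" is exactly the judgment call the thread is debating:]

```python
import logging

# Assumption: the exception types we *choose* to treat as programming
# errors. As Steven's counterexamples show, no such list is universally
# right; it has to be maintained per application.
BUG_TYPES = (AssertionError, TypeError, NameError, AttributeError)

def is_a_bug(exc):
    """Heuristic: does this exception look like an internal bug?"""
    return isinstance(exc, BUG_TYPES)

def run_callbacks(callbacks):
    """An except-log loop that escalates instead of resuming on likely bugs."""
    completed = []
    for cb in callbacks:
        try:
            cb()
        except Exception as exc:
            logging.exception("In main loop")
            if is_a_bug(exc):
                raise  # stop the loop rather than keep running buggy code
        else:
            completed.append(cb.__name__)
    return completed
```

With this boundary, a KeyError raised by missing external data is logged and the loop continues, while an AssertionError halts everything -- the behavior Rian wants, at the cost of maintaining the type list by hand.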
From ethan at stoneleaf.us Fri Apr 8 16:06:24 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Fri, 08 Apr 2016 13:06:24 -0700 Subject: [Python-ideas] Consistent programming error handling idiom In-Reply-To: <2E13D03D-2F55-4CA8-947A-0D28CB82CB28@thelig.ht> References: <575a5516d7b439e487ea551e3a4c07d1@thelig.ht> <2E13D03D-2F55-4CA8-947A-0D28CB82CB28@thelig.ht> Message-ID: <57080F40.6070100@stoneleaf.us> On 04/08/2016 12:47 PM, Rian Hunter wrote: >> On Apr 8, 2016, at 12:02 PM, Chris Barker wrote: >> I agree with Chris A's point: >> >> Any unhandled Exception is a bug. Simple as that. > > I'm happy with that interpretation. If that was codified in a style > document accessible to newbies I think that would help achieve a > more consistent approach to exceptions. > > Yet something dark and hideous inside me tells me except-log loops > will continue to be pervasive and bugs will continue to be ignored > in a large number of Python programs. Sadly, the language cannot force someone to investigate problems instead of ignoring them. :( -- ~Ethan~ From chris.barker at noaa.gov Fri Apr 8 16:47:34 2016 From: chris.barker at noaa.gov (Chris Barker) Date: Fri, 8 Apr 2016 13:47:34 -0700 Subject: [Python-ideas] Consistent programming error handling idiom In-Reply-To: <2E13D03D-2F55-4CA8-947A-0D28CB82CB28@thelig.ht> References: <575a5516d7b439e487ea551e3a4c07d1@thelig.ht> <2E13D03D-2F55-4CA8-947A-0D28CB82CB28@thelig.ht> Message-ID: On Fri, Apr 8, 2016 at 12:47 PM, Rian Hunter wrote: > > > On Apr 8, 2016, at 12:02 PM, Chris Barker wrote: > > I agree with Chris A's point: > > > > Any unhandled Exception is a bug. Simple as that. > > I'm happy with that interpretation. If that was codified in a style > document accessible to newbies I think that would help achieve a more > consistent approach to exceptions. > it already is in advice all over the place: "don't use bare except" Doesn't mean folks don't do it anyway.... -CHB -- Christopher Barker, Ph.D. 
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From rian at thelig.ht Fri Apr 8 16:50:14 2016 From: rian at thelig.ht (Rian Hunter) Date: Fri, 8 Apr 2016 13:50:14 -0700 Subject: [Python-ideas] Consistent programming error handling idiom In-Reply-To: References: <575a5516d7b439e487ea551e3a4c07d1@thelig.ht> <2E13D03D-2F55-4CA8-947A-0D28CB82CB28@thelig.ht> Message-ID: <88F09FF7-1E35-43FE-BE8C-1C52EA2C0E7F@thelig.ht> > On Apr 8, 2016, at 1:47 PM, Chris Barker wrote: > >> On Fri, Apr 8, 2016 at 12:47 PM, Rian Hunter wrote: >> >> > On Apr 8, 2016, at 12:02 PM, Chris Barker wrote: >> > I agree with Chris A's point: >> > >> > Any unhandled Exception is a bug. Simple as that. >> >> I'm happy with that interpretation. If that was codified in a style document accessible to newbies I think that would help achieve a more consistent approach to exceptions. > > it already is in advice all over the place: "don't use bare except" > > Doesn't mean folks don't do it anyway.... I think bare except is different from "except Exception" which is common and not discouraged. "except Exception" still masks programming errors. -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From ethan at stoneleaf.us  Fri Apr  8 17:00:13 2016
From: ethan at stoneleaf.us (Ethan Furman)
Date: Fri, 08 Apr 2016 14:00:13 -0700
Subject: [Python-ideas] Consistent programming error handling idiom
In-Reply-To: <88F09FF7-1E35-43FE-BE8C-1C52EA2C0E7F@thelig.ht>
References: <575a5516d7b439e487ea551e3a4c07d1@thelig.ht>
	<2E13D03D-2F55-4CA8-947A-0D28CB82CB28@thelig.ht>
	<88F09FF7-1E35-43FE-BE8C-1C52EA2C0E7F@thelig.ht>
Message-ID: <57081BDD.7090401@stoneleaf.us>

On 04/08/2016 01:50 PM, Rian Hunter wrote:
> I think bare except is different from "except Exception" which is common
> and not discouraged. "except Exception" still masks programming errors.

And comes with the advice to "log it, research it".

--
~Ethan~

From joejev at gmail.com  Fri Apr  8 17:03:05 2016
From: joejev at gmail.com (Joseph Jevnik)
Date: Fri, 8 Apr 2016 17:03:05 -0400
Subject: [Python-ideas] New scope for exception handlers
Message-ID:

I would like to propose a change to exception handlers to make it harder
to accidentally leak names defined only in the exception handler blocks.
This change follows from the decision to delete the name of an exception
at the end of a handler. The goal of this change is to prevent people
from relying on names that are defined only in a handler. As an example,
let's look at a function with a try/except:

    def f():
        try:
            ...
        except:
            a = 1
        return a

This function will only work if the body raises some exception; otherwise
we will get an UnboundLocalError. I propose that we should make `a` fall
out of scope when we leave the except handler regardless, to prevent
people from depending on this behavior. This will make it easier to catch
bugs like this in testing. There is one case where I think the name
should not fall out of scope, and that is when the name is already
defined outside of the handler. For example:

    def g():
        a = 1
        try:
            ...
        except:
            a = 2
        return a

I think this code is well behaved and should continue to work as it
already does.
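[Editor's note: today's CPython behavior for these two shapes can be checked with a short sketch. The explicit ValueError in this variant of `g` is added only to force the handler to run; it is not part of the proposal's example:]

```python
def f():
    try:
        ...
    except Exception:
        a = 1            # bound only if the try body raises
    return a             # body never raises, so this is an UnboundLocalError

def g():
    a = 1                # 'a' is already bound outside the handler
    try:
        raise ValueError("force the handler to run")
    except Exception:
        a = 2
    return a             # well defined today, and under the proposal too

try:
    f()
except UnboundLocalError:
    print("f: UnboundLocalError -- the handler never ran")

print("g() ->", g())
```

Under the proposal, `f` would fail the same way even when the handler *did* run, since `a` would fall out of scope with the handler, while `g` would keep working because `a` exists outside it.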
There are a couple of ways to implement this new behavior, but I think
the simplest way to do this would be to treat the handler as a closure
where all the free variables are defined as nonlocal.

This would need to be a small change to the compiler but may have some
performance implications for code that does not hit the except handler.
If the handler is longer than the bytecode needed to create the inner
closure then it may be faster to run the function when the except handler
is not hit.

This changes our definition of f from:

  2           0 SETUP_EXCEPT             8 (to 11)

  3           3 LOAD_CONST               1 (Ellipsis)
              6 POP_TOP
              7 POP_BLOCK
              8 JUMP_FORWARD            14 (to 25)

  4     >>   11 POP_TOP
             12 POP_TOP
             13 POP_TOP

  5          14 LOAD_CONST               2 (1)
             17 STORE_FAST               0 (a)
             20 POP_EXCEPT
             21 JUMP_FORWARD             1 (to 25)
             24 END_FINALLY

  6     >>   25 LOAD_FAST                0 (a)
             28 RETURN_VALUE

to something more like:

f
-
  3           0 SETUP_EXCEPT             8 (to 11)

  4           3 LOAD_CONST               0 (Ellipsis)
              6 POP_TOP
              7 POP_BLOCK
              8 JUMP_FORWARD            20 (to 31)

  5     >>   11 POP_TOP
             12 POP_TOP
             13 POP_TOP
             14 LOAD_CONST               1 ( at 0x7febcd6e2300, file "", line 1>)
             17 LOAD_CONST               2 ('')
             20 MAKE_FUNCTION            0
             23 CALL_FUNCTION            0 (0 positional, 0 keyword pair)
             26 POP_EXCEPT
             27 JUMP_FORWARD             1 (to 31)
             30 END_FINALLY

  7     >>   31 LOAD_FAST                0 (a)
             34 RETURN_VALUE

f.
-----------------
  1           0 LOAD_CONST               0 (1)
              3 STORE_FAST               0 (a)
              6 LOAD_CONST               1 (None)
              9 RETURN_VALUE

This new code properly raises the UnboundLocalError when executed. For g
we could use a MAKE_CLOSURE instead of MAKE_FUNCTION.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From keithcu at gmail.com  Fri Apr  8 17:15:52 2016
From: keithcu at gmail.com (Keith Curtis)
Date: Fri, 8 Apr 2016 17:15:52 -0400
Subject: [Python-ideas] A tuple of various Python suggestions
Message-ID:

Hi all,

I just discovered this alias and I thought I'd post a few ideas. I
wouldn't call myself a Python master yet, but it's an amazing language
and my biggest wish is that it was more widely used in the industry.

Here are a few suggestions:

1. Decrease the bug count.
I recently noticed that there are about 5,400 active bugs in http://bugs.python.org/. That surprised me because I almost never see anyone complain about bugs in Python (compared to the number who complain about bugs in LibreOffice, graphics drivers, Gnome / KDE, etc.) There are a lot of people on this list, and if the brainpower can be applied to practical, known, existing problems, it is a great way to improve Python while also considering more exotic ideas. I can also suggest making a pretty graph of the bug count and putting it on the front page of python.org for greater visibility. 2. Python is somewhat popular on servers, and there is a lot of potential for more. WordPress is easy to use and powerful, but lots of people don't want to program in PHP. Or Javascript, Java, Ruby, etc. Codebases like Whoosh full-text search (https://bitbucket.org/mchaput/whoosh/wiki/Home) are important, but have minimal dev resources as most people are using Lucene / ElasticSearch. The common choice is between 1.3M lines of Java: https://www.openhub.net/p/elasticsearch, containing 1100 todos and 1000 references to "deprecated", or 41K lines of Python written mostly by one person. Hadoop is another big Java project (1.9M lines), and there is even an ecosystem around it. Python interoperates with Hadoop, but it should be possible to build a radically simpler framework that provides the same functionality using Python-native functionality and without all the baggage. Hadoop has several interesting sister projects: a distributed database, scalable machine learning, a high-level data flow language, a coordination service, etc. I'm sure you'd build something smaller, cleaner, faster in many cases, more reliable, etc. 3. It was a sad mistake that Google picked Java over Python for Android. 
However, there is now a great program called Kivy which allows people to
write apps for iOS or Android with one codebase, but it could also use
more resources, as for example it doesn't fully support Python 3.x yet.

There are tens of thousands of bugs in the popular Python libraries and
I would fix those before proposing more language changes.

4. I enjoy reading about the Python performance improvements, but it is
mostly a perception problem with all the existing workarounds. Gnome
wrote version 3 of their shell in Javascript because they didn't think
Python would be fast enough. Lots of people write Node because it's
compiled and "fast". I suggest taking some of the effort spent working
on performance and spending it on evangelizing to other programmers that
Python / Cython / PyPy, etc. are already good enough! There are a lot of
programmers out there who would be happier if they could work in Python.

5. It would be great to get Python in the web browsers as an alternative
to Javascript. There are a number of projects which convert Python to
Javascript, but this would be more direct. LibreOffice ships with a
Python interpreter, why can't Firefox and Webkit? ;-) Obviously there
are interoperability issues, but it would be great to just side-step all
the complexity of Javascript. (Here is a server-oriented article, but it
gives a flavor: http://notes.ericjiang.com/posts/751) This might sound
like a crazy idea, but the engineering problems aren't that hard.

6. In a few cases, there are too many codebases providing the same
functionality, and none of them are really doing the job. For example,
the de-facto MySQL Python interop library
(https://pypi.python.org/pypi/MySQL-python) only supports Python 2.x and
appears to be abandoned. There are several other libraries out there
with different features, performance, compatibility, etc., and it's kind
of a minefield for what should be a basic scenario.
It takes leadership to jump in and figure out how to fix it and make one codebase good enough that the others can switch over. 7. Focus more on evolving the libraries rather than the language. I've recently discovered Toolz, which has a more complete set of functional language methods. I think some of them should be included in the official versions. A lot of people don't think Python is good enough for functional programming and this would help. These new routines add complexity, but a newbie doesn't need to write in a functional way, so it obeys the "only pay for what you use" rule. There are a number of under-staffed libraries and frameworks. I see people complain about the YAML parsing library being unmaintained, the default HTTP functionality being difficult and limiting, poor SOAP support, etc. There are a million ways to improve the Python ecosystem without making any language changes. You don't have a big rich company who can pay for thousands of full-time developers working on libraries, but the bug reports are a great way to prioritize. 8. I've yet to find a nice simple free IDE with debugging for Python. I use Atom, but it has primitive debugging. I tried PyCharm but it's very complicated (and not free, and Java). I use Jupyter sometimes also but I'd prefer a rich client app with watch windows, etc. 9. It would be interesting to re-imagine the spreadsheet with a more native Python interface. Pandas and matplotlib are great, but it would be cool to have it in LibreOffice Calc that supports drag and drop, copy and paste, can read and write ODS, etc. (Also, LibreOffice Base is basically unmaintained. I think if 10 Python programmers passionate about databases and GUIs showed up, it could re-invigorate this dead codebase.) 10. Being simple to learn and powerful is very hard. Fortunately, you can break compatibility every 10 years. 
My only suggestion is to get rid of the __self__ somehow ;-) Regards, -Keith http://keithcu.com/ From breamoreboy at yahoo.co.uk Fri Apr 8 17:18:41 2016 From: breamoreboy at yahoo.co.uk (Mark Lawrence) Date: Fri, 8 Apr 2016 22:18:41 +0100 Subject: [Python-ideas] Consistent programming error handling idiom In-Reply-To: <88F09FF7-1E35-43FE-BE8C-1C52EA2C0E7F@thelig.ht> References: <575a5516d7b439e487ea551e3a4c07d1@thelig.ht> <2E13D03D-2F55-4CA8-947A-0D28CB82CB28@thelig.ht> <88F09FF7-1E35-43FE-BE8C-1C52EA2C0E7F@thelig.ht> Message-ID: On 08/04/2016 21:50, Rian Hunter wrote: > > I think bare except is different from "except Exception" which is common > and not discouraged. "except Exception" still masks programming errors. > "except Exception" is a programming error as far as I'm concerned, but I'm not expecting my own code to keep running 24/7/365. Horses for courses. -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence From rian at thelig.ht Fri Apr 8 17:25:47 2016 From: rian at thelig.ht (Rian Hunter) Date: Fri, 8 Apr 2016 14:25:47 -0700 Subject: [Python-ideas] Consistent programming error handling idiom In-Reply-To: References: Message-ID: >> On Fri, Apr 08, 2016 at 11:24:15AM -0700, rian at thelig.ht wrote: >> I want to live in a world where I can do this: >> >> while cbs: >> cb = cbs.pop() >> try: >> cb() >> except Exception as e: >> logging.exception("In main loop") >> if is_a_bug(e): >> raise SystemExit() from e > > And I want a pony :-) > > So long as Python allows people to write code like this: > > # A deliberately silly example > if len(seq) == 0: > raise ImportError("assertion fails") > > you cannot expect to automatically know what an exception means without > any context of where it came from and why it happened. 
The above example > is silly, but it doesn't take much effort to come up with more serious > ones: > > - an interpreter written in Python may raise SyntaxError, which > is not a bug in the interpreter; > > - a test framework may raise AssertionError for a failed test, > which is not a bug in the framework; > > - a function may raise MemoryError if the call *would* run out of > memory, but without actually running out of memory; consequently > it is not a fatal error, while another function may raise the > same MemoryError because it actually did fatally run out of memory. > > Effectively, you want the compiler to Do What I Mean when it comes to > exceptions. DWIM may, occasionally, be a good idea in applications, but > I maintain it is never a good idea in a programming language. I think you're misinterpreting me. I don't want a pony and I don't want a sufficiently smart compiler. I want a consistent opt-in idiom with community consensus. I want a clear way to express that an exception is an error and not an internal bug. It doesn't have to catch 100% of cases, the idiom just needs to approach consistency across all Python libraries that I may import. If the programmer doesn't pay attention to the idiom, then is_a_bug() will never return true (or not True if it's is_not_a_bug()). AssertionError is already unambiguous, I'm sure there are other candidates as well. I'm not the first or only one to want something like this http://blog.tsunanet.net/2012/10/pythons-screwed-up-exception-hierarchy.html But I can see the pile on has begun and my point is lost. Maybe this will get added in ten or twenty years. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ethan at stoneleaf.us Fri Apr 8 17:33:18 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Fri, 08 Apr 2016 14:33:18 -0700 Subject: [Python-ideas] Consistent programming error handling idiom In-Reply-To: References: Message-ID: <5708239E.9090506@stoneleaf.us> On 04/08/2016 02:25 PM, Rian Hunter wrote: > I want a consistent opt-in idiom with community consensus. I want a > clear way to express that an exception is an error and not an internal > bug. It doesn't have to catch 100% of cases, the idiom just needs to > approach consistency across all Python libraries that I may import. > But I can see the pile on has begun and my point is lost. Maybe this > will get added in ten or twenty years. Then come up with one, test it out, and come back with "Here's a cool new idiom, and this is why it's better" post. -- ~Ethan~ From pavol.lisy at gmail.com Fri Apr 8 17:55:34 2016 From: pavol.lisy at gmail.com (Pavol Lisy) Date: Fri, 8 Apr 2016 23:55:34 +0200 Subject: [Python-ideas] Changing the meaning of bool.__invert__ In-Reply-To: References: <20160407094618.5c01847d@fsol> <570679AF.7020502@stoneleaf.us> <57068A10.3020904@stoneleaf.us> <57069645.8040509@stoneleaf.us> <20160408014030.GL12526@ando.pearwood.info> Message-ID: 2016-04-08 17:00 GMT+02:00, Guido van Rossum : > On Thu, Apr 7, 2016 at 6:40 PM, Steven D'Aprano > wrote: > >> You missed two: >> and <<. What are we to do with (True << 1)? > [...] > > Those are indeed ambiguous -- they are defined as multiplication or > floor division with a power of two, e.g. x<>n is > x//2**n (for integral x and nonnegative n). The point of this thread > seems to be to see whether some operations can be made more useful by > staying in the bool domain -- I don't think making both of these > return 0 if n != 0, so let's keep them unchanged. 
It is also compatible with numpy:

    np.bool_(True) << 2
    4

but there is also the minus operator:

    -np.bool_(True)
    False
    -np.bool_(False)
    True
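[Editor's note: in plain Python the asymmetry is easy to verify -- bool is a subclass of int, so shifts, unary minus, and ~ all leave the bool domain, unlike numpy's np.bool_ shown above:]

```python
# bool subclasses int, so bitwise/arithmetic operators return plain ints
assert isinstance(True, int)
assert (True << 1) == 2     # x << n is x * 2**n
assert (True >> 1) == 0     # x >> n is x // 2**n
assert -True == -1          # unary minus escapes the bool domain
assert ~True == -2          # __invert__ is two's complement, not logical not
assert type(True << 1) is int
```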
> except: > a = 2 > return a > > > I think this code is well behaved and should continue to work as it > already does. There are a couple of ways to implment this new behavior but > I think the simplest way to do this would be to treat the handler as a > closure where all the free variables defined as nonlocal. > > This would need to be a small change to the compiler but may have some > performance implications for code that does not hit the except handler. If > the handler is longer than the bytecode needed to create the inner closure > then it may be faster to run the function when the except handler is not > hit. > > This changes our definition of f from: > > 2 0 SETUP_EXCEPT 8 (to 11) > > 3 3 LOAD_CONST 1 (Ellipsis) > 6 POP_TOP > 7 POP_BLOCK > 8 JUMP_FORWARD 14 (to 25) > > 4 >> 11 POP_TOP > 12 POP_TOP > 13 POP_TOP > > 5 14 LOAD_CONST 2 (1) > 17 STORE_FAST 0 (a) > 20 POP_EXCEPT > 21 JUMP_FORWARD 1 (to 25) > 24 END_FINALLY > > 6 >> 25 LOAD_FAST 0 (a) > 28 RETURN_VALUE > > to something more like: > > f > - > 3 0 SETUP_EXCEPT 8 (to 11) > > 4 3 LOAD_CONST 0 (Ellipsis) > 6 POP_TOP > 7 POP_BLOCK > 8 JUMP_FORWARD 20 (to 31) > > 5 >> 11 POP_TOP > 12 POP_TOP > 13 POP_TOP > 14 LOAD_CONST 1 ( > at 0x7febcd6e2300, file "", line 1>) > 17 LOAD_CONST 2 ('') > 20 MAKE_FUNCTION 0 > 23 CALL_FUNCTION 0 (0 positional, 0 keyword pair) > 26 POP_EXCEPT > 27 JUMP_FORWARD 1 (to 31) > 30 END_FINALLY > > 7 >> 31 LOAD_FAST 0 (a) > 34 RETURN_VALUE > > f. > ----------------- > 1 0 LOAD_CONST 0 (1) > 3 STORE_FAST 0 (a) > 6 LOAD_CONST 1 (None) > 9 RETURN_VALUE > > > This new code properly raises the unbound locals exception when executed. > For g we could use a MAKE_CLOSURE instead of MAKE_FUNCTION. > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From brett at python.org Fri Apr 8 18:20:17 2016 From: brett at python.org (Brett Cannon) Date: Fri, 08 Apr 2016 22:20:17 +0000 Subject: [Python-ideas] A tuple of various Python suggestions In-Reply-To: References: Message-ID: On Fri, 8 Apr 2016 at 14:16 Keith Curtis wrote: > Hi all, > > I just discovered this alias and I thought I'd post a few ideas. I > wouldn't call myself a Python master yet, but it's an amazing language > and my biggest wish is that it was more widely used in the industry. > > Here are a few suggestions: > > 1. Decrease the bug count. I recently noticed that there are about > 5,400 active bugs in http://bugs.python.org/. That surprised me > because I almost never see anyone complain about bugs in Python > (compared to the number who complain about bugs in LibreOffice, > graphics drivers, Gnome / KDE, etc.) > > There are a lot of people on this list, and if the brainpower can be > applied to practical, known, existing problems, it is a great way to > improve Python while also considering more exotic ideas. I can also > suggest making a pretty graph of the bug count and putting it on the > front page of python.org for greater visibility. > That bug count is misleading because it includes not only legitimate bugs but also enhancement proposals. > > 2. Python is somewhat popular on servers, and there is a lot of > potential for more. WordPress is easy to use and powerful, but lots of > people don't want to program in PHP. Or Javascript, Java, Ruby, etc. > > Codebases like Whoosh full-text search > (https://bitbucket.org/mchaput/whoosh/wiki/Home) are important, but > have minimal dev resources as most people are using Lucene / > ElasticSearch. The common choice is between 1.3M lines of Java: > https://www.openhub.net/p/elasticsearch, containing 1100 todos and > 1000 references to "deprecated", or 41K lines of Python written mostly > by one person. > > Hadoop is another big Java project (1.9M lines), and there is even an > ecosystem around it. 
Python interoperates with Hadoop, but it should > be possible to build a radically simpler framework that provides the > same functionality using Python-native functionality and without all > the baggage. Hadoop has several interesting sister projects: a > distributed database, scalable machine learning, a high-level data > flow language, a coordination service, etc. I'm sure you'd build > something smaller, cleaner, faster in many cases, more reliable, etc. > > 3. It was a sad mistake that Google picked Java over Python for > Android. However, there is now a great program called Kivy which > allows people to write apps for IOS or Android with one codebase, but > it could also use more resources, as for example it doesn't fully > support Python 3.x yet. > > There are 10s of thousands of bugs in the popular Python libraries and > I would fix those before proposing more language changes. > > 4. I enjoy reading about the Python performance improvements, but it > is mostly a perception problem with all the existing workarounds. > Gnome wrote version 3 of their shell in Javascript because they didn't > think Python would be fast enough. Lots of people write Node because > it's compiled and "fast". I suggest taking some of the effort working > on performance, and spend it on evangelizing to other programmers that > Python / Cython / PyPy, etc. are already good enough! There are a lot > of programmers out there who would be happier if they could work in > Python. > Two things. One, there is actually a good amount of performance work potentially landing in Python 3.6. Two, we have been saying Python is fast enough for most things for decades at this point, so this is an old issue that's never going away. > > 5. It would be great to get Python in the web browsers as an > alternative to Javascript. There are a number of projects which > convert Python to Javascript, but this would be more direct. > LibreOffice ships with a Python interpreter, why can't Firefox and > Webkit? 
;-) Obviously there are interoperability issues, but it would > be great to just side-step all the complexity of Javascript (Here is a > server-oriented article, but it gives a flavor: > http://notes.ericjiang.com/posts/751) This might sound like a crazy > idea, but the engineering problems aren't that hard. > Once https://github.com/WebAssembly lands in the browser then it won't be as much of an issue. There are also various transpilers of Python to JavaScript already. I also tried back in 2006 but Mozilla rejected the idea so this has been tried before. :) -Brett > > 6. In a few cases, there are too many codebases providing the same > functionality, and none of them are really doing the job. For example, > the de-facto MySQL Python interop library > (https://pypi.python.org/pypi/MySQL-python) only supports Python 2.x > and appears to be abandoned. There are several other libraries out > there with different features, performance, compatibility, etc. and > it's kind of a minefield for what should be a basic scenario. It takes > leadership to jump in and figure out how to fix it and make one > codebase good enough that the others can switch over. > > 7. Focus more on evolving the libraries rather than the language. I've > recently discovered Toolz, which has a more complete set of functional > language methods. I think some of them should be included in the > official versions. A lot of people don't think Python is good enough > for functional programming and this would help. These new routines add > complexity, but a newbie doesn't need to write in a functional way, so > it obeys the "only pay for what you use" rule. > > There are a number of under-staffed libraries and frameworks. I see > people complain about the YAML parsing library being unmaintained, the > default HTTP functionality being difficult and limiting, poor SOAP > support, etc. There are a million ways to improve the Python ecosystem > without making any language changes.
You don't have a big rich company > who can pay for thousands of full-time developers working on > libraries, but the bug reports are a great way to prioritize. > > 8. I've yet to find a nice simple free IDE with debugging for Python. > I use Atom, but it has primitive debugging. I tried PyCharm but it's > very complicated (and not free, and Java). I use Jupyter sometimes > also but I'd prefer a rich client app with watch windows, etc. > > 9. It would be interesting to re-imagine the spreadsheet with a more > native Python interface. Pandas and matplotlib are great, but it would > be cool to have it in LibreOffice Calc that supports drag and drop, > copy and paste, can read and write ODS, etc. (Also, LibreOffice Base > is basically unmaintained. I think if 10 Python programmers passionate > about databases and GUIs showed up, it could re-invigorate this dead > codebase.) > > 10. Being simple to learn and powerful is very hard. Fortunately, you > can break compatibility every 10 years. My only suggestion is to get > rid of the __self__ somehow ;-) > > Regards, > > -Keith > http://keithcu.com/ > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From leewangzhong+python at gmail.com Fri Apr 8 18:50:17 2016 From: leewangzhong+python at gmail.com (Franklin? Lee) Date: Fri, 8 Apr 2016 18:50:17 -0400 Subject: [Python-ideas] Hijacking threads [was: Changing the meaning of bool.__invert__] In-Reply-To: <57075879.1050609@canterbury.ac.nz> References: <57070FC7.3010803@stoneleaf.us> <57075879.1050609@canterbury.ac.nz> Message-ID: On Apr 8, 2016 3:09 AM, "Greg Ewing" wrote: > > Ethan Furman wrote: >> >> Can anybody please enlighten me as to what, exactly, I did wrong? 
> > > I think Guido was objecting to talk about things such as > making bool no longer subclass int, which is not only > going a long way beyond the original proposal, but is > almost certainly never going to happen, so any discussion > of it could be seen as wasting people's time. While reading the thread, I was honestly wondering about whether it was Pythonic to have bool be a subclass of int. (It's definitely convenient.) So that, at least, might be a justified "Should we also...". -------------- next part -------------- An HTML attachment was scrubbed... URL: From ericsnowcurrently at gmail.com Fri Apr 8 19:08:04 2016 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Fri, 8 Apr 2016 17:08:04 -0600 Subject: [Python-ideas] Changing the meaning of bool.__invert__ In-Reply-To: References: <20160407094618.5c01847d@fsol> <570679AF.7020502@stoneleaf.us> <57068A10.3020904@stoneleaf.us> <57069645.8040509@stoneleaf.us> <570755A9.5050104@canterbury.ac.nz> Message-ID: On Fri, Apr 8, 2016 at 1:34 AM, Nathaniel Smith wrote: > The reason I'm uncertain is that in numpy code, using operations like > ~ on booleans is *very* common, because the whole idea of numpy is > that it gives you a way to write code that works the same on either a > single value or on an array of values, and when you're working with > booleans then this means you have to use '~': '~' works on arrays and > 'not' doesn't. Would it be a different story if the logical operators (and, or, not) had a protocol, e.g. __not__? My guess is that ~ was made to work in the absence of __not__. 
-eric From njs at pobox.com Fri Apr 8 21:25:41 2016 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 8 Apr 2016 18:25:41 -0700 Subject: [Python-ideas] Changing the meaning of bool.__invert__ In-Reply-To: References: <20160407094618.5c01847d@fsol> <570679AF.7020502@stoneleaf.us> <57068A10.3020904@stoneleaf.us> <57069645.8040509@stoneleaf.us> <570755A9.5050104@canterbury.ac.nz> Message-ID: On Fri, Apr 8, 2016 at 4:08 PM, Eric Snow wrote: > On Fri, Apr 8, 2016 at 1:34 AM, Nathaniel Smith wrote: >> The reason I'm uncertain is that in numpy code, using operations like >> ~ on booleans is *very* common, because the whole idea of numpy is >> that it gives you a way to write code that works the same on either a >> single value or on an array of values, and when you're working with >> booleans then this means you have to use '~': '~' works on arrays and >> 'not' doesn't. > > Would it be a different story if the logical operators (and, or, not) > had a protocol, e.g. __not__? My guess is that ~ was made to work in > the absence of __not__. It would, but you can't have a regular protocol for and/or because they're actually not operators, they're control-flow syntax: In [1]: a = True In [2]: a or b Out[2]: True In [3]: a and b NameError: name 'b' is not defined You could define a protocol for __or__/__ror__/__and__/__rand__ despite this, but it would have weird issues, like 'True and array([True, False])' would call ndarray.__rand__ and return array([True, False]), but 'True or array([True, False])' would return True (because it short-circuits and returns before it can even check for the presence of ndarray.__ror__). Given this, it's not clear whether it even makes sense to try. There was a discussion about this on python-ideas a few months ago, and Guido asked whether it would still be useful despite these weird issues, but I dropped the ball and my email to numpy-discussion soliciting feedback on that is still sitting in my drafts folder... 
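The control-flow behaviour described here can be verified with a short sketch (the `Chatty` class below is purely illustrative, not part of any proposal): `and`/`or` never consult an operator protocol on either operand, while `&` really does go through `__and__`/`__rand__`.

```python
class Chatty:
    """Illustrative class that records which special methods run."""
    def __init__(self):
        self.calls = []

    def __bool__(self):
        self.calls.append("__bool__")
        return True

    def __and__(self, other):
        self.calls.append("__and__")
        return "bitwise &"

c = Chatty()

# 'and' is syntax, not an operator: it only calls __bool__ on the left
# operand to decide whether to short-circuit, then yields the right
# operand untouched.
result = c and "rhs"
print(c.calls)       # ['__bool__'] -- no __and__/__rand__ involved
print(result)        # rhs

# With a false left operand, 'and' short-circuits before the right
# operand could ever offer a __rand__-style hook:
print(False and c)   # False; c's methods are never consulted

# The bitwise operator, by contrast, dispatches through a real protocol:
print(c & Chatty())  # bitwise &
```

This is exactly why a hypothetical `__rand__`-for-`and` protocol would behave asymmetrically: the right operand only gets a say when the left operand's truth value fails to short-circuit the expression.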
And I guess you could have a protocol just for 'not', but there might be some performance concerns (e.g. right now the peephole optimizer actually knows how to optimize 'if not' into a single opcode), and overriding 'not' without overriding 'and' + 'or' is probably more confusing than useful. -n -- Nathaniel J. Smith -- https://vorpus.org From ncoghlan at gmail.com Sat Apr 9 01:00:07 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 9 Apr 2016 15:00:07 +1000 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: References: <1460038026.1126929.571843177.7BC2EE4C@webmail.messagingengine.com> <1460045791.1159102.571987969.5CC10DF6@webmail.messagingengine.com> Message-ID: On 8 April 2016 at 04:48, Terry Reedy wrote: > To me, the default proposal to expand the domain of open and other path > functions is to call str on the path arg, either always or as needed. We > should then ask "why isn't str() good enough"? Most bad args for open will > immediately result in a file-not-found exception. Not when you're *creating* files and directories. Using "str(path)" as the protocol means these all become valid operations: open(1.0, "w") open(object, "w") open(object(), "w") open(str, "w") open(input, "w") Everything implements __str__ or __repr__, so *everything* becomes acceptable as an argument to filesystem mutating operations, instead of those operations bailing out immediately complaining they've been asked to do something that doesn't make any sense. I strongly encourage folks interested in the fspath protocol design debate to read the __index__ PEP: https://www.python.org/dev/peps/pep-0357/ Start from the title: "Allowing Any Object to be Used for Slicing" The protocol wasn't designed in the abstract: it had a concrete goal of allowing objects other than builtins to be usable in the "x:y:z" slicing syntax. 
Those objects weren't hypothetical either: the rationale spells out "In NumPy, for example, there are 8 different integer scalars corresponding to unsigned and signed integers of 8, 16, 32, and 64 bits. These type-objects could reasonably be used as integers in many places where Python expects true integers but cannot inherit from the Python integer type because of incompatible memory layouts. There should be some way to be able to tell Python that an object can behave like an integer." The PEP also spells out what's wrong with the "just use int(obj)" alternative: "It is not possible to use the nb_int (and __int__ special method) for this purpose because that method is used to *coerce* objects to integers. It would be inappropriate to allow every object that can be coerced to an integer to be used as an integer everywhere Python expects a true integer. For example, if __int__ were used to convert an object to an integer in slicing, then float objects would be allowed in slicing and x[3.2:5.8] would not raise an error as it should." Extending the use of the protocol to other contexts (such as sequence repetition and optimised lookups on range objects) was then taken up on a case by case basis, but the protocol semantics themselves were defined by that original use case of "allow NumPy integers to be used when slicing sequences". The equivalent motivating use case here is "allow pathlib objects to be used with the open() builtin, os module functions, and os.path module functions". The open() builtin handles str paths, integers (file descriptors), and bytes-like objects (pre-encoded paths) The os and os.path functions handle some combination of those 3 depending on the specific function Working directly with file descriptors is relatively rare, so we can leave that as a special case. 
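The coercion-versus-index distinction the PEP draws can be demonstrated with a toy scalar type (`MyInt` is a hypothetical stand-in for a NumPy-style integer scalar, not a real library class):

```python
class MyInt:
    """A toy integer-like scalar: not a subclass of int, but usable
    wherever Python expects a true integer, via __index__."""
    def __init__(self, value):
        self._value = value

    def __index__(self):
        return self._value

data = list(range(10))

# __index__ lets the object participate in slicing...
print(data[MyInt(2):MyInt(5)])   # [2, 3, 4]

# ...while floats, which only support lossy *coercion* via int(),
# are still rejected, exactly as the PEP intends:
try:
    data[3.2:5.8]
except TypeError as exc:
    print("rejected:", exc)
```

The proposed filesystem-path protocol would follow the same pattern: a dedicated dunder that asserts "this object *is* a path in text form", rather than reusing `str()`, which coerces anything at all.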
Similarly, working directly with bytes-like objects introduces cross-platform portability problems, and also changes the output type of many operations, so we'll keep that as a special case, too. That leaves the text representation, and the question of defining equivalents to "operator.index" and its underlying __index__ protocol. My suggestion of os.fspath as the conversion function is based on: - "path" being too generic (we have sys.path, os.path, and the PATH envvar as potential sources of confusion) - "fspath" being similar to "os.fsencode" and "os.fsdecode", which are the operations for converting a filesystem path in text form to and from its bytes-like object form - os and os.path being two of the main consumers of the proposed protocol - os being a builtin module that underpins most filesystem operations anyway, so folks shouldn't be averse to importing it in code that wants to consume the new protocol Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From rosuav at gmail.com Sat Apr 9 01:26:26 2016 From: rosuav at gmail.com (Chris Angelico) Date: Sat, 9 Apr 2016 15:26:26 +1000 Subject: [Python-ideas] New scope for exception handlers In-Reply-To: References: Message-ID: On Sat, Apr 9, 2016 at 7:03 AM, Joseph Jevnik wrote: > > def g(): > a = 1 > try: > ... > except: > a = 2 > return a > > > I think this code is well behaved and should continue to work as it already > does. There are a couple of ways to implment this new behavior but I think > the simplest way to do this would be to treat the handler as a closure where > all the free variables defined as nonlocal. I'm not sure what the point there is; when do you need this kind of thing, and why only in the 'except' clause? Also, what if the name is assigned to in the 'try' and the 'except' but nowhere else? If you're curious, I actually put together a "just for the fun of it" patch that adds a form of sub-function scoping to Python. 
It'd be a great way to figure out just how far this flies in the face of Python's design. ChrisA From rosuav at gmail.com Sat Apr 9 01:32:12 2016 From: rosuav at gmail.com (Chris Angelico) Date: Sat, 9 Apr 2016 15:32:12 +1000 Subject: [Python-ideas] Consistent programming error handling idiom In-Reply-To: References: Message-ID: On Sat, Apr 9, 2016 at 7:25 AM, Rian Hunter wrote: > I want a consistent opt-in idiom with community consensus. I want a clear > way to express that an exception is an error and not an internal bug. It > doesn't have to catch 100% of cases, the idiom just needs to approach > consistency across all Python libraries that I may import. > > If the programmer doesn't pay attention to the idiom, then is_a_bug() will > never return true (or not True if it's is_not_a_bug()). AssertionError is > already unambiguous, I'm sure there are other candidates as well. > > I'm not the first or only one to want something like this > http://blog.tsunanet.net/2012/10/pythons-screwed-up-exception-hierarchy.html The problem with posts like that is that it assumes there's some kind of defined category of "stuff you always want to catch" vs "stuff you never want to catch". This simply isn't the case. You ONLY catch the exceptions you truly understand. Everything else, you leave. There's seldom a need to catch a broad-but-specific subset of exceptions, and those needs aren't entirely consistent, so they fundamentally cannot be codified into the hierarchy. ChrisA From vito.detullio at gmail.com Sat Apr 9 01:58:07 2016 From: vito.detullio at gmail.com (Vito De Tullio) Date: Sat, 09 Apr 2016 07:58:07 +0200 Subject: [Python-ideas] Dunder method to make object str-like References: Message-ID: Chris Angelico wrote: > Proposal: Objects should be allowed to declare that they are > "string-like" by creating a dunder method (analogously to __index__ > for integers) which implies a loss-less conversion to str. > Obviously str will have this dunder method, returning self. 
Most other > core types (notably 'object') will not define it. Absence of this > method implies that the object cannot be treated as a string. do you expect int to define it? -- By ZeD From rosuav at gmail.com Sat Apr 9 02:00:14 2016 From: rosuav at gmail.com (Chris Angelico) Date: Sat, 9 Apr 2016 16:00:14 +1000 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: References: Message-ID: On Sat, Apr 9, 2016 at 3:58 PM, Vito De Tullio wrote: > Chris Angelico wrote: > >> Proposal: Objects should be allowed to declare that they are >> "string-like" by creating a dunder method (analogously to __index__ >> for integers) which implies a loss-less conversion to str. > >> Obviously str will have this dunder method, returning self. Most other >> core types (notably 'object') will not define it. Absence of this >> method implies that the object cannot be treated as a string. > > do you expect int to define it? Nope! An integer cannot be treated as a string. ChrisA From ncoghlan at gmail.com Sat Apr 9 02:03:18 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 9 Apr 2016 16:03:18 +1000 Subject: [Python-ideas] A tuple of various Python suggestions In-Reply-To: References: Message-ID: On 9 April 2016 at 07:15, Keith Curtis wrote: > Hi all, > > I just discovered this alias and I thought I'd post a few ideas. I > wouldn't call myself a Python master yet, but it's an amazing language > and my biggest wish is that it was more widely used in the industry. > > Here are a few suggestions: > > 1. Decrease the bug count. I recently noticed that there are about > 5,400 active bugs in http://bugs.python.org/. That surprised me > because I almost never see anyone complain about bugs in Python > (compared to the number who complain about bugs in LibreOffice, > graphics drivers, Gnome / KDE, etc.) 
> > There are a lot of people on this list, and if the brainpower can be > applied to practical, known, existing problems, it is a great way to > improve Python while also considering more exotic ideas. I can also > suggest making a pretty graph of the bug count and putting it on the > front page of python.org for greater visibility. Folks in institutional environments can most readily contribute to this by obtaining Python from commercial redistributors (rather than using the free community versions and then expecting free ongoing support from volunteers), and then funneling bug reports through their vendor rather than filing them directly with upstream (since donating bug fixes to commercial organisations for free isn't something most people consider a fun hobby). > 2. Python is somewhat popular on servers, and there is a lot of > potential for more. WordPress is easy to use and powerful, but lots of > people don't want to program in PHP. Or Javascript, Java, Ruby, etc. > > Codebases like Whoosh full-text search > (https://bitbucket.org/mchaput/whoosh/wiki/Home) are important, but > have minimal dev resources as most people are using Lucene / > ElasticSearch. The common choice is between 1.3M lines of Java: > https://www.openhub.net/p/elasticsearch, containing 1100 todos and > 1000 references to "deprecated", or 41K lines of Python written mostly > by one person. > > Hadoop is another big Java project (1.9M lines), and there is even an > ecosystem around it. Python interoperates with Hadoop, but it should > be possible to build a radically simpler framework that provides the > same functionality using Python-native functionality and without all > the baggage. Hadoop has several interesting sister projects: a > distributed database, scalable machine learning, a high-level data > flow language, a coordination service, etc. I'm sure you'd build > something smaller, cleaner, faster in many cases, more reliable, etc. 
This is about business dynamics and institutional supply chain management, not software, so wishing won't make it so. However, given the business challenges facing the vendors behind the Java projects you mention here, Python-based alternatives are also going to be a tough sell to potential investors. (That said, you may also be interested in the Apache Spark project, where Python, Java, Scala and R are the top tier analytics development languages) > 3. It was a sad mistake that Google picked Java over Python for > Android. However, there is now a great program called Kivy which > allows people to write apps for IOS or Android with one codebase, but > it could also use more resources, as for example it doesn't fully > support Python 3.x yet. > > There are 10s of thousands of bugs in the popular Python libraries and > I would fix those before proposing more language changes. People work on things in their own time because they find them enjoyable (or otherwise inherently rewarding), not because they're interested in facilitating increased corporate adoption. That said, yes, while it's high profile, contributing to CPython is one of the *least* effective ways of helping to improve the overall Python ecosystem in the near term, as it can literally take years to roll out major changes (as Python 2.6 is still the de facto baseline version, and there are many situations where folks are still using Python 2.4 and earlier). The best cases are those where we can define new APIs, protocols and idioms in ways that can also be adopted in earlier versions of the language by way of third party libraries and standard library backports. > 4. I enjoy reading about the Python performance improvements, but it > is mostly a perception problem with all the existing workarounds. > Gnome wrote version 3 of their shell in Javascript because they didn't > think Python would be fast enough. Lots of people write Node because > it's compiled and "fast". 
I suggest taking some of the effort working > on performance, and spend it on evangelizing to other programmers that > Python / Cython / PyPy, etc. are already good enough! There are a lot > of programmers out there who would be happier if they could work in > Python. There's nothing stopping anyone interested in this area from working on any kind of evangelisation they want to. However, what harm does it cause us personally if people decide to use other programming languages? Python tautologically fits the brains of Pythonistas, but that's nowhere near being the same thing as it being the right language for everyone for every purpose. It's also the case that any developer with only one language currently in their toolbox (even when that language is Python) is a developer with lots of learning opportunities ahead of them: http://www.curiousefficiency.org/posts/2015/10/languages-to-improve-your-python.html > 5. It would be great to get Python in the web browsers as an > alternative to Javascript. There are a number of projects which > convert Python to Javascript, but this would be more direct. > LibreOffice ships with a Python interpreter, why can't Firefox and > Webkit? ;-) Obviously there are interoperability issues, but it would > be great to just side-step all the complexity of Javascript (Here is a > server-oriented article, but it gives a flavor: > http://notes.ericjiang.com/posts/751) This might sound like a crazy > idea, but the engineering problems aren't that hard. This misunderstands the nature of the relationship between JavaScript, CSS and the HTML Domain Object Model: these are technologies that have co-evolved for describing and dynamically updating user interfaces that interact with remote services over a network, and they're *really* good at it. Replicating that ecosystem in other program languages would technically be possible, but there's little incentive to do so given the work on WebAssembly and ongoing improvements in transpilers. > 6. 
In a few cases, there are too many codebases providing the same > functionality, and none of them are really doing the job. For example, > the de-facto MySQL Python interop library > (https://pypi.python.org/pypi/MySQL-python) only supports Python 2.x > and appears to be abandoned. There are several other libraries out > there with different features, performance, compatibility, etc. and > it's kind of a minefield for what should be a basic scenario. It takes > leadership to jump in and figure out how to fix it and make one > codebase good enough that the others can switch over. This is why people pay open source redistributors to ensure they have access to commercially supported components, rather than trusting that whatever they happened to find on the internet will continue to be maintained by anonymous benefactors. > 7. Focus more on evolving the libraries rather than the language. I've > recently discovered Toolz, which has a more complete set of functional > language methods. I think some of them should be included in the > official versions. A lot of people don't think Python is good enough > for functional programming and this would help. These new routines add > complexity, but a newbie doesn't need to write in a functional way, so > it obeys the "only pay for what you use" rule. "Some of them should be included" is not an actionable proposal. Anyone is free to propose specific additions to functools, and make the case for why those particular ones should be included in the standard library rather than continuing to be accessed via version independent 3rd party libraries like Toolz.
(Advance warning: "I want a purely functional solution to a problem that can already be readily solved with procedural code" generally isn't accepted as a compelling justification) A simpler possibility might be to review the Functional Programming HOWTO at https://docs.python.org/3/howto/functional.html and consider ways that that might be updated to reference third party libraries, as well as potentially made more discoverable via the functools and itertools reference documentation. > There are a number of under-staffed libraries and frameworks. I see > people complain about the YAML parsing library being unmaintained, the > default HTTP functionality being difficult and limiting, poor SOAP > support, etc. There are a million ways to improve the Python ecosystem > without making any language changes. You don't have a big rich company > who can pay for thousands of full-time developers working on > libraries, but the bug reports are a great way to prioritize. What makes you think anyone here has the authority to tell anyone else how to spend their time? Folks work on community open source projects based on their personal interests and their commercial interests. While you're right that lots of things could stand to be improved, the best people to complain to about underinvestment (or misdirected investment) are commercial redistributors selling supported versions of CPython and other community projects in the Python ecosystem, rather than the already overcommitted volunteers contributing their own time for their own reasons. > 8. I've yet to find a nice simple free IDE with debugging for Python. > I use Atom, but it has primitive debugging. I tried PyCharm but it's > very complicated (and not free, and Java). I use Jupyter sometimes > also but I'd prefer a rich client app with watch windows, etc. Not the right list for that question. > 9. It would be interesting to re-imagine the spreadsheet with a more > native Python interface.
Pandas and matplotlib are great, but it would > be cool to have it in LibreOffice Calc that supports drag and drop, > copy and paste, can read and write ODS, etc. (Also, LibreOffice Base > is basically unmaintained. I think if 10 Python programmers passionate > about databases and GUIs showed up, it could re-invigorate this dead > codebase.) Also not the right list. > 10. Being simple to learn and powerful is very hard. Fortunately, you > can break compatibility every 10 years. My only suggestion is to get > rid of the __self__ somehow ;-) Methods are just functions, so this is never going to happen. (When folks understand why, they've generally made a decent step forward in appreciating the differences between a procedural-first usage model for a language, vs an objects-first one) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From steve at pearwood.info Sat Apr 9 03:44:44 2016 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 9 Apr 2016 17:44:44 +1000 Subject: [Python-ideas] New scope for exception handlers In-Reply-To: References: Message-ID: <20160409074444.GQ12526@ando.pearwood.info> On Fri, Apr 08, 2016 at 05:03:05PM -0400, Joseph Jevnik wrote: > I would like to propose a change to exception handlers to make it harder to > accidentally leak names defined only in the exception handler blocks. This > change follows from the decision to delete the name of an exception at the > end of a handler. The goal of this change is to prevent people from relying > on names that are defined only in a handler. An interesting proposal, but you're missing one critical point: why is it harmful to create names inside an except block? There is a concrete reason why Python 3, and not Python 2, deletes the "except Exception as err" name when the except block ends: because exceptions now hold on to a lot more call info, which can prevent objects from being garbage-collected. But the same doesn't apply to arbitrary names.
At the moment, only a few block statements create a new scope: def and class mostly. In particular, no flow control statement does: if, elif, else, for, while, try, except all use the existing scope. This is a nice clean design, and in my opinion much better than the rule that any indented block is a new scope. I would certainly object to making "except" the only exception (pun intended) and I would object even more to making *all* the block statements create a new scope. Here is an example of how your proposal would bite people. Nearly all my code is hybrid 2+3 code, so I often have a construct like this at the start of modules:

try:
    import builtins  # Python 3.x
except ImportError:
    # Python 2.x
    import __builtin__ as builtins

Nice and clean. But what if try and except introduced a new scope? I would have to write:

builtins = None
try:
    global builtins
    import builtins
except ImportError:
    global builtins
    import __builtin__ as builtins
assert builtins is not None

Since try and except are different scopes, I need a separate global declaration in each. If you think this second version is an improvement over the first, then our ideas of what makes good looking code are so far apart that I don't think it's worth discussing this further :-) If only except is a different scope, then I have this shorter version:

try:
    # global scope
    import builtins
except ImportError:
    # local scope
    global builtins
    import __builtin__ as builtins

> As an example, let's look at a function with a try except:
>
> def f():
>     try:
>         ...
>     except:
>         a = 1
>     return a
>
> This function will only work if the body raises some exception, otherwise
> we will get an UnboundLocalError.

Not necessarily. It depends on what is hidden by the ... dots. For example:

def f():
    try:
        a = sequence.pop()
    except AttributeError:
        a = -1
    return a

It might not be the most Pythonic code around, but it works, and your proposal will break it.
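Steven's fallback-assignment pattern works today precisely because the handler shares the enclosing function's scope; here is a self-contained, runnable version of the same idea (`pop_or_default` is just an illustrative name):

```python
# Current behaviour: 'except' shares the enclosing scope, so a fallback
# assignment made inside the handler is visible after the try statement.
def pop_or_default(sequence):
    try:
        a = sequence.pop()
    except AttributeError:
        a = -1            # same scope: 'a' survives past the handler
    return a

print(pop_or_default([1, 2, 3]))   # 3  (lists have .pop())
print(pop_or_default((1, 2, 3)))   # -1 (tuples do not)
```

Under the proposal, the `a = -1` branch would bind a name in a new inner scope, and the final `return a` would raise instead of returning the fallback.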
Bottom line is, there's nothing fundamentally wrong with except blocks *not* starting a new scope. I'm not sure if there's any real benefit to the proposal, but even if there is, I doubt it's worth the cost of breaking existing working code. So if you still want to champion your proposal, it's not enough to demonstrate that it could be done. You're going to have to demonstrate not only a benefit from the change, but that the benefit is worth breaking other people's code. -- Steve From tjreedy at udel.edu Sat Apr 9 04:07:04 2016 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 9 Apr 2016 04:07:04 -0400 Subject: [Python-ideas] Changing the meaning of bool.__invert__ In-Reply-To: References: <20160407094618.5c01847d@fsol> <570679AF.7020502@stoneleaf.us> <57068A10.3020904@stoneleaf.us> <57069645.8040509@stoneleaf.us> <20160408014030.GL12526@ando.pearwood.info> <1460129233.1465771.572986385.6254AC76@webmail.messagingengine.com> Message-ID: On 4/8/2016 11:42 AM, Guido van Rossum wrote: > DeprecationWarning every time you use ~ on a bool? A DeprecationWarning should only be in the initial version of bool.__invert__, which initially would return int.__invert__ after issuing the warning that we plan to change the meaning. -- Terry Jan Reedy From joejev at gmail.com Sat Apr 9 04:09:50 2016 From: joejev at gmail.com (Joseph Jevnik) Date: Sat, 9 Apr 2016 04:09:50 -0400 Subject: [Python-ideas] New scope for exception handlers In-Reply-To: <20160409074444.GQ12526@ando.pearwood.info> References: <20160409074444.GQ12526@ando.pearwood.info> Message-ID: Thank you for the responses. I did not realize that the delete fast was added because the traceback is on the exception, that makes a lot of sense. 
Regarding the case of:

try:
    import a
except ImportError:
    import b as a

my proposal would still have this work as intended, because the first import would appear as an assignment to the name `a` outside the scope of the handler, which would cause the inner scope to emit a STORE_DEREF after the import instead of a STORE_FAST. The only thing this change would block is introducing a new variable which is only defined inside the except handler.

> So if you still want to champion your proposal, it's not enough to
> demonstrate that it could be done. You're going to have to demonstrate
> not only a benefit from the change, but that the benefit is worth
> breaking other people's code.

To be honest, this was mainly proposed because I thought about _how_ to implement it and less about _should_ we implement this. I implemented a small version of this to generate the dis outputs that I put in the first email. After reading the responses I agree that this should probably not be added; I do not think that we need to discuss this further unless someone else has strong feelings about this.

On Sat, Apr 9, 2016 at 3:44 AM, Steven D'Aprano wrote:
> [...]
From ncoghlan at gmail.com Sat Apr 9 04:14:50 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 9 Apr 2016 18:14:50 +1000
Subject: [Python-ideas] New scope for exception handlers
In-Reply-To: <20160409074444.GQ12526@ando.pearwood.info>
References: <20160409074444.GQ12526@ando.pearwood.info>
Message-ID:

On 9 April 2016 at 17:44, Steven D'Aprano wrote:
> So if you still want to champion your proposal, it's not enough to
> demonstrate that it could be done. You're going to have to demonstrate
> not only a benefit from the change, but that the benefit is worth
> breaking other people's code.
Not just any code, but "try it and see if it works" name binding idioms recommended in the reference documentation: https://docs.python.org/3/howto/pyporting.html#use-feature-detection-instead-of-version-detection

It's also worth noting that when it comes to detecting this kind of structural error, tools like pylint already do a good job of tracing possible control flow problems:

$ cat > conditional_name_binding.py
try:
    pass
except:
    a = 1
print(a)
$ pylint -E --enable=invalid-name conditional_name_binding.py
No config file found, using default configuration
************* Module conditional_name_binding
C: 4, 4: Invalid constant name "a" (invalid-name)

More easily finding this kind of problem is one of the major advantages of using static analysis tools in addition to dynamic testing.

Cheers, Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From k7hoven at gmail.com Sat Apr 9 07:16:47 2016
From: k7hoven at gmail.com (Koos Zevenhoven)
Date: Sat, 9 Apr 2016 14:16:47 +0300
Subject: [Python-ideas] Changing the meaning of bool.__invert__
In-Reply-To:
References: <20160407094618.5c01847d@fsol> <570679AF.7020502@stoneleaf.us> <57068A10.3020904@stoneleaf.us> <57069645.8040509@stoneleaf.us> <20160408014030.GL12526@ando.pearwood.info> <1460129233.1465771.572986385.6254AC76@webmail.messagingengine.com>
Message-ID:

On Sat, Apr 9, 2016 at 11:07 AM, Terry Reedy wrote:
> On 4/8/2016 11:42 AM, Guido van Rossum wrote:
>>
>> DeprecationWarning every time you use ~ on a bool?
>
> A DeprecationWarning should only be in the initial version of
> bool.__invert__, which initially would return int.__invert__ after issuing
> the warning that we plan to change the meaning.

Maybe the right warning type would be FutureWarning.
-Koos From ian.g.kelly at gmail.com Sat Apr 9 10:28:10 2016 From: ian.g.kelly at gmail.com (Ian Kelly) Date: Sat, 9 Apr 2016 08:28:10 -0600 Subject: [Python-ideas] Changing the meaning of bool.__invert__ In-Reply-To: References: <20160407094618.5c01847d@fsol> <570679AF.7020502@stoneleaf.us> <57068A10.3020904@stoneleaf.us> <57069645.8040509@stoneleaf.us> <20160408014030.GL12526@ando.pearwood.info> <1460129233.1465771.572986385.6254AC76@webmail.messagingengine.com> Message-ID: On Sat, Apr 9, 2016 at 2:07 AM, Terry Reedy wrote: > On 4/8/2016 11:42 AM, Guido van Rossum wrote: >> >> DeprecationWarning every time you use ~ on a bool? > > > A DeprecationWarning should only be in the initial version of > bool.__invert__, which initially would return int.__invert__ after issuing > the warning that we plan to change the meaning. It seems unusual to deprecate something without also providing a means of using the new thing in the same release. "Don't use this feature because we're going to change what it does in the future. Oh, you want to use the new version? Psych! We haven't actually done anything yet. Use not instead." It creates a weird void in Python 3.6 where the operator still exists but absolutely nobody has a legitimate reason to be using it. What happens if somebody is using ~ for its current semantics, skips the 3.6 release in their upgrade path, and doesn't read the release notes carefully enough? They'll never see the warning and will just experience a silent and difficult-to-diagnose breakage. 
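For readers joining the thread, the behaviour under debate is today's inherited semantics (a quick sketch, nothing hypothetical here):

```python
# bool is a subclass of int, so ~ is two's-complement bitwise
# inversion of the underlying 0 or 1, not logical negation.
print(~True)     # -2, i.e. ~1
print(~False)    # -1, i.e. ~0
print(not True)  # False -- the logical negation many readers expect
```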
From steve at pearwood.info Sat Apr 9 11:25:14 2016
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 10 Apr 2016 01:25:14 +1000
Subject: [Python-ideas] Changing the meaning of bool.__invert__
In-Reply-To:
References: <57069645.8040509@stoneleaf.us> <20160408014030.GL12526@ando.pearwood.info> <1460129233.1465771.572986385.6254AC76@webmail.messagingengine.com>
Message-ID: <20160409152514.GT12526@ando.pearwood.info>

On Sat, Apr 09, 2016 at 08:28:10AM -0600, Ian Kelly wrote:

> It seems unusual to deprecate something without also providing a means
> of using the new thing in the same release. "Don't use this feature
> because we're going to change what it does in the future. Oh, you want
> to use the new version? Psych! We haven't actually done anything yet.
> Use not instead." It creates a weird void in Python 3.6 where the
> operator still exists but absolutely nobody has a legitimate reason to
> be using it.

Not really. This is quite similar to what happened in Python 2.3 during int/long unification. The behaviour of certain integer operations changed, including the meaning of some literals, and warnings were displayed. I don't have 2.3 available to demonstrate but I can show you the change in behaviour:

[steve at ando ~]$ python1.5 -c "print 0xffffffff"
-1
[steve at ando ~]$ python2.4 -c "print 0xffffffff"
4294967295

From memory, 0xffffffff in python2.3 would print a warning that the result will change in the next release, and return -1. See:

https://www.python.org/dev/peps/pep-0237/
https://www.python.org/download/releases/2.3.5/notes/

> What happens if somebody is using ~ for its current semantics, skips
> the 3.6 release in their upgrade path, and doesn't read the release
> notes carefully enough? They'll never see the warning and will just
> experience a silent and difficult-to-diagnose breakage.

Then they'll be in the same position as everybody if there's no deprecation at all.
--
Steve

From steve at pearwood.info Sat Apr 9 11:43:05 2016
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 10 Apr 2016 01:43:05 +1000
Subject: [Python-ideas] Changing the meaning of bool.__invert__
In-Reply-To: <22279.18221.103226.654215@turnbull.sk.tsukuba.ac.jp>
References: <570679AF.7020502@stoneleaf.us> <57068A10.3020904@stoneleaf.us> <57069645.8040509@stoneleaf.us> <20160408014030.GL12526@ando.pearwood.info> <22279.18221.103226.654215@turnbull.sk.tsukuba.ac.jp>
Message-ID: <20160409154305.GU12526@ando.pearwood.info>

On Fri, Apr 08, 2016 at 02:52:45PM +0900, Stephen J. Turnbull wrote:

> > After all, notwithstanding their fancy string representation,
>
> I guess "fancy string representation" was the original motivation for
> the overrides. If the intent was really to make operator versions of
> logical operators (but only for true bools!), they would have fixed ~
> too.

No need to guess. There's a PEP:

https://www.python.org/dev/peps/pep-0285/

> > they behave like ints and actually are ints.
>
> I can't fellow-travel all the way to "actually are", though. bools
> are what we decide to make them.

I'm not talking about bools in other languages, or bools in Python in some alternate universe. But in the Python we have right now, bools *are* ints, no ifs, buts or maybes:

py> isinstance(True, int)
True

This isn't an accident of the implementation, it was an explicit BDFL pronouncement in PEP 285:

    6) Should bool inherit from int?
       => Yes.

Now I'll certainly admit that bools-are-ints is an accident of history. Had Guido been more influenced by Pascal, say, and less by C, he might have chosen to include a dedicated Boolean type right from the beginning. But he wasn't, and so he didn't, and consequently bools are now ints.

> I just don't see why the current
> behaviors of &|^ are particularly useful, since you'll have to guard
> all bitwise expressions against non-bool truthies and falsies.
flag ^ flag is useful since we don't have a boolean-xor operator and bitwise-xor does the right thing for bools. And I suppose some people might prefer & and | over boolean-and and boolean-or because they're shorter and require less typing. I don't think that's a particularly good reason for using them, and as you say, you do have to guard against non-bools slipping in, but Consenting Adults applies.

--
Steve

From steve at pearwood.info Sat Apr 9 11:47:42 2016
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 10 Apr 2016 01:47:42 +1000
Subject: [Python-ideas] Hijacking threads [was: Changing the meaning of bool.__invert__]
In-Reply-To:
References: <57070FC7.3010803@stoneleaf.us> <57075879.1050609@canterbury.ac.nz>
Message-ID: <20160409154742.GV12526@ando.pearwood.info>

On Fri, Apr 08, 2016 at 06:50:17PM -0400, Franklin? Lee wrote:

> While reading the thread, I was honestly wondering about whether it was
> Pythonic to have bool be a subclass of int. (It's definitely convenient.)
> So that, at least, might be a justified "Should we also...".

https://www.python.org/dev/peps/pep-0285/

and see my comments to Stephen Turnbull sent a few minutes ago.

--
Steve

From steve at pearwood.info Sat Apr 9 11:49:12 2016
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 10 Apr 2016 01:49:12 +1000
Subject: [Python-ideas] Changing the meaning of bool.__invert__
In-Reply-To:
References: <57069645.8040509@stoneleaf.us> <20160408014030.GL12526@ando.pearwood.info> <1460129233.1465771.572986385.6254AC76@webmail.messagingengine.com>
Message-ID: <20160409154912.GW12526@ando.pearwood.info>

On Sat, Apr 09, 2016 at 02:16:47PM +0300, Koos Zevenhoven wrote:

> Maybe the right warning type would be FutureWarning.

If we accept this proposal -- and I hope we don't -- I think that FutureWarning is the right one to use. It is what was used in 2.3 when the behaviour of ints changed as part of int/long unification.
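The transitional behaviour Terry describes (warn with FutureWarning while keeping today's result) can be sketched roughly like this. This is purely illustrative: `TransitionalBool` is an invented stand-in, since the real change would live inside CPython's bool implementation, not in user code.

```python
import warnings

# Hypothetical stand-in for a transitional bool.__invert__: warn that
# the meaning will change, but return the current (int) result for now.
class TransitionalBool(int):
    def __invert__(self):
        warnings.warn(
            "~ on a bool will mean logical negation in a future release",
            FutureWarning,
            stacklevel=2,
        )
        return int.__invert__(self)  # unchanged semantics for now

flag = TransitionalBool(1)
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = ~flag

print(result)                             # -2 -- same as ~True today
print(caught[0].category is FutureWarning)  # True
```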
-- Steve From ian.g.kelly at gmail.com Sat Apr 9 12:07:16 2016 From: ian.g.kelly at gmail.com (Ian Kelly) Date: Sat, 9 Apr 2016 10:07:16 -0600 Subject: [Python-ideas] Changing the meaning of bool.__invert__ In-Reply-To: <20160409152514.GT12526@ando.pearwood.info> References: <57069645.8040509@stoneleaf.us> <20160408014030.GL12526@ando.pearwood.info> <1460129233.1465771.572986385.6254AC76@webmail.messagingengine.com> <20160409152514.GT12526@ando.pearwood.info> Message-ID: On Sat, Apr 9, 2016 at 9:25 AM, Steven D'Aprano wrote: > On Sat, Apr 09, 2016 at 08:28:10AM -0600, Ian Kelly wrote: > >> It seems unusual to deprecate something without also providing a means >> of using the new thing in the same release. "Don't use this feature >> because we're going to change what it does in the future. Oh, you want >> to use the new version? Psych! We haven't actually done anything yet. >> Use not instead." It creates a weird void in Python 3.6 where the >> operator still exists but absolutely nobody has a legitimate reason to >> be using it. > > Not really. This is quite similar to what happened in Python 2.3 during > int/long unification. The behaviour of certain integer operations > changed, including the meaning of some literals, and warnings were > displayed. Pointing out that this has been done once before, 11 minor releases prior, does not dissuade me from continuing to characterize it as "unusual". The int/long unification was also a much more visible change overall. >> What happens if somebody is using ~ for its current semantics, skips >> the 3.6 release in their upgrade path, and doesn't read the release >> notes carefully enough? They'll never see the warning and will just >> experience a silent and difficult-to-diagnose breakage. > > Then they'll be in the same position as everybody if there's no > depreciation at all. I'm not suggesting there should be no deprecation. I'm just questioning whether the proposed deprecation is sufficient. 
From random832 at fastmail.com Sat Apr 9 12:23:55 2016 From: random832 at fastmail.com (Random832) Date: Sat, 09 Apr 2016 12:23:55 -0400 Subject: [Python-ideas] Changing the meaning of bool.__invert__ In-Reply-To: <20160409154305.GU12526@ando.pearwood.info> References: <570679AF.7020502@stoneleaf.us> <57068A10.3020904@stoneleaf.us> <57069645.8040509@stoneleaf.us> <20160408014030.GL12526@ando.pearwood.info> <22279.18221.103226.654215@turnbull.sk.tsukuba.ac.jp> <20160409154305.GU12526@ando.pearwood.info> Message-ID: <1460219035.3810107.573773625.34885A10@webmail.messagingengine.com> On Sat, Apr 9, 2016, at 11:43, Steven D'Aprano wrote: > flag ^ flag is useful since we don't have a boolean-xor operator and > bitwise-xor does the right thing for bools. If you have bools, so does !=. If you don't, ^ is no better. From guido at python.org Sat Apr 9 12:24:54 2016 From: guido at python.org (Guido van Rossum) Date: Sat, 9 Apr 2016 09:24:54 -0700 Subject: [Python-ideas] Changing the meaning of bool.__invert__ In-Reply-To: References: <57069645.8040509@stoneleaf.us> <20160408014030.GL12526@ando.pearwood.info> <1460129233.1465771.572986385.6254AC76@webmail.messagingengine.com> <20160409152514.GT12526@ando.pearwood.info> Message-ID: Let me pronounce something here. This change is not worth the amount of effort and pain a deprecation would cause everyone. Either we change this quietly in 3.6 (adding it to What's New etc. of course) or we don't do it at all. 
--
--Guido van Rossum (python.org/~guido)

From ian.g.kelly at gmail.com Sat Apr 9 12:25:50 2016
From: ian.g.kelly at gmail.com (Ian Kelly)
Date: Sat, 9 Apr 2016 10:25:50 -0600
Subject: [Python-ideas] Changing the meaning of bool.__invert__
In-Reply-To:
References: <57069645.8040509@stoneleaf.us> <20160408014030.GL12526@ando.pearwood.info> <1460129233.1465771.572986385.6254AC76@webmail.messagingengine.com> <20160409152514.GT12526@ando.pearwood.info>
Message-ID:

On Sat, Apr 9, 2016 at 10:07 AM, Ian Kelly wrote:
> [...]
> Pointing out that this has been done once before, 11 minor releases
> prior, does not dissuade me from continuing to characterize it as
> "unusual". The int/long unification was also a much more visible
> change overall.
Also, in that case there was a way to start using long literals immediately: 0xffffffffL

From storchaka at gmail.com Sat Apr 9 14:27:57 2016
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Sat, 9 Apr 2016 21:27:57 +0300
Subject: [Python-ideas] Changing the meaning of bool.__invert__
In-Reply-To:
References: <20160407094618.5c01847d@fsol> <570679AF.7020502@stoneleaf.us> <57068A10.3020904@stoneleaf.us> <57069645.8040509@stoneleaf.us> <20160408014030.GL12526@ando.pearwood.info> <1460129233.1465771.572986385.6254AC76@webmail.messagingengine.com>
Message-ID:

On 09.04.16 11:07, Terry Reedy wrote:
> On 4/8/2016 11:42 AM, Guido van Rossum wrote:
>> DeprecationWarning every time you use ~ on a bool?
>
> A DeprecationWarning should only be in the initial version of
> bool.__invert__, which initially would return int.__invert__ after
> issuing the warning that we plan to change the meaning.

For such cases there is FutureWarning.

From keithcu at gmail.com Sat Apr 9 20:28:52 2016
From: keithcu at gmail.com (Keith Curtis)
Date: Sat, 9 Apr 2016 20:28:52 -0400
Subject: [Python-ideas] A tuple of various Python suggestions
In-Reply-To:
References:
Message-ID:

Hi again all,

Thanks for the replies. I don't have a studied opinion of what new methods should be added to functools, I mostly brought it up because it won't create an incompatible change and things like that should be much easier to do. Functools is one of the few remaining places in the core runtime where what is provided seems meant as a sample or flavor; it can't be used on a daily basis without group-by, merge-sort, frequencies, pipe, take / drop, etc.

I agree that more companies paying for Python support would be a great thing, but most often companies buy support at a different level of granularity: for example, RHEL where they get it for an entire OS, or hiring Django people to build and maintain a website.
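For what it's worth, several of the utilities named above already have stdlib homes outside functools (a sketch using itertools and collections; the sample data is invented):

```python
from collections import Counter
from itertools import groupby, islice

words = ["apple", "avocado", "banana", "blueberry", "cherry"]

# group-by: itertools.groupby (input must already be sorted by the key)
groups = {k: list(g) for k, g in groupby(words, key=lambda w: w[0])}
print(groups["a"])                       # ['apple', 'avocado']

# frequencies: collections.Counter
print(Counter("banana").most_common(1))  # [('a', 3)]

# take / drop: itertools.islice
print(list(islice(range(10), 3)))        # take 3 -> [0, 1, 2]
print(list(islice(range(10), 3, None)))  # drop 3 -> [3, 4, ..., 9]
```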
I wouldn't count on commercial redistributors of Python as ever being a big source of resources to fix the thousands of random bugs. The vast majority of people use the standard runtime.

I realize that volunteers work on their own time and choice of tasks, but you've got quite a large community, and by periodically reporting on the bug count and having a goal to resolve old ones eventually, you will even find problem areas no one is talking about. Sometimes the issue isn't resources, but grit, focus, pride, etc. If you never weighed yourself, or looked at yourself in the mirror, you'd get heavier than you realized. You are mostly volunteers, but you can try to produce software as good as what paid professionals can do.

Perhaps the bug list should be broken up into two categories: future optional or incompatible features, and code that people can easily agree is worth doing now. That way important holes don't get ignored for years. People should be able to trust the standard runtime.

It didn't seem like WebAssembly would enable access to Numpy. It seems doubtful it will be much adopted if it can't access the rich Python runtime that might be installed.

Python is much cleaner than some of the alternatives, so you should consider adoption as a humanitarian effort!

Regards,

-Keith

From steve at pearwood.info Sat Apr 9 21:14:41 2016
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 10 Apr 2016 11:14:41 +1000
Subject: [Python-ideas] A tuple of various Python suggestions
In-Reply-To:
References:
Message-ID: <20160410011441.GZ12526@ando.pearwood.info>

Hi Keith,

On Sat, Apr 09, 2016 at 08:28:52PM -0400, Keith Curtis wrote:
[...]
> Python is much cleaner than some of the alternatives, so you should
> consider adoption as a humanitarian effort!

I keep seeing you telling us that "you should do this, you should do that, you you you..." but I don't see *you* volunteering to work on any of the bugs. Or do you see yourself more in a supervisory role?
-- Steve From ncoghlan at gmail.com Sun Apr 10 00:32:22 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 10 Apr 2016 14:32:22 +1000 Subject: [Python-ideas] A tuple of various Python suggestions In-Reply-To: References: Message-ID: On 10 April 2016 at 10:28, Keith Curtis wrote: > I agree that more companies paying for Python support would be a great > thing, but most often companies buy support at a different level of > granularity: for example, RHEL where they get it for an entire OS, or > hiring Django people to build and maintain a website. For context, I work for Red Hat's developer experience team and occasionally put the graph at http://bugs.python.org/issue?@template=stats in front of our Python maintenance team and ask "This is a trend that worries me, what are we currently doing about it?". I'd encourage anyone using Python in a commercial or large organisational context that are getting their Python runtimes via a commercially supported Linux distro, Platform-as-a-Service provider, or one of the cross-platform CPython redistributors to take that graph, show it to their vendor and say: "These upstream CPython metrics worry us, what are you doing about them on our behalf?". 
(If you don't know how to ask your vendor that kind of question since another team handles the supplier relationship, then find that team and ask *them* what they're doing to address your concerns.)

For folks in those kinds of contexts running directly off the binaries or source code published by the Python Software Foundation, my recommendation is a bit different:

* option 1 is to advocate for switching to a commercial redistributor, and asking prospective vendors how they're supporting ongoing CPython maintenance as part of the vendor evaluation

* option 2 is to advocate for becoming a PSF Sponsor Member and then advocate for a technical fellowship program aimed at general maintenance and project health monitoring for CPython (keeping in mind that "How do we manage it fairly and sustainably?" would actually be a bigger challenge for such a fellowship than funding it)

Whether advocating for option 1 or option 2 makes more sense will vary by organisation, as it depends greatly on whether or not the organisation would prefer to work with a traditional commercial supplier, or engage directly with a charitable foundation that operates in the wider public interest.

There's also option 3, which is to hire existing core developers and give them time to work on general upstream CPython maintenance, but there are actually relatively few organisations for which that strategy makes commercial sense (and they're largely going to be operating system companies, public cloud infrastructure companies, or commercial Python redistributors).

> I wouldn't count
> on commercial redistributors of Python as ever being a big source of
> resources to fix the thousands of random bugs. The vast majority of
> people use the standard runtime.

If that were an accurate assessment, I wouldn't need to write articles like http://www.curiousefficiency.org/posts/2015/04/stop-supporting-python26.html and the manylinux cross-distro Linux ABI wouldn't be based on the 9 year old CentOS 5 userspace.
(We also wouldn't be needing to pursue redistributor-friendly proposals like PEP 493 to help propagate the network security enhancements in the 2.7.x series) Direct downloads from python.org are certainly a major factor on Windows (especially in educational contexts), but even there, many of the largest commercial adopters are using ActiveState's distribution (see http://www.activestate.com/customers ) or one of the analytical distributions from Enthought or Continuum Analytics. > I realize that volunteers work on their own time and choice of tasks, > but you've got quite a large community, and by periodically reporting > on the bug count and having a goal to resolve old ones eventually, you > will even find problem areas no one is talking about. Sometimes the > issue isn't resources, but grit, focus, pride, etc. So, you're suggesting we deliberately set out to make our volunteers feel bad in an effort to guilt them into donating more free work to corporations and other large organisations? When I phrase it in that deliberately negative way, I hope it becomes obvious why we *don't* do this. We do send out the weekly tracker metrics emails, and offers the statistics graphs on the tracker itself if people want to use them as parts of their own business cases for increased investment (whether direct or indirect) in upstream sustaining engineering for CPython. > If you never weighed yourself, at looked at yourself in the mirror, > you'd get heavier than you realized. You are mostly volunteers, but > you can try to produce software as good as what paid professionals can > do. You seem to be operating under some significant misapprehensions here. As far as I am aware, most of the core developers *are* paid professionals (even the folks that didn't start out that way) - we're just typically not paid specifically to work on CPython. Instead, we work on CPython to make sure it's fit for *our* purposes, or because we find it to be an enjoyable way to spend our free time. 
Engaging effectively with that environment means pursuing a "co-contributor" experience - we each make CPython better for our own purposes, and by doing so, end up making it better for everyone. The experience you seem to be seeking in this thread is that of a pure software consumer, and that's the role downstream redistributors fulfil in an open source ecosystem: they provide (usually for a fee) a traditional customer experience, with the redistributor taking care of the community engagement side of things. > Perhaps the bug list should be broken up into two categories: future > optional or incompatible features, and code that people can easily > agree are worth doing now. That way important holes don't get ignored > for years. People should be able to trust the standard runtime. Important holes don't get ignored for years - they get addressed, since people are motivated to invest the time to ensure they're fixed. That said, we do experience the traditional open source "community gap" where problems that are only important to folks that aren't yet personally adept at effective community engagement can languish indefinitely (hence my emphasis above on organisations considering the idea of outsourcing their community engagement efforts, rather than assuming that it's easy, or that someone else will take care of it for them without any particular personal motivation). > It didn't seem like WebAssembly would enable access to Numpy. It seems > doubtful it will be much adopted if it can't access the rich Python > runtime that might be installed. Project Jupyter already addressed that aspect by separating out the platform independent UI service for code entry from the language kernel backend for code evaluation. There's certainly a lot that could be done in terms of making Jupyter language kernels installable as web browser add-ons running native code, but that would be a question for the Project Jupyter folks, rather than here. 
> Python is much cleaner than some of the alternatives, so you should > consider adoption as a humanitarian effort! As tempting as it is to believe that "things that matter to software developers" dictate where the world moves, it's important to keep in mind that it's estimated we make up barely 0.25% of the world's population [1] - this means that for everyone that would consider themselves a "software developer", there's around 399 people who don't. Accordingly, if your motivation is "How do we help empower people to control the technology in their lives, rather than having it control them?" (which is a genuinely humanitarian effort), then I'd recommend getting involved with education-focused organisations like the Raspberry Pi Foundation, Software Carpentry and Django Girls, or otherwise helping to build bridges between the software development community and professional educators (see [2] for an example of the latter). When it comes to technology choices *within* that 0.25%, it's worth remembering that one of the key design principles of Python is *trusting developers to make their own decisions* (hence things like allowing monkeypatching as being incredibly useful in cases like testing, even while actively advising against it as a regular programming practice). That trust extends to trusting them to choose languages that are appropriate for them and their use cases. If that leads to them choosing Python, cool. If it leads to them choosing something else, that's cool, too - there are so many decent open source programming languages and runtimes out there these days that software developers are truly spoiled for choice, so providing a solid foundation for exploring those alternatives is at least as important a goal as reducing the incentives for people that already favour Python to need to explore them. Cheers, Nick. 
[1] http://www.infoq.com/news/2014/01/IDC-software-developers
[2] https://2016.pycon-au.org/programme/python_in_education_seminar

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From greg.ewing at canterbury.ac.nz Sun Apr 10 01:38:41 2016
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 10 Apr 2016 17:38:41 +1200
Subject: [Python-ideas] Consistent programming error handling idiom
In-Reply-To: References:
Message-ID: <5709E6E1.8030107@canterbury.ac.nz>

Rian Hunter wrote:
> I want a consistent opt-in idiom with community consensus. I want a
> clear way to express that an exception is an error and not an internal
> bug.

There's already a convention for this. Exceptions derived from EnvironmentError (OSError in python 3) usually result from something the user did wrong. Anything else that escapes lower-level exception handling is probably a bug.

So my top-level exception handlers usually look like this:

    try:
        ...
    except EnvironmentError as e:
        # Display an appropriate error message to the user
        # If it's something interactive, go back for another
        # request, otherwise exit

Anything else is left to propagate and generate a traceback.

It's not perfect, but it works well enough most of the time. If I find a case where some non-bug exception gets raised that doesn't derive from EnvironmentError, I fix things so that it gets caught somewhere lower down and re-raised as an EnvironmentError.

So the rule you seem to be after is probably "If it's likely to be a user error, derive it from EnvironmentError, otherwise don't."

I think that's about the best that can be done, considering that library code can't always know whether a given exception is a user error or a bug, because it depends on what the calling code is trying to accomplish. For example, some_library.read_file("confug.ini") will probably produce an OSError (file not found) even though it's the programmer's fault for misspelling "config".
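A runnable sketch of the idiom described above (the function names and messages here are illustrative, not from the original post): lower-level code lets OSError propagate, and a single top-level handler turns it into a user-facing message, while anything else surfaces as a traceback and is treated as a bug.

```python
import sys

def read_data(path):
    # Lower-level code: let OSError propagate so the top-level
    # handler below can present it as a user-facing error.
    with open(path) as f:
        return f.read()

def run(argv):
    # Top-level handler: EnvironmentError (an alias of OSError in
    # Python 3) is shown as an error message; any other exception
    # escapes and produces a traceback, i.e. is treated as a bug.
    try:
        return read_data(argv[1])
    except EnvironmentError as e:
        print("error: %s" % e, file=sys.stderr)
        return None
```

Run it with a missing file name and you get a one-line error message instead of a traceback; a genuine bug (say, an IndexError from forgetting the argument entirely) still produces a full traceback, which is exactly the distinction being described.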
-- Greg From greg.ewing at canterbury.ac.nz Sun Apr 10 02:46:17 2016 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 10 Apr 2016 18:46:17 +1200 Subject: [Python-ideas] Changing the meaning of bool.__invert__ In-Reply-To: References: <57069645.8040509@stoneleaf.us> <20160408014030.GL12526@ando.pearwood.info> <1460129233.1465771.572986385.6254AC76@webmail.messagingengine.com> <20160409152514.GT12526@ando.pearwood.info> Message-ID: <5709F6B9.5060603@canterbury.ac.nz> Guido van Rossum wrote: > Let me pronounce something here. This change is not worth the amount > of effort and pain a deprecation would cause everyone. Either we > change this quietly in 3.6 (adding it to What's New etc. of course) or > we don't do it at all. I'm having trouble seeing why it should be done at all. What actual problem would it be solving? Does anyone desperately want to be able to spell boolean negation as ~b instead of not b? -- Greg From p.f.moore at gmail.com Sun Apr 10 06:15:28 2016 From: p.f.moore at gmail.com (Paul Moore) Date: Sun, 10 Apr 2016 11:15:28 +0100 Subject: [Python-ideas] Changing the meaning of bool.__invert__ In-Reply-To: <5709F6B9.5060603@canterbury.ac.nz> References: <57069645.8040509@stoneleaf.us> <20160408014030.GL12526@ando.pearwood.info> <1460129233.1465771.572986385.6254AC76@webmail.messagingengine.com> <20160409152514.GT12526@ando.pearwood.info> <5709F6B9.5060603@canterbury.ac.nz> Message-ID: On 10 April 2016 at 07:46, Greg Ewing wrote: > Guido van Rossum wrote: >> >> Let me pronounce something here. This change is not worth the amount >> of effort and pain a deprecation would cause everyone. Either we >> change this quietly in 3.6 (adding it to What's New etc. of course) or >> we don't do it at all. > > I'm having trouble seeing why it should be done at all. > What actual problem would it be solving? Does anyone > desperately want to be able to spell boolean negation > as ~b instead of not b? 
I have no axe to grind either way, but my impression from this thread is that some people would prefer bool to be consistent with user-defined types (such as numpy's) in this regard - specifically because user-defined types *have* to use ~ as the negation operator, because "not" is not overridable in the way they require.

Paul

From stephen at xemacs.org Sun Apr 10 12:33:53 2016
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Mon, 11 Apr 2016 01:33:53 +0900
Subject: [Python-ideas] Boundaries for unpacking
In-Reply-To: <5707BCEE.1090901@gmail.com>
References: <570692E7.9080904@gmail.com> <5707BCEE.1090901@gmail.com>
Message-ID: <22282.32881.29954.946126@turnbull.sk.tsukuba.ac.jp>

Michel Desmoulin writes:
> Yes and you can also do that for regular slicing on list. But you don't,
> because you have regular slicing, which is cleaner, and easier to read
> and remember.

It's clean because it's well-defined. Slices on general iterables don't have an OWTDI. For example, "a = somelist[:]" is an idiom for copying somelist to a so that destructive manipulations of a don't change the original. Should "a = someiterable[:]" reproduce those semantics? After "head, tail = someiterable" should tail contain a list or someiterable itself or something else?

"WIBNI iterable[] worked" has already been posted to this thread about 5 times, and nobody disagrees that IWBN. But slicing and unpacking of iterables are fraught with such issues. It's time the wishful thinkers got down to edge cases and wrote a PEP.
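For reference, the stdlib's closest existing answer to slicing arbitrary iterables is itertools.islice, and it illustrates exactly the kind of edge case raised above: unlike list slicing, it consumes the underlying iterator rather than copying it.

```python
from itertools import islice

data = iter(range(10))

# "Slice" the first three items out of an arbitrary iterable.
head = list(islice(data, 3))   # [0, 1, 2]

# Unlike somelist[:3], this consumed part of the iterator:
# the remainder picks up where the slice stopped.
tail = list(data)              # [3, 4, 5, 6, 7, 8, 9]
```

Whether a hypothetical someiterable[:3] should behave like this, or like the non-destructive list version, is precisely the sort of question a PEP would have to settle.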
From steve at pearwood.info Sun Apr 10 12:39:24 2016
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 11 Apr 2016 02:39:24 +1000
Subject: [Python-ideas] Consistent programming error handling idiom
In-Reply-To: <5709E6E1.8030107@canterbury.ac.nz>
References: <5709E6E1.8030107@canterbury.ac.nz>
Message-ID: <20160410163924.GB12526@ando.pearwood.info>

On Sun, Apr 10, 2016 at 05:38:41PM +1200, Greg Ewing wrote:
> Rian Hunter wrote:
> >I want a consistent opt-in idiom with community consensus. I want a
> >clear way to express that an exception is an error and not an internal
> >bug.
>
> There's already a convention for this. Exceptions derived
> from EnvironmentError (OSError in python 3) usually
> result from something the user did wrong. Anything else that
> escapes lower-level exception handling is probably a bug.

I don't think this is a valid description of EnvironmentError. I don't think you can legitimately say it "usually" comes from user error.

Let's say I have an application which, on startup, looks for config files in various places. If they exist and are readable, the application uses them. If they don't exist or aren't readable, an EnvironmentError will be generated (say, IOError). This isn't a user error, and my app can and should just ignore such missing config files.

Then the application goes to open a user-specified data file. If the file doesn't exist, or can't be read, that will generate an EnvironmentError, but it isn't one that should be logged. (Who wants to log every time the user misspells a file name, or tries to open a file they don't have permission for?) In an interactive application, the app should display an error message and then wait for more commands. In a batch or command-line app, the application should exit. So treatment of the error depends on *what sort of application* you have, not just the error itself.
Then the application phones home, looking for updates to download, uploading the popularity statistics of the most commonly used commands, bug reports, etc. What if it can't contact the home server? That's most likely an EnvironmentError too, but it's not a user-error. Oops, I spelled the URL of my server "http://myserver.com.ua" instead of .au. So that specific EnvironmentError is a programming bug. (No wonder I haven't had any popularity stats uploaded...)

So EnvironmentError can represent any of:

- situation normal, not actually an error at all;
- a non-fatal user-error;
- a fatal user-error;
- transient network errors that will go away on their own;
- programming bugs.

> It's not perfect, but it works well enough most of the time.
> If I find a case where some non-bug exception gets raised that
> doesn't derive from EnvironmentError, I fix things so that
> it gets caught somewhere lower down and re-raised as an
> EnvironmentError.

That's ... rather strange. As in:

EnvironmentError("substring not found")

for an unsuccessful search?

--
Steve

From cory at lukasa.co.uk Sun Apr 10 14:41:44 2016
From: cory at lukasa.co.uk (Cory Benfield)
Date: Sun, 10 Apr 2016 19:41:44 +0100
Subject: [Python-ideas] A tuple of various Python suggestions
In-Reply-To: References:
Message-ID: <8C288144-3C4D-44D0-AC25-CA0F1DD91583@lukasa.co.uk>

> On 10 Apr 2016, at 01:28, Keith Curtis wrote:
>
> It didn't seem like WebAssembly would enable access to Numpy. It seems
> doubtful it will be much adopted if it can't access the rich Python
> runtime that might be installed.

I feel like this demonstrates a confusion about how the web ecosystem works. Code that executes inside a web browser's Javascript runtime does not have arbitrary access to the system. That's not an arbitrary limitation, it's vital, because code shipped over the web is inherently untrusted: it can be intercepted, manipulated, edited, and attacked.
Each time the web application developers of the world want a new interface to system hardware they have to specify a whole set of APIs for the browser Javascript to obtain that access in a way that is controlled and enabled by the user. Browser developers will allow code executing in the browser to access binaries on the host system *over their dead bodies*. Allowing that represents a terrifying security vulnerability. At no point will any sane browser developer allow that.

As it turns out, of course, WebAssembly *would* enable access to NumPy, because NumPy could simply be compiled to WebAssembly as well, and distributed along with the Python interpreter and all the other code you'd have to ship.

The TL;DR here is: no web browser will ever allow access to a Python runtime (or any other runtime) that is installed. However, WebAssembly does not have the limitation you've suggested.

Cory
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 801 bytes
Desc: Message signed with OpenPGP using GPGMail
URL:

From Nikolaus at rath.org Sun Apr 10 17:24:00 2016
From: Nikolaus at rath.org (Nikolaus Rath)
Date: Sun, 10 Apr 2016 14:24:00 -0700
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: (Nick Coghlan's message of "Sat, 9 Apr 2016 15:00:07 +1000")
References: <1460038026.1126929.571843177.7BC2EE4C@webmail.messagingengine.com> <1460045791.1159102.571987969.5CC10DF6@webmail.messagingengine.com>
Message-ID: <87h9f9p37z.fsf@vostro.rath.org>

On Apr 09 2016, Nick Coghlan wrote:
> The equivalent motivating use case here is "allow pathlib objects to
> be used with the open() builtin, os module functions, and os.path
> module functions".

Why isn't this use case perfectly solved by:

    def open(obj, *a):
        # If pathlib isn't imported, we can't possibly receive
        # a Path object.
        pathlib = sys.modules.get('pathlib', None)
        if pathlib is not None and isinstance(obj, pathlib.Path):
            obj = str(obj)
        return real_open(obj, *a)

> That leaves the text representation, and the question of defining
> equivalents to "operator.index" and its underlying __index__ protocol.

well, or the above (as far as I can see) :-).

Best,
-Nikolaus

--
GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F
Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

"Time flies like an arrow, fruit flies like a Banana."

From keithcu at gmail.com Sun Apr 10 17:24:59 2016
From: keithcu at gmail.com (Keith Curtis)
Date: Sun, 10 Apr 2016 17:24:59 -0400
Subject: [Python-ideas] A tuple of various Python suggestions
In-Reply-To: <8C288144-3C4D-44D0-AC25-CA0F1DD91583@lukasa.co.uk>
References: <8C288144-3C4D-44D0-AC25-CA0F1DD91583@lukasa.co.uk>
Message-ID:

Hi again,

I mean this merely as food for thought. Thank you for reading.

I personally find Python very stable, but the bug counts have been heading in the wrong direction: http://bugs.python.org/issue?@template=stats

I recently discovered those charts, and they have valuable data that can be turned into action.

My distributor of Python is Arch. I presume it is very close to what you guys release. That's just my example, but from what I've seen, few pay for a Python runtime. Given your license, I wouldn't expect a lot to. If someone waits for the wrong train, they won't get there.

I don't suggest guilt-tripping volunteers into fixing bugs. If you've got to dig a ditch, you can try to enjoy the sun. Getting the bug count under control is an admirable goal. The list is an opportunity to focus on the known problems real people care about most, and to try to deal with old issues before taking on new ones.

I don't recommend people fix bugs to help "corporations and large organizations". You won't be very motivated by that anti-capitalist mindset.
People should fix bugs because it helps real Python users, and gets rid of barriers. If people steadily remove roadblocks, things will flourish.

I was teasing about you guys not being paid professionals; however, it appears that very few of you have the goal of getting the official bug count down. That is a big difference between amateurs and professionals. If few people feel ownership of the official Python, it can be bad. Maybe the PSF could hire more with that mindset. I don't know the solutions, but I am grateful to be able to send you a few words.

As for WebAssembly, I don't know if Numpy can "simply" be re-compiled for it. It seems like the sort of work item that could take 2-3 years, and never be actually fully compatible or as fast.

Python can run in sandboxes as well: https://wiki.python.org/moin/SandboxedPython. People who care about code being intercepted and manipulated should use SSL or sign it. People who write their own code to run on their own machine would likely prefer to be able to just directly reference the Python runtime they've already set up.

I wonder whether the sandbox can be used outside of WebAssembly so that code distribution and security are not so intermingled. I did see someone write that WebAssembly is the "dawn of a new era", but I wonder whether it is mostly a bunch of Javascript people trying to solve their own problems rather than those who care about making Python work well on it.
Regards,

-Keith

From ethan at stoneleaf.us Sun Apr 10 17:31:15 2016
From: ethan at stoneleaf.us (Ethan Furman)
Date: Sun, 10 Apr 2016 14:31:15 -0700
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <87h9f9p37z.fsf@vostro.rath.org>
References: <1460038026.1126929.571843177.7BC2EE4C@webmail.messagingengine.com> <1460045791.1159102.571987969.5CC10DF6@webmail.messagingengine.com> <87h9f9p37z.fsf@vostro.rath.org>
Message-ID: <570AC623.1070701@stoneleaf.us>

On 04/10/2016 02:24 PM, Nikolaus Rath wrote:
> On Apr 09 2016, Nick Coghlan wrote:
>> The equivalent motivating use case here is "allow pathlib objects to
>> be used with the open() builtin, os module functions, and os.path
>> module functions".
>
> Why isn't this use case perfectly solved by:
>
> def open(obj, *a):
>     # If pathlib isn't imported, we can't possibly receive
>     # a Path object.
>     pathlib = sys.modules.get('pathlib', None)
>     if pathlib is not None and isinstance(obj, pathlib.Path):
>         obj = str(obj)
>     return real_open(obj, *a)

pathlib is the primary motivator, but there's no reason to shut everyone else out. By having a well-defined protocol and helper function we not only make our own lives easier but we also make the lives of third-party libraries and experimenters easier.

--
~Ethan~

From greg.ewing at canterbury.ac.nz Sun Apr 10 20:36:27 2016
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 11 Apr 2016 12:36:27 +1200
Subject: [Python-ideas] Consistent programming error handling idiom
In-Reply-To: <20160410163924.GB12526@ando.pearwood.info>
References: <5709E6E1.8030107@canterbury.ac.nz> <20160410163924.GB12526@ando.pearwood.info>
Message-ID: <570AF18B.2030606@canterbury.ac.nz>

Steven D'Aprano wrote:
> Let's say I have an application which, on startup, looks for config
> files in various places. If they exist and are readable, the application
> uses them. If they don't exist or aren't readable, an EnvironmentError
> will be generated (say, IOError).
> This isn't a user error, and my app
> can and should just ignore such missing config files.

Of course you can't assume that an EnvironmentError *anywhere* in the program represents a user error. I was just suggesting a heuristic for the *top level* exception handler to use.

Making that heuristic work well requires cooperation from the rest of the code. In this case, it means wrapping the code that reads config files to catch file-not-found errors. You're going to have to do that anyway if you want to carry on with the rest of the processing.

> That's ... rather strange. As in:
>
> EnvironmentError("substring not found")
>
> for an unsuccessful search?

I might put a bit more information in there to help the user. The idea is that whatever I put in the EnvironmentError will end up getting displayed to the user as an error message by my top-level exception handler.

I might also use a subclass of EnvironmentError if I want to be able to catch it specifically, but that's not strictly necessary. I don't show the user the exception class name, only its argument.

--
Greg

From cory at lukasa.co.uk Mon Apr 11 04:17:31 2016
From: cory at lukasa.co.uk (Cory Benfield)
Date: Mon, 11 Apr 2016 09:17:31 +0100
Subject: [Python-ideas] A tuple of various Python suggestions
In-Reply-To: References: <8C288144-3C4D-44D0-AC25-CA0F1DD91583@lukasa.co.uk>
Message-ID: <612B55AC-E500-48F0-9583-2EB316AF727A@lukasa.co.uk>

> On 10 Apr 2016, at 22:24, Keith Curtis wrote:
>
> Python can run in sandboxes as well:
> https://wiki.python.org/moin/SandboxedPython. People who care about
> code being intercepted and manipulated should use SSL or sign it.

I don't think you've understood the problem here: you seem to be saying that we can solve the lack of trust issue by "rubbing some crypto on it". But that doesn't solve the problem at all.

Let's take this apart. "People who care about code being intercepted and manipulated": if that code runs directly on the user's machine then *everyone* should care.
Put another way: it doesn't matter what the *author* of the code cares about, it matters what the user cares about, and all users care about executing safe code! Otherwise, I can insert whatever Python code I like and run arbitrary code with the permissions of the browser. This opens the entire machine up to attack: an attacker can consume their CPU resources, transition into FS access, and basically do all kinds of wacky things. But wrapping this in SSL or signing the code doesn't solve the problem at all.

I opened a new browser window, turned my ad blocker off, and then loaded www.wired.com. The following domains loaded and executed Javascript code (this isn't a complete list either, I gave up and got bored):

- ads.rubiconproject.com
- 5be16.v.fwmrm.net
- optimized-by.rubiconproject.com
- cdn.optimizely.com
- rtb.adgrx.com
- player.cnevids.com
- dy48bnzanqw0v.cloudfront.net
- cdn-akamai.mookie1.com
- cdnjs.cloudflare.com
- tpc.googlesyndication.com
- securepubads.g.doubleclick.net
- c.amazon-adsystem.com
- s.update.rubiconproject.com
- b.scorecardresearch.com
- aax.amazon-adsystem.com
- static.chartbeat.com
- segment-data.zqtk.net
- wired.disqus.com
- animate.adobe.com
- aam.wired.com
- pagead2.googlesyndication.com
- condenast.demdex.net
- www.googletagservices.com
- h5.adgear.com
- dpm.demdex.net
- a.adgear.com
- i.yldbt.com
- use.typekit.net
- assets.adobedtm.com
- z.moatads.com

The point is that I, the user, did not consent to *any* of those. This is the way the web platform works: users don't get asked who gets to execute code on their machine. Most users would not download a random binary to their machine from one of those domains and execute it. Wrapping that code in SSL or signing it prevents man-in-the-middle attacks on the code, but doesn't in any sense prevent those actors listed there from doing terrible stuff! It is well known that Sony wrote an actual rootkit that they used as DRM and distributed it via CDs.
If you believe that one of the above sites wouldn't do something equally malicious with full access to my machine, you're living in a fantasy world.

This means that any code that those domains are allowed to execute needs to run in an absurdly restricted context: because neither I nor any other user trusts arbitrary domains to run arbitrary code! A Python sandbox that allows access to any code not distributed via the web browser is not restrictive enough. That Python tells you too much about the machine on which it is running. This is doubly bad if that Python is capable of calling into native code extensions distributed outside the browser (such as NumPy), because sandboxing code like that requires running a complete virtual machine. Either distributing that code would be a nightmare, or you'd be forcing users to run a complete x86 virtual machine in order to keep them safe from the arbitrary code that these actors are delivering.

> People who write their own code to run on their own machine would
> likely prefer to be able to just directly reference the Python runtime
> they've already setup.

People who write their own code to run on their own machine can write Python directly. They don't need a web browser. Hell, they can bundle Flask and provide a localhost website that runs Python code. Those users are currently served just fine.

> I wonder whether the sandbox can be used outside of Web Assembly so
> that code distribution and security are not so intermingled. I did see
> someone write that WebAssembly is the "dawn of a new era", but I
> wonder whether it is mostly a bunch of Javascript people trying to
> solve their own problems rather than those who care about making
> Python work well on it.

Python will work just fine on it if you don't add the bizarre requirement to be able to access the user's machine from inside the browser sandbox.
Python on the web is a laudable goal, and should be pursued, but the idea of a Python-on-the-web as powerful as Python-on-the-machine is not. No-one wants their website Javascript to be able to call into a Node.js installed on the user's machine, because they know that's absurd. We shouldn't want Python to be able to do it either.

Anyway, at this point I think we're about as off-topic as we can get for this list, so I'm stepping back out of this conversation now. Feel free to follow-up off-list.

Cory
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 801 bytes
Desc: Message signed with OpenPGP using GPGMail
URL:

From ncoghlan at gmail.com Mon Apr 11 04:51:05 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 11 Apr 2016 18:51:05 +1000
Subject: [Python-ideas] A tuple of various Python suggestions
In-Reply-To: References: <8C288144-3C4D-44D0-AC25-CA0F1DD91583@lukasa.co.uk>
Message-ID:

On 11 April 2016 at 07:24, Keith Curtis wrote:
> I personally find Python very stable, but the bug counts have been
> heading in the wrong direction:
> http://bugs.python.org/issue?@template=stats
>
> I recently discovered those charts, and they have valuable data that
> can be turned into action.

How? Are you offering to pay someone to bring them down? Are you offering to work on it yourself? Are you offering to help with the current workflow improvement efforts that are designed to make more efficient use of the limited pool of available contributor time?

It's very easy to say "Hey, these metrics are going in the wrong direction". It's very hard to do something productive about it, rather than just complaining on the internet.

> My distributor of Python is Arch. I presume it is very close to what
> you guys release. That's just my example, but from what I've seen, few
> pay for a Python runtime.
Then you'd be incorrect (they may pay for it as part of something else, like a Red Hat subscription, or an Azure/AWS/GCE/Heroku/OpenShift account, but they're paying for it).

It's true that *open source and free software communities* are very bad at ensuring the development processes for the software we use are sustainable, but that seems to be because we're bad at supply chain management in general and even more confused than most by the difference between "duplicating an already existing piece of software can readily be made zero cost" and "ensuring a piece of software remains useful as the world around it changes requires ongoing investments of time and money".

> I don't suggest guilt-tripping volunteers into fixing bugs. If you've
> got to dig a ditch, you can try to enjoy the sun. Getting the bug
> count under control is an admirable goal. The list is an opportunity
> to focus on the known problems real people care about most, and to try
> to deal with old issues before taking on new ones.

But is that a fun way for volunteers to prioritise their time, as compared to working on things that interest them personally, or that they otherwise find inherently rewarding?

Many of the oldest issues remain open because they're rare, easily worked around, hard to fix, only arguably a bug, or some combination of the above. Even reviewing them to see if they're still valid can be time consuming. A lot of projects deal with that problem by automatically closing old issues that haven't been touched in a while, which seems to be a textbook case of gaming the system - once you treat "number of open issues" as an important metric, you've created an incentive to "default to closing", even if the underlying problem hasn't been addressed.

> I don't recommend people fix bugs to help "corporations and large
> organizations". You won't be very motivated by that anti-capitalist
> mindset.

What anti-capitalist mindset?
"If someone wants me to fix bugs for their reasons rather than mine, they can pay me" is a quintessentially capitalist attitude. I only add the "corporations and large organisations" qualifier because they're more likely to be able to afford to pay someone, and less likely to have volunteers willing to work on their problems for altruistic reasons.

> People should fix bugs because it helps real Python users,
> and gets rid of barriers.

This is still guilt-tripping volunteers - neither "you're not volunteering enough" nor "you're volunteering for the wrong things" are acceptable messages to send to folks that are already contributing (and folks that send that message regardless of its inappropriateness are a major factor in open source contributors burning out and quitting community contributions entirely, so it has the exact opposite effect of the intended one).

By contrast, it's perfectly reasonable to let folks that have *spare* time they're willing to give to the community know that there are plenty of opportunities to contribute and provide info regarding some of the many areas where assistance would be helpful, as long as it's accompanied by the reminder that a sensible and sustainable personal priority order is "health, relationships, paid work, volunteer work".

Even putting that question of healthy priorities aside, it's also the case that the vast majority of Pythonistas *aren't* affected by the edge cases that aren't handled correctly, or they're using third party libraries that already work around those issues. The latter is especially true for folks working across multiple Python versions.

> I was teasing about you guys not being paid professionals, however, it
> appears that very few of you have the goal of getting the official bug
> count down. That is a big difference between amateurs and
> professionals.
As noted above, commercial projects tend to be far more ruthless about culling old "We're not going to invest resources in addressing this" issues. We prefer to leave them open in case they catch someone's interest (and because politely explaining to someone why you closed their issue report can itself be quite time consuming). However, I'll also reiterate my original point: if you want a customer experience, pay someone. > If few people feel ownership of the official Python, it > can be bad. Maybe the PSF could hire more with that mindset. There's a big difference between "not offering a customer experience for free" and not feeling ownership. Now, you may *want* a customer experience for free, but that doesn't make it a reasonable expectation. > I don't > know the solutions, but I am grateful to be able to send you a few > words. This isn't a case where articulating the problem is at all helpful - we're well aware of the problem already (hence the metrics). The PSF is also aware of the concern, but it's a longer term challenge compared to other areas (such as the Python Package Index), so it's not something we're interested in driving from the Board level at this point in time. Regards, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From rosuav at gmail.com Mon Apr 11 04:56:01 2016 From: rosuav at gmail.com (Chris Angelico) Date: Mon, 11 Apr 2016 18:56:01 +1000 Subject: [Python-ideas] A tuple of various Python suggestions In-Reply-To: References: <8C288144-3C4D-44D0-AC25-CA0F1DD91583@lukasa.co.uk> Message-ID: On Mon, Apr 11, 2016 at 6:51 PM, Nick Coghlan wrote: > Many of the oldest issues remain open because they're rare, easily > worked around, hard to fix, only arguably a bug, or some combination > of the above. Even reviewing them to see if they're still valid can be > time consuming. Maybe this is where someone like Keith can contribute? 
Go through a lot of old issues and inevitably there'll be some that you can reproduce with the version of Python that was current then, but can't repro with today's Python, and they can be closed as fixed. Doesn't take any knowledge of C, and maybe not even of Python (if there's a good enough test case there). ChrisA From ncoghlan at gmail.com Mon Apr 11 05:59:54 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 11 Apr 2016 19:59:54 +1000 Subject: [Python-ideas] A tuple of various Python suggestions In-Reply-To: References: <8C288144-3C4D-44D0-AC25-CA0F1DD91583@lukasa.co.uk> Message-ID: On 11 April 2016 at 18:56, Chris Angelico wrote: > On Mon, Apr 11, 2016 at 6:51 PM, Nick Coghlan wrote: >> Many of the oldest issues remain open because they're rare, easily >> worked around, hard to fix, only arguably a bug, or some combination >> of the above. Even reviewing them to see if they're still valid can be >> time consuming. > > Maybe this is where someone like Keith can contribute? Go through a > lot of old issues and inevitably there'll be some that you can > reproduce with the version of Python that was current then, but can't > repro with today's Python, and they can be closed as fixed. Doesn't > take any knowledge of C, and maybe not even of Python (if there's a > good enough test case there). +1 As David Murray has pointed out on occasion, browsing through the tracker "oldest first" can also be interesting in terms of reading the discussions and seeing what leads to issues remaining open for a long time. Separating out "Open Enhancements" as a subcategory of "Open Issues" is also a longstanding "nice-to-have" for the metrics collection - the script for the weekly data collection is at https://hg.python.org/tracker/python-dev/file/tip/scripts/roundup-summary , while https://hg.python.org/tracker/python-dev/file/tip/scripts/issuestats.py pulls the time series created by those notifications and turns it into a chart. Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From berker.peksag at gmail.com Mon Apr 11 06:38:02 2016 From: berker.peksag at gmail.com (=?UTF-8?Q?Berker_Peksa=C4=9F?=) Date: Mon, 11 Apr 2016 13:38:02 +0300 Subject: [Python-ideas] A tuple of various Python suggestions In-Reply-To: References: <8C288144-3C4D-44D0-AC25-CA0F1DD91583@lukasa.co.uk> Message-ID: On Mon, Apr 11, 2016 at 12:24 AM, Keith Curtis wrote: > I was teasing about you guys not being paid professionals, however, it > appears that very few of you have the goal of getting the official bug > count down. While I understand your point, it's more complicated than that. I spent some of my weekends to close/commit old issues on bugs.python.org and unfortunately I can say that issue triaging is one of the most unrewarding (you will probably get a "oh so it took three years to commit a two lines patch?" or "wow it took only two years to notice my problem" response) and time consuming (you will spend at least 30 minutes to understand what the issue is about) tasks in open source development. --Berker From stephen at xemacs.org Mon Apr 11 08:01:28 2016 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Mon, 11 Apr 2016 21:01:28 +0900 Subject: [Python-ideas] A tuple of various Python suggestions In-Reply-To: References: <8C288144-3C4D-44D0-AC25-CA0F1DD91583@lukasa.co.uk> Message-ID: <22283.37400.964955.954452@turnbull.sk.tsukuba.ac.jp> Executive summary: Keep your shirts on, Python-Dev is scaling. Nick Coghlan writes: > On 11 April 2016 at 07:24, Keith Curtis wrote: > > I personally find Python very stable, but the bug counts have been > > heading in the wrong direction: > > http://bugs.python.org/issue?@template=stats > > > > I recently discovered those charts, and they have valuable data that > > can be turned into action. > > How? Wrong question, in my opinion. "Why do you think so?" is what I'd ask. 
To me, only the last graph, which shows the closed and total growing at the same rate (more or less), and therefore suggests that Python development is on a stable path vis-a-vis bugginess, is particularly interesting. Growth in number of open issues in that situation is arguably a *good* thing. Why might the open issues be growing, even though the fraction of open issues (= 1 - the fraction of closed issues) is constant? (1) Users are reporting lots of issues. How would that happen? Python is getting lots of users, and they aren't going away because Python is "too buggy". Hard to see that as a bad thing. (2) Python has a growing amount of code to be buggy, and users are exercising the new code and reporting issues they encounter, and aren't going away because it's too buggy. Hard to see that as a bad thing. (3) The reported issues are duplicates reported because nobody's fixing an important subset. That's bad, but is it real? Well, can't disprove that just looking at the numbers, but (a) we'd notice the dupes (people are looking at the issue tracker, because issues are getting closed) and (b) the users would go away, but (1) and (2) say they aren't. The same "back of the envelope" analysis applies to the "issues with patches," I think. Not to be Pollyanna about it; there are problems with Python's issue management (and Nick knows them better than most). And perhaps there is an opportunity to leverage that stability, and improve the open to total ratio while maintaining user and feature growth. But AFAICS, those graphs don't really tell us anything we can act on (except to reassure us that the rumors that Python is about to be consumed by termites are unfounded). 
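[Editorial note: Stephen's back-of-the-envelope point — that the open-issue count can grow steadily while the closed fraction stays constant — is easy to check with made-up numbers. The weekly figures below are invented for illustration only, not real bugs.python.org data.]

```python
# Invented weekly figures, purely illustrative -- not real tracker data.
OPENED_PER_WEEK = 100
CLOSED_PER_WEEK = 85

total = closed = 0
for week in range(200):
    total += OPENED_PER_WEEK
    closed += CLOSED_PER_WEEK

open_backlog = total - closed        # grows by 15 issues every week
closed_fraction = closed / total     # constant at 0.85 the whole time

print(open_backlog, closed_fraction)  # 3000 0.85
```

The absolute backlog climbs without bound even though the closed fraction never moves, which is exactly the pattern the last graph shows.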
From ncoghlan at gmail.com Mon Apr 11 10:06:04 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 12 Apr 2016 00:06:04 +1000 Subject: [Python-ideas] A tuple of various Python suggestions In-Reply-To: <22283.37400.964955.954452@turnbull.sk.tsukuba.ac.jp> References: <8C288144-3C4D-44D0-AC25-CA0F1DD91583@lukasa.co.uk> <22283.37400.964955.954452@turnbull.sk.tsukuba.ac.jp> Message-ID: On 11 April 2016 at 22:01, Stephen J. Turnbull wrote: > Executive summary: Keep your shirts on, Python-Dev is scaling. > > Nick Coghlan writes: > > On 11 April 2016 at 07:24, Keith Curtis wrote: > > > I personally find Python very stable, but the bug counts have been > > > heading in the wrong direction: > > > http://bugs.python.org/issue?@template=stats > > > > > > I recently discovered those charts, and they have valuable data that > > > can be turned into action. > > > > How? > > Not to be Pollyanna about it; there are problems with Python's issue > management (and Nick knows them better than most). Yeah, I have a biased perspective here as I have a vested interest in folks taking greater advantage of any support contracts they already have (since Red Hat's subscriptions are what ultimately pay my salary), and my pre-Red-Hat employment was with a large scale system integrator, so I'm now inclined to look for supply chain based solutions to capacity problems whenever there are credible commercial incentives to be found. Since keeping high touch customers out of community channels also helps reduce the overall support burden for volunteers, I'll generally advocate for that approach whenever the questions of increased development capacity or changes in focus come up :) Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From Nikolaus at rath.org Mon Apr 11 10:43:49 2016 From: Nikolaus at rath.org (Nikolaus Rath) Date: Mon, 11 Apr 2016 16:43:49 +0200 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: <570AC623.1070701@stoneleaf.us> (Ethan Furman's message of "Sun, 10 Apr 2016 14:31:15 -0700") References: <1460038026.1126929.571843177.7BC2EE4C@webmail.messagingengine.com> <1460045791.1159102.571987969.5CC10DF6@webmail.messagingengine.com> <87h9f9p37z.fsf@vostro.rath.org> <570AC623.1070701@stoneleaf.us> Message-ID: <87d1pwrysa.fsf@thinkpad.rath.org> On Apr 10 2016, Ethan Furman wrote: > On 04/10/2016 02:24 PM, Nikolaus Rath wrote: >> On Apr 09 2016, Nick Coghlan wrote: > >>> The equivalent motivating use case here is "allow pathlib objects to >>> be used with the open() builtin, os module functions, and os.path >>> module functions". >> >> Why isn't this use case perfectly solved by:
>>
>> def open(obj, *a):
>>     # If pathlib isn't imported, we can't possibly receive
>>     # a Path object.
>>     pathlib = sys.modules.get('pathlib', None)
>>     if pathlib is not None and isinstance(obj, pathlib.Path):
>>         obj = str(obj)
>>     return real_open(obj, *a)
> > pathlib is the primary motivator, but there's no reason to shut > everyone else out. > > By having a well-defined protocol and helper function we not only make > our own lives easier but we also make the lives of third-party > libraries and experimenters easier. To me this sounds like catering to a hypothetical audience that may want to do hypothetical things. If you start with the above, and people complain that their favorite non-pathlib path library is not supported by the stdlib, you can still add a protocol. But if you add a protocol right away, you're stuck with the complexity for a very long time even if almost no one actually uses it. Best, -Nikolaus -- GPG encrypted emails preferred.
Key id: 0xD113FCAC3C4E599F Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F "Time flies like an arrow, fruit flies like a Banana." From random832 at fastmail.com Mon Apr 11 10:57:47 2016 From: random832 at fastmail.com (Random832) Date: Mon, 11 Apr 2016 10:57:47 -0400 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: <87d1pwrysa.fsf@thinkpad.rath.org> References: <1460038026.1126929.571843177.7BC2EE4C@webmail.messagingengine.com> <1460045791.1159102.571987969.5CC10DF6@webmail.messagingengine.com> <87h9f9p37z.fsf@vostro.rath.org> <570AC623.1070701@stoneleaf.us> <87d1pwrysa.fsf@thinkpad.rath.org> Message-ID: <1460386667.3224899.575276065.4DF986EC@webmail.messagingengine.com> On Mon, Apr 11, 2016, at 10:43, Nikolaus Rath wrote: > To me this sounds like catering to a hypothetical audience that may want > to do hypothetical things. If you start with the above, and people > complain that their favorite non-pathlib path library is not supported > by the stdlib, you can still add a protocol. But if you add a protocol > right away, you're stuck with the complexity for a very long time even > if almost no one actually uses it. And there's nothing stopping people from subclassing from Path (should it be PurePath?), or monkey-patching pathlib. In this case, "isinstance(foo, Path) returns true" is the protocol. Also, where is the .path attribute? It's documented, but...
>>> pathlib.Path(".").path
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'PosixPath' object has no attribute 'path'
From ethan at stoneleaf.us Mon Apr 11 11:10:24 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 11 Apr 2016 08:10:24 -0700 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: <1460386667.3224899.575276065.4DF986EC@webmail.messagingengine.com> References: <1460038026.1126929.571843177.7BC2EE4C@webmail.messagingengine.com> <1460045791.1159102.571987969.5CC10DF6@webmail.messagingengine.com> <87h9f9p37z.fsf@vostro.rath.org> <570AC623.1070701@stoneleaf.us> <87d1pwrysa.fsf@thinkpad.rath.org> <1460386667.3224899.575276065.4DF986EC@webmail.messagingengine.com> Message-ID: <570BBE60.1060508@stoneleaf.us> On 04/11/2016 07:57 AM, Random832 wrote: > Also, where is the .path attribute? It's documented, > but... > >>>> pathlib.Path(".").path > Traceback (most recent call last): > File "<stdin>", line 1, in <module> > AttributeError: 'PosixPath' object has no attribute 'path' pathlib is provisional -- `.path` has been committed and will be available with the next releases (assuming we don't change it out for the protocol version).
-- ~Ethan~ From ethan at stoneleaf.us Mon Apr 11 11:13:32 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 11 Apr 2016 08:13:32 -0700 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: <87d1pwrysa.fsf@thinkpad.rath.org> References: <1460038026.1126929.571843177.7BC2EE4C@webmail.messagingengine.com> <1460045791.1159102.571987969.5CC10DF6@webmail.messagingengine.com> <87h9f9p37z.fsf@vostro.rath.org> <570AC623.1070701@stoneleaf.us> <87d1pwrysa.fsf@thinkpad.rath.org> Message-ID: <570BBF1C.9090805@stoneleaf.us> On 04/11/2016 07:43 AM, Nikolaus Rath wrote: > On Apr 10 2016, Ethan Furman wrote: >> By having a well-defined protocol and helper function we not only make >> our own lives easier but we also make the lives of third-party >> libraries and experimenters easier. > > To me this sounds like catering to a hypothetical audience that may want > to do hypothetical things. If you start with the above, and people > complain that their favorite non-pathlib path library is not supported > by the stdlib, you can still add a protocol. But if you add a protocol > right away, you're stuck with the complexity for a very long time even > if almost no one actually uses it. The part of "make our own lives easier" is not a hypothetical audience. Making my own library (antipathy) work seamlessly with pathlib and DirEntry is not hypothetical. 
-- ~Ethan~ From Nikolaus at rath.org Mon Apr 11 17:09:01 2016 From: Nikolaus at rath.org (Nikolaus Rath) Date: Mon, 11 Apr 2016 14:09:01 -0700 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: <570BBF1C.9090805@stoneleaf.us> (Ethan Furman's message of "Mon, 11 Apr 2016 08:13:32 -0700") References: <1460038026.1126929.571843177.7BC2EE4C@webmail.messagingengine.com> <1460045791.1159102.571987969.5CC10DF6@webmail.messagingengine.com> <87h9f9p37z.fsf@vostro.rath.org> <570AC623.1070701@stoneleaf.us> <87d1pwrysa.fsf@thinkpad.rath.org> <570BBF1C.9090805@stoneleaf.us> Message-ID: <871t6bua36.fsf@thinkpad.rath.org> On Apr 11 2016, Ethan Furman wrote: > On 04/11/2016 07:43 AM, Nikolaus Rath wrote: >> On Apr 10 2016, Ethan Furman wrote: > >>> By having a well-defined protocol and helper function we not only make >>> our own lives easier but we also make the lives of third-party >>> libraries and experimenters easier. >> >> To me this sounds like catering to a hypothetical audience that may want >> to do hypothetical things. If you start with the above, and people >> complain that their favorite non-pathlib path library is not supported >> by the stdlib, you can still add a protocol. But if you add a protocol >> right away, you're stuck with the complexity for a very long time even >> if almost no one actually uses it. > > The part of "make our own lives easier" is not a hypothetical > audience. My assumption was that "own" refers to core developers here, while.. > Making my own library (antipathy) work seamlessly with pathlib and > DirEntry is not hypothetical. ..here you seem to be wearing your third-party maintainer hat :-). As far as I can see, implementing a protocol instead of adding a few isinstance checks is more likely to make the life of a CPython developer harder than easier. Best, -Nikolaus -- GPG encrypted emails preferred. 
Key id: 0xD113FCAC3C4E599F Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F "Time flies like an arrow, fruit flies like a Banana." From ethan at stoneleaf.us Mon Apr 11 17:19:39 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 11 Apr 2016 14:19:39 -0700 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: <871t6bua36.fsf@thinkpad.rath.org> References: <1460038026.1126929.571843177.7BC2EE4C@webmail.messagingengine.com> <1460045791.1159102.571987969.5CC10DF6@webmail.messagingengine.com> <87h9f9p37z.fsf@vostro.rath.org> <570AC623.1070701@stoneleaf.us> <87d1pwrysa.fsf@thinkpad.rath.org> <570BBF1C.9090805@stoneleaf.us> <871t6bua36.fsf@thinkpad.rath.org> Message-ID: <570C14EB.1080007@stoneleaf.us> On 04/11/2016 02:09 PM, Nikolaus Rath wrote: > On Apr 11 2016, Ethan Furman wrote: >> On 04/11/2016 07:43 AM, Nikolaus Rath wrote: >>> On Apr 10 2016, Ethan Furman wrote: >> >>>> By having a well-defined protocol and helper function we not only make >>>> our own lives easier but we also make the lives of third-party >>>> libraries and experimenters easier. >>> >>> To me this sounds like catering to a hypothetical audience that may want >>> to do hypothetical things. If you start with the above, and people >>> complain that their favorite non-pathlib path library is not supported >>> by the stdlib, you can still add a protocol. But if you add a protocol >>> right away, you're stuck with the complexity for a very long time even >>> if almost no one actually uses it. >> >> The part of "make our own lives easier" is not a hypothetical >> audience. > > My assumption was that "own" refers to core developers here, while.. It does. >> Making my own library (antipathy) work seamlessly with pathlib and >> DirEntry is not hypothetical. > > ..here you seem to be wearing your third-party maintainer hat :-). I was.
:) > As far as I can see, implementing a protocol instead of adding a few > isinstance checks is more likely to make the life of a CPython developer > harder than easier. I disagree. And the protocol idea was not mine, so apparently other core-devs also disagree (or think it's worth it, regardless). Even without the protocol I would think we'd still make a separate function to check the input. -- ~Ethan~ From d3matt at gmail.com Mon Apr 11 17:22:25 2016 From: d3matt at gmail.com (Matthew Stoltenberg) Date: Mon, 11 Apr 2016 15:22:25 -0600 Subject: [Python-ideas] simpler weakref.finalize Message-ID: Currently, if I want to have weakref.finalize call an object's cleanup method, I have to do something like:

class foo:
    _finalizer = None

    def __init__(self):
        self._finalizer = weakref.finalize(self, foo._cleanup, weakref.WeakMethod(self.cleanup))

    def __del__(self):
        self.cleanup()

    @classmethod
    def _cleanup(cls, func):
        func()()

    def cleanup(self):
        if self._finalizer is not None:
            self._finalizer.detach()
        print('cleaning up')

It wouldn't be difficult to have weakref.finalize automatically handle the conversion to WeakMethod and automatically attempt to dereference then call the function passed in. -------------- next part -------------- An HTML attachment was scrubbed... URL: From keithcu at gmail.com Mon Apr 11 17:36:27 2016 From: keithcu at gmail.com (Keith Curtis) Date: Mon, 11 Apr 2016 17:36:27 -0400 Subject: [Python-ideas] (no subject) Message-ID: Hello again, Many FOSS communities struggle with sustainability, but Python is very rich so you've clearly figured it out. It is easy to say that bug metrics are going in the wrong direction. It is also easy to do nothing, or to criticize those who are concerned about a systemic issue. Data can be turned into action by leadership and plans. It can be helpful to have people who think 5,000 is a big number. (Automatically closing old issues doesn't take grit and isn't a valid solution.)
It seems you could use at least 10 more dedicated devs who focus on the official CPython bug count. As for WebAssembly, if their security is done mostly via their virtual machine, then they won't be able to separate it. However, if Python in the browser can't enable optional access to Numpy, etc., then it will be missing the best reasons to use it. Sometimes security can destroy utility. Just because there are some untrustworthy websites doesn't mean all must be. Warm regards, -Keith From ethan at stoneleaf.us Mon Apr 11 17:46:23 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 11 Apr 2016 14:46:23 -0700 Subject: [Python-ideas] (no subject) In-Reply-To: References: Message-ID: <570C1B2F.30206@stoneleaf.us> On 04/11/2016 02:36 PM, Keith Curtis wrote: > It seems you could use at least 10 more dedicated > devs who focus on the official CPython bug count. At least. So create a Python Fund and let us know where to apply. :) -- ~Ethan~ From ian.g.kelly at gmail.com Mon Apr 11 18:07:42 2016 From: ian.g.kelly at gmail.com (Ian Kelly) Date: Mon, 11 Apr 2016 16:07:42 -0600 Subject: [Python-ideas] (no subject) In-Reply-To: References: Message-ID: On Mon, Apr 11, 2016 at 3:36 PM, Keith Curtis wrote: > As for WebAssembly, if their security is done mostly via their virtual > machine, then they won't be able to separate it. However, if Python in > the browser can't enable optional access to Numpy, etc., then it will be > missing the best reasons to use it. Sometimes security can destroy > utility. Just because there are some untrustworthy websites doesn't > mean all must be. How do you propose to ascertain whether the website the user is visiting is trustworthy? Ask the user whether they trust the author? This sounds like it has great potential to be the security disaster of signed Java applets all over again.
From tjreedy at udel.edu Mon Apr 11 18:45:38 2016 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 11 Apr 2016 18:45:38 -0400 Subject: [Python-ideas] A tuple of various Python suggestions In-Reply-To: References: <8C288144-3C4D-44D0-AC25-CA0F1DD91583@lukasa.co.uk> Message-ID: On 4/11/2016 4:56 AM, Chris Angelico wrote: > On Mon, Apr 11, 2016 at 6:51 PM, Nick Coghlan wrote: >> Many of the oldest issues remain open because they're rare, easily >> worked around, hard to fix, only arguably a bug, or some combination >> of the above. Even reviewing them to see if they're still valid can be >> time consuming. > > Maybe this is where someone like Keith can contribute? Go through a > lot of old issues and inevitably there'll be some that you can > reproduce with the version of Python that was current then, but can't > repro with today's Python, and they can be closed as fixed. Doesn't > take any knowledge of C, and maybe not even of Python (if there's a > good enough test case there). Or: look through old enhancement requests. There are many that were posted before python-ideas existed. Today, they would likely be posted here first, and if no support (as happens more often than not), never posted. Or is posted first to bugs, told to go to python-ideas for discussion. Anyone could repost old ideas to python-list now and post the result of discussion back to the tracker. If the discussion suggests rejection, then a coredev can close the issue. (I certainly would.) Closure is not erasure, so the idea is there permanently anyway. -- Terry Jan Reedy From wes.turner at gmail.com Mon Apr 11 18:47:07 2016 From: wes.turner at gmail.com (Wes Turner) Date: Mon, 11 Apr 2016 17:47:07 -0500 Subject: [Python-ideas] (no subject) In-Reply-To: <570C1B2F.30206@stoneleaf.us> References: <570C1B2F.30206@stoneleaf.us> Message-ID: Python Software Foundation (PSF) | Web: https://www.python.org/psf/ | Twitter: https://twitter.com/ThePSF PSF accepts donations, yeah. 
* https://www.python.org/psf/donations/ * Other ways to fund (additions to, fixes for, idle talk about) open source projects: * Crowdfunding campaign (specific) https://en.wikipedia.org/wiki/Crowdfunding#Crowdfunding_platforms * Bounties (specific / open) https://en.wikipedia.org/wiki/Open-source_bounty * "Hire a developer" - https://www.python.org/jobs/ - http://www.bls.gov/ooh/computer-and-information-technology/home.htm https://en.wikipedia.org/wiki/Business_models_for_open-source_software On Apr 11, 2016 4:48 PM, "Ethan Furman" wrote: > On 04/11/2016 02:36 PM, Keith Curtis wrote: > > It seems you could use at least 10 more dedicated >> devs who focus on the official CPython bug count. >> > > At least. So create a Python Fund and let us know where to apply. :) > > -- > ~Ethan~ > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Mon Apr 11 22:58:05 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 12 Apr 2016 12:58:05 +1000 Subject: [Python-ideas] (no subject) In-Reply-To: References: Message-ID: On 12 April 2016 at 07:36, Keith Curtis wrote: > Hello again, > > Many FOSS communities struggle with sustainability, but Python is very > rich so you've clearly figured it out. > > It is easy to say that bug metrics are going in the wrong direction. > It is also easy to do nothing, or to criticize those who are concerned > about a systemic issue. Data can be turned into action by leadership > and plans. It can be helpful to have people who think 5,000 is a big > number. (Automatically closing old issues doesn't take grit and isn't > a valid solution.) It seems you could use at least 10 more dedicated > devs who focus on the official CPython bug count. Keith, I get it. 
You're worried about the issue tracker stats, and apparently believe if you just yell long enough and hard enough here we'll suddenly go "You know, you're right, we never thought of that, and we should drop everything else immediately in favour of seeking funding for full-time core development work". However, CPython core development is only *one* of the activities the PSF helps to support (see [1] for a partial list of others), and it's one where commercial entities can most readily contribute people's time and energy directly rather than indirectly through the Python Software Foundation. As core developers, we're individually free to add our details to the Motivations & Affiliations page at [2] and negotiate with our current and future employers for dedicated time to devote to general CPython maintenance, rather than focusing solely on specific items relevant to our work. Folks that aren't core developers yet, but are fortunate enough to work for organisations with a good career planning process and a vested interest in Python's continued success are free to negotiate with their managers to add "become a CPython core developer and spend some of my working hours on general CPython maintenance" to their individual career goals. Any core developer that chooses to do so is also already free to submit a development grant proposal to the PSF to dedicate some of their time to issue tracker grooming, and it's a fair bet (although not a guarantee) that any such grant proposal would be approved as long as the hourly rate and total amount requested were reasonable, and the activities to be pursued and the desired outcome were defined clearly. However, whether or not anyone chooses to do any of those things is a decision that takes place in the context of that "health, relationships, paid work, volunteer work" priority order I mentioned earlier. 
Not everyone is going to want to turn a volunteer activity into a paid one, and not everyone is going to want to prioritise CPython core development over their other activities. Telling people "your priorities should be different because I say they should be different" is an approach that has never worked in volunteer management, and never *will* work in volunteer management, as overcoming those differences in intrinsic motivation is the key rationale for paid employment. Regards, Nick. [1] https://wiki.python.org/moin/PythonSoftwareFoundation/Proposals/StrategicPriorities [2] https://docs.python.org/devguide/motivations.html -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ben+python at benfinney.id.au Mon Apr 11 23:08:45 2016 From: ben+python at benfinney.id.au (Ben Finney) Date: Tue, 12 Apr 2016 13:08:45 +1000 Subject: [Python-ideas] (no subject) References: Message-ID: <85bn5f1q2q.fsf@benfinney.id.au> Keith Curtis writes: > It is easy to say that bug metrics are going in the wrong direction. > It is also easy to do nothing, or to criticize those who are concerned > about a systemic issue. This is astounding hubris. You started several threads unprompted and made many posts this week, which can all be fairly characterised as criticism without actionable solutions. How is that usefully distinct from "do nothing", except for occupying time in apparently fruitless nagging? -- \ "Corporation, n. An ingenious device for obtaining individual | `\ profit without individual responsibility."
--Ambrose Bierce, | _o__) _The Devil's Dictionary_, 1906 | Ben Finney From nicholas.chammas at gmail.com Tue Apr 12 10:39:28 2016 From: nicholas.chammas at gmail.com (Nicholas Chammas) Date: Tue, 12 Apr 2016 14:39:28 +0000 Subject: [Python-ideas] (no subject) In-Reply-To: References: Message-ID: On Mon, Apr 11, 2016 at 10:58 PM Nick Coghlan wrote: > However, whether or not anyone chooses to do any of those things is a > decision that takes place in the context of that "health, > relationships, paid work, volunteer work" priority order I mentioned > earlier. Not everyone is going to want to turn a volunteer activity > into a paid one, and not everyone is going to want to prioritise > CPython core development over their other activities. Telling people > "your priorities should be different because I say they should be > different" is an approach that has never worked in volunteer > management, and never *will* work in volunteer management, as > overcoming those differences in intrinsic motivation is the key > rationale for paid employment. > I think this gets to the core of how work gets done in a volunteer organization, and addresses a common misunderstanding people have. Does anyone know of a blog post that expands on this point? I think it would be a good resource to point people to in situations like this. It's easy to think that Python is somehow developed by an organization that decides top-down what's going to get done and who's going to do what, allocating resources as needed. After all, that's how a typical company works. In the Python community (and perhaps in most open-source communities not dominated by a backing company) I gather the rules are very different. Nick -------------- next part -------------- An HTML attachment was scrubbed...
URL: From stefan at bytereef.org Tue Apr 12 13:07:02 2016 From: stefan at bytereef.org (Stefan Krah) Date: Tue, 12 Apr 2016 17:07:02 +0000 (UTC) Subject: [Python-ideas] (no subject) References: Message-ID: Nicholas Chammas writes: > It's easy to think that Python is somehow developed by an organization that decides top-down what's going to get done and who's going to do what, allocating resources as needed. After all, that's how a typical company works. > > In the Python community (and perhaps in most open-source communities not dominated by a backing company) I gather the rules are very different. This is an excellent point. I think that one of the problems is that the Python website is entirely dominated by the PSF. Perhaps it is time to put a little more emphasis on development again. Stefan Krah From Nikolaus at rath.org Tue Apr 12 13:17:52 2016 From: Nikolaus at rath.org (Nikolaus Rath) Date: Tue, 12 Apr 2016 10:17:52 -0700 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: <570C14EB.1080007@stoneleaf.us> (Ethan Furman's message of "Mon, 11 Apr 2016 14:19:39 -0700") References: <1460038026.1126929.571843177.7BC2EE4C@webmail.messagingengine.com> <1460045791.1159102.571987969.5CC10DF6@webmail.messagingengine.com> <87h9f9p37z.fsf@vostro.rath.org> <570AC623.1070701@stoneleaf.us> <87d1pwrysa.fsf@thinkpad.rath.org> <570BBF1C.9090805@stoneleaf.us> <871t6bua36.fsf@thinkpad.rath.org> <570C14EB.1080007@stoneleaf.us> Message-ID: <878u0ivj9b.fsf@thinkpad.rath.org> On Apr 11 2016, Ethan Furman wrote: >> As far as I can see, implementing a protocol instead of adding a few >> isinstance checks is more likely to make the life of a CPython developer >> harder than easier. > > I disagree. And the protocol idea was not mine, so apparently other > core-devs also disagree (or think it's worth it, regardless). 
I haven't found any email explaining why a protocol would make things easier than the isinstance() approach (and I read most of the threads both here and on -dev), so I was assuming that the core-devs in question don't disagree but haven't considered the second approach. But this will be my last mail advocating it. If there's still no interest, then there probably is a reason for it :-). Best, -Nikolaus -- GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F "Time flies like an arrow, fruit flies like a Banana." From brett at python.org Tue Apr 12 13:33:38 2016 From: brett at python.org (Brett Cannon) Date: Tue, 12 Apr 2016 17:33:38 +0000 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: <878u0ivj9b.fsf@thinkpad.rath.org> References: <1460038026.1126929.571843177.7BC2EE4C@webmail.messagingengine.com> <1460045791.1159102.571987969.5CC10DF6@webmail.messagingengine.com> <87h9f9p37z.fsf@vostro.rath.org> <570AC623.1070701@stoneleaf.us> <87d1pwrysa.fsf@thinkpad.rath.org> <570BBF1C.9090805@stoneleaf.us> <871t6bua36.fsf@thinkpad.rath.org> <570C14EB.1080007@stoneleaf.us> <878u0ivj9b.fsf@thinkpad.rath.org> Message-ID: On Tue, 12 Apr 2016 at 10:18 Nikolaus Rath wrote: > On Apr 11 2016, Ethan Furman < ethan-gcWI5d7PMXnvaiG9KC9N7Q at public.gmane.org> wrote: > >> As far as I can see, implementing a protocol instead of adding a few > >> isinstance checks is more likely to make the life of a CPython developer > >> harder than easier. > > > > I disagree. And the protocol idea was not mine, so apparently other > > core-devs also disagree (or think it's worth it, regardless). > > I haven't found any email explaining why a protocol would make things > easier than the isinstance() approach (and I read most of the threads > both here and on -dev), so I was assuming that the core-devs in question > don't disagree but haven't considered the second approach. > I disagree with the idea.
:) Type checking tends to be too strict as it prevents duck typing. And locking ourselves to only what's in the stdlib is way too restrictive when there are alternative path libraries out there already. Plus it's short-sighted to assume no one will ever come up with a better path library, and so tying us down to only pathlib.PurePath explicitly would be a mistake long-term when specifying a single method keeps things flexible (this is the same reason __index__() exists and we simply don't say indexing has to be with a subclass of int or something). -Brett -------------- next part -------------- An HTML attachment was scrubbed... URL: From bzvi7919 at gmail.com Tue Apr 12 17:40:33 2016 From: bzvi7919 at gmail.com (Bar Harel) Date: Tue, 12 Apr 2016 21:40:33 +0000 Subject: [Python-ideas] Modifying yield from's return value Message-ID: I asked a question in stackoverflow regarding a way of modifying yield from's return value. There doesn't seem to be any way of modifying the data yielded by the yield from expression without breaking the yield from "pipe". By "pipe" I mean the fact that .send() and .throw() pass to the inner generator. Useful cases are parsers as I've demonstrated in the question, transcoders, encoding and decoding the yielded values without interrupting .send() or .throw() and generally coroutines. I believe it would be beneficial to many aspects and libraries in Python, most notably asyncio. I couldn't think of a good syntax though, other than creating a wrapping function, lets say in itertools, that creates a class overriding send, throw and __next__ and receives the generator and associated modification functions (like suggested in one of the answers). What do you think? Is it useful? Any suggested syntax? -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From rob.cliffe at btinternet.com Tue Apr 12 17:40:32 2016 From: rob.cliffe at btinternet.com (Rob Cliffe) Date: Tue, 12 Apr 2016 22:40:32 +0100 Subject: [Python-ideas] random.choice on non-sequence Message-ID: <570D6B50.9070005@btinternet.com> It surprised me a bit the first time I realised that random.choice did not work on a set. (One expects "everything" to "just work" in Python! :-) ) There is a simple workaround (convert the set to a tuple or list before passing it) but ... On 2011-06-23, Sven Marnach posted 'A few suggestions for the random module' on Python-Ideas, one of which was to allow random.choice to work on an arbitrary iterable. Raymond Hettinger commented "It could be generalized to arbitrary iterables (Bentley provides an example of how to do this) but it is fragile (i.e. falls apart badly with weak random number generators) and doesn't correspond well with real use cases." Sven replied "Again, there definitely are real world use cases ..." I would like to revive this suggestion. I don't understand Raymond's comment. It seems to me that a reasonable implementation is better and more friendly than none, particularly for newbies. This is the source code for random.choice in Python 3.2 (as far as I know it hasn't changed since): def choice(self, seq): """Choose a random element from a non-empty sequence.""" try: i = self._randbelow(len(seq)) except ValueError: raise IndexError('Cannot choose from an empty sequence') return seq[i] And here is my proposed (tested) version: def choice(self, it): """Choose a random element from a non-empty iterable.""" try: it[0] except IndexError: raise IndexError('Cannot choose from an empty sequence') except TypeError: it = tuple(it) try: i = self._randbelow(len(it)) except ValueError: raise IndexError('Cannot choose from an empty iterable') return it[i] This works on (e.g.) a list/tuple/string, a set, a dictionary view or a generator. Obviously the generator has to be 'fresh' (i.e. 
any previously consumed values will be excluded from the choice) and 'throw-away' (random.choice will exhaust it). But it means you can write code like random.choice(x for x in xrange(10) if x%3) # this feels quite "Pythonic" to me! (I experimented with versions that, when passed an object that supported len() but not indexing, viz. a set, iterated through the object as far as necessary instead of converting it to a list. But I found that this was slower in practice as well as more complicated. There might be a memory issue with huge iterables, but we are no worse off using existing workarounds for those.) Downsides of proposed version: (1) Backward compatibility. Could mask an error if a "wrong" object, but one that happens to be an iterable, is passed to random.choice. There is also slightly different behaviour if a dictionary (D) is passed (an unusual thing to do): The current version will get a random integer in the range [0, len(D)), try to use that as a key of D, and raise KeyError if that fails. The proposed version behaves similarly except that it will always raise "KeyError: 0" if 0 is not a key of D. One could argue that this is an improvement, if it matters at all (error reproducibility). And code that deliberately exploits the existing behaviour, provided that it uses a dictionary whose keys are consecutive integers from 0 upwards, e.g. D = { 0 : 'zero', 1 : 'one', 2 : 'two', 3 : 'three', 4 : 'four' } d = random.choice(D) # equivalent, in this example, to "d = random.choice(list(D.values()))" if d == 'zero': ... will continue to work (although one might argue that it deserves not to). (2) Speed - the proposed version is almost certainly slightly slower when called with a sequence. For what it's worth, I tried to measure the difference on my several-years-old Windows PC, but it was hard to measure accurately, even over millions of iterations.
All I can say is that it appeared to be a small fraction of a millisecond, possibly in the region of 50 nanoseconds, per call. Best wishes Rob Cliffe From guido at python.org Tue Apr 12 17:53:17 2016 From: guido at python.org (Guido van Rossum) Date: Tue, 12 Apr 2016 14:53:17 -0700 Subject: [Python-ideas] random.choice on non-sequence In-Reply-To: <570D6B50.9070005@btinternet.com> References: <570D6B50.9070005@btinternet.com> Message-ID: The problem with your version is that copying the input is slow if it is large. Raymond was referencing the top answer here: http://stackoverflow.com/questions/2394246/algorithm-to-select-a-single-random-combination-of-values. It's also slow though (draws N random values if the input is of length N). On Tue, Apr 12, 2016 at 2:40 PM, Rob Cliffe wrote: > It surprised me a bit the first time I realised that random.choice did not > work on a set. (One expects "everything" to "just work" in Python! :-) ) > There is a simple workaround (convert the set to a tuple or list before > passing it) but ... > > On 2011-06-23, Sven Marnach posted 'A few suggestions for the random module' > on Python-Ideas, one of which was to allow random.choice to work on an > arbitrary iterable. > Raymond Hettinger commented > "It could be generalized to arbitrary iterables (Bentley provides an > example of how to do this) but it is fragile (i.e. falls apart badly with > weak random number generators) and doesn't correspond well with real use > cases." > Sven replied > "Again, there definitely are real world use cases ..." > > I would like to revive this suggestion. I don't understand Raymond's > comment. It seems to me that a reasonable implementation is better and more > friendly than none, particularly for newbies. 
> This is the source code for random.choice in Python 3.2 (as far as I know it > hasn't changed since): > > > def choice(self, seq): > """Choose a random element from a non-empty sequence.""" > try: > i = self._randbelow(len(seq)) > except ValueError: > raise IndexError('Cannot choose from an empty sequence') > return seq[i] > > > And here is my proposed (tested) version: > > > def choice(self, it): > """Choose a random element from a non-empty iterable.""" > try: > it[0] > except IndexError: > raise IndexError('Cannot choose from an empty sequence') > except TypeError: > it = tuple(it) > try: > i = self._randbelow(len(it)) > except ValueError: > raise IndexError('Cannot choose from an empty iterable') > return it[i] > > > This works on (e.g.) a list/tuple/string, a set, a dictionary view or a > generator. > Obviously the generator has to be 'fresh' (i.e. any previously consumed > values will be excluded from the choice) and 'throw-away' (random.choice > will exhaust it). > But it means you can write code like > random.choice(x for x in xrange(10) if x%3) # this feels quite > "Pythonic" to me! > > (I experimented with versions that, when passed an object that supported > len() but not indexing, viz. a set, iterated through the object as far as > necessary instead of converting it to a list. But I found that this was > slower in practice as well as more complicated. There might be a memory > issue with huge iterables, but we are no worse off using existing > workarounds for those.) > > Downsides of proposed version: > (1) Backward compatibility. Could mask an error if a "wrong" object, > but one that happens to be an iterable, is passed to random.choice. > There is also slightly different behaviour if a dictionary (D) is > passed (an unusual thing to do): > The current version will get a random integer in the range > [0, len(D)), try to use that as a key of D, > and raise KeyError if that fails. 
> The proposed version behaves similarly except that it will > always raise "KeyError: 0" if 0 is not a key of D. > One could argue that this is an improvement, if it matters > at all (error reproducibility). > And code that deliberately exploits the existing behaviour, > provided that it uses a dictionary whose keys are consecutive integers from > 0 upwards, e.g. > D = { 0 : 'zero', 1 : 'one', 2 : 'two', 3 : 'three', 4 : > 'four' } > d = random.choice(D) # equivalent, in this example, > to "d = random.choice(list(D.values()))" > if d == 'zero': > ... > will continue to work (although one might argue that it > deserves not to). > (2) Speed - the proposed version is almost certainly slightly slower > when called with a sequence. > For what it's worth, I tried to measure the difference on my > several-years-old Windows PC, but is was hard to measure accurately, even > over millions of iterations. > All I can say is that it appeared to be a small fraction of a > millisecond, possibly in the region of 50 nanoseconds, per call. > > Best wishes > Rob Cliffe > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -- --Guido van Rossum (python.org/~guido) From joejev at gmail.com Tue Apr 12 17:55:17 2016 From: joejev at gmail.com (Joseph Jevnik) Date: Tue, 12 Apr 2016 17:55:17 -0400 Subject: [Python-ideas] Modifying yield from's return value In-Reply-To: References: Message-ID: I have written a library `cotoolz` that provides the primitive pieces needed for this. It implements a `comap` type which is like `map` but properly forwards `send`, `throw`, and `close` to the underlying coroutine. 
There is no special syntax needed, just write: ``` yield from comap(f, inner_coroutine()) ``` This also supports `cozip` which lets you zip together multiple coroutines, for example: ``` yield from cozip(inner_coroutine_a(), inner_coroutine_b(), inner_coroutine_c(), ...) ``` This will fan out the sends to all of the coroutines and collect the results into a single tuple to yield. Just like `zip`, this will be exhausted when the first coroutine is exhausted. The library is available as free software on pypi or on here: https://github.com/llllllllll/cotoolz On Tue, Apr 12, 2016 at 5:40 PM, Bar Harel wrote: > I asked a question in stackoverflow > regarding a way of > modifying yield from's return value. > There doesn't seem to be any way of modifying the data yielded by the > yield from expression without breaking the yield from "pipe". By "pipe" I > mean the fact that .send() and .throw() pass to the inner generator. Useful > cases are parsers as I've demonstrated in the question, transcoders, > encoding and decoding the yielded values without interrupting .send() or > .throw() and generally coroutines. > > I believe it would be beneficial to many aspects and libraries in Python, > most notably asyncio. > > I couldn't think of a good syntax though, other than creating a wrapping > function, let's say in itertools, that creates a class overriding send, > throw and __next__ and receives the generator and associated modification > functions (like suggested in one of the answers). > > What do you think? Is it useful? Any suggested syntax? > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed...
URL: From bzvi7919 at gmail.com Tue Apr 12 18:02:40 2016 From: bzvi7919 at gmail.com (Bar Harel) Date: Tue, 12 Apr 2016 22:02:40 +0000 Subject: [Python-ideas] Modifying yield from's return value In-Reply-To: References: Message-ID: Perhaps something like this should go in the standard library? Or change the builtin map to forward everything? Changing the builtin map in this case will be backwards compatible and will overall be beneficial. If someone made a workaround, the workaround would still work, but now the builtins will do it per se. On Wed, Apr 13, 2016 at 12:55 AM Joseph Jevnik wrote: > I have written a library `cotoolz` that provides the primitive pieces > needed for this. It implements a `comap` type which is like `map` but > properly forwards `send`, `throw`, and `close` to the underlying coroutine. > There is no special syntax needed, just write: > > ``` > yield from comap(f, inner_coroutine()) > ``` > > This also supports `cozip` which lets you zip together multiple > coroutines, for example: > > ``` > yield from cozip(inner_coroutine_a(), inner_corouting_b(), > inner_coroutine_c(), ...) > ``` > > This will fan out the sends to all of the coroutines and collect the > results into a single tuple to yield. Just like `zip`, this will be > exchausted when the first coroutine is exhausted. > > The library is available as free software on pypi or on here: > https://github.com/llllllllll/cotoolz > > On Tue, Apr 12, 2016 at 5:40 PM, Bar Harel wrote: > >> I asked a question in stackoverflow >> regarding a way of >> modifying yield from's return value. >> There doesn't seem to be any way of modifying the data yielded by the >> yield from expression without breaking the yield from "pipe". By "pipe" I >> mean the fact that .send() and .throw() pass to the inner generator. Useful >> cases are parsers as I've demonstrated in the question, transcoders, >> encoding and decoding the yielded values without interrupting .send() or >> .throw() and generally coroutines. 
>> >> I believe it would be beneficial to many aspects and libraries in Python, >> most notably asyncio. >> >> I couldn't think of a good syntax though, other than creating a wrapping >> function, lets say in itertools, that creates a class overriding send, >> throw and __next__ and receives the generator and associated modification >> functions (like suggested in one of the answers). >> >> What do you think? Is it useful? Any suggested syntax? >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cognetta.marco at gmail.com Tue Apr 12 18:09:40 2016 From: cognetta.marco at gmail.com (Marco Cognetta) Date: Tue, 12 Apr 2016 18:09:40 -0400 Subject: [Python-ideas] random.choice on non-sequence In-Reply-To: References: <570D6B50.9070005@btinternet.com> Message-ID: Maybe I am misunderstanding but for a generator, couldn't you avoid storing it in memory and just using the following online algorithm to save space? http://stackoverflow.com/a/23352100 (I hope this message goes through correctly, first time participating in a discussion on this mailing list). -Marco On Tue, Apr 12, 2016 at 5:53 PM, Guido van Rossum wrote: > The problem with your version is that copying the input is slow if it is large. > > Raymond was referencing the top answer here: > http://stackoverflow.com/questions/2394246/algorithm-to-select-a-single-random-combination-of-values. > It's also slow though (draws N random values if the input is of length > N). > > On Tue, Apr 12, 2016 at 2:40 PM, Rob Cliffe wrote: >> It surprised me a bit the first time I realised that random.choice did not >> work on a set. (One expects "everything" to "just work" in Python! 
:-) ) >> There is a simple workaround (convert the set to a tuple or list before >> passing it) but ... >> >> On 2011-06-23, Sven Marnach posted 'A few suggestions for the random module' >> on Python-Ideas, one of which was to allow random.choice to work on an >> arbitrary iterable. >> Raymond Hettinger commented >> "It could be generalized to arbitrary iterables (Bentley provides an >> example of how to do this) but it is fragile (i.e. falls apart badly with >> weak random number generators) and doesn't correspond well with real use >> cases." >> Sven replied >> "Again, there definitely are real world use cases ..." >> >> I would like to revive this suggestion. I don't understand Raymond's >> comment. It seems to me that a reasonable implementation is better and more >> friendly than none, particularly for newbies. >> This is the source code for random.choice in Python 3.2 (as far as I know it >> hasn't changed since): >> >> >> def choice(self, seq): >> """Choose a random element from a non-empty sequence.""" >> try: >> i = self._randbelow(len(seq)) >> except ValueError: >> raise IndexError('Cannot choose from an empty sequence') >> return seq[i] >> >> >> And here is my proposed (tested) version: >> >> >> def choice(self, it): >> """Choose a random element from a non-empty iterable.""" >> try: >> it[0] >> except IndexError: >> raise IndexError('Cannot choose from an empty sequence') >> except TypeError: >> it = tuple(it) >> try: >> i = self._randbelow(len(it)) >> except ValueError: >> raise IndexError('Cannot choose from an empty iterable') >> return it[i] >> >> >> This works on (e.g.) a list/tuple/string, a set, a dictionary view or a >> generator. >> Obviously the generator has to be 'fresh' (i.e. any previously consumed >> values will be excluded from the choice) and 'throw-away' (random.choice >> will exhaust it). >> But it means you can write code like >> random.choice(x for x in xrange(10) if x%3) # this feels quite >> "Pythonic" to me! 
>> >> (I experimented with versions that, when passed an object that supported >> len() but not indexing, viz. a set, iterated through the object as far as >> necessary instead of converting it to a list. But I found that this was >> slower in practice as well as more complicated. There might be a memory >> issue with huge iterables, but we are no worse off using existing >> workarounds for those.) >> >> Downsides of proposed version: >> (1) Backward compatibility. Could mask an error if a "wrong" object, >> but one that happens to be an iterable, is passed to random.choice. >> There is also slightly different behaviour if a dictionary (D) is >> passed (an unusual thing to do): >> The current version will get a random integer in the range >> [0, len(D)), try to use that as a key of D, >> and raise KeyError if that fails. >> The proposed version behaves similarly except that it will >> always raise "KeyError: 0" if 0 is not a key of D. >> One could argue that this is an improvement, if it matters >> at all (error reproducibility). >> And code that deliberately exploits the existing behaviour, >> provided that it uses a dictionary whose keys are consecutive integers from >> 0 upwards, e.g. >> D = { 0 : 'zero', 1 : 'one', 2 : 'two', 3 : 'three', 4 : >> 'four' } >> d = random.choice(D) # equivalent, in this example, >> to "d = random.choice(list(D.values()))" >> if d == 'zero': >> ... >> will continue to work (although one might argue that it >> deserves not to). >> (2) Speed - the proposed version is almost certainly slightly slower >> when called with a sequence. >> For what it's worth, I tried to measure the difference on my >> several-years-old Windows PC, but is was hard to measure accurately, even >> over millions of iterations. >> All I can say is that it appeared to be a small fraction of a >> millisecond, possibly in the region of 50 nanoseconds, per call. 
>> >> Best wishes >> Rob Cliffe >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ > > > > -- > --Guido van Rossum (python.org/~guido) > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From ethan at stoneleaf.us Tue Apr 12 18:21:11 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Tue, 12 Apr 2016 15:21:11 -0700 Subject: [Python-ideas] random.choice on non-sequence In-Reply-To: References: <570D6B50.9070005@btinternet.com> Message-ID: <570D74D7.5070508@stoneleaf.us> On 04/12/2016 02:53 PM, Guido van Rossum wrote: > The problem with your version is that copying the input is slow if it is large. > > Raymond was referencing the top answer here: > http://stackoverflow.com/questions/2394246/algorithm-to-select-a-single-random-combination-of-values. > It's also slow though (draws N random values if the input is of length > N). So the objection is that because performance can vary widely it is better for users to select their own algorithm rather than rely on a one-size-fits-all stdlib solution? -- ~Ethan~ From guido at python.org Tue Apr 12 18:25:05 2016 From: guido at python.org (Guido van Rossum) Date: Tue, 12 Apr 2016 15:25:05 -0700 Subject: [Python-ideas] random.choice on non-sequence In-Reply-To: References: <570D6B50.9070005@btinternet.com> Message-ID: Ow, sorry! That's the algorithm I meant to reference (the link I actually gave is for a different situation). See also https://en.wikipedia.org/wiki/Reservoir_sampling. It does indeed avoid storing a copy of the sequence -- at the cost of N calls to random(). 
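[In sketch form, the k=1 case of reservoir sampling is just this one-pass loop. This is an illustration of the trade-off, not stdlib code, and the function name is made up:]

```python
import random

def choice_any(iterable, rng=random.random):
    """Uniform random choice from any iterable in a single pass
    (reservoir sampling with k=1): O(n) time, O(1) extra memory,
    but one rng() call per element."""
    chosen = None
    n = 0
    for item in iterable:
        n += 1
        # Keep the new item with probability 1/n.  After the loop,
        # each of the n items was retained with probability exactly 1/n.
        if rng() * n < 1.0:
            chosen = item
    if n == 0:
        raise IndexError('Cannot choose from an empty iterable')
    return chosen
```

This works on sets, dict views, and one-shot generators alike, which is exactly why it costs n calls to random() instead of the single call the sequence version needs.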
Try timing it -- I wouldn't be surprised if copying a set of N elements into a tuple is a lot faster than N random() calls. So we're stuck with two alternatives, neither of which is always the best: (1) just copy the sequence (like Rob's proposal) -- this loses big time if the sequence is large and not already held in memory; (2) use reservoir sampling, at the cost of many more random() calls. Since Rob didn't link to the quoted conversation I don't have it handy to guess what Raymond meant with "Bentley" either, but I'm guessing Jon Bentley wrote about reservoir sampling (before it was named that). I recall hearing about the algorithm for the first time in the late '80s from a coworker with Bell Labs ties. (Later, Ethan wrote) > So the objection is that because performance can vary widely it is better for users to select their own algorithm rather than rely on a one-size-fits-all stdlib solution? That's pretty much it. Were this 1995, I probably would have put some toy in the stdlib. But these days we should be more careful than that. --Guido On Tue, Apr 12, 2016 at 3:09 PM, Marco Cognetta wrote: > Maybe I am misunderstanding but for a generator, couldn't you avoid > storing it in memory and just using the following online algorithm to > save space? http://stackoverflow.com/a/23352100 > > (I hope this message goes through correctly, first time participating > in a discussion on this mailing list). > > -Marco > > On Tue, Apr 12, 2016 at 5:53 PM, Guido van Rossum wrote: >> The problem with your version is that copying the input is slow if it is large. >> >> Raymond was referencing the top answer here: >> http://stackoverflow.com/questions/2394246/algorithm-to-select-a-single-random-combination-of-values. >> It's also slow though (draws N random values if the input is of length >> N). >> >> On Tue, Apr 12, 2016 at 2:40 PM, Rob Cliffe wrote: >>> It surprised me a bit the first time I realised that random.choice did not >>> work on a set. 
(One expects "everything" to "just work" in Python! :-) ) >>> There is a simple workaround (convert the set to a tuple or list before >>> passing it) but ... >>> >>> On 2011-06-23, Sven Marnach posted 'A few suggestions for the random module' >>> on Python-Ideas, one of which was to allow random.choice to work on an >>> arbitrary iterable. >>> Raymond Hettinger commented >>> "It could be generalized to arbitrary iterables (Bentley provides an >>> example of how to do this) but it is fragile (i.e. falls apart badly with >>> weak random number generators) and doesn't correspond well with real use >>> cases." >>> Sven replied >>> "Again, there definitely are real world use cases ..." >>> >>> I would like to revive this suggestion. I don't understand Raymond's >>> comment. It seems to me that a reasonable implementation is better and more >>> friendly than none, particularly for newbies. >>> This is the source code for random.choice in Python 3.2 (as far as I know it >>> hasn't changed since): >>> >>> >>> def choice(self, seq): >>> """Choose a random element from a non-empty sequence.""" >>> try: >>> i = self._randbelow(len(seq)) >>> except ValueError: >>> raise IndexError('Cannot choose from an empty sequence') >>> return seq[i] >>> >>> >>> And here is my proposed (tested) version: >>> >>> >>> def choice(self, it): >>> """Choose a random element from a non-empty iterable.""" >>> try: >>> it[0] >>> except IndexError: >>> raise IndexError('Cannot choose from an empty sequence') >>> except TypeError: >>> it = tuple(it) >>> try: >>> i = self._randbelow(len(it)) >>> except ValueError: >>> raise IndexError('Cannot choose from an empty iterable') >>> return it[i] >>> >>> >>> This works on (e.g.) a list/tuple/string, a set, a dictionary view or a >>> generator. >>> Obviously the generator has to be 'fresh' (i.e. any previously consumed >>> values will be excluded from the choice) and 'throw-away' (random.choice >>> will exhaust it). 
>>> But it means you can write code like >>> random.choice(x for x in xrange(10) if x%3) # this feels quite >>> "Pythonic" to me! >>> >>> (I experimented with versions that, when passed an object that supported >>> len() but not indexing, viz. a set, iterated through the object as far as >>> necessary instead of converting it to a list. But I found that this was >>> slower in practice as well as more complicated. There might be a memory >>> issue with huge iterables, but we are no worse off using existing >>> workarounds for those.) >>> >>> Downsides of proposed version: >>> (1) Backward compatibility. Could mask an error if a "wrong" object, >>> but one that happens to be an iterable, is passed to random.choice. >>> There is also slightly different behaviour if a dictionary (D) is >>> passed (an unusual thing to do): >>> The current version will get a random integer in the range >>> [0, len(D)), try to use that as a key of D, >>> and raise KeyError if that fails. >>> The proposed version behaves similarly except that it will >>> always raise "KeyError: 0" if 0 is not a key of D. >>> One could argue that this is an improvement, if it matters >>> at all (error reproducibility). >>> And code that deliberately exploits the existing behaviour, >>> provided that it uses a dictionary whose keys are consecutive integers from >>> 0 upwards, e.g. >>> D = { 0 : 'zero', 1 : 'one', 2 : 'two', 3 : 'three', 4 : >>> 'four' } >>> d = random.choice(D) # equivalent, in this example, >>> to "d = random.choice(list(D.values()))" >>> if d == 'zero': >>> ... >>> will continue to work (although one might argue that it >>> deserves not to). >>> (2) Speed - the proposed version is almost certainly slightly slower >>> when called with a sequence. >>> For what it's worth, I tried to measure the difference on my >>> several-years-old Windows PC, but is was hard to measure accurately, even >>> over millions of iterations. 
>>> All I can say is that it appeared to be a small fraction of a >>> millisecond, possibly in the region of 50 nanoseconds, per call. >>> >>> Best wishes >>> Rob Cliffe >>> _______________________________________________ >>> Python-ideas mailing list >>> Python-ideas at python.org >>> https://mail.python.org/mailman/listinfo/python-ideas >>> Code of Conduct: http://python.org/psf/codeofconduct/ >> >> >> >> -- >> --Guido van Rossum (python.org/~guido) >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -- --Guido van Rossum (python.org/~guido) From ben+python at benfinney.id.au Tue Apr 12 18:57:47 2016 From: ben+python at benfinney.id.au (Ben Finney) Date: Wed, 13 Apr 2016 08:57:47 +1000 Subject: [Python-ideas] random.choice on non-sequence References: <570D6B50.9070005@btinternet.com> Message-ID: <85inzmzb84.fsf@benfinney.id.au> Rob Cliffe writes: > It surprised me a bit the first time I realised that random.choice did > not work on a set. (One expects "everything" to "just work" in Python! > :-) ) I am +1 on the notion that an instance of 'tuple', 'list', 'set', and even 'dict', should each be accepted as input for 'random.choice'. > On 2011-06-23, Sven Marnach posted 'A few suggestions for the random > module' on Python-Ideas, one of which was to allow random.choice to > work on an arbitrary iterable. I am -1 on the notion of an arbitrary iterable as 'random.choice' input. Arbitrary iterables may be consumed merely by being iterated.
This will force the caller to make a copy, which may as well be a normal sequence type, defeating the purpose of accepting that 'arbitrary iterable' type. Arbitrary iterables may never finish iterating. This means the call to 'random.choice' would sometimes never return. The 'random.choice' function should IMO not be responsible for dealing with those cases. Instead, a more moderate proposal would be to have 'random.choice' accept an arbitrary container. If the object implements the container protocol (by which I think I mean that it conforms to 'collections.abc.Container'), it is safe to iterate and to treat its items as a collection from which to choose a random item. Rob, does that proposal satisfy the requirements that motivated this? -- \ "Some people, when confronted with a problem, think 'I know, | `\ I'll use regular expressions'. Now they have two problems." | _o__) --Jamie Zawinski, in alt.religion.emacs | Ben Finney From guido at python.org Tue Apr 12 19:04:11 2016 From: guido at python.org (Guido van Rossum) Date: Tue, 12 Apr 2016 16:04:11 -0700 Subject: [Python-ideas] random.choice on non-sequence In-Reply-To: <85inzmzb84.fsf@benfinney.id.au> References: <570D6B50.9070005@btinternet.com> <85inzmzb84.fsf@benfinney.id.au> Message-ID: This still seems wrong *unless* we add a protocol to select a random item in O(1) time. There currently isn't one for sets and mappings -- only for sequences. It would be pretty terrible if someone wrote code to get N items from a set of size N and their code ended up being O(N**2). On Tue, Apr 12, 2016 at 3:57 PM, Ben Finney wrote: > Rob Cliffe writes: > >> It surprised me a bit the first time I realised that random.choice did >> not work on a set. (One expects "everything" to "just work" in Python! >> :-) ) > > I am +1 on the notion that an instance of 'tuple', 'list', 'set', and > even 'dict', should each be accepted as input for 'random.choice'.
> >> On 2011-06-23, Sven Marnach posted 'A few suggestions for the random >> module' on Python-Ideas, one of which was to allow random.choice to >> work on an arbitrary iterable. > > I am -1 on the notion of an arbitrary iterable as 'random.choice' input. > > Arbitrary iterables may be consumed merely by being iterated. This will > force the caller to make a copy, which may as well be a normal sequence > type, defeating the purpose of accepting that 'arbitrary iterable' type. > > Arbitrary iterables may never finish iterating. This means the call > to 'random.choice' would sometimes never return. > > The 'random.choice' function should IMO not be responsible for dealing > with those cases. > > > Instead, a more moderate proposal would be to have 'random.choice' > accept an arbitrary container. > > If the object implements the container protocol (by which I think I mean > that it conforms to 'collections.abc.Container'), it is safe to iterate > and to treat its items as a collection from which to choose a random > item. > > > Rob, does that proposal satisfy the requirements that motivated this? > > -- > \ "Some people, when confronted with a problem, think 'I know, | > `\ I'll use regular expressions'. Now they have two problems." | > _o__) --Jamie Zawinski, in alt.religion.emacs | > Ben Finney > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -- --Guido van Rossum (python.org/~guido) From mahmoud at hatnote.com Tue Apr 12 19:05:27 2016 From: mahmoud at hatnote.com (Mahmoud Hashemi) Date: Tue, 12 Apr 2016 16:05:27 -0700 Subject: [Python-ideas] random.choice on non-sequence In-Reply-To: References: <570D6B50.9070005@btinternet.com> Message-ID: Reservoir sampling is super handy.
For a Pythonic introduction, along with links to implementation, see here: https://www.paypal-engineering.com/2016/04/11/statistics-for-software/#dipping_into_the_stream Even though I love the elegance enough to write and illustrate about it, I'd be very surprised to have found it built into Python at this low of a level. Mahmoud On Tue, Apr 12, 2016 at 3:25 PM, Guido van Rossum wrote: > Ow, sorry! That's the algorithm I meant to reference (the link I > actually gave is for a different situation). See also > https://en.wikipedia.org/wiki/Reservoir_sampling. > > It does indeed avoid storing a copy of the sequence -- at the cost of > N calls to random(). Try timing it -- I wouldn't be surprised if > copying a set of N elements into a tuple is a lot faster than N > random() calls. > > So we're stuck with two alternatives, neither of which is always the > best: (1) just copy the sequence (like Rob's proposal) -- this loses > big time if the sequence is large and not already held in memory; (2) > use reservoir sampling, at the cost of many more random() calls. > > Since Rob didn't link to the quoted conversation I don't have it handy > to guess what Raymond meant with "Bentley" either, but I'm guessing > Jon Bentley wrote about reservoir sampling (before it was named that). > I recall hearing about the algorithm for the first time in the late > '80s from a coworker with Bell Labs ties. > > (Later, Ethan wrote) > > So the objection is that because performance can vary widely it is > better for users to select their own algorithm rather than rely on a > one-size-fits-all stdlib solution? > > That's pretty much it. Were this 1995, I probably would have put some > toy in the stdlib. But these days we should be more careful than that. > > --Guido > > On Tue, Apr 12, 2016 at 3:09 PM, Marco Cognetta > wrote: > > Maybe I am misunderstanding but for a generator, couldn't you avoid > > storing it in memory and just using the following online algorithm to > > save space? 
http://stackoverflow.com/a/23352100 > > > > (I hope this message goes through correctly, first time participating > > in a discussion on this mailing list). > > > > -Marco > > > > On Tue, Apr 12, 2016 at 5:53 PM, Guido van Rossum > wrote: > >> The problem with your version is that copying the input is slow if it > is large. > >> > >> Raymond was referencing the top answer here: > >> > http://stackoverflow.com/questions/2394246/algorithm-to-select-a-single-random-combination-of-values > . > >> It's also slow though (draws N random values if the input is of length > >> N). > >> > >> On Tue, Apr 12, 2016 at 2:40 PM, Rob Cliffe > wrote: > >>> It surprised me a bit the first time I realised that random.choice did > not > >>> work on a set. (One expects "everything" to "just work" in Python! > :-) ) > >>> There is a simple workaround (convert the set to a tuple or list before > >>> passing it) but ... > >>> > >>> On 2011-06-23, Sven Marnach posted 'A few suggestions for the random > module' > >>> on Python-Ideas, one of which was to allow random.choice to work on an > >>> arbitrary iterable. > >>> Raymond Hettinger commented > >>> "It could be generalized to arbitrary iterables (Bentley provides > an > >>> example of how to do this) but it is fragile (i.e. falls apart badly > with > >>> weak random number generators) and doesn't correspond well with real > use > >>> cases." > >>> Sven replied > >>> "Again, there definitely are real world use cases ..." > >>> > >>> I would like to revive this suggestion. I don't understand Raymond's > >>> comment. It seems to me that a reasonable implementation is better > and more > >>> friendly than none, particularly for newbies. 
> >>> This is the source code for random.choice in Python 3.2 (as far as I know it
> >>> hasn't changed since):
> >>>
> >>> def choice(self, seq):
> >>>     """Choose a random element from a non-empty sequence."""
> >>>     try:
> >>>         i = self._randbelow(len(seq))
> >>>     except ValueError:
> >>>         raise IndexError('Cannot choose from an empty sequence')
> >>>     return seq[i]
> >>>
> >>> And here is my proposed (tested) version:
> >>>
> >>> def choice(self, it):
> >>>     """Choose a random element from a non-empty iterable."""
> >>>     try:
> >>>         it[0]
> >>>     except IndexError:
> >>>         raise IndexError('Cannot choose from an empty sequence')
> >>>     except TypeError:
> >>>         it = tuple(it)
> >>>     try:
> >>>         i = self._randbelow(len(it))
> >>>     except ValueError:
> >>>         raise IndexError('Cannot choose from an empty iterable')
> >>>     return it[i]
> >>>
> >>> This works on (e.g.) a list/tuple/string, a set, a dictionary view or a
> >>> generator.
> >>> Obviously the generator has to be 'fresh' (i.e. any previously consumed
> >>> values will be excluded from the choice) and 'throw-away' (random.choice
> >>> will exhaust it).
> >>> But it means you can write code like
> >>>     random.choice(x for x in xrange(10) if x%3)   # this feels quite "Pythonic" to me!
> >>>
> >>> (I experimented with versions that, when passed an object that supported
> >>> len() but not indexing, viz. a set, iterated through the object as far as
> >>> necessary instead of converting it to a list. But I found that this was
> >>> slower in practice as well as more complicated. There might be a memory
> >>> issue with huge iterables, but we are no worse off using existing
> >>> workarounds for those.)
> >>>
> >>> Downsides of proposed version:
> >>>     (1) Backward compatibility. Could mask an error if a "wrong" object,
> >>> but one that happens to be an iterable, is passed to random.choice.
> >>> There is also slightly different behaviour if a dictionary > (D) is > >>> passed (an unusual thing to do): > >>> The current version will get a random integer in the > range > >>> [0, len(D)), try to use that as a key of D, > >>> and raise KeyError if that fails. > >>> The proposed version behaves similarly except that it > will > >>> always raise "KeyError: 0" if 0 is not a key of D. > >>> One could argue that this is an improvement, if it > matters > >>> at all (error reproducibility). > >>> And code that deliberately exploits the existing > behaviour, > >>> provided that it uses a dictionary whose keys are consecutive integers > from > >>> 0 upwards, e.g. > >>> D = { 0 : 'zero', 1 : 'one', 2 : 'two', 3 : > 'three', 4 : > >>> 'four' } > >>> d = random.choice(D) # equivalent, in this > example, > >>> to "d = random.choice(list(D.values()))" > >>> if d == 'zero': > >>> ... > >>> will continue to work (although one might argue that it > >>> deserves not to). > >>> (2) Speed - the proposed version is almost certainly slightly > slower > >>> when called with a sequence. > >>> For what it's worth, I tried to measure the difference on my > >>> several-years-old Windows PC, but is was hard to measure accurately, > even > >>> over millions of iterations. > >>> All I can say is that it appeared to be a small fraction of a > >>> millisecond, possibly in the region of 50 nanoseconds, per call. 
> >>>
> >>> Best wishes
> >>> Rob Cliffe
>
> --
> --Guido van Rossum (python.org/~guido)

From joejev at gmail.com  Tue Apr 12 19:20:12 2016
From: joejev at gmail.com (Joseph Jevnik)
Date: Tue, 12 Apr 2016 19:20:12 -0400
Subject: [Python-ideas] Modifying yield from's return value
In-Reply-To:
References:
Message-ID:

Changing the builtin map to do this would be questionable because there is
some performance overhead to dispatching to the inner coroutine's `send`
methods. `__next__` is implemented as a tp slot; however, coroutine `send`
is just a normal method that needs to be looked up every time we call next.
While my implementation is not as optimized as it could be, it is about
twice as slow as builtin map. I think it would be better to keep the
library separate until there is a lot of demand for this feature.
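The send-forwarding wrapper under discussion can be sketched as a small class, along the lines of the "class overriding send, throw and __next__" idea mentioned in this thread. This is an illustrative stand-in for what cotoolz.comap does, not its actual implementation; the `echo` coroutine is a made-up toy for demonstration:

```python
class comap:
    """Map func over a coroutine, forwarding send/throw/close.

    Unlike the builtin map, values sent into the wrapper reach the
    inner coroutine, so the .send()/.throw() "pipe" is not broken.
    """

    def __init__(self, func, coro):
        self.func = func
        self.coro = coro

    def __iter__(self):
        return self

    def __next__(self):
        return self.func(next(self.coro))

    def send(self, value):
        # Forward the sent value to the inner coroutine, then map its output.
        return self.func(self.coro.send(value))

    def throw(self, *exc_info):
        return self.func(self.coro.throw(*exc_info))

    def close(self):
        self.coro.close()


def echo():
    # Toy inner coroutine: yields back whatever was last sent in.
    received = 0
    while True:
        received = yield received


c = comap(lambda x: x * 10, echo())
print(next(c))    # prime the coroutine -> 0
print(c.send(7))  # 7 reaches echo(), its output is mapped -> 70
```

Because `yield from` delegates `send`, `throw`, and `close` to any iterator that provides them (PEP 380), `yield from comap(f, inner_coroutine())` keeps the pipe intact without new syntax.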
On Tue, Apr 12, 2016 at 6:02 PM, Bar Harel wrote: > Perhaps something like this should go in the standard library? Or change > the builtin map to forward everything? Changing the builtin map in this > case will be backwards compatible and will overall be beneficial. If > someone made a workaround, the workaround would still work, but now the > builtins will do it per se. > > On Wed, Apr 13, 2016 at 12:55 AM Joseph Jevnik wrote: > >> I have written a library `cotoolz` that provides the primitive pieces >> needed for this. It implements a `comap` type which is like `map` but >> properly forwards `send`, `throw`, and `close` to the underlying coroutine. >> There is no special syntax needed, just write: >> >> ``` >> yield from comap(f, inner_coroutine()) >> ``` >> >> This also supports `cozip` which lets you zip together multiple >> coroutines, for example: >> >> ``` >> yield from cozip(inner_coroutine_a(), inner_corouting_b(), >> inner_coroutine_c(), ...) >> ``` >> >> This will fan out the sends to all of the coroutines and collect the >> results into a single tuple to yield. Just like `zip`, this will be >> exchausted when the first coroutine is exhausted. >> >> The library is available as free software on pypi or on here: >> https://github.com/llllllllll/cotoolz >> >> On Tue, Apr 12, 2016 at 5:40 PM, Bar Harel wrote: >> >>> I asked a question in stackoverflow >>> regarding a way of >>> modifying yield from's return value. >>> There doesn't seem to be any way of modifying the data yielded by the >>> yield from expression without breaking the yield from "pipe". By "pipe" I >>> mean the fact that .send() and .throw() pass to the inner generator. Useful >>> cases are parsers as I've demonstrated in the question, transcoders, >>> encoding and decoding the yielded values without interrupting .send() or >>> .throw() and generally coroutines. >>> >>> I believe it would be beneficial to many aspects and libraries in >>> Python, most notably asyncio. 
>>> >>> I couldn't think of a good syntax though, other than creating a wrapping >>> function, lets say in itertools, that creates a class overriding send, >>> throw and __next__ and receives the generator and associated modification >>> functions (like suggested in one of the answers). >>> >>> What do you think? Is it useful? Any suggested syntax? >>> >>> _______________________________________________ >>> Python-ideas mailing list >>> Python-ideas at python.org >>> https://mail.python.org/mailman/listinfo/python-ideas >>> Code of Conduct: http://python.org/psf/codeofconduct/ >>> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From cognetta.marco at gmail.com Tue Apr 12 20:07:15 2016 From: cognetta.marco at gmail.com (Marco Cognetta) Date: Tue, 12 Apr 2016 20:07:15 -0400 Subject: [Python-ideas] random.choice on non-sequence In-Reply-To: References: <570D6B50.9070005@btinternet.com> Message-ID: At the risk of making this not very easily accessible, couldn't we use both approaches (reservoir sampling and just loading the whole thing into memory) and apply the ski-rental problem? We could come up with some cost for calling random and use that to determine when we should just give up and load it into memory if we do not know the length of the sequence produced by the generator. If the user knows something about the generator beforehand they could specify in some optional parameter whether or not they want to use reservoir sampling or just load it into memory at the start. The theory side of me likes this but I admit its a little ugly to be included in Python's stdlib. -Marco On Tue, Apr 12, 2016 at 7:05 PM, Mahmoud Hashemi wrote: > Reservoir sampling is super handy. 
For a Pythonic introduction, along with > links to implementation, see here: > https://www.paypal-engineering.com/2016/04/11/statistics-for-software/#dipping_into_the_stream > > Even though I love the elegance enough to write and illustrate about it, I'd > be very surprised to have found it built into Python at this low of a level. > > Mahmoud > > On Tue, Apr 12, 2016 at 3:25 PM, Guido van Rossum wrote: >> >> Ow, sorry! That's the algorithm I meant to reference (the link I >> actually gave is for a different situation). See also >> https://en.wikipedia.org/wiki/Reservoir_sampling. >> >> It does indeed avoid storing a copy of the sequence -- at the cost of >> N calls to random(). Try timing it -- I wouldn't be surprised if >> copying a set of N elements into a tuple is a lot faster than N >> random() calls. >> >> So we're stuck with two alternatives, neither of which is always the >> best: (1) just copy the sequence (like Rob's proposal) -- this loses >> big time if the sequence is large and not already held in memory; (2) >> use reservoir sampling, at the cost of many more random() calls. >> >> Since Rob didn't link to the quoted conversation I don't have it handy >> to guess what Raymond meant with "Bentley" either, but I'm guessing >> Jon Bentley wrote about reservoir sampling (before it was named that). >> I recall hearing about the algorithm for the first time in the late >> '80s from a coworker with Bell Labs ties. >> >> (Later, Ethan wrote) >> > So the objection is that because performance can vary widely it is >> > better for users to select their own algorithm rather than rely on a >> > one-size-fits-all stdlib solution? >> >> That's pretty much it. Were this 1995, I probably would have put some >> toy in the stdlib. But these days we should be more careful than that. 
>> >> --Guido >> >> On Tue, Apr 12, 2016 at 3:09 PM, Marco Cognetta >> wrote: >> > Maybe I am misunderstanding but for a generator, couldn't you avoid >> > storing it in memory and just using the following online algorithm to >> > save space? http://stackoverflow.com/a/23352100 >> > >> > (I hope this message goes through correctly, first time participating >> > in a discussion on this mailing list). >> > >> > -Marco >> > >> > On Tue, Apr 12, 2016 at 5:53 PM, Guido van Rossum >> > wrote: >> >> The problem with your version is that copying the input is slow if it >> >> is large. >> >> >> >> Raymond was referencing the top answer here: >> >> >> >> http://stackoverflow.com/questions/2394246/algorithm-to-select-a-single-random-combination-of-values. >> >> It's also slow though (draws N random values if the input is of length >> >> N). >> >> >> >> On Tue, Apr 12, 2016 at 2:40 PM, Rob Cliffe >> >> wrote: >> >>> It surprised me a bit the first time I realised that random.choice did >> >>> not >> >>> work on a set. (One expects "everything" to "just work" in Python! >> >>> :-) ) >> >>> There is a simple workaround (convert the set to a tuple or list >> >>> before >> >>> passing it) but ... >> >>> >> >>> On 2011-06-23, Sven Marnach posted 'A few suggestions for the random >> >>> module' >> >>> on Python-Ideas, one of which was to allow random.choice to work on an >> >>> arbitrary iterable. >> >>> Raymond Hettinger commented >> >>> "It could be generalized to arbitrary iterables (Bentley provides >> >>> an >> >>> example of how to do this) but it is fragile (i.e. falls apart badly >> >>> with >> >>> weak random number generators) and doesn't correspond well with real >> >>> use >> >>> cases." >> >>> Sven replied >> >>> "Again, there definitely are real world use cases ..." >> >>> >> >>> I would like to revive this suggestion. I don't understand Raymond's >> >>> comment. 
It seems to me that a reasonable implementation is better >> >>> and more >> >>> friendly than none, particularly for newbies. >> >>> This is the source code for random.choice in Python 3.2 (as far as I >> >>> know it >> >>> hasn't changed since): >> >>> >> >>> >> >>> def choice(self, seq): >> >>> """Choose a random element from a non-empty sequence.""" >> >>> try: >> >>> i = self._randbelow(len(seq)) >> >>> except ValueError: >> >>> raise IndexError('Cannot choose from an empty sequence') >> >>> return seq[i] >> >>> >> >>> >> >>> And here is my proposed (tested) version: >> >>> >> >>> >> >>> def choice(self, it): >> >>> """Choose a random element from a non-empty iterable.""" >> >>> try: >> >>> it[0] >> >>> except IndexError: >> >>> raise IndexError('Cannot choose from an empty sequence') >> >>> except TypeError: >> >>> it = tuple(it) >> >>> try: >> >>> i = self._randbelow(len(it)) >> >>> except ValueError: >> >>> raise IndexError('Cannot choose from an empty iterable') >> >>> return it[i] >> >>> >> >>> >> >>> This works on (e.g.) a list/tuple/string, a set, a dictionary view or >> >>> a >> >>> generator. >> >>> Obviously the generator has to be 'fresh' (i.e. any previously >> >>> consumed >> >>> values will be excluded from the choice) and 'throw-away' >> >>> (random.choice >> >>> will exhaust it). >> >>> But it means you can write code like >> >>> random.choice(x for x in xrange(10) if x%3) # this feels quite >> >>> "Pythonic" to me! >> >>> >> >>> (I experimented with versions that, when passed an object that >> >>> supported >> >>> len() but not indexing, viz. a set, iterated through the object as far >> >>> as >> >>> necessary instead of converting it to a list. But I found that this >> >>> was >> >>> slower in practice as well as more complicated. There might be a >> >>> memory >> >>> issue with huge iterables, but we are no worse off using existing >> >>> workarounds for those.) >> >>> >> >>> Downsides of proposed version: >> >>> (1) Backward compatibility. 
Could mask an error if a "wrong" >> >>> object, >> >>> but one that happens to be an iterable, is passed to random.choice. >> >>> There is also slightly different behaviour if a dictionary >> >>> (D) is >> >>> passed (an unusual thing to do): >> >>> The current version will get a random integer in the >> >>> range >> >>> [0, len(D)), try to use that as a key of D, >> >>> and raise KeyError if that fails. >> >>> The proposed version behaves similarly except that it >> >>> will >> >>> always raise "KeyError: 0" if 0 is not a key of D. >> >>> One could argue that this is an improvement, if it >> >>> matters >> >>> at all (error reproducibility). >> >>> And code that deliberately exploits the existing >> >>> behaviour, >> >>> provided that it uses a dictionary whose keys are consecutive integers >> >>> from >> >>> 0 upwards, e.g. >> >>> D = { 0 : 'zero', 1 : 'one', 2 : 'two', 3 : >> >>> 'three', 4 : >> >>> 'four' } >> >>> d = random.choice(D) # equivalent, in this >> >>> example, >> >>> to "d = random.choice(list(D.values()))" >> >>> if d == 'zero': >> >>> ... >> >>> will continue to work (although one might argue that >> >>> it >> >>> deserves not to). >> >>> (2) Speed - the proposed version is almost certainly slightly >> >>> slower >> >>> when called with a sequence. >> >>> For what it's worth, I tried to measure the difference on my >> >>> several-years-old Windows PC, but is was hard to measure accurately, >> >>> even >> >>> over millions of iterations. >> >>> All I can say is that it appeared to be a small fraction of a >> >>> millisecond, possibly in the region of 50 nanoseconds, per call. 
>> >>> >> >>> Best wishes >> >>> Rob Cliffe >> >>> _______________________________________________ >> >>> Python-ideas mailing list >> >>> Python-ideas at python.org >> >>> https://mail.python.org/mailman/listinfo/python-ideas >> >>> Code of Conduct: http://python.org/psf/codeofconduct/ >> >> >> >> >> >> >> >> -- >> >> --Guido van Rossum (python.org/~guido) >> >> _______________________________________________ >> >> Python-ideas mailing list >> >> Python-ideas at python.org >> >> https://mail.python.org/mailman/listinfo/python-ideas >> >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > _______________________________________________ >> > Python-ideas mailing list >> > Python-ideas at python.org >> > https://mail.python.org/mailman/listinfo/python-ideas >> > Code of Conduct: http://python.org/psf/codeofconduct/ >> >> >> >> -- >> --Guido van Rossum (python.org/~guido) >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ > > From greg.ewing at canterbury.ac.nz Tue Apr 12 21:42:41 2016 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 13 Apr 2016 13:42:41 +1200 Subject: [Python-ideas] random.choice on non-sequence In-Reply-To: References: <570D6B50.9070005@btinternet.com> Message-ID: <570DA411.3020003@canterbury.ac.nz> Guido van Rossum wrote: > The problem with your version is that copying the input is slow if it is large. Maybe sets should have a method that returns an indexable view? The order could be defined as equivalent to iteration order, and it would allow things like random.choice to work efficiently on sets. 
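One way to read that suggestion (a hypothetical sketch, not an existing set method): snapshot the elements into an indexable view whose order matches iteration order. Building the snapshot is O(n), so this dodges, rather than answers, the question of indexing the hash table directly:

```python
import random


class SetView:
    """Hypothetical indexable view over a set.

    Index order matches the set's iteration order. The snapshot costs
    O(n) to build; after that, indexing -- and random.choice -- is O(1).
    """

    def __init__(self, s):
        self._elems = tuple(s)

    def __len__(self):
        return len(self._elems)

    def __getitem__(self, i):
        return self._elems[i]


s = {'a', 'b', 'c', 'd'}
view = SetView(s)
# random.choice only needs __len__ and __getitem__, so the view works as-is:
print(random.choice(view))
```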
-- Greg From guido at python.org Tue Apr 12 21:59:10 2016 From: guido at python.org (Guido van Rossum) Date: Tue, 12 Apr 2016 18:59:10 -0700 Subject: [Python-ideas] random.choice on non-sequence In-Reply-To: <570DA411.3020003@canterbury.ac.nz> References: <570D6B50.9070005@btinternet.com> <570DA411.3020003@canterbury.ac.nz> Message-ID: On Tue, Apr 12, 2016 at 6:42 PM, Greg Ewing wrote: > Maybe sets should have a method that returns an indexable > view? The order could be defined as equivalent to iteration > order, and it would allow things like random.choice to > work efficiently on sets. How would you implement it though? There are gaps in the hash table. -- --Guido van Rossum (python.org/~guido) From rosuav at gmail.com Tue Apr 12 22:02:43 2016 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 13 Apr 2016 12:02:43 +1000 Subject: [Python-ideas] random.choice on non-sequence In-Reply-To: <570D6B50.9070005@btinternet.com> References: <570D6B50.9070005@btinternet.com> Message-ID: On Wed, Apr 13, 2016 at 7:40 AM, Rob Cliffe wrote: > And here is my proposed (tested) version: > > def choice(self, it): > """Choose a random element from a non-empty iterable.""" > try: > it[0] > except IndexError: > raise IndexError('Cannot choose from an empty sequence') > except TypeError: > it = tuple(it) > try: > i = self._randbelow(len(it)) > except ValueError: > raise IndexError('Cannot choose from an empty iterable') > return it[i] > > > This works on (e.g.) a list/tuple/string, a set, a dictionary view or a > generator. > Obviously the generator has to be 'fresh' (i.e. any previously consumed > values will be excluded from the choice) and 'throw-away' (random.choice > will exhaust it). > But it means you can write code like > random.choice(x for x in xrange(10) if x%3) # this feels quite > "Pythonic" to me! Small point of order: Pretty much everything discussed here on python-ideas is about Python 3. 
It's best to make sure your code works
with the latest CPython (currently 3.5), as a change like this would
be landing in 3.6 at the earliest. So what I'd be looking at is this:

random.choice(x for x in range(10) if x%3)

AFAIK this doesn't change your point at all, but it is worth being
careful of; Python 3's range object isn't quite the same as Python 2's
xrange, and it's possible something might not "just work". (For the
inverse case, "if x%3 == 0", you can simply use
random.choice(range(0, 10, 3)) to do what you want.)

I don't like the proposed acceptance of arbitrary iterables. In the
extremely rare case where you actually do want that, you can always
manually wrap it in list() or tuple(). But your original use-case does
have merit:

> It surprised me a bit the first time I realised that random.choice did not work on a set.

A set has a length, but it can't be indexed. It should be perfectly
reasonable to ask for a random element out of a set! So here's my
counter-proposal: Make random.choice accept any iterable with a
__len__.

def choice(self, coll):
    """Choose a random element from a non-empty collection."""
    try:
        i = self._randbelow(len(coll))
    except ValueError:
        raise IndexError('Cannot choose from an empty collection')
    try:
        return coll[i]
    except TypeError:
        for _, value in zip(range(i+1), coll):
            pass
        return value

Feel free to bikeshed the method of iterating part way into a
collection (enumerate() and a comparison? call iter, then call next
that many times, then return next(it)?), but the basic concept is that
you have to have a length and then you iterate into it that far.

It's still not arbitrary iterables, but it'll handle sets and dict
views. Handling dicts directly may be a little tricky; since they
support subscripting, they'll either raise KeyError or silently return
a value, where the iteration-based return value would be a key.
Breaking this out into its own function would be reliable there (take
out the try/except and just go straight into iteration).
ChrisA From tim.peters at gmail.com Tue Apr 12 22:07:43 2016 From: tim.peters at gmail.com (Tim Peters) Date: Tue, 12 Apr 2016 21:07:43 -0500 Subject: [Python-ideas] random.choice on non-sequence In-Reply-To: <570DA411.3020003@canterbury.ac.nz> References: <570D6B50.9070005@btinternet.com> <570DA411.3020003@canterbury.ac.nz> Message-ID: [Greg Ewing] > Maybe sets should have a method that returns an indexable > view? The order could be defined as equivalent to iteration > order, and it would allow things like random.choice to > work efficiently on sets. Well, sets and dicts are implemented very similarly in CPython. Note that a dict view (whether of keys, values or items) doesn't support indexing! It's unclear how that could be added efficiently, and the same applies to sets. Iteration works pretty efficiently, because a hidden "search finger" is maintained internally, skipping over the gaps in the hash table as needed (so iteration can, internally, make a number of probes equal to the number of hash slots, and no more than that) - but that's no help for indexing by a random integer. I'll just add that I've never had a real need for a random selection from a set. For all the many set algorithms I've coded, an "arbitrary" element worked just as well as a "random" element, and set.pop() works fine for that. From guido at python.org Tue Apr 12 22:11:38 2016 From: guido at python.org (Guido van Rossum) Date: Tue, 12 Apr 2016 19:11:38 -0700 Subject: [Python-ideas] random.choice on non-sequence In-Reply-To: References: <570D6B50.9070005@btinternet.com> Message-ID: This is no good. Who wants a choice() that is O(1) on sequences but degrades to O(N) if the argument is a set? I see the current behavior as a feature: it works when the argument is indexable, and when it isn't, it fails cleanly, prompting the user to use their brain and decide what's right. Maybe you're programming a toy example. Then you can just call random.choice(list(x)) and move on. 
Or maybe you're trying to do something serious -- then it behooves you to copy the set into a list variable once and then repeatedly choose from that list, or maybe you should have put the data in a list in the first place. But putting the toy solution in the stdlib is a bad idea, and so is putting a bad (O(N)) algorithm there. So the status quo wins for a reason! --Guido On Tue, Apr 12, 2016 at 7:02 PM, Chris Angelico wrote: > On Wed, Apr 13, 2016 at 7:40 AM, Rob Cliffe wrote: >> And here is my proposed (tested) version: >> >> def choice(self, it): >> """Choose a random element from a non-empty iterable.""" >> try: >> it[0] >> except IndexError: >> raise IndexError('Cannot choose from an empty sequence') >> except TypeError: >> it = tuple(it) >> try: >> i = self._randbelow(len(it)) >> except ValueError: >> raise IndexError('Cannot choose from an empty iterable') >> return it[i] >> >> >> This works on (e.g.) a list/tuple/string, a set, a dictionary view or a >> generator. >> Obviously the generator has to be 'fresh' (i.e. any previously consumed >> values will be excluded from the choice) and 'throw-away' (random.choice >> will exhaust it). >> But it means you can write code like >> random.choice(x for x in xrange(10) if x%3) # this feels quite >> "Pythonic" to me! > > Small point of order: Pretty much everything discussed here on > python-ideas is about Python 3. It's best to make sure your code works > with the latest CPython (currently 3.5), as a change like this would > be landing in 3.6 at the earliest. So what I'd be looking at is this: > > random.choice(x for x in range(10) if x%3) > > AFAIK this doesn't change your point at all, but it is worth being > careful of; Python 3's range object isn't quite the same as Python 2's > xrange, and it's possible something might "just work". (For the > inverse case, "if x%3 == 0", you can simply use > random.choice(random(0, 10, 3)) to do what you want.) > > I don't like the proposed acceptance of arbitrary iterables. 
In the > extremely rare case where you actually do want that, you can always > manually wrap it in list() or tuple(). But your original use-case does > have merit: > >> It surprised me a bit the first time I realised that random.choice did not work on a set. > > A set has a length, but it can't be indexed. It should be perfectly > reasonable to ask for a random element out of a set! So here's my > counter-proposal: Make random.choice accept any iterable with a > __len__. > > def choice(self, coll): > """Choose a random element from a non-empty collection.""" > try: > i = self._randbelow(len(coll)) > except ValueError: > raise IndexError('Cannot choose from an empty collection') > try: > return coll[i] > except TypeError: > for _, value in zip(range(i+1), coll): > pass > return value > > Feel free to bikeshed the method of iterating part way into a > collection (enumerate() and a comparison? call iter, then call next > that many times, then return next(it)?), but the basic concept is that > you have to have a length and then you iterate into it that far. > > It's still not arbitrary iterables, but it'll handle sets and dict > views. Handling dicts directly may be a little tricky; since they > support subscripting, they'll either raise KeyError or silently return > a value, where the iteration-based return value would be a key. > Breaking this out into its own function would be reliable there (take > out the try/except and just go straight into iteration). 
> > ChrisA > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -- --Guido van Rossum (python.org/~guido) From rosuav at gmail.com Tue Apr 12 22:15:28 2016 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 13 Apr 2016 12:15:28 +1000 Subject: [Python-ideas] random.choice on non-sequence In-Reply-To: References: <570D6B50.9070005@btinternet.com> Message-ID: On Wed, Apr 13, 2016 at 12:11 PM, Guido van Rossum wrote: > This is no good. Who wants a choice() that is O(1) on sequences but > degrades to O(N) if the argument is a set? Fair point. Suggestion withdrawn. If you want something that does the iteration-till-it-finds-something, that should be a separate function. (And then it'd work automatically on dicts, too.) ChrisA From chris.barker at noaa.gov Wed Apr 13 00:52:31 2016 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Tue, 12 Apr 2016 21:52:31 -0700 Subject: [Python-ideas] random.choice on non-sequence In-Reply-To: <570DA411.3020003@canterbury.ac.nz> References: <570D6B50.9070005@btinternet.com> <570DA411.3020003@canterbury.ac.nz> Message-ID: <-176815014396412028@unknownmsgid> I also haven't had a use case for random items from a set, but have had a use case for randomly ( not arbitrarily ) choosing from a dict (see trigrams: http://codekata.com/kata/kata14-tom-swift-under-the-milkwood/) As dicts and sets share implementation, why not both? Anyway, the hash table has gaps, but wouldn't it be sufficiently random to pick a random index, and if it's a gap, pick another one? I suppose in theory, this could be in infinite process, but in practice, it would be O(1) with a constant of two or three... Better than iterating through on average half of the keys. ( how many gaps are there? 
I have no idea ) I know nothing about the implementation, and have not thought carefully about the statistics, but it seems do-able. I'll just shut up if I'm way off base. BTW, isn't it impossible to randomly select from an infinite iterable anyway? -CHB > On Apr 12, 2016, at 6:43 PM, Greg Ewing wrote: > > Guido van Rossum wrote: >> The problem with your version is that copying the input is slow if it is large. > > Maybe sets should have a method that returns an indexable > view? The order could be defined as equivalent to iteration > order, and it would allow things like random.choice to > work efficiently on sets. > > -- > Greg > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From rob.cliffe at btinternet.com Wed Apr 13 00:09:30 2016 From: rob.cliffe at btinternet.com (Rob Cliffe) Date: Wed, 13 Apr 2016 05:09:30 +0100 Subject: [Python-ideas] random.choice on non-sequence In-Reply-To: References: <570D6B50.9070005@btinternet.com> <85inzmzb84.fsf@benfinney.id.au> Message-ID: <570DC67A.4040609@btinternet.com> Thanks to everyone for all the feedback so far. Interesting. I had not heard of reservoir sampling, and it had not occurred to me that one could select a uniformly random element from a sequence of unknown length without copying it. I am with Ben in that I consider the most important type for random.choice to accept, that it doesn't already, is "set". If just that could be achieved, yes Ben, I think that would be a plus. (The fact that I could write a version that handled generators (efficiently or not) just seemed to be a bonus.) ISTM however that there is a problem accepting dict, because of the ambiguity - does it mean a random key, a random value, or a random (key,value) item? (I would pick a random key (consistent with "for k in MyDict"), but ISTM "explicit is better than implicit".)
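For what it's worth, all three readings of "a random element of a dict" are easy to spell explicitly today, which is one argument for leaving the choice to the caller (illustrative values only):

```python
import random

d = {'a': 1, 'b': 2, 'c': 3}

key = random.choice(tuple(d))             # random key, matching "for k in d"
value = random.choice(tuple(d.values()))  # random value
item = random.choice(tuple(d.items()))    # random (key, value) pair
```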
There are problems with arbitrary iterables as Ben points out. But I don't see these as an objection to making random.choice accept them, because (1) It is easy to produce examples of stdlib functions that can be crashed with infinite iterables, e.g. >>> def forever(): ... while 1: yield 1 ... >>> itertools.product(forever(), forever()) Traceback (most recent call last): File "<stdin>", line 1, in <module> MemoryError (2) As I indicated in my original post, it is understood that generators are "throw-away", so may need to be copied. 'Consenting adults', no? One small point: Whenever I've tested it (quite a few times, for different projects), I've always found that it was faster to copy to a tuple than to a list. I did a little experimenting (admittedly non-rigorous, on one platform only, and using Python 2.7.10, not Python 3, and using code which could very possibly be improved) on selecting a random element from a generator, and found that for small or moderate generators, reservoir sampling was almost always slower than generating a tuple as the generator length increased up to roughly 10,000, the ratio (time taken by reservoir) / (time taken by tuple) increased, reaching a maximum of over 4 as the generator length increased further, the ratio started to decrease, although for a length of 80 million (about as large as I could test) it was still over 3. (This would change if the reservoir approach could be recoded in a way that was amazingly faster than the way I did it, but, arrogance aside, I doubt that. I will supply details of my code on request.) I think this is a tribute to the efficiency of tuple building. It's also enough to convince me that reservoir sampling, or Marco Cognetta's compromise approach, are non-starters. More rigorous testing would be necessary to convince the rest of the world.
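A minimal harness in the spirit of that experiment (not Rob's actual code, and Python 3 here rather than his 2.7.10) might look like:

```python
import random
import timeit

def choice_tuple(iterable):
    """Materialise the iterable, then pick by index."""
    seq = tuple(iterable)
    return seq[random.randrange(len(seq))]

def choice_reservoir(iterable):
    """One-pass selection: keep the n-th item with probability 1/n."""
    n = 0
    value = None
    for item in iterable:
        n += 1
        if random.randrange(n) == 0:
            value = item
    if n == 0:
        raise IndexError('Cannot choose from an empty iterable')
    return value

N = 10_000
t_tuple = timeit.timeit(lambda: choice_tuple(iter(range(N))), number=20)
t_reservoir = timeit.timeit(lambda: choice_reservoir(iter(range(N))), number=20)
print(f'tuple-ise: {t_tuple:.3f}s  reservoir: {t_reservoir:.3f}s')
```

On CPython the reservoir version typically loses, consistent with the ratios reported above, because it makes one random call per element rather than one per selection.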
I am also pretty well convinced from actual tests (not from theoretical reasoning) that: the "convert it to a tuple" recipe is not far off O(N), both for sets and generators (it gets worse for generators that produce objects with large memory footprints), and is certainly fast enough to be useful (the point at which it gets unacceptably slow is generally not far from the point where you run out of memory). I did try an approach similar to Chris Angelico's of iterating over sets up to the selection point (my bikeshed actually used enumerate rather than zip), but it was much slower than the "tuple-ise" algorithm. On 13/04/2016 00:04, Guido van Rossum wrote: > This still seems wrong *unless* we add a protocol to select a random > item in O(1) time. There currently isn't one for sets and mappings -- > only for sequences. Hm, would this need a change to the structure of dicts and sets? Is there by any chance a quick way of getting the Nth item added in an OrderedDict? > It would be pretty terrible if someone wrote code > to get N items from a set of size N and their code ended up being > O(N**2). Er, I'm not quite sure what you're saying here. random.sample allows you to select k items from a set of size N (it converts the set to a tuple). If k=N, i.e. you're trying to select the whole set, that's a silly thing to do if you know you're doing it. But even in this case, random.sample is approximately O(N); I tested it. If someone writes their own inefficient version of random.sample, that's their problem. Have I misunderstood? Do you mean selecting N items with replacement? Hm, it occurs to me that random.sample itself could be extended to choose k unique random elements from a general iterable. 
I have seen a reservoir sampling algorithm that claims to be O(N): https://en.wikipedia.org/wiki/Reservoir_sampling Rob On Tue, Apr 12, 2016 at 3:57 PM, Ben Finney wrote: >> Rob Cliffe writes: >> >>> It surprised me a bit the first time I realised that random.choice did >>> not work on a set. (One expects "everything" to "just work" in Python! >>> :-) ) >> I am +1 on the notion that an instance of 'tuple', 'list', 'set', and >> even 'dict', should each be accepted as input for 'random.choice'. >> >>> On 2011-06-23, Sven Marnach posted 'A few suggestions for the random >>> module' on Python-Ideas, one of which was to allow random.choice to >>> work on an arbitrary iterable. >> I am -1 on the notion of an arbitrary iterable as 'random.choice' input. >> >> Arbitrary iterables may be consumed merely by being iterated. This will >> force the caller to make a copy, which may as well be a normal sequence >> type, defeating the purpose of accepting that 'arbitrary iterable' type. >> >> Arbitrary iterables may never finish iterating. This means the call >> to 'random.choice' would sometimes never return. >> >> The 'random.choice' function should IMO not be responsible for dealing >> with those cases. >> >> >> Instead, a more moderate proposal would be to have 'random.choice' >> accept an arbitrary container. >> >> If the object implements the container protocol (by which I think I mean >> that it conforms to 'collections.abc.Container'), it is safe to iterate >> and to treat its items as a collection from which to choose a random >> item. >> >> >> Rob, does that proposal satisfy the requirements that motivated this? >> >> -- >> \ "Some people, when confronted with a problem, think 'I know, | >> `\ I'll use regular expressions'. Now they have two problems." 
| >> _o__) -Jamie Zawinski, in alt.religion.emacs | >> Ben Finney >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ > > From tim.peters at gmail.com Wed Apr 13 01:08:47 2016 From: tim.peters at gmail.com (Tim Peters) Date: Wed, 13 Apr 2016 00:08:47 -0500 Subject: [Python-ideas] random.choice on non-sequence In-Reply-To: <-176815014396412028@unknownmsgid> References: <570D6B50.9070005@btinternet.com> <570DA411.3020003@canterbury.ac.nz> <-176815014396412028@unknownmsgid> Message-ID: [Chris Barker - NOAA Federal ] > ... > Anyway, the hash table has gaps, but wouldn't it be sufficiently > random to pick a random index, and if it's a gap, pick another one? That would be a correct implementation, but ... > I suppose in theory, this could be an infinite process, but in practice, > it would be O(1) with a constant of two or three... Better than > iterating through on average half of the keys. There's an upper limit on how dense a CPython dict or set can become (the load factor doesn't exceed 2/3), but no lower limit. For example, it's easy to end up with a dict holding a single entry hiding among millions of empty slots (dicts are never resized on key deletion, only on key insertion). > ... > BTW, isn't it impossible to randomly select from an infinite iterable anyway? Of course, but it is possible to do uniform random selection, in one pass using constant space and in linear time, from an iterable whose length isn't known in advance (simple case of "reservoir sampling").
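The single-entry-among-many-empty-slots scenario Tim describes is easy to reproduce; the sizes below are CPython-specific, but they show the table never shrinking on deletion:

```python
import sys

# Grow a dict to 100,000 entries, then delete all but one of them.
d = {i: None for i in range(100_000)}
size_full = sys.getsizeof(d)

for i in range(1, 100_000):
    del d[i]
size_after = sys.getsizeof(d)

# CPython resizes only on insertion, so the one surviving key still
# hides among roughly a hundred thousand empty slots.
print(len(d), size_full, size_after)
```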
From grant.jenks at gmail.com Wed Apr 13 01:20:07 2016 From: grant.jenks at gmail.com (Grant Jenks) Date: Tue, 12 Apr 2016 22:20:07 -0700 Subject: [Python-ideas] random.choice on non-sequence In-Reply-To: <570D6B50.9070005@btinternet.com> References: <570D6B50.9070005@btinternet.com> Message-ID: On Tue, Apr 12, 2016 at 2:40 PM, Rob Cliffe wrote: > > It surprised me a bit the first time I realised that random.choice did not > work on a set. (One expects "everything" to "just work" in Python! :-) ) As you point out in your proposed solution, the problem is `random.choice` expects an index-able value. You can easily create an index-able set using the sortedcontainers.SortedSet interface even if the elements aren't comparable. Simply use:

```python
from sortedcontainers import SortedSet

def zero(value):
    return 0

class IndexableSet(SortedSet):
    def __init__(self, *args, **kwargs):
        super(IndexableSet, self).__init__(*args, key=zero, **kwargs)

values = IndexableSet(range(100))

import random

random.choice(values)
```

`__contains__`, `__iter__`, `len`, `add`, `__getitem__` and `__delitem__` will all be fast. But `discard` will take potentially linear time. So:

```
index = random.randrange(len(values))
value = values[index]
del values[index]
```

will be much faster than:

```
value = random.choice(values)
values.discard(value)
```

Otherwise `IndexableSet` will work as a drop-in replacement for your `set` needs. Read more about the SortedContainers project at http://www.grantjenks.com/docs/sortedcontainers/ Grant From jsbueno at python.org.br Wed Apr 13 01:29:28 2016 From: jsbueno at python.org.br (Joao S. O. Bueno) Date: Wed, 13 Apr 2016 02:29:28 -0300 Subject: [Python-ideas] random.choice on non-sequence In-Reply-To: References: <570D6B50.9070005@btinternet.com> Message-ID: Maybe, instead of all of this, since as put by others, hardly there would be a "one size fits all" - the nice thing to have would be a "choice" or "randompop" method for sets.
Sets are the one natural kind of object from which it would seem normal to be able to pick a random element (or pop one) - since they have no order to start with, and the one from which it is frustrating when one realizes "random.choice" fails. So either have "random.set_choice" and "random.set_pop" functions in random - or "choice" and "pop_random" on the "set" interface itself could be more practical - in terms of real world usage - than speculate a way for choice to work in iterables of unknown size, among other objects. On 13 April 2016 at 02:20, Grant Jenks wrote: > On Tue, Apr 12, 2016 at 2:40 PM, Rob Cliffe wrote: >> >> It surprised me a bit the first time I realised that random.choice did not >> work on a set. (One expects "everything" to "just work" in Python! :-) ) > > As you point out in your proposed solution, the problem is `random.choice` > expects an index-able value. You can easily create an index-able set using > the sortedcontainers.SortedSet interface even if the elements aren't > comparable. Simply use: > > ```python > from sortedcontainers import SortedSet > > def zero(value): > return 0 > > class IndexableSet(SortedSet): > def __init__(self, *args, **kwargs): > super(IndexableSet, self).__init__(*args, key=zero, **kwargs) > > values = IndexableSet(range(100)) > > import random > > random.choice(values) > ``` > > `__contains__`, `__iter__`, `len`, `add`, `__getitem__` and `__delitem__` will > all be fast. But `discard` will take potentially linear time. So: > > ``` > index = random.randrange(len(values)) > value = values[index] > del values[index] > ``` > > will be much faster than: > > ``` > value = random.choice(values) > values.discard(value) > ``` > > Otherwise `IndexableSet` will work as a drop-in replacement for > your `set` needs.
Read more about the SortedContainers project at > http://www.grantjenks.com/docs/sortedcontainers/ > > Grant > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From p.f.moore at gmail.com Wed Apr 13 04:32:47 2016 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 13 Apr 2016 09:32:47 +0100 Subject: [Python-ideas] random.choice on non-sequence In-Reply-To: References: <570D6B50.9070005@btinternet.com> Message-ID: On 13 April 2016 at 06:29, Joao S. O. Bueno wrote: > Maybe, instead of all of this, since as put by others, hardly there would be a > "one size fits all" - the nice thing to have would be a "choice" or > "randompop" method for sets. Given that your choice method for sets would be little more than value = random.choice(tuple(s)) it seems like it's probably overkill to build it into Python - particularly as doing so restricts the implementation to Python 3.6+ whereas writing it out by hand works for any version. It also makes the complexity of the operation explicit, which is a good thing. Paul From rob.cliffe at btinternet.com Wed Apr 13 05:47:53 2016 From: rob.cliffe at btinternet.com (Rob Cliffe) Date: Wed, 13 Apr 2016 10:47:53 +0100 Subject: [Python-ideas] random.choice on non-sequence In-Reply-To: References: <570D6B50.9070005@btinternet.com> Message-ID: <570E15C9.7050703@btinternet.com> On 13/04/2016 09:32, Paul Moore wrote: > On 13 April 2016 at 06:29, Joao S. O. Bueno wrote: >> Maybe, instead of all of this, since as put by others, hardly there would be a >> "one size fits all" - the nice thing to have would be a "choice" or >> "randompop" method for sets. 
> Given that your choice method for sets would be little more than > > value = random.choice(tuple(s)) > > it seems like it's probably overkill to build it into Python - > particularly as doing so restricts the implementation to Python 3.6+ > whereas writing it out by hand works for any version. It also makes > the complexity of the operation explicit, which is a good thing. > > Paul > _______________________________________________ > Isn't there an inconsistency that random.sample caters to a set by converting it to a tuple, but random.choice doesn't? Surely random.sample(someSet, k) will always be slower than a "tuple-ising" random.choice(someSet) implementation? And it wouldn't prevent you from converting your set (or other iterable) to a sequence explicitly. Rob From tjreedy at udel.edu Wed Apr 13 06:36:04 2016 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 13 Apr 2016 06:36:04 -0400 Subject: [Python-ideas] random.choice on non-sequence In-Reply-To: <-176815014396412028@unknownmsgid> References: <570D6B50.9070005@btinternet.com> <570DA411.3020003@canterbury.ac.nz> <-176815014396412028@unknownmsgid> Message-ID: On 4/13/2016 12:52 AM, Chris Barker - NOAA Federal wrote: > BTW, isn't it impossible to randomly select from an infinite iterable anyway? With equal probability, yes, impossible. With skewed probabilities, no, possible. -- Terry Jan Reedy From rosuav at gmail.com Wed Apr 13 06:46:08 2016 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 13 Apr 2016 20:46:08 +1000 Subject: [Python-ideas] random.choice on non-sequence In-Reply-To: References: <570D6B50.9070005@btinternet.com> <570DA411.3020003@canterbury.ac.nz> <-176815014396412028@unknownmsgid> Message-ID: On Wed, Apr 13, 2016 at 8:36 PM, Terry Reedy wrote: > On 4/13/2016 12:52 AM, Chris Barker - NOAA Federal wrote: > >> BTW, isn't it impossible to randomly select from an infinite iterable >> anyway? > > > With equal probability, yes, impossible. > With skewed probabilities, no, possible. 
What, you mean like this? def choice(it): it = iter(it) value = next(it) try: while random.randrange(2): value = next(it) except StopIteration: pass return value I'm not sure how useful it is, but it does accept potentially infinite iterables, and does return values selected at random... ChrisA From ncoghlan at gmail.com Wed Apr 13 09:31:54 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 13 Apr 2016 23:31:54 +1000 Subject: [Python-ideas] Dunder method to make object str-like In-Reply-To: References: <1460038026.1126929.571843177.7BC2EE4C@webmail.messagingengine.com> <1460045791.1159102.571987969.5CC10DF6@webmail.messagingengine.com> <87h9f9p37z.fsf@vostro.rath.org> <570AC623.1070701@stoneleaf.us> <87d1pwrysa.fsf@thinkpad.rath.org> <570BBF1C.9090805@stoneleaf.us> <871t6bua36.fsf@thinkpad.rath.org> <570C14EB.1080007@stoneleaf.us> <878u0ivj9b.fsf@thinkpad.rath.org> Message-ID: On 13 April 2016 at 03:33, Brett Cannon wrote: > > > On Tue, 12 Apr 2016 at 10:18 Nikolaus Rath wrote: >> >> On Apr 11 2016, Ethan Furman >> wrote: >> >> As far as I can see, implementing a protocol instead of adding a few >> >> isinstance checks is more likely to make the life of a CPython >> >> developer >> >> harder than easier. >> > >> > I disagree. And the protocol idea was not mine, so apparently other >> > core-devs also disagree (or think it's worth it, regardless). >> >> I haven't found any email explaining why a protocol would make things >> easier than the isinstance() approach (and I read most of the threads >> both here and on -dev), so I was assuming that the core-devs in question >> don't disagree but haven't considered the second approach. > > > I disagree with the idea. :) Type checking tends to be too strict as it > prevents duck typing. And locking ourselves to only what's in the stdlib is > way too restrictive when there are alternative path libraries out there > already. 
Plus it's short-sighted to assume no one will ever come up with a > better path library, and so tying us down to only pathlib.PurePath > explicitly would be a mistake long-term when specifying a single method > keeps things flexible (this is the same reason __index__() exists and we > simply don't say indexing has to be with a subclass of int or something). In addition to these general API design points, there's also a key pragmatic point, which is that we probably want importlib._bootstrap_external to be able to use the new API, which means it needs to work before any non-builtin and non-frozen modules are available. That's easy with a protocol, hard with a concrete class. Most folks don't need to worry about "How do we make this code work when the interpreter isn't fully configured yet?", but we don't always have that luxury :) As for why it wasn't specifically discussed before now, "protocols are preferable to dependencies on concrete classes" is such an entrenched design pattern in Python by this point that we tend to take it for granted, so unless someone specifically asks for an explanation, we're unlikely to spell it out again. (Thanks for doing that, by the way - I suspect you're far from the only one that was puzzled by the apparent omission). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From chris.barker at noaa.gov Wed Apr 13 11:31:15 2016 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Wed, 13 Apr 2016 08:31:15 -0700 Subject: [Python-ideas] random.choice on non-sequence In-Reply-To: References: <570D6B50.9070005@btinternet.com> <570DA411.3020003@canterbury.ac.nz> <-176815014396412028@unknownmsgid> Message-ID: <5974248239762161974@unknownmsgid> > There's an upper limit on how dense a CPython dict or set can become > (the load factor doesn't exceed 2/3), but no lower limit.
For > example, it's easy to end up with a dict holding a single entry hiding > among millions of empty slots (dicts are never resized on key > deletion, only on key insertion). Easy, yes. Common? I wonder. If it were common then wouldn't there be good reason to resize the hash table when that occurred? Aside from being able to select random items, of course... -CHB > >> ... >> BTW, isn't it impossible to randomly select from an infinite iterable anyway? > > Of course, but it is possible to do uniform random selection, in one > pass using constant space and in linear time, from an iterable whose > length isn't known in advance (simple case of "reservoir sampling"). From guido at python.org Wed Apr 13 12:16:55 2016 From: guido at python.org (Guido van Rossum) Date: Wed, 13 Apr 2016 09:16:55 -0700 Subject: [Python-ideas] random.choice on non-sequence In-Reply-To: <5974248239762161974@unknownmsgid> References: <570D6B50.9070005@btinternet.com> <570DA411.3020003@canterbury.ac.nz> <-176815014396412028@unknownmsgid> <5974248239762161974@unknownmsgid> Message-ID: On Wed, Apr 13, 2016 at 8:31 AM, Chris Barker - NOAA Federal wrote: >> There's an upper limit on how dense a CPython dict or set can become >> (the load factor doesn't exceed 2/3), but no lower limit. For >> example, it's easy to end up with a dict holding a single entry hiding >> among millions of empty slots (dicts are never resized on key >> deletion, only on key insertion). > > Easy, yes. Common? I wonder. If it were common then wouldn't there be > good reason to resize the hash table when that occurred? Aside from > being able to select random items, of course... So it's becoming clear that if we wanted to do this right for sets (and dicts) the choice() implementation should be part of the set/dict implementation. It can then take care of compacting the set if its density is too low. But honestly I don't see the use case come up enough to warrant all that effort.
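For callers who need this at the user level today, the standard trick is a list-plus-index-map container, which gives O(1) add, discard, and uniform choice at the cost of extra memory (a sketch, not anything in the stdlib):

```python
import random

class RandomChoiceSet:
    """Set-like container with O(1) add, discard, and uniform choice.

    Keeps the elements in a gap-free list plus a hash map from each
    element to its list position, so choice() is one randrange() call
    and one index.
    """
    def __init__(self, iterable=()):
        self._items = []
        self._pos = {}
        for x in iterable:
            self.add(x)

    def add(self, x):
        if x not in self._pos:
            self._pos[x] = len(self._items)
            self._items.append(x)

    def discard(self, x):
        i = self._pos.pop(x, None)
        if i is not None:
            last = self._items.pop()
            if i < len(self._items):
                self._items[i] = last   # move last element into the hole
                self._pos[last] = i

    def __contains__(self, x):
        return x in self._pos

    def __len__(self):
        return len(self._items)

    def choice(self):
        if not self._items:
            raise IndexError('Cannot choose from an empty set')
        return self._items[random.randrange(len(self._items))]
```

This sidesteps the hash-table-gaps problem entirely, since the backing list is always dense.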
-- --Guido van Rossum (python.org/~guido) From mal at egenix.com Wed Apr 13 12:33:12 2016 From: mal at egenix.com (M.-A. Lemburg) Date: Wed, 13 Apr 2016 18:33:12 +0200 Subject: [Python-ideas] random.choice on non-sequence In-Reply-To: References: <570D6B50.9070005@btinternet.com> <570DA411.3020003@canterbury.ac.nz> <-176815014396412028@unknownmsgid> Message-ID: <570E74C8.70308@egenix.com> On 13.04.2016 07:08, Tim Peters wrote: > [Chris Barker - NOAA Federal ] >> ... >> Anyway, the hash table has gaps, but wouldn't it be sufficiently >> random to pick a random index, and if it's a gap, pick another one? > > That would be a correct implementation, but ... > >> I suppose in theory, this could be in infinite process, but in practice, >> it would be O(1) with a constant of two or three... Better than >> iterating through on average half of the keys. > > There's an upper limit on how dense a CPython dict or set can become > (the load factor doesn't exceed 2/3), but no lower limit. For > example, it's easy to end up with a dict holding a single entry hiding > among millions of empty slots (dicts are never resized on key > deletion, only on key insertion). Converting a set to a list is O(N) (with N being the number of slots allocated for the set, with N > n, the number of used keys), so any gap skipping logic won't be better in performance than doing: import random my_set = {1, 2, 3, 4} l = list(my_set) selection = [random.choice(l) for x in l] print (selection) You'd only have a memory advantage, AFAICT. >> ... >> BTW, isn't it impossible to randomly select from an infinite iterable anyway? > > Of course, but it is possible to do uniform random selection, in one > pass using constant space and in linear time, from an iterable whose > length isn't known in advance (simple case of "reservoir sampling"). -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Experts (#1, Apr 13 2016) >>> Python Projects, Coaching and Consulting ... 
http://www.egenix.com/ >>> Python Database Interfaces ... http://products.egenix.com/ >>> Plone/Zope Database Interfaces ... http://zope.egenix.com/ ________________________________________________________________________ ::: We implement business ideas - efficiently in both time and costs ::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ http://www.malemburg.com/ From guido at python.org Wed Apr 13 12:31:05 2016 From: guido at python.org (Guido van Rossum) Date: Wed, 13 Apr 2016 09:31:05 -0700 Subject: [Python-ideas] random.choice on non-sequence In-Reply-To: <570E15C9.7050703@btinternet.com> References: <570D6B50.9070005@btinternet.com> <570E15C9.7050703@btinternet.com> Message-ID: On Wed, Apr 13, 2016 at 2:47 AM, Rob Cliffe wrote: > Isn't there an inconsistency that random.sample caters to a set by > converting it to a tuple, but random.choice doesn't? Perhaps because the use cases are different? Over the years I've learned that inconsistencies aren't always signs of sloppy thinking -- they may actually point to deep issues that aren't apparently on the surface. I imagine the typical use case for sample() to be something that samples the population once and then does something to the sample; the next time sample() is called the population is probably different (e.g. the next lottery has a different set of players). But I imagine a fairly common use case for choice() to be choosing from the same population over and over, and that's exactly the case where the copying implementation you're proposing would be a small disaster. 
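That repeated-selection pattern is cheap if the caller pays the copying cost once, up front (illustrative population):

```python
import random

population = {'red', 'green', 'blue', 'yellow'}

# Tuple-ise once; every subsequent selection is then O(1).
as_seq = tuple(population)
draws = [random.choice(as_seq) for _ in range(1000)]
```

It is exactly this convert-on-every-call cost that a copying choice() would hide from the caller.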
-- --Guido van Rossum (python.org/~guido) From tim.peters at gmail.com Wed Apr 13 14:04:46 2016 From: tim.peters at gmail.com (Tim Peters) Date: Wed, 13 Apr 2016 13:04:46 -0500 Subject: [Python-ideas] random.choice on non-sequence In-Reply-To: <5974248239762161974@unknownmsgid> References: <570D6B50.9070005@btinternet.com> <570DA411.3020003@canterbury.ac.nz> <-176815014396412028@unknownmsgid> <5974248239762161974@unknownmsgid> Message-ID: [Tim] >> There's an upper limit on how dense a CPython dict or set can become >> (the load factor doesn't exceed 2/3), but no lower limit. For >> example, it's easy to end up with a dict holding a single entry hiding >> among millions of empty slots (dicts are never resized on key >> deletion, only on key insertion). [Chris Barker - NOAA Federal ] > Easy, yes. Common? I wonder. Depends on the app. I doubt it's common across all apps. > If it were common then wouldn't there be good reason to resize the > hash table when that occurred? Aside from being able to select random > items, of course... Shrinking the table would have no primary effect on speed of subsequent access or deletion - it's O(1) expected regardless. Shrinking the table is expensive when it's done (the entire object has to be traversed, and the items reinserted one by one, into a smaller dict/set). Regardless, I'd be loathe to add a conditional branch on every deletion to check. Nothing is free, and the check would be a waste of cycles every time for apps that don't care about O(1) uniform random dict/set selection. Note that I don't care about the "wasted" memory, though. 
But then I don't care about O(1) uniform random dict/set selection either ;-) From greg.ewing at canterbury.ac.nz Wed Apr 13 17:47:37 2016 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 14 Apr 2016 09:47:37 +1200 Subject: [Python-ideas] random.choice on non-sequence In-Reply-To: References: <570D6B50.9070005@btinternet.com> <570DA411.3020003@canterbury.ac.nz> <-176815014396412028@unknownmsgid> Message-ID: <570EBE79.6070704@canterbury.ac.nz> Chris Angelico wrote: > On Wed, Apr 13, 2016 at 8:36 PM, Terry Reedy wrote: > >>On 4/13/2016 12:52 AM, Chris Barker - NOAA Federal wrote: >> >>>BTW, isn't it impossible to randomly select from an infinite iterable >>>anyway? >> >>With equal probability, yes, impossible. > > def choice(it): > it = iter(it) > value = next(it) > try: > while random.randrange(2): > value = next(it) > except StopIteration: pass > return value I think Terry meant that you can't pick just one item that's equally likely to be any of the infinitely many items returned by the iterator. You can prove that by considering that the probability of a given item being returned would have to be 1/infinity, which is zero -- so you can't return anything! 
-- Greg From ethan at stoneleaf.us Wed Apr 13 18:55:04 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Wed, 13 Apr 2016 15:55:04 -0700 Subject: [Python-ideas] random.choice on non-sequence In-Reply-To: <570DC67A.4040609@btinternet.com> References: <570D6B50.9070005@btinternet.com> <85inzmzb84.fsf@benfinney.id.au> <570DC67A.4040609@btinternet.com> Message-ID: <570ECE48.7000104@stoneleaf.us> On 04/12/2016 09:09 PM, Rob Cliffe wrote: > I did a little experimenting (admittedly non-rigorous, on one platform > only, and using Python 2.7.10, not Python 3, and using code which could > very possibly be improved) on selecting a random element from a > generator, and found that > > for small or moderate generators, reservoir sampling was almost > always slower than generating a tuple > > as the generator length increased up to roughly 10,000, the ratio > (time taken by reservoir) / (time taken by tuple) > increased, reaching a maximum of over 4 > as the generator length increased further, the ratio started to > decrease, although for a length of 80 million (about as large as I could > test) it was still over 3. I suspect this proves the point -- reservoir sampling is good not because it is fast, but because it won't drain your memory keeping items you will not return. -- ~Ethan~ From tjreedy at udel.edu Wed Apr 13 19:29:53 2016 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 13 Apr 2016 19:29:53 -0400 Subject: [Python-ideas] random.choice on non-sequence In-Reply-To: References: <570D6B50.9070005@btinternet.com> <570DA411.3020003@canterbury.ac.nz> <-176815014396412028@unknownmsgid> Message-ID: On 4/13/2016 6:46 AM, Chris Angelico wrote: > On Wed, Apr 13, 2016 at 8:36 PM, Terry Reedy wrote: >> On 4/13/2016 12:52 AM, Chris Barker - NOAA Federal wrote: >> >>> BTW, isn't it impossible to randomly select from an infinite iterable >>> anyway? >> >> >> With equal probability, yes, impossible. 
I have seen too many mathematical or statistical writers who ought to know better write "Let N be a random integer..." with no indication of a finite bound or other than a uniform distribution. No wonder students and readers sometimes get confused. >> With skewed probabilities, no, possible. > > What, you mean like this? > > def choice(it): > it = iter(it) > value = next(it) > try: > while random.randrange(2): > value = next(it) > except StopIteration: pass > return value With a perfect source of random numbers, this will halt with probability one. With current pseudorandom generators, this will definitely halt. And yes, both theoretically and practically, this is an example of skewed probabilities -- a waiting time distribution. > I'm not sure how useful it is, but it does accept potentially infinite > iterables, and does return values selected at random... People often want variates selected from a non-uniform distribution. Often, a uniform variate can be transformed. Sometimes multiple uniform variates are needed. If 'it' above is itertools.count, the above models the waiting time to get 'heads' (or 'tails') with a fair coin. Waiting times might be obtained more efficiently, perhaps, with, say, randrange(2**16), with another draw for as long as the value is 0 plus some arithmetic that uses int.bit_length. (Details to be verified.)
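One way to cash out that sketch (an illustration only, filling in the details Terry leaves unverified): draw 16 coin flips at once with getrandbits(), and read off the position of the first heads with int.bit_length():

```python
import random

def waiting_time(bits=16):
    """Number of fair-coin flips up to and including the first heads.

    Models the waiting-time distribution described above, but consumes
    one getrandbits(bits) call per `bits` flips instead of one
    randrange(2) call per flip.
    """
    flipped = 0
    while True:
        r = random.getrandbits(bits)
        if r:
            # Highest set bit of r marks the first heads in this chunk.
            return flipped + bits - r.bit_length() + 1
        flipped += bits   # all tails in this chunk; keep flipping
```

The expected waiting time for a fair coin is 2 flips, which makes a quick sanity check.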
-- Terry Jan Reedy

From steve at pearwood.info Wed Apr 13 21:30:38 2016 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 14 Apr 2016 11:30:38 +1000 Subject: [Python-ideas] random.choice on non-sequence In-Reply-To: <570EBE79.6070704@canterbury.ac.nz> References: <570D6B50.9070005@btinternet.com> <570DA411.3020003@canterbury.ac.nz> <-176815014396412028@unknownmsgid> <570EBE79.6070704@canterbury.ac.nz> Message-ID: <20160414013037.GF1819@ando.pearwood.info> On Thu, Apr 14, 2016 at 09:47:37AM +1200, Greg Ewing wrote: > Chris Angelico wrote: > >On Wed, Apr 13, 2016 at 8:36 PM, Terry Reedy wrote: > > > >>On 4/13/2016 12:52 AM, Chris Barker - NOAA Federal wrote: > >> > >>>BTW, isn't it impossible to randomly select from an infinite iterable > >>>anyway? > >> > >>With equal probability, yes, impossible. [...] > I think Terry meant that you can't pick just one item that's > equally likely to be any of the infinitely many items returned > by the iterator. Correct. That's equivalent to choosing a positive integer with a uniform probability distribution and no upper bound. > You can prove that by considering that the probability of > a given item being returned would have to be 1/infinity, > which is zero -- so you can't return anything! That's not how probability works :-) Consider a dart which is thrown at a dartboard. The probability of it landing on any specific point is zero, since the area of a single point is zero. Nevertheless, the dart does hit somewhere! A formal and precise treatment would have to involve calculus and limits as the probability approaches zero, rather than a flat out "the probability is zero, therefore it's impossible". Slightly less formally, we can say (only horrifying mathematicians a little bit) that the probability of any specific number is an infinitesimal number.
https://en.wikipedia.org/wiki/Infinitesimal

While it is *mathematically* meaningful to talk about selecting a random positive integer uniformly, it's hard to do much more than that. The mean (average) is undefined[1]. A typical value chosen would have a vast number of digits, far larger than anything that could be stored in computer memory. Indeed Almost All[2] of the values we generate would be so large that we have no notation for writing them down (and not enough space in the universe to write them even if we did). So it is impossible in practice to select a random integer with uniform distribution and no upper bound. Non-uniform distributions, though, are easy :-)

[1] While weird, this is not all that weird. For example, the Cauchy distribution also has an undefined mean. What this means in practice is that the *sample mean* will not converge as you take more and more samples: the more samples you take, the more wildly the average will jump all over the place. https://en.wikipedia.org/wiki/Cauchy_distribution#Estimation_of_parameters

[2] https://en.wikipedia.org/wiki/Almost_all

-- Steve

From greg.ewing at canterbury.ac.nz Thu Apr 14 02:03:08 2016 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 14 Apr 2016 18:03:08 +1200 Subject: [Python-ideas] random.choice on non-sequence In-Reply-To: <20160414013037.GF1819@ando.pearwood.info> References: <570D6B50.9070005@btinternet.com> <570DA411.3020003@canterbury.ac.nz> <-176815014396412028@unknownmsgid> <570EBE79.6070704@canterbury.ac.nz> <20160414013037.GF1819@ando.pearwood.info> Message-ID: <570F329C.9070500@canterbury.ac.nz> Steven D'Aprano wrote: > A formal and precise treatment would have to involve calculus and limits > as the probability approaches zero, rather than a flat out "the > probability is zero, therefore it's impossible". The limit of p = 1/n as n goes to infinity is zero. Events with zero probability can't happen. I don't know how it can be made more rigorous than that.
I think what this means is that if you somehow wrote a program to draw a number from an infinite uniform distribution, it would never terminate.

> A typical value chosen would have a vast
> number of digits, far larger than anything that could be stored in
> computer memory.

I'm not sure it even makes sense to talk about a typical number, because for any number you pick, there are infinitely many numbers with more digits, but only finitely many with the same or fewer digits. So the probability of getting only that many digits is zero, too! Infinity: Just say no. -- Greg

From joshua.morton13 at gmail.com Thu Apr 14 02:59:44 2016 From: joshua.morton13 at gmail.com (Joshua Morton) Date: Thu, 14 Apr 2016 06:59:44 +0000 Subject: [Python-ideas] Dictionary views are not entirely 'set like' In-Reply-To: <4E0BE43A-2B95-4AE0-87E2-731DCF44859F@selik.org> References: <4E0BE43A-2B95-4AE0-87E2-731DCF44859F@selik.org> Message-ID: These were exactly my thoughts. I wanted to bump this, since it was drowned out by more important things like Paths and invertible booleans, and almost no discussion was had on the main issue, that of dict views not acting like sets. Since things have died down: is that behavior something that should be remedied, and if so, should the fix be backwards compatible (make set more permissive) or treated as a bugfix (make the views raise errors)? And should it include union/other methods to keep performance for the use case that currently raises an error? -Josh

On Thu, Apr 7, 2016 at 1:53 PM Michael Selik wrote: > > > On Apr 6, 2016, at 10:38 AM, Joshua Morton > wrote: > > set() | [] # 2 > > {}.keys() | [] # 3 > > Looks like this should be standardized. Either both raise TypeError, or > both return a set. My preference would be TypeError, but that might be > worse for backwards-compatibility. > > > {}.keys().union(set()) # 6 > > Seems to me that the pipe operator is staying on MappingView, so it's > reasonable to add a corresponding ``.union`` to mimic sets. And > intersection, etc.
> > > {}.values() == {}.values() # 9 > > d = {}; d.values() == d.values() # 10 > > It's weird, but float('nan') != float('nan'). I'm not particularly > bothered by this. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ian.g.kelly at gmail.com Thu Apr 14 04:03:00 2016 From: ian.g.kelly at gmail.com (Ian Kelly) Date: Thu, 14 Apr 2016 02:03:00 -0600 Subject: [Python-ideas] random.choice on non-sequence In-Reply-To: <570F329C.9070500@canterbury.ac.nz> References: <570D6B50.9070005@btinternet.com> <570DA411.3020003@canterbury.ac.nz> <-176815014396412028@unknownmsgid> <570EBE79.6070704@canterbury.ac.nz> <20160414013037.GF1819@ando.pearwood.info> <570F329C.9070500@canterbury.ac.nz> Message-ID: On Thu, Apr 14, 2016 at 12:03 AM, Greg Ewing wrote: > Steven D'Aprano wrote: >> >> A formal and precise treatment would have to involve calculus and limits >> as the probability approaches zero, rather than a flat out "the probability >> is zero, therefore it's impossible". > > > The limit of p = 1/n as n goes to infinity is zero. > Events with zero probability can't happen. I don't > know how it can be made more rigorous than that. This only holds true if the sample space is finite. The mathematical term for an event with zero probability that nonetheless can happen is "almost never". See https://en.wikipedia.org/wiki/Almost_surely As usual, infinity is weird. From stephen at xemacs.org Thu Apr 14 04:16:06 2016 From: stephen at xemacs.org (Stephen J. 
Turnbull) Date: Thu, 14 Apr 2016 17:16:06 +0900 Subject: [Python-ideas] random.choice on non-sequence In-Reply-To: <570F329C.9070500@canterbury.ac.nz> References: <570D6B50.9070005@btinternet.com> <570DA411.3020003@canterbury.ac.nz> <-176815014396412028@unknownmsgid> <570EBE79.6070704@canterbury.ac.nz> <20160414013037.GF1819@ando.pearwood.info> <570F329C.9070500@canterbury.ac.nz> Message-ID: <22287.20934.563800.765106@turnbull.sk.tsukuba.ac.jp> Greg Ewing writes: > The limit of p = 1/n as n goes to infinity is zero. > Events with zero probability can't happen. I don't > know how it can be made more rigorous than that. Sorry, as Steven pointed out, events *that are modeled as* having zero probability happen all the time. What you should not do with such events is give them positive (ie, atomic) weight in statistical calculations (= theory), and therefore you should not *predict* their occurrence as individuals (unless you have no fear of becoming a laughingstock = practice). > Infinity: Just say no. I thought your name was Ewing. Is it really Brouwer? (Don't bother, I know. The reference to Brouwer is close enough for tsukkomi[1].) Footnotes: [1] https://en.wikipedia.org/wiki/Glossary_of_owarai_terms N.B. "Straight man" is not an accurate translation.

From nikita at nemkin.ru Thu Apr 14 04:23:05 2016 From: nikita at nemkin.ru (Nikita Nemkin) Date: Thu, 14 Apr 2016 13:23:05 +0500 Subject: [Python-ideas] Module lifecycle: simple alternative to PEP 3121/PEP 489 Message-ID: Reading PEP 3121/PEP 489 I can't stop wondering, why do extension modules require such specialized lifecycle APIs? Why not just let them subclass ModuleType? (Or any type, really, but ModuleType might be a good place to define some standard behavior.) Module instance naturally encapsulates C level module state. GC/finalization happens just like for any other object. PEP 3121 becomes redundant.
Two-step initialization (PEP 489) can be achieved by defining a new kind of PyInit_XXX entry point, returning a module *type*, instead of a module *instance*. No extra API needed beyond that! Now, importer can simply instantiate this module type, passing __name__, __file__ and the rest. ModuleType.tp_new will perform attribute init, sys.modules registration etc. OR the importer can manually pull tp_new/tp_init/attribute setup, supplanting type_call. (This is closer to the current way of doing things.) Actual module initialization ("executing the module body") happens in tp_init. reload() is equivalent to calling tp_init again. Subinterpreter interaction becomes transparent: every interpreter instantiates its own module copy. "Singleton" modules with external global state should fail second instantiation (maybe by deriving from a special SingletonModuleType subclass that will handle it for them). Additionally, custom module type allows fine grained attribute access control (aka metamodules), useful to many complex modules. C synchronized module "variables" become super-easy to define (tp_members). For lazy loading and importing there's tp_getattro, tp_getset, etc. One problem not solved by this approach (nor the current approach) is module state access from methods of extension types. At least two solutions are possible: 1. Look up the module by name (sys.modules) or type (new per-interpreter cache). 2. Define a new METH_XXX calling convention (or flag) and pass both PyCFunctionObject.m_self and PyCFunctionObject.m_module to the C level method implementations. Both can be implemented, #1 being simple and #2 being proper. What do you think? 
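The "metamodule" idea in the proposal above (fine-grained attribute access control via a ModuleType subclass) can already be tried out at the Python level. The sketch below is illustrative only; the class and module names are invented for the example.

```python
import sys
import types

class FrozenModule(types.ModuleType):
    """A module whose attributes cannot be rebound once set."""

    def __setattr__(self, name, value):
        if hasattr(self, name):
            raise AttributeError(f"cannot rebind {self.__name__}.{name}")
        super().__setattr__(name, value)

# Build and register the module by hand, much as an importer would.
mod = FrozenModule("frozen_demo", "Demo of a ModuleType subclass.")
mod.ANSWER = 42
sys.modules["frozen_demo"] = mod

import frozen_demo
print(frozen_demo.ANSWER)    # 42
# frozen_demo.ANSWER = 43 would now raise AttributeError
```

The same pattern is what the "fine grained attribute access control" point refers to; the C-level question in the thread is how to get equivalent behavior without writing a full PyTypeObject by hand.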
From encukou at gmail.com Thu Apr 14 05:57:19 2016 From: encukou at gmail.com (Petr Viktorin) Date: Thu, 14 Apr 2016 11:57:19 +0200 Subject: [Python-ideas] Module lifecycle: simple alternative to PEP 3121/PEP 489 In-Reply-To: References: Message-ID: <570F697F.2020606@gmail.com> On 04/14/2016 10:23 AM, Nikita Nemkin wrote: > Reading PEP 3121/PEP 489 I can't stop wondering, why do extension > modules require such specialized lifecycle APIs? Why not just let > them subclass ModuleType? (Or any type, really, but ModuleType might > be a good place to define some standard behavior.) Good question. I'll list some assumptions; if you don't share them we can talk more about these:

- some things are easy in Python, but not pleasant to do correctly* in C:
  - subclassing
  - creating simple objects
  - setting attributes
- most modules don't need special functionality

* ("Correctly" includes things like error checking, which many third-party modules skimp on.)

Most of the API is in the style of "leave this NULL unless you need something special", i.e. simple things are easy, but the complex cases are possible. Creating custom ModuleType subclasses is possible, but I definitely wouldn't want to require every module author to do that. A lot of the API is convenience features, which are important: they do the right thing (for example, w.r.t. error checking), and they're easier to use than alternatives. This makes the API grow into several layers; for example:

- m_methods in the PyModuleDef
- PyModule_AddFunctions
- PyCFunction_NewEx & PyObject_SetAttrString

Usually you just set the first, but if you need more control (e.g. you're creating modules dynamically), you can use the lower-level tools. Your suggestion won't really help with this kind of complexity. Oh, and there is a technical reason against subclassing ModuleType unless necessary: Custom ModuleType subclasses cannot be made to work with runpy (i.e. python -m).
For ModuleType (the ones without a custom create_module in the current API), this doesn't *currently* work, but the PEPs were written so that it's "just" a question of spending some development effort on runpy. > Module instance naturally encapsulates C level module state. > GC/finalization happens just like for any other object. PEP 3121 > becomes redundant. It doesn't become fully redundant: m_methods is still useful. The rest of PyModuleDef makes the more common complex cases simpler than a full-blown ModuleType subclass. > Two-step initialization (PEP 489) can be achieved by defining > a new kind of PyInit_XXX entry point, returning a module *type*, > instead of a module *instance*. No extra API needed beyond that! With the current API, you don't return a module *instance*, but a module *description* (PyModuleDef). This is a lot easier than creating a subclass in C. With your suggestion, I fear that someone would quickly come up with a macro to automate creating simple ModuleType instances, and at that point the API would be as complex as it is now, but every module instance would now also have an extra ModuleType subclass -- and I don't think that's either simpler or more effective. > Now, importer can simply instantiate this module type, passing > __name__, __file__ and the rest. ModuleType.tp_new will perform > attribute init, sys.modules registration etc. > OR > the importer can manually pull tp_new/tp_init/attribute setup, supplanting > type_call. (This is closer to the current way of doing things.) > > Actual module initialization ("executing the module body") > happens in tp_init. reload() is equivalent to calling tp_init again. > > Subinterpreter interaction becomes transparent: every interpreter > instantiates its own module copy. "Singleton" modules with > external global state should fail second instantiation > (maybe by deriving from a special SingletonModuleType subclass > that will handle it for them)
Current status: every interpreter instantiates its own module instance. "Singleton" modules with external global state are marked as such, and should be written so that they fail second instantiation. (Maybe the failing can be automated by the import machinery, but that part is not yet implemented). It seems to me that adding a custom ModuleType subclass to the mix wouldn't change much. > Additionally, custom module type allows fine grained attribute > access control (aka metamodules), useful to many complex modules. > C synchronized module "variables" become super-easy to define > (tp_members). For lazy loading and importing there's > tp_getattro, tp_getset, etc. Right. This is the kind of thing you *do* need a ModuleType subclass for, and the current API makes it possible to do it. > One problem not solved by this approach (nor the current approach) > is module state access from methods of extension types. > At least two solutions are possible: > 1. Look up the module by name (sys.modules) or type (new per-interpreter > cache). > 2. Define a new METH_XXX calling convention (or flag) and pass both > PyCFunctionObject.m_self and PyCFunctionObject.m_module > to the C level method implementations. > Both can be implemented, #1 being simple and #2 being proper. *This* is the real problem now. I think #2 is viable, and I'm slowly (too slowly perhaps) working on it > What do you think? I personally think that your suggestion wouldn't make the API substantially simpler, assuming you would keep it as robust and easy to use as the current solution. And if you would want to maintain backwards compatibility (even with only the pre-PEP 489 state), it would be even harder. Many people thought about the current APIs, and (almost) all of the unpleasant decisions we had to make do have their reasons. (And if you ask about a more specific decision, I can give you the specific reasons.) 
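The layering described above (a declarative description handled for you, with lower-level calls available when needed) has a rough pure-Python analogue. The helper below is invented for illustration and is not part of any API; the comments note which C-level facilities each step loosely mirrors.

```python
import types

def module_from_description(name, doc=None, functions=(), constants=None):
    """Build a module object from a declarative 'description', loosely
    mirroring how a module is populated from a PyModuleDef."""
    mod = types.ModuleType(name, doc)       # cf. PyModule_Create
    for func in functions:                  # cf. m_methods / PyModule_AddFunctions
        setattr(mod, func.__name__, func)
    for key, value in (constants or {}).items():
        setattr(mod, key, value)            # cf. PyModule_AddObject
    return mod

def greet(who):
    return f"hello, {who}"

demo = module_from_description(
    "demo",
    doc="A module built from a description.",
    functions=[greet],
    constants={"VERSION": "0.1"},
)
print(demo.greet("world"), demo.VERSION)    # hello, world 0.1
```

As in the C API, the simple declarative path covers the common case, while `setattr` on the module object remains available for dynamic cases.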
From nikita at nemkin.ru Thu Apr 14 09:51:15 2016 From: nikita at nemkin.ru (Nikita Nemkin) Date: Thu, 14 Apr 2016 18:51:15 +0500 Subject: [Python-ideas] Module lifecycle: simple alternative to PEP 3121/PEP 489 Message-ID: On Thu, Apr 14, 2016 at 2:57 PM, Petr Viktorin wrote: > On 04/14/2016 10:23 AM, Nikita Nemkin wrote: >> Reading PEP 3121/PEP 489 I can't stop wondering, why do extension >> modules require such specialized lifecycle APIs? Why not just let >> them subclass ModuleType? (Or any type, really, but ModuleType might >> be a good place to define some standard behavior.) > > Good question. > I'll list some assumptions; if you don't share them we can talk more > about these: > - somethings are easy in Python, but not pleasant to do correctly* in C: > - subclassing > - creating simple objects > - setting attributes > - most modules don't need special functionality > > * ("Correctly" includes things like error checking, which many > third-party modules skimp on.) > > Most of the API is in the style of "leave this NULL unless you need > something special", i.e. simple things are easy, but the complex cases > are possible. Creating custom ModuleType subclasses is possible, but I > definitely wouldn't want to require every module author to do that. > > A lot of the API is convenience features, which are important: they do > the right thing (for example, w.r.t. error checking), and they're easier > to use than alternatives. This makes the API grows into several layers; > for example: > - m_methods in the PyModuleDef > - PyModule_AddFunctions > - PyCFunction_NewEx & PyObject_SetAttrString > Usually you just set the first, but if you need more control (e.g. > you're creating modules dynamically), you can use the lower-level tools. > > Your suggestion won't really help with this kind of complexity. 
I agree that C API is quite difficult and dangerous to use directly (that's why I always use Cython), but I don't agree that a separate module init API makes things simpler. PyModuleDef appears to be a PyTypeObject surrogate with similar (or identical?) semantics. That's an extra concept to learn about, an extra bit of documentation to consult when writing new code. PyTypeObject, on the other hand, is fundamental and unavoidable, regardless of module init system. Practically speaking, PyModuleDef_Init(&spam_def) and PyType_Ready(&spam_type) differ only in the number of zeroes in their respective static structs. There's also PyType_FromSpec (stable ABI), which might appeal to some people more than PyType_Ready. > Oh, and there is a technical reason against subclassing ModuleType > unless necessary: Custom ModuleType subclasses cannot be made to work > with runpy (i.e. python -m). For ModuleType (the ones without a custom > create_module in the current API), this doesn't *currently* work, but > the PEPs were written so that it's "just" a question of spending some > development effort on runpy. I didn't even know that python -m supported extension modules. >> Module instance naturally encapsulates C level module state. >> GC/finalization happens just like for any other object. PEP 3121 >> becomes redundant. > > It doesn't become fully redundant: m_methods is still useful. ModuleType subclasses have tp_methods. A little check in PyType_Ready can make it behave like PyModuleDef.m_methods. Modules don't need normal methods anyway. > With your suggestion, I fear that someone would quickly come up with a > macro to automate creating simple ModuleType instances, and at that > point the API would be as complex as it is now, but every module > instance would now also have an extra ModuleType subclass -- and I don't > think that's either simpler or more effective. One or two standard macros that help with ModuleType subclassing are in order.
Something like PyModule_HEAD_INIT, whose usage is already obvious from the name. > I personally think that your suggestion wouldn't make the API > substantially simpler, assuming you would keep it as robust and easy to > use as the current solution. And if you would want to maintain backwards > compatibility (even with only the pre-PEP 489 state), it would be even > harder. > > Many people thought about the current APIs, and (almost) all of the > unpleasant decisions we had to make do have their reasons. (And if you > ask about a more specific decision, I can give you the specific reasons.) Regarding backward compatibility, PEP 3121 and most of PEP 489 will work just fine as a facade to ModuleType subclass. I understand that existing PEPs are well researched and there's no real incentive to change. My proposal is more of a hypothetical kind. From stefan at bytereef.org Thu Apr 14 11:07:28 2016 From: stefan at bytereef.org (Stefan Krah) Date: Thu, 14 Apr 2016 15:07:28 +0000 (UTC) Subject: [Python-ideas] Module lifecycle: simple alternative to PEP 3121/PEP 489 References: Message-ID: Nikita Nemkin writes: > I understand that existing PEPs are well researched and there's no > real incentive to change. My proposal is more of a hypothetical kind. Petr obviously has researched all this carefully, but there is an incentive: _decimal for example takes a speed hit of 20% with PEP-3121, so it's not implemented. I suspect that the later PEP also slows down modules (which does not matter most of the time). Any new proposal should absolutely include the performance issue. Stefan Krah From walker_s at hotmail.co.uk Thu Apr 14 12:25:56 2016 From: walker_s at hotmail.co.uk (SW) Date: Thu, 14 Apr 2016 17:25:56 +0100 Subject: [Python-ideas] PEP8 operator must come before line break Message-ID: Hi, PEP8 says that: "The preferred place to break around a binary operator is after the operator, not before it." 
This is ignored in the multiline if examples, and seems to generally be a bad idea as it negatively impacts clarity.

For example, the following seems much clearer, as the entire line does not need to be scanned to see the intent -- only the start of the line is needed to see how the different properties are used for filtering:

mylist = [
    item for item in mylist
    if item['property'] == 2
    and item['otherproperty'] == 'test'
]

The alternative seems less clear:

mylist = [
    item for item in mylist
    if item['property'] == 2 and
    item['otherproperty'] == 'test'
]

If this recommendation remains in force, it would be good to:
1. Follow it in the style guide.
2. Provide a rationale for it, as currently it seems arbitrary and unhelpful.

Just raising this as it has bitten me a couple of times with style checkers recently.

Thanks,
S

From guido at python.org Thu Apr 14 12:35:43 2016 From: guido at python.org (Guido van Rossum) Date: Thu, 14 Apr 2016 09:35:43 -0700 Subject: [Python-ideas] PEP8 operator must come before line break In-Reply-To: References: Message-ID: Where in PEP 8 does it violate its own advice? That's a bug. (The PEP has many authors by now.) As the PEP acknowledges, style is hard to agree over. It's even harder to change an agreement that has been documented (if not always followed consistently) for over 20 years. My rationale for this rule is that ending a line in a binary operator is a clear hint to the reader that the line isn't finished. (If you think about it, a comma is a kind of binary operator, and you wouldn't move the comma to the start of the continuation line, would you? :-)

On Thu, Apr 14, 2016 at 9:25 AM, SW wrote:
> Hi,
> PEP8 says that: "The preferred place to break around a binary operator
> is after the operator, not before it."
>
> This is ignored in the multiline if examples, and seems to generally be
> a bad idea as it negatively impacts clarity.
> > For example, the following seems much clearer as the entire line does > not need to be scanned to see the intent- only the start of the line is > needed to see how the different properties are used for filtering: > mylist = [ > item for item in mylist > if item['property'] == 2 > and item['otherproperty'] == 'test' > ] > > The alternative seems less clear: > mylist = [ > item for item in mylist > if item['property'] == 2 and > item['otherproperty'] == 'test' > ] > > If this recommendation remains in force, it would be good to: > 1. Follow it in the style guide. > 2. Provide a rationale for it, as currently it seems arbitrary and > unhelpful. > > Just raising this as it has bitten me a couple of times with style > checkers recently. > > Thanks, > S > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -- --Guido van Rossum (python.org/~guido) From random832 at fastmail.com Thu Apr 14 12:58:08 2016 From: random832 at fastmail.com (Random832) Date: Thu, 14 Apr 2016 12:58:08 -0400 Subject: [Python-ideas] PEP8 operator must come before line break In-Reply-To: References: Message-ID: <1460653088.58996.578913609.71F02FF1@webmail.messagingengine.com> On Thu, Apr 14, 2016, at 12:35, Guido van Rossum wrote: > Where in PEP 8 does it violate its own advice? That's a bug. (The PEP > has many authors by now.) > > As the PEP acknowledges, style is hard to agree over. It's even harder > to change an agreement that has been documented (if not always > followed consistently) for over 20 years. > > My rationale for this rule is that ending a line in a binary operator > is a clear hint to the reader that the line isn't finished. (If you > think about it, a comma is a kind of binary operator, and you wouldn't > move the comma to the start of the continuation line, would you? 
:-) Well, yes, but "and" and "or" are English conjunctions in addition to being binary operators and, a line break is a natural break in reading and, something flows more naturally if you pause before conjunctions rather than after them or, am I completely wrong about that? What I'm saying is, maybe there should be a separate rule for "and" and "or" in certain contexts, rather than a universal rule for all binary operators.

From boekewurm at gmail.com Thu Apr 14 13:02:08 2016 From: boekewurm at gmail.com (Matthias welp) Date: Thu, 14 Apr 2016 19:02:08 +0200 Subject: [Python-ideas] PEP8 operator must come before line break In-Reply-To: References: Message-ID: > Where in PEP 8 does it violate its own advice

As the OP did not reply this fast, from the webpage (/dev/peps/pep-0008), section Indentation, just after 'Acceptable options in this situation include, but are not limited to: ':

# Add some extra indentation on the conditional continuation line.
if (this_is_one_thing
        and that_is_another_thing):
    do_something()

That is the only place I could find just now.

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From guido at python.org Thu Apr 14 13:02:02 2016 From: guido at python.org (Guido van Rossum) Date: Thu, 14 Apr 2016 10:02:02 -0700 Subject: [Python-ideas] PEP8 operator must come before line break In-Reply-To: <1460653088.58996.578913609.71F02FF1@webmail.messagingengine.com> References: <1460653088.58996.578913609.71F02FF1@webmail.messagingengine.com> Message-ID: On Thu, Apr 14, 2016 at 9:58 AM, Random832 wrote:
> On Thu, Apr 14, 2016, at 12:35, Guido van Rossum wrote:
>> Where in PEP 8 does it violate its own advice? That's a bug. (The PEP
>> has many authors by now.)
>>
>> As the PEP acknowledges, style is hard to agree over. It's even harder
>> to change an agreement that has been documented (if not always
>> followed consistently) for over 20 years.
>> >> My rationale for this rule is that ending a line in a binary operator >> is a clear hint to the reader that the line isn't finished. (If you >> think about it, a comma is a kind of binary operator, and you wouldn't >> move the comma to the start of the continuation line, would you? :-) > > Well, yes, but "and" and "or" are English conjunctions in addition to > being binary operators and, a line break is a natural break in reading > and, something flows more naturally if you pause before conjunctions > rather than after them or, am I completely wrong about that? > > What I'm saying is, maybe there should be a separate rule for "and" and > "or" in certain contexts, rather than a universal rule for all binary > operators. OK, do you want to moderate the discussion about that particular issue? If you can gather support from some of the usual suspects I'm not against being persuaded. -- --Guido van Rossum (python.org/~guido) From guido at python.org Thu Apr 14 13:23:59 2016 From: guido at python.org (Guido van Rossum) Date: Thu, 14 Apr 2016 10:23:59 -0700 Subject: [Python-ideas] PEP8 operator must come before line break In-Reply-To: References: Message-ID: Thanks, that was obviously an oversight. I've fixed the PEP. If the discussion ends up with rough consensus on changing this I will happily change it back (and change all other occurrences to match the new rule). Note that my request for "rough consensus" does *not* imply a vote. +1 and -1 votes (nor fractions in between) should not be posted -- however cogent arguments for/against the status quo (or for relinquishing the rule altogether) are welcome. On Thu, Apr 14, 2016 at 10:02 AM, Matthias welp wrote: >> Where in PEP 8 does it violate its own advice > > As the OP did not reply this fast, from the webpage (/dev/peps/pep-0008) > > section indentation, just after 'Acceptable options in this situation > include, but are not limited to: ' > > # Add some extra indentation on the conditional continuation line. 
> if (this_is_one_thing > and that_is_another_thing): > do_something() > > That is the only place I could find just now. > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -- --Guido van Rossum (python.org/~guido) From walker_s at hotmail.co.uk Thu Apr 14 14:48:23 2016 From: walker_s at hotmail.co.uk (SW) Date: Thu, 14 Apr 2016 19:48:23 +0100 Subject: [Python-ideas] PEP8 operator must come before line break In-Reply-To: References: Message-ID: That'll teach me for stepping away from the computer... As for changing an established rule, I agree that can be difficult. The reason this one became an irritation for me is that it was only in the last few months that I saw flake8 (my style complainer of choice) start complaining about this, so it's not quite so entrenched as other elements of style. I agree that placing the binary operator at the end shows the line should continue, and thus could be valid, but I also think that placing it at the start of the next line shows the logic flow for each part of the expression more clearly- as shown in the examples I originally gave. Thanks, S On 14/04/16 18:23, Guido van Rossum wrote: > Thanks, that was obviously an oversight. I've fixed the PEP. > > If the discussion ends up with rough consensus on changing this I will > happily change it back (and change all other occurrences to match the > new rule). > > Note that my request for "rough consensus" does *not* imply a vote. +1 > and -1 votes (nor fractions in between) should not be posted -- > however cogent arguments for/against the status quo (or for > relinquishing the rule altogether) are welcome. 
> > On Thu, Apr 14, 2016 at 10:02 AM, Matthias welp wrote:
>>> Where in PEP 8 does it violate its own advice
>> As the OP did not reply this fast, from the webpage (/dev/peps/pep-0008)
>>
>> section indentation, just after 'Acceptable options in this situation
>> include, but are not limited to: '
>>
>> # Add some extra indentation on the conditional continuation line.
>> if (this_is_one_thing
>>         and that_is_another_thing):
>>     do_something()
>>
>> That is the only place I could find just now.
>>
>>
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> https://mail.python.org/mailman/listinfo/python-ideas
>> Code of Conduct: http://python.org/psf/codeofconduct/
> >

From bruce at leban.us Thu Apr 14 15:10:08 2016 From: bruce at leban.us (Bruce Leban) Date: Thu, 14 Apr 2016 12:10:08 -0700 Subject: [Python-ideas] PEP8 operator must come before line break In-Reply-To: References: Message-ID: I find the operator at the beginning of the line much more clear in code like this:

innerWidth = (outerWidth
              - 2 * border_width
              - left_margin
              - right_margin)

outerHeight = (innerHeight
               + (title_height if have_title else 0)
               + (subtitle_height if have_subtitle else 0)
               - (1 if have_title and have_subtitle else 0))

outerHeight = (innerHeight +
               (title_height if have_title else 0) +
               (subtitle_height if have_subtitle else 0) -
               (1 if have_title and have_subtitle else 0))

area = ((multiline_calculation_of_height)
        * (multiline_calculation_of_width))

The first two are taken and sanitized from real code.

--- Bruce Check out my puzzle book and get it free here: http://J.mp/ingToConclusionsFree (available on iOS)

On Thu, Apr 14, 2016 at 11:48 AM, SW wrote:
> That'll teach me for stepping away from the computer...
>
> As for changing an established rule, I agree that can be difficult.
The > reason this one became an irritation for me is that it was only in the > last few months that I saw flake8 (my style complainer of choice) start > complaining about this, so it's not quite so entrenched as other > elements of style. > > I agree that placing the binary operator at the end shows the line > should continue, and thus could be valid, but I also think that placing > it at the start of the next line shows the logic flow for each part of > the expression more clearly- as shown in the examples I originally gave. > > Thanks, > S > > On 14/04/16 18:23, Guido van Rossum wrote: > > Thanks, that was obviously an oversight. I've fixed the PEP. > > > > If the discussion ends up with rough consensus on changing this I will > > happily change it back (and change all other occurrences to match the > > new rule). > > > > Note that my request for "rough consensus" does *not* imply a vote. +1 > > and -1 votes (nor fractions in between) should not be posted -- > > however cogent arguments for/against the status quo (or for > > relinquishing the rule altogether) are welcome. > > > > On Thu, Apr 14, 2016 at 10:02 AM, Matthias welp > wrote: > >>> Where in PEP 8 does it violate its own advice > >> As the OP did not reply this fast, from the webpage (/dev/peps/pep-0008) > >> > >> section indentation, just after 'Acceptable options in this situation > >> include, but are not limited to: ' > >> > >> # Add some extra indentation on the conditional continuation line. > >> if (this_is_one_thing > >> and that_is_another_thing): > >> do_something() > >> > >> That is the only place I could find just now. 
From storchaka at gmail.com Thu Apr 14 15:22:03 2016 From: storchaka at gmail.com (Serhiy Storchaka) Date: Thu, 14 Apr 2016 22:22:03 +0300 Subject: [Python-ideas] PEP8 operator must come before line break In-Reply-To: References: Message-ID: On 14.04.16 20:23, Guido van Rossum wrote: > Thanks, that was obviously an oversight. I've fixed the PEP. > > If the discussion ends up with rough consensus on changing this I will > happily change it back (and change all other occurrences to match the > new rule). > > Note that my request for "rough consensus" does *not* imply a vote. +1 > and -1 votes (nor fractions in between) should not be posted -- > however cogent arguments for/against the status quo (or for > relinquishing the rule altogether) are welcome.

An argument for operators "+" and "-":

    result = expr1 +
             expr2

is a syntax error, while

    result = expr1
    + expr2

is a silent bug.

From ianlee1521 at gmail.com Thu Apr 14 15:25:20 2016 From: ianlee1521 at gmail.com (Ian Lee) Date: Thu, 14 Apr 2016 12:25:20 -0700 Subject: [Python-ideas] PEP8 operator must come before line break In-Reply-To: References: Message-ID: <685A7234-8BC3-4C87-B7CE-D5210506BDAA@gmail.com> My preference has been towards having the binary operator at the front rather than the end since listening to Brandon Rhodes' 2012 PyCon Canada talk (specifically starting around [1]).
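Serhiy's silent-bug example from earlier in the thread can be checked concretely — a minimal sketch in plain Python 3 (the names expr1/expr2 are his placeholders): without parentheses, the operator-first layout parses as two statements, the second being a harmless unary plus, so part of the expression is silently dropped.

```python
expr1, expr2 = 10, 7

# Operator at the start of the continuation line, without parentheses:
# the second line is a complete statement (unary plus applied to expr2),
# so expr2 silently falls out of the result.
result = expr1
+ expr2

assert result == 10  # not 17: the "+ expr2" line had no effect

# The operator-at-end form without parentheses does not even parse.
src = "result = expr1 +\nexpr2\n"
try:
    compile(src, "<demo>", "exec")
except SyntaxError:
    pass  # as Serhiy says: a syntax error, caught immediately
else:
    raise AssertionError("expected a SyntaxError")
```

So the asymmetry is real: one style fails loudly at compile time, the other produces a wrong answer at runtime.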
Specifically he is arguing that following Knuth rather than PEP 8 [2] might be a better way to go. Maybe my preference there is due to the Django work I was doing, where I would get long query lines and didn't always want to split them into multiple lines [3]:

    query = (Person.objects
             .filter(last_name='Smith')
             .order_by('social_security_number')
             .select_related('spouse')
             )

[1] http://rhodesmill.org/brandon/slides/2012-11-pyconca/#id183 [2] http://rhodesmill.org/brandon/slides/2012-11-pyconca/#knuth-instead-of-pep-8 [3] http://rhodesmill.org/brandon/slides/2012-11-pyconca/#option-3 ~ Ian Lee | IanLee1521 at gmail.com > On Apr 14, 2016, at 11:48, SW wrote: > > That'll teach me for stepping away from the computer... > > As for changing an established rule, I agree that can be difficult. The > reason this one became an irritation for me is that it was only in the > last few months that I saw flake8 (my style complainer of choice) start > complaining about this, so it's not quite so entrenched as other > elements of style. > > I agree that placing the binary operator at the end shows the line > should continue, and thus could be valid, but I also think that placing > it at the start of the next line shows the logic flow for each part of > the expression more clearly - as shown in the examples I originally gave. > > Thanks, > S > > On 14/04/16 18:23, Guido van Rossum wrote: >> Thanks, that was obviously an oversight. I've fixed the PEP. >> >> If the discussion ends up with rough consensus on changing this I will >> happily change it back (and change all other occurrences to match the >> new rule). >> >> Note that my request for "rough consensus" does *not* imply a vote. +1 >> and -1 votes (nor fractions in between) should not be posted -- >> however cogent arguments for/against the status quo (or for >> relinquishing the rule altogether) are welcome.
>> >> On Thu, Apr 14, 2016 at 10:02 AM, Matthias welp wrote: >>>> Where in PEP 8 does it violate its own advice >>> As the OP did not reply this fast, from the webpage (/dev/peps/pep-0008) >>> >>> section indentation, just after 'Acceptable options in this situation >>> include, but are not limited to: ' >>> >>> # Add some extra indentation on the conditional continuation line. >>> if (this_is_one_thing >>> and that_is_another_thing): >>> do_something() >>> >>> That is the only place I could find just now. >>> >>> >>> _______________________________________________ >>> Python-ideas mailing list >>> Python-ideas at python.org >>> https://mail.python.org/mailman/listinfo/python-ideas >>> Code of Conduct: http://python.org/psf/codeofconduct/ >> >> > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From contrebasse at gmail.com Thu Apr 14 15:28:39 2016 From: contrebasse at gmail.com (Joseph Martinot-Lagarde) Date: Thu, 14 Apr 2016 19:28:39 +0000 (UTC) Subject: [Python-ideas] PEP8 operator must come before line break References: Message-ID: Guido van Rossum writes: > My rationale for this rule is that ending a line in a binary operator > is a clear hint to the reader that the line isn't finished. (If you > think about it, a comma is a kind of binary operator, and you wouldn't > move the comma to the start of the continuation line, would you? I personally tend to look more at the start of the lines because that's where the blocks are defined (by indentation). Also the end of the lines are usually not aligned which makes binary operators harder to see. 
Because of these two reasons I always put binary operator at the start of new lines, because that's where I have the most chance to see them, and I'm in favor of changing this in PEP8. From python at lucidity.plus.com Thu Apr 14 15:27:29 2016 From: python at lucidity.plus.com (Erik) Date: Thu, 14 Apr 2016 20:27:29 +0100 Subject: [Python-ideas] PEP8 operator must come before line break In-Reply-To: References: Message-ID: <570FEF21.3010702@lucidity.plus.com> On 14/04/16 18:23, Guido van Rossum wrote: > If the discussion ends up with rough consensus on changing this I will > happily change it back (and change all other occurrences to match the > new rule). Interestingly, the sentence above specifically binds the 'and' with what follows, via parentheses. It is not part of the preceding text to indicate something follows ... > however cogent arguments for/against the status quo (or for > relinquishing the rule altogether) are welcome. What I said above pretty much sums up my argument for this (it came up in a different context a month or so ago). I think that is also Matthias's argument (though I won't speak for him). English speakers will not usually accentuate the 'and's and 'or's in a sentence before pausing. To me, I read: if foo == bar and \ baz == spam: as: "if foo equals bar AND, baz equals spam". (emphasis on the "and", a pause before 'baz'). Where I will read: if foo == bar \ and baz == spam: as: "If foo equals bar, and baz equals spam" (pause after 'bar'). Just my personal opinion. The problem with this sort of style issue is that it's a pattern-based thing. If one is used to reading code written with a particular pattern then it's hard to adjust to another. So I don't think you'll get a complete consensus one way or the other. E. 
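Erik's two readings are, of course, equivalent at runtime once the condition is wrapped in parentheses — the whole disagreement is about readability, not semantics. A quick sketch making that explicit (the helper names check_trailing/check_leading are made up for illustration):

```python
def check_trailing(foo, bar, baz, spam):
    # Pre-2016 PEP 8 style: operator at the end of the line
    # ("foo equals bar AND, ... baz equals spam")
    return (foo == bar and
            baz == spam)

def check_leading(foo, bar, baz, spam):
    # Knuth style: operator at the start of the continuation line
    # ("foo equals bar, and baz equals spam")
    return (foo == bar
            and baz == spam)

# Both layouts compile to the same logic; only the typography differs.
cases = [(1, 1, 2, 2), (1, 1, 2, 3), (1, 2, 2, 2), (0, 1, 0, 1)]
assert all(check_trailing(*c) == check_leading(*c) for c in cases)
assert check_trailing(1, 1, 2, 2) is True
assert check_leading(1, 2, 2, 2) is False
```

Since both forms behave identically, the choice genuinely is the pattern-recognition question Erik describes.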
From python at lucidity.plus.com Thu Apr 14 15:41:58 2016 From: python at lucidity.plus.com (Erik) Date: Thu, 14 Apr 2016 20:41:58 +0100 Subject: [Python-ideas] Silent bugs - was Re: PEP8 operator must come before line break In-Reply-To: References: Message-ID: <570FF286.9020307@lucidity.plus.com> On 14/04/16 20:22, Serhiy Storchaka wrote:
> An argument for operators "+" and "-":
>
>     result = expr1 +
>              expr2
>
> is a syntax error, while
>
>     result = expr1
>     + expr2
>
> is a silent bug.

Interesting. But that bug potentially exists regardless of what PEP8 says. This seems to me to be something that should be tackled separately. I imagine that there are versions of 'expr2' that make this a valid and useful construct (which is why it remains) - if I've missed some historical discussion on this then please refer me to it; I'd like to understand what I'm missing on first glance. Thanks, E.

From rosuav at gmail.com Thu Apr 14 16:04:12 2016 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 15 Apr 2016 06:04:12 +1000 Subject: [Python-ideas] Silent bugs - was Re: PEP8 operator must come before line break In-Reply-To: <570FF286.9020307@lucidity.plus.com> References: <570FF286.9020307@lucidity.plus.com> Message-ID: On Fri, Apr 15, 2016 at 5:41 AM, Erik wrote:
> On 14/04/16 20:22, Serhiy Storchaka wrote:
>>
>> An argument for operators "+" and "-":
>>
>>     result = expr1 +
>>              expr2
>>
>> is a syntax error, while
>>
>>     result = expr1
>>     + expr2
>>
>> is a silent bug.
>
> Interesting. But that bug potentially exists regardless of what PEP8 says.

Yes, but if you encourage people to spell it the first way, you can't accidentally leave part of your expression out of your result. However, I wouldn't write it like *either* of those. To me, the options are:

    result = expr1 +
                 expr2

and

    result = expr1
                 + expr2

And either of those would give an immediate IndentationError without the parentheses. So indenting the continuation is even better than choosing where to place the binary operator.
ChrisA From python at lucidity.plus.com Thu Apr 14 16:30:56 2016 From: python at lucidity.plus.com (Erik) Date: Thu, 14 Apr 2016 21:30:56 +0100 Subject: [Python-ideas] Silent bugs - was Re: PEP8 operator must come before line break In-Reply-To: References: <570FF286.9020307@lucidity.plus.com> Message-ID: <570FFE00.8000001@lucidity.plus.com> On 14/04/16 21:04, Chris Angelico wrote: > However, I wouldn't write it like *either* of those. To me, the options are: > > result = expr1 + > expr2 > > and > > result = expr1 > + expr 2 Agreed, and the second one is how I prefer to format my C code too. > So indenting the continuation is even better than > choosing where to place the binary operator. The two are not mutually exclusive. Indenting the continuation tackles the "silent bug" issue, while choosing where to place the operator is purely a readability issue - hence I prefer the second of your examples ... it addresses both problems from my POV. However, that doesn't answer my question of when a line consisting of just "+ expr" is a useful thing. E. From barry at python.org Thu Apr 14 16:34:36 2016 From: barry at python.org (Barry Warsaw) Date: Thu, 14 Apr 2016 16:34:36 -0400 Subject: [Python-ideas] PEP8 operator must come before line break References: Message-ID: <20160414163436.43a4fa8d@anarchist.wooz.org> On Apr 14, 2016, at 05:25 PM, SW wrote: >PEP8 says that: "The preferred place to break around a binary operator >is after the operator, not before it." Personally, my own preferred style prefers keyword binary operators, e.g. 'and' and 'or', after the line break but operator symbols (e.g. '*' or '+') before the line break. The way my editor syntax highlights the keywords but not the operators makes this style the most readable to me. However, I think the pep8 tool is too strict here. PEP 8 the document doesn't say the line break around binary operators is *required* just that it's a preferred style. 
Forcing the line break here seems like the tool is overstepping and I would favor a relaxation of the tool. Where I've adopted flake8 and such, I've grumbled when it forces me to make this change. (But I guess not enough to file a bug. ;) If the pep8 developers insist on reading PEP 8's "preferred" as a strict requirement, then I would favor clarification in PEP 8 that such line breaks are not required and that either choice of line break positioning around binary operators is acceptable. Cheers, -Barry

From zachary.ware+pyideas at gmail.com Thu Apr 14 16:41:36 2016 From: zachary.ware+pyideas at gmail.com (Zachary Ware) Date: Thu, 14 Apr 2016 15:41:36 -0500 Subject: [Python-ideas] Silent bugs - was Re: PEP8 operator must come before line break In-Reply-To: <570FFE00.8000001@lucidity.plus.com> References: <570FF286.9020307@lucidity.plus.com> <570FFE00.8000001@lucidity.plus.com> Message-ID: On Thu, Apr 14, 2016 at 3:30 PM, Erik wrote:
> However, that doesn't answer my question of when a line consisting of just
> "+ expr" is a useful thing.

It's probably not, but that's for a linter to point out. For a stupid, contrived, untested example, though:

    class Foo:
        def __init__(self, value):
            self.value = value
        def __pos__(self):
            self.value += 1
        def __neg__(self):
            self.value -= 2

    f = Foo(3)
    +f
    assert f.value == 4
    -f
    assert f.value == 2

-- Zach

From tjreedy at udel.edu Thu Apr 14 16:55:25 2016 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 14 Apr 2016 16:55:25 -0400 Subject: [Python-ideas] PEP8 operator must come before line break In-Reply-To: References: Message-ID: On 4/14/2016 1:23 PM, Guido van Rossum wrote: > however cogent arguments for/against the status quo (or for > relinquishing the rule altogether) are welcome.
Outside of Python, binary operators are (in the examples I can think of) more often more strongly associated with the second argument than the first, even to the point of switching the order of 2nd arg and operator. English: Start with A, add B, subtract C, and assign the result to D. Assembly: load A add B sub C stor D Calculator tape (with literals, not symbols) A B + C - D = I suggest relinquishing the rule, except maybe to suggest consistency within an expression, if not the whole file. -- Terry Jan Reedy From python at lucidity.plus.com Thu Apr 14 17:22:47 2016 From: python at lucidity.plus.com (Erik) Date: Thu, 14 Apr 2016 22:22:47 +0100 Subject: [Python-ideas] Silent bugs - was Re: PEP8 operator must come before line break In-Reply-To: References: <570FF286.9020307@lucidity.plus.com> <570FFE00.8000001@lucidity.plus.com> Message-ID: <57100A27.4030704@lucidity.plus.com> Hi Zach, On 14/04/16 21:41, Zachary Ware wrote: > It's probably not, but that's for a linter to point out. I was waiting for the "linter" response (in a good way - I suspected that would eventually be the answer, but I was hoping to be surprised :)). > stupid, contrived, untested example, though: Yeah, OK. In nearly 20 years, I've never once needed to implement those dunder methods, but that's explanation enough for me - thanks. E. From random832 at fastmail.com Thu Apr 14 18:36:55 2016 From: random832 at fastmail.com (Random832) Date: Thu, 14 Apr 2016 18:36:55 -0400 Subject: [Python-ideas] Silent bugs - was Re: PEP8 operator must come before line break In-Reply-To: <570FFE00.8000001@lucidity.plus.com> References: <570FF286.9020307@lucidity.plus.com> <570FFE00.8000001@lucidity.plus.com> Message-ID: <1460673415.1605987.579217849.4B932CEA@webmail.messagingengine.com> On Thu, Apr 14, 2016, at 16:30, Erik wrote: > However, that doesn't answer my question of when a line consisting of > just "+ expr" is a useful thing. Well, it's a legitimate unary operator. 
I've never heard a good explanation of what the point of it is, but the same can't be said obviously for -expr. As for allowing an expression on its own (sure, any operator _can_ have side effects, but calling most operators for their side effects is a huge stylistic problem with the code regardless)... the same can be said for most operators, not just the unary + and -. It'd probably be reasonable to make a static checking tool that gives you warnings if almost any non-function-call expression is used as a statement. (along with other things like most constructors, pure functions, etc). But I think the idea of making it a syntax error has actually been brought up recently and rejected. From njs at pobox.com Thu Apr 14 20:45:10 2016 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 14 Apr 2016 17:45:10 -0700 Subject: [Python-ideas] PEP8 operator must come before line break In-Reply-To: References: Message-ID: On Thu, Apr 14, 2016 at 12:28 PM, Joseph Martinot-Lagarde < contrebasse at gmail.com> wrote: > Guido van Rossum writes: > >> My rationale for this rule is that ending a line in a binary operator >> is a clear hint to the reader that the line isn't finished. (If you >> think about it, a comma is a kind of binary operator, and you wouldn't >> move the comma to the start of the continuation line, would you? > > I personally tend to look more at the start of the lines because that's > where the blocks are defined (by indentation). Also the end of the lines are > usually not aligned which makes binary operators harder to see. > Because of these two reasons I always put binary operator at the start of > new lines, because that's where I have the most chance to see them, and I'm > in favor of changing this in PEP8. This is the case that jumped to mind for me as well... 
If I saw code like this in a code review I'd force the author to change it because the style is outright misleading: return (something1() + a * b + some_other_thing() ** 2 - f - normalizer) We're summing a list of items with some things being negated, but that structure is impossible to see, and worst of all, the association between the operator and the thing being operated on is totally lost. OTOH if we write like this: return (something1() + a * b + some_other_thing() ** 2 - f - normalizer) then the list structure is immediately obvious, and it's immediately obvious which terms are being added and which are being subtracted. Similarly, who can even tell if this code is correct: return (something1() + a * b + some_other_thing() ** 2 - f / normalizer) but if the same code is formatted this way then it's immediately obvious that the code is buggy: return (something1() + a * b + some_other_thing() ** 2 - f / normalizer) It should be corrected to: return ((something1() + a * b + some_other_thing() ** 2 - f) / normalizer) (I'm calling the final item "normalizer" because this pattern actually comes up fairly often in bayesian computations -- if you're working in log space then at the end of some computation you subtract off a magic normalizing factor, and if you're working in linear space then at the end you divide off a magic normalizing factor. You can also get a similar pattern for chains of * and /, though it's less common.) In all of these cases, the hanging indent makes the fact that we have a continuation line obvious from across the room -- I don't need help knowing that there's a continuation, I need help figuring out what the the computation actually does :-). I actually find a similar effect for and/or chains, where the beginning-of-line format makes things lined up and easier to scan -- e.g. 
comparing these two options: return (something1() and f() and some_expression > 1 and (some_other_thing() == whatever or blahblah or asdf)) return (something1() and f() and some_expression > 1 and (some_other_thing() == whatever or blahblah or asdf)) then I find the second option dramatically more readable. But maybe that's just me -- for and/or the argument is a bit less compelling, because you don't regularly have chains of different operators with the same precedence, like you do with +/-. I guess this might have to do with a more underlying stylistic difference: as soon as I have an expression that stretches across multiple lines, I try to break it into pieces where each line is a meaningful unit, even if this doesn't produce the minimum number of lines. For example I generally avoid writing stuff like: return (something1() and f() some_expression > 1 and (some_other_thing() == whatever or blahblah or asdf)) return (something1() + a * b + some_other_thing() ** 2 - f - normalizer) -n BIKESHEDDING REDUCTION ACT NOTICE: I hereby swear that I have said everything useful I have to say about this topic and that this will be my only contribution to this thread unless someone addresses me directly. -- Nathaniel J. Smith -- https://vorpus.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Thu Apr 14 21:07:48 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 14 Apr 2016 18:07:48 -0700 Subject: [Python-ideas] PEP8 operator must come before line break In-Reply-To: References: Message-ID: <57103EE4.8090009@stoneleaf.us> On 04/14/2016 05:45 PM, Nathaniel Smith wrote: > On Thu, Apr 14, 2016 at 12:28 PM, Joseph Martinot-Lagarde wrote: > I actually find a similar effect for and/or chains, where the > beginning-of-line format makes things lined up and easier to scan -- > e.g. 
comparing these two options:

    return (something1() and
            f() and
            some_expression > 1 and
            (some_other_thing() == whatever or
             blahblah or
             asdf))

    return (something1()
            and f()
            and some_expression > 1
            and (some_other_thing() == whatever
                 or blahblah
                 or asdf))

then I find the second option dramatically more readable. But maybe that's just me -- for and/or the argument is a bit less compelling, because you don't regularly have chains of different operators with the same precedence, like you do with +/-. I guess this might have to do with a more underlying stylistic difference: as soon as I have an expression that stretches across multiple lines, I try to break it into pieces where each line is a meaningful unit, even if this doesn't produce the minimum number of lines. For example I generally avoid writing stuff like:

    return (something1() and f() and
            some_expression > 1 and
            (some_other_thing() == whatever or blahblah or asdf))

    return (something1() + a * b +
            some_other_thing() ** 2 - f - normalizer)

-n BIKESHEDDING REDUCTION ACT NOTICE: I hereby swear that I have said everything useful I have to say about this topic and that this will be my only contribution to this thread unless someone addresses me directly. -- Nathaniel J. Smith -- https://vorpus.org

From ethan at stoneleaf.us Thu Apr 14 21:07:48 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 14 Apr 2016 18:07:48 -0700 Subject: [Python-ideas] PEP8 operator must come before line break In-Reply-To: References: Message-ID: <57103EE4.8090009@stoneleaf.us> On 04/14/2016 05:45 PM, Nathaniel Smith wrote: > On Thu, Apr 14, 2016 at 12:28 PM, Joseph Martinot-Lagarde wrote: > I actually find a similar effect for and/or chains, where the > beginning-of-line format makes things lined up and easier to scan -- > e.g.
But, if you have a DSL that operates in a imperative fashion (giving commands, rather than returning results in a functional fashion) you might do something like: +tracing command() command() -tracing which is modelled after the doctest directives: # doctest: +SKIP # doctest: -NORMALIZE_WHITESPACE etc. I don't think this imperative style is a really good fit for Python's syntax, but it could work for some people. -- Steve From ncoghlan at gmail.com Fri Apr 15 02:43:25 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 15 Apr 2016 16:43:25 +1000 Subject: [Python-ideas] Module lifecycle: simple alternative to PEP 3121/PEP 489 In-Reply-To: References: Message-ID: On 14 April 2016 at 23:51, Nikita Nemkin wrote: > PyModuleDef appears to be PyTypeObject surrogate with similar > (or identical?) semantics. That's an extra concept to learn about, > an extra bit of documentation to consult when writing new code. > PyTypeObject, on the other hand, is fundamental and unavoidable, > regardless of module init system. The internal details of PyTypeObject are eminently avoidable, either by not defining your own custom types in C code at all (you can get a long way with C level acceleration just by defining functions and using instances of existing types), or else by only defining them dynamically as heap types (the kind created by class statements) via PyType_FromSpec: https://www.python.org/dev/peps/pep-0384/#type-objects To answer your original question, though, PEP 489 needs to read in the context of PEP 451, which was the one that switched the overall import system over to the multi-phase import model: https://www.python.org/dev/peps/pep-0451/ One of the main goals of importlib in general is to let the interpreter do more of the heavy lifting for things that absolutely have to be done correctly if you want your import hook or module to behave "normally". 
With Python level modules, the interpreter has always taken care of creating the module for "normal" imports, with the module author only having to care about populating that namespace with content (by running Python code). PEP 451 extended that same convenience to authors of module loaders, as they could now just define a custom exec_module, and use the default module creation code rather than having to write their own as part of a load_module implementation. PEP 489 then brought that capability to extension modules: extension module authors can now decide not to worry about module creation at all, and instead just use the Exec hook to populate the standard module object that CPython provides by default. That means caring about the module creation step in an extension module is now primarily a matter of performance optimisation for access to module global state - as Stefan notes, the indirection mechanism in PEP 3121 can be significantly slower than using C level static variables, and indirection through a Python level namespace is likely to be even slower. However, even in those cases, the PEP 489 mechanism gives the extension module reliable access to information it didn't previously have access to, since the create method receives a fully populated module spec. Once the question is narrowed down to "How can an extension module fully support subinterpreters and multiple Py_Initialize/Finalize cycles without incurring PEP 3121's performance overhead?" then the short answer becomes "We don't know, but ideas for that are certainly welcome, either here or over on import-sig". 
Returning custom module subclasses from the Create hook is certainly one mechanism for that (it's why supporting such subclasses was a design goal for PEP 489), but like other current solutions, they run afoul of the problem that methods defined in C extension modules currently don't receive a reference to the defining module, only to the class instance (which is a general problem with the way methods are defined in C rather than a problem with the import system specifically). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ianlee1521 at gmail.com Fri Apr 15 03:07:33 2016 From: ianlee1521 at gmail.com (Ian Lee) Date: Fri, 15 Apr 2016 00:07:33 -0700 Subject: [Python-ideas] PEP8 operator must come before line break In-Reply-To: References: <57103EE4.8090009@stoneleaf.us> Message-ID: > On Apr 14, 2016, at 19:28, Guido van Rossum wrote: > > OK, I get it. Brandon's slides with the Knuth references were > especially useful. So let's change the PEP! > > Who wants to draft a diff? http://bugs.python.org/issue26763 ~ Ian Lee | IanLee1521 at gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From nikita at nemkin.ru Fri Apr 15 04:59:35 2016 From: nikita at nemkin.ru (Nikita Nemkin) Date: Fri, 15 Apr 2016 13:59:35 +0500 Subject: [Python-ideas] Module lifecycle: simple alternative to PEP 3121/PEP 489 In-Reply-To: References: Message-ID: Thanks for your input. I now see how things evolved to the present state. in the context of PEP 451, my proposal would have been to move all default module creation tasks to ModuleType.tp_new (taking an optional spec parameter), making separate create and exec unnecessary. Too late, I guess. > Once the question is narrowed down to "How can an extension module > fully support subinterpreters and multiple Py_Initialize/Finalize > cycles without incurring PEP 3121's performance overhead?" 
then the > short answer becomes "We don't know, but ideas for that are certainly > welcome, either here or over on import-sig". I mentioned the way to avoid state access overhead in my first post. It's independent of module loading mechanism: 1) define a new "calling convention" flag like METH_GLOBALS. 2) store module ref in PyCFunctionObject.m_module (currently it stores only the module name) 3) pass module ref as an extra arg to methods with METH_GLOBALS flag. 4) PyModule_State, reimplemented as a macro, would amount to one indirection from the passed parameter. I suspect that most C ABIs allow to pass the extra arg unconditionally, (this is certainly the case for x86 and x64 on Windows and Linux). Meaning that METH_GLOBALS won't increase the actual number of possible dispatch targets in PyCFunction_Call and won't impact Python-to-C call performance at all. From stefan at bytereef.org Fri Apr 15 05:22:34 2016 From: stefan at bytereef.org (Stefan Krah) Date: Fri, 15 Apr 2016 09:22:34 +0000 (UTC) Subject: [Python-ideas] Module lifecycle: simple alternative to PEP 3121/PEP 489 References: Message-ID: Nikita Nemkin writes: > I mentioned the way to avoid state access overhead in my first post. > It's independent of module loading mechanism: It's great to see people discussing this. I must clarify the 20% slowdown figure that I posted earlier: The slowdown was due to changing the static variables to module state *and* the static types to heap types. It was recommended at the time to do both or nothing. I haven't measured the module state impact in isolation. Stefan Krah From encukou at gmail.com Fri Apr 15 05:24:58 2016 From: encukou at gmail.com (Petr Viktorin) Date: Fri, 15 Apr 2016 11:24:58 +0200 Subject: [Python-ideas] Module lifecycle: simple alternative to PEP 3121/PEP 489 In-Reply-To: References: Message-ID: <5710B36A.2060709@gmail.com> On 04/15/2016 10:59 AM, Nikita Nemkin wrote: > Thanks for your input. I now see how things evolved to the present state. 
> > in the context of PEP 451, my proposal would have been to move > all default module creation tasks to ModuleType.tp_new (taking > an optional spec parameter), making separate create and exec > unnecessary. Too late, I guess. > >> Once the question is narrowed down to "How can an extension module >> fully support subinterpreters and multiple Py_Initialize/Finalize >> cycles without incurring PEP 3121's performance overhead?" then the >> short answer becomes "We don't know, but ideas for that are certainly >> welcome, either here or over on import-sig". > > I mentioned the way to avoid state access overhead in my first post. > It's independent of module loading mechanism: > > 1) define a new "calling convention" flag like METH_GLOBALS. > 2) store module ref in PyCFunctionObject.m_module > (currently it stores only the module name) Wouldn't that break backwards compatibility, though? > 3) pass module ref as an extra arg to methods with METH_GLOBALS flag. > 4) PyModule_State, reimplemented as a macro, would amount to one > indirection from the passed parameter. > > I suspect that most C ABIs allow to pass the extra arg unconditionally, > (this is certainly the case for x86 and x64 on Windows and Linux). > Meaning that METH_GLOBALS won't increase the actual number > of possible dispatch targets in PyCFunction_Call and won't impact > Python-to-C call performance at all. My planned approach is a bit more flexible: - Add a reference to the module (ht_module) to heap types - Create a calling convention METH_METHOD, where methods are passed the class that defines the method (which might PyTYPE(self) or a superclass of it) This way methods can get both module state and the class they are defined on, and the replacement for PyModule_State is two indirections. Still, both approaches won't work with slot methods (e.g. nb_add), where there's no space in the API to add an extra argument. Nick proposed a solution in import-sig [0], which is workable but not elegant. 
But, I think:
- METH_METHOD would be useful even if it doesn't solve the problem with slot methods.
- A good solution to the slot methods problem is unlikely to render METH_METHOD obsolete.

so perhaps solving the 90% case first would be OK.

[0] https://mail.python.org/pipermail/import-sig/2015-July/001035.html

From nikita at nemkin.ru Fri Apr 15 06:58:53 2016 From: nikita at nemkin.ru (Nikita Nemkin) Date: Fri, 15 Apr 2016 15:58:53 +0500 Subject: [Python-ideas] Module lifecycle: simple alternative to PEP 3121/PEP 489 In-Reply-To: <5710B36A.2060709@gmail.com> References: <5710B36A.2060709@gmail.com> Message-ID: On Fri, Apr 15, 2016 at 2:24 PM, Petr Viktorin wrote:
>>
>> 1) define a new "calling convention" flag like METH_GLOBALS.
>> 2) store module ref in PyCFunctionObject.m_module (currently it stores only the module name)
>
> Wouldn't that break backwards compatibility, though?

It will, and I consider this level of breakage acceptable. Alternatively, another field can be added to this struct.

>
>> My planned approach is a bit more flexible:
>> - Add a reference to the module (ht_module) to heap types
>> - Create a calling convention METH_METHOD, where methods are passed the class that defines the method (which might be Py_TYPE(self) or a superclass of it)
>>
>> This way methods can get both module state and the class they are defined on, and the replacement for PyModule_State is two indirections.

I've read the linked import-sig thread and realized the depth of issues involved... Those poor heap type methods don't even have access to their own type pointer! In the light of that, your variant is more useful than mine.

Still, without a good slot support option, new METH_X conventions don't look attractive at all. Such a fundamental change, but it only solves half of the problem.
If non-slot methods had a suitable static anchor (the equivalent of slot function address for slots), they could use MRO walking too. From ncoghlan at gmail.com Fri Apr 15 07:01:49 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 15 Apr 2016 21:01:49 +1000 Subject: [Python-ideas] Module lifecycle: simple alternative to PEP 3121/PEP 489 In-Reply-To: References: Message-ID: On 15 April 2016 at 18:59, Nikita Nemkin wrote: > Thanks for your input. I now see how things evolved to the present state. > > in the context of PEP 451, my proposal would have been to move > all default module creation tasks to ModuleType.tp_new (taking > an optional spec parameter), making separate create and exec > unnecessary. Too late, I guess. That doesn't work either, as not only aren't modules in general actually required to be instances of ModuleType (see [1]), we also need to be able to create modules to hold __main__, os, sys and _frozen_importlib before we have an import system to manipulate. That's a large part of the reason we hived off import-sig from python-ideas a while back - the import system involves a whole lot of intertwined arcana stemming from accidents-of-implementation early in Python's history, as well as the flexible import hook system that was defined in PEP 302, so a separate list has proven useful for thrashing out technical details, while we tend to use python-dev and python-ideas more to check the end result is still comprehensible to folks that aren't familiar with all those internals :) Cheers, Nick. 
[1] https://www.python.org/dev/peps/pep-0489/#the-py-mod-create-slot

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From encukou at gmail.com Fri Apr 15 09:15:07 2016 From: encukou at gmail.com (Petr Viktorin) Date: Fri, 15 Apr 2016 15:15:07 +0200 Subject: [Python-ideas] Module lifecycle: simple alternative to PEP 3121/PEP 489 In-Reply-To: References: <5710B36A.2060709@gmail.com> Message-ID: <5710E95B.3070900@gmail.com> Let's move the discussion to import-sig, as Nick explained in the other subthread. Please drop python-ideas from CC when you reply.

On 04/15/2016 12:58 PM, Nikita Nemkin wrote:
> On Fri, Apr 15, 2016 at 2:24 PM, Petr Viktorin wrote:
>>>
>>> 1) define a new "calling convention" flag like METH_GLOBALS.
>>> 2) store module ref in PyCFunctionObject.m_module (currently it stores only the module name)
>>
>> Wouldn't that break backwards compatibility, though?
>
> It will, and I consider this level of breakage acceptable. Alternatively, another field can be added to this struct.
>
>> My planned approach is a bit more flexible:
>> - Add a reference to the module (ht_module) to heap types
>> - Create a calling convention METH_METHOD, where methods are passed the class that defines the method (which might be Py_TYPE(self) or a superclass of it)
>>
>> This way methods can get both module state and the class they are defined on, and the replacement for PyModule_State is two indirections.
>
> I've read the linked import-sig thread and realized the depth of issues involved... Those poor heap type methods don't even have access to their own type pointer! In the light of that, your variant is more useful than mine.
>
> Still, without a good slot support option, new METH_X conventions don't look attractive at all. Such a fundamental change, but it only solves half of the problem.
Well, it solves the problem for methods that have calling conventions, and I'm pretty sure by now that a full solution will need this *plus* another solution for slots. So I'm looking at the problems as two separate parts, and I also think that when it comes to writing PEPs, having two separate PEPs would make this more understandable. > Also, MRO walking is actually not as slow as it seems. > Checking Py_TYPE(self) and Py_TYPE(self)->tp_base *inline* > will minimize performance overhead in the (very) common case. I think so as well. This would mean that module state access in named methods is fast; in slot methods it's possible (and usually fast *enough*), and the full solution with __typeslots__ would still be possible. > If non-slot methods had a suitable static anchor (the equivalent > of slot function address for slots), they could use MRO walking too. I think a new METH_* calling style and explicit pointers is a better alternative here. From egregius313 at gmail.com Fri Apr 15 09:41:41 2016 From: egregius313 at gmail.com (Ed Minnix) Date: Fri, 15 Apr 2016 09:41:41 -0400 Subject: [Python-ideas] Adding a pipe function to functools Message-ID: <55191387-6FC7-4631-8340-C153F70C41C8@gmail.com> Hello, I have been looking over the toolz library and one of the functions I like the most is pipe. Since most programmers are familiar with piping (via the Unix `|` symbol), and it can help make tighter code, I think it would be nice to add it to the standard library (such as functools). 
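[Editor's note: for concreteness, here is a minimal sketch of the helper being proposed, mirroring the for-loop semantics of toolz's pipe. The functools placement is only the proposal above, not current behavior.]

```python
from functools import reduce

def pipe(data, *funcs):
    """Thread data through each function in turn: pipe(x, f, g) == g(f(x))."""
    # Equivalent to toolz's loop: for func in funcs: data = func(data)
    return reduce(lambda acc, func: func(acc), funcs, data)

# Reads in shell order: strip, then lower, then split.
words = pipe("  Hello World  ", str.strip, str.lower, str.split)
```

With no functions supplied, pipe simply returns its input unchanged, which falls out of the reduce with an initial value.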
- Ed Minnix From wes.turner at gmail.com Fri Apr 15 12:25:51 2016 From: wes.turner at gmail.com (Wes Turner) Date: Fri, 15 Apr 2016 11:25:51 -0500 Subject: [Python-ideas] Adding a pipe function to functools In-Reply-To: References: <55191387-6FC7-4631-8340-C153F70C41C8@gmail.com> Message-ID: "[Python-ideas] The pipe protocol, a convention for extensible method chaining" https://mail.python.org/pipermail/python-ideas/2015-May/033673.html https://groups.google.com/forum/m/#!topic/python-ideas/4HgpT5yE06o On Apr 15, 2016 9:51 AM, "Ed Minnix" wrote: Hello, I have been looking over the toolz library and one of the functions I like the most is pipe. Since most programmers are familiar with piping (via the Unix `|` symbol), and it can help make tighter code, I think it would be nice to add it to the standard library (such as functools). - Ed Minnix _______________________________________________ Python-ideas mailing list Python-ideas at python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From richard.prosser at mail.com Fri Apr 15 12:57:05 2016 From: richard.prosser at mail.com (Richard Prosser) Date: Fri, 15 Apr 2016 17:57:05 +0100 Subject: [Python-ideas] Welcome to the "Python-ideas" mailing list In-Reply-To: References: Message-ID: Is there any mileage in having a naming convention to indicate the type of a variable? I have never really liked the fact that the Python 'duck typing' policy is so lax, yet the new "Type Hints" package for Python 3 is rather clumsy, IMO. For example: github_response = requests.get('https://api.github.com/user', auth=('user', 'pass')) # Derived from http://docs.python-requests.org/en/master. The above request returns a Response object and so the variable has 'response' in its name. 
Likewise: word_count = total_words_in_file('text_file') where 'count' has been defined (in the IDE, by the user perhaps) as an Integer and the function is known to return an Integer, perhaps via a local 'count' or 'total' variable. I know that this has been attempted before but I think that an IDE like PyCharm could actually check variable usage and issue a warning if a conflict is detected. Also earlier usages of this 'Hungarian Notation' have largely been applied to compiled languages - rather strangely, in the case of known types - rather than an interpreted one like Python. Please note that I have shown suffixes above but prefixes could also be valid. I am not sure about relying on 'type strings' *within* a variable name however. Is this idea feasible, do you think? Thanks ... Richard PS Originally posted in https://intellij-support.jetbrains.com/hc/en-us/community/posts/207286145--Hungarian-Notation-and-type-checking -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Fri Apr 15 13:03:53 2016 From: guido at python.org (Guido van Rossum) Date: Fri, 15 Apr 2016 10:03:53 -0700 Subject: [Python-ideas] Welcome to the "Python-ideas" mailing list In-Reply-To: References: Message-ID: Thanks for the positive tone of your message. I do think there's some benefit in having a naming convention that's consistent within your own code base -- especially if you're working with a team and you can agree on a convention before you have written much code. Joel Spolsky agrees: http://www.joelonsoftware.com/articles/Wrong.html On Fri, Apr 15, 2016 at 9:57 AM, Richard Prosser wrote: > Is there any mileage in having a naming convention to indicate the type of > a variable? I have never really liked the fact that the Python 'duck > typing' policy is so lax, yet the new "Type Hints" package for Python 3 is > rather clumsy, IMO. 
> > > For example: > > github_response = requests.get('https://api.github.com/user', auth=('user', 'pass')) > # Derived from http://docs.python-requests.org/en/master. > > > The above request returns a Response > object > and so the variable has 'response' in its name. > > > Likewise: > > word_count = total_words_in_file('text_file') > > where 'count' has been defined (in the IDE, by the user perhaps) as an > Integer and the function is known to return an Integer, perhaps via a local > 'count' or 'total' variable. > > > > I know that this has been attempted before but I think that an IDE like > PyCharm could actually check variable usage and issue a warning if a > conflict is detected. Also earlier usages of this 'Hungarian Notation' have > largely been applied to compiled languages - rather strangely, in the case > of known types - rather than an interpreted one like Python. > > > Please note that I have shown suffixes above but prefixes could also be > valid. I am not sure about relying on 'type strings' *within* a variable > name however. > > Is this idea feasible, do you think? > > > Thanks ... > > Richard > > PS Originally posted in > https://intellij-support.jetbrains.com/hc/en-us/community/posts/207286145--Hungarian-Notation-and-type-checking > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ethan at stoneleaf.us Fri Apr 15 13:05:15 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Fri, 15 Apr 2016 10:05:15 -0700 Subject: [Python-ideas] Welcome to the "Python-ideas" mailing list In-Reply-To: References: Message-ID: <57111F4B.2040202@stoneleaf.us> On 04/15/2016 09:57 AM, Richard Prosser wrote: > Is there any mileage in having a naming convention to indicate the type > of a variable? I have never really liked the fact that the Python 'duck > typing' policy is so lax, yet the new "Type Hints" package for Python 3 > is rather clumsy, IMO. Hungarian notation can be very helpful if used meaningfully. Your examples are good: word_count (not int_count) github_result (not str_result) However, it is still a style question, and as such it will not be added/enforced by Python. If you just want to discuss the merits of Hungarian Notation then Python-List is a better place to go. -- ~Ethan~ From steve at pearwood.info Fri Apr 15 13:04:13 2016 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 16 Apr 2016 03:04:13 +1000 Subject: [Python-ideas] Adding a pipe function to functools In-Reply-To: <55191387-6FC7-4631-8340-C153F70C41C8@gmail.com> References: <55191387-6FC7-4631-8340-C153F70C41C8@gmail.com> Message-ID: <20160415170412.GL1819@ando.pearwood.info> On Fri, Apr 15, 2016 at 09:41:41AM -0400, Ed Minnix wrote: > Hello, > > I have been looking over the toolz library and one of the functions I > like the most is pipe. Since most programmers are familiar with piping > (via the Unix `|` symbol), and it can help make tighter code, I think > it would be nice to add it to the standard library (such as > functools). I don't know the toolz library, but I have this: http://code.activestate.com/recipes/580625-collection-pipeline-in-python/ Is that the sort of thing you mean? It's certainly not ready yet for the std lib, but if there's interest in it, I should be able to tidy it up and publish it. 
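[Editor's note: collection-pipeline recipes of this kind generally hinge on overloading an operator; a stripped-down sketch of the idea (the `Step` wrapper is hypothetical, not taken from the recipe above):]

```python
class Step:
    """Wrap a callable so that `value | step` applies it, shell-style."""
    def __init__(self, func):
        self.func = func

    def __ror__(self, value):
        # Called for `value | self` when value's type has no matching __or__.
        return self.func(value)

strip = Step(str.strip)
upper = Step(str.upper)
result = "  hello  " | strip | upper
```

Because str defines no `__or__`, Python falls back to `Step.__ror__`, which is what makes the left-to-right chaining work.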
-- Steve

From jbvsmo at gmail.com Fri Apr 15 13:29:40 2016 From: jbvsmo at gmail.com (=?UTF-8?Q?Jo=C3=A3o_Bernardo?=) Date: Fri, 15 Apr 2016 14:29:40 -0300 Subject: [Python-ideas] Adding a pipe function to functools In-Reply-To: <55191387-6FC7-4631-8340-C153F70C41C8@gmail.com> References: <55191387-6FC7-4631-8340-C153F70C41C8@gmail.com> Message-ID: I really like this tool:

https://github.com/JulienPalard/Pipe

It is really a matter of implementing the __or__ operator on iterators. Or possibly, as the module above does, adding the operator for all functions dealing with iterators as the first argument.

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From julien at palard.fr Fri Apr 15 14:05:45 2016 From: julien at palard.fr (Julien Palard) Date: Fri, 15 Apr 2016 18:05:45 +0000 Subject: [Python-ideas] Adding a pipe function to functools In-Reply-To: <55191387-6FC7-4631-8340-C153F70C41C8@gmail.com> References: <55191387-6FC7-4631-8340-C153F70C41C8@gmail.com> Message-ID: <75D8C8A4-33BD-4DAB-8ECE-B352C08D7D9D@palard.fr> Hi,

Le 15 avril 2016 15:41:41 GMT+02:00, Ed Minnix a écrit :
>Since most programmers are familiar with piping
>(via the Unix `|` symbol)

See also: https://mail.python.org/pipermail//python-ideas/2014-October/029839.html

-- Julien Palard

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From steve at pearwood.info Fri Apr 15 14:19:02 2016 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 16 Apr 2016 04:19:02 +1000 Subject: [Python-ideas] Hungarian notation [was Welcome ...] In-Reply-To: References: Message-ID: <20160415181902.GM1819@ando.pearwood.info> (Changing the subject line to something a little more relevant.)

Hi Richard, and welcome. My responses are below, interleaved between your comments as appropriate.

On Fri, Apr 15, 2016 at 05:57:05PM +0100, Richard Prosser wrote:
> Is there any mileage in having a naming convention to indicate the type of
> a variable?
Perhaps not mileage, as such, but maybe yardage or even inchage :-)

Certainly there are naming conventions which are very common: s for strings; n for ints; i, j, k for indexes, especially in a loop; o or obj for arbitrary objects of any type; x, y for floats or arbitrary numbers; etc. They're good for short, generic functions, and it wouldn't surprise me if linters and type-checkers had an option to complain if they see what looks like misused standard names:

    n = 23  # okay
    n = {}  # perhaps not?

For less generic functions, we should pick names which describe the purpose of the variable. That's usually far more important than its type, especially in languages like Python where *variables* (the names themselves) are untyped, only the values assigned to them have types.

> I have never really liked the fact that the Python 'duck typing' policy is so lax, yet the new "Type Hints" package for Python 3 is rather clumsy, IMO.

Oh? Can you give an example of type annotations or declarations which aren't clumsy?

> For example:
>
> github_response = requests.get('https://api.github.com/user', auth=('user', 'pass'))
> # Derived from http://docs.python-requests.org/en/master.
>
> The above request returns a Response object and so the variable has 'response' in its name.

I consider that a perfectly reasonable variable name, since it describes the variable and gives an idea of its purpose. The fact that it is of type Response is not as important as the fact that it is a response from a web-server.

> Likewise:
>
> word_count = total_words_in_file('text_file')
>
> where 'count' has been defined (in the IDE, by the user perhaps) as an Integer

Declaring that "count" is an integer doesn't tell us anything about "word_count". They're completely different variables.

> and the function is known to return an Integer, perhaps via a local 'count' or 'total' variable.
A good type-checker should be able to infer that if total_words_in_file returns an int, then word_count is also an int. (At least up to the point where it is re-bound to another value.)

> I know that this has been attempted before but I think that an IDE like PyCharm could actually check variable usage and issue a warning if a conflict is detected.

That's the purpose of MyPy and related projects. (By the way, for the avoidance of doubt, you should understand that Python will never force the use of type annotations or static typing.)

> Also earlier usages of this 'Hungarian Notation' have largely been applied to compiled languages - rather strangely, in the case of known types - rather than an interpreted one like Python.

You should be aware that there are two forms of Hungarian Notation. One is useless and widely despised, and the other is helpful but rarely used because its reputation was ruined by the other kind.

http://www.joelonsoftware.com/articles/Wrong.html

> Please note that I have shown suffixes above but prefixes could also be valid. I am not sure about relying on 'type strings' *within* a variable name however.

I'm not sure I understand what you mean here.

> Is this idea feasible, do you think?

If you mean, could Python enforce type hints through naming conventions, I think not. Do you really want to see code like:

    numpages_Integer_Or_None = self.count_pages()

While some form of Hungarian Notation is useful, using it everywhere as a type hint is going to get frustrating really fast. A good type hint or declaration should happen once:

    Declare numpages: Integer|None  # making up some syntax
    numpages = self.count_pages()
    if numpages is not None:
        for page_num in range(1, numpages+1):
            print("Page %d of %d." % (page_num, numpages))

not:

    numpages_Integer_Or_None = self.count_pages()
    if numpages_Integer_Or_None is not None:
        for page_num in range(1, numpages_Integer_Or_None+1):
            print("Page %d of %d." % (page_num, numpages_Integer_Or_None))

The second is too tiresome to read and write. Using abbreviations will decrease the typing burden, but increase the burden of remembering what those abbreviations mean.

-- Steve

From julien at palard.fr Fri Apr 15 14:24:22 2016 From: julien at palard.fr (Julien Palard) Date: Fri, 15 Apr 2016 18:24:22 +0000 Subject: [Python-ideas] Adding a pipe function to functools In-Reply-To: References: <55191387-6FC7-4631-8340-C153F70C41C8@gmail.com> Message-ID: Hi,

Le 15 avril 2016 19:29:40 GMT+02:00, "João Bernardo" a écrit :
>I really like this tool:
>
>https://github.com/JulienPalard/Pipe

First, thank you João! However I almost don't use it myself: I dislike the idea of exposing an overloaded operator (it may cause surprises), and there are in fact very few places where it's really clearer without reducing maintainability. I mean, working with a long pipe is opaque (no variables involved, so no logging and no breakpoints, and no names, while well-chosen names help readability). Also, using my Pipe module forces you to mix infix with prefix calls, and I dislike mixing syntaxes, yet having some DSL is sometimes cool when it is really useful - think of SQL, typically. Finally, the resulting code clearly does not look like Python code; even if readable, it may hurt readability, causing a surprise like "wow, what is that, how is it possible?", almost forcing one to read the Pipe docs instead of simply reading Python code.

All of this obviously also applies to the idea of adding an infix call syntax to the stdlib, so I'm -1 on it.

-- Julien Palard

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From wes.turner at gmail.com Fri Apr 15 14:51:45 2016 From: wes.turner at gmail.com (Wes Turner) Date: Fri, 15 Apr 2016 13:51:45 -0500 Subject: [Python-ideas] Adding a pipe function to functools In-Reply-To: <55191387-6FC7-4631-8340-C153F70C41C8@gmail.com> References: <55191387-6FC7-4631-8340-C153F70C41C8@gmail.com> Message-ID: On Apr 15, 2016 9:51 AM, "Ed Minnix" wrote:
>
> Hello,
>
> I have been looking over the toolz library and one of the functions I like the most is pipe. Since most programmers are familiar with piping (via the Unix `|` symbol), and it can help make tighter code, I think it would be nice to add it to the standard library (such as functools).

toolz.functoolz.pipe:

Docs: http://toolz.readthedocs.org/en/latest/api.html#toolz.functoolz.pipe
Src: https://github.com/pytoolz/toolz/blob/master/toolz/functoolz.py

> def pipe(data, *funcs):
>     """ [...] """
>     for func in funcs:
>         data = func(data)
>     return data

`|` is a "vertical bar": https://en.m.wikipedia.org/wiki/Vertical_bar

In Python, | is the 'bitwise or' operator __or__:

* [ ] the operator. docs seem to omit ``|`` #TODO
* https://docs.python.org/2/library/operator.html?#operator.__or__
* https://docs.python.org/3/library/operator.html?#operator.__or__
* https://docs.python.org/2/reference/expressions.html#index-63
* http://python-reference.readthedocs.org/en/latest/docs/operators/bitwise_OR.html

...

Sarge also has `|` within command strings for command pipelines: http://sarge.readthedocs.org/en/latest/tutorial.html#creating-command-pipelines

...

functools.compose() & functools.partial()

* https://docs.python.org/3.1/howto/functional.html#the-functional-module

      from functional import compose, partial
      import functools
      multi_compose = partial(functools.reduce, compose)

* https://docs.python.org/3.5/library/functools.html
* functools.compose is gone!
> > - Ed Minnix > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg.ewing at canterbury.ac.nz Fri Apr 15 20:32:00 2016 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 16 Apr 2016 12:32:00 +1200 Subject: [Python-ideas] Welcome to the "Python-ideas" mailing list In-Reply-To: References: Message-ID: <57118800.8080106@canterbury.ac.nz> Richard Prosser wrote: > For example: > > github_response = requests.get('https://api.github.com/user', auth=('user', 'pass')) > # Derived from http://docs.python-requests.org/en/master. > > The above request returns a |Response| > object > and so the variable has 'response' in its name. Conventions like that can be useful, but only when they clarify the code, which they don't always do. Sometimes they just make it needlessly verbose and hard to read. So I would not like to see any formal use made of such a convention. -- Greg From steve at pearwood.info Sat Apr 16 01:06:33 2016 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 16 Apr 2016 15:06:33 +1000 Subject: [Python-ideas] Adding a pipe function to functools In-Reply-To: References: <55191387-6FC7-4631-8340-C153F70C41C8@gmail.com> Message-ID: <20160416050632.GN1819@ando.pearwood.info> On Fri, Apr 15, 2016 at 01:51:45PM -0500, Wes Turner wrote: > * https://docs.python.org/3.1/howto/functional.html#the-functional-module > > from functional import compose, partial > import functools > multi_compose = partial(functools.reduce, compose) > > * https://docs.python.org/3.5/library/functools.html > > * functools.compose is gone! functools.compose never existed. The above example is functional.compose, a third party library. 
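[Editor's note: the semantics that functional.compose provided can be reproduced with a small recipe; this `compose` is illustrative only and is not part of functools.]

```python
from functools import reduce

def compose(*funcs):
    """compose(f, g, h)(x) == f(g(h(x))) - the classic right-to-left recipe."""
    return reduce(lambda f, g: lambda *a, **kw: f(g(*a, **kw)), funcs)

add_one = lambda x: x + 1
double = lambda x: x * 2
add_one_then_double = compose(double, add_one)  # computes double(add_one(x))
```

Note the right-to-left order: unlike pipe, which reads in application order, compose matches mathematical notation.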
-- Steve From wes.turner at gmail.com Sat Apr 16 02:04:25 2016 From: wes.turner at gmail.com (Wes Turner) Date: Sat, 16 Apr 2016 01:04:25 -0500 Subject: [Python-ideas] Adding a pipe function to functools In-Reply-To: References: <55191387-6FC7-4631-8340-C153F70C41C8@gmail.com> <20160416050632.GN1819@ando.pearwood.info> Message-ID: On Apr 16, 2016 12:07 AM, "Steven D'Aprano" wrote: > > On Fri, Apr 15, 2016 at 01:51:45PM -0500, Wes Turner wrote: > > > * https://docs.python.org/3.1/howto/functional.html#the-functional-module > > > > from functional import compose, partial > > import functools > > multi_compose = partial(functools.reduce, compose) > > > > * https://docs.python.org/3.5/library/functools.html > > > > * functools.compose is gone! > > functools.compose never existed. The above example is > functional.compose, a third party library. my mistake. in my haste, I had thought that functional was functools. it looks like funcy has both curry() and compose(): * http://funcy.readthedocs.org/en/stable/funcs.html#curry #compose * https://github.com/Suor/funcy/blob/master/funcy/funcs.py #compose * https://github.com/Suor/funcy/blob/master/funcy/simple_funcs.py #curry and fn.py has __lshift__ and __rshift__ for F.__compose: * https://github.com/kachayev/fn.py/blob/master/fn/func.py * > > > > -- > Steve > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From rosuav at gmail.com Sat Apr 16 22:50:53 2016 From: rosuav at gmail.com (Chris Angelico) Date: Sun, 17 Apr 2016 12:50:53 +1000 Subject: [Python-ideas] __getattr__ bouncer for modules Message-ID: Every now and then there's been talk of making it easier to subclass modules, and the most common use case that I can remember hearing about is descriptor protocol requiring code on the type. (For instance, you can't change a module-level constant into a property without moving away from the default ModuleType.)

How bad would it be for the default ModuleType to have a __getattr__ function which defers to the instance? Something like this:

    def __getattr__(self, name):
        if '__getattr__' in self.__dict__:
            return self.__dict__['__getattr__'](name)
        raise AttributeError

The biggest downside I'm seeing is that module attributes double as global names, which might mean this would get checked for every global name that ends up being resolved from the builtins (which is going to be a LOT). But I'm not sure if that's even true.

Dumb idea? Already been thought of and rejected?

ChrisA

From random832 at fastmail.com Sat Apr 16 23:26:44 2016 From: random832 at fastmail.com (Random832) Date: Sat, 16 Apr 2016 23:26:44 -0400 Subject: [Python-ideas] __getattr__ bouncer for modules In-Reply-To: References: Message-ID: <1460863604.2395852.580973377.0ED81943@webmail.messagingengine.com> On Sat, Apr 16, 2016, at 22:50, Chris Angelico wrote:

> def __getattr__(self, name):
>     if '__getattr__' in self.__dict__:
>         return self.__dict__['__getattr__'](name)
>     raise AttributeError
>
> The biggest downside I'm seeing is that module attributes double as global names, which might mean this would get checked for every global name that ends up being resolved from the builtins (which is going to be a LOT). But I'm not sure if that's even true.

It is not. (Also, incidentally, defining a global called __class__ does not set the module's class.)
I don't think this would be enough alone to let you use property decorators on a module - you'd have to explicitly define a __getattr__ (and __setattr__). And of course make sure that the names you're using as properties don't exist as real members of the module, since you're using __getattr__ instead of __getattribute__.

From njs at pobox.com Sat Apr 16 23:40:14 2016 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 16 Apr 2016 20:40:14 -0700 Subject: [Python-ideas] __getattr__ bouncer for modules In-Reply-To: References: Message-ID: On Sat, Apr 16, 2016 at 7:50 PM, Chris Angelico wrote:

> Every now and then there's been talk of making it easier to subclass modules, and the most common use case that I can remember hearing about is descriptor protocol requiring code on the type. (For instance, you can't change a module-level constant into a property without moving away from the default ModuleType.)
>
> How bad would it be for the default ModuleType to have a __getattr__ function which defers to the instance? Something like this:
>
> def __getattr__(self, name):
>     if '__getattr__' in self.__dict__:
>         return self.__dict__['__getattr__'](name)
>     raise AttributeError
>
> The biggest downside I'm seeing is that module attributes double as global names, which might mean this would get checked for every global name that ends up being resolved from the builtins (which is going to be a LOT). But I'm not sure if that's even true.

It's not true :-). Code executing inside the module has 'globals() is mod.__dict__', so lookups go directly to mod.__dict__ and skip mod.__getattr__.

However, starting in 3.5 cpython allows __class__ assignment on modules, so you can implement custom __getattr__ on a module with:

    class ModuleWithMyGetattr(types.ModuleType):
        def __getattr__(self, name):
            # .. whatever you want ...
    sys.modules[__name__].__class__ = ModuleWithMyGetattr

The advantage of doing it this way is that you can also implement other things like __dir__ (so tab completion on your new attributes will work).

This package backports the functionality to earlier versions of CPython:

https://pypi.python.org/pypi/metamodule
https://github.com/njsmith/metamodule/

Basically just replace the explicit __class__ assignment with

    import metamodule
    metamodule.install(__name__, ModuleWithMyGetattr)

(See the links above for more details and a worked example.)

-n

-- Nathaniel J. Smith -- https://vorpus.org

From rosuav at gmail.com Sat Apr 16 23:42:46 2016 From: rosuav at gmail.com (Chris Angelico) Date: Sun, 17 Apr 2016 13:42:46 +1000 Subject: [Python-ideas] __getattr__ bouncer for modules In-Reply-To: <1460863604.2395852.580973377.0ED81943@webmail.messagingengine.com> References: <1460863604.2395852.580973377.0ED81943@webmail.messagingengine.com> Message-ID: On Sun, Apr 17, 2016 at 1:26 PM, Random832 wrote:

> On Sat, Apr 16, 2016, at 22:50, Chris Angelico wrote:
>> def __getattr__(self, name):
>>     if '__getattr__' in self.__dict__:
>>         return self.__dict__['__getattr__'](name)
>>     raise AttributeError
>>
>> The biggest downside I'm seeing is that module attributes double as global names, which might mean this would get checked for every global name that ends up being resolved from the builtins (which is going to be a LOT). But I'm not sure if that's even true.
>
> It is not. (Also, incidentally, defining a global called __class__ does not set the module's class.)
>
> I don't think this would be enough alone to let you use property decorators on a module - you'd have to explicitly define a __getattr__ (and __setattr__). And of course make sure that the names you're using as properties don't exist as real members of the module, since you're using __getattr__ instead of __getattribute__.
Right, it wouldn't automatically allow the use of properties *as such*, but you would be able to achieve most of the same goal. # Version 1 BITS_PER_BYTE = 8 BITS_PER_WORD = 32 # Version 2 - doesn't work BITS_PER_BYTE = 8 @property def BITS_PER_WORD(): return 32 or 64 # Version 3 - could work BITS_PER_BYTE = 8 def __getattr__(name): if name == 'BITS_PER_WORD': return 32 or 64 raise AttributeError It's not as clean as actually supporting @property, but it could be done without the "bootstrap problem" of trying to have a module contain the class that it's to be an instance of. All you have to do is define __getattr__ as a regular top-level function, and it'll get called. You can then dispatch to property functions if you wish (eg """return globals()['_property_'+name]()"""), or just put all the code straight into __getattr__. ChrisA From rosuav at gmail.com Sat Apr 16 23:47:08 2016 From: rosuav at gmail.com (Chris Angelico) Date: Sun, 17 Apr 2016 13:47:08 +1000 Subject: [Python-ideas] __getattr__ bouncer for modules In-Reply-To: References: Message-ID: On Sun, Apr 17, 2016 at 1:40 PM, Nathaniel Smith wrote: > However, starting in 3.5 cpython allows __class__ assignment on > modules, so you can implement custom __getattr__ on a module with: > > class ModuleWithMyGetattr(types.ModuleType): > def __getattr__(self, name): > # .. whatever you want ... > > sys.modules[__name__].__class__ = ModuleWithMyGetattr > > The advantage of doing it this way is that you can also implement > other things like __dir__ (so tab completion on your new attributes > will work). > Oooh! I did not know that. That's pretty much what I was thinking of - you can have the class in the module that it's affecting. Coolness! And inside that class, you can toss in @property and everything, so it's exactly as clean as it wants to be. I like! Thanks Nathaniel. 
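The class-swap approach Chris is excited about here is easy to try in isolation. The sketch below uses an invented module name and invented attribute names purely for illustration; in a real module you would assign to `sys.modules[__name__].__class__` from the module body, as Nathaniel showed.

```python
import struct
import types

# A stand-in for a real module; in practice you would operate on
# sys.modules[__name__] from inside the module itself.
mod = types.ModuleType("bits")

class BitsModule(types.ModuleType):
    BITS_PER_BYTE = 8  # ordinary class attribute, looked up via the type

    @property
    def BITS_PER_WORD(self):
        # Computed lazily on attribute access, which a plain
        # module-level constant can never do.
        return 8 * struct.calcsize("P")  # native pointer width in bits

# CPython 3.5+ permits reassigning a module instance's class.
mod.__class__ = BitsModule

print(mod.BITS_PER_BYTE)   # 8
print(mod.BITS_PER_WORD)   # 32 or 64, depending on the interpreter build
```

Because the property lives on the type, attribute access on the module goes through the descriptor protocol just as it would for a normal instance.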
ChrisA From random832 at fastmail.com Sun Apr 17 00:06:30 2016 From: random832 at fastmail.com (Random832) Date: Sun, 17 Apr 2016 00:06:30 -0400 Subject: [Python-ideas] __getattr__ bouncer for modules In-Reply-To: References: <1460863604.2395852.580973377.0ED81943@webmail.messagingengine.com> Message-ID: <1460865990.2405856.580982697.11DCBC13@webmail.messagingengine.com> On Sat, Apr 16, 2016, at 23:42, Chris Angelico wrote: > It's not as clean as actually supporting @property, but it could be > done without the "bootstrap problem" of trying to have a module > contain the class that it's to be an instance of. All you have to do > is define __getattr__ as a regular top-level function, and it'll get > called. You can then dispatch to property functions if you wish (eg > """return globals()['_property_'+name]()"""), or just put all the code > straight into __getattr__. Or you could have ModuleType.__getattr(ibute?)__ search through some object other than the module itself for descriptors. from types import ModuleType, SimpleNamespace import sys # __magic__ could simply be a class, but just to prove it doesn't have to be: @property def foo(self): return eval(input("get foo?")) @foo.setter def foo(self, value): print("set foo=" + repr(value)) def __call__(self, v1, v2): print("modcall" + repr((v1, v2))) __magic__ = SimpleNamespace() for name in 'foo __call__'.split(): setattr(__magic__, name, globals()[name]) del globals()[name] class MagicModule(ModuleType): def __getattr__(self, name): descriptor = getattr(self.__magic__, name) return descriptor.__get__(self, None) def __setattr__(self, name, value): try: descriptor = getattr(self.__magic__, name) except AttributeError: return super().__setattr__(name, value) return descriptor.__set__(self, value) def __call__(self, *args, **kwargs): # Because why not? 
try: call = self.__magic__.__call__ except AttributeError: raise TypeError("Could not access " + self.__name__ + ".__magic__.__call__ method.") return call(self, *args, **kwargs) sys.modules[__name__].__class__ = MagicModule From k7hoven at gmail.com Sun Apr 17 04:49:53 2016 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Sun, 17 Apr 2016 11:49:53 +0300 Subject: [Python-ideas] __getattr__ bouncer for modules In-Reply-To: <1460865990.2405856.580982697.11DCBC13@webmail.messagingengine.com> References: <1460863604.2395852.580973377.0ED81943@webmail.messagingengine.com> <1460865990.2405856.580982697.11DCBC13@webmail.messagingengine.com> Message-ID: On Sun, Apr 17, 2016 at 7:06 AM, Random832 wrote: > On Sat, Apr 16, 2016, at 23:42, Chris Angelico wrote: >> It's not as clean as actually supporting @property, but it could be >> done without the "bootstrap problem" of trying to have a module >> contain the class that it's to be an instance of. All you have to do >> is define __getattr__ as a regular top-level function, and it'll get >> called. You can then dispatch to property functions if you wish (eg >> """return globals()['_property_'+name]()"""), or just put all the code >> straight into __getattr__. > > Or you could have ModuleType.__getattr(ibute?)__ search through some > object other than the module itself for descriptors. > > from types import ModuleType, SimpleNamespace > import sys > [...] Not for a module, but for a *package*, this was reasonably easy (well, at least after figuring out how ;-) to do in pre-3.5 Python: One would import a submodule in __init__.py and then, within that submodule, reconstruct the package using a subclass of ModuleType and replace the original package in sys.modules. 
-Koos From pavol.lisy at gmail.com Sun Apr 17 14:24:21 2016 From: pavol.lisy at gmail.com (Pavol Lisy) Date: Sun, 17 Apr 2016 20:24:21 +0200 Subject: [Python-ideas] Changing the meaning of bool.__invert__ In-Reply-To: <20160409154305.GU12526@ando.pearwood.info> References: <570679AF.7020502@stoneleaf.us> <57068A10.3020904@stoneleaf.us> <57069645.8040509@stoneleaf.us> <20160408014030.GL12526@ando.pearwood.info> <22279.18221.103226.654215@turnbull.sk.tsukuba.ac.jp> <20160409154305.GU12526@ando.pearwood.info> Message-ID: 2016-04-09 17:43 GMT+02:00, Steven D'Aprano : > On Fri, Apr 08, 2016 at 02:52:45PM +0900, Stephen J. Turnbull wrote: [...] >> I just don't see why the current >> behaviors of &|^ are particularly useful, since you'll have to guard >> all bitwise expressions against non-bool truthies and falsies. > > flag ^ flag is useful since we don't have a boolean-xor operator and > bitwise-xor does the right thing for bools. And I suppose some people > might prefer & and | over boolean-and and boolean-or because they're > shorter and require less typing. I don't think that's a particularly > good reason for using them, and as you say, you do have to guard > against non-bools slipping, but Consenting Adults applies. They are also useful if you need to avoid short-circuit evaluation. From k7hoven at gmail.com Sun Apr 17 15:28:58 2016 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Sun, 17 Apr 2016 22:28:58 +0300 Subject: [Python-ideas] Add __main__ for uuid, random and urandom In-Reply-To: <1459585335.1521491.566423210.2C35B7FD@webmail.messagingengine.com> References: <56FEB2B9.4020405@gmail.com> <1459585335.1521491.566423210.2C35B7FD@webmail.messagingengine.com> Message-ID: On Sat, Apr 2, 2016 at 11:22 AM, Random832 wrote: > On Sat, Apr 2, 2016, at 02:37, Koos Zevenhoven wrote: >> python -e "random.randint(0,10)" >> >> which would automatically import stdlib if their names appear in the >> expression. 
> > #!/usr/bin/env python3 > import sys > class magicdict(dict): > def __getitem__(self, x): > try: > return super().__getitem__(x) > except KeyError: > try: > mod = __import__(x) > self[x] = mod > return mod > except ImportError: > raise KeyError > > g = magicdict() > for arg in sys.argv[1:]: > try: > p, obj = True, eval(arg, g) > except SyntaxError: > p = False > exec(arg, g) > if p: > sys.displayhook(obj) > > > Handling modules inside packages is left as an exercise for the reader. Thanks :). There were some positive reactions to this in the discussions two weeks ago, so I decided to go on and implement this further. Then I kind of forgot about it, but now the other getattr thread reminded me of this, so I came back to it. The implementation now requires Python 3.5+, but I could also do it the same way as I did in my 'np' package [1], which in the newest version (released three weeks ago), uses different module-magic-method approaches for Python<3.5 and >= 3.5, so it even works on Python 2. So here's oneline.py: https://gist.github.com/k7hoven/21c5532ce19b306b08bb4e82cfe5a609 I suppose this could be on pypi, and one could do things like oneline.py "random.randint(0,10)" or python -m oneline "random.randint(0,10)" Any thoughts? -Koos [1] https://pypi.python.org/pypi/np From leewangzhong+python at gmail.com Sun Apr 17 21:04:11 2016 From: leewangzhong+python at gmail.com (Franklin? Lee) Date: Sun, 17 Apr 2016 21:04:11 -0400 Subject: [Python-ideas] Have REPL print less by default In-Reply-To: References: Message-ID: Two different (probably radical) ideas, with the same justification: reduce often-useless output in the REPL, which can flood out terminal history and overwhelm the user. 1. Limit the output per entered command: If you type into the REPL (AKA interactive shell), list(range(n)) and you forgot that you set n to 10**10, the interpreter should not print more than a page of output. Instead, it will print a few lines ("... 
and approximately X more lines"), and tell you how to print more. (E.g. "Call '_more()' for more. Call '_full()' for full output.") Alternatively, have "less"-like behavior. 2. Only print a few parts of the stack trace. In particular, for a recursive or mutually recursive function, if the error was due to maximum recursion (is this reasonably distinguishable? the error is `RuntimeError('maximum recursion depth exceeded')`), try to print each function on the stack only once. Again, there should be a message telling you how to get the full stacktrace printed. EXACTLY how, preferably in a way that is easy to type, so that a typo won't cause the trace to be lost. It should not use `sys.something()`, because the user's first few encounters with this message will result in, "NameError: name 'sys' is not defined". A few possible rules to reduce stacktrace size: - Always show the last (top) frame(?). - Hide any other stdlib funcs directly below (above) it. - If a function appears more than once in a row, show it once, with the note, "(and X recursive calls)". - If a function otherwise appears more than once (usually by mutual recursion?), and there is a run of them, list them as, "(Mutual recursion: 'x' (5 times), 'y' (144 times), 'z' (13 times).)". These two behaviors and their corresponding functions could go into a special module which is (by default) loaded by the interactive shell. The behaviors can be turned off with some command-line verbosity flag, or tuned with command-line parameters (e.g. how many lines/pages to print). Justification: - Excessive output floods the terminal window. Some terminals have a limit on output history (Windows *defaults* to 300 lines), or don't let you scroll up at all (at least, my noob self couldn't figure it out when I did get flooded). - Students learning Python, and also everyone else using Python, don't usually need 99% of a 10000-line output or stack trace. 
- People who want the full output are probably advanced users with, like, high-limit or unlimited window size, and advanced users are more likely to look for a verbosity flag, or use a third-party REPL. Default should be newbie friendly, because advanced users can work around it. Thoughts? Even if the specific proposals are unworkable, is limiting REPL output (by default) a reasonable goal? -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben+python at benfinney.id.au Sun Apr 17 21:42:57 2016 From: ben+python at benfinney.id.au (Ben Finney) Date: Mon, 18 Apr 2016 11:42:57 +1000 Subject: [Python-ideas] Have REPL print less by default References: Message-ID: <85r3e3sndq.fsf@benfinney.id.au> "Franklin? Lee" writes: > 1. Limit the output per entered command: If you type into the REPL (AKA > interactive shell), > > list(range(n)) > > and you forgot that you set n to 10**10, the interpreter should not > print more than a page of output. For that specific example, when I run it, the output is quite short: $ python3 >>> n = 10**10 >>> list(range(n)) Traceback (most recent call last): File "<stdin>", line 1, in <module> MemoryError I take the point though: some objects have a very long 'repr' output. > Instead, it will print a few lines ("... and approximately X more > lines"), and tell you how to print more. (E.g. "Call '_more()' for > more. Call '_full()' for full output.") Perhaps you want a different "print the interactive-REPL-safe text representation of this object" function, and to have the interactive REPL use that new function to represent objects. -- \ "I am too firm in my consciousness of the marvelous to be ever | `\ fascinated by the mere supernatural ..." 
--Joseph Conrad, _The | _o__) Shadow-Line_ | Ben Finney From dan at tombstonezero.net Sun Apr 17 22:33:31 2016 From: dan at tombstonezero.net (Dan Sommers) Date: Mon, 18 Apr 2016 02:33:31 +0000 (UTC) Subject: [Python-ideas] Have REPL print less by default References: Message-ID: On Sun, 17 Apr 2016 21:04:11 -0400, Franklin? Lee wrote: > These two behaviors and their corresponding functions could go into a > special module which is (by default) loaded by the interactive shell. The > behaviors can be turned off with some command-line verbosity flag, or tuned > with command-line parameters (e.g. how many lines/pages to print). The existing pretty print module already does some of what you want. Instead of creating a new module, maybe extending that one would have greater benefits to more users. > - People who want the full output are probably advanced users with, like, > high-limit or unlimited window size, and advanced users are more likely to > look for a verbosity flag, or use a third-party REPL. Default should be > newbie friendly, because advanced users can work around it. Judging what others might find more friendly, or guessing what actions they are more likely to take, can be dangerous. > Thoughts? Even if the specific proposals are unworkable, is limiting > REPL output (by default) a reasonable goal? IMO, the default REPL should be as simple as possible, and respond as directly as possible to what I ask it to do. It should also be [highly] configurable for when you want more complexity, like summarizing stack traces and paginating output. From leewangzhong+python at gmail.com Sun Apr 17 23:03:21 2016 From: leewangzhong+python at gmail.com (Franklin? 
Lee) Date: Sun, 17 Apr 2016 23:03:21 -0400 Subject: [Python-ideas] Have REPL print less by default In-Reply-To: <85r3e3sndq.fsf@benfinney.id.au> References: <85r3e3sndq.fsf@benfinney.id.au> Message-ID: On Apr 17, 2016 9:43 PM, "Ben Finney" wrote: > Perhaps you want a different "print the interactive-REPL-safe text > representation of this object" function, and to have the interactive > REPL use that new function to represent objects. You mean that the REPL should call `repr_limited(...)` instead of `repr(...)`, and not that class makers should implement `__repr_limited__`, right? I think one could make a `pprint` class for that. Thing is, I would also want the following to have limited output. >>> def dumb(n): ... for i in range(n): ... print(i) ... >>> dumb(10**10) That could print the first 50 or so lines, then the REPL buffers the rest and prints out the message about suppressed output. On the other hand, for a long-running computation with logging statements, I don't wanna start it, come back, and lose all that would have been printed, just because I forgot that the REPL does that now. Storing it all could be a use of memory that the computation might not be able to afford. Possible solution: 1. After output starts getting buffered, print out warnings every X seconds about Y lines being buffered. 2. Detect screen limit, and only store that many lines back. You don't lose anything that wasn't already going to be lost. (Optional, but possibly useful: Also warn about Z lines being lost due to screen limit.) It was going to be in the terminal's memory, anyway, so just store it (as efficiently as possible) in Python's memory. I think this only uses up to twice as much total system memory (actual screen mem + dump). Since you won't use the output as Python at least until the user comes back, it could be lazy about converting to `str`, and maybe even compress the buffer on the fly (if queue compression is cheaper than printing). 
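The "only store that many lines back" idea in point 2 above can be sketched with a bounded deque; the class name and the three-line limit here are invented for illustration, not an actual REPL patch.

```python
import collections

class RingBufferWriter:
    """File-like writer that keeps only the last `max_lines` lines written."""

    def __init__(self, max_lines):
        self.lines = collections.deque(maxlen=max_lines)  # old lines fall off
        self.dropped = 0      # how many lines were discarded
        self._partial = ""    # text seen since the last newline

    def write(self, text):
        self._partial += text
        # Everything before the final newline is a complete line.
        *complete, self._partial = self._partial.split("\n")
        for line in complete:
            if len(self.lines) == self.lines.maxlen:
                self.dropped += 1
            self.lines.append(line)
        return len(text)

buf = RingBufferWriter(max_lines=3)
for i in range(10):
    print(i, file=buf)

print(list(buf.lines))  # ['7', '8', '9'] -- only the last three survive
print(buf.dropped)      # 7 lines fell off the front
```

A real REPL hook would additionally echo the live lines to the terminal and report `dropped` in the warning message.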
It gets trickier when you want user input during the eval part of the REPL. 3. Allow the user to interrupt the eval to print the last page of output, or dump the entire current output, and then Python will continue the eval. (Include in the warning, "Press Ctrl+?? to show .") 4. If the eval is waiting for user input, show the last page. (I don't know how the user could ask for more pages without sending something that the eval is allowed to interpret as input. I don't understand terminal keyboard control that much.) -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben+python at benfinney.id.au Sun Apr 17 23:24:06 2016 From: ben+python at benfinney.id.au (Ben Finney) Date: Mon, 18 Apr 2016 13:24:06 +1000 Subject: [Python-ideas] Have REPL print less by default References: <85r3e3sndq.fsf@benfinney.id.au> Message-ID: <85lh4bsip5.fsf@benfinney.id.au> "Franklin? Lee" writes: > You mean that the REPL should call `repr_limited(...)` instead of > `repr(...)`, and not that class makers should implement > `__repr_limited__`, right? The former, yes: changing the behaviour of the interactive session, without the need for the programmer to change their code. To be clear, I'm asking whether that would meet your requirements :-) > I think one could make a `pprint` class for that. Sure, that would be a good place for it. I think that's much more feasible than changing the behaviour of 'repr' for this purpose. > Thing is, I would also want the following to have limited output. > > >>> def dumb(n): > ... for i in range(n): > ... print(i) > ... > >>> dumb(10**10) Again, it's only the interactive session which has the behaviour you want changed. That code run from a non-interactive session simply won't output anything if there's no error, so I think 'repr' is not a problem by default. 
It's only because the interactive session has the *additional* behaviour of emitting the representation of the object resulting from evaluation, that it has the behaviour you describe. So this is increasing the likelihood that the best way to address your request is not "change default 'repr' behaviour", but "change the representation function the interactive session uses for its special object-representation output". -- \ "The conflict between humanists and religionists has always | `\ been one between the torch of enlightenment and the chains of | _o__) enslavement." --Wole Soyinka, 2014-08-10 | Ben Finney From njs at pobox.com Sun Apr 17 23:57:07 2016 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 18 Apr 2016 03:57:07 +0000 Subject: [Python-ideas] Have REPL print less by default In-Reply-To: References: Message-ID: On Mon, Apr 18, 2016 at 2:33 AM, Dan Sommers wrote: > On Sun, 17 Apr 2016 21:04:11 -0400, Franklin? Lee wrote: > >> These two behaviors and their corresponding functions could go into a >> special module which is (by default) loaded by the interactive shell. The >> behaviors can be turned off with some command-line verbosity flag, or tuned >> with command-line parameters (e.g. how many lines/pages to print). > > The existing pretty print module already does some of what you want. > Instead of creating a new module, maybe extending that one would have > greater benefits to more users. There's also IPython's repr system as prior art: http://ipython.readthedocs.org/en/stable/api/generated/IPython.lib.pretty.html#module-IPython.lib.pretty My impression (as I guess one of the few people who have tried to systematically add _repr_pretty_ callbacks to their libraries) is that it's only about 30% baked and has a number of flaws and limitations, but it's still a significant step beyond __repr__ or the stdlib pprint. 
(Some issues I've run into: there's something wonky in the line breaking that often produces misaligned lines; integration with __repr__ is awkward if you don't want to reimplement everything twice [1]; there's no built-in control for compressing output down to ellipses; handling the most common case of reprs that look like Foo(bar, baz, kwarg=1) is way more awkward than it needs to be [2]; and IIRC there were some awkward limitations in the formatting tools provided though I don't remember the details now. But despite all this it does make it easy to write composable reprs, so e.g. here's a syntax tree: In [1]: import patsy In [2]: patsy.parse_formula.parse_formula("y ~ 1 + (a * b)") Out[2]: ParseNode('~', Token('~', ~<- 1 + (a * b) (2-3)>), [ParseNode('PYTHON_EXPR', Token('PYTHON_EXPR', y<- ~ 1 + (a * b) (0-1)>, extra='y'), []), ParseNode('+', Token('+', +<- (a * b) (6-7)>), [ParseNode('ONE', Token('ONE', 1<- + (a * b) (4-5)>, extra='1'), []), ParseNode('*', Token('*', *<- b) (11-12)>), [ParseNode('PYTHON_EXPR', Token('PYTHON_EXPR', a<- * b) (9-10)>, extra='a'), []), ParseNode('PYTHON_EXPR', Token('PYTHON_EXPR', b<-) (13-14)>, extra='b'), [])])])]) -n [1] https://github.com/pydata/patsy/blob/23ab37519276188c4ce09db03a1c40c5b9938bfc/patsy/util.py#L356-L424 [2] https://github.com/pydata/patsy/blob/23ab37519276188c4ce09db03a1c40c5b9938bfc/patsy/util.py#L426-L452 -- Nathaniel J. Smith -- https://vorpus.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From leewangzhong+python at gmail.com Mon Apr 18 01:32:41 2016 From: leewangzhong+python at gmail.com (Franklin? Lee) Date: Mon, 18 Apr 2016 01:32:41 -0400 Subject: [Python-ideas] Have REPL print less by default In-Reply-To: <85lh4bsip5.fsf@benfinney.id.au> References: <85r3e3sndq.fsf@benfinney.id.au> <85lh4bsip5.fsf@benfinney.id.au> Message-ID: On Apr 17, 2016 11:24 PM, "Ben Finney" wrote: > > Sure, that would be a good place for it. 
I think that's much more > feasible than changing the behaviour of 'repr' for this purpose. Huh? I never meant for any change to happen to `repr`. The desired behavior is, "Set default: Each command entered into the REPL should have limited output (without loss of capability for a newbie)." The proposal is to have the REPL determine when it's going to output too much. My example, as originally written, `print`d the huge object. I realized that this was too specific, and I didn't *really* care about `print` as such, so I changed it to (what I thought was) a more general example. I should've just had multiple examples. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Mon Apr 18 03:12:52 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 18 Apr 2016 17:12:52 +1000 Subject: [Python-ideas] Add __main__ for uuid, random and urandom In-Reply-To: References: <56FEB2B9.4020405@gmail.com> <1459585335.1521491.566423210.2C35B7FD@webmail.messagingengine.com> Message-ID: On 18 April 2016 at 05:28, Koos Zevenhoven wrote: > So here's oneline.py: > > https://gist.github.com/k7hoven/21c5532ce19b306b08bb4e82cfe5a609 > > Neat, although you'll want to use importlib.import_module() rather than calling __import__ directly (the latter won't behave the way you want when importing submodules, as it returns the top level module for the import statement to bind in the current namespace, rather than the imported submodule) 
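The difference Nick describes here can be checked directly against the stdlib's own os.path submodule:

```python
import importlib
import os

top = __import__("os.path")               # returns the top-level package: os
sub = importlib.import_module("os.path")  # returns the submodule itself

assert top is os        # __import__ handed back the package root
assert sub is os.path   # import_module handed back what was asked for
print(top.__name__, sub.__name__)  # "os" and "posixpath" (or "ntpath" on Windows)
```

Note that `__import__("os.path")` does still import the submodule as a side effect; it just returns the wrong handle for a tool like oneline.py to use.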
Another interesting example is pyp: https://code.google.com/archive/p/pyp/wikis/pyp_manual.wiki A completely undocumented hack I put together while playing one day was a utility to do json -> json transformations via command line pipes: https://bitbucket.org/ncoghlan/misc/src/default/pycall The challenge with these kinds of things is getting them from "Hey, look at this cool thing you can do" to "This will materially improve your day-to-day programming experience". The former can still be fun to work on as a hobby, but it's the latter that people need to get over the initial adoption barrier. Cheers, Nick. > > -Koos > > [1] https://pypi.python.org/pypi/np > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Mon Apr 18 03:25:24 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 18 Apr 2016 17:25:24 +1000 Subject: [Python-ideas] Have REPL print less by default In-Reply-To: References: <85r3e3sndq.fsf@benfinney.id.au> <85lh4bsip5.fsf@benfinney.id.au> Message-ID: On 18 April 2016 at 15:32, Franklin? Lee wrote: > On Apr 17, 2016 11:24 PM, "Ben Finney" wrote: > > > > Surem, that would be a good place for it. I think that's much more > > feasible than changing the behaviour of ?repr? for this purpose. > > Huh? I never meant for any change to happen to `repr`. The desired > behavior is, "Set default: Each command entered into the REPL should have > limited output (without loss of capability for a newbie)." The proposal is > to have the REPL determine when it's going to output too much. 
> A few tips for folks that want to play with this: - setting sys.displayhook controls how evaluated objects are displayed in the default REPL - setting sys.excepthook does the same for exceptions - the PYTHONSTARTUP env var runs the given file when the default REPL starts So folks are already free to change their REPL to work however they want it to: set sys.displayhook and sys.excepthook from a PYTHONSTARTUP file Changing the *default* REPL behaviour is a very different question, and forks off in a couple of different directions: - improved defaults for teaching novices? Perhaps the default REPL isn't the best environment for that - easier debugging at the REPL? Perhaps pprint should gain an "install_displayhook()" option that overwrites sys.displayhook and optionally allows enabling of an output pager Since the more the REPL does, the more opportunities there are for it to break when debugging, having the output hooks be as simple as possible is quite desirable. However, making it easier to request something more sophisticated via the pprint module seems like a plausible approach to me. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From mal at egenix.com Mon Apr 18 03:27:36 2016 From: mal at egenix.com (M.-A. Lemburg) Date: Mon, 18 Apr 2016 09:27:36 +0200 Subject: [Python-ideas] Have REPL print less by default In-Reply-To: References: Message-ID: <57148C68.5010707@egenix.com> On 18.04.2016 03:04, Franklin? Lee wrote: > Two different (probably radical) ideas, with the same justification: reduce > often-useless output in the REPL, which can flood out terminal history and > overwhelm the user. > > 1. Limit the output per entered command: If you type into the REPL (AKA > interactive shell), > > list(range(n)) > > and you forgot that you set n to 10**10, the interpreter should not print > more than a page of output. Instead, it will print a few lines ("... 
and > approximately X more lines"), and tell you how to print more. (E.g. "Call > '_more()' for more. Call '_full()' for full output.") > > Alternatively, have "less"-like behavior. You should be able to write your own sys.displayhook to accomplish this. > 2. Only print a few parts of the stack trace. In particular, for a > recursive or mutually recursive function, if the error was due to maximum > recursion (is this reasonably distinguishable? the error is > `RuntimeError('maximum recursion depth exceeded')`), try to print each > function on the stack once each. > > Again, there should be a message telling you how to get the full stacktrace > printed. EXACTLY how, preferably in a way that is easy to type, so that a > typo won't cause the trace to be lost. It should not use `sys.something()`, > because the user's first few encounters with this message will result in, > "NameError: name 'sys' is not defined". Same here with sys.excepthook. FWIW: I see your point in certain situations, but don't think the defaults should be such that you have to enable some variable to see everything. This would make debugging harder than necessary, since often enough (following Murphy's law) the most interesting information would be hidden in some ellipsis. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Experts (#1, Apr 18 2016) >>> Python Projects, Coaching and Consulting ... http://www.egenix.com/ >>> Python Database Interfaces ... http://products.egenix.com/ >>> Plone/Zope Database Interfaces ... http://zope.egenix.com/ ________________________________________________________________________ ::: We implement business ideas - efficiently in both time and costs ::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. 
Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ http://www.malemburg.com/ From leewangzhong+python at gmail.com Mon Apr 18 11:41:14 2016 From: leewangzhong+python at gmail.com (Franklin? Lee) Date: Mon, 18 Apr 2016 11:41:14 -0400 Subject: [Python-ideas] Have REPL print less by default In-Reply-To: References: <85r3e3sndq.fsf@benfinney.id.au> <85lh4bsip5.fsf@benfinney.id.au> Message-ID: On Apr 18, 2016 3:25 AM, "Nick Coghlan" wrote: > > On 18 April 2016 at 15:32, Franklin? Lee wrote: >> >> On Apr 17, 2016 11:24 PM, "Ben Finney" wrote: >> > >> > Surem, that would be a good place for it. I think that's much more >> > feasible than changing the behaviour of ?repr? for this purpose. >> >> Huh? I never meant for any change to happen to `repr`. The desired behavior is, "Set default: Each command entered into the REPL should have limited output (without loss of capability for a newbie)." The proposal is to have the REPL determine when it's going to output too much. > > A few tips for folks that want to play with this: > > - setting sys.displayhook controls how evaluated objects are displayed in the default REPL > - setting sys.excepthook does the same for exceptions > - the PYTHONSTARTUP env var runs the given file when the default REPL starts > > So folks are already free to change their REPL to work however they want it to: set sys.displayhook and sys.excepthook from a PYTHONSTARTUP file I don't want arguments like, "This can already be done, for yourself, if you really need it." I use IPython's shell, with maximum output height, and it was years ago that I used a terminal which I couldn't scroll. I want arguments like, "This will break my workflow." > - improved defaults for teaching novices? Perhaps the default REPL isn't the best environment for that Why not? I imagine that self-taught novices will use the default REPL more than the advanced users. 
-------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Mon Apr 18 11:51:46 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 18 Apr 2016 08:51:46 -0700 Subject: [Python-ideas] Have REPL print less by default In-Reply-To: References: <85r3e3sndq.fsf@benfinney.id.au> <85lh4bsip5.fsf@benfinney.id.au> Message-ID: <57150292.9020409@stoneleaf.us> On 04/18/2016 08:41 AM, Franklin? Lee wrote: > On Apr 18, 2016 3:25 AM, "Nick Coghlan" wrote: >> A few tips for folks that want to play with this: >> >> - setting sys.displayhook controls how evaluated objects are >> displayed in the default REPL >> - setting sys.excepthook does the same for exceptions >> - the PYTHONSTARTUP env var runs the given file when the default REPL >> starts >> >> So folks are already free to change their REPL to work however they >> want it to: set sys.displayhook and sys.excepthook from a PYTHONSTARTUP file > > I don't want arguments like, "This can already be done, for yourself, if > you really need it." I use IPython's shell, with maximum output height, > and it was years ago that I used a terminal which I couldn't scroll. > > I want arguments like, "This will break my workflow." How about: the default REPL is a basic tool, and the limitations of basic tools are what drive folks to seek out advanced tools. ? Or Marc-Andre's: > I [...] don't think the defaults should be such that you have to > enable some variable to see everything. This would make debugging > harder than necessary, since often enough (following Murphy's law) > the most interesting information would be hidden in some ellipsis. Or even Nick's (that you snipped): > Since the more the REPL does, the more opportunities there are for > it to break when debugging, having the output hooks be as simple as > possible is quite desirable. IMO those are all good reasons to leave the basic REPL alone. 
-- ~Ethan~ From k7hoven at gmail.com Mon Apr 18 12:03:46 2016 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Mon, 18 Apr 2016 19:03:46 +0300 Subject: [Python-ideas] Add __main__ for uuid, random and urandom In-Reply-To: References: <56FEB2B9.4020405@gmail.com> <1459585335.1521491.566423210.2C35B7FD@webmail.messagingengine.com> Message-ID: On Mon, Apr 18, 2016 at 10:12 AM, Nick Coghlan wrote: > On 18 April 2016 at 05:28, Koos Zevenhoven wrote: >> >> So here's oneline.py: >> >> https://gist.github.com/k7hoven/21c5532ce19b306b08bb4e82cfe5a609 >> > > Neat, although you'll want to use importlib.import_module() rather than > calling __import__ directly (the latter won't behave the way you want when > importing submodules, as it returns the top level module for the import > statement to bind in the current namespace, rather than the imported > submodule) > Thanks :). That is in fact why I had worked around it by grabbing sys.modules[name] instead. Good to know import_module() already does the right thing. I now changed the code to use import_module, assuming that is the preferred way today. However, to prevent infinite recursion when importing submodules, I now do a setattr(parentmodule, submodulename, None) before the import (and delattr if the import fails). >> >> I suppose this could be on pypi, and one could do things like >> >> oneline.py "random.randint(0,10)" >> >> or >> >> python -m oneline "random.randint(0,10)" >> >> Any thoughts? > > > There are certainly plenty of opportunities to make Python easier to invoke > for one-off commands. Another interesting example is pyp: > https://code.google.com/archive/p/pyp/wikis/pyp_manual.wiki This is nice, although solves a different problem. 
> A completely undocumented hack I put together while playing one day was a > utility to do json -> json transformations via command line pipes: > https://bitbucket.org/ncoghlan/misc/src/default/pycall So it looks like it would work like this: cat input.json | pycall "my.transformation.function" > output.json Also a different problem, but cool. > The challenge with these kinds of things is getting them from "Hey, look at > this cool thing you can do" to "This will materially improve your day-to-day > programming experience". The former can still be fun to work on as a hobby, > but it's the latter that people need to get over the initial adoption > barrier. I think the users of oneline.py could be people that now write lots of bash scripts and work on the command line. So whenever someone asks a question somewhere about how to do X on the linux command line, we might have the answer: """ Q: On the linux commandline, how do I get only the filename from a full path that is in $FILEPATH A: Python has this. You can use the tools in os.path: Filename: $ oneline.py "os.path.basename('$FILEPATH')" Path to directory: $ oneline.py "os.path.dirname('$FILEPATH')" """ This might be more appealing than python -c. The whole point is to make Python's power available and visible for a larger audience. 
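[Editor's note] For flavor, here is a toy sketch of the core trick behind oneline.py (all names here are invented, and this is far less robust than the real gist): keep retrying eval(), importing each name that comes back as a NameError via importlib.import_module().

```python
import importlib

def eval_oneline(expr):
    """Toy version of the oneline.py idea (function name invented here).

    Retry eval(), importing each top-level name that raises NameError,
    until the expression evaluates.
    """
    namespace = {}
    while True:
        try:
            return eval(expr, namespace)
        except NameError as exc:
            # "name 'os' is not defined" -> "os"
            missing = str(exc).split("'")[1]
            namespace[missing] = importlib.import_module(missing)

print(eval_oneline("os.path.basename('/tmp/spam.txt')"))  # spam.txt
```

A command-line wrapper would just take expr from sys.argv[1]. Note that dotted submodules (e.g. urllib.parse) still need the extra handling Nick and Koos discuss above, since import_module('urllib') does not by itself make urllib.parse available.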
-Koos From wes.turner at gmail.com Mon Apr 18 13:30:01 2016 From: wes.turner at gmail.com (Wes Turner) Date: Mon, 18 Apr 2016 12:30:01 -0500 Subject: [Python-ideas] Add __main__ for uuid, random and urandom In-Reply-To: References: <56FEB2B9.4020405@gmail.com> <1459585335.1521491.566423210.2C35B7FD@webmail.messagingengine.com> Message-ID: On Apr 18, 2016 11:04 AM, "Koos Zevenhoven" wrote: > > On Mon, Apr 18, 2016 at 10:12 AM, Nick Coghlan wrote: > > On 18 April 2016 at 05:28, Koos Zevenhoven wrote: > >> > >> So here's oneline.py: > >> > >> https://gist.github.com/k7hoven/21c5532ce19b306b08bb4e82cfe5a609 > >> > > > > Neat, although you'll want to use importlib.import_module() rather than > > calling __import__ directly (the latter won't behave the way you want when > > importing submodules, as it returns the top level module for the import > > statement to bind in the current namespace, rather than the imported > > submodule) > > > > Thanks :). That is in fact why I had worked around it by grabbing > sys.modules[name] instead. Good to know import_module() already does > the right thing. I now changed the code to use import_module, assuming > that is the preferred way today. However, to prevent infinite > recursion when importing submodules, I now do a setattr(parentmodule, > submodulename, None) before the import (and delattr if the import > fails). > > >> > >> I suppose this could be on pypi, and one could do things like > >> > >> oneline.py "random.randint(0,10)" > >> > >> or > >> > >> python -m oneline "random.randint(0,10)" > >> > >> Any thoughts? > > > > > > There are certainly plenty of opportunities to make Python easier to invoke > > for one-off commands. Another interesting example is pyp: > > https://code.google.com/archive/p/pyp/wikis/pyp_manual.wiki > > This is nice, although solves a different problem. 
> > > A completely undocumented hack I put together while playing one day was a > > utility to do json -> json transformations via command line pipes: > > https://bitbucket.org/ncoghlan/misc/src/default/pycall > > So it looks like it would work like this: > > cat input.json | pycall "my.transformation.function" > output.json > > Also a different problem, but cool. > > > The challenge with these kinds of things is getting them from "Hey, look at > > this cool thing you can do" to "This will materially improve your day-to-day > > programming experience". The former can still be fun to work on as a hobby, > > but it's the latter that people need to get over the initial adoption > > barrier. > > I think the users of oneline.py could be people that now write lots of > bash scripts and work on the command line. So whenever someone asks a > question somewhere about how to do X on the linux command line, we > might have the answer: """ > > Q: On the linux commandline, how do I get only the filename from a > full path that is in $FILEPATH > > A: Python has this. You can use the tools in os.path: > > Filename: > $ oneline.py "os.path.basename('$FILEPATH')" > > Path to directory: > $ oneline.py "os.path.dirname('$FILEPATH')" > """ FILEPATH='for'"example');"'subprocess.call("cat /etc/passwd", shell=True)' > > This might be more appealing than python -c. The whole point is to > make Python's power available and visible for a larger audience. > > -Koos > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From wes.turner at gmail.com Mon Apr 18 13:36:41 2016 From: wes.turner at gmail.com (Wes Turner) Date: Mon, 18 Apr 2016 12:36:41 -0500 Subject: [Python-ideas] Add __main__ for uuid, random and urandom In-Reply-To: References: <56FEB2B9.4020405@gmail.com> <1459585335.1521491.566423210.2C35B7FD@webmail.messagingengine.com> Message-ID: On Apr 18, 2016 12:30 PM, "Wes Turner" wrote: > > > > I think the users of oneline.py could be people that now write lots of > > bash scripts and work on the command line. So whenever someone asks a > > question somewhere about how to do X on the linux command line, we > > might have the answer: """ > > > > Q: On the linux commandline, how do I get only the filename from a > > full path that is in $FILEPATH > > > > A: Python has this. You can use the tools in os.path: > > > > Filename: > > $ oneline.py "os.path.basename('$FILEPATH')" > > > > Path to directory: > > $ oneline.py "os.path.dirname('$FILEPATH')" > > """ > > FILEPATH='for'"example');"'subprocess.call("cat /etc/passwd", shell=True)' sys.argv[1] (IFS=' ') stdin (~IFS=$'\n') ... * https://github.com/westurner/dotfiles/blob/develop/scripts/el * https://github.com/westurner/pyline/blob/master/pyline/pyline.py (considering adding an argument (in addition to the existing -m) for importlib.import_module)) > > > > > This might be more appealing than python -c. The whole point is to > > make Python's power available and visible for a larger audience. > > > > -Koos > > _______________________________________________ > > Python-ideas mailing list > > Python-ideas at python.org > > https://mail.python.org/mailman/listinfo/python-ideas > > Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ben+python at benfinney.id.au Mon Apr 18 14:39:32 2016 From: ben+python at benfinney.id.au (Ben Finney) Date: Tue, 19 Apr 2016 04:39:32 +1000 Subject: [Python-ideas] Have REPL print less by default References: <85r3e3sndq.fsf@benfinney.id.au> <85lh4bsip5.fsf@benfinney.id.au> Message-ID: <85d1pmsqvv.fsf@benfinney.id.au> "Franklin? Lee" writes: > On Apr 18, 2016 3:25 AM, "Nick Coghlan" wrote: > > So folks are already free to change their REPL to work however they > > want it to: set sys.displayhook and sys.excepthook from a > > PYTHONSTARTUP file > > I don't want arguments like, "This can already be done, for yourself, > if you really need it." That's your prerogative. This forum is for discussing whether ideas are right for changing Python, though. Arguments such as "This can already be done, for yourself, if you need it" are salient and sufficient, and not to be dismissed. > I want arguments like, "This will break my workflow." You're free to solicit those arguments. You'll need to acknowledge, though, that they are in addition to the quite sufficient argument of "already possible for people to get this in Python as is, if they want it". -- \ "I think Western civilization is more enlightened precisely | `\ because we have learned how to ignore our religious leaders." | _o__) --Bill Maher, 2003 | Ben Finney From wes.turner at gmail.com Mon Apr 18 15:20:47 2016 From: wes.turner at gmail.com (Wes Turner) Date: Mon, 18 Apr 2016 14:20:47 -0500 Subject: [Python-ideas] Add __main__ for uuid, random and urandom In-Reply-To: References: <56FEB2B9.4020405@gmail.com> <1459585335.1521491.566423210.2C35B7FD@webmail.messagingengine.com> Message-ID: On Apr 18, 2016 12:36 PM, "Wes Turner" wrote: > > > On Apr 18, 2016 12:30 PM, "Wes Turner" wrote: > > > > > > > I think the users of oneline.py could be people that now write lots of > > > bash scripts and work on the command line.
So whenever someone asks a > > > question somewhere about how to do X on the linux command line, we > > > might have the answer: """ > > > > > > Q: On the linux commandline, how do I get only the filename from a > > > full path that is in $FILEPATH > > > > > > A: Python has this. You can use the tools in os.path: > > > > > > Filename: > > > $ oneline.py "os.path.basename('$FILEPATH')" > > > > > > Path to directory: > > > $ oneline.py "os.path.dirname('$FILEPATH')" > > > """ > > > > FILEPATH='for'"example');"'subprocess.call("cat /etc/passwd", shell=True)' > > sys.argv[1] (IFS=' ') > stdin (~IFS=$'\n') > > ... > > * https://github.com/westurner/dotfiles/blob/develop/scripts/el > > * https://github.com/westurner/pyline/blob/master/pyline/pyline.py (considering adding an argument (in addition to the existing -m) for importlib.import_module)) another thing worth mentioning is that `ls` prints '?' for certain characters in filenames (e.g. newlines $'\n') so, | pipes with ls and xargs are bad/wrong/unsafe: e.g. $ touch 'file'$'\n''name' $ ls 'file'* | xargs stat # ERR $ find . -maxdepth 1 -name 'file*' | xargs stat # ERR less unsafe (?): >>> [x for x in os.listdir('.') if x.startswith('file')] # ['file\nname'] $ find . -maxdepth 1 -name 'file*' -print0 | xargs -0 stat ... * "CWE-93: Improper Neutralization of CRLF Sequences ('CRLF Injection')" https://cwe.mitre.org/data/definitions/93.html * CWE-78: Improper Neutralization of Special Elements used in an OS Command ('OS Command Injection') https://cwe.mitre.org/data/definitions/78.html > > > > > > > > This might be more appealing than python -c. The whole point is to > > > make Python's power available and visible for a larger audience.
> > > > > > -Koos > > > _______________________________________________ > > > Python-ideas mailing list > > > Python-ideas at python.org > > > https://mail.python.org/mailman/listinfo/python-ideas > > > Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From joejev at gmail.com Mon Apr 18 15:33:22 2016 From: joejev at gmail.com (Joseph Jevnik) Date: Mon, 18 Apr 2016 15:33:22 -0400 Subject: [Python-ideas] pep 7 line break suggestion differs from pep 8 Message-ID: I saw that there was recently a change to pep 8 to suggest adding a line break before a binary operator. Pep 7 suggests the opposite: > When you break a long expression at a binary operator, the operator goes at the end of the previous line, e.g.: > if (type->tp_dictoffset != 0 && base->tp_dictoffset == 0 && > type->tp_dictoffset == b_size && > (size_t)t_size == b_size + sizeof(PyObject *)) > return 0; /* "Forgive" adding a __dict__ only */ I imagine that some of the reasons for making the change in pep 8 for readability reasons will also translate to C; maybe pep 7 should also be updated. -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Mon Apr 18 16:27:05 2016 From: rosuav at gmail.com (Chris Angelico) Date: Tue, 19 Apr 2016 06:27:05 +1000 Subject: [Python-ideas] pep 7 line break suggestion differs from pep 8 In-Reply-To: References: Message-ID: On Tue, Apr 19, 2016 at 5:33 AM, Joseph Jevnik wrote: > I saw that there was recently a change to pep 8 to suggest adding a line > break before a binary operator. 
Pep 7 suggests the opposite: > >> When you break a long expression at a binary operator, the operator goes >> at the end of the previous line, e.g.: > >> if (type->tp_dictoffset != 0 && base->tp_dictoffset == 0 && >> type->tp_dictoffset == b_size && >> (size_t)t_size == b_size + sizeof(PyObject *)) >> return 0; /* "Forgive" adding a __dict__ only */ > > I imagine that some of the reasons for making the change in pep 8 for > readability reasons will also > translate to C; maybe pep 7 should also be updated. I would agree with this. Passing it directly to python-dev as that's where the key decision makers are. ChrisA From guido at python.org Mon Apr 18 18:52:38 2016 From: guido at python.org (Guido van Rossum) Date: Mon, 18 Apr 2016 15:52:38 -0700 Subject: [Python-ideas] pep 7 line break suggestion differs from pep 8 In-Reply-To: References: Message-ID: [ideas to bcc] I'm not as excited about this as I am about the PEP 8 change. PEP 8 affects most Python programmers. But PEP 7 is really just for CPython and its extensions, and I don't think it has found anything like as widespread a following as PEP 8. I worry that if we change this in PEP 7 we'll just see either masses of inconsistent code or endless diffs that do nothing but change the formatting (and occasionally introduce a bug). And I don't think it would do as much good -- reading and understanding C code is primarily a matter of knowing the language, and the audience is much more heavily skewed towards experts. IOW, -1. On Mon, Apr 18, 2016 at 1:27 PM, Chris Angelico wrote: > On Tue, Apr 19, 2016 at 5:33 AM, Joseph Jevnik wrote: > > I saw that there was recently a change to pep 8 to suggest adding a line > > break before a binary operator.
Pep 7 suggests the opposite: > > > >> When you break a long expression at a binary operator, the operator goes > >> at the end of the previous line, e.g.: > > > >> if (type->tp_dictoffset != 0 && base->tp_dictoffset == 0 && > >> type->tp_dictoffset == b_size && > >> (size_t)t_size == b_size + sizeof(PyObject *)) > >> return 0; /* "Forgive" adding a __dict__ only */ > > > > I imagine that some of the reasons for making the change in pep 8 for > > readability reasons will also > > translate to C; maybe pep 7 should also be updated. > > I would agree with this. Passing it directly to python-dev as that's > where the key decision makers are. > > ChrisA > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From k7hoven at gmail.com Mon Apr 18 20:40:41 2016 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Tue, 19 Apr 2016 03:40:41 +0300 Subject: [Python-ideas] Type hinting for path-related functions Message-ID: I actually proposed this already in one of the pathlib threads on python-dev, but I decided to repost here, because this is easily seen as a separate issue. I'll start with some introduction, then moving on to the actual type hinting part. In our seemingly never-ending discussions about pathlib support in the stdlib in various threads, first here on python-ideas, then even more extensively on python-dev, have perhaps almost converged. The required changes involve a protocol method, probably named __fspath__, which any path-like type could implement to return a more, let's say, "classical" path object such as a str. 
However, the protocol is polymorphic and may also return bytes, which has a lot to do with the fact that the stdlib itself is polymorphic and currently accepts str as well as bytes paths almost everywhere, including the newly-introduced os.scandir + DirEntry combination. The upcoming improvements will further allow passing pathlib path objects as well as DirEntry objects to any stdlib function that takes paths. It came up, for instance here [1], that the function associated with the protocol, potentially named os.fspath, will end up needing type hints. This function takes pathlike objects and turns them into str or bytes. There are various different scenarios [2] that can be considered for code dealing with paths, but let's consider the case of os.path.* and other traditional python path-related functions. Some examples: os.path.join Currently, it takes str or bytes paths and returns a joined path of the same type (mixing different types raises an exception). In the future, it will also accept pathlib objects (underlying type always str) and DirEntry (underlying type str or bytes) or third-party path objects (underlying type str or bytes). The function will then return a pathname of the underlying type. os.path.dirname Currently, it takes a str or bytes and returns the dirname of the same type. In the future, it will also accept Path and DirEntry and return the underlying type. Let's consider the type hint of os.path.dirname at present and in the future: Currently, one could write def dirname(p: Union[str, bytes]) -> Union[str, bytes]: ... While this is valid, it could be more precise: pathstring = typing.TypeVar('pathstring', str, bytes) def dirname(p: pathstring) -> pathstring: ... This now contains the information that the return type is the same as the argument type.
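[Editor's note] A self-contained sketch of that constrained TypeVar, with a toy dirname (the real os.path.dirname handles more edge cases; this is only to show the annotation):

```python
from typing import TypeVar

# Constrained to exactly str or bytes; ties the return type to the
# argument type, which Union[str, bytes] cannot express.
pathstring = TypeVar('pathstring', str, bytes)

def dirname(p: pathstring) -> pathstring:
    """Toy dirname: everything before the last '/' (illustration only)."""
    sep = '/' if isinstance(p, str) else b'/'
    head, _, _ = p.rpartition(sep)
    return head

print(dirname('/usr/local/bin'))   # -> /usr/local
print(dirname(b'/usr/local/bin'))  # -> b'/usr/local'
```

A type checker now infers dirname('/a/b') as str and dirname(b'/a/b') as bytes, and rejects dirname(3) outright.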
The name 'pathstring' may be considered slightly misleading because "byte strings" are not actually strings in Python 3, but at least it does not advertise the use of bytes as paths, which is very rarely desirable. But what about the future. There are two kinds of rich path objects, those with an underlying type of str and those with an underlying type of bytes. These should implement the __fspath__() protocol and return their underlying type. However, we do care about what (underlying) type is provided by the protocol, so we might want to introduce something like typing.FSPath[underlying_type]: FSPath[str] # str-based pathlike, including str FSPath[bytes] # bytes-based pathlike, including bytes And now, using the above defined TypeVar pathstring, the future version of dirname would be type annotated as follows: def dirname(p: FSPath[pathstring]) -> pathstring: ... It's getting late. I hope this made sense :). -Koos [1] https://mail.python.org/pipermail/python-dev/2016-April/144246.html [2] https://mail.python.org/pipermail/python-dev/2016-April/144239.html From guido at python.org Mon Apr 18 21:27:00 2016 From: guido at python.org (Guido van Rossum) Date: Mon, 18 Apr 2016 18:27:00 -0700 Subject: [Python-ideas] Type hinting for path-related functions In-Reply-To: References: Message-ID: Your pathstring seems to be the same as the predefined (in typing.py, and PEP 484) AnyStr. You are indeed making sense, except that for various reasons the stdlib is not likely to adopt in-line signature annotations yet -- not even for new code. However once there's agreement on os.fspath() it can be added to the stubs in github.com/python/typeshed. Is there going to be a PEP for os.fspath()? (I muted most of the discussions so I'm not sure where it stands.) On Mon, Apr 18, 2016 at 5:40 PM, Koos Zevenhoven wrote: > I actually proposed this already in one of the pathlib threads on > python-dev, but I decided to repost here, because this is easily seen > as a separate issue. 
I'll start with some introduction, then moving on > to the actual type hinting part. > > In our seemingly never-ending discussions about pathlib support in the > stdlib in various threads, first here on python-ideas, then even more > extensively on python-dev, have perhaps almost converged. The required > changes involve a protocol method, probably named __fspath__, which > any path-like type could implement to return a more, let's say, > "classical" path object such as a str. However, the protocol is > polymorphic and may also return bytes, which has a lot do do with the > fact that the stdlib itself is polymophic and currently accepts str as > well as bytes paths almost everywhere, including the newly-introduced > os.scandir + DirEntry combination. The upcoming improvements will > further allow passing pathlib path objects as well as DirEntry objects > to any stdlib function that take paths. > > It came up, for instance here [1], that the function associated with > the protocol, potentially named os.fspath, will end up needing type > hints. This function takes pathlike objects and turns them into str or > bytes. There are various different scenarios [2] that can be > considered for code dealing with paths, but let's consider the case of > os.path.* and other traditional python path-related functions. > > Some examples: > > os.path.join > > Currently, it takes str or bytes paths and returns a joined path of > the same type (mixing different types raises an exception). > > In the future, it will also accept pathlib objects (underlying type > always str) and DirEntry (underlying type str or bytes) or third-party > path objects (underlying type str or bytes). The function will then > return a pathname of the underlying type. > > os.path.dirname > > Currently, it takes a str or bytes and returns the dirname of the same > type. > In the future, it will also accept Path and DirEntry and return the > underlying type. 
> > Let's consider the type hint of os.path.dirname at present and in the > future: > > Currently, one could write > > def dirname(p: Union[str, bytes]) -> Union[str, bytes]: > ... > > While this is valid, it could be more precise: > > pathstring = typing.TypeVar('pathstring', str, bytes) > > def dirname(p: pathstring) -> pathstring: > ... > > This now contains the information that the return type is the same as > the argument type. The name 'pathstring' may be considered slightly > misleading because "byte strings" are not actually strings in Python > 3, but at least it does not advertise the use of bytes as paths, which > is very rarely desirable. > > But what about the future. There are two kinds of rich path objects, > those with an underlying type of str and those with an underlying type > of bytes. These should implement the __fspath__() protocol and return > their underlying type. However, we do care about what (underlying) > type is provided by the protocol, so we might want to introduce > something like typing.FSPath[underlying_type]: > > FSPath[str] # str-based pathlike, including str > FSPath[bytes] # bytes-based pathlike, including bytes > > And now, using the above defined TypeVar pathstring, the future > version of dirname would be type annotated as follows: > > def dirname(p: FSPath[pathstring]) -> pathstring: > ... > > It's getting late. I hope this made sense :). > > -Koos > > [1] https://mail.python.org/pipermail/python-dev/2016-April/144246.html > [2] https://mail.python.org/pipermail/python-dev/2016-April/144239.html > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ethan at stoneleaf.us Mon Apr 18 22:06:40 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 18 Apr 2016 19:06:40 -0700 Subject: [Python-ideas] Type hinting for path-related functions In-Reply-To: References: Message-ID: <571592B0.4090403@stoneleaf.us> On 04/18/2016 06:27 PM, Guido van Rossum wrote: > Is there going to be a PEP for os.fspath()? (I muted most of the > discussions so I'm not sure where it stands.) We're nearing the end of the discussions. Brett Cannon and Chris Angelico will draw up an amendment to the pathlib PEP. -- ~Ethan~ From leewangzhong+python at gmail.com Tue Apr 19 00:52:39 2016 From: leewangzhong+python at gmail.com (Franklin? Lee) Date: Tue, 19 Apr 2016 00:52:39 -0400 Subject: [Python-ideas] Have REPL print less by default In-Reply-To: <85d1pmsqvv.fsf@benfinney.id.au> References: <85r3e3sndq.fsf@benfinney.id.au> <85lh4bsip5.fsf@benfinney.id.au> <85d1pmsqvv.fsf@benfinney.id.au> Message-ID: On Mon, Apr 18, 2016 at 11:51 AM, Ethan Furman wrote: > On 04/18/2016 08:41 AM, Franklin? Lee wrote: >> >> I don't want arguments like, "This can already be done, for yourself, if >> you really need it." I use IPython's shell, with maximum output height, >> and it was years ago that I used a terminal which I couldn't scroll. >> >> I want arguments like, "This will break my workflow." > > > How about: the default REPL is a basic tool, and the limitations of basic > tools are what drive folks to seek out advanced tools. ? The REPL is a basic tool for basic users, which is why it should "Do the right thing" for people who wouldn't know better. I'm asking whether this is the "right thing" for those basic users: Advanced users are the ones who can use more than basic info. > Or Marc-Andre's: >> I [...] don't think the defaults should be such that you have to >> enable some variable to see everything. 
This would make debugging >> harder than necessary, since often enough (following Murphy's law) >> the most interesting information would be hidden in some ellipsis. I think it's a high priority not to lose information that was hidden. In my proposal, I said that there should be a way to access it, right after you cause it, which is very different from needing to enable a variable. If not for (memory) performance anxiety, I would've proposed that the info would be stored indefinitely, or that you would always be able to access the last three (or something) outputs/stacktraces, because I was concerned about typos causing exceptions that would then push the interesting stacktrace out. (Hey, that's another possible proposal: maybe syntax/lookup errors on the "R" part of "REPL" shouldn't count toward `sys.last_traceback`. Has anyone here ever needed to debug or reprint syntax errors coming from compiling the command they just entered?) > Or even Nick's (that you snipped): >> >> Since the more the REPL does, the more opportunities there are for > >> it to break when debugging, having the output hooks be as simple as >> possible is quite desirable. I read this as, "It'll be more complicated, so it MIGHT be buggy." This seems solvable via design, and doesn't speak to whether it's good, in principle, to limit output per input. If you still think it's a good reason, then I probably didn't understand it correctly. On Mon, Apr 18, 2016 at 2:39 PM, Ben Finney wrote: > "Franklin? Lee" > writes: > >> On Apr 18, 2016 3:25 AM, "Nick Coghlan" wrote: >> > So folks are already free to change their REPL to work however they >> > want it to: set sys.displayhook and sys.excepthook from a >> > PYTHONSTARTUP file >> >> I don't want arguments like, "This can already be done, for yourself, >> if you really need it." > > That's your prerogative. > > This forum is for discussing whether ideas are right for changing > Python, though. 
Arguments such as "This can already be done, for > yourself, if you need it" are salient and sufficient, and not to be > dismissed. > >> I want arguments like, "This will break my workflow." > > You're free to solicit those arguments. You'll need to acknowledge, > though, that they are in addition to the quite sufficient argument of > "already possible for people to get this in Python as is, if they want > it". When they are presented as a solution to "my" problem, they are, to be short, irrelevant. They try to address the speaker's need for the feature on their own machine, when I am asking for opinions on both the usefulness and harmfulness of such a feature, and the principle behind the feature. They are distracting: people are talking more about whether I can do it "myself" than how it's bad to have this, or how it wouldn't help who I want to help. Even if they weren't trying to solve "my" problem, I had non-advanced users in mind, and these solutions tend to be about what advanced users can do. I felt the need to point out what I'm looking for in the arguments, because people are telling me how *I* can have this feature by, for example, writing a display hook, or plugging into pprint. From mike at selik.org Tue Apr 19 01:21:46 2016 From: mike at selik.org (Michael Selik) Date: Tue, 19 Apr 2016 05:21:46 +0000 Subject: [Python-ideas] random.choice on non-sequence In-Reply-To: References: <570D6B50.9070005@btinternet.com> <570E15C9.7050703@btinternet.com> Message-ID: On Wed, Apr 13, 2016 at 12:45 PM Guido van Rossum wrote: > On Wed, Apr 13, 2016 at 2:47 AM, Rob Cliffe > wrote: > > Isn't there an inconsistency that random.sample caters to a set by > > converting it to a tuple, but random.choice doesn't? > > Perhaps because the use cases are different? Over the years I've > learned that inconsistencies aren't always signs of sloppy thinking -- > they may actually point to deep issues that aren't apparently on the > surface.
> > I imagine the typical use case for sample() to be something that > samples the population once and then does something to the sample; the > next time sample() is called the population is probably different > (e.g. the next lottery has a different set of players). > > But I imagine a fairly common use case for choice() to be choosing > from the same population over and over, and that's exactly the case > where the copying implementation you're proposing would be a small > disaster. > Repeated random sampling from the same population is a common use case ("bootstrapping"). Perhaps the oversight was *allowing* sets as input into random.sample. -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.f.moore at gmail.com Tue Apr 19 04:09:25 2016 From: p.f.moore at gmail.com (Paul Moore) Date: Tue, 19 Apr 2016 09:09:25 +0100 Subject: [Python-ideas] Have REPL print less by default In-Reply-To: References: <85r3e3sndq.fsf@benfinney.id.au> <85lh4bsip5.fsf@benfinney.id.au> <85d1pmsqvv.fsf@benfinney.id.au> Message-ID: On 19 April 2016 at 05:52, Franklin? Lee wrote: >> How about: the default REPL is a basic tool, and the limitations of basic >> tools are what drive folks to seek out advanced tools. ? > > The REPL is a basic tool for basic users, which is why it should "Do > the right thing" for people who wouldn't know better. I'm asking > whether this is the "right thing" for those basic users: Advanced > users are the ones who can use more than basic info. > Basic users should probably be using a tool like IDLE, which has a bit more support for beginners than the raw REPL. I view the REPL as more important for intermediate or advanced users who want to quickly test out an idea (at least, that's *my* usage of the REPL). As I disagree with your statement "the REPL is a basic tool for basic users" I don't find your conclusions compelling, sorry. 
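[Editor's note] Returning briefly to the random.choice thread above: the asymmetry Rob asks about is easy to demonstrate, and making the tuple() conversion explicit (hoisted out of the loop) sidesteps the repeated-copying cost Guido mentions:

```python
import random

population = {'spam', 'eggs', 'ham'}

# random.choice() indexes its argument, so a set fails outright:
try:
    random.choice(population)
except TypeError:
    print("choice: sets are not indexable")

# Converting once, outside the loop, keeps the copying cost visible
# and pays it only once when choosing from the same population repeatedly:
seq = tuple(population)
picks = [random.choice(seq) for _ in range(5)]
print(picks)
```

(random.sample's historical acceptance of sets did this conversion internally on every call, which is the behaviour under discussion here.)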
Paul From k7hoven at gmail.com Tue Apr 19 05:35:18 2016 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Tue, 19 Apr 2016 12:35:18 +0300 Subject: [Python-ideas] Have REPL print less by default In-Reply-To: References: <85r3e3sndq.fsf@benfinney.id.au> <85lh4bsip5.fsf@benfinney.id.au> <85d1pmsqvv.fsf@benfinney.id.au> Message-ID: On Tue, Apr 19, 2016 at 11:09 AM, Paul Moore wrote: > On 19 April 2016 at 05:52, Franklin? Lee wrote: >>> How about: the default REPL is a basic tool, and the limitations of basic >>> tools are what drive folks to seek out advanced tools. ? >> >> The REPL is a basic tool for basic users, which is why it should "Do >> the right thing" for people who wouldn't know better. I'm asking >> whether this is the "right thing" for those basic users: Advanced >> users are the ones who can use more than basic info. >> > > Basic users should probably be using a tool like IDLE, which has a bit > more support for beginners than the raw REPL. I view the REPL as more > important for intermediate or advanced users who want to quickly test > out an idea (at least, that's *my* usage of the REPL). > I was once a basic user, but I still have no idea what "IDLE" is. Does it come with python? I have tried $ idle $ python -m idle $ python -m IDLE $ python --idle To be honest, I do remember seeing a shortcut to IDLE in one of my Windows python installations, and I've seen it come up in discussions. However, it does not seem to me that IDLE is something that beginners would know to turn to. I do use IPython. IPython is nice---too bad it starts up slowly and is not recommended by default. -- Koos From leewangzhong+python at gmail.com Tue Apr 19 05:36:14 2016 From: leewangzhong+python at gmail.com (Franklin? 
Lee) Date: Tue, 19 Apr 2016 05:36:14 -0400 Subject: [Python-ideas] Have REPL print less by default In-Reply-To: References: <85r3e3sndq.fsf@benfinney.id.au> <85lh4bsip5.fsf@benfinney.id.au> <85d1pmsqvv.fsf@benfinney.id.au> Message-ID: On Apr 19, 2016 4:09 AM, "Paul Moore" wrote: > > On 19 April 2016 at 05:52, Franklin? Lee wrote: > >> How about: the default REPL is a basic tool, and the limitations of basic > >> tools are what drive folks to seek out advanced tools. ? > > > > The REPL is a basic tool for basic users, which is why it should "Do > > the right thing" for people who wouldn't know better. I'm asking > > whether this is the "right thing" for those basic users: Advanced > > users are the ones who can use more than basic info. > > > > Basic users should probably be using a tool like IDLE, which has a bit > more support for beginners than the raw REPL. You say "should"? Do you mean that it is likely, or do you mean that it is what would happen in an ideal world? My college had CS students SSH into the department's Linux server to compile and run their code, and many teachers don't believe that students should start with fancy IDE features like, er, syntax highlighting. You (and most regulars on this list) can adjust your shell to the way you like it, or use a more sophisticated shell, like IPython or bpython. On the other hand, changing shells and adding display hooks to site.py is not an option for those who don't know it's an option. > I view the REPL as more > important for intermediate or advanced users who want to quickly test > out an idea (at least, that's *my* usage of the REPL). But that doesn't answer my question: would the proposed change hurt your workflow? -------------- next part -------------- An HTML attachment was scrubbed...
URL: From p.f.moore at gmail.com Tue Apr 19 05:48:54 2016 From: p.f.moore at gmail.com (Paul Moore) Date: Tue, 19 Apr 2016 10:48:54 +0100 Subject: [Python-ideas] Have REPL print less by default In-Reply-To: References: <85r3e3sndq.fsf@benfinney.id.au> <85lh4bsip5.fsf@benfinney.id.au> <85d1pmsqvv.fsf@benfinney.id.au> Message-ID: On 19 April 2016 at 10:36, Franklin? Lee wrote: > On Apr 19, 2016 4:09 AM, "Paul Moore" wrote: >> >> On 19 April 2016 at 05:52, Franklin? Lee >> wrote: >> >> How about: the default REPL is a basic tool, and the limitations of >> >> basic >> >> tools are what drive folks to seek out advanced tools. ? >> > >> > The REPL is a basic tool for basic users, which is why it should "Do >> > the right thing" for people who wouldn't know better. I'm asking >> > whether this is the "right thing" for those basic users: Advanced >> > users are the ones who can use more than basic info. >> > >> >> Basic users should probably be using a tool like IDLE, which has a bit >> more support for beginners than the raw REPL. > > You say "should"? Do you mean that it is likely, or do you mean that it is > what would happen in an ideal world? My college had CS students SSH into the > department's Linux server to compile and run their code, and many teachers > don't believe that students should start with fancy IDE features like, er, > syntax highlighting. I mean that that is what I hear people on lists like this saying as "a reasonable beginner environment". It means that the people I've introduced to Python (on Windows) have tended to end up using IDLE (either from the start menu or via "edit with IDLE" from the right click menu on a script). My experience (in business environments) is that people expect an IDE when introduced to a new programming language, and IDLE, like it or not, is what is available out of the box with Python.
> You (and most regulars on this list) can adjust your shell to the way you > like it, or use a more sophisticated shell, like IPython or bpython. On the > other hand, changing shells and adding display hooks to site.py is not an > option for those who don't know it's an option. As a "scripting expert" and consultant, I typically get asked to knock up scripts on a variety of environments. I do not normally have what I'd describe as "my shell", just a series of basic "out of the box" prompts people expect me to work at. So no, the luxury of configuring the default experience is *not* something I typically have. >> I view the REPL as more >> important for intermediate or advanced users who want to quickly test >> out an idea (at least, that's *my* usage of the REPL). > > But that doesn't answer my question: would the proposed change hurt your > workflow? Yes. If I get a stack trace, I want it all. And if I print something out, I want to see it by default. The REPL for me is an investigative environment for seeing exactly what's going on. (Also, having the REPL behave differently depending on what version of Python I have would be a problem - backward compatibility applies here as much as anywhere else). Paul From contrebasse at gmail.com Tue Apr 19 05:49:44 2016 From: contrebasse at gmail.com (Joseph Martinot-Lagarde) Date: Tue, 19 Apr 2016 09:49:44 +0000 (UTC) Subject: [Python-ideas] Anonymous namedtuples Message-ID: Hi, list ! namedtuples are really great. I would like to use them even more, for example for functions that return multiple values. The problem is that namedtuples have to be "declared" beforehand, so it would be quite tedious to declare a namedtuple per function, which is why I very rarely do it. Another thing I don't like about namedtuples is the duplication of the name. Typical declarations look like `Point = namedtuple('Point', ['x', 'y'])`, where `Point` is repeated twice.
I'll go one step further and say that the name is useless most of the time, so let's just get rid of it.

Proposal
========

So I thought about a new (ok, maybe it has been proposed before but I couldn't find it) syntax for anonymous namedtuples (I put the prints as comments, otherwise gmane is complaining about top-posting):

my_point = (x=12, y=16)
# (x=12, y=16)
my_point[0]
# 12
my_point.y
# 16
type(my_point)
#

It's just a tuple, but with names. Parentheses would be mandatory because `my_point = x = 12, y = 16` wouldn't work. Single-element anonymous namedtuples would require a trailing comma, similarly to tuples. I'd be happy to make a factory function for my personal use, but the order of kwargs is not respected. To have an elegant syntax, it has to be a construct of the language. The created objects would all be of the same class (with a better name, of course). As it adds no keyword to the language, it would not break compatibility. The created objects could support some namedtuple methods: _asdict, _replace and _fields. The two other methods, _make and _source, would not apply. Performance-wise, I guess that they would be slower than tuples and namedtuples, but to me the additional usability trumps the performance hit. If you have to care about performance, you can still use a namedtuple or bare tuples.

Even further...
===============

This initial idea is useful in itself, but it could be extended even further. I've thought about a possible evolution. The idea is that not all values have to be named: similarly to how args and kwargs work for functions, there could be non-named values, with the same limitations as the arguments (unnamed first, named afterwards). All values could be retrieved by indexing, and named values could also be retrieved by their attribute name. This way they could be used in __getitem__ as proposed in PEP 472[0], so that __getitem__ supports additional keyword arguments.
It would correspond to Strategy "named tuple", with some of the cons removed: "The namedtuple fields, and thus the type, will have to change according to the passed arguments. This can be a performance bottleneck, and makes it impossible to guarantee that two subsequent index accesses get the same Index class;" That would still be true for the performance bottleneck, but since the class would always be the same, the second problem disappears. To minimize the performance hit, a standard tuple would be passed to __getitem__ if there is no keyword argument, and an anonymous namedtuple would be passed if there is a keyword. "the _n "magic" fields are a bit unusual, but ipython already uses them for result history." Those wouldn't be needed if both named and unnamed values are allowed. "Differently from a function, the two notations gridValues[x=3, y=5, z=8] and gridValues[3,5,8] would not gracefully match if the order is modified at call time (e.g. we ask for gridValues[y=5, z=8, x=3]). In a function, we can pre-define argument names so that keyword arguments are properly matched. Not so in __getitem__, leaving the task for interpreting and matching to __getitem__ itself." Indexing already doesn't behave like a function call. Keeping the argument order (like this proposal would imply) is a special case of not keeping it, while the reverse is not true, so the more general solution would be implemented. Finally, I admit that I have no idea how to implement this; I love Python but have never looked at its internals. Maybe my proposal is too naive, I really hope not.
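(A factory-function version of the basic proposal can be sketched today with the stdlib; it assumes an interpreter where **kwargs preserves call-site order, which CPython 3.6+ guarantees. The names `anon_namedtuple` and `_cls_cache` below are illustrative, not part of the proposal.)

```python
from collections import namedtuple

_cls_cache = {}

def anon_namedtuple(**fields):
    """Build a tuple-with-names without declaring a class first.

    Relies on **fields preserving call-site order; on interpreters
    older than Python 3.6 the field order may be scrambled.
    """
    key = tuple(fields)
    cls = _cls_cache.get(key)
    if cls is None:
        # One class per field-name tuple, so equally-shaped results
        # share a type and repeated calls stay cheap.
        cls = _cls_cache[key] = namedtuple('anon', key)
    return cls(**fields)

my_point = anon_namedtuple(x=12, y=16)
print(my_point[0])  # 12
print(my_point.y)   # 16
```

The result is a real namedtuple, so `_asdict`, `_replace` and `_fields` come for free.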
Thanks for your attention, Joseph [0] https://www.python.org/dev/peps/pep-0472/ From rosuav at gmail.com Tue Apr 19 05:56:22 2016 From: rosuav at gmail.com (Chris Angelico) Date: Tue, 19 Apr 2016 19:56:22 +1000 Subject: [Python-ideas] Have REPL print less by default In-Reply-To: References: <85r3e3sndq.fsf@benfinney.id.au> <85lh4bsip5.fsf@benfinney.id.au> <85d1pmsqvv.fsf@benfinney.id.au> Message-ID: On Tue, Apr 19, 2016 at 7:35 PM, Koos Zevenhoven wrote: > I was once a basic user, but I still have no idea what "IDLE" is. Does > it come with python? > > I have tried > > $ idle > $ python -m idle > $ python -m IDLE > $ python --idle The first one will often work, but it depends on exactly how your Python installation has been set up. (Also, not all Linux distros come with Idle by default; you may have to install a python-idle package.) The most reliable way to invoke Idle is: $ python3 -m idlelib.idle (or python2 if you want 2.7). It's more verbose than would be ideal, and maybe it'd help to add one of your other attempts as an alias, but normally Idle will be made available in your GUI, which is where a lot of people will look for it. ChrisA From contrebasse at gmail.com Tue Apr 19 06:01:49 2016 From: contrebasse at gmail.com (Joseph Martinot-Lagarde) Date: Tue, 19 Apr 2016 10:01:49 +0000 (UTC) Subject: [Python-ideas] Have REPL print less by default References: Message-ID: Franklin? Lee writes: > 2. Only print a few parts of the stack trace. In particular, for a recursive or mutually recursive function, if the error was due to maximum recursion (is this reasonably distinguishable? the error is `RuntimeError('maximum recursion depth exceeded')`), try to print each function on the stack once each. > - If a function appears more than once in a row, show it once, with the note, "(and X recursive calls)". 
> - If functions otherwise appear more than once (usually by mutual recursion?), and there is a run of them, list them as, "(Mutual recursion: 'x' (5 times), 'y' (144 times), 'z' (13 times).)". I don't know about the other ideas, but simplifying the output of recursion errors would be very useful. 990 repeats of the exact same lines add no value at all, whether the user is an expert or a beginner. Something like this is very clear in my mind:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/user/script.py", line 10, in f
    f()
  File "/home/user/script.py", line 10, in f
    f()
  ...
  File "/home/user/script.py", line 10, in f (994 repeats)
    f()
  ...
  File "/home/user/script.py", line 10, in f
    f()
  File "/home/user/script.py", line 10, in f
    f()
RecursionError: maximum recursion depth exceeded

From p.f.moore at gmail.com Tue Apr 19 06:03:35 2016 From: p.f.moore at gmail.com (Paul Moore) Date: Tue, 19 Apr 2016 11:03:35 +0100 Subject: [Python-ideas] Have REPL print less by default In-Reply-To: References: <85r3e3sndq.fsf@benfinney.id.au> <85lh4bsip5.fsf@benfinney.id.au> <85d1pmsqvv.fsf@benfinney.id.au> Message-ID: On 19 April 2016 at 10:35, Koos Zevenhoven wrote: > I was once a basic user, but I still have no idea what "IDLE" is. Does > it come with python? > > I have tried > > $ idle > $ python -m idle > $ python -m IDLE > $ python --idle > > To be honest, I do remember seeing a shortcut to IDLE in one of my > Windows python installations, and I've seen it come up in discussions. > However, it does not seem to me that IDLE is something that beginners > would know to turn to. I do use IPython. IPython is nice---too bad it > starts up slowly and is not recommended by default. On Windows, it's set up as a shortcut in the start menu alongside Python, with a tooltip "Launches IDLE, the interactive environment for Python 3.5". And Python scripts have an "Edit with IDLE" option on the right click menu.
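(The collapsed format Joseph sketches above can be prototyped without touching the interpreter, by installing a custom sys.excepthook. The sketch below is illustrative: the function name and the repeat threshold are invented, and it folds only runs of identical frames, i.e. direct recursion from a single call site.)

```python
import sys
import traceback

def collapsing_excepthook(exc_type, exc, tb, threshold=3):
    """Print a traceback, folding long runs of identical frames.

    FrameSummary compares equal on (filename, lineno, name, locals),
    so direct recursion from one call site collapses into a single
    entry plus a repeat count.
    """
    frames = traceback.extract_tb(tb)
    print('Traceback (most recent call last):', file=sys.stderr)
    i = 0
    while i < len(frames):
        # Measure the run of frames identical to frames[i].
        run = 1
        while i + run < len(frames) and frames[i + run] == frames[i]:
            run += 1
        sys.stderr.write(''.join(traceback.format_list([frames[i]])))
        if run > threshold:
            print('  (and %d more repeats)' % (run - 1), file=sys.stderr)
            i += run
        else:
            i += 1
    for line in traceback.format_exception_only(exc_type, exc):
        sys.stderr.write(line)

sys.excepthook = collapsing_excepthook  # applies to uncaught exceptions
```

Mutual recursion would need a fancier cycle detector, but this covers the common `RecursionError` case.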
So it's where most Windows users would naturally look when searching for "the Python GUI". Discoverability of IDLE on other platforms isn't something I know much about, sorry. (I should also point out that I'm a relentless command line user on Windows, so my comments above about Windows are based on my experience watching what my "normal Windows user" colleagues do, not on my personal habits...) Paul From rosuav at gmail.com Tue Apr 19 06:06:17 2016 From: rosuav at gmail.com (Chris Angelico) Date: Tue, 19 Apr 2016 20:06:17 +1000 Subject: [Python-ideas] Anonymous namedtuples In-Reply-To: References: Message-ID: On Tue, Apr 19, 2016 at 7:49 PM, Joseph Martinot-Lagarde wrote: > I'd be happy to make a factory function for my personal use, but the order > of kwargs is not respected. To have an elegant syntax, it has to be a > construct of the language. There have been proposals to have kwargs retain some order (either by having it actually be an OrderedDict, or by changing the native dict type to retain order under fairly restricted circumstances). With that, you could craft the factory function easily. It may be worth dusting off one of those proposals and seeing if it can move forward. That said, though: I don't often feel the yearning for quick namedtuples. A function that today returns a namedtuple of 'x' and 'y' can't in the future grow a 'z' without breaking any callers that unpack "x, y = func()", so extensible return objects have to forego unpacking (eg using a dict or SimpleNamespace). So you still have a sharp division between "tuple" and "thing with arbitrary names", and a namedtuple has to be soundly in the first camp. There *is* a use-case for this, but it's fairly narrow - you have to have enough names that you don't want to just unpack them everywhere, yet still have an order, AND the set of names has to be constant. 
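(Chris's unpacking point is easy to see in code. In this sketch, which uses invented names, a two-field result grows a third field: every tuple-unpacking caller breaks, while attribute access keeps working.)

```python
from collections import namedtuple

Point2 = namedtuple('Point2', ['x', 'y'])
Point3 = namedtuple('Point3', ['x', 'y', 'z'])  # the "grown" version

def func():
    return Point2(1, 2)

def func_grown():
    return Point3(1, 2, 3)

x, y = func()              # fine today
try:
    x, y = func_grown()    # every unpacking call site breaks on the new field
except ValueError as e:
    print(e)               # too many values to unpack (expected 2)

assert func_grown().y == 2  # attribute access survives the change
```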
I suspect the correct solution here is to make it easier for people to write their own namedtuple factories, rather than granting them syntactic support. ChrisA From leewangzhong+python at gmail.com Tue Apr 19 06:10:29 2016 From: leewangzhong+python at gmail.com (Franklin? Lee) Date: Tue, 19 Apr 2016 06:10:29 -0400 Subject: [Python-ideas] Have REPL print less by default In-Reply-To: References: <85r3e3sndq.fsf@benfinney.id.au> <85lh4bsip5.fsf@benfinney.id.au> <85d1pmsqvv.fsf@benfinney.id.au> Message-ID: On Apr 19, 2016 5:48 AM, "Paul Moore" wrote: > > You say "should"? Do you mean that it is likely, or do you mean that it is > > what would happen in an ideal world? My college had CS students SSH into the > > department's Linux server to compile and run their code, and many teachers > > don't believe that students should start with fancy IDE features like, er, > > syntax highlighting. > > I mean that that is what I hear people on lists like this saying as "a > reasonable beginner environment". It means that the people I've > introduced to Python (on Windows) have tended to end up using IDLE > (either from the start menu or via "edit with IDLE" from the right > click menu on a script). But are they perhaps tending to use IDLE because you were the one introducing them to it? That it is recommended by the consensus of mailing lists is hardly evidence that IDLE is what people will actually start on. > My experience (in business environments) is that people expect an IDE > when introduced to a new programming language, and IDLE, like it or > not, is what is available out of the box with Python. They expect it => they know what an IDE is. Programmers are already half-intermediate, especially programmers who use IDEs. > > You (and most regulars on this list) can adjust your shell to the way you > > like it, or use a more sophisticated shell, like IPython or bpython.
On the > > other hand, changing shells and adding display hooks to site.py is not an > > option for those who don't know it's an option. > > As a "scripting expert" and consultant, I typically get asked to knock > up scripts on a variety of environments. I do not normally have what > I'd describe as "my shell", just a series of basic "out of the box" > prompts people expect me to work at. So no, the luxury of configuring > the default experience is *not* something I typically have. I suggested a command-line switch. > >> I view the REPL as more > >> important for intermediate or advanced users who want to quickly test > >> out an idea (at least, that's *my* usage of the REPL). > > > > But that doesn't answer my question: would the proposed change hurt your > > workflow? > > Yes. If I get a stack trace, I want it all. And if I print something > out, I want to see it by default. The REPL for me is an investigative > environment for seeing exactly what's going on. And why is it an insufficient option for you to type, for example, "_exfull()" to get the full trace? > (Also, having the REPL behave differently depending on what version of > Python I have would be a problem - backward compatibility applies here > as much as anywhere else). This is a concern, but not one that is enough, by itself, to justify the status quo. I disagree that it applies *anywhere near as much* here, since I don't see how it could break existing code. It will add a little adjustment and a little more work sometimes, but it doesn't require, say, refactoring a module, or every site page you ever downloaded. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From __peter__ at web.de Tue Apr 19 06:13:28 2016 From: __peter__ at web.de (Peter Otten) Date: Tue, 19 Apr 2016 12:13:28 +0200 Subject: [Python-ideas] Anonymous namedtuples References: Message-ID: Joseph Martinot-Lagarde wrote: > Proposal > ======== > > So I thought about a new (ok, maybe it has been proposed before but I > couldn't find it) syntax for anonymous namedtuples (I put the prints as > comments, otherwise gmane is complaining about top-posting): > > my_point = (x=12, y=16) > # (x=12, y=16) > my_point[0] > # 12 > my_point.y > # 16 > type(my_point) > # > > It's just a tuple, but with names. Parentheses would be mandatory because > `my_point = x = 12, y = 16` wouldn't work. Single-element anonymous > namedtuples would require a trailing comma, similarly to tuples. That would be a great addition! The parens could be optional for return values: return x=1, y=2 From p.f.moore at gmail.com Tue Apr 19 06:25:07 2016 From: p.f.moore at gmail.com (Paul Moore) Date: Tue, 19 Apr 2016 11:25:07 +0100 Subject: [Python-ideas] Anonymous namedtuples In-Reply-To: References: Message-ID: On 19 April 2016 at 11:06, Chris Angelico wrote: > That said, though: I don't often feel the yearning for quick > namedtuples. A function that today returns a namedtuple of 'x' and 'y' > can't in the future grow a 'z' without breaking any callers that > unpack "x, y = func()", so extensible return objects have to forego > unpacking (eg using a dict or SimpleNamespace). Possibly the docs for namedtuple should refer the user to SimpleNamespace as an alternative if "being a tuple subclass" isn't an important requirement. Or possibly SimpleNamespace should be in the collections module (alongside namedtuple) rather than in the types module? The issue may simply be that SimpleNamespace isn't as discoverable as it should be?
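(For readers who have not met it, types.SimpleNamespace, available since Python 3.3, already covers the "attribute access without declaring a class" part of the use case, at the cost of all tuple behaviour:)

```python
from types import SimpleNamespace

my_point = SimpleNamespace(x=12, y=16)
print(my_point.y)   # 16
my_point.z = 20     # mutable; the field set can grow after creation

# Unlike a namedtuple it is not ordered, indexable, iterable, or hashable,
# so `x, y = my_point` and `my_point[0]` both raise TypeError.
```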
Paul From rosuav at gmail.com Tue Apr 19 06:37:08 2016 From: rosuav at gmail.com (Chris Angelico) Date: Tue, 19 Apr 2016 20:37:08 +1000 Subject: [Python-ideas] Have REPL print less by default In-Reply-To: References: <85r3e3sndq.fsf@benfinney.id.au> <85lh4bsip5.fsf@benfinney.id.au> <85d1pmsqvv.fsf@benfinney.id.au> Message-ID: On Tue, Apr 19, 2016 at 8:10 PM, Franklin? Lee wrote: > This is a concern, but not one that is enough, by itself, to justify the > status quo. I disagree that it applies *anywhere near as much* here, since I > don't see how it could break existing code. It will add a little adjustment > and a little more work sometimes, but it doesn't require, say, refactoring a > module, or every site page you ever downloaded. > > You have things slightly backwards. The status quo doesn't have to justify itself; a change does. Particularly when there's ANY backward incompatibility being brought in, the change has to be of material benefit, or it isn't going to happen. Take, for example, this bug report: http://bugs.python.org/issue18018 Notice that: * The error is clearly *incorrect*. The interpreter should not be raising SystemError in valid situations. * The old behaviour (raising ValueError) is also incorrect, but Brett offered it as a serious possibility. * Only the latest CPython was changed (the 'default' branch). It wasn't correspondingly fixed in 3.4 or 3.5. * Nobody ever even *considered* changing 2.7's behaviour. Backward compatibility CAN be broken, but it's a big deal. Anything that can be done simply by changing a few local options is preferable to a core language change. Remember, anyone can redistribute Python with a modified site.py or other init script, or create a shortcut icon or shell script that invokes "python3 -i local_customizations.py" rather than simply running "python3" - and that script can do whatever you want it to, including changing sys.displayhook and sys.excepthook. 
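(To make the local-customization route Chris describes concrete: a startup file along these lines, run with `python3 -i`, swaps in an abbreviating display hook. The file name and function name are hypothetical; `reprlib` is the stdlib module that provides truncated reprs.)

```python
# local_customizations.py -- run with:  python3 -i local_customizations.py
import builtins
import reprlib
import sys

def short_displayhook(value):
    """Like the default REPL echo, but truncates long reprs."""
    if value is None:
        return
    builtins._ = value            # keep the REPL's "last result" convention
    print(reprlib.repr(value))    # e.g. long lists end with ', ...]'

sys.displayhook = short_displayhook
```

A similar assignment to `sys.excepthook` could shorten tracebacks, all without touching the interpreter or site.py.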
ChrisA From k7hoven at gmail.com Tue Apr 19 06:41:45 2016 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Tue, 19 Apr 2016 13:41:45 +0300 Subject: [Python-ideas] Type hinting for path-related functions In-Reply-To: References: Message-ID: On Tue, Apr 19, 2016 at 4:27 AM, Guido van Rossum wrote: > Your pathstring seems to be the same as the predefined (in typing.py, and > PEP 484) AnyStr. Oh, there too! :) I thought I would need a TypeVar, so I turned to help(typing.TypeVar) to look up how to do that, and there it was, right in front of me, just with a different name 'A': A = TypeVar('A', str, bytes) Anyway, it might make sense to consider defining 'pathstring' (or 'PathStr' for consistency?), even if it would be the same as AnyStr. Then, hypothetically, if at any point in the far future, bytes paths would be deprecated, it could be considered to make PathStr just str. After all, we don't want just Any String, we want something that represents a path (in a documentation sense). > You are indeed making sense, except that for various reasons the stdlib is > not likely to adopt in-line signature annotations yet -- not even for new > code. > > However once there's agreement on os.fspath() it can be added to the stubs > in github.com/python/typeshed. > I see, and I did have that impression already about the stdlib and type hints, probably based on some of your writings. My intention was to write these in the stub format, but apparently I need to look up the stub syntax once more. > Is there going to be a PEP for os.fspath()? (I muted most of the discussions > so I'm not sure where it stands.) It has not seemed like a good idea to discuss this (too?), but now that you ask, I have been wondering how optimal it is to add this to the pathlib PEP.
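(For concreteness, this is how the constrained TypeVar mentioned above behaves in a signature. The helper below is an invented example, not an API from the thread: because AnyStr is constrained to str and bytes, a checker infers str -> str and bytes -> bytes, never a mix.)

```python
import os
from typing import AnyStr  # defined in typing as TypeVar('AnyStr', str, bytes)

def ensure_trailing_sep(path: AnyStr) -> AnyStr:
    """Return `path` with exactly one trailing os.sep.

    The return type tracks the argument type for a static checker:
    passing str yields str, passing bytes yields bytes.
    """
    sep = os.sep if isinstance(path, str) else os.sep.encode('ascii')
    return path.rstrip(sep) + sep
```

On POSIX, `ensure_trailing_sep('/tmp/demo')` gives `'/tmp/demo/'`, and the bytes form gives the bytes equivalent.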
While the changes do affect pathlib (even the code of the module itself), this will affect ntpath, posixpath, os.scandir, os.[other stuff], DirEntry (tempted to say os.DirEntry, but that is not true), shutil.[stuff], (io.)open, and potentially all kinds of random places in the stdlib, such as fileinput, filecmp, zipfile, tarfile, tempfile (for the 'dir' keyword arguments), maybe even glob, and fnmatch, to name a few :). And now, if the FSPath[underlying_type] I just proposed ends up being added to typing (by whatever name), this will even affect typing.py. -Koos From rosuav at gmail.com Tue Apr 19 06:46:02 2016 From: rosuav at gmail.com (Chris Angelico) Date: Tue, 19 Apr 2016 20:46:02 +1000 Subject: [Python-ideas] Anonymous namedtuples In-Reply-To: References: Message-ID: On Tue, Apr 19, 2016 at 8:25 PM, Paul Moore wrote: > On 19 April 2016 at 11:06, Chris Angelico wrote: >> That said, though: I don't often feel the yearning for quick >> namedtuples. A function that today returns a namedtuple of 'x' and 'y' >> can't in the future grow a 'z' without breaking any callers that >> unpack "x, y = func()", so extensible return objects have to forego >> unpacking (eg using a dict or SimpleNamespace). > > Possibly the docs for namedtuple should refer the user to > SimpleNamespace as an alternative if "being a tuple subclass" isn't an > important requirement. Or possibly SimpleNamespace should be in the > collections module (alongside namedtuple) rather than in the types > module? > > The issue may simply be that SimpleNamespace isn't as discoverable as > it should be? That's entirely possible. For the situations where order is insignificant and unpacking is unnecessary, a SimpleNamespace would be perfect, if people knew about it. Probably not worth moving the actual class, but a quick little docs link might help. ChrisA From leewangzhong+python at gmail.com Tue Apr 19 07:10:45 2016 From: leewangzhong+python at gmail.com (Franklin? 
Lee) Date: Tue, 19 Apr 2016 07:10:45 -0400 Subject: [Python-ideas] Have REPL print less by default In-Reply-To: References: <85r3e3sndq.fsf@benfinney.id.au> <85lh4bsip5.fsf@benfinney.id.au> <85d1pmsqvv.fsf@benfinney.id.au> Message-ID: On Apr 19, 2016 6:37 AM, "Chris Angelico" wrote: > > On Tue, Apr 19, 2016 at 8:10 PM, Franklin? Lee > wrote: > > This is a concern, but not one that is enough, by itself, to justify the > > status quo. I disagree that it applies *anywhere near as much* here, since I > > don't see how it could break existing code. It will add a little adjustment > > and a little more work sometimes, but it doesn't require, say, refactoring a > > module, or every site page you ever downloaded. > > > > > > You have things slightly backwards. The status quo doesn't have to > justify itself; a change does. Particularly when there's ANY backward > incompatibility being brought in, the change has to be of material > benefit, or it isn't going to happen. No, I meant exactly what I said. He made it as an argument for the status quo, and I said that, alone, it is a weak argument. I don't mean anything beyond that. I posted this to find out potential issues and to see whether less output (by default) is a decent goal to have. I am pushing for better counterarguments. I am not suggesting that change should be the default position. The justification is already on the table: it will help people who are still developing their skills and knowledge from getting too much non-information. I expected arguments like, "When I was a newbie using the REPL, this would not have helped me." Hardly anyone argued against it, but just because it isn't discussed doesn't mean it isn't there. > Take, for example, this bug > report: > > http://bugs.python.org/issue18018 > > Notice that: > > * The error is clearly *incorrect*. The interpreter should not be > raising SystemError in valid situations. 
> * The old behaviour (raising ValueError) is also incorrect, but Brett > offered it as a serious possibility. > * Only the latest CPython was changed (the 'default' branch). It > wasn't correspondingly fixed in 3.4 or 3.5. > * Nobody ever even *considered* changing 2.7's behaviour. > Backward compatibility CAN be broken, but it's a big deal. Again, this is very different from the usual change. The incompatibility that would be introduced is purely in the mind of the programmer: -> "Oh, I am not getting the full output. But wait, there is a message here." <- "Wow, that is a lot of output. I suppose I will have to deal with it, since it is already here." The change in the bug report affects existing code (like code that tries to catch SystemError) and googlability (existing resources). This change only affects existing understanding, which is exactly what you are able to change when you are sitting at a REPL. That is a huge difference with respect to compatibility. > Anything > that can be done simply by changing a few local options is preferable > to a core language change. Remember, anyone can redistribute Python > with a modified site.py or other init script, or create a shortcut > icon or shell script that invokes "python3 -i local_customizations.py" > rather than simply running "python3" - and that script can do whatever > you want it to, including changing sys.displayhook and sys.excepthook. Again, these are things that the ones who are supposed to benefit from the proposal are least likely to have available to them. Adding more distributions to the ecosystem just makes it that much worse for them. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From contrebasse at gmail.com Tue Apr 19 07:31:51 2016 From: contrebasse at gmail.com (Joseph Martinot-Lagarde) Date: Tue, 19 Apr 2016 11:31:51 +0000 (UTC) Subject: [Python-ideas] Anonymous namedtuples References: Message-ID: Chris Angelico writes: > > On Tue, Apr 19, 2016 at 7:49 PM, Joseph Martinot-Lagarde > wrote: > > I'd be happy to make a factory function for my personal use, but the order > > of kwargs is not respected. To have an elegant syntax, it has to be a > > construct of the language. > > There have been proposals to have kwargs retain some order (either by > having it actually be an OrderedDict, or by changing the native dict > type to retain order under fairly restricted circumstances). With > that, you could craft the factory function easily. I saw some proposals but forgot about them, thanks for the reminder. I'd prefer my proposal for two reasons: - the syntax is nicer than a factory function - it wouldn't be possible to add keyword arguments to __getitem__ without breaking compatibility Those issues are relatively minor; I'd be happy with a factory function. > That said, though: I don't often feel the yearning for quick > namedtuples. A function that today returns a namedtuple of 'x' and 'y' > can't in the future grow a 'z' without breaking any callers that > unpack "x, y = func()", so extensible return objects have to forego > unpacking (eg using a dict or SimpleNamespace). My intention wasn't to allow extensible return objects, just easier access to the return values. > There *is* a use-case > for this, but it's fairly narrow - you have to have enough names that > you don't want to just unpack them everywhere, yet still have an > order, AND the set of names has to be constant. It's true that python already has some commodities in place.
When I first heard about namedtuples I was pretty excited, but when I tried them in practice they were not that easy to use, with lots of duplication (the name, and possibly the field names) and a global declaration. For all my use cases (not only function returns) I would completely replace namedtuple with anonymous namedtuples. From mistersheik at gmail.com Tue Apr 19 07:35:40 2016 From: mistersheik at gmail.com (Neil Girdhar) Date: Tue, 19 Apr 2016 04:35:40 -0700 (PDT) Subject: [Python-ideas] Changing the meaning of bool.__invert__ In-Reply-To: References: <57069645.8040509@stoneleaf.us> <20160408014030.GL12526@ando.pearwood.info> <1460129233.1465771.572986385.6254AC76@webmail.messagingengine.com> <20160409152514.GT12526@ando.pearwood.info> Message-ID: I'm +1 on this change because it makes sense as a user. Note how numpy deals with invert and unsigned integers: In [2]: a = np.uint8(10) In [3]: ~a Out[3]: 245 The result of invert staying within the same type makes sense to me. (Also, as an idealist, I believe that decoupling int and bool might one day many many years from now bring about the ideal of bool not subclassing int.) Best, Neil On Saturday, April 9, 2016 at 12:25:57 PM UTC-4, Guido van Rossum wrote: > > Let me pronounce something here. This change is not worth the amount > of effort and pain a deprecation would cause everyone. Either we > change this quietly in 3.6 (adding it to What's New etc. of course) or > we don't do it at all. > > -- > --Guido van Rossum (python.org/~guido) > _______________________________________________ > Python-ideas mailing list > Python... at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed...
URL: From rosuav at gmail.com Tue Apr 19 07:49:19 2016 From: rosuav at gmail.com (Chris Angelico) Date: Tue, 19 Apr 2016 21:49:19 +1000 Subject: [Python-ideas] Fwd: Anonymous namedtuples In-Reply-To: References: Message-ID: On Tue, Apr 19, 2016 at 9:31 PM, Joseph Martinot-Lagarde wrote: > Chris Angelico writes: > >> >> On Tue, Apr 19, 2016 at 7:49 PM, Joseph Martinot-Lagarde >> wrote: >> > I'd be happy to make a factory function for my personal use, but the order >> > of kwargs is not respected. To have an elegant syntax, it has to be a >> > construct of the language. >> >> There have been proposals to have kwargs retain some order (either by >> having it actually be an OrderedDict, or by changing the native dict >> type to retain order under fairly restricted circumstances). With >> that, you could craft the factory function easily. > > I saw some proposals but forgot about them, thanks for the reminder. > > I'd prefer my proposal for two reasons: > - the syntax is nicer that a factory function > - it wouldn't be possible to add keyword arguments to __getitem__ without > breaking compatibility > > Those issues are relatively minor, i'd be happy with a factory function Agreed, the syntax is a lot nicer - for this specific situation. Don't forget, though, that every piece of new syntax carries with it the not insignificant cost of complexity; and some things are better represented by function calls than syntax (cf 'print'). With "from types import SimpleNamespace as NS" at the top of your code, you can use the keyword-arguments form of the constructor to be almost the same as your proposal, without any new syntax. When a subsequent maintainer looks at your code, s/he can quickly look up at the imports to see what's going on, instead of having to learn another piece of syntax. > It's true that python already has some commodities in place. 
When I first > heard about namedtuples I was pretty excited, but when I tried in practice > they are not that easy to use, with lots of duplications (the name, and > possibly the field names) and a global declaration. > > For all my use cases (not only function returns) I would completely replace > namedtuple by anonymouns namedtuple. I have to agree. Fortunately it isn't hard to create a namedtuple factory: def NS(fields): return namedtuple('anonymous', fields.split()) or possibly: def NS(fields, *values): return namedtuple('anonymous', fields.split())(*values) That cuts down the duplication some, but it's far from perfect. ChrisA From steve at pearwood.info Tue Apr 19 08:06:28 2016 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 19 Apr 2016 22:06:28 +1000 Subject: [Python-ideas] Anonymous namedtuples In-Reply-To: References: Message-ID: <20160419120627.GR1819@ando.pearwood.info> On Tue, Apr 19, 2016 at 11:25:07AM +0100, Paul Moore wrote: > Possibly the docs for namedtuple should refer the user to > SimpleNamespace as an alternative if "being a tuple subclass" isn't an > important requirement. Or possibly SimpleNamespace should be in the > collections module (alongside namedtuple) rather than in the types > module? +1 I'm repeatedly surprised that SimpleNamespace isn't in collections. > The issue may simply be that SimpleNamespace isn't as discoverable as > it should be? This. -- Steve From p.f.moore at gmail.com Tue Apr 19 08:26:31 2016 From: p.f.moore at gmail.com (Paul Moore) Date: Tue, 19 Apr 2016 13:26:31 +0100 Subject: [Python-ideas] Anonymous namedtuples In-Reply-To: References: Message-ID: On 19 April 2016 at 11:46, Chris Angelico wrote: >> The issue may simply be that SimpleNamespace isn't as discoverable as >> it should be? > > That's entirely possible. For the situations where order is > insignificant and unpacking is unnecessary, a SimpleNamespace would be > perfect, if people knew about it. 
Probably not worth moving the actual > class, but a quick little docs link might help. http://bugs.python.org/issue26805 From contrebasse at gmail.com Tue Apr 19 08:30:10 2016 From: contrebasse at gmail.com (Joseph Martinot-Lagarde) Date: Tue, 19 Apr 2016 12:30:10 +0000 (UTC) Subject: [Python-ideas] Fwd: Anonymous namedtuples References: Message-ID: Chris Angelico writes: > > On Tue, Apr 19, 2016 at 9:31 PM, Joseph Martinot-Lagarde > wrote: > > Chris Angelico ...> writes: > > > >> > >> On Tue, Apr 19, 2016 at 7:49 PM, Joseph Martinot-Lagarde > >> ...> wrote: > >> > I'd be happy to make a factory function for my personal use, but the order > >> > of kwargs is not respected. To have an elegant syntax, it has to be a > >> > construct of the language. > >> > >> There have been proposals to have kwargs retain some order (either by > >> having it actually be an OrderedDict, or by changing the native dict > >> type to retain order under fairly restricted circumstances). With > >> that, you could craft the factory function easily. > > > > I saw some proposals but forgot about them, thanks for the reminder. > > > > I'd prefer my proposal for two reasons: > > - the syntax is nicer that a factory function > > - it wouldn't be possible to add keyword arguments to __getitem__ without > > breaking compatibility > > > > Those issues are relatively minor, i'd be happy with a factory function > > Agreed, the syntax is a lot nicer - for this specific situation. Don't > forget, though, that every piece of new syntax carries with it the not > insignificant cost of complexity; and some things are better > represented by function calls than syntax (cf 'print'). With "from > types import SimpleNamespace as NS" at the top of your code, you can > use the keyword-arguments form of the constructor to be almost the > same as your proposal, without any new syntax. 
When a subsequent > maintainer looks at your code, s/he can quickly look up at the imports > to see what's going on, instead of having to learn another piece of > syntax. I agree on the principle, but to me this syntax looks less complex than the actual namedtuple, where you have to use a factory to create a class to be able to use a namedtuple. And if you use a namedtuple as a return value, the definition of the namedtuple class will typically be relatively far away from the actual use of it. SimpleNamespace is nice but would break compatibility for return values. > I have to agree. Fortunately it isn't hard to create a namedtuple factory: As you said before, it would be easy if the order of the keyword arguments were preserved. > def NS(fields, *values): > return namedtuple('anonymous', fields.split())(*values) > > That cuts down the duplication some, but it's far from perfect. It cuts down the duplication at the cost of readability. There is also the Matlab way, which looks a bit better: from collections import namedtuple def NS(*args): fields = args[::2] values = args[1::2] return namedtuple('anonymous', fields)(*values) print(NS("x", 12, "y", 15)) But readability counts, and it's very easy to introduce bugs, as adding or removing arguments can impact the whole chain. Initially coming from Matlab, the greatest interest I found in Python was named arguments for functions! :) From contrebasse at gmail.com Tue Apr 19 08:34:37 2016 From: contrebasse at gmail.com (Joseph Martinot-Lagarde) Date: Tue, 19 Apr 2016 12:34:37 +0000 (UTC) Subject: [Python-ideas] Anonymous namedtuples References: <20160419120627.GR1819@ando.pearwood.info> Message-ID: > I'm repeatedly surprised that SimpleNamespace isn't in collections. +1 ! But I still think that my proposal has some value outside of SimpleNamespace.
:) From p.f.moore at gmail.com Tue Apr 19 08:47:09 2016 From: p.f.moore at gmail.com (Paul Moore) Date: Tue, 19 Apr 2016 13:47:09 +0100 Subject: [Python-ideas] Anonymous namedtuples In-Reply-To: References: <20160419120627.GR1819@ando.pearwood.info> Message-ID: On 19 April 2016 at 13:34, Joseph Martinot-Lagarde wrote: > But I still think that my proposal has some value outside of SimpleNamespace. :) The biggest issue is likely to be that the added value doesn't justify the cost of new syntax, and a language change. On the other hand, functions preserving the order of keyword arguments (which would allow you to write a cleaner factory function) is probably a change with wider value, as well as being less intrusive (it's a semantic change, but there's no change in syntax) and so may be more likely to get through. A lot of the difficulty with assessing proposals like this is balancing the costs and benefits, particularly as the costs typically affect far more people, most of whom don't participate in these lists, so we have to take a cautious ("assume the worst") view. Personally, I've not used namedtuple enough to feel that it warrants dedicated syntax. OTOH, I have wished for something like SimpleNamespace lots of times, so making that more discoverable would have been a huge win for me! Paul From contrebasse at gmail.com Tue Apr 19 08:49:22 2016 From: contrebasse at gmail.com (Joseph Martinot-Lagarde) Date: Tue, 19 Apr 2016 12:49:22 +0000 (UTC) Subject: [Python-ideas] Anonymous namedtuples References: Message-ID: > This way they could be used in __getitem__ as proposed in PEP 472[0] One of the links in this PEP is https://mail.python.org/pipermail/python-ideas/2013-June/021257.html where the writer raises some concerns about named tuples: - They pose problems with pickle because namedtuples need subclasses to work. In my proposal it shouldn't be a problem anymore since there would be one class.
- namedtuple() is a factory function requiring a 2-step initialization, a name, and a subclass. Again it disappears in my proposal. I'm not sure about the database problem since I don't use them. From steve at pearwood.info Tue Apr 19 08:59:58 2016 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 19 Apr 2016 22:59:58 +1000 Subject: [Python-ideas] Anonymous namedtuples In-Reply-To: References: Message-ID: <20160419125958.GS1819@ando.pearwood.info> On Tue, Apr 19, 2016 at 09:49:44AM +0000, Joseph Martinot-Lagarde wrote: > Hi, list ! > > namedtuples are really great. I would like to use them even more, for > example for functions that return multiple arguments. The problem is that > namedtuples have to be "declared" beforehand, so it would be quite tedious > to declare a nameedtuple by function, that's why I very rarely do it. You aren't forced to declare the type ahead of time, you can just use it as part of an expression: py> from collections import namedtuple py> my_tuple = namedtuple("_", "x y z")(1, 2, 3) py> print(my_tuple) _(x=1, y=2, z=3) However, there is a gotcha if you do this: each time you create an anonymous namedtuple, you create a new, distinct, class: py> a = namedtuple("_", "x y")(1, 2) py> b = namedtuple("_", "x y")(1, 2) py> type(a) == type(b) False which could be both surprising and expensive. One possible solution: keep your own cache of classes: def nt(name, fields, _cache={}): cls = _cache.get((name, fields), None) if cls is None: cls = _cache[(name, fields)] = namedtuple(name, fields) return cls > Another hting I don't like about namedtuples is the duplication of the name. > TYpical declarations look like `Point = namedtuple('Point', ['x', 'y'])`, > where `Point` is repeated two times. I'll go one step further and say that > the name is useless most of the time, so let's just get rid of it. The name is not useless. 
It is very useful for string representations, debugging, introspection, and generally having a clue what the object represents. How else do you know what kind of object you are dealing with? I agree that it is a little sad that we have to repeat the name twice, but strictly speaking, you don't even need to do that. For example: class Point(namedtuple("AbstractPoint", "x y z")): def method(self): ... In my opinion, avoiding having to repeat the name twice is a "nice to have", not a "must have". > Proposal > ======== > > So I thought about a new (ok, maybe it has been proposed before but I > couldn't find it) syntax for anonymous namedtuples (I put the prints as > comments, otherwise gmane is complainig about top-posting): > > my_point = (x=12, y=16) > # (x=12, y=16) > my_point[0] > # 12 How do you know that the 0th item is field "x"? Keyword arguments are not ordered. Even if you could somehow determine the order that they are given, you can't use that information since we should expect that: assert (x=12, y=16) == (y=16, x=12) will pass. (Why? Because they're keyword arguments, and the order of keyword arguments shouldn't matter.) So we have a problem that the indexing order (the same order is used for iteration and tuple unpacking) is not specified anywhere. This is a *major* problem for an ordered type like tuple. namedtuple avoids this problem by requiring the user to specify the field name order before creating an instance. SimpleNamespace avoids this problem by not being ordered or iterable. I suppose we could put the fields in sorted order. But that's going to make life difficult for uses where we would like some other order, e.g. to match common conventions. Consider a 3D point in spherical coordinates: pt = (r=3, theta=0.5, phi=0.25) In sorted order, pt == (3, 0.25, 0.5) which goes against the standard mathematical definition. Namedtuples pre-define what fields are allowed, what they are called, and what order they appear in. 
Anonymous namedtuples as you describe them don't do any of these things. Consider that since they're tuples, we should be able to provide the items as positional arguments to the construct, just like regular namedtuples: pt = (x=12, y=16) type(pt)(1, 2) This works fine with namedtuple, but how will it work with your proposal? And what happens if we do this? type(pt)(1, 2, 3) And of course, a naive implementation would suffer from the same issue as mentioned above, where every instance is a singleton of a distinct class. Python isn't a language where we care too much about memory use, but surely we don't want to be quite this profligate: py> a = namedtuple("Point", "x y z")(1, 2, 3) py> sys.getsizeof(a) # reasonably small 36 py> sys.getsizeof(type(a)) # not so small 420 Obviously we don't want every instance to belong to a distinct class, so we need some sort of cache. SimpleNamespace solves this problem by making all instances belong to the same class. That's another difference between namedtuple and what you seem to want. -- Steve From ncoghlan at gmail.com Tue Apr 19 09:12:16 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 19 Apr 2016 23:12:16 +1000 Subject: [Python-ideas] Have REPL print less by default In-Reply-To: References: <85r3e3sndq.fsf@benfinney.id.au> <85lh4bsip5.fsf@benfinney.id.au> <85d1pmsqvv.fsf@benfinney.id.au> Message-ID: On 19 April 2016 at 14:52, Franklin? Lee wrote: > On Mon, Apr 18, 2016 at 11:51 AM, Ethan Furman wrote: > > Or even Nick's (that you snipped): > >> > >> Since the more the REPL does, the more opportunities there are for > > > >> it to break when debugging, having the output hooks be as simple as > >> possible is quite desirable. > > I read this as, "It'll be more complicated, so it MIGHT be buggy." > This seems solvable via design, and doesn't speak to whether it's > good, in principle, to limit output per input. If you still think it's > a good reason, then I probably didn't understand it correctly. 
> No, that's not what it means. It relates to one of the ways experienced developers are able to debug code: by running it in their heads, and comparing the result their brain computed with what actually happened. When there's a discrepancy, either their expectation is wrong or the code is wrong, and they need to investigate further to figure out which it is. When folks say "Python fits my brain" this is often what they're talking about. Implicit side effects from hidden code break that mental equivalence - it's why effective uses of metaclasses, monkeypatching, and other techniques for deliberately introducing changed implicit behaviour often also involve introducing some kind of more local signal to help convey what is going on (such as a naming convention, or ensuring the altered behaviour is used consistently across the entire project). The default REPL behaviour is appropriate for this "somewhat experienced Pythonista tinkering with code to see how it behaves" use case - keeping the results very close to what they would be if you typed the same line of code into a text file and ran it that way. It's not necessarily the best way to *learn* those equivalences, but that's also not what it's designed for. IPython's REPL is tailored for a different audience - their primary audience is research scientists, and they want to be able to better eyeball calculation results, rather than lower level Python instance representations. As a result, it's much cleverer than the default REPL, but it's also aiming to tap into people's intuitions about the shape of their data and the expected outcomes of the operations they're performing on it, rather than their ability to mentally run Python code specifically. 
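To make the displayhook customization mentioned earlier in the thread concrete, here is a minimal, hypothetical sketch (the function name and the choice of reprlib are mine, not anything anyone proposed): a hook that keeps the default REPL contract but prints a truncated repr.

```python
import builtins
import reprlib

def short_displayhook(value):
    # Mimics the default sys.displayhook contract: ignore None and
    # rebind '_', but print a reprlib-truncated repr instead of the
    # full output.
    if value is None:
        return
    builtins._ = value
    print(reprlib.repr(value))

# A user (or a site.py tweak) could opt in with:
#   import sys; sys.displayhook = short_displayhook
short_displayhook(list(range(10000)))  # prints a bounded summary, not 10000 items
```

Because the hook is installed per-session, normal scripts and other users are unaffected, which is exactly the "local options rather than core change" argument.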
A REPL designed specifically for folks learning Python, like the one in the Mu editor, or the direction IDLE seems to be going, would likely be better off choosing different default settings for sys.displayhook and sys.excepthook, but those changes would be best selected based on direct observations of classrooms and workshops, and noting where folks get confused or intimidated by the default settings. For environments other than IDLE, they can also be iterated on at a much higher rate than we make CPython releases. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Tue Apr 19 09:24:34 2016 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 19 Apr 2016 23:24:34 +1000 Subject: [Python-ideas] Anonymous namedtuples In-Reply-To: References: <20160419120627.GR1819@ando.pearwood.info> Message-ID: <20160419132434.GT1819@ando.pearwood.info> On Tue, Apr 19, 2016 at 01:47:09PM +0100, Paul Moore wrote: > On 19 April 2016 at 13:34, Joseph Martinot-Lagarde > wrote: > > But I still think that my proposal has some value outside of SimpleNamespace. :) > > The biggest issue is likely to be that the added value doesn't justify > the cost of new syntax, and a language change. I think there are more problems with the proposal than just "not useful enough". See my previous post. > On the other hand, > functions preserving the order of keyword arguments (which would allow > you to write a cleaner factory function) is probably a change with > wider value, as well as being less intrusive (it's a semantic change, > but there's no change in syntax) and so may be more likely to get > through. It's a pretty big semantic change though. As I point out in my previous post, at the moment we can be sure that changing the order of keyword arguments does not change anything: spam(a=1, b=2) and spam(b=2, a=1) will always have the same semantics.
Making keyword arguments ordered will break that assumption, and even if it doesn't break any existing code, it will make it harder to reason about future code. No more can you trust that the order of keyword arguments will have no effect on the result of calling a function. I don't see the current behaviour of keyword arguments as a limitation, I see it as a feature. I don't need to care about the order of keyword arguments, only their names. -- Steve From contrebasse at gmail.com Tue Apr 19 09:48:08 2016 From: contrebasse at gmail.com (Joseph Martinot-Lagarde) Date: Tue, 19 Apr 2016 13:48:08 +0000 (UTC) Subject: [Python-ideas] Anonymous namedtuples References: <20160419125958.GS1819@ando.pearwood.info> Message-ID: Steven D'Aprano writes: > py> from collections import namedtuple > py> my_tuple = namedtuple("_", "x y z")(1, 2, 3) > py> print(my_tuple) > _(x=1, y=2, z=3) This looks nice with a short number of short attribute names, but in a real case separating the values from their corresponding field hurts readability. Compare: my_tuple = namedtuple("_", "site, gisement, range_nominal, range_bista, is_monostatic")(12.45, 45.0, 3000.0, 2998.5357, True) my_tuple = (site=12.45, gisement=45.0, range_nominal=3000.0, range_bista=2998.5357, is_monostatic=True) As a side note, if the empty string would be allowed for the class name the printed value would look better. The fact that it's not possible right now looks liek an implementation "detail". > > Another hting I don't like about namedtuples is the duplication of the name. > > TYpical declarations look like `Point = namedtuple('Point', ['x', 'y'])`, > > where `Point` is repeated two times. I'll go one step further and say that > > the name is useless most of the time, so let's just get rid of it. > > The name is not useless. It is very useful for string representations, > debugging, introspection, and generally having a clue what the object > represents. How else do you know what kind of object you are dealing > with? 
> > In my opinion, avoiding having to repeat the name twice is a "nice to > have", not a "must have". Of course the name is not completely useless, but you can often know what a namedtuple represents in your application or your script from the field names only (or the variable name). SimpleNamespace lives very well without a name, for example. > > my_point = (x=12, y=16) > > # (x=12, y=16) > > my_point[0] > > # 12 > > How do you know that the 0th item is field "x"? > > Keyword arguments are not ordered. In my proposal these are not keyword arguments, they are a syntax construct to build an anonymous namedtuple. That's why I can't just create a factory function. The order is determined by the construction, and can be found by introspection (simply printing the object will do). I suppose that as a developer it's a shift from how keywords behave now, but as a user a namedtuple is still a tuple, and it keeps the ordering. > Consider that since they're tuples, we should be able to provide the > items as positional arguments to the construct, just like regular > namedtuples: > > pt = (x=12, y=16) > type(pt)(1, 2) They inherit from tuple; that doesn't mean that they behave in all cases like tuples. In that case I guess that this form would not be allowed. I don't know how bad this looks for a Python dev though! > Obviously we don't want every instance to belong to a distinct class, so > we need some sort of cache. SimpleNamespace solves this problem by > making all instances belong to the same class. That's another difference > between namedtuple and what you seem to want. Yes, that's another difference, I don't want to replicate namedtuples exactly (otherwise I'd just use a namedtuple). I don't think that you need a cache, more like a frozendict per instance which binds the fields to the corresponding index. Maybe that's what you call a cache?
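For what it's worth, that "per-instance field map" idea can be sketched in a few lines (the class name and attribute names here are invented for illustration; this is not the proposal itself, just one possible shape of it):

```python
class AnonTuple(tuple):
    # Hypothetical sketch: a plain tuple that also carries a
    # per-instance mapping from field name to index, so attribute
    # access works while ordinary tuple semantics are preserved.
    def __new__(cls, fields, values):
        self = super().__new__(cls, values)
        self._fields = dict(zip(fields, range(len(values))))
        return self

    def __getattr__(self, name):
        # Only called when normal lookup fails; fall back to the map.
        fields = self.__dict__.get('_fields', {})
        if name in fields:
            return self[fields[name]]
        raise AttributeError(name)

pt = AnonTuple(('x', 'y'), (12, 16))
pt.x            # 12
pt[0]           # 12
pt == (12, 16)  # True: still an ordinary tuple, so unpacking works
```

Note this sidesteps the class-per-instance cost discussed above, at the price of a small dict on every instance.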
From your comments I get that a problem is that the only way to create an anonymous namedtuple would be via the syntax construct, it's not possible to use the class constructor directly. I would argue that it's not a problem because if you need to do that you can just use a standard namedtuple, that's exactly what it's here for. Anonymous namedtuples are for quick, easy, and readable use. Maybe that means that the added syntax complexity is not worth it. From mike at selik.org Tue Apr 19 10:03:35 2016 From: mike at selik.org (Michael Selik) Date: Tue, 19 Apr 2016 14:03:35 +0000 Subject: [Python-ideas] Have REPL print less by default In-Reply-To: References: <85r3e3sndq.fsf@benfinney.id.au> <85lh4bsip5.fsf@benfinney.id.au> <85d1pmsqvv.fsf@benfinney.id.au> Message-ID: On Tue, Apr 19, 2016 at 5:36 AM Franklin? Lee wrote: > On Apr 19, 2016 4:09 AM, "Paul Moore" wrote: > > Basic users should probably be using a tool like IDLE, which has a bit > > more support for beginners than the raw REPL. > > My college had CS students SSH into the department's Linux server to > compile and run their code, and many teachers don't believe that students > should start with fancy IDE featues like, er, syntax highlighting. > That's probably because your professors thought you were more advanced than other new Pythonistas, because you were CS students. If I were in their shoes, I might choose a different approach depending on the level of the course. > But that doesn't answer my question: would the proposed change hurt your workflow? It might. Would it affect doctests? Would it make indirect infinite recursion more difficult to trace? Would it make me remember yet another command line option or REPL option to turn on complete reprs? Would it force me to explain yet another config setting to new programmers? I think a beginner understands when they've printed something too big. I see this happen frequently. They laugh, shake their heads, and retype whatever they need to.
If they're using IDLE, they say, "OMG I crashed it!" then they close the window or restart IDLE. I'd say it's more a problem in IDLE than in the default REPL. -------------- next part -------------- An HTML attachment was scrubbed... URL: From contrebasse at gmail.com Tue Apr 19 10:09:45 2016 From: contrebasse at gmail.com (Joseph Martinot-Lagarde) Date: Tue, 19 Apr 2016 14:09:45 +0000 (UTC) Subject: [Python-ideas] Anonymous namedtuples References: Message-ID: > That's entirely possible. For the situations where order is > insignificant and unpacking is unnecessary, a SimpleNamespace would be > perfect, if people knew about it. Probably not worth moving the actual > class, but a quick little docs link might help. > How big a problem would it be to actually move the class to collections ? For example at the top of the documentation there is a table with all the available container classes, having SimpleNamespace would be far more discoverable than a paragraph in namedtuple. The class could still be importable from types.SimpleNamespace for backward compatibility. Sorry for my noob questions... From p.f.moore at gmail.com Tue Apr 19 10:26:08 2016 From: p.f.moore at gmail.com (Paul Moore) Date: Tue, 19 Apr 2016 15:26:08 +0100 Subject: [Python-ideas] Anonymous namedtuples In-Reply-To: References: Message-ID: On 19 April 2016 at 15:09, Joseph Martinot-Lagarde wrote: >> That's entirely possible. For the situations where order is >> insignificant and unpacking is unnecessary, a SimpleNamespace would be >> perfect, if people knew about it. Probably not worth moving the actual >> class, but a quick little docs link might help. >> > How big a problem would it be to actually move the class to collections ? > For example at the top of the documentation there is a table with all the > available container classes, having SimpleNamespace would be far more > discoverable than a paragraph in namedtuple. Not incredibly hard, in theory. However... 
Move the definition: SimpleNamespace = type(sys.implementation) to collections (but note that this isn't *really* the definition, the actual definition is in C, in the sys module, and is unnamed - to that extent types, which is for exposing names for types that otherwise aren't given built in names, is actually the right place for this definition!) Update the documentation. Update the tests (test/test_types.py) - this probably involves *moving* those tests to test_collections.py. Add a backward compatibility name in types. Add some tests for that backward compatibility name. Work out what to do about pickle compatibility: >>> from pickle import dumps >>> from types import SimpleNamespace >>> dumps(SimpleNamespace(a=1)) b'\x80\x03ctypes\nSimpleNamespace\nq\x00)Rq\x01}q\x02X\x01\x00\x00\x00aq\x03K\x01sb.' Note that the type name (types.SimpleNamespace) is embedded in the pickle. How do we avoid breaking pickles? That's probably far from everything - this was 5 minutes' quick investigation, and I'm not very experienced at doing this type of refactoring. I'm pretty sure I've missed some things. > The class could still be importable from types.SimpleNamespace for backward > compatibility. Where it's defined matters in the pickle case - so it's not quite that simple. > Sorry for my noob questions... Not at all. It's precisely because it "seems simple" that the feedback people get seems negative at times - I hope the above gives you a better idea of what might be involved in such a seemingly simple change. So thanks for asking, and giving me the opportunity to clarify :-) In summary: It's not likely to be worth the effort, even though it looks simple at first glance. 
Paul From random832 at fastmail.com Tue Apr 19 10:26:41 2016 From: random832 at fastmail.com (Random832) Date: Tue, 19 Apr 2016 10:26:41 -0400 Subject: [Python-ideas] Anonymous namedtuples In-Reply-To: References: Message-ID: <1461076001.2559112.583290505.7B6D8CA8@webmail.messagingengine.com> On Tue, Apr 19, 2016, at 06:25, Paul Moore wrote: > Possibly the docs for namedtuple should refer the user to > SimpleNamespace as an alternative if "being a tuple subclass" isn't an > important requirement. Just my two cents - for the use cases that I would find this useful for (I usually end up just using tuple), a hashable "FrozenSimpleNamespace" would be nice. Or even one that has immutable key fields that are used as the basis for the hash but that you can hang other values off of (SemiFrozen?) A way to simply initialize a SimpleNamespace-like class with a passed-in mapping, and have it inherit the equality/hash/etc from that (so FrozenSimpleNamespace could use FrozenDict, you could do one with OrderedDict, one with chainmap or with an easy-to-write custom dict that provides the "immutable key plus mutable other" behavior), might provide for all of these in a relatively easy way. Maybe have SimpleNamespace([dict], /, **kwargs) that just throws the dict argument (if present) into __dict__. If there were a FrozenOrderedDict you could even implement a very namedtuple-like class that way. 
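A rough sketch of the "FrozenSimpleNamespace" floated above (the class name and design details are illustrative only — nothing like this exists in the stdlib): attributes fixed at construction, equality and hashing by sorted contents, so instances can serve as dict keys.

```python
class FrozenNamespace:
    # Hypothetical sketch: an immutable, hashable namespace.
    # Assumes the values themselves are hashable.
    __slots__ = ('_items',)

    def __init__(self, **kwargs):
        # Sort by key so equality and hashing are order-insensitive.
        object.__setattr__(self, '_items', tuple(sorted(kwargs.items())))

    def __getattr__(self, name):
        for key, value in object.__getattribute__(self, '_items'):
            if key == name:
                return value
        raise AttributeError(name)

    def __setattr__(self, name, value):
        raise AttributeError('namespace is frozen')

    def __eq__(self, other):
        return (isinstance(other, FrozenNamespace)
                and self._items == other._items)

    def __hash__(self):
        return hash(self._items)

    def __repr__(self):
        return 'FrozenNamespace(%s)' % ', '.join(
            '%s=%r' % item for item in self._items)

p = FrozenNamespace(x=12, y=16)
p.x                               # 12
p == FrozenNamespace(y=16, x=12)  # True: unordered, unlike a tuple
{p: 'ok'}                         # hashable, so usable as a dict key
```

Note the deliberate contrast with the anonymous-namedtuple proposal: no ordering, no indexing, no unpacking — which is exactly why it avoids the "what is item 0?" question.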
From p.f.moore at gmail.com Tue Apr 19 10:39:48 2016 From: p.f.moore at gmail.com (Paul Moore) Date: Tue, 19 Apr 2016 15:39:48 +0100 Subject: [Python-ideas] Anonymous namedtuples In-Reply-To: <1461076001.2559112.583290505.7B6D8CA8@webmail.messagingengine.com> References: <1461076001.2559112.583290505.7B6D8CA8@webmail.messagingengine.com> Message-ID: On 19 April 2016 at 15:26, Random832 wrote: > On Tue, Apr 19, 2016, at 06:25, Paul Moore wrote: >> Possibly the docs for namedtuple should refer the user to >> SimpleNamespace as an alternative if "being a tuple subclass" isn't an >> important requirement. > > Just my two cents - for the use cases that I would find this useful for > (I usually end up just using tuple), a hashable "FrozenSimpleNamespace" > would be nice. Or even one that has immutable key fields that are used > as the basis for the hash but that you can hang other values off of > (SemiFrozen?) Having done some digging just now... SimpleNamespace is basically just exposing for end users, the type that is used to build the sys.implementation object. The definition is *literally* nothing more than SimpleNamespace = type(sys.implementation) Variations such as you suggest may indeed be useful, but the existing implementation was essentially a free benefit of "we use this one ourselves in the core". Going beyond that, any additional types probably need to justify themselves a bit better - and at that point the usual "publish on PyPI as a standalone module and see how useful they turn out to be" suggestion probably applies. 
Paul From contrebasse at gmail.com Tue Apr 19 10:51:35 2016 From: contrebasse at gmail.com (Joseph Martinot-Lagarde) Date: Tue, 19 Apr 2016 14:51:35 +0000 (UTC) Subject: [Python-ideas] Anonymous namedtuples References: <1461076001.2559112.583290505.7B6D8CA8@webmail.messagingengine.com> Message-ID: > Going beyond that, any additional types > probably need to justify themselves a bit better - and at that point > the usual "publish on PyPI as a standalone module and see how useful > they turn out to be" suggestion probably applies. The core of my proposal depends on modifications in the language itself: it needs either adding a new syntax or keeping the order of keyword arguments. Right now I don't see how it could work as a standalone module, but I'll think about it. From guido at python.org Tue Apr 19 11:06:30 2016 From: guido at python.org (Guido van Rossum) Date: Tue, 19 Apr 2016 08:06:30 -0700 Subject: [Python-ideas] Fwd: Anonymous namedtuples In-Reply-To: References: Message-ID: On Tue, Apr 19, 2016 at 4:49 AM, Chris Angelico wrote: > Fortunately it isn't hard to create a namedtuple factory: > > def NS(fields): > return namedtuple('anonymous', fields.split()) > > or possibly: > > def NS(fields, *values): > return namedtuple('anonymous', fields.split())(*values) > > That cuts down the duplication some, but it's far from perfect. > Please don't recommend or spread this idiom! Every call to namedtuple() creates a new class, which is a very expensive operation. On my machine the simplest namedtuple call takes around 350 usec. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed...
URL: From guido at python.org Tue Apr 19 11:13:53 2016 From: guido at python.org (Guido van Rossum) Date: Tue, 19 Apr 2016 08:13:53 -0700 Subject: [Python-ideas] Fwd: Anonymous namedtuples In-Reply-To: References: Message-ID: A general warning about this (x=12, y=16) idea: It's not so different from using dicts as cheap structs, constructing objects on the fly using either {'x': 12, 'y': 16} or dict(x=12, y=16). The problem with all of these is that if you use this idiom a lot, you're invariably going to run into cases where a field is missing, and you're spending a lot of time tracking down where the object was created incorrectly. Using a namedtuple (or any other construct that asks you to pre-declare the type used) the incorrect construction will cause a failure right at the point where it's being incorrectly constructed, making it much simpler to diagnose. In fully typed languages like Haskell or Java this wouldn't be a problem, but in languages like Python or Perl it is a real concern. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From k7hoven at gmail.com Tue Apr 19 11:15:01 2016 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Tue, 19 Apr 2016 18:15:01 +0300 Subject: [Python-ideas] Anonymous namedtuples In-Reply-To: References: <1461076001.2559112.583290505.7B6D8CA8@webmail.messagingengine.com> Message-ID: On Tue, Apr 19, 2016 at 5:39 PM, Paul Moore wrote: > > Having done some digging just now... SimpleNamespace is basically just > exposing for end users, the type that is used to build the > sys.implementation object. The definition is *literally* nothing more > than > > SimpleNamespace = type(sys.implementation) > So, if this were added to 'collections' too, the documentation should be moved there, and the 'types' documentation could mention this somehow and recommend using collections.
Heck, if it gets another alias, why not give it another name too: collections.ComplicatedNamespace ;-) My main objection to putting SimpleNamespace in collections is that we might want something even better in collections, perhaps ComplexNamespace? -Koos From p.f.moore at gmail.com Tue Apr 19 11:30:26 2016 From: p.f.moore at gmail.com (Paul Moore) Date: Tue, 19 Apr 2016 16:30:26 +0100 Subject: [Python-ideas] Anonymous namedtuples In-Reply-To: References: <1461076001.2559112.583290505.7B6D8CA8@webmail.messagingengine.com> Message-ID: On 19 April 2016 at 15:51, Joseph Martinot-Lagarde wrote: >> Going beyond that, any additional types >> probably need to justify themselves a bit better - and at that point >> the usual "publish on PyPI as a standalone module and see how useful >> they turn out to be" suggestion probably applies. > > The core of my proposal depends on modifications in the language itself, it > needs either adding a new syntax or keeping the order of keyword > arguments. Right now I don't see how it could work as a standalone module, > but I'll think about it. I was referring to random832's suggested variants on SimpleNamespace. As you say, your proposal is basically a language change and you're completely right to assume that needs to be implemented in the core. Sorry if I wasn't clear enough.
Paul From steve at pearwood.info Tue Apr 19 12:18:58 2016 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 20 Apr 2016 02:18:58 +1000 Subject: [Python-ideas] Have REPL print less by default In-Reply-To: References: <85r3e3sndq.fsf@benfinney.id.au> <85lh4bsip5.fsf@benfinney.id.au> <85d1pmsqvv.fsf@benfinney.id.au> Message-ID: <20160419161855.GU1819@ando.pearwood.info> On Tue, Apr 19, 2016 at 11:12:16PM +1000, Nick Coghlan wrote: > The default REPL behaviour is appropriate for this "somewhat experienced > Pythonista tinkering with code to see how it behaves" use case - keeping > the results very close to what they would be if you typed the same line of > code into a text file and ran it that way. It's not necessarily the best > way to *learn* those equivalences, but that's also not what it's designed > for. I mostly agree with what you say, but I would like to see one change to the default sys.excepthook: large numbers of *identical* traceback lines (as you often get with recursion errors) should be collapsed. 
For example:

py> sys.setrecursionlimit(20)
py> fact(30) # obvious recursive factorial
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in fact
  File "<stdin>", line 3, in fact
  File "<stdin>", line 3, in fact
  File "<stdin>", line 3, in fact
  File "<stdin>", line 3, in fact
  File "<stdin>", line 3, in fact
  File "<stdin>", line 3, in fact
  File "<stdin>", line 3, in fact
  File "<stdin>", line 3, in fact
  File "<stdin>", line 3, in fact
  File "<stdin>", line 3, in fact
  File "<stdin>", line 3, in fact
  File "<stdin>", line 3, in fact
  File "<stdin>", line 3, in fact
  File "<stdin>", line 3, in fact
  File "<stdin>", line 3, in fact
  File "<stdin>", line 3, in fact
  File "<stdin>", line 3, in fact
  File "<stdin>", line 2, in fact
RuntimeError: maximum recursion depth exceeded in comparison

Try as I might, I just don't see the value of manually counting all those 'File "<stdin>", line 3, in fact' lines to find out where the recursive call failed :-)

I think that it would be better if identical lines were collapsed, something like this:

import sys
import traceback
from itertools import groupby

TEMPLATE = " [...repeat previous line %d times...]\n"

def collapse(seq, minimum, template=TEMPLATE):
    for key, group in groupby(seq):
        group = list(group)
        if len(group) < minimum:
            for item in group:
                yield item
        else:
            yield key
            yield template % (len(group)-1)

def shortertb(*args):
    lines = traceback.format_exception(*args)
    sys.stderr.write(''.join(collapse(lines, 3)))

sys.excepthook = shortertb

which then gives tracebacks like this:

py> sys.setrecursionlimit(200)
py> a = fact(10000)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in fact
 [...repeat previous line 197 times...]
  File "<stdin>", line 2, in fact
RuntimeError: maximum recursion depth exceeded in comparison

-- Steve

From guido at python.org Tue Apr 19 12:20:48 2016 From: guido at python.org (Guido van Rossum) Date: Tue, 19 Apr 2016 09:20:48 -0700 Subject: [Python-ideas] Type hinting for path-related functions In-Reply-To: References: Message-ID: On Tue, Apr 19, 2016 at 3:41 AM, Koos Zevenhoven wrote: > On Tue, Apr 19, 2016 at 4:27 AM, Guido van Rossum > wrote: > > Your pathstring seems to be the same as the predefined (in typing.py, and > > PEP 484) AnyStr. > > Oh, there too! :) I thought I will need a TypeVar, so I turned to > help(typing.TypeVar) to look up how to do that, and there it was, > right in front of me, just with a different name 'A': > > A = TypeVar('A', str, bytes) > > Anyway, it might make sense to consider defining 'pathstring' (or > 'PathStr' for consistency?), even if it would be the same as AnyStr. > Then, hypothetically, if at any point in the far future, bytes paths > would be deprecated, it could be considered to make PathStr just str. > After all, we don't want just Any String, we want something that > represents a path (in a documentation sense). > Unfortunately, until we implement something like "NewType" ( https://github.com/python/typing/issues/189) the type checkers won't check whether you're actually using the right thing, so while the separate name would add a bit of documentation, I doubt that you'll ever be able to change the meaning of PathStr. Also, I don't expect a future where bytes paths don't make sense, unless Linux starts enforcing a normalized UTF-8 encoding in the kernel. > > You are indeed making sense, except that for various reasons the stdlib is > > not likely to adopt in-line signature annotations yet -- not even for new > > code. > > > > However once there's agreement on os.fspath() it can be added to the stubs > > in github.com/python/typeshed.
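The constrained TypeVar being discussed behaves like this in practice (the function below is purely illustrative, not an actual stub entry or stdlib API):

```python
# AnyStr is predefined in typing (PEP 484) as TypeVar('AnyStr', str, bytes).
# A constrained TypeVar links the argument and return types: pass str and
# you get str back, pass bytes and you get bytes back; mixing the two in
# one call is an error for the type checker (though not at runtime).
from typing import AnyStr


def ensure_trailing_sep(path: AnyStr, sep: AnyStr) -> AnyStr:
    # Illustrative path helper, not part of any stdlib module.
    if path.endswith(sep):
        return path
    return path + sep
```

A checker would accept `ensure_trailing_sep('/tmp', '/')` and `ensure_trailing_sep(b'/tmp', b'/')` but reject `ensure_trailing_sep('/tmp', b'/')`.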
> > > > I see, and I did have that impression already about the stdlib and > type hints, probably based on some of your writings. My intention was > to write these in the stub format, but apparently I need to look up > the stub syntax once more. > Once there's a PEP, updating the stubs will be routine. > > Is there going to be a PEP for os.fspath()? (I muted most of the > discussions > > so I'm not sure where it stands.) > > It has not seemed like a good idea to discuss this (too?), but now > that you ask, I have been wondering how optimal it is to add this to > the pathlib PEP. While the changes do affect pathlib (even the code of > the module itself), this will affect ntpath, posixpath, os.scandir, > os.[other stuff], DirEntry (tempted to say os.DirEntry, but that is > not true), shutil.[stuff], (io.)open, and potentially all kinds of > random places in the stdlib, such as fileinput, filecmp, zipfile, > tarfile, tempfile (for the 'dir' keyword arguments), maybe even glob, > and fnmatch, to name a few :). > > And now, if the FSPath[underlying_type] I just proposed ends up being > added to typing (by whatever name), this will even affect typing.py. > Personally I think it's better off as a separate PEP, unless it turns out that it can be compressed to just the addition of a few paragraphs to the original PEP 428. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From steve at pearwood.info Tue Apr 19 12:44:41 2016 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 20 Apr 2016 02:44:41 +1000 Subject: [Python-ideas] Anonymous namedtuples In-Reply-To: <20160419120627.GR1819@ando.pearwood.info> References: <20160419120627.GR1819@ando.pearwood.info> Message-ID: <20160419164440.GW1819@ando.pearwood.info> On Tue, Apr 19, 2016 at 10:06:28PM +1000, Steven D'Aprano wrote: > On Tue, Apr 19, 2016 at 11:25:07AM +0100, Paul Moore wrote: > > > Possibly the docs for namedtuple should refer the user to > > SimpleNamespace as an alternative if "being a tuple subclass" isn't an > > important requirement. Or possibly SimpleNamespace should be in the > > collections module (alongside namedtuple) rather than in the types > > module? > > +1 > > I'm repeatedly surprised that SimpleNamespace isn't in collections. I've thought about this some more, and I've decided that I don't think it should be in collections. It's not a collection. Strangely, "collection" is not in the glossary: https://docs.python.org/dev/glossary.html (neither is "container") and there doesn't appear to be an ABC for what makes a collection, but the docs seem to suggest that "collections" are another word for "containers": Quote: 8.3. collections — Container datatypes Source code: Lib/collections/__init__.py This module implements specialized container datatypes providing alternatives to Python's general purpose built-in containers, dict, list, set, and tuple. https://docs.python.org/dev/library/collections.html which mostly matches the old (and I mean really old, going back to Python 1.5 days) informal usage of "collection" as "a string, list, tuple or dict", that is, a sequence or mapping. (These days, I'd add set to the list as well.) SimpleNamespace is neither a sequence nor a mapping. We do have a "Container" ABC: https://docs.python.org/dev/library/collections.abc.html#collections.abc.Container and SimpleNamespace doesn't match that ABC either.
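The claim is easy to check against the ABCs (a small demonstration using only the stdlib):

```python
from collections import namedtuple
from collections.abc import Container, Mapping, Sequence
from types import SimpleNamespace

ns = SimpleNamespace(x=1, y=2)

# SimpleNamespace defines none of __contains__, __len__, __iter__ or
# __getitem__, so it satisfies none of the container-style ABCs.
assert not isinstance(ns, Container)
assert not isinstance(ns, Sequence)
assert not isinstance(ns, Mapping)

# A namedtuple, by contrast, is a tuple subclass and therefore a Sequence.
Point = namedtuple('Point', 'x y')
assert isinstance(Point(1, 2), Sequence)
```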
So I think SimpleNamespace should stay where it is. I guess I'll just have to learn that it's a type, not a collection :-) -- Steve From jbvsmo at gmail.com Tue Apr 19 12:58:47 2016 From: jbvsmo at gmail.com (=?UTF-8?Q?Jo=C3=A3o_Bernardo?=) Date: Tue, 19 Apr 2016 13:58:47 -0300 Subject: [Python-ideas] Fwd: Anonymous namedtuples In-Reply-To: References: Message-ID: On Tue, Apr 19, 2016 at 12:06 PM, Guido van Rossum wrote: > Every call to namedtuple() creates a new class, which is a very expensive > operation. On my machine the simplest namedtuple call takes around 350 usec. > Isn't that because the code for namedtuple is using exec instead of a more pythonic approach of metaclassing to make it "faster" but taking longer to actually create the class? http://bugs.python.org/issue3974 Maybe instead of changing namedtuple (which seems to be a taboo), there could be a new anonymous type. Something built-in, with its own syntax. -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Tue Apr 19 13:35:45 2016 From: guido at python.org (Guido van Rossum) Date: Tue, 19 Apr 2016 10:35:45 -0700 Subject: [Python-ideas] Fwd: Anonymous namedtuples In-Reply-To: References: Message-ID: On Tue, Apr 19, 2016 at 9:58 AM, João Bernardo wrote: > > On Tue, Apr 19, 2016 at 12:06 PM, Guido van Rossum > wrote: > >> Every call to namedtuple() creates a new class, which is a very expensive >> operation. On my machine the simplest namedtuple call takes around 350 usec. >> > > > Isn't that because the code for namedtuple is using exec instead of a more > pythonic approach of metaclassing to make it "faster" but taking longer to > actually create the class? > http://bugs.python.org/issue3974 > That's part of it, but no matter how you do it, it's still creating a new class object, you can't get around that. And class objects are just expensive. Any idiom that ends up creating a new class object for each instance that's created is a big anti-pattern.
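If a factory along the lines Chris sketched is wanted anyway, the repeated class creation can at least be avoided by caching one class per distinct field list (a sketch of the workaround, not something endorsed in the thread):

```python
from collections import namedtuple
from functools import lru_cache


@lru_cache(maxsize=None)
def _nt_class(fields):
    # The expensive namedtuple() call happens once per distinct field
    # tuple; later calls with the same fields reuse the cached class.
    return namedtuple('anonymous', fields)


def NS(fields, *values):
    # fields is split into a tuple so it can serve as a cache key.
    return _nt_class(tuple(fields.split()))(*values)
```

With this, `NS('x y', 1, 2)` and `NS('x y', 3, 4)` share one class, so only the first call pays the class-creation cost — though Guido's deeper objection (fail-late construction) still applies.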
> Maybe instead of changing namedtuple (which seems to be a taboo), there > could be a new anonymous type. Something built-in, with its own syntax. > That's essentially what the (x=12, y=16) proposal is about, IIUC -- it would just be a single new type, so (x=12, y=16).__class__ would be the same class object as (a='', b=3.14). But I have serious reservations about that idiom too. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From random832 at fastmail.com Tue Apr 19 13:36:47 2016 From: random832 at fastmail.com (Random832) Date: Tue, 19 Apr 2016 13:36:47 -0400 Subject: [Python-ideas] Type hinting for path-related functions In-Reply-To: References: Message-ID: <1461087407.2619470.583514289.06CD4F00@webmail.messagingengine.com> On Tue, Apr 19, 2016, at 12:20, Guido van Rossum wrote: > Also, I don't expect a future where bytes paths don't make sense, unless > Linux starts enforcing a normalized UTF-8 encoding in the kernel. Well, OSX does that now, but that's a whole other topic. Whether it is useful to represent paths as the bytes type in Python code is orthogonal to whether you can have paths that aren't valid strings in an encoding, considering that surrogateescape lets you represent any sequence of bytes as a str. From k7hoven at gmail.com Tue Apr 19 13:40:06 2016 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Tue, 19 Apr 2016 20:40:06 +0300 Subject: [Python-ideas] Type hinting for path-related functions In-Reply-To: References: Message-ID: On Tue, Apr 19, 2016 at 7:20 PM, Guido van Rossum wrote: > On Tue, Apr 19, 2016 at 3:41 AM, Koos Zevenhoven wrote: >> >> On Tue, Apr 19, 2016 at 4:27 AM, Guido van Rossum >> wrote: >> >> > Is there going to be a PEP for os.fspath()? (I muted most of the >> > discussions >> > so I'm not sure where it stands.) 
>> >> It has not seemed like a good idea to discuss this (too?), but now >> that you ask, I have been wondering how optimal it is to add this to >> the pathlib PEP. While the changes do affect pathlib (even the code of >> the module itself), this will affect ntpath, posixpath, os.scandir, >> os.[other stuff], DirEntry (tempted to say os.DirEntry, but that is >> not true), shutil.[stuff], (io.)open, and potentially all kinds of >> random places in the stdlib, such as fileinput, filecmp, zipfile, >> tarfile, tempfile (for the 'dir' keyword arguments), maybe even glob, >> and fnmatch, to name a few :). >> >> And now, if the FSPath[underlying_type] I just proposed ends up being >> added to typing (by whatever name), this will even affect typing.py. > > > Personally I think it's better off as a separate PEP, unless it turns out > that it can be compressed to just the addition of a few paragraphs to the > original PEP 428. > While I could imagine the discussions having been shorter, it does not seem like compressing everything into a few paragraphs is a good idea either. And there are things that have not really been discussed, such as the details of the 'typing' part and the list of affected modules, which I tried to sketch above. Anyway, after all this, I wouldn't mind to also work on the PEP if there will be separate one---if that makes any sense. -Koos From guido at python.org Tue Apr 19 13:51:13 2016 From: guido at python.org (Guido van Rossum) Date: Tue, 19 Apr 2016 10:51:13 -0700 Subject: [Python-ideas] Type hinting for path-related functions In-Reply-To: <1461087407.2619470.583514289.06CD4F00@webmail.messagingengine.com> References: <1461087407.2619470.583514289.06CD4F00@webmail.messagingengine.com> Message-ID: On Tue, Apr 19, 2016 at 10:36 AM, Random832 wrote: > On Tue, Apr 19, 2016, at 12:20, Guido van Rossum wrote: > > Also, I don't expect a future where bytes paths don't make sense, unless > > Linux starts enforcing a normalized UTF-8 encoding in the kernel. 
> > Well, OSX does that now, but that's a whole other topic. Whether it is > useful to represent paths as the bytes type in Python code is orthogonal > to whether you can have paths that aren't valid strings in an encoding, > considering that surrogateescape lets you represent any sequence of > bytes as a str. > I'm sorry, I just don't see support for bytes paths going away any time soon (regardless of the alternative you bring up), so I think it's a waste of breath to discuss it further. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From __peter__ at web.de Tue Apr 19 13:55:59 2016 From: __peter__ at web.de (Peter Otten) Date: Tue, 19 Apr 2016 19:55:59 +0200 Subject: [Python-ideas] Fwd: Anonymous namedtuples References: Message-ID: Guido van Rossum wrote: > A general warning about this (x=12, y=16) idea: It's not so different from > using dicts as cheap structs, constructing objects on the fly using either > {'x': 12, 'y': 16} or dict(x=12, y=16). The problem with all these this is > that if you use this idiom a lot, you're invariably going to run into > cases where a field is missing, and you're spending a lot of time tracking > down where the object was created incorrectly. Using a namedtuple (or any > other construct that asks you to pre-declare the type used) the incorrect > construction will cause a failure right at the point where it's being > incorrectly constructed, making it much simpler to diagnose. > > In fully typed languages like Haskell or Java this wouldn't be a problem, > but in languages like Python or Perl it is a real concern. If there were support from the compiler for a function def f(): return x=1, y=2 the namedtuple("_", "x y") class could be stored in f.__code__.co_consts just like 1 and 2. Adding information about the file and line number could probably be handled like it's done for code objects of classes and functions defined inside a function. 
While this approach still produces "late" failures debugging should at least be easier than for dict or SimpleNamespace instances. From k7hoven at gmail.com Tue Apr 19 16:06:31 2016 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Tue, 19 Apr 2016 23:06:31 +0300 Subject: [Python-ideas] Have REPL print less by default In-Reply-To: <20160419161855.GU1819@ando.pearwood.info> References: <85r3e3sndq.fsf@benfinney.id.au> <85lh4bsip5.fsf@benfinney.id.au> <85d1pmsqvv.fsf@benfinney.id.au> <20160419161855.GU1819@ando.pearwood.info> Message-ID: On Tue, Apr 19, 2016 at 7:18 PM, Steven D'Aprano wrote: [...] > which then gives tracebacks like this: > > > py> sys.setrecursionlimit(200) > py> a = fact(10000) > Traceback (most recent call last): > File "", line 1, in > File "", line 3, in fact > [...repeat previous line 197 times...] > File "", line 2, in fact > RuntimeError: maximum recursion depth exceeded in comparison > Something like this would also make sense (although not be quite as useful) when there is infinite recursion. But it should recognize a block of several lines repeating. -Koos From contrebasse at gmail.com Tue Apr 19 16:25:30 2016 From: contrebasse at gmail.com (Joseph Martinot-Lagarde) Date: Tue, 19 Apr 2016 20:25:30 +0000 (UTC) Subject: [Python-ideas] Fwd: Anonymous namedtuples References: Message-ID: Guido van Rossum writes: > A general warning about this (x=12, y=16) idea: It's not so different from using dicts as cheap structs, constructing objects on the fly using either {'x': 12, 'y': 16} or dict(x=12, y=16). The problem with all these this is that if you use this idiom a lot, you're invariably going to run into cases where a field is missing, and you're spending a lot of time tracking down where the object was created incorrectly. 
Using a namedtuple (or any other construct that asks you to pre-declare the type used) the incorrect construction will cause a failure right at the point where it's being incorrectly constructed, making it much simpler to diagnose. I think that it depends a lot on what these are used for. What I primarily thought about was to be able to easily return a namedtuple as a function output. In this case the namedtuple (anonymous or not) is created in a single place, so having to declare the namedtuple before using it has a limited interest, and has the drawbacks I presented before (code duplication, declaration far away from the use). What you're talking about is more like a container which would be used all around an application, where you need to ensure the correctness of the struct everywhere. In this case a standard namedtuple is a better fit. I feel like it's similar to the separation you talked about before, script vs application. My view is more script-like, yours is more application-like. Joseph From contrebasse at gmail.com Tue Apr 19 16:38:06 2016 From: contrebasse at gmail.com (Joseph Martinot-Lagarde) Date: Tue, 19 Apr 2016 20:38:06 +0000 (UTC) Subject: [Python-ideas] Anonymous namedtuples References: <20160419120627.GR1819@ando.pearwood.info> <20160419132434.GT1819@ando.pearwood.info> Message-ID: > It's a pretty big semantic change though. As I point out in my previous > post, at the moment we can be sure that changing the order of keyword > arguments does not change anything: spam(a=1, b=2) and spam(b=2, a=1) > will always have the same semantics. Making keyword arguments ordered > will break that assumption, and even if it doesn't break any existing > code, it will make it harder to reason about future code. No more can > you trust that the order of keyword arguments will have no effect on the > result of calling a function. > > I don't see the current behaviour of keyword arguments as a limitation, > I see it as a feature.
I don't need to care about the order of keyword > arguments, only their names. > I think I finally understand your concern about keeping the order of keyword arguments. Most of the time it wouldn't matter but it can potentially lead to very surprising behavior. Maybe the fact that we're all consenting adults can help? However in my proposal there is no function call, even if the syntax is similar. The order of the fields has the same significance as in a namedtuple, but it's another (easier) way to set them. It's also why I proposed an evolution to use it in __getitem__ also, which already doesn't behave like a standard function call either. Joseph From contrebasse at gmail.com Tue Apr 19 16:47:13 2016 From: contrebasse at gmail.com (Joseph Martinot-Lagarde) Date: Tue, 19 Apr 2016 20:47:13 +0000 (UTC) Subject: [Python-ideas] Anonymous namedtuples References: <20160419120627.GR1819@ando.pearwood.info> <20160419164440.GW1819@ando.pearwood.info> Message-ID: > SimpleNamespace is neither a sequence nor a mapping. > I don't know the exact definition of a mapping, but to me SimpleNamespace is like a dict with restricted keys (valid attributes only) and a nicer syntax. `my_dict["key"]` is roughly equivalent to `my_namespace.key`. SimpleNamespace lacks lots of methods compared to dict. Maybe a hypothetical ComplexNamespace would be in the collections module? Joseph From srkunze at mail.de Tue Apr 19 17:20:30 2016 From: srkunze at mail.de (Sven R. Kunze) Date: Tue, 19 Apr 2016 23:20:30 +0200 Subject: [Python-ideas] Fwd: Anonymous namedtuples In-Reply-To: References: Message-ID: <5716A11E.30704@mail.de> On 19.04.2016 19:35, Guido van Rossum wrote: > That's essentially what the (x=12, y=16) proposal is about, IIUC -- it > would just be a single new type, so (x=12, y=16).__class__ would be > the same class object as (a='', b=3.14). The proposal reminds me of JavaScript's "object". So, "(x=12, y=16).__class__ == type(object())"?
> But I have serious reservations about that idiom too. Me, too. I don't fully support this kind of on-the-fly construction as it is the same for closures. But that might just be my personal feeling because as such they escape proper testing and type support by IDEs (which reminds me strongly about Web development with JavaScript). On the other side, it allows ultra-rapid prototyping from which one can strip down to "traditional development" step by step when needed (using classes, tests etc.). This said, what are the reasons for your reservations? Best, Sven From tjreedy at udel.edu Tue Apr 19 17:37:14 2016 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 19 Apr 2016 17:37:14 -0400 Subject: [Python-ideas] Have REPL print less by default In-Reply-To: References: <85r3e3sndq.fsf@benfinney.id.au> <85lh4bsip5.fsf@benfinney.id.au> <85d1pmsqvv.fsf@benfinney.id.au> Message-ID: On 4/19/2016 5:35 AM, Koos Zevenhoven wrote: > I was once a basic user, but I still have no idea what "IDLE" is. Does > it come with python? Yes, unless explicitly omitted either in packaging or installation. (Some redistributors might put it with a separate tkinter/tix/idle/turtle/turtledemo package. The windows installer has a box that can be unchecked.) > I have tried > > $ idle This and idle3 works on some but not all systems. > $ python -m idle python -m idlelib (or idlelib.idle -- required in 2.x) > $ python -m IDLE > $ python --idle > To be honest, I do remember seeing a shortcut to IDLE in one of my > Windows python installations, Right. Once run, the IDLE icon can be pinned to the taskbar. > and I've seen it come up in discussions. > However, it does not seem to me that IDLE is something that beginners > would know to turn to. Yet many do, perhaps because instructors and books suggest it and tell how to start it up. 
-- Terry Jan Reedy From contrebasse at gmail.com Tue Apr 19 18:02:55 2016 From: contrebasse at gmail.com (Joseph Martinot-Lagarde) Date: Tue, 19 Apr 2016 22:02:55 +0000 (UTC) Subject: [Python-ideas] Fwd: Anonymous namedtuples References: <5716A11E.30704@mail.de> Message-ID: > On 19.04.2016 19:35, Guido van Rossum wrote: > > That's essentially what the (x=12, y=16) proposal is about, IIUC -- it > > would just be a single new type, so (x=12, y=16).__class__ would be > > the same class object as (a='', b=3.14). > > The proposal reminds me of JavaScript's "object". > > So, "(x=12, y=16).__class__ == type(object())"? No, it would have its own class but it would be the same for all of these object, independently of the fields. This is different from namedtuples who need the creation of a subclass for each set of fields. From tjreedy at udel.edu Tue Apr 19 18:02:54 2016 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 19 Apr 2016 18:02:54 -0400 Subject: [Python-ideas] Have REPL print less by default In-Reply-To: References: <85r3e3sndq.fsf@benfinney.id.au> <85lh4bsip5.fsf@benfinney.id.au> <85d1pmsqvv.fsf@benfinney.id.au> Message-ID: On 4/19/2016 9:12 AM, Nick Coghlan wrote: > The default REPL behaviour is appropriate for this "somewhat experienced > Pythonista tinkering with code to see how it behaves" use case - keeping > the results very close to what they would be if you typed the same line > of code into a text file and ran it that way. It's not necessarily the > best way to *learn* those equivalences, but that's also not what it's > designed for. The default REPL behavior should not throw away output. In the Windows console, what is not displayed cannot be retrieved. Reversible output editing is possible and appropriate in a GUI that can keep output separate from what is displayed. 
> IPython's REPL is tailored for a different audience - their primary > audience is research scientists, and they want to be able to better > eyeball calculation results, rather than lower level Python instance > representations. As a result, it's much cleverer than the default REPL, > but it's also aiming to tap into people's intuitions about the shape of > their data and the expected outcomes of the operations they're > performing on it, rather than their ability to mentally run Python code > specifically. > > A REPL designed specifically for folks learning Python, like the one in > the Mu editor, or the direction IDLE seems to be going, would likely be > better off choosing different default settings for sys.displayhook and > sys.excepthook, There is an external Squeezer extension to IDLE that more or less does what this thread proposes and is also reversible. Deleted blocks are replaced by something that can be clicked on to expand the text. There is a proposal to incorporate Squeezer into IDLE. I have not reviewed the proposal yet because it would not solve any of my problems. I am more interested in a way to put long output from help() into a separate text window instead of the shell. > but those changes would be best selected based on direct > observations of classrooms and workshops, and noting where folks get > confused or intimidated by the default settings. For environments other > than IDLE, they can also be iterated on at a much higher rate than we > make CPython releases. -- Terry Jan Reedy From tjreedy at udel.edu Tue Apr 19 18:12:51 2016 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 19 Apr 2016 18:12:51 -0400 Subject: [Python-ideas] Have REPL print less by default In-Reply-To: References: <85r3e3sndq.fsf@benfinney.id.au> <85lh4bsip5.fsf@benfinney.id.au> <85d1pmsqvv.fsf@benfinney.id.au> Message-ID: On 4/19/2016 10:03 AM, Michael Selik wrote: > > > On Tue, Apr 19, 2016 at 5:36 AM Franklin?
Lee > > > wrote: > > On Apr 19, 2016 4:09 AM, "Paul Moore" > > wrote: > > Basic users should probably be using a tool like IDLE, which has a bit > > more support for beginners than the raw REPL. > > My college had CS students SSH into the department's Linux server to > compile and run their code, and many teachers don't believe that > students should start with fancy IDE features like, er, syntax > highlighting. > > That's probably because your professors thought you were more advanced > than other new Pythonistas, because you were CS students. If I were in > their shoes, I might choose a different approach depending on the level > of the course. > >> But that doesn't answer my question: would the proposed change hurt > your workflow? > > It might. Would it affect doctests? Would it make indirect infinite > recursion more difficult to trace? Would it make me remember yet another > command line option or REPL option to turn on complete reprs? Would it > force me to explain yet another config setting to new programmers? > > I think a beginner understands when they've printed something too big. I > see this happen frequently. They laugh, shake their heads, and retype > whatever they need to. > > If they're using IDLE, they say, "OMG I crashed it!" then they close the > window or restart IDLE. Recursion limit tracebacks with 2000 short lines (under 80 chars) are not a problem for tk's Text widget. Long lines, from printing something like 'a'*10000 or [1]*5000, make the widget sluggish. Too many or too long may require a restart. > I'd say it's more a problem in IDLE than in the default REPL. It's a problem with using an underlying text widget optimized for managing 'sensible' length '\n' delimited lines.
-- Terry Jan Reedy From tjreedy at udel.edu Tue Apr 19 18:28:40 2016 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 19 Apr 2016 18:28:40 -0400 Subject: [Python-ideas] Have REPL print less by default In-Reply-To: <20160419161855.GU1819@ando.pearwood.info> References: <85r3e3sndq.fsf@benfinney.id.au> <85lh4bsip5.fsf@benfinney.id.au> <85d1pmsqvv.fsf@benfinney.id.au> <20160419161855.GU1819@ando.pearwood.info> Message-ID: On 4/19/2016 12:18 PM, Steven D'Aprano wrote: > I mostly agree with what you say, but I would like to see one change to > the default sys.excepthook: large numbers of *identical* traceback lines > (as you often get with recursion errors) should be collapsed. For > example: Tracebacks produce *pairs* of lines: the location and the line itself. Replacing pairs with a count of repetitions would not lose information, and would make the info more visible. I would require at least, say, 3 repetitions before collapsing. > py> sys.setrecursionlimit(20) > py> fact(30) # obvious recursive factorial > Traceback (most recent call last): > File "", line 1, in > File "", line 3, in fact return fact(n-1) > File "", line 3, in fact return fact(n-1) > File "", line 3, in fact > File "", line 3, in fact > File "", line 3, in fact > File "", line 3, in fact > File "", line 3, in fact > File "", line 3, in fact > File "", line 3, in fact > File "", line 3, in fact > File "", line 3, in fact > File "", line 3, in fact > File "", line 3, in fact > File "", line 3, in fact > File "", line 3, in fact > File "", line 3, in fact > File "", line 3, in fact > File "", line 3, in fact > File "", line 2, in fact > RuntimeError: maximum recursion depth exceeded in comparison > > > Try as I might, I just don't see the value of manually counting all > those 'File "", line 3, in fact' lines to find out where the > recursive call failed :-) > > I think that it would be better if identical lines were collapsed, > something like this: > > > import sys > import traceback > from 
itertools import groupby > TEMPLATE = " [...repeat previous line %d times...]\n" > > def collapse(seq, minimum, template=TEMPLATE): > for key, group in groupby(seq): > group = list(group) > if len(group) < minimum: > for item in group: > yield item > else: > yield key > yield template % (len(group)-1) > > def shortertb(*args): > lines = traceback.format_exception(*args) > sys.stderr.write(''.join(collapse(lines, 3))) > > sys.excepthook = shortertb > > > > which then gives tracebacks like this: > > > py> sys.setrecursionlimit(200) > py> a = fact(10000) > Traceback (most recent call last): > File "", line 1, in > File "", line 3, in fact return fact(n-1) > [...repeat previous line 197 times...] [repeat previous pair of lines 197 times] > File "", line 2, in fact > RuntimeError: maximum recursion depth exceeded in comparison -- Terry Jan Reedy From ericsnowcurrently at gmail.com Tue Apr 19 20:26:42 2016 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Tue, 19 Apr 2016 18:26:42 -0600 Subject: [Python-ideas] Anonymous namedtuples In-Reply-To: References: <1461076001.2559112.583290505.7B6D8CA8@webmail.messagingengine.com> Message-ID: On Tue, Apr 19, 2016 at 8:39 AM, Paul Moore wrote: > Having done some digging just now... SimpleNamespace is basically just > exposing for end users, the type that is used to build the > sys.implementation object. The definition is *literally* nothing more > than > > SimpleNamespace = type(sys.implementation) FYI, sys.implementation was added for PEP 421 (May-ish 2012). That PEP has a small section on the type I used. [1] At first it was unclear if I should even expose type(sys.implementation). [2] Nick suggested types.SimpleNamespace, so I went with that. [3] :) Note that the type is super simple/trivial, a basic subclass of object with a **kwargs __init__() to pre-populate and a nice repr. Both that first link [1] and the docs [4] spell it out. The point is that there isn't much to the type. It's a convenience, nothing else. 
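Eric's point that there isn't much to the type can be checked directly in CPython - a quick sketch, exercising nothing beyond the documented behaviour:

```python
import sys
import types

# The types module really does expose the type of sys.implementation.
assert type(sys.implementation) is types.SimpleNamespace

# A **kwargs __init__ to pre-populate, plain attribute access, and
# content-based equality:
ns = types.SimpleNamespace(x=12, y=16)
assert ns.x + ns.y == 28
assert ns == types.SimpleNamespace(x=12, y=16)

# Mutable after the fact, like the old "class SimpleNamespace: pass" trick:
ns.z = 99
assert ns.z == 99
```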
I used to get roughly the same thing with "class SimpleNamespace: pass". :) FWIW, I proposed giving SimpleNamespace more exposure around the same time as PEP 421, and the reaction was lukewarm. [5] However, since then it's been used in several other places in the stdlib and I know of a few folks that have used it in their own projects. Regardless, ultimately the only reason I added it to the stdlib in the first place [6] was because it's the type of sys.implementation and it made more sense to expose it explicitly in the types module than implicitly via type(sys.implementation). -eric [1] https://www.python.org/dev/peps/pep-0421/#type-considerations [2] https://mail.python.org/pipermail/python-dev/2012-May/119771.html [3] https://mail.python.org/pipermail/python-dev/2012-May/119775.html [4] https://docs.python.org/3/library/types.html#types.SimpleNamespace [5] https://mail.python.org/pipermail/python-ideas/2012-May/015208.html [6] in contrast to argparse.Namespace and multiprocessing.managers.Namespace... From ethan at stoneleaf.us Tue Apr 19 20:53:51 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Tue, 19 Apr 2016 17:53:51 -0700 Subject: [Python-ideas] Anonymous namedtuples In-Reply-To: References: <1461076001.2559112.583290505.7B6D8CA8@webmail.messagingengine.com> Message-ID: <5716D31F.9050404@stoneleaf.us> On 04/19/2016 08:15 AM, Koos Zevenhoven wrote: > My main objection to putting SimpleNamespace in collections is that we > might want something even better in collections, perhaps > ComplexNamespace? A ComplexNamespace is a class. :) -- ~Ethan~ From random832 at fastmail.com Tue Apr 19 22:04:40 2016 From: random832 at fastmail.com (Random832) Date: Tue, 19 Apr 2016 22:04:40 -0400 Subject: [Python-ideas] Fwd: Anonymous namedtuples In-Reply-To: <5716A11E.30704@mail.de> References: <5716A11E.30704@mail.de> Message-ID: <1461117880.2452825.583929169.02A7CF5D@webmail.messagingengine.com> On Tue, Apr 19, 2016, at 17:20, Sven R. 
Kunze wrote: > So, "(x=12, y=16).__class__ == type(object())"? Well, object does equality comparison by identity; you presumably would want this class to use its contents. From steve at pearwood.info Tue Apr 19 22:48:11 2016 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 20 Apr 2016 12:48:11 +1000 Subject: [Python-ideas] Anonymous namedtuples In-Reply-To: References: <20160419120627.GR1819@ando.pearwood.info> <20160419164440.GW1819@ando.pearwood.info> Message-ID: <20160420024810.GX1819@ando.pearwood.info> On Tue, Apr 19, 2016 at 08:47:13PM +0000, Joseph Martinot-Lagarde wrote: > > SimpleNamespace is neither a sequence nor a mapping. > > > I don't know the exact definition of a mapping, but to me SimpleNamespace is > like a dict with restricted keys (valid attributes only) and a nicer syntax. > `my_dict["key"]` is roughly equivalent to `my_namespace.key`. That is (I believe) how Javascript and PHP treat it, but they're not exactly the best languages to emulate, or at least not blindly. Attributes and keys represent different concepts. Attributes represent an integral part of the object: dog.tail me.head car.engine book.cover while keys represent arbitrary (or nearly so) data associated with some data collection: books['The Lord Of The Rings'] kings['Henry VIII'] prisoner[239410] colours['AliceBlue'] They don't just use different syntaxes, they have different purposes, and while it is tempting to (mis)use the shorter attribute syntax for key lookups: colours.AliceBlue it is risky to conflate the two. Suppose you have a book called "update" (presumably a book of experimental poetry by somebody who dislikes uppercase letters): books.update Either the key shadows the update method, or the method shadows the book, or you get an error. All of these scenarios are bad. 
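The books.update hazard described above is easy to reproduce with the usual attribute-access wrapper around a dict. The AttrDict class here is a made-up illustration of the pattern being warned against, not a stdlib class or a recommendation:

```python
class AttrDict(dict):
    """Dict exposing keys as attributes -- the conflating pattern
    under discussion, shown only to demonstrate the shadowing hazard."""
    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails,
        # so real methods always win over stored keys.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

books = AttrDict()
books['update'] = "a book of experimental poetry"
books['title'] = "The Lord Of The Rings"

assert books['update'] == "a book of experimental poetry"  # key lookup: fine
assert books.title == "The Lord Of The Rings"              # unshadowed key: fine
assert callable(books.update)   # but dict.update shadows the book
```

The last assertion is the failure mode: attribute syntax silently finds the inherited method, and the data is only reachable through the bracket syntax anyway.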
I'd consider giving Python an alternate key-lookup syntax purely as syntactic sugar: colours~AliceBlue # sugar for colours['AliceBlue'] books~update # books['update'] before I would consider adding a standard library class that conflates attribute- and key-lookup. If people want to do this in their own code (and I do see the attraction, even if I think it is a bad idea), so be it, but the standard library shouldn't encourage it. -- Steve From rosuav at gmail.com Tue Apr 19 23:00:29 2016 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 20 Apr 2016 13:00:29 +1000 Subject: [Python-ideas] Anonymous namedtuples In-Reply-To: <20160420024810.GX1819@ando.pearwood.info> References: <20160419120627.GR1819@ando.pearwood.info> <20160419164440.GW1819@ando.pearwood.info> <20160420024810.GX1819@ando.pearwood.info> Message-ID: On Wed, Apr 20, 2016 at 12:48 PM, Steven D'Aprano wrote: > Attributes and keys represent different concepts. Attributes represent > an integral part of the object: > > dog.tail > me.head > car.engine > book.cover > > while keys represent arbitrary (or nearly so) data associated with some > data collection: > > books['The Lord Of The Rings'] > kings['Henry VIII'] > prisoner[239410] > colours['AliceBlue'] > > > They don't just use different syntaxes, they have different purposes, > and while it is tempting to (mis)use the shorter attribute syntax for > key lookups: > > colours.AliceBlue > > it is risky to conflate the two. Enumerations blur the line. You could define 'colours' thus: class colours(enum.IntEnum): AliceBlue = 0xf0f8ff Is "AliceBlue" a fundamental attribute of the "colours" object, or is it a piece of data associated with the collection of colours? Or, like Danny Kaye, perchance a brilliant combination of both? 
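Chris's blurred-line observation is verifiable: an enum member really does answer to attribute access, key-style name lookup, and value lookup alike. A small sketch using his colour value:

```python
import enum

class Colours(enum.IntEnum):
    AliceBlue = 0xF0F8FF

# Attribute access, name lookup, and value lookup all return
# the very same member object:
assert Colours.AliceBlue is Colours['AliceBlue']
assert Colours.AliceBlue is Colours(0xF0F8FF)

# And, being an IntEnum, the member still compares as the
# underlying integer:
assert Colours.AliceBlue == 0xF0F8FF
assert [c.name for c in Colours] == ['AliceBlue']
```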
ChrisA From mertz at gnosis.cx Tue Apr 19 23:12:19 2016 From: mertz at gnosis.cx (David Mertz) Date: Tue, 19 Apr 2016 20:12:19 -0700 Subject: [Python-ideas] random.choice on non-sequence In-Reply-To: <20160414013037.GF1819@ando.pearwood.info> References: <570D6B50.9070005@btinternet.com> <570DA411.3020003@canterbury.ac.nz> <-176815014396412028@unknownmsgid> <570EBE79.6070704@canterbury.ac.nz> <20160414013037.GF1819@ando.pearwood.info> Message-ID: For any positive integer you select (including those with more digits than there are particles in the universe), ALMOST ALL integers are larger than your selection. I.e. the measure of those smaller remains zero. On Apr 13, 2016 6:31 PM, "Steven D'Aprano" wrote: > On Thu, Apr 14, 2016 at 09:47:37AM +1200, Greg Ewing wrote: > > Chris Angelico wrote: > > >On Wed, Apr 13, 2016 at 8:36 PM, Terry Reedy wrote: > > > > > >>On 4/13/2016 12:52 AM, Chris Barker - NOAA Federal wrote: > > >> > > >>>BTW, isn't it impossible to randomly select from an infinite iterable > > >>>anyway? > > >> > > >>With equal probability, yes, impossible. > [...] > > > I think Terry meant that you can't pick just one item that's > > equally likely to be any of the infinitely many items returned > > by the iterator. > > Correct. That's equivalent to chosing a positive integer with uniform > probability distribution and no upper bound. > > > > You can prove that by considering that the probability of > > a given item being returned would have to be 1/infinity, > > which is zero -- so you can't return anything! > > That's not how probability works :-) > > Consider a dart which is thrown at a dartboard. The probability of it > landing on any specific point is zero, since the area of a single point > is zero. Nevertheless, the dart does hit somewhere! > > A formal and precise treatment would have to involve calculus and limits > as the probability approaches zero, rather than a flat out "the > probability is zero, therefore it's impossible". 
> > Slightly less formally, we can say (only horrifying mathematicians a > little bit) that the probability of any specific number is an > infinitesimal number. > > https://en.wikipedia.org/wiki/Infinitesimal > > > While it is *mathematically* meaningful to talk about selecting a random > positive integer uniformly, it's hard to do much more than that. The mean > (average) is undefined[1]. A typical value chosen would have a vast > number of digits, far larger than anything that could be stored in > computer memory. Indeed, Almost All[2] of the values we generate would be > so large that we have no notation for writing it down (and not enough > space in the universe to write it even if we did). So it is impossible > in practice to select a random integer with uniform distribution and no > upper bound. > > Non-uniform distributions, though, are easy :-) > > > > > > [1] While weird, this is not all that weird. For example, selecting > numbers from a Cauchy distribution also has an undefined mean. What this > means in practice is that the *sample mean* will not converge as you > take more and more samples: the more samples you take, the more wildly > the average will jump all over the place. > > https://en.wikipedia.org/wiki/Cauchy_distribution#Estimation_of_parameters > > [2] https://en.wikipedia.org/wiki/Almost_all > > > > -- > Steve > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed...
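None of the above rules out the weaker thing people usually want from random.choice on a non-sequence: a uniform pick from a *finite* iterable whose length simply isn't known up front. That case is solvable in one pass with reservoir sampling - a sketch, not a proposed stdlib API, with a made-up function name:

```python
import random

def choice_from_iterable(iterable, rng=random):
    """Uniformly pick one item from a finite iterable of unknown
    length, using O(1) extra memory (reservoir sampling with k=1).

    The i-th item (1-based) replaces the current pick with
    probability 1/i, which leaves every item with overall
    probability 1/n after n items have been seen."""
    pick = None
    n = 0
    for item in iterable:
        n += 1
        if rng.randrange(n) == 0:   # true with probability 1/n so far
            pick = item
    if n == 0:
        raise IndexError("cannot choose from an empty iterable")
    return pick

# Works on a generator, which random.choice cannot index:
winner = choice_from_iterable(x * x for x in range(10))
```

For a genuinely unbounded iterator this never terminates, which matches the thread's conclusion: uniform choice from an infinite stream is impossible, but deferring the "how many items?" question to the end of a finite stream is not.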
URL: From dan at tombstonezero.net Tue Apr 19 23:18:25 2016 From: dan at tombstonezero.net (Dan Sommers) Date: Wed, 20 Apr 2016 03:18:25 +0000 (UTC) Subject: [Python-ideas] Have REPL print less by default References: <85r3e3sndq.fsf@benfinney.id.au> <85lh4bsip5.fsf@benfinney.id.au> <85d1pmsqvv.fsf@benfinney.id.au> <20160419161855.GU1819@ando.pearwood.info> Message-ID: On Wed, 20 Apr 2016 02:18:58 +1000, Steven D'Aprano wrote: > On Tue, Apr 19, 2016 at 11:12:16PM +1000, Nick Coghlan wrote: >> The default REPL behaviour is appropriate for this "somewhat >> experienced Pythonista tinkering with code to see how it behaves" use >> case - keeping the results very close to what they would be if you >> typed the same line of code into a text file and ran it that >> way. It's not necessarily the best way to *learn* those equivalences, >> but that's also not what it's designed for. That's a pretty powerful argument: running something in the REPL should give the same results as running it from a command line. When things fail differently in different environments, the environments themselves become suspects. > I mostly agree with what you say, but I would like to see one change > to the default sys.excepthook: large numbers of *identical* traceback > lines (as you often get with recursion errors) should be > collapsed. For example: [...] > Try as I might, I just don't see the value of manually counting all > those 'File "", line 3, in fact' lines to find out where the > recursive call failed :-) Yes, I see the smiley, but I would add specifically that the number of calls in the stack trace from a recursive function is rarely the important part. When I write something recursive, and get *that* stack trace, I don't have to scroll anywhere to look at anything to know that I blew the termination condition(s). (I'm agreeing with you: there is no value of counting the number of calls in the stack trace.) 
If I suddenly got tiny stack traces, I'd spend *more* time realizing what went wrong, until I retrained myself. From tjreedy at udel.edu Tue Apr 19 23:33:38 2016 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 19 Apr 2016 23:33:38 -0400 Subject: [Python-ideas] Have REPL print less by default In-Reply-To: References: <85r3e3sndq.fsf@benfinney.id.au> <85lh4bsip5.fsf@benfinney.id.au> <85d1pmsqvv.fsf@benfinney.id.au> <20160419161855.GU1819@ando.pearwood.info> Message-ID: On 4/19/2016 11:18 PM, Dan Sommers wrote: > On Wed, 20 Apr 2016 02:18:58 +1000, Steven D'Aprano wrote: > >> On Tue, Apr 19, 2016 at 11:12:16PM +1000, Nick Coghlan wrote: > >>> The default REPL behaviour is appropriate for this "somewhat >>> experienced Pythonista tinkering with code to see how it behaves" use >>> case - keeping the results very close to what they would be if you >>> typed the same line of code into a text file and ran it that >>> way. It's not necessarily the best way to *learn* those equivalences, >>> but that's also not what it's designed for. > > That's a pretty powerful argument: running something in the REPL should > give the same results as running it from a command line. When things > fail differently in different environments, the environments themselves > become suspects. If RecursionError stack traces were condensed, I would have them continue to be the same (condensed) in either interactive or batch mode. >> I mostly agree with what you say, but I would like to see one change >> to the default sys.excepthook: large numbers of *identical* traceback >> lines (as you often get with recursion errors) should be >> collapsed. For example: > > [...] > >> Try as I might, I just don't see the value of manually counting all >> those 'File "", line 3, in fact' lines to find out where the >> recursive call failed :-) > > Yes, I see the smiley, but I would add specifically that the number of > calls in the stack trace from a recursive function is rarely the > important part. 
When I write something recursive, and get *that* stack > trace, I don't have to scroll anywhere to look at anything to know that > I blew the termination condition(s). > > (I'm agreeing with you: there is no value of counting the number of > calls in the stack trace.) > > If I suddenly got tiny stack traces, I'd spend *more* time realizing > what went wrong, until I retrained myself. -- Terry Jan Reedy From ethan at stoneleaf.us Wed Apr 20 00:07:38 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Tue, 19 Apr 2016 21:07:38 -0700 Subject: [Python-ideas] Anonymous namedtuples In-Reply-To: References: <20160419120627.GR1819@ando.pearwood.info> <20160419164440.GW1819@ando.pearwood.info> <20160420024810.GX1819@ando.pearwood.info> Message-ID: <5717008A.5080102@stoneleaf.us> On 04/19/2016 08:00 PM, Chris Angelico wrote: > On Wed, Apr 20, 2016 at 12:48 PM, Steven D'Aprano wrote: >> Attributes and keys represent different concepts. Attributes represent >> an integral part of the object: >> [...] >> it is risky to conflate the two. > > Enumerations blur the line. You could define 'colours' thus: > > class colours(enum.IntEnum): > AliceBlue = 0xf0f8ff However, enumerations have a very specific purpose -- they aren't a generic collection you can modify and update at will (at least, not without a *lot* of effort). -- ~Ethan~ From vgr255 at live.ca Wed Apr 20 00:07:11 2016 From: vgr255 at live.ca (=?iso-8859-1?Q?=C9manuel_Barry?=) Date: Wed, 20 Apr 2016 00:07:11 -0400 Subject: [Python-ideas] Shrink recursion error tracebacks (was: Have REPL print less by default) Message-ID: This idea was mentioned a couple of times in the previous thread, and it seems reasonable to me. 
Getting recursion errors when testing a function in the interactive prompt often screwed me over, as I have a limited scrollback of 1000 lines at best on Windows (I never checked exactly how high/low the limit was), and with Python's recursion limit of 1000, that's a whopping 1000 to 2000 lines in my console, effectively killing off anything useful I might have wanted to see. Such as, for example, the function definition that triggered the exception. Of course, the same is true for actual programs, where tracebacks drown off everything else useful. For all we know, a typo caused a NameError in one of the functions which somehow triggered it to call itself over again until Python decided it was enough, but you can't know that because your entire scrollback is full of File "", line 1, in func And, for programs that are user-facing, I take all tracebacks and redirect them to a file that the user can submit me so I can debug the issue, and having tens of hundreds of the same line hurts readability. The issue that the output from running a certain script in the REPL and as a script would differ has been raised, and I am of the opinion that in no way, shape or form should the two have different behaviours (the exception being sys.ps1, sys.ps2 and builtins._ in interactive mode, which are 100% fine). I'm suggesting a change to how Python handles specifically recursion limits, to cut after a sane number of times (someone suggested to shrink the output after 3 times), and simply stating how many more identical messages there are. I would also like to extend this to any arbitrary loop of callers (for example foo -> bar -> baz -> foo and so on would be counted as one "call" for the purposes of this proposal). 
Under this proposal, something similar to this would happen: Traceback (most recent call last): File "", line 1, in File "", line 1, in func File "", line 1, in func File "", line 1, in func [Previous 1 message(s) repeated 996 more times] RecursionError: maximum recursion depth exceeded With multiple chained calls (I don't know how hard it would be to implement this, probably not trivial): Traceback (most recent call last): File "", line 1, in File "", line 1, in foo File "", line 1, in bar File "", line 1, in baz File "", line 1, in foo File "", line 1, in bar File "", line 1, in baz File "", line 1, in foo File "", line 1, in bar File "", line 1, in baz [Previous 3 message(s) repeated 330 more times] RecursionError: maximum recursion depth exceeded We can probably afford to cut out immediately as we notice the recursion, but even just 3 times with multiple chained calls isn't going to be anywhere near as long as it currently is. Thoughts? Suggestions? Something I forgot? -Emanuel From ethan at stoneleaf.us Wed Apr 20 00:26:24 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Tue, 19 Apr 2016 21:26:24 -0700 Subject: [Python-ideas] Shrink recursion error tracebacks (was: Have REPL print less by default) In-Reply-To: References: Message-ID: <571704F0.4040701@stoneleaf.us> On 04/19/2016 09:07 PM, Émanuel Barry wrote: > Under this proposal, something similar to this would happen: > > Traceback (most recent call last): > File "", line 1, in > File "", line 1, in func > File "", line 1, in func > File "", line 1, in func > [Previous 1 message(s) repeated 996 more times] > RecursionError: maximum recursion depth exceeded > > With multiple chained calls (I don't know how hard it would be to implement > this, probably not trivial): > > Traceback (most recent call last): > File "", line 1, in > File "", line 1, in foo > File "", line 1, in bar > File "", line 1, in baz > File "", line 1, in foo > File "", line 1, in bar > File "", line 1, in baz > File "", line 1, in
foo > File "", line 1, in bar > File "", line 1, in baz > [Previous 3 message(s) repeated 330 more times] > RecursionError: maximum recursion depth exceeded > Thoughts? Suggestions? Something I forgot? I don't think you'll find anyone opposed -- someone just needs to do the work, and volunteer hours are scarce. :( -- ~Ethan~ From tjreedy at udel.edu Wed Apr 20 02:19:38 2016 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 20 Apr 2016 02:19:38 -0400 Subject: [Python-ideas] Shrink recursion error tracebacks (was: Have REPL print less by default) In-Reply-To: References: Message-ID: On 4/20/2016 12:07 AM, Émanuel Barry wrote: > This idea was mentioned a couple of times in the previous thread, and it > seems reasonable to me. Getting recursion errors when testing a function in > the interactive prompt often screwed me over, as I have a limited scrollback > of 1000 lines at best on Windows (I never checked exactly how high/low the > limit was), and with Python's recursion limit of 1000, that's a whopping > 1000 to 2000 lines in my console, effectively killing off anything useful I > might have wanted to see. Such as, for example, the function definition that > triggered the exception. For those who don't know, the Windows console uses a circular buffer of lines. The default size was once 300, I believe -- too small to contain a run of the test suite. (It seems to be less on Win 10). The max with Win 10 appears to be 999 (x 4? unclear). -- Terry Jan Reedy From jcgoble3 at gmail.com Wed Apr 20 02:41:02 2016 From: jcgoble3 at gmail.com (Jonathan Goble) Date: Wed, 20 Apr 2016 02:41:02 -0400 Subject: [Python-ideas] Shrink recursion error tracebacks (was: Have REPL print less by default) In-Reply-To: References: Message-ID: On Wed, Apr 20, 2016 at 2:19 AM, Terry Reedy wrote: > On 4/20/2016 12:07 AM, Émanuel Barry wrote: >> >> This idea was mentioned a couple of times in the previous thread, and it >> seems reasonable to me.
Getting recursion errors when testing a function >> in >> the interactive prompt often screwed me over, as I have a limited >> scrollback >> of 1000 lines at best on Windows (I never checked exactly how high/low the >> limit was), and with Python's recursion limit of 1000, that's a whopping >> 1000 to 2000 lines in my console, effectively killing off anything useful >> I >> might have wanted to see. Such as, for example, the function definition >> that >> triggered the exception. > > > For those who don't know, the Windows console uses a circular buffer of > lines. The default size was once 300, I believe -- too small to contain a > run of the test suite. (I seems to be less on Win 10). The max with win 10 > appears to be 999 (x 4? unclear). I'm guessing you pulled the 999x4 from the Options tab of the properties dialog, but that option is actually for command history (i.e. the ability to use the up arrow to recall previous commands). That has a limit of 999 commands in the buffer, with a limit of 999 (not 4, which is the default) total buffers in memory across all open CMD.EXE instances. (Not sure what happens when you open more instances than that limit is set to.) The setting for scrollback is found under the Layout tab, and is disguised as the Height option in the Screen Buffer Size section. That has a limit of 9,999 lines, and I do believe it defaulted to 300 (I changed that setting so long ago that I no longer remember for sure what the stock default was). From rob.cliffe at btinternet.com Wed Apr 20 05:41:46 2016 From: rob.cliffe at btinternet.com (Rob Cliffe) Date: Wed, 20 Apr 2016 10:41:46 +0100 Subject: [Python-ideas] Shrink recursion error tracebacks (was: Have REPL print less by default) In-Reply-To: References: Message-ID: <57174EDA.2000502@btinternet.com> Perhaps to simplify the implementation, one could just look for repetitions/cycles that include the LAST trace line. 
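Rob's simplification - only look for a repeating cycle that runs right up to the last frame - is quite implementable over frame summaries such as the (filename, lineno, name) triples you can build from traceback.extract_tb(). The helper name and return shape below are made up for illustration; this is a sketch of the idea, not a concrete CPython patch:

```python
def collapse_tail_cycle(frames, min_repeats=3):
    """Find the shortest cycle that repeats up to the END of `frames`
    at least `min_repeats` times in a row.  `frames` is any list of
    hashable frame summaries, e.g. (filename, lineno, name) tuples.

    Returns (frames_to_print, cycle_length, collapsed_repeats), where
    frames_to_print keeps one copy of the cycle and collapsed_repeats
    is how many further copies were dropped."""
    n = len(frames)
    for cycle in range(1, n // min_repeats + 1):
        reps = 1
        # Walk backwards, comparing each preceding window of `cycle`
        # frames against the final window.
        while ((reps + 1) * cycle <= n
               and frames[n - (reps + 1) * cycle:n - reps * cycle]
                   == frames[n - cycle:]):
            reps += 1
        if reps >= min_repeats:
            return frames[:n - (reps - 1) * cycle], cycle, reps - 1
    return frames, 0, 0  # nothing repeats enough: print as-is
```

With a '<module>' frame followed by the same 'fact' frame 500 times, this keeps two frames and reports 499 dropped repeats; the foo -> bar -> baz loop from the proposal collapses the same way with a cycle length of 3. Comparing extracted summaries rather than formatted strings also sidesteps the single-line-versus-pair formatting question raised later in the thread.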
Rob Cliffe On 20/04/2016 05:07, Émanuel Barry wrote: > This idea was mentioned a couple of times in the previous thread, and it > seems reasonable to me. Getting recursion errors when testing a function in > the interactive prompt often screwed me over, as I have a limited scrollback > of 1000 lines at best on Windows (I never checked exactly how high/low the > limit was), and with Python's recursion limit of 1000, that's a whopping > 1000 to 2000 lines in my console, effectively killing off anything useful I > might have wanted to see. Such as, for example, the function definition that > triggered the exception. > > Of course, the same is true for actual programs, where tracebacks drown off > everything else useful. For all we know, a typo caused a NameError in one of > the functions which somehow triggered it to call itself over again until > Python decided it was enough, but you can't know that because your entire > scrollback is full of > > File "", line 1, in func > > And, for programs that are user-facing, I take all tracebacks and redirect > them to a file that the user can submit me so I can debug the issue, and > having tens of hundreds of the same line hurts readability. > > The issue that the output from running a certain script in the REPL and as a > script would differ has been raised, and I am of the opinion that in no way, > shape or form should the two have different behaviours (the exception being > sys.ps1, sys.ps2 and builtins._ in interactive mode, which are 100% fine). > I'm suggesting a change to how Python handles specifically recursion limits, > to cut after a sane number of times (someone suggested to shrink the output > after 3 times), and simply stating how many more identical messages there > are. I would also like to extend this to any arbitrary loop of callers (for > example foo -> bar -> baz -> foo and so on would be counted as one "call" > for the purposes of this proposal).
> > Under this proposal, something similar to this would happen: > > Traceback (most recent call last): > File "", line 1, in > File "", line 1, in func > File "", line 1, in func > File "", line 1, in func > [Previous 1 message(s) repeated 996 more times] > RecursionError: maximum recursion depth exceeded > > With multiple chained calls (I don't know how hard it would be to implement > this, probably not trivial): > > Traceback (most recent call last): > File "", line 1, in > File "", line 1, in foo > File "", line 1, in bar > File "", line 1, in baz > File "", line 1, in foo > File "", line 1, in bar > File "", line 1, in baz > File "", line 1, in foo > File "", line 1, in bar > File "", line 1, in baz > [Previous 3 message(s) repeated 330 more times] > RecursionError: maximum recursion depth exceeded > > We can probably afford to cut out immediately as we notice the recursion, > but even just 3 times with multiple chained calls isn't going to be anywhere > near as long as it currently is. > > Thoughts? Suggestions? Something I forgot? > > -Emanuel > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > From k7hoven at gmail.com Wed Apr 20 06:39:21 2016 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Wed, 20 Apr 2016 13:39:21 +0300 Subject: [Python-ideas] Have REPL print less by default In-Reply-To: References: <85r3e3sndq.fsf@benfinney.id.au> <85lh4bsip5.fsf@benfinney.id.au> <85d1pmsqvv.fsf@benfinney.id.au> <20160419161855.GU1819@ando.pearwood.info> Message-ID: On Wed, Apr 20, 2016 at 1:28 AM, Terry Reedy wrote: > On 4/19/2016 12:18 PM, Steven D'Aprano wrote: >> >> I mostly agree with what you say, but I would like to see one change to >> the default sys.excepthook: large numbers of *identical* traceback lines >> (as you often get with recursion errors) should be collapsed. 
For >> example: > > > Tracebacks produce *pairs* of lines: the location and the line itself. Not always, as Steven's example shows. For example: def fun(): fun() fun() Does that. But yes, as I wrote in my previous email, it should recognize a block of several lines repeating, too. -Koos From k7hoven at gmail.com Wed Apr 20 06:41:19 2016 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Wed, 20 Apr 2016 13:41:19 +0300 Subject: [Python-ideas] Shrink recursion error tracebacks (was: Have REPL print less by default) In-Reply-To: <57174EDA.2000502@btinternet.com> References: <57174EDA.2000502@btinternet.com> Message-ID: On Wed, Apr 20, 2016 at 12:41 PM, Rob Cliffe wrote: > Perhaps to simplify the implementation, one could just look for > repetitions/cycles that include the LAST trace line. > Rob Cliffe > Maybe you mean the last line before the exception message? -Koos From rob.cliffe at btinternet.com Wed Apr 20 07:12:44 2016 From: rob.cliffe at btinternet.com (Rob Cliffe) Date: Wed, 20 Apr 2016 12:12:44 +0100 Subject: [Python-ideas] Shrink recursion error tracebacks (was: Have REPL print less by default) In-Reply-To: References: <57174EDA.2000502@btinternet.com> Message-ID: <5717642C.8010600@btinternet.com> Yes of course. On 20/04/2016 11:41, Koos Zevenhoven wrote: > On Wed, Apr 20, 2016 at 12:41 PM, Rob Cliffe wrote: >> Perhaps to simplify the implementation, one could just look for >> repetitions/cycles that include the LAST trace line. >> Rob Cliffe >> > Maybe you mean the last line before the exception message? 
> > -Koos > From ericsnowcurrently at gmail.com Wed Apr 20 12:00:46 2016 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Wed, 20 Apr 2016 10:00:46 -0600 Subject: [Python-ideas] Anonymous namedtuples In-Reply-To: References: Message-ID: On Tue, Apr 19, 2016 at 4:06 AM, Chris Angelico wrote: > There have been proposals to have kwargs retain some order (either by > having it actually be an OrderedDict, or by changing the native dict > type to retain order under fairly restricted circumstances). With > that, you could craft the factory function easily. > > It may be worth dusting off one of those proposals and seeing if it > can move forward. FYI, PEP 468: "Preserving the order of **kwargs in a function." [1] The PEP exists for exactly the sort of use case being discussed here. In my case it was an alternate enum proposal I was making back when enums were under discussion. Regardless, I'd still like to see the PEP land. It's just a matter of finding time. I'll see if I can get some consensus leading up to and at PyCon next month. The actual implementation for PEP 468 is almost trivial. Now that we have a C-implementation of OrderedDict, it's mostly just a matter of addressing performance concerns [2], particularly those expressed by Guido, or falling back to one of the alternatives discussed in the PEP. Note that a related concept is represented with my work to make the class definition namespace ordered by default. [3] It would allow class decorators to have access to the order in which the class's attributes were defined. 
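As it happens, PEP 468 was accepted for Python 3.6, where **kwargs is guaranteed to preserve call order, so the order-dependent factory Chris describes became expressible directly. A sketch with made-up names (an illustration of the idea, not a stdlib proposal):

```python
from collections import namedtuple

def make_record(**kwargs):
    """Hypothetical anonymous-namedtuple factory.  Correct field order
    relies on **kwargs preserving call order (PEP 468, Python 3.6+)."""
    cls = namedtuple('record', list(kwargs))
    return cls(**kwargs)

pt = make_record(x=12, y=16)
assert pt._fields == ('x', 'y')     # fields come out in call order
assert (pt.x, pt.y) == (12, 16)
assert pt == (12, 16)               # still an ordinary tuple underneath
```

Each call builds a fresh class, so this trades the namedtuple caching question for simplicity; any real design would have to decide whether records with the same field names should share a type.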
-eric [1] https://www.python.org/dev/peps/pep-0468/ [2] https://www.python.org/dev/peps/pep-0468/#performance [3] http://bugs.python.org/issue24254 From tjreedy at udel.edu Wed Apr 20 13:46:21 2016 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 20 Apr 2016 13:46:21 -0400 Subject: [Python-ideas] Have REPL print less by default In-Reply-To: References: <85r3e3sndq.fsf@benfinney.id.au> <85lh4bsip5.fsf@benfinney.id.au> <85d1pmsqvv.fsf@benfinney.id.au> <20160419161855.GU1819@ando.pearwood.info> Message-ID: On 4/20/2016 6:39 AM, Koos Zevenhoven wrote: > On Wed, Apr 20, 2016 at 1:28 AM, Terry Reedy wrote: >> On 4/19/2016 12:18 PM, Steven D'Aprano wrote: >>> >>> I mostly agree with what you say, but I would like to see one change to >>> the default sys.excepthook: large numbers of *identical* traceback lines >>> (as you often get with recursion errors) should be collapsed. For >>> example: >> >> >> Tracebacks produce *pairs* of lines: the location and the line itself. > > Not always, as Steven's example shows. For example: > > def fun(): > fun() > > fun() > > Does that. The above produces pairs of lines, so I do not understand your point. -- Terry Jan Reedy From k7hoven at gmail.com Wed Apr 20 13:57:41 2016 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Wed, 20 Apr 2016 20:57:41 +0300 Subject: [Python-ideas] Have REPL print less by default In-Reply-To: References: <85r3e3sndq.fsf@benfinney.id.au> <85lh4bsip5.fsf@benfinney.id.au> <85d1pmsqvv.fsf@benfinney.id.au> <20160419161855.GU1819@ando.pearwood.info> Message-ID: On Wed, Apr 20, 2016 at 8:46 PM, Terry Reedy wrote: > On 4/20/2016 6:39 AM, Koos Zevenhoven wrote: >> >> On Wed, Apr 20, 2016 at 1:28 AM, Terry Reedy wrote: >>> >>> On 4/19/2016 12:18 PM, Steven D'Aprano wrote: >>>> >>>> >>>> I mostly agree with what you say, but I would like to see one change to >>>> the default sys.excepthook: large numbers of *identical* traceback lines >>>> (as you often get with recursion errors) should be collapsed. 
For >>>> example: >>> >>> >>> >>> Tracebacks produce *pairs* of lines: the location and the line itself. >> >> >> Not always, as Steven's example shows. For example: >> >> def fun(): >> fun() >> >> fun() >> >> Does that. > > > The above produces pairs of lines, so I do not understand your point. > Strange. for me it produces a repeating single line. Tried both on 2.7.6. and 3.5.1. Probably not worth discussing, though. -Koos From random832 at fastmail.com Wed Apr 20 14:15:39 2016 From: random832 at fastmail.com (Random832) Date: Wed, 20 Apr 2016 14:15:39 -0400 Subject: [Python-ideas] Have REPL print less by default In-Reply-To: References: <85r3e3sndq.fsf@benfinney.id.au> <85lh4bsip5.fsf@benfinney.id.au> <85d1pmsqvv.fsf@benfinney.id.au> <20160419161855.GU1819@ando.pearwood.info> Message-ID: <1461176139.2978840.584702161.1EBCB3FD@webmail.messagingengine.com> On Wed, Apr 20, 2016, at 13:57, Koos Zevenhoven wrote: > On Wed, Apr 20, 2016 at 8:46 PM, Terry Reedy wrote: > > On 4/20/2016 6:39 AM, Koos Zevenhoven wrote: > >> def fun(): > >> fun() > >> > >> fun() > >> > >> Does that. > > > > > > The above produces pairs of lines, so I do not understand your point. > > > > Strange. for me it produces a repeating single line. Tried both on > 2.7.6. and 3.5.1. Probably not worth discussing, though. It produces single lines in the interactive interpreter, pairs in a file. Any real implementation should do a comparison at the traceback data level, though, rather than the string. From leewangzhong+python at gmail.com Wed Apr 20 14:25:57 2016 From: leewangzhong+python at gmail.com (Franklin? 
Lee) Date: Wed, 20 Apr 2016 14:25:57 -0400 Subject: [Python-ideas] Have REPL print less by default In-Reply-To: References: <85r3e3sndq.fsf@benfinney.id.au> <85lh4bsip5.fsf@benfinney.id.au> <85d1pmsqvv.fsf@benfinney.id.au> Message-ID: On Apr 19, 2016 9:12 AM, "Nick Coghlan" wrote: > > Implicit side effects from hidden code break that mental equivalence - it's why effective uses of metaclasses, monkeypatching, and other techniques for deliberately introducing changed implicit behaviour often also involve introducing some kind of more local signal to help convey what is going on (such as a naming convention, or ensuring the altered behaviour is used consistently across the entire project). But a local signal _is_ part of my original proposal: """Instead, it will print a few lines ("... and approximately X more lines"), and tell you how to print more. (E.g. "Call '_more()' for more. Call '_full()' for full output.")""" """Again, there should be a message telling you how to get the full stacktrace printed. EXACTLY how, preferably in a way that is easy to type, so that a typo won't cause the trace to be lost. It should not use `sys.something()`, because the user's first few encounters with this message will result in, "NameError: name 'sys' is not defined".""" This also satisfies Terry Reedy's "reversibility" condition. On Apr 19, 2016 10:03 AM, "Michael Selik" wrote: > > On Tue, Apr 19, 2016 at 5:36 AM Franklin? Lee < leewangzhong+python at gmail.com> wrote: >> >> On Apr 19, 2016 4:09 AM, "Paul Moore" wrote: >> > Basic users should probably be using a tool like IDLE, which has a bit >> > more support for beginners than the raw REPL. >> >> My college had CS students SSH into the department's Linux server to compile and run their code, and many teachers don't believe that students should start with fancy IDE featues like, er, syntax highlighting. 
> > That's probably because your professors thought you were more advanced than other new Pythonistas, because you were CS students. If I were in their shoes, I might chose a different approach depending on the level of the course. I meant to have two separate clauses: - In my college, there was a server for compiling and running code. It allowed people to do so without installing a compiler (e.g. on a general-purpose school computer). It also gave a standard compiler to test against. I did not learn Python in school. - Many teachers, all over the world, do not believe in IDE features for beginners. (Therefore, there will be many *students* which don't learn about IDLE.) You can want them to stop thinking that way, but the students will still be out there. > > But that doesn't answer my question: would the proposed change hurt your workflow? > > It might. Would it affect doctests? Would it make indirect infinite recursion more difficult to trace? Would it make me remember yet another command line option or REPL option to turn on complete reprs? Would it force me to explain yet another config setting to new programmers? Let's be clear about what I am proposing. I prioritize not losing info and making it obvious how to get at that info. My suggestion was to show less output *and* have a message at the end saying how to get the output. It obviously should not affect doctests, since the output there is not for humans. An implementation that affects doctests should be considered buggy. Perhaps have an option to allow it, but affecting doctests by default would be an actual backward incompatibility, since it changes how existing code *runs* (i.e. not just output, but different logical paths during runtime). My idea for tracebacks of mutually recursive calls: See which functions are on the stack more than once, and if they keep appearing, pack them up. 
The algorithm should not pack up calls which happen only once within an apparent cycle, and it should be clear which order the calls come in (outside of folded mutually-recursive calls, of course). Example:

File "", line 1, in f
File "", line 1, in g
[Mutually recursive calls hidden: f (300), g (360)]
File "", line 1, in h
File "", line 1, in f
File "", line 1, in g
[Mutual-recursive calls hidden: f (103), g (200)]
RuntimeError: maximum recursion depth exceeded
[963 calls hidden. Call _full_ex() to print full trace.]

Or maybe the second f and g are also folded into the second hidden bit. And maybe it checks line numbers when deciding whether to print explicitly (but not when folding). Would that output make indirect infinite recursion more difficult for you to debug?

> I think a beginner understands when they've printed something too big. I see this happen frequently. They laugh, shake their heads, and retype whatever they need to.

I am not proposing this as a better error message, but because a flooded terminal is loss of information. It is also good to have significant info close together, to reduce effort (and thus mental cache) between processing of related ideas. Have you never needed to see the output of previous lines? Re-entering the earlier line might not work if the state of the program has changed. This happens at the beginner level.

From random832 at fastmail.com Wed Apr 20 15:01:48 2016
From: random832 at fastmail.com (Random832)
Date: Wed, 20 Apr 2016 15:01:48 -0400
Subject: [Python-ideas] Have REPL print less by default
In-Reply-To: References: <85r3e3sndq.fsf@benfinney.id.au> <85lh4bsip5.fsf@benfinney.id.au> <85d1pmsqvv.fsf@benfinney.id.au>
Message-ID: <1461178908.2990267.584734985.0A30EA93@webmail.messagingengine.com>

On Wed, Apr 20, 2016, at 14:25, Franklin? Lee wrote:
> Example:
> File "", line 1, in f
> File "", line 1, in g
> [Mutually recursive calls hidden: f (300), g (360)]
> File "", line 1, in h
> File "", line 1, in f
> File "", line 1, in g
> [Mutual-recursive calls hidden: f (103), g (200)]
> RuntimeError: maximum recursion depth exceeded
> [963 calls hidden. Call _full_ex() to print full trace.]
>
> Or maybe the second f and g are also folded into the second hidden
> bit. And maybe it checks line numbers when deciding whether to print
> explicitly (but not when folding).
>
> Would that output make indirect infinite recursion more difficult for
> you to debug?

You know what would make complicated infinite recursion easier to debug? The arguments. Is there a reliable way to determine what in f_locals correspond to arguments? My toy example below only works for named positional arguments.

def magic(frame):
    code = frame.f_code
    fname = code.co_name
    argcount = code.co_argcount
    args = code.co_varnames[:argcount]
    values = tuple(frame.f_locals[a] for a in args)
    result = '%s%r' % (fname, values)
    if len(result) > 64:
        return fname + '(...)'
    return result

Maybe even tokenize the source line the error occurred on and print any other locals whose name matches any token on the line. I guess I'll leave it to the bikeshed design committee and say: WIBNI tracebacks printed relevant frame variables FSDO relevant?
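Random832's toy runs as-is; here is a self-contained exercise of it (the sample function `spam` and its arguments are invented for illustration):

```python
import sys

def magic(frame):
    # Render "name(arg values...)" from a frame; as noted above, only
    # named positional arguments are recovered this way.
    code = frame.f_code
    args = code.co_varnames[:code.co_argcount]
    values = tuple(frame.f_locals[a] for a in args)
    result = '%s%r' % (code.co_name, values)
    if len(result) > 64:
        return code.co_name + '(...)'
    return result

def spam(n, label):
    # A real excepthook would take frames from the traceback object;
    # grabbing our own frame is enough to exercise the helper.
    return magic(sys._getframe())

print(spam(3, 'x'))  # spam(3, 'x')
```

When the formatted result exceeds 64 characters it falls back to `name(...)`, so huge argument values cannot themselves flood the traceback.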
From tjreedy at udel.edu Wed Apr 20 15:24:42 2016 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 20 Apr 2016 15:24:42 -0400 Subject: [Python-ideas] Have REPL print less by default In-Reply-To: <1461176139.2978840.584702161.1EBCB3FD@webmail.messagingengine.com> References: <85r3e3sndq.fsf@benfinney.id.au> <85lh4bsip5.fsf@benfinney.id.au> <85d1pmsqvv.fsf@benfinney.id.au> <20160419161855.GU1819@ando.pearwood.info> <1461176139.2978840.584702161.1EBCB3FD@webmail.messagingengine.com> Message-ID: On 4/20/2016 2:15 PM, Random832 wrote: > On Wed, Apr 20, 2016, at 13:57, Koos Zevenhoven wrote: >> On Wed, Apr 20, 2016 at 8:46 PM, Terry Reedy wrote: >>> On 4/20/2016 6:39 AM, Koos Zevenhoven wrote: >>>> def fun(): >>>> fun() >>>> >>>> fun() >>>> >>>> Does that. >>> >>> >>> The above produces pairs of lines, so I do not understand your point. >>> >> >> Strange. for me it produces a repeating single line. Tried both on >> 2.7.6. and 3.5.1. Probably not worth discussing, though. > > It produces single lines in the interactive interpreter, pairs in a > file. Aha. In IDLE's Shell, which generally simulates the interactive interpreter quite well, pairs are printed because interactive user input is exec'ed in a process running Python in normal batch mode. I expect any GUI shell to act similarly. I consider interactive interpreter omission of the line at fault to be a design buglet. For a single line of input, it is not much of a problem to look back up. But if one pastes a 20-line statement, finding line 13, for instance, is not so quick. And on Windows, a 1000 line traceback may erase the input so there is nothing to look at. 
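The folding idea in this thread can be sketched on top of traceback.extract_tb, which yields one entry per frame whether or not the source line is available, so it covers both the single-line and the pair-of-lines cases discussed above. The helper name and the bracketed wording are illustrative, not an existing API:

```python
import traceback
from itertools import groupby

def _entry(f):
    # One traceback entry: the location line, plus the source line when found.
    text = '  File "%s", line %s, in %s' % (f.filename, f.lineno, f.name)
    if f.line:
        text += '\n    ' + f.line
    return text

def format_collapsed_tb(tb, min_repeats=3):
    # Fold runs of identical (file, line, function) entries, keeping the
    # first one and replacing the rest with a repeat count.
    out = ['Traceback (most recent call last):']
    entries = traceback.extract_tb(tb)
    for _, run in groupby(entries, key=lambda f: (f.filename, f.lineno, f.name)):
        run = list(run)
        if len(run) >= min_repeats:
            out.append(_entry(run[0]))
            out.append('  [...repeat previous entry %d times...]' % (len(run) - 1))
        else:
            out.extend(_entry(f) for f in run)
    return '\n'.join(out)

def fun():
    fun()

try:
    fun()
except RecursionError as exc:
    print(format_collapsed_tb(exc.__traceback__))
```

CPython later gained similar built-in folding in the traceback module ("[Previous line repeated N more times]").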
-- Terry Jan Reedy From steve at pearwood.info Wed Apr 20 21:03:07 2016 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 21 Apr 2016 11:03:07 +1000 Subject: [Python-ideas] Have REPL print less by default In-Reply-To: <1461178908.2990267.584734985.0A30EA93@webmail.messagingengine.com> References: <85d1pmsqvv.fsf@benfinney.id.au> <1461178908.2990267.584734985.0A30EA93@webmail.messagingengine.com> Message-ID: <20160421010307.GA1819@ando.pearwood.info> On Wed, Apr 20, 2016 at 03:01:48PM -0400, Random832 wrote: > You know what would make complicated infinite recursion easier to debug? > The arguments. Check out the cgitb module, which installs an except hook which (among many other things) prints the arguments to the functions. Run this sample code: import cgitb cgitb.enable(format='text') def spam(arg): if arg == 0: raise ValueError('+++ out of cheese error, redo from start +++') return spam(arg - 1) def eggs(x): return spam(x) + 1 def cheese(a, b, c): return a or b or eggs(c) cheese(0, None, 1) and you will see output something like the following. (For brevity I have compressed some of the output.) ValueError Python 3.3.0rc3: /usr/local/bin/python3.3 Thu Apr 21 10:58:22 2016 A problem occurred in a Python script. Here is the sequence of function calls leading up to the error, in the order they occurred. /home/steve/python/ in () /home/steve/python/ in cheese(a=0, b=None, c=2) /home/steve/python/ in eggs(x=2) /home/steve/python/ in spam(arg=2) /home/steve/python/ in spam(arg=1) /home/steve/python/ in spam(arg=0) ValueError: +++ out of cheese error, redo from start +++ __cause__ = None __class__ = [...] __traceback__ = args = ('+++ out of cheese error, redo from start +++',) with_traceback = The above is a description of an error in a Python program. 
Here is the original traceback: Traceback (most recent call last): File "", line 1, in File "", line 2, in cheese File "", line 2, in eggs File "", line 4, in spam File "", line 4, in spam File "", line 3, in spam ValueError: +++ out of cheese error, redo from start +++ -- Steve From steve at pearwood.info Wed Apr 20 21:58:52 2016 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 21 Apr 2016 11:58:52 +1000 Subject: [Python-ideas] Have REPL print less by default In-Reply-To: References: <85lh4bsip5.fsf@benfinney.id.au> <85d1pmsqvv.fsf@benfinney.id.au> <20160419161855.GU1819@ando.pearwood.info> Message-ID: <20160421015850.GB1819@ando.pearwood.info> On Tue, Apr 19, 2016 at 06:28:40PM -0400, Terry Reedy wrote: > On 4/19/2016 12:18 PM, Steven D'Aprano wrote: > >I mostly agree with what you say, but I would like to see one change to > >the default sys.excepthook: large numbers of *identical* traceback lines > >(as you often get with recursion errors) should be collapsed. For > >example: > > Tracebacks produce *pairs* of lines: the location and the line itself. Only if the source code is available. The source isn't available for functions defined in the REPL, for C code, or for Python functions read from a .pyc file where the .py file is not available. The code I gave before works with pairs of location + source, without any change. If I move the definition of fact into a file, so that the source lines are included, we get: py> sys.setrecursionlimit(10) py> fact(30) Traceback (most recent call last): File "", line 1, in File "/home/steve/python/fact.py", line 3, in fact return n*fact(n-1) [...repeat previous line 7 times...] File "/home/steve/python/fact.py", line 2, in fact if n < 1: return 1 RuntimeError: maximum recursion depth exceeded in comparison I'd want to adjust the wording. Perhaps "previous entry" rather than previous line? > Replacing pairs with a count of repetitions would not lose information, > and would make the info more visible. 
I would require at least, say, 3
> repetitions before collapsing.

That's what my example does: it only collapses the line/(pair of lines) if there are at least three identical repetitions.

--
Steve

From steve at pearwood.info Wed Apr 20 22:46:20 2016
From: steve at pearwood.info (Steven D'Aprano)
Date: Thu, 21 Apr 2016 12:46:20 +1000
Subject: [Python-ideas] Shrink recursion error tracebacks (was: Have REPL print less by default)
In-Reply-To: References:
Message-ID: <20160421024620.GD1819@ando.pearwood.info>

On Wed, Apr 20, 2016 at 12:07:11AM -0400, Émanuel Barry wrote:
> This idea was mentioned a couple of times in the previous thread, and it
> seems reasonable to me. Getting recursion errors when testing a function in
> the interactive prompt often screwed me over, as I have a limited scrollback
> of 1000 lines at best on Windows (I never checked exactly how high/low the
> limit was), and with Python's recursion limit of 1000, that's a whopping
> 1000 to 2000 lines in my console, effectively killing off anything useful I
> might have wanted to see. Such as, for example, the function definition that
> triggered the exception.

Even if you have an effectively unlimited scrollback, getting thousands of identical lines is a PITA and rather distracting and annoying.

> I'm suggesting a change to how Python handles specifically recursion limits,

We should not limit this to specifically *recursion* limits, despite the error message printed out, it is not just recursive calls that count towards the recursion limit. It is any function call. In practice, though, it is difficult to run into this limit with regular non-recursive calls, but not impossible.

> to cut after a sane number of times (someone suggested to shrink the output
> after 3 times), and simply stating how many more identical messages there
> are.
That would be me :-)

> I would also like to extend this to any arbitrary loop of callers (for
> example foo -> bar -> baz -> foo and so on would be counted as one "call"
> for the purposes of this proposal).

Seems reasonable.

> We can probably afford to cut out immediately as we notice the recursion,

No you can't. A bunch of recursive calls may be followed by non-recursive calls, or a different set of recursive calls. It might not even be a RuntimeError at the end. For example, put these four functions in a file, "fact.py":

def fact(n):
    if n < 1: return 1
    return n*fact(n-1)

def rec(n):
    if n < 20: return spam(n)
    return n + rec(n-1)

def spam(n):
    return eggs(2*n)

def eggs(arg):
    return fact(arg//2)

Now I run this:

import fact
sys.setrecursionlimit(35)
fact.rec(30)

and I get this shorter traceback:

Traceback (most recent call last):
  File "", line 1, in
  File "/home/steve/python/fact.py", line 9, in rec
    return n + rec(n-1)
  [...repeat previous line 10 times...]
  File "/home/steve/python/fact.py", line 8, in rec
    return spam(n)
  File "/home/steve/python/fact.py", line 12, in spam
    return eggs(2*n)
  File "/home/steve/python/fact.py", line 15, in eggs
    return fact(arg//2)
  File "/home/steve/python/fact.py", line 3, in fact
    return n*fact(n-1)
  [...repeat previous line 18 times...]
  File "/home/steve/python/fact.py", line 2, in fact
    if n < 1: return 1
RuntimeError: maximum recursion depth exceeded in comparison

The point being, you cannot just stop processing traces when you find one repeated.

--
Steve

From mistersheik at gmail.com Wed Apr 20 22:51:37 2016
From: mistersheik at gmail.com (Neil Girdhar)
Date: Wed, 20 Apr 2016 19:51:37 -0700 (PDT)
Subject: [Python-ideas] Override dict.__new__ to raise if cls is not dict; do the same for str, list, etc.
Message-ID:

Sometimes users inherit from builtin types only to find that their overridden methods are not called.
Instead of this being a trap for unsuspecting users, I suggest overriding the __new__ method of these types so that it will raise with an informative exception explaining that, e.g., instead of inheriting from dict, you should inherit from UserDict.

I suggest this modification to any Python implementation that has special versions of classes that cannot easily be extended, such as CPython. If another Python implementation allows dict (e.g.) to be extended easily, then it doesn't have to raise.

Best,

Neil

From ethan at stoneleaf.us Wed Apr 20 23:02:29 2016
From: ethan at stoneleaf.us (Ethan Furman)
Date: Wed, 20 Apr 2016 20:02:29 -0700
Subject: [Python-ideas] Override dict.__new__ to raise if cls is not dict; do the same for str, list, etc.
In-Reply-To: References:
Message-ID: <571842C5.9060700@stoneleaf.us>

On 04/20/2016 07:51 PM, Neil Girdhar wrote:
> Sometimes users inherit from builtin types only to find that their
> overridden methods are not called. Instead of this being a trap for
> unsuspecting users, I suggest overriding the __new__ method of these
> types so that it will raise with an informative exception explaining
> that, e.g., instead of inheriting from dict, you should inherit from
> UserDict.

How, exactly, does that help? And even if it does help (which I doubt), you're willing to break currently working code that successfully subclasses dicts, strings, tuples, etc., that then pass through dicts, strings, tuples, etc., to build their new class? No thank you.

--
~Ethan~

From mistersheik at gmail.com Wed Apr 20 22:55:39 2016
From: mistersheik at gmail.com (Neil Girdhar)
Date: Wed, 20 Apr 2016 19:55:39 -0700 (PDT)
Subject: [Python-ideas] Override dict.__new__ to raise if cls is not dict; do the same for str, list, etc.
In-Reply-To: References: Message-ID:

Specifically, I'm suggesting something like

class dict:
    def __new__(cls):
        if cls is not dict:
            raise RuntimeError("Cannot inherit from dict; inherit from UserDict instead")
        super().__new__(cls)

On Wednesday, April 20, 2016 at 10:51:37 PM UTC-4, Neil Girdhar wrote:
>
> Sometimes users inherit from builtin types only to find that their
> overridden methods are not called. Instead of this being a trap for
> unsuspecting users, I suggest overriding the __new__ method of these types
> so that it will raise with an informative exception explaining that, e.g.,
> instead of inheriting from dict, you should inherit from UserDict.
>
> I suggest this modification to any Python implementation that has special
> versions of classes that cannot easily be extended, such as CPython. If
> another Python implementation allows dict (e.g.) to be extended easily,
> then it doesn't have to raise.
>
> Best,
>
> Neil
>

From steve at pearwood.info Wed Apr 20 23:15:43 2016
From: steve at pearwood.info (Steven D'Aprano)
Date: Thu, 21 Apr 2016 13:15:43 +1000
Subject: [Python-ideas] Override dict.__new__ to raise if cls is not dict; do the same for str, list, etc.
In-Reply-To: References:
Message-ID: <20160421031543.GE1819@ando.pearwood.info>

On Wed, Apr 20, 2016 at 07:51:37PM -0700, Neil Girdhar wrote:
> Sometimes users inherit from builtin types only to find that their
> overridden methods are not called. Instead of this being a trap for
> unsuspecting users, I suggest overriding the __new__ method of these types
> so that it will raise with an informative exception explaining that, e.g.,
> instead of inheriting from dict, you should inherit from UserDict.

-1

What about those who don't want or need to inherit from UserDict, and are perfectly happy with inheriting from dict? Why should we break their working code for the sake of people whose code already isn't working?
Not everyone who inherits from dict tries to override __getitem__ and __setitem__ and are then surprised that other methods don't call their overridden methods. We shouldn't punish them. -- Steve From mistersheik at gmail.com Thu Apr 21 00:33:33 2016 From: mistersheik at gmail.com (Neil Girdhar) Date: Thu, 21 Apr 2016 04:33:33 +0000 Subject: [Python-ideas] Override dict.__new__ to raise if cls is not dict; do the same for str, list, etc. In-Reply-To: <20160421031543.GE1819@ando.pearwood.info> References: <20160421031543.GE1819@ando.pearwood.info> Message-ID: I think inheriting directly from dict is simply bad code because CPython doesn't promise that any of your overridden methods will be called. The fact that it silently doesn't call them is an inscrutable trap. And really, it's not much of a "punishment" to simply change your base class name. On Wed, Apr 20, 2016 at 11:16 PM Steven D'Aprano wrote: > On Wed, Apr 20, 2016 at 07:51:37PM -0700, Neil Girdhar wrote: > > Sometimes users inherit from builtin types only to find that their > > overridden methods are not called. Instead of this being a trap for > > unsuspecting users, I suggest overriding the __new__ method of these > types > > so that it will raise with an informative exception explaining that, > e.g., > > instead of inheriting from dict, you should inherit from UserDict. > > -1 > > What about those who don't want or need to inherit from UserDict, and > are perfectly happy with inheriting from dict? Why should we break their > working code for the sake of people whose code already isn't working? > > Not everyone who inherits from dict tries to override __getitem__ and > __setitem__ and are then surprised that other methods don't call their > overridden methods. We shouldn't punish them. 
> > > -- > Steve > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > -- > > --- > You received this message because you are subscribed to a topic in the > Google Groups "python-ideas" group. > To unsubscribe from this topic, visit > https://groups.google.com/d/topic/python-ideas/7nFQURLhlQY/unsubscribe. > To unsubscribe from this group and all its topics, send an email to > python-ideas+unsubscribe at googlegroups.com. > For more options, visit https://groups.google.com/d/optout. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vgr255 at live.ca Thu Apr 21 00:55:11 2016 From: vgr255 at live.ca (=?iso-8859-1?Q?=C9manuel_Barry?=) Date: Thu, 21 Apr 2016 00:55:11 -0400 Subject: [Python-ideas] Shrink recursion error tracebacks (was: Have REPL print less by default) In-Reply-To: <20160421024620.GD1819@ando.pearwood.info> References: <20160421024620.GD1819@ando.pearwood.info> Message-ID: > From Steven D'Aprano > Sent: Wednesday, April 20, 2016 10:46 PM > > We should not limit this to specifically *recursion* limits, despite the > error message printed out, it is not just recursive calls that count > towards the recursion limit. It is any function call. > > In practice, though, it is difficult to run into this limit with regular > non-recursive calls, but not impossible. > See my second comment > > We can probably afford to cut out immediately as we notice the recursion, > > No you can't. A bunch of recursive calls may be followed by non- > recursive calls, or a different set of recursive calls. It might not > even be a RuntimeError at the end. > See my first comment *wink wink* My proposal is so that we strip out repeated lines to avoid unnecessary noise, *not* to strip out everything thereafter! 
I agree that my wording was misleading, but I do want to keep all relevant information - only to strip out consecutive and identical lines. If there's a 20-call recursion in the middle of the stack for some odd reason (call itself with n+1 until n == 20 and then do something else?), some of it would be stripped out but the rest would remain. The core of my proposal is to enhance ability to read and debug large tracebacks, which most of the times will be because of recursive calls. We're shrinking tracebacks for recursive calls, and if that happens to shrink other large tracebacks as well, that's a useful side effect and should be considered, but that's not the core of my idea. If a one-size-fits-all solution works here, I'll go for that and avoid dumping yet another bucket of special cases (which aren't special enough to break the rules) into the code. There are already enough. > The point being, you cannot just stop processing traces when you find > one repeated. Yep, don't want that at all indeed. Thanks for your input Steven :) -Emanuel From vgr255 at live.ca Thu Apr 21 01:08:35 2016 From: vgr255 at live.ca (=?UTF-8?Q?=C3=89manuel_Barry?=) Date: Thu, 21 Apr 2016 01:08:35 -0400 Subject: [Python-ideas] Override dict.__new__ to raise if cls is not dict; do the same for str, list, etc. In-Reply-To: References: <20160421031543.GE1819@ando.pearwood.info> Message-ID: Except from the fact that a lot of code requiring actual dicts (or str, list?) will break because they are no longer receiving a dict instance (or an instance of (one of) its subclasses), but a collections.UserDict instance which is also entirely unrelated to dict except that they have methods with similar names. As I understand it, you prefer to put the burden of supporting collections.UserDict not on the people subclassing dict, but on the library developers expecting actual dicts. 
I?m sure there are a couple of built-in functions (written in C) accepting dicts that would not work if they didn?t get dicts. And getting built-in functions to accept instances of pure Python code is nontrivial and tricky at best, and requires a lot more work than it would make sense to in this case. I?m -1 on the idea. -Emanuel From: Neil Girdhar Sent: Thursday, April 21, 2016 12:34 AM I think inheriting directly from dict is simply bad code because CPython doesn't promise that any of your overridden methods will be called. The fact that it silently doesn't call them is an inscrutable trap. And really, it's not much of a "punishment" to simply change your base class name. -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Thu Apr 21 01:14:29 2016 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 21 Apr 2016 15:14:29 +1000 Subject: [Python-ideas] Override dict.__new__ to raise if cls is not dict; do the same for str, list, etc. In-Reply-To: References: <20160421031543.GE1819@ando.pearwood.info> Message-ID: On Thu, Apr 21, 2016 at 2:33 PM, Neil Girdhar wrote: > I think inheriting directly from dict is simply bad code because CPython > doesn't promise that any of your overridden methods will be called. The > fact that it silently doesn't call them is an inscrutable trap. And really, > it's not much of a "punishment" to simply change your base class name. There are way too many cases that work just fine, though. >>> class AutoCreateDict(dict): ... def __missing__(self, key): ... return "" % key ... >>> acd = AutoCreateDict() >>> acd["asdf"] "" Why should I inherit from UserDict instead? I have to import that from somewhere (is it in types? collections? though presumably your error message would tell me that), and then I have to contend with the fact that my class is no longer a dictionary. >>> class AutoCreateUserDict(collections.UserDict): ... def __missing__(self, key): ... return "" % key ... 
>>> acud = AutoCreateUserDict() >>> acud["qwer"] "" >>> json.dumps(acd) '{}' >>> json.dumps(acud) Traceback (most recent call last): File "", line 1, in File "/usr/local/lib/python3.6/json/__init__.py", line 230, in dumps return _default_encoder.encode(obj) File "/usr/local/lib/python3.6/json/encoder.py", line 199, in encode chunks = self.iterencode(o, _one_shot=True) File "/usr/local/lib/python3.6/json/encoder.py", line 257, in iterencode return _iterencode(o, 0) File "/usr/local/lib/python3.6/json/encoder.py", line 180, in default o.__class__.__name__) TypeError: Object of type 'AutoCreateUserDict' is not JSON serializable If you *do* push forward with this proposal, incidentally, I would recommend not doing it in __new__, but changing it so the class is no longer subclassable, as per bool: >>> class X(int): pass ... >>> class X(bool): pass ... Traceback (most recent call last): File "", line 1, in TypeError: type 'bool' is not an acceptable base type But I am firmly -1 on disallowing dict subclasses just because they aren't guaranteed to call all of your overridden methods. ChrisA From rosuav at gmail.com Thu Apr 21 01:26:52 2016 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 21 Apr 2016 15:26:52 +1000 Subject: [Python-ideas] Shrink recursion error tracebacks (was: Have REPL print less by default) In-Reply-To: References: <20160421024620.GD1819@ando.pearwood.info> Message-ID: On Thu, Apr 21, 2016 at 2:55 PM, ?manuel Barry wrote: > The core of my proposal is to enhance ability to read and debug large > tracebacks, which most of the times will be because of recursive calls. > We're shrinking tracebacks for recursive calls, and if that happens to > shrink other large tracebacks as well, that's a useful side effect and > should be considered, but that's not the core of my idea. No problem with that. In my experience, most RecursionErrors come from *accidental* recursion, which is straight-forwardly infinite and usually involves a single function. 
Consider this idiom: >>> class id(int): ... _orig_id = id ... def __new__(cls, obj): ... return super().__new__(cls, cls._orig_id(obj)) ... def __repr__(self): ... return hex(self) ... >>> obj = object() >>> obj <object object at 0x7ffb12290110> >>> id(obj) 0x7ffb12290110 If I muck something up and accidentally call id() inside the definition of __new__ (instead of cls._orig_id), it'll end up infinitely recursing. A traceback shortener that recognizes only the very simplest forms of repetition would work fine for this case. It doesn't need a huge amount of intelligence - the traceback would have a bunch of exactly identical lines, and it _would_ end with one of the identical ones. ChrisA From mistersheik at gmail.com Thu Apr 21 01:33:24 2016 From: mistersheik at gmail.com (Neil Girdhar) Date: Thu, 21 Apr 2016 05:33:24 +0000 Subject: [Python-ideas] Override dict.__new__ to raise if cls is not dict; do the same for str, list, etc. In-Reply-To: References: <20160421031543.GE1819@ando.pearwood.info> Message-ID: On Thu, Apr 21, 2016 at 1:15 AM Chris Angelico wrote: > On Thu, Apr 21, 2016 at 2:33 PM, Neil Girdhar > wrote: > > I think inheriting directly from dict is simply bad code because CPython > > doesn't promise that any of your overridden methods will be called. The > > fact that it silently doesn't call them is an inscrutable trap. And > really, > > it's not much of a "punishment" to simply change your base class name. > > There are way too many cases that work just fine, though. > > >>> class AutoCreateDict(dict): > ... def __missing__(self, key): > ... return "" % key > ... > >>> acd = AutoCreateDict() > >>> acd["asdf"] > "" > > Why should I inherit from UserDict instead? I have to import that from > somewhere (is it in types? collections? though presumably your error > message would tell me that), and then I have to contend with the fact > that my class is no longer a dictionary. > Of course it's a dictionary. It's an abc.Mapping, which is all a user of your class should care about. 
After all, it "quacks like a duck", which is all that matters. > > >>> class AutoCreateUserDict(collections.UserDict): > ... def __missing__(self, key): > ... return "" % key > ... > >>> acud = AutoCreateUserDict() > >>> acud["qwer"] > "" > >>> json.dumps(acd) > '{}' > >>> json.dumps(acud) > Traceback (most recent call last): > File "", line 1, in > File "/usr/local/lib/python3.6/json/__init__.py", line 230, in dumps > return _default_encoder.encode(obj) > File "/usr/local/lib/python3.6/json/encoder.py", line 199, in encode > chunks = self.iterencode(o, _one_shot=True) > File "/usr/local/lib/python3.6/json/encoder.py", line 257, in iterencode > return _iterencode(o, 0) > File "/usr/local/lib/python3.6/json/encoder.py", line 180, in default > o.__class__.__name__) > TypeError: Object of type 'AutoCreateUserDict' is not JSON serializable > > That's a fair point, but it seems like a bug in JSON. They should have checked if it's an abc.Mapping imho. > If you *do* push forward with this proposal, incidentally, I would > recommend not doing it in __new__, but changing it so the class is no > longer subclassable, as per bool: > > >>> class X(int): pass > ... > >>> class X(bool): pass > ... > Traceback (most recent call last): > File "", line 1, in > TypeError: type 'bool' is not an acceptable base type > Good point. > > But I am firmly -1 on disallowing dict subclasses just because they > aren't guaranteed to call all of your overridden methods. > > ChrisA > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > -- > > --- > You received this message because you are subscribed to a topic in the > Google Groups "python-ideas" group. > To unsubscribe from this topic, visit > https://groups.google.com/d/topic/python-ideas/7nFQURLhlQY/unsubscribe. 
> To unsubscribe from this group and all its topics, send an email to > python-ideas+unsubscribe at googlegroups.com. > For more options, visit https://groups.google.com/d/optout. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Thu Apr 21 01:47:53 2016 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 21 Apr 2016 15:47:53 +1000 Subject: [Python-ideas] Override dict.__new__ to raise if cls is not dict; do the same for str, list, etc. In-Reply-To: References: <20160421031543.GE1819@ando.pearwood.info> Message-ID: On Thu, Apr 21, 2016 at 3:33 PM, Neil Girdhar wrote: >> Why should I inherit from UserDict instead? I have to import that from >> somewhere (is it in types? collections? though presumably your error >> message would tell me that), and then I have to contend with the fact >> that my class is no longer a dictionary. > > > Of course it's a dictionary. It's an abc.Mapping, which is all a user of > your class should care about. After all, it "quacks like a duck", which is > all that matters. Your first sentence is in conflict with your other statements. My type is simply *not* a dictionary. You can start raising bug reports all over the place saying "JSON should look for abc.Mapping rather than dict", but I'm not even sure that it should. ChrisA From mistersheik at gmail.com Thu Apr 21 01:57:41 2016 From: mistersheik at gmail.com (Neil Girdhar) Date: Thu, 21 Apr 2016 05:57:41 +0000 Subject: [Python-ideas] Override dict.__new__ to raise if cls is not dict; do the same for str, list, etc. In-Reply-To: References: <20160421031543.GE1819@ando.pearwood.info> Message-ID: Lol, well, I think it should. Anyway, the cons are that yes some people's bad code won't work. The pros are that unsuspecting users won't fall into the trap associated with inheriting from dict, list or set. Of course, people on python-ideas know the pitfalls of inheriting from builtin types. You're not users that benefit from this change. 
It's people who don't know the problems and have no way of finding out until they spend a day debugging why overriding some method on a derived class of a derived class that ultimately inherits from dict doesn't work. Experts always want to make their own lives better. That's the problem with a language made by experts. New users fall into traps that you will never fall into (again). On Thu, Apr 21, 2016 at 1:48 AM Chris Angelico wrote: > On Thu, Apr 21, 2016 at 3:33 PM, Neil Girdhar > wrote: > >> Why should I inherit from UserDict instead? I have to import that from > >> somewhere (is it in types? collections? though presumably your error > >> message would tell me that), and then I have to contend with the fact > >> that my class is no longer a dictionary. > > > > > > Of course it's a dictionary. It's an abc.Mapping, which is all a user of > > your class should care about. After all, it "quacks like a duck", which > is > > all that matters. > > Your first sentence is in conflict with your other statements. My type > is simply *not* a dictionary. You can start raising bug reports all > over the place saying "JSON should look for abc.Mapping rather than > dict", but I'm not even sure that it should. > > ChrisA From ncoghlan at gmail.com Thu Apr 21 02:36:54 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 21 Apr 2016 16:36:54 +1000 Subject: [Python-ideas] Override dict.__new__ to raise if cls is not dict; do the same for str, list, etc. In-Reply-To: References: Message-ID: On 21 April 2016 at 12:51, Neil Girdhar wrote: > Sometimes users inherit from builtin types only to find that their > overridden methods are not called. Instead of this being a trap for > unsuspecting users, I suggest overriding the __new__ method of these types > so that it will raise with an informative exception explaining that, e.g., > instead of inheriting from dict, you should inherit from UserDict. > > I suggest this modification to any Python implementation that has special > versions of classes that cannot easily be extended, such as CPython. If > another Python implementation allows dict (e.g.) to be extended easily, > then it doesn't have to raise. > Builtins can be extended, you just have to override all the methods where you want to change the return type: >>> from collections import defaultdict, Counter, OrderedDict >>> issubclass(defaultdict, dict) True >>> issubclass(Counter, dict) True >>> issubclass(OrderedDict, dict) True This isn't hard as such, it's just tedious, so it's often simpler to use the more subclass friendly variants that dynamically look up the type to return and hence let you get away with overriding a smaller subset of the methods. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From antony.lee at berkeley.edu Thu Apr 21 02:38:34 2016 From: antony.lee at berkeley.edu (Antony Lee) Date: Wed, 20 Apr 2016 23:38:34 -0700 Subject: [Python-ideas] Why was Path.path added? 
Message-ID: Well, it's already there and it's probably too late to remove it, but introducing the idiom path_str = getattr(arg, "path", arg) was not necessary: you could also write path_str = str(Path(arg)) which works just as well. My 2c., Antony -------------- next part -------------- An HTML attachment was scrubbed... URL: From mistersheik at gmail.com Thu Apr 21 03:20:40 2016 From: mistersheik at gmail.com (Neil Girdhar) Date: Thu, 21 Apr 2016 07:20:40 +0000 Subject: [Python-ideas] Override dict.__new__ to raise if cls is not dict; do the same for str, list, etc. In-Reply-To: References: Message-ID: Good point, I withdraw my suggestion. I think it's unfortunate for Python to have traps like this, but I don't see a nice way to protect users while still letting people do whatever they want. On Thu, Apr 21, 2016 at 3:03 AM Neil Girdhar wrote: > I guess that's fair. > > On Thu, Apr 21, 2016 at 2:36 AM Nick Coghlan wrote: > >> On 21 April 2016 at 12:51, Neil Girdhar wrote: >> >>> Sometimes users inherit from builtin types only to find that their >>> overridden methods are not called. Instead of this being a trap for >>> unsuspecting users, I suggest overriding the __new__ method of these types >>> so that it will raise with an informative exception explaining that, e.g., >>> instead of inheriting from dict, you should inherit from UserDict. >>> >>> I suggest this modification to any Python implementation that has >>> special versions of classes that cannot easily be extended, such as >>> CPython. If another Python implementation allows dict (e.g.) to be >>> extended easily, then it doesn't have to raise. 
>>> >> >> Builtins can be extended, you just have to override all the methods where >> you want to change the return type: >> >> >>> from collections import defaultdict, Counter, OrderedDict >> >>> issubclass(defaultdict, dict) >> True >> >>> issubclass(Counter, dict) >> True >> >>> issubclass(OrderedDict, dict) >> True >> >> This isn't hard as such, it's just tedious, so it's often simpler to use >> the more subclass friendly variants that dynamically look up the type to >> return and hence let you get away with overriding a smaller subset of the >> methods. >> >> Cheers, >> Nick. >> >> -- >> Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mike at selik.org Thu Apr 21 03:27:35 2016 From: mike at selik.org (Michael Selik) Date: Thu, 21 Apr 2016 07:27:35 +0000 Subject: [Python-ideas] Override dict.__new__ to raise if cls is not dict; do the same for str, list, etc. In-Reply-To: References: <20160421031543.GE1819@ando.pearwood.info> Message-ID: On Thu, Apr 21, 2016 at 1:58 AM Neil Girdhar wrote: > The pros are that unsuspecting users won't fall into the trap associated > with inheriting from dict, list or set. ... It's people who don't know the > problems and have no way of finding out until they spend a day debugging > why overriding some method on a derived class of a derived class that > ultimately inherits from dict doesn't work. > I feel your pain. However, what you see as a failure or defect of the builtins is actually a feature! :-) The downstream programmer should not be expected to know the implementation details of a dict. You don't want to read the source, you just want to use the public interface. You should be able to subclass and override methods as you like without worrying about hidden internal relationships. You should be able to override ``__getitem__`` without accidentally affecting things like ``values``. 
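The insulation described in the preceding paragraph is easy to demonstrate in CPython (the class name `Loud` is invented for this illustration): overriding ``__getitem__`` changes indexing only, while ``get`` and ``values`` keep their built-in behaviour.

```python
class Loud(dict):
    # Override indexing only; in CPython, other dict methods such as
    # get() and values() do not route through __getitem__.
    def __getitem__(self, key):
        return ("wrapped", super().__getitem__(key))

d = Loud(a=1)
print(d["a"])            # ('wrapped', 1) -- the override is used here
print(d.get("a"))        # 1 -- dict.get bypasses the override
print(list(d.values()))  # [1] -- so does values()
```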
I guess it's a case of "a little knowledge is dangerous". Someone who knows nothing about dict implementation would not expect to see a change in method A because of an override of method B. The expert is keeping the novice safe. It's the journeyman who suffers. -------------- next part -------------- An HTML attachment was scrubbed... URL: From mistersheik at gmail.com Thu Apr 21 03:30:27 2016 From: mistersheik at gmail.com (Neil Girdhar) Date: Thu, 21 Apr 2016 07:30:27 +0000 Subject: [Python-ideas] Override dict.__new__ to raise if cls is not dict; do the same for str, list, etc. In-Reply-To: References: <20160421031543.GE1819@ando.pearwood.info> Message-ID: Wow, I disagree with this. The fact that you can override __getitem__ and it doesn't fix initialization in the constructor to match is definitely confusing. The fact that __add__ called on a dict subclass doesn't return a class of the same type is confusing. This is why every Python book advises people not to inherit from builtins. Let's not get carried away. This is an optimization for CPython -- not a feature. On Thu, Apr 21, 2016 at 3:27 AM Michael Selik wrote: > On Thu, Apr 21, 2016 at 1:58 AM Neil Girdhar > wrote: >> The pros are that unsuspecting users won't fall into the trap associated >> with inheriting from dict, list or set. ... It's people who don't know the >> problems and have no way of finding out until they spend a day debugging >> why overriding some method on a derived class of a derived class that >> ultimately inherits from dict doesn't work. >> > > I feel your pain. However, what you see as a failure or defect of the > builtins is actually a feature! :-) > > The downstream programmer should not be expected to know the > implementation details of a dict. You don't want to read the source, you > just want to use the public interface. You should be able to subclass and > override methods as you like without worrying about hidden internal > relationships. 
You should be able to override ``__getitem__`` without > accidentally affecting things like ``values``. > > I guess it's a case of "a little knowledge is dangerous". Someone who > knows nothing about dict implementation would not expect to see a change in > method A because of an override of method B. The expert is keeping the > novice safe. It's the journeyman who suffers. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.f.moore at gmail.com Thu Apr 21 04:07:40 2016 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 21 Apr 2016 09:07:40 +0100 Subject: [Python-ideas] Why was Path.path added? In-Reply-To: References: Message-ID: On 21 April 2016 at 07:38, Antony Lee wrote: > Well, it's already there and it's probably too late to remove it, but pathlib is provisional, so it *can* be removed (and indeed probably will be - see the endless threads here and on python-dev) > introducing the idiom > > path_str = getattr(arg, "path", arg) > > > was not necessary: you could also write > > path_str = str(Path(arg)) > > > which works just as well. The latter requires you to import pathlib, which is not Python 2.7-compatible. A minor point, but one reason why the path attribute offers simpler code. Also, str(Path(arg)) applies an unnecessary call to Path() if arg is already a path. (And str(arg) is not OK, as it will also accept things like ints). If you want the full details, please read the various threads that have been going on here for about a month. It isn't worth reiterating the debate again, I'm afraid - most people are fairly burned out on pathlib discussions by now. 
Paul From oscar.j.benjamin at gmail.com Thu Apr 21 04:44:53 2016 From: oscar.j.benjamin at gmail.com (Oscar Benjamin) Date: Thu, 21 Apr 2016 09:44:53 +0100 Subject: [Python-ideas] Have REPL print less by default In-Reply-To: <20160421010307.GA1819@ando.pearwood.info> References: <85d1pmsqvv.fsf@benfinney.id.au> <1461178908.2990267.584734985.0A30EA93@webmail.messagingengine.com> <20160421010307.GA1819@ando.pearwood.info> Message-ID: On 21 April 2016 at 02:03, Steven D'Aprano wrote: > On Wed, Apr 20, 2016 at 03:01:48PM -0400, Random832 wrote: > >> You know what would make complicated infinite recursion easier to debug? >> The arguments. > > Check out the cgitb module, which installs an except hook which (among > many other things) prints the arguments to the functions. Run this > sample code: > > import cgitb > cgitb.enable(format='text') Oh that's nice. It would be good if you could use it with -m like pdb $ python -m cgitb myscript.py rather than needing to insert it in the code. -- Oscar From k7hoven at gmail.com Thu Apr 21 04:49:22 2016 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Thu, 21 Apr 2016 11:49:22 +0300 Subject: [Python-ideas] Why was Path.path added? In-Reply-To: References: Message-ID: On Thu, Apr 21, 2016 at 11:07 AM, Paul Moore wrote: > On 21 April 2016 at 07:38, Antony Lee wrote: >> Well, it's already there and it's probably too late to remove it, but > > pathlib is provisional, so it *can* be removed (and indeed probably > will be - see the endless threads here and on python-dev) > >> introducing the idiom >> >> path_str = getattr(arg, "path", arg) >> >> >> was not necessary: you could also write >> >> path_str = str(Path(arg)) >> >> >> which works just as well. > > The latter requires you to import pathlib, which is not Python > 2.7-compatible. A minor point, but one reason why the path attribute > offers simpler code. Also, str(Path(arg)) applies an unnecessary call > to Path() if arg is already a path. 
(And str(arg) is not OK, as it > will also accept things like ints). > > If you want the full details, please read the various threads that > have been going on here for about a month. It isn't worth reiterating > the debate again, I'm afraid - most people are fairly burned out on > pathlib discussions by now. The need for .path has been brought up before, although not because .path was unnecessary, but because we wanted something better. And this better thing currently goes by the name fspath, so instead of str(Path(arg)) you should be calling os.fspath(arg), which even works for third-party path libraries that implement the protocol. Regarding this, however, we are seizing the discussions until we have a PEP. A problem with .path, however, is that there should be only one way to do it, and if I were to decide, I would probably remove it in pathlib.PurePath before it's too late. -Koos From steve at pearwood.info Thu Apr 21 07:17:09 2016 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 21 Apr 2016 21:17:09 +1000 Subject: [Python-ideas] Override dict.__new__ to raise if cls is not dict; do the same for str, list, etc. In-Reply-To: References: <20160421031543.GE1819@ando.pearwood.info> Message-ID: <20160421111709.GF1819@ando.pearwood.info> On Thu, Apr 21, 2016 at 04:33:33AM +0000, Neil Girdhar wrote: > I think inheriting directly from dict is simply bad code because CPython > doesn't promise that any of your overridden methods will be called. Python makes the same promise that it makes for *all* classes: if you override a method, and call that method on an instance of a subclass, the subclass' overridden implementation will be called. Dicts are no different from any other class, and have been since version 2.2 when types and classes were unified and inheriting from built-ins was first allowed. 
In fact, dicts were *explicitly* listed by Guido as one of the built-ins which can be subclassed: "Let's start with the juiciest bit: you can subtype built-in types like dictionaries and lists." https://www.python.org/download/releases/2.2.3/descrintro/#subclassing Your proposal would break at least two standard types, both of which subclass dict: py> from collections import defaultdict, Counter py> defaultdict.__mro__ (<class 'collections.defaultdict'>, <class 'dict'>, <class 'object'>) py> Counter.__mro__ (<class 'collections.Counter'>, <class 'dict'>, <class 'object'>) and it runs counter to a documented feature of dicts, that they can be subclassed and given a __missing__ method: "If a SUBCLASS OF DICT [emphasis added] defines a method __missing__() and key is not present, the d[key] operation calls that method ..." https://docs.python.org/2/library/stdtypes.html#mapping-types-dict > The fact that it silently doesn't call them is an inscrutable trap. That is not a fact. What you should say is: "The fact that dicts don't obey the undocumented assumptions I made about the implementation of methods is a trap." and I will agree: correct, but it's a trap for hubris and foolishness, and the answer to that is, don't make unjustified assumptions about how dicts are implemented. And I can say that because I made exactly that wrong assumption too. Nowhere does the documentation say that dict.update calls __setitem__. Nowhere does it say that dict.clear calls __delitem__. Nowhere does it say that dict.values calls __getitem__. And yet I assumed that they did all that. When somebody makes the unjustified assumption that they do, then they will discover that calling the subclass' clear method fails to call the overridden __delitem__. It also fails to call the overridden __str__. The only difference is that nobody assumes that clear() calls __str__, but many people foolishly assume that it calls the __delitem__ method. Both assumptions have *exactly* the same justification: none at all. 
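The behaviour described just above can be checked directly: in CPython, ``dict.update`` does not call an overridden ``__setitem__``, while ``collections.UserDict`` routes all updates through it. The class names below are invented for the demonstration.

```python
from collections import UserDict

class TrackedDict(dict):
    def __init__(self, *args, **kwargs):
        self.sets = []  # record which keys went through __setitem__
        super().__init__(*args, **kwargs)
    def __setitem__(self, key, value):
        self.sets.append(key)
        super().__setitem__(key, value)

class TrackedUserDict(UserDict):
    def __init__(self, *args, **kwargs):
        self.sets = []
        super().__init__(*args, **kwargs)
    def __setitem__(self, key, value):
        self.sets.append(key)
        super().__setitem__(key, value)

d = TrackedDict()
d.update(a=1)   # dict.update bypasses the override in CPython
d["b"] = 2      # direct indexing does use it
print(d.sets)   # ['b']

u = TrackedUserDict()
u.update(a=1)   # UserDict.update goes through __setitem__
print(u.sets)   # ['a']
```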
I made a bunch of stupid, foolish assumptions, based on absolutely nothing more than the idea that it stands to reason that dict must be implemented in this way, and got bitten. And I deserved it. I was wrong, and anyone making the same assumption is wrong. Well, I say I was "bitten", but that over-dramatises the situation. What actually happened was that I wrote a subclass without doing any tests, then tested it, and discovered that it didn't work how I expected. I spent a few minutes playing with the class, added a few print statements, discovered that my assumptions were wrong, and then overrode the classes I actually wanted to override. Twenty minutes of googling and reading the docs convinced me that, no, Python doesn't promise that dict.clear calls __delitem__. It would have been five minutes except I was especially stubborn and pig-headed that day and didn't want to admit that I was in the wrong. But I was. Getting bitten by this was, in fact, a valuable lesson. I learned what I should have already known, what I had *intellectually* known but had ignored because "it stands to reason". Namely, if an implementation isn't documented, you cannot assume that it works in a particular way. My sympathy level is zero, and my support for this proposal is negative. -- Steve From steve at pearwood.info Thu Apr 21 07:30:37 2016 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 21 Apr 2016 21:30:37 +1000 Subject: [Python-ideas] Override dict.__new__ to raise if cls is not dict; do the same for str, list, etc. In-Reply-To: References: Message-ID: <20160421113037.GG1819@ando.pearwood.info> On Thu, Apr 21, 2016 at 04:36:54PM +1000, Nick Coghlan wrote: > Builtins can be extended, you just have to override all the methods where > you want to change the return type: I wonder whether we should have a class decorator which automatically adds the appropriate methods? It would need some sort of introspection to look at the superclass and add overridden methods? E.g. 
if the superclass is float, it would add a bunch of methods like: def __add__(self, other): x = super().__add__(other) if x is NotImplemented: return x return type(self)(x) but only if they aren't already overridden. Then you could do this: @decorator # I have no idea what to call it. class MyInt(int): pass and now MyInt() + MyInt() will return a MyInt, rather than a regular int. Getting the list of dunder methods is easy, but telling whether or not they should return an instance of the subclass may not be. Thoughts? -- Steve From mike at selik.org Thu Apr 21 07:44:00 2016 From: mike at selik.org (Michael Selik) Date: Thu, 21 Apr 2016 11:44:00 +0000 Subject: [Python-ideas] Override dict.__new__ to raise if cls is not dict; do the same for str, list, etc. In-Reply-To: <20160421113037.GG1819@ando.pearwood.info> References: <20160421113037.GG1819@ando.pearwood.info> Message-ID: On Thu, Apr 21, 2016 at 7:30 AM Steven D'Aprano wrote: > On Thu, Apr 21, 2016 at 04:36:54PM +1000, Nick Coghlan wrote: > > > Builtins can be extended, you just have to override all the methods where > > you want to change the return type: > > I wonder whether we should have a class decorator which automatically > adds the appropriate methods? > If I'm not mistaken, some of the dunders need to be overridden as part of the class definition and can't be added dynamically without some exec a la namedtuple. If so, are you still interested in creating that decorator? -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Thu Apr 21 08:42:00 2016 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 21 Apr 2016 22:42:00 +1000 Subject: [Python-ideas] Override dict.__new__ to raise if cls is not dict; do the same for str, list, etc. 
In-Reply-To: References: <20160421113037.GG1819@ando.pearwood.info> Message-ID: On Thu, Apr 21, 2016 at 9:44 PM, Michael Selik wrote: > On Thu, Apr 21, 2016 at 7:30 AM Steven D'Aprano wrote: >> >> On Thu, Apr 21, 2016 at 04:36:54PM +1000, Nick Coghlan wrote: >> >> > Builtins can be extended, you just have to override all the methods >> > where >> > you want to change the return type: >> >> I wonder whether we should have a class decorator which automatically >> adds the appropriate methods? > > > If I'm not mistaken, some of the dunders need to be overridden as part of > the class definition and can't be added dynamically without some exec a la > namedtuple. If so, are you still interested in creating that decorator? Which ones? __new__ is already covered, and AFAIK all operator dunders can be injected just fine. def operators_return_subclass(cls): if len(cls.__bases__) != 1: raise ValueError("Need to have exactly one base class") base, = cls.__bases__ def dunder(name): def flip_to_subclass(self, other): x = getattr(super(cls, self), name)(other) if type(x) is base: return cls(x) return x setattr(cls, name, flip_to_subclass) for meth in "add", "sub", "mul", "div": dunder("__"+meth+"__") return cls @operators_return_subclass class hexint(int): def __repr__(self): return hex(self) I'm not sure what set of dunders should be included though. I've gone extra strict in this version; for instance, hexint/2 --> float, not hexint. You could go either way. ChrisA From mike at selik.org Thu Apr 21 08:50:36 2016 From: mike at selik.org (Michael Selik) Date: Thu, 21 Apr 2016 12:50:36 +0000 Subject: [Python-ideas] Override dict.__new__ to raise if cls is not dict; do the same for str, list, etc. 
In-Reply-To: References: <20160421113037.GG1819@ando.pearwood.info> Message-ID: On Thu, Apr 21, 2016 at 8:42 AM Chris Angelico wrote: > On Thu, Apr 21, 2016 at 9:44 PM, Michael Selik wrote: > > On Thu, Apr 21, 2016 at 7:30 AM Steven D'Aprano > wrote: > >> > >> On Thu, Apr 21, 2016 at 04:36:54PM +1000, Nick Coghlan wrote: > >> > >> > Builtins can be extended, you just have to override all the methods > >> > where > >> > you want to change the return type: > >> > >> I wonder whether we should have a class decorator which automatically > >> adds the appropriate methods? > > > > > > If I'm not mistaken, some of the dunders need to be overridden as part of > > the class definition and can't be added dynamically without some exec a > la > > namedtuple. If so, are you still interested in creating that decorator? > > Which ones? __new__ is already covered, and AFAIK all operator dunders > can be injected just fine. > I have a vague memory of trouble with __repr__ when I once tried to make a sort of mutable namedtuple, but now I can't reproduce the issue. -------------- next part -------------- An HTML attachment was scrubbed... URL: From random832 at fastmail.com Thu Apr 21 09:43:17 2016 From: random832 at fastmail.com (Random832) Date: Thu, 21 Apr 2016 09:43:17 -0400 Subject: [Python-ideas] Override dict.__new__ to raise if cls is not dict; do the same for str, list, etc. In-Reply-To: <20160421111709.GF1819@ando.pearwood.info> References: <20160421031543.GE1819@ando.pearwood.info> <20160421111709.GF1819@ando.pearwood.info> Message-ID: <1461246197.4186699.585516009.6DFB227B@webmail.messagingengine.com> On Thu, Apr 21, 2016, at 07:17, Steven D'Aprano wrote: > and it runs counter to a documented feature of dicts, that they can be > subclassed and given a __missing__ method: > > "If a SUBCLASS OF DICT [emphasis added] defines a method __missing__() > and key is not present, the d[key] operation calls that method ..." 
So what method should be overridden to make a dict subclass useful as a class or object dictionary (i.e. for attribute lookup to work with names that have not been stored with dict.__setitem__)? Overriding __getitem__ or __missing__ doesn't work. My only consolation is that defaultdict doesn't work either. I can't even figure out how to get the real class dict, as I would need if I were overriding __getattribute__ explicitly in the metaclass (which also doesn't work) - cls.__dict__ returns a mappingproxy. Alternatively, where, other than object and class dicts, are you actually required to have a subclass of dict rather than a UserDict or other duck-typed mapping? Incidentally, why is __missing__ documented under defaultdict as "in addition to the standard dict operations"? From storchaka at gmail.com Thu Apr 21 11:52:56 2016 From: storchaka at gmail.com (Serhiy Storchaka) Date: Thu, 21 Apr 2016 18:52:56 +0300 Subject: [Python-ideas] Bytecode for calling function with keyword arguments Message-ID: Currently the call of a function with keyword arguments (say `f(a, b, x=c, y=d)`) is compiled to the following bytecode: 0 LOAD_NAME 0 (f) 3 LOAD_NAME 1 (a) 6 LOAD_NAME 2 (b) 9 LOAD_CONST 0 ('x') 12 LOAD_NAME 3 (c) 15 LOAD_CONST 1 ('y') 18 LOAD_NAME 4 (d) 21 CALL_FUNCTION 514 (2 positional, 2 keyword pair) For every positional argument its value is pushed on the stack, and for every keyword argument its name and its value are pushed on the stack. But keyword arguments are always constant strings! We can reorder the stack, and push keyword argument values and names separately. And we can save all keyword argument names for this call in a constant tuple and load it by one bytecode command. 0 LOAD_NAME 0 (f) 3 LOAD_NAME 1 (a) 6 LOAD_NAME 2 (b) 9 LOAD_NAME 3 (c) 12 LOAD_NAME 4 (d) 15 LOAD_CONST 0 (('x', 'y')) 18 CALL_FUNCTION2 2 Benefits: 1. We save one command for every keyword parameter after the first. This saves bytecode size and execution time. 2. 
Since the number of keyword arguments is obtained from tuple's size, new CALL_FUNCTION opcode needs only the number of positional arguments. Its argument is simpler and needs less bits (important for wordcode). Disadvantages: 1. Increases the number of constants. From rosuav at gmail.com Thu Apr 21 12:00:35 2016 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 22 Apr 2016 02:00:35 +1000 Subject: [Python-ideas] Bytecode for calling function with keyword arguments In-Reply-To: References: Message-ID: On Fri, Apr 22, 2016 at 1:52 AM, Serhiy Storchaka wrote: > 2. Since the number of keyword arguments is obtained from tuple's size, new > CALL_FUNCTION opcode needs only the number of positional arguments. Its > argument is simpler and needs less bits (important for wordcode). What about calls that don't include any keyword arguments? Currently, no pushing is done. Will you have two opcodes, one if there are kwargs and one if there are not? Also: How does this interact with **dict calls? ChrisA From storchaka at gmail.com Thu Apr 21 13:28:39 2016 From: storchaka at gmail.com (Serhiy Storchaka) Date: Thu, 21 Apr 2016 20:28:39 +0300 Subject: [Python-ideas] Bytecode for calling function with keyword arguments In-Reply-To: References: Message-ID: On 21.04.16 19:00, Chris Angelico wrote: > On Fri, Apr 22, 2016 at 1:52 AM, Serhiy Storchaka wrote: >> 2. Since the number of keyword arguments is obtained from tuple's size, new >> CALL_FUNCTION opcode needs only the number of positional arguments. Its >> argument is simpler and needs less bits (important for wordcode). > > What about calls that don't include any keyword arguments? Currently, > no pushing is done. Will you have two opcodes, one if there are kwargs > and one if there are not? Yes, we need either two opcodes, or a flag in the argument. I prefer the first approach to keep the argument short and simple. > Also: How does this interact with **dict calls? For now we have separate opcode for calls with **kwargs. 
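The bytecode being discussed can be reproduced with the ``dis`` module; note that the exact opcode names in the output depend heavily on which CPython version runs the disassembly, so the listings in this thread should be read as a snapshot of one version rather than the current output:

```python
import dis

def call_with_kwargs(f, a, b, c, d):
    # The compiled form of this call is what the thread is discussing.
    return f(a, b, x=c, y=d)

# Opcode names vary across CPython versions, but the keyword names
# 'x' and 'y' always appear in the disassembly.
dis.dis(call_with_kwargs)
```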
We will have a corresponding opcode with the new way to provide keyword arguments. Instead of 4 current opcodes we will need at most 8 new opcodes. Maybe fewer, because calls with *args or **kwargs are less performance-critical: we can pack arguments in a tuple and a dict by separate opcodes and call a function just with a tuple and a dict. From contrebasse at gmail.com Thu Apr 21 15:35:58 2016 From: contrebasse at gmail.com (Joseph Martinot-Lagarde) Date: Thu, 21 Apr 2016 19:35:58 +0000 (UTC) Subject: [Python-ideas] Shrink recursion error tracebacks (was: Have REPL print less by default) References: <20160421024620.GD1819@ando.pearwood.info> Message-ID: > No problem with that. In my experience, most RecursionErrors come from > *accidental* recursion, which is straight-forwardly infinite and > usually involves a single function. It can come very easily with __getattr__, if __getattr__ itself uses an undefined attribute. This recursion is usually not what the user wanted! :) From ethan at stoneleaf.us Thu Apr 21 16:15:57 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 21 Apr 2016 13:15:57 -0700 Subject: [Python-ideas] Shrink recursion error tracebacks (was: Have REPL print less by default) In-Reply-To: References: <20160421024620.GD1819@ando.pearwood.info> Message-ID: <571934FD.4040306@stoneleaf.us> On 04/21/2016 12:35 PM, Joseph Martinot-Lagarde wrote: >> No problem with that. In my experience, most RecursionErrors come from >> *accidental* recursion, which is straight-forwardly infinite and >> usually involves a single function. > > It can come very easily with __getattr__, if __getattr__ itself uses an > undefined attribute. This recursion is usually not what the user wanted! :) True, but I'm pretty sure that falls in to the *accidental recursion* category. ;) -- ~Ethan~ From chris.barker at noaa.gov Thu Apr 21 18:17:32 2016 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 21 Apr 2016 15:17:32 -0700 Subject: [Python-ideas] Why was Path.path added?
In-Reply-To: References: Message-ID: On Thu, Apr 21, 2016 at 1:49 AM, Koos Zevenhoven wrote: > Regarding > this, however, we are seizing the discussions until we have a PEP. > I think you meant "ceasing" the discussion -- but in case that was intentional, I like it! It really did need to be seized! :-) -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Thu Apr 21 18:23:01 2016 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 22 Apr 2016 08:23:01 +1000 Subject: [Python-ideas] Why was Path.path added? In-Reply-To: References: Message-ID: On Fri, Apr 22, 2016 at 8:17 AM, Chris Barker wrote: > On Thu, Apr 21, 2016 at 1:49 AM, Koos Zevenhoven wrote: >> >> Regarding >> this, however, we are seizing the discussions until we have a PEP. > > > I think you meant "ceasing" the discussion -- but in case that was intentional, > I like it! It really did need to be seized! > > :-) Carpe Discussiem? ChrisA about to be slapped with a fish if he makes a 'carp' joke From tjreedy at udel.edu Thu Apr 21 19:13:04 2016 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 21 Apr 2016 19:13:04 -0400 Subject: [Python-ideas] Override dict.__new__ to raise if cls is not dict; do the same for str, list, etc. In-Reply-To: <20160421113037.GG1819@ando.pearwood.info> References: <20160421113037.GG1819@ando.pearwood.info> Message-ID: On 4/21/2016 7:30 AM, Steven D'Aprano wrote: > I wonder whether we should have a class decorator which automatically > adds the appropriate methods? > > It would need some sort of introspection to look at the superclass and > adds overridden methods? E.g.
if the superclass is float, it would add a > bunch of methods like: > > def __add__(self, other): > x = super().__add__(other) > if x is NotImplemented: > return x > return type(self)(x) > > but only if they aren't already overridden. Then you could do this: > > @decorator # I have no idea what to call it. > class MyInt(int): > pass > > and now MyInt() + MyInt() will return a MyInt, rather than a regular > int. > > Getting the list of dunder methods is easy, but telling whether or not > they should return an instance of the subclass may not be. > > Thoughts? Interesting idea. I would start with a TDDed fix_int_subclass, then a TDDed fix_list_subclass, and only then think about whether there is enough common code to refactor. -- Terry Jan Reedy From contrebasse at gmail.com Thu Apr 21 19:40:22 2016 From: contrebasse at gmail.com (Joseph Martinot-Lagarde) Date: Thu, 21 Apr 2016 23:40:22 +0000 (UTC) Subject: [Python-ideas] Shrink recursion error tracebacks (was: Have REPL print less by default) References: <20160421024620.GD1819@ando.pearwood.info> <571934FD.4040306@stoneleaf.us> Message-ID: > True, but I'm pretty sure that falls in to the *accidental recursion* > category. ;) Of course, I just wanted to present a less complicated example (and more common) than the id example from Chris Angelico. Anyway shortening the traceback is useful in all cases, the recursion being intentional or not. From steve at pearwood.info Thu Apr 21 22:00:17 2016 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 22 Apr 2016 12:00:17 +1000 Subject: [Python-ideas] Override dict.__new__ to raise if cls is not dict; do the same for str, list, etc.
In-Reply-To: <1461246197.4186699.585516009.6DFB227B@webmail.messagingengine.com> References: <20160421031543.GE1819@ando.pearwood.info> <20160421111709.GF1819@ando.pearwood.info> <1461246197.4186699.585516009.6DFB227B@webmail.messagingengine.com> Message-ID: <20160422020017.GJ1819@ando.pearwood.info> On Thu, Apr 21, 2016 at 09:43:17AM -0400, Random832 wrote: > On Thu, Apr 21, 2016, at 07:17, Steven D'Aprano wrote: > > and it runs counter to a documented feature of dicts, that they can be > > subclassed and given a __missing__ method: > > > > "If a SUBCLASS OF DICT [emphasis added] defines a method __missing__() > > and key is not present, the d[key] operation calls that method ..." > > So what method should be overridden to make a dict subclass useful as a > class or object dictionary (i.e. for attribute lookup to work with names > that have not been stored with dict.__setitem__)? I don't understand your question. Or rather, if I have understood it, the question has a trivial answer: you don't have to override anything. class MyDict(dict): pass d = MyDict() d.attr = 1 print(d.attr) will do what you appear to be asking. If that's not what you actually mean, then you need to explain more carefully. I will also point out that your question is based on a point of confusion. You ask: "So what method should be overridden..." but the answer to this in general must be "What makes you think a single method is sufficient?" This doesn't just apply to dicts, it applies in general to any class. "What method do I override to make a subclass of int implement arithmetic with wrap-around (e.g. 999+1 = 0, 501*2 = 2)?" > Overriding __getitem__ > or __missing__ doesn't work. My only consolation is that defaultdict > doesn't work either. > > I can't even figure out how to get the real class dict, as I would need > if I were overriding __getattribute__ explicitly in the metaclass (which > also doesn't work) - cls.__dict__ returns a mappingproxy. 
This doesn't make anything any clearer for me. > Alternatively, where, other than object and class dicts, are you > actually required to have a subclass of dict rather than a UserDict or > other duck-typed mapping? Anywhere you have to operate with code that does "if isinstance(x, dict)" checks. > Incidentally, why is __missing__ documented under defaultdict as "in > addition to the standard dict operations"? Because defaultdict provides __missing__ in addition to the standard dict operations. Is there a problem with the current docs? -- Steve From rosuav at gmail.com Thu Apr 21 22:13:00 2016 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 22 Apr 2016 12:13:00 +1000 Subject: [Python-ideas] Override dict.__new__ to raise if cls is not dict; do the same for str, list, etc. In-Reply-To: <20160422020017.GJ1819@ando.pearwood.info> References: <20160421031543.GE1819@ando.pearwood.info> <20160421111709.GF1819@ando.pearwood.info> <1461246197.4186699.585516009.6DFB227B@webmail.messagingengine.com> <20160422020017.GJ1819@ando.pearwood.info> Message-ID: On Fri, Apr 22, 2016 at 12:00 PM, Steven D'Aprano wrote: > On Thu, Apr 21, 2016 at 09:43:17AM -0400, Random832 wrote: >> On Thu, Apr 21, 2016, at 07:17, Steven D'Aprano wrote: >> > and it runs counter to a documented feature of dicts, that they can be >> > subclassed and given a __missing__ method: >> > >> > "If a SUBCLASS OF DICT [emphasis added] defines a method __missing__() >> > and key is not present, the d[key] operation calls that method ..." >> >> So what method should be overridden to make a dict subclass useful as a >> class or object dictionary (i.e. for attribute lookup to work with names >> that have not been stored with dict.__setitem__)? > > I don't understand your question. Or rather, if I have understood it, > the question has a trivial answer: you don't have to override anything. > > class MyDict(dict): > pass > > d = MyDict() > d.attr = 1 > print(d.attr) > > > will do what you appear to be asking. 
If that's not what you actually > mean, then you need to explain more carefully. "As a class or object dictionary". Consider: >>> class X: pass ... >>> X().__dict__ {} >>> type(_) <class 'dict'> Now, what can I replace that with? A regular dict works fine: >>> x = X() >>> x.__dict__ = {"asdf": 1} >>> x.asdf 1 But defining __missing__ doesn't create attributes automatically: >>> class AutoCreateDict(dict): ... def __missing__(self, key): ... return "<%r>" % key ... >>> x.__dict__ = AutoCreateDict() >>> x.qwer Traceback (most recent call last): File "<stdin>", line 1, in <module> AttributeError: 'X' object has no attribute 'qwer' And neither does overriding __getitem__: >>> class OtherAutoCreateDict(dict): ... def __getitem__(self, key): ... try: return super().__getitem__(key) ... except KeyError: return "<%r>" % key ... >>> x.__dict__ = OtherAutoCreateDict() >>> x.zxcv Traceback (most recent call last): File "<stdin>", line 1, in <module> AttributeError: 'X' object has no attribute 'zxcv' The question is a fair one. What can you do to make an object's dictionary provide attributes that weren't put there with __setitem__? ChrisA From random832 at fastmail.com Thu Apr 21 22:35:01 2016 From: random832 at fastmail.com (Random832) Date: Thu, 21 Apr 2016 22:35:01 -0400 Subject: [Python-ideas] Override dict.__new__ to raise if cls is not dict; do the same for str, list, etc. In-Reply-To: <20160422020017.GJ1819@ando.pearwood.info> References: <20160421031543.GE1819@ando.pearwood.info> <20160421111709.GF1819@ando.pearwood.info> <1461246197.4186699.585516009.6DFB227B@webmail.messagingengine.com> <20160422020017.GJ1819@ando.pearwood.info> Message-ID: <1461292501.2136146.586174281.5F74BB8C@webmail.messagingengine.com> On Thu, Apr 21, 2016, at 22:00, Steven D'Aprano wrote: > I don't understand your question. Or rather, if I have understood it, > the question has a trivial answer: you don't have to override anything.
I thought it went without saying that I want to be able to get items other than by having them already stored in the dict under the exact key. Like if I wanted to make a case-insensitive dict, or one where numbers are equivalent to strings, or have it materialize a default value for missing keys. > "So what method should be overridden..." > > but the answer to this in general must be "What makes you think a single > method is sufficient?" Well, I was assuming it doesn't call *more than one* method for the specific task of looking up an attribute by name, even if there are other methods for other tasks that I would have to override to get the whole package of what behavior I want. As near as I can tell, it doesn't call *any method at all*, but instead reaches inside PyDictObject's internal structure. > This doesn't just apply to dicts, it applies in > general to any class. > > "What method do I override to make a subclass of int implement > arithmetic with wrap-around (e.g. 999+1 = 0, 501*2 = 2)?" Right, but I'm asking which method *one particular* expression calls. This is more like "Which method on an object implements the + operator", only it's "Which method on an object's type's namespace dict implements the ability to look up attributes on that object?" In code terms: class mydict(dict): def ????????(self, name): if name == 'foo': return 'bar' mytype = type('C',(),mydict()) => desired result: mytype().foo == mytype.foo == 'bar' > > I can't even figure out how to get the real class dict, as I would need > if I were overriding __getattribute__ explicitly in the metaclass (which > also doesn't work) - cls.__dict__ returns a mappingproxy. > > This doesn't make anything any clearer for me. Given a type, how do I get a reference to the dict instance that was passed to the type's constructor? Or, in code terms: >>> mydict = {} >>> mytype = type('C',(),mydict) >>> mytype.__dict__ is mydict False >>> type(mytype.__dict__) <class 'mappingproxy'> def f(x): ????????
=> desired result: f(mytype) is mydict > > Alternatively, where, other than object and class dicts, are you > > actually required to have a subclass of dict rather than a UserDict or > > other duck-typed mapping? > > Anywhere you have to operate with code that does "if isinstance(x, > dict)" checks. Fix the other code, because it's wrong. If you can't, monkey-patch its builtins. > > Incidentally, why is __missing__ documented under defaultdict as "in > > addition to the standard dict operations"? > > Because defaultdict provides __missing__ in addition to the standard > dict operations. Is there a problem with the current docs? When I read it earlier, the wording in defaultdict's documentation seemed to suggest that what it provides is the ability to define a __missing__ method and have it be called - and that, itself, *is* a "standard dict operation" - rather than an implementation of the method. It looks like I misinterpreted it though. From vgr255 at live.ca Thu Apr 21 22:38:08 2016 From: vgr255 at live.ca (=?iso-8859-1?Q?=C9manuel_Barry?=) Date: Thu, 21 Apr 2016 22:38:08 -0400 Subject: [Python-ideas] Shrink recursion error tracebacks (was: Have REPL print less by default) In-Reply-To: References: <20160421024620.GD1819@ando.pearwood.info> <571934FD.4040306@stoneleaf.us> Message-ID: I implemented a small patch for this, over at http://bugs.python.org/issue26823 Comments and feedback are very much welcome :) -Emanuel From steve at pearwood.info Thu Apr 21 23:49:57 2016 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 22 Apr 2016 13:49:57 +1000 Subject: [Python-ideas] Override dict.__new__ to raise if cls is not dict; do the same for str, list, etc. 
In-Reply-To: <1461292501.2136146.586174281.5F74BB8C@webmail.messagingengine.com> References: <20160421031543.GE1819@ando.pearwood.info> <20160421111709.GF1819@ando.pearwood.info> <1461246197.4186699.585516009.6DFB227B@webmail.messagingengine.com> <20160422020017.GJ1819@ando.pearwood.info> <1461292501.2136146.586174281.5F74BB8C@webmail.messagingengine.com> Message-ID: <20160422034955.GK1819@ando.pearwood.info> On Thu, Apr 21, 2016 at 10:35:01PM -0400, Random832 wrote: > On Thu, Apr 21, 2016, at 22:00, Steven D'Aprano wrote: > > I don't understand your question. Or rather, if I have understood it, > > the question has a trivial answer: you don't have to override anything. > > I thought it went without saying Obviously not. > that I want to be able to get items > other than by having them already stored in the dict under the exact > key. "Items"? Like items in a dict? Then you already have at least two ways: you override __getitem__, or add a __missing__ method to the dict. What makes you think these techniques don't work? > Like if I wanted to make a case-insensitive dict, or one where > numbers are equivalent to strings, or have it materialize a default > value for missing keys. > > > "So what method should be overridden..." > > > > but the answer to this in general must be "What makes you think a single > > method is sufficient?" > > Well, I was assuming it doesn't call *more than one* method for the > specific task of looking up an attribute by name, Where does attribute lookup come into this? > even if there are > other methods for other tasks that I would have to override to get the > whole package of what behavior I want. As near as I can tell, it doesn't > call *any method at all*, but instead reaches inside PyDictObject's > internal structure. Are you still talking about *attribute lookup* or are you back to *key lookup*? Honestly Random, you have to be clear as to which you want, because they are different things.
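To make that difference concrete, here is a small sketch (class names invented for illustration): a __missing__ override is honoured for key lookup, but not for attribute lookup that goes through an instance __dict__:

```python
class DefaultingDict(dict):
    def __missing__(self, key):
        # called by dict.__getitem__ when the key is absent
        return "<%r>" % key

d = DefaultingDict()
print(d["anything"])  # prints <'anything'> -- key lookup consults __missing__

class X:
    pass

x = X()
x.__dict__ = DefaultingDict()  # an instance __dict__ must be a dict (subclass)
x.attr = 1
print(x.attr)                  # 1 -- names actually stored in the dict work
print(hasattr(x, "other"))     # False -- attribute lookup bypasses __missing__
```

The last line is the behaviour being discussed in this thread: CPython's attribute machinery reads the instance dict directly rather than through its subclass hooks.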
You can't just jump backwards and forwards between talking about dicts and attributes and expect people to understand what you mean. Classes and instances may not have a __dict__ at all, if they define __slots__ instead, or if they are builtins: py> (1).__dict__ Traceback (most recent call last): File "<stdin>", line 1, in <module> AttributeError: 'int' object has no attribute '__dict__' > > This doesn't just apply to dicts, it applies in > general to any class. > > "What method do I override to make a subclass of int implement > arithmetic with wrap-around (e.g. 999+1 = 0, 501*2 = 2)?" > > Right, but I'm asking which method *one particular* expression calls. > This is more like "Which method on an object implements the + operator", An excellent example, because the answer is *two* methods, __add__ and __radd__. You've made my point for me again. > only it's "Which method on an object's type's namespace dict implements > the ability to look up attributes on that object?" What makes you think that such a method exists? Have you read the rest of this thread? If not, I suggest you do, because the thread is all about making unjustified assumptions about how objects are implemented. You are doing it again. Where is it documented that there is a method on the object __dict__ (if such a __dict__ even exists!) that does this? So you are asking the wrong question. Don't ask about an implementation detail that you *assume* exists. You should ask, not "which method", but "how do I customise attribute lookup?" And you know the standard answer to that: override __getattribute__ or __getattr__. Do they not solve your problem? Another smart question might be, what are the constraints and rules for customizing behaviour by setting the object __dict__ to something other than a regular builtin dict? I don't know the answer to that. But don't make assumptions about the implementation, and having made those assumptions, assume that everyone else shares them and that they "go without saying".
*Especially not* in a thread that talks about how silly it is to make those assumptions. > Given a type, how do I get a reference to the dict instance that was > passed to the type's constructor? I believe that the answer to that is, you can't. > > > Alternatively, where, other than object and class dicts, are you > > > actually required to have a subclass of dict rather than a UserDict or > > > other duck-typed mapping? > > > > Anywhere you have to operate with code that does "if isinstance(x, > > dict)" checks. > > Fix the other code, because it's wrong. If you can't, monkey-patch its > builtins. You cannot assume that it is "wrong" just because it is inconvenient for you. Do you seriously think that monkey-patching the built-ins in production code is a good idea? -- Steve From random832 at fastmail.com Fri Apr 22 00:45:34 2016 From: random832 at fastmail.com (Random832) Date: Fri, 22 Apr 2016 00:45:34 -0400 Subject: [Python-ideas] Override dict.__new__ to raise if cls is not dict; do the same for str, list, etc. In-Reply-To: <20160422034955.GK1819@ando.pearwood.info> References: <20160421031543.GE1819@ando.pearwood.info> <20160421111709.GF1819@ando.pearwood.info> <1461246197.4186699.585516009.6DFB227B@webmail.messagingengine.com> <20160422020017.GJ1819@ando.pearwood.info> <1461292501.2136146.586174281.5F74BB8C@webmail.messagingengine.com> <20160422034955.GK1819@ando.pearwood.info> Message-ID: <1461300334.2972294.586239185.29AEDC83@webmail.messagingengine.com> On Thu, Apr 21, 2016, at 23:49, Steven D'Aprano wrote: > "Items"? Like items in a dict? Then you already have at least two ways: > you override __getitem__, or add a __missing__ method to the dict. > > What makes you think these techniques don't work? The fact that I tried them and they didn't work. > Where does attribute lookup come into this? 
Because looking up an attribute implies getting an item from the object or class's __dict__ with a string key of the name of the attribute (see below for the documented basis for this assumption). How are you not following this? I don't believe you're not messing with me. > Classes and instances may not have a __dict__ at all, if they define > __slots__ instead, or if they are builtins: If a class defines __slots__, the *instances* don't have a __dict__, but the *class* sure as hell still does. Even, as it turns out, if the metaclass had __slots__. (Not sure what's up with that, actually). > What makes you think that such a method exists? > > Have you read the rest of this thread? If not, I suggest you do, because > the thread is all about making unjustified assumptions about how objects > are implemented. You are doing it again. Where is it documented that > there is a method on the object __dict__ (if such a __dict__ even > exists!) that does this? My main concern here is the class dict. So, let's see... ### Class attribute references are translated to lookups in this dictionary, e.g., C.x is translated to C.__dict__["x"] Now, that is technically true (C.__dict__ is, we've established, not the actual dict, but a "mappingproxy" object), but this behavior itself contradicts the documentation: ### [type] With three arguments, [...] and the dict dictionary is the namespace containing definitions for class body and becomes the __dict__ attribute. Except, it *doesn't* become the __dict__ attribute - its contents are *copied* into the __dict__ object, which is a new "mappingproxy" whose contents will not reflect further updates to the dict that was passed in. And regarding the object __dict__, when such a __dict__ *does* exist (since, unlike class dicts, you actually can set object dicts to be arbitrary dict subclasses) ### The default behavior for attribute access is to get, set, or delete the attribute from an object's dictionary.
For instance, a.x has a lookup chain starting with a.__dict__['x'], then type(a).__dict__['x'], and continuing through the base classes of type(a) excluding metaclasses. > > Given a type, how do I get a reference to the dict instance that was > passed to the type's constructor? > > I believe that the answer to that is, you can't. I hadn't yet realized, when I asked this question, that the class __dict__ "mappingproxy" is a new object (that isn't even a dict! "A class has a namespace implemented by a dictionary object" was also wrong) and doesn't retain a reference to the passed-in dict nor reflect changes to it. From ben+python at benfinney.id.au Fri Apr 22 01:17:34 2016 From: ben+python at benfinney.id.au (Ben Finney) Date: Fri, 22 Apr 2016 15:17:34 +1000 Subject: [Python-ideas] Override dict.__new__ to raise if cls is not dict; do the same for str, list, etc. References: <20160421031543.GE1819@ando.pearwood.info> <20160421111709.GF1819@ando.pearwood.info> <1461246197.4186699.585516009.6DFB227B@webmail.messagingengine.com> <20160422020017.GJ1819@ando.pearwood.info> <1461292501.2136146.586174281.5F74BB8C@webmail.messagingengine.com> <20160422034955.GK1819@ando.pearwood.info> <1461300334.2972294.586239185.29AEDC83@webmail.messagingengine.com> Message-ID: <854maunrwx.fsf@benfinney.id.au> Random832 writes: > On Thu, Apr 21, 2016, at 23:49, Steven D'Aprano wrote: > > Where does attribute lookup come into this? > > Because looking up an attribute implies getting an item from the > object or class's __dict__ with a string key of the name of the > attribute (see below for the documented basis for this assumption). No, that's not what it implies. The '__dict__' of an object is an implementation detail, and is not necessarily used for attribute lookup. As the documentation says: A class has a namespace implemented by a dictionary object.
Class attribute references are translated to lookups in this dictionary, e.g., C.x is translated to C.__dict__["x"] (although there are a number of hooks which allow for other means of locating attributes). So no, attribute lookup does not imply getting an item from any particular dictionary. Nothing in the documentation implies that -- if you find an exception, please file a bug report for that part of the documentation. > How are you not following this? I don't believe you're not messing > with me. Steven follows quite well; he is trying to get you to explain your meaning so we can find where the mismatch is. > > > Given a type, how do I get a reference to the dict instance that > > > was passed to the type's constructor? > > > > I believe that the answer to that is, you can't. > > I hadn't yet realized, when I asked this question, that the class > __dict__ "mappingproxy" is a new object (that isn't even a dict! "A > class has a namespace implemented by a dictionary object" was also > wrong) and doesn't retain a reference to the passed-in dict nor > reflect changes to it. That's right. As documented, attribute lookup does not imply any dictionary lookup for the attribute. Attribute lookup by name is one thing. Dictionary lookup by key is quite a different thing. Please let's keep the two distinct in any discussions. -- \ "Ubi dubium, ibi libertas." ("Where there is doubt, there is | `\ freedom.") | _o__) | Ben Finney From greg.ewing at canterbury.ac.nz Fri Apr 22 01:39:16 2016 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 22 Apr 2016 17:39:16 +1200 Subject: [Python-ideas] Override dict.__new__ to raise if cls is not dict; do the same for str, list, etc.
In-Reply-To: <1461300334.2972294.586239185.29AEDC83@webmail.messagingengine.com> References: <20160421031543.GE1819@ando.pearwood.info> <20160421111709.GF1819@ando.pearwood.info> <1461246197.4186699.585516009.6DFB227B@webmail.messagingengine.com> <20160422020017.GJ1819@ando.pearwood.info> <1461292501.2136146.586174281.5F74BB8C@webmail.messagingengine.com> <20160422034955.GK1819@ando.pearwood.info> <1461300334.2972294.586239185.29AEDC83@webmail.messagingengine.com> Message-ID: <5719B904.3060603@canterbury.ac.nz> Random832 wrote: > If a class defines __slots__, the *instances* don't have a __dict__, but > the *class* sure as hell still does. Even, as it turns out, if the > metaclass had __slots__. (Not sure what's up with that, actually). It's because metaclasses are all based on type, whose instances have a dict. An object will always have a dict if any of its base classes cause it to have a dict. -- Greg From rosuav at gmail.com Fri Apr 22 02:06:39 2016 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 22 Apr 2016 16:06:39 +1000 Subject: [Python-ideas] Override dict.__new__ to raise if cls is not dict; do the same for str, list, etc. In-Reply-To: <854maunrwx.fsf@benfinney.id.au> References: <20160421031543.GE1819@ando.pearwood.info> <20160421111709.GF1819@ando.pearwood.info> <1461246197.4186699.585516009.6DFB227B@webmail.messagingengine.com> <20160422020017.GJ1819@ando.pearwood.info> <1461292501.2136146.586174281.5F74BB8C@webmail.messagingengine.com> <20160422034955.GK1819@ando.pearwood.info> <1461300334.2972294.586239185.29AEDC83@webmail.messagingengine.com> <854maunrwx.fsf@benfinney.id.au> Message-ID: On Fri, Apr 22, 2016 at 3:17 PM, Ben Finney wrote: > Random832 writes: > >> On Thu, Apr 21, 2016, at 23:49, Steven D'Aprano wrote: >> > Where does attribute lookup come into this? 
>> Because looking up an attribute implies getting an item from the >> object or class's __dict__ with a string key of the name of the >> attribute (see below for the documented basis for this assumption). > > No, that's not what it implies. The '__dict__' of an object is an > implementation detail, and is not necessarily used for attribute lookup. > > As the documentation says: > > A class has a namespace implemented by a dictionary object. Class > attribute references are translated to lookups in this dictionary, > e.g., C.x is translated to C.__dict__["x"] (although there are a > number of hooks which allow for other means of locating attributes). > > Okay. That definitely implies that the class itself has a real dict, though. And it should be possible, using a metaclass, to manipulate that, right? >>> class AutoCreateDict(dict): ... def __missing__(self, key): ... return "<%r>" % key ... def __repr__(self): ... return "AutoCreateDict" + super().__repr__() ... >>> class DemoMeta(type): ... def __new__(*a): ... print("new:", a) ... return type.__new__(*a) ... @classmethod ... def __prepare__(*a): ... print("prepare:", a) ... return AutoCreateDict() ... >>> class Demo(metaclass=DemoMeta): pass ... prepare: (<class '__main__.DemoMeta'>, 'Demo', ()) new: (<class '__main__.DemoMeta'>, 'Demo', (), AutoCreateDict{'__qualname__': 'Demo', '__module__': "<'__name__'>"}) >>> Demo.asdf Traceback (most recent call last): File "<stdin>", line 1, in <module> AttributeError: type object 'Demo' has no attribute 'asdf' >>> Demo().asdf Traceback (most recent call last): File "<stdin>", line 1, in <module> AttributeError: 'Demo' object has no attribute 'asdf' >>> Demo.__dict__ mappingproxy({'__dict__': <attribute '__dict__' of 'Demo' objects>, '__doc__': None, '__weakref__': <attribute '__weakref__' of 'Demo' objects>, '__module__': "<'__name__'>"}) >>> Demo().__dict__ {} Maybe I'm just misunderstanding how metaclasses should be written, but this seems like it ought to work. And it's not making any use of the AutoCreateDict - despite __prepare__ and __new__ clearly being called, the latter with an AutoCreateDict instance.
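For contrast, here is a sketch of what a __prepare__ namespace demonstrably *can* do: intercept bindings while the class body executes, e.g. to record definition order (the PEP 3115 mechanism; all names here are invented for illustration):

```python
class OrderRecorder(dict):
    """Dict subclass that records the order in which names are bound."""
    def __init__(self):
        super().__init__()
        self.order = []

    def __setitem__(self, key, value):
        # called for every binding made while the class body runs,
        # because the namespace is not an exact dict
        if key not in self:
            self.order.append(key)
        super().__setitem__(key, value)

class RecordingMeta(type):
    @classmethod
    def __prepare__(mcls, name, bases, **kwds):
        return OrderRecorder()

    def __new__(mcls, name, bases, ns, **kwds):
        cls = super().__new__(mcls, name, bases, dict(ns))
        # filter out implicit dunders like __module__ and __qualname__
        cls._field_order = [k for k in ns.order if not k.startswith("__")]
        return cls

class Point(metaclass=RecordingMeta):
    y = 0
    x = 0

print(Point._field_order)  # ['y', 'x'] -- body order, not alphabetical
```

The interception happens only during class-body execution; once type() has copied the namespace, the customisation is gone.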
But by the time it gets actually attached to the dictionary, we have a mappingproxy that ignores __missing__. The docs say "are translated to", implying that this will actually be equivalent. Using a non-dict-subclass results in prompt rejection in the type() constructor: >>> class DictLike: ... def __getitem__(self, key): ... print("DictLike get", key) ... return "<%r>" % key ... def __setitem__(self, key, value): ... print("DictLike set", key, value) ... >>> class Demo(metaclass=DemoMeta): pass ... prepare: (<class '__main__.DemoMeta'>, 'Demo', ()) DictLike get __name__ DictLike set __module__ <'__name__'> DictLike set __qualname__ Demo new: (<class '__main__.DemoMeta'>, 'Demo', (), <__main__.DictLike object at 0x7f2b41b96ac8>) Traceback (most recent call last): File "<stdin>", line 1, in <module> File "<stdin>", line 4, in __new__ TypeError: type() argument 3 must be dict, not DictLike Is there any way to make use of this documented transformation? ChrisA From niki.spahiev at gmail.com Fri Apr 22 02:49:25 2016 From: niki.spahiev at gmail.com (Niki Spahiev) Date: Fri, 22 Apr 2016 09:49:25 +0300 Subject: [Python-ideas] Bytecode for calling function with keyword arguments In-Reply-To: References: Message-ID: On 21.04.2016 20:28, Serhiy Storchaka wrote: > On 21.04.16 19:00, Chris Angelico wrote: >> On Fri, Apr 22, 2016 at 1:52 AM, Serhiy Storchaka >> wrote: >>> 2. Since the number of keyword arguments is obtained from tuple's >>> size, new >>> CALL_FUNCTION opcode needs only the number of positional arguments. It's >>> argument is simpler and needs less bits (important for wordcode). >> >> What about calls that don't include any keyword arguments? Currently, >> no pushing is done. Will you have two opcodes, one if there are kwargs >> and one if there are not? > > Yes, we need either two opcodes, or a falg in the argument. I prefer the
We will have > corresponding opcode with new way to provide keyword arguments. > > Instead of 4 current opcodes we will need at most 8 new opcodes. May be > less, because calls with *args or **kwargs is less performance critical, > we can pack arguments in a tuple and a dict by separate opcodes and call > a function just with a tuple and a dict. We can limit second argument (number of kw args) to be 0 or 1 only. With 1 meaning single tuple with the names. Niki From steve at pearwood.info Fri Apr 22 07:03:35 2016 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 22 Apr 2016 21:03:35 +1000 Subject: [Python-ideas] Override dict.__new__ to raise if cls is not dict; do the same for str, list, etc. In-Reply-To: <854maunrwx.fsf@benfinney.id.au> References: <20160421031543.GE1819@ando.pearwood.info> <20160421111709.GF1819@ando.pearwood.info> <1461246197.4186699.585516009.6DFB227B@webmail.messagingengine.com> <20160422020017.GJ1819@ando.pearwood.info> <1461292501.2136146.586174281.5F74BB8C@webmail.messagingengine.com> <20160422034955.GK1819@ando.pearwood.info> <1461300334.2972294.586239185.29AEDC83@webmail.messagingengine.com> <854maunrwx.fsf@benfinney.id.au> Message-ID: <20160422110334.GA13497@ando.pearwood.info> On Fri, Apr 22, 2016 at 03:17:34PM +1000, Ben Finney wrote: > Random832 writes: > > > On Thu, Apr 21, 2016, at 23:49, Steven D'Aprano wrote: > > > Where does attribute lookup come into this? > > > > Because looking up an attribute implies getting an item from the > > object or class's __dict__ with a string key of the name of the > > attribute (see below for the documented basis for this assumption). > > No, that's not what it implies. The '__dict__' of an object is an > implementation detail, and is not necessarily used for attribute lookup.
But I think that the details are far more complex than "attribute lookups are done by key access to a dict __dict__" (for example, if attribute lookups are done by inspecting obj.__dict__, how do you look up obj.__dict__?). And I don't think that the docs suggest that they are replaceable by dict subclasses. > > How are you not following this? I don't believe you're not messing > > with me. > > Steven follows quite well; he is trying to get you to explain your > meaning so we can find where the mismatch is. Not quite: my first email was genuinely confused. When I said I didn't understand what Random was trying to say, I meant it. By my second email, or at least the end of it, I could guess what he was trying to do, which (I think) is to replace __dict__ with a subclass instance in order to customise attribute lookup. -- Steve From steve at pearwood.info Fri Apr 22 07:16:52 2016 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 22 Apr 2016 21:16:52 +1000 Subject: [Python-ideas] Override dict.__new__ to raise if cls is not dict; do the same for str, list, etc. In-Reply-To: <1461300334.2972294.586239185.29AEDC83@webmail.messagingengine.com> References: <20160421031543.GE1819@ando.pearwood.info> <20160421111709.GF1819@ando.pearwood.info> <1461246197.4186699.585516009.6DFB227B@webmail.messagingengine.com> <20160422020017.GJ1819@ando.pearwood.info> <1461292501.2136146.586174281.5F74BB8C@webmail.messagingengine.com> <20160422034955.GK1819@ando.pearwood.info> <1461300334.2972294.586239185.29AEDC83@webmail.messagingengine.com> Message-ID: <20160422111652.GB13497@ando.pearwood.info> On Fri, Apr 22, 2016 at 12:45:34AM -0400, Random832 wrote: [...] > My main concern here is the class dict. So, let's see...
> > ### Class attribute references are translated to lookups in this > dictionary, e.g., C.x is translated to C.__dict__["x"] > > Now, that is technically true (C.__dict__ is, we've established, not the > actual dict, but a "mappingproxy" object), but this behavior itself > contradicts the documentation: > > ### [type] With three arguments, [...] and the dict dictionary is the > namespace containing definitions for class body and becomes the __dict__ > attribute. > > Except, it *doesn't* become the __dict__ attribute - its contents are > *copied* into the __dict__ object, which is a new "mappingproxy" whose > contents will not reflect further updates to the dict that was passed > in. That is a very good point. I think that's a documentation bug. > And regarding the object __dict__, when such a __dict__ *does* exist > (since, unlike class dicts, you actually can set object dicts to be > arbitrary dict subclasses) True, but the documentation doesn't say that attribute lookup goes through the *full* dict key lookup, including support of __missing__ and __getitem__. I'll grant you that neither does the documentation say that it doesn't, so I'd call this a documentation bug. -- Steve From ethan at stoneleaf.us Fri Apr 22 07:51:47 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Fri, 22 Apr 2016 04:51:47 -0700 Subject: [Python-ideas] Override dict.__new__ to raise if cls is not dict; do the same for str, list, etc. 
In-Reply-To: References: <20160421031543.GE1819@ando.pearwood.info> <20160421111709.GF1819@ando.pearwood.info> <1461246197.4186699.585516009.6DFB227B@webmail.messagingengine.com> <20160422020017.GJ1819@ando.pearwood.info> <1461292501.2136146.586174281.5F74BB8C@webmail.messagingengine.com> <20160422034955.GK1819@ando.pearwood.info> <1461300334.2972294.586239185.29AEDC83@webmail.messagingengine.com> <854maunrwx.fsf@benfinney.id.au> Message-ID: <571A1053.3010301@stoneleaf.us> On 04/21/2016 11:06 PM, Chris Angelico wrote: > On Fri, Apr 22, 2016 at 3:17 PM, Ben Finney wrote: >> Random832 writes: >>> Because looking up an attribute implies getting an item from the >>> object or class's __dict__ with a string key of the name of the >>> attribute (see below for the documented basis for this assumption). >> >> No, that's not what it implies. The ?__dict__? of an object is an >> implementation detail, and is not necessarily used for attribute lookup. >> >> As the documentation says: >> >> A class has a namespace implemented by a dictionary object. Class >> attribute references are translated to lookups in this dictionary, >> e.g., C.x is translated to C.__dict__["x"] (although there are a >> number of hooks which allow for other means of locating attributes). >> >> > > Okay. That definitely implies that the class itself has a real dict, > though. A class does have a real dict() -- not a subclass, not a look-a-like, but a real, honest-to-goodness, freshly minted {}. > And it should be possible, using a metaclass, to manipulate > that, right? Only for the purposes of class creation. One of the final steps of creating a class (after __prepare__ and __new__ have been called) is to copy everything from whatever dict (sub)class you used into a brand-new dict. > Maybe I'm just misunderstanding how metaclasses should be written, but > this seems like it ought to work. You have the metaclass portion right. 
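That boundary is easy to see from the outside. A minimal sketch of the behaviour (standard CPython, nothing beyond what is described above): item assignment on the exposed proxy fails, and setattr() is the supported write path once the class exists.

```python
class Demo:
    x = 1

# After creation, the class namespace is exposed only as a read-only proxy:
try:
    Demo.__dict__["y"] = 2
except TypeError:
    pass  # mappingproxy rejects item assignment
else:
    raise AssertionError("mappingproxy should reject item assignment")

# Updates must go through the attribute machinery instead:
setattr(Demo, "y", 2)
assert Demo.__dict__["y"] == 2
```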
> And it's not making any use of the > AutoCreateDict - despite __prepare__ and __new__ clearly being called, > the latter with an AutoCreateDict instance. Well, it would if you tried creating any attributes without values during class creation. > But by the time it gets > actually attached to the dictionary, we have a mappingproxy that > ignores __missing__. No, it's a dict -- we just don't get to directly fiddle with it (hence the return of a mappingproxy). > The docs say "are translated to", implying that > this will actually be equivalent. > Is there any way to make use of this documented transformation? There are already many hooks in place to supplement or replace attribute lookup -- use them instead. :) -- ~Ethan~ From ethan at stoneleaf.us Fri Apr 22 07:58:53 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Fri, 22 Apr 2016 04:58:53 -0700 Subject: [Python-ideas] Override dict.__new__ to raise if cls is not dict; do the same for str, list, etc. In-Reply-To: <20160422111652.GB13497@ando.pearwood.info> References: <20160421031543.GE1819@ando.pearwood.info> <20160421111709.GF1819@ando.pearwood.info> <1461246197.4186699.585516009.6DFB227B@webmail.messagingengine.com> <20160422020017.GJ1819@ando.pearwood.info> <1461292501.2136146.586174281.5F74BB8C@webmail.messagingengine.com> <20160422034955.GK1819@ando.pearwood.info> <1461300334.2972294.586239185.29AEDC83@webmail.messagingengine.com> <20160422111652.GB13497@ando.pearwood.info> Message-ID: <571A11FD.2060107@stoneleaf.us> On 04/22/2016 04:16 AM, Steven D'Aprano wrote: > On Fri, Apr 22, 2016 at 12:45:34AM -0400, Random832 wrote: >> ### Class attribute references are translated to lookups in this >> dictionary, e.g., C.x is translated to C.__dict__["x"] >> >> Now, that is technically true (C.__dict__ is, we've established, not the >> actual dict, but a "mappingproxy" object), but this behavior itself >> contradicts the documentation: No, it's a real dict() -- we just don't get direct access to it anymore. 
>> ### [type] With three arguments, [...] and the dict dictionary is the >> namespace containing definitions for class body and becomes the __dict__ >> attribute. >> >> Except, it *doesn't* become the __dict__ attribute - its contents are >> *copied* into the __dict__ object, which is a new "mappingproxy" whose >> contents will not reflect further updates to the dict that was passed >> in. See above about the 'mappingproxy'. As for updates, IIRC setattr() is the way to make changes these days. > That is a very good point. I think that's a documentation bug. Patches welcome. ;) -- ~Ethan~ From ncoghlan at gmail.com Fri Apr 22 09:24:10 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 22 Apr 2016 23:24:10 +1000 Subject: [Python-ideas] Override dict.__new__ to raise if cls is not dict; do the same for str, list, etc. In-Reply-To: References: <20160421031543.GE1819@ando.pearwood.info> <20160421111709.GF1819@ando.pearwood.info> <1461246197.4186699.585516009.6DFB227B@webmail.messagingengine.com> <20160422020017.GJ1819@ando.pearwood.info> <1461292501.2136146.586174281.5F74BB8C@webmail.messagingengine.com> <20160422034955.GK1819@ando.pearwood.info> <1461300334.2972294.586239185.29AEDC83@webmail.messagingengine.com> <854maunrwx.fsf@benfinney.id.au> Message-ID: On 22 April 2016 at 16:06, Chris Angelico wrote: > Maybe I'm just misunderstanding how metaclasses should be written, but > this seems like it ought to work. And it's not making any use of the > AutoCreateDict - despite __prepare__ and __new__ clearly being called, > the latter with an AutoCreateDict instance. But by the time it gets > actually attached to the dictionary, we have a mappingproxy that > ignores __missing__. The docs say "are translated to", implying that > this will actually be equivalent. > The mapping used during class body execution can be customised via __prepare__, but the resulting contents of that mapping are still copied to a regular dictionary when constructing the class object itself. 
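That copy step is observable from pure Python. A small sketch (the `LoggingDict` and `Meta` names here are invented for illustration): the class body really does execute in the prepared mapping, but the finished class keeps its own copy, so later edits to the prepared mapping are never seen.

```python
from types import MappingProxyType

class LoggingDict(dict):
    """Hypothetical dict subclass returned from __prepare__."""

class Meta(type):
    prepared = None
    @classmethod
    def __prepare__(mcls, name, bases, **kwds):
        mcls.prepared = LoggingDict()
        return mcls.prepared

class C(metaclass=Meta):
    x = 1

assert isinstance(Meta.prepared, LoggingDict)  # the body ran in our mapping
assert Meta.prepared["x"] == 1
assert type(C.__dict__) is MappingProxyType    # ...but the class holds a copy
Meta.prepared["y"] = 2                         # mutating the prepared mapping
assert not hasattr(C, "y")                     # afterwards changes nothing
```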
We deliberately don't provide a mechanism to customise the runtime dictionary used by object instances, regardless of whether they're normal instances or type definitions. In combination with the __dict__ descriptor only exposing a mapping proxy, this ensures that all Python level modifications to the contents go through the descriptor machinery - you can't get your hands on a mutable pointer to the post-creation namespace. It looks like there *is* a missing detail in the data model docs in relation to this, though: https://docs.python.org/3/reference/datamodel.html#creating-the-class-object should state explicitly that the namespace contents are copied to a plain dict (which is then never exposed directly to Python code), but it doesn't. Cheers, Nick. P.S. I actually played around with an experimental interpreter build that dropped the copy-to-a-new-namespace step back when I was working on https://www.python.org/dev/peps/pep-0422/#new-ways-of-using-classes. It's astonishingly broken in the number of ways it offers to corrupt the interpreter state (since you can entirely bypass the descriptor machinery, which the rest of the interpreter expects to be impossible if you're not messing about with ctypes or C extensions), but kinda fun in the quirky action at a distance it makes possible :) -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From rosuav at gmail.com Fri Apr 22 09:34:01 2016 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 22 Apr 2016 23:34:01 +1000 Subject: [Python-ideas] Override dict.__new__ to raise if cls is not dict; do the same for str, list, etc.
In-Reply-To: References: <20160421031543.GE1819@ando.pearwood.info> <20160421111709.GF1819@ando.pearwood.info> <1461246197.4186699.585516009.6DFB227B@webmail.messagingengine.com> <20160422020017.GJ1819@ando.pearwood.info> <1461292501.2136146.586174281.5F74BB8C@webmail.messagingengine.com> <20160422034955.GK1819@ando.pearwood.info> <1461300334.2972294.586239185.29AEDC83@webmail.messagingengine.com> <854maunrwx.fsf@benfinney.id.au> Message-ID: On Fri, Apr 22, 2016 at 11:24 PM, Nick Coghlan wrote: > > It looks like there *is* a missing detail in the data model docs in relation > to this, though: > https://docs.python.org/3/reference/datamodel.html#creating-the-class-object > should state explicitly that the namespace contents are copied to a plain > dict (which is then never exposed directly to Python code), but it doesn't. Okay, that explains my confusion, at least :) I can't think of any situations where I actually *want* a dict subclass (__getattr[ibute]__ or descriptors can do everything I can come up with), but it'd be good to document that that's not possible. ChrisA From ericsnowcurrently at gmail.com Fri Apr 22 15:19:59 2016 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Fri, 22 Apr 2016 13:19:59 -0600 Subject: [Python-ideas] Override dict.__new__ to raise if cls is not dict; do the same for str, list, etc. 
In-Reply-To: <20160422111652.GB13497@ando.pearwood.info> References: <20160421031543.GE1819@ando.pearwood.info> <20160421111709.GF1819@ando.pearwood.info> <1461246197.4186699.585516009.6DFB227B@webmail.messagingengine.com> <20160422020017.GJ1819@ando.pearwood.info> <1461292501.2136146.586174281.5F74BB8C@webmail.messagingengine.com> <20160422034955.GK1819@ando.pearwood.info> <1461300334.2972294.586239185.29AEDC83@webmail.messagingengine.com> <20160422111652.GB13497@ando.pearwood.info> Message-ID: On Fri, Apr 22, 2016 at 5:16 AM, Steven D'Aprano wrote: > On Fri, Apr 22, 2016 at 12:45:34AM -0400, Random832 wrote: >> ### [type] With three arguments, [...] and the dict dictionary is the >> namespace containing definitions for class body and becomes the __dict__ >> attribute. >> >> Except, it *doesn't* become the __dict__ attribute - its contents are >> *copied* into the __dict__ object, which is a new "mappingproxy" whose >> contents will not reflect further updates to the dict that was passed >> in. > > That is a very good point. I think that's a documentation bug. mappingproxy (a.k.a. types.MappingProxyType) is exactly what it says: a proxy for a mapping. Basically, it is a wrapper around a collections.abc.Mapping. The namespace (a dict or dict sub-class, e.g. OrderedDict) passed to type() is copied into a new dict and the new type's __dict__ is set to a mappingproxy that wraps that copy. So if there is a documentation bug then it is the ambiguity of the word "becomes". Perhaps it would be more correct as "is copied into". > >> And regarding the object __dict__, when such a __dict__ *does* exist >> (since, unlike class dicts, you actually can set object dicts to be >> arbitrary dict subclasses) > > True, but the documentation doesn't say that attribute lookup goes > through the *full* dict key lookup, including support of __missing__ and > __getitem__. I'll grant you that neither does the documentation say that > it doesn't, so I'd call this a documentation bug. 
What would you say is the specific documentation bug? That the default attribute lookup (object.__getattribute__()) does not use the obj.__dict__ attribute but rather the dict it points to (if it knows about it)? Or just that anything set to __dict__ is not guaranteed to be honored by the default __getattribute__()? -eric From ericsnowcurrently at gmail.com Fri Apr 22 16:12:57 2016 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Fri, 22 Apr 2016 14:12:57 -0600 Subject: [Python-ideas] Override dict.__new__ to raise if cls is not dict; do the same for str, list, etc. In-Reply-To: <1461300334.2972294.586239185.29AEDC83@webmail.messagingengine.com> References: <20160421031543.GE1819@ando.pearwood.info> <20160421111709.GF1819@ando.pearwood.info> <1461246197.4186699.585516009.6DFB227B@webmail.messagingengine.com> <20160422020017.GJ1819@ando.pearwood.info> <1461292501.2136146.586174281.5F74BB8C@webmail.messagingengine.com> <20160422034955.GK1819@ando.pearwood.info> <1461300334.2972294.586239185.29AEDC83@webmail.messagingengine.com> Message-ID: On Thu, Apr 21, 2016 at 10:45 PM, Random832 wrote: > On Thu, Apr 21, 2016, at 23:49, Steven D'Aprano wrote: >> Where does attribute lookup come into this? > > Because looking up an attribute implies getting an item from the object > or class's __dict__ with a string key of the name of the attribute (see > below for the documented basis for this assumption) Attribute access and item access communicate different things about the namespaces on which they operate. The fact that objects use mappings under the hood for the default attribute access behavior is an implementation detail. Python does not document any mechanism to hook arbitrary mappings into the default attribute access behavior (i.e. object.__getattribute__()). >> What makes you think that such a method exists? >> >> Have you read the rest of this thread?
If not, I suggest you do, because >> the thread is all about making unjustified assumptions about how objects >> are implemented. You are doing it again. Where is it documented that >> there is a method on the object __dict__ (if such a __dict__ even >> exists!) that does this? > > My main concern here is the class dict. So, let's see... > > ### Class attribute references are translated to lookups in this > dictionary, e.g., C.x is translated to C.__dict__["x"] > Not exactly. You still have to factor in descriptors (including slots). [1] In the absence of those then you are correct that the current implementation of type.__getattribute__() (not the same as object.__getattribute__(), BTW) is a lookup on the type's __dict__. However, that is done directly on tp_dict and not on .__dict__, if you want to talk about implementation details. Of course, the point is moot, as you've pointed out, since a type's __dict__ is both unsettable and a read-only view. > And regarding the object __dict__, when such a __dict__ *does* exist > (since, unlike class dicts, you actually can set object dicts to be > arbitrary dict subclasses) > > ### The default behavior for attribute access is to get, set, or delete > the attribute from an object's dictionary. For instance, a.x has a > lookup chain starting with a.__dict__['x'], then type(a).__dict__['x'], > and continuing through the base classes of type(a) excluding > metaclasses. Again, you also have to factor in descriptors. [2] Regardless of that, it's important to realize that object.__getattribute__() doesn't use any object's __dict__ attribute for any of the lookups you've identified. This is because you can't really look up the object's __dict__ attribute *while* doing attribute lookup. It's the same way that the following doesn't work:

class Spam:
    def __getattribute__(self, name):
        if self.__class__ == Spam:  # infinite recursion!!!
            ...
        return object.__getattribute__(self, name)

Instead you have to do this:

class Spam:
    def __getattribute__(self, name):
        if object.__getattribute__(self, "__class__") == Spam:
            ...
        return object.__getattribute__(self, name)

Hence, object.__getattribute__() only does lookup on the dict identified through the tp_dictoffset field. However, as you've noted, objects of custom classes have a settable __dict__ attribute. This is because by default the mapping is tied to the object at the tp_dictoffset of the object's type. [3] Notably, the mapping must be of type dict or of a dict subclass. What this implies to me is that someone went to the trouble of allowing folks to use some other dict (or OrderedDict, etc.) than the one you get by default with a new object. However, either they felt that using a non-dict mapping type was asking for too much trouble, there were performance concerns, or they did not want to go to the effort to fix all the places that expect __dict__ to be an actual dict. It's probably all three. Keep in mind that even with a dict subclass the implementation of object.__getattribute__() can't sensibly use normal lookup when it does the lookup on the underlying namespace dict. Instead it uses PyDict_GetItem(), which is basically equivalent to calling dict.__getitem__(ns, attr). Hence that underlying mapping must be a dict or dict subclass, and any overridden __getitem__() method is ignored.
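That last point can be demonstrated without touching the C level. A small sketch (the `DefaultingDict` name is invented here; the behaviour shown is CPython's): a dict subclass is accepted as an instance __dict__, but its overridden __getitem__() and __missing__() are bypassed during attribute lookup.

```python
class DefaultingDict(dict):
    """Hypothetical dict subclass that tries to hook item lookup."""
    def __missing__(self, key):
        return "<default>"
    def __getitem__(self, key):
        return "<hooked>"

class Spam:
    pass

s = Spam()
s.__dict__ = DefaultingDict(x=1)   # a dict *subclass* is accepted...
assert s.x == 1                    # ...but lookup bypasses __getitem__
try:
    s.missing                      # and __missing__ is never consulted
except AttributeError:
    pass
else:
    raise AssertionError("expected AttributeError")
```

Note that ordinary subscripting still sees the hooks (`s.__dict__["x"]` returns "<hooked>"); only the attribute machinery goes around them.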
-eric [1] https://hg.python.org/cpython/file/default/Objects/typeobject.c#l2924 [2] https://hg.python.org/cpython/file/default/Objects/object.c#l1028 [3] https://hg.python.org/cpython/file/default/Objects/object.c#l1195 From vgr255 at live.ca Fri Apr 22 22:00:12 2016 From: vgr255 at live.ca (=?UTF-8?Q?=C3=89manuel_Barry?=) Date: Fri, 22 Apr 2016 22:00:12 -0400 Subject: [Python-ideas] Shrink recursion error tracebacks (was: Have REPL print less by default) In-Reply-To: References: Message-ID: I have been thinking, and especially after Nick's comments, that it might be better to keep it as simple as possible to reduce the risk of errors happening while we're printing the tracebacks. Recursive functions (that go deep enough to trigger an exception) are usually within one function, in the REPL, or by using __getattr__ / __getattribute__. Nice to have? Sure. Necessary? Don't think as much. I'm also a very solid -1 on the idea of prompting the user to do something like full_ex() to get the full stack trace. My rationale for such a change is that we need to be extremely precise in our message, to leave absolutely no room for a different interpretation than what we mean. Your message, for instance, is ambiguous. The fact it says that calls are hidden would let me think I just lost information about the stack trace (though that's but a wording issue). As a user, just seeing "Call _full_ex() to print full trace." would be an immediate red flag: Crap, I just lost precious debugging information to save on lines! Might be just me, but that's my opinion anyway :) -Emanuel

Example:
  File "<stdin>", line 1, in f
  File "<stdin>", line 1, in g
  [Mutually recursive calls hidden: f (300), g (360)]
  File "<stdin>", line 1, in h
  File "<stdin>", line 1, in f
  File "<stdin>", line 1, in g
  [Mutual-recursive calls hidden: f (103), g (200)]
RuntimeError: maximum recursion depth exceeded
[963 calls hidden. Call _full_ex() to print full trace.]

This rule is easily modified to, "have been seen at least three times before."
For functions that recurse at multiple lines, it can print out one message per line, but maybe it should count them all together in the "hidden" summary. From steve at pearwood.info Fri Apr 22 23:50:10 2016 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 23 Apr 2016 13:50:10 +1000 Subject: [Python-ideas] Shrink recursion error tracebacks (was: Have REPL print less by default) In-Reply-To: References: Message-ID: <20160423035009.GC13497@ando.pearwood.info> On Fri, Apr 22, 2016 at 10:00:12PM -0400, Émanuel Barry wrote: > I'm also a very solid -1 on the idea of prompting the user to do > something like full_ex() to get the full stack trace. I'm not sure who suggested that idea. I certainly didn't. > My rationale for > such a change is that we need to be extremely precise in our message, > to leave absolutely no room for a different interpretation than what > we mean. Your message, for instance, is ambiguous. I'm not sure who you are talking to here, or whose suggestion the following faked stack trace is. [...]

> Example:
>   File "<stdin>", line 1, in f
>   File "<stdin>", line 1, in g
>   [Mutually recursive calls hidden: f (300), g (360)]

For the record, detecting subsequences of mutually recursive calls is not a trivial task. (It's not the same as cycle detection.) And I don't understand how you can get 300 calls to f and 360 calls to g in a scenario where f calls g and g calls f. Surely there would be an equal number of calls to each?

>   File "<stdin>", line 1, in h
>   File "<stdin>", line 1, in f
>   File "<stdin>", line 1, in g
>   [Mutual-recursive calls hidden: f (103), g (200)]
> RuntimeError: maximum recursion depth exceeded
> [963 calls hidden. Call _full_ex() to print full trace.]
>
> This rule is easily modified to, "have been seen at least three times
> before." For functions that recurse at multiple lines, it can print
> out one message per line, but maybe it should count them all together
> in the "hidden" summary.
I think the above over-complicates the situation. I want to deal with the low-hanging fruit that gives the biggest benefit for the least work (and least complexity!), not minimise every conceivable bit of redundancy in a stack trace. So, for the record, let me explicitly state my proposal: Stack traces should collapse blocks of repeated identical, contiguous lines (or pairs of lines, where the source code is available for display). For brevity, whenever I refer to a "line" in the stack trace, I mean either a single line of the form:

  File "<stdin>", line 1, in f

when source code is not available, or a pair of lines of the form:

  File "path/to/file.py", line 1, in f
    line of source code

when it is available. Whenever a single, contiguous block of lines from the traceback consists of three or more identical lines (i.e. lines which compare equal using string equality), they should be collapsed down to: a single instance of that line followed by a message reporting the number of repetitions. For example:

  File "<stdin>", line 1, in f
    return f(arg)
  File "<stdin>", line 1, in f
    return f(arg)
  File "<stdin>", line 1, in f
    return f(arg)
  File "<stdin>", line 1, in f
    return f(arg)
  File "<stdin>", line 1, in f
    return f(arg)
  File "<stdin>", line 1, in f
    return f(arg)

would be collapsed to something like this:

  File "<stdin>", line 1, in f
    return f(arg)
  [...previous call was repeated 5 more times...]

(In practice, you are more likely to see "repeated 1000 more times" than just five.) This would not be collapsed, as the line numbers are not the same:

  File "<stdin>", line 1, in f
  File "<stdin>", line 2, in f
  File "<stdin>", line 3, in f
  File "<stdin>", line 4, in f
  File "<stdin>", line 5, in f
  File "<stdin>", line 6, in f

nor this:

  File "<stdin>", line 1, in f
  File "<stdin>", line 1, in g
  File "<stdin>", line 1, in f
  File "<stdin>", line 1, in g
  File "<stdin>", line 1, in f
  File "<stdin>", line 1, in g

My proposal does not include any provision for collapsing chains of mutual recursion. If somebody else wants to champion that as a separate proposal, please do, but I won't be making that proposal.
This will shrink the ugly and harmful huge stacktraces that we get from many accidental recursion errors, without hiding potentially useful information. -- Steve From tjreedy at udel.edu Fri Apr 22 23:15:15 2016 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 22 Apr 2016 23:15:15 -0400 Subject: [Python-ideas] Shrink recursion error tracebacks (was: Have REPL print less by default) In-Reply-To: References: Message-ID: On 4/22/2016 10:00 PM, Émanuel Barry wrote: > I have been thinking, and especially after Nick's comments, that it > might be better to keep it as simple as possible to reduce the risk of > errors happening while we're printing the tracebacks. Recursive > functions (that go deep enough to trigger an exception) are usually > within one function, in the REPL, or by using __getattr__ / > __getattribute__. Nice to have? Sure. Necessary? Don't think as much. I agree (I think) that the first issue and patch should be to detect and collapse n identical lines or pairs of lines in a row, where n > 3, with a message "The above pair of lines [or line] is repeated n-1 more times." This will be an optional but nice new feature giving 80% benefit for 20% work. There seems to be a consensus for this much. Let's do it. > I'm also a very solid -1 on the idea of prompting the user to do > something like full_ex() to get the full stack trace. My rationale for > such a change is that we need to be extremely precise in our message, to > leave absolutely no room for a different interpretation than what we > mean. Your message, for instance, is ambiguous. The fact it says that > calls are hidden would let me think I just lost information about the > stack trace (though that's but a wording issue). As a user, just seeing > "Call _full_ex() to print full trace." would be an immediate red flag: > Crap, I just lost precious debugging information to save on lines! > > Might be just me, but that's my opinion anyway :) I agree that anything beyond the first step is more dubious.
I think extensions should be another proposal and possible issue. Or left to GUI wrappers of an underlying Python. > Example: > File "", line 1, in f > File "", line 1, in g > [Mutually recursive calls hidden: f (300), g (360)] > File "", line 1, in h > File "", line 1, in f > File "", line 1, in g > [Mutual-recursive calls hidden: f (103), g (200)] > RuntimeError: maximum recursion depth exceeded > [963 calls hidden. Call _full_ex() to print full trace.] > > This rule is easily modified to, "have been seen at least three times > before." For functions that recurse at multiple lines, it can print out > one message per line, but maybe it should count them all together in the > "hidden" summary. -- Terry Jan Reedy From leewangzhong+python at gmail.com Sat Apr 23 02:14:10 2016 From: leewangzhong+python at gmail.com (Franklin? Lee) Date: Sat, 23 Apr 2016 02:14:10 -0400 Subject: [Python-ideas] Shrink recursion error tracebacks (was: Have REPL print less by default) In-Reply-To: References: Message-ID: (Sorry, my previous message was sent from the wrong address, so only Émanuel got it. I then replied to his reply, which had me reuse the wrong address. Sorry, Émanuel, for this repeat.) On Fri, Apr 22, 2016 at 10:00 PM, Émanuel Barry wrote: > I have been thinking, and especially after Nick's comments, that it might be > better to keep it as simple as possible to reduce the risk of errors > happening while we're printing the tracebacks. Recursive functions (that go > deep enough to trigger an exception) are usually within one function, in the > REPL, or by using __getattr__ / > __getattribute__. Nice to have? Sure. > Necessary? Don't think as much. Simple as in, unlikely to be affected by whatever caused the exception in the first place? Here's pseudocode for my suggestion. (I assume appropriate definitions of `traceback`, `print_folded`, and `print_the_thing`. I assume `func_name` is a qualified name (e.g. `('f', '')`).)
seen = collections.Counter()  # Counts number of times each (func,line_no) has been seen
block = None          # A block of potentially-hidden functions.
prev_line_no = None   # In case len(block) == 1.
hidecount = 0         # Total number of hidden lines.
for func_name, line_no in traceback:
    times = seen[func_name, line_no] += 1
    if times >= 3:
        if block is None:
            block = collections.Counter()
        block[func_name] += 1
        prev_line_no = line_no
    else:
        # This `if` can be a function which returns a hidecount,
        # so we don't repeat ourselves at the end of the loop.
        if block is not None:
            if len(block) == 1:  # don't need to hide
                print_the_thing(next(block.keys()), prev_line_no)
            else:
                print_folded(block)
                hidecount += len(block)
            block = None
        print_the_thing(func_name, line_no)

if block is not None:
    if len(block) == 1:
        print_the_thing(block[0])
    else:
        print_folded(block)
        hidecount += len(block)

> I'm also a very solid -1 on the idea of prompting the user to do something > like full_ex() to get the full stack trace. My rationale for such a change > is that we need to be extremely precise in our message, to leave absolutely > no room for a different interpretation than what we mean. Your message, for > instance, is ambiguous. The fact it says that calls are hidden would let me > think I just lost information about the stack trace (though that's but a > wording issue). As a user, just seeing "Call _full_ex() to print full > trace." would be an immediate red flag: Crap, I just lost precious debugging > information to save on lines! My hiding is more complex (can't reproduce original output exactly), so it would be important to have an obvious way to get the old behavior. Someone else can propose the wording, if the hiding strategy itself seems useful. On Fri, Apr 22, 2016 at 11:50 PM, Steven D'Aprano wrote: > > For the record, detecting subsequences of mutually recursive calls is not > a trivial task. (It's not the same as cycle detection.) And I don't > understand how you can get 300 calls to f and 360 calls to g in a > scenario where f calls g and g calls f. Surely there would be equal > number of calls to each? It depends on how you define the problem. I look for, "contiguous block of output, each line of which has been seen before." (You can set up a trivial example where g calls itself 60 times if the parameter is mod 300. Other examples might involve randomness, or perhaps it's just a really deep recursion rather than a cycle. I'm just giving an example of how you don't need to detect repeated blocks.) From steve at pearwood.info Sat Apr 23 06:13:36 2016 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 23 Apr 2016 20:13:36 +1000 Subject: [Python-ideas] Shrink recursion error tracebacks (was: Have REPL print less by default) In-Reply-To: References: Message-ID: <20160423101336.GE13497@ando.pearwood.info> On Sat, Apr 23, 2016 at 02:14:10AM -0400, Franklin? Lee wrote:
And I don't
> understand how you can get 300 calls to f and 360 calls to g in a
> scenario where f calls g and g calls f. Surely there would be equal
> number of calls to each?

It depends on how you define the problem. I look for, "contiguous block of output, each line of which has been seen before." (You can set up a trivial example where g calls itself 60 times if the parameter is mod 300. Other examples might involve randomness, or perhaps it's just a really deep recursion rather than a cycle. I'm just giving an example of how you don't need to detect repeated blocks.)

From steve at pearwood.info Sat Apr 23 06:13:36 2016
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 23 Apr 2016 20:13:36 +1000
Subject: [Python-ideas] Shrink recursion error tracebacks (was: Have REPL print less by default)
In-Reply-To: References: Message-ID: <20160423101336.GE13497@ando.pearwood.info>

On Sat, Apr 23, 2016 at 02:14:10AM -0400, Franklin? Lee wrote:
> Here's pseudocode for my suggestion.
> (I assume appropriate definitions of `traceback`, `print_folded`, and
> `print_the_thing`. I assume `func_name` is a qualified name (e.g.
> `('f', '')`).)
>
> seen = collections.Counter()  # Counts number of times each (func, line_no) has been seen
> block = None          # A block of potentially-hidden functions.
> prev_line_no = None   # In case len(block) == 1.
> hidecount = 0         # Total number of hidden lines.
> for func_name, line_no in traceback:
>     seen[func_name, line_no] += 1
>     times = seen[func_name, line_no]
>     if times >= 3:
>         if block is None:
>             block = collections.Counter()
>         block[func_name] += 1
>         prev_line_no = line_no
>     else:
>         # This `if` can be a function which returns a hidecount,
>         # so we don't repeat ourselves at the end of the loop.
>         if block is not None:
>             if len(block) == 1:  # don't need to hide
>                 print_the_thing(next(iter(block)), prev_line_no)
>             else:
>                 print_folded(block)
>                 hidecount += sum(block.values())
>             block = None
>         print_the_thing(func_name, line_no)
>
> if block is not None:
>     if len(block) == 1:
>         print_the_thing(next(iter(block)), prev_line_no)
>     else:
>         print_folded(block)
>         hidecount += sum(block.values())

Just in case anyone missed it, here's my actual, working, code, which I now have in my PYTHONSTARTUP file.

import sys
import traceback
from itertools import groupby

TEMPLATE = "  [...previous call is repeated %d times...]\n"

def collapse(seq):
    for key, group in groupby(seq):
        group = list(group)
        if len(group) < 3:
            for item in group:
                yield item
        else:
            yield key
            yield TEMPLATE % (len(group)-1)

def shortertb(*args):
    lines = traceback.format_exception(*args)
    sys.stderr.write(''.join(collapse(lines)))

sys.excepthook = shortertb

And here is an actual working example of it in action:

py> import fact  # Uses the obvious recursive algorithm.
py> fact.fact(50000)
Traceback (most recent call last):
  File "", line 1, in 
  File "/home/steve/python/fact.py", line 3, in fact
    return n*fact(n-1)
  [...previous call is repeated 997 times...]
  File "/home/steve/python/fact.py", line 2, in fact
    if n < 1: return 1
RuntimeError: maximum recursion depth exceeded in comparison

I'm not very interested in a complex, untested, incomplete, non-working chunk of pseudo-code when I have something which actually works in less than twenty lines. Especially since your version hides useful traceback information and requires the user to call a separate function to display the unmangled (and likely huge) traceback to find out what they're missing. In my version, they're not missing anything: it's a simple run-length encoding of the tracebacks, and nothing is hidden. The output is just compressed.

So I'm afraid that, even if you manage to get your pseudo-code working and debugged, I'm going to vote a strong -1 on your proposal.
Even if it works, I don't want lines to be hidden just because they've been seen before in some unrelated part of the traceback. > My hiding is more complex (can't reproduce original output exactly), > so it would be important to have an obvious way to get the old > behavior. Someone else can propose the wording, if the hiding strategy > itself seems useful. To me, it seems harmful, not useful. -- Steve From storchaka at gmail.com Sat Apr 23 07:14:30 2016 From: storchaka at gmail.com (Serhiy Storchaka) Date: Sat, 23 Apr 2016 14:14:30 +0300 Subject: [Python-ideas] Bytecode for calling function with keyword arguments In-Reply-To: References: Message-ID: On 21.04.16 20:28, Serhiy Storchaka wrote: > On 21.04.16 19:00, Chris Angelico wrote: >> Also: How does this interact with **dict calls? > > For now we have separate opcode for calls with **kwargs. We will have > corresponding opcode with new way to provide keyword arguments. > > Instead of 4 current opcodes we will need at most 8 new opcodes. May be > less, because calls with *args or **kwargs is less performance critical, > we can pack arguments in a tuple and a dict by separate opcodes and call > a function just with a tuple and a dict. Actually I think we need only 3 opcodes: one general and two specialized for common cases. 1. Call with fixed number of positional arguments. This covers about 90% of calls. 2. Call with fixed number of positional and keyword arguments. This covers about 10% of calls. 3. Call with variable number of positional and/or keyword arguments packed in a tuple and a dict. Only 0.5% of calls need this. Since *args and **kwargs can be now used multiple times in the call, there are opcodes for packing arguments: BUILD_LIST_UNPACK and BUILD_MAP_UNPACK_WITH_CALL. From leewangzhong+python at gmail.com Sat Apr 23 20:44:27 2016 From: leewangzhong+python at gmail.com (Franklin? 
Lee) Date: Sat, 23 Apr 2016 20:44:27 -0400 Subject: [Python-ideas] Shrink recursion error tracebacks (was: Have REPL print less by default) In-Reply-To: <20160423101336.GE13497@ando.pearwood.info> References: <20160423101336.GE13497@ando.pearwood.info> Message-ID: On Sat, Apr 23, 2016 at 6:13 AM, Steven D'Aprano wrote: > Just in case anyone missed it, here's my actual, working, code, > I'm not very interested in a complex, untested, incomplete, non-working > chunk of pseudo-code when I have something which actually works in less > than twenty lines. I'm not conjoined to the idea. I'm just proposing it as an alternative. I provided the pseudocode to express an idea, so that the idea could be discussed. I've already identified one glaring bug, but the quality of pseudocode itself is not really important: if there are actual implementation obstacles, they can be discussed _if_ the idea is considered useful. Remember that we're all here for the good of Python. > So I'm afraid that, even if you manage to get your pseudo-code working > and debugged, I'm going to vote a strong -1 on your proposal. Even if it > works, I don't want lines to be hidden just because they've been seen > before in some unrelated part of the traceback. In a single callstack, there ARE no unrelated parts of the traceback. They've been seen before because they are part of a cycle: they started a call chain that eventually leads back to them. > Especially since your version hides useful traceback > information and requires the user to call a separate function to display > the unmangled (and likely huge) traceback to find out what they're > missing. In fact, it would probably only hide useful information if there is a non-trivial graph cycle. In that case, the RLE will still give the unmangled (and likely huge) traceback. 
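[Editorial sketch: the folding strategy debated above can be made concrete in a few lines. This is not Franklin's code; `fold_frames`, the `<stdin>` file name, and the summary wording are illustrative, and it folds a pre-extracted list of (function, line) pairs rather than hooking `sys.excepthook`. A frame is hidden once its (function, line) pair has already appeared three times, and contiguous hidden frames collapse into one summary line.]

```python
import collections

def fold_frames(frames, threshold=3):
    """Fold a traceback given as (func_name, line_no) pairs.

    A frame is hidden once the same (func_name, line_no) pair has
    been seen `threshold` times; contiguous hidden frames collapse
    into a single summary line with per-function call counts.
    """
    seen = collections.Counter()   # times each (func, line) has appeared
    block = collections.Counter()  # per-function counts of currently hidden frames
    out = []

    def flush():
        # Emit one summary line for the pending block of hidden frames.
        if block:
            calls = ", ".join("%s (%d)" % fc for fc in sorted(block.items()))
            out.append("  [calls hidden: %s]" % calls)
            block.clear()

    for func, line in frames:
        seen[func, line] += 1
        if seen[func, line] >= threshold:
            block[func] += 1          # hide this repeat
        else:
            flush()                   # close any pending hidden block
            out.append('  File "<stdin>", line %d, in %s' % (line, func))
    flush()
    return out

# Two mutually recursive functions, then a call to h:
for text in fold_frames([("f", 1), ("g", 2)] * 5 + [("h", 3)]):
    print(text)
```

With the sample input, the first two f/g rounds are shown verbatim, the remaining three rounds collapse into `[calls hidden: f (3), g (3)]`, and the final frame for `h` is printed normally.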
From guettliml at thomas-guettler.de Mon Apr 25 03:47:25 2016 From: guettliml at thomas-guettler.de (=?UTF-8?Q?Thomas_G=c3=bcttler?=) Date: Mon, 25 Apr 2016 09:47:25 +0200 Subject: [Python-ideas] unittest: assertEqual(a, b, msg): Show diff AND msg Message-ID: <571DCB8D.2040508@thomas-guettler.de> Up to now assertEqual(a, b, msg) outputs only the msg, not the diff. I know that setting longMessage to True shows the diff and the msg[1] I think the sane default is to show the diff and the message for assertEqual(). What do you think? Regards, Thomas G?ttler [1] longMessage: https://docs.python.org/3/library/unittest.html#unittest.TestCase.longMessage -- Thomas Guettler http://www.thomas-guettler.de/ From ram at rachum.com Mon Apr 25 04:38:17 2016 From: ram at rachum.com (Ram Rachum) Date: Mon, 25 Apr 2016 11:38:17 +0300 Subject: [Python-ideas] Allow with (x as y, z as w): Message-ID: Hi, I have a problem: I've got code that has a `with` statement with multiple long names, so I have to break the line awkwardly using the \ character: with some_darn_object.my_long_named_method_on_it(argument='yep') as foo, \ a_different_darn_object.and_yet_another_method() as bar: I tried wrapping the context managers in parentheses, but it looks like it's not legal syntax: File "", line 1 with (some_darn_object.my_long_named_method_on_it(argument='yep') as foo, ^ SyntaxError: invalid syntax Another demonstration from the REPL: Python 3.5.1 (v3.5.1:37a07cee5969, Dec 6 2015, 01:54:25) [MSC v.1900 64 bit (AMD64)] on win32 Type "help", "copyright", "credits" or "license" for more information. >>> with (x as y, z as w): pass File "", line 1 with (x as y, z as w): pass ^ SyntaxError: invalid syntax I wish this syntax to be legal. Thanks, Ram. -------------- next part -------------- An HTML attachment was scrubbed... 
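[Editorial sketch: for reference, two spellings that are already legal, shown with `managed`, a stand-in context manager invented here, in place of the long-named methods from the post: the backslash continuation Ram is using, and `contextlib.ExitStack`, which needs no continuation characters at all.]

```python
from contextlib import ExitStack, contextmanager

@contextmanager
def managed(name):
    # Stand-in for the long-named context managers in the post.
    yield name

# 1. Backslash continuation, as in the original code:
with managed("spam") as foo, \
     managed("eggs") as bar:
    print(foo, bar)  # spam eggs

# 2. contextlib.ExitStack scales to any number of managers and
#    avoids line-continuation characters entirely:
with ExitStack() as stack:
    foo = stack.enter_context(managed("spam"))
    bar = stack.enter_context(managed("eggs"))
    print(foo, bar)  # spam eggs
```

ExitStack also makes it easy to enter a variable number of managers, e.g. one per file in a list, which the `with a, b:` form cannot express at all.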
URL: 

From cory at lukasa.co.uk Mon Apr 25 07:36:37 2016
From: cory at lukasa.co.uk (Cory Benfield)
Date: Mon, 25 Apr 2016 12:36:37 +0100
Subject: [Python-ideas] Allow with (x as y, z as w):
In-Reply-To: References: Message-ID: <5DF22261-3EC4-4296-A12A-30E53DA258A2@lukasa.co.uk>

> On 25 Apr 2016, at 09:38, Ram Rachum wrote:
>
> Hi,
>
> I have a problem: I've got code that has a `with` statement with multiple long names, so I have to break the line awkwardly using the \ character:
>

Ram,

This request has come up a couple of times before. See the following prior discussions:

- https://mail.python.org/pipermail/python-ideas/2010-September/008021.html
- https://mail.python.org/pipermail/python-dev/2014-August/135741.html

The upshot is: the restriction on context managers with parentheses like this is relatively likely to stick around. There have been discussions about the relative ambiguity of the syntax you're using here compared with the syntax of tuples, and some suggestions that it'll make life particularly tricky for the parser. I don't have anything to add to those discussions, just wanted to suggest you may want to familiarise yourself with the previous times we had this conversation before continuing it.

Cory

-------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 801 bytes Desc: Message signed with OpenPGP using GPGMail URL: 

From greg at krypto.org Mon Apr 25 13:35:10 2016
From: greg at krypto.org (Gregory P. Smith)
Date: Mon, 25 Apr 2016 17:35:10 +0000
Subject: [Python-ideas] unittest: assertEqual(a, b, msg): Show diff AND msg
In-Reply-To: <571DCB8D.2040508@thomas-guettler.de> References: <571DCB8D.2040508@thomas-guettler.de> Message-ID: 

On Mon, Apr 25, 2016 at 12:47 AM Thomas Güttler < guettliml at thomas-guettler.de> wrote:
> Up to now assertEqual(a, b, msg) outputs only the msg, not the diff.
>
> I know that setting longMessage to True shows the diff and the msg[1]
>
> I think the sane default is to show the diff and the message for
> assertEqual().
>
> What do you think?
>

longMessage already defaults to True in Python 3.
https://hg.python.org/cpython/file/default/Lib/unittest/case.py#l371

Changing the default in a future Python 2.7.xx release is unlikely as that kind of change can catch people by surprise and cause problems in the middle of a stable release. Setting it to true manually in all your 2.x code? recommended!

-gps

-------------- next part -------------- An HTML attachment was scrubbed... URL: 

From guettliml at thomas-guettler.de Tue Apr 26 04:06:54 2016
From: guettliml at thomas-guettler.de (=?UTF-8?Q?Thomas_G=c3=bcttler?=)
Date: Tue, 26 Apr 2016 10:06:54 +0200
Subject: [Python-ideas] unittest: assertEqual(a, b, msg): Show diff AND msg
In-Reply-To: References: <571DCB8D.2040508@thomas-guettler.de> Message-ID: <571F219E.7090905@thomas-guettler.de>

On 25.04.2016 19:35, Gregory P. Smith wrote:
>
> On Mon, Apr 25, 2016 at 12:47 AM Thomas Güttler
> wrote:
>
> Up to now assertEqual(a, b, msg) outputs only the msg, not the diff.
>
> I know that setting longMessage to True shows the diff and the msg[1]
>
> I think the sane default is to show the diff and the message for assertEqual().
>
> What do you think?
>
>
> longMessage already defaults to True in Python 3.
> https://hg.python.org/cpython/file/default/Lib/unittest/case.py#l371
>
> Changing the default in a future Python 2.7.xx release is unlikely as that kind of change can catch people by surprise
> and cause problems in the middle of a stable release.

Thank you Gregory! I was blind.

First I read the Python2 docs, then I read the first line of the docs of Python3:

    If set to True then any explicit failure .....

https://docs.python.org/3/library/unittest.html#unittest.TestCase.longMessage

Yes, Python3 has the better default. Maybe the docs should get updated.
I guess the above sentence was copied from the old docs where you had to set it to True.

"If set to True then ..." is correct if you have a "math brain". But it is confusing for newcomers.

-- Thomas Guettler http://www.thomas-guettler.de/

From barry at python.org Tue Apr 26 11:22:16 2016
From: barry at python.org (Barry Warsaw)
Date: Tue, 26 Apr 2016 11:22:16 -0400
Subject: [Python-ideas] Allow with (x as y, z as w):
References: <5DF22261-3EC4-4296-A12A-30E53DA258A2@lukasa.co.uk> Message-ID: <20160426112216.44944ec5@anarchist.wooz.org>

On Apr 25, 2016, at 12:36 PM, Cory Benfield wrote:

>This request has come up a couple of times before.
...
>The upshot is: the restriction on context managers with parentheses like this
>is relatively likely to stick around.

I highly recommend looking at contextlib.ExitStack. Once I started using this idiom, I found I wanted the requested with-feature much less frequently.

Cheers,
-Barry

-------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: 

From guido at python.org Tue Apr 26 11:32:22 2016
From: guido at python.org (Guido van Rossum)
Date: Tue, 26 Apr 2016 08:32:22 -0700
Subject: [Python-ideas] Allow with (x as y, z as w):
In-Reply-To: <20160426112216.44944ec5@anarchist.wooz.org> References: <5DF22261-3EC4-4296-A12A-30E53DA258A2@lukasa.co.uk> <20160426112216.44944ec5@anarchist.wooz.org> Message-ID: 

Also, I think some people are too fundamentalist about rejecting all uses of \ to break long lines. When combined with proper indentation of the continuation line, if there is no alternative, I think it looks better than artificially introduced parentheses. At least if there's only one or two continuation lines. E.g.
```
very_long_variable_name = \
    very_long_function(very_long_argument_list)
```

looks better to me than

```
very_long_variable_name = (
    very_long_function(very_long_argument_list))
```

I'm just saying, there's a reason PEP 8's motto is "A Foolish Consistency is the Hobgoblin of Little Minds".

On Tue, Apr 26, 2016 at 8:22 AM, Barry Warsaw wrote:
> On Apr 25, 2016, at 12:36 PM, Cory Benfield wrote:
>
> >This request has come up a couple of times before.
> ...
> >The upshot is: the restriction on context managers with parentheses like
> this
> >is relatively likely to stick around.
>
> I highly recommend looking at contextlib.ExitStack. Once I started using
> this idiom, I found I wanted the requested with-feature much less
> frequently.
>
> Cheers,
> -Barry
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

-- 
--Guido van Rossum (python.org/~guido)

-------------- next part -------------- An HTML attachment was scrubbed... URL: 

From guettliml at thomas-guettler.de Wed Apr 27 03:49:06 2016
From: guettliml at thomas-guettler.de (=?UTF-8?Q?Thomas_G=c3=bcttler?=)
Date: Wed, 27 Apr 2016 09:49:06 +0200
Subject: [Python-ideas] unittest: assertEqual(a, b, msg): Show diff AND msg
In-Reply-To: <571F219E.7090905@thomas-guettler.de> References: <571DCB8D.2040508@thomas-guettler.de> <571F219E.7090905@thomas-guettler.de> Message-ID: <57206EF2.2050905@thomas-guettler.de>

I opened a docs issue for "longMessage" http://bugs.python.org/issue26869

On 26.04.2016 10:06, Thomas Güttler wrote:
>
>
> On 25.04.2016 19:35, Gregory P. Smith wrote:
>>
>> On Mon, Apr 25, 2016 at 12:47 AM Thomas Güttler
>> wrote:
>>
>> Up to now assertEqual(a, b, msg) outputs only the msg, not the diff.
>> >> I know that setting longMessage to True shows the diff and the msg[1] >> >> I think the sane default is to show the diff and the message for assertEqual(). >> >> What do you think? >> >> >> longMessage already defaults to True in Python 3. >> https://hg.python.org/cpython/file/default/Lib/unittest/case.py#l371 >> >> Changing the default in a future Python 2.7.xx release is unlikely as that kind of change can catch people by surprise >> and cause problems in the middle of a stable release. > > Thank you Gregory! I was blind. > > First I read the Python2 docs, then I read the first line of the docs of Python3: > > > If set to True then any explicit failure ..... > > https://docs.python.org/3/library/unittest.html#unittest.TestCase.longMessage > > Yes, Python3 has the better default. Maybe the docs should get updated. > I guess the above sentence was copied from the old docs where you had to set True. > > "If set to True then ..." is correct if you have a "math brain". But it is confusing > for new comers. > > > > -- Thomas Guettler http://www.thomas-guettler.de/