From robert.hoelzl at posteo.de Mon May 1 03:19:01 2017
From: robert.hoelzl at posteo.de (robert.hoelzl at posteo.de)
Date: Mon, 1 May 2017 09:19:01 +0200
Subject: [Python-ideas] Add an option for delimiters in bytes.hex()
Message-ID: <3wGbPx0mQJz105n@submission01.posteo.de>

The bytes.hex() function is the inverse function of bytes.fromhex().

But fromhex can process spaces (which is much more readable), while hex() provides no way to include spaces.

My proposal would be to add an optional delimiter that allows specifying a string that will be inserted between the digit pairs of a byte:

    def hex(self, delimiter=''): ...

This would allow writing:

    assert b'abc'.hex(' ') == '61 62 63'

From ncoghlan at gmail.com Mon May 1 03:28:31 2017
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 1 May 2017 17:28:31 +1000
Subject: [Python-ideas] Proposed PEP 484 addition: describe a way of annotating decorated declarations
In-Reply-To:
References:
Message-ID:

On 1 May 2017 at 03:07, Guido van Rossum wrote:
> There's a PR to the peps proposal here:
> https://github.com/python/peps/pull/242
>
> The full text of the current proposal is below. The motivation for this is
> that for complex decorators, even if the type checker can figure out what's
> going on (by taking the signature of the decorator into account), it's
> sometimes helpful to the human reader of the code to be reminded of the type
> after applying the decorators (or a stack thereof). Much discussion can be
> found in the PR. Note that we ended up having `Callable` in the type because
> there's no rule that says a decorator returns a function type (e.g.
> `property` doesn't).

So a rigorous typechecker that understood the full decorator stack would be able to check whether or not the argument to `decorated_type` was correct, while all typecheckers (and human readers) would be able to just believe the argument rather than having to run through all the decorator transformations?

Makes sense to me.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From noxdafox at gmail.com Mon May 1 07:13:00 2017
From: noxdafox at gmail.com (NoxDaFox)
Date: Mon, 1 May 2017 14:13:00 +0300
Subject: [Python-ideas] Decorators for running a function in a Process or Thread
Message-ID:

The decorators would abstract the logic of spawning a process or thread and maintaining its lifecycle.

I think it could be a good fit for the `concurrent.futures` module. Decorated functions would return a `Future` object and run the logic in a separate thread or process.

    @concurrent.futures.thread
    def function(arg, kwarg=0):
        return arg + kwarg

    future = function(1, kwarg=2)
    future.result()

The Process decorator in particular would support use cases such as running unstable code within an application, taking advantage of the benefits of process separation. I have often found myself relying on external APIs which would either crash or run indefinitely, affecting my application's stability and performance.

    @concurrent.futures.process(timeout=60)
    def unstable_function(arg, kwarg=0):
        # hang or segfault here
        return arg + kwarg

    future = unstable_function(1, kwarg=2)

    try:
        future.result()
    except TimeoutError as error:
        print("Function took more than %d seconds" % error.args[1])
    except ProcessExpired as error:
        print("Process exit code %d" % error.exitcode)

In case of a timeout, the process would be terminated and its resources reclaimed.

A few years ago I wrote a library for this purpose, which turned out to be pretty handy for reducing the boilerplate code when dealing with threads and processes. It could be a starting point for discussing the implementation.
https://pypi.python.org/pypi/Pebble

From p.f.moore at gmail.com Mon May 1 08:02:02 2017
From: p.f.moore at gmail.com (Paul Moore)
Date: Mon, 1 May 2017 13:02:02 +0100
Subject: [Python-ideas] Decorators for running a function in a Process or Thread
In-Reply-To:
References:
Message-ID:

On 1 May 2017 at 12:13, NoxDaFox wrote:
>
> I think it could be a good fit for the `concurrent.futures` module.
> Decorated functions would return a `Future` object and run the logic in a
> separate thread or process.
>
>
>     @concurrent.futures.thread
>     def function(arg, kwarg=0):
>         return arg + kwarg
>
>     future = function(1, kwarg=2)
>     future.result()

What's the benefit over just running the function in a thread (or process) pool, using Executor.submit()?

Paul

From apalala at gmail.com Mon May 1 09:23:58 2017
From: apalala at gmail.com (Juancarlo Añez)
Date: Mon, 1 May 2017 09:23:58 -0400
Subject: [Python-ideas] Augmented assignment syntax for objects.
In-Reply-To: <20170427232853.GC22525@ando.pearwood.info>
References: <701f3390-5f12-bf34-0874-8451d7f81add@mgmiller.net> <20170427232853.GC22525@ando.pearwood.info>
Message-ID:

On Thu, Apr 27, 2017 at 7:28 PM, Steven D'Aprano wrote:

> > In my experience, what Python is lacking is a way to declare attributes
> > outside of the constructor. Take a look at how it's done in C#, Swift, or
> > Go.
>
> Since you apparently already know how they do it, how about telling us
> (1) how they do it, and (2) *why* they do it?

They make attribute declarations at the class declaration scope be instance attributes. Python makes that kind of declaration class attributes (statics in some other languages).

This is the spec for C#: https://goo.gl/FeBTuy

The reason *why* they do it that way is that declaring instance fields/variables is much more frequent than declaring class ones.

> > Object attributes outside of the constructor would solve things more
> > relevant than the vertical space used when assigning constructor parameters
> > to attributes.
>
> Solve which things?

Instance attributes may be defined with or without default values without having to pass them as arguments to, or mention them in, a constructor.

> > For example, multiple inheritance is well designed in
> > Python, except that it often requires constructors with no parameters,
> > which leads to objects with no default attributes.
>
> Can you elaborate?

A class hierarchy in which there is multiple inheritance requires constructors with no arguments. This is typical: https://goo.gl/l54tx7

I don't know which would be the best syntax, but it would be convenient to be able to declare something like:

    class A:
        var a = 'a'

And have "a" be an instance attribute.

--
Juancarlo *Añez*

From ncoghlan at gmail.com Mon May 1 09:38:20 2017
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 1 May 2017 23:38:20 +1000
Subject: [Python-ideas] Add an option for delimiters in bytes.hex()
In-Reply-To: <3wGbPx0mQJz105n@submission01.posteo.de>
References: <3wGbPx0mQJz105n@submission01.posteo.de>
Message-ID:

On 1 May 2017 at 17:19, wrote:
> The bytes.hex() function is the inverse function of bytes.fromhex().
>
> But fromhex can process spaces (which is much more readable), while hex()
> provides no way to include spaces.
>
> My proposal would be to add an optional delimiter that allows specifying a
> string that will be inserted between the digit pairs of a byte:
>
> def hex(self, delimiter=''): ...

We're definitely open to offering better formatting options for bytes.hex().

My proposal in https://bugs.python.org/issue22385 was to define a new formatting mini-language (akin to the way strftime works, but with a much simpler formatting mini-language): http://bugs.python.org/issue22385#msg292663

However, a much simpler alternative would be to just support two keyword arguments to hex(): "delimiter" (as you suggest) and "chunk_size" (defaulting to 1, so you get per-byte chunking by default).

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ncoghlan at gmail.com Mon May 1 09:45:26 2017
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 1 May 2017 23:45:26 +1000
Subject: [Python-ideas] Decorators for running a function in a Process or Thread
In-Reply-To:
References:
Message-ID:

On 1 May 2017 at 22:02, Paul Moore wrote:
> On 1 May 2017 at 12:13, NoxDaFox wrote:
>>
>> I think it could be a good fit for the `concurrent.futures` module.
>> Decorated functions would return a `Future` object and run the logic in a
>> separate thread or process.
>>
>>
>>     @concurrent.futures.thread
>>     def function(arg, kwarg=0):
>>         return arg + kwarg
>>
>>     future = function(1, kwarg=2)
>>     future.result()
>
> What's the benefit over just running the function in a thread (or
> process) pool, using Executor.submit()?

It allows function designers to deliberately increase the friction of calling the function "normally". Consider an async library, for example - in such cases, it's useful to be able to ensure that a blocking function never runs in the *current* thread, and instead always runs in a different one.

One of the problems for the proposal is that we don't have the notion of a "default executor", the way we do with the default event loop in asyncio, so functions decorated with these would need to accept an additional parameter specifying the executor to use.

Cheers,
Nick.
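
The decorator-versus-Executor.submit() question above can be illustrated with what concurrent.futures already offers. This is only a minimal sketch: `run_in_executor` is a hypothetical helper name, not an existing or proposed API, and because there is no default executor the pool has to be created and passed in explicitly.

```python
import functools
from concurrent.futures import Future, ThreadPoolExecutor


def run_in_executor(executor):
    """Hypothetical decorator: route every call through executor.submit()."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs) -> Future:
            # The caller receives a Future instead of the return value.
            return executor.submit(func, *args, **kwargs)
        return wrapper
    return decorator


# There is no "default executor", so one is created and handed to the decorator.
pool = ThreadPoolExecutor(max_workers=2)


@run_in_executor(pool)
def function(arg, kwarg=0):
    return arg + kwarg


future = function(1, kwarg=2)
result = future.result()  # blocks until the worker thread has finished
pool.shutdown()
```

With this shape the function can no longer be called "normally" in the current thread, which is the friction described above. A process-pool variant would probably need extra care, since the decorated name no longer refers to the original function when it is pickled.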
--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From apalala at gmail.com Mon May 1 10:04:25 2017
From: apalala at gmail.com (Juancarlo Añez)
Date: Mon, 1 May 2017 10:04:25 -0400
Subject: [Python-ideas] Add an option for delimiters in bytes.hex()
In-Reply-To:
References: <3wGbPx0mQJz105n@submission01.posteo.de>
Message-ID:

On Mon, May 1, 2017 at 9:38 AM, Nick Coghlan wrote:

> just support two
> keyword arguments to hex(): "delimiter" (as you suggest) and
> "chunk_size" (defaulting to 1, so you get per-byte chunking by
> default)

I'd expect "chunk_size" to mean the number of hex digits (not bytes) per chunk.

Cheers,

--
Juancarlo *Añez*

From guido at python.org Mon May 1 10:44:30 2017
From: guido at python.org (Guido van Rossum)
Date: Mon, 1 May 2017 07:44:30 -0700
Subject: [Python-ideas] Proposed PEP 484 addition: describe a way of annotating decorated declarations
In-Reply-To:
References:
Message-ID:

On Mon, May 1, 2017 at 12:28 AM, Nick Coghlan wrote:

> On 1 May 2017 at 03:07, Guido van Rossum wrote:
> > There's a PR to the peps proposal here:
> > https://github.com/python/peps/pull/242
> >
> > The full text of the current proposal is below. The motivation for this is
> > that for complex decorators, even if the type checker can figure out what's
> > going on (by taking the signature of the decorator into account), it's
> > sometimes helpful to the human reader of the code to be reminded of the type
> > after applying the decorators (or a stack thereof). Much discussion can be
> > found in the PR. Note that we ended up having `Callable` in the type because
> > there's no rule that says a decorator returns a function type (e.g.
> > `property` doesn't).
>
> So a rigorous typechecker that understood the full decorator stack
> would be able to check whether or not the argument to `decorated_type`
> was correct, while all typecheckers (and human readers) would be able
> to just believe the argument rather than having to run through all the
> decorator transformations?

Yes. In fact the intention is that the checker should check the declared type against the inferred type and complain if they don't fit. In some cases the inferred type would have `Any` where the declared type would have a specific type, and then the declared type would "win" (for uses of the decorated function) -- this is an example of where "erosion" in type inference can be counteracted by explicit declarations.

--
--Guido van Rossum (python.org/~guido)

From ethan at stoneleaf.us Mon May 1 13:34:41 2017
From: ethan at stoneleaf.us (Ethan Furman)
Date: Mon, 01 May 2017 10:34:41 -0700
Subject: [Python-ideas] Add an option for delimiters in bytes.hex()
In-Reply-To:
References: <3wGbPx0mQJz105n@submission01.posteo.de>
Message-ID: <590771B1.4080707@stoneleaf.us>

On 05/01/2017 07:04 AM, Juancarlo Añez wrote:
> On Mon, May 1, 2017 at 9:38 AM, Nick Coghlan wrote:
>
>> just support two
>> keyword arguments to hex(): "delimiter" (as you suggest) and
>> "chunk_size" (defaulting to 1, so you get per-byte chunking by
>> default)
>
> I'd expect "chunk_size" to mean the number of hex digits (not bytes) per chunk.

I was also surprised by that. Also, should Python be used on a machine with, say, 24-bit words then a chunk size of three makes more sense than one of 1.5.
;)

--
~Ethan~

From abrault at mapgears.com Mon May 1 13:41:27 2017
From: abrault at mapgears.com (Alexandre Brault)
Date: Mon, 1 May 2017 13:41:27 -0400
Subject: [Python-ideas] Add an option for delimiters in bytes.hex()
In-Reply-To: <590771B1.4080707@stoneleaf.us>
References: <3wGbPx0mQJz105n@submission01.posteo.de> <590771B1.4080707@stoneleaf.us>
Message-ID:

On 2017-05-01 01:34 PM, Ethan Furman wrote:
> On 05/01/2017 07:04 AM, Juancarlo Añez wrote:
>> On Mon, May 1, 2017 at 9:38 AM, Nick Coghlan wrote:
>>
>>> just support two
>>> keyword arguments to hex(): "delimiter" (as you suggest) and
>>> "chunk_size" (defaulting to 1, so you get per-byte chunking by
>>> default)
>>
>> I'd expect "chunk_size" to mean the number of hex digits (not bytes)
>> per chunk.
>
> I was also surprised by that. Also, should Python be used on a
> machine with, say, 24-bit words then a chunk size of three makes more
> sense than one of 1.5. ;)
>
> --
> ~Ethan~

A hex digit is 4 bits long. To separate into words, the 24-bit-word Python would use 3 (counting in bytes as initially proposed), or 6 (counting in hex digits). Neither option would result in a 1.5 chunk_size for 24-bit chunks.

Counting chunk_size in either nibbles or bytes seems equally intuitive to me (as long as it's documented).
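
The arithmetic above (3 bytes, or equivalently 6 hex digits, per 24-bit word) can be checked with a small helper. This is only a sketch: `hex_groups` and its `digits_per_group` parameter are hypothetical names chosen for illustration, and it counts in hex digits, one of the two conventions under discussion.

```python
def hex_groups(data: bytes, delimiter: str = " ", digits_per_group: int = 2) -> str:
    """Hypothetical helper: hex digits of *data*, grouped from the left."""
    digits = data.hex()
    groups = [digits[i:i + digits_per_group]
              for i in range(0, len(digits), digits_per_group)]
    return delimiter.join(groups)


word = b"\xb9\x01\xef"                             # one 24-bit word
per_byte = hex_groups(word)                        # 2 digits = 1 byte  -> 'b9 01 ef'
per_word = hex_groups(word, digits_per_group=6)    # 6 digits = 3 bytes -> 'b901ef'
per_nibble = hex_groups(word, digits_per_group=1)  # 1 digit per group  -> 'b 9 0 1 e f'
```

Counting in bytes instead would mean passing 1 and 3 for the first two cases; the single-digit grouping in the last case has no whole-byte equivalent.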
From abrault at mapgears.com Mon May 1 13:49:12 2017
From: abrault at mapgears.com (Alexandre Brault)
Date: Mon, 1 May 2017 13:49:12 -0400
Subject: [Python-ideas] Add an option for delimiters in bytes.hex()
In-Reply-To:
References: <3wGbPx0mQJz105n@submission01.posteo.de> <590771B1.4080707@stoneleaf.us>
Message-ID: <39c03a04-7ed8-f8a6-8d38-cb632831887c@mapgears.com>

On 2017-05-01 01:41 PM, Alexandre Brault wrote:
> On 2017-05-01 01:34 PM, Ethan Furman wrote:
>> On 05/01/2017 07:04 AM, Juancarlo Añez wrote:
>>> On Mon, May 1, 2017 at 9:38 AM, Nick Coghlan wrote:
>>>
>>>> just support two
>>>> keyword arguments to hex(): "delimiter" (as you suggest) and
>>>> "chunk_size" (defaulting to 1, so you get per-byte chunking by
>>>> default)
>>> I'd expect "chunk_size" to mean the number of hex digits (not bytes)
>>> per chunk.
>> I was also surprised by that. Also, should Python be used on a
>> machine with, say, 24-bit words then a chunk size of three makes more
>> sense than one of 1.5. ;)
>>
>> --
>> ~Ethan~
> A hex digit is 4 bits long. To separate into words, the 24-bit-word
> Python would use 3 (counting in bytes as initially proposed), or 6
> (counting in hex digits). Neither option would result in a 1.5
> chunk_size for 24-bit chunks.
>
> Counting chunk_size in either nibbles or bytes seems equally intuitive to
> me (as long as it's documented).

And I only just realised your main concern was about the 12-bit byte of that 24-bit word architecture. Carry on

From malaclypse2 at gmail.com Mon May 1 14:50:48 2017
From: malaclypse2 at gmail.com (Jerry Hill)
Date: Mon, 1 May 2017 14:50:48 -0400
Subject: [Python-ideas] Augmented assignment syntax for objects.
In-Reply-To: <20170427232133.GB22525@ando.pearwood.info>
References: <20170427232133.GB22525@ando.pearwood.info>
Message-ID:

On Thu, Apr 27, 2017 at 7:21 PM, Steven D'Aprano wrote:
> On Wed, Apr 26, 2017 at 03:54:22PM -0400, Jerry Hill wrote:
>> On Tue, Apr 25, 2017 at 8:05 PM, Ryan Gonzalez wrote:
>> > def __init__(self, self.attr):
>>
>> I'm not a python developer, I'm just a developer that uses python.
>> That said, I really like this form. It eliminates most of the
>> redundancy, while still being explicit. It's true that you have to
>> repeat the 'self' name, but it feels like the right balance in my
>> mind. It seems like anyone who's familiar with python function
>> definitions would immediately grasp what's going on.
>
> I don't like it, not even a bit. It's not general, "self" is effectively
> a special case. Consider:

I wasn't thinking that 'self' should be a special case at all.

> What happens if you use this syntax in a top-level function rather
> than a method? (Or a static method?)
>
> def function(x, y, x.attr):
>     ...
>
> (And don't forget that behind the scenes, methods *are* functions.) What
> would this syntax even mean?

It would mean the same thing as "def function(x, y, z):", which is: take the first three positional arguments, and assign them to the local names listed in the function definition. The same as

    def function(foo, bar, baz):
        x = foo
        y = bar
        x.attr = baz

> Or if you use some other name other than the first parameter?
>
> def method(self, spam, foo.eggs):
>     ...

What's the problem? How is this any different from

    def method(self, bar, baz):
        spam = bar
        foo.eggs = baz

I mean, it does imply that 'foo' is some globally writable bit of data, being modified via side effect of calling a method, which is certainly a code smell. Still, we're all consenting adults here, and it doesn't seem any worse than the already-legal variant I mention above.

> Conceptually, it is mixing up two distinct tasks:
>
> - declaring the parameter list of the function/method;
>
> - running part of the body of the method.

Is assigning your parameters to their local names really part of 'running the body of the method'? I mean, I almost never write my functions like this:

    def fun(*args):
        foo = args[0]
        bar = args[1]
        baz = args[2]

Do you? If not, aren't you using the function definition to assign parameters to local names? Also, if you do write functions that way, aren't you *also* losing the semantic hints about what parameters are acceptable and what they are used for?

> The parameter list of the method is the *interface* to the method: it
> tells you the public names and default values (and possibly types, if
> you use type annotations) of the method parameters. But this syntax
> overloads the parameter list to show part of the *implementation* of the
> method (namely, that some parameters are assigned directly to attributes
> of self). Every time the implementation changes, the parameter list will
> change too.

I guess I don't agree that the function definition only tells you the 'public names' of the parameters being passed. Instead, I think the function definition tells you the names those parameters are going to go by inside the function. That is, their local names. And if you're taking the parameters and assigning them to attributes of self anyway, then you have the exact same issue with your parameter list changing every time the implementation changes anyway, don't you?

> it doesn't look so good in larger, more realistic cases, especially with
> other syntax. Here's a parameter list taken from some real code of mine,
> with the "self." syntax added:

I won't argue with that, I suppose. I'm not convinced it's worse than the status quo, but yes, if you pack all the possible options into your parameters, you can end up with a messy function signature.
I guess my python experience is atypical if most of your methods take 6+ parameters, including some that are keyword-only, most of which have default values, and are annotated with type hints.

> class BinnedData(object):
>     def __init__(self, self.number:int,
>                  self.start:float=None,
>                  self.end:float=None,
>                  self.width:float=None,
>                  *,
>                  self.policy=None,
>                  self.mark=False
>                  ):
>
> The repetition of "self" is not only tedious and verbose, it adds
> noise to the parameter list and pushes the reader's attention away from
> the important part (the name of the parameter, e.g. "width") and to
> something irrelevant to the interface ("self").
>
> And I am not looking forward to having to explain to beginners to Python
> why this doesn't work:
>
> data = BinnedData(self.number = 8, self.start = 0, self.end = 20)

For what it's worth, if that construct wouldn't work, then I'm convinced that the idea is a bad one. :) I'm not sure I see why that would have to be forbidden, though.

--
Jerry

From tjreedy at udel.edu Mon May 1 17:08:07 2017
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 1 May 2017 17:08:07 -0400
Subject: [Python-ideas] Add an option for delimiters in bytes.hex()
In-Reply-To:
References: <3wGbPx0mQJz105n@submission01.posteo.de> <590771B1.4080707@stoneleaf.us>
Message-ID:

On 5/1/2017 1:41 PM, Alexandre Brault wrote:
> On 2017-05-01 01:34 PM, Ethan Furman wrote:
>> On 05/01/2017 07:04 AM, Juancarlo Añez wrote:
>>> On Mon, May 1, 2017 at 9:38 AM, Nick Coghlan wrote:
>>>
>>>> just support two
>>>> keyword arguments to hex(): "delimiter" (as you suggest) and
>>>> "chunk_size" (defaulting to 1, so you get per-byte chunking by
>>>> default)
>>>
>>> I'd expect "chunk_size" to mean the number of hex digits (not bytes)
>>> per chunk.
>>
>> I was also surprised by that. Also, should Python be used on a
>> machine with, say, 24-bit words then a chunk size of three makes more
>> sense than one of 1.5. ;)
>>
>> --
>> ~Ethan~
> A hex digit is 4 bits long.
> To separate into words, the 24-bit-word
> Python would use 3 (counting in bytes as initially proposed), or 6
> (counting in hex digits). Neither option would result in a 1.5
> chunk_size for 24-bit chunks.
>
> Counting chunk_size in either nibbles or bytes seems equally intuitive to
> me (as long as it's documented).

Call the parameter 'octets' and it should be clear that it means 8-bit chunks. Do any machines now use anything else?

--
Terry Jan Reedy

From ncoghlan at gmail.com Tue May 2 00:52:40 2017
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 2 May 2017 14:52:40 +1000
Subject: [Python-ideas] Add an option for delimiters in bytes.hex()
In-Reply-To: <590771B1.4080707@stoneleaf.us>
References: <3wGbPx0mQJz105n@submission01.posteo.de> <590771B1.4080707@stoneleaf.us>
Message-ID:

On 2 May 2017 at 03:34, Ethan Furman wrote:
> On 05/01/2017 07:04 AM, Juancarlo Añez wrote:
>>
>> On Mon, May 1, 2017 at 9:38 AM, Nick Coghlan wrote:
>>
>>> just support two
>>> keyword arguments to hex(): "delimiter" (as you suggest) and
>>> "chunk_size" (defaulting to 1, so you get per-byte chunking by
>>> default)
>>
>>
>> I'd expect "chunk_size" to mean the number of hex digits (not bytes) per
>> chunk.
>
> I was also surprised by that. Also, should Python be used on a machine
> with, say, 24-bit words then a chunk size of three makes more sense than one
> of 1.5. ;)

I came up with a possible alternative scheme on the issue tracker:

    def hex(self, *, group_digits=None, delimiter=" "):
        """B.hex() -> string of hex digits
        B.hex(group_digits=N) -> hex digits in groups separated by *delimiter*

        Create a string of hexadecimal numbers from a bytes object::

            >>> b'\xb9\x01\xef'.hex()
            'b901ef'
            >>> b'\xb9\x01\xef'.hex(group_digits=2)
            'b9 01 ef'
        """

Advantages of this approach:

- grouping by digits generalises more obviously to other bases (e.g.
  if similar arguments were ever added to the hex/oct/bin builtins)
- by using "group_digits=None" to indicate "no grouping", the default delimiter can be a space rather than the empty string

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From brenbarn at brenbarn.net Mon May 1 15:14:16 2017
From: brenbarn at brenbarn.net (Brendan Barnwell)
Date: Mon, 01 May 2017 12:14:16 -0700
Subject: [Python-ideas] Augmented assignment syntax for objects.
In-Reply-To:
References: <20170427232133.GB22525@ando.pearwood.info>
Message-ID: <59078908.7010907@brenbarn.net>

On 2017-05-01 11:50, Jerry Hill wrote:
>> What happens if you use this syntax in a top-level function rather
>> than a method? (Or a static method?)
>>
>> def function(x, y, x.attr):
>>     ...
>>
>> (And don't forget that behind the scenes, methods *are* functions.) What
>> would this syntax even mean?
> It would mean the same thing as "def function(x, y, z):", which is:
> take the first three positional arguments, and assign them to the
> local names listed in the function definition. The same as
>
>     def function(foo, bar, baz):
>         x = foo
>         y = bar
>         x.attr = baz
>
>
>> Or if you use some other name other than the first parameter?
>>
>> def method(self, spam, foo.eggs):
>>     ...
> What's the problem? How is this any different from
>     def method(self, bar, baz):
>         spam = bar
>         foo.eggs = baz
>
> I mean, it does imply that 'foo' is some globally writable bit of
> data, being modified via side effect of calling a method, which is
> certainly a code smell. Still, we're all consenting adults here, and
> it doesn't seem any worse than the already-legal variant I mention
> above.

I don't think the existing cases are really parallel. In the example with x.attr, you're introducing a dependency among different arguments. That isn't currently possible. You can't define a function like this:

    def function(x, y, x):

So there's never any question of which order function arguments are bound in.
In your system, what would happen if I do this:

    def function(x.attr, x):

Would this, when called, assign an attribute on whatever is referred to by a global variable x, and then, after that, create a local variable x (necessarily shadowing the global whose attribute I just set)? You can argue that assigning in left-to-right order is the natural choice, and I'd agree, but that doesn't change the fact that introducing potential order dependency is new behavior.

You also say that the existing behavior is "assign to local names", but that's just the thing. "x.attr = blah" is not an assignment to a local name, because "x.attr" is not a name. It's an attribute assignment, because "x.attr" is an attribute reference. Those are very different things. (The latter, for instance, can be hooked with __setattr__, but assignment to local names is not hookable.) Right now you can only use function arguments to assign to local names. But if you could start putting other things as function arguments, you could use them to assign to things that are not local names. That is a major change.

--
Brendan Barnwell
"Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail."
   --author unknown

From steve at pearwood.info Tue May 2 07:31:48 2017
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 2 May 2017 21:31:48 +1000
Subject: [Python-ideas] Add an option for delimiters in bytes.hex()
In-Reply-To:
References: <3wGbPx0mQJz105n@submission01.posteo.de>
Message-ID: <20170502113148.GO22525@ando.pearwood.info>

On Mon, May 01, 2017 at 11:38:20PM +1000, Nick Coghlan wrote:

> We're definitely open to offering better formatting options for bytes.hex().
>
> My proposal in https://bugs.python.org/issue22385 was to define a new
> formatting mini-language (akin to the way strftime works, but with a
> much simpler formatting mini-language):
> http://bugs.python.org/issue22385#msg292663
>
> However, a much simpler alternative would be to just support two
> keyword arguments to hex(): "delimiter" (as you suggest) and
> "chunk_size" (defaulting to 1, so you get per-byte chunking by
> default)

I disagree with this approach. There's nothing special about bytes.hex() here, perhaps we want to format the output of hex() or bin() or oct(), or for that matter "%x" and any of the other string templates?

In fact, this is a string operation that could apply to any character string, including decimal digits.

Rather than duplicate the API and logic everywhere, I suggest we add a new string method. My suggestion is str.chunk(size, delimiter=' ') and str.rchunk() with the same arguments:

    "1234ABCDEF".chunk(4)
    => returns "1234 ABCD EF"

rchunk will be useful for money or other situations where we group from the right rather than from the left:

    "$" + str(10**6).rchunk(3, ',')
    => returns "$1,000,000"

And if we want to add bells and whistles, we could accept a tuple for the size argument:

    # Format mobile phone number in the Australian style
    "0412345678".rchunk((4, 3))
    => returns "0412 345 678"

    # Format an integer in the Indian style
    str(123456789).rchunk((3, 2), ",")
    => returns "12,34,56,789"

In the OP's use-case:

    bytes("abcde", "ascii").hex().chunk(2)
    => returns '61 62 63 64 65'

    bytes("abcde", "ascii").hex().chunk(4)
    => returns '6162 6364 65'

I don't see any advantage to adding this to bytes.hex(), hex(), oct(), bin(), and I really don't think it is helpful to be grouping the characters by the number of bits. It's a string formatting operation, not a bit operation.

--
Steve

From jsbueno at python.org.br Tue May 2 08:02:11 2017
From: jsbueno at python.org.br (Joao S. O.
Bueno)
Date: Tue, 2 May 2017 09:02:11 -0300
Subject: [Python-ideas] Add an option for delimiters in bytes.hex()
In-Reply-To:
References: <3wGbPx0mQJz105n@submission01.posteo.de>
Message-ID:

On 1 May 2017 at 11:04, Juancarlo Añez wrote:
>
> On Mon, May 1, 2017 at 9:38 AM, Nick Coghlan wrote:
>>
>> just support two
>> keyword arguments to hex(): "delimiter" (as you suggest) and
>> "chunk_size" (defaulting to 1, so you get per-byte chunking by
>> default)
>
>
> I'd expect "chunk_size" to mean the number of hex digits (not bytes) per
> chunk.

So do I. Moreover, if "1" is for two digits, there is no way to specify single digits - for little use we can perceive for that.

Maybe it does not need to be named "chunk_size" - "digits_per_block" is too big, but is precise.

Also, whatever we think is good for "hex" could also be done to "bin".

>
> Cheers,
>
>
> --
> Juancarlo Añez

From carl.input at gmail.com Tue May 2 08:06:09 2017
From: carl.input at gmail.com (Carl Smith)
Date: Tue, 02 May 2017 12:06:09 +0000
Subject: [Python-ideas] Add an option for delimiters in bytes.hex()
In-Reply-To:
References: <3wGbPx0mQJz105n@submission01.posteo.de>
Message-ID:

Couldn't it just be named `str.delimit`? I totally agree with Steve for what it's worth. Thanks for everything guys. Best,

On Tue, 2 May 2017 13:02 Joao S. O. Bueno, wrote:

> On 1 May 2017 at 11:04, Juancarlo Añez wrote:
> >
> > On Mon, May 1, 2017 at 9:38 AM, Nick Coghlan wrote:
> >>
> >> just support two
> >> keyword arguments to hex(): "delimiter" (as you suggest) and
> >> "chunk_size" (defaulting to 1, so you get per-byte chunking by
> >> default)
> >
> >
> > I'd expect "chunk_size" to mean the number of hex digits (not bytes) per
> > chunk.
> So do I.
> Moreover, if "1" is for two digits, there is no way to
> specify single digits - for little use we can perceive for that.
>
> Maybe it does not need to be named "chunk_size" - "digits_per_block"
> is too big, but is precise.
>
> Also, whatever we think is good for "hex" could also be done to "bin".
>
> >
> > Cheers,
> >
> >
> > --
> > Juancarlo Añez

From robert.hoelzl at posteo.de Tue May 2 08:10:13 2017
From: robert.hoelzl at posteo.de (robert.hoelzl at posteo.de)
Date: Tue, 2 May 2017 14:10:13 +0200
Subject: [Python-ideas] Add a .chunks() method to sequences
Message-ID: <3wHKqT4SwQz108v@submission01.posteo.de>

Steven D'Aprano gave me an idea (in the bytes.hex delimiter discussion):

I very often have the use case that I want to split sequences into subsequences of the same size.

How about adding a chunks() and rchunks() function to sequences:

    [1,2,3,4,5,6,7].chunks(3) => [[1,2,3], [4,5,6], [7]]
    "1234".chunks(2) => ["12", "34"]

(this could then be used to emulate Steven's proposal: " ".join("1234567".chunks(2)) => "12 34 56 7")

robert

From carl.input at gmail.com Tue May 2 08:12:51 2017
From: carl.input at gmail.com (Carl Smith)
Date: Tue, 02 May 2017 12:12:51 +0000
Subject: [Python-ideas] Add an option for delimiters in bytes.hex()
In-Reply-To:
References: <3wGbPx0mQJz105n@submission01.posteo.de>
Message-ID:

Sorry.
I meant to be terse, but wasn't clear enough. I meant the method name.
If it takes a `delimiter` karg, it would be consistent to call the
operation `delimit`.

On Tue, 2 May 2017 13:06 Carl Smith, wrote:

> Couldn't it just be named `str.delimit`? I totally agree with Steve for
> what it's worth. Thanks for everything guys. Best,
>
> On Tue, 2 May 2017 13:02 Joao S. O. Bueno, wrote:
>
>> On 1 May 2017 at 11:04, Juancarlo Añez wrote:
>> >
>> > On Mon, May 1, 2017 at 9:38 AM, Nick Coghlan wrote:
>> >>
>> >> just support two
>> >> keyword arguments to hex(): "delimiter" (as you suggest) and
>> >> "chunk_size" (defaulting to 1, so you get per-byte chunking by
>> >> default)
>> >
>> >
>> > I'd expect "chunk_size" to mean the number of hex digits (not bytes) per
>> > chunk.
>> So do I. Moreover, if "1" is for two digits, there is no way to
>> specify single digits - for little use we can perceive for that.
>>
>> Maybe it does not need to be named "chunk_size" - "digits_per_block"
>> is too big, but is precise.
>>
>> Also, whatever we think is good for "hex" could also be done to "bin" .
>>
>> >
>> > Cheers,
>> >
>> >
>> > --
>> > Juancarlo Añez
From carl.input at gmail.com Tue May 2 08:18:01 2017
From: carl.input at gmail.com (Carl Smith)
Date: Tue, 02 May 2017 12:18:01 +0000
Subject: [Python-ideas] Add an option for delimiters in bytes.hex()
In-Reply-To:
References: <3wGbPx0mQJz105n@submission01.posteo.de>
Message-ID:

On the block size arg, couldn't it just be named `index`?

On Tue, 2 May 2017 13:12 Carl Smith, wrote:

> Sorry. I meant to be terse, but wasn't clear enough. I meant the method
> name. If it takes a `delimiter` karg, it would be consistent to call the
> operation `delimit`.
>
> On Tue, 2 May 2017 13:06 Carl Smith, wrote:
>
>> Couldn't it just be named `str.delimit`? I totally agree with Steve for
>> what it's worth. Thanks for everything guys. Best,
>>
>> On Tue, 2 May 2017 13:02 Joao S. O. Bueno, wrote:
>>
>>> On 1 May 2017 at 11:04, Juancarlo Añez wrote:
>>> >
>>> > On Mon, May 1, 2017 at 9:38 AM, Nick Coghlan wrote:
>>> >>
>>> >> just support two
>>> >> keyword arguments to hex(): "delimiter" (as you suggest) and
>>> >> "chunk_size" (defaulting to 1, so you get per-byte chunking by
>>> >> default)
>>> >
>>> >
>>> > I'd expect "chunk_size" to mean the number of hex digits (not bytes) per
>>> > chunk.
>>> So do I. Moreover, if "1" is for two digits, there is no way to
>>> specify single digits - for little use we can perceive for that.
>>>
>>> Maybe it does not need to be named "chunk_size" - "digits_per_block"
>>> is too big, but is precise.
>>>
>>> Also, whatever we think is good for "hex" could also be done to "bin" .
>>>
>>> >
>>> > Cheers,
>>> >
>>> >
>>> > --
>>> > Juancarlo Añez

From geoffspear at gmail.com Tue May 2 08:33:17 2017
From: geoffspear at gmail.com (Geoffrey Spear)
Date: Tue, 02 May 2017 12:33:17 +0000
Subject: [Python-ideas] Add a .chunks() method to sequences
In-Reply-To: <3wHKqT4SwQz108v@submission01.posteo.de>
References: <3wHKqT4SwQz108v@submission01.posteo.de>
Message-ID:

On Tue, May 2, 2017 at 8:10 AM wrote:

> Steven D'Aprano was giving me an idea (in the bytes.hex delimiter
> discussion):
>
> I had very often the use case that I want to split sequences into
> subsequences of same size.
>
> How about adding a chunks() and rchunks() function to sequences:
>
> [1,2,3,4,5,6,7].chunks(3) => [[1,2,3], [4,5,6], [7]]
>
> "1234".chunks(2) => ["12", "34"]
>
> (this could then be used to emulate stevens proposal:
> " ".join("1234567".chunks(2)) => "12 34 56 7")

Changing the definition of the Sequence ABC to avoid needing to use a 2-line
function from the itertools recipes seems like a pretty drastic change. I
don't think there's even a compelling argument for adding grouper() to
itertools, let alone to every single sequence.
URL: From ncoghlan at gmail.com Tue May 2 09:45:35 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 2 May 2017 23:45:35 +1000 Subject: [Python-ideas] Add an option for delimiters in bytes.hex() In-Reply-To: <20170502113148.GO22525@ando.pearwood.info> References: <3wGbPx0mQJz105n@submission01.posteo.de> <20170502113148.GO22525@ando.pearwood.info> Message-ID: On 2 May 2017 at 21:31, Steven D'Aprano wrote: > On Mon, May 01, 2017 at 11:38:20PM +1000, Nick Coghlan wrote: >> However, a much simpler alternative would be to just support two >> keyword arguments to hex(): "delimiter" (as you suggest) and >> "chunk_size" (defaulting to 1, so you get per-byte chunking by >> default) > > I disagree with this approach. There's nothing special about bytes.hex() > here, perhaps we want to format the output of hex() or bin() or oct(), > or for that matter "%x" and any of the other string templates? > > In fact, this is a string operation that could apply to any character > string, including decimal digits. > > Rather than duplicate the API and logic everywhere, I suggest we add a > new string method. My suggestion is str.chunk(size, delimiter=' ') and > str.rchunk() with the same arguments: > > "1234ABCDEF".chunk(4) > => returns "1234 ABCD EF" > > rchunk will be useful for money or other situations where we group from > the right rather than from the left: > > "$" + str(10**6).rchunk(3, ',') > => returns "$1,000,000" Nice. That proposal also addresses one of the problems I raised in the issue tracker, which is that the decimal equivalent to hex/oct/bin is just str, so anything based on keyword arguments to the display functions is hard to apply to ordinary decimal numbers. Attempting to align the terminology with existing string methods and other stdlib APIs: 1. the programming FAQ uses "chunks" as the accumulation variable prior to calling str.join(): https://docs.python.org/3/faq/programming.html#what-is-the-most-efficient-way-to-concatenate-many-strings-together 2. 
the most analogous itertools recipe is the "grouper" recipe, which
describes its purpose as "Collect data into fixed-length chunks or blocks"
3. there's a top level "chunk" module for working with audio file
formats (today-I-learned...)
4. multiprocessing uses "chunksize" to manage the dispatching of work
to worker processes
5. various networking, IO and serialisation libraries use "chunk" to
describe data blocks for incremental reads and writes

I think a couple of key problems are illustrated by that survey:

1. we don't have any current APIs or documentation that use "chunk" in
combination with any kind of delimiter
2. we don't have any current APIs or documentation that use "chunk" as
a verb - they all use it as a noun

So if we went with this approach, then Carl Smith's suggestion of
"str.delimit()" likely makes sense.

However, the other question worth asking is whether we might want a
"string slice splitting" operation rather than a string delimiting
option: once you have the slices, then combining them again with
str.join is straightforward, but extracting the slices in the first
place is currently a little fiddly (especially for the reversed case):

    def splitslices(self, size):
        return [self[start:start+size] for start in range(0, len(self), size)]

    def rsplitslices(self, size):
        # Count blocks from the right, so any short block comes first
        first = len(self) % size
        blocks = [self[:first]] if first else []
        blocks.extend(self[start:start+size] for start in range(first, len(self), size))
        return blocks

Given those methods, the split-and-rejoin use case that started the
thread would look like:

    " ".join("1234ABCDEF".splitslices(4))
    => "1234 ABCD EF"

    "$" + ",".join(str(10**6).rsplitslices(3))
    => "$1,000,000"

Which is the same pattern that can be used to change a delimiter with
str.split() and str.splitlines().

Cheers,
Nick.
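
A minimal standalone sketch of the two helpers above, written as plain functions so they can be tried without changing `str`. The names `split_slices` and `rsplit_slices` are placeholders for illustration, not an existing API. The right-hand variant works out the short leading block explicitly, which is what the `$1,000,000` example relies on.

```python
def split_slices(text, size):
    """Split text into blocks of `size`, counting from the left."""
    return [text[start:start + size] for start in range(0, len(text), size)]


def rsplit_slices(text, size):
    """Split text into blocks of `size`, counting from the right."""
    # Any short block ends up first, e.g. "1000000" gives "1", "000", "000".
    first = len(text) % size
    blocks = [text[:first]] if first else []
    blocks.extend(text[start:start + size] for start in range(first, len(text), size))
    return blocks


print(" ".join(split_slices("1234ABCDEF", 4)))       # 1234 ABCD EF
print("$" + ",".join(rsplit_slices(str(10**6), 3)))  # $1,000,000
```

Returning a list keeps the joining step separate, so the same helpers serve any separator.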
--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ncoghlan at gmail.com Tue May 2 10:09:21 2017
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 3 May 2017 00:09:21 +1000
Subject: [Python-ideas] Add a .chunks() method to sequences
In-Reply-To: <3wHKqT4SwQz108v@submission01.posteo.de>
References: <3wHKqT4SwQz108v@submission01.posteo.de>
Message-ID:

On 2 May 2017 at 22:10, wrote:
> Steven D'Aprano was giving me an idea (in the bytes.hex delimiter
> discussion):
>
> I had very often the use case that I want to split sequences into
> subsequences of same size.
>
> How about adding a chunks() and rchunks() function to sequences:
>
> [1,2,3,4,5,6,7].chunks(3) => [[1,2,3], [4,5,6], [7]]
>
> "1234".chunks(2) => ["12", "34"]
>
> (this could then be used to emulate stevens proposal:
> " ".join("1234567".chunks(2)) => "12 34 56 7")

While there may be a case for a "splitslices()" method on strings for
text formatting purposes, that case is weaker for generic sequences -
we don't offer general purpose equivalents to split() or partition()
either.

That said, the possibility of a sequence or container focused
counterpart to "itertools" has come up before, and it's conceivable
such an algorithm might find a home there.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From steve at pearwood.info Tue May 2 12:28:12 2017
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 3 May 2017 02:28:12 +1000
Subject: [Python-ideas] Add an option for delimiters in bytes.hex()
In-Reply-To:
References: <3wGbPx0mQJz105n@submission01.posteo.de> <20170502113148.GO22525@ando.pearwood.info>
Message-ID: <20170502162809.GP22525@ando.pearwood.info>

On Tue, May 02, 2017 at 11:45:35PM +1000, Nick Coghlan wrote:

> Attempting to align the terminology with existing string methods and
> other stdlib APIs:
[...]
> 1. we don't have any current APIs or documentation that use "chunk" in
> combination with any kind of delimiter
> 2.
we don't have any current APIs or documentation that use "chunk" as
> a verb - they all use it as a noun

English has a long and glorious tradition of verbing nouns, and nouning
verbs. Group can mean the action of putting things into a group, join
likewise refers to both the action of attaching two things and the seam
or joint where they have been joined. Likewise for chunking:

https://duckduckgo.com/html/?q=chunking

"Chunk" has been used as a verb since at least 1890 (albeit with a
different meaning). None of my dictionaries give a date for the use of
chunking to mean dividing something up into chunks, so that could be
quite recent, but it's well-established in education (chunking as a
technique for doing long division), psychology, linguistics and more.

I remember using "chunking" as a verb to describe Hyperscript's text
handling back in the mid 1980s, e.g. "word 2 of line 6 of text". The
nltk library handles chunk as both a noun and verb in a similar sense:

http://www.nltk.org/howto/chunk.html

> So if we went with this approach, then Carl Smith's suggestion of
> "str.delimit()" likely makes sense.

The problem with "delimit" is that in many contexts it refers to marking
both the start and end boundaries, e.g. people often refer to string
delimiters '...' and list delimiters [...]. That doesn't apply here,
where we're adding separators between chunks/groups.

The term delimiter can be used in various ways, and some of them do not
match the behaviour we want here:

http://stackoverflow.com/questions/9118769/when-to-use-the-terms-delimiter-terminator-and-separator

In this case, we are not adding delimiters, we're adding separators.
We're chunking (or grouping) characters by counting them, then
separating the groups.

The test here is what happens if the string is shorter than the group size?
"xyz".chunk(5, '*') If we're delimiting the boundaries of the group, then I expect that we should get "*xyz*", but if we're separating groups, I expect that we should get "xyz" unchanged. > However, the other question worth asking is whether we might want a > "string slice splitting" operation rather than a string delimiting > option: once you have the slices, then combining them again with > str.join is straightforward, but extracting the slices in the first > place is currently a little fiddly (especially for the reversed case): Let me think about that :-) -- Steve From mertz at gnosis.cx Tue May 2 13:48:08 2017 From: mertz at gnosis.cx (David Mertz) Date: Tue, 2 May 2017 10:48:08 -0700 Subject: [Python-ideas] Add an option for delimiters in bytes.hex() In-Reply-To: <20170502113148.GO22525@ando.pearwood.info> References: <3wGbPx0mQJz105n@submission01.posteo.de> <20170502113148.GO22525@ando.pearwood.info> Message-ID: On Tue, May 2, 2017 at 4:31 AM, Steven D'Aprano wrote: > Rather than duplicate the API and logic everywhere, I suggest we add a > new string method. My suggestion is str.chunk(size, delimiter=' ') and > str.rchunk() with the same arguments: > > "1234ABCDEF".chunk(4) > => returns "1234 ABCD EF" > > rchunk will be useful for money or other situations where we group from > the right rather than from the left: > > "$" + str(10**6).rchunk(3, ',') > => returns "$1,000,000" > > # Format mobile phone number in the Australian style > "04123456".rchunk((4, 3)) > => returns "0412 345 678" > > # Format an integer in the Indian style > str(123456789).rchunk((3, 2), ",") > => returns "12,34,56,789" > I like this general idea very much. Dealing with lakh and crore is a very nice feature (and one that the `.format()` mini-language sadly fails to handle; it assumes numeric delimiters can only be commas, and only ever three positions). But I'm not sure the semantics you propose is flexible enough. I take it that the tuple means (, ) from your examples. 
But I don't think that suffices for every common format. It would be fine to get a USA phone number like: str(4135559414).rchunk((4,3),'-') # -> 413-555-9414 But for example, looking somewhat at random at an international call ( https://en.wikipedia.org/wiki/Telephone_numbers_in_Belgium) *Dialing from New York to Brussel**011-32-2-555-12-12* - Omitting the leading "0". Maybe your API is for any length tuple, with the final element repeated. So I guess maybe this example could be: "0113225551212".rchunk((2,2,3,1,2,3),'-') I don't care about this method being called .chunk() vs. .delimit() vs. something else. -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th. -------------- next part -------------- An HTML attachment was scrubbed... URL: From carl.input at gmail.com Tue May 2 14:46:57 2017 From: carl.input at gmail.com (Carl Smith) Date: Tue, 2 May 2017 19:46:57 +0100 Subject: [Python-ideas] Add an option for delimiters in bytes.hex() In-Reply-To: References: <3wGbPx0mQJz105n@submission01.posteo.de> <20170502113148.GO22525@ando.pearwood.info> Message-ID: The main reason for naming it `delimit` was to be consistent with the karg `delimiter`, so `str.delimit(index, delimiter)`. You could call it `chop` I guess, but I'm just bikeshedding, so will leave it while you guys figure out the important stuff. -- Carl Smith carl.input at gmail.com On 2 May 2017 at 18:48, David Mertz wrote: > On Tue, May 2, 2017 at 4:31 AM, Steven D'Aprano > wrote: > >> Rather than duplicate the API and logic everywhere, I suggest we add a >> new string method. 
My suggestion is str.chunk(size, delimiter=' ') and
>> str.rchunk() with the same arguments:
>>
>> "1234ABCDEF".chunk(4)
>> => returns "1234 ABCD EF"
>>
>> rchunk will be useful for money or other situations where we group from
>> the right rather than from the left:
>>
>> "$" + str(10**6).rchunk(3, ',')
>> => returns "$1,000,000"
>>
>> # Format mobile phone number in the Australian style
>> "04123456".rchunk((4, 3))
>> => returns "0412 345 678"
>>
>> # Format an integer in the Indian style
>> str(123456789).rchunk((3, 2), ",")
>> => returns "12,34,56,789"
>>
>
> I like this general idea very much. Dealing with lakh and crore is a very
> nice feature (and one that the `.format()` mini-language sadly fails to
> handle; it assumes numeric delimiters can only be commas, and only ever
> three positions).
>
> But I'm not sure the semantics you propose is flexible enough. I take it
> that the tuple means (, ) from your
> examples. But I don't think that suffices for every common format. It
> would be fine to get a USA phone number like:
>
> str(4135559414).rchunk((4,3),'-') # -> 413-555-9414
>
> But for example, looking somewhat at random at an international call (
> https://en.wikipedia.org/wiki/Telephone_numbers_in_Belgium)
>
> *Dialing from New York to Brussel**011-32-2-555-12-12* - Omitting the leading "0".
>
> Maybe your API is for any length tuple, with the final element repeated.
> So I guess maybe this example could be:
>
> "0113225551212".rchunk((2,2,3,1,2,3),'-')
>
> I don't care about this method being called .chunk() vs. .delimit() vs.
> something else.
>
> --
> Keeping medicines from the bloodstreams of the sick; food
> from the bellies of the hungry; books from the hands of the
> uneducated; technology from the underdeveloped; and putting
> advocates of freedom in prisons. Intellectual property is
> to the 21st century what the slave trade was to the 16th.
> > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ram at rachum.com Tue May 2 15:07:37 2017 From: ram at rachum.com (Ram Rachum) Date: Tue, 2 May 2017 22:07:37 +0300 Subject: [Python-ideas] Suggestion: Add shutil.get_dir_size Message-ID: Hi, I have a suggestion: Add a function shutil.get_dir_size that gets the size of a directory, including all the items inside it recursively. I currently need this functionality and it looks like I'll have to write my own function for it. Cheers, Ram. -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg.ewing at canterbury.ac.nz Tue May 2 18:10:29 2017 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 03 May 2017 10:10:29 +1200 Subject: [Python-ideas] Add an option for delimiters in bytes.hex() In-Reply-To: References: <3wGbPx0mQJz105n@submission01.posteo.de> Message-ID: <590903D5.2030800@canterbury.ac.nz> For a name, I think "group" would be better than "chunk". We talk about grouping the digits of a number, not chunking them. -- Greg From python at lucidity.plus.com Tue May 2 18:39:48 2017 From: python at lucidity.plus.com (Erik) Date: Tue, 2 May 2017 23:39:48 +0100 Subject: [Python-ideas] Add an option for delimiters in bytes.hex() In-Reply-To: <20170502113148.GO22525@ando.pearwood.info> References: <3wGbPx0mQJz105n@submission01.posteo.de> <20170502113148.GO22525@ando.pearwood.info> Message-ID: On 02/05/17 12:31, Steven D'Aprano wrote: > I disagree with this approach. There's nothing special about bytes.hex() > here, perhaps we want to format the output of hex() or bin() or oct(), > or for that matter "%x" and any of the other string templates? 
> > In fact, this is a string operation that could apply to any character > string, including decimal digits. > > Rather than duplicate the API and logic everywhere, I suggest we add a > new string method. My suggestion is str.chunk(size, delimiter=' ') and > str.rchunk() with the same arguments: > > "1234ABCDEF".chunk(4) > => returns "1234 ABCD EF" FWIW, I implemented a version of something similar as a fixed-length "chunk" method in itertoolsmodule.c (it was similar to izip_longest - it had a "fill" keyword to pad the final chunk). It was ~100 LOC including the structure definitions. The chunk method was an iterator (so it returned a sequence of "chunks" as defined by the API). Then I read that "itertools" should consist of primitives only and that we should defer to "moreitertools" for anything that is of a higher level (which this is - it can be done in terms of itertools functions). So I didn't propose it, although the processing of my WAV files (in which the sample data are groups of bytes - frames - of a fixed length) was significantly faster with it :( I also looked at implementing itertools.chunk as a function that would make use of a "__chunk__" method on the source object if it existed (which allowed a class to support an even more efficient version of chunking - things like range() etc). > I don't see any advantage to adding this to bytes.hex(), hex(), oct(), > bin(), and I really don't think it is helpful to be grouping the > characters by the number of bits. Its a string formatting operation, not > a bit operation. Why do you want to limit it to strings? Isn't something like this potentially useful for all sequences (where the result is a tuple of objects that are the same as the source sequence - be that strings or lists or lazy ranges or whatever?). Why aren't the chunks returned via an iterator? E. 
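
A rough sketch of the generic, iterator-returning alternative raised in this message. It assumes a hypothetical helper named `chunks` (not an itertools function and not the C implementation described above) and relies only on `len()` and slicing, so each chunk has the same type as the source sequence.

```python
def chunks(seq, size):
    """Lazily yield consecutive slices of `seq`, each at most `size` long."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(seq), size):
        # Slicing preserves the source type: str gives str, list gives list,
        # and range gives range.
        yield seq[start:start + size]


print(list(chunks("1234ABCDEF", 4)))           # ['1234', 'ABCD', 'EF']
print(list(chunks([1, 2, 3, 4, 5, 6, 7], 3)))  # [[1, 2, 3], [4, 5, 6], [7]]
print(" ".join(chunks("1234567", 2)))          # 12 34 56 7
```

Because it needs `len()` and slicing, this version covers sequences only; arbitrary iterators would need something closer to the itertools "grouper" recipe.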
From python at lucidity.plus.com Tue May 2 18:55:48 2017 From: python at lucidity.plus.com (Erik) Date: Tue, 2 May 2017 23:55:48 +0100 Subject: [Python-ideas] Augmented assignment syntax for objects. In-Reply-To: References: <20170425025305.GC20708@ando.pearwood.info> <8ecf5b9c-291d-5068-77eb-b50bdd822bd6@mgmiller.net> <1f9146c9-2a8c-040e-22a7-f7f7b7df2489@lucidity.plus.com> Message-ID: <82b220c7-1f9f-d66f-7c4f-ec451e15fb68@lucidity.plus.com> On 26/04/17 21:50, Chris Angelico wrote: > On Thu, Apr 27, 2017 at 6:24 AM, Erik wrote: >> The background is that what I find myself doing a lot of for private >> projects is importing data from databases into a structured collection of >> objects and then grouping and analyzing the data in different ways before >> graphing the results. >> >> So yes, I tend to have classes that accept their entire object state as >> parameters to the __init__ method (from the database values) and then any >> other methods in the class are generally to do with the subsequent analysis >> (including dunder methods for iteration, rendering and comparison etc). > > You may want to try designing your objects as namedtuples. That gives > you a lot of what you're looking for. I did look at this. It looked promising. What I found was that I spent a lot of time working out how to subclass namedtuples properly (I do need to do that to add the extra logic - and sometimes some state - for my analysis) and once I got that working, I was left with a whole different set of boilerplate and special cases and therefore another set of things to remember if I return to this code at some point. So I've reverted to regular classes and multiple assignments in __init__. E. 
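
For reference, a minimal sketch of the namedtuple-subclass pattern being weighed here. The class name `Sample` and its fields are invented for illustration and are not taken from the code under discussion; the sketch covers only added methods, not extra per-instance state.

```python
from collections import namedtuple


class Sample(namedtuple("Sample", ["sensor", "value", "unit"])):
    """A row imported from a database, plus an analysis helper."""

    __slots__ = ()  # no per-instance __dict__, so it stays as light as a tuple

    def scaled(self, factor):
        """Return a copy with `value` multiplied by `factor`."""
        return self._replace(value=self.value * factor)


row = Sample("t1", 2.5, "V")
print(row.scaled(2))  # Sample(sensor='t1', value=5.0, unit='V')
```

Adding mutable state on top of this means overriding `__new__` or dropping `__slots__`, which is the kind of extra boilerplate the message describes.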
From steve at pearwood.info Tue May 2 20:01:17 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 3 May 2017 10:01:17 +1000 Subject: [Python-ideas] Add an option for delimiters in bytes.hex() In-Reply-To: References: <3wGbPx0mQJz105n@submission01.posteo.de> <20170502113148.GO22525@ando.pearwood.info> Message-ID: <20170503000117.GR22525@ando.pearwood.info> On Tue, May 02, 2017 at 10:48:08AM -0700, David Mertz wrote: > Maybe your API is for any length tuple, with the final element repeated. > So I guess maybe this example could be: > > "0113225551212".rchunk((2,2,3,1,2,3),'-') That's what I meant. -- Steve From cs at zip.com.au Tue May 2 18:57:33 2017 From: cs at zip.com.au (Cameron Simpson) Date: Wed, 3 May 2017 08:57:33 +1000 Subject: [Python-ideas] Suggestion: Add shutil.get_dir_size In-Reply-To: References: Message-ID: <20170502225733.GA5850@cskk.homeip.net> On 02May2017 22:07, Ram Rachum wrote: >I have a suggestion: Add a function shutil.get_dir_size that gets the size >of a directory, including all the items inside it recursively. I currently >need this functionality and it looks like I'll have to write my own >function for it. Feels like a rather niche function. Had you considered just calling "du" via subprocess and reading the number it returns? Cheers, Cameron Simpson From steve at pearwood.info Tue May 2 20:43:33 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 3 May 2017 10:43:33 +1000 Subject: [Python-ideas] Add an option for delimiters in bytes.hex() In-Reply-To: References: <3wGbPx0mQJz105n@submission01.posteo.de> <20170502113148.GO22525@ando.pearwood.info> Message-ID: <20170503004330.GS22525@ando.pearwood.info> On Tue, May 02, 2017 at 11:39:48PM +0100, Erik wrote: > On 02/05/17 12:31, Steven D'Aprano wrote: > >Rather than duplicate the API and logic everywhere, I suggest we add a > >new string method. 
My suggestion is str.chunk(size, delimiter=' ') and > >str.rchunk() with the same arguments: For the record, I now think the second argument should be called "sep", for separator, and I'm okay with Greg's suggestion we call the method "group". > >"1234ABCDEF".chunk(4) > >=> returns "1234 ABCD EF" [...] > Why do you want to limit it to strings? I'm not stopping anyone from proposing a generalisation of this that works with other sequence types. As somebody did :-) I've also been thinking about generalisations such as grouping lines into paragraphs, words into lines, etc. In text processing, chunking can refer to more than just characters. But here we have a specific, concrete use-case that involves strings. Anything else is YAGNI until a need is demonstrated :-) > Isn't something like this > potentially useful for all sequences (where the result is a tuple of > objects that are the same as the source sequence - be that strings or > lists or lazy ranges or whatever?). Why aren't the chunks returned via > an iterator? String methods should return strings. That's not to argue against a generic iterator solution, but the barrier to use of an iterator solution is higher than just calling a method. You have to learn about importing, you need to know there is an itertools module (or a third party module to install first!), you have to know how to convert the iterator back to a string... -- Steve From apalala at gmail.com Tue May 2 21:07:41 2017 From: apalala at gmail.com (=?UTF-8?Q?Juancarlo_A=C3=B1ez?=) Date: Tue, 2 May 2017 21:07:41 -0400 Subject: [Python-ideas] Add an option for delimiters in bytes.hex() In-Reply-To: <20170503004330.GS22525@ando.pearwood.info> References: <3wGbPx0mQJz105n@submission01.posteo.de> <20170502113148.GO22525@ando.pearwood.info> <20170503004330.GS22525@ando.pearwood.info> Message-ID: On Tue, May 2, 2017 at 8:43 PM, Steven D'Aprano wrote: > String methods should return strings. 
>

>>> "A-B-C".split("-")
['A', 'B', 'C']

If chunk() worked for all iterables:

>>> " ".join("1234ABCDEF".chunk(4))
"1234 ABCD EF"

Cheers,

--
Juancarlo Añez

From python at lucidity.plus.com Tue May 2 21:48:03 2017
From: python at lucidity.plus.com (Erik)
Date: Wed, 3 May 2017 02:48:03 +0100
Subject: [Python-ideas] Add an option for delimiters in bytes.hex()
In-Reply-To: <20170503004330.GS22525@ando.pearwood.info>
References: <3wGbPx0mQJz105n@submission01.posteo.de> <20170502113148.GO22525@ando.pearwood.info> <20170503004330.GS22525@ando.pearwood.info>
Message-ID:

On 03/05/17 01:43, Steven D'Aprano wrote:
> On Tue, May 02, 2017 at 11:39:48PM +0100, Erik wrote:
>> On 02/05/17 12:31, Steven D'Aprano wrote:
>
>>> Rather than duplicate the API and logic everywhere, I suggest we add a
>>> new string method. My suggestion is str.chunk(size, delimiter=' ') and
>>> str.rchunk() with the same arguments:
>
> For the record, I now think the second argument should be called "sep",
> for separator, and I'm okay with Greg's suggestion we call the method
> "group".
>
>
>>> "1234ABCDEF".chunk(4)
>>> => returns "1234 ABCD EF"
> [...]
>
>> Why do you want to limit it to strings?
>
> I'm not stopping anyone from proposing a generalisation of this that
> works with other sequence types. As somebody did :-)

Who? I didn't spot that in the thread - please give a reference. Thanks.

Anyway, I know you can't stop anyone from *proposing* something like
this, but as soon as they do you may decide to quote the recipe from
"https://docs.python.org/3/library/functions.html#zip" and try to block
their proposition. There are already threads on fora that do that.

That was my sticking point at the time when I implemented a general solution.
Why bother to propose something that (although it made my code significantly faster) had already been blocked as being something that should be a python-level operation and not something to be included in a built-in? > String methods should return strings. In that case, we need to fix this ASAP ;) : >>> 'foobarbaz'.split('o') ['f', '', 'barbaz'] Where the result is reasonably a sequence, a method should return a sequence (but I would agree that it should generally be a sequence of objects of the source type - which I think is what I effectively said: "Isn't something like this potentially useful for all sequences (where the result is a [sequence] of objects that are the same [type] as the source sequence)" > That's not to argue against a generic iterator solution, but the barrier > to use of an iterator solution is higher than just calling a method. Knowing which sequence classes have a "chunk" method and which don't is a higher barrier than knowing that all sequences can be "chunked" by a single imported function. E. From storchaka at gmail.com Wed May 3 01:43:48 2017 From: storchaka at gmail.com (Serhiy Storchaka) Date: Wed, 3 May 2017 08:43:48 +0300 Subject: [Python-ideas] Suggestion: Add shutil.get_dir_size In-Reply-To: References: Message-ID: On 02.05.17 22:07, Ram Rachum wrote: > I have a suggestion: Add a function shutil.get_dir_size that gets the > size of a directory, including all the items inside it recursively. I > currently need this functionality and it looks like I'll have to write > my own function for it. The comprehensive implementation should take into account hard links, mount points, variable block sizes, sparse files, transparent files and blocks compression, file tails packing, blocks deduplication, additional file streams, file versioning, and many many other FS specific features. 
If you implement a module providing this feature, publish it on PyPI and prove its usefulness for common Python user, it may be considered for inclusion in the Python standard library. From greg.ewing at canterbury.ac.nz Wed May 3 02:08:35 2017 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 03 May 2017 18:08:35 +1200 Subject: [Python-ideas] Add an option for delimiters in bytes.hex() In-Reply-To: <20170503004330.GS22525@ando.pearwood.info> References: <3wGbPx0mQJz105n@submission01.posteo.de> <20170502113148.GO22525@ando.pearwood.info> <20170503004330.GS22525@ando.pearwood.info> Message-ID: <590973E3.1020708@canterbury.ac.nz> Steven D'Aprano wrote: > I've also been thinking about generalisations such as grouping lines > into paragraphs, words into lines, etc. You're probably going to want considerably more complicated algorithms for that kind of thing, though. Let's keep it simple. -- Greg From ram at rachum.com Wed May 3 02:13:31 2017 From: ram at rachum.com (Ram Rachum) Date: Wed, 3 May 2017 09:13:31 +0300 Subject: [Python-ideas] Suggestion: Add shutil.get_dir_size In-Reply-To: <20170502225733.GA5850@cskk.homeip.net> References: <20170502225733.GA5850@cskk.homeip.net> Message-ID: Calling `du` is possible but I prefer to avoid these kinds of solutions. (OS-specific, parsing text output from a third-party program.) On Wed, May 3, 2017 at 1:57 AM, Cameron Simpson wrote: > On 02May2017 22:07, Ram Rachum wrote: > >> I have a suggestion: Add a function shutil.get_dir_size that gets the size >> of a directory, including all the items inside it recursively. I currently >> need this functionality and it looks like I'll have to write my own >> function for it. >> > > Feels like a rather niche function. Had you considered just calling "du" > via subprocess and reading the number it returns? > > Cheers, > Cameron Simpson
From p.f.moore at gmail.com Wed May 3 03:57:55 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 3 May 2017 08:57:55 +0100 Subject: [Python-ideas] Add an option for delimiters in bytes.hex() In-Reply-To: References: <3wGbPx0mQJz105n@submission01.posteo.de> <20170502113148.GO22525@ando.pearwood.info> <20170503004330.GS22525@ando.pearwood.info> Message-ID: On 3 May 2017 at 02:48, Erik wrote: > Anyway, I know you can't stop anyone from *proposing* something like this, > but as soon as they do you may decide to quote the recipe from > "https://docs.python.org/3/library/functions.html#zip" and try to block > their proposition. There are already threads on fora that do that. > > That was my sticking point at the time when I implemented a general > solution. Why bother to propose something that (although it made my code > significantly faster) had already been blocked as being something that > should be a python-level operation and not something to be included in a > built-in? It sounds like you have a reasonable response to the suggestion of using zip - that you have a use case where performance matters, and your proposed solution is of value in that case. Whether it's a *sufficient* response remains to be seen, but unless you present the argument we won't know. IMO, the idea behind itertools being building blocks is not to deter proposals for new tools, but to make sure that people focus on providing important low-level tools, and not on high level operations that can just as easily be written using those tools - essentially the guideline "not every 3-line function needs to be a builtin". So it's to make people think, not to block innovation.
Hope this clarifies, Paul From p.f.moore at gmail.com Wed May 3 03:59:36 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 3 May 2017 08:59:36 +0100 Subject: [Python-ideas] Suggestion: Add shutil.get_dir_size In-Reply-To: References: Message-ID: On 3 May 2017 at 06:43, Serhiy Storchaka wrote: > On 02.05.17 22:07, Ram Rachum wrote: >> >> I have a suggestion: Add a function shutil.get_dir_size that gets the >> size of a directory, including all the items inside it recursively. I >> currently need this functionality and it looks like I'll have to write >> my own function for it. > > > The comprehensive implementation should take into account hard links, mount > points, variable block sizes, sparse files, transparent files and blocks > compression, file tails packing, blocks deduplication, additional file > streams, file versioning, and many many other FS specific features. If you > implement a module providing this feature, publish it on PyPI and prove its > usefulness for common Python user, it may be considered for inclusion in the > Python standard library. +1 I would be interested in a pure-Python version of "du", but would expect to see it as a 3rd party module in the first instance (at least until all of the bugs have been thrashed out) rather than being immediately proposed for the stdlib. Paul From cs at zip.com.au Wed May 3 03:01:12 2017 From: cs at zip.com.au (Cameron Simpson) Date: Wed, 3 May 2017 17:01:12 +1000 Subject: [Python-ideas] Suggestion: Add shutil.get_dir_size In-Reply-To: References: Message-ID: <20170503070112.GA18066@cskk.homeip.net> On 03May2017 09:13, Ram Rachum wrote: >Calling `du` is possible but I prefer to avoid these kinds of solutions. >(OS-specific, parsing text output from a third-party program.) Your choice, and fair enough. 
However I'd point out that "du" is available on all UNIX systems (including MacOS) and UNIXlike systems (Linux in its many forms), and has the advantage that it embodies all the platform specific knowledge you need as alluded to in another reply; hard links, block counts, things-not-plain-files and so forth. Of course, I speak here with some hypocrisy because I'm the kind of person who'd write his own for things like this sometimes also. Starting with Python's os.walk function would get you off the ground. If you end up with something clean and usable by others, publish it to PyPI. Cheers, Cameron Simpson From steve at pearwood.info Wed May 3 05:20:48 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 3 May 2017 19:20:48 +1000 Subject: [Python-ideas] Add an option for delimiters in bytes.hex() In-Reply-To: References: <3wGbPx0mQJz105n@submission01.posteo.de> <20170502113148.GO22525@ando.pearwood.info> <20170503004330.GS22525@ando.pearwood.info> Message-ID: <20170503092046.GT22525@ando.pearwood.info> On Tue, May 02, 2017 at 09:07:41PM -0400, Juancarlo Añez wrote: > On Tue, May 2, 2017 at 8:43 PM, Steven D'Aprano wrote: > > > String methods should return strings. > > > > >>> "A-B-C".split("-") > ['A', 'B', 'C'] Yes, thank you. And don't forget: py> 'abcd'.index('c') 2 But in context, I was responding to the question of why this proposed chunk()/group() method returns a string rather than an iterator.
I worded my answer badly, but the intention was clear, at least in my own head *wink* Given that we were discussing a method that both groups the characters of a string and inserts the separators, it makes sense to return a string, like other string methods: 'foo'.upper() returns 'FOO', not iter(['F', 'O', 'O']) 'cheese and green eggs'.replace('green', 'red') returns a string, not iter(['cheese and ', 'red', ' eggs']) 'xyz'.zfill(5) returns '00xyz' not iter(['00', 'xyz']) etc, and likewise: 'abcdef'.chunk(2, sep='-') should return 'ab-cd-ef' rather than iter(['ab', '-', 'cd', '-', 'ef']) If we're talking about a different API, one where only the grouping is done and inserting separators is left for join(), then my answer will be different. In that case, then it is a matter of taste whether to return a list (like split()) or an iterator. I lean slightly towards returning a list, but I can see arguments for and against both. -- Steve From steve at pearwood.info Wed May 3 05:28:00 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 3 May 2017 19:28:00 +1000 Subject: [Python-ideas] Add an option for delimiters in bytes.hex() In-Reply-To: References: <3wGbPx0mQJz105n@submission01.posteo.de> <20170502113148.GO22525@ando.pearwood.info> <20170503004330.GS22525@ando.pearwood.info> Message-ID: <20170503092759.GU22525@ando.pearwood.info> On Wed, May 03, 2017 at 02:48:03AM +0100, Erik wrote: > On 03/05/17 01:43, Steven D'Aprano wrote: > >I'm not stopping anyone from proposing a generalisation of this that > >works with other sequence types. As somebody did :-) > > Who? I didn't spot that in the thread - please give a reference. Thanks. https://mail.python.org/pipermail/python-ideas/2017-May/045568.html [...] > Knowing which sequence classes have a "chunk" method and which don't is > a higher barrier than knowing that all sequences can be "chunked" by a > single imported function. At the moment, we're only talking about strings. 
That's the only actual use-case that has been presented so far. Everything else is at best Nice To Have, if not YAGNI. Let's not kill this idea by over-generalising it. We can always extend the idea in the future once it is proven. Or for those who really want a general purpose group-any-iterable function, it can start life as a third party module, and we can discuss adding it to the language when it is mature and the kinks are ironed out. -- Steve From ncoghlan at gmail.com Wed May 3 09:46:54 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 3 May 2017 23:46:54 +1000 Subject: [Python-ideas] Add an option for delimiters in bytes.hex() In-Reply-To: <590903D5.2030800@canterbury.ac.nz> References: <3wGbPx0mQJz105n@submission01.posteo.de> <590903D5.2030800@canterbury.ac.nz> Message-ID: On 3 May 2017 at 08:10, Greg Ewing wrote: > For a name, I think "group" would be better than "chunk". > We talk about grouping the digits of a number, not chunking > them. As soon as I added an intermediate variable to my example, I came to the same conclusion: >>> digit_groups = b'\xb9\x01\xef'.hex().splitgroups(2) >>> ' '.join(digit_groups) 'b9 01 ef' (from http://bugs.python.org/issue22385#msg292900) And for David's telephone number examples: >>> digit_groups = str(4135559414).rsplitgroups(4,3) >>> '-'.join(digit_groups) '413-555-9414' >>> digit_groups = "0113225551212".rsplitgroups(2,2,3,1,2,3) >>> '-'.join(digit_groups) '011-32-2-555-12-12' Another example would be generating numeric literals with underscores: >>> digit_groups = str(int(1e6)).rsplitgroups(3) >>> '_'.join(digit_groups) '1_000_000' While a generalised reversed version wouldn't be possible, a corresponding "itertools.itergroups" function could be used to produce groups of defined lengths as islice iterators, similar to the way itertools.groupby works (i.e. producing subiterators of variable length rather than a fixed length tuple the way the grouper() recipe in the docs does). Cheers, Nick.
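The rsplitgroups() examples above can be approximated today with a plain function. This is only a sketch: rsplitgroups() does not exist on str, and the semantics used here (sizes are consumed from the right-hand end, and the last size repeats until the string is exhausted) are an assumption inferred from the three examples in the message.

```python
def rsplitgroups(text, *sizes):
    """Split text into groups, measuring group sizes from the right.

    Hypothetical stand-in for the proposed str.rsplitgroups(); the
    repeat-the-last-size rule is assumed from the examples in the thread.
    """
    if not sizes or any(size <= 0 for size in sizes):
        raise ValueError("sizes must be one or more positive integers")
    groups = []
    end = len(text)
    index = 0
    while end > 0:
        # Once the explicit sizes run out, keep reusing the last one.
        size = sizes[min(index, len(sizes) - 1)]
        start = max(end - size, 0)
        groups.append(text[start:end])
        end = start
        index += 1
    groups.reverse()  # groups were collected right-to-left
    return groups


print('-'.join(rsplitgroups(str(4135559414), 4, 3)))
print('_'.join(rsplitgroups(str(int(1e6)), 3)))
```

A free function like this also works on bytes, since it relies only on len() and slicing.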
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From python at lucidity.plus.com Wed May 3 19:13:25 2017 From: python at lucidity.plus.com (Erik) Date: Thu, 4 May 2017 00:13:25 +0100 Subject: [Python-ideas] Add an option for delimiters in bytes.hex() In-Reply-To: References: <3wGbPx0mQJz105n@submission01.posteo.de> <20170502113148.GO22525@ando.pearwood.info> <20170503004330.GS22525@ando.pearwood.info> Message-ID: <81ef8f6b-65e9-5d88-ff17-d03681110393@lucidity.plus.com> Hi Paul, On 03/05/17 08:57, Paul Moore wrote: > On 3 May 2017 at 02:48, Erik wrote: >> Anyway, I know you can't stop anyone from *proposing* something like this, >> but as soon as they do you may decide to quote the recipe from >> "https://docs.python.org/3/library/functions.html#zip" and try to block >> their proposition. There are already threads on fora that do that. >> >> That was my sticking point at the time when I implemented a general >> solution. Why bother to propose something that (although it made my code >> significantly faster) had already been blocked as being something that >> should be a python-level operation and not something to be included in a >> built-in? > > It sounds like you have a reasonable response to the suggestion of > using zip- that you have a use case where performance matters, and > your proposed solution is of value in that case. I don't think so, though. I had a use-case where splitting an iterable into a sequence of same-sized chunks efficiently improved the performance of my code significantly (processing a LOT of 24-bit, multi-channel - 16 to 32 - PCM streams from a WAV file). Having thought "I need to split this stream by a fixed number of bytes" and then found more_itertools.chunked() (and the zip_longest(*([iter(foo)] * num)) trick) it turned out they were not quick enough so I implemented itertools.chunked() in C. 
That worked well for me, so when I was done I did a search in case it was worth proposing as an enhancement to feed it back to the community. Then I came across things such as the following: http://bugs.python.org/issue6021 I am specifically referring to the "It has been rejected before" comment, also mentioned here: https://mail.python.org/pipermail/python-dev/2012-July/120885.html See this entire thread, too: https://mail.python.org/pipermail/python-ideas/2012-July/015671.html This is the reason why I really just didn't care enough to go through the process of proposing it in the end (even though the more_itertools.chunked function was one of the first 3 implemented in V1.0 and seems to _still_ be cropping up all the time in different guises - so is perhaps more fundamental than people recognise). The strong implication of the discussions linked to above is that if it had been mentioned before it would be immediately rejected, and that was supported by several members of the community in good standing. So I didn't propose it. I have no idea now what I spent my saved hours doing, but I imagine that it was fun > Whether it's a > *sufficient* response remains to be seen, but unless you present the > argument we won't know. Summary: I didn't present the argument because I'm not a masochist Regards, E. 
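For readers unfamiliar with the two pure-Python approaches mentioned in this message, a minimal sketch follows. It is not Erik's C implementation; grouper() follows the zip_longest trick quoted above, and chunked() is an illustrative islice-based equivalent of the behaviour described for more_itertools.chunked().

```python
from itertools import islice, zip_longest


def grouper(iterable, n, fillvalue=None):
    """The zip_longest trick: n references to one shared iterator.

    Yields fixed-length tuples; the final tuple is padded with fillvalue.
    """
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)


def chunked(iterable, n):
    """Yield lists of up to n items; the final list may be shorter."""
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, n))
        if not chunk:
            return
        yield chunk


print(list(grouper("abcdefg", 3, "x")))
print(list(chunked(range(7), 3)))
```

Both versions build a new tuple or list per chunk at Python level, which is the overhead a C implementation would be expected to reduce.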
From steve at pearwood.info Wed May 3 20:24:50 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 4 May 2017 10:24:50 +1000 Subject: [Python-ideas] Add an option for delimiters in bytes.hex() In-Reply-To: <81ef8f6b-65e9-5d88-ff17-d03681110393@lucidity.plus.com> References: <3wGbPx0mQJz105n@submission01.posteo.de> <20170502113148.GO22525@ando.pearwood.info> <20170503004330.GS22525@ando.pearwood.info> <81ef8f6b-65e9-5d88-ff17-d03681110393@lucidity.plus.com> Message-ID: <20170504002450.GV22525@ando.pearwood.info> On Thu, May 04, 2017 at 12:13:25AM +0100, Erik wrote: > I had a use-case where splitting an iterable into a sequence of > same-sized chunks efficiently improved the performance of my code [...] > So I didn't propose it. I have no idea now what I spent my saved hours > doing, but I imagine that it was fun > Summary: I didn't present the argument because I'm not a masochist I'm not sure what the point of that anecdote was, unless it was "I wrote some useful code, and you missed out". Your comments come across as a passive-aggressive chastisment of the core devs and the Python-Ideas community for being too quick to reject useful code: we missed out on something good, because you don't have the time or energy to deal with our negativity and knee-jerk rejection of everything good. That's the way your series of posts come across to me. Not every piece of useful code has to go into the std lib, and even if it should, it doesn't necessarily have to go into it from day 1. 
If you wanted to give back to the community, there are a number of options apart from "std lib or nothing": - you could have offered it to the moreitertools project; - you could have published it on PyPy; - you could have proposed it on Python-Ideas with an explicit statement that you didn't have the time or energy to get into a debate about including the function, "here's my implementation and an appropriate licence for you to use it: use it yourself, or if somebody else wants to champion putting it into the std lib, go right ahead, but I won't"; and possibly more. I'm not suggesting that you have any obligation to do any of these things, but you don't *have* to get into a long-winded, energy-sapping debate over inclusion unless you *really* care about having it added. If you care so little that you can't be bothered even to propose it, why do you care if it is rejected? -- Steve From python at lucidity.plus.com Wed May 3 21:32:24 2017 From: python at lucidity.plus.com (Erik) Date: Thu, 4 May 2017 02:32:24 +0100 Subject: [Python-ideas] Add an option for delimiters in bytes.hex() In-Reply-To: <20170504002450.GV22525@ando.pearwood.info> References: <3wGbPx0mQJz105n@submission01.posteo.de> <20170502113148.GO22525@ando.pearwood.info> <20170503004330.GS22525@ando.pearwood.info> <81ef8f6b-65e9-5d88-ff17-d03681110393@lucidity.plus.com> <20170504002450.GV22525@ando.pearwood.info> Message-ID: On 04/05/17 01:24, Steven D'Aprano wrote: > On Thu, May 04, 2017 at 12:13:25AM +0100, Erik wrote: >> I had a use-case where splitting an iterable into a sequence of >> same-sized chunks efficiently improved the performance of my code > [...] >> So I didn't propose it. I have no idea now what I spent my saved hours >> doing, but I imagine that it was fun > >> Summary: I didn't present the argument because I'm not a masochist > > I'm not sure what the point of that anecdote was, unless it was "I wrote > some useful code, and you missed out". Then you have misunderstood me. 
Paul suggested that my use-case (chunking could be faster) was perhaps enough to propose that my patch may be considered. I responded with historical/empirical evidence that perhaps that would actually not be the case. I was responding, honestly, to the questions raised by Paul's email. > Your comments come across as a passive-aggressive chastisment of the > core devs and the Python-Ideas community for being too quick to reject > useful code: we missed out on something good, because you don't have the > time or energy to deal with our negativity and knee-jerk rejection of > everything good. That's the way your series of posts come across to me. I apologise if my words or my turn of phrase do not appeal to you. I am trying to be constructive with everything I post. If you choose to interpret my messages in a different way then I'm not sure what I can do about that. Back to the important stuff though: > - you could have offered it to the moreitertools project; A more efficient version of moreitertools.chunked() is what we're talking about. > - you could have published it on PyPy; Does PyPy support C extension modules? If so, that's a possibility. > - you could have proposed it on Python-Ideas with an explicit statement I may well do that - my current patch (because of when I did it) is against a Py2 codebase, but I could port it to Py3. I still have a nagging doubt that I'd be wasting my time though ;) > If > you care so little that you can't be bothered even to propose it, why do > you care if it is rejected? You are mistaking not caring enough about the functionality with not caring enough to enter into an argument about including that functionality ... I didn't propose it at the time because of the reasons I mentioned. But when I saw something being discussed yet again that I had a general solution for already written I thought I mention it in case it was useful. As I said, I'm _trying_ to be constructive. E. 
From njs at pobox.com Thu May 4 04:59:05 2017 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 4 May 2017 01:59:05 -0700 Subject: [Python-ideas] Storing a reference to the function object (if any) inside frame objects Message-ID: Hi all, Currently, given a frame object (e.g. from sys._getframe or inspect.getouterframes), there's no way to get back to the function object that created it. This creates an obstacle for various sorts of introspection. In particular, in the unusual but real situation where you need to "mark" a function in a way that can be detected via stack introspection, then the natural way to do that (attaching an attribute to the function object) doesn't work. Instead, you have to mangle the frame's locals. Example: In pytest, if you want to mark a function as uninteresting and not worth showing in tracebacks (a common situation for assertion helpers), then you can't do this with a decorator like @pytest.tracebackhide; instead, you have to set a __magic__ local variable: https://docs.pytest.org/en/latest/example/simple.html#writing-well-integrated-assertion-helpers Example: In trio, I want to mark certain Python functions as "protected" from receiving KeyboardInterrupt exceptions, kind of like how CPython doesn't allow KeyboardInterrupt in internal functions [1]. If it were possible for my signal handler to get at function objects when it walks the stack, then the necessary decorator would be something like: def keyboard_interrupt_protected(fn): fn._trio_keyboard_interrupt_protected = True return fn Instead right now it's this mess: https://github.com/python-trio/trio/blob/64119b12309ffeaf3a35622ef08d3b03e438006e/trio/_core/_ki.py#L108-L150 And worse, the current mess has a race condition, because a KeyboardInterrupt could arrive while we're in the middle of executing the code to enable the KeyboardInterrupt protection.
OTOH if this information was carried on the function object, the protection would be automatically set up when the frame was created. Another benefit is that it would make it easy to include __qualname__'s in exception tracebacks, which is often nice. For example, here's a current traceback: File "/home/njs/trio/trio/_sync.py", line 374, in acquire_nowait return self._lock.acquire_nowait() File "/home/njs/trio/trio/_sync.py", line 277, in acquire_nowait raise WouldBlock And here's what it could look like if the traceback machinery could access frame.f_func.__qualname__: File "/home/njs/trio/trio/_sync.py", line 374, in Condition.acquire_nowait return self._lock.acquire_nowait() File "/home/njs/trio/trio/_sync.py", line 277, in Lock.acquire_nowait raise WouldBlock Here's some semi-terrifying code attempting to work around this issue: https://stackoverflow.com/questions/14817788/python-traceback-with-module-names To be fair, this last issue could also be solved by adding a f_qualname field to frame objects, like generators and coroutines already have. But another way to think about it is that if we had the whole function object then generators and coroutines wouldn't have to track the qualname separately anymore :-). Q: But what about frames that don't have any associated function object, like those created by exec()? A: Then frame.f_func would be NULL/None, no biggie. Q: Won't this increase memory usage? A: In principle this could increase memory usage by keeping function objects alive longer than otherwise, but I think the effect should be pretty minimal. It doesn't introduce any reference loops, and the vast majority of function objects outlive their frames in any case. Specifically I think the only (?) case where it would make a difference is if you create a new function and then call it immediately without storing it in a temporary, like: (lambda: 1)().
So basically: it's a bit obscure, but I think it would be a nice little addition and I don't see any major downsides. Anything I'm missing? -n [1] There's a lot more detail on this here: https://vorpus.org/blog/control-c-handling-in-python-and-trio/ but it's probably unnecessary for the current discussion. -- Nathaniel J. Smith -- https://vorpus.org From ncoghlan at gmail.com Thu May 4 11:23:19 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 5 May 2017 01:23:19 +1000 Subject: [Python-ideas] Storing a reference to the function object (if any) inside frame objects In-Reply-To: References: Message-ID: On 4 May 2017 at 18:59, Nathaniel Smith wrote: > Hi all, > > Currently, given a frame object (e.g. from sys._getframe or > inspect.getouterframes), there's no way to get back to the function > object that created it. This creates an obstacle for various sorts of > introspection. In particular, in the unusual but real situation where > you need to "mark" a function in a way that can be detected via stack > introspection, then the natural way to do that (attaching an > attribute to the function object) doesn't work. Instead, you have to > mangle the frame's locals. Eric Snow put together a mostly working patch for this quite some time ago: http://bugs.python.org/issue12857 It needs some naming adjustments to better account for generators and coroutines, and rebasing against a recent version of the tree, but I don't see any major barriers to getting the change made other than someone actually working through all the nitty gritty details of finalising the patch. Cheers, Nick.
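The "mangle the frame's locals" workaround discussed in this thread can be illustrated with a small sketch. The function names below (marked_frames, helper, caller) are hypothetical; the marker name `__tracebackhide__` is the local variable that pytest documents for hiding assertion helpers. This shows only the existing workaround, not the proposed frame-to-function reference.

```python
import sys


def marked_frames(marker="__tracebackhide__"):
    """Return the names of calling functions whose frames set the marker.

    Because a frame has no link back to its function object, the marker
    has to be looked up in each frame's local variables.
    """
    names = []
    frame = sys._getframe(1)  # start at the caller of this function
    while frame is not None:
        if frame.f_locals.get(marker):
            names.append(frame.f_code.co_name)
        frame = frame.f_back
    return names


def helper():
    # The mark must be an assignment executed inside the function body,
    # so it only becomes visible after this line has run.
    __tracebackhide__ = True
    return marked_frames()


def caller():
    return helper()


print(caller())
```

The window between entering helper() and executing the assignment is the kind of gap that the race condition mentioned in the original message arises from.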
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From victor.stinner at gmail.com Thu May 4 12:50:36 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Thu, 4 May 2017 18:50:36 +0200 Subject: [Python-ideas] Add a .chunks() method to sequences In-Reply-To: <3wHKqT4SwQz108v@submission01.posteo.de> References: <3wHKqT4SwQz108v@submission01.posteo.de> Message-ID: > How about adding a chunks() and rchunks() function to sequences: > > [1,2,3,4,5,6,7].chunks(3) => [[1,2,3], [4,5,6], [7]] I prefer str.join() approach: write a single chunks() function which takes a sequence, instead of modifying all sequence types around the world ;-) It's less natural to write chunks(seq, n), but it's much simpler to implement and will work on all Python versions, enjoy! Victor From greg.ewing at canterbury.ac.nz Thu May 4 18:20:42 2017 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 05 May 2017 10:20:42 +1200 Subject: [Python-ideas] Add a .chunks() method to sequences In-Reply-To: References: <3wHKqT4SwQz108v@submission01.posteo.de> Message-ID: <590BA93A.1030709@canterbury.ac.nz> Victor Stinner wrote: > I prefer str.join() approach: write a single chunks() function which > takes a sequence, instead of modifying all sequence types around the > world ;-) Even if a general sequence-chunking function is thought useful, it might be worth providing a special-purpose one as a string method in the interests of efficiency. Splitting a string into a sequence of characters, messing with it and then joining it back into a string is a pretty expensive way to do things. While most uses would probably be for short strings, I can think of uses cases involving large ones. For example, to format a hex dump into lines with 8 bytes per line and spaces between the lines: data.group(2, ' ').group(24, '\n') And even for short strings, processing lots of them in a loop could get expensive with the general-purpose approach. 
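Greg's hex-dump example can be tried out with an ordinary function standing in for the proposed method. This is a sketch only: str has no group() method, and the name and signature of the helper below are taken from the example in the message rather than from any existing API.

```python
def group(text, n, sep):
    """Insert sep between successive n-character slices of text.

    Hypothetical free-function version of the proposed str.group(n, sep).
    """
    if n <= 0:
        raise ValueError("n must be a positive integer")
    return sep.join(text[i:i + n] for i in range(0, len(text), n))


data = bytes(range(16)).hex()
# Two hex digits per byte, then 24 characters (8 bytes plus spaces) per line.
print(group(group(data, 2, ' '), 24, '\n'))
```

Note that with a width of 24 every full line keeps the trailing space that followed its eighth byte; only the last line ends without one.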
-- Greg From victor.stinner at gmail.com Thu May 4 18:50:42 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Fri, 5 May 2017 00:50:42 +0200 Subject: [Python-ideas] Add a .chunks() method to sequences In-Reply-To: <590BA93A.1030709@canterbury.ac.nz> References: <3wHKqT4SwQz108v@submission01.posteo.de> <590BA93A.1030709@canterbury.ac.nz> Message-ID: 2017-05-05 0:20 GMT+02:00 Greg Ewing : > While most uses would probably be for short strings, I can > think of uses cases involving large ones. For example, to > format a hex dump into lines with 8 bytes per line and spaces > between the lines: For such specialized use case, write a C extension. Victor From ncoghlan at gmail.com Fri May 5 03:29:55 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 5 May 2017 17:29:55 +1000 Subject: [Python-ideas] Add a .chunks() method to sequences In-Reply-To: <590BA93A.1030709@canterbury.ac.nz> References: <3wHKqT4SwQz108v@submission01.posteo.de> <590BA93A.1030709@canterbury.ac.nz> Message-ID: On 5 May 2017 at 08:20, Greg Ewing wrote: > Victor Stinner wrote: >> >> I prefer str.join() approach: write a single chunks() function which >> takes a sequence, instead of modifying all sequence types around the >> world ;-) > > > Even if a general sequence-chunking function is thought useful, > it might be worth providing a special-purpose one as a string > method in the interests of efficiency. Splitting a string into > a sequence of characters, messing with it and then joining it > back into a string is a pretty expensive way to do things. > > While most uses would probably be for short strings, I can > think of uses cases involving large ones. For example, to > format a hex dump into lines with 8 bytes per line and spaces > between the lines: > > data.group(2, ' ').group(24, '\n') > > And even for short strings, processing lots of them in a loop > could get expensive with the general-purpose approach. 
I don't think performance is a good argument for combining the split/join operations, but I do think it's a decent argument for offering native support for decomposition of regularly structured data without the conceptual complexity of going through memoryview and reshaping the data that way. That is, approaching the problem of displaying regular data from a "formatting text" point of view would look like: BYTES_PER_LINE = 8 DIGITS_PER_BYTE = 2 hex_lines = data.hex().splitgroups(BYTES_PER_LINE * DIGITS_PER_BYTE) for line in hex_lines: print(' '.join(line.splitgroups(DIGITS_PER_BYTE))) This has the benefit of working with the hex digits directly, so it doesn't specifically require access to the original data. By being a string method, it can handle all the complexities of str's variable width internal storage, while still taking advantage of direct access to those data structures. The corresponding "data view" mindset would be: NUM_LINES, remainder = divmod(len(data), BYTES_PER_LINE) if remainder: ... # Pad the data with zeros to give it a regular shape view = memoryview(data).cast('c', (NUM_LINES, BYTES_PER_LINE)) for row in view.tolist(): print(' '.join(entry.hex() for entry in row)) This approaches the task at hand as an array-rendering issue rather than as a string or bytes formatting problem, but that's actually a pretty big mental leap to make if you're thinking of your input as a stream of text to be formatted rather than as an array of data to be displayed. And then given the proposed str.splitgroups() on the one hand, and the existing memoryview.cast() on the other, offering itertools.itergroups() as a corresponding building block specifically for working with streams of regular data would make sense to me - that's a standard approach in time-division multiplexing protocols, and it also shows up in areas like digital audio processing as well (where you're often doing things like shuffling incoming data chunks into FFT buffers) Cheers, Nick. P.S.
As evidence for "this is a problem for memoryview" being a tricky leap to make, note that it's the first time it has come up in this thread as a possibility, even though it already works in at least 3.5+: >>> data = b'abcdefghijklmnop' >>> data.hex() '6162636465666768696a6b6c6d6e6f70' >>> view = memoryview(data).cast('c', (4, 4)) >>> for row in view.tolist(): ... print(' '.join(entry.hex() for entry in row)) ... 61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70 -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From george at fischhof.hu Fri May 5 03:58:15 2017 From: george at fischhof.hu (George Fischhof) Date: Fri, 5 May 2017 09:58:15 +0200 Subject: [Python-ideas] Add shutil.ignore_patterns() to shutil.rmtree() Message-ID: Hi Folks, I have a task to synchronize folders but some files should remain untouched. I think this is a very common task. I found that shutil.copytree() has ignore_patterns() but rmtree() does not. So here comes my idea: add ignore_patterns() to rmtree(); it is a good feature and makes the functions symmetric. BR, George From storchaka at gmail.com Fri May 5 05:52:57 2017 From: storchaka at gmail.com (Serhiy Storchaka) Date: Fri, 5 May 2017 12:52:57 +0300 Subject: [Python-ideas] Add shutil.ignore_patterns() to shutil.rmtree() In-Reply-To: References: Message-ID: On 05.05.17 10:58, George Fischhof wrote: > I have a task to synchronize folders but some files should remain > untouched. > I think this is a very common task. > > I found that shutil.copytree() has ignore_patterns() but rmtree() does not. > > So here comes my idea: add ignore_patterns() to rmtree(); it is a good > feature and makes the functions symmetric. You can't remove a tree while its branches remain. Thus rmtree() with ignore_patterns will always fail. For your particular case I suggest using os.walk(). You can implement arbitrary application specific conditions.
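The os.walk() suggestion above could be sketched roughly as follows. The helper name remove_tree_except() is hypothetical and not part of shutil; the sketch assumes glob-style patterns matched against bare file and directory names (as shutil.ignore_patterns() does), and it leaves in place any directory that still contains a kept entry.

```python
import fnmatch
import os


def remove_tree_except(root, *patterns):
    """Delete everything under root except names matching a glob pattern.

    Hypothetical helper, not a shutil API. Returns True only if root
    itself ended up being removed.
    """
    def ignored(name):
        return any(fnmatch.fnmatch(name, pattern) for pattern in patterns)

    visited = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk() never descends into kept directories.
        dirnames[:] = [name for name in dirnames if not ignored(name)]
        visited.append(dirpath)
        for name in filenames:
            if not ignored(name):
                os.remove(os.path.join(dirpath, name))

    # Parents were visited before their children, so reverse the order
    # and drop every directory that is now empty.
    for dirpath in reversed(visited):
        if not os.listdir(dirpath):
            os.rmdir(dirpath)
    return not os.path.exists(root)
```

Symbolic links to directories are neither followed nor removed in this sketch, so a directory containing one is kept as well; a real synchronisation tool would need to decide how to treat them.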
From python at lucidity.plus.com Fri May 5 06:29:47 2017 From: python at lucidity.plus.com (Erik) Date: Fri, 5 May 2017 11:29:47 +0100 Subject: [Python-ideas] Add a .chunks() method to sequences In-Reply-To: References: <3wHKqT4SwQz108v@submission01.posteo.de> <590BA93A.1030709@canterbury.ac.nz> Message-ID: <02001430-f554-61a5-2b5b-fa222b65bfcf@lucidity.plus.com> Hi Nick, On 05/05/17 08:29, Nick Coghlan wrote: > And then given the proposed str.splitgroups() on the one hand, and the > existing memoryview.cast() on the other, offering > itertools.itergroups() as a corresponding building block specifically > for working with streams of regular data would make sense to me - > that's a standard approach in time-division multiplexing protocols, > and it also shows up in areas like digital audio processing as well > (where you're often doing things like shuffling incoming data chunks > into FFT buffers) It looks to me like your "itertools.itergroups()" is similar to more_itertools.chunked() - with at least one obvious change, see below(*). If anyone wants to pursue this (or any itertools) enhancement, then please be aware of the following thread (and in particular the message being linked to - and the bug and discussion that it is replying to): https://mail.python.org/pipermail/python-dev/2012-July/120885.html I have been told off for bringing this up already, but I do it again in direct response to your suggestion because it seems there is a bar to getting something included in itertools and something like "chunked()" has already failed to make it. The thing to do is probably to talk directly to Raymond to see if there's an acceptable solution first before too much work is put into something that may be rejected as being too high level.
It may be that a C version of "more_itertools", covering the functions where people would find a speedup useful, might be a solution (where the more_itertools package defers to those built-ins if they exist on the version of Python it's executing on, otherwise uses its existing implementation as a fallback). I am not suggesting implementing the _whole_ of more_itertools in C - it's quite large now.

(*) I had implemented itertools.chunked in C before (also for audio processing, as it happens) and one thing that I didn't like is the way strings get unpacked:

    >>> tuple(more_itertools.chunked("foo bar baz", 2))
    (['f', 'o'], ['o', ' '], ['b', 'a'], ['r', ' '], ['b', 'a'], ['z'])

If the chunked/itergroups method checked for the presence of a __chunks__ or similar dunder method in the source sequence which returns an iterator, then the string class could efficiently yield substrings rather than individual characters which then had to be wrapped in a list or tuple (which I think is what you wanted itergroups() to do):

    >>> tuple(itertools.chunked("foo bar baz", 2))
    ('fo', 'o ', 'ba', 'r ', 'ba', 'z')

Similarly, for objects which _represent_ a lot of data but do not actually hold those data literally (for example, range objects or even memoryviews), the returned chunks can also be representations of the data (subranges or subviews) and not the actual rendered data.

For example, the existing:

    >>> range(10)
    range(0, 10)
    >>> tuple(more_itertools.chunked(range(10), 3))
    ([0, 1, 2], [3, 4, 5], [6, 7, 8], [9])

becomes:

    >>> tuple(itertools.chunked(range(10), 3))
    (range(0, 3), range(3, 6), range(6, 9), range(9, 10))

Obviously, with those short strings and ranges one could argue that there's no point, but the principle of doing it this way scales better than the version that collects all of the data in lists - for things like chunks of some sort of "view" object, you would still only have the actual data stored once in the original object.
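
A rough pure-Python sketch of that behaviour, for illustration only. The proposed `__chunks__` dunder does not exist, so this sketch stands in for it by slicing anything that is a `Sequence` (or a memoryview) and falling back to list-building for other iterables; the function name `chunked` here is a local definition, not the more_itertools one.

```python
from collections.abc import Sequence
from itertools import islice


def chunked(source, n):
    """Yield successive chunks of at most n items from source.

    Sequences (str, bytes, range, list, tuple, ...) are sliced, so every
    chunk has the same type as the source and no items are copied out one
    by one. Any other iterable is consumed and collected into lists.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if isinstance(source, (Sequence, memoryview)):
        for start in range(0, len(source), n):
            yield source[start:start + n]
    else:
        iterator = iter(source)
        while True:
            chunk = list(islice(iterator, n))
            if not chunk:
                return
            yield chunk
```

With this sketch a string yields substrings and a range yields subranges, while a plain iterator is consumed and yields lists, which is the edge case raised in the next paragraph.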
I suppose that one thing to consider is what happens when an iterator is passed to the chunked() function. An iterator could have a __chunks__ method which returned chunks of the source sequence from the existing point in the iteration. However, the difference between such an iterator and one that _doesn't_ have a __chunks__ method is that in the second case the iterator would be consumed by the fall-back code, which just does what more_itertools.chunked() does now, but in the first it would not. Perhaps there is a precedent for that particular edge case with iterators in a different context.

Hope that helps,
E.

From george at fischhof.hu Fri May 5 06:46:54 2017
From: george at fischhof.hu (George Fischhof)
Date: Fri, 5 May 2017 12:46:54 +0200
Subject: [Python-ideas] Add shutil.ignore_patterns() to shutil.rmtree()
In-Reply-To:
References:
Message-ID:

2017-05-05 11:52 GMT+02:00 Serhiy Storchaka :

> On 05.05.17 10:58, George Fischhof wrote:
>
>> I have a task to synchronize folders but some files should be remained
>> untouched.
>> I think this is a very common task.
>>
>> I found that shutil.copytree() has ignore_patterns() but rmtree() has not.
>>
>> So here comes my idea: add ignore_patterns() to rmtree() it is a good
>> feature and makes the functions symmetric.
>>
>
> You can't remove a tree when remain its branches. Thus rmtree() with
> ignore_patterns will always fail.
>
> For your particular case I suggest to use os.walk(). You can implement
> arbitrary application specific conditions.

If what remains is exactly the files matching ignore_patterns plus their container folders, then it is successful, in my opinion ;-)

From phd at phdru.name Fri May 5 07:02:16 2017
From: phd at phdru.name (Oleg Broytman)
Date: Fri, 5 May 2017 13:02:16 +0200
Subject: [Python-ideas] Add shutil.ignore_patterns() to shutil.rmtree()
In-Reply-To:
References:
Message-ID: <20170505110216.GA24871@phdru.name>

Hi!

On Fri, May 05, 2017 at 09:58:15AM +0200, George Fischhof wrote:
> Hi Folks,
>
> I have a task to synchronize folders but some files should be remained
> untouched.

   Synchronize folders using rmtree()? I don't get it.

> I think this is a very common task.

   I think it is not that common.

> I found that shutil.copytree() has ignore_patterns() but rmtree() has not.
>
> So here comes my idea: add ignore_patterns() to rmtree() it is a good

   rmtree() is like ``rm -r``, not like ``find . -name *.pyc -delete``.

> feature and makes the functions symmetric.

   Why impose artificial symmetry?

> BR,
> George

Oleg.
--
Oleg Broytman http://phdru.name/ phd at phdru.name
Programmers don't die, they just GOSUB without RETURN.

From george at fischhof.hu Fri May 5 09:55:37 2017
From: george at fischhof.hu (George Fischhof)
Date: Fri, 5 May 2017 15:55:37 +0200
Subject: [Python-ideas] Add shutil.ignore_patterns() to shutil.rmtree()
In-Reply-To: <20170505110216.GA24871@phdru.name>
References: <20170505110216.GA24871@phdru.name>
Message-ID:

2017-05-05 13:02 GMT+02:00 Oleg Broytman :

> Hi!
>
> On Fri, May 05, 2017 at 09:58:15AM +0200, George Fischhof <
> george at fischhof.hu> wrote:
> > Hi Folks,
> >
> > I have a task to synchronize folders but some files should be remained
> > untouched.
>
> Synchronize folders using rmtree()? I don't get it.
>
> > I think this is a very common task.
>
> I think it is not that common.
>
> > I found that shutil.copytree() has ignore_patterns() but rmtree() has
> not.
> >
> > So here comes my idea: add ignore_patterns() to rmtree() it is a good
>
> rmtree() is like ``rm -r``, not like ``find . -name *.pyc -delete``.
>
> > feature and makes the functions symmetric.
>
> Why impose artificial symmetry?
>
> > BR,
> > George
>
> Oleg.

Actually it would be good if copytree() could overwrite files and directories.

George

From phd at phdru.name Fri May 5 10:14:17 2017
From: phd at phdru.name (Oleg Broytman)
Date: Fri, 5 May 2017 16:14:17 +0200
Subject: [Python-ideas] Add shutil.ignore_patterns() to shutil.rmtree()
In-Reply-To:
References: <20170505110216.GA24871@phdru.name>
Message-ID: <20170505141417.GA5489@phdru.name>

On Fri, May 05, 2017 at 03:55:37PM +0200, George Fischhof wrote:
> Actually it would be good if copytree() would be able to overwrite files
> and directories.

   Seems you want rsync, no?

> George

Oleg.
--
Oleg Broytman http://phdru.name/ phd at phdru.name
Programmers don't die, they just GOSUB without RETURN.

From george at fischhof.hu Fri May 5 10:50:07 2017
From: george at fischhof.hu (George Fischhof)
Date: Fri, 5 May 2017 16:50:07 +0200
Subject: [Python-ideas] Add shutil.ignore_patterns() to shutil.rmtree()
In-Reply-To: <20170505141417.GA5489@phdru.name>
References: <20170505110216.GA24871@phdru.name> <20170505141417.GA5489@phdru.name>
Message-ID:

Yes, something like that ... ;-) But I use Windows, and I want the feature in Python, in a simple and elegant way (1-2 commands).

2017-05-05 16:14 GMT+02:00 Oleg Broytman :

> On Fri, May 05, 2017 at 03:55:37PM +0200, George Fischhof <
> george at fischhof.hu> wrote:
> > Actually it would be good if copytree() would be able to overwrite files
> > and directories.
>
> Seems you want rsync, no?
>
> > George
>
> Oleg.

From phd at phdru.name Fri May 5 13:01:48 2017
From: phd at phdru.name (Oleg Broytman)
Date: Fri, 5 May 2017 19:01:48 +0200
Subject: [Python-ideas] Add shutil.ignore_patterns() to shutil.rmtree()
In-Reply-To:
References: <20170505110216.GA24871@phdru.name> <20170505141417.GA5489@phdru.name>
Message-ID: <20170505170148.GA18474@phdru.name>

On Fri, May 05, 2017 at 04:50:07PM +0200, George Fischhof wrote:
> yes, something like that ... ;-) but I use windows, and I want the feature
> in Python, with a simple and elegant way (1-2 commands)
>
> 2017-05-05 16:14 GMT+02:00 Oleg Broytman :
>
> > On Fri, May 05, 2017 at 03:55:37PM +0200, George Fischhof <
> > george at fischhof.hu> wrote:
> > > Actually it would be good if copytree() would be able to overwrite files
> > > and directories.
> >
> > Seems you want rsync, no?

   I can understand the need but I don't think such a library/script should be in stdlib.

Oleg.
--
Oleg Broytman http://phdru.name/ phd at phdru.name
Programmers don't die, they just GOSUB without RETURN.

From george at fischhof.hu Wed May 10 03:01:29 2017
From: george at fischhof.hu (George Fischhof)
Date: Wed, 10 May 2017 09:01:29 +0200
Subject: [Python-ideas] Add shutil.ignore_patterns() to shutil.rmtree()
In-Reply-To: <20170505170148.GA18474@phdru.name>
References: <20170505110216.GA24871@phdru.name> <20170505141417.GA5489@phdru.name> <20170505170148.GA18474@phdru.name>
Message-ID:

On 5 May 2017 at 7:02 PM, "Oleg Broytman" wrote:

> On Fri, May 05, 2017 at 04:50:07PM +0200, George Fischhof <
> george at fischhof.hu> wrote:
> > yes, something like that ... ;-) but I use windows, and I want the feature
> > in Python, with a simple and elegant way (1-2 commands)
> >
> > 2017-05-05 16:14 GMT+02:00 Oleg Broytman :
> >
> > > On Fri, May 05, 2017 at 03:55:37PM +0200, George Fischhof <
> > > george at fischhof.hu> wrote:
> > > > Actually it would be good if copytree() would be able to overwrite files
> > > > and directories.
> > >
> > > Seems you want rsync, no?
>
> I can understand the need but I don't think such a library/script
> should be in stdlib.
>
> Oleg.

But if shutil.copytree() were able to overwrite, that would be good, and sometimes enough. And I think it is not too difficult to implement ;-)

From terji78 at gmail.com Wed May 10 11:40:01 2017
From: terji78 at gmail.com (terji78 at gmail.com)
Date: Wed, 10 May 2017 08:40:01 -0700 (PDT)
Subject: [Python-ideas] add a LogReader to the logging module
Message-ID: <5617eba2-fc35-41e4-8444-903039721fb6@googlegroups.com>

I get a message back that I'm not subscribed to the mailing list, but see my message in google groups. My sincerest apologies in advance, if this appears several times for you. Anyways:

The logging module has an easy-to-setup logger:

    import logging
    logger = logging.getLogger(__name__)

and you just log away. Easy.

However, it's quite a bit more difficult to set up log readers, requiring IMO an unreasonable number of lines of code:

    import logging

    logger = logging.getLogger('mypackage')
    logger.setLevel(logging.DEBUG)
    handler = logging.StreamHandler()
    formatter = logging.Formatter('%(levelname)s:%(name)s:%(message)s')
    handler.setFormatter(formatter)
    logger.addHandler(handler)

I propose adding a function that sets up a log reader with sensible defaults, but allowing customization. So, I propose a ``logging.getLogReader`` in the logging module to mirror ``logging.getLogger``. So, to use this, in your main.py, you'd typically just do:

    import logging

    log_reader = logging.getLogReader('mypackage.models')

and you'd get all log output from mypackage.models with sensible defaults set. Much easier.

You could also set it up in more detail, e.g.:

    log_reader = logging.getLogReader('mypackage.models',
        level='debug',
        format='%(levelname)s | %(filename)s| line %(lineno)s | %(message)s'
    )

For a specific proposal, see:

https://gist.github.com/topper-123/85e27ffe261850eed150eac53d61b82d

Because it's just a logger, log_reader can be further customized as necessary.

In summary, I think that today it's unnecessarily complex to set up a log reader and the proposed function serves a general enough need, that it - or something similar - should be in the logging module. Thoughts?

From phd at phdru.name Wed May 10 11:52:08 2017
From: phd at phdru.name (Oleg Broytman)
Date: Wed, 10 May 2017 17:52:08 +0200
Subject: [Python-ideas] add a LogReader to the logging module
In-Reply-To: <5617eba2-fc35-41e4-8444-903039721fb6@googlegroups.com>
References: <5617eba2-fc35-41e4-8444-903039721fb6@googlegroups.com>
Message-ID: <20170510155208.GA4838@phdru.name>

Hi! Isn't it just basicConfig?
import logging logging.basicConfig( filename='test.log', format='[%(asctime)s] %(name)s %(levelname)s: %(message)s', level=logging.DEBUG, ) log = logging.getLogger("TEST") On Wed, May 10, 2017 at 08:40:01AM -0700, terji78 at gmail.com wrote: > I get a message back that I'm not subscribed to the mailing list, but see > my message in google groups. My sincerest apologies in advance, if this > appears several times for you. Anyways: > > The logging module has an easy-to-setup logger: > > import logging > logger = logging.getLogger(__name__) > > and you just log away. Easy. > > However, it's quite a bit more difficult to set up log readers, requiring > IMO an unreasonably number of lines of code: > > import logging > > logger = logging.getLogger('mypackage') > logger.setLevel(logging.DEBUG) > handler = logging.StreamHandler() > formatter = logging.Formatter('%( > levelname)s:%(name)s:%(message)s') > handler.setFormatter(formatter) > logger.addHandler(handler) > > I propose adding a function that sets up a log reader with sensible > defaults, but allowing customization. So, I propose a > ``logging.getLogReader`` in the logging module to mirror > ``logging.getLogger``. So, to use this, in your main.py, you'd typically > just do: > > import logging > > log_reader = logging.getLogReader('mypackage.models') > > > and you'd get all log output from mypackage.models with sensible defaults > set. Much easier. > > You could also set up it up in more detail, e.g.: > > log_reader = logging.getLogReader('mypackage.models', > level='debug', > format='%(levelname)s | %(filename)s| > line %(lineno)s | %(message)s' > ) > > For a specific proposal, see: > > https://gist.github.com/topper-123/85e27ffe261850eed150eac53d61b82d > > Because it's just a logger, log_reader can be further customized as > necessary. 
>
> In summary, I think that today it's unneccesarily complex to set up a log
> reader and the proposed function serves a general enough need, that it - or
> something similar - should be in the logging module. Thoughts?

Oleg.
--
Oleg Broytman http://phdru.name/ phd at phdru.name
Programmers don't die, they just GOSUB without RETURN.

From terji78 at gmail.com Wed May 10 12:55:21 2017
From: terji78 at gmail.com (terji78 at gmail.com)
Date: Wed, 10 May 2017 09:55:21 -0700 (PDT)
Subject: [Python-ideas] add a LogReader to the logging module
In-Reply-To: <20170510155208.GA4838@phdru.name>
References: <5617eba2-fc35-41e4-8444-903039721fb6@googlegroups.com> <20170510155208.GA4838@phdru.name>
Message-ID: <2260d4c1-bee8-4e70-95b5-2c414bd27cdf@googlegroups.com>

> Hi! Isn't it just basicConfig?
>
> import logging
> logging.basicConfig(
>     filename='test.log',
>     format='[%(asctime)s] %(name)s %(levelname)s: %(message)s',
>     level=logging.DEBUG,
> )
> log = logging.getLogger("TEST")

basicConfig does not allow different settings for different log readers, as far as I can see. E.g.

    import logging
    logging.basicConfig(
        level=logging.DEBUG,
    )
    log1 = logging.getLogger("TEST1")

    logging.basicConfig(
        level=logging.INFO,
    )
    log2 = logging.getLogger("TEST2")

Here log2 will write out logs at the same level as log1, so basicConfig is IMO too basic for a lot of use cases and so you'll have to go through the other more verbose setup.

So, this getLogReader function would be more general than basicConfig, but also much more useful. (But please educate me, if I've misunderstood basicConfig)

From agustin.herranz at gmail.com Thu May 11 13:44:55 2017
From: agustin.herranz at gmail.com (Agustín Herranz Cecilia)
Date: Thu, 11 May 2017 19:44:55 +0200
Subject: [Python-ideas] add a LogReader to the logging module
In-Reply-To: <2260d4c1-bee8-4e70-95b5-2c414bd27cdf@googlegroups.com>
References: <5617eba2-fc35-41e4-8444-903039721fb6@googlegroups.com> <20170510155208.GA4838@phdru.name> <2260d4c1-bee8-4e70-95b5-2c414bd27cdf@googlegroups.com>
Message-ID:

> BasicConfig does not allow different settings for different log
> readers, as I can see. E.g.
>
> import logging
> logging.basicConfig(
>     level=logging.DEBUG,
> )
> log1 = logging.getLogger("TEST1")
>
> logging.basicConfig(
>     level=logging.INFO,
> )
> log2 = logging.getLogger("TEST2")
>
> Here log2 will write out logs at the same level as log1, so
> basicConfic is IMO too basic for a lot of use cases and so you'll have
> to go through the other more verbose setup.
>
> So, this getLogReader function would be more general than basicConfig,
> but also much more useful. (But please educate me, if I've
> misunderstood basicConfig)

basicConfig() configures the root (or "") logger; messages from any other logger end up at the root logger as long as propagation is enabled. There is a misconception that the logging module gives you completely separate loggers. In reality there is only 'one' logger, but the hierarchy of names and some logic let you use it as if there were more than one.

Also, in your proposal the name getLogReader is misleading, as it implies that there is something to read. There is not: a log event is something to process, or to write if you wish (this is the purpose of handlers).

Regards!
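
A small sketch of the point about the hierarchy, using only standard logging calls: one handler on the root logger receives records from every named logger that propagates, while the thresholds are set per named logger with setLevel(). The `ListHandler` class is an illustrative helper defined here, not part of the logging module. (A second basicConfig() call, as in the quoted example, does nothing once the root logger already has a handler, which is why both loggers end up with the same settings there.)

```python
import logging


class ListHandler(logging.Handler):
    """Collect formatted records in a list (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.lines = []

    def emit(self, record):
        self.lines.append(self.format(record))


# One handler on the root logger sees records from every logger
# whose records propagate (the default).
root_handler = ListHandler()
root_handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))
logging.getLogger().addHandler(root_handler)

# Thresholds are set on the named loggers themselves.
log1 = logging.getLogger("TEST1")
log1.setLevel(logging.DEBUG)
log2 = logging.getLogger("TEST2")
log2.setLevel(logging.INFO)

log1.debug("kept")       # TEST1 lets DEBUG through
log2.debug("dropped")    # TEST2 filters out anything below INFO
log2.info("kept too")
```

After this runs, `root_handler.lines` holds the TEST1 debug record and the TEST2 info record, but not the TEST2 debug record.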

From simonramstedt at gmail.com Sun May 14 00:07:44 2017
From: simonramstedt at gmail.com (Simon Ramstedt)
Date: Sun, 14 May 2017 04:07:44 +0000
Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects)
Message-ID:

Hi, do you have an opinion on the following?

Wouldn't it be nice to define classes via a simple constructor function (as below) instead of a conventional class definition?

*conventional*:

    class MyClass(ParentClass):
        def __init__(self, x):
            self._x = x
        def my_method(self, y):
            z = self._x + y
            return z

*proposed*:

    def MyClass(x):
        self = ParentClass()
        def my_method(y):
            z = x + y
            return z
        self.my_method = my_method  # that's cumbersome (see comments below)
        return self

Here are the pros and cons I could come up with for the proposed method:

(+) Simpler and more explicit.

(+) No need to create attributes (like `self._x`) just to pass something from `__init__` to another method.

(+) Default arguments / annotations for methods could be different for each class instance. Adaptive defaults wouldn't have to be simulated with None.

(+) Class/instance level imports would work.

(-/+) Speed: The `def`-based objects take 0.6 µs to create while the `class`-based objects take only 0.4 µs. For method execution however the closure takes only 0.15 µs while the proper method takes 0.22 µs (script).

(-/+) Checking types: In the proposed example above the returned object wouldn't know that it has been created by `MyClass`. There are a couple of solutions to that, though. The easiest to implement would be to change the first line to `self = subclass(ParentClass())` where the subclass function looks at the next item in the call stack (i.e. `MyClass`) and makes it the type of the object. Another solution would be to have a special rule for functions with capital first letter returning a single object to append itself to the list of types of the returned object. Alternatively there could be a special keyword e.g.
`classdef` that would be used instead of `def` if we wouldn't want to rely on the name.

(-) The current syntax for adding a function to an object is cumbersome. That's what is preventing me from actually using the proposed pattern. But is this really the only reason for not using it? And if so, wouldn't that be a good argument for enabling something like below?

*attribute function definitions*:

    def MyClass(x):
        self = ParentClass()
        def self.my_method(y):
            z = x + y
            return z
        return self

or alternatively *multiline lambdas*:

    def MyClass(x):
        self = ParentClass()
        self.my_method = (y):
            z = x + y
            return z
        return self

Cheers,
Simon

From brenbarn at brenbarn.net Sun May 14 00:53:51 2017
From: brenbarn at brenbarn.net (Brendan Barnwell)
Date: Sat, 13 May 2017 21:53:51 -0700
Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects)
In-Reply-To:
References:
Message-ID: <5917E2DF.6020000@brenbarn.net>

On 2017-05-13 21:07, Simon Ramstedt wrote:
> Hi, do you have an opinion on the following?

My general opinion is that imitating JavaScript is almost always a bad idea. :-)

> Wouldn't it be nice to define classes via a simple constructor function
> (as below) instead of a conventional class definition?
>
> *conventional*:
>
> class MyClass(ParentClass):
>     def __init__(x):
>         self._x = x
>     def my_method(y):
>         z = self._x + y
>         return z
>
> *proposed*:
>
> def MyClass(x):
>     self = ParentClass()
>     def my_method(y):
>         z = x + y
>         return z
>     self.my_method = my_method # that's cumbersome (see comments below)
>     return self
>
> Here are the pros and cons I could come up with for the proposed method:
>
> (+) Simpler and more explicit.

I don't really see how that's simpler or more explicit. In one respect it's clearly less explicit, as the "self" is implicit.
> (+) No need to create attributes (like `self._x`) just to pass something > from `__init__` to another method. Attributes aren't just for passing things to other methods. They're for storing state. In your proposed system, how would an object mutate one of its own attributes? It looks like "x" here is just stored in a function closure, which wouldn't allow easy mutation. Also, how would another object access the attribute from outside (as we currently do with self.x)? You can say we'd only use this new attribute-free approach when we want to pass a constructor argument that's used but never mutated or accessed from outside, but that severely restricts the potential use cases, and all it saves you is typing "self". Relatedly, how is ParentClass itself defined? I don't see how you could bootstrap this without having a real class at the bottom of it somehow (as your test script in fact does). > (+) Default arguments / annotations for methods could be different for > each class instance. Adaptive defaults wouldn't have to simulated with a > None. That seems as likely to be a negative as a positive. Having different instances with different default values could be confusing. This would even allow different instances to define totally different methods (with if-logic inside the function constructor), which would definitely be confusing. > (+) Class/instance level imports would work. How often is that really needed? > (-/+) Speed: The `def`-based objects take 0.6 ?s to create while the > `class`-based objects take only 0.4 ?s. For method execution however the > closure takes only 0.15 ?s while the proper method takes 0.22 ?s (script > ). I don't think you can really evaluate the performance impact of this alternative just based on a trivial example like that. > (-/+) Checking types: In the proposed example above the returned object > wouldn't know that it has been created by `MyClass`. There are a couple > of solutions to that, though. 
The easiest to implement would be to > change the first line to `self = subclass(ParentClass())` where the > subclass function looks at the next item in the call stack (i.e. > `MyClass`) and makes it the type of the object. Another solution would > be to have a special rule for functions with capital first letter > returning a single object to append itself to the list of types of the > returned object. Alternatively there could be a special keyword e.g. > `classdef` that would be used instead of `def` if we wouldn't want to > rely on the name. Those special rules sound very hackish to me. > (-) The current syntax for adding a function to an object is > cumbersome. That's what is preventing me from actually using the > proposed pattern. But is this really the only reason for not using it? > And if so, wouldn't that be a good argument for enabling something like > below? > * > * > *attribute function definitions*: > | > defMyClass(x): > self=ParentClass() > defself.my_method(y): > z=x+y > returnz > returnself > | > > > or alternatively*multiline lambdas*: > > | > defMyClass(x): > self=ParentClass() > self.my_method=(y): > z=x+y > returnz > returnself > | To be honest, from all your examples, I don't really see what the point is. It's a different syntax for doing some of the same things the existing class syntax does, while providing no apparent way to do some important things (like mutating attributes). I think Python's existing class syntax is simple, clear, and quite nice overall, and creating class instances by calling a function instead of a class doesn't add anything. In fact, even JavaScript has recently added classes to allow programmers to move away from the old approach that you describe here. 
Also, as I alluded to above, JavaScript is so terrible in so many ways that the mere idea of imitating it inherently makes me skeptical; there's almost nothing about JavaScript's design that couldn't be done better, and most of what it does are things that Python already does better and has done better for years. In short, I don't see any advantages at all to doing classes this way, and there are some non-negligible disadvantages. -- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown From mafagafogigante at gmail.com Sun May 14 02:01:25 2017 From: mafagafogigante at gmail.com (Bernardo Sulzbach) Date: Sun, 14 May 2017 03:01:25 -0300 Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects) In-Reply-To: <5917E2DF.6020000@brenbarn.net> References: <5917E2DF.6020000@brenbarn.net> Message-ID: <6be9695f-8e49-aafd-7dcb-8216e6843a73@gmail.com> On 05/14/2017 01:53 AM, Brendan Barnwell wrote: > On 2017-05-13 21:07, Simon Ramstedt wrote: >> >> Here are the pros and cons I could come up with for the proposed method: >> >> (+) Simpler and more explicit. > > I don't really see how that's simpler or more explicit. In one > respect it's clearly less explicit, as the "self" is implicit. > I cannot imagine that the average Python programmer would consider this to be simpler or more explicit than the current way of creating objects. Some purists may argue in your favor, saying that by not having the _x attribute your code is more robust as you cannot accidentally change _x later and end up modifying how a method works. I don't think this is worse. However, I do think it is unnecessary. 
From steve at pearwood.info Sun May 14 03:04:45 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 14 May 2017 17:04:45 +1000 Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects) In-Reply-To: References: Message-ID: <20170514070445.GF24625@ando.pearwood.info> On Sun, May 14, 2017 at 04:07:44AM +0000, Simon Ramstedt wrote: > Hi, do you have an opinion on the following? Hi, and welcome, and of course we have an opinion! This is Python-Ideas, we're very opinionated :-) > Wouldn't it be nice to define classes via a simple constructor function (as > below) instead of a conventional class definition? No. > *conventional*: > > class MyClass(ParentClass): > def __init__(x): > self._x = x > def my_method(y): > z = self._x + y > return z Looks good to me. It is nicely explicit that you're creating a class, the superclass or superclasses are easy to see, and the attributes are explicit. > *proposed*: > > def MyClass(x): That is the exact same syntax for defining a function called "MyClass", that takes one argument, x. How is Python (and the reader!) supposed to tell which calls to def return a class and which return a function? > self = ParentClass() What if you have multiple parent classes? Why is self an instance of the parent class, instead of MyClass? > def my_method(y): > z = x + y > return z The local variable x is not defined. Wait, is that supposed to come from the closure def MyClass(x)? What if your class has twenty methods, each of which takes different arguments? Do you have to write: def MyClass(x, # used in my_method y, # used in another_method z, # used in third_method, a, b, c, # used in fourth_method ... # blah blah blah ): How does this generalise to non-toy classes, classes with more than one method? 
>     self.my_method = my_method # that's cumbersome (see comments below)
>     return self
>
>
> Here are the pros and cons I could come up with for the proposed method:
>
> (+) Simpler and more explicit.

I think you mean "More complicated and less explicit".

Is this supposed to be some sort of prototype-based OOP instead of
class-based OOP? I'd be interested in investigating prototype-based
objects, but I don't think this is the way to do it.

--
Steve

From arj.python at gmail.com Sun May 14 03:12:21 2017
From: arj.python at gmail.com (Abdur-Rahmaan Janhangeer)
Date: Sun, 14 May 2017 11:12:21 +0400
Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects)
In-Reply-To: <20170514070445.GF24625@ando.pearwood.info>
References: <20170514070445.GF24625@ando.pearwood.info>
Message-ID:

Whatever you all propose, coming from a Java and C++ background, OOP in
Python is quite cumbersome.

If you tell me that I am not a Python guy, then consider that the current
OOP style does not reflect Python's style of ease and simplicity.

Is __init__ really a good syntax choice?

Abdur-Rahmaan Janhangeer, Mauritius
https://abdurrahmaanjanhangeer.wordpress.com

From rosuav at gmail.com Sun May 14 03:34:36 2017
From: rosuav at gmail.com (Chris Angelico)
Date: Sun, 14 May 2017 17:34:36 +1000
Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects)
In-Reply-To: <5917E2DF.6020000@brenbarn.net>
References: <5917E2DF.6020000@brenbarn.net>
Message-ID:

On Sun, May 14, 2017 at 2:53 PM, Brendan Barnwell wrote:
> Attributes aren't just for passing things to other methods. They're
> for storing state. In your proposed system, how would an object mutate one
> of its own attributes? It looks like "x" here is just stored in a function
> closure, which wouldn't allow easy mutation. Also, how would another object
> access the attribute from outside (as we currently do with self.x)? You can
> say we'd only use this new attribute-free approach when we want to pass a
> constructor argument that's used but never mutated or accessed from outside,
> but that severely restricts the potential use cases, and all it saves you is
> typing "self".

My expectation is that you'd be using "nonlocal x" to do that. Closures
can be used to emulate classes (and vice versa). However, in Python, the
syntax for reaching outside a method to access the closure is
significantly clunkier than the equivalent in C-derived languages:

// JavaScript
function outer() {
    var x = 0;
    function inner() {
        x += 2;
    }
}

# Python
def outer():
    x = 0
    def inner():
        nonlocal x
        x += 2

Is it better to say "nonlocal" on everything you use than to say
"self.x" on each use? I say no, because it makes copying and pasting
code dangerous - reading attributes will work, but mutating them
requires the nonlocal tag. With "self.x", it's the same on both, and
it's the same in all methods.
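A minimal sketch of that asymmetry follows. The names make_counter,
broken_counter and Counter are made up for illustration and do not come
from the thread. The closure version needs "nonlocal" only in the function
that rebinds the variable, while the class version spells reads and writes
the same way.

```python
def make_counter():
    """Closure-based 'object': the state lives in the enclosing scope."""
    count = 0

    def increment(step=1):
        nonlocal count  # needed because this function rebinds count
        count += step
        return count

    def peek():
        return count  # reading alone needs no declaration

    return increment, peek


def broken_counter():
    """Same body pasted without the nonlocal line."""
    count = 0

    def increment(step=1):
        count += step  # count is now local here: UnboundLocalError on call
        return count

    return increment


class Counter:
    """Class-based equivalent: reads and writes both go through self."""

    def __init__(self):
        self.count = 0

    def increment(self, step=1):
        self.count += step
        return self.count

    def peek(self):
        return self.count
```

Calling the function returned by broken_counter() fails only at call time,
which is the copy-and-paste hazard described above.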
ChrisA

From simonramstedt at gmail.com Sun May 14 03:35:38 2017
From: simonramstedt at gmail.com (Simon Ramstedt)
Date: Sun, 14 May 2017 07:35:38 +0000
Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects)
In-Reply-To: <5917E2DF.6020000@brenbarn.net>
References: <5917E2DF.6020000@brenbarn.net>
Message-ID:

Hi, thanks a lot for your feedback!

On Sun, May 14, 2017, 00:54 Brendan Barnwell wrote:

> On 2017-05-13 21:07, Simon Ramstedt wrote:
> > Hi, do you have an opinion on the following?
>
> My general opinion is that imitating JavaScript is almost always a bad
> idea. :-)
>
> > Wouldn't it be nice to define classes via a simple constructor function
> > (as below) instead of a conventional class definition?
> >
> > *conventional*:
> >
> > class MyClass(ParentClass):
> >     def __init__(x):
> >         self._x = x
> >     def my_method(y):
> >         z = self._x + y
> >         return z
> >
> > *proposed*:
> >
> > def MyClass(x):
> >     self = ParentClass()
> >     def my_method(y):
> >         z = x + y
> >         return z
> >     self.my_method = my_method # that's cumbersome (see comments below)
> >     return self
> >
> > Here are the pros and cons I could come up with for the proposed method:
> >
> > (+) Simpler and more explicit.
>
> I don't really see how that's simpler or more explicit. In one respect
> it's clearly less explicit, as the "self" is implicit.
>
> > (+) No need to create attributes (like `self._x`) just to pass something
> > from `__init__` to another method.
>
> Attributes aren't just for passing things to other methods. They're
> for storing state. In your proposed system, how would an object mutate
> one of its own attributes? It looks like "x" here is just stored in a
> function closure, which wouldn't allow easy mutation. Also, how would
> another object access the attribute from outside (as we currently do
> with self.x)? You can say we'd only use this new attribute-free
> approach when we want to pass a constructor argument that's used but
> never mutated or accessed from outside, but that severely restricts the
> potential use cases, and all it saves you is typing "self".

Attributes could be added to self just as in conventional classes if they
are needed.

> Relatedly, how is ParentClass itself defined? I don't see how you
> could bootstrap this without having a real class at the bottom of it
> somehow (as your test script in fact does).

You could bootstrap with an object base class/constructor just as normal
classes inherit from object. Also the normal class system should remain in
any case in order not to break every Python library.

> > (+) Default arguments / annotations for methods could be different for
> > each class instance. Adaptive defaults wouldn't have to be simulated
> > with a None.
>
> That seems as likely to be a negative as a positive. Having different
> instances with different default values could be confusing. This would
> even allow different instances to define totally different methods (with
> if-logic inside the function constructor), which would definitely be
> confusing.

Different default values for different instances are a corner case but
they are already happening by setting default to None. Defining different
methods for different instances wouldn't be good but that is also possible
with conventional classes (by adding functions to self in __init__).

> > (+) Class/instance level imports would work.
>
> How often is that really needed?

True, usually it doesn't matter. But when using big packages like
tensorflow that take several seconds to load it can be annoying. It's
always loaded when importing any library that uses it internally, because
of module level imports that should be class/instance level. Even if we
just wanted to do --help on the command line and needed that library
before argparse for some reason.
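For what it's worth, deferring an import is already possible with
conventional classes by putting the import statement inside the method that
needs it. A minimal sketch, in which the standard-library statistics module
stands in for a slow package such as tensorflow, and Model and print_help
are made-up names:

```python
class Model:
    """Conventional class whose heavy dependency is imported lazily."""

    def predict(self, values):
        # The import runs on the first call; later calls reuse the module
        # cached in sys.modules. `statistics` is only a stand-in here.
        import statistics
        return statistics.mean(values)


def print_help():
    # This code path never imports the heavy dependency.
    return "usage: tool [--help]"
```

This does not help with module level imports inside third-party libraries,
which is the case described above.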
> > (-/+) Speed: The `def`-based objects take 0.6 µs to create while the
> > `class`-based objects take only 0.4 µs. For method execution however the
> > closure takes only 0.15 µs while the proper method takes 0.22 µs (script
> > ).
>
> I don't think you can really evaluate the performance impact of this
> alternative just based on a trivial example like that.

Agree, I don't know really how well this would perform.

> > (-/+) Checking types: In the proposed example above the returned object
> > wouldn't know that it has been created by `MyClass`. There are a couple
> > of solutions to that, though. The easiest to implement would be to
> > change the first line to `self = subclass(ParentClass())` where the
> > subclass function looks at the next item in the call stack (i.e.
> > `MyClass`) and makes it the type of the object. Another solution would
> > be to have a special rule for functions with capital first letter
> > returning a single object to append itself to the list of types of the
> > returned object. Alternatively there could be a special keyword e.g.
> > `classdef` that would be used instead of `def` if we wouldn't want to
> > rely on the name.
>
> Those special rules sound very hackish to me.
>
> > (-) The current syntax for adding a function to an object is
> > cumbersome. That's what is preventing me from actually using the
> > proposed pattern. But is this really the only reason for not using it?
> > And if so, wouldn't that be a good argument for enabling something like
> > below?
> >
> > *attribute function definitions*:
> >
> > def MyClass(x):
> >     self = ParentClass()
> >     def self.my_method(y):
> >         z = x + y
> >         return z
> >     return self
> >
> > or alternatively *multiline lambdas*:
> >
> > def MyClass(x):
> >     self = ParentClass()
> >     self.my_method = (y):
> >         z = x + y
> >         return z
> >     return self
>
> To be honest, from all your examples, I don't really see what the point
> is. It's a different syntax for doing some of the same things the
> existing class syntax does, while providing no apparent way to do some
> important things (like mutating attributes). I think Python's existing
> class syntax is simple, clear, and quite nice overall, and creating
> class instances by calling a function instead of a class doesn't add
> anything. In fact, even JavaScript has recently added classes to allow
> programmers to move away from the old approach that you describe here.
> Also, as I alluded to above, JavaScript is so terrible in so many ways
> that the mere idea of imitating it inherently makes me skeptical;
> there's almost nothing about JavaScript's design that couldn't be done
> better, and most of what it does are things that Python already does
> better and has done better for years. In short, I don't see any
> advantages at all to doing classes this way, and there are some
> non-negligible disadvantages.

Interesting, didn't know that about Javascript. I also don't like
Javascript's prototypes very much but thought adding "JavaScript-like" to
the title might help explain what I meant.

Leaving the possible replacement for classes aside, do you have an opinion
specifically about the following?

def obj.my_function(a, b):
    ...

as syntactic sugar for

def my_function(a, b):
    ...

obj.my_function = my_function

In my experience this pattern actually comes up quite a bit. E.g. when
working with these "symbolic" machine learning frameworks like theano or
tensorflow. Apart from that it mixins very easy.

What do you think are the odds of something like this actually making it
into Python and if greater than 0 in which timeframe?

> --
> Brendan Barnwell
> "Do not follow where the path may lead. Go, instead, where there is no
> path, and leave a trail."
> --author unknown
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

Thanks,

Simon

From bzvi7919 at gmail.com Sun May 14 03:43:08 2017
From: bzvi7919 at gmail.com (Bar Harel)
Date: Sun, 14 May 2017 07:43:08 +0000
Subject: [Python-ideas] singledispatch for instance methods
In-Reply-To:
References:
Message-ID:

Bump

On Wed, Jan 4, 2017, 8:01 PM Lisa Roach wrote:

> +1 to this as well, I think this would be really useful in the stdlib.
>
> On Mon, Dec 26, 2016 at 5:40 AM, Bar Harel wrote:
>
>> Any updates with a singledispatch for methods?
>>
>> On Tue, Sep 20, 2016, 5:49 PM Bar Harel wrote:
>>
>>> At last! Haven't used single dispatch exactly because of that. Thank you
>>> savior!
>>> +1
>>>
>>> On Tue, Sep 20, 2016, 6:03 AM Tim Mitchell wrote:
>>>
>>>> Hi All,
>>>>
>>>> We have a modified version of singledispatch at work which works for
>>>> methods as well as functions. We have open-sourced it as methoddispatch
>>>> (pypi: https://pypi.python.org/pypi/methoddispatch).
>>>>
>>>> IMHO I thought it would make a nice addition to python stdlib.
>>>>
>>>> What does everyone else think?

From stephanh42 at gmail.com Sun May 14 03:45:20 2017
From: stephanh42 at gmail.com (Stephan Houben)
Date: Sun, 14 May 2017 09:45:20 +0200
Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects)
In-Reply-To:
References: <5917E2DF.6020000@brenbarn.net>
Message-ID:

FWIW, Javascript itself is moving away from this syntax in favour of a
more Python-like syntax based on the 'class' keyword. This was introduced
in EcmaScript 2015.

Stephan

From antoine.rozo at gmail.com Sun May 14 03:57:25 2017
From: antoine.rozo at gmail.com (Antoine Rozo)
Date: Sun, 14 May 2017 09:57:25 +0200
Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects)
In-Reply-To:
References: <5917E2DF.6020000@brenbarn.net>
Message-ID:

How do you call methods from the superclass, like super in classic style?

2017-05-14 9:45 GMT+02:00 Stephan Houben :

> FWIW, Javascript itself is moving away from this syntax in favour of a
> more Python-like syntax based on the 'class' keyword. This was introduced
> in EcmaScript 2015.
>
> Stephan
>>> --author unknown >>> _______________________________________________ >>> Python-ideas mailing list >>> Python-ideas at python.org >>> https://mail.python.org/mailman/listinfo/python-ideas >>> Code of Conduct: http://python.org/psf/codeofconduct/ >> >> >> Thanks, >> >> Simon >> >>> >>> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> >> > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > -- Antoine Rozo -------------- next part -------------- An HTML attachment was scrubbed... URL: From simonramstedt at gmail.com Sun May 14 03:59:30 2017 From: simonramstedt at gmail.com (Simon Ramstedt) Date: Sun, 14 May 2017 00:59:30 -0700 (PDT) Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects) In-Reply-To: <20170514070445.GF24625@ando.pearwood.info> References: <20170514070445.GF24625@ando.pearwood.info> Message-ID: <5cc63b8a-d7ba-4c0b-b8ca-22b98530d3e0@googlegroups.com> On Sunday, May 14, 2017 at 3:05:46 AM UTC-4, Steven D'Aprano wrote: > > On Sun, May 14, 2017 at 04:07:44AM +0000, Simon Ramstedt wrote: > > Hi, do you have an opinion on the following? > > Hi, and welcome, and of course we have an opinion! This is Python-Ideas, > we're very opinionated :-) > > Thanks! > > Wouldn't it be nice to define classes via a simple constructor function > (as > > below) instead of a conventional class definition? > > No. > > > *conventional*: > > > > class MyClass(ParentClass): > > def __init__(x): > > self._x = x > > def my_method(y): > > z = self._x + y > > return z > > Looks good to me. 
It is nicely explicit that you're creating a class, > the superclass or superclasses are easy to see, and the attributes are > explicit. > > > > *proposed*: > > > > def MyClass(x): > > That is the exact same syntax for defining a function called "MyClass", > that takes one argument, x. How is Python (and the reader!) supposed to > tell which calls to def return a class and which return a function? > > > > self = ParentClass() > > What if you have multiple parent classes? > > Right, the parent class would have to specifically written to allow that e.g. via: def ParentClass(obj=None): self = obj or Object() ... > Why is self an instance of the parent class, instead of MyClass? > That's what I've tried to cover under "(+/-) Checking types: ..." > > def my_method(y): > > z = x + y > > return z > > The local variable x is not defined. Wait, is that supposed to come from > the closure def MyClass(x)? > > What if your class has twenty methods, each of which takes different > arguments? Do you have to write: > > def MyClass(x, # used in my_method > y, # used in another_method > z, # used in third_method, > a, b, c, # used in fourth_method > ... # blah blah blah > ): > > How does this generalise to non-toy classes, classes with more than one > method? > > def MyClass would basically as a replacement for __init__. Using __init__ instead for your example would also not be perfect: def __init__(self, x, y, z, a, b, c): self._x = x self._y = y ... The point is though, that you could still do exactly the same if you wanted to: def MyClass(x, y, z, a, b, c): self = SuperClass() self.x = x self._y = y ... def my_method(): self.x += 5 return self.x + self._y ... > > self.my_method = my_method # that's cumbersome (see comments > below) > > return self > > > > > > Here are the pros and cons I could come up with for the proposed method: > > > > (+) Simpler and more explicit. > > I think you mean "More complicated and less explicit". 
> > Is this supposed to be some sort of prototype-based OOP instead of > class-based OOP? I'd be interested in investigating prototype-based > objects, but I don't think this is the way to do it. > > > > -- > Steve > _______________________________________________ > Python-ideas mailing list > Python... at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Sun May 14 07:14:36 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 14 May 2017 21:14:36 +1000 Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects) In-Reply-To: References: Message-ID: <20170514111436.GG24625@ando.pearwood.info> Some further thoughts... On Sun, May 14, 2017 at 04:07:44AM +0000, Simon Ramstedt wrote: > *proposed*: > > def MyClass(x): > self = ParentClass() > def my_method(y): > z = x + y > return z > self.my_method = my_method # that's cumbersome (see comments below) > return self I think I misunderstood this earlier. "x" in this case is intended as a parameter to the __init__ method, not as a parameter to the "my_method" method. So my earlier objection doesn't apply: this isn't about the arguments to my_method, it is about the parameters to __init__. To my mind, this is very confusing. The function is called MyClass, leading me to believe that it returns the class object, but it doesn't, it returns a newly instantiated instance. But... given obj = MyClass(1), say, we get: assert type(obj) is ParentClass So MyClass doesn't actually exist anywhere. Worse: def Spam(): self = Parent() # ... def Eggs(): self = Parent() # ... 
    a = Spam()
    b = Eggs()
    assert type(a) is type(b)

As you point out later in your post:

> (-/+) Checking types: In the proposed example above the returned object
> wouldn't know that it has been created by `MyClass`.

In fact, there is no MyClass class anywhere. That's very strange and confusing.

> There are a couple of
> solutions to that, though. The easiest to implement would be to change the
> first line to `self = subclass(ParentClass())` where the subclass function
> looks at the next item in the call stack (i.e. `MyClass`) and makes it the
> type of the object.

You say this is the easiest to implement. Do you have an implementation? Does it work for CPython, Jython, IronPython, PyPy, Stackless, Nuitka, and other Python implementations? What of Python implementations that don't support introspection of the call stack?

(Perhaps we should just make that a required language feature. But that's a separate argument.)

> Another solution would be to have a special rule for
> functions with capital first letter returning a single object to append
> itself to the list of types of the returned object. Alternatively there
> could be a special keyword e.g. `classdef` that would be used instead of
> `def` if we wouldn't want to rely on the name.

There needs to be some compiler magic happening here. Whether it is in the def keyword or in a new built-in "subclass()" or "classdef", any way you do it, it needs to be built into the interpreter. That means it's backwards incompatible, and cannot be backported to older versions or alternative implementations.

That's not necessarily a fatal objection, but it does mean that the feature needs to add substantial value to the language to make up for the cost of having two ways to do the same thing.

> Here are the pros and cons I could come up with for the proposed method:
>
> (+) Simpler and more explicit.

To elaborate on my earlier answer, I really don't think it is simpler.
There's a lot of extra compiler magic going on to make it work, and we're lacking a reference to the actual class object itself. You can't say: MyClass.attribute = 999 # add a class attribute because MyClass isn't actually the class, it's a special constructor of the class. Not the __init__ method, or the __new__ method, but a factory function that somehow, behind the scenes, magically creates the class and returns a new instance of it. So to add to the class, you have to write: instance = MyClass() # hope this has no side-effects type(instance).attribute = 999 del instance A couple of other consequences that you might have missed: (1) No docstring: there's no obvious place to declare the class docstring. (2) Every instance gets a new, unique, my_method() object added to it. Every time you call the MyClass constructor/factory function, it creates a brand new my_method object, and attaches it to the instance, self. That's potentially wasteful of memory, but it will work. Well, not quite. As written, it is quite different from the way methods are normally handled. In this new suggested syntax, my_method is actually a function object on the instance, with no self parameter. Since the descriptor protocol is bypassed for objects on self, that doesn't matter, it will work. But now we have two ways of writing methods: - if you write methods using the class keyword, they will be stored in the class __dict__ and they MUST have a `self` parameter; - if you write methods using the constructor function, they will be stored in the instance __dict__, they will be function objects not method objects, they MUST NOT have a `self` parameter, and you cannot use classmethod or staticmethod. I can see this leading to severe confusion when people mistakenly add a self parameter or leave it out. To avoid this, we need yet more magic, or work: instead of writing self.my_method = my_method you have to write something like: type(self).my_method = my_method except that's not right either! 
That will mean each time you call the constructor function, you replace the my_method for EVERY instance with a new closure. This is not simple at all.

[...]
> (+) Class/instance level imports would work.

Why do you think that import doesn't work at the class or instance level?

> (-/+) Speed: The `def`-based objects take 0.6 µs to create while the
> `class`-based objects take only 0.4 µs. For method execution however the
> closure takes only 0.15 µs while the proper method takes 0.22 µs (script).

Premature optimization.

> (-) The current syntax for adding a function to an object is cumbersome.

No more cumbersome than any other value:

    obj.attribute = value

works for any value, including functions, provided that it already exists (or can be created using a single expression). If not, then the syntax is no worse for functions than any other multi-statement object:

    # prepare the value
    a = []
    while condition():
        a.append(something())
    # attach it to the object
    obj.attribute = a

> That's what is preventing me from actually using the proposed pattern. But
> is this really the only reason for not using it? And if so, wouldn't that
> be a good argument for enabling something like below?
[...]
> def self.my_method(y):

That's been proposed at least once before, and probably more than that. I think the most recent time was earlier this year?

I don't think I'm against that specific language feature. But nor is it an obviously compelling feature.

> or alternatively *multiline lambdas*:

Gosh, what a good idea! I wonder why nobody else has suggested a multi-statement lambda! /sarcasm

Multi-statement lambdas have been debated since, oh, Python 1.4 if not earlier. If they were easy to add, we would have had them by now.
At the point that your simple proposal to add Javascript-style constructors requires multiple other changes to the language to be practical: - either multi-statement lambda, - or dotted function targets `def self.name`; - a new built-in function subclass() with call stack introspection powers, - or a new keyword classdef, - or significant magic in the def keyword; this is not a simple proposal. And the benefits are marginal, at best: as far as I can see, the best we can hope for is a tiny performance increase by using closures as methods, and some interesting semantic differences: - classmethod and staticmethod won't work (this is bad); - per-instance methods: methods are in the instance __dict__, not the class __dict__ (this might be good, or bad); - bypass the descriptor protocol; - no self (that's a bad thing); - mutator methods need to use the nonlocal keyword (bad). -- Steve From steve at pearwood.info Sun May 14 07:18:31 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 14 May 2017 21:18:31 +1000 Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects) In-Reply-To: References: <20170514070445.GF24625@ando.pearwood.info> Message-ID: <20170514111831.GH24625@ando.pearwood.info> On Sun, May 14, 2017 at 11:12:21AM +0400, Abdur-Rahmaan Janhangeer wrote: > Whatever you all propose, > > coming from a java and c++ background, OOP in python is quite cumbersome. In what way it is cumbersome? > if you tell that i am not a python guy, then consider that current oop > style does not reflect python's style of ease and simplicity That is one opinion. > is __init__ really a good syntax choice? I don't understand the question. Are you complaining about the name "__init__"? Do you think it would be easier to write if it was spelled "init" or "new" or "Constructor"? 
-- Steve From antoine.rozo at gmail.com Sun May 14 07:33:32 2017 From: antoine.rozo at gmail.com (Antoine Rozo) Date: Sun, 14 May 2017 13:33:32 +0200 Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects) In-Reply-To: <20170514111831.GH24625@ando.pearwood.info> References: <20170514070445.GF24625@ando.pearwood.info> <20170514111831.GH24625@ando.pearwood.info> Message-ID: Also, how do you handle special methods for operators, such as __add__? 2017-05-14 13:18 GMT+02:00 Steven D'Aprano : > On Sun, May 14, 2017 at 11:12:21AM +0400, Abdur-Rahmaan Janhangeer wrote: > > Whatever you all propose, > > > > coming from a java and c++ background, OOP in python is quite cumbersome. > > In what way it is cumbersome? > > > if you tell that i am not a python guy, then consider that current oop > > style does not reflect python's style of ease and simplicity > > That is one opinion. > > > > is __init__ really a good syntax choice? > > I don't understand the question. Are you complaining about the name > "__init__"? Do you think it would be easier to write if it was spelled > "init" or "new" or "Constructor"? > > > > -- > Steve > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- Antoine Rozo -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From steve at pearwood.info Sun May 14 07:39:21 2017
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 14 May 2017 21:39:21 +1000
Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects)
In-Reply-To:
References: <5917E2DF.6020000@brenbarn.net>
Message-ID: <20170514113920.GI24625@ando.pearwood.info>

On Sun, May 14, 2017 at 07:35:38AM +0000, Simon Ramstedt wrote:

> Leaving the possible replacement for classes aside, do you have an opinion
> specifically about the following?
>
>     def obj.my_function(a, b):
>         ...
>
> as syntactic sugar for
>
>     def my_function(a, b):
>         ...
>
>     obj.my_function = my_function

Personally, I don't object to it, I can see the idea has some merit. See the most recent discussion here:

https://mail.python.org/pipermail/python-ideas/2017-February/044551.html

> In my experience this pattern actually comes up quite a bit. E.g. when
> working with these "symbolic" machine learning frameworks like theano or
> tensorflow. Apart from that it makes mixins very easy.
>
> What do you think are the odds of something like this actually making it
> into Python and if greater than 0 in which timeframe?

Somebody would need to propose some compelling use-cases for where this is clearly better than the status quo. Guido would have to not object. Somebody would have to volunteer to do the work, and it would have to not cause an unacceptable performance hit. (I doubt that it would, since it's just a small amount of new syntactic sugar.)

If you can show actual real-life code that would be improved by this new feature, that would increase the probability. If you can prove that it would be a benefit to (let's say) the Theano community, that would increase the probability.

If you volunteered to do the work, rather than expect somebody else to do it, that would *significantly* increase the probability of it actually happening, provided the suggestion was actually accepted.
If you read the previous discussion, I think the conclusion was that there is nothing that def Some_Object.name(x): ... can do that cannot be emulated by a decorator: @inject(Some_Object) def name(x): ... *except* that the decorator solution would leave the function "name" polluting the current namespace, and the new syntax could avoid that. And even that is easy to work-around: del name So I think the probability is low, but not zero. It would help if you could prove a significant real-world use-case for injecting functions into existing objects. -- Steve From steve at pearwood.info Sun May 14 07:48:52 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 14 May 2017 21:48:52 +1000 Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects) In-Reply-To: References: <20170514070445.GF24625@ando.pearwood.info> <20170514111831.GH24625@ando.pearwood.info> Message-ID: <20170514114851.GJ24625@ando.pearwood.info> On Sun, May 14, 2017 at 01:33:32PM +0200, Antoine Rozo wrote: > Also, how do you handle special methods for operators, such as __add__? Oh, that's a good point! I forgot about that. For implementation-dependent reasons, you couldn't use this proposed new syntax for dunder methods: def MyClass(): self = subclass(Parent) def my_method(arg): ... self.my_method = my_method def __str__(): ... self.__str__ = __str__ return self obj = MyClass() obj.my_method(123) # okay obj.__str__() # works, but bad style str(obj) # doesn't work in CPython Because of the implementation, str(obj) would NOT call __str__ in CPython, although I think it would in IronPython. I'm not sure about PyPy or Jython. (CPython "new style classes" only call __dunder__ methods when they are defined on the class, or a superclass, not when they are in the instance __dict__.) 
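The CPython behaviour described in the last paragraph can be checked with a small sketch. It assumes CPython 3; the names Parent, explicit, implicit and via_class are illustrative only and do not come from the thread.

```python
class Parent:
    pass


obj = Parent()

# Attach a __str__ function to the *instance*, as the proposed
# constructor-function pattern would do.
obj.__str__ = lambda: "instance-level __str__"

# Ordinary attribute access finds the function in the instance __dict__.
explicit = obj.__str__()

# str() looks the special method up on type(obj), so the instance
# attribute is ignored and the default object.__str__ is used instead.
implicit = str(obj)

# Defining __str__ on the class is what str() actually honours.
Parent.__str__ = lambda self: "class-level __str__"
via_class = str(obj)
```

The same lookup rule applies to operator methods such as __add__, which is why per-instance closures cannot supply them.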
-- 
Steve

From rosuav at gmail.com Sun May 14 08:45:24 2017
From: rosuav at gmail.com (Chris Angelico)
Date: Sun, 14 May 2017 22:45:24 +1000
Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects)
In-Reply-To: <20170514111436.GG24625@ando.pearwood.info>
References: <20170514111436.GG24625@ando.pearwood.info>
Message-ID:

On Sun, May 14, 2017 at 9:14 PM, Steven D'Aprano wrote:
>> There are a couple of
>> solutions to that, though. The easiest to implement would be to change the
>> first line to `self = subclass(ParentClass())` where the subclass function
>> looks at the next item in the call stack (i.e. `MyClass`) and makes it the
>> type of the object.
>
> You say this is the easiest to implement. Do you have an implementation?
> Does it work for CPython, Jython, IronPython, PyPy, Stackless, Nuitka,
> and other Python implementations? What of Python implementations that
> don't support introspection of the call stack?
>
> (Perhaps we should just make that a required language feature. But
> that's a separate argument.)
>

Here's a much easier way to achieve that:

    @Class
    def MyClass(...):
        ...

As well as declaring that this is, in fact, a class (more reliable than depending on a naming convention), this decorator could take the return value from the original function and then tag it as a subclass of that object, with the subclass's name being lifted from the function's name. Something like:

    def Class(func):
        @wraps(func)
        def wrapper(*a, **kw):
            ret = func(*a, **kw)
            if ret is None: return ret
            ret.__class__ = func.__name__ # pretend this works
            return ret
        return wrapper

This would chain, because "self = ParentClass()" will be going through this decorator too. But honestly, I think I'd rather do something like this:

    def Class(func):
        class Thing:
            @wraps(func)
            def __new__(*a, **kw):
                func()
        Thing.__name__ = func.__name__
        return Thing

In other words, the transformation is to an actual class.
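A runnable take on the first decorator above, as a minimal sketch rather than the author's implementation. It assumes the constructor function returns an instance of an ordinary Python class (not a built-in such as int), and it replaces the "pretend this works" line by building a real subclass named after the function. The per-base cache and the Parent and MyClass names are illustrative additions.

```python
from functools import wraps


def Class(func):
    """Tag objects returned by func as instances of a class named after func."""
    subclasses = {}  # one generated subclass per base type (assumed helper)

    @wraps(func)
    def wrapper(*args, **kwargs):
        ret = func(*args, **kwargs)
        if ret is None:
            return ret
        base = type(ret)
        cls = subclasses.get(base)
        if cls is None:
            # Create an empty subclass of the returned object's type.
            cls = subclasses[base] = type(func.__name__, (base,), {})
        # Re-tag the instance; this only works for layout-compatible
        # classes, which holds for plain Python classes.
        ret.__class__ = cls
        return ret

    return wrapper


class Parent:
    pass


@Class
def MyClass(x):
    self = Parent()

    def my_method(y):
        return x + y

    self.my_method = my_method
    return self


obj = MyClass(1)
```

With this in place, type(obj).__name__ is "MyClass" and isinstance(obj, Parent) holds, but methods still live in the instance __dict__, so the objections about dunder methods and per-instance closures are unchanged.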
The function form becomes syntactic sugar for a particular form of class definition. You could even do subclassing through the decorator:

    def Class(*parents):
        def deco(func):
            class Thing(*parents):
                # etc

AFAIK all this is fully legal in current versions of Python 3. If you want to mess around with classes and functions, decorators can do all that and more, and IMO are the best way to prototype the syntax. No changes needed to grammar or anything. It'll still be a class under the hood, so isinstance checks will work, for instance (sorry, couldn't resist). And it can all be done through a single entry point that you can easily publish in a module on PyPI.

ChrisA

From guido at python.org Sun May 14 12:25:12 2017
From: guido at python.org (Guido van Rossum)
Date: Sun, 14 May 2017 09:25:12 -0700
Subject: [Python-ideas] singledispatch for instance methods
In-Reply-To:
References:
Message-ID:

How exactly do you think the process of adopting something into the stdlib works? Just posting "bump" messages to the mailing list doesn't really help, it just sounds rude. If you need help understanding how to add/improve a stdlib module, please ask a specific question about that topic.

On Sun, May 14, 2017 at 12:43 AM, Bar Harel wrote:

> Bump
>
> On Wed, Jan 4, 2017, 8:01 PM Lisa Roach wrote:
>
>> +1 to this as well, I think this would be really useful in the stdlib.
>>
>> On Mon, Dec 26, 2016 at 5:40 AM, Bar Harel wrote:
>>
>>> Any updates with a singledispatch for methods?
>>>
>>> On Tue, Sep 20, 2016, 5:49 PM Bar Harel wrote:
>>>
>>>> At last! Haven't used single dispatch exactly because of that. Thank
>>>> you savior!
>>>> +1
>>>>
>>>> On Tue, Sep 20, 2016, 6:03 AM Tim Mitchell
>>>> wrote:
>>>>
>>>>> Hi All,
>>>>>
>>>>> We have a modified version of singledispatch at work which works for
>>>>> methods as well as functions. We have open-sourced it as methoddispatch
>>>>> (pypi: https://pypi.python.org/pypi/methoddispatch).
>>>>>
>>>>> IMHO I thought it would make a nice addition to python stdlib.
>>>>>
>>>>> What does everyone else think?

-- 
--Guido van Rossum (python.org/~guido)

From guido at python.org Sun May 14 12:25:42 2017
From: guido at python.org (Guido van Rossum)
Date: Sun, 14 May 2017 09:25:42 -0700
Subject: [Python-ideas] singledispatch for instance methods
In-Reply-To:
References:
Message-ID:

PS: I didn't see a message from Lisa on the mailing list -- maybe she replied to you only?
-- 
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From guido at python.org Sun May 14 12:37:10 2017
From: guido at python.org (Guido van Rossum)
Date: Sun, 14 May 2017 09:37:10 -0700
Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects)
In-Reply-To:
References: <5917E2DF.6020000@brenbarn.net>
Message-ID:

On Sun, May 14, 2017 at 12:35 AM, Simon Ramstedt wrote:

> What do you think are the odds of something like this actually making it
> into the Python and if greater than 0 in which timeframe?
>

If you're asking for language or stdlib support or an official endorsement, the odds are exactly zero. Of course if this pattern is useful for you in the context of something you are developing you are free to use it, presuming it doesn't require language or stdlib changes.

-- 
--Guido van Rossum (python.org/~guido)

From bzvi7919 at gmail.com Sun May 14 12:37:32 2017
From: bzvi7919 at gmail.com (Bar Harel)
Date: Sun, 14 May 2017 16:37:32 +0000
Subject: [Python-ideas] singledispatch for instance methods
In-Reply-To:
References:
Message-ID:

I guess so.

Sorry for that.
To be honest I'm not entirely sure of the entire procedure and if small things need a PEP or not. I actually received the tip to bump from core-mentorship, so now I'm rather confused.

Anyway, shall I add it to the bug tracker as an enhancement?
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From guido at python.org Sun May 14 12:49:31 2017
From: guido at python.org (Guido van Rossum)
Date: Sun, 14 May 2017 09:49:31 -0700
Subject: [Python-ideas] singledispatch for instance methods
In-Reply-To:
References:
Message-ID:

Maybe ask core-mentorship if they meant you to literally post just the word "bump" to the list (my guess is not). Also the last time I see that you received any advice was a long time ago and regarding a different issue. For this idea there's no issue and no patch (and core devs aren't required to read python-ideas).

Please understand that in this community you are expected to do some work yourself too -- we're not being paid to implement features proposed (or fix bugs reported) by users, we mostly implement/fix things we care about personally, and some of us sometimes volunteer to mentor users who show an interest in learning. IMO posting "bump" several times does not exhibit such interest.
-- 
--Guido van Rossum (python.org/~guido)

From bzvi7919 at gmail.com Sun May 14 13:10:53 2017
From: bzvi7919 at gmail.com (Bar Harel)
Date: Sun, 14 May 2017 17:10:53 +0000
Subject: [Python-ideas] singledispatch for instance methods
In-Reply-To:
References:
Message-ID:

As I said, sorry for that.

It's just that I'm not entirely sure there's anything to implement here. The implementation already exists. If it doesn't suffice I will help as much as I can to make sure it works :-)
>>>>>>>> +1 >>>>>>>> >>>>>>>> On Tue, Sep 20, 2016, 6:03 AM Tim Mitchell < >>>>>>>> tim.mitchell at leapfrog3d.com> wrote: >>>>>>>> >>>>>>>>> Hi All, >>>>>>>>> >>>>>>>>> We have a modified version of singledispatch at work which works >>>>>>>>> for methods as well as functions. We have open-sourced it as >>>>>>>>> methoddispatch (pypi: https://pypi.python.org/pypi/methoddispatch >>>>>>>>> ). >>>>>>>>> >>>>>>>>> IMHO I thought it would make a nice addition to python stdlib. >>>>>>>>> >>>>>>>>> What does everyone else think? From steve at pearwood.info Sun May 14 13:18:39 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 15 May 2017 03:18:39 +1000 Subject: [Python-ideas] singledispatch for instance methods In-Reply-To: References: Message-ID: <20170514171839.GK24625@ando.pearwood.info> On Sun, May 14, 2017 at 04:37:32PM +0000, Bar Harel wrote: > I guess so. > > Sorry for that.
> To be honest I'm not entirely sure of the entire procedure and if small > things need a PEP or not. I actually received the tip to bump from > core-mentorship, so now I'm rather confused. If you are referring to the Core-mentorship at python.org mailing list, I don't recall seeing anyone tell you to send a "Bump" message. Perhaps I missed it? In any case, it's not so much the bump as the brusque, excessively terse manner in which you put it. A single word "bump" to get people's attention comes across as rather rude, unless you know them well. For example, at my work, we often use "ping" to get someone else's attention or remind them to answer a question. But we would never do so to a customer or supplier, or people we didn't know well. Like a mailing list full of strangers from all over the world :-) As far as the proposal here, singledispatch for methods, I think we have to go back to September last year for the original announcement/query from Tim Mitchell: https://mail.python.org/pipermail/python-ideas/2016-September/042466.html Unfortunately that seemed to be an extremely busy month of wild ideas on the mailing list, and long arguments that went around and around in circles. I guess that many people must have felt burned out by the volume of messages, or simply missed Tim Mitchell's post. I remember seeing it, and simply not having the time or energy to have an opinion. Quoting Tim: We have a modified version of singledispatch at work which works for methods as well as functions. We have open-sourced it as methoddispatch (pypi: https://pypi.python.org/pypi/methoddispatch). IMHO I thought it would make a nice addition to python stdlib. What does everyone else think? I don't have any objection to being able to use single dispatch on methods. To be honest, I assumed that singledispatch already did work on methods! 
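For reference, a minimal sketch of why plain functools.singledispatch does not help on methods: it picks the implementation from the type of the first positional argument, and for a method that argument is always the instance. The class and method names below (Formatter, show) are made up for illustration and do not come from methoddispatch.

```python
from functools import singledispatch


class Formatter:
    # singledispatch looks at the type of the FIRST positional argument.
    # For a method that is `self`, so the int overload is never selected.
    @singledispatch
    def show(self, value):
        return "default"

    @show.register(int)
    def _(self, value):
        return "int"


f = Formatter()
result = f.show(42)  # dispatches on type(f), not on type(42)
print(result)
```

Under these assumptions the call falls back to the default implementation even though an int overload is registered, which is the gap a method-aware dispatcher has to close.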
-- Steve From steve at pearwood.info Sun May 14 13:28:14 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 15 May 2017 03:28:14 +1000 Subject: [Python-ideas] singledispatch for instance methods In-Reply-To: References: Message-ID: <20170514172814.GL24625@ando.pearwood.info> On Sun, May 14, 2017 at 05:10:53PM +0000, Bar Harel wrote: > As I said, sorry for that. > > It's just that I'm not entirely sure there's anything to implement here. > The implementation already exists. If it doesn't suffice I will help as > much as I can to make sure it works :-) I think you've succeeded in bringing the issue to people's attention :-) If you care about this enough to do the work (I don't, I expect this will be my last post on the topic), then I suggest you should: - contact Tim Mitchell and see if his offer of contributing the code still stands; - if so, and there are no conclusive objections on this list, then raise an issue on the bug tracker; - if not, then someone will have to fork Tim's code (assuming the licence allows it) or reimplement it without violating the licence; - somebody will have to make a Push Request on GitHub; that might be you, or it might be Tim; - Tim will need to sign a contributor agreement, since it's his code being used; - See the DevGuide for more details. (I don't remember the URL: unless someone else volunteers it, you can google for it.) And I think I've just hit the limit of how much I care about this issue. It would be nice to have, but I don't care enough to push it forward. Good luck. -- Steve From guido at python.org Sun May 14 13:40:07 2017 From: guido at python.org (Guido van Rossum) Date: Sun, 14 May 2017 10:40:07 -0700 Subject: [Python-ideas] singledispatch for instance methods In-Reply-To: <20170514172814.GL24625@ando.pearwood.info> References: <20170514172814.GL24625@ando.pearwood.info> Message-ID: Thanks Steven. 
I think you've just concisely summarized the info in this section of the devguide: https://docs.python.org/devguide/stdlibchanges.html On Sun, May 14, 2017 at 10:28 AM, Steven D'Aprano wrote: > On Sun, May 14, 2017 at 05:10:53PM +0000, Bar Harel wrote: > > As I said, sorry for that. > > > > It's just that I'm not entirely sure there's anything to implement here. > > The implementation already exists. If it doesn't suffice I will help as > > much as I can to make sure it works :-) > > I think you've succeeded in bringing the issue to people's attention :-) > > If you care about this enough to do the work (I don't, I expect this > will be my last post on the topic), then I suggest you should: > > - contact Tim Mitchell and see if his offer of contributing the code > still stands; > > - if so, and there are no conclusive objections on this list, then raise > an issue on the bug tracker; > > - if not, then someone will have to fork Tim's code (assuming the > licence allows it) or reimplement it without violating the licence; > > - somebody will have to make a Push Request on GitHub; that might be > you, or it might be Tim; > > - Tim will need to sign a contributor agreement, since it's his code > being used; > > - See the DevGuide for more details. (I don't remember the URL: unless > someone else volunteers it, you can google for it.) > > And I think I've just hit the limit of how much I care about this issue. > It would be nice to have, but I don't care enough to push it forward. > > Good luck. > > > -- > Steve > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From brenbarn at brenbarn.net Sun May 14 13:48:01 2017 From: brenbarn at brenbarn.net (Brendan Barnwell) Date: Sun, 14 May 2017 10:48:01 -0700 Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects) In-Reply-To: References: <5917E2DF.6020000@brenbarn.net> Message-ID: <59189851.5060806@brenbarn.net> On 2017-05-14 00:34, Chris Angelico wrote: > On Sun, May 14, 2017 at 2:53 PM, Brendan Barnwell wrote: >> Attributes aren't just for passing things to other methods. They're >> for storing state. In your proposed system, how would an object mutate one >> of its own attributes? It looks like "x" here is just stored in a function >> closure, which wouldn't allow easy mutation. Also, how would another object >> access the attribute from outside (as we currently do with self.x)? You can >> say we'd only use this new attribute-free approach when we want to pass a >> constructor argument that's used but never mutated or accessed from outside, >> but that severely restricts the potential use cases, and all it saves you is >> typing "self". > > My expectation is that you'd be using "nonlocal x" to do that. That would allow mutation from within methods, but (as far as I can tell) not access (or mutation) from outside the class. -- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." 
--author unknown

From apalala at gmail.com  Sun May 14 13:57:47 2017
From: apalala at gmail.com (Juancarlo Añez)
Date: Sun, 14 May 2017 13:57:47 -0400
Subject: [Python-ideas] singledispatch for instance methods
In-Reply-To: <20170514172814.GL24625@ando.pearwood.info>
References: <20170514172814.GL24625@ando.pearwood.info>
Message-ID:

On Sun, May 14, 2017 at 1:28 PM, Steven D'Aprano wrote:

> - contact Tim Mitchell and see if his offer of contributing the code
>   still stands;

FWIW, this is a Python implementation of a single-dispatch decorator for
methods that I wrote from looking at the stdlib, and that I have used
successfully in some projects:

    from functools import singledispatch
    from functools import update_wrapper

    def singledispatch_method(method):
        dispatcher = singledispatch(method)

        def wrapper(*args, **kw):
            # args[0] is self, so dispatch on the type of args[1] instead
            return dispatcher.dispatch(args[1].__class__)(*args, **kw)

        wrapper.register = dispatcher.register
        update_wrapper(wrapper, method)
        return wrapper

--
Juancarlo Añez

From hugo.fisher at gmail.com  Mon May 15 05:38:29 2017
From: hugo.fisher at gmail.com (Hugh Fisher)
Date: Mon, 15 May 2017 19:38:29 +1000
Subject: [Python-ideas] Tighten up the formal grammar and parsing a bit?
Message-ID:

I wrote this little Python program using CPython 3.5.2. It's ...
interesting ... that we apparently don't need comments or pass
statements any more. Anyone else think it might be worth tightening up
the grammar definition and parser a bit?

    def empty():
        """Don't do anything"""

    def helloWorld():
        """Docstring"""
        x = 0
        if x > 0:
            """Pass"""
        else:
            x += 1
        print(x)
        """Comment that is a string or vice versa"""
        x = 2
        print(x)
        if x == 2:
            x += 1 ;"Add 1 to x"
        print(x)
        if x == 3:
            42
        print("Answered everything")

    if __name__ == "__main__":
        helloWorld()
        print(empty())

--
cheers,
Hugh Fisher

From rosuav at gmail.com  Mon May 15 06:13:48 2017
From: rosuav at gmail.com (Chris Angelico)
Date: Mon, 15 May 2017 20:13:48 +1000
Subject: [Python-ideas] Tighten up the formal grammar and parsing a bit?
In-Reply-To:
References:
Message-ID:

On Mon, May 15, 2017 at 7:38 PM, Hugh Fisher wrote:
> I wrote this little Python program using CPython 3.5.2. It's ...
> interesting ... that we apparently don't need comments or pass
> statements any more. Anyone else think it might be worth tightening up
> the grammar definition and parser a bit?
>

Nope.

For starters, you shouldn't be using "pass" statements OR dummy
strings to fill in an if statement's body; you can instead simply
write:

    if x <= 0:
        x += 1

Or worst case:

    if not (x > 0):
        x += 1

For the rest, all you've shown is that trivial expressions consisting
only of string literals will be ignored in certain contexts. The
trouble is that string literals don't really mean comments, and won't
be ignored by most humans; plus, there are contexts where they are not
ignored. Here, rewrite this without comments:

    wrong_answer_messages = [
        "Wrong.",
        "Totally wrong, you moron.",
        "Bob, you idiot, that answer is not right. Cordially, Ted.", # Maize
        "That's as useful as a screen door on a battleship.", # BTTF
        # etc
    ]

String literals won't work here, and even if they did, they would be
_extremely_ confusing. Comments are semantically distinct.

The 'pass' statement has a very specific meaning and only a few
use-cases. It could often be omitted in favour of something else, but
there's not a lot of value in doing so. Comments have very significant
value and should definitely be kept.

ChrisA From ncoghlan at gmail.com Mon May 15 07:05:29 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 15 May 2017 21:05:29 +1000 Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects) In-Reply-To: References: <20170514070445.GF24625@ando.pearwood.info> Message-ID: On 14 May 2017 at 17:12, Abdur-Rahmaan Janhangeer wrote: > Whatever you all propose, > > coming from a java and c++ background, OOP in python is quite cumbersome. > > if you tell that i am not a python guy, then consider that current oop style > does not reflect python's style of ease and simplicity > > is __init__ really a good syntax choice? That's a different question, and one with a well-structured third party solution: https://attrs.readthedocs.io/en/stable/ See https://mail.python.org/pipermail/python-ideas/2017-April/045514.html for some ideas on how something like attrs might be adapted to provide better standard library tooling for more concise and readable class definitions. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From steve at pearwood.info Mon May 15 08:29:33 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 15 May 2017 22:29:33 +1000 Subject: [Python-ideas] Tighten up the formal grammar and parsing a bit? In-Reply-To: References: Message-ID: <20170515122933.GM24625@ando.pearwood.info> On Mon, May 15, 2017 at 07:38:29PM +1000, Hugh Fisher wrote: > I wrote this little Python program using CPython 3.5.2. It's ... > interesting ... that we apparently don't need comments or pass > statements any more. I'm not sure what you mean by "any more". The code you give works, unchanged, all the way back to Python 2.0 when augmented assignment was added. If you replace the x += 1 with x = x + 1 it works all the way back to Python 1.5 and probably even older. Python has (more or less) always supported arbitrary expressions as statements, so this is not new. 

This is a feature, not a bug: supporting expressions as statements is
necessary for expressions like:

    alist.sort()

and other expressions with side-effects. Unfortunately, that means that
pointless expressions like:

    42

that have no purpose are also legal.

In recent versions, the compiler has a peephole optimizer that removes
at least some constant expressions:

    # Python 3.5
    py> block = """x = 1
    ... 'some string'
    ... 100
    ... y = 2
    ... """
    py> code = compile(block, '', 'exec')
    py> from dis import dis
    py> dis(code)
      1           0 LOAD_CONST               0 (1)
                  3 STORE_NAME               0 (x)

      4           6 LOAD_CONST               1 (2)
                  9 STORE_NAME               1 (y)
                 12 LOAD_CONST               2 (None)
                 15 RETURN_VALUE

There's also a (weak) convention that bare string literals are intended
as pseudo-comments. That's especially handy with triple-quoted strings,
since they can comment-out multiple lines.

> Anyone else think it might be worth tightening up
> the grammar definition and parser a bit?

Not me. In fact, I'd go further than just saying "I don't think it is
worthwhile". I'll say that treating bare strings as pseudo-comments is a
positive feature worth keeping. Tightening up the grammar to prohibit
that is a bad thing.

There's an argument to be made that bare expressions like:

    100

are pointless, but it isn't a strong argument. In practice, it isn't
really a common source of errors, and as far as efficiency goes, the
peephole optimizer solves that.

And it's easy to get the rules wrong. For instance, at first I thought
that a bare name lookup like:

    x

could be safely optimized away, or prohibited, but it can't. It is true
that a successful name lookup will do nothing, but not all lookups are
successful:

    try:
        next
    except NameError:
        # Python version is too old
        def next(iterator):
            return iterator.next()

If we prohibit bare name lookups, that will break a lot of working code.

I suppose it is possible that a *sufficiently intelligent* compiler could recognise bare expressions that have no side-effects, and prohibit them, and that this might prevent some rare, occasional errors: x #= 1 # oops I meant += but honestly, I don't see that this is a good use of developer's time. It adds complexity to the language, risks false positives, and in my opinion is the sort of thing that is better flagged by a linter, not prohibited by the interpreter. -- Steve From steve at pearwood.info Mon May 15 09:00:15 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 15 May 2017 23:00:15 +1000 Subject: [Python-ideas] Tighten up the formal grammar and parsing a bit? In-Reply-To: References: Message-ID: <20170515130014.GN24625@ando.pearwood.info> On Mon, May 15, 2017 at 08:13:48PM +1000, Chris Angelico wrote: > On Mon, May 15, 2017 at 7:38 PM, Hugh Fisher wrote: > > I wrote this little Python program using CPython 3.5.2. It's ... > > interesting ... that we apparently don't need comments or pass > > statements any more. Anyone else think it might be worth tightening up > > the grammar definition and parser a bit? > > > > Nope. I agree with that. But not necessarily the following: > For starters, you shouldn't be using "pass" statements OR dummy > strings to fill in an if statement's body; you can instead simply > write: > > if x <= 0: > x += 1 That is often the case, but there are times where a condition is clearer with a pass statement followed by an else than by reversing the sense of the test. Or the pass might just be a place-holder: TDD often means that there's code where only one branch of an if works (and it's not necessarily the if branch). if condition: pass # will be fixed in the next iteration of TDD else: code There's also cases where if x > y: pass else: code is *not necessarily* the same as if not (x > y): code (x > y) is not always not(x <= y). E.g. sets, and even floats. 
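A small sketch of the set and float cases, under the assumption that the comparison being contrasted is the operator-flipped form (x <= y) and not the negated form not (x > y). The three function names are hypothetical and only label the three spellings.

```python
def with_pass(x, y):
    # The if/pass/else spelling.
    if x > y:
        pass
    else:
        return "ran"
    return "skipped"


def negated(x, y):
    # Negating the same test: evaluates x > y exactly as above.
    if not (x > y):
        return "ran"
    return "skipped"


def flipped(x, y):
    # Flipping the operator: a different comparison altogether.
    if x <= y:
        return "ran"
    return "skipped"


nan = float("nan")
cases = [({1}, {2}), (nan, 1.0)]
results = [(with_pass(x, y), negated(x, y), flipped(x, y)) for x, y in cases]
print(results)
```

For disjoint sets and for NaN both x > y and x <= y are false, so the first two spellings agree with each other while the flipped one takes the other branch.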
> For the rest, all you've shown is that trivial expressions consisting > only of string literals will be ignored in certain contexts. > The trouble is that string literals don't really mean comments, and won't > be ignored by most humans; Bare string literals do sometimes mean comments, and I should hope they aren't ignored by the reader! E.g. bare strings at the start of a module, class or function are docstrings, and even in the middle of the module or function, they are allowed. Guido has spoken! (Unless he's changed his mind since then :-) https://twitter.com/gvanrossum/status/112670605505077248 > plus, there are contexts where they are not ignored. Oh, and here I was thinking strings were ignored everywhere! print("hello world") # does nothing *wink* But seriously, of course *expression statements* which are string literals are not syntactically comments, but they can be, and are, treated as if they were. Just use a bit of common sense. Here, rewrite this without comments: > > wrong_answer_messages = [ > "Wrong.", > "Totally wrong, you moron.", > "Bob, you idiot, that answer is not right. Cordially, Ted.", # Maize > "That's as useful as a screen door on a battleship.", # BTTF > # etc > ] > > String literals won't work here, and even if they did, they would be > _extremely_ confusing. That's because the statement is an assignment statement, not an expression statement: https://docs.python.org/3/reference/simple_stmts.html#grammar-token-expression_stmt > Comments are semantically distinct. > > The 'pass' statement has a very specific meaning and only a few > use-cases. It could often be omitted in favour of something else, but > there's not a lot of value in doing so. Comments have very significant > value and should definitely be kept. Oh, I see where you are coming from! You have interpreted Hugh as suggesting that we remove pass and # comments from the language! 
I interpreted him as suggesting the opposite: that we tighten up the grammar to prohibit bare expressions, in order to prevent them from being used instead of pass or # comments. -- Steve From rosuav at gmail.com Mon May 15 09:17:48 2017 From: rosuav at gmail.com (Chris Angelico) Date: Mon, 15 May 2017 23:17:48 +1000 Subject: [Python-ideas] Tighten up the formal grammar and parsing a bit? In-Reply-To: <20170515130014.GN24625@ando.pearwood.info> References: <20170515130014.GN24625@ando.pearwood.info> Message-ID: On Mon, May 15, 2017 at 11:00 PM, Steven D'Aprano wrote: > On Mon, May 15, 2017 at 08:13:48PM +1000, Chris Angelico wrote: >> On Mon, May 15, 2017 at 7:38 PM, Hugh Fisher wrote: >> > I wrote this little Python program using CPython 3.5.2. It's ... >> > interesting ... that we apparently don't need comments or pass >> > statements any more. Anyone else think it might be worth tightening up >> > the grammar definition and parser a bit? >> > >> >> Nope. > > There's also cases where > > if x > y: > pass > else: > code > > is *not necessarily* the same as > > if not (x > y): > code > > (x > y) is not always not(x <= y). E.g. sets, and even floats. Uhm.... not sure what you're getting at here. I'm fully aware that: if x > y: pass else: code is not the same as: if x <= y: code but I don't know of any way that it could be different from: if not (x > y): code because that's going to evaluate (x > y) exactly the same way the original would, and then perform a boolean negation on it, which is exactly the same as the if/else will do. Or have I missed something here? >> The 'pass' statement has a very specific meaning and only a few >> use-cases. It could often be omitted in favour of something else, but >> there's not a lot of value in doing so. Comments have very significant >> value and should definitely be kept. > > Oh, I see where you are coming from! You have interpreted Hugh as > suggesting that we remove pass and # comments from the language! 
I > interpreted him as suggesting the opposite: that we tighten up the > grammar to prohibit bare expressions, in order to prevent them from > being used instead of pass or # comments. Yes, that was what I was interpreting his statements as. I now know better, so you can ignore a lot of my comments, which were about that :) So. Taking this the other way, that Hugh intended to make dumb code illegal: I think it's unnecessary, because linters and optimizers are better for detecting dead code; it's not something that often crops up as a bug anywhere. ChrisA From zauddelig at gmail.com Mon May 15 09:18:03 2017 From: zauddelig at gmail.com (Fabrizio Messina) Date: Mon, 15 May 2017 06:18:03 -0700 (PDT) Subject: [Python-ideas] NamedTuple Interface Message-ID: <9faf0702-c60b-49b6-a71f-35c7c2ef56d9@googlegroups.com> I think it would be nice to have a generic NamedTuple interface in python: from typing import NamedTupleType def test( arguments: NamedTupleType ) -> NamedTupleType: return SomeType(**NamedTupleType._asdict) The rationale is that named tuple exposes a common API, and it would be nice to have it readily available. -------------- next part -------------- An HTML attachment was scrubbed... URL: From storchaka at gmail.com Mon May 15 10:49:58 2017 From: storchaka at gmail.com (Serhiy Storchaka) Date: Mon, 15 May 2017 17:49:58 +0300 Subject: [Python-ideas] Tighten up the formal grammar and parsing a bit? In-Reply-To: <20170515130014.GN24625@ando.pearwood.info> References: <20170515130014.GN24625@ando.pearwood.info> Message-ID: On 15.05.17 16:00, Steven D'Aprano wrote: > There's also cases where > > if x > y: > pass > else: > code > > is *not necessarily* the same as > > if not (x > y): > code This is not true. 
if not cond: stmt1 else: stmt2 always is equivalent to if cond: stmt2 else: stmt1 From guido at python.org Mon May 15 11:29:21 2017 From: guido at python.org (Guido van Rossum) Date: Mon, 15 May 2017 08:29:21 -0700 Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects) In-Reply-To: References: <20170514070445.GF24625@ando.pearwood.info> Message-ID: This should be worked into a PEP, instead of living on as a bunch of python-ideas posts and blogs. I find the attrs documentation (and Glyph's blog post about it) almost unreadable because of the exalted language -- half the doc seems to be *selling* the library more than *explaining* it. If this style were to become common I would find it a disturbing trend. But having something alongside NamedTuple that helps you declare classes with mutable attributes using the new PEP 526 syntax (and maybe a few variants) would definitely be useful. Will someone please write a PEP? Very few of the specifics of attrs need be retained (its punny naming choices are too much for the stdlib). --Guido On Mon, May 15, 2017 at 4:05 AM, Nick Coghlan wrote: > On 14 May 2017 at 17:12, Abdur-Rahmaan Janhangeer > wrote: > > Whatever you all propose, > > > > coming from a java and c++ background, OOP in python is quite cumbersome. > > > > if you tell that i am not a python guy, then consider that current oop > style > > does not reflect python's style of ease and simplicity > > > > is __init__ really a good syntax choice? > > That's a different question, and one with a well-structured third > party solution: https://attrs.readthedocs.io/en/stable/ > > See https://mail.python.org/pipermail/python-ideas/2017-April/045514.html > for some ideas on how something like attrs might be adapted to provide > better standard library tooling for more concise and readable class > definitions. > > Cheers, > Nick. 
> > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From rymg19 at gmail.com Mon May 15 11:46:46 2017 From: rymg19 at gmail.com (Ryan Gonzalez) Date: Mon, 15 May 2017 10:46:46 -0500 Subject: [Python-ideas] Tighten up the formal grammar and parsing a bit? In-Reply-To: References: <20170515130014.GN24625@ando.pearwood.info> Message-ID: I guess maybe if you overload the operators to return broken objects, maybe then they would be different? -- Ryan (????) Yoko Shimomura > ryo (supercell/EGOIST) > Hiroyuki Sawano >> everyone else http://refi64.com On May 15, 2017 9:50 AM, "Serhiy Storchaka" wrote: > On 15.05.17 16:00, Steven D'Aprano wrote: > >> There's also cases where >> >> if x > y: >> pass >> else: >> code >> >> is *not necessarily* the same as >> >> if not (x > y): >> code >> > > This is not true. > > if not cond: > stmt1 > else: > stmt2 > > always is equivalent to > > if cond: > stmt2 > else: > stmt1 > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From elazarg at gmail.com  Mon May 15 12:03:13 2017
From: elazarg at gmail.com (אלעזר)
Date: Mon, 15 May 2017 16:03:13 +0000
Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects)
In-Reply-To:
References: <20170514070445.GF24625@ando.pearwood.info>
Message-ID:

On Mon, May 15, 2017 at 6:30 PM Guido van Rossum wrote:
> This should be worked into a PEP, instead of living on as a bunch of
> python-ideas posts and blogs.
...
> Will someone please write a PEP?

If by "this" you mean adding to stdlib something like

    @record
    class Point:
        x: int
        y: int

or something along the lines of my "modifiers" proposal (
https://mail.python.org/pipermail//python-ideas/2016-September/042360.html),
then I think I would like to help writing such a PEP. But I thought these
proposals were rejected.

Elazar

From pavol.lisy at gmail.com  Mon May 15 12:21:50 2017
From: pavol.lisy at gmail.com (Pavol Lisy)
Date: Mon, 15 May 2017 18:21:50 +0200
Subject: [Python-ideas] Tighten up the formal grammar and parsing a bit?
In-Reply-To:
References: <20170515130014.GN24625@ando.pearwood.info>
Message-ID:

Something broken like this?

    import inspect

    def cond():
        if 'not cond' in inspect.stack()[1].code_context[0]:
            return False
        return True

    if cond():
        print('yes')
    else:
        print('no')

    if not cond():
        print('no')
    else:
        print('yes')

On 5/15/17, Ryan Gonzalez wrote:
> I guess maybe if you overload the operators to return broken objects, maybe
> then they would be different?
>
> --
> Ryan (????)
> Yoko Shimomura > ryo (supercell/EGOIST) > Hiroyuki Sawano >> everyone else > http://refi64.com > > On May 15, 2017 9:50 AM, "Serhiy Storchaka" wrote: > >> On 15.05.17 16:00, Steven D'Aprano wrote: >> >>> There's also cases where >>> >>> if x > y: >>> pass >>> else: >>> code >>> >>> is *not necessarily* the same as >>> >>> if not (x > y): >>> code >>> >> >> This is not true. >> >> if not cond: >> stmt1 >> else: >> stmt2 >> >> always is equivalent to >> >> if cond: >> stmt2 >> else: >> stmt1 >> >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > From guido at python.org Mon May 15 12:22:17 2017 From: guido at python.org (Guido van Rossum) Date: Mon, 15 May 2017 09:22:17 -0700 Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects) In-Reply-To: References: <20170514070445.GF24625@ando.pearwood.info> Message-ID: I expect that we will need someone with a really good sensibility for Pythonic language/API design to lead the PEP writing. On Mon, May 15, 2017 at 9:03 AM, ????? wrote: > On Mon, May 15, 2017 at 6:30 PM Guido van Rossum wrote: > > This should be worked into a PEP, instead of living on as a bunch of > python-ideas posts and blogs. > ... > > Will someone please write a PEP? > > If by "this" you mean adding to stdlib something like > > @record > class Point: > x: int > y: int > > or something along the lines of my "modifiers" proposal ( > https://mail.python.org/pipermail//python-ideas/2016-September/042360.html), > then I think I would like to help writing such a PEP. But I thought these > proposals were rejected. > > Elazar > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From brett at python.org Mon May 15 12:50:04 2017 From: brett at python.org (Brett Cannon) Date: Mon, 15 May 2017 16:50:04 +0000 Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects) In-Reply-To: References: <20170514070445.GF24625@ando.pearwood.info> Message-ID: On Mon, 15 May 2017 at 08:30 Guido van Rossum wrote: > This should be worked into a PEP, instead of living on as a bunch of > python-ideas posts and blogs. > > I find the attrs documentation (and Glyph's blog post about it) almost > unreadable because of the exalted language -- half the doc seems to be > *selling* the library more than *explaining* it. If this style were to > become common I would find it a disturbing trend. > > But having something alongside NamedTuple that helps you declare classes > with mutable attributes using the new PEP 526 syntax (and maybe a few > variants) would definitely be useful. Will someone please write a PEP? Very > few of the specifics of attrs need be retained (its punny naming choices > are too much for the stdlib). > In case someone decides to take this on, I wrote a blog post back in March that shows how to use __init_subclass__() to do a rough approximation of what Guido is suggesting: https://snarky.ca/customizing-class-creation-in-python/ . Based on my thinking on the topic while writing my blog post, the tricky bit is going to be deciding how to handle default values (i.e. if you set a default value like `attr: int = 42` on the class definition then you have `cls.attr` exist which might not be what you want if you would rather have the default value explicitly set on every instance but not fall through to the class (e.g. `del ins.attr; ins.attr` raises an AttributeError instead of falling through to `cls.attr`). You could remove the default from the class in your __init_subclass__(), but then you have to decide if that's too unexpected/magical for someone looking at the code. 
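A rough sketch of the second option described above, where __init_subclass__() strips annotated defaults off the class so they exist only on instances. It assumes Python 3.6 or later; the names Record, _fields and _defaults are hypothetical and are not taken from the blog post or from attrs.

```python
class Record:
    """Hypothetical base class, for illustration only."""

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        annotations = getattr(cls, "__annotations__", {})
        defaults = {}
        for name in annotations:
            if name in cls.__dict__:
                # Remember the default, then remove it from the class so
                # instance lookups cannot fall through to cls.<name>.
                defaults[name] = cls.__dict__[name]
                delattr(cls, name)
        cls._fields = tuple(annotations)
        cls._defaults = defaults

    def __init__(self, **values):
        for name in self._fields:
            if name in values:
                setattr(self, name, values[name])
            elif name in self._defaults:
                setattr(self, name, self._defaults[name])
            else:
                raise TypeError("missing value for %r" % name)


class Point(Record):
    x: int
    y: int = 42


p = Point(x=1)
print(p.x, p.y, hasattr(Point, "y"))
```

With this layout, `del p.y; p.y` raises AttributeError instead of quietly returning 42 from the class, which is the behaviour that may or may not feel too magical.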
And I too would be interested in seeing something like this, if for any other reason than to help people not to misuse NamedTuple for quick-and-dirty data objects in new APIs (NamedTuple is meant to help move old-style tuple-based APIs to a class-based one). -Brett > > --Guido > > On Mon, May 15, 2017 at 4:05 AM, Nick Coghlan wrote: > >> On 14 May 2017 at 17:12, Abdur-Rahmaan Janhangeer >> wrote: >> > Whatever you all propose, >> > >> > coming from a java and c++ background, OOP in python is quite >> cumbersome. >> > >> > if you tell that i am not a python guy, then consider that current oop >> style >> > does not reflect python's style of ease and simplicity >> > >> > is __init__ really a good syntax choice? >> >> That's a different question, and one with a well-structured third >> party solution: https://attrs.readthedocs.io/en/stable/ >> >> See https://mail.python.org/pipermail/python-ideas/2017-April/045514.html >> for some ideas on how something like attrs might be adapted to provide >> better standard library tooling for more concise and readable class >> definitions. >> >> Cheers, >> Nick. >> >> -- >> Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > > > > -- > --Guido van Rossum (python.org/~guido) > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From guido at python.org Mon May 15 13:31:46 2017 From: guido at python.org (Guido van Rossum) Date: Mon, 15 May 2017 10:31:46 -0700 Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects) In-Reply-To: References: <20170514070445.GF24625@ando.pearwood.info> Message-ID: On Mon, May 15, 2017 at 9:50 AM, Brett Cannon wrote: > > On Mon, 15 May 2017 at 08:30 Guido van Rossum wrote: > >> This should be worked into a PEP, instead of living on as a bunch of >> python-ideas posts and blogs. >> >> I find the attrs documentation (and Glyph's blog post about it) almost >> unreadable because of the exalted language -- half the doc seems to be >> *selling* the library more than *explaining* it. If this style were to >> become common I would find it a disturbing trend. >> >> But having something alongside NamedTuple that helps you declare classes >> with mutable attributes using the new PEP 526 syntax (and maybe a few >> variants) would definitely be useful. Will someone please write a PEP? Very >> few of the specifics of attrs need be retained (its punny naming choices >> are too much for the stdlib). >> > > In case someone decides to take this on, I wrote a blog post back in March > that shows how to use __init_subclass__() to do a rough approximation of > what Guido is suggesting: https://snarky.ca/customizing-class-creation-in- > python/ . > > Based on my thinking on the topic while writing my blog post, the tricky > bit is going to be deciding how to handle default values (i.e. if you set a > default value like `attr: int = 42` on the class definition then you have > `cls.attr` exist which might not be what you want if you would rather have > the default value explicitly set on every instance but not fall through to > the class (e.g. `del ins.attr; ins.attr` raises an AttributeError instead > of falling through to `cls.attr`). 
You could remove the default from the
> class in your __init_subclass__(), but then you have to decide if that's
> too unexpected/magical for someone looking at the code.
>

I would personally prefer the initializer to stay in the class in cases like this. If the initializer needs to be a default instance of a mutable class (e.g. an empty list or dict) there could be a special marker to indicate that, e.g.

    attacks: List[int] = MAKE_NEW  # Creates a new [] for each instance

while if the default needs to be something more custom it could be a similar marker with a callable argument, e.g.

    fleet: Dict[str, str] = MAKE_NEW(lambda: {'flagship': 'Enterprise'})

I would prefer not to have cleverness like initialization with a callable automatically does something different.

> And I too would be interested in seeing something like this, if for any
> other reason than to help people not to misuse NamedTuple for
> quick-and-dirty data objects in new APIs (NamedTuple is meant to help move
> old-style tuple-based APIs to a class-based one).
>

Not sure I agree that is its only purpose.

--
--Guido van Rossum (python.org/~guido)

From storchaka at gmail.com Mon May 15 13:51:27 2017
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Mon, 15 May 2017 20:51:27 +0300
Subject: [Python-ideas] Tighten up the formal grammar and parsing a bit?
In-Reply-To:
References: <20170515130014.GN24625@ando.pearwood.info>
Message-ID:

On 15.05.17 18:46, Ryan Gonzalez wrote:
> I guess maybe if you overload the operators to return broken objects,
> maybe then they would be different?

No. The compiler generates an equivalent bytecode for both cases.
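One way to check the claim about overloaded operators is to compare what the two spellings actually do to a deliberately odd object. The sketch below is an illustration only: `Weird`, `Odd`, `form_a` and `form_b` are hypothetical names, and it compares observable behaviour (which special methods get called, and in what order) rather than the bytecode itself.

```python
class Weird:
    """Comparison result that records every truth test made on it."""

    def __init__(self, value, log):
        self.value = value
        self.log = log

    def __bool__(self):
        self.log.append("bool")
        return self.value


class Odd:
    """Operand whose > operator returns a Weird object instead of a bool."""

    def __init__(self, result, log):
        self.result = result
        self.log = log

    def __gt__(self, other):
        self.log.append("gt")
        return Weird(self.result, self.log)


def form_a(x, y):
    if x > y:
        pass
    else:
        return "code"
    return "skipped"


def form_b(x, y):
    if not (x > y):
        return "code"
    return "skipped"


for result in (True, False):
    log_a, log_b = [], []
    a = form_a(Odd(result, log_a), 0)
    b = form_b(Odd(result, log_b), 0)
    print(result, a, b, log_a, log_b)
```

In both forms the comparison runs once and its result is truth-tested once, so the "broken object" has no opportunity to make the two branches diverge.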
From brett at python.org Mon May 15 14:18:10 2017 From: brett at python.org (Brett Cannon) Date: Mon, 15 May 2017 18:18:10 +0000 Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects) In-Reply-To: References: <20170514070445.GF24625@ando.pearwood.info> Message-ID: On Mon, 15 May 2017 at 10:32 Guido van Rossum wrote: > On Mon, May 15, 2017 at 9:50 AM, Brett Cannon wrote: > >> >> On Mon, 15 May 2017 at 08:30 Guido van Rossum wrote: >> >>> This should be worked into a PEP, instead of living on as a bunch of >>> python-ideas posts and blogs. >>> >>> I find the attrs documentation (and Glyph's blog post about it) almost >>> unreadable because of the exalted language -- half the doc seems to be >>> *selling* the library more than *explaining* it. If this style were to >>> become common I would find it a disturbing trend. >>> >>> But having something alongside NamedTuple that helps you declare classes >>> with mutable attributes using the new PEP 526 syntax (and maybe a few >>> variants) would definitely be useful. Will someone please write a PEP? Very >>> few of the specifics of attrs need be retained (its punny naming choices >>> are too much for the stdlib). >>> >> >> In case someone decides to take this on, I wrote a blog post back in >> March that shows how to use __init_subclass__() to do a rough approximation >> of what Guido is suggesting: >> https://snarky.ca/customizing-class-creation-in-python/ . >> >> Based on my thinking on the topic while writing my blog post, the tricky >> bit is going to be deciding how to handle default values (i.e. if you set a >> default value like `attr: int = 42` on the class definition then you have >> `cls.attr` exist which might not be what you want if you would rather have >> the default value explicitly set on every instance but not fall through to >> the class (e.g. 
`del ins.attr; ins.attr` raises an AttributeError instead >> of falling through to `cls.attr`). You could remove the default from the >> class in your __init_subclass__(), but then you have to decide if that's >> too unexpected/magical for someone looking at the code. >> > > I would personally prefer the initializer to stay in the class in cases > like this. If the initializer needs to be a default instance of a mutable > class (e.g. an empty list or dict) there could be a special marker to > indicate that, e.g. > > attacks: List[int] = MAKE_NEW # Creates a new [] for each instance > > while if the default needs to be something more custom it could be a > similar marker with a callable argument, e.g. > > fleet: Dict[str, str] = MAKE_NEW(lambda: {'flagship': 'Enterprise'}) > > I would prefer not to have cleverness like initialization with a callable > automatically does something different. > So if I'm understanding your idea correctly: class Foo(DataClass): attr: int = 42 would leave Foo.attr alone, but: class Foo(DataClass): attr: int = MAKE_NEW(42) would be the way to flag that `Foo.attr` shouldn't exist (I'm assuming both options would flag that there should be an `attr` argument to __init__())? > > >> And I too would be interested in seeing something like this, if for any >> other reason than to help people not to misuse NamedTuple for >> quick-and-dirty data objects in new APIs (NamedTuple is meant to help move >> old-style tuple-based APIs to a class-based one). >> > > Not sure I agree that is its only purpose. > My typical thinking on this is I don't want the tuple API that comes with NamedTuple for new APIs, and so that's when I reach for types.SimpleNamespace and have a function that controls the constructor so I can provide a concrete initializer API (e.g. `def foo(a, b): return types.SimpleNamespace(a=a, b=b)`). -Brett > > -- > --Guido van Rossum (python.org/~guido) > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From levkivskyi at gmail.com Tue May 16 10:52:22 2017 From: levkivskyi at gmail.com (Ivan Levkivskyi) Date: Tue, 16 May 2017 16:52:22 +0200 Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects) In-Reply-To: References: <20170514070445.GF24625@ando.pearwood.info> Message-ID: On 15 May 2017 at 18:22, Guido van Rossum wrote: > I expect that we will need someone with a really good sensibility for > Pythonic language/API design to lead the PEP writing. > I probably don't have good sensibility for Pythonic API design yet (and I am more focused on PEP 544) so I cannot lead this, but I would like to actively participate in writing. -- Ivan -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Tue May 16 10:53:50 2017 From: guido at python.org (Guido van Rossum) Date: Tue, 16 May 2017 07:53:50 -0700 Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects) In-Reply-To: References: <20170514070445.GF24625@ando.pearwood.info> Message-ID: I could also try this myself in my spare time at PyCon (surprisingly, I have some!). It sounds kind of interesting. However I've never used the 'attrs' package... On Tue, May 16, 2017 at 7:52 AM, Ivan Levkivskyi wrote: > On 15 May 2017 at 18:22, Guido van Rossum wrote: > >> I expect that we will need someone with a really good sensibility for >> Pythonic language/API design to lead the PEP writing. >> > > I probably don't have good sensibility for Pythonic API design yet (and I > am more focused on PEP 544) so I cannot lead this, > but I would like to actively participate in writing. > > -- > Ivan > > > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From eric at trueblade.com Tue May 16 10:57:12 2017 From: eric at trueblade.com (Eric V. 
Smith) Date: Tue, 16 May 2017 10:57:12 -0400 Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects) In-Reply-To: References: <20170514070445.GF24625@ando.pearwood.info> Message-ID: <461FC0B6-0124-4085-BB80-2268BA582E91@trueblade.com> I wouldn't mind discussing it at PyCon. I'm just starting a project to switch to attrs, and I'm reasonably familiar with it. -- Eric. > On May 16, 2017, at 10:53 AM, Guido van Rossum wrote: > > I could also try this myself in my spare time at PyCon (surprisingly, I have some!). It sounds kind of interesting. However I've never used the 'attrs' package... > >> On Tue, May 16, 2017 at 7:52 AM, Ivan Levkivskyi wrote: >>> On 15 May 2017 at 18:22, Guido van Rossum wrote: >>> I expect that we will need someone with a really good sensibility for Pythonic language/API design to lead the PEP writing. >> >> I probably don't have good sensibility for Pythonic API design yet (and I am more focused on PEP 544) so I cannot lead this, >> but I would like to actively participate in writing. >> >> -- >> Ivan >> >> > > > > -- > --Guido van Rossum (python.org/~guido) > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Tue May 16 10:59:27 2017 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 17 May 2017 00:59:27 +1000 Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects) In-Reply-To: References: <20170514070445.GF24625@ando.pearwood.info> Message-ID: On Wed, May 17, 2017 at 12:53 AM, Guido van Rossum wrote: > I could also try this myself in my spare time at PyCon (surprisingly, I have > some!). It sounds kind of interesting. 
However I've never used the 'attrs' > package... Me neither, so I'm not really an ideal person to head this up. Is there anyone who (a) knows what is and isn't Pythonic, (b) has used 'attrs', and (c) has spare time? It's not an easy trifecta but we can hope! ChrisA From gvanrossum at gmail.com Tue May 16 11:07:50 2017 From: gvanrossum at gmail.com (Guido van Rossum) Date: Tue, 16 May 2017 08:07:50 -0700 Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects) In-Reply-To: References: <20170514070445.GF24625@ando.pearwood.info> Message-ID: Maybe Lukasz is interested? On May 16, 2017 8:00 AM, "Chris Angelico" wrote: > On Wed, May 17, 2017 at 12:53 AM, Guido van Rossum > wrote: > > I could also try this myself in my spare time at PyCon (surprisingly, I > have > > some!). It sounds kind of interesting. However I've never used the > 'attrs' > > package... > > Me neither, so I'm not really an ideal person to head this up. Is > there anyone who (a) knows what is and isn't Pythonic, (b) has used > 'attrs', and (c) has spare time? It's not an easy trifecta but we can > hope! > > ChrisA > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tinchester at gmail.com Tue May 16 12:58:03 2017 From: tinchester at gmail.com (=?UTF-8?Q?Tin_Tvrtkovi=C4=87?=) Date: Tue, 16 May 2017 18:58:03 +0200 Subject: [Python-ideas] Python-ideas Digest, Vol 126, Issue 35 In-Reply-To: References: Message-ID: I have it on good authority both Hynek (the author of attrs) and Glyph will be attending PyCon. I think it'd be a shame if they weren't involved with this effort somehow. 
> Message: 2 > Date: Tue, 16 May 2017 07:53:50 -0700 > From: Guido van Rossum > To: Ivan Levkivskyi > Cc: ????? , "python-ideas at python.org" > > Subject: Re: [Python-ideas] JavaScript-Style Object Creation in Python > (using a constructor function instead of a class to create objects) > Message-ID: > kmcD7hSKA at mail.gmail.com> > Content-Type: text/plain; charset="utf-8" > > I could also try this myself in my spare time at PyCon (surprisingly, I > have some!). It sounds kind of interesting. However I've never used the > 'attrs' package... > > On Tue, May 16, 2017 at 7:52 AM, Ivan Levkivskyi > wrote: > > > On 15 May 2017 at 18:22, Guido van Rossum wrote: > > > >> I expect that we will need someone with a really good sensibility for > >> Pythonic language/API design to lead the PEP writing. > >> > > > > I probably don't have good sensibility for Pythonic API design yet (and I > > am more focused on PEP 544) so I cannot lead this, > > but I would like to actively participate in writing. > > > > -- > > Ivan > > > > > > > > > -- > --Guido van Rossum (python.org/~guido) > -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Tue May 16 14:10:40 2017 From: brett at python.org (Brett Cannon) Date: Tue, 16 May 2017 18:10:40 +0000 Subject: [Python-ideas] Python-ideas Digest, Vol 126, Issue 35 In-Reply-To: References: Message-ID: I'm sure at least Hynek will be told about this so he can provide input based on his experience (but I also know my rough approximation of this idea somewhat horrified him so I don't know he supportive he will be either :) . On Tue, 16 May 2017 at 09:58 Tin Tvrtkovi? wrote: > I have it on good authority both Hynek (the author of attrs) and Glyph > will be attending PyCon. I think it'd be a shame if they weren't involved > with this effort somehow. > > >> Message: 2 >> Date: Tue, 16 May 2017 07:53:50 -0700 >> From: Guido van Rossum >> To: Ivan Levkivskyi >> Cc: ????? 
, "python-ideas at python.org" >> >> Subject: Re: [Python-ideas] JavaScript-Style Object Creation in Python >> (using a constructor function instead of a class to create >> objects) >> Message-ID: >> < >> CAP7+vJJynk+mQTJfVxaVxXO5QNbE2sU0MrXhJZwePkmcD7hSKA at mail.gmail.com> >> Content-Type: text/plain; charset="utf-8" >> >> I could also try this myself in my spare time at PyCon (surprisingly, I >> have some!). It sounds kind of interesting. However I've never used the >> 'attrs' package... >> >> On Tue, May 16, 2017 at 7:52 AM, Ivan Levkivskyi >> wrote: >> >> > On 15 May 2017 at 18:22, Guido van Rossum wrote: >> > >> >> I expect that we will need someone with a really good sensibility for >> >> Pythonic language/API design to lead the PEP writing. >> >> >> > >> > I probably don't have good sensibility for Pythonic API design yet (and >> I >> > am more focused on PEP 544) so I cannot lead this, >> > but I would like to actively participate in writing. >> > >> > -- >> > Ivan >> > >> > >> > >> >> >> -- >> --Guido van Rossum (python.org/~guido) >> > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Tue May 16 14:08:47 2017 From: brett at python.org (Brett Cannon) Date: Tue, 16 May 2017 18:08:47 +0000 Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects) In-Reply-To: References: <20170514070445.GF24625@ando.pearwood.info> Message-ID: Maybe we can bring this up as a lightning talk at the language summit to see who in the room has the appropriate background knowledge? And obviously someone can talk to Hynek to see if he wants to provide input based on community feedback for attrs and lessons learned. 
On Tue, 16 May 2017 at 08:11 Guido van Rossum wrote: > Maybe Lukasz is interested? > > On May 16, 2017 8:00 AM, "Chris Angelico" wrote: > >> On Wed, May 17, 2017 at 12:53 AM, Guido van Rossum >> wrote: >> > I could also try this myself in my spare time at PyCon (surprisingly, I >> have >> > some!). It sounds kind of interesting. However I've never used the >> 'attrs' >> > package... >> >> Me neither, so I'm not really an ideal person to head this up. Is >> there anyone who (a) knows what is and isn't Pythonic, (b) has used >> 'attrs', and (c) has spare time? It's not an easy trifecta but we can >> hope! >> >> ChrisA >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stephanh42 at gmail.com Tue May 16 15:18:21 2017 From: stephanh42 at gmail.com (Stephan Houben) Date: Tue, 16 May 2017 21:18:21 +0200 Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects) In-Reply-To: References: <20170514070445.GF24625@ando.pearwood.info> Message-ID: Hi all, Thanks to this thread I learned about the "attrs" library. I am a heavy namedtuple (ab)user but I think I will be using attrs going forward. If something like attrs would made it in the standard library it would be awesome. Thanks, Stephan 2017-05-16 20:08 GMT+02:00 Brett Cannon : > Maybe we can bring this up as a lightning talk at the language summit to see > who in the room has the appropriate background knowledge? 
And obviously > someone can talk to Hynek to see if he wants to provide input based on > community feedback for attrs and lessons learned. > > On Tue, 16 May 2017 at 08:11 Guido van Rossum wrote: >> >> Maybe Lukasz is interested? >> >> On May 16, 2017 8:00 AM, "Chris Angelico" wrote: >>> >>> On Wed, May 17, 2017 at 12:53 AM, Guido van Rossum >>> wrote: >>> > I could also try this myself in my spare time at PyCon (surprisingly, I >>> > have >>> > some!). It sounds kind of interesting. However I've never used the >>> > 'attrs' >>> > package... >>> >>> Me neither, so I'm not really an ideal person to head this up. Is >>> there anyone who (a) knows what is and isn't Pythonic, (b) has used >>> 'attrs', and (c) has spare time? It's not an easy trifecta but we can >>> hope! >>> >>> ChrisA >>> _______________________________________________ >>> Python-ideas mailing list >>> Python-ideas at python.org >>> https://mail.python.org/mailman/listinfo/python-ideas >>> Code of Conduct: http://python.org/psf/codeofconduct/ >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > From gvanrossum at gmail.com Tue May 16 17:04:36 2017 From: gvanrossum at gmail.com (Guido van Rossum) Date: Tue, 16 May 2017 14:04:36 -0700 Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects) In-Reply-To: References: <20170514070445.GF24625@ando.pearwood.info> Message-ID: Stephen, What features of attrs specifically solve your use cases? 
--Guido On Tue, May 16, 2017 at 12:18 PM, Stephan Houben wrote: > Hi all, > > Thanks to this thread I learned about the "attrs" library. I am a > heavy namedtuple (ab)user but I think > I will be using attrs going forward. > > If something like attrs would made it in the standard library it would > be awesome. > > Thanks, > > Stephan > > 2017-05-16 20:08 GMT+02:00 Brett Cannon : > > Maybe we can bring this up as a lightning talk at the language summit to > see > > who in the room has the appropriate background knowledge? And obviously > > someone can talk to Hynek to see if he wants to provide input based on > > community feedback for attrs and lessons learned. > > > > On Tue, 16 May 2017 at 08:11 Guido van Rossum > wrote: > >> > >> Maybe Lukasz is interested? > >> > >> On May 16, 2017 8:00 AM, "Chris Angelico" wrote: > >>> > >>> On Wed, May 17, 2017 at 12:53 AM, Guido van Rossum > >>> wrote: > >>> > I could also try this myself in my spare time at PyCon > (surprisingly, I > >>> > have > >>> > some!). It sounds kind of interesting. However I've never used the > >>> > 'attrs' > >>> > package... > >>> > >>> Me neither, so I'm not really an ideal person to head this up. Is > >>> there anyone who (a) knows what is and isn't Pythonic, (b) has used > >>> 'attrs', and (c) has spare time? It's not an easy trifecta but we can > >>> hope! 
> >>> > >>> ChrisA > >>> _______________________________________________ > >>> Python-ideas mailing list > >>> Python-ideas at python.org > >>> https://mail.python.org/mailman/listinfo/python-ideas > >>> Code of Conduct: http://python.org/psf/codeofconduct/ > >> > >> _______________________________________________ > >> Python-ideas mailing list > >> Python-ideas at python.org > >> https://mail.python.org/mailman/listinfo/python-ideas > >> Code of Conduct: http://python.org/psf/codeofconduct/ > > > > > > _______________________________________________ > > Python-ideas mailing list > > Python-ideas at python.org > > https://mail.python.org/mailman/listinfo/python-ideas > > Code of Conduct: http://python.org/psf/codeofconduct/ > > > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Tue May 16 19:23:38 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 17 May 2017 09:23:38 +1000 Subject: [Python-ideas] PEP 526: why ClassVar instead of ClassAttr? Message-ID: <20170516232338.GO24625@ando.pearwood.info> Since PEP 526 is already provisionally accepted, it may be too late to bring this up, but I have a question and suggestion about the name ClassVar. I've read the PEP but didn't see an answer or rejection to this. https://www.python.org/dev/peps/pep-0526/ Why choose ClassVar over ClassAttr when the usual terminology used in the Python community is class and instance *attributes* rather than "variables"? I understand that, in a sense, attributes are variables (unless they're constants *wink*) but the term "class variable" sounds very Java-esque rather than Pythonic. 
And it is an uncomfortable fit with a language like Python where classes are first class values like ints, strings, floats etc: - we talk about a string variable meaning a variable holding a string; - a float variable is a variable holding a float; - a list variable is a variable holding a list; - so a class variable ought to be a variable holding a class. I get the intention: we have local, global, instance and class variables. But I feel that grouping instance/class with local/global is too abstract and "computer sciencey": in practice, instance/class vars are used in ways which are different enough from global/local vars that they deserve a different name: attributes, members or properties are common choices. (Python of course uses attributes, and properties for a particular kind of computed attribute.) This introduces split terminology: we now talk about annotating class attributes with ClassVar. Since there's no GlobalVar, NonLocalVar or LocalVar, there doesn't seem to be any good reason to stick with the FooVar naming system. Can we change the annotation to ClassAttr instead? -- Steve From guido at python.org Tue May 16 22:10:03 2017 From: guido at python.org (Guido van Rossum) Date: Tue, 16 May 2017 19:10:03 -0700 Subject: [Python-ideas] PEP 526: why ClassVar instead of ClassAttr? In-Reply-To: <20170516232338.GO24625@ando.pearwood.info> References: <20170516232338.GO24625@ando.pearwood.info> Message-ID: It's "class variable" because we (at least I) also routinely use "instance variable". On Tue, May 16, 2017 at 4:23 PM, Steven D'Aprano wrote: > Since PEP 526 is already provisionally accepted, it may be too late to > bring this up, but I have a question and suggestion about the name > ClassVar. I've read the PEP but didn't see an answer or rejection to > this. 
> > https://www.python.org/dev/peps/pep-0526/ > > Why choose ClassVar over ClassAttr when the usual terminology used in > the Python community is class and instance *attributes* rather than > "variables"? > > I understand that, in a sense, attributes are variables (unless they're > constants *wink*) but the term "class variable" sounds very Java-esque > rather than Pythonic. And it is an uncomfortable fit with a language > like Python where classes are first class values like ints, strings, > floats etc: > > - we talk about a string variable meaning a variable holding a string; > - a float variable is a variable holding a float; > - a list variable is a variable holding a list; > - so a class variable ought to be a variable holding a class. > > I get the intention: we have local, global, instance and class > variables. But I feel that grouping instance/class with local/global is > too abstract and "computer sciencey": in practice, instance/class vars > are used in ways which are different enough from global/local vars that > they deserve a different name: attributes, members or properties are > common choices. > > (Python of course uses attributes, and properties for a particular kind > of computed attribute.) > > This introduces split terminology: we now talk about annotating class > attributes with ClassVar. Since there's no GlobalVar, NonLocalVar or > LocalVar, there doesn't seem to be any good reason to stick with the > FooVar naming system. > > Can we change the annotation to ClassAttr instead? > > > > -- > Steve > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
From apalala at gmail.com Tue May 16 22:54:27 2017
From: apalala at gmail.com (Juancarlo Añez)
Date: Tue, 16 May 2017 22:54:27 -0400
Subject: [Python-ideas] PEP 526: why ClassVar instead of ClassAttr?
In-Reply-To:
References: <20170516232338.GO24625@ando.pearwood.info>
Message-ID:

On Tue, May 16, 2017 at 10:10 PM, Guido van Rossum wrote:

> It's "class variable" because we (at least I) also routinely use "instance
> variable".

It is `getattr()`, `setattr()`, and a very long etc. in Python.

I agree with the OP that a sudden talk about "vars" is confusing, more so when Python doesn't have "vars", but "names" (etc.).

Cheers!

--
Juancarlo *Añez*

From apalala at gmail.com Tue May 16 23:14:14 2017
From: apalala at gmail.com (Juancarlo Añez)
Date: Tue, 16 May 2017 23:14:14 -0400
Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects)
In-Reply-To:
References: <20170514070445.GF24625@ando.pearwood.info>
Message-ID:

On Tue, May 16, 2017 at 5:04 PM, Guido van Rossum wrote:

> What features of attrs specifically solve your use cases?

(not Stephen)

I hadn't thought about this use case:

In [1]: class C():
   ...:     x = 1
   ...:
   ...:     def __init__(self, x=None):
   ...:         if x is not None:
   ...:             self.x = x
   ...:
   ...:     def __str__(self):
   ...:         return 'C(%s)' % self.x
   ...:

In [2]: c1 = C()
   ...: c2 = C(2)
   ...:

In [3]: print(c1, c2)
C(1) C(2)

And I might use it from here on.

What I like about attrs is:

- The class level declaration of instance attributes
- That the reasonable __init__, __repr__, and __eq__ are generated

I don't like the excessive wordiness in attrs, and I don't need "the kitchen sink" to be available to have instance attributes declared at the class level. A solution based on the typing module would be much better.

Basically, Python is lacking a way to declare instance fields with default values away from the initializer. Several of the mainstream OO languages (Java, Swift, Go) provide for that.

I haven't thought much about this, except about if there's indeed a need (and there is), but I can't know if the solution is through decorators, or inheritance.

--
Juancarlo *Añez*

From guido at python.org Wed May 17 01:11:31 2017
From: guido at python.org (Guido van Rossum)
Date: Tue, 16 May 2017 22:11:31 -0700
Subject: [Python-ideas] PEP 526: why ClassVar instead of ClassAttr?
In-Reply-To:
References: <20170516232338.GO24625@ando.pearwood.info>
Message-ID:

There was another reason too. Many things are "class attributes" e.g. methods, descriptors. But only specific things are class *variables*.

Trust me, we debated this when the PEP was drafted. ClassVar is better than ClassAttr.

On Tue, May 16, 2017 at 7:54 PM, Juancarlo Añez wrote:

>
> On Tue, May 16, 2017 at 10:10 PM, Guido van Rossum
> wrote:
>
>> It's "class variable" because we (at least I) also routinely use
>> "instance variable".
>
>
> It is `getattr()`, `setattr()`, and a very long etc. in Python.
>
> I agree with the OP that a sudden talk about "vars" is confusing, more so
> when Python doesn't have "vars", but "names" (etc.).
>
> Cheers!
>
>
> --
> Juancarlo *Añez*
>

--
--Guido van Rossum (python.org/~guido)

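The distinction drawn in the ClassVar message (many things are class attributes, only some are class variables) might be illustrated like this. It is a sketch under assumptions: the `Starship` class is loosely modelled on the kind of example PEP 526 uses, `class_variables` is a hypothetical helper, and `typing.get_origin` needs Python 3.8 or later.

```python
from typing import ClassVar, Dict, get_origin, get_type_hints


class Starship:
    stats: ClassVar[Dict[str, int]] = {}  # class variable, shared by all instances
    damage: int = 10                      # instance variable with a default value

    def hit(self):                        # a class attribute, but not a class variable
        self.damage -= 1


def class_variables(cls):
    """Return the names that are annotated as ClassVar on cls (hypothetical helper)."""
    return [
        name
        for name, hint in get_type_hints(cls).items()
        if get_origin(hint) is ClassVar
    ]


print(class_variables(Starship))
print("hit" in vars(Starship), "hit" in get_type_hints(Starship))
```

`hit` lives in the class namespace like `stats` does, yet only `stats` is reported, which is the narrower sense of "class variable" that the annotation name refers to.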
From gvanrossum at gmail.com Wed May 17 01:22:00 2017
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Tue, 16 May 2017 22:22:00 -0700
Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects)
In-Reply-To:
References: <20170514070445.GF24625@ando.pearwood.info>
Message-ID:

On Tue, May 16, 2017 at 8:14 PM, Juancarlo Añez wrote:

> What I like about attrs is:
>
> - The class level declaration of instance attributes
> - That the reasonable *init*, *repr*, and *eq* are generated

OK, the former should be doable using PEP 526 (the type is stored in
__annotations__ and the default in the class dict given to the metaclass),
while the latter should be doable using a standard metaclass -- assuming we
can agree on what the "reasonable" __init__, __repr__ and __eq__ should do.

> I don't like the excessive wordiness in attrs,

Really? @attr.s is wordy? :-) I think it's deadly cute. (The only library
I've ever seen that did something worse was "monocle" which used @_o.)

> and I don't need "the kitchen sink" be available to have instance
> attributes declared at the class level. A solution based on the typing
> module would be much better.

That's what I am hoping, yes.

> Basically, Python is lacking a way to declare instance fields with default
> values away of the initializer. Several of the mainstream OO languages
> (Java, Swift, Go) provide for that.

Hm, there are some issues here of course -- while it's simple to set the
default to e.g. 0, (1, 2, 3) or '', it's not so easy to set a default to []
or {'foo': 'bar'} unless you just state "do whatever copy.copy() does".

> I haven't thought much about this, except about if there's indeed a need
> (and there is), but I can't know if the solution if through decorators, or
> inheritance.

I suppose we could do it using either a class decorator or a metaclass --
we'll have to compare the pros and cons and specific use cases to choose.
(Example: https://github.com/python/typing/issues/427.)

--
--Guido van Rossum (python.org/~guido)

From stephanh42 at gmail.com Wed May 17 02:49:46 2017
From: stephanh42 at gmail.com (Stephan Houben)
Date: Wed, 17 May 2017 08:49:46 +0200
Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects)
In-Reply-To:
References: <20170514070445.GF24625@ando.pearwood.info>
Message-ID:

Hi Guido,

As mentioned I am a heavy user of namedtuple, I use it everywhere
where constructor arguments are equal to instance variables.
Which is quite often, at least in my programming style
(possibly under the influence of ML's "datatype" and Scala's "case class"-es.)

Compared to namedtuple, I see that attr solves a number of issues which
sometimes prevented me from using namedtuple:

1. Allow hash and equality to be based on object identity, rather than
   structural identity; this is very important if one wants to store
   un-hashable objects in the instance. (In my case: mostly dicts and
   numpy arrays.)

2. Not subclassed from tuple. I have been bitten by this subclassing
   when trying to set up singledispatch on sequences and also on my
   classes.

3. Easily allow to specify default values. With namedtuple this
   requires overriding __new__.

4. Easily allow to specify a conversion function. For example I have
   some code like below: note that I can store a numpy array while
   keeping hashability and I can make it convert to a numpy array in
   the constructor.

@attr.s(cmp=False, hash=False)
class SvgTransform(SvgPicture):
    child = attr.ib()
    matrix = attr.ib(convert=numpy.asarray)

These are the main advantages I have encountered so far.

Stephan

2017-05-16 23:04 GMT+02:00 Guido van Rossum :
> Stephen,
>
> What features of attrs specifically solve your use cases?
> > --Guido > > On Tue, May 16, 2017 at 12:18 PM, Stephan Houben > wrote: >> >> Hi all, >> >> Thanks to this thread I learned about the "attrs" library. I am a >> heavy namedtuple (ab)user but I think >> I will be using attrs going forward. >> >> If something like attrs would made it in the standard library it would >> be awesome. >> >> Thanks, >> >> Stephan >> >> 2017-05-16 20:08 GMT+02:00 Brett Cannon : >> > Maybe we can bring this up as a lightning talk at the language summit to >> > see >> > who in the room has the appropriate background knowledge? And obviously >> > someone can talk to Hynek to see if he wants to provide input based on >> > community feedback for attrs and lessons learned. >> > >> > On Tue, 16 May 2017 at 08:11 Guido van Rossum >> > wrote: >> >> >> >> Maybe Lukasz is interested? >> >> >> >> On May 16, 2017 8:00 AM, "Chris Angelico" wrote: >> >>> >> >>> On Wed, May 17, 2017 at 12:53 AM, Guido van Rossum >> >>> wrote: >> >>> > I could also try this myself in my spare time at PyCon >> >>> > (surprisingly, I >> >>> > have >> >>> > some!). It sounds kind of interesting. However I've never used the >> >>> > 'attrs' >> >>> > package... >> >>> >> >>> Me neither, so I'm not really an ideal person to head this up. Is >> >>> there anyone who (a) knows what is and isn't Pythonic, (b) has used >> >>> 'attrs', and (c) has spare time? It's not an easy trifecta but we can >> >>> hope! 
>> >>>
>> >>> ChrisA
>> >>> _______________________________________________
>> >>> Python-ideas mailing list
>> >>> Python-ideas at python.org
>> >>> https://mail.python.org/mailman/listinfo/python-ideas
>> >>> Code of Conduct: http://python.org/psf/codeofconduct/
>
> --
> --Guido van Rossum (python.org/~guido)

From desmoulinmichel at gmail.com Wed May 17 04:15:17 2017
From: desmoulinmichel at gmail.com (Michel Desmoulin)
Date: Wed, 17 May 2017 10:15:17 +0200
Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects)
In-Reply-To:
References:
Message-ID: <153e0087-2b75-a118-1bb3-1a81e72e0acb@gmail.com>

On 17/05/2017 at 07:22, Guido van Rossum wrote:
> On Tue, May 16, 2017 at 8:14 PM, Juancarlo Añez wrote:
>
> What I like about attrs is:
>
> * The class level declaration of instance attributes
> * That the reasonable *init*, *repr*, and *eq* are generated
>
> OK, the former should be doable using PEP 526 (the type is stored in
> __annotations__ and the default in the class dict given to the
> metaclass), while the latter should be doable using a standard metaclass
> -- assuming we can agree on what the "reasonable" __init__, __repr__ and
> __eq__ should do.
>
> I don't like the excessive wordiness in attrs,
>
> Really? @attr.s is wordy? :-) I think it's deadly cute. (The only
> library I've ever seen that did something worse was "monocle" which used
> @_o.)

>>> import attr
>>> @attr.s
...
class Point:
...     x = attr.ib(default=42)
...     y = attr.ib(default=attr.Factory(list))

is pretty wordy compared to what another language would do, such as:

class Point:
    int x = 42
    list y = []

Now I get that:

- Python already has a similar syntax creating class attributes.
- You can't put mutable objects here.
- attr does more since it generates dunder methods.

But having an import, a decorator and verbose calls to attr.ib does not
feel like idiomatic Python at all. Python is an elegant and expressive
language. This is neither. Also Python is beginner friendly.
Now OOP is already hard to teach to my students, but if I have to add
this to the mix, I will just have to tell them to copy / paste it
blindly for a long time before I can get to the point they can
understand what it does.

We should be able to find a middle ground.

First, if we have something LIKE attr, should it need an import? Basic
data structures should not require an import to work. async/await are way
better than importing @asyncio.coroutine.

Secondly, while I really, really, really want this feature, I think we
should not rush it.

Some ideas have to be explored.

E.g.:

Adding keywords for it? I know adding a keyword is the worst thing one
can suggest on this list ever. But it has to be mentioned because most
other languages do it this way.

class Point:
    instancevar x = 42
    instancevar y = lazy []  # we had a debate about this a few months ago

Adding a special syntax for it? Ruby has something similar.

class Point:
    @x = 42
    @@y = list

Upgrading the class constructor? It does not address the mutability
issue though.

class Point(x=42, y=[])

Mixing concepts?

class Point(metaclass=autoclass(x=42, y=lazy [])):
    pass

@args(x=42)
@factory_args(y=list)
@autodunder()
class Point:
    pass

@autoclass(
    x=42,
    y=autoclass.lazy(list)
)
class Point:
    pass

Just adding attrs, which is a workaround to a missing feature in Python,
as boilerplate and calling it a day seems unwise.
Don't get me wrong, I like attrs, I like asyncio and I like the whole
batteries-included concept. But we lived without it until now, so let's
not go too fast on this.

From sf at fermigier.com Wed May 17 04:32:33 2017
From: sf at fermigier.com (Stéfane Fermigier)
Date: Wed, 17 May 2017 10:32:33 +0200
Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects)
In-Reply-To:
References: <20170514070445.GF24625@ando.pearwood.info>
Message-ID:

(Not Stephen either).

I've been using attrs for some time now, but only superficially. I'm not
sure yet if I want to make it mandatory for my team or not.

My biggest issues so far are:

- how to truly leverage it with a declarative ORM? (SQLAlchemy in my case,
  the workaround being to only use a subset of attrs' functionalities, at
  the expense of additional complexity).

- how to make it interop with type annotations? (Which is also an issue for
  SQLAlchemy, AFAIK, at this point).

I won't be at PyCon (but I will be at PyParis next month, obviously, since
I'm organising it ;). Hynek will be there, from what I see, so obviously if
a PEP or some fresh ideas emerge from the discussions there, I'll be more
than happy.

Have a nice PyCon.

S.

On Tue, May 16, 2017 at 11:04 PM, Guido van Rossum wrote:

> Stephen,
>
> What features of attrs specifically solve your use cases?
>
> --Guido
>
> On Tue, May 16, 2017 at 12:18 PM, Stephan Houben wrote:
>
>> Hi all,
>>
>> Thanks to this thread I learned about the "attrs" library. I am a
>> heavy namedtuple (ab)user but I think
>> I will be using attrs going forward.
>>
>> If something like attrs would made it in the standard library it would
>> be awesome.
>>
>> Thanks,
>>
>> Stephan
>>
>> 2017-05-16 20:08 GMT+02:00 Brett Cannon :
>> > Maybe we can bring this up as a lightning talk at the language summit
>> to see
>> > who in the room has the appropriate background knowledge?
And obviously >> > someone can talk to Hynek to see if he wants to provide input based on >> > community feedback for attrs and lessons learned. >> > >> > On Tue, 16 May 2017 at 08:11 Guido van Rossum >> wrote: >> >> >> >> Maybe Lukasz is interested? >> >> >> >> On May 16, 2017 8:00 AM, "Chris Angelico" wrote: >> >>> >> >>> On Wed, May 17, 2017 at 12:53 AM, Guido van Rossum >> >>> wrote: >> >>> > I could also try this myself in my spare time at PyCon >> (surprisingly, I >> >>> > have >> >>> > some!). It sounds kind of interesting. However I've never used the >> >>> > 'attrs' >> >>> > package... >> >>> >> >>> Me neither, so I'm not really an ideal person to head this up. Is >> >>> there anyone who (a) knows what is and isn't Pythonic, (b) has used >> >>> 'attrs', and (c) has spare time? It's not an easy trifecta but we can >> >>> hope! >> >>> >> >>> ChrisA >> >>> _______________________________________________ >> >>> Python-ideas mailing list >> >>> Python-ideas at python.org >> >>> https://mail.python.org/mailman/listinfo/python-ideas >> >>> Code of Conduct: http://python.org/psf/codeofconduct/ >> >> >> >> _______________________________________________ >> >> Python-ideas mailing list >> >> Python-ideas at python.org >> >> https://mail.python.org/mailman/listinfo/python-ideas >> >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > >> > >> > _______________________________________________ >> > Python-ideas mailing list >> > Python-ideas at python.org >> > https://mail.python.org/mailman/listinfo/python-ideas >> > Code of Conduct: http://python.org/psf/codeofconduct/ >> > >> > > > > -- > --Guido van Rossum (python.org/~guido) > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > -- Stefane Fermigier - http://fermigier.com/ - http://twitter.com/sfermigier - http://linkedin.com/in/sfermigier 
Founder & CEO, Abilian - Enterprise Social Software - http://www.abilian.com/
Chairman, Free&OSS Group / Systematic Cluster - http://www.gt-logiciel-libre.org/
Co-Chairman, National Council for Free & Open Source Software (CNLL) - http://cnll.fr/
Founder & Organiser, PyData Paris - http://pydata.fr/
---
"You never change things by fighting the existing reality. To change
something, build a new model that makes the existing model obsolete."
-- R. Buckminster Fuller

From steve at pearwood.info Wed May 17 05:03:30 2017
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 17 May 2017 19:03:30 +1000
Subject: [Python-ideas] PEP 526: why ClassVar instead of ClassAttr?
In-Reply-To:
References: <20170516232338.GO24625@ando.pearwood.info>
Message-ID: <20170517090330.GP24625@ando.pearwood.info>

On Tue, May 16, 2017 at 10:11:31PM -0700, Guido van Rossum wrote:
> There was another reason too. Many things are "class attributes" e.g.
> methods, descriptors. But only specific things are class *variables*.

Ah, that's a good reason. I can live with that.

Thanks for the explanation.

--
Steve

From eric at trueblade.com Wed May 17 05:11:11 2017
From: eric at trueblade.com (Eric V. Smith)
Date: Wed, 17 May 2017 05:11:11 -0400
Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects)
In-Reply-To:
References: <20170514070445.GF24625@ando.pearwood.info>
Message-ID: <11c849a1-b1eb-77d0-750a-0a7b8cbbc4ec@trueblade.com>

On 5/16/17 5:04 PM, Guido van Rossum wrote:
> Stephen,
>
> What features of attrs specifically solve your use cases?

Also not Stephan!

As others have said, it's the "tupleness" of namedtuple that has bitten
me. Also, optional mutability is key for my use cases.
One use case that attrs satisfies (as does namedtuple) that I'd like to
make sure we allow for in any solution is dynamically creating classes
where you don't know the field names until runtime. I run into this
when reading anything with columnar metadata, like databases or CSV
files. The attr.s "these" parameter solves this for me:

fields = ['a', 'b', 'c']

@attr.s(these={f:attr.ib() for f in fields})
class Foo:
    pass

f = Foo(1, 2, 3)
print(f)

gives:
Foo(a=1, b=2, c=3)

Mutable default values are a foot-gun. attrs solves it in a similar way
to my "namedlist" on PyPI. I'm moving to abandon namedlist and replace
it with attrs: the namedlist API is horrible, but a logical (to me!)
extension from namedtuple. I mainly wrote it as an exercise in
dynamically creating classes using the ast module (as opposed to
namedtuple where we build a string definition of the class and exec that).

Eric.

From stephanh42 at gmail.com Wed May 17 07:08:32 2017
From: stephanh42 at gmail.com (Stephan Houben)
Date: Wed, 17 May 2017 13:08:32 +0200
Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects)
In-Reply-To: <153e0087-2b75-a118-1bb3-1a81e72e0acb@gmail.com>
References: <153e0087-2b75-a118-1bb3-1a81e72e0acb@gmail.com>
Message-ID:

Hi Michel,

> Now OPP is already hard to teach to my students, but if I have to add
> this to the mix, I will just have to tell them to copy / paste it
> blindly for a long time before I can get to the point they can
> understand what it does.

About the teachability, some remarks:

1. You can still just skip it. I presume you don't teach all the
   advanced metaclass stuff right away either, even though it is part
   of core Python.

2. Is it really that complicated? attr.s is just a normal Python
   function, which adds some members to a class.
   You don't even have to use the decorator @ syntax, that is just a
   convenience.
   To me this seems easier to teach than yet another dedicated syntax.
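
The point that a class decorator is an ordinary function which adds members to a class can be sketched with the standard library alone. This is only an illustration: `add_repr` is a made-up helper, not part of attrs, and it does far less than attr.s does.

```python
def add_repr(cls):
    """Hypothetical helper: attach a simple __repr__ to a class, then return it."""
    def __repr__(self):
        # vars(self) is the instance __dict__, i.e. the instance attributes.
        return "%s(%r)" % (type(self).__name__, vars(self))
    cls.__repr__ = __repr__
    return cls


# With the decorator syntax ...
@add_repr
class Decorated:
    def __init__(self):
        self.foo = 1


# ... and as a plain function call; both forms do the same thing.
class Plain:
    def __init__(self):
        self.foo = 1

Plain = add_repr(Plain)

print(Decorated())  # Decorated({'foo': 1})
print(Plain())      # Plain({'foo': 1})
```

Nothing here is specific to decorators: the class object is passed in, modified, and handed back, which is the whole mechanism the @ syntax abbreviates.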
Stephan 2017-05-17 10:15 GMT+02:00 Michel Desmoulin : > > > Le 17/05/2017 ? 07:22, Guido van Rossum a ?crit : >> On Tue, May 16, 2017 at 8:14 PM, Juancarlo A?ez > > wrote: >> >> What I like about attrs is: >> >> * The class level declaration of instance attributes >> * That the reasonable *init*, *repr*, and *eq* are generated >> >> OK, the former should be doable using PEP 526 (the type is stored in >> __annotations__ and the default in the class dict given to the >> metaclass), while the latter should be doable using a standard metaclass >> -- assuming we can agree on what the "reasonable" __init__, __repr__ and >> __eq__ should do. >> >> >> I don?t like the excessive wordiness in attrs, >> >> Really? @attr.s is wordy? :-) I think it's deadly cute. (The only >> library I've ever seen that did something worse was "monocle" which used >> @_o.) > >>>> import attr >>>> @attr.s > ... class Point: > ... x = attr.ib(default=42) > ... y = attr.ib(default=attr.Factory(list)) > > > Is pretty wordy compared to something like another language would do > such as: > > class Point: > int x = 42 > list y = [] > > Now I get that: > > - Python already has a similar syntax creating class attributes. > - You can't put mutable objects here. > - attr does more since it generates dunder methods. > > But having an import, a decorator and verbose calls to attr.ib does not > feel like idiomatic Python at all. Python is an elegant and expressive > language. This is none of the above. Also Python is beginner friendly. > Now OPP is already hard to teach to my students, but if I have to add > this to the mix, I will just have to tell them to copy / paste it > blindly for a long time before I can get to the point they can > understand what it does. > > We should be able to find a middle ground. > > First, if we have something LIKE attr, should it need an import? Basic > data structures may not require an import to work. async/await are way > better than import @asyncio.coroutine. 
> > Secondly, while I really, really, really want this feature, I think we > should not rush it. > > Some ideas have to be explored. > > E.G: > > Adding keywords for it ? I know adding a keyword is the worst thing one > can suggest on this list ever. But it has to be mentioned because most > other languages do it this way. > > class Point: > instancevar x = 42 > instancevar y = lazy [] # we add a debate about this a few months ago > > > Adding a special syntax for it ? ruby has something similar. > > class Point: > @x = 42 > @@y = list > > Upgrading the class constructor? It does not address the mutablility > issue though. > > class Point(x=42, y=[]) > > Mixing concepts? > > > class Point(metaclass=autoclass(x=42, y=lazy [])): > pass > > @args(x=42) > @factory_args(y=list) > @autodunder() > class Point: > pass > > @autoclass( > x=42 > y=autoclass.lazy(list) > ) > class Point: > pass > > > Just adding attrs, which is a workaround to a missing feature in Python, > as a boiler plate and calling it a day seems unwise. > > Don't get me wrong, I like attrs, I like asyncio and I like the whole > battery included concept. But we lived without it until now, so let's > not go too fast on this. > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From ncoghlan at gmail.com Wed May 17 07:14:58 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 17 May 2017 21:14:58 +1000 Subject: [Python-ideas] PEP 526: why ClassVar instead of ClassAttr? In-Reply-To: <20170517090330.GP24625@ando.pearwood.info> References: <20170516232338.GO24625@ando.pearwood.info> <20170517090330.GP24625@ando.pearwood.info> Message-ID: On 17 May 2017 at 19:03, Steven D'Aprano wrote: > On Tue, May 16, 2017 at 10:11:31PM -0700, Guido van Rossum wrote: >> There was another reason too. Many things are "class attributes" e.g. 
>> methods, descriptors. But only specific things are class *variables*.
>
> Ah, that's a good reason. I can live with that.

In the specific context of type hints, there is also
https://docs.python.org/3/library/typing.html#typing.TypeVar, which is
used to indicate that a particular variable name refers to a type
variable for generics, rather than a normal runtime value.

So a "type variable" and a "type attribute" are also very different
things (assuming that the latter is being used as an equivalent to
"class attribute").

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From desmoulinmichel at gmail.com Wed May 17 07:37:22 2017
From: desmoulinmichel at gmail.com (Michel Desmoulin)
Date: Wed, 17 May 2017 13:37:22 +0200
Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects)
In-Reply-To:
References: <153e0087-2b75-a118-1bb3-1a81e72e0acb@gmail.com>
Message-ID:

On 17/05/2017 at 13:08, Stephan Houben wrote:
> Hi Michel,
>
>> Now OPP is already hard to teach to my students, but if I have to add
>> this to the mix, I will just have to tell them to copy / paste it
>> blindly for a long time before I can get to the point they can
>> understand what it does.
>
> About the teachability, some remarks:
> 1. You can still just skip it. I presume you don't teach all the
> advanced metaclass
> stuff right away either, even though it is part of core Python.
>

Comparing setting instance attributes to metaclasses is pushing it, don't
you think?

The thing is, teaching OOP is challenging enough. You have __init__ and
all its quirks, and the @decorators for class methods, and self, and
cls, and inheritance. And that's just the basics, not talking about
multiple inheritance, properties, other dunder methods or composition.

Having a cleaner, faster solution to declare a class would be awesome,
both for dev and for teaching. That's why we all love attrs.
But we are talking here about a nice-to-have feature. Python works
perfectly fine without it. But since we are at it, let's make something
great.

> 2. Is it really that complicated? attr.s is just a normal Python
> function, which adds some members to a class.
> You don't even have to use the decorator @ syntax, that is just a convenience.
> To me this seems easier to teach than yet another dedicated syntax.

My guess is that, without the decorator, the attributes don't do
anything. They are just declarative hints for the decorator to do the magic.

But even then, it's already an additional burden to have to explain the
difference between this magic and the regular class attribute. So if I
have to do it, I wish I could do it without also having to explain 2
namespaced function calls, one of which takes the return value of another
call as a parameter, which BTW is a factory taking a constructor as a
callback.

Do not overestimate the skills of newcomers in programming. They have
a lot of things to learn.

From ncoghlan at gmail.com Wed May 17 07:43:01 2017
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 17 May 2017 21:43:01 +1000
Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects)
In-Reply-To:
References: <20170514070445.GF24625@ando.pearwood.info>
Message-ID:

On 17 May 2017 at 13:14, Juancarlo Añez wrote:
> On Tue, May 16, 2017 at 5:04 PM, Guido van Rossum
> wrote:
> What I like about attrs is:
>
> * The class level declaration of instance attributes
> * That the reasonable init, repr, and eq are generated

These are also the two main benefits for my own use cases, with easy
conversion to JSON compatible dicts being third.
I'm less bothered by the wordiness, hence my suggestion for borrowing
the attrs API design and doing this as a standard library module along
the lines of:

    from autoclass import data_record, field

    @data_record
    class Point3D:
        x: int = field(default=0)
        y: int = field(default=0)
        z: int = field(default=0)

While that's wordier than dedicated syntax in the simple cases, it also
means that

- if we want to define additional templates in the future, it just
  means adding a new decorator to the autoclass module
- things like ORMs and other class based schema DSLs are a natural
  extension of this "runtime class template" model
- class level settings (e.g. declaring post-init immutability) are
  just keyword arguments to a function call
- field level settings (e.g. omitting the field from the generated
  repr) are just keyword arguments to a function call

Those last two points echo the primary reason that print was converted
from a statement to a builtin in Python 3.

That said, even with this model, the base case of "fields with an
immutable or shared default" could potentially be simplified to:

    from autoclass import data_record

    @data_record
    class Point3D:
        x: int = 0
        y: int = 0
        z: int = 0

However, the potentially surprising behaviour there is that to
implement it, the decorator not only has to special case the output of
"field()" calls, but also has to special case any object that
implements the descriptor protocol to avoid getting confused by normal
method and property definitions.

Cheers,
Nick.
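
As a rough illustration of the simplified form above, here is a minimal sketch of what such a decorator might do. It is an assumption-laden toy, not a proposed implementation: there is no real `autoclass` module, `data_record` is defined inline, `field()` is not handled, and defaults are shared between instances rather than copied (so mutable defaults remain a hazard). Field names are taken from the PEP 526 annotations and defaults from the class dict.

```python
def data_record(cls):
    """Hypothetical decorator: generate __init__, __repr__ and __eq__
    from the annotated names of a class (sketch only)."""
    # Only annotated names count as fields, so unannotated methods and
    # properties in the class body are ignored by this sketch.
    names = list(getattr(cls, "__annotations__", {}))
    defaults = {n: cls.__dict__[n] for n in names if n in cls.__dict__}

    def __init__(self, *args, **kwargs):
        if len(args) > len(names):
            raise TypeError("too many positional arguments")
        unknown = set(kwargs) - set(names)
        if unknown:
            raise TypeError("unknown fields: %s" % ", ".join(sorted(unknown)))
        values = dict(defaults)          # defaults are shared, not copied
        values.update(zip(names, args))  # positional arguments, in field order
        values.update(kwargs)
        missing = [n for n in names if n not in values]
        if missing:
            raise TypeError("missing fields: %s" % ", ".join(missing))
        for n in names:
            setattr(self, n, values[n])

    def __repr__(self):
        body = ", ".join("%s=%r" % (n, getattr(self, n)) for n in names)
        return "%s(%s)" % (type(self).__name__, body)

    def __eq__(self, other):
        if other.__class__ is not self.__class__:
            return NotImplemented
        return all(getattr(self, n) == getattr(other, n) for n in names)

    cls.__init__ = __init__
    cls.__repr__ = __repr__
    cls.__eq__ = __eq__
    return cls


@data_record
class Point3D:
    x: int = 0
    y: int = 0
    z: int = 0


print(Point3D(1, z=3))  # Point3D(x=1, y=0, z=3)
```

Because this sketch reads field names from the annotations alone, it never has to look at methods or properties; the descriptor special-casing described above would matter once unannotated class attributes were also treated as fields.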
--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From gmarcel.plch at gmail.com Wed May 17 08:38:54 2017
From: gmarcel.plch at gmail.com (gmarcel.plch at gmail.com)
Date: Wed, 17 May 2017 14:38:54 +0200
Subject: [Python-ideas] Running C extension modules using -m switch
Message-ID: <1495024734.5444.2.camel@localhost>

Greetings,

I'm a student who has been working lately on a feature of the runpy
module that I have been quite interested in: execution of extension
modules using the -m switch.

Currently this requires access to the module's code, so it only works
for modules written in Python.
I have a proof-of-concept implementation that adds a new
ExtensionFileLoader method called "exec_as_main".
The runpy module then checks if the loader has this method, and if so,
calls it instead of getting the code and running that.
This new method calls into the _imp module, which executes the module
as a script.

I can see two ways of doing this. Both expect that the module uses PEP
489 multi-phase initialization.

The first way is having a new PyModuleDef_Slot called Py_mod_main,
which names a function to execute when run as main.

The second way is running a module's Py_mod_exec inside the __main__
module's namespace, as it's done for normal modules.
The module would then do a `if __name__ == "__main__"` check.
This is possible for modules that don't define Py_mod_create: they
expect a default module object to be created for them, so we can pass
the __main__ module to their Py_mod_exec function.
This way would mean that, for example, modules written in Cython would
behave like their Python counterparts.

Another possibility would be to use both, allowing both easy
Cython-style modules and a dedicated slot for modules that need custom
Py_mod_create.

My proof of concept uses another combination: it requires Py_mod_main
and runs it in the __main__ namespace. But that can change based on
discussion here.
Link to the implementation:
https://github.com/Traceur759/cpython/tree/main_c_modules
Diff from master:
https://github.com/python/cpython/compare/master...Traceur759:main_c_modules

You can quickly test it with:

$ ./python -m _testmultiphase
This is an extension module named __main__

From bborcic at gmail.com Wed May 17 09:08:32 2017
From: bborcic at gmail.com (Boris Borcic)
Date: Wed, 17 May 2017 15:08:32 +0200
Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects)
In-Reply-To:
References: <5917E2DF.6020000@brenbarn.net>
Message-ID:

Chris Angelico wrote:
> # Python
> def outer():
>     x = 0
>     def inner():
>         nonlocal x
>         x += 2
>
> Is it better to say "nonlocal" on everything you use than to say
> "self.x" on each use?

I've not used Python closures since nonlocal came about, but AFAIK you
only need to use nonlocal if there's an assignment to the variable in
the scope. Maybe an augmented assignment creates the same constraint but
logically it shouldn't.

From stephanh42 at gmail.com Wed May 17 09:20:16 2017
From: stephanh42 at gmail.com (Stephan Houben)
Date: Wed, 17 May 2017 15:20:16 +0200
Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects)
In-Reply-To:
References: <153e0087-2b75-a118-1bb3-1a81e72e0acb@gmail.com>
Message-ID:

Hi Michel,

> Comparing setting instance attributes to metaclasses is pushing it don't
> you think ?

My only point is that there are aspects of the core Python language
which are probably not covered by your course (and I chose metaclasses
as a topic least likely to be covered in a beginner's course).

So this would justify skipping an 'attr' functionality if it would be
deemed too complex for beginners.

> My guess is that, without the decorator, the attributes don't do
> anything. They are just declarative hints for the decorator to do the magic.

I mean you don't need to use the @ syntax.
You can just as well do:

class MyClass:
    foo = attr.ib()

MyClass = attr.s(MyClass)

thereby stressing that `attr.s` is just a plain Python function,
and there is no magic involved.

> Do not over estimate the skills of newcomers in programming. They have
> a lot of things to learn.

I realize that. My point is only about the *relative* ease of learning
of the current attr.s decorator-based approach vs. dedicated syntax.
Perhaps both are too difficult for a beginner's course, but I would
contend that the decorator approach is (relatively) simpler (perhaps in
the same way that special relativity is simple compared to general
relativity ;-) ).

My argument for that is that the attr.s approach does not introduce any
new mechanism, but is just an (advanced) application of basic mechanisms
such as function calls and attribute access.

Hope this clarifies my position.

Stephan

2017-05-17 13:37 GMT+02:00 Michel Desmoulin :
>
>
> On 17/05/2017 at 13:08, Stephan Houben wrote:
>> Hi Michel,
>>
>>> Now OPP is already hard to teach to my students, but if I have to add
>>> this to the mix, I will just have to tell them to copy / paste it
>>> blindly for a long time before I can get to the point they can
>>> understand what it does.
>>
>> About the teachability, some remarks:
>> 1. You can still just skip it. I presume you don't teach all the
>> advanced metaclass
>> stuff right away either, even though it is part of core Python.
>>
>
> Comparing setting instance attributes to metaclasses is pushing it don't
> you think ?
>
> The thing is, teaching OOP is challenging enough. You have __init__ and
> all it's quicks, and the @decorators for class methods, and self, and
> cls, and inheritances. And that's just the basics, not talking about
> multiple inheritance, properties, other dunder methods or compostion.
>
> Having a cleaner, faster solution to declare a class would be awesome,
> both for dev and for teaching. That's why we all love attrs.
> > But we are talking here about a nice-to-have feature. Python works > perfectly fine without it. But since we are at it, let's make something > great. > > >> 2. Is it really that complicated? attr.s is just a normal Python >> function, which adds some members to a class. >> You don't even have to use the decorator @ syntax, that is just a convenience. >> To me this seems easier to teach than yet another dedicated syntax. > > My guess is that, without the decorator, the attributes don't do > anything. They are just declarative hints for the decorator to do the magic. > > But even then, it's already an additional burden to have to explain the > difference between this magic and the regular class attribute. So if I > have to do it, I wish I could do it without having to explain also 2 > namespaced function calls, one taking itself the return value of a call > as a parameter, which BTW is a factory taking a constructor as a callback. > > Do not over estimate the skills of newcommers in programming. They have > a lot of things to learn. > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From rosuav at gmail.com Wed May 17 09:20:56 2017 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 17 May 2017 23:20:56 +1000 Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects) In-Reply-To: References: <5917E2DF.6020000@brenbarn.net> Message-ID: On Wed, May 17, 2017 at 11:08 PM, Boris Borcic wrote: > Chris Angelico wrote: >> >> # Python >> def outer(): >> x = 0 >> def inner(): >> nonlocal x >> x += 2 >> >> Is it better to say "nonlocal" on everything you use than to say >> "self.x" on each use? 
> > > I've not used Python closures since nonlocal came about, but AFAIK you only > need to use nonlocal if there's an assignment to the variable in the scope. > Maybe an augmented assignment creates the same constraint but logically it > shouldn't. Augmented assignment is still assignment. The rules for the nonlocal keyword are the same as for the global keyword, so you can tinker with that if you want a comparison. Technically my statement wasn't quite true ("on everything" ignores the fact that *reading* a closed-over variable doesn't require any declaration), but omitting the nonlocal declaration would leave you open to incredibly sneaky bugs where moving code from one method to another changes its semantics. So you'd probably end up wanting some sort of class-level declaration that says "please make these nonlocal in all enclosed scopes" (which you could do with macropy, I'm sure), to avoid the risk of missing one. And at that point, you're really asking for a C++ style of thing where your variables get declared, and someone's going to ask you why you're writing this in Python :) ChrisA From eric at trueblade.com Wed May 17 10:16:04 2017 From: eric at trueblade.com (Eric V. Smith) Date: Wed, 17 May 2017 07:16:04 -0700 Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects) In-Reply-To: <11c849a1-b1eb-77d0-750a-0a7b8cbbc4ec@trueblade.com> References: <11c849a1-b1eb-77d0-750a-0a7b8cbbc4ec@trueblade.com> Message-ID: <29575794-aba7-06a5-b35c-2493a7d4d008@trueblade.com> On 5/17/17 2:11 AM, Eric V. Smith wrote: > One use case that attrs satisfies (as does namedtuple) that I'd like to > make sure we allow for in any solution is dynamically creating classes > where you don't know the field names until runtime. I run in to this > when reading anything with columnar metadata, like databases or CSV > files. 
The attr.s "these" parameter solves this for me:
>
> fields = ['a', 'b', 'c']
>
> @attr.s(these={f: attr.ib() for f in fields})
> class Foo:
>     pass

I should also have mentioned attr.make_class.

From toddrjen at gmail.com Wed May 17 10:28:52 2017
From: toddrjen at gmail.com (Todd)
Date: Wed, 17 May 2017 10:28:52 -0400
Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects)
In-Reply-To: <153e0087-2b75-a118-1bb3-1a81e72e0acb@gmail.com>
References: <153e0087-2b75-a118-1bb3-1a81e72e0acb@gmail.com>
Message-ID:

On May 17, 2017 04:16, "Michel Desmoulin" wrote:

On 17/05/2017 at 07:22, Guido van Rossum wrote:
> On Tue, May 16, 2017 at 8:14 PM, Juancarlo Añez
> wrote:
>
>     What I like about attrs is:
>
>       * The class level declaration of instance attributes
>       * That the reasonable *init*, *repr*, and *eq* are generated
>
> OK, the former should be doable using PEP 526 (the type is stored in
> __annotations__ and the default in the class dict given to the
> metaclass), while the latter should be doable using a standard metaclass
> -- assuming we can agree on what the "reasonable" __init__, __repr__ and
> __eq__ should do.
>
>
>     I don't like the excessive wordiness in attrs,
>
> Really? @attr.s is wordy? :-) I think it's deadly cute. (The only
> library I've ever seen that did something worse was "monocle" which used
> @_o.)

>>> import attr
>>> @attr.s
... class Point:
...     x = attr.ib(default=42)
...     y = attr.ib(default=attr.Factory(list))

is pretty wordy compared to what another language would do, such as:

class Point:
    int x = 42
    list y = []

Now I get that:

- Python already has a similar syntax creating class attributes.
- You can't put mutable objects here.
- attr does more since it generates dunder methods.

But having an import, a decorator and verbose calls to attr.ib does not
feel like idiomatic Python at all. Python is an elegant and expressive
language. This is none of the above.
Also Python is beginner friendly. Now OOP is already hard to teach to my
students, but if I have to add this to the mix, I will just have to tell
them to copy / paste it blindly for a long time before I can get to the
point they can understand what it does.

We should be able to find a middle ground.

First, if we have something LIKE attr, should it need an import? Basic
data structures may not require an import to work. async/await are way
better than import @asyncio.coroutine.

Secondly, while I really, really, really want this feature, I think we
should not rush it.

Some ideas have to be explored.

E.g.:

Adding keywords for it? I know adding a keyword is the worst thing one
can suggest on this list ever. But it has to be mentioned because most
other languages do it this way.

class Point:
    instancevar x = 42
    instancevar y = lazy []  # we had a debate about this a few months ago

Adding a special syntax for it? Ruby has something similar.

class Point:
    @x = 42
    @@y = list

Upgrading the class constructor? It does not address the mutability
issue though.

class Point(x=42, y=[])

Mixing concepts?

class Point(metaclass=autoclass(x=42, y=lazy [])):
    pass

@args(x=42)
@factory_args(y=list)
@autodunder()
class Point:
    pass

@autoclass(
    x=42,
    y=autoclass.lazy(list)
)
class Point:
    pass

Just adding attrs, which is a workaround to a missing feature in Python,
as boilerplate and calling it a day seems unwise.

Don't get me wrong, I like attrs, I like asyncio and I like the whole
batteries-included concept. But we lived without it until now, so let's
not go too fast on this.

What about an attribute-level decorator-like syntax, like:

class Point:
    @instattr
    x = 42

    @instattr
    @lazy
    y = []

Or:

class Point:
    @instattr:
        x = 42

    @instattr:
        @lazy:
            y = []

This would have the benefit of a keyword-like syntax without actually
needing a new keyword. The question is how such a system would work in a
general manner. Of course these aren't decorators, something new would be
needed.
I see two main approaches.

1. These functions are called both at class creation and initialization
time with arguments that let it decide what to do. So perhaps it is given
whatever is on the right side of the "=", the class, and the class
instance (which is None at class creation time). Perhaps it is given the
variable name as a string, or perhaps Python magically adds the name to
the namespace whenever the function returns. Alternatively, this could be
a class and there are two dunder methods that are called at class
creation time and class initialization time.

2. These are called at attribute access time. Python sets up the
attribute as a special property, and whenever the attribute is accessed
the function is given arguments that allow it to infer what is going on.
It then controls if and how the variable is accessed. Alternatively, this
could be a class and there are dunder methods called at various times.

The second approach would be more flexible, while the first approach
would require less work for developers for basic tasks. The two
approaches are not mutually exclusive, either, especially if a dunder
method-based approach is used.

I know that strictly speaking the property-based approach could be
implemented in the function-based approach, but that would be a lot more
work.

From tritium-list at sdamon.com Wed May 17 12:14:05 2017
From: tritium-list at sdamon.com (Alex Walters)
Date: Wed, 17 May 2017 12:14:05 -0400
Subject: [Python-ideas] fnmatch.filter_false
Message-ID: <04fc01d2cf28$9c068480$d4138d80$@sdamon.com>

fnmatch.filter works great. To remind people what it does, it takes an
iterable of strings and a pattern and returns a list of the strings that
match the pattern. And that is wonderful.

However, I often need to filter *out* the items that match the pattern (to
ignore them).
In every project that I need this I end up copying the
function out of the fnmatch library and adding 'not' to the test clause. It
would be wonderful if there was a filter_false version in the standard
library. Or an inverting Boolean option. Or something, to stop from having
to copy code every time I need to ignore files.

From srkunze at mail.de Wed May 17 12:11:00 2017
From: srkunze at mail.de (Sven R. Kunze)
Date: Wed, 17 May 2017 18:11:00 +0200
Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects)
In-Reply-To:
References: <153e0087-2b75-a118-1bb3-1a81e72e0acb@gmail.com>
Message-ID: <0946a458-225f-5d5f-bcad-8feda11be194@mail.de>

On 17.05.2017 13:37, Michel Desmoulin wrote:
> Having a cleaner, faster solution to declare a class would be awesome,
> both for dev and for teaching. That's why we all love attrs.
>
> But we are talking here about a nice-to-have feature. Python works
> perfectly fine without it. But since we are at it, let's make something
> great.
>

Same for me. IMHO the biggest benefit of using attr is an (almost?)
feature-complete and bug-free set of pre-defined __dunder__ methods as
described in [1].

Defining state-variables (aka instance variables accessible via 'self.')
wouldn't be enough for me to make it a valuable feature. So, one could
imagine a __dunder__-method generator of some sort.

But even given that (and I am only speaking for my team), I haven't even
seen a use-case for namedtuples in a year. Every time we considered it,
people said: "please make it its own class for documentary purposes; this
thing will tend to grow faster than we can imagine".

>> 2. Is it really that complicated? attr.s is just a normal Python
>> function, which adds some members to a class.
>> You don't even have to use the decorator @ syntax, that is just a convenience.
>> To me this seems easier to teach than yet another dedicated syntax.
> My guess is that, without the decorator, the attributes don't do > anything. They are just declarative hints for the decorator to do the magic. > > But even then, it's already an additional burden to have to explain the > difference between this magic and the regular class attribute. It might also have something to do with this. IMO this feature should integrate naturally in a way that nobody notice. Sven [1] https://attrs.readthedocs.io/en/stable/why.html#hand-written-classes From levkivskyi at gmail.com Wed May 17 12:32:59 2017 From: levkivskyi at gmail.com (Ivan Levkivskyi) Date: Wed, 17 May 2017 18:32:59 +0200 Subject: [Python-ideas] PEP 526: why ClassVar instead of ClassAttr? In-Reply-To: <20170517090330.GP24625@ando.pearwood.info> References: <20170516232338.GO24625@ando.pearwood.info> <20170517090330.GP24625@ando.pearwood.info> Message-ID: On 17 May 2017 at 11:03, Steven D'Aprano wrote: > On Tue, May 16, 2017 at 10:11:31PM -0700, Guido van Rossum wrote: > > There was another reason too. Many things are "class attributes" e.g. > > methods, descriptors. But only specific things are class *variables*. > > Ah, that's a good reason. I can live with that. > > Thanks for the explanation. > > This was discussed during development of PEP 526: https://github.com/python/typing/issues/258#issuecomment-242263868 Maybe we should add a corresponding subsection to "Rejected ideas"? -- Ivan -------------- next part -------------- An HTML attachment was scrubbed... 
From stephanh42 at gmail.com Wed May 17 12:38:18 2017
From: stephanh42 at gmail.com (Stephan Houben)
Date: Wed, 17 May 2017 18:38:18 +0200
Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects)
In-Reply-To: <0946a458-225f-5d5f-bcad-8feda11be194@mail.de>
References: <153e0087-2b75-a118-1bb3-1a81e72e0acb@gmail.com> <0946a458-225f-5d5f-bcad-8feda11be194@mail.de>
Message-ID:

Hi Sven,

> But even given that (and I am only speaking for my team), I haven't even
> seen a use-case for namedtuples in a year. Every time we considered it,
> people said: "please make it its own class for documentary purposes; this
> thing will tend to grow faster than we can imagine".

Using namedtuple doesn't stop the class from being its "own class".
Typical use case:

class Foo(namedtuple("Foo", "bar baz"), FooBase):
    """Foo is a very important class and you should totally use it."""

    def grand_total(self):
        return self.bar + self.baz

Stephan

2017-05-17 18:11 GMT+02:00 Sven R. Kunze :
> On 17.05.2017 13:37, Michel Desmoulin wrote:
>>
>> Having a cleaner, faster solution to declare a class would be awesome,
>> both for dev and for teaching. That's why we all love attrs.
>>
>> But we are talking here about a nice-to-have feature. Python works
>> perfectly fine without it. But since we are at it, let's make something
>> great.
>>
>
> Same for me. IMHO the biggest benefit using attr is an (almost?)
> feature-complete and bug-free set of pre-defined __dunder__ methods as
> described in [1].
>
> Defining state-variables (aka instance variables accessible via 'self.')
> wouldn't be enough for me to make it a valuable feature. So, one could
> imagine a __dunder__-method generator of some sort.
>
>
> But even given that (and I am only speaking for my team), I haven't even
> seen a use-case for namedtuples in a year.
Every time we considered it, > people said: "please make it its own class for documentary purposes; this > thing will tend to grow faster than we can imagine". > >>> 2. Is it really that complicated? attr.s is just a normal Python >>> function, which adds some members to a class. >>> You don't even have to use the decorator @ syntax, that is just a >>> convenience. >>> To me this seems easier to teach than yet another dedicated syntax. >> >> My guess is that, without the decorator, the attributes don't do >> anything. They are just declarative hints for the decorator to do the >> magic. >> >> But even then, it's already an additional burden to have to explain the >> difference between this magic and the regular class attribute. > > > It might also have something to do with this. IMO this feature should > integrate naturally in a way that nobody notice. > > > > Sven > > [1] https://attrs.readthedocs.io/en/stable/why.html#hand-written-classes > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From phd at phdru.name Wed May 17 12:43:31 2017 From: phd at phdru.name (Oleg Broytman) Date: Wed, 17 May 2017 18:43:31 +0200 Subject: [Python-ideas] fnmatch.filter_false In-Reply-To: <04fc01d2cf28$9c068480$d4138d80$@sdamon.com> References: <04fc01d2cf28$9c068480$d4138d80$@sdamon.com> Message-ID: <20170517164331.GA22240@phdru.name> On Wed, May 17, 2017 at 12:14:05PM -0400, Alex Walters wrote: > Fnmath.filter works great. To remind people what it does, it takes an > iterable of strings and a pattern and returns a list of the strings that > match the pattern. And that is wonderful > > However, I often need to filter *out* the items that match the pattern (to > ignore them). In every project that I need this I end up copying the > function out of the fnmatch library and adding 'not' to the test clause. 
It would be wonderful if there was a filter_false version in the standard
> library. Or an inverting Boolean option. Or something, to stop from having
> to copy code every time I need to ignore files.

Why not create a package and publish at PyPI? Then all you need is

pip install fnmatch_filter_false

in your virtual env.

Oleg.
--
Oleg Broytman http://phdru.name/ phd at phdru.name
Programmers don't die, they just GOSUB without RETURN.

From levkivskyi at gmail.com Wed May 17 12:48:25 2017
From: levkivskyi at gmail.com (Ivan Levkivskyi)
Date: Wed, 17 May 2017 18:48:25 +0200
Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects)
In-Reply-To:
References: <153e0087-2b75-a118-1bb3-1a81e72e0acb@gmail.com> <0946a458-225f-5d5f-bcad-8feda11be194@mail.de>
Message-ID:

On 17 May 2017 at 18:38, Stephan Houben wrote:

> Hi Sven,
>
> > But even given that (and I am only speaking for my team), I haven't even
> > seen a use-case for namedtuples in a year. Every time we considered it,
> > people said: "please make it its own class for documentary purposes; this
> > thing will tend to grow faster than we can imagine".
>
> Using namedtuple doesn't stop the class from being its "own class".
> Typical use case:
>
> class Foo(namedtuple("Foo", "bar baz"), FooBase):
>     """Foo is a very important class and you should totally use it."""
>
>     def grand_total(self):
>         return self.bar + self.baz
>

And the right (modern) way to do this is

from typing import NamedTuple

class Foo(NamedTuple):
    """Foo is a very important class and
    you should totally use it.
    """
    bar: int
    baz: int = 0

    def grand_total(self):
        return self.bar + self.baz

typing.NamedTuple supports docstrings, user-defined methods, and default
values.

--
Ivan
URL: From tritium-list at sdamon.com Wed May 17 13:19:28 2017 From: tritium-list at sdamon.com (tritium-list at sdamon.com) Date: Wed, 17 May 2017 13:19:28 -0400 Subject: [Python-ideas] fnmatch.filter_false In-Reply-To: <20170517164331.GA22240@phdru.name> References: <04fc01d2cf28$9c068480$d4138d80$@sdamon.com> <20170517164331.GA22240@phdru.name> Message-ID: <05c301d2cf31$bdda8e90$398fabb0$@hotmail.com> > -----Original Message----- > From: Python-ideas [mailto:python-ideas-bounces+tritium- > list=sdamon.com at python.org] On Behalf Of Oleg Broytman > Sent: Wednesday, May 17, 2017 12:44 PM > To: python-ideas at python.org > Subject: Re: [Python-ideas] fnmatch.filter_false > > On Wed, May 17, 2017 at 12:14:05PM -0400, Alex Walters list at sdamon.com> wrote: > > Fnmath.filter works great. To remind people what it does, it takes an > > iterable of strings and a pattern and returns a list of the strings that > > match the pattern. And that is wonderful > > > > However, I often need to filter *out* the items that match the pattern (to > > ignore them). In every project that I need this I end up copying the > > function out of the fnmatch library and adding 'not' to the test clause. It > > would be wonderful if there was a filter_false version in the standard > > library. Or in inversion Boolean option. Or something, to stop from having > > to copy code every time I need to ignore files. > > Why not create a package and publish at PyPI? Then all you need is > pip install fnmatch_filter_false > in your virtual env. That is my normal thought on something like this, but in the case of adding a Boolean argument to fnmatch.filter, it (might be) as simple as a 3 line diff that does not break the API, and as far as I can tell, does not have performance implications. Copying a module out of the standard library that is identical except a 3 line diff that does not break compatibility with the standard library... 
that just reeks of something that should be in the
standard library to begin with.

In the case of adding a separate function to fnmatch, it's still not that
big of a diff, and wouldn't have much duplicated code, at least in the way I
would implement it - it would essentially do the previous Boolean option
method, and wrap that. A filter_false function, now that I think about it,
is less ideal than just adding a keyword-only Boolean option to
fnmatch.filter.

> Oleg.
> --
> Oleg Broytman http://phdru.name/ phd at phdru.name
> Programmers don't die, they just GOSUB without RETURN.

From ethan at stoneleaf.us Wed May 17 13:30:10 2017
From: ethan at stoneleaf.us (Ethan Furman)
Date: Wed, 17 May 2017 10:30:10 -0700
Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects)
In-Reply-To:
References: <153e0087-2b75-a118-1bb3-1a81e72e0acb@gmail.com>
Message-ID: <591C88A2.8070400@stoneleaf.us>

On 05/17/2017 06:20 AM, Stephan Houben wrote:

> class MyClass:
>     foo = attr.ib()
>
> MyClass = attr.s(MyClass)

Given that one of Python's great strengths is its readability, I would not
use the attr library in teaching because it is not. Having a dot in the
middle of words is confusing, especially when you don't already have a basis
for which abbreviations are common. Is it attr.ib or att.rib or at.trib?
-- ~Ethan~ From apalala at gmail.com Wed May 17 13:40:22 2017 From: apalala at gmail.com (=?UTF-8?Q?Juancarlo_A=C3=B1ez?=) Date: Wed, 17 May 2017 13:40:22 -0400 Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects) In-Reply-To: References: <153e0087-2b75-a118-1bb3-1a81e72e0acb@gmail.com> <0946a458-225f-5d5f-bcad-8feda11be194@mail.de> Message-ID: On Wed, May 17, 2017 at 12:48 PM, Ivan Levkivskyi wrote: > class Foo(NamedTuple): > """Foo is a very important class and > you should totally use it. > """ > bar: int > baz: int = 0 > > def grand_total(self): > return self.bar + self.baz > Really?! I didn't know that idiom existed. It is enough for many use cases, and I was just about to require typing and pathlib on my 2.7-compatible projects. -- Juancarlo *A?ez* -------------- next part -------------- An HTML attachment was scrubbed... URL: From levkivskyi at gmail.com Wed May 17 13:43:26 2017 From: levkivskyi at gmail.com (Ivan Levkivskyi) Date: Wed, 17 May 2017 19:43:26 +0200 Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects) In-Reply-To: References: <153e0087-2b75-a118-1bb3-1a81e72e0acb@gmail.com> <0946a458-225f-5d5f-bcad-8feda11be194@mail.de> Message-ID: On 17 May 2017 at 19:40, Juancarlo A?ez wrote: > > On Wed, May 17, 2017 at 12:48 PM, Ivan Levkivskyi > wrote: > >> class Foo(NamedTuple): >> """Foo is a very important class and >> you should totally use it. >> """ >> bar: int >> baz: int = 0 >> >> def grand_total(self): >> return self.bar + self.baz >> > > Really?! > > I didn't know that idiom existed. > > It is enough for many use cases, and I was just about to require typing > and pathlib on my 2.7-compatible projects. > > Unfortunately, this works _only_ in Python 3.6+. -- Ivan -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From srkunze at mail.de Wed May 17 13:39:36 2017 From: srkunze at mail.de (Sven R. Kunze) Date: Wed, 17 May 2017 19:39:36 +0200 Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects) In-Reply-To: <591C88A2.8070400@stoneleaf.us> References: <153e0087-2b75-a118-1bb3-1a81e72e0acb@gmail.com> <591C88A2.8070400@stoneleaf.us> Message-ID: On 17.05.2017 19:30, Ethan Furman wrote: > Given that one of Python's great strengths is its readability, I would > not use the attr library in teaching because it is not. Having a dot > in the middle of words is confusing, especially when you don't already > have a basis for which abbreviations are common. Is it attr.ib or > att.rib or at.trib? It took me 5 days to see "foo = attrib()" in "foo = attr.ib()".... What the hell means "ib"? ... Sven From srkunze at mail.de Wed May 17 13:52:02 2017 From: srkunze at mail.de (Sven R. Kunze) Date: Wed, 17 May 2017 19:52:02 +0200 Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects) In-Reply-To: References: <153e0087-2b75-a118-1bb3-1a81e72e0acb@gmail.com> <0946a458-225f-5d5f-bcad-8feda11be194@mail.de> Message-ID: Hi Stephan, hi Ivan, On 17.05.2017 18:48, Ivan Levkivskyi wrote: > from typing import NamedTuple > > class Foo(NamedTuple): > """Foo is a very important class and > you should totally use it. > """ > bar: int > baz: int = 0 > > def grand_total(self): > return self.bar + self.baz > > typing.NamedTuple supports docstrings, user-defined methods, and > default values. I hope the second ': int' can be omitted because 0 already is an int. This makes me wonder three things: 1) Michel, can newcomers differentiate between when to use ' : ' and when to use ' = ' and a combination thereof? 2) There must be a lot of cornercases where people rewrite Foo to be a normal class in the end, right? 
3) If one doesn't need tuple-__dunder__ methods, a "normal" class would
even need 1 line less. (+ Stephan's second point)

So, this still leaves those missing __dunder__ magic methods doing the
right thing at the right time.

Sven

From tritium-list at sdamon.com Wed May 17 13:55:08 2017
From: tritium-list at sdamon.com (tritium-list at sdamon.com)
Date: Wed, 17 May 2017 13:55:08 -0400
Subject: [Python-ideas] fnmatch.filter_false
In-Reply-To: <05c301d2cf31$bdda8e90$398fabb0$@hotmail.com>
References: <04fc01d2cf28$9c068480$d4138d80$@sdamon.com> <20170517164331.GA22240@phdru.name> <05c301d2cf31$bdda8e90$398fabb0$@hotmail.com>
Message-ID: <05f001d2cf36$b93de2b0$2bb9a810$@hotmail.com>

Top posting, apologies.

I'm sure there is a better way to do it, and there is a performance hit, but
it's negligible. This is also a three-line delta of the function.

from fnmatch import _compile_pattern, filter as old_filter
import os
import os.path
import posixpath


data = os.listdir()

def filter(names, pat, *, invert=False):
    """Return the subset of the list NAMES that match PAT
    (or, with invert=True, the subset that does not match)."""
    result = []
    pat = os.path.normcase(pat)
    match = _compile_pattern(pat)
    if os.path is posixpath:
        # normcase on posix is NOP. Optimize it away from the loop.
        for name in names:
            if bool(match(name)) == (not invert):
                result.append(name)
    else:
        for name in names:
            if bool(match(os.path.normcase(name))) == (not invert):
                result.append(name)
    return result

if __name__ == '__main__':
    import timeit
    # Modified filter (with the invert keyword).
    print(timeit.timeit(
        "filter(data, '__*')",
        setup="from __main__ import filter, data"
    ))
    # Unmodified fnmatch.filter, imported above as old_filter.
    print(timeit.timeit(
        "filter(data, '__*')",
        setup="from __main__ import old_filter as filter, data"
    ))

The first test (modified code) timed at 22.492161903402575, whereas the
second test (unmodified) timed at 19.555531892032324.

> -----Original Message-----
> From: tritium-list at sdamon.com [mailto:tritium-list at sdamon.com]
> Sent: Wednesday, May 17, 2017 1:19 PM
> To: python-ideas at python.org
> Subject: RE: [Python-ideas] fnmatch.filter_false
>
> > -----Original Message-----
> > From: Python-ideas [mailto:python-ideas-bounces+tritium-
> > list=sdamon.com at python.org] On Behalf Of Oleg Broytman
> > Sent: Wednesday, May 17, 2017 12:44 PM
> > To: python-ideas at python.org
> > Subject: Re: [Python-ideas] fnmatch.filter_false
> >
> > On Wed, May 17, 2017 at 12:14:05PM -0400, Alex Walters
> > <tritium-list at sdamon.com> wrote:
> > > fnmatch.filter works great. To remind people what it does, it takes an
> > > iterable of strings and a pattern and returns a list of the strings that
> > > match the pattern. And that is wonderful.
> > >
> > > However, I often need to filter *out* the items that match the pattern
> > > (to ignore them). In every project that I need this I end up copying the
> > > function out of the fnmatch library and adding 'not' to the test clause.
> > > It would be wonderful if there was a filter_false version in the standard
> > > library. Or an inverting Boolean option. Or something, to stop from
> > > having to copy code every time I need to ignore files.
> >
> > Why not create a package and publish at PyPI? Then all you need is
> > pip install fnmatch_filter_false
> > in your virtual env.
> > That is my normal thought on something like this, but in the case of adding > a Boolean argument to fnmatch.filter, it (might be) as simple as a 3 line > diff that does not break the API, and as far as I can tell, does not have > performance implications. Copying a module out of the standard library that > is identical except a 3 line diff that does not break compatibility with the > standard library... that just wreaks of something that should be in the > standard library to begin with. > > In the case of adding a separate function to fnmatch, it's still not that > big of a diff, and wouldn't have much duplicated code, at least in the way I > would implement it - it would essentially do the previous Boolean option > method, and wrap that. A filter_false function, now that I think about it, > is less ideal than just adding a keyword only Boolean option to > fnmatch.filter. > > > Oleg. > > -- > > Oleg Broytman http://phdru.name/ phd at phdru.name > > Programmers don't die, they just GOSUB without RETURN. > > _______________________________________________ > > Python-ideas mailing list > > Python-ideas at python.org > > https://mail.python.org/mailman/listinfo/python-ideas > > Code of Conduct: http://python.org/psf/codeofconduct/ From stephanh42 at gmail.com Wed May 17 13:55:34 2017 From: stephanh42 at gmail.com (Stephan Houben) Date: Wed, 17 May 2017 19:55:34 +0200 Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects) In-Reply-To: References: <153e0087-2b75-a118-1bb3-1a81e72e0acb@gmail.com> <591C88A2.8070400@stoneleaf.us> Message-ID: If this is the *only* objection to attrs let me quote some documentation: """ If playful naming turns you off, attrs comes with serious business aliases: >>> from attr import attrs, attrib >>> @attrs ... class SeriousCoordinates(object): ... x = attrib() ... y = attrib() """ So attrs and attrib can be used as alternatives for attr.s and attr.ib . 
Personally, I like the playful names. Stephan 2017-05-17 19:39 GMT+02:00 Sven R. Kunze : > On 17.05.2017 19:30, Ethan Furman wrote: >> >> Given that one of Python's great strengths is its readability, I would not >> use the attr library in teaching because it is not. Having a dot in the >> middle of words is confusing, especially when you don't already have a basis >> for which abbreviations are common. Is it attr.ib or att.rib or at.trib? > > > It took me 5 days to see "foo = attrib()" in "foo = attr.ib()".... What the > hell means "ib"? ... > > Sven > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From phd at phdru.name Wed May 17 13:59:28 2017 From: phd at phdru.name (Oleg Broytman) Date: Wed, 17 May 2017 19:59:28 +0200 Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects) In-Reply-To: References: <153e0087-2b75-a118-1bb3-1a81e72e0acb@gmail.com> <591C88A2.8070400@stoneleaf.us> Message-ID: <20170517175928.GA31319@phdru.name> On Wed, May 17, 2017 at 07:39:36PM +0200, "Sven R. Kunze" wrote: > It took me 5 days to see "foo = attrib()" in "foo = attr.ib()".... What the > hell means "ib"? ... Guido has named it "deadly cute". (-: > Sven Oleg. -- Oleg Broytman http://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN. From srkunze at mail.de Wed May 17 14:00:46 2017 From: srkunze at mail.de (Sven R. 
Kunze) Date: Wed, 17 May 2017 20:00:46 +0200 Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects) In-Reply-To: References: <153e0087-2b75-a118-1bb3-1a81e72e0acb@gmail.com> <591C88A2.8070400@stoneleaf.us> Message-ID: <5d9e36db-7c23-f786-c732-b971ada9d186@mail.de> On 17.05.2017 19:55, Stephan Houben wrote: > So attrs and attrib can be used as alternatives for attr.s and attr.ib . > Personally, I like the playful names. Ah, now I understand their documentation. :D I read this passage and thought: "where is the difference. Maybe, they meant omitting x=, y= in the constructor?" "There should be one-- and preferably only one --obvious way to do it." <<< That's why I didn't even thought there's an alternative. I feel that's a bit not-Python. ;) Sven From stephanh42 at gmail.com Wed May 17 14:08:24 2017 From: stephanh42 at gmail.com (Stephan Houben) Date: Wed, 17 May 2017 20:08:24 +0200 Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects) In-Reply-To: References: <153e0087-2b75-a118-1bb3-1a81e72e0acb@gmail.com> <0946a458-225f-5d5f-bcad-8feda11be194@mail.de> Message-ID: Hi Sven, "I hope the second ': int' can be omitted because 0 already is an int." 0 is also an Any, an object, a SupportAbs, and a Union[int, str]. And infinitely more, of course. A typechecker needs to be explicitly told which was intended. Stephan Op 17 mei 2017 19:52 schreef "Sven R. Kunze" : Hi Stephan, hi Ivan, On 17.05.2017 18:48, Ivan Levkivskyi wrote: from typing import NamedTuple class Foo(NamedTuple): """Foo is a very important class and you should totally use it. """ bar: int baz: int = 0 def grand_total(self): return self.bar + self.baz typing.NamedTuple supports docstrings, user-defined methods, and default values. I hope the second ': int' can be omitted because 0 already is an int. 
This makes me wonder three things: 1) Michel, can newcomers differentiate between when to use ' : ' and when to use ' = ' and a combination thereof? 2) There must be a lot of cornercases where people rewrite Foo to be a normal class in the end, right? 3) If one doesn't need tuple-__dunder__ methods, a "normal" class would even need 1 line less. (+ Stephan's second point) So, this still leaves those missing __dunder__ magic methods doing the right thing at the right time. Sven -------------- next part -------------- An HTML attachment was scrubbed... URL: From srkunze at mail.de Wed May 17 14:09:33 2017 From: srkunze at mail.de (Sven R. Kunze) Date: Wed, 17 May 2017 20:09:33 +0200 Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects) In-Reply-To: References: Message-ID: Hi Stephan, On 17.05.2017 08:49, Stephan Houben wrote: > 2. Not subclassed from tuple. I have been bitten by this subclassing > when trying to set up > singledispatch on sequences and also on my classes. Would it make sense to have a 'simpleobject'? Which basically implements a NamedTuple constructor but nothing more? class Foo(simpleobject): attribute1: User attribute2: Blog attribute3: list And if you need more __dunder__ magic, have it provided by some mixins? class Foo(dictlike, tuplelike, simpleobject): attribute1: User attribute2: Blog attribute3: list def __my_dunder__(self): ... I don't know exactly if some of those dictlike, tuplelike mixins are already available in the stdlib under a different name, but at least to me this looks like plug'n'play __dunder__ magic. 
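A rough sketch of what such a 'simpleobject' base class might look like, purely as an illustration of the idea above. The class body, the `_fields` attribute and the error messages are assumptions made up for this example; nothing like this exists in the stdlib, and the dictlike/tuplelike mixins are not shown.

```python
class simpleobject:
    """Hypothetical base class: derives a constructor from class annotations."""

    _fields = ()

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        fields = []
        # Walk the MRO from the most basic class down, so inherited
        # fields come first and each name is recorded only once.
        for klass in reversed(cls.__mro__):
            for name in getattr(klass, "__annotations__", {}):
                if name not in fields:
                    fields.append(name)
        cls._fields = tuple(fields)

    def __init__(self, *args, **kwargs):
        if len(args) > len(self._fields):
            raise TypeError("too many positional arguments")
        values = dict(zip(self._fields, args))
        for name, value in kwargs.items():
            if name not in self._fields:
                raise TypeError(f"unexpected field: {name}")
            if name in values:
                raise TypeError(f"field given twice: {name}")
            values[name] = value
        for name in self._fields:
            if name in values:
                setattr(self, name, values[name])
            elif not hasattr(type(self), name):
                # No class-level default to fall back on.
                raise TypeError(f"value not provided for field: {name}")

    def __repr__(self):
        args = ", ".join(f"{n}={getattr(self, n)!r}" for n in self._fields)
        return f"{type(self).__name__}({args})"


class Foo(simpleobject):
    bar: int
    baz: int = 0


print(Foo(1, baz=2))
```

Only the constructor and a repr are generated here; equality, hashing, ordering and iteration are deliberately left out, which is where the proposed mixins would come in.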
Sven

From elazarg at gmail.com Wed May 17 16:53:49 2017
From: elazarg at gmail.com (אלעזר)
Date: Wed, 17 May 2017 20:53:49 +0000
Subject: [Python-ideas] fnmatch.filter_false
In-Reply-To: <05f001d2cf36$b93de2b0$2bb9a810$@hotmail.com>
References: <04fc01d2cf28$9c068480$d4138d80$@sdamon.com> <20170517164331.GA22240@phdru.name> <05c301d2cf31$bdda8e90$398fabb0$@hotmail.com> <05f001d2cf36$b93de2b0$2bb9a810$@hotmail.com>
Message-ID:

There shouldn't be any difference at all. Checking for invert can be done
outside of the loop, which will make the loop itself exactly as it is now,
just like what's been done with normcase.

On Wed, May 17, 2017 at 8:55 PM wrote:

> Top posting, apologies.
>
> I'm sure there is a better way to do it, and there is a performance hit,
> but it's negligible. This is also a three line delta of the function.
>
> from fnmatch import _compile_pattern, filter as old_filter
> import os
> import os.path
> import posixpath
>
>
> data = os.listdir()
>
> def filter(names, pat, *, invert=False):
>     """Return the subset of the list NAMES that match PAT."""
>     result = []
>     pat = os.path.normcase(pat)
>     match = _compile_pattern(pat)
>     if os.path is posixpath:
>         # normcase on posix is NOP. Optimize it away from the loop.
>         for name in names:
>             if bool(match(name)) == (not invert):
>                 result.append(name)
>     else:
>         for name in names:
>             if bool(match(os.path.normcase(name))) == (not invert):
>                 result.append(name)
>     return result
>
> if __name__ == '__main__':
>     import timeit
>     print(timeit.timeit(
>         "filter(data, '__*')",
>         setup="from __main__ import filter, data"
>     ))
>     print(timeit.timeit(
>         "filter(data, '__*')",
>         setup="from __main__ import old_filter as filter, data"
>     ))
>
> The first test (modified code) timed at 22.492161903402575, where the
> second test (unmodified) timed at 19.555531892032324
>
>
>
> > -----Original Message-----
> > From: tritium-list at sdamon.com [mailto:tritium-list at sdamon.com]
> > Sent: Wednesday, May 17, 2017 1:19 PM
> > To: python-ideas at python.org
> > Subject: RE: [Python-ideas] fnmatch.filter_false
> >
> > > -----Original Message-----
> > > From: Python-ideas [mailto:python-ideas-bounces+tritium-
> > > list=sdamon.com at python.org] On Behalf Of Oleg Broytman
> > > Sent: Wednesday, May 17, 2017 12:44 PM
> > > To: python-ideas at python.org
> > > Subject: Re: [Python-ideas] fnmatch.filter_false
> > >
> > > On Wed, May 17, 2017 at 12:14:05PM -0400, Alex Walters <tritium-list at sdamon.com> wrote:
> > > > fnmatch.filter works great. To remind people what it does, it takes an
> > > > iterable of strings and a pattern and returns a list of the strings that
> > > > match the pattern. And that is wonderful.
> > > >
> > > > However, I often need to filter *out* the items that match the pattern
> > > > (to ignore them). In every project that I need this I end up copying the
> > > > function out of the fnmatch library and adding 'not' to the test clause.
> > > > It would be wonderful if there was a filter_false version in the standard
> > > > library. Or an inversion Boolean option. Or something, to stop from
> > > > having to copy code every time I need to ignore files.
> > >
> > > Why not create a package and publish at PyPI?
Then all you need is > > > pip install fnmatch_filter_false > > > in your virtual env. > > > > That is my normal thought on something like this, but in the case of > adding > > a Boolean argument to fnmatch.filter, it (might be) as simple as a 3 line > > diff that does not break the API, and as far as I can tell, does not have > > performance implications. Copying a module out of the standard library > that > > is identical except a 3 line diff that does not break compatibility with > the > > standard library... that just wreaks of something that should be in the > > standard library to begin with. > > > > In the case of adding a separate function to fnmatch, it's still not that > > big of a diff, and wouldn't have much duplicated code, at least in the > way > I > > would implement it - it would essentially do the previous Boolean option > > method, and wrap that. A filter_false function, now that I think about > it, > > is less ideal than just adding a keyword only Boolean option to > > fnmatch.filter. > > > > > Oleg. > > > -- > > > Oleg Broytman http://phdru.name/ > phd at phdru.name > > > Programmers don't die, they just GOSUB without RETURN. > > > _______________________________________________ > > > Python-ideas mailing list > > > Python-ideas at python.org > > > https://mail.python.org/mailman/listinfo/python-ideas > > > Code of Conduct: http://python.org/psf/codeofconduct/ > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From ethan at stoneleaf.us Wed May 17 17:06:51 2017
From: ethan at stoneleaf.us (Ethan Furman)
Date: Wed, 17 May 2017 14:06:51 -0700
Subject: [Python-ideas] [semi-OT] NamedTuple from aenum library [was: JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects)]
In-Reply-To:
References: <153e0087-2b75-a118-1bb3-1a81e72e0acb@gmail.com> <0946a458-225f-5d5f-bcad-8feda11be194@mail.de>
Message-ID: <591CBB6B.7070905@stoneleaf.us>

On 05/17/2017 10:43 AM, Ivan Levkivskyi wrote:
> On 17 May 2017 at 19:40, Juancarlo Añez wrote:
>> On Wed, May 17, 2017 at 12:48 PM, Ivan Levkivskyi wrote:

>>> class Foo(NamedTuple):
>>>     """Foo is a very important class and
>>>     you should totally use it.
>>>     """
>>>     bar: int
>>>     baz: int = 0
>>>
>>>     def grand_total(self):
>>>         return self.bar + self.baz
>>
>> Really?!
>>
>> I didn't know that idiom existed.
>>
>> It is enough for many use cases, and I was just about to require
>> typing and pathlib on my 2.7-compatible projects.
>
> Unfortunately, this works _only_ in Python 3.6+.

You might want to check out the NamedTuple class from my aenum [1]
library -- it is metaclass based (no execing), supports defaults,
doc-strings, and other fun and possibly useful things.

Here's the NamedTuple section from the docs:

> Creating NamedTuples
> --------------------
>
> Simple
> ^^^^^^
>
> The most common way to create a new NamedTuple will be via the functional API::
>
>     >>> from aenum import NamedTuple
>     >>> Book = NamedTuple('Book', 'title author genre', module=__name__)
>
> This creates a ``NamedTuple`` called ``Book`` that will always contain three
> items, each of which is also addressable as ``title``, ``author``, or ``genre``.
>
> ``Book`` instances can be created using positional or keyword arguments or a
> mixture of the two::
>
>     >>> b1 = Book('Lord of the Rings', 'J.R.R. Tolkien', 'fantasy')
>     >>> b2 = Book(title='Jhereg', author='Steven Brust', genre='fantasy')
>     >>> b3 = Book('Empire', 'Orson Scott Card', genre='scifi')
>
> If too few or too many arguments are used a ``TypeError`` will be raised::
>
>     >>> b4 = Book('Hidden Empire')
>     Traceback (most recent call last):
>     ...
>     TypeError: values not provided for field(s): author, genre
>     >>> b5 = Book(genre='business')
>     Traceback (most recent call last):
>     ...
>     TypeError: values not provided for field(s): title, author
>
> As a ``class`` the above ``Book`` ``NamedTuple`` would look like::
>
>     >>> class Book(NamedTuple):
>     ...     title = 0
>     ...     author = 1
>     ...     genre = 2
>     ...
>
> For compatibility with the stdlib ``namedtuple``, NamedTuple also has the
> ``_asdict``, ``_make``, and ``_replace`` methods, and the ``_fields``
> attribute, which all function similarly::
>
>     >>> class Point(NamedTuple):
>     ...     x = 0, 'horizontal coordinate', 1
>     ...     y = 1, 'vertical coordinate', -1
>     ...
>     >>> class Color(NamedTuple):
>     ...     r = 0, 'red component', 11
>     ...     g = 1, 'green component', 29
>     ...     b = 2, 'blue component', 37
>     ...
>     >>> Pixel = NamedTuple('Pixel', Point+Color, module=__name__)
>     >>> pixel = Pixel(99, -101, 255, 128, 0)
>
>     >>> pixel._asdict()
>     OrderedDict([('x', 99), ('y', -101), ('r', 255), ('g', 128), ('b', 0)])
>
>     >>> Point._make((4, 5))
>     Point(x=4, y=5)
>
>     >>> purple = Color(127, 0, 127)
>     >>> mid_gray = purple._replace(g=127)
>     >>> mid_gray
>     Color(r=127, g=127, b=127)
>
>     >>> pixel._fields
>     ['x', 'y', 'r', 'g', 'b']
>
>     >>> Pixel._fields
>     ['x', 'y', 'r', 'g', 'b']
>
>
> Advanced
> ^^^^^^^^
>
> The simple method of creating ``NamedTuples`` requires always specifying all
> possible arguments when creating instances; failure to do so will raise
> exceptions::
>
>     >>> class Point(NamedTuple):
>     ...     x = 0
>     ...     y = 1
>     ...
>     >>> Point()
>     Traceback (most recent call last):
>     ...
>     TypeError: values not provided for field(s): x, y
>     >>> Point(1)
>     Traceback (most recent call last):
>     ...
>     TypeError: values not provided for field(s): y
>     >>> Point(y=2)
>     Traceback (most recent call last):
>     ...
>     TypeError: values not provided for field(s): x
>
> However, it is possible to specify both docstrings and default values when
> creating a ``NamedTuple`` using the class method::
>
>     >>> class Point(NamedTuple):
>     ...     x = 0, 'horizontal coordinate', 0
>     ...     y = 1, 'vertical coordinate', 0
>     ...
>     >>> Point()
>     Point(x=0, y=0)
>     >>> Point(1)
>     Point(x=1, y=0)
>     >>> Point(y=2)
>     Point(x=0, y=2)
>
> It is also possible to create ``NamedTuples`` that only have named attributes
> for certain fields; any fields without names can still be accessed by index::
>
>     >>> class Person(NamedTuple):
>     ...     fullname = 2
>     ...     phone = 5
>     ...
>     >>> p = Person('Ethan', 'Furman', 'Ethan Furman',
>     ...            'ethan at stoneleaf dot us',
>     ...            'ethan.furman', '999.555.1212')
>     >>> p
>     Person('Ethan', 'Furman', 'Ethan Furman', 'ethan at stoneleaf dot us',
>            'ethan.furman', '999.555.1212')
>     >>> p.fullname
>     'Ethan Furman'
>     >>> p.phone
>     '999.555.1212'
>     >>> p[0]
>     'Ethan'
>
> In the above example the last named field was also the last field possible; in
> those cases where you don't need to have the last possible field named, you can
> provide a ``_size_`` of ``TupleSize.minimum`` to declare that more fields are
> okay::
>
>     >>> from aenum import TupleSize
>     >>> class Person(NamedTuple):
>     ...     _size_ = TupleSize.minimum
>     ...     first = 0
>     ...     last = 1
>     ...
>
> or, optionally if using Python 3::
>
>     >>> class Person(NamedTuple, size=TupleSize.minimum):  # doctest: +SKIP
>     ...     first = 0
>     ...     last = 1
>
> and in use::
>
>     >>> Person('Ethan', 'Furman')
>     Person(first='Ethan', last='Furman')
>
>     >>> Person('Ethan', 'Furman', 'ethan.furman')
>     Person('Ethan', 'Furman', 'ethan.furman')
>
>     >>> Person('Ethan', 'Furman', 'ethan.furman', 'yay Python!')
>     Person('Ethan', 'Furman', 'ethan.furman', 'yay Python!')
>
>     >>> Person('Ethan')
>     Traceback (most recent call last):
>     ...
>     TypeError: values not provided for field(s): last
>
> Also, for those cases where even named fields may not be present, you can
> specify ``TupleSize.variable``::
>
>     >>> class Person(NamedTuple):
>     ...     _size_ = TupleSize.variable
>     ...     first = 0
>     ...     last = 1
>     ...
>
>     >>> Person('Ethan')
>     Person('Ethan')
>
>     >>> Person(last='Furman')
>     Traceback (most recent call last):
>     ...
>     TypeError: values not provided for field(s): first
>
> Creating new ``NamedTuples`` from existing ``NamedTuples`` is simple::
>
>     >>> Point = NamedTuple('Point', 'x y')
>     >>> Color = NamedTuple('Color', 'r g b')
>     >>> Pixel = NamedTuple('Pixel', Point+Color, module=__name__)
>     >>> Pixel
>
>
> The existing fields in the base classes are renumbered to fit the new class,
> but keep their doc strings and default values. If you use standard
> subclassing::
>
>     >>> Point = NamedTuple('Point', 'x y')
>     >>> class Pixel(Point):
>     ...     r = 2, 'red component', 11
>     ...     g = 3, 'green component', 29
>     ...     b = 4, 'blue component', 37
>     ...
>     >>> Pixel.__fields__
>     ['x', 'y', 'r', 'g', 'b']
>
> You must manage the numbering yourself.

--
~Ethan~

From levkivskyi at gmail.com Wed May 17 17:29:25 2017
From: levkivskyi at gmail.com (Ivan Levkivskyi)
Date: Wed, 17 May 2017 23:29:25 +0200
Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects)
In-Reply-To:
References:
Message-ID:

On 17 May 2017 at 20:09, Sven R. Kunze wrote:

> Hi Stephan,
>
> On 17.05.2017 08:49, Stephan Houben wrote:
>
>> 2. Not subclassed from tuple.
I have been bitten by this subclassing >> when trying to set up >> singledispatch on sequences and also on my classes. >> > > Would it make sense to have a 'simpleobject'? Which basically implements a > NamedTuple constructor but nothing more? > > class Foo(simpleobject): > attribute1: User > attribute2: Blog > attribute3: list > > > And if you need more __dunder__ magic, have it provided by some mixins? > > > class Foo(dictlike, tuplelike, simpleobject): > attribute1: User > attribute2: Blog > attribute3: list > > def __my_dunder__(self): > ... > As I understand this is more or less what is proposed, the idea is to write it into a PEP and consider API/corner cases/implementation/etc. -- Ivan -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Wed May 17 18:19:32 2017 From: ethan at stoneleaf.us (Ethan Furman) Date: Wed, 17 May 2017 15:19:32 -0700 Subject: [Python-ideas] [semi-OT] NamedTuple from aenum library [was: JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects)] In-Reply-To: <591CBB6B.7070905@stoneleaf.us> References: <153e0087-2b75-a118-1bb3-1a81e72e0acb@gmail.com> <0946a458-225f-5d5f-bcad-8feda11be194@mail.de> <591CBB6B.7070905@stoneleaf.us> Message-ID: <591CCC74.2030804@stoneleaf.us> On 05/17/2017 02:06 PM, Ethan Furman wrote: > You might want to check out the NamedTuple class from my aenum [1] > library [1] https://pypi.python.org/pypi/aenum -- ~Ethan~ From steve at pearwood.info Wed May 17 21:00:40 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 18 May 2017 11:00:40 +1000 Subject: [Python-ideas] fnmatch.filter_false In-Reply-To: <04fc01d2cf28$9c068480$d4138d80$@sdamon.com> References: <04fc01d2cf28$9c068480$d4138d80$@sdamon.com> Message-ID: <20170518010040.GS24625@ando.pearwood.info> On Wed, May 17, 2017 at 12:14:05PM -0400, Alex Walters wrote: > Fnmath.filter works great. 
To remind people what it does, it takes an
> iterable of strings and a pattern and returns a list of the strings that
> match the pattern. And that is wonderful
>
> However, I often need to filter *out* the items that match the pattern (to
> ignore them). In every project that I need this I end up copying the
> function out of the fnmatch library and adding 'not' to the test clause.

At the cost of a slight inefficiency, you could use the pure Python
equivalent given in the docs:

https://docs.python.org/3/library/fnmatch.html#fnmatch.filter

    fnmatch.filter(names, pattern)

        Return the subset of the list of names that match pattern. It is
        the same as [n for n in names if fnmatch(n, pattern)], but implemented
        more efficiently.

So your filter_false is:

    [n for n in names if not fnmatch(n, pattern)]

which avoids the need for the copy-and-paste anti-pattern.

Otherwise, I would support:

- filter_false

- or a glob symbol to reverse the sense of the test, e.g. ~ or !
  as the first character;

but I dislike functions that take boolean arguments to change their
behaviour. It's not a 100% hard and fast rule, but in general I prefer
to avoid functions that take a constant bool argument:

    # rather than this:
    function(important_args, True)
    function(important_args, False)

    # use this:
    function(important_args)
    function_false(important_args)

(although there may be exceptions).
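As a minimal sketch, the list-comprehension form of filter_false suggested above could be wrapped up like this. The helper name `filter_false` and the sample file names are illustrative assumptions; no such function exists in the fnmatch module.

```python
from fnmatch import fnmatch


def filter_false(names, pattern):
    """Return the names that do NOT match the shell-style pattern.

    Hypothetical helper, not part of the standard library.
    """
    return [n for n in names if not fnmatch(n, pattern)]


names = ["__init__.py", "__pycache__", "setup.py", "README.rst"]
kept = filter_false(names, "__*")
print(kept)
```

Like fnmatch.filter, this goes through fnmatch(), so names and pattern are case-normalised the same way on each platform; the cost is one extra function call per name.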
--
Steve

From tritium-list at sdamon.com Thu May 18 00:07:05 2017
From: tritium-list at sdamon.com (tritium-list at sdamon.com)
Date: Thu, 18 May 2017 00:07:05 -0400
Subject: [Python-ideas] fnmatch.filter_false
In-Reply-To: <20170518010040.GS24625@ando.pearwood.info>
References: <04fc01d2cf28$9c068480$d4138d80$@sdamon.com> <20170518010040.GS24625@ando.pearwood.info>
Message-ID: <062b01d2cf8c$3682d280$a3887780$@hotmail.com>

> -----Original Message-----
> From: Python-ideas [mailto:python-ideas-bounces+tritium-
> list=sdamon.com at python.org] On Behalf Of Steven D'Aprano
> Sent: Wednesday, May 17, 2017 9:01 PM
> To: python-ideas at python.org
> Subject: Re: [Python-ideas] fnmatch.filter_false
>
> On Wed, May 17, 2017 at 12:14:05PM -0400, Alex Walters wrote:
> > fnmatch.filter works great. To remind people what it does, it takes an
> > iterable of strings and a pattern and returns a list of the strings that
> > match the pattern. And that is wonderful.
> >
> > However, I often need to filter *out* the items that match the pattern (to
> > ignore them). In every project that I need this I end up copying the
> > function out of the fnmatch library and adding 'not' to the test clause.
>
> At the cost of a slight inefficiency, you could use the pure Python
> equivalent given in the docs:
>
> https://docs.python.org/3/library/fnmatch.html#fnmatch.filter
>
>
>     fnmatch.filter(names, pattern)
>
>         Return the subset of the list of names that match pattern. It is
>         the same as [n for n in names if fnmatch(n, pattern)], but implemented
>         more efficiently.
>
> So your filter_false is:
>
>     [n for n in names if not fnmatch(n, pattern)]
>
> which avoids the need for the copy-and-paste anti-pattern.

I ran a test on the same dataset using listcomps. The modified version of
filter still times at roughly 22 and the unmodified version at roughly 19,
but the listcomp method times at roughly 41. That performance is important,
at least to me.

Before you ask, yes, I have profiled my code, and filtering is a bottleneck.
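A comparison along these lines could be reproduced with a small harness such as the sketch below. The file names, the repeat count and both helper names are assumptions for illustration, and the absolute numbers will differ from machine to machine. The second helper uses only the public fnmatch.translate() to compile the pattern once, which is probably part of why fnmatch.filter beats the per-name fnmatch() call.

```python
import fnmatch
import re
import timeit

# Made-up data set: 1000 names, half of which match the pattern.
names = ["__init__.py", "__main__.py", "setup.py", "README.rst"] * 250
pattern = "__*"


def listcomp_filter_false(names, pattern):
    # One fnmatch.fnmatch() call per name.
    return [n for n in names if not fnmatch.fnmatch(n, pattern)]


def precompiled_filter_false(names, pattern):
    # Translate and compile the pattern once, then reuse the bound match
    # method. Unlike fnmatch.filter, this skips os.path.normcase, so the
    # match is case-sensitive on every platform.
    match = re.compile(fnmatch.translate(pattern)).match
    return [n for n in names if match(n) is None]


for func in (listcomp_filter_false, precompiled_filter_false):
    seconds = timeit.timeit(lambda: func(names, pattern), number=50)
    print(f"{func.__name__}: {seconds:.4f} s for 50 runs")
```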
> > Otherwise, I would support: > > - filter_false > > - or a glob symbol to reverse the sense of the test, e.g. ~ or ! > as the first character; > > > but I dislike functions that take boolean arguments to change their > behaviour. It's not a 100% hard and fast rule, but in general I prefer > to avoid functions that take a constant bool argument: > > # rather than this: > function(important_args, True) > function(important_args, False) > > # use this: > function(important_args) > function_false(important_args) > > (although there may be exceptions). > That is a fair enough argument. In moderate defense of the code I posted earlier, I did make the inversion variable keyword only. > > -- > Steve > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From steve at pearwood.info Thu May 18 09:00:42 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 18 May 2017 23:00:42 +1000 Subject: [Python-ideas] fnmatch.filter_false In-Reply-To: <062b01d2cf8c$3682d280$a3887780$@hotmail.com> References: <04fc01d2cf28$9c068480$d4138d80$@sdamon.com> <20170518010040.GS24625@ando.pearwood.info> <062b01d2cf8c$3682d280$a3887780$@hotmail.com> Message-ID: <20170518130041.GU24625@ando.pearwood.info> On Thu, May 18, 2017 at 12:07:05AM -0400, tritium-list at sdamon.com wrote: > > At the cost of a slight inefficiency, you could use the pure Python > > equivalent given in the docs: > > > > https://docs.python.org/3/library/fnmatch.html#fnmatch.filter > > > > > > fnmatch.filter(names, pattern) > > > > Return the subset of the list of names that match pattern. It is > > the same as [n for n in names if fnmatch(n, pattern)], but implemented > > more efficiently. > > > > So your filter_false is: > > > > [n for n in names if not fnmatch(n, pattern)] > > > > which avoids the need for the copy-and-paste anti-pattern. 
>
> I ran a test on the same dataset using listcomps. The modified version of
> filter still times at roughly 22 and the unmodified version at roughly 19,
> but the listcomp method times at roughly 41. That performance
> is important, at least to me.

41 what? Nanoseconds? Hours? For how many files? *wink*

In any case, are you running on Linux or Unix? If so, try replacing the
fnmatch with fnmatchcase, since that avoids calling os.path.normcase (a
no-op on Linux) twice for each file.

If you want to reduce the number of function calls even more, try this
untested code:

    # using an implementation detail is a bit naughty;
    # _compile_pattern() already returns the compiled regex's match method
    match = fnmatch._compile_pattern(pattern)
    results = [n for n in names if match(n) is None]

--
Steve

From steve at pearwood.info Thu May 18 09:03:35 2017
From: steve at pearwood.info (Steven D'Aprano)
Date: Thu, 18 May 2017 23:03:35 +1000
Subject: [Python-ideas] Tighten up the formal grammar and parsing a bit?
In-Reply-To:
References: <20170515130014.GN24625@ando.pearwood.info>
Message-ID: <20170518130334.GV24625@ando.pearwood.info>

On Mon, May 15, 2017 at 11:17:48PM +1000, Chris Angelico wrote:
> On Mon, May 15, 2017 at 11:00 PM, Steven D'Aprano wrote:
> > There are also cases where
> >
> >     if x > y:
> >         pass
> >     else:
> >         code
> >
> > is *not necessarily* the same as
> >
> >     if not (x > y):
> >         code
> >
> > (x > y) is not always not(x <= y). E.g. sets, and even floats.
>
> Uhm.... not sure what you're getting at here.

Neither am I, now :-)

I'm not quite sure what I was thinking, but it made sense at the time.
Sorry for the noise.

--
Steve

From brett at python.org Thu May 18 11:25:28 2017
From: brett at python.org (Brett Cannon)
Date: Thu, 18 May 2017 15:25:28 +0000
Subject: [Python-ideas] PEP 526: why ClassVar instead of ClassAttr?
In-Reply-To: References: <20170516232338.GO24625@ando.pearwood.info> <20170517090330.GP24625@ando.pearwood.info> Message-ID: On Wed, May 17, 2017, 09:33 Ivan Levkivskyi, wrote: > On 17 May 2017 at 11:03, Steven D'Aprano wrote: > >> On Tue, May 16, 2017 at 10:11:31PM -0700, Guido van Rossum wrote: >> > There was another reason too. Many things are "class attributes" e.g. >> > methods, descriptors. But only specific things are class *variables*. >> >> Ah, that's a good reason. I can live with that. >> >> Thanks for the explanation. >> >> > This was discussed during development of PEP 526: > https://github.com/python/typing/issues/258#issuecomment-242263868 > Maybe we should add a corresponding subsection to "Rejected ideas"? > If it isn't too much trouble to write them I say it's a reasonable idea. -brett > -- > Ivan > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From levkivskyi at gmail.com Thu May 18 17:14:45 2017 From: levkivskyi at gmail.com (Ivan Levkivskyi) Date: Thu, 18 May 2017 23:14:45 +0200 Subject: [Python-ideas] PEP 526: why ClassVar instead of ClassAttr? In-Reply-To: References: <20170516232338.GO24625@ando.pearwood.info> <20170517090330.GP24625@ando.pearwood.info> Message-ID: On 18 May 2017 at 17:25, Brett Cannon wrote: > On Wed, May 17, 2017, 09:33 Ivan Levkivskyi, wrote: > >> On 17 May 2017 at 11:03, Steven D'Aprano wrote: >> >>> On Tue, May 16, 2017 at 10:11:31PM -0700, Guido van Rossum wrote: >>> > There was another reason too. Many things are "class attributes" e.g. >>> > methods, descriptors. But only specific things are class *variables*. >>> >>> Ah, that's a good reason. I can live with that. >>> >>> Thanks for the explanation. 
>>> >>> >> This was discussed during development of PEP 526: >> https://github.com/python/typing/issues/258#issuecomment-242263868 >> Maybe we should add a corresponding subsection to "Rejected ideas"? >> > > If it isn't too much trouble to write them I say it's a reasonable idea. > Here is a small proposed PR https://github.com/python/peps/pull/261 -- Ivan -------------- next part -------------- An HTML attachment was scrubbed... URL: From srkunze at mail.de Thu May 18 17:26:30 2017 From: srkunze at mail.de (Sven R. Kunze) Date: Thu, 18 May 2017 23:26:30 +0200 Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects) In-Reply-To: References: Message-ID: <17a0b553-161b-cff5-9ad1-8710260d7ec6@mail.de> On 17.05.2017 23:29, Ivan Levkivskyi wrote: > On 17 May 2017 at 20:09, Sven R. Kunze > wrote: > > class Foo(dictlike, tuplelike, simpleobject): > attribute1: User > attribute2: Blog > attribute3: list > > def __my_dunder__(self): > ... > > > As I understand this is more or less what is proposed, Are you sure? Could you point me to the relevant messages where mixins are mentioned as a key part of the proposal? All I could find are message using the @decorator syntaxes. We've been working with mixins successfully for years now and I can tell you that it's "just" a clever way of refactoring existing code in order to make it more accessible to other modules *based on use-cases*. So, the best person to tell what pieces to factor out would be Stephan using his 4-point list. And of course other people using NamedTuple but frequently refactoring to "attr" or own class because NamedTuple just is too much (defines too much implicitly). Another benefit is that NamedTuple itself would become a mere set of base class and mixins. > the idea is to write it into a PEP and consider API/corner > cases/implementation/etc. Who's writing it? 
Regards, Sven -------------- next part -------------- An HTML attachment was scrubbed... URL: From tritium-list at sdamon.com Thu May 18 18:34:32 2017 From: tritium-list at sdamon.com (tritium-list at sdamon.com) Date: Thu, 18 May 2017 18:34:32 -0400 Subject: [Python-ideas] fnmatch.filter_false In-Reply-To: <20170518130041.GU24625@ando.pearwood.info> References: <04fc01d2cf28$9c068480$d4138d80$@sdamon.com> <20170518010040.GS24625@ando.pearwood.info> <062b01d2cf8c$3682d280$a3887780$@hotmail.com> <20170518130041.GU24625@ando.pearwood.info> Message-ID: <06b401d2d026$ec478350$c4d689f0$@hotmail.com> > -----Original Message----- > From: Python-ideas [mailto:python-ideas-bounces+tritium- > list=sdamon.com at python.org] On Behalf Of Steven D'Aprano > Sent: Thursday, May 18, 2017 9:01 AM > To: python-ideas at python.org > Subject: Re: [Python-ideas] fnmatch.filter_false > > On Thu, May 18, 2017 at 12:07:05AM -0400, tritium-list at sdamon.com wrote: > > > > At the cost of a slight inefficiency, you could use the pure Python > > > equivalent given in the docs: > > > > > > https://docs.python.org/3/library/fnmatch.html#fnmatch.filter > > > > > > > > > fnmatch.filter(names, pattern) > > > > > > Return the subset of the list of names that match pattern. It is > > > the same as [n for n in names if fnmatch(n, pattern)], but implemented > > > more efficiently. > > > > > > So your filter_false is: > > > > > > [n for n in names if not fnmatch(n, pattern)] > > > > > > which avoids the need for the copy-and-paste anti-pattern. > > > > I ran a test on the same dataset using listcomps. The modified version of > > filter is still 22., the unmodified version is still 19. > noise>. However the listcomp method is 41.. That > performance > > is important, at least to me. > > 41 what? Nanoseconds? Hours? For how many files? *wink* 41 "integer units of whatever timeit reports with". I would have to look, but iirc it's seconds for 10000 runs. I am not on unix. 
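For reference, a small sketch of how timeit reports its result: timeit.timeit() returns the total elapsed time in seconds, as a float, for `number` executions of the statement, and `number` defaults to 1,000,000 when it is not passed (as in the benchmark script posted earlier in this thread). The statement and data below are made up for illustration.

```python
import timeit

# number defaults to 1_000_000 when omitted; it is passed explicitly here.
runs = 10_000
total = timeit.timeit("sorted(data)", setup="data = list(range(100))", number=runs)
per_call = total / runs

print(f"total: {total:.4f} s for {runs} runs "
      f"({per_call * 1e6:.2f} microseconds per call)")
```

Dividing by the run count gives a per-call figure that is easier to compare across benchmarks that use different values of `number`.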
> In any case, are you running on Linux or Unix? If so, try replacing the > fnmatch with fnmatchcase, since that avoids calling os.path.normcase (a > no-op on Linux) twice for each file. > > If you want to reduce the number of function calls even more, try this > untested code: > > # using an implementation detail is a bit naughty > match = fnmatch._compile_pattern(pattern).match > results = [n for n in names if match(n) is None] > > > > > -- > Steve > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From eric at trueblade.com Thu May 18 21:37:41 2017 From: eric at trueblade.com (Eric V. Smith) Date: Thu, 18 May 2017 18:37:41 -0700 Subject: [Python-ideas] Data Classes (was: Re: JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects)) In-Reply-To: <17a0b553-161b-cff5-9ad1-8710260d7ec6@mail.de> References: <17a0b553-161b-cff5-9ad1-8710260d7ec6@mail.de> Message-ID: <3c1923e6-19c3-eeba-39ec-b5d0b406234a@trueblade.com> On 5/18/17 2:26 PM, Sven R. Kunze wrote: > On 17.05.2017 23:29, Ivan Levkivskyi wrote: >> the idea is to write it into a PEP and consider API/corner >> cases/implementation/etc. > > Who's writing it? Guido, Hynek, and I met today. I'm writing up our notes, and hopefully that will eventually become a PEP. I'm going to propose calling this feature "Data Classes" as a placeholder until we come up with something better. Once I have something readable, I'll open it up for discussion. Eric. From eric at trueblade.com Fri May 19 00:31:59 2017 From: eric at trueblade.com (Eric V. 
Smith) Date: Thu, 18 May 2017 21:31:59 -0700 Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects) In-Reply-To: <17a0b553-161b-cff5-9ad1-8710260d7ec6@mail.de> References: <17a0b553-161b-cff5-9ad1-8710260d7ec6@mail.de> Message-ID: <89cf9642-266e-9979-bd58-1334252844e2@trueblade.com> On 5/18/17 2:26 PM, Sven R. Kunze wrote: > On 17.05.2017 23:29, Ivan Levkivskyi wrote: >> On 17 May 2017 at 20:09, Sven R. Kunze > > wrote: >> >> class Foo(dictlike, tuplelike, simpleobject): >> attribute1: User >> attribute2: Blog >> attribute3: list >> >> def __my_dunder__(self): >> ... >> >> >> As I understand this is more or less what is proposed, > > Are you sure? Could you point me to the relevant messages where mixins > are mentioned as a key part of the proposal? All I could find are > message using the @decorator syntaxes. > > We've been working with mixins successfully for years now and I can tell > you that it's "just" a clever way of refactoring existing code in order > to make it more accessible to other modules *based on use-cases*. > > > So, the best person to tell what pieces to factor out would be Stephan > using his 4-point list. > And of course other people using NamedTuple but frequently refactoring > to "attr" or own class because NamedTuple just is too much (defines too > much implicitly). Could you point me to this 4-point list of Stephan's? I couldn't find anything in the archive that you might be referring to. Eric. From niki.spahiev at gmail.com Fri May 19 03:10:00 2017 From: niki.spahiev at gmail.com (Niki Spahiev) Date: Fri, 19 May 2017 10:10:00 +0300 Subject: [Python-ideas] Data Classes In-Reply-To: <3c1923e6-19c3-eeba-39ec-b5d0b406234a@trueblade.com> References: <17a0b553-161b-cff5-9ad1-8710260d7ec6@mail.de> <3c1923e6-19c3-eeba-39ec-b5d0b406234a@trueblade.com> Message-ID: On 19.05.2017 04:37, Eric V. Smith wrote: > On 5/18/17 2:26 PM, Sven R. 
Kunze wrote: >> On 17.05.2017 23:29, Ivan Levkivskyi wrote: >>> the idea is to write it into a PEP and consider API/corner >>> cases/implementation/etc. >> >> Who's writing it? > > Guido, Hynek, and I met today. I'm writing up our notes, and hopefully > that will eventually become a PEP. I'm going to propose calling this > feature "Data Classes" as a placeholder until we come up with something > better. FWIW: Kotlin has a similar feature, the data class. data class ABCthing(val a: Double, val b: Double, val c: Double) Niki From eric at trueblade.com Fri May 19 08:42:12 2017 From: eric at trueblade.com (Eric V. Smith) Date: Fri, 19 May 2017 05:42:12 -0700 Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects) In-Reply-To: <89cf9642-266e-9979-bd58-1334252844e2@trueblade.com> References: <17a0b553-161b-cff5-9ad1-8710260d7ec6@mail.de> <89cf9642-266e-9979-bd58-1334252844e2@trueblade.com> Message-ID: <12b35e8c-28e6-b59e-8b4e-277c9ebe79cd@trueblade.com> > Could you point me to this 4-point list of Stephan's? I couldn't find > anything in the archive that you might be referring to. Never mind, I found them here: https://mail.python.org/pipermail/python-ideas/2017-May/045679.html Eric. From wolfgang.maier at biologie.uni-freiburg.de Fri May 19 10:03:16 2017 From: wolfgang.maier at biologie.uni-freiburg.de (Wolfgang Maier) Date: Fri, 19 May 2017 16:03:16 +0200 Subject: [Python-ideas] fnmatch.filter_false In-Reply-To: <05f001d2cf36$b93de2b0$2bb9a810$@hotmail.com> References: <04fc01d2cf28$9c068480$d4138d80$@sdamon.com> <20170517164331.GA22240@phdru.name> <05c301d2cf31$bdda8e90$398fabb0$@hotmail.com> <05f001d2cf36$b93de2b0$2bb9a810$@hotmail.com> Message-ID: On 05/17/2017 07:55 PM, tritium-list at sdamon.com wrote: > Top posting, apologies. > > I'm sure there is a better way to do it, and there is a performance hit, but > its negligible. This is also a three line delta of the function.
> > from fnmatch import _compile_pattern, filter as old_filter > import os > import os.path > import posixpath > > > data = os.listdir() > > def filter(names, pat, *, invert=False): > """Return the subset of the list NAMES that match PAT.""" > result = [] > pat = os.path.normcase(pat) > match = _compile_pattern(pat) > if os.path is posixpath: > # normcase on posix is NOP. Optimize it away from the loop. > for name in names: > if bool(match(name)) == (not invert): > result.append(name) > else: > for name in names: > if bool(match(os.path.normcase(name))) == (not invert): > result.append(name) > return result > > if __name__ == '__main__': > import timeit > print(timeit.timeit( > "filter(data, '__*')", > setup="from __main__ import filter, data" > )) > print(timeit.timeit( > "filter(data, '__*')", > setup="from __main__ import old_filter as filter, data" > )) > > The first test (modified code) timed at 22.492161903402575, where the second > test (unmodified) timed at 19.555531892032324 > If you don't care about slow-downs in this range, you could use this pattern: excluded = set(filter(data, '__*')) result = [item for item in data if item not in excluded] It seems to take just as much longer although the slow-down is not constant but depends on the size of the set you need to generate. Wolfgang From srkunze at mail.de Fri May 19 10:46:50 2017 From: srkunze at mail.de (Sven R. Kunze) Date: Fri, 19 May 2017 16:46:50 +0200 Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects) In-Reply-To: <12b35e8c-28e6-b59e-8b4e-277c9ebe79cd@trueblade.com> References: <17a0b553-161b-cff5-9ad1-8710260d7ec6@mail.de> <89cf9642-266e-9979-bd58-1334252844e2@trueblade.com> <12b35e8c-28e6-b59e-8b4e-277c9ebe79cd@trueblade.com> Message-ID: <4b730e9f-94ee-b93d-cdef-7bc6c5637e39@mail.de> Exactly this. On 19.05.2017 14:42, Eric V. Smith wrote: >> Could you point me to this 4-point list of Stephan's? 
I couldn't find >> anything in the archive that you might be referring to. > > Never mind, I found them here: > https://mail.python.org/pipermail/python-ideas/2017-May/045679.html > > Eric. > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From tritium-list at sdamon.com Fri May 19 14:01:27 2017 From: tritium-list at sdamon.com (tritium-list at sdamon.com) Date: Fri, 19 May 2017 14:01:27 -0400 Subject: [Python-ideas] fnmatch.filter_false In-Reply-To: References: <04fc01d2cf28$9c068480$d4138d80$@sdamon.com> <20170517164331.GA22240@phdru.name> <05c301d2cf31$bdda8e90$398fabb0$@hotmail.com> <05f001d2cf36$b93de2b0$2bb9a810$@hotmail.com> Message-ID: <075201d2d0c9$f0135b50$d03a11f0$@hotmail.com> > -----Original Message----- > From: Python-ideas [mailto:python-ideas-bounces+tritium- > list=sdamon.com at python.org] On Behalf Of Wolfgang Maier > Sent: Friday, May 19, 2017 10:03 AM > To: python-ideas at python.org > Subject: Re: [Python-ideas] fnmatch.filter_false > > On 05/17/2017 07:55 PM, > tritium-list at sdamon.com wrote: > > Top posting, apologies. > > > > I'm sure there is a better way to do it, and there is a performance hit, but > > its negligible. This is also a three line delta of the function. > > > > from fnmatch import _compile_pattern, filter as old_filter > > import os > > import os.path > > import posixpath > > > > > > data = os.listdir() > > > > def filter(names, pat, *, invert=False): > > """Return the subset of the list NAMES that match PAT.""" > > result = [] > > pat = os.path.normcase(pat) > > match = _compile_pattern(pat) > > if os.path is posixpath: > > # normcase on posix is NOP. Optimize it away from the loop. 
> > for name in names: > > if bool(match(name)) == (not invert): > > result.append(name) > > else: > > for name in names: > > if bool(match(os.path.normcase(name))) == (not invert): > > result.append(name) > > return result > > > > if __name__ == '__main__': > > import timeit > > print(timeit.timeit( > > "filter(data, '__*')", > > setup="from __main__ import filter, data" > > )) > > print(timeit.timeit( > > "filter(data, '__*')", > > setup="from __main__ import old_filter as filter, data" > > )) > > > > The first test (modified code) timed at 22.492161903402575, where the > second > > test (unmodified) timed at 19.555531892032324 > > > > If you don't care about slow-downs in this range, you could use this > pattern: > > excluded = set(filter(data, '__*')) > result = [item for item in data if item not in excluded] > > It seems to take just as much longer although the slow-down is not > constant but depends on the size of the set you need to generate. > > Wolfgang > If I didn't care about performance, I wouldn't be using filter - the only reason to use filter over a list comprehension is performance. The standard library has a performant inclusion filter, but does not have a performant exclusion filter. From guido at python.org Fri May 19 14:24:53 2017 From: guido at python.org (Guido van Rossum) Date: Fri, 19 May 2017 11:24:53 -0700 Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects) In-Reply-To: <12b35e8c-28e6-b59e-8b4e-277c9ebe79cd@trueblade.com> References: <17a0b553-161b-cff5-9ad1-8710260d7ec6@mail.de> <89cf9642-266e-9979-bd58-1334252844e2@trueblade.com> <12b35e8c-28e6-b59e-8b4e-277c9ebe79cd@trueblade.com> Message-ID: For people who don't want to click on links: 1. Allow hash and equality to be based on object identity, rather than structural identity, this is very important if one wants to store un-hashable objects in the instance. (In my case: mostly dict's and numpy arrays). 2. 
Not subclassed from tuple. I have been bitten by this subclassing when trying to set up singledispatch on sequences and also on my classes. 3. Easily allow to specify default values. With namedtuple this requires overriding __new__. 4. Easily allow to specify a conversion function. For example I have some code like below: note that I can store a numpy array while keeping hashability and I can make it convert to a numpy array in the constructor. @attr.s(cmp=False, hash=False) class SvgTransform(SvgPicture): child = attr.ib() matrix = attr.ib(convert=numpy.asarray) I have one question about (4) -- how and when is the conversion function used, and what is its signature? On Fri, May 19, 2017 at 5:42 AM, Eric V. Smith wrote: > Could you point me to this 4-point list of Stephan's? I couldn't find >> anything in the archive that you might be referring to. >> > > Never mind, I found them here: > https://mail.python.org/pipermail/python-ideas/2017-May/045679.html > > Eric. > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- --Guido van Rossum (python.org/~guido) From stephanh42 at gmail.com Fri May 19 14:32:44 2017 From: stephanh42 at gmail.com (Stephan Houben) Date: Fri, 19 May 2017 20:32:44 +0200 Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects) In-Reply-To: References: <17a0b553-161b-cff5-9ad1-8710260d7ec6@mail.de> <89cf9642-266e-9979-bd58-1334252844e2@trueblade.com> <12b35e8c-28e6-b59e-8b4e-277c9ebe79cd@trueblade.com> Message-ID: Let me quote the attrs docs: """ convert (callable) - callable() that is called by attrs-generated __init__ methods to convert attribute's value to the desired format.
It is given the passed-in value, and the returned value will be used as the new value of the attribute. The value is converted before being passed to the validator, if any. """ So the signature is essentially: self.myattrib = callable(myattrib) Stephan On 19 May 2017 at 20:25, "Guido van Rossum" wrote: > For people who don't want to click on links: > > 1. Allow hash and equality to be based on object identity, rather than > structural identity, > this is very important if one wants to store un-hashable objects in > the instance. > (In my case: mostly dict's and numpy arrays). > > 2. Not subclassed from tuple. I have been bitten by this subclassing > when trying to set up > singledispatch on sequences and also on my classes. > > 3. Easily allow to specify default values. With namedtuple this > requires overriding __new__. > > 4. Easily allow to specify a conversion function. For example I have > some code like below: > note that I can store a numpy array while keeping hashability and > I can make it convert > to a numpy array in the constructor. > > @attr.s(cmp=False, hash=False) > class SvgTransform(SvgPicture): > child = attr.ib() > matrix = attr.ib(convert=numpy.asarray) > > > I have one question about (4) -- how and when is the conversion function used, and what is its signature? > > > On Fri, May 19, 2017 at 5:42 AM, Eric V. Smith wrote: > >> Could you point me to this 4-point list of Stephan's? I couldn't find >>> anything in the archive that you might be referring to. >>> >> >> Never mind, I found them here: >> https://mail.python.org/pipermail/python-ideas/2017-May/045679.html >> >> Eric.
>> >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > > > > -- > --Guido van Rossum (python.org/~guido) > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > From guido at python.org Fri May 19 14:36:08 2017 From: guido at python.org (Guido van Rossum) Date: Fri, 19 May 2017 11:36:08 -0700 Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects) In-Reply-To: References: <17a0b553-161b-cff5-9ad1-8710260d7ec6@mail.de> <89cf9642-266e-9979-bd58-1334252844e2@trueblade.com> <12b35e8c-28e6-b59e-8b4e-277c9ebe79cd@trueblade.com> Message-ID: So it is only called by __init__ and not by __setattr__? On Fri, May 19, 2017 at 11:32 AM, Stephan Houben wrote: > Let me quote the attrs docs: > > """ > convert (callable) - callable() that is called by attrs-generated __init__ > methods to convert attribute's value to the desired format. It is given the > passed-in value, and the returned value will be used as the new value of > the attribute. The value is converted before being passed to the validator, > if any. > """ > > So the signature is essentially: > > self.myattrib = callable(myattrib) > > Stephan > > On 19 May 2017 at 20:25, "Guido van Rossum" wrote: > > For people who don't want to click on links: >> >> 1. Allow hash and equality to be based on object identity, rather than >> structural identity, >> this is very important if one wants to store un-hashable objects in >> the instance. >> (In my case: mostly dict's and numpy arrays). >> >> 2. Not subclassed from tuple.
I have been bitten by this subclassing >> when trying to set up >> singledispatch on sequences and also on my classes. >> >> 3. Easily allow to specify default values. With namedtuple this >> requires overriding __new__. >> >> 4. Easily allow to specify a conversion function. For example I have >> some code like below: >> note that I can store a numpy array while keeping hashability and >> I can make it convert >> to a numpy array in the constructor. >> >> @attr.s(cmp=False, hash=False) >> class SvgTransform(SvgPicture): >> child = attr.ib() >> matrix = attr.ib(convert=numpy.asarray) >> >> >> I have one question about (4) -- how and when is the conversion function used, and what is its signature? >> >> >> On Fri, May 19, 2017 at 5:42 AM, Eric V. Smith >> wrote: >> >>> Could you point me to this 4-point list of Stephan's? I couldn't find >>>> anything in the archive that you might be referring to. >>>> >>> >>> Never mind, I found them here: >>> https://mail.python.org/pipermail/python-ideas/2017-May/045679.html >>> >>> Eric. >>> >>> >>> _______________________________________________ >>> Python-ideas mailing list >>> Python-ideas at python.org >>> https://mail.python.org/mailman/listinfo/python-ideas >>> Code of Conduct: http://python.org/psf/codeofconduct/ >>> >> >> >> >> -- >> --Guido van Rossum (python.org/~guido) >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> >> -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From stephanh42 at gmail.com Fri May 19 14:49:08 2017 From: stephanh42 at gmail.com (Stephan Houben) Date: Fri, 19 May 2017 20:49:08 +0200 Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects) In-Reply-To: References: <17a0b553-161b-cff5-9ad1-8710260d7ec6@mail.de> <89cf9642-266e-9979-bd58-1334252844e2@trueblade.com> <12b35e8c-28e6-b59e-8b4e-277c9ebe79cd@trueblade.com> Message-ID: Hi Guido, Yes indeed, *only* invoked by __init__. See my test below. ===== import attr @attr.s class Foo: x = attr.ib(convert=str) foo = Foo(42) print(repr(foo.x)) # prints '42' foo.x = 42 print(repr(foo.x)) # prints 42 ====== Not sure if this is a good design but it matches the docs. Stephan On 19 May 2017 at 20:36, "Guido van Rossum" wrote: So it is only called by __init__ and not by __setattr__? On Fri, May 19, 2017 at 11:32 AM, Stephan Houben wrote: > Let me quote the attrs docs: > > """ > convert (callable) - callable() that is called by attrs-generated __init__ > methods to convert attribute's value to the desired format. It is given the > passed-in value, and the returned value will be used as the new value of > the attribute. The value is converted before being passed to the validator, > if any. > """ > > So the signature is essentially: > > self.myattrib = callable(myattrib) > > Stephan > > On 19 May 2017 at 20:25, "Guido van Rossum" wrote: > > For people who don't want to click on links: >> >> 1. Allow hash and equality to be based on object identity, rather than >> structural identity, >> this is very important if one wants to store un-hashable objects in >> the instance. >> (In my case: mostly dict's and numpy arrays). >> >> 2. Not subclassed from tuple. I have been bitten by this subclassing >> when trying to set up >> singledispatch on sequences and also on my classes. >> >> 3. Easily allow to specify default values. With namedtuple this >> requires overriding __new__. >> >> 4.
Easily allow to specify a conversion function. For example I have >> some code like below: >> note that I can store a numpy array while keeping hashability and >> I can make it convert >> to a numpy array in the constructor. >> >> @attr.s(cmp=False, hash=False) >> class SvgTransform(SvgPicture): >> child = attr.ib() >> matrix = attr.ib(convert=numpy.asarray) >> >> >> I have one question about (4) -- how and when is the conversion function used, and what is its signature? >> >> >> On Fri, May 19, 2017 at 5:42 AM, Eric V. Smith >> wrote: >> >>> Could you point me to this 4-point list of Stephan's? I couldn't find >>>> anything in the archive that you might be referring to. >>>> >>> >>> Never mind, I found them here: >>> https://mail.python.org/pipermail/python-ideas/2017-May/045679.html >>> >>> Eric. >>> >>> >>> _______________________________________________ >>> Python-ideas mailing list >>> Python-ideas at python.org >>> https://mail.python.org/mailman/listinfo/python-ideas >>> Code of Conduct: http://python.org/psf/codeofconduct/ >>> >> >> >> >> -- >> --Guido van Rossum (python.org/~guido) >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> >> -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From tinchester at gmail.com Fri May 19 16:35:25 2017 From: tinchester at gmail.com (=?UTF-8?Q?Tin_Tvrtkovi=C4=87?=) Date: Fri, 19 May 2017 20:35:25 +0000 Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects) In-Reply-To: References: Message-ID: Hello, I'm an attrs contributor so maybe I can clear up any questions. Convert callables are only called in __init__, not in the setters. 
We've had this requested a number of times and we will almost certainly support it in the future, probably on an opt-in basis. The reason we don't currently support it is mostly technical. We try really hard for our features to not add significant overhead to the generated classes, and doing this with minimal overhead for slot classes basically requires C/Cython or attribute *access* becomes significantly slower. But this is implementation stuff and not pertinent here. Date: Fri, 19 May 2017 20:49:08 +0200 > From: Stephan Houben > To: guido at python.org > Cc: "Eric V. Smith" , Python-Ideas > > Subject: Re: [Python-ideas] JavaScript-Style Object Creation in Python > (using a constructor function instead of a class to create objects) > Message-ID: > pxDNw at mail.gmail.com> > Content-Type: text/plain; charset="utf-8" > > Hi Guido, > > Yes indeed, *only* invoked by __init__. > > See my test below. > ===== > import attr > > @attr.s > class Foo: > x = attr.ib(convert=str) > > foo = Foo(42) > print(repr(foo.x)) > # prints '42' > foo.x = 42 > print(repr(foo.x)) > # prints 42 > ====== > > Not sure if this is a good design but it matches the docs. > > Stephan > > On 19 May 2017 at 20:36, "Guido van Rossum" wrote: > > So it is only called by __init__ and not by __setattr__? > > On Fri, May 19, 2017 at 11:32 AM, Stephan Houben > wrote: > > > Let me quote the attrs docs: > > > > """ > > convert (callable) - callable() that is called by attrs-generated > __init__ > > methods to convert attribute's value to the desired format. It is given > the > > passed-in value, and the returned value will be used as the new value of > > the attribute. The value is converted before being passed to the > validator, > > if any. > > """ > > > > So the signature is essentially: > > > > self.myattrib = callable(myattrib) > > > > Stephan > > > > On 19 May 2017 at 20:25, "Guido van Rossum" wrote: > > > > For people who don't want to click on links: > >> > >> 1.
Allow hash and equality to be based on object identity, rather than > >> structural identity, > >> this is very important if one wants to store un-hashable objects in > >> the instance. > >> (In my case: mostly dict's and numpy arrays). > >> > >> 2. Not subclassed from tuple. I have been bitten by this subclassing > >> when trying to set up > >> singledispatch on sequences and also on my classes. > >> > >> 3. Easily allow to specify default values. With namedtuple this > >> requires overriding __new__. > >> > >> 4. Easily allow to specify a conversion function. For example I have > >> some code like below: > >> note that I can store a numpy array while keeping hashability and > >> I can make it convert > >> to a numpy array in the constructor. > >> > >> @attr.s(cmp=False, hash=False) > >> class SvgTransform(SvgPicture): > >> child = attr.ib() > >> matrix = attr.ib(convert=numpy.asarray) > >> > >> > >> I have one question about (4) -- how and when is the conversion > function used, and what is its signature? > >> > >> > >> On Fri, May 19, 2017 at 5:42 AM, Eric V. Smith > >> wrote: > >> > >>> Could you point me to this 4-point list of Stephan's? I couldn't find > >>>> anything in the archive that you might be referring to. > >>>> > >>> > >>> Never mind, I found them here: > >>> https://mail.python.org/pipermail/python-ideas/2017-May/045679.html > >>> > >>> Eric. 
> >>> > >>> > >>> _______________________________________________ > >>> Python-ideas mailing list > >>> Python-ideas at python.org > >>> https://mail.python.org/mailman/listinfo/python-ideas > >>> Code of Conduct: http://python.org/psf/codeofconduct/ > >>> > >> > >> > >> > >> -- > >> --Guido van Rossum (python.org/~guido) > >> > >> _______________________________________________ > >> Python-ideas mailing list > >> Python-ideas at python.org > >> https://mail.python.org/mailman/listinfo/python-ideas > >> Code of Conduct: http://python.org/psf/codeofconduct/ > >> > >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From srkunze at mail.de Fri May 19 17:33:50 2017 From: srkunze at mail.de (Sven R. Kunze) Date: Fri, 19 May 2017 23:33:50 +0200 Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects) In-Reply-To: <89cf9642-266e-9979-bd58-1334252844e2@trueblade.com> References: <17a0b553-161b-cff5-9ad1-8710260d7ec6@mail.de> <89cf9642-266e-9979-bd58-1334252844e2@trueblade.com> Message-ID: <2de8d676-015d-9e0d-0808-39cb473e33d3@mail.de> Just one additional word about mixins: we've been inspired by Django's extensive usage of mixins for their class-based views, forms and model classes. For one, we defined more mixins for those classes (like UrlMixin for view classes), and for another we developed our own mixin infrastructure for domain-specific classes. In the end, a lot of our top-level classes (the ones actually used as API etc.) look like plug'n'play + configuration: class MyImportantThing(Mixin1, Mixin2, Base): url = 'blaa/asdf' next_foo = 234 special_model = User For example in this case, "url" might be the config parameter for "Mixin1", "next_foo" for "Mixin2" and "special_model" is required/recognized by "Base". As you can imagine, testing is also easier. Regards, Sven On 19.05.2017 06:31, Eric V. Smith wrote: > On 5/18/17 2:26 PM, Sven R. 
Kunze wrote: >> On 17.05.2017 23:29, Ivan Levkivskyi wrote: >>> On 17 May 2017 at 20:09, Sven R. Kunze >> > wrote: >>> >>> class Foo(dictlike, tuplelike, simpleobject): >>> attribute1: User >>> attribute2: Blog >>> attribute3: list >>> >>> def __my_dunder__(self): >>> ... >>> >>> >>> As I understand this is more or less what is proposed, >> >> Are you sure? Could you point me to the relevant messages where mixins >> are mentioned as a key part of the proposal? All I could find are >> message using the @decorator syntaxes. >> >> We've been working with mixins successfully for years now and I can tell >> you that it's "just" a clever way of refactoring existing code in order >> to make it more accessible to other modules *based on use-cases*. >> >> >> So, the best person to tell what pieces to factor out would be Stephan >> using his 4-point list. >> And of course other people using NamedTuple but frequently refactoring >> to "attr" or own class because NamedTuple just is too much (defines too >> much implicitly). > > Could you point me to this 4-point list of Stephan's? I couldn't find > anything in the archive that you might be referring to. > > Eric. > From steve at pearwood.info Fri May 19 20:19:23 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 20 May 2017 10:19:23 +1000 Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects) In-Reply-To: References: <17a0b553-161b-cff5-9ad1-8710260d7ec6@mail.de> <89cf9642-266e-9979-bd58-1334252844e2@trueblade.com> <12b35e8c-28e6-b59e-8b4e-277c9ebe79cd@trueblade.com> Message-ID: <20170520001921.GW24625@ando.pearwood.info> On Fri, May 19, 2017 at 11:24:53AM -0700, Guido van Rossum wrote: > 4. Easily allow to specify a conversion function. For example I have > some code like below: > note that I can store a numpy array while keeping hashability and > I can make it convert > to a numpy array in the constructor. 
> > @attr.s(cmp=False, hash=False) > class SvgTransform(SvgPicture): > child = attr.ib() > matrix = attr.ib(convert=numpy.asarray) I find that completely enigmatic, there's far too much implicit behaviour going on behind the scenes. I couldn't even begin to guess what SvgTransform as a class does, or what SvgTransform.child and SvgTransform.matrix are. I suppose that's okay for experts to whom the attrs module is second nature, but I think this approach is far too "magical" for my tastes. Instead of trying to cover every possible use-case from a single decorator with a multitude of keyword arguments, I think covering the simple cases is enough. Explicitly overriding methods is not a bad thing! It is much more comprehensible to see an explicit class with methods than a decorator with multiple keyword arguments and callbacks. I like the namedtuple approach: I think it hits the sweet spot between "having to do everything by hand" and "everything is magical". -- Steve From wolfgang.maier at biologie.uni-freiburg.de Sat May 20 07:05:18 2017 From: wolfgang.maier at biologie.uni-freiburg.de (Wolfgang Maier) Date: Sat, 20 May 2017 13:05:18 +0200 Subject: [Python-ideas] fnmatch.filter_false In-Reply-To: <075201d2d0c9$f0135b50$d03a11f0$@hotmail.com> References: <04fc01d2cf28$9c068480$d4138d80$@sdamon.com> <20170517164331.GA22240@phdru.name> <05c301d2cf31$bdda8e90$398fabb0$@hotmail.com> <05f001d2cf36$b93de2b0$2bb9a810$@hotmail.com> <075201d2d0c9$f0135b50$d03a11f0$@hotmail.com> Message-ID: <42152095-16f4-afb2-e5ca-d8b5677fe402@biologie.uni-freiburg.de> On 19.05.2017 20:01, tritium-list at sdamon.com wrote: > > >> -----Original Message----- >> From: Python-ideas [mailto:python-ideas-bounces+tritium- >> list=sdamon.com at python.org] On Behalf Of Wolfgang Maier >> Sent: Friday, May 19, 2017 10:03 AM >> To: python-ideas at python.org >> Subject: Re: [Python-ideas] fnmatch.filter_false >> >> On 05/17/2017 07:55 PM, >> tritium-list at sdamon.com wrote: >>> Top posting, apologies. 
>>> >>> I'm sure there is a better way to do it, and there is a performance hit, > but >>> its negligible. This is also a three line delta of the function. >>> >>> from fnmatch import _compile_pattern, filter as old_filter >>> import os >>> import os.path >>> import posixpath >>> >>> >>> data = os.listdir() >>> >>> def filter(names, pat, *, invert=False): >>> """Return the subset of the list NAMES that match PAT.""" >>> result = [] >>> pat = os.path.normcase(pat) >>> match = _compile_pattern(pat) >>> if os.path is posixpath: >>> # normcase on posix is NOP. Optimize it away from the loop. >>> for name in names: >>> if bool(match(name)) == (not invert): >>> result.append(name) >>> else: >>> for name in names: >>> if bool(match(os.path.normcase(name))) == (not invert): >>> result.append(name) >>> return result >>> >>> if __name__ == '__main__': >>> import timeit >>> print(timeit.timeit( >>> "filter(data, '__*')", >>> setup="from __main__ import filter, data" >>> )) >>> print(timeit.timeit( >>> "filter(data, '__*')", >>> setup="from __main__ import old_filter as filter, data" >>> )) >>> >>> The first test (modified code) timed at 22.492161903402575, where the >> second >>> test (unmodified) timed at 19.555531892032324 >>> >> >> If you don't care about slow-downs in this range, you could use this >> pattern: >> >> excluded = set(filter(data, '__*')) >> result = [item for item in data if item not in excluded] >> >> It seems to take just as much longer although the slow-down is not >> constant but depends on the size of the set you need to generate. >> >> Wolfgang >> > > > If I didn't care about performance, I wouldn't be using filter - the only > reason to use filter over a list comprehension is performance. The standard > library has a performant inclusion filter, but does not have a performant > exclusion filter. 
>

I'm sorry, but then your statement above doesn't make any sense to me:

"I'm sure there is a better way to do it, and there is a performance hit, but it's negligible."

I'm proposing an alternative which times very similarly to your own suggestion, without copying or modifying stdlib code.

That said, I still like your idea of adding the exclude functionality to fnmatch. I just thought you might be interested in a solution that works right now.

From steve at pearwood.info Sat May 20 09:28:57 2017
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 20 May 2017 23:28:57 +1000
Subject: [Python-ideas] fnmatch.filter_false
In-Reply-To: <04fc01d2cf28$9c068480$d4138d80$@sdamon.com>
References: <04fc01d2cf28$9c068480$d4138d80$@sdamon.com>
Message-ID: <20170520132857.GY24625@ando.pearwood.info>

On Wed, May 17, 2017 at 12:14:05PM -0400, Alex Walters wrote:

> fnmatch.filter works great. To remind people what it does, it takes an
> iterable of strings and a pattern and returns a list of the strings that
> match the pattern. And that is wonderful.
>
> However, I often need to filter *out* the items that match the pattern (to
> ignore them). In every project that I need this I end up copying the
> function out of the fnmatch library and adding 'not' to the test clause. It
> would be wonderful if there was a filter_false version in the standard
> library. Or an inverting Boolean option. Or something, to stop from having
> to copy code every time I need to ignore files.

Since I haven't seen any substantial objections to this idea, I've created an issue on the bug tracker, including a patch.

http://bugs.python.org/issue30413

Unfortunately I have no CPU cycles available to learn the new Github way of doing things right now, so somebody else will need to shepherd this through the rest of the process (making a PR, doing a review, rejecting or approving it, etc.)
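
For readers who want an exclusion filter today without copying private stdlib code, here is a minimal sketch built only on the public fnmatch.translate() function. The name filter_false is hypothetical (it is the name floated in this thread, not an existing stdlib function), and this is not the patch attached to the tracker issue.

```python
import fnmatch
import os
import re


def filter_false(names, pat):
    """Return the names that do NOT match the shell-style pattern pat.

    Hypothetical helper for illustration; not part of the fnmatch module.
    """
    # Normalise the pattern the same way fnmatch.filter() does, then compile
    # it once so the loop only pays for one regex match per name.
    pat = os.path.normcase(pat)
    match = re.compile(fnmatch.translate(pat)).match
    return [name for name in names
            if match(os.path.normcase(name)) is None]


names = ["__init__.py", "setup.py", "__pycache__", "README"]
kept = filter_false(names, "__*")
print(kept)
```

The original, un-normalised names are returned, so only the comparison is affected by the platform's case rules.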
-- Steve From ncoghlan at gmail.com Sat May 20 12:06:18 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 21 May 2017 02:06:18 +1000 Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects) In-Reply-To: <20170520001921.GW24625@ando.pearwood.info> References: <17a0b553-161b-cff5-9ad1-8710260d7ec6@mail.de> <89cf9642-266e-9979-bd58-1334252844e2@trueblade.com> <12b35e8c-28e6-b59e-8b4e-277c9ebe79cd@trueblade.com> <20170520001921.GW24625@ando.pearwood.info> Message-ID: On 20 May 2017 at 10:19, Steven D'Aprano wrote: > On Fri, May 19, 2017 at 11:24:53AM -0700, Guido van Rossum wrote: > >> 4. Easily allow to specify a conversion function. For example I have >> some code like below: >> note that I can store a numpy array while keeping hashability and >> I can make it convert >> to a numpy array in the constructor. >> >> @attr.s(cmp=False, hash=False) >> class SvgTransform(SvgPicture): >> child = attr.ib() >> matrix = attr.ib(convert=numpy.asarray) > > > I find that completely enigmatic, there's far too much implicit > behaviour going on behind the scenes. I couldn't even begin to guess > what SvgTransform as a class does, or what SvgTransform.child and > SvgTransform.matrix are. > > I suppose that's okay for experts to whom the attrs module is second > nature, but I think this approach is far too "magical" for my tastes. Some of the key problems I personally see are that attrs reuses a general noun (attributes) rather than using other words that are more evocative of the "data record" use case, and many of the parameter names are about "How attrs work" and "How Python magic methods work" rather than "Behaviours I would like this class to have". 
That's fine for someone that's already comfortable writing those behaviours by hand and just wants to automate the boilerplate away (which is exactly the problem that attrs was written to solve), but it's significantly more problematic once we assume people will be using a feature like this *before* learning how to write out all the corresponding boilerplate themselves (which is the key additional complication that a language level version of this will have to account for).

However, consider instead the following API sketch:

    from autoclass import data_record, data_field

    @data_record(orderable=False, hashable=False)
    class SvgTransform(SvgPicture):
        child = data_field()
        matrix = data_field(setter=numpy.asarray)

Here, the core concepts to be learned would be:

- the "autoclass" module lets you ask the interpreter to automatically fill in class details
- SvgTransform is a data record that cannot be hashed, and cannot be ordered
- it is a Python class inheriting from SvgPicture
- it has two defined fields, child & matrix
- we know "child" is an ordinary read/write instance attribute
- we know "matrix" is a property, using numpy.asarray as its setter

In this particular API sketch, data_record is just a class decorator factory, and data_field is a declarative helper type for use with that factory, so if you wanted to factor out particular combinations, you'd just write ordinary helper functions.

> Instead of trying to cover every possible use-case from a single
> decorator with a multitude of keyword arguments, I think covering the
> simple cases is enough. Explicitly overriding methods is not a bad
> thing! It is much more comprehensible to see an explicit class with
> methods than a decorator with multiple keyword arguments and callbacks.

This isn't the case for folks that have to actually *read* dunder methods to find out what a class does, though.
Reading an imperatively defined class only works that way once you're able to mentally pattern match "Oh, that's a conventional __init__, that's a conventional __repr__, that's a conventional __hash__, that's a conventional __eq__, that's a conventional __lt__ implementation, etc, etc". Right now, telling Python "I want to do the same stock-standard things that everyone always does" means writing a lot of repetitive logic (or, more likely, copying the logic from an existing class that you or someone else wrote, and editing it to fit). The idea behind offering some form of declarative class definitions is to build out a vocabulary of conventional class behaviours, and make that vocabulary executable such that folks can use it to write applications even if they haven't learned how it works under the hood yet. As with descriptors before it, that vocabulary may also take advantage of the fact that Python offers first class functions to allow callbacks and transformation functions to be injected at various steps in the process *without* requiring you to also spell out all the other steps in the process that you don't want to alter. > I like the namedtuple approach: I think it hits the sweet spot between > "having to do everything by hand" and "everything is magical". It's certainly a lot better than nothing at all, but it brings a lot of baggage with it due to the fact that it *is* a tuple. Declarative class definitions aim to offer the convenience of namedtuple definitions, without the complications that arise from the "it's a tuple with some additional metadata and behaviours" aspects. Database object-relational-mapping layers like those in SQL Alchemy and Django would be the most famous precursors for this, but there are also things like Django Form definitions, and APIs like JSL (which uses Python classes to declaratively define JSON Schema documents). 
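
To make the data_record / data_field idea from earlier in this message concrete, here is one possible toy implementation. It is only a sketch under several assumptions: the "autoclass" module does not exist, the semantics of orderable/hashable are guessed from the parameter names, fields are positional-only, and __set_name__ requires Python 3.6 or later.

```python
import functools


class data_field:
    """Hypothetical field declaration; `setter` converts assigned values."""

    def __init__(self, setter=None):
        self.setter = setter

    def __set_name__(self, owner, name):  # called at class creation (3.6+)
        self.name = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return obj.__dict__[self.name]

    def __set__(self, obj, value):
        obj.__dict__[self.name] = self.setter(value) if self.setter else value


def data_record(orderable=False, hashable=False):
    """Hypothetical class decorator factory filling in the usual dunders."""

    def decorate(cls):
        # Class bodies keep definition order, so this is the field order.
        names = [n for n, v in vars(cls).items() if isinstance(v, data_field)]

        def key(self):
            return tuple(getattr(self, n) for n in names)

        def __init__(self, *args):
            if len(args) != len(names):
                raise TypeError("expected %d arguments" % len(names))
            for name, value in zip(names, args):
                setattr(self, name, value)  # goes through data_field.__set__

        def __repr__(self):
            body = ", ".join("%s=%r" % pair for pair in zip(names, key(self)))
            return "%s(%s)" % (type(self).__name__, body)

        def __eq__(self, other):
            if type(other) is not type(self):
                return NotImplemented
            return key(self) == key(other)

        def __lt__(self, other):
            if type(other) is not type(self):
                return NotImplemented
            return key(self) < key(other)

        cls.__init__, cls.__repr__, cls.__eq__ = __init__, __repr__, __eq__
        cls.__hash__ = (lambda self: hash(key(self))) if hashable else None
        if orderable:
            cls.__lt__ = __lt__
            cls = functools.total_ordering(cls)  # derives <=, >, >=
        return cls

    return decorate


@data_record(hashable=False)
class Point:
    x = data_field()
    y = data_field(setter=float)


p = Point(1, "2.5")
print(p)                      # Point(x=1, y=2.5)
print(p == Point(1, 2.5))     # True
```

The point of the sketch is only that the declaration at the bottom reads as a list of desired behaviours, while everything above it is boilerplate the reader never has to see.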
For folks already familiar with ORMs, declarative classes are just a matter of making in memory data structures as easy to work with as database backed ones. For folks that *aren't* familiar with ORMs yet, then declarative classes provide a potentially smoother learning curve, since the "declarative class" aspects can be better separated from the "object-relational mapping" aspects. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From pavol.lisy at gmail.com Sun May 21 02:04:04 2017 From: pavol.lisy at gmail.com (Pavol Lisy) Date: Sun, 21 May 2017 08:04:04 +0200 Subject: [Python-ideas] fnmatch.filter_false In-Reply-To: <20170520132857.GY24625@ando.pearwood.info> References: <04fc01d2cf28$9c068480$d4138d80$@sdamon.com> <20170520132857.GY24625@ando.pearwood.info> Message-ID: On 5/20/17, Steven D'Aprano wrote: > On Wed, May 17, 2017 at 12:14:05PM -0400, Alex Walters wrote: >> Fnmath.filter works great. To remind people what it does, it takes an >> iterable of strings and a pattern and returns a list of the strings that >> match the pattern. And that is wonderful >> >> However, I often need to filter *out* the items that match the pattern >> (to >> ignore them). In every project that I need this I end up copying the >> function out of the fnmatch library and adding 'not' to the test clause. >> It >> would be wonderful if there was a filter_false version in the standard >> library. Or in inversion Boolean option. Or something, to stop from >> having >> to copy code every time I need to ignore files. > > Since I haven't seen any substantial objections to this idea, I've > created an issue on the bug tracker, including a patch. > > http://bugs.python.org/issue30413 > > Unfortunately I have no CPU cycles available to learn the new Github way > of doing things right now, somebody else will need to shepherd this > through the rest of the process (making a PR, doing a review, rejecting > or approving it, etc.) 
> > > -- > Steve > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > If fnmatch.filter was written to solve performance, isn't calling _filtered step back? PL. From steve at pearwood.info Sun May 21 02:37:28 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 21 May 2017 16:37:28 +1000 Subject: [Python-ideas] fnmatch.filter_false In-Reply-To: References: <04fc01d2cf28$9c068480$d4138d80$@sdamon.com> <20170520132857.GY24625@ando.pearwood.info> Message-ID: <20170521063728.GZ24625@ando.pearwood.info> On Sun, May 21, 2017 at 08:04:04AM +0200, Pavol Lisy wrote: > If fnmatch.filter was written to solve performance, isn't calling > _filtered step back? It adds overhead of *one* function call regardless of how many thousands or millions of names you are filtering. And the benefit is that filter() and filterfalse() can share the same implementation, without duplicating code. If you have a tiny list, then it might be a tiny bit slower, but we shouldn't care too much about the performance for tiny lists. We should care about performance for large lists, but for large N the overhead of one extra function call is insignificant. At least... I hope so. If somebody demonstrates an actual and significant performance hit, then we should rethink the implementation. But even a small performance hit is acceptible, if it comes with a significant benefit (no need to duplicate code) and it is small enough. On my computer, using Python 3.5, I get little significant difference: # original [steve at ando ~]$ python3.5 -m timeit -s "from fnmatch import filter" -s "L = list('abcdefghi')" "filter(L, 'b')" 100000 loops, best of 3: 11.7 usec per loop # patched 100000 loops, best of 3: 12.8 usec per loop Who cares about 1 millionth of a second? 
:-) For a larger list, the difference vanishes: # original [steve at ando ~]$ python3.5 -m timeit -s "from fnmatch import filter" -s "L = list('abcdefghi')*10000" "filter(L, 'b')" 10 loops, best of 3: 80.2 msec per loop # patched 10 loops, best of 3: 79.1 msec per loop (Yes, the patched version was faster, on my computer.) These are quick and dirty benchmarks, using an older version of Python, but I expect the same will apply in 3.7 if anyone bothers to check :-) -- Steve From pavol.lisy at gmail.com Sun May 21 03:45:31 2017 From: pavol.lisy at gmail.com (Pavol Lisy) Date: Sun, 21 May 2017 09:45:31 +0200 Subject: [Python-ideas] fnmatch.filter_false In-Reply-To: <20170521063728.GZ24625@ando.pearwood.info> References: <04fc01d2cf28$9c068480$d4138d80$@sdamon.com> <20170520132857.GY24625@ando.pearwood.info> <20170521063728.GZ24625@ando.pearwood.info> Message-ID: On 5/21/17, Steven D'Aprano wrote: > On Sun, May 21, 2017 at 08:04:04AM +0200, Pavol Lisy wrote: > >> If fnmatch.filter was written to solve performance, isn't calling >> _filtered step back? > > It adds overhead of *one* function call regardless of how many > thousands or millions of names you are filtering. And the benefit is > that filter() and filterfalse() can share the same implementation, > without duplicating code. > > If you have a tiny list, then it might be a tiny bit slower, but we > shouldn't care too much about the performance for tiny lists. We > should care about performance for large lists, but for large N the > overhead of one extra function call is insignificant. > > At least... I hope so. If somebody demonstrates an actual and > significant performance hit, then we should rethink the implementation. > > But even a small performance hit is acceptible, if it comes with a > significant benefit (no need to duplicate code) and it is small enough. 
> > On my computer, using Python 3.5, I get little significant difference: > > # original > [steve at ando ~]$ python3.5 -m timeit -s "from fnmatch import filter" > -s "L = list('abcdefghi')" "filter(L, 'b')" > 100000 loops, best of 3: 11.7 usec per loop > > > # patched > 100000 loops, best of 3: 12.8 usec per loop > > Who cares about 1 millionth of a second? :-) I was reading 9-10% slowdown. :) It could be really good improvement if you can run 110m hurdles in 11.7s instead of 12.8s (nice coincidence ;) that according to wiki - 12.80s is current world record: https://en.wikipedia.org/wiki/Men%27s_110_metres_hurdles_world_record_progression ) But I really agree with this: > But even a small performance hit is acceptible, if it comes with a > significant benefit (no need to duplicate code) and it is small enough. although I was rather thinking about something like macro... PL. From paul_laos at outlook.com Sun May 21 10:43:12 2017 From: paul_laos at outlook.com (Paul Laos) Date: Sun, 21 May 2017 14:43:12 +0000 Subject: [Python-ideas] Suggestion: push() method for lists In-Reply-To: References: Message-ID: It would be nice to have the opposite of the pop() method: a push() method. While insert() and append() already exist, neither is the opposite of pop(). pop() has a default index parameter -1, but neither insert() nor append() has a default index parameter. push(obj) would be equivalent to insert(index = -1, object), having -1 as the default index parameter. In fact, push() could replace both append() and insert() by unifying them. By default push(obj) would insert an object to the end of the list, while push(obj, index) would insert the object at index, just like pop() removes (and returns) the object at the end of the list, and pop(index) removes (and returns) the object at the index. 
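
The proposed behaviour can be approximated today with a plain function. This is a sketch only: push() here is a hypothetical helper, not a list method, and it uses None rather than -1 as the "append" default because list.insert(-1, obj) inserts *before* the last item rather than after it.

```python
def push(items, obj, index=None):
    """Hypothetical inverse of list.pop(); not an existing list method."""
    if index is None:
        items.append(obj)         # undoes items.pop()
    else:
        items.insert(index, obj)  # undoes items.pop(index) for index >= 0


stack = [0, 1, 2]
push(stack, 99)
print(stack)            # [0, 1, 2, 99]
print(stack.pop())      # 99

push(stack, 7, 1)
print(stack)            # [0, 7, 1, 2]
print(stack.pop(1))     # 7

# insert() with -1 is not the same as append():
other = [0, 1, 2]
other.insert(-1, 99)
print(other)            # [0, 1, 99, 2]
```

Note that the round trip push(x, i) / pop(i) only holds for non-negative indices in this sketch; negative indices inherit the insert() semantics shown in the last example.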
I found little discussion on this, just this SO thread http://stackoverflow.com/questions/1566266/why-is-pythons-append-not-push which lead to a discussion from 20 years ago (1997): https://groups.google.com/forum/#!topic/comp.lang.python/SKJq3S2ZYmg Some key arguments from the thread: >it would be an >easy and obvious improvement to make popend an explicit builtin, and >that this would make Python even more attractive to newcomers who want >to use what they already know and be immediately productive. - Terry Reedy >append() is a special case of insert(). The inverse of insert() is >the del function. The specific inverse of append() is del list[-1]. - Michael W. Ryan >but I'm not a big fan of multiple names for the same operation -- >sooner or later you're going to read code that uses the other one, so >you need to learn both, which is more cognitive load. - Guido van Rossum So while it has been discussed before, it's worth bringing up again, since this was before the release of Python 2.0. Pros: - Would simplify the language by having a symmetric relation to pop(). - Would make it easy to use lists as stacks. - Saves at least two characters - If append()/insert() are being removed and replaced, the complexity of lists is slightly reduced. Cons: - Would blur the line between lists and stacks. - The order of the parameters in push(obj, index = -1) would be the opposite of the parameters in insert(index, obj), because defaulted parameters come last. - If append()/insert() are being removed and replaced, backwards compatability breaks. - If append()/insert() are kept instead of being replaced, the complexity of lists is slightly increased. While it isn't a necessity, I believe the benefit of push() method outweighs its cons. ~Paul -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From apalala at gmail.com Sun May 21 22:27:34 2017 From: apalala at gmail.com (=?UTF-8?Q?Juancarlo_A=C3=B1ez?=) Date: Sun, 21 May 2017 22:27:34 -0400 Subject: [Python-ideas] Suggestion: push() method for lists In-Reply-To: References: Message-ID: On Sun, May 21, 2017 at 10:43 AM, Paul Laos wrote: > By default push(obj) would insert an object to the end of the list, while > push(obj, index) would insert the object at index, just like pop() removes > (and returns) the object at the end of the list, and pop(index) removes > (and returns) the object at the index. The name asymmetry between .pop() and .append() has always bothered me. Cheers! -- Juancarlo *A?ez* -------------- next part -------------- An HTML attachment was scrubbed... URL: From toddrjen at gmail.com Sun May 21 22:50:56 2017 From: toddrjen at gmail.com (Todd) Date: Sun, 21 May 2017 22:50:56 -0400 Subject: [Python-ideas] Suggestion: push() method for lists In-Reply-To: References: Message-ID: On Sun, May 21, 2017 at 10:43 AM, Paul Laos wrote: > It would be nice to have the opposite of the pop() method: a push() > method. While insert() and append() already exist, neither is the opposite > of pop(). pop() has a default index parameter -1, but neither insert() nor > append() has a default index parameter. push(obj) would be equivalent to > insert(index = -1, object), having -1 as the default index parameter. In > fact, push() could replace both append() and insert() by unifying them. > > > By default push(obj) would insert an object to the end of the list, while > push(obj, index) would insert the object at index, just like pop() removes > (and returns) the object at the end of the list, and pop(index) removes > (and returns) the object at the index. 
> > > I found little discussion on this, just this SO thread > http://stackoverflow.com/questions/1566266/why-is-pythons-append-not-push which > lead to a discussion from 20 years ago (1997): > > https://groups.google.com/forum/#!topic/comp.lang.python/SKJq3S2ZYmg > > > > Some key arguments from the thread: > > > >it would be an > >easy and obvious improvement to make popend an explicit builtin, and > >that this would make Python even more attractive to newcomers who want > >to use what they already know and be immediately productive. > - Terry Reedy > > > >append() is a special case of insert(). The inverse of insert() is > >the del function. The specific inverse of append() is del list[-1]. > - Michael W. Ryan > > >but I'm not a big fan of multiple names for the same operation -- > >sooner or later you're going to read code that uses the other one, so > >you need to learn both, which is more cognitive load. > - Guido van Rossum > > > So while it has been discussed before, it's worth bringing up again, since > this was before the release of Python 2.0. > > Pros: > > - Would simplify the language by having a symmetric relation to pop(). > > - Would make it easy to use lists as stacks. > > - Saves at least two characters > > - If append()/insert() are being removed and replaced, the complexity of > lists is slightly reduced. > > > Cons: > > - Would blur the line between lists and stacks. > > - The order of the parameters in push(obj, index = -1) would be the > opposite of the parameters in insert(index, obj), because defaulted > parameters come last. > > - If append()/insert() are being removed and replaced, backwards > compatability breaks. > > - If append()/insert() are kept instead of being replaced, the complexity > of lists is slightly increased. > > > While it isn't a necessity, I believe the benefit of push() method > outweighs its cons. > There is absolutely zero chance that append and insert are going away. That would break everything. 
So with your proposal, we would end up with two methods that do exactly the same thing, differing only in the defaults and the order of arguments. That definitely does not make the language simpler. The big question you need to ask when proposing a new feature for python is "what is the use-case for this?" What does this do that existing features can't do, or that existing features are significantly worse at? Since this literally duplicates the functionality of existing features, it doesn't have such a use-case If we were changing things at all, I think the first place to start would be to either set a default argument for "insert" or add an optional argument for "append", but in both cases that would again be duplicating functionality so I don't see the point. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Mon May 22 01:47:42 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 22 May 2017 15:47:42 +1000 Subject: [Python-ideas] Suggestion: push() method for lists In-Reply-To: References: Message-ID: On 22 May 2017 at 00:43, Paul Laos wrote: > So while it has been discussed before, it's worth bringing up again, since > this was before the release of Python 2.0. It was also before the addition of collections.deque. which uses appendleft() and extendleft(), rather than pushleft(). > Pros: > > - Would simplify the language by having a symmetric relation to pop(). > - Would make it easy to use lists as stacks. I think the key argument here would be that *for folks that already know the push/pop terminology for stack data structures*, "push" is a potentially more intuitive term than the insert/append/extend trio. 
However, for most people, "append an item to a list", "insert an item into a list" and "extend a list" are common English phrases, while "push an item onto a list" would get you funny looks, and even "push an item onto a stack" would be unusual in a spoken conversation (the non-jargon phrase in that case is "add an item to the stack", but "add" would be ambiguous between the append() and extend() meanings).

As a result, the specific-to-computer-science jargon loses out.

The situation for `pop()` is different, as `remove()` is already taken for "remove an item from the list by value", so a different word is needed for "remove an item from the list by index".

> - If append()/insert() are being removed and replaced, the complexity of
> lists is slightly reduced.

There's zero chance of the existing APIs going away - they're not broken, and they match common English phrasing. The fact they don't match common computer science jargon isn't ideal, but it's relatively straightforward to define a stack data structure if someone really wants to do so.

> Cons:
>
> - Would blur the line between lists and stacks.
>
> - The order of the parameters in push(obj, index = -1) would be the opposite
> of the parameters in insert(index, obj), because defaulted parameters come
> last.

While I don't think it makes sense to add the method in the first place, if we did, it either wouldn't accept an index parameter, or else the index parameter would be a keyword-only argument.

Cheers,
Nick.

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From python at lucidity.plus.com Mon May 22 04:52:09 2017
From: python at lucidity.plus.com (Erik)
Date: Mon, 22 May 2017 09:52:09 +0100
Subject: [Python-ideas] Suggestion: push() method for lists
In-Reply-To:
References:
Message-ID: <80b2aeeb-724b-2731-aca3-80dcbd5daad9@lucidity.plus.com>

On 21/05/17 15:43, Paul Laos wrote:
> push(obj) would be
> equivalent to insert(index = -1, object), having -1 as the default index
> parameter. In fact, push() could replace both append() and insert() by
> unifying them.

I don't think list.insert() with an index of -1 does what you think it does:

    $ python3
    Python 3.5.2 (default, Nov 17 2016, 17:05:23)
    [GCC 5.4.0 20160609] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> l = [0, 1, 2]
    >>> l
    [0, 1, 2]
    >>> l.insert(-1, 99)
    >>> l
    [0, 1, 99, 2]
    >>>

Because the indices can be thought of as referencing the spaces _between_ the objects, having a push() in which -1 references a different 'space' than the -1 given to insert() or to a slice operation would, I suspect, be a source of confusion (and off-by-one bugs).

E.

From paul_laos at outlook.com Mon May 22 04:55:02 2017
From: paul_laos at outlook.com (Paul Laos)
Date: Mon, 22 May 2017 08:55:02 +0000
Subject: [Python-ideas] Suggestion: push() method for lists
In-Reply-To:
References: ,
Message-ID:

>I think the first place to start would be to either set a default argument for "insert"
>or add an optional argument for "append", but in both cases that would again be
>duplicating functionality so I don't see the point.
- Todd

That would be fine too; the main idea is just having a method that's symmetrically related to pop(). append(2, 2) would seem awkward, but it could work with insert(2).

>However, for most people, "append an item to a list", "insert an item
>into a list" and "extend a list" are common English phrases, while
>"push an item onto a list" would get you funny looks,
-Nick Coghlan

This is an interesting argument, because English isn't my first language, so it's not something I would think of. Having a programming language obey the laws of spoken English seems uncalled for, but it's still a good point. The name could be something more natural, of course. (Even so, is "pop() an item from a list" any better?)

>There is absolutely zero chance that append and insert are going away. That would break everything.
- Todd >There's zero chance of the existing APIs going away - Nick Coghlan It was hypothetical, but you're probably right. >I don't think it makes sense to add the method - Nick Coghlan It makes sense to have inverse methods. pop() pops off the last item and pop(i) pops off the ith item. There are no methods such that push(obj) pushes the item onto the end and push(obj, index) pushes the item onto the ith position. That would be append(obj) and insert(obj, index), but there's no symmetric relation here. This is harder to learn, so it would be better to have a unified method for the inverse. Todd's suggestion of letting the index parameter for insert() be optional would accomplish the same result, without having to add an additional method. ~Paul ________________________________ From: Nick Coghlan Sent: Monday, May 22, 2017 7:47:42 AM To: Paul Laos Cc: python-ideas at python.org Subject: Re: [Python-ideas] Suggestion: push() method for lists On 22 May 2017 at 00:43, Paul Laos wrote: > So while it has been discussed before, it's worth bringing up again, since > this was before the release of Python 2.0. It was also before the addition of collections.deque. which uses appendleft() and extendleft(), rather than pushleft(). > Pros: > > - Would simplify the language by having a symmetric relation to pop(). > - Would make it easy to use lists as stacks. I think the key argument here would be that *for folks that already know the push/pop terminology for stack data structures*, "push" is a potentially more intuitive term than the insert/append/extend trio. 
However, for most people, "append an item to a list", "insert an item into a list" and "extend a list" are common English phrases, while "push an item onto a list" would get you funny looks, and even "push an item onto a stack" would be unusual in a spoken conversation (the non-jargon phrase in that case is "add an item to the stack", but "add" would be ambiguous between the append() and extend() meanings) As a result, the specific-to-computer-science jargon loses out. The situation for `pop()` is different, as `remove()` is already taken for "remove an item from the list by value", so a different word is needed for "remove an item from the list by index". > - If append()/insert() are being removed and replaced, the complexity of > lists is slightly reduced. There's zero chance of the existing APIs going away - they're not broken, and they match common English phrasing. The fact they don't match common computer science jargon isn't ideal, but it's relevatively straightforward to define a stack data structure if someone really wants to do so. > Cons: > > - Would blur the line between lists and stacks. > > - The order of the parameters in push(obj, index = -1) would be the opposite > of the parameters in insert(index, obj), because defaulted parameters come > last. While I don't think it makes sense to add the method in the first place, if we did, it either wouldn't accept an index parameter, or else the index parameter would be a keyword-only argument. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From wolfgang.maier at biologie.uni-freiburg.de Tue May 23 06:12:11 2017
From: wolfgang.maier at biologie.uni-freiburg.de (Wolfgang Maier)
Date: Tue, 23 May 2017 12:12:11 +0200
Subject: [Python-ideas] tweaking the file system path protocol
Message-ID:

What do you think of this idea for a slight modification to os.fspath: the current version checks whether its arg is an instance of str, bytes or any subclass and, if so, returns the arg unchanged. In all other cases it tries to call the type's __fspath__ method to see if it can get str, bytes, or a subclass thereof this way.

My proposal is to change this to:

1) check whether the type of the argument is str or bytes *exactly*; if so, return the argument unchanged
2) check whether __fspath__ can be called on the type and returns an instance of str, bytes, or any subclass (just like in the current version)
3) check whether the type is a subclass of str or bytes and, if so, return it unchanged

This would have the following implications:

a) it would speed up the very common case when the arg is either a str or a bytes instance exactly
b) user-defined classes that inherit from str or bytes could control their path representation just like any other class
c) subclasses of str/bytes that don't define __fspath__ would still work like they do now, but their processing would be slower
d) subclasses of str/bytes that accidentally define a __fspath__ method would change their behavior

I think cases c) and d) could be sufficiently rare that the pros outweigh the cons?

Here's how the proposal could be implemented in the pure Python version (os._fspath):

    def _fspath(path):
        path_type = type(path)
        if path_type is str or path_type is bytes:
            return path
        # Work from the object's type to match method resolution of other magic
        # methods.
        try:
            path_repr = path_type.__fspath__(path)
        except AttributeError:
            if hasattr(path_type, '__fspath__'):
                raise
            elif issubclass(path_type, (str, bytes)):
                return path
            else:
                raise TypeError("expected str, bytes or os.PathLike object, "
                                "not " + path_type.__name__)
        if isinstance(path_repr, (str, bytes)):
            return path_repr
        else:
            raise TypeError("expected {}.__fspath__() to return str or bytes, "
                            "not {}".format(path_type.__name__,
                                            type(path_repr).__name__))

From steve at pearwood.info Tue May 23 06:49:59 2017
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 23 May 2017 20:49:59 +1000
Subject: [Python-ideas] tweaking the file system path protocol
In-Reply-To:
References:
Message-ID: <20170523104958.GB24625@ando.pearwood.info>

On Tue, May 23, 2017 at 12:12:11PM +0200, Wolfgang Maier wrote:

> Here's how the proposal could be implemented in the pure Python version
> (os._fspath):
>
> def _fspath(path):
>     path_type = type(path)
>     if path_type is str or path_type is bytes:
>         return path

How about simplifying the implementation of fspath by giving str and bytes a __fspath__ method that returns str(self) or bytes(self)?

    class str:
        def __fspath__(self):
            return str(self)  # Must be str, not type(self).

(1) We can avoid most of the expensive type checks.

(2) Subclasses of str and bytes don't have to do anything to get a useful default behaviour.

    def fspath(path):
        try:
            dunder = type(path).__fspath__
        except AttributeError:
            raise TypeError(...) from None
        else:
            if dunder is not None:
                result = dunder(path)
                if type(result) in (str, bytes):
                    return result
            raise TypeError('expected a str or bytes, got ...')

The reason for the not None check is to allow subclasses to explicitly deny that they can be used for paths by setting __fspath__ to None in the subclass.

-- Steve

From srkunze at mail.de Tue May 23 12:08:38 2017
From: srkunze at mail.de (Sven R.
Kunze) Date: Tue, 23 May 2017 18:08:38 +0200 Subject: [Python-ideas] JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects) In-Reply-To: References: <17a0b553-161b-cff5-9ad1-8710260d7ec6@mail.de> <89cf9642-266e-9979-bd58-1334252844e2@trueblade.com> <12b35e8c-28e6-b59e-8b4e-277c9ebe79cd@trueblade.com> <20170520001921.GW24625@ando.pearwood.info> Message-ID: <92234bdc-86c3-aba2-4b87-5b7472f6ad89@mail.de> On 20.05.2017 18:06, Nick Coghlan wrote: > That's fine for someone that's already comfortable writing those > behaviours by hand and just wants to automate the boilerplate away > (which is exactly the problem that attrs was written to solve), but > it's significantly more problematic once we assume people will be > using a feature like this *before* learning how to write out all the > corresponding boilerplate themselves (which is the key additional > complication that a language level version of this will have to > account for). That's a good point. At least, Python does a very good job at reducing boilerplate in many cases in the first case. One usually starts small instead of big. > [API sketch] > > In this particular API sketch, data_record is just a class decorator > factory, and data_field is a declarative helper type for use with that > factory, so if you wanted to factor out particular combinations, you'd > just write ordinary helper functions. I might weigh in that decorating classes seems to be a bit awkward. Especially because I know that there are many people and frameworks out there telling you to decorate your classes with their decorators. 
If you don't pay enough attention, you end up having something like this:

@nice_decoration(para1, para2)
@decorate_somehow
@important_decoration(para3)
@please_use_me()
class MyClass(Mixin2, Mixin1, BaseClass):
    para4 = 'foo'
    para5 = 'bar'
    para6 = 'boo'
    para7 = 'baz'

So, there's a region with decorations (decorators + params) OUTSIDE the
class and there's a region with declarations (mixins + params) INSIDE
the class; BOTH doing some sort of configuration of the class. I
honestly cannot tell which API style would be better but it seems we (at
least internally) decided for the latter version: "If it belongs to the
class, it should be inside the class."

Maybe, Python standard declarative class construction could be an
exception because it's the default. But I am not sure. Technically, I
think, both approaches can achieve the same result.

> Database object-relational-mapping layers like those in SQL Alchemy
> and Django would be the most famous precursors for this, but there are
> also things like Django Form definitions, and APIs like JSL (which
> uses Python classes to declaratively define JSON Schema documents).
>
> For folks already familiar with ORMs, declarative classes are just a
> matter of making in memory data structures as easy to work with as
> database backed ones. For folks that *aren't* familiar with ORMs yet,
> then declarative classes provide a potentially smoother learning
> curve, since the "declarative class" aspects can be better separated
> from the "object-relational mapping" aspects.

Well said.
Regards, Sven From k7hoven at gmail.com Tue May 23 12:17:49 2017 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Tue, 23 May 2017 19:17:49 +0300 Subject: [Python-ideas] tweaking the file system path protocol In-Reply-To: References: Message-ID: On Tue, May 23, 2017 at 1:12 PM, Wolfgang Maier wrote: > What do you think of this idea for a slight modification to os.fspath: > the current version checks whether its arg is an instance of str, bytes or > any subclass and, if so, returns the arg unchanged. In all other cases it > tries to call the type's __fspath__ method to see if it can get str, bytes, > or a subclass thereof this way. > > My proposal is to change this to: > 1) check whether the type of the argument is str or bytes *exactly*; if so, > return the argument unchanged > 2) check wether __fspath__ can be called on the type and returns an instance > of str, bytes, or any subclass (just like in the current version) > 3) check whether the type is a subclass of str or bytes and, if so, return > it unchanged The reason why this was not done was that a str or bytes subclass that implements __fspath__(self) would work in both pre-3.6 and 3.6+ but behave differently. This would be also be incompatible with existing code using str(path) for compatibility with the stdlib (the old way, which people still use for pre-3.6 compatibility even in new code). > This would have the following implications: > a) it would speed up the very common case when the arg is either a str or a > bytes instance exactly To get the same performance benefit for str and bytes, but without changing functionality, there could first be the exact type check and then the isinstance check. This would add some performance penalty for PathLike objects. Removing the isinstance part of the __fspath__() return value, which I find less useful, would compensate for that. (3) would not be necessary in this version. 
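
A minimal sketch of the variant described in the paragraph above, assuming
it is layered on the pure-Python os._fspath from 3.6. The exact-type check
comes first, the isinstance check is kept so that str/bytes subclasses are
returned unchanged exactly as before, and only then is __fspath__ consulted.
The name fspath_compat is made up for illustration; this is not the stdlib
implementation.

```python
def fspath_compat(path):
    """Hypothetical os.fspath variant: fast path first, semantics unchanged."""
    path_type = type(path)

    # Fast path: exact str/bytes need no further inspection.
    if path_type is str or path_type is bytes:
        return path

    # Kept from 3.6: subclasses of str/bytes are returned as-is, so any
    # __fspath__ they define is still ignored (no change in behaviour).
    if isinstance(path, (str, bytes)):
        return path

    # Everything else goes through the path protocol, looked up on the type.
    try:
        path_repr = path_type.__fspath__(path)
    except AttributeError:
        if hasattr(path_type, "__fspath__"):
            raise
        raise TypeError("expected str, bytes or os.PathLike object, "
                        "not " + path_type.__name__)

    if isinstance(path_repr, (str, bytes)):
        return path_repr
    raise TypeError("expected {}.__fspath__() to return str or bytes, "
                    "not {}".format(path_type.__name__,
                                    type(path_repr).__name__))
```

With this ordering, plain strings skip the isinstance call, while PathLike
objects pay for one extra failed check before __fspath__ is tried, which is
the penalty mentioned above.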
Are you asking for other reasons, or because you actually have a use
case where this matters? If this performance really matters somewhere,
the version I describe above could be considered. It would have 100%
backwards compatibility, or a little less (99% ?) if the isinstance
check of the __fspath__() return value is removed for performance
compensation.

> b) user-defined classes that inherit from str or bytes could control their
> path representation just like any other class

Again, this would cause differences in behavior between different
Python versions, and based on whether str(path) is used or not.

-- Koos

-- 
+ Koos Zevenhoven + http://twitter.com/k7hoven +
From k7hoven at gmail.com Tue May 23 12:28:44 2017
From: k7hoven at gmail.com (Koos Zevenhoven)
Date: Tue, 23 May 2017 19:28:44 +0300
Subject: [Python-ideas] tweaking the file system path protocol
In-Reply-To: <20170523104958.GB24625@ando.pearwood.info>
References: <20170523104958.GB24625@ando.pearwood.info>
Message-ID: 

On Tue, May 23, 2017 at 1:49 PM, Steven D'Aprano wrote:
>
> How about simplifying the implementation of fspath by giving str and
> bytes a __fspath__ method that returns str(self) or bytes(self)?
>

The compatibility issue I mention in the other email I just sent as a
response to the OP will appear if a subclass returns something other
than str(self) or bytes(self).
?Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven + From wolfgang.maier at biologie.uni-freiburg.de Tue May 23 12:53:44 2017 From: wolfgang.maier at biologie.uni-freiburg.de (Wolfgang Maier) Date: Tue, 23 May 2017 18:53:44 +0200 Subject: [Python-ideas] tweaking the file system path protocol In-Reply-To: <280641a5-e19e-1561-d9a0-57ea8e1d03e9@biologie.uni-freiburg.de> References: <280641a5-e19e-1561-d9a0-57ea8e1d03e9@biologie.uni-freiburg.de> Message-ID: <11ab5f1b-98e0-0798-fc1c-9f41b9b6ac53@biologie.uni-freiburg.de> On 05/23/2017 06:41 PM, Wolfgang Maier wrote: > On 05/23/2017 06:17 PM, Koos Zevenhoven wrote: >> On Tue, May 23, 2017 at 1:12 PM, Wolfgang Maier >> wrote: >>> What do you think of this idea for a slight modification to os.fspath: >>> the current version checks whether its arg is an instance of str, >>> bytes or >>> any subclass and, if so, returns the arg unchanged. In all other >>> cases it >>> tries to call the type's __fspath__ method to see if it can get str, >>> bytes, >>> or a subclass thereof this way. >>> >>> My proposal is to change this to: >>> 1) check whether the type of the argument is str or bytes *exactly*; >>> if so, >>> return the argument unchanged >>> 2) check wether __fspath__ can be called on the type and returns an >>> instance >>> of str, bytes, or any subclass (just like in the current version) >>> 3) check whether the type is a subclass of str or bytes and, if so, >>> return >>> it unchanged >> > > Hi Koos and thanks for your detailed response, > >> The reason why this was not done was that a str or bytes subclass that >> implements __fspath__(self) would work in both pre-3.6 and 3.6+ but >> behave differently. This would be also be incompatible with existing >> code using str(path) for compatibility with the stdlib (the old way, >> which people still use for pre-3.6 compatibility even in new code). 
>> > > I'm not sure that sounds very convincing because that exact problem > exists, was discussed and accepted in your PEP 519 for all other > classes. I do not really see why subclasses of str and bytes should > require special backwards compatibility here. Is there a reason why you > are thinking they should be treated specially? > Ah, sorry, I misunderstood what you were trying to say, but now I'm getting it! subclasses of str and bytes were of course usable as path arguments before simply because they were subclasses of them. Now they would be picked up based on their __fspath__ method, but old versions of Python executing code using them would still use them directly. Have to think about this one a bit, but thanks for pointing it out. >>> This would have the following implications: >>> a) it would speed up the very common case when the arg is either a >>> str or a >>> bytes instance exactly >> >> To get the same performance benefit for str and bytes, but without >> changing functionality, there could first be the exact type check and >> then the isinstance check. This would add some performance penalty for >> PathLike objects. Removing the isinstance part of the __fspath__() >> return value, which I find less useful, would compensate for that. (3) >> would not be necessary in this version. >> > > Right, that was one thing I forgot to mention in my list. My proposal > would also speed up processing of pathlike objects because it moves the > __fspath__ call up in front of the isinstance check. Your alternative > would speed up only str and bytes, but would slow down Path-like classes. 
> In addition, I'm not sure that removing the isinstance check on the > return value of __fspath__() is a good idea because that would mean > giving up the guarantee that os.fspath returns an instance of str or > bytes and would effectively force library code to do the isinstance > check anyway even if the function may have performed it already, which > would worsen performance further. > >> Are you asking for other reasons, or because you actually have a use >> case where this matters? If this performance really matters somewhere, >> the version I describe above could be considered. It would have 100% >> backwards compatibility, or a little less (99% ?) if the isinstance >> check of the __fspath__() return value is removed for performance >> compensation. >> > > That use case question is somewhat difficult to answer. I had this idea > when working on two bug tracker issues (one concerning fnmatch and a > follow-up one on os.path.normcase, which is called by fnmatch.filter > and, in turn, calls os.fspath. fnmatchfilter is a case where performance > matters and the decision when and where to call the rather expensive > os.path.normcase->os.fspath there is not entirely straightforward. So, > yes, I was basically looking at this because of a potential use case, > but I say potential because I'm far from sure that any speed gain in > os.fspath will be big enough to be useful for fnmatch.filter in the end. > > >>> b) user-defined classes that inherit from str or bytes could control >>> their >>> path representation just like any other class >> >> Again, this would cause differences in behavior between different >> Python versions, and based on whether str(path) is used or not. 
>> >> ?Koos >> >> From wolfgang.maier at biologie.uni-freiburg.de Tue May 23 12:41:30 2017 From: wolfgang.maier at biologie.uni-freiburg.de (Wolfgang Maier) Date: Tue, 23 May 2017 18:41:30 +0200 Subject: [Python-ideas] tweaking the file system path protocol In-Reply-To: References: Message-ID: <280641a5-e19e-1561-d9a0-57ea8e1d03e9@biologie.uni-freiburg.de> On 05/23/2017 06:17 PM, Koos Zevenhoven wrote: > On Tue, May 23, 2017 at 1:12 PM, Wolfgang Maier > wrote: >> What do you think of this idea for a slight modification to os.fspath: >> the current version checks whether its arg is an instance of str, bytes or >> any subclass and, if so, returns the arg unchanged. In all other cases it >> tries to call the type's __fspath__ method to see if it can get str, bytes, >> or a subclass thereof this way. >> >> My proposal is to change this to: >> 1) check whether the type of the argument is str or bytes *exactly*; if so, >> return the argument unchanged >> 2) check wether __fspath__ can be called on the type and returns an instance >> of str, bytes, or any subclass (just like in the current version) >> 3) check whether the type is a subclass of str or bytes and, if so, return >> it unchanged > Hi Koos and thanks for your detailed response, > The reason why this was not done was that a str or bytes subclass that > implements __fspath__(self) would work in both pre-3.6 and 3.6+ but > behave differently. This would be also be incompatible with existing > code using str(path) for compatibility with the stdlib (the old way, > which people still use for pre-3.6 compatibility even in new code). > I'm not sure that sounds very convincing because that exact problem exists, was discussed and accepted in your PEP 519 for all other classes. I do not really see why subclasses of str and bytes should require special backwards compatibility here. Is there a reason why you are thinking they should be treated specially? 
>> This would have the following implications: >> a) it would speed up the very common case when the arg is either a str or a >> bytes instance exactly > > To get the same performance benefit for str and bytes, but without > changing functionality, there could first be the exact type check and > then the isinstance check. This would add some performance penalty for > PathLike objects. Removing the isinstance part of the __fspath__() > return value, which I find less useful, would compensate for that. (3) > would not be necessary in this version. > Right, that was one thing I forgot to mention in my list. My proposal would also speed up processing of pathlike objects because it moves the __fspath__ call up in front of the isinstance check. Your alternative would speed up only str and bytes, but would slow down Path-like classes. In addition, I'm not sure that removing the isinstance check on the return value of __fspath__() is a good idea because that would mean giving up the guarantee that os.fspath returns an instance of str or bytes and would effectively force library code to do the isinstance check anyway even if the function may have performed it already, which would worsen performance further. > Are you asking for other reasons, or because you actually have a use > case where this matters? If this performance really matters somewhere, > the version I describe above could be considered. It would have 100% > backwards compatibility, or a little less (99% ?) if the isinstance > check of the __fspath__() return value is removed for performance > compensation. > That use case question is somewhat difficult to answer. I had this idea when working on two bug tracker issues (one concerning fnmatch and a follow-up one on os.path.normcase, which is called by fnmatch.filter and, in turn, calls os.fspath. 
fnmatchfilter is a case where performance matters and the decision when and where to call the rather expensive os.path.normcase->os.fspath there is not entirely straightforward. So, yes, I was basically looking at this because of a potential use case, but I say potential because I'm far from sure that any speed gain in os.fspath will be big enough to be useful for fnmatch.filter in the end. >> b) user-defined classes that inherit from str or bytes could control their >> path representation just like any other class > > Again, this would cause differences in behavior between different > Python versions, and based on whether str(path) is used or not. > > ?Koos > > From brett at python.org Tue May 23 13:04:29 2017 From: brett at python.org (Brett Cannon) Date: Tue, 23 May 2017 17:04:29 +0000 Subject: [Python-ideas] tweaking the file system path protocol In-Reply-To: References: Message-ID: On Tue, 23 May 2017 at 03:13 Wolfgang Maier < wolfgang.maier at biologie.uni-freiburg.de> wrote: > What do you think of this idea for a slight modification to os.fspath: > the current version checks whether its arg is an instance of str, bytes > or any subclass and, if so, returns the arg unchanged. In all other > cases it tries to call the type's __fspath__ method to see if it can get > str, bytes, or a subclass thereof this way. 
> > My proposal is to change this to: > 1) check whether the type of the argument is str or bytes *exactly*; if > so, return the argument unchanged > 2) check wether __fspath__ can be called on the type and returns an > instance of str, bytes, or any subclass (just like in the current version) > 3) check whether the type is a subclass of str or bytes and, if so, > return it unchanged > > This would have the following implications: > a) it would speed up the very common case when the arg is either a str > or a bytes instance exactly > b) user-defined classes that inherit from str or bytes could control > their path representation just like any other class > c) subclasses of str/bytes that don't define __fspath__ would still work > like they do now, but their processing would be slower > d) subclasses of str/bytes that accidentally define a __fspath__ method > would change their behavior > > I think cases c) and d) could be sufficiently rare that the pros > outweigh the cons? > What exactly is the performance issue you are having that is leading to this proposal? I ask because b) and d) change semantics and so it's not a small thing to make this change at this point since Python 3.6 has been released. So unless there's a major performance impact I'm reluctant to want to change it at this point. -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Tue May 23 13:49:30 2017 From: guido at python.org (Guido van Rossum) Date: Tue, 23 May 2017 10:49:30 -0700 Subject: [Python-ideas] tweaking the file system path protocol In-Reply-To: References: Message-ID: I see no future for this proposal. Sorry Wolfgang! For future reference, the proposal was especially weak because it gave no concrete examples of code that was inconvenienced in any way by the current behavior. (And the performance hack of checking for exact str/bytes can be made without changing the semantics.) 
On Tue, May 23, 2017 at 10:04 AM, Brett Cannon wrote: > > > On Tue, 23 May 2017 at 03:13 Wolfgang Maier freiburg.de> wrote: > >> What do you think of this idea for a slight modification to os.fspath: >> the current version checks whether its arg is an instance of str, bytes >> or any subclass and, if so, returns the arg unchanged. In all other >> cases it tries to call the type's __fspath__ method to see if it can get >> str, bytes, or a subclass thereof this way. >> >> My proposal is to change this to: >> 1) check whether the type of the argument is str or bytes *exactly*; if >> so, return the argument unchanged >> 2) check wether __fspath__ can be called on the type and returns an >> instance of str, bytes, or any subclass (just like in the current version) >> 3) check whether the type is a subclass of str or bytes and, if so, >> return it unchanged >> >> This would have the following implications: >> a) it would speed up the very common case when the arg is either a str >> or a bytes instance exactly >> b) user-defined classes that inherit from str or bytes could control >> their path representation just like any other class >> c) subclasses of str/bytes that don't define __fspath__ would still work >> like they do now, but their processing would be slower >> d) subclasses of str/bytes that accidentally define a __fspath__ method >> would change their behavior >> >> I think cases c) and d) could be sufficiently rare that the pros >> outweigh the cons? >> > > What exactly is the performance issue you are having that is leading to > this proposal? I ask because b) and d) change semantics and so it's not a > small thing to make this change at this point since Python 3.6 has been > released. So unless there's a major performance impact I'm reluctant to > want to change it at this point. 
> > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From k7hoven at gmail.com Tue May 23 14:22:08 2017 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Tue, 23 May 2017 21:22:08 +0300 Subject: [Python-ideas] tweaking the file system path protocol In-Reply-To: <11ab5f1b-98e0-0798-fc1c-9f41b9b6ac53@biologie.uni-freiburg.de> References: <280641a5-e19e-1561-d9a0-57ea8e1d03e9@biologie.uni-freiburg.de> <11ab5f1b-98e0-0798-fc1c-9f41b9b6ac53@biologie.uni-freiburg.de> Message-ID: On Tue, May 23, 2017 at 7:53 PM, Wolfgang Maier wrote: > > Ah, sorry, I misunderstood what you were trying to say, but now I'm getting > it! subclasses of str and bytes were of course usable as path arguments > before simply because they were subclasses of them. Now they would be picked > up based on their __fspath__ method, but old versions of Python executing > code using them would still use them directly. Have to think about this one > a bit, but thanks for pointing it out. > Yes, this is exactly what I meant. I noticed I had left out some of the details of the reasoning, sorry. I tried to fix that in my response to Steven. ? 
Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven + From storchaka at gmail.com Tue May 23 17:18:16 2017 From: storchaka at gmail.com (Serhiy Storchaka) Date: Wed, 24 May 2017 00:18:16 +0300 Subject: [Python-ideas] tweaking the file system path protocol In-Reply-To: References: Message-ID: 23.05.17 20:04, Brett Cannon ????: > On Tue, 23 May 2017 at 03:13 Wolfgang Maier > > wrote: > My proposal is to change this to: > 1) check whether the type of the argument is str or bytes *exactly*; if > so, return the argument unchanged > 2) check wether __fspath__ can be called on the type and returns an > instance of str, bytes, or any subclass (just like in the current > version) > 3) check whether the type is a subclass of str or bytes and, if so, > return it unchanged > > What exactly is the performance issue you are having that is leading to > this proposal? It seems to me that the purpose of this proposition is not performance, but the possibility to use __fspath__ in str or bytes subclasses. Currently defining __fspath__ in str or bytes subclasses doesn't have any effect. I don't know a reasonable use case for this feature. The __fspath__ method of str or bytes subclasses returning something not equivalent to self looks confusing to me. From k7hoven at gmail.com Tue May 23 17:31:06 2017 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Wed, 24 May 2017 00:31:06 +0300 Subject: [Python-ideas] tweaking the file system path protocol In-Reply-To: References: Message-ID: On Wed, May 24, 2017 at 12:18 AM, Serhiy Storchaka wrote: > It seems to me that the purpose of this proposition is not performance, but > the possibility to use __fspath__ in str or bytes subclasses. Currently > defining __fspath__ in str or bytes subclasses doesn't have any effect. > > I don't know a reasonable use case for this feature. The __fspath__ method > of str or bytes subclasses returning something not equivalent to self looks > confusing to me. Yes, that would be another reason. 
Only when Python drops support for strings as paths, can people start
writing such subclasses. I'm sure many would now say dropping str/bytes
path support won't even happen in Python 4.

-- Koos

-- 
+ Koos Zevenhoven + http://twitter.com/k7hoven +
From tritium-list at sdamon.com Tue May 23 18:33:13 2017
From: tritium-list at sdamon.com (tritium-list at sdamon.com)
Date: Tue, 23 May 2017 18:33:13 -0400
Subject: [Python-ideas] tweaking the file system path protocol
In-Reply-To: 
References: 
Message-ID: <024801d2d414$917257b0$b4570710$@hotmail.com>

> -----Original Message-----
> From: Python-ideas [mailto:python-ideas-bounces+tritium-
> list=sdamon.com at python.org] On Behalf Of Koos Zevenhoven
> Sent: Tuesday, May 23, 2017 5:31 PM
> To: Serhiy Storchaka
> Cc: python-ideas
> Subject: Re: [Python-ideas] tweaking the file system path protocol
>
> On Wed, May 24, 2017 at 12:18 AM, Serhiy Storchaka
> wrote:
> > It seems to me that the purpose of this proposition is not performance, but
> > the possibility to use __fspath__ in str or bytes subclasses. Currently
> > defining __fspath__ in str or bytes subclasses doesn't have any effect.
> >
> > I don't know a reasonable use case for this feature. The __fspath__
> method
> > of str or bytes subclasses returning something not equivalent to self looks
> > confusing to me.
>
> Yes, that would be another reason.
>
> Only when Python drops support for strings as paths, can
> people start writing such subclasses. I'm sure many
> would now say dropping str/bytes path support won't even happen in
> Python 4.
>
> -- Koos

It is highly unlikely that Python will ever drop str/bytes support for
dealing with filesystem paths; case in point, they just ADDED bytes
support back for Windows filesystem paths.
From steve at pearwood.info Tue May 23 20:41:26 2017
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 24 May 2017 10:41:26 +1000
Subject: [Python-ideas] tweaking the file system path protocol
In-Reply-To: 
References: 
Message-ID: <20170524004125.GD24625@ando.pearwood.info>

On Wed, May 24, 2017 at 12:18:16AM +0300, Serhiy Storchaka wrote:
> 23.05.17 20:04, Brett Cannon wrote:
> >What exactly is the performance issue you are having that is leading to
> >this proposal?
>
> It seems to me that the purpose of this proposition is not performance,
> but the possibility to use __fspath__ in str or bytes subclasses.
> Currently defining __fspath__ in str or bytes subclasses doesn't have
> any effect.

That's how I interpreted the proposal, with any performance issue being
secondary. (I don't expect that converting path-like objects to strings
would be the bottleneck in any application doing actual disk IO.)

> I don't know a reasonable use case for this feature. The __fspath__
> method of str or bytes subclasses returning something not equivalent to
> self looks confusing to me.

I can imagine at least two:

- emulating something like DOS 8.3 versus long file names;
- case normalisation

but what would make this really useful is for debugging. For instance, I
have used something like this to debug problems with int() being called
wrongly:

py> class MyInt(int):
...     def __int__(self):
...         print("__int__ called")
...         return super().__int__()
...
py> x = MyInt(23)
py> int(x)
__int__ called
23

It would be annoying and inconsistent if int(x) avoided calling __int__
on int subclasses. But that's exactly what happens with fspath and str.
I see that as a bug, not a feature: I find it hard to believe that we
would design an interface for string-like objects (paths) and then
intentionally prohibit it from applying to strings.

And if we did, surely it's a misfeature. Why *shouldn't* subclasses of
str get the same opportunity to customize the result of __fspath__ as
they get to customize their __repr__ and __str__?

py> class MyStr(str):
...     def __repr__(self):
...         return 'repr'
...     def __str__(self):
...         return 'str'
...
py> s = MyStr('abcdef')
py> repr(s)
'repr'
py> str(s)
'str'

I don't think that backwards compatibility is an issue here. Nobody will
have had reason to write str subclasses with __fspath__ methods, so
changing the behaviour to no longer ignore them shouldn't break any
code. But of course, we should treat this as a new feature, and only
change the behaviour in 3.7.

-- 
Steve
From wolfgang.maier at biologie.uni-freiburg.de Wed May 24 10:52:12 2017
From: wolfgang.maier at biologie.uni-freiburg.de (Wolfgang Maier)
Date: Wed, 24 May 2017 16:52:12 +0200
Subject: [Python-ideas] tweaking the file system path protocol
In-Reply-To: <20170524004125.GD24625@ando.pearwood.info>
References: <20170524004125.GD24625@ando.pearwood.info>
Message-ID: 

On 05/24/2017 02:41 AM, Steven D'Aprano wrote:
> On Wed, May 24, 2017 at 12:18:16AM +0300, Serhiy Storchaka wrote:
>>
>> It seems to me that the purpose of this proposition is not performance,
>> but the possibility to use __fspath__ in str or bytes subclasses.
>> Currently defining __fspath__ in str or bytes subclasses doesn't have
>> any effect.
>
> That's how I interpreted the proposal, with any performance issue being
> secondary. (I don't expect that converting path-like objects to strings
> would be the bottleneck in any application doing actual disk IO.)
>
>
>> I don't know a reasonable use case for this feature. The __fspath__
>> method of str or bytes subclasses returning something not equivalent to
>> self looks confusing to me.
>
> I can imagine at least two:
>
> - emulating something like DOS 8.3 versus long file names;
> - case normalisation
>
> but what would make this really useful is for debugging.
For instance, I
> have used something like this to debug problems with int() being called
> wrongly:
>
> py> class MyInt(int):
> ...     def __int__(self):
> ...         print("__int__ called")
> ...         return super().__int__()
> ...
> py> x = MyInt(23)
> py> int(x)
> __int__ called
> 23
>
> It would be annoying and inconsistent if int(x) avoided calling __int__
> on int subclasses. But that's exactly what happens with fspath and str.
> I see that as a bug, not a feature: I find it hard to believe that we
> would design an interface for string-like objects (paths) and then
> intentionally prohibit it from applying to strings.
>
> And if we did, surely it's a misfeature. Why *shouldn't* subclasses of
> str get the same opportunity to customize the result of __fspath__ as
> they get to customize their __repr__ and __str__?
>
> py> class MyStr(str):
> ...     def __repr__(self):
> ...         return 'repr'
> ...     def __str__(self):
> ...         return 'str'
> ...
> py> s = MyStr('abcdef')
> py> repr(s)
> 'repr'
> py> str(s)
> 'str'
>

This is almost exactly what I have been thinking (just that I couldn't
have presented it so clearly)!

Let's look at a potential use case for this. Assume that in a package you
want to handle several paths to different files and directories that are
all located in a common package-specific parent directory. Then using
the path protocol you could write this:

import os

class PackageBase (object):
    basepath = '/home/.package'

class PackagePath (str, PackageBase):
    def __fspath__(self):
        return os.path.join(self.basepath, str(self))

config_file = PackagePath('.config')
log_file = PackagePath('events.log')
data_dir = PackagePath('data')

# open for writing so that log.write() can succeed
with open(log_file, 'w') as log:
    log.write('package paths initialized.\n')

Just that this wouldn't currently work because PackagePath inherits from
str. Of course, there are other ways to achieve the above, but when you
think about designing a Path-like object class str is just a pretty
attractive base class to start from.
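
One of the "other ways" that already works on 3.6 is composition instead
of inheritance. The sketch below reuses the PackagePath name and the
illustrative '/home/.package' directory from the example above, but the
class layout is an assumption for demonstration, not something proposed in
this thread. Because the object is not a str, os.fspath() goes through
__fspath__.

```python
import os


class PackagePath:
    """Path-like object that wraps a relative name instead of subclassing str."""

    basepath = "/home/.package"  # illustrative parent directory from the post

    def __init__(self, name):
        self.name = name

    def __fspath__(self):
        # Called by os.fspath(), open(), os.path.* and friends on 3.6+.
        return os.path.join(self.basepath, self.name)

    def __repr__(self):
        return "PackagePath({!r})".format(self.name)


config_file = PackagePath(".config")
log_file = PackagePath("events.log")

print(os.fspath(config_file))      # full path below basepath
print(os.path.basename(log_file))  # os.path functions accept it too
```

The price is that such an object no longer supports string methods
directly, which is the convenience that makes str an attractive base class.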
Now let's look at compatibility of a class like PackagePath under this
proposal:

- if client code uses e.g. str(config_file) and proceeds to treat the
resulting object as a path, unexpected things will happen and, yes,
that's bad. However, this is no different from any other Path-like
object for which __str__ and __fspath__ don't define the same return value.

- if client code uses the PEP-recommended backwards-compatible way of
dealing with paths,

path.__fspath__() if hasattr(path, "__fspath__") else path

things will just work. Interestingly, this would *currently* produce an
unexpected result, namely that it would execute the __fspath__ method of
the str subclass, which os.fspath itself ignores.

- if client code uses instances of PackagePath as paths directly, then in
Python 3.6 and below that would lead to an unintended outcome, while in
Python 3.7 things would work. This is *really* bad.

But what it means is that, under the proposal, using a str or bytes
subclass with an __fspath__ method defined makes your code
backwards-incompatible and the solution would be not to use such a class
if you want to be backwards-compatible (and that should get documented
somewhere). This restriction, of course, limits the usefulness of the
proposal in the near future, but that disadvantage will vanish over
time. In 5 years, not supporting Python 3.6 anymore maybe won't be a big
deal anymore (for comparison, Python 3.2 was released 6 years ago and
since last year pip no longer supports it). As Steven pointed out, the
proposal is *very* unlikely to break existing code.
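
To make the second bullet concrete, here is a small sketch of the
inconsistency under the 3.6 semantics. Shouty and compat_fspath are
made-up names for illustration; only os.fspath is real API. The idiom
calls the subclass's __fspath__, while os.fspath returns the str subclass
instance untouched.

```python
import os


class Shouty(str):
    """Hypothetical str subclass whose __fspath__ differs from its own value."""

    def __fspath__(self):
        return self.upper()


def compat_fspath(path):
    # The backwards-compatible idiom quoted above.
    return path.__fspath__() if hasattr(path, "__fspath__") else path


p = Shouty("data.txt")
via_idiom = compat_fspath(p)  # runs Shouty.__fspath__
via_os = os.fspath(p)         # isinstance(p, str) wins, __fspath__ is ignored

print(repr(via_idiom), repr(via_os))
```

Under the proposal both calls would agree, which is the consistency
argument made in the summary that follows.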
So to summarize, the proposal

- avoids an up-front isinstance check in the protocol and thereby speeds
  up the processing of exact strings and bytes and of anything that
  follows the path protocol.*

- slows down the processing of instances of regular str and bytes
  subclasses*

- makes the "path.__fspath__() if hasattr(path, "__fspath__") else path"
  idiom consistent for subclasses of str and bytes that define __fspath__

- opens up the opportunity to write str/bytes subclasses that represent
  a path other than just themselves in the future**

Still sounds like a net win to me, but let's see what I forgot ...

* yes, speed is typically not your primary concern when it comes to IO;
what's often neglected though is that not all path operations have to
trigger actual IO (things in os.path for example don't typically perform
IO)

** somebody on the list (I guess it was Koos?) mentioned that such
classes would only make sense if Python ever disallowed the use of
str/bytes as paths, but I don't think that is a prerequisite here.

Wolfgang

From ericsnowcurrently at gmail.com Wed May 24 21:01:10 2017
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Wed, 24 May 2017 18:01:10 -0700
Subject: [Python-ideas] Exposing CPython's subinterpreter C-API in the stdlib.
Message-ID:

Although I haven't been able to achieve the pace that I originally
wanted, I have been able to work on my multi-core Python idea
little-by-little. Most notably, some of the blockers have been
resolved at the recent PyCon sprints and I'm ready to move onto the
next step: exposing multiple interpreters via a stdlib module.

Initially I just want to expose basic support via 3 successive
changes. Below I've listed the corresponding (chained) PRs, along
with what they add. Note that the 2 proposed modules take some cues
from the threading module, but don't try to be any sort of
replacement. Threading and subinterpreters are two different features
that are used together rather than as alternatives to one another.
At the very least I'd like to move forward with the _interpreters module sooner rather than later. Doing so will facilitate more extensive testing of subinterpreters, in preparation for further use of them in the multi-core Python project. We can iterate from there, but I'd at least like to get the basic functionality landed early. Any objections to (or feedback about) the low-level _interpreters module as described? Likewise for the high-level interpreters module? Discussion on any expanded functionality for the modules or on the broader topic of the multi-core project are both welcome, but please start other threads for those topics. -eric basic low-level API: https://github.com/python/cpython/pull/1748 _interpreters.create() -> id _interpreters.destroy(id) _interpreters.run_string(id, code) _interpreters.run_string_unrestricted(id, code, ns=None) -> ns extra low-level API: https://github.com/python/cpython/pull/1802 _interpreters.enumerate() -> [id, ...] _interpreters.get_current() -> id _interpreters.get_main() -> id _interpreters.is_running(id) -> bool basic high-level API: https://github.com/python/cpython/pull/1803 interpreters.enumerate() -> [Interpreter, ...] interpreters.get_current() -> Interpreter interpreters.get_main() -> Interpreter interpreters.create() -> Interpreter interpreters.Interpreter(id) interpreters.Interpreter.is_running() interpreters.Interpreter.destroy() interpreters.Interpreter.run(code) From njs at pobox.com Wed May 24 23:16:03 2017 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 24 May 2017 20:16:03 -0700 Subject: [Python-ideas] Exposing CPython's subinterpreter C-API in the stdlib. In-Reply-To: References: Message-ID: CC'ing PyPy-dev... On Wed, May 24, 2017 at 6:01 PM, Eric Snow wrote: > Although I haven't been able to achieve the pace that I originally > wanted, I have been able to work on my multi-core Python idea > little-by-little. 
Most notably, some of the blockers have been > resolved at the recent PyCon sprints and I'm ready to move onto the > next step: exposing multiple interpreters via a stdlib module. > > Initially I just want to expose basic support via 3 successive > changes. Below I've listed the corresponding (chained) PRs, along > with what they add. Note that the 2 proposed modules take some cues > from the threading module, but don't try to be any sort of > replacement. Threading and subinterpreters are two different features > that are used together rather than as alternatives to one another. > > At the very least I'd like to move forward with the _interpreters > module sooner rather than later. Doing so will facilitate more > extensive testing of subinterpreters, in preparation for further use > of them in the multi-core Python project. We can iterate from there, > but I'd at least like to get the basic functionality landed early. > Any objections to (or feedback about) the low-level _interpreters > module as described? Likewise for the high-level interpreters module? > > Discussion on any expanded functionality for the modules or on the > broader topic of the multi-core project are both welcome, but please > start other threads for those topics. > > -eric > > > basic low-level API: https://github.com/python/cpython/pull/1748 > > _interpreters.create() -> id > _interpreters.destroy(id) > _interpreters.run_string(id, code) > _interpreters.run_string_unrestricted(id, code, ns=None) -> ns > > extra low-level API: https://github.com/python/cpython/pull/1802 > > _interpreters.enumerate() -> [id, ...] > _interpreters.get_current() -> id > _interpreters.get_main() -> id > _interpreters.is_running(id) -> bool > > basic high-level API: https://github.com/python/cpython/pull/1803 > > interpreters.enumerate() -> [Interpreter, ...] 
> interpreters.get_current() -> Interpreter > interpreters.get_main() -> Interpreter > interpreters.create() -> Interpreter > interpreters.Interpreter(id) > interpreters.Interpreter.is_running() > interpreters.Interpreter.destroy() > interpreters.Interpreter.run(code) > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -- Nathaniel J. Smith -- https://vorpus.org From guido at python.org Wed May 24 23:30:42 2017 From: guido at python.org (Guido van Rossum) Date: Wed, 24 May 2017 20:30:42 -0700 Subject: [Python-ideas] Exposing CPython's subinterpreter C-API in the stdlib. In-Reply-To: References: Message-ID: Hm... Curiously, I've heard a few people at PyCon mention they thought subinterpreters were broken and not useful (and they share the GIL anyways) and should be taken out. So we should at least have clarity on which direction we want to take... On Wed, May 24, 2017 at 6:01 PM, Eric Snow wrote: > Although I haven't been able to achieve the pace that I originally > wanted, I have been able to work on my multi-core Python idea > little-by-little. Most notably, some of the blockers have been > resolved at the recent PyCon sprints and I'm ready to move onto the > next step: exposing multiple interpreters via a stdlib module. > > Initially I just want to expose basic support via 3 successive > changes. Below I've listed the corresponding (chained) PRs, along > with what they add. Note that the 2 proposed modules take some cues > from the threading module, but don't try to be any sort of > replacement. Threading and subinterpreters are two different features > that are used together rather than as alternatives to one another. > > At the very least I'd like to move forward with the _interpreters > module sooner rather than later. 
Doing so will facilitate more > extensive testing of subinterpreters, in preparation for further use > of them in the multi-core Python project. We can iterate from there, > but I'd at least like to get the basic functionality landed early. > Any objections to (or feedback about) the low-level _interpreters > module as described? Likewise for the high-level interpreters module? > > Discussion on any expanded functionality for the modules or on the > broader topic of the multi-core project are both welcome, but please > start other threads for those topics. > > -eric > > > basic low-level API: https://github.com/python/cpython/pull/1748 > > _interpreters.create() -> id > _interpreters.destroy(id) > _interpreters.run_string(id, code) > _interpreters.run_string_unrestricted(id, code, ns=None) -> ns > > extra low-level API: https://github.com/python/cpython/pull/1802 > > _interpreters.enumerate() -> [id, ...] > _interpreters.get_current() -> id > _interpreters.get_main() -> id > _interpreters.is_running(id) -> bool > > basic high-level API: https://github.com/python/cpython/pull/1803 > > interpreters.enumerate() -> [Interpreter, ...] > interpreters.get_current() -> Interpreter > interpreters.get_main() -> Interpreter > interpreters.create() -> Interpreter > interpreters.Interpreter(id) > interpreters.Interpreter.is_running() > interpreters.Interpreter.destroy() > interpreters.Interpreter.run(code) > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From me at louie.lu Wed May 24 23:57:09 2017
From: me at louie.lu (Louie Lu)
Date: Thu, 25 May 2017 11:57:09 +0800
Subject: [Python-ideas] pdb / bdb adding watchpoint ability
Message-ID:

Hi all,

pdb and bdb currently have no way to set a watchpoint on a variable.
gdb, for comparison, provides a family of watchpoint commands:

* watch: break when the value of an expression changes
* rwatch: break when the value of an expression is read
* awatch: break when the value of an expression is read or written

In Python, implementing rwatch is problematic because we can't directly
tell whether a variable has been read. Doing so requires using the dis
module to inspect the bytecode and determine whether the variable is
read (loaded).

The b.p.o issue is open here: http://bugs.python.org/issue30429

The current implementation is at:
https://github.com/python/cpython/pull/1756

What do you think about adding this ability to pdb/bdb?

-Louie

From ncoghlan at gmail.com Thu May 25 11:05:17 2017
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 26 May 2017 01:05:17 +1000
Subject: [Python-ideas] Exposing CPython's subinterpreter C-API in the stdlib.
In-Reply-To:
References:
Message-ID:

On 25 May 2017 at 13:30, Guido van Rossum wrote:
> Hm... Curiously, I've heard a few people at PyCon mention they thought
> subinterpreters were broken and not useful (and they share the GIL anyways)
> and should be taken out.

Taking them out entirely would break mod_wsgi (and hence the use of
Apache httpd as a Python application server), so I hope we don't
consider going down that path :)

As far as the GIL goes, Eric has a few ideas around potentially
getting to a tiered locking approach, where the GIL becomes a
Read/Write lock shared across the interpreters, and there are separate
subinterpreter locks to guard actual code execution.
That becomes a lot more feasible in a subinterpreter model, since the eval loop and various other structures are already separate - the tiered locking would mainly need to account for management of "object ownership" that prevented multiple interpreters from accessing the same object at the same time. However, I do think subinterpreters can be accurately characterised as fragile, especially in the presence of extension modules. I also think a large part of that fragility can be laid at the feet of them currently being incredibly difficult to test - while _testembed includes a rudimentary check [1] to make sure the subinterpreter machinery itself basically works, it doesn't do anything in the way of checking that the rest of the standard library actually does the right thing when run in a subinterpreter. So I'm +1 for the idea of exposing a low-level CPython-specific _interpreters API in order to start building out a proper test suite for the capability, and to let folks interested in working with them do so without having to write a custom embedding application ala mod_wsgi. However, I think it's still far too soon to be talking about defining a public supported API for them - while their use in mod_wsgi gives us assurance that they do mostly work in CPython, other implementations don't necessarily have anything comparable (even as a private implementation detail), and the kinds of software that folks run directly under mod_wsgi isn't necessarily reflective of the full extent of variation in the kinds of code that Python developers write in general. Cheers, Nick. [1] https://github.com/python/cpython/blob/master/Programs/_testembed.c#L41 -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ericsnowcurrently at gmail.com Thu May 25 13:03:44 2017 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Thu, 25 May 2017 10:03:44 -0700 Subject: [Python-ideas] Exposing CPython's subinterpreter C-API in the stdlib. 
In-Reply-To: References: Message-ID: On Wed, May 24, 2017 at 8:30 PM, Guido van Rossum wrote: > Hm... Curiously, I've heard a few people at PyCon I'd love to get in touch with them and discuss the situation. I've spoken with Graham Dumpleton on several occasions about subinterpreters and what needs to be fixed. > mention they thought subinterpreters were broken There are a number of related long-standing bugs plus a few that I created in the last year or two. I'm motivated to get these resolved so that the multi-core Python project can take full advantage of subinterpreters without worry. As well, there are known limitations to using extension modules in subinterpreters. However, only extension modules that rely on process globals (rather than leveraging PEP 384, etc.) are affected, and we can control for that more carefully using the protocol introduced by PEP 489. There isn't anything broken about the concept or design of subinterpreters in CPython that I'm aware of. > and not useful (and they share the GIL anyways) I'll argue that their usefulness has been limited by lack of exposure in the stdlib. :) Furthermore, I'm finding them extremely useful as the vehicle for the multi-core Python project. > and should be taken out. So we should at least have clarity on which > direction we want to take... I'd definitely appreciate a firm commitment that they are not getting removed as I don't want to spend a ton of time on the project just to have the effort made irrelevant. :) Also, I'd be surprised if there were sufficient merit to removing support for subinterpreters, since there is very little machinery just for that feature. Instead, it leverages components of CPython that are there for other valid reasons. So I do not consider subinterpreters to currently add any significant burden to maintenance or development of the code base. Regardless, exposing the low-level _subinterpreters module should help us iron out bugs and API, as Nick pointed out. 
-eric

From jim.baker at python.org Thu May 25 13:33:52 2017
From: jim.baker at python.org (Jim Baker)
Date: Thu, 25 May 2017 11:33:52 -0600
Subject: [Python-ideas] Exposing CPython's subinterpreter C-API in the stdlib.
In-Reply-To:
References:
Message-ID:

Eric,

Something like these subinterpreters in CPython is used from Jython's
Java API. Like nearly all of Jython* this can be directly imported into
using Python code, as seen in tests using this feature:
https://github.com/jythontools/jython/blob/master/Lib/test/test_pythoninterpreter_jy.py

More on the API here:
https://github.com/jythontools/jython/blob/master/src/org/python/util/PythonInterpreter.java
- note that is not even a core API for Jython, it just happens to be
widely used, including by the launcher that wraps this API and calls
itself the jython executable. So we can readily refactor if we have
something better, because right now it is also problematic with respect
to its lifecycle; what is the mapping to threads; and how it interacts
with class loaders and other resources, especially during cleanup.

It would be helpful to coordinate this subinterpreter work; or at least
to cc jython-dev on such ideas you might have. Recently there have been
some rumblings of consensus that it's about time for Jython to really
start work on the 3.x implementation, targeting 3.6. But do be aware we
are usually at most 2 to 5 developers, working in our spare time. So
everything takes much longer than one would hope. I just hope we can
finish 3.6 (or whatever) before Python 4.0 arrives :)

*Excluding certain cases on core types where our bytecode rewriting
makes it a true challenge!

- Jim

On Thu, May 25, 2017 at 11:03 AM, Eric Snow wrote:

> On Wed, May 24, 2017 at 8:30 PM, Guido van Rossum
> wrote:
> > Hm... Curiously, I've heard a few people at PyCon
>
> I'd love to get in touch with them and discuss the situation.
I've > spoken with Graham Dumpleton on several occasions about > subinterpreters and what needs to be fixed. > > > mention they thought subinterpreters were broken > > There are a number of related long-standing bugs plus a few that I > created in the last year or two. I'm motivated to get these resolved > so that the multi-core Python project can take full advantage of > subinterpreters without worry. > > As well, there are known limitations to using extension modules in > subinterpreters. However, only extension modules that rely on process > globals (rather than leveraging PEP 384, etc.) are affected, and we > can control for that more carefully using the protocol introduced by > PEP 489. > > There isn't anything broken about the concept or design of > subinterpreters in CPython that I'm aware of. > > > and not useful (and they share the GIL anyways) > > I'll argue that their usefulness has been limited by lack of exposure > in the stdlib. :) Furthermore, I'm finding them extremely useful as > the vehicle for the multi-core Python project. > > > and should be taken out. So we should at least have clarity on which > > direction we want to take... > > I'd definitely appreciate a firm commitment that they are not getting > removed as I don't want to spend a ton of time on the project just to > have the effort made irrelevant. :) Also, I'd be surprised if there > were sufficient merit to removing support for subinterpreters, since > there is very little machinery just for that feature. Instead, it > leverages components of CPython that are there for other valid > reasons. So I do not consider subinterpreters to currently add any > significant burden to maintenance or development of the code base. > Regardless, exposing the low-level _subinterpreters module should help > us iron out bugs and API, as Nick pointed out. 
> > -eric > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Thu May 25 14:19:59 2017 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 25 May 2017 11:19:59 -0700 Subject: [Python-ideas] Exposing CPython's subinterpreter C-API in the stdlib. In-Reply-To: References: Message-ID: On May 24, 2017 20:31, "Guido van Rossum" wrote: Hm... Curiously, I've heard a few people at PyCon mention they thought subinterpreters were broken and not useful (and they share the GIL anyways) and should be taken out. So we should at least have clarity on which direction we want to take... My impression is that the code to support them inside CPython is fine, but they're broken and not very useful in the sense that lots of C extensions don't really support them, so in practice you can't reliably use them to run arbitrary code. Numpy for example definitely has lots of subinterpreter-related bugs, and when they get reported we close them as WONTFIX. Based on conversations at last year's pycon, my impression is that numpy probably *could* support subinterpreters (i.e. the required apis exist), but none of us really understand the details, it's the kind of problem that requires a careful whole-codebase audit, and a naive approach might make numpy's code slower and more complicated for everyone. (For example, there are lots of places where numpy keeps a little global cache that I guess should instead be per-subinterpreter caches, which would mean adding an extra lookup operation to fast paths.) Or maybe it'd be fine, but no one is motivated to figure it out, because the other side of the cost/benefit analysis is that almost nobody actually uses subinterpreters. I think the only two projects that do are mod_wsgi and jep [1]. 
So yeah, the status quo is broken. But there are two possible ways to fix
it: IMHO either subinterpreters should be removed *or* they should have
some compelling features added to make them actually worth the effort of
fixing C extensions to support them. If Eric can pull off this multi-core
idea then that would be pretty compelling :-). (And my impression is that
the things that break under subinterpreters are essentially the same as
would break under any GIL-removal plan.)

The problem is that we don't actually know yet whether the multi-core
idea will work, so it seems like a bad time to double down on committing
to subinterpreter support and pressuring C extensions to keep up. Eric,
do you have a plan written down somewhere? I'm wondering what the
critical path from here to a multi-core proof of concept looks like.

-n

[1] https://github.com/mrj0/jep/wiki/How-Jep-Works

From brett at python.org Thu May 25 14:55:50 2017
From: brett at python.org (Brett Cannon)
Date: Thu, 25 May 2017 18:55:50 +0000
Subject: [Python-ideas] Exposing CPython's subinterpreter C-API in the stdlib.
In-Reply-To:
References:
Message-ID:

On Thu, 25 May 2017 at 08:06 Nick Coghlan wrote:

> On 25 May 2017 at 13:30, Guido van Rossum wrote:
> > Hm... Curiously, I've heard a few people at PyCon mention they thought
> > subinterpreters were broken and not useful (and they share the GIL anyways)
> > and should be taken out.
>
> Taking them out entirely would break mod_wsgi (and hence the use of
> Apache httpd as a Python application server), so I hope we don't
> consider going down that path :)
>
> As far as the GIL goes, Eric has a few ideas around potentially
> getting to a tiered locking approach, where the GIL becomes a
> Read/Write lock shared across the interpreters, and there are separate
> subinterpreter locks to guard actual code execution.
That becomes a > lot more feasible in a subinterpreter model, since the eval loop and > various other structures are already separate - the tiered locking > would mainly need to account for management of "object ownership" that > prevented multiple interpreters from accessing the same object at the > same time. > > However, I do think subinterpreters can be accurately characterised as > fragile, especially in the presence of extension modules. I also think > a large part of that fragility can be laid at the feet of them > currently being incredibly difficult to test - while _testembed > includes a rudimentary check [1] to make sure the subinterpreter > machinery itself basically works, it doesn't do anything in the way of > checking that the rest of the standard library actually does the right > thing when run in a subinterpreter. > > So I'm +1 for the idea of exposing a low-level CPython-specific > _interpreters API in order to start building out a proper test suite > for the capability, and to let folks interested in working with them > do so without having to write a custom embedding application ala > mod_wsgi. > > However, I think it's still far too soon to be talking about defining > a public supported API for them - while their use in mod_wsgi gives us > assurance that they do mostly work in CPython, other implementations > don't necessarily have anything comparable (even as a private > implementation detail), and the kinds of software that folks run > directly under mod_wsgi isn't necessarily reflective of the full > extent of variation in the kinds of code that Python developers write > in general. > I'm +1 on Nick's idea of the low-level, private API existing first to facilitate testing, but putting off any public API until we're sure we can make it function in a way we're happy with to more generally expose. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ericsnowcurrently at gmail.com Thu May 25 15:01:21 2017 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Thu, 25 May 2017 12:01:21 -0700 Subject: [Python-ideas] Exposing CPython's subinterpreter C-API in the stdlib. In-Reply-To: References: Message-ID: On Thu, May 25, 2017 at 11:19 AM, Nathaniel Smith wrote: > My impression is that the code to support them inside CPython is fine, but > they're broken and not very useful in the sense that lots of C extensions > don't really support them, so in practice you can't reliably use them to run > arbitrary code. Numpy for example definitely has lots of > subinterpreter-related bugs, and when they get reported we close them as > WONTFIX. > > Based on conversations at last year's pycon, my impression is that numpy > probably *could* support subinterpreters (i.e. the required apis exist), but > none of us really understand the details, it's the kind of problem that > requires a careful whole-codebase audit, and a naive approach might make > numpy's code slower and more complicated for everyone. (For example, there > are lots of places where numpy keeps a little global cache that I guess > should instead be per-subinterpreter caches, which would mean adding an > extra lookup operation to fast paths.) Thanks for pointing this out. You've clearly described probably the biggest challenge for folks that try to use subinterpreters. PEP 384 was meant to help with this, but seems to have fallen short. PEP 489 can help identify modules that profess subinterpreter support, as well as facilitating future extension module helpers to deal with global state. However, I agree that *right now* getting extension modules to reliably work with subinterpreters is not easy enough. Furthermore, that won't change unless there is sufficient benefit tied to subinterpreters, as you point out below. 
> > Or maybe it'd be fine, but no one is motivated to figure it out, because the > other side of the cost/benefit analysis is that almost nobody actually uses > subinterpreters. I think the only two projects that do are mod_wsgi and jep > [1]. > > So yeah, the status quo is broken. But there are two possible ways to fix > it: IMHO either subinterpreters should be removed *or* they should have some > compelling features added to make them actually worth the effort of fixing c > extensions to support them. If Eric can pull off this multi-core idea then > that would be pretty compelling :-). Agreed. :) > (And my impression is that the things > that break under subinterpreters are essentially the same as would break > under any GIL-removal plan.) More or less. There's a lot of process-global state in CPython that needs to get pulled into the interpreter state. So in that regard the effort and tooling will likely correspond fairly closely with what extension modules have to do. > > The problem is that we don't actually know yet whether the multi-core idea > will work, so it seems like a bad time to double down on committing to > subinterpreter support and pressuring C extensions to keep up. Eric- do you > have a plan written down somewhere? I'm wondering what the critical path > from here to a multi-core proof of concept looks like. Probably the best summary is here: http://ericsnowcurrently.blogspot.com/2016/09/solving-mutli-core-python.html The caveat is that doing this myself is slow-going due to persistent lack of time. :/ So any timely solution would require effort from more people. I've had enough positive responses from folks at PyCon that I think enough people would pitch in to get it done in a timely manner. More significantly, I genuinely believe that isolated interpreters in the same process is a tool that many people will find extremely useful and will help the Python community. 
Consequently, exposing subinterpreters in the stdlib would result in a stronger incentive for folks to fix the known bugs and find a solution to the challenges for extension modules. -eric From ericsnowcurrently at gmail.com Thu May 25 15:03:36 2017 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Thu, 25 May 2017 12:03:36 -0700 Subject: [Python-ideas] Exposing CPython's subinterpreter C-API in the stdlib. In-Reply-To: References: Message-ID: On Thu, May 25, 2017 at 11:55 AM, Brett Cannon wrote: > I'm +1 on Nick's idea of the low-level, private API existing first to > facilitate testing, but putting off any public API until we're sure we can > make it function in a way we're happy with to more generally expose. Same here. I hadn't expected the high-level API to be an immediate (or contingent) addition. My interest lies particularly with the low-level module. -eric From p.f.moore at gmail.com Thu May 25 16:21:46 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 25 May 2017 21:21:46 +0100 Subject: [Python-ideas] Exposing CPython's subinterpreter C-API in the stdlib. In-Reply-To: References: Message-ID: On 25 May 2017 at 20:01, Eric Snow wrote: > More significantly, I genuinely believe that isolated > interpreters in the same process is a tool that many people will find > extremely useful and will help the Python community. Consequently, > exposing subinterpreters in the stdlib would result in a stronger > incentive for folks to fix the known bugs and find a solution to the > challenges for extension modules. I'm definitely interested in subinterpreter support. I don't have a specific use case for it, but I see it as an enabling technology that could be used in creative ways (even given the current limitations involved in extension support). 
Perl has had subinterpreter support for many years - it's the implementation technique behind their fork primitive on Windows (on Unix, real fork is used) and allows many common patterns of use of fork to be ported to Windows. Python doesn't really have a need for this, as fork is not commonly used here (we use threads or multiprocessing where Perl would historically have used fork), but nevertheless it does provide prior art in this area. Paul From njs at pobox.com Fri May 26 03:04:41 2017 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 26 May 2017 00:04:41 -0700 Subject: [Python-ideas] Exposing CPython's subinterpreter C-API in the stdlib. In-Reply-To: References: Message-ID: On Thu, May 25, 2017 at 12:01 PM, Eric Snow wrote: > More significantly, I genuinely believe that isolated > interpreters in the same process is a tool that many people will find > extremely useful and will help the Python community. Consequently, > exposing subinterpreters in the stdlib would result in a stronger > incentive for folks to fix the known bugs and find a solution to the > challenges for extension modules. I feel like the most effective incentive would be to demonstrate how useful they are first? If we do it in the other order, then there's a risk that cpython does provide an incentive, but it's of the form "this thing doesn't actually accomplish anything useful yet, but it got mentioned in whats-new-in-3.7 and now angry people are yelling at me in my bug tracker for not 'fixing' my package, so I have to do a bunch of pointless work to shut them up". This tends to leave bad feelings all around. I do get that this is a tricky chicken-and-egg situation: currently subinterpreters don't work very well, so no-one writes cool applications using them, so no-one bothers to make them work better. And I share the general intuition that this is a powerful tool that probably has some kind of useful applications. 
But I can't immediately name any such applications, which makes me nervous :-). The obvious application is your multi-core Python idea, and I think that would convince a lot of people; in general I'm enthusiastic about the idea of extending Python's semantics to enable better parallelism. But I just re-read your blog post and some of the linked thread, and it's not at all clear to me how you plan to solve the refcounting and garbage collection problems that will arise once you have objects that are shared between multiple subinterpreters and no GIL. Which makes it hard for me to make a case to the other numpy devs that it's worth spending energy on this now, to support a feature that might or might not happen in the future, especially if angry shouty people start joining the conversation. Does that make sense? I want the project to succeed, and if one of the conditions for that is getting buy-in from the community of C extension developers then it seems important to have a good plan for navigating the incentives tightrope. -n -- Nathaniel J. Smith -- https://vorpus.org From ronaldoussoren at mac.com Fri May 26 03:14:39 2017 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Fri, 26 May 2017 09:14:39 +0200 Subject: [Python-ideas] Exposing CPython's subinterpreter C-API in the stdlib. In-Reply-To: References: Message-ID: <5DC713B2-BE8D-43B1-AC5B-FC3BF1000383@mac.com> > On 25 May 2017, at 19:03, Eric Snow wrote: > > On Wed, May 24, 2017 at 8:30 PM, Guido van Rossum wrote: >> Hm... Curiously, I've heard a few people at PyCon > > I'd love to get in touch with them and discuss the situation. I've > spoken with Graham Dumpleton on several occasions about > subinterpreters and what needs to be fixed. > >> mention they thought subinterpreters were broken > > There are a number of related long-standing bugs plus a few that I > created in the last year or two. 
I'm motivated to get these resolved > so that the multi-core Python project can take full advantage of > subinterpreters without worry. > > As well, there are known limitations to using extension modules in > subinterpreters. However, only extension modules that rely on process > globals (rather than leveraging PEP 384, etc.) are affected, and we > can control for that more carefully using the protocol introduced by > PEP 489. There are also the PyGILState APIs (PEP 311), which assume there's only one interpreter. Ronald From encukou at gmail.com Fri May 26 04:41:22 2017 From: encukou at gmail.com (Petr Viktorin) Date: Fri, 26 May 2017 10:41:22 +0200 Subject: [Python-ideas] Exposing CPython's subinterpreter C-API in the stdlib. In-Reply-To: References: Message-ID: On 05/25/2017 09:01 PM, Eric Snow wrote: > On Thu, May 25, 2017 at 11:19 AM, Nathaniel Smith wrote: >> My impression is that the code to support them inside CPython is fine, but >> they're broken and not very useful in the sense that lots of C extensions >> don't really support them, so in practice you can't reliably use them to run >> arbitrary code. Numpy for example definitely has lots of >> subinterpreter-related bugs, and when they get reported we close them as >> WONTFIX. >> >> Based on conversations at last year's pycon, my impression is that numpy >> probably *could* support subinterpreters (i.e. the required apis exist), but >> none of us really understand the details, it's the kind of problem that >> requires a careful whole-codebase audit, and a naive approach might make >> numpy's code slower and more complicated for everyone. (For example, there >> are lots of places where numpy keeps a little global cache that I guess >> should instead be per-subinterpreter caches, which would mean adding an >> extra lookup operation to fast paths.) > > Thanks for pointing this out. You've clearly described probably the > biggest challenge for folks that try to use subinterpreters.
PEP 384 > was meant to help with this, but seems to have fallen short. PEP 489 > can help identify modules that profess subinterpreter support, as well > as facilitating future extension module helpers to deal with global > state. However, I agree that *right now* getting extension modules to > reliably work with subinterpreters is not easy enough. Furthermore, > that won't change unless there is sufficient benefit tied to > subinterpreters, as you point out below. PEP 489 was a first step; the work is not finished. The next step is solving a major reason people are using global state in extension modules: per-module state isn't accessible from all the places it should be, namely in methods of classes. In other words, I don't think Python is ready for big projects like Numpy to start properly supporting subinterpreters. The work on fixing this has stalled, but it looks like I'll be getting back on track. Discussions about this are on the import-sig list, reach out there if you'd like to help. From k7hoven at gmail.com Fri May 26 07:15:44 2017 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Fri, 26 May 2017 14:15:44 +0300 Subject: [Python-ideas] tweaking the file system path protocol In-Reply-To: <20170524004125.GD24625@ando.pearwood.info> References: <20170524004125.GD24625@ando.pearwood.info> Message-ID: On Wed, May 24, 2017 at 3:41 AM, Steven D'Aprano wrote: > On Wed, May 24, 2017 at 12:18:16AM +0300, Serhiy Storchaka wrote: >> I don't know a reasonable use case for this feature. The __fspath__ >> method of str or bytes subclasses returning something not equivalent to >> self looks confusing to me. > > I can imagine at least two: > > - emulating something like DOS 8.3 versus long file names; > - case normalisation > These are not reasonable use cases because they should not subclass str or bytes. That would be confusing. > but what would make this really useful is for debugging. 
For instance, I > have used something like this to debug problems with int() being called > wrongly: > > py> class MyInt(int): > ... def __int__(self): > ... print("__int__ called") > ... return super().__int__() > ... > py> x = MyInt(23) > py> int(x) > __int__ called > 23 > You can monkeypatch the stdlib:

    import os
    from os import fspath as real_fspath

    mystr = "23"

    def fspath(path):
        # Report when the specific object of interest is passed in.
        if path is mystr:
            print("fspath was called on mystr")
        return real_fspath(path)

    os.fspath = fspath

    try_something_with(mystr)

Having __fspath__ on str and bytes by default would destroy the ability to distinguish between PathLike and non-PathLike, because all strings would appear to be PathLike. (Not to mention the important compatibility issues between different Python versions and different ways of dealing with pre-PEP519 path objects.) -- Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven +
From stephanh42 at gmail.com Fri May 26 08:08:30 2017 From: stephanh42 at gmail.com (Stephan Houben) Date: Fri, 26 May 2017 14:08:30 +0200 Subject: [Python-ideas] Exposing CPython's subinterpreter C-API in the stdlib. In-Reply-To: References: Message-ID: Hi all, Personally I feel that the current subinterpreter support falls short in the sense that it still requires a single GIL across interpreters. If interpreters would have their own individual GIL, we could have true shared-nothing multi-threaded support similar to JavaScript's "Web Workers". Here is a point-wise overview of what I am imagining. I realize the following is very ambitious, but I would like to bring it to your consideration.

1. Multiple interpreters can be instantiated, each of which is completely independent. To this end, all global interpreter state needs to go into an interpreter structure, including the GIL (which becomes per-interpreter). Interpreters share no state whatsoever.

2. PyObject's are tied to a particular interpreter and cannot be shared between interpreters. (This is because each interpreter now has its own GIL.)
I imagine a special debug build would actually store the interpreter pointer in the PyObject and would assert everywhere that the PyObject is only manipulated by its owning interpreter.

3. Practically all existing APIs, including Py_INCREF and Py_DECREF, need to get an additional explicit interpreter argument. I imagine that we would have a new prefix, say MPy_, because the existing APIs must be left for backward compatibility.

4. At most one interpreter can be designated the "main" interpreter. This is for backward compatibility of existing extension modules ONLY. All the existing Py_* APIs operate implicitly on this main interpreter.

5. Extension modules need to explicitly advertise multiple interpreter support. If they don't, they can only be imported in the main interpreter. However, in that case they can safely use the existing Py_ APIs.

6. Since PyObject's cannot be shared across interpreters, there needs to be an explicit function which takes a PyObject in interpreter A and constructs a similar object in interpreter B. Conceptually this would be equivalent to pickling in A and unpickling in B, but presumably more efficient. It would use the copyreg registry in a similar way to pickle.

7. Extension modules would also be able to register their function for copying custom types across interpreters. That would allow extension modules to provide custom types where the underlying C object is in fact not copied but shared between interpreters. I would imagine we would have a "shared memory" memoryview object and also Mutex and other locking constructs which would work across interpreters.

8. Finally, the main application: functionality similar to the current `multiprocessing` module, but with multiple interpreters on multiple threads in a single process. This would presumably be more efficient than `multiprocessing` and also allow extra functionality, since the underlying C objects can in fact be shared.
(Imagine two interpreters operating in parallel on a single OpenCL context.) Stephan Op 26 mei 2017 10:41 a.m. schreef "Petr Viktorin" : > > On 05/25/2017 09:01 PM, Eric Snow wrote: >> >> On Thu, May 25, 2017 at 11:19 AM, Nathaniel Smith wrote: >>> >>> My impression is that the code to support them inside CPython is fine, but >>> they're broken and not very useful in the sense that lots of C extensions >>> don't really support them, so in practice you can't reliably use them to run >>> arbitrary code. Numpy for example definitely has lots of >>> subinterpreter-related bugs, and when they get reported we close them as >>> WONTFIX. >>> >>> Based on conversations at last year's pycon, my impression is that numpy >>> probably *could* support subinterpreters (i.e. the required apis exist), but >>> none of us really understand the details, it's the kind of problem that >>> requires a careful whole-codebase audit, and a naive approach might make >>> numpy's code slower and more complicated for everyone. (For example, there >>> are lots of places where numpy keeps a little global cache that I guess >>> should instead be per-subinterpreter caches, which would mean adding an >>> extra lookup operation to fast paths.) >> >> >> Thanks for pointing this out. You've clearly described probably the >> biggest challenge for folks that try to use subinterpreters. PEP 384 >> was meant to help with this, but seems to have fallen short. PEP 489 >> can help identify modules that profess subinterpreter support, as well >> as facilitating future extension module helpers to deal with global >> state. However, I agree that *right now* getting extension modules to >> reliably work with subinterpreters is not easy enough. Furthermore, >> that won't change unless there is sufficient benefit tied to >> subinterpreters, as you point out below. > > > PEP 489 was a first step; the work is not finished. 
The next step is solving a major reason people are using global state in extension modules: per-module state isn't accessible from all the places it should be, namely in methods of classes. In other words, I don't think Python is ready for big projects like Numpy to start properly supporting subinterpreters. > > The work on fixing this has stalled, but it looks like I'll be getting back on track. > Discussions about this are on the import-sig list, reach out there if you'd like to help. > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From k7hoven at gmail.com Fri May 26 08:58:23 2017 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Fri, 26 May 2017 15:58:23 +0300 Subject: [Python-ideas] tweaking the file system path protocol In-Reply-To: References: <20170524004125.GD24625@ando.pearwood.info> Message-ID: On Wed, May 24, 2017 at 5:52 PM, Wolfgang Maier wrote: > On 05/24/2017 02:41 AM, Steven D'Aprano wrote: >> >> >> It would be annoying and inconsistent if int(x) avoided calling __int__ >> on int subclasses. But that's exactly what happens with fspath and str. >> I see that as a bug, not a feature: I find it hard to believe that we >> would design an interface for string-like objects (paths) and then >> intentionally prohibit it from applying to strings. >> >> And if we did, surely its a misfeature. Why *shouldn't* subclasses of >> str get the same opportunity to customize the result of __fspath__ as >> they get to customize their __repr__ and __str__? >> >> py> class MyStr(str): >> ... def __repr__(self): >> ... return 'repr' >> ... def __str__(self): >> ... return 'str' >> ... >> py> s = MyStr('abcdef') >> py> repr(s) >> 'repr' >> py> str(s) >> 'str' >> > > This is almost exactly what I have been thinking (just that I couldn't have > presented it so clearly)! 
Unfortunately, this thinking is also very shallow compared to what went into PEP 519. > > Lets look at a potential usecase for this. Assume that in a package you want > to handle several paths to different files and directories that are all > located in a common package-specific parent directory. Then using the path > protocol you could write this: > > class PackageBase (object): > basepath = '/home/.package' > > class PackagePath (str, PackageBase): > def __fspath__ (self): > return os.path.join(self.basepath, str(self)) > > config_file = PackagePath('.config') > log_file = PackagePath('events.log') > data_dir = PackagePath('data') > > with open(log_file, 'w') as log: > log.write('package paths initialized.\n') > This is exactly the kind of code that causes the problems. It will do the wrong thing when code like open(str(log_file), 'w') is used for compatibility. > Just that this wouldn't currently work because PackagePath inherits from > str. Of course, there are other ways to achieve the above, but when you > think about designing a Path-like object class str is just a pretty > attractive base class to start from. Isn't it great that it doesn't work, so it's not attractive anymore? > Now lets look at compatibility of a class like PackagePath under this > proposal: > > - if client code uses e.g. str(config_file) and proceeds to treat the > resulting object as a path unexpected things will happen and, yes, that's > bad. However, this is no different from any other Path-like object for which > __str__ and __fspath__ don't define the same return value. > Yes, this is another way of shooting yourself in the foot. Luckily, this one is probably less attractive. > - if client code uses the PEP-recommended backwards-compatible way of > dealing with paths, > > path.__fspath__() if hasattr(path, "__fspath__") else path > > things will just work.
Interstingly, this would *currently* produce an > unexpected result namely that it would execute the__fspath__ method of the > str-subclass > So people not testing for 3.6+ might think their code works while it doesn't. Luckily people not testing with 3.6+ are perhaps unlikely to try funny tricks with __fspath__. > - if client code uses instances of PackagePath as paths directly then in > Python3.6 and below that would lead to unintended outcome, while in > Python3.7 things would work. This is *really* bad. > > But what it means is that, under the proposal, using a str or bytes subclass > with an __fspath__ method defined makes your code backwards-incompatible and > the solution would be not to use such a class if you want to be > backwards-compatible (and that should get documented somewhere). This > restriction, of course, limits the usefulness of the proposal in the near > future, but that disadvantage will vanish over time. In 5 years, not > supporting Python3.6 anymore maybe won't be a big deal anymore (for > comparison, Python3.2 was released 6 years ago and since last years pip is > no longer supporting it). As Steven pointed out the proposal is *very* > unlikely to break existing code. > > So to summarize, the proposal > > - avoids an up-front isinstance check in the protocol and thereby speeds up > the processing of exact strings and bytes and of anything that follows the > path protocol.* Speedup for things with __fspath__ is the only virtue of this proposal, and it has not been shown that that speedup matters anywhere. > - slows down the processing of instances of regular str and bytes > subclasses* > > - makes the "path.__fspath__() if hasattr(path, "__fspath__") else path" > idiom consistent for subclasses of str and bytes that define __fspath__ > One can discuss whether this is the best idiom to use (I did not write it, so maybe someone else has comments). 
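The difference between the quoted one-line idiom and os.fspath() for a str subclass can be sketched as follows. This is only an illustration, assuming Python 3.6 or later; the PackagePath class is a simplified stand-in for the one quoted earlier in the thread, the helper name fspath_idiom is made up here, and posixpath.join is used instead of os.path.join only so that the result does not depend on the platform.

```python
import os
import posixpath


class PackagePath(str):
    """str subclass whose __fspath__ differs from its string content."""

    basepath = '/home/.package'

    def __fspath__(self):
        # str(self) yields the plain string content of this instance.
        return posixpath.join(self.basepath, str(self))


def fspath_idiom(path):
    """The one-line backwards-compatible idiom quoted in the thread."""
    return path.__fspath__() if hasattr(path, '__fspath__') else path


log_file = PackagePath('events.log')

# os.fspath() returns str/bytes instances unchanged, so __fspath__ is ignored.
via_os = os.fspath(log_file)

# The idiom checks for __fspath__ first, so the subclass method is called.
via_idiom = fspath_idiom(log_file)

print(repr(via_os))
print(repr(via_idiom))
```

Here via_os is the PackagePath object itself (still equal to 'events.log'), while via_idiom is '/home/.package/events.log', which is the inconsistency being discussed.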
Anyway, some may want to use

    path.__fspath__() if hasattr(path, "__fspath__") else str(path)

and some may want

    path if isinstance(path, (str, bytes)) else path.__fspath__()

Or others may not be after oneliners like this and instead include the full implementation of fspath in their code, or even better, with some modifications.

Really, the best thing to use in pre-3.6 might be more like:

    import pathlib

    def fspath(path):
        if isinstance(path, (str, bytes)):
            return path
        if hasattr(path, '__fspath__'):
            return path.__fspath__()
        if type(path).__name__ == 'DirEntry':
            return path.path
        if isinstance(path, pathlib.PurePath):
            return str(path)
        raise TypeError("Argument cannot be interpreted as a file system path: " + repr(path))

Note that
> - opens up the opportunity to write str/bytes subclasses that represent a > path other than just their self in the future** > > Still sounds like a net win to me, but lets see what I forgot ... > > * yes, speed is typically not your primary concern when it comes to IO; > what's often neglected though is that not all path operations have to > trigger actual IO (things in os.path for example don't typically perform IO) > > ** somebody on the list (I guess it was Koos?) mentioned that such classes > would only make sense if Python ever disallowed the use of str/bytes as > paths, but I don't think that is a prerequisite here. > Yes, I wrote that, and I stick with it: str and bytes subclasses that return something different from the str/bytes content should not be written. If Python ever disallows str/bytes as paths, such a thing becomes less harmful, and there is no need to have special treatment for str and bytes. Until then, I'm very happy with the decision to ignore __fspath__ on str and bytes. -- Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven +
From ncoghlan at gmail.com Fri May 26 09:17:42 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 26 May 2017 23:17:42 +1000 Subject: [Python-ideas] Exposing CPython's subinterpreter C-API in the stdlib.
In-Reply-To: References: Message-ID: On 26 May 2017 at 22:08, Stephan Houben wrote: > Hi all, > > Personally I feel that the current subinterpreter support falls short > in the sense that it still requires > a single GIL across interpreters. > > If interpreters would have their own individual GIL, > we could have true shared-nothing multi-threaded support similar to > Javascript's "Web Workers". > > Here is a point-wise overview of what I am imagining. > I realize the following is very ambitious, but I would like to bring > it to your consideration. > > 1. Multiple interpreters can be instantiated, each of which is > completely independent. > To this end, all global interpreter state needs to go into an > interpreter strucutre, including the GIL > (which becomes per-interpreter) > Interpreters share no state whatsoever. There'd still be true process global state (i.e. anything managed by the C runtime), so this would be a tiered setup with a read/write GIL and multiple SILs. For the time being though, a single GIL remains much easier to manage. > 2. PyObject's are tied to a particular interpreter and cannot be > shared between interpreters. > (This is because each interpreter now has its own GIL.) > I imagine a special debug build would actually store the > interpreter pointer in the PyObject and would assert everywhere > that the PyObject is only manipulated by its owning interpreter. Yes, something like Rust's ownership model is the gist of what we had in mind (i.e. allowing zero-copy transfer of ownership between subinterpreters, but only the owning interpreter is allowed to do anything else with the object). > 3. Practically all existing APIs, including Py_INCREF and Py_DECREF, > need to get an additional explicit interpreter argument. > I imagine that we would have a new prefix, say MPy_, because the > existing APIs must be left for backward compatibility. 
This isn't necessary, as the active interpreter is already tracked as part of the thread-local state (otherwise mod_wsgi et al wouldn't work at all). > 4. At most one interpreter can be designated the "main" interpreter. > This is for backward compatibility of existing extension modules ONLY. > All the existing Py_* APIs operate implicitly on this main interpreter. Yep, this is part of the concept. The PEP 432 draft has more details on that: https://www.python.org/dev/peps/pep-0432/#interpreter-initialization-phases > 5. Extension modules need to explicitly advertise multiple interpreter support. > If they don't, they can only be imported in the main interpreter. > However, in that case they can safely use the existing Py_ APIs. This is the direction we started moving in with the multi-phase initialisation PEP for extension modules: https://www.python.org/dev/peps/pep-0489/ As Petr noted, the main missing piece there now is the fact that object methods (as opposed to module-level functions) implemented in C currently don't have ready access to the module-level state for the modules where they're defined. > 6. Since PyObject's cannot be shared across interpreters, there needs to be an > explicit function which takes a PyObject in interpreter A and constructs a > similar object in interpreter B. > > Conceptually this would be equivalent to pickling in A and > unpickling in B, but presumably more efficient. > It would use the copyreg registry in a similar way to pickle. This would be an ownership transfer rather than a copy (which carries the implication that all the subinterpreters would still need to share a common memory allocator). > 7. Extension modules would also be able to register their function > for copying custom types across interpreters . > That would allow extension modules to provide custom types where > the underlying C object is in fact not copied > but shared between interpreters.
> I would imagine we would have a"shared memory" memoryview object > and also Mutex and other locking constructs which would work > across interpreters. We generally don't expect this to be needed given an ownership-focused approach. Instead, the focus would be on enabling efficient channel-based communication models that are cost-prohibitive when object serialisation is involved. > 8. Finally, the main application: functionality similar to the current > `multiprocessing' module, but with > multiple interpreters on multiple threads in a single process. > This would presumably be more efficient than `multiprocessing' and > also allow extra functionality, since the underlying C objects > can in fact be shared. > (Imagine two interpreters operating in parallel on a single OpenCL context.) We're not sure how feasible it will be to enable this in general, but even without it, zero-copy ownership transfers enable a *lot* of interesting concurrency models that Python doesn't currently offer great primitives to support (they're mainly a matter of using threads in certain ways, which means they not only run afoul of the GIL, but you also don't get any assistance from the interpreter in strictly enforcing object ownership rules). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From k7hoven at gmail.com Fri May 26 09:20:39 2017 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Fri, 26 May 2017 16:20:39 +0300 Subject: [Python-ideas] tweaking the file system path protocol In-Reply-To: References: <20170524004125.GD24625@ando.pearwood.info> Message-ID: Accidentally sent the email before it was done.
Additions / corrections below: On Fri, May 26, 2017 at 3:58 PM, Koos Zevenhoven wrote: > On Wed, May 24, 2017 at 5:52 PM, Wolfgang Maier >> >> - makes the "path.__fspath__() if hasattr(path, "__fspath__") else path" >> idiom consistent for subclasses of str and bytes that define __fspath__ >> > > One can discuss whether this is the best idiom to use (I did not write > it, so maybe someone else has comments). > > Anyway, some may want to use > > path.__fspath__() if hasattr(path, "__fspath__") else str(path) > > and some may want > > path if isinstance(path, (str, bytes)) else path.__fspath__() > > Or others may not be after oneliners like this and instead include the > full implementation of fspath in their code, or even better, with some > modifications. > > Really, the best thing to use in pre-3.6 might be more like: > > def fspath(path): > if isinstance(path, (str, bytes)): > return path > if hasattr(path, '__fspath__'): > return path.__fspath__() > if type(path).__name__ == 'DirEntry': > return path.path > if isinstance(path, pathlib.PurePath): > return str(path) > raise TypeError("Argument cannot be interpreted as a file system path: " + repr(path)) > In the above, I have to check type(path).__name__, because DirEntry was not exposed as os.DirEntry in 3.5 yet. For pre-3.4 Python and for older third-party libraries that do inherit from str/bytes, one could even use something like:

    def fspath(path):
        if isinstance(path, (str, bytes)):
            return path
        if hasattr(type(path), '__fspath__'):
            return type(path).__fspath__(path)
        if type(path).__name__ == 'DirEntry':
            return path.path
        if "Path" in type(path).__name__:  # add whatever known names for path classes (what a hack!)
            return str(path)
        raise TypeError("Argument cannot be interpreted as a file system path: " + repr(path))

-- Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven +
From stephanh42 at gmail.com Fri May 26 09:49:26 2017 From: stephanh42 at gmail.com (Stephan Houben) Date: Fri, 26 May 2017 15:49:26 +0200 Subject: [Python-ideas] Exposing CPython's subinterpreter C-API in the stdlib. In-Reply-To: References: Message-ID: Hi Nick, As far as I understand, the (to me) essential difference between your approach and my proposal is that:

Approach 1 (PEP-489):
* Single (global) GIL.
* PyObject's may be shared across interpreters (zero-copy transfer)

Approach 2 (mine)
* Per-interpreter GIL.
* PyObject's must be copied across interpreters.

To me, the per-interpreter GIL is the essential "target" I am aiming for, and I am willing to sacrifice the zero-copy for that. If the GIL is still shared then I don't see much advantage of this approach over just using the "threading" module with a single interpreter. (I realize it still gives you some isolation between interpreters. To me personally this is not very interesting, but this may be myopic.) > For the time being though, a single GIL remains > much easier to manage. "For the time being" suggests that you are intending approach 1 to be ultimately a stepping stone to something similar to approach 2? > Yes, something like Rust's ownership model is the gist of what we had > in mind (i.e. allowing zero-copy transfer of ownership between > subinterpreters, but only the owning interpreter is allowed to do > anything else with the object). This can be emulated in approach 2 by creating a wrapper C-level type which contains a PyObject and its corresponding interpreter. So that interpreter A can reference an object in interpreter B. >> 3. Practically all existing APIs, including Py_INCREF and Py_DECREF, >> need to get an additional explicit interpreter argument.
>> I imagine that we would have a new prefix, say MPy_, because the >> existing APIs must be left for backward compatibility. > > This isn't necessary, as the active interpreter is already tracked as > part of the thread local state (otherwise mod_wsgi et al wouldn't work > at all). I realize that it is possible to do it that way. However, this has some disadvantages:

* The interpreter becomes tied to a thread, or you need to have some way to switch interpreters on a thread. (Which makes your code look like OpenGL code ;-) )
* Once you are going to write code which manipulates objects in multiple interpreters (e.g. my proposed copy function or the "foreign interpreter wrapper" I discussed above) making the interpreter explicit probably avoids headaches.
* Explicit is better than implicit, as somebody once said. ;-)

Stephan 2017-05-26 15:17 GMT+02:00 Nick Coghlan : > On 26 May 2017 at 22:08, Stephan Houben wrote: >> Hi all, >> >> Personally I feel that the current subinterpreter support falls short >> in the sense that it still requires >> a single GIL across interpreters. >> >> If interpreters would have their own individual GIL, >> we could have true shared-nothing multi-threaded support similar to >> Javascript's "Web Workers". >> >> Here is a point-wise overview of what I am imagining. >> I realize the following is very ambitious, but I would like to bring >> it to your consideration. >> >> 1. Multiple interpreters can be instantiated, each of which is >> completely independent. >> To this end, all global interpreter state needs to go into an >> interpreter strucutre, including the GIL >> (which becomes per-interpreter) >> Interpreters share no state whatsoever. > > There'd still be true process global state (i.e. anything managed by > the C runtime), so this would be a tiered setup with a read/write GIL > and multiple SILs. For the time being though, a single GIL remains > much easier to manage. > >> 2.
PyObject's are tied to a particular interpreter and cannot be >> shared between interpreters. >> (This is because each interpreter now has its own GIL.) >> I imagine a special debug build would actually store the >> interpreter pointer in the PyObject and would assert everywhere >> that the PyObject is only manipulated by its owning interpreter. > > Yes, something like Rust's ownership model is the gist of what we had > in mind (i.e. allowing zero-copy transfer of ownership between > subinterpreters, but only the owning interpreter is allowed to do > anything else with the object). > >> 3. Practically all existing APIs, including Py_INCREF and Py_DECREF, >> need to get an additional explicit interpreter argument. >> I imagine that we would have a new prefix, say MPy_, because the >> existing APIs must be left for backward compatibility. > > This isn't necessary, as the active interpreter is already tracked as > part of the thread local state (otherwise mod_wsgi et al wouldn't work > at all). > >> 4. At most one interpreter can be designated the "main" interpreter. >> This is for backward compatibility of existing extension modules ONLY. >> All the existing Py_* APIs operate implicitly on this main interpreter. > > Yep, this is part of the concept. The PEP 432 draft has more details > on that: https://www.python.org/dev/peps/pep-0432/#interpreter-initialization-phases > >> 5. Extension modules need to explicitly advertise multiple interpreter support. >> If they don't, they can only be imported in the main interpreter. >> However, in that case they can safely use the existing Py_ APIs. 
> > This is the direction we started moving the with multi-phase > initialisation PEP for extension modules: > https://www.python.org/dev/peps/pep-0489/ > > As Petr noted, the main missing piece there now is the fact that > object methods (as opposed to module level functions) implemented in C > currently don't have ready access to the module level state for the > modules where they're defined. > >> 6. Since PyObject's cannot be shared across interpreters, there needs to be an >> explicit function which takes a PyObject in interpreter A and constructs a >> similar object in interpreter B. >> >> Conceptually this would be equivalent to pickling in A and >> unpickling in B, but presumably more efficient. >> It would use the copyreg registry in a similar way to pickle. > > This would be an ownership transfer rather than a copy (which carries > the implication that all the subinterpreters would still need to share > a common memory allocator) > >> 7. Extension modules would also be able to register their function >> for copying custom types across interpreters . >> That would allow extension modules to provide custom types where >> the underlying C object is in fact not copied >> but shared between interpreters. >> I would imagine we would have a"shared memory" memoryview object >> and also Mutex and other locking constructs which would work >> across interpreters. > > We generally don't expect this to be needed given an ownership focused > approach. Instead, the focus would be on enabling efficient channel > based communication models that are cost-prohibitive when object > serialisation is involved. > >> 8. Finally, the main application: functionality similar to the current >> `multiprocessing' module, but with >> multiple interpreters on multiple threads in a single process. >> This would presumably be more efficient than `multiprocessing' and >> also allow extra functionality, since the underlying C objects >> can in fact be shared. 
>> (Imagine two interpreters operating in parallel on a single OpenCL context.)
>
> We're not sure how feasible it will be to enable this in general, but
> even without it, zero-copy ownership transfers enable a *lot* of
> interesting concurrency models that Python doesn't currently offer great
> primitives to support (they're mainly a matter of using threads in
> certain ways, which means they not only run afoul of the GIL, but you
> also don't get any assistance from the interpreter in strictly
> enforcing object ownership rules).
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ncoghlan at gmail.com Fri May 26 11:28:20 2017
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 27 May 2017 01:28:20 +1000
Subject: [Python-ideas] Exposing CPython's subinterpreter C-API in the stdlib.
In-Reply-To:
References:
Message-ID:

On 26 May 2017 at 23:49, Stephan Houben wrote:
> Hi Nick,
>
> As far as I understand, the (to me) essential difference between your
> approach and my proposal is that:
>
> Approach 1 (PEP-489):
> * Single (global) GIL.
> * PyObject's may be shared across interpreters (zero-copy transfer)
>
> Approach 2 (mine)
> * Per-interpreter GIL.
> * PyObject's must be copied across interpreters.
>
> To me, the per-interpreter GIL is the essential "target" I am aiming for,
> and I am willing to sacrifice the zero-copy for that.

Err, no - I explicitly said that assuming the rest of the idea works out
well, we'd eventually like to move to a tiered model where the GIL
becomes a read/write lock. Most code execution in subinterpreters
would then only need a read lock on the GIL, and hence could happily
execute code in parallel with other subinterpreters running on other
cores.
However, that aspect of the idea is currently just hypothetical handwaving that would need to deal with (and would be informed by) the current work happening with respect to the GILectomy, as it's not particularly interesting as far as concurrency modeling is concerned. By contrast, being able to reliably model Communicating Sequential Processes in Python without incurring any communications overhead though (ala goroutines)? Or doing the same with the Actor model (ala Erlang/BEAM processes)? Those are *very* interesting language design concepts, and something where offering a compelling alternative to the current practices of emulating them with threads or coroutines pretty much requires the property of zero-copy ownership transfer. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From guido at python.org Fri May 26 13:30:00 2017 From: guido at python.org (Guido van Rossum) Date: Fri, 26 May 2017 10:30:00 -0700 Subject: [Python-ideas] Exposing CPython's subinterpreter C-API in the stdlib. In-Reply-To: References: Message-ID: On Fri, May 26, 2017 at 8:28 AM, Nick Coghlan wrote: > [...] assuming the rest of idea works out > well, we'd eventually like to move to a tiered model where the GIL > becomes a read/write lock. Most code execution in subinterpreters > would then only need a read lock on the GIL, and hence could happily > execute code in parallel with other subinterpreters running on other > cores. > Since the GIL protects refcounts and refcounts are probably the most frequently written item, I'm skeptical of this. > However, that aspect of the idea is currently just hypothetical > handwaving that would need to deal with (and would be informed by) the > current work happening with respect to the GILectomy, as it's not > particularly interesting as far as concurrency modeling is concerned. 
>
> By contrast, being able to reliably model Communicating Sequential
> Processes in Python without incurring any communications overhead
> though (ala goroutines)? Or doing the same with the Actor model (ala
> Erlang/BEAM processes)?
>
> Those are *very* interesting language design concepts, and something
> where offering a compelling alternative to the current practices of
> emulating them with threads or coroutines pretty much requires the
> property of zero-copy ownership transfer.
>

But subinterpreters (which have independent sys.modules dicts) seem a poor
match for that. It feels as if you're speculating about an entirely
different language here, not named Python.

--
--Guido van Rossum (python.org/~guido)

From stephanh42 at gmail.com Fri May 26 14:45:51 2017
From: stephanh42 at gmail.com (Stephan Houben)
Date: Fri, 26 May 2017 20:45:51 +0200
Subject: [Python-ideas] a memory-efficient variant of itertools.tee
Message-ID:

Hi all,

The itertools.tee function can hold on to objects "unnecessarily".
In particular, if you do

iter2 = itertools.tee(iter1, 2)[0]

i.e. you "leak" one of the returned iterators, then none of the returned
objects are collected until iter2 is also collected.

I propose a different implementation, namely the one in:

https://github.com/stephanh42/streamtee

streamtee.tee is a drop-in alternative for itertools.tee but
as you can see from the test in the repo, it will not hold
on to the generated objects as long.

I propose this as an improved implementation of itertools.tee.

Thanks,

Stephan

From ncoghlan at gmail.com Sat May 27 03:32:26 2017
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 27 May 2017 17:32:26 +1000
Subject: [Python-ideas] Exposing CPython's subinterpreter C-API in the stdlib.
In-Reply-To:
References:
Message-ID:

On 27 May 2017 at 03:30, Guido van Rossum wrote:
> On Fri, May 26, 2017 at 8:28 AM, Nick Coghlan wrote:
>>
>> [...]
assuming the rest of the idea works out
>> well, we'd eventually like to move to a tiered model where the GIL
>> becomes a read/write lock. Most code execution in subinterpreters
>> would then only need a read lock on the GIL, and hence could happily
>> execute code in parallel with other subinterpreters running on other
>> cores.
>
>
> Since the GIL protects refcounts and refcounts are probably the most
> frequently written item, I'm skeptical of this.

Likewise - hence my somewhat garbled attempt to explain that actually
doing that would be contingent on the GILectomy folks figuring out
some clever way to cope with the refcounts :)

>> By contrast, being able to reliably model Communicating Sequential
>> Processes in Python without incurring any communications overhead
>> though (ala goroutines)? Or doing the same with the Actor model (ala
>> Erlang/BEAM processes)?
>>
>> Those are *very* interesting language design concepts, and something
>> where offering a compelling alternative to the current practices of
>> emulating them with threads or coroutines pretty much requires the
>> property of zero-copy ownership transfer.
>
> But subinterpreters (which have independent sys.modules dicts) seem a poor
> match for that. It feels as if you're speculating about an entirely
> different language here, not named Python.

Ah, you're right - the types are all going to be separate as well,
which means "cost of a deep copy" is the cheapest we're going to be
able to get with this model. Anything better than that would require a
more esoteric memory management architecture like the one in
PyParallel.

I guess I'll have to scale back my hopes on that front to be closer to
what Stephan described - even a deep copy equivalent is often going to
be cheaper than a full serialise/transmit/deserialise cycle or some
other form of inter-process communication.

Cheers,
Nick.
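The two copying strategies contrasted in the message above can be put side by side in a few lines. This is only an illustrative sketch using the standard `copy` and `pickle` modules; it does not use any subinterpreter API, the `message` value is made up for the example, and it makes no claim about the relative speed of the two paths.

```python
import copy
import pickle

# Hypothetical nested value standing in for an object handed between workers.
message = {"id": 7, "payload": [1.5, 2.5, {"tags": ("a", "b")}]}

# In-memory deep copy: new containers are built directly, with no
# intermediate byte string.
copied = copy.deepcopy(message)

# Serialise/deserialise cycle, the usual route when objects cross a
# process boundary: the value is first flattened to bytes, then rebuilt.
wire = pickle.dumps(message)
roundtripped = pickle.loads(wire)
```

Both results compare equal to the original while sharing none of its mutable containers; the difference is that the pickle route has to materialise `wire` before the receiver can rebuild anything.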
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From stephanh42 at gmail.com Sat May 27 04:32:00 2017 From: stephanh42 at gmail.com (Stephan Houben) Date: Sat, 27 May 2017 10:32:00 +0200 Subject: [Python-ideas] Exposing CPython's subinterpreter C-API in the stdlib. In-Reply-To: References: Message-ID: Hi Nick, > I guess I'll have to scale back my hopes on that front to be closer to > what Stephan described - even a deep copy equivalent is often going to > be cheaper than a full serialise/transmit/deserialise cycle or some > other form of inter-process communication. I would like to add that in many cases the underlying C objects *could* be shared. I identified some possible use cases of this. 1. numpy/scipy: share underlying memory of ndarray Effectively threads can then operate on the same array without GIL interference. 2. Sqlite in-memory database Multiple threads can operate on it in parallel. If you have an ORM it might feel very similar to just sharing Python objects across threads. 3. Tree of XML elements (like xml.etree) Assuming the tree data structure itself is in C, the tree could be shared across interpreters. This would be an example of a "deep" datastructure which can still be efficiently shared. So I feel this could still be very useful even if pure-Python objects need to be copied. Thanks, Stephan 2017-05-27 9:32 GMT+02:00 Nick Coghlan : > On 27 May 2017 at 03:30, Guido van Rossum wrote: >> On Fri, May 26, 2017 at 8:28 AM, Nick Coghlan wrote: >>> >>> [...] assuming the rest of idea works out >>> well, we'd eventually like to move to a tiered model where the GIL >>> becomes a read/write lock. Most code execution in subinterpreters >>> would then only need a read lock on the GIL, and hence could happily >>> execute code in parallel with other subinterpreters running on other >>> cores. >> >> >> Since the GIL protects refcounts and refcounts are probably the most >> frequently written item, I'm skeptical of this. 
> > Likewise - hence my somewhat garbled attempt to explain that actually > doing that would be contingent on the GILectomy folks figuring out > some clever way to cope with the refcounts :) > >>> By contrast, being able to reliably model Communicating Sequential >>> Processes in Python without incurring any communications overhead >>> though (ala goroutines)? Or doing the same with the Actor model (ala >>> Erlang/BEAM processes)? >>> >>> Those are *very* interesting language design concepts, and something >>> where offering a compelling alternative to the current practices of >>> emulating them with threads or coroutines pretty much requires the >>> property of zero-copy ownership transfer. >> >> But subinterpreters (which have independent sys.modules dicts) seem a poor >> match for that. It feels as if you're speculating about an entirely >> different language here, not named Python. > > Ah, you're right - the types are all going to be separate as well, > which means "cost of a deep copy" is the cheapest we're going to be > able to get with this model. Anything better than that would require a > more esoteric memory management architecture like the one in > PyParallel. > > I guess I'll have to scale back my hopes on that front to be closer to > what Stephan described - even a deep copy equivalent is often going to > be cheaper than a full serialise/transmit/deserialise cycle or some > other form of inter-process communication. > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From leewangzhong+python at gmail.com Sat May 27 12:53:26 2017 From: leewangzhong+python at gmail.com (Franklin? Lee) Date: Sat, 27 May 2017 12:53:26 -0400 Subject: [Python-ideas] a memory-efficient variant of itertools.tee In-Reply-To: References: Message-ID: On Fri, May 26, 2017 at 2:45 PM, Stephan Houben wrote: > Hi all, > > The itertools.tee function can hold on to objects "unnecessarily". 
> In particular, if you do > > iter2 = itertools.tee(iter1, 2)[0] > > i.e. you "leak" one of the returned iterators, then all returned > objects are not collected until also iter2 is collected. > > I propose a different implementation, namely the one in: > > https://github.com/stephanh42/streamtee > > streamtee.tee is a drop-in alternative for itertools.tee but > as you can see from the test in the repo, it will not hold > on to the generated objects as long. > > I propose this as an improved implementation of itertools.tee. > > Thanks, > > Stephan For convenience, the implementation itself is here (33 lines including comments): https://github.com/stephanh42/streamtee/blob/master/streamtee.py I like this. It uses an infinite generator as a thunk. Though I think it belongs on bugs.python.org rather than -ideas, because the interface should be the same and the memory/time use is asymptotically the same. The current tee implementation in C (lines 378 to 852): https://github.com/python/cpython/blob/bf623ae8843dc30b28c574bec8d29fc14be59d86/Modules/itertoolsmodule.c#L378 What bothers me is, the current implementation looks very similar to what I imagine the C implementation of this looks like. Instead of thunks with a single element, it has thunks with up to LINKCELLS elements. The comment from the commit (https://github.com/python/cpython/commit/ad983e79d6f215235d205245c2599620e33cf719): > Formerly, underlying queue was implemented in terms of two lists. The > new queue is a series of singly-linked fixed length lists. > > The new implementation runs much faster, supports multi-way tees, and > allows tees of tees without additional memory costs. The delay in deletion, then, seems to be a feature for efficiency, and can be implemented with `#define LINKCELLS 1`. P.S.: I couldn't find a pure Python implementation of tee in the CPython repo. 
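Since no pure Python tee turned up in the repository, here is a minimal sketch of the linked-cell idea discussed above, with one element per cell (the analogue of `LINKCELLS 1`). It is not the streamtee code and not a translation of the C implementation; the name `tee_sketch` and the two-slot list used as a cell are assumptions made for illustration.

```python
def tee_sketch(iterable, n=2):
    """Hypothetical pure-Python tee built on a shared singly linked list."""
    source = iter(iterable)

    def reader(link):
        # A cell is a list: empty means "not fetched yet", otherwise it is
        # [value, next_cell]. Every reader walks the same chain of cells.
        while True:
            if not link:
                try:
                    value = next(source)
                except StopIteration:
                    return
                link.extend((value, []))
            # Step to the next cell; this reader no longer references the
            # old one, so it can be freed once all readers have passed it.
            value, link = link
            yield value

    head = []
    return tuple(reader(head) for _ in range(n))


a, b = tee_sketch(range(5))
first = list(a)    # consumes the source and fills the shared cells
second = list(b)   # replays the same values from the cells
```

Each reader holds a reference only to the cell it will read next, so a value stays alive only while some live reader has yet to reach it. Dropping a reader releases its hold on the chain.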
From steve at pearwood.info Sun May 28 01:18:54 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 28 May 2017 15:18:54 +1000 Subject: [Python-ideas] tweaking the file system path protocol In-Reply-To: References: <20170524004125.GD24625@ando.pearwood.info> Message-ID: <20170528051853.GB23443@ando.pearwood.info> On Fri, May 26, 2017 at 03:58:23PM +0300, Koos Zevenhoven wrote: > On Wed, May 24, 2017 at 5:52 PM, Wolfgang Maier > wrote: > > On 05/24/2017 02:41 AM, Steven D'Aprano wrote: [...] > > This is almost exactly what I have been thinking (just that I couldn't have > > presented it so clearly)! > > Unfortunately, this thinking is also very shallow compared to what > went into PEP519. That is a rather rude comment. How would you feel if Wolfgang or I said that the PEP's thinking was "very shallow"? (I see you are listed as co-author.) If you are going to criticise our reasoning, you better give reasons for why we are wrong, not just insult us: "...this thinking is very shallow..." "This is exactly the kind of code that causes the problems." "Isn't it great that it doesn't work, so it's not attractive anymore?" "Yes, this is another way of shooting yourself in the foot." Let me look at your objections: > str and bytes subclasses that > return something different from the str/bytes content should not be > written. That's your opinion, other people might disagree. In another post, you said it would be "confusing". I think this argument is FUD ("Fear, Uncertainty, Doubt"). We can already write confusing code in a million other ways, why is this one to be prohibited? I don't know of any other area of Python where a type isn't permitted to override its own dunders: strings have __str__ and __repr__ floats have __float__ ints have __int__ tuples can override __getitem__ to return whatever they like etc. This is legal: py> class ConfusingStr(str): ... def __getitem__(self, i): ... return 'x' ... 
py> s = ConfusingStr("Nobody expects the Spanish Inquisition!")
py> s[5]
'x'

People have had the ability to write "confusing" strings, floats and
ints which could return something different from their own value. They
either don't do it, or if they do, they have a good reason and it isn't
so confusing.

And if somebody does use it to write a confusing class? So what?
"consenting adults" applies here. We aren't responsible for every abuse
of the language that somebody might do.

Why is __fspath__ so special that we need to protect users from doing
something confusing?

What *really is* confusing is to ignore __fspath__ methods in some
objects but not other objects. If that decision was intentional, I don't
think it was justified in the PEP. (At least, I didn't see it.)

> > Let's look at a potential use case for this. Assume that in a package you want
> > to handle several paths to different files and directories that are all
> > located in a common package-specific parent directory. Then using the path
> > protocol you could write this:
> >
> > import os
> >
> > class PackageBase (object):
> >     basepath = '/home/.package'
> >
> > class PackagePath (str, PackageBase):
> >     def __fspath__(self):
> >         return os.path.join(self.basepath, str(self))
> >
> > config_file = PackagePath('.config')
> > log_file = PackagePath('events.log')
> > data_dir = PackagePath('data')
> >
> > with open(log_file, 'w') as log:
> >     log.write('package paths initialized.\n')
> >
>
> This is exactly the kind of code that causes the problems. It will do
> the wrong thing when code like open(str(log_file), 'w') is used for
> compatibility.

Then don't do that.

Using open(str(log_file), 'w') is not the right way to emulate the Path
protocol for backwards compatibility. The whole reason the Path protocol
exists is because calling str(obj) is the wrong way to convert an
unknown object to a file system path string.

I think this argument about backwards compatibility is a storm in a tea
cup. We can enumerate all the possibilities:

1.
object that doesn't inherit from str/bytes: behaviour is unchanged; 2. object that does inherit from str/bytes, but doesn't override the __fspath__ method: behaviour is unchanged; 3. object that inherits from str/bytes, *and* overrides the __fspath__ method: behaviour is changed. Okay, the behaviour changes. I doubt that there will be many classes that subclass str and override __fspath__ now, because that would have been a waste of time up to now. So the main risk is: - classes created from Python 3.7 onwards; - which inherit from str/bytes; - and which override __fspath__; - and are back-ported to 3.6; - without taking into account that __fspath__ will be ignored in 3.6; - and the users don't read the docs to learn about the difference. The danger here is the possibility that the wrong pathname will be used, if str(obj) and fspath(obj) return a different string. Personally I think this is unlikely and not worth worrying about beyond a note in the documentation, but if people really feel this is a problem we could make this a __future__ import. But that just feels like overkill. -- Steve From ncoghlan at gmail.com Sun May 28 02:15:32 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 28 May 2017 16:15:32 +1000 Subject: [Python-ideas] tweaking the file system path protocol In-Reply-To: <20170528051853.GB23443@ando.pearwood.info> References: <20170524004125.GD24625@ando.pearwood.info> <20170528051853.GB23443@ando.pearwood.info> Message-ID: On 28 May 2017 at 15:18, Steven D'Aprano wrote: > On Fri, May 26, 2017 at 03:58:23PM +0300, Koos Zevenhoven wrote: > I think this argument about backwards compatibility is a storm in a tea > cup. We can enumerate all the possibilities: > > 1. object that doesn't inherit from str/bytes: behaviour is unchanged; > > 2. object that does inherit from str/bytes, but doesn't override > the __fspath__ method: behaviour is unchanged; > > 3. 
object that inherits from str/bytes, *and* overrides the __fspath__
> method: behaviour is changed.
>
> Okay, the behaviour changes. I doubt that there will be many
> classes that subclass str and override __fspath__ now, because
> that would have been a waste of time up to now. So the main risk is:
>
> - classes created from Python 3.7 onwards;
> - which inherit from str/bytes;
> - and which override __fspath__;
> - and are back-ported to 3.6;
> - without taking into account that __fspath__ will be ignored in 3.6;
> - and the users don't read the docs to learn about the difference.
>
> The danger here is the possibility that the wrong pathname will be used,
> if str(obj) and fspath(obj) return a different string.
>
> Personally I think this is unlikely and not worth worrying about beyond a note in
> the documentation, but if people really feel this is a problem we could
> make this a __future__ import. But that just feels like overkill.

It wouldn't even need to be a __future__ import, as we have a runtime
warning category specifically for this kind of change:
https://docs.python.org/3/library/exceptions.html#FutureWarning

So *if* a change like this was made, the appropriate transition plan would be:

Python 3.7: at *class definition time*, we emit FutureWarning for
subclasses of str and bytes that define __fspath__, saying that it is
currently ignored for such subclasses, but will be called in Python
3.8+

Python 3.8: os.fspath() is changed as Wolfgang proposes, such that
explicit protocol support takes precedence over builtin inheritance

However, if we *did* make such a change, it should also be made for
operator.index as well, since that is similarly inconsistent with the
way the int/float/etc constructor protocols work:

>>> from operator import index
>>> class MyInt(int):
...     def __int__(self):
...         return 5
...     def __index__(self):
...         return 5
...
>>> int(MyInt(10))
5
>>> index(MyInt(10))
10
>>> class MyFloat(float):
...     def __float__(self):
...         return 5.0
...
>>> float(MyFloat(10)) 5.0 >>> class MyComplex(complex): ... def __complex__(self): ... return 5j ... >>> complex(MyComplex(10j)) 5j >>> class MyStr(str): ... def __str__(self): ... return "Hello" ... >>> str(MyStr("Not hello")) 'Hello' >>> class MyBytes(bytes): ... def __bytes__(self): ... return b"Hello" ... >>> bytes(MyBytes(b"Not hello")) b'Hello' Regards, Nick. P.S. I'll also echo Steven's observations that it is entirely inappropriate to describe the thinking of other posters to the list as being overly shallow. The entire reason we *have* python-ideas and the PEP process is because programming language design is a *hard problem*, especially for a language with as broad a set of use cases as Python. Rather than trying to somehow survey the entire world of Python developers, we instead provide them with an open forum where they can say "This surprises or otherwise causes problems for me" and describe their perspective. That's neither deep nor shallow thinking, it's just different people using the same language in different ways, and hence encountering different pain points. As far as the specific point at hand goes, I think contrasting the behaviour of PEP 357 (__index__) and PEP 519 (__fspath__) with the behaviour of the builtin constructor protocols suggest that this is better characterised as an oversight in the design of the more recent protocols, since neither PEP explicitly discusses the problem, both PEPs were specifically designed to permit the use of objects that *don't* inherit from the relevant builtin types (since subclasses already worked), and both PEPs handle the "subclass that also implements the corresponding protocol" scenario differently from the way the builtin constructor protocols handle it. 
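The `__fspath__` side of that comparison can be sketched the same way. In the block below, `PackagePath` is modelled on the example quoted earlier in the thread, and `fspath_protocol_first` is a hypothetical helper written only to illustrate the proposed lookup order; it is not an existing or planned standard library function. `posixpath.join` is used instead of `os.path.join` so the result does not depend on the platform.

```python
import os
import posixpath


class PackagePath(str):
    """A str subclass that also defines __fspath__ (illustrative only)."""

    basepath = "/home/.package"

    def __fspath__(self):
        return posixpath.join(self.basepath, str(self))


def fspath_protocol_first(path):
    """Hypothetical variant: consult __fspath__ before the str/bytes check."""
    method = getattr(type(path), "__fspath__", None)
    if method is not None:
        result = method(path)
        if not isinstance(result, (str, bytes)):
            raise TypeError("__fspath__ must return str or bytes")
        return result
    if isinstance(path, (str, bytes)):
        return path
    raise TypeError("expected str, bytes or os.PathLike object")


p = PackagePath("events.log")
current = os.fspath(p)               # str instances are returned unchanged
proposed = fspath_protocol_first(p)  # the __fspath__ override is honoured
```

With `os.fspath` as it stands, `current` is the original object and compares equal to `"events.log"`; under the protocol-first order, `proposed` is the joined path. That gap between the two results is the behaviour change the thread is weighing.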
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From k7hoven at gmail.com Sun May 28 10:35:38 2017 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Sun, 28 May 2017 17:35:38 +0300 Subject: [Python-ideas] tweaking the file system path protocol In-Reply-To: References: <20170524004125.GD24625@ando.pearwood.info> <20170528051853.GB23443@ando.pearwood.info> Message-ID: On Sun, May 28, 2017 at 9:15 AM, Nick Coghlan wrote: > > However, if we *did* make such a change, it should also be made for > operator.index as well, since that is similarly inconsistent with the > way the int/float/etc constructor protocols work: > Part of this discussion seems to consider consistency as the only thing that matters, but consistency is only the surface here. I won't comment on the __index__ issue, and especially not call it a "misfeature", because I haven't thought about it deeply, and my comments on it would be very shallow. I might ask about it though, like the OP did. Don't get me wrong, I like consistency very much. But regarding the __fspath__ case, there are not that many people *writing* fspath-enabled classes. Instead, there are many many many more people *using* such classes (and dealing with their compatibility issues in different ways). For those people, the current behavior brings consistency---after all, it was of course designed by thinking about it from all angles and not just based on my or anyone else's own use cases only. -- Koos > >>> from operator import index > >>> class MyInt(int): > ... def __int__(self): > ... return 5 > ... def __index__(self): > ... return 5 > ... > >>> int(MyInt(10)) > 5 > >>> index(MyInt(10)) > 10 > >>> class MyFloat(float): > ... def __float__(self): > ... return 5.0 > ... > >>> float(MyFloat(10)) > 5.0 > >>> class MyComplex(complex): > ... def __complex__(self): > ... return 5j > ... > >>> complex(MyComplex(10j)) > 5j > >>> class MyStr(str): > ... def __str__(self): > ... return "Hello" > ... 
> >>> str(MyStr("Not hello"))
> 'Hello'
> >>> class MyBytes(bytes):
> ...     def __bytes__(self):
> ...         return b"Hello"
> ...
> >>> bytes(MyBytes(b"Not hello"))
> b'Hello'
>
> Regards,
> Nick.
>

--
+ Koos Zevenhoven + http://twitter.com/k7hoven +

From steve at pearwood.info Sun May 28 12:32:16 2017
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 29 May 2017 02:32:16 +1000
Subject: [Python-ideas] tweaking the file system path protocol
In-Reply-To:
References: <20170524004125.GD24625@ando.pearwood.info> <20170528051853.GB23443@ando.pearwood.info>
Message-ID: <20170528163216.GF23443@ando.pearwood.info>

On Sun, May 28, 2017 at 05:35:38PM +0300, Koos Zevenhoven wrote:

> Don't get me wrong, I like consistency very much. But regarding the
> __fspath__ case, there are not that many people *writing*
> fspath-enabled classes. Instead, there are many many many more people
> *using* such classes (and dealing with their compatibility issues in
> different ways).

What sort of compatibility issues are you referring to? os.fspath is new
in 3.6, and 3.7 isn't out yet, so I'm having trouble understanding what
compatibility issues you mean.

> For those people, the current behavior brings consistency

That's a very unintuitive statement. How is it consistent for fspath to
call the __fspath__ dunder method for some objects but ignore it for
others?

> ---after all, it was of course designed by thinking about
> it from all angles and not just based on my or anyone else's own use
> cases only.

Can you explain the reasoning to us? I don't think it is explained in the
PEP.
-- Steve From wolfgang.maier at biologie.uni-freiburg.de Sun May 28 17:33:30 2017 From: wolfgang.maier at biologie.uni-freiburg.de (Wolfgang Maier) Date: Sun, 28 May 2017 23:33:30 +0200 Subject: [Python-ideas] tweaking the file system path protocol In-Reply-To: <20170528163216.GF23443@ando.pearwood.info> References: <20170524004125.GD24625@ando.pearwood.info> <20170528051853.GB23443@ando.pearwood.info> <20170528163216.GF23443@ando.pearwood.info> Message-ID: On 28.05.2017 18:32, Steven D'Aprano wrote: > On Sun, May 28, 2017 at 05:35:38PM +0300, Koos Zevenhoven wrote: > >> Don't get me wrong, I like consistency very much. But regarding the >> __fspath__ case, there are not that many people *writing* >> fspath-enabled classes. Instead, there are many many many more people >> *using* such classes (and dealing with their compatibility issues in >> different ways). > > What sort of compatibility issues are you referring to? os.fspath is new > in 3.6, and 3.7 isn't out yet, so I'm having trouble understanding what > compatibility issues you mean. > As far as I'm aware the only such issue people had was with building interfaces that could deal with regular strings and pathlib.Path (introduced in 3.4 if I remember correctly) instances alike. Because calling str on a pathlib.Path instance returns the path as a regular string it looked like it could become a (bad) habit to just always call str on any received object for "compatibility" with both types of path representations. The path protocol is a response to this that provides an explicit and safe alternative. > >> For those people, the current behavior brings consistency > > That's a very unintuitive statement. How is it consistent for fspath to > call the __fspath__ dunder method for some objects but ignore it for > others? > The path protocol brings a standard way of dealing with diverse path representations, but only if you use it. 
If people keep using str(path_object) as before, then they are doing
things wrongly and are no better or safer off than they were before!

The path protocol does *not* use __fspath__ as an indicator that an
object's str-representation is intended to be used as a path. If you had
wanted this, the PEP should have defined __fspath__ not as a method, but
as a flag and have the protocol check that flag, then call __str__ if
appropriate.

With __fspath__ being a method that can return whatever its author sees
fit, calling str to get a path from an arbitrary object is just as wrong
as it always was - it will work for pathlib.Path objects and might or
might not work for some other types. Importantly, this has nothing to do
with this proposal, but is in the nature of the protocol as it is
defined *now*.

>
>> ---after all, it was of course designed by thinking about
>> it from all angles and not just based on my or anyone else's own use
>> cases only.
>
> Can you explain the reasoning to us? I don't think it is explained in the
> PEP.
>
>

From apalala at gmail.com Sun May 28 18:44:23 2017
From: apalala at gmail.com (Juancarlo Añez)
Date: Sun, 28 May 2017 18:44:23 -0400
Subject: [Python-ideas] tweaking the file system path protocol
In-Reply-To:
References: <20170524004125.GD24625@ando.pearwood.info> <20170528051853.GB23443@ando.pearwood.info> <20170528163216.GF23443@ando.pearwood.info>
Message-ID:

On Sun, May 28, 2017 at 5:33 PM, Wolfgang Maier
<wolfgang.maier at biologie.uni-freiburg.de> wrote:

> With __fspath__ being a method that can return whatever its author sees
> fit, calling str to get a path from an arbitrary object is just as wrong as
> it always was - it will work for pathlib.Path objects and might or might
> not work for some other types. Importantly, this has nothing to do with
> this proposal, but is in the nature of the protocol as it is defined *now*.

+1

--
Juancarlo *Añez*
From storchaka at gmail.com Mon May 29 03:55:29 2017
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Mon, 29 May 2017 10:55:29 +0300
Subject: [Python-ideas] tweaking the file system path protocol
In-Reply-To:
References: <20170524004125.GD24625@ando.pearwood.info> <20170528051853.GB23443@ando.pearwood.info> <20170528163216.GF23443@ando.pearwood.info>
Message-ID:

29.05.17 00:33, Wolfgang Maier wrote:
> The path protocol does
> *not* use __fspath__ as an indicator that an object's str-representation
> is intended to be used as a path. If you had wanted this, the PEP should
> have defined __fspath__ not as a method, but as a flag and have the
> protocol check that flag, then call __str__ if appropriate.

__fspath__ is a method because there is a need to support bytes paths.
__fspath__() can return a bytes object, str() can't.

From wolfgang.maier at biologie.uni-freiburg.de Mon May 29 04:03:44 2017
From: wolfgang.maier at biologie.uni-freiburg.de (Wolfgang Maier)
Date: Mon, 29 May 2017 10:03:44 +0200
Subject: [Python-ideas] tweaking the file system path protocol
In-Reply-To:
References: <20170524004125.GD24625@ando.pearwood.info> <20170528051853.GB23443@ando.pearwood.info> <20170528163216.GF23443@ando.pearwood.info>
Message-ID:

On 05/29/2017 09:55 AM, Serhiy Storchaka wrote:
> 29.05.17 00:33, Wolfgang Maier wrote:
>> The path protocol does *not* use __fspath__ as an indicator that an
>> object's str-representation is intended to be used as a path. If you
>> had wanted this, the PEP should have defined __fspath__ not as a
>> method, but as a flag and have the protocol check that flag, then call
>> __str__ if appropriate.
>
> __fspath__ is a method because there is a need to support bytes paths.
> __fspath__() can return a bytes object, str() can't.
>

That's certainly one reason, but again just shows that calling
str(path_object) to get a path representation is wrong.
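The bytes case mentioned above is easy to demonstrate with a small path-like class. `BytesPath` and the path it wraps are hypothetical names chosen for this sketch; the only library call relied on is `os.fspath`.

```python
import os


class BytesPath:
    """Hypothetical path-like object whose file system form is bytes."""

    def __init__(self, raw):
        self._raw = raw

    def __fspath__(self):
        return self._raw


path = BytesPath(b"/tmp/data.bin")
via_protocol = os.fspath(path)  # the bytes value supplied by __fspath__
via_str = str(path)             # default object repr, not a usable path
```

Here `via_protocol` is `b"/tmp/data.bin"`, while `via_str` is the generic `<... BytesPath object at 0x...>` text, which is why calling `str()` on an arbitrary path-like object cannot substitute for the protocol.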
From stephanh42 at gmail.com Mon May 29 14:31:13 2017 From: stephanh42 at gmail.com (Stephan Houben) Date: Mon, 29 May 2017 20:31:13 +0200 Subject: [Python-ideas] a memory-efficient variant of itertools.tee In-Reply-To: References: Message-ID: Hi Franklin, Thanks. I should have tested with a larger sequence. I suppose delaying deletion by a bounded amount of objects is fine. I was concerned that a potentially unbounded amount of objects was kept. The "reference implementation" in the docs suggested that and my initial testing seemed to confirm it. That's what you get from reading the docs instead of the code. There is even a justification in the code why it is 57 ;-) Stephan 2017-05-27 18:53 GMT+02:00 Franklin? Lee : > On Fri, May 26, 2017 at 2:45 PM, Stephan Houben wrote: >> Hi all, >> >> The itertools.tee function can hold on to objects "unnecessarily". >> In particular, if you do >> >> iter2 = itertools.tee(iter1, 2)[0] >> >> i.e. you "leak" one of the returned iterators, then all returned >> objects are not collected until also iter2 is collected. >> >> I propose a different implementation, namely the one in: >> >> https://github.com/stephanh42/streamtee >> >> streamtee.tee is a drop-in alternative for itertools.tee but >> as you can see from the test in the repo, it will not hold >> on to the generated objects as long. >> >> I propose this as an improved implementation of itertools.tee. >> >> Thanks, >> >> Stephan > > For convenience, the implementation itself is here (33 lines including > comments): > https://github.com/stephanh42/streamtee/blob/master/streamtee.py > > I like this. It uses an infinite generator as a thunk. Though I think > it belongs on bugs.python.org rather than -ideas, because the > interface should be the same and the memory/time use is asymptotically > the same. 
> > The current tee implementation in C (lines 378 to 852): > https://github.com/python/cpython/blob/bf623ae8843dc30b28c574bec8d29fc14be59d86/Modules/itertoolsmodule.c#L378 > > What bothers me is, the current implementation looks very similar to > what I imagine the C implementation of this looks like. Instead of > thunks with a single element, it has thunks with up to LINKCELLS > elements. The comment from the commit > (https://github.com/python/cpython/commit/ad983e79d6f215235d205245c2599620e33cf719): > >> Formerly, underlying queue was implemented in terms of two lists. The >> new queue is a series of singly-linked fixed length lists. >> >> The new implementation runs much faster, supports multi-way tees, and >> allows tees of tees without additional memory costs. > > The delay in deletion, then, seems to be a feature for efficiency, and > can be implemented with `#define LINKCELLS 1`. > > P.S.: I couldn't find a pure Python implementation of tee in the CPython repo.
From mistersheik at gmail.com Mon May 29 14:59:16 2017 From: mistersheik at gmail.com (Neil Girdhar) Date: Mon, 29 May 2017 11:59:16 -0700 (PDT) Subject: [Python-ideas] dict(default=int) In-Reply-To: References: <7d2eb6ec-e662-7bbe-2b2d-3d3a31d816f2@brice.xyz> <70aca7b8-6d7b-a95c-5df0-24c438c1c0ad@trueblade.com> <20170309110826.66097a63@subdivisions.wooz.org> Message-ID: <06223f42-e2b4-4cf0-8cdf-29dc450cc5b1@googlegroups.com> A long time ago, I proposed that the dict variants (sorteddict, defaultdict, weakkeydict, etc.) be made more discoverable by having them specified as keyword arguments and I got the same feedback that the poster here is getting. Now, instead of moving these classes into dict, why not have a factory like dict.factory(values=None, *, ordered=True, sorted=False, has_default=False, weak_keys=False, weak_values=False, ...)
If someone prefers keyword arguments as options over keyword arguments as initializer magic, they can set: dict = dict.factory Best, Neil On Thursday, March 9, 2017 at 5:58:43 PM UTC-5, Chris Barker wrote: > > > > >If we really want to make defaultdict feel more "builtin" (and I don't see >> >any reason to do so), I'd suggest adding a factory function: >> > >> >dict.defaultdict(int) >> > > >> Nice. >> > > I agree -- what about: > > dict.sorteddict() ?? > > make easy access to various built-in dict variations... > > -CHB > > > -- > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.... at noaa.gov >
From matt at getpattern.com Mon May 29 17:06:43 2017 From: matt at getpattern.com (Matt Gilson) Date: Mon, 29 May 2017 14:06:43 -0700 Subject: [Python-ideas] dict(default=int) In-Reply-To: <06223f42-e2b4-4cf0-8cdf-29dc450cc5b1@googlegroups.com> References: <7d2eb6ec-e662-7bbe-2b2d-3d3a31d816f2@brice.xyz> <70aca7b8-6d7b-a95c-5df0-24c438c1c0ad@trueblade.com> <20170309110826.66097a63@subdivisions.wooz.org> <06223f42-e2b4-4cf0-8cdf-29dc450cc5b1@googlegroups.com> Message-ID: On Mon, May 29, 2017 at 11:59 AM, Neil Girdhar wrote: > A long time ago, I proposed that the dict variants (sorteddict, > defaultdict, weakkeydict, etc.) be made more discoverable by having them > specified as keyword arguments and I got the same feedback that the poster > here is getting. Now, instead of moving these classes into dict, why not > have a factory like > > dict.factory(values=None, *, ordered=True, sorted=False, > has_default=False, weak_keys=False, weak_values=False, ...) > Hmm ... I don't think that I like this. For one, it greatly increases the amount of surface area that needs to be maintained in the standard library.
As far as I know, we don't currently have an OrderedWeakKeyDictionary with defaultdict behavior. If this were to be added, then each of the combinations of input keyword arguments would need to be supported. Many of them probably don't have very compelling use-cases, or they might not have semantics for which it would be easy to agree upon a "preferred behavior" (assuming that a mythological SortedDict pops into existence in the standard lib, what happens if you call `dict.factory(sorted=True, ordered=True)`?). Of course, this sets a precedent that future dict subclasses need to be added to the `dict.factory` constructor as well, which risks a very bloated signature (or, someone has to make the decision about which subclasses should be available and which should be left off ...). This last argument can be repurposed against providing easy access to builtin dict subclasses via their own methods (`dict.defaultdict(...)`). I don't find it very cumbersome to import the dict subclasses that I need from the locations where they live. I like that it forces me to be explicit and I think that it generally makes reading the code easier in the normal cases (`defaultdict(int)` vs. `dict.factory(default=int)`). Of course, if someone finds this idea particularly interesting, they could definitely explore the idea more by creating a package on pypi that attempts to provide this factory function. If it got usage, that might go a long way to convincing us nay-sayers that this idea has legs :-). > > If prefers the keyword-argument as options to the keyword-argument as > initializer magic, they can set: > > dict = dict.factory > > Best, > > Neil > > On Thursday, March 9, 2017 at 5:58:43 PM UTC-5, Chris Barker wrote: >> >> >> >> >If we really want to make defaultdict feel more "builtin" (and I don't >>> see >>> >any reason to do so), I'd suggest adding a factory function: >>> > >>> >dict.defaultdict(int) >>> >> >> >>> Nice. >>> >> >> I agree -- what about: >> >> dict.sorteddict() ??
>> >> make easy access to various built-in dict variations... >> >> -CHB >> >> >> -- >> >> Christopher Barker, Ph.D. >> Oceanographer >> >> Emergency Response Division >> NOAA/NOS/OR&R (206) 526-6959 voice >> 7600 Sand Point Way NE (206) 526-6329 fax >> Seattle, WA 98115 (206) 526-6317 main reception >> >> Chris.... at noaa.gov >> > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > -- Matt Gilson | Pattern Software Engineer getpattern.com
From mistersheik at gmail.com Mon May 29 17:27:45 2017 From: mistersheik at gmail.com (Neil Girdhar) Date: Mon, 29 May 2017 21:27:45 +0000 Subject: [Python-ideas] dict(default=int) In-Reply-To: References: <7d2eb6ec-e662-7bbe-2b2d-3d3a31d816f2@brice.xyz> <70aca7b8-6d7b-a95c-5df0-24c438c1c0ad@trueblade.com> <20170309110826.66097a63@subdivisions.wooz.org> <06223f42-e2b4-4cf0-8cdf-29dc450cc5b1@googlegroups.com> Message-ID: On Mon, May 29, 2017 at 5:06 PM Matt Gilson wrote: > On Mon, May 29, 2017 at 11:59 AM, Neil Girdhar > wrote: > >> A long time ago, I proposed that the dict variants (sorteddict, >> defaultdict, weakkeydict, etc.) be made more discoverable by having them >> specified as keyword arguments and I got the same feedback that the poster >> here is getting. Now, instead of moving these classes into dict, why not >> have a factory like >> >> dict.factory(values=None, *, ordered=True, sorted=False, >> has_default=False, weak_keys=False, weak_values=False, ...) >> > > Hmm ... I don't think that I like this. For one, it greatly increases the > amount of surface area that needs to be maintained in the standard > library. As far as I know, we don't currently have a > OrderedWeakKeyDictionary with defaultdict behavior.
If this was to be > added, then each of the combinations of input keyword arguments would need > to be supported. Many of them probably don't have very compelling > use-cases or they might not have semantics that would be easy to agree upon > a "preferred behavior" (Assuming that a mythological SortedDict pops into > existence in the standard lib, what happens if you call > `dict.factory(sorted=True, ordered=True)`?). Of course, this sets a > precedence that future dict subclasses need to be added to the > `dict.factory` constructor as well which risks a very bloated signature > (or, someone has to make the decision about which subclasses should be > available and which should be left off ...). This last argument can > repurposed against providing easy access to builtin dict subclasses via > their own methods (`dict.defaultdict(...)`). > You can just raise NotImplementedError for the bad combinations and the reasonable combinations that aren't implemented yet. > > I don't find very cumbersome to import the dict subclasses that I need > from the locations where they live. I like that it forces me to be > explicit and I think that it generally makes reading the code easier in the > normal cases (`defaultdict(in)` vs. `dict.factory(default=int)`). Of > course, if someone finds this idea particularly interesting, they could > definitely explore the idea more by creating a package on pypi that > attempts to provide this factory function. If it got usage, that might go > a long way to convincing us nay-sayers that this idea has legs :-). > Fair enough. The reason I had proposed it was to make the various dict incantations more discoverable. Also, I was using a lot of weak-key dictionaries and it was odd to me that there was no ordered version of weak key dictionaries, etc. 
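As a rough illustration of the idea discussed here, the following sketch shows a standalone factory that raises NotImplementedError for combinations it does not handle. The function name dict_factory and its parameter names are hypothetical; they only loosely follow the signature suggested in the thread, no such function exists in the standard library, and "sorted" is left out because the standard library has no sorted dict.

```python
from collections import OrderedDict, defaultdict
from weakref import WeakKeyDictionary, WeakValueDictionary


def dict_factory(values=None, *, ordered=False, default_factory=None,
                 weak_keys=False, weak_values=False):
    """Hypothetical factory: pick an existing mapping type from keyword options."""
    requested = [ordered, default_factory is not None, weak_keys, weak_values]
    if sum(requested) > 1:
        # No existing class combines these behaviours, so refuse rather
        # than guess at the semantics.
        raise NotImplementedError("this combination of options is not supported")

    items = {} if values is None else values
    if default_factory is not None:
        return defaultdict(default_factory, items)
    if ordered:
        return OrderedDict(items)
    if weak_keys:
        return WeakKeyDictionary(items)
    if weak_values:
        return WeakValueDictionary(items)
    return dict(items)


counts = dict_factory(default_factory=int)
counts["spam"] += 1

try:
    dict_factory(ordered=True, weak_keys=True)
except NotImplementedError:
    combination_rejected = True
else:
    combination_rejected = False
```

The sketch also makes Matt's concern visible: every new option multiplies the combinations that must either be implemented or explicitly rejected.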
> > > > >> >> If prefers the keyword-argument as options to the keyword-argument as >> initializer magic, they can set: >> >> dict = dict.factory >> >> Best, >> >> Neil >> >> On Thursday, March 9, 2017 at 5:58:43 PM UTC-5, Chris Barker wrote: >>> >>> >>> >>> >If we really want to make defaultdict feel more "builtin" (and I don't >>>> see >>>> >any reason to do so), I'd suggest adding a factory function: >>>> > >>>> >dict.defaultdict(int) >>>> >>> >>> >>>> Nice. >>>> >>> >>> I agree -- what about: >>> >>> dict.sorteddict() ?? >>> >>> make easy access to various built-in dict variations... >>> >>> -CHB >>> >>> >>> -- >>> >>> Christopher Barker, Ph.D. >>> Oceanographer >>> >>> Emergency Response Division >>> NOAA/NOS/OR&R (206) 526-6959 voice >>> 7600 Sand Point Way NE (206) 526-6329 fax >>> Seattle, WA 98115 (206) 526-6317 main reception >>> >>> Chris.... at noaa.gov >>> >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> >> > > > -- > > Matt Gilson | Pattern > > Software Engineer > getpattern.com > >
From mariocj89 at gmail.com Mon May 29 17:32:24 2017 From: mariocj89 at gmail.com (Mario Corchero) Date: Mon, 29 May 2017 22:32:24 +0100 Subject: [Python-ideas] SealedMock proposal for unittest.mock Message-ID: Hello Everyone! First time writing to python-ideas. *Overview* Add a new mock class to the mock module, SealedMock (or RestrictedMock), that allows restricting, dynamically and recursively, the addition of attributes to it. The new class just defines a special attribute "sealed"; once it is set to True, the automatic creation of mocks is blocked for the mock itself and for all its "submocks". See sealedmock. Don't focus on the implementation; it is ugly and would be much simpler within *mock.py*.
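For readers on Python 3.7 or later, the behaviour described in the overview can be tried with unittest.mock.seal(), which the standard library provides from that version on. This is only a sketch of the idea, not the SealedMock class from the post; the client object and its connect method are made-up names for illustration.

```python
from unittest.mock import Mock, seal

# Configure the mock freely while writing the test ...
client = Mock()
client.connect.return_value = "connected"

# ... then seal it before handing it to the code under test.
seal(client)

# Attributes configured before sealing keep working.
result = client.connect()

# A typo no longer creates a fresh child mock silently; it raises instead.
try:
    client.conect()
except AttributeError:
    typo_detected = True
else:
    typo_detected = False
```

After sealing, only the interface that was set up beforehand is available, which is the "narrow interface" the proposal aims for.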
*Rationale* Inspired by GMock RestrictedMock, SealedMock aims to let the developer define a narrow interface for the mock that determines what can be called on it. The feature of mocks returning mocks by default is extremely useful but not always desired. Quite often you rely on it only while you are writing the test, but you want it to be disabled by the time the mock is passed into your code; that is what SealedMock aims to address. This solution also prevents user errors when mocking incorrect paths or making typos when calling attributes/methods of the mock. We have tried it internally in our company and it gives quite a bit nicer user experience for many use cases, especially for new users of mock, as it helps out when you mock the wrong path.
*Alternatives*
- Using autospec/spec is a possible solution, but it removes flexibility and is rather painful to write for each of the mocks and submocks being used.
- Leaving it outside of mock.py because it is not interesting enough. I am fine with that :) I am just proposing it in case you think otherwise.
- Making it part of the standard Mock base class. That works for me, but I'd be concerned about how we can do it in a backward-compatible way (say someone is mocking something that already has a "sealed" attribute).
Let me know what you think; I am happy to open an enhancement request at https://bugs.python.org/ and send a PR. Regards, Mario
From rymg19 at gmail.com Mon May 29 18:43:39 2017 From: rymg19 at gmail.com (Ryan Gonzalez) Date: Mon, 29 May 2017 17:43:39 -0500 Subject: [Python-ideas] dict(default=int) In-Reply-To: <06223f42-e2b4-4cf0-8cdf-29dc450cc5b1@googlegroups.com> References: <7d2eb6ec-e662-7bbe-2b2d-3d3a31d816f2@brice.xyz> <70aca7b8-6d7b-a95c-5df0-24c438c1c0ad@trueblade.com> <20170309110826.66097a63@subdivisions.wooz.org> <06223f42-e2b4-4cf0-8cdf-29dc450cc5b1@googlegroups.com> Message-ID: Sometimes I feel that it would be neat if dict constructors (like proposed previously in the thread) could also be chained, e.g.: dict.ordered.default(int)(a=1, b=2) -- Ryan Yoko Shimomura > ryo (supercell/EGOIST) > Hiroyuki Sawano >> everyone else http://refi64.com On May 29, 2017 2:06 PM, "Neil Girdhar" wrote: > A long time ago, I proposed that the dict variants (sorteddict, > defaultdict, weakkeydict, etc.) be made more discoverable by having them > specified as keyword arguments and I got the same feedback that the poster > here is getting. Now, instead of moving these classes into dict, why not > have a factory like > > dict.factory(values=None, *, ordered=True, sorted=False, > has_default=False, weak_keys=False, weak_values=False, ...) > > If prefers the keyword-argument as options to the keyword-argument as > initializer magic, they can set: > > dict = dict.factory > > Best, > > Neil > > On Thursday, March 9, 2017 at 5:58:43 PM UTC-5, Chris Barker wrote: >> >> >> >> >If we really want to make defaultdict feel more "builtin" (and I don't >>> see >>> >any reason to do so), I'd suggest adding a factory function: >>> > >>> >dict.defaultdict(int) >>> >> >> >>> Nice. >>> >> >> I agree -- what about: >> >> dict.sorteddict() ?? >> >> make easy access to various built-in dict variations... >> >> -CHB >> >> >> -- >> >> Christopher Barker, Ph.D.
>> Oceanographer >> >> Emergency Response Division >> NOAA/NOS/OR&R (206) 526-6959 voice >> 7600 Sand Point Way NE (206) 526-6329 fax >> Seattle, WA 98115 (206) 526-6317 main reception >> >> Chris.... at noaa.gov >> > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > >
From rosuav at gmail.com Mon May 29 21:28:24 2017 From: rosuav at gmail.com (Chris Angelico) Date: Tue, 30 May 2017 11:28:24 +1000 Subject: [Python-ideas] dict(default=int) In-Reply-To: References: <7d2eb6ec-e662-7bbe-2b2d-3d3a31d816f2@brice.xyz> <70aca7b8-6d7b-a95c-5df0-24c438c1c0ad@trueblade.com> <20170309110826.66097a63@subdivisions.wooz.org> <06223f42-e2b4-4cf0-8cdf-29dc450cc5b1@googlegroups.com> Message-ID: On Tue, May 30, 2017 at 7:06 AM, Matt Gilson wrote: > On Mon, May 29, 2017 at 11:59 AM, Neil Girdhar > wrote: >> >> A long time ago, I proposed that the dict variants (sorteddict, >> defaultdict, weakkeydict, etc.) be made more discoverable by having them >> specified as keyword arguments and I got the same feedback that the poster >> here is getting. Now, instead of moving these classes into dict, why not >> have a factory like >> >> dict.factory(values=None, *, ordered=True, sorted=False, >> has_default=False, weak_keys=False, weak_values=False, ...) > > > Hmm ... I don't think that I like this. For one, it greatly increases the > amount of surface area that needs to be maintained in the standard library. > As far as I know, we don't currently have a OrderedWeakKeyDictionary with > defaultdict behavior. "defaultdict behavior" can be tacked onto anything:
>>> class OrderedDefaultDict(collections.OrderedDict):
...     def __missing__(self, key):
...         self[key] = []
...         return self[key]
The core functionality of defaultdict is part of dict (the fact that __missing__ gets called). The core functionality of weak references could easily be added to dict too, if something like this were wanted. So a lot of these would indeed be orthogonal. That said, though, I don't know of many situations in which you would need an OrderedWeakKeyDictionary, so if you have to write some custom code to make that happen, so be it. ChrisA
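The class in the message above hard-codes a list as the default. As a small sketch of the same __missing__ technique, the variant below accepts the default factory as a constructor argument; that parameter is an addition made here for illustration and is not part of the original message.

```python
from collections import OrderedDict


class OrderedDefaultDict(OrderedDict):
    """OrderedDict that fills in missing keys, in the style of defaultdict."""

    def __init__(self, default_factory, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.default_factory = default_factory

    def __missing__(self, key):
        # dict.__getitem__ calls this hook when the key is absent.
        value = self[key] = self.default_factory()
        return value


groups = OrderedDefaultDict(list)
groups["b"].append(1)
groups["a"].append(2)
groups["b"].append(3)

keys_in_insertion_order = list(groups)
```

Keys appear in the order they were first accessed, and each missing key is initialised by the factory, which is the combination of behaviours ChrisA describes as orthogonal.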