From ncoghlan at gmail.com Fri Jan 1 00:51:36 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 1 Jan 2016 15:51:36 +1000 Subject: [Python-ideas] Where to put non-collection ABCs (was: Deprecating the old-style sequence protocol) In-Reply-To: References: <5681785E.2070105@egenix.com> <568544BA.4040709@sdamon.com> Message-ID: On 1 January 2016 at 08:18, Michael Selik wrote: > On Thu, Dec 31, 2015 at 12:48 PM Brett Cannon wrote: >> >> On Thu, Dec 31, 2015, 07:08 Alexander Walters >> wrote: >>> >>> Would it be a good idea to mix 'concrete implementations of ABCs'* >>> directly in the abc module where the tooling to create ABCs live, or to >>> put it in a submodule? I feel it should be a submodule, but that isn't >>> based on vast experience. > > Locating collections ABCs in a submodule makes some sense, as there are 21 > of them and the collections module is important for beginners to learn > without getting distracted by ABCs. Contrast that with the direct inclusion > of ABCs in most other modules and it suggests the creation of a submodule > for collections may have been motivated for the same reason as this > discussion -- it didn't feel right to have certain ABCs directly in the > collections module. No need to speculate, the original motive for the move is documented in the tracker: http://bugs.python.org/issue11085 (which can be found by looking at the commit history for the collections module: https://hg.python.org/cpython/log/3.5/Lib/collections/abc.py ) The problem was with folks getting confused between the abstract types like Sequence and Mapping and the usable classes like deque, ChainMap, OrderedDict, defaultdict, etc, rather than there being a lot of non-collections related content in the file. At the time, Callable was the only non-container related ABC in collections.abc - most of the others now being considered for relocation (Generator, Coroutine, Awaitable, AsyncIterable, AsyncIterator) were added as part of the PEP 492 implementation in 3.5, and *that* was mostly driven by Iterable and Iterator already being there so it was "logical" to also add Generator, AsyncIterable and AsyncIterator, with Coroutine and Awaitable coming along for the ride. That does raise the question of whether or not it's worth continuing to publish the PEP 492 ABCs from collections.abc - Guido formally accepted PEP 492 with provisional status [1], so we have scope to do the following: - add abc.(Generator, Coroutine, Awaitable, AsyncIterable, AsyncIterator) in 3.5.2 (keeping the aliases in collections.abc) - drop the collections.abc aliases for the PEP 492 ABCs in 3.6 - add abc.(Callable, Iterable, Iterator) in 3.6 (keeping the aliases in collections.abc indefinitely for Python 2 compatibility) [1] https://mail.python.org/pipermail/python-dev/2015-May/139844.html > If the non-collection ABCs are being moved out of the collections module and > into the ``abc`` module, there's less reason to separate them into a > submodule. Beginners don't venture into the abc module expecting to > understand everything. It's natural to find a bunch of ABCs in a module > called ``abc``. And ABCs are included directly in many other modules instead > of being relegated to a less discoverable submodule like ``typing.abc``, > ``io.abc``, ``numbers.abc``, etc. as many of those are focused on ABCs in > the first place. 
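A quick REPL illustration of the abstract-versus-concrete split being discussed (a sketch, not part of the original correspondence): the ABCs in collections.abc describe interfaces that the concrete containers already satisfy, but are not usable containers themselves.

>>> from collections.abc import Sequence, Mapping
>>> isinstance([], Sequence)        # the concrete list type is registered against the ABC
True
>>> isinstance({}, Mapping)         # same for dict
True
>>> Sequence()                      # the ABC itself can't be instantiated
Traceback (most recent call last):
  ...
TypeError: Can't instantiate abstract class Sequence with abstract methods __getitem__, __len__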
Right, the reason importlib.abc and collections.abc make sense is that when you import "importlib", you're probably interested in dynamic imports, and when you import "collections", you're probably interested in using one of the concrete container classes. The ABCs are only relevant if you're wanting to do some kind of type checking or define your own classes implementing the ABCs, so it makes sense to separate them at the module level, not just in the documentation. Other modules defining ABCs either don't need separation, or get their separation in other ways:

abc: no separation needed, you're necessarily already thinking about ABCs when importing this
typing: no separation needed, the only ABC is the one for defining generic types
email: no separation needed, the only ABC is the one for defining email policies
io: to use the io stack, you just call open() or use some other file/stream opening API
numbers: to use the numeric tower, you use a builtin type, fractions.Fraction, decimal.Decimal, or some other numeric type

Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From surya.subbarao1 at gmail.com Sat Jan 2 13:14:55 2016 From: surya.subbarao1 at gmail.com (u8y7541 The Awesome Person) Date: Sat, 2 Jan 2016 10:14:55 -0800 Subject: [Python-ideas] Bad programming style in decorators? Message-ID: In most decorator tutorials, it's taught using functions inside functions. Isn't this inefficient because every time the decorator is called, you're redefining the function, which takes up much more memory? I prefer defining decorators as classes with __call__ overridden. Is there a reason why decorators are taught with functions inside functions? -- -Surya Subbarao -------------- next part -------------- An HTML attachment was scrubbed... URL: From abarnert at yahoo.com Sat Jan 2 13:56:19 2016 From: abarnert at yahoo.com (Andrew Barnert) Date: Sat, 2 Jan 2016 10:56:19 -0800 Subject: [Python-ideas] Bad programming style in decorators? In-Reply-To: References: Message-ID: <8C9E6C07-9563-456E-88EB-D9C4A70F9235@yahoo.com> On Jan 2, 2016, at 10:14, u8y7541 The Awesome Person wrote: > > In most decorator tutorials, it's taught using functions inside functions. Isn't this inefficient because every time the decorator is called, you're redefining the function, which takes up much more memory? No. First, most decorators are only called once. For example:

@lru_cache(maxsize=None)
def fib(n):
    if n < 2:
        return n
    return fib(n-1) + fib(n-2)

The lru_cache function gets called once, and creates and returns a decorator function. That decorator function is passed fib, and creates and returns a wrapper function. That wrapper function is what gets stored in the globals as fib. It may then get called a zillion times, but it doesn't create or call any new function. So, why should you care whether lru_cache is implemented with a function or a class? You're talking about a difference of a few dozen bytes, once in the entire lifetime of your program. Plus, where do you get the idea that a function object is "much larger"? Each new function that gets built uses the same code object, globals dict, etc., so you're only paying for the cost of a function object header, plus a tuple of cell objects (pointers) for any state variables. Your alternative is to create a class instance header (maybe a little smaller than a function object header), and store all those state variables in a dict (33-50% bigger even with the new split-instance-dict optimizations).
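As a concrete point of comparison, here is a minimal sketch of the two styles under discussion (the call-counting example and the names count_calls / CountCalls are illustrative, not from the original messages):

import functools

def count_calls(func):                  # function-based: state lives in the closure
    calls = 0
    @functools.wraps(func)
    def wrapper(*args, **kw):
        nonlocal calls
        calls += 1
        return func(*args, **kw)
    return wrapper

class CountCalls:                       # class-based: state lives on the instance
    def __init__(self, func):
        functools.update_wrapper(self, func)
        self.func = func
        self.calls = 0
    def __call__(self, *args, **kw):
        self.calls += 1
        return self.func(*args, **kw)

Either one can be applied as @count_calls or @CountCalls; the disagreement in this thread is only about the per-wrapper cost of the closure versus the instance.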
Anyway, I'm willing to bet that in this case, the function is ~256 bytes while the class is ~1024, so you're actually wasting rather than saving memory. But either way, it's far too little memory to care. > I prefer defining decorators as classes with __call__ overridden. Is there a reason why decorators are taught with functions inside functions? Because most decorators are more concise, more readable, and easier to understand that way. And that's far more important than a micro-optimization which may actually be a pessimization but which even more likely isn't going to matter at all. From ncoghlan at gmail.com Sat Jan 2 21:42:31 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 3 Jan 2016 12:42:31 +1000 Subject: [Python-ideas] Bad programming style in decorators? In-Reply-To: <8C9E6C07-9563-456E-88EB-D9C4A70F9235@yahoo.com> References: <8C9E6C07-9563-456E-88EB-D9C4A70F9235@yahoo.com> Message-ID: On 3 January 2016 at 04:56, Andrew Barnert via Python-ideas wrote: > On Jan 2, 2016, at 10:14, u8y7541 The Awesome Person wrote: >> >> In most decorator tutorials, it's taught using functions inside functions. Isn't this inefficient because every time the decorator is called, you're redefining the function, which takes up much more memory? > > No. > > First, most decorators are only called once. For example: > > @lru_cache(maxsize=None) > def fib(n): > if n < 2: > return n > return fib(n-1) + fib(n-2) > > The lru_cache function gets called once, and creates and returns a decorator function. That decorator function is passed fib, and creates and returns a wrapper function. That wrapper function is what gets stored in the globals as fib. It may then get called a zillion times, but it doesn't create or call any new function. > > So, why should you care whether lru_cache is implemented with a function or a class? You're talking about a difference of a few dozen bytes, once in the entire lifetime of your program. We need to make a slight terminology clarification here, as the answer to Surya's question changes depending on whether we're talking about implementing wrapper functions inside decorators (like the "_lru_cache_wrapper" that lru_cache wraps around the passed in callable), or about implementing decorators inside decorator factories (like the transient "decorating_function" that lru_cache uses to apply the wrapper to the function being defined). Most of the time that distinction isn't important, so folks use the more informal approach of using "decorator" to refer to both decorators and decorator factories, but this is a situation where the difference matters. Every decorator factory does roughly the same thing: when called, it produces a new instance of a callable type which accepts a single function as its sole argument. From the perspective of the *user* of the decorator factory, it doesn't matter whether internally that's handled using a def statement, a lambda expression, functools.partial, instantiating a class that defines a custom __call__ method, or some other technique. It's also rare for decorator factories to be invoked in code that's a performance bottleneck, so it's generally more important to optimise for readability and maintainability when writing them than it is to optimise for speed.
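To make that terminology concrete, here is a minimal sketch (not code from the thread; the repeat/greet names are illustrative) showing all three layers at once:

import functools

def repeat(n):                        # decorator factory: called once per @repeat(...) line
    def decorator(func):              # decorator: called once per decorated function
        @functools.wraps(func)
        def wrapper(*args, **kw):     # wrapper: called on every invocation of func
            for _ in range(n - 1):
                func(*args, **kw)
            return func(*args, **kw)
        return wrapper
    return decorator

@repeat(3)    # repeat(3) returns decorator; decorator(greet) returns wrapper
def greet():
    print("hello")

The factory and the decorator each run exactly once here; only the wrapper is invoked repeatedly, which is why it's the only layer where the micro-optimisation question below really arises.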
The wrapper functions themselves, though, exist in a one:one correspondence with the functions they're applied to - when you apply functools.lru_cache to a function, the transient decorator produced by the decorator factory only lasts as long as the execution of the function definition, but the wrapper function lasts for as long as the wrapped function does, and gets invoked every time that function is called (and if a function is performance critical enough for the results to be worth caching, then it's likely performance critical enough to be thinking about micro-optimisations). As such, from a micro-optimisation perspective, it's reasonable to want to know the answers to:

* Which is faster, defining a new function object, or instantiating an existing class?
* Which is faster, calling a function object that accepts a single parameter, or calling a class with a custom __call__ method?
* Which uses more memory, defining a new function object, or instantiating an existing class?

The answers to these questions can technically vary by implementation, but in practice, CPython's likely to be representative of their *relative* performance for any given implementation, so we can use it to check whether or not our intuitions about relative speed and memory consumption are correct. For the first question then, here are the numbers I get locally for CPython 3.4:

$ python3 -m timeit "def f(): pass"
10000000 loops, best of 3: 0.0744 usec per loop
$ python3 -m timeit -s "class C: pass" "c = C()"
10000000 loops, best of 3: 0.113 usec per loop

The trick here is to realise that *at runtime*, a def statement is really just instantiating a new instance of types.FunctionType - most of the heavy lifting has already been done at compile time. The reason it manages to be faster than typical class instantiation is because we get to use customised bytecode operating on constant values rather than having to look the class up by name and making a standard function call:

>>> dis.dis("def f(): pass")
  1           0 LOAD_CONST               0 (<code object f at 0x..., file "<dis>", line 1>)
              3 LOAD_CONST               1 ('f')
              6 MAKE_FUNCTION            0
              9 STORE_NAME               0 (f)
             12 LOAD_CONST               2 (None)
             15 RETURN_VALUE
>>> dis.dis("c = C()")
  1           0 LOAD_NAME                0 (C)
              3 CALL_FUNCTION            0 (0 positional, 0 keyword pair)
              6 STORE_NAME               1 (c)
              9 LOAD_CONST               0 (None)
             12 RETURN_VALUE

For the second question:

$ python3 -m timeit -s "def f(arg): pass" "f(None)"
10000000 loops, best of 3: 0.111 usec per loop
[ncoghlan at thechalk ~]$ python3 -m timeit -s "class C:" -s " def __call__(self, arg): pass" -s "c = C()" "c(None)"
1000000 loops, best of 3: 0.232 usec per loop

Again, we see that the native function outperforms the class with a custom __call__ method. There's no difference in the bytecode this time, but rather a difference in what happens inside the CALL_FUNCTION opcode: for the second case, we first have to retrieve the bound c.__call__() method, and then call *that* as c.__call__(None), which in turn internally calls C.__call__(c, None), while for the native function case we get to skip straight to running the called function. The speed difference can be significantly reduced (but not entirely eliminated), by caching the bound method during setup:

$ python3 -m timeit -s "class C:" -s " def __call__(self, arg): pass" -s "c_call = C().__call__" "c_call(None)"
10000000 loops, best of 3: 0.115 usec per loop

Finally, we get to the question of relative size: are function instances larger or smaller than your typical class instance?
Again, we don't have to guess, we can use the interpreter to experiment and check our assumptions:

>>> import sys
>>> def f(): pass
...
>>> sys.getsizeof(f)
136
>>> class C(): pass
...
>>> sys.getsizeof(C())
56

That's a potentially noticeable difference if we're applying the wrapper often enough - the native function is 80 bytes larger than an empty standard class instance. Looking at the available data attributes on f, we can see the likely causes of the difference:

>>> set(dir(f)) - set(dir(C()))
{'__code__', '__defaults__', '__name__', '__closure__', '__get__', '__kwdefaults__', '__qualname__', '__annotations__', '__globals__', '__call__'}

There are 10 additional attributes there, although 2 of them (__get__ and __call__) relate to methods our native function has defined, but the empty class doesn't. The other 8 represent additional pieces of data stored (or potentially stored) per function, that we don't store for a typical class instance. However, we also need to account for the overhead of defining a new class object, and that's a non-trivial amount of memory when we're talking about a size difference of only 80 bytes per wrapped function:

>>> sys.getsizeof(C)
976

That means if a wrapper function is only used a few times in any given run of the program, then native functions will be faster *and* use less memory (at least on CPython). If the wrapper is used more often than that, then native functions will still be the fastest option, but not the lowest memory option. Furthermore, if we decide to cache the bound __call__ method to reduce the speed impact of using a custom __call__ method, we give up most of the memory gains:

>>> sys.getsizeof(C().__call__)
64

This all suggests that if your application is severely memory constrained (e.g. it's running on an embedded interpreter like MicroPython), then it *might* make sense to incur the extra complexity of using classes with a custom __call__ method to define wrapper functions, over just using a nested function. For more typical cases though, the difference is going to disappear into the noise, so you're likely to be better off defaulting to using nested function definitions, and only switching to the class based version in cases where it's more readable and maintainable (and in those cases considering whether or not it might make sense to return the bound __call__ method from the decorator, rather than the callable itself). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From surya.subbarao1 at gmail.com Sat Jan 2 22:00:38 2016 From: surya.subbarao1 at gmail.com (u8y7541 The Awesome Person) Date: Sat, 2 Jan 2016 19:00:38 -0800 Subject: [Python-ideas] Bad programming style in decorators? In-Reply-To: References: <8C9E6C07-9563-456E-88EB-D9C4A70F9235@yahoo.com> Message-ID: The wrapper functions themselves, though, exist in a one:one > correspondence with the functions they're applied to - when you apply > functools.lru_cache to a function, the transient decorator produced by > the decorator factory only lasts as long as the execution of the > function definition, but the wrapper function lasts for as long as the > wrapped function does, and gets invoked every time that function is > called (and if a function is performance critical enough for the > results to be worth caching, then it's likely performance critical > enough to be thinking about micro-optimisations). (Nick Coghlan) Yes, that is what I was thinking of. Just like Quake's fast inverse square root.
Even though it is a micro-optimization, it greatly affects how fast the game runs. But, as I explained, the function will _not_ be redefined and trashed every > frame; it will be created one time. (Andrew Barnert) Hmm... Nick says different... This all suggests that if your application is severely memory > constrained (e.g. it's running on an embedded interpreter like > MicroPython), then it *might* make sense to incur the extra complexity > of using classes with a custom __call__ method to define wrapper > functions, over just using a nested function. (Nick Coghlan) Yes, I was thinking of that when I started this thread, but this thread is just from my speculation. -- -Surya Subbarao -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Sat Jan 2 22:34:48 2016 From: guido at python.org (Guido van Rossum) Date: Sat, 2 Jan 2016 20:34:48 -0700 Subject: [Python-ideas] Bad programming style in decorators? In-Reply-To: References: <8C9E6C07-9563-456E-88EB-D9C4A70F9235@yahoo.com> Message-ID: If you want more discussion please discuss a specific example, showing the decorator code itself (not just the decorator call). The upshot is that creating a function object is pretty efficient and probably more efficient than instantiating a class -- if you don't believe that write a micro-benchmark. On Sat, Jan 2, 2016 at 8:00 PM, u8y7541 The Awesome Person < surya.subbarao1 at gmail.com> wrote: > > > The wrapper functions themselves, though, exist in a one:one >> correspondence with the functions they're applied to - when you apply >> functools.lru_cache to a function, the transient decorator produced by >> the decorator factory only lasts as long as the execution of the >> function definition, but the wrapper function lasts for as long as the >> wrapped function does, and gets invoked every time that function is >> called (and if a function is performance critical enough for the >> results to be worth caching, then it's likely performance critical >> enough to be thinking about micro-optimisations). (Nick Coghlan) > > > Yes, that is what I was thinking of. Just like Quake's fast inverse square > root. Even though it is a micro-optimization, it greatly affects how fast > the game runs. > > But, as I explained, the function will _not_ be redefined and trashed >> every frame; it will be created one time. (Andrew Barnert) > > > Hmm... Nick says different... > > This all suggests that if your application is severely memory >> constrained (e.g. it's running on an embedded interpreter like >> MicroPython), then it *might* make sense to incur the extra complexity >> of using classes with a custom __call__ method to define wrapper >> functions, over just using a nested function. (Nick Coghlan) > > > Yes, I was thinking of that when I started this thread, but this thread is > just from my speculation. > > > -- > -Surya Subbarao > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Sat Jan 2 22:39:01 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 3 Jan 2016 13:39:01 +1000 Subject: [Python-ideas] Bad programming style in decorators? 
In-Reply-To: References: <8C9E6C07-9563-456E-88EB-D9C4A70F9235@yahoo.com> Message-ID: On 3 January 2016 at 13:00, u8y7541 The Awesome Person wrote: > >> The wrapper functions themselves, though, exist in a one:one >> correspondence with the functions they're applied to - when you apply >> functools.lru_cache to a function, the transient decorator produced by >> the decorator factory only lasts as long as the execution of the >> function definition, but the wrapper function lasts for as long as the >> wrapped function does, and gets invoked every time that function is >> called (and if a function is performance critical enough for the >> results to be worth caching, then it's likely performance critical >> enough to be thinking about micro-optimisations). (Nick Coghlan) > > Yes, that is what I was thinking of. Just like Quake's fast inverse square > root. Even though it is a micro-optimization, it greatly affects how fast > the game runs. For Python, much bigger performance pay-offs are available without changing the code by adopting tools like PyPy, Cython and Numba. Worrying about micro-optimisations like this usually only makes sense if a profiler has identified the code as a hotspot for your particular workload (and sometimes not even then). >> But, as I explained, the function will _not_ be redefined and trashed >> every frame; it will be created one time. (Andrew Barnert) > > Hmm... Nick says different... > >> This all suggests that if your application is severely memory >> constrained (e.g. it's running on an embedded interpreter like >> MicroPython), then it *might* make sense to incur the extra complexity >> of using classes with a custom __call__ method to define wrapper >> functions, over just using a nested function. (Nick Coghlan) The memory difference is only per function defined using the wrapper, not per call. The second speed difference I described (how long the CALL_FUNCTION opcode takes) is per call, and there native functions are the clear winner (followed by bound methods, and custom callable objects a relatively distant third). The other thing to keep in mind is that the examples I showed were focused specifically on measuring the differences in overhead, so the function bodies don't actually do anything, and the class instances didn't contain any state of their own. Adding even a single instance/closure variable is likely to swamp the differences in memory consumption between a native function and a class instance. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From guido at python.org Sat Jan 2 22:48:04 2016 From: guido at python.org (Guido van Rossum) Date: Sat, 2 Jan 2016 20:48:04 -0700 Subject: [Python-ideas] Bad programming style in decorators? In-Reply-To: References: <8C9E6C07-9563-456E-88EB-D9C4A70F9235@yahoo.com> Message-ID: Whoops, Nick already did the micro-benchmarks, and showed that creating a function object is faster than instantiating a class. He also measured the size, but I think he forgot that sys.getsizeof() doesn't report the size (recursively) of contained objects -- a class instance references a dict which is another 288 bytes (though if you care you can get rid of this by using __slots__). I expect that calling an instance using __call__ is also slower than calling a function (but you can do your own benchmarks :-). 
On Sat, Jan 2, 2016 at 8:39 PM, Nick Coghlan wrote: > On 3 January 2016 at 13:00, u8y7541 The Awesome Person > wrote: > > > >> The wrapper functions themselves, though, exist in a one:one > >> correspondence with the functions they're applied to - when you apply > >> functools.lru_cache to a function, the transient decorator produced by > >> the decorator factory only lasts as long as the execution of the > >> function definition, but the wrapper function lasts for as long as the > >> wrapped function does, and gets invoked every time that function is > >> called (and if a function is performance critical enough for the > >> results to be worth caching, then it's likely performance critical > >> enough to be thinking about micro-optimisations). (Nick Coghlan) > > > > Yes, that is what I was thinking of. Just like Quake's fast inverse > square > > root. Even though it is a micro-optimization, it greatly affects how fast > > the game runs. > > For Python, much bigger performance pay-offs are available without > changing the code by adopting tools like PyPy, Cython and Numba. > Worrying about micro-optimisations like this usually only makes sense > if a profiler has identified the code as a hotspot for your particular > workload (and sometimes not even then). > > >> But, as I explained, the function will _not_ be redefined and trashed > >> every frame; it will be created one time. (Andrew Barnert) > > > > Hmm... Nick says different... > > > >> This all suggests that if your application is severely memory > >> constrained (e.g. it's running on an embedded interpreter like > >> MicroPython), then it *might* make sense to incur the extra complexity > >> of using classes with a custom __call__ method to define wrapper > >> functions, over just using a nested function. (Nick Coghlan) > > The memory difference is only per function defined using the wrapper, > not per call. The second speed difference I described (how long the > CALL_FUNCTION opcode takes) is per call, and there native functions > are the clear winner (followed by bound methods, and custom callable > objects a relatively distant third). > > The other thing to keep in mind is that the examples I showed were > focused specifically on measuring the differences in overhead, so the > function bodies don't actually do anything, and the class instances > didn't contain any state of their own. Adding even a single > instance/closure variable is likely to swamp the differences in memory > consumption between a native function and a class instance. > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From abarnert at yahoo.com Sat Jan 2 22:50:34 2016 From: abarnert at yahoo.com (Andrew Barnert) Date: Sat, 2 Jan 2016 19:50:34 -0800 Subject: [Python-ideas] Bad programming style in decorators? 
In-Reply-To: References: <8C9E6C07-9563-456E-88EB-D9C4A70F9235@yahoo.com> Message-ID: On Jan 2, 2016, at 19:00, u8y7541 The Awesome Person wrote: > >> The wrapper functions themselves, though, exist in a one:one >> correspondence with the functions they're applied to - when you apply >> functools.lru_cache to a function, the transient decorator produced by >> the decorator factory only lasts as long as the execution of the >> function definition, but the wrapper function lasts for as long as the >> wrapped function does, and gets invoked every time that function is >> called (and if a function is performance critical enough for the >> results to be worth caching, then it's likely performance critical >> enough to be thinking about micro-optimisations). (Nick Coghlan) > > Yes, that is what I was thinking of. Just like Quake's fast inverse square root. Even though it is a micro-optimization, it greatly affects how fast the game runs. Of course micro-optimizations _can_ matter--when you're optimizing the work done in the inner loop of a program that's CPU-bound, even a few percent can make a difference. But that doesn't mean they _always_ matter. Saving 50ns in some code that runs thousands of times per frame makes a difference; saving 50ns in some code that happens once at startup does not. That's why we have profiling tools: so you can find the bit of your program where you're spending 99% of your time doing something a billion times, and optimize that part. And it also doesn't mean that everything that sounds like it should be lighter is worth doing. You have to actually test it and see. In the typical case where you're replacing one function object with one class object and one instance object, that's actually taking more space, not less. >> But, as I explained, the function will _not_ be redefined and trashed every frame; it will be created one time. (Andrew Barnert) > > Hmm... Nick says different... No, Nick doesn't say different. Read it again. The wrapper function lives as long as the wrapped function lives. It doesn't get created anew each time you call it. If you don't understand this, it may help to profile [fib(i) for i in range(10000)]. You'll see that the wrapper function gets called a ton of times, the wrapped function gets called 10000 times, and the factory function (which created wrapper functions) gets called 0 times. >> This all suggests that if your application is severely memory >> constrained (e.g. it's running on an embedded interpreter like >> MicroPython), then it *might* make sense to incur the extra complexity >> of using classes with a custom __call__ method to define wrapper >> functions, over just using a nested function. (Nick Coghlan) > > Yes, I was thinking of that when I started this thread, but this thread is just from my speculation. Nick is saying that there may be some cases where it might make sense to use a class. That doesn't at all support your idea that tutorials should teach using classes instead of functions. In general, using functions will be faster; in the most common case, using functions will use less memory; most importantly, in the vast majority of cases, it won't matter anyway. Maybe a MicroPython tutorial should have a section on how running on a machine with only 4KB changes a lot of the usual tradeoffs, using a decorator as an example. But a tutorial on decorators should show using a function, because it's the simplest, most readable way to do it. -------------- next part -------------- An HTML attachment was scrubbed...
URL: From surya.subbarao1 at gmail.com Sat Jan 2 22:56:08 2016 From: surya.subbarao1 at gmail.com (u8y7541 The Awesome Person) Date: Sat, 2 Jan 2016 19:56:08 -0800 Subject: [Python-ideas] Bad programming style in decorators? In-Reply-To: References: <8C9E6C07-9563-456E-88EB-D9C4A70F9235@yahoo.com> Message-ID: > > If you don't understand this, it may help to profile [fib(i) for i in > range(10000)]. You'll see that the wrapper function gets called a ton of > times, the wrapped function gets called 10000 times, and the factory > function (which created wrapper functions) gets called 0 times. Ah, I see now. Thank you. On Sat, Jan 2, 2016 at 7:50 PM, Andrew Barnert wrote: > On Jan 2, 2016, at 19:00, u8y7541 The Awesome Person < > surya.subbarao1 at gmail.com> wrote: > > The wrapper functions themselves, though, exist in a one:one >> correspondence with the functions they're applied to - when you apply >> functools.lru_cache to a function, the transient decorator produced by >> the decorator factory only lasts as long as the execution of the >> function definition, but the wrapper function lasts for as long as the >> wrapped function does, and gets invoked every time that function is >> called (and if a function is performance critical enough for the >> results to be worth caching, then it's likely performance critical >> enough to be thinking about micro-optimisations). (Nick Coghlan) > > Yes, that is what I was thinking of. Just like Quake's fast inverse square > root. Even though it is a micro-optimization, it greatly affects how fast > the game runs. > > Of course micro-optimizations _can_ matter--when you're optimizing the > work done in the inner loop of a program that's CPU-bound, even a few > percent can make a difference. > > But that doesn't mean they _always_ matter. Saving 50ns in some code that > runs thousands of times per frame makes a difference; saving 50ns in some > code that happens once at startup does not. That's why we have profiling > tools: so you can find the bit of your program where you're spending 99% of > your time doing something a billion times, and optimize that part. > > And it also doesn't mean that everything that sounds like it should be > lighter is worth doing. You have to actually test it and see. In the > typical case where you're replacing one function object with one class > object and one instance object, that's actually taking more space, not less. > > But, as I explained, the function will _not_ be redefined and trashed >> every frame; it will be created one time. (Andrew Barnert) > > Hmm... Nick says different... > > No, Nick doesn't say different. Read it again. The wrapper function lives > as long as the wrapped function lives. It doesn't get created anew each > time you call it. > > If you don't understand this, it may help to profile [fib(i) for i in > range(10000)]. You'll see that the wrapper function gets called a ton of > times, the wrapped function gets called 10000 times, and the factory > function (which created wrapper functions) gets called 0 times. > > This all suggests that if your application is severely memory >> constrained (e.g. it's running on an embedded interpreter like >> MicroPython), then it *might* make sense to incur the extra complexity >> of using classes with a custom __call__ method to define wrapper >> functions, over just using a nested function. (Nick Coghlan) > > Yes, I was thinking of that when I started this thread, but this thread is > just from my speculation.
> > > Nick is saying that there may be some cases where it might make sense to > use a class. That doesn't at all support your idea that tutorials should > teach using classes instead of functions. In general, using functions will > be faster; in the most common case, using functions will use less memory; > most importantly, in the vast majority of cases, it won't matter anyway. > Maybe a MicroPython tutorial should have a section on how running on a > machine with only 4KB changes a lot of the usual tradeoffs, using a > decorator as an example. But a tutorial on decorators should show using a > function, because it's the simplest, most readable way to do it. > -- -Surya Subbarao -------------- next part -------------- An HTML attachment was scrubbed... URL: From random832 at fastmail.com Sat Jan 2 23:01:06 2016 From: random832 at fastmail.com (Random832) Date: Sat, 02 Jan 2016 23:01:06 -0500 Subject: [Python-ideas] Bad programming style in decorators? References: <8C9E6C07-9563-456E-88EB-D9C4A70F9235@yahoo.com> Message-ID: Nick Coghlan writes: > (and if a function is performance critical enough for the > results to be worth caching, then it's likely performance critical > enough to be thinking about micro-optimisations). Maybe. It could be that the "real" implementation is Very Expensive to invoke, and/or that the characteristics of how the function is called change the complexity class of an algorithm that calls it for a cached vs non-cached version. From ncoghlan at gmail.com Sun Jan 3 01:42:13 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 3 Jan 2016 16:42:13 +1000 Subject: [Python-ideas] Bad programming style in decorators? In-Reply-To: References: <8C9E6C07-9563-456E-88EB-D9C4A70F9235@yahoo.com> Message-ID: On 3 January 2016 at 13:48, Guido van Rossum wrote: > Whoops, Nick already did the micro-benchmarks, and showed that creating a > function object is faster than instantiating a class. He also measured the > size, but I think he forgot that sys.getsizeof() doesn't report the size > (recursively) of contained objects -- a class instance references a dict > which is another 288 bytes (though if you care you can get rid of this by > using __slots__). You're right I forgot to account for that (54 bytes without __slots__ did seem surprisingly small!), but functions also always allocate f.__annotations__ at the moment. Always allocating f.__annotations__ actually puzzled me a bit - did we do that for a specific reason, or did we just not think of setting it to None when it's unused to save space the way we do for other function attributes? (__closure__, __defaults__, etc) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From storchaka at gmail.com Sun Jan 3 06:55:36 2016 From: storchaka at gmail.com (Serhiy Storchaka) Date: Sun, 3 Jan 2016 13:55:36 +0200 Subject: [Python-ideas] Bad programming style in decorators? In-Reply-To: References: <8C9E6C07-9563-456E-88EB-D9C4A70F9235@yahoo.com> Message-ID: On 03.01.16 04:42, Nick Coghlan wrote: > Finally, we get to the question of relative size: are function > instances larger or smaller than your typical class instance? Again, > we don't have to guess, we can use the interpreter to experiment and > check our assumptions: > > >>> import sys > >>> def f(): pass > ... > >>> sys.getsizeof(f) > 136 > >>> class C(): pass > ... > >>> sys.getsizeof(C()) > 56 sys.getsizeof() returns only the bare size of the object, not including the size of subobjects. 
To calculate total size you have to sum sizes of all subobjects recursively. [1] [1] http://bugs.python.org/file31822/gettotalsizeof.py From surya.subbarao1 at gmail.com Sun Jan 3 13:57:45 2016 From: surya.subbarao1 at gmail.com (u8y7541 The Awesome Person) Date: Sun, 3 Jan 2016 10:57:45 -0800 Subject: [Python-ideas] Bad programming style in decorators? In-Reply-To: References: <8C9E6C07-9563-456E-88EB-D9C4A70F9235@yahoo.com> Message-ID: Thanks for explaining the differences tho. I got confused between the decorator and the decorator factory, thinking the decorator had a function inside it. Sorry :) On Sat, Jan 2, 2016 at 10:42 PM, Nick Coghlan wrote: > On 3 January 2016 at 13:48, Guido van Rossum wrote: > > Whoops, Nick already did the micro-benchmarks, and showed that creating a > > function object is faster than instantiating a class. He also measured > the > > size, but I think he forgot that sys.getsizeof() doesn't report the size > > (recursively) of contained objects -- a class instance references a dict > > which is another 288 bytes (though if you care you can get rid of this by > > using __slots__). > > You're right I forgot to account for that (54 bytes without __slots__ > did seem surprisingly small!), but functions also always allocate > f.__annotations__ at the moment. > > Always allocating f.__annotations__ actually puzzled me a bit - did we > do that for a specific reason, or did we just not think of setting it > to None when it's unused to save space the way we do for other > function attributes? (__closure__, __defaults__, etc) > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > -- -Surya Subbarao -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Mon Jan 4 15:31:48 2016 From: guido at python.org (Guido van Rossum) Date: Mon, 4 Jan 2016 12:31:48 -0800 Subject: [Python-ideas] Deprecating the old-style sequence protocol In-Reply-To: References: <242429025.2987954.1451185648904.JavaMail.yahoo.ref@mail.yahoo.com> <242429025.2987954.1451185648904.JavaMail.yahoo@mail.yahoo.com> <567FD8D5.8090205@egenix.com> Message-ID: [Adding python-ideas back -- I'm not sure why you dropped it but it looks like an oversight, not intentional] On Fri, Jan 1, 2016 at 2:25 PM, Andrew Barnert wrote: > On Dec 27, 2015, at 09:04, Guido van Rossum wrote: > > > If we want some way to turn something that just defines __getitem__ and > __len__ into a proper sequence, it should just be made to inherit from > Sequence, which supplies the default __iter__ and __reversed__. > (Registration is *not* good enough here.) > > So, if I understand correctly, you're hoping that we can first make the > old-style sequence protocol unnecessary, except for backward compatibility, > and then maybe change the docs to only mention it for backward > compatibility, and only then deprecate it? > That sounds about right. > I think it's worth doing those first two steps, but not actually > deprecating it, at least while Python 2.7 is still around; otherwise, for > dual-version code, something like Steven D'Aprano's "Squares" type would > have to copy Indexable from the 3.x stdlib or get it from some third-party > module like six or backports.collections. > Yes, that's fine. Deprecation sometimes just has to take a really long time. 
> > If we really want a way to turn something that just supports __getitem__ > into an Iterable maybe we can provide an additional ABC for that purpose; > let's call it a HalfSequence until we've come up with a better name. (We > can't use Iterable for this because Iterable should not reference > __getitem__.) > > #25988 (using Nick's name Indexable, and the details from that post). > Oh, interesting. Though I have misgivings about that name. > > I also think it's fine to introduce Reversible as another ABC and > carefully fit it into the existing hierarchy. It should be a one-trick pony > and be another base class for Sequence; it should not have a default > implementation. (But this has been beaten to death in other threads -- it's > time to just file an issue with a patch.) > > #25987. Thanks! -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From surya.subbarao1 at gmail.com Mon Jan 4 19:04:06 2016 From: surya.subbarao1 at gmail.com (u8y7541 The Awesome Person) Date: Mon, 4 Jan 2016 16:04:06 -0800 Subject: [Python-ideas] Bad programming style in decorators? In-Reply-To: <1739359989.584131.1451933315417.JavaMail.yahoo@mail.yahoo.com> References: <1739359989.584131.1451933315417.JavaMail.yahoo@mail.yahoo.com> Message-ID: Yes, I knew this already. This is what I understood when I said I understood. I thought the *decorator* had a function inside it, not the *decorator factory*. So of course the decorator factory is only called twice. On Mon, Jan 4, 2016 at 10:48 AM, Andrew Barnert wrote: > (off-list, because I think this is no longer relevant to suggesting > changes to Python) > > On Sunday, January 3, 2016 10:58 AM, u8y7541 The Awesome Person < > surya.subbarao1 at gmail.com> wrote: > > > > >Thanks for explaining the differences tho. I got confused between the > decorator and the decorator factory, thinking the decorator had a function > inside it. Sorry :) > > I think you're _still_ confused. Not your fault, because this is confusing > stuff. The decorator actually does (usually) have a function definition in > it too. But the decorated function--the wrapper that it defines--doesn't. > And that's the thing that you usually call a zillion times, not the > decorator or the decorator factory. > > Let's make it concrete and as simple as possible, and then walk through > all the details: > > > def div(id): > def decorator(func): > > @wraps(func) > def wrapper(*args, **kw): > return "
<div id={}>{}</div>
".format(id, func(*args, > **kw)) > return wrapper > return decorator > > @div('eggs') > def eggs(): > return 'eggs' > > @div('cheese') > def cheeses(): > return '
<ul><li>gouda</li><li>edam</li></ul>
' > > for _ in range(1000000): > print(eggs()) > print(cheeses()) > > When you're importing the module and hit that "@div('eggs')", that calls > the "div" factory. The only other time that happens is at the > "@div('cheese')". So, the factory does of course create a function, but the > factory only gets called twice in your entire program. (Also, the functions > it creates become garbage as soon as they're called, so by the time you get > to the "for" loop, they've both been deleted.) > > When you finish the "def eggs():" or "def cheeses():" statement, the > decorator function "decorator" returned by "div('eggs')" or "div('cheese')" > gets called. And that decorator also creates a function. But each one only > gets called once in your entire program, and there are only two of them, so > that's only two extra function definitions. (Obviously these two aren't > garbage--they're the functions you call inside the loop.) > > When you hit that "print(eggs())" line, you're calling the decorated > function "wrapper", returned by the decorator function "decorator", > returned by the decorator factory function "div". That function does not > have a function definition inside of it. So, calling it a million times > doesn't cost anything in function definitions. > > And of course "div" itself isn't garbage--you don't need it anymore, but > if you don't tell Python "del div", it'll stick around. So, at your peak, > in the middle of that "for" loop, you have 5 function definitions around > (div, decorated eggs, original eggs, decorated cheese, original cheese). > > If you refactor things differently, you could have 5 functions, 2 class > objects, 2 class instances (your intended class-style design); or 4 > functions, 1 class object, 2 class instances, 2 bound methods, (a simple > class-style decorator); 4 functions, 1 class object, 2 class instances, > (the smallest possible class-style decorator); 2 functions but with > duplicated code and string constants (by inlining div directly into each > function); or 3 functions (with eggs and cheese explicitly calling div--I > suspect this would be actually be smallest here); etc. The difference is > going to be a few hundred bytes one way or the other, and the smallest > possible design may have a severe cost in (time) performance or in > readability. But, as you suggested, and Nick confirmed, there are cases > where it matters. Which means it's worth knowing how to write all the > different possibilities, and evaluate them analytically, and test them. > -- -Surya Subbarao -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Mon Jan 4 19:26:53 2016 From: guido at python.org (Guido van Rossum) Date: Mon, 4 Jan 2016 16:26:53 -0800 Subject: [Python-ideas] Bad programming style in decorators? In-Reply-To: References: <8C9E6C07-9563-456E-88EB-D9C4A70F9235@yahoo.com> Message-ID: On Sat, Jan 2, 2016 at 10:42 PM, Nick Coghlan wrote: > On 3 January 2016 at 13:48, Guido van Rossum wrote: > > Whoops, Nick already did the micro-benchmarks, and showed that creating a > > function object is faster than instantiating a class. He also measured > the > > size, but I think he forgot that sys.getsizeof() doesn't report the size > > (recursively) of contained objects -- a class instance references a dict > > which is another 288 bytes (though if you care you can get rid of this by > > using __slots__). 
> > You're right I forgot to account for that (54 bytes without __slots__ > did seem surprisingly small!), but functions also always allocate > f.__annotations__ at the moment. > > Always allocating f.__annotations__ actually puzzled me a bit - did we > do that for a specific reason, or did we just not think of setting it > to None when it's unused to save space the way we do for other > function attributes? (__closure__, __defaults__, etc) > Where do you see that happening? The code in funcobject.c seems to indicate that it's created on demand. (And that's how I remember it always being.) -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From abarnert at yahoo.com Mon Jan 4 21:46:38 2016 From: abarnert at yahoo.com (Andrew Barnert) Date: Mon, 4 Jan 2016 18:46:38 -0800 Subject: [Python-ideas] Deprecating the old-style sequence protocol In-Reply-To: References: <242429025.2987954.1451185648904.JavaMail.yahoo.ref@mail.yahoo.com> <242429025.2987954.1451185648904.JavaMail.yahoo@mail.yahoo.com> <567FD8D5.8090205@egenix.com> Message-ID: On Jan 4, 2016, at 12:31, Guido van Rossum wrote: > >> On Fri, Jan 1, 2016 at 2:25 PM, Andrew Barnert wrote: >> On Dec 27, 2015, at 09:04, Guido van Rossum wrote: > >> > If we really want a way to turn something that just supports __getitem__ into an Iterable maybe we can provide an additional ABC for that purpose; let's call it a HalfSequence until we've come up with a better name. (We can't use Iterable for this because Iterable should not reference __getitem__.) >> >> #25988 (using Nick's name Indexable, and the details from that post). > > Oh, interesting. Though I have misgivings about that name. Now that you mention it, I can see the confusion. I interpreted Nick's "Indexable" to mean "subscriptable by indexes (and slices of indexes)" as opposed to "subscriptable by arbitrary keys". But if I didn't already know what he intended, I suppose I could have instead guessed "usable as an index", which would be very misleading. There don't seem to be any existing terms for this that don't relate to "sequence", so maybe your HalfSequence (or Sequential or SequentiallySubscriptable or something even more horrible than that last one?) is the best option? Or, hopefully, someone _can_ come up with a better name. :) -------------- next part -------------- An HTML attachment was scrubbed... URL: From vgr255 at live.ca Mon Jan 4 22:08:18 2016 From: vgr255 at live.ca (Emanuel Barry) Date: Mon, 4 Jan 2016 22:08:18 -0500 Subject: [Python-ideas] Bad programming style in decorators? In-Reply-To: References: , <8C9E6C07-9563-456E-88EB-D9C4A70F9235@yahoo.com>, , , , , , Message-ID: Output from both 3.4.1 and 3.5.0:

>>> def foo(): pass
>>> foo.__annotations__
{}

Probably an oversight. I'm also not a C expert, but func_get_annotations (line 396 and onwards in funcobject.c) explicitly returns a new, empty dict if the function doesn't have any annotations (unlike all the other slots, like __defaults__ or __kwdefaults__, which merely return None if they're not present). From: guido at python.org Date: Mon, 4 Jan 2016 16:26:53 -0800 To: ncoghlan at gmail.com Subject: Re: [Python-ideas] Bad programming style in decorators? CC: python-ideas at python.org; surya.subbarao1 at gmail.com On Sat, Jan 2, 2016 at 10:42 PM, Nick Coghlan wrote: On 3 January 2016 at 13:48, Guido van Rossum wrote: > Whoops, Nick already did the micro-benchmarks, and showed that creating a > function object is faster than instantiating a class.
He also measured the > size, but I think he forgot that sys.getsizeof() doesn't report the size > (recursively) of contained objects -- a class instance references a dict > which is another 288 bytes (though if you care you can get rid of this by > using __slots__). You're right I forgot to account for that (54 bytes without __slots__ did seem surprisingly small!), but functions also always allocate f.__annotations__ at the moment. Always allocating f.__annotations__ actually puzzled me a bit - did we do that for a specific reason, or did we just not think of setting it to None when it's unused to save space the way we do for other function attributes? (__closure__, __defaults__, etc) Where do you see that happening? The code in funcobject.c seems to indicate that it's created on demand. (And that's how I remember it always being.) -- --Guido van Rossum (python.org/~guido) _______________________________________________ Python-ideas mailing list Python-ideas at python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From abarnert at yahoo.com Mon Jan 4 22:33:30 2016 From: abarnert at yahoo.com (Andrew Barnert) Date: Mon, 4 Jan 2016 19:33:30 -0800 Subject: [Python-ideas] Bad programming style in decorators? In-Reply-To: References: <8C9E6C07-9563-456E-88EB-D9C4A70F9235@yahoo.com> Message-ID: <7BFDB656-6EDF-4862-9849-554E4885E711@yahoo.com> On Jan 4, 2016, at 19:08, Emanuel Barry wrote: > > Output from both 3.4.1 and 3.5.0: > > >>> def foo(): pass > >>> foo.__annotations__ > {} > > Probably an oversight. I'm also not a C expert, but func_get_annotations (line 396 and onwards in funcobject.c) explicitly returns a new, empty dict if the function doesn't have any annotations (unlike all the other slots, like __defaults__ or __kwdefaults__, which merely return None if they're not present). But that code implies if you just create a new function object and never check its __annotations__, it's not wasting any space for them. Otherwise, it wouldn't have to check for NULL there. We can't test this _directly_ from Python, but with a bit of ctypes hackery and funcobject.h, we can define a PyFunctionObject(Structure)... or, keeping things a bit more concise but a lot more hacky for the purposes of email:

from ctypes import cast, POINTER, c_voidp

def func(): pass
pf = cast(id(func), POINTER(c_voidp))
assert pf[2] == id(func.__code__)
assert not pf[12]
func.__annotations__
assert pf[12]

So it is indeed NULL until you check it, and then it becomes something (an empty dict). -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Tue Jan 5 00:25:29 2016 From: guido at python.org (Guido van Rossum) Date: Mon, 4 Jan 2016 21:25:29 -0800 Subject: [Python-ideas] find-like functionality in pathlib In-Reply-To: References: <357278012.1941601.1450821295123.JavaMail.yahoo@mail.yahoo.com> Message-ID: Following up on this, in theory the right way to walk a tree using pathlib already exists, it's the rglob() method. E.g. all paths under /foo/bar should be found as follows:

for path in pathlib.Path('/foo/bar').rglob('**/*'):
    print(path)

The PermissionError bug you found is already reported: http://bugs.python.org/issue24120 -- it even has a patch but it's stuck in review. Sadly there's another error: loops introduced by symlinks cause infinite recursion. I filed that here: http://bugs.python.org/issue26012.
(The fix should be judicious use of is_symlink(), but the code is a little convoluted.) On Mon, Dec 28, 2015 at 11:25 AM, Chris Barker wrote: > On Tue, Dec 22, 2015 at 4:23 PM, Guido van Rossum > wrote: > >> The two-level iteration forced upon you by os.walk() is indeed often >> unnecessary -- but handling dirs and files separately usually makes sense, >> > > indeed, but not always, so a simple API that allows you to get a flat walk > would be nice.... > > Of course for that basic use case, you could just write your own wrapper >>> around os.walk: >>> >> > sure, but having to write "little" wrappers for common needs is > unfortunate... > > The problem isn't designing a nice walk API; it's integrating it with >>> pathlib.* >> >> > indeed -- I'd really like to see a *walk in pathlib itself. I've been > trying to use pathlib whenever I need, well, a path, but then I find I > almost immediately need to step out and use an os.path function, and have > to string-fy it anyway -- makes me wonder what the point is.. > > And honestly, if open, os.walk, etc. aren't going to work with Path >>> objects, >> >> > but they should -- of course they should..... > > Truly pushing for adoption of a new abstraction like this takes many years >> -- pathlib was new (and provisional) in 3.4 so it really hasn't been long >> enough to give up on it. The OP hasn't! >> > > it will take many years for sure -- but the standard library cold at least > adopt it as much as possible. > > Path.walk would be a nice start :-) > > My example: one of our sysadmins wanted a little script to go thorugh an > entire drive (Windows), and check if any paths were longer than 256 > characters (Windows, remember..) > > I came up with this: > > def get_all_paths(start_dir='/'): > for dirpath, dirnames, filenames in os.walk(start_dir): > for filename in filenames: > yield os.path.join(dirpath, filename) > > too_long = [] > for p in get_all_paths('/'): > print("checking:", p) > if len(p) > 255: > too_long.append(p) > print("Path too long!") > > way too wordy! > > I started with pathlib, but that just made it worse. > > now that I think about it, maybe I could have simpily used > pathlib.Path.rglob.... > > However, when I try that, I get a permission error: > > /Users/chris.barker/miniconda2/envs/py3/lib/python3.5/pathlib.py in > wrapped(pathobj, *args) > > 369 @functools.wraps(strfunc) > 370 def wrapped(pathobj, *args): > --> 371 return strfunc(str(pathobj), *args) > 372 return staticmethod(wrapped) > 373 > > PermissionError: [Errno 13] Permission denied: > '/Users/.chris.barker.xahome/caches/opendirectory' > > as the error comes insider the rglob() generator, I'm not sure how to tell > it to ignore and move on.... > > os.walk is somehow able to deal with this. > > -CHB > > -- > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From surya.subbarao1 at gmail.com Tue Jan 5 00:32:49 2016 From: surya.subbarao1 at gmail.com (u8y7541 The Awesome Person) Date: Mon, 4 Jan 2016 21:32:49 -0800 Subject: [Python-ideas] How exactly does from ... import ... work? 
Message-ID:

Suppose I have a file called randomFile.py which reads like this:

class A:
    def __init__(self, foo):
        self.foo = foo
        self.bar = bar(foo)

class B(A):
    pass

class C(B):
    pass

def bar(foo):
    return foo + 1

Suppose in another file in the same directory, I have another python program.

from randomFile import C

# some code

When C has to be imported, B also has to be imported because it is the parent. Therefore, A also has to be imported. This also results in the function bar being imported. When from ... import ... is called, does Python follow all the references and import everything that is needed, or does it just import the whole namespace (making wildcard imports acceptable :O)?

--
-Surya Subbarao
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From rosuav at gmail.com Tue Jan 5 00:36:53 2016
From: rosuav at gmail.com (Chris Angelico)
Date: Tue, 5 Jan 2016 16:36:53 +1100
Subject: [Python-ideas] How exactly does from ... import ... work?
In-Reply-To:
References:
Message-ID:

On Tue, Jan 5, 2016 at 4:32 PM, u8y7541 The Awesome Person wrote:
> Suppose I have a file called randomFile.py which reads like this:
> class A:
>     def __init__(self, foo):
>         self.foo = foo
>         self.bar = bar(foo)
> class B(A):
>     pass
> class C(B):
>     pass
> def bar(foo):
>     return foo + 1
>
> Suppose in another file in the same directory, I have another python
> program.
>
> from randomFile import C
>
> # some code
>
> When C has to be imported, B also has to be imported because it is the
> parent. Therefore, A also has to be imported. This also results in the
> function bar being imported. When from ... import ... is called, does Python
> follow all the references and import everything that is needed, or does it
> just import the whole namespace (making wildcard imports acceptable :O)?
>

Not sure why this is on -ideas; explanations of how Python already works would more normally go on python-list.

When you say "from X import Y", what Python does is, more-or-less:

import X
Y = X.Y
del X

The entire file gets executed, and then one symbol from it gets imported into the current namespace.

ChrisA

From ncoghlan at gmail.com Tue Jan 5 00:50:11 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 5 Jan 2016 15:50:11 +1000
Subject: [Python-ideas] Deprecating the old-style sequence protocol
In-Reply-To:
References: <242429025.2987954.1451185648904.JavaMail.yahoo.ref@mail.yahoo.com> <242429025.2987954.1451185648904.JavaMail.yahoo@mail.yahoo.com> <567FD8D5.8090205@egenix.com>
Message-ID:

On 5 January 2016 at 12:46, Andrew Barnert via Python-ideas wrote:
> On Jan 4, 2016, at 12:31, Guido van Rossum wrote:
>
> On Fri, Jan 1, 2016 at 2:25 PM, Andrew Barnert wrote:
>>
>> On Dec 27, 2015, at 09:04, Guido van Rossum wrote:
>
>>
>> > If we really want a way to turn something that just supports __getitem__
>> > into an Iterable maybe we can provide an additional ABC for that purpose;
>> > let's call it a HalfSequence until we've come up with a better name. (We
>> > can't use Iterable for this because Iterable should not reference
>> > __getitem__.)
>>
>> #25988 (using Nick's name Indexable, and the details from that post).
>
> Oh, interesting. Though I have misgivings about that name.
>
> Now that you mention it, I can see the confusion. I interpreted Nick's
> "Indexable" to mean "subscriptable by indexes (and slices of indexes)" as
> opposed to "subscriptable by arbitrary keys".
But if I didn't already know > what he intended, I suppose I could have instead guessed "usable as an > index", which would be very misleading. > > There don't seem to be any existing terms for this that don't relate to > "sequence", so maybe your HalfSequence (or Sequential or > SequentiallySubscriptable or something even more horrible than that last > one?) is the best option? > > Or, hopefully, someone _can_ come up with a better name. :) I mainly suggested Indexable because it was the least-worst name I could think of, and I'd previously suggested Index as the name for "has an __index__ method" (in the context of typing, but it would also work in the context of collections.abc). The main alternative I've thought of is "IterableByIndex", which is both explicit and accurate, with the only strike against it being length. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Tue Jan 5 00:55:00 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 5 Jan 2016 15:55:00 +1000 Subject: [Python-ideas] Bad programming style in decorators? In-Reply-To: References: <8C9E6C07-9563-456E-88EB-D9C4A70F9235@yahoo.com> Message-ID: On 5 January 2016 at 10:26, Guido van Rossum wrote: > On Sat, Jan 2, 2016 at 10:42 PM, Nick Coghlan wrote: >> Always allocating f.__annotations__ actually puzzled me a bit - did we >> do that for a specific reason, or did we just not think of setting it >> to None when it's unused to save space the way we do for other >> function attributes? (__closure__, __defaults__, etc) > > Where do you see that happening? The code in funcobject.c seems to indicate > that it's created on demand. (And that's how I remember it always being.) I didn't check the code, only the behaviour, so I missed that querying f.__annotations__ was implicitly creating the dictionary. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ethan at stoneleaf.us Tue Jan 5 01:04:26 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 04 Jan 2016 22:04:26 -0800 Subject: [Python-ideas] How exactly does from ... import ... work? In-Reply-To: References: Message-ID: <568B5CEA.7020503@stoneleaf.us> This list is for ideas about future Python (the language) enhancements. Please direct questions about how Python currently works to the tutor list: https://mail.python.org/mailman/listinfo/tutor/ -- ~Ethan~ From guido at python.org Tue Jan 5 01:49:18 2016 From: guido at python.org (Guido van Rossum) Date: Mon, 4 Jan 2016 22:49:18 -0800 Subject: [Python-ideas] find-like functionality in pathlib In-Reply-To: References: <357278012.1941601.1450821295123.JavaMail.yahoo@mail.yahoo.com> Message-ID: On Mon, Jan 4, 2016 at 9:25 PM, Guido van Rossum wrote: > Following up on this, in theory the right way to walk a tree using pathlib > already exists, it's the rglob() method. E.g. all paths under /foo/bar > should be found as follows: > > for path in pathlib.Path('/foo/bar').rglob('**/*'): > Whoops, I just realized that I combined two ways of doing a recursive glob here. It should be either rglob('*') or plain glob('**/*'). What I wrote produces identical results, but at the cost of a lot of caching. :-) Note that the PEP doesn't mention rglob() -- why do we even have it? It seems rglob(pat) is exactly the same as glob('**/' + path) (assuming os.sep is '/'). No TOOWTDI here? -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From tritium-list at sdamon.com Tue Jan 5 09:13:24 2016 From: tritium-list at sdamon.com (Alexander Walters) Date: Tue, 05 Jan 2016 09:13:24 -0500 Subject: [Python-ideas] Deprecating the old-style sequence protocol In-Reply-To: References: <242429025.2987954.1451185648904.JavaMail.yahoo.ref@mail.yahoo.com> <242429025.2987954.1451185648904.JavaMail.yahoo@mail.yahoo.com> <567FD8D5.8090205@egenix.com> Message-ID: <568BCF84.3040604@sdamon.com> Devils Advocate: Please don't make me press shift more than twice in a base class name if you expect me to use it. It just makes annoying avoidable typos more common. 'Subscripted' sounds good to me, if that's worth anything. On 1/5/2016 00:50, Nick Coghlan wrote: > On 5 January 2016 at 12:46, Andrew Barnert via Python-ideas > wrote: >> On Jan 4, 2016, at 12:31, Guido van Rossum wrote: >> >> On Fri, Jan 1, 2016 at 2:25 PM, Andrew Barnert wrote: >>> On Dec 27, 2015, at 09:04, Guido van Rossum wrote: >>>> If we really want a way to turn something that just supports __getitem__ >>>> into an Iterable maybe we can provide an additional ABC for that purpose; >>>> let's call it a HalfSequence until we've come up with a better name. (We >>>> can't use Iterable for this because Iterable should not reference >>>> __getitem__.) >>> #25988 (using Nick's name Indexable, and the details from that post). >> Oh, interesting. Though I have misgivings about that name. >> >> Now that you mention it, I can see the confusion. I interpreted Nick's >> "Indexable" to mean "subscriptable by indexes (and slices of indexes)" as >> opposed to "subscriptable by arbitrary keys". But if I didn't already know >> what he intended, I suppose I could have instead guessed "usable as an >> index", which would be very misleading. >> >> There don't seem to be any existing terms for this that don't relate to >> "sequence", so maybe your HalfSequence (or Sequential or >> SequentiallySubscriptable or something even more horrible than that last >> one?) is the best option? >> >> Or, hopefully, someone _can_ come up with a better name. :) > I mainly suggested Indexable because it was the least-worst name I > could think of, and I'd previously suggested Index as the name for > "has an __index__ method" (in the context of typing, but it would also > work in the context of collections.abc). > > The main alternative I've thought of is "IterableByIndex", which is > both explicit and accurate, with the only strike against it being > length. > > Cheers, > Nick. > From chris.barker at noaa.gov Tue Jan 5 11:30:00 2016 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Tue, 5 Jan 2016 08:30:00 -0800 Subject: [Python-ideas] find-like functionality in pathlib In-Reply-To: References: <357278012.1941601.1450821295123.JavaMail.yahoo@mail.yahoo.com> Message-ID: <9171176375587048097@unknownmsgid> Thanks for following up. it's the rglob() method. E.g. all paths under /foo/bar should be found as follows: for path in pathlib.Path('/foo/bar').rglob('**/*'): print(path) The PermissionError bug you found is already reported: http://bugs.python.org/issue24120 -- it even has a patch but it's stuck in review. Thanks for pinging that -- I had somehow assumed that the PermissionError was intentional. Sadly there's another error: loops introduced by symlinks cause infinite recursion. I filed that here: http://bugs.python.org/issue26012. (The fix should be judicious use of is_symlink(), but the code is a little convoluted.) 
Thanks, -CHB On Mon, Dec 28, 2015 at 11:25 AM, Chris Barker wrote: > On Tue, Dec 22, 2015 at 4:23 PM, Guido van Rossum > wrote: > >> The two-level iteration forced upon you by os.walk() is indeed often >> unnecessary -- but handling dirs and files separately usually makes sense, >> > > indeed, but not always, so a simple API that allows you to get a flat walk > would be nice.... > > Of course for that basic use case, you could just write your own wrapper >>> around os.walk: >>> >> > sure, but having to write "little" wrappers for common needs is > unfortunate... > > The problem isn't designing a nice walk API; it's integrating it with >>> pathlib.* >> >> > indeed -- I'd really like to see a *walk in pathlib itself. I've been > trying to use pathlib whenever I need, well, a path, but then I find I > almost immediately need to step out and use an os.path function, and have > to string-fy it anyway -- makes me wonder what the point is.. > > And honestly, if open, os.walk, etc. aren't going to work with Path >>> objects, >> >> > but they should -- of course they should..... > > Truly pushing for adoption of a new abstraction like this takes many years >> -- pathlib was new (and provisional) in 3.4 so it really hasn't been long >> enough to give up on it. The OP hasn't! >> > > it will take many years for sure -- but the standard library cold at least > adopt it as much as possible. > > Path.walk would be a nice start :-) > > My example: one of our sysadmins wanted a little script to go thorugh an > entire drive (Windows), and check if any paths were longer than 256 > characters (Windows, remember..) > > I came up with this: > > def get_all_paths(start_dir='/'): > for dirpath, dirnames, filenames in os.walk(start_dir): > for filename in filenames: > yield os.path.join(dirpath, filename) > > too_long = [] > for p in get_all_paths('/'): > print("checking:", p) > if len(p) > 255: > too_long.append(p) > print("Path too long!") > > way too wordy! > > I started with pathlib, but that just made it worse. > > now that I think about it, maybe I could have simpily used > pathlib.Path.rglob.... > > However, when I try that, I get a permission error: > > /Users/chris.barker/miniconda2/envs/py3/lib/python3.5/pathlib.py in > wrapped(pathobj, *args) > > 369 @functools.wraps(strfunc) > 370 def wrapped(pathobj, *args): > --> 371 return strfunc(str(pathobj), *args) > 372 return staticmethod(wrapped) > 373 > > PermissionError: [Errno 13] Permission denied: > '/Users/.chris.barker.xahome/caches/opendirectory' > > as the error comes insider the rglob() generator, I'm not sure how to tell > it to ignore and move on.... > > os.walk is somehow able to deal with this. > > -CHB > > -- > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Tue Jan 5 11:37:54 2016 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Tue, 5 Jan 2016 08:37:54 -0800 Subject: [Python-ideas] find-like functionality in pathlib In-Reply-To: References: <357278012.1941601.1450821295123.JavaMail.yahoo@mail.yahoo.com> Message-ID: <8348957817041496645@unknownmsgid> > Note that the PEP doesn't mention rglob() -- why do we even have it? 
It seems rglob(pat) is exactly the same as glob('**/' + path) (assuming os.sep is '/'). No TOOWTDI here? Much as I believe in TOOWTDI, I like having rglob(). "**/" is the kind of magic a newbie ( like me :-) ) would have research and understand. -CHB From guido at python.org Tue Jan 5 11:45:11 2016 From: guido at python.org (Guido van Rossum) Date: Tue, 5 Jan 2016 08:45:11 -0800 Subject: [Python-ideas] Deprecating the old-style sequence protocol In-Reply-To: <568BCF84.3040604@sdamon.com> References: <242429025.2987954.1451185648904.JavaMail.yahoo.ref@mail.yahoo.com> <242429025.2987954.1451185648904.JavaMail.yahoo@mail.yahoo.com> <567FD8D5.8090205@egenix.com> <568BCF84.3040604@sdamon.com> Message-ID: Or maybe Indexable is fine after all, since the arguments to __getitem__ are supposed to be objects with an __index__ method (e.g. Integral, but not Real). BTW, Maybe Index needs to be added to numbers.py as an ABC? PEP 357, which introduced it, sounds like it pre-dates ABCs. On Tue, Jan 5, 2016 at 6:13 AM, Alexander Walters wrote: > Devils Advocate: Please don't make me press shift more than twice in a > base class name if you expect me to use it. It just makes annoying > avoidable typos more common. > > 'Subscripted' sounds good to me, if that's worth anything. > > > On 1/5/2016 00:50, Nick Coghlan wrote: > >> On 5 January 2016 at 12:46, Andrew Barnert via Python-ideas >> wrote: >> >>> On Jan 4, 2016, at 12:31, Guido van Rossum wrote: >>> >>> On Fri, Jan 1, 2016 at 2:25 PM, Andrew Barnert >>> wrote: >>> >>>> On Dec 27, 2015, at 09:04, Guido van Rossum wrote: >>>> >>>>> If we really want a way to turn something that just supports >>>>> __getitem__ >>>>> into an Iterable maybe we can provide an additional ABC for that >>>>> purpose; >>>>> let's call it a HalfSequence until we've come up with a better name. >>>>> (We >>>>> can't use Iterable for this because Iterable should not reference >>>>> __getitem__.) >>>>> >>>> #25988 (using Nick's name Indexable, and the details from that post). >>>> >>> Oh, interesting. Though I have misgivings about that name. >>> >>> Now that you mention it, I can see the confusion. I interpreted Nick's >>> "Indexable" to mean "subscriptable by indexes (and slices of indexes)" as >>> opposed to "subscriptable by arbitrary keys". But if I didn't already >>> know >>> what he intended, I suppose I could have instead guessed "usable as an >>> index", which would be very misleading. >>> >>> There don't seem to be any existing terms for this that don't relate to >>> "sequence", so maybe your HalfSequence (or Sequential or >>> SequentiallySubscriptable or something even more horrible than that last >>> one?) is the best option? >>> >>> Or, hopefully, someone _can_ come up with a better name. :) >>> >> I mainly suggested Indexable because it was the least-worst name I >> could think of, and I'd previously suggested Index as the name for >> "has an __index__ method" (in the context of typing, but it would also >> work in the context of collections.abc). >> >> The main alternative I've thought of is "IterableByIndex", which is >> both explicit and accurate, with the only strike against it being >> length. >> >> Cheers, >> Nick. >> >> > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From tritium-list at sdamon.com Tue Jan 5 12:17:33 2016 From: tritium-list at sdamon.com (Alexander Walters) Date: Tue, 05 Jan 2016 12:17:33 -0500 Subject: [Python-ideas] Deprecating the old-style sequence protocol In-Reply-To: References: <242429025.2987954.1451185648904.JavaMail.yahoo.ref@mail.yahoo.com> <242429025.2987954.1451185648904.JavaMail.yahoo@mail.yahoo.com> <567FD8D5.8090205@egenix.com> <568BCF84.3040604@sdamon.com> Message-ID: <568BFAAD.5000804@sdamon.com> On 1/5/2016 11:45, Guido van Rossum wrote: > since the arguments to __getitem__ are supposed to be objects with an > __index__ method ...In the context of classic iteration only? From abarnert at yahoo.com Tue Jan 5 13:17:42 2016 From: abarnert at yahoo.com (Andrew Barnert) Date: Tue, 5 Jan 2016 10:17:42 -0800 Subject: [Python-ideas] Deprecating the old-style sequence protocol In-Reply-To: <568BFAAD.5000804@sdamon.com> References: <242429025.2987954.1451185648904.JavaMail.yahoo.ref@mail.yahoo.com> <242429025.2987954.1451185648904.JavaMail.yahoo@mail.yahoo.com> <567FD8D5.8090205@egenix.com> <568BCF84.3040604@sdamon.com> <568BFAAD.5000804@sdamon.com> Message-ID: <42B3F798-36AE-4608-9D74-BE6306F4A40B@yahoo.com> On Jan 5, 2016, at 09:17, Alexander Walters wrote: > >> On 1/5/2016 11:45, Guido van Rossum wrote: >> since the arguments to __getitem__ are supposed to be objects with an __index__ method > > ...In the context of classic iteration only? Basically, in the context of what makes a sequence different from a mapping. The idea here is to have a way to signal that a class follows the old-style sequence protocol, as opposed to being a mapping or some other use of __getitem__: you can access its elements by subscripting it with indexes from 0 up to the first one that raises IndexError (or up to __len__, if you provide it, but that isn't necessary). But this doesn't have to be airtight for type proofs or anything; if your class inherits from Indexable but then accepts integers too large to fit in an Index, that's fine. From srkunze at mail.de Tue Jan 5 13:41:11 2016 From: srkunze at mail.de (Sven R. Kunze) Date: Tue, 5 Jan 2016 19:41:11 +0100 Subject: [Python-ideas] find-like functionality in pathlib In-Reply-To: References: <357278012.1941601.1450821295123.JavaMail.yahoo@mail.yahoo.com> Message-ID: <568C0E47.30703@mail.de> Don't get me wrong but either glob('**/*') and rglob('*') sounds quite cryptic. Furthermore, globbing always sounds slow to me. Is it fast? And is there some way to leave out the '*' (three special characters for plain ol'everything)? And how can I walk directories only and files only? On 05.01.2016 07:49, Guido van Rossum wrote: > On Mon, Jan 4, 2016 at 9:25 PM, Guido van Rossum > wrote: > > Following up on this, in theory the right way to walk a tree using > pathlib already exists, it's the rglob() method. E.g. all paths > under /foo/bar should be found as follows: > > for path in pathlib.Path('/foo/bar').rglob('**/*'): > > > Whoops, I just realized that I combined two ways of doing a recursive > glob here. It should be either rglob('*') or plain glob('**/*'). What > I wrote produces identical results, but at the cost of a lot of > caching. :-) Best, Sven -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From guido at python.org Tue Jan 5 15:21:09 2016 From: guido at python.org (Guido van Rossum) Date: Tue, 5 Jan 2016 12:21:09 -0800 Subject: [Python-ideas] find-like functionality in pathlib In-Reply-To: <8348957817041496645@unknownmsgid> References: <357278012.1941601.1450821295123.JavaMail.yahoo@mail.yahoo.com> <8348957817041496645@unknownmsgid> Message-ID: On Tue, Jan 5, 2016 at 8:37 AM, Chris Barker - NOAA Federal < chris.barker at noaa.gov> wrote: > > Note that the PEP doesn't mention rglob() -- why do we even have it? It > seems rglob(pat) is exactly the same as glob('**/' + path) (assuming os.sep > is '/'). No TOOWTDI here? > > Much as I believe in TOOWTDI, I like having rglob(). "**/" is the kind > of magic a newbie ( like me :-) ) would have research and understand. > Sure. It's too late to remove it anyway. Is there anything actionable here besides fixing the PermissionError and the behavior under symlink loops? IMO if you want files only or directories only you can just add a filter using e.g. is_dir(): p = pathlib.Path.cwd() real_dirs = [p for p in p.rglob('*') if p.is_dir() and not p.is_symlink()] -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From moloney at ohsu.edu Tue Jan 5 15:27:24 2016 From: moloney at ohsu.edu (Brendan Moloney) Date: Tue, 5 Jan 2016 20:27:24 +0000 Subject: [Python-ideas] find-like functionality in pathlib In-Reply-To: References: <357278012.1941601.1450821295123.JavaMail.yahoo@mail.yahoo.com> <8348957817041496645@unknownmsgid>, Message-ID: <5F6A858FD00E5F4A82E3206D2D854EF892D48AA2@EXMB09.ohsu.edu> The main issue is the lack of stat caching. That is why I wrote my own module around scandir which includes the DirEntry objects for each path so that the consumer can also do stuff with the cached stat info (like check if it is a file or directory). Often we won't need to call stat on the path at all, and if we do it will only be once. Brendan Moloney Research Associate Advanced Imaging Research Center Oregon Health Science University ________________________________ From: Python-ideas [python-ideas-bounces+moloney=ohsu.edu at python.org] on behalf of Guido van Rossum [guido at python.org] Sent: Tuesday, January 05, 2016 12:21 PM To: Chris Barker - NOAA Federal Cc: Python-Ideas Subject: Re: [Python-ideas] find-like functionality in pathlib On Tue, Jan 5, 2016 at 8:37 AM, Chris Barker - NOAA Federal > wrote: > Note that the PEP doesn't mention rglob() -- why do we even have it? It seems rglob(pat) is exactly the same as glob('**/' + path) (assuming os.sep is '/'). No TOOWTDI here? Much as I believe in TOOWTDI, I like having rglob(). "**/" is the kind of magic a newbie ( like me :-) ) would have research and understand. Sure. It's too late to remove it anyway. Is there anything actionable here besides fixing the PermissionError and the behavior under symlink loops? IMO if you want files only or directories only you can just add a filter using e.g. is_dir(): p = pathlib.Path.cwd() real_dirs = [p for p in p.rglob('*') if p.is_dir() and not p.is_symlink()] -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From guido at python.org Tue Jan 5 16:04:21 2016 From: guido at python.org (Guido van Rossum) Date: Tue, 5 Jan 2016 13:04:21 -0800 Subject: [Python-ideas] find-like functionality in pathlib In-Reply-To: <5F6A858FD00E5F4A82E3206D2D854EF892D48AA2@EXMB09.ohsu.edu> References: <357278012.1941601.1450821295123.JavaMail.yahoo@mail.yahoo.com> <8348957817041496645@unknownmsgid> <5F6A858FD00E5F4A82E3206D2D854EF892D48AA2@EXMB09.ohsu.edu> Message-ID: On Tue, Jan 5, 2016 at 12:27 PM, Brendan Moloney wrote: > The main issue is the lack of stat caching. That is why I wrote my own > module around scandir which includes the DirEntry objects for each path so > that the consumer can also do stuff with the cached stat info (like check > if it is a file or directory). Often we won't need to call stat on the path > at all, and if we do it will only be once. > I wonder if stat() caching shouldn't be made an orthogonal optional feature of Path objects somehow; it keeps coming back as useful in various cases even though we don't want to enable it by default. One problem with stat() caching is that Path objects are considered immutable, and two Path objects referring to the same path are completely interchangeable. For example, {pathlib.Path('/a'), pathlib.Path('/a')} is a set of length 1: {PosixPath('/a')}. But if we had e.g. Path('/a', cache_stat=True), the behavior of two instances of that object might be observably different (if they were instantiated at times when the contents of the filesystem was different). So maybe stat-caching Path instances should be considered unequal, or perhaps unhashable. Or perhaps they should only be considered equal if their stat() values are actually equal (i.e. if the file's stat() info didn't change). . So this is a thorny issue that requires some real thought before we commit to an API. We might also want to create Path instances directly from DirEntry objects. (Interesting, the DirEntry API seems to be a subset of the Path API, except for the .path attribute which is equivalent to the str() of a Path object.) Maybe some of this can be done first as a 3rd party module forked from the original 3rd party pathlib? https://bitbucket.org/pitrou/pathlib/ seems reasonably up to date. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From barry at python.org Tue Jan 5 18:49:21 2016 From: barry at python.org (Barry Warsaw) Date: Tue, 5 Jan 2016 18:49:21 -0500 Subject: [Python-ideas] PEP 9 - plaintext PEP format - is officially deprecated Message-ID: <20160105184921.317ac5ec@limelight.wooz.org> I don't think this will be at all controversial. Brett suggested, and there was no disagreement from the PEP editors, that plain text PEPs be deprecated. reStructuredText is clearly a better format, and all recent PEP submissions have been in reST for a while now anyway. I am therefore withdrawing[*] PEP 9 and have made other appropriate changes to make it clear that only PEP 12 format is acceptable going forward. The PEP editors will not be converting the legacy PEPs to reST, nor will we currently be renaming the relevant PEP source files to end with ".rst" since there's too much tooling that would have to change to do so. However, if either task really interests you, please get in touch with the PEP editors. it-only-took-15-years-ly y'rs, -Barry (on behalf of the PEP editors) [*] Status: Withdrawn being about the only currently appropriate resolution status for process PEPs. 
-------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From greg.ewing at canterbury.ac.nz Tue Jan 5 17:59:49 2016 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 06 Jan 2016 11:59:49 +1300 Subject: [Python-ideas] find-like functionality in pathlib In-Reply-To: References: <357278012.1941601.1450821295123.JavaMail.yahoo@mail.yahoo.com> <8348957817041496645@unknownmsgid> <5F6A858FD00E5F4A82E3206D2D854EF892D48AA2@EXMB09.ohsu.edu> Message-ID: <568C4AE5.7020407@canterbury.ac.nz> Guido van Rossum wrote: > I wonder if stat() caching shouldn't be made an orthogonal optional > feature of Path objects somehow; it keeps coming back as useful in > various cases even though we don't want to enable it by default. Maybe path.stat() could return a PathWithStat object that inherits from Path and can do everything that a Path can do, but also contains cached stat info and has a suitable set of attributes for accessing it. This would make it clear at what point in time the info is valid for, i.e. the moment you called stat(). It would also provide an obvious way to refresh the info: calling path_with_stat.stat() would give you a new PathWithStat containing updated info. Things like scandir could then return pre-populated PathWithStat objects. -- Greg From guido at python.org Tue Jan 5 19:02:41 2016 From: guido at python.org (Guido van Rossum) Date: Tue, 5 Jan 2016 16:02:41 -0800 Subject: [Python-ideas] find-like functionality in pathlib In-Reply-To: <568C4AE5.7020407@canterbury.ac.nz> References: <357278012.1941601.1450821295123.JavaMail.yahoo@mail.yahoo.com> <8348957817041496645@unknownmsgid> <5F6A858FD00E5F4A82E3206D2D854EF892D48AA2@EXMB09.ohsu.edu> <568C4AE5.7020407@canterbury.ac.nz> Message-ID: On Tue, Jan 5, 2016 at 2:59 PM, Greg Ewing wrote: > Guido van Rossum wrote: > >> I wonder if stat() caching shouldn't be made an orthogonal optional >> feature of Path objects somehow; it keeps coming back as useful in various >> cases even though we don't want to enable it by default. >> > > Maybe path.stat() could return a PathWithStat object that > inherits from Path and can do everything that a Path can > do, but also contains cached stat info and has a suitable > set of attributes for accessing it. > Well, Path.stat() is already defined and returns the same type of object that os.stat() returns, and I don't think we should change that. We could add a new method that does this, but as long as it inherits from Path it wouldn't really address the issue with objects being == to each other but holding different stat info. > This would make it clear at what point in time the info > is valid for, i.e. the moment you called stat(). It would > also provide an obvious way to refresh the info: calling > path_with_stat.stat() would give you a new PathWithStat > containing updated info. > > Things like scandir could then return pre-populated > PathWithStat objects. I presume you are proposing a new Path.scandir() method -- the existing os.scandir() method already returns DirEntry objects which we really don't want to change at this point. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From random832 at fastmail.com Wed Jan 6 11:11:13 2016 From: random832 at fastmail.com (Random832) Date: Wed, 06 Jan 2016 11:11:13 -0500 Subject: [Python-ideas] find-like functionality in pathlib In-Reply-To: References: <357278012.1941601.1450821295123.JavaMail.yahoo@mail.yahoo.com> <8348957817041496645@unknownmsgid> <5F6A858FD00E5F4A82E3206D2D854EF892D48AA2@EXMB09.ohsu.edu> Message-ID: <1452096673.440928.484566490.207F90DE@webmail.messagingengine.com> On Tue, Jan 5, 2016, at 16:04, Guido van Rossum wrote: > One problem with stat() caching is that Path objects are considered > immutable, and two Path objects referring to the same path are completely > interchangeable. For example, {pathlib.Path('/a'), pathlib.Path('/a')} is > a set of length 1: {PosixPath('/a')}. But if we had e.g. Path('/a', > cache_stat=True), the behavior of two instances of that object might be > observably different (if they were instantiated at times when the > contents of the filesystem was different). So maybe stat-caching Path instances > should be considered unequal, or perhaps unhashable. Or perhaps they > should only be considered equal if their stat() values are actually equal (i.e. > if the file's stat() info didn't change). What about a global cache? From guido at python.org Wed Jan 6 11:48:30 2016 From: guido at python.org (Guido van Rossum) Date: Wed, 6 Jan 2016 08:48:30 -0800 Subject: [Python-ideas] find-like functionality in pathlib In-Reply-To: <1452096673.440928.484566490.207F90DE@webmail.messagingengine.com> References: <357278012.1941601.1450821295123.JavaMail.yahoo@mail.yahoo.com> <8348957817041496645@unknownmsgid> <5F6A858FD00E5F4A82E3206D2D854EF892D48AA2@EXMB09.ohsu.edu> <1452096673.440928.484566490.207F90DE@webmail.messagingengine.com> Message-ID: On Wed, Jan 6, 2016 at 8:11 AM, Random832 wrote: > On Tue, Jan 5, 2016, at 16:04, Guido van Rossum wrote: > > One problem with stat() caching is that Path objects are considered > > immutable, and two Path objects referring to the same path are completely > > interchangeable. For example, {pathlib.Path('/a'), pathlib.Path('/a')} is > > a set of length 1: {PosixPath('/a')}. But if we had e.g. Path('/a', > > cache_stat=True), the behavior of two instances of that object might be > > observably different (if they were instantiated at times when the > > contents of the filesystem was different). So maybe stat-caching Path > instances > > should be considered unequal, or perhaps unhashable. Or perhaps they > > should only be considered equal if their stat() values are actually > equal (i.e. > > if the file's stat() info didn't change). > > What about a global cache? It would have to use a weak dict so if the last reference goes away it discards the cached stats for a given path, otherwise you'd have trouble containing the cache size. And caching Path objects should still not be comparable to non-caching Path objects (which we will need to preserve the semantics that repeatedly calling stat() on a Path object created the default way will always redo the syscall). The main advantage would be that caching Path objects could be compared safely. It could still cause unexpected results. E.g. if you have just traversed some big tree using caching, and saved some results (so hanging on to some paths and hence their stat() results), and then you make some changes and traverse it again to look for something else, you might accidentally be seeing stale (i.e. cached) stat() results. 
Maybe there's a middle ground, where the user can create a StatCache object and pass it into Path creation and traversal operations. Paths with the same StatCache object (or both None) compare equal if their path components are equal. Paths with different StatCache objects never compare equal (but otherwise are ordered by path as usual -- the StatCache object's identity is only used when the paths are equal).

Are you (or anyone still reading this) interested in implementing this idea?

--
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From srkunze at mail.de Wed Jan 6 12:04:38 2016
From: srkunze at mail.de (Sven R. Kunze)
Date: Wed, 6 Jan 2016 18:04:38 +0100
Subject: [Python-ideas] intuitive timedeltas like in go
Message-ID: <568D4926.8080902@mail.de>

Hi,

timedelta handling always felt cumbersome to me:

from datetime import timedelta

short_period = timedelta(seconds=10)
long_period = timedelta(hours=4, seconds=37)

Today, I came across this one https://github.com/lxc/lxd/pull/1471/files and I found the creation of a 10 seconds timeout extremely intuitive. Would this represent a valuable addition to Python?

from datetime import second, hour

short_period = 10*second
long_period = 4*hour + 37*second

Best,
Sven

From ian.g.kelly at gmail.com Wed Jan 6 12:24:28 2016
From: ian.g.kelly at gmail.com (Ian Kelly)
Date: Wed, 6 Jan 2016 10:24:28 -0700
Subject: [Python-ideas] intuitive timedeltas like in go
In-Reply-To: <568D4926.8080902@mail.de>
References: <568D4926.8080902@mail.de>
Message-ID:

On Wed, Jan 6, 2016 at 10:04 AM, Sven R. Kunze wrote:
> Hi,
>
> timedelta handling always felt cumbersome to me:
>
> from datetime import timedelta
>
> short_period = timedelta(seconds=10)
> long_period = timedelta(hours=4, seconds=37)
>
> Today, I came across this one https://github.com/lxc/lxd/pull/1471/files and
> I found the creation of a 10 seconds timeout extremely intuitive. Would this
> represent a valuable addition to Python?
>
> from datetime import second, hour
>
> short_period = 10*second
> long_period = 4*hour + 37*second

Anybody who wants this can already accomplish it with just a few extra lines:

>>> from datetime import timedelta
>>> second = timedelta(seconds=1)
>>> hour = timedelta(hours=1)
>>> 10*second
datetime.timedelta(0, 10)
>>> 4*hour + 37*second
datetime.timedelta(0, 14437)

From storchaka at gmail.com Wed Jan 6 12:36:30 2016
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Wed, 6 Jan 2016 19:36:30 +0200
Subject: [Python-ideas] intuitive timedeltas like in go
In-Reply-To: <568D4926.8080902@mail.de>
References: <568D4926.8080902@mail.de>
Message-ID:

On 06.01.16 19:04, Sven R. Kunze wrote:
> timedelta handling always felt cumbersome to me:
>
> from datetime import timedelta
>
> short_period = timedelta(seconds=10)
> long_period = timedelta(hours=4, seconds=37)
>
> Today, I came across this one https://github.com/lxc/lxd/pull/1471/files
> and I found the creation of a 10 seconds timeout extremely intuitive.
> Would this represent a valuable addition to Python?
>
> from datetime import second, hour
>
> short_period = 10*second
> long_period = 4*hour + 37*second

Does Go support keyword arguments?
From guido at python.org Wed Jan 6 15:00:24 2016 From: guido at python.org (Guido van Rossum) Date: Wed, 6 Jan 2016 12:00:24 -0800 Subject: [Python-ideas] find-like functionality in pathlib In-Reply-To: References: <357278012.1941601.1450821295123.JavaMail.yahoo@mail.yahoo.com> Message-ID: On Mon, Jan 4, 2016 at 9:25 PM, Guido van Rossum wrote: > Following up on this, in theory the right way to walk a tree using pathlib > already exists, it's the rglob() method. E.g. all paths under /foo/bar > should be found as follows: > > for path in pathlib.Path('/foo/bar').rglob('**/*'): > [actually, rglob('*') or glob('**/*')] > print(path) > > The PermissionError bug you found is already reported: > http://bugs.python.org/issue24120 -- it even has a patch but it's stuck > in review. > I committed this fix. > Sadly there's another error: loops introduced by symlinks cause infinite > recursion. I filed that here: http://bugs.python.org/issue26012. (The fix > should be judicious use of is_symlink(), but the code is a little > convoluted.) > I committed a fix for this too (turned out to need just one call to is_symlink()). I also added a .path attribute to pathlib.*Path objects, so that p.path == str(p). You can now use the idiom getattr(arg, 'path', arg) to extract the path from a pathlib.Path object, or from an os.DirEntry object, or fall back to a plain string, without using str(arg), which would turn *any* object into a string, which is never what you want to happen by default. These changes will be released in Python 3.4.5, 3.5.2 and 3.6. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Wed Jan 6 17:42:35 2016 From: guido at python.org (Guido van Rossum) Date: Wed, 6 Jan 2016 14:42:35 -0800 Subject: [Python-ideas] find-like functionality in pathlib In-Reply-To: References: <357278012.1941601.1450821295123.JavaMail.yahoo@mail.yahoo.com> <8348957817041496645@unknownmsgid> <5F6A858FD00E5F4A82E3206D2D854EF892D48AA2@EXMB09.ohsu.edu> <1452096673.440928.484566490.207F90DE@webmail.messagingengine.com> Message-ID: I couldn't help myself and coded up a prototype for the StatCache design I sketched. See http://bugs.python.org/issue26031. Feedback welcome! On my Mac it only seems to offer limited benefits though... On Wed, Jan 6, 2016 at 8:48 AM, Guido van Rossum wrote: > On Wed, Jan 6, 2016 at 8:11 AM, Random832 wrote: > >> On Tue, Jan 5, 2016, at 16:04, Guido van Rossum wrote: >> > One problem with stat() caching is that Path objects are considered >> > immutable, and two Path objects referring to the same path are >> completely >> > interchangeable. For example, {pathlib.Path('/a'), pathlib.Path('/a')} >> is >> > a set of length 1: {PosixPath('/a')}. But if we had e.g. Path('/a', >> > cache_stat=True), the behavior of two instances of that object might be >> > observably different (if they were instantiated at times when the >> > contents of the filesystem was different). So maybe stat-caching Path >> instances >> > should be considered unequal, or perhaps unhashable. Or perhaps they >> > should only be considered equal if their stat() values are actually >> equal (i.e. >> > if the file's stat() info didn't change). >> >> What about a global cache? > > > It would have to use a weak dict so if the last reference goes away it > discards the cached stats for a given path, otherwise you'd have trouble > containing the cache size. 
> > And caching Path objects should still not be comparable to non-caching > Path objects (which we will need to preserve the semantics that repeatedly > calling stat() on a Path object created the default way will always redo > the syscall). The main advantage would be that caching Path objects could > be compared safely. > > It could still cause unexpected results. E.g. if you have just traversed > some big tree using caching, and saved some results (so hanging on to some > paths and hence their stat() results), and then you make some changes and > traverse it again to look for something else, you might accidentally be > seeing stale (i.e. cached) stat() results. > > Maybe there's a middle ground, where the user can create a StatCache > object and pass it into Path creation and traversal operations. Paths with > the same StatCache object (or both None) compare equal if their path > components are equal. Paths with different StatCache objects never compare > equal (but otherwise are ordered by path as usual -- the StatCache object's > identity is only used when the paths are equal. > > Are you (or anyone still reading this) interested in implementing this > idea? > > -- > --Guido van Rossum (python.org/~guido) > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From moloney at ohsu.edu Wed Jan 6 17:48:45 2016 From: moloney at ohsu.edu (Brendan Moloney) Date: Wed, 6 Jan 2016 22:48:45 +0000 Subject: [Python-ideas] find-like functionality in pathlib In-Reply-To: References: <357278012.1941601.1450821295123.JavaMail.yahoo@mail.yahoo.com> <8348957817041496645@unknownmsgid> <5F6A858FD00E5F4A82E3206D2D854EF892D48AA2@EXMB09.ohsu.edu> <1452096673.440928.484566490.207F90DE@webmail.messagingengine.com> , Message-ID: <5F6A858FD00E5F4A82E3206D2D854EF892D48F92@EXMB09.ohsu.edu> Its important to keep in mind the main benefit of scandir is you don't have to do ANY stat call in many cases, because the directory listing provides some subset of this info. On Linux you can at least tell if a path is a file or directory. On windows there is much more info provided by the directory listing. Avoiding subsequent stat calls is also nice, but not nearly as important due to OS level caching. Brendan Moloney Research Associate Advanced Imaging Research Center Oregon Health Science University ________________________________ From: Python-ideas [python-ideas-bounces+moloney=ohsu.edu at python.org] on behalf of Guido van Rossum [guido at python.org] Sent: Wednesday, January 06, 2016 2:42 PM To: Random832 Cc: Python-Ideas Subject: Re: [Python-ideas] find-like functionality in pathlib I couldn't help myself and coded up a prototype for the StatCache design I sketched. See http://bugs.python.org/issue26031. Feedback welcome! On my Mac it only seems to offer limited benefits though... -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Wed Jan 6 23:35:00 2016 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Wed, 6 Jan 2016 20:35:00 -0800 Subject: [Python-ideas] find-like functionality in pathlib In-Reply-To: References: <357278012.1941601.1450821295123.JavaMail.yahoo@mail.yahoo.com> Message-ID: <7140563289735493919@unknownmsgid> The PermissionError bug you found is already reported > I committed this fix. Thanks! I also added a .path attribute to pathlib.*Path objects, so that p.path == str(p). 
You can now use the idiom getattr(arg, 'path', arg) to extract the path from a pathlib.Path object, or from an os.DirEntry object, or fall back to a plain string, without using str(arg), which would turn *any* object into a string, which is never what you want to happen by default. Very nice -- that opens the door to stdlib and third party modules taking Path objects in addition to strings. Maybe we will see greater adoption of pathlib after all! CHB These changes will be released in Python 3.4.5, 3.5.2 and 3.6. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From wes.turner at gmail.com Thu Jan 7 04:03:01 2016 From: wes.turner at gmail.com (Wes Turner) Date: Thu, 7 Jan 2016 03:03:01 -0600 Subject: [Python-ideas] find-like functionality in pathlib In-Reply-To: References: <357278012.1941601.1450821295123.JavaMail.yahoo@mail.yahoo.com> <8348957817041496645@unknownmsgid> <5F6A858FD00E5F4A82E3206D2D854EF892D48AA2@EXMB09.ohsu.edu> <1452096673.440928.484566490.207F90DE@webmail.messagingengine.com> Message-ID: A bit OT, possibly, but this may be a long way around (to a cached *graph* of paths and metadata) with similar use cases: path.py#walk(), NetworkX edge, node dicts https://github.com/westurner/pyleset/blob/249a0837/structp/structp.py def walk_path_into_graph(g, path_, errors='warn'): """ """ This stats and reads limited image format metadata as CSV, TSV, JSON: https://github.com/westurner/image_size/blob/ab46de73/get_image_size.py I suppose because of race conditions this metadata should actually be stored in a filesystem triplestore with extended attributes and also secontext attributes. (... gnome-tracker reads filesystem stat data into RDF, for SPARQL). BSP vertex messaging can probably handle cascading cache invalidation (with supersteps). On Jan 6, 2016 4:44 PM, "Guido van Rossum" wrote: > I couldn't help myself and coded up a prototype for the StatCache design I > sketched. See http://bugs.python.org/issue26031. Feedback welcome! On my > Mac it only seems to offer limited benefits though... > > On Wed, Jan 6, 2016 at 8:48 AM, Guido van Rossum wrote: > >> On Wed, Jan 6, 2016 at 8:11 AM, Random832 wrote: >> >>> On Tue, Jan 5, 2016, at 16:04, Guido van Rossum wrote: >>> > One problem with stat() caching is that Path objects are considered >>> > immutable, and two Path objects referring to the same path are >>> completely >>> > interchangeable. For example, {pathlib.Path('/a'), pathlib.Path('/a')} >>> is >>> > a set of length 1: {PosixPath('/a')}. But if we had e.g. Path('/a', >>> > cache_stat=True), the behavior of two instances of that object might be >>> > observably different (if they were instantiated at times when the >>> > contents of the filesystem was different). So maybe stat-caching Path >>> instances >>> > should be considered unequal, or perhaps unhashable. Or perhaps they >>> > should only be considered equal if their stat() values are actually >>> equal (i.e. >>> > if the file's stat() info didn't change). >>> >>> What about a global cache? >> >> >> It would have to use a weak dict so if the last reference goes away it >> discards the cached stats for a given path, otherwise you'd have trouble >> containing the cache size. >> >> And caching Path objects should still not be comparable to non-caching >> Path objects (which we will need to preserve the semantics that repeatedly >> calling stat() on a Path object created the default way will always redo >> the syscall). 
The main advantage would be that caching Path objects could >> be compared safely. >> >> It could still cause unexpected results. E.g. if you have just traversed >> some big tree using caching, and saved some results (so hanging on to some >> paths and hence their stat() results), and then you make some changes and >> traverse it again to look for something else, you might accidentally be >> seeing stale (i.e. cached) stat() results. >> >> Maybe there's a middle ground, where the user can create a StatCache >> object and pass it into Path creation and traversal operations. Paths with >> the same StatCache object (or both None) compare equal if their path >> components are equal. Paths with different StatCache objects never compare >> equal (but otherwise are ordered by path as usual -- the StatCache object's >> identity is only used when the paths are equal. >> >> Are you (or anyone still reading this) interested in implementing this >> idea? >> >> -- >> --Guido van Rossum (python.org/~guido) >> > > > > -- > --Guido van Rossum (python.org/~guido) > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wes.turner at gmail.com Thu Jan 7 04:08:45 2016 From: wes.turner at gmail.com (Wes Turner) Date: Thu, 7 Jan 2016 03:08:45 -0600 Subject: [Python-ideas] find-like functionality in pathlib In-Reply-To: References: <357278012.1941601.1450821295123.JavaMail.yahoo@mail.yahoo.com> <8348957817041496645@unknownmsgid> <5F6A858FD00E5F4A82E3206D2D854EF892D48AA2@EXMB09.ohsu.edu> <1452096673.440928.484566490.207F90DE@webmail.messagingengine.com> Message-ID: The PyFilesystem filesystem abstraction APIs may also have / be in need of a sensible .walk() API http://pyfilesystem.readthedocs.org/en/latest/path.html#module-fs.path http://pyfilesystem.readthedocs.org/en/latest/interface.html walk() Like listdir() but descends in to sub-directories walkdirs() Returns an iterable of paths to sub-directories walkfiles() Returns an iterable of file paths in a directory, and its sub-directories On Jan 7, 2016 3:03 AM, "Wes Turner" wrote: > A bit OT, possibly, but this may be a long way around (to a cached *graph* > of paths and metadata) with similar use cases: > > path.py#walk(), NetworkX edge, node dicts > > https://github.com/westurner/pyleset/blob/249a0837/structp/structp.py > > def walk_path_into_graph(g, path_, errors='warn'): > """ > """ > > This stats and reads limited image format metadata as CSV, TSV, JSON: > https://github.com/westurner/image_size/blob/ab46de73/get_image_size.py > > I suppose because of race conditions this metadata should actually be > stored in a filesystem triplestore with extended attributes and also > secontext attributes. > > (... gnome-tracker reads filesystem stat data into RDF, for SPARQL). > > BSP vertex messaging can probably handle cascading cache invalidation > (with supersteps). > On Jan 6, 2016 4:44 PM, "Guido van Rossum" wrote: > >> I couldn't help myself and coded up a prototype for the StatCache design >> I sketched. See http://bugs.python.org/issue26031. Feedback welcome! On >> my Mac it only seems to offer limited benefits though... 
>> >> On Wed, Jan 6, 2016 at 8:48 AM, Guido van Rossum >> wrote: >> >>> On Wed, Jan 6, 2016 at 8:11 AM, Random832 >>> wrote: >>> >>>> On Tue, Jan 5, 2016, at 16:04, Guido van Rossum wrote: >>>> > One problem with stat() caching is that Path objects are considered >>>> > immutable, and two Path objects referring to the same path are >>>> completely >>>> > interchangeable. For example, {pathlib.Path('/a'), >>>> pathlib.Path('/a')} is >>>> > a set of length 1: {PosixPath('/a')}. But if we had e.g. Path('/a', >>>> > cache_stat=True), the behavior of two instances of that object might >>>> be >>>> > observably different (if they were instantiated at times when the >>>> > contents of the filesystem was different). So maybe stat-caching Path >>>> instances >>>> > should be considered unequal, or perhaps unhashable. Or perhaps they >>>> > should only be considered equal if their stat() values are actually >>>> equal (i.e. >>>> > if the file's stat() info didn't change). >>>> >>>> What about a global cache? >>> >>> >>> It would have to use a weak dict so if the last reference goes away it >>> discards the cached stats for a given path, otherwise you'd have trouble >>> containing the cache size. >>> >>> And caching Path objects should still not be comparable to non-caching >>> Path objects (which we will need to preserve the semantics that repeatedly >>> calling stat() on a Path object created the default way will always redo >>> the syscall). The main advantage would be that caching Path objects could >>> be compared safely. >>> >>> It could still cause unexpected results. E.g. if you have just traversed >>> some big tree using caching, and saved some results (so hanging on to some >>> paths and hence their stat() results), and then you make some changes and >>> traverse it again to look for something else, you might accidentally be >>> seeing stale (i.e. cached) stat() results. >>> >>> Maybe there's a middle ground, where the user can create a StatCache >>> object and pass it into Path creation and traversal operations. Paths with >>> the same StatCache object (or both None) compare equal if their path >>> components are equal. Paths with different StatCache objects never compare >>> equal (but otherwise are ordered by path as usual -- the StatCache object's >>> identity is only used when the paths are equal. >>> >>> Are you (or anyone still reading this) interested in implementing this >>> idea? >>> >>> -- >>> --Guido van Rossum (python.org/~guido) >>> >> >> >> >> -- >> --Guido van Rossum (python.org/~guido) >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From em at kth.se Thu Jan 7 04:20:35 2016 From: em at kth.se (=?UTF-8?Q?Emil_Stenstr=c3=b6m?=) Date: Thu, 7 Jan 2016 10:20:35 +0100 Subject: [Python-ideas] Fall back to encoding unicode strings in utf-8 if latin-1 fails in http.client Message-ID: <568E2DE3.9030405@kth.se> Hi, I hope python-ideas is the right place to post this, I'm very new to this and appreciate a pointer in the right direction if this is not it. The requests project is getting multiple bug reports about a problem in the stdlib http.client, so I thought I'd raise an issue about it here. The bug reports concern people posting http requests with unicode strings when they should be using utf-8 encoded strings. 
Since RFC 2616 says latin-1 is the default encoding http.client tries that and fails with a UnicodeEncodeError. My idea is NOT to change from latin-1 to something else, that would break compliance with the spec, but instead catch that exception, and try encoding with utf-8 instead. That would avoid breaking backward compatibility, unless someone specifically relied on that exception, which I think is very unlikely. This is also how other languages http libraries seem to deal with this, sending in unicode just works: In cURL (works fine): curl http://example.com -d "Celebrate ?" In Ruby with http.rb (works fine): require 'http' r = HTTP.post("http://example.com", :body => "Celebrate ?) In Node with request (works fine): var request = require('request'); request.post({url: 'http://example.com', body: "Celebrate ?"}, function (error, response, body) { console.log(body) }) But Python 3 with requests crashes instead: import requests r = requests.post("http://localhost:8000/tag", data="Celebrate ?") ...with the following stacktrace: ... File "../lib/python3.4/http/client.py", line 1127, in _send_request body = body.encode('iso-8859-1') UnicodeEncodeError: 'latin-1' codec can't encode characters in position 14-15: ordinal not in range(256) ---- So the rationale for this idea is: * http.client doesn't work the way beginners expect for very basic usecases (posting unicode strings) * Libraries in other languages behave like beginners expect, which magnifies the problem. * Changing the default latin-1 encoding probably isn't possible, because it would break the spec... * But catching the exception and try encoding in utf-8 instead wouldn't break the spec and solves the problem. ---- Here's a couple of issues where people expect things to work differently: https://github.com/kennethreitz/requests/issues/1926 https://github.com/kennethreitz/requests/issues/2838 https://github.com/kennethreitz/requests/issues/1822 ---- Does this make sense? /Emil From rosuav at gmail.com Thu Jan 7 04:49:55 2016 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 7 Jan 2016 20:49:55 +1100 Subject: [Python-ideas] Fall back to encoding unicode strings in utf-8 if latin-1 fails in http.client In-Reply-To: <568E2DE3.9030405@kth.se> References: <568E2DE3.9030405@kth.se> Message-ID: On Thu, Jan 7, 2016 at 8:20 PM, Emil Stenstr?m wrote: > > So the rationale for this idea is: > > * http.client doesn't work the way beginners expect for very basic usecases (posting unicode strings) > * Libraries in other languages behave like beginners expect, which magnifies the problem. > * Changing the default latin-1 encoding probably isn't possible, because it would break the spec... > * But catching the exception and try encoding in utf-8 instead wouldn't break the spec and solves the problem. > > ---- > > Here's a couple of issues where people expect things to work differently: > > https://github.com/kennethreitz/requests/issues/1926 > https://github.com/kennethreitz/requests/issues/2838 > https://github.com/kennethreitz/requests/issues/1822 > > ---- > > Does this make sense? It makes sense, but I disagree with the suggestion. Having "Latin-1 or UTF-8" as the effective default encoding is not a good idea, IMO; sometimes I've *de*coded text using such heuristics (the other order, of course; attempt UTF-8 decode, and if that fail, decode as Latin-1 or possibly CP-1252) as a means of coping with broken systems, but I would much prefer the default to simply be one or the other. 
As the 'requests' module is not part of Python's standard library, it would be free to change its own default, regardless of the behaviour of http.client; whether that's a good idea or not is for the requests community to decide (unless there's something specifically binding it to http.client). But whether you're asking for a change in http.client or in requests, I would disagree with the "either-or" approach; change to a UTF-8 default, perhaps, but not to the hybrid.

ChrisA

From cory at lukasa.co.uk Thu Jan 7 05:07:41 2016
From: cory at lukasa.co.uk (Cory Benfield)
Date: Thu, 7 Jan 2016 10:07:41 +0000
Subject: [Python-ideas] Fall back to encoding unicode strings in utf-8 if latin-1 fails in http.client
In-Reply-To: <568E2DE3.9030405@kth.se>
References: <568E2DE3.9030405@kth.se>
Message-ID: <0D8C565A-EEE0-4E2B-8A42-CB0DD1ADAC81@lukasa.co.uk>

> On 7 Jan 2016, at 09:20, Emil Stenström wrote:
>
> Since RFC 2616 says latin-1 is the default encoding, http.client tries that and fails with a UnicodeEncodeError.

I cannot stress this enough: there is *no* default encoding for HTTP bodies! This conversation is very confused, and it all starts because of a thoroughly misleading comment in http.client.

Firstly, let's all remember that RFC 2616 is dead (hurrah!), now superseded by RFCs 7230 through 7238. However, http.client blames its decision on RFC 2616. Note the comment here[0]. This is (in my view) a *misreading* of RFC 2616 Section 3.7.1, which says:

> When no explicit charset
> parameter is provided by the sender, media subtypes of the "text"
> type are defined to have a default charset value of "ISO-8859-1" when
> received via HTTP.

The thing is, this paragraph is referring to MIME types: that is, when the Content-Type header reads "text/*" and specifies no charset parameter, the body is to be interpreted as ISO-8859-1 when received. That, of course, is not the invariant this code enforces. Instead, this code spots the *only* explicit reference to a text encoding and chooses to use it for any unicode string sent by the user. That's a somewhat defensible decision, though it's not the one I'd have made.

*However*, that fallback was removed in RFC 7231. In appendix B of that RFC, we see this note:

> The default charset of ISO-8859-1 for text media types has been
> removed; the default is now whatever the media type definition says.
> Likewise, special treatment of ISO-8859-1 has been removed from the
> Accept-Charset header field.

This means there is no longer a default content encoding for HTTP, and instead the default encoding varies based on media type. The relevant RFC for this is RFC 6657, which specifies the following things:

- The default encoding for text/plain is US-ASCII
- All other text subtypes either MUST provide a charset parameter that explicitly indicates what their encoding is, or MUST NOT provide one under any circumstances and instead carry that information in their contents (e.g. HTML, XML).

That is to say, there are no defaults for text/* encodings: only explicit encoding choices!

This whole thing was really very confusing from the beginning. IMO, the only safe decision is for http.client to simply refuse to accept unicode strings *at all* as request bodies: the ambiguity over what they mean is simply too great. Requests has had a large number of bug reports from people who claimed that something "didn't work", when in practice there was just a disagreement over what the correct encoding of something was.
And having written both an HTTP/1.1 and an HTTP/2 client myself, in both cases I restricted the arguments of HTTPConnection.send() to bytestrings.

For what it's worth, I don't believe it's a good idea to change the default body encoding of unicode strings. This is the kind of really perplexing change that takes working code that implicitly relies on this behaviour and breaks it. In my experience, breakage of this kind is particularly tricky to catch because anything that can be validly encoded as Latin-1 can be validly encoded as UTF-8, so the failure will manifest as request failures rather than tracebacks. In this instance I believe the http.client module has made its bed, and will need to lie in it.

If this *did* change, Requests would (at least for the remainder of the 2.X release cycle) need to enforce the Latin-1 behaviour itself for the very same backward compatibility reasons, which removes any benefit we'd get from this anyway.

The really correct behaviour would be to tell users they cannot send unicode strings, because it makes no sense. That's a change I could get behind. But moving from one guess to another, even though the new guess is more likely to be right, seems to me to be misunderstanding the problem.

Cory

N.B.: I should note that only one of the linked requests issues, #2838, is actually about the request body. Of the others, one is about unicode in the request URI and one is about unicode in header values. This set of related issues demonstrates an ongoing confusion amongst users about what unicode strings are and how they work, but that's a separate discussion to this one.

[0]: https://github.com/python/cpython/blob/master/Lib/http/client.py#L1173-L1176

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 801 bytes
Desc: Message signed with OpenPGP using GPGMail
URL:

From p.f.moore at gmail.com Thu Jan 7 06:37:39 2016
From: p.f.moore at gmail.com (Paul Moore)
Date: Thu, 7 Jan 2016 11:37:39 +0000
Subject: [Python-ideas] Fall back to encoding unicode strings in utf-8 if latin-1 fails in http.client
In-Reply-To: <568E2DE3.9030405@kth.se>
References: <568E2DE3.9030405@kth.se>
Message-ID:

On 7 January 2016 at 09:20, Emil Stenström wrote:
> This is also how other languages' HTTP libraries seem to deal with this,
> sending in unicode just works:
>
> In cURL (works fine):
> curl http://example.com -d "Celebrate ?"
>

In a Unix shell, this would be supplying a bytestring argument to the curl exe, that encoded the characters in whatever language setting the user had specified (likely UTF-8). In Windows Powershell (the only Windows shell I can think of that would support Unicode) what would happen would depend on how curl accessed its command line. This probably relies on which specific CRT the code was built with.

> In Ruby with http.rb (works fine):
> require 'http'
> r = HTTP.post("http://example.com", :body => "Celebrate ?")
>

I don't know how Ruby handles Unicode, but would that body argument *actually* be Unicode, or would it be a UTF-8 encoded bytestring? I have a vague recollection that Ruby uses a "utf-8 for internal string encodings" model, which may mean it's not as strict as Python 3 is about separating bytestrings and Unicode strings...

> In Node with request (works fine):
> var request = require('request');
> request.post({url: 'http://example.com', body: "Celebrate ?"}, function
> (error, response, body) {
> console.log(body)
> })
>

Same response here as for Ruby.
It depends on the semantics of the language regarding Unicode support as to what's happening here.

> But Python 3 with requests crashes instead:
> import requests
> r = requests.post("http://localhost:8000/tag", data="Celebrate ?")
> ...with the following stacktrace:
> ...
> File "../lib/python3.4/http/client.py", line 1127, in _send_request
> body = body.encode('iso-8859-1')
> UnicodeEncodeError: 'latin-1' codec can't encode characters in position
> 14-15: ordinal not in range(256)

What does the requests documentation say it'll do with a Unicode string being passed as POST data to a request where there's no encoding? If it says it'll encode as latin-1, then that error is entirely correct. If it says it'll encode in some other encoding, then it isn't doing so (and that's a requests bug). If it's not explaining what it's doing, then the requests documentation is doing its users a disservice by not explaining the realities of sending Unicode over a byte-oriented protocol - and it's also leaving a huge "undefined behaviour" hole that people are falling into.

I understand that beginners are confused by the apparent problem that other environments "just work", but they really don't - and the problems will hit the user further down the line, when the issue is harder to debug. For example, you're completely ignoring the potential issue of what the target server will do when faced with UTF-8 data - there's no guarantee that it will work in general.

So IMO, this needs to be addressed as a documentation (and possibly code) fix in requests. It's something of a shame that http.client doesn't reject Unicode strings rather than making a silent assumption of the encoding, but that's something we have to live with for backward compatibility reasons. But there's no reason requests has to expose that behaviour to the user.

Paul

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From rosuav at gmail.com Thu Jan 7 06:53:57 2016
From: rosuav at gmail.com (Chris Angelico)
Date: Thu, 7 Jan 2016 22:53:57 +1100
Subject: [Python-ideas] Fall back to encoding unicode strings in utf-8 if latin-1 fails in http.client
In-Reply-To:
References: <568E2DE3.9030405@kth.se>
Message-ID:

On Thu, Jan 7, 2016 at 10:37 PM, Paul Moore wrote:
>
> So IMO, this needs to be addressed as a documentation (and possibly code) fix in requests. It's something of a shame that http.client doesn't reject Unicode strings rather than making a silent assumption of the encoding, but that's something we have to live with for backward compatibility reasons. But there's no reason requests has to expose that behaviour to the user.
>

Personally, I would be happy with any of three behaviours:

1) Raise TypeError and demand that byte strings be used
2) Encode as UTF-8, since that's most likely to "just work", and is also consistent
3) Encode as ASCII, and let any errors bubble up.

But, backward compat.

ChrisA

From p.f.moore at gmail.com Thu Jan 7 07:09:23 2016
From: p.f.moore at gmail.com (Paul Moore)
Date: Thu, 7 Jan 2016 12:09:23 +0000
Subject: [Python-ideas] Fall back to encoding unicode strings in utf-8 if latin-1 fails in http.client
In-Reply-To:
References: <568E2DE3.9030405@kth.se>
Message-ID:

On 7 January 2016 at 11:53, Chris Angelico wrote:
> 3) Encode as ASCII, and let any errors bubble up.

4) Encode as ASCII and catch UnicodeEncodeError and re-raise as a TypeError "Unicode string supplied without an explicit encoding".
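(A sketch of option 4, with a hypothetical helper name; this is not actual http.client code:)

    def encode_body(body):
        """Option 4: accept ASCII-safe text, reject everything else."""
        try:
            return body.encode('ascii')
        except UnicodeEncodeError:
            raise TypeError("Unicode string supplied without an explicit "
                            "encoding; encode to bytes first, e.g. "
                            "body.encode('utf-8')")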
IMO, the underlying encoding errors are very user-unfriendly, and should nearly always be caught internally and replaced with something more user-friendly. Most of the user confusion I see from Unicode issues could probably be significantly alleviated if the user was presented with something better than a raw (en/de)coding error and traceback.

Paul

From rosuav at gmail.com Thu Jan 7 07:29:31 2016
From: rosuav at gmail.com (Chris Angelico)
Date: Thu, 7 Jan 2016 23:29:31 +1100
Subject: [Python-ideas] Fall back to encoding unicode strings in utf-8 if latin-1 fails in http.client
In-Reply-To:
References: <568E2DE3.9030405@kth.se>
Message-ID:

On Thu, Jan 7, 2016 at 11:09 PM, Paul Moore wrote:
> On 7 January 2016 at 11:53, Chris Angelico wrote:
>> 3) Encode as ASCII, and let any errors bubble up.
>
> 4) Encode as ASCII and catch UnicodeEncodeError and re-raise as a
> TypeError "Unicode string supplied without an explicit encoding".
>
> IMO, the underlying encoding errors are very user-unfriendly, and
> should nearly always be caught internally and replaced with something
> more user-friendly. Most of the user confusion I see from Unicode
> issues could probably be significantly alleviated if the user was
> presented with something better than a raw (en/de)coding error and
> traceback.

Maybe. Same difference, though - permit ASCII-only, anything else is an error.

ChrisA

From steve at pearwood.info Thu Jan 7 07:59:19 2016
From: steve at pearwood.info (Steven D'Aprano)
Date: Thu, 7 Jan 2016 23:59:19 +1100
Subject: [Python-ideas] Fall back to encoding unicode strings in utf-8 if latin-1 fails in http.client
In-Reply-To:
References: <568E2DE3.9030405@kth.se>
Message-ID: <20160107125919.GF10854@ando.pearwood.info>

On Thu, Jan 07, 2016 at 08:49:55PM +1100, Chris Angelico wrote:

> It makes sense, but I disagree with the suggestion. Having "Latin-1 or
> UTF-8" as the effective default encoding is not a good idea, IMO;

I'm curious what your reasoning is. That seems to be fairly common behaviour with some email clients, for example I seem to recall that Thunderbird will try encoding emails as US-ASCII, if that fails, Latin-1, and only send UTF-8 if the other two don't work.

I'm not defending this tactic, but wondering what you have against it.

--
Steve

From em at kth.se Thu Jan 7 08:11:01 2016
From: em at kth.se (=?UTF-8?Q?Emil_Stenstr=c3=b6m?=)
Date: Thu, 7 Jan 2016 14:11:01 +0100
Subject: [Python-ideas] Fall back to encoding unicode strings in utf-8 if latin-1 fails in http.client
In-Reply-To: <20160107125919.GF10854@ando.pearwood.info>
References: <568E2DE3.9030405@kth.se> <20160107125919.GF10854@ando.pearwood.info>
Message-ID: <568E63E5.4040003@kth.se>

On 2016-01-07 13:59, Steven D'Aprano wrote:
> On Thu, Jan 07, 2016 at 08:49:55PM +1100, Chris Angelico wrote:
>
>> It makes sense, but I disagree with the suggestion. Having "Latin-1 or
>> UTF-8" as the effective default encoding is not a good idea, IMO;
>
> I'm curious what your reasoning is. That seems to be fairly common
> behaviour with some email clients, for example I seem to recall that
> Thunderbird will try encoding emails as US-ASCII, if that fails,
> Latin-1, and only send UTF-8 if the other two don't work.
>
> I'm not defending this tactic, but wondering what you have against it.

I'm fine with either tactic, either defaulting to utf-8 or trying them one after the other. The important thing for me is that the API works as expected by many.
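(The "one after the other" tactic would look something like this sketch, with an illustrative helper name:)

    def pick_charset(text):
        """Thunderbird-style tactic: try ASCII, then Latin-1, then UTF-8."""
        for charset in ('ascii', 'latin-1'):
            try:
                text.encode(charset)
                return charset
            except UnicodeEncodeError:
                continue
        return 'utf-8'  # UTF-8 can encode any well-formed str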
My main reason for not changing the default was that it would break backwards compatibility, but only for the case that people sent latin-1 strings as if they were unicode strings. If the reading of the spec that led to using latin-1 is incorrect, that really makes me question if having latin-1 there is a good idea from the start.

So I'm definitely pro switching to utf-8 as the default, as it would make the API work the way many (including me) would expect.

/Emil

From rosuav at gmail.com Thu Jan 7 08:25:33 2016
From: rosuav at gmail.com (Chris Angelico)
Date: Fri, 8 Jan 2016 00:25:33 +1100
Subject: [Python-ideas] Fall back to encoding unicode strings in utf-8 if latin-1 fails in http.client
In-Reply-To: <20160107125919.GF10854@ando.pearwood.info>
References: <568E2DE3.9030405@kth.se> <20160107125919.GF10854@ando.pearwood.info>
Message-ID:

On Thu, Jan 7, 2016 at 11:59 PM, Steven D'Aprano wrote:
> On Thu, Jan 07, 2016 at 08:49:55PM +1100, Chris Angelico wrote:
>
>> It makes sense, but I disagree with the suggestion. Having "Latin-1 or
>> UTF-8" as the effective default encoding is not a good idea, IMO;
>
> I'm curious what your reasoning is. That seems to be fairly common
> behaviour with some email clients, for example I seem to recall that
> Thunderbird will try encoding emails as US-ASCII, if that fails,
> Latin-1, and only send UTF-8 if the other two don't work.
>
> I'm not defending this tactic, but wondering what you have against it.

An application is free to do that if it likes, although personally I wouldn't bother. For a library, I'd much rather the rules be as simple as possible. Maybe "ASCII or UTF-8" (since one is a strict subset of the other), but not "ASCII or Latin-1 or UTF-7". I'd prefer something extremely simple: if you don't specify an encoding, it has one default. That corresponds to a function signature that says encoding="UTF-8", and you can be 100% confident that omitting the encoding parameter will do the same thing as passing "UTF-8".

ChrisA

From guido at python.org Thu Jan 7 11:32:45 2016
From: guido at python.org (Guido van Rossum)
Date: Thu, 7 Jan 2016 08:32:45 -0800
Subject: [Python-ideas] Fall back to encoding unicode strings in utf-8 if latin-1 fails in http.client
In-Reply-To:
References: <568E2DE3.9030405@kth.se> <20160107125919.GF10854@ando.pearwood.info>
Message-ID:

Thanks especially to Cory for digging into the source and the RFCs here!

Personally I'm perplexed that Requests, which claims to be "HTTP for Humans", doesn't take care of this but just lets http/client.py blow up. (However, IIUC both 2838 and 1822 are about the body.encode() call in Python 3's http/client.py at _send_request(). 1926 seems to originate in Requests itself; it's also Python 2.7.)

Anyways, if we were to follow the Python 3 philosophy regarding Unicode to the letter, we would have to reject the str type altogether here, and insist on bytes. The error message could tell the caller what to do, e.g. "use data.encode('utf-8') if you want the data to be encoded in UTF-8". (Then of course the server might not like it.)

An alternative could be to look at the content-type header (if one is given) and use the charset from there, or the default from the RFC for the content type. But all these are rather painfully backwards incompatible, which is a big concern here.
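(That content-type alternative could be sketched like this; parse_header is the stdlib helper from the cgi module, the rest is hypothetical:)

    from cgi import parse_header

    def body_charset(headers):
        """Pick the charset from an explicit Content-Type header, if any."""
        ctype = headers.get('Content-Type', '')
        _, params = parse_header(ctype)
        # otherwise fall back to the RFC default for the media type,
        # e.g. US-ASCII for text/plain per RFC 6657
        return params.get('charset', 'ascii')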
Maybe the best solution (most backward compatible *and* most likely to stem the flood of bug reports) is to just catch the UnicodeError and replace its message with something more human-friendly, explaining that the data must be encoded before sending it. Then the user can figure out what encoding to use (though yes, most likely UTF-8 is it, so the message could suggest trying that first).

--
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From cory at lukasa.co.uk Thu Jan 7 11:46:50 2016
From: cory at lukasa.co.uk (Cory Benfield)
Date: Thu, 7 Jan 2016 16:46:50 +0000
Subject: [Python-ideas] Fall back to encoding unicode strings in utf-8 if latin-1 fails in http.client
In-Reply-To:
References: <568E2DE3.9030405@kth.se> <20160107125919.GF10854@ando.pearwood.info>
Message-ID: <38D1341A-C16B-4484-A677-C517E5C902B0@lukasa.co.uk>

> On 7 Jan 2016, at 16:32, Guido van Rossum wrote:
>
> Personally I'm perplexed that Requests, which claims to be "HTTP for Humans", doesn't take care of this but just lets http/client.py blow up. (However, IIUC both 2838 and 1822 are about the body.encode() call in Python 3's http/client.py at _send_request(). 1926 seems to originate in Requests itself; it's also Python 2.7.)

The main reason is historical: this was missed in the original (substantial) rewrite in requests 2.0, and as a result we can't change it without a backward compat break, just the same as Python. We'll probably fix it in 3.0.

Cory
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 801 bytes
Desc: Message signed with OpenPGP using GPGMail
URL:

From random832 at fastmail.com Thu Jan 7 11:56:07 2016
From: random832 at fastmail.com (Random832)
Date: Thu, 07 Jan 2016 11:56:07 -0500
Subject: [Python-ideas] Fall back to encoding unicode strings in utf-8 if latin-1 fails in http.client
In-Reply-To:
References: <568E2DE3.9030405@kth.se>
Message-ID: <1452185767.1150694.485653442.4D95D8DB@webmail.messagingengine.com>

On Thu, Jan 7, 2016, at 06:53, Chris Angelico wrote:
> On Thu, Jan 7, 2016 at 10:37 PM, Paul Moore wrote:
> >
> > So IMO, this needs to be addressed as a documentation (and possibly code) fix in requests. It's something of a shame that http.client doesn't reject Unicode strings rather than making a silent assumption of the encoding, but that's something we have to live with for backward compatibility reasons. But there's no reason requests has to expose that behaviour to the user.
> >
>
> Personally, I would be happy with any of three behaviours:
>
> 1) Raise TypeError and demand that byte strings be used
> 2) Encode as UTF-8, since that's most likely to "just work", and is
> also consistent
> 3) Encode as ASCII, and let any errors bubble up.

What about:

4) Silently add a content type (default text/plain; charset=UTF-8) or charset (if the user has specified a content type without one) if a unicode string is used.
If a byte string is used, use application/octet-stream for the default content type and don't add a charset in any case (even if the user-specified content type is text/*).

From abarnert at yahoo.com Thu Jan 7 11:57:42 2016
From: abarnert at yahoo.com (Andrew Barnert)
Date: Thu, 7 Jan 2016 08:57:42 -0800
Subject: [Python-ideas] Fall back to encoding unicode strings in utf-8 if latin-1 fails in http.client
In-Reply-To: <568E2DE3.9030405@kth.se>
References: <568E2DE3.9030405@kth.se>
Message-ID:

On Jan 7, 2016, at 01:20, Emil Stenström wrote:
>
> This is also how other languages' HTTP libraries seem to deal with this, sending in unicode just works:

No, sending Unicode as UTF-8 doesn't "just work", except when the server is expecting UTF-8. Otherwise, it just makes the problem harder to debug.

Most commonly, people who run into this problem with requests are trying to send JSON or form-encoded data. In either case, the solution is simple: just pass the object to the json= or data= parameter. It's only if you try to do it half-way yourself, calling json.dumps but then not calling .encode, that you run into a problem.

I've also seen people run into this uploading files. Again, if you let requests just take care of it for you (by passing it the filename or file object), it just works. But if you try to do it half-way, reading the whole file into memory as a string but not encoding it, that's when you have problems.

The solution in every case is simple: don't make things harder for yourself by doing extra work and then trying to use the lower-level API, just let requests do it for you.

Of course if you're using http.client or urllib instead of requests, you don't have that option. But if http.client is too low-level for you, the solution isn't to hack up http.client to be more magical when used by people who don't know what they're doing in hopes that it'll work more often than it'll cause further and harder-to-debug problems, it's to tell them to use requests if they don't want to learn what they're doing.

From random832 at fastmail.com Thu Jan 7 11:59:23 2016
From: random832 at fastmail.com (Random832)
Date: Thu, 07 Jan 2016 11:59:23 -0500
Subject: [Python-ideas] Fall back to encoding unicode strings in utf-8 if latin-1 fails in http.client
In-Reply-To: <20160107125919.GF10854@ando.pearwood.info>
References: <568E2DE3.9030405@kth.se> <20160107125919.GF10854@ando.pearwood.info>
Message-ID: <1452185963.1151281.485656762.1359EFD5@webmail.messagingengine.com>

On Thu, Jan 7, 2016, at 07:59, Steven D'Aprano wrote:
> On Thu, Jan 07, 2016 at 08:49:55PM +1100, Chris Angelico wrote:
>
> > It makes sense, but I disagree with the suggestion. Having "Latin-1 or
> > UTF-8" as the effective default encoding is not a good idea, IMO;
>
> I'm curious what your reasoning is. That seems to be fairly common
> behaviour with some email clients, for example I seem to recall that
> Thunderbird will try encoding emails as US-ASCII, if that fails,
> Latin-1, and only send UTF-8 if the other two don't work.

Sure, but it includes a content-type header with a charset parameter. I think the behavior of encoding text but not including a charset parameter is fundamentally broken. If the user supplies a charset parameter, it should try to use the matching encoding, otherwise it should pick an encoding (whether that is "always UTF-8" or some other rule) and add the charset parameter.
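(Random832's rule could be sketched like this, with hypothetical helper names; for simplicity it assumes the charset is the last Content-Type parameter:)

    def prepare_body(body, content_type=None):
        """Pick body bytes plus a Content-Type, following the rule above."""
        if isinstance(body, bytes):
            # bytes: never add a charset; default to an opaque media type
            return body, content_type or 'application/octet-stream'
        if content_type is None:
            content_type = 'text/plain; charset=UTF-8'
        elif 'charset=' not in content_type:
            content_type += '; charset=UTF-8'
        # encode with the same charset we advertise in the header
        charset = content_type.rsplit('charset=', 1)[1]
        return body.encode(charset), content_type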
From brett at python.org Thu Jan 7 12:01:08 2016
From: brett at python.org (Brett Cannon)
Date: Thu, 07 Jan 2016 17:01:08 +0000
Subject: [Python-ideas] intuitive timedeltas like in go
In-Reply-To:
References: <568D4926.8080902@mail.de>
Message-ID:

On Wed, 6 Jan 2016 at 09:37 Serhiy Storchaka wrote:
> On 06.01.16 19:04, Sven R. Kunze wrote:
> > timedelta handling always felt cumbersome to me:
> >
> > from datetime import timedelta
> >
> > short_period = timedelta(seconds=10)
> > long_period = timedelta(hours=4, seconds=37)
> >
> > Today, I came across this one https://github.com/lxc/lxd/pull/1471/files
> > and I found the creation of a 10-second timeout extremely intuitive.
> > Would this represent a valuable addition to Python?
> >
> > from datetime import second, hour
> >
> > short_period = 10*second
> > long_period = 4*hour + 37*second
>
> Does Go support keyword arguments?
>

Nope: https://golang.org/ref/spec#Calls

-Brett

>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From em at kth.se Thu Jan 7 13:50:49 2016
From: em at kth.se (=?UTF-8?Q?Emil_Stenstr=c3=b6m?=)
Date: Thu, 7 Jan 2016 19:50:49 +0100
Subject: [Python-ideas] Fall back to encoding unicode strings in utf-8 if latin-1 fails in http.client
In-Reply-To: <38D1341A-C16B-4484-A677-C517E5C902B0@lukasa.co.uk>
References: <568E2DE3.9030405@kth.se> <20160107125919.GF10854@ando.pearwood.info> <38D1341A-C16B-4484-A677-C517E5C902B0@lukasa.co.uk>
Message-ID: <568EB389.2090901@kth.se>

On 2016-01-07 at 17:46, Cory Benfield wrote:
>> On 7 Jan 2016, at 16:32, Guido van Rossum
>> wrote:
>>
>> Personally I'm perplexed that Requests, which claims to be "HTTP
>> for Humans", doesn't take care of this but just lets http/client.py
>> blow up. (However, IIUC both 2838 and 1822 are about the
>> body.encode() call in Python 3's http/client.py at _send_request().
>> 1926 seems to originate in Requests itself; it's also Python 2.7.)
>
> The main reason is historical: this was missed in the original
> (substantial) rewrite in requests 2.0, and as a result we can't
> change it without a backward compat break, just the same as Python.
> We'll probably fix it in 3.0.

So as things stand:

* The general consensus seems to be that the raised error should be changed to something like: TypeError("Unicode string supplied without an explicit encoding")

* Python would like to change http.client to reject unicode input with an exception, but won't because of backwards compatibility

* Requests would like to do the same but won't because of backwards compatibility

I think it will be very hard to find code that breaks because of a type change in the exception when sending invalid data. On the other hand, it's VERY easy to find people who are affected by the confusing error currently in use everywhere.

When a backward compatible change makes life easier for 99.9% of users, and 0.1% of users need to debug a TypeError with a very clear error message (which was probably a bug in their code to begin with), I'm starting to question having a policy that strict.
/Emil

From guido at python.org Thu Jan 7 14:04:13 2016
From: guido at python.org (Guido van Rossum)
Date: Thu, 7 Jan 2016 11:04:13 -0800
Subject: [Python-ideas] Fall back to encoding unicode strings in utf-8 if latin-1 fails in http.client
In-Reply-To: <568EB389.2090901@kth.se>
References: <568E2DE3.9030405@kth.se> <20160107125919.GF10854@ando.pearwood.info> <38D1341A-C16B-4484-A677-C517E5C902B0@lukasa.co.uk> <568EB389.2090901@kth.se>
Message-ID:

On Thu, Jan 7, 2016 at 10:50 AM, Emil Stenström wrote:
> On 2016-01-07 at 17:46, Cory Benfield wrote:
>
>>> On 7 Jan 2016, at 16:32, Guido van Rossum
>>> wrote:
>>>
>>> Personally I'm perplexed that Requests, which claims to be "HTTP
>>> for Humans", doesn't take care of this but just lets http/client.py
>>> blow up. (However, IIUC both 2838 and 1822 are about the
>>> body.encode() call in Python 3's http/client.py at _send_request().
>>> 1926 seems to originate in Requests itself; it's also Python 2.7.)
>>
>> The main reason is historical: this was missed in the original
>> (substantial) rewrite in requests 2.0, and as a result we can't
>> change it without a backward compat break, just the same as Python.
>> We'll probably fix it in 3.0.
>
> So as things stand:
>
> * The general consensus seems to be that the raised error should be
> changed to something like: TypeError("Unicode string supplied without an
> explicit encoding")
>
> * Python would like to change http.client to reject unicode input with an
> exception, but won't because of backwards compatibility
>
> * Requests would like to do the same but won't because of backwards
> compatibility
>
> I think it will be very hard to find code that breaks because of a type
> change in the exception when sending invalid data. On the other hand, it's
> VERY easy to find people who are affected by the confusing error currently
> in use everywhere.
>
> When a backward compatible change makes life easier for 99.9% of users,
> and 0.1% of users need to debug a TypeError with a very clear error message
> (which was probably a bug in their code to begin with), I'm starting to
> question having a policy that strict.

What policy are you referring to? I don't think anyone objects to making the error message clearer. The objection is against rejecting unicode strings that in the past would have been successfully encoded using Latin-1.

I'm not sure whether it's a good idea to change the exception type from UnicodeError to TypeError -- the exception is really related to Unicode so keeping UnicodeError but changing the message sounds like the right thing to do. And this can be done independently in both Requests and the stdlib.

--
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From abarnert at yahoo.com Thu Jan 7 14:31:45 2016
From: abarnert at yahoo.com (Andrew Barnert)
Date: Thu, 7 Jan 2016 19:31:45 +0000 (UTC)
Subject: [Python-ideas] Fall back to encoding unicode strings in utf-8 if latin-1 fails in http.client
In-Reply-To:
References:
Message-ID: <483803664.1337090.1452195105108.JavaMail.yahoo@mail.yahoo.com>

On Thursday, January 7, 2016 11:05 AM, Guido van Rossum wrote:

> I'm not sure whether it's a good idea to change the exception type from UnicodeError to TypeError -- the exception is really related to Unicode so keeping UnicodeError but changing the message sounds like the right thing to do. And this can be done independently in both Requests and the stdlib.

That sounds like a good idea.
A UnicodeEncodeError (or subclass of it?) with text like "HTTP body without encoding defaults to 'latin-1', which can't encode character '\u5555' in position 30: ordinal not in range(256)" would be pretty simple to implement, and would help a lot more than the current text. (And, for those who still can't figure it out, being a unique error message means that within a few days of the change, googling it should get a relevant StackOverflow answer, which isn't true for the generic encoding error message.)

Requests could get fancier. For example, if the string starts with "{", make the error message ask if maybe they wanted to use json=obj instead of data=json.dumps(obj). But I think that wouldn't be appropriate for the stdlib. (Especially since http.client doesn't have a json parameter...)

But then it sounds like Requests is planning to remove implicitly-Latin-1 strings via data= anyway in 3.0, which would solve the problem more simply.

From em at kth.se Thu Jan 7 14:40:56 2016
From: em at kth.se (=?UTF-8?Q?Emil_Stenstr=c3=b6m?=)
Date: Thu, 7 Jan 2016 20:40:56 +0100
Subject: [Python-ideas] Fall back to encoding unicode strings in utf-8 if latin-1 fails in http.client
In-Reply-To:
References: <568E2DE3.9030405@kth.se> <20160107125919.GF10854@ando.pearwood.info> <38D1341A-C16B-4484-A677-C517E5C902B0@lukasa.co.uk> <568EB389.2090901@kth.se>
Message-ID: <568EBF48.7050004@kth.se>

On 2016-01-07 at 20:04, Guido van Rossum wrote:
> What policy are you referring to?

I was reading https://www.python.org/dev/peps/pep-0387/#backwards-compatibility-rules, which specifies "raised exceptions", but I see now that it's only a draft.

> I don't think anyone objects to
> making the error message clearer. The objection is against rejecting
> unicode strings that in the past would have been successfully encoded
> using Latin-1.

Then I misunderstood, sorry.

> I'm not sure whether it's a good idea to change the exception type from
> UnicodeError to TypeError -- the exception is really related to Unicode
> so keeping UnicodeError but changing the message sounds like the right
> thing to do. And this can be done independently in both Requests and the
> stdlib.

Agreed. I would also suggest adding the suggestion of encoding in "utf-8" specifically, which is most likely what will fix the problem. As time goes by and more and more legacy systems disappear, this advice will become truer each year.

/Emil

From abarnert at yahoo.com Thu Jan 7 15:04:04 2016
From: abarnert at yahoo.com (Andrew Barnert)
Date: Thu, 7 Jan 2016 12:04:04 -0800
Subject: [Python-ideas] Fall back to encoding unicode strings in utf-8 if latin-1 fails in http.client
In-Reply-To: <568EBF48.7050004@kth.se>
References: <568E2DE3.9030405@kth.se> <20160107125919.GF10854@ando.pearwood.info> <38D1341A-C16B-4484-A677-C517E5C902B0@lukasa.co.uk> <568EB389.2090901@kth.se> <568EBF48.7050004@kth.se>
Message-ID:

On Jan 7, 2016, at 11:40, Emil Stenström wrote:
>
> Agreed. I would also suggest adding the suggestion of encoding in "utf-8" specifically, which is most likely what will fix the problem. As time goes by and more and more legacy systems disappear, this advice will become truer each year.
So we shouldn't be making it easier to send raw, unformatted text as UTF-8; we should be making it easier to send JSON, form-encoded, multipart, XML, etc. Which, again, Requests already does. From guido at python.org Thu Jan 7 15:24:27 2016 From: guido at python.org (Guido van Rossum) Date: Thu, 7 Jan 2016 12:24:27 -0800 Subject: [Python-ideas] Fall back to encoding unicode strings in utf-8 if latin-1 fails in http.client In-Reply-To: References: <568E2DE3.9030405@kth.se> <20160107125919.GF10854@ando.pearwood.info> <38D1341A-C16B-4484-A677-C517E5C902B0@lukasa.co.uk> <568EB389.2090901@kth.se> <568EBF48.7050004@kth.se> Message-ID: It's time that someone files a tracker issue so we can move the remaining discussion there. On Thu, Jan 7, 2016 at 12:04 PM, Andrew Barnert wrote: > On Jan 7, 2016, at 11:40, Emil Stenstr?m wrote: > > > > Agreed. I would also suggest adding the suggestion of encoding in > "utf-8" specifically which is most likely what will fix the problem. As > time goes by and more and more legacy systems disappear, this advise will > become truer each year. > > I disagree. Services that take raw, unformatted text as HTTP bodies and do > something useful with it are disappearing in general, not changing the > encoding they use for that raw, unformatted text from Latin-1 to UTF-8. And > they were never that common in the first place. > > So we shouldn't be making it easier to send raw, unformatted text as > UTF-8; we should be making it easier to send JSON, form-encoded, multipart, > XML, etc. Which, again, Requests already does. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From em at kth.se Thu Jan 7 17:28:13 2016 From: em at kth.se (=?UTF-8?Q?Emil_Stenstr=c3=b6m?=) Date: Thu, 7 Jan 2016 23:28:13 +0100 Subject: [Python-ideas] Fall back to encoding unicode strings in utf-8 if latin-1 fails in http.client In-Reply-To: References: <568E2DE3.9030405@kth.se> <20160107125919.GF10854@ando.pearwood.info> <38D1341A-C16B-4484-A677-C517E5C902B0@lukasa.co.uk> <568EB389.2090901@kth.se> <568EBF48.7050004@kth.se> Message-ID: <568EE67D.4090309@kth.se> Den 2016-01-07 kl. 21:24, skrev Guido van Rossum: > It's time that someone files a tracker issue so we can move the > remaining discussion there. Here is the relevant issue: http://bugs.python.org/issue26045 /Emil From em at kth.se Thu Jan 7 17:36:06 2016 From: em at kth.se (=?UTF-8?Q?Emil_Stenstr=c3=b6m?=) Date: Thu, 7 Jan 2016 23:36:06 +0100 Subject: [Python-ideas] Fall back to encoding unicode strings in utf-8 if latin-1 fails in http.client In-Reply-To: References: <568E2DE3.9030405@kth.se> <20160107125919.GF10854@ando.pearwood.info> <38D1341A-C16B-4484-A677-C517E5C902B0@lukasa.co.uk> <568EB389.2090901@kth.se> <568EBF48.7050004@kth.se> Message-ID: <568EE856.6060506@kth.se> Den 2016-01-07 kl. 21:04, skrev Andrew Barnert: > I disagree. Services that take raw, unformatted text as HTTP bodies > and do something useful with it are disappearing in general, not > changing the encoding they use for that raw, unformatted text from > Latin-1 to UTF-8. And they were never that common in the first > place. I just wrote a service like this last week. It takes raw unformatted text and returns part-of-speech tags for the text as JSON. That's common for NLP services that structure unstructured text. 
The rationale for accepting POST body is simply that it makes it very simple to call the service from curl:

curl http://example.com -d "string here"

So there's no reason these kinds of services would be disappearing. Let's continue the discussion in the bug tracker: http://bugs.python.org/issue26045

/Emil

From victor.stinner at gmail.com Fri Jan 8 16:27:09 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Fri, 8 Jan 2016 22:27:09 +0100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
Message-ID:

Hi,

Here is a first PEP, part of a series of 3 PEPs to add an API to implement a static Python optimizer specializing functions with guards.

HTML version: https://faster-cpython.readthedocs.org/pep_dict_version.html#pep-dict-version

PEP: xxx
Title: Add dict.__version__
Version: $Revision$
Last-Modified: $Date$
Author: Victor Stinner
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 4-January-2016
Python-Version: 3.6

Abstract
========

Add a new read-only ``__version__`` property to ``dict`` and ``collections.UserDict`` types, incremented at each change.

Rationale
=========

In Python, the builtin ``dict`` type is used by many instructions. For example, the ``LOAD_GLOBAL`` instruction searches for a variable in the global namespace, or in the builtins namespace (two dict lookups). Python uses ``dict`` for the builtins namespace, globals namespace, type namespaces, instance namespaces, etc. The local namespace (namespace of a function) is usually optimized to an array, but it can be a dict too.

Python is hard to optimize because almost everything is mutable: builtin functions, function code, global variables, local variables, ... can be modified at runtime. Implementing optimizations respecting the Python semantics requires detecting when "something changes": we will call these checks "guards". The speedup of optimizations depends on the speed of guard checks.

This PEP proposes to add a version to dictionaries to implement efficient guards on namespaces.

Example of optimization: replace loading a global variable with a constant. This optimization requires a guard on the global variable to check if it was modified. If the variable is modified, the variable must be loaded at runtime, instead of using the constant.

Guard example
=============

Pseudo-code of an efficient guard to check if a dictionary key was modified (created, updated or deleted)::

    UNSET = object()

    class Guard:
        def __init__(self, dict, key):
            self.dict = dict
            self.key = key
            self.value = dict.get(key, UNSET)
            self.version = dict.__version__

        def check(self):
            """Return True if the dictionary value did not change."""
            version = self.dict.__version__
            if version == self.version:
                # Fast-path: avoid the dictionary lookup
                return True

            value = self.dict.get(self.key, UNSET)
            if value == self.value:
                # another key was modified:
                # cache the new dictionary version
                self.version = version
                return True

            return False

Changes
=======

Add a read-only ``__version__`` property to the builtin ``dict`` type and to the ``collections.UserDict`` type. New empty dictionaries are initialized to version ``0``.
The version is incremented at each change:

* ``clear()`` if the dict was non-empty
* ``pop(key)`` if the key exists
* ``popitem()`` if the dict is non-empty
* ``setdefault(key, value)`` if the `key` does not exist
* ``__delitem__(key)`` if the key exists
* ``__setitem__(key, value)`` if the `key` doesn't exist or if the value is different
* ``update(...)`` if new values are different than existing values (the version can be incremented multiple times)

Example::

    >>> d = {}
    >>> d.__version__
    0
    >>> d['key'] = 'value'
    >>> d.__version__
    1
    >>> d['key'] = 'new value'
    >>> d.__version__
    2
    >>> del d['key']
    >>> d.__version__
    3

If a dictionary is created with items, the version is also incremented at each dictionary insertion. Example::

    >>> d=dict(x=7, y=33)
    >>> d.__version__
    2

The version is not incremented if an existing key is modified to the same value, but only the identity of the value is tested, not the content of the value. Example::

    >>> d={}
    >>> value = object()
    >>> d['key'] = value
    >>> d.__version__
    1
    >>> d['key'] = value
    >>> d.__version__
    1

.. note::
   CPython uses some singletons like integers in the range [-5; 257], the empty tuple, empty strings, Unicode strings of a single character in the range [U+0000; U+00FF], etc. When a key is set twice to the same singleton, the version is not modified.

The PEP is designed to implement guards on namespaces; in practice, only the ``dict`` type can be used for namespaces. ``collections.UserDict`` is modified because it must mimic ``dict``. ``collections.Mapping`` is unchanged.

Integer overflow
================

The implementation uses the C unsigned integer type ``size_t`` to store the version. On 32-bit systems, the maximum version is ``2**32-1`` (more than ``4.2 * 10 ** 9``, 4 billion). On 64-bit systems, the maximum version is ``2**64-1`` (more than ``1.8 * 10**19``).

The C code uses ``version++``. The behaviour on integer overflow of the version is undefined. The minimum guarantee is that the version always changes when the dictionary is modified.

The check ``dict.__version__ == old_version`` can be true after an integer overflow, so a guard can return true even though the value changed, which is wrong. The bug only occurs if the dict is modified at least ``2**64`` times (on a 64-bit system) between two checks of the guard.

Using a more complex type (ex: ``PyLongObject``) to avoid the overflow would slow down operations on the ``dict`` type. Even if there is a theoretical risk of missing a value change, the risk is considered too low compared to the slowdown of using a more complex type.

Alternatives
============

Add a version to each dict entry
--------------------------------

A single version per dictionary requires the guard to keep a strong reference to the value, which can keep the value alive longer than expected. If we also add a version per dictionary entry, the guard can rely on the entry version and so avoid the strong reference to the value (only strong references to the dictionary and to the key are needed).

Changes: add a ``getversion(key)`` method to dictionaries which returns ``None`` if the key doesn't exist. When a key is created or modified, the entry version is set to the dictionary version, which is incremented at each change (create, modify, delete).
Pseudo-code of an efficient guard to check if a dict key was modified, using ``getversion()``::

    UNSET = object()

    class Guard:
        def __init__(self, dict, key):
            self.dict = dict
            self.key = key
            self.dict_version = dict.__version__
            self.entry_version = dict.getversion(key)

        def check(self):
            """Return True if the dictionary value did not change."""
            dict_version = self.dict.__version__
            if dict_version == self.dict_version:
                # Fast-path: avoid the dictionary lookup
                return True

            # lookup in the dictionary, but get the entry version,
            # not the value
            entry_version = self.dict.getversion(self.key)
            if entry_version == self.entry_version:
                # another key was modified:
                # cache the new dictionary version
                self.dict_version = dict_version
                return True

            return False

The main drawback of this option is the impact on the memory footprint. It increases the size of each dictionary entry, so the overhead depends on the number of buckets (dictionary entries, used or not yet used). For example, it increases the size of each dictionary entry by 8 bytes on 64-bit systems if we use ``size_t``.

In Python, the memory footprint matters and the trend is to reduce it. Examples:

* `PEP 393 -- Flexible String Representation <https://www.python.org/dev/peps/pep-0393/>`_
* `PEP 412 -- Key-Sharing Dictionary <https://www.python.org/dev/peps/pep-0412/>`_

Add a new dict subtype
----------------------

Add a new ``verdict`` type, subtype of ``dict``. When guards are needed, use the ``verdict`` for namespaces (module namespace, type namespace, instance namespace, etc.) instead of ``dict``. Leave the ``dict`` type unchanged to not add any overhead (memory footprint) when guards are not needed.

Technical issue: a lot of C code in the wild, including CPython core, expects the exact ``dict`` type. Issues:

* ``exec()`` requires a ``dict`` for globals and locals. A lot of code uses ``globals={}``. It is not possible to cast the ``dict`` to a ``dict`` subtype because the caller expects the ``globals`` parameter to be modified (``dict`` is mutable).
* Functions directly call ``PyDict_xxx()`` functions, instead of calling ``PyObject_xxx()`` if the object is a ``dict`` subtype
* The ``PyDict_CheckExact()`` check fails on ``dict`` subtypes, whereas some functions require the exact ``dict`` type.
* ``Python/ceval.c`` does not completely support dict subtypes for namespaces

The ``exec()`` issue is a blocker issue. Other issues:

* The garbage collector has special code to "untrack" ``dict`` instances. If a ``dict`` subtype is used for namespaces, the garbage collector may be unable to break some reference cycles.
* Some functions have a fast-path for ``dict`` which would not be taken for ``dict`` subtypes, and so it would make Python a little bit slower.

Usage of dict.__version__
=========================

astoptimizer of FAT Python
--------------------------

The astoptimizer of the FAT Python project implements many optimizations which require guards on namespaces. Examples:

* Call pure builtins: to replace ``len("abc")`` with ``3``, guards on ``builtins.__dict__['len']`` and ``globals()['len']`` are required
* Loop unrolling: to unroll the loop ``for i in range(...): ...``, guards on ``builtins.__dict__['range']`` and ``globals()['range']`` are required

The `FAT Python `_ project is a static optimizer for Python 3.6.

Pyjion
------

According to Brett Cannon, one of the two main developers of Pyjion, Pyjion can also benefit from dictionary versions to implement optimizations. Pyjion is a JIT compiler for Python based upon CoreCLR (Microsoft .NET Core runtime).
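To make the namespace-guard use case concrete, here is a sketch (assuming the proposed ``__version__`` attribute; the class is illustrative, not part of the PEP) of caching a global variable lookup::

    class GlobalCache:
        """Cache a namespace[key] lookup, invalidated via dict.__version__."""

        def __init__(self, namespace, key):
            self.namespace = namespace
            self.key = key
            self.version = namespace.__version__  # proposed attribute
            self.value = namespace[key]

        def load(self):
            if self.namespace.__version__ != self.version:
                # something changed: re-fetch the value, remember the version
                self.value = self.namespace[self.key]
                self.version = self.namespace.__version__
            return self.value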
Unladen Swallow
---------------

Even if the dictionary version was not explicitly mentioned, optimizing globals and builtins lookups was part of the Unladen Swallow plan: "Implement one of the several proposed schemes for speeding lookups of globals and builtins." Source: `Unladen Swallow ProjectPlan `_.

Unladen Swallow is a fork of CPython 2.6.1 adding a JIT compiler implemented with LLVM. The project stopped in 2011: `Unladen Swallow Retrospective `_.

Prior Art
=========

Cached globals+builtins lookup
------------------------------

In 2006, Andrea Griffini proposed a patch implementing a `Cached globals+builtins lookup optimization `_. The patch adds a private ``timestamp`` field to dict. See the thread on python-dev: `About dictionary lookup caching `_.

Globals / builtins cache
------------------------

In 2010, Antoine Pitrou proposed a `Globals / builtins cache `_ which adds a private ``ma_version`` field to the ``dict`` type. The patch adds a "global and builtin cache" to functions and frames, and changes the ``LOAD_GLOBAL`` and ``STORE_GLOBAL`` instructions to use the cache.

PySizer
-------

`PySizer `_: a memory profiler for Python, a Google Summer of Code 2005 project by Nick Smallbone. This project has a patch for CPython 2.4 which adds ``key_time`` and ``value_time`` fields to dictionary entries. It uses a global process-wide counter for dictionaries, incremented each time that a dictionary is modified. The times are used to decide when child objects first appeared in their parent objects.

Copyright
=========

This document has been placed in the public domain.

--
Victor

From victor.stinner at gmail.com Fri Jan 8 16:31:40 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Fri, 8 Jan 2016 22:31:40 +0100
Subject: [Python-ideas] RFC: PEP: Specialized functions with guards
Message-ID:

Hi,

Here is the second PEP, part of a series of 3 PEPs to add an API to implement a static Python optimizer specializing functions with guards.

HTML version: https://faster-cpython.readthedocs.org/pep_specialize.html

PEP: xxx
Title: Specialized functions with guards
Version: $Revision$
Last-Modified: $Date$
Author: Victor Stinner
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 4-January-2016
Python-Version: 3.6

Abstract
========

Add an API to attach specialized functions with guards to functions, to support static optimizers respecting the Python semantics.

Rationale
=========

Python is hard to optimize because almost everything is mutable: builtin functions, function code, global variables, local variables, ... can be modified at runtime. Implementing optimizations respecting the Python semantics requires detecting when "something changes"; we will call these checks "guards".

This PEP proposes to add a ``specialize()`` method to functions to add specialized functions with guards. When the function is called, the specialized function is used if nothing changed; otherwise the original bytecode is used.

Writing an optimizer is out of the scope of this PEP.
Example
=======

Using bytecode
--------------

Replace ``chr(65)`` with ``"A"``::

    import myoptimizer

    def func():
        return chr(65)

    def fast_func():
        return "A"

    func.specialize(fast_func.__code__, [myoptimizer.GuardBuiltins("chr")])
    del fast_func

    print("func(): %s" % func())
    print("#specialized: %s" % len(func.get_specialized()))
    print()

    import builtins
    builtins.chr = lambda obj: "mock"

    print("func(): %s" % func())
    print("#specialized: %s" % len(func.get_specialized()))

Output::

    func(): A
    #specialized: 1

    func(): mock
    #specialized: 0

The hypothetical ``myoptimizer.GuardBuiltins("chr")`` is a guard on the builtin ``chr()`` function and the ``chr`` name in the global namespace. The guard fails if the builtin function is replaced or if a ``chr`` name is defined in the global namespace.

The first call directly returns the string ``"A"``. The second call removes the specialized function because the builtin ``chr()`` function was replaced, and executes the original bytecode.

On a microbenchmark, calling the specialized function takes 88 ns, whereas the original bytecode takes 145 ns (+57 ns): 1.6 times as fast.

Using builtin function
----------------------

Replace a slow Python function calling ``chr(obj)`` with a direct call to the builtin ``chr()`` function::

    import myoptimizer

    def func(arg):
        return chr(arg)

    func.specialize(chr, [myoptimizer.GuardBuiltins("chr")])

    print("func(65): %s" % func(65))
    print("#specialized: %s" % len(func.get_specialized()))
    print()

    import builtins
    builtins.chr = lambda obj: "mock"

    print("func(65): %s" % func(65))
    print("#specialized: %s" % len(func.get_specialized()))

Output::

    func(65): A
    #specialized: 1

    func(65): mock
    #specialized: 0

The first call directly calls the builtin ``chr()`` function (without creating a Python frame). The second call removes the specialized function because the builtin ``chr()`` function was replaced, and executes the original bytecode.

On a microbenchmark, calling the specialized function takes 95 ns, whereas the original bytecode takes 155 ns (+60 ns): 1.6 times as fast. Calling ``chr(65)`` directly takes 76 ns.

Python Function Call
====================

Pseudo-code to call a Python function having specialized functions with guards::

    def call_func(func, *args, **kwargs):
        # by default, call the regular bytecode
        code = func.__code__.co_code

        specialized = func.get_specialized()
        nspecialized = len(specialized)
        index = 0
        while index < nspecialized:
            guard = specialized[index].guard
            # pass arguments, some guards need them
            check = guard(args, kwargs)
            if check == 1:
                # guard succeeded: we can use the specialized function
                code = specialized[index].code
                break
            elif check == -1:
                # guard will always fail: remove the specialized function
                del specialized[index]
                nspecialized -= 1
            elif check == 0:
                # guard failed temporarily
                index += 1

        # code can be a code object or any callable object
        execute_code(code, args, kwargs)

Changes
=======

* Add two new methods to functions:

  - ``specialize(code, guards: list)``: add a specialized function with guards. `code` is a code object (ex: ``func2.__code__``) or any callable object (ex: ``len``). The specialization can be ignored if a guard already fails.
  - ``get_specialized()``: get the list of specialized functions with guards

* Base ``Guard`` type which can be used as a parent type to implement guards. It requires implementing a ``check()`` function, with an optional ``first_check()`` function.
API:

* ``int check(PyObject *guard, PyObject **stack)``: return 1 on success, 0 if the guard failed temporarily, -1 if the guard will always fail
* ``int first_check(PyObject *guard, PyObject *func)``: return 0 on success, -1 if the guard will always fail

Microbenchmark on ``python3.6 -m timeit -s 'def f(): pass' 'f()'`` (best of 3 runs):

* Original Python: 79 ns
* Patched Python: 79 ns

According to this microbenchmark, the changes have no overhead on calling a Python function without specialization.

Behaviour
=========

When a function's code is replaced (``func.__code__ = new_code``), all specialized functions are removed.

When a function is serialized (by ``marshal`` or ``pickle`` for example), specialized functions and guards are ignored (not serialized).

Copyright
=========

This document has been placed in the public domain.

--
Victor

From guido at python.org Fri Jan 8 18:04:58 2016
From: guido at python.org (Guido van Rossum)
Date: Fri, 8 Jan 2016 15:04:58 -0800
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to support Python 2.7
Message-ID:

At Dropbox we're trying to be good citizens and we're working towards introducing gradual typing (PEP 484) into our Python code bases (several million lines of code). However, that code base is mostly still Python 2.7 and we believe that we should introduce gradual typing first and start working on conversion to Python 3 second (since having static types in the code can help a big refactoring like that).

Since Python 2 doesn't support function annotations we've had to look for alternatives. We considered stub files, a magic codec, docstrings, and additional `# type:` comments. In the end we decided that `# type:` comments are the most robust approach. We've experimented a fair amount with this and we have a proposal for a standard.

The proposal is very simple. Consider the following function with Python 3 annotations:

  def embezzle(self, account: str, funds: int = 1000000, *fake_receipts: str) -> None:
      """Embezzle funds from account using fake receipts."""

An equivalent way to write this in Python 2 is the following:

  def embezzle(self, account, funds=1000000, *fake_receipts):
      # type: (str, int, *str) -> None
      """Embezzle funds from account using fake receipts."""

There are a few details to discuss:

- Every argument must be accounted for, except 'self' (for instance methods) or 'cls' (for class methods). Also the return type is mandatory. If in Python 3 you would omit some argument or the return type, the Python 2 notation should use 'Any'.

- If you're using names defined in the typing module, you must still import them! (There's a backport on PyPI.)

- For `*args` and `**kwds`, put 1 or 2 stars in front of the corresponding type annotation. As with Python 3 annotations, the annotation here denotes the type of the individual argument values, not of the tuple/dict that you receive as the special argument value 'args' or 'kwds'.

- The entire annotation must be one line. (However, see https://github.com/JukkaL/mypy/issues/1102.)

We would like to propose this as a standard (either to be added to PEP 484 or as a new PEP) rather than making it a "proprietary" extension to mypy only, so that others in a similar situation can also benefit.
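A short illustration combining these rules (the function and its types are invented here, not part of the proposal):

  from typing import Any, Dict

  def send_request(self, url, payload, retries=3, *flags, **options):
      # type: (str, Dict[str, Any], int, *str, **Any) -> bool
      """Send payload to url, retrying on failure."""
      return True

Note that 'self' is skipped in the comment, the return type is spelled out, the star notation marks the `*args`/`**kwds` annotations, and the typing names are imported explicitly, per the rules above.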
A brief discussion of the considered alternatives:

- Stub files: this would complicate the analysis in mypy quite a bit, because it would have to parse both the .py file and the .pyi file and somehow combine the information gathered from both, and for each function it would have to use the types from the stub file to type-check the body of the function in the .py file. This would require a lot of additional plumbing. And if we were using Python 3 we would want to use in-line annotations anyway.

- A magic codec was implemented over a year ago (https://github.com/JukkaL/mypy/tree/master/mypy/codec) but after using it for a bit we didn't like it much. It slows down imports, it requires a `# coding: mypy` declaration, it would conflict with pyxl (https://github.com/dropbox/pyxl), things go horribly wrong when the codec isn't installed and registered, other tools would be confused by the Python 3 syntax in Python 2 source code, and because of the way the codec was implemented the Python interpreter would occasionally spit out confusing error messages showing the codec's output (which is pretty bare-bones).

- While there are existing conventions for specifying types in docstrings, we haven't been using any of these conventions (at least not consistently, nor at an appreciable scale), and they are much more verbose if all you want is adding argument annotations. We're working on a tool that automatically adds type annotations[1], and such a tool would be complicated by the need to integrate the generated annotations into existing docstrings (which, in our code base, unfortunately are wildly incongruous in their conventions).

- Finally, the proposed comment syntax is easy to mechanically translate into standard Python 3 function annotations once we're ready to let go of Python 2.7.

__________
[1] I have a prototype of such a tool, implemented as a 2to3 fixer. It's a bit over 200 lines. It's not very interesting yet, since it sets the types of nearly all arguments to 'Any'. We're considering building a much more advanced version that tries to guess much better argument types using some form of whole-program analysis. I've heard that Facebook's Hack project got a lot of mileage out of such a tool. I don't know how to write it yet -- possibly we could use a variant of mypy's type inference engine, or alternatively we might be able to use something like Jedi (https://github.com/davidhalter/jedi).

--
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From rosuav at gmail.com Fri Jan 8 20:00:46 2016
From: rosuav at gmail.com (Chris Angelico)
Date: Sat, 9 Jan 2016 12:00:46 +1100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To:
References:
Message-ID:

On Sat, Jan 9, 2016 at 8:27 AM, Victor Stinner wrote:
> Here is a first PEP, part of a series of 3 PEPs to add an API to
> implement a static Python optimizer specializing functions with
> guards.

Are you intending for these features to become part of the Python core language, or are you discussing this as something that your alternate implementation will do? If the former, send your PEP drafts to peps at python.org and we can get them assigned numbers; if the latter, is there some specific subset of this which *is* for the language core? (For example, MyPy has type checking, but PEP 484 isn't proposing to include that in the core; all it asks is for a 'typing.py' to allow the code to run unchanged.)
ChrisA

From victor.stinner at gmail.com  Fri Jan  8 20:09:39 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Sat, 9 Jan 2016 02:09:39 +0100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To:
References:
Message-ID:

2016-01-09 2:00 GMT+01:00 Chris Angelico :
> Are you intending for these features to become part of the Python core
> language

Yes.

> If the former, send your PEP drafts to
> peps at python.org and we can get them assigned numbers

My plan is to start a first round of discussion on python-ideas, then
get a PEP number for my PEPs before moving the discussion to
python-dev.

Victor

From rosuav at gmail.com  Fri Jan  8 20:38:02 2016
From: rosuav at gmail.com (Chris Angelico)
Date: Sat, 9 Jan 2016 12:38:02 +1100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To:
References:
Message-ID:

On Sat, Jan 9, 2016 at 12:09 PM, Victor Stinner wrote:
> 2016-01-09 2:00 GMT+01:00 Chris Angelico :
>> Are you intending for these features to become part of the Python core
>> language
>
> Yes.
>
>> If the former, send your PEP drafts to
>> peps at python.org and we can get them assigned numbers
>
> My plan is to start a first round of discussion on python-ideas, then
> get a PEP number for my PEPs before moving the discussion to
> python-dev.

The discussion on python-ideas can benefit from PEP numbers too,
particularly since you're putting three separate proposals up. ("Wait,
I know I saw a comment about that. Oh right, that was in PEP 142857,
not 142856.") But it's up to you.

ChrisA

From rosuav at gmail.com  Fri Jan  8 20:42:49 2016
From: rosuav at gmail.com (Chris Angelico)
Date: Sat, 9 Jan 2016 12:42:49 +1100
Subject: [Python-ideas] RFC: PEP: Specialized functions with guards
In-Reply-To:
References:
Message-ID:

On Sat, Jan 9, 2016 at 8:31 AM, Victor Stinner wrote:
> When a function is serialized (by ``marshal`` or ``pickle`` for
> example), specialized functions and guards are ignored (not serialized).
>

Does this mean that any code imported from a .pyc file cannot take
advantage of these kinds of optimizations?

ChrisA

From victor.stinner at gmail.com  Fri Jan  8 20:59:23 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Sat, 9 Jan 2016 02:59:23 +0100
Subject: [Python-ideas] RFC: PEP: Specialized functions with guards
In-Reply-To:
References:
Message-ID:

2016-01-09 2:42 GMT+01:00 Chris Angelico :
> On Sat, Jan 9, 2016 at 8:31 AM, Victor Stinner wrote:
>> When a function is serialized (by ``marshal`` or ``pickle`` for
>> example), specialized functions and guards are ignored (not serialized).
>>
>
> Does this mean that any code imported from a .pyc file cannot take
> advantage of these kinds of optimizations?

Ah yes, this sentence is confusing. It should not mention marshal;
that's wrong. A .pyc file doesn't contain functions: it only contains
code objects. Functions are only created at runtime, and specialized
functions are also added at runtime.

Victor

From victor.stinner at gmail.com  Fri Jan  8 21:01:39 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Sat, 9 Jan 2016 03:01:39 +0100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To:
References:
Message-ID:

2016-01-09 2:38 GMT+01:00 Chris Angelico :
> The discussion on python-ideas can benefit from PEP numbers too,
> particularly since you're putting three separate proposals up. ("Wait,
> I know I saw a comment about that. Oh right, that was in PEP 142857,
> not 142856.") But it's up to you.
Hum, I forgot to mention that I'm not 100% sure yet that I correctly
split my work on the FAT Python project into the right number of PEPs.
Maybe we could merge two PEPs, or a PEP should be split into sub-PEPs
because it requires too many changes (I'm thinking of the third PEP,
not published yet; it's still a "private" draft).

Victor

From steve at pearwood.info  Fri Jan  8 21:08:10 2016
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 9 Jan 2016 13:08:10 +1100
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
	support Python 2.7
In-Reply-To:
References:
Message-ID: <20160109020810.GK10854@ando.pearwood.info>

On Fri, Jan 08, 2016 at 03:04:58PM -0800, Guido van Rossum wrote:

[...]
> Since Python 2 doesn't support function annotations we've had to look
> for alternatives. We considered stub files, a magic codec, docstrings,
> and additional `# type:` comments. In the end we decided that `# type:`
> comments are the most robust approach. We've experimented a fair amount
> with this and we have a proposal for a standard.
[...]
> - Stub files: this would complicate the analysis in mypy quite a bit,
> because it would have to parse both the .py file and the .pyi file and
> somehow combine the information gathered from both, and for each
> function it would have to use the types from the stub file to
> type-check the body of the function in the .py file. This would
> require a lot of additional plumbing. And if we were using Python 3 we
> would want to use in-line annotations anyway.

I don't understand this paragraph. Doesn't mypy (and any other type
checker) have to support stub files? I thought that stub files are
needed for extension modules, among other things. So I would have
expected that any Python 2 type checker would have to support stub
files as well, regardless of whether inline #type comments are
introduced or not.

Will Python 3 type checkers be expected to support #type comments as
well as annotations and stub files?

-- 
Steve

From ncoghlan at gmail.com  Fri Jan  8 23:47:33 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 9 Jan 2016 14:47:33 +1000
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To:
References:
Message-ID:

On 9 January 2016 at 12:01, Victor Stinner wrote:
> 2016-01-09 2:38 GMT+01:00 Chris Angelico :
>> The discussion on python-ideas can benefit from PEP numbers too,
>> particularly since you're putting three separate proposals up. ("Wait,
>> I know I saw a comment about that. Oh right, that was in PEP 142857,
>> not 142856.") But it's up to you.
>
> Hum, I forgot to mention that I'm not 100% sure yet that I correctly
> split my work on the FAT Python project into the right number of PEPs.
> Maybe we could merge two PEPs, or a PEP should be split into sub-PEPs
> because it requires too many changes (I'm thinking of the third PEP,
> not published yet; it's still a "private" draft).

The first two proposals you've posted make sense to consider as
standalone changes, so it seems reasonable to assign them PEP numbers
now rather than waiting.

Cheers,
Nick.
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Sat Jan 9 00:22:04 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 9 Jan 2016 15:22:04 +1000 Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to support Python 2.7 In-Reply-To: <20160109020810.GK10854@ando.pearwood.info> References: <20160109020810.GK10854@ando.pearwood.info> Message-ID: On 9 January 2016 at 12:08, Steven D'Aprano wrote: > On Fri, Jan 08, 2016 at 03:04:58PM -0800, Guido van Rossum wrote: > > [...] >> Since Python 2 doesn't support function annotations we've had to look for >> alternatives. We considered stub files, a magic codec, docstrings, and >> additional `# type:` comments. In the end we decided that `# type:` >> comments are the most robust approach. We've experimented a fair amount >> with this and we have a proposal for a standard. > [...] >> - Stub files: this would complicate the analysis in mypy quite a bit, >> because it would have to parse both the .py file and the .pyi file and >> somehow combine the information gathered from both, and for each function >> it would have to use the types from the stub file to type-check the body of >> the function in the .py file. This would require a lot of additional >> plumbing. And if we were using Python 3 we would want to use in-line >> annotations anyway. > > I don't understand this paragraph. Doesn't mypy (and any other type > checker) have to support stub files? I thought that stub files are > needed for extension files, among other things. So I would have expected > that any Python 2 type checker would have to support stub files as well, > regardless of whether inline #type comments are introduced or not. Stub files are easy to use if you're using them *instead of* the original source file (e.g. annotating extension modules, or typeshed annotations for the standard library). Checking a stub file for consistency against the published API of the corresponding module also seems like it would be straightforward (while using both a stub file *and* inline annotations for the same API seems like it would be a bad idea, it's at least necessary to check that the *shape* of the API matches, even if there's no type information). However, if I'm understanding correctly, the problem Guido is talking about here is a different one: analysing a function *implementation* to ensure it is consistent with its own published API. That's relatively straightforward with inline annotations (whether function annotation based, comment based, or docstring based), but trickier if you have to pause the analysis, go look for the right stub file, load it, determine the expected public API, and then resume the analysis of the original function. The other downside of the stub file approach is the same reason it's not the preferred approach in Python 3: you can't see the annotations yourself when you're working on the function. Folks working mostly on solo and small team projects may not see the appeal of that, but when doing maintenance on large unfamiliar code bases, the improved local reasoning those kinds of inline notes help support can be very helpful. > Will Python 3 type checkers be expected to support #type comments as > well as annotations and stub files? 
#type comment support is already required for variables and attributes:
https://www.python.org/dev/peps/pep-0484/#type-comments

That requirement for type checkers to support comment based type hints
would remain, even if we were to later add native syntactic support for
variable and attribute typing.

I read Guido's proposal here as offering something similar for function
annotations, only going in the other direction: providing a variant
spelling for function type hinting that can be used in single source
Python 2/3 code bases that can't use function annotations.

I don't have a strong opinion on the specifics, but am +1 on the
general idea - I think the approach Dropbox are pursuing of adopting
static type analysis first, and then migrating to Python 3 (or at least
single source Python 2/3 support) second is going to prove to be a
popular one, as it allows you to detect a lot of potential migration
issues without necessarily having to be able to exercise those code
paths in a test running under Python 3.

The 3 kinds of annotation would then have 3 clear function level use
cases:

    stub files: annotating third party libraries (e.g. for typeshed)
    #type comments: annotating single source Python 2/3 code
    function annotations: annotating Python 3 code

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From storchaka at gmail.com  Sat Jan  9 01:03:12 2016
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Sat, 9 Jan 2016 08:03:12 +0200
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To:
References:
Message-ID:

On 08.01.16 23:27, Victor Stinner wrote:
> Add a new read-only ``__version__`` property to ``dict`` and
> ``collections.UserDict`` types, incremented at each change.

This may not be the best name for a property. Many modules already have
a __version__ attribute; this may cause confusion.

> The C code uses ``version++``. The behaviour on integer overflow of the
> version is undefined. The minimum guarantee is that the version always
> changes when the dictionary is modified.

For clarification, this code has defined behavior in C (we should avoid
introducing new undefined behaviors). Maybe you mean that the behavior
is not specified from the Python side (since it is platform and
implementation defined).

> Usage of dict.__version__
> =========================

This also can be used for better detection of dict mutation during
iteration: https://bugs.python.org/issue19332.

From ncoghlan at gmail.com  Sat Jan  9 02:43:41 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 9 Jan 2016 17:43:41 +1000
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To:
References:
Message-ID:

On 9 January 2016 at 16:03, Serhiy Storchaka wrote:
> On 08.01.16 23:27, Victor Stinner wrote:
>>
>> Add a new read-only ``__version__`` property to ``dict`` and
>> ``collections.UserDict`` types, incremented at each change.
>
> This may not be the best name for a property. Many modules already have
> a __version__ attribute; this may cause confusion.

The equivalent API for the global ABC object graph is
abc.get_cache_token:
https://docs.python.org/3/library/abc.html#abc.get_cache_token

One of the reasons we chose that name is that even though it's a
number, the only operation with semantic significance is equality
testing, with the intended use case being cache invalidation when the
token changes value.

If we followed the same reasoning for Victor's proposal, then a
suitable attribute name would be "__cache_token__".

>> The C code uses ``version++``.
>> The behaviour on integer overflow of the version is undefined. The
>> minimum guarantee is that the version always changes when the
>> dictionary is modified.
>
> For clarification, this code has defined behavior in C (we should
> avoid introducing new undefined behaviors). Maybe you mean that the
> behavior is not specified from the Python side (since it is platform
> and implementation defined).

At least in recent versions of the standard*, overflow is defined on
unsigned types as wrapping modulo-N. It only remains formally undefined
for signed types.

*(I'm not sure about C89, but with MSVC getting their standards
compliance act together, we could potentially update our minimum C
version expectation in PEP 7 to C99 or even C11.)

>> Usage of dict.__version__
>> =========================
>
> This also can be used for better detection of dict mutation during
> iteration: https://bugs.python.org/issue19332.

I initially thought the same thing, but the cache token will be updated
even if the keys all stay the same, and one of the values is modified,
while the mutation-during-iteration check is aimed at detecting changes
to the keys, rather than the values.

Cheers,
Nick.

From victor.stinner at gmail.com  Sat Jan  9 02:57:26 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Sat, 9 Jan 2016 08:57:26 +0100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To:
References:
Message-ID:

Le samedi 9 janvier 2016, Serhiy Storchaka a écrit :
> On 08.01.16 23:27, Victor Stinner wrote:
>
>> Add a new read-only ``__version__`` property to ``dict`` and
>> ``collections.UserDict`` types, incremented at each change.
>
> This may not be the best name for a property. Many modules already
> have a __version__ attribute; this may cause confusion.

It's fine to have a __version__ property and a __version__ key in the
same dict. They are different. For a module, it's something like:

With moddict = globals():
- moddict.__version__ is the dict version
- moddict['__version__'] is the module version

Using the same name for different things is not new in Python. An
example still in the module namespace:
- moddict.__class__.__name__ is the dict class name
- moddict['__name__'] is the module name (or '__main__')

"Version" is really my favorite name for the new feature. Sometimes I
saw "timestamp", but in my opinion it's more confusing because it's not
related to a clock.

> The C code uses ``version++``. The behaviour on integer overflow of
> the version is undefined. The minimum guarantee is that the version
> always changes when the dictionary is modified.
>
> For clarification, this code has defined behavior in C (we should
> avoid introducing new undefined behaviors). Maybe you mean that the
> behavior is not specified from the Python side (since it is platform
> and implementation defined).

The C type for version is unsigned (size_t). I hope that version++ is
defined, but I was too lazy to check the C specs for that :-) Does it
wrap to 0 on overflow on all architectures (supported by Python)? If
not, it's easy to wrap manually:

    version = (version == size_max) ? 0 : version + 1;

> Usage of dict.__version__
> =========================
>
> This also can be used for better detection of dict mutation during
> iteration: https://bugs.python.org/issue19332.

Oh, cool.

Victor
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From storchaka at gmail.com  Sat Jan  9 03:55:29 2016
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Sat, 9 Jan 2016 10:55:29 +0200
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To:
References:
Message-ID:

On 09.01.16 09:57, Victor Stinner wrote:
> Le samedi 9 janvier 2016, Serhiy Storchaka
> This may not be the best name for a property. Many modules already
> have a __version__ attribute; this may cause confusion.
>
> It's fine to have a __version__ property and a __version__ key in the
> same dict. They are different.

Oh, I didn't mean confusion between a property and a key, but between
properties of two related objects. Perhaps one day we'll want to add a
property with the same meaning directly to the module object, but the
name is already in use.

> "Version" is really my favorite name for the new feature. Sometimes I
> saw "timestamp", but in my opinion it's more confusing because it's
> not related to a clock.

Nick's "__cache_token__" LGTM.

From storchaka at gmail.com  Sat Jan  9 03:57:42 2016
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Sat, 9 Jan 2016 10:57:42 +0200
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To:
References:
Message-ID:

On 09.01.16 09:43, Nick Coghlan wrote:
> If we followed the same reasoning for Victor's proposal, then a
> suitable attribute name would be "__cache_token__".

LGTM.

>> This also can be used for better detection of dict mutation during
>> iteration: https://bugs.python.org/issue19332.
>
> I initially thought the same thing, but the cache token will be
> updated even if the keys all stay the same, and one of the values is
> modified, while the mutation-during-iteration check is aimed at
> detecting changes to the keys, rather than the values.

This makes Raymond's objections even stronger.

From ncoghlan at gmail.com  Sat Jan  9 04:08:54 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 9 Jan 2016 19:08:54 +1000
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To:
References:
Message-ID:

On 9 January 2016 at 18:55, Serhiy Storchaka wrote:
> On 09.01.16 09:57, Victor Stinner wrote:
>> Le samedi 9 janvier 2016, Serhiy Storchaka
>> This may not be the best name for a property. Many modules already
>> have a __version__ attribute; this may cause confusion.
>>
>> It's fine to have a __version__ property and a __version__ key in the
>> same dict. They are different.
>
> Oh, I didn't mean confusion between a property and a key, but between
> properties of two related objects. Perhaps one day we'll want to add a
> property with the same meaning directly to the module object, but the
> name is already in use.

The confusion I was referring to was yet a third variant of possible
confusion: when people read "version", they're inevitably going to
think "module version" or "package version" (since dealing with those
kinds of versions is a day to day programming activity, regardless of
domain), not "cache validity token" (as "version" in that sense is a
technical term of art most programmers won't have encountered before).

Yes, technically, "version" and "cache validity token" refer to the
same thing in the context of data versioning, but the latter emphasises
what the additional piece of information is primarily *for* in
practical terms (checking if your caches are still valid), rather than
what it *is* in formal terms (the current version of the stored data).

Cheers,
Nick.
-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From victor.stinner at gmail.com  Sat Jan  9 04:18:20 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Sat, 9 Jan 2016 10:18:20 +0100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To:
References:
Message-ID:

Hi Nick,

2016-01-09 8:43 GMT+01:00 Nick Coghlan :
>> For clarification, this code has defined behavior in C (we should
>> avoid introducing new undefined behaviors). Maybe you mean that the
>> behavior is not specified from the Python side (since it is platform
>> and implementation defined).
>
> At least in recent versions of the standard*, overflow is defined on
> unsigned types as wrapping modulo-N. It only remains formally
> undefined for signed types.
>
> *(I'm not sure about C89, but with MSVC getting their standards
> compliance act together, we could potentially update our minimum C
> version expectation in PEP 7 to C99 or even C11.)

Great.

>>> Usage of dict.__version__
>>> =========================
>>
>> This also can be used for better detection of dict mutation during
>> iteration: https://bugs.python.org/issue19332.
>
> I initially thought the same thing, but the cache token will be
> updated even if the keys all stay the same, and one of the values is
> modified, while the mutation-during-iteration check is aimed at
> detecting changes to the keys, rather than the values.

Serhiy's unit test ensures that creating a new key and deleting a key
during an iteration is detected as a dict mutation, even if the dict
size doesn't change. This use case works well with dict.__version__.
Any __setitem__() changes the version (except if the key already exists
and the value is exactly the same, id(old_value) == id(new_value)).
Example:

>>> d = {1: 1}
>>> len(d)
1
>>> d.__version__, len(d)
(1, 1)
>>> d[2] = 2
>>> del d[1]
>>> d.__version__, len(d)
(3, 1)

Changing the value can be detected as well during iteration using
dict.__version__:

>>> d = {1: 1}
>>> d.__version__, len(d)
(1, 1)
>>> d[1] = 2
>>> d.__version__, len(d)
(2, 1)

It would be nice to detect key mutations while iterating on
dict.keys(), but it would also be nice to detect value mutations while
iterating on dict.values() and dict.items(). No?

Victor

From pavol.lisy at gmail.com  Sat Jan  9 04:54:11 2016
From: pavol.lisy at gmail.com (Pavol Lisy)
Date: Sat, 9 Jan 2016 10:54:11 +0100
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
	support Python 2.7
In-Reply-To:
References:
Message-ID:

2016-01-09 0:04 GMT+01:00, Guido van Rossum :
> At Dropbox we're trying to be good citizens and we're working towards
> introducing gradual typing (PEP 484) into our Python code bases
> (several million lines of code). However, that code base is mostly
> still Python 2.7 and we believe that we should introduce gradual
> typing first and start working on conversion to Python 3 second (since
> having static types in the code can help a big refactoring like that).
>
> Since Python 2 doesn't support function annotations we've had to look
> for alternatives. We considered stub files, a magic codec, docstrings,
> and additional `# type:` comments. In the end we decided that
> `# type:` comments are the most robust approach. We've experimented a
> fair amount with this and we have a proposal for a standard.
>
> The proposal is very simple.
> Consider the following function with Python 3 annotations:
>
>     def embezzle(self, account: str, funds: int = 1000000,
>                  *fake_receipts: str) -> None:
>         """Embezzle funds from account using fake receipts."""
>
> An equivalent way to write this in Python 2 is the following:
>
>     def embezzle(self, account, funds=1000000, *fake_receipts):
>         # type: (str, int, *str) -> None
>         """Embezzle funds from account using fake receipts."""
>
> There are a few details to discuss:
>
> - Every argument must be accounted for, except 'self' (for instance
> methods) or 'cls' (for class methods). Also the return type is
> mandatory. If in Python 3 you would omit some argument or the return
> type, the Python 2 notation should use 'Any'.
>
> - If you're using names defined in the typing module, you must still
> import them! (There's a backport on PyPI.)
>
> - For `*args` and `**kwds`, put 1 or 2 stars in front of the
> corresponding type annotation. As with Python 3 annotations, the
> annotation here denotes the type of the individual argument values,
> not of the tuple/dict that you receive as the special argument value
> 'args' or 'kwds'.
>
> - The entire annotation must be one line. (However, see
> https://github.com/JukkaL/mypy/issues/1102.)
>
> We would like to propose this as a standard (either to be added to
> PEP 484 or as a new PEP) rather than making it a "proprietary"
> extension to mypy only, so that others in a similar situation can
> also benefit.
>
> A brief discussion of the considered alternatives:
>
> - Stub files: this would complicate the analysis in mypy quite a bit,
> because it would have to parse both the .py file and the .pyi file
> and somehow combine the information gathered from both, and for each
> function it would have to use the types from the stub file to
> type-check the body of the function in the .py file. This would
> require a lot of additional plumbing. And if we were using Python 3
> we would want to use in-line annotations anyway.
>
> - A magic codec was implemented over a year ago (
> https://github.com/JukkaL/mypy/tree/master/mypy/codec) but after
> using it for a bit we didn't like it much. It slows down imports, it
> requires a `# coding: mypy` declaration, it would conflict with pyxl
> (https://github.com/dropbox/pyxl), things go horribly wrong when the
> codec isn't installed and registered, other tools would be confused
> by the Python 3 syntax in Python 2 source code, and because of the
> way the codec was implemented the Python interpreter would
> occasionally spit out confusing error messages showing the codec's
> output (which is pretty bare-bones).
>
> - While there are existing conventions for specifying types in
> docstrings, we haven't been using any of these conventions (at least
> not consistently, nor at an appreciable scale), and they are much
> more verbose if all you want is adding argument annotations. We're
> working on a tool that automatically adds type annotations[1], and
> such a tool would be complicated by the need to integrate the
> generated annotations into existing docstrings (which, in our code
> base, unfortunately are wildly incongruous in their conventions).
>
> - Finally, the proposed comment syntax is easy to mechanically
> translate into standard Python 3 function annotations once we're
> ready to let go of Python 2.7.
>
> __________
> [1] I have a prototype of such a tool, implemented as a 2to3 fixer.
> It's a bit over 200 lines. It's not very interesting yet, since it
> sets the types of nearly all arguments to 'Any'.
> We're considering building a much more advanced version that tries to
> guess much better argument types using some form of whole-program
> analysis. I've heard that Facebook's Hack project got a lot of mileage
> out of such a tool. I don't know how to write it yet -- possibly we
> could use a variant of mypy's type inference engine, or alternatively
> we might be able to use something like Jedi (
> https://github.com/davidhalter/jedi).
>
> --
> --Guido van Rossum (python.org/~guido)

Couldn't something like this ->

    def embezzle(self, account, funds=1000000, *fake_receipts):
        # def embezzle(self, account: str, funds: int = 1000000, *fake_receipts: str) -> None:
        """Embezzle funds from account using fake receipts."""

make:
1. the transition from Python 2 to Python 3 simpler?
2. Python 3 checkers more easily adaptable to the new Python 2
standard?
3. the impact on documentation simpler (meaning also a simpler
knowledge base to learn) for annotations?

From victor.stinner at gmail.com  Sat Jan  9 04:58:00 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Sat, 9 Jan 2016 10:58:00 +0100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To:
References:
Message-ID:

2016-01-09 9:57 GMT+01:00 Serhiy Storchaka :
>>> This also can be used for better detection of dict mutation during
>>> iteration: https://bugs.python.org/issue19332.
> (...)
>
> This makes Raymond's objections even stronger.

Raymond has two major objections: memory footprint and performance. I
opened an issue with a patch implementing dict.__version__ and I ran
pybench: https://bugs.python.org/issue26058#msg257810

pybench doesn't seem reliable: microbenchmarks on dict seem faster with
the patch, which doesn't make sense. I expected the same or worse
performance.

With my own timeit microbenchmarks, I don't see any slowdown with the
patch. For an unknown reason (it's really strange), dict operations
seem even faster with the patch.

For the memory footprint, it's clearly stated in the PEP that it adds
8 bytes per dict (4 bytes on 32-bit platforms). See the "dict subtype"
section, which explains why I proposed to modify the dict type
directly.

IMHO adding 8 bytes per dict is worth it. See for example
microbenchmarks on func.specialize(), which rely on dict.__version__ to
implement efficient guards on namespaces:
https://faster-cpython.readthedocs.org/pep_specialize.html#example

A 1.6x speedup (155 ns => 95 ns) is much better than the few-percent
gains usually seen when optimizing dict operations.

Victor

From victor.stinner at gmail.com  Sat Jan  9 05:16:38 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Sat, 9 Jan 2016 11:16:38 +0100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To:
References:
Message-ID:

2016-01-09 5:47 GMT+01:00 Nick Coghlan :
> The first two proposals you've posted make sense to consider as
> standalone changes, so it seems reasonable to assign them PEP numbers
> now rather than waiting.

Ok fine, I requested 3 numbers for my first draft PEPs.

Victor

From mal at egenix.com  Sat Jan  9 07:09:13 2016
From: mal at egenix.com (M.-A. Lemburg)
Date: Sat, 9 Jan 2016 13:09:13 +0100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To:
References:
Message-ID: <5690F869.2090704@egenix.com>

On 09.01.2016 10:58, Victor Stinner wrote:
> 2016-01-09 9:57 GMT+01:00 Serhiy Storchaka :
>>>> This also can be used for better detection of dict mutation during
>>>> iteration: https://bugs.python.org/issue19332.
>> (...)
>>
>> This makes Raymond's objections even stronger.
>
> Raymond has two major objections: memory footprint and performance. I
> opened an issue with a patch implementing dict.__version__ and I ran
> pybench: https://bugs.python.org/issue26058#msg257810
>
> pybench doesn't seem reliable: microbenchmarks on dict seem faster
> with the patch, which doesn't make sense. I expected the same or worse
> performance.
>
> With my own timeit microbenchmarks, I don't see any slowdown with the
> patch. For an unknown reason (it's really strange), dict operations
> seem even faster with the patch.

This may well be caused by better memory alignment, which depends on
the CPU you're using.

> For the memory footprint, it's clearly stated in the PEP that it adds
> 8 bytes per dict (4 bytes on 32-bit platforms). See the "dict subtype"
> section, which explains why I proposed to modify the dict type
> directly.

Some questions:

* How would the implementation deal with wrap around of the
version number for fast changing dicts (esp. on 32-bit platforms) ?

* Given that this is an optimization and not meant to be exact
science, why would we need 64 bits worth of version information ?

AFAIK, you only need the version information to be able to
answer the question "did anything change compared to last time
I looked ?".

For an optimization it's good enough to get an answer "yes"
for slow changing dicts and "no" for all other cases. False
negatives don't really hurt. False positives are not allowed.

What you'd need to answer the question is a way for the
code in need of the information to remember the dict
state and then later compare its remembered state
with the now current state of the dict.

dicts could do this with a 16-bit index into an array
of state object slots which are set by the code tracking
the dict.

When it's time to check, the code would simply ask for the
current index value and compare the state object in the
array with the one it had set.

* Wouldn't it be possible to use the hash array itself to
store the state index ?

We could store the state object as a regular key in the dict
and filter this out when accessing the dict. Alternatively,
we could try to use the free slots for storing these state
objects by e.g. declaring a free slot as being NULL or a
pointer to a state object.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Experts (#1, Jan 09 2016)
>>> Python Projects, Coaching and Consulting ...  http://www.egenix.com/
>>> Python Database Interfaces ...           http://products.egenix.com/
>>> Plone/Zope Database Interfaces ...           http://zope.egenix.com/
________________________________________________________________________

::: We implement business ideas - efficiently in both time and costs :::

   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/
                      http://www.malemburg.com/

From mal at egenix.com  Sat Jan  9 07:48:14 2016
From: mal at egenix.com (M.-A. Lemburg)
Date: Sat, 9 Jan 2016 13:48:14 +0100
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
	support Python 2.7
In-Reply-To:
References:
Message-ID: <5691018E.4090006@egenix.com>

On 09.01.2016 00:04, Guido van Rossum wrote:
> Since Python 2 doesn't support function annotations we've had to look
> for alternatives. We considered stub files, a magic codec, docstrings,
> and additional `# type:` comments. In the end we decided that
> `# type:` comments are the most robust approach.
We've experimented a fair amount > with this and we have a proposal for a standard. > > The proposal is very simple. Consider the following function with Python 3 > annotations: > > def embezzle(self, account: str, funds: int = 1000000, *fake_receipts: > str) -> None: > """Embezzle funds from account using fake receipts.""" > > > An equivalent way to write this in Python 2 is the following: > > def embezzle(self, account, funds=1000000, *fake_receipts): > # type: (str, int, *str) -> None > """Embezzle funds from account using fake receipts.""" > By using comments, the annotations would not be available at runtime via an .__annotations__ attribute and every tool would have to implement a parser for extracting them. Wouldn't it be better and more in line with standard Python syntax to use decorators to define them ? @typehint("(str, int, *str) -> None") def embezzle(self, account, funds=1000000, *fake_receipts): """Embezzle funds from account using fake receipts.""" This would work in Python 2 as well and could (optionally) add an .__annotations__ attribute to the function/method, automatically create a type annotations file upon import, etc. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Experts (#1, Jan 09 2016) >>> Python Projects, Coaching and Consulting ... http://www.egenix.com/ >>> Python Database Interfaces ... http://products.egenix.com/ >>> Plone/Zope Database Interfaces ... http://zope.egenix.com/ ________________________________________________________________________ ::: We implement business ideas - efficiently in both time and costs ::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ http://www.malemburg.com/ From ncoghlan at gmail.com Sat Jan 9 08:12:49 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 9 Jan 2016 23:12:49 +1000 Subject: [Python-ideas] RFC: PEP: Add dict.__version__ In-Reply-To: References: Message-ID: On 9 January 2016 at 19:18, Victor Stinner wrote: > It would be nice to detect keys mutation while iteration on > dict.keys(), but it would also be be nice to detect values mutation > while iterating on dict.values() and dict.items(). No? No, because mutating values as you go while iterating over a dictionary is perfectly legal: >>> data = dict.fromkeys(range(5)) >>> for k in data: ... data[k] = k ... >>> for k, v in data.items(): ... data[k] = v ** 2 ... >>> data {0: 0, 1: 1, 2: 4, 3: 9, 4: 16} It's only changing the key in the dict that's problematic, as that's the one that can affect the iteration order, regardless of whether you're emitting keys, values, or both. Raymond did mention that when closing the issue, but it was as an aside in one of his bullet points, rather than as a full example. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From steve at pearwood.info Sat Jan 9 08:21:10 2016 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 10 Jan 2016 00:21:10 +1100 Subject: [Python-ideas] RFC: PEP: Add dict.__version__ In-Reply-To: <5690F869.2090704@egenix.com> References: <5690F869.2090704@egenix.com> Message-ID: <20160109132110.GL10854@ando.pearwood.info> On Sat, Jan 09, 2016 at 01:09:13PM +0100, M.-A. Lemburg wrote: > * Given that this is an optimization and not meant to be exact > science, why would we need 64 bits worth of version information ? 
>
> AFAIK, you only need the version information to be able to
> answer the question "did anything change compared to last time
> I looked ?".
>
> For an optimization it's good enough to get an answer "yes"
> for slow changing dicts and "no" for all other cases.

I don't understand this. The question has nothing to do with how
quickly or slowly the dict has changed, but only with whether or not it
actually has changed. Maybe your dict has been stable for three hours,
except for one change; or it changes a thousand times a second. Either
way, it has still changed.

>> False
>> negatives don't really hurt. False positives are not allowed.

I think you have this backwards. False negatives potentially will
introduce horrible bugs. A false negative means that you fail to notice
when the dict has changed, when it actually has. ("Has the dict
changed?" "No.") The result of that will be to apply the optimization
when you shouldn't, and that is potentially catastrophic (the entirely
wrong function is mysteriously called).

A false positive means you wrongly think the dict has changed when it
hasn't. ("Has the dict changed?" "Yes.") That's still bad, because you
miss out on the possibility of applying the optimization when you
actually could have, but it's not so bad. So false positives (wrongly
thinking the dict has changed when it hasn't) can be permitted, but
false negatives shouldn't be.

> What you'd need to answer the question is a way for the
> code in need of the information to remember the dict
> state and then later compare its remembered state
> with the now current state of the dict.
>
> dicts could do this with a 16-bit index into an array
> of state object slots which are set by the code tracking
> the dict.
>
> When it's time to check, the code would simply ask for the
> current index value and compare the state object in the
> array with the one it had set.

If I've understood that correctly, and I may not have, that will only
detect (some?) insertions and deletions to the dict, but fail to detect
when an existing key has a new value bound.

-- 
Steve

From victor.stinner at gmail.com  Sat Jan  9 08:24:07 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Sat, 9 Jan 2016 14:24:07 +0100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <5690F869.2090704@egenix.com>
References: <5690F869.2090704@egenix.com>
Message-ID:

2016-01-09 13:09 GMT+01:00 M.-A. Lemburg :
> * How would the implementation deal with wrap around of the
> version number for fast changing dicts (esp. on 32-bit platforms) ?

Let me try to do some maths.

haypo at selma$ python3 -m timeit 'd={}' 'for i in range(2**16): d[i]=i'
100 loops, best of 3: 7.01 msec per loop

haypo at selma$ python3
Python 3.4.3 (default, Jun 29 2015, 12:16:01)
>>> t=7.01e-3 / 2**16
>>> t*1e9
106.964111328125

It looks like __setitem__() takes 107 ns on average. I guess that the
number depends a lot on the dictionary size, the number of required
resizes (rehashing all keys), etc. But well, it's just an estimation.

>>> print(datetime.timedelta(seconds=2**32 * t))
0:07:39.407360

With a 32-bit version, less than 8 minutes are enough to hit the
integer overflow if each dict operation changes the dict version and
you modify a dict in a loop.

>>> print(2016 + datetime.timedelta(seconds=2**64 * t)
...       / datetime.timedelta(days=365.25))
64541.02049400773

With a 64-bit version, the situation is very different: the next
overflow will not occur before the year 64 541 :-)

Maybe it's worth using a 64-bit version on 32-bit platforms?
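To make the wrap-around failure mode concrete, here is a toy
pure-Python simulation (the class and the 8-bit counter below are
hypothetical, chosen only to make the overflow easy to trigger; this is
not the proposed C implementation):

    SIZE_MAX = 2 ** 8  # toy 8-bit counter; the proposal uses 32 or 64 bits

    class VersionedDict(dict):
        """Toy dict counting mutations modulo SIZE_MAX."""
        def __init__(self, *args, **kwargs):
            dict.__init__(self, *args, **kwargs)
            self.version = 0
        def __setitem__(self, key, value):
            dict.__setitem__(self, key, value)
            self.version = (self.version + 1) % SIZE_MAX

    d = VersionedDict()
    remembered = d.version          # the guard remembers the version
    for i in range(SIZE_MAX):       # exactly one wrap-around of changes
        d['key'] = i
    assert d.version == remembered  # guard wrongly concludes "no change"

With only 8 bits, 256 modifications are enough to fool a naive equality
check; with 64 bits, the same loop would have to run for centuries.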
Python 3.5 already uses a 64-bit integer on 32-bit platforms to store a
timestamp in the private "pytime" API.

A guard only has a bug on integer overflow if the new version modulo
2^32 (or modulo 2^64) is equal to the old version. The bet is also that
this is "unlikely".

> * Given that this is an optimization and not meant to be exact
> science, why would we need 64 bits worth of version information ?

If a guard says that nothing changed when something did change, that is
a real issue for me. It means that the optimization changes the Python
semantics.

> For an optimization it's good enough to get an answer "yes"
> for slow changing dicts and "no" for all other cases. False
> negatives don't really hurt. False positives are not allowed.

A false negative means that you lose the optimization. It would be
annoying to see server performance degrade after N days because of an
integer overflow :-/ It can be a big issue. How do you choose the
number of servers if performance is not stable?

Victor

From rosuav at gmail.com  Sat Jan  9 08:32:21 2016
From: rosuav at gmail.com (Chris Angelico)
Date: Sun, 10 Jan 2016 00:32:21 +1100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <20160109132110.GL10854@ando.pearwood.info>
References: <5690F869.2090704@egenix.com>
	<20160109132110.GL10854@ando.pearwood.info>
Message-ID:

On Sun, Jan 10, 2016 at 12:21 AM, Steven D'Aprano wrote:
> On Sat, Jan 09, 2016 at 01:09:13PM +0100, M.-A. Lemburg wrote:
>
>> * Given that this is an optimization and not meant to be exact
>> science, why would we need 64 bits worth of version information ?
>>
>> AFAIK, you only need the version information to be able to
>> answer the question "did anything change compared to last time
>> I looked ?".
>>
>> For an optimization it's good enough to get an answer "yes"
>> for slow changing dicts and "no" for all other cases.
>
> I don't understand this. The question has nothing to do with how
> quickly or slowly the dict has changed, but only with whether or not
> it actually has changed. Maybe your dict has been stable for three
> hours, except for one change; or it changes a thousand times a second.
> Either way, it has still changed.
>
>> False
>> negatives don't really hurt. False positives are not allowed.
>
> I think you have this backwards. False negatives potentially will
> introduce horrible bugs. A false negative means that you fail to
> notice when the dict has changed, when it actually has. ("Has the dict
> changed?" "No.") The result of that will be to apply the optimization
> when you shouldn't, and that is potentially catastrophic (the entirely
> wrong function is mysteriously called).
>
> A false positive means you wrongly think the dict has changed when it
> hasn't. ("Has the dict changed?" "Yes.") That's still bad, because you
> miss out on the possibility of applying the optimization when you
> actually could have, but it's not so bad. So false positives (wrongly
> thinking the dict has changed when it hasn't) can be permitted, but
> false negatives shouldn't be.

I think we're getting caught in terminology a bit. The original
question was "why a 64-bit counter". Here's my take on it:

* If the dict has changed but we say it hasn't, this is a critical
failure. M-A L called this a "false positive", which works if the
question is "may we use the optimized version".

* If the dict has changed exactly N times since it was last checked,
where N is the integer wrap-around period of the counter, a naive
counter comparison will show that it has not changed.
Consequently, a small counter is more problematic than a large one. If
the counter has 2**8 states, then collisions will be frequent, and that
would be bad. If it has 2**32 states, then a slow-changing dict will
last longer than any typical run of a program (if it changes, say, once
per second, you get over a century of uptime before it's a problem),
but a fast-changing dict could run into issues (change every
millisecond and you'll run into trouble after a couple of months). A
64-bit counter could handle ridiculously fast mutation (say, every
nanosecond) for a ridiculously long time (hundreds of years).

That's the only way that fast-changing and slow-changing have any
meaning.

ChrisA

From victor.stinner at gmail.com  Sat Jan  9 08:42:22 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Sat, 9 Jan 2016 14:42:22 +0100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <49b26af0-3233-4904-8b85-0106cb38f666@googlegroups.com>
References: <49b26af0-3233-4904-8b85-0106cb38f666@googlegroups.com>
Message-ID:

Hi,

2016-01-09 13:48 GMT+01:00 Neil Girdhar :
> How is this not just a poorer version of PyPy's optimizations?

This is a very good question :-) There are a lot of optimizers in the
wild, mostly JIT compilers. The problem is that most of them are
specific to numerical computations, and the remaining ones are generic
but not widely used. The most advanced and complete fast implementation
of Python is obviously PyPy. I haven't heard of many deployments of
PyPy. For example, PyPy is not used to install OpenStack (a very large
project which has a big number of dependencies). I'm not even sure that
PyPy is the favorite implementation of Python used to run Django, to
give another example of a popular Python application.

PyPy is just amazing in terms of performance, but for an unknown
reason, it hasn't replaced CPython yet. PyPy has some drawbacks: it
only supports Python 2.7 and 3.2 (CPython is at version 3.5), it has
bad performance on the C API, and I have heard that performance is not
as amazing as expected on some applications. PyPy also has a worse
startup time and uses more memory. IMHO the major issue of Python is
the backward compatibility of the C API.

In short, almost all users are stuck on CPython, and CPython implements
close to 0 optimizations (come on, constant folding and dead code
elimination is not what I would call an "optimization" ;-)).

My goal is to fill the hole between CPython (0 optimizations) and PyPy
(the reference for best performance).

I wrote a whole website to explain the status of the Python optimizers
and why I want to write my own optimizer:
https://faster-cpython.readthedocs.org/index.html

> If what you want is optimization, it would be much better to devote
> time to a solution that can potentially yield orders of magnitude
> worth of speedup like PyPy rather than increasing language complexity
> for a minor payoff.

I disagree that my proposed changes increase the "language complexity".
According to early benchmarks, my changes have a negligible impact on
performance. I don't see how adding a read-only __version__ property to
dict makes the Python *language* more complex?

My whole design is based on the idea that my optimizer will be optimal.
You will be free to not use it ;-)

And sorry, I'm not interested in contributing to PyPy.

Victor

From mal at egenix.com  Sat Jan  9 08:50:06 2016
From: mal at egenix.com (M.-A. Lemburg)
Date: Sat, 9 Jan 2016 14:50:06 +0100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <20160109132110.GL10854@ando.pearwood.info>
References: <5690F869.2090704@egenix.com>
	<20160109132110.GL10854@ando.pearwood.info>
Message-ID: <5691100E.9060407@egenix.com>

On 09.01.2016 14:21, Steven D'Aprano wrote:
> On Sat, Jan 09, 2016 at 01:09:13PM +0100, M.-A. Lemburg wrote:
>
>> * Given that this is an optimization and not meant to be exact
>> science, why would we need 64 bits worth of version information ?
>>
>> AFAIK, you only need the version information to be able to
>> answer the question "did anything change compared to last time
>> I looked ?".
>>
>> For an optimization it's good enough to get an answer "yes"
>> for slow changing dicts and "no" for all other cases.
>
> I don't understand this. The question has nothing to do with how
> quickly or slowly the dict has changed, but only with whether or not
> it actually has changed. Maybe your dict has been stable for three
> hours, except for one change; or it changes a thousand times a second.
> Either way, it has still changed.

I was referring to how many versions will likely have passed since the
code querying the dict last looked. Most algorithms won't be interested
in the version number itself, but simply want to know whether the dict
has changed or not.

>> False
>> negatives don't really hurt. False positives are not allowed.
>
> I think you have this backwards.

With "false negatives" I meant: the code says the dict has changed,
even though it has not. With "false positives" I meant the code says
the dict has not changed, even though it has.

But you're right: I should have used more explicit definitions :-)

> False negatives potentially will
> introduce horrible bugs. A false negative means that you fail to
> notice when the dict has changed, when it actually has. ("Has the dict
> changed?" "No.") The result of that will be to apply the optimization
> when you shouldn't, and that is potentially catastrophic (the entirely
> wrong function is mysteriously called).
>
> A false positive means you wrongly think the dict has changed when it
> hasn't. ("Has the dict changed?" "Yes.") That's still bad, because you
> miss out on the possibility of applying the optimization when you
> actually could have, but it's not so bad. So false positives (wrongly
> thinking the dict has changed when it hasn't) can be permitted, but
> false negatives shouldn't be.
>
>> What you'd need to answer the question is a way for the
>> code in need of the information to remember the dict
>> state and then later compare its remembered state
>> with the now current state of the dict.
>>
>> dicts could do this with a 16-bit index into an array
>> of state object slots which are set by the code tracking
>> the dict.
>>
>> When it's time to check, the code would simply ask for the
>> current index value and compare the state object in the
>> array with the one it had set.
>
> If I've understood that correctly, and I may not have, that will only
> detect (some?) insertions and deletions to the dict, but fail to
> detect when an existing key has a new value bound.

This depends on how the state object is managed by the dictionary
implementation. It's currently just a rough idea.

Thinking about this some more, I guess having external code set the
state object would result in potential race conditions, so not a good
plan.
The idea was to add a level of indirection to reduce the memory
overhead, under the assumption that only a few dictionaries will
actually need to report changes.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Experts (#1, Jan 09 2016)
>>> Python Projects, Coaching and Consulting ...  http://www.egenix.com/
>>> Python Database Interfaces ...           http://products.egenix.com/
>>> Plone/Zope Database Interfaces ...           http://zope.egenix.com/
________________________________________________________________________

::: We implement business ideas - efficiently in both time and costs :::

   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/
                      http://www.malemburg.com/

From mistersheik at gmail.com  Sat Jan  9 09:55:08 2016
From: mistersheik at gmail.com (Neil Girdhar)
Date: Sat, 9 Jan 2016 09:55:08 -0500
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To:
References: <49b26af0-3233-4904-8b85-0106cb38f666@googlegroups.com>
Message-ID:

On Sat, Jan 9, 2016 at 8:42 AM, Victor Stinner wrote:
> Hi,
>
> 2016-01-09 13:48 GMT+01:00 Neil Girdhar :
> > How is this not just a poorer version of PyPy's optimizations?
>
> This is a very good question :-) There are a lot of optimizers in the
> wild, mostly JIT compilers. The problem is that most of them are
> specific to numerical computations, and the remaining ones are generic
> but not widely used. The most advanced and complete fast
> implementation of Python is obviously PyPy. I haven't heard of many
> deployments of PyPy. For example, PyPy is not used to install
> OpenStack (a very large project which has a big number of
> dependencies). I'm not even sure that PyPy is the favorite
> implementation of Python used to run Django, to give another example
> of a popular Python application.
>
> PyPy is just amazing in terms of performance, but for an unknown
> reason, it hasn't replaced CPython yet. PyPy has some drawbacks: it
> only supports Python 2.7 and 3.2 (CPython is at version 3.5), it has
> bad performance on the C API, and I have heard that performance is not
> as amazing as expected on some applications. PyPy also has a worse
> startup time and uses more memory. IMHO the major issue of Python is
> the backward compatibility of the C API.
>
> In short, almost all users are stuck on CPython, and CPython
> implements close to 0 optimizations (come on, constant folding and
> dead code elimination is not what I would call an "optimization"
> ;-)).
>
> My goal is to fill the hole between CPython (0 optimizations) and PyPy
> (the reference for best performance).
>
> I wrote a whole website to explain the status of the Python optimizers
> and why I want to write my own optimizer:
> https://faster-cpython.readthedocs.org/index.html

I think this is admirable. I also dream of faster Python. However, we
have a fundamental disagreement about how to get there. You can spend
your whole life adding one or two optimizations a year and Python may
only end up twice as fast as it is now, which would still be dog slow.
A meaningful speedup requires a JIT. So, I question the value of this
kind of change.

> > If what you want is optimization, it would be much better to devote
> > time to a solution that can potentially yield orders of magnitude
> > worth of speedup like PyPy rather than increasing language
> > complexity for a minor payoff.
> I disagree that my proposed changes increase the "language
> complexity". According to early benchmarks, my changes have a
> negligible impact on performance. I don't see how adding a read-only
> __version__ property to dict makes the Python *language* more complex?

It makes it more complex because you're adding a user-facing property.
Every little property adds up in the cognitive load of a language. It
also means that all of the other Python implementations need to follow
suit even if their optimizations work differently.

What is the point of making __version__ an exposed property? Why can't
it be a hidden variable in CPython's underlying implementation of dict?
If some code needs to query __version__ to see if it's changed then
CPython should be the one trying to discover this pattern and
automatically generate the right code. Ultimately, this is just a piece
of a JIT, which is the way this is going to end up.

> My whole design is based on the idea that my optimizer will be
> optimal. You will be free to not use it ;-)
>
> And sorry, I'm not interested in contributing to PyPy.

That's fine, but I think you are probably wasting your time then :) The
"hole between CPython and PyPy" disappears as soon as PyPy catches up
to CPython 3.5 with numpy, and then all of this work goes with it.

> Victor
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ram at rachum.com  Sat Jan  9 10:13:38 2016
From: ram at rachum.com (Ram Rachum)
Date: Sat, 9 Jan 2016 17:13:38 +0200
Subject: [Python-ideas] More friendly access to chmod
Message-ID:

Hi everyone,

What do you think about enabling a more friendly interface to chmod
information in Python? I believe that currently if I want to get chmod
information from a file, I need to do this:

    my_path.stat().st_mode & 0o777

(I'm using `pathlib`.)

(If there's a nicer way than this, please let me know.)

This sucks. And the result is then a number, like 511, which you then
have to call `oct` on to get 0o777.

I'm not even happy with getting the octal number. For some of us who
live and breathe Linux, seeing a number like 0o440 might be
crystal-clear, since your mind automatically translates that to the
permissions that user/group/others have, but I haven't reached that
level. I would really like an object-oriented approach to chmod, like
an object which I can ask "Does group have execute permissions?" and
say "Please add read permissions to everyone" etc. Just because Linux
speaks in code doesn't mean that we need to.

And of course, I'd want that on the `pathlib` module so I could do it
all on the path object without referencing another module.

What do you think?

Ram.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From steve at pearwood.info  Sat Jan  9 10:39:31 2016
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 10 Jan 2016 02:39:31 +1100
Subject: [Python-ideas] More friendly access to chmod
In-Reply-To:
References:
Message-ID: <20160109153931.GR10854@ando.pearwood.info>

On Sat, Jan 09, 2016 at 05:13:38PM +0200, Ram Rachum wrote:
> Hi everyone,
>
> What do you think about enabling a more friendly interface to chmod
> information in Python?

I think that would make an awesome tool added to your own personal
toolbox. Once you are satisfied that it works well, then it would be
really good to release it to the public as a third-party library or
recipe on ActiveState or similar.

And then we can talk about whether or not it belongs in the stdlib.
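Just to be concrete about the kind of recipe such a library might start
from, here is a rough sketch (every name below is made up for
illustration; nothing like this exists in the stdlib today):

    import os
    import stat

    _BITS = {
        ('user', 'r'): stat.S_IRUSR, ('user', 'w'): stat.S_IWUSR,
        ('user', 'x'): stat.S_IXUSR,
        ('group', 'r'): stat.S_IRGRP, ('group', 'w'): stat.S_IWGRP,
        ('group', 'x'): stat.S_IXGRP,
        ('other', 'r'): stat.S_IROTH, ('other', 'w'): stat.S_IWOTH,
        ('other', 'x'): stat.S_IXOTH,
    }

    class Permissions(object):
        """Read-only, human-friendly view over a st_mode value."""
        def __init__(self, mode):
            self._mode = mode
        def allows(self, who, what):
            """E.g. allows('group', 'x') -> True/False."""
            return bool(self._mode & _BITS[(who, what)])

    perms = Permissions(os.stat('.').st_mode)
    print(perms.allows('group', 'x'))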
> And of course, I'd want that on the `pathlib` module so I could do it all
> on the path object without referencing another module.

What's wrong with referencing other modules?

-- 
Steve

From ram at rachum.com  Sat Jan  9 10:41:19 2016
From: ram at rachum.com (Ram Rachum)
Date: Sat, 9 Jan 2016 17:41:19 +0200
Subject: [Python-ideas] More friendly access to chmod
In-Reply-To: <20160109153931.GR10854@ando.pearwood.info>
References: 
	<20160109153931.GR10854@ando.pearwood.info>
Message-ID: 

On Sat, Jan 9, 2016 at 5:39 PM, Steven D'Aprano wrote:

> On Sat, Jan 09, 2016 at 05:13:38PM +0200, Ram Rachum wrote:
> > Hi everyone,
> >
> > What do you think about enabling a more friendly interface to chmod
> > information in Python?
>
> I think that would make an awesome tool added to your own personal
> toolbox. Once you are satisfied that it works well, then it would be
> really good to release it to the public as a third-party library or
> recipe on ActiveState or similar.
>
> And then we can talk about whether or not it belongs in the stdlib.
>

Okay. I'm working on it now, we'll see how it goes.


> > And of course, I'd want that on the `pathlib` module so I could do it all
> > on the path object without referencing another module.
>
> What's wrong with referencing other modules?
>

Not wrong, just desirable to avoid. For example, I think that doing
`path.chmod(x)` is preferable to `os.chmod(path, x)`.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From rosuav at gmail.com  Sat Jan  9 10:59:41 2016
From: rosuav at gmail.com (Chris Angelico)
Date: Sun, 10 Jan 2016 02:59:41 +1100
Subject: [Python-ideas] More friendly access to chmod
In-Reply-To: 
References: 
Message-ID: 

On Sun, Jan 10, 2016 at 2:13 AM, Ram Rachum wrote:
>
> What do you think about enabling a more friendly interface to chmod
> information in Python? I believe that currently if I want to get chmod
> information from a file, I need to do this:
>
> my_path.stat().st_mode & 0o777
>
> (I'm using `pathlib`.)
>
> (If there's a nicer way than this, please let me know.)

Have you looked at the 'stat' module? At the very least, you can ask
questions like "Does group have execute permissions?" like this:

my_path.stat().st_mode & stat.S_IXGRP

You can also get a printable rwxrwxrwx with:

stat.filemode(my_path.stat().st_mode)

ChrisA

From ram at rachum.com  Sat Jan  9 11:06:57 2016
From: ram at rachum.com (Ram Rachum)
Date: Sat, 9 Jan 2016 18:06:57 +0200
Subject: [Python-ideas] More friendly access to chmod
In-Reply-To: 
References: 
Message-ID: 

On Sat, Jan 9, 2016 at 5:59 PM, Chris Angelico wrote:

> On Sun, Jan 10, 2016 at 2:13 AM, Ram Rachum wrote:
> >
> > What do you think about enabling a more friendly interface to chmod
> > information in Python? I believe that currently if I want to get chmod
> > information from a file, I need to do this:
> >
> > my_path.stat().st_mode & 0o777
> >
> > (I'm using `pathlib`.)
> >
> > (If there's a nicer way than this, please let me know.)
>
> Have you looked at the 'stat' module? At the very least, you can ask
> questions like "Does group have execute permissions?" like this:
>
> my_path.stat().st_mode & stat.S_IXGRP
>
> You can also get a printable rwxrwxrwx with:
>
> stat.filemode(my_path.stat().st_mode)
>

Thanks for the reference. Personally I think that `my_path.stat().st_mode &
stat.S_IXGRP` is not human-readable enough. I'll work on a nicer API.
Probably something like this for the same action you described:

'x' in my_path.chmod()['g']

> ChrisA
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ericfahlgren at gmail.com  Sat Jan  9 11:09:26 2016
From: ericfahlgren at gmail.com (Eric Fahlgren)
Date: Sat, 9 Jan 2016 08:09:26 -0800
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
	support Python 2.7
In-Reply-To: 
References: 
Message-ID: <007001d14af8$1dcdb3a0$59691ae0$@gmail.com>

Pavol Lisy, Saturday, January 09, 2016 01:54:
> Could not something like this ->
>
> def embezzle(self, account, funds=1000000, *fake_receipts):
>     # def embezzle(self, account: str, funds: int = 1000000, *fake_receipts: str) -> None:
>     """Embezzle funds from account using fake receipts."""
>
> make
> 1. the transition from python2 to python3 simpler?
> 2. python3 checkers more easily changeable to understand the new python2
> standard?
> 3. a simpler impact on documentation (meaning a simpler knowledge base to
> be learned) for annotations?

+1 on this, which is close to what I've been doing for a while now.

4. Educates people who have only seen Py2 prototypes to recognize what the
Py3 annotations look like.

From rosuav at gmail.com  Sat Jan  9 11:11:13 2016
From: rosuav at gmail.com (Chris Angelico)
Date: Sun, 10 Jan 2016 03:11:13 +1100
Subject: [Python-ideas] More friendly access to chmod
In-Reply-To: 
References: 
Message-ID: 

On Sun, Jan 10, 2016 at 3:06 AM, Ram Rachum wrote:
> Thanks for the reference. Personally I think that `my_path.stat().st_mode &
> stat.S_IXGRP` is not human-readable enough. I'll work on a nicer API.
> Probably something like this for the same action you described:
>
> 'x' in my_path.chmod()['g']
>

Okay. I'm not sure how popular that'll be, but sure.

As an alternative API, you could have it return a tuple of permission
strings, which you'd use thus:

'gx' in my_path.mode()   # Group eXecute permission is set

But scratch your own itch, and don't give in to the armchair advisers.

ChrisA

From victor.stinner at gmail.com  Sat Jan  9 11:27:31 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Sat, 9 Jan 2016 17:27:31 +0100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: 
References: 
	<49b26af0-3233-4904-8b85-0106cb38f666@googlegroups.com>
Message-ID: 

On Saturday, January 9, 2016, Neil Girdhar wrote:
>
> On Sat, Jan 9, 2016 at 8:42 AM, Victor Stinner
> wrote:
>>
>> I wrote a whole website to explain the status of the Python optimizers
>> and why I want to write my own optimizer:
>> https://faster-cpython.readthedocs.org/index.html
>
>
> I think this is admirable. I also dream of faster Python. However, we
> have a fundamental disagreement about how to get there. You can spend your
> whole life adding one or two optimizations a year and Python may only end
> up twice as fast as it is now, which would still be dog slow. A meaningful
> speedup requires a JIT. So, I question the value of this kind of change.
>

There are multiple JIT compilers for Python actively developed: PyPy,
Pyston, Pyjion, Numba (numerical computation), etc.

I don't think that my work will slow down these projects. I hope that it
will create more competition and that we will cooperate. For example, I am
in contact with a Pythran developer who told me that my PEPs will help his
project.
As I wrote in the dict.__version__ PEP, the dictionary version will also be
useful for Pyjion according to Brett Cannon.

But Antoine Pitrou told me that the dictionary version will not help Numba.
Numba doesn't use dictionaries and already has its own efficient
implementation for guards.

> What is the point of making __version__ an exposed property?
>

Hum, technically I don't need it at the Python level. Guards are
implemented in C and directly access the field in the structure.

Having the property in Python helps to write unit tests, to write
prototypes (experiment with new things), etc.

> That's fine, but I think you are probably wasting your time then :) The
> "hole between CPython and PyPy" disappears as soon as PyPy catches up to
> CPython 3.5 with numpy, and then all of this work goes with it.
>

PyPy has been around for many years, but it's still not widely used. Maybe
PyPy has drawbacks and the speedup is not enough to convince users to use
it? I'm not sure that Python 3.5 support will make PyPy immediately more
popular. Users still widely use Python 2 in practice.

Yes, better and faster numpy will help PyPy.

Victor
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From mahmoud at hatnote.com  Sat Jan  9 11:51:18 2016
From: mahmoud at hatnote.com (Mahmoud Hashemi)
Date: Sat, 9 Jan 2016 08:51:18 -0800
Subject: [Python-ideas] More friendly access to chmod
In-Reply-To: 
References: 
Message-ID: 

I think it's a pretty common itch! Have you seen the boltons
implementation?
http://boltons.readthedocs.org/en/latest/fileutils.html#file-permissions

Mahmoud
github.com/mahmoud

On Sat, Jan 9, 2016 at 8:11 AM, Chris Angelico wrote:

> On Sun, Jan 10, 2016 at 3:06 AM, Ram Rachum wrote:
> > Thanks for the reference. Personally I think that
> `my_path.stat().st_mode &
> > stat.S_IXGRP` is not human-readable enough. I'll work on a nicer API.
> > Probably something like this for the same action you described:
> >
> > 'x' in my_path.chmod()['g']
> >
>
> Okay. I'm not sure how popular that'll be, but sure.
>
> As an alternative API, you could have it return a tuple of permission
> strings, which you'd use thus:
>
> 'gx' in my_path.mode()   # Group eXecute permission is set
>
> But scratch your own itch, and don't give in to the armchair advisers.
>
> ChrisA
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From mistersheik at gmail.com  Sat Jan  9 12:01:51 2016
From: mistersheik at gmail.com (Neil Girdhar)
Date: Sat, 9 Jan 2016 12:01:51 -0500
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: 
References: 
	<49b26af0-3233-4904-8b85-0106cb38f666@googlegroups.com>
Message-ID: 

On Sat, Jan 9, 2016 at 11:27 AM, Victor Stinner wrote:

>
>
> On Saturday, January 9, 2016, Neil Girdhar wrote:
>>
>> On Sat, Jan 9, 2016 at 8:42 AM, Victor Stinner
>> wrote:
>>>
>>> I wrote a whole website to explain the status of the Python optimizers
>>> and why I want to write my own optimizer:
>>> https://faster-cpython.readthedocs.org/index.html
>>
>>
>> I think this is admirable. I also dream of faster Python. However, we
>> have a fundamental disagreement about how to get there. You can spend your
>> whole life adding one or two optimizations a year and Python may only end
>> up twice as fast as it is now, which would still be dog slow.
>> A meaningful speedup requires a JIT. So, I question the value of this
>> kind of change.
>>
> There are multiple JIT compilers for Python actively developed: PyPy,
> Pyston, Pyjion, Numba (numerical computation), etc.
>
> I don't think that my work will slow down these projects. I hope that it
> will create more competition and that we will cooperate. For example, I am
> in contact with a Pythran developer who told me that my PEPs will help his
> project. As I wrote in the dict.__version__ PEP, the dictionary version
> will also be useful for Pyjion according to Brett Cannon.
>
> But Antoine Pitrou told me that the dictionary version will not help Numba.
> Numba doesn't use dictionaries and already has its own
> efficient implementation for guards.
>
>
>> What is the point of making __version__ an exposed property?
>>
>
> Hum, technically I don't need it at the Python level. Guards are
> implemented in C and directly access the field in the structure.
>
> Having the property in Python helps to write unit tests, to write
> prototypes (experiment with new things), etc.
>

I understand what you mean, but if you can do this without changing the
language, I think that would be better. Isn't it still possible to write
your unit tests against the same C interface that exposes "version"? Then
the language would stay the same, but CPython would be faster, which is
what you wanted.

>
>
>> That's fine, but I think you are probably wasting your time then :) The
>> "hole between CPython and PyPy" disappears as soon as PyPy catches up to
>> CPython 3.5 with numpy, and then all of this work goes with it.
>>
>
> PyPy has been around for many years, but it's still not widely used. Maybe
> PyPy has drawbacks and the speedup is not enough to convince users to use
> it? I'm not sure that Python 3.5 support will make PyPy immediately more
> popular. Users still widely use Python 2 in practice.
>
> Yes, better and faster numpy will help PyPy.
>
> Victor

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From jim.baker at python.org  Sat Jan  9 13:52:39 2016
From: jim.baker at python.org (Jim Baker)
Date: Sat, 9 Jan 2016 11:52:39 -0700
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
	support Python 2.7
In-Reply-To: <007001d14af8$1dcdb3a0$59691ae0$@gmail.com>
References: 
	<007001d14af8$1dcdb3a0$59691ae0$@gmail.com>
Message-ID: 

+1, I would really like to try out type annotation support in Jython,
given the potential for tying in with Java as a source of type annotations
(basically the equivalent of stubs for free). I'm planning on sprinting on
Jython 3 at PyCon, but let's face it, that's going to take a while to
really finish.

Re the two approaches, both are workable with Jython:

* lib2to3 is something we should support in Jython 2.7. There are a couple
of data files that we don't support in the tests (too large of a method for
Java bytecode in infinite_recursion.py, not terribly interesting), plus a
few other tests that should work. Therefore lib2to3 should be in the next
release (2.7.1).

* Jedi now works with the last commit to Jython 2.7 trunk, passing whatever
it means to run random tests using its sith script against its source.
(The sith test does not pass with either CPython or Jython's stdlib,
starting with bad_coding.py.)
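To illustrate what I mean by stubs for free: a Java signature like
java.util.Random.nextInt(int) already carries everything a checker needs,
so Jython could surface it as if someone had written this stub by hand (a
hypothetical rendering -- nothing is implemented yet):

    class Random:
        def nextInt(self, bound):
            # type: (int) -> int
            ...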
- Jim

On Sat, Jan 9, 2016 at 9:09 AM, Eric Fahlgren wrote:

> Pavol Lisy, Saturday, January 09, 2016 01:54:
> > Could not something like this ->
> >
> > def embezzle(self, account, funds=1000000, *fake_receipts):
> >     # def embezzle(self, account: str, funds: int = 1000000,
> *fake_receipts: str) -> None:
> >     """Embezzle funds from account using fake receipts."""
> >
> > make
> > 1. the transition from python2 to python3 simpler?
> > 2. python3 checkers more easily changeable to understand the new python2
> standard?
> > 3. a simpler impact on documentation (meaning a simpler knowledge base
> to be learned) for annotations?
>
> +1 on this, which is close to what I've been doing for a while now.
>
> 4. Educates people who have only seen Py2 prototypes to recognize what the
> Py3 annotations look like.
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From guido at python.org  Sat Jan  9 14:30:50 2016
From: guido at python.org (Guido van Rossum)
Date: Sat, 9 Jan 2016 11:30:50 -0800
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
	support Python 2.7
In-Reply-To: 
References: 
Message-ID: 

On Sat, Jan 9, 2016 at 1:54 AM, Pavol Lisy wrote:

> Could not something like this ->
>
> def embezzle(self, account, funds=1000000, *fake_receipts):
>     # def embezzle(self, account: str, funds: int = 1000000,
> *fake_receipts: str) -> None:
>     """Embezzle funds from account using fake receipts."""
>
> make
> 1. the transition from python2 to python3 simpler?
> 2. python3 checkers more easily changeable to understand the new python2
> standard?
> 3. a simpler impact on documentation (meaning a simpler knowledge base to
> be learned) for annotations?
>

There would still have to be some marker like "# type:" for the type
checker to recognize -- I'm sure that plain comments with alternate 'def'
statements are pretty common and we really don't want the type checker to
be confused by those.

I don't like that the form you propose has so much repetition -- the
design of Python 3 annotations is intentionally the least redundant
possible, and my (really Jukka's) proposal tries to keep that property.

Modifying type checkers to support this syntax is easy (Jukka already did
it for mypy).

Note that type checkers already have to parse the source code without the
help of Python's ast module, because there are other things in comments:
PEP 484 specifies variable annotations and a few forms of `# type: ignore`
comments.

Regarding the idea of a decorator, this was discussed and rejected for the
original PEP 484 proposal as well. The problem is similar to that with
your 'def' proposal: too verbose. Also a decorator is more expensive
(we're envisioning adding many thousands of decorators, and it would weigh
down program startup). We don't envision needing to introspect
__annotations__ at run time. (Also, we already use decorators quite
heavily -- introducing a @typehint decorator would make the code less
readable due to excessive stacking of decorators.)
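To make the comparison concrete, this is roughly how the two spellings
stack up -- the @typehint name is hypothetical, sketching the rejected
idea, not anything that was ever specified:

    # Rejected decorator spelling: verbose, and it runs at import time.
    # (The third type would apply to each value in *fake_receipts.)
    @typehint(str, int, str, returns=None)
    def embezzle(self, account, funds=1000000, *fake_receipts):
        """Embezzle funds from account using fake receipts."""

    # Proposed comment spelling: zero runtime cost, seen only by checkers.
    def embezzle(self, account, funds=1000000, *fake_receipts):
        # type: (str, int, *str) -> None
        """Embezzle funds from account using fake receipts."""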
Our needs at Dropbox are several: first, we want to add annotations to the
code so that new engineers can learn their way around the code quicker and
refactoring will be easier; second, we want to automatically check
conformance to the annotations as part of our code review and continuous
integration processes (this is where mypy comes in); third, once we have
annotated enough of the code we want to start converting it to Python 3
with as much automation as is feasible. The latter part is as yet unproven,
but there's got to be a better way than manually checking the output of
2to3 (whose main weakness is that it does not know the types of variables).
We see many benefits of annotations and automatically checking them using
mypy -- but we don't want them to affect the runtime at all.

-- 
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From brett at python.org  Sat Jan  9 14:32:45 2016
From: brett at python.org (Brett Cannon)
Date: Sat, 09 Jan 2016 19:32:45 +0000
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: 
References: 
	<49b26af0-3233-4904-8b85-0106cb38f666@googlegroups.com>
Message-ID: 

On Sat, 9 Jan 2016 at 07:04 Neil Girdhar wrote:

> On Sat, Jan 9, 2016 at 8:42 AM, Victor Stinner
> wrote:
>
>> Hi,
>>
>> 2016-01-09 13:48 GMT+01:00 Neil Girdhar :
>> > How is this not just a poorer version of PyPy's optimizations?
>>
>> This is a very good question :-) There are a lot of optimizers in the
>> wild, mostly JIT compilers. The problem is that most of them are
>> specific to numerical computations, and the remaining ones are generic
>> but not widely used. The most advanced and complete fast
>> implementation of Python is obviously PyPy. I haven't heard of many
>> deployments using PyPy. For example, PyPy is not used to install
>> OpenStack (a very large project with a big number of
>> dependencies). I'm not even sure that PyPy is the favorite
>> implementation of Python used to run Django, to give another example
>> of a popular Python application.
>>
>> PyPy is just amazing in terms of performance, but for an unknown
>> reason, it hasn't replaced CPython yet. PyPy has some drawbacks: it
>> only supports Python 2.7 and 3.2 (CPython is at version 3.5), it
>> has poor performance on the C API, and I've heard that performance is
>> not as amazing as expected on some applications. PyPy also has a worse
>> startup time and uses more memory. IMHO the major issue of Python is
>> the backward compatibility of the C API.
>>
>> In short, almost all users are stuck on CPython, and CPython implements
>> close to 0 optimizations (come on, constant folding and dead code
>> elimination is not what I would call an "optimization" ;-)).
>>
>> My goal is to fill the hole between CPython (0 optimizations) and PyPy
>> (the reference for best performance).
>>
>> I wrote a whole website to explain the status of the Python optimizers
>> and why I want to write my own optimizer:
>> https://faster-cpython.readthedocs.org/index.html
>
>
> I think this is admirable. I also dream of faster Python. However, we
> have a fundamental disagreement about how to get there. You can spend your
> whole life adding one or two optimizations a year and Python may only end
> up twice as fast as it is now, which would still be dog slow. A meaningful
> speedup requires a JIT. So, I question the value of this kind of change.
>

Obviously a JIT can help, but even a JIT can benefit from this.
For instance, Pyjion could rely on this instead of creating our own guards
for built-in and global namespaces if we wanted to inline calls to certain
built-ins.

>
>>
>> > If what you want is optimization, it would be much better to devote
>> time to a solution
>> > that can potentially yield orders of magnitude worth of speedup like
>> PyPy
>> > rather than increasing language complexity for a minor payoff.
>>
>> I disagree that my proposed changes increase the "language
>> complexity". According to early benchmarks, my changes have a
>> negligible impact on performance. I don't see how adding a read-only
>> __version__ property to dict makes the Python *language* more complex?
>>
>
> It makes it more complex because you're adding a user-facing property.
> Every little property adds up in the cognitive load of a language. It also
> means that all of the other Python implementations need to follow suit even
> if their optimizations work differently.
>
> What is the point of making __version__ an exposed property? Why can't it
> be a hidden variable in CPython's underlying implementation of dict? If
> some code needs to query __version__ to see if it's changed then CPython
> should be the one trying to discover this pattern and automatically
> generate the right code. Ultimately, this is just a piece of a JIT, which
> is the way this is going to end up.
>
>> My whole design is based on the idea that my optimizer will be
>> optional. You will be free to not use it ;-)
>>
>> And sorry, I'm not interested in contributing to PyPy.
>>
>
> That's fine, but I think you are probably wasting your time then :) The
> "hole between CPython and PyPy" disappears as soon as PyPy catches up to
> CPython 3.5 with numpy, and then all of this work goes with it.
>

That doesn't solve the C API compatibility problem, nor other issues some
people have with PyPy deployments (e.g., inconsistent performance that
can't necessarily be relied upon).
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From brett at python.org  Sat Jan  9 14:46:51 2016
From: brett at python.org (Brett Cannon)
Date: Sat, 09 Jan 2016 19:46:51 +0000
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
	support Python 2.7
In-Reply-To: 
References: 
Message-ID: 

On Sat, 9 Jan 2016 at 11:31 Guido van Rossum wrote:

> On Sat, Jan 9, 2016 at 1:54 AM, Pavol Lisy wrote:
>
>> Could not something like this ->
>>
>> def embezzle(self, account, funds=1000000, *fake_receipts):
>>     # def embezzle(self, account: str, funds: int = 1000000,
>> *fake_receipts: str) -> None:
>>     """Embezzle funds from account using fake receipts."""
>>
>> make
>> 1. the transition from python2 to python3 simpler?
>> 2. python3 checkers more easily changeable to understand the new python2
>> standard?
>> 3. a simpler impact on documentation (meaning a simpler knowledge base to
>> be learned) for annotations?
>>
>
> There would still have to be some marker like "# type:" for the type
> checker to recognize -- I'm sure that plain comments with alternate 'def'
> statements are pretty common and we really don't want the type checker to
> be confused by those.
>
> I don't like that the form you propose has so much repetition -- the
> design of Python 3 annotations is intentionally the least redundant
> possible, and my (really Jukka's) proposal tries to keep that property.
>
> Modifying type checkers to support this syntax is easy (Jukka already did
> it for mypy).
>
> Note that type checkers already have to parse the source code without the
> help of Python's ast module, because there are other things in comments:
> PEP 484 specifies variable annotations and a few forms of `# type: ignore`
> comments.
>
> Regarding the idea of a decorator, this was discussed and rejected for the
> original PEP 484 proposal as well. The problem is similar to that with
> your 'def' proposal: too verbose. Also a decorator is more expensive
> (we're envisioning adding many thousands of decorators, and it would weigh
> down program startup). We don't envision needing to introspect
> __annotations__ at run time. (Also, we already use decorators quite
> heavily -- introducing a @typehint decorator would make the code less
> readable due to excessive stacking of decorators.)
>
> Our needs at Dropbox are several: first, we want to add annotations to the
> code so that new engineers can learn their way around the code quicker and
> refactoring will be easier; second, we want to automatically check
> conformance to the annotations as part of our code review and continuous
> integration processes (this is where mypy comes in); third, once we have
> annotated enough of the code we want to start converting it to Python 3
> with as much automation as is feasible. The latter part is as yet
> unproven, but there's got to be a better way than manually checking the
> output of 2to3 (whose main weakness is that it does not know the types of
> variables). We see many benefits of annotations and automatically checking
> them using mypy -- but we don't want them to affect the runtime at all.
>

To help answer the question about whether this could help with porting
code to Python 3, the answer is "yes"; it's not essential but it would
definitely be helpful. Between Modernize, pylint, `python2.7 -3`, and
`python3 -bb` you cover almost all of the issues that can arise in moving
to Python 3. But notice that half of those tools work by running your code
under an interpreter with a certain flag flipped, which means run-time
checks that require excellent test coverage. With type annotations you can
do offline, static checking which is less reliant on your tests covering
all corner cases.

Depending on how the tools choose to handle representing str/unicode in
Python 2/3 code (i.e., say that if you specify the type as 'str' it's an
error and anything that is 'unicode' is considered the 'str' type in
Python 3?), I don't see why mypy can't have a 2/3 compatibility mode that
warns against uses of, e.g. the bytes type that don't directly translate
between Python 2 and 3, like indexing. That kind of static warning would
definitely be beneficial to anyone moving their code over as they wouldn't
need to rely on e.g., `python3 -bb` and their tests to catch that common
issue with bytes and indexing.

There is also the benefit of gradual porting with this kind of offline
checking. Since you can slowly add more type information, you can slowly
catch more issues in your code. Relying on `python3 -bb`, though, requires
you to have ported all of your code over first before running it under
Python 3 to catch some issues.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From tjreedy at udel.edu  Sat Jan  9 15:24:10 2016
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat, 9 Jan 2016 15:24:10 -0500
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
	support Python 2.7
In-Reply-To: 
References: 
Message-ID: 

On 1/8/2016 6:04 PM, Guido van Rossum wrote:

> At Dropbox we're trying to be good citizens and we're working towards
> introducing gradual typing (PEP 484) into our Python code bases (several
> million lines of code). However, that code base is mostly still Python
> 2.7 and we believe that we should introduce gradual typing first and
> start working on conversion to Python 3 second (since having static
> types in the code can help a big refactoring like that).
>
> Since Python 2 doesn't support function annotations we've had to look
> for alternatives. We considered stub files, a magic codec, docstrings,
> and additional `# type:` comments. In the end we decided that `# type:`
> comments are the most robust approach. We've experimented a fair amount
> with this and we have a proposal for a standard.
>
> The proposal is very simple. Consider the following function with Python
> 3 annotations:
>
>     def embezzle(self, account: str, funds: int = 1000000,
>                  *fake_receipts: str) -> None:
>         """Embezzle funds from account using fake receipts."""
>
> An equivalent way to write this in Python 2 is the following:
>
>     def embezzle(self, account, funds=1000000, *fake_receipts):
>         # type: (str, int, *str) -> None
>         """Embezzle funds from account using fake receipts."""
>

I find this separate signature line to be at least as readable as the
intermixed 3.x version. I noticed the same thing as Lemburg (no runtime
.__annotations__ attributes), but am not sure whether adding them in 2.x
code is a good or bad thing.

> There are a few details to discuss:
>
> - Every argument must be accounted for, except 'self' (for instance
> methods) or 'cls' (for class methods). Also the return type is
> mandatory. If in Python 3 you would omit some argument or the return
> type, the Python 2 notation should use 'Any'.
>
> - If you're using names defined in the typing module, you must still
> import them! (There's a backport on PyPI.)
>
> - For `*args` and `**kwds`, put 1 or 2 stars in front of the
> corresponding type annotation. As with Python 3 annotations, the
> annotation here denotes the type of the individual argument values, not
> of the tuple/dict that you receive as the special argument value 'args'
> or 'kwds'.
>
> - The entire annotation must be one line. (However, see
> https://github.com/JukkaL/mypy/issues/1102.)

To me, really needed.

> We would like to propose this as a standard (either to be added to PEP
> 484 or as a new PEP) rather than making it a "proprietary" extension to
> mypy only, so that others in a similar situation can also benefit.

Since I am personally pretty much done with 2.x, the details do not matter
to me, but I think a suggested standard approach is a good idea. I also
think a new informational PEP, with a reference added to 484, would be
better: 'Type hints for 2.x and 2&3 code'.

For a helpful tool, I would at least want something that added a template
comment, without dummy 'Any's to be erased, to each function:

    # type: (<type>, <type>, *<type>) -> <type>

A GUI with suggestions from both type-inferencing and from a name -> type
dictionary would be even nicer. Name to type would work really well for a
project with consistent use of parameter names.
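Such a tool is nearly trivial to sketch -- untested, uses inspect, and the
function name is mine:

    import inspect

    def type_comment_template(func):
        """Return a '# type:' skeleton with <type> placeholders."""
        slots = []
        for name, param in inspect.signature(func).parameters.items():
            if name in ('self', 'cls'):
                continue  # not annotated, per the proposal
            if param.kind is inspect.Parameter.VAR_POSITIONAL:
                slots.append('*<type>')
            elif param.kind is inspect.Parameter.VAR_KEYWORD:
                slots.append('**<type>')
            else:
                slots.append('<type>')
        return '# type: (%s) -> <type>' % ', '.join(slots)

For the embezzle example above, it would produce exactly the skeleton I
wrote: '# type: (<type>, <type>, *<type>) -> <type>'.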
-- 
Terry Jan Reedy

From tjreedy at udel.edu  Sat Jan  9 17:18:40 2016
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat, 9 Jan 2016 17:18:40 -0500
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: 
References: 
Message-ID: 

On 1/8/2016 4:27 PM, Victor Stinner wrote:

> Add a new read-only ``__version__`` property to ``dict`` and
> ``collections.UserDict`` types, incremented at each change.

I agree with Neil Girdhar that this looks to me like a CPython-specific
implementation detail that should not be imposed on other implementations.
For testing, perhaps we could add a dict_version function in test.support
that uses ctypes to access the internals.

Another reason to hide __version__ from the Python level is that its use
seems to me rather tricky and bug-prone.

> Python is hard to optimize because almost everything is mutable: builtin
> functions, function code, global variables, local variables, ... can be
> modified at runtime.

I believe that C-coded functions are immutable. But I believe that
mutability otherwise undercuts what you are trying to do.

> Implementing optimizations respecting the Python
> semantics requires detecting when "something changes":

But as near as I can tell, your proposal cannot detect all relevant
changes unless one is *very* careful. A dict maps hashable objects to
objects. Objects represent values. So a dict represents a mapping of
values to values. If an object is mutated, the object to object mapping
is not changed, but the semantic value to value mapping *is* changed.
In the following example, __version__ twice gives the 'wrong' answer
from a value perspective.

d = {'f': [int]}
d['f'][0] = float  # object mapping unchanged, value mapping changed
d['f'] = [float]   # object mapping changed, value mapping unchanged

> The astoptimizer of the FAT Python project implements many optimizations
> which require guards on namespaces. Examples:
>
> * Call pure builtins: to replace ``len("abc")`` with ``3``,

Replacing a call with a return value assumes that the function is
immutable, deterministic, and without side-effects. Perhaps this is what
you meant by 'pure'. Are you proposing to provide astoptimizer with
either a whitelist or blacklist of builtins that qualify or not?

Aside from this, I don't find this example motivational. I would either
write '3' in the first place or write something like "slen =
len('akjslkjgkjsdkfjsldjkfs')" outside of any loop. I would more likely
write something like "key = 'jlkjfksjlkdfjlskfjkslkjeicji'; key_len =
len(key)" to keep a reference to both the string and its length. Will
astoptimizer 'propagate the constant' (in this case 'key')?

The question in my mind is whether real code has enough pure builtin
calls of constants to justify the overhead.

> * Loop unrolling: to unroll the loop ``for i in range(...): ...``,

How often is this useful in modern real-world Python code? Many old uses
of range have been or could be replaced with enumerate or a collection
iterator, making it less common than it once was.

How often is N small enough that one wants complete versus partial
unrolling? Wouldn't it be simpler to only use a (specialized)
loop-unroller where range is known to be the builtin?

-- 
Terry Jan Reedy

From rosuav at gmail.com  Sat Jan  9 17:19:15 2016
From: rosuav at gmail.com (Chris Angelico)
Date: Sun, 10 Jan 2016 09:19:15 +1100
Subject: [Python-ideas] More friendly access to chmod
In-Reply-To: 
References: 
Message-ID: 

On Sun, Jan 10, 2016 at 3:51 AM, Mahmoud Hashemi wrote:
> I think it's a pretty common itch!
> Have you seen the boltons implementation?
> http://boltons.readthedocs.org/en/latest/fileutils.html#file-permissions

Yes it is, and no I haven't; everyone has a slightly different idea of
what makes a good API, and that's why I put that caveat onto my
suggestion. You can't make everyone happy, and APIs should not be designed
by committee :)

ChrisA

From victor.stinner at gmail.com  Sat Jan  9 18:08:56 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Sun, 10 Jan 2016 00:08:56 +0100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: 
References: 
Message-ID: 

2016-01-09 23:18 GMT+01:00 Terry Reedy :
> But as near as I can tell, your proposal cannot detect all relevant changes
> unless one is *very* careful. A dict maps hashable objects to objects.
> Objects represent values. So a dict represents a mapping of values to
> values. If an object is mutated, the object to object mapping is not
> changed, but the semantic value to value mapping *is* changed. In the
> following example, __version__ twice gives the 'wrong' answer from a value
> perspective.

dict.__version__ is a technical solution to implement efficient guards on
namespaces. You are right that it's not enough to detect every kind of
change.

For example, to inline a function inside the same module, we need a guard
on the global variable, but also a guard on the function itself. We need
to disable the optimization if the function code (func.__code__) is
modified. Maybe a guard is also needed on the default values of function
parameters.

But guards on functions don't need to modify CPython internals. It's
already possible to implement efficient guards on functions.

> Replacing a call with a return value assumes that the function is immutable,
> deterministic, and without side-effect.

That the function is deterministic and has no side effects, yep.

> Perhaps this is what you meant by
> 'pure'. Are you proposing to provide astoptimizer with either a whitelist
> or blacklist of builtins that qualify or not?

Currently, I'm using a whitelist of builtin functions which are known to
be pure. Later, I plan to detect pure functions automatically by analyzing
the (AST) code.

> Aside from this, I don't find this example motivational. I would either
> write '3' in the first place or write something like "slen =
> len('akjslkjgkjsdkfjsldjkfs')" outside of any loop. I would more likely
> write something like "key = 'jlkjfksjlkdfjlskfjkslkjeicji'; key_len =
> len(key)" to keep a reference to both the string and its length. Will
> astoptimizer 'propagate the constant' (in this case 'key')?

FYI I already have a working implementation of the astoptimizer: it's
possible to run the full Python test suite with the optimizer.
Implemented optimizations:
https://faster-cpython.readthedocs.org/fat_python.html#optimizations

Constant propagation and constant folding optimizations are implemented.
A single optimization is not interesting on its own; it's more interesting
when you combine optimizations, like constant propagation + constant
folding + loop unrolling.

> The question in my mind is whether real code has enough pure builtin calls
> of constants to justify the overhead.

Replacing len("abc") with 3 is not the goal of FAT Python. It's only a
simple example, easy to understand.

> How often is this useful in modern real-world Python code? Many old uses of
> range have been or could be replaced with enumerate or a collection
> iterator, making it less common than it once was.

IMHO the optimizations currently implemented will not provide any major
speedup.
It will become more interesting with function inlining. The idea is more
to create an API to support pluggable static optimizations.

> How often is N small enough that one wants complete versus partial
> unrolling? Wouldn't it be simpler to only use a (specialized)
> loop-unroller where range is known to be the builtin?

What is the link between your question and dict.__version__?

Victor

From victor.stinner at gmail.com  Sat Jan  9 18:14:10 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Sun, 10 Jan 2016 00:14:10 +0100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: 
References: 
Message-ID: 

2016-01-09 2:00 GMT+01:00 Chris Angelico :
> (...) send your PEP drafts to peps at python.org and we can get them
> assigned numbers

Ok, this PEP got the number 509: "PEP 0509 -- Add dict.__version__"
https://www.python.org/dev/peps/pep-0509/

FYI the second PEP got the number 510: "PEP 0510 -- Specialized functions
with guards"
https://www.python.org/dev/peps/pep-0510/

Victor

From victor.stinner at gmail.com  Sat Jan  9 18:16:13 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Sun, 10 Jan 2016 00:16:13 +0100
Subject: [Python-ideas] RFC: PEP: Specialized functions with guards
In-Reply-To: 
References: 
Message-ID: 

> PEP: xxx
> Title: Specialized functions with guards

FYI I published the PEP at python.org and it got the number 510:
"PEP 0510 -- Specialized functions with guards"
https://www.python.org/dev/peps/pep-0510/

Victor

From mistersheik at gmail.com  Sat Jan  9 07:48:30 2016
From: mistersheik at gmail.com (Neil Girdhar)
Date: Sat, 9 Jan 2016 04:48:30 -0800 (PST)
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: 
References: 
Message-ID: <49b26af0-3233-4904-8b85-0106cb38f666@googlegroups.com>

How is this not just a poorer version of PyPy's optimizations?

If what you want is optimization, it would be much better to devote time
to a solution that can potentially yield orders of magnitude worth of
speedup like PyPy rather than increasing language complexity for a minor
payoff.

Best,

Neil

On Friday, January 8, 2016 at 4:27:53 PM UTC-5, Victor Stinner wrote:
>
> Hi,
>
> Here is a first PEP, part of a series of 3 PEPs to add an API to
> implement a static Python optimizer specializing functions with
> guards.
>
> HTML version:
> https://faster-cpython.readthedocs.org/pep_dict_version.html#pep-dict-version
>
> PEP: xxx
> Title: Add dict.__version__
> Version: $Revision$
> Last-Modified: $Date$
> Author: Victor Stinner
> Status: Draft
> Type: Standards Track
> Content-Type: text/x-rst
> Created: 4-January-2016
> Python-Version: 3.6
>
>
> Abstract
> ========
>
> Add a new read-only ``__version__`` property to ``dict`` and
> ``collections.UserDict`` types, incremented at each change.
>
>
> Rationale
> =========
>
> In Python, the builtin ``dict`` type is used by many instructions. For
> example, the ``LOAD_GLOBAL`` instruction searches for a variable in the
> global namespace, or in the builtins namespace (two dict lookups).
> Python uses ``dict`` for the builtins namespace, globals namespace, type
> namespaces, instance namespaces, etc. The local namespace (namespace of
> a function) is usually optimized to an array, but it can be a dict too.
>
> Python is hard to optimize because almost everything is mutable: builtin
> functions, function code, global variables, local variables, ... can be
> modified at runtime. Implementing optimizations respecting the Python
> semantics requires detecting when "something changes": we will call these
> checks "guards".
>
> The speedup of optimizations depends on the speed of guard checks. This
> PEP proposes to add a version to dictionaries to implement efficient
> guards on namespaces.
>
> Example of optimization: replace loading a global variable with a
> constant. This optimization requires a guard on the global variable to
> check if it was modified. If the variable is modified, the variable must
> be loaded at runtime, instead of using the constant.
>
>
> Guard example
> =============
>
> Pseudo-code of an efficient guard to check if a dictionary key was
> modified (created, updated or deleted)::
>
>     UNSET = object()
>
>     class Guard:
>         def __init__(self, dict, key):
>             self.dict = dict
>             self.key = key
>             self.value = dict.get(key, UNSET)
>             self.version = dict.__version__
>
>         def check(self):
>             """Return True if the dictionary value did not change."""
>             version = self.dict.__version__
>             if version == self.version:
>                 # Fast-path: avoid the dictionary lookup
>                 return True
>
>             value = self.dict.get(self.key, UNSET)
>             if value is self.value:
>                 # another key was modified:
>                 # cache the new dictionary version
>                 self.version = version
>                 return True
>
>             return False
>
>
> Changes
> =======
>
> Add a read-only ``__version__`` property to the builtin ``dict`` type and
> to the ``collections.UserDict`` type. New empty dictionaries are
> initialized to version ``0``. The version is incremented at each change:
>
> * ``clear()`` if the dict was non-empty
> * ``pop(key)`` if the key exists
> * ``popitem()`` if the dict is non-empty
> * ``setdefault(key, value)`` if the `key` does not exist
> * ``__delitem__(key)`` if the key exists
> * ``__setitem__(key, value)`` if the `key` doesn't exist or if the value
>   is different
> * ``update(...)`` if new values are different than existing values (the
>   version can be incremented multiple times)
>
> Example::
>
>     >>> d = {}
>     >>> d.__version__
>     0
>     >>> d['key'] = 'value'
>     >>> d.__version__
>     1
>     >>> d['key'] = 'new value'
>     >>> d.__version__
>     2
>     >>> del d['key']
>     >>> d.__version__
>     3
>
> If a dictionary is created with items, the version is also incremented
> at each dictionary insertion. Example::
>
>     >>> d = dict(x=7, y=33)
>     >>> d.__version__
>     2
>
> The version is not incremented if an existing key is set to the same
> value, but only the identity of the value is tested, not the content of
> the value. Example::
>
>     >>> d = {}
>     >>> value = object()
>     >>> d['key'] = value
>     >>> d.__version__
>     1
>     >>> d['key'] = value
>     >>> d.__version__
>     1
>
> .. note::
>    CPython uses some singletons like integers in the range [-5; 257],
>    the empty tuple, empty strings, Unicode strings of a single character
>    in the range [U+0000; U+00FF], etc. When a key is set twice to the
>    same singleton, the version is not modified.
>
> The PEP is designed to implement guards on namespaces; in practice, only
> the ``dict`` type can be used for namespaces. ``collections.UserDict``
> is modified because it must mimic ``dict``. ``collections.Mapping`` is
> unchanged.
>
>
> Integer overflow
> ================
>
> The implementation uses the C unsigned integer type ``size_t`` to store
> the version. On 32-bit systems, the maximum version is ``2**32-1``
> (more than ``4.2 * 10 ** 9``, 4 billion). On 64-bit systems, the maximum
> version is ``2**64-1`` (more than ``1.8 * 10**19``).
>
> The C code uses ``version++``. The behaviour on integer overflow of the
> version is undefined. The minimum guarantee is that the version always
> changes when the dictionary is modified.
>
> The check ``dict.__version__ == old_version`` can be true after an
> integer overflow, so a guard can report that the value did not change
> even though it did, which is wrong. The bug occurs if the dict is
> modified at least ``2**64`` times (on a 64-bit system) between two
> checks of the guard.
>
> Using a more complex type (e.g. ``PyLongObject``) to avoid the overflow
> would slow down operations on the ``dict`` type. Even if there is a
> theoretical risk of missing a value change, the risk is considered too
> low compared to the slowdown of using a more complex type.
>
>
> Alternatives
> ============
>
> Add a version to each dict entry
> --------------------------------
>
> A single version per dictionary requires keeping a strong reference to
> the value, which can keep the value alive longer than expected. If we
> also add a version per dictionary entry, the guard can rely on the entry
> version and so avoid the strong reference to the value (only strong
> references to the dictionary and to the key are needed).
>
> Changes: add a ``getversion(key)`` method to dict which returns
> ``None`` if the key doesn't exist. When a key is created or modified,
> the entry version is set to the dictionary version, which is incremented
> at each change (create, modify, delete).
>
> Pseudo-code of an efficient guard to check if a dict key was modified
> using ``getversion()``::
>
>     UNSET = object()
>
>     class Guard:
>         def __init__(self, dict, key):
>             self.dict = dict
>             self.key = key
>             self.dict_version = dict.__version__
>             self.entry_version = dict.getversion(key)
>
>         def check(self):
>             """Return True if the dictionary value did not change."""
>             dict_version = self.dict.__version__
>             if dict_version == self.dict_version:
>                 # Fast-path: avoid the dictionary lookup
>                 return True
>
>             # lookup in the dictionary, but get the entry version,
>             # not the value
>             entry_version = self.dict.getversion(self.key)
>             if entry_version == self.entry_version:
>                 # another key was modified:
>                 # cache the new dictionary version
>                 self.dict_version = dict_version
>                 return True
>
>             return False
>
> The main drawback of this option is its impact on the memory footprint.
> It increases the size of each dictionary entry, so the overhead depends
> on the number of buckets (dictionary entries, used or not yet used). For
> example, it increases the size of each dictionary entry by 8 bytes on a
> 64-bit system if we use ``size_t``.
>
> In Python, the memory footprint matters and the trend is to reduce it.
> Examples:
>
> * `PEP 393 -- Flexible String Representation
>   <https://www.python.org/dev/peps/pep-0393/>`_
> * `PEP 412 -- Key-Sharing Dictionary
>   <https://www.python.org/dev/peps/pep-0412/>`_
>
>
> Add a new dict subtype
> ----------------------
>
> Add a new ``verdict`` type, a subtype of ``dict``. When guards are
> needed, use ``verdict`` for namespaces (module namespace, type
> namespace, instance namespace, etc.) instead of ``dict``.
>
> Leave the ``dict`` type unchanged to not add any overhead (memory
> footprint) when guards are not needed.
>
> Technical issue: a lot of C code in the wild, including the CPython
> core, expects the exact ``dict`` type. Issues:
>
> * ``exec()`` requires a ``dict`` for globals and locals. A lot of code
>   uses ``globals={}``. It is not possible to cast the ``dict`` to a
>   ``dict`` subtype because the caller expects the ``globals`` parameter
>   to be modified (``dict`` is mutable).
> * Functions directly call ``PyDict_xxx()`` functions, instead of calling
>   ``PyObject_xxx()``, if the object is a ``dict`` subtype
> * The ``PyDict_CheckExact()`` check fails on ``dict`` subtypes, whereas
>   some functions require the exact ``dict`` type.
> * ``Python/ceval.c`` does not completely support dict subtypes for
>   namespaces
>
>
> The ``exec()`` issue is a blocker issue.
>
> Other issues:
>
> * The garbage collector has special code to "untrack" ``dict``
>   instances. If a ``dict`` subtype is used for namespaces, the garbage
>   collector may be unable to break some reference cycles.
> * Some functions have a fast-path for ``dict`` which would not be taken
>   for ``dict`` subtypes, and so it would make Python a little bit
>   slower.
>
>
> Usage of dict.__version__
> =========================
>
> astoptimizer of FAT Python
> --------------------------
>
> The astoptimizer of the FAT Python project implements many optimizations
> which require guards on namespaces. Examples:
>
> * Call pure builtins: to replace ``len("abc")`` with ``3``, guards on
>   ``builtins.__dict__['len']`` and ``globals()['len']`` are required
> * Loop unrolling: to unroll the loop ``for i in range(...): ...``,
>   guards on ``builtins.__dict__['range']`` and ``globals()['range']``
>   are required
>
> The `FAT Python
> `_ project is a
> static optimizer for Python 3.6.
>
>
> Pyjion
> ------
>
> According to Brett Cannon, one of the two main developers of Pyjion,
> Pyjion can also benefit from the dictionary version to implement
> optimizations.
>
> Pyjion is a JIT compiler for Python based upon CoreCLR (Microsoft .NET
> Core runtime).
>
>
> Unladen Swallow
> ---------------
>
> Even though the dictionary version was not explicitly mentioned,
> optimizing globals and builtins lookups was part of the Unladen Swallow
> plan: "Implement one of the several proposed schemes for speeding
> lookups of globals and builtins." Source: `Unladen Swallow ProjectPlan
> `_.
>
> Unladen Swallow is a fork of CPython 2.6.1 adding a JIT compiler
> implemented with LLVM. The project stopped in 2011: `Unladen Swallow
> Retrospective `_.
>
>
> Prior Art
> =========
>
> Cached globals+builtins lookup
> ------------------------------
>
> In 2006, Andrea Griffini proposed a patch implementing a `Cached
> globals+builtins lookup optimization `_.
>
> The patch adds a private ``timestamp`` field to dict.
>
> See the thread on python-dev: `About dictionary lookup caching
> `_.
>
>
> Globals / builtins cache
> ------------------------
>
> In 2010, Antoine Pitrou proposed a `Globals / builtins cache
> `_ which adds a private
> ``ma_version`` field to the ``dict`` type. The patch adds a "global and
> builtin cache" to functions and frames, and changes ``LOAD_GLOBAL`` and
> ``STORE_GLOBAL`` instructions to use the cache.
>
>
> PySizer
> -------
>
> `PySizer `_: a memory profiler for Python,
> a Google Summer of Code 2005 project by Nick Smallbone.
>
> This project has a patch for CPython 2.4 which adds ``key_time`` and
> ``value_time`` fields to dictionary entries. It uses a global
> process-wide counter for dictionaries, incremented each time a
> dictionary is modified. The times are used to decide when child objects
> first appeared in their parent objects.
>
>
> Copyright
> =========
>
> This document has been placed in the public domain.
>
> --
> Victor
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From abarnert at yahoo.com  Sat Jan  9 22:22:41 2016
From: abarnert at yahoo.com (Andrew Barnert)
Date: Sat, 9 Jan 2016 19:22:41 -0800
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <49b26af0-3233-4904-8b85-0106cb38f666@googlegroups.com>
References: 
	<49b26af0-3233-4904-8b85-0106cb38f666@googlegroups.com>
Message-ID: <196F6749-06BF-4098-8C18-848CAA2DE3AA@yahoo.com>

On Jan 9, 2016, at 04:48, Neil Girdhar wrote:
>
> How is this not just a poorer version of PyPy's optimizations? If what
> you want is optimization, it would be much better to devote time to a
> solution that can potentially yield orders of magnitude worth of speedup
> like PyPy rather than increasing language complexity for a minor payoff.

I think he's already answered this twice between the two threads, plus at
least once in the thread last year, not to mention similar questions from
slightly different angles. Which implies to me that the PEPs really need
to anticipate and answer these questions.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From steve at pearwood.info  Sat Jan  9 22:31:41 2016
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 10 Jan 2016 14:31:41 +1100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: 
References: 
	<49b26af0-3233-4904-8b85-0106cb38f666@googlegroups.com>
Message-ID: <20160110033140.GS10854@ando.pearwood.info>

On Sat, Jan 09, 2016 at 09:55:08AM -0500, Neil Girdhar wrote:

> I think this is admirable. I also dream of faster Python. However, we
> have a fundamental disagreement about how to get there. You can spend your
> whole life adding one or two optimizations a year and Python may only end
> up twice as fast as it is now, which would still be dog slow. A meaningful
> speedup requires a JIT. So, I question the value of this kind of change.

I think that's pessimistic and unrealistic. If Python were twice as fast
as it is now, it would mean that scripts could process twice as much data
in the same time as they do now. How is that not meaningful?

Sometimes I work hard to get a 5% or 10% improvement in the speed of a
function, because it's worth it. Doubling the speed is something I can
only dream about.

As for a JIT, they have limited value for code that isn't long-running.
As the PyPy FAQ says:

"Note also that our JIT has a very high warm-up cost, meaning that any
program is slow at the beginning. If you want to compare the timings with
CPython, even relatively simple programs need to run at least one second,
preferably at least a few seconds."

which means that PyPy is going to have little or no benefit for
short-lived programs and scripts. But if you call those scripts thousands
or tens of thousands of times (say, from the shell) the total amount of
time can be considerable. Halving that time would be a good thing.

There is plenty of room in the Python ecosystem for many different
approaches to optimization.

[...]
> It makes it more complex because you're adding a user-facing property.
> Every little property adds up in the cognitive load of a language. It
> also means that all of the other Python implementations need to follow
> suit even if their optimizations work differently.

That second point is a reasonable criticism of Victor's idea.

> What is the point of making __version__ an exposed property?
> Why can't it be a hidden variable in CPython's underlying implementation
> of dict?

Making it public means that anyone can make use of it. Just because
Victor wants to use it for CPython optimizations doesn't mean that others
can't or shouldn't make use of it for their own code. Victor wants to
detect changes to globals() and builtins, but I might want to use it to
detect changes to some other dict:

mydict = {'many': 1, 'keys': 2, 'with': 3, 'an': 4, 'invariant': 5}
v = mydict.__version__
modify(mydict)
if v != mydict.__version__:
    expensive_invariant_check(mydict)

If Victor is right that tracking this version flag is cheap, then there's
no reason not to expose it. Some people will find a good use for it, and
others can ignore it.

-- 
Steve

From steve at pearwood.info  Sat Jan  9 23:24:27 2016
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 10 Jan 2016 15:24:27 +1100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: 
References: 
Message-ID: <20160110042427.GT10854@ando.pearwood.info>

On Sat, Jan 09, 2016 at 05:18:40PM -0500, Terry Reedy wrote:
> On 1/8/2016 4:27 PM, Victor Stinner wrote:
>
> >Add a new read-only ``__version__`` property to ``dict`` and
> >``collections.UserDict`` types, incremented at each change.
>
> I agree with Neil Girdhar that this looks to me like a CPython-specific
> implementation detail that should not be imposed on other
> implementations. For testing, perhaps we could add a dict_version
> function in test.support that uses ctypes to access the internals.
>
> Another reason to hide __version__ from the Python level is that its use
> seems to me rather tricky and bug-prone.

What makes you say that? Isn't it a simple matter of:

v = mydict.__version__
maybe_modify(mydict)
if v != mydict.__version__:
    print("dict has changed")

which doesn't seem tricky or bug-prone to me. The only thing I would
consider is the risk that people will test for changes with
mydict.__version__ > v instead of not-equal, which is wrong if the flag
overflows back to zero. But with a 64-bit flag, and one modification to
the dict every nanosecond (i.e. a billion changes per second), it will
take approximately 584 years before the counter overflows. I don't think
this is a realistic scenario. How many computers do you know with an
uptime of more than a decade?

(A 32-bit counter, on the other hand, will only take four seconds to
overflow at that rate.)

> >Python is hard to optimize because almost everything is mutable: builtin
> >functions, function code, global variables, local variables, ... can be
> >modified at runtime.
>
> I believe that C-coded functions are immutable. But I believe that
> mutability otherwise undercuts what you are trying to do.

If I have understood Victor's intention correctly, what he's looking for
is a way to quickly detect the shadowing or monkey-patching of builtins,
so that if they *haven't* been shadowed/monkey-patched, functions can
bypass the (slow) lookup process with a fast inline version.

Here's a sketch of the idea:

def demo(arg):
    return len(arg)

This has to do a time-consuming lookup of len in the globals, and if not
found, then a second lookup in builtins. But 99.99% of the time, we
haven't shadowed or monkey-patched len, so the compiler ought to be able
to inline the function and skip the search. This is how static
programming languages typically operate, and is one of the reasons why
they're so fast.
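Spelled out, every call to demo pays for the moral equivalent of this --
my simplified model of what the LOAD_GLOBAL bytecode does, not the actual
implementation:

    import builtins

    def load_global(name, globals_dict):
        # Two dict lookups on every call, even though the result
        # almost never changes between calls.
        try:
            return globals_dict[name]
        except KeyError:
            return builtins.__dict__[name]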
In Python, you will often see functions like this:

def demo(arg, len=len):
    return len(arg)

which replace the slow global lookup with a fast local lookup, but at the
cost of adding an extra parameter to the function call. Ugly and
confusing. And, it has the side-effect that if you do shadow or
monkey-patch len, the demo function won't see the new version, which may
not be what you want.

Victor wants to be able to make that idiom obsolete by allowing the
compiler to automatically translate this:

def demo(arg):
    return len(arg)

into something like this:

def demo(arg):
    if len has been shadowed or monkey-patched:
        return len(arg)  # calls the new version
    else:
        return inlined or cached version of len(arg)

(I stress that you, the code's author, don't have to write the code like
that, the compiler will automatically do this. And it won't just operate
on len, it could potentially operate on any function that has no
side-effects.)

This relies on the test for shadowing etc to be cheap, which Victor's
tests suggest it is. But he needs a way to detect when the globals() and
builtins.__dict__ dictionaries have been changed, hence his proposal.

> >Implementing optimizations respecting the Python
> >semantic requires to detect when "something changes":
>
> But as near as I can tell, your proposal cannot detect all relevant
> changes unless one is *very* careful. A dict maps hashable objects to
> objects. Objects represent values. So a dict represents a mapping of
> values to values. If an object is mutated, the object to object mapping
> is not changed, but the semantic value to value mapping *is* changed.
> In the following example, __version__ twice gives the 'wrong' answer
> from a value perspective.
>
> d = {'f': [int]}
> d['f'][0] = float  # object mapping unchanged, value mapping changed
> d['f'] = [float]   # object mapping changed, value mapping unchanged

I don't think that matters for Victor's use-case. Going back to the toy
example above, Victor doesn't need to detect internal modifications to
the len built-in, because as you say it's immutable:

py> len.foo = "spam"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'builtin_function_or_method' object has no attribute
'foo'

He just needs to know if globals()['len'] and/or builtins.len are
different (in any way) from how they were when the function "demo" was
compiled.

I'm sure that there are complications that I haven't thought of, but
these sorts of static compiler optimizations aren't cutting edge computer
science, they've been around for decades and are well-studied and
well-understood.

> >The astoptimizer of the FAT Python project implements many optimizations
> >which require guards on namespaces. Examples:
> >
> >* Call pure builtins: to replace ``len("abc")`` with ``3``,
>
> Replacing a call with a return value assumes that the function is
> immutable, deterministic, and without side-effect. Perhaps this is what
> you meant by 'pure'.

Yes, "pure function" is the term of art for a function which is
deterministic and free of side-effects.

https://en.wikipedia.org/wiki/Pure_function

Immutability is only important in the sense that if a function *is* pure
now, you know it will be pure in the future as well.

> Are you proposing to provide astoptimizer with
> either a whitelist or blacklist of builtins that qualify or not?

I don't think the implementation details of astoptimizer are important
for this proposal.

[...]
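Coming back to my sketch above for a moment: spelled out in pure Python,
such a guard might look roughly like this. (Purely illustrative, and it
assumes the proposed dict.__version__ exists; Victor's real guards are C
objects attached to functions via PEP 510's func.specialize(), and the
names below are made up.)

import builtins

class NamespaceGuard:
    """Watch the global and builtin namespaces for changes."""

    def __init__(self, globalns):
        # Snapshot both version counters at specialization time.
        self.globalns = globalns
        self.global_version = globalns.__version__
        self.builtin_version = vars(builtins).__version__

    def check(self):
        # The fast path is just two integer comparisons. Note that a
        # failed check only means *something* in a namespace changed,
        # not necessarily the name we care about; the specialized code
        # then falls back to the normal (slow) lookup.
        return (self.globalns.__version__ == self.global_version and
                vars(builtins).__version__ == self.builtin_version)

A specialized demo() would call check() before taking the inlined fast
path, and revert to the ordinary byte-code whenever it returns False.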
> The question in my mind is whether real code has enough pure builtin
> calls of constants to justify the overhead.

It's not just builtin calls of constants, this technique has much wider
application. If I understand Victor correctly, he thinks he can get
function inlining, where instead of having to make a full function call
to the built-in (which is slow), the compiler can jump directly to the
function's implementation as if it were written inline.

https://en.wikipedia.org/wiki/Inline_expansion

Obviously you can't do this optimization if len has changed from the
inlined version, hence Victor needs to detect changes to globals() and
builtins.__dict__.

This shouldn't turn into a general critique of optimization techniques,
but I think that Victor's PEP should justify why he is confident that
these optimizations have a good chance to be worthwhile. It's not enough
to end up with "well, we applied all the optimizations we could, and the
good news is that Python is no slower". We want some evidence that it
will actually be faster.

--
Steve

From rosuav at gmail.com  Sun Jan 10 00:23:46 2016
From: rosuav at gmail.com (Chris Angelico)
Date: Sun, 10 Jan 2016 16:23:46 +1100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <20160110042427.GT10854@ando.pearwood.info>
References: <20160110042427.GT10854@ando.pearwood.info>
Message-ID: 

On Sun, Jan 10, 2016 at 3:24 PM, Steven D'Aprano wrote:
> On Sat, Jan 09, 2016 at 05:18:40PM -0500, Terry Reedy wrote:
>> On 1/8/2016 4:27 PM, Victor Stinner wrote:
>>
>> >Add a new read-only ``__version__`` property to ``dict`` and
>> >``collections.UserDict`` types, incremented at each change.
>>
>> I agree with Neil Girdhar that this looks to me like a CPython-specific
>> implementation detail that should not be imposed on other
>> implementations. For testing, perhaps we could add a dict_version
>> function in test.support that uses ctypes to access the internals.
>>
>> Another reason to hide __version__ from the Python level is that its use
>> seems to me rather tricky and bug-prone.
>
> What makes you say that? Isn't it a simple matter of:
>
> v = mydict.__version__
> maybe_modify(mydict)
> if v != mydict.__version__:
>     print("dict has changed")
>
> which doesn't seem tricky or bug-prone to me.

That doesn't. I would, however, expect that __version__ is a read-only
attribute. I can't imagine any justifiable excuse for changing it; if you
want to increment it, just mutate the dict in some unnecessary way.

>> But as near as I can tell, your proposal cannot detect all relevant
>> changes unless one is *very* careful. A dict maps hashable objects to
>> objects. Objects represent values. So a dict represents a mapping of
>> values to values. If an object is mutated, the object to object mapping
>> is not changed, but the semantic value to value mapping *is* changed.
>> In the following example, __version__ twice gives the 'wrong' answer
>> from a value perspective.
>>
>> d = {'f': [int]}
>> d['f'][0] = float  # object mapping unchanged, value mapping changed
>> d['f'] = [float]   # object mapping changed, value mapping unchanged
>
> I don't think that matters for Victor's use-case.
> Going back to the toy example above, Victor doesn't need to detect
> internal modifications to the len built-in, because as you say it's
> immutable:
>
> py> len.foo = "spam"
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> AttributeError: 'builtin_function_or_method' object has no attribute
> 'foo'
>
> He just needs to know if globals()['len'] and/or builtins.len are
> different (in any way) from how they were when the function "demo" was
> compiled.

There's more to it than that. Yes, a dict maps values to values; but the
keys MUST be immutable (otherwise hashing has problems), and this
optimization doesn't actually care about the immutability of the value.
When you use the name "len" in a Python function, somewhere along the
way, that will resolve to some object. Currently, CPython knows in
advance that it isn't in the function-locals, but checks at run-time for
a global and then a built-in; all FAT Python is doing differently is
snapshotting the object referred to, and then having a quick check to
prove that globals and builtins haven't been mutated. Consider:

def enumerate_classes():
    return (cls.__name__ for cls in object.__subclasses__())

As long as nobody has *rebound* the name 'object', this will continue to
work - and it'll pick up new subclasses, which means that something's
mutable or non-pure in there. FAT Python should be able to handle this
just as easily as it handles an immutable. The only part that has to be
immutable is the string "len" or "object" that is used as the key.

The significance of len being immutable and pure comes from the other
optimization, which is actually orthogonal to the non-rebound names
optimization, except that CPython already does this where it doesn't
depend on names. CPython already constant-folds in situations where no
names are involved. That's how we maintain the illusion that there is
such a thing as a "complex literal":

>>> dis.dis(lambda: 1+2j)
  1           0 LOAD_CONST               3 ((1+2j))
              3 RETURN_VALUE

FAT Python proposes to do the same here:

>>> dis.dis(lambda: len("abc"))
  1           0 LOAD_GLOBAL              0 (len)
              3 LOAD_CONST               1 ('abc')
              6 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
              9 RETURN_VALUE

And that's where it might be important to check more than just the
identity of the object. If len were implemented in Python:

>>> def len(x):
...     l = 0
...     for l, _ in enumerate(x, 1): pass
...     return l
...
>>> len("abc")
3
>>> len
<function len at 0x...>

then it would be possible to keep the same len object but change its
behaviour.

>>> len.__code__ = (lambda x: 5).__code__
>>> len
<function len at 0x...>
>>> len("abc")
5

Does anyone EVER do this? C compilers often have optimization levels that
can potentially alter the program's operation (eg replacing division with
multiplication by the reciprocal); if FAT Python has an optimization flag
that says "Assume no __code__ objects are ever replaced", most programs
would have no problem with it. (Having it trigger an immediate exception
would mean there's no "what the bleep is going on" moment, and I still
doubt it'll ever happen.)

I think there are some interesting possibilities here. Whether they
actually result in real improvement I don't know; but if FAT Python is
aiming to be fast at the "start program, do a tiny bit of work, and then
terminate" execution model (where JIT compilation can't help), then it
could potentially make Mercurial a *lot* faster to fiddle with.
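For the record, if the guard holds, the specialized path would presumably
collapse to an ordinary constant load (my illustration, not actual FAT
Python output):

>>> dis.dis(lambda: 3)
  1           0 LOAD_CONST               1 (3)
              3 RETURN_VALUE

with the LOAD_GLOBAL/CALL_FUNCTION pair gone entirely.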
ChrisA

From ethan at stoneleaf.us  Sun Jan 10 01:25:25 2016
From: ethan at stoneleaf.us (Ethan Furman)
Date: Sat, 09 Jan 2016 22:25:25 -0800
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: 
References: <20160110042427.GT10854@ando.pearwood.info>
Message-ID: <5691F955.40309@stoneleaf.us>

On 01/09/2016 09:23 PM, Chris Angelico wrote:
> On Sun, Jan 10, 2016 at 3:24 PM, Steven D'Aprano wrote:
>> On Sat, Jan 09, 2016 at 05:18:40PM -0500, Terry Reedy wrote:
>>> On 1/8/2016 4:27 PM, Victor Stinner wrote:
>>>
>>>> Add a new read-only ``__version__`` property to ``dict`` and
>>>> ``collections.UserDict`` types, incremented at each change.
>>>
>>> Another reason to hide __version__ from the Python level is that its use
>>> seems to me rather tricky and bug-prone.
>>
>> What makes you say that? Isn't it a simple matter of:
>>
>> [snip]
>>
>> which doesn't seem tricky or bug-prone to me.
>
> That doesn't. I would, however, expect that __version__ is a read-only
> attribute.

You mean like it says in the first quote of this message? ;)

--
~Ethan~

From rosuav at gmail.com  Sun Jan 10 02:17:07 2016
From: rosuav at gmail.com (Chris Angelico)
Date: Sun, 10 Jan 2016 18:17:07 +1100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <5691F955.40309@stoneleaf.us>
References: <20160110042427.GT10854@ando.pearwood.info>
 <5691F955.40309@stoneleaf.us>
Message-ID: 

On Sun, Jan 10, 2016 at 5:25 PM, Ethan Furman wrote:
>> That doesn't. I would, however, expect that __version__ is a read-only
>> attribute.
>
> You mean like it says in the first quote of this message? ;)

D'oh. Yep. Reminder to self: Read through things twice. You never know
what you missed the first time.

ChrisA

From bunslow at gmail.com  Sun Jan 10 03:27:36 2016
From: bunslow at gmail.com (Bill Winslow)
Date: Sun, 10 Jan 2016 02:27:36 -0600
Subject: [Python-ideas] Fwd: Using functools.lru_cache only on some
 arguments of a function
In-Reply-To: 
References: 
Message-ID: 

Sorry for the late reply everyone.

I think relying on closures, while a solution, is messy. I'd still much
prefer a way to tell lru_cache to merely ignore certain arguments. I'll
use some variant of the
storing-the-partial-progress-as-an-attribute-on-the-cached-recursive-function
method (either with the cached-recursive hidden from the top level via
the try/except stuff, or with a more simple wrapper function).

I've further considered my original proposal, and rather than naming it
"arg_filter", I realized that builtins like sorted(), min(), max(), etc
all already have the exact same thing -- a "key" argument which
transforms the elements to the user's purpose. (In the sorted/min/max
case, it's called on the elements of the argument rather than the
argument itself, but it's still the same concept.) So basically, my
original proposal with renaming from arg_filter to key, is tantamount to
extending the same functionality from sorted/min/max to lru_cache as
well. As has been pointed out, my own use case is almost certainly *not*
the only use case. The implementation and interface are both simple, and
simpler than the alternatives which I'll rely on for now (wrappers and
closures, or worse, global singletons etc). I would still like to see it
in the stdlib in the future. I've appended a largely similar patch with
the proposed additions (there's some internal variable renaming to avoid
confusion, resulting in a longer diff).

Thanks again for all the input.
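To make the intended use concrete, here's a hypothetical example against
the patch below (nothing like this exists in the stdlib today). The key
callable receives the (args, kwds) pair and returns the (args, kwds) pair
actually hashed for the cache; here it drops the auxiliary state argument:

from functools import lru_cache

# Hypothetical `key` parameter, as in the patch below: cache on `n`
# only, ignoring the mutable partial-progress argument entirely.
@lru_cache(maxsize=None, key=lambda args, kwds: (args[:1], {}))
def solve(n, partial_state):
    ...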
-Bill

-----------------------------------------------------------------------------------------------------------------
https://hg.python.org/cpython/file/3.5/Lib/functools.py

diff functools.py.orig functools.py
363c363
< def _make_key(args, kwds, typed,
---
> def _make_key(args, kwds, typed, key,
377c377,379
<     key = args
---
>     if key is not None:
>         args, kwds = key(args, kwds)
>     cache_key = args
380c382
<         key += kwd_mark
---
>         cache_key += kwd_mark
382c384
<             key += item
---
>             cache_key += item
384c386
<         key += tuple(type(v) for v in args)
---
>         cache_key += tuple(type(v) for v in args)
386,389c388,391
<             key += tuple(type(v) for k, v in sorted_items)
<     elif len(key) == 1 and type(key[0]) in fasttypes:
<         return key[0]
<     return _HashedSeq(key)
---
>             cache_key += tuple(type(v) for k, v in sorted_items)
>     elif len(cache_key) == 1 and type(cache_key[0]) in fasttypes:
>         return cache_key[0]
>     return _HashedSeq(cache_key)
391c393
< def lru_cache(maxsize=128, typed=False):
---
> def lru_cache(maxsize=128, typed=False, key=None):
400a403,407
>     If *key* is not None, it must be a callable which acts on the arguments
>     passed to the function. Its return value is used in place of the actual
>     arguments. It works analogously to the *key* argument to the builtins
>     sorted, max, and min.
>
421a429,431
>     if key is not None and not callable(key):
>         raise TypeError('Expected key to be a callable')
>
423c433,434
<     wrapper = _lru_cache_wrapper(user_function, maxsize, typed, _CacheInfo)
---
>     wrapper = _lru_cache_wrapper(user_function, maxsize, typed, key,
>                                  _CacheInfo)
428c439
< def _lru_cache_wrapper(user_function, maxsize, typed, _CacheInfo):
---
> def _lru_cache_wrapper(user_function, maxsize, typed, key, _CacheInfo):
456,457c467,468
<             key = make_key(args, kwds, typed)
<             result = cache_get(key, sentinel)
---
>             cache_key = make_key(args, kwds, typed, key)
>             result = cache_get(cache_key, sentinel)
462c473
<             cache[key] = result
---
>             cache[cache_key] = result
471c482
<             key = make_key(args, kwds, typed)
---
>             cache_key = make_key(args, kwds, typed, key)
473c484
<                 link = cache_get(key)
---
>                 link = cache_get(cache_key)
487c498
<                 if key in cache:
---
>                 if cache_key in cache:
496c507
<                     oldroot[KEY] = key
---
>                     oldroot[KEY] = cache_key
513c524
<                     cache[key] = oldroot
---
>                     cache[cache_key] = oldroot
517,518c528,529
<                     link = [last, root, key, result]
<                     last[NEXT] = root[PREV] = cache[key] = link
---
>                     link = [last, root, cache_key, result]
>                     last[NEXT] = root[PREV] = cache[cache_key] = link

On Wed, Dec 30, 2015 at 11:10 PM, Michael Selik wrote:

> On Tue, Dec 29, 2015 at 2:14 AM Franklin? Lee <
> leewangzhong+python at gmail.com> wrote:
>
>> On Sat, Dec 12, 2015 at 1:34 PM, Michael Selik wrote:
>> > On Fri, Dec 11, 2015, 8:20 PM Franklin? Lee <
>> leewangzhong+python at gmail.com>
>> > wrote:
>>
>> > This whole thing is probably best implemented as two separate functions
>> > rather than using a closure, depending on how intertwined the code
>> paths are
>> > for the shortcut/non-shortcut versions.
>>
>> I like the closure because it has semantic ownership: the inner
>> function is a worker for the outer function.
>>
>
> True, a closure has better encapsulation, making it less likely someone
> will misuse the helper function. On the other hand, that means there's
> less modularity and it would be difficult for someone to use the inner
> function. It's hard to know the right choice without seeing the exact
> problem the original author was working on.
>
>
>> >> On Fri, Dec 11, 2015 at 8:01 PM, Franklin? Lee
>> wrote:
>> >> > 1. Rewrite your recursive function so that the partial state is a
>> >> > nonlocal variable (in the closure), and memoize the recursive part.
>
>> > I'd flip the rare-case to the except block and put the normal-case in
>> > the try block. I believe this will be more compute-efficient and more
>> > readable.
>>
>> The rare case is in the except block, though.
>
> You're correct. Sorry, I somehow misinterpreted the comment, "# To trigger
> the exception the first time" as indicating that code path would run only
> once.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From rmcgibbo at gmail.com  Sun Jan 10 04:50:31 2016
From: rmcgibbo at gmail.com (Robert McGibbon)
Date: Sun, 10 Jan 2016 01:50:31 -0800
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
 support Python 2.7
In-Reply-To: 
References: 
Message-ID: 

On 1/8/2016 6:04 PM, Guido van Rossum wrote:
> At Dropbox we're trying to be good citizens and we're working towards
> introducing gradual typing (PEP 484) into our Python code bases (several
> million lines of code). However, that code base is mostly still Python
> 2.7 and we believe that we should introduce gradual typing first and
> start working on conversion to Python 3 second (since having static
> types in the code can help a big refactoring like that).

Big +1

I maintain some packages that are single-source 2/3 compatible, thus we
haven't been able to add type annotations yet (which I was initially
skeptical about, but now love) without dropping py2 support. So even for
packages that have already been ported to py3, this proposal would be
great.

-Robert
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From victor.stinner at gmail.com  Sun Jan 10 08:01:05 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Sun, 10 Jan 2016 14:01:05 +0100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <196F6749-06BF-4098-8C18-848CAA2DE3AA@yahoo.com>
References: <49b26af0-3233-4904-8b85-0106cb38f666@googlegroups.com>
 <196F6749-06BF-4098-8C18-848CAA2DE3AA@yahoo.com>
Message-ID: 

Hi,

Andrew Barnert:
> Which implies to me that the PEPs really need to anticipate and answer these questions.

The dict.__version__ PEP mentions FAT Python as a use case. In fact,
I should point to the func.specialize() PEP which already explains
partially the motivation for static optimizers:
https://www.python.org/dev/peps/pep-0510/#rationale

But ok I will enhance the PEP 510 rationale to explain why static
optimizers make sense in Python, maybe even more sense than a JIT
compiler in some cases (short living programs). By the way, I think
that Mercurial is a good example of a short living program. (There is a
project for a local "server" to keep a process in the background; that
one would benefit from a JIT compiler.)

Victor

From victor.stinner at gmail.com  Sun Jan 10 08:08:47 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Sun, 10 Jan 2016 14:08:47 +0100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <20160110042427.GT10854@ando.pearwood.info>
References: <20160110042427.GT10854@ando.pearwood.info>
Message-ID: 

2016-01-10 5:24 GMT+01:00 Steven D'Aprano :
> Here's a sketch of the idea:
>
> def demo(arg):
>     return len(arg)
> (...)
For examples of guards and how they can be used, please see the PEP 510:
https://www.python.org/dev/peps/pep-0510/#example

Victor

From victor.stinner at gmail.com  Sun Jan 10 08:32:46 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Sun, 10 Jan 2016 14:32:46 +0100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: 
References: <20160110042427.GT10854@ando.pearwood.info>
Message-ID: 

Hi,

2016-01-10 6:23 GMT+01:00 Chris Angelico :
> Consider:
>
> def enumerate_classes():
>     return (cls.__name__ for cls in object.__subclasses__())
>
> As long as nobody has *rebound* the name 'object', this will continue
> to work - and it'll pick up new subclasses, which means that
> something's mutable or non-pure in there. FAT Python should be able to
> handle this just as easily as it handles an immutable. The only part
> that has to be immutable is the string "len" or "object" that is used
> as the key.

FYI I implemented a "copy builtin to constant" optimization which
replaces the "LOAD_GLOBAL object" instruction with "LOAD_CONST <object>":
https://faster-cpython.readthedocs.org/fat_python.html#fat-copy-builtin-to-constant

It uses a guard on the builtin and global namespaces to disable the
optimization if object is replaced.

If you want to make object.__subclasses__ constant, we need more guards:

* guard on the object.__subclasses__ attribute
* guard on the private tp_version_tag attribute of the object type
* ...

and it looks like object.__subclasses__ uses weak references, so I'm not
sure that it's really possible to make object.__subclasses__() constant
with guards. Is it really worth it? Is it a common case?

Oh... I just remembered that the "type" type already implements a version
as I propose for dict. It's called "tp_version_tag" and it's private. It
has the C type "unsigned int" and it's incremented at each modification.

> And that's where it might be important to check more than just the
> identity of the object. If len were implemented in Python:
>
> >>> def len(x):
> ...     l = 0
> ...     for l, _ in enumerate(x, 1): pass
> ...     return l
> ...
> >>> len("abc")
> 3
> >>> len
> <function len at 0x...>
>
> then it would be possible to keep the same len object but change its
> behaviour.
>
> >>> len.__code__ = (lambda x: 5).__code__
> >>> len
> <function len at 0x...>
> >>> len("abc")
> 5
>
> Does anyone EVER do this?

FAT Python implements a fat.GuardFunc which checks if func.__code__ was
replaced or not. It doesn't matter if replacing func.__code__ is
unlikely. An optimizer must not change the Python semantics, otherwise it
will break some applications and cannot be used widely.

> if FAT Python has an
> optimization flag that says "Assume no __code__ objects are ever
> replaced", most programs would have no problem with it. (Having it
> trigger an immediate exception would mean there's no "what the bleep
> is going on" moment, and I still doubt it'll ever happen.)

In my plan, I will add an option to skip guards if you are 100% sure that
some things will never change. For example, if you control all code of
your application (not only the app itself, all modules) and you know that
func.__code__ is never replaced, you can skip fat.GuardFunc (not emit
them).

> I think there are some interesting possibilities here. Whether they
> actually result in real improvement I don't know; but if FAT Python is
> aiming to be fast at the "start program, do a tiny bit of work, and
> then terminate" execution model (where JIT compilation can't help),
> then it could potentially make Mercurial a *lot* faster to fiddle
> with.
FAT Python is designed to compile the code ahead of time. The
installation can be pre-optimized in a package, or optimized at
installation, but it's not optimized when the program is started. If the
optimizations are efficient, the program will run faster, even for short
living programs (yes, like Mercurial).

Victor

From ericfahlgren at gmail.com  Sun Jan 10 10:28:07 2016
From: ericfahlgren at gmail.com (Eric Fahlgren)
Date: Sun, 10 Jan 2016 07:28:07 -0800
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <20160110033140.GS10854@ando.pearwood.info>
References: <49b26af0-3233-4904-8b85-0106cb38f666@googlegroups.com>
 <20160110033140.GS10854@ando.pearwood.info>
Message-ID: <01b601d14bbb$82648b90$872da2b0$@gmail.com>

Steven D'Aprano Saturday, January 09, 2016 19:32:
> I think that's pessimistic and unrealistic. If Python were twice as fast
> as it is now, it would mean that scripts could process twice as much data
> in the same time as they do now. How is that not meaningful?
>
> Sometimes I work hard to get a 5% or 10% improvement in speed of a
> function, because it's worth it. Doubling the speed is something I can
> only dream about.

Often when I hear people complain about "tiny" improvements, I change the
context: "Ok, I'm going to raise your salary 5%, or is that too small and
you don't want it?" Suddenly that 5% looks pretty good.

From rosuav at gmail.com  Sun Jan 10 10:36:00 2016
From: rosuav at gmail.com (Chris Angelico)
Date: Mon, 11 Jan 2016 02:36:00 +1100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <01b601d14bbb$82648b90$872da2b0$@gmail.com>
References: <49b26af0-3233-4904-8b85-0106cb38f666@googlegroups.com>
 <20160110033140.GS10854@ando.pearwood.info>
 <01b601d14bbb$82648b90$872da2b0$@gmail.com>
Message-ID: 

On Mon, Jan 11, 2016 at 2:28 AM, Eric Fahlgren wrote:
> Steven D'Aprano Saturday, January 09, 2016 19:32:
>> I think that's pessimistic and unrealistic. If Python were twice as fast
>> as it is now, it would mean that scripts could process twice as much data
>> in the same time as they do now. How is that not meaningful?
>>
>> Sometimes I work hard to get a 5% or 10% improvement in speed of a
>> function, because it's worth it. Doubling the speed is something I can
>> only dream about.
>
> Often when I hear people complain about "tiny" improvements, I change the
> context: "Ok, I'm going to raise your salary 5%, or is that too small and
> you don't want it?" Suddenly that 5% looks pretty good.

Although realistically, it's more like saying "If you put in enough
overtime, I'll raise by 5% the rate you get paid for one of the many
types of work you do". Evaluating that depends on what proportion of your
salary comes from that type of work. 5% across the board is pretty good.
5% to one function is only worth serious effort if that's a hot spot. But
broadly I do agree - 5% is significant.

ChrisA

From nicholas.chammas at gmail.com  Sun Jan 10 10:38:54 2016
From: nicholas.chammas at gmail.com (Nicholas Chammas)
Date: Sun, 10 Jan 2016 10:38:54 -0500
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <01b601d14bbb$82648b90$872da2b0$@gmail.com>
References: <49b26af0-3233-4904-8b85-0106cb38f666@googlegroups.com>
 <20160110033140.GS10854@ando.pearwood.info>
 <01b601d14bbb$82648b90$872da2b0$@gmail.com>
Message-ID: 

On Sun, Jan 10, 2016 at 10:28 AM, Eric Fahlgren wrote:

> Often when I hear people complain about "tiny" improvements, I change the
> context: "Ok, I'm going to raise your salary 5%, or is that too small and
> you don't want it?"
> Suddenly that 5% looks pretty good.

To extend this analogy a bit, I think Neil's objection was more along the
lines of "Why work an extra 5 hours a week for only a 5% raise?"

I don't think anyone's going to pooh-pooh a performance improvement.
Neil's concern is just about whether the benefit justifies the cost.

Nick
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From victor.stinner at gmail.com  Sun Jan 10 11:47:21 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Sun, 10 Jan 2016 17:47:21 +0100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: 
References: <49b26af0-3233-4904-8b85-0106cb38f666@googlegroups.com>
 <20160110033140.GS10854@ando.pearwood.info>
 <01b601d14bbb$82648b90$872da2b0$@gmail.com>
Message-ID: 

On 10 Jan 2016 4:39 PM, "Nicholas Chammas" wrote:
> To extend this analogy a bit, I think Neil's objection was more along
> the lines of "Why work an extra 5 hours a week for only a 5% raise?"

Your analogy is wrong. I am working and you get the salary.

Victor
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From mistersheik at gmail.com  Sun Jan 10 11:48:35 2016
From: mistersheik at gmail.com (Neil Girdhar)
Date: Sun, 10 Jan 2016 11:48:35 -0500
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: 
References: <49b26af0-3233-4904-8b85-0106cb38f666@googlegroups.com>
 <20160110033140.GS10854@ando.pearwood.info>
 <01b601d14bbb$82648b90$872da2b0$@gmail.com>
Message-ID: 

I read through this thread and just want to quickly address some good
points.

First of all, I didn't mean to suggest that this kind of optimization is
not useful. Of course, I will be thankful of any optimization that makes
it into CPython. Making CPython faster is good, useful work. It's just
that my dream of the future of Python is one where Python is faster than
C thanks to a very clever JIT.

> > I agree with Neil Girdhar that this looks to me like a CPython-specific
> > implementation detail that should not be imposed on other
> > implementations. For testing, perhaps we could add a dict_version
> > function in test.support that uses ctypes to access the internals.
> >
> > Another reason to hide __version__ from the Python level is that its use
> > seems to me rather tricky and bug-prone.
>
> What makes you say that? Isn't it a simple matter of:
>
> v = mydict.__version__
> maybe_modify(mydict)
> if v != mydict.__version__:
>     print("dict has changed")

This is exactly what I want to avoid. If you want to do something like
this, I think you should do it in regular Python by subclassing dict and
overriding the mutating methods.

What happens if someone uses a custom Mapping? Do all custom Mappings
need to implement __version__? Do they need a __version__ that indicates
that no key-value pairs have changed, and another version that indicates
that nothing has changed (for example OrderedDict has an order, sorteddict
has a sort function; changing either of those doesn't change key-value
pairs)? This is not supposed to be user-facing; this is an interpreter
optimization.

> Obviously a JIT can help, but even they can benefit from this. For
> instance, Pyjion could rely on this instead of creating our own guards
> for built-in and global namespaces if we wanted to inline calls to
> certain built-ins.

I understand that, but what if another JIT decides that instead of
__version__ being an attribute on the object, version is a global mapping
from objects to version numbers?
What if someone else wants to implement it instead as a set of changed
objects at every sequence point? There are many ways to do this
optimization. It's not obvious to me that everyone will want to do it
this way.

> > C compilers often have optimization levels that can potentially alter
> > the program's operation

Some of those optimizations lead to bugs that are very hard to track
down. One of the advantages of Python is that what you pay for in
runtime, you save ten-fold in development time.

In summary, I would be 100% behind this idea if it were hidden from the
user.

Best,

Neil
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From steve at pearwood.info  Sun Jan 10 12:57:32 2016
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 11 Jan 2016 04:57:32 +1100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: 
References: <49b26af0-3233-4904-8b85-0106cb38f666@googlegroups.com>
 <20160110033140.GS10854@ando.pearwood.info>
 <01b601d14bbb$82648b90$872da2b0$@gmail.com>
Message-ID: <20160110175731.GX10854@ando.pearwood.info>

On Sun, Jan 10, 2016 at 11:48:35AM -0500, Neil Girdhar wrote:

[...]
> > v = mydict.__version__
> > maybe_modify(mydict)
> > if v != mydict.__version__:
> >     print("dict has changed")
>
> This is exactly what I want to avoid. If you want to do something like
> this, I think you should do it in regular Python by subclassing dict and
> overriding the mutating methods.

That doesn't help Victor, because exec needs an actual dict, not
subclasses. Victor's PEP says this is a blocker.

I can already subclass dict to do that now. But if Victor's suggestion
is accepted, then I don't need to. The functionality will already exist.
Why shouldn't I use it?

> What happens if someone uses a custom Mapping?

If they inherit from dict or UserDict, they get this functionality for
free. If they don't, they're responsible for implementing it if they
want it.

> Do all custom Mappings need to implement __version__?

I believe the answer to that is No, but the PEP probably should clarify
that.

--
Steve

From mistersheik at gmail.com  Sun Jan 10 13:35:10 2016
From: mistersheik at gmail.com (Neil Girdhar)
Date: Sun, 10 Jan 2016 13:35:10 -0500
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <20160110175731.GX10854@ando.pearwood.info>
References: <49b26af0-3233-4904-8b85-0106cb38f666@googlegroups.com>
 <20160110033140.GS10854@ando.pearwood.info>
 <01b601d14bbb$82648b90$872da2b0$@gmail.com>
 <20160110175731.GX10854@ando.pearwood.info>
Message-ID: 

On Sun, Jan 10, 2016 at 12:57 PM, Steven D'Aprano wrote:

> On Sun, Jan 10, 2016 at 11:48:35AM -0500, Neil Girdhar wrote:
>
> [...]
> > > v = mydict.__version__
> > > maybe_modify(mydict)
> > > if v != mydict.__version__:
> > >     print("dict has changed")
> >
> > This is exactly what I want to avoid. If you want to do something like
> > this, I think you should do it in regular Python by subclassing dict and
> > overriding the mutating methods.
>
> That doesn't help Victor, because exec needs an actual dict, not
> subclasses. Victor's PEP says this is a blocker.

No, he can still do what he wants transparently in the interpreter. What
I want to avoid is Python users using __version__ in their own code.

> I can already subclass dict to do that now. But if Victor's suggestion
> is accepted, then I don't need to. The functionality will already exist.
> Why shouldn't I use it?

Because people write code for the abc "Mapping". What you are suggesting
is then to add "__version__" to the abc Mapping, which I am against.
Mapping provides the minimum interface to be a mapping; there is no
reason that every Mapping should have a "__version__".

> > What happens if someone uses a custom Mapping?
>
> If they inherit from dict or UserDict, they get this functionality for
> free. If they don't, they're responsible for implementing it if they
> want it.
What you are suggesting is then to add "__version__" to the abc Mapping, which I am against. Mapping provides the minimum interface to be a mapping; there is no reason that every Mapping should have a "__version__". > > > What happens if someone uses a custom Mapping? > > If they inherit from dict or UserDict, they get this functionality for > free. If they don't, they're responsible for implementing it if they > want it. > But they shouldn't have to implement it just so that code written for Mappings works ? as it does now. > > > > Do all custom Mappings need to implement __version__? > > I believe the answer to that is No, but the PEP probably should clarify > that. > If the answer is "no" then honestly no user should write code counting on the existence of __version__. > > > -- > Steve > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > -- > > --- > You received this message because you are subscribed to a topic in the > Google Groups "python-ideas" group. > To unsubscribe from this topic, visit > https://groups.google.com/d/topic/python-ideas/HP5qdo3rJxE/unsubscribe. > To unsubscribe from this group and all its topics, send an email to > python-ideas+unsubscribe at googlegroups.com. > For more options, visit https://groups.google.com/d/optout. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mike at selik.org Sun Jan 10 14:03:55 2016 From: mike at selik.org (Michael Selik) Date: Sun, 10 Jan 2016 19:03:55 +0000 Subject: [Python-ideas] Fwd: Using functools.lru_cache only on some arguments of a function In-Reply-To: References: Message-ID: Shouldn't the key function be called with ``key(*args, **kwargs)``? It'd be helpful to see the entire revision, rather than just the diff. It's easier for me to read at least. On Sun, Jan 10, 2016, 3:27 AM Bill Winslow wrote: > Sorry for the late reply everyone. > > I think relying on closures, while a solution, is messy. I'd still much > prefer a way to tell lru_cache to merely ignore certain arguments. I'll use > some variant of the > storing-the-partial-progress-as-an-attribute-on-the-cached-recursive-function > method (either with the cached-recursive hidden from the top level via the > try/except stuff, or with a more simple wrapper function). > > I've further considered my original proposal, and rather than naming it > "arg_filter", I realized that builtins like sorted(), min(), max(), etc all > already have the exact same thing -- a "key" argument which transforms the > elements to the user's purpose. (In the sorted/min/max case, it's called on > the elements of the argument rather than the argument itself, but it's > still the same concept.) So basically, my original proposal with renaming > from arg_filter to key, is tantamount to extending the same functionality > from sorted/min/max to lru_cache as well. As has been pointed out, my own > use case is almost certainly *not* the only use case. The implementation > and interface are both simple, and simpler than the alternatives which I'll > rely on for now (wrappers and closures, or worse, global singletons etc). I > would still like to see it in the stdlib in the future. I've appended a > largely similar patch with the proposed additions (there's some internal > variable renaming to avoid confusion, resulting in a longer diff). > > Thanks again for all the input. 
> > -Bill > > > ----------------------------------------------------------------------------------------------------------------- > https://hg.python.org/cpython/file/3.5/Lib/functools.py > > > diff functools.py.orig functools.py > 363c363 > < def _make_key(args, kwds, typed, > --- > > def _make_key(args, kwds, typed, key, > 377c377,379 > < key = args > --- > > if key is not None: > > args, kwds = key(args, kwds) > > cache_key = args > 380c382 > < key += kwd_mark > --- > > cache_key += kwd_mark > 382c384 > < key += item > --- > > cache_key += item > 384c386 > < key += tuple(type(v) for v in args) > --- > > cache_key += tuple(type(v) for v in args) > 386,389c388,391 > < key += tuple(type(v) for k, v in sorted_items) > < elif len(key) == 1 and type(key[0]) in fasttypes: > < return key[0] > < return _HashedSeq(key) > --- > > cache_key += tuple(type(v) for k, v in sorted_items) > > elif len(cache_key) == 1 and type(cache_key[0]) in fasttypes: > > return cache_key[0] > > return _HashedSeq(cache_key) > 391c393 > < def lru_cache(maxsize=128, typed=False): > --- > > def lru_cache(maxsize=128, typed=False, key=None): > 400a403,407 > > If *key* is not None, it must be a callable which acts on the > arguments > > passed to the function. Its return value is used in place of the > actual > > arguments. It works analogously to the *key* argument to the builtins > > sorted, max, and min. > > > 421a429,431 > > if key is not None and not callable(key): > > raise TypeErrpr('Expected key to be a callable') > > > 423c433,434 > < wrapper = _lru_cache_wrapper(user_function, maxsize, typed, > _CacheInfo) > --- > > wrapper = _lru_cache_wrapper(user_function, maxsize, typed, key, > > _CacheInfo) > 428c439 > < def _lru_cache_wrapper(user_function, maxsize, typed, _CacheInfo): > --- > > def _lru_cache_wrapper(user_function, maxsize, typed, key, _CacheInfo): > 456,457c467,468 > < key = make_key(args, kwds, typed) > < result = cache_get(key, sentinel) > --- > > cache_key = make_key(args, kwds, typed, key) > > result = cache_get(cache_key, sentinel) > 462c473 > < cache[key] = result > --- > > cache[cache_key] = result > 471c482 > < key = make_key(args, kwds, typed) > --- > > cache_key = make_key(args, kwds, typed, key) > 473c484 > < link = cache_get(key) > --- > > link = cache_get(cache_key) > 487c498 > < if key in cache: > --- > > if cache_key in cache: > 496c507 > < oldroot[KEY] = key > --- > > oldroot[KEY] = cache_key > 513c524 > < cache[key] = oldroot > --- > > cache[cache_key] = oldroot > 517,518c528,529 > < link = [last, root, key, result] > < last[NEXT] = root[PREV] = cache[key] = link > --- > > link = [last, root, cache_key, result] > > last[NEXT] = root[PREV] = cache[cache_key] = link > > On Wed, Dec 30, 2015 at 11:10 PM, Michael Selik wrote: > >> On Tue, Dec 29, 2015 at 2:14 AM Franklin? Lee < >> leewangzhong+python at gmail.com> wrote: >> >>> On Sat, Dec 12, 2015 at 1:34 PM, Michael Selik wrote: >>> > On Fri, Dec 11, 2015, 8:20 PM Franklin? Lee < >>> leewangzhong+python at gmail.com> >>> > wrote: >>> >> > This whole thing is probably best implemented as two separate functions >>> > rather than using a closure, depending on how intertwined the code >>> paths are >>> > for the shortcut/non-shortcut versions. >>> >>> I like the closure because it has semantic ownership: the inner >>> function is a worker for the outer function. >>> >> >> True, a closure has better encapsulation, making it less likely someone >> will misuse the helper function. 
On the other hand, that means there's less >> modularity and it would be difficult for someone to use the inner function. >> It's hard to know the right choice without seeing the exact problem the >> original author was working on. >> >> >>> >> On Fri, Dec 11, 2015 at 8:01 PM, Franklin? Lee >>> >> wrote: >>> >> > 1. Rewrite your recursive function so that the partial state is a >>> >> > nonlocal variable (in the closure), and memoize the recursive part. >>> > >>> > I'd flip the rare-case to the except block and put the normal-case in >>> the >>> > try block. I believe this will be more compute-efficient and more >>> readable. >>> >>> The rare case is in the except block, though. >>> >> >> You're correct. Sorry, I somehow misinterpreted the comment, "# To >> trigger the exception the first time" as indicating that code path would >> run only once. >> > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at gmail.com Sun Jan 10 15:02:47 2016 From: victor.stinner at gmail.com (Victor Stinner) Date: Sun, 10 Jan 2016 21:02:47 +0100 Subject: [Python-ideas] RFC: PEP: Add dict.__version__ In-Reply-To: <20160110175731.GX10854@ando.pearwood.info> References: <49b26af0-3233-4904-8b85-0106cb38f666@googlegroups.com> <20160110033140.GS10854@ando.pearwood.info> <01b601d14bbb$82648b90$872da2b0$@gmail.com> <20160110175731.GX10854@ando.pearwood.info> Message-ID: 2016-01-10 18:57 GMT+01:00 Steven D'Aprano : >> Do all custom Mappings need to implement __version__? > > I believe the answer to that is No, but the PEP probably should clarify > that. In the PEP, I wrote "The PEP is designed to implement guards on namespaces, only the dict type can be used for namespaces in practice. collections.UserDict is modified because it must mimicks dict. collections.Mapping is unchanged." https://www.python.org/dev/peps/pep-0509/#changes Is it enough? If no, what do you suggest to be more explicit? Victor From ethan at stoneleaf.us Sun Jan 10 15:32:50 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Sun, 10 Jan 2016 12:32:50 -0800 Subject: [Python-ideas] RFC: PEP: Add dict.__version__ In-Reply-To: References: <49b26af0-3233-4904-8b85-0106cb38f666@googlegroups.com> <20160110033140.GS10854@ando.pearwood.info> <01b601d14bbb$82648b90$872da2b0$@gmail.com> <20160110175731.GX10854@ando.pearwood.info> Message-ID: <5692BFF2.4070909@stoneleaf.us> On 01/10/2016 12:02 PM, Victor Stinner wrote: > 2016-01-10 18:57 GMT+01:00 Steven D'Aprano : >>> Do all custom Mappings need to implement __version__? >> >> I believe the answer to that is No, but the PEP probably should clarify >> that. > > In the PEP, I wrote "The PEP is designed to implement guards on > namespaces, only the dict type can be used for namespaces in practice. > collections.UserDict is modified because it must mimicks dict. > collections.Mapping is unchanged." > https://www.python.org/dev/peps/pep-0509/#changes > > Is it enough? If no, what do you suggest to be more explicit? It is enough. 
-- ~Ethan~ From jim.baker at python.org Sun Jan 10 16:51:13 2016 From: jim.baker at python.org (Jim Baker) Date: Sun, 10 Jan 2016 14:51:13 -0700 Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to support Python 2.7 In-Reply-To: References: <007001d14af8$1dcdb3a0$59691ae0$@gmail.com> Message-ID: FWIW, we now fully support both Jedi and lib2to3 in Jython 2.7.1 master. With some other work this weekend, we should be releasing 2.7.1 beta 3 and then shortly a RC - we just fixed the last blocking bug. On Sat, Jan 9, 2016 at 11:52 AM, Jim Baker wrote: > +1, I would really like to try out type annotation support in Jython, > given the potential for tying in with Java as a source of type annotations > (basically the equivalent of stubs for free). I'm planning on sprinting on > Jython 3 at PyCon, but let's face it, that's going to take a while to > really finish. > > re the two approaches, both are workable with Jython: > > * lib2to3 is something we should support in Jython 2.7. There are a couple > of data files that we don't support in the tests (too large of a method for > Java bytecode in infinite_recursion.py, not terribly interesting), plus a > few other tests that should work. Therefore lib2to3 should be in the next > release (2.7.1). > > * Jedi now works with the last commit to Jython 2.7 trunk, passing > whatever it means to run random tests using its sith script against its > source. (The sith test does not pass with either CPython or Jython's > stdlib, starting with bad_coding.py.) > > - Jim > -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at gmail.com Sun Jan 10 18:30:12 2016 From: victor.stinner at gmail.com (Victor Stinner) Date: Mon, 11 Jan 2016 00:30:12 +0100 Subject: [Python-ideas] RFC: PEP: Add dict.__version__ In-Reply-To: References: Message-ID: 2016-01-09 14:12 GMT+01:00 Nick Coghlan : > On 9 January 2016 at 19:18, Victor Stinner wrote: >> It would be nice to detect keys mutation while iteration on >> dict.keys(), but it would also be be nice to detect values mutation >> while iterating on dict.values() and dict.items(). No? > > No, because mutating values as you go while iterating over a > dictionary is perfectly legal: (...) Oh you're right. I removed the reference to the issue #19332 from the PEP, since the PEP doesn't help. Too bad. Victor From steve at pearwood.info Sun Jan 10 19:12:18 2016 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 11 Jan 2016 11:12:18 +1100 Subject: [Python-ideas] RFC: PEP: Add dict.__version__ In-Reply-To: References: <49b26af0-3233-4904-8b85-0106cb38f666@googlegroups.com> <20160110033140.GS10854@ando.pearwood.info> <01b601d14bbb$82648b90$872da2b0$@gmail.com> <20160110175731.GX10854@ando.pearwood.info> Message-ID: <20160111001218.GY10854@ando.pearwood.info> On Sun, Jan 10, 2016 at 09:02:47PM +0100, Victor Stinner wrote: > 2016-01-10 18:57 GMT+01:00 Steven D'Aprano : > >> Do all custom Mappings need to implement __version__? > > > > I believe the answer to that is No, but the PEP probably should clarify > > that. > > In the PEP, I wrote "The PEP is designed to implement guards on > namespaces, only the dict type can be used for namespaces in practice. > collections.UserDict is modified because it must mimicks dict. > collections.Mapping is unchanged." > https://www.python.org/dev/peps/pep-0509/#changes > > Is it enough? If no, what do you suggest to be more explicit? 
You also should argue whether or not __version__ should be visible to
users from pure Python, or only from C code (as Neil wants). In other
words, should __version__ be part of the public API of dict, or an
implementation detail?

(1) Make __version__ part of the public API.

Pros:

- Simpler implementation?
- Allows easier debugging.
- Users can make use of it for their own purposes.

Cons:

- Neil wants to avoid users making use of this feature. (Why, he hasn't
  explained, or if he did, I missed it.)
- All implementations (PyPy, Jython, etc.) must copy it.
- You lock in one specific implementation for guards and cannot change
  to another one.

(2) Keep __version__ private.

Pros:

- Other implementations can ignore it.
- You can change the implementation for guards.

Cons:

- Users may resort to ctypes to make use of it. (If they can.)

--
Steve

From victor.stinner at gmail.com  Sun Jan 10 19:15:47 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Mon, 11 Jan 2016 01:15:47 +0100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: 
References: <49b26af0-3233-4904-8b85-0106cb38f666@googlegroups.com>
 <20160110033140.GS10854@ando.pearwood.info>
 <01b601d14bbb$82648b90$872da2b0$@gmail.com>
 <20160110175731.GX10854@ando.pearwood.info>
Message-ID: 

2016-01-10 19:35 GMT+01:00 Neil Girdhar :
> If the answer is "no" then honestly no user should write code counting on
> the existence of __version__.

For my use case, I don't need a public (read-only) property at the Python
level. When I wrote the PEP, I proposed a public property to try to find
more use cases and make the PEP more interesting. I'm not sure anymore
that it's worth it, since legitimate and good counterarguments were
listed:

* it creates more work for other Python implementations, whereas they may
not use or benefit from the overall API for static optimizers (discussed
in following PEPs). Except for guards used for static optimizers, I don't
see any use case for dictionary versioning.

* the behaviour on integer overflow is an implementation detail, it's sad
to have to describe it in the specification of a *Python* property. Users
expect Python to abstract the hardware.

Victor

From victor.stinner at gmail.com  Sun Jan 10 19:20:33 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Mon, 11 Jan 2016 01:20:33 +0100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <20160111001218.GY10854@ando.pearwood.info>
References: <49b26af0-3233-4904-8b85-0106cb38f666@googlegroups.com>
 <20160110033140.GS10854@ando.pearwood.info>
 <01b601d14bbb$82648b90$872da2b0$@gmail.com>
 <20160110175731.GX10854@ando.pearwood.info>
 <20160111001218.GY10854@ando.pearwood.info>
Message-ID: 

2016-01-11 1:12 GMT+01:00 Steven D'Aprano :
> Cons:
>
> - Users may resort to ctypes to make use of it.
> (If they can.)

It's not something new. It's already possible to access any C private
attribute using ctypes. I don't think that it's a real issue.
"We are all consenting adults here" ;-) Victor From rosuav at gmail.com Sun Jan 10 19:27:45 2016 From: rosuav at gmail.com (Chris Angelico) Date: Mon, 11 Jan 2016 11:27:45 +1100 Subject: [Python-ideas] RFC: PEP: Add dict.__version__ In-Reply-To: References: <49b26af0-3233-4904-8b85-0106cb38f666@googlegroups.com> <20160110033140.GS10854@ando.pearwood.info> <01b601d14bbb$82648b90$872da2b0$@gmail.com> <20160110175731.GX10854@ando.pearwood.info> Message-ID: On Mon, Jan 11, 2016 at 11:15 AM, Victor Stinner wrote: > * the behaviour on integer overflow is an implementation detail, it's > sad to have to describe it in the specification of a *Python* > property. Users expect Python to abtract the hardware Compromise: Document that it's an integer that changes every time the dictionary is changed, and has a "vanishingly small chance" of ever reusing a number. It'll trap the same people who try to use id(obj) as a memory address, but at least it'll be documented as false. ChrisA From abarnert at yahoo.com Sun Jan 10 19:37:56 2016 From: abarnert at yahoo.com (Andrew Barnert) Date: Sun, 10 Jan 2016 16:37:56 -0800 Subject: [Python-ideas] RFC: PEP: Add dict.__version__ In-Reply-To: References: <49b26af0-3233-4904-8b85-0106cb38f666@googlegroups.com> <20160110033140.GS10854@ando.pearwood.info> <01b601d14bbb$82648b90$872da2b0$@gmail.com> <20160110175731.GX10854@ando.pearwood.info> Message-ID: <27E98AE5-CB3C-41AF-B31F-A73A84DC7F61@yahoo.com> On Jan 10, 2016, at 10:35, Neil Girdhar wrote: > >> On Sun, Jan 10, 2016 at 12:57 PM, Steven D'Aprano wrote: >> On Sun, Jan 10, 2016 at 11:48:35AM -0500, Neil Girdhar wrote: >> >> [...] >> > > v = mydict.__version__ >> > > maybe_modify(mydict) >> > > if v != mydict.__version__: >> > > print("dict has changed") >> > >> > >> > This is exactly what I want to avoid. If you want to do something like >> > this, I think you should do it in regular Python by subclassing dict and >> > overriding the mutating methods. >> >> That doesn't help Victor, because exec need an actual dict, not >> subclasses. Victor's PEP says this is a blocker. > > No, he can still do what he wants transparently in the interpreter. What I want to avoid is Python users using __version__ in their own code. Well, he could change exec so it can use arbitrary mappings (or at least dict subclasses), but I assume that's much harder and more disruptive than his proposed change. Anyway, if I understand your point, it's this: __version__ should either be a private implementation-specific property of dicts, or it should be a property of all mappings; anything in between gets all the disadvantages of both. If so, I agree with you. Encouraging people to use __version__ for other purposes besides namespace guards, but not doing anything to guarantee it actually exists anywhere besides namespaces, seems like a bad idea. But there is still something in between public and totally internal to FAT Python. Making it a documented property of PyDict objects at the C API level is a different story--there are already plenty of ways that C code can use those objects that won't work with arbitrary mappings, so adding another doesn't seem like a problem. 
And even making it public but implementation-specific at the Python level may be useful for other CPython-specific optimizers (even if partially written in Python); if so, the best way to deal with the danger that someone could abuse it for code that should work with arbitrary mappings or with another Python implementation should be solved by clearly documenting it's non portability and discouraging its abuse in the docs, not by hiding it. -------------- next part -------------- An HTML attachment was scrubbed... URL: From abarnert at yahoo.com Sun Jan 10 19:53:14 2016 From: abarnert at yahoo.com (Andrew Barnert) Date: Sun, 10 Jan 2016 16:53:14 -0800 Subject: [Python-ideas] RFC: PEP: Add dict.__version__ In-Reply-To: References: <49b26af0-3233-4904-8b85-0106cb38f666@googlegroups.com> <196F6749-06BF-4098-8C18-848CAA2DE3AA@yahoo.com> Message-ID: On Jan 10, 2016, at 05:01, Victor Stinner wrote: > > Andrew Barnert: >> Which implies to me that the PEPs really need to anticipate and answer these questions. > > The dict.__version__ PEP mentions FAT python as an use case. In fact, > I should point to the func.specialize() PEP which already explains > partially the motivation for static optimizers: > https://www.python.org/dev/peps/pep-0510/#rationale Sure, linking to PEP 510 instead of repeating its while rationale seems perfectly reasonable to me. > But ok I will enhance the PEP 510 rationale to explain why static > optimizers makes sense in Python, maybe even more sense than a JIT > compiler in some cases (short living programs). By the way, I think > that Mercurial is a good example of short living program. If CPython is already faster than PyPy for hg, and your optimization makes it faster, then you've got a great answer for "why should anyone care about making CPython a little faster?" Can you benchmark that, or at least a toy app that simulates the same kind of work? Anyway, my point is just that it would be nice if, the next time someone raises the same kind of objection (because I'll bet it comes up when you post to -dev on the next pass, from people who don't read -ideas), you could just say "read this section of PEP 509 and that section of PEP 510 and then tell me what objections you still have", instead of needing to repeat the arguments you've already made. From victor.stinner at gmail.com Sun Jan 10 20:36:20 2016 From: victor.stinner at gmail.com (Victor Stinner) Date: Mon, 11 Jan 2016 02:36:20 +0100 Subject: [Python-ideas] RFC: PEP: Add dict.__version__ In-Reply-To: References: <49b26af0-3233-4904-8b85-0106cb38f666@googlegroups.com> <196F6749-06BF-4098-8C18-848CAA2DE3AA@yahoo.com> Message-ID: 2016-01-11 1:53 GMT+01:00 Andrew Barnert : > If CPython is already faster than PyPy for hg, and your optimization makes it faster, then you've got a great answer for "why should anyone care about making CPython a little faster?" Can you benchmark that, or at least a toy app that simulates the same kind of work? My optimizer now has a good library to implement optimizations, but I didn't start to implement optimizations which will provide real speedup on real applications. I expect a speedup with function inlining, detecting pure functions, elimination of "unused" variables (after constant propagation), etc. In short, since the optimizer is "incomplete", I don't even want to start playing with benchmarks. You can play with microbenchmarks if you want. 
Try FAT Python; it's a working Python 3.6: https://faster-cpython.readthedocs.org/fat_python.html Currently, you have to run it with "-X fat" to enable the optimizer. But the command line argument may change; I'm still working on the API. Victor From steve at pearwood.info Sun Jan 10 20:39:01 2016 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 11 Jan 2016 12:39:01 +1100 Subject: [Python-ideas] RFC: PEP: Add dict.__version__ In-Reply-To: References: <49b26af0-3233-4904-8b85-0106cb38f666@googlegroups.com> <20160110033140.GS10854@ando.pearwood.info> <01b601d14bbb$82648b90$872da2b0$@gmail.com> <20160110175731.GX10854@ando.pearwood.info> Message-ID: <20160111013901.GZ10854@ando.pearwood.info> On Mon, Jan 11, 2016 at 01:15:47AM +0100, Victor Stinner wrote: > * the behaviour on integer overflow is an implementation detail, it's > sad to have to describe it in the specification of a *Python* > property. Users expect Python to abstract the hardware Is that a real possibility? A 32-bit counter will overflow, sure, but a 64-bit counter starting from zero should never overflow in a human lifetime. Even if we assume a billion increments per second (one per nanosecond), it would take over 584 years of continuous operation for the counter to overflow. What am I missing? So I would be inclined to just document that the counter may overflow, and you should always compare it using == or != and not >. I think anything else is overkill. -- Steve From rosuav at gmail.com Sun Jan 10 20:55:24 2016 From: rosuav at gmail.com (Chris Angelico) Date: Mon, 11 Jan 2016 12:55:24 +1100 Subject: [Python-ideas] RFC: PEP: Add dict.__version__ In-Reply-To: <20160111013901.GZ10854@ando.pearwood.info> References: <49b26af0-3233-4904-8b85-0106cb38f666@googlegroups.com> <20160110033140.GS10854@ando.pearwood.info> <01b601d14bbb$82648b90$872da2b0$@gmail.com> <20160110175731.GX10854@ando.pearwood.info> <20160111013901.GZ10854@ando.pearwood.info> Message-ID: On Mon, Jan 11, 2016 at 12:39 PM, Steven D'Aprano wrote: > On Mon, Jan 11, 2016 at 01:15:47AM +0100, Victor Stinner wrote: > >> * the behaviour on integer overflow is an implementation detail, it's >> sad to have to describe it in the specification of a *Python* >> property. Users expect Python to abstract the hardware > > > Is that a real possibility? A 32-bit counter will overflow, sure, but a > 64-bit counter starting from zero should never overflow in a human > lifetime. > > Even if we assume a billion increments per second (one per nanosecond), > it would take over 584 years of continuous operation for the counter to > overflow. What am I missing? You're missing that a 32-bit build of Python would then be allowed to use a 32-bit counter. But if the spec says "64-bit counter", then yeah, we can pretty much assume that it won't overflow. Reasonable usage wouldn't include nanosecondly updates; I doubt you could even achieve 1000 updates a second, sustained over a long period of time, and that would only overflow every 50ish days. Unless there's some bizarre lockstep system that forces you to run into the rollover, it's going to be basically one chance in four billion that you hit exactly the same counter value. So even a 32-bit counter is unlikely to cause problems in real-world situations; and anyone who's paranoid can just insist on using a 64-bit build of Python. (Most of us probably are anyway.)
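The arithmetic behind both of those figures is easy to check with plain Python (nothing here depends on the proposal itself; the variable names are just for illustration):

    years_64bit = 2**64 / 10**9 / (60 * 60 * 24 * 365)  # one update per nanosecond
    days_32bit = 2**32 / 1000 / (60 * 60 * 24)          # 1000 updates per second
    print(years_64bit, days_32bit)  # -> about 584.9 years and 49.7 days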
ChrisA From abarnert at yahoo.com Sun Jan 10 21:25:49 2016 From: abarnert at yahoo.com (Andrew Barnert) Date: Sun, 10 Jan 2016 18:25:49 -0800 Subject: [Python-ideas] RFC: PEP: Add dict.__version__ In-Reply-To: References: <49b26af0-3233-4904-8b85-0106cb38f666@googlegroups.com> <20160110033140.GS10854@ando.pearwood.info> <01b601d14bbb$82648b90$872da2b0$@gmail.com> <20160110175731.GX10854@ando.pearwood.info> <20160111013901.GZ10854@ando.pearwood.info> Message-ID: On Jan 10, 2016, at 17:55, Chris Angelico wrote: > > You're missing that a 32-bit build of Python would then be allowed to > use a 32-bit counter. But if the spec says "64-bit counter", then > yeah, we can pretty much assume that it won't overflow. As I understand it from Victor's PEP, the added cost of maintaining this counter is literally so small as to be unmeasurable against the cost of normal dict operations in microbenchmarks. If that's true, surely the cost of requiring a 64-bit counter is going to be acceptable? I realize that some MicroPython projects will be targeting platforms where there's no fast way to do an inc64 (or where the available compilers are too dumb to do it the fast way), but those projects are probably not going to want FAT Python anyway. On a P3 or later x86 or an ARM 7 or something like that, the cost should be more than acceptable. Or at least it's worth testing. From guido at python.org Sun Jan 10 22:17:48 2016 From: guido at python.org (Guido van Rossum) Date: Sun, 10 Jan 2016 19:17:48 -0800 Subject: [Python-ideas] Fwd: Why do equality tests between OrderedDict keys/values views behave not as expected? In-Reply-To: References: <659951373.398716.1450406333370.JavaMail.yahoo@mail.yahoo.com> <20151218110755.GH1609@ando.pearwood.info> Message-ID: Seems like we dropped the ball... Is there any action item here? -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjreedy at udel.edu Mon Jan 11 01:04:02 2016 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 11 Jan 2016 01:04:02 -0500 Subject: [Python-ideas] RFC: PEP: Add dict.__version__ In-Reply-To: References: <20160110042427.GT10854@ando.pearwood.info> Message-ID: On 1/10/2016 12:23 AM, Chris Angelico wrote: (in response to Steven's response to my post) > There's more to it than that. Yes, a dict maps values to values; but > the keys MUST be immutable Keys just have to be hashable; only hashes need to be immutable. By default, hashes depend on ids, which are immutable for a particular object within a run. > (otherwise hashing has problems) Only if the hash depends on values that mutate. Some do. > and this optimization > doesn't actually care about the immutability of the value. astoptimizer has multiple optimizations. One is not repeating name lookups. This is safe as long as the relevant dicts have not changed. I am guessing that you were pointing to this one. Another is not repeating the call of a function with a particular value. This optimization, in general, is not safe even if dicts have not changed. It *does* care about the nature of dict values -- in particular the nature of functions that are dict values. It is the one *I* discussed, and the reason I claimed that using __version__ is tricky. His toy example is conditionally replacing len('abc') (at runtime) with '3', where '3' is computed *when the code is compiled*. For this, it is crucial that builtin len is pure and immutable. Victor is being super careful not to break code.
In response to my question, Victor said astoptimizer uses a whitelist of pure builtins to supplement the information supplied by .__version__. Dict history, summarized by __version__, is not always enough to answer 'is this optimization safe?' The nature of values is sometimes crucially important. However, others might use __version__ *without* thinking through what other information is needed. This is why I think its exposure is a bit dangerous. 19 years of experience suggests to me that misuse *will* happen. Victor just reported that CPython's type already has a *private* version count. The issue of exposing a new internal feature is somewhat separate and comes after the decision to add it. As you know, and even alluded to later in your post, CPython already replaces '1 + 1' with '2' at compile time. Method int.__add__ is pure and immutable. Since it (unlike len) also cannot be replaced or shadowed, the replacement can be complete, with '2' put in the code object (and .pyc if written), as if the programmer had actually written '2'.

>>> from dis import dis
>>> dis('1 + 1')
  1           0 LOAD_CONST               1 (2)
              3 RETURN_VALUE

JIT compilers depend on the same properties of int, float, and str operations, for instance, as well as the fact that unbox(Py object) and box(machine value) are inverses, so that unbox(box(temp_machine_value)) can be replaced by temp_machine_value. -- Terry Jan Reedy From tjreedy at udel.edu Mon Jan 11 01:16:19 2016 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 11 Jan 2016 01:16:19 -0500 Subject: [Python-ideas] RFC: PEP: Add dict.__version__ In-Reply-To: <20160110042427.GT10854@ando.pearwood.info> References: <20160110042427.GT10854@ando.pearwood.info> Message-ID: On 1/9/2016 11:24 PM, Steven D'Aprano wrote: > On Sat, Jan 09, 2016 at 05:18:40PM -0500, Terry Reedy wrote: >> Another reason to hide __version__ from the Python level is that its use >> seems to me rather tricky and bug-prone. > > What makes you say that? We would like to replace slow tortoise steps with quick rabbit jumps. Is it safe? For avoiding name lookups in dicts, careful dict guards using __version__ should be enough. For avoiding function calls, they help but are not enough. Optimization is empirically tricky and bug-prone. CPython has many private implementation details that have not been exposed at the Python level because the expected gain is not worth the expected pain. If __version__ is added, I think exposing it should be given separate consideration. -- Terry Jan Reedy From tjreedy at udel.edu Mon Jan 11 01:36:19 2016 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 11 Jan 2016 01:36:19 -0500 Subject: [Python-ideas] RFC: PEP: Add dict.__version__ In-Reply-To: References: <49b26af0-3233-4904-8b85-0106cb38f666@googlegroups.com> <20160110033140.GS10854@ando.pearwood.info> <01b601d14bbb$82648b90$872da2b0$@gmail.com> <20160110175731.GX10854@ando.pearwood.info> Message-ID: On 1/10/2016 3:02 PM, Victor Stinner wrote: > In the PEP, I wrote "The PEP is designed to implement guards on > namespaces, only the dict type can be used for namespaces in practice. > collections.UserDict is modified because it must mimic dict. collections.UserDict mimics the public interface of dict, not internal implementation details. It uses an actual dict to do this. If __version__ is not exposed at the Python level, it will not be and should not be visible via UserDict. > collections.Mapping is unchanged." > https://www.python.org/dev/peps/pep-0509/#changes > > Is it enough? If no, what do you suggest to be more explicit?
Your minimal core proposal is or should be to add a possibly private .__version__ attribute to CPython dicts, so as to enable astoptimizer. Stick with that. Stop inviting peripheral discussion and distractions. Modifying UserDict and exposing __version__ to Python code are separate issues, and can be done later if deemed desirable. -- Terry Jan Reedy From abarnert at yahoo.com Mon Jan 11 01:48:28 2016 From: abarnert at yahoo.com (Andrew Barnert) Date: Sun, 10 Jan 2016 22:48:28 -0800 Subject: [Python-ideas] RFC: PEP: Add dict.__version__ In-Reply-To: References: <20160110042427.GT10854@ando.pearwood.info> Message-ID: <5A41DC19-5652-4821-9FC5-521CB803D564@yahoo.com> On Jan 10, 2016, at 22:04, Terry Reedy wrote: > > On 1/10/2016 12:23 AM, Chris Angelico wrote: > > (in response to Steven's response to my post) > >> There's more to it than that. Yes, a dict maps values to values; but >> the keys MUST be immutable > > Keys just have to be hashable; only hashes need to be immutable. > By default, hashes depend on ids, which are immutable for a particular object within a run. > > (otherwise hashing has problems), > > only if the hash depends on values that mutate. Some do. But if equality depends on values, the hash has to depend on those same values. (Because two objects that are equal must hash equal.) Which means that if equality depends on any mutable values, the type can't be hashable. Which is why none of the built-in mutable types are hashable. Of course Python doesn't stop you from writing your own types that can provide different hashes for equal values, or that can change hashes as they're mutated. It's even possible to use them as dict keys as long as you're very careful (the keys don't mutate in a way that changes either their hash or their equivalence while they're in the dict, and you never look up or add a key that's equal to an existing key but has a different hash). But it's not _that_ much of an oversimplification to say that keys have to be immutable. And any dict-based optimizations can safely rely on the same thing basic dict usage relies on: if the keys _aren't_ actually immutable, they're coded and used carefully (as described above) so that you can't tell they're mutable. From rosuav at gmail.com Mon Jan 11 03:07:32 2016 From: rosuav at gmail.com (Chris Angelico) Date: Mon, 11 Jan 2016 19:07:32 +1100 Subject: [Python-ideas] RFC: PEP: Add dict.__version__ In-Reply-To: References: <20160110042427.GT10854@ando.pearwood.info> Message-ID: On Mon, Jan 11, 2016 at 5:04 PM, Terry Reedy wrote: > On 1/10/2016 12:23 AM, Chris Angelico wrote: > > (in response to Steven's response to my post) > >> There's more to it than that. Yes, a dict maps values to values; but >> the keys MUST be immutable > > > Keys just have to be hashable; only hashes need to be immutable. By > default, hashes depend on ids, which are immutable for a particular object > within a run. Yes, but if you're using the ID as the hash and identity as equality, then *by definition* the only way to look up that key is with that object. That means it doesn't matter to the lookup optimization if the object itself has changed:

    class Puddle(object): pass

    d = {}
    key, val = Puddle(), Puddle()
    key.foo = "foo"; val.foo = "bar"
    d[key] = val
    print(d[key])
    snapshotted_d_key = d[key]
    key.foo = "not foo"
    print(d[key])
    print(snapshotted_d_key)

The optimization in question is effectively using a local reference like snapshotted_d_key rather than doing the actual lookup again.
It can safely do this even if the attributes of that key have changed, because there is no way for that to affect the result of the lookup. So in terms of dict lookups, whatever affects hash and equality *is* the object's value; if that's its identity, then identity is the sole value that object has. >> and this optimization >> doesn't actually care about the immutability of the value. > > astoptimizer has multiple optimizations. One is not repeating name lookups. > This is safe as long as the relevant dicts have not changed. I am guessing > that you were pointing to this one. Yes, that's the one I was talking about. > Another is not repeating the call of a function with a particular value. > This optimization, in general, is not safe even if dicts have not changed. > It *does* care about the nature of dict values -- in particular the nature > of functions that are dict values. It is the one *I* discussed, and the > reason I claimed that using __version__ is tricky. Okay. In that case, yes, it takes a lot more checks. > His toy example is conditionally replacing len('abc') (at > runtime) with '3', where '3' is computed *when the code is compiled*. For > this, it is crucial that builtin len is pure and immutable. Correct. I'm getting this mental picture of angelic grace, with a chosen few most beautiful functions being commended for their purity, immutability, and reverence. > Victor is being super careful not to break code. In response to my > question, Victor said astoptimizer uses a whitelist of pure builtins to > supplement the information supplied by .__version__. Dict history, > summarized by __version__, is not always enough to answer 'is this > optimization safe?' The nature of values is sometimes crucially important. There would be very few operations that can be optimized like this. In practical terms, the only ones that I can think of are what you might call "computed literals" - like (2+3j): they aren't technically literals, but the programmer thinks of them that way. Things like module-level constants (the 'stat' module comes to mind), a small handful of simple transformations, and maybe some text<->bytes transformations (eg "abc".encode("ascii") could be replaced at compile-time with b"abc"). There won't be very many others, I suspect. ChrisA From victor.stinner at gmail.com Mon Jan 11 05:00:24 2016 From: victor.stinner at gmail.com (Victor Stinner) Date: Mon, 11 Jan 2016 11:00:24 +0100 Subject: [Python-ideas] RFC: PEP: Add dict.__version__ In-Reply-To: References: <20160110042427.GT10854@ando.pearwood.info> Message-ID: Hi, 2016-01-11 9:07 GMT+01:00 Chris Angelico : > Yes, but if you're using the ID as the hash and identity as equality, > then *by definition* the only way to look up that key is with that > object. That means it doesn't matter to the lookup optimization if the > object itself has changed: > > class Puddle(object): pass > d = {} > key, val = Puddle(), Puddle() > key.foo = "foo"; val.foo = "bar" > d[key] = val IMHO the discussion has gone too far. See the PEP: the goal is to implement efficient guards on namespaces. In namespaces, keys are short immutable strings. Not funny objects. Keys come from the Python source code, like "x" from "x=1". Again, if the dict value is mutable (like functions implemented in pure Python), there are dedicated guards for that, but no PEP is required to implement these guards ;-) See the PEP 510: to specialize a function, you have to pass a *list* of guards.
There is no arbitrary limit on the number of guards :-) (But I expect to have less than 10 guards for the common case, or more likely just a few.) > There would be very few operations that can be optimized like this. In > practical terms, the only ones that I can think of are what you might > call "computed literals" - like (2+3j), they aren't technically > literals, but the programmer thinks of them that way. FYI the Python 2 peephole optimizer is not able to optimize all operations like that because of technical issues, it's limited :-/ The Python 3 peephole optimizer is better ;-) http://faster-cpython.readthedocs.org/bytecode.html#cpython-peephole-optimizer More generally, the peephole optimizer is limited because it works on the bytecode, which is difficult to manipulate. It's difficult to implement even simple optimizations. For example, the peephole optimizer of Python 3 maintains a "stack of constants" to implement constant folding. Implementing constant folding on the AST is much easier: you can browse the subtree of a node with nice Python objects. If you are curious, you can take a look at the constant folding optimization step of astoptimizer: https://hg.python.org/sandbox/fatpython/file/tip/Lib/astoptimizer/const_fold.py It implements more optimizations than the peephole optimizer: http://faster-cpython.readthedocs.org/fat_python.html#comparison-with-the-peephole-optimizer > Things like > module-level constants (the 'stat' module comes to mind), In Python, it's rare to manipulate constants directly. But it's common to access constants coming from a different namespace, like constants at the module level. To implement constant propagation on these constants, we also need guards on the namespace to disable the optimization when a "constant" is modified (which can be done in unit tests, for example). For example, the base64 module defines: _b32alphabet = b'ABCDEFGHIJKLMNOPQRSTUVWXYZ234567' and later in a function, it uses: {v: k for k, v in enumerate(_b32alphabet)} This "complex" dict comprehension calling enumerate() can be replaced with a simpler dict literal: {65: 0, 66: 1, ...} or dict(((65, 0), (66, 1), ...)). I don't know if it's the best example, and I don't know if it's really much faster; it's just to explain the general idea. Another simple example: pickle.py defines many constants like MARK = b'(' and TUPLE = b't' at module level. Later it uses, for example, MARK + TUPLE. Using guards on the global namespace, it's possible to replace MARK + TUPLE with b'(t' to avoid two dict lookups and a call to byte string concatenation. Again, it's a simple example to explain the principle. Usually, a single optimization alone is not interesting. It's when you combine optimizations that it becomes interesting. For example, constant propagation + constant folding + simplify iterable + loop unrolling + elimination of unused variables really makes the code simpler (and more efficient). > a small > handful of simple transformations, and maybe some text<->bytes > transformations (eg "abc".encode("ascii") could be replaced at > compile-time with b"abc"). There won't be very many others, I suspect. It's possible to optimize some method calls on builtin types without guards since it's not possible to replace methods of builtin types. My old AST optimizer implements such optimizations (I didn't reimplement them in my new AST optimizer yet), but alone they are not really interesting in terms of performance.
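To make the "folding on the AST is easier" point concrete, here is a toy constant folder -- not astoptimizer's actual code, just a sketch of the general shape of an ast.NodeTransformer pass (the class name FoldAdd is invented for illustration):

    import ast

    class FoldAdd(ast.NodeTransformer):
        """Toy constant folding: collapse Num + Num into a single Num."""
        def visit_BinOp(self, node):
            self.generic_visit(node)  # fold children first
            if (isinstance(node.op, ast.Add)
                    and isinstance(node.left, ast.Num)
                    and isinstance(node.right, ast.Num)):
                return ast.copy_location(ast.Num(node.left.n + node.right.n), node)
            return node

    tree = ast.parse("x = 1 + 2 + 3")
    tree = ast.fix_missing_locations(FoldAdd().visit(tree))
    print(ast.dump(tree))  # both Add nodes collapse into a single Num(n=6)

Pure literals like these can be folded unconditionally; folding names like MARK or calls like len() additionally needs the namespace guards discussed above.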
Victor From mistersheik at gmail.com Mon Jan 11 05:18:59 2016 From: mistersheik at gmail.com (Neil Girdhar) Date: Mon, 11 Jan 2016 05:18:59 -0500 Subject: [Python-ideas] RFC: PEP: Add dict.__version__ In-Reply-To: <27E98AE5-CB3C-41AF-B31F-A73A84DC7F61@yahoo.com> References: <49b26af0-3233-4904-8b85-0106cb38f666@googlegroups.com> <20160110033140.GS10854@ando.pearwood.info> <01b601d14bbb$82648b90$872da2b0$@gmail.com> <20160110175731.GX10854@ando.pearwood.info> <27E98AE5-CB3C-41AF-B31F-A73A84DC7F61@yahoo.com> Message-ID: On Sun, Jan 10, 2016 at 7:37 PM, Andrew Barnert wrote: > On Jan 10, 2016, at 10:35, Neil Girdhar wrote: > > > On Sun, Jan 10, 2016 at 12:57 PM, Steven D'Aprano > wrote: > >> On Sun, Jan 10, 2016 at 11:48:35AM -0500, Neil Girdhar wrote: >> >> [...] >> > > v = mydict.__version__ >> > > maybe_modify(mydict) >> > > if v != mydict.__version__: >> > > print("dict has changed") >> > >> > >> > This is exactly what I want to avoid. If you want to do something like >> > this, I think you should do it in regular Python by subclassing dict and >> > overriding the mutating methods. >> >> That doesn't help Victor, because exec needs an actual dict, not >> a subclass. Victor's PEP says this is a blocker. >> > > No, he can still do what he wants transparently in the interpreter. What > I want to avoid is Python users using __version__ in their own code. > > > Well, he could change exec so it can use arbitrary mappings (or at least > dict subclasses), but I assume that's much harder and more disruptive than > his proposed change. > > Anyway, if I understand your point, it's this: __version__ should either > be a private implementation-specific property of dicts, or it should be a > property of all mappings; anything in between gets all the disadvantages of > both. > Right. I prefer the former since making it a property of mappings bloats Mapping beyond a minimum interface. > > If so, I agree with you. Encouraging people to use __version__ for other > purposes besides namespace guards, but not doing anything to guarantee it > actually exists anywhere besides namespaces, seems like a bad idea. > > But there is still something in between public and totally internal to FAT > Python. Making it a documented property of PyDict objects at the C API > level is a different story--there are already plenty of ways that C code > can use those objects that won't work with arbitrary mappings, so adding > another doesn't seem like a problem. > Adding it to PyDict and exposing it in the C API is totally reasonable to me. > And even making it public but implementation-specific at the Python level > may be useful for other CPython-specific optimizers (even if partially > written in Python); if so, the danger that > someone could abuse it for code that should work with arbitrary mappings or > with another Python implementation is best addressed by clearly documenting > its non-portability and discouraging its abuse in the docs, not by hiding > it. > > Here is where I have to disagree. I hate it when experts say "we'll just document it and then it's the user's fault for misusing it". Yeah, you're right, but as a user, it is very frustrating to have to read other people's documentation. You know that some elite Python programmer is going to optimize his code using this and someone years later is going to scratch his head wondering where __version__ is coming from. Is it provided by the caller? Was it added to the object at some earlier point?
Finally, he'll search the web, arrive at a Stack Overflow question with 95 upvotes that finally clears things up. And for what? Some minor optimization. (Not Victor's optimization, but a Python user's optimization in Python code.) Python should make it easy to write clear code. It's my opinion that documentation is not a substitute for good language design, just as comments are not a substitute for good code design. Also, using this __version__ in source code is going to complicate switching from CPython to any of the other Python implementations, so those implementations will probably end up implementing it just to simplify "porting", which would otherwise be painless. Why don't we leave exposing __version__ in Python to another PEP? Once it's in the C API (as you proposed) you will be able to use it from Python by writing an extension and then someone can demonstrate the value of exposing it in Python by writing tests. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Mon Jan 11 05:23:35 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 11 Jan 2016 20:23:35 +1000 Subject: [Python-ideas] More friendly access to chmod In-Reply-To: References: Message-ID: On 10 January 2016 at 08:19, Chris Angelico wrote: > On Sun, Jan 10, 2016 at 3:51 AM, Mahmoud Hashemi wrote: >> I think it's a pretty common itch! Have you seen the boltons implementation? >> http://boltons.readthedocs.org/en/latest/fileutils.html#file-permissions > > Yes it is, and no I haven't; everyone has a slightly different idea of > what makes a good API, and that's why I put that caveat onto my > suggestion. You can't make everyone happy, and APIs should not be > designed by committee :) In the context of Python as a cross-platform language, it's also important to remember that POSIX-style user/group/other permissions are only one form of file-level access control - depending on your filesystem and OS, there will be a range of others. That significantly reduces the motivation to try to provide a platform-independent abstraction for an inherently platform-specific concept (at least in the standard library - PyPI is a different matter). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From abarnert at yahoo.com Mon Jan 11 05:55:37 2016 From: abarnert at yahoo.com (Andrew Barnert) Date: Mon, 11 Jan 2016 02:55:37 -0800 Subject: [Python-ideas] More friendly access to chmod In-Reply-To: References: Message-ID: On Jan 11, 2016, at 02:23, Nick Coghlan wrote: > > In the context of Python as a cross-platform language, it's also > important to remember that POSIX-style user/group/other permissions > are only one form of file level access control - depending on your > filesystem and OS, there will be a range of others. Well, not a _huge_ range; as far as I know, the only things you're ever likely to run into besides POSIX permissions or a simple read-only flag are ACLs*. But that's still enough of a range to worry about... * Yes, NT and POSIX ACLs aren't quite identical, and the POSIX standard was never completed and there are some minor differences between the Linux and BSD implementations, and OS X confused things by using the NT design with a POSIX-ish API and completely unique tools, so using ACLs portably isn't trivial. But, except for the problem of representing ACLs for users who don't exist on the system, they're pretty much equivalent for almost anything you care about at the application level.
From steve at pearwood.info Mon Jan 11 06:20:11 2016 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 11 Jan 2016 22:20:11 +1100 Subject: [Python-ideas] RFC: PEP: Add dict.__version__ In-Reply-To: References: <20160110033140.GS10854@ando.pearwood.info> <01b601d14bbb$82648b90$872da2b0$@gmail.com> <20160110175731.GX10854@ando.pearwood.info> <27E98AE5-CB3C-41AF-B31F-A73A84DC7F61@yahoo.com> Message-ID: <20160111112011.GA10854@ando.pearwood.info> On Mon, Jan 11, 2016 at 05:18:59AM -0500, Neil Girdhar wrote: > Here is where I have to disagree. I hate it when experts say "we'll just > document it and then it's the user's fault for misusing it". Yeah, you're > right, but as a user, it is very frustrating to have to read other people's > documentation. You know that some elite Python programmer is going to > optimize his code using this and someone years later is going to scratch > his head wondering where __version__ is coming from. Is it provided by > the caller? Was it added to the object at some earlier point? Neil, don't you think you're being overly dramatic here? "Programmer needs to look up API feature, news at 11!" The same could be said about class.__name__, instance.__class__, obj.__doc__, module.__dict__ and indeed every single Python feature. Sufficiently inexperienced or naive programmers could be scratching their head over literally *anything*. (I remember being perplexed by None the first time I read Python code. What was it and where did it come from? I had no idea.) All those words for such a simple, and minor, point: every new API feature is one more thing for programmers to learn. We get that. But the following is a good, strong argument: > Also, using this __version__ in source code is going to complicate > switching from CPython to any of the other Python implementations, so those > implementations will probably end up implementing it just to simplify > "porting", which would otherwise be painless. > > Why don't we leave exposing __version__ in Python to another PEP? Once > it's in the C API (as you proposed) you will be able to use it from Python > by writing an extension and then someone can demonstrate the value of > exposing it in Python by writing tests. I can't really argue against this. As much as I would love to play around with __version__, I think you're right. It needs to prove itself before being exposed as a public API. -- Steve From ram at rachum.com Mon Jan 11 07:02:44 2016 From: ram at rachum.com (Ram Rachum) Date: Mon, 11 Jan 2016 14:02:44 +0200 Subject: [Python-ideas] More friendly access to chmod In-Reply-To: References: Message-ID: Hi everyone, I spent some time thinking about this. I came up with a big and impressive API, then figured it was overkill, shelved it and made a simpler one :) Here's my new preferred API. Assume that `path` is a `pathlib.Path` object.
Checking the chmod of the file:

    int(path.chmod)   # Get an int 420, which in octal is 0o644
    oct(path.chmod)   # Get a string '0o644'
    str(path.chmod)   # Get a string 'rw-r--r--'
    repr(path.chmod)  # Get a string '

Modifying the chmod of the file:

    path.chmod(0o644)              # Set chmod to 0o644 (for backward compatibility)
    path.chmod = 0o644             # Set chmod to 0o644
    path.chmod = 420               # Set chmod to 0o644, which is 420 in decimal
    path.chmod = other_path.chmod  # Set chmod to be the same as that of some other file
    path.chmod = 'rw-r--r--'       # Set chmod to 0o644
    path.chmod += '--x--x--x'      # Add execute permission to everyone
    path.chmod -= '----rwx'        # Remove all permissions from others

I've chosen += and -=, despite the fact they're not set operations, because Python doesn't have __inand__. On an unrelated note, maybe we should have __inand__? (I mean x ^~= y) What do you think? On Sat, Jan 9, 2016 at 6:11 PM, Chris Angelico wrote: > On Sun, Jan 10, 2016 at 3:06 AM, Ram Rachum wrote: > > Thanks for the reference. Personally I think that > `my_path.stat().st_mode & > > stat.S_IXGRP` is not human-readable enough. I'll work on a nicer API. > > Probably this for the same action you described: > > > > 'x' in my_path.chmod()['g'] > > > > > > Okay. I'm not sure how popular that'll be, but sure. > > As an alternative API, you could have it return a tuple of permission > strings, which you'd use thus: > > 'gx' in my_path.mode() # Group eXecute permission is set > > But scratch your own itch, and don't give in to the armchair advisers. > > ChrisA > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Mon Jan 11 07:29:26 2016 From: rosuav at gmail.com (Chris Angelico) Date: Mon, 11 Jan 2016 23:29:26 +1100 Subject: [Python-ideas] More friendly access to chmod In-Reply-To: References: Message-ID: On Mon, Jan 11, 2016 at 11:02 PM, Ram Rachum wrote: > Here's my new preferred API. Assume that `path` is a `pathlib.Path` object. > > Checking the chmod of the file: > int(path.chmod) # Get an int 420, which in octal is 0o644 > oct(path.chmod) # Get a string '0o644' > str(path.chmod) # Get a string 'rw-r--r--' > repr(path.chmod) # Get a string ' > > Modifying the chmod of the file: > path.chmod(0o644) # Set chmod to 0o644 (for backward > compatibility) > path.chmod = 0o644 # Set chmod to 0o644 > path.chmod = 420 # Set chmod to 0o644, which is 420 in decimal > path.chmod = other_path.chmod # Set chmod to be the same as that > of some other file > path.chmod = 'rw-r--r--' # Set chmod to 0o644 > path.chmod += '--x--x--x' # Add execute permission to everyone > path.chmod -= '----rwx' # Remove all permissions from others > > I've chosen += and -=, despite the fact they're not set operations, because > Python doesn't have __inand__. On an unrelated note, maybe we should have > __inand__? (I mean x ^~= y) > > What do you think? > The one thing I'd do differently is call it "mode" or "permissions" rather than "chmod" (CHange MODe), and drop the callability. If you're going to do it as property assignment, making that property also be callable feels awkward (plus it'll be a pain to implement). But otherwise, yeah! Looks great! ChrisA From jsbueno at python.org.br Mon Jan 11 07:41:37 2016 From: jsbueno at python.org.br (Joao S. O.
Bueno) Date: Mon, 11 Jan 2016 10:41:37 -0200 Subject: [Python-ideas] More friendly access to chmod In-Reply-To: References: Message-ID: If you are doing it OO and trying to create a human-usable API, then why the hell stick with octal and string representations from the 1970's? path.chmod.executable could return a named-tuple-like object, with owner=True, group=False, all=False - and conversely, you could have path.chmod.owner return (read=True, write=True, execute=True) And thus one could simply do: if path.chmod.owner.writable: ... On 11 January 2016 at 10:29, Chris Angelico wrote: > On Mon, Jan 11, 2016 at 11:02 PM, Ram Rachum wrote: >> Here's my new preferred API. Assume that `path` is a `pathlib.Path` object. >> >> Checking the chmod of the file: >> int(path.chmod) # Get an int 420, which in octal is 0o644 >> oct(path.chmod) # Get a string '0o644' >> str(path.chmod) # Get a string 'rw-r--r--' >> repr(path.chmod) # Get a string ' >> >> Modifying the chmod of the file: >> path.chmod(0o644) # Set chmod to 0o644 (for backward >> compatibility) >> path.chmod = 0o644 # Set chmod to 0o644 >> path.chmod = 420 # Set chmod to 0o644, which is 420 in decimal >> path.chmod = other_path.chmod # Set chmod to be the same as that >> of some other file >> path.chmod = 'rw-r--r--' # Set chmod to 0o644 >> path.chmod += '--x--x--x' # Add execute permission to everyone >> path.chmod -= '----rwx' # Remove all permissions from others >> >> I've chosen += and -=, despite the fact they're not set operations, because >> Python doesn't have __inand__. On an unrelated note, maybe we should have >> __inand__? (I mean x ^~= y) >> >> What do you think? >> > > The one thing I'd do differently is call it "mode" or "permissions" > rather than "chmod" (CHange MODe), and drop the callability. If you're > going to do it as property assignment, making that property also be > callable feels awkward (plus it'll be a pain to implement). But > otherwise, yeah! Looks great! > > ChrisA > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From rosuav at gmail.com Mon Jan 11 07:47:48 2016 From: rosuav at gmail.com (Chris Angelico) Date: Mon, 11 Jan 2016 23:47:48 +1100 Subject: [Python-ideas] More friendly access to chmod In-Reply-To: References: Message-ID: On Mon, Jan 11, 2016 at 11:41 PM, Joao S. O. Bueno wrote: > If you are doing it OO and trying to create a human-usable API, then, > why the hell to stick with > octal and string representations from the 1970's? Because they are compact and readable, even when you have lots of them in a column.

$ ll /tmp
total 24
drwx------ 2 rosuav rosuav 4096 Jan  8 05:27 gpg-4K37Xk
drwxr-xr-x 2 root   root   4096 Jan  8 05:36 hsperfdata_root
drwxr-xr-x 2 rosuav rosuav 4096 Jan 11 22:35 hsperfdata_rosuav
drwx------ 2 root   root   4096 Jan  8 05:26 pulse-PKdhtXMmr18n
prwxr-xr-x 1 rosuav rosuav    0 Jan 11 22:26 SciTE.1722.in
drwx------ 2 rosuav rosuav 4096 Jan  8 05:27 ssh-Feo5RK7e1TV3
drwx------ 3 root   root   4096 Jan  8 05:27 systemd-private-843a33883f7e4c5c8e6ff168f853c415-rtkit-daemon.service-cdt00F

You can see at a glance which ones are readable by people other than their owners. That's worth keeping. It doesn't have to be the ONLY way to do things, but it's definitely one that I do not want to lose.
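(For what it's worth, the stdlib can already derive that compact string form from a numeric mode; nothing new is needed for the read side. A minimal example, with '/tmp' just as a sample path:

    import os, stat
    print(stat.filemode(os.stat('/tmp').st_mode))  # e.g. 'drwxrwxrwt'

stat.filemode() has been available since Python 3.3.)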
ChrisA From marcin at urzenia.net Mon Jan 11 07:57:02 2016 From: marcin at urzenia.net (Marcin Sztolcman) Date: Mon, 11 Jan 2016 13:57:02 +0100 Subject: [Python-ideas] More friendly access to chmod In-Reply-To: References: Message-ID: On Mon, Jan 11, 2016 at 1:02 PM, Ram Rachum wrote: > I spent some time thinking about this. I came up with a big and impressive > API, then figured it was overkill, shelved it and made a simpler one :) > > Here's my new preferred API. Assume that `path` is a `pathlib.Path` object. > > Checking the chmod of the file: > int(path.chmod) # Get an int 420, which in octal is 0o644 > oct(path.chmod) # Get a string '0o644' > str(path.chmod) # Get a string 'rw-r--r--' > repr(path.chmod) # Get a string ' > > Modifying the chmod of the file: > path.chmod(0o644) # Set chmod to 0o644 (for backward > compatibility) > path.chmod = 0o644 # Set chmod to 0o644 > path.chmod = 420 # Set chmod to 0o644, which is 420 in decimal > path.chmod = other_path.chmod # Set chmod to be the same as that > of some other file > path.chmod = 'rw-r--r--' # Set chmod to 0o644 > path.chmod += '--x--x--x' # Add execute permission to everyone > path.chmod -= '----rwx' # Remove all permissions from others > > I've chosen += and -=, despite the fact they're not set operations, because > Python doesn't have __inand__. On an unrelated note, maybe we should have > __inand__? (I mean x ^~= y) > > What do you think? There is only one way...? ;) My proposal (used for some time in a few private projects, but extracted as a standalone package a few days ago): https://github.com/msztolcman/fileperms -- Marcin Sztolcman :: http://urzenia.net/ :: http://sztolcman.eu/ From victor.stinner at gmail.com Mon Jan 11 09:04:15 2016 From: victor.stinner at gmail.com (Victor Stinner) Date: Mon, 11 Jan 2016 15:04:15 +0100 Subject: [Python-ideas] RFC: PEP: Add dict.__version__ In-Reply-To: References: <49b26af0-3233-4904-8b85-0106cb38f666@googlegroups.com> <20160110033140.GS10854@ando.pearwood.info> <01b601d14bbb$82648b90$872da2b0$@gmail.com> <20160110175731.GX10854@ando.pearwood.info> <27E98AE5-CB3C-41AF-B31F-A73A84DC7F61@yahoo.com> Message-ID: 2016-01-11 11:18 GMT+01:00 Neil Girdhar : >> No, he can still do what he wants transparently in the interpreter. What >> I want to avoid is Python users using __version__ in their own code. >> >> Well, he could change exec so it can use arbitrary mappings (or at least >> dict subclasses), but I assume that's much harder and more disruptive than >> his proposed change. >> >> Anyway, if I understand your point, it's this: __version__ should either >> be a private implementation-specific property of dicts, or it should be a >> property of all mappings; anything in between gets all the disadvantages of >> both. > > Right. I prefer the former since making it a property of mappings > bloats Mapping beyond a minimum interface. The discussion on adding a __version__ property on all mapping types is interesting. I now agree that it's a binary choice: either no mapping type has a __version__ property, or all mapping types must have it. It would be annoying to get a cryptic issue when we pass a dict subtype or a dict-like type to a function expecting a "mapping". I *don't* want to require all mapping types to implement a __version__ property. Even if it's simple to implement, some types can be a simple wrapper on top of an existing efficient mapping type which doesn't implement such a property (or worse, has a similar *but different* property).
For example, Jython and IronPython probably reuse existing mapping types of Java and .NET, and I don't think that they have such a version property. The Mapping ABC already requires a lot of methods; having to implement yet another property would make the implementation even more complex and difficult to maintain. My PEP 509 requires 8 methods (including the constructor) to update the __version__. > Here is where I have to disagree. I hate it when experts say "we'll just > document it and then it's the user's fault for misusing it". Yeah, you're > right, but as a user, it is very frustrating to have to read other people's > documentation. You know that some elite Python programmer is going to > optimize his code using this and someone years later is going to scratch his > head wondering where __version__ is coming from. Is it provided by the > caller? Was it added to the object at some earlier point? Finally, he'll > search the web, arrive at a stackoverflow question with 95 upvotes that > finally clears things up. And for what? Some minor optimization. (Not > Victor's optimization, but a Python user's optimization in Python code.) I agree that it would be a bad practice to use __version__ widely in a project to manually micro-optimize an application. Well, micro-optimizations are bad practice in most cases ;-) Remember that dict lookups have a complexity of O(1), that's why they are used for namespaces ;-) It's a bad idea because at the Python level, the dict lookup and checking the version have... the same cost! (48.7 ns vs 47.5 ns... a difference of 1 nanosecond)

haypo at smithers$ ./python -m timeit -s 'd = {str(i):i for i in range(100)}' 'd["33"] == 33'
10000000 loops, best of 3: 0.0487 usec per loop
haypo at smithers$ ./python -m timeit -s 'd = {str(i):i for i in range(100)}' 'd.__version__ == 100'
10000000 loops, best of 3: 0.0475 usec per loop

The difference is only visible at the C level:

* PyObject_GetItem: 16.5 ns
* PyDict_GetItem: 14.8 ns
* fat.GuardDict: 3.8 ns (check dict.__version__)

Well, 3.8 ns (guard) vs 14.8 ns (dict lookup) is nice but not so amazing; a dict lookup is already *fast*. The difference between guards and dict lookups is that a guard check has a complexity of O(1) in the common case (if the dict was not modified). For example, for an optimization using 10 global variables in a function, the checks cost 148 ns for 10 dict lookups, whereas the guard still only costs 3.8 ns (39x as fast). The guards must be as cheap as possible; otherwise the optimizer will have to work harder to implement more efficient optimizations :-D Note: the performance of a dict lookup also depends on whether the key is "interned" (in short, it's a kind of singleton to compare strings by their address instead of having to compare character per character). For code objects, Python interns strings which are made of characters a-z, A-Z and "_". Well, it's just to confirm that yes, the PEP is designed to implement fast guards in C, but it would be a bad idea to start to use it widely at the Python level. > Also, using this __version__ in source code is going to complicate switching > from CPython to any of the other Python implementations, so those > implementations will probably end up implementing it just to simplify > "porting", which would otherwise be painless. IMHO *if* we add __version__ to dict (or even to all mapping types), it must be done for all Python implementations. It would be really annoying to have to start putting a kind of #ifdef in the code for a feature of a core builtin type (dict).
But again, I now agree to not expose the version at the Python level... > Why don't we leave exposing __version__ in Python to another PEP? According to this thread and my benchmark above, the __version__ property at the Python level is a *bad* idea. So I'm no longer interested in exposing it. Victor From barry at python.org Mon Jan 11 10:10:41 2016 From: barry at python.org (Barry Warsaw) Date: Mon, 11 Jan 2016 10:10:41 -0500 Subject: [Python-ideas] PEP 9 - plaintext PEP format - is officially deprecated In-Reply-To: References: <20160105184921.317ac5ec@limelight.wooz.org> Message-ID: <20160111101041.1ba6177d@limelight.wooz.org> On Jan 11, 2016, at 03:25 PM, anatoly techtonik wrote: >On Wed, Jan 6, 2016 at 2:49 AM, Barry Warsaw wrote: > >> reStructuredText is clearly a better format > >Can you expand on that? I use markdown everywhere reST is better than plain text. Markdown is not a PEP format option. >> all recent PEP submissions have been in reST for a while now anyway. > >Is it possible to query exact numbers automatically? Feel free to grep the PEPs hg repo. >What is the tooling support for handling PEP 9 and PEP 12? UTSL. Everything is in the PEPs hg repo. Cheers, -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From victor.stinner at gmail.com Mon Jan 11 11:49:21 2016 From: victor.stinner at gmail.com (Victor Stinner) Date: Mon, 11 Jan 2016 17:49:21 +0100 Subject: [Python-ideas] RFC: PEP: Add dict.__version__ In-Reply-To: References: Message-ID: Thank you very much for the first round of comments on this PEP 509 (dict version). I posted a second version to python-dev. The main changes since the first version are that the dictionary version is no longer exposed at the Python level and the version field now also has a size of 64 bits on 32-bit platforms. Please continue the discussion there, this thread is now closed ;-) It's now time to review my second PEP 510 (func.specialize), also posted on the python-ideas list 3 days ago: "RFC: PEP: Specialized functions with guards"! Victor From abarnert at yahoo.com Mon Jan 11 11:49:45 2016 From: abarnert at yahoo.com (Andrew Barnert) Date: Mon, 11 Jan 2016 08:49:45 -0800 Subject: [Python-ideas] More friendly access to chmod In-Reply-To: References: Message-ID: <98A19944-7566-454D-8FC0-3106EFB30560@yahoo.com> On Jan 11, 2016, at 04:02, Ram Rachum wrote: > > I've chosen += and -=, despite the fact they're not set operations, because Python doesn't have __inand__. For a property that acts like a number, and presumably is implemented as a subclass of int, this seems like a horribly confusing idea. > On an unrelated note, maybe we should have __inand__? (I mean x ^~= y) First, why would you spell inand that way? x ^ ~y is the exclusive or of x and ~y, which is true for ~x and ~y and for x and y. That's completely different from nand, which is true for ~x and ~y, ~x and y, and x and ~y, but not x and y. And neither is what you want, which is true only for x and ~y, which you can easily write as x & ~y. Second, why do you think you need an i-operation for a combined operator? x &= ~y does the same thing you'd expect from x &~= y. And why do you think you need an overridable i-operator in the first place? If you call x |= y and x.__ior__ doesn't exist, it just compiles to the same as x = x | y.
And, unless x is mutable (which would be very surprising for something that acts like an int), that's actually the way you want it to be interpreted anyway. All of this implies that adding the 70s bitwise operator syntax for dealing with permissions doesn't help with concise but readable code so much as encourage people who don't actually understand bitwise operations to write things that aren't correct or to misread other people's code. What's wrong with just spelling it "clear"? Or, better, as attribute access ("p.chmod.executable = False" or "p.chmod.group.executable = False") or actual set operations with sets of enums instead of integers? The other advantage of using named operations is that it lets you write things that are useful but can't be expressed in a single bitwise operation. For example, "p.chmod.group = q.chmod.owner" is a lot simpler than "p.chmod = p.chmod & ~0o070 | (q.chmod >> 3) & 0o070". Meanwhile, why are you calling the mode "chmod", which is an abbreviation for "change mode"? That's sort of readable but still weird for the cases when you're modifying it, but completely confusing for cases when you're just reading it off. Have you looked at the existing alternatives on PyPI? If so, why isn't one of them good enough? And meanwhile, why not just put your library on PyPI and see if others take it up and start using it? Is there a reason this has to be in the stdlib (and only available on 3.6+) to be usable? From rosuav at gmail.com Mon Jan 11 11:53:04 2016 From: rosuav at gmail.com (Chris Angelico) Date: Tue, 12 Jan 2016 03:53:04 +1100 Subject: [Python-ideas] More friendly access to chmod In-Reply-To: <98A19944-7566-454D-8FC0-3106EFB30560@yahoo.com> References: <98A19944-7566-454D-8FC0-3106EFB30560@yahoo.com> Message-ID: On Tue, Jan 12, 2016 at 3:49 AM, Andrew Barnert wrote: > On Jan 11, 2016, at 04:02, Ram Rachum wrote: >> >> I've chosen += and -=, despite the fact they're not set operations, because Python doesn't have __inand__. > > For a property that acts like a number, and presumably is implemented as a subclass of int, this seems like a horribly confusing idea. I would expect it NOT to be a subclass of int, actually - just that it has __int__ (and maybe __index__) to convert it to one. ChrisA From guido at python.org Mon Jan 11 12:27:20 2016 From: guido at python.org (Guido van Rossum) Date: Mon, 11 Jan 2016 09:27:20 -0800 Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to support Python 2.7 In-Reply-To: References: <007001d14af8$1dcdb3a0$59691ae0$@gmail.com> Message-ID: Unless there's a huge outcry I'm going to add this as an informational section to PEP 484. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Mon Jan 11 12:38:14 2016 From: guido at python.org (Guido van Rossum) Date: Mon, 11 Jan 2016 09:38:14 -0800 Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to support Python 2.7 In-Reply-To: References: <007001d14af8$1dcdb3a0$59691ae0$@gmail.com> Message-ID: Done: https://hg.python.org/peps/rev/06f8470390c2 (I'm happy to change or move this if there *is* a serious concern -- but I figured if there isn't I might as well get it over with.) On Mon, Jan 11, 2016 at 9:27 AM, Guido van Rossum wrote: > Unless there's a huge outcry I'm going to add this as an informational > section to PEP 484.
> > -- > --Guido van Rossum (python.org/~guido) > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From kramm at google.com Mon Jan 11 13:10:20 2016 From: kramm at google.com (Matthias Kramm) Date: Mon, 11 Jan 2016 10:10:20 -0800 (PST) Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to support Python 2.7 In-Reply-To: References: Message-ID: <46f6aedf-9871-4541-9263-36302c2f0b1d@googlegroups.com> On Friday, January 8, 2016 at 3:06:05 PM UTC-8, Guido van Rossum wrote: > > At Dropbox we're trying to be good citizens and we're working towards > introducing gradual typing (PEP 484) into our Python code bases (several > million lines of code). However, that code base is mostly still Python 2.7 > and we believe that we should introduce gradual typing first and start > working on conversion to Python 3 second (since having static types in the > code can help a big refactoring like that). > > Since Python 2 doesn't support function annotations we've had to look for > alternatives. We considered stub files, a magic codec, docstrings, and > additional `# type:` comments. In the end we decided that `# type:` > comments are the most robust approach. > FWIW, we had the same problem at Google. (Almost) all our code is Python 2. However, we went the route of backporting the type annotations grammar from Python 3. We now run a custom Python 2 that knows about PEP 3107. The primary reasons are aesthetic - PEP 484 syntax is already a bit hard on the eyes (capitalized container names, square brackets, quoting, ...), and squeezing it all into comments wouldn't have helped matters, and would have hindered adoption. We're still happy with our decision of running a custom Python 2, but your mileage might vary. It's certainly true that other tools (pylint etc.) need to learn to not be confused by the "odd" Python 2 syntax. > [1] I have a prototype of such a tool, implemented as a 2to3 fixer. It's a > bit over 200 lines. It's not very interesting yet, since it sets the types > of nearly all arguments to 'Any'. We're considering building a much more > advanced version that tries to guess much better argument types using some > form of whole-program analysis. I've heard that Facebook's Hack project got > a lot of mileage out of such a tool. I don't yet know how to write it yet > -- possibly we could use a variant of mypy's type inference engine, or > alternatively we might be able to use something like Jedi ( > https://github.com/davidhalter/jedi). > pytype (http://github.com/google/pytype) already does (context-sensitive, path-sensitive) whole-program analysis, and we're working on making it (more) PEP 484 compatible. We're also writing a (2to3 based) tool for inserting the derived types back into the source code. Should we join forces? Matthias -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Mon Jan 11 13:35:00 2016 From: chris.barker at noaa.gov (Chris Barker) Date: Mon, 11 Jan 2016 10:35:00 -0800 Subject: [Python-ideas] More friendly access to chmod In-Reply-To: References: <20160109153931.GR10854@ando.pearwood.info> Message-ID: On Sat, Jan 9, 2016 at 7:41 AM, Ram Rachum wrote: > >> What's wrong with referencing other modules? >> >> > Not wrong, just desirable to avoid. For example, I think that doing > `path.chmod(x)` is preferable to `os.chmod(path, x)`.
I often prefer OO structure as well, but you can get that by subclassing Path -- it doesn't need to be in the stdlib. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg at krypto.org Mon Jan 11 13:42:30 2016 From: greg at krypto.org (Gregory P. Smith) Date: Mon, 11 Jan 2016 18:42:30 +0000 Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to support Python 2.7 In-Reply-To: <5691018E.4090006@egenix.com> References: <5691018E.4090006@egenix.com> Message-ID: On Sat, Jan 9, 2016 at 4:48 AM M.-A. Lemburg wrote: > On 09.01.2016 00:04, Guido van Rossum wrote: > > Since Python 2 doesn't support function annotations we've had to look for > > alternatives. We considered stub files, a magic codec, docstrings, and > > additional `# type:` comments. In the end we decided that `# type:` > > comments are the most robust approach. We've experimented a fair amount > > with this and we have a proposal for a standard. > > > > The proposal is very simple. Consider the following function with Python > 3 > > annotations: > > > > def embezzle(self, account: str, funds: int = 1000000, > *fake_receipts: > > str) -> None: > > """Embezzle funds from account using fake receipts.""" > > > > > > An equivalent way to write this in Python 2 is the following: > > > > def embezzle(self, account, funds=1000000, *fake_receipts): > > # type: (str, int, *str) -> None > > """Embezzle funds from account using fake receipts.""" > > By using comments, the annotations would not be available at > runtime via an .__annotations__ attribute and every tool would > have to implement a parser for extracting them. > > Wouldn't it be better and more in line with standard Python > syntax to use decorators to define them? > > @typehint("(str, int, *str) -> None") > def embezzle(self, account, funds=1000000, *fake_receipts): > """Embezzle funds from account using fake receipts.""" > > > This would work in Python 2 as well and could (optionally) > add an .__annotations__ attribute to the function/method, > automatically create a type annotations file upon import, > etc. The goal of the # type: comments as described is to have this information for offline analysis of code, not to make it available at runtime. Yes, a decorator syntax could be adopted if anyone needs that. I don't expect anyone does. Decorators and attributes would add runtime CPU and memory overhead whether the information was going to be used at runtime or not (likely not; nobody is likely to *deploy* code that looks at __annotations__). -gps > > -- > Marc-Andre Lemburg > eGenix.com > > Professional Python Services directly from the Experts (#1, Jan 09 2016) > >>> Python Projects, Coaching and Consulting ... http://www.egenix.com/ > >>> Python Database Interfaces ... http://products.egenix.com/ > >>> Plone/Zope Database Interfaces ... http://zope.egenix.com/ > ________________________________________________________________________ > > ::: We implement business ideas - efficiently in both time and costs ::: > > eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 > D-40764 Langenfeld, Germany. CEO Dipl.-Math.
Marc-Andre Lemburg > Registered at Amtsgericht Duesseldorf: HRB 46611 > http://www.egenix.com/company/contact/ > http://www.malemburg.com/ > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg at krypto.org Mon Jan 11 13:57:43 2016 From: greg at krypto.org (Gregory P.
Smith) Date: Mon, 11 Jan 2016 18:57:43 +0000 Subject: [Python-ideas] find-like functionality in pathlib In-Reply-To: <5F6A858FD00E5F4A82E3206D2D854EF892D48F92@EXMB09.ohsu.edu> References: <357278012.1941601.1450821295123.JavaMail.yahoo@mail.yahoo.com> <8348957817041496645@unknownmsgid> <5F6A858FD00E5F4A82E3206D2D854EF892D48AA2@EXMB09.ohsu.edu> <1452096673.440928.484566490.207F90DE@webmail.messagingengine.com> <5F6A858FD00E5F4A82E3206D2D854EF892D48F92@EXMB09.ohsu.edu> Message-ID: On Wed, Jan 6, 2016 at 3:05 PM Brendan Moloney wrote: > It's important to keep in mind the main benefit of scandir is you don't > have to do ANY stat call in many cases, because the directory listing > provides some subset of this info. On Linux you can at least tell if a path > is a file or directory. On Windows there is much more info provided by the > directory listing. Avoiding subsequent stat calls is also nice, but not > nearly as important due to OS level caching. > +1 - this was one of the two primary motivations behind scandir. Anything trying to reimplement a filesystem tree walker without using scandir is going to have sub-standard performance. If we ever offer anything with "find like functionality" related to pathlib, it *needs* to be based on scandir. Anything else would just be repeating the convenient but untrue limiting assumptions of os.listdir: That the contents of a directory can be loaded into memory and that we don't mind re-querying the OS for stat information that it already gave us but we threw away as part of reading the directory. -gps > > > Brendan Moloney > Research Associate > Advanced Imaging Research Center > Oregon Health Science University > *From:* Python-ideas [python-ideas-bounces+moloney=ohsu.edu at python.org] > on behalf of Guido van Rossum [guido at python.org] > *Sent:* Wednesday, January 06, 2016 2:42 PM > *To:* Random832 > > *Cc:* Python-Ideas > *Subject:* Re: [Python-ideas] find-like functionality in pathlib > I couldn't help myself and coded up a prototype for the StatCache design I > sketched. See http://bugs.python.org/issue26031. Feedback welcome! On my > Mac it only seems to offer limited benefits though... > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL:
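To make the scandir point concrete, here is a rough sketch of such a walker, assuming Python 3.5's os.scandir (the scandir package on PyPI provides the same API for 2.7). It yields file paths relative to the given root, and in the common case never makes an extra stat() call, because DirEntry.is_dir() reuses the type information the directory listing already supplied. The function name is made up for illustration; error handling, pruning and symlink policy are deliberately left out:

    import os

    def iter_relative_files(root):
        # Walk the tree iteratively; DirEntry avoids per-entry stat calls
        # whenever the OS already reported the entry type in the listing.
        stack = [root]
        while stack:
            directory = stack.pop()
            for entry in os.scandir(directory):
                if entry.is_dir(follow_symlinks=False):
                    stack.append(entry.path)
                else:
                    yield os.path.relpath(entry.path, root)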
From p.f.moore at gmail.com Mon Jan 11 15:00:54 2016 From: p.f.moore at gmail.com (Paul Moore) Date: Mon, 11 Jan 2016 20:00:54 +0000 Subject: [Python-ideas] find-like functionality in pathlib In-Reply-To: References: <357278012.1941601.1450821295123.JavaMail.yahoo@mail.yahoo.com> <8348957817041496645@unknownmsgid> <5F6A858FD00E5F4A82E3206D2D854EF892D48AA2@EXMB09.ohsu.edu> <1452096673.440928.484566490.207F90DE@webmail.messagingengine.com> <5F6A858FD00E5F4A82E3206D2D854EF892D48F92@EXMB09.ohsu.edu> Message-ID: On 11 January 2016 at 18:57, Gregory P. Smith wrote: > On Wed, Jan 6, 2016 at 3:05 PM Brendan Moloney wrote: >> >> It's important to keep in mind the main benefit of scandir is you don't >> have to do ANY stat call in many cases, because the directory listing >> provides some subset of this info. On Linux you can at least tell if a path >> is a file or directory. On Windows there is much more info provided by the >> directory listing. Avoiding subsequent stat calls is also nice, but not >> nearly as important due to OS level caching. > > > +1 - this was one of the two primary motivations behind scandir. Anything > trying to reimplement a filesystem tree walker without using scandir is > going to have sub-standard performance. > > If we ever offer anything with "find like functionality" related to pathlib, > it needs to be based on scandir. Anything else would just be repeating the > convenient but untrue limiting assumptions of os.listdir: That the contents > of a directory can be loaded into memory and that we don't mind re-querying > the OS for stat information that it already gave us but we threw away as > part of reading the directory. This is very much why I feel that we need something in pathlib. I understand the motivation for not caching stat information in path objects. And I don't have a viable design for how a "find-like functionality" API should be implemented in pathlib. But as it stands, I feel as though using pathlib for anything that does bulk filesystem scans is deliberately choosing something that I know won't scale well. So (in my mind) pathlib doesn't fulfil the role of "one obvious way to do things". Which is a shame, because Path.rglob is very often far closer to what I need in my programs than os.walk (even when it's just rootpath.rglob('*')). In practice, by far the most common need I have[1] for filetree walking is to want to get back a list of all the names of files starting at a particular directory with the returned filenames *relative to the given root*. Path.rglob gives absolute pathnames. os.walk gives the absolute directory name and the base filename. Neither is what I want, although obviously in both cases it's pretty trivial to extract the "relative to the root" part from the returned data. But an API that gave that information directly, with scandir-level speed and scalability, in the form of pathlib.Path relative path objects, would be ideal for me[2]. Paul [1] And yes, I know this means I should just write a utility function for it :-) [2] The feature creep starts when people want to control things like pruning particular directories such as '.git', or only matching particular glob patterns, or choosing whether or not to include directories in the output, or... Adding *those* features without ending up with a Frankenstein's monster of an API is the challenge :-) From abarnert at yahoo.com Mon Jan 11 15:22:55 2016 From: abarnert at yahoo.com (Andrew Barnert) Date: Mon, 11 Jan 2016 12:22:55 -0800 Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to support Python 2.7 In-Reply-To: References: <5691018E.4090006@egenix.com> Message-ID: <20821836-523F-4224-BC6E-2A813070A5CA@yahoo.com> On Jan 11, 2016, at 10:42, Gregory P. Smith wrote: > > The goal of the # type: comments as described is to have this information for offline analysis of code, not to make it available at run time. Yes, a decorator syntax could be adopted if anyone needs that. I don't expect anyone does. Decorators and attributes would add run time CPU and memory overhead whether the information was going to be used at runtime or not (likely not; nobody is likely to deploy code that looks at __annotations__). These same arguments were made against PEP 484 in the first place, and (I think rightly) dismissed. 3.x code with annotations incurs a memory overhead, even though most runtime code is never going to use them. That was considered to be acceptable. So why isn't it acceptable for the same code before it's ported to 3.x?
Or, conversely, if it isn't acceptable in 2.x, why isn't it a serious blocking regression that, once the port is completed and you're running under 3.x, you're now wasting memory for those useless annotations? Meanwhile, when _are_ annotations useful at runtime? Mostly during the kind of debugging that you'll be doing during something like a port from 2.x to 3.x. While you're still, by necessity, running under 2.x. If they're not useful there, it's hard to imagine why they'd be useful after the port is done, when you're deploying your 3.x code. So it seems like using decorators (or backporting the syntax, as Google has done) had better be acceptable for 2.7, or the PEP 484 design has a serious problem, and in a few months we're going to see Dropbox and Google and everyone else demanding a way to use type hinting without wasting memory on annotations at runtime in 3.x. -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Mon Jan 11 15:41:48 2016 From: guido at python.org (Guido van Rossum) Date: Mon, 11 Jan 2016 12:41:48 -0800 Subject: [Python-ideas] find-like functionality in pathlib In-Reply-To: References: <357278012.1941601.1450821295123.JavaMail.yahoo@mail.yahoo.com> <8348957817041496645@unknownmsgid> <5F6A858FD00E5F4A82E3206D2D854EF892D48AA2@EXMB09.ohsu.edu> <1452096673.440928.484566490.207F90DE@webmail.messagingengine.com> <5F6A858FD00E5F4A82E3206D2D854EF892D48F92@EXMB09.ohsu.edu> Message-ID: On Mon, Jan 11, 2016 at 10:57 AM, Gregory P. Smith wrote: > On Wed, Jan 6, 2016 at 3:05 PM Brendan Moloney wrote: > >> It's important to keep in mind the main benefit of scandir is you don't >> have to do ANY stat call in many cases, because the directory listing >> provides some subset of this info. On Linux you can at least tell if a path >> is a file or directory. On Windows there is much more info provided by the >> directory listing. Avoiding subsequent stat calls is also nice, but not >> nearly as important due to OS level caching. >> > > +1 - this was one of the two primary motivations behind scandir. Anything > trying to reimplement a filesystem tree walker without using scandir is > going to have sub-standard performance. > > If we ever offer anything with "find like functionality" related to > pathlib, it *needs* to be based on scandir. Anything else would just be > repeating the convenient but untrue limiting assumptions of os.listdir: > That the contents of a directory can be loaded into memory and that we > don't mind re-querying the OS for stat information that it already gave us > but we threw away as part of reading the directory. > And we already have this in the form of pathlib's [r]glob() methods. There's a patch to the glob module in http://bugs.python.org/issue25596 and as soon as that's committed I hope that its author(s) will work on doing a similar patch for pathlib's [r]glob (tracking this in http://bugs.python.org/issue26032). -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed...
URL: From abarnert at yahoo.com Mon Jan 11 15:44:18 2016 From: abarnert at yahoo.com (Andrew Barnert) Date: Mon, 11 Jan 2016 12:44:18 -0800 Subject: [Python-ideas] More friendly access to chmod In-Reply-To: References: <98A19944-7566-454D-8FC0-3106EFB30560@yahoo.com> Message-ID: <1A67A49B-C9AC-435B-AA75-34C7D16FE39F@yahoo.com> On Jan 11, 2016, at 08:53, Chris Angelico wrote: > >> On Tue, Jan 12, 2016 at 3:49 AM, Andrew Barnert wrote: >>> On Jan 11, 2016, at 04:02, Ram Rachum wrote: >>> >>> I've chosen += and -=, despite the fact they're not set operations, because Python doesn't have __inand__. >> >> For a property that acts like a number, and presumably is implemented as a subclass of int, this seems like a horribly confusing idea. > > I would expect it NOT to be a subclass of int, actually - just that it > has __int__ (and maybe __index__) to convert it to one. If you read his proposal, he wants oct(path.chmod) to work. That doesn't work on types with __int__. Of course it does work on types with __index__, but that's because the whole point of __index__ is to allow your type to act like an actual int everywhere that Python expects an int, rather than just something coercible to int. The point of PEP 357 was to allow numpy.int64 to act as close to a subtype of int as possible without actually being a subtype. It would be very surprising for, say, IntEnum (which subclasses int), or numpy.int64 (which uses __index__), to offer an __add__ method that actually did an or instead of an add. It will be just as surprising here. And the fact that he wants to make it possible (in fact, _encouraged_) to directly assign an int to the property makes it even more confusing. For example, "p.chmod = q + 0o010" does one thing if q is an integer, and another thing if it's the chmod of another path object. (Of course he also wants to be able to assign a string, but that's not a problem; if you mix up str and int, you get a nice TypeError, as opposed to mixing up int and an int subclass or __index__-using class, where you silently get incorrect behavior.) From guido at python.org Mon Jan 11 16:22:11 2016 From: guido at python.org (Guido van Rossum) Date: Mon, 11 Jan 2016 13:22:11 -0800 Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to support Python 2.7 In-Reply-To: <46f6aedf-9871-4541-9263-36302c2f0b1d@googlegroups.com> References: <46f6aedf-9871-4541-9263-36302c2f0b1d@googlegroups.com> Message-ID: On Mon, Jan 11, 2016 at 10:10 AM, Matthias Kramm wrote: > On Friday, January 8, 2016 at 3:06:05 PM UTC-8, Guido van Rossum wrote: >> >> At Dropbox we're trying to be good citizens and we're working towards >> introducing gradual typing (PEP 484) into our Python code bases (several >> million lines of code). However, that code base is mostly still Python 2.7 >> and we believe that we should introduce gradual typing first and start >> working on conversion to Python 3 second (since having static types in the >> code can help a big refactoring like that). >> >> Since Python 2 doesn't support function annotations we've had to look for >> alternatives. We considered stub files, a magic codec, docstrings, and >> additional `# type:` comments. In the end we decided that `# type:` >> comments are the most robust approach. >> > > FWIW, we had the same problem at Google. (Almost) all our code is Python > 2. However, we went the route of backporting the type annotations grammar > from Python 3. We now run a custom Python 2 that knows about PEP 3107.
> Yeah, we looked into this but we use many 3rd party tools that would not know what to do with the new syntax, so that's why we went the route of adding support for these comments to mypy. > The primary reasons are aesthetic - PEP 484 syntax is already a bit hard > on the eyes (capitalized container names, square brackets, quoting, ...), > and squeezing it all into comments wouldn't have helped matters, and would > have hindered adoption. > Possibly. I haven't had any pushback about this from the Dropbox engineers who have seen this so far. > We're still happy with our decision of running a custom Python 2, but your > mileage might vary. It's certainly true that other tools (pylint etc.) need > to learn to not be confused by the "odd" Python 2 syntax. > We had some relevant experience with pyxl, and basically it wasn't good -- too many tools had to have custom support added or simply can't be used on files containing pyxl syntax. (https://github.com/dropbox/pyxl) > [1] I have a prototype of such a tool, implemented as a 2to3 fixer. It's >> a bit over 200 lines. It's not very interesting yet, since it sets the >> types of nearly all arguments to 'Any'. We're considering building a much >> more advanced version that tries to guess much better argument types using >> some form of whole-program analysis. I've heard that Facebook's Hack >> project got a lot of mileage out of such a tool. I don't know how to >> write it yet -- possibly we could use a variant of mypy's type inference >> engine, or alternatively we might be able to use something like Jedi ( >> https://github.com/davidhalter/jedi). >> > > pytype (http://github.com/google/pytype) already does (context sensitive, > path-sensitive) whole-program analysis, and we're working on making it > (more) PEP 484 compatible. We're also writing a (2to3 based) tool for > inserting the derived types back into the source code. Should we join > forces? > I would love to! Perhaps we can take this discussion off line? -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Mon Jan 11 16:38:53 2016 From: guido at python.org (Guido van Rossum) Date: Mon, 11 Jan 2016 13:38:53 -0800 Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to support Python 2.7 In-Reply-To: <20821836-523F-4224-BC6E-2A813070A5CA@yahoo.com> References: <5691018E.4090006@egenix.com> <20821836-523F-4224-BC6E-2A813070A5CA@yahoo.com> Message-ID: On Mon, Jan 11, 2016 at 12:22 PM, Andrew Barnert wrote: > On Jan 11, 2016, at 10:42, Gregory P. Smith wrote: > > The goal of the # type: comments as described is to have this information > for offline analysis of code, not to make it available at run time. Yes, a > decorator syntax could be adopted if anyone needs that. I don't expect > anyone does. Decorators and attributes would add run time CPU and memory > overhead whether the information was going to be used at runtime or not > (likely not; nobody is likely to *deploy* code that looks at > __annotations__). > > > These same arguments were made against PEP 484 in the first place, and (I > think rightly) dismissed. > The way I recall it the argument was made against using decorators for PEP > 484 and we rightly decided not to use decorators. > 3.x code with annotations incurs a memory overhead, even though most > runtime code is never going to use them. That was considered to be > acceptable. So why isn't it acceptable for the same code before it's ported > to 3.x?
Or, conversely, if it isn't acceptable in 2.x, why isn't it a > serious blocking regression that, once the port is completed and you're > running under 3.x, you're now wasting memory for those useless annotations? > I'm not objecting to the memory overhead of using decorators, but to the execution time (the extra function call). And the scope for the proposal is much smaller -- while PEP 484 is the first step on a long road towards integrating gradual (i.e. OPTIONAL) typing into Python, the proposal on the table today is only meant for annotating Python 2.7 code so we can get rid of it more quickly. > Meanwhile, when _are_ annotations useful at runtime? Mostly during the > kind of debugging that you'll be doing during something like a port from > 2.x to 3.x. While you're still, by necessity, running under 2.x. If they're > not useful there, it's hard to imagine why they'd be useful after the port > is done, when you're deploying your 3.x code. > I'm not sure how to respond to this -- I disagree with your prediction but I don't think either of us really has any hard data from experience yet. I am however going to be building the kind of experience that might eventually be used to decide this, over the next few years. The first step is going to introduce annotations into Python 2.7 code, and I know my internal customers well enough to know that convincing them that we should use decorators for annotations would be a much bigger battle than putting annotations in comments. Since I have many other battles to fight I would like this one to be as short as possible. > So it seems like using decorators (or backporting the syntax, as Google has > done) had better be acceptable for 2.7, or the PEP 484 design has a serious > problem, and in a few months we're going to see Dropbox and Google and > everyone else demanding a way to use type hinting without wasting memory on > annotations at runtime in 3.x. > Again, I disagree with your assessment but it's difficult to prove anything without hard data. One possible argument may be that Python 3 offers a large package of combined run-time advantages, with some cost that's hard to separate. However, for Python 2.7 there's either a run-time cost or there's no run-time cost -- there's no run-time benefit. And I don't want to have to calculate how many extra machines we'll need to provision in order to make up for the run-time cost. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From pavol.lisy at gmail.com Mon Jan 11 16:48:16 2016 From: pavol.lisy at gmail.com (Pavol Lisy) Date: Mon, 11 Jan 2016 22:48:16 +0100 Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to support Python 2.7 In-Reply-To: <20821836-523F-4224-BC6E-2A813070A5CA@yahoo.com> References: <5691018E.4090006@egenix.com> <20821836-523F-4224-BC6E-2A813070A5CA@yahoo.com> Message-ID: What about this? def embezzle(self, account: "PEP3107 annotation"): # type: (str) -> Any """Embezzle funds from account using fake receipts.""" --- And BTW in PEP484 text -> Functions with the @no_type_check decorator or with a # type: ignore comment should be treated as having no annotations. could probably be? -> Functions with the @no_type_check decorator or with a # type: ignore comment should be treated as having no type hints.
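As a rough illustration of the parsing burden mentioned earlier in the thread (every tool would need its own extractor for these comments), a naive sketch might look like this. A real checker would use the tokenize module instead, so that comments inside string literals are not matched, and would then parse the signature text itself:

    import re

    TYPE_COMMENT = re.compile(r'#\s*type:\s*(.+)$')

    def extract_type_comments(source):
        # Yield (line number, annotation text) for each "# type:" comment,
        # skipping the special "# type: ignore" marker.
        for lineno, line in enumerate(source.splitlines(), 1):
            match = TYPE_COMMENT.search(line)
            if match and match.group(1).strip() != 'ignore':
                yield lineno, match.group(1).strip()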
From mertz at gnosis.cx Mon Jan 11 16:49:50 2016 From: mertz at gnosis.cx (David Mertz) Date: Mon, 11 Jan 2016 13:49:50 -0800 Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to support Python 2.7 In-Reply-To: <5691018E.4090006@egenix.com> References: <5691018E.4090006@egenix.com> Message-ID: > > > An equivalent way to write this in Python 2 is the following: > > > > def embezzle(self, account, funds=1000000, *fake_receipts): > > # type: (str, int, *str) -> None > > """Embezzle funds from account using fake receipts.""" > > > > By using comments, the annotations would not be available at > runtime via an .__annotations__ attribute and every tool would > have to implement a parser for extracting them. > > Wouldn't it be better and more in line with standard Python > syntax to use decorators to define them ? > > @typehint("(str, int, *str) -> None") > def embezzle(self, account, funds=1000000, *fake_receipts): > """Embezzle funds from account using fake receipts.""" > > I really like MAL's variation much better. Being able to see .__annotations__ at runtime feels like an important feature that we'd give up with the purely comment style. -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th. -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Mon Jan 11 16:52:28 2016 From: guido at python.org (Guido van Rossum) Date: Mon, 11 Jan 2016 13:52:28 -0800 Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to support Python 2.7 In-Reply-To: References: <5691018E.4090006@egenix.com> <20821836-523F-4224-BC6E-2A813070A5CA@yahoo.com> Message-ID: On Mon, Jan 11, 2016 at 1:48 PM, Pavol Lisy wrote: > What about this? > > def embezzle(self, account: "PEP3107 annotation"): > # type: (str) -> Any > """Embezzle funds from account using fake receipts.""" > > I don't understand your proposal -- this is not valid Python 2.7 syntax so we cannot use it. > --- > > And BTW in PEP484 text -> > > Functions with the @no_type_check decorator or with a # type: ignore > comment should be treated as having no annotations. > > could be probably? -> > > Functions with the @no_type_check decorator or with a # type: ignore > comment should be treated as having no type hints. > In the context of the PEP the latter interpretation is already implied, so I don't think I need to update the text. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg at krypto.org Mon Jan 11 17:21:33 2016 From: greg at krypto.org (Gregory P. Smith) Date: Mon, 11 Jan 2016 22:21:33 +0000 Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to support Python 2.7 In-Reply-To: References: <5691018E.4090006@egenix.com> Message-ID: On Mon, Jan 11, 2016 at 1:50 PM David Mertz wrote: > > An equivalent way to write this in Python 2 is the following: >> > >> > def embezzle(self, account, funds=1000000, *fake_receipts): >> > # type: (str, int, *str) -> None >> > """Embezzle funds from account using fake receipts.""" >> > >> >> By using comments, the annotations would not be available at >> runtime via an .__annotations__ attribute and every tool would >> have to implement a parser for extracting them. 
>> >> Wouldn't it be better and more in line with standard Python >> syntax to use decorators to define them ? >> >> @typehint("(str, int, *str) -> None") >> def embezzle(self, account, funds=1000000, *fake_receipts): >> """Embezzle funds from account using fake receipts.""" >> >> > > I really like MAL's variation much better. Being able to see > .__annotations__ at runtime feels like an important feature that we'd give > up with the purely comment style. > I'd like people who demonstrate practical, important production uses for having .__annotations__ information available at runtime to champion that. Both Google and Dropbox are looking at it as only being meaningful in the offline code analysis context. Even our (Google's) modified 2.7 with annotation grammar backported is just that, grammar only, no .__annotations__ or even validation of names while parsing. It may as well be a # type: comment. We explicitly chose not to use decorators due to their resource usage side effects. 2.7.x itself understandably is... highly unlikely to be modified... to put it lightly. So a backport of ignored annotation syntax is a non-starter there. In that sense I think the # type: comments are fine and are pretty much what I've been expecting to see. The only other alternative not yet mentioned would be to put the information in the docstring. But that has yet other side effects and challenges. So the comments make a lot of sense to recommend for Python 2 within the PEP. .__annotations__ isn't something any Python 2 code has ever had in the past. It can continue to live without it. I do not believe we need to formally recommend a decorator and its implementation in the PEP. (read another way: I do not expect Guido to do that work... but anyone is free to propose it and see if anyone else wants to adopt it) -gps -------------- next part -------------- An HTML attachment was scrubbed... URL: From mal at egenix.com Mon Jan 11 17:38:40 2016 From: mal at egenix.com (M.-A. Lemburg) Date: Mon, 11 Jan 2016 23:38:40 +0100 Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to support Python 2.7 In-Reply-To: References: <5691018E.4090006@egenix.com> <20821836-523F-4224-BC6E-2A813070A5CA@yahoo.com> Message-ID: <56942EF0.2040308@egenix.com> On 11.01.2016 22:38, Guido van Rossum wrote: > On Mon, Jan 11, 2016 at 12:22 PM, Andrew Barnert wrote: > >> On Jan 11, 2016, at 10:42, Gregory P. Smith wrote: >> >> The goal of the # type: comments as described is to have this information >> for offline analysis of code, not to make it available at run time. Yes, a >> decorator syntax could be adopted if anyone needs that. I don't expect >> anyone does. Decorators and attributes would add run time CPU and memory >> overhead whether the information was going to be used at runtime or not >> (likely not; nobody is likely to *deploy* code that looks at >> __annotations__). >> >> >> These same arguments were made against PEP 484 in the first place, and (I >> think rightly) dismissed. >> > > The way I recall it the argument was made against using decorators for PEP > 484 and we rightly decided not to use decorators. To clarify: My suggestion to use a simple decorator with essentially the same syntax as proposed for the "# type:" comments was meant as *additional* allowed syntax, not necessarily as the only one to standardize.
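For concreteness, a minimal sketch of such a decorator, runnable on both 2.7 and 3.x, could look like this (the attribute name is only an assumption here; a fuller version would parse the string and populate .__annotations__):

    def typehint(signature):
        # Attach the raw PEP 484 style signature string to the function;
        # tools can parse it offline or introspect it at runtime.
        def decorator(func):
            func.__typehint__ = signature
            return func
        return decorator

With this, the embezzle() example above works unchanged, at the cost of one extra function call per decorated function at import time.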
I'm a bit worried that by standardizing on using comments for these annotations only, we'll end up having people not use the type annotations because they simply don't like the style of having function bodies begin with comments instead of doc-strings. I certainly wouldn't want to clutter up my code like that. Tools parsing Python 2 source code may also have a problem with this (e.g. not recognize the doc-string anymore). This simply reads better, IMO: @typehint("(str, int, *str) -> None") def embezzle(self, account, funds=1000000, *fake_receipts): """Embezzle funds from account using fake receipts.""" and it has the advantage of allowing to have the decorator do additional things such as taking the annotations and writing out a type annotations file for Python 3 and other tools to use. We could also use a variant of the two proposals and additionally allow this syntax: #@typehint("(str, int, *str) -> None") def embezzle(self, account, funds=1000000, *fake_receipts): """Embezzle funds from account using fake receipts.""" to avoid memory and runtime overhead, if that's a problem. Moving from one to the other would then be a simple search&replace over the source code. Or we could have -O remove all those typehint decorator calls from the byte code to a similar effect. Code written for Python 2 & 3 will have to stick to the proposed syntax for quite a while, so we should try to find something that doesn't introduce a new syntax variant of how to specify additional function/method properties, because people are inevitably going to start using the same scheme for all sorts of other crazy stuff and this would make Python code look closer to Java than necessary, IMO: @public_interface @rest_accessible @map_exceptions_to_http_codes def embezzle(self, account, funds=1000000, *fake_receipts): # type: (str, int, *str) -> None # raises: ValueError, TypeError # optimize: jit, inline_globals # tracebacks: hide_locals # reviewed_by: xyz, abc """Embezzle funds from account using fake receipts.""" -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Experts (#1, Jan 11 2016) >>> Python Projects, Coaching and Consulting ... http://www.egenix.com/ >>> Python Database Interfaces ... http://products.egenix.com/ >>> Plone/Zope Database Interfaces ... http://zope.egenix.com/ ________________________________________________________________________ ::: We implement business ideas - efficiently in both time and costs ::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ http://www.malemburg.com/ From pavol.lisy at gmail.com Mon Jan 11 17:41:17 2016 From: pavol.lisy at gmail.com (Pavol Lisy) Date: Mon, 11 Jan 2016 23:41:17 +0100 Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to support Python 2.7 In-Reply-To: References: <5691018E.4090006@egenix.com> <20821836-523F-4224-BC6E-2A813070A5CA@yahoo.com> Message-ID: 2016-01-11 22:52 GMT+01:00, Guido van Rossum : > On Mon, Jan 11, 2016 at 1:48 PM, Pavol Lisy wrote: > >> What about this? >> >> def embezzle(self, account: "PEP3107 annotation"): >> # type: (str) -> Any >> """Embezzle funds from account using fake receipts.""" >> >> > > I don't understand your proposal -- this is not valid Python 2.7 syntax so > we cannot use it. I had two things in my mind: 1. suggest some possible impact in the future. 
While we are writing code compatible with Python 2 and Python 3, we will have type hint comments under Python 3 too. And because they are more compatible, there is a risk(?) that they could be more popular than the original PEP 484 (Python 3) proposal! 2. PEP 484 describes a possibility for how to support other uses of annotations and proposes to use # type: ignore but a similar method to preserve other uses of annotations could be (for example): # type: (str) -> Any and this could combine the goodness of type-hint tools and other types of annotations. At least in a deprecation period (if there will be any) for other annotation types. From victor.stinner at gmail.com Mon Jan 11 17:44:15 2016 From: victor.stinner at gmail.com (Victor Stinner) Date: Mon, 11 Jan 2016 23:44:15 +0100 Subject: [Python-ideas] RFC: PEP: Specialized functions with guards In-Reply-To: References: Message-ID: I discussed this PEP on the #pypy IRC channel. I will try to summarize the discussion with comments on the PEP directly. 2016-01-08 22:31 GMT+01:00 Victor Stinner : > Add an API to add specialized functions with guards to functions, to > support static optimizers respecting the Python semantic. "respecting the Python semantics" is not 100% exact. In fact, my FAT Python makes subtle changes to the "Python semantics". For example, loop unrolling can completely remove the call to the range() function. If a debugger is executed instruction per instruction, the output is different on an unrolled loop, since the range() call was removed, and the loop body is copied. I should maybe elaborate this point in the rationale, explain that a compromise must be found between the funny "in Python, everything is mutable" and performance. But remember that the whole thing (FAT Python, specialization, etc.) is developed outside CPython and is fully optional. > Changes > ======= > > * Add two new methods to functions: > > - ``specialize(code, guards: list)``: add specialized > function with guard. `code` is a code object (ex: > ``func2.__code__``) or any callable object (ex: ``len``). > The specialization can be ignored if a guard already fails. This method doesn't make sense at all in PyPy. The method is specific to CPython since it relies on guards which have a pure C API (see below). The PEP must be more explicit about that. IMHO it's perfectly fine that PyPy makes this method a no-op (the method exactly does nothing). It's already the case if a guard "always" fails in first_check(). > - ``get_specialized()``: get the list of specialized functions with > guards Again, it doesn't make sense for PyPy. Since this method is only used for unit tests, it can be converted to a function and put somewhere else, maybe in the _testcapi module. It's not a good idea to rely on this method in an application, it's really an implementation detail. > * Base ``Guard`` type In fact, exposing the type at the C level is enough. There is no need to expose it at Python level, since the type has no methods nor data, and it's not possible to use it in Python. We might expose it in a different module, again, maybe in _testcapi for unit tests. > * ``int check(PyObject *guard, PyObject **stack)``: return 1 on > success, 0 if the guard failed temporarily, -1 if the guard will > always fail I forgot "int na" and "int nk" parameters to support keyword arguments. Note for myself: I should add support for raising an exception.
> * ``int first_check(PyObject *guard, PyObject *func)``: return 0 on > success, -1 if the guard will always fail Note for myself: I should rename the method to "init()" and support raising an exception. > Behaviour > ========= > > When a function code is replaced (``func.__code__ = new_code``), all > specialized functions are removed. Moreover, the PEP must be clear about func.__code__ content: func.specialize() must *not* modify func.__code__. It should be a completely black box. Victor From greg at krypto.org Mon Jan 11 17:53:25 2016 From: greg at krypto.org (Gregory P. Smith) Date: Mon, 11 Jan 2016 22:53:25 +0000 Subject: [Python-ideas] RFC: PEP: Add dict.__version__ In-Reply-To: References: Message-ID: On Fri, Jan 8, 2016 at 11:44 PM Nick Coghlan wrote: > On 9 January 2016 at 16:03, Serhiy Storchaka wrote: > > On 08.01.16 23:27, Victor Stinner wrote: > >> > >> Add a new read-only ``__version__`` property to ``dict`` and > >> ``collections.UserDict`` types, incremented at each change. > > > > > > This may be not the best name for a property. Many modules already have > the > > __version__ attribute, this may cause confusion. > > The equivalent API for the global ABC object graph is > abc.get_cache_token: > https://docs.python.org/3/library/abc.html#abc.get_cache_token > > One of the reasons we chose that name is that even though it's a > number, the only operation with semantic significance is equality > testing, with the intended use case being cache invalidation when the > token changes value. > > If we followed the same reasoning for Victor's proposal, then a > suitable attribute name would be "__cache_token__". > +1 for consistency. For most imaginable uses, the actual value and type of the value don't matter, you just care if it is different than the value you recorded earlier. How the token/version gets mutated should be up to the implementation within defined parameters such as "the same value is never re-used twice for the lifetime of a process" (which pretty much guarantees some form of unsigned 64-bit counter increment - but an implementation could choose to use 256 bit random numbers for all we really care). Calling it __version__ implies numeric, but that isn't a requirement. We _really_ don't want someone to write code depending upon it being a number and expecting it to change in a given manner so that they do something conditional on math performed on that number rather than a simple == vs !=. -gps -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjreedy at udel.edu Mon Jan 11 17:56:11 2016 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 11 Jan 2016 17:56:11 -0500 Subject: [Python-ideas] RFC: PEP: Add dict.__version__ In-Reply-To: <5A41DC19-5652-4821-9FC5-521CB803D564@yahoo.com> References: <20160110042427.GT10854@ando.pearwood.info> <5A41DC19-5652-4821-9FC5-521CB803D564@yahoo.com> Message-ID: On 1/11/2016 1:48 AM, Andrew Barnert via Python-ideas wrote: > On Jan 10, 2016, at 22:04, Terry Reedy > wrote: >> >> On 1/10/2016 12:23 AM, Chris Angelico wrote: >> >> (in response to Steven's response to my post) >> >>> There's more to it than that. Yes, a dict maps values to values; >>> but the keys MUST be immutable >> >> Keys just have to be hashable; only hashes need to be immutable. > >> By default, hashes depend on ids, which are immutable for a >> particular object within a run. >> >> (otherwise hashing has problems), A '>' quote mark is missing here. This line is from Chris. >> only if the hash depends on values that mutate. Some do.
In other words, hashes should not depend on values that mutate. We all agree on that. > But We all three agree on the following. > if equality depends on values, the hash has to depend on those > same values. (Because two values that are equal have to hash equal.) > Which means that if equality depends on any mutable values, the type > can't be hashable. Which is why none of the built-in mutable types > are hashable. By default, object equality is based on ids. > But it's not _that_ much of an oversimplification to say that keys > have to be immutable. By default, an instance of a subclass of object is mutable, hashable (by id, making the hash immutable), and usable as a dict key. The majority of both builtin and user-defined classes follow this pattern and are quite usable as keys, contrary to the claim. Classes with immutable instances (tuples, numbers, strings, frozen sets, some extension classes, and user classes that take special measures) are exceptions. So are classes with mutable hashes (lists, sets, dicts, some extension classes, and user classes that override __eq__ and __hash__). -- Terry Jan Reedy From greg at krypto.org Mon Jan 11 17:57:14 2016 From: greg at krypto.org (Gregory P. Smith) Date: Mon, 11 Jan 2016 22:57:14 +0000 Subject: [Python-ideas] RFC: PEP: Add dict.__version__ In-Reply-To: <5690F869.2090704@egenix.com> References: <5690F869.2090704@egenix.com> Message-ID: On Sat, Jan 9, 2016 at 4:09 AM M.-A. Lemburg wrote: > On 09.01.2016 10:58, Victor Stinner wrote: > > 2016-01-09 9:57 GMT+01:00 Serhiy Storchaka : > >>>> This also can be used for better detecting dict mutating during > >>>> iterating: > >>>> https://bugs.python.org/issue19332. > >> (...) > >> > >> This makes Raymond's objections even more strong. > > > > Raymond has two major objections: memory footprint and performance. I > > opened an issue with a patch implementing dict__version__ and I ran > > pybench: > > https://bugs.python.org/issue26058#msg257810 > > > > pybench doesn't seem reliable: microbenchmarks on dict seems faster > > with the patch, it doesn't make sense. I expect worse or same > > performance. > > > > With my own timeit microbenchmarks, I don't see any slowdown with the > > patch. For an unknown reason (it's really strange), dict operations > > seem even faster with the patch. > > This can well be caused by a better memory alignment, which > depends on the CPU you're using. > > > For the memory footprint, it's clearly stated in the PEP that it adds > > 8 bytes per dict (4 bytes on 32-bit platforms). See the "dict subtype" > > section which explains why I proposed to modify directly the dict > > type. > > Some questions: > > * How would the implementation deal with wrap around of the > version number for fast changing dicts (esp. on 32-bit platforms) ? > > * Given that this is an optimization and not meant to be exact > science, why would we need 64 bits worth of version information ? > > AFAIK, you only need the version information to be able to > answer the question "did anything change compared to last time > I looked ?". > > For an optimization it's good enough to get an answer "yes" > for slow changing dicts and "no" for all other cases. False > negatives don't really hurt. False positives are not allowed. > > What you'd need to answer the question is a way for the > code in need of the information to remember the dict > state and then later compare it's remembered state > with the now current state of the dict. 
> > dicts could do this with a 16-bit index into an array > of state object slots which are set by the code tracking > the dict. > > When it's time to check, the code would simply ask for the > current index value and compare the state object in the > array with the one it had set. > Given it is for optimization only with the fallback slow path being to do an actual dict lookup, we could implement this using a single bit. Every modification sets the bit. There exists an API to clear the bit and to query the bit. Nothing else is needed. The bit could be stored creatively to avoid increasing the struct size, though ABI compatibility may prevent that... > > * Wouldn't it be possible to use the hash array itself to > store the state index ? > > We could store the state object as regular key in the > dict and filter this out when accessing the dict. > > Alternatively, we could try to use the free slots for > storing these state objects by e.g. declaring a free > slot as being NULL or a pointer to a state object. > > -- > Marc-Andre Lemburg > eGenix.com > > Professional Python Services directly from the Experts (#1, Jan 09 2016) > >>> Python Projects, Coaching and Consulting ... http://www.egenix.com/ > >>> Python Database Interfaces ... http://products.egenix.com/ > >>> Plone/Zope Database Interfaces ... http://zope.egenix.com/ > ________________________________________________________________________ > > ::: We implement business ideas - efficiently in both time and costs ::: > > eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 > D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg > Registered at Amtsgericht Duesseldorf: HRB 46611 > http://www.egenix.com/company/contact/ > http://www.malemburg.com/ > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From abarnert at yahoo.com Mon Jan 11 18:04:46 2016 From: abarnert at yahoo.com (Andrew Barnert) Date: Mon, 11 Jan 2016 15:04:46 -0800 Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to support Python 2.7 In-Reply-To: References: <5691018E.4090006@egenix.com> <20821836-523F-4224-BC6E-2A813070A5CA@yahoo.com> Message-ID: <17CDD12A-1203-4A27-B2DB-5A8A898C6279@yahoo.com> On Jan 11, 2016, at 13:38, Guido van Rossum wrote: > >> On Mon, Jan 11, 2016 at 12:22 PM, Andrew Barnert wrote: >>> On Jan 11, 2016, at 10:42, Gregory P. Smith wrote: >>> >>> The goal of the # type: comments as described is to have this information for offline analysis of code, not to make it available at run time. Yes, a decorator syntax could be adopted if anyone needs that. I don't expect anyone does. Decorators and attributes would add run time cpu and memory overhead whether the information was going to be used at runtime or not (likely not; nobody is likely to deploy code that looks at __annotations__). >> >> These same arguments were made against PEP 484 in the first place, and (I think rightly) dismissed. > > The way I recall it the argument was made against using decorators for PEP 484 and we rightly decided not to use decorators. Sure. But you also decided that the type information has to be there at runtime. Anyway, I don't buy GPS's argument, but I think I buy yours. Even if there are good reasons to have annotations at runtime, and they'd apply to debugging/introspecting/etc. 
code during a 2.7->3.6 port just as much as in new 3.6 work, I can see that they may not be worth _enough_ to justify the cost of extra runtime CPU (which can't be avoided in 2.7 the way it is in 3.6). And that, even if they were worth the cost, it may still not be worth trying to convince a team of that fact, especially without any hard information. >> 3.x code with annotations incurs a memory overhead, even though most > runtime code is never going to use them. That was considered to be > acceptable. So why isn't it acceptable for the same code before it's ported > to 3.x? Or, conversely, if it isn't acceptable in 2.x, why isn't it a > serious blocking regression that, once the port is completed and you're > running under 3.x, you're now wasting memory for those useless annotations? > > I'm not objecting to the memory overhead of using decorators, OK, but GPS was. And he was also arguing that having annotations at runtime is useless. Which is an argument that was made against PEP 484, and considered and rejected at the time. Your argument is different, and seems convincing to me, but I can't retroactively change my reply to his email. -------------- next part -------------- An HTML attachment was scrubbed... URL: From abarnert at yahoo.com Mon Jan 11 18:30:18 2016 From: abarnert at yahoo.com (Andrew Barnert) Date: Mon, 11 Jan 2016 15:30:18 -0800 Subject: [Python-ideas] RFC: PEP: Add dict.__version__ In-Reply-To: References: <20160110042427.GT10854@ando.pearwood.info> <5A41DC19-5652-4821-9FC5-521CB803D564@yahoo.com> Message-ID: <4886C5D1-555D-495F-B71D-3700AAE20DB3@yahoo.com> On Jan 11, 2016, at 14:56, Terry Reedy wrote: > > Classes with immutable instances (tuples, numbers, strings, frozen sets, some extension classes, and user classes that take special measures) are exceptions. So are classes with mutable hashes (lists, sets, dicts, some extension classes, and user classes that override __eq__ and __hash__). I don't understand your terminology here. What are "classes with mutable hashes"? Your examples of lists, sets, and dicts don't have mutable hashes; they have no hashes. If you write "hash([])", you get a TypeError("unhashable type: 'list'"). And well-behaved extension classes and user classes that override __eq__ and __hash__ provide immutable hashes and immutable equality to match, or they use __hash__=None if they need mutable equality. Python can't actually stop you from creating a class with mutable hashes, and even putting instances of such a class in a dict, but that dict won't actually work right. So, there's nothing for a version-guarded dict to worry about there. From victor.stinner at gmail.com Mon Jan 11 18:34:59 2016 From: victor.stinner at gmail.com (Victor Stinner) Date: Tue, 12 Jan 2016 00:34:59 +0100 Subject: [Python-ideas] RFC: PEP: Add dict.__version__ In-Reply-To: <5690F869.2090704@egenix.com> References: <5690F869.2090704@egenix.com> Message-ID: Marc-Andre Lemburg: >> * Given that this is an optimization and not meant to be exact >> science, why would we need 64 bits worth of version information ? >> >> AFAIK, you only need the version information to be able to >> answer the question "did anything change compared to last time >> I looked ?". >> (...) Gregory P. Smith : > Given it is for optimization only with the fallback slow path being to do an > actual dict lookup, we could implement this using a single bit. You misunderstood the purpose of the PEP.
The purpose is to implement fast guards by avoiding dict lookups in the common case (when watched keys are not modified) because dict lookups are fast, but still slower than reading a field of a C structure and an integer comparison. See the result of my microbenchmark: https://www.python.org/dev/peps/pep-0509/#implementation We are talking about nanoseconds. For the optimizations that I implemented in FAT Python, I bet that watched keys are rarely modified. But it's common to modify the watched namespaces. For example, a global namespace can be modified by the "lazy module import" pattern: "global module; if module is None: import module". Or a global variable can be a counter used to generate identifiers, counter modified regularly with "global counter; counter = counter + 1" which changes the dictionary version.
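To illustrate the intended pattern, here is a pure-Python model of such a guard (a sketch only -- the real guards are C structures, the class name is hypothetical, and ``__version__`` is the attribute proposed by the PEP):

    class GuardGlobal(object):
        # Fail only when the value bound to one watched key changes.
        def __init__(self, namespace, key):
            self.namespace = namespace
            self.key = key
            self.value = namespace[key]
            self.version = namespace.__version__

        def check(self):
            if self.namespace.__version__ == self.version:
                return True  # common case: one integer comparison, no lookup
            if self.namespace.get(self.key) is self.value:
                # The dict changed, but not the watched key: re-arm the guard.
                self.version = self.namespace.__version__
                return True
            return False  # watched key changed: run the original bytecode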
>> * Wouldn't it be possible to use the hash array itself to >> store the state index ? >> >> We could store the state object as regular key in the >> dict and filter this out when accessing the dict. >> >> Alternatively, we could try to use the free slots for >> storing these state objects by e.g. declaring a free >> slot as being NULL or a pointer to a state object. I'm sorry, I don't understand this idea. Victor From rosuav at gmail.com Mon Jan 11 19:12:20 2016 From: rosuav at gmail.com (Chris Angelico) Date: Tue, 12 Jan 2016 11:12:20 +1100 Subject: [Python-ideas] More friendly access to chmod In-Reply-To: <1A67A49B-C9AC-435B-AA75-34C7D16FE39F@yahoo.com> References: <98A19944-7566-454D-8FC0-3106EFB30560@yahoo.com> <1A67A49B-C9AC-435B-AA75-34C7D16FE39F@yahoo.com> Message-ID: On Tue, Jan 12, 2016 at 7:44 AM, Andrew Barnert wrote: > On Jan 11, 2016, at 08:53, Chris Angelico wrote: >> >>> On Tue, Jan 12, 2016 at 3:49 AM, Andrew Barnert wrote: >>>> On Jan 11, 2016, at 04:02, Ram Rachum wrote: >>>> >>>> I've chosen += and -=, despite the fact they're not set operations, because Python doesn't have __inand__. >>> >>> For a property that acts like a number, and presumably is implemented as a subclass of int, this seems like a horribly confusing idea. >> >> I would expect it NOT to be a subclass of int, actually - just that it >> has __int__ (and maybe __index__) to convert it to one. > > If you read his proposal, he wants oct(path.chmod) to work. That doesn't work on types with __int__. > > Of course it does work on types with __index__, but that's because the whole point of __index__ is to allow your type to act like an actual int everywhere that Python expects an int, rather than just something coercible to int. The point of PEP 357 was to allow numpy.int64 to act as close to a subtype of int as possible without actually being a subtype. This is what I get for not actually testing stuff. I thought having __int__ would work for oct. In that case, I would simply recommend dropping that part of the proposal; retrieving the octal representation can be spelled oct(int(x)), or maybe x.octal or x.octal(). This is NOT an integer; it's much closer to a set of bitwise flags. ChrisA From ncoghlan at gmail.com Mon Jan 11 19:47:57 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 12 Jan 2016 10:47:57 +1000 Subject: [Python-ideas] RFC: PEP: Specialized functions with guards In-Reply-To: References: Message-ID: On 12 January 2016 at 08:44, Victor Stinner wrote: > I discussed this PEP on the #pypy IRC channel. I will try to summarize > the discussion with comments on the PEP directly. > > 2016-01-08 22:31 GMT+01:00 Victor Stinner : >> Add an API to add specialized functions with guards to functions, to >> support static optimizers respecting the Python semantic. > > "respecting the Python semantics" is not 100% exact. In fact, my FAT > Python makes subtle changes to the "Python semantics". For example, > loop unrolling can completely remove the call to the range() function. If > a debugger is executed instruction per instruction, the output is > different on an unrolled loop, since the range() call was removed, and > the loop body is copied. I should maybe elaborate this point in the > rationale, explain that a compromise must be found between the funny > "in Python, everything is mutable" and performance. But remember that > the whole thing (FAT Python, specialization, etc.) is developed > outside CPython and is fully optional. > >> Changes >> ======= >> >> * Add two new methods to functions: >> >> - ``specialize(code, guards: list)``: add specialized >> function with guard. `code` is a code object (ex: >> ``func2.__code__``) or any callable object (ex: ``len``). >> The specialization can be ignored if a guard already fails. > > This method doesn't make sense at all in PyPy. The method is specific > to CPython since it relies on guards which have a pure C API (see > below). The PEP must be more explicit about that. IMHO it's perfectly > fine that PyPy makes this method a no-op (the method exactly does > nothing). It's already the case if a guard "always" fails in > first_check(). Perhaps the specialisation call should also move to being a pure C API, only exposed through _testcapi for testing purposes? That would move both this and the dict versioning PEP into the same territory as the dynamic memory allocator PEP: low level C plumbing that enables interesting CPython specific extensions (like tracemalloc, in the dynamic allocator case) without committing other implementations to emulating features that aren't useful to them in any way. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From tjreedy at udel.edu Mon Jan 11 20:09:12 2016 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 11 Jan 2016 20:09:12 -0500 Subject: [Python-ideas] RFC: PEP: Add dict.__version__ In-Reply-To: <4886C5D1-555D-495F-B71D-3700AAE20DB3@yahoo.com> References: <20160110042427.GT10854@ando.pearwood.info> <5A41DC19-5652-4821-9FC5-521CB803D564@yahoo.com> <4886C5D1-555D-495F-B71D-3700AAE20DB3@yahoo.com> Message-ID: On 1/11/2016 6:30 PM, Andrew Barnert via Python-ideas wrote: > On Jan 11, 2016, at 14:56, Terry Reedy > wrote: >> >> Classes with immutable instances (tuples, numbers, strings, frozen >> sets, some extension classes, and user classes that take special >> measures) are exceptions. So are classes with mutable hashes >> (lists, sets, dicts, some extension classes, and user classes that >> override __eq__ and __hash__). > > I don't understand your terminology here. Yes, the term, as a negation, is wrong. It should be 'classes that don't have immutable hashes'. The list is right, except that 'override' should really be 'disable'. Anyway, Victor changed the PEP and has moved on, so I will too.
-- 
Terry Jan Reedy

From tjreedy at udel.edu  Mon Jan 11 20:44:45 2016
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 11 Jan 2016 20:44:45 -0500
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
 support Python 2.7
In-Reply-To: <56942EF0.2040308@egenix.com>
References: <5691018E.4090006@egenix.com>
 <20821836-523F-4224-BC6E-2A813070A5CA@yahoo.com>
 <56942EF0.2040308@egenix.com>
Message-ID: 

On 1/11/2016 5:38 PM, M.-A. Lemburg wrote:
> To clarify: My suggestion to use a simple decorator with essentially
> the same syntax as proposed for the "# type:" comments was meant
> as *additional* allowed syntax, not necessarily as the only one
> to standardize.

Code with type comments will run on any standard 2.7 interpreter. Code with an @typehint decorator will have to either run on a nonstandard interpreter, import 'typehint' from somewhere other than the stdlib, define 'typehint' at the top of the file, or have the decorators stripped out before public distribution. To me, these options come close to making the decorator inappropriate as a core dev recommendation.

However, the new section of the PEP could have a short paragraph that mentions @typehint(typestring) as a possible alternative (with the options given above) and recommend that if a decorator is used, then the name should be 'typehint' (or something else agreed on) and that the typestring should be a quoted version of what would follow '# type: ' in a comment, 'as already defined above' (in the previous recommendation).

In other words, Guido's current addition has two recommendations:
1. the syntax for a typestring
2. the use of a typestring (append it to a '# type: ' comment)
If a decorator alternative uses the same syntax, a checker would need just one typestring parser. I think the conditional recommendation would be within the scope of what is appropriate for us to do.

> I'm a bit worried that by standardizing on using comments
> for these annotations only, we'll end up having people not
> use the type annotations because they simply don't like the
> style of having function bodies begin with comments instead
> of doc-strings. I certainly wouldn't want to clutter up my
> code like that. Tools parsing Python 2 source code may
> also have a problem with this (e.g. not recognizing the
> doc-string anymore).

I have to admit that I was not fully cognizant before that a comment could precede a docstring.

-- 
Terry Jan Reedy

From steve at pearwood.info  Mon Jan 11 20:39:11 2016
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 12 Jan 2016 12:39:11 +1100
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
 support Python 2.7
In-Reply-To: <20821836-523F-4224-BC6E-2A813070A5CA@yahoo.com>
References: <5691018E.4090006@egenix.com>
 <20821836-523F-4224-BC6E-2A813070A5CA@yahoo.com>
Message-ID: <20160112013911.GC10854@ando.pearwood.info>

On Mon, Jan 11, 2016 at 12:22:55PM -0800, Andrew Barnert via Python-ideas wrote:

> in a few months we're going to see
> Dropbox and Google and everyone else demanding a way to use type
> hinting without wasting memory on annotations at runtime in 3.x.

I would be happy to see a runtime switch similar to -O that drops annotations in 3.x, similar to how -OO drops docstrings.
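(Recapping the two spellings being compared in this thread: the '# type:' comment is the PEP 484 syntax, while 'typehint' is a hypothetical name with a stub definition here, not an existing stdlib API. Note the type comment preceding the docstring, as Terry observed:)

def gcd(a, b):
    # type: (int, int) -> int
    """Return the greatest common divisor of a and b."""
    while b:
        a, b = b, a % b
    return a

def typehint(typestring):   # stand-in for the hypothetical decorator
    return lambda func: func

@typehint("(int, int) -> int")
def gcd2(a, b):
    """Return the greatest common divisor of a and b."""
    while b:
        a, b = b, a % b
    return a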
-- 
Steve

From yselivanov.ml at gmail.com  Mon Jan 11 21:24:46 2016
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Mon, 11 Jan 2016 21:24:46 -0500
Subject: [Python-ideas] RFC: PEP: Specialized functions with guards
In-Reply-To: 
References: 
Message-ID: <569463EE.7070705@gmail.com>

On 2016-01-11 7:47 PM, Nick Coghlan wrote:
> Perhaps the specialisation call should also move to being a pure C
> API, only exposed through _testcapi for testing purposes?
>
> That would move both this and the dict versioning PEP into the same
> territory as the dynamic memory allocator PEP: low level C plumbing
> that enables interesting CPython specific extensions (like
> tracemalloc, in the dynamic allocator case) without committing other
> implementations to emulating features that aren't useful to them in
> any way.

+1. Exposing 'FunctionType.specialize()'-like APIs at the Python level feels very wrong to me.

Yury

From barry at python.org  Mon Jan 11 22:04:12 2016
From: barry at python.org (Barry Warsaw)
Date: Mon, 11 Jan 2016 22:04:12 -0500
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
References: 
Message-ID: <20160111220412.6f5a341a@anarchist.wooz.org>

On Jan 09, 2016, at 10:58 AM, Victor Stinner wrote:

>IMHO adding 8 bytes per dict is worth it.

I'm not so sure. There are already platforms where Python is unfeasible to use generally (e.g. some mobile devices), at least in part because of memory footprint. Dicts are used everywhere, so think about the kind of impact adding 8 bytes to every dict in an application running on such systems will have.

Cheers,
-Barry

From mahmoud at hatnote.com  Mon Jan 11 22:10:10 2016
From: mahmoud at hatnote.com (Mahmoud Hashemi)
Date: Mon, 11 Jan 2016 19:10:10 -0800
Subject: [Python-ideas] More friendly access to chmod
In-Reply-To: 
References: <98A19944-7566-454D-8FC0-3106EFB30560@yahoo.com>
 <1A67A49B-C9AC-435B-AA75-34C7D16FE39F@yahoo.com>
Message-ID: 

Seems like the committee has some designs after all? FilePerms is tested, on PyPI, and is even 2/3 compatible. And notice the lack of "chmod" as a noun. ;)

Mahmoud
github.com/mahmoud

On Mon, Jan 11, 2016 at 4:12 PM, Chris Angelico wrote:

> On Tue, Jan 12, 2016 at 7:44 AM, Andrew Barnert wrote:
> > On Jan 11, 2016, at 08:53, Chris Angelico wrote:
> >>
> >>> On Tue, Jan 12, 2016 at 3:49 AM, Andrew Barnert wrote:
> >>>> On Jan 11, 2016, at 04:02, Ram Rachum wrote:
> >>>>
> >>>> I've chosen += and -=, despite the fact they're not set operations,
> because Python doesn't have __inand__.
> >>>
> >>> For a property that acts like a number, and presumably is implemented
> as a subclass of int, this seems like a horribly confusing idea.
> >>
> >> I would expect it NOT to be a subclass of int, actually - just that it
> >> has __int__ (and maybe __index__) to convert it to one.
> >
> > If you read his proposal, he wants oct(path.chmod) to work. That doesn't
> work on types with __int__.
> >
> > Of course it does work on types with __index__, but that's because the
> whole point of __index__ is to allow your type to act like an actual int
> everywhere that Python expects an int, rather than just something coercible
> to int. The point of PEP
> > 357 was to allow numpy.int64 to act as close to a subtype of int as
> possible without actually being a subtype.
>
> This is what I get for not actually testing stuff.
> I thought having __int__ would work for oct. In that case, I would
> simply recommend dropping that part of the proposal; retrieving the
> octal representation can be spelled oct(int(x)), or maybe x.octal or
> x.octal(). This is NOT an integer; it's much closer to a set of
> bitwise flags.
>
> ChrisA

From ncoghlan at gmail.com  Mon Jan 11 22:37:29 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 12 Jan 2016 13:37:29 +1000
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <20160111220412.6f5a341a@anarchist.wooz.org>
References: 
 <20160111220412.6f5a341a@anarchist.wooz.org>
Message-ID: 

On 12 January 2016 at 13:04, Barry Warsaw wrote:
> On Jan 09, 2016, at 10:58 AM, Victor Stinner wrote:
>
>>IMHO adding 8 bytes per dict is worth it.
>
> I'm not so sure. There are already platforms where Python is unfeasible to
> generally use (e.g. some mobile devices) at least in part because of memory
> footprint. Dicts are used everywhere so think about the kind of impact adding
> 8 bytes to every dict in an application running on such systems will have.

This is another advantage of making this a CPython specific internal implementation detail - embedded focused variants like MicroPython won't need to implement it.

The question then becomes "Are we willing to let CPython cede high memory pressure environments to more specialised Python variants?", and I think the answer to that is "yes".

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From mike at selik.org  Mon Jan 11 22:57:37 2016
From: mike at selik.org (Michael Selik)
Date: Tue, 12 Jan 2016 03:57:37 +0000
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <20160111112011.GA10854@ando.pearwood.info>
References: <20160110033140.GS10854@ando.pearwood.info>
 <01b601d14bbb$82648b90$872da2b0$@gmail.com>
 <20160110175731.GX10854@ando.pearwood.info>
 <27E98AE5-CB3C-41AF-B31F-A73A84DC7F61@yahoo.com>
 <20160111112011.GA10854@ando.pearwood.info>
Message-ID: 

On Mon, Jan 11, 2016 at 6:20 AM Steven D'Aprano wrote:

> On Mon, Jan 11, 2016 at 05:18:59AM -0500, Neil Girdhar wrote:
>
> > Here is where I have to disagree. I hate it when experts say "we'll just
> > document it and then it's the user's fault for misusing it". Yeah, you're
> > right, but as a user, it is very frustrating to have to read other people's
> > documentation. You know that some elite Python programmer is going to
> > optimize his code using this and someone years later is going to scratch
> > his head wondering where __version__ is coming from. Is it provided by
> > the caller? Was it added to the object at some earlier point?
>
> Neil, don't you think you're being overly dramatic here? "Programmer
> needs to look up API feature, news at 11!" The same could be said about
> class.__name__, instance.__class__, obj.__doc__, module.__dict__ and
> indeed every single Python feature. Sufficiently inexperienced or naive
> programmers could be scratching their head over literally *anything*.
> All those words for such a simple, and minor, point: every new API
> feature is one more thing for programmers to learn. We get that.

I don't think Neil is being overly dramatic, nor is it a minor point. Simple, yes, but important.
If Python wants to maintain its enviable position as the majority language for intro computer science at top schools, it needs to stay an easily teachable language. The more junk showing up in ``dir()``, the harder it is to learn. When it's unclear what purpose a feature would have for an expert, why not err on the side of caution and keep the language as usable for a newbie as possible?

But the following is a good, strong argument:

> > Also, using this __version__ in source code is going to complicate
> > switching from CPython to any of the other Python implementations, so those
> > implementations will probably end up implementing it just to simplify
> > "porting", which would otherwise be painless.
> >
> > Why don't we leave exposing __version__ in Python to another PEP? Once
> > it's in the C API (as you proposed) you will be able to use it from Python
> > by writing an extension and then someone can demonstrate the value of
> > exposing it in Python by writing tests.
>
> I can't really argue against this. As much as I would love to play
> around with __version__, I think you're right. It needs to prove itself
> before being exposed as a public API.

From guido at python.org  Mon Jan 11 23:38:59 2016
From: guido at python.org (Guido van Rossum)
Date: Mon, 11 Jan 2016 20:38:59 -0800
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
 support Python 2.7
In-Reply-To: <20160112013911.GC10854@ando.pearwood.info>
References: <5691018E.4090006@egenix.com>
 <20821836-523F-4224-BC6E-2A813070A5CA@yahoo.com>
 <20160112013911.GC10854@ando.pearwood.info>
Message-ID: 

On Mon, Jan 11, 2016 at 5:39 PM, Steven D'Aprano wrote:

> On Mon, Jan 11, 2016 at 12:22:55PM -0800, Andrew Barnert via Python-ideas
> wrote:
>
> > in a few months we're going to see
> > Dropbox and Google and everyone else demanding a way to use type
> > hinting without wasting memory on annotations at runtime in 3.x.
>
> I would be happy to see a runtime switch similar to -O that drops
> annotations in 3.x, similar to how -OO drops docstrings.

Actually my experience with -OO (and even -O) suggests that that's not a great model (e.g. it can't work with libraries like PLY that inspect docstrings). A better model might be to let people select this on a per module basis. Though I could also see a future where __annotations__ is a more space-efficient data structure than dict.

Have you already run into a situation where __annotations__ takes up too much space?

-- 
--Guido van Rossum (python.org/~guido)

From victor.stinner at gmail.com  Tue Jan 12 04:58:37 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Tue, 12 Jan 2016 10:58:37 +0100
Subject: [Python-ideas] RFC: PEP: Specialized functions with guards
In-Reply-To: 
References: 
Message-ID: 

Hi,

2016-01-12 1:47 GMT+01:00 Nick Coghlan :
>> This method doesn't make sense at all in PyPy. The method is specific
>> to CPython since it relies on guards which have a pure C API (see
>> below). The PEP must be more explicit about that. IMHO it's perfectly
>> fine that PyPy makes this method a no-op (the method exactly does
>> nothing). It's already the case if a guard "always" fails in
>> first_check().
>
> Perhaps the specialisation call should also move to being a pure C
> API, only exposed through _testcapi for testing purposes?
>
> That would move both this and the dict versioning PEP into the same
> territory as the dynamic memory allocator PEP: low level C plumbing
> that enables interesting CPython specific extensions (like
> tracemalloc, in the dynamic allocator case) without committing other
> implementations to emulating features that aren't useful to them in
> any way.

I really like your idea :-) It solves many issues, and technically it's trivial to only add a C API and then expose it somewhere else at the Python level (for example in my "fat" module, or, as you said, in _testcapi for testing purposes).

Instead of adding func.specialize() and func.get_specialized() at the Python level, we can add *public* functions to the Python C API (excluded from the stable ABI):

/* Add a specialized function with guards. Result:
 * - return 1 on success
 * - return 0 if the specialization has been ignored
 * - raise an exception and return -1 on error */
PyAPI_FUNC(int) PyFunction_Specialize(PyObject *func,
                                      PyObject *func2, PyObject *guards);

/* Get the list of specialized functions as a list of
 * (func, guards) where func is a callable or code object and guards
 * is a list of PyFuncGuard (or subtypes) objects.
 * Raise an exception and return NULL on error. */
PyAPI_FUNC(PyObject*) PyFunction_GetSpecialized(PyObject *func);

/* Get the specialized function of a function. stack is an array of
 * PyObject* objects: indexed arguments followed by (key, value) pairs of
 * keyword arguments. na is the number of indexed arguments, nk is the
 * number of keyword arguments. stack contains na + nk * 2 objects.
 *
 * Return a callable or a code object on success.
 * Raise an exception and return NULL on error. */
PyAPI_FUNC(PyObject*) PyFunction_GetSpecializedFunc(PyObject *func,
                                                    PyObject **stack,
                                                    int na, int nk);

Again, other Python implementations which don't want to implement function specialization can implement these functions as no-ops (that's fine with the API):

* PyFunction_Specialize() just returns 0
* PyFunction_GetSpecialized() creates an empty list
* PyFunction_GetSpecializedFunc() returns the code object of the function (which is not something new)

Or they can not implement these functions at all, since it doesn't make sense for them.

--

At first, I tried hard to avoid the need for a module to specialize functions. My first API added a specialize() method to functions which took a list of dictionaries to describe guards. The problem with this API is that it exposes implementation details and makes it hard to extend guards (implement new guards). Now the AST optimizer injects "import fat" to optimize code when needed.

Hey, it's difficult to design a simple and obvious API!

Victor

From brett at python.org  Tue Jan 12 11:52:43 2016
From: brett at python.org (Brett Cannon)
Date: Tue, 12 Jan 2016 16:52:43 +0000
Subject: [Python-ideas] RFC: PEP: Specialized functions with guards
In-Reply-To: 
References: 
Message-ID: 

On Tue, 12 Jan 2016 at 01:59 Victor Stinner wrote:

> Hi,
>
> 2016-01-12 1:47 GMT+01:00 Nick Coghlan :
> >> This method doesn't make sense at all in PyPy. The method is specific
> >> to CPython since it relies on guards which have a pure C API (see
> >> below). The PEP must be more explicit about that. IMHO it's perfectly
> >> fine that PyPy makes this method a no-op (the method exactly does
> >> nothing). It's already the case if a guard "always" fails in
> >> first_check().
> >
> > Perhaps the specialisation call should also move to being a pure C
> > API, only exposed through _testcapi for testing purposes?
> >
> > That would move both this and the dict versioning PEP into the same
> > territory as the dynamic memory allocator PEP: low level C plumbing
> > that enables interesting CPython specific extensions (like
> > tracemalloc, in the dynamic allocator case) without committing other
> > implementations to emulating features that aren't useful to them in
> > any way.
>
> I really like your idea :-) It solves many issues, and technically it's
> trivial to only add a C API and then expose it somewhere else at the
> Python level (for example in my "fat" module, or, as you said, in
> _testcapi for testing purposes).
>
> Instead of adding func.specialize() and func.get_specialized() at the
> Python level, we can add *public* functions to the Python C API
> (excluded from the stable ABI):
>
> /* Add a specialized function with guards. Result:
>  * - return 1 on success
>  * - return 0 if the specialization has been ignored
>  * - raise an exception and return -1 on error */
> PyAPI_FUNC(int) PyFunction_Specialize(PyObject *func,
>                                       PyObject *func2, PyObject *guards);
>
> /* Get the list of specialized functions as a list of
>  * (func, guards) where func is a callable or code object and guards
>  * is a list of PyFuncGuard (or subtypes) objects.
>  * Raise an exception and return NULL on error. */
> PyAPI_FUNC(PyObject*) PyFunction_GetSpecialized(PyObject *func);
>
> /* Get the specialized function of a function. stack is an array of
>  * PyObject* objects: indexed arguments followed by (key, value) pairs of
>  * keyword arguments. na is the number of indexed arguments, nk is the
>  * number of keyword arguments. stack contains na + nk * 2 objects.
>  *
>  * Return a callable or a code object on success.
>  * Raise an exception and return NULL on error. */
> PyAPI_FUNC(PyObject*) PyFunction_GetSpecializedFunc(PyObject *func,
>                                                     PyObject **stack,
>                                                     int na, int nk);
>
> Again, other Python implementations which don't want to implement
> function specialization can implement these functions as no-ops (that's
> fine with the API):
>
> * PyFunction_Specialize() just returns 0
> * PyFunction_GetSpecialized() creates an empty list
> * PyFunction_GetSpecializedFunc() returns the code object of the
> function (which is not something new)
>
> Or they can not implement these functions at all, since it doesn't make
> sense for them.

This is somewhat similar to the JIT API we have been considering through our Pyjion work:

* PyJit_Init()
* PyJit_RegisterCodeObject()
* PyJit_CompileCodeObject()

If both ideas gain traction we may want to talk about whether there is some way to consolidate the APIs so we don't end up with a ton of different ways to optimize code objects.

From barry at python.org  Tue Jan 12 12:11:27 2016
From: barry at python.org (Barry Warsaw)
Date: Tue, 12 Jan 2016 12:11:27 -0500
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
References: 
 <20160111220412.6f5a341a@anarchist.wooz.org>
Message-ID: <20160112121127.4460c0fd@anarchist.wooz.org>

On Jan 12, 2016, at 01:37 PM, Nick Coghlan wrote:

>The question then becomes "Are we willing to let CPython cede high
>memory pressure environments to more specialised Python variants?",
>and I think the answer to that is "yes".

I'm not so willing to cede that space to alternative implementations, at least not yet. If this suite of ideas yields *significant* performance improvements, it might be a worthwhile trade-off.
But I'm not in favor of adding dict.__version__ in the hopes that we'll see that improvement; I think we need proof. That makes me think that 1) it should not be exposed to Python yet; 2) it should be conditionally compiled in, and not by default. This would allow experimentation without committing us to long-term maintenance or an across-the-board increase in memory pressure for speculative gains.

Cheers,
-Barry

From victor.stinner at gmail.com  Tue Jan 12 15:38:02 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Tue, 12 Jan 2016 21:38:02 +0100
Subject: [Python-ideas] RFC: PEP: Specialized functions with guards
In-Reply-To: 
References: 
Message-ID: 

2016-01-12 17:52 GMT+01:00 Brett Cannon :
> This is somewhat similar to the JIT API we have been considering through
> our Pyjion work:
>
> * PyJit_Init()
> * PyJit_RegisterCodeObject()
> * PyJit_CompileCodeObject()
>
> If both ideas gain traction we may want to talk about whether there is some
> way to consolidate the APIs so we don't end up with a ton of different ways
> to optimize code objects.

Since the proposed changes add many "public" symbols (prefixed with "Py_", but excluded from the stable ABI and only exposed at the C level), I chose to add "public" functions. Are you ok with that?

Victor

From leewangzhong+python at gmail.com  Tue Jan 12 18:26:07 2016
From: leewangzhong+python at gmail.com (Franklin? Lee)
Date: Tue, 12 Jan 2016 18:26:07 -0500
Subject: [Python-ideas] Fwd: Using functools.lru_cache only on some
 arguments of a function
In-Reply-To: 
References: 
Message-ID: 

On Sun, Jan 10, 2016 at 3:27 AM, Bill Winslow wrote:
> Sorry for the late reply everyone.
>
> I think relying on closures, while a solution, is messy. I'd still much
> prefer a way to tell lru_cache to merely ignore certain arguments.

Wait, why is it messy? The function is created inside the outer function, and never gets released to the outside. I think it's cleaner, because it encapsulates the recursive part, the memo, and the cached work. Besides, `lru_cache` is implemented using a closure, and your solution of passing a key function might be used with a closure on a nested function.

If you're solving dynamic programming puzzles, an outer/inner pairing of non-recursive/recursive represents the fact that your memoization can't be reused for different instances of the same problem. (See, for example, Edit Distance (https://web.stanford.edu/class/cs124/lec/med.pdf), in which your recursive parameters are indices into your non-recursive parameters; a sketch follows below.)

> I've further considered my original proposal, and rather than naming it
> "arg_filter", I realized that builtins like sorted(), min(), max(), etc. all
> already have the exact same thing -- a "key" argument which transforms the
> elements to the user's purpose. (In the sorted/min/max case, it's called on
> the elements of the argument rather than the argument itself, but it's still
> the same concept.) So basically, my original proposal, with renaming from
> arg_filter to key, is tantamount to extending the same functionality from
> sorted/min/max to lru_cache as well.

I like this conceptually, because the `key` parameter sort of lets you customize the cache dict (or whatever). You can use, for example, `str.lower` (though not directly).
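(To make the outer/inner pattern concrete, a minimal sketch for edit distance; the memo lives and dies with each top-level call:)

from functools import lru_cache

def edit_distance(s, t):
    @lru_cache(maxsize=None)
    def rec(i, j):  # the recursive params are indices into s and t
        if i == 0:
            return j
        if j == 0:
            return i
        cost = 0 if s[i-1] == t[j-1] else 1
        return min(rec(i-1, j) + 1,        # deletion
                   rec(i, j-1) + 1,        # insertion
                   rec(i-1, j-1) + cost)   # substitution
    return rec(len(s), len(t))

print(edit_distance("kitten", "sitting"))  # 3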
Note that the key parameter in those other functions allows you to call the key function only once per element, which is impossible for this.

Should it be possible to specify a tuple for `key` to transform each arg separately? In your case, you might pass in `(None, lambda x: 0)` to specify that the first parameter shouldn't be transformed, and the second parameter should be considered constant. But that's very confusing: should `None` mean "ignore", or "don't transform" (like `filter`)? Or we can use `False` for "ignore", perhaps.

On Sun, Jan 10, 2016 at 2:03 PM, Michael Selik wrote:
> Shouldn't the key function be called with ``key(*args, **kwargs)``?

Does `lru_cache` know how to deal with passing regular args as kwargs? Does it

From tjreedy at udel.edu  Tue Jan 12 18:33:10 2016
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 12 Jan 2016 18:33:10 -0500
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <20160112121127.4460c0fd@anarchist.wooz.org>
References: 
 <20160111220412.6f5a341a@anarchist.wooz.org>
 <20160112121127.4460c0fd@anarchist.wooz.org>
Message-ID: 

On 1/12/2016 12:11 PM, Barry Warsaw wrote:
> On Jan 12, 2016, at 01:37 PM, Nick Coghlan wrote:
>
>> The question then becomes "Are we willing to let CPython cede high
>> memory pressure environments to more specialised Python variants?",
>> and I think the answer to that is "yes".
>
> I'm not so willing to cede that space to alternative implementations, at least
> not yet. If this suite of ideas yields *significant* performance
> improvements, it might be a worthwhile trade-off. But I'm not in favor of
> adding dict.__version__ in the hopes that we'll see that improvement; I think
> we need proof.
>
> That makes me think that 1) it should not be exposed to Python yet; 2) it
> should be conditionally compiled in, and not by default. This would allow
> experimentation without committing us to long-term maintenance or an
> across-the-board increase in memory pressures for speculative gains.

New modules can be labelled 'provisional', whose meaning includes 'might be removed'. Can we do the same with new internal features?

-- 
Terry Jan Reedy

From victor.stinner at gmail.com  Tue Jan 12 19:03:58 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Wed, 13 Jan 2016 01:03:58 +0100
Subject: [Python-ideas] RFC: PEP: Specialized functions with guards
In-Reply-To: 
References: 
Message-ID: 

Thank you for the comments on the first version of PEP 510. I changed it so that it only changes the C API; there is no longer any change to the Python API. I just posted the second version of the PEP to python-dev. Please move the discussion there.

If you want to review other PEPs on python-ideas, I'm going to post a first version of my AST transformer PEP (PEP 511), stay tuned :-D (yeah, I love working on 3 PEPs at the same time!)

Victor

From khali119 at umn.edu  Tue Jan 12 20:11:55 2016
From: khali119 at umn.edu (Muhammad Ahmed Khalid)
Date: Tue, 12 Jan 2016 19:11:55 -0600
Subject: [Python-ideas] Password masking for getpass.getpass
Message-ID: 

Greetings,

I am working on a project and I am using getpass.getpass() to grab passwords from the user.

Some of the users wanted asterisks to be displayed when they were typing in the passwords for feedback, i.e. how many characters were typed and how many to backspace. Therefore I have created a solution, but I think the feature should come by default (and the programmer should have the option to use a blank or any other character).
The code lives here: https://goo.gl/OeY3QI

Please let me know about your thoughts on the issue. Also my apologies if this is the wrong mailing list. Please guide me in the right direction if that is the case.

Sincerely,
King Mak

From rosuav at gmail.com  Tue Jan 12 20:50:23 2016
From: rosuav at gmail.com (Chris Angelico)
Date: Wed, 13 Jan 2016 12:50:23 +1100
Subject: [Python-ideas] Password masking for getpass.getpass
In-Reply-To: 
References: 
Message-ID: 

On Wed, Jan 13, 2016 at 12:11 PM, Muhammad Ahmed Khalid wrote:
> Therefore I have created a solution, but I think the feature should come by
> default (and the programmer should have the option to use a blank or any
> other character).
>
> The code lives here: https://goo.gl/OeY3QI
>
> Please let me know about your thoughts on the issue. Also my apologies if
> this is the wrong mailing list. Please guide me in the right direction if
> that is the case.

First off, here's a direct link, bypassing the URL shortener.

https://code.activestate.com/recipes/579148-add-password-masking-ability-to-getpassgetpass/?in=user-4193393

The use of raw_input at the end suggests that you're planning this for Python 2.7, and not for 3.x. Does your code work (apart from that, which is insignificant) on Python 3? If not, this would be well worth considering; this mailing list is all about new features for the new versions of Python, which means 3.6 at the moment. Cool snippets of Py2 code might be useful as ActiveState recipes, or as PyPI modules, but they won't be added to the core language.

You've used the Windows-only msvcrt module. That means your snippet works only on Windows, which you acknowledge in a comment underneath it. I'm echoing that here on the list to make sure that's clear.

Have you checked PyPI (pypi.python.org) for similar code that already exists?

ChrisA

From steve at pearwood.info  Tue Jan 12 20:54:14 2016
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 13 Jan 2016 12:54:14 +1100
Subject: [Python-ideas] Password masking for getpass.getpass
In-Reply-To: 
References: 
Message-ID: <20160113015414.GF10854@ando.pearwood.info>

On Tue, Jan 12, 2016 at 07:11:55PM -0600, Muhammad Ahmed Khalid wrote:
> Greetings,
>
> I am working on a project and I am using getpass.getpass() to grab
> passwords from the user.
>
> Some of the users wanted asterisks to be displayed when they were typing in
> the passwords for feedback, i.e. how many characters were typed and how many
> to backspace.

I think that's an excellent idea. The old convention on Linux and Unix is to just suppress all feedback, but even on Linux GUI applications normally show bullets • or asterisks. Users who are less familiar with old-school Unix conventions have trouble with the standard password idiom of suppressing all feedback.

> Therefore I have created a solution, but I think the feature should come by
> default (and the programmer should have the option to use a blank or any
> other character).

I think that the default should remain as it is now, but I would support adding an extra argument for getpass() to set the feedback character. But it would need to support POSIX systems (Unix, Linux and Mac OS X) as well as Windows.
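(To illustrate, a minimal sketch of the POSIX side using termios; the 'feedback' parameter name is hypothetical, ASCII input is assumed, and a real patch would also need an msvcrt branch for Windows:)

import os
import sys
import termios

def getpass_feedback(prompt='Password: ', feedback='*'):
    # POSIX-only sketch of a getpass() variant with a feedback character.
    fd = sys.stdin.fileno()
    old = termios.tcgetattr(fd)
    new = termios.tcgetattr(fd)
    new[3] &= ~(termios.ECHO | termios.ICANON)  # lflag: no echo, no line mode
    new[6][termios.VMIN] = 1    # read() returns after a single byte
    new[6][termios.VTIME] = 0
    sys.stdout.write(prompt)
    sys.stdout.flush()
    chars = []
    termios.tcsetattr(fd, termios.TCSADRAIN, new)
    try:
        while True:
            ch = os.read(fd, 1).decode()     # one keypress at a time
            if not ch or ch in ('\r', '\n'): # EOF or Enter ends input
                break
            if ch in ('\x7f', '\b'):         # backspace: erase one character
                if chars:
                    chars.pop()
                    sys.stdout.write('\b \b')
            else:
                chars.append(ch)
                sys.stdout.write(feedback)
            sys.stdout.flush()
    finally:
        termios.tcsetattr(fd, termios.TCSADRAIN, old)
        sys.stdout.write('\n')
    return ''.join(chars)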
-- 
Steve

From mike at selik.org  Tue Jan 12 21:13:25 2016
From: mike at selik.org (Michael Selik)
Date: Wed, 13 Jan 2016 02:13:25 +0000
Subject: [Python-ideas] Fwd: Using functools.lru_cache only on some
 arguments of a function
In-Reply-To: 
References: 
Message-ID: 

On Tue, Jan 12, 2016 at 6:26 PM Franklin? Lee wrote:

> On Sun, Jan 10, 2016 at 3:27 AM, Bill Winslow wrote:
> > Sorry for the late reply everyone.
> >
> > I think relying on closures, while a solution, is messy. I'd still much
> > prefer a way to tell lru_cache to merely ignore certain arguments.
>
> Wait, why is it messy? The function is created inside the outer
> function, and never gets released to the outside. I think it's
> cleaner, because it's encapsulating the recursive part, the memo, and
> the cached work. Besides, `lru_cache` is implemented using a closure,
> and your solution of passing a key function might be used with a
> closure on a nested function.
>
> If you're solving dynamic programming puzzles, an outer/inner pairing
> of non-recursive/recursive represents the fact that your memoization
> can't be reused for different instances of the same problem. (See, for
> example, Edit Distance
> (https://web.stanford.edu/class/cs124/lec/med.pdf), in which your
> recursive parameters are indices into your non-recursive parameters.)
>
> > I've further considered my original proposal, and rather than naming it
> > "arg_filter", I realized that builtins like sorted(), min(), max(), etc all
> > already have the exact same thing -- a "key" argument which transforms the
> > elements to the user's purpose. (In the sorted/min/max case, it's called on
> > the elements of the argument rather than the argument itself, but it's still
> > the same concept.) So basically, my original proposal with renaming from
> > arg_filter to key, is tantamount to extending the same functionality from
> > sorted/min/max to lru_cache as well.
>
> I like this conceptually, because the `key` parameter sort of lets you
> customize the cache dict (or whatever). You can use, for example,
> `str.lower` (though not directly).
>
> Note that the key parameter in those other functions allows you to
> call the key function only once per element, which is impossible for
> this.
>
> Should it be possible to specify a tuple for `key` to transform each
> arg separately? In your case, you might pass in `(None, lambda x: 0)`
> to specify that the first parameter shouldn't be transformed, and the
> second parameter should be considered constant. But that's very
> confusing: should `None` mean "ignore", or "don't transform" (like
> `filter`)? Or we can use `False` for "ignore", perhaps.

I think his intention was to mimic the ``key`` argument of sorted, which expects a function that takes 1 and only 1 positional argument.

Perhaps it's best to see the exact use case and a few other examples, to get a better idea for the specifics, before implementing this feature?

On Sun, Jan 10, 2016 at 2:03 PM, Michael Selik wrote:
> > Shouldn't the key function be called with ``key(*args, **kwargs)``?
>
> Does `lru_cache` know how to deal with passing regular args as kwargs?

Now that you mention it, I realized it treats the two differently. ``def foo(x): pass`` would store ``foo(42)`` and ``foo(x=42)`` as different entries in the cache.
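(Easy to check; a quick sketch makes the two distinct cache entries visible:)

from functools import lru_cache

@lru_cache(maxsize=None)
def foo(x):
    return x

foo(42)
foo(x=42)
print(foo.cache_info())  # misses=2, currsize=2: the calls hit different keys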
From phd at phdru.name  Tue Jan 12 21:17:46 2016
From: phd at phdru.name (Oleg Broytman)
Date: Wed, 13 Jan 2016 03:17:46 +0100
Subject: [Python-ideas] Password masking for getpass.getpass
In-Reply-To: <20160113015414.GF10854@ando.pearwood.info>
References: <20160113015414.GF10854@ando.pearwood.info>
Message-ID: <20160113021746.GA26480@phdru.name>

Hi!

On Wed, Jan 13, 2016 at 12:54:14PM +1100, Steven D'Aprano wrote:
> The old convention on Linux and Unix is to just suppress all feedback,
> but even on Linux GUI applications normally show bullets • or asterisks.

Modern GUIs show the real character for a short period of time and then replace it with an asterisk.

Oleg.
-- 
Oleg Broytman            http://phdru.name/            phd at phdru.name
Programmers don't die, they just GOSUB without RETURN.

From steve at pearwood.info  Tue Jan 12 21:17:57 2016
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 13 Jan 2016 13:17:57 +1100
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
 support Python 2.7
In-Reply-To: 
References: <5691018E.4090006@egenix.com>
 <20821836-523F-4224-BC6E-2A813070A5CA@yahoo.com>
 <20160112013911.GC10854@ando.pearwood.info>
Message-ID: <20160113021757.GG10854@ando.pearwood.info>

On Mon, Jan 11, 2016 at 08:38:59PM -0800, Guido van Rossum wrote:
> On Mon, Jan 11, 2016 at 5:39 PM, Steven D'Aprano wrote:
>
> > On Mon, Jan 11, 2016 at 12:22:55PM -0800, Andrew Barnert via Python-ideas
> > wrote:
> >
> > > in a few months we're going to see
> > > Dropbox and Google and everyone else demanding a way to use type
> > > hinting without wasting memory on annotations at runtime in 3.x.
> >
> > I would be happy to see a runtime switch similar to -O that drops
> > annotations in 3.x, similar to how -OO drops docstrings.
>
> Actually my experience with -OO (and even -O) suggests that that's not a
> great model (e.g. it can't work with libraries like PLY that inspect
> docstrings). A better model might be to let people select this on a per
> module basis. Though I could also see a future where __annotations__ is a
> more space-efficient data structure than dict.
>
> Have you already run into a situation where __annotations__ takes up too
> much space?

Not as such, but it does seem an obvious and low-impact place to save some memory. Like doc strings, they're rarely used at runtime outside of the interactive interpreter. But your suggestion sounds more useful.

-- 
Steve

From rosuav at gmail.com  Tue Jan 12 21:22:02 2016
From: rosuav at gmail.com (Chris Angelico)
Date: Wed, 13 Jan 2016 13:22:02 +1100
Subject: [Python-ideas] Password masking for getpass.getpass
In-Reply-To: <20160113021746.GA26480@phdru.name>
References: <20160113015414.GF10854@ando.pearwood.info>
 <20160113021746.GA26480@phdru.name>
Message-ID: 

On Wed, Jan 13, 2016 at 1:17 PM, Oleg Broytman wrote:
> Hi!
>
> On Wed, Jan 13, 2016 at 12:54:14PM +1100, Steven D'Aprano wrote:
>> The old convention on Linux and Unix is to just suppress all feedback,
>> but even on Linux GUI applications normally show bullets • or asterisks.
>
> Modern GUIs show the real character for a short period of time and
> then replace it with an asterisk.

Ugh. I've only seen that on mobile devices, not on any desktop GUI, and I think it's a sop to the terrible keyboards they have. I hope this NEVER becomes a standard on full-sized computers with real keyboards.
ChrisA

From phd at phdru.name  Tue Jan 12 21:45:08 2016
From: phd at phdru.name (Oleg Broytman)
Date: Wed, 13 Jan 2016 03:45:08 +0100
Subject: [Python-ideas] Password masking for getpass.getpass
In-Reply-To: 
References: <20160113015414.GF10854@ando.pearwood.info>
 <20160113021746.GA26480@phdru.name>
Message-ID: <20160113024508.GA27407@phdru.name>

On Wed, Jan 13, 2016 at 01:22:02PM +1100, Chris Angelico wrote:
> On Wed, Jan 13, 2016 at 1:17 PM, Oleg Broytman wrote:
> > On Wed, Jan 13, 2016 at 12:54:14PM +1100, Steven D'Aprano wrote:
> >> The old convention on Linux and Unix is to just suppress all feedback,
> >> but even on Linux GUI applications normally show bullets • or asterisks.
> >
> > Modern GUIs show the real character for a short period of time and
> > then replace it with an asterisk.
>
> Ugh. I've only seen that on mobile devices, not on any desktop GUI,

On desktop (Windows) I saw a password entry with a checkbox to switch between real characters and asterisks.

> and I think it's a sop to the terrible keyboards they have. I hope
> this NEVER becomes a standard on full-sized computers with real
> keyboards.
>
> ChrisA

Oleg.
-- 
Oleg Broytman            http://phdru.name/            phd at phdru.name
Programmers don't die, they just GOSUB without RETURN.

From ethan at stoneleaf.us  Tue Jan 12 22:07:27 2016
From: ethan at stoneleaf.us (Ethan Furman)
Date: Tue, 12 Jan 2016 19:07:27 -0800
Subject: [Python-ideas] Password masking for getpass.getpass
In-Reply-To: <20160113024508.GA27407@phdru.name>
References: <20160113015414.GF10854@ando.pearwood.info>
 <20160113021746.GA26480@phdru.name>
 <20160113024508.GA27407@phdru.name>
Message-ID: <5695BF6F.6040705@stoneleaf.us>

On 01/12/2016 06:45 PM, Oleg Broytman wrote:
> On Wed, Jan 13, 2016 at 01:22:02PM +1100, Chris Angelico wrote:
>> On Wed, Jan 13, 2016 at 1:17 PM, Oleg Broytman wrote:
>>> On Wed, Jan 13, 2016 at 12:54:14PM +1100, Steven D'Aprano wrote:
>>>> The old convention on Linux and Unix is to just suppress all feedback,
>>>> but even on Linux GUI applications normally show bullets • or asterisks.
>>>
>>> Modern GUIs show the real character for a short period of time and
>>> then replace it with an asterisk.
>>
>> Ugh. I've only seen that on mobile devices, not on any desktop GUI,
>
> On desktop (Windows) I saw a password entry with a checkbox to switch
> between real characters and asterisks.

While that can be handy, it is not the same as displaying each character as it is typed and then covering it with something else. I agree with ChrisA and hope that never becomes the convention on non-mobile devices.

-- 
~Ethan~

From leewangzhong+python at gmail.com  Tue Jan 12 22:45:15 2016
From: leewangzhong+python at gmail.com (Franklin? Lee)
Date: Tue, 12 Jan 2016 22:45:15 -0500
Subject: [Python-ideas] Fwd: Using functools.lru_cache only on some
 arguments of a function
In-Reply-To: 
References: 
Message-ID: 

On Tue, Jan 12, 2016 at 9:13 PM, Michael Selik wrote:
> On Tue, Jan 12, 2016 at 6:26 PM Franklin? Lee wrote:
>> Should it be possible to specify a tuple for `key` to transform each
>> arg separately? In your case, you might pass in `(None, lambda x: 0)`
>> to specify that the first parameter shouldn't be transformed, and the
>> second parameter should be considered constant. But that's very
>> confusing: should `None` mean "ignore", or "don't transform" (like
>> `filter`)? Or we can use `False` for "ignore", perhaps.
>
> I think his intention was to mimic the ``key`` argument of sorted, which
> expects a function that takes 1 and only 1 positional argument.
I know, but those three functions expect a sequence of single things, while `lru_cache` expects several things. (Not exactly a tuple, because that would be a single thing, but rather a collection of things.)

> Perhaps it's best to see the exact use case and a few other examples, to get
> a better idea for the specifics, before implementing this feature?
>
>> On Sun, Jan 10, 2016 at 2:03 PM, Michael Selik wrote:
>> > Shouldn't the key function be called with ``key(*args, **kwargs)``?
>>
>> Does `lru_cache` know how to deal with passing regular args as kwargs?
>
> Now that you mention it, I realized it treats the two differently.
> ``def foo(x): pass`` would store ``foo(42)`` and ``foo(x=42)`` as different
> entries in the cache.

I feel like, at least ideally, there should be a way for `update_wrapper`/`wraps` to unpack named arguments, so that wrappers can truly reflect the params of the functions they wrap. (For example, when inspecting a function, or by decoding keyword-or-positional args to their place in `*args`.) It should also be possible to add or remove args, though I'm not sure how useful that will be. (Also ideally, a wrapper function would "pass up" its default args to the wrapper.)

From khali119 at umn.edu  Tue Jan 12 23:31:38 2016
From: khali119 at umn.edu (Muhammad Ahmed Khalid)
Date: Tue, 12 Jan 2016 22:31:38 -0600
Subject: [Python-ideas] Password masking for getpass.getpass
In-Reply-To: <5695BF6F.6040705@stoneleaf.us>
References: <20160113015414.GF10854@ando.pearwood.info>
 <20160113021746.GA26480@phdru.name>
 <20160113024508.GA27407@phdru.name>
 <5695BF6F.6040705@stoneleaf.us>
Message-ID: 

I've gotten some ideas from people's emails and I think it is worth investing more time in this feature. I will work to make the code platform independent and Python 3 compatible. The standard library code for getpass.getpass() actually does use msvcrt for the Windows platform, so I think I'll keep my code like that, but I'll add another function supporting Unix.

Considering the mobile device issue: there can always be options. The developers can choose whether or not to implement the feature, and can even let the users decide if they want to use it. This is exactly what I am aiming for with the desktop version too. The ability to choose.

~ KingMak

On Tue, Jan 12, 2016 at 9:07 PM, Ethan Furman wrote:

> On 01/12/2016 06:45 PM, Oleg Broytman wrote:
>> On Wed, Jan 13, 2016 at 01:22:02PM +1100, Chris Angelico wrote:
>>> On Wed, Jan 13, 2016 at 1:17 PM, Oleg Broytman wrote:
>>>> On Wed, Jan 13, 2016 at 12:54:14PM +1100, Steven D'Aprano wrote:
>>>>> The old convention on Linux and Unix is to just suppress all feedback,
>>>>> but even on Linux GUI applications normally show bullets • or
>>>>> asterisks.
>>>>
>>>> Modern GUIs show the real character for a short period of time and
>>>> then replace it with an asterisk.
>>>
>>> Ugh. I've only seen that on mobile devices, not on any desktop GUI,
>>
>> On desktop (Windows) I saw a password entry with a checkbox to switch
>> between real characters and asterisks.
>
> While that can be handy, it is not the same as displaying each character
> as it is typed and then covering it with something else. I agree with
> ChrisA and hope that never becomes the convention on non-mobile devices.
>
> --
> ~Ethan~

From steve at pearwood.info  Wed Jan 13 05:04:43 2016
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 13 Jan 2016 21:04:43 +1100
Subject: [Python-ideas] Password masking for getpass.getpass
In-Reply-To: 
References: <20160113015414.GF10854@ando.pearwood.info>
 <20160113021746.GA26480@phdru.name>
Message-ID: <20160113100442.GI10854@ando.pearwood.info>

On Wed, Jan 13, 2016 at 01:22:02PM +1100, Chris Angelico wrote:
> On Wed, Jan 13, 2016 at 1:17 PM, Oleg Broytman wrote:
> > Hi!
> >
> > On Wed, Jan 13, 2016 at 12:54:14PM +1100, Steven D'Aprano wrote:
> >> The old convention on Linux and Unix is to just suppress all feedback,
> >> but even on Linux GUI applications normally show bullets • or asterisks.
> >
> > Modern GUIs show the real character for a short period of time and
> > then replace it with an asterisk.
>
> Ugh. I've only seen that on mobile devices, not on any desktop GUI,
> and I think it's a sop to the terrible keyboards they have. I hope
> this NEVER becomes a standard on full-sized computers with real
> keyboards.

I don't know... I'm about 35% convinced that obfuscating the password is just security theatre. I'm not sure that "shoulder surfing" of passwords is a significant threat.

But the other 65% tells me that we should continue to obfuscate.

-- 
Steve

From rosuav at gmail.com  Wed Jan 13 05:19:41 2016
From: rosuav at gmail.com (Chris Angelico)
Date: Wed, 13 Jan 2016 21:19:41 +1100
Subject: [Python-ideas] Password masking for getpass.getpass
In-Reply-To: <20160113100442.GI10854@ando.pearwood.info>
References: <20160113015414.GF10854@ando.pearwood.info>
 <20160113021746.GA26480@phdru.name>
 <20160113100442.GI10854@ando.pearwood.info>
Message-ID: 

On Wed, Jan 13, 2016 at 9:04 PM, Steven D'Aprano wrote:
> On Wed, Jan 13, 2016 at 01:22:02PM +1100, Chris Angelico wrote:
>> On Wed, Jan 13, 2016 at 1:17 PM, Oleg Broytman wrote:
>> > Hi!
>> >
>> > On Wed, Jan 13, 2016 at 12:54:14PM +1100, Steven D'Aprano wrote:
>> >> The old convention on Linux and Unix is to just suppress all feedback,
>> >> but even on Linux GUI applications normally show bullets • or asterisks.
>> >
>> > Modern GUIs show the real character for a short period of time and
>> > then replace it with an asterisk.
>>
>> Ugh. I've only seen that on mobile devices, not on any desktop GUI,
>> and I think it's a sop to the terrible keyboards they have. I hope
>> this NEVER becomes a standard on full-sized computers with real
>> keyboards.
>
> I don't know... I'm about 35% convinced that obfuscating the password is
> just security theatre. I'm not sure that "shoulder surfing" of passwords
> is a significant threat.
>
> But the other 65% tells me that we should continue to obfuscate.

In some situations it's absolutely appropriate to not hide the password at all. (A lot of routers let me type in a wifi password unobscured, for instance.) But if you're doing that, then just keep the whole password visible, same as if you're asking for a user name. Don't show the one last-typed character and then hide it.

You're quite probably right that obfuscating the display is security theatre; but it's the security theatre that people are expecting.
If you're about to enter your credit card details into a web form, does it really matter whether or not the form itself was downloaded over an encrypted link? But people are used to "look for the padlock", which means that NOT having the padlock will bother people. If you ask for a password and it gets displayed, people will wonder if they're entering it in the right place.

That said, though, I honestly don't think there's much value in seeing the length of a password by the number of asterisks. Have you ever looked at them and realized that you missed out a letter? But again, they're what people expect...

ChrisA

From p.f.moore at gmail.com  Wed Jan 13 05:26:08 2016
From: p.f.moore at gmail.com (Paul Moore)
Date: Wed, 13 Jan 2016 10:26:08 +0000
Subject: [Python-ideas] Password masking for getpass.getpass
In-Reply-To: 
References: <20160113015414.GF10854@ando.pearwood.info>
 <20160113021746.GA26480@phdru.name>
 <20160113100442.GI10854@ando.pearwood.info>
Message-ID: 

On 13 January 2016 at 10:19, Chris Angelico wrote:
> That said, though, I honestly don't think there's much value in seeing
> the length of a password by the number of asterisks. Have you ever
> looked at them and realized that you missed out a letter? But again,
> they're what people expect...

Personally, I frequently look at the line of asterisks and think "that doesn't look right" - it helps me catch typos. Also, doing things like deleting everything but the first N characters lets me retype from a "known good" point. I tend to get uncomfortable when I get no feedback at all, as is typical on Unix systems.

But yes, it's about expectations, and it depends what type of system you typically work with. Although many, many people are used to seeing feedback asterisks or similar, as that's the norm on Windows and in many (most?) web applications.

Paul

From mal at egenix.com  Wed Jan 13 05:36:07 2016
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 13 Jan 2016 11:36:07 +0100
Subject: [Python-ideas] Password masking for getpass.getpass
In-Reply-To: <5695BF6F.6040705@stoneleaf.us>
References: <20160113015414.GF10854@ando.pearwood.info>
 <20160113021746.GA26480@phdru.name>
 <20160113024508.GA27407@phdru.name>
 <5695BF6F.6040705@stoneleaf.us>
Message-ID: <56962897.5010102@egenix.com>

On 13.01.2016 04:07, Ethan Furman wrote:
> On 01/12/2016 06:45 PM, Oleg Broytman wrote:
>> On Wed, Jan 13, 2016 at 01:22:02PM +1100, Chris Angelico wrote:
>>> On Wed, Jan 13, 2016 at 1:17 PM, Oleg Broytman wrote:
>>>> On Wed, Jan 13, 2016 at 12:54:14PM +1100, Steven D'Aprano wrote:
>
>>>>> The old convention on Linux and Unix is to just suppress all feedback,
>>>>> but even on Linux GUI applications normally show bullets • or asterisks.
>>>>
>>>> Modern GUIs show the real character for a short period of time and
>>>> then replace it with an asterisk.
>>>
>>> Ugh. I've only seen that on mobile devices, not on any desktop GUI,
>>
>> On desktop (Windows) I saw a password entry with a checkbox to switch
>> between real characters and asterisks.
>
> While that can be handy, it is not the same as displaying each character as it is typed and then
> covering it with something else. I agree with ChrisA and hope that never becomes the convention on
> non-mobile devices.
At least in Windows GUIs, the password field only provides a very thin layer to obfuscate the underlying password text:

http://www.nirsoft.net/utils/bullets_password_view.html

More secure systems always show 8 bullets regardless of how many characters the password actually has, and only provide limited feedback when hitting a key, without allowing you to see the number of chars in the password.

Not showing anything is certainly more secure than any other method of providing user feedback, so I agree that we should not make this the default.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Experts (#1, Jan 13 2016)
>>> Python Projects, Coaching and Consulting ...  http://www.egenix.com/
>>> Python Database Interfaces ...           http://products.egenix.com/
>>> Plone/Zope Database Interfaces ...           http://zope.egenix.com/
________________________________________________________________________

::: We implement business ideas - efficiently in both time and costs :::

   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/
                      http://www.malemburg.com/

From jonathan at slenders.be  Wed Jan 13 06:00:06 2016
From: jonathan at slenders.be (Jonathan Slenders)
Date: Wed, 13 Jan 2016 12:00:06 +0100
Subject: [Python-ideas] Password masking for getpass.getpass
In-Reply-To: <56962897.5010102@egenix.com>
References: <20160113015414.GF10854@ando.pearwood.info>
 <20160113021746.GA26480@phdru.name>
 <20160113024508.GA27407@phdru.name>
 <5695BF6F.6040705@stoneleaf.us>
 <56962897.5010102@egenix.com>
Message-ID: 

FYI: prompt_toolkit can prompt for password input:

https://github.com/jonathanslenders/python-prompt-toolkit/blob/master/examples/get-password.py
https://github.com/jonathanslenders/python-prompt-toolkit/blob/master/examples/get-password-with-toggle-display-shortcut.py

It displays as asterisks and keeps all readline-like navigation. The second is an example of password input where Ctrl-T toggles between asterisks and plain text.

Feedback is welcome (create an issue), but this probably will never become part of core Python.

Jonathan

2016-01-13 11:36 GMT+01:00 M.-A. Lemburg :

> On 13.01.2016 04:07, Ethan Furman wrote:
> > On 01/12/2016 06:45 PM, Oleg Broytman wrote:
> >> On Wed, Jan 13, 2016 at 01:22:02PM +1100, Chris Angelico wrote:
> >>> On Wed, Jan 13, 2016 at 1:17 PM, Oleg Broytman wrote:
> >>>> On Wed, Jan 13, 2016 at 12:54:14PM +1100, Steven D'Aprano wrote:
> >
> >>>>> The old convention on Linux and Unix is to just suppress all feedback,
> >>>>> but even on Linux GUI applications normally show bullets • or asterisks.
> >>>>
> >>>> Modern GUIs show the real character for a short period of time and
> >>>> then replace it with an asterisk.
> >>>
> >>> Ugh. I've only seen that on mobile devices, not on any desktop GUI,
> >>
> >> On desktop (Windows) I saw a password entry with a checkbox to switch
> >> between real characters and asterisks.
> >
> > While that can be handy, it is not the same as displaying each character
> as it is typed and then
> > covering it with something else. I agree with ChrisA and hope that
> never becomes the convention on
> > non-mobile devices.
> At least in Windows GUIs, the password field only provides a
> very thin layer to obfuscate the underlying password text:
>
> http://www.nirsoft.net/utils/bullets_password_view.html
>
> More secure systems always show 8 bullets regardless of how
> many characters the password actually has and only provide
> limited feedback when hitting a key without allowing to
> see the number of chars in the password.
>
> Not showing anything is certainly more secure than any other
> method of providing user feedback, so I agree that we should
> not make this the default.
>
> --
> Marc-Andre Lemburg
> eGenix.com

From python-ideas at mgmiller.net  Wed Jan 13 12:56:44 2016
From: python-ideas at mgmiller.net (Mike Miller)
Date: Wed, 13 Jan 2016 09:56:44 -0800
Subject: [Python-ideas] Password masking for getpass.getpass
In-Reply-To: <20160113100442.GI10854@ando.pearwood.info>
References: <20160113015414.GF10854@ando.pearwood.info>
 <20160113021746.GA26480@phdru.name>
 <20160113100442.GI10854@ando.pearwood.info>
Message-ID: <56968FDC.2000701@mgmiller.net>

As in everything, it depends on the situation:

https://www.schneier.com/blog/archives/2009/07/the_pros_and_co.html

The Security Now podcast has also expressed doubt on the practice in common cases.

My take: a few flags to control the behavior, with convenient defaults, perhaps show_text=True, display_char=None, display_delay=0, plus a Ctrl-T keybinding to toggle (as mentioned elsewhere). A good case could also be made for the most secure defaults instead. As long as the toggle keybinding were available it wouldn't be a great burden.

This is a console-only solution, correct? So, Ctrl/Alt keys should be available.

-Mike

On 2016-01-13 02:04, Steven D'Aprano wrote:
> I don't know... I'm about 35% convinced that obfuscating the password is
> just security theatre. I'm not sure that "shoulder surfing" of passwords
> is a significant threat.
>
> But the other 65% tells me that we should continue to obfuscate.

From tjreedy at udel.edu  Wed Jan 13 19:29:40 2016
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 13 Jan 2016 19:29:40 -0500
Subject: [Python-ideas] Password masking for getpass.getpass
In-Reply-To: 
References: 
Message-ID: 

On 1/12/2016 8:11 PM, Muhammad Ahmed Khalid wrote:
> Greetings,
>
> I am working on a project and I am using getpass.getpass() to grab
> passwords from the user.
>
> Some of the users wanted asterisks to be displayed when they were typing
> in the passwords for feedback i.e. how many characters were typed and
> how many to backspace. ...
> Please let me know about your thoughts on the issue.

You are debating the wrong issue.  I work at home.  I HATE Password
Masking Security Theatre.  Since I cannot reliably type 10 random hidden
characters (or so sites tell me), it causes me endless grief for
0.00000% gain.  If any of my passwords is stolen, it will, with
probability 1.0 - epsilon, be part of one of the hacks that steal
millions at a time from corporate sites.  Epsilon would be something
other than a stranger looking over my shoulder.

http://www.zdnet.com/article/we-need-to-stop-masking-passwords/

PS: When UNIX decided to give no feedback, most people had one short,
easy-to-remember, easy-to-type password.  Not a hundred that are hard to
remember and type.

--- 
Terry Jan Reedy

From g.brandl at gmx.net  Thu Jan 14 04:50:09 2016
From: g.brandl at gmx.net (Georg Brandl)
Date: Thu, 14 Jan 2016 10:50:09 +0100
Subject: [Python-ideas] Password masking for getpass.getpass
In-Reply-To: <20160113100442.GI10854@ando.pearwood.info>
References: <20160113015414.GF10854@ando.pearwood.info>
 <20160113021746.GA26480@phdru.name> <20160113100442.GI10854@ando.pearwood.info>
Message-ID:

On 01/13/2016 11:04 AM, Steven D'Aprano wrote:
> On Wed, Jan 13, 2016 at 01:22:02PM +1100, Chris Angelico wrote:
>> On Wed, Jan 13, 2016 at 1:17 PM, Oleg Broytman wrote:
>> > Hi!
>> >
>> > On Wed, Jan 13, 2016 at 12:54:14PM +1100, Steven D'Aprano wrote:
>> >> The old convention on Linux and Unix is to just suppress all
>> >> feedback, but even on Linux GUI applications normally show
>> >> bullets ••• or asterisks.
>> >
>> > Modern GUIs show the real character for a short period of time and
>> > then replace it with an asterisk.
>>
>> Ugh. I've only seen that on mobile devices, not on any desktop GUI,
>> and I think it's a sop to the terrible keyboards they have. I hope
>> this NEVER becomes a standard on full-sized computers with real
>> keyboards.
>
> I don't know... I'm about 35% convinced that obfuscating the password is
> just security theatre. I'm not sure that "shoulder surfing" of passwords
> is a significant threat.

This might not apply for people working from home, but at work I
regularly enter my own password or passwords for other systems with
other people intentionally looking over my shoulder (e.g.
pair-programming, debugging, confirming error reports etc.)  Should I
ask them to look away from the screen each time?

cheers,
Georg

From rosuav at gmail.com  Thu Jan 14 05:08:24 2016
From: rosuav at gmail.com (Chris Angelico)
Date: Thu, 14 Jan 2016 21:08:24 +1100
Subject: [Python-ideas] Password masking for getpass.getpass
In-Reply-To:
References: <20160113015414.GF10854@ando.pearwood.info>
 <20160113021746.GA26480@phdru.name> <20160113100442.GI10854@ando.pearwood.info>
Message-ID:

On Thu, Jan 14, 2016 at 8:50 PM, Georg Brandl wrote:
> This might not apply for people working from home, but at work I
> regularly enter my own password or passwords for other systems with
> other people intentionally looking over my shoulder (e.g.
> pair-programming, debugging, confirming error reports etc.)  Should I
> ask them to look away from the screen each time?

Yes - and ask them to block their ears, too. The sound of your
keyboard can give away information about what your password is.
ChrisA

From khali119 at umn.edu  Thu Jan 14 05:07:54 2016
From: khali119 at umn.edu (Muhammad Ahmed Khalid)
Date: Thu, 14 Jan 2016 04:07:54 -0600
Subject: [Python-ideas] Password masking for getpass.getpass
In-Reply-To:
References: <20160113015414.GF10854@ando.pearwood.info>
 <20160113021746.GA26480@phdru.name> <20160113100442.GI10854@ando.pearwood.info>
Message-ID:

Regarding the issue of people watching the user type in the password:
unless the person looking is right next to the user, it doesn't really
matter if they look at the screen, because if password masking is
enabled they will only see the masking characters.

If the person looking is right next to the user, then that person can
just look at the keyboard and the keys being pressed.

Also, the main issue here is that the getpass function should offer a
choice of whether to provide feedback or not.

On Thu, Jan 14, 2016 at 3:50 AM, Georg Brandl wrote:
> On 01/13/2016 11:04 AM, Steven D'Aprano wrote:
> > [...]
> > I don't know... I'm about 35% convinced that obfuscating the password
> > is just security theatre. I'm not sure that "shoulder surfing" of
> > passwords is a significant threat.
>
> This might not apply for people working from home, but at work I
> regularly enter my own password or passwords for other systems with
> other people intentionally looking over my shoulder (e.g.
> pair-programming, debugging, confirming error reports etc.)  Should I
> ask them to look away from the screen each time?
>
> cheers,
> Georg

From khali119 at umn.edu  Thu Jan 14 05:29:18 2016
From: khali119 at umn.edu (Muhammad Ahmed Khalid)
Date: Thu, 14 Jan 2016 04:29:18 -0600
Subject: [Python-ideas] Password masking for getpass.getpass
In-Reply-To:
References: <20160113015414.GF10854@ando.pearwood.info>
 <20160113021746.GA26480@phdru.name> <20160113100442.GI10854@ando.pearwood.info>
Message-ID:

This discussion is kind of going in different directions and I want to
bring it back to the getpass function.

The original argument is that there should be a choice, provided by the
getpass function, of getting feedback or not.  Currently the getpass
function does not provide any feedback, and I just want to add the
ability to make it so that I can get some feedback.

Someone earlier mentioned that by default the function will not echo
anything back, and I totally agree with that.  In fact I like that
suggestion a lot.
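A rough POSIX-only sketch of what I mean (with a hypothetical ``mask``
parameter -- this is only an illustration, and Windows would need an
msvcrt-based variant):

    import sys
    import termios
    import tty

    def getpass_masked(prompt='Password: ', mask=None):
        """Read a password; echo one *mask* char per key, or nothing."""
        fd = sys.stdin.fileno()
        old_attrs = termios.tcgetattr(fd)
        chars = []
        sys.stdout.write(prompt)
        sys.stdout.flush()
        try:
            tty.setraw(fd)          # one key at a time, no automatic echo
            while True:
                ch = sys.stdin.read(1)
                if ch in ('\r', '\n'):
                    break
                if ch == '\x7f':    # backspace: drop a char, erase a mask char
                    if chars:
                        chars.pop()
                        if mask:
                            sys.stdout.write('\b \b')
                else:
                    chars.append(ch)
                    if mask:
                        sys.stdout.write(mask)
                sys.stdout.flush()
        finally:
            termios.tcsetattr(fd, termios.TCSADRAIN, old_attrs)
            sys.stdout.write('\n')
        return ''.join(chars)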
Only when users want feedback would they change the parameters of the
function and pass whichever character they want for masking.

Please note that currently the first parameter of the getpass function
is the prompt.  The second parameter can then be used as the masking
character, which can be None / blank by default.

On Thu, Jan 14, 2016 at 4:08 AM, Chris Angelico wrote:
> On Thu, Jan 14, 2016 at 8:50 PM, Georg Brandl wrote:
> > This might not apply for people working from home, but at work I
> > regularly enter my own password or passwords for other systems with
> > other people intentionally looking over my shoulder (e.g.
> > pair-programming, debugging, confirming error reports etc.)  Should I
> > ask them to look away from the screen each time?
>
> Yes - and ask them to block their ears, too. The sound of your
> keyboard can give away information about what your password is.
>
> ChrisA

From leewangzhong+python at gmail.com  Thu Jan 14 05:32:46 2016
From: leewangzhong+python at gmail.com (Franklin? Lee)
Date: Thu, 14 Jan 2016 05:32:46 -0500
Subject: [Python-ideas] (FAT Python) Convert keyword arguments to positional?
Message-ID:

(FAT Python: http://faster-cpython.readthedocs.org/fat_python.html)

FAT Python uses guards to check whether a global name (for example, the
name for a function) has changed its value. Idea: If you know exactly
which function will be called, you can also optimize based on the
properties of that function.

According to Eli Bendersky's 2012 blog post[1] (which might be
outdated), a function call with keyword arguments is potentially slower
than one with only positional arguments.

"""
If the function in question accepts no arguments (marked by the
METH_NOARGS flag when the function is created) or just a single object
argument (METH_O flag), call_function doesn't go through the usual
argument packing and can call the underlying function pointer directly.
...
do_call ... implements the most generic form of calling. However,
there's one more optimization - if func is a PyFunction (an object used
internally to represent functions defined in Python code), a separate
path is taken - fast_function.
...
... PyCFunction objects that do [receive] keyword arguments [use do_call
instead of fast_function]. A curious aspect of this fact is that it's
somewhat more efficient to not pass keyword arguments to C functions
that either accept them or are fine with just positional arguments.
"""

So maybe, in a function which uses FAT Python's guards, we can replace
some of the keyworded calls to global functions with positional-only
calls. It might be a micro-optimization, but it's one that the Python
programmer doesn't have to worry about.

Concerns:
1. Is it possible to correctly determine, for a given function, which
positional parameters have which names?
2. Is it possible to change a function object's named parameters some
time after it's created (and inspected)?
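For concern 1, at least at runtime, an inspect-based sketch like this
seems workable for simple signatures (a hypothetical helper, just to
illustrate the mapping):

    import inspect

    def flatten_call(func, *args, **kwargs):
        # Bind the call the way Python would, then flatten it to purely
        # positional arguments. Only valid for signatures without
        # *args/**kwargs and without keyword-only parameters.
        bound = inspect.signature(func).bind(*args, **kwargs)
        bound.apply_defaults()
        return tuple(bound.arguments.values())

    def f(a, b, c=3):
        return a + b + c

    print(flatten_call(f, 1, b=2))  # -> (1, 2, 3)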
PS: I didn't feel like this was appropriate for either of Victor's
running PEP threads, and the third milestone thread is in the previous
month's archives, so I thought that making a new thread would be best.

[1] http://eli.thegreenplace.net/2012/03/23/python-internals-how-callables-work

From mal at egenix.com  Thu Jan 14 05:39:54 2016
From: mal at egenix.com (M.-A. Lemburg)
Date: Thu, 14 Jan 2016 11:39:54 +0100
Subject: [Python-ideas] Password masking for getpass.getpass
In-Reply-To:
References: <20160113015414.GF10854@ando.pearwood.info>
 <20160113021746.GA26480@phdru.name> <20160113100442.GI10854@ando.pearwood.info>
Message-ID: <56977AFA.4000606@egenix.com>

On 14.01.2016 11:29, Muhammad Ahmed Khalid wrote:
> This discussion is kind of going in different directions and I want to
> bring it back to the getpass function.
>
> The original argument is that there should be a choice, provided by the
> getpass function, of getting feedback or not.  Currently the getpass
> function does not provide any feedback, and I just want to add the
> ability to make it so that I can get some feedback.
>
> Someone earlier mentioned that by default the function will not echo
> anything back, and I totally agree with that.  In fact I like that
> suggestion a lot.  Only when users want feedback would they change the
> parameters of the function and pass whichever character they want for
> masking.
>
> Please note that currently the first parameter of the getpass function
> is the prompt.  The second parameter can then be used as the masking
> character, which can be None / blank by default.

If you can make this work cross-platform, I don't think anyone would
object to having such an option, as long as the default remains "show
nothing" :-)

For more complex password / key card / single sign-on / etc.
functionality, I believe a PyPI installable package would be better.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Experts (#1, Jan 14 2016)
>>> Python Projects, Coaching and Consulting ...  http://www.egenix.com/
>>> Python Database Interfaces ...           http://products.egenix.com/
>>> Plone/Zope Database Interfaces ...           http://zope.egenix.com/
________________________________________________________________________

::: We implement business ideas - efficiently in both time and costs :::

   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/
                      http://www.malemburg.com/

From g.brandl at gmx.net  Thu Jan 14 06:17:45 2016
From: g.brandl at gmx.net (Georg Brandl)
Date: Thu, 14 Jan 2016 12:17:45 +0100
Subject: [Python-ideas] Password masking for getpass.getpass
In-Reply-To:
References: <20160113015414.GF10854@ando.pearwood.info>
 <20160113021746.GA26480@phdru.name> <20160113100442.GI10854@ando.pearwood.info>
Message-ID:

On 01/14/2016 11:07 AM, Muhammad Ahmed Khalid wrote:
> Regarding the issue of people watching the user type in the password:
> unless the person looking is right next to the user, it doesn't really
> matter if they look at the screen, because if password masking is
> enabled they will only see the masking characters.
>
> If the person looking is right next to the user, then that person can
> just look at the keyboard and the keys being pressed.

Well, I can type reasonably fast.

For Chris and anyone else who'd like to pretend not to know what I
meant: The point is that these are *coworkers*, not *hackers*.  They
have no reason to go to great lengths to know these passwords.  But at
the same time, they're not theirs to know, and I'd like to keep them to
myself, otherwise we could just use one company-wide password for
everything.
Having the password masked on the screen, where eyes will be anyway, is
a good compromise to avoid needless interruptions.

Georg

From jsbueno at python.org.br  Thu Jan 14 11:06:46 2016
From: jsbueno at python.org.br (Joao S. O. Bueno)
Date: Thu, 14 Jan 2016 14:06:46 -0200
Subject: [Python-ideas] Higher level heapq
Message-ID:

Hi,

the heapq stdlib module is really handy, but a little low level -
in that it accepts a sequence, possibly only a list, as the heap
object, and that object has to be handled independently, outside the
functions provided there.  (One can't otherwise insert or delete
elements of that list without destroying the heap, for example.)

It would be simple to have a higher level class that would do just
that and simplify the use of an ordered container - what about having
an extra class there?

I have the snippet below, which I wrote on Stack Overflow a couple of
years ago - it is very handy.  With a little more boilerplate and code
hardening, maybe it could be a nice thing for the stdlib?

What do you say?

http://stackoverflow.com/questions/8875706/python-heapq-with-custom-compare-predicate/8875823#8875823

js
-><-

From guido at python.org  Thu Jan 14 11:17:28 2016
From: guido at python.org (Guido van Rossum)
Date: Thu, 14 Jan 2016 08:17:28 -0800
Subject: [Python-ideas] Higher level heapq
In-Reply-To:
References:
Message-ID:

Well, it's a lot of overhead for a very small bit of convenience. I say
let's not do this; it would just encourage people to settle for a slower
version. Not everything needs to be OO, you know!

On Thu, Jan 14, 2016 at 8:06 AM, Joao S. O. Bueno wrote:
> Hi,
>
> the heapq stdlib module is really handy, but a little low level -
> in that it accepts a sequence, possibly only a list, as the heap
> object, and that object has to be handled independently, outside the
> functions provided there.  (One can't otherwise insert or delete
> elements of that list without destroying the heap, for example.)
>
> It would be simple to have a higher level class that would do just
> that and simplify the use of an ordered container - what about having
> an extra class there?
>
> I have the snippet below, which I wrote on Stack Overflow a couple of
> years ago - it is very handy.  With a little more boilerplate and code
> hardening, maybe it could be a nice thing for the stdlib?
>
> What do you say?
>
> http://stackoverflow.com/questions/8875706/python-heapq-with-custom-compare-predicate/8875823#8875823
>
> js
> -><-

-- 
--Guido van Rossum (python.org/~guido)

From abarnert at yahoo.com  Thu Jan 14 11:54:50 2016
From: abarnert at yahoo.com (Andrew Barnert)
Date: Thu, 14 Jan 2016 08:54:50 -0800
Subject: [Python-ideas] Higher level heapq
In-Reply-To:
References:
Message-ID: <0EDC6592-AC16-4B4A-B345-33E5607D4B89@yahoo.com>

On Jan 14, 2016, at 08:06, Joao S. O. Bueno wrote:
>
> Hi,
>
> the heapq stdlib module is really handy, but a little low level -
> in that it accepts a sequence, possibly only a list, as the heap
> object, and that object has to be handled independently, outside the
> functions provided there.  (One can't otherwise insert or delete
> elements of that list without destroying the heap, for example.)
>
> It would be simple to have a higher level class that would do just
> that and simplify the use of an ordered container - what about having
> an extra class there?
>
> I have the snippet below, which I wrote on Stack Overflow a couple of
> years ago - it is very handy.  With a little more boilerplate and code
> hardening, maybe it could be a nice thing for the stdlib?

Using (key(x), x) as the elements doesn't work if the real values aren't
comparable, and isn't stable even if they are. So, to make this work
fully generally, you have to add a third element, like
(key(x), next(counter), x). But when that isn't necessary, it's pretty
wasteful. Also, for many uses, the key doesn't have anything to do with
the values--e.g., a timer queue just uses the insertion time--so the
sorting-style key function is misleading.
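To make that concrete, the fully general form looks something like this
(a sketch, not proposed stdlib code):

    import heapq
    from itertools import count

    class KeyHeap:
        """Heap-with-key sketch using (key, tiebreaker, value) entries."""

        def __init__(self, iterable=(), key=lambda x: x):
            self._key = key
            self._counter = count()   # tiebreaker: keeps equal keys stable
                                      # and never compares the values
            self._data = [(key(v), next(self._counter), v)
                          for v in iterable]
            heapq.heapify(self._data)

        def push(self, value):
            heapq.heappush(self._data,
                           (self._key(value), next(self._counter), value))

        def pop(self):
            return heapq.heappop(self._data)[2]

        def __len__(self):
            return len(self._data)

    h = KeyHeap([(1, 'b'), (1, 'a')], key=lambda pair: pair[0])
    print(h.pop())   # (1, 'b'): equal keys come out in insertion order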
Also, a heap as a collection-like data structure isn't that useful on
its own. There are a variety of iterative algorithms that use a heap
internally, but they don't need to expose it to callers. (And most of
the common ones are already included in the module.) And there are also
a variety of data structures that use a heap internally, but they also
don't need to expose it to callers. For example, read the section in the
docs on priority queues, and try to implement a pqueue class on top of
your wrapper class vs. directly against the module functions. And do the
same with the nlargest function (you can find the source linked from the
docs). In both cases, the version without the class is more readable,
less code, and likely more efficient, and the API for users is the same,
so what has the wrapper class bought you?

From victor.stinner at gmail.com  Thu Jan 14 13:07:46 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Thu, 14 Jan 2016 19:07:46 +0100
Subject: [Python-ideas] (FAT Python) Convert keyword arguments to positional?
In-Reply-To:
References:
Message-ID:

2016-01-14 11:32 GMT+01:00 Franklin? Lee :
> (FAT Python: http://faster-cpython.readthedocs.org/fat_python.html)

FYI I moved the optimizer into a new project at GitHub to ease
contributions and experiments:
https://github.com/haypo/fatoptimizer

Running the tests works on Python 3.4, but running optimized code
requires a patched Python 3.6 (the http://hg.python.org/sandbox/fatpython
repository, which includes all patches).

> FAT Python uses guards to check whether a global name (for example,
> the name for a function) has changed its value. Idea: If you know
> exactly which function will be called, you can also optimize based on
> the properties of that function.

You need a guard on the function. The fat module provides such a guard:
https://fatoptimizer.readthedocs.org/en/latest/fat.html#GuardFunc

Right now, it only watches func.__code__. I'm not sure that that's
enough. A function has many attributes which can change its behaviour
if they are modified: __defaults__, __closure__, __dict__, __globals__,
__kwdefaults__, __module__ (?), __name__ (?), __qualname__ (?).

> According to Eli Bendersky's 2012 blog post[1] (which might be
> outdated), a function call with keyword arguments is potentially
> slower than one with only positional arguments.

Yeah, ext_do_call() has to create a temporary dictionary, while calling
a function only with indexed parameters can *completely* avoid the
creation of any temporary object. PyEval_EvalFrameEx() takes a stack
(array of objects) which is used to pass parameters from CALL_FUNCTION,
but only for pure Python functions.
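A quick way to see the difference (a rough micro-benchmark sketch;
absolute numbers depend on the build):

    import timeit

    setup = "def f(a, b): return a"
    print(timeit.timeit("f(1, 2)", setup=setup))      # positional call
    print(timeit.timeit("f(a=1, b=2)", setup=setup))  # keyword call:
                                                      # a bit slower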
> So maybe, in a function which uses FAT Python's guards, we can replace
> some of the keyworded calls to global functions with positional-only
> calls. It might be a micro-optimization, but it's one that the Python
> programmer doesn't have to worry about.

It looks like you have a plan, and I think that you can implement the
optimization without changing the Python semantics.

> Concerns:
> 1. Is it possible to correctly determine, for a given function, which
> positional parameters have which names?

I think so. Just "read" the function prototype, no? Such info is
available in the AST.

> 2. Is it possible to change a function object's named parameters some
> time after it's created (and inspected)?

What do you think?

> PS: I didn't feel like this was appropriate for either of Victor's
> running PEP threads, and the third milestone thread is in the previous
> month's archives, so I thought that making a new thread would be best.

Yeah, it's better to start a separate thread, thanks.

Victor

From python-ideas at mgmiller.net  Thu Jan 14 13:35:59 2016
From: python-ideas at mgmiller.net (Mike Miller)
Date: Thu, 14 Jan 2016 10:35:59 -0800
Subject: [Python-ideas] Password masking for getpass.getpass
In-Reply-To:
References:
Message-ID: <5697EA8F.4050506@mgmiller.net>

Sounds like this default should be user configurable as well, not only
by the developer.  Perhaps a function call to set the preference in
PYTHONSTARTUP?  (or another precedent).  A hotkey to toggle would be
helpful for laptops, which tend to travel.

-Mike

On 2016-01-13 16:29, Terry Reedy wrote:
> I HATE Password Masking Security Theatre.

From greg.ewing at canterbury.ac.nz  Thu Jan 14 16:16:20 2016
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 15 Jan 2016 10:16:20 +1300
Subject: [Python-ideas] Password masking for getpass.getpass
In-Reply-To:
References: <20160113015414.GF10854@ando.pearwood.info>
 <20160113021746.GA26480@phdru.name> <20160113100442.GI10854@ando.pearwood.info>
Message-ID: <56981024.8080300@canterbury.ac.nz>

Georg Brandl wrote:
> I regularly
> enter my own password or passwords for other systems with other people
> intentionally looking over my shoulder (e.g. pair-programming, debugging,
> confirming error reports etc.)

I have a solution! It requires a display capable of emitting light with
selected polarisation. The password field displays the password in such
a way that it can only be seen when wearing polaroid glasses that are
polarised in the correct direction.

So a pair of programmers can wear glasses that let them each see their
own password but not the other's. Bystanders not wearing any glasses
would not see either of them.

-- 
Greg

From leewangzhong+python at gmail.com  Thu Jan 14 16:32:28 2016
From: leewangzhong+python at gmail.com (Franklin? Lee)
Date: Thu, 14 Jan 2016 16:32:28 -0500
Subject: [Python-ideas] (FAT Python) Convert keyword arguments to positional?
In-Reply-To:
References:
Message-ID:

Maybe also have it substitute in the function's default args, if default
args take extra work (though it would cost extra memory for new local
variables and probably doesn't give any savings).

On Jan 14, 2016 1:08 PM, "Victor Stinner" wrote:
>
> 2016-01-14 11:32 GMT+01:00 Franklin? Lee :
> > Concerns:
> > 1. Is it possible to correctly determine, for a given function, which
> > positional parameters have which names?
>
> I think so. Just "read" the function prototype, no? Such info is
> available in the AST.
>
> > 2. Is it possible to change a function object's named parameters some
> > time after it's created (and inspected)?
>
> What do you think?

I'm not too familiar (yet) with the details of the AST. I had function
wrappers in mind. In particular, I would like to permit "faked"/computed
function signatures for wrappers based on what they wrap (e.g.
lru_cache, partial), and I'm not sure (though I suspect) that computed
signatures are compatible with immutable signatures (that is, fixed upon
creation).

(Sorry for the double-mail, Victor. I will try to remember not to post
from the phone.)

From victor.stinner at gmail.com  Fri Jan 15 11:10:25 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Fri, 15 Jan 2016 17:10:25 +0100
Subject: [Python-ideas] PEP 511: API for code transformers
Message-ID:

Hi,

This PEP 511 is part of a series of three PEPs (509, 510, 511) adding an
API to implement a static Python optimizer specializing functions with
guards.

If the PEP is accepted, it will solve a long list of issues, some of
which are old, like #1346238 which is 11 years old ;-) I found 12
issues:

* http://bugs.python.org/issue1346238
* http://bugs.python.org/issue2181
* http://bugs.python.org/issue2499
* http://bugs.python.org/issue2506
* http://bugs.python.org/issue4264
* http://bugs.python.org/issue7682
* http://bugs.python.org/issue10399
* http://bugs.python.org/issue11549
* http://bugs.python.org/issue17068
* http://bugs.python.org/issue17430
* http://bugs.python.org/issue17515
* http://bugs.python.org/issue26107

I worked to make the PEP more generic than "this hook is written for
FAT Python". Please read the full PEP to see a long list of existing
usages of code transformers in Python.

You may read again the discussion which occurred 4 years ago about the
same topic:
https://mail.python.org/pipermail/python-dev/2012-August/121286.html
(the thread starts with an idea of an AST optimizer, but it moves
quickly to a generic API to transform the code)

Thanks to Red Hat for giving me time to experiment on this.
Victor

HTML version:
https://www.python.org/dev/peps/pep-0510/#changes

PEP: 511
Title: API for code transformers
Version: $Revision$
Last-Modified: $Date$
Author: Victor Stinner
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 4-January-2016
Python-Version: 3.6

Abstract
========

Propose an API to register bytecode and AST transformers. Also add a
``-o OPTIM_TAG`` command line option to change ``.pyc`` filenames;
``-o noopt`` disables the peephole optimizer. Raise an ``ImportError``
exception on import if the ``.pyc`` file is missing and the code
transformers required to transform the code are missing. Code
transformers are not needed to execute code transformed ahead of time
(loaded from ``.pyc`` files).


Rationale
=========

Python does not provide a standard way to transform the code. Projects
transforming the code use various hooks. The MacroPy project uses an
import hook: it adds its own module finder in ``sys.meta_path`` to
hook its AST transformer. Another option is to monkey-patch the
builtin ``compile()`` function. There are even more options for hooking
a code transformer.

Python 3.4 added a ``compile_source()`` method to
``importlib.abc.SourceLoader``. But code transformation is wider than
just importing modules; see the use cases described below.

Writing an optimizer or a preprocessor is outside the scope of this PEP.
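For illustration, the monkey-patching approach mentioned above can be
sketched as follows (a rough sketch, not part of this proposal), here
wired to a trivial constant-folding AST pass::

    import ast
    import builtins

    _compile = builtins.compile

    class FoldAdd(ast.NodeTransformer):
        # Fold "<number> + <number>" into a single constant node.
        def visit_BinOp(self, node):
            self.generic_visit(node)
            if (isinstance(node.op, ast.Add)
                    and isinstance(node.left, ast.Num)
                    and isinstance(node.right, ast.Num)):
                return ast.copy_location(
                    ast.Num(n=node.left.n + node.right.n), node)
            return node

    def compile(source, filename, mode, flags=0,
                dont_inherit=False, optimize=-1):
        if isinstance(source, (str, bytes)):
            tree = _compile(source, filename, mode,
                            flags | ast.PyCF_ONLY_AST, dont_inherit)
            source = ast.fix_missing_locations(FoldAdd().visit(tree))
        return _compile(source, filename, mode, flags,
                        dont_inherit, optimize)

    builtins.compile = compile
    exec(compile("print(1 + 1)", "<demo>", "exec"))   # compiles print(2)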
Usage 1: AST optimizer
----------------------

Transforming an Abstract Syntax Tree (AST) is a convenient way to
implement an optimizer. It's easier to work on the AST than on the
bytecode: the AST contains more information and is higher level.

Since the optimization can be done ahead of time, complex but slow
optimizations can be implemented.

Examples of optimizations which can be implemented with an AST
optimizer:

* `Copy propagation `_: replace ``x=1; y=x`` with ``x=1; y=1``
* `Constant folding `_: replace ``1+1`` with ``2``
* `Dead code elimination `_

Using guards (see the `PEP 510 `_), it is possible to implement a much
wider choice of optimizations. Examples:

* Simplify iterable: replace ``range(3)`` with ``(0, 1, 2)`` when used
  as iterable
* `Loop unrolling `_
* Call pure builtins: replace ``len("abc")`` with ``3``
* Copy used builtin symbols to constants
* See also `optimizations implemented in fatoptimizer `_, a static
  optimizer for Python 3.6.

The following issues can be implemented with an AST optimizer:

* `Issue #1346238 `_: A constant folding optimization pass for the AST
* `Issue #2181 `_: optimize out local variables at end of function
* `Issue #2499 `_: Fold unary + and not on constants
* `Issue #4264 `_: Patch: optimize code to use LIST_APPEND instead of
  calling list.append
* `Issue #7682 `_: Optimisation of if with constant expression
* `Issue #10399 `_: AST Optimization: inlining of function calls
* `Issue #11549 `_: Build-out an AST optimizer, moving some
  functionality out of the peephole optimizer
* `Issue #17068 `_: peephole optimization for constant strings
* `Issue #17430 `_: missed peephole optimization


Usage 2: Preprocessor
---------------------

A preprocessor can be easily implemented with an AST transformer. A
preprocessor has various different usages.

Some examples:

* Remove debug code like assertions and logs to make the code faster
  for production.
* `Tail-call optimization `_
* Add profiling code
* `Lazy evaluation `_: see `lazy_python `_ (bytecode transformer) and
  the `lazy macro of MacroPy `_ (AST transformer)
* Change dictionary literals into collections.OrderedDict instances
* Declare constants: see `@asconstants of codetransformer `_
* Domain Specific Languages (DSL) like SQL queries. The Python language
  itself doesn't need to be modified. Previous attempts to implement
  DSLs for SQL, like `PEP 335 - Overloadable Boolean Operators `_, were
  rejected.
* Pattern matching as in functional languages
* String interpolation, but `PEP 498 -- Literal String Interpolation
  `_ was merged into Python 3.6.

`MacroPy `_ has a long list of examples and use cases.

This PEP does not add any new code transformer. Using a code transformer
will require installing an external module and registering it manually.

See also `PyXfuscator `_: Python obfuscator, deobfuscator, and
user-assisted decompiler.
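As a small taste of the preprocessor use case, an ``ast.NodeTransformer``
of a few lines is enough to strip assertions (a sketch)::

    import ast

    class StripAsserts(ast.NodeTransformer):
        def visit_Assert(self, node):
            return None   # returning None removes the statement

    tree = ast.parse("assert x > 0\nprint(x)")
    tree = ast.fix_missing_locations(StripAsserts().visit(tree))
    print(ast.dump(tree))   # only the print() call remains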
Usage 3: Disable all optimization
---------------------------------

Ned Batchelder asked to add an option to disable the peephole optimizer
because it makes code coverage more difficult to implement. See the
discussion on the python-ideas mailing list: `Disable all peephole
optimizations `_.

This PEP adds a new ``-o noopt`` command line option to disable the
peephole optimizer. In Python, it's as easy as::

    sys.set_code_transformers([])

It will fix the `Issue #2506 `_: Add mechanism to disable
optimizations.


Usage 4: Write new bytecode optimizers in Python
------------------------------------------------

Python 3.6 optimizes the code using a peephole optimizer. By definition,
a peephole optimizer has a narrow view of the code and so can only
implement basic optimizations. The optimizer rewrites the bytecode. It
is difficult to enhance it, because it is written in C.

With this PEP, it becomes possible to implement a new bytecode optimizer
in pure Python and experiment with new optimizations.

Some optimizations are easier to implement on the AST, like constant
folding, but optimizations on the bytecode are still useful. For
example, when the AST is compiled to bytecode, useless jumps can be
emitted because the compiler is naive and does not try to optimize
anything.


Use Cases
=========

This section gives examples of use cases explaining when and how code
transformers will be used.

Interactive interpreter
-----------------------

It will be possible to use code transformers with the interactive
interpreter, which is popular in Python and commonly used to demonstrate
Python.

The code is transformed at runtime and so the interpreter can be slower
when expensive code transformers are used.

Build a transformed package
---------------------------

It will be possible to build a package of the transformed code.

A transformer can have a configuration. The configuration is not stored
in the package.

All ``.pyc`` files of the package must be transformed with the same code
transformers and the same transformer configuration.

It is possible to build different ``.pyc`` files using different
optimizer tags. Example: ``fat`` for the default configuration and
``fat_inline`` for a different configuration with function inlining
enabled.

A package can contain ``.pyc`` files with different optimizer tags.


Install a package containing transformed .pyc files
---------------------------------------------------

It will be possible to install a package which contains transformed
``.pyc`` files.

All ``.pyc`` files with any optimizer tag contained in the package are
installed, not only those for the current optimizer tag.


Build .pyc files when installing a package
------------------------------------------

If a package does not contain any ``.pyc`` files for the current
optimizer tag (or some ``.pyc`` files are missing), the ``.pyc`` files
are created during the installation.

Code transformers of the optimizer tag are required. Otherwise, the
installation fails with an error.


Execute transformed code
------------------------

It will be possible to execute transformed code.

Raise an ``ImportError`` exception on import if the ``.pyc`` file of the
current optimizer tag is missing and the code transformers required to
transform the code are missing.

The interesting point here is that code transformers are not needed to
execute the transformed code if all required ``.pyc`` files are already
available.


Code transformer API
====================

A code transformer is a class with ``ast_transformer()`` and/or
``code_transformer()`` methods (API described below) and a ``name``
attribute.

For efficiency, do not define a ``code_transformer()`` or
``ast_transformer()`` method if it does nothing.

The ``name`` attribute (``str``) must be a short string used to identify
an optimizer. It is used to build a ``.pyc`` filename. The name must not
contain dots (``'.'``), dashes (``'-'``) or directory separators: dots
are used to separate fields in a ``.pyc`` filename and dashes are used
to join code transformer names to build the optimizer tag.
.. note::
   It would be nice to pass the fully qualified name of a module in the
   *context* when an AST transformer is used to transform a module on
   import, but it looks like the information is not available in
   ``PyParser_ASTFromStringObject()``.


code_transformer()
------------------

Prototype::

    def code_transformer(code, consts, names, lnotab, context):
        ...
        return (code, consts, names, lnotab)

Parameters:

* *code*: the bytecode (``bytes``)
* *consts*: a sequence of constants
* *names*: tuple of variable names
* *lnotab*: table mapping instruction offsets to line numbers
  (``bytes``)

The code transformer is run after the compilation to bytecode.


ast_transformer()
-----------------

Prototype::

    def ast_transformer(tree, context):
        ...
        return tree

Parameters:

* *tree*: an AST tree
* *context*: an object with a ``filename`` attribute (``str``)

It must return an AST tree. It can modify the AST tree in place, or
create a new AST tree.

The AST transformer is called after the creation of the AST by the
parser and before the compilation to bytecode. New attributes may be
added to *context* in the future.


Changes
=======

In short, add:

* ``-o OPTIM_TAG`` command line option
* ``ast.Constant``
* ``ast.PyCF_TRANSFORMED_AST``
* ``sys.get_code_transformers()``
* ``sys.implementation.optim_tag``
* ``sys.set_code_transformers(transformers)``


API to get/set code transformers
--------------------------------

Add new functions to register code transformers:

* ``sys.set_code_transformers(transformers)``: set the list of code
  transformers and update ``sys.implementation.optim_tag``
* ``sys.get_code_transformers()``: get the list of code transformers.

The order of code transformers matters. Running transformer A and then
transformer B can give a different output than running transformer B and
then transformer A.

Example to prepend a new code transformer::

    transformers = sys.get_code_transformers()
    transformers.insert(0, new_cool_transformer)
    sys.set_code_transformers(transformers)

All AST transformers are run sequentially (e.g. the second transformer
gets the input of the first transformer), and then all bytecode
transformers are run sequentially.


Optimizer tag
-------------

Changes:

* Add ``sys.implementation.optim_tag`` (``str``): optimization tag.
  The default optimization tag is ``'opt'``.
* Add a new ``-o OPTIM_TAG`` command line option to set
  ``sys.implementation.optim_tag``.

Changes on ``importlib``:

* ``importlib`` uses ``sys.implementation.optim_tag`` to build the
  ``.pyc`` filename when importing modules, instead of always using
  ``opt``. Also remove the special case for the optimizer level ``0``
  with the default optimizer tag ``'opt'``, to simplify the code.
* When loading a module, if the ``.pyc`` file is missing but the ``.py``
  is available, the ``.py`` is only used if the code optimizers have the
  same optimizer tag as the current tag; otherwise an ``ImportError``
  exception is raised.

Pseudo-code of a ``use_py()`` function to decide if a ``.py`` file can
be compiled to import a module::

    def transformers_tag():
        transformers = sys.get_code_transformers()
        if not transformers:
            return 'noopt'
        return '-'.join(transformer.name
                        for transformer in transformers)

    def use_py():
        return (transformers_tag() == sys.implementation.optim_tag)

The order of ``sys.get_code_transformers()`` matters. For example, the
``fat`` transformer followed by the ``pythran`` transformer gives the
optimizer tag ``fat-pythran``.

The behaviour of the ``importlib`` module is unchanged with the default
optimizer tag (``'opt'``).
Peephole optimizer
------------------

By default, ``sys.implementation.optim_tag`` is ``opt`` and
``sys.get_code_transformers()`` returns a list of one code transformer:
the peephole optimizer (optimize the bytecode).

Use ``-o noopt`` to disable the peephole optimizer. In this case, the
optimizer tag is ``noopt`` and no code transformer is registered.

Using the ``-o opt`` option has no effect.


AST enhancements
----------------

Enhancements to simplify the implementation of AST transformers:

* Add a new compiler flag ``PyCF_TRANSFORMED_AST`` to get the
  transformed AST. ``PyCF_ONLY_AST`` returns the AST before the
  transformers.
* Add ``ast.Constant``: this type is not emitted by the compiler, but
  can be used in an AST transformer to simplify the code. It does not
  contain line number and column offset information on tuple or
  frozenset items.
* ``PyCodeObject.co_lnotab``: the line number delta becomes signed to
  support moving instructions (note: this needs to modify MAGIC_NUMBER
  in importlib). Implemented in the `issue #26107 `_
* Enhance the bytecode compiler to support ``tuple`` and ``frozenset``
  constants. Currently, ``tuple`` and ``frozenset`` constants are
  created by the peephole transformer, after the bytecode compilation.
* ``marshal`` module: fix serialization of the empty frozenset singleton
* update ``Tools/parser/unparse.py`` to support the new ``ast.Constant``
  node type


Examples
========

.pyc filenames
--------------

Example of ``.pyc`` filenames of the ``os`` module.

With the default optimizer tag ``'opt'``:

===========================  ==================
.pyc filename                Optimization level
===========================  ==================
``os.cpython-36.opt-0.pyc``  0
``os.cpython-36.opt-1.pyc``  1
``os.cpython-36.opt-2.pyc``  2
===========================  ==================

With the ``'fat'`` optimizer tag:

===========================  ==================
.pyc filename                Optimization level
===========================  ==================
``os.cpython-36.fat-0.pyc``  0
``os.cpython-36.fat-1.pyc``  1
``os.cpython-36.fat-2.pyc``  2
===========================  ==================


Bytecode transformer
--------------------

Scary bytecode transformer replacing all strings with
``"Ni! Ni! Ni!"``::

    import sys

    class BytecodeTransformer:
        name = "knights_who_say_ni"

        def code_transformer(self, code, consts, names, lnotab, context):
            consts = ['Ni! Ni! Ni!' if isinstance(const, str) else const
                      for const in consts]
            return (code, consts, names, lnotab)

    # replace existing code transformers with the new bytecode transformer
    sys.set_code_transformers([BytecodeTransformer()])

    # execute code which will be transformed by code_transformer()
    exec("print('Hello World!')")

Output::

    Ni! Ni! Ni!


AST transformer
---------------

Similarly to the bytecode transformer example, the AST transformer also
replaces all strings with ``"Ni! Ni! Ni!"``::

    import ast
    import sys

    class KnightsWhoSayNi(ast.NodeTransformer):
        def visit_Str(self, node):
            node.s = 'Ni! Ni! Ni!'
            return node

    class ASTTransformer:
        name = "knights_who_say_ni"

        def __init__(self):
            self.transformer = KnightsWhoSayNi()

        def ast_transformer(self, tree, context):
            self.transformer.visit(tree)
            return tree

    # replace existing code transformers with the new AST transformer
    sys.set_code_transformers([ASTTransformer()])

    # execute code which will be transformed by ast_transformer()
    exec("print('Hello World!')")

Output::

    Ni! Ni! Ni!
Other Python implementations
============================

PEP 511 should be implemented by all Python implementations, but the
bytecode and the AST are not standardized.

By the way, even between minor versions of CPython, there are changes to
the AST API. There are differences, but only minor differences. It is
quite easy to write an AST transformer which works on Python 2.7 and
Python 3.5, for example.


Discussion
==========

* `[Python-Dev] AST optimizer implemented in Python `_ (August 2012)


Prior Art
=========

AST optimizers
--------------

In 2011, Eugene Toder proposed to rewrite some peephole optimizations in
a new AST optimizer: issue #11549, `Build-out an AST optimizer, moving
some functionality out of the peephole optimizer `_. The patch adds
``ast.Lit`` (it was proposed to rename it to ``ast.Literal``).

In 2012, Victor Stinner wrote the `astoptimizer `_ project, an AST
optimizer implementing various optimizations. The most interesting
optimizations break the Python semantics, since no guard is used to
disable an optimization if something changes.

In 2015, Victor Stinner wrote the `fatoptimizer `_ project, an AST
optimizer specializing functions using guards.

Issue #17515, `"Add sys.setasthook() to allow to use a custom AST"
optimizer `_, was a first attempt at an API for code transformers, but
specific to the AST.


Python Preprocessors
--------------------

* `MacroPy `_: MacroPy is an implementation of syntactic macros in the
  Python programming language. MacroPy provides a mechanism for
  user-defined functions (macros) to perform transformations on the
  abstract syntax tree (AST) of a Python program at import time.
* `pypreprocessor `_: C-style preprocessor directives in Python, like
  ``#define`` and ``#ifdef``


Bytecode transformers
---------------------

* `codetransformer `_: Bytecode transformers for CPython inspired by
  the ``ast`` module's ``NodeTransformer``.
* `byteplay `_: Byteplay lets you convert Python code objects into
  equivalent objects which are easy to play with, and lets you convert
  those objects back into living Python code objects. It's useful for
  applying crazy transformations on Python functions, and is also
  useful in learning Python bytecode intricacies. See the `byteplay
  documentation `_.

See also:

* `BytecodeAssembler `_


Copyright
=========

This document has been placed in the public domain.

From abarnert at yahoo.com  Fri Jan 15 11:12:58 2016
From: abarnert at yahoo.com (Andrew Barnert)
Date: Fri, 15 Jan 2016 10:12:58 -0600
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To:
References:
Message-ID: <04E1C9B3-B11B-47D7-9FD9-944A57078F7A@yahoo.com>

You linked to PEP 510 #changes. I think you wanted
https://www.python.org/dev/peps/pep-0511/

Sent from my iPhone

> On Jan 15, 2016, at 10:10, Victor Stinner wrote:
>
> Hi,
>
> This PEP 511 is part of a series of three PEPs (509, 510, 511) adding
> an API to implement a static Python optimizer specializing functions
> with guards.
>
> [...]
> > If the PEP is accepted, it will solve a long list of issues, some > issues are old, like #1346238 which is 11 years old ;-) I found 12 > issues: > > * http://bugs.python.org/issue1346238 > * http://bugs.python.org/issue2181 > * http://bugs.python.org/issue2499 > * http://bugs.python.org/issue2506 > * http://bugs.python.org/issue4264 > * http://bugs.python.org/issue7682 > * http://bugs.python.org/issue10399 > * http://bugs.python.org/issue11549 > * http://bugs.python.org/issue17068 > * http://bugs.python.org/issue17430 > * http://bugs.python.org/issue17515 > * http://bugs.python.org/issue26107 > > I worked to make the PEP more generic that "this hook is written for > FAT Python". Please read the full PEP to see a long list of existing > usages in Python of code transformers. > > You may read again the discussion which occurred 4 years ago about the > same topic: > https://mail.python.org/pipermail/python-dev/2012-August/121286.html > (the thread starts with an idea of AST optimizer, but is moves quickly > to a generic API to transform the code) > > Thanks to Red Hat for giving me time to experiment on this. > Victor > > > HTML version: > https://www.python.org/dev/peps/pep-0510/#changes > > > PEP: 511 > Title: API for code transformers > Version: $Revision$ > Last-Modified: $Date$ > Author: Victor Stinner > Status: Draft > Type: Standards Track > Content-Type: text/x-rst > Created: 4-January-2016 > Python-Version: 3.6 > > Abstract > ======== > > Propose an API to register bytecode and AST transformers. Add also ``-o > OPTIM_TAG`` command line option to change ``.pyc`` filenames, ``-o > noopt`` disables the peephole optimizer. Raise an ``ImportError`` > exception on import if the ``.pyc`` file is missing and the code > transformers required to transform the code are missing. code > transformers are not needed code transformed ahead of time (loaded from > ``.pyc`` files). > > > Rationale > ========= > > Python does not provide a standard way to transform the code. Projects > transforming the code use various hooks. The MacroPy project uses an > import hook: it adds its own module finder in ``sys.meta_path`` to > hook its AST transformer. Another option is to monkey-patch the > builtin ``compile()`` function. There are even more options to > hook a code transformer. > > Python 3.4 added a ``compile_source()`` method to > ``importlib.abc.SourceLoader``. But code transformation is wider than > just importing modules, see described use cases below. > > Writing an optimizer or a preprocessor is out of the scope of this PEP. > > Usage 1: AST optimizer > ---------------------- > > Transforming an Abstract Syntax Tree (AST) is a convenient > way to implement an optimizer. It's easier to work on the AST than > working on the bytecode, AST contains more information and is more high > level. > > Since the optimization can done ahead of time, complex but slow > optimizations can be implemented. > > Example of optimizations which can be implemented with an AST optimizer: > > * `Copy propagation > `_: > replace ``x=1; y=x`` with ``x=1; y=1`` > * `Constant folding > `_: > replace ``1+1`` with ``2`` > * `Dead code elimination > `_ > > Using guards (see the `PEP 510 > `_), it is possible to > implement a much wider choice of optimizations. 
Examples: > > * Simplify iterable: replace ``range(3)`` with ``(0, 1, 2)`` when used > as iterable > * `Loop unrolling `_ > * Call pure builtins: replace ``len("abc")`` with ``3`` > * Copy used builtin symbols to constants > * See also `optimizations implemented in fatoptimizer > `_, > a static optimizer for Python 3.6. > > The following issues can be implemented with an AST optimizer: > > * `Issue #1346238 > `_: A constant folding > optimization pass for the AST > * `Issue #2181 `_: > optimize out local variables at end of function > * `Issue #2499 `_: > Fold unary + and not on constants > * `Issue #4264 `_: > Patch: optimize code to use LIST_APPEND instead of calling list.append > * `Issue #7682 `_: > Optimisation of if with constant expression > * `Issue #10399 `_: AST > Optimization: inlining of function calls > * `Issue #11549 `_: > Build-out an AST optimizer, moving some functionality out of the > peephole optimizer > * `Issue #17068 `_: > peephole optimization for constant strings > * `Issue #17430 `_: > missed peephole optimization > > > Usage 2: Preprocessor > --------------------- > > A preprocessor can be easily implemented with an AST transformer. A > preprocessor has various and different usages. > > Some examples: > > * Remove debug code like assertions and logs to make the code faster to > run it for production. > * `Tail-call Optimization `_ > * Add profiling code > * `Lazy evaluation `_: > see `lazy_python `_ > (bytecode transformer) and `lazy macro of MacroPy > `_ (AST transformer) > * Change dictionary literals into collection.OrderedDict instances > * Declare constants: see `@asconstants of codetransformer > `_ > * Domain Specific Language (DSL) like SQL queries. The > Python language itself doesn't need to be modified. Previous attempts > to implement DSL for SQL like `PEP 335 - Overloadable Boolean > Operators `_ was rejected. > * Pattern Matching of functional languages > * String Interpolation, but `PEP 498 -- Literal String Interpolation > `_ was merged into Python > 3.6. > > `MacroPy `_ has a long list of > examples and use cases. > > This PEP does not add any new code transformer. Using a code transformer > will require an external module and to register it manually. > > See also `PyXfuscator `_: Python > obfuscator, deobfuscator, and user-assisted decompiler. > > > Usage 3: Disable all optimization > --------------------------------- > > Ned Batchelder asked to add an option to disable the peephole optimizer > because it makes code coverage more difficult to implement. See the > discussion on the python-ideas mailing list: `Disable all peephole > optimizations > `_. > > This PEP adds a new ``-o noopt`` command line option to disable the > peephole optimizer. In Python, it's as easy as:: > > sys.set_code_transformers([]) > > It will fix the `Issue #2506 `_: Add > mechanism to disable optimizations. > > > Usage 4: Write new bytecode optimizers in Python > ------------------------------------------------ > > Python 3.6 optimizes the code using a peephole optimizer. By > definition, a peephole optimizer has a narrow view of the code and so > can only implement basic optimizations. The optimizer rewrites the > bytecode. It is difficult to enhance it, because it written in C. > > With this PEP, it becomes possible to implement a new bytecode optimizer > in pure Python and experiment new optimizations. > > Some optimizations are easier to implement on the AST like constant > folding, but optimizations on the bytecode are still useful. 
For > example, when the AST is compiled to bytecode, useless jumps can be > emited because the compiler is naive and does not try to optimize > anything. > > > Use Cases > ========= > > This section give examples of use cases explaining when and how code > transformers will be used. > > Interactive interpreter > ----------------------- > > It will be possible to use code transformers with the interactive > interpreter which is popular in Python and commonly used to demonstrate > Python. > > The code is transformed at runtime and so the interpreter can be slower > when expensive code transformers are used. > > Build a transformed package > --------------------------- > > It will be possible to build a package of the transformed code. > > A transformer can have a configuration. The configuration is not stored > in the package. > > All ``.pyc`` files of the package must be transformed with the same code > transformers and the same transformers configuration. > > It is possible to build different ``.pyc`` files using different > optimizer tags. Example: ``fat`` for the default configuration and > ``fat_inline`` for a different configuration with function inlining > enabled. > > A package can contain ``.pyc`` files with different optimizer tags. > > > Install a package containing transformed .pyc files > --------------------------------------------------- > > It will be possible to install a package which contains transformed > ``.pyc`` files. > > All ``.pyc`` files with any optimizer tag contained in the package are > installed, not only for the current optimizer tag. > > > Build .pyc files when installing a package > ------------------------------------------ > > If a package does not contain any ``.pyc`` files of the current > optimizer tag (or some ``.pyc`` files are missing), the ``.pyc`` are > created during the installation. > > Code transformers of the optimizer tag are required. Otherwise, the > installation fails with an error. > > > Execute transformed code > ------------------------ > > It will be possible to execute transformed code. > > Raise an ``ImportError`` exception on import if the ``.pyc`` file of the > current optimizer tag is missing and the code transformers required to > transform the code are missing. > > The interesting point here is that code transformers are not needed to > execute the transformed code if all required ``.pyc`` files are already > available. > > > Code transformer API > ==================== > > A code transformer is a class with ``ast_transformer()`` and/or > ``code_transformer()`` methods (API described below) and a ``name`` > attribute. > > For efficiency, do not define a ``code_transformer()`` or > ``ast_transformer()`` method if it does nothing. > > The ``name`` attribute (``str``) must be a short string used to identify > an optimizer. It is used to build a ``.pyc`` filename. The name must not > contain dots (``'.'``), dashes (``'-'``) or directory separators: dots > are used to separated fields in a ``.pyc`` filename and dashes areused > to join code transformer names to build the optimizer tag. > > .. note:: > It would be nice to pass the fully qualified name of a module in the > *context* when an AST transformer is used to transform a module on > import, but it looks like the information is not available in > ``PyParser_ASTFromStringObject()``. > > > code_transformer() > ------------------ > > Prototype:: > > def code_transformer(code, consts, names, lnotab, context): > ... 
> return (code, consts, names, lnotab) > > Parameters: > > * *code*: the bytecode (``bytes``) > * *consts*: a sequence of constants > * *names*: tuple of variable names > * *lnotab*: table mapping instruction offsets to line numbers > (``bytes``) > > The code transformer is run after the compilation to bytecode > > > ast_transformer() > ------------------ > > Prototype:: > > def ast_transformer(tree, context): > ... > return tree > > Parameters: > > * *tree*: an AST tree > * *context*: an object with a ``filename`` attribute (``str``) > > It must return an AST tree. It can modify the AST tree in place, or > create a new AST tree. > > The AST transformer is called after the creation of the AST by the > parser and before the compilation to bytecode. New attributes may be > added to *context* in the future. > > > Changes > ======= > > In short, add: > > * ``-o OPTIM_TAG`` command line option > * ``ast.Constant`` > * ``ast.PyCF_TRANSFORMED_AST`` > * ``sys.get_code_transformers()`` > * ``sys.implementation.optim_tag`` > * ``sys.set_code_transformers(transformers)`` > > > API to get/set code transformers > -------------------------------- > > Add new functions to register code transformers: > > * ``sys.set_code_transformers(transformers)``: set the list of code > transformers and update ``sys.implementation.optim_tag`` > * ``sys.get_code_transformers()``: get the list of code > transformers. > > The order of code transformers matter. Running transformer A and then > transformer B can give a different output than running transformer B an > then transformer A. > > Example to prepend a new code transformer:: > > transformers = sys.get_code_transformers() > transformers.insert(0, new_cool_transformer) > sys.set_code_transformers(transformers) > > All AST tranformers are run sequentially (ex: the second transformer > gets the input of the first transformer), and then all bytecode > transformers are run sequentially. > > > Optimizer tag > ------------- > > Changes: > > * Add ``sys.implementation.optim_tag`` (``str``): optimization tag. > The default optimization tag is ``'opt'``. > * Add a new ``-o OPTIM_TAG`` command line option to set > ``sys.implementation.optim_tag``. > > Changes on ``importlib``: > > * ``importlib`` uses ``sys.implementation.optim_tag`` to build the > ``.pyc`` filename to importing modules, instead of always using > ``opt``. Remove also the special case for the optimizer level ``0`` > with the default optimizer tag ``'opt'`` to simplify the code. > * When loading a module, if the ``.pyc`` file is missing but the ``.py`` > is available, the ``.py`` is only used if code optimizers have the > same optimizer tag than the current tag, otherwise an ``ImportError`` > exception is raised. > > Pseudo-code of a ``use_py()`` function to decide if a ``.py`` file can > be compiled to import a module:: > > def transformers_tag(): > transformers = sys.get_code_transformers() > if not transformers: > return 'noopt' > return '-'.join(transformer.name > for transformer in transformers) > > def use_py(): > return (transformers_tag() == sys.implementation.optim_tag) > > The order of ``sys.get_code_transformers()`` matter. For example, the > ``fat`` transformer followed by the ``pythran`` transformer gives the > optimizer tag ``fat-pythran``. > > The behaviour of the ``importlib`` module is unchanged with the default > optimizer tag (``'opt'``). 
>
>
> Peephole optimizer
> ------------------
>
> By default, ``sys.implementation.optim_tag`` is ``opt`` and
> ``sys.get_code_transformers()`` returns a list of one code transformer:
> the peephole optimizer (which optimizes the bytecode).
>
> Use ``-o noopt`` to disable the peephole optimizer. In this case, the
> optimizer tag is ``noopt`` and no code transformer is registered.
>
> Using the ``-o opt`` option has no effect.
>
>
> AST enhancements
> ----------------
>
> Enhancements to simplify the implementation of AST transformers:
>
> * Add a new compiler flag ``PyCF_TRANSFORMED_AST`` to get the
>   transformed AST. ``PyCF_ONLY_AST`` returns the AST before the
>   transformers.
> * Add ``ast.Constant``: this type is not emitted by the compiler, but
>   can be used in an AST transformer to simplify the code. It does not
>   contain line number and column offset information on tuple or
>   frozenset items.
> * ``PyCodeObject.co_lnotab``: the line number delta becomes signed to
>   support moving instructions (note: this requires modifying MAGIC_NUMBER
>   in importlib). Implemented in `issue #26107
>   <http://bugs.python.org/issue26107>`_
> * Enhance the bytecode compiler to support ``tuple`` and ``frozenset``
>   constants. Currently, ``tuple`` and ``frozenset`` constants are
>   created by the peephole transformer, after the bytecode compilation.
> * ``marshal`` module: fix serialization of the empty frozenset singleton
> * update ``Tools/parser/unparse.py`` to support the new ``ast.Constant``
>   node type
>
>
> Examples
> ========
>
> .pyc filenames
> --------------
>
> Example of ``.pyc`` filenames of the ``os`` module.
>
> With the default optimizer tag ``'opt'``:
>
> =========================== ==================
> .pyc filename               Optimization level
> =========================== ==================
> ``os.cpython-36.opt-0.pyc`` 0
> ``os.cpython-36.opt-1.pyc`` 1
> ``os.cpython-36.opt-2.pyc`` 2
> =========================== ==================
>
> With the ``'fat'`` optimizer tag:
>
> =========================== ==================
> .pyc filename               Optimization level
> =========================== ==================
> ``os.cpython-36.fat-0.pyc`` 0
> ``os.cpython-36.fat-1.pyc`` 1
> ``os.cpython-36.fat-2.pyc`` 2
> =========================== ==================
>
>
> Bytecode transformer
> --------------------
>
> Scary bytecode transformer replacing all strings with
> ``"Ni! Ni! Ni!"``::
>
>     import sys
>
>     class BytecodeTransformer:
>         name = "knights_who_say_ni"
>
>         def code_transformer(self, code, consts, names, lnotab, context):
>             consts = ['Ni! Ni! Ni!' if isinstance(const, str) else const
>                       for const in consts]
>             return (code, consts, names, lnotab)
>
>     # replace existing code transformers with the new bytecode transformer
>     sys.set_code_transformers([BytecodeTransformer()])
>
>     # execute code which will be transformed by code_transformer()
>     exec("print('Hello World!')")
>
> Output::
>
>     Ni! Ni! Ni!
>
>
> AST transformer
> ---------------
>
> Similarly to the bytecode transformer example, the AST transformer also
> replaces all strings with ``"Ni! Ni! Ni!"``::
>
>     import ast
>     import sys
>
>     class KnightsWhoSayNi(ast.NodeTransformer):
>         def visit_Str(self, node):
>             node.s = 'Ni! Ni! Ni!'
>             return node
>
>     class ASTTransformer:
>         name = "knights_who_say_ni"
>
>         def __init__(self):
>             self.transformer = KnightsWhoSayNi()
>
>         def ast_transformer(self, tree, context):
>             self.transformer.visit(tree)
>             return tree
>
>     # replace existing code transformers with the new AST transformer
>     sys.set_code_transformers([ASTTransformer()])
>
>     # execute code which will be transformed by ast_transformer()
>     exec("print('Hello World!')")
>
> Output::
>
>     Ni! Ni! Ni!
>
>
> Other Python implementations
> ============================
>
> PEP 511 should be implemented by all Python implementations, but the
> bytecode and the AST are not standardized.
>
> By the way, even between minor versions of CPython, there are changes
> to the AST API. There are differences, but only minor differences. It
> is quite easy to write an AST transformer which works on Python 2.7 and
> Python 3.5, for example.
>
>
> Discussion
> ==========
>
> * `[Python-Dev] AST optimizer implemented in Python
>   <https://mail.python.org/pipermail/python-dev/2012-August/121286.html>`_
>   (August 2012)
>
>
> Prior Art
> =========
>
> AST optimizers
> --------------
>
> In 2011, Eugene Toder proposed to rewrite some peephole optimizations in
> a new AST optimizer: issue #11549, `Build-out an AST optimizer, moving
> some functionality out of the peephole optimizer
> <http://bugs.python.org/issue11549>`_. The patch adds ``ast.Lit`` (it
> was proposed to rename it to ``ast.Literal``).
>
> In 2012, Victor Stinner wrote the `astoptimizer `_ project, an AST
> optimizer implementing various optimizations. The most interesting
> optimizations break the Python semantics, since no guard is used to
> disable an optimization if something changes.
>
> In 2015, Victor Stinner wrote the `fatoptimizer
> <https://github.com/haypo/fatoptimizer>`_ project, an AST optimizer
> specializing functions using guards.
>
> Issue #17515, `Add sys.setasthook() to allow to use a custom AST
> optimizer <http://bugs.python.org/issue17515>`_, was a first attempt at
> an API for code transformers, but specific to the AST.
>
>
> Python Preprocessors
> --------------------
>
> * `MacroPy <https://github.com/lihaoyi/macropy>`_: MacroPy is an
>   implementation of Syntactic Macros in the Python Programming Language.
>   MacroPy provides a mechanism for user-defined functions (macros) to
>   perform transformations on the abstract syntax tree (AST) of a Python
>   program at import time.
> * `pypreprocessor `_: C-style
>   preprocessor directives in Python, like ``#define`` and ``#ifdef``
>
>
> Bytecode transformers
> ---------------------
>
> * `codetransformer `_:
>   Bytecode transformers for CPython inspired by the ``ast`` module's
>   ``NodeTransformer``.
> * `byteplay `_: Byteplay lets you
>   convert Python code objects into equivalent objects which are easy to
>   play with, and lets you convert those objects back into living Python
>   code objects. It's useful for applying crazy transformations on Python
>   functions, and is also useful in learning Python byte code
>   intricacies. See the `byteplay documentation
>   `_.
>
> See also:
>
> * `BytecodeAssembler `_
>
>
> Copyright
> =========
>
> This document has been placed in the public domain.
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

From victor.stinner at gmail.com  Fri Jan 15 12:11:39 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Fri, 15 Jan 2016 18:11:39 +0100
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To:
References:
Message-ID:

I have a fully working implementation of the PEPs 509, 510 and 511
(all together).
You can install it to play with it if you want ;-)

Get and compile the patched (FAT) Python with:
--------------
hg clone http://hg.python.org/sandbox/fatpython
cd fatpython
./configure && make
--------------

Enjoy slow and non-optimized bytecode :-)
-------------
$ ./python -o noopt -c 'import dis; dis.dis(compile("1+1", "test", "exec"))'
  1           0 LOAD_CONST               0 (1)
              3 LOAD_CONST               0 (1)
              6 BINARY_ADD
              7 POP_TOP
              8 LOAD_CONST               1 (None)
             11 RETURN_VALUE
-------------

Ok, now if you want to play with the fat & fatoptimizer modules (FAT
Python):
--------------
./python -m venv ENV
cd ENV
git clone https://github.com/haypo/fat.git
git clone https://github.com/haypo/fatoptimizer.git
(cd fat; ../bin/python setup.py install)
(cd fatoptimizer; ../bin/python setup.py install)
cd ..
--------------

I'm not using a virtual environment for my development; I prefer to
manually copy the fatoptimizer/fatoptimizer/ directory and the built
.so file of the fat module into the Lib/ directory of the standard
library.

If you installed the patched Python into /opt/fatpython (./configure
--prefix=/opt/fatpython && make && sudo make install), you can also use
"python setup.py install" in fat/ and fatoptimizer/ to install them
easily.

The drawback of the virtualenv is that it's easy to use the wrong
python (./python vs ENV/bin/python) and to not have FAT Python enabled
because of http://bugs.python.org/issue26099 which silently ignores
import errors in sitecustomize...

Ensure that FAT Python is enabled with:
--------
$ ./python -X fat -c 'import sys; print(sys.implementation.optim_tag)'
fat-opt
--------

You must get "fat-opt" (and not "opt").

Note: The optimizer tag is "fat-opt" and not "fat" because fatoptimizer
keeps the peephole optimizer.

Enable FAT Python using the "-X fat" command line option:
--------------
$ ENV/bin/python -X fat
>>> def func(): return len("abc")
...
>>> import dis
>>> dis.dis(func)
  1           0 LOAD_GLOBAL              0 (len)
              3 LOAD_CONST               1 ('abc')
              6 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
              9 RETURN_VALUE
>>> import fat
>>> fat.get_specialized(func)
[(<code object func at 0x..., file "<stdin>", line 1>, [])]
>>> dis.dis(fat.get_specialized(func)[0][0])
  1           0 LOAD_CONST               1 (3)
              3 RETURN_VALUE
--------------

Play with microbenchmarks:
---------------
$ ENV/bin/python -m timeit -s 'def f(): return len("abc")' 'f()'
10000000 loops, best of 3: 0.122 usec per loop
$ ENV/bin/python -X fat -m timeit -s 'def f(): return len("abc")' 'f()'
10000000 loops, best of 3: 0.0932 usec per loop
---------------

Oh look! It's faster without having to touch the code ;-)

I'm using Lib/sitecustomize.py to register the optimizer if -X fat is
used:
-------
import sys
if sys._xoptions.get('fat'):
    import fatoptimizer; fatoptimizer._register()
-------

If you want to run optimized code without registering the optimizer, it
doesn't work because the .pyc files are missing:
---
$ ENV/bin/python -o fat-opt
Fatal Python error: Py_Initialize: Unable to get the locale encoding
ImportError: missing AST transformers for '.../Lib/encodings/__init__.py':
optim_tag='fat', transformers tag='noopt'
---

You have to compile optimized .pyc files:
---
# the optimizer is slow, so add -v to enable fatoptimizer logs for more fun
ENV/bin/python -X fat -v -m compileall
# why does compileall not compile encodings/*.py?
ENV/bin/python -X fat -m py_compile
/home/haypo/prog/python/fatpython/Lib/encodings/{__init__,aliases,latin_1,utf_8}.py
---

Finally, enjoy optimized code with no registered optimizer:
---
# hum, maybe use ENV/bin/activate instead of my magic tricks
$ export PYTHONPATH=ENV/lib/python3.6/site-packages/
$ ENV/bin/python -o fat-opt -c 'import sys;
print(sys.implementation.optim_tag, sys.get_code_transformers())'
fat-opt []
---

Remember that you cannot import .py files in this case, only .pyc:
---
$ touch x.py
$ ENV/bin/python -o fat-opt -c 'import x'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ImportError: missing AST transformers for '.../x.py': optim_tag='fat-opt',
transformers tag='noopt'
---

Victor

From brett at python.org  Fri Jan 15 12:22:08 2016
From: brett at python.org (Brett Cannon)
Date: Fri, 15 Jan 2016 17:22:08 +0000
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To:
References:
Message-ID:

On Fri, 15 Jan 2016 at 08:11 Victor Stinner wrote:

> [SNIP]
>
> Optimizer tag
> -------------
>
> Changes:
>
> * Add ``sys.implementation.optim_tag`` (``str``): optimization tag.
>   The default optimization tag is ``'opt'``.
> * Add a new ``-o OPTIM_TAG`` command line option to set
>   ``sys.implementation.optim_tag``.
>
> Changes to ``importlib``:
>
> * ``importlib`` uses ``sys.implementation.optim_tag`` to build the
>   ``.pyc`` filename when importing modules, instead of always using
>   ``opt``. Remove also the special case for the optimizer level ``0``
>   with the default optimizer tag ``'opt'`` to simplify the code.
> * When loading a module, if the ``.pyc`` file is missing but the ``.py``
>   is available, the ``.py`` is only used if the code optimizers have the
>   same optimizer tag as the current tag; otherwise an ``ImportError``
>   exception is raised.
>
> Pseudo-code of a ``use_py()`` function to decide if a ``.py`` file can
> be compiled to import a module::
>
>     def transformers_tag():
>         transformers = sys.get_code_transformers()
>         if not transformers:
>             return 'noopt'
>         return '-'.join(transformer.name
>                         for transformer in transformers)
>
>     def use_py():
>         return (transformers_tag() == sys.implementation.optim_tag)
>
> The order of ``sys.get_code_transformers()`` matters. For example, the
> ``fat`` transformer followed by the ``pythran`` transformer gives the
> optimizer tag ``fat-pythran``.
>
> The behaviour of the ``importlib`` module is unchanged with the default
> optimizer tag (``'opt'``).
>

I just wanted to point out to people that the key part of this PEP is
the change in semantics of `-O` accepting an argument. Without this
change there is no way to cause import to pick up on optimized .pyc
files that you want it to use without abusing pre-existing .pyc
filenames. This also means that everything else is optional. That
doesn't mean it shouldn't be considered, mind you, as it makes using
AST and bytecode transformers more practical. But some `-O` change that
allows user-defined optimization tags is needed for any of this to work
reasonably.

From there it's theoretically possible for someone to write their own
compileall that pre-compiles all Python code to .pyc files with a
specific optimization tag which they specify with `-O` using their own
AST and bytecode transformers and hence not need the transformation
features built into sys/import.
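Rough sketch of what I mean, using the PEP's proposed sys API
(``MyTransformer`` here is a hypothetical transformer whose ``name`` is
``"mytag"``):

import compileall
import sys

# MyTransformer is hypothetical: any object with a `name` attribute
# ("mytag") and ast_transformer()/code_transformer() methods.
sys.set_code_transformers([MyTransformer()])

# Per the PEP, sys.implementation.optim_tag is now "mytag", so this
# writes e.g. __pycache__/foo.cpython-36.mytag-0.pyc files.
compileall.compile_dir('myproject/')

Running `python -o mytag` afterwards would then import those files
without needing the transformers to be installed.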
I should also point out that this does get tricky in terms of how to
handle the stdlib if you have not pre-compiled it, e.g., if the first
module imported by Python is the encodings module, then how do you make
sure the AST optimizers are ready to go by the time that import
happens?

And lastly, Victor proposes that all .pyc files get an optimization
tag. While there is nothing technically wrong with that, PEP 488
purposefully didn't do that in the default case for
backwards-compatibility, so that will need to be at least mentioned in
the PEP.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From victor.stinner at gmail.com  Fri Jan 15 12:40:13 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Fri, 15 Jan 2016 18:40:13 +0100
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To:
References:
Message-ID:

2016-01-15 18:22 GMT+01:00 Brett Cannon :
> I just wanted to point out to people that the key part of this PEP is the
> change in semantics of `-O` accepting an argument.

To be exact, it's a new "-o arg" option; it's different from -O and -OO
(uppercase). Since I don't know what to do with -O and -OO, I simply
kept them :-D

> I should also point out that this does get tricky in terms of how to handle
> the stdlib if you have not pre-compiled it, e.g., if the first module
> imported by Python is the encodings module then how to make sure the AST
> optimizers are ready to go by the time that import happens?

Since importlib reads sys.implementation.optim_tag at each import, it
works fine. For example, you start with the "opt" optimizer tag. You
import everything needed for fatoptimizer. Then calling
sys.set_code_transformers() will set a new optimizer tag (ex:
"fat-opt"). But it works since the required code transformers are now
available.

The tricky part is more when you want to deploy an application without
the code transformer: you have to ensure that all .py files are
compiled to .pyc. But there are no technical issues to compile them,
it's more a practical issue. See my second email with a lot of
commands, where I showed how the .pyc files are created with different
.pyc filenames. Or follow my commands to try my "fatpython" fork and
play with the code yourself ;-)

> And lastly, Victor proposes that all .pyc files get an optimization tag.
> While there is nothing technically wrong with that, PEP 488 purposefully
> didn't do that in the default case for backwards-compatibility, so that will
> need to be at least mentioned in the PEP.

The PEP already contains:
https://www.python.org/dev/peps/pep-0511/#optimizer-tag
"Remove also the special case for the optimizer level 0 with the
default optimizer tag 'opt' to simplify the code."

Code relying on the exact .pyc filename (like unit tests) already has
to be modified to use the optimizer tag. It's just an opportunity to
simplify the code. I don't really care about this specific change ;-)

Victor

From ethan at stoneleaf.us  Fri Jan 15 13:22:56 2016
From: ethan at stoneleaf.us (Ethan Furman)
Date: Fri, 15 Jan 2016 10:22:56 -0800
Subject: [Python-ideas] Boolean value of an Enum member
Message-ID: <56993900.60105@stoneleaf.us>

When Enum was being designed one of the questions considered was where
to start autonumbering: zero or one.

As I remember the discussion we chose not to start with zero because we
didn't want an enum member to be False by default, and having a member
with value 0 be True was discordant. So the functional API starts with
1 unless overridden.
In fact, according to the Enum docs: The reason for defaulting to ``1`` as the starting number and not ``0`` is that ``0`` is ``False`` in a boolean sense, but enum members all evaluate to ``True``. However, if the Enum is combined with some other type (str, int, float, etc), then most behaviour is determined by that type -- including boolean evaluation. So the empty string, 0 values, etc, will cause that Enum member to evaluate as False. So the question now is: for a standard Enum (meaning no other type besides Enum is involved) should __bool__ look to the value of the Enum member to determine True/False, or should we always be True by default and make the Enum creator add their own __bool__ if they want something different? On the one hand we have backwards compatibility, which will take a version to change. On the other hand we have a pretty basic difference in how zero/empty is handled between "pure" Enums and "mixed" Enums. On the gripping hand we have . . . Please respond with your thoughts on changing pure Enums to match mixed Enums or any experience you have had with relying on the "always True" behaviour or if you have implemented your own __bool__ to match the standard True/False meanings or if you have implemented your own __bool__ to match some other scheme entirely. -- ~Ethan~ From guido at python.org Fri Jan 15 13:28:40 2016 From: guido at python.org (Guido van Rossum) Date: Fri, 15 Jan 2016 10:28:40 -0800 Subject: [Python-ideas] Boolean value of an Enum member In-Reply-To: <56993900.60105@stoneleaf.us> References: <56993900.60105@stoneleaf.us> Message-ID: Honestly I think it's too late to change. The proposal to change plain Enums to False when their value is zero (or falsey) would be a huge backward incompatibility. I don't think there's a reasonable path forward, and also don't think there's a big reason to regret the current semantics. On Fri, Jan 15, 2016 at 10:22 AM, Ethan Furman wrote: > When Enum was being designed one of the questions considered was where to > start autonumbering: zero or one. > > As I remember the discussion we chose not to start with zero because we > didn't want an enum member to be False by default, and having a member with > value 0 be True was discordant. So the functional API starts with 1 unless > overridden. In fact, according to the Enum docs: > > The reason for defaulting to ``1`` as the starting number and > not ``0`` is that ``0`` is ``False`` in a boolean sense, but > enum members all evaluate to ``True``. > > However, if the Enum is combined with some other type (str, int, float, > etc), then most behaviour is determined by that type -- including boolean > evaluation. So the empty string, 0 values, etc, will cause that Enum > member to evaluate as False. > > So the question now is: for a standard Enum (meaning no other type > besides Enum is involved) should __bool__ look to the value of the Enum > member to determine True/False, or should we always be True by default and > make the Enum creator add their own __bool__ if they want something > different? > > On the one hand we have backwards compatibility, which will take a version > to change. > > On the other hand we have a pretty basic difference in how zero/empty is > handled between "pure" Enums and "mixed" Enums. > > On the gripping hand we have . . . 
> > Please respond with your thoughts on changing pure Enums to match mixed > Enums or any experience you have had with relying on the "always True" > behaviour or if you have implemented your own __bool__ to match the > standard True/False meanings or if you have implemented your own __bool__ > to match some other scheme entirely. > > -- > ~Ethan~ > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From barry at python.org Fri Jan 15 13:32:58 2016 From: barry at python.org (Barry Warsaw) Date: Fri, 15 Jan 2016 13:32:58 -0500 Subject: [Python-ideas] [Python-Dev] Boolean value of an Enum member In-Reply-To: <56993900.60105@stoneleaf.us> References: <56993900.60105@stoneleaf.us> Message-ID: <20160115133258.3ca202dd@limelight.wooz.org> On Jan 15, 2016, at 10:22 AM, Ethan Furman wrote: >So the question now is: for a standard Enum (meaning no other type besides >Enum is involved) should __bool__ look to the value of the Enum member to >determine True/False, or should we always be True by default and make the >Enum creator add their own __bool__ if they want something different? The latter. I think in general enums are primarily a symbolic value and don't have truthiness. It's also so easy to override when you define the enum that it's not worth changing the current behavior. Cheers, -Barry From abarnert at yahoo.com Fri Jan 15 14:41:15 2016 From: abarnert at yahoo.com (Andrew Barnert) Date: Fri, 15 Jan 2016 13:41:15 -0600 Subject: [Python-ideas] PEP 511: API for code transformers In-Reply-To: References: Message-ID: On Jan 15, 2016, at 10:10, Victor Stinner wrote: > > This PEP 511 is part of a serie of 3 PEP (509, 510, 511) adding an API > to implement a static Python optimizer specializing functions with > guards. Some thoughts (and I realize that for many of these the answer will just be "that's out of scope for this PEP"): * You can register transformers in any order, and they're run in the order specified, first all the AST transformers, then all the code transformers. That's very weird; it seems like it would be conceptually simpler to have a list of AST transformers, then a separate list of code transformers. * Why are transformers objects with ast_transformer and code_transformer methods, but those methods don't take self? (Are they automatically static methods, like __new__?) It seems like the only advantage to require attaching them to a class is to associate each one with a name; surely there's a simpler way to do that. And is there ever a good use case for putting both in the same class, given that the code transformer isn't going to run on the output of the AST transformer but rather on the output of all subsequent AST transformers and all preceding code transformers? Why not just let them be functions, and use the function name (or maybe have a separate attribute to override that, which a simple decorator can apply)? * Why does the code transformer only take consts and names? Surely you need varnames, and many of the other properties of code objects. And what's the use of lnotab if you can't set the base file and line? In fact, why not just pass a code object? * It seems like 99% of all ast_transformer methods are just going to construct and apply an ast.NodeTransformer subclass. 
Why not just register the NodeTransformer subclass? * The way it's written, it sounds like the main advantage of your proposal is that it makes it easier to write optimizations that need guards. But it also makes it easier to write the same kinds of optimizations that are already possible but a bit painful. It might be worth rewording a bit to make that clearer. * There are other reasons to write AST and bytecode transformations besides optimization. MacroPy, which you mentioned, is an obvious example. But also, playing with new ideas for Python is a lot easier if you can do most of it with a simple hook that only makes you deal with the level you care about, rather than hacking up everything from the grammar to the interpreter. So, that's an additional benefit you might want to mention in your proposal. * In fact, I think this PEP could be useful even if the other two were rejected, if rewritten a bit. * It might be useful to have an API that handled bytes and text (and tokens, but that requires refactoring the token stream API, which is a separate project) as well as AST and bytecode. For example, some language extensions add things that can't be parsed as a valid Python AST. This is particularly an issue when playing with new feature ideas. In some cases, a simple text preprocessor can convert it into code which can be compiled into AST nodes that you can then transform the way you want. At present, with import hooks being the best way to do any of these, there's no disparity that makes text transforms harder than AST transforms. But if we're going to have transformer objects with code_transformer and ast_transformer methods, but a text preprocessor still requires an import hook, that seems unfortunate. Is there a reason you can't add text_transformer as well? (And maybe bytes_transformer. And this would open the door to later add token_transformer in the same place--and for now, you can call tokenize, untokenize, and tokenize again inside a text_transformer.) * I like that I can now compile to PyCF_AST or to PyCF_TRANSFORMED_AST. But can I call compile with an untransformed AST and the PyCF_TRANSFORMED_AST flag? This would be useful if I had some things that still worked via import hook--I could choose whether to hook in before or after the standard/registered set--e.g., if I'm using a text transformer, or a CPython compiled with a hacked-up grammar that generates dummy AST nodes for new language productions, I may want to then transform those to real nodes before the optimizers get to them. (This would be less necessary if we had text-transformer.) * It seems like doing any non-trivial bytecode transforms will still require a third-party library like byteplay (which has trailed 2.6, 2.7, 3.x in general, and each new 3.x version by anywhere from 3 months to 4 years). Have you considered integrating some of that functionality into Python itself? Even if that's out of scope, a paragraph explaining how to use byteplay with a code_transformer, and why it isn't integrated into the proposal, might be helpful. * One thing I've always wanted is a way to write decorators that transform at the AST level. But code objects only have bytecode and source; you have to manually recompile the source--making sure to use the same flags, globals, etc.--to get back to the AST. I think that will become even more of a problem now that you need separate ways to get the "basic" parse and the "post-all-installed-transformations" parse. 
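Concretely, the best you can do today is a fragile sketch like the
following--re-parse the source with inspect and recompile it in the
function's globals, ignoring compiler flags, closures, and stacked
decorators:

import ast
import functools
import inspect
import textwrap

def ast_transform(transformer):
    # Decorator factory: re-parse the function's source, run an
    # ast.NodeTransformer over it, and recompile the result in the
    # original globals.
    def decorator(func):
        source = textwrap.dedent(inspect.getsource(func))
        tree = ast.parse(source)
        # strip the decorator list so the re-executed "def" doesn't
        # apply this decorator again
        tree.body[0].decorator_list = []
        tree = ast.fix_missing_locations(transformer.visit(tree))
        namespace = {}
        exec(compile(tree, '<ast_transform>', 'exec'),
             func.__globals__, namespace)
        return functools.wraps(func)(namespace[func.__name__])
    return decorator

And even that only works when the source is available and no special
compile flags were used, which is exactly the problem.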
Maybe this would be out of scope for your project, but having some way
to access these rather than rebuild them could be very cool.

> [SNIP]
From yselivanov.ml at gmail.com  Fri Jan 15 15:39:53 2016
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Fri, 15 Jan 2016 15:39:53 -0500
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To:
References:
Message-ID: <56995919.3080402@gmail.com>

Hi Victor,

On 2016-01-15 11:10 AM, Victor Stinner wrote:
> Hi,
>
> This PEP 511 is part of a serie of 3 PEP (509, 510, 511) adding an API
> to implement a static Python optimizer specializing functions with
> guards.

All your PEPs are very interesting, thanks for your hard work!
I'm very happy to see that we're trying to make CPython faster.
There are some comments below:

>
> If the PEP is accepted, it will solve a long list of issues, some
> issues are old, like #1346238 which is 11 years old ;-) I found 12
> issues:
>
> * http://bugs.python.org/issue1346238
> * http://bugs.python.org/issue2181
> * http://bugs.python.org/issue2499
> * http://bugs.python.org/issue2506
> * http://bugs.python.org/issue4264
> * http://bugs.python.org/issue7682
> * http://bugs.python.org/issue10399
> * http://bugs.python.org/issue11549
> * http://bugs.python.org/issue17068
> * http://bugs.python.org/issue17430
> * http://bugs.python.org/issue17515
> * http://bugs.python.org/issue26107

It's important to say that all of those issues (except 2506) are not
bugs, but proposals to implement some nano- and micro-optimizations.
Issue 2506 is about having an option to disable the peephole
optimizer, which is a very narrow subset of what PEP 511 proposes to
add.

[..]
> Usage 2: Preprocessor
> ---------------------
>
> A preprocessor can be easily implemented with an AST transformer. A
> preprocessor has various and different usages.
>
> Some examples:
>
> * Remove debug code like assertions and logs to make the code faster to
>   run it for production.
> * `Tail-call Optimization `_
> * Add profiling code
> * `Lazy evaluation `_:
>   see `lazy_python `_
>   (bytecode transformer) and `lazy macro of MacroPy
>   `_ (AST transformer)
> * Change dictionary literals into collection.OrderedDict instances
> * Declare constants: see `@asconstants of codetransformer
>   `_
> * Domain Specific Language (DSL) like SQL queries. The
>   Python language itself doesn't need to be modified. Previous attempts
>   to implement DSL for SQL like `PEP 335 - Overloadable Boolean
>   Operators `_ were rejected.
> * Pattern Matching of functional languages
> * String Interpolation, but `PEP 498 -- Literal String Interpolation
>   `_ was merged into Python
>   3.6.

[..]

I think that most of those examples are rather weak. Things like
tail-call optimizations, constant declarations, pattern matching,
case classes (from MacroPy) are nice concepts, but they should be
either directly implemented in the Python language or not used at all
(IMHO).

Things like auto-changing dictionary literals to OrderedDict
objects or in-Python DSLs will only help in creating a hard-to-maintain
code base. I say this because I have first-hand experience with
decorators that patch opcodes, and import hooks that rewrite the AST.
When you get back to your code years after it was written, you usually
regret doing those things.

All in all, I think that adding a blessed API for preprocessors
shouldn't be a focus of this PEP. MacroPy works right now with
importlib, and I think it's a good solution for it.
I propose to only expose the new APIs on the C level, and explicitly
mark them as provisional and experimental. It should be clear that
those APIs are only for *writing optimizers*, and nothing else.

[off-topic] I do think that having a macro system similar to Rust's
might be a good idea. However, macros in Rust have an explicit and
distinct syntax, and they have the necessary level of documentation and
tooling. But this is a separate matter deserving its own PEP ;)

[..]
> Usage 4: Write new bytecode optimizers in Python
> ------------------------------------------------
>
> Python 3.6 optimizes the code using a peephole optimizer. By
> definition, a peephole optimizer has a narrow view of the code and so
> can only implement basic optimizations. The optimizer rewrites the
> bytecode. It is difficult to enhance it, because it is written in C.
>
> With this PEP, it becomes possible to implement a new bytecode optimizer
> in pure Python and experiment with new optimizations.
>
> Some optimizations are easier to implement on the AST, like constant
> folding, but optimizations on the bytecode are still useful. For
> example, when the AST is compiled to bytecode, useless jumps can be
> emitted because the compiler is naive and does not try to optimize
> anything.
[..]

Would it be possible to (or does it make any sense):

1. Add new APIs for AST transformers (only exposed on the C level!)
2. Remove the peephole optimizer.
3. Re-implement the peephole optimizer using the new APIs in CPython
   (peephole does some very basic optimizations).
4. Implement other basic optimizations (like limited constant folding)
   in CPython.
5. Leave the door open for you and other people to add more AST
   optimizers (so that FAT isn't locked to CPython's slow release
   cycle)?

I also want to say this: I'm -1 on implementing all three PEPs until
we see that FAT is able to give us at least a 10% performance
improvement on macro-benchmarks. We still have several months before
3.6beta to see if that's possible.

Thanks,
Yury

From victor.stinner at gmail.com  Fri Jan 15 16:14:17 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Fri, 15 Jan 2016 22:14:17 +0100
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To:
References:
Message-ID:

Wow, giant emails (like mine, ok).

2016-01-15 20:41 GMT+01:00 Andrew Barnert :
> * You can register transformers in any order, and they're run in the
> order specified, first all the AST transformers, then all the code
> transformers. That's very weird; it seems like it would be conceptually
> simpler to have a list of AST transformers, then a separate list of
> code transformers.

The goal is to have a short optimizer tag. I'm not sure yet that it
makes sense, but I would like to be able to transform the AST and the
bytecode in a single code transformer. I prefer to add a single pair of
get/set functions to sys, instead of two (4 new functions).

> * Why are transformers objects with ast_transformer and
> code_transformer methods, but those methods don't take self?

They take self. It's just a formatting issue (a mistake in the PEP) :-)
They take a self parameter, see the examples:
https://www.python.org/dev/peps/pep-0511/#bytecode-transformer

It's just hard to format a PEP correctly when you know Sphinx :-) I
started to use ".. method:: ..." but it doesn't work; PEPs use the
simpler reST format ;-)

> It seems like the only advantage to require attaching them to a class
> is to associate each one with a name

I started with a function, but it's a little bit weird to set a name
attribute on a function (func.name = "fat").
Moreover, it's convenient to store some data in the object. In
fatoptimizer, I store the configuration. Even in the simplest AST
transformer example of the PEP, the constructor creates an object:
https://www.python.org/dev/peps/pep-0511/#id1

It may be possible to use functions, but classes are just more
"natural" in Python.

> And is there ever a good use case for putting both in the same class,
> given that the code transformer isn't going to run on the output of
> the AST transformer but rather on the output of all subsequent AST
> transformers and all preceding code transformers?

The two methods are disconnected, but they are linked by the optimizer
tag. IMHO it makes sense to implement all optimizations (crazy stuff on
the AST, a simple optimizer like peephole on the bytecode) in a single
code transformer. It avoids using a long optimizer tag like
"fat_ast-fat_bytecode". I also like short filenames.

> * Why does the code transformer only take consts and names? Surely
> you need varnames, and many of the other properties of code objects.
> And what's the use of lnotab if you can't set the base file and line?
> In fact, why not just pass a code object?

To be honest, I don't feel comfortable with a function taking 5
parameters which has to return a tuple of 4 items :-/ Especially if
it's only the first version, we may have to add more items.

The code_transformer() API comes from the PyCode_Optimize() API: the
CPython peephole optimizer.

    PyAPI_FUNC(PyObject*) PyCode_Optimize(PyObject *code,
        PyObject* consts, PyObject *names, PyObject *lnotab);

The function modifies lnotab in-place and returns the modified code.

Passing a whole code object makes the API much simpler, and code
objects contain all the information. I take your suggestion, thanks.

> * It seems like 99% of all ast_transformer methods are just going to
> construct and apply an ast.NodeTransformer subclass. Why not just
> register the NodeTransformer subclass?

fatoptimizer doesn't use ast.NodeTransformer ;-)

ast.NodeTransformer has a naive and inefficient design. For example,
fatoptimizer uses a metaclass to only create the mapping of visitors
(visit_xxx methods) once. My transformer copies modified nodes to leave
the input tree unchanged. I need this to be able to duplicate a tree
later (to specialize functions).

(Maybe I can propose to enhance ast.NodeTransformer, but that's a
different topic.)
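To give you an idea, here is a minimal sketch of the cached-dispatch
pattern (it is not the real fatoptimizer code):

import ast

class VisitorMeta(type):
    # Build the "node type name -> method name" table once per class,
    # instead of rebuilding it for every visit() call
    def __new__(mcs, name, bases, namespace):
        cls = super().__new__(mcs, name, bases, namespace)
        cls._visitors = {attr[len('visit_'):]: attr
                         for attr in dir(cls)
                         if attr.startswith('visit_')}
        return cls

class FastVisitor(metaclass=VisitorMeta):
    def visit(self, node):
        method = self._visitors.get(type(node).__name__)
        if method is not None:
            return getattr(self, method)(node)
        # no specific visitor: visit the children
        for child in ast.iter_child_nodes(node):
            self.visit(child)

class StrFinder(FastVisitor):
    def visit_Str(self, node):
        print("found %r" % node.s)

StrFinder().visit(ast.parse("x = 'hello'"))
# => found 'hello'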
Even alone, PEP 509 can be used to implement the "copy globals to locals/constants" optimization mentioned in the PEP (at least two developers proposed changes to implement it! It was also in the Unladen Swallow plans).

> * It might be useful to have an API that handled bytes and text (and tokens, but that requires refactoring the token stream API, which is a separate project) as well as AST and bytecode.
> (...)
> Is there a reason you can't add text_transformer as well?

I don't know this part of the compiler.

Does Python already have an API to manipulate tokens, etc.? What about other Python implementations?

I proposed AST transformers because they are already commonly used in the wild. I also proposed bytecode transformers to replace the peephole optimizer: make it optional and maybe implement a new one (in Python, to be more easily maintainable?).

The Hy language uses its own parser and emits Python AST. Why not use this design?

> (...) e.g., if I'm using a text transformer, (...)

IMHO you are going too far and it becomes out of the scope of the PEP. You should also read the previous discussion: https://mail.python.org/pipermail/python-dev/2012-August/121309.html

> * It seems like doing any non-trivial bytecode transforms will still require a third-party library like byteplay (which has trailed 2.6, 2.7, 3.x in general, and each new 3.x version by anywhere from 3 months to 4 years). Have you considered integrating some of that functionality into Python itself?

To be honest, right now I'm focused on fatoptimizer. I don't want to integrate it in the stdlib because:

* it's incomplete: see the giant https://fatoptimizer.readthedocs.org/en/latest/todo.html list if you are bored
* the stdlib is moving ... is not really moving... well, the development process is way too slow for such a very young project
* fatoptimizer still changes the Python semantics in subtle ways which should be tested in large applications and discussed point by point
* etc.

It's way too early to discuss that (at least for fatoptimizer). Since pip has become standard, I don't think that it's a real issue in practice.

> Even if that's out of scope, a paragraph explaining how to use byteplay with a code_transformer, and why it isn't integrated into the proposal, might be helpful.

byteplay doesn't seem to be maintained anymore. Last commit in 2010...

IMHO you can do the same as byteplay on the AST with much simpler code. I only mentioned some projects modifying bytecode to pick up ideas of what can be done with a code transformer. I don't think that it's worth adding more examples beyond the two "Ni! Ni! Ni!" examples.

> * One thing I've always wanted is a way to write decorators that transform at the AST level. But code objects only have bytecode and source;

You should take a look at MacroPy, it looks like it has some crazy stuff to modify the AST and compile at runtime. I'm not sure, I never used MacroPy, I only read its documentation to generalize my PEP ;-)

Modifying and recompiling the code at runtime (using the AST, something higher level than bytecode) sounds like a Lisp feature and like a JIT compiler, two cool things ;)

Victor

From victor.stinner at gmail.com Fri Jan 15 17:16:38 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Fri, 15 Jan 2016 23:16:38 +0100
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <56995919.3080402@gmail.com> References: <56995919.3080402@gmail.com> Message-ID: 

2016-01-15 21:39 GMT+01:00 Yury Selivanov :
> All your PEPs are very interesting, thanks for your hard work!
> I'm very happy to see that we're trying to make CPython faster.

Thanks.

> It's important to say that all of those issues (except 2506) are not bugs, but proposals to implement some nano- and micro-optimizations.

Hum, let me see.

>> * http://bugs.python.org/issue1346238

https://bugs.python.org/issue11549

"A constant folding optimization pass for the AST" & "Build-out an AST optimizer, moving some functionality out of the peephole optimizer"

Well, that's a way to start working on larger optimizations. Anyway, the peephole optimizer has many limits. Raymond Hettinger keeps repeating that it was designed to be simple and limited, and each time he suggested reimplementing the peephole optimizer in pure Python (as I'm proposing). On the AST, we can do much better than just 1+1, even without changing the Python semantics. But I agree that speedups are minor on such changes. Without specialization and guards, you are limited.

>> * http://bugs.python.org/issue2181

"optimize out local variables at end of function"

Alone, this optimization is not really interesting. But other optimizations can produce inefficient code. Example with loop unrolling:

    for i in range(2):
        print(i)

is replaced with:

    i = 0
    print(i)
    i = 1
    print(i)

with constant propagation, it becomes:

    i = 0
    print(0)
    i = 1
    print(1)

at this point, the i variable becomes useless and can be removed with the optimization mentioned in http://bugs.python.org/issue2181:

    print(0)
    print(1)

>> * http://bugs.python.org/issue10399

"AST Optimization: inlining of function calls"

IMHO this one is really interesting. But again, not alone: only when combined with other optimizations.

>> Usage 2: Preprocessor
>> ---------------------
>>
>> A preprocessor can be easily implemented with an AST transformer. A preprocessor has various and different usages.

> [..]
>
> I think that most of those examples are rather weak. Things like tail-call optimizations, constants declarations, pattern matching, case classes (from MacroPy) are nice concepts, but they should be either directly implemented in the Python language or not used at all (IMHO).

At least, it allows experimenting with new things. If a transformer becomes popular, we can start to discuss integrating it into Python.

About tail recursion, I recall that Guido wrote something about it: http://neopythonic.blogspot.fr/2009/04/tail-recursion-elimination.html

I found a lot of code transformer projects. I understand that there is a real need. In a previous job, we used a text preprocessor to remove all calls to log.debug() before releasing the code to production. It was in the embedded world (set top boxes), where performance matters. The preprocessor was based on long and unreliable regular expressions. I would prefer to use the AST for that. That's my first item in the list: "Remove debug code like assertions and logs to make the code faster to run it for production."

> Things like auto-changing dictionary literals to OrderedDict objects or in-Python DSLs will only help in creating a hard-to-maintain code base. I say this because I have first-hand experience with decorators that patch opcodes, and import hooks that rewrite AST. When you get back to your code years after it was written, you usually regret doing those things.
When the f-string PEP was discussed, I was strongly opposed to allowing *any* Python expression in f-strings. But Guido said that the language designers must not restrict users. Well, something like that; I'm probably misusing his quote ;-)

> All in all, I think that adding a blessed API for preprocessors shouldn't be a focus of this PEP. MacroPy works right now with importlib, and I think it's a good solution for it.

Do you mean that we should add the feature but add a warning in the doc like "don't use it for evil things"? I don't think that we can forbid specific usages of an API. The only strong solution to ensure that users will not misuse an API is to not add the API (reject the PEP) :-) So I chose instead to document the different kinds of usage of code transformers, just to know how they can be used.

> I propose to only expose new APIs on the C level, and explicitly mark them as provisional and experimental. It should be clear that those APIs are only for *writing optimizers*, and nothing else.

Currently, the PEP adds:

* -o OPTIM_TAG command line option
* sys.implementation.optim_tag
* sys.get_code_transformers()
* sys.set_code_transformers(transformers)
* ast.Constant
* ast.PyCF_TRANSFORMED_AST

importlib uses sys.implementation.optim_tag and sys.get_code_transformers(). *If* we want to remove them, we should find a way to expose this information to importlib.

I really like ast.Constant, I would like to add it, but it's really a minor part of the PEP. I don't think that it's controversial.

PyCF_TRANSFORMED_AST can only be exposed at the C level.

The "-o OPTIM_TAG" command line option is a shortcut to set sys.implementation.optim_tag. optim_tag can be set manually. But the problem is being able to set the optim_tag before the first Python module is imported. It doesn't seem easy to avoid this change. According to Brett, the whole PEP can be simplified to this single command line option :-)

> [off-topic] I do think that having a macro system similar to Rust might be a good idea. However, macros in Rust have explicit and distinct syntax, they have the necessary level of documentation and tooling. But this is a separate matter deserving its own PEP ;)

I agree that extending the Python syntax is out of the scope of PEP 511.

> Would it be possible to (or does it make any sense):
>
> 1. Add new APIs for AST transformers (only exposed on the C level!)
>
> 2. Remove the peephole optimizer.

FYI my fatoptimizer is quite slow. But it implements a lot of optimizations, many more than the Python peephole optimizer. I fear that the conversions are expensive:

* AST (light) internal objects => Python (heavy) AST objects
* (run AST optimizers implemented in Python)
* Python (heavy) AST objects => AST (light) internal objects

So in the near future, I prefer to keep the peephole optimizer implemented in C. The performance of the optimizer itself matters when you run a short script using "python script.py" (without ahead-of-time compilation).

> I also want to say this: I'm -1 on implementing all three PEPs until we see that FAT is able to give us at least a 10% performance improvement on micro-benchmarks. We still have several months before 3.6beta to see if that's possible.

I prefer not to start benchmarking fatoptimizer yet, because I spent 3 months just designing the API, fixing bugs, etc. I only spent a small fraction of that time on writing optimizations. I expect significant speedups with more optimizations like function inlining.
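To illustrate what I mean by function inlining, here is a hand-written before/after sketch (not actual fatoptimizer output):

    # before
    def add_one(x):
        return x + 1

    def f(y):
        return add_one(y) * 2

    # after inlining: only valid as long as add_one keeps its current
    # definition, which is exactly why specialization needs guards
    def f(y):
        return (y + 1) * 2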
If you are curious, take a look at the todo list: https://fatoptimizer.readthedocs.org/en/latest/todo.html

I understand that an optimizer which does not produce faster code is not really interesting. My PEPs propose many changes which would become part of the public API and have to be maintained later. I already changed PEPs 509 and 510 to make the changes private (only visible in the C API).

Victor

From abarnert at yahoo.com Fri Jan 15 17:57:09 2016
From: abarnert at yahoo.com (Andrew Barnert)
Date: Fri, 15 Jan 2016 16:57:09 -0600
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: References: Message-ID: <6BF769D9-3684-4FEF-9CD5-20D84A2989BF@yahoo.com>

Sent from my iPhone

> On Jan 15, 2016, at 15:14, Victor Stinner wrote:
>
> Wow, giant emails (like mine, OK).

Well, this is a big idea, so it needs a big breakfast. I mean a big email. :) But fortunately, you had great answers to most of my points, which means I can snip them out of this reply and make it not quite as giant.

> 2016-01-15 20:41 GMT+01:00 Andrew Barnert :
>> * You can register transformers in any order, and they're run in the order specified, first all the AST transformers, then all the code transformers. That's very weird; it seems like it would be conceptually simpler to have a list of AST transformers, then a separate list of code transformers.
>
> The goal is to have a short optimizer tag. I'm not sure yet that it makes sense, but I would like to be able to transform the AST and the bytecode in a single code transformer.

But that doesn't work as soon as there are even two of them: the bytecode #0 no longer runs after ast #0, but after ast #1; similarly, bytecode #1 no longer runs after ast #1, but after bytecode #0. So, it seems like whatever benefits you get by keeping them coupled will be illusory.

> I prefer to add a single pair of get/set functions to sys, instead of two pairs (4 new functions).

That's a good point. (I suppose you could have a pair of get/set functions that each set multiple lists instead of one, but that isn't really any simpler than multiple get/set functions...)

>> It seems like the only advantage of requiring attaching them to a class is to associate each one with a name
>
> I started with a function, but it's a little bit weird to set a name attribute on a function (func.name = "fat").

It looks a lot less weird with a decorator `@transform('fat')` that sets it for you.

> Moreover, it's convenient to store some data in the object. In fatoptimizer, I store the configuration. Even in the most simple AST transformer example of the PEP, the constructor creates an object:
> https://www.python.org/dev/peps/pep-0511/#id1
>
> It may be possible to use functions, but classes are just more "natural" in Python.

In general, sure. But for data that isn't accessible from outside, and only needs to be used in a single call, a simple function (with the option of wrapping data in a closure) can be simpler. That's why so many decorators are functions that return a closure, not classes that build an object with a __call__ method. But more specifically to this case, after looking over your examples, maybe the class makes sense here.

>> * There are other reasons to write AST and bytecode transformations besides optimization. MacroPy, which you mentioned, is an obvious example. But also, playing with new ideas for Python is a lot easier if you can do most of it with a simple hook that only makes you deal with the level you care about, rather than hacking up everything from the grammar to the interpreter.
>> So, that's an additional benefit you might want to mention in your proposal.
>
> I wrote "A preprocessor has various and different usages." Maybe I can elaborate :-)

Sure. It's just a matter of emphasis, and whether more of it would help sell your idea or not. From the other big reply you got, maybe it would even hurt selling it... So, your call.

> It looks like it is possible to "implement" f-strings (PEP 498) using macros. I think that it's a good example of experimenting with evolutions of the language (without having to modify the C code, which is much more complex; Yury Selivanov may want to share his experience from the async/await PEP here ;-)).

I did an experiment last year where I tried to add the same feature two ways (Haskell-style operator partials, so you can write `(* 2)` instead of `lambda x: x * 2` or `rpartial(mul, 2)` or whatever). First, I did all the steps to add it "for real", from the grammar through to the code generator. Second, I added a quick grammar hack to create a noop AST node, then did everything else in Python with an import hook--preprocessing the text to get the noop nodes, then preprocessing the AST to turn those into nodes that do the intended semantics.

As you might expect, the second version took a lot less time, required debugging a lot fewer segfaults, etc. And if your proposal removed the need for the import hook, it would be even simpler (and cleaner, too).

>> * It might be useful to have an API that handled bytes and text (and tokens, but that requires refactoring the token stream API, which is a separate project) as well as AST and bytecode.
>> (...)
>> Is there a reason you can't add text_transformer as well?
>
> I don't know this part of the compiler.
>
> Does Python already have an API to manipulate tokens, etc.? What about other Python implementations?

Well, Python does have an API to manipulate tokens, but it involves manually tokenizing the text, modifying the token stream, untokenizing it back to text, and then parsing and compiling the result, which is far from ideal. (In fact, in some cases you even need to encode back to bytes.) There's an open enhancement issue to make it easier to write token processors. But don't worry about that part for now. A text preprocessor step should be very easy to add, and useful on its own (and it opens the door for adding a token preprocessor between text and AST in the future when that becomes feasible).

I also mentioned a bytes preprocessor, which could munge the bytes before the decoding to text. But that seems a lot less useful. (Maybe if you needed an alternative to the coding-declaration syntax for some reason?) I only included it because it's another layer you can hook in an import hook today, so it seems like if it is left out, that should be an intentional decision, not just something nobody thought about.

> I proposed AST transformers because they are already commonly used in the wild.

Text preprocessors are also used in the wild. IIRC, Guido mentioned having written one that turns Python 3-style annotations into something that compiles as legal Python 2.7 (although he later abandoned it, because it turned out to be too hard to integrate with their other Python 2 tools). (Token preprocessors are not used much in the wild, because it's painful to write them, nor are bytes preprocessors, because they're not that useful.)

> The Hy language uses its own parser and emits Python AST. Why not use this design?
By the same token, why not use your own code generator and emit Python bytecode, instead of just preprocessing ASTs? If you're making a radical change, that makes sense. But for most uses, where you only want to make a small change on top of the normal processing, it makes a lot more sense to just hook the normal processing than to completely reproduce everything it does.

>> Even if that's out of scope, a paragraph explaining how to use byteplay with a code_transformer, and why it isn't integrated into the proposal, might be helpful.
>
> byteplay doesn't seem to be maintained anymore. Last commit in 2010...

There's a byteplay3 fork, which is maintained. But it doesn't support 3.5 yet. (As I mentioned, it's usually a few months to a few years behind each new Python release. Which is one reason integrating parts of it into the core might be nice. The dis module changes in 3.4 were basically integrating part of byteplay, and that part has paid off--the code in dis is automatically up to date with the compiler. There may be more you could do here. But probably it's out of scope for your project.)

> IMHO you can do the same as byteplay on the AST with much simpler code.

If that's really true, then you shouldn't include code_transformers in the PEP at all. You're just making things more complicated, in multiple ways, to enable a feature you don't think anyone will ever need.

However, based on my own experience, I think code transformers _are_ sometimes useful, but they usually require something like byteplay. Even just something as simple as removing an unnecessary jump instruction requires reordering the arguments of every other jump; something like merging two finally blocks would be a nightmare to do manually.

>> * One thing I've always wanted is a way to write decorators that transform at the AST level. But code objects only have bytecode and source;
>
> You should take a look at MacroPy,

Yes, I love MacroPy. But it doesn't provide the functionality I'm asking about here. (It _might_ be possible to write a macro that stores the AST on each function object; I haven't tried.)

Anyway, the reason I bring it up is that it's trivial to write a decorator that byteplay-hacks a function after compilation, and not much harder to write one that text-hacks the source and recompiles it, but taking the AST and recompiling it is more painful. Since your proposal is about making similar things easier in other cases, it could be nice to do that here as well.

But, as I said at the top, I realize some of these ideas are out of scope; some of them are more about getting a definite "yeah, that might be cool but it's out of scope" as opposed to not knowing whether it had even been considered.

> Modifying and recompiling the code at runtime (using the AST, something higher level than bytecode) sounds like a Lisp feature and like a JIT compiler, two cool things ;)

Well, part of the point of Lisp is that there is only one step--effectively, your source bytes are your AST. Python has to decode, tokenize, and parse to get to the AST. But being able to start there instead of repeating that work would give us the best of both worlds (as easy to do stuff as Lisp, but as readable as Python).

From stephen at xemacs.org Sat Jan 16 04:56:48 2016
From: stephen at xemacs.org (Stephen J.
Turnbull)
Date: Sat, 16 Jan 2016 18:56:48 +0900
Subject: [Python-ideas] Boolean value of an Enum member
In-Reply-To: <56993900.60105@stoneleaf.us> References: <56993900.60105@stoneleaf.us> Message-ID: <22170.5088.502579.612956@turnbull.sk.tsukuba.ac.jp>

I don't understand why this was cross-posted. Python-Dev removed from addressees.

Ethan Furman writes:

> When Enum was being designed one of the questions considered was where to start autonumbering: zero or one.
>
> As I remember the discussion we chose not to start with zero because we didn't want an enum member to be False by default, and having a member with value 0 be True was discordant. So the functional API starts with 1 unless overridden. In fact, according to the Enum docs:
>
> The reason for defaulting to ``1`` as the starting number and not ``0`` is that ``0`` is ``False`` in a boolean sense, but enum members all evaluate to ``True``.
>
> However, if the Enum is combined with some other type (str, int, float, etc), then most behaviour is determined by that type -- including boolean evaluation. So the empty string, 0 values, etc, will cause that Enum member to evaluate as False.

Seems like perfectly desirable behavior to me. A pure enum is a set of mutually exclusive abstract symbolic values, and if you want one of them to have specific behavior other than that, you should say so. If you need a falsey value for a variable that takes pure Enum values, "None" or "False" (or both!) seems fine to me depending on the semantics of the variable and dataset in question, and if neither seems to fit the bill, define __bool__.

OTOH, an Enum which is conceptually a set of symbolic names for constants of some type should take on the semantics of that type, including truthiness of the values represented.

Do you have a use case where that distinction seems totally inappropriate, or have you merely been bitten by Emerson's Hobgoblin?

From encukou at gmail.com Sat Jan 16 06:06:58 2016
From: encukou at gmail.com (Petr Viktorin)
Date: Sat, 16 Jan 2016 12:06:58 +0100
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: References: Message-ID: <569A2452.1000709@gmail.com>

On 01/15/2016 05:10 PM, Victor Stinner wrote:
> Hi,
>
> This PEP 511 is part of a series of 3 PEPs (509, 510, 511) adding an API to implement a static Python optimizer specializing functions with guards.
>
> If the PEP is accepted, it will solve a long list of issues; some issues are old, like #1346238 which is 11 years old ;-) I found 12 issues:
>
> * http://bugs.python.org/issue1346238
> * http://bugs.python.org/issue2181
> * http://bugs.python.org/issue2499
> * http://bugs.python.org/issue2506
> * http://bugs.python.org/issue4264
> * http://bugs.python.org/issue7682
> * http://bugs.python.org/issue10399
> * http://bugs.python.org/issue11549
> * http://bugs.python.org/issue17068
> * http://bugs.python.org/issue17430
> * http://bugs.python.org/issue17515
> * http://bugs.python.org/issue26107
>
> I worked to make the PEP more generic than "this hook is written for FAT Python". Please read the full PEP to see a long list of existing usages of code transformers in Python.
>
> You may read again the discussion which occurred 4 years ago about the same topic: https://mail.python.org/pipermail/python-dev/2012-August/121286.html (the thread starts with an idea of an AST optimizer, but it moves quickly to a generic API to transform the code)
>
> Thanks to Red Hat for giving me time to experiment on this.
> Victor
>
> HTML version:
> https://www.python.org/dev/peps/pep-0510/#changes

Victor,

Thanks for your efforts on making Python faster!

This PEP addresses two things that would benefit from different approaches: let's call them optimizers and extensions.

Optimizers, such as your FAT, don't change Python semantics. They're designed to run on *all* code, including the standard library. It makes sense to register them as early in interpreter startup as possible, but if they're not registered, nothing breaks (things will just be slower). Experiments with future syntax (like when async/await was being developed) have the same needs.

Syntax extensions, such as MacroPy or Hy, tend to target specific modules, with which they're closely coupled: The modules won't run without the transformer. And with other modules, the transformer either does nothing (as with MacroPy, hopefully), or would fail altogether (as with Hy). So, they would benefit from specific packages opting in. The effects of enabling them globally range from inefficiency (MacroPy) to failures or needing workarounds (Hy).

The PEP is designed for optimizers. It would be good to stick to that use case, at least as far as the registration is concerned. I suggest noting in the documentation that Python semantics *must* be preserved, and renaming the API, e.g.::

    sys.set_global_optimizers([])

The "transformer" API can be used for syntax extensions as well, but the registration needs to be different so the effects are localized. For example it could be something like::

    importlib.util.import_with_transformer(
        'mypackage.specialmodule', MyTransformer())

or a special flag in packages::

    __transformers_for_submodules__ = [MyTransformer()]

or extending exec (which you actually might want to add to the PEP, to make giving examples easier)::

    exec("print('Hello World!')", transformers=[MyTransformer()])

or making it easier to write an import hook with them, etc... but all that would probably be out of scope for your PEP.

Another thing: this snippet from the PEP sounds too verbose::

    transformers = sys.get_code_transformers()
    transformers.insert(0, new_cool_transformer)
    sys.set_code_transformers(transformers)

Can this just be a list, as with sys.path? Using the "optimizers" term::

    sys.global_optimizers.insert(0, new_cool_transformer)

This::

    def code_transformer(code, consts, names, lnotab, context):

It's a function, so it would be better to name it::

    def transform_code(code):

And this::

    def ast_transformer(tree, context):

might work better with keyword arguments::

    def transform_ast(tree, *, filename, **kwargs):

otherwise people might use context objects with other attributes than "filename", breaking when a future PEP assigns a specific meaning to them.

It actually might be good to make the code transformer API extensible as well, and synchronized with the AST transformer::

    def transform_code(code, *, filename, **kwargs):

From sjoerdjob at sjec.nl Sat Jan 16 11:22:35 2016
From: sjoerdjob at sjec.nl (Sjoerd Job Postmus)
Date: Sat, 16 Jan 2016 17:22:35 +0100
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <569A2452.1000709@gmail.com> References: <569A2452.1000709@gmail.com> Message-ID: <20160116162235.GB3208@sjoerdjob.com>

On Sat, Jan 16, 2016 at 12:06:58PM +0100, Petr Viktorin wrote:
> The "transformer" API can be used for syntax extensions as well, but the registration needs to be different so the effects are localized.
> For example it could be something like::
>
>     importlib.util.import_with_transformer(
>         'mypackage.specialmodule', MyTransformer())
>
> or a special flag in packages::
>
>     __transformers_for_submodules__ = [MyTransformer()]
>
> or extending exec (which you actually might want to add to the PEP, to make giving examples easier)::
>
>     exec("print('Hello World!')", transformers=[MyTransformer()])
>
> or making it easier to write an import hook with them, etc...

So, you'd have to supply the transformer used before importing? That seems like a troublesome solution to me.

A better approach (to me) would require being able to document what transformers need to be run inside the module itself. Something like

    #:Transformers modname.TransformerClassName, modname.OtherTransformerClassName

The reason why I would prefer this is that it makes sense to document the transformers needed in the module itself, instead of in the code importing the module.

As you suggest (and rightly so) localizing the effects of the registration, it makes sense to do the registration in the affected module.

Of course there might be some cases where you want to import a module using a transformer it does not need to know about, but I think that would be less likely than the case where a module knows what transformers should be applied.

As an added bonus, it would let you apply transformers to the entry-point:

    #!/usr/bin/env python
    #:Transformers foo.BarTransformerMyCodeCanNotRunWithout

But as you said, this support is probably outside the scope of the PEP anyway.

Kind regards,
Sjoerd Job

From kevinjacobconway at gmail.com Sat Jan 16 11:56:05 2016
From: kevinjacobconway at gmail.com (Kevin Conway)
Date: Sat, 16 Jan 2016 16:56:05 +0000
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <20160116162235.GB3208@sjoerdjob.com> References: <569A2452.1000709@gmail.com> <20160116162235.GB3208@sjoerdjob.com> Message-ID: 

I'm a big fan of your motivation to build an optimizer for CPython code. What I'm struggling with is understanding why this requires a PEP and language modification. There are already several projects that manipulate the AST for performance gains such as [1] or even my own ham-fisted attempt [2].

Would you please elaborate on why these external approaches fail and how language modifications would make your approach successful.

[1] https://pypi.python.org/pypi/astoptimizer
[2] http://pycc.readthedocs.org/en/latest/

On Sat, Jan 16, 2016, 10:30 Sjoerd Job Postmus wrote:
> On Sat, Jan 16, 2016 at 12:06:58PM +0100, Petr Viktorin wrote:
> > The "transformer" API can be used for syntax extensions as well, but the registration needs to be different so the effects are localized.
Something like > > #:Transformers modname.TransformerClassName, > modname.OtherTransformerClassName > > The reason why I would prefer this, is that it makes sense to document > the transformers needed in the module itself, instead of in the code > importing the module. > > As you suggest (and rightly so) to localize the effects of the > registration, it makes sense to do the registration in the affected > module. > > Of course there might be some cases where you want to import a module > using a transformer it does not need to know about, but I think that > would be less likely than the case where a module knows what > transformers there should be applied. > > As an added bonus, it would let you apply transformers to the > entry-point: > > #!/usr/bin/env python > #:Transformers foo.BarTransformerMyCodeCanNotRunWithout > > But as you said, this support is probably outside the scope of the PEP > anyway. > > Kind regards, > Sjoerd Job > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From andre.roberge at gmail.com Sat Jan 16 12:00:39 2016 From: andre.roberge at gmail.com (Andre Roberge) Date: Sat, 16 Jan 2016 13:00:39 -0400 Subject: [Python-ideas] PEP 511: API for code transformers In-Reply-To: <20160116162235.GB3208@sjoerdjob.com> References: <569A2452.1000709@gmail.com> <20160116162235.GB3208@sjoerdjob.com> Message-ID: On Sat, Jan 16, 2016 at 12:22 PM, Sjoerd Job Postmus wrote: > > > > > or making it easier to write an import hook with them, etc... > > So, you'd have to supply the transformer used before importing? That > seems like a troublesome solution to me. > > A better approach (to me) would require being able to document what > transformers need to be run inside the module itself. Something like > > #:Transformers modname.TransformerClassName, > modname.OtherTransformerClassName > > The reason why I would prefer this, is that it makes sense to document > the transformers needed in the module itself, instead of in the code > importing the module. > +1 for this (but see below). This is the approach I used when playing with import hooks as shown in http://aroberge.blogspot.ca/2015/10/from-experimental-import-somethingnew_14.html and a few other posts I wrote about similar transformations. > > As you suggest (and rightly so) to localize the effects of the > registration, it makes sense to do the registration in the affected > module. > > Of course there might be some cases where you want to import a module > using a transformer it does not need to know about, but I think that > would be less likely than the case where a module knows what > transformers there should be applied. > > As an added bonus, it would let you apply transformers to the > entry-point: > > #!/usr/bin/env python > #:Transformers foo.BarTransformerMyCodeCanNotRunWithout > > But as you said, this support is probably outside the scope of the PEP > anyway. > While I would like to see some standard way to apply code transformations, I agree that this is likely (and unfortunately) outside the scope of this PEP. Andr? 
Roberge

> Kind regards,
> Sjoerd Job
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ethan at stoneleaf.us Sat Jan 16 13:17:56 2016
From: ethan at stoneleaf.us (Ethan Furman)
Date: Sat, 16 Jan 2016 10:17:56 -0800
Subject: [Python-ideas] Boolean value of an Enum member
In-Reply-To: <22170.5088.502579.612956@turnbull.sk.tsukuba.ac.jp> References: <56993900.60105@stoneleaf.us> <22170.5088.502579.612956@turnbull.sk.tsukuba.ac.jp> Message-ID: <569A8954.5060309@stoneleaf.us>

On 01/16/2016 01:56 AM, Stephen J. Turnbull wrote:
> I don't understand why this was cross-posted. Python-Dev removed from addressees.

Not everyone reads both lists, and I wanted the widest audience.

> Ethan Furman writes:
>> However, if the Enum is combined with some other type (str, int, float, etc), then most behaviour is determined by that type -- including boolean evaluation. So the empty string, 0 values, etc, will cause that Enum member to evaluate as False.
>
> Do you have a use case where that distinction seems totally inappropriate, or have you merely been bitten by Emerson's Hobgoblin?

Sadly, it was a case of failing memory.

--
~Ethan~

From ethan at stoneleaf.us Sat Jan 16 13:18:15 2016
From: ethan at stoneleaf.us (Ethan Furman)
Date: Sat, 16 Jan 2016 10:18:15 -0800
Subject: [Python-ideas] Boolean value of an Enum member
In-Reply-To: <5699583A.8090102@canterbury.ac.nz> References: <56993900.60105@stoneleaf.us> <5699583A.8090102@canterbury.ac.nz> Message-ID: <569A8967.8010805@stoneleaf.us>

[resending to lists -- sorry, Greg]

On 01/15/2016 12:36 PM, Greg Ewing wrote:
> Ethan Furman wrote:
>> So the question now is: for a standard Enum (meaning no other type besides Enum is involved) should __bool__ look to the value of the Enum member to determine True/False, or should we always be True by default and make the Enum creator add their own __bool__ if they want something different?
>
> Can't you just specify a starting value of 0 if you want the enum to have a false value? That doesn't seem too onerous to me.

You can start with zero, but unless the Enum is mixed with a numeric type it will evaluate to True. Also, there are other falsey values that a pure Enum member could have: False, None, '', etc., to name a few.

However, as Barry said, writing your own is a whopping two lines of code:

    def __bool__(self):
        return bool(self._value_)

With Barry and Guido's feedback this issue is closed. Thanks everyone!

--
~Ethan~

From brett at python.org Sat Jan 16 13:28:42 2016
From: brett at python.org (Brett Cannon)
Date: Sat, 16 Jan 2016 18:28:42 +0000
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: References: Message-ID: 

On Fri, 15 Jan 2016 at 09:40 Victor Stinner wrote:
> 2016-01-15 18:22 GMT+01:00 Brett Cannon :
> > I just wanted to point out to people that the key part of this PEP is the change in semantics of `-O` accepting an argument.
>
> To be exact, it's a new "-o arg" option, it's different from -O and -OO (uppercase).
> Since I don't know what to do with -O and -OO, I simply kept them :-D
>
> > I should also point out that this does get tricky in terms of how to handle the stdlib if you have not pre-compiled it, e.g., if the first module imported by Python is the encodings module then how to make sure the AST optimizers are ready to go by the time that import happens?
>
> Since importlib reads sys.implementation.optim_tag at each import, it works fine.
>
> For example, you start with "opt" optimizer tag. You import everything needed for fatoptimizer. Then calling sys.set_code_transformers() will set a new optimizer flag (ex: "fat-opt"). But it works since the required code transformers are now available.

I understand all of that; my point is what if you don't compile the stdlib for your optimization? You have to import over 20 modules before user code gets imported. My question is how do you expect the situation to be handled where you didn't optimize the stdlib since the 'encodings' module is imported before anything else? If you set your `-o` flag and you want to fail imports if the .pyc isn't there, then wouldn't that mean you are going to fail immediately when you try and import 'encodings' in Py_Initialize()?

> The tricky part is more when you want to deploy an application without the code transformer: you have to ensure that all .py files are compiled to .pyc. But there are no technical issues compiling them, it's more a practical issue.
>
> See my second email with a lot of commands, I showed how .pyc are created with different .pyc filenames. Or follow my commands to try my "fatpython" fork and play with the code yourself ;-)
>
> > And lastly, Victor proposes that all .pyc files get an optimization tag. While there is nothing technically wrong with that, PEP 488 purposefully didn't do that in the default case for backwards-compatibility, so that will need to be at least mentioned in the PEP.
>
> The PEP already contains: https://www.python.org/dev/peps/pep-0511/#optimizer-tag "Remove also the special case for the optimizer level 0 with the default optimizer tag 'opt' to simplify the code."
>
> Code relying on the exact .pyc filename (like unit tests) already has to be modified to use the optimizer tag. It's just an opportunity to simplify the code. I don't really care about this specific change ;-)

Right, it's just that mentioning the backwards-compatibility issue should be in the PEP.

-Brett

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ncoghlan at gmail.com Sat Jan 16 22:38:47 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 17 Jan 2016 13:38:47 +1000
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: References: Message-ID: 

On 17 January 2016 at 04:28, Brett Cannon wrote:
> On Fri, 15 Jan 2016 at 09:40 Victor Stinner wrote:
>> 2016-01-15 18:22 GMT+01:00 Brett Cannon :
>> > I just wanted to point out to people that the key part of this PEP is the change in semantics of `-O` accepting an argument.
>>
>> To be exact, it's a new "-o arg" option, it's different from -O and -OO (uppercase).
>> Since I don't know what to do with -O and -OO, I simply kept them :-D
>>
>> > I should also point out that this does get tricky in terms of how to handle the stdlib if you have not pre-compiled it, e.g., if the first module imported by Python is the encodings module then how to make sure the AST optimizers are ready to go by the time that import happens?
>>
>> Since importlib reads sys.implementation.optim_tag at each import, it works fine.
>>
>> For example, you start with "opt" optimizer tag. You import everything needed for fatoptimizer. Then calling sys.set_code_transformers() will set a new optimizer flag (ex: "fat-opt"). But it works since the required code transformers are now available.
>
> I understand all of that; my point is what if you don't compile the stdlib for your optimization? You have to import over 20 modules before user code gets imported. My question is how do you expect the situation to be handled where you didn't optimize the stdlib since the 'encodings' module is imported before anything else? If you set your `-o` flag and you want to fail imports if the .pyc isn't there, then wouldn't that mean you are going to fail immediately when you try and import 'encodings' in Py_Initialize()?

I don't think that's a major problem - it seems to me that it's the same as going for "pyc only" deployment with an embedded Python interpreter, and then forgetting to include a precompiled standard library in addition to your own components. Yes, it's going to fail, but the bug is in the build process for your deployment artifacts rather than in the runtime behaviour of CPython.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ncoghlan at gmail.com Sat Jan 16 22:49:52 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 17 Jan 2016 13:49:52 +1000
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <20160116162235.GB3208@sjoerdjob.com> References: <569A2452.1000709@gmail.com> <20160116162235.GB3208@sjoerdjob.com> Message-ID: 

On 17 January 2016 at 02:22, Sjoerd Job Postmus wrote:
> On Sat, Jan 16, 2016 at 12:06:58PM +0100, Petr Viktorin wrote:
>> The "transformer" API can be used for syntax extensions as well, but the registration needs to be different so the effects are localized. For example it could be something like::
>>
>>     importlib.util.import_with_transformer(
>>         'mypackage.specialmodule', MyTransformer())
>>
>> or a special flag in packages::
>>
>>     __transformers_for_submodules__ = [MyTransformer()]
>>
>> or extending exec (which you actually might want to add to the PEP, to make giving examples easier)::
>>
>>     exec("print('Hello World!')", transformers=[MyTransformer()])
>>
>> or making it easier to write an import hook with them, etc...
>
> So, you'd have to supply the transformer used before importing? That seems like a troublesome solution to me.

I think Sjoerd's confusion here is a strong argument in favour of clearly and permanently distinguishing semantics-preserving code optimizers (which can be sensibly applied externally and/or globally), and semantically significant code transformers (which also need to be taken into account when *reading* the code, and hence should be visible locally, at least at the module level, and often at the function level).
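To make the distinction concrete, compare these two made-up AST transformers: the first can be applied blindly to any code, while the second silently changes what the code means, so readers need to know it is in effect:

    import ast

    class FoldIntAdd(ast.NodeTransformer):
        # semantics preserving: "1 + 2" always means 3, wherever it appears
        def visit_BinOp(self, node):
            self.generic_visit(node)
            if (isinstance(node.op, ast.Add)
                    and isinstance(node.left, ast.Num)
                    and isinstance(node.right, ast.Num)
                    and isinstance(node.left.n, int)
                    and isinstance(node.right.n, int)):
                return ast.copy_location(
                    ast.Num(n=node.left.n + node.right.n), node)
            return node

    class FloatsBecomeDecimals(ast.NodeTransformer):
        # semantically significant: float literals silently turn into
        # Decimal('...') calls, so the transformed module must import
        # Decimal, and 0.1 + 0.2 == 0.3 changes its result
        def visit_Num(self, node):
            if isinstance(node.n, float):
                call = ast.Call(
                    func=ast.Name(id='Decimal', ctx=ast.Load()),
                    args=[ast.Str(s=repr(node.n))], keywords=[])
                return ast.fix_missing_locations(
                    ast.copy_location(call, node))
            return node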
Making that distinction means we can be clear that the transformation case is already well served by import hooks that process alternate filename extensions rather than standard Python source or bytecode files, encoding cookie tricks (which are visible as a comment in the module header), and function decorators that alter the semantics of the functions they're applied to.

The case which *isn't* currently well served is transparently applying a semantics-preserving code optimiser like FAT Python - that's a decision for the person *running* the code, rather than the person writing it, so this PEP is about providing the hooks at the interpreter level to let them do that.

While we can't *prevent* people from using these new hooks with semantically significant transformers, we *can* make it clear that we think actually doing so is a bad idea, as it is likely to result in a tightly coupled, hard-to-maintain code base that can't even be read reliably without understanding the transforms that are being implicitly applied.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ncoghlan at gmail.com Sat Jan 16 22:59:56 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 17 Jan 2016 13:59:56 +1000
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: References: <569A2452.1000709@gmail.com> <20160116162235.GB3208@sjoerdjob.com> Message-ID: 

On 17 January 2016 at 02:56, Kevin Conway wrote:
> I'm a big fan of your motivation to build an optimizer for CPython code. What I'm struggling with is understanding why this requires a PEP and language modification. There are already several projects that manipulate the AST for performance gains such as [1] or even my own ham-fisted attempt [2].
>
> Would you please elaborate on why these external approaches fail and how language modifications would make your approach successful.

Existing external optimizers (including Victor's own astoptimizer, the venerable psyco, static compilers like Cython, and dynamic compilers like Numba) make simplifying assumptions that technically break some of Python's expected runtime semantics. They get away with that by relying on the assumption that people will only apply them in situations where the semantic differences don't matter.

That's not good enough for optimization passes that are enabled globally: those need to be semantics *preserving*, so they can be applied blindly to any piece of Python code, with the worst possible outcome being "the optimization was automatically bypassed or disabled at runtime due to its prerequisites no longer being met".

The PyPy JIT actually works in much the same way, it just does it dynamically at runtime by tracing frequently run execution paths. This is both a strength (it allows even more optimal code generation based on the actual running application), and a weakness (it requires time for the JIT to warm up by identifying critical execution paths, tracing them, and substituting the optimised code).

Cheers,
Nick.
--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From abarnert at yahoo.com Sat Jan 16 23:28:54 2016
From: abarnert at yahoo.com (Andrew Barnert)
Date: Sat, 16 Jan 2016 20:28:54 -0800
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: References: <569A2452.1000709@gmail.com> <20160116162235.GB3208@sjoerdjob.com> Message-ID: 

On Jan 16, 2016, at 19:49, Nick Coghlan wrote:
>
> Making that distinction means we can be clear that the transformation case is already well served by import hooks that process alternate filename extensions rather than standard Python source or bytecode files, encoding cookie tricks (which are visible as a comment in the module header), and function decorators that alter the semantics of the functions they're applied to.
>
> The case which *isn't* currently well served is transparently applying a semantics-preserving code optimiser like FAT Python - that's a decision for the person *running* the code, rather than the person writing it...

I think something that isn't made clear in the rationale is why an import hook is good enough for most semantic extensions, but isn't good enough for global optimizers. After all, it's not that hard to write a module that installs an import hook for normal .py files instead of .hy or .pyq or whatever files (see the sketch at the end of this message). Then, to optimize your own code, or a third-party library, you just import the optimizer module first; to optimize an application, you write a 2- or 3-line wrapper (which can be trivially automated a la setuptools entry point scripts) to import the optimizer and then start the app.

There are good reasons that isn't sufficient. For example, parts of the stdlib have already been imported before the top of the main module. While there are ways around that (I believe FAT comes with a script to recompile the stdlib into a venv or something?), they're clumsy and ad hoc, and it's unlikely two different optimizers would play nicely together. Also, making it work in a sensible way with .pyc files takes a decent amount of code, and will again be an ad-hoc solution that won't play well with other projects doing similar things. And there are people who write and execute long-running, optimization-ripe bits of code in the REPL (or at least in an IPython notebook), and that can't be handled with an import hook. Nor can code that extensively uses exec. And probably other reasons I haven't thought of.

Maybe the PEP should explain those reasons, so it's clear why this feature will help projects like FAT.

Then again, some of those same reasons seem to apply equally well to semantic extensions. Two extensions are no more likely to play together as import hooks than two optimizers, and yet in many cases there's no syntactic or semantic reason they couldn't. Extensions are probably even more useful than optimizations at the REPL. And so on. And this is all even more true for extensions that people write to explore a new feature idea than for things people want to publish as deployable code.

So, I'm still not convinced that the distinction really is critical here. I definitely don't see why it's a negative that the PEP can serve both purposes, even if people only want one of them served (normally, Python doesn't go out of its way to prevent writing certain kinds of code, it just becomes accepted that such code is not idiomatic; only when there's a real danger of an attractive nuisance is the language modified to ban it), and I think it's potentially a positive.
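Here's the kind of hook I mentioned above, as a rough sketch -- MyTransformer is a stand-in for whatever AST rewriting you actually want to apply:

    import ast
    import importlib.machinery
    import sys

    class MyTransformer(ast.NodeTransformer):
        pass  # placeholder for the real rewriting

    class TransformingLoader(importlib.machinery.SourceFileLoader):
        # recompile ordinary .py source through the transformer
        def source_to_code(self, data, path):
            tree = ast.parse(data, path)
            tree = ast.fix_missing_locations(MyTransformer().visit(tree))
            return compile(tree, path, 'exec')

    sys.path_hooks.insert(0, importlib.machinery.FileFinder.path_hook(
        (TransformingLoader, importlib.machinery.SOURCE_SUFFIXES)))
    sys.path_importer_cache.clear()

Note that, as written, it never sees anything imported before it runs (including the stdlib modules loaded at startup), and it will happily reuse .pyc files compiled without the transformer--which are exactly the problems described above.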
From ncoghlan at gmail.com Sun Jan 17 01:06:41 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 17 Jan 2016 16:06:41 +1000
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: References: <569A2452.1000709@gmail.com> <20160116162235.GB3208@sjoerdjob.com> Message-ID: 

On 17 January 2016 at 14:28, Andrew Barnert wrote:
> So, I'm still not convinced that the distinction really is critical here. I definitely don't see why it's a negative that the PEP can serve both purposes,

The main problem with globally enabled transformations of any kind is that action at a distance in software design is generally a *bad thing*. Python's tolerant of it because sometimes it's a *necessary* thing that actually makes code more maintainable - using monkeypatching for use cases like testing and monitoring means those cases can be ignored when reading and writing the code, using metaclasses lets you enlist the interpreter in defining "class-like" objects that differ in some specific way from normal ones (e.g. ORMs, ABCs, enums), using codecs lets you more easily provide configurable encoding and decoding behaviour, etc.

While relying too heavily on those kinds of features can significantly harm debuggability, the pay-off in readability is worth it often enough for them to be officially supported language and runtime features.

The kind of code transformation hooks that Victor is talking about here are the ultimate in action at a distance - if it wants to, an "optimizer" can completely throw away your code and substitute its own. Import hooks do indeed give you a comparable level of power (at least if you go so far as to write your own meta_path hook), but also still miss the code that Python runs without importing it (__main__, exec, eval, runpy, etc).

> even if people only want one of them served (normally, Python doesn't go out of its way to prevent writing certain kinds of code, it just becomes accepted that such code is not idiomatic; only when there's a real danger of an attractive nuisance is the language modified to ban it), and I think it's potentially a positive.

That's all I'm suggesting - I think the proposed hooks should be designed for globally enabled optimizations (and named accordingly), but I don't think we should erect any specific barriers against using them for other things. Designing them that way will provide a healthy nudge towards the primary intended use case (transparently enabling semantically compatible code optimizations), while still providing a new transformation technique to projects like MacroPy.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From abarnert at yahoo.com Sun Jan 17 01:22:57 2016
From: abarnert at yahoo.com (Andrew Barnert)
Date: Sat, 16 Jan 2016 22:22:57 -0800
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: References: <569A2452.1000709@gmail.com> <20160116162235.GB3208@sjoerdjob.com> Message-ID: <1765B151-1591-41FC-8A9E-61AF3FB4603D@yahoo.com>

On Jan 16, 2016, at 22:06, Nick Coghlan wrote:
>
>> On 17 January 2016 at 14:28, Andrew Barnert wrote:
>> So, I'm still not convinced that the distinction really is critical here. I definitely don't see why it's a negative that the PEP can serve both purposes,
>
> The main problem with globally enabled transformations of any kind is that action at a distance in software design is generally a *bad thing*. Python's tolerant of it because sometimes it's a *necessary* thing that actually makes code more maintainable ...
> That's all I'm suggesting - I think the proposed hooks should be designed for globally enabled optimizations (and named accordingly), but I don't think we should erect any specific barriers against using them for other things. Designing them that way will provide a healthy nudge towards the primary intended use case (transparently enabling semantically compatible code optimizations), while still providing a new transformation technique to projects like MacroPy.

OK, then I agree 100% on this part.

But on the main point, I still think it's important for the PEP to explain why import hooks aren't good enough for semantically-neutral global optimizations. As I said, I can think of multiple answers (top-level code, interaction with .pyc files, etc.), but as long as the PEP doesn't give those answers, people are going to keep asking (even years from now, when people want to know why TOOWTDI didn't apply here).

From victor.stinner at gmail.com Sun Jan 17 06:48:59 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Sun, 17 Jan 2016 12:48:59 +0100
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <569A2452.1000709@gmail.com> References: <569A2452.1000709@gmail.com> Message-ID: 

2016-01-16 12:06 GMT+01:00 Petr Viktorin :
> This PEP addresses two things that would benefit from different approaches: let's call them optimizers and extensions.
>
> Optimizers, such as your FAT, don't change Python semantics. They're designed to run on *all* code, including the standard library. It makes sense to register them as early in interpreter startup as possible, but if they're not registered, nothing breaks (things will just be slower). Experiments with future syntax (like when async/await was being developed) have the same needs.
>
> Syntax extensions, such as MacroPy or Hy, tend to target specific modules, with which they're closely coupled: The modules won't run without the transformer. And with other modules, the transformer either does nothing (as with MacroPy, hopefully), or would fail altogether (as with Hy). So, they would benefit from specific packages opting in. The effects of enabling them globally range from inefficiency (MacroPy) to failures or needing workarounds (Hy).

To be clear, Hylang will not benefit from my PEP. That's why it is not mentioned in the PEP. "Syntax extensions" only look like a special case of optimizers. I'm not sure that it's worth making them really different.

> The PEP is designed for optimizers. It would be good to stick to that use case, at least as far as the registration is concerned. I suggest noting in the documentation that Python semantics *must* be preserved, and renaming the API, e.g.::
>
>     sys.set_global_optimizers([])

I would prefer not to restrict the PEP to a specific usage.

> The "transformer" API can be used for syntax extensions as well, but the registration needs to be different so the effects are localized. For example it could be something like::
>
>     importlib.util.import_with_transformer(
>         'mypackage.specialmodule', MyTransformer())

Brett may help on this part. I don't think that it's the best way to use importlib. importlib is already pluggable. As I wrote in the PEP, MacroPy uses an import hook. (Maybe it should continue to use an import hook?)

> or a special flag in packages::
>
>     __transformers_for_submodules__ = [MyTransformer()]

Does it mean that you have to parse a .py file to then decide how to transform it? It would slow down compilation of code not using transformers.
I would prefer to do that differently: always register transformers very early, but configure each transformer to apply only to some files. The transformer can use the filename (file extension? importlib is currently restricted to .py files by default, no?), it can use a special variable in the file (ex: fatoptimizer searches for a __fatoptimizer__ variable which is used to configure the optimizer), a configuration loaded when the transformer is created, etc. > or extending exec (which you actually might want to add to the PEP, to > make giving examples easier):: > > exec("print('Hello World!')", transformers=[MyTransformer()]) There are a lot of ways to load, compile and execute code. Starting to add optional parameters would end up like my old PEP 410 ( https://www.python.org/dev/peps/pep-0410/ ) which was rejected because it added an optional parameter to a lot of functions (at least 16 functions!). (It was not the only reason to reject the PEP.) Brett Cannon proposed to add hooks to importlib, but that would restrict the feature to imports. See the use cases in the PEP: I would like to use the same code transformers everywhere. > Another thing: this snippet from the PEP sounds too verbose:: > > transformers = sys.get_code_transformers() > transformers.insert(0, new_cool_transformer) > sys.set_code_transformers(transformers) > > Can this just be a list, as with sys.path? Using the "optimizers" term:: > > sys.global_optimizers.insert(0, new_cool_transformer) set_code_transformers() checks the transformer name and ensures that the transformer has at least an AST transformer or a bytecode transformer. That's why it's a function and not a simple list. set_code_transformers() also gets the AST and bytecode transformer methods only once, to provide a simple C structure for PyAST_CompileObject (bytecode transformers) and PyParser_ASTFromStringObject (AST transformers). Note: sys.implementation.cache_tag is modifiable without any check. If you mess it up, importlib will probably fail badly. And the newly added sys.implementation.optim_tag can also be modified without any check. > This:: > > def code_transformer(code, consts, names, lnotab, context): > > It's a function, so it would be better to name it:: > > def transform_code(code): Fair enough :-) But I want the context parameter to pass additional information. Note: if we pass a code object, the filename is already in the code object, but there is other information (see below). > And this:: > > def ast_transformer(tree, context): > > might work better with keyword arguments:: > > def transform_ast(tree, *, filename, **kwargs): > > otherwise people might use context objects with other attributes than > "filename", breaking when a future PEP assigns a specific meaning to them. The idea of a context object is to be "future-proof". Future versions of Python can add new attributes without having to modify all code transformers (or, even worse, having to use a kind of "#ifdef" in the code depending on the Python version). > It actually might be good to make the code transformer API extensible as > well, and synchronize with the AST transformer:: > > def transform_code(code, *, filename, **kwargs): **kwargs and context are basically the same, but I prefer a single parameter rather than an ugly **kwargs. IMHO "**kwargs" cannot be called an API. By the way, I recently added bytecode transformers to the PEP.
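(For illustration, a minimal semantics-preserving transformer written against the draft API discussed in this thread; the `name` attribute, the `ast_transformer(tree, context)` signature and the registration call all follow the draft and may still change:)

    import ast

    class NotFolder(ast.NodeTransformer):
        # Fold `not True` / `not False` into constants -- a stand-in
        # for a real semantics-preserving optimization.
        def visit_UnaryOp(self, node):
            self.generic_visit(node)
            if (isinstance(node.op, ast.Not)
                    and isinstance(node.operand, ast.NameConstant)
                    and isinstance(node.operand.value, bool)):
                return ast.copy_location(
                    ast.NameConstant(value=not node.operand.value), node)
            return node

    class FoldNotTransformer:
        # The transformer name feeds into sys.implementation.optim_tag
        # and thus into the .pyc filename, per the draft.
        name = "foldnot"

        def ast_transformer(self, tree, context):
            return NotFolder().visit(tree)

    # Registration, as drafted (signature may still change):
    # sys.set_code_transformers(sys.get_code_transformers()
    #                           + [FoldNotTransformer()])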
In fact, we can already add more information to its context: * compiler_flags: flags like * optimization_level (int): 0, 1 or 2 depending on the -O and -OO command line options * interactive (boolean): True if interactive mode * etc. => see the compiler structure in Python/compile.c. We will have to check that these attributes make sense for other Python implementations, or make it clear in the PEP that, as with sys.implementation, each Python implementation can add specific attributes, and only a few of them are always available. Victor -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Sun Jan 17 07:36:32 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 17 Jan 2016 22:36:32 +1000 Subject: [Python-ideas] PEP 511: API for code transformers In-Reply-To: References: <569A2452.1000709@gmail.com> Message-ID: On 17 January 2016 at 21:48, Victor Stinner wrote: > 2016-01-16 12:06 GMT+01:00 Petr Viktorin : >> The PEP is designed optimizers. It would be good to stick to that use >> case, at least as far as the registration is concerned. I suggest noting >> in the documentation that Python semantics *must* be preserved, and >> renaming the API, e.g.:: >> >> sys.set_global_optimizers([]) > > I would prefer to not restrict the PEP to a specific usage. The problem I see with making the documentation and naming too generic is that people won't know what the feature is useful for - a generic term like "transformer" accurately describes these units of code, but provides no hint as to why a developer might care about their existence. However, if the reason we're adding the capability is to make global static optimizers feasible, then we can describe it accordingly (so the answer to "Why does this feature exist?" becomes relatively self evident), and have the fact that the feature can actually be used for arbitrary transforms be an added bonus rather than the core intent. Alternatively, we could follow the example of the atexit module, and provide these hook registration capabilities through a new "atcompile" module rather than through the sys module. Doing that would also provide a namespace for doing things like allowing runtime caching of compiled code objects - if there's no caching mechanism, then optimising code compiled at runtime (rather than loading pre-optimised code from bytecode files) could easily turn into a pessimisation if the optimiser takes more time to run than is gained back in a single execution of the optimised code relative to the unoptimised code. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From abarnert at yahoo.com Sun Jan 17 11:27:03 2016 From: abarnert at yahoo.com (Andrew Barnert) Date: Sun, 17 Jan 2016 08:27:03 -0800 Subject: [Python-ideas] PEP 511: API for code transformers In-Reply-To: References: <569A2452.1000709@gmail.com> Message-ID: On Jan 17, 2016, at 03:48, Victor Stinner wrote: > > 2016-01-16 12:06 GMT+01:00 Petr Viktorin : > > > The PEP is designed optimizers. It would be good to stick to that use > > case, at least as far as the registration is concerned. I suggest noting > > in the documentation that Python semantics *must* be preserved, and > > renaming the API, e.g.:: > > > > sys.set_global_optimizers([]) > > I would prefer to not restrict the PEP to a specific usage. > > > The "transformer" API can be used for syntax extensions as well, but the > > registration needs to be different so the effects are localized.
For > > example it could be something like:: > > > > importlib.util.import_with_transformer( > > 'mypackage.specialmodule', MyTransformer()) > > Brett may help on this part. I don't think that it's the best way to use importlib. importlib is already pluggable. As I wrote in the PEP, MacroPy uses an import hook. (Maybe it should continue to use an import hook?) > > > or a special flag in packages:: > > > > __transformers_for_submodules__ = [MyTransformer()] > > Does it mean that you have to parse a .py file to then decide how to transform it? It will slow down compilation of code not using transformers. > > I would prefer to do that differently: always register transformers very early, but configure each transformer to only apply it on some files. At that point, you're exactly duplicating what can be done with import hooks. I think this is part of the reason Nick suggested the PEP should largely ignore the issue of syntax extensions and experiments: because then you don't have to solve Petr's problem. Globally-applicable optimizers are either on or off globally, so the only API you need to control them is a simple global list. The fact that this same API works for some uses of extensions doesn't matter; the fact that it doesn't work for some other uses of extensions also doesn't matter; just design it for the intended use. > The transformer can use the filename (file extension? importlib is currently restricted to .py files by default no?), Everything goes through the same import machinery. The usual importer gets registered for .py files. Something like hylang can register for a different extension. Something like PyMacro can wrap the usual importer, then register to take over for .py files. (This isn't quite what PyMacro does, because it's designed to work with older versions of Python, with less powerful/simple customization opportunities, but it's what a new PyMacro-like project would do.) A global optimizer could also be written that way today. And doing this is a couple dozen lines of code (or about 5 lines to do it as a quick&dirty hack without worrying about portability or backward/forward compatibility). The reason your PEP is necessary, I believe, is to overcome the limitations of such an import hook: to work at the REPL/notebook/etc. level, to allow multiple optimizers to play nicely without them having to agree on some wrapping protocol, to work with exec, etc. By keeping things simple and only serving the global case, you can (or, rather, you already have) come up with easier solutions to those issues--no need for enabling/disabling files by type or other information, no need for extra optional parameters to exec, etc. (Or, if you aren't trying to overcome those limitations, then I'm not sure why your PEP is necessary. Import hooks already work, after all.) > > Another thing: this snippet from the PEP sounds too verbose:: > > > > transformers = sys.get_code_transformers() > > transformers.insert(0, new_cool_transformer) > > sys.set_code_transformers(transformers) > > > > Can this just be a list, as with sys.path? Using the "optimizers" term:: > > > > sys.global_optimizers.insert(0, new_cool_transformer) > > set_code_transformers() checks the transformer name and ensures that the transformer has at least a AST transformer or a bytecode transformer. That's why it's a function and not a simple list. That doesn't seem necessary. 
After all, sys.path doesn't check that you aren't assigning non-strings or strings that don't make valid paths, and nobody has ever complained that it's too hard to debug the case where you write `sys.path.insert(0, {1, 2, 3})` because the error comes at import time instead of locally. -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg.ewing at canterbury.ac.nz Sun Jan 17 16:54:10 2016 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 18 Jan 2016 10:54:10 +1300 Subject: [Python-ideas] PEP 511: API for code transformers In-Reply-To: <1765B151-1591-41FC-8A9E-61AF3FB4603D@yahoo.com> References: <569A2452.1000709@gmail.com> <20160116162235.GB3208@sjoerdjob.com> <1765B151-1591-41FC-8A9E-61AF3FB4603D@yahoo.com> Message-ID: <569C0D82.1000503@canterbury.ac.nz> Concerning ways to allow a module to opt in to transformations that change semantics, my first thought was to use an import from a magic module: from __extensions__ import modulename This would have to appear before any other statements or non-magic imports, like __future__ does. The named module would be imported at compile time and some suitable convention used to extract transformers from it. The problem is that if your extension is in a package, you want to be able to write from __extensions__ import packagename.modulename which is not valid syntax. So instead of a magic module, maybe a magic namespace package: import __extensions__.packagename.modulename -- Greg From encukou at gmail.com Mon Jan 18 04:50:49 2016 From: encukou at gmail.com (Petr Viktorin) Date: Mon, 18 Jan 2016 10:50:49 +0100 Subject: [Python-ideas] PEP 511: API for code transformers In-Reply-To: References: <569A2452.1000709@gmail.com> Message-ID: <569CB579.6000309@gmail.com> On 01/17/2016 12:48 PM, Victor Stinner wrote: > 2016-01-16 12:06 GMT+01:00 Petr Viktorin >: >> This PEP addresses two things that would benefit from different >> approaches: let's call them optimizers and extensions. >> >> Optimizers, such as your FAT, don't change Python semantics. They're >> designed to run on *all* code, including the standard library. It makes >> sense to register them as early in interpreter startup as possible, but >> if they're not registered, nothing breaks (things will just be slower). >> Experiments with future syntax (like when async/await was being >> developed) have the same needs. >> >> Syntax extensions, such as MacroPy or Hy, tend to target specific >> modules, with which they're closely coupled: The modules won't run >> without the transformer. And with other modules, the transformer either >> does nothing (as with MacroPy, hopefully), or would fail altogether (as >> with Hy). So, they would benefit from specific packages opting in. The >> effects of enabling them globally range from inefficiency (MacroPy) to >> failures or needing workarounds (Hy). > > To be clear, Hylang will not benefit from my PEP. That's why it is not > mentioned in the PEP. > > "Syntax extensions" only look like a special case of optimizers. I'm not > sure that it's worth to make them really different. There is an important difference: optimizers should be installed globally. But modules that don't opt in to a specific syntax extension should not get compiled with it.
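(To make the contrast concrete, a rough sketch -- using only today's importlib machinery -- of how a syntax extension can scope itself to a dedicated source suffix, so that plain .py modules never see it; `MyTransformer` is a hypothetical AST transformer:)

    import ast
    import sys
    from importlib.machinery import (FileFinder, SourceFileLoader,
                                     SourcelessFileLoader, ExtensionFileLoader,
                                     SOURCE_SUFFIXES, BYTECODE_SUFFIXES,
                                     EXTENSION_SUFFIXES)

    class DSLLoader(SourceFileLoader):
        def source_to_code(self, data, path, *, _optimize=-1):
            tree = ast.parse(data)               # parse the DSL source
            tree = MyTransformer().visit(tree)   # hypothetical transformer
            return compile(ast.fix_missing_locations(tree), path, 'exec')

    # Include the standard suffixes too, so directories claimed by this
    # finder keep importing ordinary modules as usual.
    hook = FileFinder.path_hook((DSLLoader, ['.sqlpy']),
                                (ExtensionFileLoader, EXTENSION_SUFFIXES),
                                (SourceFileLoader, SOURCE_SUFFIXES),
                                (SourcelessFileLoader, BYTECODE_SUFFIXES))
    sys.path_hooks.insert(0, hook)
    sys.path_importer_cache.clear()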
I suggest noting >> in the documentation that Python semantics *must* be preserved, and >> renaming the API, e.g.:: My API examples seem to have led the conversation astray. The point I wanted to make is that "syntax extensions" need a registration API that only enables them for specific modules. I admit the particular examples weren't very well thought out. I'm not proposing adding *any* of them to the PEP: I'd be happy if the PEP stuck to the "optimizers" use case and do that well. The "extensions" case is worth another PEP, which can reuse the transformers API (probably integrating it with importlib), but not the registration API. > I would prefer to do that differently: always register transformers > very early, but configure each transformer to only apply it on some > files. The transformer can use the filename (file extension? > importlib is currently restricted to .py files by default no?), it > can use a special variable in the file (ex: fatoptimizer searchs > for a __fatoptimizer__ variable which is used to configure the > optimizer), a configuration loaded when the transformer is > created, etc. Why very early? If a syntax extension is used in some package, it should only be activated right before that package is imported. And ideally it shouldn't get a chance to be activated on other packages. importlib is not restricted to .py (it can do .zip, .pyc, .so, etc. out of the box). Actually, with import hooks, the *source* file that uses the DSL can use a different extension (as opposed to the *.pyc getting a different tag, as for optimizers). For example, a library using a SQL DSL could look like:: __init__.py (imports a package to set up the transformer) queries.sqlpy __pycache__/ __init__.cpython-36.opt-0.pyc queries.cpython-36.opt-0.pyc That is probably what you want for syntax extensions. You can't really look at special variables in the file, because the transformer needs to be enabled before the code is compiled -- especially if text/tokenstream transforms are added, so the file might not be valid "vanilla Python". What's left is making it easy to register an import hook with a specific PEP 511 transformer -- but again, that can be a different PEP. From yselivanov.ml at gmail.com Mon Jan 18 11:45:37 2016 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Mon, 18 Jan 2016 11:45:37 -0500 Subject: [Python-ideas] PEP 511: API for code transformers In-Reply-To: References: <569A2452.1000709@gmail.com> Message-ID: <569D16B1.6020708@gmail.com> On 2016-01-17 7:36 AM, Nick Coghlan wrote: > On 17 January 2016 at 21:48, Victor Stinner wrote: >> >2016-01-16 12:06 GMT+01:00 Petr Viktorin: >>> >>The PEP is designed optimizers. It would be good to stick to that use >>> >>case, at least as far as the registration is concerned. I suggest noting >>> >>in the documentation that Python semantics*must* be preserved, and >>> >>renaming the API, e.g.:: >>> >> >>> >> sys.set_global_optimizers([]) >> > >> >I would prefer to not restrict the PEP to a specific usage. > The problem I see with making the documentation and naming too generic > is that people won't know what the feature is useful for - a generic > term like "transformer" accurately describes these units of code, but > provides no hint as to why a developer might care about their > existence. > > However, if the reason we're adding the capability is to make global > static optimizers feasible, then we cam describe it accordingly (so > the answer to "Why does this feature exist?" 
becomes relatively self > evident), and have the fact that the feature can actually be used for > arbitrary transforms be an added bonus rather than the core intent. +1. Yury From brett at python.org Mon Jan 18 11:52:42 2016 From: brett at python.org (Brett Cannon) Date: Mon, 18 Jan 2016 16:52:42 +0000 Subject: [Python-ideas] PEP 511: API for code transformers In-Reply-To: References: Message-ID: On Sat, 16 Jan 2016 at 19:38 Nick Coghlan wrote: > On 17 January 2016 at 04:28, Brett Cannon wrote: > > > > On Fri, 15 Jan 2016 at 09:40 Victor Stinner > > wrote: > >> > >> 2016-01-15 18:22 GMT+01:00 Brett Cannon : > >> > I just wanted to point out to people that the key part of this PEP is > >> > the > >> > change in semantics of `-O` accepting an argument. > >> > >> The be exact, it's a new "-o arg" option, it's different from -O and > >> -OO (uppercase). Since I don't know what to do with -O and -OO, I > >> simply kept them :-D > >> > >> > I should also point out that this does get tricky in terms of how to > >> > handle > >> > the stdlib if you have not pre-compiled it, e.g., if the first module > >> > imported by Python is the encodings module then how to make sure the > AST > >> > optimizers are ready to go by the time that import happens? > >> > >> Since importlib reads sys.implementation.optim_tag at each import, it > >> works fine. > >> > >> For example, you start with "opt" optimizer tag. You import everything > >> needed for fatoptimizer. Then calling sys.set_code_transformers() will > >> set a new optimizer flag (ex: "fat-opt"). But it works since the > >> required code transformers are now available. > > > > > > I understand all of that; my point is what if you don't compile the > stdlib > > for your optimization? You have to import over 20 modules before user > code > > gets imported. My question is how do you expect the situation to be > handled > > where you didn't optimize the stdlib since the 'encodings' module is > > imported before anything else? If you set your `-o` flag and you want to > > fail imports if the .pyc isn't there, then wouldn't that mean you are > going > > to fail immediately when you try and import 'encodings' in > Py_Initialize()? > > I don't think that's a major problem - it seems to me that it's the > same as going for "pyc only" deployment with an embedded Python > interpreter, and then forgetting to a precompiled standard library in > addition to your own components. Yes, it's going to fail, but the bug > is in the build process for your deployment artifacts rather than in > the runtime behaviour of CPython. > It is the same, and that's my point. If we are going to enforce this import requirement of having a matching .pyc file in order to do a proper import, then we are already requiring an offline compilation which makes the dynamic registering of optimizers a lot less necessary. Now if we tweak the proposed semantics of `-o` to say "import these of kind of optimized .pyc file *if you can*, otherwise don't worry about it", then having registered optimizers makes more sense as that gets around the bootstrap problem with the stdlib. This would require optimizations to be module-level and not application-level, though. This also makes the difference between `-o` and `-O` even more prevalent as the latter is not only required, but then restricted to only optimizations which affect what syntax is executed instead of what AST transformations were applied. This also means that the file name of the .pyc files should keep `opt-1`, etc. 
and the AST transformation names get appended on as it would stack `-O` and `-o`. It really depends on what kinds of optimizations we expect people to do. If we expect application-level optimizations then we need to enforce universal importing of bytecode because it may make assumptions about other modules. But if we limit it to module-level optimizations then it isn't quite so critical that the .pyc files pre-exist, making it such that `-o` can be more of a request than a requirement for importing modules of a certain optimization. That also means if the same AST optimizers are not installed it's no big deal since you just work with what you have (although you could set it to raise an ImportWarning when an import didn't find a .pyc file of the requested optimization *and* the needed AST optimizers weren't available either). -------------- next part -------------- An HTML attachment was scrubbed... URL: From random832 at fastmail.com Mon Jan 18 11:54:32 2016 From: random832 at fastmail.com (Random832) Date: Mon, 18 Jan 2016 11:54:32 -0500 Subject: [Python-ideas] PEP 511: API for code transformers In-Reply-To: <569D16B1.6020708@gmail.com> References: <569A2452.1000709@gmail.com> <569D16B1.6020708@gmail.com> Message-ID: <1453136072.1159611.495409490.68458470@webmail.messagingengine.com> > >> >2016-01-16 12:06 GMT+01:00 Petr Viktorin: > >>> >>The PEP is designed optimizers. It would be good to stick to that use > >>> >>case, at least as far as the registration is concerned. I suggest noting > >>> >>in the documentation that Python semantics*must* be preserved, and > >>> >>renaming the API, e.g.:: > >>> >> > >>> >> sys.set_global_optimizers([]) > > On 17 January 2016 at 21:48, Victor Stinner wrote: > >> >I would prefer to not restrict the PEP to a specific usage. > On 2016-01-17 7:36 AM, Nick Coghlan wrote: > > The problem I see with making the documentation and naming too generic > > is that people won't know what the feature is useful for - a generic > > term like "transformer" accurately describes these units of code, but > > provides no hint as to why a developer might care about their > > existence. > > > > However, if the reason we're adding the capability is to make global > > static optimizers feasible, then we cam describe it accordingly (so > > the answer to "Why does this feature exist?" becomes relatively self > > evident), and have the fact that the feature can actually be used for > > arbitrary transforms be an added bonus rather than the core intent. On Mon, Jan 18, 2016, at 11:45, Yury Selivanov wrote: > +1. I think that it depends on how it's implemented. Having a _requirement_ that semantics _must_ be preserved suggests that they may not always be applied, or may not be applied in a deterministic order. From yselivanov.ml at gmail.com Mon Jan 18 12:04:31 2016 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Mon, 18 Jan 2016 12:04:31 -0500 Subject: [Python-ideas] PEP 511: API for code transformers In-Reply-To: <1453136072.1159611.495409490.68458470@webmail.messagingengine.com> References: <569A2452.1000709@gmail.com> <569D16B1.6020708@gmail.com> <1453136072.1159611.495409490.68458470@webmail.messagingengine.com> Message-ID: <569D1B1F.4080701@gmail.com> [..] 
>> On 2016-01-17 7:36 AM, Nick Coghlan wrote: >>> The problem I see with making the documentation and naming too generic >>> is that people won't know what the feature is useful for - a generic >>> term like "transformer" accurately describes these units of code, but >>> provides no hint as to why a developer might care about their >>> existence. >>> >>> However, if the reason we're adding the capability is to make global >>> static optimizers feasible, then we cam describe it accordingly (so >>> the answer to "Why does this feature exist?" becomes relatively self >>> evident), and have the fact that the feature can actually be used for >>> arbitrary transforms be an added bonus rather than the core intent. > On Mon, Jan 18, 2016, at 11:45, Yury Selivanov wrote: >> +1. > I think that it depends on how it's implemented. Having a _requirement_ > that semantics _must_ be preserved suggests that they may not always be > applied, or may not be applied in a deterministic order. It just won't be possible to enforce that "requirement". What Nick suggests (and I suggested in my email earlier in this thread) is that we should name the APIs clearly to avoid any confusion. `sys.set_code_transformers` is less clear about what it should be used for than `sys.set_code_optimizers`. Yury From random832 at fastmail.com Mon Jan 18 12:26:30 2016 From: random832 at fastmail.com (Random832) Date: Mon, 18 Jan 2016 12:26:30 -0500 Subject: [Python-ideas] PEP 511: API for code transformers In-Reply-To: <569D1B1F.4080701@gmail.com> References: <569A2452.1000709@gmail.com> <569D16B1.6020708@gmail.com> <1453136072.1159611.495409490.68458470@webmail.messagingengine.com> <569D1B1F.4080701@gmail.com> Message-ID: <1453137990.1166821.495443002.212A72D8@webmail.messagingengine.com> On Mon, Jan 18, 2016, at 12:04, Yury Selivanov wrote: > > I think that it depends on how it's implemented. Having a _requirement_ > > that semantics _must_ be preserved suggests that they may not always be > > applied, or may not be applied in a deterministic order. > > It just won't be possible to enforce that "requirement". I'm not talking about mechanically enforcing it. I'm talking about it being a documented requirement to write such code, and that people *should not* use this feature for things that need to be applied 100% of the time for their applications to work. Either we have to nail down exactly when and how these things are invoked so that people can rely on them, or they are only _useful_ for optimizations (and other semantic-preserving things like instrumentation) rather than arbitrary transformations. From haael at interia.pl Tue Jan 19 09:10:35 2016 From: haael at interia.pl (haael at interia.pl) Date: Tue, 19 Jan 2016 15:10:35 +0100 Subject: [Python-ideas] Explicit variable capture list Message-ID: Hi C++ has a nice feature of explicit variable capture list for lambdas: int a = 1, b = 2, c = 3; auto fun = [a, b, c](int x, int y){ return a + b + c + x + y}; This allows easy construction of closures. In Python to achieve that, you need to say: def make_closure(a, b, c): def fun(x, y): return a + b + c + x + y return fun a = 1 b = 2 c = 3 fun = make_closure(a, b, c) My proposal: create a special variable qualifier (like global and nonlocal) to automatically capture variables a = 1 b = 2 c = 3 def fun(x, y): capture a, b, c return a + b + c + x + y This will have the effect that the symbols a, b and c in the body of the function have the values they had at the moment of function creation.
The variables a, b, c must be defined at the time of function creation. If they are not, an error is thrown. The 'capture' qualifier may be combined with keywords global and nonlocal to change lookup behaviour. To make it more useful, we also need some syntax for inline lambdas. I.e.: a = 1 b = 2 c = 3 fun = lambda[a, b, c] x, y: a + b + c + x + y Thanks, haael From jeanpierreda at gmail.com Tue Jan 19 09:39:17 2016 From: jeanpierreda at gmail.com (Devin Jeanpierre) Date: Tue, 19 Jan 2016 06:39:17 -0800 Subject: [Python-ideas] Explicit variable capture list In-Reply-To: References: Message-ID: On Tue, Jan 19, 2016 at 6:10 AM, wrote: > > > Hi > > C++ has a nice feature of explicit variable capture list for lambdas: > > int a = 1, b = 2, c = 3; > auto fun = [a, b, c](int x, int y){ return a + b + c + x + y}; > > This allows easy construction of closures. In Python to achieve that, you need to say: This is worded very confusingly. Python has easy construction of closures with implicit variable capture. The difference has to do with "value semantics" in C++, which Python doesn't have. If you were using int* variables in your C++ example, you'd have the same semantics as Python does with its int references. > def make_closure(a, b, c): > def fun(x, y): > return a + b + c + x + y > return def > a = 1 > b = 2 > c = 3 > fun = make_closure(a, b, c) The usual workaround is actually: a = 1 b = 1 c = 1 def fun(x, y, a=a, b=b, c=c): return a + b + c + x + y -- Devin From jeanpierreda at gmail.com Tue Jan 19 09:45:59 2016 From: jeanpierreda at gmail.com (Devin Jeanpierre) Date: Tue, 19 Jan 2016 06:45:59 -0800 Subject: [Python-ideas] Explicit variable capture list In-Reply-To: References: Message-ID: Sorry, forget the first part entirely, I was still confused when I wrote it. Definitely the semantics of values are very different, but they don't matter for this. I think the rough equivalent of the capture-by-copy C++ lambda is the function definition I provided with default values. -- Devin On Tue, Jan 19, 2016 at 6:39 AM, Devin Jeanpierre wrote: > On Tue, Jan 19, 2016 at 6:10 AM, wrote: >> >> >> Hi >> >> C++ has a nice feature of explicit variable capture list for lambdas: >> >> int a = 1, b = 2, c = 3; >> auto fun = [a, b, c](int x, int y){ return a + b + c + x + y}; >> >> This allows easy construction of closures. In Python to achieve that, you need to say: > > This is worded very confusingly. Python has easy construction of > closures with implicit variable capture. > > The difference has to do with "value semantics" in C++, which Python > doesn't have. If you were using int* variables in your C++ example, > you'd have the same semantics as Python does with its int references. > >> def make_closure(a, b, c): >> def fun(x, y): >> return a + b + c + x + y >> return def >> a = 1 >> b = 2 >> c = 3 >> fun = make_closure(a, b, c) > > The usual workaround is actually: > > a = 1 > b = 1 > c = 1 > def fun(x, y, a=a, b=b, c=c): > return a + b + c + x + y > > -- Devin From tjreedy at udel.edu Tue Jan 19 10:23:36 2016 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 19 Jan 2016 10:23:36 -0500 Subject: [Python-ideas] Explicit variable capture list In-Reply-To: References: Message-ID: On 1/19/2016 9:10 AM, haael at interia.pl wrote: > > Hi > > C++ has a nice feature of explicit variable capture list for lambdas: > > int a = 1, b = 2, c = 3; > auto fun = [a, b, c](int x, int y){ return a + b + c + x + y}; > > This allows easy construction of closures. 
In Python to achieve that, you need to say: > > def make_closure(a, b, c): > def fun(x, y): > return a + b + c + x + y > return def > a = 1 > b = 2 > c = 3 > fun = make_closure(a, b, c) The purpose of writing a make_closure function is so it can be called more than once, to make more than one closure. f123 = make_closure(1, 2, 3) f456 = make_closure(4, 5, 6) > My proposal: create a special variable qualifier (like global and nonlocal) to automatically capture variables > > a = 1 > b = 2 > c = 3 > def fun(x, y): > capture a, b, c > return a + b + c + x + y > > This will have an effect that symbols a, b and c in the body of the function have values as they had at the moment of function creation. The variables a, b, c must be defined at the time of function creation. If they are not, an error is thrown. The 'capture' qualifier may be combined with keywords global and nonlocal to change lookup behaviour. This only allows one version of fun, not multiple, so it is not equivalent at all. As Devin stated, it is equivalent to to using parameters with default argument values. -- Terry Jan Reedy From abarnert at yahoo.com Tue Jan 19 11:22:51 2016 From: abarnert at yahoo.com (Andrew Barnert) Date: Tue, 19 Jan 2016 08:22:51 -0800 Subject: [Python-ideas] Explicit variable capture list In-Reply-To: References: Message-ID: On Jan 19, 2016, at 06:10, haael at interia.pl wrote: > > > Hi > > C++ has a nice feature of explicit variable capture list for lambdas: > > int a = 1, b = 2, c = 3; > auto fun = [a, b, c](int x, int y){ return a + b + c + x + y}; > > This allows easy construction of closures. In Python to achieve that, you need to say: > > def make_closure(a, b, c): > def fun(x, y): > return a + b + c + x + y > return def > a = 1 > b = 2 > c = 3 > fun = make_closure(a, b, c) > > My proposal: create a special variable qualifier (like global and nonlocal) to automatically capture variables > > a = 1 > b = 2 > c = 3 > def fun(x, y): > capture a, b, c > return a + b + c + x + y > > This will have an effect that symbols a, b and c in the body of the function have values as they had at the moment of function creation. The variables a, b, c must be defined at the time of function creation. If they are not, an error is thrown. The 'capture' qualifier may be combined with keywords global and nonlocal to change lookup behaviour. What you're suggesting is the exact opposite of what you say you're suggesting. Capturing a, b, and c in a closure is what Python already does. What you're trying to do is _not_ capture them and _not_ create a closure. So calling the statement "capture" is very misleading, and saying it "allows easy construction of closures" even more so. In C++ terms, this: def fun(x, y): return a + b + c + x + y means: auto fun = [&](int x, int y) { return a + b + c + x + y; }; It obviously doesn't mean this, as you imply: auto fun = [](int x, int y) { return a + b + c + x + y; }; ... because that just gives you a compile-time error saying that local variables a, b, and c aren't defined, which is not what Python does. If you're looking for a way to copy references to the values, instead of capturing the variables, you write this: def fun(x, y, a=a, b=b, c=c): return a + b + c + x + y And if you want to actually copy the values themselves, you have to do that explicitly (which has no visible effect for ints, of course, but think about lists or dicts here): def fun(x, y, a=copy.copy(a), b=copy.copy(b), c=copy.copy(c)): return a + b + c + x + y ... 
because Python, unlike C++, never automatically copies values. (Again, think about lists or dicts. If passing them to a function or storing them in a variable made an automatic copy, as in C++, you'd be wasting lots of time and space copying them all over the place. That's why you have to explicitly create vector& variables, or shared_ptr>, or pass around iterators instead of the container itself--because you almost never actually want to waste time and space making a copy if you're not mutating, and you almost always want the changes to be effective if you are mutating.) From guido at python.org Tue Jan 19 11:47:28 2016 From: guido at python.org (Guido van Rossum) Date: Tue, 19 Jan 2016 08:47:28 -0800 Subject: [Python-ideas] Explicit variable capture list In-Reply-To: References: Message-ID: I think it's reasonable to divert this discussion to "value capture". Not sure if that's the usual terminology, but the idea should be that a reference to the value is captured, rather than (as Python normally does with closures) a reference to the variable (implemented as something called a "cell"). (However let's please not consider whether the value should be copied or deep-copied. Just capture the object reference at the point the capture is executed.) The best syntax for such capture remains to be seen. ("Capture" seems to universally make people think of "variable capture" which is the opposite of what we want here.) On Tue, Jan 19, 2016 at 8:22 AM, Andrew Barnert via Python-ideas < python-ideas at python.org> wrote: > On Jan 19, 2016, at 06:10, haael at interia.pl wrote: > > > > > > Hi > > > > C++ has a nice feature of explicit variable capture list for lambdas: > > > > int a = 1, b = 2, c = 3; > > auto fun = [a, b, c](int x, int y){ return a + b + c + x + y}; > > > > This allows easy construction of closures. In Python to achieve that, > you need to say: > > > > def make_closure(a, b, c): > > def fun(x, y): > > return a + b + c + x + y > > return def > > a = 1 > > b = 2 > > c = 3 > > fun = make_closure(a, b, c) > > > > My proposal: create a special variable qualifier (like global and > nonlocal) to automatically capture variables > > > > a = 1 > > b = 2 > > c = 3 > > def fun(x, y): > > capture a, b, c > > return a + b + c + x + y > > > > This will have an effect that symbols a, b and c in the body of the > function have values as they had at the moment of function creation. The > variables a, b, c must be defined at the time of function creation. If they > are not, an error is thrown. The 'capture' qualifier may be combined with > keywords global and nonlocal to change lookup behaviour. > > What you're suggesting is the exact opposite of what you say you're > suggesting. Capturing a, b, and c in a closure is what Python already does. > What you're trying to do is _not_ capture them and _not_ create a closure. > So calling the statement "capture" is very misleading, and saying it > "allows easy construction of closures" even more so. > > In C++ terms, this: > > def fun(x, y): return a + b + c + x + y > > means: > > auto fun = [&](int x, int y) { return a + b + c + x + y; }; > > It obviously doesn't mean this, as you imply: > > auto fun = [](int x, int y) { return a + b + c + x + y; }; > > ... because that just gives you a compile-time error saying that local > variables a, b, and c aren't defined, which is not what Python does. 
> > If you're looking for a way to copy references to the values, instead of > capturing the variables, you write this: > > def fun(x, y, a=a, b=b, c=c): return a + b + c + x + y > > And if you want to actually copy the values themselves, you have to do > that explicitly (which has no visible effect for ints, of course, but think > about lists or dicts here): > > def fun(x, y, a=copy.copy(a), b=copy.copy(b), c=copy.copy(c)): return > a + b + c + x + y > > ... because Python, unlike C++, never automatically copies values. (Again, > think about lists or dicts. If passing them to a function or storing them > in a variable made an automatic copy, as in C++, you'd be wasting lots of > time and space copying them all over the place. That's why you have to > explicitly create vector& variables, or shared_ptr>, or > pass around iterators instead of the container itself--because you almost > never actually want to waste time and space making a copy if you're not > mutating, and you almost always want the changes to be effective if you are > mutating.) > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From storchaka at gmail.com Tue Jan 19 15:24:37 2016 From: storchaka at gmail.com (Serhiy Storchaka) Date: Tue, 19 Jan 2016 22:24:37 +0200 Subject: [Python-ideas] Explicit variable capture list In-Reply-To: References: Message-ID: On 19.01.16 18:47, Guido van Rossum wrote: > I think it's reasonable to divert this discussion to "value capture". > Not sure if that's the usual terminology, but the idea should be that a > reference to the value is captured, rather than (as Python normally does > with closures) a reference to the variable (implemented as something > called a "cell"). > > (However let's please not consider whether the value should be copied or > deep-copied. Just capture the object reference at the point the capture > is executed.) > > The best syntax for such capture remains to be seen. ("Capture" seems to > universally make people think of "variable capture" which is the > opposite of what we want here.) A number of variants of more powerful syntax were proposed in [1]. In neighbour topic Scott Sanderson had pointed to the asconstants decorator in codetransformer [2] that patches the code object by substituting a references to the variable with a reference to the constant. Ryan Gonzalez provided other implementation of similar decorator [3]. May be this feature doesn't need new syntax, but just new decorator in the stdlib. [1] http://comments.gmane.org/gmane.comp.python.ideas/37047 [2] http://permalink.gmane.org/gmane.comp.python.ideas/36958 [3] http://permalink.gmane.org/gmane.comp.python.ideas/37058 From rosuav at gmail.com Tue Jan 19 18:29:48 2016 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 20 Jan 2016 10:29:48 +1100 Subject: [Python-ideas] Explicit variable capture list In-Reply-To: References: Message-ID: On Wed, Jan 20, 2016 at 3:47 AM, Guido van Rossum wrote: > I think it's reasonable to divert this discussion to "value capture". Not > sure if that's the usual terminology, but the idea should be that a > reference to the value is captured, rather than (as Python normally does > with closures) a reference to the variable (implemented as something called > a "cell"). +1. 
This would permit deprecation of the "def blah(...., len=len):" optimization - all you need to do is set a value capture on the name "len". ChrisA From guido at python.org Tue Jan 19 18:43:12 2016 From: guido at python.org (Guido van Rossum) Date: Tue, 19 Jan 2016 15:43:12 -0800 Subject: [Python-ideas] Explicit variable capture list In-Reply-To: References: Message-ID: On Tue, Jan 19, 2016 at 12:24 PM, Serhiy Storchaka wrote: > > A number of variants of more powerful syntax were proposed in [1]. In > neighbour topic Scott Sanderson had pointed to the asconstants decorator in > codetransformer [2] that patches the code object by substituting a > references to the variable with a reference to the constant. Ryan Gonzalez > provided other implementation of similar decorator [3]. > > May be this feature doesn't need new syntax, but just new decorator in the > stdlib. > Hmm... Using a decorator would mean that you'd probably have to add quotes around the names of the variables whose values you want to capture, and it'd require hacking the bytecode. That would mean that it'd only work for CPython, and it'd not be a real part of the language. This feels like it wants to be a language-level feature, like nonlocal. > [1] http://comments.gmane.org/gmane.comp.python.ideas/37047 > [2] http://permalink.gmane.org/gmane.comp.python.ideas/36958 > [3] http://permalink.gmane.org/gmane.comp.python.ideas/37058 > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Tue Jan 19 18:51:15 2016 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 20 Jan 2016 10:51:15 +1100 Subject: [Python-ideas] Explicit variable capture list In-Reply-To: References: Message-ID: <20160119235115.GX10854@ando.pearwood.info> On Tue, Jan 19, 2016 at 03:10:35PM +0100, haael at interia.pl wrote: > > Hi > > C++ has a nice feature of explicit variable capture list for lambdas: > > int a = 1, b = 2, c = 3; > auto fun = [a, b, c](int x, int y){ return a + b + c + x + y}; For the benefit of those who don't speak C++, could you explain what that does? Are C++ name binding semantics the same as Python's? Specifically, inside fun, does "a" refer to the global a? If you rebind global a, what happens to fun? fun(0, 0) # returns 6 a = 0 fun(0, 0) # returns 5 or 6? > This allows easy construction of closures. In Python to achieve that, you need to say: > > def make_closure(a, b, c): > def fun(x, y): > return a + b + c + x + y > return def > a = 1 > b = 2 > c = 3 > fun = make_closure(a, b, c) I cannot tell whether the C++ semantics above are the same as the Python semantics here. Andrew's response to you suggests that it is not. > My proposal: create a special variable qualifier (like global and > nonlocal) to automatically capture variables "Variables" is an ambiguous term. I don't want to get into a debate about "Python doesn't have variables", but it's not clear what you mean here. Python certainly has names, and values, and when you talk about "variables" do you mean the name or the value or both? > a = 1 > b = 2 > c = 3 > def fun(x, y): > capture a, b, c > return a + b + c + x + y > > This will have an effect that symbols a, b and c in the body of the > function have values as they had at the moment of function creation. > The variables a, b, c must be defined at the time of function > creation. If they are not, an error is thrown. 
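(For reference, the Python closure version of that snippet is unambiguous; a quick check of the late-binding behaviour:)

    a, b, c = 1, 2, 3
    def fun(x, y):
        return a + b + c + x + y

    print(fun(0, 0))  # 6
    a = 0
    print(fun(0, 0))  # 5 -- the body looks up the global a at call time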
If I have understood you correctly, we can already do that in Python, and don't even need a closure: a, b, c = 1, 2, 3 fun = lambda x, y, a=a, b=b, c=c: a + b + c + x + y will capture the current *value* of GLOBAL a, b and c, store them as default values, and use them as the LOCAL a, b and c. You may consider it a strength or a weakness that they are exposed as regular function parameters: fun(x, y) # intended call signature fun(x, y, a, b, c) # actual call signature but if you really care about hiding the extra parameters, a second approach will work: from functools import partial a, b, c = 1, 2, 3 fun = partial(lambda a, b, c, x, y: a + b + c + x + y, a, b, c) If a, b, c are mutable objects, you can make a copy of the value: fun = partial(lambda a, b, c, x, y: a + b + c + x + y, a, b, copy.copy(c) ) for example. Does your proposal behave any differently from these examples? -- Steve From steve at pearwood.info Tue Jan 19 19:14:25 2016 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 20 Jan 2016 11:14:25 +1100 Subject: [Python-ideas] Explicit variable capture list In-Reply-To: References: Message-ID: <20160120001425.GY10854@ando.pearwood.info> On Wed, Jan 20, 2016 at 10:29:48AM +1100, Chris Angelico wrote: > On Wed, Jan 20, 2016 at 3:47 AM, Guido van Rossum wrote: > > I think it's reasonable to divert this discussion to "value capture". Not > > sure if that's the usual terminology, but the idea should be that a > > reference to the value is captured, rather than (as Python normally does > > with closures) a reference to the variable (implemented as something called > > a "cell"). > > +1. This would permit deprecation of the "def blah(...., len=len):" > optimization - all you need to do is set a value capture on the name > "len". Some might argue that the default argument trick is already the One Obvious Way to capture a value in a function. I don't think deprecation is the right word here, you can't deprecate "len=len" style code because it's just a special case of the more general name=expr function default argument syntax. I suppose a linter might complain if the expression on the right hand side is precisely the same as the name on the left, but _len=len would trivially work around that. -- Steve From rymg19 at gmail.com Tue Jan 19 19:33:55 2016 From: rymg19 at gmail.com (Ryan Gonzalez) Date: Tue, 19 Jan 2016 18:33:55 -0600 Subject: [Python-ideas] Explicit variable capture list In-Reply-To: <20160119235115.GX10854@ando.pearwood.info> References: <20160119235115.GX10854@ando.pearwood.info> Message-ID: <222A4DBC-B94E-4DAC-BCE1-46C4DB6E536C@gmail.com> On January 19, 2016 5:51:15 PM CST, Steven D'Aprano wrote: >On Tue, Jan 19, 2016 at 03:10:35PM +0100, haael at interia.pl wrote: >> >> Hi >> >> C++ has a nice feature of explicit variable capture list for lambdas: >> >> int a = 1, b = 2, c = 3; >> auto fun = [a, b, c](int x, int y){ return a + b + c + x + y}; > >For the benefit of those who don't speak C++, could you explain what >that does? Are C++ name binding semantics the same as Python's? > No. >Specifically, inside fun, does "a" refer to the global a? If you rebind >global a, what happens to fun? > >fun(0, 0) # returns 6 >a = 0 >fun(0, 0) # returns 5 or 6? > The given C++ lambda syntax copies the input parameters, so it would return 5. This would return 6: auto fun = [&a, &b, &c](int x, int y){ return a + b + c + x + y}; > > >> This allows easy construction of closures. 
In Python to achieve that, >you need to say: >> >> def make_closure(a, b, c): >> def fun(x, y): >> return a + b + c + x + y >> return def >> a = 1 >> b = 2 >> c = 3 >> fun = make_closure(a, b, c) > >I cannot tell whether the C++ semantics above are the same as the >Python >semantics here. Andrew's response to you suggests that it is not. > > > >> My proposal: create a special variable qualifier (like global and >> nonlocal) to automatically capture variables > >"Variables" is an ambiguous term. I don't want to get into a debate >about "Python doesn't have variables", but it's not clear what you mean > >here. Python certainly has names, and values, and when you talk about >"variables" do you mean the name or the value or both? > > >> a = 1 >> b = 2 >> c = 3 >> def fun(x, y): >> capture a, b, c >> return a + b + c + x + y >> >> This will have an effect that symbols a, b and c in the body of the >> function have values as they had at the moment of function creation. >> The variables a, b, c must be defined at the time of function >> creation. If they are not, an error is thrown. > >If I have understood you correctly, we can already do that in >Python, and don't even need a closure: > >a, b, c = 1, 2, 3 >fun = lambda x, y, a=a, b=b, c=c: a + b + c + x + y > > >will capture the current *value* of GLOBAL a, b and c, store them as >default values, and use them as the LOCAL a, b and c. > >You may consider it a strength or a weakness that they are exposed as >regular function parameters: > >fun(x, y) # intended call signature >fun(x, y, a, b, c) # actual call signature > >but if you really care about hiding the extra parameters, a second >approach will work: > >from functools import partial >a, b, c = 1, 2, 3 >fun = partial(lambda a, b, c, x, y: a + b + c + x + y, a, b, c) > > >If a, b, c are mutable objects, you can make a copy of the value: > >fun = partial(lambda a, b, c, x, y: a + b + c + x + y, > a, b, copy.copy(c) > ) > >for example. > >Does your proposal behave any differently from these examples? -- Sent from my Nexus 5 with K-9 Mail. Please excuse my brevity. From steve at pearwood.info Tue Jan 19 19:37:12 2016 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 20 Jan 2016 11:37:12 +1100 Subject: [Python-ideas] Explicit variable capture list In-Reply-To: References: Message-ID: <20160120003712.GZ10854@ando.pearwood.info> On Tue, Jan 19, 2016 at 08:47:28AM -0800, Guido van Rossum wrote: > I think it's reasonable to divert this discussion to "value capture". Not > sure if that's the usual terminology, but the idea should be that a > reference to the value is captured, rather than (as Python normally does > with closures) a reference to the variable (implemented as something called > a "cell"). If I understand you correctly, that's precisely what a function default argument does: capture the current value of the default value expression at the time the function is called. This has the side-effect of exposing that as an argument, which may be undesirable. partial() can be used to work around that. > (However let's please not consider whether the value should be copied or > deep-copied. Just capture the object reference at the point the capture is > executed.) > > The best syntax for such capture remains to be seen. ("Capture" seems to > universally make people think of "variable capture" which is the opposite > of what we want here.) If I recall correctly, there was a recent(?) proposal for a "static" keyword with similar semantics: def func(a): static b = expression ... would guarantee that expression was evaluated exactly once.
would guarantee that expression was evaluated exactly once. If that evaluation occured when func was defined, rather than when it was first called, that might be the semantics you are looking for: def func(a): static b = b # captures the value of b from the enclosing scope Scoping rules might be tricky to get right. Perhaps rather than a declaration, "static" might be better treated as a block: def func(a): static: # Function initialisation section. Occurs once, when the # def statement runs. b = b # b on the left is local, b on the right is non-local # (just like in a parameter list) # Normal function body goes here. But neither of these approaches would be good for lambdas. I'm okay with that -- lambda is a lightweight syntax, for lightweight needs. If your needs are great (doc strings, annotations, multiple statements) don't use lambda. -- Steve From rosuav at gmail.com Tue Jan 19 19:38:45 2016 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 20 Jan 2016 11:38:45 +1100 Subject: [Python-ideas] Explicit variable capture list In-Reply-To: <20160120001425.GY10854@ando.pearwood.info> References: <20160120001425.GY10854@ando.pearwood.info> Message-ID: On Wed, Jan 20, 2016 at 11:14 AM, Steven D'Aprano wrote: > On Wed, Jan 20, 2016 at 10:29:48AM +1100, Chris Angelico wrote: >> On Wed, Jan 20, 2016 at 3:47 AM, Guido van Rossum wrote: >> > I think it's reasonable to divert this discussion to "value capture". Not >> > sure if that's the usual terminology, but the idea should be that a >> > reference to the value is captured, rather than (as Python normally does >> > with closures) a reference to the variable (implemented as something called >> > a "cell"). >> >> +1. This would permit deprecation of the "def blah(...., len=len):" >> optimization - all you need to do is set a value capture on the name >> "len". > > Some might argue that the default argument trick is already the One > Obvious Way to capture a value in a function. I disagree. There is nothing obvious about this, outside of the fact that it's already used in so many places. It's not even obvious after looking at the code. > I don't think deprecation is the right word here, you can't deprecate > "len=len" style code because it's just a special case of the more > general name=expr function default argument syntax. I suppose a linter > might complain if the expression on the right hand side is precisely the > same as the name on the left, but _len=len would trivially work around > that. The deprecation isn't of named arguments with defaults, but of the use of that for no reason other than optimization. IMO function arguments should always exist primarily so a caller can override them. In contrast, random.randrange has a parameter _int which is not mentioned in the docs, and which should never be provided. Why should it even be exposed? It exists solely as an optimization. Big one for the bike-shedding: Is this "capture as local" (the same semantics as the default arg - if you rebind it, it changes for the current invocation only), or "capture as static" (the same semantics as a closure if you use the 'nonlocal' directive - if you rebind it, it stays changed), or "capture as constant" (what people are usually going to be doing anyway)? 
ChrisA From guido at python.org Tue Jan 19 20:01:42 2016 From: guido at python.org (Guido van Rossum) Date: Tue, 19 Jan 2016 17:01:42 -0800 Subject: [Python-ideas] Explicit variable capture list In-Reply-To: <20160120003712.GZ10854@ando.pearwood.info> References: <20160120003712.GZ10854@ando.pearwood.info> Message-ID: On Tue, Jan 19, 2016 at 4:37 PM, Steven D'Aprano wrote: > On Tue, Jan 19, 2016 at 08:47:28AM -0800, Guido van Rossum wrote: > > > I think it's reasonable to divert this discussion to "value capture". Not > > sure if that's the usual terminology, but the idea should be that a > > reference to the value is captured, rather than (as Python normally does > > with closures) a reference to the variable (implemented as something > called > > a "cell"). > > If I understand you correctly, that's precisely what a function default > argument does: capture the current value of the default value expression > at the time the function is called. > I think you misspoke here (I don't think you actually believe what you said :-). Function defaults capture the current value at the time the function is *defined*. > This has the side-effect of exposing that as an argument, which may be > undesirable. Indeed. It's also non-obvious to people who haven't seen it before. > partial() can be used to work around that. > Hardly. Adding a partial() call usually makes code *less* obvious. > > The best syntax for such capture remains to be seen. ("Capture" seems to > > universally make people think of "variable capture" which is the opposite > > of what we want here.) > > If I recall correctly, there was a recent(?) proposal for a "static" > keyword with similar semantics: > > def func(a): > static b = expression > ... > > > would guarantee that expression was evaluated exactly once. Once per what? In the lifetime of the universe? Per CPython process start? Per call? J/K, I think I know what you meant -- once per function definition (same as default values). > If that > evaluation occurred when func was defined, rather than when it was first > called, (FWIW, "when it was first called" would be a recipe for disaster and irreproducible results.) > that might be the semantics you are looking for: > > def func(a): > static b = b # captures the value of b from the enclosing scope > Yeah, I think the OP proposed 'capture b' with these semantics. > Scoping rules might be tricky to get right. Perhaps rather than a > declaration, "static" might be better treated as a block: > Why? This does smell like a directive similar to global and nonlocal. > def func(a): > static: > # Function initialisation section. Occurs once, when the > # def statement runs. > b = b # b on the left is local, b on the right is non-local > # (just like in a parameter list) > Hm, this repetition of the name in parameter lists is actually a strike against it, and the flexibility it adds (of allowing arbitrary expressions to be captured) doesn't seem to be needed much in reality -- the examples for the argument default pattern invariably use 'foo=foo, bar=bar'. > # Normal function body goes here. > > > > But neither of these approaches would be good for lambdas. I'm okay with > that -- lambda is a lightweight syntax, for lightweight needs. If your > needs are great (doc strings, annotations, multiple statements) don't > use lambda. > Yeah, the connection with lambdas in C++ is unfortunate. In C++, IIRC, the term lambda is used to refer to any function nested inside another, and that's the only place where closures exist.
-- 
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From random832 at fastmail.com  Tue Jan 19 23:44:00 2016
From: random832 at fastmail.com (Random832)
Date: Tue, 19 Jan 2016 23:44:00 -0500
Subject: [Python-ideas] Explicit variable capture list
References: <20160120003712.GZ10854@ando.pearwood.info>
Message-ID: 

Steven D'Aprano writes:
> But neither of these approaches would be good for lambdas. I'm okay with
> that -- lambda is a lightweight syntax, for lightweight needs. If your
> needs are great (doc strings, annotations, multiple statements) don't
> use lambda.

Yeah, but the fact that it's specifically part of C++'s lambda syntax
suggests that it is a very common thing to need with a lambda, doesn't
it?

What about... lambda a, = b: [stuff with captured value b] ?

From guido at python.org  Tue Jan 19 23:54:51 2016
From: guido at python.org (Guido van Rossum)
Date: Tue, 19 Jan 2016 20:54:51 -0800
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: 
References: <20160120003712.GZ10854@ando.pearwood.info>
Message-ID: 

On Tue, Jan 19, 2016 at 8:44 PM, Random832 wrote:

> Steven D'Aprano writes:
> > But neither of these approaches would be good for lambdas. I'm okay with
> > that -- lambda is a lightweight syntax, for lightweight needs. If your
> > needs are great (doc strings, annotations, multiple statements) don't
> > use lambda.
>
> Yeah, but the fact that it's specifically part of C++'s lambda syntax
> suggests that it is a very common thing to need with a lambda, doesn't
> it?

No, that's because in C++ "lambdas" are the only things with closures.

> What about... lambda a, = b: [stuff with captured value b] ?

Noooooo!

-- 
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ncoghlan at gmail.com  Wed Jan 20 01:37:48 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 20 Jan 2016 16:37:48 +1000
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: 
References: <20160120001425.GY10854@ando.pearwood.info>
Message-ID: 

On 20 January 2016 at 10:38, Chris Angelico wrote:
> Big one for the bike-shedding: Is this "capture as local" (the same
> semantics as the default arg - if you rebind it, it changes for the
> current invocation only), or "capture as static" (the same semantics
> as a closure if you use the 'nonlocal' directive - if you rebind it,
> it stays changed), or "capture as constant" (what people are usually
> going to be doing anyway)?

The "shared value" approach can already be achieved by binding a
mutable object rather than an immutable one, and there's no runtime
speed difference between looking up a local and looking up a constant,
so I think it makes sense to just stick with "default argument
semantics, but without altering the function signature"

One possible name for such a directive would be "sharedlocal": it's in
most respects a local variable, but the given definition time
initialisation value is shared across all invocations of the function.

With that spelling:

    def f(*, len=len):
        ...

Would become:

    def f():
        sharedlocal len=len
        ...

And you'd also be able to do things like:

    def f():
        sharedlocal cache={}

Alternatively, if we just wanted to support early binding of
pre-existing names, then "bindlocal" could work:

    def f():
        bindlocal len
        ...

Either approach could be used to handle early binding of loop
iteration variables:

    for i in range(10):
        def f():
            sharedlocal i=i
            ...
    for i in range(10):
        def f():
            bindlocal i
            ...

I'd be -1 on bindlocal (I think dynamic optimisers like PyPy or Numba,
or static ones like Victor's FAT Python project are better answers
there), but "sharedlocal" is more interesting, since it means you can
avoid creating a closure if all you need is to persist a bit of state
between invocations of a function.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ipipomme+python at gmail.com  Wed Jan 20 06:13:44 2016
From: ipipomme+python at gmail.com (Alexandre Figura)
Date: Wed, 20 Jan 2016 12:13:44 +0100
Subject: [Python-ideas] Fwd: Why do equality tests between OrderedDict keys/values views behave not as expected?
In-Reply-To: 
References: <659951373.398716.1450406333370.JavaMail.yahoo@mail.yahoo.com> <20151218110755.GH1609@ando.pearwood.info>
Message-ID: 

If we put technical considerations aside, maybe we should just ask
ourselves what behavior we expect when doing equality tests between
ordered dictionaries. As a reminder:

    >>> xy = OrderedDict([('x', None), ('y', None)])
    >>> yx = OrderedDict([('y', None), ('x', None)])
    >>> xy == yx
    False
    >>> xy.items() == yx.items()
    True
    >>> xy.keys() == yx.keys()
    True
    >>> xy.values() == yx.values()
    False

So, it appears that:
1. equality tests between odict_values use object identity and not equality,
2. equality tests between odict_keys do not respect order.

If it is not technically possible to change the current implementation,
maybe all we can do is just add a warning about current behavior in the
documentation?

On Mon, Jan 11, 2016 at 4:17 AM, Guido van Rossum wrote:

> Seems like we dropped the ball... Is there any action item here?
>
> --
> --Guido van Rossum (python.org/~guido)
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From agustin.herranz at gmail.com  Wed Jan 20 09:27:37 2016
From: agustin.herranz at gmail.com (Agustín Herranz Cecilia)
Date: Wed, 20 Jan 2016 15:27:37 +0100
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to support Python 2.7
Message-ID: <569F9959.3020202@gmail.com>

Hi! I've come to this thread late and by coincidence, but after reading
the whole thread I want to share some thoughts:

The main concern is to add a way to bring some kind of gradual typing to
Python 2 code. Because it's working Python 2 code it can't use
annotations, and they can't be added in the code, so a type comment is
the way to go (independent of the convenience, or not, of having type
annotations available at runtime).

Someone pointed out that using a comment with the Python 3 annotation
signature is good to educate people on how to use annotations. I feel
that's not the point; the point is for people to get used to type hints.
For this the same syntax must be used across different Python versions,
so 'function type comments' must also be available in Python 3 code;
this also allows people who can't/won't use annotations to use type
hinting.

For this, I don't believe that the section "Suggested syntax for Python
2.7 and straddling code" added to PEP 484 is the correct way to go; the
proper way is to add type comments for functions, as an extension of the
"Type comments" section or perhaps in a new PEP.
Some concerns that must be taken into account to add function type
comments:

- Using the 'type comments' syntax of PEP 484, a function signature
  should look like this:

      def func(arg1, arg2):
          # type: Callable[[int, int], int]
          """ Do something """
          return arg1 + arg2

- This easily becomes a long line, which breaks PEP 8 and makes linters
  complain. So a way to put the type comment on another line needs to be
  defined. Should the type comment be put on the line after, or the line
  before? Would putting it on another line only be available for function
  type comments, or for other type comments too?

- With some kinds of complex types the type comment surely becomes a
  long line too; how would type comments be broken across different
  lines?

- GvR's proposal includes some kind of syntactic sugar for function type
  comments (" # type: (t_arg1, t_arg2) -> t_ret "). I think it's good,
  but this must be an alternative to the typing module syntax (PEP 484),
  not the preferred way (so people get used to type hints). Is this
  syntactic sugar compatible with generators? Could the type analyzers
  differentiate between a Callable and a Generator?

More concerns on type comments:

- As this is intended to gradually type Python 2 code to port it to
  Python 3, I think it's convenient to add some sort of import that would
  only be used for type checking, and only be imported by the type
  analyzer, not the runtime. This could be achieved by prepending
  "# type: " to the normal import statement, something like:

      # type: import module
      # type: from package import module

- It must also be addressed how this works in a Python 2 to Python 3
  environment, as there are types with the same name, str for example,
  that work differently in each Python version. If the code is for only
  one version, use the type names of that version. For 2/3 code, types
  could be defined with a "py2" prefix in a module that could be called
  "py2types", having "py2str", for example, to mark things that are of
  the Python 2 str type. Python 3 types would not have prefixes.

I hope this reasoning/ideas will be useful. Also I hope that I have
expressed myself well enough; English is not my mother tongue.

Agustín Herranz.

From guido at python.org  Wed Jan 20 11:48:56 2016
From: guido at python.org (Guido van Rossum)
Date: Wed, 20 Jan 2016 08:48:56 -0800
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: 
References: <20160120001425.GY10854@ando.pearwood.info>
Message-ID: 

But 'shared' and 'local' are both the wrong words to use here. Also
probably this should syntactically be tied to the function header so the
time of evaluation is clear(er).

On Tue, Jan 19, 2016 at 10:37 PM, Nick Coghlan wrote:

> On 20 January 2016 at 10:38, Chris Angelico wrote:
> > Big one for the bike-shedding: Is this "capture as local" (the same
> > semantics as the default arg - if you rebind it, it changes for the
> > current invocation only), or "capture as static" (the same semantics
> > as a closure if you use the 'nonlocal' directive - if you rebind it,
> > it stays changed), or "capture as constant" (what people are usually
> > going to be doing anyway)?
>
> The "shared value" approach can already be achieved by binding a
> mutable object rather than an immutable one, and there's no runtime
> speed difference between looking up a local and looking up a constant,
> so I think it makes sense to just stick with "default argument
> semantics, but without altering the function signature"
>
> One possible name for such a directive would be "sharedlocal": it's in
> most respects a local variable, but the given definition time
> initialisation value is shared across all invocations of the function.
>
> With that spelling:
>
>     def f(*, len=len):
>         ...
>
> Would become:
>
>     def f():
>         sharedlocal len=len
>         ...
>
> And you'd also be able to do things like:
>
>     def f():
>         sharedlocal cache={}
>
> Alternatively, if we just wanted to support early binding of
> pre-existing names, then "bindlocal" could work:
>
>     def f():
>         bindlocal len
>         ...
>
> Either approach could be used to handle early binding of loop
> iteration variables:
>
>     for i in range(10):
>         def f():
>             sharedlocal i=i
>             ...
>
>     for i in range(10):
>         def f():
>             bindlocal i
>             ...
>
> I'd be -1 on bindlocal (I think dynamic optimisers like PyPy or Numba,
> or static ones like Victor's FAT Python project are better answers
> there), but "sharedlocal" is more interesting, since it means you can
> avoid creating a closure if all you need is to persist a bit of state
> between invocations of a function.
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>

-- 
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From abarnert at yahoo.com  Wed Jan 20 12:42:05 2016
From: abarnert at yahoo.com (Andrew Barnert)
Date: Wed, 20 Jan 2016 09:42:05 -0800
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to support Python 2.7
In-Reply-To: <569F9959.3020202@gmail.com>
References: <569F9959.3020202@gmail.com>
Message-ID: 

On Jan 20, 2016, at 06:27, Agustín Herranz Cecilia wrote:
>
> - GvR's proposal includes some kind of syntactic sugar for function type
> comments (" # type: (t_arg1, t_arg2) -> t_ret "). I think it's good, but
> this must be an alternative to the typing module syntax (PEP 484), not
> the preferred way (so people get used to type hints). Is this syntactic
> sugar compatible with generators? Could the type analyzers differentiate
> between a Callable and a Generator?

I'm pretty sure Generator is not the type of a generator function, but of
a generator object. So to type a generator function, you just write `(int,
int) -> Generator[int]`. Or, the long way, `Function[[int, int],
Generator[int]]`.

(Of course you can use Callable instead of the more specific Function, or
Iterator (or even Iterable) instead of the more specific Generator, if you
want to be free to change the implementation to use an iterator class or
something later, but normally you'd want the most specific type, I think.)

> - As this is intended to gradually type Python 2 code to port it to
> Python 3, I think it's convenient to add some sort of import that would
> only be used for type checking, and only be imported by the type
> analyzer, not the runtime.
> This could be achieved by prepending "# type: " to the normal import
> statement, something like:
>
>     # type: import module
>     # type: from package import module

That sounds like a bad idea. If the typing module shadows some global, you
won't get any errors, but your code will be misleading to a reader (and
even worse if you from package.module import t). If the cost of the import
is too high for Python 2, surely it's also too high for Python 3. And what
other reason do you have for skipping it?

> - It must also be addressed how this works in a Python 2 to Python 3
> environment, as there are types with the same name, str for example, that
> work differently in each Python version. If the code is for only one
> version, use the type names of that version.

That's the same problem that exists at runtime, and people (and tools)
already know how to deal with it: use bytes when you mean bytes, unicode
when you mean unicode, and str when you mean whatever is "native" to the
version you're running under and are willing to deal with it. So now you
just have to do the same thing in type hints that you're already doing in
constructors, isinstance checks, etc.

Of course many people use libraries like six to help them deal with this,
which means that those libraries have to be type-hinted appropriately for
both versions (maybe using different stubs for py2 and py3, with the right
one selected at pip install time?), but if that's taken care of, user code
should just work.

From srkunze at mail.de  Wed Jan 20 13:39:09 2016
From: srkunze at mail.de (Sven R. Kunze)
Date: Wed, 20 Jan 2016 19:39:09 +0100
Subject: [Python-ideas] Fwd: Why do equality tests between OrderedDict keys/values views behave not as expected?
In-Reply-To: 
References: <659951373.398716.1450406333370.JavaMail.yahoo@mail.yahoo.com> <20151218110755.GH1609@ando.pearwood.info>
Message-ID: <569FD44D.80008@mail.de>

Documentation is a very good idea.

Maybe even raise an error when comparing values.

Best,
Sven

On 20.01.2016 12:13, Alexandre Figura wrote:
> If we put technical considerations aside, maybe we should just ask
> ourselves what behavior we expect when doing equality tests between
> ordered dictionaries. As a reminder:
>
>     >>> xy = OrderedDict([('x', None), ('y', None)])
>     >>> yx = OrderedDict([('y', None), ('x', None)])
>     >>> xy == yx
>     False
>     >>> xy.items() == yx.items()
>     True
>     >>> xy.keys() == yx.keys()
>     True
>     >>> xy.values() == yx.values()
>     False
>
> So, it appears that:
> 1. equality tests between odict_values use object identity and not
> equality,
> 2. equality tests between odict_keys do not respect order.
>
> If it is not technically possible to change the current
> implementation, maybe all we can do is just add a warning about
> current behavior in the documentation?
>
> On Mon, Jan 11, 2016 at 4:17 AM, Guido van Rossum wrote:
>
> >     Seems like we dropped the ball... Is there any action item here?
> >
> >     --
> >     --Guido van Rossum (python.org/~guido)
> >
> >     _______________________________________________
> >     Python-ideas mailing list
> >     Python-ideas at python.org
> >     https://mail.python.org/mailman/listinfo/python-ideas
> >     Code of Conduct: http://python.org/psf/codeofconduct/
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From tjreedy at udel.edu  Wed Jan 20 13:58:46 2016
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 20 Jan 2016 13:58:46 -0500
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: 
References: <20160120001425.GY10854@ando.pearwood.info>
Message-ID: 

On 1/20/2016 11:48 AM, Guido van Rossum wrote:
> But 'shared' and 'local' are both the wrong words to use here. Also
> probably this should syntactically be tied to the function header so the
> time of evaluation is clear(er).

Use ';' in the parameter list, followed by name=expr pairs. The
question is whether names after the ';' are initialized local variables,
subject to rebinding at runtime, or named constants, with the names
replaced by the values at definition time. In the former case, a type
hint could be included. In the latter case, which is much better for
optimization, the fixed object would already be typed.

    def f(int a, int b=1; int c=2) => int

-- 
Terry Jan Reedy

From yselivanov.ml at gmail.com  Wed Jan 20 14:05:17 2016
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Wed, 20 Jan 2016 14:05:17 -0500
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: 
References: <20160120001425.GY10854@ando.pearwood.info>
Message-ID: <569FDA6D.6000006@gmail.com>

Nick,

On 2016-01-20 1:37 AM, Nick Coghlan wrote:
> On 20 January 2016 at 10:38, Chris Angelico wrote:
>> >Big one for the bike-shedding: Is this "capture as local" (the same
>> >semantics as the default arg - if you rebind it, it changes for the
>> >current invocation only), or "capture as static" (the same semantics
>> >as a closure if you use the 'nonlocal' directive - if you rebind it,
>> >it stays changed), or "capture as constant" (what people are usually
>> >going to be doing anyway)?
> The "shared value" approach can already be achieved by binding a
> mutable object rather than an immutable one, and there's no runtime
> speed difference between looking up a local and looking up a constant,
> so I think it makes sense to just stick with "default argument
> semantics, but without altering the function signature"
>
> One possible name for such a directive would be "sharedlocal": it's in
> most respects a local variable, but the given definition time
> initialisation value is shared across all invocations of the function.
>
> With that spelling:
>
>     def f(*, len=len):
>         ...
>
> Would become:
>
>     def f():
>         sharedlocal len=len

FWIW I strongly believe that this feature (at least the
"len=len"-like optimizations) should be implemented as an
optimization in the interpreter.

We already have "nonlocal" and "global". Having a third
modifier (such as sharedlocal, static, etc) will only
introduce confusion and make Python less comprehensible.

Yury

From abarnert at yahoo.com  Wed Jan 20 15:42:47 2016
From: abarnert at yahoo.com (Andrew Barnert)
Date: Wed, 20 Jan 2016 20:42:47 +0000 (UTC)
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <569FDA6D.6000006@gmail.com>
References: <569FDA6D.6000006@gmail.com>
Message-ID: <1598165082.6579073.1453322567057.JavaMail.yahoo@mail.yahoo.com>

On Wednesday, January 20, 2016 11:05 AM, Yury Selivanov wrote:

> FWIW I strongly believe that this feature (at least the
> "len=len"-like optimizations) should be implemented as an
> optimization in the interpreter.

The problem is that there are two reasonable interpretations for free
variables--variable capture or value capture--and Python can only do one
or the other automatically.
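For instance, here is a minimal sketch of the two readings, using the
loop case that keeps coming up in this thread:

    # Variable capture (what closures do today): every function sees
    # the final value of i.
    adders = [lambda x: x + i for i in range(3)]
    assert [f(0) for f in adders] == [2, 2, 2]

    # Value capture (the default-value trick): each function keeps the
    # value i had when the function was defined.
    adders = [lambda x, i=i: x + i for i in range(3)]
    assert [f(0) for f in adders] == [0, 1, 2]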
Python does variable capture, because that's what you usually want.[*]
But when you _do_ want value capture, you need some way to signal it.

In some cases, the only reason you want value capture is as an
optimization, and maybe the optimizer can handle that for you. But
sometimes there's a semantic reason you want it--such as the well known
case (covered in the official Python Programming FAQ [1]) where you're
trying to capture the separate values of an iteration variable in a bunch
of separate functions defined in the loop. And we need some way to spell
that.

Of course we already have a way to spell that, the `a=a` default value
trick. And I personally think that's good enough. But if the community
disagrees, and we come up with a new syntax, I don't see why we should
stop people from also using that new syntax for the optimization case
when they know they want it.[**]

[*] Note that in C++, which people keep referring to, the Core Guidelines
suggest using variable capture by default. And their main exception--use
value capture when you need to keep something around beyond the lifetime
of its original scope, because otherwise you'd get a dangling reference
to a destroyed object--doesn't apply to Python.

[**] I don't think people are abusing the default-value trick for
optimization--I generally only see `len=len` in low-level library code
that may end up getting used inside an inner loop--so I doubt they'd
abuse any new syntax for the same thing.

[1] https://docs.python.org/3/faq/programming.html#why-do-lambdas-defined-in-a-loop-with-different-values-all-return-the-same-result

> We already have "nonlocal" and "global". Having a third
> modifier (such as sharedlocal, static, etc) will only
> introduce confusion and make Python less comprehensible.

I agree with that. Also, none of the names people are proposing make much
sense. "static" looks like a function-level static in C and its
descendants, but does something completely different. "capture" means the
exact opposite of what it says, and "sharedlocal" sounds like it's going
to be "more shared" than the default for free variables when it's
actually less shared.

From abarnert at yahoo.com  Wed Jan 20 15:55:30 2016
From: abarnert at yahoo.com (Andrew Barnert)
Date: Wed, 20 Jan 2016 20:55:30 +0000 (UTC)
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: 
References: 
Message-ID: <112116328.6521148.1453323330328.JavaMail.yahoo@mail.yahoo.com>

On Wednesday, January 20, 2016 10:59 AM, Terry Reedy wrote:

> Use ';' in the parameter list, followed by name=expr pairs.

This is the best option anyone's suggested (short of just not doing
anything and telling people to keep using the default-value trick on the
rare occasions where this is necessary).

However, I'd suggest one minor change: for the common case of `j=j,
len=len`, allow people to just write the name once. The compiler can
definitely handle this:

    def spam(eggs; _i=i, j, len):

> The question is whether names after the ';' are initialized local
> variables, subject to rebinding at runtime, or named constants, with
> the names replaced by the values at definition time.

They almost certainly should be variables, just like parameters, with the
values stored in `__defaults__`. Otherwise, this code:

    powers = [lambda x; i: x**i for i in range(5)]

... produces functions with their own separate code objects, instead of
functions that share a single code object.
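As a quick CPython check of that sharing, using the existing
default-value spelling (since the `;` syntax doesn't exist yet):

    powers = [lambda x, i=i: x**i for i in range(5)]
    assert powers[0].__code__ is powers[1].__code__    # one shared code object
    assert [f(2) for f in powers] == [1, 2, 4, 8, 16]  # distinct __defaults__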
And this isn't some weird use case; the "defining functions in a loop
that capture the loop iterator by value" is the paradigm case for this
new feature. (It's even covered in the official FAQ.)

The performance cost of those separate code objects (and the cache misses
caused when you try to use them in a loop) has almost no compensating
performance gain (`LOAD_CONST` isn't faster than `LOAD_FAST`, and the
initial copy from `__defaults__` at call time is about 1/10th the cost of
either). And it's more complicated to implement (especially from where we
are today), and less flexible for reflective code that munges functions.

> In the former case, a type hint could be included. In the latter case,
> which is much better for optimization, the fixed object would already
> be typed.
>
>     def f(int a, int b=1; int c=2) => int

You've got the syntax wrong. But, more importantly, besides the latter
case (const vs. default) actually being worse for optimization, it isn't
any better for type inference. In this function:

    def f(a: int, b: int=1; c=2) -> int:

or even this one:

    def f():
        for i in range(5):
            def local(x: int; i) -> int:
                return x**i
            yield local

... the type checker can infer the type of `i`: it's initialized with an
int literal (first version) or the value of a variable that's been
inferred as an int; therefore, it's an int. So it can emit a warning if
you assign anything but another int to it.

The only problem with your solution is that we now have three different
variations that are all spelled very differently:

    def spam(i; j):        # captured by value

    def spam(i):
        nonlocal j         # captured by variable

    def spam(i):           # captured by variable if no assignment,
                           # else shadowed by a local

Is that acceptable?

From mike at selik.org  Wed Jan 20 18:56:00 2016
From: mike at selik.org (Michael Selik)
Date: Wed, 20 Jan 2016 23:56:00 +0000
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <569FDA6D.6000006@gmail.com>
References: <20160120001425.GY10854@ando.pearwood.info> <569FDA6D.6000006@gmail.com>
Message-ID: 

On Wed, Jan 20, 2016 at 2:05 PM Yury Selivanov wrote:
> On 2016-01-20 1:37 AM, Nick Coghlan wrote:
> > On 20 January 2016 at 10:38, Chris Angelico wrote:
> > With that spelling:
> >
> >     def f(*, len=len):
> >         ...
> >
> > Would become:
> >
> >     def f():
> >         sharedlocal len=len
>
> FWIW I strongly believe that this feature (at least the
> "len=len"-like optimizations) should be implemented as an
> optimization in the interpreter.
>
> We already have "nonlocal" and "global". Having a third
> modifier (such as sharedlocal, static, etc) will only
> introduce confusion and make Python less comprehensible.

If the purpose is to improve speed, it certainly feels like an
interpreter optimization. The other thread about adding ``ma_version``
to dicts might be useful for quickening the global variable lookup.

If the purpose is to store the current global value, it might be
reasonable to add a language feature to make that more explicit.
Beginners often mistakenly think that default values are evaluated and
assigned at call-time instead of def-time. However, adding a new, more
explicit language feature wouldn't eliminate the current confusion.
Instead we'd have two ways to do it.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From steve at pearwood.info  Wed Jan 20 19:10:32 2016
From: steve at pearwood.info (Steven D'Aprano)
Date: Thu, 21 Jan 2016 11:10:32 +1100
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: 
References: <20160120003712.GZ10854@ando.pearwood.info>
Message-ID: <20160121001027.GB4619@ando.pearwood.info>

On Tue, Jan 19, 2016 at 05:01:42PM -0800, Guido van Rossum wrote:
> On Tue, Jan 19, 2016 at 4:37 PM, Steven D'Aprano wrote:
>
> > On Tue, Jan 19, 2016 at 08:47:28AM -0800, Guido van Rossum wrote:
> >
> > > I think it's reasonable to divert this discussion to "value capture".
[...]
> > If I understand you correctly, that's precisely what a function default
> > argument does: capture the current value of the default value expression
> > at the time the function is called.
>
> I think you misspoke here (I don't think you actually believe what you
> said :-).
>
> Function defaults capture the current value at the time the function is
> *defined*.

Oops! You got me. Yes, I meant defined, not called.

[...]
> > > The best syntax for such capture remains to be seen. ("Capture" seems to
> > > universally make people think of "variable capture" which is the opposite
> > > of what we want here.)
> >
> > If I recall correctly, there was a recent(?) proposal for a "static"
> > keyword with similar semantics:
> >
> >     def func(a):
> >         static b = expression
> >         ...
> >
> > would guarantee that expression was evaluated exactly once.
>
> Once per what? In the lifetime of the universe? Per CPython process start?
> Per call?
>
> J/K, I think I know what you meant -- once per function definition (same as
> default values).

That's what I mean. Although, I am curious as to how we might implement
the once per lifetime of the universe requirement :-)

> > If that evaluation occurred when func was defined, rather than when it
> > was first called,
>
> (FWIW, "when it was first called" would be a recipe for disaster and
> irreproducible results.)

It probably would be a bug magnet. Good thing I'm not asking for that
behaviour then :-)

[...]
> > Scoping rules might be tricky to get right. Perhaps rather than a
> > declaration, "static" might be better treated as a block:
>
> Why? This does smell like a directive similar to global and nonlocal.

I'm just tossing the "static block" idea out for discussion, but if you
want a justification here are two differences between capture/static
and global/nonlocal which suggest they aren't that similar and so we
shouldn't feel obliged to use the same syntax.

(1) global and nonlocal operate on *names*, not values. E.g. after
"global x", x refers to a name in the global scope, not the local scope.

But "capture"/"static" doesn't affect the name, or the scope that x
belongs to. x is still a local, it just gets pre-initialised to the
value of x in the enclosing scope. That makes it more of a binding
operation or assignment than a declaration.

(2) If we limit this to only capturing the same name, then we can only
write (say) "static x", and that does look like a declaration.
But maybe we want to allow the local name to differ from the global name:

    static x = y

or even arbitrary expressions on the right:

    static x = x + 1

Now that starts to look more like it should be in a block of code,
especially if you have a lot of them:

    static x = x + 1
    static len = len
    static data = open("data.txt").read()

versus:

    static:
        x = x + 1
        len = len
        data = open("data.txt").read()

I acknowledge that this goes beyond what the OP asked for, and I think
that YAGNI is a reasonable response to the static block idea. I'm not
going to champion it any further unless there's a bunch of interest from
others. (I'm saving my energy for Eiffel-like require/ensure blocks
*wink*).

-- 
Steve

From guido at python.org  Wed Jan 20 19:11:03 2016
From: guido at python.org (Guido van Rossum)
Date: Wed, 20 Jan 2016 16:11:03 -0800
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to support Python 2.7
In-Reply-To: 
References: <569F9959.3020202@gmail.com>
Message-ID: 

On Wed, Jan 20, 2016 at 9:42 AM, Andrew Barnert via Python-ideas <
python-ideas at python.org> wrote:

> On Jan 20, 2016, at 06:27, Agustín Herranz Cecilia <
> agustin.herranz at gmail.com> wrote:
> >
> > - GvR's proposal includes some kind of syntactic sugar for function type
> > comments (" # type: (t_arg1, t_arg2) -> t_ret "). I think it's good, but
> > this must be an alternative to the typing module syntax (PEP 484), not
> > the preferred way (so people get used to type hints). Is this syntactic
> > sugar compatible with generators? Could the type analyzers differentiate
> > between a Callable and a Generator?
>
> I'm pretty sure Generator is not the type of a generator function, but of
> a generator object. So to type a generator function, you just write `(int,
> int) -> Generator[int]`. Or, the long way, `Function[[int, int],
> Generator[int]]`.
>

There is no 'Function' -- it existed in mypy before PEP 484 but was
replaced by 'Callable'. And you don't annotate a function def with '->
Callable' (unless it returns another function). The Callable type is only
needed in the signature of higher-order functions, i.e. functions that
take functions for arguments or return a function. For example, a simple
map function would be written like this:

    def map(f: Callable[[T], S], a: List[T]) -> List[S]: ...

As to generators, we just improved how mypy treats generators (
https://github.com/JukkaL/mypy/commit/d8f72279344f032e993a3518c667bba813ae041a).
The Generator type has *three* parameters: the "yield" type (what's
yielded), the "send" type (what you send() into the generator, and what's
returned by yield), and the "return" type (what a return statement in the
generator returns, i.e. the value for the StopIteration exception). You
can also use Iterator if your generator doesn't expect its send() or
throw() messages to be called and it isn't returning a value for the
benefit of `yield from'.
For example, here's a simple generator that iterates over a list of
strings, skipping alternating values:

    def skipper(a: List[str]) -> Iterator[str]:
        for i, s in enumerate(a):
            if i%2 == 0:
                yield s

and here's a coroutine returning a string (I know, it's pathetic, but
it's an example :-):

    @asyncio.coroutine
    def readchar() -> Generator[Any, None, str]:
        # Implementation not shown

    @asyncio.coroutine
    def readline() -> Generator[Any, None, str]:
        buf = ''
        while True:
            c = yield from readchar()
            if not c:
                break
            buf += c
            if c == '\n':
                break
        return buf

Here, in Generator[Any, None, str], the first parameter ('Any') refers to
the type yielded -- it actually yields Futures, but we don't care about
that (it's an asyncio implementation detail). The second parameter
('None') is the type returned by yield -- again, it's an implementation
detail and we might just as well say 'Any' here. The third parameter
(here 'str') is the type actually returned by the 'return' statement.
It's illustrative to observe that the signature of readchar() is exactly
the same (since it also returns a string). OTOH the return type of e.g.
asyncio.sleep() is Generator[Any, None, None], because it doesn't return
a value.

This business is clearly still suboptimal -- we would like to introduce a
new type, perhaps named Coroutine, so that you can write Coroutine[T]
instead of Generator[Any, None, T]. But that would just be a shorthand.
The actual type of a generator object is always some parametrization of
Generator.

In any case, whatever we write after the -> (i.e., the return type) is
still the type of the value you get when you call the function. If the
function is a generator function, the value you get is a generator
object, and that's what the return type designates.

> (Of course you can use Callable instead of the more specific Function, or
> Iterator (or even Iterable) instead of the more specific Generator, if you
> want to be free to change the implementation to use an iterator class or
> something later, but normally you'd want the most specific type, I think.)

I don't know where you read about Callable vs. Function.

Regarding using Iterator[T] instead of Generator[..., ..., T], you are
correct. Note that you *cannot* define a generator function as returning
a *subclass* of Iterator/Generator; there is no way to have a generator
function instantiate some other class as its return value. Consider
(ignoring generic types):

    class MyIterator:
        def __next__(self): ...
        def __iter__(self): ...
        def bar(self): ...

    def foo() -> MyIterator:
        yield

    x = foo()
    x.bar()  # Boom!

The type checker would assume that x has a method bar() based on the
declared return type for foo(), but it doesn't. (There are a few other
special cases, in addition to Generator and Iterator; declaring the
return type to be Any or object is allowed.)

> > - As this is intended to gradually type Python 2 code to port it to
> > Python 3, I think it's convenient to add some sort of import that would
> > only be used for type checking, and only be imported by the type
> > analyzer, not the runtime. This could be achieved by prepending
> > "# type: " to the normal import statement, something like:
> >     # type: import module
> >     # type: from package import module
>
> That sounds like a bad idea. If the typing module shadows some global, you
> won't get any errors, but your code will be misleading to a reader (and
> even worse if you from package.module import t). If the cost of the import
> is too high for Python 2, surely it's also too high for Python 3.
> And what other reason do you have for skipping it?

Exactly. Even though (when using Python 2) all type annotations are in
comments, you still must write real imports. (This causes minor
annoyances with linters that warn about unused imports, but there are
ways to teach them.)

> > - It must also be addressed how this works in a Python 2 to Python 3
> > environment, as there are types with the same name, str for example,
> > that work differently in each Python version. If the code is for only
> > one version, use the type names of that version.
>
> That's the same problem that exists at runtime, and people (and tools)
> already know how to deal with it: use bytes when you mean bytes, unicode
> when you mean unicode, and str when you mean whatever is "native" to the
> version you're running under and are willing to deal with it. So now you
> just have to do the same thing in type hints that you're already doing in
> constructors, isinstance checks, etc.

This is actually still a real problem. But it has no bearing on the
choice of syntax for annotations in Python 2 or straddling code.

> Of course many people use libraries like six to help them deal with this,
> which means that those libraries have to be type-hinted appropriately for
> both versions (maybe using different stubs for py2 and py3, with the right
> one selected at pip install time?), but if that's taken care of, user code
> should just work.

Yeah, we could use help. There are some very rudimentary stubs for a few
things defined by six (
https://github.com/python/typeshed/tree/master/third_party/3/six,
https://github.com/python/typeshed/tree/master/third_party/2.7/six) but
we need more. There's a PR but it's of bewildering size (
https://github.com/python/typeshed/pull/21).

PS. I have a hard time following the rest of Agustin's comments. The
comment-based syntax I proposed for Python 2.7 does support exactly the
same functionality as the official PEP 484 syntax; the only thing it
doesn't allow is selectively leaving out types for some arguments -- you
must use 'Any' to fill those positions. It's not a problem in practice,
and it doesn't reduce functionality (omitted argument types are assumed
to be Any in PEP 484 too). I should also remark that mypy supports the
comment-based syntax in Python 2 mode as well as in Python 3 mode; but
when writing Python 3 only code, the non-comment version is strongly
preferred. (We plan to eventually produce a tool that converts the
comments to standard PEP 484 syntax).

-- 
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From steve at pearwood.info  Wed Jan 20 19:52:25 2016
From: steve at pearwood.info (Steven D'Aprano)
Date: Thu, 21 Jan 2016 11:52:25 +1100
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: 
References: <20160120001425.GY10854@ando.pearwood.info>
Message-ID: <20160121005224.GC4619@ando.pearwood.info>

On Wed, Jan 20, 2016 at 01:58:46PM -0500, Terry Reedy wrote:
> On 1/20/2016 11:48 AM, Guido van Rossum wrote:
> >But 'shared' and 'local' are both the wrong words to use here. Also
> >probably this should syntactically be tied to the function header so the
> >time of evaluation is clear(er).
>
> Use ';' in the parameter list, followed by name=expr pairs. The
> question is whether names after the ';' are initialized local variables,
> subject to rebinding at runtime, or named constants, with the names
> replaced by the values at definition time. In the former case, a type
> hint could be
In the latter case, which is much better for optimization, > the fixed object would already be typed. > > def f(int a, int b=1; int c=2) => int I almost like that. The problem is that the difference between ; and , is visually indistinct and easy to mess up. I've occasionally typed ; in a parameter list and got a nice SyntaxError telling me I've messed up, but with your suggestion the function will just silently do the wrong thing. I suggest a second "parameter list": def func(a:int, b:int=1)(c:int)->int: ... is morally equivalent to: def func(a:int, b:int=1, c:int=c)->int: ... except that c is not a parameter of the function and cannot be passed as an argument: func(a=0, b=2) # okay func(a=0, b=2, c=1) # TypeError We still lack a good term for what the (c) thingy should be called. I'm not really happy with either of "static" or "capture", but for lack of anything better I'll go with capture for the moment. So a full function declaration looks like: def NAME ( PARAMETERS ) ( CAPTURES ) -> RETURN-HINT : (Bike-shedders: do you prefer () [] or {} for the list of captures?) CAPTURES is a comma-delimitered list of local variable names, with optional type hint and optional bindings. Here are some examples: # Capture the values of x and y from the enclosing scope. # Both x and y must exist at func definition time. def func(arg)(x, y): # inside the body of func, x and y are locals # Same as above, except with type-hinting. # If x or y in the enclosing scope are not floats, # a checker should report a type error. def func(arg)(x:float, y:float): # inside the body of func, x and y are locals # Capture the values of x and y from the enclosing scope, # binding to names x and z. # Both x and y must exist at func definition time. def func(arg)(x, z=y): # inside the body of func, x and z are locals # while y would follow the usual scoping rules # Capture a copy of the value of dict d from the enclosing scope. # d must exist at func definition time. def func(arg)(d:dict=d.copy()): # inside the body of func, d is a local If a capture consists of a name alone (or a name plus annotation), it declares a local variable of that name, and binds to it the captured value of the same name in the enclosing scope. E.g.: x = 999 def func()(x): # like x=x x += 1 return (x, globals()['x']) assert func() == (1000, 999) x = 0 assert func() == (1000, 0) If a capture consists of a name = expression, the expression is evaluated at function definition time, and the result captured. y = 999 def func()(x=y+1): return x assert func() == 1000 del y assert func() == 1000 Can we make this work with lambda? I think we can. The current lambda syntax is: lambda params: expression e.g. lambda x, y=y: x+y Could we keep that (for backwards compatibility) but allow parens to optionally surround the parameter list? If so, then we can allow an optional second set of parens after the first, allowing captures: lambda (x)(y): x+y The difference between lambda x,y=y: ... and lambda (x)(y): ... is that the first takes two arguments, mandatory x and optional y (which defaults to the value of y from the enclosing scope), while the second only takes one argument, x. 
-- 
Steve

From abarnert at yahoo.com  Wed Jan 20 19:54:53 2016
From: abarnert at yahoo.com (Andrew Barnert)
Date: Thu, 21 Jan 2016 00:54:53 +0000 (UTC)
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to support Python 2.7
In-Reply-To: 
References: 
Message-ID: <1464926720.6695850.1453337693633.JavaMail.yahoo@mail.yahoo.com>

On Wednesday, January 20, 2016 4:11 PM, Guido van Rossum wrote:

> On Wed, Jan 20, 2016 at 9:42 AM, Andrew Barnert via Python-ideas wrote:
>
> > On Jan 20, 2016, at 06:27, Agustín Herranz Cecilia wrote:
> >>
> >> - GvR's proposal includes some kind of syntactic sugar for function type
> >> comments (" # type: (t_arg1, t_arg2) -> t_ret "). I think it's good, but
> >> this must be an alternative to the typing module syntax (PEP 484), not
> >> the preferred way (so people get used to type hints). Is this syntactic
> >> sugar compatible with generators? Could the type analyzers differentiate
> >> between a Callable and a Generator?
> >
> > I'm pretty sure Generator is not the type of a generator function, but
> > of a generator object. So to type a generator function, you just write
> > `(int, int) -> Generator[int]`. Or, the long way, `Function[[int, int],
> > Generator[int]]`.
>
> There is no 'Function' -- it existed in mypy before PEP 484 but was
> replaced by 'Callable'. And you don't annotate a function def with '->
> Callable' (unless it returns another function).

Sorry about getting the `Function` from the initial proposal instead of
the current PEP.

Anyway, I don't think the OP was suggesting that. If I interpreted his
question right:

He was expecting that the comment `(int, int) -> int` was a way to
annotate a function so it comes out as type `Callable[[int, int], int]`,
which is correct. And he wanted to know how to instead write a comment
for a generator function of type `GeneratorFunction[[int, int], int]`,
and the answer is that you don't. There is no type needed for generator
functions; they're just functions that return generators.

You're right that he doesn't need to know the actual type; you're never
going to write that, you're just going to annotate the arguments and
return value, or use the 2.x comment style:

    def f(arg1: int, arg2: int) -> Iterator[int]

    def f(arg1, arg2):
        # type: (int, int) -> Iterator[int]

Either way, the type checker will determine that the type of the function
is `Callable[[int, int], Iterator[int]]`, and the only reason you'll ever
care is if that type shows up in an error message.

> Regarding using Iterator[T] instead of Generator[..., ..., T], you are
> correct.
>
> Note that you *cannot* define a generator function as returning a
> *subclass* of Iterator/Generator;

But you could define it as returning the superclass `Iterable`, right? As
I understand it, it's normal type variance, so any superclass will work;
the only reason `Iterator` is special is that it happens to be simpler to
specify than Generator and it's plausible that it isn't going to matter
whether you've written a generator function or, say, a function that
returns a list iterator.

> there is no way to have a generator function instantiate some other
> class as its return value.

If you really want that, you could always write a wrapper that forwards
__next__, and a decorator that applies the wrapper. Can MyPy infer the
type of the decorated function from the wrapped function and the
decorator?

    # Can I leave this annotation off? And get one specialized to the actual
    # argument types of the wrapped function? That would be cool.
    def my_iterating(func: Callable[Any, Iterator]) -> Callable[Any, MyIterator]:
        @wraps(func)
        def wrapper(*args, **kw):
            return MyIterator(func(*args, **kw))
        return wrapper

    @my_iterating
    def foo() -> Iterator[int]:
        yield

    x = foo()
    x.bar()

From guido at python.org  Wed Jan 20 20:04:21 2016
From: guido at python.org (Guido van Rossum)
Date: Wed, 20 Jan 2016 17:04:21 -0800
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <20160121001027.GB4619@ando.pearwood.info>
References: <20160120003712.GZ10854@ando.pearwood.info> <20160121001027.GB4619@ando.pearwood.info>
Message-ID: 

On Wed, Jan 20, 2016 at 4:10 PM, Steven D'Aprano wrote:

> I'm just tossing the "static block" idea out for discussion, but if you
> want a justification here are two differences between capture/static
> and global/nonlocal which suggest they aren't that similar and so we
> shouldn't feel obliged to use the same syntax.
>
> (1) global and nonlocal operate on *names*, not values. E.g. after
> "global x", x refers to a name in the global scope, not the local scope.
>
> But "capture"/"static" doesn't affect the name, or the scope that x
> belongs to. x is still a local, it just gets pre-initialised to the
> value of x in the enclosing scope. That makes it more of a binding
> operation or assignment than a declaration.
>
> (2) If we limit this to only capturing the same name, then we can only
> write (say) "static x", and that does look like a declaration. But maybe
> we want to allow the local name to differ from the global name:
>
>     static x = y
>
> or even arbitrary expressions on the right:
>
>     static x = x + 1
>
> Now that starts to look more like it should be in a block of code,
> especially if you have a lot of them:
>
>     static x = x + 1
>     static len = len
>     static data = open("data.txt").read()
>
> versus:
>
>     static:
>         x = x + 1
>         len = len
>         data = open("data.txt").read()
>
> I acknowledge that this goes beyond what the OP asked for, and I think
> that YAGNI is a reasonable response to the static block idea. I'm not
> going to champion it any further unless there's a bunch of interest from
> others.

Yeah, your arguments why it's different from global/nonlocal are
reasonable, but the question remains whether we really need all that
functionality. IIRC C++ lambdas only allow capturing a variable's value,
not an expression's. So we should ask ourselves first: if we *only* had
some directive that captures some variables' values, essentially like the
len=len argument trick but without affecting the signature (e.g. just
"static x, y, z"), how much of the current pain would be addressed, and
how much would remain?

> (I'm saving my energy for Eiffel-like require/ensure blocks
> *wink*).

Now you're making me curious.

-- 
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From guido at python.org  Wed Jan 20 20:36:58 2016
From: guido at python.org (Guido van Rossum)
Date: Wed, 20 Jan 2016 17:36:58 -0800
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to support Python 2.7
In-Reply-To: <1464926720.6695850.1453337693633.JavaMail.yahoo@mail.yahoo.com>
References: <1464926720.6695850.1453337693633.JavaMail.yahoo@mail.yahoo.com>
Message-ID: 

On Wed, Jan 20, 2016 at 4:54 PM, Andrew Barnert wrote:

> On Wednesday, January 20, 2016 4:11 PM, Guido van Rossum wrote:
>
> >On Wed, Jan 20, 2016 at 9:42 AM, Andrew Barnert via Python-ideas <
> python-ideas at python.org> wrote:
> >
> >On Jan 20, 2016, at 06:27, Agustín Herranz Cecilia <
> agustin.herranz at gmail.com> wrote:
> >>>
> >>> - GvR's proposal includes some kind of syntactic sugar for function type
> >>> comments (" # type: (t_arg1, t_arg2) -> t_ret "). I think it's good, but
> >>> this must be an alternative to the typing module syntax (PEP 484), not
> >>> the preferred way (so people get used to type hints). Is this syntactic
> >>> sugar compatible with generators? Could the type analyzers differentiate
> >>> between a Callable and a Generator?
> >>
> >>I'm pretty sure Generator is not the type of a generator function, but
> >>of a generator object. So to type a generator function, you just write
> >>`(int, int) -> Generator[int]`. Or, the long way, `Function[[int, int],
> >>Generator[int]]`.
> >
> >There is no 'Function' -- it existed in mypy before PEP 484 but was
> >replaced by 'Callable'. And you don't annotate a function def with '->
> >Callable' (unless it returns another function).
>
> Sorry about getting the `Function` from the initial proposal instead of
> the current PEP.
>
> Anyway, I don't think the OP was suggesting that. If I interpreted his
> question right:
>
> He was expecting that the comment `(int, int) -> int` was a way to
> annotate a function so it comes out as type `Callable[[int, int], int]`,
> which is correct.

Not really. I understand that you're saying that after:

    def foo(a, b):
        # type: (int, int) -> str
        return str(a+b)

the type of 'foo' is 'Callable[[int, int], str]'. But it really isn't.
The type checker (e.g. mypy) knows more at this point: it knows that foo
has arguments named 'a' and 'b' and that e.g. calls like 'foo(1, b=2)'
are valid. There's no way to express that using Callable. Also Callable
doesn't support argument defaults.

> And he wanted to know how to instead write a comment for a generator
> function of type `GeneratorFunction[[int, int], int]`, and the answer is
> that you don't. There is no type needed for generator functions; they're
> just functions that return generators.

Aha. No wonder I didn't get the question. :-(

> You're right that he doesn't need to know the actual type; you're never
> going to write that, you're just going to annotate the arguments and
> return value, or use the 2.x comment style:
>
>     def f(arg1: int, arg2: int) -> Iterator[int]
>
>     def f(arg1, arg2):
>         # type: (int, int) -> Iterator[int]
>
> Either way, the type checker will determine that the type of the function
> is `Callable[[int, int], Iterator[int]]`, and the only reason you'll ever
> care is if that type shows up in an error message.

I don't think you can get the word 'Callable' to show up in an error
message unless it's part of the type as written somewhere. A name defined
with 'def' is special and it shows up differently. (And so is a lambda.)

> >Regarding using Iterator[T] instead of Generator[..., ..., T], you are
> >correct.
> >
> >Note that you *cannot* define a generator function as returning a
> >*subclass* of Iterator/Generator;
>
> But you could define it as returning the superclass `Iterable`, right?

Yes.

> As I understand it, it's normal type variance, so any superclass will
> work; the only reason `Iterator` is special is that it happens to be
> simpler to specify than Generator and it's plausible that it isn't going
> to matter whether you've written a generator function or, say, a function
> that returns a list iterator.

Right.

> > there is no way to have a generator function instantiate some other
> > class as its return value.
>
> If you really want that, you could always write a wrapper that forwards
> __next__, and a decorator that applies the wrapper. Can MyPy infer the
> type of the decorated function from the wrapped function and the
> decorator?

I think that's an open question. Your example below is complicated because
of the *args, **kw pattern.

> # Can I leave this annotation off? And get one specialized to the actual
> # argument types of the wrapped function? That would be cool.

You can't -- mypy never infers a function's type from its inner workings.
However, some Googlers are working on a tool that does infer types:
https://github.com/google/pytype It's early days though (relatively
speaking), and I don't think it handles this case yet.

> def my_iterating(func: Callable[Any, Iterator]) -> Callable[Any, MyIterator]:

Alas, PEP 484 is not powerful enough to describe the relationship between
the input and output functions. You'd want to do something that uses a
type variable to capture all arguments together, so you could write
something like

    T = TypeVar('T')
    S = TypeVar('S')

    def my_iterating(func: Callable[T, Iterator[S]]) -> Callable[T, MyIterator[S]]:

>     @wraps(func)
>     def wrapper(*args, **kw):
>         return MyIterator(func(*args, **kw))
>     return wrapper
>
> @my_iterating
> def foo() -> Iterator[int]:
>     yield
>
> x = foo()
> x.bar()

The only reasonable way to do something like this without adding more
sophistication to PEP 484 would be to give up on the decorator and just
hardcode it using a pair of functions:

    # API
    def foo() -> MyIterator[int]:
        return MyIterator(_foo())

    # Implementation
    def _foo() -> Iterator[int]:
        yield 0

-- 
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From greg.ewing at canterbury.ac.nz  Wed Jan 20 23:59:52 2016
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 21 Jan 2016 17:59:52 +1300
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: 
References: <20160120003712.GZ10854@ando.pearwood.info> <20160121001027.GB4619@ando.pearwood.info>
Message-ID: <56A065C8.9050305@canterbury.ac.nz>

My idea for handling this kind of thing is:

    for new x in things:
        funcs.append(lambda: dosomethingwith(x))

The 'new' modifier can be applied to any assignment target, and
conceptually has the effect of creating a new binding instead of
changing an existing binding.

There is a very simple way to implement this in CPython: create a new
cell each time instead of replacing the contents of an existing cell.
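For reference, here is a sketch of today's single-cell behaviour next
to the fresh-cell behaviour 'new' would give, emulated with a factory
function (the helper name is just for illustration):

    funcs = []
    for x in range(3):
        funcs.append(lambda: x)        # all three closures share x's cell
    assert [f() for f in funcs] == [2, 2, 2]

    def make(x):
        return lambda: x               # each make() call creates a fresh cell

    funcs = [make(x) for x in range(3)]
    assert [f() for f in funcs] == [0, 1, 2]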
-- 
Greg

From ncoghlan at gmail.com  Thu Jan 21 02:54:15 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 21 Jan 2016 17:54:15 +1000
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <1598165082.6579073.1453322567057.JavaMail.yahoo@mail.yahoo.com>
References: <569FDA6D.6000006@gmail.com> <1598165082.6579073.1453322567057.JavaMail.yahoo@mail.yahoo.com>
Message-ID: 

On 21 January 2016 at 06:42, Andrew Barnert via Python-ideas wrote:
> On Wednesday, January 20, 2016 11:05 AM, Yury Selivanov wrote:
>
>> FWIW I strongly believe that this feature (at least the
>> "len=len"-like optimizations) should be implemented as an
>> optimization in the interpreter.
>
> The problem is that there are two reasonable interpretations for free
> variables--variable capture or value capture--and Python can only do
> one or the other automatically.

Can we please use the longstanding early binding and late binding
terminology for these two variants, rather than introducing new phrases
that just confuse the matter...

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From victor.stinner at gmail.com  Thu Jan 21 03:48:21 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Thu, 21 Jan 2016 09:48:21 +0100
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <20160121001027.GB4619@ando.pearwood.info>
References: <20160120003712.GZ10854@ando.pearwood.info> <20160121001027.GB4619@ando.pearwood.info>
Message-ID: 

Hi,

Sorry but I'm lost in this long thread. Do you want to extend the Python
language to declare constants in a function? Maybe I'm completely
off-topic, sorry.

2016-01-21 1:10 GMT+01:00 Steven D'Aprano :
> (2) If we limit this to only capturing the same name, then we can only
> write (say) "static x", and that does look like a declaration. But maybe
> we want to allow the local name to differ from the global name:
>
>     static x = y

3 months ago, Serhiy Storchaka proposed a "const var = expr" syntax:
https://mail.python.org/pipermail/python-ideas/2015-October/037028.html

With a shortcut "const len" which is like "const len = len".

In the meanwhile, I implemented an optimization in my FAT Python project:
"Copy builtins to constant". It's quite simple: replace the "LOAD_GLOBAL
builtin" instruction with a "LOAD_CONST builtin" translation and "patch"
co_consts constants of a code object at runtime:

    def hello():
        print("hello world")

is replaced with:

    def hello():
        "LOAD_GLOBAL print"("hello world")

    hello.__code__ = fat.replace_consts(hello.__code__,
                                        {'LOAD_GLOBAL print': print})

Where fat.replace_consts() is a helper to create a new code object
replacing constants with the specified mapping:
http://fatoptimizer.readthedocs.org/en/latest/fat.html#replace_consts

Replacing print(...) with "LOAD_GLOBAL"(...) is done in the fatoptimizer
(an AST optimizer):
http://fatoptimizer.readthedocs.org/en/latest/optimizations.html#copy-builtin-functions-to-constants

We have to inject the builtin function at runtime. It cannot be done when
the code object is created by "def ..." because a code object can only
contain objects serializable by marshal (to be able to compile a .py file
to a .pyc file).

> I acknowledge that this goes beyond what the OP asked for, and I think
> that YAGNI is a reasonable response to the static block idea. I'm not
> going to champion it any further unless there's a bunch of interest from
> others. (I'm saving my energy for Eiffel-like require/ensure blocks
> *wink*).

The difference between "def hello(print=print): ..."
and Serhiy's const idea (or my optimization) is that "def hello(print=print): ..." changes the signature of the function which can be a serious issue in an API. Note: The other optimization "local_print = print" in the function is only useful for loops (when the builtin is loaded multiple times) and it still loads the builtin once per function call, whereas my optimization uses a constant and so no lookup is required anymore. Then guards are used to disable the optimization if builtins are modified. See the PEP 510 for an explanation on that part. Victor From mal at egenix.com Thu Jan 21 04:39:43 2016 From: mal at egenix.com (M.-A. Lemburg) Date: Thu, 21 Jan 2016 10:39:43 +0100 Subject: [Python-ideas] Explicit variable capture list In-Reply-To: References: <20160120003712.GZ10854@ando.pearwood.info> <20160121001027.GB4619@ando.pearwood.info> Message-ID: <56A0A75F.9050208@egenix.com> On 21.01.2016 09:48, Victor Stinner wrote: > The difference between "def hello(print=print): ..." and Serhiy's > const idea (or my optimization) is that "def hello(print=print): ..." > changes the signature of the function which can be a serious issue in > an API. > > Note: The other optimization "local_print = print" in the function is > only useful for loops (when the builtin is loaded multiple times) and > it still loads the builtin once per function call, whereas my > optimization uses a constant and so no lookup is required anymore. > > Then guards are used to disable the optimization if builtins are > modified. See the PEP 510 for an explanation on that part. I ran performance tests on these optimization tricks (and others) in 2014. See this talk: http://www.egenix.com/library/presentations/PyCon-UK-2014-When-performance-matters/ (slides 33ff.) The keyword trick doesn't really pay off in terms of added performance vs. danger of introducing weird bugs. Still, it would be great to have a way to say "please look this symbol up at compile time and stick the result in a local variable" (which is basically what the keyword trick does), only in a form that's easier to detect when reading the code and doesn't change the function signature. A decorator could help with this (by transforming the byte code and localizing the symbols), e.g. @localize(len) def f(seq): z = 0 for x in seq: if x: z += len(x) return z but the more we move language features to decorators, the less readable the code will get by having long tails of decorators on many functions (we don't really want our functions to resemble snakes, do we ? :-)). So perhaps it is indeed time for a new keyword to localize symbols in a function or module, say: # module scope localization, applies to all code objects in # this module: localize len def f(seq): ... or: def f(seq): # Localize len in this function, since we need it in # tight loops localize len ... All that said, I don't really believe that this is a high priority feature request. The gained performance win is not all that great and only becomes relevant when used in tight loops. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Experts (#1, Jan 21 2016) >>> Python Projects, Coaching and Consulting ... http://www.egenix.com/ >>> Python Database Interfaces ... http://products.egenix.com/ >>> Plone/Zope Database Interfaces ... 
http://zope.egenix.com/
________________________________________________________________________
::: We implement business ideas - efficiently in both time and costs :::

eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611
http://www.egenix.com/company/contact/
http://www.malemburg.com/

From wes.turner at gmail.com  Thu Jan 21 05:14:58 2016
From: wes.turner at gmail.com (Wes Turner)
Date: Thu, 21 Jan 2016 04:14:58 -0600
Subject: [Python-ideas] Fwd: Why do equality tests between OrderedDict keys/values views behave not as expected?
In-Reply-To: <569FD44D.80008@mail.de>
References: <659951373.398716.1450406333370.JavaMail.yahoo@mail.yahoo.com> <20151218110755.GH1609@ando.pearwood.info> <569FD44D.80008@mail.de>
Message-ID: 

* DOC: OrderedDict.values() comparisons in Python 3
* Src: https://hg.python.org/cpython/file/f2a0a4a45292/Doc/library/collections.rst#l793

What should it say?

```rst
.. `<https://hg.python.org/cpython/file/f2a0a4a45292/Doc/library/collections.rst#l793>`__

collections.OrderedDict.values().__eq__

* "is surprising"
* Python 3 has `dict views`_
* :class:`OrderedDict` matches the dict interface in Python 2.7 and Python 3.

  * https://docs.python.org/2/library/collections.html#collections.OrderedDict
  * https://docs.python.org/3/library/collections.html#collections.OrderedDict

* Python 2 dict interface:

  * dict.viewkeys(), dict.viewvalues(), dict.viewitems()
  * dict.keys(), dict.values(), dict.items()
  * https://docs.python.org/2/library/stdtypes.html#dict
  * https://docs.python.org/2/library/stdtypes.html#dict.values
  * https://docs.python.org/2/library/stdtypes.html#dictionary-view-objects

* Python 3 dict interface (:ref:`dictionary view objects`):

  * dict.keys(), dict.values(), dict.items()
  * list(dict.keys()), list(dict.values()), list(dict.items())
  * https://docs.python.org/3/library/stdtypes.html#dict
  * https://docs.python.org/3/library/stdtypes.html#dict.values
  * https://docs.python.org/3/library/stdtypes.html#dictionary-view-objects

* In order to compare OrderedDict.values() **by value** you must either:

  * Cast values() to a sequence (e.g. a list) before comparison
  * Subclass :class:`OrderedDict` and wrap `values()`

.. code:: python

   from collections import OrderedDict

   a = 'a'
   x = 1
   y = x
   ab = [(a, x), ('b', 2)]
   ba = [('b', 2), (a, y)]
   ab_odict = OrderedDict(ab)
   ab_odict_ = OrderedDict(ab)
   ab_odict_copy = OrderedDict(ab.copy())
   ba_odict = OrderedDict(ba)
   ab_dict = dict(ab)
   ba_dict = dict(ba)

   # In Python 3,
   # OrderedDict.values().__eq__ does not compare by value:
   assert (ab_odict.values() == ab_odict_.values()) is False
   assert (list(ab_odict.values()) == list(ab_odict_.values())) is True

   # In Python 2.7 and 3,
   # OrderedDict.__eq__ compares ordered sequences:
   assert (ab_odict == ab_odict_) is True
   assert (ab_odict == ab_odict_copy) is True
   assert (ab_odict == ba_odict) is False
   assert (ab_dict == ba_dict) is True

   # - [ ] How to explain the x, y part?
   #   - in terms of references, __eq__, id(obj), __hash__
```

On Wed, Jan 20, 2016 at 12:39 PM, Sven R. Kunze wrote:
> Documentation is a very good idea.
>
> Maybe, even raise an error when comparing values.
>
> Best,
> Sven
>
> On 20.01.2016 12:13, Alexandre Figura wrote:
> > If we put technical considerations aside, maybe we should just ask ourselves what behavior we expect when doing equality tests between ordered dictionaries.
> > As a reminder:
> >
> >     >>> xy = OrderedDict([('x', None), ('y', None)])
> >     >>> yx = OrderedDict([('y', None), ('x', None)])
> >     >>> xy == yx
> >     False
> >     >>> xy.items() == yx.items()
> >     True
> >     >>> xy.keys() == yx.keys()
> >     True
> >     >>> xy.values() == yx.values()
> >     False
> >
> > So, it appears that:
> > 1. equality tests between odict_values use object identity and not equality,
> > 2. equality tests between odict_keys do not respect order.
> >
> > If it is not technically possible to change the current implementation, maybe all we can do is just add a warning about current behavior in the documentation?
> >
> > On Mon, Jan 11, 2016 at 4:17 AM, Guido van Rossum <guido at python.org> wrote:
> >> Seems like we dropped the ball... Is there any action item here?
> >>
> >> --
> >> --Guido van Rossum (python.org/~guido)
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From victor.stinner at gmail.com  Thu Jan 21 08:19:59 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Thu, 21 Jan 2016 14:19:59 +0100
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <56A0A75F.9050208@egenix.com>
References: <20160120003712.GZ10854@ando.pearwood.info> <20160121001027.GB4619@ando.pearwood.info> <56A0A75F.9050208@egenix.com>
Message-ID: 

2016-01-21 10:39 GMT+01:00 M.-A. Lemburg :
> I ran performance tests on these optimization tricks (and others) in 2014. See this talk:
>
> http://www.egenix.com/library/presentations/PyCon-UK-2014-When-performance-matters/
> (slides 33ff.)

Ah nice, thanks for the slides.

> The keyword trick doesn't really pay off in terms of added performance vs. danger of introducing weird bugs.

I ran a quick microbenchmark to measure the cost of LOAD_GLOBAL to load a global: call func("abc") with

    mylen = len
    def func(obj): return mylen(obj)

Result:

    117 ns: original bytecode (LOAD_GLOBAL)
    109 ns: LOAD_CONST
    116 ns: LOAD_CONST with guard

LOAD_CONST avoids 1 dict lookup (globals) and reduces the runtime by 8 ns: 7% faster. But the guard has a cost of 7 ns: we only win 1 nanosecond. Not really interesting here.

LOAD_CONST means that the LOAD_GLOBAL instruction has been replaced with a LOAD_CONST instruction. The guard checks that the frame globals and globals()['mylen'] didn't change.

I ran a second microbenchmark to measure the cost of LOAD_GLOBAL to load a builtin: call func("abc") with

    def func(obj): return len(obj)

Result:

    124 ns: original bytecode (LOAD_GLOBAL)
    107 ns: LOAD_CONST
    116 ns: LOAD_CONST with guard on builtins + globals

LOAD_CONST avoids 2 dict lookups (globals, builtins) and reduces the runtime by 17 ns: 14% faster. But the guard has a cost of 9 ns: we win 8 nanoseconds, 6% faster.

Here the guard is more complex: it checks that the frame builtins, the frame globals, builtins.__dict__['len'] and globals()['len'] didn't change.
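(For anyone who wants to reproduce this kind of measurement without FAT Python, here is a minimal sketch using only the stdlib timeit module. It measures just the pure-Python variants -- the module-global alias, the plain builtin lookup, and the default-argument trick -- not the patched LOAD_CONST bytecode, and the absolute numbers will of course differ per machine.)

    import textwrap
    import timeit

    setup = textwrap.dedent('''
        mylen = len
        def func_global(obj): return mylen(obj)          # module global: LOAD_GLOBAL
        def func_builtin(obj): return len(obj)           # builtin: LOAD_GLOBAL, two dict lookups
        def func_default(obj, len=len): return len(obj)  # default-arg trick: LOAD_FAST
    ''')

    for name in ('func_global', 'func_builtin', 'func_default'):
        secs = min(timeit.repeat('%s("abc")' % name, setup=setup, number=10**6))
        print('%s: %.1f ns/call' % (name, secs * 1000.0))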
If you avoid guards, it's always faster, but it changes the Python semantics.

The speedup on such a very small example is low. It's more interesting when the global or builtin variable is used in a loop: the speedup is multiplied by the number of loop iterations.

> A decorator could help with this (by transforming the byte code and localizing the symbols), e.g.
>
> @localize(len)
> def f(seq):
>     z = 0
>     for x in seq:
>         if x:
>             z += len(x)
>     return z

FYI https://pypi.python.org/pypi/codetransformer has such a decorator: @asconstants(len=len).

> All that said, I don't really believe that this is a high priority feature request. The gained performance win is not all that great and only becomes relevant when used in tight loops.

Yeah, in the Python stdlib, the hack is only used for loops.

Victor

From mal at egenix.com  Thu Jan 21 08:39:29 2016
From: mal at egenix.com (M.-A. Lemburg)
Date: Thu, 21 Jan 2016 14:39:29 +0100
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: 
References: <20160120003712.GZ10854@ando.pearwood.info> <20160121001027.GB4619@ando.pearwood.info> <56A0A75F.9050208@egenix.com>
Message-ID: <56A0DF91.1050304@egenix.com>

On 21.01.2016 14:19, Victor Stinner wrote:
> 2016-01-21 10:39 GMT+01:00 M.-A. Lemburg :
>> I ran performance tests on these optimization tricks (and others) in 2014. See this talk:
>>
>> http://www.egenix.com/library/presentations/PyCon-UK-2014-When-performance-matters/
>> (slides 33ff.)
>
> Ah nice, thanks for the slides.

Forgot to mention the benchmarks I used: https://github.com/egenix/when-performance-matters

>> The keyword trick doesn't really pay off in terms of added performance vs. danger of introducing weird bugs.
>
> I ran a quick microbenchmark to measure the cost of LOAD_GLOBAL to load a global: call func("abc") with
>
>     mylen = len
>     def func(obj): return mylen(obj)
>
> Result:
>
>     117 ns: original bytecode (LOAD_GLOBAL)
>     109 ns: LOAD_CONST
>     116 ns: LOAD_CONST with guard
>
> LOAD_CONST avoids 1 dict lookup (globals) and reduces the runtime by 8 ns: 7% faster. But the guard has a cost of 7 ns: we only win 1 nanosecond. Not really interesting here.
>
> LOAD_CONST means that the LOAD_GLOBAL instruction has been replaced with a LOAD_CONST instruction. The guard checks that the frame globals and globals()['mylen'] didn't change.
>
> I ran a second microbenchmark to measure the cost of LOAD_GLOBAL to load a builtin: call func("abc") with
>
>     def func(obj): return len(obj)
>
> Result:
>
>     124 ns: original bytecode (LOAD_GLOBAL)
>     107 ns: LOAD_CONST
>     116 ns: LOAD_CONST with guard on builtins + globals
>
> LOAD_CONST avoids 2 dict lookups (globals, builtins) and reduces the runtime by 17 ns: 14% faster. But the guard has a cost of 9 ns: we win 8 nanoseconds, 6% faster.
>
> Here the guard is more complex: it checks that the frame builtins, the frame globals, builtins.__dict__['len'] and globals()['len'] didn't change.
>> A decorator could help with this (by transforming the byte >> code and localizing the symbols), e.g. >> >> @localize(len) >> def f(seq): >> z = 0 >> for x in seq: >> if x: >> z += len(x) >> return z > > FYI https://pypi.python.org/pypi/codetransformer has such decorator: > @asconstants(len=len). Interesting :-) >> All that said, I don't really believe that this is a high >> priority feature request. The gained performance win is >> not all that great and only becomes relevant when used >> in tight loops. > > Yeah, in the Python stdlib, the hack is only used for loops. Right. The only advantage I'd see in having a keyword to "configure" the behavior is that you could easily apply the change to a whole module/function without having to add explicit localizations everywhere. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Experts (#1, Jan 21 2016) >>> Python Projects, Coaching and Consulting ... http://www.egenix.com/ >>> Python Database Interfaces ... http://products.egenix.com/ >>> Plone/Zope Database Interfaces ... http://zope.egenix.com/ ________________________________________________________________________ ::: We implement business ideas - efficiently in both time and costs ::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ http://www.malemburg.com/ From agustin.herranz at gmail.com Thu Jan 21 13:14:18 2016 From: agustin.herranz at gmail.com (=?UTF-8?Q?Agust=c3=adn_Herranz_Cecilia?=) Date: Thu, 21 Jan 2016 19:14:18 +0100 Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to support Python 2.7 In-Reply-To: References: <569F9959.3020202@gmail.com> Message-ID: <56A11FFA.7040408@gmail.com> El 2016/01/21 a las 1:11, Guido van Rossum escribi?: > On Wed, Jan 20, 2016 at 9:42 AM, Andrew Barnert via Python-ideas > > wrote: > > On Jan 20, 2016, at 06:27, Agust?n Herranz Cecilia > > wrote: > > > > - GVR proposal includes some kind of syntactic sugar for > function type comments (" # type: (t_arg1, t_arg2) -> t_ret "). I > think it's good but this must be an alternative over typing module > syntax (PEP484), not the preferred way (for people get used to > typehints). Is this syntactic sugar compatible with generators? > The type analyzers could be differentiate between a Callable and a > Generator? > > I'm pretty sure Generator is not the type of a generator function, > bit of a generator object. So to type a generator function, you > just write `(int, int) -> Generator[int]`. Or, the long way, > `Function[[int, int], Generator[int]]`. > > > There is no 'Function' -- it existed in mypy before PEP 484 but was > replaced by 'Callable'. And you don't annotate a function def with '-> > Callable' (unless it returns another function). The Callable type is > only needed in the signature of higher-order functions, i.e. functions > that take functions for arguments or return a function. For example, a > simple map function would be written like this: > > def map(f: Callable[[T], S], a: List[T]) -> List[S]: > ... > > As to generators, we just improved how mypy treats generators > (https://github.com/JukkaL/mypy/commit/d8f72279344f032e993a3518c667bba813ae041a). 
> The Generator type has *three* parameters: the "yield" type (what's > yielded), the "send" type (what you send() into the generator, and > what's returned by yield), and the "return" type (what a return > statement in the generator returns, i.e. the value for the > StopIteration exception). You can also use Iterator if your generator > doesn't expect its send() or throw() messages to be called and it > isn't returning a value for the benefit of `yield from'. > > For example, here's a simple generator that iterates over a list of > strings, skipping alternating values: > > def skipper(a: List[str]) -> Iterator[str]: > for i, s in enumerate(a): > if i%2 == 0: > yield s > > and here's a coroutine returning a string (I know, it's pathetic, but > it's an example :-): > > @asyncio.coroutine > def readchar() -> Generator[Any, None, str]: > # Implementation not shown > @asyncio.coroutine > def readline() -> Generator[Any, None, str]: > buf = '' > while True: > c = yield from readchar() > if not c: break > buf += c > if c == '\n': break > return buf > > Here, in Generator[Any, None, str], the first parameter ('Any') refers > to the type yielded -- it actually yields Futures, but we don't care > about that (it's an asyncio implementation detail). The second > parameter ('None') is the type returned by yield -- again, it's an > implementation detail and we might just as well say 'Any' here. The > third parameter (here 'str') is the type actually returned by the > 'return' statement. > > It's illustrative to observe that the signature of readchar() is > exactly the same (since it also returns a string). OTOH the return > type of e.g. asyncio.sleep() is Generator[Any, None, None], because it > doesn't return a value. > > This business is clearly still suboptimal -- we would like to > introduce a new type, perhaps named Coroutine, so that you can write > Coroutine[T] instead of Generator[Any, None, T]. But that would just > be a shorthand. The actual type of a generator object is always some > parametrization of Generator. > > In any case, whatever we write after the -> (i.e., the return type) is > still the type of the value you get when you call the function. If the > function is a generator function, the value you get is a generator > object, and that's what the return type designates. > > (Of course you can use Callable instead of the more specific > Function, or Iterator (or even Iterable) instead of the more > specific Generator, if you want to be free to change the > implementation to use an iterator class or something later, but > normally you'd want the most specific type, I think.) > > > I don't know where you read about Callable vs. Function. > > Regarding using Iterator[T] instead of Generator[..., ..., T], you are > correct. > > Note that you *cannot* define a generator function as returning a > *subclass* of Iterator/Generator; there is no way to have a generator > function instantiate some other class as its return value. Consider > (ignoring generic types): > > class MyIterator: > def __next__(self): ... > def __iter__(self): ... > def bar(self): ... > > def foo() -> MyIterator: > yield > > x = foo() > x.bar() # Boom! > > The type checker would assume that x has a method bar() based on the > declared return type for foo(), but it doesn't. (There are a few other > special cases, in addition to Generator and Iterator; declaring the > return type to be Any or object is allowed.) 
This is a mistake on my side; I got confused. The generator is just the return type of the callable, but the returned generator is also a callable.

>     > - As this is intended to gradually type python2 code to port it to python 3, I think it's convenient to add some sort of import that would only be used for type checking, and be imported only by the type analyzer, not the runtime. This could be achieved by prepending "#type: " to the normal import statement, something like:
>     > # type: import module
>     > # type: from package import module
>
>     That sounds like a bad idea. If the typing module shadows some global, you won't get any errors, but your code will be misleading to a reader (and even worse if you from package.module import t). If the cost of the import is too high for Python 2, surely it's also too high for Python 3. And what other reason do you have for skipping it?
>
> Exactly. Even though (when using Python 2) all type annotations are in comments, you still must write real imports. (This causes minor annoyances with linters that warn about unused imports, but there are ways to teach them.)

These type comment 'imports' are not intended to shadow the current namespace; they are intended to tell the analyzer where it can find the types that appear in the type comments but are not in the current namespace, without importing them into it. This surely complicates the analyzer's task, but it helps avoid namespace pollution and also saves memory at runtime.

The typical case I've found is when using a third party library (that doesn't have type information) and you create objects with a factory. The class of the objects is not needed anywhere, so it's not imported into the current namespace, but it's needed only for type analysis and autocomplete.

>     > - Also it must be addressed how this works in a python2 to python3 environment, as there are types with the same name, str for example, that work differently on each python version. If the code is for only one version, it uses the type names of that version.
>
>     That's the same problem that exists at runtime, and people (and tools) already know how to deal with it: use bytes when you mean bytes, unicode when you mean unicode, and str when you mean whatever is "native" to the version you're running under and are willing to deal with it. So now you just have to do the same thing in type hints that you're already doing in constructors, isinstance checks, etc.
>
> This is actually still a real problem. But it has no bearing on the choice of syntax for annotations in Python 2 or straddling code.

Yes, this is not related to the choice of syntax for annotations directly. This is intended to help in the process of porting python2 code to python3, and it's outside of the PEP scope but related to the original problem. What I have in mind is some type aliases so you could annotate a version-specific type to avoid ambiguity in code that is used on different versions. In the end, what I originally tried to say is that it's good to have a conventional way to name these type aliases.

These are intended to be used during the process of porting, to help some automated tools, in a period of transition between versions. It's a way to tell the analyzer that a type has a behavior, perhaps different, from the same type on the running python version.

For example: you start with some working python2 code that you want to keep working. A code analysis tool can infer the types and annotate the code.
It can also check which parts are py2/py3 compatible and which are not, and mark those types with the mentioned type aliases. With this, and test suites, it could be calculated how much code needs to be ported. Then refactor to adapt the code to python3 while keeping it running on python2 (it could be marked for automated deletion), and when it's done, drop all the python2 code.

> > Of course many people use libraries like six to help them deal with this, which means that those libraries have to be type-hinted appropriately for both versions (maybe using different stubs for py2 and py3, with the right one selected at pip install time?), but if that's taken care of, user code should just work.
>
> Yeah, we could use help. There are some very rudimentary stubs for a few things defined by six (https://github.com/python/typeshed/tree/master/third_party/3/six, https://github.com/python/typeshed/tree/master/third_party/2.7/six) but we need more. There's a PR but it's of bewildering size (https://github.com/python/typeshed/pull/21).

I think the process of porting is different from the process of adapting code to work on python 2/3. Code that uses bytes, unicode, and str (indiscriminately) is neither python2 nor python3 code. Lots of libraries that are 2/3 compatible are just python2 code minimally adapted to run on python3 with six, and still developed in a python2 style. When the time to drop python2 arrives, the refactoring needed will be huge. There is also an article that recently claims "Stop writing code that break on Python 4" and shows code that treats python3 as the special case.

> PS. I have a hard time following the rest of Agustin's comments. The comment-based syntax I proposed for Python 2.7 does support exactly the same functionality as the official PEP 484 syntax; the only thing it doesn't allow is selectively leaving out types for some arguments -- you must use 'Any' to fill those positions. It's not a problem in practice, and it doesn't reduce functionality (omitted argument types are assumed to be Any in PEP 484 too). I should also remark that mypy supports the comment-based syntax in Python 2 mode as well as in Python 3 mode; but when writing Python 3 only code, the non-comment version is strongly preferred. (We plan to eventually produce a tool that converts the comments to standard PEP 484 syntax).
> --
> --Guido van Rossum (python.org/~guido)

My original point is that if comment-based function annotations are going to be added, add them to python 3 too, not only for the special case of "Python 2.7 and straddling code", even though, on python 3, type annotations are preferred.

I think that having the alternative to define the types of a function as a type comment is a good thing because annotations could become a mess, especially with complex types and default parameters, and I don't feel that readability should be one of the optional parts of gradual typing.
Some examples of my own code:

    class Field:
        def __init__(self, name: str,
                     extract: Callable[[str], str],
                     validate: Callable[[str], bool]=bool_test,
                     transform: Callable[[str], Any]=identity) -> 'Field':

    class RepeatableField:
        def __init__(self,
                     extract: Callable[[str], str],
                     size: int,
                     fields: List[Field],
                     index_label: str,
                     index_transform: Callable[[int], str]=lambda x: str(x)) -> 'RepeatableField':

    def filter_by(field_gen: Iterable[Dict[str, Any]], **kwargs) -> Generator[Dict[str, Any], Any, Any]:

So, to define a comment-based function annotation, two kinds of syntax should be accepted:

- one 'explicit', marking the type of the function according to the PEP484 syntax:

    def embezzle(self, account, funds=1000000, *fake_receipts):
        # type: Callable[[str, int, *str], None]
        """Embezzle funds from account using fake receipts."""

as if it were a normal type comment:

    embezzle = get_embezzle_function()  # type: Callable[[str, int, *str], None]

- and another one that 'implicitly' defines the type of the function as Callable:

    def embezzle(self, account, funds=1000000, *fake_receipts):
        # type: (str, int, *str) -> None
        """Embezzle funds from account using fake receipts."""

Both ways are easily translated back and forth into python3 annotations.

Also, comment-based function annotations easily go past one line's length, so the syntax used to break the line should be defined, as said in https://github.com/JukkaL/mypy/issues/1102

Those things should be in a PEP as a standard way to implement this, not only for mypy but also for other tools.

Accepting comment-based function annotations in python3 is good for migrating python 2/3 code, as it helps with refactoring and use (better autocomplete), but making it a python2 feature and not a python3 one increases the gap between versions.

Hope I expressed it better; if not, sorry about that.

Agustín Herranz
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From guido at python.org  Thu Jan 21 13:44:07 2016
From: guido at python.org (Guido van Rossum)
Date: Thu, 21 Jan 2016 10:44:07 -0800
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to support Python 2.7
In-Reply-To: <56A11FFA.7040408@gmail.com>
References: <569F9959.3020202@gmail.com> <56A11FFA.7040408@gmail.com>
Message-ID: 

On Thu, Jan 21, 2016 at 10:14 AM, Agustín Herranz Cecilia <agustin.herranz at gmail.com> wrote:

> On 2016/01/21 at 1:11, Guido van Rossum wrote:
> [...]
>
>> > - As this is intended to gradually type python2 code to port it to python 3, I think it's convenient to add some sort of import that would only be used for type checking, and be imported only by the type analyzer, not the runtime. This could be achieved by prepending "#type: " to the normal import statement, something like:
>> > # type: import module
>> > # type: from package import module
>>
>> That sounds like a bad idea. If the typing module shadows some global, you won't get any errors, but your code will be misleading to a reader (and even worse if you from package.module import t). If the cost of the import is too high for Python 2, surely it's also too high for Python 3. And what other reason do you have for skipping it?
>
> Exactly. Even though (when using Python 2) all type annotations are in comments, you still must write real imports. (This causes minor annoyances with linters that warn about unused imports, but there are ways to teach them.)
> These type comment 'imports' are not intended to shadow the current namespace; they are intended to tell the analyzer where it can find the types that appear in the type comments but are not in the current namespace, without importing them into it. This surely complicates the analyzer's task, but it helps avoid namespace pollution and also saves memory at runtime.
>
> The typical case I've found is when using a third party library (that doesn't have type information) and you create objects with a factory. The class of the objects is not needed anywhere, so it's not imported into the current namespace, but it's needed only for type analysis and autocomplete.

You're describing a case I have also encountered: we have a module with a function foo

    # foo_mod.py
    def foo(a):
        return a.bar()

and the intention is that a is an instance of a class A defined in another module, which is not imported. If we add annotations we have to add an import

    from a_mod import A

    def foo(a: A) -> str:
        return a.bar()

But the code that calls foo() is already importing A from a_mod somewhere, so there's not really any time wasted -- the import is just done at a different time. At least, that's the theory. In practice, indeed there are some unpleasant cases. For example, adding the explicit import might create an import cycle, and A may not yet be defined when foo_mod is loaded. We can't use the usual subterfuge, since we really need the definition of A:

    import a_mod

    def foo(a: a_mod.A) -> str:
        return a.bar()

This will still fail if a_mod hasn't defined A yet because we reference a_mod.A at load time (annotations are evaluated when the function definition is executed). So we end up with this:

    import a_mod

    def foo(a: 'a_mod.A') -> str:
        return a.bar()

This is both hard to read and probably wastes a lot of developer time figuring out they have to do this. And there are other issues, e.g. some folks have tricks to work around their start-up time by importing modules late (e.g. do the import inside the function that needs that module).

In mypy there's another hack possible: it doesn't care if an import is inside "if False". So you can write:

    if False:
        from a_mod import A

    def foo(a: 'A') -> str:
        return a.bar()

You still have to quote 'A' because A isn't actually defined at run time, but it's the best we can do. When using type comments you can skip the quotes:

    if False:
        from a_mod import A

    def foo(a):
        # type: (A) -> str
        return a.bar()

All of this is unpleasant but not unbearable -- the big constraint here is that we don't want to add extra syntax (beyond PEP 3107, i.e. function annotations), so that we can use mypy for Python 3.2 and up. And with the type comments we even support 2.7.
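To put the whole workaround in one place, here's a minimal self-contained sketch of the Python 2 compatible form (same hypothetical a_mod/A names as above):

    # a_mod.py
    class A(object):
        def bar(self):
            # type: () -> str
            return 'bar'

    # foo_mod.py
    if False:  # never executed at runtime; mypy still sees the import
        from a_mod import A

    def foo(a):
        # type: (A) -> str
        return a.bar()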
> > > Yes, this is no related with the choice of syntax for annotations > directly. This is intended to help in the process of porting python2 code > to python3, and it's outside of the PEP scope but related to the original > problem. What I have in mind is some type aliases so you could annotate a > version specific type to avoid ambiguousness on code that it's used on > different versions. At the end what I originally try to said is that it's > good to have a convention way to name this type aliases. > Yes, this is a useful thing to discuss. Maybe we can standardize on the types defined by the 'six' package, which is commonly used for 2-3 straddling code: six.text_type (unicode in PY2, str in PY3) six.binary_type (str in PY2, bytes in PY3) Actually for the latter we might as well use bytes. > This are intended to use during the process of porting, to help some > automated tools, in a period of transition between versions. It's a way to > tell the analyzer that a type have a behavior, perhaps different, than the > same type on the running python version. > > For example. You start with some working python2 code that you want to > still be working. A code analysis tool can infer the types and annotate the > code. Also can check which parts are py2/py3 compatible and which not, and > mark those types with the mentioned type aliases. With this, and test > suites, it could be calculated how much code is needed to be ported. > Refactor to adapt the code to python3 maintaining code to still run on > python2 (it could be marked for automate deletion), and when it's done, > drop all the python2 code.. > Yes, that's the kind of process we're trying to develop. It's still early days though -- people have gotten different workflows already using six and tests and the notion of straddling code, __future__ imports, and PyPI backports of some PY3 stdlib packages (e.g. contextlib2). There's also a healthy set of tools that converts PY2 code to straddling code, approximately (e.g. futurize and modernize). What's missing (as you point out) is tools that help automating a larger part of the conversion once PY2 code has been annotated. But first we need to agree on how to annotate PY2 code. > > Of course many people use libraries like six to help them deal with this, >> which means that those libraries have to be type-hinted appropriately for >> both versions (maybe using different stubs for py2 and py3, with the right >> one selected at pip install time?), but if that's taken care of, user code >> should just work. >> > > Yeah, we could use help. There are some very rudimentary stubs for a few > things defined by six ( > > https://github.com/python/typeshed/tree/master/third_party/3/six, > https://github.com/python/typeshed/tree/master/third_party/2.7/six) but > we need more. There's a PR but it's of bewildering size ( > https://github.com/python/typeshed/pull/21). > > I think the process of porting it's different from the process of adapting > code to work on python 2/3. Code with bytes, unicode, & str(don't mind) are > not python2 code nor python3. Lot's of libraries that are 2/3 compatibles > are just python2 code minimally adapted to run on python3 with six, and > still be developed with a python2 style. When the time of drop python2 > arrives the refactor needed will be huge. There is also an article that > recently claims "Stop writing code that break on Python 4" and show code > that treats python3 as the special case.. > > PS. I have a hard time following the rest of Agustin's comments. 
> The comment-based syntax I proposed for Python 2.7 does support exactly the same functionality as the official PEP 484 syntax; the only thing it doesn't allow is selectively leaving out types for some arguments -- you must use 'Any' to fill those positions. It's not a problem in practice, and it doesn't reduce functionality (omitted argument types are assumed to be Any in PEP 484 too). I should also remark that mypy supports the comment-based syntax in Python 2 mode as well as in Python 3 mode; but when writing Python 3 only code, the non-comment version is strongly preferred. (We plan to eventually produce a tool that converts the comments to standard PEP 484 syntax).
>
> --
> --Guido van Rossum (python.org/~guido)
>
> My original point is that if comment-based function annotations are going to be added, add them to python 3 too, not only for the special case of "Python 2.7 and straddling code", even though, on python 3, type annotations are preferred.

The text I added to the end of PEP 484 already says so:

"""
- Tools that support this syntax should support it regardless of the Python version being checked. This is necessary in order to support code that straddles Python 2 and Python 3.
"""

> I think that having the alternative to define the types of a function as a type comment is a good thing because annotations could become a mess, especially with complex types and default parameters, and I don't feel that readability should be one of the optional parts of gradual typing.
>
> Some examples of my own code:
>
>     class Field:
>         def __init__(self, name: str,
>                      extract: Callable[[str], str],
>                      validate: Callable[[str], bool]=bool_test,
>                      transform: Callable[[str], Any]=identity) -> 'Field':
>
>     class RepeatableField:
>         def __init__(self,
>                      extract: Callable[[str], str],
>                      size: int,
>                      fields: List[Field],
>                      index_label: str,
>                      index_transform: Callable[[int], str]=lambda x: str(x)) -> 'RepeatableField':
>
>     def filter_by(field_gen: Iterable[Dict[str, Any]], **kwargs) -> Generator[Dict[str, Any], Any, Any]:
>
> So, to define a comment-based function annotation, two kinds of syntax should be accepted:
> - one 'explicit', marking the type of the function according to the PEP484 syntax:
>
>     def embezzle(self, account, funds=1000000, *fake_receipts):
>         # type: Callable[[str, int, *str], None]
>         """Embezzle funds from account using fake receipts."""
>
> as if it were a normal type comment:
>
>     embezzle = get_embezzle_function()  # type: Callable[[str, int, *str], None]
>
> - and another one that 'implicitly' defines the type of the function as Callable:
>
>     def embezzle(self, account, funds=1000000, *fake_receipts):
>         # type: (str, int, *str) -> None
>         """Embezzle funds from account using fake receipts."""
>
> Both ways are easily translated back and forth into python3 annotations.

I don't see what adding support for # type: Callable[[str, int, *str], None] adds. It's more verbose, and when the 'implicit' notation is used, the type checker already knows that embezzle is a function with that signature.
You can already do this (except for the *str part):

    from typing import Callable

    def embezzle(account, funds=1000000):
        # type: (str, int) -> None
        """Embezzle funds from account using fake receipts."""
        pass

    f = None  # type: Callable[[str, int], None]
    f = embezzle
    f('a', 42)

However, note that no matter which notation you use, there's no way in PEP 484 to write the type of the original embezzle() function object using Callable -- Callable does not have support for varargs like *fake_receipts. If you want that, the best place to bring it up is the typehinting tracker (https://github.com/ambv/typehinting/issues). But it's going to be a tough nut to crack, and the majority of places where Callable is needed (mostly higher-order functions like filter/map) don't need it -- their function arguments have purely positional arguments.

> Also, comment-based function annotations easily go past one line's length, so the syntax used to break the line should be defined, as said in https://github.com/JukkaL/mypy/issues/1102
>
> Those things should be in a PEP as a standard way to implement this, not only for mypy but also for other tools.
> Accepting comment-based function annotations in python3 is good for migrating python 2/3 code, as it helps with refactoring and use (better autocomplete), but making it a python2 feature and not a python3 one increases the gap between versions.

Consider it done. The time machine strikes again. :-)

> Hope I expressed it better; if not, sorry about that.

It's perfectly fine this time!

> Agustín Herranz

--
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From abarnert at yahoo.com  Thu Jan 21 14:35:43 2016
From: abarnert at yahoo.com (Andrew Barnert)
Date: Thu, 21 Jan 2016 11:35:43 -0800
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <56A065C8.9050305@canterbury.ac.nz>
References: <20160120003712.GZ10854@ando.pearwood.info> <20160121001027.GB4619@ando.pearwood.info> <56A065C8.9050305@canterbury.ac.nz>
Message-ID: <1792B086-3AA1-46FB-9115-966645506F62@yahoo.com>

On Jan 20, 2016, at 20:59, Greg Ewing wrote:
>
> My idea for handling this kind of thing is:
>
>     for new x in things:
>         funcs.append(lambda: dosomethingwith(x))
>
> The 'new' modifier can be applied to any assignment target, and conceptually has the effect of creating a new binding instead of changing an existing binding.

C# almost did this (but only in foreach statements, not all bindings), but in the end they decided that it was simpler to just make foreach _always_ create a new binding each time through the loop, instead of requiring new syntax. I think most of the rationale is in one of Eric Lippert's blog posts with a name like "loop closures considered harmful" (I can't remember the exact title, and searching while typing sucks on a phone), but I can summarize here.

C# had the exact same problem, for the exact same reasons. And, since they don't have the default-value trick, the solution required defining a new local copy in the same scope as the function definition (which means, if you're defining the function in expression context, you have to wrap it in another lambda and call it). After years of closing bugs with "no, C# closures are not broken, what you're complaining about is the exact definition of a closure", they decided they had to do something about it. Every option they considered had some unacceptable feature, but in the end they decided that leaving it as-is was also unacceptable.
So, borrowing a bit of practicality-beats-purity from some other language, they decided that a breaking semantic change, making foreach and C-style for less consistent, and violating one of their fundamental design principles (left is always at least as outside as right) was the best choice.

Python doesn't have the left-outside principle to break (see comprehensions), doesn't have a C-style for to be consistent with, and has probably less rather than more performance impact (we know whether a loop variable is captured, and can skip it for non-cellvars). But it probably has more backward compatibility issues (nobody writes new code expecting it to work for C# 3 as well as C# 5, but people are still writing code that has to work with Python 2.7). So, unless we can be sure that nobody intentionally writes code with a free variable that captures a loop variable, the C# solution isn't available. Which means your solution is probably the next best thing. And, while I don't see any compelling need for it anywhere other than loop variables, there's also no compelling reason to ban it elsewhere, so why not keep assignment targets consistent.

From tjreedy at udel.edu  Thu Jan 21 23:08:02 2016
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu, 21 Jan 2016 23:08:02 -0500
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to support Python 2.7
In-Reply-To: 
References: <569F9959.3020202@gmail.com> <56A11FFA.7040408@gmail.com>
Message-ID: 

On 1/21/2016 1:44 PM, Guido van Rossum wrote:

[Snip discussion of nitty-gritty issues of annotating code, especially 2.7 code.]

I suspect that at this point making migration from 2.7 to 3.x *easier*, with annotations, will do more to encourage migration, overall, than yet another new module. So I support this work even if I will not directly use it. If you are looking for a PyCon talk topic, I think this, with your experiences up to that time, would be a good one.

Only slightly off topic, I also think it worthwhile to reiterate that pydev support for 2.7 really really will end in 2020, possibly on the first day, as now documented in the nice new front-page devguide chart. https://docs.python.org/devguide/#status-of-python-branches I have read people saying (SO comments, I think) that there might or will be a security-patch only phase of some number of years *after* that.

> There's also a healthy set of tools that converts PY2 code to straddling code, approximately (e.g. futurize and modernize). What's missing (as you point out) is tools that help automate a larger part of the conversion once PY2 code has been annotated.

PEP 484 gives the motivation for 2.7 compatible type comments as "Some tools may want to support type annotations in code that must be compatible with Python 2.7." To me, this just implies running a static analyzer over *existing* code. Using type hint comments to help automate conversion, if indeed possible, would be worth adding to the motivation.

> But first we need to agree on how to annotate PY2 code.

Given the current addition to an accepted PEP, I thought we more or less had, at least provisionally.
-- Terry Jan Reedy

From brett at python.org  Fri Jan 22 13:37:48 2016
From: brett at python.org (Brett Cannon)
Date: Fri, 22 Jan 2016 18:37:48 +0000
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to support Python 2.7
In-Reply-To: 
References: <569F9959.3020202@gmail.com> <56A11FFA.7040408@gmail.com>
Message-ID: 

On Thu, 21 Jan 2016 at 10:45 Guido van Rossum wrote:
> On Thu, Jan 21, 2016 at 10:14 AM, Agustín Herranz Cecilia <agustin.herranz at gmail.com> wrote:
>
>> On 2016/01/21 at 1:11, Guido van Rossum wrote:
>> [...]
>> [SNIP]
>>
>>> > - Also it must be addressed how this works in a python2 to python3 environment, as there are types with the same name, str for example, that work differently on each python version. If the code is for only one version, it uses the type names of that version.
>>>
>>> That's the same problem that exists at runtime, and people (and tools) already know how to deal with it: use bytes when you mean bytes, unicode when you mean unicode, and str when you mean whatever is "native" to the version you're running under and are willing to deal with it. So now you just have to do the same thing in type hints that you're already doing in constructors, isinstance checks, etc.
>>
>> This is actually still a real problem. But it has no bearing on the choice of syntax for annotations in Python 2 or straddling code.
>>
>> Yes, this is not related to the choice of syntax for annotations directly. This is intended to help in the process of porting python2 code to python3, and it's outside of the PEP scope but related to the original problem. What I have in mind is some type aliases so you could annotate a version-specific type to avoid ambiguity in code that is used on different versions. In the end, what I originally tried to say is that it's good to have a conventional way to name these type aliases.
>
> Yes, this is a useful thing to discuss.
>
> Maybe we can standardize on the types defined by the 'six' package, which is commonly used for 2-3 straddling code:
>
>     six.text_type (unicode in PY2, str in PY3)
>     six.binary_type (str in PY2, bytes in PY3)
>
> Actually for the latter we might as well use bytes.

I agree that `bytes` should cover str/bytes in Python 2 and `bytes` in Python 3.

As for the textual type, I say either `text` or `unicode` since they are both unambiguous between Python 2 and 3 and get the point across.

And does `str` represent the type for the specific version of Python mypy is running under, or is it pegged to a specific representation across Python 2 and 3? If it's the former then fine, else those people who use the "native string" concept might want a way to say "I want the `str` type as defined on the version of Python I'm running under" (personally I don't promote the "native string" concept, but I know it has been brought up in the past).

-Brett
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From guido at python.org  Fri Jan 22 14:08:14 2016
From: guido at python.org (Guido van Rossum)
Date: Fri, 22 Jan 2016 11:08:14 -0800
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to support Python 2.7
In-Reply-To: 
References: <569F9959.3020202@gmail.com> <56A11FFA.7040408@gmail.com>
Message-ID: 

On Fri, Jan 22, 2016 at 10:37 AM, Brett Cannon wrote:
>
> On Thu, 21 Jan 2016 at 10:45 Guido van Rossum wrote:
>
>> On Thu, Jan 21, 2016 at 10:14 AM, Agustín Herranz Cecilia <agustin.herranz at gmail.com> wrote:
>> [...]
>> Yes, this is not related to the choice of syntax for annotations directly. This is intended to help in the process of porting python2 code to python3, and it's outside of the PEP scope but related to the original problem. What I have in mind is some type aliases so you could annotate a version-specific type to avoid ambiguity in code that is used on different versions. In the end, what I originally tried to say is that it's good to have a conventional way to name these type aliases.
>>
>> Yes, this is a useful thing to discuss.
>>
>> Maybe we can standardize on the types defined by the 'six' package, which is commonly used for 2-3 straddling code:
>>
>>     six.text_type (unicode in PY2, str in PY3)
>>     six.binary_type (str in PY2, bytes in PY3)
>>
>> Actually for the latter we might as well use bytes.
>
> I agree that `bytes` should cover str/bytes in Python 2 and `bytes` in Python 3.

OK, that's settled.

> As for the textual type, I say either `text` or `unicode` since they are both unambiguous between Python 2 and 3 and get the point across.

Then let's call it unicode. I suppose we can add this to typing.py. In PY2, typing.unicode is just the built-in unicode. In PY3, it's the built-in str.

> And does `str` represent the type for the specific version of Python mypy is running under, or is it pegged to a specific representation across Python 2 and 3? If it's the former then fine, else those people who use the "native string" concept might want a way to say "I want the `str` type as defined on the version of Python I'm running under" (personally I don't promote the "native string" concept, but I know it has been brought up in the past).

In mypy (and in typeshed and in typing.py), 'str' refers to the type named str in the Python version for which you are checking -- i.e. by default mypy checks in PY3 mode and str will be the unicode type; but "mypy --py2" checks in PY2 mode and str will be the Python 2 8-bit string type. (This is actually the only thing that makes sense IMO.)

There's one more thing that I wonder might be useful. In PY2 we have basestring as the supertype of str and unicode. As far as mypy is concerned it's almost the same as Union[str, unicode]. Maybe we could add this to typing.py as well so it's also available in PY3, in that case as a shorthand for Union[str, unicode].

FWIW we are having a long discussion about this topic in the mypy tracker: https://github.com/JukkaL/mypy/issues/1135 -- interested parties are invited to participate there!

--
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From random832 at fastmail.com  Fri Jan 22 14:19:24 2016
From: random832 at fastmail.com (Random832)
Date: Fri, 22 Jan 2016 14:19:24 -0500
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to support Python 2.7
In-Reply-To: 
References: <569F9959.3020202@gmail.com> <56A11FFA.7040408@gmail.com>
Message-ID: <1453490364.3735097.499905674.22B21996@webmail.messagingengine.com>

On Fri, Jan 22, 2016, at 14:08, Guido van Rossum wrote:
> In mypy (and in typeshed and in typing.py), 'str' refers to the type named str in the Python version for which you are checking -- i.e. by default mypy checks in PY3 mode and str will be the unicode type; but "mypy --py2" checks in PY2 mode and str will be the Python 2 8-bit string type. (This is actually the only thing that makes sense IMO.)

Why should it need to check both modes separately? Does it not work at a level where it can see if the expression that a value originates from is "native" (e.g. a literal with no u/b) or bytes/unicode?

From guido at python.org  Fri Jan 22 14:24:13 2016
From: guido at python.org (Guido van Rossum)
Date: Fri, 22 Jan 2016 11:24:13 -0800
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to support Python 2.7
In-Reply-To: <1453490364.3735097.499905674.22B21996@webmail.messagingengine.com>
References: <569F9959.3020202@gmail.com> <56A11FFA.7040408@gmail.com> <1453490364.3735097.499905674.22B21996@webmail.messagingengine.com>
Message-ID: 

On Fri, Jan 22, 2016 at 11:19 AM, Random832 wrote:
> On Fri, Jan 22, 2016, at 14:08, Guido van Rossum wrote:
> > In mypy (and in typeshed and in typing.py), 'str' refers to the type named str in the Python version for which you are checking -- i.e. by default mypy checks in PY3 mode and str will be the unicode type; but "mypy --py2" checks in PY2 mode and str will be the Python 2 8-bit string type. (This is actually the only thing that makes sense IMO.)
>
> Why should it need to check both modes separately? Does it not work at a level where it can see if the expression that a value originates from is "native" (e.g. a literal with no u/b) or bytes/unicode?

There are many differences between PY2 and PY3, not the least in the stubs for the stdlib. If you get an expression by calling a built-in function (or anything else that's not a literal) the type depends on what's in the stub. The architecture of mypy just isn't designed to take two different sets of stubs (and other differences in rules, e.g. whether something's an iterator because it defines '__next__' or 'next') into account at once.

--
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From abarnert at yahoo.com  Fri Jan 22 14:35:55 2016
From: abarnert at yahoo.com (Andrew Barnert)
Date: Fri, 22 Jan 2016 11:35:55 -0800
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to support Python 2.7
In-Reply-To: 
References: <569F9959.3020202@gmail.com> <56A11FFA.7040408@gmail.com>
Message-ID: <3D88B286-0519-42D9-893D-F4B8AD698041@yahoo.com>

On Jan 22, 2016, at 10:37, Brett Cannon wrote:
>
>> On Thu, 21 Jan 2016 at 10:45 Guido van Rossum wrote:
>>
>> Yes, this is a useful thing to discuss.
>>
>> Maybe we can standardize on the types defined by the 'six' package, which is commonly used for 2-3 straddling code:
>>
>>     six.text_type (unicode in PY2, str in PY3)
>>     six.binary_type (str in PY2, bytes in PY3)
>>
>> Actually for the latter we might as well use bytes.
> > I agree that `bytes` should cover str/bytes in Python 2 and `bytes` in Python 3. > > As for the textual type, I say either `text` or `unicode` since they are both unambiguous between Python 2 and 3 and get the point across. The only problem is that, while bytes is a builtin type in both 2.7 and 3.x, with similar behavior (especially in 3.5, where simple %-formatting code works the same as in 2.7), unicode exists in 2.x but not 3.x, so that would require people writing something like "try: unicode except: unicode=str" at the top of every file (or monkeypatching builtins somewhere) for the annotations to actually be valid 3.x code. And, if you're going to do that, using something that's already wide-spread and as close to a de facto standard as possible, like the six type suggested by Guido, seems less disruptive than inventing a new standard (even if "text" or "unicode" is a little nicer than "six.text_type"). (Or, of course, Guido could just get in his time machine and, along with restoring the u string literal prefix in 3.3, also restore the builtin name unicode as a synonym for str, and then this whole mail thread would fade out like Marty McFly.) Also, don't forget "basestring", which some 2.x code uses. A lot of such code just drops bytes support when modernizing, but if not, it has to change to something that means basestring or str|unicode in 2.x and bytes|str in 3.x. Again, six has a solution for that, string_types, and mypy could standardize on that solution too. > And does `str` represent the type for the specific version of Python mypy is running under, or is it pegged to a specific representation across Python 2 and 3? If it's the former then fine, In six-based code, it means native string, and there are tools designed to help you go over all your str uses and decide which ones should be changed to something else (usually text_type or binary_type), but no special name to use when you decide "I really do want native str here". So, I think it makes sense for mypy to assume the same, rather than to encourage people to shadow or rebind str to make mypy happy in 2.x. Speaking of native strings: six code often doesn't use native strings for __str__, instead using explicit text, and the @python_2_unicode_compatible class decorator. Will mypy need special support for that decorator to handle those types? If so, it's probably worth adding; otherwise, it would be encouraging people to stick with native strings instead of switching to text. -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Fri Jan 22 14:44:27 2016 From: guido at python.org (Guido van Rossum) Date: Fri, 22 Jan 2016 11:44:27 -0800 Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to support Python 2.7 In-Reply-To: <3D88B286-0519-42D9-893D-F4B8AD698041@yahoo.com> References: <569F9959.3020202@gmail.com> <56A11FFA.7040408@gmail.com> <3D88B286-0519-42D9-893D-F4B8AD698041@yahoo.com> Message-ID: Looks like our messages crossed. On Fri, Jan 22, 2016 at 11:35 AM, Andrew Barnert wrote: > On Jan 22, 2016, at 10:37, Brett Cannon wrote: > > > On Thu, 21 Jan 2016 at 10:45 Guido van Rossum wrote: > >> >> Yes, this is a useful thing to discuss. >> >> Maybe we can standardize on the types defined by the 'six' package, which >> is commonly used for 2-3 straddling code: >> >> six.text_type (unicode in PY2, str in PY3) >> six.binary_type (str in PY2, bytes in PY3) >> >> Actually for the latter we might as well use bytes. 
>> > > I agree that `bytes` should cover str/bytes in Python 2 and `bytes` in > Python 3. > > As for the textual type, I say either `text` or `unicode` since they are > both unambiguous between Python 2 and 3 and get the point across. > > > The only problem is that, while bytes is a builtin type in both 2.7 and > 3.x, with similar behavior (especially in 3.5, where simple %-formatting > code works the same as in 2.7), unicode exists in 2.x but not 3.x, so that > would require people writing something like "try: unicode except: > unicode=str" at the top of every file (or monkeypatching builtins > somewhere) for the annotations to actually be valid 3.x code. And, if > you're going to do that, using something that's already wide-spread and as > close to a de facto standard as possible, like the six type suggested by > Guido, seems less disruptive than inventing a new standard (even if "text" > or "unicode" is a little nicer than "six.text_type"). > > (Or, of course, Guido could just get in his time machine and, along with > restoring the u string literal prefix in 3.3, also restore the builtin name > unicode as a synonym for str, and then this whole mail thread would fade > out like Marty McFly.) > > Also, don't forget "basestring", which some 2.x code uses. A lot of such > code just drops bytes support when modernizing, but if not, it has to > change to something that means basestring or str|unicode in 2.x and > bytes|str in 3.x. Again, six has a solution for that, string_types, and > mypy could standardize on that solution too. > > And does `str` represent the type for the specific version of Python mypy > is running under, or is it pegged to a specific representation across > Python 2 and 3? If it's the former then fine, > > > In six-based code, it means native string, and there are tools designed to > help you go over all your str uses and decide which ones should be changed > to something else (usually text_type or binary_type), but no special name > to use when you decide "I really do want native str here". So, I think it > makes sense for mypy to assume the same, rather than to encourage people to > shadow or rebind str to make mypy happy in 2.x. > > Speaking of native strings: six code often doesn't use native strings for > __str__, instead using explicit text, and the @python_2_unicode_compatible > class decorator. Will mypy need special support for that decorator to > handle those types? If so, it's probably worth adding; otherwise, it would > be encouraging people to stick with native strings instead of switching to > text. > That decorator is in the typeshed stubs and appears to work -- although it looks like it's just a noop even in PY2. If that requires tweaks please submit a bug to the typeshed project tracker ( https://github.com/python/typeshed/issues). -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Fri Jan 22 15:00:57 2016 From: guido at python.org (Guido van Rossum) Date: Fri, 22 Jan 2016 12:00:57 -0800 Subject: [Python-ideas] PEP 484 change proposal: Allowing @overload outside stub files Message-ID: Ben Darnell (Tornado lead) brought up a good use case for allowing @overload in regular Python files. There's some discussion (some old, some new) here: https://github.com/ambv/typehinting/issues/72 I now propose to allow @overload in non-stub (i.e. 
.py) files, but with the following rule: a series of @overload-decorated functions must be followed by an implementation function that's not @overload-decorated. Calling an @overload-decorated function is still an error (I propose NotImplementedError). Due to the way repeated function definitions with the same name replace each other, leaving only the last one active, this should work. E.g. for Tornado's utf8() the full definition would look like this:

@overload
def utf8(value: None) -> None: ...
@overload
def utf8(value: bytes) -> bytes: ...
@overload
def utf8(value: str) -> bytes: ...  # or (unicode) -> bytes, in PY2
def utf8(value):
    # Real implementation goes here.

NOTE: If you are trying to understand why we can't use a stub file here or why we can't solve this with type variables or unions, please read the issue and comment there if things are not clear. Here on python-ideas I'd like to focus on seeing whether this amendment is non-controversial (apart from tea party members who just want to repeal PEP 484 entirely :-).

I know that previously we wanted to come up with a complete solution for multi-dispatch based on type annotations first, and there are philosophical problems with using @overload (though it can be made to work using sys._getframe()). The proposal here is *not* that solution. If you call the @overload-decorated function, it will raise NotImplementedError. (But if you follow the rule, the @overload-decorated function objects are inaccessible so this would only happen if you forgot or misspelled the final, undecorated implementation function).

-- 
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From p.f.moore at gmail.com Fri Jan 22 15:58:43 2016
From: p.f.moore at gmail.com (Paul Moore)
Date: Fri, 22 Jan 2016 20:58:43 +0000
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to support Python 2.7
In-Reply-To: References: <569F9959.3020202@gmail.com> <56A11FFA.7040408@gmail.com>
Message-ID: 

On 22 January 2016 at 19:08, Guido van Rossum wrote:
> On Fri, Jan 22, 2016 at 10:37 AM, Brett Cannon wrote:
>>
>> On Thu, 21 Jan 2016 at 10:45 Guido van Rossum wrote:
>>>
>>> On Thu, Jan 21, 2016 at 10:14 AM, Agustín Herranz Cecilia
>>> wrote:
>>> [...]
>>> Yes, this is no related with the choice of syntax for annotations
>>> directly. This is intended to help in the process of porting python2 code to
>>> python3, and it's outside of the PEP scope but related to the original
>>> problem. What I have in mind is some type aliases so you could annotate a
>>> version specific type to avoid ambiguousness on code that it's used on
>>> different versions. At the end what I originally try to said is that it's
>>> good to have a convention way to name this type aliases.
>>>
>>> Yes, this is a useful thing to discuss.
>>>
>>> Maybe we can standardize on the types defined by the 'six' package, which
>>> is commonly used for 2-3 straddling code:
>>>
>>> six.text_type (unicode in PY2, str in PY3)
>>> six.binary_type (str in PY2, bytes in PY3)
>>>
>>> Actually for the latter we might as well use bytes.
>>
>> I agree that `bytes` should cover str/bytes in Python 2 and `bytes` in
>> Python 3.
>
> OK, that's settled.
>
>> As for the textual type, I say either `text` or `unicode` since they are
>> both unambiguous between Python 2 and 3 and get the point across.
>
> Then let's call it unicode. I suppose we can add this to typing.py. In PY2,
> typing.unicode is just the built-in unicode. In PY3, it's the built-in str.
This thread came to my attention just as I'd been thinking about a related point.

For me, by far the worst Unicode-related porting issue I see is people with a confused view of what type of data reading a file will give. This is because open() returns a different type (byte stream or character stream) depending on its arguments (specifically 'b' in the mode) and it's frustratingly difficult to track this type across function calls - especially in code originally written in a Python 2 environment where people *expect* to confuse bytes and strings in this context. So, for example, I see a function read_one_byte which does f.read(1), and works fine in real use when a data file (opened with 'b') is processed, but fails when sys.stdin is used (on Python 3, once someone types a Unicode character).

As far as I know, there's no way for type annotations to capture this distinction - either as they are at present in Python 3, nor as being discussed here. But what I'm not sure of is whether it's something that *could* be tracked by a type checker. Of course I'm also not sure I'm right when I say you can't do it right now :-)

Is this something worth including in the discussion, or is it a completely separate topic?
Paul

From guido at python.org Fri Jan 22 16:11:24 2016
From: guido at python.org (Guido van Rossum)
Date: Fri, 22 Jan 2016 13:11:24 -0800
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to support Python 2.7
In-Reply-To: References: <569F9959.3020202@gmail.com> <56A11FFA.7040408@gmail.com>
Message-ID: 

Interesting. PEP 484 defines an IO generic class, so you can write IO[str] or IO[bytes]. Maybe introducing separate helper functions that open files in text or binary mode can complement this to get a solution?

On Fri, Jan 22, 2016 at 12:58 PM, Paul Moore wrote:
> On 22 January 2016 at 19:08, Guido van Rossum wrote:
> > On Fri, Jan 22, 2016 at 10:37 AM, Brett Cannon wrote:
> >>
> >> On Thu, 21 Jan 2016 at 10:45 Guido van Rossum wrote:
> >>>
> >>> On Thu, Jan 21, 2016 at 10:14 AM, Agustín Herranz Cecilia
> >>> wrote:
> >>> [...]
> >>> Yes, this is no related with the choice of syntax for annotations
> >>> directly. This is intended to help in the process of porting python2 code to
> >>> python3, and it's outside of the PEP scope but related to the original
> >>> problem. What I have in mind is some type aliases so you could annotate a
> >>> version specific type to avoid ambiguousness on code that it's used on
> >>> different versions. At the end what I originally try to said is that it's
> >>> good to have a convention way to name this type aliases.
> >>>
> >>> Yes, this is a useful thing to discuss.
> >>>
> >>> Maybe we can standardize on the types defined by the 'six' package, which
> >>> is commonly used for 2-3 straddling code:
> >>>
> >>> six.text_type (unicode in PY2, str in PY3)
> >>> six.binary_type (str in PY2, bytes in PY3)
> >>>
> >>> Actually for the latter we might as well use bytes.
> >>
> >> I agree that `bytes` should cover str/bytes in Python 2 and `bytes` in
> >> Python 3.
> >
> > OK, that's settled.
> >
> >> As for the textual type, I say either `text` or `unicode` since they are
> >> both unambiguous between Python 2 and 3 and get the point across.
> >
> > Then let's call it unicode. I suppose we can add this to typing.py. In PY2,
> > typing.unicode is just the built-in unicode. In PY3, it's the built-in str.
> > This thread came to my attention just as I'd been thinking about a > related point. > > For me, by far the worst Unicode-related porting issue I see is people > with a confused view of what type of data reading a file will give. > This is because open() returns a different type (byte stream or > character stream) depending on its arguments (specifically 'b' in the > mode) and it's frustratingly difficult to track this type across > function calls - especially in code originally written in a Python 2 > environment where people *expect* to confuse bytes and strings in this > context. So, for example, I see a function read_one_byte which does > f.read(1), and works fine in real use when a data file (opened with > 'b') is processed, but fails when sys.stdin us used (on Python 3once > someone types a Unicode character). > > As far as I know, there's no way for type annotations to capture this > distinction - either as they are at present in Python3, nor as being > discussed here. But what I'm not sure of is whether it's something > that *could* be tracked by a type checker. Of course I'm also not sure > I'm right when I say you can't do it right now :-) > > Is this something worth including in the discussion, or is it a > completely separate topic? > Paul > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From abarnert at yahoo.com Fri Jan 22 16:40:21 2016 From: abarnert at yahoo.com (Andrew Barnert) Date: Fri, 22 Jan 2016 13:40:21 -0800 Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to support Python 2.7 In-Reply-To: References: <569F9959.3020202@gmail.com> <56A11FFA.7040408@gmail.com> Message-ID: On Jan 22, 2016, at 13:11, Guido van Rossum wrote: > > Interesting. PEP 484 defines an IO generic class, so you can write IO[str] or IO[bytes]. Maybe introducing separate helper functions that open files in text or binary mode can complement this to get a solution? The runtime types are a little weird here as well. In 3.x, open returns different types depending on the value, rather than the type, of its inputs. Also, TextIOBase is a subclass of IOBase, even though it isn't a subtype in the LSP sense, so you have to test isinstance(IOBase) and not isinstance(TextIOBase) to know that read() is going to return bytes. That's all a little wonky, but not impossible to deal with. In 2.x, most file-like objects--including file itself, which open returns--don't satisfy either ABC, and most of them can return either type from read. Having a different function for open-binary instead of a mode flag would solve this, but it seems a little late to be adding that now. You'd have to go through all your 2.x code and change every open to one of the two new functions just to statically type your code, and then change it again for 3.x. Plus, you'd need to do the same thing not just for the builtin open, but for every library that provides an open-like method. Maybe this special case is special enough that static type checkers just have to deal with it specially? When the mode flag is a literal, process it; when it's forwarded from another function, it may be possible to get the type from there; otherwise, everything is just unicode|bytes and the type checker can't know any more unless you explicitly tell it (by annotating the variable the result of open is stored in). 
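For concreteness, a Python 3 sketch of the separate-helper-functions idea (the names open_text and open_binary are assumptions for illustration): each wrapper fixes the mode, so the stream type is statically known no matter where the call appears.

from typing import IO

def open_text(path: str) -> IO[str]:
    # Mode is fixed to text, so a checker knows read() returns str.
    return open(path, 'r')

def open_binary(path: str) -> IO[bytes]:
    # Mode is fixed to binary, so a checker knows read() returns bytes.
    return open(path, 'rb')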
> >> On Fri, Jan 22, 2016 at 12:58 PM, Paul Moore wrote:
> >> [...]
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From guido at python.org Fri Jan 22 18:17:27 2016 From: guido at python.org (Guido van Rossum) Date: Fri, 22 Jan 2016 15:17:27 -0800 Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to support Python 2.7 In-Reply-To: References: <569F9959.3020202@gmail.com> <56A11FFA.7040408@gmail.com> Message-ID: On Fri, Jan 22, 2016 at 1:40 PM, Andrew Barnert wrote: > On Jan 22, 2016, at 13:11, Guido van Rossum wrote: > > Interesting. PEP 484 defines an IO generic class, so you can write IO[str] > or IO[bytes]. Maybe introducing separate helper functions that open files > in text or binary mode can complement this to get a solution? > > > The runtime types are a little weird here as well. > > In 3.x, open returns different types depending on the value, rather than > the type, of its inputs. Also, TextIOBase is a subclass of IOBase, even > though it isn't a subtype in the LSP sense, so you have to test > isinstance(IOBase) and not isinstance(TextIOBase) to know that read() is > going to return bytes. That's all a little wonky, but not impossible to > deal with. > Agreed. At this level it's really hard to fix. :-( > In 2.x, most file-like objects--including file itself, which open > returns--don't satisfy either ABC, and most of them can return either type > from read. > Well, the type returned by the builtin open() never returns Unicode. For duck types (and even StringIO) it's indeed a crapshoot. :-( > Having a different function for open-binary instead of a mode flag would > solve this, but it seems a little late to be adding that now. You'd have to > go through all your 2.x code and change every open to one of the two new > functions just to statically type your code, and then change it again for > 3.x. Plus, you'd need to do the same thing not just for the builtin open, > but for every library that provides an open-like method. > Yeah, painful. Though in most cases you can also patch up individual calls using cast(IO[str], open(...)) etc. > Maybe this special case is special enough that static type checkers just > have to deal with it specially? When the mode flag is a literal, process > it; when it's forwarded from another function, it may be possible to get > the type from there; otherwise, everything is just unicode|bytes and the > type checker can't know any more unless you explicitly tell it (by > annotating the variable the result of open is stored in). > That would be a lot of work too. We have so many other important-but-not-urgent things already that I would really like to push back on this until someone has actually tried the alternative and tells us how bad it is (like Ben Darnell did for @overload). -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjreedy at udel.edu Fri Jan 22 21:32:44 2016 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 22 Jan 2016 21:32:44 -0500 Subject: [Python-ideas] PEP 484 change proposal: Allowing @overload outside stub files In-Reply-To: References: Message-ID: On 1/22/2016 3:00 PM, Guido van Rossum wrote: > Ben Darnell (Tornado lead) brought up a good use case for allowing > @overload in regular Python files. > > There's some discussion (some old, some new) here: > https://github.com/ambv/typehinting/issues/72 > > I now propose to allow @overload in non-stub (i.e. .py) files, From a naive point of view, it is the prohibition that is exceptional and in need of justification. So its removal would seem non-problematical. 
> but with
> the following rule: a series of @overload-decorated functions must be
> followed by an implementation function that's not @overload-decorated.
> Calling an @overload-decorated function is still an error (I propose
> NotImplementedError). Due to the way repeated function definitions with the
> same name replace each other, leaving only the last one active, this
> should work. E.g. for Tornado's utf8() the full definition would look
> like this:
>
> @overload
> def utf8(value: None) -> None: ...
> @overload
> def utf8(value: bytes) -> bytes: ...
> @overload
> def utf8(value: str) -> bytes: ...  # or (unicode) -> bytes, in PY2
> def utf8(value):
>     # Real implementation goes here.

Again, from a naive point of view, treating 'overload' the same as 'x', this seems wasteful, so the usage must be a consenting-adult tradeoff between the time taken to create function objects that get thrown away and the side-effect of 'overload'. I do understand that non-beginners with expectations based on other languages, who don't know Python's specific usage of 'overload', may get confused.

Your proposed implementation is missing a return statement.

def overload(func):
    def overload_dummy(*args, **kwds):
        raise NotImplementedError(
            "You should not call an overloaded function. "
            "A series of @overload-decorated functions "
            "outside a stub module should always be followed "
            "by an implementation that is not @overloaded.")

To avoid throwing away two functions with each def, I suggested moving the constant replacement outside of overload.

def _overload_dummy(*args, **kwds):
    raise NotImplementedError(
        "You should not call an overloaded function. "
        "A series of @overload-decorated functions "
        "outside a stub module should always be followed "
        "by an implementation that is not @overloaded.")

def overload(func):
    return _overload_dummy

> NOTE: If you are trying to understand why we can't use a stub file here
> or why we can't solve this with type variables or unions, please read
> the issue and comment there if things are not clear. Here on
> python-ideas I'd like to focus on seeing whether this amendment is
> non-controversial (apart from tea party members who just want to repeal
> PEP 484 entirely :-).

Sorry, I don't see any connection between tea party philosophy and type hinting, except maybe in the opposite direction. Maybe we should continue leaving external politics, US or otherwise, out of pydev discussions.

> I know that previously we wanted to come up with a complete solution for
> multi-dispatch based on type annotations first, and there are
> philosophical problems with using @overload (though it can be made to
> work using sys._getframe()). The proposal here is *not* that solution.

It would be possible for _overload_dummy to add a reference to each func arg to a list or set thereof. Perhaps you meant more or less the same with 'using sys._getframe()'. The challenge would be writing a new 'overload_complete' decorator, for the last function, that would combine the pieces.

-- 
Terry Jan Reedy

From abarnert at yahoo.com Fri Jan 22 22:28:48 2016
From: abarnert at yahoo.com (Andrew Barnert)
Date: Fri, 22 Jan 2016 19:28:48 -0800
Subject: [Python-ideas] PEP 484 change proposal: Allowing @overload outside stub files
In-Reply-To: References: Message-ID: <935BD252-7AF1-4DE6-ABDF-6135817D776A@yahoo.com>

On Jan 22, 2016, at 12:00, Guido van Rossum wrote:
>
> Ben Darnell (Tornado lead) brought up a good use case for allowing @overload in regular Python files.
> > There's some discussion (some old, some new) here: https://github.com/ambv/typehinting/issues/72 > > I now propose to allow @overload in non-stub (i.e. .py) files, but with the following rule: a series of @overload-decorated functions must be followed by an implementation function that's not @overload-decorated. Calling an @overload-decorated function is still an error (I propose NotImplemented). Due to the way repeated function definitions with the same name replace each other, leaving only the last one active, this should work. E.g. for Tornado's utf8() the full definition would look like this: > > @overload > def utf8(value: None) -> None: ... > @overload > def utf8(value: bytes) -> bytes: ... > @overload > def utf8(value: str) -> bytes: ... # or (unicode)->bytes, in PY2 > def utf8(value): > # Real implementation goes here. It feels like this is too similar to single_dispatch to be so different in details. I get the distinction (the former is for a function that has a single implementation but acts like a bunch of overloads that switch on type, the latter is for a function that's actually implemented as a bunch of overloads that switch on type), and I also get that it'll be much easier to extend overload to compile-time multiple dispatch than to extend single_dispatch to run-time multiple dispatch (and you don't want the former to have to wait on the latter), and so on. But it still feels like someone has stapled together two languages here. (Of course I feel the same way about typing.protocols and implicit ABCs, and I know you disagreed there, so I wouldn't be too surprised if you disagree here as well. But this is even _more_ of a distinct and parallel system than that was--at least typing.Sized is in some way related to collections.abc.Sized, while overload is not related to single_dispatch at all, so someone who finds the wrong one in a search seems much more liable to assume that Python just doesn't have the one he wants than to find it.) Other than that detail, I like everything else: some feature like this should definitely be part of the type checker (the only alternative is horribly complex type annotations); if it's allowed in stub files, it should be allowed in source files; and the rule of "follow the overloads with the real implementation" seems like by far the simplest rule that could make this work. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Fri Jan 22 23:04:43 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 23 Jan 2016 14:04:43 +1000 Subject: [Python-ideas] PEP 484 change proposal: Allowing @overload outside stub files In-Reply-To: References: Message-ID: On 23 January 2016 at 06:00, Guido van Rossum wrote: > Ben Darnell (Tornado lead) brought up a good use case for allowing @overload > in regular Python files. > > There's some discussion (some old, some new) here: > https://github.com/ambv/typehinting/issues/72 > > I now propose to allow @overload in non-stub (i.e. .py) files, but with the > following rule: a series of @overload-decorated functions must be followed > by an implementation function that's not @overload-decorated. Calling an > @overload-decorated function is still an error (I propose NotImplemented). > Due to the way repeated function definitions with the same name replace each > other, leaving only the last one active, this should work. E.g. for > Tornado's utf8() the full definition would look like this: > > @overload > def utf8(value: None) -> None: ... 
> @overload > def utf8(value: bytes) -> bytes: ... > @overload > def utf8(value: str) -> bytes: ... # or (unicode)->bytes, in PY2 > def utf8(value): > # Real implementation goes here. I share Andrew's concerns about the lack of integration between this and functools.singledispatch, so would it be feasible to apply the "magic comments" approach here, similar to the workarounds for variable annotations and Py2 compatible function annotations? That is, would it be possible to use a notation like the following?: def utf8(value): # type: (None) -> None # type: (bytes) -> bytes # type: (unicode) -> bytes ... You're already going to have to allow this for single lines to handle Py2 compatible annotations, so it seems reasonable to also extend it to handle overloading while you're still figuring out a native syntax for that. Regards, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From abarnert at yahoo.com Fri Jan 22 23:50:52 2016 From: abarnert at yahoo.com (Andrew Barnert) Date: Fri, 22 Jan 2016 20:50:52 -0800 Subject: [Python-ideas] Explicit variable capture list In-Reply-To: References: <20160120003712.GZ10854@ando.pearwood.info> <20160121001027.GB4619@ando.pearwood.info> Message-ID: On Jan 21, 2016, at 00:48, Victor Stinner wrote: > > Hi, > > Sorry but I'm lost in this long thread. I think the whole issue of const optimization is taking this discussion way off track, so let me try to summarize the actual issue. What the thread is ultimately looking for is a solution to the "closures capturing loop variables" problem. This problem has been in the official programming FAQ[1] for decades, as "Why do lambdas defined in a loop with different values all return the same result"? powers = [lambda x: x**i for i in range(10)] This gives you ten functions that all return x**9, which is probably not what you wanted. The reason this is a problem is that Python uses "late binding", which in this context means that each of those functions is a closure that captures the variable i in a way that looks up the value of i at call time. All ten functions capture the same variable, and when you later call them, that variable's value is 9. Almost every language with real closures and for-each loops has the same problem, but someone who's coming to Python as a first language, or coming from a language like C that doesn't have those features, is almost guaranteed to be confused by this when he first sees it. (Presumably, that's why it's in the FAQ.) The OP proposed that we should add some syntax, borrowed from C++, to function definitions that specifies that some things get captured by value. You could instead describe this as early binding the specified names, or as not capturing at all, but however you describe it, the idea is pretty simple. The obvious way to implement it is to copy the values into the function object at function-creation time, then copy them into locals at call time--exactly like default parameter values. (Not too surprising, because default parameter values are the idiomatic workaround today.) A few alternatives to the parameter-like syntax borrowed from C++ were proposed, including "def power(x; i):" (notice the semicolon) and "def power(x)(i):". A few people also proposed a new declaration statement similar to "global" and "nonlocal"--which opens the question of what to call it; suggested names included "shared", "sharedlocal", and "capture". 
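The effect, and the idiomatic default-value workaround mentioned below, can be demonstrated in a few lines (a self-contained sketch):

powers = [lambda x: x**i for i in range(10)]
print(powers[2](2))  # 512, i.e. 2**9 -- every lambda reads i at call time

# Workaround: capture the current value as a default parameter value,
# which is evaluated once, at function definition time.
powers = [lambda x, i=i: x**i for i in range(10)]
print(powers[2](2))  # 4, i.e. 2**2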
People also suggested an optimization: store them like constants, instead of like default values, so they don't need to be copied into locals. (This is similar to the global->const optimizations being discussed in the FAT threads, but here it's optimizing the equivalent of default parameter values, not globals. Which means it's much less important of an optimization, since defaults are only fetched once per call, after which they're looked up the same as locals, which are just as fast as consts. It _could_ potentially feed into further FAT-type optimizations, but that's getting pretty speculative.) The obvious downside here is that constants are stored in the code object, so instead of 10 (small) function objects all sharing the same (big) code object, you'd have 10 function objects with 10 separate (big) code objects Another alternative, which I don't think anyone seriously considered, is to flag the specified freevars so that, at function creation time, we copy the cell and bind that copy, instead of binding the original cell. (This alternative can't really be called "early binding" or "capture by value", but it has the same net effect.) Finally, Terry suggested a completely different solution to the problem: don't change closures; change for loops. Make them create a new variable each time through the loop, instead of reusing the same variable. When the variable isn't captured, this would make no difference, but when it is, closures from different iterations would capture different variables (and therefore different cells). For backward-compatibility reasons, this might have to be optional, which means new syntax; he proposed "for new i in range(10):". I don't know of any languages that use the C++-style solution that don't have lvalues to worry about. It's actually necessary for other reasons in C++ (capturing a variable doesn't extend its lifetime, so you need to be able to explicitly copy things or you end up with dangling references), but those reasons don't apply to Python (or C#, Swift, JavaScript, etc.). Still, it is a well-known solution to the problem. Terry's solution, on the other hand, is used by Swift (from the start, even though it _does_ have lvalues), C# (since 5.0), and Ruby (since 1.9), among other languages. C#, in particular, decided to add it as a breaking change to a mature language, rather than adding new syntax, because Eric Lippert believed that almost any code that's relying on the old behavior is probably a bug rather than intentional. [1]: https://docs.python.org/3/faq/programming.html#why-do-lambdas-defined-in-a-loop-with-different-values-all-return-the-same-result > Do you want to extend the > Python language to declare constant in a function? Maybe I'm completly > off-topic, sorry. > > > 2016-01-21 1:10 GMT+01:00 Steven D'Aprano : >> (2) If we limit this to only capturing the same name, then we can only >> write (say) "static x", and that does look like a declaration. But maybe >> we want to allow the local name to differ from the global name: >> >> static x = y > > 3 months ago, Serhiy Storchaka proposed a "const var = expr" syntax: > https://mail.python.org/pipermail/python-ideas/2015-October/037028.html > > With a shortcut "const len" which is like "const len = len". > > In the meanwhile, I implemented an optimization in my FAT Python > project: "Copy builtins to constant". 
It's quite simple: replace the > "LOAD_GLOBAL builtin" instruction with a "LOAD_CONST builtin" > transation and "patch" co_consts constants of a code object at > runtime: > > def hello(): print("hello world") > > is replaced with: > > def hello(): "LOAD_GLOBAL print"("hello world") > hello.__code__ = fat.replace_consts(hello.__code__, {'LOAD_GLOBAL > print': print}) > > Where fat.replace_consts() is an helper to create a new code object > replacing constants with the specified mapping: > http://fatoptimizer.readthedocs.org/en/latest/fat.html#replace_consts > > Replacing print(...) with "LOAD_GLOBAL"(...) is done in the > fatoptimizer (an AST optimpizer): > http://fatoptimizer.readthedocs.org/en/latest/optimizations.html#copy-builtin-functions-to-constants > > We have to inject the builtin function at runtime. It cannot be done > when the code object is created by "def ..." because a code object can > only contain objects serializable by marshal (to be able to compile a > .py file to a .pyc file). > > >> I acknowledge that this goes beyond what the OP asked for, and I think >> that YAGNI is a reasonable response to the static block idea. I'm not >> going to champion it any further unless there's a bunch of interest from >> others. (I'm saving my energy for Eiffel-like require/ensure blocks >> *wink*). > > The difference between "def hello(print=print): ..." and Serhiy's > const idea (or my optimization) is that "def hello(print=print): ..." > changes the signature of the function which can be a serious issue in > an API. > > Note: The other optimization "local_print = print" in the function is > only useful for loops (when the builtin is loaded multiple times) and > it still loads the builtin once per function call, whereas my > optimization uses a constant and so no lookup is required anymore. > > Then guards are used to disable the optimization if builtins are > modified. See the PEP 510 for an explanation on that part. > > Victor > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mike at selik.org Fri Jan 22 23:58:02 2016 From: mike at selik.org (Michael Selik) Date: Fri, 22 Jan 2016 23:58:02 -0500 Subject: [Python-ideas] Explicit variable capture list In-Reply-To: References: <20160120003712.GZ10854@ando.pearwood.info> <20160121001027.GB4619@ando.pearwood.info> Message-ID: <53B472E8-9D3F-4624-A840-B32DF75DE19D@selik.org> > On Jan 22, 2016, at 11:50 PM, Andrew Barnert via Python-ideas wrote: > > On Jan 21, 2016, at 00:48, Victor Stinner wrote: > >> Hi, >> >> Sorry but I'm lost in this long thread. > > I think the whole issue of const optimization is taking this discussion way off track, so let me try to summarize the actual issue. > > What the thread is ultimately looking for is a solution to the "closures capturing loop variables" problem. This problem has been in the official programming FAQ[1] for decades, as "Why do lambdas defined in a loop with different values all return the same result"? > > powers = [lambda x: x**i for i in range(10)] > > This gives you ten functions that all return x**9, which is probably not what you wanted. The original request could have also been solved with ``functools.partial``. Sure, this is a toy solution, but the problem as originally shared was a toy problem. 
>>> from functools import partial
>>> a = 1
>>> f = partial(lambda a, x: a + x, a)
>>> f(10)
11
>>> a = 2
>>> f(10)
11

Seems to me quite similar to the original suggestion from haael:

"""
a = 1
b = 2
c = 3
fun = lambda[a, b, c] x, y: a + b + c + x + y
"""

From rosuav at gmail.com Sat Jan 23 00:06:36 2016
From: rosuav at gmail.com (Chris Angelico)
Date: Sat, 23 Jan 2016 16:06:36 +1100
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: References: <20160120003712.GZ10854@ando.pearwood.info> <20160121001027.GB4619@ando.pearwood.info>
Message-ID: 

On Sat, Jan 23, 2016 at 3:50 PM, Andrew Barnert via Python-ideas wrote:
> Finally, Terry suggested a completely different solution to the problem:
> don't change closures; change for loops. Make them create a new variable
> each time through the loop, instead of reusing the same variable. When the
> variable isn't captured, this would make no difference, but when it is,
> closures from different iterations would capture different variables (and
> therefore different cells). For backward-compatibility reasons, this might
> have to be optional, which means new syntax; he proposed "for new i in
> range(10):".

Not just for backward compatibility. Python's scoping and assignment rules are currently very straightforward: assignment creates a local name unless told otherwise by a global/nonlocal declaration, and *all* name binding follows the same rules as assignment. Off the top of my head, I can think of two special cases, neither of which is truly a change to the binding semantics: "except X as Y:" triggers an unbinding at the end of the block, and comprehensions have a hidden function boundary that means their iteration variables are more local than you might think. Making for loops behave differently by default would be a stark break from that tidiness.

It seems odd to change this on the loop, though. Is there any reason to use "for new i in range(10):" if you're not making a series of nested functions? Seems most logical to make this a special way of creating functions, not of looping.

ChrisA

From tjreedy at udel.edu Sat Jan 23 00:09:37 2016
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat, 23 Jan 2016 00:09:37 -0500
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: References: <20160120003712.GZ10854@ando.pearwood.info> <20160121001027.GB4619@ando.pearwood.info>
Message-ID: 

On 1/22/2016 11:50 PM, Andrew Barnert via Python-ideas wrote:

> Finally, Terry suggested a completely different solution to the problem:
> don't change closures; change for loops.

I remember that proposal, but it was someone other than me.

-- 
Terry Jan Reedy

From abarnert at yahoo.com Sat Jan 23 00:36:30 2016
From: abarnert at yahoo.com (Andrew Barnert)
Date: Fri, 22 Jan 2016 21:36:30 -0800
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: References: <20160120003712.GZ10854@ando.pearwood.info> <20160121001027.GB4619@ando.pearwood.info>
Message-ID: <5AD4E9DA-1970-45E3-BB20-DD5FB8D55833@yahoo.com>

On Jan 22, 2016, at 21:06, Chris Angelico wrote:
>
> On Sat, Jan 23, 2016 at 3:50 PM, Andrew Barnert via Python-ideas
> wrote:
>> Finally, Terry suggested a completely different solution to the problem:
>> don't change closures; change for loops. Make them create a new variable
>> each time through the loop, instead of reusing the same variable. When the
>> variable isn't captured, this would make no difference, but when it is,
>> closures from different iterations would capture different variables (and
>> therefore different cells).
For backward-compatibility reasons, this might >> have to be optional, which means new syntax; he proposed "for new i in >> range(10):". > > Not just for backward compatibility. Python's scoping and assignment > rules are currently very straight-forward: assignment creates a local > name unless told otherwise by a global/nonlocal declaration, and *all* > name binding follows the same rules as assignment. Off the top of my > head, I can think of two special cases, neither of which is truly a > change to the binding semantics: "except X as Y:" triggers an > unbinding at the end of the block, and comprehensions have a hidden > function boundary that means their iteration variables are more local > than you might think. Making for loops behave differently by default > would be a stark break from that tidiness. As a side note, notice that if you don't capture the variable, there is no observable difference (which means CPython would be well within its rights to optimize it by reusing the same variable unless it's a cellvar). Anyway, yes, it's still something that you have to learn--but the unexpected-on-first-encounter interaction between loop variables and closures is also something that everybody has to learn. And, even after you understand it, it still doesn't become obvious until you've been bitten by it enough times (and if you're going back and forth between Python and a language that's solved the problem, one way or the other, you may keep relearning it). So, theoretically, the status quo is certainly simpler, but in practice, I'm not sure it is. > It seems odd to change this on the loop, though. Is there any reason > to use "for new i in range(10):" if you're not making a series of > nested functions? Rarely if ever. But is there any reason to "def spam(x; i):" or "def [i](x):" or whatever syntax people like if you're not overwriting i with a different and unwanted value? And is there any reason to reuse a variable you've bound in that way if a loop isn't forcing you to do so? This problem comes up all the time, in all kinds of languages, when loops and closures intersect. It almost never comes up with loops alone or closures alone. > Seems most logical to make this a special way of > creating functions, not of looping. There are also some good theoretical motivations for changing loops, but I'm really hoping someone else (maybe the Swift or C# dev team blogs) has already written it up, so I can just post a link and a short "... and here's why it also applies to Python" (complicated by the fact that one of the motivations _doesn't_ apply to Python...). Also, the idea of a closure "capturing by value" is pretty strange on the surface; you have to think through why that doesn't just mean "not capturing" in a language like Python. Nick Coghlan suggests calling it "capture at definition" vs. "capture at call", which helps, but it's still weird. Weirder than loops creating a new binding that has the same name as the old one in a let-less language? I don't know. They're both weird. And so is the existing behavior, despite the fact that it makes perfect sense once you work it through. Anyway, for now, I'll just repeat that Ruby, Swift, C#, etc. all solved this by changing for loops, while only C++, which already needed to change closures because of its lifetime rules, solved it by changing closures. On the other hand, JavaScript and Java both explicitly rejected any change to fix the problem, and Python has lived with it for a long time, so... 
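As a point of reference, the per-iteration binding that "for new i" would provide can already be approximated with a factory function, since each call creates a fresh scope and therefore a fresh cell (a minimal sketch):

def make_power(i):
    # Each call binds its own i, so each returned closure
    # captures a different cell.
    return lambda x: x**i

powers = [make_power(i) for i in range(10)]
print(powers[2](2))  # 4 -- each closure remembers its own i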
From guido at python.org Sat Jan 23 00:43:12 2016 From: guido at python.org (Guido van Rossum) Date: Fri, 22 Jan 2016 21:43:12 -0800 Subject: [Python-ideas] PEP 484 change proposal: Allowing @overload outside stub files In-Reply-To: References: Message-ID: On Fri, Jan 22, 2016 at 8:04 PM, Nick Coghlan wrote: > On 23 January 2016 at 06:00, Guido van Rossum wrote: > > Ben Darnell (Tornado lead) brought up a good use case for allowing > @overload > > in regular Python files. > > > > There's some discussion (some old, some new) here: > > https://github.com/ambv/typehinting/issues/72 > > > > I now propose to allow @overload in non-stub (i.e. .py) files, but with > the > > following rule: a series of @overload-decorated functions must be > followed > > by an implementation function that's not @overload-decorated. Calling an > > @overload-decorated function is still an error (I propose > NotImplemented). > > Due to the way repeated function definitions with the same name replace > each > > other, leaving only the last one active, this should work. E.g. for > > Tornado's utf8() the full definition would look like this: > > > > @overload > > def utf8(value: None) -> None: ... > > @overload > > def utf8(value: bytes) -> bytes: ... > > @overload > > def utf8(value: str) -> bytes: ... # or (unicode)->bytes, in PY2 > > def utf8(value): > > # Real implementation goes here. > > I share Andrew's concerns about the lack of integration between this > and functools.singledispatch, so would it be feasible to apply the > "magic comments" approach here, similar to the workarounds for > variable annotations and Py2 compatible function annotations? > > That is, would it be possible to use a notation like the following?: > > def utf8(value): > # type: (None) -> None > # type: (bytes) -> bytes > # type: (unicode) -> bytes > ... > > You're already going to have to allow this for single lines to handle > Py2 compatible annotations, so it seems reasonable to also extend it > to handle overloading while you're still figuring out a native syntax > for that. > That's clever. I'm sure we could make it work if we wanted to. But it doesn't map to anything else -- in stub files do already have @overload, and there's no way to translate this directly to Python 3 in-line annotations. Regarding the confusion with @functools.singledispatch, hopefully all documentation for @overload (including StackOverflow :-) would quickly point out how you are supposed to use it. There's also a deep distinction between @overload in PEP 484 and singledispatch, multidispatch or even the (ultimately deferred) approach from PEP 3124, also called @overload. PEP 484's @overload (whether in stubs or in this proposed form in .py files) talks to the *type checker* and it can be used with generic types. For example, suppose you have @overload def foo(a: Sequence[int]) -> int: ... @overload def foo(a: Sequence[str]) -> float: ... def foo(a): return sum(float(x) for x in a) (NOTE: Don't be fooled to think that the implementation is the last word on the supported types and hence the list of overloads is "obviously" incomplete. The type checker needs to take the overloads at their word and reject calls to e.g. foo([3.14]). A future implementation that matches the overloaded signatures given here might not work for float arguments.) Here the implementation will have to somehow figure out whether its argument is a list of integers or strings, e.g. 
by checking the type of the first item -- that should be okay since passing the type check implies a promise that the argument is homogeneous. But a purely runtime dispatcher would not be able to make that distinction so easily, since PEP 484 assumes type erasure -- at runtime the argument is just a Sequence. Of course, functools.singledispatch sidesteps this by not supporting generic types at all. But the example illustrates that the two are more different than you'd think from the utf8() example (which just distinguishes between unicode, bytes and None -- no generic types there). >From a type-checking perspective, functools.singledispatch is not easy to handle -- it is defined in terms of its runtime behavior, and it explicitly supports dynamic registration. (Who uses it? There's only one use in the stdlib, which is in pkgutil.py, under the guise @simplegeneric.) Clearly both @overload and functools.singledispatch are stepping stones on the way to an elusive better solution. Hopefully until that solution is found they can live together? -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Sat Jan 23 00:50:03 2016 From: guido at python.org (Guido van Rossum) Date: Fri, 22 Jan 2016 21:50:03 -0800 Subject: [Python-ideas] Typehinting repo moved to python/typing Message-ID: This is just a note that with Benjamin's help we've moved the ambv/typehinting repo on GitHub into the python org, so its URL is now https://github.com/python/typing . This repo was used most intensely for discussions during PEP 484's drafting period. It also contains the code for typing.py, repackaged for earlier releases on PyPI. The issue tracker is still open for proposals to change PEP 484, which is not unheard of given its provisional status. If you find a pointer to the original location of this repo in a file you can update, please go ahead (though GitHub is pretty good at forwarding URLs from renamed repos). -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Sat Jan 23 00:54:09 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 23 Jan 2016 15:54:09 +1000 Subject: [Python-ideas] Explicit variable capture list In-Reply-To: References: <20160120003712.GZ10854@ando.pearwood.info> <20160121001027.GB4619@ando.pearwood.info> Message-ID: On 23 January 2016 at 14:50, Andrew Barnert via Python-ideas wrote: > What the thread is ultimately looking for is a solution to the "closures > capturing loop variables" problem. This problem has been in the official > programming FAQ[1] for decades, as "Why do lambdas defined in a loop with > different values all return the same result"? > > powers = [lambda x: x**i for i in range(10)] > > This gives you ten functions that all return x**9, which is probably not > what you wanted. > > The reason this is a problem is that Python uses "late binding", which in > this context means that each of those functions is a closure that captures > the variable i in a way that looks up the value of i at call time. All ten > functions capture the same variable, and when you later call them, that > variable's value is 9. Thanks for that summary, Andrew. While I do make some further thoughts below, I'll also note explicitly that I think the status quo in this area is entirely acceptable, and we don't actually *need* to change anything. 
However, there have already been some new ways of looking at the question that haven't come up previously, so I think it's a worthwhile discussion, even though the most likely outcome is still "No change". > The OP proposed that we should add some syntax, borrowed from C++, to > function definitions that specifies that some things get captured by value. > You could instead describe this as early binding the specified names, or as > not capturing at all, but however you describe it, the idea is pretty > simple. The obvious way to implement it is to copy the values into the > function object at function-creation time, then copy them into locals at > call time--exactly like default parameter values. (Not too surprising, > because default parameter values are the idiomatic workaround today.) In an off-list discussion with Andrew, I noted that one reason the "capture by value" terminology was confusing me was because it made me think in terms of "pass by reference" and "pass by value" in C/C++, neither of which is actually relevant to the discussion at hand. However, he also pointed out that "early binding" vs "late binding" was also confusing, since the compile-time/definition-time/call-time distinction in Python is relatively unique, and in many other contexts "early binding" refers to things that happen at compile time. As a result (and as Andrew already noted in another email), I'm currently thinking of the behaviour of nonlocal and global variables as "capture at call", while the values of default parameters are "capture at definition". (If "capture" sounds weird, "resolve at call" and "resolve at definition" also work). The subtlety of this distinction actually shows up in *two* entries in the programming FAQ. Andrew already mentioned the interaction of loops and closures, where capture-at-call surprises people: https://docs.python.org/3/faq/programming.html#why-do-lambdas-defined-in-a-loop-with-different-values-all-return-the-same-result However, there are also mutable default arguments, where it is capture-at-definition that is often surprising: https://docs.python.org/3/faq/programming.html#why-are-default-values-shared-between-objects While nobody's proposing to change the latter, providing an explicit syntax for "capture at definition" may still have a beneficial side effect in making it easier to explain the way default arguments are evaluated and stored on the function object at function definition time rather than created anew each time the function runs. Regards, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Sat Jan 23 01:17:55 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 23 Jan 2016 16:17:55 +1000 Subject: [Python-ideas] PEP 484 change proposal: Allowing @overload outside stub files In-Reply-To: References: Message-ID: On 23 January 2016 at 15:43, Guido van Rossum wrote: > On Fri, Jan 22, 2016 at 8:04 PM, Nick Coghlan wrote: >> That is, would it be possible to use a notation like the following?: >> >> def utf8(value): >> # type: (None) -> None >> # type: (bytes) -> bytes >> # type: (unicode) -> bytes >> ... >> >> You're already going to have to allow this for single lines to handle >> Py2 compatible annotations, so it seems reasonable to also extend it >> to handle overloading while you're still figuring out a native syntax >> for that. > > > That's clever. I'm sure we could make it work if we wanted to. 
But it > doesn't map to anything else -- stub files do already have @overload, and > there's no way to translate this directly to Python 3 in-line > annotations. Right, my assumption is that it would eventually be translated to a full multi-dispatch solution for Python 3, whatever that spelling turns out to be - I'm just assuming that spelling *won't* involve annotating empty functions the way that stub files currently do, but rather annotating separate implementations for a multidispatch algorithm, or perhaps gaining a way to more neatly compose multiple sets of annotations.

> @overload
> def foo(a: Sequence[int]) -> int: ...
> @overload
> def foo(a: Sequence[str]) -> float: ...
> def foo(a):
>     return sum(float(x) for x in a)

While the disconnect with functools.singledispatch is one concern, another is the sheer visual weight of this approach. The real function has to go last to avoid getting clobbered, but the annotations for multiple dispatch end up using a lot of space beforehand. It gets worse if you need to combine it with Python 2 compatible type hinting comments since you can't squeeze the function definitions onto one line anymore:

@overload
def foo(a):
    # type: (Sequence[int]) -> int
    ...
@overload
def foo(a):
    # type: (Sequence[str]) -> float
    ...
def foo(a):
    return sum(float(x) for x in a)

If you were to instead go with a Python 2 compatible comment based inline solution for now, you'd then get to design the future official spelling for multi-dispatch annotations based on your experience with both that and with the decorator+annotations approach used in stub files. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From spencerb21 at live.com Sat Jan 23 03:14:58 2016 From: spencerb21 at live.com (Spencer Brown) Date: Sat, 23 Jan 2016 08:14:58 +0000 Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to support Python 2.7 In-Reply-To: References: <569F9959.3020202@gmail.com> <56A11FFA.7040408@gmail.com> , Message-ID: > On 23 Jan 2016, at 7:41 AM, Andrew Barnert via Python-ideas wrote: > > The runtime types are a little weird here as well. > > In 3.x, open returns different types depending on the value, rather than the type, of its inputs. Also, TextIOBase is a subclass of IOBase, even though it isn't a subtype in the LSP sense, so you have to test isinstance(IOBase) and not isinstance(TextIOBase) to know that read() is going to return bytes. That's all a little wonky, but not impossible to deal with. > > In 2.x, most file-like objects--including file itself, which open returns--don't satisfy either ABC, and most of them can return either type from read. > > Having a different function for open-binary instead of a mode flag would solve this, but it seems a little late to be adding that now. You'd have to go through all your 2.x code and change every open to one of the two new functions just to statically type your code, and then change it again for 3.x. Plus, you'd need to do the same thing not just for the builtin open, but for every library that provides an open-like method. > > Maybe this special case is special enough that static type checkers just have to deal with it specially? When the mode flag is a literal, process it; when it's forwarded from another function, it may be possible to get the type from there; otherwise, everything is just unicode|bytes and the type checker can't know any more unless you explicitly tell it (by annotating the variable the result of open is stored in).
Instead of special-casing open() specifically, adding a 'Literal' class would solve this issue (although only in a stub file):

@overload
def open(mode: Literal['rb', 'wb', 'ab']) -> BufferedIOBase: ...
@overload
def open(mode: Literal['rt', 'wt', 'at']) -> TextIOBase: ...

Literal[a,b,c] == Union[Literal[a], Literal[b], Literal[c]] for convenience purposes. To avoid repetition, func(arg: Literal='value') could be made equivalent to func(arg: Literal['value']='value'). Typecheckers should just treat this the same as the type of the value, but for cases where it knows the value (literals or aliases) check the value too. (Either by comparison for core types, or just by identity. That allows use of object() sentinel values or Enum members.) From greg.ewing at canterbury.ac.nz Sat Jan 23 06:01:13 2016 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 24 Jan 2016 00:01:13 +1300 Subject: [Python-ideas] Explicit variable capture list In-Reply-To: References: <20160120003712.GZ10854@ando.pearwood.info> <20160121001027.GB4619@ando.pearwood.info> Message-ID: <56A35D79.4080408@canterbury.ac.nz> Terry Reedy wrote: >> Finally, Terry suggested a completely different solution to the problem: >> don't change closures; change for loops. > > I remember that proposal, but it was someone other than me. If you're looking for the perpetrator of "for new i in ...", I confess it was me. -- Greg From greg.ewing at canterbury.ac.nz Sat Jan 23 06:11:29 2016 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 24 Jan 2016 00:11:29 +1300 Subject: [Python-ideas] Explicit variable capture list In-Reply-To: References: <20160120003712.GZ10854@ando.pearwood.info> <20160121001027.GB4619@ando.pearwood.info> Message-ID: <56A35FE1.4070702@canterbury.ac.nz> Nick Coghlan wrote: > As a result (and as Andrew already noted in another email), I'm > currently thinking of the behaviour of nonlocal and global variables > as "capture at call", That's not right either, because if a free variable gets reassigned between the time of the call and the time the variable is used within the function, the new value is seen. -- Greg From stephen at xemacs.org Sat Jan 23 06:53:38 2016 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Sat, 23 Jan 2016 20:53:38 +0900 Subject: [Python-ideas] Explicit variable capture list In-Reply-To: References: <20160120003712.GZ10854@ando.pearwood.info> <20160121001027.GB4619@ando.pearwood.info> Message-ID: <22179.27074.75981.357853@turnbull.sk.tsukuba.ac.jp> Andrew Barnert via Python-ideas writes: > powers = [lambda x: x**i for i in range(10)] > This gives you ten functions that all return x**9, which is > probably not what you wanted. > The reason this is a problem is that Python uses "late binding", > which in this context means that each of those functions is a > closure that captures the variable i in a way that looks up the > value of i at call time. All ten functions capture the same > variable, and when you later call them, that variable's value is > 9. But this explanation is going to confuse people who understand the concept of variable in Python to mean names that are bound and re-bound to objects. The comprehension's binding of i disappears before any element of powers can be called. So from their point of view, either that expression is an error, or powers[i] closes over a new binding of the name "i", specific to "the lambda's scope" (see below), to the current value of i in the comprehension. Of course the same phenomenon is observable with other scopes.
In particular global scope behaves this way, as importing this file

i = 0
def f(x):
    return x + i
i = 1

and calling f(0) will demonstrate. But changing the value of a global, used the way i is here, within a library module is a rather unusual thing to do; I doubt people will observe it. Also, once again the semantics of lambda (specifically, that unlike def it doesn't create a scope) seem to be a source of confusion more than anything else. Maybe it's possible to exhibit the same issue with def, but the def equivalent to the above lambda

>>> def make_increment(i):
...     def _(x):
...         return x + i
...     return _
...
>>> funcs = [make_increment(j) for j in range(3)]
>>> [f(0) for f in funcs]
[0, 1, 2]

closes over i in the expected way. (Of course in practicality, it's way more verbose, and in purity, it's not truly equivalent since there's at least one extra nesting of scope involved.) While

>>> def make_increment():
...     def _(x):
...         return x + i
...     return _
...
>>> funcs = [make_increment() for i in range(3)]
>>> [f(0) for f in funcs]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in <listcomp>
  File "<stdin>", line 3, in _
NameError: name 'i' is not defined
>>> i = 6
>>> [f(0) for f in funcs]
[6, 6, 6]

doesn't make closures at all, but rather retains the global binding. From skrah.temporarily at gmail.com Sat Jan 23 07:57:36 2016 From: skrah.temporarily at gmail.com (Stefan Krah) Date: Sat, 23 Jan 2016 12:57:36 +0000 (UTC) Subject: [Python-ideas] Explicit variable capture list References: <20160120003712.GZ10854@ando.pearwood.info> <20160121001027.GB4619@ando.pearwood.info> Message-ID: Nick Coghlan writes: > On 23 January 2016 at 14:50, Andrew Barnert via Python-ideas > wrote: > > What the thread is ultimately looking for is a solution to the "closures > > capturing loop variables" problem. This problem has been in the official > > programming FAQ[1] for decades, as "Why do lambdas defined in a loop with > > different values all return the same result"? > > > > powers = [lambda x: x**i for i in range(10)] > > > > This gives you ten functions that all return x**9, which is probably not > > what you wanted. > > > > The reason this is a problem is that Python uses "late binding", which in > > this context means that each of those functions is a closure that captures > > the variable i in a way that looks up the value of i at call time. All ten > > functions capture the same variable, and when you later call them, that > > variable's value is 9. I've never liked the use of "late binding" in this context. The behavior is totally standard for closures that use mutable values. Here's OCaml, using refs (mutable reference cells) instead of the regular immutable values (BTW, no one would write OCaml like in the following example, it's just for clarity):

let i = ref 0.0;;
# val i : float ref = {contents = 0.}
let rpow = ref [];;
# val rpow : '_a list ref = {contents = []}
while (!i < 10.0) do
    rpow := (fun x -> x**(!i)) :: !rpow;
    i := !i +. 1.0
done;;
- : unit = ()
let powers = List.rev !rpow;;
val powers : (float -> float) list = [<fun>; <fun>; <fun>; <fun>; <fun>; <fun>; <fun>; <fun>; <fun>; <fun>]
List.map (fun f -> f 10.0) powers;;
- : float list = [10000000000.; 10000000000.; 10000000000.; 10000000000.; 10000000000.; 10000000000.; 10000000000.; 10000000000.; 10000000000.; 10000000000.]
#

You see that "i" is a reference cell, i.e. it's compiled to a C struct and lookups are just a pointer dereference. Conceptually Python's dictionaries are really just the same as reference cells, except they hold more than one value.
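A rough Python rendering of the same session, using a one-entry dict as the reference cell (just a sketch to mirror the OCaml above):

cell = {'i': 0.0}
powers = []
while cell['i'] < 10.0:
    powers.append(lambda x: x ** cell['i'])
    cell['i'] += 1.0
print([f(10.0) for f in powers])
# ten copies of 10000000000.0 -- all the closures share the one cell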
So, to me the entire question is more one of immutable vs. mutable rather than late vs. early binding. Stefan Krah From guido at python.org Sat Jan 23 11:54:37 2016 From: guido at python.org (Guido van Rossum) Date: Sat, 23 Jan 2016 08:54:37 -0800 Subject: [Python-ideas] Explicit variable capture list In-Reply-To: <22179.27074.75981.357853@turnbull.sk.tsukuba.ac.jp> References: <20160120003712.GZ10854@ando.pearwood.info> <20160121001027.GB4619@ando.pearwood.info> <22179.27074.75981.357853@turnbull.sk.tsukuba.ac.jp> Message-ID: On Sat, Jan 23, 2016 at 3:53 AM, Stephen J. Turnbull wrote: > Andrew Barnert via Python-ideas writes: > > > powers = [lambda x: x**i for i in range(10)] > > > This gives you ten functions that all return x**9, which is > > probably not what you wanted. > > > The reason this is a problem is that Python uses "late binding", > > which in this context means that each of those functions is a > > closure that captures the variable i in a way that looks up the > > value of i at call time. All ten > > functions capture the same > > variable, and when you later call them, that variable's value is > > 9. > Actually it doesn't look up the value at call time, but each time it's used. This technicality matters if in between uses you call something that has write access to the same variable (typically using nonlocal) and modifies it. > But this explanation is going to confuse people who understand the > concept of variable in Python to mean names that are bound and > re-bound to objects. The comprehension's binding of i disappears > before any element of powers can be called. So from their point of > view, either that expression is an error, or powers[i] closes over a > new binding of the name "i", specific to "the lambda's scope" (see > below), to the current value of i in the comprehension. > But this seems to refer to a very specific definition of "binding" that doesn't have root in Python's semantic model. I suppose it may come from Lisp (which didn't influence Python quite as much as people think :-). So I think what you're saying here comes down to saying that it will confuse people who misunderstand Python's variables. Given that the misunderstanding you're supposing here is pretty specific (it's not just due to people who've never thought much about variables) I'm not sure I care much. > Of course the same phenomenon is observable with other scopes. In > particular global scope behaves this way, as importing this file
>
> i = 0
> def f(x):
>     return x + i
> i = 1
>
> and calling f(0) will demonstrate. But changing the value of a > global, used the way i is here, within a library module is a rather > unusual thing to do; I doubt people will observe it. > I disagree again: in interactive mode most of what you do is global and you will see this quite often. And all scopes in Python behave the same way. > Also, once again the semantics of lambda (specifically, that unlike > def it doesn't create a scope) Uh, what? I can sort of guess what you are referring to here (namely, that no syntactic construct permissible in a lambda can assign to a local variable -- or any variable, for that matter) but it certainly has a scope (to hold the arguments, which are just variables, as one quickly learns from experimenting with the arguments to a function defined using def). > seem to be a source of confusion more > than anything else. Maybe it's possible to exhibit the same issue > with def, but the def equivalent to the above lambda
>
> >>> def make_increment(i):
> ...     def _(x):
> ...         return x + i
> ...     return _
> ...
> >>> funcs = [make_increment(j) for j in range(3)]
> >>> [f(0) for f in funcs]
> [0, 1, 2]
>
> closes over i in the expected way. (Of course in practicality, it's > way more verbose, and in purity, it's not truly equivalent since > there's at least one extra nesting of scope involved.) It's such a strawman that I'm surprised you bring it up. Who would even *think* of using that idiom as equivalent to the simple lambda? If I were to deconstruct the original statement, I would start by replacing the list comprehension with a plain old for loop. That would also not be truly equivalent because the comprehension introduces a scope while the for loop doesn't, but the difference only matters if it stomps on another variable -- the semantics relative to the lambda are exactly the same. In particular, this example exhibits the same phenomenon without using a comprehension:

powers = []
for i in range(10):
    powers.append(lambda x: x**i)

This in turn can be rewritten without changing the semantics related to scopes using a def that's equivalent (truly equivalent except for its __name__ attribute!):

powers = []
for i in range(10):
    def f(x):
        return x**i
    powers.append(f)

(Note that the leakage of f here is irrelevant to the problem.) This has the same problem, without being distracted by lambda or comprehensions, and we can now explore its semantics through experimentation. We could even unroll the for loop and get the same issue:

powers = []
i = 0
def f(x):
    return x**i
powers.append(f)
i = 1
def f(x):
    return x**i
powers.append(f)
# Etc.

> While
>
> >>> def make_increment():
> ...     def _(x):
> ...         return x + i
> ...     return _
> ...
> >>> funcs = [make_increment() for i in range(3)]
> >>> [f(0) for f in funcs]
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "<stdin>", line 1, in <listcomp>
>   File "<stdin>", line 3, in _
> NameError: name 'i' is not defined
> >>> i = 6
> >>> [f(0) for f in funcs]
> [6, 6, 6]
>
> doesn't make closures at all, but rather retains the global binding.

Totally different idiom again -- another strawman. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Sat Jan 23 12:08:15 2016 From: guido at python.org (Guido van Rossum) Date: Sat, 23 Jan 2016 09:08:15 -0800 Subject: [Python-ideas] Explicit variable capture list In-Reply-To: References: <20160120003712.GZ10854@ando.pearwood.info> <20160121001027.GB4619@ando.pearwood.info> Message-ID: On Sat, Jan 23, 2016 at 4:57 AM, Stefan Krah wrote: > I've never liked the use of "late binding" in this context. The > behavior is totally standard for closures that use mutable values. > I wonder if the problem isn't that "binding" is a term imported from a different language philosophy, and the idea there is just fundamentally different from Python's philosophy about variables. In Python, a variable is *conceptually* just a key in a dict (and often, like for globals, builtins and instance or class variables, that really is how it's implemented). The variable name is the key, and there are implicit (and often dynamic) rules for deciding which dict to use. For local variables this is a bit of a lie, but the language goes out of its way to make it appear true (e.g. the existence of locals()). This concept is also valid for nonlocals (either the implicit PY2 kind, or the explicit PY3 kind introduced by a nonlocal statement).
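(To make the dict model concrete, a small sketch -- here the module's globals dict is the one doing the work:

x = 1
def f():
    return x        # the name x is looked up when f runs

print(f())              # 1
globals()['x'] = 2      # indistinguishable from a plain rebinding of x
print(f())              # 2

The function doesn't store a value for x; it stores which dict to consult when it needs one.)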
The implementation through "cells" is nearly unobservable (try getting a hold of a cell object through introspection without using ctypes!) and is just an optimization. Semantically (if we don't mind keeping other objects alive longer), nonlocals can be implemented by just holding on to the stack frame of the function call where they live, or, if locals hadn't been optimized, holding on to the dict containing that frame's locals would also work. So, I don't really want to introduce "for new x in ..." because it suddenly introduces a completely different concept into the language, and it would be really hard to explain what it does to someone who has correctly grasped Python's concept of variables as keys in a dict. What dict holds x in "for new x ..."? It would have to be considered a new dict created just to hold x, but other variables assigned in the body of the for loop would still be in the dict holding all the other locals of the function. Bah. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From jlehtosalo at gmail.com Sat Jan 23 13:13:08 2016 From: jlehtosalo at gmail.com (Jukka Lehtosalo) Date: Sat, 23 Jan 2016 18:13:08 +0000 Subject: [Python-ideas] PEP 484 change proposal: Allowing @overload outside stub files In-Reply-To: References: Message-ID: On Sat, Jan 23, 2016 at 6:17 AM, Nick Coghlan wrote: > On 23 January 2016 at 15:43, Guido van Rossum wrote: > > That's clever. I'm sure we could make it work if we wanted to. But it > > doesn't map to anything else -- stub files do already have @overload, > and > > there's no way to translate this directly to Python 3 in-line > annotations. > > Right, my assumption is that it would eventually be translated to a > full multi-dispatch solution for Python 3, whatever that spelling > turns out to be - I'm just assuming that spelling *won't* involve > annotating empty functions the way that stub files currently do, but > rather annotating separate implementations for a multidispatch > algorithm, or perhaps gaining a way to more neatly compose multiple > sets of annotations. > We don't have a proposal for multidispatch even though people have been hoping for it to happen for a long time. It's a much harder problem than providing multiple signatures for a function, and it's also arguably a different problem, even if there is some overlap. @overload might still be preferable to a multidispatch solution in a lot of cases even if both existed since @overload is conceptually pretty simple. However, this is all conjecture since we don't know what multidispatch would look like. I'd love to see a multidispatch proposal.

> > @overload
> > def foo(a: Sequence[int]) -> int: ...
> > @overload
> > def foo(a: Sequence[str]) -> float: ...
> > def foo(a):
> >     return sum(float(x) for x in a)
>
> While the disconnect with functools.singledispatch is one concern, > another is the sheer visual weight of this approach. The real function > has to go last to avoid getting clobbered, but the annotations for > multiple dispatch end up using a lot of space beforehand. > There is little evidence that @overload answers a *common* need, even if the need is important -- originally we left out @overload in .py files because we hadn't found a convincing use case. I don't consider the visual weight to be a major problem, as this would only be used rarely, at least based on our current understanding. But clearly the proposed syntax won't win any prettiness awards.
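For comparison, a short sketch of the runtime counterpart -- functools.singledispatch dispatches on the type of the first argument only:

from functools import singledispatch

@singledispatch
def describe(arg):
    return 'something else'

@describe.register(int)
def _(arg):
    return 'an int'

@describe.register(str)
def _(arg):
    return 'a string'

print(describe(3), describe('x'), describe(1.5))
# an int a string something else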
Singledispatch solves a different problem and makes different tradeoffs -- for example, it adds more runtime overhead, and it doesn't lend itself to specifying multiple return types for a single function body that depend on argument types. It also lives in a different module. I don't worry too much about the overlap. > It gets worse if you need to combine it with Python 2 compatible type > hinting comments since you can't squeeze the function definitions onto > one line anymore: > > @overload > def foo(a): > # type: (Sequence[int]) -> int > ... > @overload > def foo(a): > # type: (Sequence[str]) -> float > ... > def foo(a): > return sum(float(x) for x in a) > > If you were to instead go with a Python 2 compatible comment based > inline solution for now, you'd then get to design the future official > spelling for multi-dispatch annotations based on your experience with > both that and with the decorator+annotations approach used in stub > files. > Your proposed comment based solution looks nicer in Python 2 code than @overload. I'd prefer optimizing any syntax we choose for Python 3 as that's where the future is. I'd rather not be forced to use comment-based signatures in Python 3 only code. Jukka -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Sat Jan 23 14:18:42 2016 From: brett at python.org (Brett Cannon) Date: Sat, 23 Jan 2016 19:18:42 +0000 Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to support Python 2.7 In-Reply-To: <3D88B286-0519-42D9-893D-F4B8AD698041@yahoo.com> References: <569F9959.3020202@gmail.com> <56A11FFA.7040408@gmail.com> <3D88B286-0519-42D9-893D-F4B8AD698041@yahoo.com> Message-ID: On Fri, 22 Jan 2016 at 11:35 Andrew Barnert wrote: > On Jan 22, 2016, at 10:37, Brett Cannon wrote: > > > On Thu, 21 Jan 2016 at 10:45 Guido van Rossum wrote: > >> >> Yes, this is a useful thing to discuss. >> >> Maybe we can standardize on the types defined by the 'six' package, which >> is commonly used for 2-3 straddling code: >> >> six.text_type (unicode in PY2, str in PY3) >> six.binary_type (str in PY2, bytes in PY3) >> >> Actually for the latter we might as well use bytes. >> > > I agree that `bytes` should cover str/bytes in Python 2 and `bytes` in > Python 3. > > As for the textual type, I say either `text` or `unicode` since they are > both unambiguous between Python 2 and 3 and get the point across. > > > The only problem is that, while bytes is a builtin type in both 2.7 and > 3.x, with similar behaviour (especially in 3.5, where simple %-formatting > code works the same as in 2.7), unicode exists in 2.x but not 3.x, so that > would require people writing something like "try: unicode except: > unicode=str" at the top of every file (or monkeypatching builtins > somewhere) for the annotations to actually be valid 3.x code. > But why do they have to be valid code? This is for Python 2/3 code which means any typing information is going to be in a comment and so it isn't important that it be valid code as-is as long as the tools involved realize what `unicode` represents. IOW if mypy knows what the `unicode` type represents in PY3 mode then what does it matter if `unicode` is not a built-in type of Python 3? > And, if you're going to do that, using something that's already > wide-spread and as close to a de facto standard as possible, like the six > type suggested by Guido, seems less disruptive than inventing a new > standard (even if "text" or "unicode" is a little nicer than > "six.text_type"). 
> > (Or, of course, Guido could just get in his time machine and, along with > restoring the u string literal prefix in 3.3, also restore the builtin name > unicode as a synonym for str, and then this whole mail thread would fade > out like Marty McFly.) > I long thought about that option, but I don't think it buys us enough to bother to add the alias for `str` in Python 3. Considering all of the other built-in tweaks you typically end up making, I don't think this one change is worth it. -Brett -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Sat Jan 23 14:22:19 2016 From: brett at python.org (Brett Cannon) Date: Sat, 23 Jan 2016 19:22:19 +0000 Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to support Python 2.7 In-Reply-To: References: <569F9959.3020202@gmail.com> <56A11FFA.7040408@gmail.com> <3D88B286-0519-42D9-893D-F4B8AD698041@yahoo.com> Message-ID: On Sat, 23 Jan 2016 at 11:18 Brett Cannon wrote: > On Fri, 22 Jan 2016 at 11:35 Andrew Barnert wrote: > >> On Jan 22, 2016, at 10:37, Brett Cannon wrote: >> >> >> On Thu, 21 Jan 2016 at 10:45 Guido van Rossum wrote: >> >>> >>> Yes, this is a useful thing to discuss. >>> >>> Maybe we can standardize on the types defined by the 'six' package, >>> which is commonly used for 2-3 straddling code: >>> >>> six.text_type (unicode in PY2, str in PY3) >>> six.binary_type (str in PY2, bytes in PY3) >>> >>> Actually for the latter we might as well use bytes. >>> >> >> I agree that `bytes` should cover str/bytes in Python 2 and `bytes` in >> Python 3. >> >> As for the textual type, I say either `text` or `unicode` since they are >> both unambiguous between Python 2 and 3 and get the point across. >> >> >> The only problem is that, while bytes is a builtin type in both 2.7 and >> 3.x, with similar behaviour (especially in 3.5, where simple %-formatting >> code works the same as in 2.7), unicode exists in 2.x but not 3.x, so that >> would require people writing something like "try: unicode except: >> unicode=str" at the top of every file (or monkeypatching builtins >> somewhere) for the annotations to actually be valid 3.x code. >> > > But why do they have to be valid code? This is for Python 2/3 code which > means any typing information is going to be in a comment and so it isn't > important that it be valid code as-is as long as the tools involved realize > what `unicode` represents. IOW if mypy knows what the `unicode` type > represents in PY3 mode then what does it matter if `unicode` is not a > built-in type of Python 3? > I should also mention that Guido is suggesting typing.unicode come into existence, so there is no special import guard necessary. And since you will be importing `typing` anyway for type details then having typing.unicode in both Python 2 and Python 3 is a very minor overhead. -------------- next part -------------- An HTML attachment was scrubbed... URL: From skrah.temporarily at gmail.com Sat Jan 23 15:03:35 2016 From: skrah.temporarily at gmail.com (Stefan Krah) Date: Sat, 23 Jan 2016 20:03:35 +0000 (UTC) Subject: [Python-ideas] Explicit variable capture list References: <20160120003712.GZ10854@ando.pearwood.info> <20160121001027.GB4619@ando.pearwood.info> Message-ID: Guido van Rossum writes: >> I've never liked the use of "late binding" in this context. The >> behavior is totally standard for closures that use mutable values. 
> > I wonder if the problem isn't that "binding" is a term imported from a different language philosophy, and the idea there is just fundamentally different from Python's philosophy about variables. I think my point is that even if "late binding" is the best term for Python's symbol resolution scheme, it may not be optimal to use it as an explanation for this particular closure behavior, since all languages with mutable closures behave in the same manner (and most of them would be classified as "early binding" languages). Stefan Krah From skrah.temporarily at gmail.com Sat Jan 23 16:12:26 2016 From: skrah.temporarily at gmail.com (Stefan Krah) Date: Sat, 23 Jan 2016 21:12:26 +0000 (UTC) Subject: [Python-ideas] PEP 484 change proposal: Allowing overload outside stub files References: Message-ID: Nick Coghlan writes: > You're already going to have to allow this for single lines to handle > Py2 compatible annotations, so it seems reasonable to also extend it > to handle overloading while you're still figuring out a native syntax > for that. I find that https://pypi.python.org/pypi/multipledispatch looks quite nice:

>>> from multipledispatch import dispatch
>>> @dispatch(int, int)
... def add(x, y):
...     return x + y
...
>>> @dispatch(float, float)
... def add(x, y):
...     return x + y
...
>>> add(1, 2)
3
>>> add(1.0, 2.0)
3.0
>>> add(1.0, 2)
Traceback (most recent call last):
  File
[cut because gmane.org is inflexible]
  line 155, in __call__
    func = self._cache[types]
KeyError: (<class 'float'>, <class 'int'>)

Stefan Krah From greg.ewing at canterbury.ac.nz Sat Jan 23 16:16:11 2016 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 24 Jan 2016 10:16:11 +1300 Subject: [Python-ideas] Explicit variable capture list In-Reply-To: References: <20160120003712.GZ10854@ando.pearwood.info> <20160121001027.GB4619@ando.pearwood.info> Message-ID: <56A3ED9B.6030009@canterbury.ac.nz> Guido van Rossum wrote: > So, I don't really want to introduce "for new x in ..." because it > suddenly introduces a completely different concept into the language, > > What > dict holds x in "for new x ..."? It would have to be considered a new > dict created just to hold x, but other variables assigned in the body of > the for loop would still be in the dict holding all the other locals of > the function. We could say that the body of a "for new" loop is a nested scope in which all other referenced variables are implicitly declared "nonlocal". -- Greg From ncoghlan at gmail.com Sat Jan 23 21:18:18 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 24 Jan 2016 12:18:18 +1000 Subject: [Python-ideas] PEP 484 change proposal: Allowing @overload outside stub files In-Reply-To: References: Message-ID: On 24 January 2016 at 04:13, Jukka Lehtosalo wrote: > On Sat, Jan 23, 2016 at 6:17 AM, Nick Coghlan wrote: >> If you were to instead go with a Python 2 compatible comment based >> inline solution for now, you'd then get to design the future official >> spelling for multi-dispatch annotations based on your experience with >> both that and with the decorator+annotations approach used in stub >> files. > > Your proposed comment based solution looks nicer in Python 2 code than > @overload. I'd prefer optimizing any syntax we choose for Python 3 as that's > where the future is. I'd rather not be forced to use comment-based > signatures in Python 3 only code. For the benefit of folks reading this thread, but not the linked issue: Guido pointed out some cases with variable signatures (e.g.
annotating a range-style API) & keyword args where the stacked comments idea doesn't work, so I switched to being +0 on the "@overload in .py files" interim solution. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Sat Jan 23 21:45:05 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 24 Jan 2016 12:45:05 +1000 Subject: [Python-ideas] Multiple dispatch (was Re: PEP 484 change proposal: Allowing overload outside stub files Message-ID: On 24 January 2016 at 07:12, Stefan Krah wrote: > Nick Coghlan writes: >> You're already going to have to allow this for single lines to handle >> Py2 compatible annotations, so it seems reasonable to also extend it >> to handle overloading while you're still figuring out a native syntax >> for that. > > I find that https://pypi.python.org/pypi/multipledispatch looks quite > nice: > >>>> from multipledispatch import dispatch >>>> @dispatch(int, int) > ... def add(x, y): > ... return x + y > ... >>>> @dispatch(float, float) > ... def add(x, y): > ... return x + y > ... >>>> add(1, 2) > 3 >>>> add(1.0, 2.0) > 3.0 >>>> add(1.0, 2) > Traceback (most recent call last): > File > [cut because gmane.org is inflexible] > line 155, in __call__ > func = self._cache[types] > KeyError: (, ) Right, the Blaze folks have been doing some very nice work in that area. One of the projects building on multipledispatch is the Odo network of data conversion operations: https://github.com/blaze/odo They do make the somewhat controversial design decision to make dispatch operations process global by default [1], rather than scoping by module. On the other hand, the design also makes it easy to define your own dispatch namespace, so the default orthogonality with the module system likely isn't a problem in practice, and the lack of essential boilerplate does make it very easy to use in contexts like an IPython notebook. There is one aspect that still requires runtime stack introspection [2], and that's getting access to the class scope in order to implicitly make method dispatch specific to the class defining the methods. It's the kind of thing that makes me wonder whether we should be exposing a thread-local variable somewhere with a "class namespace stack" that made it possible to: - tell that you're currently running in the context of a class definition - readily get access to the namespace of the innermost class currently being defined Cheers, Nick. [1] http://multiple-dispatch.readthedocs.org/en/latest/design.html#namespaces-and-dispatch [2] https://github.com/mrocklin/multipledispatch/blob/master/multipledispatch/core.py -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Sat Jan 23 22:22:30 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 24 Jan 2016 13:22:30 +1000 Subject: [Python-ideas] Explicit variable capture list In-Reply-To: <56A3ED9B.6030009@canterbury.ac.nz> References: <20160120003712.GZ10854@ando.pearwood.info> <20160121001027.GB4619@ando.pearwood.info> <56A3ED9B.6030009@canterbury.ac.nz> Message-ID: On 24 January 2016 at 07:16, Greg Ewing wrote: > Guido van Rossum wrote: >> >> So, I don't really want to introduce "for new x in ..." because it >> suddenly introduces a completely different concept into the language, >> >> What dict hold x in "for new x ..."? It would have to be considered a new >> dict created just to hold x, but other variables assigned in the body of the >> for loop would still be in the dict holding all the other locals of the >> function. 
> > We could say that the body of a "for new" loop is a nested > scope in which all other referenced variables are implicitly > declared "nonlocal". This actually ties into an idea your suggestion prompted: it would likely suffice if we had a way to request "create a new scope per iteration" behaviour in for loops and comprehensions, with no implicit nonlocal behaviour at all. Consider Guido's spelled out list comprehension equivalent:

powers = []
for i in range(10):
    def f(x):
        return x**i
    powers.append(f)

There's no rebinding of values in the current scope there - only mutation of a list. Container comprehensions and generator expressions have the same characteristic - no name rebinding occurs in the loop body, so the default handling of rebinding of names other than the iteration variables doesn't matter. Accordingly, a statement like:

powers = []
for new i in range(10):
    def f(x):
        return x**i
    powers.append(f)

could be semantically equivalent to:

powers = []
for i in range(10):
    def _for_loop_suite(i=i):
        def f(x):
            return x**i
        powers.append(f)
    _for_loop_suite()
del _for_loop_suite

Capturing additional values on each iteration would be possible with a generator expression:

for new i, a, b, c in ((i, a, b, c) for i in range(10)):
    def f(x):
        return x**i, a, b, c

While nonlocal and global declarations would work the same way they do in any other nested function. For a practical example of this, consider the ThreadPoolExecutor example from the concurrent.futures docs: https://docs.python.org/3/library/concurrent.futures.html#threadpoolexecutor-example A scope-per-iteration construct makes it much easier to use a closure to define the operation submitted to the executor for each URL:

with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
    # Start the load operations and mark each future with its URL
    future_to_site = {}
    for new site_url in sites_to_load:
        def load_site():
            with urllib.request.urlopen(site_url, timeout=60) as conn:
                return conn.read()
        future_to_site[executor.submit(load_site)] = site_url
    # Report results as they become available
    for future in concurrent.futures.as_completed(future_to_site):
        site_url = future_to_site[future]
        try:
            data = future.result()
        except Exception as exc:
            print('%r generated an exception: %s' % (site_url, exc))
        else:
            print('%r page is %d bytes' % (site_url, len(data)))

If you try to write that code that way today (i.e. without the "new" on the first for loop), you'll end up with a race condition between the main thread changing the value of "site_url" and the executor issuing the URL open request. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From guido at python.org Sun Jan 24 00:16:57 2016 From: guido at python.org (Guido van Rossum) Date: Sat, 23 Jan 2016 21:16:57 -0800 Subject: [Python-ideas] Explicit variable capture list In-Reply-To: References: <20160120003712.GZ10854@ando.pearwood.info> <20160121001027.GB4619@ando.pearwood.info> <56A3ED9B.6030009@canterbury.ac.nz> Message-ID: On Sat, Jan 23, 2016 at 7:22 PM, Nick Coghlan wrote: > [...]
> For a practical example of this, consider the ThreadPoolExecutor > example from the concurrent.futures docs: > https://docs.python.org/3/library/concurrent.futures.html#threadpoolexecutor-example > > A scope-per-iteration construct makes it much easier to use a closure > to define the operation submitted to the executor for each URL: > > with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor: > # Start the load operations and mark each future with its URL > future_to_site = {} > for new site_url in sites_to_load: > def load_site(): > with urllib.request.urlopen(site_url, timeout=60) as conn: > return conn.read() > future_to_site[executor.submit(load_site)] = site_url > # Report results as they become available > for future in concurrent.futures.as_completed(future_to_site): > site_url = future_to_site[future] > try: > data = future.result() > except Exception as exc: > print('%r generated an exception: %s' % (site_url, exc)) > else: > print('%r page is %d bytes' % (site_url, len(data))) > > If you try to write that code that way today (i.e. without the "new" > on the first for loop), you'll end up with a race condition between > the main thread changing the value of "site_url" and the executor > issuing the URL open request. I wonder if kids today aren't too much in love with local function definitions. :-) There's a reason why executor.submit() takes a function *and arguments*. If you move the function out of the for loop and pass the url as a parameter to submit(), problem solved, and you waste fewer resources on function objects and cells to hold nonlocals. A generation ago most people would have naturally used such a solution (since most languages didn't support the alternative :-). -- --Guido van Rossum (python.org/~guido) From guido at python.org Sun Jan 24 00:37:17 2016 From: guido at python.org (Guido van Rossum) Date: Sat, 23 Jan 2016 21:37:17 -0800 Subject: [Python-ideas] Multiple dispatch (was Re: PEP 484 change proposal: Allowing overload outside stub files In-Reply-To: References: Message-ID: On Sat, Jan 23, 2016 at 6:45 PM, Nick Coghlan wrote: > [...] > There is one aspect that still requires runtime stack introspection > [2], and that's getting access to the class scope in order to > implicitly make method dispatch specific to the class defining the > methods. It's the kind of thing that makes me wonder whether we should > be exposing a thread-local variable somewhere with a "class namespace > stack" that made it possible to: > > - tell that you're currently running in the context of a class definition > - readily get access to the namespace of the innermost class currently > being defined I wonder if it wouldn't be acceptable to have a metaclass that takes care of the dispatch registry. You'd have a metaclass whose __prepare__ method produces a special kind of namespace that collaborates with a @dispatch() decorator. In this design, @dispatch() would not do the registration, it would just store its parameters on a function attribute and mark the function (or return some other object representing the dispatch parameters and the function). When the namespace receives a __setattr__() call with such an object, it registers it and if needed merges it with the object already there. Admittedly, calling inspect.currentframe() and assuming it never returns None is probably less code. (Hm, maybe sys._getframe() could be guaranteed to work inside a class scope?) 
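Spelled out, a minimal sketch of that design (hypothetical names, not a real library's API; note that in a class body the assignment actually reaches the mapping returned by __prepare__ as a __setitem__ call):

class _Tagged:
    # What @dispatch() returns: just the signature plus the function.
    def __init__(self, types, func):
        self.types = types
        self.func = func

def dispatch(*types):
    def mark(func):
        return _Tagged(types, func)
    return mark

class _DispatchNamespace(dict):
    # Registers tagged functions as the class body assigns them.
    def __setitem__(self, name, value):
        if isinstance(value, _Tagged):
            table = self.setdefault('_registry_' + name, {})
            table[value.types] = value.func
            def dispatcher(self_, *args):
                return table[tuple(type(a) for a in args)](self_, *args)
            dict.__setitem__(self, name, dispatcher)
        else:
            dict.__setitem__(self, name, value)

class DispatchMeta(type):
    @classmethod
    def __prepare__(mcls, name, bases):
        return _DispatchNamespace()
    def __new__(mcls, name, bases, ns):
        return super().__new__(mcls, name, bases, dict(ns))

class Adder(metaclass=DispatchMeta):
    @dispatch(int, int)
    def add(self, x, y):
        return x + y
    @dispatch(float, float)
    def add(self, x, y):
        return x + y

print(Adder().add(1, 2), Adder().add(1.0, 2.0))  # 3 3.0

No frame introspection needed, though whether this is simpler than the one call to inspect.currentframe() it replaces is debatable.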
> [1] http://multiple-dispatch.readthedocs.org/en/latest/design.html#namespaces-and-dispatch > [2] https://github.com/mrocklin/multipledispatch/blob/master/multipledispatch/core.py -- --Guido van Rossum (python.org/~guido) From stephen at xemacs.org Sun Jan 24 01:27:52 2016 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Sun, 24 Jan 2016 15:27:52 +0900 Subject: [Python-ideas] Explicit variable capture list In-Reply-To: References: <20160120003712.GZ10854@ando.pearwood.info> <20160121001027.GB4619@ando.pearwood.info> <22179.27074.75981.357853@turnbull.sk.tsukuba.ac.jp> Message-ID: <22180.28392.435688.885238@turnbull.sk.tsukuba.ac.jp> Guido, Thank you for taking the trouble to address my rather confused post. Guido van Rossum writes: > If I were to deconstruct the original statement, I would start by > replacing the list comprehension with a plain old for loop. I did that. But that actually doesn't bother me because the loop index's identifier doesn't go out of scope. I now see why that's a red herring, but maybe documentation can be improved. Anyway, I wrote that post before seeing your explanation that things just aren't that difficult, they all follow from "variable reference as dictionary lookup". The clue I needed was the way to view a scope as an object, and then realize that all free variable references are the same, except for visibility of the relevant scope to the other code at the call site. For me it's now a documentation issue (I know why the comprehension of lambdas work as they do, and I also know how to get the "expected", more useful result). I'll go take a look at the language reference, and tutorial, and see if I think they can be improved. From mertz at gnosis.cx Sun Jan 24 01:45:24 2016 From: mertz at gnosis.cx (David Mertz) Date: Sat, 23 Jan 2016 22:45:24 -0800 Subject: [Python-ideas] Explicit variable capture list In-Reply-To: References: <20160120003712.GZ10854@ando.pearwood.info> <20160121001027.GB4619@ando.pearwood.info> <22179.27074.75981.357853@turnbull.sk.tsukuba.ac.jp> Message-ID: On Sat, Jan 23, 2016 at 8:54 AM, Guido van Rossum wrote: > Also, once again the semantics of lambda (specifically, that unlike >> > def it doesn't create a scope) > > > Uh, what? I can sort of guess what you are referring to here (namely, that > no syntactic construct permissible in a lambda can assign to a local > variable -- or any variable, for that matter). > That's not even quite true, you can assign to global variables in a lambda: >>> myglobal = 1 >>> f = lambda: globals().__setitem__('myglobal', 2) or 42 >>> f() >>> myglobal 2 -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th. -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Sun Jan 24 01:53:56 2016 From: guido at python.org (Guido van Rossum) Date: Sat, 23 Jan 2016 22:53:56 -0800 Subject: [Python-ideas] Explicit variable capture list In-Reply-To: <22180.28392.435688.885238@turnbull.sk.tsukuba.ac.jp> References: <20160120003712.GZ10854@ando.pearwood.info> <20160121001027.GB4619@ando.pearwood.info> <22179.27074.75981.357853@turnbull.sk.tsukuba.ac.jp> <22180.28392.435688.885238@turnbull.sk.tsukuba.ac.jp> Message-ID: On Sat, Jan 23, 2016 at 10:27 PM, Stephen J. 
Turnbull wrote: > Guido, > > Thank you for taking the trouble to address my rather confused post. You're welcome. And thanks for taking it as constructive criticism. > Guido van Rossum writes: > > > If I were to deconstruct the original statement, I would start by > > replacing the list comprehension with a plain old for loop. > > I did that. But that actually doesn't bother me because the loop > index's identifier doesn't go out of scope. I now see why that's a > red herring, but maybe documentation can be improved. > > Anyway, I wrote that post before seeing your explanation that things > just aren't that difficult, they all follow from "variable reference > as dictionary lookup". The clue I needed was the way to view a scope > as an object, and then realize that all free variable references are > the same, except for visibility of the relevant scope to the other > code at the call site. > > For me it's now a documentation issue (I know why the comprehension of > lambdas work as they do, and I also know how to get the "expected", > more useful result). I'll go take a look at the language reference, > and tutorial, and see if I think they can be improved. I expect that the tutorial just needs some touch-up or an extra section on these issues. But the language reference... Well, it's a mess, it is often confusing and not all that exact. I should take a year off to rewrite it from scratch (what a book that would be!), but I don't have the kind of discipline to finish long writing projects. :-( -- --Guido van Rossum (python.org/~guido) From julien at palard.fr Sun Jan 24 05:35:59 2016 From: julien at palard.fr (Julien Palard) Date: Sun, 24 Jan 2016 11:35:59 +0100 Subject: [Python-ideas] Cross link documentation translations Message-ID: <56A4A90F.1030706@palard.fr> o/ While translating the Python Documentation into French [1][2], I discovered that we're not the only country doing it; there are also Japan [3][4] and Spain [5]. It's possible there are others, but I didn't find them (and that's the problem). There are only a few ways for users to find the translations (hearing about them, or explicitly searching for them on a search engine, which they won't do, since they obviously expect a link from the English version if a translation exists). So here is my idea: why not link the translations from the main documentation? I know that's not directly supported by Sphinx [6], but separate Sphinx builds blindly linking to each other (with hardcoded links) may work (like readthedocs is probably doing). The downside of those links is that we'll sometimes link to untranslated parts, but those parts may be marked as untranslated [7] to encourage new translators to help. Thoughts? [1] http://www.afpy.org/doc/python/3.5/ [2] https://github.com/afpy/python_doc_fr [3] http://docs.python.jp/3/ [4] https://github.com/python-doc-ja/python-doc-ja [5] http://docs.python.org.ar/tutorial/3/index.html [6] https://github.com/sphinx-doc/sphinx/issues/2252 [7] https://github.com/sphinx-doc/sphinx/issues/1246 -- Julien Palard From ncoghlan at gmail.com Sun Jan 24 07:54:53 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 24 Jan 2016 22:54:53 +1000 Subject: [Python-ideas] Explicit variable capture list In-Reply-To: References: <20160120003712.GZ10854@ando.pearwood.info> <20160121001027.GB4619@ando.pearwood.info> <56A3ED9B.6030009@canterbury.ac.nz> Message-ID: On 24 January 2016 at 15:16, Guido van Rossum wrote: > I wonder if kids today aren't too much in love with local function > definitions.
:-) There's a reason why executor.submit() takes a > function *and arguments*. If you move the function out of the for loop > and pass the url as a parameter to submit(), problem solved, and you > waste fewer resources on function objects and cells to hold nonlocals. Aye, that's how the current example code in the docs handles it - there's an up front definition of the page loading function, and then the submission to the executor is with a dict comprehension. The only thing "wrong" with it is that when reading the code, the potentially single-use function is introduced first without any context, and it's only later that you get to see what it's for. > A generation ago most people would have naturally used such a solution > (since most languages didn't support the alternative :-). In programming we would have, but I don't think the same is true when writing work instructions for other people to follow - for those, we're more likely to use nested bullets to describe subtasks, and only pull them out to a separate document or section if we need to reference the same subtask from multiple places. While my view is admittedly only based on intuition rather than hard data, it seems to me that when folks are reaching for nested functions, it's that "subtask as a nested bulleted list" idiom they're aiming to express, and Python is otherwise so accommodating of English structural idioms that it's jarring when it doesn't work properly. (I also suspect that's why it's a question we keep returning to - as a *programming language*, making closures play more nicely with iteration variables doesn't add any real power to Python, but as *executable pseudo-code*, it makes it a little bit easier to express certain ideas in the same way we'd describe them to another person). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From tritium-list at sdamon.com Sun Jan 24 09:51:57 2016 From: tritium-list at sdamon.com (Alexander Walters) Date: Sun, 24 Jan 2016 09:51:57 -0500 Subject: [Python-ideas] Cross link documentation translations In-Reply-To: <56A4A90F.1030706@palard.fr> References: <56A4A90F.1030706@palard.fr> Message-ID: <56A4E50D.2060201@sdamon.com> I am -1 for linking the official documentation to anything not on docs.python.org (or to things like the source listings link on the library docs, which is controlled by the same party as controls the documentation). It irks me every time I see a link to activestate recipes in the official docs. Activestate has already taken down documentation they have previously hosted, and the recipes will only exist as long as it is advantageous to them to continue hosting them*. The same can be said of translation projects that don't exist as special interest groups under the PSF. But I do like the idea of linking to translations. Would it not be a better solution to try and unify the translation efforts into one system as SIGs of the Doc-SIG? Besides, linking only to documentation you generate (as hosting the docs as a sig under the doc-sig would allow) would make the technical implementation much easier. Barring that, it's a better-than-nothing idea. * not to mention the questionable quality of the recipes. On 1/24/2016 05:35, Julien Palard wrote: > o/ > > While translating the Python Documentation in French [1][2], I > discovered that we're not the only country doing it, there is also > Japan [3][4], and Spain [5]. It's possible there's other but I didn't > find them (and it's the problem).
> > But there's only a few way for users to find the translations (hearing > about them, or explicitly searching for them on a search engine, which > they won't do, obviously expecting a link from the english version if > they exists). > > So here is my idea: Why not linking translations from the main > documentation? > > I know that's not directly supported by Sphinx doc [6], but separate > sphinx build, blindly (with hardcoded links) linking themselves, may > work (like readthedoc is probably doing). The downside of those links > is that we'll sometime link to untranslated parts, but those parts may > be marked as untranslated [7] to encourage new translators to help. > > Thoughts? > > [1] http://www.afpy.org/doc/python/3.5/ > [2] https://github.com/afpy/python_doc_fr > [3] http://docs.python.jp/3/ > [4] https://github.com/python-doc-ja/python-doc-ja > [5] http://docs.python.org.ar/tutorial/3/index.html > [6] https://github.com/sphinx-doc/sphinx/issues/2252 > [7] https://github.com/sphinx-doc/sphinx/issues/1246 > From wes.turner at gmail.com Sun Jan 24 14:04:12 2016 From: wes.turner at gmail.com (Wes Turner) Date: Sun, 24 Jan 2016 13:04:12 -0600 Subject: [Python-ideas] Cross link documentation translations In-Reply-To: <56A4A90F.1030706@palard.fr> References: <56A4A90F.1030706@palard.fr> Message-ID: ReadTheDocs supports hosting projects with multiple translations: | Docs: http://docs.readthedocs.org/en/latest/localization.html - [ ] There could be a dedicated Python Infrastructure ReadtheDocs Docker instance. https://github.com/rtfd/readthedocs-docker-images ReadTheDocs CPython Docs /en/latest/ * | Docs: http://cpython.readthedocs.org/en/latest/ * | Project: http://readthedocs.org/projects/cpython/ * [ ] All past revisions * [ ] All translations On Jan 24, 2016 4:41 AM, "Julien Palard" wrote: > > o/ > > While translating the Python Documentation in French [1][2], I discovered that we're not the only country doing it, there is also Japan [3][4], and Spain [5]. It's possible there's other but I didn't find them (and it's the problem). > > But there's only a few way for users to find the translations (hearing about them, or explicitly searching for them on a search engine, which they won't do, obviously expecting a link from the english version if they exists). > > So here is my idea: Why not linking translations from the main documentation? > > I know that's not directly supported by Sphinx doc [6], but separate sphinx build, blindly (with hardcoded links) linking themselves, may work (like readthedoc is probably doing). The downside of those links is that we'll sometime link to untranslated parts, but those parts may be marked as untranslated [7] to encourage new translators to help. > > Thoughts? > > [1] http://www.afpy.org/doc/python/3.5/ > [2] https://github.com/afpy/python_doc_fr > [3] http://docs.python.jp/3/ > [4] https://github.com/python-doc-ja/python-doc-ja > [5] http://docs.python.org.ar/tutorial/3/index.html > [6] https://github.com/sphinx-doc/sphinx/issues/2252 > [7] https://github.com/sphinx-doc/sphinx/issues/1246 > > -- > Julien Palard > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From brett at python.org Sun Jan 24 14:57:30 2016 From: brett at python.org (Brett Cannon) Date: Sun, 24 Jan 2016 19:57:30 +0000 Subject: [Python-ideas] Explicit variable capture list In-Reply-To: References: <20160120003712.GZ10854@ando.pearwood.info> <20160121001027.GB4619@ando.pearwood.info> <56A3ED9B.6030009@canterbury.ac.nz> Message-ID: On Sun, Jan 24, 2016, 04:55 Nick Coghlan wrote: > On 24 January 2016 at 15:16, Guido van Rossum wrote: > > I wonder if kids today aren't too much in love with local function > > definitions. :-) There's a reason why executor.submit() takes a > > function *and arguments*. If you move the function out of the for loop > > and pass the url as a parameter to submit(), problem solved, and you > > waste fewer resources on function objects and cells to hold nonlocals. > > Aye, that's how the current example code in the docs handles it - > there's an up front definition of the page loading function, and then > the submission to the executor is with a dict comprehension. > > The only thing "wrong" with it is that when reading the code, the > potentially single-use function is introduced first without any > context, and it's only later that you get to see what it's for. > So the docs just need an added comment to help explain it. Want to file an issue for that? > > A generation ago most people would have naturally used such a solution > > (since most languages didn't support the alternative :-). > > In programming we would have, but I don't think the same is true when > writing work instructions for other people to follow - for those, > we're more likely to use nested bullets to describe subtasks, and only > pull them out to a separate document or section if we need to > reference the same subtask from multiple places. > > While my view is admittedly only based on intuition rather than hard > data, it seems to me that when folks are reaching for nested > functions, it's that "subtask as a nested bulleted list" idiom they're > aiming to express, and Python is otherwise so accommodating of English > structural idioms that it's jarring when it doesn't work properly. (I > also suspect that's why it's a question we keep returning to - as a > *programming language*, making closures play more nicely with > iteration variables doesn't add any real power to Python, but as > *executable pseudo-code*, it makes it a little bit easier to express > certain ideas in the same way we'd describe them to another person). > I personally like the semantics we currently have. I get why people bring this up, but I'm voting for the programming language side over the pseudo-code angle. -Brett > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Sun Jan 24 15:17:27 2016 From: brett at python.org (Brett Cannon) Date: Sun, 24 Jan 2016 20:17:27 +0000 Subject: [Python-ideas] Explicit variable capture list In-Reply-To: References: <20160120003712.GZ10854@ando.pearwood.info> <20160121001027.GB4619@ando.pearwood.info> <22179.27074.75981.357853@turnbull.sk.tsukuba.ac.jp> <22180.28392.435688.885238@turnbull.sk.tsukuba.ac.jp> Message-ID: On Sat, Jan 23, 2016, 22:55 Guido van Rossum wrote: > On Sat, Jan 23, 2016 at 10:27 PM, Stephen J.
> wrote:
> > Guido,
> >
> > Thank you for taking the trouble to address my rather confused post.
>
> You're welcome. And thanks for taking it as constructive criticism.
>
> > Guido van Rossum writes:
> >
> > > If I were to deconstruct the original statement, I would start by
> > > replacing the list comprehension with a plain old for loop.
> >
> > I did that. But that actually doesn't bother me because the loop
> > index's identifier doesn't go out of scope. I now see why that's a
> > red herring, but maybe documentation can be improved.
> >
> > Anyway, I wrote that post before seeing your explanation that things
> > just aren't that difficult, they all follow from "variable reference
> > as dictionary lookup". The clue I needed was the way to view a scope
> > as an object, and then realize that all free variable references are
> > the same, except for visibility of the relevant scope to the other
> > code at the call site.
> >
> > For me it's now a documentation issue (I know why the comprehension
> > of lambdas works as it does, and I also know how to get the
> > "expected", more useful result). I'll go take a look at the language
> > reference, and tutorial, and see if I think they can be improved.
>
> I expect that the tutorial just needs some touch-up or an extra
> section on these issues. But the language reference... Well, it's a
> mess, it is often confusing and not all that exact. I should take a
> year off to rewrite it from scratch (what a book that would be!), but
> I don't have the kind of discipline to finish long writing projects.
> :-(
>

Would doing something like the Ruby community, where we write the spec
in a BDD style so it's more a set of tests than verbiage, be easier?

-Brett

> --
> --Guido van Rossum (python.org/~guido)
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From greg.ewing at canterbury.ac.nz  Sun Jan 24 15:42:52 2016
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 25 Jan 2016 09:42:52 +1300
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To:
References: <20160120003712.GZ10854@ando.pearwood.info>
 <20160121001027.GB4619@ando.pearwood.info>
 <56A3ED9B.6030009@canterbury.ac.nz>
Message-ID: <56A5374C.8090103@canterbury.ac.nz>

Nick Coghlan wrote:
> Capturing additional values on each iteration would be possible with a
> generator expression:
>
>     for new i, a, b, c in ((i, a, b, c) for i in range(10)):
>         def f(x):
>             return x**i, a, b, c

I'm not sure I see the point of this. If you're needing to
capture a, b and c from an outer scope, presumably it's
because there's some outer loop that's changing them --
in which case you can just make *that* loop a "new" loop
as well.

BTW, should there be a "while new" loop too?

--
Greg

From cs at zip.com.au  Sun Jan 24 16:21:04 2016
From: cs at zip.com.au (Cameron Simpson)
Date: Mon, 25 Jan 2016 08:21:04 +1100
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <20160121005224.GC4619@ando.pearwood.info>
References: <20160121005224.GC4619@ando.pearwood.info>
Message-ID: <20160124212104.GA24825@cskk.homeip.net>

On 21Jan2016 11:52, Steven D'Aprano wrote:
>So a full function declaration looks like:
>
>    def NAME ( PARAMETERS ) ( CAPTURES ) -> RETURN-HINT :
>
>(Bike-shedders: do you prefer () [] or {} for the list of captures?)
Just to this: I prefer () - this is very much like a special parameter
list. [] and {} say list and dict to me.

Cheers,
Cameron Simpson

From python at lucidity.plus.com  Sun Jan 24 16:22:22 2016
From: python at lucidity.plus.com (Erik)
Date: Sun, 24 Jan 2016 21:22:22 +0000
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <56A5374C.8090103@canterbury.ac.nz>
References: <20160120003712.GZ10854@ando.pearwood.info>
 <20160121001027.GB4619@ando.pearwood.info>
 <56A3ED9B.6030009@canterbury.ac.nz>
 <56A5374C.8090103@canterbury.ac.nz>
Message-ID: <56A5408E.7020302@lucidity.plus.com>

On 24/01/16 20:42, Greg Ewing wrote:
> BTW, should there be a "while new" loop too?

And a "with foo() as new i:" ... and what about "func(new bar)"?

Removing tongue from cheek now ;)

E.

From julien at palard.fr  Sun Jan 24 17:16:49 2016
From: julien at palard.fr (Julien Palard)
Date: Sun, 24 Jan 2016 23:16:49 +0100
Subject: [Python-ideas] Cross link documentation translations
In-Reply-To: <56A4E50D.2060201@sdamon.com>
References: <56A4A90F.1030706@palard.fr> <56A4E50D.2060201@sdamon.com>
Message-ID: <56A54D51.3020407@palard.fr>

On 01/24/2016 03:51 PM, Alexander Walters wrote:
> I am -1 for linking the official documentation to anything not on
> docs.python.org

My principal goal is not to cross-link outside of docs.python.org, but
to cross-link efforts, and provide users a way to find the
translations. Being hosted on docs.python.org is probably the neatest
way to do it, so I can only agree.

> But I do like the idea of linking to translations.  Would it not be a
> better solution to try and unify the translation efforts into one
> system as SIGs of the Doc-SIG?

I'm not well aware of SIGs and their inner workings (the Doc-SIG mail
archive has not been updated since 2013), and cross-language
unification is probably not a good idea, at first (each team has its
own organization / leaders / etc.), but why not, I'm open to any ideas.

> Besides, linking only to documentation you generate (as hosting the
> docs as a sig under the doc-sig would allow) would make the technical
> implementation much easier.

Actually we generate the whole documentation, even untranslated parts
(actually, Sphinx does it). Also, ignoring the internal workings of
special interest groups, I completely miss the "hosting the docs as a
sig under the doc-sig" part: do SIGs have hosting?

--
Julien Palard

From stephen at xemacs.org  Sun Jan 24 21:08:32 2016
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Mon, 25 Jan 2016 11:08:32 +0900
Subject: [Python-ideas] Cross link documentation translations
In-Reply-To: <56A54D51.3020407@palard.fr>
References: <56A4A90F.1030706@palard.fr> <56A4E50D.2060201@sdamon.com>
 <56A54D51.3020407@palard.fr>
Message-ID: <22181.33696.693535.529701@turnbull.sk.tsukuba.ac.jp>

Julien Palard writes:

 > I'm not well aware of SIGs and their inner workings (the Doc-SIG
 > mail archive has not been updated since 2013), and cross-language
 > unification is probably not a good idea, at first (each team has
 > its own organization / leaders / etc.), but why not, I'm open to
 > any ideas.

As you're probably aware, Debian[1] has #(languages) + 2 teams
translating the Debian-specific parts of packages.  One of the special
teams works on internationalizing the packaging software (mostly but
not entirely done, even after more than a decade), and another
provides infrastructure for accepting and distributing translations.
As with Debian, I don't think there will be unification of
organizations, and it certainly isn't needed.
On the other hand, *somebody* will need to construct the web page and repository structure and linkage, and there will be an on-going need for integrating new versions. The debian-i18n mailing list is also useful for propagating best practices. Quality of translation is an issue that Debian doesn't much have to deal with (because the Debian teams are working on the same task -- installation and configuration -- they quickly develop idioms for repetitive queries), but for manuals the issue is important. IMHO, it would be ideal if the integrators included a team of editor/reviewers independent of the various language teams. At least for Japanese, translations (both of generic English and specifically English software manuals) are often mechanical and not very educational, and occasionally actively misleading. Of course this is pie-in-the-sky; surely people with such skills and the interest are likely to be participating in the teams. But I think it's a good idea to keep quality of translation in mind, and possibly impose some sort of formal review process or two-approvals requirement. Footnotes: [1] The example I'm familiar with, I suppose many other projects have similar setups. From ncoghlan at gmail.com Sun Jan 24 21:31:38 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 25 Jan 2016 12:31:38 +1000 Subject: [Python-ideas] Explicit variable capture list In-Reply-To: References: <20160120003712.GZ10854@ando.pearwood.info> <20160121001027.GB4619@ando.pearwood.info> <56A3ED9B.6030009@canterbury.ac.nz> Message-ID: On 25 January 2016 at 05:57, Brett Cannon wrote: > > On Sun, Jan 24, 2016, 04:55 Nick Coghlan wrote: >> >> On 24 January 2016 at 15:16, Guido van Rossum wrote: >> > I wonder if kids today aren't too much in love with local function >> > definitions. :-) There's a reason why executor.submit() takes a >> > function *and arguments*. If you move the function out of the for loop >> > and pass the url as a parameter to submit(), problem solved, and you >> > waste fewer resources on function objects and cells to hold nonlocals. >> >> Aye, that's how the current example code in the docs handles it - >> there's an up front definition of the page loading function, and then >> the submission to the executor is with a dict comprehension. >> >> The only thing "wrong" with it is that when reading the code, the >> potentially single-use function is introduced first without any >> context, and it's only later that you get to see what it's for. > > So the doics just need an added comment to help explain it. Want to file an > issue for that? There's nothing to comment on given the Python semantics we have today - what's there is a sensible way to write that code, and the design FAQ covers why the inline closure approach wouldn't work. As noted, I suspect the only reason the topic keeps coming up is the niggling sense that the closure based approach "should" work, and the fact that it doesn't is a case where underlying technical details that we generally aim to let people gloss over make themselves apparent. Cheers, Nick. 
--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From guido at python.org  Sun Jan 24 23:40:26 2016
From: guido at python.org (Guido van Rossum)
Date: Sun, 24 Jan 2016 20:40:26 -0800
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To:
References: <20160120003712.GZ10854@ando.pearwood.info>
 <20160121001027.GB4619@ando.pearwood.info>
 <22179.27074.75981.357853@turnbull.sk.tsukuba.ac.jp>
 <22180.28392.435688.885238@turnbull.sk.tsukuba.ac.jp>
Message-ID:

On Sun, Jan 24, 2016 at 12:17 PM, Brett Cannon wrote:
> Would doing something like the Ruby community, where we write the spec
> in a BDD style so it's more a set of tests than verbiage, be easier?

I haven't seen that, but if it's anything like the typical way of
writing unit tests in Ruby, please no.

--
--Guido van Rossum (python.org/~guido)

From tjreedy at udel.edu  Mon Jan 25 01:32:17 2016
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 25 Jan 2016 01:32:17 -0500
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To:
References: <20160120003712.GZ10854@ando.pearwood.info>
 <20160121001027.GB4619@ando.pearwood.info>
 <56A3ED9B.6030009@canterbury.ac.nz>
Message-ID:

On 1/24/2016 7:54 AM, Nick Coghlan wrote:
> On 24 January 2016 at 15:16, Guido van Rossum wrote:
>> I wonder if kids today aren't too much in love with local function
>> definitions. :-) There's a reason why executor.submit() takes a
>> function *and arguments*. If you move the function out of the for loop

What I've concluded from this thread is that function definitions (with
direct use of 'def' or 'lambda') do not fit well within loops, though I
used them there myself.

When delayed function calls are needed, what belongs within loops is
packaging of a pre-defined function with one or more arguments within a
callable.  Instance.method is an elegant syntax for doing so.
functools.partial(func, args, ...) is a much clumsier generalized
expression, which requires an import.  Note that 'partial' returns a
function for delayed execution even when a complete, not partial, set
of arguments is passed.

A major attempted (and tempting) use for definitions within a loop is
multiple callbacks for multiple gui widgets, where delayed execution is
needed.  The three answers to the many 'why doesn't this work' questions
on both python-list and StackOverflow are multiple definitions with
variant 'default args', a custom make_function function outside the
loop called multiple times within the loop, and a direct function
outside the loop called with partial within the loop.  I am going to
start using partial more.

Making partial a builtin would make it easier to use and more
attractive.  Even more attractive would be syntax that abbreviates
delayed calls with pre-bound arguments in the way that inst.meth
abbreviates a much more complicated expression roughly equivalent to
"bind(inst.__getattr__('meth'), inst)".

A possibility would be to make {} a delayed and possibly partial call
operator, in parallel to the current use of () as an immediate and
total call operator.
    expr{arguments}
would evaluate to a function, whether of function type or a special
class similar to bound methods.  The 'arguments' would be anything
allowed within partial, which I believe is anything allowed in any
function call.  I chose {} because expr{...} is currently illegal, just
as expr(arguments) is for anything other than a function call.  On the
other hand, expr[...] is currently legal, at least up to '[', as is
expr<...> at least up to '<'.
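To make the intended equivalence concrete, here is a rough sketch using
functools.partial as today's spelling (the function and argument names
are only illustrative):

    from functools import partial

    def get_page(url, timeout):
        "Return the page, as a string, retrieved from the url."
        ...

    some_url = 'http://example.com/'

    # proposed: f = get_page{some_url, timeout=10}
    # today's equivalent:
    f = partial(get_page, some_url, timeout=10)

    f()  # the delayed call happens here, with the pre-bound arguments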
>> and pass the url as a parameter to submit(), problem solved, and you >> waste fewer resources on function objects and cells to hold nonlocals. executor.submit appears to me to be a specialized version of partial, with all arguments required. With the proposal above, I think submit(func{all args}) would work. > Aye, that's how the current example code in the docs handles it - > there's an up front definition of the page loading function, and then > the submission to the executor is with a dict comprehension. I presume you both are referring to ThreadPoolExecutor Example. The load_url function, which I think should be 'get_page' has a comment that is wrong (it does not 'report the url') and no docstring. My suggestion: # Define an example function for the executor.submit call below. def get_page(url, timeout): "Return the page, as a string, retrieved from the url." with ... > The only thing "wrong" with it is that when reading the code, the > potentially single-use function is introduced first without any > context, and it's only later that you get to see what it's for. A proper comment would fix this I think. That aside, if the main code were packaged within def main, as in the following ProcessPoolExecutor Example, so as to delay the lookup of 'load_url' or 'get_page', then the two functions definitions could be in *either* order. The general convention in Pythonland seems to be to put main last (bottom up, define everything before use), but in a recent python-list thread, at least one person, and I think two, said they like to start with def main (top down style, which you seem to like). I just checked and PEP8 seems to be silent on the placement of 'def main'. So unless Guido says otherwise, I would not mind if you revised one of the examples to start with def main, just to show that that is a legitimate alternative. It is a feature of Python that one can do this without having to add, before the first appearance of a function name within a function, a dummy 'forward declaration' giving the function signature. >> A generation ago most people would have naturally used such a solution >> (since most languages didn't support the alternative :-). > > In programming we would have, but I don't think the same is true when > writing work instructions for other people to follow - for those, > we're more likely to use nested bullets to describe subtasks, and only > pull them out to a separate document or section if we need to > reference the same subtask from multiple places. People can and do jump around while reading code for understanding. They can do this without markers as explicit as needed for machines. Current compilers and interpreters initially read code linearly, with only one character or token lookahead. For Python, a def header is needed for forward reference, to delay name resolution to call time, after the whole file has been read. > While my view is admittedly only based on intuition rather than hard > data, it seems to me that when folks are reaching for nested > functions, it's that "subtask as a nested bulleted list" idiom they're > aiming to express, and Python is otherwise so accommodating of English > structural idioms that it's jarring when it doesn't work properly. 
(I > also suspect that's why it's a question we keep returning to - as a > *programming language*, making closures play more nicely with > iteration variables doesn't add any real power to Python, but as > *executable pseudo-code*, it makes it a little bit easier to express > certain ideas in the same way we'd describe them to another person). I thought about some explicit examples and it is not necessarily clear how to translate bullet points to code. But in general, I do not believe that instructions to another person are meant to induce in the mind of a listener multiple functions that only differ in a default argumnet object. In other words, I do not see for i in it: def f(i=i): pass as corresponding to natural language. Hence my initial statement above. -- Terry Jan Reedy From marcel at marceloneil.com Mon Jan 25 10:11:56 2016 From: marcel at marceloneil.com (Marcel O'Neil) Date: Mon, 25 Jan 2016 10:11:56 -0500 Subject: [Python-ideas] intput() Message-ID: def intput(): return int(input()) Life would be just marginally easier, with a punny function name as a bonus. -------------- next part -------------- An HTML attachment was scrubbed... URL: From rymg19 at gmail.com Mon Jan 25 10:38:39 2016 From: rymg19 at gmail.com (Ryan Gonzalez) Date: Mon, 25 Jan 2016 09:38:39 -0600 Subject: [Python-ideas] intput() In-Reply-To: References: Message-ID: <6DF8E94C-A6DB-4EDC-9701-CB4DBE6B5E48@gmail.com> Me: *sees intput* Huh, there's a typo here. Let me just change it back to input! *program explodes* Seriously, it's too easy to mistype to me. On January 25, 2016 9:11:56 AM CST, Marcel O'Neil wrote: >def intput(): > return int(input()) > >Life would be just marginally easier, with a punny function name as a >bonus. > > >------------------------------------------------------------------------ > >_______________________________________________ >Python-ideas mailing list >Python-ideas at python.org >https://mail.python.org/mailman/listinfo/python-ideas >Code of Conduct: http://python.org/psf/codeofconduct/ -- Sent from my Nexus 5 with K-9 Mail. Please excuse my brevity. -------------- next part -------------- An HTML attachment was scrubbed... URL: From geoffspear at gmail.com Mon Jan 25 10:40:44 2016 From: geoffspear at gmail.com (Geoffrey Spear) Date: Mon, 25 Jan 2016 10:40:44 -0500 Subject: [Python-ideas] intput() In-Reply-To: References: Message-ID: On Mon, Jan 25, 2016 at 10:11 AM, Marcel O'Neil wrote: > def intput(): > return int(input()) > > Life would be just marginally easier, with a punny function name as a > bonus. > > Cute, and easy enough to do in your own code. Way too much of a trivial special case to add to the core language, though, in my opinion. -------------- next part -------------- An HTML attachment was scrubbed... URL: From rob.cliffe at btinternet.com Mon Jan 25 10:47:34 2016 From: rob.cliffe at btinternet.com (Rob Cliffe) Date: Mon, 25 Jan 2016 15:47:34 +0000 Subject: [Python-ideas] intput() In-Reply-To: References: Message-ID: <56A64396.7070309@btinternet.com> On 25/01/2016 15:40, Geoffrey Spear wrote: > > > On Mon, Jan 25, 2016 at 10:11 AM, Marcel O'Neil > > wrote: > > def intput(): > return int(input()) > > Life would be just marginally easier, with a punny function name > as a bonus. > > > Cute, and easy enough to do in your own code. Way too much of a > trivial special case to add to the core language, though, in my opinion. +1. In real life you would probably want validation and allow the user retries, and a prompt. 
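Something along these lines, say (just a sketch; the prompt text and
retry policy are arbitrary choices of mine):

    def intput(prompt='Enter an integer: '):
        "Ask repeatedly until the user types a valid integer."
        while True:
            try:
                return int(input(prompt))
            except ValueError:
                print('Not an integer, please try again.')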
>
>
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ian.g.kelly at gmail.com  Mon Jan 25 10:56:55 2016
From: ian.g.kelly at gmail.com (Ian Kelly)
Date: Mon, 25 Jan 2016 08:56:55 -0700
Subject: [Python-ideas] Documenting asyncio methods as returning awaitables
Message-ID:

The official asyncio documentation includes this note:

"""
Note: In this documentation, some methods are documented as
coroutines, even if they are plain Python functions returning a
Future. This is intentional to have a freedom of tweaking the
implementation of these functions in the future. If such a function is
needed to be used in a callback-style code, wrap its result with
ensure_future().
"""

Despite the note, this still causes confusion. See for example
https://mail.python.org/pipermail/python-list/2016-January/702342.html

As of Python 3.5, "awaitable" is a thing, and as of Python 3.5.1,
ensure_future is supposed to accept any awaitable. Would it be better
then to document these methods as returning awaitables rather than as
coroutines?

From guido at python.org  Mon Jan 25 11:52:49 2016
From: guido at python.org (Guido van Rossum)
Date: Mon, 25 Jan 2016 08:52:49 -0800
Subject: [Python-ideas] Documenting asyncio methods as returning awaitables
In-Reply-To:
References:
Message-ID:

I agree there's been considerable confusion. For example, quoting from
the conversation you linked, "While coroutines are the focus of the
library, they're based on futures". That's actually incorrect.

Until PEP 492 (async/await), there were two separate concepts: Future
and coroutine. A Future is an object with certain methods (e.g.
result(), cancel(), add_done_callback()). A coroutine is a generator
object -- it sports no such methods (though it has some of its own,
e.g. send() and throw()). But not every generator object is a
coroutine -- a coroutine is expected to have certain "behavior" that
makes it interact correctly with a scheduler. (Details of this behavior
don't matter for this explanation, but it involves yielding zero or
more Futures. Also, the @asyncio.coroutine decorator must be used to
mark the generator as supporting that "behavior".) Coroutines are more
efficient, because when a coroutine calls and waits for another
coroutine (using yield from, or in 3.5 also await) no trip to the
scheduler is required -- it's all taken care of by the Python
interpreter.

Now, the confusion typically occurs because when you use yield from, it
accepts either a coroutine or a Future. And in most cases you're not
really aware (and you don't care) whether a particular thing you're
waiting on is a coroutine or a Future -- you just want to wait for it,
letting the event loop do other things, until it has a result for you,
and either type supports that.

However sometimes you *do* care about the type -- and that's typically
because you want a Future, so you can call some of its methods like
cancel() or add_done_callback(). The correct way to do this, when
you're not sure whether something is a Future or a coroutine, is to
call ensure_future(). If what you've got is already a Future it will
just return that unchanged; if you've got a coroutine it wraps it in a
Task. Many asyncio operations take either a Future or a coroutine --
they all just call ensure_future() on that argument.
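For example (a minimal sketch; some_coro here is just a stand-in for
whatever you were handed):

    import asyncio

    @asyncio.coroutine
    def some_coro():
        yield from asyncio.sleep(0.1)
        return 42

    loop = asyncio.get_event_loop()
    fut = asyncio.ensure_future(some_coro())  # coroutine -> Task (a Future)
    fut.add_done_callback(lambda f: print(f.result()))  # Future methods now work
    loop.run_until_complete(fut)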
So how do things change in Python 3.5 with PEP 492? Not much -- the
same story applies, except there's a third type of object, confusingly
called a coroutine object (as opposed to the coroutine I was talking
about above, which is called a generator object). A coroutine object is
almost the same as a generator object, and supports mostly the same
interface (e.g. send(), throw()). We can treat generator objects with
coroutine "behavior" and proper (PEP 492) coroutine objects as
essentially interchangeable, because that's how PEP 492 was designed.
(Differences come out only when you're making a mistake, such as trying
to iterate over one. Iterating over a pre-PEP-492 coroutine is invalid,
but (because it's implemented as a generator object) you can still call
its iter() method. Calling iter() on a PEP 492 coroutine object fails
with a TypeError.)

So what should the docs do? IMO they should be very clear about the
distinction between functions that return Futures and functions that
return coroutines (of either kind). I think it's fine if they are fuzzy
about whether the latter return a PEP 492 style coroutine (i.e. defined
with async def) or a pre-PEP-492 coroutine (marked with
@asyncio.coroutine), since those are almost entirely interchangeable,
and the plan is to eventually make everything a PEP 492 coroutine.

Finally, what should you do if you have a Future but you need a
coroutine? This has come up a few times but it's probably an indication
that there's something you haven't understood yet. The only API that
requires a coroutine (and rejects a Future) is the Task() constructor,
but you should only call that with a coroutine you defined yourself --
if it's something you received, you should be using ensure_future(),
which will do the right thing (wrapping a coroutine in a Task).

Good luck!

--Guido

On Mon, Jan 25, 2016 at 7:56 AM, Ian Kelly wrote:
> The official asyncio documentation includes this note:
>
> """
> Note: In this documentation, some methods are documented as
> coroutines, even if they are plain Python functions returning a
> Future. This is intentional to have a freedom of tweaking the
> implementation of these functions in the future. If such a function is
> needed to be used in a callback-style code, wrap its result with
> ensure_future().
> """
>
> Despite the note, this still causes confusion. See for example
> https://mail.python.org/pipermail/python-list/2016-January/702342.html
>
> As of Python 3.5, "awaitable" is a thing, and as of Python 3.5.1,
> ensure_future is supposed to accept any awaitable. Would it be better
> then to document these methods as returning awaitables rather than as
> coroutines?
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

--
--Guido van Rossum (python.org/~guido)

From guido at python.org  Mon Jan 25 13:52:26 2016
From: guido at python.org (Guido van Rossum)
Date: Mon, 25 Jan 2016 10:52:26 -0800
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To:
References: <20160120003712.GZ10854@ando.pearwood.info>
 <20160121001027.GB4619@ando.pearwood.info>
 <56A3ED9B.6030009@canterbury.ac.nz>
Message-ID:

On Sun, Jan 24, 2016 at 10:32 PM, Terry Reedy wrote:
> What I've concluded from this thread is that function definitions (with
> direct use of 'def' or 'lambda') do not fit well within loops, though I
> used them there myself.

Right.
When you can avoid them, you avoid extra work in an inner loop, which is often a good idea. > When delayed function calls are are needed, what belongs within loops is > packaging of a pre-defined function with one or more arguments within a > callable. Instance.method is an elegant syntax for doing so. > functools.partial(func, args, ...) is a much clumsier generalized > expression, which requires an import. Note that 'partial' returns a > function for delayed execution even when a complete, not partial, set of > arguments is passed. Right. I've always hated partial() (which is why it's not a builtin) because usually a lambda is clearer (it's difficult to calculate in your head the signature of the thing it returns from the arguments passed), but this is one thing where partial() wins, since it captures values. > A major attempted (and tempting) use for definitions within a loop is > multiple callbacks for multiple gui widgets, where delayed execution is > needed. The three answers to multiple 'why doesn't this work' on both > python-list and Stackoverflow are multiple definitions with variant 'default > args', a custom make_function function outside the loop called multiple > times within the loop, and a direct function outside the loop called with > partial within the loop. I am going to start using partial more. Yes, the make_function() approach is just a custom partial(). > Making partial a builtin would make it easier to use and more attractive. > Even more attractive would be syntax that abbreviates delayed calls with > pre-bound arguments in the way that inst.meth abbreviates a much more > complicated expression roughly equivalent to "bind(inst.__getattr__('meth'), > inst)". A recommended best practice / idiom is more useful, because it can be applied to all Python versions. > A possibility would be to make {} a delayed and possibly partial call > operator, in parallel to the current use of () as a immediate and total call > operator. > expr{arguments} > would evaluate to a function, whether of type or a special class > similar to bound methods. The 'arguments' would be anything allowed within > partial, which I believe is anything allowed in any function call. I chose > {} because expr{...} is currently illegal, just as expr(arguments) is for > anything other than a function call. On the other hand, expr[...] is > currently legal, at least up to '[', as is expr<...> at least up to '<'. -1 on expr{...}. >>> and pass the url as a parameter to submit(), problem solved, and you >>> waste fewer resources on function objects and cells to hold nonlocals. > > executor.submit appears to me to be a specialized version of partial, with > all arguments required. With the proposal above, I think submit(func{all > args}) would work. But not before 3.6. >> Aye, that's how the current example code in the docs handles it - >> there's an up front definition of the page loading function, and then >> the submission to the executor is with a dict comprehension. > > I presume you both are referring to ThreadPoolExecutor Example. The > load_url function, which I think should be 'get_page' has a comment that is > wrong (it does not 'report the url') and no docstring. My suggestion: > > # Define an example function for the executor.submit call below. > def get_page(url, timeout): > "Return the page, as a string, retrieved from the url." > with ... 
> >> The only thing "wrong" with it is that when reading the code, the >> potentially single-use function is introduced first without any >> context, and it's only later that you get to see what it's for. > > A proper comment would fix this I think. That aside, if the main code were > packaged within def main, as in the following ProcessPoolExecutor Example, > so as to delay the lookup of 'load_url' or 'get_page', then the two > functions definitions could be in *either* order. The general convention in > Pythonland seems to be to put main last (bottom up, define everything before > use), but in a recent python-list thread, at least one person, and I think > two, said they like to start with def main (top down style, which you seem > to like). I like both. :-) -- --Guido van Rossum (python.org/~guido) From greg.ewing at canterbury.ac.nz Mon Jan 25 15:04:08 2016 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 26 Jan 2016 09:04:08 +1300 Subject: [Python-ideas] intput() In-Reply-To: References: Message-ID: <56A67FB8.5020001@canterbury.ac.nz> Marcel O'Neil wrote: > def intput(): > return int(input()) And also def flintput(): return float(input()) Yabba-dabba-doo-ly, Greg From abarnert at yahoo.com Mon Jan 25 16:03:49 2016 From: abarnert at yahoo.com (Andrew Barnert) Date: Mon, 25 Jan 2016 13:03:49 -0800 Subject: [Python-ideas] PEP 484 change proposal: Allowing overload outside stub files In-Reply-To: References: Message-ID: <940B65C5-C2C0-41DF-8ADD-A2D36D01C18A@yahoo.com> On Jan 23, 2016, at 13:12, Stefan Krah wrote: > > Nick Coghlan writes: >> You're already going to have to allow this for single lines to handle >> Py2 compatible annotations, so it seems reasonable to also extend it >> to handle overloading while you're still figuring out a native syntax >> for that. > > I find that https://pypi.python.org/pypi/multipledispatch looks quite > nice: > >>>> from multipledispatch import dispatch >>>> @dispatch(int, int) > ... def add(x, y): > ... return x + y > ... >>>> @dispatch(float, float) > ... def add(x, y): > ... return x + y > ... >>>> add(1, 2) > 3 >>>> add(1.0, 2.0) > 3.0 >>>> add(1.0, 2) > Traceback (most recent call last): > File > [cut because gmane.org is inflexible] > line 155, in __call__ > func = self._cache[types] > KeyError: (, ) Of course you still have to work out how that would fit with type annotations. Presumably you could just move the dispatched types from the decorator to the annotations, and add a return type on each overload. And you could make the dispatch algorithm ignore element types in generic types (so add(a: Sequence[T], b: Sequence[T]) gets called on any pair of Sequences). But even then, it's hard to imagine how a type checker could understand your code unless it had special-case code for this special decorator. Not to mention that you're not supposed to runtime-dispatch on typing.Sequence (isinstance and issubclass only work by accident), but you can't genericize collections.abc.Sequence. Plus, as either Guido or Jukka pointed out earlier, you may want to specify that Sequence[T] normally returns T but Sequence[Number] always returns float or something; at runtime, those are the same type, so they have to share a single implementation, which takes you right back to needing a way to specify overloads for the type checker. Still, I would love to see someone take that library and mypy and experiment with making them work together and solving all of these problems. 
(As a side note, every time I look at this stuff, I start thinking I
want type computation so I can specify that add(a: Sequence[T], b:
Sequence[U]) -> Sequence[decltype(T + U)], until I spend a few minutes
trying to find a way to write that that isn't as horrible as C++
without introducing all of Haskell into Python, and then appreciating
again why maybe building the simple thing first was a good idea...)

From srkunze at mail.de  Mon Jan 25 16:03:50 2016
From: srkunze at mail.de (Sven R. Kunze)
Date: Mon, 25 Jan 2016 22:03:50 +0100
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To:
References: <20160120003712.GZ10854@ando.pearwood.info>
 <20160121001027.GB4619@ando.pearwood.info>
 <56A3ED9B.6030009@canterbury.ac.nz>
Message-ID: <56A68DB6.2050705@mail.de>

On 24.01.2016 06:16, Guido van Rossum wrote:
> I wonder if kids today aren't too much in love with local function
> definitions. :-) There's a reason why executor.submit() takes a
> function *and arguments*. If you move the function out of the for loop
> and pass the url as a parameter to submit(), problem solved, and you
> waste fewer resources on function objects and cells to hold nonlocals.
> A generation ago most people would have naturally used such a solution
> (since most languages didn't support the alternative :-).

Well said. I remember JS being a hatchery of this kind of programming.
My main concern always was "how can I test these inner functions?"
Almost impossible, but a good excuse not to. So, it's unprofessional
from my point of view, but things may change.

On-topic: I like the way Python allows me to bind early. It's simple,
and that's the main argument for it and against introducing yet
another syntax (like colons, braces, etc.), especially for solving
such a side issue.

Best,
Sven

From bzvi7919 at gmail.com  Mon Jan 25 15:58:33 2016
From: bzvi7919 at gmail.com (Bar Harel)
Date: Mon, 25 Jan 2016 20:58:33 +0000
Subject: [Python-ideas] intput()
In-Reply-To: <56A67FB8.5020001@canterbury.ac.nz>
References: <56A67FB8.5020001@canterbury.ac.nz>
Message-ID:

def dictput():
    input()
    raise SyntaxError("You entered a dict in the wrong way")

Will probably raise a few lols.

btw flintput is float(int(input())), which rounds down. flinput is
float(input()).

-- Bar

On Mon, Jan 25, 2016 at 10:04 PM Greg Ewing wrote:

> Marcel O'Neil wrote:
> > def intput():
> >     return int(input())
>
> And also
>
>     def flintput():
>         return float(input())
>
> Yabba-dabba-doo-ly,
> Greg
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ethan at stoneleaf.us  Mon Jan 25 16:28:18 2016
From: ethan at stoneleaf.us (Ethan Furman)
Date: Mon, 25 Jan 2016 13:28:18 -0800
Subject: [Python-ideas] intput()
In-Reply-To:
References: <56A67FB8.5020001@canterbury.ac.nz>
Message-ID: <56A69372.3080701@stoneleaf.us>

Let's not forget

def dolphinput(message):
    "get fish order from our ocean-going mammalian friends"
    ...
flipper'nly yrs, -- ~Ethan~ From abarnert at yahoo.com Mon Jan 25 16:42:08 2016 From: abarnert at yahoo.com (Andrew Barnert) Date: Mon, 25 Jan 2016 13:42:08 -0800 Subject: [Python-ideas] Explicit variable capture list In-Reply-To: References: <20160120003712.GZ10854@ando.pearwood.info> <20160121001027.GB4619@ando.pearwood.info> <56A3ED9B.6030009@canterbury.ac.nz> Message-ID: On Jan 23, 2016, at 19:22, Nick Coghlan wrote: > > Accordingly, a statement like: > > powers = [] > for new i in range(10): > def f(x): > return x**i > powers.append(f) > > Could be semantically equivalent to: > > powers = [] > for i in range(10): > def _for_loop_suite(i=i): > def f(x): > return x**i > powers.append(f) > _for_loop_suite() > del _for_loop_suite A simpler translation of the Swift/C#/etc. behavior might be: > powers = [] > for i in range(10): > def _for_loop_suite(i): > def f(x): > return x**i > powers.append(f) > _for_loop_suite(i) > del _for_loop_suite This is, after all, how comprehensions work, and how you mechanically translate let bindings from other languages to Python (I believe MacroPy even has a let macro that does exactly this); it's slightly simpler to understand under the hood; it's even slightly more efficient (not that it will ever matter). Of course that raises an important point: when you're _not_ mechanically translating, you rarely translate a let this way; instead, you translate it by rewriting the code at a higher level. (And the fact that this translation _is_ idiomatic in JavaScript is exactly why JS code is ugly in the way that Guido and others decry in this thread.) Do we want the compiler doing something under the hood that we wouldn't want to write ourselves? (Again, people in JS, and other languages like C#, don't consider that a problem--both languages define async as effectively a macro that transforms your code into something you wouldn't want to look at, and those kinds of macros are almost the whole point of Lisp, but I think part of why people like Python is that the semantics of most sugar can be described in terms that are just as readable as the sugared version, except for being longer.) That's why I think I prefer not-Terry's (sorry for the misattribution) version: if something is going to act differently from the usual semantics, maybe it's better to describe it honestly as a new rule you have to learn, than to describe it as a translation to code that has familiar semantics but is nowhere near idiomatic. -------------- next part -------------- An HTML attachment was scrubbed... URL: From skrah.temporarily at gmail.com Mon Jan 25 16:45:11 2016 From: skrah.temporarily at gmail.com (Stefan Krah) Date: Mon, 25 Jan 2016 21:45:11 +0000 (UTC) Subject: [Python-ideas] PEP 484 change proposal: Allowing overload outside stub files References: <940B65C5-C2C0-41DF-8ADD-A2D36D01C18A@yahoo.com> Message-ID: Andrew Barnert via Python-ideas writes: [multipledispatch] > Still, I would love to see someone take that library and mypy and experiment with making them work together Exactly: I posted that link mainly in the hope of not having a simple @overload now and perhaps a fully typed-checked @dispatch version later. But apparently people really want the simple version right now. Stefan Krah From srkunze at mail.de Mon Jan 25 16:57:32 2016 From: srkunze at mail.de (Sven R. Kunze) Date: Mon, 25 Jan 2016 22:57:32 +0100 Subject: [Python-ideas] Making Python great again Message-ID: <56A69A4C.2020307@mail.de> Hi, for all those who felt that something is wrong with Python. 
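For what it's worth, the call-time semantics of the placeholder version
are easy to pin down in today's Python. A toy sketch (the names are
mine, and it ignores *_, **_, and reordering):

    class _PlaceholderType:
        "Sentinel standing in for 'an argument supplied later'."

    _ = _PlaceholderType()

    def placeholder_partial(func, *bound):
        "Fill each _ in bound, left to right, from the call-time arguments."
        def call(*args):
            it = iter(args)
            return func(*(next(it) if b is _ else b for b in bound))
        return call

    hexint = placeholder_partial(int, _, 16)
    hexint('ff')  # 255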
Here's the solution: https://github.com/samshadwell/TrumpScript ;) Best, Sven From rymg19 at gmail.com Mon Jan 25 16:58:00 2016 From: rymg19 at gmail.com (Ryan Gonzalez) Date: Mon, 25 Jan 2016 15:58:00 -0600 Subject: [Python-ideas] intput() In-Reply-To: <56A67FB8.5020001@canterbury.ac.nz> References: <56A67FB8.5020001@canterbury.ac.nz> Message-ID: <26DF7102-0657-4110-98D7-C29A64B178FD@gmail.com> Also: def linput(): 'Reads a list. Completely, 100% secure and bulletproof.' return map(eval, input[1:-1].split(','))) def ninput(): 'Reads None.' assert input() == 'None' def strinput(): 'Reads a string. Also 100% secure.' return eval("'" + input() + "'") On January 25, 2016 2:04:08 PM CST, Greg Ewing wrote: >Marcel O'Neil wrote: >> def intput(): >> return int(input()) > >And also > > def flintput(): > return float(input()) > >Yabba-dabba-doo-ly, >Greg >_______________________________________________ >Python-ideas mailing list >Python-ideas at python.org >https://mail.python.org/mailman/listinfo/python-ideas >Code of Conduct: http://python.org/psf/codeofconduct/ -- Sent from my Nexus 5 with K-9 Mail. Please excuse my brevity. -------------- next part -------------- An HTML attachment was scrubbed... URL: From abarnert at yahoo.com Mon Jan 25 17:14:51 2016 From: abarnert at yahoo.com (Andrew Barnert) Date: Mon, 25 Jan 2016 14:14:51 -0800 Subject: [Python-ideas] Explicit variable capture list In-Reply-To: References: <20160120003712.GZ10854@ando.pearwood.info> <20160121001027.GB4619@ando.pearwood.info> <56A3ED9B.6030009@canterbury.ac.nz> Message-ID: <6283E2BE-34D0-44AE-A829-0C4535B53C32@yahoo.com> On Jan 24, 2016, at 22:32, Terry Reedy wrote: > > A possibility would be to make {} a delayed and possibly partial call operator, in parallel to the current use of () as a immediate and total call operator. > expr{arguments} > would evaluate to a function, whether of type or a special class similar to bound methods. The 'arguments' would be anything allowed within partial, which I believe is anything allowed in any function call. I chose {} because expr{...} is currently illegal, just as expr(arguments) is for anything other than a function call. On the other hand, expr[...] is currently legal, at least up to '[', as is expr<...> at least up to '<'. I like the idea of "easy" partials, but I don't like this syntax. Many languages (Scala, C++ with boost::lambda, etc.) use a syntax something like this: hex = int(_, 16) binopen = open(_, "rb", *_, **_) setspam = setattr(spam, attr, _) The equivalent functions are: lambda x: int(x, 16) lambda arg, *args, **kw: open(arg, "rb", *args, **kw) lambda arg, *, _spam=spam, _attr=attr: setattr(_spam, _attr, arg) You can extend this to allow reordering arguments, similarly to the way %-formatting handles reordering: modexp = pow(_3, _1, _2) Obviously '_' only works if that's not a valid identifier (or if you're implementing things with horrible template metaprogramming tricks and argument-dependent lookup rather than in the language), but some other symbol like ':', '%', or '$' might work. I won't get into the ways you can extend this to expressions other than calls, like 2*_ or just (2*). The first problem with this syntax is that it doesn't give you a way to specify _all_ of the arguments and return a nullary partial. But you can always work around that with dummy params with default values. And it really doesn't come up that often in practice anyway in languages with this syntax, except in the special case that Python already handles with bound methods. 
The other big problem is that it just doesn't look like Python, no matter how much you squint. But going only half-way there, via an extended functools.partial that's more like boost bind than boost lambda isn't nearly as bad: hex = partial(int, _, 16) binopen = partial(open, _, "rb", *_, **_) setspam = partial(setattr, spam, attr, _) Only the last one can be built with partial today, and even that one seems a lot more comprehensible with the explicit ', _' showing that the resulting function takes one argument, and you can see exactly where that argument will go, than with the current implicit version. At any rate, I'm not sure I like either of these, but I definitely like them both better than: setspam = setattr{spam, attr} From abarnert at yahoo.com Mon Jan 25 17:24:44 2016 From: abarnert at yahoo.com (Andrew Barnert) Date: Mon, 25 Jan 2016 14:24:44 -0800 Subject: [Python-ideas] intput() In-Reply-To: <26DF7102-0657-4110-98D7-C29A64B178FD@gmail.com> References: <56A67FB8.5020001@canterbury.ac.nz> <26DF7102-0657-4110-98D7-C29A64B178FD@gmail.com> Message-ID: def binput(): return bytes(map(ord, input())) This should make Python 3-haters happy: it works perfectly, without any need for thought, as long as all of your friends are American. If not, just throw in random calls to .encode and .decode all over the place until the errors go away. Sent from my iPhone > On Jan 25, 2016, at 13:58, Ryan Gonzalez wrote: > > Also: > > > def linput(): > 'Reads a list. Completely, 100% secure and bulletproof.' > return map(eval, input[1:-1].split(','))) > > > def ninput(): > 'Reads None.' > assert input() == 'None' > > def strinput(): > 'Reads a string. Also 100% secure.' > return eval("'" + input() + "'") > >> On January 25, 2016 2:04:08 PM CST, Greg Ewing wrote: >> Marcel O'Neil wrote: >>> def intput(): >>> return int(input()) >> >> And also >> >> def flintput(): >> return float(input()) >> >> Yabba-dabba-doo-ly, >> Greg >> >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ > > -- > Sent from my Nexus 5 with K-9 Mail. Please excuse my brevity. > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bzvi7919 at gmail.com Mon Jan 25 17:25:05 2016 From: bzvi7919 at gmail.com (Bar Harel) Date: Mon, 25 Jan 2016 22:25:05 +0000 Subject: [Python-ideas] intput() In-Reply-To: <26DF7102-0657-4110-98D7-C29A64B178FD@gmail.com> References: <56A67FB8.5020001@canterbury.ac.nz> <26DF7102-0657-4110-98D7-C29A64B178FD@gmail.com> Message-ID: For the ducks among us. Simple, Clean, Efficient and Secure. The 4 S/C/E/S. def duckput(): """Reads anything. 'Cause there's never too much ducktyping""" return eval(input()+";") # ; makes sure there is only one line. On Mon, Jan 25, 2016 at 11:58 PM Ryan Gonzalez wrote: > Also: > > > def linput(): > 'Reads a list. Completely, 100% secure and bulletproof.' > return map(eval, input[1:-1].split(','))) > > > def ninput(): > 'Reads None.' > assert input() == 'None' > > def strinput(): > 'Reads a string. Also 100% secure.' 
> return eval("'" + input() + "'") > > On January 25, 2016 2:04:08 PM CST, Greg Ewing < > greg.ewing at canterbury.ac.nz> wrote: > >> Marcel O'Neil wrote: >> >>> def intput(): >>> return int(input()) >>> >> >> And also >> >> def flintput(): >> return float(input()) >> >> Yabba-dabba-doo-ly, >> Greg >> ------------------------------ >> >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> >> -- > Sent from my Nexus 5 with K-9 Mail. Please excuse my brevity. > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From abarnert at yahoo.com Mon Jan 25 17:30:40 2016 From: abarnert at yahoo.com (Andrew Barnert) Date: Mon, 25 Jan 2016 14:30:40 -0800 Subject: [Python-ideas] intput() In-Reply-To: References: <56A67FB8.5020001@canterbury.ac.nz> <26DF7102-0657-4110-98D7-C29A64B178FD@gmail.com> Message-ID: On Jan 25, 2016, at 14:25, Bar Harel wrote: > > For the ducks among us. Simple, Clean, Efficient and Secure. The 4 S/C/E/S. > > def duckput(): > """Reads anything. 'Cause there's never too much ducktyping""" > return eval(input()+";") # ; makes sure there is only one line. Isn't that a guaranteed syntax error? Expressions can't include semicolons. Although I suppose that makes it even more secure, I think it would be more efficient to just `raise SyntaxError`. > >> On Mon, Jan 25, 2016 at 11:58 PM Ryan Gonzalez wrote: >> Also: >> >> >> def linput(): >> 'Reads a list. Completely, 100% secure and bulletproof.' >> return map(eval, input[1:-1].split(','))) >> >> >> def ninput(): >> 'Reads None.' >> assert input() == 'None' >> >> def strinput(): >> 'Reads a string. Also 100% secure.' >> return eval("'" + input() + "'") >> >>> On January 25, 2016 2:04:08 PM CST, Greg Ewing wrote: >> >>> Marcel O'Neil wrote: >>>> def intput(): >>>> return int(input()) >>> >>> And also >>> >>> def flintput(): >>> return float(input()) >>> >>> Yabba-dabba-doo-ly, >>> Greg >>> >> >>> Python-ideas mailing list >>> Python-ideas at python.org >>> https://mail.python.org/mailman/listinfo/python-ideas >>> Code of Conduct: http://python.org/psf/codeofconduct/ >> >> -- >> Sent from my Nexus 5 with K-9 Mail. Please excuse my brevity. >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bzvi7919 at gmail.com Mon Jan 25 17:36:09 2016 From: bzvi7919 at gmail.com (Bar Harel) Date: Mon, 25 Jan 2016 22:36:09 +0000 Subject: [Python-ideas] intput() In-Reply-To: References: <56A67FB8.5020001@canterbury.ac.nz> <26DF7102-0657-4110-98D7-C29A64B178FD@gmail.com> Message-ID: Just decorate it with fuckit and everything will be alright. Make sure to follow the module's guideline though: "This module is like violence: if it doesn't work, you just need more of it." 
On Tue, Jan 26, 2016 at 12:30 AM Andrew Barnert wrote: > On Jan 25, 2016, at 14:25, Bar Harel wrote: > > For the ducks among us. Simple, Clean, Efficient and Secure. The 4 S/C/E/S. > > def duckput(): > """Reads anything. 'Cause there's never too much ducktyping""" > return eval(input()+";") # ; makes sure there is only one line. > > > Isn't that a guaranteed syntax error? Expressions can't include > semicolons. Although I suppose that makes it even more secure, I think it > would be more efficient to just `raise SyntaxError`. > > > On Mon, Jan 25, 2016 at 11:58 PM Ryan Gonzalez wrote: > >> Also: >> >> >> def linput(): >> 'Reads a list. Completely, 100% secure and bulletproof.' >> return map(eval, input[1:-1].split(','))) >> >> >> def ninput(): >> 'Reads None.' >> assert input() == 'None' >> >> def strinput(): >> 'Reads a string. Also 100% secure.' >> return eval("'" + input() + "'") >> >> On January 25, 2016 2:04:08 PM CST, Greg Ewing < >> greg.ewing at canterbury.ac.nz> wrote: >> >>> Marcel O'Neil wrote: >>> >>>> def intput(): >>>> return int(input()) >>>> >>> >>> And also >>> >>> def flintput(): >>> return float(input()) >>> >>> Yabba-dabba-doo-ly, >>> Greg >>> ------------------------------ >>> >>> Python-ideas mailing list >>> Python-ideas at python.org >>> https://mail.python.org/mailman/listinfo/python-ideas >>> Code of Conduct: http://python.org/psf/codeofconduct/ >>> >>> -- >> Sent from my Nexus 5 with K-9 Mail. Please excuse my brevity. >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From abarnert at yahoo.com Mon Jan 25 17:44:08 2016 From: abarnert at yahoo.com (Andrew Barnert) Date: Mon, 25 Jan 2016 14:44:08 -0800 Subject: [Python-ideas] Explicit variable capture list In-Reply-To: <5AD4E9DA-1970-45E3-BB20-DD5FB8D55833@yahoo.com> References: <20160120003712.GZ10854@ando.pearwood.info> <20160121001027.GB4619@ando.pearwood.info> <5AD4E9DA-1970-45E3-BB20-DD5FB8D55833@yahoo.com> Message-ID: I said I'd write something up over the weekend if I couldn't find a good writeup from the Swift, C#, or Scala communities. I couldn't, so I did: https://stupidpythonideas.blogspot.com/2016/01/for-each-loops-should-define-new.html Apologies for the formatting (which I blame on blogspot--my markdown-to-html-with-workarounds-for-blogspot-sucking toolchain is still not perfect), and for being not entirely focused on Python (which is a consequence of Ruby and C# people being vaguely interested in it), and for being overly verbose (which is entirely my fault, as usual). Sent from my iPhone > On Jan 22, 2016, at 21:36, Andrew Barnert via Python-ideas wrote: > >> On Jan 22, 2016, at 21:06, Chris Angelico wrote: >> >> On Sat, Jan 23, 2016 at 3:50 PM, Andrew Barnert via Python-ideas >> wrote: >>> Finally, Terry suggested a completely different solution to the problem: >>> don't change closures; change for loops. Make them create a new variable >>> each time through the loop, instead of reusing the same variable. 
When the >>> variable isn't captured, this would make no difference, but when it is, >>> closures from different iterations would capture different variables (and >>> therefore different cells). For backward-compatibility reasons, this might >>> have to be optional, which means new syntax; he proposed "for new i in >>> range(10):". >> >> Not just for backward compatibility. Python's scoping and assignment >> rules are currently very straight-forward: assignment creates a local >> name unless told otherwise by a global/nonlocal declaration, and *all* >> name binding follows the same rules as assignment. Off the top of my >> head, I can think of two special cases, neither of which is truly a >> change to the binding semantics: "except X as Y:" triggers an >> unbinding at the end of the block, and comprehensions have a hidden >> function boundary that means their iteration variables are more local >> than you might think. Making for loops behave differently by default >> would be a stark break from that tidiness. > > As a side note, notice that if you don't capture the variable, there is no observable difference (which means CPython would be well within its rights to optimize it by reusing the same variable unless it's a cellvar). > > Anyway, yes, it's still something that you have to learn--but the unexpected-on-first-encounter interaction between loop variables and closures is also something that everybody has to learn. And, even after you understand it, it still doesn't become obvious until you've been bitten by it enough times (and if you're going back and forth between Python and a language that's solved the problem, one way or the other, you may keep relearning it). So, theoretically, the status quo is certainly simpler, but in practice, I'm not sure it is. > >> It seems odd to change this on the loop, though. Is there any reason >> to use "for new i in range(10):" if you're not making a series of >> nested functions? > > Rarely if ever. But is there any reason to "def spam(x; i):" or "def [i](x):" or whatever syntax people like if you're not overwriting i with a different and unwanted value? And is there any reason to reuse a variable you've bound in that way if a loop isn't forcing you to do so? > > This problem comes up all the time, in all kinds of languages, when loops and closures intersect. It almost never comes up with loops alone or closures alone. > >> Seems most logical to make this a special way of >> creating functions, not of looping. > > There are also some good theoretical motivations for changing loops, but I'm really hoping someone else (maybe the Swift or C# dev team blogs) has already written it up, so I can just post a link and a short "... and here's why it also applies to Python" (complicated by the fact that one of the motivations _doesn't_ apply to Python...). > > Also, the idea of a closure "capturing by value" is pretty strange on the surface; you have to think through why that doesn't just mean "not capturing" in a language like Python. Nick Coghlan suggests calling it "capture at definition" vs. "capture at call", which helps, but it's still weird. Weirder than loops creating a new binding that has the same name as the old one in a let-less language? I don't know. They're both weird. And so is the existing behavior, despite the fact that it makes perfect sense once you work it through. > > Anyway, for now, I'll just repeat that Ruby, Swift, C#, etc. 
all solved this by changing for loops, while only C++, which already needed
to change closures because of its lifetime rules, solved it by changing
closures. On the other hand, JavaScript and Java both explicitly rejected
any change to fix the problem, and Python has lived with it for a long
time, so...

From mahmoud at hatnote.com  Mon Jan 25 18:01:54 2016
From: mahmoud at hatnote.com (Mahmoud Hashemi)
Date: Mon, 25 Jan 2016 15:01:54 -0800
Subject: [Python-ideas] intput()
In-Reply-To: 
References: <56A67FB8.5020001@canterbury.ac.nz>
	<26DF7102-0657-4110-98D7-C29A64B178FD@gmail.com>
Message-ID: 

I tried to have fun, but my joke ended up long and maybe useful.

Anyways, here's *ynput()*:

https://gist.github.com/mahmoud/f23785445aff7a367f78

Get yourself a True/False from a y/n.

D[Yn]amically,

Mahmoud
https://github.com/mahmoud
https://twitter.com/mhashemi

On Mon, Jan 25, 2016 at 2:36 PM, Bar Harel wrote:
> Just decorate it with fuckit and everything will be alright. Make sure to
> follow the module's guideline though: "This module is like violence: if it
> doesn't work, you just need more of it."
> [...]
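For readers who don't want to chase the gist, a minimal sketch of the idea
-- not Mahmoud's actual code; prompt text and accepted answers here are
illustrative -- might be:

    def ynput(prompt='y/n? '):
        """Return True for a yes-ish answer, False for a no-ish one."""
        answer = input(prompt).strip().lower()
        if answer in ('y', 'yes'):
            return True
        if answer in ('n', 'no'):
            return False
        raise ValueError('expected y or n, got %r' % answer)

The real version also insists on returning the actual True/False
singletons, which comes up again below.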
From steve at pearwood.info  Mon Jan 25 18:21:36 2016
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 26 Jan 2016 10:21:36 +1100
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: 
References: <20160120003712.GZ10854@ando.pearwood.info>
	<20160121001027.GB4619@ando.pearwood.info>
Message-ID: <20160125232136.GP4619@ando.pearwood.info>

Excellent summary, thank you, but I want to take exception to something
you wrote. I fear that you have inadvertently derailed the thread into a
considerably narrower focus than it should have.

On Fri, Jan 22, 2016 at 08:50:52PM -0800, Andrew Barnert wrote:
>
> What the thread is ultimately looking for is a solution to the
> "closures capturing loop variables" problem. This problem has been in
> the official programming FAQ[1] for decades, as "Why do lambdas
> defined in a loop with different values all return the same result"?

The issue is not loop variables, or rather, it's not *only* loop
variables, and so any solution which focuses on fixing loop variables is
only half a solution. If we look back at Haael's original post, his
example captures *three* variables, not one, and there is no suggestion
that they are necessarily loop variables.

It's nice that since we have lambda and list comps we can occasionally
write closures in a one-liner loop like so:

> powers = [lambda x: x**i for i in range(10)]
>
> This gives you ten functions that all return x**9, which is probably
> not what you wanted.

but in my opinion, that's really a toy example suitable only for
demonstrating the nature of the issue and the difference between early
and late binding. Outside of such toys, we often find ourselves closing
over at least one variable which is derived from the loop variable, but
not the loop variable itself:

    # Still a toy, but perhaps a bit more of a realistic toy.
    searchers = []
    for provider in search_provider:
        key = API_KEYS[provider]
        url = SEARCH_URLS[provider]
        def lookup(*terms):
            terms = "/q=" + "+".join(escape(t) for t in terms)
            u = url + ("key=%s" % key) + terms
            return fetch(u) or []
        searchers.append(lookup)

> The OP proposed that we should add some syntax, borrowed from C++, to
> function definitions that specifies that some things get captured by
> value.
[...]

Regardless of the syntax chosen, this has a few things to recommend it:

- It's completely explicit. If you want a value captured, you
have to say so explicitly, otherwise you will get the normal variable
lookup behaviour that Python uses now.

- It's general. We can capture locals, nonlocals, globals or builtins,
not just loop variables.

- It allows us to avoid the "default argument" idiom, in cases where we
really don't want the argument, we just want to capture the value. There
are a lot of functions which have their parameter list polluted by
extraneous arguments that should never be used by the caller simply
because that's the only way to get early binding/value capturing.
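To make that last point concrete: today, the only way to get early binding
for the searchers example is to graft the captured values onto the
signature as defaults. A sketch of that standard workaround (this is the
existing idiom under discussion, not new syntax):

    searchers = []
    for provider in search_provider:
        key = API_KEYS[provider]
        url = SEARCH_URLS[provider]
        # url=url and key=key exist purely to freeze the current values;
        # callers must know never to pass them.
        def lookup(*terms, url=url, key=key):
            terms = "/q=" + "+".join(escape(t) for t in terms)
            u = url + ("key=%s" % key) + terms
            return fetch(u) or []
        searchers.append(lookup)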
> Finally, Terry suggested a completely different solution to the
> problem: don't change closures; change for loops. Make them create a
> new variable each time through the loop, instead of reusing the same
> variable. When the variable isn't captured, this would make no
> difference, but when it is, closures from different iterations would
> capture different variables (and therefore different cells).

It was actually Greg, not Terry.

I strongly dislike this suggestion (sorry Greg), and I am concerned that
the thread seems to have been derailed into treating loop variables as
special enough to break the rules.

It does nothing to solve the general problem of capturing values. It
doesn't work for my "searchers" example above, or even the toy example
here:

    funcs = []
    for i in range(10):
        n = i**2
        funcs.append(lambda x: x + n)

This example can be easily re-written to close over the loop variable
directly, that's not the point. The point is that we frequently need to
capture more than just the loop variable. Coming up with a solution that
only solves the issue for loop variables isn't enough, and it is a mistake
to think that this is about "closures capturing loop variables".

I won't speak for other languages, but in Python, where loops don't
introduce a new scope, "closures capturing loop variables" shouldn't even
be seen as a separate problem from the more general issue of capturing
values early rather than late. It's just a common, easily stumbled
across, manifestation of the same.

> For
> backward-compatibility reasons, this might have to be optional, which
> means new syntax; he proposed "for new i in range(10):".

I would not like to see "new" become a keyword. I have a lot of code
using new (and old) as a variable.

-- 
Steve

From steve at pearwood.info  Mon Jan 25 18:34:55 2016
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 26 Jan 2016 10:34:55 +1100
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: 
References: <20160120003712.GZ10854@ando.pearwood.info>
	<20160121001027.GB4619@ando.pearwood.info>
Message-ID: <20160125233455.GQ4619@ando.pearwood.info>

On Wed, Jan 20, 2016 at 05:04:21PM -0800, Guido van Rossum wrote:
> On Wed, Jan 20, 2016 at 4:10 PM, Steven D'Aprano
> wrote:
[...]
> > (I'm saving my energy for Eiffel-like require/ensure blocks
> > *wink*).
>
> Now you're making me curious.

Okay, just to satisfy your curiosity, and not as a concrete proposal at
this time, here is a sketch of the sort of thing Eiffel uses for Design
By Contract.

Each function or method has an (optional, but recommended) pre-condition
and post-condition. Using a hybrid Eiffel/Python syntax, here is a toy
example:

    class Lunch:
        def __init__(self, arg):
            self.spam(arg)

        def spam(self, n:int=5):
            """Set the lunch meat to n servings of spam."""
            require:
                # Assert the pre-conditions of the method.
                assert n >= 1
            ensure:
                # Assert the post-conditions of the method.
                assert self.meat.startswith('Spam')
                if ' ' in self.meat:
                    assert ' spam' in self.meat
            # main body of the method, as usual
            serves = ['spam']*n
            serves[0] = serves[0].title()
            self.meat = ' '.join(serves)

The require block runs before the body of the method, and the ensure
block runs after the body, but before the method returns to the caller.
If either fails its assertions, the method fails and raises an
exception.
Benefits:

- The pre- and post-conditions make up (part of) the method's
contract, which is part of the executable documentation of
the method. Documentation tools can extract the ensure
and require sections and present them as part of the API docs.

- The compiler can turn the contract checking on or off as
needed, with the ensure/require sections handled independently.

- Testing pre- and post-conditions is logically separate from
the method's implementation. This allows the implementation
to vary while keeping the contract the same.

- But at the same time, the contract is right there with the
method, not separated in some potentially distant part of the
code base.

I'm not going to go into detail about Design By Contract, if anyone
wants to learn more you can start here:

https://www.eiffel.com/values/design-by-contract/introduction/

https://docs.eiffel.com/book/method/et-design-contract-tm-assertions-and-exceptions

I've just discovered there's an older PEP for something similar:

https://www.python.org/dev/peps/pep-0316/

but that uses docstrings for the contracts. I don't like that.

-- 
Steve
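For reference, PEP 316's docstring-based contracts look roughly like this
(adapted from the PEP's sort() example; see the PEP for the exact syntax):

    def sort(a):
        """Sort a list *IN PLACE*.

        pre:
            # must be a list
            isinstance(a, list)

        post[a]:
            # length of the list is unchanged
            len(a) == len(__old__.a)
        """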
From rymg19 at gmail.com  Mon Jan 25 18:40:37 2016
From: rymg19 at gmail.com (Ryan Gonzalez)
Date: Mon, 25 Jan 2016 17:40:37 -0600
Subject: [Python-ideas] intput()
In-Reply-To: 
References: <56A67FB8.5020001@canterbury.ac.nz>
	<26DF7102-0657-4110-98D7-C29A64B178FD@gmail.com>
Message-ID: 

...

...that's actually pretty awesome. (Other than the "is True" and "is
False" stuff, which is making my OCD go haywire.)

On Mon, Jan 25, 2016 at 5:01 PM, Mahmoud Hashemi wrote:
> I tried to have fun, but my joke ended up long and maybe useful.
>
> Anyways, here's *ynput()*:
>
> https://gist.github.com/mahmoud/f23785445aff7a367f78
>
> Get yourself a True/False from a y/n.
> [...]

-- 
Ryan
[ERROR]: Your autotools build scripts are 200 lines longer than your
program. Something's wrong.
http://kirbyfan64.github.io/

From rosuav at gmail.com  Mon Jan 25 18:52:59 2016
From: rosuav at gmail.com (Chris Angelico)
Date: Tue, 26 Jan 2016 10:52:59 +1100
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <20160125232136.GP4619@ando.pearwood.info>
References: <20160120003712.GZ10854@ando.pearwood.info>
	<20160121001027.GB4619@ando.pearwood.info>
	<20160125232136.GP4619@ando.pearwood.info>
Message-ID: 

On Tue, Jan 26, 2016 at 10:21 AM, Steven D'Aprano wrote:
> - It allows us to avoid the "default argument" idiom, in cases where we
> really don't want the argument, we just want to capture the value. There
> are a lot of functions which have their parameter list polluted by
> extraneous arguments that should never be used by the caller simply
> because that's the only way to get early binding/value capturing.

Can you actually name a few, please? I went digging earlier, and
couldn't find any really good examples in the stdlib - they're mostly
internal functions (underscore-prefixed) that shouldn't be being
called from outside their own module anyway. Maybe this isn't as
common an issue as I'd thought.

ChrisA

From abarnert at yahoo.com  Mon Jan 25 18:59:58 2016
From: abarnert at yahoo.com (Andrew Barnert)
Date: Mon, 25 Jan 2016 15:59:58 -0800
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <20160125232136.GP4619@ando.pearwood.info>
References: <20160120003712.GZ10854@ando.pearwood.info>
	<20160121001027.GB4619@ando.pearwood.info>
	<20160125232136.GP4619@ando.pearwood.info>
Message-ID: 

On Jan 25, 2016, at 15:21, Steven D'Aprano wrote:
>
> Excellent summary, thank you, but I want to take exception to something
> you wrote.
I fear that you have inadvertently derailed the thread into a > considerably narrower focus than it should have. > >> On Fri, Jan 22, 2016 at 08:50:52PM -0800, Andrew Barnert wrote: >> >> What the thread is ultimately looking for is a solution to the >> "closures capturing loop variables" problem. This problem has been in >> the official programming FAQ[1] for decades, as "Why do lambdas >> defined in a loop with different values all return the same result"? > > The issue is not loop variables, or rather, it's not *only* loop > variables, and so any solution which focuses on fixing loop variables is > only half a solution. I think it really _is_ only loop variables--or at least 95% loop variables. > ... Outside of such toys, we often find ourselves closing > over at least one variable which is derived from the loop variable, but > not the loop variable itself: But, depending on how you write that, either (a) it already works the way you'd naively expect, or (b) the only reason you'd expect it to work is if you don't understand Python scoping (that is, you think every block is a scope). That's different from the case with loop variables: even people who know Python scoping still regularly make the mistake with loop variables, swear at themselves, and write the default-value trick on the first debug pass. (Novices, of course, swear at themselves, try 28 random changes, then post their code on StackOverflow titled "Why Python closures does suck this way?") It's the loop variable problem that's in the FAQ. And it does in fact come up all the time in some kinds of programs, like Tkinter code that wants to create callbacks for each of 10 buttons. And again, looking at other languages, it's the loop variable problem that's in their FAQs, and the new-variable-per-instance solution would work across most of them, and is actually used in some. Again, I definitely acknowledge that Python's non-granular scopes make the issue much less clear-cut than in those languages where "key = API_KEYS[provider]" would actually work. That's why I said that if there's one mainstream language that _shouldn't_ use my solution, it's Python. And, ultimately, I'm still -0 about any change--the default-value solution has worked for decades, everyone who uses Python understands it, and there's no serious problem with it. But I think "capture by value" or "capture early" would, outside the loop-variable case, be more often an attractive nuisance for code you shouldn't be writing than a help for code you should. If you think we _should_ solve the problem with "loop-body-local" variables, that would definitely be an argument for Nick's "define and call a function" implementation over the new-cell implementation, because his version does actually define a new scope, and can easily be written to make those variables actually loop-body-local. However, I think that, if we wanted that, it would be better to have a more general solution--maybe a "scope" statement that defines a new scope for its suite, or even a "let" statement that defines a new variable only until the end of the current suite. Or, of course, we could toss this on the large pile of "problems that would be solved by light-weight multi-line lambda" (and I think it doesn't add nearly enough weight to that pile to make the problem worth solving, either). >> The OP proposed that we should add some syntax, borrowed from C++, to >> function definitions that specifies that some things get captured by >> value. > [...] 
> Regardless of the syntax chosen, this has a few things to recommend it:
>
> - It's completely explicit. If you want a value captured, you
> have to say so explicitly, otherwise you will get the normal variable
> lookup behaviour that Python uses now.

Surely "for new i" is just as explicit about the fact that the variable is
"special" as "def f(x; i):" or "sharedlocal i"? The difference is only
_where_ it's marked, not _whether_ it's marked.

> - It's general. We can capture locals, nonlocals, globals or builtins,
> not just loop variables.

Sure, but it may be an overly-general solution to a very specific problem.
If not, then great, but... Do you really have code that would be clearer
if you could capture a global variable by value? (Of course there's code
that does that as an optimization--but that's not to make the code
clearer; it's to make the code slightly faster despite being less clear.)

> - It allows us to avoid the "default argument" idiom, in cases where we
> really don't want the argument, we just want to capture the value. There
> are a lot of functions which have their parameter list polluted by
> extraneous arguments that should never be used by the caller simply
> because that's the only way to get early binding/value capturing.

It's not the _only_ way. When you really want a new scope, you can always
define and call a local function. Or, usually better, refactor things so
you're calling a global function, or using an object, or some other
solution. The default-value idiom is just the most _concise_ way.

Meanwhile, have you ever actually had a bug where someone passed an
override for the i=i or len=len parameter? I suspect if people really were
worried about that, they would use "*, _spam=spam", but they never do.
(The only place I've seen anything like that is in generated code--e.g., a
currying macro.)

>> For
>> backward-compatibility reasons, this might have to be optional, which
>> means new syntax; he proposed "for new i in range(10):".
>
> I would not like to see "new" become a keyword. I have a lot of code
> using new (and old) as a variable.

I've even got some 2.5 code that runs in 3.3+ thanks to modernize, but
still uses the "new" module. :)

Of course it could become a context-sensitive keyword, like async. But
yeah, that seems more like a last-resort idea than something to emulate
wherever possible...
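For completeness, the "define and call a local function" alternative
mentioned above looks like this in the loop case -- each call to make()
creates a fresh scope, and therefore a fresh cell:

    funcs = []
    for i in range(3):
        def make(j):
            return lambda: j   # j is local to this particular call of make()
        funcs.append(make(i))

    print([f() for f in funcs])   # [0, 1, 2], not [2, 2, 2]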
From mike at selik.org  Mon Jan 25 19:18:53 2016
From: mike at selik.org (Michael Selik)
Date: Tue, 26 Jan 2016 00:18:53 +0000
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <20160125232136.GP4619@ando.pearwood.info>
References: <20160120003712.GZ10854@ando.pearwood.info>
	<20160121001027.GB4619@ando.pearwood.info>
	<20160125232136.GP4619@ando.pearwood.info>
Message-ID: 

On Mon, Jan 25, 2016 at 5:22 PM Steven D'Aprano wrote:

> # Still a toy, but perhaps a bit more of a realistic toy.
> searchers = []
> for provider in search_provider:
>     key = API_KEYS[provider]
>     url = SEARCH_URLS[provider]
>     def lookup(*terms):
>         terms = "/q=" + "+".join(escape(t) for t in terms)
>         u = url + ("key=%s" % key) + terms
>         return fetch(u) or []
>     searchers.append(lookup)

I'd define the basic function outside the loop.

    def lookup(root_url, api_key, *terms):
        args = root_url, api_key, "+".join(escape(t) for t in terms)
        url = '%s?key=%s&q=%s' % args
        return fetch(url) or []

Then use ``functools.partial`` inside the loop to create the closure.

    searchers = []
    for provider in search_provider:
        key = API_KEYS[provider]
        url = SEARCH_URLS[provider]
        searchers.append(partial(lookup, url, key))

Or even more concisely, you could use a comprehension at that point.

    searchers = [partial(lookup, SEARCH_URLS[p], API_KEYS[p])
                 for p in search_provider]
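A quick demonstration of the difference this makes -- partial() freezes
its arguments at the moment the callable is created, while a closure over
the loop variable sees its final value:

    from functools import partial

    adders = [partial(lambda n, x: x + n, i) for i in range(3)]
    print([f(10) for f in adders])    # [10, 11, 12]

    closures = [lambda x: x + i for i in range(3)]
    print([f(10) for f in closures])  # [12, 12, 12] -- late binding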
From mahmoud at hatnote.com  Mon Jan 25 19:37:50 2016
From: mahmoud at hatnote.com (Mahmoud Hashemi)
Date: Mon, 25 Jan 2016 16:37:50 -0800
Subject: [Python-ideas] intput()
In-Reply-To: 
References: <56A67FB8.5020001@canterbury.ac.nz>
	<26DF7102-0657-4110-98D7-C29A64B178FD@gmail.com>
Message-ID: 

If you look very closely, identity checks are actually intended. I want
_the_ True (or False). Otherwise, ValueError. :)

On Mon, Jan 25, 2016 at 3:40 PM, Ryan Gonzalez wrote:
> ...that's actually pretty awesome. (Other than the "is True" and "is
> False" stuff, which is making my OCD go haywire.)
> [...]

From abarnert at yahoo.com  Mon Jan 25 19:42:59 2016
From: abarnert at yahoo.com (Andrew Barnert)
Date: Mon, 25 Jan 2016 16:42:59 -0800
Subject: [Python-ideas] DBC (Re: Explicit variable capture list)
In-Reply-To: <20160125233455.GQ4619@ando.pearwood.info>
References: <20160120003712.GZ10854@ando.pearwood.info>
	<20160121001027.GB4619@ando.pearwood.info>
	<20160125233455.GQ4619@ando.pearwood.info>
Message-ID: 

On Jan 25, 2016, at 15:34, Steven D'Aprano wrote:
>
> Okay, just to satisfy your curiosity, and not as a concrete proposal at
> this time, here is a sketch of the sort of thing Eiffel uses for Design
> By Contract.

I think it's worth explaining why this has to be an actual language
feature, not something you just do by writing functions named "requires"
and "ensures". Many of the benefits you cited would work just fine with a
PyPI-library solution, but there are some problems that are much harder to
solve:

* You usually want ensure to be able to access the return value and
exception state, and maybe even any locals, and require to be able to
access the parameters.

* Faking ensure usually means finally or with (which means indenting your
entire function) or a wrapper function (which precludes many simple
designs).

* Many contract assertions are slow (or even dangerous, when not upheld)
to calculate, so just no-opping out the checker functions doesn't help.

* Class invariants should be automatically verified as ensures on all
public methods except __del__ and (if it raises) __init__.

* Subclasses that override a method need to automatically inherit the
base class's pre- and post-conditions (as well as possibly adding some of
their own), even if they don't call the super method.

* Some contract assertions can be tested at compile time. (Eiffel doesn't
have much experimentation here; C# does, and there are rumors about Swift
with clang-static-analyzer.)

Some of these things can be shoehorned in with frame hacks and metaclasses
and so on, but it's not fun. There's a lot of history of people trying to
fake it in other languages and then giving up and saying "just use
comments until we can build it into language version n+1". (See D 1.0,
Core C++ Standard for C++14/17, C# 4, Swift 2...) There have been a few
attempts for Python, but most of them seem to have run into similar
problems, after a lot of messing around with metaclasses and so on.
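As a concrete illustration of that shoehorning, the class-invariant bullet
alone costs roughly this much as a plain class decorator (a sketch;
__invariant__ is an invented hook):

    import functools

    def check_invariants(cls):
        # Re-verify cls.__invariant__() after every public method call.
        def wrap(method):
            @functools.wraps(method)
            def wrapper(self, *args, **kwargs):
                result = method(self, *args, **kwargs)
                self.__invariant__()
                return result
            return wrapper
        for name, attr in list(vars(cls).items()):
            if callable(attr) and not name.startswith('_'):
                setattr(cls, name, wrap(attr))
        return cls

And even this much does not give subclasses the base class's contracts
unless they also apply the decorator -- which is exactly the inheritance
problem in the list above.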
From mike at selik.org  Mon Jan 25 20:01:27 2016
From: mike at selik.org (Michael Selik)
Date: Tue, 26 Jan 2016 01:01:27 +0000
Subject: [Python-ideas] DBC (Re: Explicit variable capture list)
In-Reply-To: 
References: <20160120003712.GZ10854@ando.pearwood.info>
	<20160121001027.GB4619@ando.pearwood.info>
	<20160125233455.GQ4619@ando.pearwood.info>
Message-ID: 

On Mon, Jan 25, 2016 at 6:43 PM Andrew Barnert via Python-ideas
<python-ideas at python.org> wrote:
> I think it's worth explaining why this has to be an actual language
> feature, not something you just do by writing functions named "requires"
> and "ensures". Many of the benefits you cited would work just fine with a
> PyPI-library solution, but there are some problems that are much harder to
> solve:
>
> Some of these things can be shoehorned in with frame hacks and metaclasses
> and so on, but it's not fun. ... There have been a few attempts for Python,
> but most of them seem to have run into similar problems, after a lot of
> messing around with metaclasses and so on.

As you were writing this, I was sketching out an implementation using a
callable FunctionWithContract context manager as a decorator. As you say,
the trouble seems to be elegantly capturing the function output and passing
that to an ensure or __exit__ method. The requires side isn't so bad.

Still, I'm somewhat hopeful that someone more skilled than I might be able
to write an elegant ``Contract`` type using current Python syntax.

From abarnert at yahoo.com  Mon Jan 25 20:02:26 2016
From: abarnert at yahoo.com (Andrew Barnert)
Date: Mon, 25 Jan 2016 17:02:26 -0800
Subject: [Python-ideas] intput()
In-Reply-To: 
References: 
Message-ID: <26DEA21C-76C8-4092-95AE-FA4ABF167B2C@yahoo.com>

One more, ported from VB code I found via Google:

    def nput(n, prompt):
        for i in range(n):
            tx.schedule(puts.pop())
        if prompt:
            tx.immediate = True
        tx.kick()

I think this has something to do with stock derivatives? If you want to
use this to get rich off high-frequency trading, you may want to cythonize
or numpyize it.

From mertz at gnosis.cx  Mon Jan 25 20:24:29 2016
From: mertz at gnosis.cx (David Mertz)
Date: Mon, 25 Jan 2016 17:24:29 -0800
Subject: [Python-ideas] DBC (Re: Explicit variable capture list)
In-Reply-To: 
References: <20160120003712.GZ10854@ando.pearwood.info>
	<20160121001027.GB4619@ando.pearwood.info>
	<20160125233455.GQ4619@ando.pearwood.info>
Message-ID: 

Just curious, Michael, what would you like the Python syntax version to
look like if you *can* do whatever metaclass or stack hackery that's
needed? I'm a little confused when you mention a decorator and a context
manager in the same sentence since those would seem like different
approaches. E.g.:

    @Contract(pre=my_pre, post=my_post)
    def my_fun(...):
        ...

Versus:

    with contract(pre=my_pre, post=my_post):
        def my_fun(...):
            ...

Versus:

    def my_fun(...):
        with contract(pre=my_pre, post=my_post):
            ...

I'm sure lots of other variations are possible too (if any can be made
fully to work).

On Mon, Jan 25, 2016 at 5:01 PM, Michael Selik wrote:
> [...]

-- 
Keeping medicines from the bloodstreams of the sick; food
from the bellies of the hungry; books from the hands of the
uneducated; technology from the underdeveloped; and putting
advocates of freedom in prisons.  Intellectual property is
to the 21st century what the slave trade was to the 16th.
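The first, decorator-shaped variant can at least be made to run today.
A minimal sketch, under the assumption that my_pre receives the call's
arguments and my_post additionally receives the result:

    from functools import wraps

    def Contract(pre=None, post=None):
        def decorator(func):
            @wraps(func)
            def wrapper(*args, **kwargs):
                if pre is not None:
                    pre(*args, **kwargs)            # precondition check
                result = func(*args, **kwargs)
                if post is not None:
                    post(result, *args, **kwargs)   # postcondition sees result
                return result
            return wrapper
        return decorator

Of course this is exactly the PyPI-library shape Andrew argues is not
enough: it sees arguments and the return value, but not locals, exception
state, or subclass overrides.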
From tritium-list at sdamon.com  Tue Jan 26 01:51:46 2016
From: tritium-list at sdamon.com (Alexander Walters)
Date: Tue, 26 Jan 2016 01:51:46 -0500
Subject: [Python-ideas] intput()
In-Reply-To: 
References: 
Message-ID: <56A71782.8040805@sdamon.com>

*coughs*

    intput = ast.literal_eval
    flput = ast.literal_eval

...

On 1/25/2016 10:11, Marcel O'Neil wrote:
> def intput():
>     return int(input())
>
> Life would be just marginally easier, with a punny function name as a
> bonus.

From sjoerdjob at sjec.nl  Tue Jan 26 05:47:06 2016
From: sjoerdjob at sjec.nl (Sjoerd Job Postmus)
Date: Tue, 26 Jan 2016 11:47:06 +0100
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <20160125233455.GQ4619@ando.pearwood.info>
References: <20160120003712.GZ10854@ando.pearwood.info>
	<20160121001027.GB4619@ando.pearwood.info>
	<20160125233455.GQ4619@ando.pearwood.info>
Message-ID: <20160126104706.GA18861@sjoerdjob.com>

On Tue, Jan 26, 2016 at 10:34:55AM +1100, Steven D'Aprano wrote:
> Okay, just to satisfy your curiosity, and not as a concrete proposal at
> this time, here is a sketch of the sort of thing Eiffel uses for Design
> By Contract.
> [Steven's require/ensure sketch quoted in full; snipped]
> > > > > > > Now you're making me curious. > > > Okay, just to satisfy your curiosity, and not as a concrete proposal at > this time, here is a sketch of the sort of thing Eiffel uses for Design > By Contract. > > Each function or method has an (optional, but recommended) pre-condition > and post-condition. Using a hybrid Eiffel/Python syntax, here is a toy > example: > > class Lunch: > def __init__(self, arg): > self.meat = self.spam(arg) > > def spam(self, n:int=5): > """Set the lunch meat to n servings of spam.""" > require: > # Assert the pre-conditions of the method. > assert n >= 1 > ensure: > # Assert the post-conditions of the method. > assert self.meat.startswith('Spam') > if ' ' in self.meat: > assert ' spam' in self.meat > # main body of the method, as usual > serves = ['spam']*n > serves[0] = serves.title() > self.meat = ' '.join(serves) > > > The require block runs before the body of the method, and the ensure > block runs after the body, but before the method returns to the caller. > If either fail their assertions, the method fails and raises an > exception. > > > Benefits: > > - The pre- and post-conditions make up (part of) the method's > contract, which is part of the executable documentation of > the method. Documentation tools can extract the ensure > and require sections as present them as part of the API docs. > > - The compiler can turn the contract checking on or off as > needed, with the ensure/require sections handled independently. > > - Testing pre- and post-conditions is logically separate from > the method's implementation. This allows the implementation > to vary while keeping the contract the same. > > - But at the same time, the contract is right there with the > method, not seperated in some potentially distant part of the > code base. One thing I immediately thought of was using decorators. def requires(*conditions): def decorator(func): # TODO: Do some hackery such that the signature of wrapper # matches the signature of `func`. def wrapper(*args, **kwargs): for condition in conditions assert eval(condition, {}, locals()) return func(*args, **kwargs) return wrapper return decorator def ensure(*conditions): def decorator(func): def wrapper(*args, **kwargs): try: return func(*args, **kwargs) finally: for condition in conditions: assert eval(condition, {}, locals()) return decorator Maybe do some checking for the optimization-level flag, and replace the decorator function with `return func` instead of another wrapper? The `ensure` part isn't quite to my liking yet, but I think that the `ensure` should have no need to access internal variables of the function, but only the externally visible state. (This somewhat mimics what I'm trying to fiddle around with in my own time: writing a decorator that does run-time checking of argument and return types of functions.) From steve at pearwood.info Tue Jan 26 09:26:47 2016 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 27 Jan 2016 01:26:47 +1100 Subject: [Python-ideas] DBC (Re: Explicit variable capture list) In-Reply-To: References: <20160120003712.GZ10854@ando.pearwood.info> <20160121001027.GB4619@ando.pearwood.info> <20160125233455.GQ4619@ando.pearwood.info> Message-ID: <20160126142646.GR4619@ando.pearwood.info> On Mon, Jan 25, 2016 at 05:24:29PM -0800, David Mertz wrote: > Just curious, Michael, what would you like the Python syntax version to > look like if you *can* do whatever metaclass or stack hackery that's > needed? 
From steve at pearwood.info  Tue Jan 26 09:26:47 2016
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 27 Jan 2016 01:26:47 +1100
Subject: [Python-ideas] DBC (Re: Explicit variable capture list)
In-Reply-To: 
References: <20160120003712.GZ10854@ando.pearwood.info>
	<20160121001027.GB4619@ando.pearwood.info>
	<20160125233455.GQ4619@ando.pearwood.info>
Message-ID: <20160126142646.GR4619@ando.pearwood.info>

On Mon, Jan 25, 2016 at 05:24:29PM -0800, David Mertz wrote:
> Just curious, Michael, what would you like the Python syntax version to
> look like if you *can* do whatever metaclass or stack hackery that's
> needed? I'm a little confused when you mention a decorator and a context
> manager in the same sentence since those would seem like different
> approaches. E.g.:

I'm not Michael, but since I started this discussion, I'll give an
answer. I haven't got any working code, but I think something like this
would be acceptable as a proof-of-concept. I'd use a class as a fake
namespace, with either a decorator or metaclass:

    class myfunction(metaclass=DBC):
        def myfunction(args):
            # function implementation
            ...
        def requires():
            ...
        def ensures():
            ...

The duplication of the name is a bit ugly, and it looks a bit funny for
the decorator/metaclass to take a class as input and return a function,
but we don't really have anything else that makes a good namespace.
There's functions themselves, of course, but it's hard to get at the
internals. The point is to avoid having to pre-define the pre- and
post-condition functions. We don't write this:

    def __init__(self):
        ...
    def method(self, arg):
        ...
    class MyClass(init=__init__, method=method)

and nor should we have to do the same for require/ensure.

-- 
Steve

From steve at pearwood.info  Tue Jan 26 09:40:23 2016
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 27 Jan 2016 01:40:23 +1100
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: 
References: <20160120003712.GZ10854@ando.pearwood.info>
	<20160121001027.GB4619@ando.pearwood.info>
	<20160125232136.GP4619@ando.pearwood.info>
Message-ID: <20160126144023.GS4619@ando.pearwood.info>

On Tue, Jan 26, 2016 at 10:52:59AM +1100, Chris Angelico wrote:
> On Tue, Jan 26, 2016 at 10:21 AM, Steven D'Aprano wrote:
> > - It allows us to avoid the "default argument" idiom, in cases where we
> > really don't want the argument, we just want to capture the value. There
> > are a lot of functions which have their parameter list polluted by
> > extraneous arguments that should never be used by the caller simply
> > because that's the only way to get early binding/value capturing.
>
> Can you actually name a few, please?

The random module is the first example that comes to mind. Up until
3.3, the last argument was spelled "int" with no underscore:

py> inspect.signature(random.randrange)
<Signature (start, stop=None, step=1, _int=<class 'int'>)>

random.shuffle also used to have an int=int argument, but it seems to be
gone in 3.5.

> I went digging earlier, and
> couldn't find any really good examples in the stdlib - they're mostly
> internal functions (underscore-prefixed) that shouldn't be being
> called from outside their own module anyway. Maybe this isn't as
> common an issue as I'd thought.

Obviously you can get away with more junk in a private function than a
public function, but it's still unpleasant. Even if it only affects the
maintainer of the library, not the users of it, a polluted signature is
still polluted.

-- 
Steve

From p.f.moore at gmail.com  Tue Jan 26 10:06:48 2016
From: p.f.moore at gmail.com (Paul Moore)
Date: Tue, 26 Jan 2016 15:06:48 +0000
Subject: [Python-ideas] DBC (Re: Explicit variable capture list)
In-Reply-To: <20160126142646.GR4619@ando.pearwood.info>
References: <20160120003712.GZ10854@ando.pearwood.info>
	<20160121001027.GB4619@ando.pearwood.info>
	<20160125233455.GQ4619@ando.pearwood.info>
	<20160126142646.GR4619@ando.pearwood.info>
Message-ID: 

On 26 January 2016 at 14:26, Steven D'Aprano wrote:
> class myfunction(metaclass=DBC):
>     def myfunction(args):
>         # function implementation
>         ...
>     def requires():
>         ...
>     def ensures():
>         ...
> The duplication of the name is a bit ugly, and it looks a bit funny for
> the decorator/metaclass to take a class as input and return a function,
> but we don't really have anything else that makes a good namespace

Well, classes can be callable already, so how about

    @DBC
    class myfunction:
        def __call__(self, args):
            ...
        @precondition
        def requires(self):
            ...
        @postcondition
        def ensures(self, result):
            ...

The DBC class decorator does something like

    def DBC(cls):
        def wrapper(*args, **kw):
            fn = cls()
            fn.args = args
            fn.kw = kw
            for pre in fn.__preconditions__:
                pre()
            result = fn(*args, **kw)
            for post in fn.__postconditions__:
                post(result)
            return result
        return wrapper

Pre and post conditions can access the args via self.args and self.kw.
The method decorators would let you have multiple pre- and
post-conditions. Or you could use "magic" names and omit the decorators.

Paul

From random832 at fastmail.com  Tue Jan 26 10:30:37 2016
From: random832 at fastmail.com (Random832)
Date: Tue, 26 Jan 2016 10:30:37 -0500
Subject: [Python-ideas] intput()
In-Reply-To: 
References: <56A67FB8.5020001@canterbury.ac.nz>
	<26DF7102-0657-4110-98D7-C29A64B178FD@gmail.com>
Message-ID: <1453822237.78223.502982538.3FF49DD9@webmail.messagingengine.com>

On Mon, Jan 25, 2016, at 18:01, Mahmoud Hashemi wrote:
> I tried to have fun, but my joke ended up long and maybe useful.
>
> Anyways, here's *ynput()*:
>
> https://gist.github.com/mahmoud/f23785445aff7a367f78
>
> Get yourself a True/False from a y/n.

If I were writing such a function I'd use
locale.nl_langinfo(locale.YESEXPR). (and NOEXPR) A survey of these on my
system indicates these all accept y/n, but additionally accept their own
language's term (or a related language - en_DK supports danish J/N, and
en_CA supports french O/N). Mostly these use syntax compatible with
python regex, though a few use (grouping|alternation) with no backslash.

From rosuav at gmail.com  Tue Jan 26 10:24:52 2016
From: rosuav at gmail.com (Chris Angelico)
Date: Wed, 27 Jan 2016 02:24:52 +1100
Subject: [Python-ideas] DBC (Re: Explicit variable capture list)
In-Reply-To: 
References: <20160120003712.GZ10854@ando.pearwood.info>
	<20160121001027.GB4619@ando.pearwood.info>
	<20160125233455.GQ4619@ando.pearwood.info>
	<20160126142646.GR4619@ando.pearwood.info>
Message-ID: 

On Wed, Jan 27, 2016 at 2:06 AM, Paul Moore wrote:
> Well, classes can be callable already, so how about
>
> @DBC
> class myfunction:
>     ...
>
> Pre and post conditions can access the args via self.args and self.kw.
> The method decorators would let you have multiple pre- and
> post-conditions. Or you could use "magic" names and omit the
> decorators.

I'd rather use magic names - something like:

    @DBC
    class myfunction:
        def body(self, args):
            ...
        def requires(self):
            ...
        def ensures(self, result):
            ...

and then the DBC decorator can create a __call__ method. This still
has one nasty problem though: the requires and ensures functions can't
see function arguments. You could get around this by duplicating the
argument list onto the other two, but who wants to do that?
ChrisA From p.f.moore at gmail.com Tue Jan 26 10:42:54 2016 From: p.f.moore at gmail.com (Paul Moore) Date: Tue, 26 Jan 2016 15:42:54 +0000 Subject: [Python-ideas] DBC (Re: Explicit variable capture list) In-Reply-To: References: <20160120003712.GZ10854@ando.pearwood.info> <20160121001027.GB4619@ando.pearwood.info> <20160125233455.GQ4619@ando.pearwood.info> <20160126142646.GR4619@ando.pearwood.info> Message-ID: On 26 January 2016 at 15:24, Chris Angelico wrote: > This still > has one nasty problem though: the requires and ensures functions can't > see function arguments. See my code - you can put the args onto the instance as attributes for requires/ensures to inspect. Paul From rosuav at gmail.com Tue Jan 26 10:51:08 2016 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 27 Jan 2016 02:51:08 +1100 Subject: [Python-ideas] DBC (Re: Explicit variable capture list) In-Reply-To: References: <20160120003712.GZ10854@ando.pearwood.info> <20160121001027.GB4619@ando.pearwood.info> <20160125233455.GQ4619@ando.pearwood.info> <20160126142646.GR4619@ando.pearwood.info> Message-ID: On Wed, Jan 27, 2016 at 2:42 AM, Paul Moore wrote: > On 26 January 2016 at 15:24, Chris Angelico wrote: >> This still >> has one nasty problem though: the requires and ensures functions can't >> see function arguments. > > See my code - you can put the args onto the instance as attributes for > requires/ensures to inspect. Except that there can be only one of those at any given time, so you run into issues with recursion or threads/async/etc; plus, it's still not properly clean - you have to check either args or kwargs, depending on whether the argument was passed positionally or by keyword. I don't see that as a solution. (Maybe what we need is a "keyword-to-positional" functools feature - anything in **kwargs that can be interpreted positionally gets removed and added to *args. Or the other way - keywordify everything.) ChrisA From encukou at gmail.com Tue Jan 26 11:17:55 2016 From: encukou at gmail.com (Petr Viktorin) Date: Tue, 26 Jan 2016 17:17:55 +0100 Subject: [Python-ideas] DBC (Re: Explicit variable capture list) In-Reply-To: References: <20160120003712.GZ10854@ando.pearwood.info> <20160121001027.GB4619@ando.pearwood.info> <20160125233455.GQ4619@ando.pearwood.info> <20160126142646.GR4619@ando.pearwood.info> Message-ID: <56A79C33.5000807@gmail.com> On 01/26/2016 04:51 PM, Chris Angelico wrote: > On Wed, Jan 27, 2016 at 2:42 AM, Paul Moore wrote: >> On 26 January 2016 at 15:24, Chris Angelico wrote: >>> This still >>> has one nasty problem though: the requires and ensures functions can't >>> see function arguments. >> >> See my code - you can put the args onto the instance as attributes for >> requires/ensures to inspect. > > Except that there can be only one of those at any given time, so you > run into issues with recursion or threads/async/etc; plus, it's still > not properly clean - you have to check either args or kwargs, > depending on whether the argument was passed positionally or by > keyword. I don't see that as a solution. > > (Maybe what we need is a "keyword-to-positional" functools feature - > anything in **kwargs that can be interpreted positionally gets removed > and added to *args. Or the other way - keywordify everything.) Well, it's not in functools. 
import inspect

def keyword_to_positional(func, args, kwargs):
    # Bind the call to func's signature, fill in any unsupplied
    # defaults, and return the maximally-positional form of the call.
    sig = inspect.signature(func).bind(*args, **kwargs)
    sig.apply_defaults()
    return sig.args, sig.kwargs

def keywordify_everything(func, args, kwargs):
    # Same, but as a mapping of parameter name -> value.
    sig = inspect.signature(func).bind(*args, **kwargs)
    sig.apply_defaults()
    return sig.arguments

From ethan at stoneleaf.us  Tue Jan 26 11:55:04 2016
From: ethan at stoneleaf.us (Ethan Furman)
Date: Tue, 26 Jan 2016 08:55:04 -0800
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <20160125233455.GQ4619@ando.pearwood.info>
References: <20160120003712.GZ10854@ando.pearwood.info>
	<20160121001027.GB4619@ando.pearwood.info>
	<20160125233455.GQ4619@ando.pearwood.info>
Message-ID: <56A7A4E8.1000104@stoneleaf.us>

On 01/25/2016 03:34 PM, Steven D'Aprano wrote:

> Okay, just to satisfy your curiosity, and not as a concrete proposal at
> this time, here is a sketch of the sort of thing Eiffel uses for Design
> By Contract.
> [the require/ensure Lunch sketch quoted in full; snipped]

I like that syntax.

Currently, something not too ugly would be to use descriptors --
something like:

    from dbc import require, ensure

    class Frobnigate(object):
        @require
        def spammer(self, desc):
            desc.assertInRange(desc.arg1, 0, 99)

        @spammer
        def _spammer(self, arg1, arg2):
            return arg1 // arg2 + arg1

        @spammer.ensure
        def spammer(self, desc, res):
            if desc.arg2 % 2 == 1:
                desc.assertEqual(res % 2, 1)
            else:
                desc.assertEqual(res % 2, 0)

        @ensure
        def egger(self, desc, res):
            desc.assertIsType(res, str)

        @egger
        def _egger(self, egg_type):
            'scrambled, poached, boiled, etc'
            return egg_type

Where 'desc' in the above code is 'self' for the descriptor so saved
arguments could be accessed, etc.

I put a leading underscore on the body so it could be kept separate and
more easily subclassed without losing the DBC portions.

If 'require' is not needed, one can use 'ensure'; both create the DBC
object which would take care of calling any/all requires, then the
function, then any/all ensures, and also grabbing and saving the
function signature and actual parameters.

-- 
~Ethan~

From srkunze at mail.de  Tue Jan 26 12:44:09 2016
From: srkunze at mail.de (Sven R. Kunze)
Date: Tue, 26 Jan 2016 18:44:09 +0100
Subject: [Python-ideas] PEP 484 change proposal: Allowing @overload
	outside stub files
In-Reply-To: 
References: 
Message-ID: <56A7B069.8050703@mail.de>

Overall, that's an interesting idea although I share the concern about
the visual heaviness of the proposal. Not sure if that can be resolved
properly.

I somehow like the comment idea but that doesn't fit into the remaining
concept well.
On 22.01.2016 21:00, Guido van Rossum wrote:
> Calling an @overload-decorated function is still an error (I propose
> NotImplemented).

Not sure if that applies here, but would that be rather
NotImplementedError?

Best,
Sven

From ethan at stoneleaf.us  Tue Jan 26 13:55:31 2016
From: ethan at stoneleaf.us (Ethan Furman)
Date: Tue, 26 Jan 2016 10:55:31 -0800
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <56A7A4E8.1000104@stoneleaf.us>
References: <20160120003712.GZ10854@ando.pearwood.info>
	<20160121001027.GB4619@ando.pearwood.info>
	<20160125233455.GQ4619@ando.pearwood.info>
	<56A7A4E8.1000104@stoneleaf.us>
Message-ID: <56A7C123.5040205@stoneleaf.us>

On 01/26/2016 08:55 AM, Ethan Furman wrote:

> Currently, something not too ugly would be to use descriptors --
> something like:
> [the Frobnigate example from the previous message, quoted in full;
> snipped]

The descriptor itself might look like:
-- ~Ethan~ From bzvi7919 at gmail.com Tue Jan 26 14:13:55 2016 From: bzvi7919 at gmail.com (Bar Harel) Date: Tue, 26 Jan 2016 19:13:55 +0000 Subject: [Python-ideas] intput() In-Reply-To: <1453822237.78223.502982538.3FF49DD9@webmail.messagingengine.com> References: <56A67FB8.5020001@canterbury.ac.nz> <26DF7102-0657-4110-98D7-C29A64B178FD@gmail.com> <1453822237.78223.502982538.3FF49DD9@webmail.messagingengine.com> Message-ID: ynput can use distutils.util.strtobool instead of defining for itself (just an added bonus) On Tue, Jan 26, 2016 at 5:30 PM Random832 wrote: > On Mon, Jan 25, 2016, at 18:01, Mahmoud Hashemi wrote: > > I tried to have fun, but my joke ended up long and maybe useful. > > > > Anyways, here's *ynput()*: > > > > https://gist.github.com/mahmoud/f23785445aff7a367f78 > > > > Get yourself a True/False from a y/n. > > If I were writing such a function I'd use > locale.nl_langinfo(locale.YESEXPR). (and NOEXPR) A survey of these on my > system indicates these all accept y/n, but additionally accept their own > language's term (or a related language - en_DK supports danish J/N, and > en_CA supports french O/N). Mostly these use syntax compatible with > python regex, though a few use (grouping|alternation) with no backslash. > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jimjjewett at gmail.com Tue Jan 26 14:40:01 2016 From: jimjjewett at gmail.com (Jim J. Jewett) Date: Tue, 26 Jan 2016 14:40:01 -0500 Subject: [Python-ideas] several different needs [Explicit variable capture list] Message-ID: I think a small part of the confusion is that there are at least four separate (albeit related) use cases. They all use default arguments for the current workarounds, but they don't have the same ideal solution. (1) Auxiliary variables def f(x, _len=len): ... This is often a micro-optimization; the _len keyword really shouldn't be overridden. Partly because it shouldn't be overridden, having it in the signature is just ugly. This could be solved with another separator in the signature, such as ; or a second () or a new keyword ... def f(x, aux _len=len): ... def f(x, once _len=len): ... def f(x; _len=len):... def f(x)(_len=len): ... def f(x){_len=len}: ... But realistically, that _len isn't ugly *just* because it shouldn't be overridden; it is also inherently ugly. I would prefer that something like Victor's FAT optimizer just make this idiom obsolete. (2) immutable bindings once X final Y const Z This is pretty similar to the auxiliary variables case, except that it tends to be desired more outside of functions. The immutability can be worth documenting on its own, but it feels too much like a typing declaration, which raises questions of "why *this* distinction for *this* variable?" So again, I think something like Victor's FAT optimizer (plus comments when immutability really is important) is a better long-term solution, but I'm not as sure as I was for case 1. (3) Persistent storage def f(x, _cached_results={}): ... In the past, I've managed to convince myself that it is good to be able to pass in a different cache ... 
or to turn the function into a class, so that I could get to self, or even to attach attributes to the function itself (so that rebinding the name to another function in a naive manner would fail, rather than produces bugs). Those convincings don't stick very well, though. This was clearly at least one of the motivations of some people who asked about static variables. I still think it might be nice to just have a way of easily opening a new scope ... but then I have to explain why I can't just turn the function into a class... So in the end, I suspect this use case is better off ignored, but I am pretty sure it will lead to some extra confusion if any of the others are "solved" in a way that doesn't consider it. (4) Current Value Capture This is the loop variable case that some have thought was the only case under consideration. I don't have anything to add to Andrew Barnert's https://mail.python.org/pipermail/python-ideas/2016-January/037980.html but do see Steven D'Aprano's https://mail.python.org/pipermail/python-ideas/2016-January/038047.html for gotchas even within this use case. -jJ From abarnert at yahoo.com Tue Jan 26 15:59:07 2016 From: abarnert at yahoo.com (Andrew Barnert) Date: Tue, 26 Jan 2016 12:59:07 -0800 Subject: [Python-ideas] several different needs [Explicit variable capture list] In-Reply-To: References: Message-ID: <45D92422-3E9F-4367-9DD4-1062819D6232@yahoo.com> On Jan 26, 2016, at 11:40, Jim J. Jewett wrote: > > (1) Auxiliary variables > > def f(x, _len=len): ... > > This is often a micro-optimization; When _isn't_ it a micro-optimization? I think if it isn't, it's a very different case, e.g.: def len(iterable, _len=len): if something(iterable): special_case() else: return _len(iterable) Obviously non-optimization use cases can't be solved by an optimizer. I think this is really more a special case of your #4, except that you're capturing a builtin instead of a nonlocal. > But realistically, that _len isn't ugly *just* because it shouldn't be > overridden; it is also inherently ugly. I would prefer that something > like Victor's FAT optimizer just make this idiom obsolete. But, like most micro-optimizations, you should use this only when you really need it. Which means you probably can't count on a general-purpose optimizer that may do it for you, on some people's installations. Also, marking that you're using an intentional micro-optimization is useful, even (or maybe especially) if it's ugly: it signals to any future maintainer that performance is particularly important here, and they should be careful with any changes. Of course some people will abuse that (IIRC, a couple years ago, someone removed all the "register" declarations in the perl 5 source, which not only sped it up by a small amount, but also got people to look at some other micro-optimized code from 15 years ago that was actually pessimizing things on modern platforms...), but those people are the last ones who will stop micro-optimizing because you tell them the compiler can often do it better. > (2) immutable bindings > > once X > final Y > const Z But a default value neither guarantees immutability, nor signals such an intent. Parameters can be rebound or mutated just like any other variables. > > So again, I think something like Victor's FAT optimizer (plus comments > when immutability really is important) is a better long-term solution, > but I'm not as sure as I was for case 1. How could an optimizer enforce immutability, much less signal it? 
It only makes changes that are semantically transparent, and changing a mutable binding to immutable is definitely not transparent. > (3) Persistent storage > > def f(x, _cached_results={}): ... > I still think it might be nice to just have a way of easily opening a > new scope ... You mean to open a new scope _outside_ the function definition, so it can capture the cache in a closure, without leaving it accessible from outside the scope? But then f won't be accessible either, unless you have some way to "return" the value to the parent scope. And a scope that returns something--that's just a function, isn't it? Meanwhile, a C-style function-static variable isn't really the same thing. Statics are just globals with names nobody else can see. So, for a nested function (or a method) that had a "static cache", any copies of the function would all share the same cache, while one with a closure over a cache defined in a new scope (or a default parameter value, or a class instance) would get a new cache for each copy. So, if you give people an easier way to write statics, they'd still have to use something else when they want the other. From abarnert at yahoo.com Tue Jan 26 17:10:41 2016 From: abarnert at yahoo.com (Andrew Barnert) Date: Tue, 26 Jan 2016 14:10:41 -0800 Subject: [Python-ideas] DBC (Re: Explicit variable capture list) In-Reply-To: References: <20160120003712.GZ10854@ando.pearwood.info> <20160121001027.GB4619@ando.pearwood.info> <20160125233455.GQ4619@ando.pearwood.info> <20160126142646.GR4619@ando.pearwood.info> Message-ID: <5ED5A21E-EB7A-45B5-A22D-6027B8752571@yahoo.com> There are probably a dozen DBC packages on PyPI, and dozens more that never even got that far. If this is doable without language changes, surely it'll be done on PyPI first, get traction there, and only then be considered for inclusion in the stdlib (so that it can be used to contractify parts of the stdlib), right? But, since this is fun: On Jan 26, 2016, at 07:24, Chris Angelico wrote: > >> On Wed, Jan 27, 2016 at 2:06 AM, Paul Moore wrote: >> Well, classes can be callable already, so how about >> >> @DBC >> class myfunction: >> def __call__(self, args): >> ... >> @precondition >> def requires(self): >> ... >> @postcondition >> def ensures(self, result): >> ... >> >> The DBC class decorator does something like >> >> def DBC(cls): >> def wrapper(*args, **kw): >> fn = cls() >> fn.args = args >> fn.kw = kw >> for pre in fn.__preconditions__: >> pre() >> result = fn(*args, **kw) >> for post in fn.__postconditions__: >> post(result) >> return wrapper >> >> Pre and post conditions can access the args via self.args and self.kw. >> The method decorators would let you have multiple pre- and >> post-conditions. Or you could use "magic" names and omit the >> decorators. > > I'd rather use magic names - something like: > > @DBC > class myfunction: > def body(self, args): > ... > def requires(self): > ... > def ensures(self, result): > ... > > and then the DBC decorator can create a __call__ method. This still > has one nasty problem though: the requires and ensures functions can't > see function arguments. You could get around this by duplicating the > argument list onto the other two, but who wants to do that? 
You could do this pretty easily with a macro that returns (the AST for)
something like this:

    def myfunction([func.body.params]):
        [func.requires.body]
        try:
            return_to_raise([func.body.body])
        except Return as r:
            result, exc = r.args[0], None
            [func.ensures.body]
            return result
        except Exception as exc:
            [func.ensures.body]
            raise

(I deliberately didn't write this in MacroPy style, but obviously if
you really wanted to implement this, that's how you'd do it.)

There are still a few things missing here. For example, many
postconditions are specified in terms of the pre- and post- values of
mutable parameters, with self as a very important special case. And
fitting class invariant testing into this scheme should be extra fun.
But I think it's all doable.

From greg.ewing at canterbury.ac.nz Tue Jan 26 18:59:13 2016
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 27 Jan 2016 12:59:13 +1300
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <20160125232136.GP4619@ando.pearwood.info>
References: <20160120003712.GZ10854@ando.pearwood.info>
 <20160121001027.GB4619@ando.pearwood.info>
 <20160125232136.GP4619@ando.pearwood.info>
Message-ID: <56A80851.7090606@canterbury.ac.nz>

Steven D'Aprano wrote:
> Outside of such toys, we often find ourselves closing
> over at least one variable which is derived from the loop variable, but
> not the loop variable itself:

Terry's idea of a variant of the for-loop whose body is a nested scope
(with everything that implies) would address that, because any name
assigned within the body (and not declared nonlocal) would be part of
the captured scope.

> I would not like to see "new" become a keyword.

I'm open to alternatives. Would "foreach" be better keyword material?
We could say

    foreach i in things:
        ...

although the difference between "for" and "foreach" would be far from
obvious.

I'd like to do something with "let", which is familiar from other
languages as a binding-creation construct, and it doesn't seem a likely
choice for a variable name. Maybe if we had a general statement for
introducing a new scope, independent of looping:

    let:
        ...

Then loops other than for-loops could be treated like this:

    i = 0
    while i < n:
        let:
            x = things[i]
            funcs.append(lambda: process(x))
        i += 1

The for-loop is a special case, because it assigns a variable in a
place where we can't capture it in a let-block. So we introduce a
variant:

    for let x in things:
        funcs.append(lambda: process(x))

Refinements:

1) Other special cases could be provided, but I don't think any other
special cases are strictly needed. For example, you might want:

    with open(filename) as let f:
        process(f)

but that could be written as

    with open(filename) as f:
        let:
            g = f
            process(g)

2) It may be desirable to allow assignments on the same line as "let",
e.g.

    with open(filename) as f:
        let g = f:
            process(g)

which seems marginally more readable. Also, the RHS of the assignment
would be evaluated outside the scope being created, allowing

    with open(filename) as f:
        let f = f:
            process(f)

although I'm not sure that's a style that should be encouraged. Code
that apparently assigns something to itself always looks a bit wanky
to me.
:-(

--
Greg

From mike at selik.org Tue Jan 26 19:47:07 2016
From: mike at selik.org (Michael Selik)
Date: Wed, 27 Jan 2016 00:47:07 +0000
Subject: [Python-ideas] DBC (Re: Explicit variable capture list)
In-Reply-To: 
References: <20160120003712.GZ10854@ando.pearwood.info>
 <20160121001027.GB4619@ando.pearwood.info>
 <20160125233455.GQ4619@ando.pearwood.info>
Message-ID: 

On Mon, Jan 25, 2016 at 7:24 PM David Mertz wrote:

> Just curious, Michael, what would you like the Python syntax version to
> look like if you *can* do whatever metaclass or stack hackery that's
> needed? I'm a little confused when you mention a decorator and a context
> manager in the same sentence since those would seem like different
> approaches.
>

Now that you mention it, that does seem weird. Initially the pattern of
trying to factor out a setup/cleanup feels like a context manager. But we
also need to capture the function arguments and return value. So that
feels like a decorator.

I started by implementing an abstract base class Contract that sets up the
require/ensure behavior. One inherits and overrides to implement a
particular contract. The overridden require/ensure functions would receive
the arguments/result of a decorated function.

class StayPositive(Contract):
    def require(self, *args, **kwargs):
        assert sum(args + tuple(kwargs.values())) > 0
    def ensure(self, result, *args, **kwargs):
        # ensure receives not only the result, but also same argument objs
        assert result > 0

@StayPositive
def foo(i, am, happy):
    return i + am + happy

One thing I like here is that the require/ensure doesn't clutter the
function definition with awkward decorator parameters. The contract terms
are completely separate. This does put the burden on wisely naming the
contract subclass.

The combination of decorator and context manager was unnecessary. The
internals of my Contract base class included an awkward ``with self:``. If
I were to refactor, I'd separate out a context manager helper from the
decorator object.

Seeing some of the stubs folks have written makes me think this ends with
exec-ing a template a la namedtuple.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From jimjjewett at gmail.com Tue Jan 26 20:23:11 2016
From: jimjjewett at gmail.com (Jim J. Jewett)
Date: Tue, 26 Jan 2016 20:23:11 -0500
Subject: [Python-ideas] several different needs [Explicit variable capture
 list]
In-Reply-To: <45D92422-3E9F-4367-9DD4-1062819D6232@yahoo.com>
References: <45D92422-3E9F-4367-9DD4-1062819D6232@yahoo.com>
Message-ID: 

On Tue, Jan 26, 2016 at 3:59 PM, Andrew Barnert wrote:
> On Jan 26, 2016, at 11:40, Jim J. Jewett wrote:

>> (1) Auxiliary variables

>> def f(x, _len=len): ...

>> This is often a micro-optimization;

> When _isn't_ it a micro-optimization?

It can improve readability, usually by providing a useful rename. I
have a vague sense that there might be other cases I'm forgetting,
simply because I haven't had much use for them myself.

> I think if it isn't, it's a very different case, e.g.:
>
> def len(iterable, _len=len):
>     if something(iterable): special_case()
>     else: return _len(iterable)

I would (perhaps wrongly) still have assumed that was at least
intended for optimization.

> Obviously non-optimization use cases can't be solved
> by an optimizer. I think this is really more a special case
> of your #4 ...

[#4 was current-value capture]

I almost never set out to capture a snapshot of the current
environment's values.
I get around to that solution after being annoyed that something else didn't work, but it isn't the original intent. (That might be one reason I sometimes have to stop and think about closures created in a loop.) The "this shouldn't be in the signature" and "why is something being assigned to itself" problems won't go away even if current-value capture is resolved. I suspect current-value capture would even become an attractive nuisance that led to obscure bugs when the value was captured too soon. >> But realistically, that _len isn't ugly *just* because it shouldn't be >> overridden; it is also inherently ugly. I would prefer that something >> like Victor's FAT optimizer just make this idiom obsolete. > But, like most micro-optimizations, you should use this > only when you really need it. Which means you probably > can't count on a general-purpose optimizer that may do it > for you, on some people's installations. That still argues for not making any changes to the language; I think the equivalent of (faster access to unchanged globals or builtins) is a better portability bet than new language features. > Also, marking that you're using an intentional > micro-optimization is useful, even (or maybe especially) > if it's ugly: it signals to any future maintainer that > performance is particularly important here, and they > should be careful with any changes. Changing the language to formalize that signal takes away some of the emphasis you get from ugliness. I also wouldn't assume that such speed assessments are likely to be valid across the timescales needed for adoption of new syntax. >> (2) immutable bindings >> once X >> final Y >> const Z > But a default value neither guarantees immutability, > nor signals such an intent. Parameters can be rebound > or mutated just like any other variables. It is difficult to signal "once set, this should not change" in Python, largely because it is so difficult to enforce. This case might actually be worth new syntax, or a keyword. Alternatively, it might be like const contagion, that ends up being applied too often and just adding visual noise. >> So again, I think something like Victor's FAT optimizer (plus comments >> when immutability really is important) is a better long-term solution, >> but I'm not as sure as I was for case 1. > How could an optimizer enforce immutability, much less signal it? Victor's guards can "enforce" immutability by recognizing when it fails in practice. It can't signal, but comments can ... and immutability being semantically important (as opposed to merely useful for optimization) is rare enough that I think a comment is more likely to be accurate than a type declaration. >> (3) Persistent storage >> def f(x, _cached_results={}): ... >> I still think it might be nice to just have a way of easily opening a >> new scope ... > You mean to open a new scope _outside_ the function > definition, so it can capture the cache in a closure, without > leaving it accessible from outside the scope? But then f won't > be accessible either, unless you have some way to "return" > the value to the parent scope. And a scope that returns > something--that's just a function, isn't it? It is a function plus a function call, rather than just a function. Getting that name (possible several names) bound properly in the outer scope is also beyond the abilities of a call. But "opening a new scope" can start to look a lot like creating a new class instance, yes. > Meanwhile, a C-style function-static variable isn't really > the same thing. 
Statics are just globals with names nobody > else can see. So, for a nested function (or a method) that > had a "static cache", any copies of the function would all > share the same cache, while one with a closure over a > cache defined in a new scope (or a default parameter value, > or a class instance) would get a new cache for each copy. > So, if you give people an easier way to write statics, they'd > still have to use something else when they want the other. And explaining when they want one instead of the other will still be so difficult that whichever is easier to write will become an attractive nuisance, that would only cause problems under load. -jJ From abarnert at yahoo.com Tue Jan 26 23:39:03 2016 From: abarnert at yahoo.com (Andrew Barnert) Date: Tue, 26 Jan 2016 20:39:03 -0800 Subject: [Python-ideas] several different needs [Explicit variable capture list] In-Reply-To: References: <45D92422-3E9F-4367-9DD4-1062819D6232@yahoo.com> Message-ID: On Jan 26, 2016, at 17:23, Jim J. Jewett wrote: > >> On Tue, Jan 26, 2016 at 3:59 PM, Andrew Barnert wrote: >> On Jan 26, 2016, at 11:40, Jim J. Jewett wrote: > >>> (1) Auxiliary variables > >>> def f(x, _len=len): ... > >>> This is often a micro-optimization; > >> When _isn't_ it a micro-optimization? > > It can improve readability, usually by providing a useful rename. OK, but then how could FAT, or any optimizer, help with that? >> I think if it isn't, it's a very different case, e.g.: >> >> def len(iterable, _len=len): >> if something(iterable): special_case() >> else: return _len(iterable) > > I would (perhaps wrongly) still have assumed that was at least > intended for optimization. This is how you hook a global or builtin function with special behavior for a special case, when you can't use the normal protocol (e.g., because the special case is a C extension type so you can't monkeypatch it), or want to hook it at a smaller scope than builtin. That's usually nothing to do with optimization, but with adding functionality. But, either way, it's not something an optimizer can help with anyway. >> Obviously non-optimization use cases can't be solved >> by an optimizer. I think this is really more a special case >> of your #4 ... > > [#4 was current-value capture] > > I almost never set out to capture a snapshot of the current > environment's values. I get around to that solution after being > annoyed that something else didn't work, but it isn't the original > intent. (That might be one reason I sometimes have to stop and think > about closures created in a loop.) > > The "this shouldn't be in the signature" and "why is something being > assigned to itself" problems won't go away even if current-value > capture is resolved. I suspect current-value capture would even > become an attractive nuisance that led to obscure bugs when the value > was captured too soon. You may be right here. The fact that current-value capture is currently ugly means you only use it when you need to explicitly signal something unusual, or when you have no other choice. Making it nicer could make it an attractive nuisance. >> But, like most micro-optimizations, you should use this >> only when you really need it. Which means you probably >> can't count on a general-purpose optimizer that may do it >> for you, on some people's installations. > > That still argues for not making any changes to the language; I think > the equivalent of (faster access to unchanged globals or builtins) is > a better portability bet than new language features. Sure. 
I already said I don't think anything but maybe (and probably not) the
loop-capture problem actually needs to be solved, so you don't have to
convince me. :)

When you really need the micro-optimization, which is very rare, you
will continue to spell it with the default-value trick. The rest of the
time, you don't need any way to spell it at all (and maybe FAT will
sometimes optimize things for you, but that's just gravy).

> Alternatively, it might be like const contagion, that ends
> up being applied too often and just adding visual noise.

Const contagion is a C++-specific problem. (Actually, two
problems--mixing up lvalue-const and rvalue-const incorrectly, and
having half the stdlib and half the popular third-party libraries out
there not being const-correct because they're actually C libs--but
they're both unique to C++.) Play with D or Swift for a while to see
how it can work.

>>> So again, I think something like Victor's FAT optimizer (plus comments
>>> when immutability really is important) is a better long-term solution,
>>> but I'm not as sure as I was for case 1.

>> How could an optimizer enforce immutability, much less signal it?

> Victor's guards can "enforce" immutability by recognizing when it
> fails in practice.

But that doesn't do _anything_ semantically--the code runs exactly the
same way as if FAT hadn't done anything, except maybe a bit slower. If
that's wrong, it's still just as wrong, and you still have no way of
noticing that it's wrong, much less fixing it. So FAT is completely
irrelevant here.

> It can't signal, but comments can ... and
> immutability being semantically important (as opposed to merely useful
> for optimization) is rare enough that I think a comment is more likely
> to be accurate than a type declaration.

Here I disagree completely. Why do we have tuple, or frozenset? Why do
dicts only take immutable keys? Why does the language make it easier to
build mapped/filtered copies than to modify in place? Why can immutable
objects be shared between threads or processes trivially, while mutable
objects need locks for threads and heavy "manager" objects for
processes? Mutability is a very big deal.

>>> (3) Persistent storage

>>> def f(x, _cached_results={}): ...

>>> I still think it might be nice to just have a way of easily opening a
>>> new scope ...

>> You mean to open a new scope _outside_ the function
>> definition, so it can capture the cache in a closure, without
>> leaving it accessible from outside the scope? But then f won't
>> be accessible either, unless you have some way to "return"
>> the value to the parent scope. And a scope that returns
>> something--that's just a function, isn't it?

> It is a function plus a function call, rather than just a function.
> Getting that name (possible several names) bound properly in the outer
> scope is also beyond the abilities of a call.

It isn't at all beyond the abilities of defining and calling a
function. Here's how you solve this kind of problem in JavaScript:

    var spam = function() {
        var cache = {};
        return function(n) {
            if (cache[n] === undefined) {
                cache[n] = slow_computation(n);
            }
            return cache[n];
        };
    }();

And the exact same thing works in Python:

    def _():
        cache = {}
        def spam(n):
            if n not in cache:
                cache[n] = slow_computation(n)
            return cache[n]
        return spam
    spam = _()

You just rarely do it in Python because we have better ways of doing
everything this can do.

>> Meanwhile, a C-style function-static variable isn't really
>> the same thing. Statics are just globals with names nobody
>> else can see.
>> So, for a nested function (or a method) that
>> had a "static cache", any copies of the function would all
>> share the same cache, while one with a closure over a
>> cache defined in a new scope (or a default parameter value,
>> or a class instance) would get a new cache for each copy.
>> So, if you give people an easier way to write statics, they'd
>> still have to use something else when they want the other.
>
> And explaining when they want one instead of the other will still be
> so difficult that whichever is easier to write will become an
> attractive nuisance, that would only cause problems under load.

Yes, yet another strike against C-style static variables. But, again, I
don't think this was a problem that needed solving in the first place.

From mojtaba.gharibi at gmail.com Wed Jan 27 00:25:05 2016
From: mojtaba.gharibi at gmail.com (Mirmojtaba Gharibi)
Date: Wed, 27 Jan 2016 00:25:05 -0500
Subject: [Python-ideas] Respectively and its unpacking sentence
Message-ID: 

Hello,

I'm thinking of this idea that we have a pseudo-operator called
"Respectively" and shown maybe with ;

Some examples first:

a;b;c = x1;y1;z1 + x2;y2;z2
is equivalent to
a=x1+x2
b=y1+y2
c=z1+z2

So it means for each position in the statement, do something like
respectively. It's like what I call a vertical expansion, i.e. running
statements one by one.
Then there is another unpacking operator which maybe we can show with $
sign and it operates on lists and tuples and creates the "Respectively"
version of them.
So for instance,
vec=[0]*10
$vec = $u + $v
will add two 10-dimensional vectors to each other and put the result in
vec.

I think this is a syntax that can make many things more concise, plus it
makes component-wise operation on a list done one by one easy.

For example, we can calculate the inner product between two vectors like
follows (inner product is the sum of component wise multiplication of two
vectors):

innerProduct = 0
innerProduct += $a * $b

which is equivalent to
innerProduct = 0
for i in range(len(a)):
    innerProduct += a[i]*b[i]

For example, let's say we want to apply a function to all elements in a
list, we can do:
f($a)

The $ and ; take precedence over anything except ().

Also, an important thing is that whenever we don't have the respectively
operator, such as for example in the statement above on the left hand
side, we basically use the same variable or value or operator for each
statement, or you can equivalently think we have repeated that whole thing
with ;;;;. Such as:
s=0
s;s;s += a;b;c * d;e;f
which results in s being a*d + b*e + c*f

Also, I didn't spot (at least for now) any ambiguity.
For example one might think what if we do this recursively, such as in:
x;y;z + (a;b;c);(d;e;f);(g;h;i)
using the formula above this is equivalent to
(x;x;x);(y;y;y);(z;z;z) + (a;b;c);(d;e;f);(g;h;i)
if we apply print on the statement above, the result will be:
x+a
x+b
x+c
y+d
y+e
y+f
z+g
z+h
z+i

Beware that in all of these ; or $ does not create a new list. Rather,
they are like creating new lines in the program and executing those lines
one by one (in the case of $, to be more accurate, we create for loops).

I'll appreciate your time and looking forward to hearing your thoughts.

Cheers,
Moj
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From sjoerdjob at sjec.nl Wed Jan 27 00:57:43 2016 From: sjoerdjob at sjec.nl (Sjoerd Job Postmus) Date: Wed, 27 Jan 2016 06:57:43 +0100 Subject: [Python-ideas] Respectively and its unpacking sentence In-Reply-To: References: Message-ID: <20160127055743.GB14190@sjoerdjob.com> On Wed, Jan 27, 2016 at 12:25:05AM -0500, Mirmojtaba Gharibi wrote: > Hello, > > I'm thinking of this idea that we have a pseudo-operator called > "Respectively" and shown maybe with ; Hopefully, you're already aware of sequence unpacking? Search for 'unpacking' at https://docs.python.org/2/tutorial/datastructures.html . Unfortunately, it does not have its own section I can directly link to. x, y = 3, 5 would give the same result as x = 3 y = 5 But it's more robust, as it can also deal with things like x, y = y + 1, x + 4 > > Some examples first: > > a;b;c = x1;y1;z1 + x2;y2;z2 > is equivalent to > a=x1+x2 > b=y1+y2 > c=z1+z2 So what would happen with the following? a; b = x1;a + x2;5 > > So it means for each position in the statement, do something like > respectively. It's like what I call a vertical expansion, i.e. running > statements one by one. > Then there is another unpacking operator which maybe we can show with $ > sign and it operates on lists and tuples and creates the "Respectively" > version of them. > So for instance, > vec=[]*10 > $vec = $u + $v > will add two 10-dimensional vectors to each other and put the result in vec. > > I think this is a syntax that can make many things more concise plus it > makes component wise operation on a list done one by one easy. > > For example, we can calculate the inner product between two vectors like > follows (inner product is the sum of component wise multiplication of two > vectors): > > innerProduct =0 > innerProduct += $a * $b > > which is equivalent to > innerProduct=0 > for i in range(len(a)): > ...innerProduct += a[i]+b[i] > >From what I can see, it would be very beneficial for you to look into numpy: http://www.numpy.org/ . It already provides inner product, sums of arrays and such. I myself am not very familiar with it, but I think it provides what you need. > > For example, let's say we want to apply a function to all element in a > list, we can do: > f($a) > > The $ and ; take precedence over anything except (). > > Also, an important thing is that whenever, we don't have the respectively > operator, such as for example in the statement above on the left hand side, > we basically use the same variable or value or operator for each statement > or you can equivalently think we have repeated that whole thing with ;;;;. > Such as: > s=0 > s;s;s += a;b;c; * d;e;f > which result in s being a*d+b,c*e+d*f > > Also, I didn't spot (at least for now any ambiguity). > For example one might think what if we do this recursively, such as in: > x;y;z + (a;b;c);(d;e;f);(g;h;i) > using the formula above this is equivalent to > (x;x;x);(y;y;y);(z;z;z)+(a;b;c);(d;e;f);(g;h;i) > if we apply print on the statement above, the result will be: > x+a > x+b > x+c > y+d > y+e > y+f > z+g > z+h > z+i > > Beware that in all of these ; or $ does not create a new list. Rather, they > are like creating new lines in the program and executing those lines one by > one( in the case of $, to be more accurate, we create for loops). > > I'll appreciate your time and looking forward to hearing your thoughts. Again, probably you should use numpy. I'm not really sure it warrants a change to the language, because it seems like it would really only be beneficial to those working with matrices. 
Numpy already supports it, and I'm suspecting that the use case for `a;b = c;d + e;f` can already be satisfied by `a, b = c + e, d + f`, and it already has clearly documented semantics and still works fine when one of the names on the left also appears on the right: First all the calculations on the right are performed, then they are assigned to the names on the left. > > Cheers, > Moj Kind regards, Sjoerd Job From mojtaba.gharibi at gmail.com Wed Jan 27 01:19:56 2016 From: mojtaba.gharibi at gmail.com (Mirmojtaba Gharibi) Date: Wed, 27 Jan 2016 01:19:56 -0500 Subject: [Python-ideas] Respectively and its unpacking sentence In-Reply-To: <20160127055743.GB14190@sjoerdjob.com> References: <20160127055743.GB14190@sjoerdjob.com> Message-ID: Yes, I'm aware sequence unpacking. There is an overlap like you mentioned, but there are things that can't be done with sequence unpacking, but can be done here. For example, let's say you're given two lists that are not necessarily numbers, so you can't use numpy, but you want to apply some component-wise operator between each component. This is something you can't do with sequence unpacking or with numpy. For example: $StudentFullName = $FirstName + " " + $LastName So, in effect, I think one big part of is component wise operations. Another thing that can't be achieved with sequence unpacking is: f($x) i.e. applying f for each component of x. About your question above, it's not ambiguous here either: a; b = x1;a + x2;5 is exactly "Equivalent" to a = x1+x2 b = a + 5 Also, there is a difference in style in sequence unpacking, and here. In sequence unpacking, you have to pair up the right variables and repeat the operator, for example: x,y,z = x1+x2 , y1+y2, z1+z2 Here you don't have to repeat it and pair up the right variables, i.e. x;y;z = x1;y1;z1 + x2;y2;z2 It's I think good that you (kind of) don't break the encapsulation-ish thing we have for the three values here. Also, you don't risk, making a mistake in the operator for one of the values by centralizing the operator use. For example you could make the mistake: x,y,z = x1+x2, y1-y2, z1+z2 Also there are all sort of other things that are less of a motivation for me but that cannot be done with sequence unpacking. For instance: add ; prod = a +;* y (This one I'm not sure how can be achieved without ambiguity) x;y = f;g (a;b) On Wed, Jan 27, 2016 at 12:57 AM, Sjoerd Job Postmus wrote: > On Wed, Jan 27, 2016 at 12:25:05AM -0500, Mirmojtaba Gharibi wrote: > > Hello, > > > > I'm thinking of this idea that we have a pseudo-operator called > > "Respectively" and shown maybe with ; > > Hopefully, you're already aware of sequence unpacking? Search for > 'unpacking' at https://docs.python.org/2/tutorial/datastructures.html . > Unfortunately, it does not have its own section I can directly link to. > > x, y = 3, 5 > > would give the same result as > > x = 3 > y = 5 > > But it's more robust, as it can also deal with things like > > x, y = y + 1, x + 4 > > > > Some examples first: > > > > a;b;c = x1;y1;z1 + x2;y2;z2 > > is equivalent to > > a=x1+x2 > > b=y1+y2 > > c=z1+z2 > > So what would happen with the following? > > a; b = x1;a + x2;5 > > > > > So it means for each position in the statement, do something like > > respectively. It's like what I call a vertical expansion, i.e. running > > statements one by one. > > Then there is another unpacking operator which maybe we can show with $ > > sign and it operates on lists and tuples and creates the "Respectively" > > version of them. 
> > So for instance, > > vec=[]*10 > > $vec = $u + $v > > will add two 10-dimensional vectors to each other and put the result in > vec. > > > > I think this is a syntax that can make many things more concise plus it > > makes component wise operation on a list done one by one easy. > > > > For example, we can calculate the inner product between two vectors like > > follows (inner product is the sum of component wise multiplication of two > > vectors): > > > > innerProduct =0 > > innerProduct += $a * $b > > > > which is equivalent to > > innerProduct=0 > > for i in range(len(a)): > > ...innerProduct += a[i]+b[i] > > > > From what I can see, it would be very beneficial for you to look into > numpy: http://www.numpy.org/ . It already provides inner product, sums > of arrays and such. I myself am not very familiar with it, but I think > it provides what you need. > > > > > For example, let's say we want to apply a function to all element in a > > list, we can do: > > f($a) > > > > The $ and ; take precedence over anything except (). > > > > Also, an important thing is that whenever, we don't have the respectively > > operator, such as for example in the statement above on the left hand > side, > > we basically use the same variable or value or operator for each > statement > > or you can equivalently think we have repeated that whole thing with > ;;;;. > > Such as: > > s=0 > > s;s;s += a;b;c; * d;e;f > > which result in s being a*d+b,c*e+d*f > > > > Also, I didn't spot (at least for now any ambiguity). > > For example one might think what if we do this recursively, such as in: > > x;y;z + (a;b;c);(d;e;f);(g;h;i) > > using the formula above this is equivalent to > > (x;x;x);(y;y;y);(z;z;z)+(a;b;c);(d;e;f);(g;h;i) > > if we apply print on the statement above, the result will be: > > x+a > > x+b > > x+c > > y+d > > y+e > > y+f > > z+g > > z+h > > z+i > > > > Beware that in all of these ; or $ does not create a new list. Rather, > they > > are like creating new lines in the program and executing those lines one > by > > one( in the case of $, to be more accurate, we create for loops). > > > > I'll appreciate your time and looking forward to hearing your thoughts. > > Again, probably you should use numpy. I'm not really sure it warrants a > change to the language, because it seems like it would really only be > beneficial to those working with matrices. Numpy already supports it, > and I'm suspecting that the use case for `a;b = c;d + e;f` can already > be satisfied by `a, b = c + e, d + f`, and it already has clearly > documented semantics and still works fine when one of the names on the > left also appears on the right: First all the calculations on the right > are performed, then they are assigned to the names on the left. > > > > > Cheers, > > Moj > > Kind regards, > Sjoerd Job > -------------- next part -------------- An HTML attachment was scrubbed... URL: From abarnert at yahoo.com Wed Jan 27 02:29:31 2016 From: abarnert at yahoo.com (Andrew Barnert) Date: Tue, 26 Jan 2016 23:29:31 -0800 Subject: [Python-ideas] Respectively and its unpacking sentence In-Reply-To: References: <20160127055743.GB14190@sjoerdjob.com> Message-ID: <2CB5C96E-7249-49ED-B6D7-4FB98F59CAF9@yahoo.com> On Jan 26, 2016, at 22:19, Mirmojtaba Gharibi wrote: > > Yes, I'm aware sequence unpacking. > There is an overlap like you mentioned, but there are things that can't be done with sequence unpacking, but can be done here. 
>
> For example, let's say you're given two lists that are not necessarily
> numbers, so you can't use numpy, but you want to apply some component-wise
> operator between each component. This is something you can't do with
> sequence unpacking or with numpy.

Yes, you can do it with numpy. Obviously you don't get the performance
benefits when you aren't using "native" types (like int32) and operations
that have vectorized implementations (like adding two arrays of int32 or
taking the dot product of float64 matrices), but you do still get the same
elementwise operators, and even a way to apply arbitrary callables over
arrays, or even other collections:

    >>> firsts = ['John', 'Jane']
    >>> lasts = ['Smith', 'Doe']
    >>> np.vectorize('{1}, {0}'.format)(firsts, lasts)
    array(['Smith, John', 'Doe, Jane'], dtype='<U11')

> For example:
>
> $StudentFullName = $FirstName + " " + $LastName
>
> So, in effect, I think one big part of is component wise operations.
>
> Another thing that can't be achieved with sequence unpacking is:
> f($x)
> i.e. applying f for each component of x.

That's a very different operation, which I think is more readably spelled
map(f, x).

> About your question above, it's not ambiguous here either:
> a; b = x1;a + x2;5
> is exactly "Equivalent" to
> a = x1+x2
> b = a + 5
>
> Also, there is a difference in style in sequence unpacking, and here.
> In sequence unpacking, you have to pair up the right variables and repeat
> the operator, for example:
> x,y,z = x1+x2 , y1+y2, z1+z2
> Here you don't have to repeat it and pair up the right variables, i.e.
> x;y;z = x1;y1;z1 + x2;y2;z2

If you only have two or three of these, that isn't a problem. Although in
this case, it sure looks like you're trying to add two 3D vectors, so
maybe you should just be storing 3D vectors as instances of a class (with
an __add__ method, of course), or as arrays, or as columns in a larger
array, rather than as 3 separate variables. What could be more readable
than this:

    v = v1 + v2

And if you have more than about three separate variables, you _definitely_
want some kind of array or iterable, not a bunch of separate variables.
You're worried about accidentally typing "y1-y2" when you meant "+", but
you're far more likely to screw up one of the letters or numbers than the
operator. You also can't loop over separate variables, which means you
can't factor out some logic and apply it to all three axes, or to both
vectors. Also consider how you'd do something like transposing or pivoting
or anything even fancier. If you've got a 2D array or iterable of
iterables, that's trivial: transpose or zip, etc. If you've got N*M
separate variables, you have to write them all individually. Your syntax
at best cuts the source length and opportunity for errors in half; using
collections cuts it down to 1.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From sjoerdjob at sjec.nl Wed Jan 27 02:30:04 2016
From: sjoerdjob at sjec.nl (Sjoerd Job Postmus)
Date: Wed, 27 Jan 2016 08:30:04 +0100
Subject: [Python-ideas] Respectively and its unpacking sentence
In-Reply-To: 
References: <20160127055743.GB14190@sjoerdjob.com>
Message-ID: <20160127073004.GC14190@sjoerdjob.com>

On Wed, Jan 27, 2016 at 01:19:56AM -0500, Mirmojtaba Gharibi wrote:
> Yes, I'm aware sequence unpacking.
> There is an overlap like you mentioned, but there are things that can't be
> done with sequence unpacking, but can be done here.
> > For example, let's say you're given two lists that are not necessarily > numbers, so you can't use numpy, but you want to apply some component-wise > operator between each component. This is something you can't do with > sequence unpacking or with numpy. For example: > > $StudentFullName = $FirstName + " " + $LastName > > So, in effect, I think one big part of is component wise operations. > > Another thing that can't be achieved with sequence unpacking is: > f($x) > i.e. applying f for each component of x. map(f, x) > > About your question above, it's not ambiguous here either: > a; b = x1;a + x2;5 > is exactly "Equivalent" to > a = x1+x2 > b = a + 5 Now that's confusing, that it differs from sequence unpacking. > > Also, there is a difference in style in sequence unpacking, and here. > In sequence unpacking, you have to pair up the right variables and repeat > the operator, for example: > x,y,z = x1+x2 , y1+y2, z1+z2 > Here you don't have to repeat it and pair up the right variables, i.e. > x;y;z = x1;y1;z1 + x2;y2;z2 > It's I think good that you (kind of) don't break the encapsulation-ish > thing we have for the three values here. Also, you don't risk, making a > mistake in the operator for one of the values by centralizing the operator > use. For example you could make the mistake: > x,y,z = x1+x2, y1-y2, z1+z2 > > Also there are all sort of other things that are less of a motivation for > me but that cannot be done with sequence unpacking. > For instance: > add ; prod = a +;* y (This one I'm not sure how can be achieved without > ambiguity) > x;y = f;g (a;b) > > > On Wed, Jan 27, 2016 at 12:57 AM, Sjoerd Job Postmus > wrote: > > > On Wed, Jan 27, 2016 at 12:25:05AM -0500, Mirmojtaba Gharibi wrote: > > > Hello, > > > > > > I'm thinking of this idea that we have a pseudo-operator called > > > "Respectively" and shown maybe with ; > > > > Hopefully, you're already aware of sequence unpacking? Search for > > 'unpacking' at https://docs.python.org/2/tutorial/datastructures.html . > > Unfortunately, it does not have its own section I can directly link to. > > > > x, y = 3, 5 > > > > would give the same result as > > > > x = 3 > > y = 5 > > > > But it's more robust, as it can also deal with things like > > > > x, y = y + 1, x + 4 > > > > > > Some examples first: > > > > > > a;b;c = x1;y1;z1 + x2;y2;z2 > > > is equivalent to > > > a=x1+x2 > > > b=y1+y2 > > > c=z1+z2 > > > > So what would happen with the following? > > > > a; b = x1;a + x2;5 > > > > > > > > So it means for each position in the statement, do something like > > > respectively. It's like what I call a vertical expansion, i.e. running > > > statements one by one. > > > Then there is another unpacking operator which maybe we can show with $ > > > sign and it operates on lists and tuples and creates the "Respectively" > > > version of them. > > > So for instance, > > > vec=[]*10 > > > $vec = $u + $v > > > will add two 10-dimensional vectors to each other and put the result in > > vec. > > > > > > I think this is a syntax that can make many things more concise plus it > > > makes component wise operation on a list done one by one easy. 
> > > > > > For example, we can calculate the inner product between two vectors like > > > follows (inner product is the sum of component wise multiplication of two > > > vectors): > > > > > > innerProduct =0 > > > innerProduct += $a * $b > > > > > > which is equivalent to > > > innerProduct=0 > > > for i in range(len(a)): > > > ...innerProduct += a[i]+b[i] > > > Thinking about this some more: How do you know if this is going to return a list of products, or the sum of those products? That is, why is `innerProduct += $a * $b` not equivalent to `innerProduct = $innerProduct + $a * $b`? Or is it? Not quite sure. A clearer solution would be innerProduct = sum(map(operator.mul, a, b)) But that's current-Python syntax. To be honest, I still haven't seen an added benefit that the new syntax would gain. Maybe you could expand on that? > > > > From what I can see, it would be very beneficial for you to look into > > numpy: http://www.numpy.org/ . It already provides inner product, sums > > of arrays and such. I myself am not very familiar with it, but I think > > it provides what you need. > > > > > > > > For example, let's say we want to apply a function to all element in a > > > list, we can do: > > > f($a) > > > > > > The $ and ; take precedence over anything except (). > > > > > > Also, an important thing is that whenever, we don't have the respectively > > > operator, such as for example in the statement above on the left hand > > side, > > > we basically use the same variable or value or operator for each > > statement > > > or you can equivalently think we have repeated that whole thing with > > ;;;;. > > > Such as: > > > s=0 > > > s;s;s += a;b;c; * d;e;f > > > which result in s being a*d+b,c*e+d*f > > > > > > Also, I didn't spot (at least for now any ambiguity). > > > For example one might think what if we do this recursively, such as in: > > > x;y;z + (a;b;c);(d;e;f);(g;h;i) > > > using the formula above this is equivalent to > > > (x;x;x);(y;y;y);(z;z;z)+(a;b;c);(d;e;f);(g;h;i) > > > if we apply print on the statement above, the result will be: > > > x+a > > > x+b > > > x+c > > > y+d > > > y+e > > > y+f > > > z+g > > > z+h > > > z+i > > > > > > Beware that in all of these ; or $ does not create a new list. Rather, > > they > > > are like creating new lines in the program and executing those lines one > > by > > > one( in the case of $, to be more accurate, we create for loops). > > > > > > I'll appreciate your time and looking forward to hearing your thoughts. > > > > Again, probably you should use numpy. I'm not really sure it warrants a > > change to the language, because it seems like it would really only be > > beneficial to those working with matrices. Numpy already supports it, > > and I'm suspecting that the use case for `a;b = c;d + e;f` can already > > be satisfied by `a, b = c + e, d + f`, and it already has clearly > > documented semantics and still works fine when one of the names on the > > left also appears on the right: First all the calculations on the right > > are performed, then they are assigned to the names on the left. 
> > > > > > > > Cheers, > > > Moj > > > > Kind regards, > > Sjoerd Job > > From steve at pearwood.info Wed Jan 27 07:41:03 2016 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 27 Jan 2016 23:41:03 +1100 Subject: [Python-ideas] several different needs [Explicit variable capture list] In-Reply-To: <45D92422-3E9F-4367-9DD4-1062819D6232@yahoo.com> References: <45D92422-3E9F-4367-9DD4-1062819D6232@yahoo.com> Message-ID: <20160127124103.GU4619@ando.pearwood.info> On Tue, Jan 26, 2016 at 12:59:07PM -0800, Andrew Barnert via Python-ideas wrote: > On Jan 26, 2016, at 11:40, Jim J. Jewett wrote: > > > > (1) Auxiliary variables > > > > def f(x, _len=len): ... > > > > This is often a micro-optimization; > > When _isn't_ it a micro-optimization? I think if it isn't, it's a very different case, e.g.: > > def len(iterable, _len=len): > if something(iterable): special_case() > else: return _len(iterable) I'm not sure why you call this "a very different case". It looks the same to me: both cases use the default argument trick to capture the value of a builtin name. The reasons why they do so are incidental. I sometimes have code like this: try: enumerate("", 1) except TypeError: # Too old a version of Python. def enumerate(it, start=0, enumerate=enumerate): for a, b in enumerate(it): yield (a+start, b) I don't really want an extra argument, but nor do I want a global: _enumerate = enumerate def enumerate(it, start=0): for a, b in _enumerate(it): yield (a+start, b) This isn't a matter of micro-optimization, it's a matter of encapsulation. That my enumerate calls the built-in enumerate is an implementation detail, and what I'd like is to capture the value without either a global or an extra argument: # capture the current builtin def enumerate(it, start=0)(enumerate): for a, b in enumerate(it): yield (a+start, b) Obviously I'm not going to be able to use hypothetical Python 3.6 syntax in code that needs to run in 2.5. But I might be able to use that syntax in Python 3.8 for code that needs to run in 3.6. > > (2) immutable bindings > > > > once X > > final Y > > const Z > > But a default value neither guarantees immutability, nor signals such > an intent. Parameters can be rebound or mutated just like any other > variables. I don't think this proposal has anything to say about about either immutability or bind-once-only "constants". > > (3) Persistent storage > > > > def f(x, _cached_results={}): ... > > > I still think it might be nice to just have a way of easily opening a > > new scope ... > > You mean to open a new scope _outside_ the function definition, so it > can capture the cache in a closure, without leaving it accessible from > outside the scope? But then f won't be accessible either, unless you > have some way to "return" the value to the parent scope. And a scope > that returns something--that's just a function, isn't it? I'm not sure what point you think you are making here, or what Jim meant by his comment about the new scope, but in this case I don't think we would want an extra scope. We would want the cache to be in the function's local scope, but assigned once at function definition time. When my caching needs are simple, I might write something like this: def func(x, cache={}): ... which is certainly better than having a global variable cache. For many applications (quick and dirty scripts) this is perfectly adequate. 
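For instance, a toy memoiser along those lines (slow_calc here just
stands in for whatever computation is worth caching):

def slow_calc(x):
    return x ** 3    # stand-in for some expensive computation

def func(x, cache={}):
    # the dict is created once, at definition time, and shared by calls
    if x not in cache:
        cache[x] = slow_calc(x)
    return cache[x]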
For other applications where my caching needs are more sophisticated, I
might invest the effort in writing a decorator (or use
functools.lru_cache), or a factory to hide the cache in a closure:

def factory():
    cache = {}
    def func(x):
        ...
    return func

func = factory()
del factory

but there's a middle ground where I want something less quick'n'dirty
than the first, but not going to all the effort of the second. For that,
I think that being able to capture a value fits the use-case perfectly:

def func(x)(cache={}):
    ...

> Meanwhile, a C-style function-static variable isn't really the same
> thing. Statics are just globals with names nobody else can see. So,
> for a nested function (or a method) that had a "static cache", any
> copies of the function would all share the same cache,

Copying functions is, I think, a pretty rare and advanced thing to do.
At least up to 3.4, copy.copy(func) simply returns func, so if you want
to make an actual distinct copy, you probably need to build a new
function by hand. In which case, you could copy the cache as part of
the process.

--
Steve

From jsbueno at python.org.br Wed Jan 27 07:44:57 2016
From: jsbueno at python.org.br (Joao S. O. Bueno)
Date: Wed, 27 Jan 2016 10:44:57 -0200
Subject: [Python-ideas] intput()
In-Reply-To: 
References: <56A67FB8.5020001@canterbury.ac.nz>
 <26DF7102-0657-4110-98D7-C29A64B178FD@gmail.com>
 <1453822237.78223.502982538.3FF49DD9@webmail.messagingengine.com>
Message-ID: 

for name, obj in __builtins__.__dict__.items():
    if isinstance(obj, type) and not issubclass(obj, BaseException):
        globals()[name + "put"] = lambda obj=obj, name=name: obj(
            input("Please type in a {}: ".format(name)))

On 26 January 2016 at 17:13, Bar Harel wrote:
> ynput can use distutils.util.strtobool instead of defining for itself (just
> an added bonus)
>
> On Tue, Jan 26, 2016 at 5:30 PM Random832 wrote:
>>
>> On Mon, Jan 25, 2016, at 18:01, Mahmoud Hashemi wrote:
>> > I tried to have fun, but my joke ended up long and maybe useful.
>> >
>> > Anyways, here's *ynput()*:
>> >
>> > https://gist.github.com/mahmoud/f23785445aff7a367f78
>> >
>> > Get yourself a True/False from a y/n.
>>
>> If I were writing such a function I'd use
>> locale.nl_langinfo(locale.YESEXPR). (and NOEXPR) A survey of these on my
>> system indicates these all accept y/n, but additionally accept their own
>> language's term (or a related language - en_DK supports danish J/N, and
>> en_CA supports french O/N). Mostly these use syntax compatible with
>> python regex, though a few use (grouping|alternation) with no backslash.
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> https://mail.python.org/mailman/listinfo/python-ideas
>> Code of Conduct: http://python.org/psf/codeofconduct/
>
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

From victor.stinner at gmail.com Wed Jan 27 10:39:10 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Wed, 27 Jan 2016 16:39:10 +0100
Subject: [Python-ideas] PEP 511: Add a check function to decide if a
 "language extension" code transformer should be used or not
Message-ID: 

Hi,

Thank you for all feedback on my PEP 511. It looks like the current
blocker point is the unclear status of "language extensions": code
transformers which deliberately change the Python semantics. I would
like to discuss how we should register them.
I think that the PEP 511 must discuss "language extensions" even if it
doesn't have to propose a solution to make their usage easier. They are
an obvious usage of code transformers. If possible, I would like to
find a compromise to support them, but make it explicit that they
change the Python semantics.

By the way, I discussed with Joseph Jevnik who wrote codetransformer
(bytecode transformer) and lazy_python (AST transformer). He wrote to me:

"One concern that I have though is that transformers are registered
globally. I think that the decorators in codetransformer do a good job
of signalling to reader the scope of some new code generation."

Currently, the PEP 511 doesn't provide a way to register a code
transformer but only use it under some conditions. For example, if
fatoptimizer is registered, all .pyc files will be called
file.cpython-36.fat-0.pyc even if fatoptimizer was disabled.

I propose to change the design of sys.set_code_transformers() to use
it more like a registry similar to the codecs registry
(codecs.register), but different (details below). A difference is that
the codecs registry uses a mapping (codec name => codec functions),
whereas sys.set_code_transformers() uses an ordered sequence (list) of
code transformers. A sequence is used because multiple code
transformers can be applied sequentially on a single .py file.

Petr Viktorin wrote that language extensions "target specific modules,
with which they're closely coupled: The modules won't run without the
transformer. And with other modules, the transformer either does
nothing (as with MacroPy, hopefully), or would fail altogether (as
with Hy). So, they would benefit from specific packages opting in. The
effects of enabling them globally range from inefficiency (MacroPy) to
failures or needing workarounds (Hy)."

Problem (A): solutions proposed below don't make code transformers
mandatory. If some code *requires* a code transformer and the code
transformer is not registered, Python doesn't complain. Do you think
that it is a real issue in practice? For MacroPy, it's not a problem
in practice since functions must be decorated using a decorator from
the macropy package. If importing macropy fails, the module cannot be
imported.

Problem (B): proposed solutions below add markers to ask to enable a
specific code transformer, but a code transformer can decide to always
modify the Python semantics without using such a marker. According to
Nick Coghlan, code transformers changing the Python semantics *must*
require a marker in the code using them. IMHO it's the responsibility
of the author of the code transformer to use markers, not the
responsibility of Python.

Code transformers should maybe return a flag telling if they changed
the code or not. I prefer a flag rather than comparing the output to
the input, since the comparison can be expensive, especially for a
deep AST tree. Example:

    class Optimizer:
        def ast_optimizer(self, tree, context):
            # ...
            return modified, tree

*modified* must be True if tree was modified.

There are several options to decide if a code transformer must be used
on a specific source file.

(1) Add check_code() and check_ast() functions to code transformers.
The code transformer is responsible for deciding if it wants to
transform the code or not. Python doesn't use the code transformer if
the check method returns False.
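To give an idea of the shape of such an API, here is a sketch (the
method names and the opt-in marker are only examples, nothing here is
part of the PEP yet):

    import ast

    class MyExtension:
        name = "my_ext"

        def check_ast(self, tree, context):
            # Opt in only for modules which contain "import my_ext" at
            # the top level; other modules are left untouched.
            return any(isinstance(node, ast.Import)
                       and any(alias.name == "my_ext" for alias in node.names)
                       for node in tree.body)

        def ast_transformer(self, tree, context):
            # ... modify the tree here ...
            return tree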
Examples:

* MacroPy can search for the "import macropy" statement (or "from
  macropy import ...") in the AST tree
* fatoptimizer can search for "__fatoptimizer__ = {'enabled': False}"
  in the code: if this variable is found, the optimizer is completely
  skipped

(2) Petr proposed to extend importlib to pass a code transformer when
importing a module.

    importlib.util.import_with_transformer(
        'mypackage.specialmodule', MyTransformer())

IMHO this option is too specific: it's restricted to importlib
(py_compile, compileall and the interactive interpreter don't have the
feature). I also dislike the API.

(3) Petr also proposed "a special flag in packages":

    __transformers_for_submodules__ = [MyTransformer()]

I don't like having to get access to MyTransformer. The PEP 511
mentions a use case where the transformed code is run *without*
registering the transformer. But this issue can easily be fixed by
using a string to identify the transformer in the registry (ex: "fat")
rather than its class.

I'm not sure that putting a flag on the package (package/__init__.py?)
is a good idea. I would prefer to enable language extensions on
individual files to restrict their scope.

(4) Sjoerd Job Postmus proposed something similar but using a comment,
and not for packages but for any source file:

    #:Transformers modname.TransformerClassName, modname.OtherTransformerClassName

The problem is that comments are not stored in the AST tree. I would
prefer to use the AST to decide if an AST transformer should be used
or not.

Note: I'm not really motivated to extend the AST to start to include
comments, or even code formatting (spaces, newlines, etc.).
https://pypi.python.org/pypi/redbaron/ can be used if you want to
transform a .py file without touching the format. But I don't think
that the AST should go in this direction. I prefer to keep the AST
simple.

(5) Nick proposed (indirectly) to use a different filename (don't use
".py") for language extensions.

This option works with my option (2): the context contains the
filename which can be used to decide whether or not to enable the code
transformer.

I understand that the code transformer must also install an importlib
hook to search for filenames other than .py files. Am I right?

(6) Nick proposed (indirectly) to use an encoding cookie "which are
visible as a comment in the module header".

Again, I dislike this option because comments are not stored in the
AST.

Victor

From abarnert at yahoo.com  Wed Jan 27 11:14:15 2016
From: abarnert at yahoo.com (Andrew Barnert)
Date: Wed, 27 Jan 2016 08:14:15 -0800
Subject: [Python-ideas] several different needs [Explicit variable capture list]
In-Reply-To: <20160127124103.GU4619@ando.pearwood.info>
References: <45D92422-3E9F-4367-9DD4-1062819D6232@yahoo.com>
	<20160127124103.GU4619@ando.pearwood.info>
Message-ID: <8221007C-DD14-4D72-A30C-CCD8A1FE06D9@yahoo.com>

On Jan 27, 2016, at 04:41, Steven D'Aprano wrote:

I think you're actually agreeing with me: there _aren't_ four different
cases people actually want here, just the one we've all been talking
about, and FAT is irrelevant to that case, so this subthread is
ultimately just a distraction. (We may still disagree about whether
the one case needs a solution, or what the best solution would be, but
we won't get anywhere by talking about different and unrelated things
like this distraction.)

But, in case I'm wrong about that, I'll answer your replies anyway:

>> On Tue, Jan 26, 2016 at 12:59:07PM -0800, Andrew Barnert via Python-ideas wrote:
>>> On Jan 26, 2016, at 11:40, Jim J.
Jewett wrote: >>> >>> (1) Auxiliary variables >>> >>> def f(x, _len=len): ... >>> >>> This is often a micro-optimization; >> >> When _isn't_ it a micro-optimization? I think if it isn't, it's a very different case, e.g.: > > I'm not sure why you call this "a very different case". Because Jim's point was that FAT could do this automatically for him, so we don't need any syntax for it at all. That works for the optimization case, but it doesn't work for your case. Therefore, they're different. Put another way: Without the default-value trick, his function means the same thing, so if he could rely on FAT, he could just stop using len=len. Without the default value trick, your function means something very different (a RecursionError), so you can't stop using enumerate=enumerate, with or without FAT, unless there's some other equally explicit syntax you can use instead. Moreover, your case is really no different from his case #4, the case everyone else has been talking about: you want to capture the value of enumerate at function definition time. >>> (2) immutable bindings >>> >>> once X >>> final Y >>> const Z >> >> But a default value neither guarantees immutability, nor signals such >> an intent. Parameters can be rebound or mutated just like any other >> variables. > > I don't think this proposal has anything to say about about either > immutability or bind-once-only "constants". Jim insists that it's one of the four things people use default values for, and one of the things people want from this proposal, and that FAT can make that desire irrelevant. I think he's wrong on all three counts: you can't use default values for constness, nobody cares whether any of these new proposals can be used for constness, and FAT can't help anyone who does want constness. >>> (3) Persistent storage >>> >>> def f(x, _cached_results={}): ... >> >>> I still think it might be nice to just have a way of easily opening a >>> new scope ... >> >> You mean to open a new scope _outside_ the function definition, so it >> can capture the cache in a closure, without leaving it accessible from >> outside the scope? But then f won't be accessible either, unless you >> have some way to "return" the value to the parent scope. And a scope >> that returns something--that's just a function, isn't it? > > I'm not sure what point you think you are making here, or what Jim > meant by his comment about the new scope, but in this case I don't > think we would want an extra scope. My point is that if you want to open a new scope to attach variables to a function, you can already do that by defining and calling a function. Which you very rarely actually need to do, so we don't need to make it any easier. So the fact that no variants of this proposal make it easier is irrelevant. >> Meanwhile, a C-style function-static variable isn't really the same >> thing. Statics are just globals with names nobody else can see. So, >> for a nested function (or a method) that had a "static cache", any >> copies of the function would all share the same cache, > > Copying functions is, I think, a pretty rare and advanced thing to do. I'm not talking about literally copying functions. I'm talking about nested functions using the same code object for each closure that gets created, and methods using the same code and function object for every bound method that gets created. 
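To make that concrete, here's a toy example (plain current Python,
nothing from the proposal):

    def make_counter():
        count = 0  # a fresh cell for each call to make_counter
        def counter():
            nonlocal count
            count += 1
            return count
        return counter

    c1 = make_counter()
    c2 = make_counter()
    assert c1.__code__ is c2.__code__   # one shared code object...
    assert c1() == 1 and c2() == 1      # ...but separate state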
Using a C-style static variable in these cases means that all your
closures, or all your methods from different instances, would share the
same state, which is not the same behavior as the other alternatives he
suggested were equivalent.

This one, unlike his other points, isn't completely irrelevant. A
C-style static declaration could actually serve some of the cases that
the proposal is meant to serve. But it can't serve others, and it
confusingly looks like it can serve more than it can, which makes it a
confusing side track to bring up.

From encukou at gmail.com  Wed Jan 27 11:36:47 2016
From: encukou at gmail.com (Petr Viktorin)
Date: Wed, 27 Jan 2016 17:36:47 +0100
Subject: [Python-ideas] PEP 511: Add a check function to decide if a
	"language extension" code transformer should be used or not
In-Reply-To: 
References: 
Message-ID: <56A8F21F.4010000@gmail.com>

On 01/27/2016 04:39 PM, Victor Stinner wrote:
> Hi,
>
> Thank you for all feedback on my PEP 511. It looks like the current
> blocker point is the unclear status of "language extensions": code
> transformers which deliberately change the Python semantics. I would
> like to discuss how we should register them. I think that the PEP 511
> must discuss "language extensions" even if it doesn't have to propose
> a solution to make their usage easier. They are an obvious usage of
> code transformers. If possible, I would like to find a compromise to
> support them, but make it explicit that they change the Python
> semantics.
>
> By the way, I discussed with Joseph Jevnik who wrote codetransformer
> (bytecode transformer) and lazy_python (AST transformer). He wrote to me:
>
> "One concern that I have though is that transformers are registered
> globally. I think that the decorators in codetransformer do a good job
> of signalling to reader the scope of some new code generation."
>
>
> Currently, the PEP 511 doesn't provide a way to register a code
> transformer but only use it under some conditions. For example, if
> fatoptimizer is registered, all .pyc files will be called
> file.cpython-36.fat-0.pyc even if fatoptimizer was disabled.
>
> I propose to change the design of sys.set_code_transformers() to use
> it more like a registry similar to the codecs registry
> (codecs.register), but different (details below). A difference is that
> the codecs registry uses a mapping (codec name => codec functions),
> whereas sys.set_code_transformers() uses an ordered sequence (list) of
> code transformers. A sequence is used because multiple code
> transformers can be applied sequentially on a single .py file.
>
>
> Petr Viktorin wrote that language extensions "target specific modules,
> with which they're closely coupled: The modules won't run without the
> transformer. And with other modules, the transformer either does
> nothing (as with MacroPy, hopefully), or would fail altogether (as
> with Hy). So, they would benefit from specific packages opting in. The
> effects of enabling them globally range from inefficiency (MacroPy) to
> failures or needing workarounds (Hy)."
>
>
> Problem (A): solutions proposed below don't make code transformers
> mandatory. If some code *requires* a code transformer and the code
> transformer is not registered, Python doesn't complain. Do you think
> that it is a real issue in practice? For MacroPy, it's not a problem
> in practice since functions must be decorated using a decorator from
> the macropy package. If importing macropy fails, the module cannot be
> imported.
>
>
> Problem (B): proposed solutions below add markers to ask to enable a
> specific code transformer, but a code transformer can decide to always
> modify the Python semantics without using such a marker. According to
> Nick Coghlan, code transformers changing the Python semantics *must*
> require a marker in the code using them. IMHO it's the responsibility
> of the author of the code transformer to use markers, not the
> responsibility of Python.

I believe Nick meant that if a transformer modifies semantics of
un-marked code, it would be considered a badly written transformer
that doesn't play well with the rest of the language. The
responsibility of Python is just to make it easy to do the right
thing.

> Code transformers should maybe return a flag telling if they changed
> the code or not. I prefer a flag rather than comparing the output to
> the input, since the comparison can be expensive, especially for a
> deep AST tree. Example:
>
>     class Optimizer:
>         def ast_optimizer(self, tree, context):
>             # ...
>             return modified, tree
>
> *modified* must be True if tree was modified.

What would this flag be useful for?

> There are several options to decide if a code transformer must be used
> on a specific source file.
>
> [...]
> (2) Petr proposed to extend importlib to pass a code transformer when
> importing a module.
>
>     importlib.util.import_with_transformer(
>         'mypackage.specialmodule', MyTransformer())
>
> [...]
> (5) Nick proposed (indirectly) to use a different filename (don't use
> ".py") for language extensions.
>
> This option works with my option (2): the context contains the
> filename which can be used to decide whether or not to enable the
> code transformer.
>
> I understand that the code transformer must also install an importlib
> hook to search for filenames other than .py files. Am I right?

Yes, you are. But once a custom import hook is in place, you can just
use a regular import, the hack in (2) isn't necessary.

Also, note that this would solve problem (A) -- without the hook
enabled, the source won't be found.

From victor.stinner at gmail.com  Wed Jan 27 11:44:56 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Wed, 27 Jan 2016 17:44:56 +0100
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: 
References: <569A2452.1000709@gmail.com>
	<20160116162235.GB3208@sjoerdjob.com>
Message-ID: 

Hi,

2016-01-16 17:56 GMT+01:00 Kevin Conway :
> I'm a big fan of your motivation to build an optimizer for cPython code.
> What I'm struggling with is understanding why this requires a PEP and
> language modification. There are already several projects that manipulate
> the AST for performance gains such as [1] or even my own ham fisted attempt
> [2].

Oh cool, I didn't know PyCC [2]! I added it to the PEP 511 in the AST
optimizers section of Prior Art.

I wrote astoptimizer [1], and this project uses monkey-patching of the
compile() function; I mentioned this monkey-patching hack in the
rationale of the PEP:
https://www.python.org/dev/peps/pep-0511/#rationale

I would like to avoid monkey-patching because it causes various
issues. The PEP 511 also makes transformations more visible:
transformers are explicitly registered in sys.set_code_transformers()
and the .pyc filename is modified when the code is transformed.

It also adds a new feature: it becomes possible to run transformed
code without having to register the transformer at runtime. This is
made possible with the addition of the -o command line option.
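For example, using the "fat-0" tag from the .pyc filename mentioned
above (a sketch; the exact spelling of the tag may differ):

    python3 -o fat-0 application.py

Here the file.cpython-36.fat-0.pyc files are used directly, without
fatoptimizer being registered (or even installed).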
Victor

From abarnert at yahoo.com  Wed Jan 27 11:48:47 2016
From: abarnert at yahoo.com (Andrew Barnert)
Date: Wed, 27 Jan 2016 08:48:47 -0800
Subject: [Python-ideas] PEP 511: Add a check function to decide if a
	"language extension" code transformer should be used or not
In-Reply-To: 
References: 
Message-ID: 

On Jan 27, 2016, at 07:39, Victor Stinner wrote:
>
> Hi,
>
> Thank you for all feedback on my PEP 511. It looks like the current
> blocker point is the unclear status of "language extensions": code
> transformers which deliberately change the Python semantics. I would
> like to discuss how we should register them. I think that the PEP 511
> must discuss "language extensions" even if it doesn't have to propose
> a solution to make their usage easier. They are an obvious usage of
> code transformers. If possible, I would like to find a compromise to
> support them, but make it explicit that they change the Python
> semantics.

Is this really necessary?

If someone is testing a language change locally, and just wants to use
your (original) API for his tests instead of the more complicated
alternative of building an import hook, it works fine. If he can't
deploy that way, that's fine.

If someone builds a transformer that adds a feature in a way that
makes it a pure superset of Python, he should be fine with running it
on all files, so your API works fine. And if some files that didn't
use any of the new features get .pyc files that imply they did, so
what?

If someone builds a transformer that only runs on files with a
different extension, he already needs an import hook, so he might as
well just call his transformer from the import hook, same as he does
today.

So... What case is served by this new, more complicated API that
wasn't already served by your original, simple one (remembering that
import hooks are already there as a fallback)?

> By the way, I discussed with Joseph Jevnik who wrote codetransformer
> (bytecode transformer) and lazy_python (AST transformer). He wrote to me:
>
> "One concern that I have though is that transformers are registered
> globally. I think that the decorators in codetransformer do a good job
> of signalling to reader the scope of some new code generation."
>
> Currently, the PEP 511 doesn't provide a way to register a code
> transformer but only use it under some conditions. For example, if
> fatoptimizer is registered, all .pyc files will be called
> file.cpython-36.fat-0.pyc even if fatoptimizer was disabled.

That doesn't really answer his question, unless you're trying to add
some syntax that's like a decorator but for an entire module, to be
used in addition to the existing more local class and function
decorators?

> Petr Viktorin wrote that language extensions "target specific modules,
> with which they're closely coupled: The modules won't run without the
> transformer. And with other modules, the transformer either does
> nothing (as with MacroPy, hopefully), or would fail altogether (as
> with Hy). So, they would benefit from specific packages opting in. The
> effects of enabling them globally range from inefficiency (MacroPy) to
> failures or needing workarounds (Hy)."

It seems like you're trying to find a declarative alternative to every
possible use for an imperative import hook. If you can pull that off,
it would be cool--but is it really necessary for your proposal? Does
your solution have to make it possible for MacroPy and Hy to drop
their complicated import hooks and just register transformers, for it
to be a useful solution?
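(By the original simple API, I mean global registration along these
lines--a sketch based on my reading of the current draft, so the
details may be off:)

    import sys

    class MyTransformer:
        name = "myext"  # ends up in the .pyc tag, e.g. .cpython-36.myext-0.pyc

        def ast_transformer(self, tree, context):
            # ... rewrite the whole tree, for every module ...
            return tree

    sys.set_code_transformers([MyTransformer()])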
If the problem you're trying to solve is just making it easier for
MacroPy and Hy to coexist with the new transformers, maybe just solve
that. For example, if it's too hard for them to decorate .pyc names in
a way that fits in with your system, maybe adding a function to get
the pre-hook pyc name and to set the post-hook one (e.g., to insert
"-pymacro-" in the middle of it) would be sufficient.

If there's something that can't be solved in a similar way--e.g., if
you think your proposal has to make macropy.console (or whatever he
calls the "macros in the REPL" feature) either automatic or at least a
lot easier--then maybe that's a different story, but it would be nice
to see the rationale for why we need to solve that today. (Couldn't it
be added in 3.7, after people have gotten experience with using 3.6
transformers?)

From mojtaba.gharibi at gmail.com  Wed Jan 27 12:12:07 2016
From: mojtaba.gharibi at gmail.com (Mirmojtaba Gharibi)
Date: Wed, 27 Jan 2016 12:12:07 -0500
Subject: [Python-ideas] Respectively and its unpacking sentence
In-Reply-To: <20160127073004.GC14190@sjoerdjob.com>
References: <20160127055743.GB14190@sjoerdjob.com>
	<20160127073004.GC14190@sjoerdjob.com>
Message-ID: 

I think the component-wise operation is the biggest benefit and a more
compact and understandable syntax. For example,

    innerProduct = sum(map(operator.mul, a, b))

is much more complex than

    innerProduct += $a * $b

MATLAB has a built-in easy way of achieving component-wise operation
and I think Python would benefit from that without use of libraries
such as numpy.

Regarding your question about the difference between

    innerProduct += $a * $b

and

    innerProduct = $innerProduct + $a * $b

The second statement returns an error. I mentioned in my initial email
that $ applies to a list or a tuple. Here I explicitly set my
innerProduct=0 initially, which you omitted in your example.

    innerProduct += $a * $b

is equivalent to

    for i in range(len(a)):
    ...innerProduct += a[i]*b[i]

On Wed, Jan 27, 2016 at 2:30 AM, Sjoerd Job Postmus wrote:
> On Wed, Jan 27, 2016 at 01:19:56AM -0500, Mirmojtaba Gharibi wrote:
> > Yes, I'm aware sequence unpacking.
> > There is an overlap like you mentioned, but there are things that can't be
> > done with sequence unpacking, but can be done here.
> >
> > For example, let's say you're given two lists that are not necessarily
> > numbers, so you can't use numpy, but you want to apply some component-wise
> > operator between each component. This is something you can't do with
> > sequence unpacking or with numpy. For example:
> >
> > $StudentFullName = $FirstName + " " + $LastName
> >
> > So, in effect, I think one big part of is component wise operations.
> >
> > Another thing that can't be achieved with sequence unpacking is:
> > f($x)
> > i.e. applying f for each component of x.
>
> map(f, x)
>
> >
> > About your question above, it's not ambiguous here either:
> > a; b = x1;a + x2;5
> > is exactly "Equivalent" to
> > a = x1+x2
> > b = a + 5
>
> Now that's confusing, that it differs from sequence unpacking.
>
> >
> > Also, there is a difference in style in sequence unpacking, and here.
> > In sequence unpacking, you have to pair up the right variables and repeat
> > the operator, for example:
> > x,y,z = x1+x2 , y1+y2, z1+z2
> > Here you don't have to repeat it and pair up the right variables, i.e.
> > x;y;z = x1;y1;z1 + x2;y2;z2
> > It's I think good that you (kind of) don't break the encapsulation-ish
> > thing we have for the three values here. Also, you don't risk, making a
Also, you don't risk, making a > > mistake in the operator for one of the values by centralizing the > operator > > use. For example you could make the mistake: > > x,y,z = x1+x2, y1-y2, z1+z2 > > > > Also there are all sort of other things that are less of a motivation for > > me but that cannot be done with sequence unpacking. > > For instance: > > add ; prod = a +;* y (This one I'm not sure how can be achieved without > > ambiguity) > > x;y = f;g (a;b) > > > > > > On Wed, Jan 27, 2016 at 12:57 AM, Sjoerd Job Postmus > > wrote: > > > > > On Wed, Jan 27, 2016 at 12:25:05AM -0500, Mirmojtaba Gharibi wrote: > > > > Hello, > > > > > > > > I'm thinking of this idea that we have a pseudo-operator called > > > > "Respectively" and shown maybe with ; > > > > > > Hopefully, you're already aware of sequence unpacking? Search for > > > 'unpacking' at https://docs.python.org/2/tutorial/datastructures.html > . > > > Unfortunately, it does not have its own section I can directly link to. > > > > > > x, y = 3, 5 > > > > > > would give the same result as > > > > > > x = 3 > > > y = 5 > > > > > > But it's more robust, as it can also deal with things like > > > > > > x, y = y + 1, x + 4 > > > > > > > > Some examples first: > > > > > > > > a;b;c = x1;y1;z1 + x2;y2;z2 > > > > is equivalent to > > > > a=x1+x2 > > > > b=y1+y2 > > > > c=z1+z2 > > > > > > So what would happen with the following? > > > > > > a; b = x1;a + x2;5 > > > > > > > > > > > So it means for each position in the statement, do something like > > > > respectively. It's like what I call a vertical expansion, i.e. > running > > > > statements one by one. > > > > Then there is another unpacking operator which maybe we can show > with $ > > > > sign and it operates on lists and tuples and creates the > "Respectively" > > > > version of them. > > > > So for instance, > > > > vec=[]*10 > > > > $vec = $u + $v > > > > will add two 10-dimensional vectors to each other and put the result > in > > > vec. > > > > > > > > I think this is a syntax that can make many things more concise plus > it > > > > makes component wise operation on a list done one by one easy. > > > > > > > > For example, we can calculate the inner product between two vectors > like > > > > follows (inner product is the sum of component wise multiplication > of two > > > > vectors): > > > > > > > > innerProduct =0 > > > > innerProduct += $a * $b > > > > > > > > which is equivalent to > > > > innerProduct=0 > > > > for i in range(len(a)): > > > > ...innerProduct += a[i]+b[i] > > > > > > Thinking about this some more: > > How do you know if this is going to return a list of products, or the > sum of those products? > > That is, why is `innerProduct += $a * $b` not equivalent to > `innerProduct = $innerProduct + $a * $b`? Or is it? Not quite sure. > > A clearer solution would be > > innerProduct = sum(map(operator.mul, a, b)) > > But that's current-Python syntax. > > To be honest, I still haven't seen an added benefit that the new syntax > would gain. Maybe you could expand on that? > > > > > > > From what I can see, it would be very beneficial for you to look into > > > numpy: http://www.numpy.org/ . It already provides inner product, sums > > > of arrays and such. I myself am not very familiar with it, but I think > > > it provides what you need. > > > > > > > > > > > For example, let's say we want to apply a function to all element in > a > > > > list, we can do: > > > > f($a) > > > > > > > > The $ and ; take precedence over anything except (). 
> > > > > > > > Also, an important thing is that whenever, we don't have the > respectively > > > > operator, such as for example in the statement above on the left hand > > > side, > > > > we basically use the same variable or value or operator for each > > > statement > > > > or you can equivalently think we have repeated that whole thing with > > > ;;;;. > > > > Such as: > > > > s=0 > > > > s;s;s += a;b;c; * d;e;f > > > > which result in s being a*d+b,c*e+d*f > > > > > > > > Also, I didn't spot (at least for now any ambiguity). > > > > For example one might think what if we do this recursively, such as > in: > > > > x;y;z + (a;b;c);(d;e;f);(g;h;i) > > > > using the formula above this is equivalent to > > > > (x;x;x);(y;y;y);(z;z;z)+(a;b;c);(d;e;f);(g;h;i) > > > > if we apply print on the statement above, the result will be: > > > > x+a > > > > x+b > > > > x+c > > > > y+d > > > > y+e > > > > y+f > > > > z+g > > > > z+h > > > > z+i > > > > > > > > Beware that in all of these ; or $ does not create a new list. > Rather, > > > they > > > > are like creating new lines in the program and executing those lines > one > > > by > > > > one( in the case of $, to be more accurate, we create for loops). > > > > > > > > I'll appreciate your time and looking forward to hearing your > thoughts. > > > > > > Again, probably you should use numpy. I'm not really sure it warrants a > > > change to the language, because it seems like it would really only be > > > beneficial to those working with matrices. Numpy already supports it, > > > and I'm suspecting that the use case for `a;b = c;d + e;f` can already > > > be satisfied by `a, b = c + e, d + f`, and it already has clearly > > > documented semantics and still works fine when one of the names on the > > > left also appears on the right: First all the calculations on the right > > > are performed, then they are assigned to the names on the left. > > > > > > > > > > > Cheers, > > > > Moj > > > > > > Kind regards, > > > Sjoerd Job > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jimjjewett at gmail.com Wed Jan 27 12:27:55 2016 From: jimjjewett at gmail.com (Jim J. Jewett) Date: Wed, 27 Jan 2016 12:27:55 -0500 Subject: [Python-ideas] several different needs [Explicit variable capture list] In-Reply-To: References: <45D92422-3E9F-4367-9DD4-1062819D6232@yahoo.com> Message-ID: TLDR: An "extra" defaulted parameter is used for many slightly different reasons ... even a perfect solution for one of them risks being an attractive nuisance for the others. On Tue, Jan 26, 2016 at 11:39 PM, Andrew Barnert wrote: > On Jan 26, 2016, at 17:23, Jim J. Jewett wrote: >>> On Tue, Jan 26, 2016 at 3:59 PM, Andrew Barnert wrote: >>> On Jan 26, 2016, at 11:40, Jim J. Jewett wrote: >>>> (1) Auxiliary variables >>>> def f(x, _len=len): ... ... >> It can improve readability, usually by providing a useful rename. > OK, but then how could FAT, or any optimizer, help with that? It can't ... and that does argue for aux variables (or a let), but ... would good usage be swamped by abuse? You also brought up the case of augmenting a builtin or global, but still delegating to the original ... I forgot that case, and didn't even notice that you were rebinding the captured name. In those cases, the mechanical intent is "capture the old way", but the higher level intent is to specialize it. This should probably look more like inheritance (or multimethods or advice and dispatch) ... 
so even if it deserves a language change, capture-current-value idioms
wouldn't really be an improvement over the current workaround.

...

>>>> So again, I think something like Victor's FAT optimizer (plus comments
>>>> when immutability really is important) ...

>>> How could an optimizer enforce immutability, much less signal it?

>> Victor's guards can "enforce" immutability by recognizing when it
>> fails in practice.

> But that doesn't do _anything_ semantically--the code runs
> exactly the same way as if FAT hadn't done anything,
> except maybe a bit slower. If that's wrong, it's still just as
> wrong, and you still have no way of noticing that it's wrong,
> much less fixing it. So FAT is completely irrelevant here.

Using the specific guards he proposes, yes. But something like FAT
could provide more active guards that raise an exception, or swap the
original value back into place, or even actively prevent the
modification. Whether these should be triggered by a declaration in
front of the name, or by a module-level freeze statement, or ... there
are enough possibilities that I don't think a specific solution should
be enshrined in the language yet.

>> It can't signal, but comments can ... and
>> immutability being semantically important (as opposed to merely useful
>> for optimization) is rare enough that I think a comment is more likely
>> to be accurate than a type declaration.

> Here I disagree completely. Why do we have tuple,
> or frozenset? Why do dicts only take immutable keys?
> Why does the language make it easier to build
> mapped/filtered copies in place? Why can immutable
> objects be shared between threads or processes trivially,
> while mutable objects need locks for threads and heavy
> "manager" objects for processes? Mutability is a very big deal.

Those are all "if you're living with these restrictions anyhow, and
you tell the compiler, the program can run faster." None of those
sound important in terms of "What does this program (eventually) do?"

(Obviously, when immutability actually *is* important, and an
appropriate immutable data type exists, then *not* using it would send
a bad signal.)

>>> You mean to open a new scope _outside_ the function
>>> definition, so it can capture the cache in a closure, without
>>> leaving it accessible from outside the scope? But then f won't
>>> be accessible either, unless you have some way to "return"
>>> the value to the parent scope. And a scope that returns
>>> something--that's just a function, isn't it?

>> It is a function plus a function call, rather than just a function.
>> Getting that name (possibly several names) bound properly in the outer
>> scope is also beyond the abilities of a call.

> It isn't at all beyond the abilities of defining and calling a function.
> Here's how you solve this kind of problem in JavaScript:
>
>     var spam = function(n) {
>         var cache = {};
>         return function(n) {
>             if (cache[n] === undefined) {
>                 cache[n] = slow_computation(n);
>             }
>             return cache[n];
>         };
>     }();

That still doesn't bind n1, n2, n3 in the enclosing scope -- it only
binds spam, from which you can reach spam(n1), spam(n2), etc.

I guess I'm (occasionally) looking for something more like

    class _Scope:
        ...
    for attr in dir(_Scope):
        if not attr.startswith("_"):
            locals()[attr] = _Scope[attr]

-jJ

From p.f.moore at gmail.com  Wed Jan 27 12:54:06 2016
From: p.f.moore at gmail.com (Paul Moore)
Date: Wed, 27 Jan 2016 17:54:06 +0000
Subject: [Python-ideas] Respectively and its unpacking sentence
In-Reply-To: 
References: <20160127055743.GB14190@sjoerdjob.com>
	<20160127073004.GC14190@sjoerdjob.com>
Message-ID: 

On 27 January 2016 at 17:12, Mirmojtaba Gharibi wrote:
> innerProduct = sum(map(operator.mul, a, b))
> is much more complex than
> innerProduct += $a * $b

Certainly the second is *shorter*. But it's full of weird "magic"
behaviour that I don't even begin to know how to explain in general
terms (i.e., without having to appeal to specific examples):

- Why does += magically initialise innerProduct to 0 before doing the
implied loop? Would it initialise to '' if a and b were lists of
strings?
- What would *= or -= or ... initialise to? Why?
- What does $a mean, in isolation from a larger expression?
- How do I generalise my understanding of this expression to work out
what innerProduct += $a * b means?
- Given that omitting the $ before one or both of the variables
totally changes the meaning, how bad of a bug magnet is this?
- What if a and b are different lengths? Why does the length of the
unrelated list b affect the meaning of the expression $a (i.e.,
there's a huge context sensitivity here).
- How do I pronounce $a? What is the name of the $ "operator"? "*" is
called "multiply", to give an example of what I mean.

Oh, and your "standard Python" implementation of inner product is not
the most readable (which is a matter of opinion, certainly) approach,
so you're asking a loaded question. An alternative way of writing it
would be

    innerProduct = sum(x*y for x, y in zip(a, b))

Variable names that aren't 1-character would probably help the
"normal" version. I can't be sure if they'd help or harm the proposed
version. Probably wouldn't make much difference.

Sorry, but I see no particular value in this proposal, and many issues
with it. So -1 from me.
Paul

From mojtaba.gharibi at gmail.com  Wed Jan 27 12:55:33 2016
From: mojtaba.gharibi at gmail.com (Mirmojtaba Gharibi)
Date: Wed, 27 Jan 2016 12:55:33 -0500
Subject: [Python-ideas] Respectively and its unpacking sentence
In-Reply-To: <2CB5C96E-7249-49ED-B6D7-4FB98F59CAF9@yahoo.com>
References: <20160127055743.GB14190@sjoerdjob.com>
	<2CB5C96E-7249-49ED-B6D7-4FB98F59CAF9@yahoo.com>
Message-ID: 

On Wed, Jan 27, 2016 at 2:29 AM, Andrew Barnert wrote:
> On Jan 26, 2016, at 22:19, Mirmojtaba Gharibi wrote:
>
> Yes, I'm aware sequence unpacking.
> There is an overlap like you mentioned, but there are things that can't be
> done with sequence unpacking, but can be done here.
>
> For example, let's say you're given two lists that are not necessarily
> numbers, so you can't use numpy, but you want to apply some component-wise
> operator between each component. This is something you can't do with
> sequence unpacking or with numpy.
>
>
> Yes, you can do it with numpy.
> > Obviously you don't get the performance benefits when you aren't using > "native" types (like int32) and operations that have vectorizes > implementations (like adding two arrays of int32 or taking the dot product > of float64 matrices), but you do still get the same elementwise operators, > and even a way to apply arbitrary callables over arrays, or even other > collections: > > >>> firsts = ['John', 'Jane'] > >>> lasts = ['Smith', 'Doe'] > >>> np.vectorize('{1}, {0}'.format)(firsts, lasts) > array(['Smith, John', 'Doe, Jane'], dtype=' > I think the form I am suggesting is simpler and more readable. I'm happy you brought vectorize to my attention though. I think as soon you make the statement just a bit complex, it would become really complicated with vectorize. For example lets say you have x=[1,2,3,4,5,...] y=['A','BB','CCC',...] p=[2,3,4,6,6,...] r=[]*n $r = str(len($y*$p)+$x) It would be really complex to calculate such a thing with vectorize. All I am saving on is basically a for-loop and the indexing. We don't really have to use numpy,etc. I think it's much easier to just use for-loop and indexing, if you don't like the syntax. So I think the question is, does my syntax bring enough convenience to avoid for-loop and indexing. For example the above could be equivalently written as for i in range(0,len(r)): ...r[i] = str(len(y[i]*p[i])+x[i]) So that's the whole saving. Just a for-loop and indexing operator. > That's everything you're asking for, with even more flexibility, with no > need for any new ugly perlesque syntax: just use at least one np.array type > in an operator expression, call a method on an array type, or wrap a > function in vectorize, and everything is elementwise. > > And of course when you actually _are_ using numbers, as in every single > one of your examples, using numpy also gives you around a 6:1 space and > 20:1 time savings, which is a nice bonus. > > For example: > > $StudentFullName = $FirstName + " " + $LastName > > So, in effect, I think one big part of is component wise operations. > > Another thing that can't be achieved with sequence unpacking is: > f($x) > i.e. applying f for each component of x. > > > That's a very different operation, which I think is more readably spelled > map(f, x). > > About your question above, it's not ambiguous here either: > a; b = x1;a + x2;5 > is exactly "Equivalent" to > a = x1+x2 > b = a + 5 > > Also, there is a difference in style in sequence unpacking, and here. > In sequence unpacking, you have to pair up the right variables and repeat > the operator, for example: > x,y,z = x1+x2 , y1+y2, z1+z2 > Here you don't have to repeat it and pair up the right variables, i.e. > x;y;z = x1;y1;z1 + x2;y2;z2 > > > If you only have two or three of these, that isn't a problem. Although in > this case, it sure looks like you're trying to add two 3D vectors, so > maybe you should just be storing 3D vectors as instances of a class (with > an __add__ method, of course), or as arrays, or as columns in a larger > array, rather than as 3 separate variables. What could be more readable > than this: > > v = v1 + v2 > > And if you have more than about three separate variables, you _definitely_ > want some kind of array or iterable, not a bunch of separate variables. > You're worried about accidentally typing "y1-y2" when you meant "+", but > you're far more likely to screw up one of the letters or numbers than the > operator. 
> You also can't loop over separate variables, which means you
> can't factor out some logic and apply it to all three axes, or to both
> vectors. Also consider how you'd do something like transposing or
> pivoting or anything even fancier. If you've got a 2D array or iterable
> of iterables, that's trivial: transpose or zip, etc. If you've got N*M
> separate variables, you have to write them all individually. Your
> syntax at best cuts the source length and opportunity for errors in
> half; using collections cuts it down to 1.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From mojtaba.gharibi at gmail.com  Wed Jan 27 13:02:40 2016
From: mojtaba.gharibi at gmail.com (Mirmojtaba Gharibi)
Date: Wed, 27 Jan 2016 13:02:40 -0500
Subject: [Python-ideas] Respectively and its unpacking sentence
In-Reply-To: 
References: <20160127055743.GB14190@sjoerdjob.com>
	<20160127073004.GC14190@sjoerdjob.com>
Message-ID: 

I think a lot of your questions are answered in my very first email.
Stuff about initialization: I had initialized my variable, but Sjoerd
dropped it when giving his example. Please refer to the very first
email.

Regarding how to explain the behaviour in simple terms, I also refer
you to my very first email. Basically it's a pair of (kind of)
operators I called Respectively and unpacking. You can read about them
more extensively there.

It's supposed that in a pairwise operation like this, you provide
lists of identical length. If a and b are different lengths, my idea
is that we just go as far as the length of the first list in the
operation, or alternatively the biggest list, and then throw an
exception for instance.

On Wed, Jan 27, 2016 at 12:54 PM, Paul Moore wrote:
> On 27 January 2016 at 17:12, Mirmojtaba Gharibi
> wrote:
> > innerProduct = sum(map(operator.mul, a, b))
> > is much more complex than
> > innerProduct += $a * $b
>
> Certainly the second is *shorter*. But it's full of weird "magic"
> behaviour that I don't even begin to know how to explain in general
> terms (i.e., without having to appeal to specific examples):
>
> - Why does += magically initialise innerProduct to 0 before doing the
> implied loop? Would it initialise to '' if a and b were lists of
> strings?
> - What would *= or -= or ... initialise to? Why?
> - What does $a mean, in isolation from a larger expression?
> - How do I generalise my understanding of this expression to work out
> what innerProduct += $a * b means?
> - Given that omitting the $ before one or both of the variables
> totally changes the meaning, how bad of a bug magnet is this?
> - What if a and b are different lengths? Why does the length of the
> unrelated list b affect the meaning of the expression $a (i.e.,
> there's a huge context sensitivity here).
> - How do I pronounce $a? What is the name of the $ "operator"? "*" is
> called "multiply", to give an example of what I mean.
>
> Oh, and your "standard Python" implementation of inner product is not
> the most readable (which is a matter of opinion, certainly) approach,
> so you're asking a loaded question. An alternative way of writing it
> would be
>
>     innerProduct = sum(x*y for x, y in zip(a, b))
>
> Variable names that aren't 1-character would probably help the
> "normal" version. I can't be sure if they'd help or harm the proposed
> version. Probably wouldn't make much difference.
>
> Sorry, but I see no particular value in this proposal, and many issues
> with it. So -1 from me.
> Paul
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From abarnert at yahoo.com  Wed Jan 27 13:13:18 2016
From: abarnert at yahoo.com (Andrew Barnert)
Date: Wed, 27 Jan 2016 10:13:18 -0800
Subject: [Python-ideas] Respectively and its unpacking sentence
In-Reply-To: 
References: <20160127055743.GB14190@sjoerdjob.com>
	<20160127073004.GC14190@sjoerdjob.com>
Message-ID: <225FBF8A-815A-4A2D-B0C8-3DA4E45A6E77@yahoo.com>

On Jan 27, 2016, at 09:12, Mirmojtaba Gharibi wrote:
>
> I think the component-wise operation is the biggest benefit and a more
> compact and understandable syntax.
> For example,
>
> innerProduct = sum(map(operator.mul, a, b))
> is much more complex than
> innerProduct += $a * $b
>
> MATLAB has a built-in easy way of achieving component-wise operation
> and I think Python would benefit from that without use of libraries
> such as numpy.

Why? What's wrong with using numpy? It seems like the only problem in
your initial post was that you thought numpy can't do what you want,
when in fact it can, and trivially so. Adding the same amount of
complexity to the base language wouldn't make it any more
discoverable--it would just mean that _all_ Python users now have the
potential to be confused, rather than only Python+numpy users, which
sounds like a step backward.

Also, this is going to sound like a rhetorical, or even baited,
question, but it's not intended that way: what's wrong with APL, or J,
or MATLAB, and what makes you want to use Python instead? I'll bet
that, directly or indirectly, the reason is the simplicity,
consistency, and readability of Python. If you make Python more
cryptic and dense, there's a very good chance it'll end up less
readable than J rather than more, which would defeat the entire
purpose.

Also, while we're at it, if you want the same features as APL and
MATLAB, why invent a very different syntax instead of just using their
syntax? Most proposals for adding elementwise computation to the base
language suggest adding array operators like .+ that work the same way
on all types, not adding object-wrapping operators that turn a list or
a bunch of separate objects into some hidden type that overloads the
normal + operator to be elementwise. What's the rationale for doing it
your way instead of the usual way? (I can see one pretty good
answer--consistency with numpy--but I don't think it's what you have
in mind.)

> Regarding your question about the difference between
> innerProduct += $a * $b
> and
> innerProduct = $innerProduct + $a * $b
>
> The second statement returns an error. I mentioned in my initial email
> that $ applies to a list or a tuple.
> Here I explicitly set my innerProduct=0 initially, which you omitted
> in your example.
>
> innerProduct += $a * $b
> is equivalent to
> for i in range(len(a)):
> ...innerProduct += a[i]*b[i]
>
>> On Wed, Jan 27, 2016 at 2:30 AM, Sjoerd Job Postmus wrote:
>> On Wed, Jan 27, 2016 at 01:19:56AM -0500, Mirmojtaba Gharibi wrote:
>> > Yes, I'm aware sequence unpacking.
>> > There is an overlap like you mentioned, but there are things that can't be
>> > done with sequence unpacking, but can be done here.
>> >
>> > For example, let's say you're given two lists that are not necessarily
>> > numbers, so you can't use numpy, but you want to apply some component-wise
>> > operator between each component. This is something you can't do with
>> > sequence unpacking or with numpy. For example:
>> >
>> > $StudentFullName = $FirstName + " " + $LastName
>> >
>> > So, in effect, I think one big part of is component wise operations.
>> > > Cheers,
>> > > Moj
>> >
>> > Kind regards,
>> > Sjoerd Job
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From sjoerdjob at sjec.nl  Wed Jan 27 13:58:22 2016
From: sjoerdjob at sjec.nl (Sjoerd Job Postmus)
Date: Wed, 27 Jan 2016 19:58:22 +0100
Subject: [Python-ideas] Respectively and its unpacking sentence
In-Reply-To: 
References: <20160127055743.GB14190@sjoerdjob.com>
	<2CB5C96E-7249-49ED-B6D7-4FB98F59CAF9@yahoo.com>
Message-ID: <20160127185822.GD14190@sjoerdjob.com>

On Wed, Jan 27, 2016 at 12:55:33PM -0500, Mirmojtaba Gharibi wrote:
> On Wed, Jan 27, 2016 at 2:29 AM, Andrew Barnert wrote:
>
> For example lets say you have
> x=[1,2,3,4,5,...]
> y=['A','BB','CCC',...]
> p=[2,3,4,6,6,...]
> r=[]*n
>
> $r = str(len($y*$p)+$x)

There are several (current-Python) solutions I can already see:

    r = [str(len(yv * pv) + xv) for xv, yv, pv in zip(x, y, p)]

    r = map(lambda xv, yv, pv: str(len(yv * pv) + xv), x, y, p)

    # Assuming x, y, p are numpy arrays
    r = np.vectorize(lambda xv, yv, pv: str(len(yv * pv) + xv))(x, y, p)

Furthermore, the `str(len(y * p) + x)` is supposed to actually do
something, I presume. Why does that not have a name? Foobarize?

    r = [foobarize(xv, yv, pv) for xv, yv, pv in zip(x, y, p)]

    r = map(foobarize, x, y, p)

    r = np.vectorize(foobarize)(x, y, p)

or in your syntax

    $r = foobarize($x, $y, $p)

I assume?

Also, supposing `f` is a function of two arguments:

    $r = f(x, $y)

means

    r = [f(x, y_val) for y_val in y]

And

    $r = f($x, y)

means

    r = [f(x_val, y) for x_val in x]

Then what does

    $r = f($x, $y)

mean? I suppose you want it to mean

    r = [f(x_val, y_val) for x_val, y_val in zip(x, y)]
      = map(f, x, y)

which can be confusing if `x` and `y` have different lengths. Maybe

    r = [f(x_val, y_val) for x_val in x for y_val in y]

or

    r = [f(x_val, y_val) for y_val in y for x_val in x]

?

Besides the questionable benefit of shorter syntax, I think this would
actually not be a good case. Numpy, list/generator comprehensions and
the map/zip builtins already provide more than enough ways to do it.
Why add yet another syntax?

No, you don't have to use numpy. If you don't need it, please don't
use it. But do not forget that the standard set of builtins is already
powerful enough to give you what you want.

Python is a general-purpose programming language (though often used in
sciency stuff). Matlab is a 'matrix lab' language. If a language's
only purpose is working with matrices, then please, go ahead and build
matrix-specific syntax. In my experience, Python has a lot more
purposes than just matrix manipulation. Codebases I've worked on had
use for the `$` operator you're suggesting in too few lines of code to
bother learning the extra syntax.

I'm definitely -1 on yet another syntax when there are already
multiple obvious ways to solve the same problem: numpy,
comprehensions, map.

(Not sure if I even have the right to vote here, given that I'm not a
core developer, but just giving my opinion.)
So I think the question is, > does my syntax bring enough convenience to avoid for-loop and indexing. > For example the above could be equivalently written as > for i in range(0,len(r)): > ...r[i] = str(len(y[i]*p[i])+x[i]) > So that's the whole saving. Just a for-loop and indexing operator. And I listed some of the ways you can save the loop + indexing. That doesn't need new syntax. > > > > > > That's everything you're asking for, with even more flexibility, with no > > need for any new ugly perlesque syntax: just use at least one np.array type > > in an operator expression, call a method on an array type, or wrap a > > function in vectorize, and everything is elementwise. > > > > And of course when you actually _are_ using numbers, as in every single > > one of your examples, using numpy also gives you around a 6:1 space and > > 20:1 time savings, which is a nice bonus. > > > > For example: > > > > $StudentFullName = $FirstName + " " + $LastName > > > > So, in effect, I think one big part of is component wise operations. > > > > Another thing that can't be achieved with sequence unpacking is: > > f($x) > > i.e. applying f for each component of x. > > > > > > That's a very different operation, which I think is more readably spelled > > map(f, x). > > > > About your question above, it's not ambiguous here either: > > a; b = x1;a + x2;5 > > is exactly "Equivalent" to > > a = x1+x2 > > b = a + 5 > > > > Also, there is a difference in style in sequence unpacking, and here. > > In sequence unpacking, you have to pair up the right variables and repeat > > the operator, for example: > > x,y,z = x1+x2 , y1+y2, z1+z2 > > Here you don't have to repeat it and pair up the right variables, i.e. > > x;y;z = x1;y1;z1 + x2;y2;z2 > > > > > > If you only have two or three of these, that isn't a problem. Although in > > this case, it sure looks like you're trying to add two 3D vectors, so > > maybe you should just be storing 3D vectors as instances of a class (with > > an __add__ method, of course), or as arrays, or as columns in a larger > > array, rather than as 3 separate variables. What could be more readable > > than this: > > > > v = v1 + v2 > > > > And if you have more than about three separate variables, you _definitely_ > > want some kind of array or iterable, not a bunch of separate variables. > > You're worried about accidentally typing "y1-y2" when you meant "+", but > > you're far more likely to screw up one of the letters or numbers than the > > operator. You also can't loop over separate variables, which means you > > can't factor out some logic and apply it to all three axes, or to both > > vectors. Also consider how you'd do something like transposing or pivoting > > or anything even fancier. If you've got a 2D array or iterable of > > iterables, that's trivial: transpose or zip, etc. If you've got N*M > > separate variables, you have to write them all individually. Your syntax at > > best cuts the source length and opportunity for errors in half; using > > collections cuts it down to 1. > > > > From srkunze at mail.de Wed Jan 27 14:06:26 2016 From: srkunze at mail.de (Sven R. Kunze) Date: Wed, 27 Jan 2016 20:06:26 +0100 Subject: [Python-ideas] PEP 511: Add a check function to decide if a "language extension" code transformer should be used or not In-Reply-To: References: Message-ID: <56A91532.7000805@mail.de> Hi, On 27.01.2016 16:39, Victor Stinner wrote: > "One concern that I have though is that transformers are registered > globally. 
I think that the decorators in codetransformer do a good job > of signalling to reader the scope of some new code generation." I share this concern but haven't a good solution right now. Admittedly, I already have a use-case where I would like to apply a transformation which is NOT an optimization but a global extension. So, the discussion about allowing global extension really made me think about whether that is really a good idea. *BUT* it would allow me to experiment and find out if the risk is worth it. (use-case: adding some hooks before entering and leaving all try blocks) > Currently, the PEP 511 doesn't provide a way to register a code > transformer but only use it under some conditions. For example, if > fatoptimizer is registered, all .pyc files will be called > file.cpython-36.fat-0.pyc even if fatoptimizer was disabled. > > I propose to change the design of sys.set_code_transformers() to use > it more like a registry similar to the codecs registry > (codecs.register), but different (details below). A difference is that > the codecs registry uses a mapping (codec name => codec functions), > whereas sys.set_code_transformers() uses an ordered sequence (list) of > code transformers. A sequence is used because multiple code > transformers can be applied sequentially on a single .py file. How does it change the interface for the users? (I mean besides the renaming). I still like your idea of having the following three options: 1) global optimizers 2) local extensions --> via codec or import hook 3) global extension --> use with care So, I assume we talk about specifying 2). > Petr Viktorin wrote that language extensions "target specific modules, > with which they're closely coupled: The modules won't run without the > transformer. And with other modules, the transformer either does > nothing (as with MacroPy, hopefully), or would fail altogether (as > with Hy). So, they would benefit from specific packages opting in. The > effects of enabling them globally range from inefficiency (MacroPy) to > failures or needing workarounds (Hy)." > > > Problem (A): solutions proposed below don't make code tranformers > mandatory. If a code *requires* a code transformer and the code > transformer is not registered, Python doesn't complain. Do you think > that it is a real issue in practice? For MacroPy, it's not a problem > in practice since functions must be decorated using a decorator from > the macropy package. If importing macropy fails, the module cannot be > imported. Sounds good. > Problem (B): proposed solutions below adds markers to ask to enable a > specific code transformer, but a code transformer can decide to always > modify the Python semantics without using such marker. According to > Nick Coghlan, code transformers changing the Python semantics *must* > require a marker in the code using them. IMHO it's the responsability > of the author of the code transformer to use markers, not the > responsability of Python. I agree with Nick. Be explicit. > Code transformers should maybe return a flag telling if they changed > the code or not. I prefer a flag rather than comparing the output to > the input, since the comparison can be expensive, especially for a > deep AST tree. Example: > > class Optimizer: > def ast_optimizer(self, tree, context): > # ... > return modified, tree > > *modified* must be True if tree was modified. Not sure if that is needed. If we don't have an immediate use-case, simpler is better. 
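For readers following along, a minimal sketch of the transformer shape
being debated, assuming the draft interface exactly as quoted above (the
name attribute, the (modified, tree) return value, and
sys.set_code_transformers() are all proposals, not released APIs):

    class MyOptimizer:
        # In the PEP draft, the name is used to tag .pyc filenames,
        # e.g. file.cpython-36.myopt-0.pyc ("myopt" is a made-up name).
        name = "myopt"

        def ast_optimizer(self, tree, context):
            # A real transformer would rewrite `tree` here and set
            # `modified` only when something actually changed.
            modified = False
            return modified, tree

    # Proposed registration call, per the PEP draft:
    # import sys
    # sys.set_code_transformers([MyOptimizer()])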
> There are several options to decide if a code transformer must be used > on a specific source file. The user should decide, otherwise there is too much magic involved: a marker (source file) or an option (cmdline). I am indifferent whether the marker should be a codec-decl or an import hook. But it should be file-local (at least I would prefer that). All of the options below seem to involve too much magic for my taste (or I didn't understand them correctly). > (1) Add a check_code() and check_ast() functions to code transformers. > The code transformer is responsible to decide if it wants to transform > the code or not. Python doesn't use the code transformer if the check > method returns False. > > Examples: > > * MacroPy can search for the "import macropy" statement (of "from > macropy import ...") in the AST tree > * fatoptimizer can search for "__fatoptimizer__ = {'enabled': False}" > in the code: if this variable is found, the optimizer is completly > skipped > > > (2) Petr proposed to extend importlib to pass a code transformer when > importing a module. > > importlib.util.import_with_transformer( > 'mypackage.specialmodule', MyTransformer()) > > IMHO this option is too specific: it's restricted to importlib > (py_compile, compileall and interactive interpreter don't have the > feature). I also dislike the API. > > > (3) Petr also proposed "a special flag in packages": > > __transformers_for_submodules__ = [MyTransformer()] > > I don't like having to get access to MyTransformer. The PEP 511 > mentions an use case where the transformed code is run *without* > registering the transformer. But this issue can easily be fixed by > using the string to identify the transformer in the registery (ex: > "fat") rather than its class. > > I'm not sure that putting a flag on the package (package/__init__.py?) > is a good idea. I would prefer to enable language extensions on > individual files to restrict their scope. > > > (4) Sjoerd Job Postmus proposed something similar but using a comment > and not for packages, but any source file: > > #:Transformers modname.TransformerClassName, > modname.OtherTransformerClassName > > The problem is that comments are not stored in the AST tree. I would > prefer to use AST to decide if an AST transformer should be used or > not. > > Note: I'm not really motived to extend the AST to start to include > comments, or even code formatting (spaces, newlines, etc.). > https://pypi.python.org/pypi/redbaron/ can be used if you want to > transform a .py file without touching the format. But I don't think > that AST must go to this direction. I prefer to keep AST simple. > > > (5) Nick proposed (indirectly) to use a different filename (don't use > ".py") for language extensions. > > This option works with my option (2): the context contains the > filename which can be used to decide to enable or not the code > transformer. > > I understand that the code transformer must also install an importlib > hook to search for other filenames than only .py files. Am I right? > > > (6) Nick proposed (indirectly) to use an encoding cookie "which are > visible as a comment in the module header". > > Again, I dislike this option because comments are not stored in AST. Best, Sven -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 

From random832 at fastmail.com  Wed Jan 27 15:17:55 2016
From: random832 at fastmail.com (Random832)
Date: Wed, 27 Jan 2016 15:17:55 -0500
Subject: [Python-ideas] Respectively and its unpacking sentence
In-Reply-To: <225FBF8A-815A-4A2D-B0C8-3DA4E45A6E77@yahoo.com>
References: <20160127055743.GB14190@sjoerdjob.com>
 <20160127073004.GC14190@sjoerdjob.com>
 <225FBF8A-815A-4A2D-B0C8-3DA4E45A6E77@yahoo.com>
Message-ID: <1453925875.1389884.504418858.15B771FC@webmail.messagingengine.com>

> On Jan 27, 2016, at 09:12, Mirmojtaba Gharibi wrote:
>
> I think the component-wise operation is the biggest benefit and a more
> compact and understandable syntax.
>
> For example,
>
> innerProduct = sum(map(operator.mul, a, b))
>
> is much more complex than
>
> innerProduct += $a * $b

Frankly, I'd prefer simply innerProduct = sum($a * $b) -- I'm not sure
how you can reasonably define all the semantics of all operators in all
combinations in a way that makes your "+=" work.

Furthermore, I think your expressions could also get hairy.

a = [1, 2]
b = [3, 4]
c = 5

a * $b = [[1, 2]*3, [1, 2]*4] = [[1, 2, 1, 2, 1, 2], [1, 2, 1, 2, 1, 2, 1, 2]]
$a * b = [1*[3, 4], 2*[3, 4]] = [[3, 4], [3, 4, 3, 4]]
$a * $b = [1*3, 2*4] = [3, 8]
($a * $b) * c = [3, 8] * 5 = [3, 8, 3, 8, 3, 8, 3, 8, 3, 8]
# and let's ignore the associativity problems for the moment
$($a * $b) * c = $[3, 8] * 5 = [3*5, 8*5] = [15, 40]
# oh, look, we have to put $ on an arbitrary expression, not just a name

Do you need multiple $ signs to operate on multiple dimensions? If not,
why not? (Arguably, sequence repetition should be a different operator
than multiplication anyway, but that ship has long sailed.)

> MATLAB has a built-in easy way of achieving component-wise operation
> and I think Python would benefit from that without use of libraries
> such as numpy.

On Wed, Jan 27, 2016, at 13:13, Andrew Barnert via Python-ideas wrote:
> Why? What's wrong with using numpy?
>
> It seems like the only problem in your initial post was that you thought
> numpy can't do what you want, when in fact it can, and trivially so.
> Adding the same amount of complexity to the base language wouldn't make
> it any more discoverable--it would just mean that _all_ Python users now
> have the potential to be confused, rather than only Python+numpy users,
> which sounds like a step backward.

My impression is that the ultimate idea is to allow/require/recommend a
post-numpy library to use the same syntax for these semantics, so that
the base semantics with the plain operators are not different between
post-numpy and base Python, in order to make post-numpy less confusing
than numpy. I.e. the semantics when operating on sequences of numbers
ought to be defined solely by the syntax (not confusing, even if it's
more complex than what we have now), rather than by what library the
sequence object comes from (confusing).

From brett at python.org  Wed Jan 27 15:20:20 2016
From: brett at python.org (Brett Cannon)
Date: Wed, 27 Jan 2016 20:20:20 +0000
Subject: [Python-ideas] PEP 511: Add a check function to decide if a
 "language extension" code transformer should be used or not
In-Reply-To: 
References: 
Message-ID: 

On Wed, 27 Jan 2016 at 08:49 Andrew Barnert via Python-ideas <
python-ideas at python.org> wrote:

> On Jan 27, 2016, at 07:39, Victor Stinner wrote:
> >
> > Hi,
> >
> > Thank you for all feedback on my PEP 511. It looks like the current
> > blocker point is the unclear status of "language extensions": code
> > transformers which deliberately change the Python semantics.
I would > > like to discuss how we should register them. I think that the PEP 511 > > must discuss "language extensions" even if it doesn't have to propose > > a solution to make their usage easier. It's an obvious usage of code > > transformers. If possible, I would like to find a compromise to > > support them, but make it explicit that they change the Python > > semantics. > > Is this really necessary? > > If someone is testing a language change locally, and just wants to use > your (original) API for his tests instead of the more complicated > alternative of building an import hook, it works fine. If he can't deploy > that way, that's fine. > > If someone builds a transformer that adds a feature in a way that makes it > a pure superset of Python, he should be fine with running it on all files, > so your API works fine. And if some files that didn't use any of the new > features get .pyc files that imply they did, so what? > > If someone builds a transformer that only runs on files with a different > extension, he already needs an import hook, so he might as well just call > his transformer from the input hook, same as he does today. > And the import hook is not that difficult. You can reuse everything from importlib without modification except for needing to override a single method in some loader to do your transformation ( https://docs.python.org/3/library/importlib.html#importlib.abc.InspectLoader.source_to_code). Otherwise the only complication is instantiating the right classes and setting the path hook in `sys.path_hooks`. > > So... What case is served by this new, more complicated API that wasn't > already served by your original, simple one (remembering that import hooks > are already there as a fallback)? > As Victor pointed out, the discussion could end in "nothing changed, but we at least discussed it". I think both you and I currently agree that's the answer to his question. :) -Brett -------------- next part -------------- An HTML attachment was scrubbed... URL: From abarnert at yahoo.com Wed Jan 27 15:49:12 2016 From: abarnert at yahoo.com (Andrew Barnert) Date: Wed, 27 Jan 2016 12:49:12 -0800 Subject: [Python-ideas] Explicit variable capture list In-Reply-To: <56A80851.7090606@canterbury.ac.nz> References: <20160120003712.GZ10854@ando.pearwood.info> <20160121001027.GB4619@ando.pearwood.info> <20160125232136.GP4619@ando.pearwood.info> <56A80851.7090606@canterbury.ac.nz> Message-ID: On Jan 26, 2016, at 15:59, Greg Ewing wrote: > > I'd like to do something with "let", which is famliar > from other languages as a binding-creation construct, > and it doesn't seem a likely choice for a variable > namne. > > Maybe if we had a general statement for introducing > a new scope, independent of looping: > > let: > ... A few years ago, I played with using an import hook to add let statements to Python (by AST-translating them to a function definition and call). It's a neat idea, but I couldn't find any actual uses that made my code more readable. Or, rather, I found a small a handful, but every time it was actually far _more_ readable to just refactor the let body out into a separate (non-nested) function or method. I don't know if this would be true more universally than for my code. But I think it's worth trying to come up with non-toy examples of where you'd actually use this. Put another way: flat is better than nested. When you actually need a closure, you have to go nested--but most of the time, you don't. 
And if you go flat most of the time, the few cases where you go nested now signal that something is special (you actually need a closure). So, unless there really are common cases where you need a closure over some variables, but early binding/value capture/whatever for others, I think this may harm readability more than it helps. > The for-loop is a special case, because it assigns a > variable in a place where we can't capture it in a > let-block. So we introduce a variant: > > for let x in things: > funcs.append(lambda: process(x)) This reads weird to me. I think it's because I've been spending too much time in Swift, but I also think Swift may have gotten things right here, so that's not totally irrelevant. In Swift, almost anywhere you want to create a new binding--whether normal declaration statements, the equivalent of C99 "if (ch = getch())", or even pattern matching--you have to use the "let" keyword. But "for" statements are the one place you _don't_ use "let", because they _always_ create a new binding for the loop variable. As I've mentioned before, both C# and Ruby made breaking changes from the Python behavior to the Swift behavior, because they couldn't find any legitimate code that would be broken by that change. And there have been few if any complaints since. If we really are considering adding something like "for let", we should seriously consider whether anyone would ever have a good reason to use "for" instead of "for let". If not, just change "for" instead. > 2) It may be desirable to allow assignments on the > same line as "let", e.g. > > with open(filename) as f: > let g = f: > process(g) > > which seems marginally more readable. It's also probably a lot more familiar to people who are used to let from functional languages. And I don't _think_ it's a misleading/false-cognate kind of familiarity, although I'm not positive about that. From joejev at gmail.com Wed Jan 27 16:01:30 2016 From: joejev at gmail.com (Joseph Jevnik) Date: Wed, 27 Jan 2016 16:01:30 -0500 Subject: [Python-ideas] PEP 511: Add a check function to decide if a "language extension" code transformer should be used or not In-Reply-To: References: Message-ID: My thought about decorators is that they allow obvious scoping of changes for the reader. Anything that becomes module scope or is implied based on system state that is set in another module will make debugging and reading much harder. Both lazy_python and codetransformer use bytecode manipulation; however, it is a purely opt-in system where the transformed function is decorated. This keeps the transformations in view while you are reading the code that is affected by them. I would find debugging a project much more difficult if I needed to remember that the order my modules were imported matters a lot because they setup a bunch of state. I am not sure why people want the module to be the smallest unit that is transformed when really it is the code object that should be the smallest unit. This means class bodies and functions. If we treat the module as the most atomic unit then you wouldn't be able to use something like `asconstants` This is a really great local optimzation when calling a function in a loop, especially builtins that you know will most likely never change and you don't want to change if they do. 
For example: In [1]: from codetransformer.transformers.constants import asconstants In [2]: @asconstants(a=1) ...: def f(): ...: return a ...: In [3]: a = 5 In [4]: f() Out[4]: 1 In [5]: @asconstants('pow') # string means use the built in for this name ...: def g(ns): ...: for n in ns: ...: yield pow(n, 2) ...: In [6]: list(g([1, 2, 3])) Out[6]: [1, 4, 9] In [7]: dis(g) 3 0 SETUP_LOOP 28 (to 31) 3 LOAD_FAST 0 (ns) 6 GET_ITER >> 7 FOR_ITER 20 (to 30) 10 STORE_FAST 1 (n) 13 LOAD_CONST 0 () 16 LOAD_FAST 1 (n) 19 LOAD_CONST 1 (2) 22 CALL_FUNCTION 2 (2 positional, 0 keyword pair) 25 YIELD_VALUE 26 POP_TOP 27 JUMP_ABSOLUTE 7 >> 30 POP_BLOCK >> 31 LOAD_CONST 2 (None) 34 RETURN_VALUE This is a simple optimization that people emulate all the time with things like `sum_ = sum` before the loop or `def g(ns, *, _sum=sum)`. This cannot be used at module scope because often you only think it is safe or worth it to lock in the value for a small segment of code. Hopefully this use case is being considered as I think this is a very simple, non-semantics preserving case that is not very and also practical. On Wed, Jan 27, 2016 at 3:20 PM, Brett Cannon wrote: > > > On Wed, 27 Jan 2016 at 08:49 Andrew Barnert via Python-ideas < > python-ideas at python.org> wrote: > >> On Jan 27, 2016, at 07:39, Victor Stinner >> wrote: >> > >> > Hi, >> > >> > Thank you for all feedback on my PEP 511. It looks like the current >> > blocker point is the unclear status of "language extensions": code >> > tranformers which deliberately changes the Python semantics. I would >> > like to discuss how we should register them. I think that the PEP 511 >> > must discuss "language extensions" even if it doesn't have to propose >> > a solution to make their usage easier. It's an obvious usage of code >> > transformers. If possible, I would like to find a compromise to >> > support them, but make it explicit that they change the Python >> > semantics. >> >> Is this really necessary? >> >> If someone is testing a language change locally, and just wants to use >> your (original) API for his tests instead of the more complicated >> alternative of building an import hook, it works fine. If he can't deploy >> that way, that's fine. >> >> If someone builds a transformer that adds a feature in a way that makes >> it a pure superset of Python, he should be fine with running it on all >> files, so your API works fine. And if some files that didn't use any of the >> new features get .pyc files that imply they did, so what? >> >> If someone builds a transformer that only runs on files with a different >> extension, he already needs an import hook, so he might as well just call >> his transformer from the input hook, same as he does today. >> > > And the import hook is not that difficult. You can reuse everything from > importlib without modification except for needing to override a single > method in some loader to do your transformation ( > https://docs.python.org/3/library/importlib.html#importlib.abc.InspectLoader.source_to_code). > Otherwise the only complication is instantiating the right classes and > setting the path hook in `sys.path_hooks`. > > >> >> So... What case is served by this new, more complicated API that wasn't >> already served by your original, simple one (remembering that import hooks >> are already there as a fallback)? >> > > As Victor pointed out, the discussion could end in "nothing changed, but > we at least discussed it". I think both you and I currently agree that's > the answer to his question. 
:) > > -Brett > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From abarnert at yahoo.com Wed Jan 27 16:15:18 2016 From: abarnert at yahoo.com (Andrew Barnert) Date: Wed, 27 Jan 2016 13:15:18 -0800 Subject: [Python-ideas] Respectively and its unpacking sentence In-Reply-To: References: <20160127055743.GB14190@sjoerdjob.com> <2CB5C96E-7249-49ED-B6D7-4FB98F59CAF9@yahoo.com> Message-ID: <8FD7A3D2-A438-40A7-98DE-C6D2ED9CC7A7@yahoo.com> On Jan 27, 2016, at 09:55, Mirmojtaba Gharibi wrote: > >> On Wed, Jan 27, 2016 at 2:29 AM, Andrew Barnert wrote: >>> On Jan 26, 2016, at 22:19, Mirmojtaba Gharibi wrote: >>> >>> Yes, I'm aware sequence unpacking. >>> There is an overlap like you mentioned, but there are things that can't be done with sequence unpacking, but can be done here. >>> >>> For example, let's say you're given two lists that are not necessarily numbers, so you can't use numpy, but you want to apply some component-wise operator between each component. This is something you can't do with sequence unpacking or with numpy. >> >> Yes, you can do it with numpy. >> >> Obviously you don't get the performance benefits when you aren't using "native" types (like int32) and operations that have vectorizes implementations (like adding two arrays of int32 or taking the dot product of float64 matrices), but you do still get the same elementwise operators, and even a way to apply arbitrary callables over arrays, or even other collections: >> >> >>> firsts = ['John', 'Jane'] >> >>> lasts = ['Smith', 'Doe'] >> >>> np.vectorize('{1}, {0}'.format)(firsts, lasts) >> array(['Smith, John', 'Doe, Jane'], dtype=' I think the form I am suggesting is simpler and more readable. But the form you're suggesting doesn't work for vectorizing arbitrary functions, only for operator expressions (including simple function calls, but that doesn't help for more general function calls). The fact that numpy is a little harder to read for cases that your syntax can't handle at all is hardly a strike against numpy. And, as I already explained, for the cases where your form _does_ work, numpy already does it, without all the sigils: c = a + b c = a*a + 2*a*b + b*b c = (a * b).sum() It also works nicely over multiple dimensions. For example, if a and b are both arrays of N 3-vectors instead of just being 3-vectors, you can still elementwise-add them just with +; you can sum all of the results with sum(axis=1); etc. How would you write any of those things with your $-syntax? > I'm happy you brought vectorize to my attention though. I think as soon you make the statement just a bit complex, it would become really complicated with vectorize. > > For example lets say you have > x=[1,2,3,4,5,...] > y=['A','BB','CCC',...] > p=[2,3,4,6,6,...] > r=[]*n > > $r = str(len($y*$p)+$x) As a side note, []*n is always just []. Maybe you meant [None for _ in range(n)] or [None]*n? Also, where does n come from? It doesn't seem to have anything to do with the lengths of x, y, and p. So, what happens if it's shorter than them? Or longer? 
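For the current-Python spellings, at least, the answer is well defined:
zip (and multi-argument map, in Python 3) simply stop at the shortest
input:

    x = [1, 2, 3, 4]
    y = ['A', 'BB']
    list(zip(x, y))                      # [(1, 'A'), (2, 'BB')]
    list(map(pow, [2, 3, 4], [5, 6]))    # [32, 729]

Whether silently truncating is the *right* answer is a separate question,
but at least it is a documented one.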
With numpy, of course, that isn't a problem--there's no magic being attempted on the = operator (which is good, because = isn't an operator in Python, and I'm not sure how you'd even properly define your design, much less implement it); the operators just create arrays of the right length. Anyway, that's still mostly just operators. You _could_ wrap up an operator expression in a function to vectorize, but you almost never want to. Just use the operators directly on the arrays. So, let's try a case that has even some minimal amount of logic, where translating to operators would be clumsy at best: @np.vectorize def sillyslice(y, x, p): if x < p: return y[x:p] return y[p:x] r = sillyslice(y, x, p) Being a separate function provides all the usual benefits: sillyslice is reusable, debuggable, unit-testable, usable as a first-class object, etc. But forget that; how would you do this at all with your $-syntax? Since you didn't answer any of my other questions, I'll snip them and repost shorter versions: * what's wrong with using numpy? * what's wrong with APL or J or MATLAB? * what's wrong with making the operators elementwise instead of wrapping the objects in some magic thing? * what is the type of that magic thing anyway? -------------- next part -------------- An HTML attachment was scrubbed... URL: From abarnert at yahoo.com Wed Jan 27 16:34:46 2016 From: abarnert at yahoo.com (Andrew Barnert) Date: Wed, 27 Jan 2016 13:34:46 -0800 Subject: [Python-ideas] PEP 511: Add a check function to decide if a "language extension" code transformer should be used or not In-Reply-To: References: Message-ID: On Jan 27, 2016, at 12:20, Brett Cannon wrote: > >> On Wed, 27 Jan 2016 at 08:49 Andrew Barnert via Python-ideas wrote: >> >> If someone builds a transformer that only runs on files with a different extension, he already needs an import hook, so he might as well just call his transformer from the input hook, same as he does today. > > And the import hook is not that difficult. Unless it has to work in 2.7 and 3.3 (or, worse, 2.6 and 3.2). :) > You can reuse everything from importlib without modification except for needing to override a single method in some loader to do your transformation Yes, as of 3.4, the design is amazing. In fact, hooking any level--lookup, source, AST, bytecode, or pyc--is about as easy as it could be. My only complaint is that it's not easy enough to find out how easy import hooks are. When I tell people "you could write a simple import hook to play with that idea", they get a look of fear and panic that's completely unwarranted and just drop their cool idea. (I wonder if having complete examples of a simple global-transformer hook and a simple special-extension hook at the start of the docs would be enough to solve that problem?) And I'm a bit worried that if Victor tries to make things like MacroPy and Hy easier, it still won't be enough for real-life cases, so all it'll do is discourage people from going right to writing import hooks and seeing how easy that already is. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 

From chris.barker at noaa.gov  Wed Jan 27 16:53:15 2016
From: chris.barker at noaa.gov (Chris Barker)
Date: Wed, 27 Jan 2016 13:53:15 -0800
Subject: [Python-ideas] Respectively and its unpacking sentence
In-Reply-To: 
References: <20160127055743.GB14190@sjoerdjob.com>
 <20160127073004.GC14190@sjoerdjob.com>
Message-ID: 

On Wed, Jan 27, 2016 at 9:12 AM, Mirmojtaba Gharibi <
mojtaba.gharibi at gmail.com> wrote:

> MATLAB has a built-in easy way of achieving component-wise operation and I
> think Python would benefit from that without use of libraries such as numpy.

I've always thought there should be component-wise operations in Python.
The way to do it now is something like:

    [i + j for i, j in zip(a, b)]

which is really pretty darn wordy compared to:

    a_numpy_array + another_numpy_array

(similar in matlab).

But maybe an operator is the way to do it. It was long ago decided not
to introduce a full set of extra operators, a la matlab:

.+
.*
etc.

Rather, it was realized that for numpy, which does element-wise operations
by default, matrix multiplication was really the only non-elementwise
operation widely used, so the new @ operator was added.

And we're kind of stuck -- even if we added a full set, then in numpy the
regular operators would be elementwise, but for built-in Python sequences
only the special ones would be elementwise -- really confusing!

If you really want this, I'd make your own sequences that re-define the
operators.

Or just use numpy... you can use object arrays if you want to handle
non-numeric values:

In [4]: a1 = np.array(["this", "that"], dtype=object)

In [5]: a2 = np.array(["some", "more"], dtype=object)

In [6]: a1 + a2

Out[6]: array(['thissome', 'thatmore'], dtype=object)

-CHB

--

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From rosuav at gmail.com  Wed Jan 27 17:15:58 2016
From: rosuav at gmail.com (Chris Angelico)
Date: Thu, 28 Jan 2016 09:15:58 +1100
Subject: [Python-ideas] several different needs [Explicit variable capture
 list]
In-Reply-To: 
References: <45D92422-3E9F-4367-9DD4-1062819D6232@yahoo.com>
Message-ID: 

On Thu, Jan 28, 2016 at 4:27 AM, Jim J. Jewett wrote:
>> Here I disagree completely. Why do we have tuple,
>> or frozenset? Why do dicts only take immutable keys?
>> Why does the language make it easier to build
>> mapped/filtered copies in place? Why can immutable
>> objects be shared between threads or processes trivially,
>> while mutable objects need locks for threads and heavy
>> "manager" objects for processes? Mutability is a very big deal.
>
> Those are all "if you're living with these restrictions anyhow,
> and you tell the compiler, the program can run faster."
>
> None of those sound important in terms of "What does this program
> (eventually) do?"

The nature of hash tables and equality is such that if an object's
value (defined by __eq__) changes between when it's used as a key and
when it's looked up, bad stuff happens. It's not just an optimization
- it's a way for the dict subsystem to protect us against craziness.
Yes, you can bypass that protection:

class HashableList(list):
    def __hash__(self): return hash(tuple(self))

but it's a great safety net. You won't unexpectedly get KeyError when
you iterate over a dictionary - you'll instead get TypeError when you
try to assign.
Is that a semantic question or a performance one? ChrisA From random832 at fastmail.com Wed Jan 27 17:42:15 2016 From: random832 at fastmail.com (Random832) Date: Wed, 27 Jan 2016 17:42:15 -0500 Subject: [Python-ideas] several different needs [Explicit variable capture list] In-Reply-To: References: <45D92422-3E9F-4367-9DD4-1062819D6232@yahoo.com> Message-ID: <1453934535.1421522.504548074.3CAC68A0@webmail.messagingengine.com> On Wed, Jan 27, 2016, at 17:15, Chris Angelico wrote: > The nature of hash tables and equality is such that if an object's > value (defined by __eq__) changes between when it's used as a key and > when it's looked up, bad stuff happens. It's not just an optimization > - it's a way for the dict subsystem to protect us against craziness. This stands alone against all the things that it *could* protect users against but doesn't due to the "consenting adults" principle. Java allows ArrayLists to be HashMap keys and the sky hasn't fallen, despite that language otherwise having far more of a culture of protecting users from themselves and each other (i.e. it has stuff like private, final, etc) than Python does. We won't even protect from redefining math.pi, yet you want to prevent a user from using as a key in a dictionary a value which _might_ be altered while the dictionary is in use? This prevents all kinds of algorithms from being used which would benefit from using a short-lived dict/set to keep track of things. I think this came up a month or so ago when we were talking about comparison of dict values views (which could benefit from being able to use all the values in the dict as keys in a Counter). They're not going to change while the algorithm is executing unless the user does some weird multithreaded stuff or something truly bizarre in a callback (and if they do? consenting adults.), and the dict is thrown away at the end. > Yes, you can bypass that protection: > > class HashableList(list): > def __hash__(self): return hash(tuple(self)) That doesn't really work for my scenario described above, which requires an alternate universe in which Python (like Java) requires *all* objects, mutable or otherwise, to define __hash__ in a way consistent with __eq__. > but it's a great safety net. You won't unexpectedly get KeyError when > you iterate over a dictionary - you'll instead get TypeError when you > try to assign. Is that a semantic question or a performance one? But I won't get either error if I don't mutate the list, or I only do it in equality-conserving ways (e.g. converting between numeric types). From abarnert at yahoo.com Wed Jan 27 18:19:39 2016 From: abarnert at yahoo.com (Andrew Barnert) Date: Wed, 27 Jan 2016 15:19:39 -0800 Subject: [Python-ideas] several different needs [Explicit variable capture list] In-Reply-To: <1453934535.1421522.504548074.3CAC68A0@webmail.messagingengine.com> References: <45D92422-3E9F-4367-9DD4-1062819D6232@yahoo.com> <1453934535.1421522.504548074.3CAC68A0@webmail.messagingengine.com> Message-ID: On Jan 27, 2016, at 14:42, Random832 wrote: > >> On Wed, Jan 27, 2016, at 17:15, Chris Angelico wrote: >> The nature of hash tables and equality is such that if an object's >> value (defined by __eq__) changes between when it's used as a key and >> when it's looked up, bad stuff happens. It's not just an optimization >> - it's a way for the dict subsystem to protect us against craziness. > > This stands alone against all the things that it *could* protect users > against but doesn't due to the "consenting adults" principle. 
> > Java allows ArrayLists to be HashMap keys and the sky hasn't fallen, > despite that language otherwise having far more of a culture of > protecting users from themselves and each other (i.e. it has stuff like > private, final, etc) than Python does. > > We won't even protect from redefining math.pi, yet you want to prevent a > user from using as a key in a dictionary a value which _might_ be > altered while the dictionary is in use? This prevents all kinds of > algorithms from being used which would benefit from using a short-lived > dict/set to keep track of things. It's amazing how many people go for years using Python without noticing this restriction, then, as soon as it's pointed out to them, exclaim "That's horrible! It's way too restrictive! I can think of all kinds of useful code that this prevents!" And then you go back and try to think of code you were prevented from writing over the past five years before you learned this rule, and realize that there's little if any. And, over the next five years, you run into the rule very rarely (and more often, it's because you forgot to define an appropriate __hash__ for an immutable type than because you needed to put a mutable type in a dict or set). Similarly, everyone learns the tuple/frozenset trick, decries the fact that there's no way to do a "deep" equivalent, but eventually ends up using the trick once every couple years and never running into the shallowness problem. From a pure design point of view, this looks like a case of hidebound purity over practice, exactly what Python is against. But from a practical use point of view, it actually works really well. I don't know if you could prove this fact a prioiri, or even argue very strongly for it, but it still seems to be true. People who use Python don't notice the limitation, people who rant against Python don't include it in their "why Python sucks" lists; only people who just discovered it care. From steve at pearwood.info Wed Jan 27 19:12:51 2016 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 28 Jan 2016 11:12:51 +1100 Subject: [Python-ideas] Respectively and its unpacking sentence In-Reply-To: References: Message-ID: <20160128001251.GX4619@ando.pearwood.info> On Wed, Jan 27, 2016 at 12:25:05AM -0500, Mirmojtaba Gharibi wrote: > Hello, > > I'm thinking of this idea that we have a pseudo-operator called > "Respectively" and shown maybe with ; > > Some examples first: > > a;b;c = x1;y1;z1 + x2;y2;z2 Regardless of the merits of this proposal, the suggested syntax cannot be used because that's already valid Python syntax equivalent to: a b c = x1 y1 z1 + x2 y2 z2 So forget about using the ; as that would be ambiguous. [...] > Then there is another unpacking operator which maybe we can show with $ > sign and it operates on lists and tuples and creates the "Respectively" > version of them. > So for instance, > vec=[]*10 > $vec = $u + $v > will add two 10-dimensional vectors to each other and put the result in vec. []*10 won't work, as that's just []. And it seems very unpythonic to need to pre-allocate a list just to do vectorized addition. I think you would be better off trying to get better support for vectorized operations into Python: vec = add(u, v) is nearly as nice looking as u + v, and it need not even be a built-in. It could be a library. In an earlier version of the statistics module, I experimented with vectorized functions for some of the operations. 
I wanted a way for the statistics functions to *automatically* generate either scalar or vector results without any extra programming effort. E.g. writing mean([1, 2, 3]) would return the scalar 2, of course, while: mean([(1, 10, 100), (2, 20, 200), (3, 30, 300)]) would operate column-wise and return (2, 20, 200). To do that, I needed vectorized versions of sum, division, sqrt etc. I didn't mind if they were written as function calls instead of operators: divide(6, 3) # returns 2 divide((6, 60, 600), 3) # returns (2, 20, 200) which I got with a function: divide = vectorize(operator.truediv) where vectorize() took a scalar operator and returned a function that looped over two vectors and applied the operator to each argument in an elementwise fashion. I eventually abandoned this approach because the complexity and performance hit of my initial implementation was far too great, but maybe that was just my poor implementation. I think that vectorized functions would be very useful in Python. Performance need not be super-fast -- numpy, numba, and the other heavy-duty third-party tools would continue to dominate the high-performance scientific computing niche, but they should be at least *no worse* than the equivalent code using a loop. If you had a vectorized add() function, your example: a;b;c = x1;y1;z1 + x2;y2;z2 would become: a, b, c = add([x1, y1, z1], [x2, y2, z2]) Okay, it's not as *nice looking* as the + operator, but it will do. Or you could subclass list to do this instead of concatenation. I would support the addition of a vectorize() function which took an arbitrary scalar function, and returned a vectorized version: func = vectorized(lambda x, y: 2*x + y**3 - x*y/3) a, b, c = func(vector_x, vector_y) being similar to: f = lambda x, y: 2*x + y**3 - x*y/3 a, b, c = [f(x, y) for x, y in zip(vector_x, vector_y)] [...] > For example, we can calculate the inner product between two vectors like > follows (inner product is the sum of component wise multiplication of two > vectors): > > innerProduct =0 > innerProduct += $a * $b > > which is equivalent to > innerProduct=0 > for i in range(len(a)): > ...innerProduct += a[i]+b[i] def mult(*vectors): for t in zip(*vectors): yield reduce(operator.mul, t) innerProduct = sum(mult(a, b)) -- Steve From steve at pearwood.info Wed Jan 27 19:31:16 2016 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 28 Jan 2016 11:31:16 +1100 Subject: [Python-ideas] several different needs [Explicit variable capture list] In-Reply-To: <8221007C-DD14-4D72-A30C-CCD8A1FE06D9@yahoo.com> References: <45D92422-3E9F-4367-9DD4-1062819D6232@yahoo.com> <20160127124103.GU4619@ando.pearwood.info> <8221007C-DD14-4D72-A30C-CCD8A1FE06D9@yahoo.com> Message-ID: <20160128003116.GY4619@ando.pearwood.info> On Wed, Jan 27, 2016 at 08:14:15AM -0800, Andrew Barnert wrote: > I think you're actually agreeing with me: there _aren't_ four > different cases people actually want here, just the one we've all been > talking about, and FAT is irrelevant to that case, so this sub thread > is ultimately just a distraction. I think we do agree. 
Thanks for the extra detail, I have nothing more to say at this point :-) -- Steve From steve at pearwood.info Wed Jan 27 20:06:03 2016 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 28 Jan 2016 12:06:03 +1100 Subject: [Python-ideas] Explicit variable capture list In-Reply-To: References: <20160120003712.GZ10854@ando.pearwood.info> <20160121001027.GB4619@ando.pearwood.info> <20160125232136.GP4619@ando.pearwood.info> <56A80851.7090606@canterbury.ac.nz> Message-ID: <20160128010603.GZ4619@ando.pearwood.info> On Wed, Jan 27, 2016 at 12:49:12PM -0800, Andrew Barnert via Python-ideas wrote: > > The for-loop is a special case, because it assigns a > > variable in a place where we can't capture it in a > > let-block. So we introduce a variant: > > > > for let x in things: > > funcs.append(lambda: process(x)) > > This reads weird to me. I think it's because I've been spending too > much time in Swift, but I also think Swift may have gotten things > right here, so that's not totally irrelevant. It reads weird to me too, because "for let x in ..." is just weird. It's uncanny valley for English grammar: at first glance it looks like valid grammar, but it's not. [...] > As I've mentioned before, both C# and Ruby made breaking changes from > the Python behavior to the Swift behavior, because they couldn't find > any legitimate code that would be broken by that change. I'm not sure if you intended this or not, but that sounds like "they found plenty of code that would break, but decided it wasn't legitimate so they didn't care". -- Steve From ethan at stoneleaf.us Wed Jan 27 20:19:16 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Wed, 27 Jan 2016 17:19:16 -0800 Subject: [Python-ideas] Explicit variable capture list In-Reply-To: <20160128010603.GZ4619@ando.pearwood.info> References: <20160120003712.GZ10854@ando.pearwood.info> <20160121001027.GB4619@ando.pearwood.info> <20160125232136.GP4619@ando.pearwood.info> <56A80851.7090606@canterbury.ac.nz> <20160128010603.GZ4619@ando.pearwood.info> Message-ID: <56A96C94.7060603@stoneleaf.us> On 01/27/2016 05:06 PM, Steven D'Aprano wrote: > On Wed, Jan 27, 2016 at 12:49:12PM -0800, Andrew Barnert wrote: >> This reads weird to me. I think it's because I've been spending too >> much time in Swift, but I also think Swift may have gotten things >> right here, so that's not totally irrelevant. > > It reads weird to me too, because "for let x in ..." is just weird. It's > uncanny valley for English grammar: at first glance it looks like valid > grammar, but it's not. >> As I've mentioned before, both C# and Ruby made breaking changes from >> the Python behavior to the Swift behavior, because they couldn't find >> any legitimate code that would be broken by that change. > > I'm not sure if you intended this or not, but that sounds like "they > found plenty of code that would break, but decided it wasn't legitimate > so they didn't care". Or, "they found code that would break, because it was already broken but nobody had noticed yet." 
-- ~Ethan~ From python at mrabarnett.plus.com Wed Jan 27 20:30:37 2016 From: python at mrabarnett.plus.com (MRAB) Date: Thu, 28 Jan 2016 01:30:37 +0000 Subject: [Python-ideas] Respectively and its unpacking sentence In-Reply-To: <20160128001251.GX4619@ando.pearwood.info> Message-ID: On 2016-01-28 00:12:51, "Steven D'Aprano" wrote: >On Wed, Jan 27, 2016 at 12:25:05AM -0500, Mirmojtaba Gharibi wrote: >> Hello, >> >> I'm thinking of this idea that we have a pseudo-operator called >> "Respectively" and shown maybe with ; >> >> Some examples first: >> >> a;b;c = x1;y1;z1 + x2;y2;z2 > >Regardless of the merits of this proposal, the suggested syntax cannot >be used because that's already valid Python syntax equivalent to: > >a >b >c = x1 >y1 >z1 + x2 >y2 >z2 > >So forget about using the ; as that would be ambiguous. > > >[...] >> Then there is another unpacking operator which maybe we can show with >>$ >> sign and it operates on lists and tuples and creates the >>"Respectively" >> version of them. >> So for instance, >> vec=[]*10 >> $vec = $u + $v >> will add two 10-dimensional vectors to each other and put the result >>in vec. > >[]*10 won't work, as that's just []. And it seems very unpythonic to >need to pre-allocate a list just to do vectorized addition. > >I think you would be better off trying to get better support for >vectorized operations into Python: > >vec = add(u, v) > >is nearly as nice looking as u + v, and it need not even be a built-in. >It could be a library. > [snip] An alternative would be to add an element-wise class: class Vector: def __init__(self, *args): self.args = args def __str__(self): return '<%s>' % ', '.join(repr(arg) for arg in self.args) def __add__(self, other): if isinstance(other, Vector): return Vector(*[left + right for left, right in zip(self.args, other.args)]) return Vector(*[left + other for left in self.args]) def __iter__(self): return iter(self.args) Then you could write: a, b, c = Vector(x1, y1, z1) + Vector(x2, y2, z2) I wonder whether there's a suitable pair of delimiters that could be used to create a 'literal' for it. From abarnert at yahoo.com Wed Jan 27 20:51:46 2016 From: abarnert at yahoo.com (Andrew Barnert) Date: Wed, 27 Jan 2016 17:51:46 -0800 Subject: [Python-ideas] Respectively and its unpacking sentence In-Reply-To: <20160128001251.GX4619@ando.pearwood.info> References: <20160128001251.GX4619@ando.pearwood.info> Message-ID: On Jan 27, 2016, at 16:12, Steven D'Aprano wrote: > > I think you would be better off trying to get better support for > vectorized operations into Python: I really think, at least 90% of the time, and probably a lot more, people are better off just using numpy than reinventing it. Obviously, building a "statistics-without-numpy" module to be added to the stdlib is an exception. But otherwise, the fact that numpy already exists, and has had a couple decades of heavy use and expert attention and two predecessor libraries to work out the kinks in the design, means that it's likely to be better, even for your limited purposes, than any limited-purpose thing you come up with. There are a lot more edge cases than you think. For example, you thought far enough ahead that your sum that works column-wise on 2D arrays. But what about when you need to work row-wise? What's the best interface: an axis parameter, or a transpose function (hey, you can even just use zip)? How do you then extend whichever choice you made to 3D? Or to when you want to get the total sum across both axes? 
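For reference, numpy's existing answer to those questions is an axis
parameter:

    import numpy as np

    a = np.array([[1, 2, 3],
                  [4, 5, 6]])   # two 3-vectors stored as rows

    a.sum(axis=0)   # column-wise: array([5, 7, 9])
    a.sum(axis=1)   # row-wise:    array([ 6, 15])
    a.sum()         # total over both axes: 21

The same spelling extends unchanged to 3D and beyond, which is exactly
the kind of edge case a from-scratch design has to rediscover.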
For another example: should I be able to use vectorize to write a function of two arrays, and then apply it to a single N+1-D array, or is that going to cause more confusion than help? And so on. I wouldn't trust my own a priori intuition on those questions, so I'd go look at APL, J, MATLAB, R, and maybe Mathematica and see how their idioms best translate to Python in a variety of different kinds of problems. And I'd probably get some of it wrong, as numpy's ancestors did, and then have to agonize over compatibility-breaking changes. And after all that, what would be the benefit? I no longer have to install numpy--but now I have to install pyvec instead. Which is just a less-featureful, less-tested, less-optimized, and less-refined numpy. If there's something actually _wrong_ with numpy's design for your purposes (and you can't trivially wrap it away), that's different. Maybe you could do a whole lot lazily by sticking to the iterator protocol? (There's a nifty Haskell package for vectorizing lazily that might be worth looking at, as long as you can stand reading about everything in terms of monadic lifting where you'd say vectorize, etc.) But "I want the same as numpy but less good" doesn't seem like a good place to start, because at best, that's what you'll end up with. From abarnert at yahoo.com Wed Jan 27 21:08:13 2016 From: abarnert at yahoo.com (Andrew Barnert) Date: Wed, 27 Jan 2016 18:08:13 -0800 Subject: [Python-ideas] Explicit variable capture list In-Reply-To: <20160128010603.GZ4619@ando.pearwood.info> References: <20160120003712.GZ10854@ando.pearwood.info> <20160121001027.GB4619@ando.pearwood.info> <20160125232136.GP4619@ando.pearwood.info> <56A80851.7090606@canterbury.ac.nz> <20160128010603.GZ4619@ando.pearwood.info> Message-ID: On Jan 27, 2016, at 17:06, Steven D'Aprano wrote: > > On Wed, Jan 27, 2016 at 12:49:12PM -0800, Andrew Barnert via Python-ideas wrote: > >>> The for-loop is a special case, because it assigns a >>> variable in a place where we can't capture it in a >>> let-block. So we introduce a variant: >>> >>> for let x in things: >>> funcs.append(lambda: process(x)) >> >> This reads weird to me. I think it's because I've been spending too >> much time in Swift, but I also think Swift may have gotten things >> right here, so that's not totally irrelevant. > > It reads weird to me too, because "for let x in ..." is just weird. It's > uncanny valley for English grammar: at first glance it looks like valid > grammar, but it's not. Ah, good point. > [...] >> As I've mentioned before, both C# and Ruby made breaking changes from >> the Python behavior to the Swift behavior, because they couldn't find >> any legitimate code that would be broken by that change. > > I'm not sure if you intended this or not, but that sounds like "they > found plenty of code that would break, but decided it wasn't legitimate > so they didn't care". :) What I meant is they found a small number of examples of code that would be affected, but all of them were clearly bugs, and therefore not legitimate. Obviously that can be a judgment call, but usually it's a pretty easy one. Like the function that creates N callbacks that all use the last name, instead of creating one callback for each name, preceded by this comment: # Don't call this function! Ruby sucks but when I complain they tell me I'm too dumb to fix it so just don't use it!!!! Whether the 1.9 change fixed that function or re-broke it differently scarcely matters; clearly no one was depending on the old behavior. 
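For anyone who hasn't hit it, the Python spelling of that same trap, and
the usual early-binding workaround, look like this (plain current
semantics, nothing hypothetical):

    funcs = [lambda: x**2 for x in range(10)]
    funcs[0]()    # 81 -- every lambda closed over the same x, now 9

    # The usual workaround: force early binding with a default argument.
    funcs = [lambda x=x: x**2 for x in range(10)]
    funcs[0]()    # 0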
Maybe Python is different, and we would find code that really _does_ need
10 separate functions that all compute x**9 or that all disable the last
button or... well, probably something more useful than that, which I can't
guess in advance. I certainly wouldn't suggest just changing Python based
on the results of a search of Ruby code! But I would definitely suggest
doing a similar search of Python code before giving people two similar but
different statements to hang themselves with.

From kevinjacobconway at gmail.com  Wed Jan 27 23:56:50 2016
From: kevinjacobconway at gmail.com (Kevin Conway)
Date: Thu, 28 Jan 2016 04:56:50 +0000
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: 
References: <569A2452.1000709@gmail.com>
 <20160116162235.GB3208@sjoerdjob.com>
Message-ID: 

I'm willing to take this conversation offline as it seems this thread has
cooled down quite a bit. I would still like to hear more, though, about how
adding this as a facility in the language improves over the current,
external implementations of Python code optimizers. Python already has
tools for reading in source files, parsing them into AST, modifying that
AST, and writing the final bytecode to files as part of the standard
library. I don't see anything in PEP 511 that improves upon that.

Out of curiosity, do you consider this PEP as adding something to Python
that didn't previously exist, or do you consider it to be more aligned with
PEP 249 (DB-API 2.0) and PEP 484 (Type Hints), which are primarily designed
to marshal the community in a common direction?

I understand that you have other PEPs in flight that are designed to make
certain optimizations easier (or possible). Looking at this PEP in
isolation, however, leaves me wanting more explanation as to its value. You
mention the need for monkey-patching or hooking into the import process as
part of the rationale. The PyCC project, while it may not be the best
example for optimizer design, does not need to patch or hook into anything
to function. Instead, it acts as an alternative bytecode compiler that
drops .pyc files just like the standard compiler would. Other than the
trade-off of using a 3rd party library versus adding a -o flag, what
significant advantage does a sys.add_optimizer() call provide?

Again, I'm very much behind your motivation and hope you are incredibly
successful in making Python a faster place to live. I'm only trying to get
in your head and see what you see.

On Wed, Jan 27, 2016 at 10:45 AM Victor Stinner wrote:

> Hi,
>
> 2016-01-16 17:56 GMT+01:00 Kevin Conway :
> > I'm a big fan of your motivation to build an optimizer for cPython code.
> > What I'm struggling with is understanding why this requires a PEP and
> > language modification. There are already several projects that manipulate
> > the AST for performance gains such as [1] or even my own ham fisted
> > attempt [2].
>
> Oh cool, I didn't know PyCC [2]! I added it to the PEP 511 in the AST
> optimizers section of Prior Art.
>
> I wrote astoptimizer [1] and this project uses monkey-patching of the
> compile() function, I mentioned this monkey-patching hack in the
> rationale of the PEP:
> https://www.python.org/dev/peps/pep-0511/#rationale
>
> I would like to avoid monkey-patching because it causes various issues.
>
> The PEP 511 also makes transformations more visible: transformers are
> explicitly registered in sys.set_code_transformers() and the .pyc
> filename is modified when the code is transformed.
>
> It also adds a new feature: it becomes possible to run transformed code without having to register the transformer at runtime. This is made possible with the addition of the -o command line option.
>
> Victor
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From mojtaba.gharibi at gmail.com  Thu Jan 28 03:29:17 2016
From: mojtaba.gharibi at gmail.com (Mirmojtaba Gharibi)
Date: Thu, 28 Jan 2016 03:29:17 -0500
Subject: [Python-ideas] Respectively and its unpacking sentence
In-Reply-To: References: <20160127055743.GB14190@sjoerdjob.com> <20160127073004.GC14190@sjoerdjob.com>
Message-ID:

On Wed, Jan 27, 2016 at 4:53 PM, Chris Barker wrote:
> On Wed, Jan 27, 2016 at 9:12 AM, Mirmojtaba Gharibi <mojtaba.gharibi at gmail.com> wrote:
>> MATLAB has a built-in easy way of achieving component-wise operation and I think Python would benefit from that without use of libraries such as numpy.
>
> I've always thought there should be component-wise operations in Python. The way to do it now is something like:
>
>     [i + j for i,j in zip(a,b)]
>
> which is really pretty darn wordy, compared to:
>
>     a_numpy_array + another_numpy_array
>
> (similar in matlab).
>
> But maybe an operator is the way to do it. But it was long ago decided not to introduce a full set of extra operators, a la matlab:
>
>     .+
>     .*
>     etc....
>
> rather, it was realized that for numpy, which does element-wise operations by default, matrix multiplication was really the only non-elementwise operation widely used, so the new @ operator was added.
>
> And we're kind of stuck -- even if we added a full set, then in numpy, the regular operators would be element-wise, but for built-in Python sequences, the special ones would be elementwise -- really confusing!
>
> if you really want this, I'd make your own sequences that re-define the operators.

Problem is you always forgo the hassle of subclassing at that exact moment that you need element-wise and just use for loops. So it's almost always not worth the hassle.

> Or just use Numpy... you can use object arrays if you want to handle non-numeric values:
>
> In [4]: a1 = np.array(["this", "that"], dtype=object)
>
> In [5]: a2 = np.array(["some", "more"], dtype=object)
>
> In [6]: a1 + a2
>
> Out[6]: array(['thissome', 'thatmore'], dtype=object)
>
> -CHB
>
> --
> Christopher Barker, Ph.D.
> Oceanographer
>
> Emergency Response Division
> NOAA/NOS/OR&R            (206) 526-6959   voice
> 7600 Sand Point Way NE   (206) 526-6329   fax
> Seattle, WA  98115       (206) 526-6317   main reception
>
> Chris.Barker at noaa.gov
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From mojtaba.gharibi at gmail.com  Thu Jan 28 03:29:38 2016
From: mojtaba.gharibi at gmail.com (Mirmojtaba Gharibi)
Date: Thu, 28 Jan 2016 03:29:38 -0500
Subject: [Python-ideas] Respectively and its unpacking sentence
In-Reply-To: References: <20160127055743.GB14190@sjoerdjob.com> <2CB5C96E-7249-49ED-B6D7-4FB98F59CAF9@yahoo.com> <8FD7A3D2-A438-40A7-98DE-C6D2ED9CC7A7@yahoo.com>
Message-ID:

On Thu, Jan 28, 2016 at 3:26 AM, Mirmojtaba Gharibi <mojtaba.gharibi at gmail.com> wrote:
>
> On Wed, Jan 27, 2016 at 4:15 PM, Andrew Barnert wrote:
>> On Jan 27, 2016, at 09:55, Mirmojtaba Gharibi wrote:
>>
>> On Wed, Jan 27, 2016 at 2:29 AM, Andrew Barnert wrote:
>>> On Jan 26, 2016, at 22:19, Mirmojtaba Gharibi wrote:
>>>
>>> Yes, I'm aware of sequence unpacking.
>>> There is an overlap like you mentioned, but there are things that can't be done with sequence unpacking, but can be done here.
>>>
>>> For example, let's say you're given two lists that are not necessarily numbers, so you can't use numpy, but you want to apply some component-wise operator between each component. This is something you can't do with sequence unpacking or with numpy.
>>>
>>> Yes, you can do it with numpy.
>>>
>>> Obviously you don't get the performance benefits when you aren't using "native" types (like int32) and operations that have vectorized implementations (like adding two arrays of int32 or taking the dot product of float64 matrices), but you do still get the same elementwise operators, and even a way to apply arbitrary callables over arrays, or even other collections:
>>>
>>> firsts = ['John', 'Jane']
>>> lasts = ['Smith', 'Doe']
>>> np.vectorize('{1}, {0}'.format)(firsts, lasts)
array(['Smith, John', 'Doe, Jane'], dtype='<U11')
>>>
>>> I think the form I am suggesting is simpler and more readable.
>>
>> But the form you're suggesting doesn't work for vectorizing arbitrary functions, only for operator expressions (including simple function calls, but that doesn't help for more general function calls). The fact that numpy is a little harder to read for cases that your syntax can't handle at all is hardly a strike against numpy.
>
> I don't need to vectorize the functions. It's already being done. Consider the ; example below:
>
>     a;b = f(x;y)
>
> it is equivalent to
>
>     a=f(x)
>     b=f(y)
>
> So in effect, in your terminology, it is already vectorized. Similar example only with $:
>
>     a=[0,0,0,0]
>     x=[1,2,3,4]
>     $a=f($x)
>
> is equivalent to
>
>     a=[0,0,0,0]
>     x=[1,2,3,4]
>     for i in range(len(a)):
>     ...a[i]=f(x[i])
>
>> And, as I already explained, for the cases where your form _does_ work, numpy already does it, without all the sigils:
>>
>>     c = a + b
>>     c = a*a + 2*a*b + b*b
>>     c = (a * b).sum()
>>
>> It also works nicely over multiple dimensions. For example, if a and b are both arrays of N 3-vectors instead of just being 3-vectors, you can still elementwise-add them just with +; you can sum all of the results with sum(axis=1); etc. How would you write any of those things with your $-syntax?
>>
>> I'm happy you brought vectorize to my attention though. I think as soon as you make the statement just a bit complex, it would become really complicated with vectorize.
>>
>> For example, let's say you have
>>
>>     x=[1,2,3,4,5,...]
>>     y=['A','BB','CCC',...]
>>     p=[2,3,4,6,6,...]
>>     r=[]*n
>>
>>     $r = str(len($y*$p)+$x)
>>
>> As a side note, []*n is always just []. Maybe you meant [None for _ in range(n)] or [None]*n? Also, where does n come from? It doesn't seem to have anything to do with the lengths of x, y, and p. So, what happens if it's shorter than them? Or longer? With numpy, of course, that isn't a problem--there's no magic being attempted on the = operator (which is good, because = isn't an operator in Python, and I'm not sure how you'd even properly define your design, much less implement it); the operators just create arrays of the right length.
>
> n, I just meant symbolically to be len(x). So please replace n with len(x). I didn't mean to confuse you. sorry.
>
>> Anyway, that's still mostly just operators. You _could_ wrap up an operator expression in a function to vectorize, but you almost never want to.
>> Just use the operators directly on the arrays.
>>
>> So, let's try a case that has even some minimal amount of logic, where translating to operators would be clumsy at best:
>>
>>     @np.vectorize
>>     def sillyslice(y, x, p):
>>         if x < p: return y[x:p]
>>         return y[p:x]
>>
>>     r = sillyslice(y, x, p)
>>
>> Being a separate function provides all the usual benefits: sillyslice is reusable, debuggable, unit-testable, usable as a first-class object, etc. But forget that; how would you do this at all with your $-syntax?
>>
>> Since you didn't answer any of my other questions, I'll snip them and repost shorter versions:
>>
>> * what's wrong with using numpy? Nothing. What's wrong even with for loop or assembly for that matter? I didn't argue that it's not possible to achieve these things with assembly.
>> * what's wrong with APL or J or MATLAB? Not sure how relevant it is to our core of conversation. Skipping this.
>> * what's wrong with making the operators elementwise instead of wrapping the objects in some magic thing? The fact that whenever you
>> * what is the type of that magic thing anyway? It has no type. I refer you to my very first email. In that email I exactly explained what it means. It's at best a pseudo-macro or something like that. It exactly is equivalent when you write
>
>     a;b=f(x;y)
>
> to
>
>     a=f(x)
>     b=f(y)
>
> In other words, if I could interpret my code before the Python interpreter interprets it, I would convert the first to the latter.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From leewangzhong+python at gmail.com  Thu Jan 28 06:11:26 2016
From: leewangzhong+python at gmail.com (Franklin? Lee)
Date: Thu, 28 Jan 2016 06:11:26 -0500
Subject: [Python-ideas] Respectively and its unpacking sentence
In-Reply-To: References: <20160127055743.GB14190@sjoerdjob.com> <2CB5C96E-7249-49ED-B6D7-4FB98F59CAF9@yahoo.com> <8FD7A3D2-A438-40A7-98DE-C6D2ED9CC7A7@yahoo.com>
Message-ID:

For personal use, I wrote a class. (Numpy takes a while to load on my machine.)

    vec = Vector([1,2,3])
    vec2 = vec + 5
    lst = vec2.tolist()

I also add attribute access.

    # creates a Vector of methods, which is then __call__'d
    vec_of_lengths = vec2.bit_length()

And multi-level indexing, similar to Numpy, though I think I might want to remove indexing of the Vector itself for consistency.

    vec = Vector([{1:2}, {1:3}, {1:4}])
    values = vec[:, 1]  # Vector([2,3,4])

And multiple versions of `.map()` (which have different ways of interpreting the additional arguments). But this is for personal use. Dirty scripting. (A rough sketch of this kind of wrapper appears below.)

On Thu, Jan 28, 2016 at 3:26 AM, Mirmojtaba Gharibi wrote:
> On Wed, Jan 27, 2016 at 4:15 PM, Andrew Barnert wrote:
>> But the form you're suggesting doesn't work for vectorizing arbitrary functions, only for operator expressions (including simple function calls, but that doesn't help for more general function calls). The fact that numpy is a little harder to read for cases that your syntax can't handle at all is hardly a strike against numpy.
>
> I don't need to vectorize the functions. It's already being done. Consider the ; example below:
>
>     a;b = f(x;y)
>
> it is equivalent to
>
>     a=f(x)
>     b=f(y)
>
> So in effect, in your terminology, it is already vectorized. Similar example only with $:
>
>     a=[0,0,0,0]
>     x=[1,2,3,4]
>     $a=f($x)
>
> is equivalent to
>
>     a=[0,0,0,0]
>     x=[1,2,3,4]
>     for i in range(len(a)):
>     ...a[i]=f(x[i])

Is that really a _benefit_ of this design? "Explicit is better than implicit."
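For the curious, a minimal sketch of that kind of wrapper -- hypothetical code, not the actual class, which has more operators and the .map() variants:

    class Vector(list):
        # scalar broadcast; a real version would also handle vector-vector ops
        def __add__(self, other):
            return Vector(x + other for x in self)
        def tolist(self):
            return list(self)
        def __getattr__(self, name):
            # vec.method -> Vector of bound methods ...
            return Vector(getattr(x, name) for x in self)
        def __call__(self, *args, **kwargs):
            # ... which is then __call__'d elementwise
            return Vector(x(*args, **kwargs) for x in self)

    vec = Vector([1, 2, 3])
    vec2 = vec + 5
    print(vec2.tolist())      # [6, 7, 8]
    print(vec2.bit_length())  # [3, 3, 4]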
>> And, as I already explained, for the cases where your form _does_ work, numpy already does it, without all the sigils:
>>
>>     c = a + b
>>     c = a*a + 2*a*b + b*b
>>     c = (a * b).sum()
>>
>> It also works nicely over multiple dimensions. For example, if a and b are both arrays of N 3-vectors instead of just being 3-vectors, you can still elementwise-add them just with +; you can sum all of the results with sum(axis=1); etc. How would you write any of those things with your $-syntax?
>
> I'm happy you brought vectorize to my attention though. I think as soon as you make the statement just a bit complex, it would become really complicated with vectorize.
>
> For example, let's say you have
>
>     x=[1,2,3,4,5,...]
>     y=['A','BB','CCC',...]
>     p=[2,3,4,6,6,...]
>     r=[]*n
>
>     $r = str(len($y*$p)+$x)
>
> It would be really complex to calculate such a thing with vectorize.

I think you misunderstand the way to use np.vectorize. You would write a function, and then np.vectorize it.

    def some_name(yi, pi, xi):
        return str(len(yi * pi) + xi)

    r = np.vectorize(some_name)(y, p, x)

In terms of readability, it's probably better: you're describing an action you would take, and then (by vectorizing it) saying that you'll want to repeat that action multiple times.

>> * what's wrong with using numpy?
> Nothing. What's wrong even with for loop or assembly for that matter? I didn't argue that it's not possible to achieve these things with assembly.

He's asking what benefit it has over using Numpy. You are proposing a change, and must justify the additional code, API, programmer headroom, and maintenance burden. Why do you want this feature over the existing options?

>> * what is the type of that magic thing anyway?
> It has no type. I refer you to my very first email. In that email I exactly explained what it means. It's at best a pseudo-macro or something like that. It exactly is equivalent when you write
>
>     a;b=f(x;y)
>
> to
>
>     a=f(x)
>     b=f(y)
>
> In other words, if I could interpret my code before the Python interpreter interprets it, I would convert the first to the latter.

That's very magical. Magic is typically bad. Have you considered the cost to people learning to read Python?

I also hate that it doesn't have a type. I don't see a;b = f(x;y) as readable (semicolons look sort of like commas to my weak eyes) or useful (unlike the "$x = f($y)" case). Compare with

    a, b = map(f, (x, y))

Any vectorization syntax allows us to write vectorized expressions without the extra semicolon syntax.

    a, b = f($[x, y])  #or: $(a, b) = f($[x, y])
    a, b = f*(x, y)

    a, b, c = $(x1, y1, z1) + $(x2, y2, z2)

    two_dimensional_array = $(1, 2, 3) * $$(1, 2, 3)
    # == ((1, 2, 3),
    #     (2, 4, 6),
    #     (3, 6, 9))
    # (Though how would you vectorize over two dimensions? Syntax is hard.)

> innerProduct = 0
> innerProduct += $a * $b

I don't like this at all. A single `=` or `+=` meaning an unbounded number of assignments? This piece of code should really just use sum() (see the one-liner below). With theoretical syntax:

    innerProduct = sum(^($a * $b))

where the ^ (placeholder symbol) will force collapse to a list/tuple/iterator so that the vectorization doesn't "escape" and get interpreted as

    innerProduct = [sum(ax * bx) for ax, bx in zip(a, b)]

Anyway, there's a lot to think about, and a lot of potential issues of ambiguity to argue over. That's why I just put these kinds of ideas in my notes about language designs, rather than try to think about how they could fit into Python (or any existing language).
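Coming back to that innerProduct example: the collapsed result is already an explicit one-liner today (a sketch, with made-up values for a and b):

    a = [1, 2, 3]
    b = [4, 5, 6]
    inner_product = sum(ax * bx for ax, bx in zip(a, b))  # 32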
(Speaking of language design and syntax, vectorization syntax is related to lambda literals: they need a way to make sure the implicit parameter doesn't escape. Arc Lisp uses square brackets instead of the normal parentheses for lambdas, which bind the `_` parameter symbol.)

P.S.: In the Gmail editor's bottom-right corner, click the triangle, and set Plain Text Mode. You can also go to Settings -> General and turn on "Reply All" as the default behavior, though this won't set it for mobile.

From steve at pearwood.info  Thu Jan 28 06:20:48 2016
From: steve at pearwood.info (Steven D'Aprano)
Date: Thu, 28 Jan 2016 22:20:48 +1100
Subject: [Python-ideas] Respectively and its unpacking sentence
In-Reply-To: References: <20160128001251.GX4619@ando.pearwood.info>
Message-ID: <20160128112048.GA4619@ando.pearwood.info>

On Wed, Jan 27, 2016 at 05:51:46PM -0800, Andrew Barnert wrote:
> On Jan 27, 2016, at 16:12, Steven D'Aprano wrote:
> >
> > I think you would be better off trying to get better support for vectorized operations into Python:
>
> I really think, at least 90% of the time, and probably a lot more, people are better off just using numpy than reinventing it.

Oh I agree.

[...]
> There are a lot more edge cases than you think. For example, you thought far enough ahead that your sum works column-wise on 2D arrays. But what about when you need to work row-wise?

I thought of all those questions, and honestly I'm not sure what the right answer is. But the nice thing about writing code for the simple use-cases is that you don't have to worry about the hard use-cases :-)

I'm mostly influenced by the UI of calculators like the HP-48GX and the TI CAS calculators, and typically they don't give you the option. If you want to do an average across the row, transpose your data :-)

> And after all that, what would be the benefit? I no longer have to install numpy--but now I have to install pyvec instead. Which is just a less-featureful, less-tested, less-optimized, and less-refined numpy.

True, true, and for many people that's probably a deal-breaker. But for others, more features == more things you don't understand and don't know why you would ever need them.

Anyway, I'm not proposing that any of this should end up in the stdlib, so while I could waffle on for hours, I should bring this to a close before it goes completely off-topic.

--
Steve

From srkunze at mail.de  Thu Jan 28 11:50:32 2016
From: srkunze at mail.de (Sven R. Kunze)
Date: Thu, 28 Jan 2016 17:50:32 +0100
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: References: <569A2452.1000709@gmail.com> <20160116162235.GB3208@sjoerdjob.com>
Message-ID: <56AA46D8.6090203@mail.de>

Some feedback on: https://www.python.org/dev/peps/pep-0511/#usage-3-disable-all-optimization

Where do I put this specific piece of code (sys.set_code_transformers([]))?

Best,
Sven

From victor.stinner at gmail.com  Thu Jan 28 11:53:35 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Thu, 28 Jan 2016 17:53:35 +0100
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <56AA46D8.6090203@mail.de> References: <569A2452.1000709@gmail.com> <20160116162235.GB3208@sjoerdjob.com> <56AA46D8.6090203@mail.de>
Message-ID:

2016-01-28 17:50 GMT+01:00 Sven R. Kunze :
> Some feedback on:
> https://www.python.org/dev/peps/pep-0511/#usage-3-disable-all-optimization
>
> Where do I put this specific piece of code (sys.set_code_transformers([]))?
It's better to use the -o noopt command, but if you want to call directly sys.set_code_transformers(), you have to call it before the first import. Example of app.py:

--
import sys

sys.set_code_transformers([])
import module
module.main()
--

Victor

From srkunze at mail.de  Thu Jan 28 11:57:08 2016
From: srkunze at mail.de (Sven R. Kunze)
Date: Thu, 28 Jan 2016 17:57:08 +0100
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: References: <569A2452.1000709@gmail.com> <20160116162235.GB3208@sjoerdjob.com> <56AA46D8.6090203@mail.de>
Message-ID: <56AA4864.7030004@mail.de>

On 28.01.2016 17:53, Victor Stinner wrote:
> 2016-01-28 17:50 GMT+01:00 Sven R. Kunze :
>> Some feedback on:
>> https://www.python.org/dev/peps/pep-0511/#usage-3-disable-all-optimization
>>
>> Where do I put this specific piece of code (sys.set_code_transformers([]))?
> It's better to use the -o noopt command, but if you want to call directly
> sys.set_code_transformers(), you have to call it before the first
> import. Example of app.py:
> --
> import sys
>
> sys.set_code_transformers([])
> import module
> module.main()
> --

I suspected that. So, where is this place of "before the first" import?

From victor.stinner at gmail.com  Thu Jan 28 12:03:58 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Thu, 28 Jan 2016 18:03:58 +0100
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <56AA4864.7030004@mail.de> References: <569A2452.1000709@gmail.com> <20160116162235.GB3208@sjoerdjob.com> <56AA46D8.6090203@mail.de> <56AA4864.7030004@mail.de>
Message-ID:

2016-01-28 17:57 GMT+01:00 Sven R. Kunze :
> I suspected that. So, where is this place of "before the first" import?

I don't understand your question.

I guess that your real question is: are stdlib modules loaded with the peephole optimizer enabled or not?

If you use -o noopt, you are safe: the peephole optimizer is disabled before the first Python import.

If you use sys.set_code_transformers([]) in your code, it's likely that Python already imported 20 or 40 modules during its initialization (especially in the site module).

It's up to you to pick the best option. There are different usages for each option. Maybe you just don't care about the stdlib, you only want to debug your application code, so it doesn't matter how the stdlib is optimized?

--

Or are you asking me to remove sys.set_code_transformers([]) from the section "Usage 3: Disable all optimization"? I don't understand.

Victor

From srkunze at mail.de  Thu Jan 28 12:46:06 2016
From: srkunze at mail.de (Sven R. Kunze)
Date: Thu, 28 Jan 2016 18:46:06 +0100
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: References: <569A2452.1000709@gmail.com> <20160116162235.GB3208@sjoerdjob.com> <56AA46D8.6090203@mail.de> <56AA4864.7030004@mail.de>
Message-ID: <56AA53DE.4010600@mail.de>

On 28.01.2016 18:03, Victor Stinner wrote:
> I don't understand your question.
>
> I guess that your real question is: are stdlib modules loaded with the peephole optimizer enabled or not?
>
> If you use -o noopt, you are safe: the peephole optimizer is disabled before the first Python import.
>
> If you use sys.set_code_transformers([]) in your code, it's likely that Python already imported 20 or 40 modules during its initialization (especially in the site module).
>
> It's up to you to pick the best option. There are different usages for each option. Maybe you just don't care about the stdlib, you only want to debug your application code, so it doesn't matter how the stdlib is optimized?
> --
>
> Or are you asking me to remove sys.set_code_transformers([]) from the section "Usage 3: Disable all optimization"? I don't understand.

That is exactly the issue with setting a transformer at runtime which I don't understand. That is one weakness of the PEP; some people already proposed to make a difference between

- local transformation
- global transformation

I can understand the motivation to have the same API for both, but it's inherently different and it makes talking about it hard (as we can see now). I would like to have this clarified in the PEP (use consistent wording) or even split it up into two different parts of the PEP.

You said I would need to call the function before all imports. Why is that? Can I not call it twice in the same file? Or in a loop? What will happen? Will the file get recompiled each time?

Some people proposed a "from __extensions__ import my_extension"; inspired by __future__ imports, i.e. it is forced to be at the top. Why? Because it somehow makes sense to perform all transformations the first time a file is loaded. I don't see that addressed in the PEP. I have to admit I would prefer this kind of usage over a function call.

Furthermore:
- we already have import hooks. They can be used for local transformation. I don't see that addressed in the PEP.
- after re-reading the PEP I have some difficulties to see how to activate, say, 2 custom transformers **globally** (via -o). Maybe, adding an example would help here.

From jimjjewett at gmail.com  Thu Jan 28 14:51:17 2016
From: jimjjewett at gmail.com (Jim J. Jewett)
Date: Thu, 28 Jan 2016 11:51:17 -0800 (PST)
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To:
Message-ID: <56aa7135.a21c8c0a.11ae0.ffff9371@mx.google.com>

On Wed Jan 27 15:49:12 EST 2016, Andrew Barnert wrote:
> both C# and Ruby made breaking changes from the Python behavior to the Swift behavior, because they couldn't find any legitimate code that would be broken by that change. And there have been few if any complaints since. If we really are considering adding something like "for let", we should seriously consider whether anyone would ever have a good reason to use "for" instead of "for let". If not, just change "for" instead.

The first few times I saw this, I figured Python had a stronger (and longer) backwards compatibility guarantee. But now that I consider the actual breakage, I'm not so sure...

    >>> for i in range(10):
            print(i)
            i=i+3
            print(i)

i is explicitly changed, but it doesn't affect the flow control -- it gets reset to the next sequence item as if nothing had happened. It would break things to hide the final value of i after the loop is over, but that isn't needed.

I think the only way it even *could* matter is if the loop variable is captured in a closure each time through the loop. What would it look like for the current behavior to be intentional?

    >>> for cache in (4, 5, 6, {}):
            def f():
                cache['haha!'] = "I know only the last will really get used!"
            funcs.append(f)

-jJ

--
If there are still threading problems with my replies, please email me with details, so that I can try to resolve them. -jJ

From ethan at stoneleaf.us  Thu Jan 28 15:09:00 2016
From: ethan at stoneleaf.us (Ethan Furman)
Date: Thu, 28 Jan 2016 12:09:00 -0800
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <56aa7135.a21c8c0a.11ae0.ffff9371@mx.google.com> References: <56aa7135.a21c8c0a.11ae0.ffff9371@mx.google.com>
Message-ID: <56AA755C.8070603@stoneleaf.us>

On 01/28/2016 11:51 AM, Jim J.
Jewett wrote: > I think the only way it even *could* matter is if the loop variable is > captured in a closure each time through the loop. What would it > look like for the current behavior to be intentional? > > >>> for cache in (4, 5, 6, {}): > def f(): > cache['haha!'] = "I know only the last will really get used!" > funcs.append(f) I think that falls into the "not legitimate" category. ;) -- ~Ethan~ From brett at python.org Thu Jan 28 15:36:34 2016 From: brett at python.org (Brett Cannon) Date: Thu, 28 Jan 2016 20:36:34 +0000 Subject: [Python-ideas] PEP 511: Add a check function to decide if a "language extension" code transformer should be used or not In-Reply-To: References: Message-ID: On Wed, 27 Jan 2016 at 13:34 Andrew Barnert wrote: > On Jan 27, 2016, at 12:20, Brett Cannon wrote: > > On Wed, 27 Jan 2016 at 08:49 Andrew Barnert via Python-ideas < > python-ideas at python.org> wrote: > > >> If someone builds a transformer that only runs on files with a different >> extension, he already needs an import hook, so he might as well just call >> his transformer from the input hook, same as he does today. >> > > And the import hook is not that difficult. > > > Unless it has to work in 2.7 and 3.3 (or, worse, 2.6 and 3.2). :) > Sure, but you're already asking for a lot of pain if you're trying to be that compatible at the AST/bytecode level so I view this as the least of your worries. :) > > You can reuse everything from importlib without modification except for > needing to override a single method in some loader to do your > transformation > > > Yes, as of 3.4, the design is amazing. In fact, hooking any level--lookup, > source, AST, bytecode, or pyc--is about as easy as it could be. > > My only complaint is that it's not easy enough to find out how easy import > hooks are. When I tell people "you could write a simple import hook to play > with that idea", they get a look of fear and panic that's completely > unwarranted and just drop their cool idea. (I wonder if having complete > examples of a simple global-transformer hook and a simple special-extension > hook at the start of the docs would be enough to solve that problem?) > So two things. One is that there is an Examples section in the importlib docs for 3.6: https://docs.python.org/3.6/library/importlib.html#examples . As of right now it only covers use-cases that the `imp` module provided since that's the most common thing I get asked about. Second, while it's much easier than it has ever been to do fancy stuff with import, it's a balancing act of promoting it and discouraging it. :) Mess up your import and it can be rather hard to debug. And this is especially true if you hook in early enough such that you start to screw up stuff in the stdlib and not just your own code. It can also lead to people going a bit overboard with things (hence why I kept my life simple with the LazyLoader and actively discourage its use unless you're sure you need it). So it's a balance of "look at this shiny thing!" and "be careful because you might come out screaming". > > And I'm a bit worried that if Victor tries to make things like MacroPy and > Hy easier, it still won't be enough for real-life cases, so all it'll do is > discourage people from going right to writing import hooks and seeing how > easy that already is. > We don't need to empower every use-case as much as possible. While we're consenting adults, we also try to prevent people from making their own lives harder. All of this stuff is a tough balancing act to get right. 
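For reference, the LazyLoader I mean gets used roughly like this -- a sketch along the lines of the importlib docs, not code anyone should copy blindly:

    import importlib.util
    import sys

    def lazy_import(name):
        # swap in a LazyLoader so the module body doesn't execute until
        # the first attribute access
        spec = importlib.util.find_spec(name)
        loader = importlib.util.LazyLoader(spec.loader)
        spec.loader = loader
        module = importlib.util.module_from_spec(spec)
        sys.modules[name] = module
        loader.exec_module(module)
        return module

    difflib = lazy_import("difflib")  # nothing has run yet
    print(difflib.unified_diff)       # first attribute access triggers the import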
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From brett at python.org  Thu Jan 28 15:44:09 2016
From: brett at python.org (Brett Cannon)
Date: Thu, 28 Jan 2016 20:44:09 +0000
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: References: <569A2452.1000709@gmail.com> <20160116162235.GB3208@sjoerdjob.com>
Message-ID:

On Wed, 27 Jan 2016 at 20:57 Kevin Conway wrote:
> I'm willing to take this conversation offline as it seems this thread has cooled down quite a bit.
>
> I would still like to hear more, though, about how adding this as a facility in the language improves over the current, external implementations of Python code optimizers. Python already has tools for reading in source files, parsing them into AST, modifying that AST, and writing the final bytecode to files as part of the standard library. I don't see anything in PEP0511 that improves upon that.
>
> Out of curiosity, do you consider this PEP as adding something to Python that didn't previously exist, or do you consider this PEP to be more aligned with PEP0249 (DB-API 2.0) and PEP0484 (Type Hints), which are primarily designed to marshal the community in a common direction? I understand that you have other PEPs in flight that are designed to make certain optimizations easier (or possible). Looking at this PEP in isolation, however, leaves me wanting more explanation as to its value.

The PEP is about empowering people to write AST transformers without having to use third-party tools to integrate it into their workflow. As you pointed out, there is very little here that isn't possible today with some toolchain that reads Python source code, translates it into an AST, optimizes it, and then writes out the .pyc file. But that all does require going to PyPI or writing your own solution. But if Victor's PEP gets in, then there will be a standard hook point that all Python code will go through which will make adding AST transformers much easier. Whether this ease of use is beneficial is part of the discussion around this PEP.

> You mention the need for monkey-patching or hooking into the import process as a part of the rationale. The PyCC project, while it may not be the best example for optimizer design, does not need to patch or hook into anything to function. Instead, it acts as an alternative bytecode compiler that drops .pyc just like the standard compiler would. Other than the trade-off of using a 3rd party library versus adding a -o flag, what significant advantage does a sys.add_optimizer() call provide?

The -o addition is probably the biggest thing the PEP is proposing. The overwriting of .pyc files with optimizations that are not necessarily expected is not the best, so -o would allow for stopping the abuse of .pyc file naming. The AST registration parts are all just to make this stuff easier.

-Brett

> Again, I'm very much behind your motivation and hope you are incredibly successful in making Python a faster place to live. I'm only trying to get in your head and see what you see.
>
> On Wed, Jan 27, 2016 at 10:45 AM Victor Stinner wrote:
>> Hi,
>>
>> 2016-01-16 17:56 GMT+01:00 Kevin Conway :
>> > I'm a big fan of your motivation to build an optimizer for cPython code. What I'm struggling with is understanding why this requires a PEP and language modification. There are already several projects that manipulate the AST for performance gains such as [1] or even my own ham fisted attempt [2].
>> Oh cool, I didn't know PyCC [2]! I added it to the PEP 511 in the AST optimizers section of Prior Art.
>>
>> I wrote astoptimizer [1] and this project uses monkey-patching of the compile() function, I mentioned this monkey-patching hack in the rationale of the PEP:
>> https://www.python.org/dev/peps/pep-0511/#rationale
>>
>> I would like to avoid monkey-patching because it causes various issues.
>>
>> The PEP 511 also makes transformations more visible: transformers are explicitly registered in sys.set_code_transformers() and the .pyc filename is modified when the code is transformed.
>>
>> It also adds a new feature: it becomes possible to run transformed code without having to register the transformer at runtime. This is made possible with the addition of the -o command line option.
>>
>> Victor

> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From abarnert at yahoo.com  Thu Jan 28 16:13:08 2016
From: abarnert at yahoo.com (Andrew Barnert)
Date: Thu, 28 Jan 2016 21:13:08 +0000 (UTC)
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: References:
Message-ID: <1709231659.1496580.1454015588991.JavaMail.yahoo@mail.yahoo.com>

On Thursday, January 28, 2016 12:44 PM, Brett Cannon wrote:

>On Wed, 27 Jan 2016 at 20:57 Kevin Conway wrote:
>>Out of curiosity, do you consider this PEP as adding something to Python that didn't previously exist, or do you consider this PEP to be more aligned with PEP0249 (DB-API 2.0) and PEP0484 (Type Hints), which are primarily designed to marshal the community in a common direction? I understand that you have other PEPs in flight that are designed to make certain optimizations easier (or possible). Looking at this PEP in isolation, however, leaves me wanting more explanation as to its value.
>
>The PEP is about empowering people to write AST transformers without having to use third-party tools to integrate it into their workflow. As you pointed out, there is very little here that isn't possible today with some toolchain that reads Python source code, translates it into an AST, optimizes it, and then writes out the .pyc file. But that all does require going to PyPI or writing your own solution.

This kind of talk worries me. It's _already_ very easy to write AST transformers. There's no need for any third-party code from PyPI, and that "your own solution" that you have to write is a few lines of trivial code.

I think a lot of people don't realize this. Maybe because they tried it in 2.6 or 3.2, where it was a lot harder, or because they read the source to MacroPy (which is compatible with 2.6 and 3.2, or at least originally was), where it looks very hard, or maybe just because they didn't realize how much work has already been put in to make it easy. But whatever the reason, they're wrong. And so they're expecting this PEP to solve a problem that doesn't need to be solved.

>But if Victor's PEP gets in, then there will be a standard hook point that all Python code will go through which will make adding AST transformers much easier. Whether this ease of use is beneficial is part of the discussion around this PEP.

There already is a standard hook point that all Python code goes through.
Writing an AST transformer is as simple as replacing the code that compiles source to bytecode with a 3-line function that compiles source to AST, calls your transformer, and compiles AST to bytecode. Processing source or bytecode instead of AST is just as easy (actually, one line shorter).

Where it gets tricky is all the different variations on what you hook and how. Do you want to intercept all .py files? Or add a new extension, like .hy, instead? Or all source files, but only if they start with a magic marker line? How do you want to integrate with naming, finding, obsoleting, reading, and writing .pyc files? What about -O? And so on. And how do you want to work together with other libraries trying to do the same thing, which may have made slightly different decisions? Once you decide what you want, it's another few lines to write and install the hook that does that--the hard part is deciding what you want.

If this PEP can solve the hard part in a general way, so that the right thing to do for different kinds of transformers will be obvious and easy, that would be great. If it can't do so, then it just shouldn't bother with anything that doesn't fit into its model of global semantic-free transformations. And that would also be great--making global semantic-free transformers easy is already a huge boon even if it doesn't do anything else, and keeping the design for that as simple as possible is better than making it more complex to partially solve other things in a way that only helps with the easiest parts.

From abarnert at yahoo.com  Thu Jan 28 17:01:53 2016
From: abarnert at yahoo.com (Andrew Barnert)
Date: Thu, 28 Jan 2016 22:01:53 +0000 (UTC)
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <56aa7135.a21c8c0a.11ae0.ffff9371@mx.google.com> References: <56aa7135.a21c8c0a.11ae0.ffff9371@mx.google.com>
Message-ID: <676720629.1493492.1454018513525.JavaMail.yahoo@mail.yahoo.com>

On Thursday, January 28, 2016 11:51 AM, Jim J. Jewett wrote:
>
> On Wed Jan 27 15:49:12 EST 2016, Andrew Barnert wrote:
>
>> both C# and Ruby made breaking changes from the Python behavior ...
>
> The first few times I saw this, I figured Python had a stronger (and longer) backwards compatibility guarantee.

Ruby, sure, but C#, I don't think so. Most of the worst warts in C# 6.0 are there for backward compatibility.[1]

> But now that I consider the actual breakage, I'm not so sure...
>
>     >>> for i in range(10):
>             print(i)
>             i=i+3
>             print(i)
>
> i is explicitly changed, but it doesn't affect the flow control -- it gets reset to the next sequence item as if nothing had happened.

Yeah, that confusion is actually a separate issue. Explaining it in text is a bit difficult, but let's translate to the equivalent while loop:

    _it = iter(range(10))
    try:
        while True:
            i = next(_it)
            print(i)
            i=i+3
            print(i)
    except StopIteration:
        pass

Now it should be obvious why you aren't affecting the control flow. And it should also be obvious why the "for let" change wouldn't make any difference here.

Could Python solve that confusion? Sure. Swift, Scala, and lots of other languages make the loop variable constant/read-only/l-immutable/whatever, so that "i=i+3" either fails to compile, or raises at runtime, with a "ConstError". The idea is that "i=i+3" is more often a confusing bug than intentional--and, when it is intentional, the workaround is trivial (just write "j=i+3" and use j). But in dynamic languages, const tends to be more annoying than useful, so the smart ones (like Python) don't bother with it.
[1] For example: Non-generic Task makes type inference for generic Task<T> much harder, and isn't used except by accident, but they added it anyway, in C# 5 in 2012, because it was needed for consistency with the non-generic collections, which have been deprecated since C# 2 in 2005 but can't be removed because some code might break.

From greg.ewing at canterbury.ac.nz  Thu Jan 28 19:27:26 2016
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 29 Jan 2016 13:27:26 +1300
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <56AA53DE.4010600@mail.de> References: <569A2452.1000709@gmail.com> <20160116162235.GB3208@sjoerdjob.com> <56AA46D8.6090203@mail.de> <56AA4864.7030004@mail.de> <56AA53DE.4010600@mail.de>
Message-ID: <56AAB1EE.5010805@canterbury.ac.nz>

Sven R. Kunze wrote:
> Some people proposed a "from __extensions__ import my_extension"; inspired by __future__ imports, i.e. it is forced to be at the top. Why? Because it somehow makes sense to perform all transformations the first time a file is loaded.

It occurs to me that a magic import for applying local transformations could itself be implemented using a global transformer.

--
Greg

From steve at pearwood.info  Thu Jan 28 20:01:58 2016
From: steve at pearwood.info (Steven D'Aprano)
Date: Fri, 29 Jan 2016 12:01:58 +1100
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <1709231659.1496580.1454015588991.JavaMail.yahoo@mail.yahoo.com> References: <1709231659.1496580.1454015588991.JavaMail.yahoo@mail.yahoo.com>
Message-ID: <20160129010158.GC4619@ando.pearwood.info>

On Thu, Jan 28, 2016 at 09:13:08PM +0000, Andrew Barnert via Python-ideas wrote:
> This kind of talk worries me. It's _already_ very easy to write AST transformers. There's no need for any third-party code from PyPI, and that "your own solution" that you have to write is a few lines of trivial code.
>
> I think a lot of people don't realize this.

I don't realise this.

Not that I don't believe you, but I'd like to see a tutorial that goes through this step by step and actually explains what this is all about. Or, if it really is just a matter of a few lines, even just a simple example might help.

For instance, the PEP includes a transformer that changes all string literals to "Ni! Ni! Ni!". Obviously it doesn't work as sys.set_code_transformers doesn't exist yet, but if I'm understanding you, we don't need that because it's already easy to apply that transformer. Can you show how? Something that works today?

--
Steve

From victor.stinner at gmail.com  Thu Jan 28 19:57:02 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Fri, 29 Jan 2016 01:57:02 +0100
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <56AAB1EE.5010805@canterbury.ac.nz> References: <569A2452.1000709@gmail.com> <20160116162235.GB3208@sjoerdjob.com> <56AA46D8.6090203@mail.de> <56AA4864.7030004@mail.de> <56AA53DE.4010600@mail.de> <56AAB1EE.5010805@canterbury.ac.nz>
Message-ID:

A local transformation requires registering a global code transformer, but it doesn't mean that all files will be modified. The code transformer can use various kinds of checks to decide if a file must be transformed and then which parts of the code should be transformed. Decorators were suggested as a good granularity.

Victor

2016-01-29 1:27 GMT+01:00 Greg Ewing :
> Sven R. Kunze wrote:
>> Some people proposed a "from __extensions__ import my_extension"; inspired by __future__ imports, i.e. it is forced to be at the top. Why?
>> Because it somehow makes sense to perform all transformations the first time a file is loaded.
>
> It occurs to me that a magic import for applying local transformations could itself be implemented using a global transformer.
>
> --
> Greg
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

From abarnert at yahoo.com  Thu Jan 28 22:10:39 2016
From: abarnert at yahoo.com (Andrew Barnert)
Date: Fri, 29 Jan 2016 03:10:39 +0000 (UTC)
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <20160129010158.GC4619@ando.pearwood.info> References: <20160129010158.GC4619@ando.pearwood.info>
Message-ID: <65429619.1655004.1454037039310.JavaMail.yahoo@mail.yahoo.com>

On Thursday, January 28, 2016 5:07 PM, Steven D'Aprano wrote:
>
> On Thu, Jan 28, 2016 at 09:13:08PM +0000, Andrew Barnert via Python-ideas wrote:
>
>> This kind of talk worries me. It's _already_ very easy to write AST transformers. There's no need for any third-party code from PyPI, and that "your own solution" that you have to write is a few lines of trivial code.
>>
>> I think a lot of people don't realize this.
>
> I don't realise this.
>
> Not that I don't believe you, but I'd like to see a tutorial that goes through this step by step and actually explains what this is all about. Or, if it really is just a matter of a few lines, even just a simple example might help.

I agree, but someone (Brett?) on one of these threads explained that they don't include such a tutorial in the docs because they don't want to encourage people to screw around with import hooks too much, so...

Anyway, I wrote a blog post about this last year (http://stupidpythonideas.blogspot.com/2015/06/hacking-python-without-hacking-python.html), but I'll summarize it here. I'll show the simplest code for hooking in a source, AST, or bytecode transformer, not the most production-ready.

> For instance, the PEP includes a transformer that changes all string literals to "Ni! Ni! Ni!". Obviously it doesn't work as sys.set_code_transformers doesn't exist yet, but if I'm understanding you, we don't need that because it's already easy to apply that transformer. Can you show how? Something that works today?

Sure. Here's an AST transformer:

    import ast
    import importlib
    import importlib.machinery
    import sys

    class NiTransformer(ast.NodeTransformer):
        def visit_Str(self, node):
            node.s = 'Ni! Ni! Ni!'
            return node

Here's a complete loader implementation that uses the hook:

    class NiLoader(importlib.machinery.SourceFileLoader):
        def source_to_code(self, data, path, *, _optimize=-1):
            source = importlib._bootstrap.decode_source(data)
            tree = NiTransformer().visit(ast.parse(source, path, 'exec'))
            return compile(tree, path, 'exec')

Now, how do you install the hook? That depends on what exactly you want to do. Let's say you want to make it globally hook all .py files, be transparent to .pyc generation, and ignore -O, and you'd prefer a monkeypatch hack that works on all versions 3.3+, rather than a clean spec-based finder that requires 3.5. Here goes:

    finder = sys.meta_path[-1]
    loader = finder.find_module(__file__)
    loader.source_to_code = NiLoader.source_to_code

Just put all this code in your top level script, or just put it in a module and import that in your top level script, either way before importing anything else.
(And yes, "before importing anything else" means some bits of the stdlib end up processed and some don't, just as with PEP 511.) You can see it in action at https://github.com/abarnert/nihack PEP 511 writes the NiLoader part for you, but, as you can see, that's the easiest part of the whole thing. If you want all the exact same choices that the PEP makes (global, .py files only, insert name into .pyc files, integrate with -O and -o, promise to be semantically neutral, etc.), it also makes the last part trivial, which is a much bigger deal. If you want any different choices, it doesn't help with the last part at all. (And I think that's fine, as long as that's the intention. Right now, someone has to have some idea of what they're doing to use my hack, and that's probably a good thing, right? And if I want to clean it up and make it distributable, like MacroPy, I'd better know how to write a spec finder or I have no business distributing any such thing. But if people want to experiment with optimizers that don't actually change the behavior of their code, that's a lot safer, so it seems reasonable that we should focus on making that easier.) From abarnert at yahoo.com Thu Jan 28 22:30:12 2016 From: abarnert at yahoo.com (Andrew Barnert) Date: Fri, 29 Jan 2016 03:30:12 +0000 (UTC) Subject: [Python-ideas] PEP 511: API for code transformers In-Reply-To: <65429619.1655004.1454037039310.JavaMail.yahoo@mail.yahoo.com> References: <20160129010158.GC4619@ando.pearwood.info> <65429619.1655004.1454037039310.JavaMail.yahoo@mail.yahoo.com> Message-ID: <6999559.1612499.1454038212658.JavaMail.yahoo@mail.yahoo.com> On Thursday, January 28, 2016 7:10 PM, Andrew Barnert wrote: Immediately after sending that, I realized that Victor's PEP uses a bytecode transform rather than an AST transform. That isn't much harder to do today. Here's a quick, untested version: def ni_transform(c): consts = [] for const in c.co_consts: if isinstance(c, str): consts.append('Ni! Ni! Ni!') elif isinstance(c, types.CodeType): consts.append(ni_transform(const)) else: consts.append(const) return types.CodeType( c.co_argcount, c.co_kwonlyargcount, c.co_nlocals, c.co_stacksize, c.co_flags, c.co_code, tuple(consts), c.co_names, c.co_varnames, c.co_filename, c.co_name, c.co_firstlineno, c.co_lnotab, c.co_freevars, c.co_cellvars) class NiLoader(importlib.machinery.SourceFileLoader): def source_to_code(self, data, path, *, _optimize=-1): return ni_transform(compile(data, path, 'exec')) You may still need the decode_source bit, at least on some of the Python versions; I can't remember. If so, add that one line from the AST version. Installing the hook is the same as the AST version. You may notice that I have that horrible 18-argument constructor, and the PEP doesn't. But that's because the PEP is basically cheating with this example. For some reason, it passes 3 of those arguments separately--consts, names, and lnotab. If you modify anything else, you'll need the same horrible constructor. And, in any realistic bytecode transformer, you will need to modify something else. For example, you may want to transform the bytecode. And meanwhile, once you start actually transforming bytecode, that becomes the hard part, and PEP 511 won't help you there. If you just want to replace every LOAD_GLOBAL with a LOAD_CONST, you can do that in a pretty simple loop with a bit of help from the dis module. 
But if you want to insert and delete bytecodes like the existing peephole optimizer in C does, then you're also dealing with renumbering jump targets and rebuilding the lnotab and other fun things. And if you start dealing with opcodes that change the stack effect nonlocally, like with and finally handlers, you'd have to be an idiot or a masochist to not reach for a third-party library like byteplay. (I know this because I'm enough of an idiot to have done it once, but not enough of an idiot or a masochist to do it again...). So, again, PEP 511 isn't helping with the hard part. But, again, I think that may be fine. (Someone who knows how to use byteplay well enough to build a semantically-neutral optimizer function decorator, I'll trust him to be able to turn that into a global optimizer with one line of code. But if he wants to hook things in transparently to .pyc files, or to provide actual language extensions, or something like that, I think it's OK to make him do a bit more work before he can give it to me as production-ready code.) From mike at selik.org Thu Jan 28 22:55:42 2016 From: mike at selik.org (Michael Selik) Date: Thu, 28 Jan 2016 21:55:42 -0600 Subject: [Python-ideas] A bit meta Message-ID: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> One defect of a mailing list is the difficulty of viewing a weighted average of opinions. The benefit is that anyone can voice an opinion. This is more like the Senate than the House -- Rhode Island appears (on paper) to have as much influence as California. Luckily, we have a form of President. I'm guessing a House occurs in a more private mode of communication? Perhaps as the community gets larger, a system like StackOverflow might be a better tool for handling things like Python-Ideas. > On Jan 27, 2016, at 12:58 PM, Sjoerd Job Postmus wrote: > (not sure if I even have the right to vote here, given that I'm not a > core developer, but just giving my opinion) From steve at pearwood.info Fri Jan 29 03:03:50 2016 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 29 Jan 2016 19:03:50 +1100 Subject: [Python-ideas] A bit meta In-Reply-To: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> Message-ID: <20160129080349.GD4619@ando.pearwood.info> On Thu, Jan 28, 2016 at 09:55:42PM -0600, Michael Selik wrote: > One defect of a mailing list is the difficulty of viewing a weighted > average of opinions. The benefit is that anyone can voice an opinion. > This is more like the Senate than the House -- Rhode Island appears > (on paper) to have as much influence as California. Luckily, we have a > form of President. I'm guessing a House occurs in a more private mode > of communication? The Python community is not a democracy. Voting +1, -1 etc. should not be interpreted as *actual* votes that need to counted and averaged, but as personal opinions intended to give other members of the community an idea of whether or not you would like to see a proposed feature. -- Steve From encukou at gmail.com Fri Jan 29 04:11:26 2016 From: encukou at gmail.com (Petr Viktorin) Date: Fri, 29 Jan 2016 10:11:26 +0100 Subject: [Python-ideas] A bit meta In-Reply-To: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> Message-ID: <56AB2CBE.7030404@gmail.com> On 01/29/2016 04:55 AM, Michael Selik wrote: > One defect of a mailing list is the difficulty of viewing a weighted average of opinions. The benefit is that anyone can voice an opinion. 
> This is more like the Senate than the House -- Rhode Island appears (on paper) to have as much influence as California. Luckily, we have a form of President. I'm guessing a House occurs in a more private mode of communication?

The Python community is not a democracy. Voting +1, -1 etc. should not be interpreted as *actual* votes that need to be counted and averaged, but as personal opinions intended to give other members of the community an idea of whether or not you would like to see a proposed feature.

--
Steve

From encukou at gmail.com  Fri Jan 29 04:11:26 2016
From: encukou at gmail.com (Petr Viktorin)
Date: Fri, 29 Jan 2016 10:11:26 +0100
Subject: [Python-ideas] A bit meta
In-Reply-To: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org>
Message-ID: <56AB2CBE.7030404@gmail.com>

On 01/29/2016 04:55 AM, Michael Selik wrote:
> One defect of a mailing list is the difficulty of viewing a weighted average of opinions. The benefit is that anyone can voice an opinion. This is more like the Senate than the House -- Rhode Island appears (on paper) to have as much influence as California. Luckily, we have a form of President. I'm guessing a House occurs in a more private mode of communication?

I've read up a bit on Wikipedia, so I'll try to start summarizing the reference for the non-Americans who come after me.

One part of the US government is the "Congress", which is divided into two "houses": the "Senate" and the House of Representatives (which, I assume, is *the* "House"). Members of the House correspond to "districts", which are determined by population (roughly, but the details seem irrelevant here) -- so each member of the House corresponds roughly to some fixed number of people. On the other hand, the Senate has two members for each "state", but states aren't determined by population: "Rhode Island" has many fewer people than "California". (Unsurprising, I might add: I never hear about Rhode Island, but California makes it to local news here at times.) There is also a "President", who doesn't seem to have as much power as Python's BDFL: he/she can veto decisions of the Congress, but that veto can in turn be overridden by the Congress.

Trying to hold all these details in my head while thinking how they relate to mailing list discussions leaves me quite confused. Would it be possible to make the argument clearer to people who need to look these things up to understand it?

> Perhaps as the community gets larger, a system like StackOverflow might be a better tool for handling things like Python-Ideas.
>
>> On Jan 27, 2016, at 12:58 PM, Sjoerd Job Postmus wrote:
>> (not sure if I even have the right to vote here, given that I'm not a core developer, but just giving my opinion)

From ncoghlan at gmail.com  Fri Jan 29 09:10:02 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 30 Jan 2016 00:10:02 +1000
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <6999559.1612499.1454038212658.JavaMail.yahoo@mail.yahoo.com> References: <20160129010158.GC4619@ando.pearwood.info> <65429619.1655004.1454037039310.JavaMail.yahoo@mail.yahoo.com> <6999559.1612499.1454038212658.JavaMail.yahoo@mail.yahoo.com>
Message-ID:

On 29 January 2016 at 13:30, Andrew Barnert via Python-ideas wrote:
> So, again, PEP 511 isn't helping with the hard part. But, again, I think that may be fine. (Someone who knows how to use byteplay well enough to build a semantically-neutral optimizer function decorator, I'll trust him to be able to turn that into a global optimizer with one line of code. But if he wants to hook things in transparently to .pyc files, or to provide actual language extensions, or something like that, I think it's OK to make him do a bit more work before he can give it to me as production-ready code.)
For semantically significant transforms, scope of application is inherent complexity, as it affects code readability, and may even be an error if applied inappropriately. This is why: - the finer-grained control offered by decorators is often preferred to metaclasses or import hooks - custom file extensions or in-file markers are typically used to opt in to import hook processing In these cases, whether or not the standard library is processed doesn't matter, since it will never use the relevant decorator, file extension or in-file marker. You also don't need to worry about subtle start-up bugs, since if the decorator isn't imported, or the relevant import hook isn't installed appropriately, then the code that depends on that happening simply won't run. This means the only code transformation cases where determining scope of applicability turns out to be *incidental* complexity are those that are intended to be semantically neutral operations. Maybe you're collecting statistics on opcode frequency, maybe you're actually applying safe optimisations, maybe you're doing something else, but the one thing you're promising is that if the transformation breaks code that works without the transformation applied, then it's a *bug in the transformer*, not the code being transformed. In these cases, you *do* care about whether or not the standard library is processed, so you want an easy way to say "I want to process *all* the code, wherever it comes from". At the moment, that easy way doesn't exist, so you either give up, or you mess about with the encodings.py hack. PEP 511 erases that piece of incidental complexity and say, "If you want to apply a genuinely global transformation, this is how you do it". The fact we already have decorators and import hooks is why I think PEP 511 can safely ignore the use cases that those handle. However, I think it *would* make sense to make the creation of a "Code Transformation" HOWTO guide part of the PEP - having a guide means we can clearly present the hierarchy in terms of: - decorators are strongly encouraged, since the maintainability harm they can do is limited - for import hooks, the use of custom file extensions and in-file markers is strongly encouraged to limit unintended side effects - global transformation are incredibly powerful, but also very hard to do well. Transform responsibly, or future maintainers will not think well of you :) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From rosuav at gmail.com Fri Jan 29 09:14:35 2016 From: rosuav at gmail.com (Chris Angelico) Date: Sat, 30 Jan 2016 01:14:35 +1100 Subject: [Python-ideas] PEP 511: API for code transformers In-Reply-To: References: <20160129010158.GC4619@ando.pearwood.info> <65429619.1655004.1454037039310.JavaMail.yahoo@mail.yahoo.com> <6999559.1612499.1454038212658.JavaMail.yahoo@mail.yahoo.com> Message-ID: On Sat, Jan 30, 2016 at 1:10 AM, Nick Coghlan wrote: > On 29 January 2016 at 13:30, Andrew Barnert via Python-ideas > wrote: >> So, again, PEP 511 isn't helping with the hard part. But, again, I think that may be fine. (Someone who knows how to use byteplay well enough to build a semantically-neutral optimizer function decorator, I'll trust him to be able to turn that into a global optimizer with one line of code. But if he wants to hook things in transparently to .pyc files, or to provide actual language extensions, or something like that, I think it's OK to make him do a bit more work before he can give it to me as production-ready code.) 
>
> Rather than trying to categorise things as "hard" or "easy", I find it
> to be more helpful to categorise them as "inherent complexity" or
> "incidental complexity".
>
> With inherent complexity, you can never eliminate it, only move it
> around, and perhaps make it easier to hide from people who don't care
> about the topic (cf. the helper classes in importlib, which hide a lot
> of the inherent complexity of the import system). With incidental
> complexity though, you may be able to find ways to eliminate it
> entirely.
>
> For a lot of code transformations, determining a suitable scope of
> application is *inherent* complexity: you need to care about where the
> transformation is applied, as it actually matters for that particular
> use case.
>
> For semantically significant transforms, scope of application is
> inherent complexity, as it affects code readability, and may even be
> an error if applied inappropriately. This is why:
> - the finer-grained control offered by decorators is often preferred
> to metaclasses or import hooks
> - custom file extensions or in-file markers are typically used to opt
> in to import hook processing
>
> In these cases, whether or not the standard library is processed
> doesn't matter, since it will never use the relevant decorator, file
> extension or in-file marker. You also don't need to worry about subtle
> start-up bugs, since if the decorator isn't imported, or the relevant
> import hook isn't installed appropriately, then the code that depends
> on that happening simply won't run.
>
> This means the only code transformation cases where determining scope
> of applicability turns out to be *incidental* complexity are those
> that are intended to be semantically neutral operations. Maybe you're
> collecting statistics on opcode frequency, maybe you're actually
> applying safe optimisations, maybe you're doing something else, but
> the one thing you're promising is that if the transformation breaks
> code that works without the transformation applied, then it's a *bug
> in the transformer*, not the code being transformed.
>
> In these cases, you *do* care about whether or not the standard
> library is processed, so you want an easy way to say "I want to
> process *all* the code, wherever it comes from". At the moment, that
> easy way doesn't exist, so you either give up, or you mess about with
> the encodings.py hack.
>
> PEP 511 erases that piece of incidental complexity and says, "If you
> want to apply a genuinely global transformation, this is how you do
> it". The fact we already have decorators and import hooks is why I
> think PEP 511 can safely ignore the use cases that those handle.

Thank you for the excellent explanation. Can words to this effect be added to the PEP, please?

ChrisA

From ncoghlan at gmail.com Fri Jan 29 09:31:34 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 30 Jan 2016 00:31:34 +1000 Subject: [Python-ideas] A bit meta In-Reply-To: <20160129080349.GD4619@ando.pearwood.info> References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> <20160129080349.GD4619@ando.pearwood.info> Message-ID:

On 29 January 2016 at 18:03, Steven D'Aprano wrote:
> On Thu, Jan 28, 2016 at 09:55:42PM -0600, Michael Selik wrote:
>> One defect of a mailing list is the difficulty of viewing a weighted
>> average of opinions. The benefit is that anyone can voice an opinion.
>> This is more like the Senate than the House -- Rhode Island appears
>> (on paper) to have as much influence as California. Luckily, we have a
>> form of President.
>> I'm guessing a House occurs in a more private mode
>> of communication?
>
> The Python community is not a democracy. Voting +1, -1 etc. should not
> be interpreted as *actual* votes that need to be counted and averaged, but
> as personal opinions intended to give other members of the community an
> idea of whether or not you would like to see a proposed feature.

Right, in terms of the language and standard library design, some of the essential points to note are:

- individual core committers have the authority to make changes (although we vary in how comfortable we are exercising that authority)
- one of the things we're responsible for is judging what topics can be handled with just a tracker discussion, what would benefit from a mailing list thread, and what would benefit from going through the full PEP process (this is still an art rather than a science, which is why it isn't documented very well)
- https://docs.python.org/devguide/experts.html#experts records the areas we individually feel comfortable exerting authority over
- the PEP process itself is defined in https://www.python.org/dev/peps/pep-0001/
- one relatively common cause of escalation from tracker issues to mailing list discussions is when consensus can't be reached in a smaller forum, so perspectives are sought from a slightly wider audience to see if that tips the balance one way or another
- when consensus still can't be reached (and nobody wants to escalate to the full PEP process in order to request an authoritative decision), then the status quo wins stalemates

The python-dev and python-ideas communities form a very important part of that process, but the most valuable things folks bring are additional perspectives (whether that's in the form of different use cases, additional domains of expertise, knowledge of practices in other programming language communities, etc)

Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From rosuav at gmail.com Fri Jan 29 09:38:34 2016 From: rosuav at gmail.com (Chris Angelico) Date: Sat, 30 Jan 2016 01:38:34 +1100 Subject: [Python-ideas] A bit meta In-Reply-To: References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> <20160129080349.GD4619@ando.pearwood.info> Message-ID:

On Sat, Jan 30, 2016 at 1:31 AM, Nick Coghlan wrote:
> The python-dev and python-ideas communities form a very important part
> of that process, but the most valuable things folks bring are
> additional perspectives (whether that's in the form of different use
> cases, additional domains of expertise, knowledge of practices in
> other programming language communities, etc)

This. As mentioned in PEP 10 [1], it's the explanations and justifications, far more than the votes, that make the real difference. That said, though, the votes are a great way of gauging the support levels for a set of similar proposals (eg syntactic options), where the proposer of the idea doesn't particularly care which of the options is picked. It's still not in any way democratic, as evidenced by the vote in PEP 308 [2], which had four options clearly better than the others, but the one that's now in the language was the last of those four in the votes.
ChrisA

[1] https://www.python.org/dev/peps/pep-0010/
[2] https://www.python.org/dev/peps/pep-0308/

From mojtaba.gharibi at gmail.com Fri Jan 29 10:30:54 2016 From: mojtaba.gharibi at gmail.com (Mirmojtaba Gharibi) Date: Fri, 29 Jan 2016 10:30:54 -0500 Subject: [Python-ideas] A bit meta In-Reply-To: References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> <20160129080349.GD4619@ando.pearwood.info> Message-ID:

I support a stack exchange website. Quite often here a few members overwhelm the email exchanges, and an idea, no matter how clearly you've explained it, gets buried in your very first email, which you then have to repeat over and over; the discussion basically becomes answering scattered criticisms from particular members. I mean the conversation can quickly become only marginally relevant to the entirety of your idea. I think stack exchange can sort out that chaos considerably, and if core developers aren't really looking for consensus, that's okay; at least the conversation is sorted out. Every new visitor has a chance of first seeing the idea proposed at the top of the page, then the comments and answers.

On Fri, Jan 29, 2016 at 9:38 AM, Chris Angelico wrote:
> On Sat, Jan 30, 2016 at 1:31 AM, Nick Coghlan wrote:
>> The python-dev and python-ideas communities form a very important part
>> of that process, but the most valuable things folks bring are
>> additional perspectives (whether that's in the form of different use
>> cases, additional domains of expertise, knowledge of practices in
>> other programming language communities, etc)
>
> This. As mentioned in PEP 10 [1], it's the explanations and
> justifications, far more than the votes, that make the real
> difference. That said, though, the votes are a great way of gauging
> the support levels for a set of similar proposals (eg syntactic
> options), where the proposer of the idea doesn't particularly care
> which of the options is picked. It's still not in any way democratic,
> as evidenced by the vote in PEP 308 [2], which had four options
> clearly better than the others, but the one that's now in the language
> was the last of those four in the votes.
>
> ChrisA
>
> [1] https://www.python.org/dev/peps/pep-0010/
> [2] https://www.python.org/dev/peps/pep-0308/
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

From ncoghlan at gmail.com Fri Jan 29 10:39:09 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 30 Jan 2016 01:39:09 +1000 Subject: [Python-ideas] A bit meta In-Reply-To: References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> <20160129080349.GD4619@ando.pearwood.info> Message-ID:

On 30 January 2016 at 01:30, Mirmojtaba Gharibi wrote:
> I support a stack exchange website.

There are lots of things we could do to improve the communications infrastructure the PSF provides the community, but the current limiting factors are management capacity and (infrastructure) contributor time, rather than ideas for potential improvement :)

There's also a vicious cycle where the limited management capacity makes it difficult to use volunteer time effectively, which is why the PSF is currently actively attempting to break that cycle by hiring an Infrastructure Manager (applications already closed for that role, though).

Cheers, Nick.
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From barry at python.org Fri Jan 29 10:48:09 2016 From: barry at python.org (Barry Warsaw) Date: Fri, 29 Jan 2016 10:48:09 -0500 Subject: [Python-ideas] A bit meta References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> <20160129080349.GD4619@ando.pearwood.info> Message-ID: <20160129104809.36663f3e@anarchist.wooz.org>

On Jan 29, 2016, at 07:03 PM, Steven D'Aprano wrote:

>The Python community is not a democracy. Voting +1, -1 etc. should not
>be interpreted as *actual* votes that need to be counted and averaged, but
>as personal opinions intended to give other members of the community an
>idea of whether or not you would like to see a proposed feature.

I'll just mention that if folks are interested in exploring a SO-like voting system for mailing list archives, you should get involved with the HyperKitty project. HK is the Django-based new archiver for Mailman 3, and the HK subproject is led by the quite awesome Aurelien Bompard. A feature like this is on our radar, but you know, resources.

https://gitlab.com/mailman/hyperkitty

Cheers, -Barry

-------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL:

From pavol.lisy at gmail.com Fri Jan 29 11:04:36 2016 From: pavol.lisy at gmail.com (Pavol Lisy) Date: Fri, 29 Jan 2016 17:04:36 +0100 Subject: [Python-ideas] Respectively and its unpacking sentence In-Reply-To: References: Message-ID:

I would really like

(a;b;c) in L

vs

a in L and b in L and c in L

or

all(i in L for i in (a,b,c))

because readability really matters.

But if I understand then this is not what we could get from your proposal, because a;b;c is not an expression. Right?

So we have to write something like

vec=[None]*3
vec=(a;b;c) in L
all(vec) # which is now equivalent to (a in L and b in L and c in L)

> vec=[]*10
> $vec = $u + $v

The first row is a mistake (somebody already noted it), but why not simply? ->

vec = list($u +$v)

Because $u+$v is not an expression. It is a construct for "unpacking" operations. It could be useful to have an "operator" (calling it an operator is misleading because the result is not an object) to go back to a Python variable (but with which type? tuple?). Probably $(a;b) could ("return") be transformed to tuple(a,b)

a=[1,2]
print(a)
print($a)
print($$a)
----
[1,2]
1
2
(1,2)

So I could write

a in L and b in L and c in L

as

all($((a;b;c) in L)) # which is much less nice than "(a;b;c) in L"

and

all(i in L for i in (a,b,c)) # which is similarly readable and doesn't need language changes

> s=0
> s;s;s += a;b;c; * d;e;f
> which result in s being a*d+b,c*e+d*f

do you mean a*d+b*d+c*f ?

-------

Your idea is interesting. If you'd like to test it, you could improve the implementation of the next function and play with it (probably you will find some caveats):

def respectively(statement):
    if statement!='a;b=b;a':
        raise SyntaxError("only supported statement is 'a;b=b;a'")
    exec('global a\nglobal b\na=b\nb=a')

a,b=1,2
respectively('a;b=b;a')
print(a,b)
2 2

Unfortunately this could work only in a global context due to limitations around the 'exec' and 'locals' functions.
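A minimal illustration of that limitation (in CPython, exec() sees only a snapshot of a function's locals, so rebinding a name inside exec() never propagates back to the enclosing function):

def demo():
    x = 1
    exec('x = 2')   # updates a throwaway copy of the local namespace
    return x        # still 1: the real local was never rebound

print(demo())  # -> 1, which is why the trick above is global-only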
From guido at python.org Fri Jan 29 11:19:18 2016 From: guido at python.org (Guido van Rossum) Date: Fri, 29 Jan 2016 08:19:18 -0800 Subject: [Python-ideas] A bit meta In-Reply-To: References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> <20160129080349.GD4619@ando.pearwood.info> Message-ID:

On Fri, Jan 29, 2016 at 7:39 AM, Nick Coghlan wrote:
> On 30 January 2016 at 01:30, Mirmojtaba Gharibi
> wrote:
>> I support a stack exchange website.
>
> There are lots of things we could do to improve the communications
> infrastructure the PSF provides the community, but the current
> limiting factors are management capacity and (infrastructure)
> contributor time, rather than ideas for potential improvement :)
>
> There's also a vicious cycle where the limited management capacity
> makes it difficult to use volunteer time effectively, which is why the
> PSF is currently actively attempting to break that cycle by hiring an
> Infrastructure Manager (applications already closed for that role,
> though).

I do have to say I find the idea of using a dedicated StackExchange site intriguing. I have been a big fan of its cofounder Joel Spolsky for many years. A StackExchange discussion has some advantages over a thread in a mailing list -- it's got a clear URL that everyone can easily find and reference (as opposed to the variety of archive sites that are currently used), and there is a bit more structure to the discussion (question, answers, comments). I believe there are some good examples of other communities of experts that have really benefited (e.g. mathoverflow.net).

A downside may be that it's hard to read via an email client (although you can set up notifications). That doesn't bother me personally (I live in a web browser these days anyway) but I can imagine it will be harder for some folks to participate.

I don't think it takes much effort to set up one of these -- if someone feels particularly strong about this I encourage them to figure out how to set up a StackExchange site to augment python-ideas. (I think that's where we should start; leave python-dev alone.)

-- --Guido van Rossum (python.org/~guido)

From mojtaba.gharibi at gmail.com Fri Jan 29 11:34:18 2016 From: mojtaba.gharibi at gmail.com (Mirmojtaba Gharibi) Date: Fri, 29 Jan 2016 11:34:18 -0500 Subject: [Python-ideas] Respectively and its unpacking sentence In-Reply-To: References: Message-ID:

On Fri, Jan 29, 2016 at 11:04 AM, Pavol Lisy wrote:
> I would really like
>
> (a;b;c) in L
>
> vs
>
> a in L and b in L and c in L
>
> or
>
> all(i in L for i in (a,b,c))
>
> because readability really matters.
>
> But if I understand then this is not what we could get from your
> proposal, because a;b;c is not an expression. Right?
>
> So we have to write something like
>
> vec=[None]*3
> vec=(a;b;c) in L
> all(vec) # which is now equivalent to (a in L and b in L and c in L)
>

That's right. Instead, we can get it this way:
a;b;c = $L
which is equivalent to
a=L[0]
b=L[1]
c=L[2]
but as others suggested for this particular example, we can already get it from unpacking syntax, i.e.
a, b, c = L

>
>> vec=[]*10
>> $vec = $u + $v
>
> The first row is a mistake (somebody already noted it), but why not simply? ->
>
> vec = list($u +$v)
>
> Because $u+$v is not an expression. It is a construct for "unpacking"
> operations. It could be useful to have an "operator" (calling it an operator
> is misleading because the result is not an object) to go back to a Python
> variable (but with which type? tuple?).
> Probably $(a;b) could
> ("return") be transformed to tuple(a,b)
>
> a=[1,2]
> print(a)
> print($a)
> print($$a)
> ----
> [1,2]
> 1
> 2
> (1,2)
>
> So I could write
>
> a in L and b in L and c in L
>
> as
>
> all($((a;b;c) in L)) # which is much less nice than "(a;b;c) in L"
>
> and
>
> all(i in L for i in (a,b,c)) # which is similarly readable and doesn't
> need language changes
>
>> s=0
>> s;s;s += a;b;c; * d;e;f
>> which result in s being a*d+b,c*e+d*f
>
> do you mean a*d+b*d+c*f ?

Yes, Oops, it was a typo.

>
> -------
>
> Your idea is interesting. If you'd like to test it, you could improve
> the implementation of the next function and play with it (probably you
> will find some caveats):
>
> def respectively(statement):
>     if statement!='a;b=b;a':
>         raise SyntaxError("only supported statement is 'a;b=b;a'")
>     exec('global a\nglobal b\na=b\nb=a')
>
> a,b=1,2
> respectively('a;b=b;a')
> print(a,b)
> 2 2
>
> Unfortunately this could work only in a global context due to
> limitations around the 'exec' and 'locals' functions.

Sounds good. I'd like to experiment with it actually.

From ethan at stoneleaf.us Fri Jan 29 11:34:42 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Fri, 29 Jan 2016 08:34:42 -0800 Subject: [Python-ideas] A bit meta In-Reply-To: References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> <20160129080349.GD4619@ando.pearwood.info> Message-ID: <56AB94A2.20901@stoneleaf.us>

On 01/29/2016 08:19 AM, Guido van Rossum wrote:

> I do have to say I find the idea of using a dedicated StackExchange
> site intriguing. I have been a big fan of its cofounder Joel Spolsky
> for many years. A StackExchange discussion has some advantages over a
> thread in a mailing list -- it's got a clear URL that everyone can
> easily find and reference (as opposed to the variety of archive sites
> that are currently used), and there is a bit more structure to the
> discussion (question, answers, comments). I believe there are some
> good examples of other communities of experts that have really
> benefited (e.g. mathoverflow.net).

I am also a big fan of StackExchange, but the StackExchange sites are about questions and answers, while Python-Ideas is about ideas and discussion.

Given that extensive comments on a question or answer are discouraged, multiple answers trying to follow a thread of discussion would be confusing, and the person asking the question would be the one selecting the "approved" answer (which may have nothing to do with the actual outcome), I don't see this as being a good fit.

-- ~Ethan~

From geoffspear at gmail.com Fri Jan 29 11:45:15 2016 From: geoffspear at gmail.com (Geoffrey Spear) Date: Fri, 29 Jan 2016 11:45:15 -0500 Subject: [Python-ideas] A bit meta In-Reply-To: <56AB94A2.20901@stoneleaf.us> References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> <20160129080349.GD4619@ando.pearwood.info> <56AB94A2.20901@stoneleaf.us> Message-ID:

On Fri, Jan 29, 2016 at 11:34 AM, Ethan Furman wrote:
> On 01/29/2016 08:19 AM, Guido van Rossum wrote:
>
>> I do have to say I find the idea of using a dedicated StackExchange
>> site intriguing. I have been a big fan of its cofounder Joel Spolsky
>> for many years. A StackExchange discussion has some advantages over a
>> thread in a mailing list -- it's got a clear URL that everyone can
>> easily find and reference (as opposed to the variety of archive sites
>> that are currently used), and there is a bit more structure to the
>> discussion (question, answers, comments).
>> I believe there are some
>> good examples of other communities of experts that have really
>> benefited (e.g. mathoverflow.net).
>
> I am also a big fan of StackExchange, but the StackExchange sites are
> about questions and answers, while Python-Ideas is about ideas and
> discussion.
>
> Given that extensive comments on a question or answer are discouraged,
> multiple answers trying to follow a thread of discussion would be
> confusing, and the person asking the question would be the one selecting
> the "approved" answer (which may have nothing to do with the actual
> outcome), I don't see this as being a good fit.
>

As a longtime follower of the SE site-creation process, I'd have to agree. There's pretty much no way such a site would get past the existing site-creation process. I suspect even a special arrangement with the Stack Overflow upper management bypassing the regular process wouldn't happen.

In any event, a site that creates the illusion that "Create a Python 2.8!" having a ton of upvotes means something seems like a Bad Idea.

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From jsbueno at python.org.br Fri Jan 29 11:49:40 2016 From: jsbueno at python.org.br (Joao S. O. Bueno) Date: Fri, 29 Jan 2016 14:49:40 -0200 Subject: [Python-ideas] A bit meta In-Reply-To: References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> <20160129080349.GD4619@ando.pearwood.info> <56AB94A2.20901@stoneleaf.us> Message-ID:

On 29 January 2016 at 14:45, Geoffrey Spear wrote:
>
>
> On Fri, Jan 29, 2016 at 11:34 AM, Ethan Furman wrote:
>>
>> On 01/29/2016 08:19 AM, Guido van Rossum wrote:
>>
>>> I do have to say I find the idea of using a dedicated StackExchange
>>> site intriguing. I have been a big fan of its cofounder Joel Spolsky
>>> for many years. A StackExchange discussion has some advantages over a
>>> thread in a mailing list -- it's got a clear URL that everyone can
>>> easily find and reference (as opposed to the variety of archive sites
>>> that are currently used), and there is a bit more structure to the
>>> discussion (question, answers, comments). I believe there are some
>>> good examples of other communities of experts that have really
>>> benefited (e.g. mathoverflow.net).
>>
>> I am also a big fan of StackExchange, but the StackExchange sites are
>> about questions and answers, while Python-Ideas is about ideas and
>> discussion.
>>
>> Given that extensive comments on a question or answer are discouraged,
>> multiple answers trying to follow a thread of discussion would be confusing,
>> and the person asking the question would be the one selecting the "approved"
>> answer (which may have nothing to do with the actual outcome), I don't see
>> this as being a good fit.
>>
> As a longtime follower of the SE site-creation process, I'd have to agree.
> There's pretty much no way such a site would get past the existing
> site-creation process. I suspect even a special arrangement with the Stack
> Overflow upper management bypassing the regular process wouldn't happen.
>
> In any event, a site that creates the illusion that "Create a Python 2.8!"
> having a ton of upvotes means something seems like a Bad Idea.

Creating an instance of an S.O.-like site does not mean getting an official Stack Exchange site - just instantiate some OpenSource product that has the same look and feel (and responsiveness). I know that Ubuntu people run something similar, for example.
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

From random832 at fastmail.com Fri Jan 29 11:55:45 2016 From: random832 at fastmail.com (Random832) Date: Fri, 29 Jan 2016 11:55:45 -0500 Subject: [Python-ideas] A bit meta In-Reply-To: References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> <20160129080349.GD4619@ando.pearwood.info> <56AB94A2.20901@stoneleaf.us> Message-ID: <1454086545.1984516.506366042.0D3D9067@webmail.messagingengine.com>

On Fri, Jan 29, 2016, at 11:49, Joao S. O. Bueno wrote:
> Creating an instance of an S.O.-like site does not mean getting an
> official
> Stack Exchange site - just instantiate some OpenSource product that
> has the same look and feel (and responsiveness). I know that Ubuntu
> people run something similar,
> for example.

Ask Ubuntu is, in fact, a real Stack Exchange site (AIUI they did the "special arrangement" thing). Stack Exchange's software is not itself open source, though http://meta.stackexchange.com/questions/2267/stack-exchange-clones lists some "clones" (other software packages that provide varying degrees of the same look and feel).

From stephen at xemacs.org Fri Jan 29 12:27:54 2016 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Sat, 30 Jan 2016 02:27:54 +0900 Subject: [Python-ideas] A bit meta In-Reply-To: References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> <20160129080349.GD4619@ando.pearwood.info> Message-ID: <22187.41242.549291.36585@turnbull.sk.tsukuba.ac.jp>

Guido van Rossum writes:
> I don't think it takes much effort to set up one of these -- if
> someone feels particularly strong about this I encourage them to
> figure out how to set up a StackExchange site to augment python-ideas.
> (I think that's where we should start; leave python-dev alone.)

I think an even better place to start would be core-mentorship.

From stephen at xemacs.org Fri Jan 29 12:32:10 2016 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Sat, 30 Jan 2016 02:32:10 +0900 Subject: [Python-ideas] Respectively and its unpacking sentence In-Reply-To: References: Message-ID: <22187.41498.833689.317428@turnbull.sk.tsukuba.ac.jp>

Pavol Lisy writes:

> I would really like
>
> (a;b;c) in L

Not well-specified (does order matter? how about repeated values? is (a;b;c) an object? it sure looks like one, and if so, object in L already has a meaning). But for one obvious interpretation:

{a, b, c} <= set(L)

and in this interpretation you should probably optimize to

{a, b, c} <= L

by constructing L as a set in the first place.

Really this thread probably belongs on python-list anyway.

From stephen at xemacs.org Fri Jan 29 12:34:25 2016 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Sat, 30 Jan 2016 02:34:25 +0900 Subject: [Python-ideas] A bit meta In-Reply-To: <56AB2CBE.7030404@gmail.com> References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> <56AB2CBE.7030404@gmail.com> Message-ID: <22187.41633.301185.82835@turnbull.sk.tsukuba.ac.jp>

Petr Viktorin writes:
> On 01/29/2016 04:55 AM, Michael Selik wrote:

[a lot of en_US.legalese]

> Would it be possible to make the argument clearer to people who need to
> look these things up to understand it?

I would just skip to the chase[1]:

> > Perhaps as the community gets larger, a system like StackOverflow
> > might be a better tool for handling things like Python-Ideas.
I'm not sure what else is in S.O. that he thinks would be helpful; the references to the American political system weren't very specific. Obviously a thumbs-up glyph for every post would make it simpler to say "+1", though.

Footnotes: [1] Another American idiom.

From brett at python.org Fri Jan 29 12:56:57 2016 From: brett at python.org (Brett Cannon) Date: Fri, 29 Jan 2016 17:56:57 +0000 Subject: [Python-ideas] A bit meta In-Reply-To: <56AB94A2.20901@stoneleaf.us> References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> <20160129080349.GD4619@ando.pearwood.info> <56AB94A2.20901@stoneleaf.us> Message-ID:

On Fri, 29 Jan 2016 at 08:35 Ethan Furman wrote:
> On 01/29/2016 08:19 AM, Guido van Rossum wrote:
>
> > I do have to say I find the idea of using a dedicated StackExchange
> > site intriguing. I have been a big fan of its cofounder Joel Spolsky
> > for many years. A StackExchange discussion has some advantages over a
> > thread in a mailing list -- it's got a clear URL that everyone can
> > easily find and reference (as opposed to the variety of archive sites
> > that are currently used), and there is a bit more structure to the
> > discussion (question, answers, comments). I believe there are some
> > good examples of other communities of experts that have really
> > benefited (e.g. mathoverflow.net).
>
> I am also a big fan of StackExchange, but the StackExchange sites are
> about questions and answers, while Python-Ideas is about ideas and
> discussion.
>
> Given that extensive comments on a question or answer are discouraged,
> multiple answers trying to follow a thread of discussion would be
> confusing, and the person asking the question would be the one selecting
> the "approved" answer (which may have nothing to do with the actual
> outcome), I don't see this as being a good fit.

A better fit would be something like https://www.uservoice.com/ if people wanted a focused "vote on ideas" solution, or something like https://www.discourse.org/ for a more modern forum platform that has the concept of likes for a thread. And then there's https://gitlab.com/mailman/hyperkitty as Barry suggested to add the equivalent of what Discourse has to Mailman 3.

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From srkunze at mail.de Fri Jan 29 13:09:47 2016 From: srkunze at mail.de (Sven R. Kunze) Date: Fri, 29 Jan 2016 19:09:47 +0100 Subject: [Python-ideas] PEP 511: API for code transformers In-Reply-To: <56AAB1EE.5010805@canterbury.ac.nz> References: <569A2452.1000709@gmail.com> <20160116162235.GB3208@sjoerdjob.com> <56AA46D8.6090203@mail.de> <56AA4864.7030004@mail.de> <56AA53DE.4010600@mail.de> <56AAB1EE.5010805@canterbury.ac.nz> Message-ID: <56ABAAEB.4050703@mail.de>

On 29.01.2016 01:27, Greg Ewing wrote:
> Sven R. Kunze wrote:
>> Some people proposed a "from __extensions__ import my_extension";
>> inspired by __future__ imports, i.e. it is forced to be at the top.
>> Why? Because it somehow makes sense to perform all transformations
>> the first time a file is loaded.
>
> It occurs to me that a magic import for applying local
> transformations could itself be implemented using a
> global transformer.
>

That is certainly true. :)

From srkunze at mail.de Fri Jan 29 13:18:15 2016 From: srkunze at mail.de (Sven R.
Kunze) Date: Fri, 29 Jan 2016 19:18:15 +0100 Subject: [Python-ideas] PEP 511: API for code transformers In-Reply-To: References: <569A2452.1000709@gmail.com> <20160116162235.GB3208@sjoerdjob.com> <56AA46D8.6090203@mail.de> <56AA4864.7030004@mail.de> <56AA53DE.4010600@mail.de> <56AAB1EE.5010805@canterbury.ac.nz> Message-ID: <56ABACE7.1000007@mail.de>

On 29.01.2016 01:57, Victor Stinner wrote:
> A local transformation requires registering a global code transformer,
> but it doesn't mean that all files will be modified.

I think you should differentiate between "register" and "use". "register" basically means "provide but don't use". "use" basically means "apply the transformation". (Same is already true for codecs.) The PEP's "set_code_transformers" seems not to make that distinction.

> The code transformer can use various kinds of checks to decide if a file must
> be transformed and then which parts of the code should be transformed.
> Decorators were suggested as a good granularity.

As others pointed out, implicit transformations are not desirable. So, why would a transformer need to check if a file must be transformed? Either the author of a file explicitly wants the transformer or not.

Same goes for the global option. Either it is there or it isn't.

Btw. I would really appreciate a reply to my prior post. ;)

Best, Sven

From abarnert at yahoo.com Fri Jan 29 14:57:11 2016 From: abarnert at yahoo.com (Andrew Barnert) Date: Fri, 29 Jan 2016 11:57:11 -0800 Subject: [Python-ideas] PEP 511: API for code transformers In-Reply-To: References: <20160129010158.GC4619@ando.pearwood.info> <65429619.1655004.1454037039310.JavaMail.yahoo@mail.yahoo.com> <6999559.1612499.1454038212658.JavaMail.yahoo@mail.yahoo.com> Message-ID:

On Jan 29, 2016, at 06:10, Nick Coghlan wrote:
>
> PEP 511 erases that piece of incidental complexity and says, "If you
> want to apply a genuinely global transformation, this is how you do
> it". The fact we already have decorators and import hooks is why I
> think PEP 511 can safely ignore the use cases that those handle.

I think this is the conclusion I was hoping to reach, but wasn't sure how to get there. I'm happy with PEP 511 not trying to serve cases like MacroPy and Hy and the example from the byteplay docs, especially so if ignoring them makes PEP 511 simpler, as long as it can explain why it's ignoring them. And a shorter version of your argument should serve as such an explanation.

But the other half of my point was that too many people (even very experienced developers like most of the people on this list) think there's more incidental complexity than there is, and that's also a problem. For example, "I want to write a global processor for local experimentation purposes so I can play with my idea before posting it to Python-ideas" is not a bad desire. And, if people think it's way too hard to do with a quick&dirty import hook, they're naturally going to ask why PEP 511 doesn't help them out by adding a bunch of options to install/run the processors conditionally, handle non-.py files, skip the stdlib, etc. And I think the PEP is better without those options.

> However, I think it *would* make sense to make the creation of a "Code
> Transformation" HOWTO guide part of the PEP - having a guide means we
> can clearly present the hierarchy in terms of:

I like this idea.

Earlier I suggested that the import system documentation should have some simple examples of how to actually use the import system to write transforming hooks. Someone (Brett?)
pointed out that it's a dangerous technique, and making it too easy for people to play with it without understanding it may be a bad idea. And they're probably right.

A HOWTO is a bit more "out-of-the-way" than library or reference docs--and, more importantly, it also has room to explain when you shouldn't do this or that, and why. I'm not sure it has to be part of the PEP, but I can see the connection. The PEP helps by separating out the most important safe case (semantically-neutral, reflected in .pyc, globally consistent, etc.), but it also makes the question "how do I do something similar to PEP 511 transformers except ___" more likely to come up in the first place, making the HOWTO more important.

From donald at stufft.io Fri Jan 29 16:27:55 2016 From: donald at stufft.io (Donald Stufft) Date: Fri, 29 Jan 2016 16:27:55 -0500 Subject: [Python-ideas] A bit meta In-Reply-To: References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> <20160129080349.GD4619@ando.pearwood.info> <56AB94A2.20901@stoneleaf.us> Message-ID: <7A708956-80EF-48DE-AFD6-8DBDC61FCCE9@stufft.io>

> On Jan 29, 2016, at 12:56 PM, Brett Cannon wrote:
>
> On Fri, 29 Jan 2016 at 08:35 Ethan Furman wrote:
> On 01/29/2016 08:19 AM, Guido van Rossum wrote:
>
> > I do have to say I find the idea of using a dedicated StackExchange
> > site intriguing. I have been a big fan of its cofounder Joel Spolsky
> > for many years. A StackExchange discussion has some advantages over a
> > thread in a mailing list -- it's got a clear URL that everyone can
> > easily find and reference (as opposed to the variety of archive sites
> > that are currently used), and there is a bit more structure to the
> > discussion (question, answers, comments). I believe there are some
> > good examples of other communities of experts that have really
> > benefited (e.g. mathoverflow.net).
>
> I am also a big fan of StackExchange, but the StackExchange sites are
> about questions and answers, while Python-Ideas is about ideas and
> discussion.
>
> Given that extensive comments on a question or answer are discouraged,
> multiple answers trying to follow a thread of discussion would be
> confusing, and the person asking the question would be the one selecting
> the "approved" answer (which may have nothing to do with the actual
> outcome), I don't see this as being a good fit.
>
> A better fit would be something like https://www.uservoice.com/ if people wanted a focused "vote on ideas" solution, or something like https://www.discourse.org/ for a more modern forum platform that has the concept of likes for a thread. And then there's https://gitlab.com/mailman/hyperkitty as Barry suggested to add the equivalent of what Discourse has to Mailman 3.
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

I've been thinking about trying to set up a discourse instance for the packaging stuff, for whatever it's worth.

----------------- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA

-------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 842 bytes Desc: Message signed with OpenPGP using GPGMail URL:

From greg.ewing at canterbury.ac.nz Fri Jan 29 16:41:39 2016 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 30 Jan 2016 10:41:39 +1300 Subject: [Python-ideas] A bit meta In-Reply-To: <56AB2CBE.7030404@gmail.com> References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> <56AB2CBE.7030404@gmail.com> Message-ID: <56ABDC93.3060708@canterbury.ac.nz>

Petr Viktorin wrote:
> Trying to hold all these details in my head while thinking how they
> relate to mailing list discussions leaves me quite confused.

I think Guido is more like what the king of England was in the old days. His word is law, but if he pisses off his subjects too much, he risks either losing his head or being forced to sign a Magna Carta.

-- Greg

From ethan at stoneleaf.us Fri Jan 29 16:48:59 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Fri, 29 Jan 2016 13:48:59 -0800 Subject: [Python-ideas] A bit meta In-Reply-To: <7A708956-80EF-48DE-AFD6-8DBDC61FCCE9@stufft.io> References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> <20160129080349.GD4619@ando.pearwood.info> <56AB94A2.20901@stoneleaf.us> <7A708956-80EF-48DE-AFD6-8DBDC61FCCE9@stufft.io> Message-ID: <56ABDE4B.2060406@stoneleaf.us>

On 01/29/2016 01:27 PM, Donald Stufft wrote:
> I've been thinking about trying to set up a discourse instance for
> the packaging stuff, for whatever it's worth.

Great! That will be invaluable for evaluation if nothing else. ;)

-- ~Ethan~

From greg.ewing at canterbury.ac.nz Fri Jan 29 17:02:17 2016 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 30 Jan 2016 11:02:17 +1300 Subject: [Python-ideas] PEP 511: API for code transformers In-Reply-To: <56ABAAEB.4050703@mail.de> References: <569A2452.1000709@gmail.com> <20160116162235.GB3208@sjoerdjob.com> <56AA46D8.6090203@mail.de> <56AA4864.7030004@mail.de> <56AA53DE.4010600@mail.de> <56AAB1EE.5010805@canterbury.ac.nz> <56ABAAEB.4050703@mail.de> Message-ID: <56ABE169.2080105@canterbury.ac.nz>

Sven R. Kunze wrote:
> On 29.01.2016 01:27, Greg Ewing wrote:
>
>> It occurs to me that a magic import for applying local
>> transformations could itself be implemented using a
>> global transformer.

To elaborate on that a bit, something like an __extensions__ magic import could first be prototyped as a global transformer. If the idea caught on, that transformer could be made "official", meaning it was incorporated into the stdlib and applied by default.

-- Greg

From ned at nedbatchelder.com Fri Jan 29 17:42:18 2016 From: ned at nedbatchelder.com (Ned Batchelder) Date: Fri, 29 Jan 2016 17:42:18 -0500 Subject: [Python-ideas] Prevent importing yourself? Message-ID: <56ABEACA.8000707@nedbatchelder.com>

Hi,

A common question we get in the #python IRC channel is, "I tried importing a module, but I get an AttributeError trying to use the things it said it provided." Turns out the beginner named their own file the same as the module they were trying to use.

That is, they want to try (for example) the "azure" package. So they make a file called azure.py, and start with "import azure". The import succeeds, but it has none of the contents the documentation claims, because they have imported themselves. It's baffling, because they have used the exact statements shown in the examples, but it doesn't work.

Could we make this a more obvious failure? Is there ever a valid reason for a file to import itself? Is this situation detectable in the import machinery?

--Ned.
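For anyone who wants to reproduce the failure mode, a hypothetical session (any installed package shadowed by a same-named local file behaves this way; the exact path shown will vary):

$ cat azure.py
import azure
print(azure.__file__)

$ python azure.py
/home/user/azure.py
/home/user/azure.py

The script's own directory sits at the front of sys.path, so "import azure" finds this azure.py instead of the installed package -- and the file even executes twice, once as __main__ and once as the 'azure' module. Printing the module's __file__ attribute is the quickest way to spot the shadowing.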
From rymg19 at gmail.com Fri Jan 29 17:57:20 2016 From: rymg19 at gmail.com (Ryan Gonzalez) Date: Fri, 29 Jan 2016 16:57:20 -0600 Subject: [Python-ideas] Prevent importing yourself? In-Reply-To: <56ABEACA.8000707@nedbatchelder.com> References: <56ABEACA.8000707@nedbatchelder.com> Message-ID: <37CB90B9-73C1-49A5-B820-08C5457DC99A@gmail.com> On January 29, 2016 4:42:18 PM CST, Ned Batchelder wrote: >Hi, > >A common question we get in the #python IRC channel is, "I tried >importing a module, but I get an AttributeError trying to use the >things >it said it provided." Turns out the beginner named their own file the >same as the module they were trying to use. > >That is, they want to try (for example) the "azure" package. So they >make a file called azure.py, and start with "import azure". The import >succeeds, but it has none of the contents the documentation claims, >because they have imported themselves. It's baffling, because they >have >used the exact statements shown in the examples, but it doesn't work. > >Could we make this a more obvious failure? Is there ever a valid >reason >for a file to import itself? Is this situation detectable in the >import >machinery? > Haha, +1. This bit me a good 50 times when I started learning Python. >--Ned. >_______________________________________________ >Python-ideas mailing list >Python-ideas at python.org >https://mail.python.org/mailman/listinfo/python-ideas >Code of Conduct: http://python.org/psf/codeofconduct/ -- Sent from my Nexus 5 with K-9 Mail. Please excuse my brevity. From greg.ewing at canterbury.ac.nz Fri Jan 29 18:09:30 2016 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 30 Jan 2016 12:09:30 +1300 Subject: [Python-ideas] Prevent importing yourself? In-Reply-To: <56ABEACA.8000707@nedbatchelder.com> References: <56ABEACA.8000707@nedbatchelder.com> Message-ID: <56ABF12A.90609@canterbury.ac.nz> Ned Batchelder wrote: > Could we make this a more obvious failure? Is there ever a valid reason > for a file to import itself? I've done it occasionally, but only when doing something very unusual, and I probably wouldn't mind having to pull it out of sys.modules in cases like that. -- Greg From sjoerdjob at sjec.nl Fri Jan 29 18:42:26 2016 From: sjoerdjob at sjec.nl (Sjoerd Job Postmus) Date: Sat, 30 Jan 2016 00:42:26 +0100 Subject: [Python-ideas] Prevent importing yourself? In-Reply-To: <56ABEACA.8000707@nedbatchelder.com> References: <56ABEACA.8000707@nedbatchelder.com> Message-ID: <20160129234226.GA23485@sjoerdjob.com> On Fri, Jan 29, 2016 at 05:42:18PM -0500, Ned Batchelder wrote: > Hi, > > A common question we get in the #python IRC channel is, "I tried > importing a module, but I get an AttributeError trying to use the > things it said it provided." Turns out the beginner named their own > file the same as the module they were trying to use. > > That is, they want to try (for example) the "azure" package. So > they make a file called azure.py, and start with "import azure". The > import succeeds, but it has none of the contents the documentation > claims, because they have imported themselves. It's baffling, > because they have used the exact statements shown in the examples, > but it doesn't work. > > Could we make this a more obvious failure? Is there ever a valid > reason for a file to import itself? Is this situation detectable in > the import machinery? > > --Ned. I feel this is only a partial fix. I've been bitten by something like this, but not precisely like this. 
The difference in how I experienced this makes it enough for me to say: I don't think your suggestion is that useful.

What I experienced was having collisions on the python-path, and modules from my codebase colliding with libraries in the stdlib (or outside it). For example, a library might import one of its dependencies which coincidentally had the same name as one of the libraries I have.

Maybe a suggestion would be to add the path of the module to the error message?

Currently the message is

sjoerdjob$ cat json.py
import json

TEST = '{"foo": "bar"}'

print(json.loads(TEST))
sjoerdjob$ python3.5 json.py
Traceback (most recent call last):
  File "json.py", line 1, in <module>
    import json
  File "/Users/sjoerdjob/Development/spikes/importself/json.py", line 5, in <module>
    print(json.loads(TEST))
AttributeError: module 'json' has no attribute 'loads'

But maybe the error could/should be

AttributeError: module 'json' (imported from /Users/sjoerdjob/Development/spikes/importself/json.py) has no attribute 'loads'.

As another corner case, consider the following:

#json.py
JSON_DATA = '{"foo": "bar"}'

#mod_a.py
import json

def parse(blob):
    return json.loads(blob)

#mod_b.py
from json import JSON_DATA
from mod_a import parse

print(parse(JSON_DATA))

(Now, consider that instead of 'json' we chose a less common module name.) You still get the error `module 'json' has no attribute 'loads'`. In this case, I think it's more helpful to know the filename of the 'json' module. For me that'd sooner be a trigger to ask "What's going on?", because I haven't been bitten by the 'import self' issue as often as 'name collision in dependency tree'.

(Of course, another option would be to look for other modules of the same name when you get an attribute-error on a module to aid debugging, but I think that's too heavy-weight.)

Kind regards, Sjoerd Job

From gokoproject at gmail.com Fri Jan 29 19:05:32 2016 From: gokoproject at gmail.com (John Wong) Date: Fri, 29 Jan 2016 19:05:32 -0500 Subject: [Python-ideas] Prevent importing yourself? In-Reply-To: <20160129234226.GA23485@sjoerdjob.com> References: <56ABEACA.8000707@nedbatchelder.com> <20160129234226.GA23485@sjoerdjob.com> Message-ID:

On Fri, Jan 29, 2016 at 6:42 PM, Sjoerd Job Postmus wrote:
> I feel this is only a partial fix. I've been bitten by something like
> this, but not precisely like this. The difference in how I experienced
> this makes it enough for me to say: I don't think your suggestion is
> that useful.
>

Yes, your example is actually more likely to happen, and it happened to me many times. One reason is some of the stdlib module names are kind of common. Once I defined my own random.py and then another time I had a requests.py which collided with the requests library.

I think the right solution is to assume every import error needs some guidance, some hints. Don't just target a specific problem. Ned is probably familiar with this: in the case of Ansible, if Ansible cannot resolve and locate the role you specify in the playbook, Ansible will complain and give this error message:

ERROR: cannot find role in /current/path/roles/some-role or /current/path/some-role or /etc/ansible/roles/some-role

So the import error should be more or less like this:

AttributeError: module 'json' has no attribute 'loads'
Possible root causes:
* json is not found in the current PYTHONPATH. Python tried /current/python/site-packages/json, /current/python/site-packages/json.py. For the full list of PYTHONPATH, please refer to THIS DOC ON PYTHON.ORG.
* your current module has the same name as the module you intend to import.

You can even simplify this to just say "possible causes", point to a doc on python.org, and go verbose there.

John

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From oscar.j.benjamin at gmail.com Fri Jan 29 19:13:36 2016 From: oscar.j.benjamin at gmail.com (Oscar Benjamin) Date: Sat, 30 Jan 2016 00:13:36 +0000 Subject: [Python-ideas] Prevent importing yourself? In-Reply-To: <20160129234226.GA23485@sjoerdjob.com> References: <56ABEACA.8000707@nedbatchelder.com> <20160129234226.GA23485@sjoerdjob.com> Message-ID:

On 29 January 2016 at 23:42, Sjoerd Job Postmus wrote:
> On Fri, Jan 29, 2016 at 05:42:18PM -0500, Ned Batchelder wrote:
>> Hi,
>>
>> A common question we get in the #python IRC channel is, "I tried
>> importing a module, but I get an AttributeError trying to use the
>> things it said it provided." Turns out the beginner named their own
>> file the same as the module they were trying to use.
...
>> Could we make this a more obvious failure? Is there ever a valid
>> reason for a file to import itself? Is this situation detectable in
>> the import machinery?
>
> I feel this is only a partial fix. I've been bitten by something like
> this, but not precisely like this. The difference in how I experienced
> this makes it enough for me to say: I don't think your suggestion is
> that useful.
>
> What I experienced was having collisions on the python-path, and modules
> from my codebase colliding with libraries in the stdlib (or outside it).
> For example, a library might import one of its dependencies which
> coincidentally had the same name as one of the libraries I have.
is not usually on PATH for precisely this reason: you basically never want the ls (or whatever) command to run a program in the current directory. -- Oscar From abarnert at yahoo.com Fri Jan 29 19:16:46 2016 From: abarnert at yahoo.com (Andrew Barnert) Date: Fri, 29 Jan 2016 16:16:46 -0800 Subject: [Python-ideas] Prevent importing yourself? In-Reply-To: <20160129234226.GA23485@sjoerdjob.com> References: <56ABEACA.8000707@nedbatchelder.com> <20160129234226.GA23485@sjoerdjob.com> Message-ID: <86B59981-4CA8-4E01-89D5-FA05884BC22C@yahoo.com> On Jan 29, 2016, at 15:42, Sjoerd Job Postmus wrote: > > What I experienced was having collisions on the python-path, and modules > from my codebase colliding with libraries in the stdlib (or outside it). > For example, a library might import one of its dependencies which > coincidentally had the same name as one of the libraries I have. Yes. The version of this I've seen most from novices is that they write a program named "json.py" that imports and uses requests, which tries to use the stdlib module json, which gives them an AttributeError on json.loads. (One of my favorite questions on StackOverflow came from a really smart novice who'd written a program called "time.py", and he got an error about time.time on one machine, but not another. He figured out that obviously, requests wants him to define his own time function, which he was able to do by using the stuff in datetime. And he figured out the probable difference between the two machines--the working one had an older version of requests. He just wanted to know why requests didn't document this new requirement that they'd added. :)) > Maybe a suggestion would be to add the path of the module to the error > message? That would probably help, but think about what it entails: Most AttributeErrors aren't on module objects, they're on instances of user-defined classes with a typo, or on None because the user forgot a "return" somewhere, or on str because the user didn't realize the difference between the string representation of an object and the objects, etc. To make matters worse, AttributeError objects don't even carry the name of the object being attributed, so even if you wanted to make tracebacks do some magic if isinstance(obj, types.ModuleType), there's no way to do it. So, that means you'd have to make ModuleType.__getattr__ do the special error message formatting. > (Of course, another option would be to look for other modules of the > same name when you get an attribute-error on a module to aid debugging, > but I think that's too heavy-weight.) If that could be done only when the exception escapes to top level and dumps s traceback, that might be reasonable. And it would _definitely_ be helpful. But I don't think it's possible without major changes. From oscar.j.benjamin at gmail.com Fri Jan 29 19:29:48 2016 From: oscar.j.benjamin at gmail.com (Oscar Benjamin) Date: Sat, 30 Jan 2016 00:29:48 +0000 Subject: [Python-ideas] Prevent importing yourself? In-Reply-To: <86B59981-4CA8-4E01-89D5-FA05884BC22C@yahoo.com> References: <56ABEACA.8000707@nedbatchelder.com> <20160129234226.GA23485@sjoerdjob.com> <86B59981-4CA8-4E01-89D5-FA05884BC22C@yahoo.com> Message-ID: On 30 January 2016 at 00:16, Andrew Barnert via Python-ideas wrote: >> Maybe a suggestion would be to add the path of the module to the error >> message? 
> > That would probably help, but think about what it entails: > > Most AttributeErrors aren't on module objects, they're on instances of user-defined classes with a typo, or on None because the user forgot a "return" somewhere, or on str because the user didn't realize the difference between the string representation of an object and the objects, etc. Oh yeah, good point. Somehow I read the AttributeError as an ImportError e.g. $ python random.py Traceback (most recent call last): File "random.py", line 1, in import urllib2 File "/usr/lib/python2.7/urllib2.py", line 94, in import httplib File "/usr/lib/python2.7/httplib.py", line 80, in import mimetools File "/usr/lib/python2.7/mimetools.py", line 6, in import tempfile File "/usr/lib/python2.7/tempfile.py", line 35, in from random import Random as _Random ImportError: cannot import name Random That error message could be changed to something like ImportError: cannot import name Random from module 'random' (/home/oscar/random.py) Attribute errors would be more problematic. -- Oscar From rosuav at gmail.com Fri Jan 29 22:11:13 2016 From: rosuav at gmail.com (Chris Angelico) Date: Sat, 30 Jan 2016 14:11:13 +1100 Subject: [Python-ideas] Prevent importing yourself? In-Reply-To: References: <56ABEACA.8000707@nedbatchelder.com> <20160129234226.GA23485@sjoerdjob.com> Message-ID: On Sat, Jan 30, 2016 at 11:13 AM, Oscar Benjamin wrote: > In general though I think it's unfortunate that it's possible to be > able to override installed or even stdlib modules just by having a .py > file with the same name in the same directory as the running script. I > had a discussion with a student earlier today about why '.' is not > usually on PATH for precisely this reason: you basically never want > the ls (or whatever) command to run a program in the current > directory. > One solution would be to always work in a package. As of Python 3, implicit relative imports don't happen, so you should be safe. Maybe there could be a flag like -m that means "run as if current directory is a module"? You can change to a parent directory and run "python3 -m dir.file" to run dir/file.py; if "python3 -r file" could run file.py from the current directory (and assume the presence of an empty __init__.py if one isn't found), that would prevent all accidental imports - if you want to grab a file from right next to you, that's "from . import otherfile", which makes perfect sense. It'd be 100% backward compatible, as the new behaviour would take effect only if the option is explicitly given. Doable? ChrisA From abarnert at yahoo.com Fri Jan 29 22:23:01 2016 From: abarnert at yahoo.com (Andrew Barnert) Date: Fri, 29 Jan 2016 19:23:01 -0800 Subject: [Python-ideas] Prevent importing yourself? In-Reply-To: References: <56ABEACA.8000707@nedbatchelder.com> <20160129234226.GA23485@sjoerdjob.com> Message-ID: On Jan 29, 2016, at 19:11, Chris Angelico wrote: > > On Sat, Jan 30, 2016 at 11:13 AM, Oscar Benjamin > wrote: >> In general though I think it's unfortunate that it's possible to be >> able to override installed or even stdlib modules just by having a .py >> file with the same name in the same directory as the running script. I >> had a discussion with a student earlier today about why '.' is not >> usually on PATH for precisely this reason: you basically never want >> the ls (or whatever) command to run a program in the current >> directory. > > One solution would be to always work in a package. As of Python 3, > implicit relative imports don't happen, so you should be safe. 
Maybe > there could be a flag like -m that means "run as if current directory > is a module"? You can change to a parent directory and run "python3 -m > dir.file" to run dir/file.py; if "python3 -r file" could run file.py > from the current directory (and assume the presence of an empty > __init__.py if one isn't found), that would prevent all accidental > imports - if you want to grab a file from right next to you, that's > "from . import otherfile", which makes perfect sense. > > It'd be 100% backward compatible, as the new behaviour would take > effect only if the option is explicitly given. Doable? I like it. The only problem is that people on platforms where you can add the -r on the shbang line will start doing that, and then their scripts won't be portable... > > ChrisA > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From steve at pearwood.info Fri Jan 29 23:10:02 2016 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 30 Jan 2016 15:10:02 +1100 Subject: [Python-ideas] A bit meta In-Reply-To: References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> <20160129080349.GD4619@ando.pearwood.info> <56AB94A2.20901@stoneleaf.us> Message-ID: <20160130041001.GI4619@ando.pearwood.info> On Fri, Jan 29, 2016 at 05:56:57PM +0000, Brett Cannon wrote: > A better fit would be something like https://www.uservoice.com/ if people > wanted a focused "vote on ideas" solution, I don't think treating language design as a participatory democracy would be a good idea, even if it were practical. (How could you get all Python users to vote? Do casual users who only use Python occasionally get fractional votes?) If it were, Python would probably look and behave a lot more like PHP. And even representative democracy has practical problems. (Who speaks for the users of numpy? Sys admins? Teachers?) I'm 100% in favour of community participation and would like to encourage people to participate and be heard, but I don't think we should have any illusions about the fundamentally non-democratic nature of language design. Nor do I think that's necessarily a bad thing. Not everything needs to be decided by voting. I think it is far more honest to admit that language design is always going to be an authoritarian process where a small elite, possibly even a single person, decides what makes it into the language and what doesn't, than to try to claim democratic legitimcy via voting that cannot possibly be representative. > or something like > https://www.discourse.org/ for a more modern forum platform that has the > concept of likes for a thread. Ah, "like" buttons. The way to feel good about yourself for participating without actually participating :-) Well, I suppose it's a bit less disruptive than having hordes of "Me too!!!1!" posts. 
-- Steve From guido at python.org Fri Jan 29 23:35:30 2016 From: guido at python.org (Guido van Rossum) Date: Fri, 29 Jan 2016 20:35:30 -0800 Subject: [Python-ideas] A bit meta In-Reply-To: <20160130041001.GI4619@ando.pearwood.info> References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> <20160129080349.GD4619@ando.pearwood.info> <56AB94A2.20901@stoneleaf.us> <20160130041001.GI4619@ando.pearwood.info> Message-ID: On Fri, Jan 29, 2016 at 8:10 PM, Steven D'Aprano wrote: > On Fri, Jan 29, 2016 at 05:56:57PM +0000, Brett Cannon wrote: > >> A better fit would be something like https://www.uservoice.com/ if people >> wanted a focused "vote on ideas" solution, > > I don't think treating language design as a participatory democracy > would be a good idea, even if it were practical. (How could you get all > Python users to vote? Do casual users who only use Python occasionally > get fractional votes?) If it were, Python would probably look and behave > a lot more like PHP. > > And even representative democracy has practical problems. (Who speaks > for the users of numpy? Sys admins? Teachers?) > > I'm 100% in favour of community participation and would like to > encourage people to participate and be heard, but I don't think we > should have any illusions about the fundamentally non-democratic nature > of language design. Nor do I think that's necessarily a bad thing. Not > everything needs to be decided by voting. > > I think it is far more honest to admit that language design is always > going to be an authoritarian process where a small elite, possibly even > a single person, decides what makes it into the language and what > doesn't, than to try to claim democratic legitimcy via voting that > cannot possibly be representative. > > > >> or something like >> https://www.discourse.org/ for a more modern forum platform that has the >> concept of likes for a thread. > > Ah, "like" buttons. The way to feel good about yourself for > participating without actually participating :-) > > Well, I suppose it's a bit less disruptive than having hordes of > "Me too!!!1!" posts. Let me clarify why I like StackExchange. I don't care about the voting for/against answers or even about the selection of the "best" answer by the OP. I do like that the reputation system of the site automatically recognizes users who should be given more responsibilities (up to and including deleting inappropriate posts -- rarely). What I like most is that the site encourages the creation of artifacts that are useful to reference later, e.g. when a related issue comes up again later. And I think it will be easier for new folks to participate than the current mailing list (where if you don't sign up for it you're likely to miss most replies, while if you do sign up, you'll be inundated with traffic -- not everybody is a wizard at managing high volume mailing list traffic). I don't understand the issues brought up about the SE site creation process. 22 years ago we managed to create a Usenet newsgroup, comp.lang.python. Surely today we can figure out how to create a SE site? 
-- 
--Guido van Rossum (python.org/~guido)

From ben+python at benfinney.id.au  Sat Jan 30 00:16:27 2016
From: ben+python at benfinney.id.au (Ben Finney)
Date: Sat, 30 Jan 2016 16:16:27 +1100
Subject: [Python-ideas] A collaborative Q&A site for Python (was: A bit meta)
References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org>
 <20160129080349.GD4619@ando.pearwood.info>
 <56AB94A2.20901@stoneleaf.us>
 <20160130041001.GI4619@ando.pearwood.info>
Message-ID: <8560ybfxzo.fsf_-_@benfinney.id.au>

Guido van Rossum writes:

> I don't understand the issues brought up about the SE site creation
> process. 22 years ago we managed to create a Usenet newsgroup,
> comp.lang.python. Surely today we can figure out how to create a SE
> site?

We have done, several times. One popular option is Askbot
. I'd be happy to see a PSF-blessed instance of Askbot running at a
"foo.python.org" domain.

That said, it would be wise to reflect that creating the software is
not the hard part; continually responding to community needs, and
managing the system so desirable behaviours are encouraged, is the hard
part .

-- 
 \       "If nature has made any one thing less susceptible than all |
  `\     others of exclusive property, it is the action of the thinking |
_o__)    power called an idea" --Thomas Jefferson, 1813-08-13 |
Ben Finney

From guido at python.org  Sat Jan 30 00:24:18 2016
From: guido at python.org (Guido van Rossum)
Date: Fri, 29 Jan 2016 21:24:18 -0800
Subject: [Python-ideas] A collaborative Q&A site for Python (was: A bit meta)
In-Reply-To: <8560ybfxzo.fsf_-_@benfinney.id.au>
References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org>
 <20160129080349.GD4619@ando.pearwood.info>
 <56AB94A2.20901@stoneleaf.us>
 <20160130041001.GI4619@ando.pearwood.info>
 <8560ybfxzo.fsf_-_@benfinney.id.au>
Message-ID:

On Fri, Jan 29, 2016 at 9:16 PM, Ben Finney wrote:
> Guido van Rossum writes:
>
>> I don't understand the issues brought up about the SE site creation
>> process. 22 years ago we managed to create a Usenet newsgroup,
>> comp.lang.python. Surely today we can figure out how to create a SE
>> site?
>
> We have done, several times. One popular option is Askbot
> . I'd be happy to see a
> PSF-blessed instance of Askbot running at a "foo.python.org" domain.
>
> That said, it would be wise to reflect that creating the software is not
> the hard part; continually responding to community needs, and managing
> the system so desirable behaviours are encouraged, is the hard part
> .

Oh, I wasn't talking about creating more software. I was assuming we
could find a way to join the SE network. IOW let Jeff Atwood and co.
take care of that stuff, so we can focus on having meaningful
discussions.

(Or were you trolling?)

-- 
--Guido van Rossum (python.org/~guido)

From stephen at xemacs.org  Sat Jan 30 00:52:11 2016
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Sat, 30 Jan 2016 14:52:11 +0900
Subject: [Python-ideas] A bit meta
In-Reply-To: <20160130041001.GI4619@ando.pearwood.info>
References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org>
 <20160129080349.GD4619@ando.pearwood.info>
 <56AB94A2.20901@stoneleaf.us>
 <20160130041001.GI4619@ando.pearwood.info>
Message-ID: <22188.20363.305368.29340@turnbull.sk.tsukuba.ac.jp>

Steven D'Aprano writes:

> > I don't think treating language design as a participatory democracy
> > would be a good idea, even if it were practical.

Fred Brooks (The Mythical Man-Month, "The Surgical Team") agreed with
you 40 years ago. In fact he argued that dictatorship was best for
software systems in general.

> > Ah, "like" buttons. The way to feel good about yourself for
> > participating without actually participating :-)

Guido van Rossum[1] responds:

> And I think it will be easier for new folks to participate than the
> current mailing list (where if you don't sign up for it you're
> likely to miss most replies, while if you do sign up, you'll be
> inundated with traffic -- not everybody is a wizard at managing
> high volume mailing list traffic).

But as you'll recall Antoine not so long ago no-mail'ed python-ideas
and possibly python-dev because of the volume of participation by
people whose comments were unlikely in the extreme to have any effect
on the decision being discussed.[2] I don't know how many other core
developers have taken that course, but there certainly was a lot of
sympathy for Antoine -- and IMO justifiably so. Noblesse oblige can go
only so far, and in the face of "like" buttons....

I agree that reputation systems are very interesting, but in the case
of design channels that need (in the sense Steven described well) to be
dominated by an "elite", I suspect they could make it very hard to
achieve promotion to "elite" status as quickly as python-dev often
does. I consider the openness of Python core to potential new
members[3] to be a distinguishing characteristic of this community. It
would be unfortunate if potential were obscured by initial low
reputation.

On the other hand, one attribute that you have mentioned (the ease of
finding issues) has a useful effect. To the extent that StackExchange
makes traffic management easy (specifically filtering, threading, and
linking), it might encourage users to follow links to other threads
where relevant discussion is posted. In the thread where Antoine spoke
up, the fact that the discussion that led to the main decision was on
python-committers almost certainly had a lot to do with the fact that
most of the posters were unaware that the main decision was final, and
of the reasons for and against the decision that had already been
discussed. And those reasons were rehashed endlessly! A forum that
encourages retrieval of previous discussion before posting would make a
big difference, I suspect. Eg, one with a check box "I have read and
understood the discussions cited and I still want to post"[4] for
comment entry and a "No! He didn't do his homework!" button next to
the posted comment.

But an experiment, eg, with core-mentorship or a SIG, would be good. As
Ben says, designing systems involving people is *hard*, and you
frequently see unintended effects. Unfortunately, those effects are
perverse far more often than not.[5]

Footnotes:
[1] The juxtaposition of Guido's words with Steven's is intentional,
though no insult is intended to either.

[2] I'm sorry about the wording, but I don't have a better one.
Python channels do not ignore *people*. However, new participants are
more likely to make comments that will have no effect, and thus their
comments are likely to be ignored or dismissed with a stock response.
Especially if to the experienced eye the comment has already been
responded to fully in the same thread.

[3] Every core wants new members who can fit right in. What makes
Python different from the typical project is effective mentoring of
those with mere potential.

[4] Like the old Usenet newsreaders used to.

[5] Which is why my field is justifiably known as "The Dismal Science."

From ben+python at benfinney.id.au  Sat Jan 30 01:29:37 2016
From: ben+python at benfinney.id.au (Ben Finney)
Date: Sat, 30 Jan 2016 17:29:37 +1100
Subject: [Python-ideas] A collaborative Q&A site for Python
References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org>
 <20160129080349.GD4619@ando.pearwood.info>
 <56AB94A2.20901@stoneleaf.us>
 <20160130041001.GI4619@ando.pearwood.info>
 <8560ybfxzo.fsf_-_@benfinney.id.au>
Message-ID: <85wpqreg1a.fsf@benfinney.id.au>

Guido van Rossum writes:

> Oh, I wasn't talking about creating more software. I was assuming we
> could find a way to join the SE network. IOW let Jeff Atwood and co.
> take care of that stuff, so we can focus on having meaningful
> discussions.

Ah. I guess I work from the assumption we'd want the PSF to keep
control of our own tools for collaboration, unless there's good reason
otherwise.

> (Or were you trolling?)

No, just didn't understand the differing priorities.

-- 
 \       "I doubt, therefore I might be." --anonymous |
  `\ |
_o__) |
Ben Finney

From sjoerdjob at sjec.nl  Sat Jan 30 01:42:48 2016
From: sjoerdjob at sjec.nl (Sjoerd Job Postmus)
Date: Sat, 30 Jan 2016 07:42:48 +0100
Subject: [Python-ideas] A bit meta
In-Reply-To: <22188.20363.305368.29340@turnbull.sk.tsukuba.ac.jp>
References: <56AB94A2.20901@stoneleaf.us>
 <20160130041001.GI4619@ando.pearwood.info>
 <22188.20363.305368.29340@turnbull.sk.tsukuba.ac.jp>
Message-ID: <20160130064248.GA27229@sjoerdjob.com>

On Sat, Jan 30, 2016 at 02:52:11PM +0900, Stephen J. Turnbull wrote:
> Steven D'Aprano writes:
> ...
> On the other hand, one attribute that you have mentioned (the ease of
> finding issues) has a useful effect. To the extent that StackExchange
> makes traffic management easy (specifically filtering, threading, and
> linking), it might encourage users to follow links to other threads
> where relevant discussion is posted. ... . A forum that encourages
> retrieval of previous discussion before posting would make a big
> difference, I suspect. Eg, one with a check box "I have read and
> understood the discussions cited and I still want to post"[4] for
> comment entry and a "No! He didn't do his homework!" button next to
> the posted comment.

To be honest, I don't think that would make that big of a difference,
less so than the difference caused by having the discussion area more
easily accessible. In the end, there's always going to be a group of
people who are likely to ignore best practices and add irrelevant
comments (some of whom will not really learn). In fact, I'd expect the
ease of using a website to make it more likely for people to join who
at first do not follow best practices at all. Think of using a mailing
list as placing a filter on minimum intellect.

I'm also a visitor on some of the stack-exchange sites, and I see a lot
of topics that get closed quite soon-ish on account of not fitting the
model of a Q&A site for [insert any of a thousand reasons here]. On the
other hand, in (at least) python-ideas and python-dev, I don't see any
'crap' coming by. Sometimes an idea I might think of as crap, but at
least the idea is (quite often) well-substantiated and argued for in
the initial postings.

I myself am inclined to assign the praise for the high-quality to not
only the core community, but also to the somewhat unusual sign-up
procedure[1], and would be very (happily) surprised if the quality
would stay the same when switching to something web-based with an
obvious UI.

[1] Unusual in the sense that it's so not 2016 to have a mailing list
instead of a web forum. Mailing lists are a lot less common now than
they were some time ago.

> ...
>
> [2] I'm sorry about the wording, but I don't have a better one.
> Python channels do not ignore *people*. However, new participants are
> more likely to make comments that will have no effect, and thus their
> comments are likely to be ignored or dismissed with a stock response.
> Especially if to the experienced eye the comment has already been
> responded to fully in the same thread.
>

Maybe there should be a document describing expected behaviour, instead
of expecting people to somehow 'get' it by observing. For instance, I
did not know if it was OK for me to say '-1' or '+0' or ... on a
suggestion. If there were some guidelines on that, "Everybody can
'vote', but please keep in mind that ..." and "Voting happens by " as
well as something along the lines of "It's not a democracy, voting is
just a way of showing your support for/against, but there will not be a
formal tally.".

> [3] Every core wants new members who can fit right in. What makes
> Python different from the typical project is effective mentoring of
> those with mere potential.

On that topic, would it make sense to at the very least make a list of
some things you want to look for in members 'who can fit right in'?
Well, I'm not really sure that would be a good idea, but what I think
might be a good idea would be something to help people in drawing up
their opening post with an idea. That would help in people getting an
idea of what would be effective behaviour. Things like:

- If you're proposing syntax changes, please document as fully as
  possible why what you want is not possible with the current syntax,
  or just too burdensome. Why what you want to do is common enough to
  justify the additional burden of the mental overhead the suggested
  syntax (naturally) imposes. Yes, your new syntax might reduce the
  mental overhead in the case you are considering, but please keep ...
  in mind.
- ... (Additional suggestions here)

(now, I'm just brainstorming here, but suggestions that would help
people write better opening posts, or give more effective feedback
would probably not hurt. However, I don't think I'm the proper person
to write down suggestions like that, as I'm still relatively new)

From sjoerdjob at sjec.nl  Sat Jan 30 01:58:51 2016
From: sjoerdjob at sjec.nl (Sjoerd Job Postmus)
Date: Sat, 30 Jan 2016 07:58:51 +0100
Subject: [Python-ideas] Prevent importing yourself?
In-Reply-To: <86B59981-4CA8-4E01-89D5-FA05884BC22C@yahoo.com>
References: <56ABEACA.8000707@nedbatchelder.com>
 <20160129234226.GA23485@sjoerdjob.com>
 <86B59981-4CA8-4E01-89D5-FA05884BC22C@yahoo.com>
Message-ID: <20160130065851.GB27229@sjoerdjob.com>

On Fri, Jan 29, 2016 at 04:16:46PM -0800, Andrew Barnert wrote:
> On Jan 29, 2016, at 15:42, Sjoerd Job Postmus wrote:
> >
> > What I experienced was having collisions on the python-path, and modules
> > from my codebase colliding with libraries in the stdlib (or outside it).
> > For example, a library might import one of its dependencies which
> > coincidentally had the same name as one of the libraries I have.
>
> Yes. The version of this I've seen most from novices is that they write
> a program named "json.py" that imports and uses requests, which tries
> to use the stdlib module json, which gives them an AttributeError on
> json.loads.
>
> (One of my favorite questions on StackOverflow came from a really smart
> novice who'd written a program called "time.py", and he got an error
> about time.time on one machine, but not another.
> He figured out that obviously, requests wants him to define his own
> time function, which he was able to do by using the stuff in datetime.
> And he figured out the probable difference between the two
> machines--the working one had an older version of requests. He just
> wanted to know why requests didn't document this new requirement that
> they'd added. :))

> > Maybe a suggestion would be to add the path of the module to the error
> > message?
>
> That would probably help, but think about what it entails:
>
> Most AttributeErrors aren't on module objects, they're on instances of
> user-defined classes with a typo, or on None because the user forgot a
> "return" somewhere, or on str because the user didn't realize the
> difference between the string representation of an object and the
> objects, etc.

True. Most AttributeErrors are on user-defined classes with a typo. But
that's not the case we're discussing here. Here we are discussing how a
user should debug the effects of module name collisions, and the
resulting AttributeError.

I would expect it to be quite unlikely that two modules with the same
name each have a class with the same name, and you accidentally
initialize the wrong one. More likely (in my experience) is that you
get an AttributeError on a module (in the case of module-name
collisions).

> To make matters worse, AttributeError objects don't even carry the name
> of the object being attributed, so even if you wanted to make
> tracebacks do some magic if isinstance(obj, types.ModuleType), there's
> no way to do it.
>
> So, that means you'd have to make ModuleType.__getattr__ do the special
> error message formatting.

Yes, indeed. That's what I was thinking of. I decided to write up a
quick hack that added the filename to the exception string.

sjoerdjob$ ../python mod_a.py
Traceback (most recent call last):
  File "mod_a.py", line 4, in <module>
    print(parse(JSON_DATA))
  File "/home/sjoerdjob/dev/cpython/tmp/mod_b.py", line 4, in parse
    return json.loads(blob)
AttributeError: module 'json' (loaded from
/home/sjoerdjob/dev/cpython/tmp/json.py) has no attribute 'loads'

Here's the patch, in case anyone is interested.

diff --git a/Objects/moduleobject.c b/Objects/moduleobject.c
index 24c5f4c..5cc144a 100644
--- a/Objects/moduleobject.c
+++ b/Objects/moduleobject.c
@@ -654,17 +654,25 @@ module_repr(PyModuleObject *m)
 static PyObject*
 module_getattro(PyModuleObject *m, PyObject *name)
 {
-    PyObject *attr, *mod_name;
+    PyObject *attr, *mod_name, *mod_file;
     attr = PyObject_GenericGetAttr((PyObject *)m, name);
     if (attr || !PyErr_ExceptionMatches(PyExc_AttributeError))
         return attr;
     PyErr_Clear();
     if (m->md_dict) {
         _Py_IDENTIFIER(__name__);
         mod_name = _PyDict_GetItemId(m->md_dict, &PyId___name__);
         if (mod_name) {
-            PyErr_Format(PyExc_AttributeError,
+            _Py_IDENTIFIER(__file__);
+            mod_file = _PyDict_GetItemId(m->md_dict, &PyId___file__);
+            if (mod_file && PyUnicode_Check(mod_file)) {
+                PyErr_Format(PyExc_AttributeError,
+                    "module '%U' (loaded from %U) has no attribute '%U'",
+                    mod_name, mod_file, name);
+            } else {
+                PyErr_Format(PyExc_AttributeError,
                    "module '%U' has no attribute '%U'", mod_name, name);
+            }
             return NULL;
         }
         else if (PyErr_Occurred()) {

Unfortunately, I do think this might impose **some** performance issue,
but on the other hand, I'd be inclined to think that attribute-errors
on module objects are not that likely to begin with, except for typos
and issues like these.
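A minimal sketch of the kind of microbenchmark that could settle this
performance question, using only the stdlib timeit module (the module
and the misspelled attribute below are arbitrary choices, and only the
failing lookup exercises the new formatting path):

import timeit

setup = "import json"

# Fast path: the attribute exists, so PyObject_GenericGetAttr succeeds
# and the extra formatting code never runs.
hit = timeit.timeit("json.loads", setup=setup, number=1000000)

# Slow path: the attribute is missing ("lodas" is a deliberate typo),
# so the enriched message including __file__ would be built each time.
miss_stmt = "try:\n    json.lodas\nexcept AttributeError:\n    pass"
miss = timeit.timeit(miss_stmt, setup=setup, number=1000000)

print("hit: %.3fs  miss: %.3fs  (per 1e6 lookups)" % (hit, miss))

Only the "miss" timing can regress under the patch, and failing module
attribute lookups are rare in working code, which supports the point
made above.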
(And of course the case that you have to support older versions of
Python with a slower implementation, but you most often see those
checks being done at the module-level, so it would only impact
load-time and not running-time.)

The added benefit would be quicker debugging when finally having posted
to a forum: "Ah, I see from the message that the path of the module is
not likely a standard-library path. Maybe you have a name collision?
Check for files or directories named '(.py)' in your working directory
/ project / ... .

> > (Of course, another option would be to look for other modules of the
> > same name when you get an attribute-error on a module to aid debugging,
> > but I think that's too heavy-weight.)
>
> If that could be done only when the exception escapes to top level and
> dumps a traceback, that might be reasonable. And it would _definitely_
> be helpful. But I don't think it's possible without major changes.

No, indeed, that was also my expectation: helpful, but too big a hassle
to be worth it.

From abarnert at yahoo.com  Sat Jan 30 02:44:31 2016
From: abarnert at yahoo.com (Andrew Barnert)
Date: Fri, 29 Jan 2016 23:44:31 -0800
Subject: [Python-ideas] Prevent importing yourself?
In-Reply-To: <20160130065851.GB27229@sjoerdjob.com>
References: <56ABEACA.8000707@nedbatchelder.com>
 <20160129234226.GA23485@sjoerdjob.com>
 <86B59981-4CA8-4E01-89D5-FA05884BC22C@yahoo.com>
 <20160130065851.GB27229@sjoerdjob.com>
Message-ID: <7D3CC0E9-C9BA-42F9-9008-9CFE9667F800@yahoo.com>

On Jan 29, 2016, at 22:58, Sjoerd Job Postmus wrote:
>
>> On Fri, Jan 29, 2016 at 04:16:46PM -0800, Andrew Barnert wrote:
>>> On Jan 29, 2016, at 15:42, Sjoerd Job Postmus wrote:
>>>
>>> Maybe a suggestion would be to add the path of the module to the error
>>> message?
>>
>> That would probably help, but think about what it entails:
>>
>> Most AttributeErrors aren't on module objects, they're on instances of
>> user-defined classes with a typo, or on None because the user forgot a
>> "return" somewhere, or on str because the user didn't realize the
>> difference between the string representation of an object and the
>> objects, etc.
>
> True. Most AttributeErrors are on user-defined classes with a typo. But
> that's not the case we're discussing here. Here we are discussing how a
> user should debug the effects of module name collisions, and the
> resulting AttributeError.

Right. So my point is, either we have to do the extra work in
module.__getattr__ when formatting the string, or we have to extend the
interface of AttributeError to carry more information in general (the
object and attr name, presumably). The latter may be better, but it's
also clearly not going to happen any time soon. (People have been
suggesting since before 3.0 that all the standard exceptions should
have more useful info, but nobody's volunteered to change the hundreds
of lines of C code, Python code, and docs to do it...)

So, the only argument against your idea I can see is the potential
performance issues. Which should be pretty easy to dismiss with a
microbenchmark showing it's pretty small even in the worst case and a
macrobenchmark showing it's not even measurable in real code, right?

From skreft at gmail.com  Sat Jan 30 03:05:23 2016
From: skreft at gmail.com (Sebastian Kreft)
Date: Sat, 30 Jan 2016 19:05:23 +1100
Subject: [Python-ideas] Prevent importing yourself?
In-Reply-To: <7D3CC0E9-C9BA-42F9-9008-9CFE9667F800@yahoo.com>
References: <56ABEACA.8000707@nedbatchelder.com>
 <20160129234226.GA23485@sjoerdjob.com>
 <86B59981-4CA8-4E01-89D5-FA05884BC22C@yahoo.com>
 <20160130065851.GB27229@sjoerdjob.com>
 <7D3CC0E9-C9BA-42F9-9008-9CFE9667F800@yahoo.com>
Message-ID:

On Jan 30, 2016 6:45 PM, "Andrew Barnert via Python-ideas" <
python-ideas at python.org> wrote:
>
> On Jan 29, 2016, at 22:58, Sjoerd Job Postmus wrote:
> >
> >> On Fri, Jan 29, 2016 at 04:16:46PM -0800, Andrew Barnert wrote:
> >>> On Jan 29, 2016, at 15:42, Sjoerd Job Postmus wrote:
> >>>
> >>> Maybe a suggestion would be to add the path of the module to the error
> >>> message?
> >>
> >> That would probably help, but think about what it entails:
> >>
> >> Most AttributeErrors aren't on module objects, they're on instances of
> >> user-defined classes with a typo, or on None because the user forgot a
> >> "return" somewhere, or on str because the user didn't realize the
> >> difference between the string representation of an object and the
> >> objects, etc.
> >
> > True. Most AttributeErrors are on user-defined classes with a typo. But
> > that's not the case we're discussing here. Here we are discussing how a
> > user should debug the effects of module name collisions, and the
> > resulting AttributeError.
>
> Right. So my point is, either we have to do the extra work in
> module.__getattr__ when formatting the string, or we have to extend the
> interface of AttributeError to carry more information in general (the
> object and attr name, presumably). The latter may be better, but it's
> also clearly not going to happen any time soon. (People have been
> suggesting since before 3.0 that all the standard exceptions should have
> more useful info, but nobody's volunteered to change the hundreds of
> lines of C code, Python code, and docs to do it...)

PEP 473 centralizes all of these requests:
https://www.python.org/dev/peps/pep-0473/. I started adding support for
NameError, the most simple change, and it turned out to be much more
complex than I had thought. I had a couple of tests which were failing
and didn't have the bandwidth to debug.

>
> So, the only argument against your idea I can see is the potential
> performance issues. Which should be pretty easy to dismiss with a
> microbenchmark showing it's pretty small even in the worst case and a
> macrobenchmark showing it's not even measurable in real code, right?
>
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ianlee1521 at gmail.com  Sat Jan 30 03:47:43 2016
From: ianlee1521 at gmail.com (Ian Lee)
Date: Sat, 30 Jan 2016 00:47:43 -0800
Subject: [Python-ideas] A bit meta
In-Reply-To:
References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org>
 <20160129080349.GD4619@ando.pearwood.info>
 <56AB94A2.20901@stoneleaf.us>
 <20160130041001.GI4619@ando.pearwood.info>
Message-ID: <2067CA09-F205-41A2-A0E0-17065DB1BE00@gmail.com>

So with the upcoming move to GitHub of the CPython repository, planned
with PEP-512 [1], what about the idea of creating a Git repository on
GitHub to serve as a replacement for a mailing list, for example
python-ideas? Such a repo might be hosted off the "Python" GitHub
organization: https://github.com/python/python-ideas

> On Jan 29, 2016, at 20:35, Guido van Rossum wrote:
>
> What I like most is that the site encourages the creation of
> artifacts that are useful to reference later, e.g. when a related
> issue comes up again later. And I think it will be easier for new
> folks to participate than the current mailing list (where if you don't
> sign up for it you're likely to miss most replies, while if you do
> sign up, you'll be inundated with traffic -- not everybody is a wizard
> at managing high volume mailing list traffic).

Such a repository could address a number of items brought up above,
including providing a permanent link to artifacts: comments and threads
(issues in the issue tracker). The ability to watch the entire "repo"
(mailing list) and unsubscribe to "issues" (threads) that are no longer
interesting to a watcher (similarly to the way "muting" works in
Gmail). Or vice versa, have notifications off by default and able to
opt into notifications if something interesting catches your eye.

Additionally, this would provide a straight forward and pretty easy way
to link discussions to actual changes in other repos in an easier way
than something like "CPython at commit XYZ123".

Stephen J. Turnbull writes:
> On the other hand, one attribute that you have mentioned (the ease of
> finding issues) has a useful effect. To the extent that StackExchange
> makes traffic management easy (specifically filtering, threading, and
> linking), it might encourage users to follow links to other threads
> where relevant discussion is posted. In the thread where Antoine
> spoke up, the fact that the discussion that led to the main decision
> was on python-committers almost certainly had a lot to do with the
> fact that most of the posters were unaware that the main decision was
> final, and of the reasons for and against the decision that had
> already been discussed. And those reasons were rehashed endlessly! A
> forum that encourages retrieval of previous discussion before posting
> would make a big difference, I suspect. Eg, one with a check box "I
> have read and understood the discussions cited and I still want to
> post"[4] for comment entry and a "No! He didn't do his homework!"
> button next to the posted comment.

A lot of the filtering, sorting, and other benefits that Stephen
mentions would be available through GitHub's searching capabilities,
and others such as tagging of "issues" (mail threads) with labels
(peps, new feature, duplicate, change existing functionality, etc. come
to mind). Additionally, an issue / thread in the repo could be "closed"
when it is off topic, with future issues opened being able to be
closed, marked as "duplicate" and linked against the old closed issue
to try to provide that bit of history without needing to take as much
time to re-write the response.

Other benefits include syntax highlighting, markdown formatting (which
was announced this week [2]), and ability to interact with the thread
via email (replying to the email creates a comment on the issue) or
through the browser (which is nice for the presumably small, but at
least >= 1 population that have their personal email blocked by their
corporate firewall).

I could also see there being a lot of benefit in making the actual code
in the repository be things like contributing information, what is
appropriate to say / ask on each list, etc. For lists like
core-workflow I could even see this evolving to where the "Code" was a
GitHub Pages [3] page that actually hosts directly something like the
contributor guide (which could still live at whatever URL was desired,
while letting GitHub do the actual hosting). Extra benefit is that it
provides a very straightforward way to update some of the developer,
contributor, and mentoring guides.

It doesn't "solve" some of the other issues such as voting, reputation
of a user, etc. However, I'm not hearing a resounding desire for those
anyways.

There is at least *some* precedent for this in the form of the
Government GitHub [4][5] community and related agencies such as 18F
[6]. The former has a "best practices" repository [7], which serves
this same purpose of communicating and discussing ideas, without
necessarily being a code repository. Unfortunately, that repository is
a private repository and requires a government email address and
joining the "government" organization to access; see [8] for details on
joining if you're interested.

[1] https://github.com/brettcannon/github-transition-pep/blob/master/pep-0512.rst
[2] https://github.com/blog/2097-improved-commenting-with-markdown
[3] https://pages.github.com
[4] https://government.github.com/
[5] https://github.com/government
[6] https://github.com/18F/
[7] https://github.com/government/best-practices
[8] https://github.com/government/welcome

~ Ian Lee | IanLee1521 at gmail.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From rosuav at gmail.com  Sat Jan 30 03:50:48 2016
From: rosuav at gmail.com (Chris Angelico)
Date: Sat, 30 Jan 2016 19:50:48 +1100
Subject: [Python-ideas] A bit meta
In-Reply-To: <2067CA09-F205-41A2-A0E0-17065DB1BE00@gmail.com>
References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org>
 <20160129080349.GD4619@ando.pearwood.info>
 <56AB94A2.20901@stoneleaf.us>
 <20160130041001.GI4619@ando.pearwood.info>
 <2067CA09-F205-41A2-A0E0-17065DB1BE00@gmail.com>
Message-ID:

On Sat, Jan 30, 2016 at 7:47 PM, Ian Lee wrote:
> Such a repository could address a number of items brought up above,
> including providing a permanent link to artifacts: comments and threads
> (issues in the issue tracker). The ability to watch the entire "repo"
> (mailing list) and unsubscribe to "issues" (threads) that are no longer
> interesting to a watcher (similarly to the way "muting" works in Gmail). Or
> vice versa, have notifications off by default and able to opt into
> notifications if something interesting catches your eye.

How do you change the subject line to indicate that the topic has
drifted (or is a spin-off), while still appropriately quoting the
previous post?

Most web-based discussion systems are built around a concept of
"initial post" and "replies", where the replies always tie exactly to
one initial post. The branching of discussion threads never seems to
work as well as it does in netnews or email.
ChrisA

From ianlee1521 at gmail.com  Sat Jan 30 04:01:31 2016
From: ianlee1521 at gmail.com (Ian Lee)
Date: Sat, 30 Jan 2016 01:01:31 -0800
Subject: [Python-ideas] A bit meta
In-Reply-To:
References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org>
 <20160129080349.GD4619@ando.pearwood.info>
 <56AB94A2.20901@stoneleaf.us>
 <20160130041001.GI4619@ando.pearwood.info>
 <2067CA09-F205-41A2-A0E0-17065DB1BE00@gmail.com>
Message-ID:

> On Jan 30, 2016, at 00:50, Chris Angelico wrote:
>
> On Sat, Jan 30, 2016 at 7:47 PM, Ian Lee wrote:
>> Such a repository could address a number of items brought up above,
>> including providing a permanent link to artifacts: comments and threads
>> (issues in the issue tracker). The ability to watch the entire "repo"
>> (mailing list) and unsubscribe to "issues" (threads) that are no longer
>> interesting to a watcher (similarly to the way "muting" works in Gmail). Or
>> vice versa, have notifications off by default and able to opt into
>> notifications if something interesting catches your eye.
>
> How do you change the subject line to indicate that the topic has
> drifted (or is a spin-off), while still appropriately quoting the
> previous post?
>
> Most web-based discussion systems are built around a concept of
> "initial post" and "replies", where the replies always tie exactly to
> one initial post. The branching of discussion threads never seems to
> work as well as it does in netnews or email.

True, you don't get quite as nice forking of issues, though other
solutions mentioned (e.g. StackOverflow) would have similar issues.

Off the cuff, I'd suggest that this linking could be handled by
creating a new issue which linked to the old issue [1] with something
like "continuing from #12345 ...". This would actually provide an
improvement over the current email approach which only really provides
a link from the forked thread back to the original, by creating a
reference / link to the forked issue in the original, e.g. how [2] and
[3] are linked.

[1] https://help.github.com/articles/autolinked-references-and-urls/#issues-and-pull-requests
[2] https://github.com/python/typing/issues/135
[3] https://github.com/python/typing/issues/136

>
> ChrisA
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

~ Ian Lee | IanLee1521 at gmail.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ncoghlan at gmail.com  Sat Jan 30 04:18:01 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 30 Jan 2016 19:18:01 +1000
Subject: [Python-ideas] A collaborative Q&A site for Python (was: A bit meta)
In-Reply-To:
References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org>
 <20160129080349.GD4619@ando.pearwood.info>
 <56AB94A2.20901@stoneleaf.us>
 <20160130041001.GI4619@ando.pearwood.info>
 <8560ybfxzo.fsf_-_@benfinney.id.au>
Message-ID:

On 30 January 2016 at 15:24, Guido van Rossum wrote:
> On Fri, Jan 29, 2016 at 9:16 PM, Ben Finney wrote:
>> Guido van Rossum writes:
>>
>>> I don't understand the issues brought up about the SE site creation
>>> process. 22 years ago we managed to create a Usenet newsgroup,
>>> comp.lang.python. Surely today we can figure out how to create a SE
>>> site?
>>
>> We have done, several times. One popular option is Askbot
>> . I'd be happy to see a
>> PSF-blessed instance of Askbot running at a "foo.python.org" domain.
>> >> That said, it would be wise to reflect that creating the software is not >> the hard part; continually responding to community needs, and managing >> the system so desirable behaviours are encouraged, is the hard part >> . > > Oh, I wasn't talking about creating more software. I was assuming we > could find a way to join the SE network. IOW let Jeff Atwood and co. > take care of that stuff, so we can focus on having meaningful > discussions. Area 51 is their process for doing that: http://area51.stackexchange.com/ However, while Stack Exchange style sites can be good for "Why is this existing thing the way it is?" Q&A, they're not really designed for proposing *changes* to things, discussing the prospective merits of those changes, and coming to a decision. Loomio is a good example of a site that offers some much better tools for collaborative discussion and decision making: https://www.loomio.org/ You still have the "critical mass" problem though, and for CPython, the critical mass of eyeballs is on python-dev and python-ideas - hence the inclination to try to update that infrastructure to Mailman 3 transparently (thus providing a much improved web gateway for potential new participants and better list management tools for existing subscribers), rather than trying to convince current list members to switch to a different technology. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From mojtaba.gharibi at gmail.com Sat Jan 30 04:24:52 2016 From: mojtaba.gharibi at gmail.com (Mirmojtaba Gharibi) Date: Sat, 30 Jan 2016 04:24:52 -0500 Subject: [Python-ideas] Respectively and its unpacking sentence In-Reply-To: <22187.41498.833689.317428@turnbull.sk.tsukuba.ac.jp> References: <22187.41498.833689.317428@turnbull.sk.tsukuba.ac.jp> Message-ID: Thanks everyone for your feedback. I think I have a clearer look at it as a result. It seems the most important feature is the vector operation aspect of it. Also that magical behavior and the fact that $ or ;;; does not produce types is troublesome. Also, some of the other aspects such as x,y = 1+2, 3+4 is already addressed by the above notation, so we're not gaining anything there. I'll have some ideas to address the concerns and will post them later again. Moj On Fri, Jan 29, 2016 at 12:32 PM, Stephen J. Turnbull wrote: > Pavol Lisy writes: > > > I would really like > > > > (a;b;c) in L > > Not well-specified (does order matter? how about repeated values? is > (a;b;c) an object? it sure looks like one, and if so, object in L > already has a meaning). But for one obvious interpretation: > > {a, b, c} <= set(L) > > and in this interpretation you should probably optimize to > > {a, b, c} <= L > > by constructing L as a set in the first place. Really this thread > probably belongs on python-list anyway. > > From ncoghlan at gmail.com Sat Jan 30 04:30:05 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 30 Jan 2016 19:30:05 +1000 Subject: [Python-ideas] Prevent importing yourself? In-Reply-To: <56ABEACA.8000707@nedbatchelder.com> References: <56ABEACA.8000707@nedbatchelder.com> Message-ID: On 30 January 2016 at 08:42, Ned Batchelder wrote: > Hi, > > A common question we get in the #python IRC channel is, "I tried importing a > module, but I get an AttributeError trying to use the things it said it > provided." Turns out the beginner named their own file the same as the > module they were trying to use. > > That is, they want to try (for example) the "azure" package. 
So they make a > file called azure.py, and start with "import azure". The import succeeds, > but it has none of the contents the documentation claims, because they have > imported themselves. It's baffling, because they have used the exact > statements shown in the examples, but it doesn't work. > > Could we make this a more obvious failure? Is there ever a valid reason for > a file to import itself? Is this situation detectable in the import > machinery? We could potentially detect when __main__ is being reimported under a different name and issue a user visible warning when it happens, but we can't readily detect a file importing itself in the general case (since it may be an indirect circular reference rather than a direct). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From steve at pearwood.info Sat Jan 30 04:44:56 2016 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 30 Jan 2016 20:44:56 +1100 Subject: [Python-ideas] A bit meta In-Reply-To: <2067CA09-F205-41A2-A0E0-17065DB1BE00@gmail.com> References: <56AB94A2.20901@stoneleaf.us> <20160130041001.GI4619@ando.pearwood.info> <2067CA09-F205-41A2-A0E0-17065DB1BE00@gmail.com> Message-ID: <20160130094456.GL4619@ando.pearwood.info> On Sat, Jan 30, 2016 at 12:47:43AM -0800, Ian Lee wrote: > So with the upcoming move to GitHub of the CPython repository, planned > with PEP-512 [1], what about the idea of creating a Git repository on > GitHub to serve as a replacement for a mailing list, for example > python-ideas? Such a repo might be hosted off the ?Python? GitHub > organization: https://github.com/python/python-ideas I think any talk of migrating away from email is greatly premature. The Mailman folks have done a lot of fantastic work with Mailman 3 and Hyperkitty, which will bring many of the benefits of a web forum to the mailing lists. We should at least look at Hyperkitty before planning any widespread move away from email. http://wiki.list.org/HyperKitty -- Steve From ned at nedbatchelder.com Sat Jan 30 06:19:35 2016 From: ned at nedbatchelder.com (Ned Batchelder) Date: Sat, 30 Jan 2016 06:19:35 -0500 Subject: [Python-ideas] Prevent importing yourself? In-Reply-To: References: <56ABEACA.8000707@nedbatchelder.com> Message-ID: <56AC9C47.50304@nedbatchelder.com> On 1/30/16 4:30 AM, Nick Coghlan wrote: > On 30 January 2016 at 08:42, Ned Batchelder wrote: >> Hi, >> >> A common question we get in the #python IRC channel is, "I tried importing a >> module, but I get an AttributeError trying to use the things it said it >> provided." Turns out the beginner named their own file the same as the >> module they were trying to use. >> >> That is, they want to try (for example) the "azure" package. So they make a >> file called azure.py, and start with "import azure". The import succeeds, >> but it has none of the contents the documentation claims, because they have >> imported themselves. It's baffling, because they have used the exact >> statements shown in the examples, but it doesn't work. >> >> Could we make this a more obvious failure? Is there ever a valid reason for >> a file to import itself? Is this situation detectable in the import >> machinery? > We could potentially detect when __main__ is being reimported under a > different name and issue a user visible warning when it happens, but > we can't readily detect a file importing itself in the general case > (since it may be an indirect circular reference rather than a direct). 
I thought about the indirect case, and for the errors I'm trying to make clearer, the direct case is plenty. While we're at it though, re-importing __main__ is a separate kind of behavior that is often a problem, since it means you'll have the same classes defined twice. --Ned. > Cheers, > Nick. > From stephen at xemacs.org Sat Jan 30 06:21:14 2016 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Sat, 30 Jan 2016 20:21:14 +0900 Subject: [Python-ideas] A bit meta In-Reply-To: <20160130094456.GL4619@ando.pearwood.info> References: <56AB94A2.20901@stoneleaf.us> <20160130041001.GI4619@ando.pearwood.info> <2067CA09-F205-41A2-A0E0-17065DB1BE00@gmail.com> <20160130094456.GL4619@ando.pearwood.info> Message-ID: <22188.40106.257470.207521@turnbull.sk.tsukuba.ac.jp> >>>>> On Sat, Jan 30, 2016 at 12:47:43AM -0800, Ian Lee wrote: > So with the upcoming move to GitHub of the CPython repository, planned > with PEP-512 [1], what about the idea of creating a Git repository on > GitHub to serve as a replacement for a mailing list, -1 in general. For existing channels, parallel operation, probably with a gateway, is essential. > for example python-ideas? -1 in particular. There are better candidates for experimentation. >>>>> Steven D'Aprano writes: > I think any talk of migrating away from email is greatly premature. +1 (but I'm a contributor to the GNU Mailman project, so take that with a grain of self-interest). From ncoghlan at gmail.com Sat Jan 30 06:57:05 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 30 Jan 2016 21:57:05 +1000 Subject: [Python-ideas] Prevent importing yourself? In-Reply-To: <56AC9C47.50304@nedbatchelder.com> References: <56ABEACA.8000707@nedbatchelder.com> <56AC9C47.50304@nedbatchelder.com> Message-ID: On 30 January 2016 at 21:19, Ned Batchelder wrote: > On 1/30/16 4:30 AM, Nick Coghlan wrote: >> We could potentially detect when __main__ is being reimported under a >> different name and issue a user visible warning when it happens, but >> we can't readily detect a file importing itself in the general case >> (since it may be an indirect circular reference rather than a direct). > > I thought about the indirect case, and for the errors I'm trying to make > clearer, the direct case is plenty. In that case, the only problem I see off the top of my head with emitting a warning for direct self-imports is that it would rely on import system behaviour we're currently trying to reduce/minimise: the import machinery needing visibility into the globals for the module initiating the import. It's also possible that by the time we get hold of the __spec__ for the module being imported, we've already dropped our reference to the importing module's globals, so we can't check against __file__ any more. However, I'd need to go read the code to remember how quickly we get to extracting just the globals of potential interest. > While we're at it though, re-importing __main__ is a separate kind of > behavior that is often a problem, since it means you'll have the same > classes defined twice. Right, but it combines with the name shadowing behaviour to create a *super* confusing combination when you write a *script* that shadows the name of a standard library module: http://python-notes.curiousefficiency.org/en/latest/python_concepts/import_traps.html#the-name-shadowing-trap Cheers, Nick. 
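A rough Python-level sketch of the direct check described above (the
helper name is invented; an actual fix would live in the import
machinery itself, not in user code): a module can detect that it is a
re-import of the running script by comparing its __file__ against that
of __main__.

import os
import sys
import warnings

def warn_if_reimported_main(module_globals):
    # If this module's source file is the same file as __main__'s, but
    # the module name differs, then the script that started the program
    # is being imported a second time under its "real" module name.
    main = sys.modules.get("__main__")
    main_file = getattr(main, "__file__", None)
    this_file = module_globals.get("__file__")
    if (main_file and this_file
            and os.path.realpath(main_file) == os.path.realpath(this_file)
            and module_globals.get("__name__") != "__main__"):
        warnings.warn("%r is __main__ re-imported under a new name"
                      % module_globals.get("__name__"))

# Usage, at the top of a module:
#     warn_if_reimported_main(globals())

As with the proposal above, this only catches the direct case, which
matches Ned's point that the direct case is plenty.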
-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From oscar.j.benjamin at gmail.com  Sat Jan 30 08:20:25 2016
From: oscar.j.benjamin at gmail.com (Oscar Benjamin)
Date: Sat, 30 Jan 2016 13:20:25 +0000
Subject: [Python-ideas] Prevent importing yourself?
In-Reply-To:
References: <56ABEACA.8000707@nedbatchelder.com>
 <56AC9C47.50304@nedbatchelder.com>
Message-ID:

On 30 January 2016 at 11:57, Nick Coghlan wrote:
> On 30 January 2016 at 21:19, Ned Batchelder wrote:
>> On 1/30/16 4:30 AM, Nick Coghlan wrote:
>>> We could potentially detect when __main__ is being reimported under a
>>> different name and issue a user visible warning when it happens, but
>>> we can't readily detect a file importing itself in the general case
>>> (since it may be an indirect circular reference rather than a direct).
>>
>> I thought about the indirect case, and for the errors I'm trying to make
>> clearer, the direct case is plenty.
>
> In that case, the only problem I see off the top of my head with
> emitting a warning for direct self-imports is that it would rely on
> import system behaviour we're currently trying to reduce/minimise: the
> import machinery needing visibility into the globals for the module
> initiating the import.
>
> It's also possible that by the time we get hold of the __spec__ for
> the module being imported, we've already dropped our reference to the
> importing module's globals, so we can't check against __file__ any
> more. However, I'd need to go read the code to remember how quickly we
> get to extracting just the globals of potential interest.

Maybe this is because I don't really understand how the import
machinery works, but I would say that if I run

$ python random.py

then the interpreter should be able to know that __main__ is called
"random" and know the path to that file. It should also be evident that
if '' is at the front of sys.path then "import random" is going to
import that same module. Why is it difficult to detect that case?

I think it would be better to try and solve the problem a little more
generally though. Having yesterday created a file called random.py in
my user directory (on Ubuntu 15.04), I get the following today:

$ cat random.py
import urllib2
$ python3
Python 3.4.3 (default, Mar 26 2015, 22:03:40)
[GCC 4.9.2] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> *a = [1, 2]
  File "<stdin>", line 1
SyntaxError: starred assignment target must be in a list or tuple
Error in sys.excepthook:
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/apport_python_hook.py", line 63, in apport_excepthook
    from apport.fileutils import likely_packaged, get_recent_crashes
  File "/usr/lib/python3/dist-packages/apport/__init__.py", line 5, in <module>
    from apport.report import Report
  File "/usr/lib/python3/dist-packages/apport/report.py", line 12, in <module>
    import subprocess, tempfile, os.path, re, pwd, grp, os, time
  File "/usr/lib/python3.4/tempfile.py", line 175, in <module>
    from random import Random as _Random
  File "/home/oscar/random.py", line 1, in <module>
    import urllib2
ImportError: No module named 'urllib2'

Original exception was:
  File "<stdin>", line 1
SyntaxError: starred assignment target must be in a list or tuple

-- Oscar

From random832 at fastmail.com  Sat Jan 30 10:43:06 2016
From: random832 at fastmail.com (Random832)
Date: Sat, 30 Jan 2016 10:43:06 -0500
Subject: [Python-ideas] A bit meta
References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org>
 <20160129080349.GD4619@ando.pearwood.info>
 <56AB94A2.20901@stoneleaf.us>
 <20160130041001.GI4619@ando.pearwood.info>
Message-ID:

Guido van Rossum writes:
> Let me clarify why I like StackExchange. I don't care about the voting
> for/against answers or even about the selection of the "best" answer
> by the OP. I do like that the reputation system of the site
> automatically recognizes users who should be given more
> responsibilities (up to and including deleting inappropriate posts --
> rarely).

These can't be separated. Reputation is obtained by writing answers
that people vote for. The site would either have to be structured to
allow that, or an entirely different way of getting reputation... which
would still involve voting on _something_, if it's to be decentralized
and therefore "automatic" rather than requiring you personally to hand
out all reputation points.

From random832 at fastmail.com  Sat Jan 30 10:49:14 2016
From: random832 at fastmail.com (Random832)
Date: Sat, 30 Jan 2016 10:49:14 -0500
Subject: [Python-ideas] A bit meta
References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org>
 <20160129080349.GD4619@ando.pearwood.info>
 <56AB94A2.20901@stoneleaf.us>
 <20160130041001.GI4619@ando.pearwood.info>
 <2067CA09-F205-41A2-A0E0-17065DB1BE00@gmail.com>
Message-ID:

Chris Angelico writes:
> How do you change the subject line to indicate that the topic has
> drifted (or is a spin-off), while still appropriately quoting the
> previous post?

You're free to quote any post anywhere, even if you make a new
thread. In general to do this you have to start your reply in the
original thread, then copy/paste the quote markup (which includes a
magic link to the post you are quoting) into the post new thread form.

It would be interesting to make a forum with a "spin-off thread"
feature, which would automate the placement of the reply in a new
thread and a note in the old thread with a link to the new one.

But in most cases this can't be automated because on better-managed
forums once a digression has grown large enough to need a separate
thread, the forum's moderators will move earlier posts about it
(originally made in the first thread) to the new thread.
(it might be interesting to make a forum that provides a way to have a post live in two different threads at the same time) From nicholas.chammas at gmail.com Sat Jan 30 11:48:18 2016 From: nicholas.chammas at gmail.com (Nicholas Chammas) Date: Sat, 30 Jan 2016 16:48:18 +0000 Subject: [Python-ideas] A bit meta In-Reply-To: References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> <20160129080349.GD4619@ando.pearwood.info> <56AB94A2.20901@stoneleaf.us> <20160130041001.GI4619@ando.pearwood.info> <2067CA09-F205-41A2-A0E0-17065DB1BE00@gmail.com> Message-ID: To follow up on one of the suggestions Brett and Donald made, I think the best solution today for a modern discussion forum is Discourse . Discourse is built by some of the same people who built Stack Overflow, including Jeff Atwood . Among the many excellent features it has is full support for a ?mailing list mode?, where you can reply to and start new conversations entirely via email. That may be important for people who are not interested in using the web for their conversations. Discourse doesn?t currently have a voting plugin, but here is an interesting discussion about adding one . Just earlier today a member of the Discourse team followed-up on that discussion with a detailed proposal to make the plugin real . As an example of the polish Discourse already has, consider this remark by Random832: It would be interesting to make a forum with a ?spin-off thread? feature, which would automate the placement of the reply in a new thread and a note in the old thread with a link to the new one. If you look at the post I linked to about adding a voting plugin , you can see just this kind of link offered by Discourse since the poster spun that new thread from an existing one. A large open source community using Discourse today is Docker . If Donald sets up a Discourse instance for Packaging, that should serve as a good trial for us to deploy it elsewhere. I suspect it will be a success. As for hosting, there are many options that range from free but self-managed, to fully managed for a monthly fee. Nick ? On Sat, Jan 30, 2016 at 10:50 AM Random832 wrote: > Chris Angelico writes: > > How do you change the subject line to indicate that the topic has > > drifted (or is a spin-off), while still appropriately quoting the > > previous post? > > You're free to quote any post anywhere, even if you make a new > thread. In general to do this you have to start your reply in the > original thread, then copy/paste the quote markup (which includes a > magic link to the post you are quoting) into the post new thread form. > > It would be interesting to make a forum with a "spin-off thread" > feature, which would automate the placement of the reply in a new thread > and a note in the old thread with a link to the new one. > > But in most cases this can't be automated because on better-managed > forums once a digression has grown large enough to need a separate > thread, the forum's moderators will move earlier posts about it > (originally made in the first thread) to the new thread. (it might be > interesting to make a forum that provides a way to have a post live in > two different threads at the same time) > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From guido at python.org Sat Jan 30 12:25:17 2016 From: guido at python.org (Guido van Rossum) Date: Sat, 30 Jan 2016 09:25:17 -0800 Subject: [Python-ideas] A bit meta In-Reply-To: References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> <20160129080349.GD4619@ando.pearwood.info> <56AB94A2.20901@stoneleaf.us> <20160130041001.GI4619@ando.pearwood.info> <2067CA09-F205-41A2-A0E0-17065DB1BE00@gmail.com> Message-ID: Oooh, Discourse looks and sounds good. Hopefully we can opt out from voting, everything else looks just right. I recommend requesting some PSF money for a fully-hosted instance, so nobody has to suffer when it's down, security upgrades will be taken care of, etc. On Sat, Jan 30, 2016 at 8:48 AM, Nicholas Chammas wrote: > To follow up on one of the suggestions Brett and Donald made, I think the > best solution today for a modern discussion forum is Discourse. > > Discourse is built by some of the same people who built Stack Overflow, > including Jeff Atwood. Among the many excellent features it has is full > support for a ?mailing list mode?, where you can reply to and start new > conversations entirely via email. That may be important for people who are > not interested in using the web for their conversations. > > Discourse doesn?t currently have a voting plugin, but here is an interesting > discussion about adding one. Just earlier today a member of the Discourse > team followed-up on that discussion with a detailed proposal to make the > plugin real. > > As an example of the polish Discourse already has, consider this remark by > Random832: > > It would be interesting to make a forum with a ?spin-off thread? feature, > which would automate the placement of the reply in a new thread and a note > in the old thread with a link to the new one. > > If you look at the post I linked to about adding a voting plugin, you can > see just this kind of link offered by Discourse since the poster spun that > new thread from an existing one. > > A large open source community using Discourse today is Docker. If Donald > sets up a Discourse instance for Packaging, that should serve as a good > trial for us to deploy it elsewhere. I suspect it will be a success. > > As for hosting, there are many options that range from free but > self-managed, to fully managed for a monthly fee. > > Nick > > > On Sat, Jan 30, 2016 at 10:50 AM Random832 wrote: >> >> Chris Angelico writes: >> > How do you change the subject line to indicate that the topic has >> > drifted (or is a spin-off), while still appropriately quoting the >> > previous post? >> >> You're free to quote any post anywhere, even if you make a new >> thread. In general to do this you have to start your reply in the >> original thread, then copy/paste the quote markup (which includes a >> magic link to the post you are quoting) into the post new thread form. >> >> It would be interesting to make a forum with a "spin-off thread" >> feature, which would automate the placement of the reply in a new thread >> and a note in the old thread with a link to the new one. >> >> But in most cases this can't be automated because on better-managed >> forums once a digression has grown large enough to need a separate >> thread, the forum's moderators will move earlier posts about it >> (originally made in the first thread) to the new thread. 
(it might be >> interesting to make a forum that provides a way to have a post live in >> two different threads at the same time) >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -- --Guido van Rossum (python.org/~guido) From brett at python.org Sat Jan 30 12:29:22 2016 From: brett at python.org (Brett Cannon) Date: Sat, 30 Jan 2016 17:29:22 +0000 Subject: [Python-ideas] PEP 511: API for code transformers In-Reply-To: References: <20160129010158.GC4619@ando.pearwood.info> <65429619.1655004.1454037039310.JavaMail.yahoo@mail.yahoo.com> <6999559.1612499.1454038212658.JavaMail.yahoo@mail.yahoo.com> Message-ID: On Fri, Jan 29, 2016, 11:57 Andrew Barnert via Python-ideas < python-ideas at python.org> wrote: > On Jan 29, 2016, at 06:10, Nick Coghlan wrote: > > > > PEP 511 erases that piece of incidental complexity and say, "If you > > want to apply a genuinely global transformation, this is how you do > > it". The fact we already have decorators and import hooks is why I > > think PEP 511 can safely ignore the use cases that those handle. > > I think this is the conclusion I was hoping to reach, but wasn't sure how > to get there. I'm happy with PEP 511 not trying to serve cases like MacroPy > and Hy and the example from the byteplay docs, especially so if ignoring > them makes PEP 511 simpler, as long as it can explain why it's ignoring > them. And a shorter version of your argument should serve as such an > explanation. > > But the other half of my point was that too many people (even very > experienced developers like most of the people on this list) think there's > more incidental complexity than there is, and that's also a problem. For > example, "I want to write a global processor for local experimentation > purposes so I can play with my idea before posting it to Python-ideas" is > not a bad desire. And, if people think it's way too hard to do with a > quick&dirty import hook, they're naturally going to ask why PEP 511 doesn't > help them out by adding a bunch of options to install/run the processors > conditionally, handle non-.py files, skip the stdlib, etc. And I think the > PEP is better without those options. > > > However, I think it *would* make sense to make the creation of a "Code > > Transformation" HOWTO guide part of the PEP - having a guide means we > > can clearly present the hierarchy in terms of: > > I like this idea. > > Earlier I suggested that the import system documentation should have some > simple examples of how to actually use the import system to write > transforming hooks. Someone (Brett?) pointed out that it's a dangerous > technique, and making it too easy for people to play with it without > understanding it may be a bad idea. And they're probably right. > If we added an appropriate warning to the example I would be fine adding one that covers how to add a custom loader. -brett > A HOWTO is a bit more "out-of-the-way" than library or reference > docs--and, more importantly, it also has room to explain when you shouldn't > do this or that, and why. > > I'm not sure it has to be part of the PEP, but I can see the connection. 
> While the PEP helps by separating out the most important safe case
> (semantically-neutral, reflected in .pyc, globally consistent, etc.), but
> it also makes the question "how do I do something similar to PEP 511
> transformers except ___" more likely to come up in the first place, making
> the HOWTO more important.
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From nicholas.chammas at gmail.com  Sat Jan 30 12:30:37 2016
From: nicholas.chammas at gmail.com (Nicholas Chammas)
Date: Sat, 30 Jan 2016 17:30:37 +0000
Subject: [Python-ideas] A bit meta
In-Reply-To: 
References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> <20160129080349.GD4619@ando.pearwood.info> <56AB94A2.20901@stoneleaf.us> <20160130041001.GI4619@ando.pearwood.info> <2067CA09-F205-41A2-A0E0-17065DB1BE00@gmail.com>
Message-ID: 

Yeah, the voting is just a plugin which I presume you can enable or
disable as desired.

As for hosting, I agree it's probably better to have someone else do that
so we can lessen the burden on our infra team. The Discourse team also
offers discounted hosting for open source projects. Depending on the
arrangement they offer, it may be really cheap for us even with a fully
managed instance.

Nick

On Sat, Jan 30, 2016 at 12:25 PM Guido van Rossum wrote:

> Oooh, Discourse looks and sounds good. Hopefully we can opt out from
> voting, everything else looks just right. I recommend requesting some
> PSF money for a fully-hosted instance, so nobody has to suffer when
> it's down, security upgrades will be taken care of, etc.
>
> On Sat, Jan 30, 2016 at 8:48 AM, Nicholas Chammas
> wrote:
> > To follow up on one of the suggestions Brett and Donald made, I think the
> > best solution today for a modern discussion forum is Discourse.
> >
> > Discourse is built by some of the same people who built Stack Overflow,
> > including Jeff Atwood. Among the many excellent features it has is full
> > support for a "mailing list mode", where you can reply to and start new
> > conversations entirely via email. That may be important for people who are
> > not interested in using the web for their conversations.
> >
> > Discourse doesn't currently have a voting plugin, but here is an interesting
> > discussion about adding one. Just earlier today a member of the Discourse
> > team followed up on that discussion with a detailed proposal to make the
> > plugin real.
> >
> > As an example of the polish Discourse already has, consider this remark by
> > Random832:
> >
> > It would be interesting to make a forum with a "spin-off thread" feature,
> > which would automate the placement of the reply in a new thread and a note
> > in the old thread with a link to the new one.
> >
> > If you look at the post I linked to about adding a voting plugin, you can
> > see just this kind of link offered by Discourse since the poster spun that
> > new thread from an existing one.
> >
> > A large open source community using Discourse today is Docker. If Donald
> > sets up a Discourse instance for Packaging, that should serve as a good
> > trial for us to deploy it elsewhere. I suspect it will be a success.
> >
> > As for hosting, there are many options that range from free but
> > self-managed, to fully managed for a monthly fee.
> >
> > Nick
> >
> > On Sat, Jan 30, 2016 at 10:50 AM Random832 wrote:
> >>
> >> Chris Angelico writes:
> >> > How do you change the subject line to indicate that the topic has
> >> > drifted (or is a spin-off), while still appropriately quoting the
> >> > previous post?
> >>
> >> You're free to quote any post anywhere, even if you make a new
> >> thread. In general to do this you have to start your reply in the
> >> original thread, then copy/paste the quote markup (which includes a
> >> magic link to the post you are quoting) into the post new thread form.
> >>
> >> It would be interesting to make a forum with a "spin-off thread"
> >> feature, which would automate the placement of the reply in a new thread
> >> and a note in the old thread with a link to the new one.
> >>
> >> But in most cases this can't be automated because on better-managed
> >> forums once a digression has grown large enough to need a separate
> >> thread, the forum's moderators will move earlier posts about it
> >> (originally made in the first thread) to the new thread. (it might be
> >> interesting to make a forum that provides a way to have a post live in
> >> two different threads at the same time)
> >>
> >> _______________________________________________
> >> Python-ideas mailing list
> >> Python-ideas at python.org
> >> https://mail.python.org/mailman/listinfo/python-ideas
> >> Code of Conduct: http://python.org/psf/codeofconduct/
> >
> >
> > _______________________________________________
> > Python-ideas mailing list
> > Python-ideas at python.org
> > https://mail.python.org/mailman/listinfo/python-ideas
> > Code of Conduct: http://python.org/psf/codeofconduct/
>
> --
> --Guido van Rossum (python.org/~guido)
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From donald at stufft.io  Sat Jan 30 12:58:53 2016
From: donald at stufft.io (Donald Stufft)
Date: Sat, 30 Jan 2016 12:58:53 -0500
Subject: [Python-ideas] A bit meta
In-Reply-To: 
References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> <20160129080349.GD4619@ando.pearwood.info> <56AB94A2.20901@stoneleaf.us> <20160130041001.GI4619@ando.pearwood.info> <2067CA09-F205-41A2-A0E0-17065DB1BE00@gmail.com>
Message-ID: 

Honestly, It's probably not super hard for us to get this running (and we
might want to so we can piggyback on our Fastly CDN and such). Assuming it
stores all of its persistent state inside of PostgreSQL then we're already
running a central PostgreSQL server that we keep backed up. The biggest
issue comes from software that wants to store persistent state on disk,
since that makes it difficult to treat those machines as ephemeral.

> On Jan 30, 2016, at 12:30 PM, Nicholas Chammas wrote:
>
> Yeah, the voting is just a plugin which I presume you can enable or disable as desired.
>
> As for hosting, I agree it's probably better to have someone else do that so we can lessen the burden on our infra team. The Discourse team also offers discounted hosting for open source projects. Depending on the arrangement they offer, it may be really cheap for us even with a fully managed instance.
>
> Nick
>
> On Sat, Jan 30, 2016 at 12:25 PM Guido van Rossum wrote:
> Oooh, Discourse looks and sounds good. Hopefully we can opt out from
> voting, everything else looks just right.
> > On Sat, Jan 30, 2016 at 8:48 AM, Nicholas Chammas > > wrote: > > To follow up on one of the suggestions Brett and Donald made, I think the > > best solution today for a modern discussion forum is Discourse. > > > > Discourse is built by some of the same people who built Stack Overflow, > > including Jeff Atwood. Among the many excellent features it has is full > > support for a ?mailing list mode?, where you can reply to and start new > > conversations entirely via email. That may be important for people who are > > not interested in using the web for their conversations. > > > > Discourse doesn?t currently have a voting plugin, but here is an interesting > > discussion about adding one. Just earlier today a member of the Discourse > > team followed-up on that discussion with a detailed proposal to make the > > plugin real. > > > > As an example of the polish Discourse already has, consider this remark by > > Random832: > > > > It would be interesting to make a forum with a ?spin-off thread? feature, > > which would automate the placement of the reply in a new thread and a note > > in the old thread with a link to the new one. > > > > If you look at the post I linked to about adding a voting plugin, you can > > see just this kind of link offered by Discourse since the poster spun that > > new thread from an existing one. > > > > A large open source community using Discourse today is Docker. If Donald > > sets up a Discourse instance for Packaging, that should serve as a good > > trial for us to deploy it elsewhere. I suspect it will be a success. > > > > As for hosting, there are many options that range from free but > > self-managed, to fully managed for a monthly fee. > > > > Nick > > > > > > On Sat, Jan 30, 2016 at 10:50 AM Random832 > wrote: > >> > >> Chris Angelico > writes: > >> > How do you change the subject line to indicate that the topic has > >> > drifted (or is a spin-off), while still appropriately quoting the > >> > previous post? > >> > >> You're free to quote any post anywhere, even if you make a new > >> thread. In general to do this you have to start your reply in the > >> original thread, then copy/paste the quote markup (which includes a > >> magic link to the post you are quoting) into the post new thread form. > >> > >> It would be interesting to make a forum with a "spin-off thread" > >> feature, which would automate the placement of the reply in a new thread > >> and a note in the old thread with a link to the new one. > >> > >> But in most cases this can't be automated because on better-managed > >> forums once a digression has grown large enough to need a separate > >> thread, the forum's moderators will move earlier posts about it > >> (originally made in the first thread) to the new thread. 
(it might be > >> interesting to make a forum that provides a way to have a post live in > >> two different threads at the same time) > >> > >> _______________________________________________ > >> Python-ideas mailing list > >> Python-ideas at python.org > >> https://mail.python.org/mailman/listinfo/python-ideas > >> Code of Conduct: http://python.org/psf/codeofconduct/ > > > > > > _______________________________________________ > > Python-ideas mailing list > > Python-ideas at python.org > > https://mail.python.org/mailman/listinfo/python-ideas > > Code of Conduct: http://python.org/psf/codeofconduct/ > > > > -- > --Guido van Rossum (python.org/~guido ) > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ ----------------- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 842 bytes Desc: Message signed with OpenPGP using GPGMail URL: From nicholas.chammas at gmail.com Sat Jan 30 13:06:47 2016 From: nicholas.chammas at gmail.com (Nicholas Chammas) Date: Sat, 30 Jan 2016 18:06:47 +0000 Subject: [Python-ideas] A bit meta In-Reply-To: References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> <20160129080349.GD4619@ando.pearwood.info> <56AB94A2.20901@stoneleaf.us> <20160130041001.GI4619@ando.pearwood.info> <2067CA09-F205-41A2-A0E0-17065DB1BE00@gmail.com> Message-ID: Agreed. If you read through that thread about Discourse hosting for open source projects, you'll see that Jeff made a point of stressing just how easy it is to manage a Discourse instance. Still, my bias is to delegate where possible, even if the task is light, to reduce the psychological burden of being responsible for something. Then again, I'm not on the Python infra team, so I can't speak for them. If they're (and I'm guessing you're part of the team, Donald?) OK with it, then sure, it should be fine to manage the instance ourselves. Nick On Sat, Jan 30, 2016 at 12:59 PM Donald Stufft wrote: > Honestly, It?s probably not super hard for us to get this running (and we > might want to so we can piggyback on our Fastly CDN and such). Assuming it > stores all of it?s persistent state inside of PostgreSQL then we?re already > running a central PostgreSQL server that we keep backed up. The biggest > issue comes from software that wants to store persistent state on disk, > since that makes it difficult to treat those machines as empehereal. > > On Jan 30, 2016, at 12:30 PM, Nicholas Chammas > wrote: > > Yeah, the voting is just a plugin which I presume you can enable or > disable as desired. > > As for hosting, I agree it?s probably better to have someone else do that > so we can lessen the burden on our infra team. The Discourse team also > offers discounted hosting for open source projects > . > Depending on the arrangement they offer, it may be really cheap for us even > with a fully managed instance. > > Nick > ? > > On Sat, Jan 30, 2016 at 12:25 PM Guido van Rossum > wrote: > >> Oooh, Discourse looks and sounds good. Hopefully we can opt out from >> voting, everything else looks just right. 
I recommend requesting some >> PSF money for a fully-hosted instance, so nobody has to suffer when >> it's down, security upgrades will be taken care of, etc. >> >> On Sat, Jan 30, 2016 at 8:48 AM, Nicholas Chammas >> wrote: >> > To follow up on one of the suggestions Brett and Donald made, I think >> the >> > best solution today for a modern discussion forum is Discourse. >> > >> > Discourse is built by some of the same people who built Stack Overflow, >> > including Jeff Atwood. Among the many excellent features it has is full >> > support for a ?mailing list mode?, where you can reply to and start new >> > conversations entirely via email. That may be important for people who >> are >> > not interested in using the web for their conversations. >> > >> > Discourse doesn?t currently have a voting plugin, but here is an >> interesting >> > discussion about adding one. Just earlier today a member of the >> Discourse >> > team followed-up on that discussion with a detailed proposal to make the >> > plugin real. >> > >> > As an example of the polish Discourse already has, consider this remark >> by >> > Random832: >> > >> > It would be interesting to make a forum with a ?spin-off thread? >> feature, >> > which would automate the placement of the reply in a new thread and a >> note >> > in the old thread with a link to the new one. >> > >> > If you look at the post I linked to about adding a voting plugin, you >> can >> > see just this kind of link offered by Discourse since the poster spun >> that >> > new thread from an existing one. >> > >> > A large open source community using Discourse today is Docker. If Donald >> > sets up a Discourse instance for Packaging, that should serve as a good >> > trial for us to deploy it elsewhere. I suspect it will be a success. >> > >> > As for hosting, there are many options that range from free but >> > self-managed, to fully managed for a monthly fee. >> > >> > Nick >> > >> > >> > On Sat, Jan 30, 2016 at 10:50 AM Random832 >> wrote: >> >> >> >> Chris Angelico writes: >> >> > How do you change the subject line to indicate that the topic has >> >> > drifted (or is a spin-off), while still appropriately quoting the >> >> > previous post? >> >> >> >> You're free to quote any post anywhere, even if you make a new >> >> thread. In general to do this you have to start your reply in the >> >> original thread, then copy/paste the quote markup (which includes a >> >> magic link to the post you are quoting) into the post new thread form. >> >> >> >> It would be interesting to make a forum with a "spin-off thread" >> >> feature, which would automate the placement of the reply in a new >> thread >> >> and a note in the old thread with a link to the new one. >> >> >> >> But in most cases this can't be automated because on better-managed >> >> forums once a digression has grown large enough to need a separate >> >> thread, the forum's moderators will move earlier posts about it >> >> (originally made in the first thread) to the new thread. 
(it might be >> >> interesting to make a forum that provides a way to have a post live in >> >> two different threads at the same time) >> >> >> >> _______________________________________________ >> >> Python-ideas mailing list >> >> Python-ideas at python.org >> >> https://mail.python.org/mailman/listinfo/python-ideas >> >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > >> > >> > _______________________________________________ >> > Python-ideas mailing list >> > Python-ideas at python.org >> > https://mail.python.org/mailman/listinfo/python-ideas >> > Code of Conduct: http://python.org/psf/codeofconduct/ >> >> >> >> -- >> --Guido van Rossum (python.org/~guido) >> > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > > > ----------------- > Donald Stufft > PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 > DCFA > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Sat Jan 30 13:35:24 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Sat, 30 Jan 2016 10:35:24 -0800 Subject: [Python-ideas] PEP 511: API for code transformers In-Reply-To: References: <20160129010158.GC4619@ando.pearwood.info> <65429619.1655004.1454037039310.JavaMail.yahoo@mail.yahoo.com> <6999559.1612499.1454038212658.JavaMail.yahoo@mail.yahoo.com> Message-ID: <56AD026C.9000400@stoneleaf.us> On 01/30/2016 09:29 AM, Brett Cannon wrote: > On Fri, Jan 29, 2016, 11:57 Andrew Barnert wrote: >> Earlier I suggested that the import system documentation should have >> some simple examples of how to actually use the import system to >> write transforming hooks. Someone (Brett?) pointed out that it's a >> dangerous technique, and making it too easy for people to play with >> it without understanding it may be a bad idea. And they're probably >> right. > If we added an appropriate warning to the example I would be fine > adding one that covers how to add a custom loader. That would be great! -- ~Ethan~ From brett at python.org Sat Jan 30 14:03:11 2016 From: brett at python.org (Brett Cannon) Date: Sat, 30 Jan 2016 19:03:11 +0000 Subject: [Python-ideas] A bit meta In-Reply-To: References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> <20160129080349.GD4619@ando.pearwood.info> <56AB94A2.20901@stoneleaf.us> <20160130041001.GI4619@ando.pearwood.info> <2067CA09-F205-41A2-A0E0-17065DB1BE00@gmail.com> Message-ID: I've started a thread amongst python-ideas-owners to see if any of us can lead the eval between HyperKitty and Discourse and making sure PSF infra is okay with hosting it (or just paying for hosting). If none of us have time I will come back to the list to ask for someone to lead the evaluation. On Sat, 30 Jan 2016 at 10:07 Nicholas Chammas wrote: > Agreed. If you read through that thread about Discourse hosting for open > source projects, you'll see that Jeff made a point of stressing just how > easy it is to manage a Discourse instance. > > Still, my bias is to delegate where possible, even if the task is light, > to reduce the psychological burden of being responsible for something. Then > again, I'm not on the Python infra team, so I can't speak for them. If > they're (and I'm guessing you're part of the team, Donald?) OK with it, > then sure, it should be fine to manage the instance ourselves. 
> > Nick > > On Sat, Jan 30, 2016 at 12:59 PM Donald Stufft wrote: > >> Honestly, It?s probably not super hard for us to get this running (and we >> might want to so we can piggyback on our Fastly CDN and such). Assuming it >> stores all of it?s persistent state inside of PostgreSQL then we?re already >> running a central PostgreSQL server that we keep backed up. The biggest >> issue comes from software that wants to store persistent state on disk, >> since that makes it difficult to treat those machines as empehereal. >> >> On Jan 30, 2016, at 12:30 PM, Nicholas Chammas < >> nicholas.chammas at gmail.com> wrote: >> >> Yeah, the voting is just a plugin which I presume you can enable or >> disable as desired. >> >> As for hosting, I agree it?s probably better to have someone else do that >> so we can lessen the burden on our infra team. The Discourse team also >> offers discounted hosting for open source projects >> . >> Depending on the arrangement they offer, it may be really cheap for us even >> with a fully managed instance. >> >> Nick >> ? >> >> On Sat, Jan 30, 2016 at 12:25 PM Guido van Rossum >> wrote: >> >>> Oooh, Discourse looks and sounds good. Hopefully we can opt out from >>> voting, everything else looks just right. I recommend requesting some >>> PSF money for a fully-hosted instance, so nobody has to suffer when >>> it's down, security upgrades will be taken care of, etc. >>> >>> On Sat, Jan 30, 2016 at 8:48 AM, Nicholas Chammas >>> wrote: >>> > To follow up on one of the suggestions Brett and Donald made, I think >>> the >>> > best solution today for a modern discussion forum is Discourse. >>> > >>> > Discourse is built by some of the same people who built Stack Overflow, >>> > including Jeff Atwood. Among the many excellent features it has is full >>> > support for a ?mailing list mode?, where you can reply to and start new >>> > conversations entirely via email. That may be important for people who >>> are >>> > not interested in using the web for their conversations. >>> > >>> > Discourse doesn?t currently have a voting plugin, but here is an >>> interesting >>> > discussion about adding one. Just earlier today a member of the >>> Discourse >>> > team followed-up on that discussion with a detailed proposal to make >>> the >>> > plugin real. >>> > >>> > As an example of the polish Discourse already has, consider this >>> remark by >>> > Random832: >>> > >>> > It would be interesting to make a forum with a ?spin-off thread? >>> feature, >>> > which would automate the placement of the reply in a new thread and a >>> note >>> > in the old thread with a link to the new one. >>> > >>> > If you look at the post I linked to about adding a voting plugin, you >>> can >>> > see just this kind of link offered by Discourse since the poster spun >>> that >>> > new thread from an existing one. >>> > >>> > A large open source community using Discourse today is Docker. If >>> Donald >>> > sets up a Discourse instance for Packaging, that should serve as a good >>> > trial for us to deploy it elsewhere. I suspect it will be a success. >>> > >>> > As for hosting, there are many options that range from free but >>> > self-managed, to fully managed for a monthly fee. >>> > >>> > Nick >>> > >>> > >>> > On Sat, Jan 30, 2016 at 10:50 AM Random832 >>> wrote: >>> >> >>> >> Chris Angelico writes: >>> >> > How do you change the subject line to indicate that the topic has >>> >> > drifted (or is a spin-off), while still appropriately quoting the >>> >> > previous post? 
>>> >>
>>> >> You're free to quote any post anywhere, even if you make a new
>>> >> thread. In general to do this you have to start your reply in the
>>> >> original thread, then copy/paste the quote markup (which includes a
>>> >> magic link to the post you are quoting) into the post new thread form.
>>> >>
>>> >> It would be interesting to make a forum with a "spin-off thread"
>>> >> feature, which would automate the placement of the reply in a new
>>> thread
>>> >> and a note in the old thread with a link to the new one.
>>> >>
>>> >> But in most cases this can't be automated because on better-managed
>>> >> forums once a digression has grown large enough to need a separate
>>> >> thread, the forum's moderators will move earlier posts about it
>>> >> (originally made in the first thread) to the new thread. (it might be
>>> >> interesting to make a forum that provides a way to have a post live in
>>> >> two different threads at the same time)
>>> >>
>>> >> _______________________________________________
>>> >> Python-ideas mailing list
>>> >> Python-ideas at python.org
>>> >> https://mail.python.org/mailman/listinfo/python-ideas
>>> >> Code of Conduct: http://python.org/psf/codeofconduct/
>>> >
>>> >
>>> > _______________________________________________
>>> > Python-ideas mailing list
>>> > Python-ideas at python.org
>>> > https://mail.python.org/mailman/listinfo/python-ideas
>>> > Code of Conduct: http://python.org/psf/codeofconduct/
>>>
>>>
>>> --
>>> --Guido van Rossum (python.org/~guido)
>>>
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> https://mail.python.org/mailman/listinfo/python-ideas
>> Code of Conduct: http://python.org/psf/codeofconduct/
>>
>> -----------------
>> Donald Stufft
>> PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372
>> DCFA
>>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From barry at python.org  Sat Jan 30 16:02:23 2016
From: barry at python.org (Barry Warsaw)
Date: Sat, 30 Jan 2016 16:02:23 -0500
Subject: [Python-ideas] A bit meta
References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> <20160129080349.GD4619@ando.pearwood.info> <56AB94A2.20901@stoneleaf.us> <20160130041001.GI4619@ando.pearwood.info>
Message-ID: <20160130160223.09803cad@anarchist.wooz.org>

On Jan 29, 2016, at 08:35 PM, Guido van Rossum wrote:

>not everybody is a wizard at managing high volume mailing list traffic).

Which is why for me, Gmane is an indispensable tool, along with a decent
NNTP client. I subscribe to python-ideas and python-dev so I can post to
them, but I nomail python-ideas (not yet python-dev) so my inbox doesn't
get cluttered. Then I read the Gmane newsgroups and as this message shows,
can easily post to threads I care about. I can kill-thread any I don't.
Plus, I read them when I have time and can ignore them when I don't.

I appreciate this isn't a solution for everyone, but it allows me to stay
engaged on my own terms and not get overwhelmed by Python email traffic.

Cheers,
-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 819 bytes
Desc: OpenPGP digital signature
URL: 

From barry at python.org  Sat Jan 30 16:17:26 2016
From: barry at python.org (Barry Warsaw)
Date: Sat, 30 Jan 2016 16:17:26 -0500
Subject: [Python-ideas] A bit meta
References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org>
Message-ID: <20160130161726.2fce1648@anarchist.wooz.org>

Two big problems with moving primary discussions off the mailing list are
discoverability and community fracture.

Any new forum will mean another login, a new work flow, another slice of the
ever diminishing attention pie, and discussions that occur both on the
traditional site and the new site. Some people will miss the big announcement
about the new forum. There will be lots of cross-posting because people won't
know for sure which ones the people who need to be involved frequent.

For example, many years ago I missed a discussion about something I cared
about and only accidentally took notice when I saw a commit message in my
inbox. When I asked about why the issue had never been mentioned on
python-dev, I was told that everything was hashed out in great detail on the
tracker. I didn't even realize that I wasn't getting email notifications of
new tracker issues, so I never saw it until it was too late.

I've seen other topics discussed primarily on G+, for which I have an account,
but rarely pay attention to. I don't even know if it's still "a thing".
Maybe everyone's moved to Slack by now. How many different channels do I have
to engage with to keep track of what's happening in core Python?

This isn't GOML and I'm all for experimentation, but I do urge caution.
Otherwise we might just wonder why we haven't heard from Uncle Timmy in a
while.

Cheers,
-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 819 bytes
Desc: OpenPGP digital signature
URL: 

From donald at stufft.io  Sat Jan 30 16:19:50 2016
From: donald at stufft.io (Donald Stufft)
Date: Sat, 30 Jan 2016 16:19:50 -0500
Subject: [Python-ideas] A bit meta
In-Reply-To: <20160130161726.2fce1648@anarchist.wooz.org>
References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> <20160130161726.2fce1648@anarchist.wooz.org>
Message-ID: <8E097D4C-749B-4DCA-B08B-33C6FE01DB42@stufft.io>

> On Jan 30, 2016, at 4:17 PM, Barry Warsaw wrote:
>
> Any new forum will mean another login, a new work flow, another slice of the
> ever diminishing attention pie, and discussions that occur both on the
> traditional site and the new site. Some people will miss the big announcement
> about the new forum. There will be lots of cross-posting because people won't
> know for sure which ones the people who need to be involved frequent.

For what it's worth, another thing I want to do is set up id.python.org and
consolidate all the logins :)

-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 842 bytes
Desc: Message signed with OpenPGP using GPGMail
URL: 

From phd at phdru.name  Sat Jan 30 16:29:25 2016
From: phd at phdru.name (Oleg Broytman)
Date: Sat, 30 Jan 2016 22:29:25 +0100
Subject: [Python-ideas] A bit meta
In-Reply-To: <20160130161726.2fce1648@anarchist.wooz.org>
References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> <20160130161726.2fce1648@anarchist.wooz.org>
Message-ID: <20160130212925.GA23980@phdru.name>

Hi!

On Sat, Jan 30, 2016 at 04:17:26PM -0500, Barry Warsaw wrote:
> Two big problems with moving primary discussions off the mailing list are
> discoverability and community fracture.
>
> Any new forum will mean another login, a new work flow, another slice of the
> ever diminishing attention pie, and discussions that occur both on the
> traditional site and the new site. Some people will miss the big announcement
> about the new forum. There will be lots of cross-posting because people won't
> know for sure which ones the people who need to be involved frequent.
>
> For example, many years ago I missed a discussion about something I cared
> about and only accidentally took notice when I saw a commit message in my
> inbox. When I asked about why the issue had never been mentioned on
> python-dev, I was told that everything was hashed out in great detail on the
> tracker. I didn't even realize that I wasn't getting email notifications of
> new tracker issues, so I never saw it until it was too late.
>
> I've seen other topics discussed primarily on G+, for which I have an account,
> but rarely pay attention to. I don't even know if it's still "a thing".
> Maybe everyone's moved to Slack by now.

Or to gitter...

> How many different channels do I have
> to engage with to keep track of what's happening in core Python?
>
> This isn't GOML

GOML? Get Off my Mailing List? ;-)

> and I'm all for experimentation, but I do urge caution.
> Otherwise we might just wonder why we haven't heard from Uncle Timmy in a
> while.
>
> Cheers,
> -Barry

Oleg.
--
Oleg Broytman http://phdru.name/ phd at phdru.name
Programmers don't die, they just GOSUB without RETURN.

From nicholas.chammas at gmail.com  Sat Jan 30 17:02:18 2016
From: nicholas.chammas at gmail.com (Nicholas Chammas)
Date: Sat, 30 Jan 2016 22:02:18 +0000
Subject: [Python-ideas] A bit meta
In-Reply-To: <8E097D4C-749B-4DCA-B08B-33C6FE01DB42@stufft.io>
References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> <20160130161726.2fce1648@anarchist.wooz.org> <8E097D4C-749B-4DCA-B08B-33C6FE01DB42@stufft.io>
Message-ID: 

Two big problems with moving primary discussions off the mailing list are
discoverability and community fracture.

Agreed, though consider: Community fracture is always a risk when changing
the discussion medium. On balance, we have to consider whether the risk is
outweighed by the benefits of a new medium.

As for discoverability, let me make a brief case for why Discourse is head
and shoulders above mailing lists. Among many well-designed features,
Discourse has the following things going for it in the discoverability
department:

- You can mention people by @name from posts and they'll get notified,
like on GitHub. No need to wonder if people will miss something because
they haven't set up their email filters correctly.

- We can unify the various lists under a single forum and separate
discussions with categories. This would hopefully lend better to
cross-pollination of discussions across different categories (e.g. ideas vs.
dev), while still letting people narrow their focus to a single category
if that's what they want. For examples of how categories can be used, see
Discourse Meta and this category on the Docker forum.

- People starting new posts on Discourse automatically get shown
potentially related discussions, similar to what Stack Overflow does. It
makes it much harder to miss or forget to look for prior discussions
before starting a new one. Naturally, generalized search is also a
first-class feature.

- Regarding the potential proliferation of logins, Discourse supports
single sign-on, so if we want we can let people log in with Google,
GitHub, or perhaps even a Python-owned identity provider.

These features (and others) are really well-executed, as you would expect
coming from Jeff Atwood and others who left Stack Overflow to create
Discourse.

Finally, as a web-based forum, Discourse takes the burden off of users
having to each independently come up with a toolchain that makes things
manageable for them. Solutions to common problems like notification,
finding prior discussions, and so forth, are implemented centrally, and
all users automatically benefit. It's really hard to offer that with a
mailing list.

And to top it all off, if for whatever reason you hate web forums,
Discourse has a "mailing list mode" which lets you respond to and start
discussions entirely via email, without affecting the web-based forum.

Nick

On Sat, Jan 30, 2016 at 4:22 PM Donald Stufft wrote:

>
> > On Jan 30, 2016, at 4:17 PM, Barry Warsaw wrote:
> >
> > Any new forum will mean another login, a new work flow, another slice of
> the
> > ever diminishing attention pie, and discussions that occur both on the
> > traditional site and the new site. Some people will miss the big
> announcement
> > about the new forum. There will be lots of cross-posting because people
> won't
> > know for sure which ones the people who need to be involved frequent.
>
> For what it's worth, another thing I want to do is set up id.python.org and
> consolidate all the logins :)
>
> -----------------
> Donald Stufft
> PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372
> DCFA
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From greg.ewing at canterbury.ac.nz  Sat Jan 30 17:09:45 2016
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 31 Jan 2016 11:09:45 +1300
Subject: [Python-ideas] A bit meta
In-Reply-To: 
References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> <20160129080349.GD4619@ando.pearwood.info> <56AB94A2.20901@stoneleaf.us> <20160130041001.GI4619@ando.pearwood.info> <2067CA09-F205-41A2-A0E0-17065DB1BE00@gmail.com>
Message-ID: <56AD34A9.50807@canterbury.ac.nz>

Nicholas Chammas wrote:
> Discourse is built by some of the same people who built Stack Overflow,
> including Jeff Atwood. Among the many excellent features it
> has is full support for a "mailing list mode", where you can reply to
> and start new conversations entirely via email.

If such a move were made, some kind of email or usenet gateway would
be an *essential* feature for me to continue participating. I don't
have enough time or energy to chase down multiple web forums every
day and wrestle with their clunky interfaces.
One of the usenet groups I used to follow (rec.arts.int-fiction) is
now effectively dead since everyone abandoned it for a web forum that
I can't easily follow. I'd be very sad if anything like that happened
to the main Python groups.

--
Greg

From barry at python.org  Sat Jan 30 17:19:33 2016
From: barry at python.org (Barry Warsaw)
Date: Sat, 30 Jan 2016 17:19:33 -0500
Subject: [Python-ideas] A bit meta
References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> <20160130161726.2fce1648@anarchist.wooz.org> <8E097D4C-749B-4DCA-B08B-33C6FE01DB42@stufft.io>
Message-ID: <20160130171933.00e2339f@anarchist.wooz.org>

On Jan 30, 2016, at 10:02 PM, Nicholas Chammas wrote:

>As for discoverability, let me make a brief case for why Discourse is head
>and shoulders above mailing lists.

To be clear, I'm a fan of Discourse, and would be happy to see an
id.python.org SSO'd instance of it for experimentation purposes.

However, I would be really upset if major decisions were made in some
Discourse thread. There's a reason why PEP 1 requires posting to
python-dev, and specifies headers like Discussions-To, Post-History, and
Resolution.

Some features, which I'd call "tangential" to core language design do
indeed happen elsewhere primarily. asyncio and the distutils-stack come to
mind. And I think that's fine. But it's also important to post to
python-dev at certain milestones or critical junctures because that's what
*everyone* knows as the central place for coordinating development.

Cheers,
-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 819 bytes
Desc: OpenPGP digital signature
URL: 

From steve at pearwood.info  Sat Jan 30 17:47:19 2016
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 31 Jan 2016 09:47:19 +1100
Subject: [Python-ideas] Prevent importing yourself?
In-Reply-To: <56AC9C47.50304@nedbatchelder.com>
References: <56ABEACA.8000707@nedbatchelder.com> <56AC9C47.50304@nedbatchelder.com>
Message-ID: <20160130224718.GB31806@ando.pearwood.info>

On Sat, Jan 30, 2016 at 06:19:35AM -0500, Ned Batchelder wrote:

> While we're at it though, re-importing __main__ is a separate kind of
> behavior that is often a problem, since it means you'll have the same
> classes defined twice.

As far as I can tell, importing __main__ is fine. It's only when you
import __main__ AND the main module under its real name at the same time
that you can run into problems -- and even then, not always. The sort of
errors I've seen involve something like this:

import myscript
import __main__  # this is actually myscript
a = myscript.TheClass()
# later
assert isinstance(a, __main__.TheClass)

which fails, because myscript and __main__ don't share state, despite
actually coming from the same source file.

So I think it's pretty rare for something like this to actually happen.
I've never seen it happen by accident, I've only seen it done
deliberately as a counter-example to the "modules are singletons" rule.

--
Steve

From steve at pearwood.info  Sat Jan 30 18:19:16 2016
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 31 Jan 2016 10:19:16 +1100
Subject: [Python-ideas] A bit meta
In-Reply-To: <20160130212925.GA23980@phdru.name>
References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> <20160130161726.2fce1648@anarchist.wooz.org> <20160130212925.GA23980@phdru.name>
Message-ID: <20160130231916.GC31806@ando.pearwood.info>

On Sat, Jan 30, 2016 at 10:29:25PM +0100, Oleg Broytman wrote:

> > This isn't GOML
>
> GOML? Get Off my Mailing List? ;-)
"Get off my lawn!", traditionally yelled by grumpy old men at kids
playing. Figuratively means "this is new, therefore I hate it".

--
Steve

From njs at pobox.com  Sat Jan 30 18:25:44 2016
From: njs at pobox.com (Nathaniel Smith)
Date: Sat, 30 Jan 2016 15:25:44 -0800
Subject: [Python-ideas] Prevent importing yourself?
In-Reply-To: <20160130224718.GB31806@ando.pearwood.info>
References: <56ABEACA.8000707@nedbatchelder.com> <56AC9C47.50304@nedbatchelder.com> <20160130224718.GB31806@ando.pearwood.info>
Message-ID: 

On Sat, Jan 30, 2016 at 2:47 PM, Steven D'Aprano wrote:
> On Sat, Jan 30, 2016 at 06:19:35AM -0500, Ned Batchelder wrote:
>
>> While we're at it though, re-importing __main__ is a separate kind of
>> behavior that is often a problem, since it means you'll have the same
>> classes defined twice.
>
> As far as I can tell, importing __main__ is fine. It's only when you
> import __main__ AND the main module under its real name at the same time
> that you can run into problems -- and even then, not always. The sort of
> errors I've seen involve something like this:
>
> import myscript
> import __main__  # this is actually myscript
> a = myscript.TheClass()
> # later
> assert isinstance(a, __main__.TheClass)
>
> which fails, because myscript and __main__ don't share state, despite
> actually coming from the same source file.
>
> So I think it's pretty rare for something like this to actually happen.
> I've never seen it happen by accident, I've only seen it done
> deliberately as a counter-example to the "modules are singletons" rule.

Not only is importing __main__ fine, it's actually unavoidable...
__main__ is just an alias to the main script's namespace.

-- test_main.py --
class Foo: pass

import __main__
# Prints "True"
print(Foo is __main__.Foo)
-- end test_main.py --

So importing __main__ never creates any new copies of any singletons;
it's only importing the main script under its filesystem name that
creates the problem.

-n

--
Nathaniel J. Smith -- https://vorpus.org

From nicholas.chammas at gmail.com  Sat Jan 30 18:29:06 2016
From: nicholas.chammas at gmail.com (Nicholas Chammas)
Date: Sat, 30 Jan 2016 23:29:06 +0000
Subject: [Python-ideas] A bit meta
In-Reply-To: <20160130231916.GC31806@ando.pearwood.info>
References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> <20160130161726.2fce1648@anarchist.wooz.org> <20160130212925.GA23980@phdru.name> <20160130231916.GC31806@ando.pearwood.info>
Message-ID: 

To be clear, I'm a fan of Discourse, and would be happy to see an
id.python.org SSO'd instance of it for experimentation purposes.

+1

Perhaps Donald's suggestion of starting a Discourse instance for Packaging
is the easiest way to evaluate it and give people time to kick the tires
and see what they think. I'm guessing that will be discussed on
distutils-sig? (This is part of the problem of mailing lists vs. a unified
forum. :-)

Nick

On Sat, Jan 30, 2016 at 6:19 PM Steven D'Aprano wrote:

> On Sat, Jan 30, 2016 at 10:29:25PM +0100, Oleg Broytman wrote:
> >
> > > This isn't GOML
> >
> > GOML? Get Off my Mailing List? ;-)
>
> "Get off my lawn!", traditionally yelled by grumpy old men at kids
> playing. Figuratively means "this is new, therefore I hate it".
> > > > -- > Steve > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Sat Jan 30 19:09:51 2016 From: guido at python.org (Guido van Rossum) Date: Sat, 30 Jan 2016 16:09:51 -0800 Subject: [Python-ideas] A bit meta In-Reply-To: <20160130161726.2fce1648@anarchist.wooz.org> References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> <20160130161726.2fce1648@anarchist.wooz.org> Message-ID: On Sat, Jan 30, 2016 at 1:17 PM, Barry Warsaw wrote: > For example, many years ago I missed a discussion about something I cared > about and only accidentally took notice when I saw a commit message in my > inbox. When I asked about why the issue had never been mentioned on > python-dev, I was told that everything was hashed out in great detail on the > tracker. I didn't even realize that I wasn't getting email notifications of > new tracker issues, so I never saw it until it was too late. That might have been a lapse in judgement for that particular issue? Without the tracker we'd be utterly inundated in minutiae on python-dev. Occasionally I see folks redirecting a discussion from python-ideas or python-dev to the tracker, and vice versa, and in general I think the line is pretty clear there and people do the right thing. Honestly I wouldn't want to replace python-dev for decisions, but I know several core devs left python-ideas because it was too noisy for them, and I think plenty of stuff on python-ideas would be totally appropriate for some other forum (I often mute threads myself). My rule is that if something's PEP-worthy it needs to be mentioned on python-dev, even if most of the discussion is elsewhere (whether it's a dedicated SIG or a specific tracker on GitHub). It seems reasonable that python-dev should be involved early on, when the discussion is just starting, and again close to the end, before decisions are cast in stone. But I'm glad we don't have to do everything there. > I've seen other topics discussed primarily on G+, for which I have an account, > but rarely pay attention too. I don't even know if it's still "a thing". Fortunately, G+ is dead. "Social media" as it's now known just isn't a good place for these type of discussions. > Maybe everyone's moved to Slack by now. How many different channels do I have > to engage with to keep track of what's happening in core Python? A lot of stuff used to (or still does) happen in IRC, which (as you know) I utterly hate and can't stand. But chat systems still serve a purpose, and if people want to use them we can't stop them. But we can have a written standard for how to handle major decisions, and I see nothing wrong with the standards we currently have written up in PEP 1. I don't think whatever is being proposed here is going against those rules (remember you're reading this in python-ideas, not python-dev :-). > This isn't GOML and I'm all for experimentation, but I do urge caution. > Otherwise we might just wonder why we haven't heard from Uncle Timmy in a > while. Tim seems to have great filters though -- whenever someone says "float" or "datetime" (or "farmville" :-) he perks up his ears. 
--
--Guido van Rossum (python.org/~guido)

From ben+python at benfinney.id.au  Sat Jan 30 19:53:31 2016
From: ben+python at benfinney.id.au (Ben Finney)
Date: Sun, 31 Jan 2016 11:53:31 +1100
Subject: [Python-ideas] A bit meta
References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> <20160130161726.2fce1648@anarchist.wooz.org> <8E097D4C-749B-4DCA-B08B-33C6FE01DB42@stufft.io> <20160130171933.00e2339f@anarchist.wooz.org>
Message-ID: <85lh76efhw.fsf@benfinney.id.au>

Barry Warsaw writes:

> But it's also important to post to python-dev at certain milestones or
> critical junctures because that's what *everyone* knows as the central
> place for coordinating development.

And importantly, with a PSF mailing list or PSF bug tracker or PSF code
review system, etc., collaboration with the rest of the group doesn't
require an account with some particular organisation not accountable to
PSF.

--
 \      "The best way to get information on Usenet is not to ask a |
  `\     question, but to post the wrong information." --Aahz |
_o__) |
Ben Finney

From ned at nedbatchelder.com  Sat Jan 30 19:58:49 2016
From: ned at nedbatchelder.com (Ned Batchelder)
Date: Sat, 30 Jan 2016 19:58:49 -0500
Subject: [Python-ideas] Prevent importing yourself?
In-Reply-To: <20160130224718.GB31806@ando.pearwood.info>
References: <56ABEACA.8000707@nedbatchelder.com> <56AC9C47.50304@nedbatchelder.com> <20160130224718.GB31806@ando.pearwood.info>
Message-ID: <56AD5C49.1040406@nedbatchelder.com>

On 1/30/16 5:47 PM, Steven D'Aprano wrote:
> On Sat, Jan 30, 2016 at 06:19:35AM -0500, Ned Batchelder wrote:
>
>> While we're at it though, re-importing __main__ is a separate kind of
>> behavior that is often a problem, since it means you'll have the same
>> classes defined twice.
> As far as I can tell, importing __main__ is fine. It's only when you
> import __main__ AND the main module under its real name at the same time
> that you can run into problems -- and even then, not always. The sort of
> errors I've seen involve something like this:
>
> import myscript
> import __main__  # this is actually myscript
> a = myscript.TheClass()
> # later
> assert isinstance(a, __main__.TheClass)
>
> which fails, because myscript and __main__ don't share state, despite
> actually coming from the same source file.
>
> So I think it's pretty rare for something like this to actually happen.
> I've never seen it happen by accident, I've only seen it done
> deliberately as a counter-example to the "modules are singletons" rule.
>
Something like this does happen in the real world. A class is defined in
the main module, and then the module is later imported with its real
name. Now you have __main__.Class and module.Class both defined. You
don't need to actually "import __main__" for it to happen. __main__.Class
is used implicitly from the main module simply as Class.

--Ned.

From stephen at xemacs.org  Sat Jan 30 19:55:44 2016
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Sun, 31 Jan 2016 09:55:44 +0900
Subject: [Python-ideas] A bit meta
In-Reply-To: 
References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> <20160129080349.GD4619@ando.pearwood.info> <56AB94A2.20901@stoneleaf.us> <20160130041001.GI4619@ando.pearwood.info> <2067CA09-F205-41A2-A0E0-17065DB1BE00@gmail.com>
Message-ID: <22189.23440.832633.260311@turnbull.sk.tsukuba.ac.jp>

Guido van Rossum writes:

> Oooh, Discourse looks and sounds good. Hopefully we can opt out from
> voting, everything else looks just right.

Random832 is right: you need some kind of voting to have forum-curated
reputations.
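To make the double-definition Ned describes in the "Prevent importing
yourself?" thread concrete, here is a minimal, runnable sketch. It is an
illustration only: the file name myscript.py is hypothetical, and it
assumes the script's own directory is first on sys.path, which is the
default when a file is run directly.

-- myscript.py --
class TheClass:
    pass

if __name__ == "__main__":
    # Importing the file under its real name executes it a second time
    # as a fresh module object, so the class statement runs twice.
    import myscript

    a = TheClass()  # an instance of __main__.TheClass
    # Prints "False": __main__.TheClass and myscript.TheClass are
    # distinct classes, even though both come from this one source file.
    print(isinstance(a, myscript.TheClass))
-- end myscript.py --

Running "python myscript.py" prints False, which is the same isinstance
surprise Steven's earlier example triggers with an explicit
"import __main__".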
From ncoghlan at gmail.com Sun Jan 31 01:44:53 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 31 Jan 2016 16:44:53 +1000 Subject: [Python-ideas] Prevent importing yourself? In-Reply-To: References: <56ABEACA.8000707@nedbatchelder.com> <56AC9C47.50304@nedbatchelder.com> Message-ID: On 30 January 2016 at 23:20, Oscar Benjamin wrote: > On 30 January 2016 at 11:57, Nick Coghlan wrote: >> On 30 January 2016 at 21:19, Ned Batchelder wrote: >>> On 1/30/16 4:30 AM, Nick Coghlan wrote: >>>> We could potentially detect when __main__ is being reimported under a >>>> different name and issue a user visible warning when it happens, but >>>> we can't readily detect a file importing itself in the general case >>>> (since it may be an indirect circular reference rather than a direct). >>> >>> I thought about the indirect case, and for the errors I'm trying to make >>> clearer, the direct case is plenty. >> >> In that case, the only problem I see off the top of my head with >> emitting a warning for direct self-imports is that it would rely on >> import system behaviour we're currently trying to reduce/minimise: the >> import machinery needing visibility into the globals for the module >> initiating the import. >> >> It's also possible that by the time we get hold of the __spec__ for >> the module being imported, we've already dropped our reference to the >> importing module's globals, so we can't check against __file__ any >> more. However, I'd need to go read the code to remember how quickly we >> get to extracting just the globals of potential interest. > > Maybe this is because I don't really understand how the import > machinery works but I would say that if I run > > $ python random.py > > Then the interpreter should be able to know that __main__ is called > "random" and know the path to that file. It should also be evident if > '' is at the front of sys.path then "import random" is going to import > that same module. Why is it difficult to detect that case? Yes, this is the case I originally said we could definitely detect. The case I don't know if we can readily detect is the one where a module *other than __main__* is imported a second time under a different name. However, I'm not sure that latter capability would be at all useful, so it probably doesn't matter whether or not it's feasible. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From nicholas.chammas at gmail.com Sun Jan 31 10:16:56 2016 From: nicholas.chammas at gmail.com (Nicholas Chammas) Date: Sun, 31 Jan 2016 15:16:56 +0000 Subject: [Python-ideas] A bit meta In-Reply-To: <22189.23440.832633.260311@turnbull.sk.tsukuba.ac.jp> References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> <20160129080349.GD4619@ando.pearwood.info> <56AB94A2.20901@stoneleaf.us> <20160130041001.GI4619@ando.pearwood.info> <2067CA09-F205-41A2-A0E0-17065DB1BE00@gmail.com> <22189.23440.832633.260311@turnbull.sk.tsukuba.ac.jp> Message-ID: And importantly, with a PSF mailing list or PSF bug tracker or PSF code review system, etc., collaboration with the rest of the group doesn't require an account with some particular organisation not accountable to PSF. A quick comment on this, in case anyone thinks Discourse falls in this category. Discourse, the forum software, is 100% open source. We can run our own instance on our own hardware, or we can have someone else run it for us (like the Discourse team themselves).
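Returning briefly to the self-import subthread above: the check Oscar describes can be sketched in a few lines of Python. This is only an illustration of the idea, not CPython's actual import machinery, and the helper name would_shadow is hypothetical:

import importlib.util
import os
import sys

def would_shadow(module_name):
    # Would "import module_name" resolve to the very file we are
    # currently running as __main__ (e.g. "python random.py" followed
    # by "import random" with '' at the front of sys.path)?
    main_file = getattr(sys.modules["__main__"], "__file__", None)
    if main_file is None:
        return False  # e.g. an interactive session has no backing file
    spec = importlib.util.find_spec(module_name)
    if spec is None or spec.origin is None:
        return False
    return os.path.realpath(spec.origin) == os.path.realpath(main_file)

As Nick notes, this direct case is detectable precisely because both pieces of information (the path behind __main__ and where a fresh import would resolve) are available up front; the indirect circular case is not.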
And identity is pluggable, so we can have something like id.python.org be the identity provider for our Discourse instance, regardless of where it's hosted. Discourse is not like Google Groups where a company we don't control can decide to shut down our forum service, or where we are forced to create accounts with a third-party in order to hold discussions. Every piece of Discourse would be completely under our control. Random832 is right: you need some kind of voting to have forum-curated reputations. A quick distinction here: Voting, at least on Discourse, will be a separate plugin intended to let people do things like vote on proposals. Reputation -- or, as Discourse calls it, trust levels -- is its own thing, and comes built in to Discourse. Similar to how Stack Overflow works, as you gain trust within the community, new abilities become unlocked. For example, users at trust level 0 (i.e. brand new users) cannot send private messages to other users and cannot add attachments to their posts. These defaults are configurable by the forum admin. As they participate in the community, their trust level goes up and formerly-locked abilities become available. People who have been around for ages and who are already trusted can be manually promoted to the highest trust level, which effectively makes them forum moderators. If this sounds interesting to you, I recommend reading through the Discourse trust levels to get a good sense of how Discourse views community building. It's really well thought out, IMO, and is informed by the authors' experience building Stack Overflow. Nick On Sat, Jan 30, 2016 at 8:30 PM Stephen J. Turnbull wrote: > Guido van Rossum writes: > > > Oooh, Discourse looks and sounds good. Hopefully we can opt out from > > voting, everything else looks just right. > > Random832 is right: you need some kind of voting to have forum-curated > reputations. > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From skrah.temporarily at gmail.com Sun Jan 31 10:47:58 2016 From: skrah.temporarily at gmail.com (Stefan Krah) Date: Sun, 31 Jan 2016 15:47:58 +0000 (UTC) Subject: [Python-ideas] A bit meta References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> <20160129080349.GD4619@ando.pearwood.info> <56AB94A2.20901@stoneleaf.us> <20160130041001.GI4619@ando.pearwood.info> <2067CA09-F205-41A2-A0E0-17065DB1BE00@gmail.com> <22189.23440.832633.260311@turnbull.sk.tsukuba.ac.jp> Message-ID: Nicholas Chammas writes: > If this sounds interesting to you, I recommend reading through the Discourse trust levels to get a good sense of how Discourse views community building. It's really well thought out, IMO, and is informed by the authors' experience building Stack Overflow. It does not sound interesting at all -- Python development is increasingly turning into a circus, with fewer and fewer people actually writing code.
Stefan Krah From nicholas.chammas at gmail.com Sun Jan 31 11:19:12 2016 From: nicholas.chammas at gmail.com (Nicholas Chammas) Date: Sun, 31 Jan 2016 16:19:12 +0000 Subject: [Python-ideas] A bit meta In-Reply-To: References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> <20160129080349.GD4619@ando.pearwood.info> <56AB94A2.20901@stoneleaf.us> <20160130041001.GI4619@ando.pearwood.info> <2067CA09-F205-41A2-A0E0-17065DB1BE00@gmail.com> <22189.23440.832633.260311@turnbull.sk.tsukuba.ac.jp> Message-ID: To be clear, I'm not on python-dev and am not advocating we replace that list with Discourse. I'm just making the case for why Discourse would be a good candidate for the other discussion venues we've been talking about in this thread (e.g. packaging, python-ideas), where people are open to trying out a new medium. The basic idea is that investing in a better medium and better tooling fosters better discussions, which benefits Python the community and ultimately also Python the code base. I wouldn't call that a circus activity. But then again, I'm relatively new to the Python community; perhaps most people on here find this kind of meta-discussion unproductive. On Sun, Jan 31, 2016 at 10:48 AM Stefan Krah wrote: > Nicholas Chammas writes: > > If this sounds interesting to you, I recommend reading through the > Discourse trust levels to get a good sense of how Discourse views community > building. It's really well thought out, IMO, and is informed by the > authors' > experience building Stack Overflow. > > It does not sound interesting at all -- Python development is increasingly > turning into a circus, with fewer and fewer people actually writing code. > > > Stefan Krah > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From barry at python.org Sun Jan 31 12:40:36 2016 From: barry at python.org (Barry Warsaw) Date: Sun, 31 Jan 2016 12:40:36 -0500 Subject: [Python-ideas] A bit meta In-Reply-To: References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> <20160130161726.2fce1648@anarchist.wooz.org> Message-ID: <20160131124036.6e0bd451@subdivisions.wooz.org> On Jan 30, 2016, at 04:09 PM, Guido van Rossum wrote: >On Sat, Jan 30, 2016 at 1:17 PM, Barry Warsaw wrote: >> For example, many years ago I missed a discussion about something I cared >> about and only accidentally took notice when I saw a commit message in my >> inbox. When I asked about why the issue had never been mentioned on >> python-dev, I was told that everything was hashed out in great detail on the >> tracker. I didn't even realize that I wasn't getting email notifications of >> new tracker issues, so I never saw it until it was too late. > >That might have been a lapse in judgement for that particular issue? I actually don't remember the details, just that it happened. But the fact that a lot of smaller details are discussed primarily or solely on the tracker is totally fine and all good! I think we do a much better job of advertising it now. Hopefully everyone knows that to stay involved at that level of detail, sign up for new-issue notifications and nosey yourself in on the topics you care about. Agreed with you about the rest of what you said, except perhaps for: >A lot of stuff used to (or still does) happen in IRC, which (as you >know) I utterly hate and can't stand.
But chat systems still serve a >purpose, and if people want to use them we can't stop them. Yep. We have similar discussions internally. I actually don't mind IRC since I have a good client (bip + Emacs/ERC) and I do live on dozens of channels for work. It can get a little spammy at times, but I find them relatively effective at getting or giving focused, short-term help. IRC doesn't work as well for bigger collaborations. But IRC does have the advantage of being totally open and accessible via numerous clients, so information can't be too exclusive or owned. Cheers, -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From mal at egenix.com Sun Jan 31 13:22:38 2016 From: mal at egenix.com (M.-A. Lemburg) Date: Sun, 31 Jan 2016 19:22:38 +0100 Subject: [Python-ideas] A bit meta In-Reply-To: <20160131124036.6e0bd451@subdivisions.wooz.org> References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> <20160130161726.2fce1648@anarchist.wooz.org> <20160131124036.6e0bd451@subdivisions.wooz.org> Message-ID: <56AE50EE.4080809@egenix.com> Would it be possible to provide an integration of Mailman with Discourse? I know that Discourse already provides quite a few mailing-list-like features, but there are still a few issues, which an integration like the existing NNTP gateway of Mailman could likely help resolve: https://meta.discourse.org/t/feedback-from-a-community-about-mailing-list-feature/27695 Esp. the inline reply style (problem 4) mentioned there seems like a show stopper for the way we are used to working here and on other Python MLs. There already is a grant for improving Discourse for some of these things: https://meta.discourse.org/t/moss-roadmap-mailing-lists/36432 If both Discourse and Mailman can live side-by-side, with Discourse being the "web interface" to the Mailman list, I think we'd get the best of both worlds. Cheers, -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Experts (#1, Jan 31 2016) >>> Python Projects, Coaching and Consulting ... http://www.egenix.com/ >>> Python Database Interfaces ... http://products.egenix.com/ >>> Plone/Zope Database Interfaces ... http://zope.egenix.com/ ________________________________________________________________________ ::: We implement business ideas - efficiently in both time and costs ::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ http://www.malemburg.com/ From brett at python.org Sun Jan 31 13:27:22 2016 From: brett at python.org (Brett Cannon) Date: Sun, 31 Jan 2016 18:27:22 +0000 Subject: [Python-ideas] A bit meta In-Reply-To: References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> <20160129080349.GD4619@ando.pearwood.info> <56AB94A2.20901@stoneleaf.us> <20160130041001.GI4619@ando.pearwood.info> <2067CA09-F205-41A2-A0E0-17065DB1BE00@gmail.com> <22189.23440.832633.260311@turnbull.sk.tsukuba.ac.jp> Message-ID: On Sun, 31 Jan 2016 at 08:19 Nicholas Chammas wrote: > To be clear, I'm not on python-dev and am not advocating we replace that > list with Discourse. > > I'm just making the case for why Discourse would be a good candidate for > the other discussion venues we've been talking about in this thread (e.g. > packaging, python-ideas), where people are open to trying out a new medium.
> > The basic idea is that investing in a better medium and better tooling > fosters better discussions, which benefits Python the community and > ultimately also Python the code base. I wouldn't call that a circus > activity. > > But then again, I'm relatively new to the Python community; perhaps most > people on here find this kind of meta-discussion unproductive. > It should happen on occasion, just not regularly. :) Keeping an open source project running is part technical, part social (which makes it part political :). That social bit means having to occasionally evaluate how we are managing our communication amongst not just long-time participants but also new ones. This means we have to sometimes look at what kids in university are using in order to entice them to participate (heck, even high school at this rate). For instance, Barry has mentioned NNTP as part of his solution to managing his mail relating to Python. But go into any university around the world and ask some CS student, "what is Usenet?" -- let alone NNTP -- and it's quite possible you will get a blank stare. This is why I don't call it comp.lang.python anymore but python-list at python.org (same goes for IRC, but it's probably known a lot more widely than Usenet). What this means is we occasionally have to evaluate whether our ways of communicating are too antiquated for new participants in open source and whether they are no longer the most effective (because old does not mean bad, but it does not mean better either), while balancing it with not having constant churn or inadvertently making things worse. Toss in people's principled stances on open source and it leads to a heated discussion. For instance, people have said they don't want to set up another account. But people forget that *every* mailing list on mail.python.org requires its own account to post (I personally have near a bazillion at this point). And while the archives and gmane give you anonymous access to read without an account, so does Discourse or any of the other solutions being discussed (no one wants to wall off the archives or make it so we can't keep hold of our data in case of another move). It's the usual issue of having to get down to the root of why people would want to stay with the mailing list vs. why others would want to switch to Discourse. Finding out the fundamental reasons and taking out the emotion of the discussion is usually the key to helping solve this sort of grounded discussion (at which point you can start ignoring those who can't remove the emotion). And in the case of people worrying about bifurcating the discussions, the python-ideas mailing list would simply be shut down to new email and its archive left up to prevent a split in audience if we do end up changing things up. > > > On Sun, Jan 31, 2016 at 10:48 AM Stefan Krah > wrote: >> Nicholas Chammas writes: >> > If this sounds interesting to you, I recommend reading through the >> Discourse trust levels to get a good sense of how Discourse views >> community >> building. It's really well thought out, IMO, and is informed by the >> authors' >> experience building Stack Overflow. >> >> It does not sound interesting at all -- Python development is increasingly >> turning into a circus, with fewer and fewer people actually writing code.
>> >> >> Stefan Krah >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Sun Jan 31 13:35:45 2016 From: guido at python.org (Guido van Rossum) Date: Sun, 31 Jan 2016 10:35:45 -0800 Subject: [Python-ideas] A bit meta In-Reply-To: <20160131124036.6e0bd451@subdivisions.wooz.org> References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> <20160130161726.2fce1648@anarchist.wooz.org> <20160131124036.6e0bd451@subdivisions.wooz.org> Message-ID: On Sun, Jan 31, 2016 at 9:40 AM, Barry Warsaw wrote: > But IRC does have > the advantage of being totally open and accessible via numerous clients, so > information can't be too exclusive or owned. Maybe the software is totally open, but the community doesn't feel that way. When I forayed into it briefly, it felt hostile to people who don't have the right personality to be online 24/7. -- --Guido van Rossum (python.org/~guido) From barry at python.org Sun Jan 31 13:36:18 2016 From: barry at python.org (Barry Warsaw) Date: Sun, 31 Jan 2016 13:36:18 -0500 Subject: [Python-ideas] A bit meta In-Reply-To: <56AE50EE.4080809@egenix.com> References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> <20160130161726.2fce1648@anarchist.wooz.org> <20160131124036.6e0bd451@subdivisions.wooz.org> <56AE50EE.4080809@egenix.com> Message-ID: <20160131133618.2350a79e@subdivisions.wooz.org> On Jan 31, 2016, at 07:22 PM, M.-A. Lemburg wrote: >Would it be possible to provide an integration of Mailman with >Discourse? Possible, I don't know, but on the wish list, yes! None of the core Mailman developers have time for this, but we would gladly help and work with anybody who wanted to look into this. >If both Discourse and Mailman can live side-by-side, with >Discourse being the "web interface" to the Mailman list, >I think we'd get the best of both worlds. Definitely. Also note that we'd like to build NNTP and IMAP support into Mailman; again, though, it's a lack of resources. If anybody wants to work on these areas, please contact us over in mailman-developers at python.org Cheers, -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From barry at python.org Sun Jan 31 13:39:41 2016 From: barry at python.org (Barry Warsaw) Date: Sun, 31 Jan 2016 13:39:41 -0500 Subject: [Python-ideas] A bit meta In-Reply-To: References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> <20160130161726.2fce1648@anarchist.wooz.org> <20160131124036.6e0bd451@subdivisions.wooz.org> Message-ID: <20160131133941.64a4eefc@subdivisions.wooz.org> On Jan 31, 2016, at 10:35 AM, Guido van Rossum wrote: >Maybe the software is totally open, but the community doesn't feel >that way. When I forayed into it briefly, it felt hostile to people who >don't have the right personality to be online 24/7. It's probably a lot like FLOSS communities in general. Some are very open, patient, and accepting, and others aren't. Maybe we're spoiled here in Pythonia.
:) Cheers, -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From donald at stufft.io Sun Jan 31 13:54:16 2016 From: donald at stufft.io (Donald Stufft) Date: Sun, 31 Jan 2016 13:54:16 -0500 Subject: [Python-ideas] A bit meta In-Reply-To: <20160131133941.64a4eefc@subdivisions.wooz.org> References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> <20160130161726.2fce1648@anarchist.wooz.org> <20160131124036.6e0bd451@subdivisions.wooz.org> <20160131133941.64a4eefc@subdivisions.wooz.org> Message-ID: <34B6BED8-F697-4BF3-8F21-852C244B2F8C@stufft.io> > On Jan 31, 2016, at 1:39 PM, Barry Warsaw wrote: > > On Jan 31, 2016, at 10:35 AM, Guido van Rossum wrote: > >> Maybe the software is totally open, but the community doesn't feel >> that way. When I forayed into it briefly, it felt hostile to people who >> don't have the right personality to be online 24/7. > > It's probably a lot like FLOSS communities in general. Some are very open, > patient, and accepting, and others aren't. Maybe we're spoiled here in > Pythonia. :) > Eh, I think IRC as a protocol tends to be hostile to people who can't have some method of being online 24/7 (even if it's via a bouncer and they aren't physically there). I think it's why you see more projects using things like Slack or Gitter instead of IRC. You can sort of recreate some of this using log bots and/or bouncers and the like, but I think one of the things we're seeing across all of F/OSS is that for the newer generation of developers, UX matters, in many cases more than F/OSS does and they're less willing to put up with bad UX. I think it is why you see so many people developing software on OS X that they plan to deploy to Linux, why you see people preferring GitHub over other solutions, why Slack over IRC, etc. ----------------- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 842 bytes Desc: Message signed with OpenPGP using GPGMail URL: From barry at python.org Sun Jan 31 14:02:59 2016 From: barry at python.org (Barry Warsaw) Date: Sun, 31 Jan 2016 14:02:59 -0500 Subject: [Python-ideas] A bit meta In-Reply-To: <34B6BED8-F697-4BF3-8F21-852C244B2F8C@stufft.io> References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> <20160130161726.2fce1648@anarchist.wooz.org> <20160131124036.6e0bd451@subdivisions.wooz.org> <20160131133941.64a4eefc@subdivisions.wooz.org> <34B6BED8-F697-4BF3-8F21-852C244B2F8C@stufft.io> Message-ID: <20160131140259.3084bafc@subdivisions.wooz.org> On Jan 31, 2016, at 01:54 PM, Donald Stufft wrote: >Eh, I think IRC as a protocol tends to be hostile to people who can't have >some method of being online 24/7 I wouldn't say "hostile" but certainly not nearly as useful. On the flip side, I've heard complaints from Slack users (I'm not one myself) that they can get overwhelmed by notifications when they want to be "off the clock". Cheers, -Barry -------------- next part -------------- A non-text attachment was scrubbed...
Name: not available Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From ben+python at benfinney.id.au Sun Jan 31 14:20:54 2016 From: ben+python at benfinney.id.au (Ben Finney) Date: Mon, 01 Feb 2016 06:20:54 +1100 Subject: [Python-ideas] A bit meta References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> <56AB94A2.20901@stoneleaf.us> <20160130041001.GI4619@ando.pearwood.info> <2067CA09-F205-41A2-A0E0-17065DB1BE00@gmail.com> <22189.23440.832633.260311@turnbull.sk.tsukuba.ac.jp> Message-ID: <85bn81eesp.fsf@benfinney.id.au> Brett Cannon writes: > For instance, people have said they don't want to set up another > account. The complaint expressed (by me, at least; perhaps others agree) was not against setting up an account. As you point out, PSF mailing lists already require creating accounts. It's against being required to maintain a trusted relationship with some non-PSF-accountable entity, in order to participate in some aspect of Python community. I agree with others that a Discourse instance entirely controlled by PSF would avoid that problem. -- \ "Consider the daffodil. And while you're doing that, I'll be | `\ over here, looking through your stuff." --Jack Handey | _o__) | Ben Finney From nicholas.chammas at gmail.com Sun Jan 31 16:11:53 2016 From: nicholas.chammas at gmail.com (Nicholas Chammas) Date: Sun, 31 Jan 2016 21:11:53 +0000 Subject: [Python-ideas] A bit meta In-Reply-To: <85bn81eesp.fsf@benfinney.id.au> References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> <56AB94A2.20901@stoneleaf.us> <20160130041001.GI4619@ando.pearwood.info> <2067CA09-F205-41A2-A0E0-17065DB1BE00@gmail.com> <22189.23440.832633.260311@turnbull.sk.tsukuba.ac.jp> <85bn81eesp.fsf@benfinney.id.au> Message-ID: If both Discourse and Mailman can live side-by-side, with Discourse being the "web interface" to the Mailman list, I think we'd get the best of both worlds. Funny you ask that, since I wondered about exactly the same thing when I looked into using Discourse for an Apache project. The Apache Software Foundation has a strict policy about ASF-owned mailing lists being the place of discussion, so the only way Discourse would have been able to play a role was as an interface to an existing, ASF-owned mailing list. Here is the discussion I started about this on Discourse Meta around a year ago. In short, I think the answer that came out from that discussion is (quoting Jeff Atwood; emphasis his): This really depends on the culture of the mailing list.
It's against being required to > maintain a trusted relationship with some non-PSF-accountable entity, in > order to participate in some aspect of Python community. > > I agree with others that a Discourse instance entirely controlled by PSF > would avoid that problem. > > -- > \ ?Consider the daffodil. And while you're doing that, I'll be | > `\ over here, looking through your stuff.? ?Jack Handey | > _o__) | > Ben Finney > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From nicholas.chammas at gmail.com Sun Jan 31 16:53:13 2016 From: nicholas.chammas at gmail.com (Nicholas Chammas) Date: Sun, 31 Jan 2016 21:53:13 +0000 Subject: [Python-ideas] A bit meta In-Reply-To: References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> <56AB94A2.20901@stoneleaf.us> <20160130041001.GI4619@ando.pearwood.info> <2067CA09-F205-41A2-A0E0-17065DB1BE00@gmail.com> <22189.23440.832633.260311@turnbull.sk.tsukuba.ac.jp> <85bn81eesp.fsf@benfinney.id.au> Message-ID: Brett wrote: What this means is we occasionally have to evaluate whether our ways of communicating are too antiquated for new participants in open source and whether they are no longer the most effective (because old does not mean bad, but it does not mean better either), while balancing it with not having constant churn or inadvertently making things worse. Discourse aside, I?m really glad to see that people understand that this is important to the long-term health of Python ? as a community and otherwise ? and are willing to give it priority. (And I totally agree that significant workflow changes, or discussions thereof, should happen infrequently and be evaluated carefully for their cost and benefit over time.) Donald wrote: I think one of the things we?re seeing across all of F/OSS is that for the newer generation of developers, UX matters, in many cases more than F/OSS does and they?re less willing to put up with bad UX. I can attest to this personally, and I?ll also offer this conjecture: I don?t think older generations of developers are intrinsically any more tolerant of bad UX than the younger generations are. They hate bad UX too, and they had to figure out their own solutions to make things better ? their email filters, their clients, their homegrown scripts, etc. ? when nothing better was available, and eventually settled into a flow that worked for them. Nick ? On Sun, Jan 31, 2016 at 4:11 PM Nicholas Chammas wrote: > If both Discourse and Mailman can live side-by-side, with > > Discourse being the ?web interface? to the Mailman list,I think we?d get > the best of both worlds. > > Funny you ask that, since I wondered about exactly the same thing when I > looked into using Discourse for an Apache project. The Apache Software > Foundation has a strict policy about ASF-owned mailing lists being the > place of discussion, so the only way Discourse would have been able to play > a role was as an interface to an existing, ASF-owned mailing list. > > Here is the discussion I started about this > > on Discourse Meta around a year ago. > > In short, I think the answer that came out from that discussion is ( > quoting > > Jeff Atwood; emphasis his): > > This really depends on the culture of the mailing list. 
Discourse has > fairly robust email support (for notifications, and if configured, for > replies and email-in to start new topics), but it is still fundamentally > web-centric in the way that it views the world. There will be clashes for > people who are 100% email-centric. > > Do you have support from the "powers that be" at said mailing lists to > make such a change? Are they asking for such a change? We are very open to > working with a partner on migrating mailing lists and further enhancing the > mailing list support in Discourse, but it very much requires solid support > from the *leadership* and a significant part of the *community*. > > There's a lot of friction involved in changes for groups! > > Nick > > On Sun, Jan 31, 2016 at 2:21 PM Ben Finney > wrote: >> Brett Cannon writes: >> >> > For instance, people have said they don't want to set up another >> > account. >> >> The complaint expressed (by me, at least; perhaps others agree) was not >> against setting up an account. As you point out, PSF mailing lists >> already require creating accounts. It's against being required to >> maintain a trusted relationship with some non-PSF-accountable entity, in >> order to participate in some aspect of Python community. >> >> I agree with others that a Discourse instance entirely controlled by PSF >> would avoid that problem. >> >> -- >> \ "Consider the daffodil. And while you're doing that, I'll be | >> `\ over here, looking through your stuff." --Jack Handey | >> _o__) | >> Ben Finney >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From skrah.temporarily at gmail.com Sun Jan 31 17:03:59 2016 From: skrah.temporarily at gmail.com (Stefan Krah) Date: Sun, 31 Jan 2016 22:03:59 +0000 (UTC) Subject: [Python-ideas] A bit meta References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> <56AB94A2.20901@stoneleaf.us> <20160130041001.GI4619@ando.pearwood.info> <2067CA09-F205-41A2-A0E0-17065DB1BE00@gmail.com> <22189.23440.832633.260311@turnbull.sk.tsukuba.ac.jp> <85bn81eesp.fsf@benfinney.id.au> Message-ID: Nicholas Chammas writes: > I can attest to this personally, and I'll also offer this conjecture: > I don't think older generations of developers are intrinsically any more tolerant of bad UX than the younger generations are. They hate bad UX too, and they had to figure out their own solutions to make things better -- their email filters, their clients, their homegrown scripts, etc. -- when nothing better was available, and eventually settled into a flow that worked for them. You are really getting on a soapbox here while having no clue at all about basic mailing list etiquette like a) not top posting b) not full quoting the entire thread c) properly quoting your predecessors. I guess we'll see more of that once the move to discourse has happened.
Stefan Krah From donald at stufft.io Sun Jan 31 17:17:46 2016 From: donald at stufft.io (Donald Stufft) Date: Sun, 31 Jan 2016 17:17:46 -0500 Subject: [Python-ideas] A bit meta In-Reply-To: References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> <56AB94A2.20901@stoneleaf.us> <20160130041001.GI4619@ando.pearwood.info> <2067CA09-F205-41A2-A0E0-17065DB1BE00@gmail.com> <22189.23440.832633.260311@turnbull.sk.tsukuba.ac.jp> <85bn81eesp.fsf@benfinney.id.au> Message-ID: > On Jan 31, 2016, at 5:03 PM, Stefan Krah wrote: > > c) properly quoting your predecessors. > Which is ironic, given that you incorrectly quoted Nicholas and half of the message isn't quoted at all though it should be. Perhaps it'd be a lot more welcoming if we didn't scold people for "mailing list etiquette" when the various email clients make it pretty easy to accidentally mess it up. ----------------- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 842 bytes Desc: Message signed with OpenPGP using GPGMail URL: From skrah.temporarily at gmail.com Sun Jan 31 17:29:19 2016 From: skrah.temporarily at gmail.com (Stefan Krah) Date: Sun, 31 Jan 2016 22:29:19 +0000 (UTC) Subject: [Python-ideas] A bit meta References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> <56AB94A2.20901@stoneleaf.us> <20160130041001.GI4619@ando.pearwood.info> <2067CA09-F205-41A2-A0E0-17065DB1BE00@gmail.com> <22189.23440.832633.260311@turnbull.sk.tsukuba.ac.jp> <85bn81eesp.fsf@benfinney.id.au> Message-ID: Donald Stufft writes: > > c) properly quoting your predecessors. > > > > Which is ironic, given that you incorrectly quoted Nicholas and half of the message isn't quoted at all > though it should be. Perhaps it'd be a lot more welcoming if we didn't scold people for "mailing list > etiquette" when the various email clients make it pretty easy to accidentally mess it up. How can I quote properly if it isn't clear at all who wrote what? Stefan Krah From nicholas.chammas at gmail.com Sun Jan 31 17:30:58 2016 From: nicholas.chammas at gmail.com (Nicholas Chammas) Date: Sun, 31 Jan 2016 22:30:58 +0000 Subject: [Python-ideas] A bit meta In-Reply-To: References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> <56AB94A2.20901@stoneleaf.us> <20160130041001.GI4619@ando.pearwood.info> <2067CA09-F205-41A2-A0E0-17065DB1BE00@gmail.com> <22189.23440.832633.260311@turnbull.sk.tsukuba.ac.jp> <85bn81eesp.fsf@benfinney.id.au> Message-ID: On Sun, Jan 31, 2016 at 5:04 PM Stefan Krah skrah.temporarily at gmail.com wrote: a) not top posting b) not full quoting the entire thread Sorry, by default Gmail hides the thread when replying so it's easy to forget that you are re-mailing the whole thing out. So normally you would not even notice that someone has top posted or quoted the entire thread if you're reading on Gmail's web client. Chalk it up to my being a mailing list n00b. I hope you also recognize that this particular piece of mailing list etiquette arose in a time where people did not have nice tooling to do the work for them, which is part of the point of this discussion. c) properly quoting your predecessors. OK, did I do it right this time? I guess we'll see more of that once the move to discourse has happened. No such move has been agreed upon as far as I can tell, but if I may continue on my "soap box"
and repeat what I've said earlier: I think Discourse will make etiquette *easier* to follow by taking care of repetitive tasks like this for people, instead of requiring that everyone independently remember to do X, Y, and Z every time they post. If you disagree, it would be good to hear why so we can discuss the root issue. And for the record: I'm really taken aback by how cynical your comments are, and I apologize for ticking you off. I'll do a better job of following mailing list etiquette going forward. Nick -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Sun Jan 31 17:39:48 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Sun, 31 Jan 2016 14:39:48 -0800 Subject: [Python-ideas] A bit meta In-Reply-To: References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> <20160130041001.GI4619@ando.pearwood.info> <2067CA09-F205-41A2-A0E0-17065DB1BE00@gmail.com> <22189.23440.832633.260311@turnbull.sk.tsukuba.ac.jp> <85bn81eesp.fsf@benfinney.id.au> Message-ID: <56AE8D34.8000204@stoneleaf.us> On 01/31/2016 02:03 PM, Stefan Krah wrote: > b) not full quoting the entire thread Do you mean something like quoting an entire PEP when responding to only one or two lines of it, or do you mean keeping everything from the first email through all the replies so we have 15 levels of indentation? `Cause frankly, both those suck, and many long time users here are guilty of it. So why don't you lay off the personality war, and have an honest discussion of the idea. -- ~Ethan~ From guido at python.org Sun Jan 31 17:49:50 2016 From: guido at python.org (Guido van Rossum) Date: Sun, 31 Jan 2016 14:49:50 -0800 Subject: [Python-ideas] A bit meta In-Reply-To: References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> <56AB94A2.20901@stoneleaf.us> <20160130041001.GI4619@ando.pearwood.info> <2067CA09-F205-41A2-A0E0-17065DB1BE00@gmail.com> <22189.23440.832633.260311@turnbull.sk.tsukuba.ac.jp> <85bn81eesp.fsf@benfinney.id.au> Message-ID: On Sun, Jan 31, 2016 at 2:29 PM, Stefan Krah wrote: > How can I quote properly if it isn't clear at all who wrote what? Stefan, take this "etiquette" stuff off the thread. -- --Guido van Rossum (python.org/~guido) From skrah.temporarily at gmail.com Sun Jan 31 18:06:05 2016 From: skrah.temporarily at gmail.com (Stefan Krah) Date: Sun, 31 Jan 2016 23:06:05 +0000 (UTC) Subject: [Python-ideas] A bit meta References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> <56AB94A2.20901@stoneleaf.us> <20160130041001.GI4619@ando.pearwood.info> <2067CA09-F205-41A2-A0E0-17065DB1BE00@gmail.com> <22189.23440.832633.260311@turnbull.sk.tsukuba.ac.jp> <85bn81eesp.fsf@benfinney.id.au> Message-ID: Nicholas Chammas writes: > Sorry, by default Gmail hides the thread when replying so it's easy to forget that you are re-mailing the whole thing out. So normally you would not even notice that someone has top posted or quoted the entire thread if you're reading on Gmail's web client. Chalk it up to my being a mailing list n00b. That's okay, but perhaps your tools aren't as good as you think. > I hope you also recognize that this particular piece of mailing list etiquette arose in a time where people did not have nice tooling to do the work for them, which is part of the point of this discussion. Have you actually *used* Gnus, mutt, slrn or even gmane.org? You are again stating things with great certainty while I don't think you know the subject. > OK, did I do it right this time? No, try replying to one of your own posts on gmane.org and you'll see.
> I think Discourse will make etiquette easier to follow by taking care of repetitive tasks like this for people, instead of requiring that everyone independently remember to do X, Y, and Z every time they post. You don't need to: Any of the above options does it automatically. > And for the record: I'm really taken aback by how cynical your comments are, and I apologize for ticking you off. I'll do a better job of following mailing list etiquette going forward. It's not about the etiquette: If you come in here and tell us that our tools are inferior, expect some pushback. I for example think that http://try.discourse.org/ looks cluttered and distracting. Stefan Krah From rosuav at gmail.com Sun Jan 31 18:09:47 2016 From: rosuav at gmail.com (Chris Angelico) Date: Mon, 1 Feb 2016 10:09:47 +1100 Subject: [Python-ideas] A bit meta In-Reply-To: References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> <56AB94A2.20901@stoneleaf.us> <20160130041001.GI4619@ando.pearwood.info> <2067CA09-F205-41A2-A0E0-17065DB1BE00@gmail.com> <22189.23440.832633.260311@turnbull.sk.tsukuba.ac.jp> <85bn81eesp.fsf@benfinney.id.au> Message-ID: On Mon, Feb 1, 2016 at 9:30 AM, Nicholas Chammas wrote: > On Sun, Jan 31, 2016 at 5:04 PM Stefan Krah skrah.temporarily at gmail.com > wrote: > > a) not top posting > > b) not full quoting the entire thread > > Sorry, by default Gmail hides the thread when replying so it's easy to > forget that you are re-mailing the whole thing out. So normally you would > not even notice that someone has top posted or quoted the entire thread if > you're reading on Gmail's web client. Chalk it up to my being a mailing list > n00b. Yeah, Gmail can be a pain. But it's good in so many other ways that I keep using it. A couple of tips: 1) Turn off Rich Text by default, and if ever you see it active, turn it off for that email. It's a lot easier to make sure you're quoting properly etc when the email is in plain text mode. 2) If you're replying to just part of the message, you should be able to highlight that part and click in the Reply box. (That might require a config option - been ages since I set this up.) There'll be a couple of blank lines at the top, but you can either delete or ignore them, and just hit Ctrl-End to start typing underneath (or insert text in between different blocks). 3) To reply to the whole message, hit R or click in the box - and then press Ctrl-A to "select all". This instantly expands out the quoted text, making it easy to see what's worth trimming. And either because of Monty Python or because of it being one of the two hardest problems in computing, I said "a couple" and gave three. Whatever. :) ChrisA From nicholas.chammas at gmail.com Sun Jan 31 20:59:46 2016 From: nicholas.chammas at gmail.com (Nicholas Chammas) Date: Mon, 01 Feb 2016 01:59:46 +0000 Subject: [Python-ideas] A bit meta In-Reply-To: References: <9AFD3488-88D9-4CE2-9772-EB1FDD65615A@selik.org> <56AB94A2.20901@stoneleaf.us> <20160130041001.GI4619@ando.pearwood.info> <2067CA09-F205-41A2-A0E0-17065DB1BE00@gmail.com> <22189.23440.832633.260311@turnbull.sk.tsukuba.ac.jp> <85bn81eesp.fsf@benfinney.id.au> Message-ID: On Sun, Jan 31, 2016 at 6:06 PM Stefan Krah wrote: > > > I hope you also recognize that this particular piece of mailing list > etiquette arose in a time where people did not have nice tooling to do the > work for them, which is part of the point of this discussion. > > Have you actually *used* Gnus, mutt, slrn or even gmane.org?
You > are again stating things with great certainty while I don't think > you know the subject. > I haven't used those tools. It would be enlightening if you explained how they can address the issues we've been discussing in this thread. That is what we're discussing here, after all--improving how we discuss things. I've done my part by explaining the potential benefits that Discourse can offer us in great detail, because that's what I know. Regarding "stating things with great certainty", I'm not sure what you're referring to. I made some arguments, quoted people, and linked to stuff. Not sure what my crime is there. And my quote about "the older generations of developers" -- which you sneered at earlier with the "soap box" comment -- I explicitly prefaced with: "I'll also offer this conjecture: ..." > OK, did I do it right this time? > > No, try replying to one of your own posts on gmane.org and you'll see. > I'm not sure what I'm supposed to see on gmane.org. I can see that this thread is on there, but I can't find the most recent messages. The way I am quoting you now is: I am hitting "Reply" in Gmail, clearing out older parts of the thread, and replying inline to what you wrote. It's pretty simple. If that's still not correct then I'm not sure how to satisfy you. All I can say is that I think it would be better if we had a way to solve mundane issues like this centrally, instead of pushing the responsibility onto each list user to piece together their own toolchain or workflow for doing the right thing. > I think Discourse will make etiquette easier to follow by taking care of > repetitive tasks like this for people, instead of requiring that everyone > independently remember to do X, Y, and Z every time they post. > > You don't need to: Any of the above options does it automatically. > Are Gnus, mutt, and slrn client-side tools? If they are, then we are pushing this responsibility onto every list user to find and use these tools correctly. You also mentioned being able to respond to mail via gmane.org. Is that the standard way everyone is expected to interact with the list? If not, then you have the same problem. Having a modern, web-based forum like Discourse which takes care of repetitive tasks like this centrally means everyone on the forum automatically has it taken care of. It's part of the interface of the forum, and everyone is using the same interface. Discourse's UX is good enough that in many cases the user *can't* or is *extremely unlikely* to do the wrong thing when it comes to mundane, routine things like quoting people, replying, etc. I think that's great. > And for the record: I'm really taken aback by how cynical your comments > are, and I apologize for ticking you off. I'll do a better job of following > mailing list etiquette going forward. > > It's not about the etiquette: If you come in here and tell us > that our tools are inferior, expect some pushback. > I don't think I've bashed anyone's tools on here as "inferior". My discussion has been limited to Discourse vs. mailing lists. As you yourself stated, I clearly don't know about tools like Gnus, mutt, and so forth, and I'm not going to bash something I don't know. I *have* been arguing that a modern web-based forum solves common discussion issues in a way that mailing lists cannot match. But I think my arguments have been dispassionate and have not involved disparaging any tools out there as "inferior". As for "coming in here", I guess you're telling me that I'm an outsider. Sure.
And as for "pushback", I would make a distinction between pushback that is substantive in nature and focused on the problem at hand, and simple derision. They don't belong in the same category. I for example think that http://try.discourse.org/ looks cluttered > and distracting. > Finally! An actual discussion of Discourse. And in this case, I agree with you. I've gotten more accustomed to the layout over time, but I do remember being overwhelmed when I first discovered Discourse. I'd bet there are options to change the layout and reduce visual noise, but I don't know. Nick -------------- next part -------------- An HTML attachment was scrubbed... URL: