From tarek at ziade.org Fri Sep 1 07:50:13 2017 From: tarek at ziade.org (=?utf-8?Q?Tarek=20Ziad=C3=A9?=) Date: Fri, 01 Sep 2017 13:50:13 +0200 Subject: [Python-ideas] tarfile.extractall progress Message-ID: <1504266613.3827734.1092143640.6601B53D@webmail.messagingengine.com> Hey, For large archives, I want to display a progress bar while the archive is being extracted with: https://docs.python.org/3/library/tarfile.html#tarfile.TarFile.extractall I could write my own version of extractall() to do this, or maybe we could introduce a callback option that gets called everytime .extract() is called in extractall() The callback can receive the tarinfo object and where it's being extracted. This is enough to plug a progress bar and avoid reinventing .extractall() I can add a ticket and maybe a patch if people think this is a good little enhancement Cheers Tarek -- Tarek Ziad? | coding: https://ziade.org | running: https://foule.es | twitter: @tarek_ziade From phd at phdru.name Fri Sep 1 08:04:09 2017 From: phd at phdru.name (Oleg Broytman) Date: Fri, 1 Sep 2017 14:04:09 +0200 Subject: [Python-ideas] tarfile.extractall progress In-Reply-To: <1504266613.3827734.1092143640.6601B53D@webmail.messagingengine.com> References: <1504266613.3827734.1092143640.6601B53D@webmail.messagingengine.com> Message-ID: <20170901120409.GA2828@phdru.name> Hi! On Fri, Sep 01, 2017 at 01:50:13PM +0200, Tarek Ziad?? wrote: > Hey, > > For large archives, I want to display a progress bar while the archive > is being extracted with: > > https://docs.python.org/3/library/tarfile.html#tarfile.TarFile.extractall > > I could write my own version of extractall() to do this, or maybe we > could introduce a callback option that gets called > everytime .extract() is called in extractall() > > The callback can receive the tarinfo object and where it's being > extracted. This is enough to plug a progress bar > and avoid reinventing .extractall() What is "where" here? I think it should be 2 parameters -- position in the file (in bytes) and total file size; the total could be None if the size is unknown (the tar is piped from network or a (g/bz)zip subprocess). > I can add a ticket and maybe a patch if people think this is a good > little enhancement Definitely a good idea! > Cheers > Tarek > > -- > > Tarek Ziad?? | coding: https://ziade.org | running: https://foule.es | > twitter: @tarek_ziade Oleg. -- Oleg Broytman http://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN. From p.f.moore at gmail.com Fri Sep 1 08:18:54 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Fri, 1 Sep 2017 13:18:54 +0100 Subject: [Python-ideas] tarfile.extractall progress In-Reply-To: <1504266613.3827734.1092143640.6601B53D@webmail.messagingengine.com> References: <1504266613.3827734.1092143640.6601B53D@webmail.messagingengine.com> Message-ID: On 1 September 2017 at 12:50, Tarek Ziad? wrote: > Hey, > > For large archives, I want to display a progress bar while the archive > is being extracted with: > > https://docs.python.org/3/library/tarfile.html#tarfile.TarFile.extractall > > I could write my own version of extractall() to do this, or maybe we > could introduce a callback option that gets called > everytime .extract() is called in extractall() > > The callback can receive the tarinfo object and where it's being > extracted. 
This is enough to plug a progress bar > and avoid reinventing .extractall() > > I can add a ticket and maybe a patch if people think this is a good > little enhancement Sounds like a reasonable enhancement, but for your particular use couldn't you just subclass TarFile and call your progress callback at the end of the extract method after the base class extract? Paul From tarek at ziade.org Fri Sep 1 08:23:18 2017 From: tarek at ziade.org (=?utf-8?Q?Tarek=20Ziad=C3=A9?=) Date: Fri, 01 Sep 2017 14:23:18 +0200 Subject: [Python-ideas] tarfile.extractall progress In-Reply-To: References: <1504266613.3827734.1092143640.6601B53D@webmail.messagingengine.com> Message-ID: <1504268598.3834376.1092173848.0238E1D1@webmail.messagingengine.com> On Fri, Sep 1, 2017, at 02:18 PM, Paul Moore wrote: [..] > > Sounds like a reasonable enhancement, but for your particular use > couldn't you just subclass TarFile and call your progress callback at > the end of the extract method after the base class extract? Yes that's what I ended up doing. But a callable in extractall() sounds like a simpler way to do it. From tarek at ziade.org Fri Sep 1 08:28:05 2017 From: tarek at ziade.org (=?utf-8?Q?Tarek=20Ziad=C3=A9?=) Date: Fri, 01 Sep 2017 14:28:05 +0200 Subject: [Python-ideas] tarfile.extractall progress In-Reply-To: <20170901120409.GA2828@phdru.name> References: <1504266613.3827734.1092143640.6601B53D@webmail.messagingengine.com> <20170901120409.GA2828@phdru.name> Message-ID: <1504268885.3835302.1092175584.5E8746E4@webmail.messagingengine.com> On Fri, Sep 1, 2017, at 02:04 PM, Oleg Broytman wrote: > Hi! > > On Fri, Sep 01, 2017 at 01:50:13PM +0200, Tarek Ziad?? > wrote: > > Hey, > > > > For large archives, I want to display a progress bar while the archive > > is being extracted with: > > > > https://docs.python.org/3/library/tarfile.html#tarfile.TarFile.extractall > > > > I could write my own version of extractall() to do this, or maybe we > > could introduce a callback option that gets called > > everytime .extract() is called in extractall() > > > > The callback can receive the tarinfo object and where it's being > > extracted. This is enough to plug a progress bar > > and avoid reinventing .extractall() > > What is "where" here? I think it should be 2 parameters -- position > in the file (in bytes) and total file size; the total could be None if > the size is unknown (the tar is piped from network or a (g/bz)zip > subprocess). Interesting. In my mind, I was thinking about a high level callable that would just let me count the files and directory that are being extracted, my hackish implementation with clint: with tarfile.open(file, "r:gz") as tar: size = len(list(tar)) with progress.Bar(expected_size=size) as bar: def _extract(self, *args, **kw): bar.show(bar.last_progress + 1) return self.old(*args, **kw) tar.old = tar.extract tar.extract = functools.partial(_extract, tar) tar.extractall(profile_dir) What I would expect to be able to do with the new option, something like: with tarfile.open(file, "r:gz") as tar: size = len(list(tar)) with progress.Bar(expected_size=size) as bar: def _progress(tarinfo): bar.show(bar.last_progress + 1) tar.extractall(profile_dir, onextracted=_progress) > > > I can add a ticket and maybe a patch if people think this is a good > > little enhancement > > Definitely a good idea! > > > Cheers > > Tarek > > > > -- > > > > Tarek Ziad?? | coding: https://ziade.org | running: https://foule.es | > > twitter: @tarek_ziade > > Oleg. 
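
For reference, a minimal sketch of the subclassing approach Paul suggests
could look like the following. The class name, the ``progress_callback``
attribute and the file names are only illustrative, not an existing
tarfile API:

    import tarfile

    class ProgressTarFile(tarfile.TarFile):
        """TarFile that reports each extracted member to a callback."""

        progress_callback = None  # assign a callable before extractall()

        def extract(self, member, path="", *args, **kwargs):
            ret = super().extract(member, path, *args, **kwargs)
            if self.progress_callback is not None:
                # extractall() calls extract() once per member, so the
                # callback fires once per extracted file or directory,
                # receiving the TarInfo and the destination path
                self.progress_callback(member, path)
            return ret

    with ProgressTarFile.open("archive.tar.gz", "r:gz") as tar:
        tar.progress_callback = lambda member, path: print("extracted", member.name)
        tar.extractall("target_dir")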
> --
> Oleg Broytman            http://phdru.name/
> phd at phdru.name
> Programmers don't die, they just GOSUB without RETURN.
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

From storchaka at gmail.com  Fri Sep 1 09:34:24 2017
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Fri, 1 Sep 2017 16:34:24 +0300
Subject: [Python-ideas] tarfile.extractall progress
In-Reply-To: <1504266613.3827734.1092143640.6601B53D@webmail.messagingengine.com>
References: <1504266613.3827734.1092143640.6601B53D@webmail.messagingengine.com>
Message-ID:

01.09.17 14:50, Tarek Ziadé ????:
> For large archives, I want to display a progress bar while the archive
> is being extracted with:
>
> https://docs.python.org/3/library/tarfile.html#tarfile.TarFile.extractall
>
> I could write my own version of extractall() to do this, or maybe we
> could introduce a callback option that gets called
> everytime .extract() is called in extractall()
>
> The callback can receive the tarinfo object and where it's being
> extracted. This is enough to plug a progress bar
> and avoid reinventing .extractall()

This is not enough when extracting large files. In that case you perhaps
want to update the progress bar more often. If we add this feature to
tarfile, it may be worth adding it to zipfile and the shutil functions
(copytree, rmtree) as well. And if a callback is called for every
extracted/copied entity, it may be worth using its result for filtering.

From francismb at email.de  Sun Sep 3 15:36:15 2017
From: francismb at email.de (francismb)
Date: Sun, 3 Sep 2017 21:36:15 +0200
Subject: [Python-ideas] tarfile.extractall progress
In-Reply-To: <1504268885.3835302.1092175584.5E8746E4@webmail.messagingengine.com>
References: <1504266613.3827734.1092143640.6601B53D@webmail.messagingengine.com>
 <20170901120409.GA2828@phdru.name>
 <1504268885.3835302.1092175584.5E8746E4@webmail.messagingengine.com>
Message-ID:

Hi,

>>>
>>> For large archives, I want to display a progress bar while the archive
>>> is being extracted with:
>>>
>>>
>>> The callback can receive the tarinfo object and where it's being
>>> extracted. This is enough to plug a progress bar
>>> and avoid reinventing .extractall()
>>
>> What is "where" here? I think it should be 2 parameters -- position
>> in the file (in bytes) and total file size; the total could be None if
>> the size is unknown (the tar is piped from network or a (g/bz)zip
>> subprocess).
>
> Interesting. In my mind, I was thinking about a high level callable that
> would just let me count the files and directory that are being
> extracted,
>
Should the progress, in the case of just one big file, be 0% (start) or
100% (done)? Or is there a way to see some progress in that case? Or is
it irrelevant for that case?

Thanks in advance!
--francis

From k7hoven at gmail.com  Mon Sep 4 17:50:35 2017
From: k7hoven at gmail.com (Koos Zevenhoven)
Date: Tue, 5 Sep 2017 00:50:35 +0300
Subject: [Python-ideas] PEP draft: context variables
Message-ID:

Hi all,

as promised, here is a draft PEP for context variable semantics and
implementation. Apologies for the slight delay; I had a not-so-minor
autosave accident and had to retype the majority of this first draft.

During the past years, there has been growing interest in something like
task-local storage or async-local storage.
This PEP proposes an alternative approach to solving the problems that are typically stated as motivation for such concepts. This proposal is based on sketches of solutions since spring 2015, with some minor influences from the recent discussion related to PEP 550. I can also see some potential implementation synergy between this PEP and PEP 550, even if the proposed semantics are quite different. So, here it is. This is the first draft and some things are still missing, but the essential things should be there. -- Koos |||||||||||||||||||||||||||||||||||||||||||||||||||||||||| PEP: 999 Title: Context-local variables (contextvars) Version: $Revision$ Last-Modified: $Date$ Author: Koos Zevenhoven Status: Draft Type: Standards Track Content-Type: text/x-rst Created: DD-Mmm-YYYY Post-History: DD-Mmm-YYYY Abstract ======== Sometimes, in special cases, it is desired that code can pass information down the function call chain to the callees without having to explicitly pass the information as arguments to each function in the call chain. This proposal describes a construct which allows code to explicitly switch in and out of a context where a certain context variable has a given value assigned to it. This is a modern alternative to some uses of things like global variables in traditional single-threaded (or thread-unsafe) code and of thread-local storage in traditional *concurrency-unsafe* code (single- or multi-threaded). In particular, the proposed mechanism can also be used with more modern concurrent execution mechanisms such as asynchronously executed coroutines, without the concurrently executed call chains interfering with each other's contexts. The "call chain" can consist of normal functions, awaited coroutines, or generators. The semantics of context variable scope are equivalent in all cases, allowing code to be refactored freely into *subroutines* (which here refers to functions, sub-generators or sub-coroutines) without affecting the semantics of context variables. Regarding implementation, this proposal aims at simplicity and minimum changes to the CPython interpreter and to other Python interpreters. Rationale ========= Consider a modern Python *call chain* (or call tree), which in this proposal refers to any chained (nested) execution of *subroutines*, using any possible combinations of normal function calls, or expressions using ``await`` or ``yield from``. In some cases, passing necessary *information* down the call chain as arguments can substantially complicate the required function signatures, or it can even be impossible to achieve in practice. In these cases, one may search for another place to store this information. Let us look at some historical examples. The most naive option is to assign the value to a global variable or similar, where the code down the call chain can access it. However, this immediately makes the code thread-unsafe, because with multiple threads, all threads assign to the same global variable, and another thread can interfere at any point in the call chain. A somewhat less naive option is to store the information as per-thread information in thread-local storage, where each thread has its own "copy" of the variable which other threads cannot interfere with. Although non-ideal, this has been the best solution in many cases. However, thanks to generators and coroutines, the execution of the call chain can be suspended and resumed, allowing code in other contexts to run concurrently. 
Therefore, using thread-local storage is *concurrency-unsafe*, because other call chains in other contexts may interfere with the thread-local variable. Note that in the above two historical approaches, the stored information has the *widest* available scope without causing problems. For a third solution along the same path, one would first define an equivalent of a "thread" for asynchronous execution and concurrency. This could be seen as the largest amount of code and nested calls that is guaranteed to be executed sequentially without ambiguity in execution order. This might be referred to as concurrency-local or task-local storage. In this meaning of "task", there is no ambiguity in the order of execution of the code within one task. (This concept of a task is close to equivalent to a ``Task`` in ``asyncio``, but not exactly.) In such concurrency-locals, it is possible to pass information down the call chain to callees without another code path interfering with the value in the background. Common to the above approaches is that they indeed use variables with a wide but just-narrow-enough scope. Thread-locals could also be called thread-wide globals---in single-threaded code, they are indeed truly global. And task-locals could be called task-wide globals, because tasks can be very big. The issue here is that neither global variables, thread-locals nor task-locals are really meant to be used for this purpose of passing information of the execution context down the call chain. Instead of the widest possible variable scope, the scope of the variables should be controlled by the programmer, typically of a library, to have the desired scope---not wider. In other words, task-local variables (and globals and thread-locals) have nothing to do with the kind of context-bound information passing that this proposal intends to enable, even if task-locals can be used to emulate the desired semantics. Therefore, in the following, this proposal describes the semantics and the outlines of an implementation for *context-local variables* (or context variables, contextvars). In fact, as a side effect of this PEP, an async framework can use the proposed feature to implement task-local variables. Proposal ======== Because the proposed semantics are not a direct extension to anything already available in Python, this proposal is first described in terms of semantics and API at a fairly high level. In particular, Python ``with`` statements are heavily used in the description, as they are a good match with the proposed semantics. However, the underlying ``__enter__`` and ``__exit__`` methods correspond to functions in the lower-level speed-optimized (C) API. For clarity of this document, the lower-level functions are not explicitly named in the definition of the semantics. After describing the semantics and high-level API, the implementation is described, going to a lower level. Semantics and higher-level API ------------------------------ Core concept '''''''''''' A context-local variable is represented by a single instance of ``contextvars.Var``, say ``cvar``. Any code that has access to the ``cvar`` object can ask for its value with respect to the current context. In the high-level API, this value is given by the ``cvar.value`` property:: cvar = contextvars.Var(default="the default value", description="example context variable") assert cvar.value == "the default value" # default still applies # In code examples, all ``assert`` statements should # succeed according to the proposed semantics. 
No assignments to ``cvar`` have been applied for this context, so ``cvar.value`` gives the default value. Assigning new values to contextvars is done in a highly scope-aware manner:: with cvar.assign(new_value): assert cvar.value is new_value # Any code here, or down the call chain from here, sees: # cvar.value is new_value # unless another value has been assigned in a # nested context assert cvar.value is new_value # the assignment of ``cvar`` to ``new_value`` is no longer visible assert cvar.value == "the default value" Here, ``cvar.assign(value)`` returns another object, namely ``contextvars.Assignment(cvar, new_value)``. The essential part here is that applying a context variable assignment (``Assignment.__enter__``) is paired with a de-assignment (``Assignment.__exit__``). These operations set the bounds for the scope of the assigned value. Assignments to the same context variable can be nested to override the outer assignment in a narrower context:: assert cvar.value == "the default value" with cvar.assign("outer"): assert cvar.value == "outer" with cvar.assign("inner"): assert cvar.value == "inner" assert cvar.value == "outer" assert cvar.value == "the default value" Also multiple variables can be assigned to in a nested manner without affecting each other:: cvar1 = contextvars.Var() cvar2 = contextvars.Var() assert cvar1.value is None # default is None by default assert cvar2.value is None with cvar1.assign(value1): assert cvar1.value is value1 assert cvar2.value is None with cvar2.assign(value2): assert cvar1.value is value1 assert cvar2.value is value2 assert cvar1.value is value1 assert cvar2.value is None assert cvar1.value is None assert cvar2.value is None Or with more convenient Python syntax:: with cvar1.assign(value1), cvar2.assign(value2): assert cvar1.value is value1 assert cvar2.value is value2 In another *context*, in another thread or otherwise concurrently executed task or code path, the context variables can have a completely different state. The programmer thus only needs to worry about the context at hand. Refactoring into subroutines '''''''''''''''''''''''''''' Code using contextvars can be refactored into subroutines without affecting the semantics. For instance:: assi = cvar.assign(new_value) def apply(): assi.__enter__() assert cvar.value == "the default value" apply() assert cvar.value is new_value assi.__exit__() assert cvar.value == "the default value" Or similarly in an asynchronous context where ``await`` expressions are used. The subroutine can now be a coroutine:: assi = cvar.assign(new_value) async def apply(): assi.__enter__() assert cvar.value == "the default value" await apply() assert cvar.value is new_value assi.__exit__() assert cvar.value == "the default value" Or when the subroutine is a generator:: def apply(): yield assi.__enter__() which is called using ``yield from apply()`` or with calls to ``next`` or ``.send``. This is discussed further in later sections. Semantics for generators and generator-based coroutines ''''''''''''''''''''''''''''''''''''''''''''''''''''''' Generators, coroutines and async generators act as subroutines in much the same way that normal functions do. However, they have the additional possibility of being suspended by ``yield`` expressions. 
Assignment contexts entered inside a generator are normally preserved across yields:: def genfunc(): with cvar.assign(new_value): assert cvar.value is new_value yield assert cvar.value is new_value g = genfunc() next(g) assert cvar.value == "the default value" with cvar.assign(another_value): next(g) However, the outer context visible to the generator may change state across yields:: def genfunc(): assert cvar.value is value2 yield assert cvar.value is value1 yield with cvar.assign(value3): assert cvar.value is value3 with cvar.assign(value1): g = genfunc() with cvar.assign(value2): next(g) next(g) next(g) assert cvar.value is value1 Similar semantics apply to async generators defined by ``async def ... yield ...`` ). By default, values assigned inside a generator do not leak through yields to the code that drives the generator. However, the assignment contexts entered and left open inside the generator *do* become visible outside the generator after the generator has finished with a ``StopIteration`` or another exception:: assi = cvar.assign(new_value) def genfunc(): yield assi.__enter__(): yield g = genfunc() assert cvar.value == "the default value" next(g) assert cvar.value == "the default value" next(g) # assi.__enter__() is called here assert cvar.value == "the default value" next(g) assert cvar.value is new_value assi.__exit__() Special functionality for framework authors ------------------------------------------- Frameworks, such as ``asyncio`` or third-party libraries, can use additional functionality in ``contextvars`` to achieve the desired semantics in cases which are not determined by the Python interpreter. Some of the semantics described in this section are also afterwards used to describe the internal implementation. Leaking yields '''''''''''''' Using the ``contextvars.leaking_yields`` decorator, one can choose to leak the context through ``yield`` expressions into the outer context that drives the generator:: @contextvars.leaking_yields def genfunc(): assert cvar.value == "outer" with cvar.assign("inner"): yield assert cvar.value == "inner" assert cvar.value == "outer" g = genfunc(): with cvar.assign("outer"): assert cvar.value == "outer" next(g) assert cvar.value == "inner" next(g) assert cvar.value == "outer" Capturing contextvar assignments '''''''''''''''''''''''''''''''' Using ``contextvars.capture()``, one can capture the assignment contexts that are entered by a block of code. The changes applied by the block of code can then be reverted and subsequently reapplied, even in another context:: assert cvar1.value is None # default assert cvar2.value is None # default assi1 = cvar1.assign(value1) assi2 = cvar1.assign(value2) with contextvars.capture() as delta: assi1.__enter__() with cvar2.assign("not captured"): assert cvar2.value is "not captured" assi2.__enter__() assert cvar1.value is value2 delta.revert() assert cvar1.value is None assert cvar2.value is None ... with cvar1.assign(1), cvar2.assign(2): delta.reapply() assert cvar1.value is value2 assert cvar2.value == 2 However, reapplying the "delta" if its net contents include deassignments may not be possible (see also Implementation and Open Issues). Getting a snapshot of context state ''''''''''''''''''''''''''''''''''' The function ``contextvars.get_local_state()`` returns an object representing the applied assignments to all context-local variables in the context where the function is called. 
This can be seen as equivalent to using ``contextvars.capture()`` to capture all context changes from the beginning of execution. The returned object supports methods ``.revert()`` and ``reapply()`` as above. Running code in a clean state ''''''''''''''''''''''''''''' Although it is possible to revert all applied context changes using the above primitives, a more convenient way to run a block of code in a clean context is provided:: with context_vars.clean_context(): # here, all context vars start off with their default values # here, the state is back to what it was before the with block. Implementation -------------- This section describes to a variable level of detail how the described semantics can be implemented. At present, an implementation aimed at simplicity but sufficient features is described. More details will be added later. Alternatively, a somewhat more complicated implementation offers minor additional features while adding some performance overhead and requiring more code in the implementation. Data structures and implementation of the core concept '''''''''''''''''''''''''''''''''''''''''''''''''''''' Each thread of the Python interpreter keeps its on stack of ``contextvars.Assignment`` objects, each having a pointer to the previous (outer) assignment like in a linked list. The local state (also returned by ``contextvars.get_local_state()``) then consists of a reference to the top of the stack and a pointer/weak reference to the bottom of the stack. This allows efficient stack manipulations. An object produced by ``contextvars.capture()`` is similar, but refers to only a part of the stack with the bottom reference pointing to the top of the stack as it was in the beginning of the capture block. Now, the stack evolves according to the assignment ``__enter__`` and ``__exit__`` methods. For example:: cvar1 = contextvars.Var() cvar2 = contextvars.Var() # stack: [] assert cvar1.value is None assert cvar2.value is None with cvar1.assign("outer"): # stack: [Assignment(cvar1, "outer")] assert cvar1.value == "outer" with cvar1.assign("inner"): # stack: [Assignment(cvar1, "outer"), # Assignment(cvar1, "inner")] assert cvar1.value == "inner" with cvar2.assign("hello"): # stack: [Assignment(cvar1, "outer"), # Assignment(cvar1, "inner"), # Assignment(cvar2, "hello")] assert cvar2.value == "hello" # stack: [Assignment(cvar1, "outer"), # Assignment(cvar1, "inner")] assert cvar1.value == "inner" assert cvar2.value is None # stack: [Assignment(cvar1, "outer")] assert cvar1.value == "outer" # stack: [] assert cvar1.value is None assert cvar2.value is None Getting a value from the context using ``cvar1.value`` can be implemented as finding the topmost occurrence of a ``cvar1`` assignment on the stack and returning the value there, or the default value if no assignment is found on the stack. However, this can be optimized to instead be an O(1) operation in most cases. Still, even searching through the stack may be reasonably fast since these stacks are not intended to grow very large. The above description is already sufficient for implementing the core concept. Suspendable frames require some additional attention, as explained in the following. Implementation of generator and coroutine semantics ''''''''''''''''''''''''''''''''''''''''''''''''''' Within generators, coroutines and async generators, assignments and deassignments are handled in exactly the same way as anywhere else. 
However, some changes are needed in the builtin generator methods ``send``,
``__next__``, ``throw`` and ``close``. Here is the Python equivalent of the
changes needed in ``send`` for a generator (here ``_old_send`` refers to the
behavior in Python 3.6)::

    def send(self, value):
        # if decorated with contextvars.leaking_yields
        if self.gi_contextvars is LEAK:
            # nothing needs to be done to leak context through yields :)
            return self._old_send(value)
        try:
            with contextvars.capture() as delta:
                if self.gi_contextvars:
                    # non-zero captured content from previous iteration
                    self.gi_contextvars.reapply()
                ret = self._old_send(value)
        except Exception:
            raise
        else:
            # suspending: revert the context changes, but remember them
            # so they can be reapplied on the next resume
            delta.revert()
            self.gi_contextvars = delta
        return ret

The corresponding modifications to the other methods are essentially
identical. The same applies to coroutines and async generators.

For code that does not use ``contextvars``, the additions are O(1) and
essentially reduce to a couple of pointer comparisons. For code that does
use ``contextvars``, the additions are still O(1) in most cases.

More on implementation
''''''''''''''''''''''

The rest of the functionality, including ``contextvars.leaking_yields``,
``contextvars.capture()``, ``contextvars.get_local_state()`` and
``contextvars.clean_context()``, is in fact quite straightforward to
implement, but their implementation will be discussed further in later
versions of this proposal. Caching of assigned values is somewhat more
complicated, and will be discussed later, but it seems that most cases
should achieve O(1) complexity.

Backwards compatibility
=======================

There are no *direct* backwards-compatibility concerns, since a completely
new feature is proposed.

However, various traditional uses of thread-local storage may need a smooth
transition to ``contextvars`` so they can be concurrency-safe. There are
several approaches to this, including emulating task-local storage with a
little bit of help from async frameworks. A fully general implementation
cannot be provided, because the desired semantics may depend on the design
of the framework.

Another way to deal with the transition is for code to first look for a
context created using ``contextvars``. If that fails because a new-style
context has not been set or because the code runs on an older Python
version, a fallback to thread-local storage is used.

Open Issues
===========

Out-of-order de-assignments
---------------------------

In this proposal, all variable deassignments are made in the opposite order
compared to the preceding assignments. This has two useful properties: it
encourages using ``with`` statements to define assignment scope, and it has
a tendency to catch errors early (forgetting a ``.__exit__()`` call often
results in a meaningful error). To have this as a requirement is beneficial
also in terms of implementation simplicity and performance.

Nevertheless, allowing out-of-order context exits is not completely out of
the question, and reasonable implementation strategies for that do exist.

Rejected Ideas
==============

Dynamic scoping linked to subroutine scopes
-------------------------------------------

The scope of value visibility should not be determined by the way the code
is refactored into subroutines. It is necessary to have per-variable control
of the assignment scope.

Acknowledgements
================

To be added.


References
==========

To be added.
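
As a rough illustration of the thread-local fallback described under
Backwards compatibility above, a library could structure its transition
roughly as follows. This sketch uses the ``contextvars.Var`` API proposed
in this PEP (which does not exist yet); the ``request`` variable and the
``get_current_request()`` helper are purely illustrative::

    import threading

    try:
        import contextvars              # the module proposed in this PEP
    except ImportError:                 # older Python: no new-style contexts
        contextvars = None

    _tls = threading.local()            # legacy thread-local storage

    if contextvars is not None:
        _request_var = contextvars.Var(default=None,
                                       description="current request")

    def get_current_request():
        # Prefer the context-local value when the proposed module is
        # available and an assignment has been applied in this context;
        # otherwise fall back to the thread-local value.
        if contextvars is not None and _request_var.value is not None:
            return _request_var.value
        return getattr(_tls, "request", None)

Code that enters a request scope would then use
``with _request_var.assign(request): ...`` where available, and set
``_tls.request`` as before otherwise.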
-- + Koos Zevenhoven + http://twitter.com/k7hoven + -------------- next part -------------- An HTML attachment was scrubbed... URL: From phd at phdru.name Mon Sep 4 18:20:36 2017 From: phd at phdru.name (Oleg Broytman) Date: Tue, 5 Sep 2017 00:20:36 +0200 Subject: [Python-ideas] PEP draft: context variables In-Reply-To: References: Message-ID: <20170904222036.GA9043@phdru.name> Hi! On Tue, Sep 05, 2017 at 12:50:35AM +0300, Koos Zevenhoven wrote: > cvar = contextvars.Var(default="the default value", > description="example context variable") Why ``description`` and not ``doc``? > with cvar.assign(new_value): Why ``assign`` and not ``set``? > Each thread of the Python interpreter keeps its on stack of "its own", I think. > ``contextvars.Assignment`` objects, each having a pointer to the previous > (outer) assignment like in a linked list. Oleg. -- Oleg Broytman http://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN. From yselivanov.ml at gmail.com Mon Sep 4 18:36:41 2017 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Mon, 4 Sep 2017 15:36:41 -0700 Subject: [Python-ideas] PEP draft: context variables In-Reply-To: References: Message-ID: So every generator stores "captured" modifications. This is similar to PEP 550, which adds Logical Context to generators to store their EC modifications. The implementation is different, but the intent is the same. PEP 550 uses a stack of hash tables, this proposal has a linked list of Assignment objects. In the worst case, this proposal will have worse performance guarantees. It's hard to say more, because the implementation isn't described in full. With PEP 550 it's trivial to implement a context manager to control variable assignments. If we do that, how exactly this proposal is different? Can you list all semantical differences between this proposal and PEP 550? So far, it looks like if I call "var.assign(value).__enter__()" it would be equivalent to PEP 550's "var.set(value)". Yury On Mon, Sep 4, 2017 at 2:50 PM, Koos Zevenhoven wrote: > Hi all, > > as promised, here is a draft PEP for context variable semantics and > implementation. Apologies for the slight delay; I had a not-so-minor > autosave accident and had to retype the majority of this first draft. > > During the past years, there has been growing interest in something like > task-local storage or async-local storage. This PEP proposes an alternative > approach to solving the problems that are typically stated as motivation for > such concepts. > > This proposal is based on sketches of solutions since spring 2015, with some > minor influences from the recent discussion related to PEP 550. I can also > see some potential implementation synergy between this PEP and PEP 550, even > if the proposed semantics are quite different. > > So, here it is. This is the first draft and some things are still missing, > but the essential things should be there. > > -- Koos > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > > PEP: 999 > Title: Context-local variables (contextvars) > Version: $Revision$ > Last-Modified: $Date$ > Author: Koos Zevenhoven > Status: Draft > Type: Standards Track > Content-Type: text/x-rst > Created: DD-Mmm-YYYY > Post-History: DD-Mmm-YYYY > > > Abstract > ======== > > Sometimes, in special cases, it is desired that code can pass information > down the function call chain to the callees without having to explicitly > pass the information as arguments to each function in the call chain. 
This > proposal describes a construct which allows code to explicitly switch in and > out of a context where a certain context variable has a given value assigned > to it. This is a modern alternative to some uses of things like global > variables in traditional single-threaded (or thread-unsafe) code and of > thread-local storage in traditional *concurrency-unsafe* code (single- or > multi-threaded). In particular, the proposed mechanism can also be used with > more modern concurrent execution mechanisms such as asynchronously executed > coroutines, without the concurrently executed call chains interfering with > each other's contexts. > > The "call chain" can consist of normal functions, awaited coroutines, or > generators. The semantics of context variable scope are equivalent in all > cases, allowing code to be refactored freely into *subroutines* (which here > refers to functions, sub-generators or sub-coroutines) without affecting the > semantics of context variables. Regarding implementation, this proposal aims > at simplicity and minimum changes to the CPython interpreter and to other > Python interpreters. > > Rationale > ========= > > Consider a modern Python *call chain* (or call tree), which in this proposal > refers to any chained (nested) execution of *subroutines*, using any > possible combinations of normal function calls, or expressions using > ``await`` or ``yield from``. In some cases, passing necessary *information* > down the call chain as arguments can substantially complicate the required > function signatures, or it can even be impossible to achieve in practice. In > these cases, one may search for another place to store this information. Let > us look at some historical examples. > > The most naive option is to assign the value to a global variable or > similar, where the code down the call chain can access it. However, this > immediately makes the code thread-unsafe, because with multiple threads, all > threads assign to the same global variable, and another thread can interfere > at any point in the call chain. > > A somewhat less naive option is to store the information as per-thread > information in thread-local storage, where each thread has its own "copy" of > the variable which other threads cannot interfere with. Although non-ideal, > this has been the best solution in many cases. However, thanks to generators > and coroutines, the execution of the call chain can be suspended and > resumed, allowing code in other contexts to run concurrently. Therefore, > using thread-local storage is *concurrency-unsafe*, because other call > chains in other contexts may interfere with the thread-local variable. > > Note that in the above two historical approaches, the stored information has > the *widest* available scope without causing problems. For a third solution > along the same path, one would first define an equivalent of a "thread" for > asynchronous execution and concurrency. This could be seen as the largest > amount of code and nested calls that is guaranteed to be executed > sequentially without ambiguity in execution order. This might be referred to > as concurrency-local or task-local storage. In this meaning of "task", there > is no ambiguity in the order of execution of the code within one task. (This > concept of a task is close to equivalent to a ``Task`` in ``asyncio``, but > not exactly.) 
In such concurrency-locals, it is possible to pass information > down the call chain to callees without another code path interfering with > the value in the background. > > Common to the above approaches is that they indeed use variables with a wide > but just-narrow-enough scope. Thread-locals could also be called thread-wide > globals---in single-threaded code, they are indeed truly global. And > task-locals could be called task-wide globals, because tasks can be very > big. > > The issue here is that neither global variables, thread-locals nor > task-locals are really meant to be used for this purpose of passing > information of the execution context down the call chain. Instead of the > widest possible variable scope, the scope of the variables should be > controlled by the programmer, typically of a library, to have the desired > scope---not wider. In other words, task-local variables (and globals and > thread-locals) have nothing to do with the kind of context-bound information > passing that this proposal intends to enable, even if task-locals can be > used to emulate the desired semantics. Therefore, in the following, this > proposal describes the semantics and the outlines of an implementation for > *context-local variables* (or context variables, contextvars). In fact, as a > side effect of this PEP, an async framework can use the proposed feature to > implement task-local variables. > > Proposal > ======== > > Because the proposed semantics are not a direct extension to anything > already available in Python, this proposal is first described in terms of > semantics and API at a fairly high level. In particular, Python ``with`` > statements are heavily used in the description, as they are a good match > with the proposed semantics. However, the underlying ``__enter__`` and > ``__exit__`` methods correspond to functions in the lower-level > speed-optimized (C) API. For clarity of this document, the lower-level > functions are not explicitly named in the definition of the semantics. After > describing the semantics and high-level API, the implementation is > described, going to a lower level. > > Semantics and higher-level API > ------------------------------ > > Core concept > '''''''''''' > > A context-local variable is represented by a single instance of > ``contextvars.Var``, say ``cvar``. Any code that has access to the ``cvar`` > object can ask for its value with respect to the current context. In the > high-level API, this value is given by the ``cvar.value`` property:: > > cvar = contextvars.Var(default="the default value", > description="example context variable") > > assert cvar.value == "the default value" # default still applies > > # In code examples, all ``assert`` statements should > # succeed according to the proposed semantics. > > > No assignments to ``cvar`` have been applied for this context, so > ``cvar.value`` gives the default value. Assigning new values to contextvars > is done in a highly scope-aware manner:: > > with cvar.assign(new_value): > assert cvar.value is new_value > # Any code here, or down the call chain from here, sees: > # cvar.value is new_value > # unless another value has been assigned in a > # nested context > assert cvar.value is new_value > # the assignment of ``cvar`` to ``new_value`` is no longer visible > assert cvar.value == "the default value" > > > Here, ``cvar.assign(value)`` returns another object, namely > ``contextvars.Assignment(cvar, new_value)``. 
The essential part here is that > applying a context variable assignment (``Assignment.__enter__``) is paired > with a de-assignment (``Assignment.__exit__``). These operations set the > bounds for the scope of the assigned value. > > Assignments to the same context variable can be nested to override the outer > assignment in a narrower context:: > > assert cvar.value == "the default value" > with cvar.assign("outer"): > assert cvar.value == "outer" > with cvar.assign("inner"): > assert cvar.value == "inner" > assert cvar.value == "outer" > assert cvar.value == "the default value" > > > Also multiple variables can be assigned to in a nested manner without > affecting each other:: > > cvar1 = contextvars.Var() > cvar2 = contextvars.Var() > > assert cvar1.value is None # default is None by default > assert cvar2.value is None > > with cvar1.assign(value1): > assert cvar1.value is value1 > assert cvar2.value is None > with cvar2.assign(value2): > assert cvar1.value is value1 > assert cvar2.value is value2 > assert cvar1.value is value1 > assert cvar2.value is None > assert cvar1.value is None > assert cvar2.value is None > > Or with more convenient Python syntax:: > > with cvar1.assign(value1), cvar2.assign(value2): > assert cvar1.value is value1 > assert cvar2.value is value2 > > In another *context*, in another thread or otherwise concurrently executed > task or code path, the context variables can have a completely different > state. The programmer thus only needs to worry about the context at hand. > > Refactoring into subroutines > '''''''''''''''''''''''''''' > > Code using contextvars can be refactored into subroutines without affecting > the semantics. For instance:: > > assi = cvar.assign(new_value) > def apply(): > assi.__enter__() > assert cvar.value == "the default value" > apply() > assert cvar.value is new_value > assi.__exit__() > assert cvar.value == "the default value" > > > Or similarly in an asynchronous context where ``await`` expressions are > used. The subroutine can now be a coroutine:: > > assi = cvar.assign(new_value) > async def apply(): > assi.__enter__() > assert cvar.value == "the default value" > await apply() > assert cvar.value is new_value > assi.__exit__() > assert cvar.value == "the default value" > > > Or when the subroutine is a generator:: > > def apply(): > yield > assi.__enter__() > > > which is called using ``yield from apply()`` or with calls to ``next`` or > ``.send``. This is discussed further in later sections. > > Semantics for generators and generator-based coroutines > ''''''''''''''''''''''''''''''''''''''''''''''''''''''' > > Generators, coroutines and async generators act as subroutines in much the > same way that normal functions do. However, they have the additional > possibility of being suspended by ``yield`` expressions. 
Assignment contexts > entered inside a generator are normally preserved across yields:: > > def genfunc(): > with cvar.assign(new_value): > assert cvar.value is new_value > yield > assert cvar.value is new_value > g = genfunc() > next(g) > assert cvar.value == "the default value" > with cvar.assign(another_value): > next(g) > > > However, the outer context visible to the generator may change state across > yields:: > > def genfunc(): > assert cvar.value is value2 > yield > assert cvar.value is value1 > yield > with cvar.assign(value3): > assert cvar.value is value3 > > with cvar.assign(value1): > g = genfunc() > with cvar.assign(value2): > next(g) > next(g) > next(g) > assert cvar.value is value1 > > > Similar semantics apply to async generators defined by ``async def ... yield > ...`` ). > > By default, values assigned inside a generator do not leak through yields to > the code that drives the generator. However, the assignment contexts entered > and left open inside the generator *do* become visible outside the generator > after the generator has finished with a ``StopIteration`` or another > exception:: > > assi = cvar.assign(new_value) > def genfunc(): > yield > assi.__enter__(): > yield > > g = genfunc() > assert cvar.value == "the default value" > next(g) > assert cvar.value == "the default value" > next(g) # assi.__enter__() is called here > assert cvar.value == "the default value" > next(g) > assert cvar.value is new_value > assi.__exit__() > > > > Special functionality for framework authors > ------------------------------------------- > > Frameworks, such as ``asyncio`` or third-party libraries, can use additional > functionality in ``contextvars`` to achieve the desired semantics in cases > which are not determined by the Python interpreter. Some of the semantics > described in this section are also afterwards used to describe the internal > implementation. > > Leaking yields > '''''''''''''' > > Using the ``contextvars.leaking_yields`` decorator, one can choose to leak > the context through ``yield`` expressions into the outer context that drives > the generator:: > > @contextvars.leaking_yields > def genfunc(): > assert cvar.value == "outer" > with cvar.assign("inner"): > yield > assert cvar.value == "inner" > assert cvar.value == "outer" > > g = genfunc(): > with cvar.assign("outer"): > assert cvar.value == "outer" > next(g) > assert cvar.value == "inner" > next(g) > assert cvar.value == "outer" > > > Capturing contextvar assignments > '''''''''''''''''''''''''''''''' > > Using ``contextvars.capture()``, one can capture the assignment contexts > that are entered by a block of code. The changes applied by the block of > code can then be reverted and subsequently reapplied, even in another > context:: > > assert cvar1.value is None # default > assert cvar2.value is None # default > assi1 = cvar1.assign(value1) > assi2 = cvar1.assign(value2) > with contextvars.capture() as delta: > assi1.__enter__() > with cvar2.assign("not captured"): > assert cvar2.value is "not captured" > assi2.__enter__() > assert cvar1.value is value2 > delta.revert() > assert cvar1.value is None > assert cvar2.value is None > ... > with cvar1.assign(1), cvar2.assign(2): > delta.reapply() > assert cvar1.value is value2 > assert cvar2.value == 2 > > > However, reapplying the "delta" if its net contents include deassignments > may not be possible (see also Implementation and Open Issues). 
> > > Getting a snapshot of context state > ''''''''''''''''''''''''''''''''''' > > The function ``contextvars.get_local_state()`` returns an object > representing the applied assignments to all context-local variables in the > context where the function is called. This can be seen as equivalent to > using ``contextvars.capture()`` to capture all context changes from the > beginning of execution. The returned object supports methods ``.revert()`` > and ``reapply()`` as above. > > > Running code in a clean state > ''''''''''''''''''''''''''''' > > Although it is possible to revert all applied context changes using the > above primitives, a more convenient way to run a block of code in a clean > context is provided:: > > with context_vars.clean_context(): > # here, all context vars start off with their default values > # here, the state is back to what it was before the with block. > > > Implementation > -------------- > > This section describes to a variable level of detail how the described > semantics can be implemented. At present, an implementation aimed at > simplicity but sufficient features is described. More details will be added > later. > > Alternatively, a somewhat more complicated implementation offers minor > additional features while adding some performance overhead and requiring > more code in the implementation. > > Data structures and implementation of the core concept > '''''''''''''''''''''''''''''''''''''''''''''''''''''' > > Each thread of the Python interpreter keeps its on stack of > ``contextvars.Assignment`` objects, each having a pointer to the previous > (outer) assignment like in a linked list. The local state (also returned by > ``contextvars.get_local_state()``) then consists of a reference to the top > of the stack and a pointer/weak reference to the bottom of the stack. This > allows efficient stack manipulations. An object produced by > ``contextvars.capture()`` is similar, but refers to only a part of the stack > with the bottom reference pointing to the top of the stack as it was in the > beginning of the capture block. > > Now, the stack evolves according to the assignment ``__enter__`` and > ``__exit__`` methods. For example:: > > cvar1 = contextvars.Var() > cvar2 = contextvars.Var() > # stack: [] > assert cvar1.value is None > assert cvar2.value is None > > with cvar1.assign("outer"): > # stack: [Assignment(cvar1, "outer")] > assert cvar1.value == "outer" > > with cvar1.assign("inner"): > # stack: [Assignment(cvar1, "outer"), > # Assignment(cvar1, "inner")] > assert cvar1.value == "inner" > > with cvar2.assign("hello"): > # stack: [Assignment(cvar1, "outer"), > # Assignment(cvar1, "inner"), > # Assignment(cvar2, "hello")] > assert cvar2.value == "hello" > > # stack: [Assignment(cvar1, "outer"), > # Assignment(cvar1, "inner")] > assert cvar1.value == "inner" > assert cvar2.value is None > > # stack: [Assignment(cvar1, "outer")] > assert cvar1.value == "outer" > > # stack: [] > assert cvar1.value is None > assert cvar2.value is None > > > Getting a value from the context using ``cvar1.value`` can be implemented as > finding the topmost occurrence of a ``cvar1`` assignment on the stack and > returning the value there, or the default value if no assignment is found on > the stack. However, this can be optimized to instead be an O(1) operation in > most cases. Still, even searching through the stack may be reasonably fast > since these stacks are not intended to grow very large. 
> > The above description is already sufficient for implementing the core > concept. Suspendable frames require some additional attention, as explained > in the following. > > Implementation of generator and coroutine semantics > ''''''''''''''''''''''''''''''''''''''''''''''''''' > > Within generators, coroutines and async generators, assignments and > deassignments are handled in exactly the same way as anywhere else. However, > some changes are needed in the builtin generator methods ``send``, > ``__next__``, ``throw`` and ``close``. Here is the Python equivalent of the > changes needed in ``send`` for a generator (here ``_old_send`` refers to the > behavior in Python 3.6):: > > def send(self, value): > # if decorated with contextvars.leaking_yields > if self.gi_contextvars is LEAK: > # nothing needs to be done to leak context through yields :) > return self._old_send(value) > try: > with contextvars.capture() as delta: > if self.gi_contextvars: > # non-zero captured content from previous iteration > self.gi_contextvars.reapply() > ret = self._old_send(value) > except Exception: > raise > else: > # suspending, revert context changes but > delta.revert() > self.gi_contextvars = delta > return ret > > > The corresponding modifications to the other methods is essentially > identical. The same applies to coroutines and async generators. > > For code that does not use ``contextvars``, the additions are O(1) and > essentially reduce to a couple of pointer comparisons. For code that does > use ``contextvars``, the additions are still O(1) in most cases. > > More on implementation > '''''''''''''''''''''' > > The rest of the functionality, including ``contextvars.leaking_yields``, > contextvars.capture()``, ``contextvars.get_local_state()`` and > ``contextvars.clean_context()`` are in fact quite straightforward to > implement, but their implementation will be discussed further in later > versions of this proposal. Caching of assigned values is somewhat more > complicated, and will be discussed later, but it seems that most cases > should achieve O(1) complexity. > > Backwards compatibility > ======================= > > There are no *direct* backwards-compatibility concerns, since a completely > new feature is proposed. > > However, various traditional uses of thread-local storage may need a smooth > transition to ``contextvars`` so they can be concurrency-safe. There are > several approaches to this, including emulating task-local storage with a > little bit of help from async frameworks. A fully general implementation > cannot be provided, because the desired semantics may depend on the design > of the framework. > > Another way to deal with the transition is for code to first look for a > context created using ``contextvars``. If that fails because a new-style > context has not been set or because the code runs on an older Python > version, a fallback to thread-local storage is used. > > > Open Issues > =========== > > Out-of-order de-assignments > --------------------------- > > In this proposal, all variable deassignments are made in the opposite order > compared to the preceding assignments. This has two useful properties: it > encourages using ``with`` statements to define assignment scope and has a > tendency to catch errors early (forgetting a ``.__exit__()`` call often > results in a meaningful error. To have this as a requirement requirement is > beneficial also in terms of implementation simplicity and performance. 
> Nevertheless, allowing out-of-order context exits is not completely out of > the question, and reasonable implementation strategies for that do exist. > > Rejected Ideas > ============== > > Dynamic scoping linked to subroutine scopes > ------------------------------------------- > > The scope of value visibility should not be determined by the way the code > is refactored into subroutines. It is necessary to have per-variable control > of the assignment scope. > > Acknowledgements > ================ > > To be added. > > > References > ========== > > To be added. > > > -- > + Koos Zevenhoven + http://twitter.com/k7hoven + > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > From njs at pobox.com Mon Sep 4 20:49:56 2017 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 4 Sep 2017 17:49:56 -0700 Subject: [Python-ideas] PEP draft: context variables In-Reply-To: References: Message-ID: On Mon, Sep 4, 2017 at 2:50 PM, Koos Zevenhoven wrote: > Hi all, > > as promised, here is a draft PEP for context variable semantics and > implementation. Apologies for the slight delay; I had a not-so-minor > autosave accident and had to retype the majority of this first draft. > > During the past years, there has been growing interest in something like > task-local storage or async-local storage. This PEP proposes an alternative > approach to solving the problems that are typically stated as motivation for > such concepts. >From a quick skim, my impression is: All the high-level semantics you suggest make sense... in fact, AFAICT they're exactly the same semantics we've been using as a litmus test for PEP 550. I think PEP 550 is sufficient to allow implementing all your proposed APIs (and that if it isn't, that's a bug in PEP 550). OTOH, your proposal doesn't provide any way to implement functions like decimal.setcontext or numpy.seterr, except by pushing a new state and never popping it, which leaks memory and permanently increases the N in the O(N) lookups. I didn't see any direct comparison with PEP 550 in your text (maybe I missed it). Why do you think this approach would be better than what's in PEP 550? -n -- Nathaniel J. Smith -- https://vorpus.org From g.rodola at gmail.com Mon Sep 4 23:59:43 2017 From: g.rodola at gmail.com (Giampaolo Rodola') Date: Tue, 5 Sep 2017 11:59:43 +0800 Subject: [Python-ideas] Add os.usable_cpu_count() Message-ID: Recently os.cpu_count() on Windows has been fixed in order to take process groups into account and return the number of all available CPUs: http://bugs.python.org/issue30581 This made me realize that os.cpu_count() does not return the number of *usable* CPUs, which could possibly represent a better default value for things like multiprocessing and process pools. It is currently possible to retrieve this info on UNIX with len(os.sched_getaffinity(0)) which takes CPU affinity and (I think) Linux cgroups into account, but it's not possible to do the same on Windows which provides this value via GetActiveProcessorCount() API. As such I am planning to implement this in psutil but would like to know how python-ideas feels about adding a new os.usable_cpu_count() function (or having os.cpu_count(usable=True)). Thoughts? -- Giampaolo - http://grodola.blogspot.com -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From njs at pobox.com Tue Sep 5 00:59:42 2017 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 4 Sep 2017 21:59:42 -0700 Subject: [Python-ideas] Add os.usable_cpu_count() In-Reply-To: References: Message-ID: On Mon, Sep 4, 2017 at 8:59 PM, Giampaolo Rodola' wrote: > Recently os.cpu_count() on Windows has been fixed in order to take process > groups into account and return the number of all available CPUs: > http://bugs.python.org/issue30581 > This made me realize that os.cpu_count() does not return the number of > *usable* CPUs, which could possibly represent a better default value for > things like multiprocessing and process pools. > It is currently possible to retrieve this info on UNIX with > len(os.sched_getaffinity(0)) which takes CPU affinity and (I think) Linux > cgroups into account, but it's not possible to do the same on Windows which > provides this value via GetActiveProcessorCount() API. > As such I am planning to implement this in psutil but would like to know how > python-ideas feels about adding a new os.usable_cpu_count() function (or > having os.cpu_count(usable=True)). This was discussed in https://bugs.python.org/issue23530 It looks like the resolution at that time was: - os.cpu_count() should *not* report the number of CPUs accessible to the current process, but rather continue to report the number of CPUs that exist in the system (whatever that means in these days of virtualization... e.g. if you use KVM to set up a virtual machine with limited CPUs, then that changes os.cpu_count, but if you do it with docker then that doesn't). - multiprocessing and similar should continue to treat os.cpu_count() as if it returned the number of CPUs accessible to the current process. Possibly some lines got crossed there... -n -- Nathaniel J. Smith -- https://vorpus.org From g.rodola at gmail.com Tue Sep 5 02:43:44 2017 From: g.rodola at gmail.com (Giampaolo Rodola') Date: Tue, 5 Sep 2017 14:43:44 +0800 Subject: [Python-ideas] Add os.usable_cpu_count() In-Reply-To: References: Message-ID: On Tue, Sep 5, 2017 at 12:59 PM, Nathaniel Smith wrote: > On Mon, Sep 4, 2017 at 8:59 PM, Giampaolo Rodola' > wrote: > > Recently os.cpu_count() on Windows has been fixed in order to take > process > > groups into account and return the number of all available CPUs: > > http://bugs.python.org/issue30581 > > This made me realize that os.cpu_count() does not return the number of > > *usable* CPUs, which could possibly represent a better default value for > > things like multiprocessing and process pools. > > It is currently possible to retrieve this info on UNIX with > > len(os.sched_getaffinity(0)) which takes CPU affinity and (I think) Linux > > cgroups into account, but it's not possible to do the same on Windows > which > > provides this value via GetActiveProcessorCount() API. > > As such I am planning to implement this in psutil but would like to know > how > > python-ideas feels about adding a new os.usable_cpu_count() function (or > > having os.cpu_count(usable=True)). > > This was discussed in https://bugs.python.org/issue23530 > > It looks like the resolution at that time was: > > - os.cpu_count() should *not* report the number of CPUs accessible to > the current process, but rather continue to report the number of CPUs > that exist in the system (whatever that means in these days of > virtualization... e.g. if you use KVM to set up a virtual machine with > limited CPUs, then that changes os.cpu_count, but if you do it with > docker then that doesn't). 
> > - multiprocessing and similar should continue to treat os.cpu_count() > as if it returned the number of CPUs accessible to the current > process. > > Possibly some lines got crossed there... > > -n > > -- > Nathaniel J. Smith -- https://vorpus.org > I agree current os.cpu_count() behavior should remain unchanged. Point is multiprocessing & similar are currently not taking CPU affinity and Linux cgroups into account, spawning more processes than necessary. -- Giampaolo - http://grodola.blogspot.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From pavol.lisy at gmail.com Tue Sep 5 03:43:06 2017 From: pavol.lisy at gmail.com (Pavol Lisy) Date: Tue, 5 Sep 2017 09:43:06 +0200 Subject: [Python-ideas] PEP draft: context variables In-Reply-To: References: Message-ID: On 9/4/17, Koos Zevenhoven wrote: > Core concept > '''''''''''' > > A context-local variable is represented by a single instance of > ``contextvars.Var``, say ``cvar``. Any code that has access to the ``cvar`` > object can ask for its value with respect to the current context. In the > high-level API, this value is given by the ``cvar.value`` property:: > > cvar = contextvars.Var(default="the default value", > description="example context variable") > > assert cvar.value == "the default value" # default still applies > > # In code examples, all ``assert`` statements should > # succeed according to the proposed semantics. > > > No assignments to ``cvar`` have been applied for this context, so > ``cvar.value`` gives the default value. Assigning new values to contextvars > is done in a highly scope-aware manner:: > > with cvar.assign(new_value): > assert cvar.value is new_value > # Any code here, or down the call chain from here, sees: > # cvar.value is new_value > # unless another value has been assigned in a > # nested context > assert cvar.value is new_value > # the assignment of ``cvar`` to ``new_value`` is no longer visible > assert cvar.value == "the default value" I feel that of "is" and "==" in assert statements in this PEP has to be used (or described) more precisely. What if new_value above is 123456789? maybe using something like could be better? -> def equals(a, b): return a is b or a == b Doesn't PEP need to think about something like "context level overflow" ? Or members like: cvar.level ? From k7hoven at gmail.com Tue Sep 5 09:53:52 2017 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Tue, 5 Sep 2017 16:53:52 +0300 Subject: [Python-ideas] PEP draft: context variables In-Reply-To: <20170904222036.GA9043@phdru.name> References: <20170904222036.GA9043@phdru.name> Message-ID: On Tue, Sep 5, 2017 at 1:20 AM, Oleg Broytman wrote: > Hi! > > On Tue, Sep 05, 2017 at 12:50:35AM +0300, Koos Zevenhoven < > k7hoven at gmail.com> wrote: > > cvar = contextvars.Var(default="the default value", > > description="example context variable") > > Why ``description`` and not ``doc``? > > ?Cause that's a nice thing to bikeshed about? In fact, I probably should have left it out at this point. Really, it's just to get a meaningful repr for the object and better error messages, without any significance for the substance of the PEP. There are also concepts in the PEP that don't have a name yet. > > with cvar.assign(new_value): > > Why ``assign`` and not ``set``? > ?To distinguish from typical set-operations (setattr, setitem), and from sets and from settings. I would rather enter an "assignment context" than a "set context" or "setting context". 
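To make the naming point concrete, with the API from this draft one enters and
exits an assignment, rather than setting a value that silently stays around.
This is only an illustrative sketch; the set()-style line in the comment is
the hypothetical alternative being contrasted, not part of this proposal:

    import contextvars  # the module proposed in this draft

    cvar = contextvars.Var(default="default", description="naming example")

    with cvar.assign("temporary"):       # enter an assignment context
        assert cvar.value == "temporary"
    assert cvar.value == "default"       # the assignment ended with the block

    # A set-style API would instead leave the new value visible to all
    # later code in the same context until something explicitly resets it:
    #     cvar.set("temporary")
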
One key point of this PEP is to promote defining context variable scopes on a per-variable (and per-value) basis. I combined the variable and value aspects in this concept of Assignment(variable, value) objects, which define a context that one can enter and exit. > > Each thread of the Python interpreter keeps its on stack of > > "its own", I think. > ?That's right, thanks.? ???Koos? -- + Koos Zevenhoven + http://twitter.com/k7hoven + -------------- next part -------------- An HTML attachment was scrubbed... URL: From mistersheik at gmail.com Tue Sep 5 10:42:25 2017 From: mistersheik at gmail.com (Neil Girdhar) Date: Tue, 5 Sep 2017 07:42:25 -0700 (PDT) Subject: [Python-ideas] PEP draft: context variables In-Reply-To: References: Message-ID: <24540851-9f0d-40ee-9b12-322fc0b4053e@googlegroups.com> On Monday, September 4, 2017 at 6:37:44 PM UTC-4, Yury Selivanov wrote: > > So every generator stores "captured" modifications. This is similar > to PEP 550, which adds Logical Context to generators to store their EC > modifications. The implementation is different, but the intent is the > same. > > PEP 550 uses a stack of hash tables, this proposal has a linked list > of Assignment objects. In the worst case, this proposal will have > worse performance guarantees. It's hard to say more, because the > implementation isn't described in full. > > With PEP 550 it's trivial to implement a context manager to control > variable assignments. If we do that, how exactly this proposal is > different? Can you list all semantical differences between this > proposal and PEP 550? > > So far, it looks like if I call "var.assign(value).__enter__()" it > would be equivalent to PEP 550's "var.set(value)". > I think you really should add a context manager to PEP 550 since it is better than calling "set", which leaks state. Nathaniel is right that you need set to support legacy numpy methods like seterr. Had there been a way of setting context variables using a context manager, then numpy would only have had to implement the "errstate" context manager on top of it. There would have been no need for seterr, which leaks state between code blocks and is error-prone. > > Yury > > On Mon, Sep 4, 2017 at 2:50 PM, Koos Zevenhoven > wrote: > > Hi all, > > > > as promised, here is a draft PEP for context variable semantics and > > implementation. Apologies for the slight delay; I had a not-so-minor > > autosave accident and had to retype the majority of this first draft. > > > > During the past years, there has been growing interest in something like > > task-local storage or async-local storage. This PEP proposes an > alternative > > approach to solving the problems that are typically stated as motivation > for > > such concepts. > > > > This proposal is based on sketches of solutions since spring 2015, with > some > > minor influences from the recent discussion related to PEP 550. I can > also > > see some potential implementation synergy between this PEP and PEP 550, > even > > if the proposed semantics are quite different. > > > > So, here it is. This is the first draft and some things are still > missing, > > but the essential things should be there. 
> > > > -- Koos > > > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > > > > PEP: 999 > > Title: Context-local variables (contextvars) > > Version: $Revision$ > > Last-Modified: $Date$ > > Author: Koos Zevenhoven > > Status: Draft > > Type: Standards Track > > Content-Type: text/x-rst > > Created: DD-Mmm-YYYY > > Post-History: DD-Mmm-YYYY > > > > > > Abstract > > ======== > > > > Sometimes, in special cases, it is desired that code can pass > information > > down the function call chain to the callees without having to explicitly > > pass the information as arguments to each function in the call chain. > This > > proposal describes a construct which allows code to explicitly switch in > and > > out of a context where a certain context variable has a given value > assigned > > to it. This is a modern alternative to some uses of things like global > > variables in traditional single-threaded (or thread-unsafe) code and of > > thread-local storage in traditional *concurrency-unsafe* code (single- > or > > multi-threaded). In particular, the proposed mechanism can also be used > with > > more modern concurrent execution mechanisms such as asynchronously > executed > > coroutines, without the concurrently executed call chains interfering > with > > each other's contexts. > > > > The "call chain" can consist of normal functions, awaited coroutines, or > > generators. The semantics of context variable scope are equivalent in > all > > cases, allowing code to be refactored freely into *subroutines* (which > here > > refers to functions, sub-generators or sub-coroutines) without affecting > the > > semantics of context variables. Regarding implementation, this proposal > aims > > at simplicity and minimum changes to the CPython interpreter and to > other > > Python interpreters. > > > > Rationale > > ========= > > > > Consider a modern Python *call chain* (or call tree), which in this > proposal > > refers to any chained (nested) execution of *subroutines*, using any > > possible combinations of normal function calls, or expressions using > > ``await`` or ``yield from``. In some cases, passing necessary > *information* > > down the call chain as arguments can substantially complicate the > required > > function signatures, or it can even be impossible to achieve in > practice. In > > these cases, one may search for another place to store this information. > Let > > us look at some historical examples. > > > > The most naive option is to assign the value to a global variable or > > similar, where the code down the call chain can access it. However, this > > immediately makes the code thread-unsafe, because with multiple threads, > all > > threads assign to the same global variable, and another thread can > interfere > > at any point in the call chain. > > > > A somewhat less naive option is to store the information as per-thread > > information in thread-local storage, where each thread has its own > "copy" of > > the variable which other threads cannot interfere with. Although > non-ideal, > > this has been the best solution in many cases. However, thanks to > generators > > and coroutines, the execution of the call chain can be suspended and > > resumed, allowing code in other contexts to run concurrently. Therefore, > > using thread-local storage is *concurrency-unsafe*, because other call > > chains in other contexts may interfere with the thread-local variable. 
> > > > Note that in the above two historical approaches, the stored information > has > > the *widest* available scope without causing problems. For a third > solution > > along the same path, one would first define an equivalent of a "thread" > for > > asynchronous execution and concurrency. This could be seen as the > largest > > amount of code and nested calls that is guaranteed to be executed > > sequentially without ambiguity in execution order. This might be > referred to > > as concurrency-local or task-local storage. In this meaning of "task", > there > > is no ambiguity in the order of execution of the code within one task. > (This > > concept of a task is close to equivalent to a ``Task`` in ``asyncio``, > but > > not exactly.) In such concurrency-locals, it is possible to pass > information > > down the call chain to callees without another code path interfering > with > > the value in the background. > > > > Common to the above approaches is that they indeed use variables with a > wide > > but just-narrow-enough scope. Thread-locals could also be called > thread-wide > > globals---in single-threaded code, they are indeed truly global. And > > task-locals could be called task-wide globals, because tasks can be very > > big. > > > > The issue here is that neither global variables, thread-locals nor > > task-locals are really meant to be used for this purpose of passing > > information of the execution context down the call chain. Instead of the > > widest possible variable scope, the scope of the variables should be > > controlled by the programmer, typically of a library, to have the > desired > > scope---not wider. In other words, task-local variables (and globals and > > thread-locals) have nothing to do with the kind of context-bound > information > > passing that this proposal intends to enable, even if task-locals can be > > used to emulate the desired semantics. Therefore, in the following, this > > proposal describes the semantics and the outlines of an implementation > for > > *context-local variables* (or context variables, contextvars). In fact, > as a > > side effect of this PEP, an async framework can use the proposed feature > to > > implement task-local variables. > > > > Proposal > > ======== > > > > Because the proposed semantics are not a direct extension to anything > > already available in Python, this proposal is first described in terms > of > > semantics and API at a fairly high level. In particular, Python ``with`` > > statements are heavily used in the description, as they are a good match > > with the proposed semantics. However, the underlying ``__enter__`` and > > ``__exit__`` methods correspond to functions in the lower-level > > speed-optimized (C) API. For clarity of this document, the lower-level > > functions are not explicitly named in the definition of the semantics. > After > > describing the semantics and high-level API, the implementation is > > described, going to a lower level. > > > > Semantics and higher-level API > > ------------------------------ > > > > Core concept > > '''''''''''' > > > > A context-local variable is represented by a single instance of > > ``contextvars.Var``, say ``cvar``. Any code that has access to the > ``cvar`` > > object can ask for its value with respect to the current context. 
In the > > high-level API, this value is given by the ``cvar.value`` property:: > > > > cvar = contextvars.Var(default="the default value", > > description="example context variable") > > > > assert cvar.value == "the default value" # default still applies > > > > # In code examples, all ``assert`` statements should > > # succeed according to the proposed semantics. > > > > > > No assignments to ``cvar`` have been applied for this context, so > > ``cvar.value`` gives the default value. Assigning new values to > contextvars > > is done in a highly scope-aware manner:: > > > > with cvar.assign(new_value): > > assert cvar.value is new_value > > # Any code here, or down the call chain from here, sees: > > # cvar.value is new_value > > # unless another value has been assigned in a > > # nested context > > assert cvar.value is new_value > > # the assignment of ``cvar`` to ``new_value`` is no longer visible > > assert cvar.value == "the default value" > > > > > > Here, ``cvar.assign(value)`` returns another object, namely > > ``contextvars.Assignment(cvar, new_value)``. The essential part here is > that > > applying a context variable assignment (``Assignment.__enter__``) is > paired > > with a de-assignment (``Assignment.__exit__``). These operations set the > > bounds for the scope of the assigned value. > > > > Assignments to the same context variable can be nested to override the > outer > > assignment in a narrower context:: > > > > assert cvar.value == "the default value" > > with cvar.assign("outer"): > > assert cvar.value == "outer" > > with cvar.assign("inner"): > > assert cvar.value == "inner" > > assert cvar.value == "outer" > > assert cvar.value == "the default value" > > > > > > Also multiple variables can be assigned to in a nested manner without > > affecting each other:: > > > > cvar1 = contextvars.Var() > > cvar2 = contextvars.Var() > > > > assert cvar1.value is None # default is None by default > > assert cvar2.value is None > > > > with cvar1.assign(value1): > > assert cvar1.value is value1 > > assert cvar2.value is None > > with cvar2.assign(value2): > > assert cvar1.value is value1 > > assert cvar2.value is value2 > > assert cvar1.value is value1 > > assert cvar2.value is None > > assert cvar1.value is None > > assert cvar2.value is None > > > > Or with more convenient Python syntax:: > > > > with cvar1.assign(value1), cvar2.assign(value2): > > assert cvar1.value is value1 > > assert cvar2.value is value2 > > > > In another *context*, in another thread or otherwise concurrently > executed > > task or code path, the context variables can have a completely different > > state. The programmer thus only needs to worry about the context at > hand. > > > > Refactoring into subroutines > > '''''''''''''''''''''''''''' > > > > Code using contextvars can be refactored into subroutines without > affecting > > the semantics. For instance:: > > > > assi = cvar.assign(new_value) > > def apply(): > > assi.__enter__() > > assert cvar.value == "the default value" > > apply() > > assert cvar.value is new_value > > assi.__exit__() > > assert cvar.value == "the default value" > > > > > > Or similarly in an asynchronous context where ``await`` expressions are > > used. 
The subroutine can now be a coroutine:: > > > > assi = cvar.assign(new_value) > > async def apply(): > > assi.__enter__() > > assert cvar.value == "the default value" > > await apply() > > assert cvar.value is new_value > > assi.__exit__() > > assert cvar.value == "the default value" > > > > > > Or when the subroutine is a generator:: > > > > def apply(): > > yield > > assi.__enter__() > > > > > > which is called using ``yield from apply()`` or with calls to ``next`` > or > > ``.send``. This is discussed further in later sections. > > > > Semantics for generators and generator-based coroutines > > ''''''''''''''''''''''''''''''''''''''''''''''''''''''' > > > > Generators, coroutines and async generators act as subroutines in much > the > > same way that normal functions do. However, they have the additional > > possibility of being suspended by ``yield`` expressions. Assignment > contexts > > entered inside a generator are normally preserved across yields:: > > > > def genfunc(): > > with cvar.assign(new_value): > > assert cvar.value is new_value > > yield > > assert cvar.value is new_value > > g = genfunc() > > next(g) > > assert cvar.value == "the default value" > > with cvar.assign(another_value): > > next(g) > > > > > > However, the outer context visible to the generator may change state > across > > yields:: > > > > def genfunc(): > > assert cvar.value is value2 > > yield > > assert cvar.value is value1 > > yield > > with cvar.assign(value3): > > assert cvar.value is value3 > > > > with cvar.assign(value1): > > g = genfunc() > > with cvar.assign(value2): > > next(g) > > next(g) > > next(g) > > assert cvar.value is value1 > > > > > > Similar semantics apply to async generators defined by ``async def ... > yield > > ...`` ). > > > > By default, values assigned inside a generator do not leak through > yields to > > the code that drives the generator. However, the assignment contexts > entered > > and left open inside the generator *do* become visible outside the > generator > > after the generator has finished with a ``StopIteration`` or another > > exception:: > > > > assi = cvar.assign(new_value) > > def genfunc(): > > yield > > assi.__enter__(): > > yield > > > > g = genfunc() > > assert cvar.value == "the default value" > > next(g) > > assert cvar.value == "the default value" > > next(g) # assi.__enter__() is called here > > assert cvar.value == "the default value" > > next(g) > > assert cvar.value is new_value > > assi.__exit__() > > > > > > > > Special functionality for framework authors > > ------------------------------------------- > > > > Frameworks, such as ``asyncio`` or third-party libraries, can use > additional > > functionality in ``contextvars`` to achieve the desired semantics in > cases > > which are not determined by the Python interpreter. Some of the > semantics > > described in this section are also afterwards used to describe the > internal > > implementation. 
> > > > Leaking yields > > '''''''''''''' > > > > Using the ``contextvars.leaking_yields`` decorator, one can choose to > leak > > the context through ``yield`` expressions into the outer context that > drives > > the generator:: > > > > @contextvars.leaking_yields > > def genfunc(): > > assert cvar.value == "outer" > > with cvar.assign("inner"): > > yield > > assert cvar.value == "inner" > > assert cvar.value == "outer" > > > > g = genfunc(): > > with cvar.assign("outer"): > > assert cvar.value == "outer" > > next(g) > > assert cvar.value == "inner" > > next(g) > > assert cvar.value == "outer" > > > > > > Capturing contextvar assignments > > '''''''''''''''''''''''''''''''' > > > > Using ``contextvars.capture()``, one can capture the assignment contexts > > that are entered by a block of code. The changes applied by the block of > > code can then be reverted and subsequently reapplied, even in another > > context:: > > > > assert cvar1.value is None # default > > assert cvar2.value is None # default > > assi1 = cvar1.assign(value1) > > assi2 = cvar1.assign(value2) > > with contextvars.capture() as delta: > > assi1.__enter__() > > with cvar2.assign("not captured"): > > assert cvar2.value is "not captured" > > assi2.__enter__() > > assert cvar1.value is value2 > > delta.revert() > > assert cvar1.value is None > > assert cvar2.value is None > > ... > > with cvar1.assign(1), cvar2.assign(2): > > delta.reapply() > > assert cvar1.value is value2 > > assert cvar2.value == 2 > > > > > > However, reapplying the "delta" if its net contents include > deassignments > > may not be possible (see also Implementation and Open Issues). > > > > > > Getting a snapshot of context state > > ''''''''''''''''''''''''''''''''''' > > > > The function ``contextvars.get_local_state()`` returns an object > > representing the applied assignments to all context-local variables in > the > > context where the function is called. This can be seen as equivalent to > > using ``contextvars.capture()`` to capture all context changes from the > > beginning of execution. The returned object supports methods > ``.revert()`` > > and ``reapply()`` as above. > > > > > > Running code in a clean state > > ''''''''''''''''''''''''''''' > > > > Although it is possible to revert all applied context changes using the > > above primitives, a more convenient way to run a block of code in a > clean > > context is provided:: > > > > with context_vars.clean_context(): > > # here, all context vars start off with their default values > > # here, the state is back to what it was before the with block. > > > > > > Implementation > > -------------- > > > > This section describes to a variable level of detail how the described > > semantics can be implemented. At present, an implementation aimed at > > simplicity but sufficient features is described. More details will be > added > > later. > > > > Alternatively, a somewhat more complicated implementation offers minor > > additional features while adding some performance overhead and requiring > > more code in the implementation. > > > > Data structures and implementation of the core concept > > '''''''''''''''''''''''''''''''''''''''''''''''''''''' > > > > Each thread of the Python interpreter keeps its on stack of > > ``contextvars.Assignment`` objects, each having a pointer to the > previous > > (outer) assignment like in a linked list. 
The local state (also returned > by > > ``contextvars.get_local_state()``) then consists of a reference to the > top > > of the stack and a pointer/weak reference to the bottom of the stack. > This > > allows efficient stack manipulations. An object produced by > > ``contextvars.capture()`` is similar, but refers to only a part of the > stack > > with the bottom reference pointing to the top of the stack as it was in > the > > beginning of the capture block. > > > > Now, the stack evolves according to the assignment ``__enter__`` and > > ``__exit__`` methods. For example:: > > > > cvar1 = contextvars.Var() > > cvar2 = contextvars.Var() > > # stack: [] > > assert cvar1.value is None > > assert cvar2.value is None > > > > with cvar1.assign("outer"): > > # stack: [Assignment(cvar1, "outer")] > > assert cvar1.value == "outer" > > > > with cvar1.assign("inner"): > > # stack: [Assignment(cvar1, "outer"), > > # Assignment(cvar1, "inner")] > > assert cvar1.value == "inner" > > > > with cvar2.assign("hello"): > > # stack: [Assignment(cvar1, "outer"), > > # Assignment(cvar1, "inner"), > > # Assignment(cvar2, "hello")] > > assert cvar2.value == "hello" > > > > # stack: [Assignment(cvar1, "outer"), > > # Assignment(cvar1, "inner")] > > assert cvar1.value == "inner" > > assert cvar2.value is None > > > > # stack: [Assignment(cvar1, "outer")] > > assert cvar1.value == "outer" > > > > # stack: [] > > assert cvar1.value is None > > assert cvar2.value is None > > > > > > Getting a value from the context using ``cvar1.value`` can be > implemented as > > finding the topmost occurrence of a ``cvar1`` assignment on the stack > and > > returning the value there, or the default value if no assignment is > found on > > the stack. However, this can be optimized to instead be an O(1) > operation in > > most cases. Still, even searching through the stack may be reasonably > fast > > since these stacks are not intended to grow very large. > > > > The above description is already sufficient for implementing the core > > concept. Suspendable frames require some additional attention, as > explained > > in the following. > > > > Implementation of generator and coroutine semantics > > ''''''''''''''''''''''''''''''''''''''''''''''''''' > > > > Within generators, coroutines and async generators, assignments and > > deassignments are handled in exactly the same way as anywhere else. > However, > > some changes are needed in the builtin generator methods ``send``, > > ``__next__``, ``throw`` and ``close``. Here is the Python equivalent of > the > > changes needed in ``send`` for a generator (here ``_old_send`` refers to > the > > behavior in Python 3.6):: > > > > def send(self, value): > > # if decorated with contextvars.leaking_yields > > if self.gi_contextvars is LEAK: > > # nothing needs to be done to leak context through yields :) > > return self._old_send(value) > > try: > > with contextvars.capture() as delta: > > if self.gi_contextvars: > > # non-zero captured content from previous iteration > > self.gi_contextvars.reapply() > > ret = self._old_send(value) > > except Exception: > > raise > > else: > > # suspending, revert context changes but > > delta.revert() > > self.gi_contextvars = delta > > return ret > > > > > > The corresponding modifications to the other methods is essentially > > identical. The same applies to coroutines and async generators. > > > > For code that does not use ``contextvars``, the additions are O(1) and > > essentially reduce to a couple of pointer comparisons. 
For code that > does > > use ``contextvars``, the additions are still O(1) in most cases. > > > > More on implementation > > '''''''''''''''''''''' > > > > The rest of the functionality, including ``contextvars.leaking_yields``, > > contextvars.capture()``, ``contextvars.get_local_state()`` and > > ``contextvars.clean_context()`` are in fact quite straightforward to > > implement, but their implementation will be discussed further in later > > versions of this proposal. Caching of assigned values is somewhat more > > complicated, and will be discussed later, but it seems that most cases > > should achieve O(1) complexity. > > > > Backwards compatibility > > ======================= > > > > There are no *direct* backwards-compatibility concerns, since a > completely > > new feature is proposed. > > > > However, various traditional uses of thread-local storage may need a > smooth > > transition to ``contextvars`` so they can be concurrency-safe. There are > > several approaches to this, including emulating task-local storage with > a > > little bit of help from async frameworks. A fully general implementation > > cannot be provided, because the desired semantics may depend on the > design > > of the framework. > > > > Another way to deal with the transition is for code to first look for a > > context created using ``contextvars``. If that fails because a new-style > > context has not been set or because the code runs on an older Python > > version, a fallback to thread-local storage is used. > > > > > > Open Issues > > =========== > > > > Out-of-order de-assignments > > --------------------------- > > > > In this proposal, all variable deassignments are made in the opposite > order > > compared to the preceding assignments. This has two useful properties: > it > > encourages using ``with`` statements to define assignment scope and has > a > > tendency to catch errors early (forgetting a ``.__exit__()`` call often > > results in a meaningful error. To have this as a requirement requirement > is > > beneficial also in terms of implementation simplicity and performance. > > Nevertheless, allowing out-of-order context exits is not completely out > of > > the question, and reasonable implementation strategies for that do > exist. > > > > Rejected Ideas > > ============== > > > > Dynamic scoping linked to subroutine scopes > > ------------------------------------------- > > > > The scope of value visibility should not be determined by the way the > code > > is refactored into subroutines. It is necessary to have per-variable > control > > of the assignment scope. > > > > Acknowledgements > > ================ > > > > To be added. > > > > > > References > > ========== > > > > To be added. > > > > > > -- > > + Koos Zevenhoven + http://twitter.com/k7hoven + > > > > _______________________________________________ > > Python-ideas mailing list > > Python... at python.org > > https://mail.python.org/mailman/listinfo/python-ideas > > Code of Conduct: http://python.org/psf/codeofconduct/ > > > _______________________________________________ > Python-ideas mailing list > Python... at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From guido at python.org Tue Sep 5 10:54:03 2017 From: guido at python.org (Guido van Rossum) Date: Tue, 5 Sep 2017 07:54:03 -0700 Subject: [Python-ideas] PEP draft: context variables In-Reply-To: <24540851-9f0d-40ee-9b12-322fc0b4053e@googlegroups.com> References: <24540851-9f0d-40ee-9b12-322fc0b4053e@googlegroups.com> Message-ID: On Tue, Sep 5, 2017 at 7:42 AM, Neil Girdhar wrote: > I think you really should add a context manager to PEP 550 since it is > better than calling "set", which leaks state. Nathaniel is right that you > need set to support legacy numpy methods like seterr. Had there been a way > of setting context variables using a context manager, then numpy would only > have had to implement the "errstate" context manager on top of it. There > would have been no need for seterr, which leaks state between code blocks > and is error-prone. > There is nothing in current Python to prevent numpy to use a context manager for seterr; it's easy enough to write your own context manager that saves and restores thread-local state (decimal shows how). In fact with PEP 550 it's so easy that it's really not necessary for the PEP to define this as a separate API -- whoever needs it can just write their own. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From kevinjacobconway at gmail.com Tue Sep 5 11:00:29 2017 From: kevinjacobconway at gmail.com (Kevin Conway) Date: Tue, 05 Sep 2017 15:00:29 +0000 Subject: [Python-ideas] PEP draft: context variables In-Reply-To: References: <24540851-9f0d-40ee-9b12-322fc0b4053e@googlegroups.com> Message-ID: You should add https://bitbucket.org/hipchat/txlocal as a reference for the pep as it largely implements this idea for Twisted. It may provide for some practical discussions of use cases and limitations of this approach. On Tue, Sep 5, 2017, 09:55 Guido van Rossum wrote: > On Tue, Sep 5, 2017 at 7:42 AM, Neil Girdhar > wrote: > >> I think you really should add a context manager to PEP 550 since it is >> better than calling "set", which leaks state. Nathaniel is right that you >> need set to support legacy numpy methods like seterr. Had there been a way >> of setting context variables using a context manager, then numpy would only >> have had to implement the "errstate" context manager on top of it. There >> would have been no need for seterr, which leaks state between code blocks >> and is error-prone. >> > > There is nothing in current Python to prevent numpy to use a context > manager for seterr; it's easy enough to write your own context manager that > saves and restores thread-local state (decimal shows how). In fact with PEP > 550 it's so easy that it's really not necessary for the PEP to define this > as a separate API -- whoever needs it can just write their own. > > -- > --Guido van Rossum (python.org/~guido) > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jhihn at gmx.com Tue Sep 5 10:56:39 2017 From: jhihn at gmx.com (Jason H) Date: Tue, 5 Sep 2017 16:56:39 +0200 Subject: [Python-ideas] PEP draft: context variables In-Reply-To: References: Message-ID: I am a relative nobody in Python, however a few weeks ago, I suggested more harmonization with JavaScript. 
Admittedly I've been doing more JS lately, so I might have JS-colored glasses on, but it looks like you're trying to add lexical scoping to Python, and there's a whole lot of manual scope work going on. (This may be a result of showing what can be done, rather than what can typically be done). I am probably entirely mistaken, however when I saw the subject and started reading, I expected to see something like v = context.Var({'some': value, 'other': value}) # (wherein Var() would make deep copies of the values) why repeat `var` after context ? if Var is the only point of context module? But that's not what I saw. I didn't immediately grasp that `value` and `description` (aka. `doc`) were special properties for an individual context var. This is likely an issue with me, or it could be documentation. Then it went on to talk about using `with` for managing context. So it looks like the `with`?is just providing a value stack (list) for the variable? Can this be done automatically with a _setattr_ and append()? Having done a lot of HTML5 Canvas drawing (https://developer.mozilla.org/en-US/docs/Web/API/CanvasRenderingContext2D/save) , this reminds me of `save()` and `restore()`, which would likely be more familiar to a larger audience. Additionally what about: with context.save() as derived_context: # or just `with context as derived_context:` calling save automatically # whatever # automatically call restore on __exit__. I also wonder about using multiple context Vars/managers with `with`, as that statement would get quite long. Finally, would it be possible to pass a dict and get membered object out? Using v from the example above, i.e.: v.value, v.other where gets/sets automatically use the most recent context? From yselivanov.ml at gmail.com Tue Sep 5 11:18:13 2017 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Tue, 5 Sep 2017 08:18:13 -0700 Subject: [Python-ideas] PEP draft: context variables In-Reply-To: References: <24540851-9f0d-40ee-9b12-322fc0b4053e@googlegroups.com> Message-ID: We'll add a reference to the "Can Execution Context be implemented without modifying CPython?" section [1]. However, after skimming through the readme file, I didn't see any examples or limitations that are relevant to PEP 550. If the PEP gets accepted, Twisted can simply add direct support for it (similarly to asyncio). That would mean that users won't need to maintain the context manually (described in txlocal's "Maintaining Context" section). Yury [1] https://www.python.org/dev/peps/pep-0550/#can-execution-context-be-implemented-without-modifying-cpython On Tue, Sep 5, 2017 at 8:00 AM, Kevin Conway wrote: > You should add https://bitbucket.org/hipchat/txlocal as a reference for the > pep as it largely implements this idea for Twisted. It may provide for some > practical discussions of use cases and limitations of this approach. > > > On Tue, Sep 5, 2017, 09:55 Guido van Rossum wrote: >> >> On Tue, Sep 5, 2017 at 7:42 AM, Neil Girdhar >> wrote: >>> >>> I think you really should add a context manager to PEP 550 since it is >>> better than calling "set", which leaks state. Nathaniel is right that you >>> need set to support legacy numpy methods like seterr. Had there been a way >>> of setting context variables using a context manager, then numpy would only >>> have had to implement the "errstate" context manager on top of it. There >>> would have been no need for seterr, which leaks state between code blocks >>> and is error-prone. 
>> >> >> There is nothing in current Python to prevent numpy to use a context >> manager for seterr; it's easy enough to write your own context manager that >> saves and restores thread-local state (decimal shows how). In fact with PEP >> 550 it's so easy that it's really not necessary for the PEP to define this >> as a separate API -- whoever needs it can just write their own. >> >> -- >> --Guido van Rossum (python.org/~guido) >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > From k7hoven at gmail.com Tue Sep 5 11:35:25 2017 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Tue, 5 Sep 2017 18:35:25 +0300 Subject: [Python-ideas] PEP draft: context variables In-Reply-To: References: Message-ID: On Tue, Sep 5, 2017 at 3:49 AM, Nathaniel Smith wrote: > On Mon, Sep 4, 2017 at 2:50 PM, Koos Zevenhoven wrote: > > Hi all, > > > > as promised, here is a draft PEP for context variable semantics and > > implementation. Apologies for the slight delay; I had a not-so-minor > > autosave accident and had to retype the majority of this first draft. > > > > During the past years, there has been growing interest in something like > > task-local storage or async-local storage. This PEP proposes an > alternative > > approach to solving the problems that are typically stated as motivation > for > > such concepts. > > From a quick skim, my impression is: > > ?Well, I'm happy to hear that a quick skim can already give you an impression ;).? But let's see how correct... > All the high-level semantics you suggest make sense... in fact, AFAICT > they're exactly the same semantics we've been using as a litmus test > for PEP 550. Well, if "exactly the same semantics" is even nearly true, you are only testing a small subset of PEP 550 which resembles a subset of this proposal. > I think PEP 550 is sufficient to allow implementing all > your proposed APIs (and that if it isn't, that's a bug in PEP 550). > That's not true either. The LocalContext-based semantics introduces scope barriers that affect *all* variables. You might get close by putting just one variable in a LogicalContext and then nest them, but PEP 550 does not allow this in all cases. With the addition of PEP 521 and some trickery, it might. See also this section in PEP 550, where one of the related issues is described: https://www.python.org/dev/peps/pep-0550/#should-yield- from-leak-context-changes > OTOH, your proposal doesn't provide any way to implement functions > like decimal.setcontext or numpy.seterr, except by pushing a new state > and never popping it, which leaks memory and permanently increases the > N in the O(N) lookups. > > ?Well, there are different approaches for this. Let's take the example of numpy. import numpy as np I believe the relevant functions are np.seterr -- set a new state (and return the old one) np.geterr -- get the current state np.errstate -- gives you a context manager to do handle (Well, errstate sets more state than np.seterr, but that's irrelevant here). 
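For concreteness, here is a rough sketch of what the reimplementation
described in prose just below could look like, using only the ``Var()``,
``.value`` and ``.assign()`` API from this draft. The error-state keys and
defaults are simplified, the proposed contextvars module does not exist yet,
and only geterr/errstate are shown:

    import contextvars  # the module proposed in this draft

    _errstate_var = contextvars.Var(
        default={'divide': 'warn', 'over': 'warn',
                 'under': 'ignore', 'invalid': 'warn'},
        description="numpy-style error-handling state")

    def geterr():
        # read whatever state is visible in the current context
        return dict(_errstate_var.value)

    class errstate:
        # scope-safe form: applies on __enter__, reverts on __exit__
        def __init__(self, **changes):
            self._changes = changes
            self._assignment = None

        def __enter__(self):
            new_state = {**_errstate_var.value, **self._changes}
            self._assignment = _errstate_var.assign(new_state)
            self._assignment.__enter__()
            return new_state

        def __exit__(self, *exc_info):
            return self._assignment.__exit__(*exc_info)
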
First of all, the np.seterr API is something that I want to discourage in this proposal, because if the state is not reset back to what it was, a completely different piece of code may be affected. BUT To preserve the current semantics of these functions in non-async code, you could do this: - numpy reimplements the errstate context manager using contextvars based on this proposal. - geterr gets the state using contextvars - seterr gets the state using contextvars and mutates it the way it wants (If contextvars is not available, it uses the old way) Also, the idea is to also provide frameworks the means for implementing concurrency-local storage, if that is what people really want, although I'm not sure it is. > I didn't see any direct comparison with PEP 550 in your text (maybe I > missed it). Why do you think this approach would be better than what's > in PEP 550? > It was not my intention to leave out the comparison altogether, but I did avoid the comparisons in some cases in this first draft, because thinking about PEP 550 concepts while trying to understand this proposal might give you the wrong idea. One of the benefits of this proposal is simplicity, and I'm guessing performance as well, but that would need evidence. ??Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven + -------------- next part -------------- An HTML attachment was scrubbed... URL: From k7hoven at gmail.com Tue Sep 5 11:48:49 2017 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Tue, 5 Sep 2017 18:48:49 +0300 Subject: [Python-ideas] PEP draft: context variables In-Reply-To: References: Message-ID: On Tue, Sep 5, 2017 at 10:43 AM, Pavol Lisy wrote: > On 9/4/17, Koos Zevenhoven wrote: > ?[...]? > > > with cvar.assign(new_value): > > assert cvar.value is new_value > > # Any code here, or down the call chain from here, sees: > > # cvar.value is new_value > > # unless another value has been assigned in a > > # nested context > > assert cvar.value is new_value > > # the assignment of ``cvar`` to ``new_value`` is no longer visible > > assert cvar.value == "the default value" > > I feel that of "is" and "==" in assert statements in this PEP has to > be used (or described) more precisely. > ?? ?The use is quite precise as it is now. I can't use `is` for the string values, because the result would depend on whether Python gives you the same str instance as before, or a new one with the same content.? Maybe I'll get rid of literal string values in the description, since it seems to only cause distraction. > > What if new_value above is 123456789? > ?Any value is fine.? > > maybe using something like could be better? -> > > def equals(a, b): > return a is b or a == b > > Doesn't PEP need to think about something like "context level overflow" ? > > Or members like: cvar.level ? > I don't see any need for this at this point, or possibly ever.? ???Koos? -- + Koos Zevenhoven + http://twitter.com/k7hoven + -------------- next part -------------- An HTML attachment was scrubbed... URL: From yselivanov.ml at gmail.com Tue Sep 5 11:53:12 2017 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Tue, 5 Sep 2017 08:53:12 -0700 Subject: [Python-ideas] PEP draft: context variables In-Reply-To: References: Message-ID: On Tue, Sep 5, 2017 at 8:35 AM, Koos Zevenhoven wrote: > On Tue, Sep 5, 2017 at 3:49 AM, Nathaniel Smith wrote: [..] >> >> I think PEP 550 is sufficient to allow implementing all >> your proposed APIs (and that if it isn't, that's a bug in PEP 550). > > > That's not true either. 
The LocalContext-based semantics introduces scope > barriers that affect *all* variables. You might get close by putting just > one variable in a LogicalContext and then nest them, but PEP 550 does not > allow this in all cases. With the addition of PEP 521 and some trickery, it > might. I think you have a wrong idea about PEP 550 specification. I recommend you to reread it carefully, otherwise we can't have a productive discussion here. Yury From k7hoven at gmail.com Tue Sep 5 12:12:00 2017 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Tue, 5 Sep 2017 19:12:00 +0300 Subject: [Python-ideas] PEP draft: context variables In-Reply-To: References: Message-ID: On Tue, Sep 5, 2017 at 6:53 PM, Yury Selivanov wrote: > On Tue, Sep 5, 2017 at 8:35 AM, Koos Zevenhoven wrote: > > On Tue, Sep 5, 2017 at 3:49 AM, Nathaniel Smith wrote: > [..] > >> > >> I think PEP 550 is sufficient to allow implementing all > >> your proposed APIs (and that if it isn't, that's a bug in PEP 550). > > > > > > That's not true either. The LocalContext-based semantics introduces scope > > barriers that affect *all* variables. You might get close by putting just > > one variable in a LogicalContext and then nest them, but PEP 550 does not > > allow this in all cases. With the addition of PEP 521 and some trickery, > it > > might. > > I think you have a wrong idea about PEP 550 specification. I > recommend you to reread it carefully, otherwise we can't have a > productive discussion here. > > I'm sorry, by LocalContext I meant LogicalContext, and by "nesting" them, I meant stacking them. It is in fact nesting in terms of value scopes. ??Koos? -- + Koos Zevenhoven + http://twitter.com/k7hoven + -------------- next part -------------- An HTML attachment was scrubbed... URL: From yselivanov.ml at gmail.com Tue Sep 5 13:24:14 2017 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Tue, 5 Sep 2017 10:24:14 -0700 Subject: [Python-ideas] PEP draft: context variables In-Reply-To: References: Message-ID: On Tue, Sep 5, 2017 at 9:12 AM, Koos Zevenhoven wrote: > On Tue, Sep 5, 2017 at 6:53 PM, Yury Selivanov > wrote: >> >> On Tue, Sep 5, 2017 at 8:35 AM, Koos Zevenhoven wrote: >> > On Tue, Sep 5, 2017 at 3:49 AM, Nathaniel Smith wrote: >> [..] >> >> >> >> I think PEP 550 is sufficient to allow implementing all >> >> your proposed APIs (and that if it isn't, that's a bug in PEP 550). >> > >> > >> > That's not true either. The LocalContext-based semantics introduces >> > scope >> > barriers that affect *all* variables. You might get close by putting >> > just >> > one variable in a LogicalContext and then nest them, but PEP 550 does >> > not >> > allow this in all cases. With the addition of PEP 521 and some trickery, >> > it >> > might. >> >> I think you have a wrong idea about PEP 550 specification. I >> recommend you to reread it carefully, otherwise we can't have a >> productive discussion here. >> > > I'm sorry, by LocalContext I meant LogicalContext, and by "nesting" them, I > meant stacking them. It is in fact nesting in terms of value scopes. I don't actually care if you use the latest terminology. You seem to have a wrong idea about how PEP 550 really works (and its full semantics), because things you say here about it don't make any sense. 
Yury From k7hoven at gmail.com Tue Sep 5 13:31:45 2017 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Tue, 5 Sep 2017 20:31:45 +0300 Subject: [Python-ideas] PEP draft: context variables In-Reply-To: References: Message-ID: On Tue, Sep 5, 2017 at 8:24 PM, Yury Selivanov wrote: > On Tue, Sep 5, 2017 at 9:12 AM, Koos Zevenhoven wrote: > > On Tue, Sep 5, 2017 at 6:53 PM, Yury Selivanov > > wrote: > >> > >> On Tue, Sep 5, 2017 at 8:35 AM, Koos Zevenhoven > wrote: > >> > On Tue, Sep 5, 2017 at 3:49 AM, Nathaniel Smith > wrote: > >> [..] > >> >> > >> >> I think PEP 550 is sufficient to allow implementing all > >> >> your proposed APIs (and that if it isn't, that's a bug in PEP 550). > >> > > >> > > >> > That's not true either. The LocalContext-based semantics introduces > >> > scope > >> > barriers that affect *all* variables. You might get close by putting > >> > just > >> > one variable in a LogicalContext and then nest them, but PEP 550 does > >> > not > >> > allow this in all cases. With the addition of PEP 521 and some > trickery, > >> > it > >> > might. > >> > >> I think you have a wrong idea about PEP 550 specification. I > >> recommend you to reread it carefully, otherwise we can't have a > >> productive discussion here. > >> > > > > I'm sorry, by LocalContext I meant LogicalContext, and by "nesting" > them, I > > meant stacking them. It is in fact nesting in terms of value scopes. > > I don't actually care if you use the latest terminology. You seem to > have a wrong idea about how PEP 550 really works (and its full > semantics), because things you say here about it don't make any sense. > ?In PEP 550, introducing a new LogicalContext on the ExecutionContext affects the scope of ?any_ var.set(value)? for * ?any * ?any_var . ? Does that not make sense?? ?? Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven + -------------- next part -------------- An HTML attachment was scrubbed... URL: From yselivanov.ml at gmail.com Tue Sep 5 13:43:57 2017 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Tue, 5 Sep 2017 10:43:57 -0700 Subject: [Python-ideas] PEP draft: context variables In-Reply-To: References: Message-ID: On Tue, Sep 5, 2017 at 10:31 AM, Koos Zevenhoven wrote: > On Tue, Sep 5, 2017 at 8:24 PM, Yury Selivanov > wrote: >> >> On Tue, Sep 5, 2017 at 9:12 AM, Koos Zevenhoven wrote: >> > On Tue, Sep 5, 2017 at 6:53 PM, Yury Selivanov >> > wrote: >> >> >> >> On Tue, Sep 5, 2017 at 8:35 AM, Koos Zevenhoven >> >> wrote: >> >> > On Tue, Sep 5, 2017 at 3:49 AM, Nathaniel Smith >> >> > wrote: >> >> [..] >> >> >> >> >> >> I think PEP 550 is sufficient to allow implementing all >> >> >> your proposed APIs (and that if it isn't, that's a bug in PEP 550). >> >> > >> >> > >> >> > That's not true either. The LocalContext-based semantics introduces >> >> > scope >> >> > barriers that affect *all* variables. You might get close by putting >> >> > just >> >> > one variable in a LogicalContext and then nest them, but PEP 550 does >> >> > not >> >> > allow this in all cases. With the addition of PEP 521 and some >> >> > trickery, >> >> > it >> >> > might. >> >> >> >> I think you have a wrong idea about PEP 550 specification. I >> >> recommend you to reread it carefully, otherwise we can't have a >> >> productive discussion here. >> >> >> > >> > I'm sorry, by LocalContext I meant LogicalContext, and by "nesting" >> > them, I >> > meant stacking them. It is in fact nesting in terms of value scopes. >> >> I don't actually care if you use the latest terminology. 
You seem to >> have a wrong idea about how PEP 550 really works (and its full >> semantics), because things you say here about it don't make any sense. > > > In PEP 550, introducing a new LogicalContext on the ExecutionContext affects > the scope of > any_ > var.set(value) for * > any > * > any_var > . > Does that not make sense? It does. But your other sentence ".. You might get close by putting just one variable in a LogicalContext and then nest them, but PEP 550 does not allow this in all cases .." does not. Yury From k7hoven at gmail.com Tue Sep 5 13:53:33 2017 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Tue, 5 Sep 2017 20:53:33 +0300 Subject: [Python-ideas] PEP draft: context variables In-Reply-To: References: Message-ID: On Tue, Sep 5, 2017 at 8:43 PM, Yury Selivanov wrote: > On Tue, Sep 5, 2017 at 10:31 AM, Koos Zevenhoven > wrote: > > On Tue, Sep 5, 2017 at 8:24 PM, Yury Selivanov > > wrote: > >> > >> I don't actually care if you use the latest terminology. You seem to > >> have a wrong idea about how PEP 550 really works (and its full > >> semantics), because things you say here about it don't make any sense. > > > > > > In PEP 550, introducing a new LogicalContext on the ExecutionContext > affects > > the scope of > ? > any_var.set(value) for *any* any_var. > ? > Does that not make sense? > > It does. But your other sentence ".. You might get close by putting > just one variable in a LogicalContext and then nest them, but PEP 550 > does not allow this in all cases .." does not. > So you claim that PEP 550 does allow that in all cases? Or you don't think that that would get close? ???Koos? -- + Koos Zevenhoven + http://twitter.com/k7hoven + -------------- next part -------------- An HTML attachment was scrubbed... URL: From mistersheik at gmail.com Tue Sep 5 17:09:37 2017 From: mistersheik at gmail.com (Neil Girdhar) Date: Tue, 05 Sep 2017 21:09:37 +0000 Subject: [Python-ideas] PEP draft: context variables In-Reply-To: References: <24540851-9f0d-40ee-9b12-322fc0b4053e@googlegroups.com> Message-ID: On Tue, Sep 5, 2017 at 10:54 AM Guido van Rossum wrote: > On Tue, Sep 5, 2017 at 7:42 AM, Neil Girdhar > wrote: > >> I think you really should add a context manager to PEP 550 since it is >> better than calling "set", which leaks state. Nathaniel is right that you >> need set to support legacy numpy methods like seterr. Had there been a way >> of setting context variables using a context manager, then numpy would only >> have had to implement the "errstate" context manager on top of it. There >> would have been no need for seterr, which leaks state between code blocks >> and is error-prone. >> > > There is nothing in current Python to prevent numpy to use a context > manager for seterr; it's easy enough to write your own context manager that > saves and restores thread-local state (decimal shows how). In fact with PEP > 550 it's so easy that it's really not necessary for the PEP to define this > as a separate API -- whoever needs it can just write their own. > Don't you want to encourage people to use the context manager form and discourage calls to set/discard? I recognize that seterr has to be supported and has to sit on top of some method in the execution context. However, if we were starting from scratch, I don't see why we would have seterr at all. We should just have errstate. 
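For concreteness, here is a minimal sketch of the two forms under discussion,
built on plain thread-local storage in the spirit of the "decimal shows how"
remark quoted above. The geterr/seterr/errstate names mimic numpy's API, but
this is not numpy's actual implementation:

    import threading
    from contextlib import contextmanager

    _local = threading.local()
    _DEFAULTS = {'divide': 'warn', 'over': 'warn',
                 'under': 'ignore', 'invalid': 'warn'}

    def geterr():
        # state visible to the current thread, falling back to the defaults
        return dict(getattr(_local, 'err', _DEFAULTS))

    def seterr(**changes):
        # mutates per-thread state; the caller must remember to restore it
        old = geterr()
        _local.err = {**old, **changes}
        return old

    @contextmanager
    def errstate(**changes):
        # context-manager form: the old state is restored even on error
        old = seterr(**changes)
        try:
            yield
        finally:
            seterr(**old)

A bare seterr() call leaves its change in place for whatever runs later in
the same thread, whereas the errstate() block cannot leak it.
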
seterr can leak state, which might not seem like a big deal in a small program, but in a large program, it can mean that a minor change in one module can cause bugs in a totally different part of the program. These kinds of bugs can be very hard to debug. > -- > --Guido van Rossum (python.org/~guido) > -------------- next part -------------- An HTML attachment was scrubbed... URL: From francismb at email.de Wed Sep 6 16:23:49 2017 From: francismb at email.de (francismb) Date: Wed, 6 Sep 2017 22:23:49 +0200 Subject: [Python-ideas] PEP draft: context variables In-Reply-To: References: Message-ID: <5afc8296-0c1f-0be4-0da7-e5c7bdac64f7@email.de> Hi!, > Core concept > '''''''''''' > > A context-local variable is represented by a single instance of > ``contextvars.Var``, say ``cvar``. Any code that has access to the ``cvar`` > object can ask for its value with respect to the current context. In the > high-level API, this value is given by the ``cvar.value`` property:: > > cvar = contextvars.Var(default="the default value", > description="example context variable") > > assert cvar.value == "the default value" # default still applies > > # In code examples, all ``assert`` statements should > # succeed according to the proposed semantics. [...] > Running code in a clean state > ''''''''''''''''''''''''''''' > > Although it is possible to revert all applied context changes > using the above primitives, a more convenient way to run a block > of code in a clean context is provided:: > > with context_vars.clean_context(): > # here, all context vars start off with their default values > # here, the state is back to what it was before the with block. why not to call the section 'Running code in the default state' and the method just `.default_context()`: with context_vars.default_context(): # here, all context vars start off with their default values # here, the state is back to what it was before the with block. Means `clean` here `default` (the variable is constructed as cvar = ontextvars.Var(default="the default value", ...) ? Thanks in advance, --francis From rymg19 at gmail.com Wed Sep 6 18:52:38 2017 From: rymg19 at gmail.com (Ryan Gonzalez) Date: Wed, 6 Sep 2017 17:52:38 -0500 Subject: [Python-ideas] Adding "View Python 3 Documentation" to all Python 2 documentation URLs Message-ID: Right now, many Google searches for Python modules return the Python 2 documentation. IMO since 2 will be reaching EOL in around 3 years, it would be nice to have a giant red box at the top with a link to the Python 3 documentation. SFML already does something like this: https://www.sfml-dev.org/tutorials/2.3/ (I mean, it would be even nicer to have a "jump to latest version" for *all* not-new Python versions, though I figured just 2 would be a lot easier.) -- Ryan (????) Yoko Shimomura, ryo (supercell/EGOIST), Hiroyuki Sawano >> everyone else http://refi64.com/ From victor.stinner at gmail.com Wed Sep 6 20:18:36 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Thu, 7 Sep 2017 02:18:36 +0200 Subject: [Python-ideas] Adding "View Python 3 Documentation" to all Python 2 documentation URLs In-Reply-To: References: Message-ID: Another example is PyPI showing a bold "Latest version: x.x.x". Example: https://pypi.python.org/pypi/requests/2.17.0 Victor 2017-09-07 0:52 GMT+02:00 Ryan Gonzalez : > Right now, many Google searches for Python modules return the Python 2 > documentation. 
IMO since 2 will be reaching EOL in around 3 years, it > would be nice to have a giant red box at the top with a link to the > Python 3 documentation. > > SFML already does something like this: https://www.sfml-dev.org/tutorials/2.3/ > > (I mean, it would be even nicer to have a "jump to latest version" for > *all* not-new Python versions, though I figured just 2 would be a lot > easier.) > > -- > Ryan (????) > Yoko Shimomura, ryo (supercell/EGOIST), Hiroyuki Sawano >> everyone else > http://refi64.com/ > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From guido at python.org Wed Sep 6 21:18:34 2017 From: guido at python.org (Guido van Rossum) Date: Wed, 6 Sep 2017 18:18:34 -0700 Subject: [Python-ideas] Adding "View Python 3 Documentation" to all Python 2 documentation URLs In-Reply-To: References: Message-ID: That's a good idea. Could you suggest it to https://github.com/python/pythondotorg/ ? (There is actually a version switcher on every page, but it's rather polite. :-) On Wed, Sep 6, 2017 at 3:52 PM, Ryan Gonzalez wrote: > Right now, many Google searches for Python modules return the Python 2 > documentation. IMO since 2 will be reaching EOL in around 3 years, it > would be nice to have a giant red box at the top with a link to the > Python 3 documentation. > > SFML already does something like this: https://www.sfml-dev.org/ > tutorials/2.3/ > > (I mean, it would be even nicer to have a "jump to latest version" for > *all* not-new Python versions, though I figured just 2 would be a lot > easier.) > > -- > Ryan (????) > Yoko Shimomura, ryo (supercell/EGOIST), Hiroyuki Sawano >> everyone else > http://refi64.com/ > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From bzvi7919 at gmail.com Thu Sep 7 01:51:31 2017 From: bzvi7919 at gmail.com (Bar Harel) Date: Thu, 07 Sep 2017 05:51:31 +0000 Subject: [Python-ideas] Adding "View Python 3 Documentation" to all Python 2 documentation URLs In-Reply-To: References: Message-ID: A bit radical but do you believe we can contact Google to alter the search results? It's for the benefit of the user after all, and many just switch to python 3 in the version picker anyway. On Thu, Sep 7, 2017, 4:18 AM Guido van Rossum wrote: > That's a good idea. Could you suggest it to > https://github.com/python/pythondotorg/ ? (There is actually a version > switcher on every page, but it's rather polite. :-) > > On Wed, Sep 6, 2017 at 3:52 PM, Ryan Gonzalez wrote: > >> Right now, many Google searches for Python modules return the Python 2 >> documentation. IMO since 2 will be reaching EOL in around 3 years, it >> would be nice to have a giant red box at the top with a link to the >> Python 3 documentation. >> >> SFML already does something like this: >> https://www.sfml-dev.org/tutorials/2.3/ >> >> (I mean, it would be even nicer to have a "jump to latest version" for >> *all* not-new Python versions, though I figured just 2 would be a lot >> easier.) >> >> -- >> Ryan (????) 
>> Yoko Shimomura, ryo (supercell/EGOIST), Hiroyuki Sawano >> everyone else >> http://refi64.com/ >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > > > > -- > --Guido van Rossum (python.org/~guido) > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From flying-sheep at web.de Thu Sep 7 06:25:32 2017 From: flying-sheep at web.de (Philipp A.) Date: Thu, 07 Sep 2017 10:25:32 +0000 Subject: [Python-ideas] Adding "View Python 3 Documentation" to all Python 2 documentation URLs In-Reply-To: References: Message-ID: A progmatic workaround for yourself: https://addons.mozilla.org/de/firefox/addon/py3direct/ https://chrome.google.com/webstore/detail/py3redirect/codfjigcljdnlklcaopdciclmmdandig Bar Harel schrieb am Do., 7. Sep. 2017 um 07:52 Uhr: > A bit radical but do you believe we can contact Google to alter the search > results? > It's for the benefit of the user after all, and many just switch to python > 3 in the version picker anyway. > > On Thu, Sep 7, 2017, 4:18 AM Guido van Rossum wrote: > >> That's a good idea. Could you suggest it to >> https://github.com/python/pythondotorg/ ? (There is actually a version >> switcher on every page, but it's rather polite. :-) >> >> On Wed, Sep 6, 2017 at 3:52 PM, Ryan Gonzalez wrote: >> >>> Right now, many Google searches for Python modules return the Python 2 >>> documentation. IMO since 2 will be reaching EOL in around 3 years, it >>> would be nice to have a giant red box at the top with a link to the >>> Python 3 documentation. >>> >>> SFML already does something like this: >>> https://www.sfml-dev.org/tutorials/2.3/ >>> >>> (I mean, it would be even nicer to have a "jump to latest version" for >>> *all* not-new Python versions, though I figured just 2 would be a lot >>> easier.) >>> >>> -- >>> Ryan (????) >>> Yoko Shimomura, ryo (supercell/EGOIST), Hiroyuki Sawano >> everyone else >>> http://refi64.com/ >>> _______________________________________________ >>> Python-ideas mailing list >>> Python-ideas at python.org >>> https://mail.python.org/mailman/listinfo/python-ideas >>> Code of Conduct: http://python.org/psf/codeofconduct/ >>> >> >> >> >> -- >> --Guido van Rossum (python.org/~guido) >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From denis at href.ch Thu Sep 7 06:43:13 2017 From: denis at href.ch (=?utf-8?Q?Denis_Krienb=C3=BChl?=) Date: Thu, 7 Sep 2017 12:43:13 +0200 Subject: [Python-ideas] if as Message-ID: <029D6B7F-3742-4B88-91E8-1002CFB68D60@href.ch> Hi! I?ve been having this idea for a few years and I thought I finally see if what others think of it. 
I have no experience in language design and I don?t know if this is something I?ve picked up in some other language. I also do not know what the ramifications of implementing this idea would be. I just keep thinking about it :) I quite often write code like the following in python: result = computation() if result: do_something_with_computation(result) More often than not this snippet evolves from something like this: if computation(): ? That is, I use the truthiness of the value at first. As the code grows I refactor to actually do something with the result. What I would love to see is the following syntax instead, which to me is much cleaner: if computation() as result: do_something_with_result(result) Basically the result variable would be the result of the if condition?s expression and it would be available the same way it would be if we used my initial snippet (much the same way that the result of with expressions also stays around outside the with-block). Any feedback is appreciated :) Cheers, Denis From p.f.moore at gmail.com Thu Sep 7 07:11:58 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 7 Sep 2017 12:11:58 +0100 Subject: [Python-ideas] if as In-Reply-To: <029D6B7F-3742-4B88-91E8-1002CFB68D60@href.ch> References: <029D6B7F-3742-4B88-91E8-1002CFB68D60@href.ch> Message-ID: On 7 September 2017 at 11:43, Denis Krienb?hl wrote: > What I would love to see is the following syntax instead, which to me is much cleaner: > > if computation() as result: > do_something_with_result(result) Hi - thanks for your suggestion! This has actually come up quite a lot in the past. Here's a couple of links to threads you might want to read (it's not surprising if you missed these, it's not that easy to come up with a good search term for this topic). https://mail.python.org/pipermail/python-ideas/2012-January/013461.html https://mail.python.org/pipermail/python-ideas/2009-March/003423.html (This thread includes a note by Guido that he intentionally left out this functionality) In summary, it's a reasonably commonly suggested idea, but there's not enough benefit to warrant adding it to the language. Paul From denis at href.ch Thu Sep 7 07:27:24 2017 From: denis at href.ch (=?utf-8?Q?Denis_Krienb=C3=BChl?=) Date: Thu, 7 Sep 2017 13:27:24 +0200 Subject: [Python-ideas] if as In-Reply-To: References: <029D6B7F-3742-4B88-91E8-1002CFB68D60@href.ch> Message-ID: <7BC1D46A-B2D8-4FEC-ACB3-B98370588E39@href.ch> > On 7 Sep 2017, at 13:11, Paul Moore wrote: > > On 7 September 2017 at 11:43, Denis Krienb?hl wrote: >> What I would love to see is the following syntax instead, which to me is much cleaner: >> >> if computation() as result: >> do_something_with_result(result) > > Hi - thanks for your suggestion! This has actually come up quite a lot > in the past. Here's a couple of links to threads you might want to > read (it's not surprising if you missed these, it's not that easy to > come up with a good search term for this topic). > > https://mail.python.org/pipermail/python-ideas/2012-January/013461.html > https://mail.python.org/pipermail/python-ideas/2009-March/003423.html > (This thread includes a note by Guido that he intentionally left out > this functionality) > > In summary, it's a reasonably commonly suggested idea, but there's not > enough benefit to warrant adding it to the language. > > Paul I see, thank you for digging those threads up. 
I?ll read them to learn a thing or two :) From nastasi at alternativeoutput.it Thu Sep 7 07:36:31 2017 From: nastasi at alternativeoutput.it (Matteo Nastasi) Date: Thu, 7 Sep 2017 13:36:31 +0200 Subject: [Python-ideas] Extension of python/json syntax to support explicitly sets and ordered dict. Message-ID: <20170907113631.GA20825@alternativeoutput.it> Hi all, few days ago I thought about a way to rapresent sets and ordered dicts using a json compatible syntax, these are my conclusions: A set could be defined as { item1, item2, item3[...] } with {,} as an empty set An ordered dict could be defined as [ item1: value1, item2: value2 ... ] with [:] ase an empty odered dict It could be used inside python code or to serialize python structures in a json-like format (pyson maybe ?). What do you think about ? Kind regards, Matteo. -- email: nastasi at alternativeoutput.it, matteo.nastasi at gmail.com web: www.alternativeoutput.it irc: #linux-mi at irc.freenode.net linkedin: http://lnkd.in/SPQG87 From steve at pearwood.info Thu Sep 7 07:54:15 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 7 Sep 2017 21:54:15 +1000 Subject: [Python-ideas] Extension of python/json syntax to support explicitly sets and ordered dict. In-Reply-To: <20170907113631.GA20825@alternativeoutput.it> References: <20170907113631.GA20825@alternativeoutput.it> Message-ID: <20170907115415.GJ13110@ando.pearwood.info> On Thu, Sep 07, 2017 at 01:36:31PM +0200, Matteo Nastasi wrote: > A set could be defined as { item1, item2, item3[...] } Guido's time machine strikes again. Python 3: py> s = {1, 2, 3} py> type(s) > with {,} as an empty set That looks like a typo. I don't think that having a literal for empty sets is important enough that we need worry about the lack. > An ordered dict could be defined as [ item1: value1, item2: value2 ... ] > with [:] ase an empty odered dict I think that's been proposed before. I don't hate that suggestion, but I don't think its very useful either. We already have a de facto "Ordered Mapping" literal that can be passed to the OrderedDict constructor: OrderedDict( [(key1, value1), (key2, value2), (key3, value3), ...] ) Its not quite as compact, but it isn't too awful. And if you really don't like it, try: OrderedDict(zip(keys, values)) -- Steven From nastasi at alternativeoutput.it Thu Sep 7 07:59:46 2017 From: nastasi at alternativeoutput.it (Matteo Nastasi) Date: Thu, 7 Sep 2017 13:59:46 +0200 Subject: [Python-ideas] Extension of python/json syntax to support explicitly sets and ordered dict. In-Reply-To: <20170907115415.GJ13110@ando.pearwood.info> References: <20170907113631.GA20825@alternativeoutput.it> <20170907115415.GJ13110@ando.pearwood.info> Message-ID: <20170907115946.GB20825@alternativeoutput.it> I was thinking about an extended json too, not just python syntax. What do you think about it ? Regards and thank you for your time. Matteo. On Thu, Sep 07, 2017 at 09:54:15PM +1000, Steven D'Aprano wrote: > On Thu, Sep 07, 2017 at 01:36:31PM +0200, Matteo Nastasi wrote: > > > A set could be defined as { item1, item2, item3[...] } > > Guido's time machine strikes again. Python 3: > > py> s = {1, 2, 3} > py> type(s) > > > > > with {,} as an empty set > > That looks like a typo. I don't think that having a literal for empty > sets is important enough that we need worry about the lack. > > > > An ordered dict could be defined as [ item1: value1, item2: value2 ... ] > > with [:] ase an empty odered dict > > I think that's been proposed before. 
> > I don't hate that suggestion, but I don't think its very useful either. > We already have a de facto "Ordered Mapping" literal that can be passed > to the OrderedDict constructor: > > OrderedDict( > [(key1, value1), (key2, value2), (key3, value3), ...] > ) > > Its not quite as compact, but it isn't too awful. And if you really > don't like it, try: > > OrderedDict(zip(keys, values)) > > > > -- > Steven > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- email: nastasi at alternativeoutput.it, matteo.nastasi at gmail.com web: www.alternativeoutput.it irc: #linux-mi at irc.freenode.net linkedin: http://lnkd.in/SPQG87 From rosuav at gmail.com Thu Sep 7 08:24:25 2017 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 7 Sep 2017 22:24:25 +1000 Subject: [Python-ideas] Extension of python/json syntax to support explicitly sets and ordered dict. In-Reply-To: <20170907115946.GB20825@alternativeoutput.it> References: <20170907113631.GA20825@alternativeoutput.it> <20170907115415.GJ13110@ando.pearwood.info> <20170907115946.GB20825@alternativeoutput.it> Message-ID: On Thu, Sep 7, 2017 at 9:59 PM, Matteo Nastasi wrote: > I was thinking about an extended json too, not just python syntax. > > What do you think about it ? > > Regards and thank you for your time. Extend JSON? No thank you. Down the path of extending simple standards in proprietary ways lies a madness that I do not wish on my best friends, much less my worst enemies. ChrisA From jhihn at gmx.com Thu Sep 7 10:36:40 2017 From: jhihn at gmx.com (Jason H) Date: Thu, 7 Sep 2017 16:36:40 +0200 Subject: [Python-ideas] if as In-Reply-To: <029D6B7F-3742-4B88-91E8-1002CFB68D60@href.ch> References: <029D6B7F-3742-4B88-91E8-1002CFB68D60@href.ch> Message-ID: > Sent: Thursday, September 07, 2017 at 6:43 AM > From: "Denis Krienb?hl" > To: python-ideas at python.org > Subject: [Python-ideas] if as > > Hi! > > I?ve been having this idea for a few years and I thought I finally see if what others think of it. I have no experience in language design and I don?t know if this is something I?ve picked up in some other language. I also do not know what the ramifications of implementing this idea would be. I just keep thinking about it :) > > I quite often write code like the following in python: > > result = computation() > if result: > do_something_with_computation(result) > > More often than not this snippet evolves from something like this: > > if computation(): > ? > > That is, I use the truthiness of the value at first. As the code grows I refactor to actually do something with the result. > > What I would love to see is the following syntax instead, which to me is much cleaner: > > if computation() as result: > do_something_with_result(result) > > Basically the result variable would be the result of the if condition?s expression and it would be available the same way it would be if we used my initial snippet (much the same way that the result of with expressions also stays around outside the with-block). > > Any feedback is appreciated :) I also often wonder why we are left doing an assignment and test. You have two options: 1. assign to a variable then test and use 2. repeat the function call I would offer that 'with' [sh|c]ould be used: with test() as x: handle_truthy(x) else: handle_falsey() # do we provide x here too? Because None vs False? 
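As a point of comparison for the "if ... as" and "with ... else" ideas in this
thread, the behaviour can already be approximated without new syntax by a
one-shot generator (a workaround sketch; `truthy` is a made-up helper, not
anything in the stdlib):

    def truthy(value):
        # Yield the value exactly once if it is truthy, otherwise yield nothing.
        if value:
            yield value

    def computation():
        return 42   # stand-in for the real work

    # The body runs only when the result is truthy, with the result bound to it.
    for result in truthy(computation()):
        print("do something with", result)

It is not pretty -- which is arguably the point of the proposals -- but it
keeps the test and the binding in one place.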
From wes.turner at gmail.com Thu Sep 7 10:54:03 2017 From: wes.turner at gmail.com (Wes Turner) Date: Thu, 7 Sep 2017 09:54:03 -0500 Subject: [Python-ideas] Adding "View Python 3 Documentation" to all Python 2 documentation URLs In-Reply-To: References: Message-ID: A sitemap.xml may be helpful for indicating search relevance to search crawlers? https://www.google.com/search?q=sphinx+sitemap On Thursday, September 7, 2017, Bar Harel wrote: > A bit radical but do you believe we can contact Google to alter the search > results? > It's for the benefit of the user after all, and many just switch to python > 3 in the version picker anyway. > > On Thu, Sep 7, 2017, 4:18 AM Guido van Rossum > wrote: > >> That's a good idea. Could you suggest it to https://github.com/python/ >> pythondotorg/ ? (There is actually a version switcher on every page, but >> it's rather polite. :-) >> >> On Wed, Sep 6, 2017 at 3:52 PM, Ryan Gonzalez > > wrote: >> >>> Right now, many Google searches for Python modules return the Python 2 >>> documentation. IMO since 2 will be reaching EOL in around 3 years, it >>> would be nice to have a giant red box at the top with a link to the >>> Python 3 documentation. >>> >>> SFML already does something like this: https://www.sfml-dev.org/ >>> tutorials/2.3/ >>> >>> (I mean, it would be even nicer to have a "jump to latest version" for >>> *all* not-new Python versions, though I figured just 2 would be a lot >>> easier.) >>> >>> -- >>> Ryan (????) >>> Yoko Shimomura, ryo (supercell/EGOIST), Hiroyuki Sawano >> everyone else >>> http://refi64.com/ >>> _______________________________________________ >>> Python-ideas mailing list >>> Python-ideas at python.org >>> >>> https://mail.python.org/mailman/listinfo/python-ideas >>> Code of Conduct: http://python.org/psf/codeofconduct/ >>> >> >> >> >> -- >> --Guido van Rossum (python.org/~guido) >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Thu Sep 7 12:13:32 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 8 Sep 2017 02:13:32 +1000 Subject: [Python-ideas] Adding "View Python 3 Documentation" to all Python 2 documentation URLs In-Reply-To: References: Message-ID: <20170907161332.GK13110@ando.pearwood.info> On Thu, Sep 07, 2017 at 05:51:31AM +0000, Bar Harel wrote: > A bit radical but do you believe we can contact Google to alter the search > results? You should try DuckDuckGo, it recognises "python" searches and quotes from, and links to, the Python 3 documentation at the top of the page before the search results. E.g.: https://duckduckgo.com/html/?q=python%20functools Both Bing and Yahoo also gives the Python 2 docs first: https://www.bing.com/search?q=python+functools https://au.search.yahoo.com/search?p=python+functools -- Steve From steve at pearwood.info Thu Sep 7 12:25:39 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 8 Sep 2017 02:25:39 +1000 Subject: [Python-ideas] if as In-Reply-To: References: <029D6B7F-3742-4B88-91E8-1002CFB68D60@href.ch> Message-ID: <20170907162539.GL13110@ando.pearwood.info> On Thu, Sep 07, 2017 at 04:36:40PM +0200, Jason H wrote: > I also often wonder why we are left doing an assignment and test. You have two options: > 1. assign to a variable then test and use > 2. 
repeat the function call Personally, I don't see what's wrong with the "assign then test" idiom. x = something() if x: do_stuff() > I would offer that 'with' [sh|c]ould be used: > with test() as x: > handle_truthy(x) > else: > handle_falsey() # do we provide x here too? Because None vs False? This would cause confusing errors and mysterious behaviour, depending on whether the test() object was a context manager or not. Which should take priority? If you see: with spam() as x: do_stuff is that a context manager with block (like "with open(...) as f") or your boolean if test in disguise? Having "with" sometimes be a disguised "if" and sometimes a regular "with" will make it really, really hard to reason about code. -- Steve From guido at python.org Thu Sep 7 12:30:12 2017 From: guido at python.org (Guido van Rossum) Date: Thu, 7 Sep 2017 09:30:12 -0700 Subject: [Python-ideas] Adding "View Python 3 Documentation" to all Python 2 documentation URLs In-Reply-To: <20170907161332.GK13110@ando.pearwood.info> References: <20170907161332.GK13110@ando.pearwood.info> Message-ID: We just need someone with SEO experience to fix this for us. On Thu, Sep 7, 2017 at 9:13 AM, Steven D'Aprano wrote: > On Thu, Sep 07, 2017 at 05:51:31AM +0000, Bar Harel wrote: > > A bit radical but do you believe we can contact Google to alter the > search > > results? > > You should try DuckDuckGo, it recognises "python" searches and quotes > from, and links to, the Python 3 documentation at the top of the page > before the search results. E.g.: > > https://duckduckgo.com/html/?q=python%20functools > > Both Bing and Yahoo also gives the Python 2 docs first: > > https://www.bing.com/search?q=python+functools > > https://au.search.yahoo.com/search?p=python+functools > > -- > Steve > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From nas-python-ideas at arctrix.com Thu Sep 7 13:44:02 2017 From: nas-python-ideas at arctrix.com (Neil Schemenauer) Date: Thu, 7 Sep 2017 11:44:02 -0600 Subject: [Python-ideas] lazy import via __future__ or compiler analysis Message-ID: <20170907174402.ocdy3zumzdfj7xg6@python.ca> This is a half baked idea that perhaps could work. Maybe call it 2-stage module load instead of lazy. Introduce a lazy module import process that modules can opt-in to. The opt-in would either be with a __future__ statement or the compiler would statically analyze the module and determine if it is safe. E.g. if the module has no module level statements besides imports. .pyc files get some other bits of information: A) whether the module has opted for lazy import (IS_LAZY) B) the modules imported by the module (i.e. top-level imports, IMPORT_LIST) Make __import__ understand this data and do lazy loading for modules that want it. Sub-modules that have import side-effects will still execute as normal and the side effects will happen when the parent module is imported.. This would consist of a recursive process, something like: def load_module(name): if not IS_LAZY(name): import as usual else: create lazy version of module 'name' for subname in IMPORT_LIST(name): load_module(subname) An additional idea from Barry W, if a module wants lazy loading but wants to do some init when the module is "woken up", define a __init__ top-level function. 
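A sketch of what such an opt-in module might look like (both the `__future__`
spelling and the module-level `__init__` hook are hypothetical, purely to make
the idea concrete -- this does not compile on any current Python):

    # mymodule.py -- opts in to the proposed two-stage ("lazy") import
    from __future__ import lazy_import   # hypothetical opt-in spelling

    # Only imports and def/class at the top level, so the module is safe
    # to defer; no other module-level statements with side effects.
    import json
    import os

    def __init__():
        # Proposed hook: deferred initialization that runs when an
        # attribute of this module is first used, not at import time.
        global CONFIG
        with open(os.path.expanduser("~/.myapp.json")) as f:
            CONFIG = json.load(f)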
Python would call that function when attributes of the module are first actually used. My plan was to implement this with a Python __import__ implementation. I would unmarshal the .pyc, compute IS_LAZY and IMPORT_LIST at import time. So, not gaining a lot of speedup. It would prove if the idea works in terms of not causing application crashes, etc. I could try running it with bigger apps and see how many modules are flagged for lazy loading. From ericsnowcurrently at gmail.com Thu Sep 7 14:26:18 2017 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Thu, 7 Sep 2017 11:26:18 -0700 Subject: [Python-ideas] PEP 554: Stdlib Module to Support Multiple Interpreters in Python Code Message-ID: Hi all, As part of the multi-core work I'm proposing the addition of the "interpreters" module to the stdlib. This will expose the existing subinterpreters C-API to Python code. I've purposefully kept the API simple. Please let me know what you think. -eric https://www.python.org/dev/peps/pep-0554/ https://github.com/python/peps/blob/master/pep-0554.rst https://github.com/python/cpython/pull/1748 https://github.com/python/cpython/pull/1802 https://github.com/ericsnowcurrently/cpython/tree/high-level-interpreters-module ********************************************** PEP: 554 Title: Multiple Interpreters in the Stdlib Author: Eric Snow Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 2017-09-05 Python-Version: 3.7 Post-History: Abstract ======== This proposal introduces the stdlib "interpreters" module. It exposes the basic functionality of subinterpreters that exists in the C-API. Rationale ========= Running code in multiple interpreters provides a useful level of isolation within the same process. This can be leveraged in number of ways. Furthermore, subinterpreters provide a well-defined framework in which such isolation may extended. CPython has supported subinterpreters, with increasing levels of support, since version 1.5. While the feature has the potential to be a powerful tool, subinterpreters have suffered from neglect because they are not available directly from Python. Exposing the existing functionality in the stdlib will help reverse the situation. Proposal ======== The "interpreters" module will be added to the stdlib. It will provide a high-level interface to subinterpreters and wrap the low-level "_interpreters" module. The proposed API is inspired by the threading module. The module provides the following functions: enumerate(): Return a list of all existing interpreters. get_current(): Return the currently running interpreter. get_main(): Return the main interpreter. create(): Initialize a new Python interpreter and return it. The interpreter will be created in the current thread and will remain idle until something is run in it. The module also provides the following class: Interpreter(id): id: The interpreter's ID (read-only). is_running(): Return whether or not the interpreter is currently running. destroy(): Finalize and destroy the interpreter. run(code): Run the provided Python code in the interpreter, in the current OS thread. Supported code: source text. Copyright ========= This document has been placed in the public domain. 
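To make the proposed surface area concrete, basic usage under this draft API
would look something like the following (illustrative only: the interpreters
module is a proposal and does not exist in any released CPython, so this
cannot run today):

    import interpreters   # proposed stdlib module from this PEP

    main = interpreters.get_main()
    print(main.id, main.is_running())          # the interpreter we started in

    interp = interpreters.create()             # new interpreter, idle until used
    print([i.id for i in interpreters.enumerate()])

    # Only source text is supported by run() in this initial API.
    interp.run("print('hello from a subinterpreter')")

    interp.destroy()                           # finalize and destroy it when done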
From p.f.moore at gmail.com Thu Sep 7 14:52:47 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 7 Sep 2017 19:52:47 +0100 Subject: [Python-ideas] PEP 554: Stdlib Module to Support Multiple Interpreters in Python Code In-Reply-To: References: Message-ID: On 7 September 2017 at 19:26, Eric Snow wrote: > As part of the multi-core work I'm proposing the addition of the > "interpreters" module to the stdlib. This will expose the existing > subinterpreters C-API to Python code. I've purposefully kept the API > simple. Please let me know what you think. Looks good. I agree with the idea of keeping the interface simple in the first instance - we can easily add extra functionality later, but removing stuff (or worse still, finding that stuff we thought was OK but had missed corner cases of was broken) is much harder. > run(code): > > Run the provided Python code in the interpreter, in the current > OS thread. Supported code: source text. The only quibble I have is that I'd prefer it if we had a run(callable, *args, **kwargs) method. Either instead of, or as well as, the run(string) one here. Is there any reason why passing a callable and args is unsafe, and/or difficult? Naively, I'd assume that interp.call('f(a)') would be precisely as safe as interp.call(f, a) Am I missing something? Name visibility or scoping issues come to mind as possible complications I'm not seeing. At the least, if we don't want a callable-and-args form yet, a note in the PEP explaining why it's been omitted would be worthwhile. Paul From rymg19 at gmail.com Thu Sep 7 14:35:14 2017 From: rymg19 at gmail.com (rymg19 at gmail.com) Date: Thu, 7 Sep 2017 11:35:14 -0700 Subject: [Python-ideas] Extension of python/json syntax to support explicitly sets and ordered dict. In-Reply-To: <<20170907113631.GA20825@alternativeoutput.it>> Message-ID: IIRC in CPython 3.6 and PyPy dicts are ordered based on insertion anyway; although it's an implementation-specific detail, realistically it removes some of the use cases for ordered dictionary literals. -- Ryan (????) Yoko Shimomura, ryo (supercell/EGOIST), Hiroyuki Sawano >> everyone elsehttp://refi64.com On Sep 7, 2017 at 6:40 AM, > wrote: Hi all, few days ago I thought about a way to rapresent sets and ordered dicts using a json compatible syntax, these are my conclusions: A set could be defined as { item1, item2, item3[...] } with {,} as an empty set An ordered dict could be defined as [ item1: value1, item2: value2 ... ] with [:] ase an empty odered dict It could be used inside python code or to serialize python structures in a json-like format (pyson maybe ?). What do you think about ? Kind regards, Matteo. -- email: nastasi at alternativeoutput.it, matteo.nastasi at gmail.com web: www.alternativeoutput.it irc: #linux-mi at irc.freenode.net linkedin: http://lnkd.in/SPQG87 _______________________________________________ Python-ideas mailing list Python-ideas at python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From ericsnowcurrently at gmail.com Thu Sep 7 15:14:19 2017 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Thu, 7 Sep 2017 12:14:19 -0700 Subject: [Python-ideas] PEP 554: Stdlib Module to Support Multiple Interpreters in Python Code In-Reply-To: References: Message-ID: On Thu, Sep 7, 2017 at 11:52 AM, Paul Moore wrote: > The only quibble I have is that I'd prefer it if we had a > run(callable, *args, **kwargs) method. 
Either instead of, or as well
> as, the run(string) one here.
>
> Is there any reason why passing a callable and args is unsafe, and/or
> difficult? Naively, I'd assume that
>
> interp.call('f(a)')
>
> would be precisely as safe as
>
> interp.call(f, a)

The problem for now is with sharing objects between interpreters. The
simplest safe approach currently is to restrict execution to source
strings. Then there are no complications. Interpreter.call() makes
sense but I'd like to wait until we get a feel for how subinterpreters
get used and until we address some of the issues with object passing.

FWIW, here is what I see as the next steps for subinterpreters in the stdlib:

1. add a basic queue class for passing objects between interpreters
   * only support strings at first (though Nick pointed out we could
     fall back to pickle or marshal for unsupported objects)
2. implement CSP on top of subinterpreters
3. expand the queue's supported types
4. add something like Interpreter.call()

I didn't include such a queue in this proposal because I wanted to
keep it as focused as possible. I'll add a note to the PEP about
this.

>
> Am I missing something? Name visibility or scoping issues come to mind
> as possible complications I'm not seeing. At the least, if we don't
> want a callable-and-args form yet, a note in the PEP explaining why
> it's been omitted would be worthwhile.

I'll add a note to the PEP. Thanks for pointing this out. :)

-eric

From sebastian at realpath.org  Thu Sep 7 15:07:50 2017
From: sebastian at realpath.org (Sebastian Krause)
Date: Thu, 07 Sep 2017 21:07:50 +0200
Subject: [Python-ideas] Adding "View Python 3 Documentation" to all Python 2 documentation URLs
In-Reply-To: (Guido van Rossum's message of "Thu, 7 Sep 2017 09:30:12 -0700")
References: <20170907161332.GK13110@ando.pearwood.info>
Message-ID: 

Guido van Rossum wrote:
> We just need someone with SEO experience to fix this for us.

I'm not an SEO expert, but I think a possible approach would be using
(or partly abusing) the <link rel="canonical"> element on the
documentation pages:
https://support.google.com/webmasters/answer/139066

Python's documentation is already using these canonical links, just in
an incomplete way: The Python documentation for 2.7 and 3.5 and later
has a <link rel="canonical"> pointing to the generic documentation URL
of the same major Python version, e.g. in
https://docs.python.org/3/library/functools.html you can find this in
the page source:

    <link rel="canonical" href="https://docs.python.org/3/library/functools.html" />

This is why if you search for "python 3 $module" in Google, you'll
never see a direct link to the 3.5 or 3.6 versions of the
documentation (because Google merges them with the generic
docs.python.org/3/), but you still get results for versions 3.2, 3.3
etc. of the documentation (because they lack the canonical links).

A very good step would be to also add this canonical link to the
documentation versions 3.0-3.4, this will make docs.python.org/3.3/
etc. vanish from Google and probably rank the generic
docs.python.org/3/ higher than now.

And now to the abuse part (I'm honestly not sure how well this would
actually work): If you would add such a canonical link in
docs.python.org/2/ pointing to docs.python.org/3/ (at least where the
same module exists in both versions), you would eventually hide /2/
from the search results.
If you don't want to be so extreme (people still want Python 2 documentation if they search for "python 2 $module") you could remove the canonical link just from the specific docs.python.org/2.7/ which will then be ranked much lower than docs.python.org/3/ in the search results, but at least still show up. Sebastian From barry at python.org Thu Sep 7 15:19:10 2017 From: barry at python.org (Barry Warsaw) Date: Thu, 07 Sep 2017 12:19:10 -0700 Subject: [Python-ideas] lazy import via __future__ or compiler analysis In-Reply-To: <20170907174402.ocdy3zumzdfj7xg6@python.ca> References: <20170907174402.ocdy3zumzdfj7xg6@python.ca> Message-ID: Neil Schemenauer wrote: > Introduce a lazy module import process that modules can opt-in to. > The opt-in would either be with a __future__ statement or the > compiler would statically analyze the module and determine if it is > safe. E.g. if the module has no module level statements besides > imports. and `def` and `class` of course. There are a few other things that might end up marking a module as "industrious" (my thesaurus's antonym for "lazy"). There will likely be assignments of module global such as: MY_CONST = 'something' and it may even be a little more complicated: COLORS = dict( red=1, blue=2, green=3, ) REVERSE = {value: key for key, value in COLORS.items()} A naive evaluation of such a module might not notice them as lazy, but I think they could still be treated as such. Function and class decorators might also be false positives. E.g. @public def my_public_function(): pass or even @mungify class Munged: pass Maybe that's just the cost of doing business, and if they clear the lazy flag, so be it. But it feels like doing so will leave quite a bit of lazy loading opportunity left on the table. And I'm not sure you can solve all of those by moving things to a module level __init__(). Cheers, -Barry From flying-sheep at web.de Thu Sep 7 15:21:40 2017 From: flying-sheep at web.de (Philipp A.) Date: Thu, 07 Sep 2017 19:21:40 +0000 Subject: [Python-ideas] if as In-Reply-To: <20170907162539.GL13110@ando.pearwood.info> References: <029D6B7F-3742-4B88-91E8-1002CFB68D60@href.ch> <20170907162539.GL13110@ando.pearwood.info> Message-ID: Sadly it?s hard to create a context manager that skips its body like this: with unpack(computation()) as result: do_something_with_result(result) You can do it with some hackery like described here: https://stackoverflow.com/a/12594789/247482 class unpack: def __init__(self, pred): self.pred = pred def __enter__(self): if self.pred: return self.pred # else skip the with block?s body sys.settrace(lambda *args, **kw: None) frame = inspect.currentframe(1) frame.f_trace = self.trace def trace(self, frame, event, arg): raise def __exit__(self, type, value, traceback): return True # suppress the exception Steven D'Aprano schrieb am Do., 7. Sep. 2017 um 18:26 Uhr: > On Thu, Sep 07, 2017 at 04:36:40PM +0200, Jason H wrote: > > > I also often wonder why we are left doing an assignment and test. You > have two options: > > 1. assign to a variable then test and use > > 2. repeat the function call > > Personally, I don't see what's wrong with the "assign then test" idiom. > > x = something() > if x: > do_stuff() > > > > I would offer that 'with' [sh|c]ould be used: > > with test() as x: > > handle_truthy(x) > > else: > > handle_falsey() # do we provide x here too? Because None vs False? > > > This would cause confusing errors and mysterious behaviour, depending on > whether the test() object was a context manager or not. 
Which should > take priority? If you see: > > with spam() as x: > do_stuff > > is that a context manager with block (like "with open(...) as f") or > your boolean if test in disguise? > > Having "with" sometimes be a disguised "if" and sometimes a regular > "with" will make it really, really hard to reason about code. > > > -- > Steve > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at realpath.org Thu Sep 7 15:24:36 2017 From: sebastian at realpath.org (Sebastian Krause) Date: Thu, 07 Sep 2017 21:24:36 +0200 Subject: [Python-ideas] Adding "View Python 3 Documentation" to all Python 2 documentation URLs In-Reply-To: (Sebastian Krause's message of "Thu, 07 Sep 2017 21:07:50 +0200") References: <20170907161332.GK13110@ando.pearwood.info> Message-ID: Sebastian Krause wrote: > This is why if you search for "python 3 $module" in Google, you'll > never see a direct link to the 3.5 or 3.6 versions of the > documentation (because Google merges them with the generic > docs.python.org/3/), but you still results for versions 3.2, 3.3 > etc. of the documentation (because the lack the canonical links). > > A very good step would be to also add this canoncial link to the > documentation versions 3.0-3.4, this will make docs.python.org/3.3/ > etc. vanish from Google and probably rank the generic > docs.python.org/3/ higher than now. Here is Nick Coghlan's bpo issue which added these canonical links: https://bugs.python.org/issue26355 - looks like applying this to the older doc versions never happened in the end. From flying-sheep at web.de Thu Sep 7 15:40:50 2017 From: flying-sheep at web.de (Philipp A.) Date: Thu, 07 Sep 2017 19:40:50 +0000 Subject: [Python-ideas] if as In-Reply-To: References: <029D6B7F-3742-4B88-91E8-1002CFB68D60@href.ch> <20170907162539.GL13110@ando.pearwood.info> Message-ID: sorry, it?s a bit more difficult. this works: https://gist.github.com/flying-sheep/86dfcc1bdd71a33fa3483b83e254084c Philipp A. schrieb am Do., 7. Sep. 2017 um 21:18 Uhr: > Sadly it?s hard to create a context manager that skips its body like this: > > with unpack(computation()) as result: > do_something_with_result(result) > > You can do it with some hackery like described here: > https://stackoverflow.com/a/12594789/247482 > > class unpack: > def __init__(self, pred): > self.pred = pred > > def __enter__(self): > if self.pred: > return self.pred > # else skip the with block?s body > sys.settrace(lambda *args, **kw: None) > frame = inspect.currentframe(1) > frame.f_trace = self.trace > > def trace(self, frame, event, arg): > raise > > def __exit__(self, type, value, traceback): > return True # suppress the exception > > Steven D'Aprano schrieb am Do., 7. Sep. 2017 um > 18:26 Uhr: > >> On Thu, Sep 07, 2017 at 04:36:40PM +0200, Jason H wrote: >> >> > I also often wonder why we are left doing an assignment and test. You >> have two options: >> > 1. assign to a variable then test and use >> > 2. repeat the function call >> >> Personally, I don't see what's wrong with the "assign then test" idiom. >> >> x = something() >> if x: >> do_stuff() >> >> >> > I would offer that 'with' [sh|c]ould be used: >> > with test() as x: >> > handle_truthy(x) >> > else: >> > handle_falsey() # do we provide x here too? Because None vs False? 
>> >> >> This would cause confusing errors and mysterious behaviour, depending on >> whether the test() object was a context manager or not. Which should >> take priority? If you see: >> >> with spam() as x: >> do_stuff >> >> is that a context manager with block (like "with open(...) as f") or >> your boolean if test in disguise? >> >> Having "with" sometimes be a disguised "if" and sometimes a regular >> "with" will make it really, really hard to reason about code. >> >> >> -- >> Steve >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.f.moore at gmail.com Thu Sep 7 15:44:42 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 7 Sep 2017 20:44:42 +0100 Subject: [Python-ideas] PEP 554: Stdlib Module to Support Multiple Interpreters in Python Code In-Reply-To: References: Message-ID: On 7 September 2017 at 20:14, Eric Snow wrote: > On Thu, Sep 7, 2017 at 11:52 AM, Paul Moore wrote: >> Is there any reason why passing a callable and args is unsafe, and/or >> difficult? Naively, I'd assume that >> >> interp.call('f(a)') >> >> would be precisely as safe as >> >> interp.call(f, a) > > The problem for now is with sharing objects between interpreters. The > simplest safe approach currently is to restrict execution to source > strings. Then there are no complications. Interpreter.call() makes > sense but I'd like to wait until we get feel for how subinterpreters > get used and until we address some of the issues with object passing. Ah, OK. so if I create a new interpreter, none of the classes, functions, or objects defined in my calling code will exist within the target interpreter? That makes sense, but I'd missed that nuance from the description. Again, this is probably worth noting in the PEP. And for the record, based on that one fact, I'm perfectly OK with the initial API being string-only. > FWIW, here are what I see as the next steps for subinterpreters in the stdlib: > > 1. add a basic queue class for passing objects between interpreters > * only support strings at first (though Nick pointed out we could > fall back to pickle or marshal for unsupported objects) > 2. implement CSP on top of subinterpreters > 3. expand the queue's supported types > 4. add something like Interpreter.call() > > I didn't include such a queue in this proposal because I wanted to > keep it as focused as possible. I'll add a note to the PEP about > this. This all sounds very reasonable. Thanks for the clarification. Paul From rosuav at gmail.com Thu Sep 7 15:48:29 2017 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 8 Sep 2017 05:48:29 +1000 Subject: [Python-ideas] if as In-Reply-To: References: <029D6B7F-3742-4B88-91E8-1002CFB68D60@href.ch> <20170907162539.GL13110@ando.pearwood.info> Message-ID: On Fri, Sep 8, 2017 at 5:40 AM, Philipp A. wrote: > sorry, it?s a bit more difficult. this works: > https://gist.github.com/flying-sheep/86dfcc1bdd71a33fa3483b83e254084c If this can be made to work - even hackily - can it be added into contextlib.contextmanager (or technically its helper class)? 
Currently, its __enter__ looks like this: def __enter__(self): try: return next(self.gen) except StopIteration: raise RuntimeError("generator didn't yield") from None If that raise were replaced with the hackiness of skipping the body, we could wrap everything up nicely: @contextlib.contextmanager def iff(thing): if thing: yield thing # otherwise don't yield for i in range(1, 11): with iff(random.randrange(3)) as val: print(i, val) # won't print any zeroes ChrisA From sebastian at realpath.org Thu Sep 7 16:14:48 2017 From: sebastian at realpath.org (Sebastian Krause) Date: Thu, 07 Sep 2017 22:14:48 +0200 Subject: [Python-ideas] PEP 554: Stdlib Module to Support Multiple Interpreters in Python Code In-Reply-To: (Eric Snow's message of "Thu, 7 Sep 2017 12:14:19 -0700") References: Message-ID: Eric Snow wrote: > 1. add a basic queue class for passing objects between interpreters > * only support strings at first (though Nick pointed out we could > fall back to pickle or marshal for unsupported objects) > 2. implement CSP on top of subinterpreters > 3. expand the queue's supported types > 4. add something like Interpreter.call() How is the GIL situation with subinterpreters these days, is the long-term goal still "solving multi-core Python", i.e. using multiple CPU cores from within the same process? Or is it mainly used for isolation? Sebastian From nas at arctrix.com Thu Sep 7 16:27:14 2017 From: nas at arctrix.com (Neil Schemenauer) Date: Thu, 7 Sep 2017 20:27:14 +0000 (UTC) Subject: [Python-ideas] lazy import via __future__ or compiler analysis References: <20170907174402.ocdy3zumzdfj7xg6@python.ca> Message-ID: Barry Warsaw wrote: > There are a few other things that might end up marking a module as > "industrious" (my thesaurus's antonym for "lazy"). Good points. The analysis can be simple at first and then we can enhance it to be smarter about what is okay and still lazy load. We may evolve it over time too, making things that are not strictly safe still not trigger the "industrious" load lazy anyhow. Another idea is to introduce __lazy__ or some such in the global namespace of the module, if present, e.g. __lazy__ = True then the analysis doesn't do anything except return True. The module has explicitly stated that side-effects in the top-level code are okay to be done in a lazy fashion. Perhaps with a little bit of smarts in the analsis and a little sprinkling of __lazy__ flags, we can get a big chunk of modules to lazy load. From ericsnowcurrently at gmail.com Thu Sep 7 17:05:09 2017 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Thu, 7 Sep 2017 14:05:09 -0700 Subject: [Python-ideas] PEP 554: Stdlib Module to Support Multiple Interpreters in Python Code In-Reply-To: References: Message-ID: On Thu, Sep 7, 2017 at 12:44 PM, Paul Moore wrote: > Ah, OK. so if I create a new interpreter, none of the classes, > functions, or objects defined in my calling code will exist within the > target interpreter? That makes sense, but I'd missed that nuance from > the description. Again, this is probably worth noting in the PEP. I'll make sure the PEP is more clear about this. > > And for the record, based on that one fact, I'm perfectly OK with the > initial API being string-only. Great! 
:) -eric From ericsnowcurrently at gmail.com Thu Sep 7 17:09:44 2017 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Thu, 7 Sep 2017 14:09:44 -0700 Subject: [Python-ideas] PEP 554: Stdlib Module to Support Multiple Interpreters in Python Code In-Reply-To: References: Message-ID: On Thu, Sep 7, 2017 at 1:14 PM, Sebastian Krause wrote: > How is the GIL situation with subinterpreters these days, is the > long-term goal still "solving multi-core Python", i.e. using > multiple CPU cores from within the same process? Or is it mainly > used for isolation? The GIL is still process-global. The goal is indeed to change this to support actual multi-core parallelism. However, the benefits of interpreter isolation are certainly a win otherwise. :) -eric From njs at pobox.com Thu Sep 7 18:48:01 2017 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 7 Sep 2017 15:48:01 -0700 Subject: [Python-ideas] PEP 554: Stdlib Module to Support Multiple Interpreters in Python Code In-Reply-To: References: Message-ID: On Thu, Sep 7, 2017 at 11:26 AM, Eric Snow wrote: > Hi all, > > As part of the multi-core work I'm proposing the addition of the > "interpreters" module to the stdlib. This will expose the existing > subinterpreters C-API to Python code. I've purposefully kept the API > simple. Please let me know what you think. My concern about this is the same as it was last time -- the work looks neat, but right now, almost no-one uses subinterpreters (basically it's Jep and mod_wsgi and that's it?), and therefore many packages get away with ignoring subinterpreters. Numpy is the one I'm most familiar with: when we get subinterpreter bugs we close them wontfix, because supporting subinterpreters properly would require non-trivial auditing, add overhead for non-subinterpreter use cases, and benefit a tiny tiny fraction of our users. If we add a friendly python-level API like this, then we're committing to this being a part of Python for the long term and encouraging people to use it, which puts pressure on downstream packages to do that work... but it's still not clear whether any benefits will actually materialize. I've actually argued with the PyPy devs to try to convince them to add subinterpreter support as part of their experiments with GIL-removal, because I think the semantics would genuinely be nicer to work with than raw threads, but they're convinced that it's impossible to make this work. Or more precisely, they think you could make it work in theory, but that it would be impossible to make it meaningfully more efficient than using multiple processes. I want them to be wrong, but I have to admit I can't see a way to make it work either... If this is being justified by the multicore use case, and specifically by the theory that having two interpreters in the same process will allow for more efficient communication than two interpreters in two different processes, then... why should we believe that that's actually possible? I want your project to succeed, but if it's going to fail then it seems better if it fails before we commit to exposing new APIs. -n -- Nathaniel J. 
Smith -- https://vorpus.org From ncoghlan at gmail.com Thu Sep 7 18:51:34 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 7 Sep 2017 15:51:34 -0700 Subject: [Python-ideas] Adding "View Python 3 Documentation" to all Python 2 documentation URLs In-Reply-To: References: <20170907161332.GK13110@ando.pearwood.info> Message-ID: On 7 September 2017 at 12:24, Sebastian Krause wrote: > Sebastian Krause wrote: >> This is why if you search for "python 3 $module" in Google, you'll >> never see a direct link to the 3.5 or 3.6 versions of the >> documentation (because Google merges them with the generic >> docs.python.org/3/), but you still results for versions 3.2, 3.3 >> etc. of the documentation (because the lack the canonical links). >> >> A very good step would be to also add this canoncial link to the >> documentation versions 3.0-3.4, this will make docs.python.org/3.3/ >> etc. vanish from Google and probably rank the generic >> docs.python.org/3/ higher than now. > > Here is Nick Coghlan's bpo issue which added these canonical links: > https://bugs.python.org/issue26355 - looks like applying this to the > older doc versions never happened in the end. Right, as adding those will need to be handled through the web server and/or by manually regenerating the docs with the additional HTML headers - making the changes to the CPython repo won't achieve anything, since the docs for those branches aren't automatically regenerated anymore. Another big task that could be undertaken is to start re-routing unqualified deep links to Python 3 - the reason we still haven't done that is because the tree layout is actually different (as per the six.moves module), so it isn't a trivial rewrite rule the way the current redirection into the Python 2 docs is. Given such a mapping, it would also be possible to add the corresponding canonical URL entries to the Python 2.7 documentation to merge their search ranking in to the corresponding Python 3 pages. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Thu Sep 7 19:23:20 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 7 Sep 2017 16:23:20 -0700 Subject: [Python-ideas] PEP 554: Stdlib Module to Support Multiple Interpreters in Python Code In-Reply-To: References: Message-ID: On 7 September 2017 at 15:48, Nathaniel Smith wrote: > I've actually argued with the PyPy devs to try to convince them to add > subinterpreter support as part of their experiments with GIL-removal, > because I think the semantics would genuinely be nicer to work with > than raw threads, but they're convinced that it's impossible to make > this work. Or more precisely, they think you could make it work in > theory, but that it would be impossible to make it meaningfully more > efficient than using multiple processes. I want them to be wrong, but > I have to admit I can't see a way to make it work either... The gist of the idea is that with subinterpreters, your starting point is multiprocessing-style isolation (i.e. you have to use pickle to transfer data between subinterpreters), but you're actually running in a shared-memory threading context from the operating system's perspective, so you don't need to rely on mmap to share memory over a non-streaming interface. 
It's also definitely the case that to make this viable, we'd need to provide fast subinterpreter friendly alternatives to C globals for use by extension modules (otherwise adding subinterpreter compatibility will be excessively painful), and PEP 550 is likely to be helpful there. Personally, I think it would make sense to add the module under PEP 411 provisional status, and make it's continued existence as a public API contingent on actually delivering on the "lower overhead multi-core support than multiprocessing" goal (even if it only delivers on that front on Windows, where process creation is more expensive and there's no fork() equivalent). However, I'd also be entirely happy with our adding it as a private "_subinterpreters" API for testing & experimentation purposes (see https://bugs.python.org/issue30439 ), and reconsidering introducing it as a public API after there's more concrete evidence as to what can actually be achieved based on it. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From njs at pobox.com Thu Sep 7 20:15:16 2017 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 7 Sep 2017 17:15:16 -0700 Subject: [Python-ideas] PEP 554: Stdlib Module to Support Multiple Interpreters in Python Code In-Reply-To: References: Message-ID: On Thu, Sep 7, 2017 at 4:23 PM, Nick Coghlan wrote: > On 7 September 2017 at 15:48, Nathaniel Smith wrote: >> I've actually argued with the PyPy devs to try to convince them to add >> subinterpreter support as part of their experiments with GIL-removal, >> because I think the semantics would genuinely be nicer to work with >> than raw threads, but they're convinced that it's impossible to make >> this work. Or more precisely, they think you could make it work in >> theory, but that it would be impossible to make it meaningfully more >> efficient than using multiple processes. I want them to be wrong, but >> I have to admit I can't see a way to make it work either... > > The gist of the idea is that with subinterpreters, your starting point > is multiprocessing-style isolation (i.e. you have to use pickle to > transfer data between subinterpreters), but you're actually running in > a shared-memory threading context from the operating system's > perspective, so you don't need to rely on mmap to share memory over a > non-streaming interface. The challenge is that streaming bytes between processes is actually really fast -- you don't really need mmap for that. (Maybe this was important for X11 back in the 1980s, but a lot has changed since then :-).) And if you want to use pickle and multiprocessing to send, say, a single big numpy array between processes, that's also really fast, because it's basically just a few memcpy's. The slow case is passing complicated objects between processes, and it's slow because pickle has to walk the object graph to serialize it, and walking the object graph is slow. Copying object graphs between subinterpreters has the same problem. So the only case I can see where I'd expect subinterpreters to make communication dramatically more efficient is if you have a "deeply immutable" type: one where not only are its instances immutable, but all objects reachable from those instances are also guaranteed to be immutable. So like, a tuple except that when you instantiate it it validates that all of its elements are also marked as deeply immutable, and errors out if not. 
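Something like this toy sketch, say - a hypothetical class, purely to make the
idea concrete:

    _DEEPLY_IMMUTABLE = (int, float, complex, str, bytes, bool, type(None))

    class FrozenTuple(tuple):
        # Toy "deeply immutable" container: construction fails unless
        # every element is itself known to be deeply immutable.
        def __new__(cls, iterable=()):
            self = super().__new__(cls, iterable)
            for item in self:
                if not isinstance(item, _DEEPLY_IMMUTABLE + (FrozenTuple,)):
                    raise TypeError("not deeply immutable: %r" % (item,))
            return self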
Then when you go to send this between subinterpreters, you can tell by checking the type of the root object that the whole graph is immutable, so you don't need to walk it yourself. However, it seems impossible to support user-defined deeply-immutable types in Python: types and functions are themselves mutable and hold tons of references to other potentially mutable objects via __mro__, __globals__, __weakrefs__, etc. etc., so even if a user-defined instance can be made logically immutable it's still going to hold references to mutable things. So the one case where subinterpreters win is if you have a really big and complicated set of nested pseudo-tuples of ints and strings and you're bottlenecked on passing it between interpreters. Maybe frozendicts too. Is that enough to justify the whole endeavor? It seems dubious to me. I guess the other case where subprocesses lose to "real" threads is startup time on Windows. But starting a subinterpreter is also much more expensive than starting a thread, once you take into account the cost of loading the application's modules into the new interpreter. In both cases you end up needing some kind of process/subinterpreter pool or cache to amortize that cost. Obviously I'm committing the cardinal sin of trying to guess about performance based on theory instead of measurement, so maybe I'm wrong. Or maybe there's some deviously clever trick I'm missing. I hope so -- a really useful subinterpreter multi-core store would be awesome. -n -- Nathaniel J. Smith -- https://vorpus.org From chris.barker at noaa.gov Thu Sep 7 20:49:31 2017 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Thu, 7 Sep 2017 17:49:31 -0700 Subject: [Python-ideas] Extension of python/json syntax to support explicitly sets and ordered dict. In-Reply-To: References: Message-ID: <-4180454120286379247@unknownmsgid> a json-like format (pyson maybe ?). I gonna pyson is a fine idea. But don't call it extended JSON ;-) For me, the point would be to capture Python' s richer data types. But would you need an OrderedDict? As pointed out, in recent cPython ( and pypy) dicts are ordered by default, but it's not part of the language spec) And you can use ast.literal_eval as a pyson parser, so it's almost ready to use :-) -CHB What do you think about ? Kind regards, Matteo. -- email: nastasi at alternativeoutput.it, matteo.nastasi at gmail.com web: www.alternativeoutput.it irc: #linux-mi at irc.freenode.net linkedin: http://lnkd.in/SPQG87 _______________________________________________ Python-ideas mailing list Python-ideas at python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/ _______________________________________________ Python-ideas mailing list Python-ideas at python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Thu Sep 7 21:00:29 2017 From: shoyer at gmail.com (Stephan Hoyer) Date: Fri, 08 Sep 2017 01:00:29 +0000 Subject: [Python-ideas] PEP 554: Stdlib Module to Support Multiple Interpreters in Python Code In-Reply-To: References: Message-ID: On Thu, Sep 7, 2017 at 5:15 PM Nathaniel Smith wrote: > On Thu, Sep 7, 2017 at 4:23 PM, Nick Coghlan wrote: > > The gist of the idea is that with subinterpreters, your starting point > > is multiprocessing-style isolation (i.e. 
you have to use pickle to > > transfer data between subinterpreters), but you're actually running in > > a shared-memory threading context from the operating system's > > perspective, so you don't need to rely on mmap to share memory over a > > non-streaming interface. > > The challenge is that streaming bytes between processes is actually > really fast -- you don't really need mmap for that. (Maybe this was > important for X11 back in the 1980s, but a lot has changed since then > :-).) And if you want to use pickle and multiprocessing to send, say, > a single big numpy array between processes, that's also really fast, > because it's basically just a few memcpy's. The slow case is passing > complicated objects between processes, and it's slow because pickle > has to walk the object graph to serialize it, and walking the object > graph is slow. Copying object graphs between subinterpreters has the > same problem. > This doesn't match up with my (somewhat limited) experience. For example, in this table of bandwidth estimates from Matthew Rocklin (CCed), IPC is about 10x slower than a memory copy: http://matthewrocklin.com/blog/work/2015/12/29/data-bandwidth This makes a considerable difference when building a system do to parallel data analytics in Python (e.g., on NumPy arrays), which is exactly what Matthew has been working on for the past few years. I'm sure there are other ways to avoid this expensive IPC without using sub-interpreters, e.g., by using a tool like Plasma ( http://arrow.apache.org/blog/2017/08/08/plasma-in-memory-object-store/). But I'm skeptical of your assessment that the current multiprocessing approach is fast enough. -------------- next part -------------- An HTML attachment was scrubbed... URL: From mrocklin at gmail.com Thu Sep 7 21:14:50 2017 From: mrocklin at gmail.com (Matthew Rocklin) Date: Thu, 7 Sep 2017 21:14:50 -0400 Subject: [Python-ideas] PEP 554: Stdlib Module to Support Multiple Interpreters in Python Code In-Reply-To: References: Message-ID: Those numbers were for common use in Python tools and reflected my anecdotal experience at the time with normal Python tools. I'm sure that there are mechanisms to achieve faster speeds than what I experienced. That being said, here is a small example. In [1]: import multiprocessing In [2]: data = b'0' * 100000000 # 100 MB In [3]: from toolz import identity In [4]: pool = multiprocessing.Pool() In [5]: %time _ = pool.apply_async(identity, (data,)).get() CPU times: user 76 ms, sys: 64 ms, total: 140 ms Wall time: 252 ms This is about 400MB/s for a roundtrip On Thu, Sep 7, 2017 at 9:00 PM, Stephan Hoyer wrote: > On Thu, Sep 7, 2017 at 5:15 PM Nathaniel Smith wrote: > >> On Thu, Sep 7, 2017 at 4:23 PM, Nick Coghlan wrote: >> > The gist of the idea is that with subinterpreters, your starting point >> > is multiprocessing-style isolation (i.e. you have to use pickle to >> > transfer data between subinterpreters), but you're actually running in >> > a shared-memory threading context from the operating system's >> > perspective, so you don't need to rely on mmap to share memory over a >> > non-streaming interface. >> >> The challenge is that streaming bytes between processes is actually >> really fast -- you don't really need mmap for that. (Maybe this was >> important for X11 back in the 1980s, but a lot has changed since then >> :-).) And if you want to use pickle and multiprocessing to send, say, >> a single big numpy array between processes, that's also really fast, >> because it's basically just a few memcpy's. 
The slow case is passing >> complicated objects between processes, and it's slow because pickle >> has to walk the object graph to serialize it, and walking the object >> graph is slow. Copying object graphs between subinterpreters has the >> same problem. >> > > This doesn't match up with my (somewhat limited) experience. For example, > in this table of bandwidth estimates from Matthew Rocklin (CCed), IPC is > about 10x slower than a memory copy: > http://matthewrocklin.com/blog/work/2015/12/29/data-bandwidth > > This makes a considerable difference when building a system do to parallel > data analytics in Python (e.g., on NumPy arrays), which is exactly what > Matthew has been working on for the past few years. > > I'm sure there are other ways to avoid this expensive IPC without using > sub-interpreters, e.g., by using a tool like Plasma ( > http://arrow.apache.org/blog/2017/08/08/plasma-in-memory-object-store/). > But I'm skeptical of your assessment that the current multiprocessing > approach is fast enough. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From nas-python-ideas at arctrix.com Thu Sep 7 22:49:14 2017 From: nas-python-ideas at arctrix.com (Neil Schemenauer) Date: Thu, 7 Sep 2017 20:49:14 -0600 Subject: [Python-ideas] Lazy creation of module level functions and classes Message-ID: <20170908024914.jeqmtspsnx5cvrxm@python.ca> This is an idea that come out of the lazy loading modules idea. Larry Hastings mentioned what a good improvement this was for PHP. I think it would help a lot of Python too. Very many functions and classes are not actually needed but are instantiated anyhow. Back of napkin idea: Write AST transformer tool, change top-level functions and classes to be like properties (can use __class__ I guess) Transform is something like: # old code def inc(x): return x + 1 # transformed code def __make_inc(code=): obj = eval(code) _ModuleClass.inc = obj # only do eval once return obj inc = property(__make_inc) Totally seat of pants idea but I can't think of a reason why it shouldn't work. It seems much more powerful than lazying loading modules. In the lazy module case, you load the whole module if any part is touched. Many modules only have a small fraction of their functions and classes actually used. If this transformer idea works, the standard Python compiler could be changed to do the above stuff, no transformer needed. From ericsnowcurrently at gmail.com Thu Sep 7 23:11:24 2017 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Thu, 7 Sep 2017 20:11:24 -0700 Subject: [Python-ideas] PEP 554: Stdlib Module to Support Multiple Interpreters in Python Code In-Reply-To: References: Message-ID: First of all, thanks for the feedback and encouragement! Responses in-line below. -eric On Thu, Sep 7, 2017 at 3:48 PM, Nathaniel Smith wrote: > My concern about this is the same as it was last time -- the work > looks neat, but right now, almost no-one uses subinterpreters > (basically it's Jep and mod_wsgi and that's it?), and therefore many > packages get away with ignoring subinterpreters. My concern is that this is a chicken-and-egg problem. The situation won't improve until subinterpreters are more readily available. > Numpy is the one I'm > most familiar with: when we get subinterpreter bugs we close them > wontfix, because supporting subinterpreters properly would require > non-trivial auditing, add overhead for non-subinterpreter use cases, > and benefit a tiny tiny fraction of our users. 
The main problem of which I'm aware is C globals in libraries and extension modules. PEPs 489 and 3121 are meant to help but I know that there is at least one major situation which is still a blocker for multi-interpreter-safe module state. Other than C globals, is there some other issue? > If we add a friendly python-level API like this, then we're committing > to this being a part of Python for the long term and encouraging > people to use it, which puts pressure on downstream packages to do > that work... but it's still not clear whether any benefits will > actually materialize. I'm fine with Nick's idea about making this a "provisional" module. Would that be enough to ease your concern here? > I've actually argued with the PyPy devs to try to convince them to add > subinterpreter support as part of their experiments with GIL-removal, > because I think the semantics would genuinely be nicer to work with > than raw threads, but they're convinced that it's impossible to make > this work. Or more precisely, they think you could make it work in > theory, but that it would be impossible to make it meaningfully more > efficient than using multiple processes. I want them to be wrong, but > I have to admit I can't see a way to make it work either... Yikes! Given the people involved I don't find that to be a good sign. Nevertheless, I still consider my ultimate goals to be tractable and will press forward. At each step thus far, the effort has led to improvements that extend beyond subinterpreters and multi-core. I see that trend continuing for the entirety of the project. Even if my final goal is not realized, the result will still be significantly net positive...and I still think it will still work out. :) > If this is being justified by the multicore use case, and specifically > by the theory that having two interpreters in the same process will > allow for more efficient communication than two interpreters in two > different processes, then... why should we believe that that's > actually possible? I want your project to succeed, but if it's going > to fail then it seems better if it fails before we commit to exposing > new APIs. The project is partly about performance. However, it's also particularly about offering a alternative concurrency model with an implementation that can run in multiple threads simultaneously in the same process. On Thu, Sep 7, 2017 at 5:15 PM, Nathaniel Smith wrote: > The slow case is passing > complicated objects between processes, and it's slow because pickle > has to walk the object graph to serialize it, and walking the object > graph is slow. Copying object graphs between subinterpreters has the > same problem. The initial goal is to support passing only strings between interpreters. Later efforts will involve investigating approaches to efficiently and safely passing other objects. > So the only case I can see where I'd expect subinterpreters to make > communication dramatically more efficient is if you have a "deeply > immutable" type > [snip] > However, it seems impossible to support user-defined deeply-immutable > types in Python: > [snip] I agree that it is currently not an option. That is part of the exercise. There are a number of possible solutions to explore once we get to that point. However, this PEP isn't about that. I'm confident enough about the possibilities that I'm comfortable with moving forward here. > I guess the other case where subprocesses lose to "real" threads is > startup time on Windows. 
But starting a subinterpreter is also much > more expensive than starting a thread, once you take into account the > cost of loading the application's modules into the new interpreter. In > both cases you end up needing some kind of process/subinterpreter pool > or cache to amortize that cost. Interpreter startup costs (and optimization strategies) are another aspect of the project which deserve attention. However, we'll worry about that after the core functionality has been achieved. > Obviously I'm committing the cardinal sin of trying to guess about > performance based on theory instead of measurement, so maybe I'm > wrong. Or maybe there's some deviously clever trick I'm missing. :) I'd certainly be interested in more data regarding the relative performance of fork/multiprocess+IPC vs. subinterpreters. However, it's going to be hard to draw any conclusions until the work is complete. :) > I hope so -- a really useful subinterpreter multi-core store would be > awesome. Agreed! Thanks for the encouragement. :) From ericsnowcurrently at gmail.com Thu Sep 7 23:13:22 2017 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Thu, 7 Sep 2017 20:13:22 -0700 Subject: [Python-ideas] PEP 554: Stdlib Module to Support Multiple Interpreters in Python Code In-Reply-To: References: Message-ID: On Thu, Sep 7, 2017 at 12:44 PM, Paul Moore wrote: > On 7 September 2017 at 20:14, Eric Snow wrote: >> I didn't include such a queue in this proposal because I wanted to >> keep it as focused as possible. I'll add a note to the PEP about >> this. > > This all sounds very reasonable. Thanks for the clarification. Hmm. Now I'm starting to think some form of basic queue would be important enough to include in the PEP. I'll see if that feeling holds tomorrow. -eric From njs at pobox.com Fri Sep 8 00:08:48 2017 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 7 Sep 2017 21:08:48 -0700 Subject: [Python-ideas] PEP 554: Stdlib Module to Support Multiple Interpreters in Python Code In-Reply-To: References: Message-ID: On Thu, Sep 7, 2017 at 6:14 PM, Matthew Rocklin wrote: > Those numbers were for common use in Python tools and reflected my anecdotal > experience at the time with normal Python tools. I'm sure that there are > mechanisms to achieve faster speeds than what I experienced. That being > said, here is a small example. > > > In [1]: import multiprocessing > In [2]: data = b'0' * 100000000 # 100 MB > In [3]: from toolz import identity > In [4]: pool = multiprocessing.Pool() > In [5]: %time _ = pool.apply_async(identity, (data,)).get() > CPU times: user 76 ms, sys: 64 ms, total: 140 ms > Wall time: 252 ms > > This is about 400MB/s for a roundtrip Awesome, thanks for bringing numbers into my wooly-headed theorizing :-). On my laptop I actually get a worse result from your benchmark: 531 ms for 100 MB == ~200 MB/s round-trip, or 400 MB/s one-way. So yeah, transferring data between processes with multiprocessing is slow. This is odd, though, because on the same machine, using socat to send 1 GiB between processes using a unix domain socket runs at 2 GB/s: # terminal 1 ~$ rm -f /tmp/unix.sock && socat -u -b32768 UNIX-LISTEN:/tmp/unix.sock "SYSTEM:pv -W > /dev/null" 1.00GiB 0:00:00 [1.89GiB/s] [<=> ] # terminal 2 ~$ socat -u -b32768 "SYSTEM:dd if=/dev/zero bs=1M count=1024" UNIX:/tmp/unix.sock 1024+0 records in 1024+0 records out 1073741824 bytes (1.1 GB, 1.0 GiB) copied, 0.529814 s, 2.0 GB/s (Notice that the pv output is in GiB/s and the dd output is in GB/s. 1.89 GiB/s = 2.03 GB/s, so they actually agree.) 
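(If you'd rather not install socat, a quick-and-dirty pure-Python version of
the same measurement with a socket pair and a sender thread looks roughly like
the sketch below; sendall()/recv() release the GIL, so this is mostly measuring
kernel copies. Numbers will obviously vary by machine.)

    import socket, threading, time

    CHUNK = b"x" * (1 << 20)     # 1 MiB per send
    N_CHUNKS = 1024              # ~1 GiB total

    def sender(sock):
        for _ in range(N_CHUNKS):
            sock.sendall(CHUNK)
        sock.close()

    a, b = socket.socketpair()
    t = threading.Thread(target=sender, args=(a,))
    start = time.perf_counter()
    t.start()
    received = 0
    while True:
        data = b.recv(1 << 16)
        if not data:
            break
        received += len(data)
    t.join()
    b.close()
    elapsed = time.perf_counter() - start
    print("%.2f GB/s one-way" % (received / elapsed / 1e9))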
On my system, Python allocates + copies memory at 2.2 GB/s, so bulk byte-level IPC is within 10% of within-process bulk copying: # same 100 MB bytestring as above In [7]: bytearray_data = bytearray(data) In [8]: %timeit bytearray_data.copy() 45.3 ms ? 540 ?s per loop (mean ? std. dev. of 7 runs, 10 loops each) In [9]: 0.100 / 0.0453 # GB / seconds Out[9]: 2.207505518763797 I don't know why multiprocessing is so slow -- maybe there's a good reason, maybe not. But the reason isn't that IPC is intrinsically slow, and subinterpreters aren't going to automatically be 5x faster because they can use memcpy. -n -- Nathaniel J. Smith -- https://vorpus.org From njs at pobox.com Fri Sep 8 02:19:31 2017 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 7 Sep 2017 23:19:31 -0700 Subject: [Python-ideas] PEP 554: Stdlib Module to Support Multiple Interpreters in Python Code In-Reply-To: References: Message-ID: On Thu, Sep 7, 2017 at 8:11 PM, Eric Snow wrote: > First of all, thanks for the feedback and encouragement! Responses > in-line below. I hope it's helpful! More responses in-line as well. > On Thu, Sep 7, 2017 at 3:48 PM, Nathaniel Smith wrote: >> My concern about this is the same as it was last time -- the work >> looks neat, but right now, almost no-one uses subinterpreters >> (basically it's Jep and mod_wsgi and that's it?), and therefore many >> packages get away with ignoring subinterpreters. > > My concern is that this is a chicken-and-egg problem. The situation > won't improve until subinterpreters are more readily available. Okay, but you're assuming that "more libraries work well with subinterpreters" is in fact an improvement. I'm asking you to convince me of that :-). Are there people saying "oh, if only subinterpreters had a Python API and less weird interactions with C extensions, I could do "? So far they haven't exactly taken the world by storm... >> Numpy is the one I'm >> most familiar with: when we get subinterpreter bugs we close them >> wontfix, because supporting subinterpreters properly would require >> non-trivial auditing, add overhead for non-subinterpreter use cases, >> and benefit a tiny tiny fraction of our users. > > The main problem of which I'm aware is C globals in libraries and > extension modules. PEPs 489 and 3121 are meant to help but I know > that there is at least one major situation which is still a blocker > for multi-interpreter-safe module state. Other than C globals, is > there some other issue? That's the main one I'm aware of, yeah, though I haven't looked into it closely. >> If we add a friendly python-level API like this, then we're committing >> to this being a part of Python for the long term and encouraging >> people to use it, which puts pressure on downstream packages to do >> that work... but it's still not clear whether any benefits will >> actually materialize. > > I'm fine with Nick's idea about making this a "provisional" module. > Would that be enough to ease your concern here? Potentially, yeah -- basically I'm fine with anything that doesn't end up looking like python-dev telling everyone "subinterpreters are the future! go forth and yell at any devs who don't support them!". What do you think the criteria for graduating to non-provisional status should be, in this case? 
[snip] >> So the only case I can see where I'd expect subinterpreters to make >> communication dramatically more efficient is if you have a "deeply >> immutable" type >> [snip] >> However, it seems impossible to support user-defined deeply-immutable >> types in Python: >> [snip] > > I agree that it is currently not an option. That is part of the > exercise. There are a number of possible solutions to explore once we > get to that point. However, this PEP isn't about that. I'm confident > enough about the possibilities that I'm comfortable with moving > forward here. I guess I would be much more confident in the possibilities here if you could give: - some hand-wavy sketch for how subinterpreter A could call a function that as originally defined in subinterpreter B without the GIL, which seems like a precondition for sharing user-defined classes - some hand-wavy sketch for how refcounting will work for objects shared between multiple subinterpreters without the GIL, without majorly impacting single-thread performance (I actually forgot about this problem in my last email, because PyPy has already solved this part!) These are the two problems where I find it most difficult to have faith. [snip] >> I hope so -- a really useful subinterpreter multi-core stor[y] would be >> awesome. > > Agreed! Thanks for the encouragement. :) Thanks for attempting such an ambitious project :-). -n -- Nathaniel J. Smith -- https://vorpus.org From storchaka at gmail.com Fri Sep 8 02:57:44 2017 From: storchaka at gmail.com (Serhiy Storchaka) Date: Fri, 8 Sep 2017 09:57:44 +0300 Subject: [Python-ideas] Hexadecimal floating literals Message-ID: The support of hexadecimal floating literals (like 0xC.68p+2) is included in just released C++17 standard. Seems this becomes a mainstream. In Python float.hex() returns hexadecimal string representation. Is it a time to add more support of hexadecimal floating literals? Accept them in float constructor and in Python parser? And maybe add support of hexadecimal formatting ('%x' and '{:x}')? From thibault.hilaire at lip6.fr Fri Sep 8 05:47:17 2017 From: thibault.hilaire at lip6.fr (Thibault Hilaire) Date: Fri, 8 Sep 2017 11:47:17 +0200 Subject: [Python-ideas] Hexadecimal floating literals In-Reply-To: References: Message-ID: <82463EDF-1003-49E3-827C-FFD603741793@lip6.fr> Dear all This is my very first email to python-ideas, and I strongly support this idea. float.hex() does the job for float to hexadecimal conversion, and float.fromhex() does the opposite. But a full support for hexadecimal floating-point literals would be great (it bypasses the decimal to floating-point conversion), as explained for general purpose here : http://www.exploringbinary.com/hexadecimal-floating-point-constants/ The support for hexadecimal formatting was introduced in C99 with the '%a' formatter for string formatting (see http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf page 57-58 for literals, or http://en.cppreference.com/w/cpp/language/floating_literal), and, it would be great if python could support it. Thanks Thibault > The support of hexadecimal floating literals (like 0xC.68p+2) is included in just released C++17 standard. Seems this becomes a mainstream. > > In Python float.hex() returns hexadecimal string representation. Is it a time to add more support of hexadecimal floating literals? Accept them in float constructor and in Python parser? And maybe add support of hexadecimal formatting ('%x' and '{:x}')? 
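To make the motivation concrete, the exactness argument is already visible with
the existing methods (a quick interactive sketch; a literal would let us write
such constants directly in source instead of going through a string):

>>> (0.1).hex()                 # the decimal literal is not exact
'0x1.999999999999ap-4'
>>> float.fromhex('0x1.999999999999ap-4') == 0.1
True
>>> float.fromhex('0xC.68p+2')  # the C++17 example above
49.625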
> > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ ___________________________________________________ Dr Thibault HILAIRE Universit? Pierre et Marie Curie (Associate Professor) Computing Science Lab (LIP6) Engineering school Polytech Paris UPMC 4 place Jussieu 75005 PARIS, France tel: +33 (0)1.44.27.87.73 email: thibault.hilaire at lip6.fr web: http://www.docmatic.fr -------------- next part -------------- An HTML attachment was scrubbed... URL: From guettliml at thomas-guettler.de Fri Sep 8 08:47:49 2017 From: guettliml at thomas-guettler.de (=?UTF-8?Q?Thomas_G=c3=bcttler?=) Date: Fri, 8 Sep 2017 14:47:49 +0200 Subject: [Python-ideas] Adding new lines to "Zen of Python" Message-ID: <4adb6e68-80f8-b57f-8428-41ace013d35f@thomas-guettler.de> I curious if there are any plans to update the "Zen of Python". What could be added to the "Zen of Python"? What do you think? Regards, Thomas G?ttler -- Thomas Guettler http://www.thomas-guettler.de/ I am looking for feedback: https://github.com/guettli/programming-guidelines From desmoulinmichel at gmail.com Fri Sep 8 09:12:38 2017 From: desmoulinmichel at gmail.com (Michel Desmoulin) Date: Fri, 8 Sep 2017 15:12:38 +0200 Subject: [Python-ideas] Adding new lines to "Zen of Python" In-Reply-To: <4adb6e68-80f8-b57f-8428-41ace013d35f@thomas-guettler.de> References: <4adb6e68-80f8-b57f-8428-41ace013d35f@thomas-guettler.de> Message-ID: <21912e51-3550-e992-b6a9-7d6700c92b99@gmail.com> Zen would suppose to remove things from it, not add. Le 08/09/2017 ? 14:47, Thomas G?ttler a ?crit : > I curious if there are any plans to update the "Zen of Python". > > What could be added to the "Zen of Python"? > > What do you think? > > Regards, > Thomas G?ttler > > > > From mal at egenix.com Fri Sep 8 09:20:06 2017 From: mal at egenix.com (M.-A. Lemburg) Date: Fri, 8 Sep 2017 15:20:06 +0200 Subject: [Python-ideas] Adding new lines to "Zen of Python" In-Reply-To: <4adb6e68-80f8-b57f-8428-41ace013d35f@thomas-guettler.de> References: <4adb6e68-80f8-b57f-8428-41ace013d35f@thomas-guettler.de> Message-ID: <990a2955-3d62-0f54-437b-21faa5b3e67b@egenix.com> On 08.09.2017 14:47, Thomas G?ttler wrote: > I curious if there are any plans to update the "Zen of Python". > > What could be added to the "Zen of Python"? > > What do you think? Only the Zen Master can decide on this one and it appears there's only room for one more aphorism, but I guess that's "Less is better than more." and it was left out for obvious reasons ;-) -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Experts (#1, Sep 08 2017) >>> Python Projects, Coaching and Consulting ... http://www.egenix.com/ >>> Python Database Interfaces ... http://products.egenix.com/ >>> Plone/Zope Database Interfaces ... http://zope.egenix.com/ ________________________________________________________________________ ::: We implement business ideas - efficiently in both time and costs ::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. 
Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ http://www.malemburg.com/ From fedoseev.sergey at gmail.com Fri Sep 8 09:34:41 2017 From: fedoseev.sergey at gmail.com (Sergey Fedoseev) Date: Fri, 8 Sep 2017 18:34:41 +0500 Subject: [Python-ideas] factory for efficient creation of many dicts with the same keys Message-ID: Hi all, Sometimes you may need to create many dicts with the same keys, but different values. For example, if you want to return data from DB as dicts. I think that special type could be added to solve this task more effectively. I created proof of concept for this and here's benchmarks: # currently the fastest way to do it AFAIK $ ./python -m timeit -s "nkeys = 5; nrows = 1000; rows = [(i,)*nkeys for i in range(nrows)]; enumerated = list(enumerate(range(nkeys)))" "for row in rows: {key: row[i] for i, key in enumerated}" 500 loops, best of 5: 645 usec per loop $ ./python -m timeit -s "nkeys = 5; nrows = 1000; rows = [(i,)*nkeys for i in range(nrows)]; factory = dict.factory(*range(nkeys)); from itertools import starmap" "for d in starmap(factory, rows): d" 5000 loops, best of 5: 81.1 usec per loop I'd like to write a patch if this idea will be accepted. From srkunze at mail.de Fri Sep 8 09:35:56 2017 From: srkunze at mail.de (Sven R. Kunze) Date: Fri, 8 Sep 2017 15:35:56 +0200 Subject: [Python-ideas] Adding new lines to "Zen of Python" In-Reply-To: <990a2955-3d62-0f54-437b-21faa5b3e67b@egenix.com> References: <4adb6e68-80f8-b57f-8428-41ace013d35f@thomas-guettler.de> <990a2955-3d62-0f54-437b-21faa5b3e67b@egenix.com> Message-ID: On 08.09.2017 15:20, M.-A. Lemburg wrote: > On 08.09.2017 14:47, Thomas G?ttler wrote: >> I curious if there are any plans to update the "Zen of Python". >> >> What could be added to the "Zen of Python"? >> >> What do you think? > Only the Zen Master can decide on this one and it appears there's > only room for one more aphorism, but I guess that's "Less is better > than more." and it was left out for obvious reasons ;-) > My favorite. ;) Sven -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Fri Sep 8 10:56:14 2017 From: guido at python.org (Guido van Rossum) Date: Fri, 8 Sep 2017 07:56:14 -0700 Subject: [Python-ideas] factory for efficient creation of many dicts with the same keys In-Reply-To: References: Message-ID: I think you've got it backwards -- if you send the patch the idea *may* be accepted. You ought to at least show us the docs for your proposed factory, it's a little murky from your example. On Fri, Sep 8, 2017 at 6:34 AM, Sergey Fedoseev wrote: > Hi all, > > Sometimes you may need to create many dicts with the same keys, but > different > values. For example, if you want to return data from DB as dicts. > > I think that special type could be added to solve this task more > effectively. 
> I created proof of concept for this and here's benchmarks: > > # currently the fastest way to do it AFAIK > $ ./python -m timeit -s "nkeys = 5; nrows = 1000; rows = [(i,)*nkeys > for i in range(nrows)]; enumerated = list(enumerate(range(nkeys)))" > "for row in rows: {key: row[i] for i, key in enumerated}" > 500 loops, best of 5: 645 usec per loop > > $ ./python -m timeit -s "nkeys = 5; nrows = 1000; rows = [(i,)*nkeys > for i in range(nrows)]; factory = dict.factory(*range(nkeys)); from > itertools import starmap" "for d in starmap(factory, rows): d" > 5000 loops, best of 5: 81.1 usec per loop > > I'd like to write a patch if this idea will be accepted. > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From elazarg at gmail.com Fri Sep 8 11:05:28 2017 From: elazarg at gmail.com (=?UTF-8?B?15DXnNei15bXqA==?=) Date: Fri, 08 Sep 2017 15:05:28 +0000 Subject: [Python-ideas] Add a more disciplined ways of registering ABCs Message-ID: Hi all, tl;dr: I propose adding a `register()` decorator, to be used like this: @abc.register(Abc1, Abc2) class D: ... For preexisting classes I propose adding a magic static variable `__registered__`, to be handled by ABCMeta: class Abc1(metaclass=ABCMeta): __registered__ = [D] Both forms has the effects of calling ABCMeta.register with the corresponding parameters. Explanation: It is possible to register an abstract base class B for preexisting classes A using B.register(A). This form is very flexible - probably too flexible, since the operation changes the type of A, and doing it dynamically makes it difficult for type checker (and human beings) to analyze. But what about new classes? Registering a new class is possible using B.register as a decorator, since it already its argument: @Iterable.register @Sequence.register class A: pass However, due to restrictions on the expressions allowed inside a decorator, this would not work with generic ABCs, which are common and heavily used in type-checked projects: >>> @Iterable[str].register File "", line 1 @Iterable[str].register ^ SyntaxError: invalid syntax While it might be possible to infer the argument, it is not always the case, and adds additional unnecessary burden to type-checker and to the reader; discouraging its use. This kind of restrictions may also prevent some other conceivable forms, such as `@type(X).register`. Additionally, `abc.register()` is also easier to analyze as a "syntactic construct" from the point of view the type checker, similar to the way checkers handle some other constructs such as `enum()`, `cast()` and others. Finally, there's uninformative repetition of `register` for multiple ABCs, and waste of vertical screen space. Q: Why not subclass B? A: Since it forces the metaclass of A to be (a subclass of) ABCMeta, which can cause inconsistency in metaclass structure for derived classes. This issue can surface when using Enum (whose metaclass is EnumMeta): >>> class A(Iterable, Enum): pass ... Traceback (most recent call last): File "", line 1, in TypeError: metaclass conflict: the metaclass of a derived class must be a (non-strict) subclass of the metaclasses of all its bases This causes problems in typeshed, since it uses inheritance to register `str` as a `Sequence[str]`, for example. 
As a result, it affects users of mypy trying to define `class A(str, Enum): ...` Q: Why not add to the typing module? A: Because registering an abstract base class has runtime meaning, such as the result of `isinstance` checks. Q: Why not add to the standard library? why not some 3rd party module? A: 1. Because dynamic usage should be discouraged, similarly to monkey-patching. Note that, like monkey-patching, this has runtime implications which cause "actions from distance". 2. Because it should be used by typeshed and mypy. The implementation of register() is straightforward: def register(*abcs): def inner(cls): for b in abcs: b.register(cls) return cls return inner Thanks, Elazar -------------- next part -------------- An HTML attachment was scrubbed... URL: From joshua.morton13 at gmail.com Fri Sep 8 11:27:26 2017 From: joshua.morton13 at gmail.com (Joshua Morton) Date: Fri, 08 Sep 2017 15:27:26 +0000 Subject: [Python-ideas] lazy import via __future__ or compiler analysis In-Reply-To: References: <20170907174402.ocdy3zumzdfj7xg6@python.ca> Message-ID: Replying here, although this was written in response to the other thread: Hey Neil, In general this won't work. It's not generally possible to know if a given statement has side effects or not. As an example, one normally wouldn't expect function or class definition to have side effects, but if a function is decorated, the decorators are evaluated at function "compilation"/import time, and may have side effects. As another example, one can put arbitrary expressions in a function annotation, and those are evaluated at import time. As a result of this, you can't even know if an import is safe, because that module may have side effects. That is, the module foo.py: import bar isn't known to be lazy, because bar may import and start the logging module, as an example. While in general you might think that global state modifying decorators or annotations are a bad idea, they are used (Flask). As a result, it's not possible to implement this without breaking behavior for certain users. While I'm not a core dev and thus can't say with certainty, I will say that that makes it very unlikely for this change (or others like it) to be implemented. It might be (from a language standpoint) to implement a `lazy` keyword (ie. `lazy import foo; lazy def bar(): ...`), but if I recall, there's been discussion of that and it's never gotten very far. --Josh On Thu, Sep 7, 2017 at 1:46 PM Neil Schemenauer wrote: > Barry Warsaw wrote: > > There are a few other things that might end up marking a module as > > "industrious" (my thesaurus's antonym for "lazy"). > > Good points. The analysis can be simple at first and then we can > enhance it to be smarter about what is okay and still lazy load. We > may evolve it over time too, making things that are not strictly > safe still not trigger the "industrious" load lazy anyhow. > > Another idea is to introduce __lazy__ or some such in the global > namespace of the module, if present, e.g. > > __lazy__ = True > > then the analysis doesn't do anything except return True. The > module has explicitly stated that side-effects in the top-level code > are okay to be done in a lazy fashion. > > Perhaps with a little bit of smarts in the analsis and a little > sprinkling of __lazy__ flags, we can get a big chunk of modules to > lazy load. 
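To make that concrete, here is a toy example (hypothetical names) of a module
that looks side-effect free but isn't, precisely because of a registering
decorator:

    # registry.py
    HANDLERS = {}

    def register(name):
        def decorator(func):
            HANDLERS[name] = func      # mutates global state at import time
            return func
        return decorator

    # plugin.py
    from registry import register

    @register("ping")                  # runs as soon as plugin is imported
    def ping():
        return "pong"

If plugin were imported lazily, HANDLERS would stay empty until something
happened to touch an attribute of plugin - which may be never.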
> > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Fri Sep 8 12:02:40 2017 From: rosuav at gmail.com (Chris Angelico) Date: Sat, 9 Sep 2017 02:02:40 +1000 Subject: [Python-ideas] lazy import via __future__ or compiler analysis In-Reply-To: References: <20170907174402.ocdy3zumzdfj7xg6@python.ca> Message-ID: On Sat, Sep 9, 2017 at 1:27 AM, Joshua Morton wrote: > As a result of this, you can't even know if an import is safe, because that > module may have side effects. That is, the module foo.py: > > import bar > > isn't known to be lazy, because bar may import and start the logging module, > as an example. Laziness has to be complete - or, looking the other way, eager importing is infectious. For foo to be lazy, bar also has to be lazy; if you don't know for absolute certain that bar is lazy-loadable, then you assume it isn't, and foo becomes eagerly loaded. ChrisA From nas-python-ideas at arctrix.com Fri Sep 8 12:19:01 2017 From: nas-python-ideas at arctrix.com (Neil Schemenauer) Date: Fri, 8 Sep 2017 10:19:01 -0600 Subject: [Python-ideas] lazy import via __future__ or compiler analysis In-Reply-To: References: <20170907174402.ocdy3zumzdfj7xg6@python.ca> Message-ID: <20170908161901.phjqunc4o7q52exr@python.ca> On 2017-09-08, Joshua Morton wrote: > In general this won't work. It's not generally possible to know if a given > statement has side effects or not. That's true but with the AST static analysis, we find anything that has potential side effects. The question if any useful subset of real modules pass these checks. If we flag everything as no lazy import safe then we don't gain anything. > As an example, one normally wouldn't expect function or class > definition to have side effects, but if a function is decorated, > the decorators are evaluated at function "compilation"/import > time, and may have side effects. Decorators are handled in my latest prototype (module is not lazy). > As another example, one can put arbitrary expressions in a > function annotation, and those are evaluated at import time. Not handled yet but no reason they can't be. > As a result of this, you can't even know if an import is safe, > because that module may have side effects. That is, the module > foo.py: > > import bar > > isn't known to be lazy, because bar may import and start the logging > module, as an example. That is handled as well. We only need to know if the current module is lazy safe or not. Imports of submodules that have side-effects will have those side effects happen like they do now. The major challenge I see right now is 'from .. import' and class bases (i.e. metaclass behavior). If we do the safe thing then all from-imports make the module unsafe for lazy loading and any class definition that has a base class is also unsafe. I think the idea is not yet totally dead though. We could have a command-line option to enable it. Modules that depend on side-effects of from-import and from base classes could let the compiler know about that somehow (make it explicit). That would also a good fraction of modules to be lazy import safe. 
Regards, Neil From fedoseev.sergey at gmail.com Fri Sep 8 12:24:20 2017 From: fedoseev.sergey at gmail.com (Sergey Fedoseev) Date: Fri, 8 Sep 2017 21:24:20 +0500 Subject: [Python-ideas] factory for efficient creation of many dicts with the same keys In-Reply-To: References: Message-ID: Here's docs: .. staticmethod:: factory(*keys) Return a callable object that creates a dictionary from *keys* and its operands. For example: * ``dict.factory('1', 2, (3,))({1}, [2], {3: None})`` returns ``{'1': {1}, 2: [2], (3,): {3: None}}``. * ``dict.factory((3,), '1', 2)({1}, [2], {3: None})`` returns ``{(3,): {1}, '1': [2], 2: {3: None}}``. Equivalent to:: def factory(*keys): def f(*values): return dict(zip(keys, values)) return f Hope it makes my idea clearer. Link to patch (I guess it's too big to paste it here): https://github.com/sir-sigurd/cpython/commit/a0fe1a80f6e192368180a32e849771c420aa0adc 2017-09-08 19:56 GMT+05:00 Guido van Rossum : > I think you've got it backwards -- if you send the patch the idea *may* be > accepted. You ought to at least show us the docs for your proposed factory, > it's a little murky from your example. From nas-python-ideas at arctrix.com Fri Sep 8 12:36:04 2017 From: nas-python-ideas at arctrix.com (Neil Schemenauer) Date: Fri, 8 Sep 2017 10:36:04 -0600 Subject: [Python-ideas] lazy import via __future__ or compiler analysis In-Reply-To: References: <20170907174402.ocdy3zumzdfj7xg6@python.ca> Message-ID: <20170908163604.xxp4hd7idi3eboum@python.ca> On 2017-09-09, Chris Angelico wrote: > Laziness has to be complete - or, looking the other way, eager > importing is infectious. For foo to be lazy, bar also has to be lazy; Not with the approach I'm proposing. bar will be loaded in non-lazy fashion at the right time, foo can still be lazy. From rosuav at gmail.com Fri Sep 8 12:43:38 2017 From: rosuav at gmail.com (Chris Angelico) Date: Sat, 9 Sep 2017 02:43:38 +1000 Subject: [Python-ideas] lazy import via __future__ or compiler analysis In-Reply-To: <20170908163604.xxp4hd7idi3eboum@python.ca> References: <20170907174402.ocdy3zumzdfj7xg6@python.ca> <20170908163604.xxp4hd7idi3eboum@python.ca> Message-ID: On Sat, Sep 9, 2017 at 2:36 AM, Neil Schemenauer wrote: > On 2017-09-09, Chris Angelico wrote: >> Laziness has to be complete - or, looking the other way, eager >> importing is infectious. For foo to be lazy, bar also has to be lazy; > > Not with the approach I'm proposing. bar will be loaded in non-lazy > fashion at the right time, foo can still be lazy. Ah, that's cool then! I suppose part of the confusion I had was in the true meaning of "lazy"; obviously you have to still load up the module to some extent. I'm not entirely sure how much you defer and how much you do immediately, and it looks like you have more in the 'defer' category than I thought. ChrisA From songofacandy at gmail.com Fri Sep 8 14:16:42 2017 From: songofacandy at gmail.com (INADA Naoki) Date: Sat, 9 Sep 2017 03:16:42 +0900 Subject: [Python-ideas] factory for efficient creation of many dicts with the same keys In-Reply-To: References: Message-ID: Thanks for your suggestion. FYI, you can use "key-sharing dict" (PEP 412: https://www.python.org/dev/peps/pep-0412/) when all keys are string. It saves not only creation time, but also memory usage. I think it's nice for CSV parser and, as you said, DB record. One question is, how is it useful? When working on large dataset, I think list or tuple (or namedtuple) are recommended for records. If it's useful enough, it's worth enough to added in dict. 
It can't be implemented as 3rd party because relying on many private in dict. Regards, INADA Naoki On Sat, Sep 9, 2017 at 1:24 AM, Sergey Fedoseev wrote: > Here's docs: > > .. staticmethod:: factory(*keys) > > Return a callable object that creates a dictionary from *keys* and its > operands. For example: > > * ``dict.factory('1', 2, (3,))({1}, [2], {3: None})`` returns > ``{'1': {1}, 2: [2], (3,): {3: None}}``. > > * ``dict.factory((3,), '1', 2)({1}, [2], {3: None})`` returns > ``{(3,): {1}, '1': [2], 2: {3: None}}``. > > Equivalent to:: > > def factory(*keys): > def f(*values): > return dict(zip(keys, values)) > return f > > Hope it makes my idea clearer. > > Link to patch (I guess it's too big to paste it here): > https://github.com/sir-sigurd/cpython/commit/a0fe1a80f6e192368180a32e849771c420aa0adc > > 2017-09-08 19:56 GMT+05:00 Guido van Rossum : >> I think you've got it backwards -- if you send the patch the idea *may* be >> accepted. You ought to at least show us the docs for your proposed factory, >> it's a little murky from your example. > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From victor.stinner at gmail.com Fri Sep 8 15:05:44 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Fri, 8 Sep 2017 12:05:44 -0700 Subject: [Python-ideas] Hexadecimal floating literals In-Reply-To: References: Message-ID: 2017-09-07 23:57 GMT-07:00 Serhiy Storchaka : > The support of hexadecimal floating literals (like 0xC.68p+2) is included in > just released C++17 standard. Seems this becomes a mainstream. Floating literal using base 2 (or base 2^n, like hexadecimal, 2^4) is the only way to get exact values in a portable way. So yeah, we need it. We already have float.hex() since Python 2.6. > In Python float.hex() returns hexadecimal string representation. Is it a > time to add more support of hexadecimal floating literals? Accept them in > float constructor and in Python parser? And maybe add support of hexadecimal > formatting ('%x' and '{:x}')? I dislike "%x" % float, since "%x" is a very old format from C printf and I expect it to only work for integers. For example, bytes.hex() exists (since Python 3.5) but b'%x' % b'hello' doesn't work. Since format() is a "new" way to format strings, and each type is free to implement its own formatters, I kind of like the idea of support float.hex() here. Do we need a short PEP, since it changes the Python grammar? It may be nice to describe the exact grammar for float literals. Victor From guido at python.org Fri Sep 8 15:23:16 2017 From: guido at python.org (Guido van Rossum) Date: Fri, 8 Sep 2017 12:23:16 -0700 Subject: [Python-ideas] Hexadecimal floating literals In-Reply-To: References: Message-ID: On Fri, Sep 8, 2017 at 12:05 PM, Victor Stinner wrote: > 2017-09-07 23:57 GMT-07:00 Serhiy Storchaka : > > The support of hexadecimal floating literals (like 0xC.68p+2) is > included in > > just released C++17 standard. Seems this becomes a mainstream. > > Floating literal using base 2 (or base 2^n, like hexadecimal, 2^4) is > the only way to get exact values in a portable way. So yeah, we need > it. We already have float.hex() since Python 2.6. > > > In Python float.hex() returns hexadecimal string representation. Is it a > > time to add more support of hexadecimal floating literals? Accept them in > > float constructor and in Python parser? 
And maybe add support of > hexadecimal > > formatting ('%x' and '{:x}')? > > I dislike "%x" % float, since "%x" is a very old format from C printf > and I expect it to only work for integers. For example, bytes.hex() > exists (since Python 3.5) but b'%x' % b'hello' doesn't work. > > Since format() is a "new" way to format strings, and each type is free > to implement its own formatters, I kind of like the idea of support > float.hex() here. > > Do we need a short PEP, since it changes the Python grammar? It may be > nice to describe the exact grammar for float literals. > Yes, this needs a PEP. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From wes.turner at gmail.com Fri Sep 8 16:57:13 2017 From: wes.turner at gmail.com (Wes Turner) Date: Fri, 8 Sep 2017 15:57:13 -0500 Subject: [Python-ideas] Adding new lines to "Zen of Python" In-Reply-To: References: <4adb6e68-80f8-b57f-8428-41ace013d35f@thomas-guettler.de> <990a2955-3d62-0f54-437b-21faa5b3e67b@egenix.com> Message-ID: Maybe something over-pretentious like: If it's not automatically tested, it's broken. Or: Tests lie; and you need them. Or: Achieve reproducibility and eliminate quality variance with tests. On Friday, September 8, 2017, Sven R. Kunze wrote: > On 08.09.2017 15:20, M.-A. Lemburg wrote: > > On 08.09.2017 14:47, Thomas G?ttler wrote: > > I curious if there are any plans to update the "Zen of Python". > > What could be added to the "Zen of Python"? > > What do you think? > > Only the Zen Master can decide on this one and it appears there's > only room for one more aphorism, but I guess that's "Less is better > than more." and it was left out for obvious reasons ;-) > > > > My favorite. ;) > > > Sven > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjreedy at udel.edu Fri Sep 8 19:00:45 2017 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 8 Sep 2017 19:00:45 -0400 Subject: [Python-ideas] Adding new lines to "Zen of Python" In-Reply-To: <4adb6e68-80f8-b57f-8428-41ace013d35f@thomas-guettler.de> References: <4adb6e68-80f8-b57f-8428-41ace013d35f@thomas-guettler.de> Message-ID: On 9/8/2017 8:47 AM, Thomas G?ttler wrote: > I curious if there are any plans to update the "Zen of Python". > What could be added to the "Zen of Python"? > What do you think? "Zen of Python" is a published (in the stdlib), free-verse poem by Tim Peters. As a unitary work of art, it is not something to be 'updated' or augmented by anyone other than the author, who I believe has no plans to do so. (I am here mostly repeating what M.-A. Lemburg said, but in much plainer English.) A speculative thread on 'How I would edit The Zen of Python' could be interesting, but it belongs on python-list rather than here on python-ideas. -- Terry Jan Reedy From carl.input at gmail.com Fri Sep 8 20:17:22 2017 From: carl.input at gmail.com (Carl Smith) Date: Sat, 09 Sep 2017 00:17:22 +0000 Subject: [Python-ideas] Adding new lines to "Zen of Python" In-Reply-To: References: <4adb6e68-80f8-b57f-8428-41ace013d35f@thomas-guettler.de> Message-ID: Rather than 'update' the Zen of Python, it seems better to create something original, maybe derived from the Zen, and see if it becomes popular in its own right. It'd be fun to see extensions and alternatives, but there's no reason to change a classic work that helped to define the culture. Sorry to sound so preachy. It just seems wrong to change a religious text. 
-------------- next part -------------- An HTML attachment was scrubbed... URL: From ronaldoussoren at mac.com Sun Sep 10 06:18:09 2017 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Sun, 10 Sep 2017 12:18:09 +0200 Subject: [Python-ideas] PEP 554: Stdlib Module to Support Multiple Interpreters in Python Code In-Reply-To: References: Message-ID: <0A6FC422-5CF0-4E82-86B5-73693D5D1661@mac.com> > On 8 Sep 2017, at 05:11, Eric Snow wrote: > On Thu, Sep 7, 2017 at 3:48 PM, Nathaniel Smith wrote: > >> Numpy is the one I'm >> most familiar with: when we get subinterpreter bugs we close them >> wontfix, because supporting subinterpreters properly would require >> non-trivial auditing, add overhead for non-subinterpreter use cases, >> and benefit a tiny tiny fraction of our users. > > The main problem of which I'm aware is C globals in libraries and > extension modules. PEPs 489 and 3121 are meant to help but I know > that there is at least one major situation which is still a blocker > for multi-interpreter-safe module state. Other than C globals, is > there some other issue? There?s also the PyGilState_* API that doesn't support multiple interpreters. The issue there is that callbacks from external libraries back into python need to use the correct subinterpreter. Ronald From k7hoven at gmail.com Sun Sep 10 10:52:00 2017 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Sun, 10 Sep 2017 17:52:00 +0300 Subject: [Python-ideas] PEP 554: Stdlib Module to Support Multiple Interpreters in Python Code In-Reply-To: References: Message-ID: On Thu, Sep 7, 2017 at 9:26 PM, Eric Snow wrote: ?[...]? > get_main(): > > Return the main interpreter. > > ?I assume the concept of a main interpreter is inherited from the previous levels of support in the C API, but what exactly is the significance of being "the main interpreter"? Instead, could they just all be subinterpreters of the same Python process (or whatever the right wording would be)?? It might also be helpful if the PEP had a short description of what are considered subinterpreters and how they differ from threads of the same interpreter [*]. Currently, the PEP seems to rely heavily on knowledge of the previously available concepts. However, as this would be a new module, I don't think there's any need to blindly copy the previous design, regardless of how well the design may have served its purpose at the time. -- Koos [*] For instance regarding the role of the glo... local interpreter locks (LILs) ;) -- + Koos Zevenhoven + http://twitter.com/k7hoven + -------------- next part -------------- An HTML attachment was scrubbed... URL: From levkivskyi at gmail.com Sun Sep 10 14:48:47 2017 From: levkivskyi at gmail.com (Ivan Levkivskyi) Date: Sun, 10 Sep 2017 20:48:47 +0200 Subject: [Python-ideas] PEP 562 Message-ID: I have written a short PEP as a complement/alternative to PEP 549. I will be grateful for comments and suggestions. The PEP should appear online soon. -- Ivan *********************************************************** PEP: 562 Title: Module __getattr__ Author: Ivan Levkivskyi Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 09-Sep-2017 Python-Version: 3.7 Post-History: 09-Sep-2017 Abstract ======== It is proposed to support ``__getattr__`` function defined on modules to provide basic customization of module attribute access. Rationale ========= It is sometimes convenient to customize or otherwise have control over access to module attributes. A typical example is managing deprecation warnings. 
Typical workarounds are assigning ``__class__`` of a module object to a custom subclass of ``types.ModuleType`` or substituting ``sys.modules`` item with a custom wrapper instance. It would be convenient to simplify this procedure by recognizing ``__getattr__`` defined directly in a module that would act like a normal ``__getattr__`` method, except that it will be defined on module *instances*. For example:: # lib.py from warnings import warn deprecated_names = ["old_function", ...] def _deprecated_old_function(arg, other): ... def __getattr__(name): if name in deprecated_names: warn(f"{name} is deprecated", DeprecationWarning) return globals()[f"_deprecated_{name}"] raise AttributeError(f"module {__name__} has no attribute {name}") # main.py from lib import old_function # Works, but emits the warning There is a related proposal PEP 549 that proposes to support instance properties for a similar functionality. The difference is this PEP proposes a faster and simpler mechanism, but provides more basic customization. An additional motivation for this proposal is that PEP 484 already defines the use of module ``__getattr__`` for this purpose in Python stub files, see [1]_. Specification ============= The ``__getattr__`` function at the module level should accept one argument which is a name of an attribute and return the computed value or raise an ``AttributeError``:: def __getattr__(name: str) -> Any: ... This function will be called only if ``name`` is not found in the module through the normal attribute lookup. The reference implementation for this PEP can be found in [2]_. Backwards compatibility and impact on performance ================================================= This PEP may break code that uses module level (global) name ``__getattr__``. The performance implications of this PEP are minimal, since ``__getattr__`` is called only for missing attributes. References ========== .. [1] PEP 484 section about ``__getattr__`` in stub files (https://www.python.org/dev/peps/pep-0484/#stub-files) .. [2] The reference implementation (https://github.com/ilevkivskyi/cpython/pull/3/files) Copyright ========= This document has been placed in the public domain. -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Sun Sep 10 15:14:00 2017 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 10 Sep 2017 21:14:00 +0200 Subject: [Python-ideas] PEP 554: Stdlib Module to Support Multiple Interpreters in Python Code References: Message-ID: <20170910211400.4a829c79@fsol> On Thu, 7 Sep 2017 21:08:48 -0700 Nathaniel Smith wrote: > > Awesome, thanks for bringing numbers into my wooly-headed theorizing :-). > > On my laptop I actually get a worse result from your benchmark: 531 ms > for 100 MB == ~200 MB/s round-trip, or 400 MB/s one-way. So yeah, > transferring data between processes with multiprocessing is slow. > > This is odd, though, because on the same machine, using socat to send > 1 GiB between processes using a unix domain socket runs at 2 GB/s: When using local communication, the raw IPC cost is often minor compared to whatever Python does with the data (parse it, dispatch tasks around, etc.) except when the data is really huge. Local communications on Linux can easily reach several GB/s (even using TCP to localhost). Here is a Python script with reduced overhead to measure it -- as opposed to e.g. 
a full-fledged event loop: https://gist.github.com/pitrou/d809618359915967ffc44b1ecfc2d2ad > I don't know why multiprocessing is so slow -- maybe there's a good > reason, maybe not. Be careful to measure actual bandwidth, not round-trip latency, however. > But the reason isn't that IPC is intrinsically > slow, and subinterpreters aren't going to automatically be 5x faster > because they can use memcpy. What could improve performance significantly would be to share objects without any form of marshalling; but it's not obvious it's possible in the subinterpreters model *if* it also tries to remove the GIL. You can see it readily with concurrent.futures, when comparing ThreadPoolExecutor and ProcessPoolExecutor: >>> import concurrent.futures as cf ...:tp = cf.ThreadPoolExecutor(4) ...:pp = cf.ProcessPoolExecutor(4) ...:x = b"x" * (100 * 1024**2) ...:def identity(x): return x ...: >>> y = list(tp.map(identity, [x] * 10)) # warm up >>> len(y) 10 >>> y = list(pp.map(identity, [x] * 10)) # warm up >>> len(y) 10 >>> %timeit y = list(tp.map(identity, [x] * 10)) 638 ?s ? 71.3 ?s per loop (mean ? std. dev. of 7 runs, 1000 loops each) >>> %timeit y = list(pp.map(identity, [x] * 10)) 1.99 s ? 13.7 ms per loop (mean ? std. dev. of 7 runs, 1 loop each) On this trivial case you're really gaining a lot using a thread pool... Regards Antoine. From c at anthonyrisinger.com Sun Sep 10 15:34:49 2017 From: c at anthonyrisinger.com (C Anthony Risinger) Date: Sun, 10 Sep 2017 14:34:49 -0500 Subject: [Python-ideas] PEP 562 In-Reply-To: References: Message-ID: I'd really love to find a way to enable lazy loading by default, maybe with a way to opt-out old/problem/legacy modules instead of opt-in via __future__ or anything else. IME easily 95%+ of modules in the wild, today, will not even notice (I wrote an application bundler in past that enabled it globally by default without much fuss for several years). The only small annoyance is when it does cause problems the error can jump around, depending on how the lazy import was triggered. module.__getattr__ works pretty well for normal access, after being imported by another module, but it doesn't properly trigger loading by functions defined in the module's own namespace. These functions are bound to module.__dict__ as their __globals__ so lazy loading of this variety is really dependant on a custom module.__dict__ that implements __getitem__ or __missing__. I think this approach is the ideal path over existing PEPs. I've done it in the past and it worked very well. The impl looked something like this: * Import statements and __import__ generate lazy-load marker objects instead of performing imports. * Marker objects record the desired import and what identifiers were supposed to be added to namespace. * module.__dict__.__setitem__ recognizes markers and records their identifiers as lazily imported somewhere, but **does not add them to namespace**. * module.___getattribute__ will request the lazy attribute via module.__dict__ like regular objects and functions will request via their bound __globals__. * Both will trigger module.__dict.__missing__, which looks to see if the requested identifier was previously marked as a lazy import, and if so, performs the import, saves to namespace properly, and returns the real import. -- C Anthony On Sep 10, 2017 1:49 PM, "Ivan Levkivskyi" wrote: I have written a short PEP as a complement/alternative to PEP 549. I will be grateful for comments and suggestions. The PEP should appear online soon. 
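To make that concrete, here is a minimal sketch of the __setitem__/__missing__
interplay described in the list above. Everything in it is invented for
illustration (a real implementation would also need to hook the import
statement and module attribute access so that lookups actually route through
this mapping, as the list notes):

    import importlib

    class _LazyMarker:
        # Placeholder a hypothetical lazy import statement would store:
        # remembers which module to import and, optionally, which
        # attribute of it the identifier should resolve to.
        def __init__(self, module_name, attr=None):
            self.module_name = module_name
            self.attr = attr

    class _LazyNamespace(dict):
        # Namespace that defers imports until first lookup.  Only lookups
        # going through this mapping's __getitem__ (and hence __missing__)
        # trigger the real import.
        def __init__(self, *args, **kwargs):
            super().__init__(*args, **kwargs)
            self._pending = {}  # identifier -> _LazyMarker

        def __setitem__(self, key, value):
            if isinstance(value, _LazyMarker):
                # Record the deferred import; keep the marker out of the
                # visible namespace.
                self._pending[key] = value
            else:
                super().__setitem__(key, value)

        def __missing__(self, key):
            # KeyError propagates if the name was never marked as lazy.
            marker = self._pending.pop(key)
            module = importlib.import_module(marker.module_name)
            value = getattr(module, marker.attr) if marker.attr else module
            super().__setitem__(key, value)  # cache the resolved object
            return value

    ns = _LazyNamespace()
    ns["parser"] = _LazyMarker("email.parser")  # nothing imported yet
    parser = ns["parser"]                       # first lookup performs the import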
-- Ivan *********************************************************** PEP: 562 Title: Module __getattr__ Author: Ivan Levkivskyi Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 09-Sep-2017 Python-Version: 3.7 Post-History: 09-Sep-2017 Abstract ======== It is proposed to support ``__getattr__`` function defined on modules to provide basic customization of module attribute access. Rationale ========= It is sometimes convenient to customize or otherwise have control over access to module attributes. A typical example is managing deprecation warnings. Typical workarounds are assigning ``__class__`` of a module object to a custom subclass of ``types.ModuleType`` or substituting ``sys.modules`` item with a custom wrapper instance. It would be convenient to simplify this procedure by recognizing ``__getattr__`` defined directly in a module that would act like a normal ``__getattr__`` method, except that it will be defined on module *instances*. For example:: # lib.py from warnings import warn deprecated_names = ["old_function", ...] def _deprecated_old_function(arg, other): ... def __getattr__(name): if name in deprecated_names: warn(f"{name} is deprecated", DeprecationWarning) return globals()[f"_deprecated_{name}"] raise AttributeError(f"module {__name__} has no attribute {name}") # main.py from lib import old_function # Works, but emits the warning There is a related proposal PEP 549 that proposes to support instance properties for a similar functionality. The difference is this PEP proposes a faster and simpler mechanism, but provides more basic customization. An additional motivation for this proposal is that PEP 484 already defines the use of module ``__getattr__`` for this purpose in Python stub files, see [1]_. Specification ============= The ``__getattr__`` function at the module level should accept one argument which is a name of an attribute and return the computed value or raise an ``AttributeError``:: def __getattr__(name: str) -> Any: ... This function will be called only if ``name`` is not found in the module through the normal attribute lookup. The reference implementation for this PEP can be found in [2]_. Backwards compatibility and impact on performance ================================================= This PEP may break code that uses module level (global) name ``__getattr__``. The performance implications of this PEP are minimal, since ``__getattr__`` is called only for missing attributes. References ========== .. [1] PEP 484 section about ``__getattr__`` in stub files (https://www.python.org/dev/peps/pep-0484/#stub-files) .. [2] The reference implementation (https://github.com/ilevkivskyi/cpython/pull/3/files) Copyright ========= This document has been placed in the public domain. _______________________________________________ Python-ideas mailing list Python-ideas at python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From levkivskyi at gmail.com Sun Sep 10 15:44:53 2017 From: levkivskyi at gmail.com (Ivan Levkivskyi) Date: Sun, 10 Sep 2017 21:44:53 +0200 Subject: [Python-ideas] PEP 560 Message-ID: I have written another short PEP that proposes some minor changes to core CPython interpreter for better support of generic types. I will be grateful for comments and suggestions: https://www.python.org/dev/peps/pep-0560/ -- Ivan -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From nas-python-ideas at arctrix.com Sun Sep 10 16:04:20 2017 From: nas-python-ideas at arctrix.com (Neil Schemenauer) Date: Sun, 10 Sep 2017 14:04:20 -0600 Subject: [Python-ideas] PEP 562 In-Reply-To: <20170910194512.dldxi2in7shwcbgw@python.ca> References: <20170910194512.dldxi2in7shwcbgw@python.ca> Message-ID: <20170910200420.ra3al6hluukudyph@python.ca> On 2017-09-10, Neil Schemenauer wrote: > I have something 90% working, only 90% left to go. ;-) Prototype: https://github.com/warsaw/lazyimport/blob/master/lazy_demo.py https://github.com/nascheme/cpython/tree/exec_mod Next step is to do the compiler and change importlib to do exec(code, module) rather than exec(code, module.__dict__). From njs at pobox.com Sun Sep 10 16:18:25 2017 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 10 Sep 2017 13:18:25 -0700 Subject: [Python-ideas] PEP 562 In-Reply-To: References: Message-ID: The main two use cases I know of for this and PEP 549 are lazy imports of submodules, and deprecating attributes. If we assume that you only want lazy imports to show up in dir() and don't want deprecated attributes to show up in dir() (and I'm not sure this is what you want 100% of the time, but it seems like the most reasonable default to me), then currently you need one of the PEPs for one of the cases and the other PEP for the other case. Would it make more sense to add direct support for lazy imports and attribute deprecation to ModuleType? This might look something like metamodule's FancyModule type: https://github.com/njsmith/metamodule/blob/ ee54d49100a9a06ffff341bb10a4d3549642139f/metamodule.py#L20 -n On Sep 10, 2017 11:49, "Ivan Levkivskyi" wrote: > I have written a short PEP as a complement/alternative to PEP 549. > I will be grateful for comments and suggestions. The PEP should > appear online soon. > > -- > Ivan > > *********************************************************** > > PEP: 562 > Title: Module __getattr__ > Author: Ivan Levkivskyi > Status: Draft > Type: Standards Track > Content-Type: text/x-rst > Created: 09-Sep-2017 > Python-Version: 3.7 > Post-History: 09-Sep-2017 > > > Abstract > ======== > > It is proposed to support ``__getattr__`` function defined on modules to > provide basic customization of module attribute access. > > > Rationale > ========= > > It is sometimes convenient to customize or otherwise have control over > access to module attributes. A typical example is managing deprecation > warnings. Typical workarounds are assigning ``__class__`` of a module > object > to a custom subclass of ``types.ModuleType`` or substituting > ``sys.modules`` > item with a custom wrapper instance. It would be convenient to simplify > this > procedure by recognizing ``__getattr__`` defined directly in a module that > would act like a normal ``__getattr__`` method, except that it will be > defined > on module *instances*. For example:: > > # lib.py > > from warnings import warn > > deprecated_names = ["old_function", ...] > > def _deprecated_old_function(arg, other): > ... > > def __getattr__(name): > if name in deprecated_names: > warn(f"{name} is deprecated", DeprecationWarning) > return globals()[f"_deprecated_{name}"] > raise AttributeError(f"module {__name__} has no attribute {name}") > > # main.py > > from lib import old_function # Works, but emits the warning > > There is a related proposal PEP 549 that proposes to support instance > properties for a similar functionality. The difference is this PEP proposes > a faster and simpler mechanism, but provides more basic customization. 
> An additional motivation for this proposal is that PEP 484 already defines > the use of module ``__getattr__`` for this purpose in Python stub files, > see [1]_. > > > Specification > ============= > > The ``__getattr__`` function at the module level should accept one argument > which is a name of an attribute and return the computed value or raise > an ``AttributeError``:: > > def __getattr__(name: str) -> Any: ... > > This function will be called only if ``name`` is not found in the module > through the normal attribute lookup. > > The reference implementation for this PEP can be found in [2]_. > > > Backwards compatibility and impact on performance > ================================================= > > This PEP may break code that uses module level (global) name > ``__getattr__``. > The performance implications of this PEP are minimal, since ``__getattr__`` > is called only for missing attributes. > > > References > ========== > > .. [1] PEP 484 section about ``__getattr__`` in stub files > (https://www.python.org/dev/peps/pep-0484/#stub-files) > > .. [2] The reference implementation > (https://github.com/ilevkivskyi/cpython/pull/3/files) > > > Copyright > ========= > > This document has been placed in the public domain. > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cody.piersall at gmail.com Sun Sep 10 18:01:15 2017 From: cody.piersall at gmail.com (Cody Piersall) Date: Sun, 10 Sep 2017 17:01:15 -0500 Subject: [Python-ideas] PEP 562 In-Reply-To: References: Message-ID: Sorry for top posting! I'm on a phone. I still think the better way to solve the custom dir() would be to change the module __dir__ method to check if __all__ is defined and use it to generate the result if it exists. This seems like a logical enhancement to me, and I'm planning on writing a patch to implement this. Whether it would be accepted is still an open issue though. Cody On Sep 10, 2017 3:19 PM, "Nathaniel Smith" wrote: The main two use cases I know of for this and PEP 549 are lazy imports of submodules, and deprecating attributes. If we assume that you only want lazy imports to show up in dir() and don't want deprecated attributes to show up in dir() (and I'm not sure this is what you want 100% of the time, but it seems like the most reasonable default to me), then currently you need one of the PEPs for one of the cases and the other PEP for the other case. Would it make more sense to add direct support for lazy imports and attribute deprecation to ModuleType? This might look something like metamodule's FancyModule type: https://github.com/njsmith/metamodule/blob/ee54d49100a9a06 ffff341bb10a4d3549642139f/metamodule.py#L20 -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at ethanhs.me Sun Sep 10 19:05:21 2017 From: ethan at ethanhs.me (Ethan Smith) Date: Sun, 10 Sep 2017 16:05:21 -0700 Subject: [Python-ideas] PEP 561: Distributing and Packaging Type Information Message-ID: Hello, I have just published my first PEP, on packaging type information. I would appreciate comments and suggestions. The PEP can be found at https://www.python.org/dev/peps/pep-0561/ I have also duplicated the text below. Thanks! 
---------------------------------------------------------- PEP: 561 Title: Distributing and Packaging Type Information Author: Ethan Smith Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 09-Sep-2017 Python-Version: 3.7 Post-History: Abstract ======== PEP 484 introduced type hints to Python, with goals of making typing gradual and easy to adopt. Currently, typing information must be distributed manually. This PEP provides a standardized means to package and distribute type information and an ordering for type checkers to resolve modules and collect this information for type checking using existing packaging architecture. Rationale ========= PEP 484 has a brief section on distributing typing information. In this section [1]_ the PEP recommends using ``shared/typehints/pythonX.Y/`` for shipping stub files. However, manually adding a path to stub files for each third party library does not scale. The simplest approach people have taken is to add ``site-packages`` to their ``PYTHONPATH``, but this causes type checkers to fail on packages that are highly dynamic (e.g. sqlalchemy and Django). Furthermore, package authors are wishing to distribute code that has inline type information, and there currently is no standard method to distribute packages with inline type annotations or syntax that can simultaneously be used at runtime and in type checking. Specification ============= There are several motivations and methods of supporting typing in a package. This PEP recognizes three (3) types of packages that may be created: 1. The package maintainer would like to add type information inline. 2. The package maintainer would like to add type information via stubs. 3. A third party would like to share stub files for a package, but the maintainer does not want to include them in the source of the package. This PEP aims to support these scenarios and make them simple to add to packaging and deployment. The two major parts of this specification are the packaging specifications and the resolution order for resolving module type information. This spec is meant to replace the ``shared/typehints/pythonX.Y/`` spec of PEP 484 [1]_. Packaging Type Information -------------------------- Packages must opt into supporting typing. This will be done though a distutils extension [2]_, providing a ``typed`` keyword argument to the distutils ``setup()`` command. The argument value will depend on the kind of type information the package provides. The distutils extension will be added to the ``typing`` package. Therefore a package maintainer may write :: setup( ... setup_requires=["typing"], typed="inline", ... ) Inline Typed Packages ''''''''''''''''''''' Packages that have inline type annotations simply have to pass the value ``"inline"`` to the ``typed`` argument in ``setup()``. Stub Only Packages '''''''''''''''''' For package maintainers wishing to ship stub files containing all of their type information, it is prefered that the ``*.pyi`` stubs are alongside the corresponding ``*.py`` files. However, the stubs may be put in a sub-folder of the Python sources, with the same name the ``*.py`` files are in. For example, the ``flyingcircus`` package would have its stubs in the folder ``flyingcircus/flyingcircus/``. This path is chosen so that if stubs are not found in ``flyingcircus/`` the type checker may treat the subdirectory as a normal package. The normal resolution order of checking ``*.pyi`` before ``*.py`` will be maintained. 
The value of the ``typed`` argument to ``setup()`` is ``"stubs"`` for this type of distribution. The author of the package is suggested to use ``package_data`` to assure the stub files are installed alongside the runtime Python code. Third Party Stub Packages ''''''''''''''''''''''''' Third parties seeking to distribute stub files are encouraged to contact the maintainer of the package about distribution alongside the package. If the maintainer does not wish to maintain or package stub files or type information inline, then a "third party stub package" should be created. The structure is similar, but slightly different from that of stub only packages. If the stubs are for the library ``flyingcircus`` then the package should be named ``flyingcircus-stubs`` and the stub files should be put in a sub-directory named ``flyingcircus``. This allows the stubs to be checked as if they were in a regular package. These packages should also pass ``"stubs"`` as the value of ``typed`` argument in ``setup()``. These packages are suggested to use ``package_data`` to package stub files. The version of the ``flyingcircus-stubs`` package should match the version of the ``flyingcircus`` package it is providing types for. Type Checker Module Resolution Order ------------------------------------ The following is the order that type checkers supporting this PEP should resolve modules containing type information: 1. User code - the files the type checker is running on. 2. Stubs or Python source in ``PYTHONPATH``. This is to allow the user complete control of which stubs to use, and patch broken stubs/inline types from packages. 3. Third party stub packages - these packages can supersede the installed untyped packages. They can be found at ``pkg-stubs`` for package ``pkg``, however it is encouraged to check their metadata to confirm that they opt into type checking. 4. Inline packages - finally, if there is nothing overriding the installed package, and it opts into type checking. 5. Typeshed (if used) - Provides the stdlib types and several third party libraries When resolving step (3) type checkers should assure the version of the stubs match the installed runtime package. Type checkers that check a different Python version than the version they run on must find the type information in the ``site-packages``/``dist-packages`` of that Python version. This can be queried e.g. ``pythonX.Y -c 'import sys; print(sys.exec_prefix)'``. It is also recommended that the type checker allow for the user to point to a particular Python binary, in case it is not in the path. To check if a package has opted into type checking, type checkers are recommended to use the ``pkg_resources`` module to query the package metadata. If the ``typed`` package metadata has ``None`` as its value, the package has not opted into type checking, and the type checker should skip that package. References ========== .. [1] PEP 484, Storing and Distributing Stub Files (https://www.python.org/dev/peps/pep-0484/#storing-and-distributing-stub-files) .. [2] Distutils Extensions, Adding setup() arguments (http://setuptools.readthedocs.io/en/latest/setuptools.html#adding-setup-arguments) Copyright ========= This document has been placed in the public domain. .. Local Variables: mode: indented-text indent-tabs-mode: nil sentence-end-double-space: t fill-column: 70 coding: utf-8 End: -------------- next part -------------- An HTML attachment was scrubbed... 
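As a rough illustration of that last metadata check: assuming the proposed
distutils extension records the value of the ``typed`` keyword in a metadata
file literally named ``typed`` inside the distribution's
``.egg-info``/``.dist-info`` directory (the exact storage is not fixed by this
PEP, so the file name is an assumption), a type checker could query it along
these lines:

    import pkg_resources

    def typed_metadata(package_name):
        # Returns "inline", "stubs", or None if the package has not
        # opted into type checking.
        try:
            dist = pkg_resources.get_distribution(package_name)
        except pkg_resources.DistributionNotFound:
            return None
        # Assumption: the proposed extension writes the keyword's value
        # to a metadata file named "typed".
        if dist.has_metadata("typed"):
            return dist.get_metadata("typed").strip() or None
        return None

    # typed_metadata("flyingcircus") -> "inline", "stubs", or None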
URL: From jelle.zijlstra at gmail.com Sun Sep 10 20:39:39 2017 From: jelle.zijlstra at gmail.com (Jelle Zijlstra) Date: Sun, 10 Sep 2017 17:39:39 -0700 Subject: [Python-ideas] PEP 561: Distributing and Packaging Type Information In-Reply-To: References: Message-ID: Congratulations on your first PEP! This is solving an important problem for typing in Python, so I'm glad we're tackling it. 2017-09-10 16:05 GMT-07:00 Ethan Smith : > Hello, > > I have just published my first PEP, on packaging type information. I would > appreciate comments and suggestions. The PEP can be found at > https://www.python.org/dev/peps/pep-0561/ > > I have also duplicated the text below. > > Thanks! > > ---------------------------------------------------------- > > > PEP: 561 > Title: Distributing and Packaging Type Information > Author: Ethan Smith > Status: Draft > Type: Standards Track > Content-Type: text/x-rst > Created: 09-Sep-2017 > Python-Version: 3.7 > Post-History: > > > Abstract > ======== > > PEP 484 introduced type hints to Python, with goals of making typing > gradual and easy to adopt. Currently, typing information must be distributed > manually. This PEP provides a standardized means to package and distribute > type information and an ordering for type checkers to resolve modules and > collect this information for type checking using existing packaging > architecture. > > > Rationale > ========= > > PEP 484 has a brief section on distributing typing information. In this > section [1]_ the PEP recommends using ``shared/typehints/pythonX.Y/`` for > shipping stub files. However, manually adding a path to stub files for each > third party library does not scale. The simplest approach people have taken > is to add ``site-packages`` to their ``PYTHONPATH``, but this causes type > checkers to fail on packages that are highly dynamic (e.g. sqlalchemy > and Django). > > Furthermore, package authors are wishing to distribute code that has > inline type information, and there currently is no standard method to > distribute packages with inline type annotations or syntax that can > simultaneously be used at runtime and in type checking. > > This feels like it should be the first paragraph: it describes the important problem we're solving, and the first paragraph is just details on why the previous solutions don't work. Perhaps you could talk more about how people are running into problems because of the absence of a way to distribute typed packages. For example, if you're working on a proprietary codebase, it's likely that you're relying on other internal packages, but there is no good way to do that right now (short of setting MYPYPATH). For open source package, you can add them to typeshed, but that adds overhead for the package maintainer and ties you to mypy's release cycle. > > > Specification > ============= > > There are several motivations and methods of supporting typing in a package. This PEP recognizes three (3) types of packages that may be created: > > 1. The package maintainer would like to add type information inline. > > 2. The package maintainer would like to add type information via stubs. > > 3. A third party would like to share stub files for a package, but the > maintainer does not want to include them in the source of the package. > > > Where does the typeshed repo fit in here? Does the PEP propose to deprecate using typeshed for third-party packages, or should typeshed continue to be the repository for major third-party packages? Either way, it should be discussed. 
> This PEP aims to support these scenarios and make them simple to add to > packaging and deployment. > > The two major parts of this specification are the packaging specifications > and the resolution order for resolving module type information. This spec > is meant to replace the ``shared/typehints/pythonX.Y/`` spec of PEP 484 [1]_. > > Packaging Type Information > -------------------------- > > Packages must opt into supporting typing. This will be done though a distutils > extension [2]_, providing a ``typed`` keyword argument to the distutils > ``setup()`` command. The argument value will depend on the kind of type > information the package provides. The distutils extension will be added to the > ``typing`` package. Therefore a package maintainer may write > > Is the addition to the `typing` package just a legacy feature for Python versions without typing in the standard library? This should be made explicit. > > :: > > setup( > ... > setup_requires=["typing"], > typed="inline", > ... > ) > > Inline Typed Packages > ''''''''''''''''''''' > > Packages that have inline type annotations simply have to pass the value > ``"inline"`` to the ``typed`` argument in ``setup()``. > > Stub Only Packages > '''''''''''''''''' > > For package maintainers wishing to ship stub files containing all of their > type information, it is prefered that the ``*.pyi`` stubs are alongside the > corresponding ``*.py`` files. However, the stubs may be put in a sub-folder > of the Python sources, with the same name the ``*.py`` files are in. For > example, the ``flyingcircus`` package would have its stubs in the folder > ``flyingcircus/flyingcircus/``. This path is chosen so that if stubs are > > What if `flyingcircus` already contains a subpackage called `flyingcircus`? This might be theoretical but there's probably a package out there that does this. > not found in ``flyingcircus/`` the type checker may treat the subdirectory as > a normal package. The normal resolution order of checking ``*.pyi`` before > ``*.py`` will be maintained. The value of the ``typed`` argument to > ``setup()`` is ``"stubs"`` for this type of distribution. The author of the > package is suggested to use ``package_data`` to assure the stub files are > installed alongside the runtime Python code. > > It would be helpful to have an example of what this looks like. This PEP will likely end up being the reference document for people looking to add typing support to their packages. > > Third Party Stub Packages > ''''''''''''''''''''''''' > > Third parties seeking to distribute stub files are encouraged to contact the > maintainer of the package about distribution alongside the package. If the > maintainer does not wish to maintain or package stub files or type information > inline, then a "third party stub package" should be created. The structure is > similar, but slightly different from that of stub only packages. If the stubs > are for the library ``flyingcircus`` then the package should be named > ``flyingcircus-stubs`` and the stub files should be put in a sub-directory > named ``flyingcircus``. This allows the stubs to be checked as if they were in > a regular package. These packages should also pass ``"stubs"`` as the value > of ``typed`` argument in ``setup()``. These packages are suggested to use > ``package_data`` to package stub files. > > The version of the ``flyingcircus-stubs`` package should match the version of > the ``flyingcircus`` package it is providing types for. 
> > What if I made stubs for flyingcircus 1.0 in my flyingcircus-stubs package, but then realized that I made a terrible mistake in the stubs? I can't upload a new version of flyingcircus-stubs 1.0 to PyPI, and I can't make flyingcircus-stubs have some other version than 1.0 because of this requirement. Another option is as follows: - Stub packages are versioned independently of the packages they provide stubs for. - There is a special global (say __version__) that can be used in stub packages and gets resolved to the version of the package that is being checked. The contents of flyingcircus-stubs might look like: import enum class Swallow(enum.Enum): african = 0 european = 1 if __version__ >= (2, 0): asian = 2 This option doesn't have the problem I described above, but it requires this magical __version__ variable. > > Type Checker Module Resolution Order > ------------------------------------ > > The following is the order that type checkers supporting this PEP should > resolve modules containing type information: > > 1. User code - the files the type checker is running on. > > 2. Stubs or Python source in ``PYTHONPATH``. This is to allow the user > complete control of which stubs to use, and patch broken stubs/inline > types from packages. > > Current type checkers don't use PYTHONPATH as far as I know, because they may not run under the same Python version or environment as the code to be type checked. I don't think we should require type checkers to listen to PYTHONPATH; perhaps we should just say that type checkers should provide a way for users to put code at the beginning of the search path. > > 3. Third party stub packages - these packages can supersede the installed > untyped packages. They can be found at ``pkg-stubs`` for package ``pkg``, > however it is encouraged to check their metadata to confirm that they opt > into type checking. > > The metadata of the stubs package? I'm not sure why that makes much sense here. > > 4. Inline packages - finally, if there is nothing overriding the installed > package, and it opts into type checking. > > 5. Typeshed (if used) - Provides the stdlib types and several third party libraries > > When resolving step (3) type checkers should assure the version of the stubs > match the installed runtime package. > > Type checkers that check a different Python version than the version they run > on must find the type information in the ``site-packages``/``dist-packages`` > of that Python version. This can be queried e.g. > ``pythonX.Y -c 'import sys; print(sys.exec_prefix)'``. It is also recommended > that the type checker allow for the user to point to a particular Python > binary, in case it is not in the path. > > To check if a package has opted into type checking, type checkers are > recommended to use the ``pkg_resources`` module to query the package > metadata. If the ``typed`` package metadata has ``None`` as its value, the > package has not opted into type checking, and the type checker should skip that > package. > > > References > ========== > > .. [1] PEP 484, Storing and Distributing Stub Files > (https://www.python.org/dev/peps/pep-0484/#storing-and-distributing-stub-files) > > .. [2] Distutils Extensions, Adding setup() arguments > (http://setuptools.readthedocs.io/en/latest/setuptools.html#adding-setup-arguments) > > Copyright > ========= > > This document has been placed in the public domain. > > > > .. 
> Local Variables: > mode: indented-text > indent-tabs-mode: nil > sentence-end-double-space: t > fill-column: 70 > coding: utf-8 > End: > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at ethanhs.me Sun Sep 10 21:10:52 2017 From: ethan at ethanhs.me (Ethan Smith) Date: Sun, 10 Sep 2017 18:10:52 -0700 Subject: [Python-ideas] PEP 561: Distributing and Packaging Type Information In-Reply-To: References: Message-ID: On Sun, Sep 10, 2017 at 5:39 PM, Jelle Zijlstra wrote: > Congratulations on your first PEP! This is solving an important problem > for typing in Python, so I'm glad we're tackling it. > Thanks! > > >> >> Furthermore, package authors are wishing to distribute code that has >> inline type information, and there currently is no standard method to >> distribute packages with inline type annotations or syntax that can >> simultaneously be used at runtime and in type checking. >> >> This feels like it should be the first paragraph: it describes the > important problem we're solving, and the first paragraph is just details on > why the previous solutions don't work. Perhaps you could talk more about > how people are running into problems because of the absence of a way to > distribute typed packages. For example, if you're working on a proprietary > codebase, it's likely that you're relying on other internal packages, but > there is no good way to do that right now (short of setting MYPYPATH). For > open source package, you can add them to typeshed, but that adds overhead > for the package maintainer and ties you to mypy's release cycle. > Inline types are one of the problems this PEP tries to resolve. But it also tries to solve the issue of distributing stubs. You are correct that typeshed should be mentioned as a current method. And I will re-word this the next change I make. > Specification >> ============= >> >> There are several motivations and methods of supporting typing in a package. This PEP recognizes three (3) types of packages that may be created: >> >> 1. The package maintainer would like to add type information inline. >> >> 2. The package maintainer would like to add type information via stubs. >> >> 3. A third party would like to share stub files for a package, but the >> maintainer does not want to include them in the source of the package. >> >> >> Where does the typeshed repo fit in here? Does the PEP propose to > deprecate using typeshed for third-party packages, or should typeshed > continue to be the repository for major third-party packages? Either way, > it should be discussed. > Yes, I agree I should mention typeshed. I believe the best approach would be to encourage new third-party packages to use this PEP's approach to stub packages, keep typeshed as is, and migrate the third party part of typeshed into packages if maintainers are found. > This PEP aims to support these scenarios and make them simple to add to >> packaging and deployment. >> >> The two major parts of this specification are the packaging specifications >> and the resolution order for resolving module type information. This spec >> is meant to replace the ``shared/typehints/pythonX.Y/`` spec of PEP 484 [1]_. >> >> Packaging Type Information >> -------------------------- >> >> Packages must opt into supporting typing. 
This will be done though a distutils >> extension [2]_, providing a ``typed`` keyword argument to the distutils >> ``setup()`` command. The argument value will depend on the kind of type >> information the package provides. The distutils extension will be added to the >> ``typing`` package. Therefore a package maintainer may write >> >> Is the addition to the `typing` package just a legacy feature for Python > versions without typing in the standard library? This should be made > explicit. > The intent here is that the typing package would be required for the extra setup keyword to work, otherwise it would fail. > :: >> >> setup( >> ... >> setup_requires=["typing"], >> typed="inline", >> ... >> ) >> >> Inline Typed Packages >> ''''''''''''''''''''' >> >> Packages that have inline type annotations simply have to pass the value >> ``"inline"`` to the ``typed`` argument in ``setup()``. >> >> Stub Only Packages >> '''''''''''''''''' >> >> For package maintainers wishing to ship stub files containing all of their >> type information, it is prefered that the ``*.pyi`` stubs are alongside the >> corresponding ``*.py`` files. However, the stubs may be put in a sub-folder >> of the Python sources, with the same name the ``*.py`` files are in. For >> example, the ``flyingcircus`` package would have its stubs in the folder >> ``flyingcircus/flyingcircus/``. This path is chosen so that if stubs are >> >> What if `flyingcircus` already contains a subpackage called > `flyingcircus`? This might be theoretical but there's probably a package > out there that does this. > I considered this. I considered it as worrying too much. The alternative would be to special case the name and have type checkers follow that. > not found in ``flyingcircus/`` the type checker may treat the subdirectory as >> a normal package. The normal resolution order of checking ``*.pyi`` before >> ``*.py`` will be maintained. The value of the ``typed`` argument to >> ``setup()`` is ``"stubs"`` for this type of distribution. The author of the >> package is suggested to use ``package_data`` to assure the stub files are >> installed alongside the runtime Python code. >> >> It would be helpful to have an example of what this looks like. This PEP > will likely end up being the reference document for people looking to add > typing support to their packages. > I plan on writing examples of the distutils plugin and a sample package tomorrow if I have time. > Third Party Stub Packages >> ''''''''''''''''''''''''' >> >> Third parties seeking to distribute stub files are encouraged to contact the >> maintainer of the package about distribution alongside the package. If the >> maintainer does not wish to maintain or package stub files or type information >> inline, then a "third party stub package" should be created. The structure is >> similar, but slightly different from that of stub only packages. If the stubs >> are for the library ``flyingcircus`` then the package should be named >> ``flyingcircus-stubs`` and the stub files should be put in a sub-directory >> named ``flyingcircus``. This allows the stubs to be checked as if they were in >> a regular package. These packages should also pass ``"stubs"`` as the value >> of ``typed`` argument in ``setup()``. These packages are suggested to use >> ``package_data`` to package stub files. >> >> The version of the ``flyingcircus-stubs`` package should match the version of >> the ``flyingcircus`` package it is providing types for. 
>> >> What if I made stubs for flyingcircus 1.0 in my flyingcircus-stubs > package, but then realized that I made a terrible mistake in the stubs? I > can't upload a new version of flyingcircus-stubs 1.0 to PyPI, and I can't > make flyingcircus-stubs have some other version than 1.0 because of this > requirement. > > Another option is as follows: > - Stub packages are versioned independently of the packages they provide > stubs for. > - There is a special global (say __version__) that can be used in stub > packages and gets resolved to the version of the package that is being > checked. > > The contents of flyingcircus-stubs might look like: > > import enum > > class Swallow(enum.Enum): > african = 0 > european = 1 > if __version__ >= (2, 0): > asian = 2 > > This option doesn't have the problem I described above, but it requires > this magical __version__ variable. > Guido has said he doesn't like this idea ( https://github.com/python/typing/issues/84#issuecomment-318256377), and I'm not convinced it is worth the complications it involves > Type Checker Module Resolution Order >> ------------------------------------ >> >> The following is the order that type checkers supporting this PEP should >> resolve modules containing type information: >> >> 1. User code - the files the type checker is running on. >> >> 2. Stubs or Python source in ``PYTHONPATH``. This is to allow the user >> complete control of which stubs to use, and patch broken stubs/inline >> types from packages. >> >> Current type checkers don't use PYTHONPATH as far as I know, because they > may not run under the same Python version or environment as the code to be > type checked. I don't think we should require type checkers to listen to > PYTHONPATH; perhaps we should just say that type checkers should provide a > way for users to put code at the beginning of the search path. > I agree. > 3. Third party stub packages - these packages can supersede the installed >> untyped packages. They can be found at ``pkg-stubs`` for package ``pkg``, >> however it is encouraged to check their metadata to confirm that they opt >> into type checking. >> >> The metadata of the stubs package? I'm not sure why that makes much sense > here. > Essentially, a package opts into being checked via a setup() keyword. That keyword is put in the packages metadata. > 4. Inline packages - finally, if there is nothing overriding the installed >> package, and it opts into type checking. >> >> 5. Typeshed (if used) - Provides the stdlib types and several third party libraries >> >> When resolving step (3) type checkers should assure the version of the stubs >> match the installed runtime package. >> >> Type checkers that check a different Python version than the version they run >> on must find the type information in the ``site-packages``/``dist-packages`` >> of that Python version. This can be queried e.g. >> ``pythonX.Y -c 'import sys; print(sys.exec_prefix)'``. It is also recommended >> that the type checker allow for the user to point to a particular Python >> binary, in case it is not in the path. >> >> To check if a package has opted into type checking, type checkers are >> recommended to use the ``pkg_resources`` module to query the package >> metadata. If the ``typed`` package metadata has ``None`` as its value, the >> package has not opted into type checking, and the type checker should skip that >> package. >> >> >> References >> ========== >> >> .. 
[1] PEP 484, Storing and Distributing Stub Files >> (https://www.python.org/dev/peps/pep-0484/#storing-and-distributing-stub-files) >> >> .. [2] Distutils Extensions, Adding setup() arguments >> (http://setuptools.readthedocs.io/en/latest/setuptools.html#adding-setup-arguments) >> >> Copyright >> ========= >> >> This document has been placed in the public domain. >> >> >> >> .. >> Local Variables: >> mode: indented-text >> indent-tabs-mode: nil >> sentence-end-double-space: t >> fill-column: 70 >> coding: utf-8 >> End: >> >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jelle.zijlstra at gmail.com Sun Sep 10 21:21:02 2017 From: jelle.zijlstra at gmail.com (Jelle Zijlstra) Date: Sun, 10 Sep 2017 18:21:02 -0700 Subject: [Python-ideas] PEP 561: Distributing and Packaging Type Information In-Reply-To: References: Message-ID: 2017-09-10 18:10 GMT-07:00 Ethan Smith : > > > On Sun, Sep 10, 2017 at 5:39 PM, Jelle Zijlstra > wrote: > >> Congratulations on your first PEP! This is solving an important problem >> for typing in Python, so I'm glad we're tackling it. >> > > Thanks! > >> >> >>> >>> Furthermore, package authors are wishing to distribute code that has >>> inline type information, and there currently is no standard method to >>> distribute packages with inline type annotations or syntax that can >>> simultaneously be used at runtime and in type checking. >>> >>> This feels like it should be the first paragraph: it describes the >> important problem we're solving, and the first paragraph is just details on >> why the previous solutions don't work. Perhaps you could talk more about >> how people are running into problems because of the absence of a way to >> distribute typed packages. For example, if you're working on a proprietary >> codebase, it's likely that you're relying on other internal packages, but >> there is no good way to do that right now (short of setting MYPYPATH). For >> open source package, you can add them to typeshed, but that adds overhead >> for the package maintainer and ties you to mypy's release cycle. >> > > Inline types are one of the problems this PEP tries to resolve. But it > also tries to solve the issue of distributing stubs. You are correct that > typeshed should be mentioned as a current method. And I will re-word this > the next change I make. > >> Specification >>> ============= >>> >>> There are several motivations and methods of supporting typing in a package. This PEP recognizes three (3) types of packages that may be created: >>> >>> 1. The package maintainer would like to add type information inline. >>> >>> 2. The package maintainer would like to add type information via stubs. >>> >>> 3. A third party would like to share stub files for a package, but the >>> maintainer does not want to include them in the source of the package. >>> >>> >>> Where does the typeshed repo fit in here? Does the PEP propose to >> deprecate using typeshed for third-party packages, or should typeshed >> continue to be the repository for major third-party packages? Either way, >> it should be discussed. >> > > Yes, I agree I should mention typeshed. 
I believe the best approach would > be to encourage new third-party packages to use this PEP's approach to stub > packages, keep typeshed as is, and migrate the third party part of typeshed > into packages if maintainers are found. > >> This PEP aims to support these scenarios and make them simple to add to >>> packaging and deployment. >>> >>> The two major parts of this specification are the packaging specifications >>> and the resolution order for resolving module type information. This spec >>> is meant to replace the ``shared/typehints/pythonX.Y/`` spec of PEP 484 [1]_. >>> >>> Packaging Type Information >>> -------------------------- >>> >>> Packages must opt into supporting typing. This will be done though a distutils >>> extension [2]_, providing a ``typed`` keyword argument to the distutils >>> ``setup()`` command. The argument value will depend on the kind of type >>> information the package provides. The distutils extension will be added to the >>> ``typing`` package. Therefore a package maintainer may write >>> >>> Is the addition to the `typing` package just a legacy feature for Python >> versions without typing in the standard library? This should be made >> explicit. >> > > The intent here is that the typing package would be required for the extra > setup keyword to work, otherwise it would fail. > Then I would have to install the `typing` PyPI package even if I am only using Python 3.7+? That seems suboptimal. Perhaps the new keyword can be part of Python core in 3.7 and added to `typing_extensions` for 3.5 and 3.6. > :: >>> >>> setup( >>> ... >>> setup_requires=["typing"], >>> typed="inline", >>> ... >>> ) >>> >>> Inline Typed Packages >>> ''''''''''''''''''''' >>> >>> Packages that have inline type annotations simply have to pass the value >>> ``"inline"`` to the ``typed`` argument in ``setup()``. >>> >>> Stub Only Packages >>> '''''''''''''''''' >>> >>> For package maintainers wishing to ship stub files containing all of their >>> type information, it is prefered that the ``*.pyi`` stubs are alongside the >>> corresponding ``*.py`` files. However, the stubs may be put in a sub-folder >>> of the Python sources, with the same name the ``*.py`` files are in. For >>> example, the ``flyingcircus`` package would have its stubs in the folder >>> ``flyingcircus/flyingcircus/``. This path is chosen so that if stubs are >>> >>> What if `flyingcircus` already contains a subpackage called >> `flyingcircus`? This might be theoretical but there's probably a package >> out there that does this. >> > > I considered this. I considered it as worrying too much. The alternative > would be to special case the name and have type checkers follow that. > > That's fair. > not found in ``flyingcircus/`` the type checker may treat the subdirectory as >>> a normal package. The normal resolution order of checking ``*.pyi`` before >>> ``*.py`` will be maintained. The value of the ``typed`` argument to >>> ``setup()`` is ``"stubs"`` for this type of distribution. The author of the >>> package is suggested to use ``package_data`` to assure the stub files are >>> installed alongside the runtime Python code. >>> >>> It would be helpful to have an example of what this looks like. This PEP >> will likely end up being the reference document for people looking to add >> typing support to their packages. >> > > I plan on writing examples of the distutils plugin and a sample package > tomorrow if I have time. 
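Just to make sure we are picturing the same thing, a third party stub
distribution for ``flyingcircus`` (the case quoted just below) might end up
looking roughly like this -- purely illustrative, since the ``typed`` keyword
and this layout are only what the PEP proposes and nothing here exists in
setuptools/distutils today:

    flyingcircus-stubs/
        setup.py
        flyingcircus/
            __init__.pyi
            swallow.pyi        # hypothetical submodule stub

with a setup.py along the lines of:

    from setuptools import setup

    setup(
        name="flyingcircus-stubs",
        version="1.0.0",            # matches the flyingcircus release it types
        setup_requires=["typing"],  # would provide the proposed extension
        typed="stubs",              # proposed keyword from this PEP
        packages=["flyingcircus"],
        package_data={"flyingcircus": ["*.pyi"]},
    )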
> >> Third Party Stub Packages >>> ''''''''''''''''''''''''' >>> >>> Third parties seeking to distribute stub files are encouraged to contact the >>> maintainer of the package about distribution alongside the package. If the >>> maintainer does not wish to maintain or package stub files or type information >>> inline, then a "third party stub package" should be created. The structure is >>> similar, but slightly different from that of stub only packages. If the stubs >>> are for the library ``flyingcircus`` then the package should be named >>> ``flyingcircus-stubs`` and the stub files should be put in a sub-directory >>> named ``flyingcircus``. This allows the stubs to be checked as if they were in >>> a regular package. These packages should also pass ``"stubs"`` as the value >>> of ``typed`` argument in ``setup()``. These packages are suggested to use >>> ``package_data`` to package stub files. >>> >>> The version of the ``flyingcircus-stubs`` package should match the version of >>> the ``flyingcircus`` package it is providing types for. >>> >>> What if I made stubs for flyingcircus 1.0 in my flyingcircus-stubs >> package, but then realized that I made a terrible mistake in the stubs? I >> can't upload a new version of flyingcircus-stubs 1.0 to PyPI, and I can't >> make flyingcircus-stubs have some other version than 1.0 because of this >> requirement. >> >> Another option is as follows: >> - Stub packages are versioned independently of the packages they provide >> stubs for. >> - There is a special global (say __version__) that can be used in stub >> packages and gets resolved to the version of the package that is being >> checked. >> >> The contents of flyingcircus-stubs might look like: >> >> import enum >> >> class Swallow(enum.Enum): >> african = 0 >> european = 1 >> if __version__ >= (2, 0): >> asian = 2 >> >> This option doesn't have the problem I described above, but it requires >> this magical __version__ variable. >> > > Guido has said he doesn't like this idea (https://github.com/python/ > typing/issues/84#issuecomment-318256377), and I'm not convinced it is > worth the complications it involves > But Guido's comment implies that people could use an independent versioning scheme in their stubs package (his "django-1.1-stubs" example). The PEP makes the situation worse because the version is fixed. I agree that my proposal introduces a lot of complication, but I don't think what the PEP proposes is workable (if django-stubs 1.1 got the stubs for django 1.1 wrong, there is no second chance). We get enough bugs in typeshed to know that stub packages will need updates independent from the package they're providing stubs for. Hopefully we can come up with something that allows stub packages to be versioned separately. Guido's comment actually suggests a way forward: What if stub packages can declare (in their package metadata or something) what version of the package they are type checking? That way, django1.1-stubs 0.1 can provide buggy stubs for django 1.1, and then fix them in django1.1-stubs 0.2. > Type Checker Module Resolution Order >>> ------------------------------------ >>> >>> The following is the order that type checkers supporting this PEP should >>> resolve modules containing type information: >>> >>> 1. User code - the files the type checker is running on. >>> >>> 2. Stubs or Python source in ``PYTHONPATH``. This is to allow the user >>> complete control of which stubs to use, and patch broken stubs/inline >>> types from packages. 
>>> >>> Current type checkers don't use PYTHONPATH as far as I know, because >> they may not run under the same Python version or environment as the code >> to be type checked. I don't think we should require type checkers to listen >> to PYTHONPATH; perhaps we should just say that type checkers should provide >> a way for users to put code at the beginning of the search path. >> > > I agree. > >> 3. Third party stub packages - these packages can supersede the installed >>> untyped packages. They can be found at ``pkg-stubs`` for package ``pkg``, >>> however it is encouraged to check their metadata to confirm that they opt >>> into type checking. >>> >>> The metadata of the stubs package? I'm not sure why that makes much >> sense here. >> > > Essentially, a package opts into being checked via a setup() keyword. That > keyword is put in the packages metadata. > >> 4. Inline packages - finally, if there is nothing overriding the installed >>> package, and it opts into type checking. >>> >>> 5. Typeshed (if used) - Provides the stdlib types and several third party libraries >>> >>> When resolving step (3) type checkers should assure the version of the stubs >>> match the installed runtime package. >>> >>> Type checkers that check a different Python version than the version they run >>> on must find the type information in the ``site-packages``/``dist-packages`` >>> of that Python version. This can be queried e.g. >>> ``pythonX.Y -c 'import sys; print(sys.exec_prefix)'``. It is also recommended >>> that the type checker allow for the user to point to a particular Python >>> binary, in case it is not in the path. >>> >>> To check if a package has opted into type checking, type checkers are >>> recommended to use the ``pkg_resources`` module to query the package >>> metadata. If the ``typed`` package metadata has ``None`` as its value, the >>> package has not opted into type checking, and the type checker should skip that >>> package. >>> >>> >>> References >>> ========== >>> >>> .. [1] PEP 484, Storing and Distributing Stub Files >>> (https://www.python.org/dev/peps/pep-0484/#storing-and-distributing-stub-files) >>> >>> .. [2] Distutils Extensions, Adding setup() arguments >>> (http://setuptools.readthedocs.io/en/latest/setuptools.html#adding-setup-arguments) >>> >>> Copyright >>> ========= >>> >>> This document has been placed in the public domain. >>> >>> >>> >>> .. >>> Local Variables: >>> mode: indented-text >>> indent-tabs-mode: nil >>> sentence-end-double-space: t >>> fill-column: 70 >>> coding: utf-8 >>> End: >>> >>> >>> _______________________________________________ >>> Python-ideas mailing list >>> Python-ideas at python.org >>> https://mail.python.org/mailman/listinfo/python-ideas >>> Code of Conduct: http://python.org/psf/codeofconduct/ >>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From songofacandy at gmail.com Sun Sep 10 23:08:12 2017 From: songofacandy at gmail.com (INADA Naoki) Date: Mon, 11 Sep 2017 03:08:12 +0000 Subject: [Python-ideas] PEP 562 In-Reply-To: References: Message-ID: It looks simple and easy to understand. To achieve lazy import without breaking backward compatibility, I want to add one more rule: When package defines both of __getattr__ and __all__, automatic import of submodules are disabled (sorry, I don't have pointer to specification about this behavior). For example, some modules depends on email.parser or email.feedparser. But since email/__init__.py uses __all__, all submodules are imported eagerly. 
See https://github.com/python/cpython/blob/master/Lib/email/__init__.py#L7-L25 Changing __all__ will break backward compatibility. With __getattr__, this can be lazy import: import importlib def __getattr__(name): if name in __all__: return importlib.import_module("." + name, __name__) raise AttributeError(f"module {__name__!r} has no attribute {name!r}") Regards, -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Sun Sep 10 23:17:41 2017 From: guido at python.org (Guido van Rossum) Date: Sun, 10 Sep 2017 20:17:41 -0700 Subject: [Python-ideas] PEP 562 In-Reply-To: References: Message-ID: I don't think submodules are automatically imported, unless there are import statements in __init__.py. On Sun, Sep 10, 2017 at 8:08 PM, INADA Naoki wrote: > It looks simple and easy to understand. > > To achieve lazy import without breaking backward compatibility, > I want to add one more rule: When package defines both of __getattr__ and > __all__, automatic import of submodules are disabled (sorry, I don't have > pointer to specification about this behavior). > > For example, some modules depends on email.parser or email.feedparser. > But since email/__init__.py uses __all__, all submodules > are imported eagerly. > > See https://github.com/python/cpython/blob/master/Lib/email/ > __init__.py#L7-L25 > > Changing __all__ will break backward compatibility. > With __getattr__, this can be lazy import: > > import importlib > > def __getattr__(name): > if name in __all__: > return importlib.import_module("." + name, __name__) > raise AttributeError(f"module {__name__!r} has no attribute {name!r}") > > > Regards, > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From songofacandy at gmail.com Mon Sep 11 00:02:48 2017 From: songofacandy at gmail.com (INADA Naoki) Date: Mon, 11 Sep 2017 13:02:48 +0900 Subject: [Python-ideas] PEP 562 In-Reply-To: References: Message-ID: Oh, I'm shame myself. Only when `from email import *` is used, __all__ submodules are imported. INADA Naoki On Mon, Sep 11, 2017 at 12:17 PM, Guido van Rossum wrote: > I don't think submodules are automatically imported, unless there are import > statements in __init__.py. > > On Sun, Sep 10, 2017 at 8:08 PM, INADA Naoki wrote: >> >> It looks simple and easy to understand. >> >> To achieve lazy import without breaking backward compatibility, >> I want to add one more rule: When package defines both of __getattr__ and >> __all__, automatic import of submodules are disabled (sorry, I don't have >> pointer to specification about this behavior). >> >> For example, some modules depends on email.parser or email.feedparser. >> But since email/__init__.py uses __all__, all submodules >> are imported eagerly. >> >> See >> https://github.com/python/cpython/blob/master/Lib/email/__init__.py#L7-L25 >> >> Changing __all__ will break backward compatibility. >> With __getattr__, this can be lazy import: >> >> import importlib >> >> def __getattr__(name): >> if name in __all__: >> return importlib.import_module("." 
+ name, __name__) >> raise AttributeError(f"module {__name__!r} has no attribute {name!r}") >> >> >> Regards, >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > > > > -- > --Guido van Rossum (python.org/~guido) From ethan at ethanhs.me Mon Sep 11 01:15:40 2017 From: ethan at ethanhs.me (Ethan Smith) Date: Sun, 10 Sep 2017 22:15:40 -0700 Subject: [Python-ideas] PEP 561: Distributing and Packaging Type Information In-Reply-To: References: Message-ID: On Sun, Sep 10, 2017 at 6:21 PM, Jelle Zijlstra wrote: > > > 2017-09-10 18:10 GMT-07:00 Ethan Smith : > >> >> >> On Sun, Sep 10, 2017 at 5:39 PM, Jelle Zijlstra > > wrote: >> >>> Congratulations on your first PEP! This is solving an important problem >>> for typing in Python, so I'm glad we're tackling it. >>> >> >> Thanks! >> >>> >>> >>> >> Packaging Type Information >>>> -------------------------- >>>> >>>> Packages must opt into supporting typing. This will be done though a distutils >>>> extension [2]_, providing a ``typed`` keyword argument to the distutils >>>> ``setup()`` command. The argument value will depend on the kind of type >>>> information the package provides. The distutils extension will be added to the >>>> ``typing`` package. Therefore a package maintainer may write >>>> >>>> Is the addition to the `typing` package just a legacy feature for >>> Python versions without typing in the standard library? This should be made >>> explicit. >>> >> >> The intent here is that the typing package would be required for the >> extra setup keyword to work, otherwise it would fail. >> > Then I would have to install the `typing` PyPI package even if I am only > using Python 3.7+? That seems suboptimal. Perhaps the new keyword can be > part of Python core in 3.7 and added to `typing_extensions` for 3.5 and 3.6. > That would be acceptable. > :: >>>> >>>> setup( >>>> ... >>>> setup_requires=["typing"], >>>> typed="inline", >>>> ... >>>> ) >>>> >>>> Inline Typed Packages >>>> ''''''''''''''''''''' >>>> >>>> Packages that have inline type annotations simply have to pass the value >>>> ``"inline"`` to the ``typed`` argument in ``setup()``. >>>> >>>> Stub Only Packages >>>> '''''''''''''''''' >>>> >>>> For package maintainers wishing to ship stub files containing all of their >>>> type information, it is prefered that the ``*.pyi`` stubs are alongside the >>>> corresponding ``*.py`` files. However, the stubs may be put in a sub-folder >>>> of the Python sources, with the same name the ``*.py`` files are in. For >>>> example, the ``flyingcircus`` package would have its stubs in the folder >>>> ``flyingcircus/flyingcircus/``. This path is chosen so that if stubs are >>>> >>>> What if `flyingcircus` already contains a subpackage called >>> `flyingcircus`? This might be theoretical but there's probably a package >>> out there that does this. >>> >> >> I considered this. I considered it as worrying too much. The alternative >> would be to special case the name and have type checkers follow that. >> > >> > That's fair. > >> not found in ``flyingcircus/`` the type checker may treat the subdirectory as >>>> a normal package. The normal resolution order of checking ``*.pyi`` before >>>> ``*.py`` will be maintained. The value of the ``typed`` argument to >>>> ``setup()`` is ``"stubs"`` for this type of distribution. 
The author of the >>>> package is suggested to use ``package_data`` to assure the stub files are >>>> installed alongside the runtime Python code. >>>> >>>> It would be helpful to have an example of what this looks like. This >>> PEP will likely end up being the reference document for people looking to >>> add typing support to their packages. >>> >> >> I plan on writing examples of the distutils plugin and a sample package >> tomorrow if I have time. >> >>> Third Party Stub Packages >>>> ''''''''''''''''''''''''' >>>> >>>> Third parties seeking to distribute stub files are encouraged to contact the >>>> maintainer of the package about distribution alongside the package. If the >>>> maintainer does not wish to maintain or package stub files or type information >>>> inline, then a "third party stub package" should be created. The structure is >>>> similar, but slightly different from that of stub only packages. If the stubs >>>> are for the library ``flyingcircus`` then the package should be named >>>> ``flyingcircus-stubs`` and the stub files should be put in a sub-directory >>>> named ``flyingcircus``. This allows the stubs to be checked as if they were in >>>> a regular package. These packages should also pass ``"stubs"`` as the value >>>> of ``typed`` argument in ``setup()``. These packages are suggested to use >>>> ``package_data`` to package stub files. >>>> >>>> The version of the ``flyingcircus-stubs`` package should match the version of >>>> the ``flyingcircus`` package it is providing types for. >>>> >>>> What if I made stubs for flyingcircus 1.0 in my flyingcircus-stubs >>> package, but then realized that I made a terrible mistake in the stubs? I >>> can't upload a new version of flyingcircus-stubs 1.0 to PyPI, and I can't >>> make flyingcircus-stubs have some other version than 1.0 because of this >>> requirement. >>> >>> Another option is as follows: >>> - Stub packages are versioned independently of the packages they provide >>> stubs for. >>> - There is a special global (say __version__) that can be used in stub >>> packages and gets resolved to the version of the package that is being >>> checked. >>> >>> The contents of flyingcircus-stubs might look like: >>> >>> import enum >>> >>> class Swallow(enum.Enum): >>> african = 0 >>> european = 1 >>> if __version__ >= (2, 0): >>> asian = 2 >>> >>> This option doesn't have the problem I described above, but it requires >>> this magical __version__ variable. >>> >> >> Guido has said he doesn't like this idea (https://github.com/python/typ >> ing/issues/84#issuecomment-318256377), and I'm not convinced it is worth >> the complications it involves >> > But Guido's comment implies that people could use an independent > versioning scheme in their stubs package (his "django-1.1-stubs" example). > The PEP makes the situation worse because the version is fixed. > > I agree that my proposal introduces a lot of complication, but I don't > think what the PEP proposes is workable (if django-stubs 1.1 got the stubs > for django 1.1 wrong, there is no second chance). We get enough bugs in > typeshed to know that stub packages will need updates independent from the > package they're providing stubs for. > > Hopefully we can come up with something that allows stub packages to be > versioned separately. Guido's comment actually suggests a way forward: What > if stub packages can declare (in their package metadata or something) what > version of the package they are type checking? 
That way, django1.1-stubs > 0.1 can provide buggy stubs for django 1.1, and then fix them in > django1.1-stubs 0.2. > After thinking about this for a while, I have come to believe metadata is the solution. Keeping the versioning in the name becomes a burden on both the maintainer and PyPI. I think the best solution is to leverage existing metadata and have the "stub only" packages list the package versions they support via the install_requires keyword, and I believe type checkers can verify that rather easily. So django-stubs would list e.g. django>=1.1.0,django<1.2.0 or whatever the stub maintainer wishes to support. Then the package of stubs can use normal versioning. I think I have reached the point where I should go back and work on some changes to the PEP barring objection to your feedback and/or my suggested amendments. > Type Checker Module Resolution Order >>>> ------------------------------------ >>>> >>>> The following is the order that type checkers supporting this PEP should >>>> resolve modules containing type information: >>>> >>>> 1. User code - the files the type checker is running on. >>>> >>>> 2. Stubs or Python source in ``PYTHONPATH``. This is to allow the user >>>> complete control of which stubs to use, and patch broken stubs/inline >>>> types from packages. >>>> >>>> Current type checkers don't use PYTHONPATH as far as I know, because >>> they may not run under the same Python version or environment as the code >>> to be type checked. I don't think we should require type checkers to listen >>> to PYTHONPATH; perhaps we should just say that type checkers should provide >>> a way for users to put code at the beginning of the search path. >>> >> >> I agree. >> >>> 3. Third party stub packages - these packages can supersede the installed >>>> untyped packages. They can be found at ``pkg-stubs`` for package ``pkg``, >>>> however it is encouraged to check their metadata to confirm that they opt >>>> into type checking. >>>> >>>> The metadata of the stubs package? I'm not sure why that makes much >>> sense here. >>> >> >> Essentially, a package opts into being checked via a setup() keyword. >> That keyword is put in the packages metadata. >> >>> 4. Inline packages - finally, if there is nothing overriding the installed >>>> package, and it opts into type checking. >>>> >>>> 5. Typeshed (if used) - Provides the stdlib types and several third party libraries >>>> >>>> When resolving step (3) type checkers should assure the version of the stubs >>>> match the installed runtime package. >>>> >>>> Type checkers that check a different Python version than the version they run >>>> on must find the type information in the ``site-packages``/``dist-packages`` >>>> of that Python version. This can be queried e.g. >>>> ``pythonX.Y -c 'import sys; print(sys.exec_prefix)'``. It is also recommended >>>> that the type checker allow for the user to point to a particular Python >>>> binary, in case it is not in the path. >>>> >>>> To check if a package has opted into type checking, type checkers are >>>> recommended to use the ``pkg_resources`` module to query the package >>>> metadata. If the ``typed`` package metadata has ``None`` as its value, the >>>> package has not opted into type checking, and the type checker should skip that >>>> package. >>>> >>>> >>>> References >>>> ========== >>>> >>>> .. [1] PEP 484, Storing and Distributing Stub Files >>>> (https://www.python.org/dev/peps/pep-0484/#storing-and-distributing-stub-files) >>>> >>>> .. 
[2] Distutils Extensions, Adding setup() arguments >>>> (http://setuptools.readthedocs.io/en/latest/setuptools.html#adding-setup-arguments) >>>> >>>> Copyright >>>> ========= >>>> >>>> This document has been placed in the public domain. >>>> >>>> >>>> >>>> .. >>>> Local Variables: >>>> mode: indented-text >>>> indent-tabs-mode: nil >>>> sentence-end-double-space: t >>>> fill-column: 70 >>>> coding: utf-8 >>>> End: >>>> >>>> >>>> _______________________________________________ >>>> Python-ideas mailing list >>>> Python-ideas at python.org >>>> https://mail.python.org/mailman/listinfo/python-ideas >>>> Code of Conduct: http://python.org/psf/codeofconduct/ >>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Mon Sep 11 01:32:19 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 11 Sep 2017 15:32:19 +1000 Subject: [Python-ideas] PEP 554: Stdlib Module to Support Multiple Interpreters in Python Code In-Reply-To: References: Message-ID: On 11 September 2017 at 00:52, Koos Zevenhoven wrote: > On Thu, Sep 7, 2017 at 9:26 PM, Eric Snow > wrote: > [...] > >> >> get_main(): >> >> Return the main interpreter. >> > > I assume the concept of a main interpreter is inherited from the previous > levels of support in the C API, but what exactly is the significance of > being "the main interpreter"? Instead, could they just all be > subinterpreters of the same Python process (or whatever the right wording > would be)? The main interpreter is ultimately responsible for the actual process global state: standard streams, signal handlers, dynamically linked libraries, __main__ module, etc. The line between it and the "CPython Runtime" is fuzzy for both practical and historical reasons, but the regular Python CLI will always have a "first created, last destroyed" main interpreter, simply because we don't really gain anything significant from eliminating it as a concept. By contrast, embedding applications that *don't* have a __main__ module, and already manage most process global state themselves without the assistance of the CPython Runtime can already get pretty close to just having a pool of peer subinterpreters, and will presumably be able to get closer over time as the subinterpreter support becomes more robust. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From k7hoven at gmail.com Mon Sep 11 04:02:55 2017 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Mon, 11 Sep 2017 11:02:55 +0300 Subject: [Python-ideas] PEP 554: Stdlib Module to Support Multiple Interpreters in Python Code In-Reply-To: References: Message-ID: On Mon, Sep 11, 2017 at 8:32 AM, Nick Coghlan wrote: > On 11 September 2017 at 00:52, Koos Zevenhoven wrote: > > On Thu, Sep 7, 2017 at 9:26 PM, Eric Snow > > wrote: > > [...] > > > >> > >> get_main(): > >> > >> Return the main interpreter. > >> > > > > I assume the concept of a main interpreter is inherited from the previous > > levels of support in the C API, but what exactly is the significance of > > being "the main interpreter"? Instead, could they just all be > > subinterpreters of the same Python process (or whatever the right wording > > would be)? > > The main interpreter is ultimately responsible for the actual process > global state: standard streams, signal handlers, dynamically linked > libraries, __main__ module, etc. > > ?Hmm. It is not clear, for instance, why a signal handler could not be owned by an interpreter that wasn't the first one started.? 
Or, if a non-main process imports a module from a dynamically linked library, does it delegate that to the main interpreter? And do sys.stdout et al. not exist in the other interpreters? The line between it and the "CPython Runtime" is fuzzy for both > practical and historical reasons, but the regular Python CLI will > always have a "first created, last destroyed" main interpreter, simply > because we don't really gain anything significant from eliminating it > as a concept. > I fear that emphasizing the main interpreter will lead to all kinds of libraries/programs that somehow unnecessarily rely on some or all tasks being performed in the main interpreter. Then you'll have a hard time running two of them in parallel in the same process, because you don't have two main interpreters. -- Koos ?? ?PS. There's a saying... something like "always say never" ;) ? ? > By contrast, embedding applications that *don't* have a __main__ > module, and already manage most process global state themselves > without the assistance of the CPython Runtime can already get pretty > close to just having a pool of peer subinterpreters, and will > presumably be able to get closer over time as the subinterpreter > support becomes more robust. > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > -- + Koos Zevenhoven + http://twitter.com/k7hoven + -------------- next part -------------- An HTML attachment was scrubbed... URL: From guettliml at thomas-guettler.de Mon Sep 11 05:18:51 2017 From: guettliml at thomas-guettler.de (=?UTF-8?Q?Thomas_G=c3=bcttler?=) Date: Mon, 11 Sep 2017 11:18:51 +0200 Subject: [Python-ideas] Adding new lines to "Zen of Python" In-Reply-To: <4adb6e68-80f8-b57f-8428-41ace013d35f@thomas-guettler.de> References: <4adb6e68-80f8-b57f-8428-41ace013d35f@thomas-guettler.de> Message-ID: Am 08.09.2017 um 14:47 schrieb Thomas G?ttler: > I curious if there are any plans to update the "Zen of Python". > > What could be added to the "Zen of Python"? > > What do you think? > I like this one: "Bad programmers worry about the code. Good programmers worry about data structures and their relationships." (Linus Torvalds) Regards, Thomas G?ttler -- Thomas Guettler http://www.thomas-guettler.de/ I am looking for feedback: https://github.com/guettli/programming-guidelines From jcrmatos at gmail.com Mon Sep 11 10:03:12 2017 From: jcrmatos at gmail.com (=?UTF-8?Q?Jo=c3=a3o_Matos?=) Date: Mon, 11 Sep 2017 15:03:12 +0100 Subject: [Python-ideas] Give nonlocal the same creating power as global Message-ID: <89c78309-d807-8d76-2e6f-b3ef0ff29c35@gmail.com> Hello, I would like to suggest that nonlocal should be given the same creating power as global. If I do global a_var it creates the global a_var if it doesn't exist. I think it would be great that nonlocal maintained that power. This way when I do nonlocal a_var it would create a_var in the imediate parent environment, if it didn't exist. Without nonlocal creation powers I have to create global variables or local variables after master=Tk() (in the following example): from tkinter import StringVar, Tk from tkinter.ttk import Label def start_gui(): ??? def change_label(): ??????? _label_sv.set('Bye Bye') ??? def create_vars(): ??????? global _label_sv ??????? _label_sv = StringVar(value='Hello World') ??? def create_layout(): ??????? Label(master, textvariable=_label_sv).grid() ??? def create_bindings(): ??????? master.bind('', lambda _: master.destroy()) ??????? master.bind('', lambda _: change_label()) ??? master = Tk() ??? 
create_vars() ??? create_layout() ??? create_bindings() ??? master.mainloop() if __name__ == '__main__': ??? start_gui() With nonlocal creation powers it would become a start_gui local variable (no global) but I could have a function to create the vars instead of having to add them after master=Tk(): from tkinter import StringVar, Tk from tkinter.ttk import Label def start_gui(): ??? def change_label(): ??????? label_sv.set('Bye Bye') ??? def create_vars(): ??????? nonlocal label_sv ??????? label_sv = StringVar(value='Hello World') ??? def create_layout(): ??????? Label(master, textvariable=label_sv).grid() ??? def create_bindings(): ??????? master.bind('', lambda _: master.destroy()) ??????? master.bind('', lambda _: change_label()) ??? master = Tk() ??? create_vars() ??? create_layout() ??? create_bindings() ??? master.mainloop() if __name__ == '__main__': ??? start_gui() I know that I could also do it with OOP, but this way is more concise (OOP would add more lines and increase the lines length, which I personally dislike) This example is very simple, but if you imagine a GUI with several widgets, then the separation between vars, layout and bindings becomes useful for code organization. Best regards, Jo?o Matos From jcrmatos at gmail.com Mon Sep 11 10:06:47 2017 From: jcrmatos at gmail.com (=?UTF-8?Q?Jo=C3=A3o_Matos?=) Date: Mon, 11 Sep 2017 07:06:47 -0700 (PDT) Subject: [Python-ideas] Give nonlocal the same creating power as global Message-ID: <7fb585fb-d66a-4860-9c4b-3563205d1d75@googlegroups.com> Hello, I would like to suggest that nonlocal should be given the same creating power as global. If I do global a_var it creates the global a_var if it doesn't exist. I think it would be great that nonlocal maintained that power. This way when I do nonlocal a_var it would create a_var in the imediate parent environment, if it didn't exist. Without nonlocal creation powers I have to create global variables or local variables after master=Tk() (in the following example): from tkinter import StringVar, Tk from tkinter.ttk import Label def start_gui(): def change_label(): _label_sv.set('Bye Bye') def create_vars(): global _label_sv _label_sv = StringVar(value='Hello World') def create_layout(): Label(master, textvariable=_label_sv).grid() def create_bindings(): master.bind('', lambda _: master.destroy()) master.bind('', lambda _: change_label()) master = Tk() create_vars() create_layout() create_bindings() master.mainloop() if __name__ == '__main__': start_gui() With nonlocal creation powers it would become a start_gui local variable (no global) but I could have a function to create the vars instead of having to add them after master=Tk(): from tkinter import StringVar, Tk from tkinter.ttk import Label def start_gui(): def change_label(): label_sv.set('Bye Bye') def create_vars(): nonlocal label_sv label_sv = StringVar(value='Hello World') def create_layout(): Label(master, textvariable=label_sv).grid() def create_bindings(): master.bind('', lambda _: master.destroy()) master.bind('', lambda _: change_label()) master = Tk() create_vars() create_layout() create_bindings() master.mainloop() if __name__ == '__main__': start_gui() I know that I could also do it with OOP, but this way is more concise (OOP would add more lines and increase the lines length, which I personally dislike) This example is very simple, but if you imagine a GUI with several widgets, then the separation between vars, layout and bindings becomes useful for code organization. 
Best regards, Jo?o Matos -------------- next part -------------- An HTML attachment was scrubbed... URL: From jhihn at gmx.com Mon Sep 11 10:26:10 2017 From: jhihn at gmx.com (Jason H) Date: Mon, 11 Sep 2017 16:26:10 +0200 Subject: [Python-ideas] Give nonlocal the same creating power as global In-Reply-To: <89c78309-d807-8d76-2e6f-b3ef0ff29c35@gmail.com> References: <89c78309-d807-8d76-2e6f-b3ef0ff29c35@gmail.com> Message-ID: > Sent: Monday, September 11, 2017 at 10:03 AM > From: "Jo?o Matos" > To: python-ideas at python.org > Subject: [Python-ideas] Give nonlocal the same creating power as global > > Hello, > > I would like to suggest that nonlocal should be given the same creating > power as global. > If I do > global a_var > it creates the global a_var if it doesn't exist. > ... I think this is a bad idea overall. It breaks encapsulation. I would suggest you use create_vars() to return a context: context = create_vars() create_layout(context) create_bindings(context) Or use a class: class TkView: def __init__(self): self.tk = Tk() self.create_layout() self.create_bindings() def create_layout(self): # use self.tk def create_bindings(self): # use self.tk From guido at python.org Mon Sep 11 11:45:37 2017 From: guido at python.org (Guido van Rossum) Date: Mon, 11 Sep 2017 08:45:37 -0700 Subject: [Python-ideas] PEP 562 In-Reply-To: References: Message-ID: There's no need for shame! I regularly find out that there are Python features I didn't know about. It's called perpetual learning. :-) On Sun, Sep 10, 2017 at 9:02 PM, INADA Naoki wrote: > Oh, I'm shame myself. > > Only when `from email import *` is used, __all__ submodules are imported. > INADA Naoki > > > On Mon, Sep 11, 2017 at 12:17 PM, Guido van Rossum > wrote: > > I don't think submodules are automatically imported, unless there are > import > > statements in __init__.py. > > > > On Sun, Sep 10, 2017 at 8:08 PM, INADA Naoki > wrote: > >> > >> It looks simple and easy to understand. > >> > >> To achieve lazy import without breaking backward compatibility, > >> I want to add one more rule: When package defines both of __getattr__ > and > >> __all__, automatic import of submodules are disabled (sorry, I don't > have > >> pointer to specification about this behavior). > >> > >> For example, some modules depends on email.parser or email.feedparser. > >> But since email/__init__.py uses __all__, all submodules > >> are imported eagerly. > >> > >> See > >> https://github.com/python/cpython/blob/master/Lib/email/ > __init__.py#L7-L25 > >> > >> Changing __all__ will break backward compatibility. > >> With __getattr__, this can be lazy import: > >> > >> import importlib > >> > >> def __getattr__(name): > >> if name in __all__: > >> return importlib.import_module("." + name, __name__) > >> raise AttributeError(f"module {__name__!r} has no attribute > {name!r}") > >> > >> > >> Regards, > >> > >> _______________________________________________ > >> Python-ideas mailing list > >> Python-ideas at python.org > >> https://mail.python.org/mailman/listinfo/python-ideas > >> Code of Conduct: http://python.org/psf/codeofconduct/ > >> > > > > > > > > -- > > --Guido van Rossum (python.org/~guido) > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From tjreedy at udel.edu Mon Sep 11 11:51:46 2017 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 11 Sep 2017 11:51:46 -0400 Subject: [Python-ideas] Give nonlocal the same creating power as global In-Reply-To: <89c78309-d807-8d76-2e6f-b3ef0ff29c35@gmail.com> References: <89c78309-d807-8d76-2e6f-b3ef0ff29c35@gmail.com> Message-ID: On 9/11/2017 10:03 AM, Jo?o Matos wrote: > Hello, > > I would like to suggest that nonlocal should be given the same creating > power as global. > If I do global a_var it creates the global a_var if it doesn't exist. The global declaration does not create anything, but it redirects subsequent binding. > I think it would be great that nonlocal maintained that power. > > This way when I do nonlocal a_var > it would create a_var in the imediate parent environment, if it didn't > exist. 'Creating new variables' was discussed and rejected when nonlocal was added. That may partly be for technical reasons of not nonlocal is implemented. But there are also problems of ambiguity. Consider this currently legal code. def f(a): def g(): pass def h(): nonlocal a a = 1 You proposal would break all such usages that depend on skipping the immediate parent environment. 'nonlocal a' effectively means 'find the closest function scope with local name a' and I strongly doubt we will change that. If you want 'nonlocal a' to bind in g, explicitly add a to g's locals, such as with 'a = None'. > Without nonlocal creation powers I have to create global variables or > local variables after master=Tk() (in the following example): There is nothing wrong with either. > from tkinter import StringVar, Tk > from tkinter.ttk import Label > > > def start_gui(): > ??? def change_label(): > ??????? _label_sv.set('Bye Bye') > > ??? def create_vars(): > ??????? global _label_sv > > ??????? _label_sv = StringVar(value='Hello World') > > ??? def create_layout(): > ??????? Label(master, textvariable=_label_sv).grid() > > ??? def create_bindings(): > ??????? master.bind('', lambda _: master.destroy()) > ??????? master.bind('', lambda _: change_label()) > > ??? master = Tk() > > ??? create_vars() > ??? create_layout() > ??? create_bindings() > > ??? master.mainloop() > > if __name__ == '__main__': > ??? start_gui() In the version above, you could simplify by removing start_gui and put the operative code from 'master = Tk()' on down in the main clause. This is standard practice for non-OOP tkinter code. > With nonlocal creation powers it would become a start_gui local variable > (no global) but I could have a function to create the vars instead of > having to add them after master=Tk(): > > from tkinter import StringVar, Tk > from tkinter.ttk import Label > > > def start_gui(): > ??? def change_label(): > ??????? label_sv.set('Bye Bye') > > ??? def create_vars(): > ??????? nonlocal label_sv > ??????? label_sv = StringVar(value='Hello World') > > ??? def create_layout(): > ??????? Label(master, textvariable=label_sv).grid() > > ??? def create_bindings(): > ??????? master.bind('', lambda _: master.destroy()) > ??????? master.bind('', lambda _: change_label()) > > ??? master = Tk() > > ??? create_vars() > ??? create_layout() > ??? create_bindings() > > ??? master.mainloop() > > > if __name__ == '__main__': > ??? start_gui() Initializing the outer function local, here adding 'label_sv = None', is the price of wanting to create a class with functions instead of a class definition. 
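Concretely, pre-binding the name in the outer function is all that the current
nonlocal rules ask for. A minimal, GUI-free sketch of that workaround (the
tkinter widgets from the example above are stripped out and a plain string
stands in for StringVar, so this is an illustration rather than code from the
thread):

    def start_gui():
        label_sv = None            # create the outer-function local up front

        def create_vars():
            nonlocal label_sv      # legal: rebinds the existing outer local
            label_sv = 'Hello World'

        def change_label():
            nonlocal label_sv
            label_sv = 'Bye Bye'

        create_vars()
        print(label_sv)            # Hello World
        change_label()
        print(label_sv)            # Bye Bye

    start_gui()

The single 'label_sv = None' line plays the role of a declaration, so nonlocal
has an existing outer binding to attach to and no new creation semantics are
needed.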
> I know that I could also do it with OOP, but this way is more concise > (OOP would add more lines and increase the lines length, which I > personally dislike) > This example is very simple, but if you imagine a GUI with several > widgets, then the separation between vars, layout and bindings becomes > useful for code organization. This is what classes are for. Either use 'class' or explicitly name the local of the outer function acting as a class. -- Terry Jan Reedy From lukasz at langa.pl Mon Sep 11 11:58:45 2017 From: lukasz at langa.pl (Lukasz Langa) Date: Mon, 11 Sep 2017 11:58:45 -0400 Subject: [Python-ideas] PEP 563: Postponed Evaluation of Annotations, first draft Message-ID: <1176DC03-ACB8-48F4-B8A0-02268B524BE4@langa.pl> PEP: 563 Title: Postponed Evaluation of Annotations Version: $Revision$ Last-Modified: $Date$ Author: ?ukasz Langa Discussions-To: Python-Dev Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 8-Sep-2017 Python-Version: 3.7 Post-History: Resolution: Abstract ======== PEP 3107 introduced syntax for function annotations, but the semantics were deliberately left undefined. PEP 484 introduced a standard meaning to annotations: type hints. PEP 526 defined variable annotations, explicitly tying them with the type hinting use case. This PEP proposes changing function annotations and variable annotations so that they are no longer evaluated at function definition time. Instead, they are preserved in ``__annotations__`` in string form. This change is going to be introduced gradually, starting with a new ``__future__`` import in Python 3.7. Rationale and Goals =================== PEP 3107 added support for arbitrary annotations on parts of a function definition. Just like default values, annotations are evaluated at function definition time. This creates a number of issues for the type hinting use case: * forward references: when a type hint contains names that have not been defined yet, that definition needs to be expressed as a string literal; * type hints are executed at module import time, which is not computationally free. Postponing the evaluation of annotations solves both problems. Non-goals --------- Just like in PEP 484 and PEP 526, it should be emphasized that **Python will remain a dynamically typed language, and the authors have no desire to ever make type hints mandatory, even by convention.** Annotations are still available for arbitrary use besides type checking. Using ``@typing.no_type_hints`` in this case is recommended to disambiguate the use case. Implementation ============== In a future version of Python, function and variable annotations will no longer be evaluated at definition time. Instead, a string form will be preserved in the respective ``__annotations__`` dictionary. Static type checkers will see no difference in behavior, whereas tools using annotations at runtime will have to perform postponed evaluation. If an annotation was already a string, this string is preserved verbatim. In other cases, the string form is obtained from the AST during the compilation step, which means that the string form preserved might not preserve the exact formatting of the source. Annotations need to be syntactically valid Python expressions, also when passed as literal strings (i.e. ``compile(literal, '', 'eval')``). Annotations can only use names present in the module scope as postponed evaluation using local names is not reliable. 
Note that as per PEP 526, local variable annotations are not evaluated at all since they are not accessible outside of the function's closure. Enabling the future behavior in Python 3.7 ------------------------------------------ The functionality described above can be enabled starting from Python 3.7 using the following special import:: from __future__ import annotations Resolving Type Hints at Runtime =============================== To resolve an annotation at runtime from its string form to the result of the enclosed expression, user code needs to evaluate the string. For code that uses type hints, the ``typing.get_type_hints()`` function correctly evaluates expressions back from its string form. Note that all valid code currently using ``__annotations__`` should already be doing that since a type annotation can be expressed as a string literal. For code which uses annotations for other purposes, a regular ``eval(ann, globals, locals)`` call is enough to resolve the annotation. The trick here is to get the correct value for globals. Fortunately, in the case of functions, they hold a reference to globals in an attribute called ``__globals__``. To get the correct module-level context to resolve class variables, use:: cls_globals = sys.modules[SomeClass.__module__].__dict__ Runtime annotation resolution and class decorators -------------------------------------------------- Metaclasses and class decorators that need to resolve annotations for the current class will fail for annotations that use the name of the current class. Example:: def class_decorator(cls): annotations = get_type_hints(cls) # raises NameError on 'C' print(f'Annotations for {cls}: {annotations}') return cls @class_decorator class C: singleton: 'C' = None This was already true before this PEP. The class decorator acts on the class before it's assigned a name in the current definition scope. The situation is made somewhat stricter when class-level variables are considered. Previously, when the string form wasn't used in annotations, a class decorator would be able to cover situations like:: @class_decorator class Restaurant: class MenuOption(Enum): SPAM = 1 EGGS = 2 default_menu: List[MenuOption] = [] This is no longer possible. Runtime annotation resolution and ``TYPE_CHECKING`` --------------------------------------------------- Sometimes there's code that must be seen by a type checker but should not be executed. For such situations the ``typing`` module defines a constant, ``TYPE_CHECKING``, that is considered ``True`` during type checking but ``False`` at runtime. Example:: import typing if typing.TYPE_CHECKING: import expensive_mod def a_func(arg: expensive_mod.SomeClass) -> None: a_var: expensive_mod.SomeClass = arg ... This approach is also useful when handling import cycles. Trying to resolve annotations of ``a_func`` at runtime using ``typing.get_type_hints()`` will fail since the name ``expensive_mod`` is not defined (``TYPE_CHECKING`` variable being ``False`` at runtime). This was already true before this PEP. Backwards Compatibility ======================= This is a backwards incompatible change. Applications depending on arbitrary objects to be directly present in annotations will break if they are not using ``typing.get_type_hints()`` or ``eval()``. Annotations that depend on locals at the time of the function/class definition are now invalid. 
Example:: def generate_class(): some_local = datetime.datetime.now() class C: field: some_local = 1 # NOTE: INVALID ANNOTATION def method(self, arg: some_local.day) -> None: # NOTE: INVALID ANNOTATION ... Annotations using nested classes and their respective state are still valid, provided they use the fully qualified name. Example:: class C: field = 'c_field' def method(self, arg: C.field) -> None: # this is OK ... class D: field2 = 'd_field' def method(self, arg: C.field -> C.D.field2: # this is OK ... In the presence of an annotation that cannot be resolved using the current module's globals, a NameError is raised at compile time. Deprecation policy ------------------ In Python 3.7, a ``__future__`` import is required to use the described functionality and a ``PendingDeprecationWarning`` is raised by the compiler in the presence of type annotations in modules without the ``__future__`` import. In Python 3.8 the warning becomes a ``DeprecationWarning``. In the next version this will become the default behavior. Rejected Ideas ============== Keep the ability to use local state when defining annotations ------------------------------------------------------------- With postponed evaluation, this is impossible for function locals. For classes, it would be possible to keep the ability to define annotations using the local scope. However, when using ``eval()`` to perform the postponed evaluation, we need to provide the correct globals and locals to the ``eval()`` call. In the face of nested classes, the routine to get the effective "globals" at definition time would have to look something like this:: def get_class_globals(cls): result = {} result.update(sys.modules[cls.__module__].__dict__) for child in cls.__qualname__.split('.'): result.update(result[child].__dict__) return result This is brittle and doesn't even cover slots. Requiring the use of module-level names simplifies runtime evaluation and provides the "one obvious way" to read annotations. It's the equivalent of absolute imports. Acknowledgements ================ This document could not be completed without valuable input, encouragement and advice from Guido van Rossum, Jukka Lehtosalo, and Ivan Levkivskyi. Copyright ========= This document has been placed in the public domain. .. Local Variables: mode: indented-text indent-tabs-mode: nil sentence-end-double-space: t fill-column: 70 coding: utf-8 End: -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 842 bytes Desc: Message signed with OpenPGP URL: From elazarg at gmail.com Mon Sep 11 12:06:14 2017 From: elazarg at gmail.com (=?UTF-8?B?15DXnNei15bXqA==?=) Date: Mon, 11 Sep 2017 16:06:14 +0000 Subject: [Python-ideas] PEP 563: Postponed Evaluation of Annotations, first draft In-Reply-To: <1176DC03-ACB8-48F4-B8A0-02268B524BE4@langa.pl> References: <1176DC03-ACB8-48F4-B8A0-02268B524BE4@langa.pl> Message-ID: I like it. For previous discussion of this idea see here: https://mail.python.org/pipermail/python-ideas/2016-September/042527.html I don't see this mentioned in the PEP, but it will also allow (easy) description of contracts and dependent types. 
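The runtime half of the proposal can already be exercised today, because PEP
484 string literals take the same path the draft describes for all
annotations: the string lands in __annotations__ and typing.get_type_hints()
evaluates it against the function's globals. A small illustrative sketch (the
Node class is made up here, not taken from the PEP):

    from typing import get_type_hints

    class Node:
        def add_child(self, child: 'Node') -> 'Node':
            # the annotations are stored as the string 'Node'
            return child

    print(Node.add_child.__annotations__)   # {'child': 'Node', 'return': 'Node'}
    print(get_type_hints(Node.add_child))   # both entries resolve to the Node class

Under the proposed __future__ import the quotes could be dropped and the
annotation would still be stored in string form and resolved the same way.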
Elazar On Mon, Sep 11, 2017 at 6:59 PM Lukasz Langa wrote: > PEP: 563 > Title: Postponed Evaluation of Annotations > Version: $Revision$ > Last-Modified: $Date$ > Author: ?ukasz Langa > Discussions-To: Python-Dev > Status: Draft > Type: Standards Track > Content-Type: text/x-rst > Created: 8-Sep-2017 > Python-Version: 3.7 > Post-History: > Resolution: > > > Abstract > ======== > > PEP 3107 introduced syntax for function annotations, but the semantics > were deliberately left undefined. PEP 484 introduced a standard meaning > to annotations: type hints. PEP 526 defined variable annotations, > explicitly tying them with the type hinting use case. > > This PEP proposes changing function annotations and variable annotations > so that they are no longer evaluated at function definition time. > Instead, they are preserved in ``__annotations__`` in string form. > > This change is going to be introduced gradually, starting with a new > ``__future__`` import in Python 3.7. > > > Rationale and Goals > =================== > > PEP 3107 added support for arbitrary annotations on parts of a function > definition. Just like default values, annotations are evaluated at > function definition time. This creates a number of issues for the type > hinting use case: > > * forward references: when a type hint contains names that have not been > defined yet, that definition needs to be expressed as a string > literal; > > * type hints are executed at module import time, which is not > computationally free. > > Postponing the evaluation of annotations solves both problems. > > Non-goals > --------- > > Just like in PEP 484 and PEP 526, it should be emphasized that **Python > will remain a dynamically typed language, and the authors have no desire > to ever make type hints mandatory, even by convention.** > > Annotations are still available for arbitrary use besides type checking. > Using ``@typing.no_type_hints`` in this case is recommended to > disambiguate the use case. > > > Implementation > ============== > > In a future version of Python, function and variable annotations will no > longer be evaluated at definition time. Instead, a string form will be > preserved in the respective ``__annotations__`` dictionary. Static type > checkers will see no difference in behavior, whereas tools using > annotations at runtime will have to perform postponed evaluation. > > If an annotation was already a string, this string is preserved > verbatim. In other cases, the string form is obtained from the AST > during the compilation step, which means that the string form preserved > might not preserve the exact formatting of the source. > > Annotations need to be syntactically valid Python expressions, also when > passed as literal strings (i.e. ``compile(literal, '', 'eval')``). > Annotations can only use names present in the module scope as postponed > evaluation using local names is not reliable. > > Note that as per PEP 526, local variable annotations are not evaluated > at all since they are not accessible outside of the function's closure. > > Enabling the future behavior in Python 3.7 > ------------------------------------------ > > The functionality described above can be enabled starting from Python > 3.7 using the following special import:: > > from __future__ import annotations > > > Resolving Type Hints at Runtime > =============================== > > To resolve an annotation at runtime from its string form to the result > of the enclosed expression, user code needs to evaluate the string. 
> > For code that uses type hints, the ``typing.get_type_hints()`` function > correctly evaluates expressions back from its string form. Note that > all valid code currently using ``__annotations__`` should already be > doing that since a type annotation can be expressed as a string literal. > > For code which uses annotations for other purposes, a regular > ``eval(ann, globals, locals)`` call is enough to resolve the > annotation. The trick here is to get the correct value for globals. > Fortunately, in the case of functions, they hold a reference to globals > in an attribute called ``__globals__``. To get the correct module-level > context to resolve class variables, use:: > > cls_globals = sys.modules[SomeClass.__module__].__dict__ > > Runtime annotation resolution and class decorators > -------------------------------------------------- > > Metaclasses and class decorators that need to resolve annotations for > the current class will fail for annotations that use the name of the > current class. Example:: > > def class_decorator(cls): > annotations = get_type_hints(cls) # raises NameError on 'C' > print(f'Annotations for {cls}: {annotations}') > return cls > > @class_decorator > class C: > singleton: 'C' = None > > This was already true before this PEP. The class decorator acts on > the class before it's assigned a name in the current definition scope. > > The situation is made somewhat stricter when class-level variables are > considered. Previously, when the string form wasn't used in annotations, > a class decorator would be able to cover situations like:: > > @class_decorator > class Restaurant: > class MenuOption(Enum): > SPAM = 1 > EGGS = 2 > > default_menu: List[MenuOption] = [] > > This is no longer possible. > > Runtime annotation resolution and ``TYPE_CHECKING`` > --------------------------------------------------- > > Sometimes there's code that must be seen by a type checker but should > not be executed. For such situations the ``typing`` module defines a > constant, ``TYPE_CHECKING``, that is considered ``True`` during type > checking but ``False`` at runtime. Example:: > > import typing > > if typing.TYPE_CHECKING: > import expensive_mod > > def a_func(arg: expensive_mod.SomeClass) -> None: > a_var: expensive_mod.SomeClass = arg > ... > > This approach is also useful when handling import cycles. > > Trying to resolve annotations of ``a_func`` at runtime using > ``typing.get_type_hints()`` will fail since the name ``expensive_mod`` > is not defined (``TYPE_CHECKING`` variable being ``False`` at runtime). > This was already true before this PEP. > > > Backwards Compatibility > ======================= > > This is a backwards incompatible change. Applications depending on > arbitrary objects to be directly present in annotations will break > if they are not using ``typing.get_type_hints()`` or ``eval()``. > > Annotations that depend on locals at the time of the function/class > definition are now invalid. Example:: > > def generate_class(): > some_local = datetime.datetime.now() > class C: > field: some_local = 1 # NOTE: INVALID ANNOTATION > def method(self, arg: some_local.day) -> None: # NOTE: > INVALID ANNOTATION > ... > > Annotations using nested classes and their respective state are still > valid, provided they use the fully qualified name. Example:: > > class C: > field = 'c_field' > def method(self, arg: C.field) -> None: # this is OK > ... > > class D: > field2 = 'd_field' > def method(self, arg: C.field -> C.D.field2: # this is OK > ... 
> > In the presence of an annotation that cannot be resolved using the > current module's globals, a NameError is raised at compile time. > > > Deprecation policy > ------------------ > > In Python 3.7, a ``__future__`` import is required to use the described > functionality and a ``PendingDeprecationWarning`` is raised by the > compiler in the presence of type annotations in modules without the > ``__future__`` import. In Python 3.8 the warning becomes a > ``DeprecationWarning``. In the next version this will become the > default behavior. > > > Rejected Ideas > ============== > > Keep the ability to use local state when defining annotations > ------------------------------------------------------------- > > With postponed evaluation, this is impossible for function locals. For > classes, it would be possible to keep the ability to define annotations > using the local scope. However, when using ``eval()`` to perform the > postponed evaluation, we need to provide the correct globals and locals > to the ``eval()`` call. In the face of nested classes, the routine to > get the effective "globals" at definition time would have to look > something like this:: > > def get_class_globals(cls): > result = {} > result.update(sys.modules[cls.__module__].__dict__) > for child in cls.__qualname__.split('.'): > result.update(result[child].__dict__) > return result > > This is brittle and doesn't even cover slots. Requiring the use of > module-level names simplifies runtime evaluation and provides the > "one obvious way" to read annotations. It's the equivalent of absolute > imports. > > > Acknowledgements > ================ > > This document could not be completed without valuable input, > encouragement and advice from Guido van Rossum, Jukka Lehtosalo, and > Ivan Levkivskyi. > > > Copyright > ========= > > This document has been placed in the public domain. > > > > .. > Local Variables: > mode: indented-text > indent-tabs-mode: nil > sentence-end-double-space: t > fill-column: 70 > coding: utf-8 > End: > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rymg19 at gmail.com Mon Sep 11 13:16:53 2017 From: rymg19 at gmail.com (Ryan Gonzalez) Date: Mon, 11 Sep 2017 12:16:53 -0500 Subject: [Python-ideas] PEP 563: Postponed Evaluation of Annotations, first draft In-Reply-To: <1176DC03-ACB8-48F4-B8A0-02268B524BE4@langa.pl> References: <1176DC03-ACB8-48F4-B8A0-02268B524BE4@langa.pl> Message-ID: One thing I want to point out: there are a lot of really useful Python libraries that have come to rely on annotations being objects, ranging from plac to fbuild to many others. I could understand something that delays the evaluation of annotations until they are accessed, but this seems really extreme. On Mon, Sep 11, 2017 at 10:58 AM, Lukasz Langa wrote: > PEP: 563 > Title: Postponed Evaluation of Annotations > Version: $Revision$ > Last-Modified: $Date$ > Author: ?ukasz Langa > Discussions-To: Python-Dev > Status: Draft > Type: Standards Track > Content-Type: text/x-rst > Created: 8-Sep-2017 > Python-Version: 3.7 > Post-History: > Resolution: > > > Abstract > ======== > > PEP 3107 introduced syntax for function annotations, but the semantics > were deliberately left undefined. PEP 484 introduced a standard meaning > to annotations: type hints. 
PEP 526 defined variable annotations, > explicitly tying them with the type hinting use case. > > This PEP proposes changing function annotations and variable annotations > so that they are no longer evaluated at function definition time. > Instead, they are preserved in ``__annotations__`` in string form. > > This change is going to be introduced gradually, starting with a new > ``__future__`` import in Python 3.7. > > > Rationale and Goals > =================== > > PEP 3107 added support for arbitrary annotations on parts of a function > definition. Just like default values, annotations are evaluated at > function definition time. This creates a number of issues for the type > hinting use case: > > * forward references: when a type hint contains names that have not been > defined yet, that definition needs to be expressed as a string > literal; > > * type hints are executed at module import time, which is not > computationally free. > > Postponing the evaluation of annotations solves both problems. > > Non-goals > --------- > > Just like in PEP 484 and PEP 526, it should be emphasized that **Python > will remain a dynamically typed language, and the authors have no desire > to ever make type hints mandatory, even by convention.** > > Annotations are still available for arbitrary use besides type checking. > Using ``@typing.no_type_hints`` in this case is recommended to > disambiguate the use case. > > > Implementation > ============== > > In a future version of Python, function and variable annotations will no > longer be evaluated at definition time. Instead, a string form will be > preserved in the respective ``__annotations__`` dictionary. Static type > checkers will see no difference in behavior, whereas tools using > annotations at runtime will have to perform postponed evaluation. > > If an annotation was already a string, this string is preserved > verbatim. In other cases, the string form is obtained from the AST > during the compilation step, which means that the string form preserved > might not preserve the exact formatting of the source. > > Annotations need to be syntactically valid Python expressions, also when > passed as literal strings (i.e. ``compile(literal, '', 'eval')``). > Annotations can only use names present in the module scope as postponed > evaluation using local names is not reliable. > > Note that as per PEP 526, local variable annotations are not evaluated > at all since they are not accessible outside of the function's closure. > > Enabling the future behavior in Python 3.7 > ------------------------------------------ > > The functionality described above can be enabled starting from Python > 3.7 using the following special import:: > > from __future__ import annotations > > > Resolving Type Hints at Runtime > =============================== > > To resolve an annotation at runtime from its string form to the result > of the enclosed expression, user code needs to evaluate the string. > > For code that uses type hints, the ``typing.get_type_hints()`` function > correctly evaluates expressions back from its string form. Note that > all valid code currently using ``__annotations__`` should already be > doing that since a type annotation can be expressed as a string literal. > > For code which uses annotations for other purposes, a regular > ``eval(ann, globals, locals)`` call is enough to resolve the > annotation. The trick here is to get the correct value for globals. 
> Fortunately, in the case of functions, they hold a reference to globals > in an attribute called ``__globals__``. To get the correct module-level > context to resolve class variables, use:: > > cls_globals = sys.modules[SomeClass.__module__].__dict__ > > Runtime annotation resolution and class decorators > -------------------------------------------------- > > Metaclasses and class decorators that need to resolve annotations for > the current class will fail for annotations that use the name of the > current class. Example:: > > def class_decorator(cls): > annotations = get_type_hints(cls) # raises NameError on 'C' > print(f'Annotations for {cls}: {annotations}') > return cls > > @class_decorator > class C: > singleton: 'C' = None > > This was already true before this PEP. The class decorator acts on > the class before it's assigned a name in the current definition scope. > > The situation is made somewhat stricter when class-level variables are > considered. Previously, when the string form wasn't used in annotations, > a class decorator would be able to cover situations like:: > > @class_decorator > class Restaurant: > class MenuOption(Enum): > SPAM = 1 > EGGS = 2 > > default_menu: List[MenuOption] = [] > > This is no longer possible. > > Runtime annotation resolution and ``TYPE_CHECKING`` > --------------------------------------------------- > > Sometimes there's code that must be seen by a type checker but should > not be executed. For such situations the ``typing`` module defines a > constant, ``TYPE_CHECKING``, that is considered ``True`` during type > checking but ``False`` at runtime. Example:: > > import typing > > if typing.TYPE_CHECKING: > import expensive_mod > > def a_func(arg: expensive_mod.SomeClass) -> None: > a_var: expensive_mod.SomeClass = arg > ... > > This approach is also useful when handling import cycles. > > Trying to resolve annotations of ``a_func`` at runtime using > ``typing.get_type_hints()`` will fail since the name ``expensive_mod`` > is not defined (``TYPE_CHECKING`` variable being ``False`` at runtime). > This was already true before this PEP. > > > Backwards Compatibility > ======================= > > This is a backwards incompatible change. Applications depending on > arbitrary objects to be directly present in annotations will break > if they are not using ``typing.get_type_hints()`` or ``eval()``. > > Annotations that depend on locals at the time of the function/class > definition are now invalid. Example:: > > def generate_class(): > some_local = datetime.datetime.now() > class C: > field: some_local = 1 # NOTE: INVALID ANNOTATION > def method(self, arg: some_local.day) -> None: # NOTE: INVALID ANNOTATION > ... > > Annotations using nested classes and their respective state are still > valid, provided they use the fully qualified name. Example:: > > class C: > field = 'c_field' > def method(self, arg: C.field) -> None: # this is OK > ... > > class D: > field2 = 'd_field' > def method(self, arg: C.field -> C.D.field2: # this is OK > ... > > In the presence of an annotation that cannot be resolved using the > current module's globals, a NameError is raised at compile time. > > > Deprecation policy > ------------------ > > In Python 3.7, a ``__future__`` import is required to use the described > functionality and a ``PendingDeprecationWarning`` is raised by the > compiler in the presence of type annotations in modules without the > ``__future__`` import. In Python 3.8 the warning becomes a > ``DeprecationWarning``. 
In the next version this will become the > default behavior. > > > Rejected Ideas > ============== > > Keep the ability to use local state when defining annotations > ------------------------------------------------------------- > > With postponed evaluation, this is impossible for function locals. For > classes, it would be possible to keep the ability to define annotations > using the local scope. However, when using ``eval()`` to perform the > postponed evaluation, we need to provide the correct globals and locals > to the ``eval()`` call. In the face of nested classes, the routine to > get the effective "globals" at definition time would have to look > something like this:: > > def get_class_globals(cls): > result = {} > result.update(sys.modules[cls.__module__].__dict__) > for child in cls.__qualname__.split('.'): > result.update(result[child].__dict__) > return result > > This is brittle and doesn't even cover slots. Requiring the use of > module-level names simplifies runtime evaluation and provides the > "one obvious way" to read annotations. It's the equivalent of absolute > imports. > > > Acknowledgements > ================ > > This document could not be completed without valuable input, > encouragement and advice from Guido van Rossum, Jukka Lehtosalo, and > Ivan Levkivskyi. > > > Copyright > ========= > > This document has been placed in the public domain. > > > > .. > Local Variables: > mode: indented-text > indent-tabs-mode: nil > sentence-end-double-space: t > fill-column: 70 > coding: utf-8 > End: > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- Ryan (????) Yoko Shimomura, ryo (supercell/EGOIST), Hiroyuki Sawano >> everyone else http://refi64.com/ From c at anthonyrisinger.com Mon Sep 11 13:29:03 2017 From: c at anthonyrisinger.com (C Anthony Risinger) Date: Mon, 11 Sep 2017 12:29:03 -0500 Subject: [Python-ideas] lazy import via __future__ or compiler analysis In-Reply-To: <20170908163604.xxp4hd7idi3eboum@python.ca> References: <20170907174402.ocdy3zumzdfj7xg6@python.ca> <20170908163604.xxp4hd7idi3eboum@python.ca> Message-ID: On Fri, Sep 8, 2017 at 11:36 AM, Neil Schemenauer < nas-python-ideas at arctrix.com> wrote: > On 2017-09-09, Chris Angelico wrote: > > Laziness has to be complete - or, looking the other way, eager > > importing is infectious. For foo to be lazy, bar also has to be lazy; > > Not with the approach I'm proposing. bar will be loaded in non-lazy > fashion at the right time, foo can still be lazy. > I'll bring the the conversation back here instead of co-opting the PEP 562 thread. On Sun, Sep 10, 2017 at 2:45 PM, Neil Schemenauer wrote: > > I think the key is to make exec(code, module) work as an alternative > to exec(code, module.__dict). That allows module singleton classes > to define properties and use __getattr__. The > LOAD_NAME/STORE_NAME/DELETE_NAME opcodes need to be tweaked to > handle this. There will be a slight performance cost. Modules that > don't opt-in will not pay. > I'm not sure I follow the `exec(code, module)` part from the other thread. `exec` needs a dict to exec code into, the import protocol expects you to exec code into a module.__dict__, and even the related type.__prepare__ requires a dict so it can `exec` the class body there. Code wants a dict so functions created by the code string can bind it to function.__globals__. 
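For concreteness, here is a rough, self-contained sketch of what the import machinery effectively does today (the module name and file are made up for the example):

```
import importlib.util
import pathlib
import tempfile

# Write a throwaway module so the sketch can actually run.
path = pathlib.Path(tempfile.gettempdir()) / "example_mod.py"
path.write_text("X = 1\ndef f():\n    return X\n")

spec = importlib.util.spec_from_file_location("example_mod", str(path))
module = importlib.util.module_from_spec(spec)
code = spec.loader.get_code("example_mod")

# spec.loader.exec_module(module) does essentially this internally:
exec(code, module.__dict__)

# ...and functions created by that code bind the module's dict as __globals__:
assert module.f.__globals__ is module.__dict__
```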
How do you handle lazy loading when a defined function requests a global via LOAD_NAME? Are you suggesting to change function.__globals__ to something not-a-dict, and/or change LOAD_NAME to bypass function.__globals__ and instead do something like: getattr(sys.modules[function.__globals__['__name__']], lazy_identifier) ? All this chatter about modifying opcodes, adding future statements, lazy module opt-in mechanisms, special handling of __init__ or __getattr__ or SOME_CONSTANT suggesting modules-are-almost-a-class-but-not-quite feel like an awful lot of work to me, adding even more cognitive load to an already massively complex import system. They seem to make modules even less like other objects or types. It would be really *really* nice if ModuleType got closer to being a simple class, instead of farther away. Maybe we start treating new modules like a subclass of ModuleType instead of all the half-way or special case solutions... HEAR ME OUT :-) Demo below. (also appended to end) https://gist.github.com/anthonyrisinger/b04f40a3611fd7cde10eed6bb68e8824 ``` # from os.path import realpath as rpath # from spam.ham import eggs, sausage as saus # print(rpath) # print(rpath('.')) # print(saus) $ python deferred_namespace.py /home/anthony/devel/deferred_namespace Traceback (most recent call last): File "deferred_namespace.py", line 73, in class ModuleType(metaclass=MetaModuleType): File "deferred_namespace.py", line 88, in ModuleType print(saus) File "deferred_namespace.py", line 48, in __missing__ resolved = deferred.__import__() File "deferred_namespace.py", line 9, in __import__ module = __import__(*self.args) ModuleNotFoundError: No module named 'spam' ``` Lazy-loading can be achieved by giving modules a __dict__ namespace that is import-aware. This parallels heavily with classes using __prepare__ to make their namespace order-aware (ignore the fact they are now order-aware by default). What if we brought the two closer together? I feel like the python object data model already has all the tools we need. The above uses __prepare__ and a module metaclass, but it could also use a custom __dict__ descriptor for ModuleType that returns an import-aware namespace (like DeferredImportNamespace in my gist). Or ModuleType.__new__ can reassign its own __dict__ (currently read-only). In all these cases we only need to make 2 small changes to Python: * Change `__import__` to call `globals.__defer__` (or similar) when appropriate instead of importing. * Create a way to make a non-binding class type so `module.function.__get__` doesn't create a bound method. The metaclass path also opens the door for passing keyword arguments to __prepare__ and __new__: from spam.ham import eggs using methods: True ... which might mean: GeneratedModuleClassName(ModuleType, methods=True): # module code ... # methods=True passed to __prepare__ and __new__, # allowing the module to implement bound methods! ... or even: import . import CustomMetaModule from spam.ham import ( eggs, sausage as saus, ) via CustomMetaModule using { methods: True, other: feature, } ... which might mean: GeneratedModuleClassName(ModuleType, metaclass=CustomMetaModule, methods=True, other=feature): # module code ... Making modules work like a real type/class means we we get __init__, __getattr__, and every other __*__ method *for free*, especially when combined with an extension to the import protocol allowing methods=True (or similar, like above). 
We could even subclass the namespace for each module, allowing us to effectively revert the module's __dict__ to a normal dict, and completely remove any possible overhead. Python types are powerful, let's do more of them! At the end of the day, I believe we should strive for these 3 things: * MUST work with function.__globals__[deferred], module.__dict__[deferred], and module.deferred. * SHOULD bring modules closer to normal objects, and maybe accept the fact they are more like class defe * SHOULD NOT require opt-in! Virtually every existing module will work fine. Thanks, ```python class Import: def __init__(self, args, attr): self.args = args self.attr = attr self.done = False def __import__(self): module = __import__(*self.args) if not self.attr: return module try: return getattr(module, self.attr) except AttributeError as e: raise ImportError(f'getattr({module!r}, {self.attr!r})') from e class DeferredImportNamespace(dict): def __init__(self, *args, **kwds): super().__init__(*args, **kwds) self.deferred = {} def __defer__(self, args, *names): # If __import__ is called and globals.__defer__() is defined, names to # bind are non-empty, each name is either missing from globals.deferred # or still marked done=False, then it should call: # # globals.__defer__(args, *names) # # where `args` are the original arguments and `names` are the bindings: # # from spam.ham import eggs, sausage as saus # __defer__(('spam.ham', self, self, ['eggs', 'sausage'], 0), 'eggs', 'saus') # # Records the import and what names would have been used. for i, name in enumerate(names): if name not in self.deferred: attr = args[3][i] if args[3] else None self.deferred[name] = Import(args, attr) def __missing__(self, name): # Raise KeyError if not a deferred import. deferred = self.deferred[name] try: # Replay original __import__ call. resolved = deferred.__import__() except KeyError as e: # KeyError -> ImportError so it's not swallowed by __missing__. raise ImportError(f'{name} = __import__{deferred.args}') from e else: # TODO: Still need a way to avoid binds... or maybe opt-in? # # from spam.ham import eggs, sausage using methods=True # # Save the import to namespace! self[name] = resolved finally: # Set after import to avoid recursion. deferred.done = True # Return import to original requestor. return resolved class MetaModuleType(type): @classmethod def __prepare__(cls, name, bases, defer=True, **kwds): return DeferredImportNamespace() if defer else {} class ModuleType(metaclass=MetaModuleType): # Simulate what we want to happen in a module block! __defer__ = locals().__defer__ # from os.path import realpath as rpath __defer__(('os.path', locals(), locals(), ['realpath'], 0), 'rpath') # from spam.ham import eggs, sausage as saus __defer__(('spam.ham', locals(), locals(), ['eggs', 'sausage'], 0), 'eggs', 'saus') # Good import. print(rpath) print(rpath('.')) # Bad import. print(saus) ``` -- C Anthony -------------- next part -------------- An HTML attachment was scrubbed... URL: From jcrmatos at gmail.com Mon Sep 11 13:32:18 2017 From: jcrmatos at gmail.com (=?UTF-8?Q?Jo=c3=a3o_Matos?=) Date: Mon, 11 Sep 2017 18:32:18 +0100 Subject: [Python-ideas] Give nonlocal the same creating power as global In-Reply-To: References: <89c78309-d807-8d76-2e6f-b3ef0ff29c35@gmail.com> Message-ID: <5c49346e-198e-8348-5107-9b16d321ea20@gmail.com> Hello, You're correct. The idea is to give nonlocal the same ability, redirect subsequent bindings if the variable doesn't exist. 
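For context, a small runnable illustration of the current asymmetry being discussed:

    def outer():
        counter = 0            # the name must already be bound in the enclosing scope
        def inner():
            nonlocal counter   # redirects the binding below to outer's 'counter'
            counter += 1
        inner()
        return counter

    print(outer())             # -> 1

    # Without 'counter = 0' above, the 'nonlocal counter' declaration is
    # rejected when the enclosing code is compiled:
    #     SyntaxError: no binding for nonlocal 'counter' found
    # whereas 'global counter' in a function body simply creates the
    # module-level name on first assignment.  The proposal is to give
    # nonlocal that same creating behaviour.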
No, what I said is that it would only create the var if it didn't exist. That means that the current behaviour of nonlocal to check all previous envs (except global) would be the same. The variable would only be created if it ididn't exist in all parent envs. The downside of the global example is that it is "poluting" the globals. The reason it is inside a function is that it is called from another module. The main module does some checks and only if the checks are ok it calls the gui module. Best regards, Jo?o Matos On 11-09-2017 16:51, Terry Reedy wrote: > On 9/11/2017 10:03 AM, Jo?o Matos wrote: >> Hello, >> >> I would like to suggest that nonlocal should be given the same >> creating power as global. >> If I do global a_var it creates the global a_var if it doesn't exist. > > The global declaration does not create anything, but it redirects > subsequent binding. > >> I think it would be great that nonlocal maintained that power. >> >> This way when I do nonlocal a_var >> it would create a_var in the imediate parent environment, if it >> didn't exist. > > 'Creating new variables' was discussed and rejected when nonlocal was > added.? That may partly be for technical reasons of not nonlocal is > implemented.? But there are also problems of ambiguity.? Consider this > currently legal code. > > def f(a): > ??? def g(): pass > ??????? def h(): > ??????????? nonlocal a > ??????????? a = 1 > > You proposal would break all such usages that depend on skipping the > immediate parent environment.? 'nonlocal a' effectively means 'find > the closest function scope with local name a' and I strongly doubt we > will change that. If you want 'nonlocal a' to bind in g, explicitly > add a to g's locals, such as with 'a = None'. > >> Without nonlocal creation powers I have to create global variables or >> local variables after master=Tk() (in the following example): > > There is nothing wrong with either. > >> from tkinter import StringVar, Tk >> from tkinter.ttk import Label >> >> >> def start_gui(): >> ???? def change_label(): >> ???????? _label_sv.set('Bye Bye') >> >> ???? def create_vars(): >> ???????? global _label_sv >> >> ???????? _label_sv = StringVar(value='Hello World') >> >> ???? def create_layout(): >> ???????? Label(master, textvariable=_label_sv).grid() >> >> ???? def create_bindings(): >> ???????? master.bind('', lambda _: master.destroy()) >> ???????? master.bind('', lambda _: change_label()) >> >> ???? master = Tk() >> >> ???? create_vars() >> ???? create_layout() >> ???? create_bindings() >> >> ???? master.mainloop() >> >> if __name__ == '__main__': >> ???? start_gui() > > In the version above, you could simplify by removing start_gui and put > the operative code from 'master = Tk()' on down in the main clause. > This is standard practice for non-OOP tkinter code. > >> With nonlocal creation powers it would become a start_gui local >> variable (no global) but I could have a function to create the vars >> instead of having to add them after master=Tk(): >> >> from tkinter import StringVar, Tk >> from tkinter.ttk import Label >> >> >> def start_gui(): >> ???? def change_label(): >> ???????? label_sv.set('Bye Bye') >> >> ???? def create_vars(): >> ???????? nonlocal label_sv >> ???????? label_sv = StringVar(value='Hello World') >> >> ???? def create_layout(): >> ???????? Label(master, textvariable=label_sv).grid() >> >> ???? def create_bindings(): >> ???????? master.bind('', lambda _: master.destroy()) >> ???????? master.bind('', lambda _: change_label()) >> >> ???? master = Tk() >> >> ???? 
create_vars() >> ???? create_layout() >> ???? create_bindings() >> >> ???? master.mainloop() >> >> >> if __name__ == '__main__': >> ???? start_gui() > > Initializing the outer function local, here adding 'label_sv = None', > is the price of wanting to create a class with functions instead of a > class definition. > >> I know that I could also do it with OOP, but this way is more concise >> (OOP would add more lines and increase the lines length, which I >> personally dislike) > >> This example is very simple, but if you imagine a GUI with several >> widgets, then the separation between vars, layout and bindings >> becomes useful for code organization. > > This is what classes are for.? Either use 'class' or explicitly name > the local of the outer function acting as a class. > From steve at pearwood.info Mon Sep 11 13:57:33 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 12 Sep 2017 03:57:33 +1000 Subject: [Python-ideas] PEP 563: Postponed Evaluation of Annotations, first draft In-Reply-To: References: <1176DC03-ACB8-48F4-B8A0-02268B524BE4@langa.pl> Message-ID: <20170911175733.GT13110@ando.pearwood.info> On Mon, Sep 11, 2017 at 04:06:14PM +0000, ????? wrote: > I like it. For previous discussion of this idea see here: > https://mail.python.org/pipermail/python-ideas/2016-September/042527.html > > I don't see this mentioned in the PEP, but it will also allow (easy) > description of contracts and dependent types. How? You may be using a different meaning to the word "contract" than I'm familiar with. I'm thinking about Design By Contract, where the contracts are typically much more powerful than mere type checks, e.g. a contract might state that the argument is float between 0 and 1, or that the return result is datetime object in the future. There are (at least?) three types of contracts: preconditions, which specify the arguments, postconditions, which specify the return result, and invariants, which specify what doesn't change. I don't see how you can specify contracts in a single type annotation. -- Steve From nas-python-ideas at arctrix.com Mon Sep 11 14:09:27 2017 From: nas-python-ideas at arctrix.com (Neil Schemenauer) Date: Mon, 11 Sep 2017 12:09:27 -0600 Subject: [Python-ideas] lazy import via __future__ or compiler analysis In-Reply-To: References: <20170907174402.ocdy3zumzdfj7xg6@python.ca> <20170908163604.xxp4hd7idi3eboum@python.ca> Message-ID: <20170911180927.hlgtzaph56mygsco@python.ca> On 2017-09-11, C Anthony Risinger wrote: > I'm not sure I follow the `exec(code, module)` part from the other thread. > `exec` needs a dict to exec code into [..] [..] > How do you handle lazy loading when a defined function requests a global > via LOAD_NAME? Are you suggesting to change function.__globals__ to > something not-a-dict, and/or change LOAD_NAME to bypass > function.__globals__ and instead do something like: I propose to make function.__namespace__ be a module (or other namespace object). function.__globals__ would be a property that calls vars(function.__namespace__). Implementing this is a lot of work, need to fix LOAD_NAME, LOAD_GLOBAL and a whole heap of other things. I have a partly done proof-of-concept implementation. It crashes immediately on Python startup at this point but so far I have not seen any insurmountable issues. Doing it while perserving backwards compatibility will be a challenge. Doing it without losing performance (LOAD_GLOBAL using the fact that f_globals is an honest 'dict') is also hard. It this point, I think there is a chance we can do it. 
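As a rough pure-Python stand-in for the proposed relationship (``__namespace__`` here is the proposed attribute, not an existing one; real functions would get this behaviour in C):

    import sys

    class FunctionLike:
        def __init__(self, namespace):
            self.__namespace__ = namespace      # e.g. a module object

        @property
        def __globals__(self):
            return vars(self.__namespace__)     # always reflects the namespace

    f = FunctionLike(sys.modules[__name__])
    assert f.__globals__ is vars(sys.modules[__name__])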
It is a conceptual simplification of Python that gives the language more consistency and more power. > All this chatter about modifying opcodes, adding future statements, lazy > module opt-in mechanisms, special handling of __init__ or __getattr__ or > SOME_CONSTANT suggesting modules-are-almost-a-class-but-not-quite feel like > an awful lot of work to me, adding even more cognitive load to an already > massively complex import system. They seem to make modules even less like > other objects or types. I disagree. It would make for less cognitive load as LOAD_ATTR would be very simlar to LOAD_NAME/LOAD_GLOBAL. It makes modules *more* like other objects and types. I'm busy with "real work" this week and so can't follow the discussion closely or work on my proof-of-concept prototype. I hope we can come up with an elegant solution and not some special hack just to make module properties work. Regards, Neil From elazarg at gmail.com Mon Sep 11 14:31:46 2017 From: elazarg at gmail.com (=?UTF-8?B?15DXnNei15bXqA==?=) Date: Mon, 11 Sep 2017 18:31:46 +0000 Subject: [Python-ideas] PEP 563: Postponed Evaluation of Annotations, first draft In-Reply-To: <20170911175733.GT13110@ando.pearwood.info> References: <1176DC03-ACB8-48F4-B8A0-02268B524BE4@langa.pl> <20170911175733.GT13110@ando.pearwood.info> Message-ID: On Mon, Sep 11, 2017 at 8:58 PM Steven D'Aprano wrote: > On Mon, Sep 11, 2017 at 04:06:14PM +0000, ????? wrote: > > I like it. For previous discussion of this idea see here: > > > https://mail.python.org/pipermail/python-ideas/2016-September/042527.html > > > > I don't see this mentioned in the PEP, but it will also allow (easy) > > description of contracts and dependent types. > > How? You may be using a different meaning to the word "contract" than I'm familiar with. I'm thinking about Design By Contract, where the > contracts are typically much more powerful than mere type checks, e.g. a > contract might state that the argument is float between 0 and 1, def f(x: float and (0 <= x <= 1)) -> float: ... > or that the return result is datetime object in the future. There are (at > least?) three types of contracts: preconditions, which specify the > arguments, Exemplified above > postconditions, which specify the return result, def f(x: int, y: int) -> ret < y: # ret being an (ugly) convention, unknown to python ... > and invariants, which specify what doesn't change. > class A: x: x != 0 y: y > x def foo(self): ... Of course I'm not claiming my specific examples are useful or readable. I'm also not claiming anything about the ability to check or enforce it; it's merely about making python more friendly to 3rd party tool support. You didn't ask about dependent types, but an example is in order: def zip(*x: Tuple[List[T], _n]) -> List[Tuple[T, _n]]: ... # _n being a binding occurrence, again not something the interpreter should know Basically dependent types, like other types, are just a restricted form of contracts. Elazar -------------- next part -------------- An HTML attachment was scrubbed... URL: From lukasz at langa.pl Mon Sep 11 15:07:20 2017 From: lukasz at langa.pl (Lukasz Langa) Date: Mon, 11 Sep 2017 15:07:20 -0400 Subject: [Python-ideas] PEP 563: Postponed Evaluation of Annotations, first draft In-Reply-To: References: <1176DC03-ACB8-48F4-B8A0-02268B524BE4@langa.pl> <20170911175733.GT13110@ando.pearwood.info> Message-ID: <48F8143B-23C9-49EA-921C-C991EC471373@langa.pl> This is off topic for discussion of this PEP. 
It would require another one (essentially an extension of PEP 484) to get passed for your idea to be standardized. For now, I don't want to distract reviewers from conflating PEP 563 with all the possible wonderful or horrible ways people can potentially extend type hints with. The biggest achievement of PEP 484 is creating a _standard_ syntax for typing in Python that other tools can embrace. I want to be very explicit that in no way should PEP 563 be viewed as a gateway to custom extensions that are going to go against the agreed standard. Further evolution of PEP 484 is possible (as exemplified by PEP 526) but even though PEP 563 does create several opportunities for nicer syntax, this is off topic for the time being. - ? > On Sep 11, 2017, at 2:31 PM, ??????? wrote: > > > > On Mon, Sep 11, 2017 at 8:58 PM Steven D'Aprano > wrote: > On Mon, Sep 11, 2017 at 04:06:14PM +0000, ????? wrote: > > I like it. For previous discussion of this idea see here: > > https://mail.python.org/pipermail/python-ideas/2016-September/042527.html > > > > I don't see this mentioned in the PEP, but it will also allow (easy) > > description of contracts and dependent types. > > How? You may be using a different meaning to the word "contract" than I'm > familiar with. I'm thinking about Design By Contract, where the > contracts are typically much more powerful than mere type checks, e.g. a > contract might state that the argument is float between 0 and 1, > > def f(x: float and (0 <= x <= 1)) -> float: ... > > or that the return result is datetime object in the future. There are (at > least?) three types of contracts: preconditions, which specify the > arguments, > > Exemplified above > > postconditions, which specify the return result, > > def f(x: int, y: int) -> ret < y: # ret being an (ugly) convention, unknown to python > ... > > and invariants, which specify what doesn't change. > > class A: > x: x != 0 > y: y > x > def foo(self): ... > > Of course I'm not claiming my specific examples are useful or readable. I'm also not claiming anything about the ability to check or enforce it; it's merely about making python more friendly to 3rd party tool support. > > You didn't ask about dependent types, but an example is in order: > > def zip(*x: Tuple[List[T], _n]) -> List[Tuple[T, _n]]: ... # _n being a binding occurrence, again not something the interpreter should know > > Basically dependent types, like other types, are just a restricted form of contracts. > > Elazar > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 842 bytes Desc: Message signed with OpenPGP URL: From c at anthonyrisinger.com Mon Sep 11 15:06:47 2017 From: c at anthonyrisinger.com (C Anthony Risinger) Date: Mon, 11 Sep 2017 14:06:47 -0500 Subject: [Python-ideas] lazy import via __future__ or compiler analysis In-Reply-To: <20170911180927.hlgtzaph56mygsco@python.ca> References: <20170907174402.ocdy3zumzdfj7xg6@python.ca> <20170908163604.xxp4hd7idi3eboum@python.ca> <20170911180927.hlgtzaph56mygsco@python.ca> Message-ID: On Mon, Sep 11, 2017 at 1:09 PM, Neil Schemenauer < nas-python-ideas at arctrix.com> wrote: > On 2017-09-11, C Anthony Risinger wrote: > > I'm not sure I follow the `exec(code, module)` part from the other > thread. > > `exec` needs a dict to exec code into [..] > [..] > > How do you handle lazy loading when a defined function requests a global > > via LOAD_NAME? Are you suggesting to change function.__globals__ to > > something not-a-dict, and/or change LOAD_NAME to bypass > > function.__globals__ and instead do something like: > > I propose to make function.__namespace__ be a module (or other > namespace object). function.__globals__ would be a property that > calls vars(function.__namespace__). > Oh interesting, I kinda like that. > Doing it while perserving backwards compatibility will be a > challenge. Doing it without losing performance (LOAD_GLOBAL using > the fact that f_globals is an honest 'dict') is also hard. It this > point, I think there is a chance we can do it. It is a conceptual > simplification of Python that gives the language more consistency > and more power. > I do agree it makes module access more uniform if both defined functions and normal code end up effectively calling getattr(...), instead of directly reaching into __dict__. > > All this chatter about modifying opcodes, adding future statements, lazy > > module opt-in mechanisms, special handling of __init__ or __getattr__ or > > SOME_CONSTANT suggesting modules-are-almost-a-class-but-not-quite feel > like > > an awful lot of work to me, adding even more cognitive load to an already > > massively complex import system. They seem to make modules even less like > > other objects or types. > > I disagree. It would make for less cognitive load as LOAD_ATTR > would be very simlar to LOAD_NAME/LOAD_GLOBAL. It makes modules > *more* like other objects and types. > I'm not sure about this though. Anything that special cases dunder methods to sort of look like their counter part on types, eg. __init__ or __getattr__ or __getattribute__ or whatever else, is a hack to me. The only way I see to remedy this discrepancy is to make modules a real subclass of ModuleType, giving them full access to the power of the type system: ``` DottedModuleName(ModuleType, bound_methods=False): # something like this: # sys.modules[__class__.__name__] = __class__._proxy_during_import() ??? # ... module code here ... sys.modules[DottedModuleName.__name__] = DottedModuleName(DottedModuleName.__name__, DottedModuleName.__doc__) ``` I've done this a few times in the past, and it works even better on python3 (python2 function.__globals__ didn't trigger __missing__ IIRC). I guess all I'm getting at, is can we find a way to make modules a real type? So dunder methods are activated? This would make modules phenomenally powerful instead of just a namespace (or resorting to after the fact __class__ reassignment hacks). 
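For reference, the "after the fact" reassignment mentioned above already works today; a minimal sketch, with a made-up lazily computed attribute:

```
import sys
import types

class _ModuleWithDunders(types.ModuleType):
    def __getattr__(self, name):
        if name == "answer":            # made-up attribute, stands in for a lazy import
            value = 42
            setattr(self, name, value)  # cache it so __getattr__ is not hit again
            return value
        raise AttributeError(name)

# Reassigning a module's __class__ works since Python 3.5.
sys.modules[__name__].__class__ = _ModuleWithDunders
```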
> I'm busy with "real work" this week and so can't follow the > discussion closely or work on my proof-of-concept prototype. I hope > we can come up with an elegant solution and not some special hack > just to make module properties work. Agree, and same, but take a look at what I posted prior. I have a ton of interest around lazy/deferred module loading, have made it work a few times in a couple ways, and am properly steeping in import lore. I have bandwidth to work towards a goal that gives modules full access to dunder methods. I'll also try to properly patch Python in the way I described. Ultimately I want deferred loading everywhere, even if it means modules can't do all the other things types can do. I'm less concerned with how we get there :-) -- C Anthony -------------- next part -------------- An HTML attachment was scrubbed... URL: From elazarg at gmail.com Mon Sep 11 15:20:26 2017 From: elazarg at gmail.com (=?UTF-8?B?15DXnNei15bXqA==?=) Date: Mon, 11 Sep 2017 19:20:26 +0000 Subject: [Python-ideas] PEP 563: Postponed Evaluation of Annotations, first draft In-Reply-To: <48F8143B-23C9-49EA-921C-C991EC471373@langa.pl> References: <1176DC03-ACB8-48F4-B8A0-02268B524BE4@langa.pl> <20170911175733.GT13110@ando.pearwood.info> <48F8143B-23C9-49EA-921C-C991EC471373@langa.pl> Message-ID: On Mon, Sep 11, 2017 at 10:07 PM Lukasz Langa wrote: > This is off topic for discussion of this PEP. > > It would require another one (essentially an extension of PEP 484) to get > passed for your idea to be standardized. > I'm not sure whether this is directed to me; so just to make it clear, I did not propose anything here. I will be happy for PEP 563 to be accepted as is, although the implications of the ability to customize Python's scoping/binding rules are worth mentioning in the PEP, regardless of whether these are considered a pro or a con. Elazar -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan_ml at behnel.de Mon Sep 11 15:23:21 2017 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 11 Sep 2017 21:23:21 +0200 Subject: [Python-ideas] PEP 563: Postponed Evaluation of Annotations, first draft In-Reply-To: References: <1176DC03-ACB8-48F4-B8A0-02268B524BE4@langa.pl> Message-ID: Ryan Gonzalez schrieb am 11.09.2017 um 19:16: > One thing I want to point out: there are a lot of really useful Python > libraries that have come to rely on annotations being objects, ranging > from plac to fbuild to many others. I could understand something that > delays the evaluation of annotations until they are accessed, but this > seems really extreme. I guess there could be some helper that would allow you to say "here's an annotation or a function, here's the corresponding module globals(), please give me the annotation instances". But the "__annotations__" mapping could probably also be self-expanding on first request, if it remembers its globals(). Stefan From lukasz at langa.pl Mon Sep 11 15:25:28 2017 From: lukasz at langa.pl (Lukasz Langa) Date: Mon, 11 Sep 2017 15:25:28 -0400 Subject: [Python-ideas] PEP 563: Postponed Evaluation of Annotations, first draft In-Reply-To: References: <1176DC03-ACB8-48F4-B8A0-02268B524BE4@langa.pl> Message-ID: <07A8117C-010D-4336-8591-0F3A15F2FCA3@langa.pl> > On Sep 11, 2017, at 1:16 PM, Ryan Gonzalez wrote: > > One thing I want to point out: there are a lot of really useful Python > libraries that have come to rely on annotations being objects, ranging > from plac to fbuild to many others. 
Shout out to fbuild which is a project that was built on Python 3 since early 2008, running on RCs of Python 3.0! Mind=blown. We are aware of fbuild, plac, and dryparse, Larry Hastings' similar project. But I definitely wouldn't say there are "a lot of" libraries like that, in fact I only know of a handful more, most of them early stage "runtime type checkers" with minimal adoption. When PEP 484 was put for review, we were testing the waters here, and added wording like: "In order for maximal compatibility with offline type checking it may eventually be a good idea to change interfaces that rely on annotations to switch to a different mechanism, for example a decorator." and "We do hope that type hints will eventually become the sole use for annotations, but this will require additional discussion and a deprecation period after the initial roll-out of the typing module. (...) Another possible outcome would be that type hints will eventually become the default meaning for annotations, but that there will always remain an option to disable them." Turns out, only authors of a few libraries spoke up AFAICT and most were happy with @no_type_hints. I remember mostly Stefan Behnel's concerns about Cython's annotations, and those never really took off. The bigger uproar at the time was against Python becoming a "statically typed language". Summing up, PEP 563 is proposing a backwards incompatible change but it doesn't look like it's going to affect "a lot of" libraries. More importantly, it does provide a way for those libraries to keep working. > I could understand something that > delays the evaluation of annotations until they are accessed, but this > seems really extreme. This PEP is proposing delaying evaluation until annotations are accessed but gives user code the power to decide whether the string form is enough, or maybe an AST would be enough, or actual evaluation with get_type_hints() or eval() is necessary. - ? -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 842 bytes Desc: Message signed with OpenPGP URL: From guido at python.org Mon Sep 11 15:30:54 2017 From: guido at python.org (Guido van Rossum) Date: Mon, 11 Sep 2017 12:30:54 -0700 Subject: [Python-ideas] PEP 563: Postponed Evaluation of Annotations, first draft In-Reply-To: References: <1176DC03-ACB8-48F4-B8A0-02268B524BE4@langa.pl> Message-ID: On Mon, Sep 11, 2017 at 10:16 AM, Ryan Gonzalez wrote: > One thing I want to point out: there are a lot of really useful Python > libraries that have come to rely on annotations being objects, ranging > from plac to fbuild to many others. I could understand something that > delays the evaluation of annotations until they are accessed, but this > seems really extreme. > This is a serious concern and we need to give it some thought. The current thinking is that those libraries can still get those objects by simply applying eval() to the annotations (or typing.get_type_hints()). And they may already have to support that in order to support PEP 484's forward references. Though perhaps there's a reason why such libraries currently don't need to handle forward refs? -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From lukasz at langa.pl Mon Sep 11 15:39:54 2017 From: lukasz at langa.pl (Lukasz Langa) Date: Mon, 11 Sep 2017 15:39:54 -0400 Subject: [Python-ideas] PEP 563: Postponed Evaluation of Annotations, first draft In-Reply-To: References: <1176DC03-ACB8-48F4-B8A0-02268B524BE4@langa.pl> Message-ID: <06BB273C-61FC-4154-A1D6-B777C0D9FA49@langa.pl> > On Sep 11, 2017, at 3:23 PM, Stefan Behnel wrote: > > Ryan Gonzalez schrieb am 11.09.2017 um 19:16: >> One thing I want to point out: there are a lot of really useful Python >> libraries that have come to rely on annotations being objects, ranging >> from plac to fbuild to many others. I could understand something that >> delays the evaluation of annotations until they are accessed, but this >> seems really extreme. > > I guess there could be some helper that would allow you to say "here's an > annotation or a function, here's the corresponding module globals(), please > give me the annotation instances". Currently the PEP simply proposes using eval(ann, globals, locals) and even suggests where to take globals from. The problem is with nested classes or type annotations that are using local state. The PEP is proposing to disallow those due to the trickiness of getting the global and local state right in those situations. Instead, you'd use qualified names for class-level fields that you're using in your annotation. This change is fine for static use and runtime use, except when faced with metaclasses or class decorators which resolve the annotations of a class in question. So far it looks like both typing.NamedTuple and the proposed data classes are fine with this. But if you have examples of metaclasses or class decorators which would break, let me know! - ? -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 842 bytes Desc: Message signed with OpenPGP URL: From nas-python-ideas at arctrix.com Mon Sep 11 16:03:45 2017 From: nas-python-ideas at arctrix.com (Neil Schemenauer) Date: Mon, 11 Sep 2017 14:03:45 -0600 Subject: [Python-ideas] lazy import via __future__ or compiler analysis In-Reply-To: References: <20170907174402.ocdy3zumzdfj7xg6@python.ca> <20170908163604.xxp4hd7idi3eboum@python.ca> <20170911180927.hlgtzaph56mygsco@python.ca> Message-ID: <20170911200345.dajb7gpac2ehwvgg@python.ca> On 2017-09-11, C Anthony Risinger wrote: > I'm getting at, is can we find a way to make modules a real type? So dunder > methods are activated? This would make modules phenomenally powerful > instead of just a namespace (or resorting to after the fact __class__ > reassignment hacks). My __namespace__ idea will allow this. A module can be a singleton instance of a singleton ModuleType instance. So, you can assign a property like: .__class__.prop = and have it just work. Each module would have a singleton class associated with it to store the properties. The spelling of will need to be worked out. It could be sys.modules[__name__].__class__ or perhaps we can have a weakref, so this: __module__.__class__.prop = ... Need to think about this. I have done import hooks before and I know the pain involved. importlib cleans things up a lot. However, if my early prototype work is an indication, the import stuff gets a whole lot simpler. Instead of passing around a dict and then grubbing around sys.modules because the module is actually what you want, you just pass the module around directly. 
Thanks for you feedback. Regards, Neil From yselivanov.ml at gmail.com Mon Sep 11 16:21:04 2017 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Mon, 11 Sep 2017 16:21:04 -0400 Subject: [Python-ideas] PEP 563: Postponed Evaluation of Annotations, first draft In-Reply-To: <07A8117C-010D-4336-8591-0F3A15F2FCA3@langa.pl> References: <1176DC03-ACB8-48F4-B8A0-02268B524BE4@langa.pl> <07A8117C-010D-4336-8591-0F3A15F2FCA3@langa.pl> Message-ID: On Mon, Sep 11, 2017 at 3:25 PM, Lukasz Langa wrote: [..] > This PEP is proposing delaying evaluation until annotations are accessed but > gives user code the power to decide whether the string form is enough, or > maybe an AST would be enough, or actual evaluation with get_type_hints() or > eval() is necessary. I'm one of those who used annotations for other purposes than type hints. And even if annotations became strings in Python 3.7 *without future import*, fixing my libraries would be easy -- just add an eval(). That said, the PEP doesn't cover an alternative solution: 1. Add another special attribute to functions: __annotations_text__. 2. __annotations__ becomes a dynamic Mapping, which evaluates stuff from __annotations_text__ *lazily*. 3. Recommend linters and IDEs to support "# pragma: annotations", as a way to say that the Python files follows the new Python 3.7 annotations semantics. That would maintain full backwards compatibility with all existing Python libraries and would not require a future import. Yury From nas-python-ideas at arctrix.com Mon Sep 11 16:32:23 2017 From: nas-python-ideas at arctrix.com (Neil Schemenauer) Date: Mon, 11 Sep 2017 14:32:23 -0600 Subject: [Python-ideas] lazy import via __future__ or compiler analysis In-Reply-To: <20170911200345.dajb7gpac2ehwvgg@python.ca> References: <20170907174402.ocdy3zumzdfj7xg6@python.ca> <20170908163604.xxp4hd7idi3eboum@python.ca> <20170911180927.hlgtzaph56mygsco@python.ca> <20170911200345.dajb7gpac2ehwvgg@python.ca> Message-ID: <20170911203223.2kajjuzybtkltewr@python.ca> On 2017-09-11, Neil Schemenauer wrote: > A module can be a singleton instance of a singleton ModuleType > instance. Maybe more accurate to say each module would have its own unique __class__ associated with it. So, you can add properties to the class without affecting other modules. For backwards compatibility, we can create anonymous modules as needed if people are passing 'dict' objects to the legacy APIs. From victor.stinner at gmail.com Mon Sep 11 18:45:37 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Tue, 12 Sep 2017 00:45:37 +0200 Subject: [Python-ideas] Hexadecimal floating literals In-Reply-To: References: Message-ID: Instead of modifying the Python grammar, the alternative is to enhance float(str) to support it: k = float("0x1.2492492492492p-3") # 1/7 Victor 2017-09-08 8:57 GMT+02:00 Serhiy Storchaka : > The support of hexadecimal floating literals (like 0xC.68p+2) is included in > just released C++17 standard. Seems this becomes a mainstream. > > In Python float.hex() returns hexadecimal string representation. Is it a > time to add more support of hexadecimal floating literals? Accept them in > float constructor and in Python parser? And maybe add support of hexadecimal > formatting ('%x' and '{:x}')? 
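For reference, a quick sketch of what already exists versus what is being proposed:

    # Already available today: exact hex formatting and parsing as float methods.
    x = 1 / 7
    s = x.hex()                    # '0x1.2492492492492p-3'
    assert float.fromhex(s) == x   # exact round-trip, no decimal rounding

    # The thread is about (a) letting float() itself accept such strings and
    # (b) a source-level literal like 0x1.2492492492492p-3, as C99/C++17 allow.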
> > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From nas-python-ideas at arctrix.com Mon Sep 11 20:26:16 2017 From: nas-python-ideas at arctrix.com (Neil Schemenauer) Date: Mon, 11 Sep 2017 18:26:16 -0600 Subject: [Python-ideas] Hexadecimal floating literals In-Reply-To: References: Message-ID: <20170912002616.3gil3x6ymufaecsx@python.ca> On 2017-09-12, Victor Stinner wrote: > Instead of modifying the Python grammar, the alternative is to enhance > float(str) to support it: > > k = float("0x1.2492492492492p-3") # 1/7 Making it a different function from float() would avoid backwards compatibility issues. I.e. float() no longer returns errors on some inputs. E.g. from math import hexfloat k = hexfloat("0x1.2492492492492p-3") I still think a literal syntax has merits. The above cannot be optimized by the compiler as it doesn't know what hexfloat() refers to. That in turn destroys constant folding peephole stuff that uses the literal. From ethan at ethanhs.me Mon Sep 11 20:33:51 2017 From: ethan at ethanhs.me (Ethan Smith) Date: Mon, 11 Sep 2017 17:33:51 -0700 Subject: [Python-ideas] PEP 563: Postponed Evaluation of Annotations, first draft In-Reply-To: References: <1176DC03-ACB8-48F4-B8A0-02268B524BE4@langa.pl> <07A8117C-010D-4336-8591-0F3A15F2FCA3@langa.pl> Message-ID: On Mon, Sep 11, 2017 at 1:21 PM, Yury Selivanov wrote: > On Mon, Sep 11, 2017 at 3:25 PM, Lukasz Langa wrote: > [..] > > This PEP is proposing delaying evaluation until annotations are accessed > but > > gives user code the power to decide whether the string form is enough, or > > maybe an AST would be enough, or actual evaluation with get_type_hints() > or > > eval() is necessary. > > I'm one of those who used annotations for other purposes than type > hints. And even if annotations became strings in Python 3.7 *without > future import*, fixing my libraries would be easy -- just add an > eval(). > > That said, the PEP doesn't cover an alternative solution: > > 1. Add another special attribute to functions: __annotations_text__. > > 2. __annotations__ becomes a dynamic Mapping, which evaluates stuff > from __annotations_text__ *lazily*. > > 3. Recommend linters and IDEs to support "# pragma: annotations", as a > way to say that the Python files follows the new Python 3.7 > annotations semantics. > > I really like this proposal! I agree the linters should understand the semantics of if TYPE_CHECKING, Python 2.7-3.6 will continue to need it. > That would maintain full backwards compatibility with all existing > Python libraries and would not require a future import. > > Yury > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From steve at pearwood.info Mon Sep 11 21:48:44 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 12 Sep 2017 11:48:44 +1000 Subject: [Python-ideas] Hexadecimal floating literals In-Reply-To: References: Message-ID: <20170912014844.GV13110@ando.pearwood.info> On Tue, Sep 12, 2017 at 12:45:37AM +0200, Victor Stinner wrote: > Instead of modifying the Python grammar, the alternative is to enhance > float(str) to support it: > > k = float("0x1.2492492492492p-3") # 1/7 Why wouldn't you just write 1/7? -- Steve From steve at pearwood.info Mon Sep 11 21:45:18 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 12 Sep 2017 11:45:18 +1000 Subject: [Python-ideas] PEP 563: Postponed Evaluation of Annotations, first draft In-Reply-To: <1176DC03-ACB8-48F4-B8A0-02268B524BE4@langa.pl> References: <1176DC03-ACB8-48F4-B8A0-02268B524BE4@langa.pl> Message-ID: <20170912014516.GU13110@ando.pearwood.info> On Mon, Sep 11, 2017 at 11:58:45AM -0400, Lukasz Langa wrote: > PEP: 563 > Title: Postponed Evaluation of Annotations A few comments, following the quoted passages as needed. > Rationale and Goals > =================== > > PEP 3107 added support for arbitrary annotations on parts of a function > definition. Just like default values, annotations are evaluated at > function definition time. This creates a number of issues for the type > hinting use case: > > * forward references: when a type hint contains names that have not been > defined yet, that definition needs to be expressed as a string > literal; > > * type hints are executed at module import time, which is not > computationally free. > > Postponing the evaluation of annotations solves both problems. You haven't justified that these are problems large enough to need fixing, let alone fixing in a backwards-incompatible way. Regarding forward references: I see no problem with quoting forward references. Some people complain about the quotation marks, but frankly I don't think that's a major imposition. I'm not sure that going from a situation where only forward references are strings, to one where all annotations are strings, counts as a positive solution. Regarding the execution time at runtime: this sounds like premature optimization. If it is not, if you have profiled some applications found that there is more than a trivial amount of time used by generating the annotations, you should say so. In my opinion, a less disruptive solution to the execution time (supposed?) problem is a switch to disable annotations altogether, similar to the existing -O optimize switch, turning them into no-ops. That won't make any difference to static type-checkers. Some people will probably want such a switch anyway, if they care about the memory used by the __annotations__ dictionaries. > Implementation > ============== > > In a future version of Python, function and variable annotations will no > longer be evaluated at definition time. Instead, a string form will be > preserved in the respective ``__annotations__`` dictionary. Static type > checkers will see no difference in behavior, whereas tools using > annotations at runtime will have to perform postponed evaluation. > > If an annotation was already a string, this string is preserved > verbatim. In other cases, the string form is obtained from the AST > during the compilation step, which means that the string form preserved > might not preserve the exact formatting of the source. Can you give an example of how the AST may distort the source? 
My knee-jerk reaction is that anything which causes the annotation to differ from the source is likely to cause confusion. > Annotations need to be syntactically valid Python expressions, also when > passed as literal strings (i.e. ``compile(literal, '', 'eval')``). > Annotations can only use names present in the module scope as postponed > evaluation using local names is not reliable. And that's a problem. Forcing the use of global variables seems harmful. Preventing the use of locals seems awful. Can't we use some sort of closure-like mechanism? This restriction is going to break, or prevent, situations like this: def decorator(func): kind = ... # something generated at decorator call time @functools.wraps(func) def inner(arg: kind): ... return inner Even if static typecheckers have no clue what the annotation on the inner function is, it is still useful for introspection and documentation. > Resolving Type Hints at Runtime > =============================== [...] > To get the correct module-level > context to resolve class variables, use:: > > cls_globals = sys.modules[SomeClass.__module__].__dict__ A small style issue: I think that's better written as: cls_globals = vars(sys.modules[SomeClass.__module__]) We should avoid directly accessing dunders unless necessary, and vars() exists specifically for the purpose of returning object's __dict__. > Runtime annotation resolution and ``TYPE_CHECKING`` > --------------------------------------------------- > > Sometimes there's code that must be seen by a type checker but should > not be executed. For such situations the ``typing`` module defines a > constant, ``TYPE_CHECKING``, that is considered ``True`` during type > checking but ``False`` at runtime. Example:: > > import typing > > if typing.TYPE_CHECKING: > import expensive_mod > > def a_func(arg: expensive_mod.SomeClass) -> None: > a_var: expensive_mod.SomeClass = arg > ... I don't know whether this is important, but for the record the current documentation shows expensive_mod.SomeClass quoted. > Backwards Compatibility > ======================= [...] > Annotations that depend on locals at the time of the function/class > definition are now invalid. Example:: As mentioned above, I think this is a bad loss of existing functionality. [...] > In the presence of an annotation that cannot be resolved using the > current module's globals, a NameError is raised at compile time. It is not clear what this means, or how it will work, especially given that the point of this is to delay evaluating annotations. How will the compiler know that an annotation cannot be resolved if it doesn't try to evaluate it? Which brings me to another objection. In general, errors should be caught as early as possible. Currently, many (but not all) errors in annotations are caught at runtime because their evaluation fails: class MyClass: ... def function(arg: MyClsas) -> int: # oops ... That's a nice feature even if I'm not using a type-checker: I get immediate feedback as soon as I try running the code that my annotation is wrong. It is true that forward references aren't evaluated at runtime, because they are strings, but "ordinary" annotations are evaluated, and that's a good thing! Losing that seems like a step backwards. > Rejected Ideas > ============== > > Keep the ability to use local state when defining annotations > ------------------------------------------------------------- > > With postponed evaluation, this is impossible for function locals. Impossible seems a bit strong. Can you elaborate? [...] 
> This is brittle and doesn't even cover slots. Requiring the use of > module-level names simplifies runtime evaluation and provides the > "one obvious way" to read annotations. It's the equivalent of absolute > imports. I hardly think that "simplifies runtime evaluation" is true. At the moment annotations are already evaluated. *Anything* that you have to do by hand (like call eval) cannot be simpler than "do nothing". I don't think the analogy with absolute imports is even close to useful, and far from being "one obvious way", this is a surprising, non-obvious, seemingly-arbitrary restriction on annotations. Given that for *six versions* of Python 3 annotations could be locals, there is nothing obvious about restricting them to globals. -- Steve From stefan_ml at behnel.de Tue Sep 12 02:03:58 2017 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 12 Sep 2017 08:03:58 +0200 Subject: [Python-ideas] PEP 563: Postponed Evaluation of Annotations, first draft In-Reply-To: <07A8117C-010D-4336-8591-0F3A15F2FCA3@langa.pl> References: <1176DC03-ACB8-48F4-B8A0-02268B524BE4@langa.pl> <07A8117C-010D-4336-8591-0F3A15F2FCA3@langa.pl> Message-ID: Lukasz Langa schrieb am 11.09.2017 um 21:25: > I remember mostly Stefan Behnel's concerns about Cython's annotations, I'm currently reimplementing the annotation typing in Cython to be compatible with PEP-484, so that concern is pretty much out of the way. This PEP still has an impact on Cython, because we'd have to implement the same thing, and also make the interface available in older Python versions (2.6+) for Cython compiled modules. Stefan From victor.stinner at gmail.com Tue Sep 12 03:23:04 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Tue, 12 Sep 2017 09:23:04 +0200 Subject: [Python-ideas] Hexadecimal floating literals In-Reply-To: <20170912014844.GV13110@ando.pearwood.info> References: <20170912014844.GV13110@ando.pearwood.info> Message-ID: 2017-09-12 3:48 GMT+02:00 Steven D'Aprano : >> k = float("0x1.2492492492492p-3") # 1/7 > > Why wouldn't you just write 1/7? 1/7 is irrational, so it's not easy to get the "exact value" for a 64-bit IEEE 754 double float. I chose it because it's easy to write. Maybe math.pi is a better example :-) >>> math.pi.hex() '0x1.921fb54442d18p+1' Victor From victor.stinner at gmail.com Tue Sep 12 03:24:39 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Tue, 12 Sep 2017 09:24:39 +0200 Subject: [Python-ideas] Hexadecimal floating literals In-Reply-To: <20170911232756.ilvoimg6jtlnp3sk@python.ca> References: <20170911232756.ilvoimg6jtlnp3sk@python.ca> Message-ID: 2017-09-12 1:27 GMT+02:00 Neil Schemenauer : >> k = float("0x1.2492492492492p-3") # 1/7 > > Making it a different function from float() would avoid backwards > compatibility issues. I.e. float() no longer returns errors on some > inputs. In that case, I suggest float.fromhex() to remain consistent the bytes example: >>> b'123'.hex() '313233' >>> bytes.fromhex('313233') b'123' ... Oh wait, it already exists... >>> float.fromhex('0x1.921fb54442d18p+1') 3.141592653589793 :-D Victor From levkivskyi at gmail.com Tue Sep 12 04:26:34 2017 From: levkivskyi at gmail.com (Ivan Levkivskyi) Date: Tue, 12 Sep 2017 10:26:34 +0200 Subject: [Python-ideas] PEP 562 In-Reply-To: References: Message-ID: @Anthony > module.__getattr__ works pretty well for normal access, after being > imported by another module, but it doesn't properly trigger loading by > functions defined in the module's own namespace. 
The idea of my PEP is to be very simple (both semantically and in terms of implementation). This is why I don't want to add any complex logic. People who will want to use __getattr__ for lazy loading still can do this by importing submodules. @Nathaniel @INADA > The main two use cases I know of for this and PEP 549 are lazy imports > of submodules, and deprecating attributes. Yes, lazy loading seems to be a popular idea :-) I will add the simple recipe by Inada to the PEP since it will already work. @Cody > I still think the better way > to solve the custom dir() would be to change the module __dir__ > method to check if __all__ is defined and use it to generate the > result if it exists. This seems like a logical enhancement to me, > and I'm planning on writing a patch to implement this. Whether it > would be accepted is still an open issue though. This seems a reasonable rule to me, I can also make this patch if you will not have time. @Guido What do you think about the above idea? -- Ivan -------------- next part -------------- An HTML attachment was scrubbed... URL: From levkivskyi at gmail.com Tue Sep 12 05:38:32 2017 From: levkivskyi at gmail.com (Ivan Levkivskyi) Date: Tue, 12 Sep 2017 11:38:32 +0200 Subject: [Python-ideas] PEP 563: Postponed Evaluation of Annotations, first draft In-Reply-To: References: <1176DC03-ACB8-48F4-B8A0-02268B524BE4@langa.pl> <07A8117C-010D-4336-8591-0F3A15F2FCA3@langa.pl> Message-ID: In principle, I like this idea, this will save some keystrokes and will make annotated code more "beautiful". But I am quite worried about the backwards compatibility. One possible idea would be to use __future__ import without a definite deprecation plan. If people will be fine with using typing.get_type_hints (btw this is already the preferred way instead of directly accessing __annotations__, according to PEP 526 at least) then we could go ahead with deprecation. Also I really like Yury's idea of dynamic mapping, but it has one downside, semantics of this will change: def fun(x: print("Function defined"), y: int) -> None: ... However I agree functions with side effects in annotations are very rare, and it would be reasonable to sacrifice this tiny backwards compatibility to avoid the __future__ import. -- Ivan -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Tue Sep 12 06:40:26 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 12 Sep 2017 20:40:26 +1000 Subject: [Python-ideas] PEP 554: Stdlib Module to Support Multiple Interpreters in Python Code In-Reply-To: References: Message-ID: On 11 September 2017 at 18:02, Koos Zevenhoven wrote: > On Mon, Sep 11, 2017 at 8:32 AM, Nick Coghlan wrote: >> The line between it and the "CPython Runtime" is fuzzy for both >> practical and historical reasons, but the regular Python CLI will >> always have a "first created, last destroyed" main interpreter, simply >> because we don't really gain anything significant from eliminating it >> as a concept. > > I fear that emphasizing the main interpreter will lead to all kinds of > libraries/programs that somehow unnecessarily rely on some or all tasks > being performed in the main interpreter. Then you'll have a hard time > running two of them in parallel in the same process, because you don't have > two main interpreters. You don't need to fear this scenario, since it's a description of the status quo (and it's the primary source of overstated claims about subinterpreters being "fundamentally broken"). 
So no, not everything will be subinterpreter-friendly, just as not everything in Python is thread-safe, and not everything is portable across platforms. That's OK - it just means we'll aim to make as many things as possible implicitly subinterpreter-friendly, and for everything else, we'll aim to minimise the adjustments needed to *make* things subinterpreter friendly. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Tue Sep 12 06:54:26 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 12 Sep 2017 20:54:26 +1000 Subject: [Python-ideas] Give nonlocal the same creating power as global In-Reply-To: <5c49346e-198e-8348-5107-9b16d321ea20@gmail.com> References: <89c78309-d807-8d76-2e6f-b3ef0ff29c35@gmail.com> <5c49346e-198e-8348-5107-9b16d321ea20@gmail.com> Message-ID: On 12 September 2017 at 03:32, Jo?o Matos wrote: > Hello, > > You're correct. The idea is to give nonlocal the same ability, redirect > subsequent bindings if the variable doesn't exist. The issue you're facing is that optimised local variables still need to be defined in the compilation unit where they're locals - we're not going to make the compiler keep track of all the nonlocal declarations in nested functions and infer additional local variables from those. (It's not technically impossible to do that, it just takes our already complex name scoping rules, and makes them even more complex and hard to understand). So in order to do what you want, you're going to need to explicitly declare a local variable in the scope you want to write to, either by assigning None to it (in any version), or by using a variable annotation (in 3.6+). Future readers of your code will thank you for making the widget definitions easier to find, rather than having them scattered through an arbitrarily large number of nested functions :) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From rosuav at gmail.com Tue Sep 12 07:15:24 2017 From: rosuav at gmail.com (Chris Angelico) Date: Tue, 12 Sep 2017 21:15:24 +1000 Subject: [Python-ideas] Give nonlocal the same creating power as global In-Reply-To: References: <89c78309-d807-8d76-2e6f-b3ef0ff29c35@gmail.com> <5c49346e-198e-8348-5107-9b16d321ea20@gmail.com> Message-ID: On Tue, Sep 12, 2017 at 8:54 PM, Nick Coghlan wrote: > On 12 September 2017 at 03:32, Jo?o Matos wrote: >> Hello, >> >> You're correct. The idea is to give nonlocal the same ability, redirect >> subsequent bindings if the variable doesn't exist. > > The issue you're facing is that optimised local variables still need > to be defined in the compilation unit where they're locals - we're not > going to make the compiler keep track of all the nonlocal declarations > in nested functions and infer additional local variables from those. > (It's not technically impossible to do that, it just takes our already > complex name scoping rules, and makes them even more complex and hard > to understand). > > So in order to do what you want, you're going to need to explicitly > declare a local variable in the scope you want to write to, either by > assigning None to it (in any version), or by using a variable > annotation (in 3.6+). Future readers of your code will thank you for > making the widget definitions easier to find, rather than having them > scattered through an arbitrarily large number of nested functions :) Or, yaknow, the OP could actually use a class, instead of treating a closure as a poor-man's class... 
Honestly, I don't see much advantage to the closure here. A class is a far better tool for this job IMO. ChrisA From ncoghlan at gmail.com Tue Sep 12 07:17:23 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 12 Sep 2017 21:17:23 +1000 Subject: [Python-ideas] PEP 563: Postponed Evaluation of Annotations, first draft In-Reply-To: <20170912014516.GU13110@ando.pearwood.info> References: <1176DC03-ACB8-48F4-B8A0-02268B524BE4@langa.pl> <20170912014516.GU13110@ando.pearwood.info> Message-ID: On 12 September 2017 at 11:45, Steven D'Aprano wrote: >> Rejected Ideas >> ============== >> >> Keep the ability to use local state when defining annotations >> ------------------------------------------------------------- >> >> With postponed evaluation, this is impossible for function locals. > > Impossible seems a bit strong. Can you elaborate? I actually agree with this, and I think there's an alternative to string evaluation that would solve the "figure out the right globals() & locals() references" problem in a more elegant way: instead of using strings, implicitly compile the annotations as "lambda: ". You'd still lose the ability to access class locals (except by their qualified name), but you'd be able to access function locals just fine, since the lambda expression would implicitly generate the necessary closure cells to keep the relevant local variables alive following the termination of the outer function. Unfortunately, this idea has the downside that for trivial annotations, defining a lambda expression is likely to be *slower* than evaluating the expression, whereas referencing a string constant is faster: $ python -m perf timeit "int" ..................... Mean +- std dev: 27.7 ns +- 1.8 ns $ python -m perf timeit "lambda: int" ..................... Mean +- std dev: 66.0 ns +- 1.7 ns $ python -m perf timeit "'int'" ..................... Mean +- std dev: 7.97 ns +- 0.32 ns Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From steve at pearwood.info Tue Sep 12 07:20:03 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 12 Sep 2017 21:20:03 +1000 Subject: [Python-ideas] Hexadecimal floating literals In-Reply-To: <20170912002616.3gil3x6ymufaecsx@python.ca> References: <20170912002616.3gil3x6ymufaecsx@python.ca> Message-ID: <20170912112002.GW13110@ando.pearwood.info> On Mon, Sep 11, 2017 at 06:26:16PM -0600, Neil Schemenauer wrote: > On 2017-09-12, Victor Stinner wrote: > > Instead of modifying the Python grammar, the alternative is to enhance > > float(str) to support it: > > > > k = float("0x1.2492492492492p-3") # 1/7 > > Making it a different function from float() would avoid backwards > compatibility issues. I.e. float() no longer returns errors on some > inputs. I don't think many people will care about backwards compatibility of errors. Intentionally calling float() in order to get an exception is not very common (apart from test suites). Its easier to use raise if you want a ValueError. The only counter-example I can think of is beginner programmers who write something like: num = float(input("Enter a number:")) and are surprised when the "invalid" response "0x1.Fp2" is accepted. But then they've already got the same so-called problem with int accepting "invalid" strings like "0xDEADBEEF". So I stress that this is a problem in theory, not in practice. > E.g. > > from math import hexfloat > k = hexfloat("0x1.2492492492492p-3") I don't think that's necessary. float() is sufficient. > I still think a literal syntax has merits. 
The above cannot be > optimized by the compiler as it doesn't know what hexfloat() refers > to. That in turn destroys constant folding peephole stuff that uses > the literal. Indeed. If there are use-cases for hexadecimal floats, then we should support both a literal 0x1.fp2 form and the float constructor. -- Steve From steve at pearwood.info Tue Sep 12 07:28:51 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 12 Sep 2017 21:28:51 +1000 Subject: [Python-ideas] Hexadecimal floating literals In-Reply-To: References: <20170912014844.GV13110@ando.pearwood.info> Message-ID: <20170912112851.GX13110@ando.pearwood.info> On Tue, Sep 12, 2017 at 09:23:04AM +0200, Victor Stinner wrote: > 2017-09-12 3:48 GMT+02:00 Steven D'Aprano : > >> k = float("0x1.2492492492492p-3") # 1/7 > > > > Why wouldn't you just write 1/7? > > 1/7 is irrational, so it's not easy to get the "exact value" for a > 64-bit IEEE 754 double float. 1/7 is not irrational. It is the ratio of 1 over 7, by definition it is a rational number. Are you thinking of square root of 7? 1/7 gives the exact 64-bit IEEE 754 float closest to the true rational number 1/7. And with the keyhole optimizer in recent versions of Python, you don't even pay a runtime cost. py> (1/7).hex() '0x1.2492492492492p-3' I do like the idea of having float hex literals, and supporting them in float itself (although we do already have float.fromhex) but I must admit I'm struggling for a use-case. But perhaps "C allows it now, we should too" is a good enough reason. > I chose it because it's easy to write. Maybe math.pi is a better example :-) > > >>> math.pi.hex() > '0x1.921fb54442d18p+1' 3.141592653589793 is four fewer characters to type, just as accurate, and far more recognisable. -- Steve From rosuav at gmail.com Tue Sep 12 07:30:58 2017 From: rosuav at gmail.com (Chris Angelico) Date: Tue, 12 Sep 2017 21:30:58 +1000 Subject: [Python-ideas] Hexadecimal floating literals In-Reply-To: <20170912112002.GW13110@ando.pearwood.info> References: <20170912002616.3gil3x6ymufaecsx@python.ca> <20170912112002.GW13110@ando.pearwood.info> Message-ID: On Tue, Sep 12, 2017 at 9:20 PM, Steven D'Aprano wrote: > On Mon, Sep 11, 2017 at 06:26:16PM -0600, Neil Schemenauer wrote: >> On 2017-09-12, Victor Stinner wrote: >> > Instead of modifying the Python grammar, the alternative is to enhance >> > float(str) to support it: >> > >> > k = float("0x1.2492492492492p-3") # 1/7 >> >> Making it a different function from float() would avoid backwards >> compatibility issues. I.e. float() no longer returns errors on some >> inputs. > > I don't think many people will care about backwards compatibility of > errors. Intentionally calling float() in order to get an exception is > not very common (apart from test suites). Its easier to use raise if > you want a ValueError. > > The only counter-example I can think of is beginner programmers who > write something like: > > num = float(input("Enter a number:")) > > and are surprised when the "invalid" response "0x1.Fp2" is accepted. But > then they've already got the same so-called problem with int accepting > "invalid" strings like "0xDEADBEEF". So I stress that this is a problem > in theory, not in practice. Your specific example doesn't work as int() won't accept that by default - you have to explicitly say "base=0" to make that acceptable. 
But we have other examples where what used to be an error is now acceptable: Python 3.5.3 (default, Jan 19 2017, 14:11:04) [GCC 6.3.0 20170118] on linux Type "help", "copyright", "credits" or "license" for more information. >>> int("1_234_567") Traceback (most recent call last): File "", line 1, in ValueError: invalid literal for int() with base 10: '1_234_567' Python 3.7.0a0 (heads/master:cb76029b47, Aug 30 2017, 23:43:41) [GCC 6.3.0 20170516] on linux Type "help", "copyright", "credits" or "license" for more information. >>> int("1_234_567") 1234567 Maybe hex floats should be acceptable only with float(str, base=0)? ChrisA From steve at pearwood.info Tue Sep 12 07:35:05 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 12 Sep 2017 21:35:05 +1000 Subject: [Python-ideas] PEP 563: Postponed Evaluation of Annotations, first draft In-Reply-To: References: <1176DC03-ACB8-48F4-B8A0-02268B524BE4@langa.pl> <20170912014516.GU13110@ando.pearwood.info> Message-ID: <20170912113504.GY13110@ando.pearwood.info> On Tue, Sep 12, 2017 at 09:17:23PM +1000, Nick Coghlan wrote: > Unfortunately, this idea has the downside that for trivial > annotations, defining a lambda expression is likely to be *slower* > than evaluating the expression, whereas referencing a string constant > is faster: Is it time to consider a specialised, high-speed (otherwise there's no point) thunk that can implicitly capture the environment like a function, but has less overhead? For starters, you don't need to care about argument passing. -- Steve From k7hoven at gmail.com Tue Sep 12 09:17:49 2017 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Tue, 12 Sep 2017 16:17:49 +0300 Subject: [Python-ideas] Hexadecimal floating literals In-Reply-To: References: <20170912002616.3gil3x6ymufaecsx@python.ca> <20170912112002.GW13110@ando.pearwood.info> Message-ID: On Tue, Sep 12, 2017 at 2:30 PM, Chris Angelico wrote: > > Your specific example doesn't work as int() won't accept that by > default - you have to explicitly say "base=0" to make that acceptable. > But we have other examples where what used to be an error is now > acceptable: ??I'm surprised that it's not "base=None". ??Koos?? -- + Koos Zevenhoven + http://twitter.com/k7hoven + -------------- next part -------------- An HTML attachment was scrubbed... URL: From contact at ionelmc.ro Tue Sep 12 10:07:20 2017 From: contact at ionelmc.ro (=?UTF-8?Q?Ionel_Cristian_M=C4=83rie=C8=99?=) Date: Tue, 12 Sep 2017 17:07:20 +0300 Subject: [Python-ideas] PEP 562 In-Reply-To: References: Message-ID: Wouldn't a better approach be a way to customize the type of the module? That would allow people to define behavior for almost anything (__call__, __getattr__, __setattr__, __dir__, various operators etc). This question shouldn't exist "why can't I customize behavior X in a module when I can do it for a class". Why go half-way. Thanks, -- Ionel Cristian M?rie?, http://blog.ionelmc.ro On Sun, Sep 10, 2017 at 9:48 PM, Ivan Levkivskyi wrote: > I have written a short PEP as a complement/alternative to PEP 549. > I will be grateful for comments and suggestions. The PEP should > appear online soon. 
> > -- > Ivan > > *********************************************************** > > PEP: 562 > Title: Module __getattr__ > Author: Ivan Levkivskyi > Status: Draft > Type: Standards Track > Content-Type: text/x-rst > Created: 09-Sep-2017 > Python-Version: 3.7 > Post-History: 09-Sep-2017 > > > Abstract > ======== > > It is proposed to support ``__getattr__`` function defined on modules to > provide basic customization of module attribute access. > > > Rationale > ========= > > It is sometimes convenient to customize or otherwise have control over > access to module attributes. A typical example is managing deprecation > warnings. Typical workarounds are assigning ``__class__`` of a module > object > to a custom subclass of ``types.ModuleType`` or substituting > ``sys.modules`` > item with a custom wrapper instance. It would be convenient to simplify > this > procedure by recognizing ``__getattr__`` defined directly in a module that > would act like a normal ``__getattr__`` method, except that it will be > defined > on module *instances*. For example:: > > # lib.py > > from warnings import warn > > deprecated_names = ["old_function", ...] > > def _deprecated_old_function(arg, other): > ... > > def __getattr__(name): > if name in deprecated_names: > warn(f"{name} is deprecated", DeprecationWarning) > return globals()[f"_deprecated_{name}"] > raise AttributeError(f"module {__name__} has no attribute {name}") > > # main.py > > from lib import old_function # Works, but emits the warning > > There is a related proposal PEP 549 that proposes to support instance > properties for a similar functionality. The difference is this PEP proposes > a faster and simpler mechanism, but provides more basic customization. > An additional motivation for this proposal is that PEP 484 already defines > the use of module ``__getattr__`` for this purpose in Python stub files, > see [1]_. > > > Specification > ============= > > The ``__getattr__`` function at the module level should accept one argument > which is a name of an attribute and return the computed value or raise > an ``AttributeError``:: > > def __getattr__(name: str) -> Any: ... > > This function will be called only if ``name`` is not found in the module > through the normal attribute lookup. > > The reference implementation for this PEP can be found in [2]_. > > > Backwards compatibility and impact on performance > ================================================= > > This PEP may break code that uses module level (global) name > ``__getattr__``. > The performance implications of this PEP are minimal, since ``__getattr__`` > is called only for missing attributes. > > > References > ========== > > .. [1] PEP 484 section about ``__getattr__`` in stub files > (https://www.python.org/dev/peps/pep-0484/#stub-files) > > .. [2] The reference implementation > (https://github.com/ilevkivskyi/cpython/pull/3/files) > > > Copyright > ========= > > This document has been placed in the public domain. > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From k7hoven at gmail.com Tue Sep 12 10:35:34 2017 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Tue, 12 Sep 2017 17:35:34 +0300 Subject: [Python-ideas] PEP 554: Stdlib Module to Support Multiple Interpreters in Python Code In-Reply-To: References: Message-ID: On Tue, Sep 12, 2017 at 1:40 PM, Nick Coghlan wrote: > On 11 September 2017 at 18:02, Koos Zevenhoven wrote: > > On Mon, Sep 11, 2017 at 8:32 AM, Nick Coghlan > wrote: > >> The line between it and the "CPython Runtime" is fuzzy for both > >> practical and historical reasons, but the regular Python CLI will > >> always have a "first created, last destroyed" main interpreter, simply > >> because we don't really gain anything significant from eliminating it > >> as a concept. > > > > I fear that emphasizing the main interpreter will lead to all kinds of > > libraries/programs that somehow unnecessarily rely on some or all tasks > > being performed in the main interpreter. Then you'll have a hard time > > running two of them in parallel in the same process, because you don't > have > > two main interpreters. > > You don't need to fear this scenario, since it's a description of the > status quo (and it's the primary source of overstated claims about > subinterpreters being "fundamentally broken"). > > Well, if that's true, it's hardly a counter-argument to what I said. Anyway, there is no status quo about what is proposed in the PEP. And as long as the existing APIs are preserved, why not make the new one less susceptible to overstated fundamental brokenness? > So no, not everything will be subinterpreter-friendly, just as not > everything in Python is thread-safe, and not everything is portable > across platforms. I don't see how the situation benefits from calling something the "main interpreter".? Subinterpreters can be a way to take something non-thread-safe and make it thread-safe, because in an interpreter-per-thread scheme, most of the state, like module globals, are thread-local. (Well, this doesn't help for async concurrency, but anyway.) > That's OK - it just means we'll aim to make as many > things as possible implicitly subinterpreter-friendly, and for > everything else, we'll aim to minimise the adjustments needed to > *make* things subinterpreter friendly. > > ?And that's exactly what I'm after here! I'm mostly just worried about the `get_main()` function. Maybe it should be called `asdfjaosjnoijb()`, so people wouldn't use it. Can't the first running interpreter just introduce itself to its children? And if that's too much to ask, maybe there could be a `get_parent()` function, which would give you the interpreter that spawned the current subinterpreter. Well OK, perhaps the current implementation only allows the "main interpreter" to spawn new interpreters (I have no idea). In that case, `get_parent()` will just be another, more future-proof name for `get_main()`. Then it would just need a clear documentation of the differences between the parent and its children. If the author of user code is not being too lazy, they might even read the docs and figure out if they *really* need to make a big deal out of which interpreter is the main/parent one. Still, I'm not convinced that there needs to be a get_main or get_parent. It shouldn't be too hard for users to make a wrapper around the API that provides this functionality. And if they do that??and use it to make their code "fundamentally broken"??then... at least we tried. 
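(A minimal sketch of such a wrapper, just to make the idea concrete -- it
assumes only create() and get_current() functions roughly along the lines
of what the PEP proposes; the parent bookkeeping is invented here and is
not part of any proposed API:

    # interp_util.py -- illustrative user-level wrapper, not a stdlib API
    import interpreters

    _parents = {}   # interpreter -> the interpreter that created it

    def create():
        child = interpreters.create()
        _parents[child] = interpreters.get_current()
        return child

    def get_parent(interp):
        # Only knows about interpreters created through this wrapper,
        # and only in the interpreter that did the creating.
        return _parents.get(interp)

Code running in a child would still need the parent to introduce itself
explicitly, which is exactly the kind of policy a wrapper, rather than the
stdlib, can decide on.)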
Koos

--
+ Koos Zevenhoven + http://twitter.com/k7hoven +

From stefan at bytereef.org  Tue Sep 12 10:53:15 2017
From: stefan at bytereef.org (Stefan Krah)
Date: Tue, 12 Sep 2017 16:53:15 +0200
Subject: [Python-ideas] PEP 554: Stdlib Module to Support Multiple Interpreters in Python Code
In-Reply-To: 
References: 
Message-ID: <20170912145314.GA2805@bytereef.org>

On Tue, Sep 12, 2017 at 05:35:34PM +0300, Koos Zevenhoven wrote:
> I don't see how the situation benefits from calling something the "main
> interpreter". Subinterpreters can be a way to take something
> non-thread-safe and make it thread-safe, because in an
> interpreter-per-thread scheme, most of the state, like module globals, are
> thread-local. (Well, this doesn't help for async concurrency, but anyway.)

You could have a privileged C extension that is only imported in the main
interpreter:

if get_current_interp() is main_interp():
    from _decimal import *
else:
    from _pydecimal import *

This is of course only attractive if importing the interpreters module
and calling these functions has minimal overhead.

Stefan Krah

From desmoulinmichel at gmail.com  Tue Sep 12 11:18:43 2017
From: desmoulinmichel at gmail.com (Michel Desmoulin)
Date: Tue, 12 Sep 2017 17:18:43 +0200
Subject: [Python-ideas] PEP 562
In-Reply-To: 
References: 
Message-ID: <808400f9-aca5-57fe-4b7d-564db3a12e95@gmail.com>

If I recall, there was a proposal a few months ago for a "lazy" keyword that
would render anything lazy, including imports.

Instead of just adding laziness on generators, then on imports, then who
knows where, maybe it's time to consider that laziness is a hell of a good
general concept and try to generalize it?

For imports, that would mean:

lazy from module import stuff
lazy import foo

For the rest,

bar = lazy 1 + 1

When you think about it, it's syntactic sugar to avoid manually wrapping
everything in functions, storing stuff in closures and calling that later.

On 12/09/2017 at 10:26, Ivan Levkivskyi wrote:
> @Anthony
>> module.__getattr__ works pretty well for normal access, after being
>> imported by another module, but it doesn't properly trigger loading by
>> functions defined in the module's own namespace.
>
> The idea of my PEP is to be very simple (both semantically and in terms
> of implementation). This is why I don't want to add any complex logic.
> People who will want to use __getattr__ for lazy loading still can do this
> by importing submodules.
>
> @Nathaniel @INADA
>> The main two use cases I know of for this and PEP 549 are lazy imports
>> of submodules, and deprecating attributes.
>
> Yes, lazy loading seems to be a popular idea :-)
> I will add the simple recipe by Inada to the PEP since it will already work.
>
> @Cody
>> I still think the better way
>> to solve the custom dir() would be to change the module __dir__
>> method to check if __all__ is defined and use it to generate the
>> result if it exists. This seems like a logical enhancement to me,
>> and I'm planning on writing a patch to implement this. Whether it
>> would be accepted is still an open issue though.
>
> This seems a reasonable rule to me, I can also make this patch if
> you will not have time.
>
> @Guido
> What do you think about the above idea?
> > -- > Ivan > > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > From k7hoven at gmail.com Tue Sep 12 11:30:09 2017 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Tue, 12 Sep 2017 18:30:09 +0300 Subject: [Python-ideas] PEP 554: Stdlib Module to Support Multiple Interpreters in Python Code In-Reply-To: <20170912145314.GA2805@bytereef.org> References: <20170912145314.GA2805@bytereef.org> Message-ID: On Tue, Sep 12, 2017 at 5:53 PM, Stefan Krah wrote: > On Tue, Sep 12, 2017 at 05:35:34PM +0300, Koos Zevenhoven wrote: > > I don't see how the situation benefits from calling something the "main > > interpreter". Subinterpreters can be a way to take something > > non-thread-safe and make it thread-safe, because in an > > interpreter-per-thread scheme, most of the state, like module globals, > are > > thread-local. (Well, this doesn't help for async concurrency, but > anyway.) > > You could have a privileged C extension that is only imported in the main > interpreter: > > > if get_current_interp() is main_interp(): > from _decimal import * > else: > from _pydecimal import * > > > ?Or it could be first-come first-served: if is_imported_by_other_process("_decimal"): from _pydecimal import * else from _decimal import * ??Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven + -------------- next part -------------- An HTML attachment was scrubbed... URL: From k7hoven at gmail.com Tue Sep 12 11:33:01 2017 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Tue, 12 Sep 2017 18:33:01 +0300 Subject: [Python-ideas] PEP 554: Stdlib Module to Support Multiple Interpreters in Python Code In-Reply-To: References: <20170912145314.GA2805@bytereef.org> Message-ID: On Tue, Sep 12, 2017 at 6:30 PM, Koos Zevenhoven wrote: > On Tue, Sep 12, 2017 at 5:53 PM, Stefan Krah wrote: > >> On Tue, Sep 12, 2017 at 05:35:34PM +0300, Koos Zevenhoven wrote: >> > I don't see how the situation benefits from calling something the "main >> > interpreter". Subinterpreters can be a way to take something >> > non-thread-safe and make it thread-safe, because in an >> > interpreter-per-thread scheme, most of the state, like module globals, >> are >> > thread-local. (Well, this doesn't help for async concurrency, but >> anyway.) >> >> You could have a privileged C extension that is only imported in the main >> interpreter: >> >> >> if get_current_interp() is main_interp(): >> from _decimal import * >> else: >> from _pydecimal import * >> >> >> > ??Oops.. it should of course be "by_this_process", not "by_other_process" (fixed below).?? > ?Or it could be first-come first-served: > > if is_imported_by_ > ?this > _process("_decimal"): > ?? > > from _pydecimal import * > else > from _decimal import * > > ??Koos > > > > -- > + Koos Zevenhoven + http://twitter.com/k7hoven + > -- + Koos Zevenhoven + http://twitter.com/k7hoven + -------------- next part -------------- An HTML attachment was scrubbed... URL: From nas-python-ideas at arctrix.com Tue Sep 12 12:17:08 2017 From: nas-python-ideas at arctrix.com (Neil Schemenauer) Date: Tue, 12 Sep 2017 10:17:08 -0600 Subject: [Python-ideas] LOAD_NAME/LOAD_GLOBAL should be use getattr() Message-ID: <20170912161708.x3mnxmrtbd26hsvi@python.ca> This is my idea of making module properties work. It is necessary for various lazy-loading module ideas and it cleans up the language IMHO. 
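To make the gap concrete, here is a rough sketch (module and attribute
names invented): with the existing __class__-assignment trick, a property
is honoured by attribute access from *outside* the module, but name lookup
*inside* the module goes straight to the module __dict__ via
LOAD_NAME/LOAD_GLOBAL and never sees it:

    # mymod.py -- illustration only
    import sys
    import types

    class _Mod(types.ModuleType):
        @property
        def answer(self):
            return 42

    sys.modules[__name__].__class__ = _Mod

    def show():
        # LOAD_GLOBAL looks in mymod.__dict__ (then builtins), not the
        # class, so this raises NameError instead of returning 42.
        return answer

    # From another module:
    #   import mymod
    #   mymod.answer   # 42 -- attribute access goes through the property
    #   mymod.show()   # NameError -- the property is invisible here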
I think it may be possible to do it with minimal backwards compatibility problems and performance regression. To me, the main issue with module properties (or module __getattr__) is that you introduce another level of indirection on global variable access. Anywhere the module.__dict__ is used as the globals for code execution, changing LOAD_NAME/LOAD_GLOBAL to have another level of indirection is necessary. That seems inescapable. Introducing another special feature of modules to make this work is not the solution, IMHO. We should make module namespaces be more like instance namespaces. We already have a mechanism and it is getattr on objects. I have a very early prototype of this idea. See: https://github.com/nascheme/cpython/tree/exec_mod Issues to be resolved: - __namespace__ entry in the __dict__ creates a reference cycle. Maybe could use a weakref somehow to avoid it. Maybe we just explicitly break it. - getattr() on the module may return things that LOAD_NAME and LOAD_GLOBAL don't expect (e.g. things from the module type). I need to investigate that. - Need to fix STORE_* opcodes to do setattr() rather than __setitem__. - Need to optimize the implementation. Maybe the module instance can know if any properties or __getattr__ are defined. If no, have __getattribute__ grab the variable directly from md_dict. - Need to fix eval() to allow module as well as dict. - Need to change logic where global dict is passed around. Pass the module instead so we don't have to keep retrieving __namespace__. For backwards compatibility, need to keep functions that take 'globals' as dict and use PyModule_GetDict() on public APIs that return globals as a dict. - interp->builtins should be a module, not a dict. - module shutdown procedure needs to be investigated and fixed. I think it may get simpler. - importlib needs to be fixed to pass modules to exec() and not dicts. From my initial experiments, it looks like importlib gets a lot simpler. Right now we pass around dicts in a lot of places and then have to grub around in sys.modules to get the module object, which is what importlib usually wants. I have requested help in writing a PEP for this idea but so far no one is foolish enough to join my crazy endeavor. ;-) Regards, Neil From njs at pobox.com Tue Sep 12 15:32:26 2017 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 12 Sep 2017 12:32:26 -0700 Subject: [Python-ideas] PEP 562 In-Reply-To: References: Message-ID: On Sep 12, 2017 7:08 AM, "Ionel Cristian M?rie? via Python-ideas" < python-ideas at python.org> wrote: Wouldn't a better approach be a way to customize the type of the module? That would allow people to define behavior for almost anything (__call__, __getattr__, __setattr__, __dir__, various operators etc). This question shouldn't exist "why can't I customize behavior X in a module when I can do it for a class". Why go half-way. If you're ok with replacing the object in sys.modules then the ability to totally customize your module's type has existed since the dawn era. And if you're not ok with that, then it's still existed since 3.5 via the mechanism of assigning to __class__ to change the type in-place. So this discussion isn't about adding new functionality per se, but about trying to find some way to provide a little bit of sugar that provides most of the value in a less obscure way. 
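For reference, the sys.modules-replacement idiom looks roughly like this
(names invented) -- it is exactly the kind of boilerplate the PEP is trying
to replace with something less obscure:

    # lib.py -- minimal sketch of the "replace yourself in sys.modules" trick
    import sys
    import warnings

    def old_function():
        return "still works"

    class _Wrapper:
        def __init__(self, module):
            self._module = module   # keep the real module object alive

        def __getattr__(self, name):
            if name == "old_function":
                warnings.warn("old_function is deprecated",
                              DeprecationWarning)
            return getattr(self._module, name)

    sys.modules[__name__] = _Wrapper(sys.modules[__name__])

After that, "import lib" hands out the wrapper, and "lib.old_function()"
emits the warning and then delegates to the real module.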
(And unfortunately there's a chicken and egg problem for using custom module types *without* the __class__ assignment hack, because you can't load any code from a package until after you've created the top level module object. So we've kind of taken custom module types as far as they can go already.) -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From nas-python-ideas at arctrix.com Tue Sep 12 15:46:46 2017 From: nas-python-ideas at arctrix.com (Neil Schemenauer) Date: Tue, 12 Sep 2017 13:46:46 -0600 Subject: [Python-ideas] PEP 562 In-Reply-To: References: Message-ID: <20170912194646.zvtnuc4qus4kkee6@python.ca> On 2017-09-12, Nathaniel Smith wrote: > If you're ok with replacing the object in sys.modules then the ability to > totally customize your module's type has existed since the dawn era. And if > you're not ok with that, then it's still existed since 3.5 via the > mechanism of assigning to __class__ to change the type in-place. It doesn't quite work though. Swapping out or assigning to __class__, and then running: exec(code, module.__dict__) does not have the expected behavior. LOAD_NAME/LOAD_GLOBAL does not care about your efforts. Accessing module globals from outside the module does work as that is a getattr call. That is a weird inconsistency that should be fixed if it is not too painful. Coming up with handy syntax or whatever is a minor problem. From lukasz at langa.pl Tue Sep 12 16:10:18 2017 From: lukasz at langa.pl (Lukasz Langa) Date: Tue, 12 Sep 2017 16:10:18 -0400 Subject: [Python-ideas] PEP 563: Postponed Evaluation of Annotations, first draft In-Reply-To: References: <1176DC03-ACB8-48F4-B8A0-02268B524BE4@langa.pl> <07A8117C-010D-4336-8591-0F3A15F2FCA3@langa.pl> Message-ID: > On Sep 11, 2017, at 4:21 PM, Yury Selivanov wrote: > > I'm one of those who used annotations for other purposes than type > hints. And even if annotations became strings in Python 3.7 *without > future import*, fixing my libraries would be easy -- just add an > eval(). > > That said, the PEP doesn't cover an alternative solution: > > 1. Add another special attribute to functions: __annotations_text__. > > 2. __annotations__ becomes a dynamic Mapping, which evaluates stuff > from __annotations_text__ *lazily*. > > 3. Recommend linters and IDEs to support "# pragma: annotations", as a > way to say that the Python files follows the new Python 3.7 > annotations semantics. > > That would maintain full backwards compatibility with all existing > Python libraries and would not require a future import. I'm not very thrilled about this because lazy evaluation is subject to the new scoping rules (can't use local state) and might give a different result than before. It's not backwards compatible. A __future__ import makes it obvious that behavior is going to be different. And lazy evaluation is an unnecessary step if `get_type_hints()` is used later on so it's unnecessary for the most common usage of annotations. Finally, I don't think we ever had a "# pragma" suggestion coming from CPython. In reality, people wouldn't bother putting it in most files so tools would have to assume that forward references are correct *and* avoid raising errors about invalid names used in annotations. This is loss of functionality. - ? -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 842 bytes Desc: Message signed with OpenPGP URL: From lukasz at langa.pl Tue Sep 12 16:11:14 2017 From: lukasz at langa.pl (Lukasz Langa) Date: Tue, 12 Sep 2017 16:11:14 -0400 Subject: [Python-ideas] PEP 563: Postponed Evaluation of Annotations, first draft In-Reply-To: References: <1176DC03-ACB8-48F4-B8A0-02268B524BE4@langa.pl> <07A8117C-010D-4336-8591-0F3A15F2FCA3@langa.pl> Message-ID: <774EF52B-8F25-41CF-B65F-F31003614A8D@langa.pl> > On Sep 12, 2017, at 5:38 AM, Ivan Levkivskyi wrote: > > In principle, I like this idea, this will save some keystrokes > and will make annotated code more "beautiful". But I am quite worried about the backwards > compatibility. One possible idea would be to use __future__ import without a definite > deprecation plan. This is not a viable strategy since __future__ is not designed to be a feature toggle but rather to be a gradual introduction of an upcoming breaking change. > If people will be fine with using typing.get_type_hints > (btw this is already the preferred way instead of directly accessing __annotations__, > according to PEP 526 at least) then we could go ahead with deprecation. As you're pointing out, people already have to use `typing.get_type_hints()`, otherwise they are already failing evaluation of existing forward references. Accessing __annotations__ directly in this context is a bug today. > Also I really like Yury's idea of dynamic mapping I responded to his idea under his post. - ? -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 842 bytes Desc: Message signed with OpenPGP URL: From ericsnowcurrently at gmail.com Tue Sep 12 16:46:25 2017 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Tue, 12 Sep 2017 13:46:25 -0700 Subject: [Python-ideas] PEP 554: Stdlib Module to Support Multiple Interpreters in Python Code In-Reply-To: References: Message-ID: On Thu, Sep 7, 2017 at 11:19 PM, Nathaniel Smith wrote: > On Thu, Sep 7, 2017 at 8:11 PM, Eric Snow wrote: >> My concern is that this is a chicken-and-egg problem. The situation >> won't improve until subinterpreters are more readily available. > > Okay, but you're assuming that "more libraries work well with > subinterpreters" is in fact an improvement. I'm asking you to convince > me of that :-). Are there people saying "oh, if only subinterpreters > had a Python API and less weird interactions with C extensions, I > could do "? So far they haven't exactly taken the > world by storm... The problem is that most people don't know about the feature. And even if they do, using it requires writing a C-extension, which most people aren't comfortable doing. >> Other than C globals, is there some other issue? > > That's the main one I'm aware of, yeah, though I haven't looked into it closely. Oh, good. I haven't missed something. :) Do you know how often subinterpreter support is a problem for users? I was under the impression from your earlier statements that this is a recurring issue but my understanding from mod_wsgi is that it isn't that common. >> I'm fine with Nick's idea about making this a "provisional" module. >> Would that be enough to ease your concern here? > > Potentially, yeah -- basically I'm fine with anything that doesn't end > up looking like python-dev telling everyone "subinterpreters are the > future! 
go forth and yell at any devs who don't support them!". Great! I'm also looking at the possibility of adding a mechanism for extension modules to opt out of subinterpreter support (using PEP 489 ModuleDef slots). However, I'd rather wait on that if making the PEP provisional is sufficient. > What do you think the criteria for graduating to non-provisional > status should be, in this case? Consensus among the (Dutch?) core devs that subinterpreters are worth keeping in the stdlib and that we've smoothed out any rough parts in the module. > I guess I would be much more confident in the possibilities here if > you could give: > > - some hand-wavy sketch for how subinterpreter A could call a function > that as originally defined in subinterpreter B without the GIL, which > seems like a precondition for sharing user-defined classes (Before I respond, note that this is way outside the scope of the PEP. The merit of subinterpreters extends beyond any benefits of running sans-GIL, though that is my main goal. I've been updating the PEP to (hopefully) better communicate the utility of subinterpreters.) Code objects are immutable so that part should be relatively straight-forward. There's the question of closures and default arguments that would have to be resolved. However, those are things that would need to be supported anyway in a world where we want to pass functions and user-defined types between interpreters. Doing so will be a gradual process of starting with immutable non-container builtin types and expanding out from there to other immutable types, including user-defined ones. Note that sharing mutable objects between interpreters would be a pretty advanced usage (i.e. opt-in shared state vs. threading's share-everything). If it proves desirable then we'd sort that out then. However, I don't see that as a more than an esoteric feature relative to subinterpreters. In my mind, the key advantage of being able to share more (immutable) objects, including user-defined types, between interpreters is in the optimization opportunities. It would allow us to avoid instantiating the same object in each interpreter. That said, the way I imagine it I wouldn't consider such an optimization to be very user-facing so it doesn't impact the PEP. The user-facing part would be the expanded set of immutable objects interpreters could pass back and forth, and expanding that set won't require any changes to the API in the PEP. > - some hand-wavy sketch for how refcounting will work for objects > shared between multiple subinterpreters without the GIL, without > majorly impacting single-thread performance (I actually forgot about > this problem in my last email, because PyPy has already solved this > part!) (same caveat as above) There are a number of approaches that may work. One is to give each interpreter its own allocator and GC. Another is to mark shared objects such that they never get GC'ed. Another is to allow objects to exist only in one interpreter at a time. Similarly, object ownership (per interpreter) could help. Asynchronous refcounting could be an option. That's only some of the possible approaches. I expect that at least one of them will be suitable. However, the first step is to get the multi-interpreter support out there. Then we can tackle the problem of optimization and multi-core utilization. FWIW, the biggest complexity is actually in synchronizing the sharing strategy across the inter-interpreter boundary (e.g. FIFO). 
We should expect the relative time spent passing objects between interpreters to be very small. So not only does that provide us will a good target for our refcount resolving strategy, we can afford some performance wiggle room in that solution. (again, we're looking way ahead here) > Thanks for attempting such an ambitious project :-). Hey, I'm learning a lot and feel like every step along the way is making Python better in some stand-alone way. :) -eric From ericsnowcurrently at gmail.com Tue Sep 12 16:48:54 2017 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Tue, 12 Sep 2017 13:48:54 -0700 Subject: [Python-ideas] PEP 554: Stdlib Module to Support Multiple Interpreters in Python Code In-Reply-To: <0A6FC422-5CF0-4E82-86B5-73693D5D1661@mac.com> References: <0A6FC422-5CF0-4E82-86B5-73693D5D1661@mac.com> Message-ID: Yep. See http://bugs.python.org/issue10915 and http://bugs.python.org/issue15751. The issue of C-extension support for subinterpreters is, of course, a critical one here. At the very least, incompatible modules should be able to opt out of subinterpreter support. I've updated the PEP to discuss this. -eric On Sun, Sep 10, 2017 at 3:18 AM, Ronald Oussoren wrote: > >> On 8 Sep 2017, at 05:11, Eric Snow wrote: > >> On Thu, Sep 7, 2017 at 3:48 PM, Nathaniel Smith wrote: >> >>> Numpy is the one I'm >>> most familiar with: when we get subinterpreter bugs we close them >>> wontfix, because supporting subinterpreters properly would require >>> non-trivial auditing, add overhead for non-subinterpreter use cases, >>> and benefit a tiny tiny fraction of our users. >> >> The main problem of which I'm aware is C globals in libraries and >> extension modules. PEPs 489 and 3121 are meant to help but I know >> that there is at least one major situation which is still a blocker >> for multi-interpreter-safe module state. Other than C globals, is >> there some other issue? > > There?s also the PyGilState_* API that doesn't support multiple interpreters. > > The issue there is that callbacks from external libraries back into python > need to use the correct subinterpreter. > > Ronald From ericsnowcurrently at gmail.com Tue Sep 12 16:51:11 2017 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Tue, 12 Sep 2017 13:51:11 -0700 Subject: [Python-ideas] PEP 554: Stdlib Module to Support Multiple Interpreters in Python Code In-Reply-To: References: Message-ID: On Sun, Sep 10, 2017 at 7:52 AM, Koos Zevenhoven wrote: > I assume the concept of a main interpreter is inherited from the previous > levels of support in the C API, but what exactly is the significance of > being "the main interpreter"? Instead, could they just all be > subinterpreters of the same Python process (or whatever the right wording > would be)? > > It might also be helpful if the PEP had a short description of what are > considered subinterpreters and how they differ from threads of the same > interpreter [*]. Currently, the PEP seems to rely heavily on knowledge of > the previously available concepts. However, as this would be a new module, I > don't think there's any need to blindly copy the previous design, regardless > of how well the design may have served its purpose at the time. I've updated the PEP to be more instructive. I've also dropped the "get_main()" function from the PEP. 
-eric From ericsnowcurrently at gmail.com Tue Sep 12 16:54:07 2017 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Tue, 12 Sep 2017 13:54:07 -0700 Subject: [Python-ideas] PEP 554: Stdlib Module to Support Multiple Interpreters in Python Code In-Reply-To: <20170910211400.4a829c79@fsol> References: <20170910211400.4a829c79@fsol> Message-ID: On Sun, Sep 10, 2017 at 12:14 PM, Antoine Pitrou wrote: > What could improve performance significantly would be to share objects > without any form of marshalling; but it's not obvious it's possible in > the subinterpreters model *if* it also tries to remove the GIL. Yep. This is one of the main challenges relative to the goal of fully utilizing multiple cores. -eric From ethan at ethanhs.me Tue Sep 12 19:43:07 2017 From: ethan at ethanhs.me (Ethan Smith) Date: Tue, 12 Sep 2017 16:43:07 -0700 Subject: [Python-ideas] PEP 561 v2 - Packaging Static Type Information Message-ID: Hello, V2 of my PEP on packaging type information is available at https://www.python.org/dev/peps/pep-0561/. It is also replicated below. I look forward to any suggestions or comments that people may have! Thanks Ethan ----------------------------------------------------------------------- PEP: 561 Title: Distributing and Packaging Type Information Author: Ethan Smith Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 09-Sep-2017 Python-Version: 3.7 Post-History: Abstract ======== PEP 484 introduced type hints to Python, with goals of making typing gradual and easy to adopt. Currently, typing information must be distributed manually. This PEP provides a standardized means to package and distribute type information and an ordering for type checkers to resolve modules and collect this information for type checking using existing packaging architecture. Rationale ========= Currently, package authors wish to distribute code that has inline type information. However, there is no standard method to distribute packages with inline type annotations or syntax that can simultaneously be used at runtime and in type checking. Additionally, if one wished to ship typing information privately the only method would be via setting ``MYPYPATH`` or the equivalent to manually point to stubs. If the package can be released publicly, it can be added to typeshed [1]_. However, this does not scale and becomes a burden on the maintainers of typeshed. Additionally, it ties bugfixes to releases of the tool using typeshed. PEP 484 has a brief section on distributing typing information. In this section [2]_ the PEP recommends using ``shared/typehints/pythonX.Y/`` for shipping stub files. However, manually adding a path to stub files for each third party library does not scale. The simplest approach people have taken is to add ``site-packages`` to their ``MYPYPATH``, but this causes type checkers to fail on packages that are highly dynamic (e.g. sqlalchemy and Django). Specification ============= There are several motivations and methods of supporting typing in a package. This PEP recognizes three (3) types of packages that may be created: 1. The package maintainer would like to add type information inline. 2. The package maintainer would like to add type information via stubs. 3. A third party would like to share stub files for a package, but the maintainer does not want to include them in the source of the package. This PEP aims to support these scenarios and make them simple to add to packaging and deployment. 
The two major parts of this specification are the packaging specifications and the resolution order for resolving module type information. This spec is meant to replace the ``shared/typehints/pythonX.Y/`` spec of PEP 484 [2]_. New third party stub libraries are encouraged to distribute stubs via the third party packaging proposed in this PEP in place of being added to typeshed. Typeshed will remain in use, but if maintainers are found, third party stubs in typeshed are encouraged to be split into their own package. Packaging Type Information -------------------------- Packages must opt into supporting typing. This will be done though a distutils extension [3]_, providing a ``typed`` keyword argument to the distutils ``setup()`` command. The argument value will depend on the kind of type information the package provides. The new keyword will be added to Python 3.7 and will also be accessible in the ``typing_extensions`` package though a distutils extension. This enables a package maintainer to write :: setup( ... setup_requires=["typing_extensions"], typed="inline", ... ) Inline Typed Packages ''''''''''''''''''''' Packages that have inline type annotations simply have to pass the value ``"inline"`` to the ``typed`` argument in ``setup()``. Stub Only Packages '''''''''''''''''' For package maintainers wishing to ship stub files containing all of their type information, it is prefered that the ``*.pyi`` stubs are alongside the corresponding ``*.py`` files. However, the stubs may be put in a sub-folder of the Python sources, with the same name the ``*.py`` files are in. For example, the ``flyingcircus`` package would have its stubs in the folder ``flyingcircus/flyingcircus/``. This path is chosen so that if stubs are not found in ``flyingcircus/`` the type checker may treat the subdirectory as a normal package. The normal resolution order of checking ``*.pyi`` before ``*.py`` will be maintained. The value of the ``typed`` argument to ``setup()`` is ``"stubs"`` for this type of distribution. The author of the package is suggested to use ``package_data`` to assure the stub files are installed alongside the runtime Python code. Third Party Stub Packages ''''''''''''''''''''''''' Third parties seeking to distribute stub files are encouraged to contact the maintainer of the package about distribution alongside the package. If the maintainer does not wish to maintain or package stub files or type information inline, then a "third party stub package" should be created. The structure is similar, but slightly different from that of stub only packages. If the stubs are for the library ``flyingcircus`` then the package should be named ``flyingcircus-stubs`` and the stub files should be put in a sub-directory named ``flyingcircus``. This allows the stubs to be checked as if they were in a regular package. These packages should also pass ``"stubs"`` as the value of ``typed`` argument in ``setup()``. These packages are suggested to use ``package_data`` to package stub files. In addition, the package should indicate which version(s) of the runtime package are supported via the ``install_requires`` argument to ``setup()``. Type Checker Module Resolution Order ------------------------------------ The following is the order that type checkers supporting this PEP should resolve modules containing type information: 1. User code - the files the type checker is running on. 2. Stubs or Python source manually put in the beginning of the path. 
Type checkers should provide this to allow the user complete control of which stubs to use, and patch broken stubs/inline types from packages. 3. Third party stub packages - these packages can supersede the installed untyped packages. They can be found at ``pkg-stubs`` for package ``pkg``, however it is encouraged to check their metadata to confirm that they opt into type checking via the ``typed`` keyword. 4. Inline packages - finally, if there is nothing overriding the installed package, and it opts into type checking. 5. Typeshed (if used) - Provides the stdlib types and several third party libraries When resolving step (3) type checkers should assure the version of the stubs is compatible with the installed runtime package through the method described above. Type checkers that check a different Python version than the version they run on must find the type information in the ``site-packages``/``dist-packages`` of that Python version. This can be queried e.g. ``pythonX.Y -c 'import sys; print(sys.exec_prefix)'``. It is also recommended that the type checker allow for the user to point to a particular Python binary, in case it is not in the path. To check if a package has opted into type checking, type checkers are recommended to use the ``pkg_resources`` module to query the package metadata. If the ``typed`` package metadata has ``None`` as its value, the package has not opted into type checking, and the type checker should skip that package. References ========== .. [1] Typeshed (https://github.com/python/typeshed) .. [2] PEP 484, Storing and Distributing Stub Files (https://www.python.org/dev/peps/pep-0484/#storing-and-distributing-stub-files) .. [3] Distutils Extensions, Adding setup() arguments (http://setuptools.readthedocs.io/en/latest/setuptools.html#adding-setup-arguments) Copyright ========= This document has been placed in the public domain. .. Local Variables: mode: indented-text indent-tabs-mode: nil sentence-end-double-space: t fill-column: 70 coding: utf-8 End: -------------- next part -------------- An HTML attachment was scrubbed... URL: From ericsnowcurrently at gmail.com Tue Sep 12 19:46:25 2017 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Tue, 12 Sep 2017 16:46:25 -0700 Subject: [Python-ideas] sys.py Message-ID: The sys module is a rather special case as far as modules go. It is effectively a "console" into the interpreter's internal state and that includes some mutable state. Since it is a module, we don't have much of an opportunity to: * validate values assigned to its attributes [1] * issue DeprecationWarning for deprecated attrs [2] * alias attrs [2] * replace get (and get/set) functions with properties * re-organize sys [3] One possible solution I've been toying with for quite a while [2] is to rename the current sys module "_sys" and then add Lib/sys.py. The new module would proxy the old one (for backward-compatibility), but also allow us to do all of the above (e.g. validate sys.modules). I implemented this a few weeks ago: https://github.com/ericsnowcurrently/cpython/tree/sys-module (for the sake of comparison: https://github.com/ericsnowcurrently/cpython/pull/2) It uses the trick of replacing itself in sys.modules (though it could just as well set __class__). The only problem I've encountered is code that uses "type(sys)" to get types.ModuleType. Under my branch "type(sys)" returns the ModuleType subclass. Otherwise everything looks fine. Thoughts? 
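For concreteness, a stripped-down sketch of the shape of the idea (not the
actual branch; it assumes the C module is reachable as _sys, shows only one
validated attribute, and glosses over write-forwarding and bootstrapping):

    # Lib/sys.py -- sketch only
    import _sys

    class _SysModule(type(_sys)):
        def __getattr__(self, name):
            # Anything not handled explicitly falls through to the
            # real module.
            return getattr(_sys, name)

        @property
        def modules(self):
            return _sys.modules

        @modules.setter
        def modules(self, value):
            # validation hook, cf. footnote [1]
            if not isinstance(value, dict):
                raise TypeError("sys.modules must be a dict")
            _sys.modules = value

    _sys.modules[__name__].__class__ = _SysModule

The same property mechanism gives a natural place to hang the
DeprecationWarning and aliasing behaviour mentioned in [2].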
-eric [1] I ran into this recently: https://bugs.python.org/issue31404 [2] I have plans for a low-level encapsulation of the import state and a high-level API for the current import machinery as a whole. As part of that I'd like to be able to deprecate the current import state attrs (e.g. sys.modules) and make them look up values from sys.importstate. [3] This has come up before, including https://www.python.org/dev/peps/pep-3139/. Also PEP 432 implies a clearer structure for sys. From contact at ionelmc.ro Tue Sep 12 19:49:57 2017 From: contact at ionelmc.ro (=?UTF-8?Q?Ionel_Cristian_M=C4=83rie=C8=99?=) Date: Tue, 12 Sep 2017 23:49:57 +0000 Subject: [Python-ideas] PEP 562 In-Reply-To: References: Message-ID: On Tue, Sep 12, 2017 at 10:32 PM Nathaniel Smith wrote: > If you're ok with replacing the object in sys.modules then the ability to > totally customize your module's type has existed since the dawn era. > I'm down with that. Just make it easier, mucking with sys.modules ain't a walk in a park, and there's the boilerplate and the crazy issues with interpreter shutdown. -- Thanks, -- Ionel Cristian M?rie?, http://blog.ionelmc.ro -------------- next part -------------- An HTML attachment was scrubbed... URL: From ericsnowcurrently at gmail.com Tue Sep 12 22:32:40 2017 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Tue, 12 Sep 2017 19:32:40 -0700 Subject: [Python-ideas] lazy import via __future__ or compiler analysis In-Reply-To: <20170911203223.2kajjuzybtkltewr@python.ca> References: <20170907174402.ocdy3zumzdfj7xg6@python.ca> <20170908163604.xxp4hd7idi3eboum@python.ca> <20170911180927.hlgtzaph56mygsco@python.ca> <20170911200345.dajb7gpac2ehwvgg@python.ca> <20170911203223.2kajjuzybtkltewr@python.ca> Message-ID: On Sep 11, 2017 2:32 PM, "Neil Schemenauer" wrote: On 2017-09-11, Neil Schemenauer wrote: > A module can be a singleton instance of a singleton ModuleType > instance. Maybe more accurate to say each module would have its own unique __class__ associated with it. So, you can add properties to the class without affecting other modules. For backwards compatibility, we can create anonymous modules as needed if people are passing 'dict' objects to the legacy API. FYI, you should be able to try this out using a custom loader the implements a create_module() method. See importlib.abc.Finder. -eric -------------- next part -------------- An HTML attachment was scrubbed... URL: From ericsnowcurrently at gmail.com Tue Sep 12 22:48:00 2017 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Tue, 12 Sep 2017 19:48:00 -0700 Subject: [Python-ideas] LOAD_NAME/LOAD_GLOBAL should be use getattr() In-Reply-To: References: <20170912161708.x3mnxmrtbd26hsvi@python.ca> Message-ID: On Sep 12, 2017 10:17 AM, "Neil Schemenauer" wrote: Introducing another special feature of modules to make this work is not the solution, IMHO. We should make module namespaces be more like instance namespaces. We already have a mechanism and it is getattr on objects. +1 - importlib needs to be fixed to pass modules to exec() and not dicts. From my initial experiments, it looks like importlib gets a lot simpler. Right now we pass around dicts in a lot of places and then have to grub around in sys.modules to get the module object, which is what importlib usually wants. Without looking at the importlib code, passing around modules should mostly be fine. There is some semantic trickiness involving sys.modules, but it shouldn't be too bad to work around. 
I have requested help in writing a PEP for this idea but so far no one is foolish enough to join my crazy endeavor. ;-) Yeah, good luck! :). If I weren't otherwise occupied with my own crazy endeavor I'd lend a hand. -eric -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Tue Sep 12 23:14:13 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 13 Sep 2017 13:14:13 +1000 Subject: [Python-ideas] PEP 554: Stdlib Module to Support Multiple Interpreters in Python Code In-Reply-To: References: Message-ID: On 13 September 2017 at 00:35, Koos Zevenhoven wrote: > On Tue, Sep 12, 2017 at 1:40 PM, Nick Coghlan wrote: >> >> On 11 September 2017 at 18:02, Koos Zevenhoven wrote: >> > On Mon, Sep 11, 2017 at 8:32 AM, Nick Coghlan >> > wrote: >> >> The line between it and the "CPython Runtime" is fuzzy for both >> >> practical and historical reasons, but the regular Python CLI will >> >> always have a "first created, last destroyed" main interpreter, simply >> >> because we don't really gain anything significant from eliminating it >> >> as a concept. >> > >> > I fear that emphasizing the main interpreter will lead to all kinds of >> > libraries/programs that somehow unnecessarily rely on some or all tasks >> > being performed in the main interpreter. Then you'll have a hard time >> > running two of them in parallel in the same process, because you don't >> > have >> > two main interpreters. >> >> You don't need to fear this scenario, since it's a description of the >> status quo (and it's the primary source of overstated claims about >> subinterpreters being "fundamentally broken"). >> > > Well, if that's true, it's hardly a counter-argument to what I said. Anyway, > there is no status quo about what is proposed in the PEP. Yes, there is, since subinterpreters are an existing feature of the CPython implementation. What's new in the PEP is the idea of giving that feature a Python level API so that it's available to regular Python programs, rather than only being available to embedding applications that choose to use it (e.g. mod_wsgi). > And as long as the existing APIs are preserved, why not make the new one > less susceptible to overstated fundamental brokenness? Having a privileged main interpreter isn't fundamentally broken, since you aren't going to run __main__ in more than one interpreter, just as you don't run __main__ in more than one thread (and multiprocessing deliberately avoids running the "if __name__ == '__main__'" sections of it in more than one process). >> So no, not everything will be subinterpreter-friendly, just as not >> everything in Python is thread-safe, and not everything is portable >> across platforms. > > I don't see how the situation benefits from calling something the "main > interpreter". Subinterpreters can be a way to take something non-thread-safe > and make it thread-safe, because in an interpreter-per-thread scheme, most > of the state, like module globals, are thread-local. (Well, this doesn't > help for async concurrency, but anyway.) "The interpreter that runs __main__" is never going to go away as a concept for the regular CPython CLI. Right now, its also a restriction even for applications like mod_wsgi, since the GIL state APIs always register C created threads with the main interpreter. 
>> That's OK - it just means we'll aim to make as many >> things as possible implicitly subinterpreter-friendly, and for >> everything else, we'll aim to minimise the adjustments needed to >> *make* things subinterpreter friendly. >> > > And that's exactly what I'm after here! No, you're after deliberately making the proposed API non-representative of how the reference implementation actually works because of a personal aesthetic preference rather than asking yourself what the practical benefit of hiding the existence of the main interpreter would be. The fact is that the main interpreter *is* special (just as the main thread is special), and your wishing that things were otherwise won't magically make it so. > I'm mostly just worried about the `get_main()` function. Maybe it should be > called `asdfjaosjnoijb()`, so people wouldn't use it. Can't the first > running interpreter just introduce itself to its children? And if that's too > much to ask, maybe there could be a `get_parent()` function, which would > give you the interpreter that spawned the current subinterpreter. If the embedding application never calls "_Py_ConfigureMainInterpreter", then get_main() could conceivably return None. However, we don't expose that as a public API yet, so for the time being, Py_Initialize() will always call it, and hence there will always be a main interpreter (even in things like mod_wsgi). Whether we invest significant effort in making configuring the main interpreter genuinely optional is still an open question - since most applications are free to just not use the main interpreter for code execution if they don't want to, we haven't found a real world use case that would benefit meaningfully from its non-existence (just as the vast majority of applications don't care about the various ways in which the main thread that runs Py_Initialize() and Py_Finalize() is given special treatment, and for those that do, they're free to avoid using it). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From nas-python-ideas at arctrix.com Tue Sep 12 23:21:17 2017 From: nas-python-ideas at arctrix.com (Neil Schemenauer) Date: Tue, 12 Sep 2017 21:21:17 -0600 Subject: [Python-ideas] LOAD_NAME/LOAD_GLOBAL should be use getattr() In-Reply-To: References: <20170912161708.x3mnxmrtbd26hsvi@python.ca> Message-ID: <20170913032117.xddg35mpzc2wnsji@python.ca> On 2017-09-12, Eric Snow wrote: > Yeah, good luck! :). If I weren't otherwise occupied with my own crazy > endeavor I'd lend a hand. No problem. It makes sense to have a proof of concept before spending time on a PEP. If the idea breaks too much old code it is not going to happen. So, I will work on a slow but mostly compatible implementation for now. Regards, Neil From songofacandy at gmail.com Tue Sep 12 23:24:31 2017 From: songofacandy at gmail.com (INADA Naoki) Date: Wed, 13 Sep 2017 12:24:31 +0900 Subject: [Python-ideas] LOAD_NAME/LOAD_GLOBAL should be use getattr() In-Reply-To: <20170912161708.x3mnxmrtbd26hsvi@python.ca> References: <20170912161708.x3mnxmrtbd26hsvi@python.ca> Message-ID: I'm worring about performance much. Dict has ma_version from Python 3.6 to be used for future optimization including global caching. Adding more abstraction layer may make it difficult. When considering lazy loading, big problem is backward compatibility. 
For example, see https://github.com/python/cpython/blob/master/Lib/concurrent/futures/__init__.py from concurrent.futures._base import (FIRST_COMPLETED, FIRST_EXCEPTION, ALL_COMPLETED, CancelledError, TimeoutError, Future, Executor, wait, as_completed) from concurrent.futures.process import ProcessPoolExecutor from concurrent.futures.thread import ThreadPoolExecutor Asyncio must import concurrent.futures.Future because compatibility between asyncio.Future and concurrent.futures.Future. But not all asyncio applications need ProcessPoolExecutor. Thay may use only ThreadPoolExecutor. Currently, they are forced to import concurrent.futures.process, and it imports multiprocessing. It makes large import dependency tree. To solve such problem, hooking LOAD_GLOBAL is not necessary. # in concurrent/futures/__init__.py def __getattr__(name): if name == 'ProcessPoolExecutor': global ProcessPoolExecutor from .process import ProcessPoolExecutor return ProcessPoolExecutor # Following code should call __getattr__ from concurrent.futures import ProcessPoolExecutor # eager loading import concurrent.futures as futures executor = futures.ProcessPoolExecutor() # lazy loading On the other hand, lazy loading global is easier than above. For example, linecache imports tokenize and tokenize is relatively heavy. https://github.com/python/cpython/blob/master/Lib/linecache.py#L11 tokenize is used from only one place (in linecache.updatecache()). So lazy importing it is just moving `import tokenize` into the function. try: import tokenize with tokenize.open(fullname) as fp: lines = fp.readlines() I want to lazy load only for heavy and rarely used module Lazy loading many module may make execution order unpredictable. So manual lazy loading technique is almost enough to me. Then, what is real world requirement about abstraction layer to LOAD_GLOBAL? Regards, INADA Naoki On Wed, Sep 13, 2017 at 1:17 AM, Neil Schemenauer wrote: > This is my idea of making module properties work. It is necessary > for various lazy-loading module ideas and it cleans up the language > IMHO. I think it may be possible to do it with minimal backwards > compatibility problems and performance regression. > > To me, the main issue with module properties (or module __getattr__) > is that you introduce another level of indirection on global > variable access. Anywhere the module.__dict__ is used as the > globals for code execution, changing LOAD_NAME/LOAD_GLOBAL to have > another level of indirection is necessary. That seems inescapable. > > Introducing another special feature of modules to make this work is > not the solution, IMHO. We should make module namespaces be more > like instance namespaces. We already have a mechanism and it is > getattr on objects. > > I have a very early prototype of this idea. See: > > https://github.com/nascheme/cpython/tree/exec_mod > > Issues to be resolved: > > - __namespace__ entry in the __dict__ creates a reference cycle. > Maybe could use a weakref somehow to avoid it. Maybe we just > explicitly break it. > > - getattr() on the module may return things that LOAD_NAME and > LOAD_GLOBAL don't expect (e.g. things from the module type). I > need to investigate that. > > - Need to fix STORE_* opcodes to do setattr() rather than > __setitem__. > > - Need to optimize the implementation. Maybe the module instance > can know if any properties or __getattr__ are defined. If no, > have __getattribute__ grab the variable directly from md_dict. > > - Need to fix eval() to allow module as well as dict. 
> > - Need to change logic where global dict is passed around. Pass the > module instead so we don't have to keep retrieving __namespace__. > For backwards compatibility, need to keep functions that take > 'globals' as dict and use PyModule_GetDict() on public APIs that > return globals as a dict. > > - interp->builtins should be a module, not a dict. > > - module shutdown procedure needs to be investigated and fixed. I > think it may get simpler. > > - importlib needs to be fixed to pass modules to exec() and not > dicts. From my initial experiments, it looks like importlib gets > a lot simpler. Right now we pass around dicts in a lot of places > and then have to grub around in sys.modules to get the module > object, which is what importlib usually wants. > > I have requested help in writing a PEP for this idea but so far no > one is foolish enough to join my crazy endeavor. ;-) > > Regards, > > Neil > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From ncoghlan at gmail.com Tue Sep 12 23:30:48 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 13 Sep 2017 13:30:48 +1000 Subject: [Python-ideas] PEP 561 v2 - Packaging Static Type Information In-Reply-To: References: Message-ID: On 13 September 2017 at 09:43, Ethan Smith wrote: > The two major parts of this specification are the packaging specifications > and the resolution order for resolving module type information. This spec > is meant to replace the ``shared/typehints/pythonX.Y/`` spec of PEP 484 > [2]_. There are a lot of packaging tools in use other than distutils, so I don't think the distutils update proposal belongs in the PEP. Rather, the PEP should focus on defining how type analysers should search for typing information, and then updating packaging tools to help with that can be treated as separate RFEs for each of the publishing tools that people use (perhaps with a related task-oriented guide on packaging.python.org) > Type Checker Module Resolution Order > ------------------------------------ > > The following is the order that type checkers supporting this PEP should > resolve modules containing type information: > > 1. User code - the files the type checker is running on. > > 2. Stubs or Python source manually put in the beginning of the path. Type > checkers should provide this to allow the user complete control of which > stubs to use, and patch broken stubs/inline types from packages. > > 3. Third party stub packages - these packages can supersede the installed > untyped packages. They can be found at ``pkg-stubs`` for package ``pkg``, > however it is encouraged to check their metadata to confirm that they opt > into type checking via the ``typed`` keyword. > 4. Inline packages - finally, if there is nothing overriding the installed > package, and it opts into type checking. > > 5. Typeshed (if used) - Provides the stdlib types and several third party > libraries I'm not clear on how this actually differs from the existing search protocol in PEP 484, since step 3 is exactly what the 'shared/typehints/pythonX.Y' directory is intended to cover. Is it just a matter allowing the use of "-stubs" as the typehint installation directory, since installing under a different package name is easier to manage using existing publishing tools than installing to a different target directory? Cheers, Nick. 
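Concretely, allowing "-stubs" as the installation location would amount to a lookup along these lines on the type checker's side (an illustration only, not wording from the draft):

    import os

    def stub_path(site_packages, module):
        # For "pkg.sub", prefer site-packages/pkg-stubs/sub.pyi (or a stub
        # package directory with __init__.pyi) over the runtime package.
        top, *rest = module.split(".")
        candidate = os.path.join(site_packages, top + "-stubs", *rest)
        for path in (os.path.join(candidate, "__init__.pyi"), candidate + ".pyi"):
            if os.path.exists(path):
                return path
        return None  # fall through to inline types or typeshed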
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Tue Sep 12 23:35:20 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 13 Sep 2017 13:35:20 +1000 Subject: [Python-ideas] sys.py In-Reply-To: References: Message-ID: On 13 September 2017 at 09:46, Eric Snow wrote: > The sys module is a rather special case as far as modules go. It is > effectively a "console" into the interpreter's internal state and that > includes some mutable state. Since it is a module, we don't have much > of an opportunity to: > > * validate values assigned to its attributes [1] > * issue DeprecationWarning for deprecated attrs [2] > * alias attrs [2] > * replace get (and get/set) functions with properties > * re-organize sys [3] +1 from me, specifically because there are edge cases we don't generally test (e.g. folks rebinding sys.modules to nonsense), and it would be nice to be able to upgrade those from "don't do that" to "the obvious way of doing that just plain isn't allowed". Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Tue Sep 12 23:44:23 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 13 Sep 2017 13:44:23 +1000 Subject: [Python-ideas] LOAD_NAME/LOAD_GLOBAL should be use getattr() In-Reply-To: <20170912161708.x3mnxmrtbd26hsvi@python.ca> References: <20170912161708.x3mnxmrtbd26hsvi@python.ca> Message-ID: On 13 September 2017 at 02:17, Neil Schemenauer wrote: > Introducing another special feature of modules to make this work is > not the solution, IMHO. We should make module namespaces be more > like instance namespaces. We already have a mechanism and it is > getattr on objects. One thing to keep in mind is that class instances *also* allow their attribute access machinery to be bypassed by writing to the instance.__dict__ directly - it's just that the instance dict may be bypassed on lookup for data descriptors. So that means we wouldn't need to change the way globals() works - we'd just add the caveat that amendments made that way may be ignored for things defined as properties. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From njs at pobox.com Wed Sep 13 00:10:12 2017 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 12 Sep 2017 21:10:12 -0700 Subject: [Python-ideas] PEP 554: Stdlib Module to Support Multiple Interpreters in Python Code In-Reply-To: References: Message-ID: On Tue, Sep 12, 2017 at 1:46 PM, Eric Snow wrote: > On Thu, Sep 7, 2017 at 11:19 PM, Nathaniel Smith wrote: >> On Thu, Sep 7, 2017 at 8:11 PM, Eric Snow wrote: >>> My concern is that this is a chicken-and-egg problem. The situation >>> won't improve until subinterpreters are more readily available. >> >> Okay, but you're assuming that "more libraries work well with >> subinterpreters" is in fact an improvement. I'm asking you to convince >> me of that :-). Are there people saying "oh, if only subinterpreters >> had a Python API and less weird interactions with C extensions, I >> could do "? So far they haven't exactly taken the >> world by storm... > > The problem is that most people don't know about the feature. And > even if they do, using it requires writing a C-extension, which most > people aren't comfortable doing. > >>> Other than C globals, is there some other issue? >> >> That's the main one I'm aware of, yeah, though I haven't looked into it closely. > > Oh, good. I haven't missed something. :) Do you know how often > subinterpreter support is a problem for users? 
I was under the > impression from your earlier statements that this is a recurring issue > but my understanding from mod_wsgi is that it isn't that common. It looks like we've been averaging one bug report every ~6 months for the last 3 years: https://github.com/numpy/numpy/issues?utf8=%E2%9C%93&q=is%3Aissue%20subinterpreter%20OR%20subinterpreters They mostly come from Jep, not mod_wsgi. (Possibly because Jep has some built-in numpy integration.) I don't know how many people file bugs versus just living with it or finding some workaround. I suspect for mod_wsgi in particular they probably switch to something else -- it's not like there's any shortage of WSGI servers that avoid these problems. And for Jep there are prominent warnings to expect problems and suggesting workarounds: https://github.com/ninia/jep/wiki/Workarounds-for-CPython-Extensions >> I guess I would be much more confident in the possibilities here if >> you could give: >> >> - some hand-wavy sketch for how subinterpreter A could call a function >> that as originally defined in subinterpreter B without the GIL, which >> seems like a precondition for sharing user-defined classes > > (Before I respond, note that this is way outside the scope of the PEP. > The merit of subinterpreters extends beyond any benefits of running > sans-GIL, though that is my main goal. I've been updating the PEP to > (hopefully) better communicate the utility of subinterpreters.) Subinterpreters are basically an attempt to reimplement the OS's process isolation in user-space, right? Classic trade-off where we accept added complexity and fragility in the hopes of gaining some speed? I just looked at the PEP again, and I'm afraid I still don't understand what the benefits are unless we can remove the GIL and somehow get a speedup over processes. Implementing CSP is a neat idea, but you could do it with subprocesses too. AFAICT you could implement the whole subinterpreters module API with subprocesses on 3.6, and it'd be multi-core and have perfect extension module support. > Code objects are immutable so that part should be relatively > straight-forward. There's the question of closures and default > arguments that would have to be resolved. However, those are things > that would need to be supported anyway in a world where we want to > pass functions and user-defined types between interpreters. Doing so > will be a gradual process of starting with immutable non-container > builtin types and expanding out from there to other immutable types, > including user-defined ones. I tried arguing that code objects were immutable to the PyPy devs too :-). The problem is that to call a function you need both its __code__, which is immutable, and its __globals__, which is emphatically not. The __globals__ thing means that if you start from an average function you can often follow pointers to reach every other global object (e.g. if the function uses regular expressions, you can probably reach any module by doing func.__globals__["re"].sys.modules[...]). You might hope that you could somehow restrict this, but I can't think of any way that's really useful :-(. > > Note that sharing mutable objects between interpreters would be a > pretty advanced usage (i.e. opt-in shared state vs. threading's > share-everything). If it proves desirable then we'd sort that out > then. However, I don't see that as a more than an esoteric feature > relative to subinterpreters. 
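Nathaniel's point about __globals__ is easy to reproduce in a few lines of ordinary CPython, no subinterpreters required:

    counter = 0

    def bump():
        global counter
        counter += 1
        return counter

    ns = bump.__globals__
    assert ns is globals()     # the function carries the module's live namespace
    ns["counter"] = 41
    assert bump() == 42        # a mutation through one reference is visible to all

Any scheme that shares the function between interpreters therefore shares this mutable dict as well, which is exactly the race and corruption concern raised above.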
> > In my mind, the key advantage of being able to share more (immutable) > objects, including user-defined types, between interpreters is in the > optimization opportunities. But even if we can add new language features for "freezing" user-defined objects, then their .__class__ will still be mutable, their methods will still have mutable .__globals__, etc. And even if we somehow made it so their methods only got a read-only view on the original interpreter's state, that still wouldn't protect against race conditions and memory corruption, because the original interpreter might be mutating things while the subinterpreter is looking at them. > It would allow us to avoid instantiating > the same object in each interpreter. That said, the way I imagine it > I wouldn't consider such an optimization to be very user-facing so it > doesn't impact the PEP. The user-facing part would be the expanded > set of immutable objects interpreters could pass back and forth, and > expanding that set won't require any changes to the API in the PEP. > >> - some hand-wavy sketch for how refcounting will work for objects >> shared between multiple subinterpreters without the GIL, without >> majorly impacting single-thread performance (I actually forgot about >> this problem in my last email, because PyPy has already solved this >> part!) > > (same caveat as above) > > There are a number of approaches that may work. One is to give each > interpreter its own allocator and GC. This makes sense to me if the subinterpreters aren't going to share any references, but obviously that wasn't the question :-). And I don't see how this makes it easier to work with references that cross between different GC domains. If anything it seems like it would make things harder. > Another is to mark shared > objects such that they never get GC'ed. I don't think just leaking everything all the time is viable :-(. And even so this requires traversing the whole object reference graph on every communication operation, which defeats the purpose; the whole idea here was to find something that doesn't have to walk the object graph, because that's what makes pickle slow. > Another is to allow objects > to exist only in one interpreter at a time. Yeah, like rust -- very neat if we can do it! If I want you to give you this object, I can't have it anymore myself. But... I haven't been able to think of any way we could actually enforce this efficiently. When passing an object between subinterpreters you could require that the root object has refcount 1, and then do like a little mini-mark-and-sweep to make sure all the objects reachable from it only have references that are within the local object graph. But then we're traversing the object graph again. This would also require new syntax, because you need something like a simultaneous send-and-del, and you can't fake a del with a function call. (It also adds another extension module incompatibility, because it would require every extension type to implement tp_traverse -- previously this was only mandatory for objects that could indirectly reference themselves. But maybe the extension type thing doesn't matter because this would have to be restricted to builtin immutable types anyway, as per above.) > Similarly, object ownership (per interpreter) could help. How? > Asynchronous refcounting could be an option. Right, like in the GILectomy branch. 
This is the only potentially viable solution I can think of (short of dropping refcounting altogether, which I think is how most multi-core languages solve this, and why PyPy has a head start on GIL removal). Do we know yet how much single-thread overhead this adds? It makes me nervous too -- a lot of the attraction of subinterpreters is that the minimal shared state is supposed to make GIL removal easier and less risky than if we were attempting a full GILectomy. But in this case I don't see how to exploit their isolation at all. > That's only some of the possible approaches. I > expect that at least one of them will be suitable. The reason I'm pushing on this is exactly because I don't expect that; I think it's very likely that you'll spend a bunch of time on the fun easier parts, and then discover that actually the hard parts are impossible. If you want me to shut up and leave you to it, say the word :-). > However, the first > step is to get the multi-interpreter support out there. Then we can > tackle the problem of optimization and multi-core utilization. > > FWIW, the biggest complexity is actually in synchronizing the sharing > strategy across the inter-interpreter boundary (e.g. FIFO). We should > expect the relative time spent passing objects between interpreters to > be very small. So not only does that provide us will a good target > for our refcount resolving strategy, we can afford some performance > wiggle room in that solution. (again, we're looking way ahead here) But if the only advantage of subinterpreters over subprocesses is that the communication costs are lower, then surely you should be targeting cases where the communication costs are high? -n -- Nathaniel J. Smith -- https://vorpus.org From guido at python.org Wed Sep 13 00:30:54 2017 From: guido at python.org (Guido van Rossum) Date: Tue, 12 Sep 2017 21:30:54 -0700 Subject: [Python-ideas] sys.py In-Reply-To: References: Message-ID: I find this a disturbing trend. I think we have bigger fish to fry and this sounds like it could slow down startup. On Tue, Sep 12, 2017 at 8:35 PM, Nick Coghlan wrote: > On 13 September 2017 at 09:46, Eric Snow > wrote: > > The sys module is a rather special case as far as modules go. It is > > effectively a "console" into the interpreter's internal state and that > > includes some mutable state. Since it is a module, we don't have much > > of an opportunity to: > > > > * validate values assigned to its attributes [1] > > * issue DeprecationWarning for deprecated attrs [2] > > * alias attrs [2] > > * replace get (and get/set) functions with properties > > * re-organize sys [3] > > +1 from me, specifically because there are edge cases we don't > generally test (e.g. folks rebinding sys.modules to nonsense), and it > would be nice to be able to upgrade those from "don't do that" to "the > obvious way of doing that just plain isn't allowed". > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ethan at ethanhs.me Wed Sep 13 00:33:16 2017 From: ethan at ethanhs.me (Ethan Smith) Date: Tue, 12 Sep 2017 21:33:16 -0700 Subject: [Python-ideas] PEP 561 v2 - Packaging Static Type Information In-Reply-To: References: Message-ID: On Tue, Sep 12, 2017 at 8:30 PM, Nick Coghlan wrote: > On 13 September 2017 at 09:43, Ethan Smith wrote: > > The two major parts of this specification are the packaging > specifications > > and the resolution order for resolving module type information. This spec > > is meant to replace the ``shared/typehints/pythonX.Y/`` spec of PEP 484 > > [2]_. > > There are a lot of packaging tools in use other than distutils, so I > don't think the distutils update proposal belongs in the PEP. Rather, > the PEP should focus on defining how type analysers should search for > typing information, and then updating packaging tools to help with > that can be treated as separate RFEs for each of the publishing tools > that people use (perhaps with a related task-oriented guide on > packaging.python.org) > I think this makes a lot of sense. Would a description of the package metadata being queried suffice to be generic enough? And a guide on packaging.python.org makes a lot of sense, thank you for the suggestion! > > > Type Checker Module Resolution Order > > ------------------------------------ > > > > The following is the order that type checkers supporting this PEP should > > resolve modules containing type information: > > > > 1. User code - the files the type checker is running on. > > > > 2. Stubs or Python source manually put in the beginning of the path. Type > > checkers should provide this to allow the user complete control of > which > > stubs to use, and patch broken stubs/inline types from packages. > > > > 3. Third party stub packages - these packages can supersede the installed > > untyped packages. They can be found at ``pkg-stubs`` for package > ``pkg``, > > however it is encouraged to check their metadata to confirm that they > opt > > into type checking via the ``typed`` keyword. > > > 4. Inline packages - finally, if there is nothing overriding the > installed > > package, and it opts into type checking. > > > > 5. Typeshed (if used) - Provides the stdlib types and several third party > > libraries > > I'm not clear on how this actually differs from the existing search > protocol in PEP 484, since step 3 is exactly what the > 'shared/typehints/pythonX.Y' directory is intended to cover. > > Is it just a matter allowing the use of "-stubs" as the typehint > installation directory, since installing under a different package > name is easier to manage using existing publishing tools than > installing to a different target directory? > Perhaps I could be clearer in the PEP text on this. The idea is that people can ship normal sdists (or what have you) and install those to the package installation directory. Then the type checkers would pick up `pkg-stub` when looking for `pkg` type information via the package API. This allows a third party to ship just *.pyi files in a package and install it as if it were the runtime package, but still be picked up by type checkers. This is different than using 'shared/typehints/pythonX.Y' because that directory cannot be queried by package resource APIs, and since no type checker implements PEP 484's method, I thought it would be better to have everything be unified under the same system of installing packages. So I suppose that is a rather long, yes. :) > > Cheers, > Nick. 
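A rough sketch of how that package-API lookup could go, using pkg_resources as the draft recommends; how the "typed" flag ends up spelled in the installed metadata is still an assumption here, so the check below is only a placeholder:

    import pkg_resources

    def _opts_into_typing(dist):
        # Placeholder: the draft only says to query package metadata for the
        # "typed" keyword; the exact storage format is not pinned down yet.
        try:
            return dist.has_metadata("typed.txt")
        except Exception:
            return False

    def find_type_source(pkg_name):
        """Pick which installed distribution a type checker should read."""
        for candidate in (pkg_name + "-stubs", pkg_name):  # stubs win over inline
            try:
                dist = pkg_resources.get_distribution(candidate)
            except pkg_resources.DistributionNotFound:
                continue
            if _opts_into_typing(dist):
                return candidate, dist.location
        return None  # fall back to typeshed or treat the package as untyped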
> > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > -------------- next part -------------- An HTML attachment was scrubbed... URL: From lukasz at langa.pl Wed Sep 13 00:29:35 2017 From: lukasz at langa.pl (Lukasz Langa) Date: Wed, 13 Sep 2017 00:29:35 -0400 Subject: [Python-ideas] PEP 563: Postponed Evaluation of Annotations, first draft In-Reply-To: <20170912014516.GU13110@ando.pearwood.info> References: <1176DC03-ACB8-48F4-B8A0-02268B524BE4@langa.pl> <20170912014516.GU13110@ando.pearwood.info> Message-ID: Thanks for your detailed review! > On Sep 11, 2017, at 9:45 PM, Steven D'Aprano wrote: > > Regarding forward references: I see no problem with quoting forward > references. Some people complain about the quotation marks, but frankly > I don't think that's a major imposition. Are you a user of type annotations? In the introduction of typing at Facebook this is the single most annoying thing people point out. The reason is that it's not obvious and the workaround is ugly. Learning about this quirk adds to an already significant body of knowledge new typing users have to familiarize themselves with. It's also aesthetically jarring, the original PEP 484 admits this is a clunky solution. > Regarding the execution time at runtime: this sounds like premature > optimization. If it is not, if you have profiled some applications found > that there is more than a trivial amount of time used by generating the > annotations, you should say so. I used hand-wavy language because I didn't really check before. This time around I'm coming back prepared. Instagram has roughly 10k functions annotated at this point. Using tracemalloc I tested how much memory it takes to import modules with those functions. Then I compared how much memory it takes to import modules with those functions when annotations are stripped from the entire module (incl. module-level variable annotations, aliases, class variable annotations, etc.). The difference in allocated memory is over 22 MB. The import time with annotations is over 2s longer. The problem with those numbers that we still have 80% functions to cover. > In my opinion, a less disruptive solution to the execution time > (supposed?) problem is a switch to disable annotations altogether, > similar to the existing -O optimize switch, turning them into no-ops. > That won't make any difference to static type-checkers. > > Some people will probably want such a switch anyway, if they care about > the memory used by the __annotations__ dictionaries. Yes, this is a possibility that I should address in the PEP explicitly. There are two reasons this is not satisfying: 1. This only addresses runtime cost, not forward references, those still cannot be safely used in source code. Even if a single one was used in a single module, the entire program now has to be executed with this new hypothetical -O switch. Nobody would agree to dependencies like that. 2. This throws the baby out with the bath water. Now *no* runtime annotation use can be performed. There's work on new tools that use PEP 484-compliant type annotations at runtime (Larry is reviving his dryparse library, Eric Smith is working on data classes). Those would not work with this hypothetical -O switch but are fine with the string form since they already need to use `typing.get_type_hints()`. >> If an annotation was already a string, this string is preserved >> verbatim. 
In other cases, the string form is obtained from the AST >> during the compilation step, which means that the string form preserved >> might not preserve the exact formatting of the source. > > Can you give an example of how the AST may distort the source? > > My knee-jerk reaction is that anything which causes the annotation to > differ from the source is likely to cause confusion. Anything not explicitly stored in the AST will require a normalized untokenization. This covers whitespace, punctuation, and brackets. An example: def fun(a: Dict[ str, str, ]) -> None: ... would become something like: {'a': 'Dict[(str, str)]', 'return': 'None'} >> Annotations need to be syntactically valid Python expressions, also when >> passed as literal strings (i.e. ``compile(literal, '', 'eval')``). >> Annotations can only use names present in the module scope as postponed >> evaluation using local names is not reliable. > > And that's a problem. Forcing the use of global variables seems harmful. > Preventing the use of locals seems awful. Can't we use some sort of > closure-like mechanism? As Nick measured, currently closures would add to the performance problem, not solve it. What is an actual use case that this would prevent? > This restriction is going to break, or prevent, situations like this: > > def decorator(func): > kind = ... # something generated at decorator call time > @functools.wraps(func) > def inner(arg: kind): > ... > return inner > > > Even if static typecheckers have no clue what the annotation on the > inner function is, it is still useful for introspection and > documentation. Do you have a specific use case in mind? If you need to put magic dynamic annotations for runtime introspection, the decorator can apply those directly via calling inner.__annotations__ = {'arg': 'kind'} This is what the "attrs" library is considering doing for the magic __init__ method. Having annotation literals in the source not validated would not hinder that at all. That's it for introspection. As for documentation, I don't really understand that comment. >> Resolving Type Hints at Runtime >> =============================== > [...] >> To get the correct module-level >> context to resolve class variables, use:: >> >> cls_globals = sys.modules[SomeClass.__module__].__dict__ > > A small style issue: I think that's better written as: > > cls_globals = vars(sys.modules[SomeClass.__module__]) > > We should avoid directly accessing dunders unless necessary, and vars() > exists specifically for the purpose of returning object's __dict__. Right! Would be great if it composed a dictionary of slots for us, too ;-) >> Runtime annotation resolution and ``TYPE_CHECKING`` >> --------------------------------------------------- >> >> Sometimes there's code that must be seen by a type checker but should >> not be executed. For such situations the ``typing`` module defines a >> constant, ``TYPE_CHECKING``, that is considered ``True`` during type >> checking but ``False`` at runtime. Example:: >> >> import typing >> >> if typing.TYPE_CHECKING: >> import expensive_mod >> >> def a_func(arg: expensive_mod.SomeClass) -> None: >> a_var: expensive_mod.SomeClass = arg >> ... > > I don't know whether this is important, but for the record the current > documentation shows expensive_mod.SomeClass quoted. Yes, I wanted to explicitly illustrate that now you can use the annotations directly without quoting and it won't fail at import time. 
If you tried to evaluate them with `typing.get_type_hints()` that would fail, just like trying to evaluate the string form today. >> Backwards Compatibility >> ======================= > [...] >> Annotations that depend on locals at the time of the function/class >> definition are now invalid. Example:: > > As mentioned above, I think this is a bad loss of existing > functionality. I don't quite see it. If somebody badly needs to store arbitrary data as annotations in generated functions, they still can directly write to `__annotations__`. In every other case, specifically in static typing (the overwhelming use case for annotations), this makes human readers happier and static analysis tools are indifferent. In fact, if runtime type hint evaluation is necessary, then even if a library is already using `typing.get_type_hints()` (which it should!), so far I haven't seen much use of `globalns` and `localns` arguments to this function which suggests that the forward references we can support are already constrained to globals anyway. >> In the presence of an annotation that cannot be resolved using the >> current module's globals, a NameError is raised at compile time. > > It is not clear what this means, or how it will work, especially given > that the point of this is to delay evaluating annotations. How will the > compiler know that an annotation cannot be resolved if it doesn't try to > evaluate it? The idea would be to validate the expressions after the entire module is compiled, something like what the flake8-pyi plugin is doing today for .pyi files. Guido pointed out that it's not trivial since the compiler doesn't keep a symbol table around. But I'd invest time in this since I agree with your point that we should raise errors as early as possible. >> Rejected Ideas >> ============== >> >> Keep the ability to use local state when defining annotations >> ------------------------------------------------------------- >> >> With postponed evaluation, this is impossible for function locals. > > Impossible seems a bit strong. Can you elaborate? Unless we stored the closure, it's not possible. > [...] >> This is brittle and doesn't even cover slots. Requiring the use of >> module-level names simplifies runtime evaluation and provides the >> "one obvious way" to read annotations. It's the equivalent of absolute >> imports. > > I hardly think that "simplifies runtime evaluation" is true. At the > moment annotations are already evaluated. *Anything* that you have to do > by hand (like call eval) cannot be simpler than "do nothing". This section covers the rejected idea of allowing local access on classes specifically. As I mentioned earlier, preserving local function scope access is not something we can do without bending ourselves backwards. Constraining ourselves to only global scope simplifies runtime evaluation compared to that use case. The alternative is not "do nothing". > I don't think the analogy with absolute imports is even close to useful, > and far from being "one obvious way", this is a surprising, non-obvious, > seemingly-arbitrary restriction on annotations. That's a bit harsh. I probably didn't voice my idea clearly enough here so let me elaborate. Annotations inside nested classes which are using local scope currently have to use the local names directly instead of using the qualified name. This has similar issues to relative imports: class Menu(UIComponent): ... class Restaurant: class Menu(Enum): SPAM = 1 EGGS = 2 def generate_menu_component(self) -> Menu: ... 
It is impossible today to use the global "Menu" type in the annotation of the example method. This PEP is proposing to use qualified names in this case which disambiguates between the global "Menu" and "Restaurant.Menu". In this sense it felt similar to absolute imports to me. > Given that for *six versions* of Python 3 annotations could be locals, > there is nothing obvious about restricting them to globals. I don't understand the argument of status quo. In those six versions, the first two were not usable in production and the following two only slowly started getting adoption. It wasn't until Python 3.5 that we started seeing significant adoption figures. This is incidentally also the first release with PEP 484. Wow, this took quite some time to respond to, sorry it took so long! I had to do additional research and clarify some of my own understanding while doing this. It was very valuable, thanks again for your feedback! - ? -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 842 bytes Desc: Message signed with OpenPGP URL: From levkivskyi at gmail.com Wed Sep 13 03:51:55 2017 From: levkivskyi at gmail.com (Ivan Levkivskyi) Date: Wed, 13 Sep 2017 09:51:55 +0200 Subject: [Python-ideas] PEP 563: Postponed Evaluation of Annotations, first draft In-Reply-To: References: <1176DC03-ACB8-48F4-B8A0-02268B524BE4@langa.pl> <20170912014516.GU13110@ando.pearwood.info> Message-ID: > The difference in allocated memory is over 22 MB. > The import time with annotations is over 2s longer. > The problem with those numbers that we still have 80% functions to cover. This will not be a problem with PEP 560 (I could imagine that string objects may take actually more memory than relatively small cached objects). Also I think it makes sense to mention in the PEP that stringifying annotations does not solve _all_ problems with forward references. For example, two typical situations are: T = TypeVar('T', bound='Factory') class Factory: def make_copy(self: T) -> T: ... and class Vertex(List['Edge']): ... class Edge: ends: Tuple[Vertex, Vertex] Actually both situations can be resolved with PEP 563 if one puts `T` after `Factory`, and `Vertex` after `Edge`, the latter is OK, but the former would be strange. After all, it is OK to pay a _little_ price for Python being an interpreted language. There are other situations discussed in https://github.com/python/typing/issues/400, I don't want to copy all of them to the PEP, but I think this prior discussion should be referenced in the PEP. > This is not a viable strategy since __future__ is not designed to be > a feature toggle but rather to be a gradual introduction of an upcoming > breaking change. But how it was with `from __future__ import division`? What I was proposing is something similar, just have `from __future__ import annotations` that will be default in Python 4. (Although this time it would be a good idea to emit DeprecationWarning one-two releases before Python 4). -- Ivan -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From k7hoven at gmail.com Wed Sep 13 06:45:30 2017 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Wed, 13 Sep 2017 13:45:30 +0300 Subject: [Python-ideas] PEP 554: Stdlib Module to Support Multiple Interpreters in Python Code In-Reply-To: References: Message-ID: On Wed, Sep 13, 2017 at 6:14 AM, Nick Coghlan wrote: > On 13 September 2017 at 00:35, Koos Zevenhoven wrote:> > > I don't see how the situation benefits from calling something the "main > > interpreter". Subinterpreters can be a way to take something > non-thread-safe > > and make it thread-safe, because in an interpreter-per-thread scheme, > most > > of the state, like module globals, are thread-local. (Well, this doesn't > > help for async concurrency, but anyway.) > > "The interpreter that runs __main__" is never going to go away as a > concept for the regular CPython CLI. > It's still just *an* interpreter that happens to run __main__. And who says it even needs to be the only one? > > Right now, its also a restriction even for applications like mod_wsgi, > since the GIL state APIs always register C created threads with the > main interpreter. > > >> That's OK - it just means we'll aim to make as many > >> things as possible implicitly subinterpreter-friendly, and for > >> everything else, we'll aim to minimise the adjustments needed to > >> *make* things subinterpreter friendly. > >> > > > > And that's exactly what I'm after here! > > No, you're after deliberately making the proposed API > non-representative of how the reference implementation actually works > because of a personal aesthetic preference rather than asking yourself > what the practical benefit of hiding the existence of the main > interpreter would be. > > The fact is that the main interpreter *is* special (just as the main > thread is special), and your wishing that things were otherwise won't > magically make it so. > ?I'm not questioning whether the main interpreter is special, or whether the interpreters may differ from each other. I'm questioning the whole concept of "main interpreter". People should not care about which interpreter is "the main ONE". They should care about what properties an interpreter has. That's not aesthetics. Just look at, e.g. the _decimal/_pydecimal examples in this thread. > I'm mostly just worried about the `get_main()` function. Maybe it should > be > > called `asdfjaosjnoijb()`, so people wouldn't use it. Can't the first > > running interpreter just introduce itself to its children? And if that's > too > > much to ask, maybe there could be a `get_parent()` function, which would > > give you the interpreter that spawned the current subinterpreter. > > If the embedding application never calls > "_Py_ConfigureMainInterpreter", then get_main() could conceivably > return None. However, we don't expose that as a public API yet, so for > the time being, Py_Initialize() will always call it, and hence there > will always be a main interpreter (even in things like mod_wsgi). > > You don't need to remove _Py_ConfigureMainInterpreter. Just make sure you don't try to smuggle it into the status quo of the possibly upcoming new stdlib module. Who knows what the function does anyway, let alone what it might or might not do in the future. Of course that doesn't mean that there couldn't be ways to configure an interpreter, but coupling that with a concept of a "main interpreter", as you suggest, doesn't seem to make any sense. 
And surely the code that creates a new interpreter should know if it wants the new interpreter to start with `__name__ == "__main__"` or `__name__ == "__just_any__", if there is a choice. ??Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven + -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at gmail.com Wed Sep 13 09:42:56 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Wed, 13 Sep 2017 15:42:56 +0200 Subject: [Python-ideas] Move some regrtest or test.support features into unittest? Message-ID: Hi, tl; dr How can we extend unittest module to plug new checks before/after running tests? The CPython project has a big test suite in the Lib/test/ directory. While all tests are written with the unittest module and the unittest.TestCase class, tests are not run directly by unittest, but run by "regrtest" (for "regression test") which is a test runner doing more checks (and more). I would like to see if and how we can integrate/move some regrtest features into the unittest module. Example of regrtest features: * skip a test if it allocates too much memory, command line argument to specify how many memory a test is allowed to allocate (ex: --memlimit=2G for 2 GB of memory) * concept of "resource" like "network" (connect to external network servers, to the Internet), "cpu" (CPU intensive tests), etc. Tests are skipped by default and enabled by the -u command line option (ex: "-u cpu). * track memory leaks: check the reference counter, check the number of allocated memory blocks, check the number of open file descriptors. * detect if the test spawned a thread or process and the thread/process is still running at the test exit * --timeout: watchdog killing the test if the run time exceed the timeout in seconds (use faulthandler.dump_traceback_later) * multiprocessing: run tests in subprocesses, in parallel * redirect stdout/stderr to pipes (StringIO objects), ignore them on success, or dump them to stdout/stderr on test failure * --slowest: top 10 of the slowest tests * --randomize: randomize test order * --match, --matchfile, -x: filter tests * --forever: run the test in a loop until it fails (or is interrupted by CTRL+c) * --list-tests / --list-cases: list test files / test methods * --fail-env-changed: mark tests as failed if a test altered the environment * detect if a "global variable" of the standard library was modified but not restored by the test: resources = ('sys.argv', 'cwd', 'sys.stdin', 'sys.stdout', 'sys.stderr', 'os.environ', 'sys.path', 'sys.path_hooks', '__import__', 'warnings.filters', 'asyncore.socket_map', 'logging._handlers', 'logging._handlerList', 'sys.gettrace', 'sys.warnoptions', 'multiprocessing.process._dangling', 'threading._dangling', 'sysconfig._CONFIG_VARS', 'sysconfig._INSTALL_SCHEMES', 'files', 'locale', 'warnings.showwarning', 'shutil_archive_formats', 'shutil_unpack_formats', ) * test.bisect: bisection to identify the failing method, used to track memory leaks or identify a test leaking a resource (ex: create a file but don't remove it) * ... : regrtest has many many features My question is also connected to test.support (Lib/test/support/__init__.py): a big module containing a lot of helper functions to write tests and to detect bugs in tests. For example, @reap_children decorator emits a warnig if the test leaks a child process (and reads its exit status to prevent zombie process). 
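One minimal shape such a hook could take with today's unittest, shown here only as a sketch (it is not how regrtest implements it), is a base class that snapshots threads around each test; it is the same idea as the threading_setup/threading_cleanup pattern quoted just below:

    import threading
    import unittest
    import warnings

    class LeakCheckingTestCase(unittest.TestCase):
        # Warn when a test leaves extra threads running (sketch only).
        def setUp(self):
            super().setUp()
            self._threads_before = set(threading.enumerate())

        def tearDown(self):
            leaked = set(threading.enumerate()) - self._threads_before
            if leaked:
                warnings.warn("dangling threads: %r" % (leaked,), ResourceWarning)
            super().tearDown()

    class Example(LeakCheckingTestCase):
        def test_joined_thread_is_fine(self):
            t = threading.Thread(target=lambda: None)
            t.start()
            t.join()   # joined before tearDown, so nothing is reported

    if __name__ == "__main__":
        unittest.main()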
I started to duplicate code in many files of Lib/test/test_*.py to check if tests "leak running threads" ("dangling threads"). Example from Lib/test/test_theading.py: class BaseTestCase(unittest.TestCase): def setUp(self): self._threads = test.support.threading_setup() def tearDown(self): test.support.threading_cleanup(*self._threads) test.support.reap_children() I would like to get this test "for free" directly from the regular unittest.TestCase class, but I don't know how to extend the unittest module for that? Victor From thibault.hilaire at lip6.fr Wed Sep 13 10:36:49 2017 From: thibault.hilaire at lip6.fr (Thibault Hilaire) Date: Wed, 13 Sep 2017 16:36:49 +0200 Subject: [Python-ideas] Hexadecimal floating literals In-Reply-To: <20170912112851.GX13110@ando.pearwood.info> References: <20170912014844.GV13110@ando.pearwood.info> <20170912112851.GX13110@ando.pearwood.info> Message-ID: <9A3E54B3-BD0E-4CFF-B53E-5C411AB36775@lip6.fr> Hi everybody > I chose it because it's easy to write. Maybe math.pi is a better example :-) >> >>>>> math.pi.hex() >> '0x1.921fb54442d18p+1' > > 3.141592653589793 is four fewer characters to type, just as accurate, > and far more recognizable. Of course, for a lost of numbers, the decimal representation is simpler, and just as accurate as the radix-2 hexadecimal representation. But, due to the radix-10 and radix-2 used in the two representations, the radix-2 may be much easier to use. In the "Handbook of Floating-Point Arithmetic" (JM Muller et al, Birkhauser editor, page 40),the authors claims that the largest exact decimal representation of a double-precision floating-point requires 767 digits !! So it is not always few characters to type to be just as accurate !! For example (this is the largest exact decimal representation of a single-precision 32-bit float): > 1.17549421069244107548702944484928734882705242874589333385717453057158887047561890426550235133618116378784179687e-38 and > 0x1.fffffc0000000p-127 are exactly the same number (one in decimal representation, the other in radix-2 hexadecimal)! So, we have several alternatives: - do nothing, and continue to use float.hex() and float.fromhex() - support one of some of the following possibilities: a) support the hexadecimal floating-point literals, like released in C++17 (I don't know if some other languages already support this) >>> x = 0x1.2492492492492p-3 b) extend the constructor float to be able to build float from hexadecimal >>> x = float('0x1.2492492492492p-3') I don't know if we should add a "base=None" or not c) extend the string formatting with '%a' (as in C since C99) and '{:a}' >>> s = '%a' % (x,) Serhly proposes to use '%x' and '{:x}', but I prefer to be consistent with C To my point of view (my needs are maybe not very representative, as computer scientist working in computer arithmetic), a full support for radix-2 representation is required (it is sometimes easier/quicker to exchange data between different softwares in plain text, and radix-2 hexadecimal is the best way to do it, because it is exact). Also, support option a) will help me to generate python code (from other python or C code) without building the float at runtime with fromhex(). My numbers will be literals, not string converted in float! Option c) will help me to print my data in the same way as in C, and be consistent (same formatting character) And option b) will be just here for consistency with new hexadecimal literals... 
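For readers following along, the exact round trip already works through the float methods today; the options above are about making the same representation available as a literal, through float(), and through %-formatting. A quick illustration:

    x = 1.0 / 7.0
    s = x.hex()                    # '0x1.2492492492492p-3' on IEEE-754 doubles
    assert float.fromhex(s) == x   # the hex form round-trips exactly

    # What the proposed options would add (not valid Python today):
    #   x = 0x1.2492492492492p-3              (a) hex floating-point literal
    #   x = float('0x1.2492492492492p-3')     (b) float() accepting the hex form
    #   s = '%a' % (x,)                       (c) C99-style '%a' formatting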
Finally, I am now considering writing a PEP from Serhly Storchaka's idea, but if someone else wants to start it, I can help/contribute. Thanks Thibault From jhihn at gmx.com Wed Sep 13 11:09:37 2017 From: jhihn at gmx.com (Jason H) Date: Wed, 13 Sep 2017 17:09:37 +0200 Subject: [Python-ideas] Make map() better Message-ID: The format of map seems off. Coming from JS, all the functions come second. I think this approach is superior. Currently: map(lambda x: chr(ord('a')+x), range(26)) # ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z'] But I think this reads better: map(range(26), lambda x: chr(ord('a')+x)) Currently that results in: TypeError: argument 2 to map() must support iteration Also, how are we to tell what supports map()? Any iterable should be able to map via: range(26).map(lambda x: chr(ord('a')+x))) While the line length is the same, I think the latter is much more readable, and the member method avoids parameter order confusion For the global map(), having the iterable first also increases reliability because the lambda function is highly variable in length, where as parameter names are generally shorter than even the longest lambda expression. More readable: IMHO: map(in, lambda x: chr(ord('a')+x)) out = map(out, lambda x: chr(ord('a')+x)) out = map(out, lambda x: chr(ord('a')+x)) Less readable (I have to parse the lambda): map(lambda x: chr(ord('a')+x), in) out = map(lambda x: chr(ord('a')+x), out) out = map(lambda x: chr(ord('a')+x), out) But I contend: range(26).map(lambda x: chr(ord('a')+x))) is superior to all. From mertz at gnosis.cx Wed Sep 13 11:20:30 2017 From: mertz at gnosis.cx (David Mertz) Date: Wed, 13 Sep 2017 08:20:30 -0700 Subject: [Python-ideas] Make map() better In-Reply-To: References: Message-ID: We're not going to break every version of Python since 0.9 because Javascript does something a certain way. Whatever might be better abstractly, this is well established. As to adding a `.map()` method to *every* iterable... just how would you propose to do that given that it's really easy and common to write custom iterables. How do you propose to change every class ever written by users? (including ones that already define `.map()` with some other meaning than the one you suggest)? On Wed, Sep 13, 2017 at 8:09 AM, Jason H wrote: > The format of map seems off. Coming from JS, all the functions come > second. I think this approach is superior. > > Currently: > map(lambda x: chr(ord('a')+x), range(26)) # ['a', 'b', 'c', 'd', 'e', 'f', > 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', > 'v', 'w', 'x', 'y', 'z'] > > But I think this reads better: > map(range(26), lambda x: chr(ord('a')+x)) > > Currently that results in: > TypeError: argument 2 to map() must support iteration > > Also, how are we to tell what supports map()? > Any iterable should be able to map via: > range(26).map(lambda x: chr(ord('a')+x))) > > While the line length is the same, I think the latter is much more > readable, and the member method avoids parameter order confusion > > For the global map(), > having the iterable first also increases reliability because the lambda > function is highly variable in length, where as parameter names are > generally shorter than even the longest lambda expression. 
> > More readable: IMHO: > map(in, lambda x: chr(ord('a')+x)) > out = map(out, lambda x: chr(ord('a')+x)) > out = map(out, lambda x: chr(ord('a')+x)) > > Less readable (I have to parse the lambda): > map(lambda x: chr(ord('a')+x), in) > out = map(lambda x: chr(ord('a')+x), out) > out = map(lambda x: chr(ord('a')+x), out) > > But I contend: > range(26).map(lambda x: chr(ord('a')+x))) > is superior to all. > > > > > > > > > > > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th. -------------- next part -------------- An HTML attachment was scrubbed... URL: From egregius313 at gmail.com Wed Sep 13 11:23:53 2017 From: egregius313 at gmail.com (Edward Minnix) Date: Wed, 13 Sep 2017 11:23:53 -0400 Subject: [Python-ideas] Make map() better In-Reply-To: References: Message-ID: <710E3AC6-3ED5-4E6B-A738-C0CE7B2E4BBA@gmail.com> While I agree that the method calling syntax is nicer, I disagree with flipping the argument error for three main reasons. First: it violates the signature entirely The signature to map is map(function, *iterables). Python?s map is more like Haskell?s zipWith. Making the function last would either ruin the signature or would slow down performance. Second: currying If you ever curry a function in Python using functools.partial, having the most common arguments first is crucial. (You?re more likely to apply the same function to multiple iterables than to apply several functions on the same exact iterable). Thirdly: the change would make several functional programming packages have incompatible APIs. Currently libraries like PyToolz/Cytoolz and funcy have APIs that require function-first argument order. Changing the argument order would be disruptive to most Python FP packages/frameworks. So while I agree with you that ?iterable.map(fn)? is more readable, I think changing the argument order would be too much of a breaking change, and there is no practical way to add ?iterable.map(fn)? to every iterable type. - Ed > On Sep 13, 2017, at 11:09, Jason H wrote: > > The format of map seems off. Coming from JS, all the functions come second. I think this approach is superior. > > Currently: > map(lambda x: chr(ord('a')+x), range(26)) # ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z'] > > But I think this reads better: > map(range(26), lambda x: chr(ord('a')+x)) > > Currently that results in: > TypeError: argument 2 to map() must support iteration > > Also, how are we to tell what supports map()? > Any iterable should be able to map via: > range(26).map(lambda x: chr(ord('a')+x))) > > While the line length is the same, I think the latter is much more readable, and the member method avoids parameter order confusion > > For the global map(), > having the iterable first also increases reliability because the lambda function is highly variable in length, where as parameter names are generally shorter than even the longest lambda expression. 
> > More readable: IMHO: > map(in, lambda x: chr(ord('a')+x)) > out = map(out, lambda x: chr(ord('a')+x)) > out = map(out, lambda x: chr(ord('a')+x)) > > Less readable (I have to parse the lambda): > map(lambda x: chr(ord('a')+x), in) > out = map(lambda x: chr(ord('a')+x), out) > out = map(lambda x: chr(ord('a')+x), out) > > But I contend: > range(26).map(lambda x: chr(ord('a')+x))) > is superior to all. > > > > > > > > > > > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From jhihn at gmx.com Wed Sep 13 11:54:53 2017 From: jhihn at gmx.com (Jason H) Date: Wed, 13 Sep 2017 17:54:53 +0200 Subject: [Python-ideas] Make map() better In-Reply-To: <710E3AC6-3ED5-4E6B-A738-C0CE7B2E4BBA@gmail.com> References: <710E3AC6-3ED5-4E6B-A738-C0CE7B2E4BBA@gmail.com> Message-ID: > Sent: Wednesday, September 13, 2017 at 11:23 AM > From: "Edward Minnix" > To: "Jason H" > Cc: Python-Ideas > Subject: Re: [Python-ideas] Make map() better > > While I agree that the method calling syntax is nicer, I disagree with flipping the argument error for three main reasons. > > First: it violates the signature entirely > The signature to map is map(function, *iterables). Python?s map is more like Haskell?s zipWith. Making the function last would either ruin the signature or would slow down performance. > > Second: currying > If you ever curry a function in Python using functools.partial, having the most common arguments first is crucial. (You?re more likely to apply the same function to multiple iterables than to apply several functions on the same exact iterable). > > Thirdly: the change would make several functional programming packages have incompatible APIs. > Currently libraries like PyToolz/Cytoolz and funcy have APIs that require function-first argument order. Changing the argument order would be disruptive to most Python FP packages/frameworks. > > So while I agree with you that ?iterable.map(fn)? is more readable, I think changing the argument order would be too much of a breaking change, and there is no practical way to add ?iterable.map(fn)? to every iterable type. Thanks for the insights. I don't think it would be that breaking: def remap_map(a1, a2): if hasattr(a1, '__call__'): return map(a1, a2) elif hasattr(a2, '__call__'): return map(a2,a1) else: raise NotCallable # Exception neither is callable I'm rather surprised that there isn't a Iterable class which dict and list derive from. If that were added to just dict and list, I think it would cover 98% of cases, and adding Iterable would be reasonable in the remaining scenarios. From prometheus235 at gmail.com Wed Sep 13 12:37:43 2017 From: prometheus235 at gmail.com (Nick Timkovich) Date: Wed, 13 Sep 2017 11:37:43 -0500 Subject: [Python-ideas] Make map() better In-Reply-To: References: <710E3AC6-3ED5-4E6B-A738-C0CE7B2E4BBA@gmail.com> Message-ID: On Wed, Sep 13, 2017 at 10:54 AM, Jason H wrote: > > Thanks for the insights. > I don't think it would be that breaking: > > def remap_map(a1, a2): > if hasattr(a1, '__call__'): > return map(a1, a2) > elif hasattr(a2, '__call__'): > return map(a2,a1) > else: > raise NotCallable # Exception neither is callable > I think it's better to be parsimonious and adhere to the "there is one way to do it" design principle. 
On the matter of style, map with a lambda is more pleasing as `(expr-x for x in iterable)` rather than `map(lambda x: expr-x, iterable)`. If you need to handle multiple iterables, they can be zip'd. > I'm rather surprised that there isn't a Iterable class which dict and list > derive from. > If that were added to just dict and list, I think it would cover 98% of > cases, and adding Iterable would be reasonable in the remaining scenarios. For checking, there's `collections.abc.Iterable` and neighbors that can look at the interface easily, but I don't think the C-implemented, built-in types spring from them. Nick -------------- next part -------------- An HTML attachment was scrubbed... URL: From storchaka at gmail.com Wed Sep 13 14:55:58 2017 From: storchaka at gmail.com (Serhiy Storchaka) Date: Wed, 13 Sep 2017 21:55:58 +0300 Subject: [Python-ideas] LOAD_NAME/LOAD_GLOBAL should be use getattr() In-Reply-To: <20170912161708.x3mnxmrtbd26hsvi@python.ca> References: <20170912161708.x3mnxmrtbd26hsvi@python.ca> Message-ID: 12.09.17 19:17, Neil Schemenauer ????: > This is my idea of making module properties work. It is necessary > for various lazy-loading module ideas and it cleans up the language > IMHO. I think it may be possible to do it with minimal backwards > compatibility problems and performance regression. > > To me, the main issue with module properties (or module __getattr__) > is that you introduce another level of indirection on global > variable access. Anywhere the module.__dict__ is used as the > globals for code execution, changing LOAD_NAME/LOAD_GLOBAL to have > another level of indirection is necessary. That seems inescapable. > > Introducing another special feature of modules to make this work is > not the solution, IMHO. We should make module namespaces be more > like instance namespaces. We already have a mechanism and it is > getattr on objects. There is a difference between module namespaces and instance namespaces. LOAD_NAME/LOAD_GLOBAL fall back to builtins if the name is not found in the globals dictionary. Calling __getattr__() will slow down the access to builtins. And there is a recursion problem if module's __getattr__() uses builtins. From jimjjewett at gmail.com Wed Sep 13 14:56:33 2017 From: jimjjewett at gmail.com (Jim J. Jewett) Date: Wed, 13 Sep 2017 14:56:33 -0400 Subject: [Python-ideas] PEP 563 and expensive backwards compatibility Message-ID: I am generally supportive of leaving the type annotations unprocessed by default, but there are use cases where they should be processed (and even cases where doing it at the right time matters, because of a side effect). I am concerned that the backwards compatibility story for non-typing cases be not just possible, but reasonable. (1) The PEP suggests opting out with @typing.no_type_hints ... The closest I could find was @typing.no_type_check, which has to be called on each object. It should be possible to opt out for an entire module, and it should be possible to do so *without* first importing typing. Telling type checkers to ignore scopes (including modules) with a # typing.no_type_hints comment would be sufficient for me. If that isn't possible, please at least create a nontyping or minityping module so that the marker can be imported without the full overhead of the typing module. (2) Getting the annotations processed (preferably at the currently correct time) should also be possible on a module-wide basis, and should also not require importing the entire typing apparatus. 
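A rough sketch of that kind of single module-level pass (the helper name is invented here, it is not an existing API):

import sys

def evaluate_annotations(module_name):
    # Evaluate any stringified annotations found on module-level objects,
    # in place, using the module globals as the evaluation namespace.
    ns = vars(sys.modules[module_name])
    for obj in ns.values():
        ann = getattr(obj, "__annotations__", None)
        if not ann:
            continue
        for name, value in ann.items():
            if isinstance(value, str):
                ann[name] = eval(value, ns)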
It would be a bit messy (like the old coding cookie), but recognizing a module-wide # typing.no_type_hints comment and then falling back to the current behavior would be enough for me. Alternatively, it would be acceptable to run something like typing.get_type_hints, if that could be done in a single pass at the end of the module (callable from both within the module and from outside) ... but again, such a cleanup function should be in a smaller module that doesn't require loading all of typing. -jJ From guido at python.org Wed Sep 13 15:08:44 2017 From: guido at python.org (Guido van Rossum) Date: Wed, 13 Sep 2017 12:08:44 -0700 Subject: [Python-ideas] PEP 563 and expensive backwards compatibility In-Reply-To: References: Message-ID: On Wed, Sep 13, 2017 at 11:56 AM, Jim J. Jewett wrote: > It should be possible to opt out for an entire module, and it should > be possible to do so *without* first importing typing. > PEP 484 has a notation for this -- put # type: ignore at the top of your file and the file won't be type-checked. (Before you test this, mypy doesn't yet support this. But it could.) IIUC functions and classes will still have an __annotations__ attribute (except when it would be empty) so even with the __future__ import (or in Python 4.0) you could still make non-standard use of annotations pretty easily -- you'd just get a string rather than an object. (And a simple eval() will turn the string into an object -- the PEP has a lot of extra caution because currently the evaluation happens in the scope where the annotation is encountered, but if you don't care about that everything's easy.) -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Wed Sep 13 15:15:06 2017 From: guido at python.org (Guido van Rossum) Date: Wed, 13 Sep 2017 12:15:06 -0700 Subject: [Python-ideas] A reminder for PEP owners Message-ID: I know there's a lot of excitement around lots of new ideas. And the 3.7 feature freeze is looming (January is in a few months). But someone has to review and accept all those PEPs, and I can't do it all by myself. If you want your proposal to be taken seriously, you need to include a summary of the discussion on the mailing list (including objections, even if you disagree!) in your PEP, e.g. as an extended design rationale or under the Rejected Ideas heading. If you don't do this you risk having to repeat yourself -- also you risk having your PEP rejected, because at this point there's no way I am going to read all the discussions. -- --Guido van Rossum (python.org/~guido ) -------------- next part -------------- An HTML attachment was scrubbed... URL: From lukasz at langa.pl Wed Sep 13 15:12:42 2017 From: lukasz at langa.pl (Lukasz Langa) Date: Wed, 13 Sep 2017 15:12:42 -0400 Subject: [Python-ideas] PEP 563 and expensive backwards compatibility In-Reply-To: References: Message-ID: > On Sep 13, 2017, at 2:56 PM, Jim J. Jewett wrote: > > I am generally supportive of leaving the type annotations unprocessed > by default, but there are use cases where they should be processed > (and even cases where doing it at the right time matters, because of a > side effect). What is the "right time" you're speaking of? > (1) The PEP suggests opting out with @typing.no_type_hints ... The > closest I could find was @typing.no_type_check, which has to be called > on each object. This was a typo on my part. Yes, no_type_check is what I meant. 
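That is, the per-object decorator the typing module already provides:

from typing import no_type_check

@no_type_check
def legacy(data: "not really a type"):
    # type checkers are asked to ignore the annotations on this function
    return data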
> It should be possible to opt out for an entire module, and it should > be possible to do so *without* first importing typing. > > Telling type checkers to ignore scopes (including modules) with a > > # typing.no_type_hints > > comment would be sufficient for me. This is already possible. PEP 484 specifies that "A # type: ignore comment on a line by itself is equivalent to adding an inline # type: ignore to each line until the end of the current indented block. At top indentation level this has effect of disabling type checking until the end of file." > (2) Getting the annotations processed (preferably at the currently > correct time) should also be possible on a module-wide basis, and > should also not require importing the entire typing apparatus. Again, what is the "correct time" you're speaking of? > It would be a bit messy (like the old coding cookie), but recognizing > a module-wide > > # typing.no_type_hints > > comment and then falling back to the current behavior would be enough for me. Do you know of any other per-module feature toggle of this kind? __future__ imports are not feature toggles, they are timed deprecations. Finally, the non-typing use cases that you're worried about, what are they? From the research I've done, none of the actual use cases in existence would be rendered impossible by postponed evaluation. So far the concerns about side effects and local scope in annotations aren't supported by any strong evidence that this change would be disruptive. Don't get me wrong, I'm not being dismissive. I just don't think it's reasonable to get blocked on potential and obscure use cases that no real world code actually employs. - ? -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 842 bytes Desc: Message signed with OpenPGP URL: From stefan_ml at behnel.de Wed Sep 13 15:57:02 2017 From: stefan_ml at behnel.de (Stefan Behnel) Date: Wed, 13 Sep 2017 21:57:02 +0200 Subject: [Python-ideas] Make map() better In-Reply-To: References: <710E3AC6-3ED5-4E6B-A738-C0CE7B2E4BBA@gmail.com> Message-ID: Jason H schrieb am 13.09.2017 um 17:54: > I'm rather surprised that there isn't a Iterable class which dict and list derive from. > If that were added to just dict and list, I think it would cover 98% of cases, and adding Iterable would be reasonable in the remaining scenarios. Would you then always have to inherit from that class in order to make a type iterable? That would be fairly annoying... The iterable and iterator protocols are extremely simple, and that's a great feature. And look, map() even works with all of them, without inheritance, registration, and whatnot. It's so easy! Stefan From jimjjewett at gmail.com Wed Sep 13 16:01:48 2017 From: jimjjewett at gmail.com (Jim J. Jewett) Date: Wed, 13 Sep 2017 16:01:48 -0400 Subject: [Python-ideas] PEP 563 and expensive backwards compatibility In-Reply-To: References: Message-ID: On Wed, Sep 13, 2017 at 3:12 PM, Lukasz Langa wrote: > On Sep 13, 2017, at 2:56 PM, Jim J. Jewett wrote: >> I am generally supportive of leaving the type annotations >> unprocessed by default, but there are use cases where >> they should be processed (and even cases where doing it >> at the right time matters, because of a side effect). > What is the "right time" you're speaking of? The "right time" is whenever they are currently evaluated. (Definition time, I think, but won't swear.) 
For example, the "annotation" might really be a call to a logger, showing the current environment, including names that will be rebound before the module finishes loading. I'm perfectly willing to agree that even needing this much control over timing is a code smell, but it is currently possible, and I would rather it not become impossible. At a minimum, it seems like "just run this typing function that you should already be using" should either save the right context, or the PEP should state explicitly that this functionality is being withdrawn. (And go ahead and suggest a workaround, such as running the code before the method definition, or as a decorator.) >> (1) The PEP suggests opting out with @typing.no_type_hints ... > This is already possible. PEP 484 specifies that > "A # type: ignore comment on a line by itself is equivalent to adding an > inline # type: ignore to each line until the end of the current indented > block. At top indentation level this has effect of disabling type checking > until the end of file." Great! Please mention this as well as (or perhaps instead of) typing.no_type_check. >> It would be a bit messy (like the old coding cookie), >> but recognizing a module-wide >> # typing.no_type_hints >> comment and then falling back to the current behavior >> would be enough for me. > Do you know of any other per-module feature toggle of this kind? No, thus the comment about it being messy. But it does offer one way to ensure that annotations are evaluated within the proper environment, even without having to save those environments. -jJ From lucas.wiman at gmail.com Wed Sep 13 16:07:41 2017 From: lucas.wiman at gmail.com (Lucas Wiman) Date: Wed, 13 Sep 2017 13:07:41 -0700 Subject: [Python-ideas] LOAD_NAME/LOAD_GLOBAL should be use getattr() In-Reply-To: References: <20170912161708.x3mnxmrtbd26hsvi@python.ca> Message-ID: On Wed, Sep 13, 2017 at 11:55 AM, Serhiy Storchaka wrote: > [...] Calling __getattr__() will slow down the access to builtins. And > there is a recursion problem if module's __getattr__() uses builtins. > The first point is totally valid, but the recursion problem doesn't seem like a strong argument. There are already lots of recursion problems when defining custom __getattr__ or __getattribute__ methods, but on balance they're a very useful part of the language. - Lucas -------------- next part -------------- An HTML attachment was scrubbed... URL: From jelle.zijlstra at gmail.com Wed Sep 13 16:43:22 2017 From: jelle.zijlstra at gmail.com (Jelle Zijlstra) Date: Wed, 13 Sep 2017 13:43:22 -0700 Subject: [Python-ideas] PEP 563 and expensive backwards compatibility In-Reply-To: References: Message-ID: 2017-09-13 13:01 GMT-07:00 Jim J. Jewett : > On Wed, Sep 13, 2017 at 3:12 PM, Lukasz Langa wrote: > > On Sep 13, 2017, at 2:56 PM, Jim J. Jewett wrote: > > >> I am generally supportive of leaving the type annotations > >> unprocessed by default, but there are use cases where > >> they should be processed (and even cases where doing it > >> at the right time matters, because of a side effect). > > > What is the "right time" you're speaking of? > > The "right time" is whenever they are currently evaluated. > (Definition time, I think, but won't swear.) > > For example, the "annotation" might really be a call to a logger, > showing the current environment, including names that will be rebound > before the module finishes loading. 
> > I'm perfectly willing to agree that even needing this much control > over timing is a code smell, but it is currently possible, and I would > rather it not become impossible. > Is this just a theoretical concern? Unless there is significant real-world code doing this sort of thing, I don't see much of a problem in deprecating such code using the normal __future__-based deprecation cycle. > > At a minimum, it seems like "just run this typing function that you > should already be using" should either save the right context, or the > PEP should state explicitly that this functionality is being > withdrawn. (And go ahead and suggest a workaround, such as running > the code before the method definition, or as a decorator.) > > > >> (1) The PEP suggests opting out with @typing.no_type_hints ... > > > This is already possible. PEP 484 specifies that > > > "A # type: ignore comment on a line by itself is equivalent to adding an > > inline # type: ignore to each line until the end of the current indented > > block. At top indentation level this has effect of disabling type > checking > > until the end of file." > > Great! Please mention this as well as (or perhaps instead of) > typing.no_type_check. > > > >> It would be a bit messy (like the old coding cookie), > >> but recognizing a module-wide > > >> # typing.no_type_hints > > >> comment and then falling back to the current behavior > >> would be enough for me. > > > Do you know of any other per-module feature toggle of this kind? > > No, thus the comment about it being messy. But it does offer one way > to ensure that annotations are evaluated within the proper > environment, even without having to save those environments. > > -jJ > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From skip.montanaro at gmail.com Wed Sep 13 17:03:21 2017 From: skip.montanaro at gmail.com (Skip Montanaro) Date: Wed, 13 Sep 2017 16:03:21 -0500 Subject: [Python-ideas] [Python-Dev] A reminder for PEP owners In-Reply-To: References: Message-ID: > But someone has to > review and accept all those PEPs, and I can't do it all by myself. An alternate definition for BDFL is "Benevolent Delegator For Life." :-) Skip From jhihn at gmx.com Wed Sep 13 17:05:26 2017 From: jhihn at gmx.com (Jason H) Date: Wed, 13 Sep 2017 23:05:26 +0200 Subject: [Python-ideas] Make map() better In-Reply-To: References: <710E3AC6-3ED5-4E6B-A738-C0CE7B2E4BBA@gmail.com> Message-ID: > Sent: Wednesday, September 13, 2017 at 3:57 PM > From: "Stefan Behnel" > To: python-ideas at python.org > Subject: Re: [Python-ideas] Make map() better > > Jason H schrieb am 13.09.2017 um 17:54: > > I'm rather surprised that there isn't a Iterable class which dict and list derive from. > > If that were added to just dict and list, I think it would cover 98% of cases, and adding Iterable would be reasonable in the remaining scenarios. > > Would you then always have to inherit from that class in order to make a > type iterable? That would be fairly annoying... > > The iterable and iterator protocols are extremely simple, and that's a > great feature. > > And look, map() even works with all of them, without inheritance, > registration, and whatnot. It's so easy! Define easy. It's far easier for me to do a dir(dict) and see what I can do with it. 
This is what python does after all. "Does it have the interface I expect?" Global functions like len(), min(), max(), map(), etc(), don't really tell me the full story. len(7) makes no sense. I can attempt to call a function with an invalid argument. [].len() makes more sense. Python is weird in that there are these special magical globals that operate on many things. Why is it ','.join(iterable), why isn't there join(',', iterable) At what point does a method become a global? A member? Do we take the path that everything is a global? Or should all methods be members? So far it seems arbitrary. From ncoghlan at gmail.com Wed Sep 13 18:03:51 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 14 Sep 2017 08:03:51 +1000 Subject: [Python-ideas] PEP 561 v2 - Packaging Static Type Information In-Reply-To: References: Message-ID: On 13 September 2017 at 14:33, Ethan Smith wrote: > On Tue, Sep 12, 2017 at 8:30 PM, Nick Coghlan wrote: >> There are a lot of packaging tools in use other than distutils, so I >> don't think the distutils update proposal belongs in the PEP. Rather, >> the PEP should focus on defining how type analysers should search for >> typing information, and then updating packaging tools to help with >> that can be treated as separate RFEs for each of the publishing tools >> that people use (perhaps with a related task-oriented guide on >> packaging.python.org) > > > I think this makes a lot of sense. Would a description of the package > metadata being queried suffice to be generic enough? It would - a spec to say "Typecheckers should look for to learn " and "Publishers should provide to tell typecheckers ". PEP 376 is the current definition of the installed package metadata, so if you describe this idea in terms of *.dist-info/METADATA entries, then folks will be able to translate that to the wheel archive format and the various legacy install db formats. >> I'm not clear on how this actually differs from the existing search >> protocol in PEP 484, since step 3 is exactly what the >> 'shared/typehints/pythonX.Y' directory is intended to cover. >> >> Is it just a matter allowing the use of "-stubs" as the typehint >> installation directory, since installing under a different package >> name is easier to manage using existing publishing tools than >> installing to a different target directory? > > Perhaps I could be clearer in the PEP text on this. The idea is that people > can ship normal sdists (or what have you) and install those to the package > installation directory. Then the type checkers would pick up `pkg-stub` when > looking for `pkg` type information via the package API. This allows a third > party to ship just *.pyi files in a package and install it as if it were the > runtime package, but still be picked up by type checkers. This is different > than using 'shared/typehints/pythonX.Y' because that directory cannot be > queried by package resource APIs, and since no type checker implements PEP > 484's method, I thought it would be better to have everything be unified > under the same system of installing packages. So I suppose that is a rather > long, yes. :) OK, it wasn't clear to me that none of the current typecheckers actually implement looking for extra stubs in 'shared/typehints/pythonX.Y' . 
In that case, it makes a lot of sense to me to try to lower barriers to adoption by switching to a scheme that's more consistent with the way Python packaging and installation tools already work, and a simple suffix-based shadow tree approach makes a lot of sense to me from the packaging perspective (I'll leave it to the folks actually working on mypy et al to say how the feel about this more decentralised approach to managing 3rd party stubs). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Wed Sep 13 18:25:46 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 14 Sep 2017 08:25:46 +1000 Subject: [Python-ideas] PEP 554: Stdlib Module to Support Multiple Interpreters in Python Code In-Reply-To: References: Message-ID: On 13 September 2017 at 20:45, Koos Zevenhoven wrote: > On Wed, Sep 13, 2017 at 6:14 AM, Nick Coghlan wrote: >> >> On 13 September 2017 at 00:35, Koos Zevenhoven wrote:> >> >> > I don't see how the situation benefits from calling something the "main >> > interpreter". Subinterpreters can be a way to take something >> > non-thread-safe >> > and make it thread-safe, because in an interpreter-per-thread scheme, >> > most >> > of the state, like module globals, are thread-local. (Well, this doesn't >> > help for async concurrency, but anyway.) >> >> "The interpreter that runs __main__" is never going to go away as a >> concept for the regular CPython CLI. > > > It's still just *an* interpreter that happens to run __main__. And who says > it even needs to be the only one? Koos, I've asked multiple times now for you to describe the practical user benefits you believe will come from dispensing with the existing notion of a main interpreter (which is *not* something PEP 554 has created - the main interpreter already exists at the implementation level, PEP 554 just makes that fact visible at the Python level). If you can't come up with a meaningful user benefit that would arise from removing it, then please just let the matter drop. Regards, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Wed Sep 13 18:37:54 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 14 Sep 2017 08:37:54 +1000 Subject: [Python-ideas] PEP 563 and expensive backwards compatibility In-Reply-To: References: Message-ID: On 14 September 2017 at 06:01, Jim J. Jewett wrote: > The "right time" is whenever they are currently evaluated. > (Definition time, I think, but won't swear.) > > For example, the "annotation" might really be a call to a logger, > showing the current environment, including names that will be rebound > before the module finishes loading. > > I'm perfectly willing to agree that even needing this much control > over timing is a code smell, but it is currently possible, and I would > rather it not become impossible. > > At a minimum, it seems like "just run this typing function that you > should already be using" should either save the right context, or the > PEP should state explicitly that this functionality is being > withdrawn. (And go ahead and suggest a workaround, such as running > the code before the method definition, or as a decorator.) 
I think it would be useful for the PEP to include a definition of an "eager annotations" decorator that did something like: def eager_annotations(f): ns = f.__globals__ annotations = f.__annotations__ for k, v in annotations.items(): annotations[k] = eval(v, ns) return f And pointed out that you can create variants of that which also pass in the locals() namespace (or use sys._getframes() to access it dynamically). That way, during the "from __future__ import lazy_annotations" period, folks will have clearer guidance on how to explicitly opt-in to eager evaluation via function and class decorators. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Wed Sep 13 18:46:45 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 14 Sep 2017 08:46:45 +1000 Subject: [Python-ideas] PEP 554: Stdlib Module to Support Multiple Interpreters in Python Code In-Reply-To: References: Message-ID: On 13 September 2017 at 14:10, Nathaniel Smith wrote: > Subinterpreters are basically an attempt to reimplement the OS's > process isolation in user-space, right? Not really, they're more an attempt to make something resembling Rust's memory model available to Python programs - having the default behaviour be "memory is not shared", but having the choice to share when you want to be entirely an application level decision, without getting into the kind of complexity needed to deliberately break operating system level process isolation. The difference is that where Rust was able to do that on a per-thread basis and rely on their borrow checker for enforcement of memory ownership, for PEP 554, we're proposing to do it on a per-interpreter basis, and rely on runtime object space partitioning (where Python objects and the memory allocators are *not* shared between interpreters) to keep things separated from each other. That's why memoryview is such a key part of making the proposal interesting: it's what lets us relatively easily poke holes in the object level partitioning between interpreters and provide zero-copy messaging passing without having to share any regular reference counts between interpreters (which in turn is what makes it plausible that we may eventually be able to switch to a true GIL-per-interpreter model, with only a few cross-interpreter locks for operations like accessing the list of interpreters itself). Right now, the closest equivalent to this programming model that Python offers is to combine threads with queue.Queue, and it requires a lot of programming discipline to ensure that you don't access an object again once you've submitted to a queue. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Wed Sep 13 18:58:38 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 14 Sep 2017 08:58:38 +1000 Subject: [Python-ideas] PEP 554: Stdlib Module to Support Multiple Interpreters in Python Code In-Reply-To: References: Message-ID: On 14 September 2017 at 08:46, Nick Coghlan wrote: > On 13 September 2017 at 14:10, Nathaniel Smith wrote: >> Subinterpreters are basically an attempt to reimplement the OS's >> process isolation in user-space, right? 
> > Not really, they're more an attempt to make something resembling > Rust's memory model available to Python programs - having the default > behaviour be "memory is not shared", but having the choice to share > when you want to be entirely an application level decision, without > getting into the kind of complexity needed to deliberately break > operating system level process isolation. I should also clarify: *Eric* still has hopes of sharing actual objects between subinterpreters without copying them. *I* think that's a forlorn hope, and expect that communicating between subinterpreters is going to end up looking an awful lot like communicating between subprocesses via shared memory. The trade-off between the two models will then be that one still just looks like a single process from the point of view of the outside world, and hence doesn't place any extra demands on the underlying OS beyond those required to run CPython with a single interpreter, while the other gives much stricter isolation (including isolating C globals in extension modules), but also demands much more from the OS when it comes to its IPC capabilities. The security risk profiles of the two approaches will also be quite different, since using subinterpreters won't require deliberately poking holes in the process isolation that operating systems give you by default. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ericsnowcurrently at gmail.com Wed Sep 13 19:00:42 2017 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Wed, 13 Sep 2017 16:00:42 -0700 Subject: [Python-ideas] sys.py In-Reply-To: References: Message-ID: On Tue, Sep 12, 2017 at 9:30 PM, Guido van Rossum wrote: > I find this a disturbing trend. Which trend? Moving away from "consenting adults"? In the case of sys.modules, the problem is that assigning a bogus value (e.g. []) can cause the interpreter to crash. It wasn't a problem until recently when I removed PyInterpreterState.modules and made sys.modules authoritative (see https://bugs.python.org/issue28411). The options there are: 1. revert that change (which means assigning to sys.modules deceptively does nothing) 2. raise an exception in all the places that expect sys.modules to be a mapping (far from where sys.modules was re-assigned) 3. raise an exception if you try to set sys.modules to a non-mapping 4. let a bogus sys.modules break the interpreter (basically, tell people "don't do that") My preference is #3 (obviously), but it sounds like you'd rather not. > I think we have bigger fish to fry and this sounds like it could slow down startup. It should have little impact on startup. The difference is the cost of importing the new sys module (which we could easily freeze to reduce the cost). That cost would apply only to programs that currently import sys. Everything in the stdlib would be updated to use _sys directly. If you think it isn't worth it then I'll let it go. I brought it up because I consider it a cheap, practical solution to the problem I ran into. Thanks! -eric From guido at python.org Wed Sep 13 19:10:25 2017 From: guido at python.org (Guido van Rossum) Date: Wed, 13 Sep 2017 16:10:25 -0700 Subject: [Python-ideas] sys.py In-Reply-To: References: Message-ID: My preference is (1), revert. You have to understand how sys.modules works before you can change it, and if you waste time debugging your mistake, so be it. I think the sys.py proposal is the wrong way to fix the mistake you made there. 
On Wed, Sep 13, 2017 at 4:00 PM, Eric Snow wrote: > On Tue, Sep 12, 2017 at 9:30 PM, Guido van Rossum > wrote: > > I find this a disturbing trend. > > Which trend? Moving away from "consenting adults"? In the case of > sys.modules, the problem is that assigning a bogus value (e.g. []) can > cause the interpreter to crash. It wasn't a problem until recently > when I removed PyInterpreterState.modules and made sys.modules > authoritative (see https://bugs.python.org/issue28411). The options > there are: > > 1. revert that change (which means assigning to sys.modules > deceptively does nothing) > 2. raise an exception in all the places that expect sys.modules to be > a mapping (far from where sys.modules was re-assigned) > 3. raise an exception if you try to set sys.modules to a non-mapping > 4. let a bogus sys.modules break the interpreter (basically, tell > people "don't do that") > > My preference is #3 (obviously), but it sounds like you'd rather not. > > > I think we have bigger fish to fry and this sounds like it could slow > down startup. > > It should have little impact on startup. The difference is the cost > of importing the new sys module (which we could easily freeze to > reduce the cost). That cost would apply only to programs that > currently import sys. Everything in the stdlib would be updated to > use _sys directly. > > If you think it isn't worth it then I'll let it go. I brought it up > because I consider it a cheap, practical solution to the problem I ran > into. Thanks! > > -eric > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From lukasz at langa.pl Wed Sep 13 19:43:58 2017 From: lukasz at langa.pl (Lukasz Langa) Date: Wed, 13 Sep 2017 19:43:58 -0400 Subject: [Python-ideas] PEP 563 and expensive backwards compatibility In-Reply-To: References: Message-ID: > On Sep 13, 2017, at 6:37 PM, Nick Coghlan wrote: > > I think it would be useful for the PEP to include a definition of an > "eager annotations" decorator that did something like: > > def eager_annotations(f): > ns = f.__globals__ > annotations = f.__annotations__ > for k, v in annotations.items(): > annotations[k] = eval(v, ns) > return f > > And pointed out that you can create variants of that which also pass > in the locals() namespace (or use sys._getframes() to access it > dynamically). > > That way, during the "from __future__ import lazy_annotations" period, > folks will have clearer guidance on how to explicitly opt-in to eager > evaluation via function and class decorators. I like this idea! For classes it would have to be a function that you call post factum. The way class decorators are implemented, they cannot evaluate annotations that contain forward references. For example: class Tree: left: Tree right: Tree def __init__(self, left: Tree, right: Tree): self.left = left self.right = right This is true today, get_type_hints() called from within a class decorator will fail on this class. However, a function performing postponed evaluation can do this without issue. If a class decorator knew what name a class is about to get, that would help. But that's a different PEP and I'm not writing that one ;-) - ? -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 842 bytes Desc: Message signed with OpenPGP URL: From ncoghlan at gmail.com Wed Sep 13 21:44:00 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 14 Sep 2017 11:44:00 +1000 Subject: [Python-ideas] PEP 563 and expensive backwards compatibility In-Reply-To: References: Message-ID: On 14 September 2017 at 09:43, Lukasz Langa wrote: >> On Sep 13, 2017, at 6:37 PM, Nick Coghlan wrote: >> That way, during the "from __future__ import lazy_annotations" period, >> folks will have clearer guidance on how to explicitly opt-in to eager >> evaluation via function and class decorators. > > I like this idea! For classes it would have to be a function that you call post factum. The way class decorators are implemented, they cannot evaluate annotations that contain forward references. For example: > > class Tree: > left: Tree > right: Tree > > def __init__(self, left: Tree, right: Tree): > self.left = left > self.right = right > > This is true today, get_type_hints() called from within a class decorator will fail on this class. However, a function performing postponed evaluation can do this without issue. If a class decorator knew what name a class is about to get, that would help. But that's a different PEP and I'm not writing that one ;-) The class decorator case is indeed a bit more complicated, but there are a few tricks available to create a forward-reference friendly evaluation environment. 1. To get the right globals namespace, you can do: global_ns = sys.modules[cls.__module__].__dict__ 2. Define the evaluation locals as follows: local_ns = collections.ChainMap({cls.__name__: cls}, cls.__dict__) 3. Evaluate the variable and method annotations using "eval(expr, global_ns, local_ns)" If you make the eager annotation evaluation recursive (so the decorator can be applied to the outermost class, but also affects all inner class definitions), then it would even be sufficient to allow nested classes to refer to both the outer class as well as other inner classes (regardless of definition order). To prevent inadvertent eager evaluation of annotations on functions and classes that are merely referenced from a class attribute, the recursive descent would need to be conditional on "attr.__qualname__ == cls.__qualname__ + '.' + attr.__name__". 
So something like: def eager_class_annotations(cls): global_ns = sys.modules[cls.__module__].__dict__ local_ns = collections.ChainMap({cls.__name__: cls}, cls.__dict__) annotations = cls.__annotations__ for k, v in annotations.items(): annotations[k] = eval(v, global_ns, local_ns) for attr in cls.__dict__.values(): name = getattr(attr, "__name__", None) if name is None: continue qualname = getattr(attr, "__qualname__", None) if qualname is None: continue if qualname != f"{cls.__qualname}.{name}": continue if isinstance(attr, type): eager_class_annotations(attr) else: eager_annotations(attr) return cls You could also hide the difference between eager annotation evaluation on a class or a function inside a single decorator: def eager_annotations(obj): if isinstance(obj, type): _eval_class_annotations(obj) # Class elif hasattr(obj, "__globals__"): _eval_annotations(obj, obj.__globals__) # Function else: _eval_annotations(obj, obj.__dict__) # Module return obj Given the complexity of the class decorator variant, I now think it would actually make sense for the PEP to propose *providing* these decorators somewhere in the standard library (the lower level "types" module seems like a reasonable candidate, but we've historically avoided having that depend on the full collections module) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From lukasz at langa.pl Wed Sep 13 22:29:24 2017 From: lukasz at langa.pl (Lukasz Langa) Date: Wed, 13 Sep 2017 22:29:24 -0400 Subject: [Python-ideas] PEP 563 and expensive backwards compatibility In-Reply-To: References: Message-ID: <8870E036-B34F-417B-A8CD-B3E57FCA081E@langa.pl> > On Sep 13, 2017, at 9:44 PM, Nick Coghlan wrote: > > On 14 September 2017 at 09:43, Lukasz Langa wrote: >>> On Sep 13, 2017, at 6:37 PM, Nick Coghlan wrote: >>> That way, during the "from __future__ import lazy_annotations" period, >>> folks will have clearer guidance on how to explicitly opt-in to eager >>> evaluation via function and class decorators. >> >> I like this idea! For classes it would have to be a function that you call post factum. The way class decorators are implemented, they cannot evaluate annotations that contain forward references. For example: >> >> class Tree: >> left: Tree >> right: Tree >> >> def __init__(self, left: Tree, right: Tree): >> self.left = left >> self.right = right >> >> This is true today, get_type_hints() called from within a class decorator will fail on this class. However, a function performing postponed evaluation can do this without issue. If a class decorator knew what name a class is about to get, that would help. But that's a different PEP and I'm not writing that one ;-) > > The class decorator case is indeed a bit more complicated, but there > are a few tricks available to create a forward-reference friendly > evaluation environment. Using cls.__name__ and the ChainMap is clever, I like it. It might prove useful for Eric's data classes later. However, there's more to forward references than self-references: class A: b: B class B: ... In this scenario evaluation of A's annotations has to happen after the module is fully loaded. This is the general case. No magic decorator will solve this. The general solution is running eval() later, when the namespace is fully populated. I do agree with you that a default implementation of a typing-agnostic variant of `get_type_hints()` would be nice. If anything, implementing this might better surface limitations of postponed annotations. 
That function won't be recursive though as your example. And I'll leave converting the function to a decorator as an exercise for the reader, especially given the forward referencing caveats. - ? -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 842 bytes Desc: Message signed with OpenPGP URL: From steve at pearwood.info Wed Sep 13 22:42:12 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 14 Sep 2017 12:42:12 +1000 Subject: [Python-ideas] Make map() better In-Reply-To: References: Message-ID: <20170914024211.GA13110@ando.pearwood.info> On Wed, Sep 13, 2017 at 05:09:37PM +0200, Jason H wrote: > The format of map seems off. Coming from JS, all the functions come > second. I think this approach is superior. Obviously Javascript has got it wrong. map() applies the function to the given values, so the function comes first. That matches normal English word order: * map function to values * apply polish to boots # not "apply boots from polish" * spread butter on bread # not "spread bread under butter" Its hard to even write the opposite order in English: map() takes the values and has the function applied to them which is a completely unnatural way of speaking or thinking about it (in English). I suggest you approach the Javascript developers and ask them to change the way they call map() to suit the way Python does it. After all, Python is the more popular language, and it is older too. > Also, how are we to tell what supports map()? Um... is this a trick question? Any function that takes at least one argument is usable with map(). > Any iterable should be able to map via: > range(26).map(lambda x: chr(ord('a')+x))) No, that would be silly. That means that every single iterable class is responsible for re-implementing map, instead of having a single implementation, namely the map() function. -- Steve From steve at pearwood.info Wed Sep 13 22:55:47 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 14 Sep 2017 12:55:47 +1000 Subject: [Python-ideas] Make map() better In-Reply-To: References: <710E3AC6-3ED5-4E6B-A738-C0CE7B2E4BBA@gmail.com> Message-ID: <20170914025547.GB13110@ando.pearwood.info> On Wed, Sep 13, 2017 at 11:05:26PM +0200, Jason H wrote: > > And look, map() even works with all of them, without inheritance, > > registration, and whatnot. It's so easy! > > Define easy. Opposite of hard or difficult. You want to map a function? map(function, values) is all it takes. You don't have to care whether the collection of values supports the map() method, or whether the class calls it "apply", or "Map", or something else. All you need care about is that the individual items inside the iterable are valid for the function, but you would need to do that regardless of how you call it. [1, 2, 3, {}, 5].map(plusone) # will fail > It's far easier for me to do a dir(dict) and see what I can do with it. And what of the functions that dict doesn't know about? > This is what python does after all. "Does it have the interface I > expect?" Global functions like len(), min(), max(), map(), etc(), > don't really tell me the full story. len(7) makes no sense. I can > attempt to call a function with an invalid argument. And you can attempt to call a non-existent method: x = 7 x.len() Or should that be length() or size() or count() or what? > [].len() makes more sense. Why? 
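For instance, the same plusone works unchanged on a list, a range or a dict 
(where iteration yields the keys), and none of those types need to know that 
map() exists:

def plusone(x):
    return x + 1

list(map(plusone, [1, 2, 3]))            # [2, 3, 4]
list(map(plusone, range(3)))             # [1, 2, 3]
list(map(plusone, {10: 'a', 20: 'b'}))   # [11, 21]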
Getting the length of a sequence or iterator is not specifically a list operation, it is a generic operation that can apply to many different kinds of things. > Python is weird in that there are these special magical > globals The word you want is "function". > that operate on many things. What makes that weird? Even Javascript has functions. So do C, Pascal, Haskell, C++, Lisp, Scheme, and thousands of other languages. > Why is it ','.join(iterable), why > isn't there join(',', iterable) At what point does a method become a > global? A member? Do we take the path that everything is a global? Or > should all methods be members? So far it seems arbitrary. Okay, its arbitrary. Why is it called [].len instead of [].length or {}.size? Why None instead of nil or null or nul or NULL or NOTHING? Many decisions in programming languages are arbitrary. -- Steve From steve at pearwood.info Thu Sep 14 09:07:16 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 14 Sep 2017 23:07:16 +1000 Subject: [Python-ideas] LOAD_NAME/LOAD_GLOBAL should be use getattr() In-Reply-To: References: <20170912161708.x3mnxmrtbd26hsvi@python.ca> Message-ID: <20170914130716.GD13110@ando.pearwood.info> On Wed, Sep 13, 2017 at 12:24:31PM +0900, INADA Naoki wrote: > I'm worring about performance much. > > Dict has ma_version from Python 3.6 to be used for future optimization > including global caching. > Adding more abstraction layer may make it difficult. Can we make it opt-in, by replacing the module __dict__ when and only if needed? Perhaps we could replace it on the fly with a dict subclass that defines __missing__? That's virtually the same as __getattr__. Then modules which haven't replaced their __dict__ would not see any slow down at all. Does any of this make sense, or am I talking nonsense on stilts? -- Steve From storchaka at gmail.com Thu Sep 14 10:08:13 2017 From: storchaka at gmail.com (Serhiy Storchaka) Date: Thu, 14 Sep 2017 17:08:13 +0300 Subject: [Python-ideas] LOAD_NAME/LOAD_GLOBAL should be use getattr() In-Reply-To: References: <20170912161708.x3mnxmrtbd26hsvi@python.ca> Message-ID: 13.09.17 23:07, Lucas Wiman ????: > On Wed, Sep 13, 2017 at 11:55 AM, Serhiy Storchaka > > wrote: > > [...] Calling __getattr__() will slow down the access to builtins. > And there is a recursion problem if module's __getattr__() uses > builtins. > > > The first point is totally valid, but the recursion problem doesn't > seem like a strong argument. There are already lots of recursion > problems when defining custom __getattr__ or __getattribute__ methods, > but on balance they're a very useful part of the language. In normal classes we have the recursion problem in __getattr__() only with accessing instance attributes. Builtins (like isinstance, getattr, AttributeError) can be used without problems. In module's __getattr__() all this is a problem. Module attribute access can be implicit. For example comparing a string with a byte object in __getattr__() can trigger the lookup of __warningregistry__ and the infinity recursion. From rosuav at gmail.com Thu Sep 14 10:40:38 2017 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 15 Sep 2017 00:40:38 +1000 Subject: [Python-ideas] LOAD_NAME/LOAD_GLOBAL should be use getattr() In-Reply-To: References: <20170912161708.x3mnxmrtbd26hsvi@python.ca> Message-ID: On Fri, Sep 15, 2017 at 12:08 AM, Serhiy Storchaka wrote: > 13.09.17 23:07, Lucas Wiman ????: >> >> On Wed, Sep 13, 2017 at 11:55 AM, Serhiy Storchaka > > wrote: >> >> [...] 
Calling __getattr__() will slow down the access to builtins. >> And there is a recursion problem if module's __getattr__() uses >> builtins. >> >> >> The first point is totally valid, but the recursion problem doesn't seem >> like a strong argument. There are already lots of recursion problems when >> defining custom __getattr__ or __getattribute__ methods, but on balance >> they're a very useful part of the language. > > > In normal classes we have the recursion problem in __getattr__() only with > accessing instance attributes. Builtins (like isinstance, getattr, > AttributeError) can be used without problems. In module's __getattr__() all > this is a problem. > > Module attribute access can be implicit. For example comparing a string with > a byte object in __getattr__() can trigger the lookup of __warningregistry__ > and the infinity recursion. Crazy idea: Can we just isolate that function from its module? def isolate(func): return type(func)(func.__code__, {"__builtins__": __builtins__}, func.__name__) @isolate def __getattr__(name): print("Looking up", name) # the lookup of 'print' will skip this module ChrisA From mehaase at gmail.com Thu Sep 14 11:55:49 2017 From: mehaase at gmail.com (Mark E. Haase) Date: Thu, 14 Sep 2017 11:55:49 -0400 Subject: [Python-ideas] Make map() better In-Reply-To: References: <710E3AC6-3ED5-4E6B-A738-C0CE7B2E4BBA@gmail.com> Message-ID: On Wed, Sep 13, 2017 at 5:05 PM, Jason H wrote: > Python is weird in that there are these special magical globals that > operate on many things. Jason, that weirdness is actually a deep part of Python's philsophy. The language is very protocol driven. It's not just the built-in functions that use protocols; the language itself is built around them. For example, `for ... in ...` syntax loops over any iterable object. You can like it or leave it, but this will never change. -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Thu Sep 14 11:57:08 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 15 Sep 2017 01:57:08 +1000 Subject: [Python-ideas] Hexadecimal floating literals In-Reply-To: <9A3E54B3-BD0E-4CFF-B53E-5C411AB36775@lip6.fr> References: <20170912014844.GV13110@ando.pearwood.info> <20170912112851.GX13110@ando.pearwood.info> <9A3E54B3-BD0E-4CFF-B53E-5C411AB36775@lip6.fr> Message-ID: <20170914155707.GE13110@ando.pearwood.info> On Wed, Sep 13, 2017 at 04:36:49PM +0200, Thibault Hilaire wrote: > Of course, for a lost of numbers, the decimal representation is simpler, and just as accurate as the radix-2 hexadecimal representation. > But, due to the radix-10 and radix-2 used in the two representations, the radix-2 may be much easier to use. Hex is radix 16, not radix 2 (binary). > In the "Handbook of Floating-Point Arithmetic" (JM Muller et al, Birkhauser editor, page 40),the authors claims that the largest exact decimal representation of a double-precision floating-point requires 767 digits !! > So it is not always few characters to type to be just as accurate !! > For example (this is the largest exact decimal representation of a single-precision 32-bit float): > > 1.17549421069244107548702944484928734882705242874589333385717453057158887047561890426550235133618116378784179687e-38 > and > > 0x1.fffffc0000000p-127 > are exactly the same number (one in decimal representation, the other in radix-2 hexadecimal)! That may be so, but that doesn't mean you have to type all 100+ digits in order to reproduce the float exactly. 
Just 1.1754942106924411e-38 is sufficient: py> 1.1754942106924411e-38 == float.fromhex('0x1.fffffc0000000p-127') True You may be mistaking two different questions: (1) How many decimal digits are needed to exactly convert the float to decimal? That can be over 100 for a C single, and over 700 for a double. (2) How many decimal digits are needed to uniquely represent the float? Nine digits (plus an exponent) is enough to represent all possible C singles; 17 digits is enough to represent all doubles (Python floats). I'm not actually opposed to hex float literals. I think they're cool. But we ought to have a reason more than just "they're cool" for supporting them, and I'm having trouble thinking of any apart from "C supports them, so should we". But maybe that's enough. -- Steve From python at mrabarnett.plus.com Thu Sep 14 12:43:24 2017 From: python at mrabarnett.plus.com (MRAB) Date: Thu, 14 Sep 2017 17:43:24 +0100 Subject: [Python-ideas] Make map() better In-Reply-To: <20170914025547.GB13110@ando.pearwood.info> References: <710E3AC6-3ED5-4E6B-A738-C0CE7B2E4BBA@gmail.com> <20170914025547.GB13110@ando.pearwood.info> Message-ID: <76adba39-8af0-adec-a969-f2ea90a494c6@mrabarnett.plus.com> On 2017-09-14 03:55, Steven D'Aprano wrote: > > On Wed, Sep 13, 2017 at 11:05:26PM +0200, Jason H wrote: > >> > And look, map() even works with all of them, without inheritance, >> > registration, and whatnot. It's so easy! >> >> Define easy. > > Opposite of hard or difficult. > > You want to map a function? > > map(function, values) > > is all it takes. You don't have to care whether the collection of values > supports the map() method, or whether the class calls it "apply", or > "Map", or something else. All you need care about is that the individual > items inside the iterable are valid for the function, but you would need > to do that regardless of how you call it. > > [1, 2, 3, {}, 5].map(plusone) # will fail > > >> It's far easier for me to do a dir(dict) and see what I can do with it. > > And what of the functions that dict doesn't know about? > > > >> This is what python does after all. "Does it have the interface I >> expect?" Global functions like len(), min(), max(), map(), etc(), >> don't really tell me the full story. len(7) makes no sense. I can >> attempt to call a function with an invalid argument. > > And you can attempt to call a non-existent method: > > x = 7 > x.len() > > Or should that be length() or size() or count() or what? > >> [].len() makes more sense. > > Why? Getting the length of a sequence or iterator is not specifically a > list operation, it is a generic operation that can apply to many > different kinds of things. > > >> Python is weird in that there are these special magical >> globals > > The word you want is "function". > >> that operate on many things. > > What makes that weird? Even Javascript has functions. So do C, Pascal, > Haskell, C++, Lisp, Scheme, and thousands of other languages. > >> Why is it ','.join(iterable), why >> isn't there join(',', iterable) At what point does a method become a >> global? A member? Do we take the path that everything is a global? Or >> should all methods be members? So far it seems arbitrary. > > Okay, its arbitrary. > > Why is it called [].len instead of [].length or {}.size? Why None > instead of nil or null or nul or NULL or NOTHING? > > Many decisions in programming languages are arbitrary. > In Java, strings have .length(), arrays have .length, and collections have .size(). 
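Python sidesteps that particular inconsistency by routing len() through a
single protocol: any object that defines __len__ works with the builtin.
A minimal sketch (the Playlist class below is purely illustrative, not
taken from any real library):

class Playlist:
    def __init__(self, tracks):
        self._tracks = list(tracks)

    def __len__(self):
        # len(playlist) delegates here, so the builtin works for any
        # object that implements __len__.
        return len(self._tracks)

    def __iter__(self):
        # Iteration is protocol-based too, which is why map(), join()
        # and for-loops all accept this object unchanged.
        return iter(self._tracks)

songs = Playlist(["a", "b", "c"])
print(len(songs))                    # 3
print(", ".join(songs))              # a, b, c
print(list(map(str.upper, songs)))   # ['A', 'B', 'C']

The same generic calls work on lists, dicts, sets and user-defined types
alike, which is the point made above about protocols versus per-type
method names.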
From mal at egenix.com Thu Sep 14 13:01:42 2017 From: mal at egenix.com (M.-A. Lemburg) Date: Thu, 14 Sep 2017 19:01:42 +0200 Subject: [Python-ideas] Hexadecimal floating literals In-Reply-To: <20170914155707.GE13110@ando.pearwood.info> References: <20170912014844.GV13110@ando.pearwood.info> <20170912112851.GX13110@ando.pearwood.info> <9A3E54B3-BD0E-4CFF-B53E-5C411AB36775@lip6.fr> <20170914155707.GE13110@ando.pearwood.info> Message-ID: All this talk about accurate representation left aside, please consider what a newbie would think when s/he sees: x = 0x1.fffffc0000000p-127 There's really no need to make Python scripts cryptic. It's enough to have a helper function that knows how to read such representations and we already have that. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Experts (#1, Sep 14 2017) >>> Python Projects, Coaching and Consulting ... http://www.egenix.com/ >>> Python Database Interfaces ... http://products.egenix.com/ >>> Plone/Zope Database Interfaces ... http://zope.egenix.com/ ________________________________________________________________________ ::: We implement business ideas - efficiently in both time and costs ::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ http://www.malemburg.com/ From antoine.rozo at gmail.com Thu Sep 14 13:38:34 2017 From: antoine.rozo at gmail.com (Antoine Rozo) Date: Thu, 14 Sep 2017 19:38:34 +0200 Subject: [Python-ideas] Make map() better In-Reply-To: <76adba39-8af0-adec-a969-f2ea90a494c6@mrabarnett.plus.com> References: <710E3AC6-3ED5-4E6B-A738-C0CE7B2E4BBA@gmail.com> <20170914025547.GB13110@ando.pearwood.info> <76adba39-8af0-adec-a969-f2ea90a494c6@mrabarnett.plus.com> Message-ID: > Why is it ','.join(iterable), why isn't there join(',', iterable) Because join apply on a string, and strings are defined by the str class, not by a specific protocol (unlike iterables). 2017-09-14 18:43 GMT+02:00 MRAB : > On 2017-09-14 03:55, Steven D'Aprano wrote: > >> >> On Wed, Sep 13, 2017 at 11:05:26PM +0200, Jason H wrote: >> >> > And look, map() even works with all of them, without inheritance, >>> > registration, and whatnot. It's so easy! >>> >>> Define easy. >>> >> >> Opposite of hard or difficult. >> >> You want to map a function? >> >> map(function, values) >> >> is all it takes. You don't have to care whether the collection of values >> supports the map() method, or whether the class calls it "apply", or >> "Map", or something else. All you need care about is that the individual >> items inside the iterable are valid for the function, but you would need >> to do that regardless of how you call it. >> >> [1, 2, 3, {}, 5].map(plusone) # will fail >> >> >> It's far easier for me to do a dir(dict) and see what I can do with it. >>> >> >> And what of the functions that dict doesn't know about? >> >> >> >> This is what python does after all. "Does it have the interface I >>> expect?" Global functions like len(), min(), max(), map(), etc(), don't >>> really tell me the full story. len(7) makes no sense. I can attempt to call >>> a function with an invalid argument. >>> >> >> And you can attempt to call a non-existent method: >> >> x = 7 >> x.len() >> >> Or should that be length() or size() or count() or what? >> >> [].len() makes more sense. >>> >> >> Why? 
Getting the length of a sequence or iterator is not specifically a >> list operation, it is a generic operation that can apply to many >> different kinds of things. >> >> >> Python is weird in that there are these special magical globals >>> >> >> The word you want is "function". >> >> that operate on many things. >>> >> >> What makes that weird? Even Javascript has functions. So do C, Pascal, >> Haskell, C++, Lisp, Scheme, and thousands of other languages. >> >> Why is it ','.join(iterable), why isn't there join(',', iterable) At what >>> point does a method become a global? A member? Do we take the path that >>> everything is a global? Or should all methods be members? So far it seems >>> arbitrary. >>> >> >> Okay, its arbitrary. >> >> Why is it called [].len instead of [].length or {}.size? Why None >> instead of nil or null or nul or NULL or NOTHING? >> >> Many decisions in programming languages are arbitrary. >> >> In Java, strings have .length(), arrays have .length, and collections > have .size(). > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- Antoine Rozo -------------- next part -------------- An HTML attachment was scrubbed... URL: From c at anthonyrisinger.com Thu Sep 14 13:41:37 2017 From: c at anthonyrisinger.com (C Anthony Risinger) Date: Thu, 14 Sep 2017 12:41:37 -0500 Subject: [Python-ideas] LOAD_NAME/LOAD_GLOBAL should be use getattr() In-Reply-To: <20170914130716.GD13110@ando.pearwood.info> References: <20170912161708.x3mnxmrtbd26hsvi@python.ca> <20170914130716.GD13110@ando.pearwood.info> Message-ID: On Thu, Sep 14, 2017 at 8:07 AM, Steven D'Aprano wrote: > On Wed, Sep 13, 2017 at 12:24:31PM +0900, INADA Naoki wrote: > > I'm worring about performance much. > > > > Dict has ma_version from Python 3.6 to be used for future optimization > > including global caching. > > Adding more abstraction layer may make it difficult. > > Can we make it opt-in, by replacing the module __dict__ when and only if > needed? Perhaps we could replace it on the fly with a dict subclass that > defines __missing__? That's virtually the same as __getattr__. > > Then modules which haven't replaced their __dict__ would not see any > slow down at all. > > Does any of this make sense, or am I talking nonsense on stilts? > This is more or less what I was describing here: https://mail.python.org/pipermail/python-ideas/2017-September/047034.html I am also looking at Neil's approach this weekend though. I would be happy with a __future__ that enacted whatever concessions are necessary to define a module as if it were a class body, with import statements maybe being implicitly global. This "new-style" module would preferably avoid the need to populate `sys.modules` with something that can't possibly exist yet (since it's being defined!). Maybe we allow module bodies to contain a `return` or `yield`, making them a simple function or generator? The presence of either would activate this "new-style" module loading: * Modules that call `return` should return the completed module. Importing yourself indirectly would likely cause recursion or be an error (lazy importing would really help here!). Could conceptually expand to something like: ``` global __class__ global __self__ class __class__: def __new__(... namespace-dunders-and-builtins-passed-as-kwds ...): # ... module code ... # ... closures may access __self__ and __class__ ... 
return FancyModule(__name__) __self__ = __class__(__builtins__={...}, __name__='fancy', ...) sys.modules[__self__.__name__] = __self__ ``` * Modules that call `yield` should yield modules. This could allow defining zero modules, multiple modules, overwriting the same module multiple times. Module-level code may then yield an initial object so self-referential imports, in lieu of deferred loading, work better. They might decide to later upgrade the initial module's __class__ (similar to today) or replace outright. Could conceptually expand to something like: ``` global __class__ global __self__ def __hidden_TOS(... namespace-dunders-and-builtins-passed-as-kwds ...): # ... initial module code ... # ... closures may access __self__ and __class__ ... module = yield FancyModuleInitialThatMightRaiseIfUsed(__name__) # ... more module code ... module.__class__ = FancyModule for __self__ in __hidden_TOS(__builtins__={...}, __name__='fancy', ...): __class__ = __self__.__class__ sys.modules[__self__.__name__] = __self__ ``` Otherwise I still have a few ideas around using what we've got, possibly in a backwards compatible way: ``` global __builtins__ = {...} global __class__ global __self__ # Loader dunders. __name__ = 'fancy' # Deferred loading could likely stop this from raising in most cases. # globals is a deferred import dict using __missing__. # possibly sys.modules itself does deferred imports using __missing__. sys.modules[__name__] = RaiseIfTouchedElseReplaceAllRefs(globals()) class __class__: [global] import current_module # ref in cells replaced with __self__ [global] import other_module def bound_module_function(...): pass [global] def simple_module_function(...): pass # ... end module body ... # Likely still a descriptor. __dict__ = globals() __self__ = __class__() sys.modules[__self__.__name__] = __self__ ``` Something to think about. Thanks, -- C Anthony -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexandre.galode at gmail.com Thu Sep 14 14:57:03 2017 From: alexandre.galode at gmail.com (Alexandre GALODE) Date: Thu, 14 Sep 2017 20:57:03 +0200 Subject: [Python-ideas] A PEP to define basical metric which allows to guarantee minimal code quality Message-ID: Hi everybody, I'm a Python dev since about 10 years. I saw that customers are more and more curious about development process, talking from various reference on which allow to define what they call "good quality code". Someones are talking only about Pylint, other one from SonarQube, ... I search for SonarQube for Python, and saw it was using Pylint, coverage and Unittest. I saw also some customers internal tools which, every of them, was using internal tool, pylint, pylama, pep8, ... So, with help of my customers (thanks to them ^^), i realize there was a missing in PEP about this point. I made some search, and saw that there is no PEP which define, not even a minimum, which metrics and/or tools use to evaluate the quality of a Python code; no minimal indication in this goal. Also, my PEP idea was to try to define basical metrics to guarantee minimal code quality. I'd like to purpose it as an informational PEP. So users and implementers would be free to ignore it. I does not see it as a constrining PEP, but as a PEP reference which could give help, and be used by every developers to say he/she is PEP xxxx from its development. What are you thinking about this PEP idea? -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jhihn at gmx.com Thu Sep 14 15:06:59 2017 From: jhihn at gmx.com (Jason H) Date: Thu, 14 Sep 2017 21:06:59 +0200 Subject: [Python-ideas] Make map() better In-Reply-To: References: <710E3AC6-3ED5-4E6B-A738-C0CE7B2E4BBA@gmail.com> <20170914025547.GB13110@ando.pearwood.info> <76adba39-8af0-adec-a969-f2ea90a494c6@mrabarnett.plus.com> Message-ID: >> Why is it ','.join(iterable), why isn't there join(',', iterable) ? > Because join apply on a string, and strings are defined by the str class, not by a specific protocol (unlike iterables). Why? I can iterate over a string. [c for c in 'abc'] It certainly behaves like one... I'd say this is inconsistent because there is no __iter__() and next() on the str class. >>> Many decisions in programming languages are arbitrary. >>?In Java, strings have .length(), arrays have .length, and collections have .size(). You don't have to write wholly new functions. We could keep the globals and just add wrappers: class Iterable: def length(self): return len(self) # list or dict def map(self, f): return map(f, self)) # list def map(self, f): return map(f, self.iteritems()) # dict And this circles back around to my original point. Python is the only language people use. Only a very small few get to have 1 language. The rest of us are stuck switching languages, experiencing pain and ambiguity when writing code. I think that's a legitimate criticism. We've got Python as an outlier with no property or method for length of a collection (C++/Java/JS). I don't care if it's implemented as all the same function, but being the exception is a drawback. (FWIW, I don't like size() unless it's referring to storage size which isn't necessarily equal to the number of iterables.) I do think Python is superior in many, many, ways to all other languages, but as Python and JS skills are often desired in the same engineer, it seems that we're making it harder on the majority of the labor force. As for the English language parsing of map(f, list), it's a simple matter of taking what you have - an iterable, and applying the function to them. You start with shoes that you want to be shined, not the act of shining then apply that to the shoes. It's like the saying when all you have is a hammer, everything starts to look like a nail. You're better off only hammering nails. From rosuav at gmail.com Thu Sep 14 15:31:56 2017 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 15 Sep 2017 05:31:56 +1000 Subject: [Python-ideas] Make map() better In-Reply-To: References: <710E3AC6-3ED5-4E6B-A738-C0CE7B2E4BBA@gmail.com> <20170914025547.GB13110@ando.pearwood.info> <76adba39-8af0-adec-a969-f2ea90a494c6@mrabarnett.plus.com> Message-ID: On Fri, Sep 15, 2017 at 5:06 AM, Jason H wrote: > >>> Why is it ','.join(iterable), why isn't there join(',', iterable) > >> Because join apply on a string, and strings are defined by the str class, not by a specific protocol (unlike iterables). > Why? I can iterate over a string. [c for c in 'abc'] It certainly behaves like one... I'd say this is inconsistent because there is no __iter__() and next() on the str class. There is __iter__, but no next() or __next__() on the string itself. __iter__ makes something iterable; __next__ is on iterators, but not on all iterables. >>> "abc".__iter__() > I do think Python is superior in many, many, ways to all other languages, but as Python and JS skills are often desired in the same engineer, it seems that we're making it harder on the majority of the labor force. > "We" are making it harder? Who's "we"? 
Python predates JavaScript by a few years, and the latter language was spun up in less than two weeks in order to create an 'edge' in the browser wars. So I don't think anyone really planned for anyone to write multi-language code involving Python and JS. ChrisA From antoine.rozo at gmail.com Thu Sep 14 17:03:23 2017 From: antoine.rozo at gmail.com (Antoine Rozo) Date: Thu, 14 Sep 2017 23:03:23 +0200 Subject: [Python-ideas] Make map() better In-Reply-To: References: <710E3AC6-3ED5-4E6B-A738-C0CE7B2E4BBA@gmail.com> <20170914025547.GB13110@ando.pearwood.info> <76adba39-8af0-adec-a969-f2ea90a494c6@mrabarnett.plus.com> Message-ID: > Why? I can iterate over a string. [c for c in 'abc'] It certainly behaves like one... I'd say this is inconsistent because there is no __iter__() and next() on the str class. Yes, strings are iterables. You can use a string as argument of str.join method. But only strings can be used as separators, so there is non need for a generic join method for all types of separators. Python is well designed, you are just not used to it 2017-09-14 21:31 GMT+02:00 Chris Angelico : > On Fri, Sep 15, 2017 at 5:06 AM, Jason H wrote: > > > >>> Why is it ','.join(iterable), why isn't there join(',', iterable) > > > >> Because join apply on a string, and strings are defined by the str > class, not by a specific protocol (unlike iterables). > > Why? I can iterate over a string. [c for c in 'abc'] It certainly > behaves like one... I'd say this is inconsistent because there is no > __iter__() and next() on the str class. > > There is __iter__, but no next() or __next__() on the string itself. > __iter__ makes something iterable; __next__ is on iterators, but not > on all iterables. > > >>> "abc".__iter__() > > > > I do think Python is superior in many, many, ways to all other > languages, but as Python and JS skills are often desired in the same > engineer, it seems that we're making it harder on the majority of the labor > force. > > > > "We" are making it harder? Who's "we"? Python predates JavaScript by a > few years, and the latter language was spun up in less than two weeks > in order to create an 'edge' in the browser wars. So I don't think > anyone really planned for anyone to write multi-language code > involving Python and JS. > > ChrisA > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- Antoine Rozo -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at gmail.com Thu Sep 14 18:26:59 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Fri, 15 Sep 2017 00:26:59 +0200 Subject: [Python-ideas] sys.py In-Reply-To: References: Message-ID: Le 14 sept. 2017 01:01, "Eric Snow" a ?crit : In the case of sys.modules, the problem is that assigning a bogus value (e.g. []) can cause the interpreter to crash. It wasn't a problem until recently when I removed PyInterpreterState.modules and made sys.modules authoritative (see https://bugs.python.org/issue28411). How do you crash Python? Can't we fix the interpreter? Your change makes so I would prefer to keep it if possible. Victor -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ncoghlan at gmail.com Thu Sep 14 18:39:19 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 15 Sep 2017 08:39:19 +1000 Subject: [Python-ideas] PEP 554: Stdlib Module to Support Multiple Interpreters in Python Code In-Reply-To: References: Message-ID: On 14 September 2017 at 08:25, Nick Coghlan wrote: > On 13 September 2017 at 20:45, Koos Zevenhoven wrote: >> It's still just *an* interpreter that happens to run __main__. And who says >> it even needs to be the only one? > > Koos, I've asked multiple times now for you to describe the practical > user benefits you believe will come from dispensing with the existing > notion of a main interpreter (which is *not* something PEP 554 has > created - the main interpreter already exists at the implementation > level, PEP 554 just makes that fact visible at the Python level). Eric addressed this in the latest update, and took the view that since it's a question the can be deferred, it's one that should be deferred, in line with the overall "minimal enabling infrastructure" philosophy of the PEP. On thinking about it further, I believe this may also intersect with some open questions I have around the visibility of *thread* objects across interpreters - the real runtime constraint at the implementation level is the fact that we need a main *thread* in order to sensibly manage the way signal handling works across different platforms, and that's where we may get into trouble if we allow arbitrary subinterpreters to run in the main thread, and accept and process signals directly. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Thu Sep 14 18:55:52 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 15 Sep 2017 08:55:52 +1000 Subject: [Python-ideas] A PEP to define basical metric which allows to guarantee minimal code quality In-Reply-To: References: Message-ID: On 15 September 2017 at 04:57, Alexandre GALODE wrote: > What are you thinking about this PEP idea? We don't really see it as the role of the language development team to formally provide direct advice on what tools people should be using to improve their development workflows (similar to the way that it isn't the C/C++ standards committees recommending tools like Coverity, or Oracle and the OpenJDK teams advising on the use of SonarQube). That said, there *are* collaborative groups working on these kinds of activities: - for linters & style checkers, there's the "PyCQA": http://meta.pycqa.org/en/latest/introduction.html - for testing, there's the "Python Testing Tools Taxonomy" (https://wiki.python.org/moin/PythonTestingToolsTaxonomy) and the testing-in-python mailing list (http://lists.idyll.org/listinfo/testing-in-python) - for gradual typing, we do get involved for the language level changes, but even there, the day-to-day focus of activity is more the typing repo at https://github.com/python/typing/ rather than CPython itself So while there's definitely scope for making these aspects of the Python ecosystem more approachable (similar to what we're aiming to do with packaging.python.org for the software distribution space), a PEP isn't the right vehicle for it. Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Thu Sep 14 19:11:49 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 15 Sep 2017 09:11:49 +1000 Subject: [Python-ideas] Make map() better In-Reply-To: References: <710E3AC6-3ED5-4E6B-A738-C0CE7B2E4BBA@gmail.com> <20170914025547.GB13110@ando.pearwood.info> <76adba39-8af0-adec-a969-f2ea90a494c6@mrabarnett.plus.com> Message-ID: On 15 September 2017 at 03:38, Antoine Rozo wrote: >> Why is it ','.join(iterable), why isn't there join(',', iterable) > > Because join apply on a string, and strings are defined by the str class, > not by a specific protocol (unlike iterables). Join actually used to only be available as a function (string.join in Python 2.x). However, nobody could ever remember whether the parameter order was "join this series of strings with this separator" or "use this separator to join this series of strings" - it didn't have the same kind of natural ordering that map and filter did thanks to the old "apply" builtin ("apply this function to these arguments" - while the apply builtin itself is no longer around, the "callable(*args, **kwds)" ordering that corresponds to the map() and filter() argument ordering lives on). This meant that when string.join became a builtin interface, it became a string method since: 1. Strings are one of the few Python APIs that *aren't* protocol centric - they're so heavily intertwined with the interpreter implementation, that most folks don't even try to emulate or even subclass them*. 2. As a string method, it's obvious what the right order has to be (separator first, since it's the separator that has the method) As a result, the non-obvious to learn, but easy to remember, method spelling became the only spelling when string.join was dropped for Python 3.0. Cheers, Nick. * One case where it was actually fairly common for folks to define their own str subclasses, rich filesystem path objects, finally gained its own protocol in Python 3.6 -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From mistersheik at gmail.com Thu Sep 14 19:15:36 2017 From: mistersheik at gmail.com (Neil Girdhar) Date: Thu, 14 Sep 2017 16:15:36 -0700 (PDT) Subject: [Python-ideas] Make map() better In-Reply-To: References: Message-ID: For these examples, you shouldn't be using map at all. On Wednesday, September 13, 2017 at 11:10:39 AM UTC-4, Jason H wrote: > > The format of map seems off. Coming from JS, all the functions come > second. I think this approach is superior. > > Currently: > map(lambda x: chr(ord('a')+x), range(26)) # ['a', 'b', 'c', 'd', 'e', 'f', > 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', > 'v', 'w', 'x', 'y', 'z'] > > But I think this reads better: > map(range(26), lambda x: chr(ord('a')+x)) > > Currently that results in: > TypeError: argument 2 to map() must support iteration > > Also, how are we to tell what supports map()? > Any iterable should be able to map via: > range(26).map(lambda x: chr(ord('a')+x))) > > While the line length is the same, I think the latter is much more > readable, and the member method avoids parameter order confusion > > For the global map(), > having the iterable first also increases reliability because the lambda > function is highly variable in length, where as parameter names are > generally shorter than even the longest lambda expression. 
> > More readable: IMHO: > map(in, lambda x: chr(ord('a')+x)) > out = map(out, lambda x: chr(ord('a')+x)) > out = (chr(ord('a')+x) for x in out) is the most legible. > out = map(out, lambda x: chr(ord('a')+x)) > Less readable (I have to parse the lambda): > map(lambda x: chr(ord('a')+x), in) > out = map(lambda x: chr(ord('a')+x), out) > out = map(lambda x: chr(ord('a')+x), out) > > But I contend: > range(26).map(lambda x: chr(ord('a')+x))) > is superior to all. > > > > > > > > > > > > > > _______________________________________________ > Python-ideas mailing list > Python... at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ericsnowcurrently at gmail.com Thu Sep 14 19:54:48 2017 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Thu, 14 Sep 2017 17:54:48 -0600 Subject: [Python-ideas] sys.py In-Reply-To: References: Message-ID: On Thu, Sep 14, 2017 at 4:26 PM, Victor Stinner wrote: > How do you crash Python? See https://bugs.python.org/issue31404. > Can't we fix the interpreter? I'm looking into it. In the meantime I've split the original branch up into 3. The first I've already landed. [1] The second I'll land once I resolve some refcount leaks. [2] The final branch is the one that actually drops PyInterpreterState.modules. [3] It's quite small, but that's the part that causes the crash. So we'll have to adapt it if we want to make it work before it can be merged again (or else we'll be right back where we were before I reverted). > Your change makes so I would prefer to keep it if possible. Why in particular do you want to keep the change? -eric [1] https://github.com/python/cpython/pull/3575 [2] https://github.com/python/cpython/pull/3593 [3] https://github.com/ericsnowcurrently/cpython/compare/sys-modules-any-mapping...ericsnowcurrently:remove-modules-from-interpreter-state From alexandre.galode at gmail.com Fri Sep 15 01:58:43 2017 From: alexandre.galode at gmail.com (alexandre.galode at gmail.com) Date: Thu, 14 Sep 2017 22:58:43 -0700 (PDT) Subject: [Python-ideas] A PEP to define basical metric which allows to guarantee minimal code quality In-Reply-To: References: Message-ID: <4c594053-c431-4b29-a8ac-d5c422e6a06f@googlegroups.com> Hi, Thanks for your return, and link. I knew PyCQA but not the others. It seems i didn't success to precise my idea correctly. I'm 100% OK with you that this is not language role to indicate precisely which tools to use. But my idea is only to define basical metrics, useful to evaluate quality code, and related to PEPs if existing. I precise i'd like to propose informational PEP only. I'm not considering that my idea must be mlandatory, but only an indication to improve its code quality. So each developer/society would be free to evaluate each metrics with the way he/it want. As for PEP8 for example, for which each developer can use pep8, pylint, ... The PEP257, for docstrings, indicate how to structure our docstring. My PEP, in my mind, would indicate how to basically evaluate our quality code, only which metrics are useful for that. But only this, no tools indication. Customer i see regularly ask me for being PEP8 & 257 & ... "certified". This PEP could be opportunity to have opportunity saying be "PEPxxxx certified", so customers would be sure that several metrics, based on PEP and other parameter, would be use to gurantee their quality code. 
-------------- next part -------------- An HTML attachment was scrubbed... URL: From tjreedy at udel.edu Fri Sep 15 02:57:42 2017 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 15 Sep 2017 02:57:42 -0400 Subject: [Python-ideas] A PEP to define basical metric which allows to guarantee minimal code quality In-Reply-To: <4c594053-c431-4b29-a8ac-d5c422e6a06f@googlegroups.com> References: <4c594053-c431-4b29-a8ac-d5c422e6a06f@googlegroups.com> Message-ID: On 9/15/2017 1:58 AM, alexandre.galode at gmail.com wrote: > Hi, > > Thanks for your return, and link. I knew PyCQA but not the others. > > It seems i didn't success to precise my idea correctly. I'm 100% OK with > you that this is not language role to indicate precisely which tools to > use. > > But my idea is only to define basical metrics, useful to evaluate > quality code, and related to PEPs if existing. I precise i'd like to > propose informational PEP only. I'm not considering that my idea must be > mlandatory, but only an indication to improve its code quality. So each > developer/society would be free to evaluate each metrics with the way > he/it want. As for PEP8 for example, for which each developer can use > pep8, pylint, ... It seems to me that you are talking about stuff more appropriate to blog posts or the python wiki or even to a web site. > The PEP257, for docstrings, indicate how to structure our docstring. My > PEP, in my mind, would indicate how to basically evaluate our quality > code, only which metrics are useful for that. But only this, no tools > indication. > > Customer i see regularly ask me for being PEP8 & 257 & ... "certified". Why do they care? In my view, style is not quality. It is a means to better quality. I say this as someone who enforces most of PEP 8 on new IDLE patches, whether by myself or others, and who even upgrades existing code. The relevance to users is that it facilitates making IDLE work better. Automated test coverage is also a means to code quality, not quality in itself. I recently fixed a bug in 100% covered code. I discovered it by using the corresponding widgets like a user would. The automated test did not properly simulate a particular pattern of user actions. > This PEP could be opportunity to have opportunity saying be "PEPxxxx > certified", so customers would be sure that several metrics, based on > PEP and other parameter, would be use to gurantee their quality code. PEP 8 started as a guide for new stdlib code, but we do not 'pep8 certify' Python patches to the stdlib. Security, stability, correctness, test coverage, and speed are all more important. It just happened that others eventually adopted PEP 8 for their own use, including in code checkers. Information about .rst only belongs in a PEP because we use it in our own docs. Information about code metrics might belong in a PEP only if we were applying them to the CPython code base. Note that our C code is checked by Coverity, but I don't believe there is a PEP about it. (No title has 'coverity'.) After pydev discussion, a core developer made the available free checks happen and then followed up on the deficiency reports. 
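To make the earlier point about coverage concrete, here is a toy
illustration (not taken from IDLE or any real code base) of how 100%
line coverage can still hide a bug:

def first_word(text):
    words = text.split()   # executed by the test below
    return words[0]        # executed by the test below

def test_first_word():
    # Running this test executes both lines of first_word(), so a
    # line-coverage tool reports the function as fully covered.
    assert first_word("hello world") == "hello"

test_first_word()
# Yet the "fully covered" function still fails on an untested input:
# first_word("")  ->  IndexError: list index out of range

Coverage reports say every line was executed; they say nothing about the
input patterns that were never tried.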
-- Terry Jan Reedy From steve at pearwood.info Fri Sep 15 06:19:54 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 15 Sep 2017 20:19:54 +1000 Subject: [Python-ideas] A PEP to define basical metric which allows to guarantee minimal code quality In-Reply-To: <4c594053-c431-4b29-a8ac-d5c422e6a06f@googlegroups.com> References: <4c594053-c431-4b29-a8ac-d5c422e6a06f@googlegroups.com> Message-ID: <20170915101954.GF13110@ando.pearwood.info> Hi Alexandre, On Thu, Sep 14, 2017 at 10:58:43PM -0700, alexandre.galode at gmail.com wrote: > But my idea is only to define basical metrics, useful to evaluate quality > code, and related to PEPs if existing. I precise i'd like to propose > informational PEP only. I'm not considering that my idea must be > mlandatory, but only an indication to improve its code quality. So each > developer/society would be free to evaluate each metrics with the way he/it > want. As for PEP8 for example, for which each developer can use pep8, > pylint, ... > > The PEP257, for docstrings, indicate how to structure our docstring. My > PEP, in my mind, would indicate how to basically evaluate our quality code, > only which metrics are useful for that. But only this, no tools indication. It might help if you tell us some of these proposed code quality metrics. What do you mean by "code quality", and how do you measure it? The only semi-objective measures of code quality that I am aware of are: - number of bugs per kiloline of code; - percentage of code covered by tests. And possibly this one: http://www.osnews.com/story/19266/WTFs_m Given two code bases in the same language, one is better quality than the other if the number of bugs per thousand lines is smaller, and the percentage of code coverage is higher. I am interested in, but suspicious of, measures of code complexity, coupling, code readability, and similar. What do you consider objective, reliable code quality metrics? -- Steve From jhihn at gmx.com Fri Sep 15 10:21:56 2017 From: jhihn at gmx.com (Jason H) Date: Fri, 15 Sep 2017 16:21:56 +0200 Subject: [Python-ideas] Make map() better In-Reply-To: References: <710E3AC6-3ED5-4E6B-A738-C0CE7B2E4BBA@gmail.com> <20170914025547.GB13110@ando.pearwood.info> <76adba39-8af0-adec-a969-f2ea90a494c6@mrabarnett.plus.com> Message-ID: I'm going to respond to a few replies with this one. > > Because join apply on a string, and strings are defined by the str class, > > not by a specific protocol (unlike iterables). > > Join actually used to only be available as a function (string.join in > Python 2.x). However, nobody could ever remember whether the parameter > order was "join this series of strings with this separator" or "use > this separator to join this series of strings" - it didn't have the > same kind of natural ordering that map and filter did thanks to the > old "apply" builtin ("apply this function to these arguments" - while > the apply builtin itself is no longer around, the "callable(*args, > **kwds)" ordering that corresponds to the map() and filter() argument > ordering lives on). > > This meant that when string.join became a builtin interface, it became > a string method since: > > 1. Strings are one of the few Python APIs that *aren't* protocol > centric - they're so heavily intertwined with the interpreter > implementation, that most folks don't even try to emulate or even > subclass them*. > 2. 
As a string method, it's obvious what the right order has to be > (separator first, since it's the separator that has the method) Your second point is aligned with my initial point. Providing it as a class method further removes the ambiguity of what the right order is because of there is only one parameter: the function. As for the comment about Python being better designed, I never implied it wasn't. Python is my favorite. However as someone who uses multiple languages, and JS has risen in popularity, and it's not going anywhere, I seek a more peaceful co-existence with this language - a language that was designed in 10 days in 1995. In a covert way, by embracing some of the not-too-terrible syntax you make it easier for JS programmers to feel at home in Python, and adopt it. I see this only has promoting a peaceful coexistence. Another pain point is python uses [].append() and JS uses [].join() Having a wrapper for append would be helpful. And for that matter, why isn't append/extend a global? I can add things to lots of different collections. lists, sets, strings... From antoine.rozo at gmail.com Fri Sep 15 10:31:54 2017 From: antoine.rozo at gmail.com (Antoine Rozo) Date: Fri, 15 Sep 2017 16:31:54 +0200 Subject: [Python-ideas] Make map() better In-Reply-To: References: <710E3AC6-3ED5-4E6B-A738-C0CE7B2E4BBA@gmail.com> <20170914025547.GB13110@ando.pearwood.info> <76adba39-8af0-adec-a969-f2ea90a494c6@mrabarnett.plus.com> Message-ID: > And for that matter, why isn't append/extend a global? I can add things to lots of different collections. lists, sets, strings... No, append is relevant when the collection is ordered. You update dictionnaries, you make unions of sets, and you append lists to another ones. These operations are specific for each datatype. And not all iterables need operations to append new elements. But you can chain generic iterables (with itertools.chain). 2017-09-15 16:21 GMT+02:00 Jason H : > I'm going to respond to a few replies with this one. > > > > Because join apply on a string, and strings are defined by the str > class, > > > not by a specific protocol (unlike iterables). > > > > Join actually used to only be available as a function (string.join in > > Python 2.x). However, nobody could ever remember whether the parameter > > order was "join this series of strings with this separator" or "use > > this separator to join this series of strings" - it didn't have the > > same kind of natural ordering that map and filter did thanks to the > > old "apply" builtin ("apply this function to these arguments" - while > > the apply builtin itself is no longer around, the "callable(*args, > > **kwds)" ordering that corresponds to the map() and filter() argument > > ordering lives on). > > > > This meant that when string.join became a builtin interface, it became > > a string method since: > > > > 1. Strings are one of the few Python APIs that *aren't* protocol > > centric - they're so heavily intertwined with the interpreter > > implementation, that most folks don't even try to emulate or even > > subclass them*. > > 2. As a string method, it's obvious what the right order has to be > > (separator first, since it's the separator that has the method) > > Your second point is aligned with my initial point. Providing it as a > class method further removes the ambiguity of what the right order is > because of there is only one parameter: the function. > > As for the comment about Python being better designed, I never implied it > wasn't. Python is my favorite. 
However as someone who uses multiple > languages, and JS has risen in popularity, and it's not going anywhere, I > seek a more peaceful co-existence with this language - a language that was > designed in 10 days in 1995. In a covert way, by embracing some of the > not-too-terrible syntax you make it easier for JS programmers to feel at > home in Python, and adopt it. I see this only has promoting a peaceful > coexistence. > > Another pain point is python uses [].append() and JS uses [].join() Having > a wrapper for append would be helpful. > > And for that matter, why isn't append/extend a global? I can add things to > lots of different collections. lists, sets, strings... > > > > > -- Antoine Rozo -------------- next part -------------- An HTML attachment was scrubbed... URL: From mertz at gnosis.cx Fri Sep 15 10:42:57 2017 From: mertz at gnosis.cx (David Mertz) Date: Fri, 15 Sep 2017 07:42:57 -0700 Subject: [Python-ideas] Make map() better In-Reply-To: References: <710E3AC6-3ED5-4E6B-A738-C0CE7B2E4BBA@gmail.com> <20170914025547.GB13110@ando.pearwood.info> <76adba39-8af0-adec-a969-f2ea90a494c6@mrabarnett.plus.com> Message-ID: On Sep 15, 2017 7:23 AM, "Jason H" wrote: Another pain point is python uses [].append() and JS uses [].join() Having a wrapper for append would be helpful. There should be one, and only one, obvious way to do it. And for that matter, why isn't append/extend a global? I can add things to lots of different collections. lists, sets, strings... This is a key misunderstanding. You CANNOT "append" to lots of collections. You can ADD to a set, which has a quite different semantics than appending (and hence a different name). You can neither add nor append to a string because they are immutable. I can imagine various collections where appending might make sense (mutable strings?). None of them are in builtins. If a 3rd party wrote such a collection with append semantics, they'd probably name it append... e.g. collections.deque has an "append" (but also an "appendleft"). -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Fri Sep 15 12:24:20 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 16 Sep 2017 02:24:20 +1000 Subject: [Python-ideas] Make map() better In-Reply-To: References: <710E3AC6-3ED5-4E6B-A738-C0CE7B2E4BBA@gmail.com> <20170914025547.GB13110@ando.pearwood.info> <76adba39-8af0-adec-a969-f2ea90a494c6@mrabarnett.plus.com> Message-ID: <20170915162419.GG13110@ando.pearwood.info> Jason, I'm sorry if you feel that everyone is piling on you to poo-poo your ideas, but we've heard it all before. To you it might seem like "peaceful coexistence", but we already have peaceful coexistence. Python exists, and Javascript exists (to say nothing of thousands of other languages, good bad and indifferent), and we're hardly at war in any way. Rather, what it sounds like to us is "Hey, Python is really great, let's make it worse so Javascript coders won't have to learn anything new!" Um... what's in it for *us*? What benefit do we get? Honestly, there are so many differences between Python and Javascript that having to learn a handful more or less won't make any difference. The syntax is different, the keywords are different, the standard library is different, the semantics of code is different, the names of functions and objects are different, the methods are different, the idioms of what is considered best practice are different... Why, I could almost believe that Python and Javascript were different languages! 
-- Steve From jhihn at gmx.com Fri Sep 15 14:03:32 2017 From: jhihn at gmx.com (Jason H) Date: Fri, 15 Sep 2017 20:03:32 +0200 Subject: [Python-ideas] Make map() better In-Reply-To: <20170915162419.GG13110@ando.pearwood.info> References: <710E3AC6-3ED5-4E6B-A738-C0CE7B2E4BBA@gmail.com> <20170914025547.GB13110@ando.pearwood.info> <76adba39-8af0-adec-a969-f2ea90a494c6@mrabarnett.plus.com> <20170915162419.GG13110@ando.pearwood.info> Message-ID: > Sent: Friday, September 15, 2017 at 12:24 PM > From: "Steven D'Aprano" > To: python-ideas at python.org > Subject: Re: [Python-ideas] Make map() better > > Jason, > > I'm sorry if you feel that everyone is piling on you to poo-poo your > ideas, but we've heard it all before. To you it might seem like > "peaceful coexistence", but we already have peaceful coexistence. Python > exists, and Javascript exists (to say nothing of thousands of other > languages, good bad and indifferent), and we're hardly at war in any > way. > > Rather, what it sounds like to us is "Hey, Python is really great, let's > make it worse so Javascript coders won't have to learn anything new!" > > Um... what's in it for *us*? What benefit do we get? > > Honestly, there are so many differences between Python and Javascript > that having to learn a handful more or less won't make any difference. > The syntax is different, the keywords are different, the standard > library is different, the semantics of code is different, the names of > functions and objects are different, the methods are different, the > idioms of what is considered best practice are different... > > Why, I could almost believe that Python and Javascript were different > languages! Really, there is no offense taken. Clearly, I am a genius and you all are wrong. ;-) But really, I've learned a few things along the way. Though nothing to convince me that I'm wrong or it's a bad idea. It's just not liked by the greybeards, which I can appreciate. "here's some new punk kid, get off my lawn!" type of mentality. Dunning-Kruger, etc. Hyperbole aside, you (and others) raise some very good points. I think your question deserves an answer: > Um... what's in it for *us*? What benefit do we get? How that question is answered, depends on who is 'us'? If you're a bunch of python purist greybeards, then conceivably nothing. But, according to [Uncle] Bob Martin, the software industry doubles in engineers every 5 years. meaning that they greybeards are asymptotically becoming a minority. The features I've requested are to ease newcomers who will undoubtedly have JS* experience. * I may have focused too much on JS specifically, however I split my time between C/C++/Java/Python/JS. These days it's mainly JS and Python. But the request to move map(), etc to member functions is just an OOP-motivated one. The C++ or Java developer in me also wants those things. So now we have engineers coming from 2 other languages in addition to JS that would benefit. Why should it be Python? I have to admit this isn't about fairness. Clearly Python's concepts are superior to JS, but the responsibility falls to Python because it's far more mutable than the other languages. The process to improve JS if fraught with limitations from the start. Some are technical limitations, some are managing-body imposed, and some are just market share (there must be more than one browser implementation). Again, Python wins. At the end of the day, I'm just asking for a the 1% of enhancements that will eliminate 98% of the frustration of polyglots. 
I can't say with any certainty how that shape the future, but I would expect that it would make Python even more popular while increasing productivity. I can't see that there would be any real downside. Except that there would be more than one way in some scenarios, but that additional way would be balanced with familiarity with other languages. From ned at nedbatchelder.com Fri Sep 15 14:54:01 2017 From: ned at nedbatchelder.com (Ned Batchelder) Date: Fri, 15 Sep 2017 14:54:01 -0400 Subject: [Python-ideas] Make map() better In-Reply-To: References: <710E3AC6-3ED5-4E6B-A738-C0CE7B2E4BBA@gmail.com> <20170914025547.GB13110@ando.pearwood.info> <76adba39-8af0-adec-a969-f2ea90a494c6@mrabarnett.plus.com> <20170915162419.GG13110@ando.pearwood.info> Message-ID: <66ae915f-efee-fa6b-429e-193d6e31e2d5@nedbatchelder.com> On 9/15/17 2:03 PM, Jason H wrote: > It's just not liked by the greybeards, which I can appreciate. "here's some new punk kid, get off my lawn!" type of mentality. Dunning-Kruger, etc. I'm not sure if you meant this tongue-in-cheek or not.? The main problem with your proposal is that it would break existing code. This seems like a really flip way to address that concern.? I suspect even the hip young kids would like existing code to keep working. map() isn't even used much in Python, since most people prefer list comprehensions.? There really are many differences between Python and Javascript, which go much deeper than the order of arguments to built-in functions.? You are in a multi-lingual world.? Accept it. --Ned. From ethan at stoneleaf.us Fri Sep 15 15:26:14 2017 From: ethan at stoneleaf.us (Ethan Furman) Date: Fri, 15 Sep 2017 12:26:14 -0700 Subject: [Python-ideas] Make map() better In-Reply-To: References: <710E3AC6-3ED5-4E6B-A738-C0CE7B2E4BBA@gmail.com> <20170914025547.GB13110@ando.pearwood.info> <76adba39-8af0-adec-a969-f2ea90a494c6@mrabarnett.plus.com> <20170915162419.GG13110@ando.pearwood.info> Message-ID: <59BC2956.9080206@stoneleaf.us> On 09/15/2017 11:03 AM, Jason H wrote: > From: "Steven D'Aprano" >> Um... what's in it for *us*? What benefit do we get? > > How that question is answered, depends on who is 'us'? If you're a bunch of > python purist greybeards, then conceivably nothing. How about refugees from other languages? Your suggestions only make Python worse -- hardly a commendable course of action. > The features I've requested are to ease newcomers who will undoubtedly have > JS* experience. The point of having different languages is to have different ways of doing things. Certain mindsets do better with Python's ways, others do better with JS's ways. That doesn't mean we should have the languages converge into the same thing. Aside: your suggestion that anyone who likes the philosophy of Python and doesn't want to incorporate features from other languages that are contrary to it is just old and set in their ways is offensive. Such assertions weaken your arguments and your reputation. -- ~Ethan~ From steve at pearwood.info Fri Sep 15 21:58:17 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 16 Sep 2017 11:58:17 +1000 Subject: [Python-ideas] Make map() better In-Reply-To: References: <20170914025547.GB13110@ando.pearwood.info> <76adba39-8af0-adec-a969-f2ea90a494c6@mrabarnett.plus.com> <20170915162419.GG13110@ando.pearwood.info> Message-ID: <20170916015817.GH13110@ando.pearwood.info> On Fri, Sep 15, 2017 at 08:03:32PM +0200, Jason H wrote: > But really, I've learned a few things along the way. 
Though nothing to > convince me that I'm wrong or it's a bad idea. It's just not liked by > the greybeards, which I can appreciate. "here's some new punk kid, get > off my lawn!" type of mentality. Dunning-Kruger, etc. "The Dunning?Kruger effect is a cognitive bias wherein persons of low ability suffer from illusory superiority, mistakenly assessing their cognitive ability as greater than it is. The cognitive bias of illusory superiority derives from the metacognitive inability of low-ability persons to recognize their own ineptitude." -- Wikipedia. If you're attempting to win us over to your side, stating that we're too stupid to realise how dumb we are is NOT the right way to do so. But since you've just taken the gloves off, and flung them right in our faces, perhaps you ought to take a long hard look in the mirror. You've been told that at least some of the changes you're asking for are off the table because they will break working code. And that is still not enough to convince you that they are a bad idea? Breaking thousands of Python programs just so that people coming from Javascript have only 99 differences to learn instead of 100 is not an acceptable tradeoff. Adding bloat and redundant aliases to methods, especially using inappropriate names like "join" for append, increases the cognitive burden on ALL Python programmers. It won't even decrease the amount of differences Javascript coders have to learn, because they will still come across list.append in existing code. So these aliases will be nothing but lose-lose for everyone: - those who have to maintain and document them - those who have to learn them - those who have to answer the question "what's the difference between list.append and list.join, they seem to do the same thing" - those who have to read them in others' code - and those who have to decide each time whether to spell it alist.join or alist.append. I'm sorry that Python's OOP design is not pure enough for your taste, but we like it the way it is, and the popularity of the language demonstrates that so do millions of others. We're not going to ruin what makes Python great by trying to be poor clones of Java, C++ or Javascript, even if it decreases the learning time for people coming from Java, C++ or Javascript by one or two minutes. We're going to continue to prefer a mixed interface where functions are used for some operations and methods for others, and if that upsets you then you should read this: http://steve-yegge.blogspot.com/2006/03/execution-in-kingdom-of-nouns.html -- Steve From alexandre.galode at gmail.com Sun Sep 17 14:17:38 2017 From: alexandre.galode at gmail.com (alexandre.galode at gmail.com) Date: Sun, 17 Sep 2017 11:17:38 -0700 (PDT) Subject: [Python-ideas] A PEP to define basical metric which allows to guarantee minimal code quality In-Reply-To: <20170915101954.GF13110@ando.pearwood.info> References: <4c594053-c431-4b29-a8ac-d5c422e6a06f@googlegroups.com> <20170915101954.GF13110@ando.pearwood.info> Message-ID: Hi, thanks for your answer, and your questions. Very good draw. I printed it in my office ^^ First i'd like to precise that, as in title, aim is not to gurantee quality but minimal quality. I think it's really two different things. 
About metrics, my ideas was about following (for the moment): - Basical Python Rules respect: PEP8 & PEP20 respect - Docstring/documentation respect: PEP257 respect - Code readability: percentage of commentary line and percentage of blank line - Code maintainability / complexity: the facility to re-read code if old code, or to understand code for an external developer. If not comprehensive, for example, i use McCabe in my work - Code coverage: by unit tests >From your question on objective metrics, i don't think that reliable metrics exists. We can only verify that minimal quality can be reached. As you say, it's a subjective apprehension, but in my mind, this "PEP" could be a guideline to improve development for some developers. -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Sun Sep 17 16:31:43 2017 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 17 Sep 2017 22:31:43 +0200 Subject: [Python-ideas] Move some regrtest or test.support features into unittest? References: Message-ID: <20170917223143.4b4c2949@fsol> On Wed, 13 Sep 2017 15:42:56 +0200 Victor Stinner wrote: > I would like to see if and how we can integrate/move some regrtest > features into the unittest module. Example of regrtest features: > > * skip a test if it allocates too much memory, command line argument > to specify how many memory a test is allowed to allocate (ex: > --memlimit=2G for 2 GB of memory) That would be suitable for a plugin if unittest had a plugin architecture, but not as a core functionality IMO. > * concept of "resource" like "network" (connect to external network > servers, to the Internet), "cpu" (CPU intensive tests), etc. Tests are > skipped by default and enabled by the -u command line option (ex: "-u > cpu). Good as a core functionality IMO. > * track memory leaks: check the reference counter, check the number of > allocated memory blocks, check the number of open file descriptors. Good for a plugin IMO. > * detect if the test spawned a thread or process and the > thread/process is still running at the test exit Good for a plugin IMO. > * --timeout: watchdog killing the test if the run time exceed the > timeout in seconds (use faulthandler.dump_traceback_later) Good for a plugin IMO. > * multiprocessing: run tests in subprocesses, in parallel Good as a core functionality IMO. > * redirect stdout/stderr to pipes (StringIO objects), ignore them on > success, or dump them to stdout/stderr on test failure Good for a plugin IMO. > * --slowest: top 10 of the slowest tests Good for a plugin IMO. > * --randomize: randomize test order Will be tricky to mix with setupClass. > * --match, --matchfile, -x: filter tests Good as a core functionality IMO. > * --forever: run the test in a loop until it fails (or is interrupted by CTRL+c) Good for a plugin IMO. > * --list-tests / --list-cases: list test files / test methods Good as a core functionality IMO. > * --fail-env-changed: mark tests as failed if a test altered the environment Good for a plugin IMO. > * detect if a "global variable" of the standard library was modified > but not restored by the test: Good for a plugin IMO. > * test.bisect: bisection to identify the failing method, used to track > memory leaks or identify a test leaking a resource (ex: create a file > but don't remove it) Good as a core functionality IMO. > I started to duplicate code in many files of Lib/test/test_*.py to > check if tests "leak running threads" ("dangling threads"). 
Example > from Lib/test/test_theading.py: > > class BaseTestCase(unittest.TestCase): > def setUp(self): > self._threads = test.support.threading_setup() > > def tearDown(self): > test.support.threading_cleanup(*self._threads) > test.support.reap_children() > > I would like to get this test "for free" directly from the regular > unittest.TestCase class, but I don't know how to extend the unittest > module for that? Instead of creating tons of distinct base TestCase classes, you can just provide helper functions / methods calling addCleanup(). Regards Antoine. From arj.python at gmail.com Sun Sep 17 16:42:06 2017 From: arj.python at gmail.com (Abdur-Rahmaan Janhangeer) Date: Mon, 18 Sep 2017 00:42:06 +0400 Subject: [Python-ideas] Make map() better In-Reply-To: References: Message-ID: in that case you presented maybe it is more readeable but considering that lists as ["a","b"] can be inputted, not just using range, the result might be as it was. change x y people are happy change y and z people hate it Abdur-Rahmaan Janhangeer, Mauritius abdurrahmaanjanhangeer.wordpress.com On 13 Sep 2017 19:10, "Jason H" wrote: > The format of map seems off. Coming from JS, all the functions come > second. I think this approach is superior. > > Currently: > map(lambda x: chr(ord('a')+x), range(26)) # ['a', 'b', 'c', 'd', 'e', 'f', > 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', > 'v', 'w', 'x', 'y', 'z'] > > But I think this reads better: > map(range(26), lambda x: chr(ord('a')+x)) > > Currently that results in: > TypeError: argument 2 to map() must support iteration > > Also, how are we to tell what supports map()? > Any iterable should be able to map via: > range(26).map(lambda x: chr(ord('a')+x))) > > While the line length is the same, I think the latter is much more > readable, and the member method avoids parameter order confusion > > For the global map(), > having the iterable first also increases reliability because the lambda > function is highly variable in length, where as parameter names are > generally shorter than even the longest lambda expression. > > More readable: IMHO: > map(in, lambda x: chr(ord('a')+x)) > out = map(out, lambda x: chr(ord('a')+x)) > out = map(out, lambda x: chr(ord('a')+x)) > > Less readable (I have to parse the lambda): > map(lambda x: chr(ord('a')+x), in) > out = map(lambda x: chr(ord('a')+x), out) > out = map(lambda x: chr(ord('a')+x), out) > > But I contend: > range(26).map(lambda x: chr(ord('a')+x))) > is superior to all. > > > > > > > > > > > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cody.piersall at gmail.com Sun Sep 17 22:34:59 2017 From: cody.piersall at gmail.com (Cody Piersall) Date: Sun, 17 Sep 2017 21:34:59 -0500 Subject: [Python-ideas] Use __all__ for dir(module) (Was: PEP 562) Message-ID: On Tue, Sep 12, 2017 at 3:26 AM, Ivan Levkivskyi wrote: > @Cody >> I still think the better way >> to solve the custom dir() would be to change the module __dir__ >> method to check if __all__ is defined and use it to generate the >> result if it exists. This seems like a logical enhancement to me, >> and I'm planning on writing a patch to implement this. Whether it >> would be accepted is still an open issue though. 
> > This seems a reasonable rule to me, I can also make this patch if > you will not have time. I submitted a PR:https://github.com/python/cpython/pull/3610 and a BPO issue: https://bugs.python.org/issue31503 R. David Murray pointed out that this is a backwards-incompatible change. This is technically true, but I don't know of any code that depends on this behavior. (Of course, that does not mean it does not exist!) >From my perspective, the big benefit of this change is that tab-completion will get better for libraries which are already defining __all__. This will make for a better REPL experience. The only code in the stdlib that broke were tests in test_pkg which were explicitly checking the return value of dir(). Apart from that, nothing broke. If a module does not have __all__ defined, then nothing changes for that module. Cody From guido at python.org Sun Sep 17 22:49:31 2017 From: guido at python.org (Guido van Rossum) Date: Sun, 17 Sep 2017 19:49:31 -0700 Subject: [Python-ideas] Use __all__ for dir(module) (Was: PEP 562) In-Reply-To: References: Message-ID: I ave to agree with the other committers who already spoke up. I'm not using tab completion much (I have a cranky old Emacs setup), but isn't making tab completion better a job for editor authors (or language-support-for-editor authors) rather than for the core language? What editor are you using that calls dir() for tab completion? On Sun, Sep 17, 2017 at 7:34 PM, Cody Piersall wrote: > On Tue, Sep 12, 2017 at 3:26 AM, Ivan Levkivskyi > wrote: > > @Cody > >> I still think the better way > >> to solve the custom dir() would be to change the module __dir__ > >> method to check if __all__ is defined and use it to generate the > >> result if it exists. This seems like a logical enhancement to me, > >> and I'm planning on writing a patch to implement this. Whether it > >> would be accepted is still an open issue though. > > > > This seems a reasonable rule to me, I can also make this patch if > > you will not have time. > > I submitted a PR:https://github.com/python/cpython/pull/3610 > and a BPO issue: https://bugs.python.org/issue31503 > > R. David Murray pointed out that this is a backwards-incompatible > change. This is technically true, but I don't know of any code that > depends on this behavior. (Of course, that does not mean it does not > exist!) > > From my perspective, the big benefit of this change is that > tab-completion will get better for libraries which are already > defining __all__. This will make for a better REPL experience. The > only code in the stdlib that broke were tests in test_pkg which were > explicitly checking the return value of dir(). Apart from that, > nothing broke. > > If a module does not have __all__ defined, then nothing changes for that > module. > > Cody > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From cody.piersall at gmail.com Mon Sep 18 00:40:56 2017 From: cody.piersall at gmail.com (Cody Piersall) Date: Sun, 17 Sep 2017 23:40:56 -0500 Subject: [Python-ideas] Use __all__ for dir(module) (Was: PEP 562) In-Reply-To: References: Message-ID: On Sun, Sep 17, 2017 at 9:49 PM, Guido van Rossum wrote: > I ave to agree with the other committers who already spoke up. 
> > I'm not using tab completion much (I have a cranky old Emacs setup), but > isn't making tab completion better a job for editor authors (or > language-support-for-editor authors) rather than for the core language? What > editor are you using that calls dir() for tab completion? > >> From my perspective, the big benefit of this change is that >> tab-completion will get better for libraries which are already >> defining __all__. This will make for a better REPL experience. The >> only code in the stdlib that broke were tests in test_pkg which were >> explicitly checking the return value of dir(). Apart from that, >> nothing broke. I'm sorry, I should have been more specific here. The tab completion provided by Jupyter uses dir() to provide the relevant tab-completion options. I was motivated to put this PR together whenever someone (I think Nathaniel Smith) was talking about setting a custom __dir__ on a module by overriding class, and IIRC his motivation was so that no one tab-completes to use a deprecated attribute. I spend a _lot_ of time in a Jupyter environment, so most of my tab completion is provided by whatever dir() returns. I think this is a pretty common setup. The default REPL also uses dir() for populating the completion list. Since my only interaction with dir() has to do with tab completion, I may be unaware of use cases where this PR would actually break working code. I understand (and agree with!) the emphasis the Python developers place on backwards compatibility, but I just can't think of code that would be broken by this change. Of course, that doesn't mean it doesn't exist! Cody From desmoulinmichel at gmail.com Mon Sep 18 02:35:40 2017 From: desmoulinmichel at gmail.com (Michel Desmoulin) Date: Mon, 18 Sep 2017 08:35:40 +0200 Subject: [Python-ideas] Use __all__ for dir(module) (Was: PEP 562) In-Reply-To: References: Message-ID: <72d6cd37-187d-317b-7f51-1909bad66faa@gmail.com> Le 18/09/2017 ? 06:40, Cody Piersall a ?crit?: > On Sun, Sep 17, 2017 at 9:49 PM, Guido van Rossum wrote: >> I ave to agree with the other committers who already spoke up. >> >> I'm not using tab completion much (I have a cranky old Emacs setup), but >> isn't making tab completion better a job for editor authors (or >> language-support-for-editor authors) rather than for the core language? What >> editor are you using that calls dir() for tab completion? >> >>> From my perspective, the big benefit of this change is that >>> tab-completion will get better for libraries which are already >>> defining __all__. This will make for a better REPL experience. The >>> only code in the stdlib that broke were tests in test_pkg which were >>> explicitly checking the return value of dir(). Apart from that, >>> nothing broke. > > I'm sorry, I should have been more specific here. The tab completion > provided by Jupyter uses dir() to provide the relevant tab-completion > options. I was motivated to put this PR together whenever someone (I > think Nathaniel Smith) was talking about setting a custom __dir__ on a > module by overriding class, and IIRC his motivation was so that no one > tab-completes to use a deprecated attribute. I spend a _lot_ of time > in a Jupyter environment, so most of my tab completion is provided by > whatever dir() returns. I think this is a pretty common setup. In that case, the problem is that jupyter should check up __all__ and act on it. Potentially breaking the language for that seems very overkill. 
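For instance, a completion layer can already prefer __all__ today without any change to dir() itself (a rough sketch, not Jupyter's actual code):

    def completion_names(module):
        # Prefer the module's declared public API when __all__ is defined;
        # fall back to plain dir() otherwise.
        public = getattr(module, "__all__", None)
        if public is not None:
            return sorted(str(name) for name in public)
        return dir(module)

    >>> import collections
    >>> set(completion_names(collections)) == set(collections.__all__)
    True
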
I'm sure very few code actually depends on the current dir() behavior, but it's more about the social contract of not breaking things unless there is a very good reason. From desmoulinmichel at gmail.com Mon Sep 18 02:38:04 2017 From: desmoulinmichel at gmail.com (Michel Desmoulin) Date: Mon, 18 Sep 2017 08:38:04 +0200 Subject: [Python-ideas] Move some regrtest or test.support features into unittest? In-Reply-To: <20170917223143.4b4c2949@fsol> References: <20170917223143.4b4c2949@fsol> Message-ID: <1bd5757a-2ccd-b113-423f-6eacbc6fb343@gmail.com> Can you elaborate on _why_ you think something is good for core/a plugin ? Cause right now it's impossible to know what's the logic behind. Le 17/09/2017 ? 22:31, Antoine Pitrou a ?crit?: > On Wed, 13 Sep 2017 15:42:56 +0200 > Victor Stinner > wrote: >> I would like to see if and how we can integrate/move some regrtest >> features into the unittest module. Example of regrtest features: >> >> * skip a test if it allocates too much memory, command line argument >> to specify how many memory a test is allowed to allocate (ex: >> --memlimit=2G for 2 GB of memory) > > That would be suitable for a plugin if unittest had a plugin > architecture, but not as a core functionality IMO. > >> * concept of "resource" like "network" (connect to external network >> servers, to the Internet), "cpu" (CPU intensive tests), etc. Tests are >> skipped by default and enabled by the -u command line option (ex: "-u >> cpu). > > Good as a core functionality IMO. > >> * track memory leaks: check the reference counter, check the number of >> allocated memory blocks, check the number of open file descriptors. > > Good for a plugin IMO. > >> * detect if the test spawned a thread or process and the >> thread/process is still running at the test exit > > Good for a plugin IMO. > >> * --timeout: watchdog killing the test if the run time exceed the >> timeout in seconds (use faulthandler.dump_traceback_later) > > Good for a plugin IMO. > >> * multiprocessing: run tests in subprocesses, in parallel > > Good as a core functionality IMO. > >> * redirect stdout/stderr to pipes (StringIO objects), ignore them on >> success, or dump them to stdout/stderr on test failure > > Good for a plugin IMO. > >> * --slowest: top 10 of the slowest tests > > Good for a plugin IMO. > >> * --randomize: randomize test order > > Will be tricky to mix with setupClass. > >> * --match, --matchfile, -x: filter tests > > Good as a core functionality IMO. > >> * --forever: run the test in a loop until it fails (or is interrupted by CTRL+c) > > Good for a plugin IMO. > >> * --list-tests / --list-cases: list test files / test methods > > Good as a core functionality IMO. > >> * --fail-env-changed: mark tests as failed if a test altered the environment > > Good for a plugin IMO. > >> * detect if a "global variable" of the standard library was modified >> but not restored by the test: > > Good for a plugin IMO. > >> * test.bisect: bisection to identify the failing method, used to track >> memory leaks or identify a test leaking a resource (ex: create a file >> but don't remove it) > > Good as a core functionality IMO. > >> I started to duplicate code in many files of Lib/test/test_*.py to >> check if tests "leak running threads" ("dangling threads"). 
Example >> from Lib/test/test_theading.py: >> >> class BaseTestCase(unittest.TestCase): >> def setUp(self): >> self._threads = test.support.threading_setup() >> >> def tearDown(self): >> test.support.threading_cleanup(*self._threads) >> test.support.reap_children() >> >> I would like to get this test "for free" directly from the regular >> unittest.TestCase class, but I don't know how to extend the unittest >> module for that? > > Instead of creating tons of distinct base TestCase classes, you can just > provide helper functions / methods calling addCleanup(). > > Regards > > Antoine. > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > From sureshvv at hotmail.com Mon Sep 18 02:43:32 2017 From: sureshvv at hotmail.com (suresh) Date: Mon, 18 Sep 2017 12:13:32 +0530 Subject: [Python-ideas] A PEP to define basical metric which allows to guarantee minimal code quality In-Reply-To: References: <4c594053-c431-4b29-a8ac-d5c422e6a06f@googlegroups.com> <20170915101954.GF13110@ando.pearwood.info> Message-ID: https://pypi.python.org/pypi/radon impements some of these metrics From alexandre.galode at gmail.com Mon Sep 18 04:48:36 2017 From: alexandre.galode at gmail.com (alexandre.galode at gmail.com) Date: Mon, 18 Sep 2017 01:48:36 -0700 (PDT) Subject: [Python-ideas] A PEP to define basical metric which allows to guarantee minimal code quality In-Reply-To: References: <4c594053-c431-4b29-a8ac-d5c422e6a06f@googlegroups.com> <20170915101954.GF13110@ando.pearwood.info> Message-ID: Hi, effectively it's one of the numerous tools whih allows to have metrics on code. This PEP would not have to precise tools name, but it's a good example ^^ ; and can give some more ideas on complementary metrics -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at gmail.com Mon Sep 18 06:16:35 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Mon, 18 Sep 2017 12:16:35 +0200 Subject: [Python-ideas] Move some regrtest or test.support features into unittest? In-Reply-To: <20170917223143.4b4c2949@fsol> References: <20170917223143.4b4c2949@fsol> Message-ID: Antoine: >> * skip a test if it allocates too much memory, command line argument >> to specify how many memory a test is allowed to allocate (ex: >> --memlimit=2G for 2 GB of memory) > > That would be suitable for a plugin if unittest had a plugin > architecture, but not as a core functionality IMO. My hidden question is more why unittest doesn't already have a plugin system, whereas there are tons on projects based on unittest extending its features like pytest, nose, testtools, etc. Longer list: https://wiki.python.org/moin/PythonTestingToolsTaxonomy pytest has a plugin system. I didn't use it, but I know that there is a pytest-faulthandler to enable my faulthandler module, and this extension seems tp be used in the wild: https://pypi.python.org/pypi/pytest-faulthandler I found a list of pytest extensions: https://docs.pytest.org/en/latest/plugins.html It seems like the lack of plugin architecture prevented enhancements. Example: "unittest: display time used by each test case" https://bugs.python.org/issue4080 Michael Foord's comment (July, 2010): "Even if it is added to the core it should be in the form of an extension (plugin) so please don't update the patch until this is in place." 
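(To illustrate Antoine's addCleanup() suggestion above: a shared helper, with made-up names, could look roughly like this.)

    import threading
    import unittest

    def track_dangling_threads(testcase):
        # Call this from setUp() instead of copying a custom BaseTestCase
        # into every test file; the check runs automatically via addCleanup().
        before = set(threading.enumerate())

        def check():
            leaked = set(threading.enumerate()) - before
            if leaked:
                testcase.fail("dangling threads: %r"
                              % sorted(t.name for t in leaked))

        testcase.addCleanup(check)

    class ExampleTest(unittest.TestCase):
        def setUp(self):
            track_dangling_threads(self)

        def test_ok(self):
            self.assertEqual(1 + 1, 2)
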
Victor

From solipsis at pitrou.net  Mon Sep 18 06:55:50 2017
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 18 Sep 2017 12:55:50 +0200
Subject: [Python-ideas] Move some regrtest or test.support features into unittest?
References: <20170917223143.4b4c2949@fsol>
Message-ID: <20170918125550.4b805b33@fsol>

On Mon, 18 Sep 2017 12:16:35 +0200
Victor Stinner wrote:
> Antoine:
> >> * skip a test if it allocates too much memory, command line argument
> >> to specify how many memory a test is allowed to allocate (ex:
> >> --memlimit=2G for 2 GB of memory)
> >
> > That would be suitable for a plugin if unittest had a plugin
> > architecture, but not as a core functionality IMO.
>
> My hidden question is more why unittest doesn't already have a plugin
> system, whereas there are tons on projects based on unittest extending
> its features like pytest, nose, testtools, etc.

Michael Foord or Robert Collins may be able to answer that question :-)
Though I suspect the answer has mainly to do with lack of time and the
hurdles of backwards compatibility.

Regards

Antoine.

From storchaka at gmail.com  Mon Sep 18 07:31:14 2017
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Mon, 18 Sep 2017 14:31:14 +0300
Subject: [Python-ideas] Move some regrtest or test.support features into unittest?
In-Reply-To:
References:
Message-ID:

13.09.17 16:42, Victor Stinner wrote:
> * skip a test if it allocates too much memory, command line argument
> to specify how many memory a test is allowed to allocate (ex:
> --memlimit=2G for 2 GB of memory)

Instead of just making checks before running some tests, it would be
worth limiting the memory usage hard (by setting ulimit or its analogs
on other platforms). The purpose of this option is to prevent swapping,
which makes tests just hang for hours.

> * concept of "resource" like "network" (connect to external network
> servers, to the Internet), "cpu" (CPU intensive tests), etc. Tests are
> skipped by default and enabled by the -u command line option (ex: "-u
> cpu).

The problem is what to include in "all". The set of resources is
application specific. Some of the resources used in CPython tests make
sense only for a single test.

> * --timeout: watchdog killing the test if the run time exceed the
> timeout in seconds (use faulthandler.dump_traceback_later)

This feature looks functionally similar to limiting memory usage.

> * --match, --matchfile, -x: filter tests

The discovery feature of unittest looks similar.

> * ... : regrtest has many many features

Many of them contain a bunch of engineering tricks and evolve quickly.
Regrtest now is not the regrtest of two years ago, and I'm sure that two
years later it will again differ a lot from the current one. Unittest
should be more stable.

From victor.stinner at gmail.com  Tue Sep 19 09:08:18 2017
From: victor.stinner at gmail.com (Victor Stinner)
Date: Tue, 19 Sep 2017 15:08:18 +0200
Subject: [Python-ideas] Move some regrtest or test.support features into unittest?
In-Reply-To:
References:
Message-ID:

>> * --timeout: watchdog killing the test if the run time exceed the
>> timeout in seconds (use faulthandler.dump_traceback_later)
>
> This feature looks functionally similar to limiting memory usage.

Hum, I don't think so. Limiting the memory usage doesn't catch deadlocks
for example.
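For reference, the kind of watchdog that --timeout relies on boils down to a sketch like this (run_tests() is just a placeholder here):

    import faulthandler

    def run_tests():
        # placeholder for the real test run
        pass

    # Dump the traceback of every thread and exit if the process is still
    # running after `timeout` seconds -- this catches hangs and deadlocks,
    # which a memory limit would never notice.
    faulthandler.dump_traceback_later(timeout=60, exit=True)
    try:
        run_tests()
    finally:
        faulthandler.cancel_dump_traceback_later()
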
Victor From barry at barrys-emacs.org Tue Sep 19 14:33:25 2017 From: barry at barrys-emacs.org (Barry Scott) Date: Tue, 19 Sep 2017 19:33:25 +0100 Subject: [Python-ideas] A PEP to define basical metric which allows to guarantee minimal code quality In-Reply-To: References: <4c594053-c431-4b29-a8ac-d5c422e6a06f@googlegroups.com> <20170915101954.GF13110@ando.pearwood.info> Message-ID: > On 17 Sep 2017, at 19:17, alexandre.galode at gmail.com wrote: > > Hi, > > thanks for your answer, and your questions. Very good draw. I printed it in my office ^^ > > First i'd like to precise that, as in title, aim is not to gurantee quality but minimal quality. I think it's really two different things. > > About metrics, my ideas was about following (for the moment): > Basical Python Rules respect: PEP8 & PEP20 respect > Docstring/documentation respect: PEP257 respect > Code readability: percentage of commentary line and percentage of blank line > Code maintainability / complexity: the facility to re-read code if old code, or to understand code for an external developer. If not comprehensive, for example, i use McCabe in my work > Code coverage: by unit tests > From your question on objective metrics, i don't think that reliable metrics exists. We can only verify that minimal quality can be reached. As you say, it's a subjective apprehension, but in my mind, this "PEP" could be a guideline to improve development for some developers. Quality is something that an organisation and its people need to achieve by building appropriate processes and improvement methods into their work flow. Trying to be prescriptive will run into trouble for the wider world I suspect. Many of the maintainability metrics may help a team. However peer review and discussion within teams is a powerful process to achieve good code, which is process. I do not see quality as a quantity that can be easily measured. How can we set a minimum for the hard to measure? Barry p.s. Why does this thread have a reply address of python-ideas at googlegroups.com ? > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From thibault.hilaire at lip6.fr Wed Sep 20 04:35:13 2017 From: thibault.hilaire at lip6.fr (Thibault Hilaire) Date: Wed, 20 Sep 2017 10:35:13 +0200 Subject: [Python-ideas] Hexadecimal floating literals In-Reply-To: <20170914155707.GE13110@ando.pearwood.info> References: <20170912014844.GV13110@ando.pearwood.info> <20170912112851.GX13110@ando.pearwood.info> <9A3E54B3-BD0E-4CFF-B53E-5C411AB36775@lip6.fr> <20170914155707.GE13110@ando.pearwood.info> Message-ID: <703B80FA-D17A-4E76-8D02-085D40B7D20E@lip6.fr> Hi everyone >> Of course, for a lost of numbers, the decimal representation is simpler, and just as accurate as the radix-2 hexadecimal representation. >> But, due to the radix-10 and radix-2 used in the two representations, the radix-2 may be much easier to use. > > Hex is radix 16, not radix 2 (binary). Of course, Hex is radix-16! I was talking about radix-2 because all the exactness problem comes when converting binary and decimal, and Hex can be seen as an (exact) compact way to express binary, and that's what we want (build literal floats in the same exact way they are stored internally, and export them exactly in a compact way). 
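A quick illustration of that round-trip with the existing methods:

    >>> (0.1).hex()    # the exact stored double, written compactly
    '0x1.999999999999ap-4'
    >>> float.fromhex('0x1.999999999999ap-4') == 0.1
    True
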
> >> In the "Handbook of Floating-Point Arithmetic" (JM Muller et al, Birkhauser editor, page 40),the authors claims that the largest exact decimal representation of a double-precision floating-point requires 767 digits !! >> So it is not always few characters to type to be just as accurate !! >> For example (this is the largest exact decimal representation of a single-precision 32-bit float): >>> 1.17549421069244107548702944484928734882705242874589333385717453057158887047561890426550235133618116378784179687e-38 >> and >>> 0x1.fffffc0000000p-127 >> are exactly the same number (one in decimal representation, the other in radix-2 hexadecimal)! > > That may be so, but that doesn't mean you have to type all 100+ digits > in order to reproduce the float exactly. Just 1.1754942106924411e-38 is > sufficient: > > py> 1.1754942106924411e-38 == float.fromhex('0x1.fffffc0000000p-127') > True > > You may be mistaking two different questions: > > (1) How many decimal digits are needed to exactly convert the float to > decimal? That can be over 100 for a C single, and over 700 for a double. > > (2) How many decimal digits are needed to uniquely represent the float? > Nine digits (plus an exponent) is enough to represent all possible C > singles; 17 digits is enough to represent all doubles (Python floats). You're absolutely right, 1.1754942106924411e-38 is enough to *reproduce* the float exactly, BUT it is still different to 0x1.fffffc0000000p-127 (or it's 112-digits decimal representation). Because 1.1754942106924411e-38 is rounded at compile-time to 0x1.fffffc0000000p-127 (so exactly to 1.17549421069244107548702944484928734882705242874589333385717453057158887047561890426550235133618116378784179687e-38 in decimal). So 17 digits are enough to reach each double, after the compile-time quantization. But "explicit is better than implicit", as someone says ;-), so I prefer, in some particular occasions, to explicitly express the floating-point number I want (like 0x1.fffffc0000000p-127), rather than hoping the quantization of my decimal number (1.1754942106924411e-38) will produce the right floating-point (0x1.fffffc0000000p-127) And that's one of the reasons why the hexadecimal floating-point representation exist: - as a way to *exactly* export floating-point numbers without any doubt (so give a compact form of if binary intern representation) - as a way to *explicitly* and *exactly* specify some floating-point values in your code, directly in the way they are store internally (in a compact way, because binary is too long) > I'm not actually opposed to hex float literals. I think they're cool. > But we ought to have a reason more than just "they're cool" for > supporting them, and I'm having trouble thinking of any apart from "C > supports them, so should we". But maybe that's enough. To sum up: - In some specific context, hexadecimal floating-point constants make it easy for the programmers to reproduce the exact value. Typically, a software engineer who is concerned about floating-point accuracy would prepare hexadecimal floating-point constants for use in a program by generating them with special software (e.g., Maple, Mathematica, Sage or some multi-precision library). These hexadecimal literals have been added to C (since C99), Java, Lua, Ruby, Perl (since v5.22), etc. for the same reasons. 
- The exact grammar has been fully documented in the IEEE-754-2008 norm (section 5.12.13), and also in C99 (or C++17 and others) - Of course, hexadecimal floating-point can be manipulated with float.hex() and float.fromhex(), *but* it works from strings, and the translation is done at execution-time... I hope this can be seen as a sufficient reason to support hexadecimal floating literals. Thibault From jhihn at gmx.com Wed Sep 20 10:38:42 2017 From: jhihn at gmx.com (Jason H) Date: Wed, 20 Sep 2017 16:38:42 +0200 Subject: [Python-ideas] A PEP to define basical metric which allows to guarantee minimal code quality In-Reply-To: References: <4c594053-c431-4b29-a8ac-d5c422e6a06f@googlegroups.com> <20170915101954.GF13110@ando.pearwood.info> Message-ID: > Quality is something that an organisation and its people need to achieve by building appropriate processes and improvement methods into their work flow. > Trying to be prescriptive will run into trouble for the wider world I suspect. > > Many of the maintainability metrics may help a team. > ? > However peer review and discussion within teams is a powerful process to achieve good code, which is process. >? > I do not see quality as a quantity that can be easily measured. > How can we set a minimum for the hard to measure? Since the proposal first came out, I've been wondering about it's implications. I do think that there is some virtue to having it, but it quickly gets messy. In terms of quality as a quantity, the minimal would be to break on a linting error. JSHint has numbered issues: /* jshint -W034 */ Having a decorator of @pylint-034 could tell PyLint or python proper to refuse compilation/execution if the standard is not met. So would we limit or list the various standards that apply to the code? Do we limit it on a per-function or per file, or even per-line basis? How do we handle different organizational requirements? @pylint([34]) @pep([8,20]) def f(a): return math.sqrt(a) The other aspect that comes into code quality is unit tests. A decorator of what test functions need to be run on a function (and pass) would also be useful: def test_f_arg_negative(f): try: return f(-1) == 1 except(e): return False# ValueError: math domain error @tests([test_f_arg_positive, test_f_arg_negative, test_f_arg_zero, f_test_f_arg_int, test_f_arg_float]) def f(a): return math.sqrt(math.abs(a)) In a test-driven world, the test functions would come first, but this is rarely the case. Possible results are not just test failure, but also that a test does not yet exist. I think it would be great to know what tests the function is supposed to pass. The person coding the function has the best idea of what inputs it is expected to handle and side-effects it should cause. Hunting through a test kit for the function is usually tedious. This I think would vastly improve it while taking steps to describe and enforce 'quality' Just my $0.02 From chris.barker at noaa.gov Wed Sep 20 11:34:34 2017 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Wed, 20 Sep 2017 08:34:34 -0700 Subject: [Python-ideas] A PEP to define basical metric which allows to guarantee minimal code quality In-Reply-To: References: <4c594053-c431-4b29-a8ac-d5c422e6a06f@googlegroups.com> <20170915101954.GF13110@ando.pearwood.info> Message-ID: <-1213469140971527789@unknownmsgid> > How do we handle different organizational requirements? > By keeping linting out of the code ( and certainly out of "official" python), and in the organization's development process where it belongs. 
> @pylint([34]) > @pep([8,20]) > def f(a): > return math.sqrt(a) Yeach! But that's just my opinion -- if you really like this idea, you can certainly implement it in an external package. No need for a PEP or anything of the sort. > The other aspect that comes into code quality is unit tests. A decorator of what test functions need to be run on a function (and pass) would also be useful: > > def test_f_arg_negative(f): > try: > return f(-1) == 1 > except(e): > return False# ValueError: math domain error > > @tests([test_f_arg_positive, test_f_arg_negative, test_f_arg_zero, f_test_f_arg_int, test_f_arg_float]) > def f(a): > return math.sqrt(math.abs(a)) Again, I don't think it belongs there, but I do see your point. If you like this idea--implement it, put it on PyPi, and see if anyone else likes if as well. -CHB From jhihn at gmx.com Wed Sep 20 14:23:31 2017 From: jhihn at gmx.com (Jason H) Date: Wed, 20 Sep 2017 20:23:31 +0200 Subject: [Python-ideas] A PEP to define basical metric which allows to guarantee minimal code quality In-Reply-To: <-1213469140971527789@unknownmsgid> References: <4c594053-c431-4b29-a8ac-d5c422e6a06f@googlegroups.com> <20170915101954.GF13110@ando.pearwood.info> <-1213469140971527789@unknownmsgid> Message-ID: > > How do we handle different organizational requirements? > > > By keeping linting out of the code ( and certainly out of "official" > python), and in the organization's development process where it > belongs. > > > @pylint([34]) > > @pep([8,20]) > > def f(a): > > return math.sqrt(a) > > Yeach! But that's just my opinion -- if you really like this idea, you > can certainly implement it in an external package. No need for a PEP > or anything of the sort. > > > The other aspect that comes into code quality is unit tests. A decorator of what test functions need to be run on a function (and pass) would also be useful: > > > > def test_f_arg_negative(f): > > try: > > return f(-1) == 1 > > except(e): > > return False# ValueError: math domain error > > > > @tests([test_f_arg_positive, test_f_arg_negative, test_f_arg_zero, f_test_f_arg_int, test_f_arg_float]) > > def f(a): > > return math.sqrt(math.abs(a)) > > Again, I don't think it belongs there, but I do see your point. If you > like this idea--implement it, put it on PyPi, and see if anyone else > likes if as well. Not my idea, but the question was raised as to what could a 'quality guarantee' or 'quality' even mean. I was just throwing out examples for discussion. I did not intend to make you vomit. I think in an abstract sense it's a good idea, but in my own head I would expect that all code to be written to the highest standard from the start. I have some nascent ideas, but they are not even worth mentioning yet, and I don't even know how they'd fit in any known language. From alexandre.galode at gmail.com Wed Sep 20 15:02:29 2017 From: alexandre.galode at gmail.com (alexandre.galode at gmail.com) Date: Wed, 20 Sep 2017 12:02:29 -0700 (PDT) Subject: [Python-ideas] A PEP to define basical metric which allows to guarantee minimal code quality In-Reply-To: References: <4c594053-c431-4b29-a8ac-d5c422e6a06f@googlegroups.com> <20170915101954.GF13110@ando.pearwood.info> <-1213469140971527789@unknownmsgid> Message-ID: Hi, Thanks to everyone for your contribution to my proposal. @Barry, I agree with you that each organisation needs to integrate the "quality process" in their workflow. But before integrate it, this "quality" process need to be define, at minimal, i think. 
I don't think too this is easy to measure, in integrity, but we can define minimal metric, easy to get from code, as the ones defines previously. About your question for googlegroups, i don't know if it's bond, but i use google group to participate to this mailing list. @Jason, thanks for your example. When i discussed from this proposal with other devs of my team, they needed too example to have better idea of use. But i think, as wee need to avoid to talk about any tool name in the PEP, we need to avoid to give a code example. The aim of this proposal is to have a guideline on minimal metrics to have minimal quality. As you talked about, i ask to every devs of my team to respect the higher standard as possible. This, and the permanent request of my customers for the highest dev quality as possible, is the reason which explain this proposal. -------------- next part -------------- An HTML attachment was scrubbed... URL: From barry at barrys-emacs.org Wed Sep 20 15:39:34 2017 From: barry at barrys-emacs.org (Barry Scott) Date: Wed, 20 Sep 2017 20:39:34 +0100 Subject: [Python-ideas] A PEP to define basical metric which allows to guarantee minimal code quality In-Reply-To: References: <4c594053-c431-4b29-a8ac-d5c422e6a06f@googlegroups.com> <20170915101954.GF13110@ando.pearwood.info> <-1213469140971527789@unknownmsgid> Message-ID: > On 20 Sep 2017, at 20:02, alexandre.galode at gmail.com wrote: > > Hi, > > Thanks to everyone for your contribution to my proposal. > > @Barry, I agree with you that each organisation needs to integrate the "quality process" in their workflow. But before integrate it, this "quality" process need to be define, at minimal, i think. I don't think too this is easy to measure, in integrity, but we can define minimal metric, easy to get from code, as the ones defines previously. About your question for googlegroups, i don't know if it's bond, but i use google group to participate to this mailing list. The organisations work flow *is* its quality process, for better or worse. You can define metrics. But as to what they mean? Well that is the question. I was once told that you should measure a new metric for 2 years before you attempting to use it to change your processes. Oh and what is the most important thing for a piece of work? I'd claim its the requirements. Get them wrong and nothing else matters. Oh and thank you for raising a very important topic. Barry > > @Jason, thanks for your example. When i discussed from this proposal with other devs of my team, they needed too example to have better idea of use. But i think, as wee need to avoid to talk about any tool name in the PEP, we need to avoid to give a code example. The aim of this proposal is to have a guideline on minimal metrics to have minimal quality. As you talked about, i ask to every devs of my team to respect the higher standard as possible. This, and the permanent request of my customers for the highest dev quality as possible, is the reason which explain this proposal. 
> _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From chris.barker at noaa.gov Wed Sep 20 20:37:11 2017 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Wed, 20 Sep 2017 17:37:11 -0700 Subject: [Python-ideas] A PEP to define basical metric which allows to guarantee minimal code quality In-Reply-To: References: <4c594053-c431-4b29-a8ac-d5c422e6a06f@googlegroups.com> <20170915101954.GF13110@ando.pearwood.info> <-1213469140971527789@unknownmsgid> Message-ID: <-9088590704105947642@unknownmsgid> > You can define metrics. But as to what they mean? Well that is the question. One big problem with metrics is that we tend to measure what we know how to measure -- generally really not the most useful metric... As for some kind of PEP or PEP-like document: I think we'd have to see a draft before we have any idea as to whether it's a useful document -- if some of the folks on this thread are inspired -- start writing! I, however, am skeptical because of two things: 1) most code-quality measures, processes, etc are not language specific -- so don't really belong in a PEP So why PEP 8? -- because the general quality metric is: " the source code confirms to a consistent style" -- there is nothing language specific about that. But we need to actually define that style for a given project or organization -- hence PEP 8. 2) while you can probably get folks to agree on general guidelines-- tests are good! -- I think you'll get buried in bike shedding of the details. And it will likely go beyond just the color of the bike shed. Again -- what about PEP8? Plenty of bike shedding opportunities there. But there was a need to reach consensus on SOMETHING-- we needed a style guide for the standard lib. I'm not that's the case with other "quality" metrics. But go ahead and prove me wrong! -CHB > > I was once told that you should measure a new metric for 2 years before you attempting to use it to change your processes. > > Oh and what is the most important thing for a piece of work? > > I'd claim its the requirements. Get them wrong and nothing else matters. > > Oh and thank you for raising a very important topic. > > Barry > > >> >> @Jason, thanks for your example. When i discussed from this proposal with other devs of my team, they needed too example to have better idea of use. But i think, as wee need to avoid to talk about any tool name in the PEP, we need to avoid to give a code example. The aim of this proposal is to have a guideline on minimal metrics to have minimal quality. As you talked about, i ask to every devs of my team to respect the higher standard as possible. This, and the permanent request of my customers for the highest dev quality as possible, is the reason which explain this proposal. 
>> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > From chris.barker at noaa.gov Wed Sep 20 20:44:03 2017 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Wed, 20 Sep 2017 17:44:03 -0700 Subject: [Python-ideas] Hexadecimal floating literals In-Reply-To: <703B80FA-D17A-4E76-8D02-085D40B7D20E@lip6.fr> References: <20170912014844.GV13110@ando.pearwood.info> <20170912112851.GX13110@ando.pearwood.info> <9A3E54B3-BD0E-4CFF-B53E-5C411AB36775@lip6.fr> <20170914155707.GE13110@ando.pearwood.info> <703B80FA-D17A-4E76-8D02-085D40B7D20E@lip6.fr> Message-ID: <5758742674925253881@unknownmsgid> > And that's one of the reasons why the hexadecimal floating-point representation exist: I suspect no one here thinks floathex representation is unimportant... > > To sum up: > - In some specific context, hexadecimal floating-point constants make it easy for the programmers to reproduce the exact value. Typically, a software engineer who is concerned about floating-point accuracy would prepare hexadecimal floating-point constants for use in a program by generating them with special software (e.g., Maple, Mathematica, Sage or some multi-precision library). These hexadecimal literals have been added to C (since C99), Java, Lua, Ruby, Perl (since v5.22), etc. for the same reasons. > - The exact grammar has been fully documented in the IEEE-754-2008 norm (section 5.12.13), and also in C99 (or C++17 and others) > - Of course, hexadecimal floating-point can be manipulated with float.hex() and float.fromhex(), *but* it works from strings, and the translation is done at execution-time... Right. But it addresses all of the points you make. The functionality is there. Making a new literal will buy a slight improvement in writability and performance. Is that worth much in a dynamic language like python? -CHB From ned at nedbatchelder.com Wed Sep 20 20:51:40 2017 From: ned at nedbatchelder.com (Ned Batchelder) Date: Wed, 20 Sep 2017 20:51:40 -0400 Subject: [Python-ideas] A PEP to define basical metric which allows to guarantee minimal code quality In-Reply-To: <-9088590704105947642@unknownmsgid> References: <4c594053-c431-4b29-a8ac-d5c422e6a06f@googlegroups.com> <20170915101954.GF13110@ando.pearwood.info> <-1213469140971527789@unknownmsgid> <-9088590704105947642@unknownmsgid> Message-ID: <39f9c8bc-09d1-965b-5a30-bd547785226a@nedbatchelder.com> On 9/20/17 8:37 PM, Chris Barker - NOAA Federal wrote: >> You can define metrics. But as to what they mean? Well that is the question. > One big problem with metrics is that we tend to measure what we know > how to measure -- generally really not the most useful metric... > > As for some kind of PEP or PEP-like document: > > I think we'd have to see a draft before we have any idea as to whether > it's a useful document -- if some of the folks on this thread are > inspired -- start writing! > > I, however, am skeptical because of two things: > > 1) most code-quality measures, processes, etc are not language > specific -- so don't really belong in a PEP > > So why PEP 8? 
-- because the general quality metric is: " the source > code confirms to a consistent style" -- there is nothing language > specific about that. But we need to actually define that style for a > given project or organization -- hence PEP 8. > > 2) while you can probably get folks to agree on general guidelines-- > tests are good! -- I think you'll get buried in bike shedding of the > details. And it will likely go beyond just the color of the bike shed. > > Again -- what about PEP8? Plenty of bike shedding opportunities there. > But there was a need to reach consensus on SOMETHING-- we needed a > style guide for the standard lib. I'm not that's the case with other > "quality" metrics. I don't see the need for a PEP here.? PEPs are written where we need the core committers to agree on something (how to change Python, how to style stdlib code, etc), or where we need multiple implementations to agree on something (what does "python" mean, how to change Python, etc). There's no need for the core committers or multiple implementations of Python to agree on quality metrics.? There's no need for this to be a PEP.? Write a document that proposes some quality metrics. Share it around. Get people to like it. If it becomes popular, then people will start to value it as a standard for project quality. It doesn't need to be a PEP. --Ned. > But go ahead and prove me wrong! > > -CHB > > > >> I was once told that you should measure a new metric for 2 years before you attempting to use it to change your processes. >> >> Oh and what is the most important thing for a piece of work? >> >> I'd claim its the requirements. Get them wrong and nothing else matters. >> >> Oh and thank you for raising a very important topic. >> >> Barry >> >> >>> @Jason, thanks for your example. When i discussed from this proposal with other devs of my team, they needed too example to have better idea of use. But i think, as wee need to avoid to talk about any tool name in the PEP, we need to avoid to give a code example. The aim of this proposal is to have a guideline on minimal metrics to have minimal quality. As you talked about, i ask to every devs of my team to respect the higher standard as possible. This, and the permanent request of my customers for the highest dev quality as possible, is the reason which explain this proposal. 
>>> _______________________________________________ >>> Python-ideas mailing list >>> Python-ideas at python.org >>> https://mail.python.org/mailman/listinfo/python-ideas >>> Code of Conduct: http://python.org/psf/codeofconduct/ >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From ncoghlan at gmail.com Wed Sep 20 21:13:44 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 21 Sep 2017 11:13:44 +1000 Subject: [Python-ideas] Hexadecimal floating literals In-Reply-To: <5758742674925253881@unknownmsgid> References: <20170912014844.GV13110@ando.pearwood.info> <20170912112851.GX13110@ando.pearwood.info> <9A3E54B3-BD0E-4CFF-B53E-5C411AB36775@lip6.fr> <20170914155707.GE13110@ando.pearwood.info> <703B80FA-D17A-4E76-8D02-085D40B7D20E@lip6.fr> <5758742674925253881@unknownmsgid> Message-ID: On 21 September 2017 at 10:44, Chris Barker - NOAA Federal wrote: [Thibault] >> To sum up: >> - In some specific context, hexadecimal floating-point constants make it easy for the programmers to reproduce the exact value. Typically, a software engineer who is concerned about floating-point accuracy would prepare hexadecimal floating-point constants for use in a program by generating them with special software (e.g., Maple, Mathematica, Sage or some multi-precision library). These hexadecimal literals have been added to C (since C99), Java, Lua, Ruby, Perl (since v5.22), etc. for the same reasons. >> - The exact grammar has been fully documented in the IEEE-754-2008 norm (section 5.12.13), and also in C99 (or C++17 and others) >> - Of course, hexadecimal floating-point can be manipulated with float.hex() and float.fromhex(), *but* it works from strings, and the translation is done at execution-time... > > Right. But it addresses all of the points you make. The functionality > is there. Making a new literal will buy a slight improvement in > writability and performance. > > Is that worth much in a dynamic language like python? I think so, as consider this question: how do you write a script that accepts a user-supplied string (e.g. from a CSV file) and treats it as hex floating point if it has the 0x prefix, and decimal floating point otherwise? You can't just blindly apply float.fromhex(), as that will also treat unprefixed strings as hexadecimal: >>> float.fromhex("0x10") 16.0 >>> float.fromhex("10") 16.0 So you need to do the try/except dance with ValueError instead: try: float_data = float(text) except ValueError: float_values = float.fromhex(text) At which point you may wonder why you can't just write "float_data = float(text, base=0)" the way you can for integers: >>> int("10", base=0) 10 >>> int("0x10", base=0) 16 And if the float() builtin were to gain a "base" parameter, then it's only a short step from there to allow at least the "0x" prefix on literals, and potentially even "0b" and "0o" as well. 
So I'm personally +0 on the idea - it would improve interface consistency between integers and floating point values, and make it easier to write correctness tests for IEEE754 floating point hardware and algorithms in Python (where your input & output test vectors are going to use binary or hex representations, not decimal). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Wed Sep 20 21:32:26 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 21 Sep 2017 11:32:26 +1000 Subject: [Python-ideas] A PEP to define basical metric which allows to guarantee minimal code quality In-Reply-To: <39f9c8bc-09d1-965b-5a30-bd547785226a@nedbatchelder.com> References: <4c594053-c431-4b29-a8ac-d5c422e6a06f@googlegroups.com> <20170915101954.GF13110@ando.pearwood.info> <-1213469140971527789@unknownmsgid> <-9088590704105947642@unknownmsgid> <39f9c8bc-09d1-965b-5a30-bd547785226a@nedbatchelder.com> Message-ID: On 21 September 2017 at 10:51, Ned Batchelder wrote: > Write a document that proposes some quality metrics. Share it around. Get > people to like it. If it becomes popular, then people will start to value it > as a standard for project quality. And explore the academic literature for research on quality measures that are actually predictors of real world benefits (e.g. readability, maintainability, correctness, affordability). There doesn't seem to be all that much research out there, although I did find https://link.springer.com/chapter/10.1007/978-3-642-12165-4_24 from several years ago, as well as http://ieeexplore.ieee.org/document/7809284/ from last year (both are examples of pay-to-play science though, so not particularly useful to open source practitioners). Regardless, Ned's point still stands: the PEP process only applies to situations where the CPython core developers (or a closely associated group like the Python Packaging Authority) are the relevant global authorities on a topic. Even PEP 7 and PEP 8 are technically only the style guides for the CPython reference implementation - folks just borrow them as the baseline style guides for their own Python projects. "Which characteristics of Python code are useful predictors of the ability to deliver software projects to specification on time and within budget?" (the most pragmatic definition of "software quality") is *not* one of those areas - for that, you'd be more looking towards groups like IEEE (Institute of Electrical & Electronics Engineers) and ACM (Association for Computing Machinery), who study that kind of thing across multiple languages and language communities, and try to put some empirical weight behind their findings, rather than relying primarily on instinct and experience (which is the way we tend to do things in the open source community, since it's more fun, and less effort). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From guido at python.org Wed Sep 20 22:04:11 2017 From: guido at python.org (Guido van Rossum) Date: Wed, 20 Sep 2017 19:04:11 -0700 Subject: [Python-ideas] Hexadecimal floating literals In-Reply-To: References: <20170912014844.GV13110@ando.pearwood.info> <20170912112851.GX13110@ando.pearwood.info> <9A3E54B3-BD0E-4CFF-B53E-5C411AB36775@lip6.fr> <20170914155707.GE13110@ando.pearwood.info> <703B80FA-D17A-4E76-8D02-085D40B7D20E@lip6.fr> <5758742674925253881@unknownmsgid> Message-ID: Yeah, I agree, +0. It won't confuse anyone who doesn't care about it and those who need it will benefit. 
On Wed, Sep 20, 2017 at 6:13 PM, Nick Coghlan wrote: > On 21 September 2017 at 10:44, Chris Barker - NOAA Federal > wrote: > [Thibault] > >> To sum up: > >> - In some specific context, hexadecimal floating-point constants make > it easy for the programmers to reproduce the exact value. Typically, a > software engineer who is concerned about floating-point accuracy would > prepare hexadecimal floating-point constants for use in a program by > generating them with special software (e.g., Maple, Mathematica, Sage or > some multi-precision library). These hexadecimal literals have been added > to C (since C99), Java, Lua, Ruby, Perl (since v5.22), etc. for the same > reasons. > >> - The exact grammar has been fully documented in the IEEE-754-2008 norm > (section 5.12.13), and also in C99 (or C++17 and others) > >> - Of course, hexadecimal floating-point can be manipulated with > float.hex() and float.fromhex(), *but* it works from strings, and the > translation is done at execution-time... > > > > Right. But it addresses all of the points you make. The functionality > > is there. Making a new literal will buy a slight improvement in > > writability and performance. > > > > Is that worth much in a dynamic language like python? > > I think so, as consider this question: how do you write a script that > accepts a user-supplied string (e.g. from a CSV file) and treats it as > hex floating point if it has the 0x prefix, and decimal floating point > otherwise? > > You can't just blindly apply float.fromhex(), as that will also treat > unprefixed strings as hexadecimal: > > >>> float.fromhex("0x10") > 16.0 > >>> float.fromhex("10") > 16.0 > > So you need to do the try/except dance with ValueError instead: > > try: > float_data = float(text) > except ValueError: > float_values = float.fromhex(text) > > At which point you may wonder why you can't just write "float_data = > float(text, base=0)" the way you can for integers: > > >>> int("10", base=0) > 10 > >>> int("0x10", base=0) > 16 > > And if the float() builtin were to gain a "base" parameter, then it's > only a short step from there to allow at least the "0x" prefix on > literals, and potentially even "0b" and "0o" as well. > > So I'm personally +0 on the idea - it would improve interface > consistency between integers and floating point values, and make it > easier to write correctness tests for IEEE754 floating point hardware > and algorithms in Python (where your input & output test vectors are > going to use binary or hex representations, not decimal). > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From steve at pearwood.info Wed Sep 20 21:53:45 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 21 Sep 2017 11:53:45 +1000 Subject: [Python-ideas] Hexadecimal floating literals In-Reply-To: References: <20170912014844.GV13110@ando.pearwood.info> <20170912112851.GX13110@ando.pearwood.info> <9A3E54B3-BD0E-4CFF-B53E-5C411AB36775@lip6.fr> <20170914155707.GE13110@ando.pearwood.info> <703B80FA-D17A-4E76-8D02-085D40B7D20E@lip6.fr> <5758742674925253881@unknownmsgid> Message-ID: <20170921015345.GN13110@ando.pearwood.info> On Thu, Sep 21, 2017 at 11:13:44AM +1000, Nick Coghlan wrote: > I think so, as consider this question: how do you write a script that > accepts a user-supplied string (e.g. from a CSV file) and treats it as > hex floating point if it has the 0x prefix, and decimal floating point > otherwise? float.fromhex(s) if s.startswith('0x') else float(s) [...] > And if the float() builtin were to gain a "base" parameter, then it's > only a short step from there to allow at least the "0x" prefix on > literals, and potentially even "0b" and "0o" as well. > > So I'm personally +0 on the idea I agree with your arguments. I just wish I could think of a good reason to make it +1 instead of a luke-warm +0. -- Steve From mal at egenix.com Thu Sep 21 03:47:56 2017 From: mal at egenix.com (M.-A. Lemburg) Date: Thu, 21 Sep 2017 09:47:56 +0200 Subject: [Python-ideas] A PEP to define basical metric which allows to guarantee minimal code quality In-Reply-To: References: <4c594053-c431-4b29-a8ac-d5c422e6a06f@googlegroups.com> <20170915101954.GF13110@ando.pearwood.info> <-1213469140971527789@unknownmsgid> <-9088590704105947642@unknownmsgid> <39f9c8bc-09d1-965b-5a30-bd547785226a@nedbatchelder.com> Message-ID: On 21.09.2017 03:32, Nick Coghlan wrote: > On 21 September 2017 at 10:51, Ned Batchelder wrote: >> Write a document that proposes some quality metrics. Share it around. Get >> people to like it. If it becomes popular, then people will start to value it >> as a standard for project quality. > > And explore the academic literature for research on quality measures > that are actually predictors of real world benefits (e.g. readability, > maintainability, correctness, affordability). > > There doesn't seem to be all that much research out there, although I > did find https://link.springer.com/chapter/10.1007/978-3-642-12165-4_24 > from several years ago, as well as > http://ieeexplore.ieee.org/document/7809284/ from last year (both are > examples of pay-to-play science though, so not particularly useful to > open source practitioners). On the topic, you might want to have a look at a talk I held at EuroPython 2016 on valuation of a code base (in the context of valuation of a Python company): https://downloads.egenix.com/python/EuroPython-2016-Python-Startup-Valuation.pdf (1596930 bytes) Video: https://www.youtube.com/watch?v=nIoE3KJxK6U There are a few things you can do with metrics to figure out how good a code base is. Of course, in the end you always have to do a code review, but the metrics are a good indicator of where to start looking for possible issues. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Experts (#1, Sep 21 2017) >>> Python Projects, Coaching and Consulting ... http://www.egenix.com/ >>> Python Database Interfaces ... http://products.egenix.com/ >>> Plone/Zope Database Interfaces ... 
http://zope.egenix.com/ ________________________________________________________________________ ::: We implement business ideas - efficiently in both time and costs ::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ http://www.malemburg.com/ > Regardless, Ned's point still stands: the PEP process only applies to > situations where the CPython core developers (or a closely associated > group like the Python Packaging Authority) are the relevant global > authorities on a topic. Even PEP 7 and PEP 8 are technically only the > style guides for the CPython reference implementation - folks just > borrow them as the baseline style guides for their own Python > projects. > > "Which characteristics of Python code are useful predictors of the > ability to deliver software projects to specification on time and > within budget?" (the most pragmatic definition of "software quality") > is *not* one of those areas - for that, you'd be more looking towards > groups like IEEE (Institute of Electrical & Electronics Engineers) and > ACM (Association for Computing Machinery), who study that kind of > thing across multiple languages and language communities, and try to > put some empirical weight behind their findings, rather than relying > primarily on instinct and experience (which is the way we tend to do > things in the open source community, since it's more fun, and less > effort). > > Cheers, > Nick. > -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Experts (#1, Sep 21 2017) >>> Python Projects, Coaching and Consulting ... http://www.egenix.com/ >>> Python Database Interfaces ... http://products.egenix.com/ >>> Plone/Zope Database Interfaces ... http://zope.egenix.com/ ________________________________________________________________________ ::: We implement business ideas - efficiently in both time and costs ::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ http://www.malemburg.com/ From p.f.moore at gmail.com Thu Sep 21 03:57:28 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 21 Sep 2017 08:57:28 +0100 Subject: [Python-ideas] Hexadecimal floating literals In-Reply-To: <20170921015345.GN13110@ando.pearwood.info> References: <20170912014844.GV13110@ando.pearwood.info> <20170912112851.GX13110@ando.pearwood.info> <9A3E54B3-BD0E-4CFF-B53E-5C411AB36775@lip6.fr> <20170914155707.GE13110@ando.pearwood.info> <703B80FA-D17A-4E76-8D02-085D40B7D20E@lip6.fr> <5758742674925253881@unknownmsgid> <20170921015345.GN13110@ando.pearwood.info> Message-ID: On 21 September 2017 at 02:53, Steven D'Aprano wrote: > On Thu, Sep 21, 2017 at 11:13:44AM +1000, Nick Coghlan wrote: > >> I think so, as consider this question: how do you write a script that >> accepts a user-supplied string (e.g. from a CSV file) and treats it as >> hex floating point if it has the 0x prefix, and decimal floating point >> otherwise? > > float.fromhex(s) if s.startswith('0x') else float(s) > > [...] >> And if the float() builtin were to gain a "base" parameter, then it's >> only a short step from there to allow at least the "0x" prefix on >> literals, and potentially even "0b" and "0o" as well. >> >> So I'm personally +0 on the idea > > I agree with your arguments. 
I just wish I could think of a good reason > to make it +1 instead of a luke-warm +0. I'm also +0. I think +0 is pretty much the correct response - it's OK with me, but someone who actually needs or wants the feature will need to implement it. It's also worth remembering that there will be implementations other than CPython that will need changes, too - Jython, PyPy, possibly Cython, and many editors and IDEs. So setting the bar at "someone who wants this will have to step up and provide a patch" seems reasonable to me. Paul From pavol.lisy at gmail.com Thu Sep 21 05:17:43 2017 From: pavol.lisy at gmail.com (Pavol Lisy) Date: Thu, 21 Sep 2017 11:17:43 +0200 Subject: [Python-ideas] Fwd: A PEP to define basical metric which allows to guarantee minimal code quality In-Reply-To: References: <4c594053-c431-4b29-a8ac-d5c422e6a06f@googlegroups.com> <20170915101954.GF13110@ando.pearwood.info> <-1213469140971527789@unknownmsgid> Message-ID: On 9/20/17, alexandre.galode at gmail.com wrote: [...] > But i think, as wee need to avoid to talk about any tool name in the PEP, > we need to avoid to give a code example. The aim of this proposal is to > have a guideline on minimal metrics to have minimal quality. Michel Foucault wrote book (https://en.wikipedia.org/wiki/The_Birth_of_the_Clinic) which subtitle is "An Archaeology of Medical Perception". Imagine that code is organism and quality is health. Which archaeological era of medical perception is analogous to your proposal? ( https://en.wikipedia.org/wiki/File:Yupik_shaman_Nushagak.jpg or https://en.wikipedia.org/wiki/Blinded_experiment ?) PS. Good analogy is probably heart rate. It is very simple metric which we could use. And it seems not too problematic to propose range where it is healthy. But different kind of software is like different species. (see for example http://www.merckvetmanual.com/appendixes/reference-guides/resting-heart-rates ) And there are development stages! Embryonic bpm is NA (because no heart) and then much bigger than child's and its bigger than adult's. Another simple metric could be temperature. But individual has also different tissues! Different part of body have different temperature. (We could probably not measure hair's temperature) etc, etc... From victor.stinner at gmail.com Thu Sep 21 11:23:11 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Thu, 21 Sep 2017 17:23:11 +0200 Subject: [Python-ideas] Hexadecimal floating literals In-Reply-To: <20170921015345.GN13110@ando.pearwood.info> References: <20170912014844.GV13110@ando.pearwood.info> <20170912112851.GX13110@ando.pearwood.info> <9A3E54B3-BD0E-4CFF-B53E-5C411AB36775@lip6.fr> <20170914155707.GE13110@ando.pearwood.info> <703B80FA-D17A-4E76-8D02-085D40B7D20E@lip6.fr> <5758742674925253881@unknownmsgid> <20170921015345.GN13110@ando.pearwood.info> Message-ID: 2017-09-21 3:53 GMT+02:00 Steven D'Aprano : > float.fromhex(s) if s.startswith('0x') else float(s) My vote is now -1 on extending the Python syntax to add hexadecimal floating literals. While I was first in favor of extending the Python syntax, I changed my mind. Float constants written in hexadecimal is a (very?) rare use case, and there is already float.fromhex() available. A new syntax is something more to learn when you learn Python. Is it worth it? I don't think so. Very few people need to write hexadecimal constants in their code. For hardcore developers loving bytes, struct.unpack() is also available for your pleasure. 
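(For illustration: with only the existing API, the dispatch discussed above can be written as a tiny helper. parse_float is a made-up name, not a proposed addition; the explicit prefix check is needed precisely because float.fromhex() also accepts unprefixed hex digits.)

    def parse_float(text):
        """Treat '0x'-prefixed text as hex floating point, anything else as decimal."""
        stripped = text.strip().lower()
        if stripped.startswith(("0x", "+0x", "-0x")):
            return float.fromhex(text)
        return float(text)

    assert parse_float("0x10") == 16.0   # hex sixteen
    assert parse_float("10") == 10.0     # decimal ten, not sixteen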
Moreover, there is also slowly a trend to compute floating point numbers in decimal since it's easier to understand and debug (by humans ;-)). We already have a fast "decimal" module in Python 3. Victor From lucas.wiman at gmail.com Thu Sep 21 12:47:19 2017 From: lucas.wiman at gmail.com (Lucas Wiman) Date: Thu, 21 Sep 2017 09:47:19 -0700 Subject: [Python-ideas] Hexadecimal floating literals In-Reply-To: References: <20170912014844.GV13110@ando.pearwood.info> <20170912112851.GX13110@ando.pearwood.info> <9A3E54B3-BD0E-4CFF-B53E-5C411AB36775@lip6.fr> <20170914155707.GE13110@ando.pearwood.info> <703B80FA-D17A-4E76-8D02-085D40B7D20E@lip6.fr> <5758742674925253881@unknownmsgid> <20170921015345.GN13110@ando.pearwood.info> Message-ID: On Thu, Sep 21, 2017 at 8:23 AM, Victor Stinner wrote: > While I was first in favor of extending the Python syntax, I changed > my mind. Float constants written in hexadecimal is a (very?) rare use > case, and there is already float.fromhex() available. > > A new syntax is something more to learn when you learn Python. Is it > worth it? I don't think so. Very few people need to write hexadecimal > constants in their code. > It is inconsistent that you can write hexadecimal integers but not floating point numbers. Consistency in syntax is *fewer* things to learn, not more. That said, I agree it's a rare use case, so it probably doesn't matter much either way. - Lucas -------------- next part -------------- An HTML attachment was scrubbed... URL: From mertz at gnosis.cx Thu Sep 21 16:09:11 2017 From: mertz at gnosis.cx (David Mertz) Date: Thu, 21 Sep 2017 13:09:11 -0700 Subject: [Python-ideas] Hexadecimal floating literals In-Reply-To: References: <20170912014844.GV13110@ando.pearwood.info> <20170912112851.GX13110@ando.pearwood.info> <9A3E54B3-BD0E-4CFF-B53E-5C411AB36775@lip6.fr> <20170914155707.GE13110@ando.pearwood.info> <703B80FA-D17A-4E76-8D02-085D40B7D20E@lip6.fr> <5758742674925253881@unknownmsgid> <20170921015345.GN13110@ando.pearwood.info> Message-ID: -1 Writing a floating point literal requires A LOT more knowledge than writing a hex integer. What is the bit length of floats on your specific Python compile? What happens if you specify more or less precision than actually available. Where is the underflow to subnormal numbers? What is the bit representation of information? Nan? -0 vs +0? There are people who know this and need to know this. But float.fromhex() is already available to them. A literal is an attractive nuisance for people who almost-but-not-quite understand IEEE-854. I.e. those people who named neither Tim Peters nor Mike Cowlishaw. On Sep 21, 2017 9:48 AM, "Lucas Wiman" wrote: On Thu, Sep 21, 2017 at 8:23 AM, Victor Stinner wrote: > While I was first in favor of extending the Python syntax, I changed > my mind. Float constants written in hexadecimal is a (very?) rare use > case, and there is already float.fromhex() available. > > A new syntax is something more to learn when you learn Python. Is it > worth it? I don't think so. Very few people need to write hexadecimal > constants in their code. > It is inconsistent that you can write hexadecimal integers but not floating point numbers. Consistency in syntax is fewer things to learn, not more. That said, I agree it's a rare use case, so it probably doesn't matter much either way. 
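(For completeness on the struct.unpack remark earlier and the bit-representation questions above: the exact bits of a binary64 value are already accessible today.)

    >>> import struct
    >>> struct.pack('>d', 1.5).hex()   # the eight bytes of the IEEE 754 binary64 for 1.5
    '3ff8000000000000'
    >>> struct.unpack('>d', bytes.fromhex('3ff8000000000000'))[0]
    1.5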
- Lucas _______________________________________________ Python-ideas mailing list Python-ideas at python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mertz at gnosis.cx Thu Sep 21 16:10:43 2017 From: mertz at gnosis.cx (David Mertz) Date: Thu, 21 Sep 2017 13:10:43 -0700 Subject: [Python-ideas] Hexadecimal floating literals In-Reply-To: References: <20170912014844.GV13110@ando.pearwood.info> <20170912112851.GX13110@ando.pearwood.info> <9A3E54B3-BD0E-4CFF-B53E-5C411AB36775@lip6.fr> <20170914155707.GE13110@ando.pearwood.info> <703B80FA-D17A-4E76-8D02-085D40B7D20E@lip6.fr> <5758742674925253881@unknownmsgid> <20170921015345.GN13110@ando.pearwood.info> Message-ID: Tablet autocorrect: bit representation of inf and -inf. On Sep 21, 2017 1:09 PM, "David Mertz" wrote: > -1 > > Writing a floating point literal requires A LOT more knowledge than > writing a hex integer. > > What is the bit length of floats on your specific Python compile? What > happens if you specify more or less precision than actually available. > Where is the underflow to subnormal numbers? What is the bit representation > of information? Nan? -0 vs +0? > > There are people who know this and need to know this. But float.fromhex() > is already available to them. A literal is an attractive nuisance for > people who almost-but-not-quite understand IEEE-854. I.e. those people who > named neither Tim Peters nor Mike Cowlishaw. > > On Sep 21, 2017 9:48 AM, "Lucas Wiman" wrote: > > On Thu, Sep 21, 2017 at 8:23 AM, Victor Stinner > wrote: > >> While I was first in favor of extending the Python syntax, I changed >> my mind. Float constants written in hexadecimal is a (very?) rare use >> case, and there is already float.fromhex() available. >> >> A new syntax is something more to learn when you learn Python. Is it >> worth it? I don't think so. Very few people need to write hexadecimal >> constants in their code. >> > > It is inconsistent that you can write hexadecimal integers but not > floating point numbers. Consistency in syntax is fewer things to learn, not > more. That said, I agree it's a rare use case, so it probably doesn't > matter much either way. > > - Lucas > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jhihn at gmx.com Thu Sep 21 17:02:18 2017 From: jhihn at gmx.com (Jason H) Date: Thu, 21 Sep 2017 23:02:18 +0200 Subject: [Python-ideas] Fwd: A PEP to define basical metric which allows to guarantee minimal code quality In-Reply-To: References: <4c594053-c431-4b29-a8ac-d5c422e6a06f@googlegroups.com> <20170915101954.GF13110@ando.pearwood.info> <-1213469140971527789@unknownmsgid> Message-ID: One of my hesitations on this topic is that it could create a false sense of security. And I mean security in both the 'comfortable with the code base' sense leading to insufficient testing, as well as 'we have a top-notch quality level, there are no vulnerabilities'. The one thing that I keep coming back to is all of the side-channel attacks. From legacy APICs on the mobo, to your DRAM leaking your crypto keys, so something arguably more code level like timing response of failed operations... 
I don't think we can even approximate quality to security. So given two code bases and as many dimensions as needed to express it, how do you compare the 'quality' of two code bases? Is that even fair? Can you only compare 'quality' to the previous iteration of the source? Is it normalized for size? Even then I'd argue that it shouldn't be anything the developer can claim, or if they can, it's got to be enforced by completely breaking. There's got to be tool that measures it, and it can't be gamed. We've had static analysis tools for some time. I've used them, I don't know that they've done any good, aside from a copy-paste analyzer that helps keep things DRY. From steve at pearwood.info Thu Sep 21 21:32:02 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 22 Sep 2017 11:32:02 +1000 Subject: [Python-ideas] Hexadecimal floating literals In-Reply-To: References: <20170912112851.GX13110@ando.pearwood.info> <9A3E54B3-BD0E-4CFF-B53E-5C411AB36775@lip6.fr> <20170914155707.GE13110@ando.pearwood.info> <703B80FA-D17A-4E76-8D02-085D40B7D20E@lip6.fr> <5758742674925253881@unknownmsgid> <20170921015345.GN13110@ando.pearwood.info> Message-ID: <20170922013202.GR13110@ando.pearwood.info> On Thu, Sep 21, 2017 at 01:09:11PM -0700, David Mertz wrote: > -1 > > Writing a floating point literal requires A LOT more knowledge than writing > a hex integer. > > What is the bit length of floats on your specific Python compile? Are there actually any Python implementations or builds which have floats not equal to 64 bits? If not, perhaps it is time to make 64 bit floats a language guarantee. > What > happens if you specify more or less precision than actually available. I expect the answer will be "exactly the same as what already happens right now". Why wouldn't it be? py> float.fromhex('0x1.81cd5c28f5c290000000089p+13') 12345.67 py> float('12345.6700000000000089') 12345.67 > Where is the underflow to subnormal numbers? Same place it is right now. > What is the bit representation > of information? Nan? -0 vs +0? Same as it is now. > There are people who know this and need to know this. But float.fromhex() > is already available to them. A literal is an attractive nuisance for > people who almost-but-not-quite understand IEEE-854. I.e. those people who > named neither Tim Peters nor Mike Cowlishaw. Using a different sized float is going to affect any representation of floats, whether it is in decimal or in hex. If your objections are valid for hex literals, then they're equally valid (if not more so!) for decimal literals, and we're left with the conclusion that nobody except Tim Peters and Mike Cowlishaw can enter floats into source code, or convert them from strings. And I think that's silly. Obviously many people can and do successfully use floats all the time, without worrying whether or not the code is absolutely, 100% consistent across all platforms, including that weird build on Acme YouNicks with 57 bit floats. People who care about weird builds can use sys.float_info to find out what they need to know, and adjust accordingly. Those who don't will continue to do what they're already doing: assume floats are 64-bit C doubles, and live in a state of blissful ignorance about alternatives until somebody reports a bug, which they'll close as "won't fix". 
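(For illustration of the sys.float_info check mentioned just above; the values shown are what a typical IEEE 754 binary64 build reports.)

    >>> import sys
    >>> sys.float_info.mant_dig, sys.float_info.max_exp, sys.float_info.dig
    (53, 1024, 15)
    >>> float.fromhex('0x1.921fb54442d18p+1')   # the binary64 value closest to pi
    3.141592653589793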
-- Steve From greg.ewing at canterbury.ac.nz Thu Sep 21 19:27:00 2017 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 22 Sep 2017 11:27:00 +1200 Subject: [Python-ideas] Hexadecimal floating literals In-Reply-To: References: <20170912014844.GV13110@ando.pearwood.info> <20170912112851.GX13110@ando.pearwood.info> <9A3E54B3-BD0E-4CFF-B53E-5C411AB36775@lip6.fr> <20170914155707.GE13110@ando.pearwood.info> <703B80FA-D17A-4E76-8D02-085D40B7D20E@lip6.fr> <5758742674925253881@unknownmsgid> <20170921015345.GN13110@ando.pearwood.info> Message-ID: <59C44AC4.2050301@canterbury.ac.nz> Lucas Wiman wrote: > It is inconsistent that you can write hexadecimal integers but not > floating point numbers. Consistency in syntax is /fewer/ things to > learn, not more. You still need to learn the details of the hex syntax for floats, though. It's not obvious e.g. that you need to use "p" for the exponent. -- Greg From jim.baker at python.org Thu Sep 21 22:35:22 2017 From: jim.baker at python.org (Jim Baker) Date: Thu, 21 Sep 2017 20:35:22 -0600 Subject: [Python-ideas] Hexadecimal floating literals In-Reply-To: References: <20170912014844.GV13110@ando.pearwood.info> <20170912112851.GX13110@ando.pearwood.info> <9A3E54B3-BD0E-4CFF-B53E-5C411AB36775@lip6.fr> <20170914155707.GE13110@ando.pearwood.info> <703B80FA-D17A-4E76-8D02-085D40B7D20E@lip6.fr> <5758742674925253881@unknownmsgid> <20170921015345.GN13110@ando.pearwood.info> Message-ID: On Thu, Sep 21, 2017 at 1:57 AM, Paul Moore wrote: > ... > > It's also worth remembering that there will be implementations other > than CPython that will need changes, too - Jython, PyPy, possibly > Cython, and many editors and IDEs. So setting the bar at "someone who > wants this will have to step up and provide a patch" seems reasonable > to me. > It would be more or less trivial for Jython to add such support, given that Java has such support natively, and we already leverage this support in our current implementation of 2.7. See https://docs.oracle.com/javase/7/docs/api/java/lang/Double.html#valueOf(java.lang.String) if curious. We just need to add the correct floating point constant that is parsed to Java's constant pool, as used by Java bytecode, and it's done. I'm much more concerned about finishing the rest of 3.x. - Jim -------------- next part -------------- An HTML attachment was scrubbed... URL: From jim.baker at python.org Thu Sep 21 22:49:11 2017 From: jim.baker at python.org (Jim Baker) Date: Thu, 21 Sep 2017 20:49:11 -0600 Subject: [Python-ideas] Hexadecimal floating literals In-Reply-To: <20170922013202.GR13110@ando.pearwood.info> References: <20170912112851.GX13110@ando.pearwood.info> <9A3E54B3-BD0E-4CFF-B53E-5C411AB36775@lip6.fr> <20170914155707.GE13110@ando.pearwood.info> <703B80FA-D17A-4E76-8D02-085D40B7D20E@lip6.fr> <5758742674925253881@unknownmsgid> <20170921015345.GN13110@ando.pearwood.info> <20170922013202.GR13110@ando.pearwood.info> Message-ID: On Thu, Sep 21, 2017 at 7:32 PM, Steven D'Aprano wrote: > On Thu, Sep 21, 2017 at 01:09:11PM -0700, David Mertz wrote: > > -1 > > > > Writing a floating point literal requires A LOT more knowledge than > writing > > a hex integer. > > > > What is the bit length of floats on your specific Python compile? > > Are there actually any Python implementations or builds which have > floats not equal to 64 bits? If not, perhaps it is time to make 64 bit > floats a language guarantee. 
> Jython passes the hexadecimal float tests in Lib/test/test_float.py, since Java uses 64-bit IEEE 754 double representation for the storage type of its double primitive type. (One can further constrain with strictfp, for intermediate representation, not certain how widely used that would be. I have never seen it.) In turn, Jython uses such doubles for its PyFloat implementation. I wonder if CPython is the only implementation that could potentially supports other representations, such as found on System/360 (or the successor z/OS architecture). And I vaguely recall VAX VMS had an alternative floating point, but is that still around??? - Jim -------------- next part -------------- An HTML attachment was scrubbed... URL: From mertz at gnosis.cx Thu Sep 21 22:57:01 2017 From: mertz at gnosis.cx (David Mertz) Date: Thu, 21 Sep 2017 19:57:01 -0700 Subject: [Python-ideas] Hexadecimal floating literals In-Reply-To: <20170922013202.GR13110@ando.pearwood.info> References: <20170912112851.GX13110@ando.pearwood.info> <9A3E54B3-BD0E-4CFF-B53E-5C411AB36775@lip6.fr> <20170914155707.GE13110@ando.pearwood.info> <703B80FA-D17A-4E76-8D02-085D40B7D20E@lip6.fr> <5758742674925253881@unknownmsgid> <20170921015345.GN13110@ando.pearwood.info> <20170922013202.GR13110@ando.pearwood.info> Message-ID: I think you are missing the point I was assuming at. Having a binary/hex float literal would tempt users to think "I know EXACTLY what number I'm spelling this way"... where most users definitely don't in edge cases. Spelling it float.fromhex(s) makes it more obvious "this is an expert operation I may not understand the intricacies of." On Sep 21, 2017 6:32 PM, "Steven D'Aprano" wrote: > On Thu, Sep 21, 2017 at 01:09:11PM -0700, David Mertz wrote: > > -1 > > > > Writing a floating point literal requires A LOT more knowledge than > writing > > a hex integer. > > > > What is the bit length of floats on your specific Python compile? > > Are there actually any Python implementations or builds which have > floats not equal to 64 bits? If not, perhaps it is time to make 64 bit > floats a language guarantee. > > > > What > > happens if you specify more or less precision than actually available. > > I expect the answer will be "exactly the same as what already > happens right now". Why wouldn't it be? > > py> float.fromhex('0x1.81cd5c28f5c290000000089p+13') > 12345.67 > py> float('12345.6700000000000089') > 12345.67 > > > > Where is the underflow to subnormal numbers? > > Same place it is right now. > > > > What is the bit representation > > of information? Nan? -0 vs +0? > > Same as it is now. > > > > There are people who know this and need to know this. But float.fromhex() > > is already available to them. A literal is an attractive nuisance for > > people who almost-but-not-quite understand IEEE-854. I.e. those people > who > > named neither Tim Peters nor Mike Cowlishaw. > > Using a different sized float is going to affect any representation of > floats, whether it is in decimal or in hex. If your objections are valid > for hex literals, then they're equally valid (if not more so!) for > decimal literals, and we're left with the conclusion that nobody except > Tim Peters and Mike Cowlishaw can enter floats into source code, or > convert them from strings. > > And I think that's silly. Obviously many people can and do successfully > use floats all the time, without worrying whether or not the code is > absolutely, 100% consistent across all platforms, including that weird > build on Acme YouNicks with 57 bit floats. 
> > People who care about weird builds can use sys.float_info to find out > what they need to know, and adjust accordingly. Those who don't will > continue to do what they're already doing: assume floats are 64-bit C > doubles, and live in a state of blissful ignorance about alternatives > until somebody reports a bug, which they'll close as "won't fix". > > > -- > Steve > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Thu Sep 21 23:14:27 2017 From: tim.peters at gmail.com (Tim Peters) Date: Thu, 21 Sep 2017 22:14:27 -0500 Subject: [Python-ideas] Hexadecimal floating literals In-Reply-To: References: <20170912014844.GV13110@ando.pearwood.info> <20170912112851.GX13110@ando.pearwood.info> <9A3E54B3-BD0E-4CFF-B53E-5C411AB36775@lip6.fr> <20170914155707.GE13110@ando.pearwood.info> <703B80FA-D17A-4E76-8D02-085D40B7D20E@lip6.fr> <5758742674925253881@unknownmsgid> <20170921015345.GN13110@ando.pearwood.info> Message-ID: [David Mertz ] > -1 > > Writing a floating point literal requires A LOT more knowledge than writing > a hex integer. But not really more than writing a decimal float literal in "scientific notation". People who use floats are used to the latter. Besides using "p" instead of "e" to mark the exponent, the only differences are that the mantissa is expressed in hex instead of in decimal, and the implicit base to which the exponent is applied is 2 instead of 10. Either way it's a notation for a rational number, which may or may not be exactly representable in the native HW float format. > What is the bit length of floats on your specific Python compile? What > happens if you specify more or less precision than actually available. > Where is the underflow to subnormal numbers? All the same answers apply as when using decimal "scientific notation". When the denoted rational isn't exactly representable, then a maze of rounding, overflow, and/or underflow rules apply. The base the literal is expressed in doesn't really have anything to do with those. > What is the bit representation of [infinity]? Nan? Hex float literals in general have no way to spell those: it's just a way to spell a subset of (mathematical) rational numbers. Python, however, does support special cases for those: >>> float.fromhex("inf") inf >>> float.fromhex("nan") nan > -0 vs +0? The obvious first attempts work fine for those ;-) > There are people who know this and need to know this. But float.fromhex() is > already available to them. A literal is an attractive nuisance for people > who almost-but-not-quite understand IEEE-854. As notations for rationals, nobody needs to understand 854 at all to use these things, so long as they stick to exactly representable numbers. Whether a specific literal _is_ exactly representable, and what happens if it's not, does require understanding a whole lot - but that's also true of decimal float literals. > I.e. those people who named neither Tim Peters nor Mike Cowlishaw. Or Mark Dickinson ;-) All that said, I'm -0 on the idea. I doubt I (or Mark, or Mike) would use it, because the need is rare and float.fromhex() is already sufficient. Indeed, `fromhex()` is less annoying, because it does support special cases for infinities and NaNs, and doesn't _require_ a "0x" prefix. 
From guido at python.org Thu Sep 21 23:16:27 2017 From: guido at python.org (Guido van Rossum) Date: Thu, 21 Sep 2017 20:16:27 -0700 Subject: [Python-ideas] Hexadecimal floating literals In-Reply-To: References: <20170912112851.GX13110@ando.pearwood.info> <9A3E54B3-BD0E-4CFF-B53E-5C411AB36775@lip6.fr> <20170914155707.GE13110@ando.pearwood.info> <703B80FA-D17A-4E76-8D02-085D40B7D20E@lip6.fr> <5758742674925253881@unknownmsgid> <20170921015345.GN13110@ando.pearwood.info> <20170922013202.GR13110@ando.pearwood.info> Message-ID: On Thu, Sep 21, 2017 at 7:57 PM, David Mertz wrote: > I think you are missing the point I was assuming at. Having a binary/hex > float literal would tempt users to think "I know EXACTLY what number I'm > spelling this way"... where most users definitely don't in edge cases. > That problem has never stopped us from using decimals. :-) > Spelling it float.fromhex(s) makes it more obvious "this is an expert > operation I may not understand the intricacies of." > I don't see why that would be more obvious than if it were built into the language -- just because something is a function doesn't mean it's an expert operation. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From mertz at gnosis.cx Thu Sep 21 23:30:45 2017 From: mertz at gnosis.cx (David Mertz) Date: Thu, 21 Sep 2017 20:30:45 -0700 Subject: [Python-ideas] Hexadecimal floating literals In-Reply-To: References: <20170912014844.GV13110@ando.pearwood.info> <20170912112851.GX13110@ando.pearwood.info> <9A3E54B3-BD0E-4CFF-B53E-5C411AB36775@lip6.fr> <20170914155707.GE13110@ando.pearwood.info> <703B80FA-D17A-4E76-8D02-085D40B7D20E@lip6.fr> <5758742674925253881@unknownmsgid> <20170921015345.GN13110@ando.pearwood.info> Message-ID: When I teach, I usually present this to students: >>> (0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3) False This is really easy as a way to say "floating point numbers are approximations where you often encounter rounding errors" The fact the "edge cases" are actually pretty central and commonplace in decimal approximations makes this a very easy lesson to teach. After that, I might discuss using deltas, or that better yet is math.isclose() or numpy.isclose(). Sometimes I'll get into absolute tolerance versus relative tolerance passingly at this point. Simply because the edge cases for working with e.g. '0xC.68p+2' in a hypothetical future Python are less obvious and less simple to demonstrate, I feel like learners will be tempted to think that using this base-2/16 representation saves them all their approximation issues and their need still to use isclose() or friends. On Thu, Sep 21, 2017 at 8:14 PM, Tim Peters wrote: > [David Mertz ] > > -1 > > > > Writing a floating point literal requires A LOT more knowledge than > writing > > a hex integer. > > But not really more than writing a decimal float literal in > "scientific notation". People who use floats are used to the latter. > Besides using "p" instead of "e" to mark the exponent, the only > differences are that the mantissa is expressed in hex instead of in > decimal, and the implicit base to which the exponent is applied is 2 > instead of 10. > > Either way it's a notation for a rational number, which may or may not > be exactly representable in the native HW float format. > > > > What is the bit length of floats on your specific Python compile? What > > happens if you specify more or less precision than actually available. > > Where is the underflow to subnormal numbers? 
> > All the same answers apply as when using decimal "scientific > notation". When the denoted rational isn't exactly representable, > then a maze of rounding, overflow, and/or underflow rules apply. The > base the literal is expressed in doesn't really have anything to do > with those. > > > > What is the bit representation of [infinity]? Nan? > > Hex float literals in general have no way to spell those: it's just a > way to spell a subset of (mathematical) rational numbers. Python, > however, does support special cases for those: > > >>> float.fromhex("inf") > inf > >>> float.fromhex("nan") > nan > > > > -0 vs +0? > > The obvious first attempts work fine for those ;-) > > > > There are people who know this and need to know this. But > float.fromhex() is > > already available to them. A literal is an attractive nuisance for people > > who almost-but-not-quite understand IEEE-854. > > As notations for rationals, nobody needs to understand 854 at all to > use these things, so long as they stick to exactly representable > numbers. Whether a specific literal _is_ exactly representable, and > what happens if it's not, does require understanding a whole lot - but > that's also true of decimal float literals. > > > I.e. those people who named neither Tim Peters nor Mike Cowlishaw. > > Or Mark Dickinson ;-) > > All that said, I'm -0 on the idea. I doubt I (or Mark, or Mike) would > use it, because the need is rare and float.fromhex() is already > sufficient. Indeed, `fromhex()` is less annoying, because it does > support special cases for infinities and NaNs, and doesn't _require_ a > "0x" prefix. > -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th. -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Thu Sep 21 23:38:25 2017 From: guido at python.org (Guido van Rossum) Date: Thu, 21 Sep 2017 20:38:25 -0700 Subject: [Python-ideas] Hexadecimal floating literals In-Reply-To: References: <20170912014844.GV13110@ando.pearwood.info> <20170912112851.GX13110@ando.pearwood.info> <9A3E54B3-BD0E-4CFF-B53E-5C411AB36775@lip6.fr> <20170914155707.GE13110@ando.pearwood.info> <703B80FA-D17A-4E76-8D02-085D40B7D20E@lip6.fr> <5758742674925253881@unknownmsgid> <20170921015345.GN13110@ando.pearwood.info> Message-ID: On Thu, Sep 21, 2017 at 8:30 PM, David Mertz wrote: > Simply because the edge cases for working with e.g. '0xC.68p+2' in a > hypothetical future Python are less obvious and less simple to demonstrate, > I feel like learners will be tempted to think that using this base-2/16 > representation saves them all their approximation issues and their need > still to use isclose() or friends. > Show them 1/49*49, and explain why for i < 49, (1/i)*i equals 1 (lucky rounding). -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mertz at gnosis.cx Thu Sep 21 23:41:42 2017 From: mertz at gnosis.cx (David Mertz) Date: Thu, 21 Sep 2017 20:41:42 -0700 Subject: [Python-ideas] Hexadecimal floating literals In-Reply-To: References: <20170912014844.GV13110@ando.pearwood.info> <20170912112851.GX13110@ando.pearwood.info> <9A3E54B3-BD0E-4CFF-B53E-5C411AB36775@lip6.fr> <20170914155707.GE13110@ando.pearwood.info> <703B80FA-D17A-4E76-8D02-085D40B7D20E@lip6.fr> <5758742674925253881@unknownmsgid> <20170921015345.GN13110@ando.pearwood.info> Message-ID: When I teach, I usually present this to students: >>> (0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3) False This is really easy as a way to say "floating point numbers are approximations where you often encounter rounding errors" The fact the "edge cases" are actually pretty central and commonplace in decimal approximations makes this a very easy lesson to teach. After that, I might discuss using deltas, or that better yet is math.isclose() or numpy.isclose(). Sometimes I'll get into absolute tolerance versus relative tolerance passingly at this point. Simply because the edge cases for working with e.g. '0xC.68p+2' in a hypothetical future Python are less obvious and less simple to demonstrate, I feel like learners will be tempted to think that using this base-2/16 representation saves them all their approximation issues and their need still to use isclose() or friends. On Thu, Sep 21, 2017 at 8:14 PM, Tim Peters wrote: > [David Mertz ] > > -1 > > > > Writing a floating point literal requires A LOT more knowledge than > writing > > a hex integer. > > But not really more than writing a decimal float literal in > "scientific notation". People who use floats are used to the latter. > Besides using "p" instead of "e" to mark the exponent, the only > differences are that the mantissa is expressed in hex instead of in > decimal, and the implicit base to which the exponent is applied is 2 > instead of 10. > > Either way it's a notation for a rational number, which may or may not > be exactly representable in the native HW float format. > > > > What is the bit length of floats on your specific Python compile? What > > happens if you specify more or less precision than actually available. > > Where is the underflow to subnormal numbers? > > All the same answers apply as when using decimal "scientific > notation". When the denoted rational isn't exactly representable, > then a maze of rounding, overflow, and/or underflow rules apply. The > base the literal is expressed in doesn't really have anything to do > with those. > > > > What is the bit representation of [infinity]? Nan? > > Hex float literals in general have no way to spell those: it's just a > way to spell a subset of (mathematical) rational numbers. Python, > however, does support special cases for those: > > >>> float.fromhex("inf") > inf > >>> float.fromhex("nan") > nan > > > > -0 vs +0? > > The obvious first attempts work fine for those ;-) > > > > There are people who know this and need to know this. But > float.fromhex() is > > already available to them. A literal is an attractive nuisance for people > > who almost-but-not-quite understand IEEE-854. > > As notations for rationals, nobody needs to understand 854 at all to > use these things, so long as they stick to exactly representable > numbers. Whether a specific literal _is_ exactly representable, and > what happens if it's not, does require understanding a whole lot - but > that's also true of decimal float literals. > > > I.e. 
those people who named neither Tim Peters nor Mike Cowlishaw. > > Or Mark Dickinson ;-) > > All that said, I'm -0 on the idea. I doubt I (or Mark, or Mike) would > use it, because the need is rare and float.fromhex() is already > sufficient. Indeed, `fromhex()` is less annoying, because it does > support special cases for infinities and NaNs, and doesn't _require_ a > "0x" prefix. > -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Fri Sep 22 00:20:45 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 22 Sep 2017 14:20:45 +1000 Subject: [Python-ideas] Hexadecimal floating literals In-Reply-To: References: <20170912014844.GV13110@ando.pearwood.info> <20170912112851.GX13110@ando.pearwood.info> <9A3E54B3-BD0E-4CFF-B53E-5C411AB36775@lip6.fr> <20170914155707.GE13110@ando.pearwood.info> <703B80FA-D17A-4E76-8D02-085D40B7D20E@lip6.fr> <5758742674925253881@unknownmsgid> <20170921015345.GN13110@ando.pearwood.info> Message-ID: On 22 September 2017 at 13:38, Guido van Rossum wrote: > On Thu, Sep 21, 2017 at 8:30 PM, David Mertz wrote: >> >> Simply because the edge cases for working with e.g. '0xC.68p+2' in a >> hypothetical future Python are less obvious and less simple to demonstrate, >> I feel like learners will be tempted to think that using this base-2/16 >> representation saves them all their approximation issues and their need >> still to use isclose() or friends. > > Show them 1/49*49, and explain why for i < 49, (1/i)*i equals 1 (lucky > rounding). If anything, I'd expect the hex notation to make the binary vs decimal representational differences *easier* to teach, since instructors would be able to directly show things like: >>> 0.5 == 0x0.8 == 0o0.4 == 0b0.1 # Negative power of two! True >>> (0.1 + 0.2) == 0.3 # Not negative powers of two False >>> 0.3 == 0x1.3333333333333p-2 True >>> (0.1 + 0.2) == 0x1.3333333333334p-2 True While it's possible to provide a demonstration along those lines today, it means writing the last two lines as: >>> 0.3.hex() == "0x1.3333333333333p-2" True >>> (0.1 + 0.2).hex() == "0x1.3333333333334p-2" True (Which invites the question "Why does 'hex(3)' work, but I have to write '0.3.hex()' instead"?) To illustrate that hex floating point literals don't magically solve all your binary floating point rounding issues, an instructor could also demonstrate: >>> one_tenth = 0x1.0 / 0xA.0 >>> two_tenths = 0x2.0 / 0xA.0 >>> three_tenths = 0x3.0 / 0xA.0 >>> three_tenths == one_tenth + two_tenths False Again, a demonstration along those lines is already possible, but it involves using integers in the rational expressions, rather than floats. Given syntactic support, it would also be reasonable for the hex()/oct()/bin() builtins to be expanded to handle printing floating point numbers in those formats, and for floats to gain support for the corresponding print formatting codes. So overall, I'm still +0, on the grounds of improving int/float API consistency. 
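(One more demonstration in the same spirit, using only what exists today: a hex spelling names an exact binary rational, which fractions.Fraction can expose, and that rational is still not 3/10.)

    >>> from fractions import Fraction
    >>> Fraction(float.fromhex('0x1.3333333333333p-2'))   # the double closest to 0.3
    Fraction(5404319552844595, 18014398509481984)
    >>> Fraction(3, 10) == Fraction(float.fromhex('0x1.3333333333333p-2'))
    False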
While I'm sympathetic to the concerns about potentially changing the way the binary/decimal representation distinction is taught for floating point values, I don't think having better support for more native representations of binary floats is likely to make that harder than it already is. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From rob.cliffe at btinternet.com Fri Sep 22 03:41:52 2017 From: rob.cliffe at btinternet.com (Rob Cliffe) Date: Fri, 22 Sep 2017 08:41:52 +0100 Subject: [Python-ideas] Hexadecimal floating literals In-Reply-To: <20170922013202.GR13110@ando.pearwood.info> References: <20170912112851.GX13110@ando.pearwood.info> <9A3E54B3-BD0E-4CFF-B53E-5C411AB36775@lip6.fr> <20170914155707.GE13110@ando.pearwood.info> <703B80FA-D17A-4E76-8D02-085D40B7D20E@lip6.fr> <5758742674925253881@unknownmsgid> <20170921015345.GN13110@ando.pearwood.info> <20170922013202.GR13110@ando.pearwood.info> Message-ID: <290438ea-5c1a-1dd7-be43-d21b969caf97@btinternet.com> On 22/09/2017 02:32, Steven D'Aprano wrote: > > Are there actually any Python implementations or builds which have > floats not equal to 64 bits? If not, perhaps it is time to make 64 bit > floats a language guarantee. > > > This will be unfortunate when Intel bring out a processor with 256-bit floats (or by "64 bit" do you mean "at least 64 bit"?).? Hm, is there an analog of Moore's law that says the number of floating-point bits doubles every X years? :-) Unrelated thought:? Users might be unsure if the exponent in a hexadecimal float is in decimal or in hex. Rob Cliffe From solipsis at pitrou.net Fri Sep 22 05:42:12 2017 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 22 Sep 2017 11:42:12 +0200 Subject: [Python-ideas] Hexadecimal floating literals References: <20170912014844.GV13110@ando.pearwood.info> <20170912112851.GX13110@ando.pearwood.info> <9A3E54B3-BD0E-4CFF-B53E-5C411AB36775@lip6.fr> <20170914155707.GE13110@ando.pearwood.info> <703B80FA-D17A-4E76-8D02-085D40B7D20E@lip6.fr> <5758742674925253881@unknownmsgid> <20170921015345.GN13110@ando.pearwood.info> Message-ID: <20170922114212.29afc9c9@fsol> On Thu, 21 Sep 2017 22:14:27 -0500 Tim Peters wrote: > [David Mertz ] > > -1 > > > > Writing a floating point literal requires A LOT more knowledge than writing > > a hex integer. > > But not really more than writing a decimal float literal in > "scientific notation". People who use floats are used to the latter. > Besides using "p" instead of "e" to mark the exponent, the only > differences are that the mantissa is expressed in hex instead of in > decimal, and the implicit base to which the exponent is applied is 2 > instead of 10. The main difference is familiarity. "scientific" notation should be well-known and understood even by high school kids. Who knows about hexadecimal notation for floats, apart from floating-point experts? So for someone reading code, the scientific notation poses no problem as they understand it intuitively (even if they may not grasp the difficulties of the underlying conversion to binary FP), while for hexadecimal float notation need they have to go out of their way to learn about it, parse the number slowly and try to make out what its value is. Regards Antoine. 
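(On the exponent question raised a couple of messages up: the digits before 'p' are hexadecimal, while the exponent after 'p' is written in decimal and scales by powers of two.)

    >>> float.fromhex('0x1.8p+3')   # (1 + 8/16) * 2**3
    12.0
    >>> float.fromhex('0x1p10')     # the exponent is decimal ten, so this is 2**10
    1024.0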
From storchaka at gmail.com Fri Sep 22 06:56:21 2017 From: storchaka at gmail.com (Serhiy Storchaka) Date: Fri, 22 Sep 2017 13:56:21 +0300 Subject: [Python-ideas] Hexadecimal floating literals In-Reply-To: References: <20170912014844.GV13110@ando.pearwood.info> <20170912112851.GX13110@ando.pearwood.info> <9A3E54B3-BD0E-4CFF-B53E-5C411AB36775@lip6.fr> <20170914155707.GE13110@ando.pearwood.info> <703B80FA-D17A-4E76-8D02-085D40B7D20E@lip6.fr> <5758742674925253881@unknownmsgid> <20170921015345.GN13110@ando.pearwood.info> Message-ID: 21.09.17 18:23, Victor Stinner ????: > My vote is now -1 on extending the Python syntax to add hexadecimal > floating literals. > > While I was first in favor of extending the Python syntax, I changed > my mind. Float constants written in hexadecimal is a (very?) rare use > case, and there is already float.fromhex() available. > > A new syntax is something more to learn when you learn Python. Is it > worth it? I don't think so. Very few people need to write hexadecimal > constants in their code. Initially I was between -0 and +0. The cost of implementing this feature is not zero, but it looked harmless (while almost useless). But after reading the discussion (in particular the comments of proponents) I'm closer to -1. This feature can be useful for very few people. And they already have float.fromhex(). Taking to account the nature of Python the arguments for literals are weaker than in case of statically compiled languages. For the rest of users it rather adds confusion and misunderstanding. And don't forgot about non-zero cost. You will be impressed by the number of places just in the CPython core and stdlib that should be updated for supporting a new type of literals. From rhodri at kynesim.co.uk Fri Sep 22 07:50:46 2017 From: rhodri at kynesim.co.uk (Rhodri James) Date: Fri, 22 Sep 2017 12:50:46 +0100 Subject: [Python-ideas] Hexadecimal floating literals In-Reply-To: References: <20170912112851.GX13110@ando.pearwood.info> <9A3E54B3-BD0E-4CFF-B53E-5C411AB36775@lip6.fr> <20170914155707.GE13110@ando.pearwood.info> <703B80FA-D17A-4E76-8D02-085D40B7D20E@lip6.fr> <5758742674925253881@unknownmsgid> <20170921015345.GN13110@ando.pearwood.info> <20170922013202.GR13110@ando.pearwood.info> Message-ID: On 22/09/17 03:57, David Mertz wrote: > I think you are missing the point I was assuming at. Having a binary/hex > float literal would tempt users to think "I know EXACTLY what number I'm > spelling this way"... where most users definitely don't in edge cases. Quite. What makes me -0 on this idea is that a lot of the initial enthusiasm on this list came from people saying exactly that. -- Rhodri James *-* Kynesim Ltd From mertz at gnosis.cx Fri Sep 22 11:37:06 2017 From: mertz at gnosis.cx (David Mertz) Date: Fri, 22 Sep 2017 08:37:06 -0700 Subject: [Python-ideas] Hexadecimal floating literals In-Reply-To: <290438ea-5c1a-1dd7-be43-d21b969caf97@btinternet.com> References: <20170912112851.GX13110@ando.pearwood.info> <9A3E54B3-BD0E-4CFF-B53E-5C411AB36775@lip6.fr> <20170914155707.GE13110@ando.pearwood.info> <703B80FA-D17A-4E76-8D02-085D40B7D20E@lip6.fr> <5758742674925253881@unknownmsgid> <20170921015345.GN13110@ando.pearwood.info> <20170922013202.GR13110@ando.pearwood.info> <290438ea-5c1a-1dd7-be43-d21b969caf97@btinternet.com> Message-ID: > > Unrelated thought: Users might be unsure if the exponent in a hexadecimal > float is in decimal or in hex. 
I was playing around with float.fromhex() for this thread, and the first number I tried to spell used a hex exponent because that seemed like "the obvious thing"... I figured it out quickly enough, but the actual spelling feels less obvious to me. -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th. -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Fri Sep 22 11:37:31 2017 From: guido at python.org (Guido van Rossum) Date: Fri, 22 Sep 2017 08:37:31 -0700 Subject: [Python-ideas] Hexadecimal floating literals In-Reply-To: References: <20170912014844.GV13110@ando.pearwood.info> <20170912112851.GX13110@ando.pearwood.info> <9A3E54B3-BD0E-4CFF-B53E-5C411AB36775@lip6.fr> <20170914155707.GE13110@ando.pearwood.info> <703B80FA-D17A-4E76-8D02-085D40B7D20E@lip6.fr> <5758742674925253881@unknownmsgid> <20170921015345.GN13110@ando.pearwood.info> Message-ID: On Thu, Sep 21, 2017 at 9:20 PM, Nick Coghlan wrote: > >>> one_tenth = 0x1.0 / 0xA.0 > >>> two_tenths = 0x2.0 / 0xA.0 > >>> three_tenths = 0x3.0 / 0xA.0 > >>> three_tenths == one_tenth + two_tenths > False > OMG Regardless of whether we introduce this feature, .hex() is the way to show what's going on here: >>> 0.1.hex() '0x1.999999999999ap-4' >>> 0.2.hex() '0x1.999999999999ap-3' >>> 0.3.hex() '0x1.3333333333333p-2' >>> (0.1+0.2).hex() '0x1.3333333333334p-2' >>> This shows so clearly that there's 1 bit difference! -- --Guido van Rossum (python.org/~guido ) -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Fri Sep 22 13:06:17 2017 From: chris.barker at noaa.gov (Chris Barker) Date: Fri, 22 Sep 2017 10:06:17 -0700 Subject: [Python-ideas] Hexadecimal floating literals In-Reply-To: References: <20170912014844.GV13110@ando.pearwood.info> <20170912112851.GX13110@ando.pearwood.info> <9A3E54B3-BD0E-4CFF-B53E-5C411AB36775@lip6.fr> <20170914155707.GE13110@ando.pearwood.info> <703B80FA-D17A-4E76-8D02-085D40B7D20E@lip6.fr> <5758742674925253881@unknownmsgid> <20170921015345.GN13110@ando.pearwood.info> Message-ID: On Fri, Sep 22, 2017 at 8:37 AM, Guido van Rossum wrote: > On Thu, Sep 21, 2017 at 9:20 PM, Nick Coghlan wrote: > >> >>> one_tenth = 0x1.0 / 0xA.0 >> >>> two_tenths = 0x2.0 / 0xA.0 >> >>> three_tenths = 0x3.0 / 0xA.0 >> >>> three_tenths == one_tenth + two_tenths >> False >> > > OMG Regardless of whether we introduce this feature, .hex() is the way to > show what's going on here: > > >>> 0.1.hex() > '0x1.999999999999ap-4' > >>> 0.2.hex() > '0x1.999999999999ap-3' > >>> 0.3.hex() > '0x1.3333333333333p-2' > >>> (0.1+0.2).hex() > '0x1.3333333333334p-2' > >>> > > This shows so clearly that there's 1 bit difference! > Thanks! I really should add this example to the math.isclose() docs.... .hex is mentioned in: https://docs.python.org/3/tutorial/floatingpoint.html but I don't see it used in a nice clear example like this. -CHB > > > -- > --Guido van Rossum (python.org/~guido ) > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > -- Christopher Barker, Ph.D. 
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Fri Sep 22 13:15:51 2017 From: tim.peters at gmail.com (Tim Peters) Date: Fri, 22 Sep 2017 12:15:51 -0500 Subject: [Python-ideas] Hexadecimal floating literals In-Reply-To: <20170922114212.29afc9c9@fsol> References: <20170912014844.GV13110@ando.pearwood.info> <20170912112851.GX13110@ando.pearwood.info> <9A3E54B3-BD0E-4CFF-B53E-5C411AB36775@lip6.fr> <20170914155707.GE13110@ando.pearwood.info> <703B80FA-D17A-4E76-8D02-085D40B7D20E@lip6.fr> <5758742674925253881@unknownmsgid> <20170921015345.GN13110@ando.pearwood.info> <20170922114212.29afc9c9@fsol> Message-ID: [Antoine Pitrou ] > ... > The main difference is familiarity. "scientific" notation should be > well-known and understood even by high school kids. Who knows about > hexadecimal notation for floats, apart from floating-point experts? Here's an example: you <0x0.2p0 wink>. For people who understand both hex and (decimal) scientific notation, learning what hex float notation means is easy. > So for someone reading code, the scientific notation poses no problem > as they understand it intuitively (even if they may not grasp the > difficulties of the underlying conversion to binary FP), while for > hexadecimal float notation need they have to go out of their way to > learn about it, parse the number slowly and try to make out what its > value is. I've seen plenty of people on StackOverflow who (a) don't understand hex notation for integers; and/or (b) don't understand scientific notation for floats. Nothing is self-evident about either; they both have to be learned at first. Same for hex float notation. Of course it's true that many (not all) people do know about hex integers and/or decimal scientific notation from prior (to Python) experience. My objection is that we already have a way to use hex float notation, and the _need_ for it is rare. If someone uninitiated sees a rare: x = 0x1.aaap-4 they're going to ask on StackOverflow what the heck it's supposed to mean. But if they see a rare: x = float.fromhex("0x1.aaap-4") they can Google for "python fromhex" and find the docs themselves at once. The odd method name makes it highly "discoverable", and I think that's a feature for rare gimmicks with a small, specialized audience. From antoine at python.org Fri Sep 22 13:21:35 2017 From: antoine at python.org (Antoine Pitrou) Date: Fri, 22 Sep 2017 19:21:35 +0200 Subject: [Python-ideas] Hexadecimal floating literals In-Reply-To: References: <20170912014844.GV13110@ando.pearwood.info> <20170912112851.GX13110@ando.pearwood.info> <9A3E54B3-BD0E-4CFF-B53E-5C411AB36775@lip6.fr> <20170914155707.GE13110@ando.pearwood.info> <703B80FA-D17A-4E76-8D02-085D40B7D20E@lip6.fr> <5758742674925253881@unknownmsgid> <20170921015345.GN13110@ando.pearwood.info> <20170922114212.29afc9c9@fsol> Message-ID: Le 22/09/2017 ? 19:15, Tim Peters a ?crit?: > I've seen plenty of people on StackOverflow who (a) don't understand > hex notation for integers; and/or (b) don't understand scientific > notation for floats. Nothing is self-evident about either; they both > have to be learned at first. Sure. But, unless I'm mistaken, most people learn about the scientific notation as teenagers (that was certainly my case at least, and that was not from my parents AFAIR). 
Which of them learn about hexadecimal float notation at the same time? > But if they see a rare: > > x = float.fromhex("0x1.aaap-4") > > they can Google for "python fromhex" and find the docs themselves at > once. The odd method name makes it highly "discoverable", and I think > that's a feature for rare gimmicks with a small, specialized audience. Basically agreed. Moreover, "float.fromhex" spells it out, while the literal syntax does not even make it obvious it's a floating-point number at all. Regards Antoine. From random832 at fastmail.com Sat Sep 23 21:09:22 2017 From: random832 at fastmail.com (Random832) Date: Sat, 23 Sep 2017 21:09:22 -0400 Subject: [Python-ideas] Hexadecimal floating literals In-Reply-To: <290438ea-5c1a-1dd7-be43-d21b969caf97@btinternet.com> References: <20170912112851.GX13110@ando.pearwood.info> <9A3E54B3-BD0E-4CFF-B53E-5C411AB36775@lip6.fr> <20170914155707.GE13110@ando.pearwood.info> <703B80FA-D17A-4E76-8D02-085D40B7D20E@lip6.fr> <5758742674925253881@unknownmsgid> <20170921015345.GN13110@ando.pearwood.info> <20170922013202.GR13110@ando.pearwood.info> <290438ea-5c1a-1dd7-be43-d21b969caf97@btinternet.com> Message-ID: <1506215362.2474938.1116137224.54D854A9@webmail.messagingengine.com> On Fri, Sep 22, 2017, at 03:41, Rob Cliffe wrote: > Unrelated thought:? Users might be unsure if the exponent in a > hexadecimal float is in decimal or in hex. Or, for that matter, a power of two or of sixteen. From ncoghlan at gmail.com Sun Sep 24 07:39:03 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 24 Sep 2017 21:39:03 +1000 Subject: [Python-ideas] Hexadecimal floating literals In-Reply-To: References: <20170912014844.GV13110@ando.pearwood.info> <20170912112851.GX13110@ando.pearwood.info> <9A3E54B3-BD0E-4CFF-B53E-5C411AB36775@lip6.fr> <20170914155707.GE13110@ando.pearwood.info> <703B80FA-D17A-4E76-8D02-085D40B7D20E@lip6.fr> <5758742674925253881@unknownmsgid> <20170921015345.GN13110@ando.pearwood.info> <20170922114212.29afc9c9@fsol> Message-ID: On 23 September 2017 at 03:15, Tim Peters wrote: > But if they see a rare: > > x = float.fromhex("0x1.aaap-4") > > they can Google for "python fromhex" and find the docs themselves at > once. The odd method name makes it highly "discoverable", and I think > that's a feature for rare gimmicks with a small, specialized audience. Given how often I've used "It's hard to search for magic syntax" as a design argument myself, I'm surprised I missed the fact it also applies in this case. Now that you've brought it up though, I think it's a very good point, and it's enough to switch me from +0 to -1. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From alexandre.galode at gmail.com Mon Sep 25 07:49:42 2017 From: alexandre.galode at gmail.com (alexandre.galode at gmail.com) Date: Mon, 25 Sep 2017 04:49:42 -0700 (PDT) Subject: [Python-ideas] Fwd: A PEP to define basical metric which allows to guarantee minimal code quality In-Reply-To: References: <4c594053-c431-4b29-a8ac-d5c422e6a06f@googlegroups.com> <20170915101954.GF13110@ando.pearwood.info> <-1213469140971527789@unknownmsgid> Message-ID: Hi, Sorry from being late, i was in professional trip to Pycon FR. I see that the subject is divising advises. Reading responses, i have impression that my proposal has been saw as mandatory, that i don't want of course. As previously said, i see this "PEP" as an informational PEP. So it's a guideline, not a mandatory. Each developer will have right to ignore it, as each developer can choose to ignore PEP8 or PEP20. 
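(As a purely illustrative sketch of the kind of optional, tool-based checks discussed in this thread -- the tools, paths and thresholds below are placeholders, not anything the thread endorses.)

    # quality_gate.py -- run a few existing quality tools and report overall failure.
    import subprocess
    import sys

    CHECKS = [
        # PEP 8 style plus McCabe complexity, via flake8 and its bundled mccabe plugin.
        ["flake8", "--max-complexity=10", "src"],
        # Fail if recorded test coverage is below 80% (coverage.py).
        ["coverage", "report", "--fail-under=80"],
    ]

    def main():
        failed = False
        for cmd in CHECKS:
            print("running:", " ".join(cmd))
            if subprocess.call(cmd) != 0:   # non-zero exit status means the check failed
                failed = True
        return 1 if failed else 0

    if __name__ == "__main__":
        sys.exit(main())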
Someones was saying that it was too generic, and not attached to Python, but i'm not OK on this point, because some metrics basically measured on Python code is justly PEP8 respect, with tools like PEP8, pylint, ... Not every metrics could be attached to another PEP, i'm OK on this point, but if at least one of its could be, it means in my mind, that a PEP can be justified. @Jason, about false sense of security. Reading this make me thinking to the last week Pycon. Someone was talking to me to its coverage rate was falling. But in fact it was because an add of code lines, without TU on it. It means, that the purposed metrics are not to be used as "god" metrics, that we have only to read to know precisely the "health" of our code, but metrics which indicates to us the "health" trend of our code. @Chris, about we measure only what we know. Effectively, i think it's the reality. One year ago, i didn't know McCabe principle and associated tool. But now, i'm using it. If a "PEP", existing from a time, was talking about this concept, i would read it and apply the concept. Perfect solution does not exist, i know it, but i think this "PEP" could, partially, be a good guideline. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Mon Sep 25 22:53:42 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 26 Sep 2017 12:53:42 +1000 Subject: [Python-ideas] Fwd: Fwd: A PEP to define basical metric which allows to guarantee minimal code quality In-Reply-To: References: <4c594053-c431-4b29-a8ac-d5c422e6a06f@googlegroups.com> <20170915101954.GF13110@ando.pearwood.info> <-1213469140971527789@unknownmsgid> Message-ID: Forwarding my reply, since Google Groups still can't get the Reply-To headers for the mailing list right, and we still don't know how to categorically prohibit posting from there. ---------- Forwarded message ---------- From: Nick Coghlan Date: 26 September 2017 at 12:51 Subject: Re: [Python-ideas] Fwd: A PEP to define basical metric which allows to guarantee minimal code quality To: Alexandre GALODE Cc: python-ideas On 25 September 2017 at 21:49, wrote: > Hi, > > Sorry from being late, i was in professional trip to Pycon FR. > > I see that the subject is divising advises. > > Reading responses, i have impression that my proposal has been saw as > mandatory, that i don't want of course. As previously said, i see this "PEP" > as an informational PEP. So it's a guideline, not a mandatory. Each > developer will have right to ignore it, as each developer can choose to > ignore PEP8 or PEP20. > > Perfect solution does not exist, i know it, but i think this "PEP" could, > partially, be a good guideline. Your question is essentially "Are python-dev prepared to offer generic code quality assessment advice to Python developers?" The answer is "No, we're not". It's not our role, and it's not a role we're the least bit interested in taking on. Just because we're the ones making the software equivalent of hammers and saws doesn't mean we're also the ones that should be drafting or signing off on people's building codes :) Python's use cases are too broad, and what's appropriate for my ad hoc script to download desktop wallpaper backgrounds, isn't going to be what's appropriate for writing an Ansible module, which in turn isn't going to be the same as what's appropriate for writing a highly scalable web service or a complex data analysis job. So the question of "What does 'good enough for my purposes' actually mean?" 
is something for end users to tackle for themselves, either individually or collaboratively, without seeking specific language designer endorsement of their chosen criteria. However, as mentioned earlier in the thread, it would be *entirely* appropriate for the folks participating in PyCQA to decide to either take on this work themselves, or else endorse somebody else taking it on. I'd see such an effort as being similar to the way that packaging.python.org originally started as an independent PyPA project hosted at python-packaging-user-guide.readthedocs.io, with a fair bit of content already being added before we later requested and received the python.org subdomain. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From levkivskyi at gmail.com Wed Sep 27 05:28:11 2017 From: levkivskyi at gmail.com (Ivan Levkivskyi) Date: Wed, 27 Sep 2017 11:28:11 +0200 Subject: [Python-ideas] PEP 560 (second post) Message-ID: Previously I posted PEP 560 two weeks ago, while several other PEPs were also posted, so it didn't get much of attention. Here I post the PEP 560 again, now including the full text for convenience of commenting. -- Ivan ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ PEP: 560 Title: Core support for generic types Author: Ivan Levkivskyi Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 03-Sep-2017 Python-Version: 3.7 Post-History: 09-Sep-2017 Abstract ======== Initially PEP 484 was designed in such way that it would not introduce *any* changes to the core CPython interpreter. Now type hints and the ``typing`` module are extensively used by the community, e.g. PEP 526 and PEP 557 extend the usage of type hints, and the backport of ``typing`` on PyPI has 1M downloads/month. Therefore, this restriction can be removed. It is proposed to add two special methods ``__class_getitem__`` and ``__subclass_base__`` to the core CPython for better support of generic types. Rationale ========= The restriction to not modify the core CPython interpreter lead to some design decisions that became questionable when the ``typing`` module started to be widely used. There are three main points of concerns: performance of the ``typing`` module, metaclass conflicts, and the large number of hacks currently used in ``typing``. Performance: ------------ The ``typing`` module is one of the heaviest and slowest modules in the standard library even with all the optimizations made. Mainly this is because subscripted generic types (see PEP 484 for definition of terms used in this PEP) are class objects (see also [1]_). The three main ways how the performance can be improved with the help of the proposed special methods: - Creation of generic classes is slow since the ``GenericMeta.__new__`` is very slow; we will not need it anymore. - Very long MROs for generic classes will be twice shorter; they are present because we duplicate the ``collections.abc`` inheritance chain in ``typing``. - Time of instantiation of generic classes will be improved (this is minor however). Metaclass conflicts: -------------------- All generic types are instances of ``GenericMeta``, so if a user uses a custom metaclass, then it is hard to make a corresponding class generic. This is particularly hard for library classes that a user doesn't control. A workaround is to always mix-in ``GenericMeta``:: class AdHocMeta(GenericMeta, LibraryMeta): pass class UserClass(LibraryBase, Generic[T], metaclass=AdHocMeta): ... 
but this is not always practical or even possible. With the help of the proposed special attributes the ``GenericMeta`` metaclass will not be needed. Hacks and bugs that will be removed by this proposal: ----------------------------------------------------- - ``_generic_new`` hack that exists since ``__init__`` is not called on instances with a type differing form the type whose ``__new__`` was called, ``C[int]().__class__ is C``. - ``_next_in_mro`` speed hack will be not necessary since subscription will not create new classes. - Ugly ``sys._getframe`` hack, this one is particularly nasty, since it looks like we can't remove it without changes outside ``typing``. - Currently generics do dangerous things with private ABC caches to fix large memory consumption that grows at least as O(N\ :sup:`2`), see [2]_. This point is also important because it was recently proposed to re-implement ``ABCMeta`` in C. - Problems with sharing attributes between subscripted generics, see [3]_. Current solution already uses ``__getattr__`` and ``__setattr__``, but it is still incomplete, and solving this without the current proposal will be hard and will need ``__getattribute__``. - ``_no_slots_copy`` hack, where we clean-up the class dictionary on every subscription thus allowing generics with ``__slots__``. - General complexity of the ``typing`` module, the new proposal will not only allow to remove the above mentioned hacks/bugs, but also simplify the implementation, so that it will be easier to maintain. Specification ============= The idea of ``__class_getitem__`` is simple: it is an exact analog of ``__getitem__`` with an exception that it is called on a class that defines it, not on its instances, this allows us to avoid ``GenericMeta.__getitem__`` for things like ``Iterable[int]``. The ``__class_getitem__`` is automatically a class method and does not require ``@classmethod`` decorator (similar to ``__init_subclass__``) and is inherited like normal attributes. For example:: class MyList: def __getitem__(self, index): return index + 1 def __class_getitem__(cls, item): return f"{cls.__name__}[{item.__name__}]" class MyOtherList(MyList): pass assert MyList()[0] == 1 assert MyList[int] == "MyList[int]" assert MyOtherList()[0] == 1 assert MyOtherList[int] == "MyOtherList[int]" Note that this method is used as a fallback, so if a metaclass defines ``__getitem__``, then that will have the priority. If an object that is not a class object appears in the bases of a class definition, the ``__subclass_base__`` is searched on it. If found, it is called with the original tuple of bases as an argument. If the result of the call is not ``None``, then it is substituted instead of this object. Otherwise (if the result is ``None``), the base is just removed. This is necessary to avoid inconsistent MRO errors, that are currently prevented by manipulations in ``GenericMeta.__new__``. After creating the class, the original bases are saved in ``__orig_bases__`` (currently this is also done by the metaclass). NOTE: These two method names are reserved for exclusive use by the ``typing`` module and the generic types machinery, and any other use is strongly discouraged. The reference implementation (with tests) can be found in [4]_, the proposal was originally posted and discussed on the ``typing`` tracker, see [5]_. 
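To illustrate, the bases resolution step described above could be written in
pure Python roughly as follows (a sketch only: the helper name is
illustrative, and the actual logic is implemented in C as part of class
creation in the reference implementation)::

    def _resolve_mro_entries(orig_bases):
        # Sketch of the resolution step: non-class "bases" that define
        # __subclass_base__ are substituted (or dropped) before the MRO
        # is calculated.
        new_bases = []
        for base in orig_bases:
            if isinstance(base, type):
                new_bases.append(base)
                continue
            resolver = getattr(base, '__subclass_base__', None)
            if resolver is None:
                new_bases.append(base)
                continue
            substitute = resolver(orig_bases)
            if substitute is not None:
                new_bases.append(substitute)
            # a None result means the entry is simply removed
        return tuple(new_bases)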
Backwards compatibility and impact on users who don't use ``typing``: ===================================================================== This proposal may break code that currently uses the names ``__class_getitem__`` and ``__subclass_base__``. This proposal will support almost complete backwards compatibility with the current public generic types API; moreover the ``typing`` module is still provisional. The only two exceptions are that currently ``issubclass(List[int], List)`` returns True, with this proposal it will raise ``TypeError``. Also ``issubclass(collections.abc.Iterable, typing.Iterable)`` will return ``False``, which is probably desirable, since currently we have a (virtual) inheritance cycle between these two classes. With the reference implementation I measured negligible performance effects (under 1% on a micro-benchmark) for regular (non-generic) classes. References ========== .. [1] Discussion following Mark Shannon's presentation at Language Summit (https://github.com/python/typing/issues/432) .. [2] Pull Request to implement shared generic ABC caches (https://github.com/python/typing/pull/383) .. [3] An old bug with setting/accessing attributes on generic types (https://github.com/python/typing/issues/392) .. [4] The reference implementation (https://github.com/ilevkivskyi/cpython/pull/2/files) .. [5] Original proposal (https://github.com/python/typing/issues/468) Copyright ========= This document has been placed in the public domain. -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjreedy at udel.edu Wed Sep 27 12:08:11 2017 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 27 Sep 2017 12:08:11 -0400 Subject: [Python-ideas] PEP 560 (second post) In-Reply-To: References: Message-ID: On 9/27/2017 5:28 AM, Ivan Levkivskyi wrote: > Abstract > ======== > > Initially PEP 484 was designed in such way that it would not introduce > *any* changes to the core CPython interpreter. Now type hints and > the ``typing`` module are extensively used by the community, e.g. PEP 526 > and PEP 557 extend the usage of type hints, and the backport of ``typing`` > on PyPI has 1M downloads/month. Therefore, this restriction can be removed. It seem sensible to me that you waited awhile to discover what would be needed. > It is proposed to add two special methods ``__class_getitem__`` and > ``__subclass_base__`` to the core CPython for better support of > generic types. I would not be concerned about anyone (mis)using reserved words. If the new methods were for general use, I would question making them automatically class methods. Having __new__ automatically being a static method is convenient, but occasionally throws people off. But if they were only used for typing, perhaps it is ok. On the other hand, I expect that others will use __class_getitem__ for the same purpose -- to avoid defining a metaclass just to make class[something] work. So I question defining that as 'typing only'. Without rereading the PEP, the use case for __subclass_base__ is not clear to me. So I don't know if there are other uses for it. 
--
Terry Jan Reedy

From levkivskyi at gmail.com  Wed Sep 27 18:11:04 2017
From: levkivskyi at gmail.com (Ivan Levkivskyi)
Date: Thu, 28 Sep 2017 00:11:04 +0200
Subject: [Python-ideas] PEP 560 (second post)
In-Reply-To:
References:
Message-ID:

On 27 September 2017 at 18:08, Terry Reedy wrote:

> On 9/27/2017 5:28 AM, Ivan Levkivskyi wrote:
>> It is proposed to add two special methods ``__class_getitem__`` and
>> ``__subclass_base__`` to the core CPython for better support of
>> generic types.
>
> I would not be concerned about anyone (mis)using reserved words.

These methods are quite specific (especially __subclass_base__), so there
are two points:

* I would like to say that there are fewer backwards compatibility
  guarantees. Only the documented use is guaranteed to be backwards
  compatible, which is in this case Iterable[int] etc.
* I don't want to "advertise" these methods. I could imagine someone would
  be unpleasantly surprised when finding these while reading someone
  else's code.

> If the new methods were for general use, I would question making them
> automatically class methods. Having __new__ automatically being a static
> method is convenient, but occasionally throws people off. But if they were
> only used for typing, perhaps it is ok. On the other hand, I expect that
> others will use __class_getitem__ for the same purpose -- to avoid defining
> a metaclass just to make class[something] work. So I question defining
> that as 'typing only'.

I think we would rather want to limit the number of use cases for
SomeClass[int], so that one doesn't need to guess if it is a generic class
or something else.

> Without rereading the PEP, the use case for __subclass_base__ is not clear
> to me. So I don't know if there are other uses for it.

The __subclass_base__ method is needed to avoid making the result of
Iterable[int] a class object. Creating new class objects on every
subscription is too expensive. However, these objects must be subclassable,
so the only way is to introduce this new method. For example:

    class Iterable:
        def __class_getitem__(cls, item):
            return GenericAlias(cls, item)

    class GenericAlias:
        def __init__(self, origin, item):
            self.origin = origin
            self.item = item
        def __subclass_base__(self, bases):
            return self.origin

    class MyIterable(Iterable[int]):
        ...

Real code will be more complex, but this illustrates the idea. I don't know
of other use cases where one would want to allow non-classes in the list of
base classes.

Thanks for comments!

-- Ivan
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ncoghlan at gmail.com  Fri Sep 28 02:27:09 2017
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 28 Sep 2017 16:27:09 +1000
Subject: [Python-ideas] PEP 560 (second post)
In-Reply-To:
References:
Message-ID:

On 27 September 2017 at 19:28, Ivan Levkivskyi wrote:
> If an object that is not a class object appears in the bases of a class
> definition, the ``__subclass_base__`` is searched on it. If found,
> it is called with the original tuple of bases as an argument. If the result
> of the call is not ``None``, then it is substituted instead of this object.
> Otherwise (if the result is ``None``), the base is just removed. This is
> necessary to avoid inconsistent MRO errors, that are currently prevented by
> manipulations in ``GenericMeta.__new__``. After creating the class,
> the original bases are saved in ``__orig_bases__`` (currently this is also
> done by the metaclass).
The name of "__subclass_base__" is still the part of this proposal that bothers me the most. I do know what it means, but "the class to use when this is listed as a base class for a subclass" is a genuinely awkward noun phrase. How would you feel about calling it "__mro_entry__", as a mnemonic for "the substitute entry to use instead of this object when calculating a subclass MRO"? Then the MRO calculation process would be: * generate "resolved_bases" from "orig_bases" (if any of them define "__mro_entry__") * use the existing MRO calculation process with resolved_bases as the input instead of orig_bases I think the other thing that needs to be clarified is whether or not the actual metaclass can expect to receive an already-resolved sequence of MRO entries as its list of bases, or if it will need to repeat the base resolution process executed while figuring out the metaclass. You can see the implications of that question most clearly when looking at the dynamic type creation API in the types module, where we offer both: # All inclusive with callback types.new_class(name, bases, kwds, exec_body) # Separate preparation phase mcl, ns, updated_kwds = types.prepare_class(name, bases, kwds) exec_body(ns) mcl(name, bases, ns, **updated_kwds) If we expect the metaclass to receive an already resolved sequence of bases, then we'll need to update the usage expectations for the latter API to look something like: mcl, ns, updated_kwds = types.prepare_class(name, bases, kwds) resolved_bases = ns.pop("__resolved_bases__", bases) # We leave ns["__orig_bases__"] set exec_body(ns) mcl(name, resolved_bases, ns, **updated_kwds) By contrast, if we decide that we're going do the full MRO resolution twice, then every metaclass will need to be updated to resolve the bases correctly (and make sure to use "cls.__bases__" or "cls.__mro__" after call up to their parent metaclass to actually create the class object). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From diana.joan.clarke at gmail.com Thu Sep 28 14:48:15 2017 From: diana.joan.clarke at gmail.com (Diana Clarke) Date: Thu, 28 Sep 2017 12:48:15 -0600 Subject: [Python-ideas] Changes to the existing optimization levels Message-ID: Hi folks: I was recently looking for an entry-level cpython task to work on in my spare time and plucked this off of someone's TODO list. "Make optimizations more fine-grained than just -O and -OO" There are currently three supported optimization levels (0, 1, and 2). Briefly summarized, they do the following. 0: no optimizations 1: remove assert statements and __debug__ blocks 2: remove docstrings, assert statements, and __debug__ blocks >From what I gather, their use-case is assert statements in production code. More specifically, they want to be able to optimize away docstrings, but keep the assert statements, which currently isn't possible with the existing optimization levels. As a first baby-step, I considered just adding a new optimization level 3 that keeps asserts but continues to remove docstrings and __debug__ blocks. 3: remove docstrings and __debug__ blocks >From a command-line perspective, there is already support for additional optimization levels. That is, without making any changes, the optimization level will increase with the number of 0s provided. 
$ python -c "import sys; print(sys.flags.optimize)"
0

$ python -OO -c "import sys; print(sys.flags.optimize)"
2

$ python -OOOOOOO -c "import sys; print(sys.flags.optimize)"
7

And the PYTHONOPTIMIZE environment variable will happily assign something
like 42 to sys.flags.optimize.

$ unset PYTHONOPTIMIZE
$ python -c "import sys; print(sys.flags.optimize)"
0

$ export PYTHONOPTIMIZE=2
$ python -c "import sys; print(sys.flags.optimize)"
2

$ export PYTHONOPTIMIZE=42
$ python -c "import sys; print(sys.flags.optimize)"
42

Finally, the resulting __pycache__ folder also already contains the
expected bytecode files for the new optimization levels
(__init__.cpython-37.opt-42.pyc was created for optimization level 42,
for example).

$ tree .
├── test
│   ├── __init__.py
│   └── __pycache__
│       ├── __init__.cpython-37.opt-1.pyc
│       ├── __init__.cpython-37.opt-2.pyc
│       ├── __init__.cpython-37.opt-42.pyc
│       ├── __init__.cpython-37.opt-7.pyc
│       └── __init__.cpython-37.pyc

Adding optimization level 3 is an easy change to make. Here's that quick
proof of concept (minus changes to the docs, etc). I've also attached that
diff as 3.diff.

https://github.com/dianaclarke/cpython/commit/4bd7278d87bd762b2989178e5bfed309cf9fb5bf

I was initially looking for a more elegant solution that allowed you to
specify exactly which optimizations you wanted, and when I floated this
naive ("level 3") approach off-list to a few core developers, their
feedback confirmed my hunch (too hacky).

So for my second pass at this task, I started with the following
two-pronged approach.

1) Changed the various compile signatures to accept a set of string
optimization flags rather than an int value.

2) Added a new command line option N that allows you to specify any number
of individual optimization flags.

For example:

    python -N nodebug -N noassert -N nodocstring

The existing optimization options (-O and -OO) still exist in this
approach, but they are mapped to the new optimization flags ("nodebug",
"noassert", "nodocstring").

With the exception of the builtin compile() function, all underlying
compile functions would only accept optimization flags going forward, and
the builtin compile() function would accept either an integer optimize
value or a set of optimization flags for backwards compatibility.

You can find that work-in-progress approach here on github (also attached
as N.diff).

https://github.com/dianaclarke/cpython/commit/3e36cea1fc8ee6f4cdc584851e4c1edfc2bb1e56

All in all, that approach is going fairly well, but there's a lot of work
remaining, and that diff is already getting quite large (for my
new-contributor status). Note, for example, that I haven't yet tackled
adding bytecode files to __pycache__ that reflect these new optimization
flags. Something like:

$ tree .
├── test
│   ├── __init__.py
│   └── __pycache__
│       ├── __init__.cpython-37.opt-nodebug-noassert.pyc
│       ├── __init__.cpython-37.opt-nodebug-nodocstring.pyc
│       ├── __init__.cpython-37.opt-nodebug-noassert-nodocstring.pyc
│       └── __init__.cpython-37.pyc

I'm also not certain if the various compile signatures are even open for
change (int optimize => PyObject *optimizations), or if that's a no-no.

And there are still a ton of references to "-O", "-OO",
"sys.flags.optimize", "Py_OptimizeFlag", "PYTHONOPTIMIZE", "optimize",
etc. that all need to be audited and their implications considered.

I've really enjoyed this task and I'm learning a lot about the C API, but
I think this is a good place to stop and solicit feedback and direction.
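For concreteness, here's roughly what the Python-level spelling looks like
under the work-in-progress branch (the keyword and flag names are
provisional, so treat this as a sketch rather than a final API):

    source = "assert True"

    # the same set the legacy -OO / optimize=2 spelling maps to
    code = compile(source, "<string>", "exec",
                   optimizations={"nodebug", "noassert", "nodocstring"})

    # the combination the current levels can't express:
    # keep asserts, but drop docstrings and __debug__ blocks
    code = compile(source, "<string>", "exec",
                   optimizations={"nodebug", "nodocstring"})

    # legacy spelling, still accepted for backwards compatibility
    code = compile(source, "<string>", "exec", optimize=2)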
My gut says that the amount of churn and resulting risk is too high to continue down this path, but I would love to hear thoughts from others (alternate approaches, ways to limit scope, confirmation that the existing approach is too entrenched for change, etc). Regardless, I think the following subset change could merge without any bigger picture changes, as it just adds test coverage for a case not yet covered. I can reopen that pull request once I clean up the commit message a bit (I closed it in the mean time). https://github.com/python/cpython/pull/3450/commits/bfdab955a94a7fef431548f3ba2c4b5ca79e958d Thanks for your time! Cheers, --diana -------------- next part -------------- diff --git a/Lib/test/test_builtin.py b/Lib/test/test_builtin.py index 9d949b74cb..cc6baff707 100644 --- a/Lib/test/test_builtin.py +++ b/Lib/test/test_builtin.py @@ -339,7 +339,8 @@ class BuiltinTest(unittest.TestCase): values = [(-1, __debug__, f.__doc__), (0, True, 'doc'), (1, False, 'doc'), - (2, False, None)] + (2, False, None), + (3, True, None)] for optval, debugval, docstring in values: # test both direct compilation and compilation via AST codeobjs = [] diff --git a/Modules/main.c b/Modules/main.c index 08b22760de..a63882685e 100644 --- a/Modules/main.c +++ b/Modules/main.c @@ -66,6 +66,7 @@ static const char usage_2[] = "\ -m mod : run library module as a script (terminates option list)\n\ -O : optimize generated bytecode slightly; also PYTHONOPTIMIZE=x\n\ -OO : remove doc-strings in addition to the -O optimizations\n\ +-OOO : like -OO but don't optimize away asserts\n\ -q : don't print version and copyright messages on interactive startup\n\ -s : don't add user site directory to sys.path; also PYTHONNOUSERSITE\n\ -S : don't imply 'import site' on initialization\n\ diff --git a/Python/bltinmodule.c b/Python/bltinmodule.c index 85f207b68e..453b9bbc8a 100644 --- a/Python/bltinmodule.c +++ b/Python/bltinmodule.c @@ -705,7 +705,7 @@ builtin_compile_impl(PyObject *module, PyObject *source, PyObject *filename, } /* XXX Warn if (supplied_flags & PyCF_MASK_OBSOLETE) != 0? 
*/ - if (optimize < -1 || optimize > 2) { + if (optimize < -1 || optimize > 3) { PyErr_SetString(PyExc_ValueError, "compile(): invalid optimize value"); goto error; diff --git a/Python/compile.c b/Python/compile.c index 280ddc39e3..e892e34e6c 100644 --- a/Python/compile.c +++ b/Python/compile.c @@ -1431,7 +1431,7 @@ compiler_body(struct compiler *c, asdl_seq *stmts, string docstring) if (find_ann(stmts)) { ADDOP(c, SETUP_ANNOTATIONS); } - /* if not -OO mode, set docstring */ + /* if not -OO or -OOO mode, set docstring */ if (c->c_optimize < 2 && docstring) { ADDOP_O(c, LOAD_CONST, docstring, consts); ADDOP_NAME(c, STORE_NAME, __doc__, names); @@ -1836,7 +1836,7 @@ compiler_function(struct compiler *c, stmt_ty s, int is_async) return 0; } - /* if not -OO mode, add docstring */ + /* if not -OO or -OOO mode, add docstring */ if (c->c_optimize < 2 && s->v.FunctionDef.docstring) docstring = s->v.FunctionDef.docstring; if (compiler_add_o(c, c->u->u_consts, docstring) < 0) { @@ -2825,7 +2825,7 @@ compiler_assert(struct compiler *c, stmt_ty s) basicblock *end; PyObject* msg; - if (c->c_optimize) + if (c->c_optimize && c->c_optimize != 3) return 1; if (assertion_error == NULL) { assertion_error = PyUnicode_InternFromString("AssertionError"); -------------- next part -------------- diff --git a/Include/compile.h b/Include/compile.h index 3cc351d409..837e4afbea 100644 --- a/Include/compile.h +++ b/Include/compile.h @@ -47,18 +47,18 @@ typedef struct { #define FUTURE_GENERATOR_STOP "generator_stop" struct _mod; /* Declare the existence of this type */ -#define PyAST_Compile(mod, s, f, ar) PyAST_CompileEx(mod, s, f, -1, ar) +#define PyAST_Compile(mod, s, f, ar) PyAST_CompileEx(mod, s, f, NULL, ar) PyAPI_FUNC(PyCodeObject *) PyAST_CompileEx( struct _mod *mod, const char *filename, /* decoded from the filesystem encoding */ PyCompilerFlags *flags, - int optimize, + PyObject *optimizations, PyArena *arena); PyAPI_FUNC(PyCodeObject *) PyAST_CompileObject( struct _mod *mod, PyObject *filename, PyCompilerFlags *flags, - int optimize, + PyObject *optimizations, PyArena *arena); PyAPI_FUNC(PyFutureFeatures *) PyFuture_FromAST( struct _mod * mod, diff --git a/Include/pythonrun.h b/Include/pythonrun.h index 6f0c6fc655..31625d002c 100644 --- a/Include/pythonrun.h +++ b/Include/pythonrun.h @@ -100,19 +100,19 @@ PyAPI_FUNC(PyObject *) PyRun_FileExFlags( #ifdef Py_LIMITED_API PyAPI_FUNC(PyObject *) Py_CompileString(const char *, const char *, int); #else -#define Py_CompileString(str, p, s) Py_CompileStringExFlags(str, p, s, NULL, -1) -#define Py_CompileStringFlags(str, p, s, f) Py_CompileStringExFlags(str, p, s, f, -1) +#define Py_CompileString(str, p, s) Py_CompileStringExFlags(str, p, s, NULL, NULL) +#define Py_CompileStringFlags(str, p, s, f) Py_CompileStringExFlags(str, p, s, f, NULL) PyAPI_FUNC(PyObject *) Py_CompileStringExFlags( const char *str, const char *filename, /* decoded from the filesystem encoding */ int start, PyCompilerFlags *flags, - int optimize); + PyObject *optimizations); PyAPI_FUNC(PyObject *) Py_CompileStringObject( const char *str, PyObject *filename, int start, PyCompilerFlags *flags, - int optimize); + PyObject *optimizations); #endif PyAPI_FUNC(struct symtable *) Py_SymtableString( const char *str, diff --git a/Include/sysmodule.h b/Include/sysmodule.h index c5547ff674..62bba3bd94 100644 --- a/Include/sysmodule.h +++ b/Include/sysmodule.h @@ -25,6 +25,9 @@ PyAPI_FUNC(void) PySys_WriteStderr(const char *format, ...) 
PyAPI_FUNC(void) PySys_FormatStdout(const char *format, ...); PyAPI_FUNC(void) PySys_FormatStderr(const char *format, ...); +PyAPI_FUNC(void) PySys_SetOptimizations(PyObject *); +PyAPI_FUNC(PyObject *) PySys_GetOptimizations(void); + PyAPI_FUNC(void) PySys_ResetWarnOptions(void); PyAPI_FUNC(void) PySys_AddWarnOption(const wchar_t *); PyAPI_FUNC(void) PySys_AddWarnOptionUnicode(PyObject *); diff --git a/Lib/test/test_builtin.py b/Lib/test/test_builtin.py index 9d949b74cb..87dcda7b43 100644 --- a/Lib/test/test_builtin.py +++ b/Lib/test/test_builtin.py @@ -328,19 +328,22 @@ class BuiltinTest(unittest.TestCase): codestr = '''def f(): """doc""" + debug_enabled = False + if __debug__: + debug_enabled = True try: assert False except AssertionError: - return (True, f.__doc__) + return (True, f.__doc__, debug_enabled) else: - return (False, f.__doc__) + return (False, f.__doc__, debug_enabled) ''' def f(): """doc""" - values = [(-1, __debug__, f.__doc__), - (0, True, 'doc'), - (1, False, 'doc'), - (2, False, None)] - for optval, debugval, docstring in values: + values = [(-1, __debug__, f.__doc__, __debug__), + (0, True, 'doc', True), + (1, False, 'doc', False), + (2, False, None, False)] + for optval, assertval, docstring, debugval in values: # test both direct compilation and compilation via AST codeobjs = [] codeobjs.append(compile(codestr, "", "exec", optimize=optval)) @@ -350,7 +353,7 @@ class BuiltinTest(unittest.TestCase): ns = {} exec(code, ns) rv = ns['f']() - self.assertEqual(rv, (debugval, docstring)) + self.assertEqual(rv, (assertval, docstring, debugval)) def test_delattr(self): sys.spam = 1 diff --git a/Modules/main.c b/Modules/main.c index 08b22760de..051d9e4792 100644 --- a/Modules/main.c +++ b/Modules/main.c @@ -40,7 +40,7 @@ static wchar_t **orig_argv; static int orig_argc; /* command line options */ -#define BASE_OPTS L"bBc:dEhiIJm:OqRsStuvVW:xX:?" +#define BASE_OPTS L"bBc:dEhiIJm:N:OqRsStuvVW:xX:?" #define PROGRAM_OPTS BASE_OPTS @@ -64,6 +64,7 @@ static const char usage_2[] = "\ if stdin does not appear to be a terminal; also PYTHONINSPECT=x\n\ -I : isolate Python from the user's environment (implies -E and -s)\n\ -m mod : run library module as a script (terminates option list)\n\ +-N : optimization flags: nodebug, noassert, nodocstring\n\ -O : optimize generated bytecode slightly; also PYTHONOPTIMIZE=x\n\ -OO : remove doc-strings in addition to the -O optimizations\n\ -q : don't print version and copyright messages on interactive startup\n\ @@ -354,6 +355,7 @@ typedef struct { wchar_t *module; /* -m argument */ PyObject *warning_options; /* -W options */ PyObject *extra_options; /* -X options */ + PyObject *optimizations; /* -N optimization flags */ int print_help; /* -h, -? 
options */ int print_version; /* -V option */ int bytes_warning; /* Py_BytesWarningFlag */ @@ -372,7 +374,7 @@ typedef struct { } _Py_CommandLineDetails; #define _Py_CommandLineDetails_INIT \ - {NULL, NULL, NULL, NULL, NULL, \ + {NULL, NULL, NULL, NULL, NULL, NULL, \ 0, 0, 0, 0, 0, 0, 0, 0, \ 0, 0, 0, 0, 0, 0, 0} @@ -380,6 +382,7 @@ static int read_command_line(int argc, wchar_t **argv, _Py_CommandLineDetails *cmdline) { PyObject *warning_option = NULL; + PyObject *optimization = NULL; wchar_t *command = NULL; wchar_t *module = NULL; int c; @@ -435,6 +438,14 @@ read_command_line(int argc, wchar_t **argv, _Py_CommandLineDetails *cmdline) /* case 'J': reserved for Jython */ + case 'N': + if (cmdline->optimizations == NULL) + cmdline->optimizations = PySet_New(NULL); + optimization = PyUnicode_FromWideChar(_PyOS_optarg, -1); + PySet_Add(cmdline->optimizations, optimization); + Py_DECREF(optimization); + break; + case 'O': cmdline->optimization_level++; break; @@ -515,6 +526,46 @@ read_command_line(int argc, wchar_t **argv, _Py_CommandLineDetails *cmdline) } } + /* + * Map legacy optimization_level int value to optimization flags. + * 0: {}, no optimization flags + * 1: {"nodebug", "noassert"} + * 2: {"nodebug", "noassert", "nodocstring"} + */ + if (cmdline->optimization_level >= 0) { + if (cmdline->optimizations == NULL) { + cmdline->optimizations = PySet_New(NULL); + } + + if (cmdline->optimization_level == 1) { + optimization = PyUnicode_FromString("nodebug"); + PySet_Add(cmdline->optimizations, optimization); + Py_DECREF(optimization); + + optimization = PyUnicode_FromString("noassert"); + PySet_Add(cmdline->optimizations, optimization); + Py_DECREF(optimization); + } + + if (cmdline->optimization_level == 2) { + optimization = PyUnicode_FromString("nodebug"); + PySet_Add(cmdline->optimizations, optimization); + Py_DECREF(optimization); + + optimization = PyUnicode_FromString("noassert"); + PySet_Add(cmdline->optimizations, optimization); + Py_DECREF(optimization); + + optimization = PyUnicode_FromString("nodocstring"); + PySet_Add(cmdline->optimizations, optimization); + Py_DECREF(optimization); + } + } + + if (cmdline->optimizations != NULL) { + PySys_SetOptimizations(cmdline->optimizations); + } + if (command == NULL && module == NULL && _PyOS_optind < argc && wcscmp(argv[_PyOS_optind], L"-") != 0) { diff --git a/Modules/parsermodule.c b/Modules/parsermodule.c index 929f2deb16..91e0286a6b 100644 --- a/Modules/parsermodule.c +++ b/Modules/parsermodule.c @@ -515,7 +515,7 @@ parser_compilest(PyST_Object *self, PyObject *args, PyObject *kw) goto error; res = (PyObject *)PyAST_CompileObject(mod, filename, - &self->st_flags, -1, arena); + &self->st_flags, NULL, arena); error: Py_XDECREF(filename); if (arena != NULL) diff --git a/Modules/zipimport.c b/Modules/zipimport.c index fad1b1f5ab..302a8b8bd3 100644 --- a/Modules/zipimport.c +++ b/Modules/zipimport.c @@ -1365,7 +1365,7 @@ compile_source(PyObject *pathname, PyObject *source) } code = Py_CompileStringObject(PyBytes_AsString(fixed_source), - pathname, Py_file_input, NULL, -1); + pathname, Py_file_input, NULL, NULL); Py_DECREF(fixed_source); return code; diff --git a/Programs/_freeze_importlib.c b/Programs/_freeze_importlib.c index 1069966a18..7b9be6cfd1 100644 --- a/Programs/_freeze_importlib.c +++ b/Programs/_freeze_importlib.c @@ -90,7 +90,12 @@ main(int argc, char *argv[]) code_name = is_bootstrap ? 
"" : ""; - code = Py_CompileStringExFlags(text, code_name, Py_file_input, NULL, 0); + + PyObject *optimizations = PySet_New(NULL); // empty: no optimizations + code = Py_CompileStringExFlags(text, code_name, Py_file_input, NULL, + optimizations); + Py_DECREF(optimizations); + if (code == NULL) goto error; free(text); diff --git a/Python/bltinmodule.c b/Python/bltinmodule.c index 85f207b68e..542ae337e2 100644 --- a/Python/bltinmodule.c +++ b/Python/bltinmodule.c @@ -683,7 +683,7 @@ in addition to any features explicitly specified. static PyObject * builtin_compile_impl(PyObject *module, PyObject *source, PyObject *filename, const char *mode, int flags, int dont_inherit, - int optimize) + PyObject *optimizations) /*[clinic end generated code: output=1fa176e33452bb63 input=0ff726f595eb9fcd]*/ { PyObject *source_copy; @@ -705,11 +705,11 @@ builtin_compile_impl(PyObject *module, PyObject *source, PyObject *filename, } /* XXX Warn if (supplied_flags & PyCF_MASK_OBSOLETE) != 0? */ - if (optimize < -1 || optimize > 2) { - PyErr_SetString(PyExc_ValueError, - "compile(): invalid optimize value"); - goto error; - } +// if (optimize < -1 || optimize > 2) { +// PyErr_SetString(PyExc_ValueError, +// "compile(): invalid optimize value"); +// goto error; +// } if (!dont_inherit) { PyEval_MergeCompilerFlags(&cf); @@ -751,8 +751,8 @@ builtin_compile_impl(PyObject *module, PyObject *source, PyObject *filename, PyArena_Free(arena); goto error; } - result = (PyObject*)PyAST_CompileObject(mod, filename, - &cf, optimize, arena); + result = (PyObject*)PyAST_CompileObject(mod, filename, &cf, + optimizations, arena); PyArena_Free(arena); } goto finally; @@ -762,7 +762,8 @@ builtin_compile_impl(PyObject *module, PyObject *source, PyObject *filename, if (str == NULL) goto error; - result = Py_CompileStringObject(str, filename, start[compile_mode], &cf, optimize); + result = Py_CompileStringObject(str, filename, start[compile_mode], &cf, + optimizations); Py_XDECREF(source_copy); goto finally; @@ -2737,7 +2738,7 @@ _PyBuiltin_Init(void) SETBUILTIN("tuple", &PyTuple_Type); SETBUILTIN("type", &PyType_Type); SETBUILTIN("zip", &PyZip_Type); - debug = PyBool_FromLong(Py_OptimizeFlag == 0); + debug = PyBool_FromLong(Py_OptimizeFlag == 0); // TODO if (PyDict_SetItemString(dict, "__debug__", debug) < 0) { Py_DECREF(debug); return NULL; diff --git a/Python/clinic/bltinmodule.c.h b/Python/clinic/bltinmodule.c.h index fa327da0ff..35accc0701 100644 --- a/Python/clinic/bltinmodule.c.h +++ b/Python/clinic/bltinmodule.c.h @@ -133,7 +133,7 @@ exit: PyDoc_STRVAR(builtin_compile__doc__, "compile($module, /, source, filename, mode, flags=0,\n" -" dont_inherit=False, optimize=-1)\n" +" dont_inherit=False, optimizations=None, optimize=-1)\n" "--\n" "\n" "Compile source into a code object that can be executed by exec() or eval().\n" @@ -155,26 +155,61 @@ PyDoc_STRVAR(builtin_compile__doc__, static PyObject * builtin_compile_impl(PyObject *module, PyObject *source, PyObject *filename, const char *mode, int flags, int dont_inherit, - int optimize); + PyObject *optimizations); static PyObject * builtin_compile(PyObject *module, PyObject **args, Py_ssize_t nargs, PyObject *kwnames) { PyObject *return_value = NULL; - static const char * const _keywords[] = {"source", "filename", "mode", "flags", "dont_inherit", "optimize", NULL}; - static _PyArg_Parser _parser = {"OO&s|iii:compile", _keywords, 0}; + static const char * const _keywords[] = { + "source", + "filename", + "mode", + "flags", + "dont_inherit", + "optimizations", + "optimize", + NULL 
+ }; + static _PyArg_Parser _parser = {"OO&s|iiOi:compile", _keywords, 0}; PyObject *source; PyObject *filename; const char *mode; int flags = 0; int dont_inherit = 0; + PyObject *optimizations = NULL; int optimize = -1; if (!_PyArg_ParseStackAndKeywords(args, nargs, kwnames, &_parser, - &source, PyUnicode_FSDecoder, &filename, &mode, &flags, &dont_inherit, &optimize)) { - goto exit; + &source, PyUnicode_FSDecoder, &filename, &mode, &flags, &dont_inherit, + &optimizations, &optimize)) { + goto exit; + } + + /* + * Map legacy optimize int value to optimization flags. + * - 1: NULL, use optimization level of interpreter + * 0: {}, no optimization flags + * 1: {"nodebug", "noassert"} + * 2: {"nodebug", "noassert", "nodocstring"} + */ + if (optimize >= 0) { + if (optimizations == NULL) { + optimizations = PySet_New(NULL); + } + + if (optimize == 1) { + PySet_Add(optimizations, PyUnicode_FromString("nodebug")); + PySet_Add(optimizations, PyUnicode_FromString("noassert")); + } else if (optimize == 2) { + PySet_Add(optimizations, PyUnicode_FromString("nodebug")); + PySet_Add(optimizations, PyUnicode_FromString("noassert")); + PySet_Add(optimizations, PyUnicode_FromString("nodocstring")); + } } - return_value = builtin_compile_impl(module, source, filename, mode, flags, dont_inherit, optimize); + + return_value = builtin_compile_impl(module, source, filename, mode, flags, + dont_inherit, optimizations); exit: return return_value; diff --git a/Python/compile.c b/Python/compile.c index 280ddc39e3..a062f3e0ee 100644 --- a/Python/compile.c +++ b/Python/compile.c @@ -154,7 +154,7 @@ struct compiler { PyFutureFeatures *c_future; /* pointer to module's __future__ */ PyCompilerFlags *c_flags; - int c_optimize; /* optimization level */ + PyObject *c_optimizations; /* optimization flags */ int c_interactive; /* true if in interactive mode */ int c_nestlevel; @@ -285,6 +285,12 @@ _Py_Mangle(PyObject *privateobj, PyObject *ident) return result; } +static int +optimization_flag_absent(struct compiler *c, const char *flag) +{ + return PySet_Contains(c->c_optimizations, PyUnicode_FromString(flag)) == 0; +} + static int compiler_init(struct compiler *c) { @@ -299,7 +305,7 @@ compiler_init(struct compiler *c) PyCodeObject * PyAST_CompileObject(mod_ty mod, PyObject *filename, PyCompilerFlags *flags, - int optimize, PyArena *arena) + PyObject *optimizations, PyArena *arena) { struct compiler c; PyCodeObject *co = NULL; @@ -328,9 +334,14 @@ PyAST_CompileObject(mod_ty mod, PyObject *filename, PyCompilerFlags *flags, c.c_future->ff_features = merged; flags->cf_flags = merged; c.c_flags = flags; - c.c_optimize = (optimize == -1) ? 
Py_OptimizeFlag : optimize; c.c_nestlevel = 0; + if (optimizations == NULL) { + c.c_optimizations = PySys_GetOptimizations(); + } else { + c.c_optimizations = optimizations; + } + c.c_st = PySymtable_BuildObject(mod, filename, c.c_future); if (c.c_st == NULL) { if (!PyErr_Occurred()) @@ -348,14 +359,14 @@ PyAST_CompileObject(mod_ty mod, PyObject *filename, PyCompilerFlags *flags, PyCodeObject * PyAST_CompileEx(mod_ty mod, const char *filename_str, PyCompilerFlags *flags, - int optimize, PyArena *arena) + PyObject *optimizations, PyArena *arena) { PyObject *filename; PyCodeObject *co; filename = PyUnicode_DecodeFSDefault(filename_str); if (filename == NULL) return NULL; - co = PyAST_CompileObject(mod, filename, flags, optimize, arena); + co = PyAST_CompileObject(mod, filename, flags, optimizations, arena); Py_DECREF(filename); return co; @@ -1432,7 +1443,8 @@ compiler_body(struct compiler *c, asdl_seq *stmts, string docstring) ADDOP(c, SETUP_ANNOTATIONS); } /* if not -OO mode, set docstring */ - if (c->c_optimize < 2 && docstring) { + int include_doc = optimization_flag_absent(c, "nodocstring"); + if (include_doc && docstring) { ADDOP_O(c, LOAD_CONST, docstring, consts); ADDOP_NAME(c, STORE_NAME, __doc__, names); } @@ -1837,7 +1849,8 @@ compiler_function(struct compiler *c, stmt_ty s, int is_async) } /* if not -OO mode, add docstring */ - if (c->c_optimize < 2 && s->v.FunctionDef.docstring) + int include_doc = optimization_flag_absent(c, "nodocstring"); + if (include_doc && s->v.FunctionDef.docstring) docstring = s->v.FunctionDef.docstring; if (compiler_add_o(c, c->u->u_consts, docstring) < 0) { compiler_exit_scope(c); @@ -2825,8 +2838,10 @@ compiler_assert(struct compiler *c, stmt_ty s) basicblock *end; PyObject* msg; - if (c->c_optimize) + int include_assert = optimization_flag_absent(c, "noassert"); + if (!include_assert) { return 1; + } if (assertion_error == NULL) { assertion_error = PyUnicode_InternFromString("AssertionError"); if (assertion_error == NULL) @@ -4142,8 +4157,10 @@ expr_constant(struct compiler *c, expr_ty e) case Name_kind: /* optimize away names that can't be reassigned */ id = PyUnicode_AsUTF8(e->v.Name.id); - if (id && strcmp(id, "__debug__") == 0) - return !c->c_optimize; + if (id && strcmp(id, "__debug__") == 0) { + int include_debug = optimization_flag_absent(c, "nodebug"); + return include_debug ? 
1 : 0; + } return -1; case NameConstant_kind: { PyObject *o = e->v.NameConstant.value; @@ -5453,5 +5470,5 @@ PyAPI_FUNC(PyCodeObject *) PyAST_Compile(mod_ty mod, const char *filename, PyCompilerFlags *flags, PyArena *arena) { - return PyAST_CompileEx(mod, filename, flags, -1, arena); + return PyAST_CompileEx(mod, filename, flags, NULL, arena); } diff --git a/Python/pythonrun.c b/Python/pythonrun.c index f31b3ee5a5..0225a9c382 100644 --- a/Python/pythonrun.c +++ b/Python/pythonrun.c @@ -976,7 +976,7 @@ run_mod(mod_ty mod, PyObject *filename, PyObject *globals, PyObject *locals, { PyCodeObject *co; PyObject *v; - co = PyAST_CompileObject(mod, filename, flags, -1, arena); + co = PyAST_CompileObject(mod, filename, flags, NULL, arena); if (co == NULL) return NULL; v = PyEval_EvalCode((PyObject*)co, globals, locals); @@ -1023,7 +1023,7 @@ run_pyc_file(FILE *fp, const char *filename, PyObject *globals, PyObject * Py_CompileStringObject(const char *str, PyObject *filename, int start, - PyCompilerFlags *flags, int optimize) + PyCompilerFlags *flags, PyObject *optimizations) { PyCodeObject *co; mod_ty mod; @@ -1041,20 +1041,20 @@ Py_CompileStringObject(const char *str, PyObject *filename, int start, PyArena_Free(arena); return result; } - co = PyAST_CompileObject(mod, filename, flags, optimize, arena); + co = PyAST_CompileObject(mod, filename, flags, optimizations, arena); PyArena_Free(arena); return (PyObject *)co; } PyObject * Py_CompileStringExFlags(const char *str, const char *filename_str, int start, - PyCompilerFlags *flags, int optimize) + PyCompilerFlags *flags, PyObject *optimizations) { PyObject *filename, *co; filename = PyUnicode_DecodeFSDefault(filename_str); if (filename == NULL) return NULL; - co = Py_CompileStringObject(str, filename, start, flags, optimize); + co = Py_CompileStringObject(str, filename, start, flags, optimizations); Py_DECREF(filename); return co; } @@ -1523,7 +1523,7 @@ PyRun_SimpleString(const char *s) PyAPI_FUNC(PyObject *) Py_CompileString(const char *str, const char *p, int s) { - return Py_CompileStringExFlags(str, p, s, NULL, -1); + return Py_CompileStringExFlags(str, p, s, NULL, NULL); } #undef Py_CompileStringFlags @@ -1531,7 +1531,7 @@ PyAPI_FUNC(PyObject *) Py_CompileStringFlags(const char *str, const char *p, int s, PyCompilerFlags *flags) { - return Py_CompileStringExFlags(str, p, s, flags, -1); + return Py_CompileStringExFlags(str, p, s, flags, NULL); } #undef PyRun_InteractiveOne diff --git a/Python/sysmodule.c b/Python/sysmodule.c index ab435c8310..2e325ad031 100644 --- a/Python/sysmodule.c +++ b/Python/sysmodule.c @@ -1481,6 +1481,31 @@ list_builtin_module_names(void) return list; } +static PyObject *optimizations = NULL; + +void +PySys_SetOptimizations(PyObject *options) +{ + if (optimizations == NULL) + optimizations = PySet_New(NULL); + + Py_ssize_t i; + int size = PySet_GET_SIZE(options); + for (i = 0; i < size; i++) { + PySet_Add(optimizations, PySet_Pop(options)); + } +} + +PyObject * +PySys_GetOptimizations(void) +{ + if (optimizations == NULL || !PySet_Check(optimizations)) { + Py_XDECREF(optimizations); + optimizations = PySet_New(NULL); + } + return optimizations; +} + static PyObject *warnoptions = NULL; void From solipsis at pitrou.net Thu Sep 28 15:02:48 2017 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 28 Sep 2017 21:02:48 +0200 Subject: [Python-ideas] Changes to the existing optimization levels References: Message-ID: <20170928210248.65a06e4b@fsol> On Thu, 28 Sep 2017 12:48:15 -0600 Diana Clarke wrote: > > 2) Added a new 
command line option N that allows you to specify > any number of individual optimization flags. > > For example: > > python -N nodebug -N noassert -N nodocstring We could instead reuse the existing -X option, which allows for free-form implementation-specific flags. > I'm also not certain if the various compile signatures are even open > for change (int optimize => PyObject *optimizations), or if that's a > no-no. You probably want to keep the existing signatures for compatibility: - in C, add new APIs with the new convention - in Python, add a new (optional) function argument for the new convention Regards Antoine. From levkivskyi at gmail.com Thu Sep 28 18:04:48 2017 From: levkivskyi at gmail.com (Ivan Levkivskyi) Date: Fri, 29 Sep 2017 00:04:48 +0200 Subject: [Python-ideas] PEP 560 (second post) In-Reply-To: References: Message-ID: On 28 September 2017 at 08:27, Nick Coghlan wrote: > On 27 September 2017 at 19:28, Ivan Levkivskyi > wrote: > > If an object that is not a class object appears in the bases of a class > > definition, the ``__subclass_base__`` is searched on it. If found, > > it is called with the original tuple of bases as an argument. If the > result > > of the call is not ``None``, then it is substituted instead of this > object. > > Otherwise (if the result is ``None``), the base is just removed. This is > > necessary to avoid inconsistent MRO errors, that are currently prevented > by > > manipulations in ``GenericMeta.__new__``. After creating the class, > > the original bases are saved in ``__orig_bases__`` (currently this is > also > > done by the metaclass). > > How would you feel about calling it "__mro_entry__", as a mnemonic for > "the substitute entry to use instead of this object when calculating a > subclass MRO"? > > I don't have any preferences for the name, __mro_entry__ sounds equally OK to me. I think the other thing that needs to be clarified is whether or not > the actual metaclass can expect to receive an already-resolved > sequence of MRO entries as its list of bases, or if it will need to > repeat the base resolution process executed while figuring out the > metaclass. > > There are three points for discussion here: 1) It is necessary to make the bases resolution soon, before the metaclass is calculated. This is why I do this at the beginning of __build_class__ in the reference implementation. 2) Do we need to update type.__new__ to be able to accept non-classes as bases? I think no. One might be a bit surprised that class C(Iterable[int]): pass works, but type('C', (Iterable[int],), {}) fails with a metaclass conflict, but I think it is natural that static typing and dynamic class creation should not be used together. I propose to update ``type.__new__`` to just give a better error message explaining this. 3) Do we need to update types.new_class and types.prepare_class? Here I am not sure. These functions are rather utility functions and are designed to mimic in Python what __build_class__ does in C. I think we might add types._update_bases that does the same as its C counterpart. Then we can update types.new_class and types.prepare_class like you proposed, this will preserve their current API while types.new_class will match behaviour of __build_class__ If you and others agree with this, then I will update the PEP text and the reference implementation. Thanks for comments! -- Ivan -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From victor.stinner at gmail.com Thu Sep 28 18:09:33 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Fri, 29 Sep 2017 00:09:33 +0200 Subject: [Python-ideas] Changes to the existing optimization levels In-Reply-To: References: Message-ID: > 2) Added a new command line option N that allows you to specify > any number of individual optimization flags. > > For example: > > python -N nodebug -N noassert -N nodocstring You may want to look at my PEP 511 which proposes to add a new "-o" option to specify a list of optimizations: https://www.python.org/dev/peps/pep-0511/#changes The PEP proposes to add a new sys.implementation.optim_tag which is used to generated the .pyc filename. Victor From diana.joan.clarke at gmail.com Thu Sep 28 18:28:40 2017 From: diana.joan.clarke at gmail.com (Diana Clarke) Date: Thu, 28 Sep 2017 16:28:40 -0600 Subject: [Python-ideas] Changes to the existing optimization levels In-Reply-To: References: Message-ID: Yup. I referenced your pep a few times in a previous off-list email, but I omitted that paragraph from this pass because I was using it to bolster the previous "level 3" idea (which didn't fly). """ This simple approach to new optimization levels also appears to be inline with the direction Victor Stinner is going in for PEP 511 - "API for code transformers" [1]. More specifically, in the "Optimizer tag" section [2] of that PEP Victor proposes adding a new -o OPTIM_TAG command line option that defaults to "opt" for the existing optimizations, but would also let you to swap in custom bytecode transformers (like "fat" in his examples [3]). Assuming I understood that correctly ;) os.cpython-36.fat-0.pyc os.cpython-36.fat-1.pyc os.cpython-36.fat-2.pyc [1] https://www.python.org/dev/peps/pep-0511/ [2] https://www.python.org/dev/peps/pep-0511/#optimizer-tag [3] https://www.python.org/dev/peps/pep-0511/#examples """ Thanks for taking the time to respond (you too Antoine). Cheers, --diana On Thu, Sep 28, 2017 at 4:09 PM, Victor Stinner wrote: >> 2) Added a new command line option N that allows you to specify >> any number of individual optimization flags. >> >> For example: >> >> python -N nodebug -N noassert -N nodocstring > > You may want to look at my PEP 511 which proposes to add a new "-o" > option to specify a list of optimizations: > https://www.python.org/dev/peps/pep-0511/#changes > > The PEP proposes to add a new sys.implementation.optim_tag which is > used to generated the .pyc filename. > > Victor From diana.joan.clarke at gmail.com Fri Sep 29 00:17:44 2017 From: diana.joan.clarke at gmail.com (Diana Clarke) Date: Thu, 28 Sep 2017 22:17:44 -0600 Subject: [Python-ideas] Changes to the existing optimization levels In-Reply-To: References: Message-ID: Perhaps I should be a bit clearer. When I said the "level 3" approach "appears to be inline with the direction Victor Stinner is going in for PEP 511", it was mostly at a superficial level. Meaning: - PEP 511 still appears to use integer (unnamed) optimization levels for alternate transformers (fat 0, 1, and 2). I assumed (perhaps incorrectly) that you could provide a list of transformers ("opt,fat,bar") but that each transformer would still contain a number of different off/on toggles, arbitrarily identified as integer flags like 0, 1, and 2. I should go back and read that PEP again. I don't recall seeing where the 0, 1, and 2 came from in the fat examples. 
    os.cpython-36.fat-0.pyc
    os.cpython-36.fat-1.pyc
    os.cpython-36.fat-2.pyc

- Secondly, I reviewed PEP 511 when I initially started working on the
naive "level 3" approach to make sure what I proposed didn't impede the
progress of PEP 511 (or more realistically make my attempt obsolete).
Since PEP 511 didn't seem to deviate much from the current integer flags
(aside from allowing multiple different named sets of integer flags), I
figured that whatever approach PEP 511 took with the existing optimization
levels (0, 1, and 2) would presumably also work for a new level 3.

I hope that makes sense... If not, let me know & I'll try again tomorrow
to be clearer.

PS. I think it sounds like I'm now re-advocating for the simple "level 3"
approach. I'm not -- just trying to explain my earlier thought process.
I'm open to all kinds of feedback & suggestions.

Thanks again folks!

Cheers,

--diana

From w00t at fb.com  Fri Sep 29 01:53:58 2017
From: w00t at fb.com (Wren Turkal)
Date: Fri, 29 Sep 2017 05:53:58 +0000
Subject: [Python-ideas] allow overriding files used for the input builtin
Message-ID:

Hi there,

I have posted an idea for improvement with a PR of an implementation to
https://bugs.python.org/issue31603.

The basic idea is to add fin, fout, and ferr file object parameters and
default to using what is used today when the args are not specified. I
believe this would be useful for capturing input from, and sending output
to, specific files when using input(). The input builtin has some logic to
use readline if it's available. It would be nice to be able to use this
same logic no matter what files are being used for input/output.

This is meant to turn code like the following:

    orig_stdin = sys.stdin
    orig_stdout = sys.stdout

    with open('/dev/tty', 'r+') as f:
        sys.stdin = f
        sys.stdout = f
        name = input('Name? ')

    sys.stdin = orig_stdin
    sys.stdout = orig_stdout
    print(name)

into something more like this:

    with open('/dev/tty', 'r+') as f:
        name = input('Name? ', fin=f, fout=f)

    print(name)

It's nice that it removes the need to assign to global variables just to
change the files used for input/output.

I had this idea the other day, and I realized that it would be super easy
to implement, so I went ahead and threw up a PR also.

Would love to see if anyone else is interested in this. I think it's
pretty cool that the core logic really didn't need to be changed other
than plumbing in the new args.

FWIW, this change introduces no regressions and adds a few more tests to
test the new functionality. Honestly, I think this functionality could
probably be used to simplify some of the other tests as well, but I wanted
to gauge what folks thought of the change before going farther.

Wren Turkal
Existential Production Engineer of the Ages
Facebook, Inc.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From amit.mixie at gmail.com  Fri Sep 29 02:18:18 2017
From: amit.mixie at gmail.com (Amit Green)
Date: Fri, 29 Sep 2017 02:18:18 -0400
Subject: [Python-ideas] allow overriding files used for the input builtin
In-Reply-To:
References:
Message-ID:

I'm fine with the general idea of extra keyword parameters to the input
function. A few points:

Your example code needs a try/finally to match what input() with the new
parameters does -- and yes, it's way nicer to be able to use it as in the
example you have shown than to play games with try/finally (personally, I
also refuse to ever change sys.stdin or sys.stdout, as I consider that bad
coding style).
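For example, to be robust the manual redirection version really needs
something along these lines (just a sketch of the point above):

    import sys

    orig_stdin, orig_stdout = sys.stdin, sys.stdout
    try:
        with open('/dev/tty', 'r+') as f:
            sys.stdin = sys.stdout = f
            name = input('Name? ')
    finally:
        # restore the globals even if input() raises (e.g. EOFError)
        sys.stdin, sys.stdout = orig_stdin, orig_stdout

    print(name)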
Mostly though I would like to ask, please do not name keyword arguments with names like 'fin' & 'fout'. This is almost unreadable and make's code almost indecipherable to others the first time they see the function & its keyword arguments (First impressions are very important). Both a function name & its keyword parameters need to be as understandable as possible when a user encounters them for the first time. On Fri, Sep 29, 2017 at 1:53 AM, Wren Turkal wrote: > Hi there, > > > I have posted an idea for improvement with a PR of an implementation to > https://bugs.python.org/issue31603. > > > The basic idea is to add fin, fout, and ferr file object parameters and > default to using what is used today when the args are not specified. I > believe this would be useful to allow captures input and send output to > specific files when using input. The input builtin has some logic to use > readline if it's available. It would be nice to be able to use this same > logic no matter what files are being used for input/output. > > > This is meant to turn code like the following: > > orig_stdin = sys.stdin > > orig_stdout = sys.stdout > > with open('/dev/tty', 'r+') as f: > > sys.stdin = f > > sys.stdout = f > > name = input('Name? ') > > sys.stdin = orig_stdin > > sys.stdout = orig_stdout > > print(name) > > > into something more like this: > > with open('/dev/tty', 'r+') as f: > > name = input('Name? ', fin=f, fout=f) > > print(name) > > > It's nice that it makes the assignment to a global variable to change the > file used for input/output to no longer be needed. > > > I had this idea the other day, and I realized that it would be super easy > to implement, so I went ahead the threw up a PR also. > > > Would love to see if anyone else is interested in this. I think it's > pretty cool that the core logic really didn't need to be changed other than > plumbing in the new args. > > > FWIW, this change introduces no regressions and adds a few more tests to > test the new functionality. Honestly, I think this functionality could > probably be used to simplify some of the other tests as well, but I wanted > to gauge what folks thought of the change before going farther. > > > Wren Turkal > > Existential Production Engineer of the Ages > > Facebook, Inc. > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Fri Sep 29 02:57:47 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 29 Sep 2017 16:57:47 +1000 Subject: [Python-ideas] PEP 560 (second post) In-Reply-To: References: Message-ID: On 29 September 2017 at 08:04, Ivan Levkivskyi wrote: > On 28 September 2017 at 08:27, Nick Coghlan wrote: >> >> On 27 September 2017 at 19:28, Ivan Levkivskyi >> wrote: >> > If an object that is not a class object appears in the bases of a class >> > definition, the ``__subclass_base__`` is searched on it. If found, >> > it is called with the original tuple of bases as an argument. If the >> > result >> > of the call is not ``None``, then it is substituted instead of this >> > object. >> > Otherwise (if the result is ``None``), the base is just removed. This is >> > necessary to avoid inconsistent MRO errors, that are currently prevented >> > by >> > manipulations in ``GenericMeta.__new__``. 
After creating the class, >> > the original bases are saved in ``__orig_bases__`` (currently this is >> > also >> > done by the metaclass). >> >> How would you feel about calling it "__mro_entry__", as a mnemonic for >> "the substitute entry to use instead of this object when calculating a >> subclass MRO"? > > I don't have any preferences for the name, __mro_entry__ sounds equally OK > to me. I'd propose changing it then, as searching for "Python mro entry" is likely to get people to the right place faster than searching for "Python subclass base". >> I think the other thing that needs to be clarified is whether or not >> the actual metaclass can expect to receive an already-resolved >> sequence of MRO entries as its list of bases, or if it will need to >> repeat the base resolution process executed while figuring out the >> metaclass. >> > > There are three points for discussion here: > > 1) It is necessary to make the bases resolution soon, before the metaclass > is calculated. This is why I do this at the beginning of __build_class__ in > the > reference implementation. Indeed. > 2) Do we need to update type.__new__ to be able to accept non-classes as > bases? > I think no. One might be a bit surprised that > > class C(Iterable[int]): > pass > > works, but > > type('C', (Iterable[int],), {}) > > fails with a metaclass conflict, but I think it is natural that static > typing and dynamic > class creation should not be used together. I propose to update > ``type.__new__`` to just give > a better error message explaining this. +1 from me, since that avoids ever resolving the list of bases twice. > 3) Do we need to update types.new_class and types.prepare_class? > Here I am not sure. These functions are rather utility functions and are > designed to > mimic in Python what __build_class__ does in C. I think we might add > types._update_bases > that does the same as its C counterpart. Then we can update types.new_class > and types.prepare_class > like you proposed, this will preserve their current API while > types.new_class will match behaviour of __build_class__ Your suggestion for `types.__new__` gave me a different idea: what if `types.prepare_class` *also* just raised an error when given a non-class as a nominal base class? Then if we added `types.resolve_bases` as a public API, a full reimplementation of `types.new_class` would now look like: resolved_bases = types.resolve_bases(bases) mcl, ns, updated_kwds = types.prepare_class(name, resolved_bases, kwds) exec_body(ns) ns["__orig_bases__"] = bases mcl(name, resolved_bases, ns, **updated_kwds) That way, `types.new_class` would transparently switch to the new behaviour, while clients of any other dynamic type creation API could do their own base class resolution, even if the type creation API they were using didn't implicitly support MRO entry resolution. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Fri Sep 29 03:33:11 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 29 Sep 2017 17:33:11 +1000 Subject: [Python-ideas] Changes to the existing optimization levels In-Reply-To: <20170928210248.65a06e4b@fsol> References: <20170928210248.65a06e4b@fsol> Message-ID: On 29 September 2017 at 05:02, Antoine Pitrou wrote: > On Thu, 28 Sep 2017 12:48:15 -0600 > Diana Clarke > wrote: >> >> 2) Added a new command line option N that allows you to specify >> any number of individual optimization flags. 
>> >> For example: >> >> python -N nodebug -N noassert -N nodocstring > > We could instead reuse the existing -X option, which allows for > free-form implementation-specific flags. And declaring named optimisation flags to be implementation dependent is likely a good way to go. The one downside is that it would mean there was no formally interpreter independent way of requesting the "nodebug,nodocstring" configuration, but informal conventions around particular uses of "-X" may be sufficient for that purpose. >> I'm also not certain if the various compile signatures are even open >> for change (int optimize => PyObject *optimizations), or if that's a >> no-no. > > You probably want to keep the existing signatures for compatibility: > - in C, add new APIs with the new convention > - in Python, add a new (optional) function argument for the new > convention This approach should also reduce the overall amount of code churn, since any CPython (or external) code currently passing "optimize=-1" won't need to change at all: that already says "get the optimization settings from the interpreter state", so it will pick up any changes to how that configuration works "for free". That said, we may also want to consider a couple of other options related to changing the meaning of *existing* parameters to these APIs: 1. We have the PyCompilerFlags struct that's currently only used to pass around feature flags for the __future__ module. It could gain a second bitfield for optimisation options 2. We could reinterpret "optimize" as a bitfield instead of a regular integer, special casing the already defined values: - all zero: no optimizations - sign bit set: negative -> use global settings - 0x0001: nodebug+noassert - 0x0002: nodebug+noassert+nodocstrings - 0x0004: nodebug - 0x0008: noassert - 0x0010: nodocstrings The "redefine optimizations as a bitfield" approach seems particularly promising to me - it's a full integer, so even with all negative numbers disallowed and the two low order bits reserved for the legacy combinations, that's still 29 different optimisation flags given 32-bit integers. We currently have 3, so that's room for an 866% increase in the number of defined flags :) The opt-N values in pyc files would be somewhat cryptic-to-humans, but still relatively easy to translate back to readable strings given the bitfield values, and common patterns (like 0x14 -> 20 for nodebug+nodocstrings) would likely become familiar pretty quickly. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From victor.stinner at gmail.com Fri Sep 29 04:14:39 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Fri, 29 Sep 2017 10:14:39 +0200 Subject: [Python-ideas] PEP 560 (second post) In-Reply-To: References: Message-ID: > Title: Core support for generic types Would it be possible to mention "typing" somewhere in the title? If you don't know the context, it's hard to understand that the PEP is related to type annotation and type checks. At least just from the title. Victor From steve at pearwood.info Fri Sep 29 06:25:04 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 29 Sep 2017 20:25:04 +1000 Subject: [Python-ideas] allow overriding files used for the input builtin In-Reply-To: References: Message-ID: <20170929102504.GW13110@ando.pearwood.info> On Fri, Sep 29, 2017 at 05:53:58AM +0000, Wren Turkal wrote: [...] > The basic idea is to add fin, fout, and ferr file object parameters > and default to using what is used today when the args are not > specified. 
I believe this would be useful to allow captures input and > send output to specific files when using input. The input builtin has > some logic to use readline if it's available. It would be nice to be > able to use this same logic no matter what files are being used for > input/output. I've done the whole "set stdout and stdin, call input, reset stdout and stdin" dance, and so I like this concept of being able to easily use input() non-interactively. I think your basic idea is a good one. I wonder what you think the ferr parameter will do? As far as I know, input() doesn't use stderr. I also don't think much of your parameter names. They strike me as very C-like, rather than Pythonic. So my proposal is: input(prompt, *, infile, outfile) where infile must be a file-like object with a read() method, suitable for replacing stdin, and outfile must be a file-like object with a write() method suitable for replacing stdout. > This is meant to turn code like the following: > > orig_stdin = sys.stdin > orig_stdout = sys.stdout > with open('/dev/tty', 'r+') as f: > sys.stdin = f > sys.stdout = f > name = input('Name? ') > > sys.stdin = orig_stdin > sys.stdout = orig_stdout > print(name) For production use, that should be wrapped in a try...finally: try: ... finally: sys.stdin = orig_stdin sys.stdout = orig_stdout > into something more like this: > > with open('/dev/tty', 'r+') as f: > name = input('Name? ', fin=f, fout=f) > > print(name) I like it very much. But as an alternative, perhaps all we really need is a context manager to set the std* files: with open('/dev/tty', 'r+') as f: with stdio(stdin=f, stdout=f): name = input('Name? ') print(name) That's nearly as nice, and is possibly useful in more situations. Or maybe we should have both? [...] > Would love to see if anyone else is interested in this. I think it's > pretty cool that the core logic really didn't need to be changed other > than plumbing in the new args. Definitely interested! -- Steve From p.f.moore at gmail.com Fri Sep 29 06:34:26 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Fri, 29 Sep 2017 11:34:26 +0100 Subject: [Python-ideas] allow overriding files used for the input builtin In-Reply-To: <20170929102504.GW13110@ando.pearwood.info> References: <20170929102504.GW13110@ando.pearwood.info> Message-ID: On 29 September 2017 at 11:25, Steven D'Aprano wrote: > I like it very much. > > But as an alternative, perhaps all we really need is a context manager > to set the std* files: > > with open('/dev/tty', 'r+') as f: > with stdio(stdin=f, stdout=f): > name = input('Name? ') > > print(name) > > > That's nearly as nice, and is possibly useful in more situations. Or > maybe we should have both? Agreed - a general way of redirecting stdio would be more generally useful - I've often replaced stdio, but as far as I can recall never for input(). There's already contextlib.redirect_stdout() and contextlib.redirect_stderr(). Adding contextlib.redirect_stdin() would be logical, but I think a more flexible contextlib.redirect_stdio(stdin=None, stdout=None, stderr=None) would be better - where None (the default) means "leave this alone". >> Would love to see if anyone else is interested in this. I think it's >> pretty cool that the core logic really didn't need to be changed other >> than plumbing in the new args. > > Definitely interested! I'm interested in the general context manager. I don't use input() enough to be particularly interested in a solution specific to that function. 
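A rough sketch of what such a combined helper could look like, following the None-means-leave-this-stream-alone convention just described. Note that redirect_stdio as written here is hypothetical; only redirect_stdout and redirect_stderr exist in contextlib today:

    import contextlib
    import sys

    @contextlib.contextmanager
    def redirect_stdio(stdin=None, stdout=None, stderr=None):
        # Hypothetical helper: swap any of the std* streams for the
        # duration of the with-block; None means "leave this one alone".
        saved = (sys.stdin, sys.stdout, sys.stderr)
        if stdin is not None:
            sys.stdin = stdin
        if stdout is not None:
            sys.stdout = stdout
        if stderr is not None:
            sys.stderr = stderr
        try:
            yield
        finally:
            sys.stdin, sys.stdout, sys.stderr = saved

    # Usage, mirroring the earlier /dev/tty example:
    # with open('/dev/tty', 'r+') as f:
    #     with redirect_stdio(stdin=f, stdout=f):
    #         name = input('Name? ')
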
Paul From storchaka at gmail.com Fri Sep 29 06:45:05 2017 From: storchaka at gmail.com (Serhiy Storchaka) Date: Fri, 29 Sep 2017 13:45:05 +0300 Subject: [Python-ideas] allow overriding files used for the input builtin In-Reply-To: References: Message-ID: 29.09.17 08:53, Wren Turkal ????: > This is meant to turn code like the following: > > orig_stdin = sys.stdin > > orig_stdout = sys.stdout > > with open('/dev/tty', 'r+') as f: > > ? ? sys.stdin = f > > ? ? sys.stdout = f > > ? ? name = input('Name? ') > > sys.stdin = orig_stdin > > sys.stdout = orig_stdout > > print(name) > > > into something more like this: > > with open('/dev/tty', 'r+') as f: > > ? ? name = input('Name? ', fin=f, fout=f) > > print(name) Why not use just the following two lines? f.write('Name? ') name = f.readline() This falls to me in the category "not every two lines of the code should be added as a builtin". From drekin at gmail.com Fri Sep 29 06:56:04 2017 From: drekin at gmail.com (=?UTF-8?B?QWRhbSBCYXJ0b8Wh?=) Date: Fri, 29 Sep 2017 12:56:04 +0200 Subject: [Python-ideas] allow overriding files used for the input builtin Message-ID: Hello, it seems that there are two independent issues ? a way to temporarily replace all sys.std* streams, and a way to use the special interactive readline logic for arbitrary terminal-like file. I thought that OP's concern was the latter. In that case shouldn't there just be a way to produce an IO object f whose .readline() (or other special method) does this and has an optional prompt argument? The OP's code would become with open_terminal('/dev/tty') as f: name = f.readline(prompt='Name? ') It seems to me that changing the rather specialized input() function is wrong in both cases. For me input() is just a short way to read from the standard input, usually interactively ? the same way print is a short way to write to the standard output. IIRC there was even proposal to remove input() in favor of direct use of sys.stdin.readline() at time of Python 3000 redesign. Regards, Adam Barto? -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Fri Sep 29 07:40:04 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 29 Sep 2017 21:40:04 +1000 Subject: [Python-ideas] allow overriding files used for the input builtin In-Reply-To: References: Message-ID: <20170929114004.GX13110@ando.pearwood.info> On Fri, Sep 29, 2017 at 01:45:05PM +0300, Serhiy Storchaka wrote: > Why not use just the following two lines? > > f.write('Name? ') > name = f.readline() Because the two-liner doesn't do what input() does. Testing it at the interactive interpreter gives me: py> def myinput(): ... sys.stdout.write("Name? ") ... return sys.stdin.readline() ... py> x = myinput() Steve py> ? py> The output isn't displayed until the input is entered, and then the prompt messes it up. Admittedly this isn't likely to be an issue if you're redirecting to another file, but it demonstrates that your suggested replacement is not equivalent to the feature request. There's no support for arrow keys, even when readline is available: My name^[[D^[[C^[[D > This falls to me in the category "not every two lines of the code should > be added as a builtin". The built-in already exists. This is making it more useful, just like adding file to print made print more useful. 
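For comparison, print() already gained this kind of flexibility through its file, end and flush keyword arguments; the input() call shown in the comment below is the signature proposed in this thread, not something current Python accepts:

    import sys

    # print() already accepts an explicit output stream:
    print('Name? ', end='', file=sys.stderr, flush=True)

    # The proposal here is the analogous thing for input(), roughly:
    #     name = input('Name? ', infile=f, outfile=f)
    # (hypothetical; today input() takes only the prompt argument)
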
-- Steve From storchaka at gmail.com Fri Sep 29 08:19:49 2017 From: storchaka at gmail.com (Serhiy Storchaka) Date: Fri, 29 Sep 2017 15:19:49 +0300 Subject: [Python-ideas] allow overriding files used for the input builtin In-Reply-To: <20170929114004.GX13110@ando.pearwood.info> References: <20170929114004.GX13110@ando.pearwood.info> Message-ID: 29.09.17 14:40, Steven D'Aprano ????: > Because the two-liner doesn't do what input() does. Testing it at the > interactive interpreter gives me: > > py> def myinput(): > .... sys.stdout.write("Name? ") > .... return sys.stdin.readline() > .... > py> x = myinput() > Steve > py> ? py> > > The output isn't displayed until the input is entered, and then the > prompt messes it up. Well, I forgot about a flush(). Now this is just a three-liner. fout.write('Name? ') fout.flush() name = fin.readline() > There's no support for arrow keys, even when readline is available: > > My name^[[D^[[C^[[D Did you check that arrow keys are supported with the proposed PR? From ned at nedbatchelder.com Fri Sep 29 08:47:04 2017 From: ned at nedbatchelder.com (Ned Batchelder) Date: Fri, 29 Sep 2017 08:47:04 -0400 Subject: [Python-ideas] Changes to the existing optimization levels In-Reply-To: References: Message-ID: On 9/28/17 2:48 PM, Diana Clarke wrote: > Hi folks: > > I was recently looking for an entry-level cpython task to work on in > my spare time and plucked this off of someone's TODO list. > > "Make optimizations more fine-grained than just -O and -OO" > > There are currently three supported optimization levels (0, 1, and 2). > Briefly summarized, they do the following. > > 0: no optimizations > 1: remove assert statements and __debug__ blocks > 2: remove docstrings, assert statements, and __debug__ blocks > > Don't forget that the current "no optimizations" setting actually does peephole optimizations.? Are we considering addressing https://bugs.python.org/issue2506 to make a really truly "no optimizations" option? --Ned. From w00t at fb.com Fri Sep 29 10:50:27 2017 From: w00t at fb.com (Wren Turkal) Date: Fri, 29 Sep 2017 14:50:27 +0000 Subject: [Python-ideas] allow overriding files used for the input builtin In-Reply-To: References: , Message-ID: I am happy to rename the args. What do you think about infile, outfile, and errfile? FWIW, I did consider "in", "out", and "err", but "in" is a keyword, and I didn't think those quite captured the full meaning. wt ________________________________ From: Amit Green Sent: Thursday, September 28, 2017 11:18:18 PM To: Wren Turkal Cc: python-ideas at python.org Subject: Re: [Python-ideas] allow overriding files used for the input builtin I'm fine with the idea in general of extra keyword parameters to the input function. A few points: Your example code, needs try/catch to match what the input with parameters does -- and yes, its way nicer to be able to use it the example you have shown than play games with try/catch (Personally I also refuse to ever change sys.stdin, or sys.stdout, as I consider that a bad coding style). Mostly though I would like to ask, please do not name keyword arguments with names like 'fin' & 'fout'. This is almost unreadable and make's code almost indecipherable to others the first time they see the function & its keyword arguments (First impressions are very important). Both a function name & its keyword parameters need to be as understandable as possible when a user encounters them for the first time. 
On Fri, Sep 29, 2017 at 1:53 AM, Wren Turkal > wrote: Hi there, I have posted an idea for improvement with a PR of an implementation to https://bugs.python.org/issue31603. The basic idea is to add fin, fout, and ferr file object parameters and default to using what is used today when the args are not specified. I believe this would be useful to allow captures input and send output to specific files when using input. The input builtin has some logic to use readline if it's available. It would be nice to be able to use this same logic no matter what files are being used for input/output. This is meant to turn code like the following: orig_stdin = sys.stdin orig_stdout = sys.stdout with open('/dev/tty', 'r+') as f: sys.stdin = f sys.stdout = f name = input('Name? ') sys.stdin = orig_stdin sys.stdout = orig_stdout print(name) into something more like this: with open('/dev/tty', 'r+') as f: name = input('Name? ', fin=f, fout=f) print(name) It's nice that it makes the assignment to a global variable to change the file used for input/output to no longer be needed. I had this idea the other day, and I realized that it would be super easy to implement, so I went ahead the threw up a PR also. Would love to see if anyone else is interested in this. I think it's pretty cool that the core logic really didn't need to be changed other than plumbing in the new args. FWIW, this change introduces no regressions and adds a few more tests to test the new functionality. Honestly, I think this functionality could probably be used to simplify some of the other tests as well, but I wanted to gauge what folks thought of the change before going farther. Wren Turkal Existential Production Engineer of the Ages Facebook, Inc. _______________________________________________ Python-ideas mailing list Python-ideas at python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From diana.joan.clarke at gmail.com Fri Sep 29 10:56:43 2017 From: diana.joan.clarke at gmail.com (Diana Clarke) Date: Fri, 29 Sep 2017 08:56:43 -0600 Subject: [Python-ideas] Changes to the existing optimization levels In-Reply-To: References: Message-ID: I suppose anything is possible ;) Perhaps I'll try my hand at that next. But no, I'm limiting the scope to the existing toggles only (docstrings, __debug__, assert) for this pass. I am aware of that thread though. I read it a few weeks back when I was initially researching the existing implementation and history. Happy Friday, folks! --diana On Fri, Sep 29, 2017 at 6:47 AM, Ned Batchelder wrote: > Don't forget that the current "no optimizations" setting actually does > peephole optimizations. Are we considering addressing > https://bugs.python.org/issue2506 to make a really truly "no optimizations" > option? From amit.mixie at gmail.com Fri Sep 29 11:13:21 2017 From: amit.mixie at gmail.com (Amit Green) Date: Fri, 29 Sep 2017 11:13:21 -0400 Subject: [Python-ideas] allow overriding files used for the input builtin In-Reply-To: References: Message-ID: Yes, infile, outfile & errfile would be consistent with python naming convention (also mentioned by Steven D'Aprano above) One of python's greatest strength is its library, the consistency of the library, and how well documented the library is (in fact, I think the library is a greater strength than even the very nice syntax of python in general). 
By "consistency of the library" I mean: functions do pretty much what you expect, they use consistent error mechanism & the documentation pretty much accurately documents what the function does -- especially as to showing its results & how it handles errors. Regarding this though, this then brings up the question (above from Steven D'Aprano) -- what would the the "errfile" parameter do? - As a general principle of consistency python library functions, and python itself, do not output to errfile, but instead throw errors. - (There are very minor exceptions such as exceptions thrown in __del__ functions; which are caught by python & then printed to standard error). I would thus think you don't want the errfile parameter -- unless it would be for catching these __del__ method that get triggered by input failing (for example your 'infile' parameter when called, allocated an object, which gets deallocated & throws an exception inside of its __del__ method). If this is the purpose, then (back to 'how well documented the library is') -- it should be documented this is the purpose of the "errfile" parameter ;-) [A secondary reason you might want to redirect "errfile" is that the passed in input or output file's, themselves do output to standard error ...] On Fri, Sep 29, 2017 at 10:50 AM, Wren Turkal wrote: > I am happy to rename the args. What do you think about infile, outfile, > and errfile? > > > FWIW, I did consider "in", "out", and "err", but "in" is a keyword, and I > didn't think those quite captured the full meaning. > > > wt > ------------------------------ > *From:* Amit Green > *Sent:* Thursday, September 28, 2017 11:18:18 PM > *To:* Wren Turkal > *Cc:* python-ideas at python.org > *Subject:* Re: [Python-ideas] allow overriding files used for the input > builtin > > I'm fine with the idea in general of extra keyword parameters to the input > function. > > A few points: > > Your example code, needs try/catch to match what the input with parameters > does -- and yes, its way nicer to be able to use it the example you have > shown than play games with try/catch (Personally I also refuse to ever > change sys.stdin, or sys.stdout, as I consider that a bad coding style). > > Mostly though I would like to ask, please do not name keyword arguments > with names like 'fin' & 'fout'. This is almost unreadable and make's code > almost indecipherable to others the first time they see the function & its > keyword arguments (First impressions are very important). > > Both a function name & its keyword parameters need to be as understandable > as possible when a user encounters them for the first time. > > On Fri, Sep 29, 2017 at 1:53 AM, Wren Turkal wrote: > >> Hi there, >> >> >> I have posted an idea for improvement with a PR of an implementation to >> https://bugs.python.org/issue31603 >> >> . >> >> >> The basic idea is to add fin, fout, and ferr file object parameters and >> default to using what is used today when the args are not specified. I >> believe this would be useful to allow captures input and send output to >> specific files when using input. The input builtin has some logic to use >> readline if it's available. It would be nice to be able to use this same >> logic no matter what files are being used for input/output. >> >> >> This is meant to turn code like the following: >> >> orig_stdin = sys.stdin >> >> orig_stdout = sys.stdout >> >> with open('/dev/tty', 'r+') as f: >> >> sys.stdin = f >> >> sys.stdout = f >> >> name = input('Name? 
') >> >> sys.stdin = orig_stdin >> >> sys.stdout = orig_stdout >> >> print(name) >> >> >> into something more like this: >> >> with open('/dev/tty', 'r+') as f: >> >> name = input('Name? ', fin=f, fout=f) >> >> print(name) >> >> >> It's nice that it makes the assignment to a global variable to change the >> file used for input/output to no longer be needed. >> >> >> I had this idea the other day, and I realized that it would be super easy >> to implement, so I went ahead the threw up a PR also. >> >> >> Would love to see if anyone else is interested in this. I think it's >> pretty cool that the core logic really didn't need to be changed other than >> plumbing in the new args. >> >> >> FWIW, this change introduces no regressions and adds a few more tests to >> test the new functionality. Honestly, I think this functionality could >> probably be used to simplify some of the other tests as well, but I wanted >> to gauge what folks thought of the change before going farther. >> >> >> Wren Turkal >> >> Existential Production Engineer of the Ages >> >> Facebook, Inc. >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> >> Code of Conduct: http://python.org/psf/codeofconduct/ >> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From levkivskyi at gmail.com Fri Sep 29 11:17:25 2017 From: levkivskyi at gmail.com (Ivan Levkivskyi) Date: Fri, 29 Sep 2017 17:17:25 +0200 Subject: [Python-ideas] PEP 560 (second post) In-Reply-To: References: Message-ID: On 29 September 2017 at 10:14, Victor Stinner wrote: > > Title: Core support for generic types > > Would it be possible to mention "typing" somewhere in the title? If > you don't know the context, it's hard to understand that the PEP is > related to type annotation and type checks. At least just from the > title. > What do you think about "Core support for typing module and generic types"? Another option is "Runtime mechanism to improve generics and typing module". -- Ivan -------------- next part -------------- An HTML attachment was scrubbed... URL: From levkivskyi at gmail.com Fri Sep 29 11:22:27 2017 From: levkivskyi at gmail.com (Ivan Levkivskyi) Date: Fri, 29 Sep 2017 17:22:27 +0200 Subject: [Python-ideas] PEP 560 (second post) In-Reply-To: References: Message-ID: On 29 September 2017 at 08:57, Nick Coghlan wrote: > On 29 September 2017 at 08:04, Ivan Levkivskyi > wrote: > >> How would you feel about calling it "__mro_entry__", as a mnemonic for > >> "the substitute entry to use instead of this object when calculating a > >> subclass MRO"? > > > > I don't have any preferences for the name, __mro_entry__ sounds equally > OK > > to me. > > I'd propose changing it then, as searching for "Python mro entry" is > likely to get people to the right place faster than searching for > "Python subclass base". > > OK, will do. > > I propose to update > > ``type.__new__`` to just give > > a better error message explaining this. > > +1 from me, since that avoids ever resolving the list of bases twice. > > OK, I will update the reference implementation. > > 3) Do we need to update types.new_class and types.prepare_class? > > Here I am not sure. These functions are rather utility functions and are > > designed to > > mimic in Python what __build_class__ does in C. I think we might add > > types._update_bases > > that does the same as its C counterpart. 
Then we can update > types.new_class > > and types.prepare_class > > like you proposed, this will preserve their current API while > > types.new_class will match behaviour of __build_class__ > > Your suggestion for `types.__new__` gave me a different idea: what if > `types.prepare_class` *also* just raised an error when given a > non-class as a nominal base class? > > Then if we added `types.resolve_bases` as a public API, a full > reimplementation of `types.new_class` would now look like: > > resolved_bases = types.resolve_bases(bases) > mcl, ns, updated_kwds = types.prepare_class(name, resolved_bases, kwds) > exec_body(ns) > ns["__orig_bases__"] = bases > mcl(name, resolved_bases, ns, **updated_kwds) > > That way, `types.new_class` would transparently switch to the new > behaviour, while clients of any other dynamic type creation API could > do their own base class resolution, even if the type creation API they > were using didn't implicitly support MRO entry resolution. > > Yes, makes sense. I no one is against adding new public ``types.resolve_bases`` then I will add this to the PEP. -- Ivan -------------- next part -------------- An HTML attachment was scrubbed... URL: From w00t at fb.com Fri Sep 29 12:01:11 2017 From: w00t at fb.com (Wren Turkal) Date: Fri, 29 Sep 2017 16:01:11 +0000 Subject: [Python-ideas] allow overriding files used for the input builtin In-Reply-To: References: , Message-ID: Steven and Amit, I originally configured the mailing list for digest delivery and can't reply directly to his message. However, I now seen it, I will update the PR with the suggested name changes as soon as my "make test" finishes. FWIW, I've changed to direct message delivery. With regard to ferr (the corresponding variable in the builtin_input c function before the change), I saw this code: /* First of all, flush stderr */ tmp = _PyObject_CallMethodId(ferr, &PyId_flush, NULL); if (tmp == NULL) PyErr_Clear(); else Py_DECREF(tmp); And I assumed that it was important to be able to override a stderr as a result. I think there are few options here to resolve that: 1. Remove the errfile parameter and just do what happened before. 2. Remove the errfile and the above code (FWIW, I am not sure I understand the importance of flushing stderr before taking input). 3. Document the explicit purpose of the errfile. Amit, you'd responded with something about this (again digest reply, sorry). I am not sure how to concisely do that given Amit's descriptions. ? I am honestly leaning toward 2 unless I can figure out why the flushing of stderr is actually needed. wt ________________________________ From: Amit Green Sent: Friday, September 29, 2017 8:13:21 AM To: Wren Turkal Cc: python-ideas at python.org Subject: Re: [Python-ideas] allow overriding files used for the input builtin Yes, infile, outfile & errfile would be consistent with python naming convention (also mentioned by Steven D'Aprano above) One of python's greatest strength is its library, the consistency of the library, and how well documented the library is (in fact, I think the library is a greater strength than even the very nice syntax of python in general). By "consistency of the library" I mean: functions do pretty much what you expect, they use consistent error mechanism & the documentation pretty much accurately documents what the function does -- especially as to showing its results & how it handles errors. Regarding this though, this then brings up the question (above from Steven D'Aprano) -- what would the the "errfile" parameter do? 
* As a general principle of consistency python library functions, and python itself, do not output to errfile, but instead throw errors. * (There are very minor exceptions such as exceptions thrown in __del__ functions; which are caught by python & then printed to standard error). I would thus think you don't want the errfile parameter -- unless it would be for catching these __del__ method that get triggered by input failing (for example your 'infile' parameter when called, allocated an object, which gets deallocated & throws an exception inside of its __del__ method). If this is the purpose, then (back to 'how well documented the library is') -- it should be documented this is the purpose of the "errfile" parameter ;-) [A secondary reason you might want to redirect "errfile" is that the passed in input or output file's, themselves do output to standard error ...] On Fri, Sep 29, 2017 at 10:50 AM, Wren Turkal > wrote: I am happy to rename the args. What do you think about infile, outfile, and errfile? FWIW, I did consider "in", "out", and "err", but "in" is a keyword, and I didn't think those quite captured the full meaning. wt ________________________________ From: Amit Green > Sent: Thursday, September 28, 2017 11:18:18 PM To: Wren Turkal Cc: python-ideas at python.org Subject: Re: [Python-ideas] allow overriding files used for the input builtin I'm fine with the idea in general of extra keyword parameters to the input function. A few points: Your example code, needs try/catch to match what the input with parameters does -- and yes, its way nicer to be able to use it the example you have shown than play games with try/catch (Personally I also refuse to ever change sys.stdin, or sys.stdout, as I consider that a bad coding style). Mostly though I would like to ask, please do not name keyword arguments with names like 'fin' & 'fout'. This is almost unreadable and make's code almost indecipherable to others the first time they see the function & its keyword arguments (First impressions are very important). Both a function name & its keyword parameters need to be as understandable as possible when a user encounters them for the first time. On Fri, Sep 29, 2017 at 1:53 AM, Wren Turkal > wrote: Hi there, I have posted an idea for improvement with a PR of an implementation to https://bugs.python.org/issue31603. The basic idea is to add fin, fout, and ferr file object parameters and default to using what is used today when the args are not specified. I believe this would be useful to allow captures input and send output to specific files when using input. The input builtin has some logic to use readline if it's available. It would be nice to be able to use this same logic no matter what files are being used for input/output. This is meant to turn code like the following: orig_stdin = sys.stdin orig_stdout = sys.stdout with open('/dev/tty', 'r+') as f: sys.stdin = f sys.stdout = f name = input('Name? ') sys.stdin = orig_stdin sys.stdout = orig_stdout print(name) into something more like this: with open('/dev/tty', 'r+') as f: name = input('Name? ', fin=f, fout=f) print(name) It's nice that it makes the assignment to a global variable to change the file used for input/output to no longer be needed. I had this idea the other day, and I realized that it would be super easy to implement, so I went ahead the threw up a PR also. Would love to see if anyone else is interested in this. I think it's pretty cool that the core logic really didn't need to be changed other than plumbing in the new args. 
FWIW, this change introduces no regressions and adds a few more tests to test the new functionality. Honestly, I think this functionality could probably be used to simplify some of the other tests as well, but I wanted to gauge what folks thought of the change before going farther. Wren Turkal Existential Production Engineer of the Ages Facebook, Inc. _______________________________________________ Python-ideas mailing list Python-ideas at python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From amit.mixie at gmail.com Fri Sep 29 12:17:39 2017 From: amit.mixie at gmail.com (Amit Green) Date: Fri, 29 Sep 2017 12:17:39 -0400 Subject: [Python-ideas] allow overriding files used for the input builtin In-Reply-To: References: Message-ID: Hmm, very good point. Flushing standard error is essential: - This is because output of standard output & standard error are often redirected to the same file -- frequently the terminal -- and its important to flush one of them before output to the other (since these are typically line buffered you only see this if you output partial lines). - Here is an example of not flushing & the mess it causes: >>> import sys >>> sys.stdout.write('hi'); sys.stderr.write(' HMM '); sys.stdout.write('there\n') HMM hithere - Here is the same example with flushing: >>> sys.stdout.write('hi'); sys.stdout.flush(); sys.stderr.write(' HMM '); sys.stderr.flush(); sys.stdout.write('there\n') hi HMM there In fact, I think you need to improve your interface and documentation: - If either of the parameters 'outfile', or 'errfile' are passed in & not the 'none' value then the following happens: - Both sys.stdout & sys.stderr are flushed (in that order) before being redirected. (NOTE: Both are flushed, even if only one is redirected). - Before restoring sys.stdout & sys.stderr; then outfile & errfile are flushed (if both have been used then they are flushed in that order). You would of course need to write the documentation clearer than I did above (writing documentation well is not my skill) -- I wrote it to convey exactly what has to happen. On Fri, Sep 29, 2017 at 12:01 PM, Wren Turkal wrote: > Steven and Amit, > > > I originally configured the mailing list for digest delivery and can't > reply directly to his message. However, I now seen it, I will update the PR > with the suggested name changes as soon as my "make test" finishes. FWIW, I've > changed to direct message delivery. > > > With regard to ferr (the corresponding variable in the builtin_input c > function before the change), I saw this code: > > > /* First of all, flush stderr */ > tmp = _PyObject_CallMethodId(ferr, &PyId_flush, NULL); > if (tmp == NULL) > PyErr_Clear(); > else > Py_DECREF(tmp); > > And I assumed that it was important to be able to override a stderr as a > result. > > > I think there are few options here to resolve that: > > > 1. Remove the errfile parameter and just do what happened before. > 2. Remove the errfile and the above code (FWIW, I am not sure I > understand the importance of flushing stderr before taking input). > 3. Document the explicit purpose of the errfile. Amit, you'd responded > with something about this (again digest reply, sorry). I am not sure how to > concisely do that given Amit's descriptions. ? > > > I am honestly leaning toward 2 unless I can figure out why the flushing of > stderr is actually needed. 
> > > wt > > > ------------------------------ > *From:* Amit Green > *Sent:* Friday, September 29, 2017 8:13:21 AM > > *To:* Wren Turkal > *Cc:* python-ideas at python.org > *Subject:* Re: [Python-ideas] allow overriding files used for the input > builtin > > Yes, infile, outfile & errfile would be consistent with python naming > convention (also mentioned by Steven D'Aprano above) > > One of python's greatest strength is its library, the consistency of the > library, and how well documented the library is (in fact, I think the > library is a greater strength than even the very nice syntax of python in > general). > > By "consistency of the library" I mean: functions do pretty much what you > expect, they use consistent error mechanism & the documentation pretty much > accurately documents what the function does -- especially as to showing its > results & how it handles errors. > > Regarding this though, this then brings up the question (above from Steven > D'Aprano) -- what would the the "errfile" parameter do? > > - As a general principle of consistency python library functions, and > python itself, do not output to errfile, but instead throw errors. > - (There are very minor exceptions such as exceptions thrown in > __del__ functions; which are caught by python & then printed to standard > error). > > I would thus think you don't want the errfile parameter -- unless it would > be for catching these __del__ method that get triggered by input failing > (for example your 'infile' parameter when called, allocated an object, > which gets deallocated & throws an exception inside of its __del__ method). > > If this is the purpose, then (back to 'how well documented the library > is') -- it should be documented this is the purpose of the "errfile" > parameter ;-) > > [A secondary reason you might want to redirect "errfile" is that the > passed in input or output file's, themselves do output to standard error > ...] > > > > On Fri, Sep 29, 2017 at 10:50 AM, Wren Turkal wrote: > >> I am happy to rename the args. What do you think about infile, outfile, >> and errfile? >> >> >> FWIW, I did consider "in", "out", and "err", but "in" is a keyword, and I >> didn't think those quite captured the full meaning. >> >> >> wt >> ------------------------------ >> *From:* Amit Green >> *Sent:* Thursday, September 28, 2017 11:18:18 PM >> *To:* Wren Turkal >> *Cc:* python-ideas at python.org >> *Subject:* Re: [Python-ideas] allow overriding files used for the input >> builtin >> >> I'm fine with the idea in general of extra keyword parameters to the >> input function. >> >> A few points: >> >> Your example code, needs try/catch to match what the input with >> parameters does -- and yes, its way nicer to be able to use it the example >> you have shown than play games with try/catch (Personally I also refuse to >> ever change sys.stdin, or sys.stdout, as I consider that a bad coding >> style). >> >> Mostly though I would like to ask, please do not name keyword arguments >> with names like 'fin' & 'fout'. This is almost unreadable and make's code >> almost indecipherable to others the first time they see the function & its >> keyword arguments (First impressions are very important). >> >> Both a function name & its keyword parameters need to be as >> understandable as possible when a user encounters them for the first time. 
>> >> On Fri, Sep 29, 2017 at 1:53 AM, Wren Turkal wrote: >> >>> Hi there, >>> >>> >>> I have posted an idea for improvement with a PR of an implementation to >>> https://bugs.python.org/issue31603 >>> >>> . >>> >>> >>> The basic idea is to add fin, fout, and ferr file object parameters and >>> default to using what is used today when the args are not specified. I >>> believe this would be useful to allow captures input and send output to >>> specific files when using input. The input builtin has some logic to use >>> readline if it's available. It would be nice to be able to use this same >>> logic no matter what files are being used for input/output. >>> >>> >>> This is meant to turn code like the following: >>> >>> orig_stdin = sys.stdin >>> >>> orig_stdout = sys.stdout >>> >>> with open('/dev/tty', 'r+') as f: >>> >>> sys.stdin = f >>> >>> sys.stdout = f >>> >>> name = input('Name? ') >>> >>> sys.stdin = orig_stdin >>> >>> sys.stdout = orig_stdout >>> >>> print(name) >>> >>> >>> into something more like this: >>> >>> with open('/dev/tty', 'r+') as f: >>> >>> name = input('Name? ', fin=f, fout=f) >>> >>> print(name) >>> >>> >>> It's nice that it makes the assignment to a global variable to change >>> the file used for input/output to no longer be needed. >>> >>> >>> I had this idea the other day, and I realized that it would be super >>> easy to implement, so I went ahead the threw up a PR also. >>> >>> >>> Would love to see if anyone else is interested in this. I think it's >>> pretty cool that the core logic really didn't need to be changed other than >>> plumbing in the new args. >>> >>> >>> FWIW, this change introduces no regressions and adds a few more tests to >>> test the new functionality. Honestly, I think this functionality could >>> probably be used to simplify some of the other tests as well, but I wanted >>> to gauge what folks thought of the change before going farther. >>> >>> >>> Wren Turkal >>> >>> Existential Production Engineer of the Ages >>> >>> Facebook, Inc. >>> >>> _______________________________________________ >>> Python-ideas mailing list >>> Python-ideas at python.org >>> https://mail.python.org/mailman/listinfo/python-ideas >>> >>> Code of Conduct: http://python.org/psf/codeofconduct/ >>> >>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From w00t at fb.com Fri Sep 29 12:53:01 2017 From: w00t at fb.com (Wren Turkal) Date: Fri, 29 Sep 2017 16:53:01 +0000 Subject: [Python-ideas] allow overriding files used for the input builtin In-Reply-To: References: , Message-ID: Oh, I thought that stderr was unbuffered. Like the following C program: #include int main() { fprintf(stdout, "stdout0"); fprintf(stderr, "stderr0"); fprintf(stdout, "stdout1"); return 0; } This outputs: stderr0stdout0stdout1 Turns out fprintf in glibc will also implicitly flush at newlines also. So, the following program: #include int main() { fprintf(stdout, "stdout0\nstdout12"); fprintf(stderr, "stderr0"); fprintf(stdout, "stdout1"); return 0; } outputs: stdout0 stderr0stdout12stdout1 This is in line with python. The following program: import sys sys.stdout.write('stdout0\nstdout01') sys.stderr.write('stderr0') sys.stdout.write('stdout1') outputs: stdout0 stderr0stdout01stdout1 While the following program: import sys sys.stdout.write('stdout0') sys.stderr.write('stderr0') sys.stdout.write('stdout1') outputs: stderr0stdout0stdout1 So, I guess that the errfile is explicitly flushed due to the fact that stdout may or may not be flushed during output. 
So, I have 2 questions: (1) do we need to keep the errfile param to make this useful (I think we do), and if #1 is yes, (2) what is an appropriate way to doc this? wt ________________________________ From: Amit Green Sent: Friday, September 29, 2017 9:17:39 AM To: Wren Turkal Cc: Steven D'Aprano; python-ideas at python.org Subject: Re: [Python-ideas] allow overriding files used for the input builtin Hmm, very good point. Flushing standard error is essential: * This is because output of standard output & standard error are often redirected to the same file -- frequently the terminal -- and its important to flush one of them before output to the other (since these are typically line buffered you only see this if you output partial lines). * Here is an example of not flushing & the mess it causes: >>> import sys >>> sys.stdout.write('hi'); sys.stderr.write(' HMM '); sys.stdout.write('there\n') HMM hithere * Here is the same example with flushing: >>> sys.stdout.write('hi'); sys.stdout.flush(); sys.stderr.write(' HMM '); sys.stderr.flush(); sys.stdout.write('there\n') hi HMM there In fact, I think you need to improve your interface and documentation: * If either of the parameters 'outfile', or 'errfile' are passed in & not the 'none' value then the following happens: * Both sys.stdout & sys.stderr are flushed (in that order) before being redirected. (NOTE: Both are flushed, even if only one is redirected). * Before restoring sys.stdout & sys.stderr; then outfile & errfile are flushed (if both have been used then they are flushed in that order). You would of course need to write the documentation clearer than I did above (writing documentation well is not my skill) -- I wrote it to convey exactly what has to happen. On Fri, Sep 29, 2017 at 12:01 PM, Wren Turkal > wrote: Steven and Amit, I originally configured the mailing list for digest delivery and can't reply directly to his message. However, I now seen it, I will update the PR with the suggested name changes as soon as my "make test" finishes. FWIW, I've changed to direct message delivery. With regard to ferr (the corresponding variable in the builtin_input c function before the change), I saw this code: /* First of all, flush stderr */ tmp = _PyObject_CallMethodId(ferr, &PyId_flush, NULL); if (tmp == NULL) PyErr_Clear(); else Py_DECREF(tmp); And I assumed that it was important to be able to override a stderr as a result. I think there are few options here to resolve that: 1. Remove the errfile parameter and just do what happened before. 2. Remove the errfile and the above code (FWIW, I am not sure I understand the importance of flushing stderr before taking input). 3. Document the explicit purpose of the errfile. Amit, you'd responded with something about this (again digest reply, sorry). I am not sure how to concisely do that given Amit's descriptions. ? I am honestly leaning toward 2 unless I can figure out why the flushing of stderr is actually needed. wt ________________________________ From: Amit Green > Sent: Friday, September 29, 2017 8:13:21 AM To: Wren Turkal Cc: python-ideas at python.org Subject: Re: [Python-ideas] allow overriding files used for the input builtin Yes, infile, outfile & errfile would be consistent with python naming convention (also mentioned by Steven D'Aprano above) One of python's greatest strength is its library, the consistency of the library, and how well documented the library is (in fact, I think the library is a greater strength than even the very nice syntax of python in general). 
By "consistency of the library" I mean: functions do pretty much what you expect, they use consistent error mechanism & the documentation pretty much accurately documents what the function does -- especially as to showing its results & how it handles errors. Regarding this though, this then brings up the question (above from Steven D'Aprano) -- what would the the "errfile" parameter do? * As a general principle of consistency python library functions, and python itself, do not output to errfile, but instead throw errors. * (There are very minor exceptions such as exceptions thrown in __del__ functions; which are caught by python & then printed to standard error). I would thus think you don't want the errfile parameter -- unless it would be for catching these __del__ method that get triggered by input failing (for example your 'infile' parameter when called, allocated an object, which gets deallocated & throws an exception inside of its __del__ method). If this is the purpose, then (back to 'how well documented the library is') -- it should be documented this is the purpose of the "errfile" parameter ;-) [A secondary reason you might want to redirect "errfile" is that the passed in input or output file's, themselves do output to standard error ...] On Fri, Sep 29, 2017 at 10:50 AM, Wren Turkal > wrote: I am happy to rename the args. What do you think about infile, outfile, and errfile? FWIW, I did consider "in", "out", and "err", but "in" is a keyword, and I didn't think those quite captured the full meaning. wt ________________________________ From: Amit Green > Sent: Thursday, September 28, 2017 11:18:18 PM To: Wren Turkal Cc: python-ideas at python.org Subject: Re: [Python-ideas] allow overriding files used for the input builtin I'm fine with the idea in general of extra keyword parameters to the input function. A few points: Your example code, needs try/catch to match what the input with parameters does -- and yes, its way nicer to be able to use it the example you have shown than play games with try/catch (Personally I also refuse to ever change sys.stdin, or sys.stdout, as I consider that a bad coding style). Mostly though I would like to ask, please do not name keyword arguments with names like 'fin' & 'fout'. This is almost unreadable and make's code almost indecipherable to others the first time they see the function & its keyword arguments (First impressions are very important). Both a function name & its keyword parameters need to be as understandable as possible when a user encounters them for the first time. On Fri, Sep 29, 2017 at 1:53 AM, Wren Turkal > wrote: Hi there, I have posted an idea for improvement with a PR of an implementation to https://bugs.python.org/issue31603. The basic idea is to add fin, fout, and ferr file object parameters and default to using what is used today when the args are not specified. I believe this would be useful to allow captures input and send output to specific files when using input. The input builtin has some logic to use readline if it's available. It would be nice to be able to use this same logic no matter what files are being used for input/output. This is meant to turn code like the following: orig_stdin = sys.stdin orig_stdout = sys.stdout with open('/dev/tty', 'r+') as f: sys.stdin = f sys.stdout = f name = input('Name? ') sys.stdin = orig_stdin sys.stdout = orig_stdout print(name) into something more like this: with open('/dev/tty', 'r+') as f: name = input('Name? 
', fin=f, fout=f) print(name) It's nice that it makes the assignment to a global variable to change the file used for input/output to no longer be needed. I had this idea the other day, and I realized that it would be super easy to implement, so I went ahead the threw up a PR also. Would love to see if anyone else is interested in this. I think it's pretty cool that the core logic really didn't need to be changed other than plumbing in the new args. FWIW, this change introduces no regressions and adds a few more tests to test the new functionality. Honestly, I think this functionality could probably be used to simplify some of the other tests as well, but I wanted to gauge what folks thought of the change before going farther. Wren Turkal Existential Production Engineer of the Ages Facebook, Inc. _______________________________________________ Python-ideas mailing list Python-ideas at python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From w00t at fb.com Fri Sep 29 14:20:31 2017 From: w00t at fb.com (Wren Turkal) Date: Fri, 29 Sep 2017 18:20:31 +0000 Subject: [Python-ideas] allow overriding files used for the input builtin In-Reply-To: References: , , Message-ID: So, I was just thinking, maybe we don't want an errfile arg, but an arg that is a sequence of file objects that need to be flushed before showing the prompt. That's certainly more complicated, but it seems more general if we want to cover this case. Thoughts? wt ________________________________ From: Python-ideas on behalf of Wren Turkal Sent: Friday, September 29, 2017 9:53:01 AM To: Amit Green Cc: python-ideas at python.org Subject: [Potential Spoof] Re: [Python-ideas] allow overriding files used for the input builtin Oh, I thought that stderr was unbuffered. Like the following C program: #include int main() { fprintf(stdout, "stdout0"); fprintf(stderr, "stderr0"); fprintf(stdout, "stdout1"); return 0; } This outputs: stderr0stdout0stdout1 Turns out fprintf in glibc will also implicitly flush at newlines also. So, the following program: #include int main() { fprintf(stdout, "stdout0\nstdout12"); fprintf(stderr, "stderr0"); fprintf(stdout, "stdout1"); return 0; } outputs: stdout0 stderr0stdout12stdout1 This is in line with python. The following program: import sys sys.stdout.write('stdout0\nstdout01') sys.stderr.write('stderr0') sys.stdout.write('stdout1') outputs: stdout0 stderr0stdout01stdout1 While the following program: import sys sys.stdout.write('stdout0') sys.stderr.write('stderr0') sys.stdout.write('stdout1') outputs: stderr0stdout0stdout1 So, I guess that the errfile is explicitly flushed due to the fact that stdout may or may not be flushed during output. So, I have 2 questions: (1) do we need to keep the errfile param to make this useful (I think we do), and if #1 is yes, (2) what is an appropriate way to doc this? wt ________________________________ From: Amit Green Sent: Friday, September 29, 2017 9:17:39 AM To: Wren Turkal Cc: Steven D'Aprano; python-ideas at python.org Subject: Re: [Python-ideas] allow overriding files used for the input builtin Hmm, very good point. 
Flushing standard error is essential:

* This is because output of standard output & standard error are often redirected to the same file -- frequently the terminal -- and it's important to flush one of them before output to the other (since these are typically line buffered, you only see this if you output partial lines).

* Here is an example of not flushing & the mess it causes:

>>> import sys
>>> sys.stdout.write('hi'); sys.stderr.write(' HMM '); sys.stdout.write('there\n')
HMM hithere

* Here is the same example with flushing:

>>> sys.stdout.write('hi'); sys.stdout.flush(); sys.stderr.write(' HMM '); sys.stderr.flush(); sys.stdout.write('there\n')
hi HMM there

In fact, I think you need to improve your interface and documentation:

* If either of the parameters 'outfile' or 'errfile' is passed in & is not None, then the following happens:
* Both sys.stdout & sys.stderr are flushed (in that order) before being redirected. (NOTE: Both are flushed, even if only one is redirected).
* Before restoring sys.stdout & sys.stderr, outfile & errfile are flushed (if both have been used, they are flushed in that order).

You would of course need to write the documentation more clearly than I did above (writing documentation well is not my skill) -- I wrote it to convey exactly what has to happen.

On Fri, Sep 29, 2017 at 12:01 PM, Wren Turkal wrote:

Steven and Amit,

I originally configured the mailing list for digest delivery and can't reply directly to his message. However, now that I've seen it, I will update the PR with the suggested name changes as soon as my "make test" finishes. FWIW, I've changed to direct message delivery.

With regard to ferr (the corresponding variable in the builtin_input C function before the change), I saw this code:

/* First of all, flush stderr */
tmp = _PyObject_CallMethodId(ferr, &PyId_flush, NULL);
if (tmp == NULL)
    PyErr_Clear();
else
    Py_DECREF(tmp);

And I assumed that it was important to be able to override stderr as a result. I think there are a few options here to resolve that:

1. Remove the errfile parameter and just do what happened before.
2. Remove the errfile and the above code (FWIW, I am not sure I understand the importance of flushing stderr before taking input).
3. Document the explicit purpose of the errfile. Amit, you'd responded with something about this (again digest reply, sorry). I am not sure how to concisely do that given Amit's descriptions.

I am honestly leaning toward 2 unless I can figure out why the flushing of stderr is actually needed.

wt

________________________________
From: Amit Green
Sent: Friday, September 29, 2017 8:13:21 AM
To: Wren Turkal
Cc: python-ideas at python.org
Subject: Re: [Python-ideas] allow overriding files used for the input builtin

Yes, infile, outfile & errfile would be consistent with python naming convention (also mentioned by Steven D'Aprano above).

One of python's greatest strengths is its library, the consistency of the library, and how well documented the library is (in fact, I think the library is a greater strength than even the very nice syntax of python in general).

By "consistency of the library" I mean: functions do pretty much what you expect, they use a consistent error mechanism & the documentation pretty much accurately documents what the function does -- especially as to showing its results & how it handles errors.

Regarding this though, this then brings up the question (above from Steven D'Aprano) -- what would the "errfile" parameter do?
* As a general principle of consistency, python library functions, and python itself, do not output to errfile, but instead throw errors.
* (There are very minor exceptions, such as exceptions thrown in __del__ functions, which are caught by python & then printed to standard error).

I would thus think you don't want the errfile parameter -- unless it would be for catching these __del__ methods that get triggered by input failing (for example, your 'infile' parameter, when called, allocated an object, which gets deallocated & throws an exception inside of its __del__ method).

If this is the purpose, then (back to 'how well documented the library is') -- it should be documented that this is the purpose of the "errfile" parameter ;-)

[A secondary reason you might want to redirect "errfile" is that the passed-in input or output files themselves do output to standard error ...]

On Fri, Sep 29, 2017 at 10:50 AM, Wren Turkal wrote:

I am happy to rename the args. What do you think about infile, outfile, and errfile? FWIW, I did consider "in", "out", and "err", but "in" is a keyword, and I didn't think those quite captured the full meaning.

wt

________________________________
From: Amit Green
Sent: Thursday, September 28, 2017 11:18:18 PM
To: Wren Turkal
Cc: python-ideas at python.org
Subject: Re: [Python-ideas] allow overriding files used for the input builtin

I'm fine with the idea in general of extra keyword parameters to the input function. A few points:

Your example code needs try/catch to match what the input with parameters does -- and yes, it's way nicer to be able to use it as in the example you have shown than to play games with try/catch (Personally, I also refuse to ever change sys.stdin or sys.stdout, as I consider that a bad coding style).

Mostly though I would like to ask, please do not name keyword arguments with names like 'fin' & 'fout'. This is almost unreadable and makes code almost indecipherable to others the first time they see the function & its keyword arguments (first impressions are very important). Both a function name & its keyword parameters need to be as understandable as possible when a user encounters them for the first time.

On Fri, Sep 29, 2017 at 1:53 AM, Wren Turkal wrote:

Hi there,

I have posted an idea for improvement with a PR of an implementation to https://bugs.python.org/issue31603.

The basic idea is to add fin, fout, and ferr file object parameters and default to using what is used today when the args are not specified. I believe this would be useful to allow capturing input from and sending output to specific files when using input.

The input builtin has some logic to use readline if it's available. It would be nice to be able to use this same logic no matter what files are being used for input/output.

This is meant to turn code like the following:

orig_stdin = sys.stdin
orig_stdout = sys.stdout
with open('/dev/tty', 'r+') as f:
    sys.stdin = f
    sys.stdout = f
    name = input('Name? ')
sys.stdin = orig_stdin
sys.stdout = orig_stdout
print(name)

into something more like this:

with open('/dev/tty', 'r+') as f:
    name = input('Name? ', fin=f, fout=f)
print(name)

It's nice that it makes the assignment to a global variable to change the file used for input/output no longer needed.

I had this idea the other day, and I realized that it would be super easy to implement, so I went ahead and threw up a PR also. Would love to see if anyone else is interested in this. I think it's pretty cool that the core logic really didn't need to be changed other than plumbing in the new args.
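
To give a rough idea of what that plumbing amounts to, here is an illustrative pure-Python sketch -- not the code from the PR, and the helper name is made up just for this example; it skips the readline fast path and most of the real builtin's error handling:

import sys

def input_redirected(prompt='', fin=None, fout=None, ferr=None):
    # Hypothetical pure-Python stand-in for the proposed builtin:
    # fall back to the current sys streams when the args are not
    # specified, matching what input() does today.
    fin = sys.stdin if fin is None else fin
    fout = sys.stdout if fout is None else fout
    ferr = sys.stderr if ferr is None else ferr

    # Flush the error stream first (as the current builtin does),
    # then write the prompt and read a single line from the input
    # file, stripping the trailing newline.
    ferr.flush()
    fout.write(str(prompt))
    fout.flush()
    line = fin.readline()
    if not line:
        raise EOFError
    return line[:-1] if line.endswith('\n') else line

With something like that as a mental model, the /dev/tty example above stays a one-liner, and the actual change is mostly about passing the new file arguments down into the existing C code.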
FWIW, this change introduces no regressions and adds a few more tests to test the new functionality. Honestly, I think this functionality could probably be used to simplify some of the other tests as well, but I wanted to gauge what folks thought of the change before going farther.

Wren Turkal
Existential Production Engineer of the Ages
Facebook, Inc.
_______________________________________________
Python-ideas mailing list
Python-ideas at python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

From diana.joan.clarke at gmail.com Fri Sep 29 16:24:52 2017
From: diana.joan.clarke at gmail.com (Diana Clarke)
Date: Fri, 29 Sep 2017 14:24:52 -0600
Subject: [Python-ideas] Changes to the existing optimization levels
In-Reply-To:
References: <20170928210248.65a06e4b@fsol>
Message-ID:

Oh, I like this idea!

I had very briefly considered treating the existing flag as a bitfield, but then promptly forgot to explore that line of thought further.

I'll play with that approach next week, see where it takes me, and then report back.

Thanks so much for taking the time to think this through with me -- much appreciated.

Cheers,

--diana

On Fri, Sep 29, 2017 at 1:33 AM, Nick Coghlan wrote:
> 2. We could reinterpret "optimize" as a bitfield instead of a regular
> integer, special casing the already defined values:
>
> - all zero: no optimizations
> - sign bit set: negative -> use global settings
> - 0x0001: nodebug+noassert
> - 0x0002: nodebug+noassert+nodocstrings
> - 0x0004: nodebug
> - 0x0008: noassert
> - 0x0010: nodocstrings
>
> The "redefine optimizations as a bitfield" approach seems particularly
> promising to me - it's a full integer, so even with all negative
> numbers disallowed and the two low order bits reserved for the legacy
> combinations, that's still 29 different optimisation flags given
> 32-bit integers. We currently have 3, so that's room for an 866%
> increase in the number of defined flags :)

From diana.joan.clarke at gmail.com Sat Sep 30 16:36:36 2017
From: diana.joan.clarke at gmail.com (Diana Clarke)
Date: Sat, 30 Sep 2017 14:36:36 -0600
Subject: [Python-ideas] Changes to the existing optimization levels
In-Reply-To:
References: <20170928210248.65a06e4b@fsol>
Message-ID:

In the mean time, I've re-opened the following pull request that can be merged independent of these changes (it's just additional test coverage).

trivial: add test coverage for the __debug__ case (optimization levels)
https://github.com/python/cpython/pull/3450

Please let me know if I should create a bpo for it, if the commit message is too long, or if you think I should otherwise change the patch in any way.

As always, thanks for your time folks!

--diana

On Fri, Sep 29, 2017 at 2:24 PM, Diana Clarke wrote:
> Oh, I like this idea!
>
> I had very briefly considered treating the existing flag as a
> bitfield, but then promptly forgot to explore that line of thought
> further.
>
> I'll play with that approach next week, see where it takes me, and
> then report back.
>
> Thanks so much for taking the time to think this through with me --
> much appreciated.
>
> Cheers,
>
> --diana
>
> On Fri, Sep 29, 2017 at 1:33 AM, Nick Coghlan wrote:
>> 2. We could reinterpret "optimize" as a bitfield instead of a regular
>> integer, special casing the already defined values:
>>
>> - all zero: no optimizations
>> - sign bit set: negative -> use global settings
>> - 0x0001: nodebug+noassert
>> - 0x0002: nodebug+noassert+nodocstrings
>> - 0x0004: nodebug
>> - 0x0008: noassert
>> - 0x0010: nodocstrings
>>
>> The "redefine optimizations as a bitfield" approach seems particularly
>> promising to me - it's a full integer, so even with all negative
>> numbers disallowed and the two low order bits reserved for the legacy
>> combinations, that's still 29 different optimisation flags given
>> 32-bit integers. We currently have 3, so that's room for an 866%
>> increase in the number of defined flags :)

From tjreedy at udel.edu Sat Sep 30 19:44:44 2017
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat, 30 Sep 2017 19:44:44 -0400
Subject: [Python-ideas] Changes to the existing optimization levels
In-Reply-To:
References: <20170928210248.65a06e4b@fsol>
Message-ID:

On 9/30/2017 4:36 PM, Diana Clarke wrote:
> In the mean time, I've re-opened the following pull request that can
> be merged independent of these changes (it's just additional test
> coverage).
>
> trivial: add test coverage for the __debug__ case (optimization levels)
> https://github.com/python/cpython/pull/3450
>
> Please let me know if I should create a bpo for it, if the commit
> message is too long, or if you think I should otherwise change the
> patch in any way.

Your patch is substantial, well beyond trivial. Please open an issue with the page-long description as the first message, add a news blurb, and create a much shorter commit message.

--
Terry Jan Reedy