From J.Demeyer at UGent.be Mon Apr 1 01:31:03 2019 From: J.Demeyer at UGent.be (Jeroen Demeyer) Date: Mon, 1 Apr 2019 07:31:03 +0200 Subject: [Python-Dev] PEP 580/590 discussion In-Reply-To: <5C9FEF82.50207@UGent.be> References: <6c8356e3-f9b2-c39e-63c4-17f146d326b7@hotpy.org> <15b8a3d7-00ed-a5eb-475c-a3adee671b5f@hotpy.org> <5C9FEF82.50207@UGent.be> Message-ID: <5CA1A217.1030007@UGent.be> I added benchmarks for PEP 590: https://gist.github.com/jdemeyer/f0d63be8f30dc34cc989cd11d43df248 From songofacandy at gmail.com Mon Apr 1 04:26:31 2019 From: songofacandy at gmail.com (Inada Naoki) Date: Mon, 1 Apr 2019 17:26:31 +0900 Subject: [Python-Dev] Removing PendingDeprecationWarning In-Reply-To: References: <18ccdacf-8fc8-5130-b4ba-89df84e02987@python.org> Message-ID: On Sat, Mar 30, 2019 at 7:31 PM Nick Coghlan wrote: > > That's just a documentation fix: "If you're not sure whether to use > DeprecationWarning or PendingDeprecationWarning, use > DeprecationWarning". > Current proposed patch is: """ .. note:: PendingDeprecationWarning was introduced as an "ignored by default" version of DeprecationWarning. But :exc:`DeprecationWarning` is also ignored by default since Python 2.7 and 3.2. There is not much difference between PendingDeprecationWarning and DeprecationWarning nowadays. DeprecationWarning is recommended in general. """ https://github.com/python/cpython/pull/12505/files#diff-4d7187c7266c3f79727d358de3b3d228 -- Inada Naoki From steve.dower at python.org Mon Apr 1 12:12:26 2019 From: steve.dower at python.org (Steve Dower) Date: Mon, 1 Apr 2019 09:12:26 -0700 Subject: [Python-Dev] Strange umask(?)/st_mode issue In-Reply-To: References: <13f98061-6f64-2e8b-de66-d84a7be00a17@python.org> <20190330023947.GA62291@cskk.homeip.net> <55e6f051-179d-73b4-4cca-b91c5c81b498@python.org> <3210441d-6094-b53e-6bf7-4b7c4cfb16ea@python.org> Message-ID: On 30Mar2019 1130, Gregory P. Smith wrote: > I wouldn't expect it to be the case in a CI environment but I believe a > umask can be overridden if the filesystem is mounted and configured with > acls set?? (oh, hah, Ivan just said the same thing) Yep, it appears this is the case. The Pipelines team got back to me and it seems to be a known issue - the workaround they gave me was to run "sudo setfacl -Rb /home/vsts" at the start, so I've merged that in for now (to master and 3.7). Cheers, Steve From steve.dower at python.org Mon Apr 1 12:31:36 2019 From: steve.dower at python.org (Steve Dower) Date: Mon, 1 Apr 2019 09:31:36 -0700 Subject: [Python-Dev] PEP 578: Python Runtime Audit Hooks In-Reply-To: <8466c9c4-b5dc-c6c5-6fe4-a49dc2f4f968@python.org> References: <2cb3740e-ebc2-1839-1d2e-73d1b9f0a445@python.org> <8466c9c4-b5dc-c6c5-6fe4-a49dc2f4f968@python.org> Message-ID: On 31Mar2019 0538, Christian Heimes wrote: > I don't like the fact that the PEP requires users to learn and use an > additional layer to handle native code. Although we cannot provide a > fully secure hook for native code, we could at least try to provide a > best effort hook and document the limitations. A bit more information > would make the verified open function more useful, too. So instead they need to learn a significantly more complicated API? :) (I was very happy to be able to say "it's the same as open(p, 'rb')"). > PyObject *PyImport_OpenForExecution( > const char *path, > const char *intent, > int flags, > PyObject *context > ) > > - Path is an absolute (!) file path. The PEP doesn't specify if the file > name is relative or absolute. IMO it should be always absolute. 
Yeah, this is fair enough. I'll add it as a requirement. > - The new intent argument lets the caller pass information how it > intents to use the file, e.g. pythoncode, zipimport, nativecode (for > loading a shared library/DLL), ctypes, ... This allows the verify hook > to react on the intent and provide different verifications for e.g. > Python code and native modules. I had an intent argument at one point and the feedback I got (from teams who wanted to implement it) is that they wouldn't trust it anyway :) In each case there should be associated audit events for tracking the intent (and interrupting at that point if it doesn't like the intended action), but for the simple case of "let me open this specific file" it doesn't really add much. And it almost certainly shouldn't impact decision making. > - The flags argument is for additional flags, e.g. return an opened file > or None, open the file in text or binary mode, ... This just makes it harder for the hook implementer - now you have to allow encoding/errors arguments and probably more. And as mentioned above, there should be an audit event showing the intent before this call, and a hook can reject it at that point (rather than verify without actually returning the verified content). > - Context is an optional Python object from the caller's context. For > the import system, it could be the loader instance. I think the audit event covers this, unless you have some way of using this context in mind that I can't think of? Cheers, Steve From steve.dower at python.org Mon Apr 1 13:42:58 2019 From: steve.dower at python.org (Steve Dower) Date: Mon, 1 Apr 2019 10:42:58 -0700 Subject: [Python-Dev] PEP 578: Python Runtime Audit Hooks In-Reply-To: <6ded2c50-bf28-1376-7b0c-9cc6839be56b@python.org> References: <2cb3740e-ebc2-1839-1d2e-73d1b9f0a445@python.org> <6ded2c50-bf28-1376-7b0c-9cc6839be56b@python.org> Message-ID: <16bc5fae-47c0-e9f6-da60-e0e46cb84c78@python.org> On 30Mar2019 0913, Steve Dower wrote: > On 30Mar.2019 0747, Nick Coghlan wrote: >> I like this PEP in principle, but the specific "open_for_import" name >> bothers me a lot, as it implies that "importing" is the only situation >> where a file will be opened for code execution. >> >> If this part of the API were lower down the stack (e.g. >> "_io.open_for_code_execution") then I think it would make more sense - >> APIs like tokenize.open(), runpy.run_path(), PyRun_SimpleFile(), >> shelve, etc, could use that, without having to introduce a dependency >> on importlib to get access to the functionality. > > It was called "open_for_exec" at one point, though I forget exactly why > we changed it. But I have no problem with moving it. Something like this? > > PyImport_OpenForImport -> PyIO_OpenForExec > PyImport_SetOpenForImportHook -> PyIO_SetOpenForExecHook > importlib.util.open_for_import -> _io.open_for_exec > > Or more in line with Nick's suggestion: > > PyImport_OpenForImport -> PyIO_OpenExecutableCode > PyImport_SetOpenForImportHook -> PyIO_SetOpenExecutableCodeHook > importlib.util.open_for_import -> _io.open_executable_code > > I dropped "For", but I don't really care that much about the name. I'd > be okay dropping either "executable" or "code" as well - I don't really > have a good sense of which will make people more likely to use this > correctly. Looking at what we already have, perhaps putting it under "PyFile_OpenForExecute" would make the most sense? We don't currently have any public "PyIO" types or functions. 
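[For illustration: a minimal sketch of the runtime side of PEP 578 that this thread keeps referring to -- a C audit hook that can observe or reject audited operations. The PySys_AddAuditHook() and PySys_Audit() names are the ones proposed in the PEP draft; the exact hook signature and the event name used here are assumptions made for the example, not an excerpt from the PEP.]

    #include <Python.h>
    #include <string.h>

    /* Hedged sketch of a PEP 578 style audit hook. The hook receives the
     * event name, a tuple of event-specific arguments and the userData
     * pointer given at registration; returning non-zero with an exception
     * set is intended to abort the audited operation. */
    static int
    example_audit_hook(const char *event, PyObject *args, void *userData)
    {
        /* "example.open_for_exec" is a made-up event name for illustration. */
        if (strcmp(event, "example.open_for_exec") == 0) {
            PyErr_SetString(PyExc_RuntimeError,
                            "opening code for execution is not allowed here");
            return -1;
        }
        return 0;   /* allow everything else */
    }

    /* Typically registered once, early in the embedding application. */
    static int
    install_example_hook(void)
    {
        return PySys_AddAuditHook(example_audit_hook, NULL);
    }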
Bikeshedding now, but as I'm the only one really participating in it, I think it's allowed :) Cheers, Steve From cs at cskk.id.au Mon Apr 1 18:35:39 2019 From: cs at cskk.id.au (Cameron Simpson) Date: Tue, 2 Apr 2019 09:35:39 +1100 Subject: [Python-Dev] Strange umask(?)/st_mode issue In-Reply-To: References: Message-ID: <20190401223539.GA47505@cskk.homeip.net> On 01Apr2019 09:12, Steve Dower wrote: >On 30Mar2019 1130, Gregory P. Smith wrote: >>I wouldn't expect it to be the case in a CI environment but I >>believe a umask can be overridden if the filesystem is mounted and >>configured with acls set?? (oh, hah, Ivan just said the same thing) > >Yep, it appears this is the case. The Pipelines team got back to me >and it seems to be a known issue - the workaround they gave me was to >run "sudo setfacl -Rb /home/vsts" at the start, so I've merged that in >for now (to master and 3.7). Could that be done _without_ sudo to just the local directory containing the test tar file? If that works then you don't need any nasty privileged sudo use (which will just break on platforms without sudo anyway). Cheers, Cameron Simpson From steve.dower at python.org Mon Apr 1 18:44:13 2019 From: steve.dower at python.org (Steve Dower) Date: Mon, 1 Apr 2019 15:44:13 -0700 Subject: [Python-Dev] Strange umask(?)/st_mode issue In-Reply-To: <20190401223539.GA47505@cskk.homeip.net> References: <20190401223539.GA47505@cskk.homeip.net> Message-ID: On 01Apr2019 1535, Cameron Simpson wrote: > On 01Apr2019 09:12, Steve Dower wrote: >> On 30Mar2019 1130, Gregory P. Smith wrote: >>> I wouldn't expect it to be the case in a CI environment but I believe >>> a umask can be overridden if the filesystem is mounted and configured >>> with acls set?? (oh, hah, Ivan just said the same thing) >> >> Yep, it appears this is the case. The Pipelines team got back to me >> and it seems to be a known issue - the workaround they gave me was to >> run "sudo setfacl -Rb /home/vsts" at the start, so I've merged that in >> for now (to master and 3.7). > > Could that be done _without_ sudo to just the local directory containing > the test tar file? If that works then you don't need any nasty > privileged sudo use (which will just break on platforms without sudo > anyway). I tried something similar to that and it didn't work. My guess is it's to do with the actual mount point? (I also tried without sudo at first, and when I didn't work, I tried it with sudo. I hear that's how to decide whether you need it or not ;) ) In any case, it only applies to the Azure Pipelines build definition, so there aren't any other platforms where it'll be used. Cheers, Steve From cs at cskk.id.au Mon Apr 1 19:49:29 2019 From: cs at cskk.id.au (Cameron Simpson) Date: Tue, 2 Apr 2019 10:49:29 +1100 Subject: [Python-Dev] Strange umask(?)/st_mode issue In-Reply-To: References: Message-ID: <20190401234929.GA53667@cskk.homeip.net> On 01Apr2019 15:44, Steve Dower wrote: >On 01Apr2019 1535, Cameron Simpson wrote: >>On 01Apr2019 09:12, Steve Dower wrote: >>>On 30Mar2019 1130, Gregory P. Smith wrote: >>>>I wouldn't expect it to be the case in a CI environment but I >>>>believe a umask can be overridden if the filesystem is mounted >>>>and configured with acls set?? (oh, hah, Ivan just said the same >>>>thing) >>> >>>Yep, it appears this is the case. The Pipelines team got back to >>>me and it seems to be a known issue - the workaround they gave me >>>was to run "sudo setfacl -Rb /home/vsts" at the start, so I've >>>merged that in for now (to master and 3.7). 
>> >>Could that be done _without_ sudo to just the local directory >>containing the test tar file? If that works then you don't need any >>nasty privileged sudo use (which will just break on platforms >>without sudo anyway). > >I tried something similar to that and it didn't work. My guess is it's >to do with the actual mount point? (I also tried without sudo at >first, and when I didn't work, I tried it with sudo. I hear that's how >to decide whether you need it or not ;) ) > >In any case, it only applies to the Azure Pipelines build definition, >so there aren't any other platforms where it'll be used. Ok then. Curious: is the sudo now in the build setup? I'm just thinking that this isn't a tarfile specific fix but a "get correct POSIX umask semantics" fix, so it should apply to the entire environment. Or am I naive? Cheers, Cameron Simpson From greg at krypto.org Mon Apr 1 19:59:43 2019 From: greg at krypto.org (Gregory P. Smith) Date: Mon, 1 Apr 2019 16:59:43 -0700 Subject: [Python-Dev] Strange umask(?)/st_mode issue In-Reply-To: <20190401234929.GA53667@cskk.homeip.net> References: <20190401234929.GA53667@cskk.homeip.net> Message-ID: On Mon, Apr 1, 2019 at 4:49 PM Cameron Simpson wrote: > On 01Apr2019 15:44, Steve Dower wrote: > >On 01Apr2019 1535, Cameron Simpson wrote: > >>On 01Apr2019 09:12, Steve Dower wrote: > >>>On 30Mar2019 1130, Gregory P. Smith wrote: > >>>>I wouldn't expect it to be the case in a CI environment but I > >>>>believe a umask can be overridden if the filesystem is mounted > >>>>and configured with acls set? (oh, hah, Ivan just said the same > >>>>thing) > >>> > >>>Yep, it appears this is the case. The Pipelines team got back to > >>>me and it seems to be a known issue - the workaround they gave me > >>>was to run "sudo setfacl -Rb /home/vsts" at the start, so I've > >>>merged that in for now (to master and 3.7). > >> > >>Could that be done _without_ sudo to just the local directory > >>containing the test tar file? If that works then you don't need any > >>nasty privileged sudo use (which will just break on platforms > >>without sudo anyway). > > > >I tried something similar to that and it didn't work. My guess is it's > >to do with the actual mount point? (I also tried without sudo at > >first, and when I didn't work, I tried it with sudo. I hear that's how > >to decide whether you need it or not ;) ) > > > >In any case, it only applies to the Azure Pipelines build definition, > >so there aren't any other platforms where it'll be used. > > Ok then. > > Curious: is the sudo now in the build setup? I'm just thinking that this > isn't a tarfile specific fix but a "get correct POSIX umask semantics" > fix, so it should apply to the entire environment. > > Or am I naive? > I'm reading between the lines and assuming we're not the only user of their CI complaining about this environment change. ;) -------------- next part -------------- An HTML attachment was scrubbed... URL: From vano at mail.mipt.ru Mon Apr 1 23:14:00 2019 From: vano at mail.mipt.ru (Ivan Pozdeev) Date: Tue, 2 Apr 2019 06:14:00 +0300 Subject: [Python-Dev] Strange umask(?)/st_mode issue In-Reply-To: References: <20190401223539.GA47505@cskk.homeip.net> Message-ID: <67704a69-45aa-47dc-8b97-e18e886752bf@mail.mipt.ru> On 02.04.2019 1:44, Steve Dower wrote: > On 01Apr2019 1535, Cameron Simpson wrote: >> On 01Apr2019 09:12, Steve Dower wrote: >>> On 30Mar2019 1130, Gregory P. 
Smith wrote: >>>> I wouldn't expect it to be the case in a CI environment but I believe a umask can be overridden if the filesystem is mounted and >>>> configured with acls set? (oh, hah, Ivan just said the same thing) >>> >>> Yep, it appears this is the case. The Pipelines team got back to me and it seems to be a known issue - the workaround they gave me was >>> to run "sudo setfacl -Rb /home/vsts" at the start, so I've merged that in for now (to master and 3.7). >> >> Could that be done _without_ sudo to just the local directory containing the test tar file? If that works then you don't need any nasty >> privileged sudo use (which will just break on platforms without sudo anyway). > > I tried something similar to that and it didn't work. My guess is it's to do with the actual mount point? (I also tried without sudo at > first, and when I didn't work, I tried it with sudo. I hear that's how to decide whether you need it or not ;) ) > > In any case, it only applies to the Azure Pipelines build definition, so there aren't any other platforms where it'll be used. > https://github.com/python/cpython/pull/12655 > Cheers, > Steve > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/vano%40mail.mipt.ru -- Regards, Ivan From Peixing.Xin at windriver.com Tue Apr 2 05:46:14 2019 From: Peixing.Xin at windriver.com (Xin, Peixing) Date: Tue, 2 Apr 2019 09:46:14 +0000 Subject: [Python-Dev] =?windows-1252?q?how_to_rerun_the_job_=93Azure_Pipe?= =?windows-1252?q?lines_PR=94=3F?= Message-ID: <8488FBC4EAAC5941BA4B85DD1ECCF1870133BAAF35@ALA-MBD.corp.ad.wrs.com> Hi, Experts: Anyone can tell how to rerun the job ?Azure Pipelines PR? for my PR? Sometimes my PR failed but this is caused by externals. The next day this external issue was fixed then I might want to rerun this specific job on my PR to get the new result. How can I reach this? [cid:image001.png at 01D4E97B.F67B1E20] Thanks, Peixing -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 4252 bytes Desc: image001.png URL: From tir.karthi at gmail.com Tue Apr 2 08:22:56 2019 From: tir.karthi at gmail.com (Karthikeyan) Date: Tue, 2 Apr 2019 17:52:56 +0530 Subject: [Python-Dev] =?utf-8?q?how_to_rerun_the_job_=E2=80=9CAzure_Pipel?= =?utf-8?b?aW5lcyBQUuKAnT8=?= In-Reply-To: <8488FBC4EAAC5941BA4B85DD1ECCF1870133BAAF35@ALA-MBD.corp.ad.wrs.com> References: <8488FBC4EAAC5941BA4B85DD1ECCF1870133BAAF35@ALA-MBD.corp.ad.wrs.com> Message-ID: Closing and re-opening the PR will trigger the CI run again that might help in this case but it will run all the jobs. -- Regards, Karthikeyan S -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: image001.png Type: image/png Size: 4252 bytes Desc: not available URL: From cspealma at redhat.com Tue Apr 2 11:17:20 2019 From: cspealma at redhat.com (Calvin Spealman) Date: Tue, 2 Apr 2019 11:17:20 -0400 Subject: [Python-Dev] PEP-582 and multiple Python installations Message-ID: (I originally posted this to python-ideas, where I was told none of this PEP's authors subscribe so probably no one will see it there, so I'm posting it here to raise the issue where it can get seen and hopefully discussed) While the PEP does show the version number as part of the path to the actual packages, implying support for multiple versions, this doesn't seem to be spelled out in the actual text. Presumably __pypackages__/3.8/ might sit beside __pypackages__/3.9/, etc. to keep future versions capable of installing packages for each version, the way virtualenv today is bound to one version of Python. I'd like to raise a potential edge case that might be a problem, and likely an increasingly common one: users with multiple installations of the *same* version of Python. This is actually a common setup for Windows users who use WSL, Microsoft's Linux-on-Windows solution, as you could have both the Windows and Linux builds of a given Python version installed on the same machine. The currently implied support for multiple versions would not be able to separate these and could create problems if users pip install a Windows binary package through Powershell and then try to run a script in Bash from the same directory, causing the Linux version of Python to try to use Windows python packages. I'm not actually sure what the solution here is. Mostly I wanted to raise the concern, because I'm very keen on WSL being a great entry path for new developers and I want to make that a better experience, not a more confusing one. Maybe that version number could include some other unique identify, maybe based on Python's own executable. A hash maybe? I don't know if anything like that already exists to uniquely identify a Python build or installation. -- CALVIN SPEALMAN SENIOR QUALITY ENGINEER cspealma at redhat.com M: +1.336.210.5107 TRIED. TESTED. TRUSTED. -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve.dower at python.org Tue Apr 2 12:09:30 2019 From: steve.dower at python.org (Steve Dower) Date: Tue, 2 Apr 2019 09:09:30 -0700 Subject: [Python-Dev] =?utf-8?q?how_to_rerun_the_job_=E2=80=9CAzure_Pipel?= =?utf-8?b?aW5lcyBQUuKAnT8=?= In-Reply-To: References: <8488FBC4EAAC5941BA4B85DD1ECCF1870133BAAF35@ALA-MBD.corp.ad.wrs.com> Message-ID: <05055b7d-922f-2b93-4425-4068564313ad@python.org> On 02Apr2019 0522, Karthikeyan wrote: > Closing and re-opening the PR will trigger the CI run again that might > help in this case but it will run all the jobs. Yes, I believe this is still the best way to re-run Pipelines jobs. For people with logins (not yet everyone in the GitHub org, but I hear that's coming) you can requeue the build, but last time I tried it didn't sync back to the pull request properly (I think it needs GitHub to cooperate, which is why triggering it from GitHub works best.) The Pipelines team is aware of this and working on it, so I expect the integration to improve over time. For now, close/reopen the PR. 
Cheers, Steve From steve.dower at python.org Tue Apr 2 12:10:59 2019 From: steve.dower at python.org (Steve Dower) Date: Tue, 2 Apr 2019 09:10:59 -0700 Subject: [Python-Dev] PEP-582 and multiple Python installations In-Reply-To: References: Message-ID: <2b889555-db6f-6c69-0347-ebb89d6fec21@python.org> On 02Apr2019 0817, Calvin Spealman wrote: > (I originally posted this to python-ideas, where I was told none of this > PEP's authors subscribe so probably no one will see it there, so I'm > posting it here to raise the issue where it can get seen and hopefully > discussed) Correct, thanks for posting. (I thought we had a "discussions-to" tag with distutils-sig on it, but apparently not.) > While the PEP does show the version number as part of the path to the > actual packages, implying support for multiple versions, this doesn't > seem to be spelled out in the actual text. Presumably > __pypackages__/3.8/ might sit beside __pypackages__/3.9/, etc. to keep > future versions capable of installing packages for each version, the way > virtualenv today is bound to one version of Python. > > I'd like to raise a potential edge case that might be a problem, and > likely an increasingly common one: users with multiple installations of > the *same* version of Python. This is actually a common setup for > Windows users who use WSL, Microsoft's Linux-on-Windows solution, as you > could have both the Windows and Linux builds of a given Python version > installed on the same machine. The currently implied support for > multiple versions would not be able to separate these and could create > problems if users pip install a Windows binary package through > Powershell and then try to run a script in Bash from the same directory, > causing the Linux version of Python to try to use Windows python packages. > > I'm not actually sure what the solution here is. Mostly I wanted to > raise the concern, because I'm very keen on WSL being a great entry path > for new developers and I want to make that a better experience, not a > more confusing one. Maybe that version number could include some other > unique identify, maybe based on Python's own executable. A hash maybe? I > don't know if anything like that already exists to uniquely identify a > Python build or installation. Yes, this is a situation we're aware of, and it's caught in the conflict of "who is this feature meant to support". Since all platforms have a unique extension module suffix (e.g. "module.cp38-win32.pyd"), it would be possible to support this with "fat" packages that include all binaries (or some clever way of merging wheels for multiple platforms). And since this is already in CPython itself, it leads to about the only reasonable solution - instead of "3.8", use the extension module suffix "cp38-win32". (Wheel tags are not in core CPython, so we can't use those.) But while this seems obvious, it also reintroduces problems that this has the potential to fix - suddenly, just like installing into your global environment, your packages are not project-specific anymore but are Python-specific. Which is one of the major confusions people run into ("I pip installed X but now can't import it in python"). So the main points of discussion right now are "whose problem does this solve" and "when do we tell people they need a full venv". 
And that discussion is mostly happening at https://discuss.python.org/t/pep-582-python-local-packages-directory/963/ Cheers, Steve From pviktori at redhat.com Tue Apr 2 08:49:56 2019 From: pviktori at redhat.com (Petr Viktorin) Date: Tue, 2 Apr 2019 14:49:56 +0200 Subject: [Python-Dev] PEP 590 discussion In-Reply-To: <5C9FEF82.50207@UGent.be> References: <6c8356e3-f9b2-c39e-63c4-17f146d326b7@hotpy.org> <15b8a3d7-00ed-a5eb-475c-a3adee671b5f@hotpy.org> <5C9FEF82.50207@UGent.be> Message-ID: On 3/30/19 11:36 PM, Jeroen Demeyer wrote: > On 2019-03-30 17:30, Mark Shannon wrote: >> 2. The claim that PEP 580 allows "certain optimizations because other >> code can make assumptions" is flawed. In general, the caller cannot make >> assumptions about the callee or vice-versa. Python is a dynamic language. > > PEP 580 is meant for extension classes, not Python classes. Extension > classes are not dynamic. When you implement tp_call in a given way, the > user cannot change it. So if a class implements the C call protocol or > the vectorcall protocol, callers can make assumptions about what that > means. > >> PEP 579 is mainly a list of supposed flaws with the >> 'builtin_function_or_method' class. >> The general thrust of PEP 579 seems to be that builtin-functions and >> builtin-methods should be more flexible and extensible than they are. I >> don't agree. If you want different behaviour, then use a different >> object. Don't try an cram all this extra behaviour into a pre-existing >> object. > > I think that there is a misunderstanding here. I fully agree with the > "use a different object" solution. This isn't a new solution: it's > already possible to implement those different objects (Cython does it). > It's just that this solution comes at a performance cost and that's what > we want to avoid. It does seem like there is some misunderstanding. PEP 580 defines a CCall structure, which includes the function pointer, flags, "self" and "parent". Like the current implementation, it has various METH_ flags for various C signatures. When called, the info from CCall is matched up (in relatively complex ways) to what the C function expects. PEP 590 only adds the "vectorcall". It does away with flags and only has one C signatures, which is designed to fit all the existing ones, and is well optimized. Storing the "self"/"parent", and making sure they're passed to the C function is the responsibility of the callable object. There's an optimization for "self" (offsetting using PY_VECTORCALL_ARGUMENTS_OFFSET), and any supporting info can be provided as part of "self". >> I'll reiterate that PEP 590 is more general than PEP 580 and that once >> the callable's code has access to the callable object (as both PEPs >> allow) then anything is possible. You can't can get more extensible than >> that. Anything is possible, but if one of the possibilities becomes common and useful, PEP 590 would make it hard to optimize for it. Python has grown many "METH_*" signatures over the years as we found more things that need to be passed to callables. Why would "METH_VECTORCALL" be the last? If it won't (if you think about it as one more way to call functions), then dedicating a tp_* slot to it sounds quite expensive. 
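[For reference while comparing the two designs: a minimal sketch of the single C signature that PEP 590 builds on. This is a paraphrase of the draft rather than an excerpt, so the exact spelling -- including whether the args array is const, which is questioned below -- may differ.]

    #include <Python.h>

    /* Sketch of PEP 590's "vectorcall" convention. nargsf carries the number
     * of positional arguments and may have PY_VECTORCALL_ARGUMENTS_OFFSET
     * or-ed in, meaning the callee may temporarily overwrite args[-1] (for
     * example to prepend "self" without copying the argument array).
     * kwnames is NULL or a tuple of keyword-argument names; the corresponding
     * values follow the positional arguments in args. */
    typedef PyObject *(*vectorcallfunc)(PyObject *callable,
                                        PyObject *const *args,
                                        size_t nargsf,
                                        PyObject *kwnames);

    /* A callee recovers the plain positional count with something like: */
    #define EXAMPLE_VECTORCALL_NARGS(n) \
        ((Py_ssize_t)((n) & ~PY_VECTORCALL_ARGUMENTS_OFFSET))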
In one of the ways to call C functions in PEP 580, the function gets access to: - the arguments, - "self", the object - the class that the method was found in (which is not necessarily type(self)) I still have to read the details, but when combined with LOAD_METHOD/CALL_METHOD optimization (avoiding creation of a "bound method" object), it seems impossible to do this efficiently with just the callable's code and callable's object. > I would argue the opposite: PEP 590 defines a fixed protocol that is not > easy to extend. PEP 580 on the other hand uses a new data structure > PyCCallDef which could easily be extended in the future (this will > intentionally never be part of the stable ABI, so we can do that). > > I have also argued before that the generality of PEP 590 is a bad thing > rather than a good thing: by defining a more rigid protocol as in PEP > 580, more optimizations are possible. > >> PEP 580 has the same limitation for the same reasons. The limitation is >> necessary for correctness if an object supports calls via `__call__` and >> through another calling convention. > > I don't think that this limitation is needed in either PEP. As I > explained at the top of this email, it can easily be solved by not using > the protocol for Python classes. What is wrong with my proposal in PEP > 580: https://www.python.org/dev/peps/pep-0580/#inheritance I'll add Jeroen's notes from the review of the proposed PEP 590 (https://github.com/python/peps/pull/960): The statement "PEP 580 is specifically targetted at function-like objects, and doesn't support other callables like classes, partial functions, or proxies" is factually false. The motivation for PEP 580 is certainly function/method-like objects but it's a general protocol that every class can implement. For certain classes, it may not be easy or desirable to do that but it's always possible. Given that `PY_METHOD_DESCRIPTOR` is a flag for tp_flags, shouldn't it be called `Py_TPFLAGS_METHOD_DESCRIPTOR` or something? Py_TPFLAGS_HAVE_VECTOR_CALL should be Py_TPFLAGS_HAVE_VECTORCALL, to be consistent with tp_vectorcall_offset and other uses of "vectorcall" (not "vector call") And mine, so far: I'm not clear on the constness of the "args" array. If it is mutable (PyObject **), you can't, for example, directly pass a tuple's storage (or any other array that could be used in the call). If it is not (PyObject * const *), you can't insert the "self" argument in. The reference implementations seems to be inconsistent here. What's the intention? From mark at hotpy.org Tue Apr 2 15:38:23 2019 From: mark at hotpy.org (Mark Shannon) Date: Tue, 2 Apr 2019 20:38:23 +0100 Subject: [Python-Dev] PEP 580/590 discussion In-Reply-To: <5CA1A217.1030007@UGent.be> References: <6c8356e3-f9b2-c39e-63c4-17f146d326b7@hotpy.org> <15b8a3d7-00ed-a5eb-475c-a3adee671b5f@hotpy.org> <5C9FEF82.50207@UGent.be> <5CA1A217.1030007@UGent.be> Message-ID: Hi, On 01/04/2019 6:31 am, Jeroen Demeyer wrote: > I added benchmarks for PEP 590: > > https://gist.github.com/jdemeyer/f0d63be8f30dc34cc989cd11d43df248 Thanks. As expected for calls to C function for both PEPs and master perform about the same, as they are using almost the same calling convention under the hood. 
As an example of the advantage that a general fast calling convention gives you, I have implemented the vectorcall versions of list() and range() https://github.com/markshannon/cpython/compare/vectorcall-minimal...markshannon:vectorcall-examples Which gives a roughly 30% reduction in time for creating ranges, or lists from small tuples. https://gist.github.com/markshannon/5cef3a74369391f6ef937d52cca9bfc8 Cheers, Mark. From mark at hotpy.org Tue Apr 2 17:12:11 2019 From: mark at hotpy.org (Mark Shannon) Date: Tue, 2 Apr 2019 22:12:11 +0100 Subject: [Python-Dev] PEP 590 discussion In-Reply-To: References: <6c8356e3-f9b2-c39e-63c4-17f146d326b7@hotpy.org> <15b8a3d7-00ed-a5eb-475c-a3adee671b5f@hotpy.org> <5C9FEF82.50207@UGent.be> Message-ID: <421f8182-4bc8-b8cf-82d6-ca4a4fbd2013@hotpy.org> Hi, On 02/04/2019 1:49 pm, Petr Viktorin wrote: > On 3/30/19 11:36 PM, Jeroen Demeyer wrote: >> On 2019-03-30 17:30, Mark Shannon wrote: >>> 2. The claim that PEP 580 allows "certain optimizations because other >>> code can make assumptions" is flawed. In general, the caller cannot make >>> assumptions about the callee or vice-versa. Python is a dynamic >>> language. >> >> PEP 580 is meant for extension classes, not Python classes. Extension >> classes are not dynamic. When you implement tp_call in a given way, >> the user cannot change it. So if a class implements the C call >> protocol or the vectorcall protocol, callers can make assumptions >> about what that means. >> >>> PEP 579 is mainly a list of supposed flaws with the >>> 'builtin_function_or_method' class. >>> The general thrust of PEP 579 seems to be that builtin-functions and >>> builtin-methods should be more flexible and extensible than they are. I >>> don't agree. If you want different behaviour, then use a different >>> object. Don't try an cram all this extra behaviour into a pre-existing >>> object. >> >> I think that there is a misunderstanding here. I fully agree with the >> "use a different object" solution. This isn't a new solution: it's >> already possible to implement those different objects (Cython does >> it). It's just that this solution comes at a performance cost and >> that's what we want to avoid. > > It does seem like there is some misunderstanding. > > PEP 580 defines a CCall structure, which includes the function pointer, > flags, "self" and "parent". Like the current implementation, it has > various METH_ flags for various C signatures. When called, the info from > CCall is matched up (in relatively complex ways) to what the C function > expects. > > PEP 590 only adds the "vectorcall". It does away with flags and only has > one C signatures, which is designed to fit all the existing ones, and is > well optimized. Storing the "self"/"parent", and making sure they're > passed to the C function is the responsibility of the callable object. > There's an optimization for "self" (offsetting using > PY_VECTORCALL_ARGUMENTS_OFFSET), and any supporting info can be provided > as part of "self". > >>> I'll reiterate that PEP 590 is more general than PEP 580 and that once >>> the callable's code has access to the callable object (as both PEPs >>> allow) then anything is possible. You can't can get more extensible than >>> that. > > Anything is possible, but if one of the possibilities becomes common and > useful, PEP 590 would make it hard to optimize for it. > Python has grown many "METH_*" signatures over the years as we found > more things that need to be passed to callables. Why would > "METH_VECTORCALL" be the last? 
If it won't (if you think about it as one > more way to call functions), then dedicating a tp_* slot to it sounds > quite expensive. I doubt METH_VECTORCALL will be the last. Let me give you an example: It is quite common for a function to take two arguments, so we might want add a METH_OO flag for builtin-functions with 2 parameters. To support this in PEP 590, you would make exactly the same change as you would now; which is to add another case to the switch statement in _PyCFunction_FastCallKeywords. For PEP 580, you would add another case to the switch in PyCCall_FastCall. No difference really. PEP 580 uses a slot as well. It's only 8 bytes per class. > > > In one of the ways to call C functions in PEP 580, the function gets > access to: > - the arguments, > - "self", the object > - the class that the method was found in (which is not necessarily > type(self)) > I still have to read the details, but when combined with > LOAD_METHOD/CALL_METHOD optimization (avoiding creation of a "bound > method" object), it seems impossible to do this efficiently with just > the callable's code and callable's object. It is possible, and relatively straightforward. Why do you think it is impossible? > > >> I would argue the opposite: PEP 590 defines a fixed protocol that is >> not easy to extend. PEP 580 on the other hand uses a new data >> structure PyCCallDef which could easily be extended in the future >> (this will intentionally never be part of the stable ABI, so we can do >> that). >> >> I have also argued before that the generality of PEP 590 is a bad >> thing rather than a good thing: by defining a more rigid protocol as >> in PEP 580, more optimizations are possible. >> >>> PEP 580 has the same limitation for the same reasons. The limitation is >>> necessary for correctness if an object supports calls via `__call__` and >>> through another calling convention. >> >> I don't think that this limitation is needed in either PEP. As I >> explained at the top of this email, it can easily be solved by not >> using the protocol for Python classes. What is wrong with my proposal >> in PEP 580: https://www.python.org/dev/peps/pep-0580/#inheritance > > > I'll add Jeroen's notes from the review of the proposed PEP 590 > (https://github.com/python/peps/pull/960): > > The statement "PEP 580 is specifically targetted at function-like > objects, and doesn't support other callables like classes, partial > functions, or proxies" is factually false. The motivation for PEP 580 is > certainly function/method-like objects but it's a general protocol that > every class can implement. For certain classes, it may not be easy or > desirable to do that but it's always possible. > > Given that `PY_METHOD_DESCRIPTOR` is a flag for tp_flags, shouldn't it > be called `Py_TPFLAGS_METHOD_DESCRIPTOR` or something? > > Py_TPFLAGS_HAVE_VECTOR_CALL should be Py_TPFLAGS_HAVE_VECTORCALL, to be > consistent with tp_vectorcall_offset and other uses of "vectorcall" (not > "vector call") > Thanks for the comments, I'll update the PEP when I get the chance. > > And mine, so far: > > I'm not clear on the constness of the "args" array. > If it is mutable (PyObject **), you can't, for example, directly pass a > tuple's storage (or any other array that could be used in the call). > If it is not (PyObject * const *), you can't insert the "self" argument in. > The reference implementations seems to be inconsistent here. What's the > intention? > I'll make it clearer in the PEP. 
My thinking was that if `PY_VECTORCALL_ARGUMENTS_OFFSET` is set then the caller is allowing the callee to mutate element -1. It would make sense to generalise that to any element of the vector (including -1). When passing the contents of a tuple, `PY_VECTORCALL_ARGUMENTS_OFFSET` should not be set, and thus the vector could not be mutated. Cheers, Mark. From J.Demeyer at UGent.be Wed Apr 3 01:33:49 2019 From: J.Demeyer at UGent.be (Jeroen Demeyer) Date: Wed, 3 Apr 2019 07:33:49 +0200 Subject: [Python-Dev] PEP 590 discussion In-Reply-To: <421f8182-4bc8-b8cf-82d6-ca4a4fbd2013@hotpy.org> References: <6c8356e3-f9b2-c39e-63c4-17f146d326b7@hotpy.org> <15b8a3d7-00ed-a5eb-475c-a3adee671b5f@hotpy.org> <5C9FEF82.50207@UGent.be> <421f8182-4bc8-b8cf-82d6-ca4a4fbd2013@hotpy.org> Message-ID: <5CA445BD.4040705@UGent.be> >> In one of the ways to call C functions in PEP 580, the function gets >> access to: >> - the arguments, >> - "self", the object >> - the class that the method was found in (which is not necessarily >> type(self)) >> I still have to read the details, but when combined with >> LOAD_METHOD/CALL_METHOD optimization (avoiding creation of a "bound >> method" object), it seems impossible to do this efficiently with just >> the callable's code and callable's object. > > It is possible, and relatively straightforward. Access to the class isn't possible currently and also not with PEP 590. But it's easy enough to fix that: PEP 573 adds a new METH_METHOD flag to change the signature of the C function (not the vectorcall wrapper). PEP 580 supports this "out of the box" because I'm reusing the class also to do type checks. But this shouldn't be an argument for or against either PEP. From J.Demeyer at UGent.be Wed Apr 3 01:43:28 2019 From: J.Demeyer at UGent.be (Jeroen Demeyer) Date: Wed, 3 Apr 2019 07:43:28 +0200 Subject: [Python-Dev] PEP 580/590 discussion In-Reply-To: References: <6c8356e3-f9b2-c39e-63c4-17f146d326b7@hotpy.org> <15b8a3d7-00ed-a5eb-475c-a3adee671b5f@hotpy.org> <5C9FEF82.50207@UGent.be> <5CA1A217.1030007@UGent.be> Message-ID: <5CA44800.9050901@UGent.be> On 2019-04-02 21:38, Mark Shannon wrote: > Hi, > > On 01/04/2019 6:31 am, Jeroen Demeyer wrote: >> I added benchmarks for PEP 590: >> >> https://gist.github.com/jdemeyer/f0d63be8f30dc34cc989cd11d43df248 > > Thanks. As expected for calls to C function for both PEPs and master > perform about the same, as they are using almost the same calling > convention under the hood. While they are "about the same", in general PEP 580 is slightly faster than master and PEP 590. And PEP 590 actually has a minor slow-down for METH_VARARGS calls. I think that this happens because PEP 580 has less levels of indirection than PEP 590. The vectorcall protocol (PEP 590) changes a slower level (tp_call) by a faster level (vectorcall), while PEP 580 just removes that level entirely: it calls the C function directly. This shows that PEP 580 is really meant to have maximal performance in all cases, accidentally even making existing code faster. Jeroen. From J.Demeyer at UGent.be Wed Apr 3 11:41:06 2019 From: J.Demeyer at UGent.be (Jeroen Demeyer) Date: Wed, 3 Apr 2019 17:41:06 +0200 Subject: [Python-Dev] PEP 590 vs. 
bpo-29259 In-Reply-To: <5CA445BD.4040705@UGent.be> References: <6c8356e3-f9b2-c39e-63c4-17f146d326b7@hotpy.org> <15b8a3d7-00ed-a5eb-475c-a3adee671b5f@hotpy.org> <5C9FEF82.50207@UGent.be> <421f8182-4bc8-b8cf-82d6-ca4a4fbd2013@hotpy.org> <5CA445BD.4040705@UGent.be> Message-ID: <5CA4D412.7050906@UGent.be> As I'm reading the PEP 590 reference implementation, it strikes me how similar it is to https://bugs.python.org/issue29259 The main difference is that bpo-29259 has a per-class pointer tp_fastcall instead of a per-object pointer. But actually, the PEP 590 reference implementation does not make much use of the per-object pointer: for all classes except "type", the vectorcall wrapper is the same for all objects of a given type. One thing that bpo-29259 did not realize is that existing optimizations could be dropped in favor of using tp_fastcall. For example, bpo-29259 has code like if (PyFunction_Check(callable)) { return _PyFunction_FastCallKeywords(...); } if (PyCFunction_Check(callable)) { return _PyCFunction_FastCallKeywords(...); } else if (PyType_HasFeature(..., Py_TPFLAGS_HAVE_FASTCALL) ...) but the first 2 branches are superfluous given the third. Anyway, this is just putting PEP 590 a bit in perspective. It doesn't say anything about the merits of PEP 590. Jeroen. From J.Demeyer at UGent.be Thu Apr 4 07:51:40 2019 From: J.Demeyer at UGent.be (Jeroen Demeyer) Date: Thu, 4 Apr 2019 13:51:40 +0200 Subject: [Python-Dev] Deprecating "instance method" class Message-ID: <5CA5EFCC.2030400@UGent.be> During my investigations related to low-level function/method classes, I came across the "instance method" class. There is a C API for it: https://docs.python.org/3.7/c-api/method.html However, it's not used/exposed anywhere in CPython, except as _testcapi.instancemethod (for testing its functionality) This class was discussed at https://mail.python.org/pipermail/python-3000/2007-December/011456.html and implemented in https://bugs.python.org/issue1587 Reading that old thread, there are use cases presented related to classic classes, wrapping Kogut (http://kokogut.sourceforge.net/kogut.html) objects and Pyrex. But classic classes no longer exist and the latter two use cases aren't actually needed if you read the thread to the end. So there are no surviving use cases from that thread. Does anybody know actual use cases or any code in the wild using it? To me, the fact that it's only exposed in the C API is a good sign that it's not really useful. So, should we deprecate the instance method class? Jeroen. From christian at python.org Thu Apr 4 08:09:44 2019 From: christian at python.org (Christian Heimes) Date: Thu, 4 Apr 2019 14:09:44 +0200 Subject: [Python-Dev] Deprecating "instance method" class In-Reply-To: <5CA5EFCC.2030400@UGent.be> References: <5CA5EFCC.2030400@UGent.be> Message-ID: On 04/04/2019 13.51, Jeroen Demeyer wrote: > During my investigations related to low-level function/method classes, I > came across the "instance method" class. There is a C API for it: > https://docs.python.org/3.7/c-api/method.html > However, it's not used/exposed anywhere in CPython, except as > _testcapi.instancemethod (for testing its functionality) > > This class was discussed at > https://mail.python.org/pipermail/python-3000/2007-December/011456.html > and implemented in https://bugs.python.org/issue1587 > Reading that old thread, there are use cases presented related to > classic classes, wrapping Kogut > (http://kokogut.sourceforge.net/kogut.html) objects and Pyrex. 
But > classic classes no longer exist and the latter two use cases aren't > actually needed if you read the thread to the end. So there are no > surviving use cases from that thread. > > Does anybody know actual use cases or any code in the wild using it? To > me, the fact that it's only exposed in the C API is a good sign that > it's not really useful. You are drawing the wrong conclusion here. The feature was explicitly designed for C code and C API wrappers like swig and Cython to make adaption to Python 3 simpler. I implemented it when I removed unbound methods. > So, should we deprecate the instance method class? I couldn't find any current code that uses PyInstanceMethod_New. Let's deprecate the feature and schedule it for removal in 3.10. Christian From J.Demeyer at UGent.be Thu Apr 4 09:45:03 2019 From: J.Demeyer at UGent.be (Jeroen Demeyer) Date: Thu, 4 Apr 2019 15:45:03 +0200 Subject: [Python-Dev] Deprecating "instance method" class In-Reply-To: References: <5CA5EFCC.2030400@UGent.be> Message-ID: <5CA60A5F.7060406@UGent.be> On 2019-04-04 14:09, Christian Heimes wrote: > I couldn't find any current code that uses PyInstanceMethod_New. Let's > deprecate the feature and schedule it for removal in 3.10. Done at https://github.com/python/cpython/pull/12685 From chris.barker at noaa.gov Thu Apr 4 12:02:00 2019 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Thu, 4 Apr 2019 09:02:00 -0700 Subject: [Python-Dev] PEP-582 and multiple Python installations In-Reply-To: References: Message-ID: > I'd like to raise a potential edge case that might be a problem, and likely an increasingly common one: users with multiple installations of the *same* version of Python. I would suggest that that use case is best addressed by a system that isolates the entire python environment, such as conda. > This is actually a common setup for Windows users who use WSL, Microsoft's Linux-on-Windows solution, as you could have both the Windows and Linux builds of a given Python version installed on the same machine. Sure, but Isn't the WSL subsystem pretty isolated already? Would native Windows and WSL users be running in the same dir? That being said, I'm pretty skeptical of the PEP -- I understand the motivation -- I make a point of avoiding virtual environments in my intro classes, but at some point folks will need to learn them. I've had students think that virtualenv was a part of (or required by) e.g. flask, because the tutorials include it in the setup. But I think environments really need to be more distinct, not less, I'm quite concerned about mingling them in one place. Maybe I'm reading it wrong, but it seems that this could create serious clashes with other "environment" systems, such as conda. I suppose one could say: "don't do that" -- I.e. don't create a __pypackages__ dir if you are going to use conda -- but many folks want the same source to be runnable in multiple "styles" of Python. Also, I see a major benefit for teaching, but it does go a bit against my philosophy of not hiding important details from newbies -- that is, don't teach using an approach that is not suitable for production. And newbies could be really confused by the fact that pip installs stuff differently depending on what dir they are in and what is in that dir. The PEP is listed as a draft -- anyone know what's going on with it?
-CHB From chris.barker at noaa.gov Thu Apr 4 18:54:46 2019 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 4 Apr 2019 15:54:46 -0700 Subject: [Python-Dev] PEP-582 and multiple Python installations In-Reply-To: References: Message-ID: Sorry somehow missed Steve Dower's post: that discussion is mostly happening at https://discuss.python.org/t/pep-582-python-local-packages-directory/963/ I"ll go there to comment. -CHB On Thu, Apr 4, 2019 at 9:02 AM Chris Barker - NOAA Federal < chris.barker at noaa.gov> wrote: > > > I'd like to raise a potential edge case that might be a problem, and > likely an increasingly common one: users with multiple installations of the > *same* version of Python. > > I would suggest that that use case is best addressed by a system that > isolates the entire python environment, such as conda. > > > This is actually a common setup for Windows users who use WSL, > Microsoft's Linux-on-Windows solution, as you could have both the Windows > and Linux builds of a given Python version installed on the same machine. > > Sure, but Isn?t the WSL subsystem pretty isolated already? Would native > Windows and WSL users be running in the same dir? > > That being said, I?m pretty skeptical of the PEP ? I understand the > motivation ? I make a point of avoiding virtual environments in my intro > classes, but at some point folks will need to learn them. > > I?ve had students think that virtualenv was a part of (or required by) > e.g. flask, because the tutorials include it in the setup. > > But I think environments really need to be more distinct, not less, I?m > quite concerned about mingling them in one place. > > Maybe I?m reading it wrong, but it seems that this could create serious > clashes with other ?environment? systems, such as conda. > > I suppose one could say: ?don?t do that? ? I.e. don?t create a > __pypackages__ dir if you are going to use conda ? but many folks want the > same source to be runnable in multiple ?styles? of Python. > > Also, I see a major benefit for teaching, but it does go a bit against my > philosophy of not hiding important details from newbies ? that is, don?t > teach using an approach that is not suitable for production. > > And newbies could be really confused by the fact that pip installs stuff > differently depending on what dir they are in and what is in that dir. > > The PEP is listed as a draft ? anyone know what?s going on with it? > > -CHB > > -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg.ewing at canterbury.ac.nz Thu Apr 4 18:57:00 2019 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 05 Apr 2019 11:57:00 +1300 Subject: [Python-Dev] Deprecating "instance method" class In-Reply-To: References: <5CA5EFCC.2030400@UGent.be> Message-ID: <5CA68BBC.8060205@canterbury.ac.nz> Christian Heimes wrote: > I couldn't find any current code that uses PyInstanceMethod_New. Let's > deprecate the feature and schedule it for removal in 3.10. If it's designed for use by things outside of CPython, how can you be sure nothing is using it? 
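[For concreteness, since the question is what this class is actually for: a minimal sketch of how an extension module can use the documented C API (the c-api/method page linked earlier in the thread) to make a plain C function bind like a Python method. The PyCFunction_New() + PyInstanceMethod_New() combination is the one also mentioned later in this thread; the function and attribute names below are invented for the example.]

    #include <Python.h>

    static PyObject *
    example_method(PyObject *self, PyObject *args)
    {
        /* ... an ordinary METH_VARARGS implementation ... */
        Py_RETURN_NONE;
    }

    static PyMethodDef example_method_def = {
        "example_method", example_method, METH_VARARGS, NULL
    };

    /* Add the function to a type's dict so that it binds "self" implicitly.
     * A bare PyCFunction in a class dict behaves like a staticmethod; the
     * instancemethod wrapper turns it into a descriptor that passes the
     * instance as the first argument. */
    static int
    add_bound_method(PyObject *type_dict)
    {
        PyObject *func = PyCFunction_New(&example_method_def, NULL);
        if (func == NULL) {
            return -1;
        }
        PyObject *meth = PyInstanceMethod_New(func);
        Py_DECREF(func);
        if (meth == NULL) {
            return -1;
        }
        int res = PyDict_SetItemString(type_dict, "example_method", meth);
        Py_DECREF(meth);
        return res;
    }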
-- Greg From J.Demeyer at UGent.be Fri Apr 5 02:07:25 2019 From: J.Demeyer at UGent.be (Jeroen Demeyer) Date: Fri, 5 Apr 2019 08:07:25 +0200 Subject: [Python-Dev] Deprecating "instance method" class In-Reply-To: <5CA68BBC.8060205@canterbury.ac.nz> References: <5CA5EFCC.2030400@UGent.be> <5CA68BBC.8060205@canterbury.ac.nz> Message-ID: <5CA6F09D.3000900@UGent.be> On 2019-04-05 00:57, Greg Ewing wrote: > If it's designed for use by things outside of CPython, how > can you be sure nothing is using it? Of course I'm not sure. However: 1. So far, nobody in this thread knows of any code using it. 2. So far, nobody in this thread knows any use case for it. And if we end up deprecating and it was a mistake, we can easily revert the deprecation. From storchaka at gmail.com Fri Apr 5 08:10:54 2019 From: storchaka at gmail.com (Serhiy Storchaka) Date: Fri, 5 Apr 2019 15:10:54 +0300 Subject: [Python-Dev] Deprecating "instance method" class In-Reply-To: <5CA6F09D.3000900@UGent.be> References: <5CA5EFCC.2030400@UGent.be> <5CA68BBC.8060205@canterbury.ac.nz> <5CA6F09D.3000900@UGent.be> Message-ID: 05.04.19 09:07, Jeroen Demeyer ????: > On 2019-04-05 00:57, Greg Ewing wrote: >> If it's designed for use by things outside of CPython, how >> can you be sure nothing is using it? > > Of course I'm not sure. However: > > 1. So far, nobody in this thread knows of any code using it. > > 2. So far, nobody in this thread knows any use case for it. > > And if we end up deprecating and it was a mistake, we can easily revert > the deprecation. I have a use case. I did not know this before, but it can be used to implement accelerated versions of separate methods instead of the whole class. I'm going to use it to further optimize total_ordering. Thanks Josh Rosenberg for the tip. From J.Demeyer at UGent.be Fri Apr 5 07:27:17 2019 From: J.Demeyer at UGent.be (Jeroen Demeyer) Date: Fri, 5 Apr 2019 13:27:17 +0200 Subject: [Python-Dev] Deprecating "instance method" class In-Reply-To: References: <5CA5EFCC.2030400@UGent.be> <5CA68BBC.8060205@canterbury.ac.nz> <5CA6F09D.3000900@UGent.be> Message-ID: <5CA73B95.6040509@UGent.be> On 2019-04-05 14:10, Serhiy Storchaka wrote: > it can be used to > implement accelerated versions of separate methods instead of the whole > class. Could you elaborate? I'm curious what you mean. > I'm going to use it to further optimize total_ordering. There are so many ways in which total_ordering is inefficient. If you really want it to be efficient, you should just implement it in C. From storchaka at gmail.com Fri Apr 5 09:13:26 2019 From: storchaka at gmail.com (Serhiy Storchaka) Date: Fri, 5 Apr 2019 16:13:26 +0300 Subject: [Python-Dev] Deprecating "instance method" class In-Reply-To: <5CA73B95.6040509@UGent.be> References: <5CA5EFCC.2030400@UGent.be> <5CA68BBC.8060205@canterbury.ac.nz> <5CA6F09D.3000900@UGent.be> <5CA73B95.6040509@UGent.be> Message-ID: 05.04.19 14:27, Jeroen Demeyer ????: > On 2019-04-05 14:10, Serhiy Storchaka wrote: >> it can be used to >> implement accelerated versions of separate methods instead of the whole >> class. > > Could you elaborate? I'm curious what you mean. It is easy to implement a function in C. But there is a difference between functions implemented in Python and C -- the latter are not descriptors. They behave like static methods when assigned to a class attribute, i.e. there is no implicit passing of the "self" argument. >> I'm going to use it to further optimize total_ordering. > > There are so many ways in which total_ordering is inefficient. 
If you > really want it to be efficient, you should just implement it in C. Yes, this is what I want to do. I did not do this only because implementing method-like functions which which do not belong to concrete class implemented in C is not convention. But PyInstanceMethod_New() should help. From J.Demeyer at UGent.be Fri Apr 5 08:33:29 2019 From: J.Demeyer at UGent.be (Jeroen Demeyer) Date: Fri, 5 Apr 2019 14:33:29 +0200 Subject: [Python-Dev] Deprecating "instance method" class In-Reply-To: References: <5CA5EFCC.2030400@UGent.be> <5CA68BBC.8060205@canterbury.ac.nz> <5CA6F09D.3000900@UGent.be> <5CA73B95.6040509@UGent.be> Message-ID: <5CA74B19.70806@UGent.be> On 2019-04-05 15:13, Serhiy Storchaka wrote: > It is easy to implement a function in C. Why does it need to be a PyCFunction? You could put an actual method descriptor in the class. In other words, use PyDescr_NewMethod() instead of PyCFunction_New() + PyInstanceMethod_New(). It's probably going to be faster too since the instancemethod adds an unoptimized extra level of indirection. > Yes, this is what I want to do. I did not do this only because > implementing method-like functions which which do not belong to concrete > class implemented in C is not convention. Sure, you could implement separate methods like __gt__ in C, but that's still less efficient than just implementing a specific tp_richcompare for total_ordering and then having the usual wrapper descriptors for __gt__. From guido at python.org Fri Apr 5 11:46:00 2019 From: guido at python.org (Guido van Rossum) Date: Fri, 5 Apr 2019 08:46:00 -0700 Subject: [Python-Dev] Deprecating "instance method" class In-Reply-To: <5CA74B19.70806@UGent.be> References: <5CA5EFCC.2030400@UGent.be> <5CA68BBC.8060205@canterbury.ac.nz> <5CA6F09D.3000900@UGent.be> <5CA73B95.6040509@UGent.be> <5CA74B19.70806@UGent.be> Message-ID: Let's stop here. This API is doing no harm, it's not a maintenance burden, clearly *some* folks have a use for it. Let's just keep it, okay? There are bigger fish to fry. On Fri, Apr 5, 2019 at 5:36 AM Jeroen Demeyer wrote: > On 2019-04-05 15:13, Serhiy Storchaka wrote: > > It is easy to implement a function in C. > > Why does it need to be a PyCFunction? You could put an actual method > descriptor in the class. In other words, use PyDescr_NewMethod() instead > of PyCFunction_New() + PyInstanceMethod_New(). It's probably going to be > faster too since the instancemethod adds an unoptimized extra level of > indirection. > > > Yes, this is what I want to do. I did not do this only because > > implementing method-like functions which which do not belong to concrete > > class implemented in C is not convention. > > Sure, you could implement separate methods like __gt__ in C, but that's > still less efficient than just implementing a specific tp_richcompare > for total_ordering and then having the usual wrapper descriptors for > __gt__. > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From vstinner at redhat.com Fri Apr 5 12:00:31 2019 From: vstinner at redhat.com (Victor Stinner) Date: Fri, 5 Apr 2019 18:00:31 +0200 Subject: [Python-Dev] New Python Initialization API In-Reply-To: <6a17e990-7e2a-0544-1e8e-9b16d05e4df6@python.org> References: <70c245c7-8f40-a51b-934b-af958f7cd849@python.org> <119e26f0-d826-7e13-aa4a-e0a67bea3265@python.org> <91eec784-28ab-024f-42a6-8a1e5d37d9bf@python.org> <6a17e990-7e2a-0544-1e8e-9b16d05e4df6@python.org> Message-ID: Le dim. 31 mars 2019 ? 01:49, Steve Dower a ?crit : > Here is my first review of https://www.python.org/dev/peps/pep-0587/ and > in general I think it's very good. Ah nice, that's a good start :-) Thanks for reviewing it. Your email is long, and answer makes it even longer, so I will reply in multiple emails. > > ``PyWideCharList`` is a list of ``wchar_t*`` strings. > > I always forget whether "const" is valid in C99, but if it is, can we > make this a list of const strings? Short answer: no :-( This structure mostly exists to simplify the implementation. Sadly, "const PyWideCharList" doesn't automatically make PyWideCharList.items an array of "const wchar_t*". I tried some hacks to have an array of const strings... but it would be very complicated and not natural at all in C. Sadly, it's way more simple to have "wchar_t*" in practice. > I also prefer a name like ``PyWideStringList``, since that's what it is > (the other places we use WideChar in the C API refer only to a single > string, as far as I'm aware). I'm fine with this name. > > ``PyInitError`` is a structure to store an error message or an exit code > > for the Python Initialization. > > I love this struct! Currently it's private, but I wonder whether it's > worth making it public as PyError (or PyErrorInfo)? The PEP 587 makes the structure public, but I'm not sure about calling it PyError because PyInitError also allows to exit Python with an exit status which is something specific to the initialization. If you want to use a structure to reporting errors, I would prefer to add a new simpler PyError structure to only report an error message, but never exit Python. PyInitError use case is really specific to Python initialization. Moreover, the API is inefficient since it is returned by copy, not by reference. That's fine for Python initialization which only happens once and is not part of "hot code". I'm not sure if PyError would need to store the C function name where the error is triggered. Usually, we try hard to hide Python internals to the user. > > * ``exitcode`` (``int``): if greater or equal to zero, argument passed to > > ``exit()`` > > Windows is likely to need/want negative exit codes, as many system > errors are represented as 0x80000000|(source of error)|(specific code). Hum, int was used in Python 3.6 code base. We change change PyInitError.exitcode type to DWORD on Windows, but use int on Unix. We can add a private field to check if the error is an error message or an exit code. Or maybe check if the error message is NULL or not. Py_INIT_ERR(MSG) must never be called with Py_INIT_ERR(NULL) and it should be called with a static string, not with a dynamically allocated string (since the API doesn't allow to release memory). > > * ``user_err`` (int): if non-zero, the error is caused by the user > > configuration, otherwise it's an internal Python error. > > Maybe we could just encode this as "positive exitcode is user error, > negative is internal error"? 
I'm pretty sure struct return values are > passed by reference in most C calling conventions, so the size of the > struct isn't a big deal, but without a proper bool type it may look like > this is a second error code (like errno/winerror in a few places). Honestly, I'm not sure that we really have to distinguish "user error" and "internal error". It's an old debate about calling abort()/DebugBreak() or not. It seems like most users are annoyed by getting a coredump on Unix when abort() is called. Maybe we should just remove Py_INIT_USER_ERR(), always use Py_INIT_ERR(), and never call abort()/DebugBreak() in Py_ExitInitError(). Or does anyone see a good reason to trigger a debugger on an initialization error? See https://bugs.python.org/issue19983 discussion: "When interrupted during startup, Python should not call abort() but exit()" Note: I'm not talking about Py_FatalError() here, this one will not change. Victor From vstinner at redhat.com Fri Apr 5 12:12:50 2019 From: vstinner at redhat.com (Victor Stinner) Date: Fri, 5 Apr 2019 18:12:50 +0200 Subject: [Python-Dev] New Python Initialization API In-Reply-To: <6a17e990-7e2a-0544-1e8e-9b16d05e4df6@python.org> References: <70c245c7-8f40-a51b-934b-af958f7cd849@python.org> <119e26f0-d826-7e13-aa4a-e0a67bea3265@python.org> <91eec784-28ab-024f-42a6-8a1e5d37d9bf@python.org> <6a17e990-7e2a-0544-1e8e-9b16d05e4df6@python.org> Message-ID: About PyPreConfig and encodings. > The appendix is excellent, by the way. Very useful detail to have > written down. Thanks. The appendix is based on Include/cpython/coreconfig.h comments which is now my reference documentation. By the way, this header file contains more information about PyConfig fields than the PEP 587. For example, the comment on filesystem_encoding and filesystem_errors lists every single cases and exceptions (it describes the implementation). > > ``PyPreConfig`` structure is used to pre-initialize Python: > > > > * Set the memory allocator > > * Configure the LC_CTYPE locale > > * Set the UTF-8 mode > > I think we should have the isolated flag in here - oh wait, we do - I > think we should have the isolated/use_environment options listed in this > list :) My introduction paragraph only explains the changes made by Py_PreInitialize(): calling Py_PreInitialize() doesn't "isolate" Python. PyPreConfig.isolated is used to decide if Python reads environment variables or not. Examples: PYTHONMALLOC, PYTHONUTF8, PYTHONDEVMODE (which has an impact on PyPreConfig.allocator), PYTHONCOERCECLOCALE, etc. That's why isolated and use_environment are present in PyPreConfig and PyConfig. In practice, values should be equal in both structures. Moreover, if PyConfig.isolated is equal to 1, Py_InitializeFromConfig() updates _PyRuntime.preconfig.isolated to 1 ;-) > > * ``PyInitError Py_PreInitialize(const PyPreConfig *config)`` > > * ``PyInitError Py_PreInitializeFromArgs( const PyPreConfig *config, > int argc, char **argv)`` > > * ``PyInitError Py_PreInitializeFromWideArgs( const PyPreConfig > *config, int argc, wchar_t **argv)`` > > I hope to one day be able to support multiple runtimes per process - can > we have an opaque PyRuntime object exposed publicly now and passed into > these functions? I hesitated to include a "_PyRuntimeState*" parameter somewhere, but I chose to not do so. Currently, there is a single global variable _PyRuntime which has the type _PyRuntimeState. The _PyRuntime_Initialize() API is designed around this global variable. 
For example, _PyRuntimeState contains the registry of interpreters: you don't want to have multiple registries :-) I understood that we should only have a single instance of _PyRuntimeState. So IMHO it's fine to keep it private at this point. There is no need to expose it in the API. > (FWIW, I think we're a long way from being able to support multiple > runtimes *simultaneously*, so the initial implementation here would be > to have a PyRuntime_Create() that returns our global one once and then > errors until it's finalised. The migration path is probably to enable > switching of the current runtime via a dedicated function (i.e. one > active at a time, probably with thread local storage), since we have no > "context" parameter for C API functions, and obviously there are still > complexities such as poorly written extension modules that nonetheless > can be dealt with in embedding scenarios by simply not using them. This > doesn't seem like an unrealistic future, *unless* we add a whole lot of > new APIs now that can't allow it :) ) FYI I tried to design an internal API with a "context" to pass _PyRuntimeState, PyPreConfig, _PyConfig, the current interpreter, etc. => https://bugs.python.org/issue35265 My first need was to pass a memory allocator to Py_DecodeLocale(). There are 2 possible implementations: * Modify *all* functions to add a new "context" parameter and modify *all* functions to pass this parameter to sub-functions. * Store the current "context" as a thread local variable or something like that. I wrote a proof-of-concept of the first option: the implementation was very painful to write: a lot of changes which looks useless and a lot of new private functions which to pass the argument. I had to modify way too much code. I gave up. For the second option: well, there is no API change needed! It can be done later. Moreover, we already have such API! PyThreadState_Get() gets the Python thread state of the current thread: the current interpreter can be accessed from there. > > ``PyPreConfig`` fields: > > > > * ``coerce_c_locale_warn``: if non-zero, emit a warning if the C locale > > is coerced. > > * ``coerce_c_locale``: if equals to 2, coerce the C locale; if equals to > > 1, read the LC_CTYPE to decide if it should be coerced. > > Can we use another value for coerce_c_locale to determine whether to > warn or not? Save a field. coerce_c_locale is already complex, it can have 4 values: -1, 0, 1 and 2. I prefer keep a separated field. Moreover, I understood that you might want to coerce the C locale *and* get the warning, or get the warning but *not* coerce the locale. > > * ``legacy_windows_fs_encoding`` (Windows only): if non-zero, set the > > Python filesystem encoding to ``"mbcs"``. > > * ``utf8_mode``: if non-zero, enable the UTF-8 mode > > Why not just set the encodings here? For different technical reasons, you simply cannot specify an encoding name. You can also pass options to tell Python that you have some preferences (PyPreConfig and PyConfig fields). Python doesn't support any encoding and encoding errors combinations. In practice, it only supports a narrow set of choices. The main implementation are Py_EncodeLocale() and Py_DecodeLocale() functions which uses the C codec of the current locale encoding to implement the filesystem encoding, before the codec implemented in Python can be used. Basically, only the current locale encoding or UTF-8 are supported. If you want UTF-8, enable the UTF-8 Mode. To load the Python codec, you need importlib. 
importlib needs to access the filesystem which requires a codec to encode/decode file names (PyConfig.module_search_paths uses Unicode wchar_t* strings, but the C API only supports bytes char* strings). Py_PreInitialize() doesn't set the filesystem encoding. It initializes the LC_CTYPE locale and Python global configuration variables (Py_UTF8Mode and Py_LegacyWindowsFSEncodingFlag). > Obviously we are not ready to import most encodings after pre > initialization, but I think that's okay. Embedders who set something > outside the range of what can be used without importing encodings will > get an error to that effect if we try. You need a C implementation of the Python filesystem encoding very early in Python initialization. You cannot start with one encoding and "later" switch the encoding. I tried multiple times the last 10 years and I always failed to do that. All attempts failed with mojibake at different levels. Unix pays the price of its history. Windows is a very different story: there are API to access the filesystem with Unicode strings, there is no such "bootstrap problem" for importlib. > In fact, I'd be totally okay with letting embedders specify their own > function pointer here to do encoding/decoding between Unicode and the OS > preferred encoding. In my experience, when someone wants to get a specific encoding: they only want UTF-8. There is now the UTF-8 Mode which ignores the locale and forces the usage of UTF-8. I'm not sure that there is a need to have a custom codec. Moreover, if there an API to pass a codec in C, you will need to expose it somehow at the Python level for os.fsencode() and os.fsdecode(). Currently, Python ensures during early stage of startup that codecs.lookup(sys.getfilesystemencoding()) works: there is a existing Python codec for the requested filesystem encoding. Victor From vstinner at redhat.com Fri Apr 5 12:22:17 2019 From: vstinner at redhat.com (Victor Stinner) Date: Fri, 5 Apr 2019 18:22:17 +0200 Subject: [Python-Dev] New Python Initialization API In-Reply-To: <6a17e990-7e2a-0544-1e8e-9b16d05e4df6@python.org> References: <70c245c7-8f40-a51b-934b-af958f7cd849@python.org> <119e26f0-d826-7e13-aa4a-e0a67bea3265@python.org> <91eec784-28ab-024f-42a6-8a1e5d37d9bf@python.org> <6a17e990-7e2a-0544-1e8e-9b16d05e4df6@python.org> Message-ID: > > Example of Python initialization enabling the isolated mode:: > > > > PyConfig config = PyConfig_INIT; > > config.isolated = 1; > > Haven't we already used extenal values by this point that should have > been isolated? On this specific example, "config.isolated = 1;" ensures that Py_PreInitialize() is also called internally with "PyPreConfig.isolated = 1". > I'd rather have the isolation up front. Or better yet, > make isolation the default unless you call one of the "FromArgs" > functions, and then we don't actually need the config setting at all. While there are supporters of an "isolated Python" (sometimes called "system python"), the fact that it doesn't exist in any Linux distribution nor on any other operating system (Windows, macOS, FreeBSD), whereas it's already doable in Python 3.6 with Py_IsolatedFlag=1 makes me think that users like the ability to control Python with environment variables and configuration files. I would prefer to leave Python as not isolated by default. It's just a matter of comment line arguments. > > * The PEP 432 stores ``PYTHONCASEOK`` into the config. Do we need > > to add something for that into ``PyConfig``? How would it be exposed > > at the Python level for ``importlib``? 
Passed as an argument to > > ``importlib._bootstrap._setup()`` maybe? It can be added later if > > needed. > > Could we convert it into an xoption? It's very rarely used, to my knowledge. The first question is if there is any need for an embedder to change this option. Currently, importlib._bootstrap_external._install() reads the environment variable and it's the only way to control the option. ... By the way, importlib reads PYTHONCASEOK environment varaible even if isolated mode is enabled (sys.flags.isolated is equal to 1). Is it a bug? :-) Victor From vstinner at redhat.com Fri Apr 5 12:24:27 2019 From: vstinner at redhat.com (Victor Stinner) Date: Fri, 5 Apr 2019 18:24:27 +0200 Subject: [Python-Dev] New Python Initialization API In-Reply-To: <6a17e990-7e2a-0544-1e8e-9b16d05e4df6@python.org> References: <70c245c7-8f40-a51b-934b-af958f7cd849@python.org> <119e26f0-d826-7e13-aa4a-e0a67bea3265@python.org> <91eec784-28ab-024f-42a6-8a1e5d37d9bf@python.org> <6a17e990-7e2a-0544-1e8e-9b16d05e4df6@python.org> Message-ID: > I think my biggest point (about halfway down) is that I'd rather use > argv/environ/etc. to *initialize* PyConfig and then only use the config > for initializing the runtime. That way it's more transparent for users > and more difficult for us to add options that embedders can't access. I chose to exclude PyConfig_Read() function from the PEP to try to start with the bare minimum public API and see how far we can go with that. The core of the PEP 587 implementation are PyPreConfig_Read() and PyConfig_Read() functions (currently called _PyPreConfig_Read() and _PyCoreConfig_Read()): they populate all fields so the read config becomes the reference config which will be applied. For example, PyConfig_Read() fills module_search_paths, from other PyConfig fields: it will become sys.path. I spent a lot of time to rework deeply the implementation of PyConfig_Read() to make sure that it has no side effect. Reading and writing the configuration are now strictly separated. So it is safe to call PyConfig_Read(), modify PyConfig afterwards, and pass the modified config to Py_InitializeFromConfig(). Do you think that exposing PyConfig_Read() would solve some of your problems? > Currently you have three functions, that take a PyConfig and optionally > also use the environment/argv to figure out the settings: > > > * ``PyInitError Py_InitializeFromConfig(const PyConfig *config)`` > > * ``PyInitError Py_InitializeFromArgs(const PyConfig *config, int > argc, char **argv)`` > > * ``PyInitError Py_InitializeFromWideArgs(const PyConfig *config, int > argc, wchar_t **argv)`` > > I would much prefer to see this flipped around, so that there is one > initialize function taking PyConfig, and two functions that will fill > out the PyConfig based on the environment: > > (note two of the "const"s are gone) > > * ``PyInitError Py_SetConfigFromArgs(PyConfig *config, int argc, char > **argv)`` > * ``PyInitError Py_SetConfigFromWideArgs(PyConfig *config, int argc, > wchar_t **argv)`` > * ``PyInitError Py_InitializeFromConfig(const PyConfig *config)`` This implementation evolved *A LOT* last months. I was *very confused* until the pre-initialization phase was introduced which solved a lot of bootstrap issues. After I wrote down the PEP and read it again, I also came to the same conclusion: Py_InitializeFromConfig(config) should be enough, and we can add helper functions to set arguments on PyConfig (as you showed). 
Victor From vstinner at redhat.com Fri Apr 5 12:36:37 2019 From: vstinner at redhat.com (Victor Stinner) Date: Fri, 5 Apr 2019 18:36:37 +0200 Subject: [Python-Dev] New Python Initialization API In-Reply-To: <6a17e990-7e2a-0544-1e8e-9b16d05e4df6@python.org> References: <70c245c7-8f40-a51b-934b-af958f7cd849@python.org> <119e26f0-d826-7e13-aa4a-e0a67bea3265@python.org> <91eec784-28ab-024f-42a6-8a1e5d37d9bf@python.org> <6a17e990-7e2a-0544-1e8e-9b16d05e4df6@python.org> Message-ID: For the PyMainConfig structure idea, I cannot comment at this point. I need more time to think about it. About the "path configuration" fields, maybe a first step to enhance the API would be to add the following function: PyInitError PyConfig_ComputePath(PyConfig *config, const wchar_t *home); where home can be NULL (and the PyConfig.module_search_paths_env field goes away: the function reads the PYTHONPATH env var internally). This function would "compute the path configuration", which is what's currently listed in _PyCoreConfig under:

    /* Path configuration outputs */
    int use_module_search_paths;      /* If non-zero, use module_search_paths */
    _PyWstrList module_search_paths;  /* sys.path paths. Computed if
                                         use_module_search_paths is equal to zero. */
    wchar_t *executable;        /* sys.executable */
    wchar_t *prefix;            /* sys.prefix */
    wchar_t *base_prefix;       /* sys.base_prefix */
    wchar_t *exec_prefix;       /* sys.exec_prefix */
    wchar_t *base_exec_prefix;  /* sys.base_exec_prefix */
    #ifdef MS_WINDOWS
    wchar_t *dll_path;          /* Windows DLL path */
    #endif

Victor From storchaka at gmail.com Fri Apr 5 13:53:10 2019 From: storchaka at gmail.com (Serhiy Storchaka) Date: Fri, 5 Apr 2019 20:53:10 +0300 Subject: [Python-Dev] Deprecating "instance method" class In-Reply-To: <5CA74B19.70806@UGent.be> References: <5CA5EFCC.2030400@UGent.be> <5CA68BBC.8060205@canterbury.ac.nz> <5CA6F09D.3000900@UGent.be> <5CA73B95.6040509@UGent.be> <5CA74B19.70806@UGent.be> Message-ID: 05.04.19 15:33, Jeroen Demeyer wrote: > On 2019-04-05 15:13, Serhiy Storchaka wrote: >> It is easy to implement a function in C. > > Why does it need to be a PyCFunction? You could put an actual method > descriptor in the class. In other words, use PyDescr_NewMethod() instead > of PyCFunction_New() + PyInstanceMethod_New(). It's probably going to be > faster too since the instancemethod adds an unoptimized extra level of > indirection. PyDescr_NewMethod() takes a PyTypeObject* which is not known at that moment. But maybe passing &PyBaseObject_Type will do the trick. I need to try. >> Yes, this is what I want to do. I did not do this only because >> implementing method-like functions in C which do not belong to a concrete >> class is not conventional. > > Sure, you could implement separate methods like __gt__ in C, but that's > still less efficient than just implementing a specific tp_richcompare > for total_ordering and then having the usual wrapper descriptors for > __gt__. At the Python level we can monkeypatch __gt__, but not tp_richcompare. In any case, removing a C API is a large breakage, and it is better to avoid it unless that API is inherently broken.
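For readers following along, here is a rough sketch of the pattern being discussed. The names (gt_from_lt, patch_gt) are made up for illustration and the comparison logic is elided; this is not the actual total_ordering patch:

    #include <Python.h>

    /* Hypothetical __gt__ derived from __lt__/__eq__; the real logic is
       elided because only the calling convention matters here.  With the
       PyInstanceMethod_New() pattern the instance arrives as the first
       item of "args", not as the C-level "self" argument. */
    static PyObject *
    gt_from_lt(PyObject *module, PyObject *args)
    {
        PyObject *self, *other;
        if (!PyArg_ParseTuple(args, "OO:__gt__", &self, &other)) {
            return NULL;
        }
        /* ...would compute the result from self.__lt__(other) here... */
        Py_RETURN_NOTIMPLEMENTED;
    }

    static PyMethodDef gt_def = {
        "__gt__", gt_from_lt, METH_VARARGS,
        "Return self > other, derived from __lt__()."
    };

    /* Wrap the plain C function so that it binds "self" when stored on a
       (heap) class.  A bare PyCFunction is not a descriptor, so without
       the PyInstanceMethod_New() wrapper it would behave like a
       staticmethod. */
    static int
    patch_gt(PyObject *cls)
    {
        PyObject *func = PyCFunction_New(&gt_def, NULL);
        if (func == NULL) {
            return -1;
        }
        PyObject *meth = PyInstanceMethod_New(func);
        Py_DECREF(func);
        if (meth == NULL) {
            return -1;
        }
        int res = PyObject_SetAttrString(cls, "__gt__", meth);
        Py_DECREF(meth);
        return res;
    }

Jeroen's alternative, PyDescr_NewMethod(type, &gt_def), produces a real method descriptor instead, but it needs the concrete PyTypeObject* up front and it passes the instance as the C-level first argument rather than inside "args" -- which is exactly the trade-off debated above.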
From christian at python.org Fri Apr 5 13:00:00 2019 From: christian at python.org (Christian Heimes) Date: Fri, 5 Apr 2019 19:00:00 +0200 Subject: [Python-Dev] Deprecating "instance method" class In-Reply-To: References: <5CA5EFCC.2030400@UGent.be> <5CA68BBC.8060205@canterbury.ac.nz> <5CA6F09D.3000900@UGent.be> <5CA73B95.6040509@UGent.be> <5CA74B19.70806@UGent.be> Message-ID: <52930d72-803b-0f8b-b338-a96e914ff2eb@python.org> On 05/04/2019 17.46, Guido van Rossum wrote: > Let's stop here. This API is doing no harm, it's not a maintenance > burden, clearly *some* folks have a use for it. Let's just keep it, > okay? There are bigger fish to fry. Sounds good to me. My code is 12 years ago and I can't remember any complain. I have closed the BPO issue and PR. From J.Demeyer at UGent.be Fri Apr 5 13:56:08 2019 From: J.Demeyer at UGent.be (Jeroen Demeyer) Date: Fri, 5 Apr 2019 19:56:08 +0200 Subject: [Python-Dev] Deprecating "instance method" class In-Reply-To: References: <5CA5EFCC.2030400@UGent.be> <5CA68BBC.8060205@canterbury.ac.nz> <5CA6F09D.3000900@UGent.be> <5CA73B95.6040509@UGent.be> <5CA74B19.70806@UGent.be> Message-ID: <5CA796B8.1000202@UGent.be> On 2019-04-05 19:53, Serhiy Storchaka wrote: > At Python level we can monkeypatch __gt__, but not tp_richcompare. Sure, but you're planning to use C anyway so that's not really an argument. From status at bugs.python.org Fri Apr 5 14:07:47 2019 From: status at bugs.python.org (Python tracker) Date: Fri, 5 Apr 2019 18:07:47 +0000 (UTC) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20190405180747.30B2D52B1D6@bugs.ams1.psf.io> ACTIVITY SUMMARY (2019-03-29 - 2019-04-05) Python tracker at https://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message. Issues counts and deltas: open 7056 ( +8) closed 41231 (+55) total 48287 (+63) Open issues with patches: 2816 Issues opened (44) ================== #36260: Cpython/Lib vulnerability found and request a patch submission https://bugs.python.org/issue36260 reopened by krnick #36474: RecursionError resets trace function set via sys.settrace https://bugs.python.org/issue36474 opened by blueyed #36475: PyEval_AcquireLock() and PyEval_AcquireThread() do not handle https://bugs.python.org/issue36475 opened by eric.snow #36476: Runtime finalization assumes all other threads have exited. 
https://bugs.python.org/issue36476 opened by eric.snow #36478: backport of pickle fixes to Python 3.5.7 uses C99 for loops https://bugs.python.org/issue36478 opened by Anthony Sottile #36479: Exit threads when interpreter is finalizing rather than runtim https://bugs.python.org/issue36479 opened by eric.snow #36481: telnetlib process_rawq() callback https://bugs.python.org/issue36481 opened by Quanthir #36484: Can't reorder TLS 1.3 ciphersuites https://bugs.python.org/issue36484 opened by Eman Alashwali #36485: Establish a uniform way to clear all caches in a given module https://bugs.python.org/issue36485 opened by serhiy.storchaka #36486: Bugs and inconsistencies in unicodedata https://bugs.python.org/issue36486 opened by dscorbett #36487: Make C-API docs clear about what the "main" interpreter is https://bugs.python.org/issue36487 opened by nanjekyejoannah #36488: os.sendfile() on BSD and macOS does not return bytes sent on E https://bugs.python.org/issue36488 opened by giampaolo.rodola #36489: add filename_extension_map and/or content-types_map dict(s) to https://bugs.python.org/issue36489 opened by Daniel Black #36490: Modernize function signature format in Archiving section of sh https://bugs.python.org/issue36490 opened by CAM-Gerlach #36494: bdb.Bdb.set_trace: should set f_trace_lines = True https://bugs.python.org/issue36494 opened by blueyed #36495: Out-of-bounds array reads in Python/ast.c https://bugs.python.org/issue36495 opened by blarsen #36497: Undocumented behavior in csv.Sniffer (preferred delimiters) https://bugs.python.org/issue36497 opened by thomas #36500: Add "regen-*" equivalent projects for Windows builds https://bugs.python.org/issue36500 opened by anthony shaw #36501: Remove POSIX.1e ACLs in tests that rely on default permissions https://bugs.python.org/issue36501 opened by Ivan.Pozdeev #36502: The behavior of str.isspace() for U+00A0 and U+202F is differe https://bugs.python.org/issue36502 opened by Jun #36503: remove references to aix3 and aix4 in \*.py https://bugs.python.org/issue36503 opened by Michael.Felt #36504: Signed integer overflow in _ctypes.c's PyCArrayType_new() https://bugs.python.org/issue36504 opened by ZackerySpytz #36506: [security] CVE-2019-10268: An arbitrary execution vulnerabilit https://bugs.python.org/issue36506 opened by bigbigliang #36508: python-config --ldflags must not contain LINKFORSHARED ("-Xlin https://bugs.python.org/issue36508 opened by vstinner #36509: Add iot layout for windows iot containers https://bugs.python.org/issue36509 opened by Paul Monson #36511: Add Windows ARM32 buildbot https://bugs.python.org/issue36511 opened by Paul Monson #36512: future_factory argument for Thread/ProcessPoolExecutor https://bugs.python.org/issue36512 opened by stefanhoelzl #36513: Add support for building arm32 nuget package https://bugs.python.org/issue36513 opened by Paul Monson #36515: unaligned memory access in the _sha3 extension https://bugs.python.org/issue36515 opened by doko #36516: Python Launcher can not recognize pyw file as Python GUI Scrip https://bugs.python.org/issue36516 opened by gjj2828 #36517: typing.NamedTuple does not support mixins https://bugs.python.org/issue36517 opened by rectalogic #36518: Avoid conflicts when pass arbitrary keyword arguments to Pytho https://bugs.python.org/issue36518 opened by serhiy.storchaka #36519: Blake2b/s implementations have minor GIL issues https://bugs.python.org/issue36519 opened by gwk #36520: Email header folded incorrectly https://bugs.python.org/issue36520 opened by Jonathan Horn 
#36521: Consider removing docstrings from co_consts in code objects https://bugs.python.org/issue36521 opened by rhettinger #36523: missing docs for IOBase writelines https://bugs.python.org/issue36523 opened by Marcin Niemira #36527: unused parameter warnings in Include/object.h (affecting build https://bugs.python.org/issue36527 opened by AMDmi3 #36528: Remove duplicate tests in Lib/tests/re_tests.py https://bugs.python.org/issue36528 opened by xtreak #36529: Python from WindowsStore: can't install package using "-m pip" https://bugs.python.org/issue36529 opened by Ilya Kazakevich #36531: PyType_FromSpec wrong behavior with multiple Py_tp_members https://bugs.python.org/issue36531 opened by eelizondo #36532: Example of logging.formatter with new str.format style https://bugs.python.org/issue36532 opened by spaceman_spiff #36533: logging regression with threading + fork are mixed in 3.7.1rc2 https://bugs.python.org/issue36533 opened by gregory.p.smith #36534: tarfile: handling Windows (path) illegal characters in archive https://bugs.python.org/issue36534 opened by CristiFati #36535: Windows build failure when use the code from the GitHub master https://bugs.python.org/issue36535 opened by Manjusaka Most recent 15 issues with no replies (15) ========================================== #36535: Windows build failure when use the code from the GitHub master https://bugs.python.org/issue36535 #36531: PyType_FromSpec wrong behavior with multiple Py_tp_members https://bugs.python.org/issue36531 #36529: Python from WindowsStore: can't install package using "-m pip" https://bugs.python.org/issue36529 #36528: Remove duplicate tests in Lib/tests/re_tests.py https://bugs.python.org/issue36528 #36527: unused parameter warnings in Include/object.h (affecting build https://bugs.python.org/issue36527 #36523: missing docs for IOBase writelines https://bugs.python.org/issue36523 #36520: Email header folded incorrectly https://bugs.python.org/issue36520 #36517: typing.NamedTuple does not support mixins https://bugs.python.org/issue36517 #36516: Python Launcher can not recognize pyw file as Python GUI Scrip https://bugs.python.org/issue36516 #36515: unaligned memory access in the _sha3 extension https://bugs.python.org/issue36515 #36513: Add support for building arm32 nuget package https://bugs.python.org/issue36513 #36512: future_factory argument for Thread/ProcessPoolExecutor https://bugs.python.org/issue36512 #36511: Add Windows ARM32 buildbot https://bugs.python.org/issue36511 #36509: Add iot layout for windows iot containers https://bugs.python.org/issue36509 #36503: remove references to aix3 and aix4 in \*.py https://bugs.python.org/issue36503 Most recent 15 issues waiting for review (15) ============================================= #36532: Example of logging.formatter with new str.format style https://bugs.python.org/issue36532 #36531: PyType_FromSpec wrong behavior with multiple Py_tp_members https://bugs.python.org/issue36531 #36528: Remove duplicate tests in Lib/tests/re_tests.py https://bugs.python.org/issue36528 #36527: unused parameter warnings in Include/object.h (affecting build https://bugs.python.org/issue36527 #36523: missing docs for IOBase writelines https://bugs.python.org/issue36523 #36518: Avoid conflicts when pass arbitrary keyword arguments to Pytho https://bugs.python.org/issue36518 #36516: Python Launcher can not recognize pyw file as Python GUI Scrip https://bugs.python.org/issue36516 #36515: unaligned memory access in the _sha3 extension https://bugs.python.org/issue36515 
#36513: Add support for building arm32 nuget package https://bugs.python.org/issue36513 #36512: future_factory argument for Thread/ProcessPoolExecutor https://bugs.python.org/issue36512 #36509: Add iot layout for windows iot containers https://bugs.python.org/issue36509 #36508: python-config --ldflags must not contain LINKFORSHARED ("-Xlin https://bugs.python.org/issue36508 #36504: Signed integer overflow in _ctypes.c's PyCArrayType_new() https://bugs.python.org/issue36504 #36503: remove references to aix3 and aix4 in \*.py https://bugs.python.org/issue36503 #36501: Remove POSIX.1e ACLs in tests that rely on default permissions https://bugs.python.org/issue36501 Top 10 most discussed issues (10) ================================= #36485: Establish a uniform way to clear all caches in a given module https://bugs.python.org/issue36485 13 msgs #36466: Adding a way to strip annotations from compiled bytecode https://bugs.python.org/issue36466 12 msgs #36469: Stuck during interpreter exit, attempting to take the GIL https://bugs.python.org/issue36469 10 msgs #36506: [security] CVE-2019-10268: An arbitrary execution vulnerabilit https://bugs.python.org/issue36506 8 msgs #6721: Locks in the standard library should be sanitized on fork https://bugs.python.org/issue6721 7 msgs #36533: logging regression with threading + fork are mixed in 3.7.1rc2 https://bugs.python.org/issue36533 6 msgs #35866: concurrent.futures deadlock https://bugs.python.org/issue35866 5 msgs #36384: ipaddress Should not reject IPv4 addresses with leading zeroes https://bugs.python.org/issue36384 5 msgs #30661: Support tarfile.PAX_FORMAT in shutil.make_archive https://bugs.python.org/issue30661 4 msgs #35224: PEP 572: Assignment Expressions https://bugs.python.org/issue35224 4 msgs Issues closed (53) ================== #17110: sys.argv docs should explaining how to handle encoding issues https://bugs.python.org/issue17110 closed by inada.naoki #20844: SyntaxError: encoding problem: iso-8859-1 on Windows https://bugs.python.org/issue20844 closed by inada.naoki #21269: Provide args and kwargs attributes on mock call objects https://bugs.python.org/issue21269 closed by xtreak #22831: Use "with" to avoid possible fd leaks https://bugs.python.org/issue22831 closed by serhiy.storchaka #24214: UTF-8 incremental decoder doesn't support surrogatepass correc https://bugs.python.org/issue24214 closed by serhiy.storchaka #25451: tkinter: PhotoImage transparency methods https://bugs.python.org/issue25451 closed by serhiy.storchaka #29202: Improve dict iteration https://bugs.python.org/issue29202 closed by inada.naoki #31182: Suggested Enhancements to zipfile & tarfile command line inter https://bugs.python.org/issue31182 closed by brett.cannon #32413: Document that locals() may return globals() https://bugs.python.org/issue32413 closed by brett.cannon #32531: gdb.execute can not put string value. 
https://bugs.python.org/issue32531 closed by berker.peksag #32538: Multiprocessing Manager on 3D list - no change of the list pos https://bugs.python.org/issue32538 closed by berker.peksag #33261: inspect.isgeneratorfunction fails on hand-created methods https://bugs.python.org/issue33261 closed by petr.viktorin #34430: Symmetrical chaining futures in asyncio.future.wrap_future https://bugs.python.org/issue34430 closed by huji #35272: sqlite3 get the connected database url https://bugs.python.org/issue35272 closed by berker.peksag #35403: support application/wasm in mimetypes and http.server https://bugs.python.org/issue35403 closed by martin.panter #35838: ConfigParser: document optionxform must be idempotent https://bugs.python.org/issue35838 closed by inada.naoki #36010: Please provide a .zip Windows release of Python that is not cr https://bugs.python.org/issue36010 closed by steve.dower #36026: Different error message when sys.settrace is used https://bugs.python.org/issue36026 closed by inada.naoki #36085: Enable better DLL resolution https://bugs.python.org/issue36085 closed by steve.dower #36157: Document PyInterpreterState_Main(). https://bugs.python.org/issue36157 closed by eric.snow #36293: Nonblocking read sys.stdin raises error https://bugs.python.org/issue36293 closed by martin.panter #36322: Argument typo in dbm.ndbm.open https://bugs.python.org/issue36322 closed by brett.cannon #36377: Python 'datastructures.html' docs page needs improvement becau https://bugs.python.org/issue36377 closed by rhettinger #36404: Document PendingDeprecationWarning is not so useful. https://bugs.python.org/issue36404 closed by inada.naoki #36426: exec() issue when used inside function https://bugs.python.org/issue36426 closed by ncoghlan #36434: Zipfile breaks if signalled during write() https://bugs.python.org/issue36434 closed by serhiy.storchaka #36440: more helpful diagnostics for parser module https://bugs.python.org/issue36440 closed by pablogsal #36442: Different ValueError for the same operation in List and Tuple https://bugs.python.org/issue36442 closed by serhiy.storchaka #36445: bus error in test_gil test on armhf running with 64bit kernel https://bugs.python.org/issue36445 closed by doko #36448: Message "You will need to rebuild pythoncore to see the change https://bugs.python.org/issue36448 closed by steve.dower #36468: Treeview: wrong color change https://bugs.python.org/issue36468 closed by ned.deily #36472: Some old PR with CLA not signed https://bugs.python.org/issue36472 closed by brett.cannon #36473: dictkeysobject: Add maximum iteration check for .values() and https://bugs.python.org/issue36473 closed by inada.naoki #36477: Subinterpreters are not finalized during runtime finalization. 
https://bugs.python.org/issue36477 closed by eric.snow #36480: .strip() unexpected output on Windows https://bugs.python.org/issue36480 closed by eric.smith #36482: let struct's internal cache use FIFO policy https://bugs.python.org/issue36482 closed by rhettinger #36483: Missing line in documentation example https://bugs.python.org/issue36483 closed by martin.panter #36491: sum function's start optional parameter documented in help but https://bugs.python.org/issue36491 closed by rhettinger #36492: Deprecate passing some conflicting arguments by keyword https://bugs.python.org/issue36492 closed by serhiy.storchaka #36493: Add math.midpoint(a,b) function https://bugs.python.org/issue36493 closed by scoder #36496: Local variables can be used uninitialized in _PyPreConfig_Read https://bugs.python.org/issue36496 closed by vstinner #36498: combining dict comprehensing and lists lead to IndexError https://bugs.python.org/issue36498 closed by SilentGhost #36499: unpickling of a datetime object in 3.5 fails when pickled with https://bugs.python.org/issue36499 closed by josh.r #36505: PYTHON-CAN with vector https://bugs.python.org/issue36505 closed by SilentGhost #36507: frozenset type breaks ZFC https://bugs.python.org/issue36507 closed by rhettinger #36510: Regular Expression Dot-Star patter matching - re- text skippi https://bugs.python.org/issue36510 closed by SilentGhost #36514: -m switch revisited https://bugs.python.org/issue36514 closed by ronaldoussoren #36522: http/client.py does not print duplicate header values in debug https://bugs.python.org/issue36522 closed by serhiy.storchaka #36524: identity operator https://bugs.python.org/issue36524 closed by SilentGhost #36525: Deprecate instancemethod https://bugs.python.org/issue36525 closed by christian.heimes #36526: python crash when loading some .pyc file https://bugs.python.org/issue36526 closed by serhiy.storchaka #36530: Document codecs decode_encode() and encode_decode() APIs https://bugs.python.org/issue36530 closed by gregory.p.smith #36536: is there a python implementation of the cpython commandline in https://bugs.python.org/issue36536 closed by larry From J.Demeyer at UGent.be Fri Apr 5 14:29:00 2019 From: J.Demeyer at UGent.be (Jeroen Demeyer) Date: Fri, 5 Apr 2019 20:29:00 +0200 Subject: [Python-Dev] Deprecating "instance method" class In-Reply-To: References: <5CA5EFCC.2030400@UGent.be> <5CA68BBC.8060205@canterbury.ac.nz> <5CA6F09D.3000900@UGent.be> <5CA73B95.6040509@UGent.be> <5CA74B19.70806@UGent.be> Message-ID: <5CA79E6C.904@UGent.be> On 2019-04-05 17:46, Guido van Rossum wrote: > This API is doing no harm, it's not a maintenance > burden What if the following happens? 1. For some reason (possibly because of this thread), people discover instancemethod and start using it. 2. People realize that it's slow. 3. It needs to be made more efficient, causing new code bloat and maintenance burden. > clearly *some* folks have a use for it. I'm not convinced. I don't think that instancemethod is the right solution for functools.total_ordering for example. Jeroen. 
From brett at python.org Fri Apr 5 15:58:47 2019 From: brett at python.org (Brett Cannon) Date: Fri, 5 Apr 2019 12:58:47 -0700 Subject: [Python-Dev] Deprecating "instance method" class In-Reply-To: <5CA79E6C.904@UGent.be> References: <5CA5EFCC.2030400@UGent.be> <5CA68BBC.8060205@canterbury.ac.nz> <5CA6F09D.3000900@UGent.be> <5CA73B95.6040509@UGent.be> <5CA74B19.70806@UGent.be> <5CA79E6C.904@UGent.be> Message-ID: On Fri, Apr 5, 2019 at 11:30 AM Jeroen Demeyer wrote: > On 2019-04-05 17:46, Guido van Rossum wrote: > > This API is doing no harm, it's not a maintenance > > burden > > What if the following happens? > > 1. For some reason (possibly because of this thread), people discover > instancemethod and start using it. > > 2. People realize that it's slow. > > 3. It needs to be made more efficient, causing new code bloat and > maintenance burden. > Then we can consider improving the documentation if there are performance implications. But the point is if there's code out there already using it without issue then ripping it out of the C API is painful since we don't have nearly as good of a deprecation setup as we do in Python code. Not everything about the C APi is about performance. -Brett > > > clearly *some* folks have a use for it. > > I'm not convinced. OK, but as of right now others like me are convinced and we typically err on the side of backwards-compatibility in these kinds of situations. -Brett > I don't think that instancemethod is the right > solution for functools.total_ordering for example. > > > Jeroen. > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/brett%40python.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: From J.Demeyer at UGent.be Fri Apr 5 16:09:48 2019 From: J.Demeyer at UGent.be (Jeroen Demeyer) Date: Fri, 5 Apr 2019 22:09:48 +0200 Subject: [Python-Dev] Deprecating "instance method" class In-Reply-To: References: <5CA5EFCC.2030400@UGent.be> <5CA68BBC.8060205@canterbury.ac.nz> <5CA6F09D.3000900@UGent.be> <5CA73B95.6040509@UGent.be> <5CA74B19.70806@UGent.be> <5CA79E6C.904@UGent.be> Message-ID: <5CA7B60C.1060605@UGent.be> On 2019-04-05 21:58, Brett Cannon wrote: > Then we can consider improving the documentation if there are > performance implications. Sure, we could write in the docs something like "Don't use this, this is not what you want. It's slow and there are better alternatives like method descriptors". Should I do that (with better wording of course)? > since we don't have nearly as good of a deprecation setup as we > do in Python code. I don't get this. One can easily raise a DeprecationWarning from C code, there is plenty of code already doing that. Jeroen. From brett at python.org Fri Apr 5 20:30:31 2019 From: brett at python.org (Brett Cannon) Date: Fri, 5 Apr 2019 17:30:31 -0700 Subject: [Python-Dev] Deprecating "instance method" class In-Reply-To: <5CA7B60C.1060605@UGent.be> References: <5CA5EFCC.2030400@UGent.be> <5CA68BBC.8060205@canterbury.ac.nz> <5CA6F09D.3000900@UGent.be> <5CA73B95.6040509@UGent.be> <5CA74B19.70806@UGent.be> <5CA79E6C.904@UGent.be> <5CA7B60C.1060605@UGent.be> Message-ID: On Fri, Apr 5, 2019 at 1:11 PM Jeroen Demeyer wrote: > On 2019-04-05 21:58, Brett Cannon wrote: > > Then we can consider improving the documentation if there are > > performance implications. 
> > Sure, we could write in the docs something like "Don't use this, this is > not what you want. It's slow and there are better alternatives like > method descriptors". Should I do that (with better wording of course)? > Up to you. Obviously help is always appreciated, just a question of who feels qualified to review the PR. > > > since we don't have nearly as good of a deprecation setup as we > > do in Python code. > > I don't get this. One can easily raise a DeprecationWarning from C code, > there is plenty of code already doing that. > True. I personally prefer compile-time warnings for that sort of thing, but you're right we can do it at the Python "level" with a raise of a DeprecationWarning on those instances. -------------- next part -------------- An HTML attachment was scrubbed... URL: From doko at ubuntu.com Fri Apr 5 22:39:49 2019 From: doko at ubuntu.com (Matthias Klose) Date: Sat, 6 Apr 2019 04:39:49 +0200 Subject: [Python-Dev] PEP-582 and multiple Python installations In-Reply-To: <2b889555-db6f-6c69-0347-ebb89d6fec21@python.org> References: <2b889555-db6f-6c69-0347-ebb89d6fec21@python.org> Message-ID: <46858269-5ce0-5f70-9a08-f22135b5c1e9@ubuntu.com> On 02.04.19 18:10, Steve Dower wrote: > On 02Apr2019 0817, Calvin Spealman wrote: >> (I originally posted this to python-ideas, where I was told none of this PEP's >> authors subscribe so probably no one will see it there, so I'm posting it here >> to raise the issue where it can get seen and hopefully discussed) > > Correct, thanks for posting. (I thought we had a "discussions-to" tag with > distutils-sig on it, but apparently not.) > >> While the PEP does show the version number as part of the path to the actual >> packages, implying support for multiple versions, this doesn't seem to be >> spelled out in the actual text. Presumably __pypackages__/3.8/ might sit >> beside __pypackages__/3.9/, etc. to keep future versions capable of installing >> packages for each version, the way virtualenv today is bound to one version of >> Python. >> >> I'd like to raise a potential edge case that might be a problem, and likely an >> increasingly common one: users with multiple installations of the *same* >> version of Python. This is actually a common setup for Windows users who use >> WSL, Microsoft's Linux-on-Windows solution, as you could have both the Windows >> and Linux builds of a given Python version installed on the same machine. The >> currently implied support for multiple versions would not be able to separate >> these and could create problems if users pip install a Windows binary package >> through Powershell and then try to run a script in Bash from the same >> directory, causing the Linux version of Python to try to use Windows python >> packages. >> >> I'm not actually sure what the solution here is. Mostly I wanted to raise the >> concern, because I'm very keen on WSL being a great entry path for new >> developers and I want to make that a better experience, not a more confusing >> one. Maybe that version number could include some other unique identify, maybe >> based on Python's own executable. A hash maybe? I don't know if anything like >> that already exists to uniquely identify a Python build or installation. > > Yes, this is a situation we're aware of, and it's caught in the conflict of "who > is this feature meant to support". This smells the same like mixing system installed python packages (deb/rpm) with one managed by pip, and pip touching system installed packages. 
> Since all platforms have a unique extension module suffix (e.g. > "module.cp38-win32.pyd"), it would be possible to support this with "fat" > packages that include all binaries (or some clever way of merging wheels for > multiple platforms). unfortunately not. The Android developers opted out of that, reverting that change. Also how would you differentiate win32 builds for different architectures? But maybe this is already done. > And since this is already in CPython itself, it leads to about the only > reasonable solution - instead of "3.8", use the extension module suffix > "cp38-win32". (Wheel tags are not in core CPython, so we can't use those.) > > But while this seems obvious, it also reintroduces problems that this has the > potential to fix - suddenly, just like installing into your global environment, > your packages are not project-specific anymore but are Python-specific. Which is > one of the major confusions people run into ("I pip installed X but now can't > import it in python"). > > So the main points of discussion right now are "whose problem does this solve" > and "when do we tell people they need a full venv". And that discussion is > mostly happening at > https://discuss.python.org/t/pep-582-python-local-packages-directory/963/ > > Cheers, > Steve > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/doko%40ubuntu.com From songofacandy at gmail.com Sat Apr 6 01:09:37 2019 From: songofacandy at gmail.com (Inada Naoki) Date: Sat, 6 Apr 2019 14:09:37 +0900 Subject: [Python-Dev] New Python Initialization API In-Reply-To: References: <70c245c7-8f40-a51b-934b-af958f7cd849@python.org> <119e26f0-d826-7e13-aa4a-e0a67bea3265@python.org> <91eec784-28ab-024f-42a6-8a1e5d37d9bf@python.org> <6a17e990-7e2a-0544-1e8e-9b16d05e4df6@python.org> Message-ID: On Sat, Apr 6, 2019 at 1:13 AM Victor Stinner wrote: > > > > ``PyPreConfig`` fields: > > > > > > * ``coerce_c_locale_warn``: if non-zero, emit a warning if the C locale > > > is coerced. > > > * ``coerce_c_locale``: if equals to 2, coerce the C locale; if equals to > > > 1, read the LC_CTYPE to decide if it should be coerced. > > > > Can we use another value for coerce_c_locale to determine whether to > > warn or not? Save a field. > > coerce_c_locale is already complex, it can have 4 values: -1, 0, 1 and 2. > I prefer keep a separated field. > > Moreover, I understood that you might want to coerce the C locale *and* > get the warning, or get the warning but *not* coerce the locale. > Are these configurations are really needed? Applications embedding Python may not initialize Python interpreter at first. For example, vim initializes Python when Python is used first time. On the other hand, C locale coercion should be done ASAP application starts. I think dedicated API for coercing C locale is better than preconfig. // When application starts: Py_CoerceCLocale(warn=0); // later... 
Py_Initialize(); -- Inada Naoki From vstinner at redhat.com Sat Apr 6 09:37:28 2019 From: vstinner at redhat.com (Victor Stinner) Date: Sat, 6 Apr 2019 15:37:28 +0200 Subject: [Python-Dev] New Python Initialization API In-Reply-To: References: <70c245c7-8f40-a51b-934b-af958f7cd849@python.org> <119e26f0-d826-7e13-aa4a-e0a67bea3265@python.org> <91eec784-28ab-024f-42a6-8a1e5d37d9bf@python.org> <6a17e990-7e2a-0544-1e8e-9b16d05e4df6@python.org> Message-ID: Maybe I should clarify in the PEP 587 Rationale what are the use cases for the API. Embeding Python is one kind of use case, but writing your own Python with a specific config like "isolated Python" or "system Python" is also a valid use case. For a custom Python, you might want to get C locale coercion and UTF-8 Mode. The most common case is to embed Python in an application like Blender or vim: the application already executes a lot of code and manipulated strings and encoding before Python is initialized, so Python must not coerce the C locale in that case. That's why Nick and me decided to disable C loclae coercion and UTF-8 Mode by default when the C API is used. Victor Le samedi 6 avril 2019, Inada Naoki a ?crit : > On Sat, Apr 6, 2019 at 1:13 AM Victor Stinner wrote: >> >> > > ``PyPreConfig`` fields: >> > > >> > > * ``coerce_c_locale_warn``: if non-zero, emit a warning if the C locale >> > > is coerced. >> > > * ``coerce_c_locale``: if equals to 2, coerce the C locale; if equals to >> > > 1, read the LC_CTYPE to decide if it should be coerced. >> > >> > Can we use another value for coerce_c_locale to determine whether to >> > warn or not? Save a field. >> >> coerce_c_locale is already complex, it can have 4 values: -1, 0, 1 and 2. >> I prefer keep a separated field. >> >> Moreover, I understood that you might want to coerce the C locale *and* >> get the warning, or get the warning but *not* coerce the locale. >> > > Are these configurations are really needed? > > Applications embedding Python may not initialize Python interpreter at first. > For example, vim initializes Python when Python is used first time. > > On the other hand, C locale coercion should be done ASAP application starts. > > I think dedicated API for coercing C locale is better than preconfig. > > // When application starts: > Py_CoerceCLocale(warn=0); > > // later... > Py_Initialize(); > > -- > Inada Naoki > -- Night gathers, and now my watch begins. It shall not end until my death. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Sat Apr 6 22:45:36 2019 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 7 Apr 2019 12:45:36 +1000 Subject: [Python-Dev] New Python Initialization API In-Reply-To: References: <70c245c7-8f40-a51b-934b-af958f7cd849@python.org> <119e26f0-d826-7e13-aa4a-e0a67bea3265@python.org> <91eec784-28ab-024f-42a6-8a1e5d37d9bf@python.org> <6a17e990-7e2a-0544-1e8e-9b16d05e4df6@python.org> Message-ID: On Sat, 6 Apr 2019 at 02:16, Victor Stinner wrote: > > > ``PyPreConfig`` fields: > > > > > > * ``coerce_c_locale_warn``: if non-zero, emit a warning if the C locale > > > is coerced. > > > * ``coerce_c_locale``: if equals to 2, coerce the C locale; if equals to > > > 1, read the LC_CTYPE to decide if it should be coerced. > > > > Can we use another value for coerce_c_locale to determine whether to > > warn or not? Save a field. > > coerce_c_locale is already complex, it can have 4 values: -1, 0, 1 and 2. > I prefer keep a separated field. 
> > Moreover, I understood that you might want to coerce the C locale *and* > get the warning, or get the warning but *not* coerce the locale. Yeah, that's how they ended up being two different fields in the first place. However, I wonder if the two fields might be better named: * warn_on_legacy_c_locale * coerce_legacy_c_locale Neither set: legacy C locale is left alone Only warning flag set: complain about the legacy C locale on stderr Only coercion flag set: silently attempt to coerce the legacy C locale to a UTF-8 based one Both flags set: attempt the coercion, and then complain about it on stderr (regardless of whether the coercion succeeded or not) The original PEP 580 implementation tried to keep the config API simpler by always complaining, but that turned out to break the world (plenty of contexts where things get upset by unexpected output on stderr). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Sat Apr 6 22:49:10 2019 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 7 Apr 2019 12:49:10 +1000 Subject: [Python-Dev] New Python Initialization API In-Reply-To: References: <70c245c7-8f40-a51b-934b-af958f7cd849@python.org> <119e26f0-d826-7e13-aa4a-e0a67bea3265@python.org> <91eec784-28ab-024f-42a6-8a1e5d37d9bf@python.org> <6a17e990-7e2a-0544-1e8e-9b16d05e4df6@python.org> Message-ID: On Sun, 7 Apr 2019 at 12:45, Nick Coghlan wrote: > The original PEP 580 implementation tried to keep the config API > simpler by always complaining, but that turned out to break the world > (plenty of contexts where things get upset by unexpected output on > stderr). Err, PEP 538. No idea why my brain swapped in the wrong PEP number :) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From storchaka at gmail.com Sun Apr 7 03:48:45 2019 From: storchaka at gmail.com (Serhiy Storchaka) Date: Sun, 7 Apr 2019 10:48:45 +0300 Subject: [Python-Dev] Deprecating "instance method" class In-Reply-To: <5CA796B8.1000202@UGent.be> References: <5CA5EFCC.2030400@UGent.be> <5CA68BBC.8060205@canterbury.ac.nz> <5CA6F09D.3000900@UGent.be> <5CA73B95.6040509@UGent.be> <5CA74B19.70806@UGent.be> <5CA796B8.1000202@UGent.be> Message-ID: 05.04.19 20:56, Jeroen Demeyer ????: > On 2019-04-05 19:53, Serhiy Storchaka wrote: >> At Python level we can monkeypatch __gt__, but not tp_richcompare. > > Sure, but you're planning to use C anyway so that's not really an argument. total_ordering monkeypatches the decorated class. I'm planning to implement in C methods that implement __gt__ in terms of __lt__ etc. From J.Demeyer at UGent.be Sun Apr 7 03:15:57 2019 From: J.Demeyer at UGent.be (Jeroen Demeyer) Date: Sun, 7 Apr 2019 09:15:57 +0200 Subject: [Python-Dev] Deprecating "instance method" class In-Reply-To: References: <5CA5EFCC.2030400@UGent.be> <5CA68BBC.8060205@canterbury.ac.nz> <5CA6F09D.3000900@UGent.be> <5CA73B95.6040509@UGent.be> <5CA74B19.70806@UGent.be> <5CA796B8.1000202@UGent.be> Message-ID: <5CA9A3AD.1010300@UGent.be> On 2019-04-07 09:48, Serhiy Storchaka wrote: > total_ordering monkeypatches the decorated class. I'm planning to > implement in C methods that implement __gt__ in terms of __lt__ etc. Yes, I understood that. I'm just saying: if you want to make it fast, that's not the best solution. The fastest would be to implement tp_richcompare from scratch (instead of relying on slot_tp_richcompare dispatching to methods). 
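To illustrate what "implement tp_richcompare from scratch" means here, a rough sketch follows; my_lt() is an assumed C-level "less than" primitive, and the derivation rules below assume a total order within a single type, so this is illustrative only and not the real total_ordering logic:

    /* Assumed primitive: returns 1 if self < other, 0 if not, -1 on error. */
    static int my_lt(PyObject *self, PyObject *other);

    /* Everything is derived from the single primitive directly in
       tp_richcompare, so no slot_tp_richcompare / __gt__ lookup happens
       on each comparison. */
    static PyObject *
    ordered_richcompare(PyObject *self, PyObject *other, int op)
    {
        int lt = my_lt(self, other);
        if (lt < 0) {
            return NULL;
        }
        int gt = my_lt(other, self);
        if (gt < 0) {
            return NULL;
        }
        switch (op) {
        case Py_LT: return PyBool_FromLong(lt);
        case Py_GT: return PyBool_FromLong(gt);
        case Py_LE: return PyBool_FromLong(!gt);
        case Py_GE: return PyBool_FromLong(!lt);
        case Py_EQ: return PyBool_FromLong(!lt && !gt);
        case Py_NE: return PyBool_FromLong(lt || gt);
        }
        Py_RETURN_NOTIMPLEMENTED;
    }

A type would install this as its tp_richcompare slot, and __lt__, __gt__ and friends would then show up as the usual wrapper descriptors, which is the arrangement described above.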
From xdegaye at gmail.com Sun Apr 7 07:31:13 2019 From: xdegaye at gmail.com (Xavier de Gaye) Date: Sun, 7 Apr 2019 13:31:13 +0200 Subject: [Python-Dev] bedevere pipelines hang on github Message-ID: bedevere/issue-number and bedevere/news are not triggered for some reason at https://github.com/python/cpython/pull/12708 and hang forever with "Expected ? Waiting for status to be reported ". Xavier From lisandrosnik at gmail.com Sun Apr 7 07:40:32 2019 From: lisandrosnik at gmail.com (Lysandros Nikolaou) Date: Sun, 7 Apr 2019 13:40:32 +0200 Subject: [Python-Dev] bedevere pipelines hang on github In-Reply-To: References: Message-ID: There is an issue with bedevere at the moment. As described by Mariatta in https://github.com/python/bedevere/issues/162 it is still not clear, if this is our issue or GitHub's. I may have some time to look into it a bit later. On Sun, Apr 7, 2019 at 1:32 PM Xavier de Gaye wrote: > bedevere/issue-number and bedevere/news are not triggered for some > reason at https://github.com/python/cpython/pull/12708 and hang > forever with "Expected ? Waiting for status to be reported ". > > Xavier > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/lisandrosnik%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tir.karthi at gmail.com Sun Apr 7 23:38:16 2019 From: tir.karthi at gmail.com (Karthikeyan) Date: Mon, 8 Apr 2019 09:08:16 +0530 Subject: [Python-Dev] bedevere pipelines hang on github In-Reply-To: References: Message-ID: This seems to be fixed now : https://github.com/python/core-workflow/issues/321 Regards, Karthikeyan S -------------- next part -------------- An HTML attachment was scrubbed... URL: From xdegaye at gmail.com Mon Apr 8 06:09:08 2019 From: xdegaye at gmail.com (Xavier de Gaye) Date: Mon, 8 Apr 2019 12:09:08 +0200 Subject: [Python-Dev] bedevere pipelines hang on github In-Reply-To: References: Message-ID: Thanks. Xavier From robert.wd.white at gmail.com Mon Apr 8 11:08:40 2019 From: robert.wd.white at gmail.com (Robert White) Date: Mon, 8 Apr 2019 10:08:40 -0500 Subject: [Python-Dev] Deprecating "instance method" class In-Reply-To: <5CA9A3AD.1010300@UGent.be> References: <5CA5EFCC.2030400@UGent.be> <5CA68BBC.8060205@canterbury.ac.nz> <5CA6F09D.3000900@UGent.be> <5CA73B95.6040509@UGent.be> <5CA74B19.70806@UGent.be> <5CA796B8.1000202@UGent.be> <5CA9A3AD.1010300@UGent.be> Message-ID: So we're making pretty heavy use of PyInstanceMethod_New in our python binding library that we've written for a bunch of in house tools. If this isn't the best / correct way to go about adding methods to objects, what should we be using instead? On Sun, Apr 7, 2019 at 2:17 AM Jeroen Demeyer wrote: > On 2019-04-07 09:48, Serhiy Storchaka wrote: > > total_ordering monkeypatches the decorated class. I'm planning to > > implement in C methods that implement __gt__ in terms of __lt__ etc. > > Yes, I understood that. I'm just saying: if you want to make it fast, > that's not the best solution. The fastest would be to implement > tp_richcompare from scratch (instead of relying on slot_tp_richcompare > dispatching to methods). 
> _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/robert.wd.white%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From J.Demeyer at UGent.be Mon Apr 8 11:24:34 2019 From: J.Demeyer at UGent.be (Jeroen Demeyer) Date: Mon, 8 Apr 2019 17:24:34 +0200 Subject: [Python-Dev] Deprecating "instance method" class In-Reply-To: References: <5CA5EFCC.2030400@UGent.be> <5CA68BBC.8060205@canterbury.ac.nz> <5CA6F09D.3000900@UGent.be> <5CA73B95.6040509@UGent.be> <5CA74B19.70806@UGent.be> <5CA796B8.1000202@UGent.be> <5CA9A3AD.1010300@UGent.be> Message-ID: <5CAB67B2.6010106@UGent.be> On 2019-04-08 17:08, Robert White wrote: > So we're making pretty heavy use of PyInstanceMethod_New in our python > binding library that we've written for a bunch of in house tools. > If this isn't the best / correct way to go about adding methods to > objects, what should we be using instead? First of all, the consensus in this thread is not to deprecate instancemethod. Well, it depends what you mean with "adding methods to objects", that's vaguely formulated. Do you mean adding methods at run-time (a.k.a. monkey-patching) to a pre-existing class? And is the process of adding methods done in C or in Python? Do you only need PyInstanceMethod_New() or also other PyInstanceMethod_XXX functions/macros? From robert.wd.white at gmail.com Mon Apr 8 11:45:24 2019 From: robert.wd.white at gmail.com (Robert White) Date: Mon, 8 Apr 2019 10:45:24 -0500 Subject: [Python-Dev] Deprecating "instance method" class In-Reply-To: <5CAB67B2.6010106@UGent.be> References: <5CA5EFCC.2030400@UGent.be> <5CA68BBC.8060205@canterbury.ac.nz> <5CA6F09D.3000900@UGent.be> <5CA73B95.6040509@UGent.be> <5CA74B19.70806@UGent.be> <5CA796B8.1000202@UGent.be> <5CA9A3AD.1010300@UGent.be> <5CAB67B2.6010106@UGent.be> Message-ID: Just PyInstanceMethod_New, and by "adding methods to objects" this is adding C functions to types defined in C. Only appears to be called at module import / creation time. On Mon, Apr 8, 2019 at 10:24 AM Jeroen Demeyer wrote: > On 2019-04-08 17:08, Robert White wrote: > > So we're making pretty heavy use of PyInstanceMethod_New in our python > > binding library that we've written for a bunch of in house tools. > > If this isn't the best / correct way to go about adding methods to > > objects, what should we be using instead? > > First of all, the consensus in this thread is not to deprecate > instancemethod. > > Well, it depends what you mean with "adding methods to objects", that's > vaguely formulated. Do you mean adding methods at run-time (a.k.a. > monkey-patching) to a pre-existing class? And is the process of adding > methods done in C or in Python? > > Do you only need PyInstanceMethod_New() or also other > PyInstanceMethod_XXX functions/macros? > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vstinner at redhat.com Tue Apr 9 10:22:56 2019 From: vstinner at redhat.com (Victor Stinner) Date: Tue, 9 Apr 2019 16:22:56 +0200 Subject: [Python-Dev] Need help to fix HTTP Header Injection vulnerability Message-ID: Hi, In May 2017, user "Orange" found a vulnerability in the urllib fix for CVE-2016-5699 (HTTP Header Injection vulnerability): https://bugs.python.org/issue30458 It allows to inject arbitrary HTTP headers. 
Copy of their message:

"""
Hi, the patch in CVE-2016-5699 can be broke by an addition space.
http://www.cvedetails.com/cve/CVE-2016-5699/
https://hg.python.org/cpython/rev/bf3e1c9b80e9
https://hg.python.org/cpython/rev/1c45047c5102

import urllib, urllib2
urllib.urlopen('http://127.0.0.1\r\n\x20hihi\r\n :11211')
urllib2.urlopen('http://127.0.0.1\r\n\x20hihi\r\n :11211')
"""

Last month, the same bug was rediscovered by user "ragdoll.guo": https://bugs.python.org/issue36276

Almost one year after the bug was reported, no one has come up with a solution. I'm not comfortable with having known security issues impacting HTTP. Can someone please have a look at the issue and try to write a change to fix it?

According to Karthikeyan Singaravelan, the Go language fixed a similar issue in Go 1.12: it throws an error if the URL contains any control character.

If we decide that the issue is not a security issue, we should document the behavior properly and close the issue.

See also this related issue: "urlopen URL with unescaped space" https://bugs.python.org/issue14826

Victor -- Night gathers, and now my watch begins. It shall not end until my death.

From vstinner at redhat.com Tue Apr 9 12:25:25 2019 From: vstinner at redhat.com (Victor Stinner) Date: Tue, 9 Apr 2019 18:25:25 +0200 Subject: [Python-Dev] No longer enable Py_TRACE_REFS by default in debug build Message-ID: Hi, When Python is built in debug mode, PyObject gets 2 new fields: _ob_prev and _ob_next. These fields change the offset of the following fields in the PyObject structure and so break the ABI. I propose to modify the debug build (Py_DEBUG) to no longer imply Py_TRACE_REFS. Antoine Pitrou proposed this idea when the C API was discussed to get a stable ABI. https://bugs.python.org/issue36465 https://github.com/python/cpython/pull/12615 This change makes the debug build ABI closer to the release build ABI, but I am not sure how to compare these two ABIs. Technically, C extensions still need to be recompiled. What do you think? -- I also wrote a PR to remove all code related to Py_TRACE_REFS: https://github.com/python/cpython/pull/12614 I don't think that it's the right approach. I prefer to keep this special build around to see if anyone needs it, and wait one or two Python releases to decide what to do with it. Victor -- Night gathers, and now my watch begins. It shall not end until my death.

From steve.dower at python.org Tue Apr 9 16:16:29 2019 From: steve.dower at python.org (Steve Dower) Date: Tue, 9 Apr 2019 13:16:29 -0700 Subject: [Python-Dev] No longer enable Py_TRACE_REFS by default in debug build In-Reply-To: References: Message-ID: <0a1871f0-14f8-8591-cf35-21cb3741f354@python.org> On 09Apr2019 0925, Victor Stinner wrote: > This change makes the debug build ABI closer to the release build ABI, > but I am not sure how to compare these two ABIs. Technically, C > extensions still need to be recompiled. > > What do you think? What are the other changes that would be required? And is there another way to get the same functionality without ABI modifications? I think it's worthwhile if we can really get to debug and non-debug builds being ABI compatible. Getting partway there in this case doesn't seem to offer any benefits.
Cheers, Steve From steve.dower at python.org Tue Apr 9 16:20:46 2019 From: steve.dower at python.org (Steve Dower) Date: Tue, 9 Apr 2019 13:20:46 -0700 Subject: [Python-Dev] New Python Initialization API In-Reply-To: References: <70c245c7-8f40-a51b-934b-af958f7cd849@python.org> <119e26f0-d826-7e13-aa4a-e0a67bea3265@python.org> <91eec784-28ab-024f-42a6-8a1e5d37d9bf@python.org> <6a17e990-7e2a-0544-1e8e-9b16d05e4df6@python.org> Message-ID: Thanks for the replies. Anything I don't comment on means that I agree with you :) On 05Apr2019 0900, Victor Stinner wrote: > Honestly, I'm not sure that we really have to distinguish "user error" and > "internal error". It's an old debate about calling abort()/DebugBreak() or > not. It seems like most users are annoyed by getting a coredump on Unix > when abort() is called. I'm also annoyed by the crash reports on Windows when "encodings" cannot be found (because occasionally there are enough of them that the Windows team starts reviewing the issue, and I get pulled in to review and resolve their bugs). > Maybe we should just remove Py_INIT_USER_ERR(), always use Py_INIT_ERR(), > and never call abort()/DebugBreak() in Py_ExitInitError(). Not calling abort() sounds fine to me. Embedders would likely prefer an error code rather than a crash, but IIRC they'd have to call Py_ExitInitError() to get the crash anyway, right? > Or does anyone see a good reason to trigger a debugger on an > initialization error? Only before returning from the point where the error occurs. By the time you've returned the error value all the useful context is gone. > Note: I'm not talking about Py_FatalError() here, this one will not > change. Does this get called as part of initialization? If not, I'm fine with it not changing. Cheers, Steve From steve.dower at python.org Tue Apr 9 16:39:59 2019 From: steve.dower at python.org (Steve Dower) Date: Tue, 9 Apr 2019 13:39:59 -0700 Subject: [Python-Dev] New Python Initialization API In-Reply-To: References: <70c245c7-8f40-a51b-934b-af958f7cd849@python.org> <119e26f0-d826-7e13-aa4a-e0a67bea3265@python.org> <91eec784-28ab-024f-42a6-8a1e5d37d9bf@python.org> <6a17e990-7e2a-0544-1e8e-9b16d05e4df6@python.org> Message-ID: On 05Apr2019 0912, Victor Stinner wrote: > About PyPreConfig and encodings. > [...] >>> * ``PyInitError Py_PreInitialize(const PyPreConfig *config)`` >>> * ``PyInitError Py_PreInitializeFromArgs( const PyPreConfig *config, >> int argc, char **argv)`` >>> * ``PyInitError Py_PreInitializeFromWideArgs( const PyPreConfig >> *config, int argc, wchar_t **argv)`` >> >> I hope to one day be able to support multiple runtimes per process - can >> we have an opaque PyRuntime object exposed publicly now and passed into >> these functions? > > I hesitated to include a "_PyRuntimeState*" parameter somewhere, but I > chose to not do so. > > Currently, there is a single global variable _PyRuntime which has the type > _PyRuntimeState. The _PyRuntime_Initialize() API is designed around this > global variable. For example, _PyRuntimeState contains the registry of > interpreters: you don't want to have multiple registries :-) > > I understood that we should only have a single instance of > _PyRuntimeState. So IMHO it's fine to keep it private at this point. > There is no need to expose it in the API. So I didn't want to expose that particular object right now, but just some sort of "void*" parameter in the new APIs (and require either NULL or a known value be passed). 
That gives us the freedom to enable multiple runtimes in the future without having to change the API shape. > FYI I tried to design an internal API with a "context" to pass > _PyRuntimeState, PyPreConfig, _PyConfig, the current interpreter, etc. > [...] > There are 2 possible implementations: > > * Modify *all* functions to add a new "context" parameter and modify *all* > functions to pass this parameter to sub-functions. > * Store the current "context" as a thread local variable or something like > that. > [...] > For the second option: well, there is no API change needed! > It can be done later. > Moreover, we already have such API! PyThreadState_Get() gets the Python > thread state of the current thread: the current interpreter can be > accessed from there. Yes, this is what I had in mind as a transition. I think eventually it would be best to have the context parameter, as thread-local variables have overhead and add significant complexity (particularly when debugging crashes), but making that change is huge. >>> ``PyPreConfig`` fields: >>> >>> * ``coerce_c_locale_warn``: if non-zero, emit a warning if the C locale >>> is coerced. >>> * ``coerce_c_locale``: if equals to 2, coerce the C locale; if equals to >>> 1, read the LC_CTYPE to decide if it should be coerced. >> >> Can we use another value for coerce_c_locale to determine whether to >> warn or not? Save a field. > > coerce_c_locale is already complex, it can have 4 values: -1, 0, 1 and 2. > I prefer keep a separated field. > > Moreover, I understood that you might want to coerce the C locale *and* > get the warning, or get the warning but *not* coerce the locale. If we define meaningful constants, then it doesn't matter how many values it has. We could have PY_COERCE_LOCALE_AND_WARN, PY_COERCE_LOCALE_SILENTLY, PY_WARN_WITHOUT_COERCE etc. to represent the states. These actually make things simpler than trying to reason about how two similar parameters interact. >>> * ``legacy_windows_fs_encoding`` (Windows only): if non-zero, set the >>> Python filesystem encoding to ``"mbcs"``. >>> * ``utf8_mode``: if non-zero, enable the UTF-8 mode >> >> Why not just set the encodings here? > > For different technical reasons, you simply cannot specify an encoding > name. You can also pass options to tell Python that you have some > preferences (PyPreConfig and PyConfig fields). > > Python doesn't support any encoding and encoding errors combinations. In > practice, it only supports a narrow set of choices. The main implementation are > Py_EncodeLocale() and Py_DecodeLocale() functions which uses the C codec > of the current locale encoding to implement the filesystem encoding, > before the codec implemented in Python can be used. > > Basically, only the current locale encoding or UTF-8 are supported. > If you want UTF-8, enable the UTF-8 Mode. If we already had a trivial way to specify the default encodings as a string before any initialization has occurred, I think we would have made UTF-8 mode enabled by setting them to "utf-8" rather than a brand new flag. Again, we either have a huge set of flags to infer certain values at certain times, or we can just make them directly settable. If we make them settable, it's much easier for users to reason about what is going to happen. > To load the Python codec, you need importlib. importlib needs to access > the filesystem which requires a codec to encode/decode file names > (PyConfig.module_search_paths uses Unicode wchar_t* strings, but the C API > only supports bytes char* strings). 
Right, and the few places where we need an encoding *before* we can load any arbitrary ones we can easily compare the strings and fail if someone's trying to do something unusual (or if the platform can do the lookup itself, it could succeed). If we say "passing NULL means use the default" then we have that handled, and the actual encoding just gets set to the real default once we figure out what that is. > Py_PreInitialize() doesn't set the filesystem encoding. It initializes the > LC_CTYPE locale and Python global configuration variables (Py_UTF8Mode and > Py_LegacyWindowsFSEncodingFlag). Right, I'm proposing a simplification here where it *does* set the filesystem encoding (even though it doesn't get used until Py_Initialize() is called). That way we can use the filesystem encoding to access the filesystem during initialization, provided it's one of the built-in supported ones (e.g. NULL, which means the C locale, or "utf-8" which means UTF-8) rather than relying on the tables in the standard library. Oh look, I said all this in my original email: >> Obviously we are not ready to import most encodings after pre >> initialization, but I think that's okay. Embedders who set something >> outside the range of what can be used without importing encodings will >> get an error to that effect if we try. > > You need a C implementation of the Python filesystem encoding very early > in Python initialization. You cannot start with one encoding and "later" > switch the encoding. I tried multiple times the last 10 years and I always > failed to do that. All attempts failed with mojibake at different > levels. Again, this is for embedders. Regular Python users will only ever request "NULL" or "utf-8", depending on the UTF-8 mode flag. And embedders have to make sure they get what they ask for and also can't change it later. The problems you've hit in the past have always been to do with trying to infer or guess the actual encoding, rather than simply letting someone tell you what it is (via config) and letting them deal with the failure. >> In fact, I'd be totally okay with letting embedders specify their own >> function pointer here to do encoding/decoding between Unicode and the OS >> preferred encoding. > > In my experience, when someone wants to get a specific encoding: they > only want UTF-8. There is now the UTF-8 Mode which ignores the locale > and forces the usage of UTF-8. Your experience here sounds like it's limited to POSIX systems. I've wanted UTF-16 before, and been able to provide it (if Python had allowed me to provide a callback to encode/decode). And again, all this is about "why do we need to define a boolean that determines what the encoding is when we can just let people tell us what encoding they want". There's a good chance that an embedded Python isn't going to touch the real filesystem anyway. > I'm not sure that there is a need to have a custom codec. Moreover, if > there an API to pass a codec in C, you will need to expose it somehow > at the Python level for os.fsencode() and os.fsdecode(). We need to expose those operations anyway, and os.fsencode/fsdecode have their own issues (particularly since there *are* ways to change filesystem encoding while running). Turning them into actual native functions that might call out to a host-provided callback would not be difficult. > Currently, Python ensures during early stage of startup that > codecs.lookup(sys.getfilesystemencoding()) works: there is a existing > Python codec for the requested filesystem encoding. 
Right, it's a validation step. But we can also make codecs.lookup("whatever the file system encoding is") return something based on os.fsencode() and os.fsdecode(). We're not actually beholden to the current implementations here - we are allowed to change them! ;) From steve.dower at python.org Tue Apr 9 16:44:15 2019 From: steve.dower at python.org (Steve Dower) Date: Tue, 9 Apr 2019 13:44:15 -0700 Subject: [Python-Dev] New Python Initialization API In-Reply-To: References: <70c245c7-8f40-a51b-934b-af958f7cd849@python.org> <119e26f0-d826-7e13-aa4a-e0a67bea3265@python.org> <91eec784-28ab-024f-42a6-8a1e5d37d9bf@python.org> <6a17e990-7e2a-0544-1e8e-9b16d05e4df6@python.org> Message-ID: On 05Apr2019 0922, Victor Stinner wrote: > While there are supporters of an "isolated Python" (sometimes called > "system python"), the fact that it doesn't exist in any Linux distribution > nor on any other operating system (Windows, macOS, FreeBSD), whereas it's > already doable in Python 3.6 with Py_IsolatedFlag=1 makes me think that > users like the ability to control Python with environment variables and > configuration files. > > I would prefer to leave Python as not isolated by default. It's just a > matter of command line arguments. Not for embedders it isn't. When embedding you need to do a whole lot of special things to make sure that your private version of Python doesn't pick up settings relating to a regular Python install. We should make the Python runtime isolated by default, and only (automatically) pick up settings from the environment in the Python binary. >>> * The PEP 432 stores ``PYTHONCASEOK`` into the config. Do we need >>> to add something for that into ``PyConfig``? How would it be exposed >>> at the Python level for ``importlib``? Passed as an argument to >>> ``importlib._bootstrap._setup()`` maybe? It can be added later if >>> needed. >> >> Could we convert it into an xoption? It's very rarely used, to my knowledge. > > The first question is if there is any need for an embedder to change > this option. Currently, importlib._bootstrap_external._install() reads > the environment variable and it's the only way to control the option. I think the first question should be "is there any reason to prevent an embedder from changing this option". In general, the answer is going to be no. We should expose all the options we rely on to embedders, or else they're going to have to find workarounds. > ... By the way, importlib reads PYTHONCASEOK environment varaible even > if isolated mode is enabled (sys.flags.isolated is equal to 1). Is it > a bug? :-) Yes, I think it's a bug. Perhaps this should become a proper configuration option, rather than a directly-read environment variable? From steve.dower at python.org Tue Apr 9 16:51:03 2019 From: steve.dower at python.org (Steve Dower) Date: Tue, 9 Apr 2019 13:51:03 -0700 Subject: [Python-Dev] New Python Initialization API In-Reply-To: References: <70c245c7-8f40-a51b-934b-af958f7cd849@python.org> <119e26f0-d826-7e13-aa4a-e0a67bea3265@python.org> <91eec784-28ab-024f-42a6-8a1e5d37d9bf@python.org> <6a17e990-7e2a-0544-1e8e-9b16d05e4df6@python.org> Message-ID: On 05Apr2019 0936, Victor Stinner wrote: > For the PyMainConfig structure idea, I cannot comment at this point. I > need more time to think about it. 
> > > About the "path configuration" fields, maybe a first step to enhance > the API would be to add the the following function: > > PyInitError PyConfig_ComputePath(PyConfig *config, const wchar *home); > > where home can be NULL (and PyConfig.module_search_paths_env field > goes away: the function reads PYTHONPATH env var internally). Yes, I like this. Maybe pass PYTHONPATH value in as an "additional paths" parameter? Basically, this function would be the replacement for "Py_GetPath()" (which initializes paths to the defaults the first time it is called), and setting the path fields in PyConfig manually is the replacement for Py_SetPath() (or calling the various Py_Set*() functions to make the default logic infer the paths you want). Similarly, PyConfig_ComputeFromArgv() and/or PyConfig_ComputeFromEnviron() functions would also directly replace the magic we have scattered all over the place right now. It would also make it more obvious to the callers which values take precedence, and easier to see that there should be no side effects. I think it's easier to document as well. Cheers, Steve From tir.karthi at gmail.com Tue Apr 9 19:45:06 2019 From: tir.karthi at gmail.com (Karthikeyan) Date: Wed, 10 Apr 2019 05:15:06 +0530 Subject: [Python-Dev] Need help to fix HTTP Header Injection vulnerability In-Reply-To: References: Message-ID: I would recommend fixing it since it's potentially remote code execution on systems like Redis (latest versions of Redis have this mitigated) though I must admit I don't fully understand the complexity since there are multiple issues linked. Go was also assigned a CVE for linked issue and it seemed to be the same reporter by username : CVE-2019-9741 . I tried using go's approach in the commit but urlopen accepts more URLs like data URLs [0] that seemed to accept \n as a valid case and the patch broke some tests. Looking at the issue discussion complexity also involves backwards compatibility. golang also pushed an initial fix that seemed to broke their internal tests [0] to arrive at a more simpler fix. [0] https://github.com/python/cpython/blob/a40681dd5db8deaf05a635eecb91498dac882aa4/Lib/test/test_urllib.py#L482 [1] https://go-review.googlesource.com/c/go/+/159157/2#message-39c6be13a192bf760f6318ac641b432a6ab8fdc8 -- Regards, Karthikeyan S -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg at krypto.org Tue Apr 9 20:41:27 2019 From: greg at krypto.org (Gregory P. Smith) Date: Tue, 9 Apr 2019 17:41:27 -0700 Subject: [Python-Dev] Need help to fix HTTP Header Injection vulnerability In-Reply-To: References: Message-ID: On Tue, Apr 9, 2019 at 4:45 PM Karthikeyan wrote: > I would recommend fixing it since it's potentially remote code execution > on systems like Redis (latest versions of Redis have this mitigated) though > I must admit I don't fully understand the complexity since there are > multiple issues linked. Go was also assigned a CVE for linked issue and it > seemed to be the same reporter by username : CVE-2019-9741 . I tried using > go's approach in the commit but urlopen accepts more URLs like data URLs > [0] that seemed to accept \n as a valid case and the patch broke some > tests. Looking at the issue discussion complexity also involves backwards > compatibility. golang also pushed an initial fix that seemed to broke their > internal tests [0] to arrive at a more simpler fix. 
> > [0] > https://github.com/python/cpython/blob/a40681dd5db8deaf05a635eecb91498dac882aa4/Lib/test/test_urllib.py#L482 > [1] > https://go-review.googlesource.com/c/go/+/159157/2#message-39c6be13a192bf760f6318ac641b432a6ab8fdc8 > > -- > Regards, > Karthikeyan S > useful references, thanks! limiting the checks to only http and https as those are the text based protocols with urls transmitted in text form makes sense and avoids the data: test failures. proposed simple fix in https://github.com/python/cpython/pull/12755 but tests are needed as is an audit of the code to see where else we may potentially need to do such things. -gps -------------- next part -------------- An HTML attachment was scrubbed... URL: From tir.karthi at gmail.com Wed Apr 10 00:30:59 2019 From: tir.karthi at gmail.com (Karthikeyan) Date: Wed, 10 Apr 2019 10:00:59 +0530 Subject: [Python-Dev] Need help to fix HTTP Header Injection vulnerability In-Reply-To: References: Message-ID: Thanks Gregory. I think it's a good tradeoff to ensure this validation only for URLs of http scheme. I also agree handling newline is little problematic over the years and the discussion over the level at which validation should occur also prolongs some of the patches. https://bugs.python.org/issue35906 is another similar case where splitlines is used but it's better to raise an error and the proposed fix could be used there too. Victor seemed to wrote a similar PR like linked one for other urllib functions only to fix similar attack in ftplib to reject newlines that was eventually fixed only in ftplib * https://bugs.python.org/issue30713 * https://bugs.python.org/issue29606 Search also brings multiple issues with one duplicate over another that makes these attacks scattered over the tracker and some edge case missing. Slightly off topic, the last time I reported a cookie related issue where the policy can be overriden by third party library I was asked to fix it in stdlib itself since adding fixes to libraries causes maintenance burden to downstream libraries to keep up upstream. With urllib being a heavily used module across ecosystem it's good to have a fix landing in stdlib that secures downstream libraries encouraging users to upgrade Python too. Regards, Karthikeyan S > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wes.turner at gmail.com Wed Apr 10 06:16:10 2019 From: wes.turner at gmail.com (Wes Turner) Date: Wed, 10 Apr 2019 06:16:10 -0400 Subject: [Python-Dev] Need help to fix HTTP Header Injection vulnerability In-Reply-To: References: Message-ID: 1. Is there a library of URL / Header injection tests e.g. for fuzzing that we could generate additional test cases with or from? 2. Are requests.get() and requests.post() also vulnerable? 3. Despite the much-heralded UNIX pipe protocols' utility, filenames containing newlines (the de-facto line record delimiter) are possible: "file"$'\n'"name" Should filenames containing newlines and control characters require a kwarg to be non-None in order to be passed through unescaped to the HTTP request? On Wednesday, April 10, 2019, Karthikeyan wrote: > Thanks Gregory. I think it's a good tradeoff to ensure this validation > only for URLs of http scheme. > > I also agree handling newline is little problematic over the years and the > discussion over the level at which validation should occur also prolongs > some of the patches. 
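To make the attack concrete for anyone skimming the thread: whatever reaches the request line unescaped is copied straight into the HTTP payload, so embedded CR/LF becomes attacker-chosen header lines. Below is a small self-contained sketch of the injection and of the style of control-character check being discussed; it is an illustration only, not the code from the linked PR:

    # What the injection looks like once the request line is formatted.
    malicious_path = "/search?q=x HTTP/1.1\r\nX-Injected: evil\r\nIgnore:"
    request = "GET %s HTTP/1.1\r\nHost: example.com\r\n\r\n" % malicious_path
    print(request)   # the payload now contains an X-Injected header line

    # The kind of check being discussed (similar in spirit to the Go fix):
    # refuse to send anything that still contains ASCII control characters.
    def check_url_part(part):
        if any(ord(ch) < 0x20 or ord(ch) == 0x7f for ch in part):
            raise ValueError("control character in URL: %r" % part)
        return part

    check_url_part("/search?q=x")        # passes
    # check_url_part(malicious_path)     # would raise ValueError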
https://bugs.python.org/issue35906 is another > similar case where splitlines is used but it's better to raise an error and > the proposed fix could be used there too. Victor seemed to wrote a similar > PR like linked one for other urllib functions only to fix similar attack in > ftplib to reject newlines that was eventually fixed only in ftplib > > * https://bugs.python.org/issue30713 > * https://bugs.python.org/issue29606 > > Search also brings multiple issues with one duplicate over another that > makes these attacks scattered over the tracker and some edge case missing. > Slightly off topic, the last time I reported a cookie related issue where > the policy can be overriden by third party library I was asked to fix it in > stdlib itself since adding fixes to libraries causes maintenance burden to > downstream libraries to keep up upstream. With urllib being a heavily used > module across ecosystem it's good to have a fix landing in stdlib that > secures downstream libraries encouraging users to upgrade Python too. > > Regards, > Karthikeyan S > >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From vstinner at redhat.com Wed Apr 10 06:51:50 2019 From: vstinner at redhat.com (Victor Stinner) Date: Wed, 10 Apr 2019 12:51:50 +0200 Subject: [Python-Dev] Need help to fix HTTP Header Injection vulnerability In-Reply-To: References: Message-ID: Hi, I dig into Python code history and the bug tracker. I would like to say that this issue is a work-in-progress since 2004. Different fixes have been pushed, but there are *A LOT* of open issues: https://bugs.python.org/issue30458#msg339846 I would suggest to discuss on https://bugs.python.org/issue30458 rather than here, just to avoid to duplicate discussions ;-) Note: the whole class of issue (HTTP Header Injection) got at least 3 CVE: CVE-2016-5699, CVE-2019-9740, CVE-2019-9947. I changed bpo-30458 title to "[security][CVE-2019-9740][CVE-2019-9947] HTTP Header Injection (follow-up of CVE-2016-5699)". Victor Le mer. 10 avr. 2019 ? 12:20, Wes Turner a ?crit : > > 1. Is there a library of URL / Header injection tests e.g. for fuzzing that we could generate additional test cases with or from? > > 2. Are requests.get() and requests.post() also vulnerable? > > 3. Despite the much-heralded UNIX pipe protocols' utility, filenames containing newlines (the de-facto line record delimiter) are possible: "file"$'\n'"name" > > Should filenames containing newlines and control characters require a kwarg to be non-None in order to be passed through unescaped to the HTTP request? > > On Wednesday, April 10, 2019, Karthikeyan wrote: >> >> Thanks Gregory. I think it's a good tradeoff to ensure this validation only for URLs of http scheme. >> >> I also agree handling newline is little problematic over the years and the discussion over the level at which validation should occur also prolongs some of the patches. https://bugs.python.org/issue35906 is another similar case where splitlines is used but it's better to raise an error and the proposed fix could be used there too. Victor seemed to wrote a similar PR like linked one for other urllib functions only to fix similar attack in ftplib to reject newlines that was eventually fixed only in ftplib >> >> * https://bugs.python.org/issue30713 >> * https://bugs.python.org/issue29606 >> >> Search also brings multiple issues with one duplicate over another that makes these attacks scattered over the tracker and some edge case missing. 
Slightly off topic, the last time I reported a cookie related issue where the policy can be overriden by third party library I was asked to fix it in stdlib itself since adding fixes to libraries causes maintenance burden to downstream libraries to keep up upstream. With urllib being a heavily used module across ecosystem it's good to have a fix landing in stdlib that secures downstream libraries encouraging users to upgrade Python too. >> >> Regards, >> Karthikeyan S > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/vstinner%40redhat.com -- Night gathers, and now my watch begins. It shall not end until my death. From vstinner at redhat.com Wed Apr 10 07:01:42 2019 From: vstinner at redhat.com (Victor Stinner) Date: Wed, 10 Apr 2019 13:01:42 +0200 Subject: [Python-Dev] No longer enable Py_TRACE_REFS by default in debug build In-Reply-To: <0a1871f0-14f8-8591-cf35-21cb3741f354@python.org> References: <0a1871f0-14f8-8591-cf35-21cb3741f354@python.org> Message-ID: Le mar. 9 avr. 2019 ? 22:16, Steve Dower a ?crit : > What are the other changes that would be required? I don't know. > And is there another > way to get the same functionality without ABI modifications? Py_TRACE_REFS is a double linked list of *all* Python objects. To get this functionality, you need to store the list somewhere. I don't know how to maintain such list outside the PyObject structure. One solution would be to enable Py_TRACE_REFS in release mode. Does anyone want to add 16 bytes to every PyObject? I don't want that :-) > I think it's worthwhile if we can really get to debug and non-debug > builds being ABI compatible. Getting partway there in this case doesn't > seem to offer any benefits. Disabling Py_TRACE_REFS by default in debug mode reduces the Python memory footprint. Py_TRACE_REFS costs 2 pointers per PyObject: 16 bytes on 64-bit platforms. I don't think that I ever used sys.getobjects(), whereas many projects use gc.get_objects() which is also available in release builds (not only in debug builds). I'm quite sure that almost nobody uses debug builds because the ABI is incompatible. The main question is if anyone ever used Py_TRACE_REFS? Does someone use sys.getobjects() or PYTHONDUMPREFS environment variable? Using PYTHONDUMPREFS=1 on a debug build (with Py_TRACE_REFS) does simply crash Python 3.7 at exit. So I don't think that anyone use it :-) I wrote PR 12614 to remove all code related to Py_TRACE_REFS. I wrote it to see which code depends on it: commit 63509498761a0e7f72585a8cd7df325ea2abd1b2 (HEAD -> remove_trace_refs, origin/remove_trace_refs) Author: Victor Stinner Date: Thu Mar 28 23:26:58 2019 +0100 WIP: bpo-36465: Remove Py_TRACE_REFS special build Remove _ob_prev and _ob_next fields of PyObject when Python is compiled in debug mode to make debug ABI closer to the release ABI. Remove: * sys.getobjects() * PYTHONDUMPREFS environment variable * _PyCoreConfig.dump_refs * PyObject._ob_prev and PyObject._ob_next fields * _PyObject_HEAD_EXTRA and _PyObject_EXTRA_INIT macros * _Py_AddToAllObjects() * _Py_PrintReferenceAddresses() * _Py_PrintReferences() Victor -- Night gathers, and now my watch begins. It shall not end until my death. 
From tir.karthi at gmail.com Wed Apr 10 07:07:03 2019 From: tir.karthi at gmail.com (Karthikeyan) Date: Wed, 10 Apr 2019 16:37:03 +0530 Subject: [Python-Dev] Need help to fix HTTP Header Injection vulnerability In-Reply-To: References: Message-ID: > 1. Is there a library of URL / Header injection tests e.g. for fuzzing > that we could generate additional test cases with or from? https://github.com/swisskyrepo/PayloadsAllTheThings seems to contain payload related stuff but not sure how useful it is for URL parsing. > > 2. Are requests.get() and requests.post() also vulnerable? > urllib3 seems to be vulnerable as noted in https://bugs.python.org/issue36276#msg337837 . requests uses urllib3 under the hood. The last time I checked requests passed encoded URL to urllib3 where this doesn't seem to be exploitable but I could be wrong. -- Regards, Karthikeyan S -------------- next part -------------- An HTML attachment was scrubbed... URL: From aranea.network at gmail.com Wed Apr 10 07:24:40 2019 From: aranea.network at gmail.com (Robert Okadar) Date: Wed, 10 Apr 2019 13:24:40 +0200 Subject: [Python-Dev] (no subject) Message-ID: Hi community, I have developed a tkinter GUI component, Python v3.7. It runs very well in Linux but seeing a huge performance impact in Windows 10. While in Linux an almost real-time performance is achieved, in Windows it is slow to an unusable level. The code is somewhat stripped down from the original, but the performance difference is the same anyway. The columns can be resized by clicking on the column border and dragging it. Resizing works only for the top row (but it resizes the entire column). In this demo, all bindings are avoided to exclude influence on the component performance and thus not included. If you resize the window (i.e., if you maximize it), you must call the function table.fit() from IDLE shell. Does anyone know where is this huge difference in performance coming from? Can anything be done about it? All the best, -- Robert Okadar IT Consultant Schedule an *online meeting * with me! Visit *aranea-mreze.hr* or call * +385 91 300 8887* -------------- next part -------------- An HTML attachment was scrubbed... 
URL: -------------- next part -------------- import tkinter class Resizer(tkinter.Frame): def __init__(self, info_grid, master, **cnf): self.table_grid = info_grid tkinter.Frame.__init__(self, master, **cnf) self.bind('', self.resize_column) self.bind('', self.resize_start) self.bind('', self.resize_end) self._resizing = False self.bind('', self.onDestroyEvent) def onDestroyEvent(self, event): self.table_grid = [] def resize_column(self, event, width = None): #if self._resizing: top = self.table_grid.Top grid = self.table_grid._grid col = self.master.grid_info()["column"] if not width: width = self._width + event.x_root - self._x_root top.columnconfigure(col, minsize = width) grid.columnconfigure(col, minsize = width) def resize_start(self, event): top = self.table_grid.Top self._resizing = True self._x_root = event.x_root col = self.master.grid_info()["column"] self._width = top.grid_bbox(row = 0, column = col)[2] #print event.__dict__ col = self.master.grid_info()["column"] #print top.grid_bbox(row = 0, column = col) def resize_end(self, event): pass #self.table_grid.xscrollcommand() #self.table_grid.column_resize_callback(col, self) class Table(tkinter.Frame): def __init__(self, master, columns = 10, rows = 20, width = 100,**kw): tkinter.Frame.__init__(self, master, **kw) self.columns = [] self._width = width self._grid = grid = tkinter.Frame(self, bg = "#CCCCCC") self.Top = top = tkinter.Frame(self, bg = "#DDDDDD") self.create_top(columns) self.create_grid(rows) #self.bind('', self.on_table_configure) #self.bind('', self.on_table_map) top.pack(anchor = 'nw')#, expand = 1, fill = "both") grid.pack(anchor = 'nw')#fill = "both",expand = 1 def on_table_map(self, event): theight = self.winfo_height() def fit(self):#on_table_configure(self, event): i = 0 for frame in self.Top.grid_slaves(row = 0): frame.resizer.resize_column(None, width = frame.winfo_width()) i += 1 theight = self.winfo_height() fheight = self._grid.winfo_height() + self.Top.winfo_height() #print('', theight, fheight) if theight > fheight: rheight = self.grid_array[0][0].winfo_height() ammount = int((-fheight + theight) / rheight) #print(rheight, ammount) for i in range(ammount): self.add_row() self.update() def add_row(self, ammount = 1): columnsw = self.columns row = [] i = len(self.grid_array) for j in range(len(columnsw)): bg = self.bgcolor0 if i % 2 == 1: bg = self.bgcolor1 entry = tkinter.Label(self._grid, bg = bg, text = '%i %i' % (i, j)) entry.grid(row = i, column = j, sticky = "we", padx = 2) row.append(entry) self.grid_array.append(row) bgcolor0 = "#FFFFFF" bgcolor1 = "#EEEEEE" def create_grid(self, height): #grid.grid(row = 0, column = 0, sticky = "nsew") columnsw = self.columns# = self.Top.grid_slaves(row = 1) self.grid_array = [] for i in range(height): row = [] for j in range(len(columnsw)): bg = self.bgcolor0 if i % 2 == 1: bg = self.bgcolor1 #entry = self.EntryClass(False, self, self._grid, bg = bg, width = 1, ) entry = tkinter.Label(self._grid, bg = bg, text = '%i %i' % (i,j)) entry.grid(row = i, column = j, sticky = "we", padx = 2) row.append(entry) self.grid_array.append(row) def create_top(self, columns = 10): top = self.Top #columns = self._columns #maybe to rename for i in range(columns): name = 'column %i' % i self.add_column(name, top) def add_column(self, name, top, width = None): if not width: width = self._width col = tkinter.Frame(top) i = len(self.columns) #filter = Filter(self, name, i, top) entry = tkinter.Entry(col, width = 1) #readonlybackground #col.ColumnIndex = i #col.array_index = i 
entry.insert(0, name) resizer = Resizer(self, col, bg = "#000000", width = 3, height = 21, cursor = 'sb_h_double_arrow') #entry.grid(row = 0, column = 0, sticky = "we") #resizer.grid(row = 0, column = 1, sticky = "e") entry.pack(side = "left", fill = "both", expand = 1) resizer.pack(side = "right") top.columnconfigure(i, minsize = width, weight = 1) #filter.grid(row = 0, column = i, sticky = "we") col.grid(row = 0, column = i, sticky = "we") col.entry = entry #col.filter = filter #col.index = filter.index = i col.resizer = resizer #filter.Column = col entry.Column = col self.columns.append(col) if __name__ == '__main__': columns = 30 rows = 20 width = 60 root = tkinter.Tk() root.wm_title('TableGridTest') table = self = Table(root, columns = columns, rows = rows, width = width) table.pack(expand = 1, fill = 'both') #table.create_grid( From steve at pearwood.info Wed Apr 10 11:35:56 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 11 Apr 2019 01:35:56 +1000 Subject: [Python-Dev] (no subject) In-Reply-To: References: Message-ID: <20190410153555.GB3010@ando.pearwood.info> Hi Robert, This mailing list is for the development of the Python interpreter, not a general help desk. There are many other forums where you can ask for help, such as the comp.lang.python newsgroup, Stackoverflow, /r/python on Reddit, the IRC channel, and more. Perhaps you can help us though, I presume you signed up to this mailing list via the web interface at https://mail.python.org/mailman/listinfo/python-dev Is there something we could do to make it more clear that this is not the right place to ask for help? -- Steven From pviktori at redhat.com Wed Apr 10 12:25:57 2019 From: pviktori at redhat.com (Petr Viktorin) Date: Wed, 10 Apr 2019 18:25:57 +0200 Subject: [Python-Dev] PEP 590 discussion In-Reply-To: <421f8182-4bc8-b8cf-82d6-ca4a4fbd2013@hotpy.org> References: <6c8356e3-f9b2-c39e-63c4-17f146d326b7@hotpy.org> <15b8a3d7-00ed-a5eb-475c-a3adee671b5f@hotpy.org> <5C9FEF82.50207@UGent.be> <421f8182-4bc8-b8cf-82d6-ca4a4fbd2013@hotpy.org> Message-ID: <50d675b4-839c-6502-ad1a-a33ea9330000@redhat.com> Hello! I've had time for a more thorough reading of PEP 590 and the reference implementation. Thank you for the work! Overall, I like PEP 590's direction. I'd now describe the fundamental difference between PEP 580 and PEP 590 as: - PEP 580 tries to optimize all existing calling conventions - PEP 590 tries to optimize (and expose) the most general calling convention (i.e. fastcall) PEP 580 also does a number of other things, as listed in PEP 579. But I think PEP 590 does not block future PEPs for the other items. On the other hand, PEP 580 has a much more mature implementation -- and that's where it picked up real-world complexity. PEP 590's METH_VECTORCALL is designed to handle all existing use cases, rather than mirroring the existing METH_* varieties. But both PEPs require the callable's code to be modified, so requiring it to switch calling conventions shouldn't be a problem. Jeroen's analysis from https://mail.python.org/pipermail/python-dev/2018-July/154238.html seems to miss a step at the top: a. CALL_FUNCTION* / CALL_METHOD opcode calls b. _PyObject_FastCallKeywords() which calls c. _PyCFunction_FastCallKeywords() which calls d. _PyMethodDef_RawFastCallKeywords() which calls e. the actual C function (*ml_meth)() I think it's more useful to say that both PEPs bridge a->e (via _Py_VectorCall or PyCCall_Call). PEP 590 is built on a simple idea, formalizing fastcall. 
But it is complicated by PY_VECTORCALL_ARGUMENTS_OFFSET and Py_TPFLAGS_METHOD_DESCRIPTOR. As far as I understand, both are there to avoid intermediate bound-method object for LOAD_METHOD/CALL_METHOD. (They do try to be general, but I don't see any other use case.) Is that right? (I'm running out of time today, but I'll write more on why I'm asking, and on the case I called "impossible" (while avoiding creation of a "bound method" object), later.) The way `const` is handled in the function signatures strikes me as too fragile for public API. I'd like if, as much as possible, PY_VECTORCALL_ARGUMENTS_OFFSET was treated as a special optimization that extension authors can either opt in to, or blissfully ignore. That might mean: - vectorcall, PyObject_VectorCallWithCallable, PyObject_VectorCall, PyCall_MakeTpCall all formally take "PyObject *const *args" - a na?ve callee must do "nargs &= ~PY_VECTORCALL_ARGUMENTS_OFFSET" (maybe spelled as "nargs &= PY_VECTORCALL_NARGS_MASK"), but otherwise writes compiler-enforced const-correct code. - if PY_VECTORCALL_ARGUMENTS_OFFSET is set, the callee may modify "args[-1]" (and only that, and after the author has read the docs). Another point I'd like some discussion on is that vectorcall function pointer is per-instance. It looks this is only useful for type objects, but it will add a pointer to every new-style callable object (including functions). That seems wasteful. Why not have a per-type pointer, and for types that need it (like PyTypeObject), make it dispatch to an instance-specific function? Minor things: - "Continued prohibition of callable classes as base classes" -- this section reads as a final. Would you be OK wording this as something other PEPs can tackle? - "PyObject_VectorCall" -- this looks extraneous, and the reference imlementation doesn't need it so far. Can it be removed, or justified? - METH_VECTORCALL is *not* strictly "equivalent to the currently undocumented METH_FASTCALL | METH_KEYWORD flags" (it has the ARGUMENTS_OFFSET complication). - I'd like to officially call this PEP "Vectorcall", see https://github.com/python/peps/pull/984 Mark, what are your plans for next steps with PEP 590? If a volunteer wanted to help you push this forward, what would be the best thing to work on? Jeroen, is there something in PEPs 579/580 that PEP 590 blocks, or should address? From vano at mail.mipt.ru Wed Apr 10 13:57:13 2019 From: vano at mail.mipt.ru (Ivan Pozdeev) Date: Wed, 10 Apr 2019 20:57:13 +0300 Subject: [Python-Dev] Need help to fix HTTP Header Injection vulnerability In-Reply-To: References: Message-ID: <8bd61b63-4fc7-91be-b717-226c205e3623@mail.mipt.ru> On 10.04.2019 7:30, Karthikeyan wrote: > Thanks Gregory. I think it's a good tradeoff to ensure this validation only for URLs of http scheme. > > I also agree handling newline is little problematic over the years and the discussion over the level at which validation should occur also > prolongs some of the patches. https://bugs.python.org/issue35906 is another similar case where splitlines is used but it's better to raise > an error and the proposed fix could be used there too. Victor seemed to wrote a similar PR like linked one for other urllib functions only > to fix similar attack in ftplib to reject newlines that was eventually fixed only in ftplib > > * https://bugs.python.org/issue30713 > * https://bugs.python.org/issue29606 > > Search also brings multiple issues with one duplicate over another that makes these attacks scattered over the tracker and some edge case > missing. 
Slightly off topic, the last time I reported a cookie related issue where the policy can be overriden by third party library I > was asked to fix it in stdlib itself since adding fixes to libraries causes maintenance burden to downstream libraries to keep up > upstream. With urllib being a heavily used module across ecosystem it's good to have a fix landing in stdlib that secures downstream > libraries encouraging users to upgrade Python too. > Validation should occur whenever user data crosses a trust boundary -- i.e. when the library starts to assume that an extracted chunk now contains something valid. https://tools.ietf.org/html/rfc3986 defines valid syntax (incl. valid characters) for every part of a URL -- _of any scheme_ (FYI, \r\n are invalid everywhere and the test code for ??? `data:' that Karthikeyan referred to is raw data to compare to rather than a part of a URL). It also obsoletes all the RFCs that the current code is written against. AFAICS, urllib.split* fns (obsoleted as public in 3.8) are used by both urllib and urllib2 to parse URLs. They can be made to each validate the chunk that they split off. urlparse can validate the entire URL altogether. Also, all modules ought to use the same code (urlparse looks like the best candidate) to parse URLs -- this will minimize the attack surface. I think I can look into this later this week. Fixing this is going to break code that relies on the current code accepting invalid URLs. But the docs have never said that e.g. in urlopen, anything apart from a (valid) URL is accepted (in particular, this implies that the user is responsible for escaping stuff properly before passing it). So I would say that we are within our right here and whoever is relying on those quirks is and has always been on unsupported territory. Determining which of those quirks are exploitable and which are not to fix just the former is an incomparably larger, more error-prone and avoidable work. If anything, the history of the issue referenced to by previous posters clearly shows that this is too much to ask from the Python team. I also see other undocumented behavior like accepting '>' (also obsoleted as public in 3.8) which I would like to but it's of no harm. -- Regards, Ivan -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve.dower at python.org Wed Apr 10 14:09:49 2019 From: steve.dower at python.org (Steve Dower) Date: Wed, 10 Apr 2019 11:09:49 -0700 Subject: [Python-Dev] No longer enable Py_TRACE_REFS by default in debug build In-Reply-To: References: <0a1871f0-14f8-8591-cf35-21cb3741f354@python.org> Message-ID: On 10Apr2019 0401, Victor Stinner wrote: > Le mar. 9 avr. 2019 ? 22:16, Steve Dower a ?crit : >> What are the other changes that would be required? > > I don't know. > >> And is there another >> way to get the same functionality without ABI modifications? > > Py_TRACE_REFS is a double linked list of *all* Python objects. To get > this functionality, you need to store the list somewhere. I don't know > how to maintain such list outside the PyObject structure. There's certainly no more convenient way to do it. Maybe if we had detached reference counts it would be easier, but it would likely still result in ABI compatibility issues between debug builds of extensions and release builds of Python (the most common scenario, in my experience). > One solution would be to enable Py_TRACE_REFS in release mode. Does > anyone want to add 16 bytes to every PyObject? 
I don't want that :-) Yeah, nobody suggested that anyway :) >> I think it's worthwhile if we can really get to debug and non-debug >> builds being ABI compatible. Getting partway there in this case doesn't >> seem to offer any benefits. > > Disabling Py_TRACE_REFS by default in debug mode reduces the Python > memory footprint. Py_TRACE_REFS costs 2 pointers per PyObject: 16 > bytes on 64-bit platforms. Right, except it's debug mode. > I don't think that I ever used sys.getobjects(), whereas many projects > use gc.get_objects() which is also available in release builds (not > only in debug builds). > > I'm quite sure that almost nobody uses debug builds because the ABI is > incompatible. There were just over 250,000 downloads of the prebuilt debug binaries for Windows (which are optional in the installer and turned off by default) in March. Whether they are being used is another question, but I know for sure at least a few people who use them. When you want to use a debug build of your extension module, using a debug build of CPython is the only way to do it. So unless we can get rid of *all* the ABI incompatibilities, a debug build of CPython is still going to be necessary and disabling/removing reference tracking doesn't provide any benefit. > The main question is if anyone ever used Py_TRACE_REFS? Does someone > use sys.getobjects() or PYTHONDUMPREFS environment variable? > > Using PYTHONDUMPREFS=1 on a debug build (with Py_TRACE_REFS) does > simply crash Python 3.7 at exit. So I don't think that anyone use it > :-) How do we track reference leaks in the buildbots? Can/should we be using this? It doesn't crash on Python 3.8, so I suspect fixing the bug is a better option than using it as an excuse to remove the feature. From a quick test, it seems that a tuple element is being freed but not removed from the tuple, so it's probably a double-decref bug somewhere in 3.7. Cheers, Steve From steve.dower at python.org Wed Apr 10 14:45:51 2019 From: steve.dower at python.org (Steve Dower) Date: Wed, 10 Apr 2019 11:45:51 -0700 Subject: [Python-Dev] No longer enable Py_TRACE_REFS by default in debug build In-Reply-To: References: <0a1871f0-14f8-8591-cf35-21cb3741f354@python.org> Message-ID: <2778f0f5-3a29-54da-8292-dfc23a0c6621@python.org> On 10Apr2019 1109, Steve Dower wrote: > On 10Apr2019 0401, Victor Stinner wrote: >>> I think it's worthwhile if we can really get to debug and non-debug >>> builds being ABI compatible. Getting partway there in this case doesn't >>> seem to offer any benefits. >> >> Disabling Py_TRACE_REFS by default in debug mode reduces the Python >> memory footprint. Py_TRACE_REFS costs 2 pointers per PyObject: 16 >> bytes on 64-bit platforms. > > Right, except it's debug mode. I left this comment unfinished :) It's debug mode, and so you should expect less efficient memory and CPU usage. That's why we have two modes - so that it's easier to debug issues. Now, if debug mode was unusably slow or had way too much overhead, we'd want to fix that. But it isn't unusable, so reducing memory usage at the cost of making debugging harder is not compelling. 
Cheers, Steve From guido at python.org Wed Apr 10 15:07:50 2019 From: guido at python.org (Guido van Rossum) Date: Wed, 10 Apr 2019 12:07:50 -0700 Subject: [Python-Dev] No longer enable Py_TRACE_REFS by default in debug build In-Reply-To: <2778f0f5-3a29-54da-8292-dfc23a0c6621@python.org> References: <0a1871f0-14f8-8591-cf35-21cb3741f354@python.org> <2778f0f5-3a29-54da-8292-dfc23a0c6621@python.org> Message-ID: I recall finding memory leaks using this. (E.g. I remember a leak in Zope due to a cache that was never pruned.) But presumably gc.get_objects() would have been sufficient. (IIRC it didn't exist at the time.) On Wed, Apr 10, 2019 at 11:48 AM Steve Dower wrote: > On 10Apr2019 1109, Steve Dower wrote: > > On 10Apr2019 0401, Victor Stinner wrote: > >>> I think it's worthwhile if we can really get to debug and non-debug > >>> builds being ABI compatible. Getting partway there in this case doesn't > >>> seem to offer any benefits. > >> > >> Disabling Py_TRACE_REFS by default in debug mode reduces the Python > >> memory footprint. Py_TRACE_REFS costs 2 pointers per PyObject: 16 > >> bytes on 64-bit platforms. > > > > Right, except it's debug mode. > > I left this comment unfinished :) > > It's debug mode, and so you should expect less efficient memory and CPU > usage. That's why we have two modes - so that it's easier to debug issues. > > Now, if debug mode was unusably slow or had way too much overhead, we'd > want to fix that. But it isn't unusable, so reducing memory usage at the > cost of making debugging harder is not compelling. > > Cheers, > Steve > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (python.org/~guido) *Pronouns: he/him/his **(why is my pronoun here?)* -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg at krypto.org Wed Apr 10 15:20:25 2019 From: greg at krypto.org (Gregory P. Smith) Date: Wed, 10 Apr 2019 12:20:25 -0700 Subject: [Python-Dev] Need help to fix HTTP Header Injection vulnerability In-Reply-To: <8bd61b63-4fc7-91be-b717-226c205e3623@mail.mipt.ru> References: <8bd61b63-4fc7-91be-b717-226c205e3623@mail.mipt.ru> Message-ID: On Wed, Apr 10, 2019 at 11:00 AM Ivan Pozdeev via Python-Dev < python-dev at python.org> wrote: > > On 10.04.2019 7:30, Karthikeyan wrote: > > Thanks Gregory. I think it's a good tradeoff to ensure this validation > only for URLs of http scheme. > > I also agree handling newline is little problematic over the years and the > discussion over the level at which validation should occur also prolongs > some of the patches. https://bugs.python.org/issue35906 is another > similar case where splitlines is used but it's better to raise an error and > the proposed fix could be used there too. Victor seemed to wrote a similar > PR like linked one for other urllib functions only to fix similar attack in > ftplib to reject newlines that was eventually fixed only in ftplib > > * https://bugs.python.org/issue30713 > * https://bugs.python.org/issue29606 > > Search also brings multiple issues with one duplicate over another that > makes these attacks scattered over the tracker and some edge case missing. 
> Slightly off topic, the last time I reported a cookie related issue where > the policy can be overriden by third party library I was asked to fix it in > stdlib itself since adding fixes to libraries causes maintenance burden to > downstream libraries to keep up upstream. With urllib being a heavily used > module across ecosystem it's good to have a fix landing in stdlib that > secures downstream libraries encouraging users to upgrade Python too. > > Validation should occur whenever user data crosses a trust boundary -- > i.e. when the library starts to assume that an extracted chunk now contains > something valid. > > https://tools.ietf.org/html/rfc3986 defines valid syntax (incl. valid > characters) for every part of a URL -- _of any scheme_ (FYI, \r\n are > invalid everywhere and the test code for `data:' that Karthikeyan > referred to is raw data to compare to rather than a part of a URL). It also > obsoletes all the RFCs that the current code is written against. > > AFAICS, urllib.split* fns (obsoleted as public in 3.8) are used by both > urllib and urllib2 to parse URLs. They can be made to each validate the > chunk that they split off. urlparse can validate the entire URL altogether. > > Also, all modules ought to use the same code (urlparse looks like the best > candidate) to parse URLs -- this will minimize the attack surface. > > I think I can look into this later this week. > My PR as of last night cites that RFC and does validation in http.client while constructing the protocol request payload. Doing it within split functions was an initial hack that looked like it might work but didn't feel right as that isn't what people expect of those functions and that turned out to be the case as I tested things due to our mess of codepaths for opening URLs, but they all end with http.client so yay! I did *not* look at any of the async http client code paths. (legacy asyncore or new asyncio). If there is an issue there, those deserve to have their own bugs filed. As for third party PyPI libraries such as urllib3, they are on their own to fix bugs. If they happen to use a code path that a stdlib fix helps, good for them, but honestly they are much better off making and shipping their own update to avoid the bug. Users can get it much sooner as it's a mere pip install -U away rather than a python runtime upgrade. > Fixing this is going to break code that relies on the current code > accepting invalid URLs. But the docs have never said that e.g. in urlopen, > anything apart from a (valid) URL is accepted (in particular, this implies > that the user is responsible for escaping stuff properly before passing > it). So I would say that we are within our right here and whoever is > relying on those quirks is and has always been on unsupported territory. > yep. even http.client.HTTPConnection.request names the function parameter "url" so anyone embedding whitespace newlines and http protocol strings within that is well outside of supported territory (as one example in our own test_xmlrpc was taking advantage of to test a malformed request). I suggest following up on https://bugs.python.org/issue30458 rather than in this thread. the thread did its job, it directed our eyeballs at the problems. :) -gps > Determining which of those quirks are exploitable and which are not to fix > just the former is an incomparably larger, more error-prone and avoidable > work. If anything, the history of the issue referenced to by previous > posters clearly shows that this is too much to ask from the Python team. 
> > I also see other undocumented behavior like accepting '>' (also > obsoleted as public in 3.8) which I would like to but it's of no harm. > > -- > > Regards, > Ivan > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/greg%40krypto.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Wed Apr 10 15:27:03 2019 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 10 Apr 2019 12:27:03 -0700 Subject: [Python-Dev] No longer enable Py_TRACE_REFS by default in debug build In-Reply-To: References: <0a1871f0-14f8-8591-cf35-21cb3741f354@python.org> Message-ID: On Wed, Apr 10, 2019, 04:04 Victor Stinner wrote: > Le mar. 9 avr. 2019 ? 22:16, Steve Dower a ?crit > : > > What are the other changes that would be required? > > I don't know. > > > And is there another > > way to get the same functionality without ABI modifications? > > Py_TRACE_REFS is a double linked list of *all* Python objects. To get > this functionality, you need to store the list somewhere. I don't know > how to maintain such list outside the PyObject structure. > I assume these pointers get updated from some generic allocation/free code. Could that code instead overallocate by 16 bytes, use the first 16 bytes to hold the pointers, and then return the PyObject* as (actual allocated pointer + 16)? Basically the "container_of" trick. I don't think that I ever used sys.getobjects(), whereas many projects > use gc.get_objects() which is also available in release builds (not > only in debug builds). Can anyone explain what pydebug builds are... for? Confession: I've never used them myself, and don't know why I would want to. (I have to assume that most of Steve's Windows downloads are from folks who thought they were downloading a python debugger.) -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From aranea.network at gmail.com Wed Apr 10 15:39:27 2019 From: aranea.network at gmail.com (Robert Okadar) Date: Wed, 10 Apr 2019 21:39:27 +0200 Subject: [Python-Dev] (no subject) In-Reply-To: <20190410153555.GB3010@ando.pearwood.info> References: <20190410153555.GB3010@ando.pearwood.info> Message-ID: Hi Steven, Thank you for pointing me in the right direction. Will search for help on places you mentioned. Not sure how can we help you with developing the Python interpreter, as I doubt we have any knowledge that this project might use it. When I say 'we', I mean on my colleague and me. All the best, -- Robert Okadar IT Consultant Schedule an *online meeting * with me! Visit *aranea-mreze.hr* or call * +385 91 300 8887* On Wed, 10 Apr 2019 at 17:36, Steven D'Aprano wrote: > Hi Robert, > > This mailing list is for the development of the Python interpreter, not > a general help desk. There are many other forums where you can ask for > help, such as the comp.lang.python newsgroup, Stackoverflow, /r/python > on Reddit, the IRC channel, and more. > > Perhaps you can help us though, I presume you signed up to this mailing > list via the web interface at > > https://mail.python.org/mailman/listinfo/python-dev > > Is there something we could do to make it more clear that this is not > the right place to ask for help? > > > -- > Steven > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From brett at python.org Wed Apr 10 15:44:54 2019 From: brett at python.org (Brett Cannon) Date: Wed, 10 Apr 2019 12:44:54 -0700 Subject: [Python-Dev] No longer enable Py_TRACE_REFS by default in debug build In-Reply-To: References: <0a1871f0-14f8-8591-cf35-21cb3741f354@python.org> Message-ID: On Wed, Apr 10, 2019 at 12:30 PM Nathaniel Smith wrote: > On Wed, Apr 10, 2019, 04:04 Victor Stinner wrote: > >> Le mar. 9 avr. 2019 ? 22:16, Steve Dower a >> ?crit : >> > What are the other changes that would be required? >> >> I don't know. >> >> > And is there another >> > way to get the same functionality without ABI modifications? >> >> Py_TRACE_REFS is a double linked list of *all* Python objects. To get >> this functionality, you need to store the list somewhere. I don't know >> how to maintain such list outside the PyObject structure. >> > > I assume these pointers get updated from some generic allocation/free > code. Could that code instead overallocate by 16 bytes, use the first 16 > bytes to hold the pointers, and then return the PyObject* as (actual > allocated pointer + 16)? Basically the "container_of" trick. > > I don't think that I ever used sys.getobjects(), whereas many projects >> use gc.get_objects() which is also available in release builds (not >> only in debug builds). > > > Can anyone explain what pydebug builds are... for? Confession: I've never > used them myself, and don't know why I would want to. > There is a bunch of extra things done in a debug build, e.g. all freed memory is blanked out with a known pattern so it's easy to tell when you're reading from freed memory (and thus probably messed up your refcounts). And then various extras are tossed on to the sys module to help with things. Basically anything people have found useful and require being compiled in typically get clumped in under the debug build. -Brett > > (I have to assume that most of Steve's Windows downloads are from folks > who thought they were downloading a python debugger.) > > -n > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/brett%40python.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve.dower at python.org Wed Apr 10 16:50:01 2019 From: steve.dower at python.org (Steve Dower) Date: Wed, 10 Apr 2019 13:50:01 -0700 Subject: [Python-Dev] No longer enable Py_TRACE_REFS by default in debug build In-Reply-To: References: <0a1871f0-14f8-8591-cf35-21cb3741f354@python.org> Message-ID: <15ff965b-30a9-dc4b-cc60-940532f349b9@python.org> On 10Apr2019 1227, Nathaniel Smith wrote: > On Wed, Apr 10, 2019, 04:04 Victor Stinner > wrote: > I don't think that I ever used sys.getobjects(), whereas many projects > use gc.get_objects() which is also available in release builds (not > only in debug builds). > > > Can anyone explain what pydebug builds are... for? Confession: I've > never used them myself, and don't know why I would want to. > > (I have to assume that most of Steve's Windows downloads are from folks > who thought they were downloading a python debugger.) They're for debugging :) In general, debug builds are meant for faster inner-loop development. They generally do incremental builds properly and much faster by omitting most optimisations, which also enables source mapping to be more accurate when debugging. 
Assertions are typically enabled so that you are notified when a precondition is first identified rather than when it causes the crash (compiling these out later means you don't pay a runtime cost once you've got the inputs correct - generally these are used for developer-controlled values, rather than user-provided ones). So the idea is that you can quickly edit, build, debug, fix your code in a debug configuration, and then use a release configuration for the actual released build. Full release builds may take 2-3x longer than full debug builds, given the extra effort they make at optimisation, and very often can't do minimal incremental builds at all (so they may be 10-100x slower if you only modified one source file). But because the builds behave functionally equivalently, you can iterate with the faster configuration and get more done. (Disclaimer: I do most of my work on Windows where this has been properly developed. What I hear from non-Windows developers is that other tools can't actually handle this kind of workflow properly. Sorry.) The reason we ship debug Python binaries is because debug builds use a different C Runtime, so if you do a debug build of an extension module you're working on it won't actually work with a non-debug build of CPython. While it's possible that people misread "Download debug binaries" (the text in the installer) and think that it's an actual debugger, I'd suggest that your total lack of context here means you should avoid making assumptions about users you know nothing about. Cheers, Steve From tjreedy at udel.edu Wed Apr 10 17:00:42 2019 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 10 Apr 2019 17:00:42 -0400 Subject: [Python-Dev] (no subject) In-Reply-To: References: Message-ID: On 4/10/2019 7:24 AM, Robert Okadar wrote: > Hi community, > > I have developed a tkinter GUI component, Python v3.7. It runs very well in > Linux but seeing a huge performance impact in Windows 10. While in Linux an > almost real-time performance is achieved, in Windows it is slow to an > unusable level. > > The code is somewhat stripped down from the original, but the performance > difference is the same anyway. The columns can be resized by clicking on > the column border and dragging it. Resizing works only for the top row (but > it resizes the entire column). > In this demo, all bindings are avoided to exclude influence on the > component performance and thus not included. If you resize the window > (i.e., if you maximize it), you must call the function table.fit() from > IDLE shell. > > Does anyone know where is this huge difference in performance coming from? > Can anything be done about it? For reasons explained by Steve, please send this instead to python-list https://mail.python.org/mailman/listinfo/python-list To access python-list as a newsgroup, skip comp.lang.python and use newsgroup gmane.comp.python.general at news.gmane.org. I will respond there after testing/verifying and perhaps searching bugs.python.org for a similar issue. -- Terry Jan Reedy From tjreedy at udel.edu Wed Apr 10 17:06:28 2019 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 10 Apr 2019 17:06:28 -0400 Subject: [Python-Dev] No longer enable Py_TRACE_REFS by default in debug build In-Reply-To: <2778f0f5-3a29-54da-8292-dfc23a0c6621@python.org> References: <0a1871f0-14f8-8591-cf35-21cb3741f354@python.org> <2778f0f5-3a29-54da-8292-dfc23a0c6621@python.org> Message-ID: On 4/10/2019 2:45 PM, Steve Dower wrote: > It's debug mode, and so you should expect less efficient memory and CPU > usage. 
On my Windows machine, 'python -m test -ugui' takes about twice as long. > That's why we have two modes - so that it's easier to debug issues. -- Terry Jan Reedy From python at mrabarnett.plus.com Wed Apr 10 17:15:06 2019 From: python at mrabarnett.plus.com (MRAB) Date: Wed, 10 Apr 2019 22:15:06 +0100 Subject: [Python-Dev] (no subject) In-Reply-To: References: Message-ID: <8eb6376a-5ce4-c0ad-ba35-438e140b304b@mrabarnett.plus.com> On 2019-04-10 22:00, Terry Reedy wrote: > On 4/10/2019 7:24 AM, Robert Okadar wrote: >> Hi community, >> >> I have developed a tkinter GUI component, Python v3.7. It runs very well in >> Linux but seeing a huge performance impact in Windows 10. While in Linux an >> almost real-time performance is achieved, in Windows it is slow to an >> unusable level. >> >> The code is somewhat stripped down from the original, but the performance >> difference is the same anyway. The columns can be resized by clicking on >> the column border and dragging it. Resizing works only for the top row (but >> it resizes the entire column). >> In this demo, all bindings are avoided to exclude influence on the >> component performance and thus not included. If you resize the window >> (i.e., if you maximize it), you must call the function table.fit() from >> IDLE shell. >> >> Does anyone know where is this huge difference in performance coming from? >> Can anything be done about it? > > For reasons explained by Steve, please send this instead to python-list > https://mail.python.org/mailman/listinfo/python-list > To access python-list as a newsgroup, skip comp.lang.python and use > newsgroup gmane.comp.python.general at news.gmane.org. > > I will respond there after testing/verifying and perhaps searching > bugs.python.org for a similar issue. > ttk has Treeview, which can be configured as a table. From vstinner at redhat.com Wed Apr 10 18:29:34 2019 From: vstinner at redhat.com (Victor Stinner) Date: Thu, 11 Apr 2019 00:29:34 +0200 Subject: [Python-Dev] No longer enable Py_TRACE_REFS by default in debug build In-Reply-To: References: <0a1871f0-14f8-8591-cf35-21cb3741f354@python.org> Message-ID: Le mer. 10 avr. 2019 ? 20:09, Steve Dower a ?crit : > > The main question is if anyone ever used Py_TRACE_REFS? Does someone > > use sys.getobjects() or PYTHONDUMPREFS environment variable? > > > > Using PYTHONDUMPREFS=1 on a debug build (with Py_TRACE_REFS) does > > simply crash Python 3.7 at exit. So I don't think that anyone use it > > :-) > > How do we track reference leaks in the buildbots? Can/should we be using > this? Ah, maybe there is a misunderstanding. You don't need Py_TRACE_REFS to track memory leaks: "python3 -m test -R 3:3" works without that. test_regrtest contains an unit test for reference leaks (I know it that I wrote the test :-)), and you can see that the test pass on my PR. I also checked manually by adding a memory leak into a test: it is still detected :-) regrtest uses sys.gettotalrefcount(), sys.getallocatedblocks() and support.fd_count() to track reference, memory and file descriptor leaks. None of these functions are related to Py_TRACE_REFS. Again, the question is who rely on Py_TRACE_REFS. If nobody rely on it, I don't see the point of keeping this expensive feature (at least, not by default). > It doesn't crash on Python 3.8, so I suspect fixing the bug is a better > option than using it as an excuse to remove the feature. It's not what I said. I only said that it seems that nobody uses PYTHONDUMPREFS, since it's broken for a long time. 
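(To illustrate the "-R 3:3" mechanism mentioned above, the leak tracking boils
down to something like the sketch below. This is not the real regrtest code,
the helper name and its parameters are made up here, and sys.gettotalrefcount()
only exists in debug builds where Py_REF_DEBUG is defined:)

import sys

def refleak_delta(func, warmups=3, repeats=3):
    # Warm up first so caches and interned objects don't show up as "leaks".
    for _ in range(warmups):
        func()
    before = sys.gettotalrefcount()   # pydebug builds only (Py_REF_DEBUG)
    for _ in range(repeats):
        func()
    after = sys.gettotalrefcount()
    # A consistently positive delta per repeat hints at a reference leak.
    return (after - before) / repeats

The real regrtest does the same kind of bookkeeping with
sys.getallocatedblocks() and the count of open file descriptors.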
It's just a hint about the usage of Py_TRACE_REFS. I don't propose to remove the feature, but to disable it by default. Victor -- Night gathers, and now my watch begins. It shall not end until my death. From vstinner at redhat.com Wed Apr 10 18:41:29 2019 From: vstinner at redhat.com (Victor Stinner) Date: Thu, 11 Apr 2019 00:41:29 +0200 Subject: [Python-Dev] No longer enable Py_TRACE_REFS by default in debug build In-Reply-To: References: <0a1871f0-14f8-8591-cf35-21cb3741f354@python.org> Message-ID: Le mer. 10 avr. 2019 ? 21:45, Brett Cannon a ?crit : >> Can anyone explain what pydebug builds are... for? Confession: I've never used them myself, and don't know why I would want to. > > There is a bunch of extra things done in a debug build, e.g. all freed memory is blanked out with a known pattern so it's easy to tell when you're reading from freed memory (and thus probably messed up your refcounts). Since the debug build ABI is incompatible, it's not easy to use a debug build. For that reasons, I'm working for a few years to add such debugging features into regular release build. For example, you can now get this debugger on memory allocations using PYTHONMALLOC=debug environment variable since Python 3.6. Since such debug feature is not easy to discover (especially if you don't read closely What's New In Python 3.x), I added a generic "-X dev" command line option to enable a "development mode". It enables various similar features to debug code: https://docs.python.org/dev/using/cmdline.html#id5 Effect of the developer mode: * Add default warning filter, as -W default. * Install debug hooks on memory allocators: see the PyMem_SetupDebugHooks() C function. * Enable the faulthandler module to dump the Python traceback on a crash. * Enable asyncio debug mode. * Set the dev_mode attribute of sys.flags to True See also https://pythondev.readthedocs.io/debug_tools.html where I started to document these debug tools and how to use them. > And then various extras are tossed on to the sys module to help with things. Basically anything people have found useful and require being compiled in typically get clumped in under the debug build. The debug build still contains many features which are useful to debug C extensions. For example, it adds sys.gettotalrefcnt() which is a convenient way to detect reference leaks. This funtion require Py_REF_DEBUG which modifies Py_INCREF() to add "_Py_RefTotal++;". Iit is has an impact on overall Python performance and should not be enabled in release build. Victor -- Night gathers, and now my watch begins. It shall not end until my death. From J.Demeyer at UGent.be Wed Apr 10 19:05:26 2019 From: J.Demeyer at UGent.be (Jeroen Demeyer) Date: Thu, 11 Apr 2019 01:05:26 +0200 Subject: [Python-Dev] PEP 590 discussion In-Reply-To: <50d675b4-839c-6502-ad1a-a33ea9330000@redhat.com> References: <6c8356e3-f9b2-c39e-63c4-17f146d326b7@hotpy.org> <15b8a3d7-00ed-a5eb-475c-a3adee671b5f@hotpy.org> <5C9FEF82.50207@UGent.be> <421f8182-4bc8-b8cf-82d6-ca4a4fbd2013@hotpy.org> <50d675b4-839c-6502-ad1a-a33ea9330000@redhat.com> Message-ID: <5CAE76B6.6090501@UGent.be> On 2019-04-10 18:25, Petr Viktorin wrote: > Hello! > I've had time for a more thorough reading of PEP 590 and the reference > implementation. Thank you for the work! And thank you for the review! > I'd now describe the fundamental > difference between PEP 580 and PEP 590 as: > - PEP 580 tries to optimize all existing calling conventions > - PEP 590 tries to optimize (and expose) the most general calling > convention (i.e. 
fastcall) And PEP 580 has better performance overall, even for METH_FASTCALL. See this thread: https://mail.python.org/pipermail/python-dev/2019-April/156954.html Since these PEPs are all about performance, I consider this a very relevant argument in favor of PEP 580. > PEP 580 also does a number of other things, as listed in PEP 579. But I > think PEP 590 does not block future PEPs for the other items. > On the other hand, PEP 580 has a much more mature implementation -- and > that's where it picked up real-world complexity. About complexity, please read what I wrote in https://mail.python.org/pipermail/python-dev/2019-March/156853.html I claim that the complexity in the protocol of PEP 580 is a good thing, as it removes complexity from other places, in particular from the users of the protocol (better have a complex protocol that's simple to use, rather than a simple protocol that's complex to use). As a more concrete example of the simplicity that PEP 580 could bring, CPython currently has 2 classes for bound methods implemented in C: - "builtin_function_or_method" for normal C methods - "method-descriptor" for slot wrappers like __eq__ or __add__ With PEP 590, these classes would need to stay separate to get maximal performance. With PEP 580, just one class for bound methods would be sufficient and there wouldn't be any performance loss. And this extends to custom third-party function/method classes, for example as implemented by Cython. > PEP 590's METH_VECTORCALL is designed to handle all existing use cases, > rather than mirroring the existing METH_* varieties. > But both PEPs require the callable's code to be modified, so requiring > it to switch calling conventions shouldn't be a problem. Agreed. > Jeroen's analysis from > https://mail.python.org/pipermail/python-dev/2018-July/154238.html seems > to miss a step at the top: > > a. CALL_FUNCTION* / CALL_METHOD opcode > calls > b. _PyObject_FastCallKeywords() > which calls > c. _PyCFunction_FastCallKeywords() > which calls > d. _PyMethodDef_RawFastCallKeywords() > which calls > e. the actual C function (*ml_meth)() > > I think it's more useful to say that both PEPs bridge a->e (via > _Py_VectorCall or PyCCall_Call). Not quite. For a builtin_function_or_method, we have with PEP 580: a. call_function() calls d. PyCCall_FastCall which calls e. the actual C function and with PEP 590 it's more like: a. call_function() calls c. _PyCFunction_FastCallKeywords which calls d. _PyMethodDef_RawFastCallKeywords which calls e. the actual C function Level c. above is the vectorcall wrapper, which is a level that PEP 580 doesn't have. > The way `const` is handled in the function signatures strikes me as too > fragile for public API. That's a detail which shouldn't influence the acceptance of either PEP. > Why not have a per-type pointer, and for types that need it (like > PyTypeObject), make it dispatch to an instance-specific function? That would be exactly https://bugs.python.org/issue29259 I'll let Mark comment on this. > Minor things: > - "Continued prohibition of callable classes as base classes" -- this > section reads as a final. Would you be OK wording this as something > other PEPs can tackle? > - "PyObject_VectorCall" -- this looks extraneous, and the reference > imlementation doesn't need it so far. Can it be removed, or justified? > - METH_VECTORCALL is *not* strictly "equivalent to the currently > undocumented METH_FASTCALL | METH_KEYWORD flags" (it has the > ARGUMENTS_OFFSET complication). 
> - I'd like to officially call this PEP "Vectorcall", see > https://github.com/python/peps/pull/984 Those are indeed details which shouldn't influence the acceptance of either PEP. If you go with PEP 590, then we should discuss this further. > Mark, what are your plans for next steps with PEP 590? If a volunteer > wanted to help you push this forward, what would be the best thing to > work on? Personally, I think what we need now is a decision between PEP 580 and PEP 590 (there is still the possibility of rejecting both but I really hope that this won't happen). There is a lot of work that still needs to be done after either PEP is accepted, such as: - finish and merge the reference implementation - document everything - use the protocol in more classes where it makes sense (for example, staticmethod, wrapper_descriptor) - use this in Cython - handle more issues from PEP 579 I volunteer to put my time into this, regardless of which PEP is accepted. Of course, I still think that PEP 580 is better, but I also want this functionality even if PEP 590 is accepted. > Jeroen, is there something in PEPs 579/580 that PEP 590 blocks, or > should address? Well, PEP 580 is an extensible protocol while PEP 590 is not. But, PyTypeObject is extensible, so even with PEP 590 one can always extend that (for example, PEP 590 uses a type flag Py_TPFLAGS_METHOD_DESCRIPTOR where PEP 580 instead uses the structs for the C call protocol). But I guess that extending PyTypeObject will be harder to justify (say, in a future PEP) than extending the C call protocol. Also, it's explicitly allowed for users of the PEP 580 protocol to extend the PyCCallDef structure with custom fields. But I don't have a concrete idea of whether that will be useful. Kind regards, Jeroen. From njs at pobox.com Wed Apr 10 22:17:28 2019 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 10 Apr 2019 19:17:28 -0700 Subject: [Python-Dev] No longer enable Py_TRACE_REFS by default in debug build In-Reply-To: <15ff965b-30a9-dc4b-cc60-940532f349b9@python.org> References: <0a1871f0-14f8-8591-cf35-21cb3741f354@python.org> <15ff965b-30a9-dc4b-cc60-940532f349b9@python.org> Message-ID: On Wed, Apr 10, 2019 at 1:50 PM Steve Dower wrote: > > On 10Apr2019 1227, Nathaniel Smith wrote: > > On Wed, Apr 10, 2019, 04:04 Victor Stinner > > wrote: > > I don't think that I ever used sys.getobjects(), whereas many projects > > use gc.get_objects() which is also available in release builds (not > > only in debug builds). > > > > > > Can anyone explain what pydebug builds are... for? Confession: I've > > never used them myself, and don't know why I would want to. > > > > (I have to assume that most of Steve's Windows downloads are from folks > > who thought they were downloading a python debugger.) > > They're for debugging :) > > In general, debug builds are meant for faster inner-loop development. > They generally do incremental builds properly and much faster by > omitting most optimisations, which also enables source mapping to be > more accurate when debugging. Assertions are typically enabled so that > you are notified when a precondition is first identified rather than > when it causes the crash (compiling these out later means you don't pay > a runtime cost once you've got the inputs correct - generally these are > used for developer-controlled values, rather than user-provided ones). > > So the idea is that you can quickly edit, build, debug, fix your code in > a debug configuration, and then use a release configuration for the > actual released build. 
Full release builds may take 2-3x longer than > full debug builds, given the extra effort they make at optimisation, and > very often can't do minimal incremental builds at all (so they may be > 10-100x slower if you only modified one source file). But because the > builds behave functionally equivalently, you can iterate with the faster > configuration and get more done. Sure, I'm familiar with the idea of debug and optimization settings in compilers. I build python with custom -g and -O flags all the time. (I do it by setting OPT when running configure.) It's also handy that many Linux distros these days let you install debug metadata for all the binaries they ship ? I've used that when debugging third-party extension modules, to get a better idea of what was happening when a backtrace passes through libpython. But --with-pydebug is a whole other thing beyond that, that changes the ABI, has its own wheel tags, requires special cases in packages that use ctypes to access PyObject* internals, and appears to be almost entirely undocumented. It sounds like --with-pydebug has accumulated a big grab bag of unrelated features, mostly stuff that was useful at some point for some CPython dev trying to debug CPython itself? It's clearly not designed with end users as the primary audience, given that no-one knows what it actually does and that it makes third-party extensions really awkward to run. If that's right then I think Victor's plan of to sort through what it's actually doing makes a lot of sense, especially if we can remove the ABI breaking stuff, since that causes a disproportionate amount of trouble. > The reason we ship debug Python binaries is because debug builds use a > different C Runtime, so if you do a debug build of an extension module > you're working on it won't actually work with a non-debug build of CPython. ...But this is an important point. I'd forgotten that MSVC has a habit of changing the entire C runtime when you turn on the compiler's debugging mode. (On Linux, we generally don't bother rebuilding the C runtime unless you're actually debugging the C runtime, and anyway if you do want to switch to a debug version of the C runtime, it's ABI compatible so your program binaries don't have to be rebuilt.) Is it true that if the interpreter is built against ucrtd.lib, and an extension module is built against ucrt.lib, then they'll have incompatible ABIs and not work together? And that this detail is part of what's been glommed together into the "d" flag in the soabi tag on Windows? Is it possible for the Windows installer to include PDB files (/Zi /DEBUG) to allow debuggers to understand the regular release executable? (That's what I would have expected to get if I checked a box labeled "Download debug binaries".) -n -- Nathaniel J. Smith -- https://vorpus.org From storchaka at gmail.com Thu Apr 11 01:45:28 2019 From: storchaka at gmail.com (Serhiy Storchaka) Date: Thu, 11 Apr 2019 08:45:28 +0300 Subject: [Python-Dev] No longer enable Py_TRACE_REFS by default in debug build In-Reply-To: References: <0a1871f0-14f8-8591-cf35-21cb3741f354@python.org> Message-ID: 10.04.19 14:01, Victor Stinner ????: > Disabling Py_TRACE_REFS by default in debug mode reduces the Python > memory footprint. Py_TRACE_REFS costs 2 pointers per PyObject: 16 > bytes on 64-bit platforms. Does not the memory allocator in debug mode have even larger cost per allocated block? 
From vstinner at redhat.com Thu Apr 11 05:28:51 2019 From: vstinner at redhat.com (Victor Stinner) Date: Thu, 11 Apr 2019 11:28:51 +0200 Subject: [Python-Dev] No longer enable Py_TRACE_REFS by default in debug build In-Reply-To: References: <0a1871f0-14f8-8591-cf35-21cb3741f354@python.org> Message-ID: Le jeu. 11 avr. 2019 ? 07:49, Serhiy Storchaka a ?crit : > 10.04.19 14:01, Victor Stinner ????: > > Disabling Py_TRACE_REFS by default in debug mode reduces the Python > > memory footprint. Py_TRACE_REFS costs 2 pointers per PyObject: 16 > > bytes on 64-bit platforms. > > Does not the memory allocator in debug mode have even larger cost per > allocated block? What do you mean? That a debug build already waste too much memory and so doesn't deserve to have a smaller memory footprint? I'm not sure that I understand your point. A smaller footprint can mean that more people may be able to use debug build. Disabling Py_TRACE_REFS should make Python a little bit faster. My question stands: is it worth to keep a feature which "waste" resources (memory footprint and CPU) and nobody uses it? Debug hooks add 4 x sizeof(size_t) bytes to every memory allocation to detect buffer underflow and buffer overflow. That's 32 bytes per memory allocation. By the way, IMHO the "serial number" is not really useful and could be removed to only add 3 x sizeof(size_t) (24 bytes). But the debug hook is very useful, it's common that it helps me to find real bugs in the code. Whereas I don't recall that Py_TRACE_REFS helped me even once. Victor -- Night gathers, and now my watch begins. It shall not end until my death. From Peixing.Xin at windriver.com Thu Apr 11 05:45:00 2019 From: Peixing.Xin at windriver.com (Xin, Peixing) Date: Thu, 11 Apr 2019 09:45:00 +0000 Subject: [Python-Dev] checking "errno" for math operaton is safe to determine the error status? Message-ID: <8488FBC4EAAC5941BA4B85DD1ECCF1870133BB0437@ALA-MBD.corp.ad.wrs.com> Hi, Math experts: Looking at the codes below, for many math operations, CPython is checking errno to determine the error status even though the math function returns normal value back. Is it a safe solution? From the description here http://man7.org/linux/man-pages/man3/errno.3.html and https://wiki.sei.cmu.edu/confluence/pages/viewpage.action?pageId=87152351, it looks apis probably set the errno when normal result is returned. Or being a side effect by calling other APIs in the implementation. In this situation, CPython's math operation might raise exceptions however in fact the result is correct. https://github.com/python/cpython/blob/master/Modules/mathmodule.c#L956 https://github.com/python/cpython/blob/master/Modules/mathmodule.c#L864 Thanks, Peixing From encukou at gmail.com Thu Apr 11 07:21:49 2019 From: encukou at gmail.com (Petr Viktorin) Date: Thu, 11 Apr 2019 13:21:49 +0200 Subject: [Python-Dev] PEP 590 discussion In-Reply-To: <5CAE76B6.6090501@UGent.be> References: <6c8356e3-f9b2-c39e-63c4-17f146d326b7@hotpy.org> <15b8a3d7-00ed-a5eb-475c-a3adee671b5f@hotpy.org> <5C9FEF82.50207@UGent.be> <421f8182-4bc8-b8cf-82d6-ca4a4fbd2013@hotpy.org> <50d675b4-839c-6502-ad1a-a33ea9330000@redhat.com> <5CAE76B6.6090501@UGent.be> Message-ID: <308f7b69-01d4-55c6-0bd6-d767ecc3fd99@gmail.com> On 4/11/19 1:05 AM, Jeroen Demeyer wrote: > On 2019-04-10 18:25, Petr Viktorin wrote: >> Hello! >> I've had time for a more thorough reading of PEP 590 and the reference >> implementation. Thank you for the work! > > And thank you for the review! 
One general note: I am not (yet) choosing between PEP 580 and PEP 590. I am not looking for arguments for/against whole PEPs, but individual ideas which, I believe, can still be mixed & matched. I see the situation this way: - I get about one day per week when I can properly concentrate on CPython. It's frustrating to be the bottleneck. - Jeroen has time, but it would frustrating to work on something that will later be discarded, and it's frustrating to not be able to move the project forward. - Mark has good ideas, but seems to lack the time to polish them, or even test out if they are good. It is probably frustrating to see unpolished ideas rejected. I'm looking for ways to reduce the frustration, given where we are. Jeroen, thank you for the comments. Apologies for not having the time to reply to all of them properly right now. Mark, if you could find the time to answer (even just a few of the points), it would be great. I ask you to share/clarify your thoughts, not defend your PEP. >> I'd now describe the fundamental >> difference between PEP 580 and PEP 590 as: >> - PEP 580 tries to optimize all existing calling conventions >> - PEP 590 tries to optimize (and expose) the most general calling >> convention (i.e. fastcall) > > And PEP 580 has better performance overall, even for METH_FASTCALL. See > this thread: > https://mail.python.org/pipermail/python-dev/2019-April/156954.html > > Since these PEPs are all about performance, I consider this a very > relevant argument in favor of PEP 580. > >> PEP 580 also does a number of other things, as listed in PEP 579. But I >> think PEP 590 does not block future PEPs for the other items. >> On the other hand, PEP 580 has a much more mature implementation -- and >> that's where it picked up real-world complexity. > About complexity, please read what I wrote in > https://mail.python.org/pipermail/python-dev/2019-March/156853.html > > I claim that the complexity in the protocol of PEP 580 is a good thing, > as it removes complexity from other places, in particular from the users > of the protocol (better have a complex protocol that's simple to use, > rather than a simple protocol that's complex to use). Sadly, I need more time on this than I have today; I'll get back to it next week. > As a more concrete example of the simplicity that PEP 580 could bring, > CPython currently has 2 classes for bound methods implemented in C: > - "builtin_function_or_method" for normal C methods > - "method-descriptor" for slot wrappers like __eq__ or __add__ > > With PEP 590, these classes would need to stay separate to get maximal > performance. With PEP 580, just one class for bound methods would be > sufficient and there wouldn't be any performance loss. And this extends > to custom third-party function/method classes, for example as > implemented by Cython. > >> PEP 590's METH_VECTORCALL is designed to handle all existing use cases, >> rather than mirroring the existing METH_* varieties. >> But both PEPs require the callable's code to be modified, so requiring >> it to switch calling conventions shouldn't be a problem. > > Agreed. > >> Jeroen's analysis from >> https://mail.python.org/pipermail/python-dev/2018-July/154238.html seems >> to miss a step at the top: >> >> a. CALL_FUNCTION* / CALL_METHOD opcode >> ?????? calls >> b. _PyObject_FastCallKeywords() >> ?????? which calls >> c. _PyCFunction_FastCallKeywords() >> ?????? which calls >> d. _PyMethodDef_RawFastCallKeywords() >> ?????? which calls >> e. 
the actual C function (*ml_meth)() >> >> I think it's more useful to say that both PEPs bridge a->e (via >> _Py_VectorCall or PyCCall_Call). > > Not quite. For a builtin_function_or_method, we have with PEP 580: > > a. call_function() > ??? calls > d. PyCCall_FastCall > ??? which calls > e. the actual C function > > and with PEP 590 it's more like: > > a. call_function() > ??? calls > c. _PyCFunction_FastCallKeywords > ??? which calls > d. _PyMethodDef_RawFastCallKeywords > ??? which calls > e. the actual C function > > Level c. above is the vectorcall wrapper, which is a level that PEP 580 > doesn't have. Again, I'll get back to this next week. >> The way `const` is handled in the function signatures strikes me as too >> fragile for public API. > > That's a detail which shouldn't influence the acceptance of either PEP. True. I guess what I want from the answer is to know how much thought went into const handling: is what's in the PEP an initial draft, or does it solve some hidden issue? >> Why not have a per-type pointer, and for types that need it (like >> PyTypeObject), make it dispatch to an instance-specific function? > > That would be exactly https://bugs.python.org/issue29259 > > I'll let Mark comment on this. > >> Minor things: >> - "Continued prohibition of callable classes as base classes" -- this >> section reads as a final. Would you be OK wording this as something >> other PEPs can tackle? >> - "PyObject_VectorCall" -- this looks extraneous, and the reference >> imlementation doesn't need it so far. Can it be removed, or justified? >> - METH_VECTORCALL is *not* strictly "equivalent to the currently >> undocumented METH_FASTCALL | METH_KEYWORD flags" (it has the >> ARGUMENTS_OFFSET complication). >> - I'd like to officially call this PEP "Vectorcall", see >> https://github.com/python/peps/pull/984 > > Those are indeed details which shouldn't influence the acceptance of > either PEP. If you go with PEP 590, then we should discuss this further. Here again, I mostly want to know if the details are there for deeper reasons, or just points to polish. >> Mark, what are your plans for next steps with PEP 590? If a volunteer >> wanted to help you push this forward, what would be the best thing to >> work on? > > Personally, I think what we need now is a decision between PEP 580 and > PEP 590 (there is still the possibility of rejecting both but I really > hope that this won't happen). There is a lot of work that still needs to > be done after either PEP is accepted, such as: > - finish and merge the reference implementation > - document everything > - use the protocol in more classes where it makes sense (for example, > staticmethod, wrapper_descriptor) > - use this in Cython > - handle more issues from PEP 579 > > I volunteer to put my time into this, regardless of which PEP is > accepted. Of course, I still think that PEP 580 is better, but I also > want this functionality even if PEP 590 is accepted. Thank you. Sorry for the way this is dragged out. Would it help to set some timeline/deadlines here? >> Jeroen, is there something in PEPs 579/580 that PEP 590 blocks, or >> should address? > > Well, PEP 580 is an extensible protocol while PEP 590 is not. But, > PyTypeObject is extensible, so even with PEP 590 one can always extend > that (for example, PEP 590 uses a type flag Py_TPFLAGS_METHOD_DESCRIPTOR > where PEP 580 instead uses the structs for the C call protocol). 
But I > guess that extending PyTypeObject will be harder to justify (say, in a > future PEP) than extending the C call protocol. Thanks. I also like PEP 580's extensibility. > Also, it's explicitly allowed for users of the PEP 580 protocol to > extend the PyCCallDef structure with custom fields. But I don't have a > concrete idea of whether that will be useful. I don't have good general experience with premature extensibility, so I'd not count this as a plus. From J.Demeyer at UGent.be Thu Apr 11 08:04:27 2019 From: J.Demeyer at UGent.be (Jeroen Demeyer) Date: Thu, 11 Apr 2019 14:04:27 +0200 Subject: [Python-Dev] PEP 590 discussion In-Reply-To: <308f7b69-01d4-55c6-0bd6-d767ecc3fd99@gmail.com> References: <6c8356e3-f9b2-c39e-63c4-17f146d326b7@hotpy.org> <15b8a3d7-00ed-a5eb-475c-a3adee671b5f@hotpy.org> <5C9FEF82.50207@UGent.be> <421f8182-4bc8-b8cf-82d6-ca4a4fbd2013@hotpy.org> <50d675b4-839c-6502-ad1a-a33ea9330000@redhat.com> <5CAE76B6.6090501@UGent.be> <308f7b69-01d4-55c6-0bd6-d767ecc3fd99@gmail.com> Message-ID: <5CAF2D4B.8070205@UGent.be> Petr, I realize that you are in a difficult position. You'll end up disappointing either me or Mark... I don't know if the steering council or somebody else has a good idea to deal with this situation. > Jeroen has time Speaking of time, maybe I should clarify that I have time until the end of August: I am working for the OpenDreamKit grant, which allows me to work basically full-time on open source software development but that ends at the end of August. > Here again, I mostly want to know if the details are there for deeper > reasons, or just points to polish. I would say: mostly shallow details. The subclassing thing would be good to resolve, but I don't see any difference between PEP 580 and PEP 590 there. In PEP 580, I wrote a strategy for dealing with subclassing. I believe that it works and that exactly the same idea would work for PEP 590 too. Of course, I may be overlooking something... > I don't have good general experience with premature extensibility, so > I'd not count this as a plus. Fair enough. I also see it more as a "nice to have", not as a big plus. From christian at python.org Thu Apr 11 08:23:48 2019 From: christian at python.org (Christian Heimes) Date: Thu, 11 Apr 2019 14:23:48 +0200 Subject: [Python-Dev] checking "errno" for math operaton is safe to determine the error status? In-Reply-To: <8488FBC4EAAC5941BA4B85DD1ECCF1870133BB0437@ALA-MBD.corp.ad.wrs.com> References: <8488FBC4EAAC5941BA4B85DD1ECCF1870133BB0437@ALA-MBD.corp.ad.wrs.com> Message-ID: <5fb2462a-ee2c-00e3-1928-d56510e29570@python.org> On 11/04/2019 11.45, Xin, Peixing wrote: > Hi, Math experts: > > Looking at the codes below, for many math operations, CPython is checking errno to determine the error status even though the math function returns normal value back. Is it a safe solution? From the description here http://man7.org/linux/man-pages/man3/errno.3.html and https://wiki.sei.cmu.edu/confluence/pages/viewpage.action?pageId=87152351, it looks apis probably set the errno when normal result is returned. Or being a side effect by calling other APIs in the implementation. In this situation, CPython's math operation might raise exceptions however in fact the result is correct. > > https://github.com/python/cpython/blob/master/Modules/mathmodule.c#L956 > https://github.com/python/cpython/blob/master/Modules/mathmodule.c#L864 This is safe because all places first set errno to 0. 
Errno is a thread local variable, so other threads cannot influence the variable during the calls. This is one of the many quirks that Mark has implemented for platforms bugs in various libm. Christian From steve.dower at python.org Thu Apr 11 11:19:51 2019 From: steve.dower at python.org (Steve Dower) Date: Thu, 11 Apr 2019 08:19:51 -0700 Subject: [Python-Dev] No longer enable Py_TRACE_REFS by default in debug build In-Reply-To: References: <0a1871f0-14f8-8591-cf35-21cb3741f354@python.org> Message-ID: <3de8eb9f-e2ad-f5b2-5de6-ba02fd4e3895@python.org> On 11Apr2019 0228, Victor Stinner wrote: > Le jeu. 11 avr. 2019 ? 07:49, Serhiy Storchaka a ?crit : >> 10.04.19 14:01, Victor Stinner ????: >>> Disabling Py_TRACE_REFS by default in debug mode reduces the Python >>> memory footprint. Py_TRACE_REFS costs 2 pointers per PyObject: 16 >>> bytes on 64-bit platforms. >> >> Does not the memory allocator in debug mode have even larger cost per >> allocated block? > > What do you mean? That a debug build already waste too much memory and > so doesn't deserve to have a smaller memory footprint? I'm not sure > that I understand your point. He means you're micro-optimising something that doesn't matter. If you really wanted to reduce memory usage in debug builds, you'd go after one of the bigger "problems". > A smaller footprint can mean that more people may be able to use debug > build. Disabling Py_TRACE_REFS should make Python a little bit faster. This isn't one of the goals of a debug build though, and you haven't pointed at any examples of people not being able to use the debug build because of memory pressure. (Which is because most people who are not working on CPython itself should not be using the debug build.) > My question stands: is it worth to keep a feature which "waste" > resources (memory footprint and CPU) and nobody uses it? You haven't even tried to show that nobody uses it, other than pointing out that it exposes a crash due to a refcounting bug (which is kind of the point ;) ). Cheers, Steve From steve.dower at python.org Thu Apr 11 11:26:47 2019 From: steve.dower at python.org (Steve Dower) Date: Thu, 11 Apr 2019 08:26:47 -0700 Subject: [Python-Dev] No longer enable Py_TRACE_REFS by default in debug build In-Reply-To: References: <0a1871f0-14f8-8591-cf35-21cb3741f354@python.org> <15ff965b-30a9-dc4b-cc60-940532f349b9@python.org> Message-ID: On 10Apr2019 1917, Nathaniel Smith wrote: > It sounds like --with-pydebug has accumulated a big grab bag of > unrelated features, mostly stuff that was useful at some point for > some CPython dev trying to debug CPython itself? It's clearly not > designed with end users as the primary audience, given that no-one > knows what it actually does and that it makes third-party extensions > really awkward to run. If that's right then I think Victor's plan of > to sort through what it's actually doing makes a lot of sense, > especially if we can remove the ABI breaking stuff, since that causes > a disproportionate amount of trouble. Does it really cause a "disproportionate" amount of trouble? It's definitely not meant for anyone who isn't working on C code, whether in CPython, an extension or a host application. If you want to use third-party extensions and are not able to rebuild them, that's a very good sign that you probably shouldn't be on the debug build at all. Perhaps the "--with-pydebug" option is too attractive? (Is it the default?) That's easily fixed. 
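(As an aside, for anyone unsure which build they are actually running, a rough
check from Python itself looks like the sketch below: sys.gettotalrefcount()
is only compiled into pydebug builds, the Py_DEBUG config variable reflects
--with-pydebug on POSIX builds, and sys.abiflags contains 'd' there; Windows
may not expose the last two.)

import sys
import sysconfig

print("pydebug build:", hasattr(sys, "gettotalrefcount"))
print("Py_DEBUG:", sysconfig.get_config_var("Py_DEBUG"))   # 1 on --with-pydebug, may be None elsewhere
print("abiflags:", getattr(sys, "abiflags", ""))           # contains 'd' for a pydebug build on POSIX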
>> The reason we ship debug Python binaries is because debug builds use a >> different C Runtime, so if you do a debug build of an extension module >> you're working on it won't actually work with a non-debug build of CPython. > > ...But this is an important point. I'd forgotten that MSVC has a habit > of changing the entire C runtime when you turn on the compiler's > debugging mode. Technically they are separate options, but most project files are configured such that *their* Debug/Release switch affects both the compiler options (optimization) and the linker options (C runtime linkage). > Is it true that if the interpreter is built against ucrtd.lib, and an > extension module is built against ucrt.lib, then they'll have > incompatible ABIs and not work together? And that this detail is part > of what's been glommed together into the "d" flag in the soabi tag on > Windows? Yep, except it's not actually in the soabi tag, but it's the "_d" suffix on module/executable names. > Is it possible for the Windows installer to include PDB files (/Zi > /DEBUG) to allow debuggers to understand the regular release > executable? (That's what I would have expected to get if I checked a > box labeled "Download debug binaries".) That box is immediately below one labelled "Download debug symbols", so hopefully seeing it in context would have set the right expectation. (And since I have them, there were 1.3 million downloads of the symbol packages via this option in March, but we also enable it by default via Visual Studio and that's responsible for about 1 million of those.) Cheers, Steve From storchaka at gmail.com Thu Apr 11 11:30:18 2019 From: storchaka at gmail.com (Serhiy Storchaka) Date: Thu, 11 Apr 2019 18:30:18 +0300 Subject: [Python-Dev] No longer enable Py_TRACE_REFS by default in debug build In-Reply-To: References: <0a1871f0-14f8-8591-cf35-21cb3741f354@python.org> Message-ID: 11.04.19 12:28, Victor Stinner ????: > Le jeu. 11 avr. 2019 ? 07:49, Serhiy Storchaka a ?crit : >> 10.04.19 14:01, Victor Stinner ????: >>> Disabling Py_TRACE_REFS by default in debug mode reduces the Python >>> memory footprint. Py_TRACE_REFS costs 2 pointers per PyObject: 16 >>> bytes on 64-bit platforms. >> >> Does not the memory allocator in debug mode have even larger cost per >> allocated block? > > What do you mean? That a debug build already waste too much memory and > so doesn't deserve to have a smaller memory footprint? I'm not sure > that I understand your point. If reducing the Python memory footprint is an argument for disabling Py_TRACE_REFS, it is a weak argument because there is larger overhead in the debug build. On other hand, since using the debug allocator doesn't cause problems with compatibility, it may be possible to use similar technique for the objects double list. Although this is not easy because of objects placed at static memory. From brett at python.org Thu Apr 11 16:32:15 2019 From: brett at python.org (Brett Cannon) Date: Thu, 11 Apr 2019 13:32:15 -0700 Subject: [Python-Dev] PEP 590 discussion In-Reply-To: <5CAF2D4B.8070205@UGent.be> References: <6c8356e3-f9b2-c39e-63c4-17f146d326b7@hotpy.org> <15b8a3d7-00ed-a5eb-475c-a3adee671b5f@hotpy.org> <5C9FEF82.50207@UGent.be> <421f8182-4bc8-b8cf-82d6-ca4a4fbd2013@hotpy.org> <50d675b4-839c-6502-ad1a-a33ea9330000@redhat.com> <5CAE76B6.6090501@UGent.be> <308f7b69-01d4-55c6-0bd6-d767ecc3fd99@gmail.com> <5CAF2D4B.8070205@UGent.be> Message-ID: On Thu, Apr 11, 2019 at 5:06 AM Jeroen Demeyer wrote: > Petr, > > I realize that you are in a difficult position. 
You'll end up > disappointing either me or Mark... > > I don't know if the steering council or somebody else has a good idea to > deal with this situation. > Our answer was "ask Petr to be BDFL Delegate". ;) In all seriousness, none of us on the council or as well equipped as Petr to handle this tough decision, else it would take even longer for us to learn enough to make an informed decision and we would be even worse off. -Brett > > > Jeroen has time > > Speaking of time, maybe I should clarify that I have time until the end > of August: I am working for the OpenDreamKit grant, which allows me to > work basically full-time on open source software development but that > ends at the end of August. > > > Here again, I mostly want to know if the details are there for deeper > > reasons, or just points to polish. > > I would say: mostly shallow details. > > The subclassing thing would be good to resolve, but I don't see any > difference between PEP 580 and PEP 590 there. In PEP 580, I wrote a > strategy for dealing with subclassing. I believe that it works and that > exactly the same idea would work for PEP 590 too. Of course, I may be > overlooking something... > > > I don't have good general experience with premature extensibility, so > > I'd not count this as a plus. > > Fair enough. I also see it more as a "nice to have", not as a big plus. > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/brett%40python.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Thu Apr 11 21:06:14 2019 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 11 Apr 2019 18:06:14 -0700 Subject: [Python-Dev] No longer enable Py_TRACE_REFS by default in debug build In-Reply-To: References: <0a1871f0-14f8-8591-cf35-21cb3741f354@python.org> Message-ID: On Thu, Apr 11, 2019 at 8:32 AM Serhiy Storchaka wrote: > On other hand, since using the debug allocator doesn't cause problems > with compatibility, it may be possible to use similar technique for the > objects double list. Although this is not easy because of objects placed > at static memory. I guess one could track static objects separately, e.g. keep a simple global PyList containing all statically allocated objects. (This is easy since we know they're all immortal.) And then sys.getobjects() could walk the heap objects and statically allocated objects separately. -n -- Nathaniel J. Smith -- https://vorpus.org From Peixing.Xin at windriver.com Thu Apr 11 22:19:32 2019 From: Peixing.Xin at windriver.com (Xin, Peixing) Date: Fri, 12 Apr 2019 02:19:32 +0000 Subject: [Python-Dev] checking "errno" for math operaton is safe to determine the error status? In-Reply-To: <5fb2462a-ee2c-00e3-1928-d56510e29570@python.org> References: <8488FBC4EAAC5941BA4B85DD1ECCF1870133BB0437@ALA-MBD.corp.ad.wrs.com> <5fb2462a-ee2c-00e3-1928-d56510e29570@python.org> Message-ID: <8488FBC4EAAC5941BA4B85DD1ECCF1870133BB0B64@ALA-MBD.corp.ad.wrs.com> Thanks for your explanation, Christian. Actually my question is not about thread safe or the original value 0 on errno. Probably I didn't express the point clearly. To be more clear, let me take expm1 as an example below. On certain platform, expm1() is implemented as exp() minus 1. To calculate expm1(-1420.0), that will call exp(-1420.0) then substract 1. 
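For reference, the expected results are just the ordinary stdlib calls below;
on a platform with a well-behaved libm both succeed:

import math

print(math.exp(-1420.0))    # 0.0  -- underflows to zero
print(math.expm1(-1420.0))  # -1.0 -- exact, no OverflowError expected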
You know, exp(-1420.0) will underflow to zero and errno is set to ERANGE. As a consequence the errno keeps set there when expm1() returns the correct result -1. So for this situation, CPthon's api is_error() will raise overflow unexpectedly. Whose bug should it be scoped to? A bug of the platform? Isn't errno allowed to be set when calculation gets normal result? Thanks, Peixing -----Original Message----- From: Christian Heimes [mailto:christian at python.org] Sent: Thursday, April 11, 2019 8:24 PM To: Xin, Peixing; python-dev at python.org; Mark Dickinson Subject: Re: checking "errno" for math operaton is safe to determine the error status? On 11/04/2019 11.45, Xin, Peixing wrote: > Hi, Math experts: > > Looking at the codes below, for many math operations, CPython is checking errno to determine the error status even though the math function returns normal value back. Is it a safe solution? From the description here http://man7.org/linux/man-pages/man3/errno.3.html and https://wiki.sei.cmu.edu/confluence/pages/viewpage.action?pageId=87152351, it looks apis probably set the errno when normal result is returned. Or being a side effect by calling other APIs in the implementation. In this situation, CPython's math operation might raise exceptions however in fact the result is correct. > > https://github.com/python/cpython/blob/master/Modules/mathmodule.c#L956 > https://github.com/python/cpython/blob/master/Modules/mathmodule.c#L864 This is safe because all places first set errno to 0. Errno is a thread local variable, so other threads cannot influence the variable during the calls. This is one of the many quirks that Mark has implemented for platforms bugs in various libm. Christian From greg.ewing at canterbury.ac.nz Fri Apr 12 01:44:33 2019 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 12 Apr 2019 17:44:33 +1200 Subject: [Python-Dev] checking "errno" for math operaton is safe to determine the error status? In-Reply-To: <8488FBC4EAAC5941BA4B85DD1ECCF1870133BB0B64@ALA-MBD.corp.ad.wrs.com> References: <8488FBC4EAAC5941BA4B85DD1ECCF1870133BB0437@ALA-MBD.corp.ad.wrs.com> <5fb2462a-ee2c-00e3-1928-d56510e29570@python.org> <8488FBC4EAAC5941BA4B85DD1ECCF1870133BB0B64@ALA-MBD.corp.ad.wrs.com> Message-ID: <5CB025C1.2060708@canterbury.ac.nz> Xin, Peixing wrote: > On certain platform, expm1() is implemented as exp() minus 1. To calculate > expm1(-1420.0), that will call exp(-1420.0) then substract 1. You know, > exp(-1420.0) will underflow to zero and errno is set to ERANGE. As a > consequence the errno keeps set there when expm1() returns the correct result > -1. This sounds like a bug in that platform's implementation of expm1() to me. Which platform is it? -- Greg From vstinner at redhat.com Fri Apr 12 06:57:04 2019 From: vstinner at redhat.com (Victor Stinner) Date: Fri, 12 Apr 2019 12:57:04 +0200 Subject: [Python-Dev] No longer enable Py_TRACE_REFS by default in debug build In-Reply-To: References: <0a1871f0-14f8-8591-cf35-21cb3741f354@python.org> Message-ID: Le jeu. 11 avr. 2019 ? 17:33, Serhiy Storchaka a ?crit : > If reducing the Python memory footprint is an argument for disabling > Py_TRACE_REFS, it is a weak argument because there is larger overhead in > the debug build. The "serialno" field of debug memory allocators is documented as: "an excellent way to set a breakpoint on the next run, to capture the instant at which this block was passed out." I'm debugging crashes and memory leaks in CPython for 10 years, and I simply never had to use "serialno". 
I wrote https://bugs.python.org/issue36611 to remove the serialno field of debug hooks on Python memory allocators: it reduces the memory footprint by 5% (ex: 1.2 MiB on 33.0 MiB when running test_asyncio). Python is used on devices with low memory (ex: 256 MiB for the whole system). Allowing developers to use a debug build on such devices seem to be a legit rationale for such change. The debug build is very useful to identify bugs in C extensions. > On other hand, since using the debug allocator doesn't cause problems > with compatibility, it may be possible to use similar technique for the > objects double list. Although this is not easy because of objects placed > at static memory. I'm not sure of what you means by "objects placed at static memory": the double linked list of all Python objects is created at runtime. _ob_next and _ob_prev are initialized statically to NULL. I would be interested if Py_TRACE_REFS could be reimplemented in a more dynamic fashion. Even if it would still require a debug build, it would be nice to be able to "opt-in" for this feature (have it disabled by default, again, to reduce the overhead and reduce the memory footprint), as tracemalloc which plugs itself into memory allocators to attach traces to memory blocks. Except Guido who wrote "I recall finding memory leaks using this. (E.g. I remember a leak in Zope due to a cache that was never pruned.) But presumably gc.get_objects() would have been sufficient. (IIRC it didn't exist at the time.)", at this point, nobody said that they use Py_TRACE_REFS. So I'm not sure that it's worth it to invest time on a feature if nobody uses it? Victor -- Night gathers, and now my watch begins. It shall not end until my death. From vstinner at redhat.com Fri Apr 12 08:06:04 2019 From: vstinner at redhat.com (Victor Stinner) Date: Fri, 12 Apr 2019 14:06:04 +0200 Subject: [Python-Dev] No longer enable Py_TRACE_REFS by default in debug build In-Reply-To: References: <0a1871f0-14f8-8591-cf35-21cb3741f354@python.org> Message-ID: Le ven. 12 avr. 2019 ? 12:57, Victor Stinner a ?crit : > I wrote https://bugs.python.org/issue36611 to remove the serialno field > of debug hooks on Python memory allocators: it reduces > the memory footprint by 5% (ex: 1.2 MiB on 33.0 MiB when running > test_asyncio). I measured the memory footprint when I combine my two changes: * disable Py_TRACE_REFS: https://bugs.python.org/issue36465 * disable/remove serialno field: https://bugs.python.org/issue36611 python3 -m test test_asyncio, without => with the change: 34,038.0 kB => 30,612.2 kB (-3,425.8 kiB, -10%) A reduction of 3.4 MiB on 34.0 MiB is quite significant, no? Victor -- Night gathers, and now my watch begins. It shall not end until my death. From vstinner at redhat.com Fri Apr 12 08:34:56 2019 From: vstinner at redhat.com (Victor Stinner) Date: Fri, 12 Apr 2019 14:34:56 +0200 Subject: [Python-Dev] PEP-582 and multiple Python installations In-Reply-To: References: Message-ID: Hi, Le mar. 2 avr. 2019 ? 17:20, Calvin Spealman a ?crit : > While the PEP does show the version number as part of the path to the actual packages, implying support for multiple versions, this doesn't seem to be spelled out in the actual text. Presumably __pypackages__/3.8/ might sit beside __pypackages__/3.9/, etc. to keep future versions capable of installing packages for each version, the way virtualenv today is bound to one version of Python. 
> > > I'd like to raise a potential edge case that might be a problem, and likely an increasingly common one: users with multiple installations of the *same* version of Python. Hum, I don't know if it's relevant to support multiple Python binaries of the same Python version, but just in case, let me share my experience with that in the pyperformance project. The pyperformance project uses virtual environment for two binaries of the exact Python version (and usually the same path!): one unpatched "reference" and one "patched" binary, to experiment an optimization. I needed a way to build a short text identifier to still be able to get a "cached" virtual environment per Python binary. I wrote a short code to generate the identifier using: * pyperformance version * requirements.txt * sys.executable * sys.version * sys.version_info * sys.implementation.name of platform.python_implementation() The script builds a long string using these info, hash it with SHA1 and take first 12 characters of the hexadecimal format of the hash. Script: --- import hashlib import platform import sys performance_version = sys.argv[1] requirements = sys.argv[2] data = performance_version + sys.executable + sys.version pyver= sys.version_info if hasattr(sys, 'implementation'): # PEP 421, Python 3.3 implementation = sys.implementation.name else: implementation = platform.python_implementation() implementation = implementation.lower() if not isinstance(data, bytes): data = data.encode('utf-8') with open(requirements, 'rb') as fp: data += fp.read() sha1 = hashlib.sha1(data).hexdigest() name = ('%s%s.%s-%s' % (implementation, pyver.major, pyver.minor, sha1[:12])) print(name) --- Examples: $ touch requirements.txt # empty file $ python3.7 x.py version requirements.txt cpython3.7-502d35b8e005 $ python3.6 x.py version requirements.txt cpython3.6-7f4febbec0be $ python3 x.py version requirements.txt cpython3.7-59ab636dfacb $ file /usr/bin/python3 /usr/bin/python3: symbolic link to python3.7 Hum, python3 and python3.7 produce the different hash whereas it's the same binary. Maybe os.path.realpath() should be called on sys.executable :-) Victor -- Night gathers, and now my watch begins. It shall not end until my death. From stefan_ml at behnel.de Fri Apr 12 08:48:30 2019 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 12 Apr 2019 14:48:30 +0200 Subject: [Python-Dev] No longer enable Py_TRACE_REFS by default in debug build In-Reply-To: References: <0a1871f0-14f8-8591-cf35-21cb3741f354@python.org> Message-ID: Serhiy Storchaka schrieb am 11.04.19 um 17:30: > If reducing the Python memory footprint is an argument for disabling > Py_TRACE_REFS, it is a weak argument because there is larger overhead in > the debug build. I think what Victor is argueing is rather that we have better ways to debug memory problems these days, so we might be able to get rid of a relict that no-one is using (or should be using) anymore and that has its drawbacks (such as a very different ABI and higher memory load). I don't really have an opinion here, but I can at least say that I never found a use case for Py_TRACE_REFS myself and therefore certainly wouldn't miss it. Stefan From songofacandy at gmail.com Fri Apr 12 09:44:05 2019 From: songofacandy at gmail.com (Inada Naoki) Date: Fri, 12 Apr 2019 22:44:05 +0900 Subject: [Python-Dev] Proposal: dict.with_values(iterable) Message-ID: Hi, all. I propose adding new method: dict.with_values(iterable) # Motivation Python is used to handle data. 
While a dict is not the most efficient way to handle many records, it is still a convenient one. When creating many dicts with the same keys, the dict needs to look up its internal hash table while inserting each key.

It is a costly operation. If we can reuse the existing keys of a dict, we can skip this insertion cost.

Additionally, we have the "Key-Sharing Dictionary" (PEP 412). When all keys are strings, many dicts can share one set of keys. It reduces memory consumption.

This might be usable for:

* csv.DictReader
* namedtuple._asdict()
* DB-API 2.0 implementations: (e.g. DictCursor of mysqlclient-python)

# Draft implementation

pull request: https://github.com/python/cpython/pull/12802

with_values(self, iterable, /)
    Create a new dictionary with keys from this dict and values from iterable.

    When length of iterable is different from len(self), ValueError is raised.
    This method does not support dict subclass.

## Memory usage (Key-Sharing dict)

>>> import sys
>>> keys = tuple("abcdefg")
>>> keys
('a', 'b', 'c', 'd', 'e', 'f', 'g')
>>> d = dict(zip(keys, range(7)))
>>> d
{'a': 0, 'b': 1, 'c': 2, 'd': 3, 'e': 4, 'f': 5, 'g': 6}
>>> sys.getsizeof(d)
360

>>> keys = dict.fromkeys("abcdefg")
>>> d = keys.with_values(range(7))
>>> d
{'a': 0, 'b': 1, 'c': 2, 'd': 3, 'e': 4, 'f': 5, 'g': 6}
>>> sys.getsizeof(d)
144

## Speed

$ ./python -m perf timeit -o zip_dict.json -s 'keys = tuple("abcdefg"); values=[*range(7)]' 'dict(zip(keys, values))'
$ ./python -m perf timeit -o with_values.json -s 'keys = dict.fromkeys("abcdefg"); values=[*range(7)]' 'keys.with_values(values)'
$ ./python -m perf compare_to zip_dict.json with_values.json
Mean +- std dev: [zip_dict] 935 ns +- 9 ns -> [with_values] 109 ns +- 2 ns: 8.59x faster (-88%)

What do you think? Any comments are appreciated.

Regards,
-- 
Inada Naoki

From vstinner at redhat.com Fri Apr 12 10:31:47 2019
From: vstinner at redhat.com (Victor Stinner)
Date: Fri, 12 Apr 2019 16:31:47 +0200
Subject: [Python-Dev] Proposal: dict.with_values(iterable)
In-Reply-To: 
References: 
Message-ID: 

Nice optimization! I have questions on the proposed API.

> with_values(self, iterable, /)
> Create a new dictionary with keys from this dict and values from iterable.
>
> When length of iterable is different from len(self), ValueError is raised.
> This method does not support dict subclass.

In short, mydict.with_values(values) behaves as dict(zip(mydict.keys(), values)), but is more efficient?

The method relies on the fact that dict preserves key insertion order, right?

On Fri, Apr 12, 2019 at 15:47, Inada Naoki wrote:
> This might be usable for:
>
> * csv.DictReader
> * namedtuple._asdict()
> * DB-API 2.0 implementations: (e.g. DictCursor of mysqlclient-python)

I guess that a new dict constructor taking keys and values like dict.from_keys_and_values(keys, values) would work, but would not benefit from the dict key-sharing optimization?

Would it be possible to implement the key-sharing optimization using a dict.from_keys_and_values(mydict.keys(), values) method: detect that keys are owned by a dict, and so create a new dict linked to the keys dict? A dict view contains a reference to the iterated dict (dictiterobject.di_dict).

I'm fine with dict.with_values() API, but I'm asking if it could be written differently.

Victor
-- 
Night gathers, and now my watch begins. It shall not end until my death.
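(For readers skimming the thread: the status quo that the proposed with_values() method aims to speed up is the per-row dict(zip(...)) pattern used by csv.DictReader and similar code. The sketch below works on any current Python 3; the with_values() part shown in comments is only the proposal, not an existing API, and the variable names are illustrative.)

# Works today: every row rebuilds the hash table from scratch.
keys = ("id", "name", "email")
rows = [(1, "alice", "alice@example.com"),
        (2, "bob", "bob@example.com")]

records = [dict(zip(keys, row)) for row in rows]
print(records[0])   # {'id': 1, 'name': 'alice', 'email': 'alice@example.com'}

# With the proposal, the keys would be prepared once and only the values
# would change per row (method name and semantics as proposed above):
#
#   template = dict.fromkeys(keys)
#   records = [template.with_values(row) for row in rows]
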
From guido at python.org Fri Apr 12 10:40:13 2019 From: guido at python.org (Guido van Rossum) Date: Fri, 12 Apr 2019 07:40:13 -0700 Subject: [Python-Dev] No longer enable Py_TRACE_REFS by default in debug build In-Reply-To: References: <0a1871f0-14f8-8591-cf35-21cb3741f354@python.org> Message-ID: On Fri, Apr 12, 2019 at 5:51 AM Stefan Behnel wrote: > Serhiy Storchaka schrieb am 11.04.19 um 17:30: > > If reducing the Python memory footprint is an argument for disabling > > Py_TRACE_REFS, it is a weak argument because there is larger overhead in > > the debug build. > > I think what Victor is argueing is rather that we have better ways to debug > memory problems these days, so we might be able to get rid of a relict that > no-one is using (or should be using) anymore and that has its drawbacks > (such as a very different ABI and higher memory load). > > I don't really have an opinion here, but I can at least say that I never > found a use case for Py_TRACE_REFS myself and therefore certainly wouldn't > miss it. > I have a feeling that at some point someone might want to use this to debug some leak (presumably caused by C code) beyond what gc.get_objects() can report. But I agree that it isn't useful to the vast majority of users of a regular debug build. So let's leave it off by default even in debug builds. But let's not delete the macros. -- --Guido van Rossum (python.org/~guido) *Pronouns: he/him/his **(why is my pronoun here?)* -------------- next part -------------- An HTML attachment was scrubbed... URL: From mhroncok at redhat.com Fri Apr 12 10:53:16 2019 From: mhroncok at redhat.com (=?UTF-8?Q?Miro_Hron=c4=8dok?=) Date: Fri, 12 Apr 2019 16:53:16 +0200 Subject: [Python-Dev] Update PEP 394: Distributions can choose what does python command mean Message-ID: <82dd4715-9709-1fcf-769f-4902fd381578@redhat.com> Hello. Based on discussions in [1], Petr Viktorin and me have drafted a new update [2] to the PEP 394 (The "python" Command on Unix-Like Systems). The update gives distributors the opportunity to decide where does the "python" command lead to, whether it is present etc. Please, see the PR [2] for the suggested changes. [1]: https://mail.python.org/pipermail/python-dev/2019-February/156272.html [2]: https://github.com/python/peps/pull/989 Thanks, -- Miro Hron?ok -- Phone: +420777974800 IRC: mhroncok From encukou at gmail.com Fri Apr 12 10:59:18 2019 From: encukou at gmail.com (Petr Viktorin) Date: Fri, 12 Apr 2019 16:59:18 +0200 Subject: [Python-Dev] Update PEP 394: Distributions can choose what does python command mean In-Reply-To: <82dd4715-9709-1fcf-769f-4902fd381578@redhat.com> References: <82dd4715-9709-1fcf-769f-4902fd381578@redhat.com> Message-ID: <4c143e85-be0c-7995-0129-30d19b9bb690@gmail.com> On 4/12/19 4:53 PM, Miro Hron?ok wrote: > Hello. > > Based on discussions in [1], Petr Viktorin and me have drafted a new > update [2] to the PEP 394 (The "python" Command on Unix-Like Systems). > > The update gives distributors the opportunity to decide where does the > "python" command lead to, whether it is present etc. > > Please, see the PR [2] for the suggested changes. 
> > [1]: https://mail.python.org/pipermail/python-dev/2019-February/156272.html > [2]: https://github.com/python/peps/pull/989 The text is available at https://github.com/hroncok/peps/blob/pep394-2019/pep-0394.txt As a summary, I'll paste the rationale sections here: History of this PEP =================== In 2011, the majority of distributions aliased the ``python`` command to Python 2, but some started switching it to Python 3 ([5]_). As some of the former distributions did not provide a ``python2`` command by default, there was previously no way for Python 2 code (or any code that invokes the Python 2 interpreter directly rather than via ``sys.executable``) to reliably run on all Unix-like systems without modification, as the ``python`` command would invoke the wrong interpreter version on some systems, and the ``python2`` command would fail completely on others. This PEP originally provided a very simple mechanism to restore cross-platform support, with minimal additional work required on the part of distribution maintainers. Simplified, the recommendation was: 1. The ``python`` command was preferred for code compatible with both Python 2 and 3 (since it was available on all systems, even those that already aliased it to Python 3). 2. The ``python`` command should always invoke Python 2 (to prevent hard-to-diagnose errors when Python 2 code is run on Python 3). 3. The ``python2`` and ``python3`` commands should be available to specify the version explicitly. However, these recommendations implicitly assumed that Python 2 would always be available. As Python 2 is nearing its end of life in 2020 (PEP 373, PEP 404), distributions are making Python 2 optional or removing entirely. This means either removing the ``python`` command or switching it to invoke Python 3, invalidating respectively the first or second recommendation. Also, some distributors decided that their users are better served by ignoring the PEP's recommendations, making the PEP's supposedly cross-platform recommendations on ``python`` and ``python2`` in shebangs increasingly unreliable. Current Rationale ================= As of 2019, nearly all new systems include Python 3 and the ``python3`` command. This makes the ``python3`` command the best general choice for code that can run on either Python 3.x or 2.x, even though it is not available everywhere. The recommendation is skewed toward current and future systems, leaving behind ?*old systems*? (like RHEL 6 or default Python installed on macOS). On these systems, Python software is rarely updated and any recommendations this PEP makes would likely be ignored. Also, since distributors often ignored recommendations the PEP gave regarding the ``python`` command (for what they saw as legitimate special needs), this PEP now gives them broad control over the command. Correspondingly, users are advised to not use the ``python`` command in cross-platform code. Instead, this PEP specifies the expected behavior of the ``python3`` and ``python2`` commands, which is not controversial. From guido at python.org Fri Apr 12 11:13:42 2019 From: guido at python.org (Guido van Rossum) Date: Fri, 12 Apr 2019 08:13:42 -0700 Subject: [Python-Dev] Update PEP 394: Distributions can choose what does python command mean In-Reply-To: <4c143e85-be0c-7995-0129-30d19b9bb690@gmail.com> References: <82dd4715-9709-1fcf-769f-4902fd381578@redhat.com> <4c143e85-be0c-7995-0129-30d19b9bb690@gmail.com> Message-ID: I think this is reasonable. Thanks for making the rationale clear! 
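(For script authors wondering what this means in practice, a minimal sketch reflecting the recommendation quoted below, not new PEP text: ask for python3 explicitly rather than relying on the unversioned python command.)

#!/usr/bin/env python3
# Prefers the explicit "python3" command, so behaviour does not depend on
# what a given distribution chose to do with the bare "python" name.
import sys
print("running on", sys.version.split()[0])
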
On Fri, Apr 12, 2019 at 8:02 AM Petr Viktorin wrote: > On 4/12/19 4:53 PM, Miro Hron?ok wrote: > > Hello. > > > > Based on discussions in [1], Petr Viktorin and me have drafted a new > > update [2] to the PEP 394 (The "python" Command on Unix-Like Systems). > > > > The update gives distributors the opportunity to decide where does the > > "python" command lead to, whether it is present etc. > > > > Please, see the PR [2] for the suggested changes. > > > > [1]: > https://mail.python.org/pipermail/python-dev/2019-February/156272.html > > [2]: https://github.com/python/peps/pull/989 > > The text is available at > https://github.com/hroncok/peps/blob/pep394-2019/pep-0394.txt > > As a summary, I'll paste the rationale sections here: > > History of this PEP > =================== > > In 2011, the majority of distributions > aliased the ``python`` command to Python 2, but some started switching it > to > Python 3 ([5]_). As some of the former distributions did not provide a > ``python2`` command by default, there was previously no way for Python 2 > code > (or any code that invokes the Python 2 interpreter directly rather than via > ``sys.executable``) to reliably run on all Unix-like systems without > modification, as the ``python`` command would invoke the wrong interpreter > version on some systems, and the ``python2`` command would fail completely > on others. This PEP originally provided a very simple mechanism > to restore cross-platform support, with minimal additional work required > on the part of distribution maintainers. Simplified, the recommendation > was: > > 1. The ``python`` command was preferred for code compatible with both > Python 2 and 3 (since it was available on all systems, even those that > already aliased it to Python 3). > 2. The ``python`` command should always invoke Python 2 (to prevent > hard-to-diagnose errors when Python 2 code is run on Python 3). > 3. The ``python2`` and ``python3`` commands should be available to specify > the version explicitly. > > However, these recommendations implicitly assumed that Python 2 would > always be > available. As Python 2 is nearing its end of life in 2020 (PEP 373, PEP > 404), > distributions are making Python 2 optional or removing entirely. > This means either removing the ``python`` command or switching it to invoke > Python 3, invalidating respectively the first or second recommendation. > Also, some distributors decided that their users are better served by > ignoring the PEP's recommendations, making the PEP's supposedly > cross-platform recommendations on ``python`` and ``python2`` in shebangs > increasingly unreliable. > > > Current Rationale > ================= > > As of 2019, nearly all new systems include Python 3 and the ``python3`` > command. This makes the ``python3`` command the best general choice for > code that can run on either Python 3.x or 2.x, even though it is not > available everywhere. > > The recommendation is skewed toward current and future systems, leaving > behind ?*old systems*? (like RHEL 6 or default Python installed on macOS). > On these systems, Python software is rarely updated and any recommendations > this PEP makes would likely be ignored. > > Also, since distributors often ignored recommendations the PEP gave > regarding the ``python`` command (for what they saw as legitimate special > needs), this PEP now gives them broad control over the command. > Correspondingly, users are advised to not use the ``python`` command > in cross-platform code. 
> Instead, this PEP specifies the expected behavior of the ``python3`` and > ``python2`` commands, which is not controversial. > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (python.org/~guido) *Pronouns: he/him/his **(why is my pronoun here?)* -------------- next part -------------- An HTML attachment was scrubbed... URL: From storchaka at gmail.com Fri Apr 12 11:17:37 2019 From: storchaka at gmail.com (Serhiy Storchaka) Date: Fri, 12 Apr 2019 18:17:37 +0300 Subject: [Python-Dev] No longer enable Py_TRACE_REFS by default in debug build In-Reply-To: References: <0a1871f0-14f8-8591-cf35-21cb3741f354@python.org> Message-ID: 12.04.19 17:40, Guido van Rossum ????: > So let's leave it off > by default even in debug builds. But let's not delete the macros. Maybe switch it on (together with other disabled by default options) on some fast buildbot? From storchaka at gmail.com Fri Apr 12 11:35:00 2019 From: storchaka at gmail.com (Serhiy Storchaka) Date: Fri, 12 Apr 2019 18:35:00 +0300 Subject: [Python-Dev] Proposal: dict.with_values(iterable) In-Reply-To: References: Message-ID: 12.04.19 16:44, Inada Naoki ????: > When creating many dicts with same keys, dict need to > lookup internal hash table while inserting each keys. > > It is costful operation. If we can reuse existing keys of dict, > we can skip this inserting cost. > > Additionally, we have "Key-Sharing Dictionary (PEP 412)". > When all keys are string, many dict can share one key. > It reduces memory consumption. It looks contrary to simplification made in Python 3 when we get rid of some more efficient lists in favor of more general iterators. If this is a common case we can add an invisible optimization for dict(zip(keys, values)), especially if keys is a key-sharing dictionary. This will benefit all users without the need to rewrite the code to use the new special method. The interface of dict is already overloaded. It contains many methods which most users use rarely (and therefore which are not kept in the working set of memory). From songofacandy at gmail.com Fri Apr 12 12:07:23 2019 From: songofacandy at gmail.com (Inada Naoki) Date: Sat, 13 Apr 2019 01:07:23 +0900 Subject: [Python-Dev] Proposal: dict.with_values(iterable) In-Reply-To: References: Message-ID: On Fri, Apr 12, 2019 at 11:31 PM Victor Stinner wrote: > > Nice optimization! I have questions on the proposed API. > > > with_values(self, iterable, /) > > Create a new dictionary with keys from this dict and values from iterable. > > > > When length of iterable is different from len(self), ValueError is raised. > > This method does not support dict subclass. > > In short, mydict.with_values(values) behaves as > dict(zip(mydict.keys(), values)), but is more efficient? Yes. But unlike zip, keys() and values must have exactly same length. > > The method rely on the fact that dict is preserving key insertion order, right? > Yes. > > This might be usable for: > > > > * csv.DictReader > > * namedtuple._asdict() > > * DB-API 2.0 implementations: (e.g. DictCursor of mysqlclient-python) > > I guess that a new dict constructor taken keys and values like > dict.from_keys_and_values(keys, values) would work, but would not > benefit from the dict key-sharing optimization? > I don't like more overloading. And this API is specialized to build multiple dicts, not one dict. 
So I want to have dedicated API for it. > Would it be possible to implement the key-sharing optimization using a > dict.from_keys_and_values(mydict.keys(), values) method: detect that > keys are owned by a dict, and so create a new dict linked to the keys > dict? A dict view contains a reference to the iterated dict > (dictiterobject.di_dict). I think it is possible. > > I'm fine with dict.with_values() API, but I'm asking if it could be > written differently. > > Victor I implemented it as instance method of dict because it may modify the dict internally (at first invocation). -- Inada Naoki From songofacandy at gmail.com Fri Apr 12 12:17:10 2019 From: songofacandy at gmail.com (Inada Naoki) Date: Sat, 13 Apr 2019 01:17:10 +0900 Subject: [Python-Dev] Proposal: dict.with_values(iterable) In-Reply-To: References: Message-ID: On Sat, Apr 13, 2019 at 12:38 AM Serhiy Storchaka wrote: > > It looks contrary to simplification made in Python 3 when we get rid of > some more efficient lists in favor of more general iterators. > Yes. This is API for special use case creates many dict having same keys, like csv.DictReader. It is not good design for general purpose. strings module has strings.Template class. But I don't want to add dicts module. Maybe, collections.DictBuilder may be another option. e.g. >>> from collections import DictBuilder >>> builder = DictBuilder(tuple("abc")) >>> builder.build(range(3)) {"a": 0, "b": 1, "c": 2} > If this is a common case we can add an invisible optimization for > dict(zip(keys, values)), especially if keys is a key-sharing dictionary. > This will benefit all users without the need to rewrite the code to use > the new special method. But this optimization may slow down when creating one dict... > > The interface of dict is already overloaded. It contains many methods > which most users use rarely (and therefore which are not kept in the > working set of memory). Yes. -- Inada Naoki From J.Demeyer at UGent.be Fri Apr 12 12:19:36 2019 From: J.Demeyer at UGent.be (Jeroen Demeyer) Date: Fri, 12 Apr 2019 18:19:36 +0200 Subject: [Python-Dev] Removing PID check from signal handler Message-ID: <5CB0BA98.4070102@UGent.be> The signal handler (that receives signals from the OS) in Python starts with a check if (getpid() == main_pid) Looking at the comments, the intent was to do a check for the main *thread* but this is checking the *process* id. So this condition is basically always true. Therefore, I suggest to remove it in https://bugs.python.org/issue36601 If you have any objections or comments, I suggest to post them to that bpo. Jeroen. From brett at python.org Fri Apr 12 13:16:11 2019 From: brett at python.org (Brett Cannon) Date: Fri, 12 Apr 2019 10:16:11 -0700 Subject: [Python-Dev] Proposal: dict.with_values(iterable) In-Reply-To: References: Message-ID: On Fri, Apr 12, 2019 at 8:35 AM Serhiy Storchaka wrote: > 12.04.19 16:44, Inada Naoki ????: > > When creating many dicts with same keys, dict need to > > lookup internal hash table while inserting each keys. > > > > It is costful operation. If we can reuse existing keys of dict, > > we can skip this inserting cost. > > > > Additionally, we have "Key-Sharing Dictionary (PEP 412)". > > When all keys are string, many dict can share one key. > > It reduces memory consumption. > > It looks contrary to simplification made in Python 3 when we get rid of > some more efficient lists in favor of more general iterators. > > If this is a common case I think that "if" is my big sticking point. 
I don't think I've ever had a need for this and the zip() solution was what I originally thought of when I realized what the method was meant to do (which wasn't obvious to me initially). This doesn't strike me as needing an optimization through a dedicated method. -Brett > we can add an invisible optimization for > dict(zip(keys, values)), especially if keys is a key-sharing dictionary. > This will benefit all users without the need to rewrite the code to use > the new special method. > > The interface of dict is already overloaded. It contains many methods > which most users use rarely (and therefore which are not kept in the > working set of memory). > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/brett%40python.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve.dower at python.org Fri Apr 12 14:05:03 2019 From: steve.dower at python.org (Steve Dower) Date: Fri, 12 Apr 2019 11:05:03 -0700 Subject: [Python-Dev] Removing PID check from signal handler In-Reply-To: <5CB0BA98.4070102@UGent.be> References: <5CB0BA98.4070102@UGent.be> Message-ID: <8f8399d0-5a6b-5b72-3b24-183c800c31a6@python.org> On 12Apr.2019 0919, Jeroen Demeyer wrote: > The signal handler (that receives signals from the OS) in Python starts > with a check > > ??? if (getpid() == main_pid) > > Looking at the comments, the intent was to do a check for the main > *thread* but this is checking the *process* id. So this condition is > basically always true. Therefore, I suggest to remove it in > https://bugs.python.org/issue36601 > > If you have any objections or comments, I suggest to post them to that bpo. To add a little more context, the check was added about 25 years ago as a "hack" for some reason that we can't figure out anymore. So if you are a historian of ancient operating systems and know of one that might have raised signal handlers in a different process from the one where it was registered, we'd love to hear from you. Cheers, Steve From status at bugs.python.org Fri Apr 12 14:07:56 2019 From: status at bugs.python.org (Python tracker) Date: Fri, 12 Apr 2019 18:07:56 +0000 (UTC) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20190412180756.A925F52B212@bugs.ams1.psf.io> ACTIVITY SUMMARY (2019-04-05 - 2019-04-12) Python tracker at https://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message. Issues counts and deltas: open 7063 ( +7) closed 41307 (+76) total 48370 (+83) Open issues with patches: 2815 Issues opened (60) ================== #25160: Stop using deprecated imp module; imp should now emit a real D https://bugs.python.org/issue25160 reopened by ncoghlan #35934: Add socket.create_server() utility function https://bugs.python.org/issue35934 reopened by vstinner #36235: distutils.sysconfig.customize_compiler() overrides CFLAGS var https://bugs.python.org/issue36235 reopened by vstinner #36537: except statement block incorrectly assumes end of scope(?). 
https://bugs.python.org/issue36537 opened by Saim Raza #36538: _thread.interrupt_main() no longer interrupts Lock.wait https://bugs.python.org/issue36538 opened by gregory.p.smith #36540: PEP 570: Python Positional-Only Parameters https://bugs.python.org/issue36540 opened by pablogsal #36541: Make lib2to3 grammar more closely match Python https://bugs.python.org/issue36541 opened by thatch #36542: Allow to overwrite the signature for Python functions https://bugs.python.org/issue36542 opened by serhiy.storchaka #36543: Remove old-deprecated ElementTree features (part 2) https://bugs.python.org/issue36543 opened by serhiy.storchaka #36545: Python 3.5 OOM during test_socket on make https://bugs.python.org/issue36545 opened by dekken #36546: Add quantiles() to the statistics module https://bugs.python.org/issue36546 opened by rhettinger #36548: Make the repr of re flags more readable https://bugs.python.org/issue36548 opened by serhiy.storchaka #36550: Avoid creating AttributeError exceptions in the debugger https://bugs.python.org/issue36550 opened by blueyed #36551: Optimize list comprehensions with preallocate size and protect https://bugs.python.org/issue36551 opened by anthony shaw #36552: Replace OverflowError with ValueError when calculating length https://bugs.python.org/issue36552 opened by anthony shaw #36553: inspect.is_decorator_call(frame) https://bugs.python.org/issue36553 opened by smarie #36556: Trashcan causing duplicated __del__ calls https://bugs.python.org/issue36556 opened by jdemeyer #36557: Python (Launcher)3.7.3 CMDLine install/uninstall https://bugs.python.org/issue36557 opened by mattcher_h #36558: Change time.mktime() return type from float to int? https://bugs.python.org/issue36558 opened by vstinner #36560: test_functools leaks randomly 1 memory block https://bugs.python.org/issue36560 opened by vstinner #36563: pdbrc is read twice if current directory is the home directory https://bugs.python.org/issue36563 opened by blueyed #36564: Infinite loop with short maximum line lengths in EmailPolicy https://bugs.python.org/issue36564 opened by p-ganssle #36567: DOC: manpage directive doesn't create hyperlink https://bugs.python.org/issue36567 opened by cheryl.sabella #36568: Typo in socket.CAN_RAW_FD_FRAMES library documentation https://bugs.python.org/issue36568 opened by Carl Cerecke #36569: @staticmethod seems to work with setUpClass, but docs say it s https://bugs.python.org/issue36569 opened by Peter de Blanc #36572: python-snappy install issue during Crossbar install with Pytho https://bugs.python.org/issue36572 opened by telatoa #36573: zipfile zipfile.BadZipFile: Bad CRC-32 for file '11_02_2019.pd https://bugs.python.org/issue36573 opened by Jozef Cernak #36576: Some test_ssl and test_asyncio tests fail with OpenSSL 1.1.1 o https://bugs.python.org/issue36576 opened by vstinner #36580: unittest.mock does not understand dataclasses https://bugs.python.org/issue36580 opened by John Parejko2 #36581: __dir__ on unittest.mock not safe for all spec types https://bugs.python.org/issue36581 opened by Dylan Semler #36582: collections.UserString encode method returns a string https://bugs.python.org/issue36582 opened by trey #36583: Do not swallow exceptions in the _ssl module https://bugs.python.org/issue36583 opened by serhiy.storchaka #36585: test_posix.py fails due to unsupported RWF_HIPRI https://bugs.python.org/issue36585 opened by jdemeyer #36586: multiprocessing.Queue.close doesn't behave as documented https://bugs.python.org/issue36586 opened by graingert #36587: 
race in logging code when fork() https://bugs.python.org/issue36587 opened by cagney #36589: Incorrect error handling in curses.update_lines_cols() https://bugs.python.org/issue36589 opened by ZackerySpytz #36590: Add Bluetooth RFCOMM Support for Windows https://bugs.python.org/issue36590 opened by topnotcher #36593: Trace function interferes with MagicMock isinstance? https://bugs.python.org/issue36593 opened by nedbat #36594: Undefined behavior due to incorrect usage of %p in format stri https://bugs.python.org/issue36594 opened by ZackerySpytz #36595: IDLE: Add search to Squeezed Output text viewer. https://bugs.python.org/issue36595 opened by Shane Smith #36596: tarfile module considers anything starting with 512 bytes of z https://bugs.python.org/issue36596 opened by cks #36600: re-enable test in nntplib https://bugs.python.org/issue36600 opened by Marcin Niemira #36601: signals can be caught by any thread https://bugs.python.org/issue36601 opened by jdemeyer #36602: Recursive directory list with pathlib.Path.iterdir https://bugs.python.org/issue36602 opened by Epic_Wink #36603: should pty.openpty() set pty/tty inheritable? https://bugs.python.org/issue36603 opened by cagney #36605: make tags should also parse Modules/_io/*.c and Modules/_io/*. https://bugs.python.org/issue36605 opened by vstinner #36606: calling super() causes __class__ to be not defined when sys.se https://bugs.python.org/issue36606 opened by xtreak #36607: asyncio.all_tasks() crashes if asyncio is used in multiple thr https://bugs.python.org/issue36607 opened by Nick Davies #36608: Replace bundled pip and setuptools with a downloader in the en https://bugs.python.org/issue36608 opened by webknjaz #36609: activate.ps1 in venv for Windows should encoded with BOM https://bugs.python.org/issue36609 opened by ????????? #36610: os.sendfile can return EINVAL on Solaris https://bugs.python.org/issue36610 opened by kulikjak #36611: Debug memory allocators: remove useless "serialno" field to re https://bugs.python.org/issue36611 opened by vstinner #36612: Unittest document is not clear on SetUpClass calls https://bugs.python.org/issue36612 opened by vrpolakatcisco #36613: asyncio._wait() don't remove callback in case of exception https://bugs.python.org/issue36613 opened by gescheit #36614: Popen output on windows server 2019 https://bugs.python.org/issue36614 opened by weispinc #36615: why call _Py_set_inheritable(0) from os.open() when O_CLOEXEC? https://bugs.python.org/issue36615 opened by cagney #36616: Optimize thread state handling in function call code https://bugs.python.org/issue36616 opened by jdemeyer #36617: The rich comparison operators are second class citizens https://bugs.python.org/issue36617 opened by bup #36618: clang expects memory aligned on 16 bytes, but pymalloc aligns https://bugs.python.org/issue36618 opened by vstinner #36619: when is os.posix_spawn(setsid=True) safe? https://bugs.python.org/issue36619 opened by cagney Most recent 15 issues with no replies (15) ========================================== #36619: when is os.posix_spawn(setsid=True) safe? 
https://bugs.python.org/issue36619 #36618: clang expects memory aligned on 16 bytes, but pymalloc aligns https://bugs.python.org/issue36618 #36616: Optimize thread state handling in function call code https://bugs.python.org/issue36616 #36613: asyncio._wait() don't remove callback in case of exception https://bugs.python.org/issue36613 #36607: asyncio.all_tasks() crashes if asyncio is used in multiple thr https://bugs.python.org/issue36607 #36606: calling super() causes __class__ to be not defined when sys.se https://bugs.python.org/issue36606 #36603: should pty.openpty() set pty/tty inheritable? https://bugs.python.org/issue36603 #36596: tarfile module considers anything starting with 512 bytes of z https://bugs.python.org/issue36596 #36594: Undefined behavior due to incorrect usage of %p in format stri https://bugs.python.org/issue36594 #36590: Add Bluetooth RFCOMM Support for Windows https://bugs.python.org/issue36590 #36589: Incorrect error handling in curses.update_lines_cols() https://bugs.python.org/issue36589 #36585: test_posix.py fails due to unsupported RWF_HIPRI https://bugs.python.org/issue36585 #36583: Do not swallow exceptions in the _ssl module https://bugs.python.org/issue36583 #36572: python-snappy install issue during Crossbar install with Pytho https://bugs.python.org/issue36572 #36567: DOC: manpage directive doesn't create hyperlink https://bugs.python.org/issue36567 Most recent 15 issues waiting for review (15) ============================================= #36618: clang expects memory aligned on 16 bytes, but pymalloc aligns https://bugs.python.org/issue36618 #36613: asyncio._wait() don't remove callback in case of exception https://bugs.python.org/issue36613 #36612: Unittest document is not clear on SetUpClass calls https://bugs.python.org/issue36612 #36611: Debug memory allocators: remove useless "serialno" field to re https://bugs.python.org/issue36611 #36610: os.sendfile can return EINVAL on Solaris https://bugs.python.org/issue36610 #36608: Replace bundled pip and setuptools with a downloader in the en https://bugs.python.org/issue36608 #36605: make tags should also parse Modules/_io/*.c and Modules/_io/*. https://bugs.python.org/issue36605 #36602: Recursive directory list with pathlib.Path.iterdir https://bugs.python.org/issue36602 #36601: signals can be caught by any thread https://bugs.python.org/issue36601 #36600: re-enable test in nntplib https://bugs.python.org/issue36600 #36594: Undefined behavior due to incorrect usage of %p in format stri https://bugs.python.org/issue36594 #36593: Trace function interferes with MagicMock isinstance? https://bugs.python.org/issue36593 #36590: Add Bluetooth RFCOMM Support for Windows https://bugs.python.org/issue36590 #36589: Incorrect error handling in curses.update_lines_cols() https://bugs.python.org/issue36589 #36585: test_posix.py fails due to unsupported RWF_HIPRI https://bugs.python.org/issue36585 Top 10 most discussed issues (10) ================================= #36551: Optimize list comprehensions with preallocate size and protect https://bugs.python.org/issue36551 24 msgs #30458: [security][CVE-2019-9740][CVE-2019-9947] HTTP Header Injection https://bugs.python.org/issue30458 13 msgs #36560: test_functools leaks randomly 1 memory block https://bugs.python.org/issue36560 13 msgs #36537: except statement block incorrectly assumes end of scope(?). 
https://bugs.python.org/issue36537 9 msgs #33608: Add a cross-interpreter-safe mechanism to indicate that an obj https://bugs.python.org/issue33608 8 msgs #36533: logging regression with threading + fork are mixed in 3.7.1rc2 https://bugs.python.org/issue36533 8 msgs #36389: Add gc.enable_object_debugger(): detect corrupted Python objec https://bugs.python.org/issue36389 7 msgs #36611: Debug memory allocators: remove useless "serialno" field to re https://bugs.python.org/issue36611 7 msgs #18748: io.IOBase destructor silence I/O error on close() by default https://bugs.python.org/issue18748 6 msgs #36573: zipfile zipfile.BadZipFile: Bad CRC-32 for file '11_02_2019.pd https://bugs.python.org/issue36573 6 msgs Issues closed (73) ================== #2281: Enhanced cPython profiler with high-resolution timer https://bugs.python.org/issue2281 closed by inada.naoki #12910: urllib.quote quotes too many chars, e.g., '()' https://bugs.python.org/issue12910 closed by orsenthil #14017: Make it easy to create a new TextIOWrapper based on an existin https://bugs.python.org/issue14017 closed by ncoghlan #16712: collections.abc.Sequence should not provide __reversed__ https://bugs.python.org/issue16712 closed by inada.naoki #17396: modulefinder fails if module contains syntax error https://bugs.python.org/issue17396 closed by ncoghlan #17561: Add socket.bind_socket() convenience function https://bugs.python.org/issue17561 closed by giampaolo.rodola #19417: Bdb: add a unittest file (test.test_bdb) https://bugs.python.org/issue19417 closed by xdegaye #19476: Add a dedicated specification for module "reloading" to the la https://bugs.python.org/issue19476 closed by eric.snow #21318: sdist fails with symbolic links do non-existing files https://bugs.python.org/issue21318 closed by cheryl.sabella #25922: canceling a repair install breaks the ability to uninstall, re https://bugs.python.org/issue25922 closed by cheryl.sabella #27181: Add geometric mean to `statistics` module https://bugs.python.org/issue27181 closed by rhettinger #28351: statistics.geometric_mean can enter infinite loop for Decimal https://bugs.python.org/issue28351 closed by cheryl.sabella #28626: Tutorial: rearrange discussion of output formatting to encoura https://bugs.python.org/issue28626 closed by cheryl.sabella #29209: Remove old-deprecated ElementTree features https://bugs.python.org/issue29209 closed by serhiy.storchaka #29707: os.path.ismount() always returns false for mount --bind on sam https://bugs.python.org/issue29707 closed by christian.heimes #30134: BytesWarning is missing from the documents https://bugs.python.org/issue30134 closed by inada.naoki #30661: Support tarfile.PAX_FORMAT in shutil.make_archive https://bugs.python.org/issue30661 closed by ncoghlan #31155: Encode set, frozenset, bytearray, and iterators as json arrays https://bugs.python.org/issue31155 closed by inada.naoki #31512: Add non-elevated symlink support for dev mode Windows 10 https://bugs.python.org/issue31512 closed by steve.dower #32534: Speed-up list.insert: use memmove() https://bugs.python.org/issue32534 closed by inada.naoki #33228: Use Random.choices in tempfile https://bugs.python.org/issue33228 closed by inada.naoki #33456: site.py: by default, a virtual environment is *not* isolated f https://bugs.python.org/issue33456 closed by vinay.sajip #33461: json.loads(encoding=) does not emit deprecation warn https://bugs.python.org/issue33461 closed by inada.naoki #33722: Document builtins in mock_open https://bugs.python.org/issue33722 closed by 
jcrotts #34060: regrtest: log "CPU usage" on Windows https://bugs.python.org/issue34060 closed by vstinner #34139: Remove stale unix datagram socket before binding https://bugs.python.org/issue34139 closed by asvetlov #34144: venv activate.bat reset codepage fails on windows 10 https://bugs.python.org/issue34144 closed by cheryl.sabella #34160: ElementTree not preserving attribute order https://bugs.python.org/issue34160 closed by scoder #34805: Explicitly specify `MyClass.__subclasses__()` returns classes https://bugs.python.org/issue34805 closed by cheryl.sabella #35376: modulefinder skips nested modules with same name as top-level https://bugs.python.org/issue35376 closed by ncoghlan #35416: Fix potential resource warnings in distutils https://bugs.python.org/issue35416 closed by inada.naoki #35488: pathlib Path.match does not behave as described https://bugs.python.org/issue35488 closed by anthony shaw #35848: readinto is not a method on io.TextIOBase https://bugs.python.org/issue35848 closed by benjamin.peterson #35906: [CVE-2019-9947] Header Injection in urllib https://bugs.python.org/issue35906 closed by gregory.p.smith #35936: Give modulefinder some much-needed updates. https://bugs.python.org/issue35936 closed by ncoghlan #36050: Why does http.client.HTTPResponse._safe_read use MAXAMOUNT https://bugs.python.org/issue36050 closed by inada.naoki #36378: Add support to load from paths to json.load https://bugs.python.org/issue36378 closed by inada.naoki #36416: bytes.rpartition bug in online documentation https://bugs.python.org/issue36416 closed by inada.naoki #36495: Out-of-bounds array reads in Python/ast.c https://bugs.python.org/issue36495 closed by levkivskyi #36501: Remove POSIX.1e ACLs in tests that rely on default permissions https://bugs.python.org/issue36501 closed by Ivan.Pozdeev #36503: remove references to aix3 and aix4 in \*.py https://bugs.python.org/issue36503 closed by inada.naoki #36504: Signed integer overflow in _ctypes.c's PyCArrayType_new() https://bugs.python.org/issue36504 closed by serhiy.storchaka #36506: [security] CVE-2019-10268: An arbitrary execution vulnerabilit https://bugs.python.org/issue36506 closed by serhiy.storchaka #36513: Add support for building arm32 nuget package https://bugs.python.org/issue36513 closed by steve.dower #36527: unused parameter warnings in Include/object.h (affecting build https://bugs.python.org/issue36527 closed by inada.naoki #36532: Example of logging.formatter with new str.format style https://bugs.python.org/issue36532 closed by spaceman_spiff #36535: Windows build failure when use the code from the GitHub master https://bugs.python.org/issue36535 closed by Manjusaka #36539: Distutils VC 6.0 Errors When Using mingw-w64 GCC https://bugs.python.org/issue36539 closed by danyeaw #36544: cannot import hashlib when openssl is missing https://bugs.python.org/issue36544 closed by xdegaye #36547: bedevere is not working https://bugs.python.org/issue36547 closed by xtreak #36549: str.capitalize should titlecase the first character not upperc https://bugs.python.org/issue36549 closed by steve.dower #36554: unittest.TestCase: "subTest" cannot be used together with "deb https://bugs.python.org/issue36554 closed by dmaurer #36555: PEP484 @overload vs. 
str/bytes https://bugs.python.org/issue36555 closed by levkivskyi #36559: "import random" should import hashlib on demand (nor load Open https://bugs.python.org/issue36559 closed by rhettinger #36561: Python argparse doesn't work in the presence of my custom modu https://bugs.python.org/issue36561 closed by xtreak #36562: Can't call a method from a module built in Python C API https://bugs.python.org/issue36562 closed by jjppof #36565: Reference hunting (python3 -m test -R 3:3) doesn't work if the https://bugs.python.org/issue36565 closed by vstinner #36566: Support password masking in getpass.getpass() https://bugs.python.org/issue36566 closed by cheryl.sabella #36570: ftplib timeouts for misconfigured server https://bugs.python.org/issue36570 closed by giampaolo.rodola #36571: Lib/smtplib.py have some pep8 issues https://bugs.python.org/issue36571 closed by Marcin Niemira #36574: Error with self in python https://bugs.python.org/issue36574 closed by steven.daprano #36575: Use _PyTime_GetPerfCounter() in lsprof https://bugs.python.org/issue36575 closed by inada.naoki #36577: setup doesn't report missing _ssl and _hashlib https://bugs.python.org/issue36577 closed by christian.heimes #36578: multiprocessing pool + subprocess ValueError: empty range for https://bugs.python.org/issue36578 closed by SilentGhost #36579: test_venv: test_with_pip() hangs on PPC64 AIX 3.x https://bugs.python.org/issue36579 closed by vstinner #36584: cython nametuple TypeError https://bugs.python.org/issue36584 closed by serhiy.storchaka #36588: change sys.platform() to just "aix" for AIX https://bugs.python.org/issue36588 closed by vstinner #36591: Should be a typing.UserNamedTuple https://bugs.python.org/issue36591 closed by levkivskyi #36592: is behave different for integers in 3.6 and 3.7 https://bugs.python.org/issue36592 closed by eric.smith #36597: Travis CI: doctest failure https://bugs.python.org/issue36597 closed by inada.naoki #36598: mock side_effect should be checked for iterable not callable https://bugs.python.org/issue36598 closed by xtreak #36599: doctest document says dict order is unstable https://bugs.python.org/issue36599 closed by inada.naoki #36604: Add recipe to itertools https://bugs.python.org/issue36604 closed by rhettinger From barry at python.org Fri Apr 12 14:08:45 2019 From: barry at python.org (Barry Warsaw) Date: Fri, 12 Apr 2019 11:08:45 -0700 Subject: [Python-Dev] Update PEP 394: Distributions can choose what does python command mean In-Reply-To: <82dd4715-9709-1fcf-769f-4902fd381578@redhat.com> References: <82dd4715-9709-1fcf-769f-4902fd381578@redhat.com> Message-ID: <50534007-27B6-482D-B4D2-30D624441146@python.org> Thanks for the update. I have made one small suggestion on the PR for clarification, but otherwise the changes LGTM. -Barry > On Apr 12, 2019, at 07:53, Miro Hron?ok wrote: > > Hello. > > Based on discussions in [1], Petr Viktorin and me have drafted a new update [2] to the PEP 394 (The "python" Command on Unix-Like Systems). > > The update gives distributors the opportunity to decide where does the "python" command lead to, whether it is present etc. > > Please, see the PR [2] for the suggested changes. 
> > [1]: https://mail.python.org/pipermail/python-dev/2019-February/156272.html > [2]: https://github.com/python/peps/pull/989 > > Thanks, > -- > Miro Hron?ok > -- > Phone: +420777974800 > IRC: mhroncok > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/barry%40python.org -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: Message signed with OpenPGP URL: From vano at mail.mipt.ru Fri Apr 12 14:15:59 2019 From: vano at mail.mipt.ru (Ivan Pozdeev) Date: Fri, 12 Apr 2019 21:15:59 +0300 Subject: [Python-Dev] Removing PID check from signal handler In-Reply-To: <8f8399d0-5a6b-5b72-3b24-183c800c31a6@python.org> References: <5CB0BA98.4070102@UGent.be> <8f8399d0-5a6b-5b72-3b24-183c800c31a6@python.org> Message-ID: On 12.04.2019 21:05, Steve Dower wrote: > On 12Apr.2019 0919, Jeroen Demeyer wrote: >> The signal handler (that receives signals from the OS) in Python starts >> with a check >> >> ??? if (getpid() == main_pid) >> >> Looking at the comments, the intent was to do a check for the main >> *thread* but this is checking the *process* id. So this condition is >> basically always true. Therefore, I suggest to remove it in >> https://bugs.python.org/issue36601 >> >> If you have any objections or comments, I suggest to post them to that bpo. > To add a little more context, the check was added about 25 years ago as > a "hack" for some reason that we can't figure out anymore. > > So if you are a historian of ancient operating systems and know of one > that might have raised signal handlers in a different process from the > one where it was registered, we'd love to hear from you. According to https://www.linuxquestions.org/questions/programming-9/the-return-value-of-getpid-called-from-main-thread-and-new-thread-r-identical-624399/ , threads used to have different PIDs in the 2.4 Linux kernel. > Cheers, > Steve > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/vano%40mail.mipt.ru -- Regards, Ivan From greg.ewing at canterbury.ac.nz Fri Apr 12 18:34:08 2019 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 13 Apr 2019 10:34:08 +1200 Subject: [Python-Dev] No longer enable Py_TRACE_REFS by default in debug build In-Reply-To: References: <0a1871f0-14f8-8591-cf35-21cb3741f354@python.org> Message-ID: <5CB11260.1060708@canterbury.ac.nz> Victor Stinner wrote: > Python is used on devices with low memory (ex: 256 MiB for the whole > system). Allowing developers to use a debug build on such devices seem > to be a legit rationale for such change. Rather than removing features altogether, maybe the debug build could be split into a number of separate features that can be enabled individually? 
-- Greg From greg.ewing at canterbury.ac.nz Fri Apr 12 18:38:43 2019 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 13 Apr 2019 10:38:43 +1200 Subject: [Python-Dev] No longer enable Py_TRACE_REFS by default in debug build In-Reply-To: References: <0a1871f0-14f8-8591-cf35-21cb3741f354@python.org> Message-ID: <5CB11373.90607@canterbury.ac.nz> Victor Stinner wrote: > I'm not sure of what you means by "objects placed at static memory": > the double linked list of all Python objects is created at runtime. > _ob_next and _ob_prev are initialized statically to NULL. The trick of allocating extra memory in front of the object would be harder to pull off for statically allocated objects, although probably not impossible. -- Greg From vstinner at redhat.com Fri Apr 12 19:13:57 2019 From: vstinner at redhat.com (Victor Stinner) Date: Sat, 13 Apr 2019 01:13:57 +0200 Subject: [Python-Dev] No longer enable Py_TRACE_REFS by default in debug build In-Reply-To: <5CB11260.1060708@canterbury.ac.nz> References: <0a1871f0-14f8-8591-cf35-21cb3741f354@python.org> <5CB11260.1060708@canterbury.ac.nz> Message-ID: Le sam. 13 avr. 2019 ? 00:38, Greg Ewing a ?crit : > Rather than removing features altogether, maybe the debug > build could be split into a number of separate features > that can be enabled individually? I don't propose to *remove* a feature, but just to *disable* it *by default* (when Python is compiled in debug mode): "[WIP] bpo-36465: Py_DEBUG no longer implies Py_TRACE_REFS #12615" https://github.com/python/cpython/pull/12615/files In short, my change just removes: /* Py_DEBUG implies Py_TRACE_REFS. */ #if defined(Py_DEBUG) && !defined(Py_TRACE_REFS) #define Py_TRACE_REFS #endif The feature will still be accessible if you compile Python with Py_TRACE_REFS defined. In practice, I understood that the debug build of Python is not known by all core developers, and it seems like it's mostly used by core developers. Maybe it's even only used by core developers? It's hard to say. If it's only used by core developers, I hope that all core devs know to compile Python :-) Victor -- Night gathers, and now my watch begins. It shall not end until my death. From njs at pobox.com Fri Apr 12 19:43:51 2019 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 12 Apr 2019 16:43:51 -0700 Subject: [Python-Dev] No longer enable Py_TRACE_REFS by default in debug build In-Reply-To: References: <0a1871f0-14f8-8591-cf35-21cb3741f354@python.org> <15ff965b-30a9-dc4b-cc60-940532f349b9@python.org> Message-ID: On Thu, Apr 11, 2019 at 8:26 AM Steve Dower wrote: > > On 10Apr2019 1917, Nathaniel Smith wrote: > > It sounds like --with-pydebug has accumulated a big grab bag of > > unrelated features, mostly stuff that was useful at some point for > > some CPython dev trying to debug CPython itself? It's clearly not > > designed with end users as the primary audience, given that no-one > > knows what it actually does and that it makes third-party extensions > > really awkward to run. If that's right then I think Victor's plan of > > to sort through what it's actually doing makes a lot of sense, > > especially if we can remove the ABI breaking stuff, since that causes > > a disproportionate amount of trouble. > > Does it really cause a "disproportionate" amount of trouble? It's > definitely not meant for anyone who isn't working on C code, whether in > CPython, an extension or a host application. 
If you want to use > third-party extensions and are not able to rebuild them, that's a very > good sign that you probably shouldn't be on the debug build at all. Well, here's what I mean by "disproportionate". Some of the costs of the ABI divergence are: - The first time I had to debug a C extension, I wasted a bunch of time trying to figure out how I was supposed to use Debian's 'python-dbg' package (the --with-pydebug build), before eventually figuring out that it was a red herring and what I actually wanted was the -dbgsym package (their equivalent of MSVC's /Zi /DEBUG files). - The extension loading machinery has extra code and complexity to track the two different ABIs. The package ecosystem does too, e.g. distutils needs to name extensions appropriately, and we need special wheel tags, and pip needs code to handle these tags: https://github.com/pypa/pip/blob/54b6a91405adc79cdb8a2954e9614d6860799ccb/src/pip/_internal/pep425tags.py#L106-L109 - If you want some of the features of --with-pydebug that don't change the ABI, then you still have to rebuild third-party extensions to get at them, and that's a significant hassle. (I could do it if I had to, but my time has value.) - Everyone who uses ctypes to access a PyObject* has to include some extra hacks to handle the difference between the regular and debug ABIs. There are a few different versions that get copy/pasted around as folklore, and they're all pretty obscure. For example: https://github.com/pallets/jinja/blob/fd89fed7456e755e33ba70674c41be5ab222e193/jinja2/debug.py#L317-L334 https://github.com/johndpope/sims4-ai-engine/blob/865212e841c716dc4364e0dba286f02af8d716e8/core/framewrapper.py#L12-L41 https://github.com/python-trio/trio/blob/862ced04e1f19287e098380ed8a0635004c36dd1/trio/_core/_multierror.py#L282 And then if you want to test this code, it means you have to add a --with-pydebug build to your CI infrastructure... I don't know how many people use Py_TRACE_REFS, but if we can't find anyone on python-dev who uses it then it must be pretty rare. If dropping Py_TRACE_REFS would let us converge the ABIs and get rid of all the stuff above, then that seems like a pretty good trade! But maybe the Windows C runtime issue will foil this... > >> The reason we ship debug Python binaries is because debug builds use a > >> different C Runtime, so if you do a debug build of an extension module > >> you're working on it won't actually work with a non-debug build of CPython. > > > > ...But this is an important point. I'd forgotten that MSVC has a habit > > of changing the entire C runtime when you turn on the compiler's > > debugging mode. > > Technically they are separate options, but most project files are > configured such that *their* Debug/Release switch affects both the > compiler options (optimization) and the linker options (C runtime linkage). So how do other projects handle this? I guess historically the main target audience for Visual Studio was folks building monolithic apps, where you can just rebuild everything with whatever options you want, and compared to that Python extensions are messier. But Python isn't the only project in this boat. Do ruby, nodejs, R, etc., all provide separate debug builds with incompatible ABIs on Windows, and propagate that information throughout their module/package ecosystem? -n -- Nathaniel J. 
Smith -- https://vorpus.org From steve.dower at python.org Fri Apr 12 20:05:24 2019 From: steve.dower at python.org (Steve Dower) Date: Fri, 12 Apr 2019 17:05:24 -0700 Subject: [Python-Dev] No longer enable Py_TRACE_REFS by default in debug build In-Reply-To: References: <0a1871f0-14f8-8591-cf35-21cb3741f354@python.org> <15ff965b-30a9-dc4b-cc60-940532f349b9@python.org> Message-ID: On 12Apr.2019 1643, Nathaniel Smith wrote: > On Thu, Apr 11, 2019 at 8:26 AM Steve Dower wrote: >> >> On 10Apr2019 1917, Nathaniel Smith wrote: > I don't know how many people use Py_TRACE_REFS, but if we can't find > anyone on python-dev who uses it then it must be pretty rare. If > dropping Py_TRACE_REFS would let us converge the ABIs and get rid of > all the stuff above, then that seems like a pretty good trade! But > maybe the Windows C runtime issue will foil this... The very first question I asked was whether this would let us converge the ABIs, and the answer was "no". Otherwise I'd have said go for it, despite the C runtime issues. >>>> The reason we ship debug Python binaries is because debug builds use a >>>> different C Runtime, so if you do a debug build of an extension module >>>> you're working on it won't actually work with a non-debug build of CPython. >>> >>> ...But this is an important point. I'd forgotten that MSVC has a habit >>> of changing the entire C runtime when you turn on the compiler's >>> debugging mode. >> >> Technically they are separate options, but most project files are >> configured such that *their* Debug/Release switch affects both the >> compiler options (optimization) and the linker options (C runtime linkage). > > So how do other projects handle this? I guess historically the main > target audience for Visual Studio was folks building monolithic apps, > where you can just rebuild everything with whatever options you want, > and compared to that Python extensions are messier. But Python isn't > the only project in this boat. Do ruby, nodejs, R, etc., all provide > separate debug builds with incompatible ABIs on Windows, and propagate > that information throughout their module/package ecosystem? Mostly I hear complaints about those languages *not* providing any help here. Python is renowned for having significantly better Windows support than any of them, so they're the wrong comparison to make in my opinion. Arguing that we should regress because other languages haven't caught up to us yet makes no sense. The tools that are better than Python typically don't ship debug builds either, unless you specifically request them. But they also don't leak their implementation details all over the place. If we had a better C API, we wouldn't have users who needed to match ABIs. For the most part, disabling optimizations in your own extension but using the non-debug ABI is sufficient, and if you're having to deal with other people's packages then maybe you don't have any choice (though I do know of people who have built debug versions of numpy before - turns out Windows developers are often just as capable as non-Windows developers when it comes to building things ;) ). And yes, they could also build CPython from source as well to get the debug ABI, or get the debug symbols, but I saw enough need that I decided it was worth the effort to just solve that problem. 250k downloads a month is enough to justify it for me. 
Not to bring the packaging discussions to another venue, but maybe this is yet another area we need to stop pretending that we're able to solve every single problem with just the tools we already have available? People who want debug builds of packages can build them themselves, even numpy and scipy, they don't need us to preemptively do all their work for them. But we can (and should) help short-cut unnecessary effort or research by providing helpful tools and instruction. Cheers, Steve From njs at pobox.com Fri Apr 12 21:19:03 2019 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 12 Apr 2019 18:19:03 -0700 Subject: [Python-Dev] No longer enable Py_TRACE_REFS by default in debug build In-Reply-To: References: <0a1871f0-14f8-8591-cf35-21cb3741f354@python.org> <15ff965b-30a9-dc4b-cc60-940532f349b9@python.org> Message-ID: On Fri, Apr 12, 2019 at 5:05 PM Steve Dower wrote: > > On 12Apr.2019 1643, Nathaniel Smith wrote: > > On Thu, Apr 11, 2019 at 8:26 AM Steve Dower wrote: > >> > >> On 10Apr2019 1917, Nathaniel Smith wrote: > > I don't know how many people use Py_TRACE_REFS, but if we can't find > > anyone on python-dev who uses it then it must be pretty rare. If > > dropping Py_TRACE_REFS would let us converge the ABIs and get rid of > > all the stuff above, then that seems like a pretty good trade! But > > maybe the Windows C runtime issue will foil this... > > The very first question I asked was whether this would let us converge > the ABIs, and the answer was "no". > > Otherwise I'd have said go for it, despite the C runtime issues. I don't see that in the thread... just Victor saying he isn't sure whether there might be other ABI incompatibilities lurking that he hasn't found yet. Did I miss something? I'm mostly interested in this because of the possibility of converging the ABIs. If you think that the C runtime thing isn't a blocker for that, then that's useful information. Though obviously we still need to figure out whether there are any other blockers :-). > >>>> The reason we ship debug Python binaries is because debug builds use a > >>>> different C Runtime, so if you do a debug build of an extension module > >>>> you're working on it won't actually work with a non-debug build of CPython. > >>> > >>> ...But this is an important point. I'd forgotten that MSVC has a habit > >>> of changing the entire C runtime when you turn on the compiler's > >>> debugging mode. > >> > >> Technically they are separate options, but most project files are > >> configured such that *their* Debug/Release switch affects both the > >> compiler options (optimization) and the linker options (C runtime linkage). > > > > So how do other projects handle this? I guess historically the main > > target audience for Visual Studio was folks building monolithic apps, > > where you can just rebuild everything with whatever options you want, > > and compared to that Python extensions are messier. But Python isn't > > the only project in this boat. Do ruby, nodejs, R, etc., all provide > > separate debug builds with incompatible ABIs on Windows, and propagate > > that information throughout their module/package ecosystem? > > Mostly I hear complaints about those languages *not* providing any help > here. Python is renowned for having significantly better Windows support > than any of them, so they're the wrong comparison to make in my opinion. > Arguing that we should regress because other languages haven't caught up > to us yet makes no sense. 
> > The tools that are better than Python typically don't ship debug builds > either, unless you specifically request them. But they also don't leak > their implementation details all over the place. If we had a better C > API, we wouldn't have users who needed to match ABIs. Do you happen to have a list of places where the C API leaks details of the underlying CRT? (I'm mostly curious because whenever I've looked my conclusion was essentially: "Well....... I don't see any places that are *definitely* broken, so maybe mixing CRTs is fine? but I have zero confidence that I caught everything, so probably better to play it safe?". At least on py3 ? I know the py2 C API was definitely broken if you mixed CRTs, because of the exposed FILE*.) > For the most part, disabling optimizations in your own extension but > using the non-debug ABI is sufficient, and if you're having to deal with > other people's packages then maybe you don't have any choice (though I > do know of people who have built debug versions of numpy before - turns > out Windows developers are often just as capable as non-Windows > developers when it comes to building things ;) I'm not sure why you think I was implying otherwise? I'm sorry if you thought I was attacking your users or something. I did say that I thought most users downloading the debug builds were probably confused about what they were actually getting, but I didn't mean because they were stupid Windows users, I meant because the debug builds are so confusing that even folks on the Python core team are confused about what they're actually getting. -n -- Nathaniel J. Smith -- https://vorpus.org From vstinner at redhat.com Fri Apr 12 21:35:42 2019 From: vstinner at redhat.com (Victor Stinner) Date: Sat, 13 Apr 2019 03:35:42 +0200 Subject: [Python-Dev] No longer enable Py_TRACE_REFS by default in debug build In-Reply-To: References: <0a1871f0-14f8-8591-cf35-21cb3741f354@python.org> <15ff965b-30a9-dc4b-cc60-940532f349b9@python.org> Message-ID: >> The very first question I asked was whether this would let us converge >> the ABIs, and the answer was "no". The answer is yes and it's my primary goal. See my first email: "This change makes the debug build ABI closer to the release build ABI". To be honest, I am now lost in this long thread :-) I don't recall why I started to argue so much about the memory footprint, it's not really the main point here. Victor >> Otherwise I'd have said go for it, despite the C runtime issues. > > I don't see that in the thread... just Victor saying he isn't sure > whether there might be other ABI incompatibilities lurking that he > hasn't found yet. Did I miss something? > > I'm mostly interested in this because of the possibility of converging > the ABIs. If you think that the C runtime thing isn't a blocker for > that, then that's useful information. Though obviously we still need > to figure out whether there are any other blockers :-). > >> >>>> The reason we ship debug Python binaries is because debug builds use a >> >>>> different C Runtime, so if you do a debug build of an extension module >> >>>> you're working on it won't actually work with a non-debug build of CPython. >> >>> >> >>> ...But this is an important point. I'd forgotten that MSVC has a habit >> >>> of changing the entire C runtime when you turn on the compiler's >> >>> debugging mode. 
>> >> >> >> Technically they are separate options, but most project files are >> >> configured such that *their* Debug/Release switch affects both the >> >> compiler options (optimization) and the linker options (C runtime linkage). >> > >> > So how do other projects handle this? I guess historically the main >> > target audience for Visual Studio was folks building monolithic apps, >> > where you can just rebuild everything with whatever options you want, >> > and compared to that Python extensions are messier. But Python isn't >> > the only project in this boat. Do ruby, nodejs, R, etc., all provide >> > separate debug builds with incompatible ABIs on Windows, and propagate >> > that information throughout their module/package ecosystem? >> >> Mostly I hear complaints about those languages *not* providing any help >> here. Python is renowned for having significantly better Windows support >> than any of them, so they're the wrong comparison to make in my opinion. >> Arguing that we should regress because other languages haven't caught up >> to us yet makes no sense. >> >> The tools that are better than Python typically don't ship debug builds >> either, unless you specifically request them. But they also don't leak >> their implementation details all over the place. If we had a better C >> API, we wouldn't have users who needed to match ABIs. > > Do you happen to have a list of places where the C API leaks details > of the underlying CRT? > > (I'm mostly curious because whenever I've looked my conclusion was > essentially: "Well....... I don't see any places that are *definitely* > broken, so maybe mixing CRTs is fine? but I have zero confidence that > I caught everything, so probably better to play it safe?". At least on > py3 ? I know the py2 C API was definitely broken if you mixed CRTs, > because of the exposed FILE*.) > >> For the most part, disabling optimizations in your own extension but >> using the non-debug ABI is sufficient, and if you're having to deal with >> other people's packages then maybe you don't have any choice (though I >> do know of people who have built debug versions of numpy before - turns >> out Windows developers are often just as capable as non-Windows >> developers when it comes to building things ;) > > I'm not sure why you think I was implying otherwise? I'm sorry if you > thought I was attacking your users or something. I did say that I > thought most users downloading the debug builds were probably confused > about what they were actually getting, but I didn't mean because they > were stupid Windows users, I meant because the debug builds are so > confusing that even folks on the Python core team are confused about > what they're actually getting. > > -n > > -- > Nathaniel J. Smith -- https://vorpus.org > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/vstinner%40redhat.com > -- Night gathers, and now my watch begins. It shall not end until my death. -------------- next part -------------- An HTML attachment was scrubbed... URL: From xdegaye at gmail.com Sat Apr 13 12:42:17 2019 From: xdegaye at gmail.com (Xavier de Gaye) Date: Sat, 13 Apr 2019 18:42:17 +0200 Subject: [Python-Dev] duplicate method names in tests Message-ID: The last post [1] in issue bpo-16079 lists the methods in Lib/test that have duplicate names and that should be fixed. 
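The reason these are worth fixing is that Python raises no error for a duplicate method name: the second definition silently rebinds the name, so the first test never runs. A small illustration (the class and method names here are made up):

    import unittest

    class ExampleTests(unittest.TestCase):
        def test_feature(self):
            self.assertEqual(1 + 1, 2)

        def test_feature(self):  # same name: silently replaces the test above
            self.assertTrue(True)

    # Test discovery only ever sees the second test_feature; the first
    # assertion is never executed.
    if __name__ == "__main__":
        unittest.main()
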
Xavier [1] https://bugs.python.org/issue16079#msg340168 From mark at hotpy.org Sun Apr 14 07:30:48 2019 From: mark at hotpy.org (Mark Shannon) Date: Sun, 14 Apr 2019 12:30:48 +0100 Subject: [Python-Dev] PEP 590 discussion In-Reply-To: <50d675b4-839c-6502-ad1a-a33ea9330000@redhat.com> References: <6c8356e3-f9b2-c39e-63c4-17f146d326b7@hotpy.org> <15b8a3d7-00ed-a5eb-475c-a3adee671b5f@hotpy.org> <5C9FEF82.50207@UGent.be> <421f8182-4bc8-b8cf-82d6-ca4a4fbd2013@hotpy.org> <50d675b4-839c-6502-ad1a-a33ea9330000@redhat.com> Message-ID: Hi, Petr On 10/04/2019 5:25 pm, Petr Viktorin wrote: > Hello! > I've had time for a more thorough reading of PEP 590 and the reference > implementation. Thank you for the work! > Overall, I like PEP 590's direction. I'd now describe the fundamental > difference between PEP 580 and PEP 590 as: > - PEP 580 tries to optimize all existing calling conventions > - PEP 590 tries to optimize (and expose) the most general calling > convention (i.e. fastcall) > > PEP 580 also does a number of other things, as listed in PEP 579. But I > think PEP 590 does not block future PEPs for the other items. > On the other hand, PEP 580 has a much more mature implementation -- and > that's where it picked up real-world complexity. > > PEP 590's METH_VECTORCALL is designed to handle all existing use cases, > rather than mirroring the existing METH_* varieties. > But both PEPs require the callable's code to be modified, so requiring > it to switch calling conventions shouldn't be a problem. > > Jeroen's analysis from > https://mail.python.org/pipermail/python-dev/2018-July/154238.html seems > to miss a step at the top: > > a. CALL_FUNCTION* / CALL_METHOD opcode > ????? calls > b. _PyObject_FastCallKeywords() > ????? which calls > c. _PyCFunction_FastCallKeywords() > ????? which calls > d. _PyMethodDef_RawFastCallKeywords() > ????? which calls > e. the actual C function (*ml_meth)() > > I think it's more useful to say that both PEPs bridge a->e (via > _Py_VectorCall or PyCCall_Call). > > > PEP 590 is built on a simple idea, formalizing fastcall. But it is > complicated by PY_VECTORCALL_ARGUMENTS_OFFSET and > Py_TPFLAGS_METHOD_DESCRIPTOR. > As far as I understand, both are there to avoid intermediate > bound-method object for LOAD_METHOD/CALL_METHOD. (They do try to be > general, but I don't see any other use case.) > Is that right? Not quite. Py_TPFLAGS_METHOD_DESCRIPTOR is for LOAD_METHOD/CALL_METHOD, it allows any callable descriptor to benefit from the LOAD_METHOD/CALL_METHOD optimisation. PY_VECTORCALL_ARGUMENTS_OFFSET exists so that callables that make onward calls with an additional argument can do so efficiently. The obvious example is bound-methods, but classes are at least as important. cls(*args) -> cls.new(cls, *args) -> cls.__init__(self, *args) > (I'm running out of time today, but I'll write more on why I'm asking, > and on the case I called "impossible" (while avoiding creation of a > "bound method" object), later.) > > > The way `const` is handled in the function signatures strikes me as too > fragile for public API. > I'd like if, as much as possible, PY_VECTORCALL_ARGUMENTS_OFFSET was > treated as a special optimization that extension authors can either opt > in to, or blissfully ignore. 
> That might mean: > - vectorcall, PyObject_VectorCallWithCallable, PyObject_VectorCall, > PyCall_MakeTpCall all formally take "PyObject *const *args" > - a na?ve callee must do "nargs &= ~PY_VECTORCALL_ARGUMENTS_OFFSET" > (maybe spelled as "nargs &= PY_VECTORCALL_NARGS_MASK"), but otherwise > writes compiler-enforced const-correct code. > - if PY_VECTORCALL_ARGUMENTS_OFFSET is set, the callee may modify > "args[-1]" (and only that, and after the author has read the docs). The updated minimal implementation now uses `const` arguments. Code that uses args[-1] must explicitly cast away the const. https://github.com/markshannon/cpython/blob/vectorcall-minimal/Objects/classobject.c#L55 > > > Another point I'd like some discussion on is that vectorcall function > pointer is per-instance. It looks this is only useful for type objects, > but it will add a pointer to every new-style callable object (including > functions). That seems wasteful. > Why not have a per-type pointer, and for types that need it (like > PyTypeObject), make it dispatch to an instance-specific function? Firstly, each callable has different behaviour, so it makes sense to be able to do the dispatch from caller to callee in one step. Having a per-object function pointer allows that. Secondly, callables are either large or transient. If large, then the extra few bytes makes little difference. If transient then, it matters even less. The total increase in memory is likely to be only a few tens of kilobytes, even for a large program. > > > Minor things: > - "Continued prohibition of callable classes as base classes" -- this > section reads as a final. Would you be OK wording this as something > other PEPs can tackle? > - "PyObject_VectorCall" -- this looks extraneous, and the reference > imlementation doesn't need it so far. Can it be removed, or justified? Yes, removing it makes sense. I can then rename the clumsily named "PyObject_VectorCallWithCallable" as "PyObject_VectorCall". > - METH_VECTORCALL is *not* strictly "equivalent to the currently > undocumented METH_FASTCALL | METH_KEYWORD flags" (it has the > ARGUMENTS_OFFSET complication). METH_VECTORCALL is just making METH_FASTCALL | METH_KEYWORD documented and public. Would you prefer that it has a different name to prevent confusion with over PY_VECTORCALL_ARGUMENTS_OFFSET? I don't like calling things "fast" or "new" as the names can easily become misleading. New College, Oxford is over 600 years old. Not so "new" any more :) > - I'd like to officially call this PEP "Vectorcall", see > https://github.com/python/peps/pull/984 > > > > Mark, what are your plans for next steps with PEP 590? If a volunteer > wanted to help you push this forward, what would be the best thing to > work on? The minimal implementation is also a complete implementation. Third party code can use the vectorcall protocol immediately use and be called efficiently from the interpreter. I think it is very close to being mergeable. To gain the promised performance improvements is obviously a lot more work, but can be done incrementally over the next few months. Cheers, Mark. From mark at hotpy.org Sun Apr 14 07:34:17 2019 From: mark at hotpy.org (Mark Shannon) Date: Sun, 14 Apr 2019 12:34:17 +0100 Subject: [Python-Dev] PEP 580 and PEP 590 comparison. Message-ID: Hi Petr, Thanks for spending time on this. I think the comparison of the two PEPs falls into two broad categories, performance and capability. I'll address capability first. Let's try a thought experiment. Consider PEP 580. 
It uses the old `tp_print` slot as an offset to mark the location of the CCall structure within the callable. Now suppose instead that it uses a `tp_flag` to mark the presence of an offset field and that the offset field is moved to the end of the TypeObject. This would not impact the capabilities of PEP 580. Now add a single line nargs ~= PY_VECTORCALL_ARGUMENTS_OFFSET here https://github.com/python/cpython/compare/master...jdemeyer:pep580#diff-1160d7c87cbab324fda44e7827b36cc9R570 which would make PyCCall_FastCall compatible with the PEP 590 vectorcall protocol. Now rebase the PEP 580 reference code on top of PEP 590 minimal implementation and make the vectorcall field of CFunction point to PyCCall_FastCall. The resulting hybrid is both a PEP 590 conformant implementation, and is at least as capable as the reference PEP 580 implementation. Therefore PEP 590, must be at least as capable at PEP 580. Now performance. Currently the PEP 590 implementation is intentionally minimal. It does nothing for performance. The benchmark Jeroen provides is a micro-benchmark that calls the same functions repeatedly. This is trivial and unrealistic. So, there is no real evidence either way. I will try to provide some. The point of PEP 590 is that it allows performance improvements by allowing callables more freedom of implementation. To repeat an example from an earlier email, which may have been overlooked, this code reduces the time to create ranges and small lists by about 30% https://github.com/markshannon/cpython/compare/vectorcall-minimal...markshannon:vectorcall-examples https://gist.github.com/markshannon/5cef3a74369391f6ef937d52cca9bfc8 To speed up calls to builtin functions by a measurable amount will need some work on argument clinic. I plan to have that done before PyCon in May. Cheers, Mark. From wieser.eric+numpy at gmail.com Sun Apr 14 02:54:04 2019 From: wieser.eric+numpy at gmail.com (Eric Wieser) Date: Sat, 13 Apr 2019 23:54:04 -0700 Subject: [Python-Dev] Fixing the ctypes implementation of the PEP3118 buffer interface Message-ID: I've recently been adding better support to Numpy 1.16 for interoperability with ctypes. In doing so, I came across two bugs in the implementation of the PEP3118 buffer interface within ctypes, affecting `Structure`s and arrays. Rather than repeating the issue summaries here, I've linked their tracker issues below, and the patches I filed to fix them. * https://bugs.python.org/issue32782 (patch: https://github.com/python/cpython/pull/5576) * https://bugs.python.org/issue32780 (patch: https://github.com/python/cpython/pull/5561) I've seen little to no response on either the bug tracker or the github PRs regarding these, so at the recommendation of the "Lifecycle of a Pull Request" am emailing this list. Without these fixes, numpy has no choice but to ignore the broken buffer interface that ctypes provides, and instead try to parse the ctypes types manually. The sooner this makes a CPython release, the sooner numpy can remove those workarounds. Thanks, Eric From tjreedy at udel.edu Sun Apr 14 16:42:13 2019 From: tjreedy at udel.edu (Terry Reedy) Date: Sun, 14 Apr 2019 16:42:13 -0400 Subject: [Python-Dev] Fixing the ctypes implementation of the PEP3118 buffer interface In-Reply-To: References: Message-ID: On 4/14/2019 2:54 AM, Eric Wieser wrote: > I've recently been adding better support to Numpy 1.16 for > interoperability with ctypes. 
> > In doing so, I came across two bugs in the implementation of the > PEP3118 buffer interface within ctypes, affecting `Structure`s and > arrays. Rather than repeating the issue summaries here, I've linked > their tracker issues below, and the patches I filed to fix them. > * https://bugs.python.org/issue32782 (patch: > https://github.com/python/cpython/pull/5576) memoryview(object).itemsize is 0 when object is ctypes structure and format. C expert needed to review 30-line patch, most of which is error handling. Patch includes new tests and blurb. > * https://bugs.python.org/issue32780 (patch: > https://github.com/python/cpython/pull/5561) A partial fix for a more complicated memoryview, ctypes structure and format, and itemsize situation. > I've seen little to no response on either the bug tracker or the > github PRs regarding these, so at the recommendation of the "Lifecycle > of a Pull Request" am emailing this list. The problem is that the currently listed ctypes and memoryview experts are not currently active. > Without these fixes, numpy has no choice but to ignore the broken > buffer interface that ctypes provides, and instead try to parse the > ctypes types manually. The sooner this makes a CPython release, the > sooner numpy can remove those workarounds. -- Terry Jan Reedy From Peixing.Xin at windriver.com Sun Apr 14 21:42:11 2019 From: Peixing.Xin at windriver.com (Xin, Peixing) Date: Mon, 15 Apr 2019 01:42:11 +0000 Subject: [Python-Dev] checking "errno" for math operaton is safe to determine the error status? In-Reply-To: <5CB025C1.2060708@canterbury.ac.nz> References: <8488FBC4EAAC5941BA4B85DD1ECCF1870133BB0437@ALA-MBD.corp.ad.wrs.com> <5fb2462a-ee2c-00e3-1928-d56510e29570@python.org> <8488FBC4EAAC5941BA4B85DD1ECCF1870133BB0B64@ALA-MBD.corp.ad.wrs.com> <5CB025C1.2060708@canterbury.ac.nz> Message-ID: <8488FBC4EAAC5941BA4B85DD1ECCF1870133BB22AC@ALA-MBD.corp.ad.wrs.com> VxWorks RTOS with 3rd party math lib. Thanks, Peixing -----Original Message----- From: Python-Dev [mailto:python-dev-bounces+peixing.xin=windriver.com at python.org] On Behalf Of Greg Ewing Sent: Friday, April 12, 2019 1:45 PM To: python-dev at python.org Subject: Re: [Python-Dev] checking "errno" for math operaton is safe to determine the error status? Xin, Peixing wrote: > On certain platform, expm1() is implemented as exp() minus 1. To calculate > expm1(-1420.0), that will call exp(-1420.0) then substract 1. You know, > exp(-1420.0) will underflow to zero and errno is set to ERANGE. As a > consequence the errno keeps set there when expm1() returns the correct result > -1. This sounds like a bug in that platform's implementation of expm1() to me. Which platform is it? -- Greg _______________________________________________ Python-Dev mailing list Python-Dev at python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/peixing.xin%40windriver.com From J.Demeyer at UGent.be Mon Apr 15 04:34:53 2019 From: J.Demeyer at UGent.be (Jeroen Demeyer) Date: Mon, 15 Apr 2019 10:34:53 +0200 Subject: [Python-Dev] PEP 580 and PEP 590 comparison. In-Reply-To: References: Message-ID: <5CB4422D.8030507@UGent.be> On 2019-04-14 13:34, Mark Shannon wrote: > I'll address capability first. I don't think that comparing "capability" makes a lot of sense since neither PEP 580 nor PEP 590 adds any new capabilities to CPython. They are meant to allow doing things faster, not to allow more things. 
And yes, the C call protocol can be implemented on top of the vectorcall protocol and conversely, but that doesn't mean much. > Now performance. > > Currently the PEP 590 implementation is intentionally minimal. It does > nothing for performance. So, we're missing some information here. What kind of performance improvements are possible with PEP 590 which are not in the reference implementation? > The benchmark Jeroen provides is a > micro-benchmark that calls the same functions repeatedly. This is > trivial and unrealistic. Well, it depends what you want to measure... I'm trying to measure precisely the thing that makes PEP 580 and PEP 590 different from the status-quo, so in that sense those benchmarks are very relevant. I think that the following 3 statements are objectively true: (A) Both PEP 580 and PEP 590 add a new calling convention, which is equally fast as builtin functions (and hence faster than tp_call). (B) Both PEP 580 and PEP 590 keep roughly the same performance as the status-quo for existing function/method calls. (C) While the performance of PEP 580 and PEP 590 is roughly the same, PEP 580 is slightly faster (based on the reference implementations linked from PEP 580 and PEP 590). Two caveats concerning (C): - the difference may be too small to matter. Relatively, it's a few percent of the call time but in absolute numbers, it's less than 10 CPU clock cycles. - there might be possible improvements to the reference implementation of either PEP 580/PEP 590. I don't expect big differences though. > To repeat an example > from an earlier email, which may have been overlooked, this code reduces > the time to create ranges and small lists by about 30% That's just a special case of the general fact (A) above and using the new calling convention for "type". It's an argument in favor of both PEP 580 and PEP 590, not for PEP 590 specifically. Jeroen. From J.Demeyer at UGent.be Mon Apr 15 04:38:11 2019 From: J.Demeyer at UGent.be (Jeroen Demeyer) Date: Mon, 15 Apr 2019 10:38:11 +0200 Subject: [Python-Dev] PEP 590 discussion In-Reply-To: References: <6c8356e3-f9b2-c39e-63c4-17f146d326b7@hotpy.org> <15b8a3d7-00ed-a5eb-475c-a3adee671b5f@hotpy.org> <5C9FEF82.50207@UGent.be> <421f8182-4bc8-b8cf-82d6-ca4a4fbd2013@hotpy.org> <50d675b4-839c-6502-ad1a-a33ea9330000@redhat.com> Message-ID: <5CB442F3.5060705@UGent.be> On 2019-04-14 13:30, Mark Shannon wrote: > PY_VECTORCALL_ARGUMENTS_OFFSET exists so that callables that make onward > calls with an additional argument can do so efficiently. The obvious > example is bound-methods, but classes are at least as important. > cls(*args) -> cls.new(cls, *args) -> cls.__init__(self, *args) But tp_new and tp_init take the "cls" and "self" as separate arguments, not as part of *args. So I don't see why you need PY_VECTORCALL_ARGUMENTS_OFFSET for this. > The updated minimal implementation now uses `const` arguments. > Code that uses args[-1] must explicitly cast away the const. > https://github.com/markshannon/cpython/blob/vectorcall-minimal/Objects/classobject.c#L55 That's better indeed. Jeroen. 
From solipsis at pitrou.net Mon Apr 15 06:50:00 2019 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 15 Apr 2019 12:50:00 +0200 Subject: [Python-Dev] No longer enable Py_TRACE_REFS by default in debug build References: <0a1871f0-14f8-8591-cf35-21cb3741f354@python.org> <15ff965b-30a9-dc4b-cc60-940532f349b9@python.org> Message-ID: <20190415125000.571364a0@fsol> On Thu, 11 Apr 2019 08:26:47 -0700 Steve Dower wrote: > On 10Apr2019 1917, Nathaniel Smith wrote: > > It sounds like --with-pydebug has accumulated a big grab bag of > > unrelated features, mostly stuff that was useful at some point for > > some CPython dev trying to debug CPython itself? It's clearly not > > designed with end users as the primary audience, given that no-one > > knows what it actually does and that it makes third-party extensions > > really awkward to run. If that's right then I think Victor's plan of > > to sort through what it's actually doing makes a lot of sense, > > especially if we can remove the ABI breaking stuff, since that causes > > a disproportionate amount of trouble. > > Does it really cause a "disproportionate" amount of trouble? It's > definitely not meant for anyone who isn't working on C code, whether in > CPython, an extension or a host application. If you want to use > third-party extensions and are not able to rebuild them, that's a very > good sign that you probably shouldn't be on the debug build at all. I can't really agree with that. There are third-party extensions that have non-trivial build requirements. The fact that you have to rebuild third-party dependencies is a strong deterrent against using pydebug builds even when they may be actually useful (for example when debugging an extension module of your own). If you could just install mainstream binary packages (e.g. from Anaconda or PyPI) on a debug build interpreter, the pain would go away. Regards Antoine. From senthil at uthcode.com Mon Apr 15 08:13:38 2019 From: senthil at uthcode.com (Senthil Kumaran) Date: Mon, 15 Apr 2019 05:13:38 -0700 Subject: [Python-Dev] Season of Docs Message-ID: Hello Python Developers, Google is running a program called Season of Docs ( https://developers.google.com/season-of-docs/) to encourage technical writers to improve the documentation of Open Source Projects. As Python-Dev, and Python Software Foundation, do you think: a) We should participate? b) If yes to a), are you willing to be a mentor and identify project ideas? If you are willing to mentor and have project ideas, please let us know and we can think about the next steps. The deadline for org application is April 23, 2019. This discussion started here https://discuss.python.org/t/will-python-apply-for-season-of-docs-and-allow-suborgs/ Thank you, Senthil -------------- next part -------------- An HTML attachment was scrubbed... URL: From stephane at wirtel.be Mon Apr 15 10:18:38 2019 From: stephane at wirtel.be (Stephane Wirtel) Date: Mon, 15 Apr 2019 16:18:38 +0200 Subject: [Python-Dev] [Core-mentorship] Season of Docs In-Reply-To: References: Message-ID: <20190415141838.245nppxwpt5fvwk6@xps> I don't know if Julien Palard is on this mailing list, but maybe he could be interested by this initiative. On 04/15, Senthil Kumaran wrote: >Hello Python Developers, > >Google is running a program called Season of Docs ( >https://developers.google.com/season-of-docs/) to encourage technical >writers to improve the documentation of Open Source Projects. > >As Python-Dev, and Python Software Foundation, do you think: > >a) We should participate? 
>b) If yes to a), are you willing to be a mentor and identify project ideas? > >If you are willing to mentor and have project ideas, please let us know and >we can think about the next steps. The deadline for org application is >April 23, 2019. > >This discussion started here >https://discuss.python.org/t/will-python-apply-for-season-of-docs-and-allow-suborgs/ > >Thank you, >Senthil >================================================== >Core-mentorship mailing list: core-mentorship at python.org >To unsubscribe send an email to core-mentorship-leave at python.org >https://mail.python.org/mm3/mailman3/lists/core-mentorship.python.org/ >Code of Conduct: https://www.python.org/psf/codeofconduct/ -- St?phane Wirtel - https://wirtel.be - @matrixise From alan.pope at canonical.com Mon Apr 15 07:21:17 2019 From: alan.pope at canonical.com (Alan Pope) Date: Mon, 15 Apr 2019 12:21:17 +0100 Subject: [Python-Dev] Collaboration on a set of Python snaps Message-ID: Hi Python devs, I work on the Snapcraft [0] team at Canonical. I'm looking for a Python contributor to collaborate with us on making snaps of supported releases of Python available in the Snap Store [1]. Travis CI and Canonical are looking for someone (preferably North-America based) to participate in an in-person Snapcraft Summit in downtown Montreal, Canada from 11th to 13th June. We're sponsoring a number of software vendors, device manufacturers people from the robotics sector to come. We'd love someone from the Python project to join us. We?ve published this blog post to explain the event in more detail: https://snapcraft.io/blog/snapcraft-summit-montreal The goal would be to create snaps of the major supported releases of Python, and authoritatively publish them in the Snap Store. This would enable users of many different Linux distributions to easily obtain up to date supported versions of Python directly from the Python project. It also enables providers of CI systems (such as Travis) to the latest builds of Python are easily available to developers who use their services. We've done this previously with NodeJS [2] and Ruby [3] - among others. It would be great to have Python available via this method too. All the best, Al. [0] - https://snapcraft.io/ [1] - https://snapcraft.io/store [2] - https://snapcraft.io/node [3] - https://snapcraft.io/ruby -- Alan Pope Community Advocate Canonical - Ubuntu Engineering and Services +44 (0) 7973 620 164 alan.pope at canonical.com http://ubuntu.com/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From jlehtosalo at gmail.com Mon Apr 15 12:42:23 2019 From: jlehtosalo at gmail.com (Jukka Lehtosalo) Date: Mon, 15 Apr 2019 09:42:23 -0700 Subject: [Python-Dev] PEP 589 discussion (TypedDict) happening at typing-sig@ Message-ID: Hi everyone, I submitted PEP 589 (TypedDict: Type Hints for Dictionaries with a Fixed Set of Keys) for discussion to typing-sig [1]. Here's an excerpt from the abstract of the PEP: PEP 484 defines the type Dict[K, V] for uniform dictionaries, where each value has the same type, and arbitrary key values are supported. It doesn't properly support the common pattern where the type of a dictionary value depends on the string value of the key. This PEP proposes a type constructor typing.TypedDict to support the use case where a dictionary object has a specific set of string keys, each with a value of a specific type. 
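As a concrete illustration, the class-based syntax proposed in the PEP looks roughly like this (a sketch only; until the PEP is accepted the same construct is available as mypy_extensions.TypedDict rather than typing.TypedDict):

    from mypy_extensions import TypedDict

    class Movie(TypedDict):
        name: str
        year: int

    movie: Movie = {'name': 'Blade Runner', 'year': 1982}
    movie['year'] = '1982'       # rejected by a type checker: 'year' expects int
    movie['genre'] = 'sci-fi'    # rejected: 'genre' is not a declared key

At runtime a TypedDict value is still a plain dict; the constraints above are enforced only by static type checkers.
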
Jukka Lehtosalo [1] https://mail.python.org/mailman3/lists/typing-sig.python.org/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve.dower at python.org Mon Apr 15 12:59:45 2019 From: steve.dower at python.org (Steve Dower) Date: Mon, 15 Apr 2019 09:59:45 -0700 Subject: [Python-Dev] No longer enable Py_TRACE_REFS by default in debug build In-Reply-To: References: <0a1871f0-14f8-8591-cf35-21cb3741f354@python.org> <15ff965b-30a9-dc4b-cc60-940532f349b9@python.org> Message-ID: <832f3456-3e46-fdca-4c51-8fe8ed111e38@python.org> On 12Apr2019 1819, Nathaniel Smith wrote: > On Fri, Apr 12, 2019 at 5:05 PM Steve Dower wrote: >> >> On 12Apr.2019 1643, Nathaniel Smith wrote: >>> On Thu, Apr 11, 2019 at 8:26 AM Steve Dower wrote: >> The very first question I asked was whether this would let us converge >> the ABIs, and the answer was "no". >> >> Otherwise I'd have said go for it, despite the C runtime issues. > > I don't see that in the thread... just Victor saying he isn't sure > whether there might be other ABI incompatibilities lurking that he > hasn't found yet. Did I miss something? "I don't know" means we can't say the APIs are converged, which is a no. I don't think you missed anything, but just read it through a different filter. > I'm mostly interested in this because of the possibility of converging > the ABIs. If you think that the C runtime thing isn't a blocker for > that, then that's useful information. Though obviously we still need > to figure out whether there are any other blockers :-). > [SNIP] > Do you happen to have a list of places where the C API leaks details > of the underlying CRT? > > (I'm mostly curious because whenever I've looked my conclusion was > essentially: "Well....... I don't see any places that are *definitely* > broken, so maybe mixing CRTs is fine? but I have zero confidence that > I caught everything, so probably better to play it safe?". At least on > py3 ? I know the py2 C API was definitely broken if you mixed CRTs, > because of the exposed FILE*.) Not since the discussions about migrating to VS 2015, but a few off the top of my head: * locale * file descriptors * stream buffers * thread locals * exception [handler] state (yes, there are exceptions used within the CRT, and they occasionally intentionally leak out past the C code) * atexit handlers * internal callbacks (mostly debug handlers, but since we're talking about debugging...) I'm pretty sure if I did some digging I'd be able to figure out which of these come from vcruntime140.dll vs ucrtbase.dll, and then come up with some far-too-clever linker options to make some of these more consistent, but there's no complete solution other than making sure you've got a complete debug or complete release build. >> For the most part, disabling optimizations in your own extension but >> using the non-debug ABI is sufficient, and if you're having to deal with >> other people's packages then maybe you don't have any choice (though I >> do know of people who have built debug versions of numpy before - turns >> out Windows developers are often just as capable as non-Windows >> developers when it comes to building things ;) > > I'm not sure why you think I was implying otherwise? I'm sorry if you > thought I was attacking your users or something. 
I did say that I > thought most users downloading the debug builds were probably confused > about what they were actually getting, but I didn't mean because they > were stupid Windows users, I meant because the debug builds are so > confusing that even folks on the Python core team are confused about > what they're actually getting. "Our users", please :) In my experience, Windows developers just treat debug and release builds as part of the normal development process. The only confusion I've seen has been related to CPython's not-quite-Windows-ish approach to debug builds, and in practically every case it's been enough to explain "release CPython uses a different CRT to your debug extension, but once you align those it'll be fine". I definitely *do not* want to force or encourage package developers to release debug ABI versions of their prebuilt packages. But at the same time I don't want to remove the benefits that debug builds currently include. Basically, I'm happy with the status quo, and the users I talk to are happy with it. So I'd rather not worry about optimising debug builds for speed or memory usage. (It's a question of direction more than anything else, and until we get some official statement of direction then I'll keep advocating a direction based on my experiences ;) ) Cheers, Steve From solipsis at pitrou.net Mon Apr 15 14:09:18 2019 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 15 Apr 2019 20:09:18 +0200 Subject: [Python-Dev] No longer enable Py_TRACE_REFS by default in debug build References: <0a1871f0-14f8-8591-cf35-21cb3741f354@python.org> <15ff965b-30a9-dc4b-cc60-940532f349b9@python.org> <20190415125000.571364a0@fsol> Message-ID: <20190415200918.760863f8@fsol> On Mon, 15 Apr 2019 12:50:00 +0200 Antoine Pitrou wrote: > On Thu, 11 Apr 2019 08:26:47 -0700 > Steve Dower wrote: > > On 10Apr2019 1917, Nathaniel Smith wrote: > > > It sounds like --with-pydebug has accumulated a big grab bag of > > > unrelated features, mostly stuff that was useful at some point for > > > some CPython dev trying to debug CPython itself? It's clearly not > > > designed with end users as the primary audience, given that no-one > > > knows what it actually does and that it makes third-party extensions > > > really awkward to run. If that's right then I think Victor's plan of > > > to sort through what it's actually doing makes a lot of sense, > > > especially if we can remove the ABI breaking stuff, since that causes > > > a disproportionate amount of trouble. > > > > Does it really cause a "disproportionate" amount of trouble? It's > > definitely not meant for anyone who isn't working on C code, whether in > > CPython, an extension or a host application. If you want to use > > third-party extensions and are not able to rebuild them, that's a very > > good sign that you probably shouldn't be on the debug build at all. > > I can't really agree with that. There are third-party extensions that > have non-trivial build requirements. The fact that you have to rebuild > third-party dependencies is a strong deterrent against using pydebug > builds even when they may be actually useful (for example when > debugging an extension module of your own). 
Oh, and as a datapoint, there are user requests for pydebug builds in Anaconda and conda-forge: https://github.com/ContinuumIO/anaconda-issues/issues/80 https://github.com/conda-forge/staged-recipes/issues/1593 The problem is, while it's technically relatively easy to build and distribute a special build of Python, to make it useful implies also building a whole separate distribution of Python libraries as well. I suspect the latter is why those issues were never acted upon. So, there's actual demand from people who would (probably) benefit from it, but are blocked by burden of recompiling all dependencies. Regards Antoine. From christian at python.org Mon Apr 15 16:44:58 2019 From: christian at python.org (Christian Heimes) Date: Mon, 15 Apr 2019 22:44:58 +0200 Subject: [Python-Dev] PEP 578: Python Runtime Audit Hooks In-Reply-To: References: Message-ID: <47700a82-7654-f36a-26c7-0fd13d7cd8f7@python.org> On 28/03/2019 23.35, Steve Dower wrote: > Hi all > > Time is short, but I'm hoping to get PEP 578 (formerly PEP 551) into > Python 3.8. Here's the current text for review and comment before I > submit to the Steering Council. > > The formatted text is at https://www.python.org/dev/peps/pep-0578/ > (update just pushed, so give it an hour or so, but it's fundamentally > the same as what's there) > > No Discourse post, because we don't have a python-dev equivalent there > yet, so please reply here for this one. > > Implementation is at https://github.com/zooba/cpython/tree/pep-578/ and > my backport to 3.7 (https://github.com/zooba/cpython/tree/pep-578-3.7/) > is already getting some real use (though this will not be added to 3.7, > unless people *really* want it, so the backport is just for reference). Hi Steve, (memory dump before I go to bed) Steve Grubb from Red Hat security pointed me to some interesting things [1]. For instance there is some work on a new O_MAYEXEC flag for open(). Steve came to similar conclusions like we, e.g. streaming code from stdin is insecure. I think it would be also beneficial to have auditing events for the import system to track when sys.path or import loaders are changed. Christian [1] https://marc.info/?l=linux-fsdevel&m=155535414414626&w=2 From steve.dower at python.org Mon Apr 15 17:17:04 2019 From: steve.dower at python.org (Steve Dower) Date: Mon, 15 Apr 2019 14:17:04 -0700 Subject: [Python-Dev] PEP 578: Python Runtime Audit Hooks In-Reply-To: <47700a82-7654-f36a-26c7-0fd13d7cd8f7@python.org> References: <47700a82-7654-f36a-26c7-0fd13d7cd8f7@python.org> Message-ID: <4e6ccd29-15c5-2704-04dd-0804f279638d@python.org> On 15Apr2019 1344, Christian Heimes wrote: > Hi Steve, > > (memory dump before I go to bed) > > Steve Grubb from Red Hat security pointed me to some interesting things > [1]. For instance there is some work on a new O_MAYEXEC flag for open(). > Steve came to similar conclusions like we, e.g. streaming code from > stdin is insecure. > > [1] https://marc.info/?l=linux-fsdevel&m=155535414414626&w=2 Thanks for the pointer! Using this for open_code() by default on platforms that support it might be a good opportunity in the future. But I'm glad I'm not the only one who thinks this is the right approach :) > I think it would be also beneficial to have auditing events for the > import system to track when sys.path or import loaders are changed. Already in there (kind of... 
the "import" events include the contents of the sys properties that are about to be used to resolve it - since these are plain-old lists, and can be easily reassigned, passing them through here allows you to add a check if you really want it but otherwise not pay the cost of replacing the sys module with a special implementation and its attributes with special lists). Cheers, Steve From stefano.borini at gmail.com Mon Apr 15 17:32:24 2019 From: stefano.borini at gmail.com (Stefano Borini) Date: Mon, 15 Apr 2019 22:32:24 +0100 Subject: [Python-Dev] Cannot find documented API in PEP-376 (Database of Installed Python Distributions) Message-ID: Hello, I am on a PEP scouting effort to check the current status of python packaging and its historical context, mostly for learning purposes. I noted that the PEP defines some functions for pkgutil (e.g. get_distributions), but I cannot find them. I tried to do some searching on the mailing list history, but I came up with pretty much nothing of value. It appears that the topic was last considered in 2009 (the year of the PEP). dist-info was then implemented, but I cannot find any information about the missing API, nor any additional PEP, except for a brief reference in PEP-427. Does anyone have some context for this? I understand it was 10 years ago, so it's mostly a curiosity. Thanks. -- Kind regards, Stefano Borini From p.f.moore at gmail.com Mon Apr 15 17:47:16 2019 From: p.f.moore at gmail.com (Paul Moore) Date: Mon, 15 Apr 2019 22:47:16 +0100 Subject: [Python-Dev] Cannot find documented API in PEP-376 (Database of Installed Python Distributions) In-Reply-To: References: Message-ID: On Mon, 15 Apr 2019 at 22:35, Stefano Borini wrote: > > Hello, > > I am on a PEP scouting effort to check the current status of python > packaging and its historical context, mostly for learning purposes. I > noted that the PEP defines some functions for pkgutil (e.g. > get_distributions), but I cannot find them. > I tried to do some searching on the mailing list history, but I came > up with pretty much nothing of value. It appears that the topic was > last considered in 2009 (the year of the PEP). dist-info was then > implemented, but I cannot find any information about the missing API, > nor any additional PEP, except for a brief reference in PEP-427. > > Does anyone have some context for this? > > I understand it was 10 years ago, so it's mostly a curiosity. Thanks. PEP 376 was part of a rather grand plan to re-engineer a lot of Python's packaging tools (distutils and setuptools at the time, mainly). Although the PEP was accepted, a lot of the coding never got done and ultimately the project was abandoned, and we moved over to a more incremental approach of improving what was there, rather than wholesale replacing things. So the PEP itself is something of a mixture now, some parts that are implemented, some parts that are relevant in principle but the details never got filled in, and some parts that simply never happened. >From what I recall (I was around at the time) a lot of the discussion was on distutils-sig - did you check the archives of that list in your searching? But there was a lot of what I would describe as "heated debate" going on at that point, so it may be hard to find anything particularly informative. Hopefully, that's of some use - good luck in your investigations! 
Paul From sully at msully.net Mon Apr 15 18:23:10 2019 From: sully at msully.net (Michael Sullivan) Date: Mon, 15 Apr 2019 15:23:10 -0700 Subject: [Python-Dev] No longer enable Py_TRACE_REFS by default in debug build Message-ID: > The main question is if anyone ever used Py_TRACE_REFS? Does someone > use sys.getobjects() or PYTHONDUMPREFS environment variable? I used sys.getobjects() today to track down a memory leak in the mypyc-compiled version of mypy. We were leaking memory badly but no sign of the leak was showing up in mypy's gc.get_objects() based profiler. Using a debug build and switching to sys.getobjects() showed that we were badly leaking int objects. A quick inspection of the values in question (large and random looking) suggested we were leaking hash values, and that quickly pointed me to https://github.com/mypyc/mypyc/pull/562. I don't have any strong feelings about whether to keep it in the "default" debug build, though. I was using a debug build that I built myself with every debug feature that seemed potentially useful. -sully -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Mon Apr 15 19:05:58 2019 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 15 Apr 2019 16:05:58 -0700 Subject: [Python-Dev] No longer enable Py_TRACE_REFS by default in debug build In-Reply-To: References: Message-ID: On Mon, Apr 15, 2019, 15:27 Michael Sullivan wrote: > > The main question is if anyone ever used Py_TRACE_REFS? Does someone > > use sys.getobjects() or PYTHONDUMPREFS environment variable? > > I used sys.getobjects() today to track down a memory leak in the > mypyc-compiled version of mypy. > > We were leaking memory badly but no sign of the leak was showing up in > mypy's gc.get_objects() based profiler. Using a debug build and switching > to sys.getobjects() showed that we were badly leaking int objects. A quick > inspection of the values in question (large and random looking) suggested > we were leaking hash values, and that quickly pointed me to > https://github.com/mypyc/mypyc/pull/562. > > I don't have any strong feelings about whether to keep it in the "default" > debug build, though. I was using a debug build that I built myself with > every debug feature that seemed potentially useful. > This is mostly to satisfy my curiosity, so feel free to ignore: did you try using address sanitizer or valgrind? -n > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sully at msully.net Mon Apr 15 19:58:07 2019 From: sully at msully.net (Michael Sullivan) Date: Mon, 15 Apr 2019 16:58:07 -0700 Subject: [Python-Dev] PEP 591 discussion (final qualifier) happening at typing-sig@ Message-ID: I've submitted PEP 591 (Adding a final qualifier to typing) for discussion to typing-sig [1]. Here's the abstract: This PEP proposes a "final" qualifier to be added to the ``typing`` module---in the form of a ``final`` decorator and a ``Final`` type annotation---to serve three related purposes: * Declaring that a method should not be overridden * Declaring that a class should not be subclassed * Declaring that a variable or attribute should not be reassigned Full text at https://www.python.org/dev/peps/pep-0591/ -sully [1] https://mail.python.org/mailman3/lists/typing-sig.python.org/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From philgagnon1 at gmail.com Mon Apr 15 21:06:58 2019 From: philgagnon1 at gmail.com (Philippe Gagnon) Date: Mon, 15 Apr 2019 21:06:58 -0400 Subject: [Python-Dev] PEP 589 discussion (TypedDict) happening at typing-sig@ In-Reply-To: References: Message-ID: Hi Jukka, Thanks for submitting this PEP, I think it will be a net plus for the python language. I have been using TypedDict as a mypy_extensions module and it's been a great help. I found that the one thing that may be less intuitive and its design is the totality property. The fact that you need to use inheritance to compose TypedDicts that contain both required and optional keys create situations that may be a little verbose for some use cases. Perhaps an "optional" property taking a list of keys that type checkers would recognize as (no surprise) optional could be an alternative design with some merit. Best regards, Philippe On Mon, Apr 15, 2019 at 12:44 PM Jukka Lehtosalo wrote: > Hi everyone, > > I submitted PEP 589 (TypedDict: Type Hints for Dictionaries with a Fixed > Set of Keys) for discussion to typing-sig [1]. > > Here's an excerpt from the abstract of the PEP: > > PEP 484 defines the type Dict[K, V] for uniform dictionaries, where each > value has the same type, and arbitrary key values are supported. It doesn't > properly support the common pattern where the type of a dictionary value > depends on the string value of the key. This PEP proposes a type > constructor typing.TypedDict to support the use case where a dictionary > object has a specific set of string keys, each with a value of a specific > type. > > Jukka Lehtosalo > > [1] https://mail.python.org/mailman3/lists/typing-sig.python.org/ > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/philgagnon1%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Mon Apr 15 23:12:33 2019 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 15 Apr 2019 20:12:33 -0700 Subject: [Python-Dev] PEP 591 discussion (final qualifier) happening at typing-sig@ In-Reply-To: References: Message-ID: On Mon, Apr 15, 2019 at 5:00 PM Michael Sullivan wrote: > > I've submitted PEP 591 (Adding a final qualifier to typing) for discussion to typing-sig [1]. I'm not on typing-sig [1] so I'm replying here. > Here's the abstract: > This PEP proposes a "final" qualifier to be added to the ``typing`` > module---in the form of a ``final`` decorator and a ``Final`` type > annotation---to serve three related purposes: > > * Declaring that a method should not be overridden > * Declaring that a class should not be subclassed > * Declaring that a variable or attribute should not be reassigned I've been meaning to start blocking subclassing at runtime (e.g. like [2]), so being able to express that to the typechecker seems like a nice addition. I'm assuming though that the '@final' decorator doesn't have any runtime effect, so I'd have to say it twice? @typing.final class MyClass(metaclass=othermod.Final): ... Or on 3.6+ with __init_subclass__, it's easy to define a @final decorator that works at runtime, but I guess this would have to be a different decorator? @typing.final @alsoruntime.final class MyClass: ... This seems kinda awkward. Have you considered giving it a runtime effect, or providing some way for users to combine these two things together on their own? 
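For illustration, a minimal sketch of the kind of runtime-enforcing decorator being asked about here, using __init_subclass__ on Python 3.6+; the name runtime_final is invented for this example and is not part of typing or any other stdlib module:

    def runtime_final(base):
        # Reject subclassing of *base* at class-creation time (sketch only).
        def __init_subclass__(cls, **kwargs):
            raise TypeError(
                f"{base.__name__} is final; {cls.__name__} may not subclass it")
        base.__init_subclass__ = classmethod(__init_subclass__)
        return base

    @runtime_final
    class MyClass:
        pass

    # class Broken(MyClass): pass   # raises TypeError at class creation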
-n [1] https://github.com/willingc/pep-communication/issues/1 [2] https://stackoverflow.com/a/3949004/1925449 -- Nathaniel J. Smith -- https://vorpus.org From sully at msully.net Mon Apr 15 23:57:53 2019 From: sully at msully.net (Michael Sullivan) Date: Mon, 15 Apr 2019 20:57:53 -0700 Subject: [Python-Dev] No longer enable Py_TRACE_REFS by default in debug build In-Reply-To: References: Message-ID: On Mon, Apr 15, 2019 at 4:06 PM Nathaniel Smith wrote: > On Mon, Apr 15, 2019, 15:27 Michael Sullivan wrote: > >> > The main question is if anyone ever used Py_TRACE_REFS? Does someone >> > use sys.getobjects() or PYTHONDUMPREFS environment variable? >> >> I used sys.getobjects() today to track down a memory leak in the >> mypyc-compiled version of mypy. >> >> We were leaking memory badly but no sign of the leak was showing up in >> mypy's gc.get_objects() based profiler. Using a debug build and switching >> to sys.getobjects() showed that we were badly leaking int objects. A quick >> inspection of the values in question (large and random looking) suggested >> we were leaking hash values, and that quickly pointed me to >> https://github.com/mypyc/mypyc/pull/562. >> >> I don't have any strong feelings about whether to keep it in the >> "default" debug build, though. I was using a debug build that I built >> myself with every debug feature that seemed potentially useful. >> > > This is mostly to satisfy my curiosity, so feel free to ignore: did you > try using address sanitizer or valgrind? > > I didn't, mostly because I assume that valgrind wouldn't play well with cpython. (I've never used address sanitizer.) I was curious, so I went back and tried it out. It turned out to not seem to need that much fiddling to get to work. It slows things down a *lot* and produced 17,000 "loss records", though, so maybe I don't have it working right. At a glance the records did not shed any light. I'd definitely believe that valgrind is up to the task of debugging this, but my initial take with it shed much less light than my sys.getobjects() approach. (Though note that my sys.getobjects() approach was slotting it into an existing python memory profiler we had hacked up, so...) -sully > -n > >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From sully at msully.net Tue Apr 16 03:48:20 2019 From: sully at msully.net (Michael Sullivan) Date: Tue, 16 Apr 2019 00:48:20 -0700 Subject: [Python-Dev] PEP 591 discussion (final qualifier) happening at typing-sig@ In-Reply-To: References: Message-ID: On Mon, Apr 15, 2019 at 8:12 PM Nathaniel Smith wrote: > On Mon, Apr 15, 2019 at 5:00 PM Michael Sullivan wrote: > > > > I've submitted PEP 591 (Adding a final qualifier to typing) for > discussion to typing-sig [1]. > > I'm not on typing-sig [1] so I'm replying here. > > > Here's the abstract: > > This PEP proposes a "final" qualifier to be added to the ``typing`` > > module---in the form of a ``final`` decorator and a ``Final`` type > > annotation---to serve three related purposes: > > > > * Declaring that a method should not be overridden > > * Declaring that a class should not be subclassed > > * Declaring that a variable or attribute should not be reassigned > > I've been meaning to start blocking subclassing at runtime (e.g. like > [2]), so being able to express that to the typechecker seems like a > nice addition. I'm assuming though that the '@final' decorator doesn't > have any runtime effect, so I'd have to say it twice? 
> > @typing.final > class MyClass(metaclass=othermod.Final): > ... > > Or on 3.6+ with __init_subclass__, it's easy to define a @final > decorator that works at runtime, but I guess this would have to be a > different decorator? > > @typing.final > @alsoruntime.final > class MyClass: > ... > > This seems kinda awkward. Have you considered giving it a runtime > effect, or providing some way for users to combine these two things > together on their own? > > Nothing else in typing does any type of runtime enforcement, so I'd be reluctant to start here. One approach would be doing something like this (maybe in a support module): if typing.TYPE_CHECKING: from typing import final else: from alsoruntime import final So that at checking time, the typechecker would use the typing final but at runtime we'd get something that does enforcement. (And for the pre-3.6 case, you could maybe use something like six.add_metaclass in order to specify the metaclass as a decorator.) I can add this as an example to the PEP. -sully > -n > > [1] https://github.com/willingc/pep-communication/issues/1 > [2] https://stackoverflow.com/a/3949004/1925449 > > -- > Nathaniel J. Smith -- https://vorpus.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vstinner at redhat.com Tue Apr 16 05:11:05 2019 From: vstinner at redhat.com (Victor Stinner) Date: Tue, 16 Apr 2019 11:11:05 +0200 Subject: [Python-Dev] No longer enable Py_TRACE_REFS by default in debug build In-Reply-To: References: Message-ID: Hi Michael, Do you know the tracemalloc module? Did you try it? It works on a regular Python (compiled in debug mode). I would be curious to know if tracemalloc also allows you to track the memory leak. sys.getobjects() is just a list of objects. Do you have a tool written on top of it to track memory leaks? If yes, how? Victor Le mar. 16 avr. 2019 ? 00:28, Michael Sullivan a ?crit : > > > The main question is if anyone ever used Py_TRACE_REFS? Does someone > > use sys.getobjects() or PYTHONDUMPREFS environment variable? > > I used sys.getobjects() today to track down a memory leak in the mypyc-compiled version of mypy. > > We were leaking memory badly but no sign of the leak was showing up in mypy's gc.get_objects() based profiler. Using a debug build and switching to sys.getobjects() showed that we were badly leaking int objects. A quick inspection of the values in question (large and random looking) suggested we were leaking hash values, and that quickly pointed me to https://github.com/mypyc/mypyc/pull/562. > > I don't have any strong feelings about whether to keep it in the "default" debug build, though. I was using a debug build that I built myself with every debug feature that seemed potentially useful. > > -sully > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/vstinner%40redhat.com -- Night gathers, and now my watch begins. It shall not end until my death. 
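For readers following along, the usual tracemalloc workflow Victor has in mind looks roughly like this (a sketch; run_suspected_leaky_workload is a placeholder for whatever code is being measured):

    import tracemalloc

    tracemalloc.start(25)   # keep up to 25 Python frames per allocation

    before = tracemalloc.take_snapshot()
    run_suspected_leaky_workload()   # placeholder for the code under test
    after = tracemalloc.take_snapshot()

    # Show the allocation sites that grew the most between the snapshots.
    for stat in after.compare_to(before, "lineno")[:10]:
        print(stat)

Whether this pinpoints a leak coming from a C extension depends on how meaningful the Python-level tracebacks are, which is what the rest of the thread turns on.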
From njs at pobox.com Tue Apr 16 06:05:56 2019 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 16 Apr 2019 03:05:56 -0700 Subject: [Python-Dev] No longer enable Py_TRACE_REFS by default in debug build In-Reply-To: References: Message-ID: On Mon, Apr 15, 2019 at 8:58 PM Michael Sullivan wrote: > > On Mon, Apr 15, 2019 at 4:06 PM Nathaniel Smith wrote: >> >> On Mon, Apr 15, 2019, 15:27 Michael Sullivan wrote: >>> >>> > The main question is if anyone ever used Py_TRACE_REFS? Does someone >>> > use sys.getobjects() or PYTHONDUMPREFS environment variable? >>> >>> I used sys.getobjects() today to track down a memory leak in the mypyc-compiled version of mypy. >>> >>> We were leaking memory badly but no sign of the leak was showing up in mypy's gc.get_objects() based profiler. Using a debug build and switching to sys.getobjects() showed that we were badly leaking int objects. A quick inspection of the values in question (large and random looking) suggested we were leaking hash values, and that quickly pointed me to https://github.com/mypyc/mypyc/pull/562. >>> >>> I don't have any strong feelings about whether to keep it in the "default" debug build, though. I was using a debug build that I built myself with every debug feature that seemed potentially useful. >> >> >> This is mostly to satisfy my curiosity, so feel free to ignore: did you try using address sanitizer or valgrind? >> > I didn't, mostly because I assume that valgrind wouldn't play well with cpython. (I've never used address sanitizer.) > > I was curious, so I went back and tried it out. > It turned out to not seem to need that much fiddling to get to work. It slows things down a *lot* and produced 17,000 "loss records", though, so maybe I don't have it working right. At a glance the records did not shed any light. > > I'd definitely believe that valgrind is up to the task of debugging this, but my initial take with it shed much less light than my sys.getobjects() approach. (Though note that my sys.getobjects() approach was slotting it into an existing python memory profiler we had hacked up, so...) valgrind on CPython is definitely a bit fiddly ? if you need it again you might check out Misc/README.valgrind. Supposedly memory sanitizer is just './configure --with-memory-sanitizer', but I haven't tried it either :-) -n -- Nathaniel J. Smith -- https://vorpus.org From vstinner at redhat.com Tue Apr 16 06:17:39 2019 From: vstinner at redhat.com (Victor Stinner) Date: Tue, 16 Apr 2019 12:17:39 +0200 Subject: [Python-Dev] No longer enable Py_TRACE_REFS by default in debug build In-Reply-To: References: Message-ID: Since Python 3.6, you can use PYTHONMALLOC=malloc for Valgrind: it avoids false alarms produced by the pymalloc allocator. Victor Le mar. 16 avr. 2019 ? 12:09, Nathaniel Smith a ?crit : > > On Mon, Apr 15, 2019 at 8:58 PM Michael Sullivan wrote: > > > > On Mon, Apr 15, 2019 at 4:06 PM Nathaniel Smith wrote: > >> > >> On Mon, Apr 15, 2019, 15:27 Michael Sullivan wrote: > >>> > >>> > The main question is if anyone ever used Py_TRACE_REFS? Does someone > >>> > use sys.getobjects() or PYTHONDUMPREFS environment variable? > >>> > >>> I used sys.getobjects() today to track down a memory leak in the mypyc-compiled version of mypy. > >>> > >>> We were leaking memory badly but no sign of the leak was showing up in mypy's gc.get_objects() based profiler. Using a debug build and switching to sys.getobjects() showed that we were badly leaking int objects. 
A quick inspection of the values in question (large and random looking) suggested we were leaking hash values, and that quickly pointed me to https://github.com/mypyc/mypyc/pull/562. > >>> > >>> I don't have any strong feelings about whether to keep it in the "default" debug build, though. I was using a debug build that I built myself with every debug feature that seemed potentially useful. > >> > >> > >> This is mostly to satisfy my curiosity, so feel free to ignore: did you try using address sanitizer or valgrind? > >> > > I didn't, mostly because I assume that valgrind wouldn't play well with cpython. (I've never used address sanitizer.) > > > > I was curious, so I went back and tried it out. > > It turned out to not seem to need that much fiddling to get to work. It slows things down a *lot* and produced 17,000 "loss records", though, so maybe I don't have it working right. At a glance the records did not shed any light. > > > > I'd definitely believe that valgrind is up to the task of debugging this, but my initial take with it shed much less light than my sys.getobjects() approach. (Though note that my sys.getobjects() approach was slotting it into an existing python memory profiler we had hacked up, so...) > > valgrind on CPython is definitely a bit fiddly ? if you need it again > you might check out Misc/README.valgrind. > > Supposedly memory sanitizer is just './configure > --with-memory-sanitizer', but I haven't tried it either :-) > > -n > > -- > Nathaniel J. Smith -- https://vorpus.org > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/vstinner%40redhat.com -- Night gathers, and now my watch begins. It shall not end until my death. From J.Demeyer at UGent.be Tue Apr 16 07:15:45 2019 From: J.Demeyer at UGent.be (Jeroen Demeyer) Date: Tue, 16 Apr 2019 13:15:45 +0200 Subject: [Python-Dev] PEP 590 discussion In-Reply-To: <5CA445BD.4040705@UGent.be> References: <6c8356e3-f9b2-c39e-63c4-17f146d326b7@hotpy.org> <15b8a3d7-00ed-a5eb-475c-a3adee671b5f@hotpy.org> <5C9FEF82.50207@UGent.be> <421f8182-4bc8-b8cf-82d6-ca4a4fbd2013@hotpy.org> <5CA445BD.4040705@UGent.be> Message-ID: <5CB5B961.6020704@UGent.be> On 2019-04-03 07:33, Jeroen Demeyer wrote: > Access to the class isn't possible currently and also not with PEP 590. > But it's easy enough to fix that: PEP 573 adds a new METH_METHOD flag to > change the signature of the C function (not the vectorcall wrapper). PEP > 580 supports this "out of the box" because I'm reusing the class also to > do type checks. But this shouldn't be an argument for or against either > PEP. Actually, in the answer above I only considered "is implementing PEP 573 possible?" but I did not consider the complexity of doing that. And in line with what I claimed about complexity before, I think that PEP 580 scores better in this regard. Take PEP 580 and assume for the sake of argument that it didn't already have the cc_parent field. Then adding support for PEP 573 is easy: just add the cc_parent field to the C call protocol structure and set that field when initializing a method_descriptor. C functions can use the METH_DEFARG flag to get access to the PyCCallDef structure, which gives cc_parent. Implementing PEP 573 for a custom function class takes no extra effort: it doesn't require any changes to that class, except for correctly initializing the cc_parent field. 
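As a rough sketch of the layout being described (only PyCCallDef, cc_parent and METH_DEFARG are names taken from the discussion; the remaining field names and types are illustrative placeholders, not the PEP's exact definition):

    #include <Python.h>
    #include <stdint.h>

    /* Illustrative stand-in for PEP 580's per-function C call record: the
       point is that the defining class travels next to the function pointer,
       so a C function flagged with METH_DEFARG can reach it. */
    typedef struct {
        uint32_t  cc_flags;    /* e.g. METH_DEFARG (placeholder value set)   */
        void     *cc_func;     /* the C function to call (placeholder type)  */
        PyObject *cc_parent;   /* defining class (or module), per the thread */
    } PyCCallDef_sketch;

    /* Supporting PEP 573 in a custom function class then amounts to filling
       in cc_parent once, when the function object is created: */
    static void
    ccalldef_init(PyCCallDef_sketch *cc, void *func, PyObject *defining_class)
    {
        cc->cc_flags = 0;
        cc->cc_func = func;
        Py_XINCREF(defining_class);
        cc->cc_parent = defining_class;
    }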
Since PEP 580 has built-in support for methods, nothing special needs to be done to support methods too. With PEP 590 on the other hand, every single class which is involved in PEP 573 must be changed and every single vectorcall wrapper supporting PEP 573 must be changed. This is not limited to the function class itself, also the corresponding method class (for example, builtin_function_or_method for method_descriptor) needs to be changed. Jeroen From christian at python.org Tue Apr 16 07:39:59 2019 From: christian at python.org (Christian Heimes) Date: Tue, 16 Apr 2019 13:39:59 +0200 Subject: [Python-Dev] PEP 578: Python Runtime Audit Hooks In-Reply-To: <4e6ccd29-15c5-2704-04dd-0804f279638d@python.org> References: <47700a82-7654-f36a-26c7-0fd13d7cd8f7@python.org> <4e6ccd29-15c5-2704-04dd-0804f279638d@python.org> Message-ID: <229b7489-5d3f-8638-f8f2-eda0fa7c041d@python.org> On 15/04/2019 23.17, Steve Dower wrote: > On 15Apr2019 1344, Christian Heimes wrote: >> Hi Steve, >> >> (memory dump before I go to bed) >> >> Steve Grubb from Red Hat security pointed me to some interesting things >> [1]. For instance there is some work on a new O_MAYEXEC flag for open(). >> Steve came to similar conclusions like we, e.g. streaming code from >> stdin is insecure. >> >> [1] https://marc.info/?l=linux-fsdevel&m=155535414414626&w=2 > > Thanks for the pointer! Using this for open_code() by default on > platforms that support it might be a good opportunity in the future. But > I'm glad I'm not the only one who thinks this is the right approach :) Here is the original patch on LWN with some links to presentations: https://lwn.net/Articles/774676/ The approach has one downside: The current user must have DAC executable permission for a file in order to open a file with O_MAYEXEC. That means we have to +x all Python files and PYC files, not just the files that are designed as entry points. >> I think it would be also beneficial to have auditing events for the >> import system to track when sys.path or import loaders are changed. > > Already in there (kind of... the "import" events include the contents of > the sys properties that are about to be used to resolve it - since these > are plain-old lists, and can be easily reassigned, passing them through > here allows you to add a check if you really want it but otherwise not > pay the cost of replacing the sys module with a special implementation > and its attributes with special lists). Yeah, it's complicated :/ Steve Grubb mentioned remote importers or hacks like mem_fd + dlopen() from /proc/self/fd as attack vectors. Mitigations and audit systems like IMA Appraisal only work properly if code has to hit the disk first. If an attacker or user can perform the equivalent of PROT_EXEC | PROT_WRITE, then IMA won't be able to 'see' the malicious code. https://www.tutorialspoint.com/How-to-use-remote-python-modules https://github.com/operatorequals/httpimport https://0x00sec.org/t/pure-python-in-memory-so-loading-without-shm/6453 https://github.com/nullbites/SnakeEater/blob/master/SnakeEater2.py Christian From christian at python.org Tue Apr 16 08:32:14 2019 From: christian at python.org (Christian Heimes) Date: Tue, 16 Apr 2019 14:32:14 +0200 Subject: [Python-Dev] PEP 578: Python Runtime Audit Hooks In-Reply-To: References: <2cb3740e-ebc2-1839-1d2e-73d1b9f0a445@python.org> <8466c9c4-b5dc-c6c5-6fe4-a49dc2f4f968@python.org> Message-ID: <03d9d56d-540c-0eae-2b69-2c0960bc030b@python.org> Sorry, I forgot to reply. 
Do you think it would make sense to split the PEP into two PEPs? The auditing hook and import opener hook are related, but distinct improvements. The auditing part looks solid and ready now. The import opener may need some more refinement. I would also like to get feedback from some Linux Kernel security engineers first. On 01/04/2019 18.31, Steve Dower wrote: > On 31Mar2019 0538, Christian Heimes wrote: >> I don't like the fact that the PEP requires users to learn and use an >> additional layer to handle native code. Although we cannot provide a >> fully secure hook for native code, we could at least try to provide a >> best effort hook and document the limitations. A bit more information >> would make the verified open function more useful, too. > > So instead they need to learn a significantly more complicated API? :) > (I was very happy to be able to say "it's the same as open(p, 'rb')"). > >> PyObject *PyImport_OpenForExecution( >> ???? const char *path, >> ???? const char *intent, >> ???? int flags, >> ???? PyObject *context >> ) >> >> - Path is an absolute (!) file path. The PEP doesn't specify if the file >> name is relative or absolute. IMO it should be always absolute. > > Yeah, this is fair enough. I'll add it as a requirement. > >> - The new intent argument lets the caller pass information how it >> intents to use the file, e.g. pythoncode, zipimport, nativecode (for >> loading a shared library/DLL), ctypes, ... This allows the verify hook >> to react on the intent and provide different verifications for e.g. >> Python code and native modules. > > I had an intent argument at one point and the feedback I got (from teams > who wanted to implement it) is that they wouldn't trust it anyway :) > > In each case there should be associated audit events for tracking the > intent (and interrupting at that point if it doesn't like the intended > action), but for the simple case of "let me open this specific file" it > doesn't really add much. And it almost certainly shouldn't impact > decision making. There is no need to trust the intent flag that much. I would like to have a way to further narrow down the scope for an open call. This would allow the caller to tell the hook "I want to open something that should be a shared library suitable for ctypes". It would allow tighter control. Audit events are useful and powerful. But I don't want to put too much burden on the auditing framwork. I prefer to have checks that prevent operations rather than allow operations and audit them. >> - The flags argument is for additional flags, e.g. return an opened file >> or None, open the file in text or binary mode, ... > > This just makes it harder for the hook implementer - now you have to > allow encoding/errors arguments and probably more. And as mentioned > above, there should be an audit event showing the intent before this > call, and a hook can reject it at that point (rather than verify without > actually returning the verified content). I retract this part of my proposal. With O_MAYEXEC it's better to always open the file, but then use the file's FD to retrieve the actual file name for dlopen(). That approach allows the Kernel to verify DAC permissions, prevents memfd_create() hacks through readlink, and simplifies the hook. * Linux: readlink("/proc/self/fd/%i") * macOS: fcntl F_GETPATH * Windows: GetFileInformationByHandleEx >> - Context is an optional Python object from the caller's context. For >> the import system, it could be the loader instance. 
> > I think the audit event covers this, unless you have some way of using > this context in mind that I can't think of? To be honest I don't have a good use case yet. I just like the idea of having a way to pass some custom thing into an API and to know who called an API. You seem to like it, too. Your hook has a void *userData, but it's not passed into the Python function. :) int PyImport_SetOpenForImportHook(hook_func handler, void *userData) Christian From vstinner at redhat.com Tue Apr 16 08:57:35 2019 From: vstinner at redhat.com (Victor Stinner) Date: Tue, 16 Apr 2019 14:57:35 +0200 Subject: [Python-Dev] PEP 578: Python Runtime Audit Hooks In-Reply-To: <03d9d56d-540c-0eae-2b69-2c0960bc030b@python.org> References: <2cb3740e-ebc2-1839-1d2e-73d1b9f0a445@python.org> <8466c9c4-b5dc-c6c5-6fe4-a49dc2f4f968@python.org> <03d9d56d-540c-0eae-2b69-2c0960bc030b@python.org> Message-ID: On Tue, 16 Apr 2019 at 14:35, Christian Heimes wrote: > * Linux: readlink("/proc/self/fd/%i") That doesn't work if /proc is not mounted, which can occur in a container (where /proc is neither mounted nor bind-mounted from the host /proc). Victor From christian at python.org Tue Apr 16 09:09:34 2019 From: christian at python.org (Christian Heimes) Date: Tue, 16 Apr 2019 15:09:34 +0200 Subject: [Python-Dev] PEP 578: Python Runtime Audit Hooks In-Reply-To: References: <2cb3740e-ebc2-1839-1d2e-73d1b9f0a445@python.org> <8466c9c4-b5dc-c6c5-6fe4-a49dc2f4f968@python.org> <03d9d56d-540c-0eae-2b69-2c0960bc030b@python.org> Message-ID: <9bcfb93b-d52f-7284-7bc6-d629edebacd5@python.org> On 16/04/2019 14.57, Victor Stinner wrote: > On Tue, 16 Apr 2019 at 14:35, Christian Heimes wrote: >> * Linux: readlink("/proc/self/fd/%i") > > That doesn't work if /proc is not mounted, which can occur in a > container (where /proc is neither mounted nor bind-mounted from the host /proc). No, it won't work. But there is much more that breaks when /proc is not mounted. Therefore all container runtimes mount /proc and /sys into containers. I checked systemd-nspawn, podman, and docker. Christian From vstinner at redhat.com Tue Apr 16 10:24:07 2019 From: vstinner at redhat.com (Victor Stinner) Date: Tue, 16 Apr 2019 16:24:07 +0200 Subject: [Python-Dev] bpo-36558: Change time.mktime() return type from float to int? Message-ID: Hi, time.mktime() looks "inconsistent" to me and I would like to change it, but I'm not sure how it impacts backward compatibility. https://bugs.python.org/issue36558 time.mktime() returns a floating point number: >>> type(time.mktime(time.localtime())) <class 'float'> The documentation says: "It returns a floating point number, for compatibility with :func:`.time`." time.time() returns a float because it has sub-second resolution, but the C function mktime() returns an integer number of seconds. Would it make sense to change mktime() return type from float to int? I would like to change mktime() return type to make the function more consistent: all inputs are integers, so it sounds wrong to me to return a float. The result should be an integer as well. How much code would it break? I guess that the main impact is unit tests relying on the exact repr(time.mktime(t)) value. But it's easy to fix the tests: use int(time.mktime(t)) or "%.0f" % time.mktime(t) to never get ".0", or use float(time.mktime(t)) to explicitly cast to a float (which would be a bad but quick fix). Note: I wrote and implemented PEP 564 to avoid any precision loss. mktime() will not start losing precision before year 285,422,891 (which is quite far in the future ;-)).
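To make the behaviour under discussion concrete, here is a small illustration of the current float return value and of the test fixes suggested above (the numbers are examples only and depend on the local clock and timezone):

    import time

    t = time.localtime()
    print(time.mktime(t))            # e.g. 1555426567.0 -- currently a float

    # Workarounds for tests that compare repr() output:
    print(int(time.mktime(t)))       # e.g. 1555426567
    print("%.0f" % time.mktime(t))   # e.g. '1555426567' -- never ends in '.0'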
Victor -- Night gathers, and now my watch begins. It shall not end until my death. From paul at ganssle.io Tue Apr 16 10:41:05 2019 From: paul at ganssle.io (Paul Ganssle) Date: Tue, 16 Apr 2019 10:41:05 -0400 Subject: [Python-Dev] bpo-36558: Change time.mktime() return type from float to int? In-Reply-To: References: Message-ID: I already chimed in on the issue, but for the list, I'll boil my comments down to two questions: 1. For anyone who knows: when the documentation refers to "compatibility with `.time`", is that just saying it was designed that way because .time returns a float (i.e. for /consistency/ with `.time()`), or is there some practical reason that you would want `.time()` and `.mktime()` to return the same type? 2. Mainly for Victor, but anyone can answer: I agree that the natural output of `mktime()` would be `int` if I were designing it today, but would there be any /practical/ benefits for making this change? Are there problems cropping up because it's returning a float? Is it faster to return an integer? Best, Paul On 4/16/19 10:24 AM, Victor Stinner wrote: > Hi, > > time.mktime() looks "inconsistent" to me and I would like to change > it, but I'm not sure how it impacts backward compatibility. > https://bugs.python.org/issue36558 > > time.mktime() returns a floating point number: > >>>> type(time.mktime(time.localtime())) > > > The documentation says: > > "It returns a floating point number, for compatibility with :func:`.time`." > > time.time() returns a float because it has sub-second resolution, but > the C function mktime() returns an integer number of seconds. > > Would it make sense to change mktime() return type from float to int? > > I would like to change mktime() return type to make the function more > consistent: all inputs are integers, it sounds wrong to me to return > float. The result should be integer as well. > > How much code would it break? I guess that the main impact are unit > tests relying on repr(time.mktime(t)) exact value. But it's easy to > fix the tests: use int(time.mktime(t)) or "%.0f" % time.mktime(t) to > never get ".0", or use float(time.mktime(t))) to explicitly cast for a > float (that which be a bad but quick fix). > > Note: I wrote and implemented the PEP 564 to avoid any precision loss. > mktime() will not start loosing precision before year 285,422,891 > (which is quite far in the future ;-)). > > Victor -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: OpenPGP digital signature URL: From stephane at wirtel.be Tue Apr 16 10:44:22 2019 From: stephane at wirtel.be (=?utf-8?B?U3TDqXBoYW5l?= Wirtel) Date: Tue, 16 Apr 2019 16:44:22 +0200 Subject: [Python-Dev] bpo-36558: Change time.mktime() return type from float to int? In-Reply-To: References: Message-ID: <20190416144422.3bwbgl4cvr5rgn3d@xps> >I would like to change mktime() return type to make the function more >consistent: all inputs are integers, it sounds wrong to me to return >float. The result should be integer as well. In C, the signature of mktime is time_t mktime(struct tm *time); from Wikipedia, the Unix time_t data type, on many platforms, is a signed integer, tradionally (32bits). In the newer operating systems, time_t has been widened to 64 bits. 
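A minimal C illustration of that signature, for reference (the printed value is system-dependent):

    #include <stdio.h>
    #include <time.h>

    int main(void)
    {
        time_t now = time(NULL);
        struct tm *local = localtime(&now);   /* broken-down local time */
        time_t seconds = mktime(local);       /* back to integral seconds */

        /* time_t is an integral type (64-bit on most current systems),
           so there is no fractional part on the C side. */
        printf("%lld\n", (long long)seconds);
        return 0;
    }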
-- St?phane Wirtel - https://wirtel.be - @matrixise From vstinner at redhat.com Tue Apr 16 11:16:37 2019 From: vstinner at redhat.com (Victor Stinner) Date: Tue, 16 Apr 2019 17:16:37 +0200 Subject: [Python-Dev] bpo-36558: Change time.mktime() return type from float to int? In-Reply-To: References: Message-ID: Le mar. 16 avr. 2019 ? 16:44, Paul Ganssle a ?crit : > 2. Mainly for Victor, but anyone can answer: I agree that the natural output of `mktime()` would be `int` if I were designing it today, but would there be any practical benefits for making this change? It's just for the consistency of the function regarding to C function mktime() return type and its input types :-) > Are there problems cropping up because it's returning a float? None. Victor -- Night gathers, and now my watch begins. It shall not end until my death. From guido at python.org Tue Apr 16 11:46:31 2019 From: guido at python.org (Guido van Rossum) Date: Tue, 16 Apr 2019 08:46:31 -0700 Subject: [Python-Dev] bpo-36558: Change time.mktime() return type from float to int? In-Reply-To: References: Message-ID: On Tue, Apr 16, 2019 at 8:19 AM Victor Stinner wrote: > Le mar. 16 avr. 2019 ? 16:44, Paul Ganssle a ?crit : > > 2. Mainly for Victor, but anyone can answer: I agree that the natural > output of `mktime()` would be `int` if I were designing it today, but would > there be any practical benefits for making this change? > > It's just for the consistency of the function regarding to C function > mktime() return type and its input types :-) > But all Python times are reported or accept floats -- this allows sub-second precision without using complicated data structures. None of the C functions use floats. Consistency with C should not be the issue -- consistency between the time functions is important. > > Are there problems cropping up because it's returning a float? > > None. > So let's drop the idea. > Victor > -- > Night gathers, and now my watch begins. It shall not end until my death. > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (python.org/~guido) *Pronouns: he/him/his **(why is my pronoun here?)* -------------- next part -------------- An HTML attachment was scrubbed... URL: From vano at mail.mipt.ru Tue Apr 16 11:47:35 2019 From: vano at mail.mipt.ru (Ivan Pozdeev) Date: Tue, 16 Apr 2019 18:47:35 +0300 Subject: [Python-Dev] bpo-36558: Change time.mktime() return type from float to int? In-Reply-To: References: Message-ID: <17dc3ff7-b9f1-998f-abe8-cd02e388f5f0@mail.mipt.ru> On 16.04.2019 17:24, Victor Stinner wrote: > Hi, > > time.mktime() looks "inconsistent" to me and I would like to change > it, but I'm not sure how it impacts backward compatibility. > https://bugs.python.org/issue36558 > > time.mktime() returns a floating point number: > >>>> type(time.mktime(time.localtime())) > > > The documentation says: > > "It returns a floating point number, for compatibility with :func:`.time`." > > time.time() returns a float because it has sub-second resolution, but > the C function mktime() returns an integer number of seconds. > > Would it make sense to change mktime() return type from float to int? > > I would like to change mktime() return type to make the function more > consistent: all inputs are integers, it sounds wrong to me to return > float. The result should be integer as well. 
> > How much code would it break? I guess that the main impact are unit > tests relying on repr(time.mktime(t)) exact value. But it's easy to > fix the tests: use int(time.mktime(t)) or "%.0f" % time.mktime(t) to > never get ".0", or use float(time.mktime(t))) to explicitly cast for a > float (that which be a bad but quick fix). I envision it breaking code that relies on implicitly inferring the type of the result from the types of both operands (e.g. arithmetic operations). But for mktime() specifically, I presume the amount of such code very small. > Note: I wrote and implemented the PEP 564 to avoid any precision loss. > mktime() will not start loosing precision before year 285,422,891 > (which is quite far in the future ;-)). > > Victor -- Regards, Ivan From vstinner at redhat.com Tue Apr 16 12:20:05 2019 From: vstinner at redhat.com (Victor Stinner) Date: Tue, 16 Apr 2019 18:20:05 +0200 Subject: [Python-Dev] bpo-36558: Change time.mktime() return type from float to int? In-Reply-To: References: Message-ID: Le mar. 16 avr. 2019 ? 17:46, Guido van Rossum a ?crit : > Consistency with C should not be the issue -- consistency between the time functions is important. > (...) > So let's drop the idea. Ok, I'm fine with that. It was just an idea ;-) I closed the issue. Victor From steve.dower at python.org Tue Apr 16 16:53:26 2019 From: steve.dower at python.org (Steve Dower) Date: Tue, 16 Apr 2019 13:53:26 -0700 Subject: [Python-Dev] PEP 578: Python Runtime Audit Hooks In-Reply-To: <03d9d56d-540c-0eae-2b69-2c0960bc030b@python.org> References: <2cb3740e-ebc2-1839-1d2e-73d1b9f0a445@python.org> <8466c9c4-b5dc-c6c5-6fe4-a49dc2f4f968@python.org> <03d9d56d-540c-0eae-2b69-2c0960bc030b@python.org> Message-ID: <709491c7-702b-a74d-f341-2af6c19f1363@python.org> On 16Apr2019 0532, Christian Heimes wrote: > Sorry, I forgot to reply. > > Do you think it would make sense to split the PEP into two PEPs? The > auditing hook and import opener hook are related, but distinct > improvements. The auditing part looks solid and ready now. The import > opener may need some more refinement. I would also like to get feedback > from some Linux Kernel security engineers first. That will make three PEPs... The only question for the security engineers is "how much context do you need from the calling process", as that's the only thing that will affect the API. It doesn't have to have any implementation right now. And so far, all the context that's been proposed is "may be executed", which is already implied in the open_code() call. I haven't heard any more requests than "give us the filename and let us return the open (and exclusive) handle/descriptor", so this feels like YAGNI. > On 01/04/2019 18.31, Steve Dower wrote: >> In each case there should be associated audit events for tracking the >> intent (and interrupting at that point if it doesn't like the intended >> action), but for the simple case of "let me open this specific file" it >> doesn't really add much. And it almost certainly shouldn't impact >> decision making. > > There is no need to trust the intent flag that much. I would like to > have a way to further narrow down the scope for an open call. This would > allow the caller to tell the hook "I want to open something that should > be a shared library suitable for ctypes". It would allow tighter control. But those don't go through open(), they'll go through dlopen(), right? It's already a totally different code path from "open and read arbitrary bytes". > Audit events are useful and powerful. 
But I don't want to put too much > burden on the auditing framwork. I prefer to have checks that prevent > operations rather than allow operations and audit them. Right, and this is the default position for security defenders (to try and block things) ;) Auditing has been found to be a working balance >>> - Context is an optional Python object from the caller's context. For >>> the import system, it could be the loader instance. >> >> I think the audit event covers this, unless you have some way of using >> this context in mind that I can't think of? > > To be honest I don't have a good use case yet. I just like the idea to > have a way to pass some custom thing into an API and now who called an > API. You seem to like it, too. Your hook has a void *userData, but it's > not passed into the Python function. :) > > int PyImport_SetOpenForImportHook(hook_func handler, void *userData) There is no Python function for the (now named) open_code hook. It can only be set as a C function by an embedder, and that's when the userData is provided. Nothing to do with each individual call - just one value per CPython runtime. (Similarly with the audit hook, but for the Python hooks you can pass a closure or a method - in C you need a separate pointer for this.) Cheers, Steve From sully at msully.net Tue Apr 16 22:05:50 2019 From: sully at msully.net (Michael Sullivan) Date: Tue, 16 Apr 2019 19:05:50 -0700 Subject: [Python-Dev] No longer enable Py_TRACE_REFS by default in debug build In-Reply-To: References: Message-ID: On Tue, Apr 16, 2019 at 2:11 AM Victor Stinner wrote: > Hi Michael, > > Do you know the tracemalloc module? Did you try it? It works on a > regular Python (compiled in debug mode). > > I would be curious to know if tracemalloc also allows you to track the > memory leak. > > Playing around with it a little it does not seem super helpful here (unless I am missing something): it tracks the allocations based on the python call stack, which doesn't help here, in a C extension module generated from python code. Though, in the the mypyc case, we could implement a debug option for creating dummy frames so that we always have a useful call stack. That seems like less of an option for actual hand-written extension modules, though. (Though on the flip side, the python call stacks might be more useful there.) sys.getobjects() is just a list of objects. Do you have a tool written > on top of it to track memory leaks? If yes, how? > > Not really. We have a very simple memory profiler built on top of gc.get_objects() that just reports how many of different types of objects there are and how much memory they are using: https://github.com/python/mypy/blob/master/mypy/memprofile.py. I swapped out gc.get_objects() for sys.getobjects(), observed that we were leaking int objects, and inspected the live int objects, which gave a pretty good clue where the leak was. > Victor > > Le mar. 16 avr. 2019 ? 00:28, Michael Sullivan a ?crit > : > > > > > The main question is if anyone ever used Py_TRACE_REFS? Does someone > > > use sys.getobjects() or PYTHONDUMPREFS environment variable? > > > > I used sys.getobjects() today to track down a memory leak in the > mypyc-compiled version of mypy. > > > > We were leaking memory badly but no sign of the leak was showing up in > mypy's gc.get_objects() based profiler. Using a debug build and switching > to sys.getobjects() showed that we were badly leaking int objects. 
A quick > inspection of the values in question (large and random looking) suggested > we were leaking hash values, and that quickly pointed me to > https://github.com/mypyc/mypyc/pull/562. > > > > I don't have any strong feelings about whether to keep it in the > "default" debug build, though. I was using a debug build that I built > myself with every debug feature that seemed potentially useful. > > > > -sully > > _______________________________________________ > > Python-Dev mailing list > > Python-Dev at python.org > > https://mail.python.org/mailman/listinfo/python-dev > > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/vstinner%40redhat.com > > > > -- > Night gathers, and now my watch begins. It shall not end until my death. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From songofacandy at gmail.com Wed Apr 17 07:10:44 2019 From: songofacandy at gmail.com (Inada Naoki) Date: Wed, 17 Apr 2019 20:10:44 +0900 Subject: [Python-Dev] PEP 7: Adding anonymous union / struct Message-ID: Hi, all. PEP 7 includes some C99 features. I propose to add include anonymous union and struct to the list. https://www.geeksforgeeks.org/g-fact-38-anonymous-union-and-structure/ Anonymous union and struct are C11 feature, not C99. But gcc and MSVC supported it as language extension from before C11. Anonymous union is useful when all union members have different names. Especially, when we need to add dummy member only for padding / alignment, union name looks too verbose: ... # in some struct union { struct { int member1; int member2; } s; long double _dummy; // for largest alignment. } u; ... x.u.s.member1 = 42; vs ... union { struct { int member1; int member2; }; long double _dummy; // for largest alignment. }; ... x.member1 = 42; Does anyone know compiler which can be use to compile Python but doesn't support anonymous union / struct? Regards, -- Inada Naoki From vstinner at redhat.com Wed Apr 17 07:27:39 2019 From: vstinner at redhat.com (Victor Stinner) Date: Wed, 17 Apr 2019 13:27:39 +0200 Subject: [Python-Dev] PEP 7: Adding anonymous union / struct In-Reply-To: References: Message-ID: AIX is somehow supported and uses xlc compiler: does xlc support this C11 feature? Do you want to use it in Python 3.8 and newer only? Victor Le mer. 17 avr. 2019 ? 13:14, Inada Naoki a ?crit : > > Hi, all. > > PEP 7 includes some C99 features. > I propose to add include anonymous union and struct to the list. > https://www.geeksforgeeks.org/g-fact-38-anonymous-union-and-structure/ > > Anonymous union and struct are C11 feature, not C99. > But gcc and MSVC supported it as language extension from before C11. > > Anonymous union is useful when all union members have different names. > Especially, when we need to add dummy member only for padding / alignment, > union name looks too verbose: > > ... # in some struct > union { > struct { > int member1; > int member2; > } s; > long double _dummy; // for largest alignment. > } u; > ... > x.u.s.member1 = 42; > > vs > > ... > union { > struct { > int member1; > int member2; > }; > long double _dummy; // for largest alignment. > }; > ... > x.member1 = 42; > > > Does anyone know compiler which can be use to compile Python but > doesn't support anonymous union / struct? 
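A self-contained version of the comparison sketched above, which can be compiled as-is to check a given compiler (the struct and member names are illustrative, not CPython's actual PyGC_Head):

    #include <stdio.h>

    /* Named union: every access needs the extra ".u.s." path. */
    struct with_named_union {
        union {
            struct {
                int member1;
                int member2;
            } s;
            long double _dummy;   /* present only to force the largest alignment */
        } u;
    };

    /* Anonymous union and struct (C11, and a long-standing gcc/MSVC
       extension): the members are reachable directly on the outer struct. */
    struct with_anonymous_union {
        union {
            struct {
                int member1;
                int member2;
            };
            long double _dummy;
        };
    };

    int main(void)
    {
        struct with_named_union a = {{{1, 2}}};
        struct with_anonymous_union b;

        b.member1 = 42;   /* no ".u.s." prefix needed */
        printf("%d %d %zu %zu\n", a.u.s.member1, b.member1,
               sizeof a, sizeof b);   /* both get long double alignment */
        return 0;
    }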
> > Regards, > -- > Inada Naoki > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/vstinner%40redhat.com -- Night gathers, and now my watch begins. It shall not end until my death. From songofacandy at gmail.com Wed Apr 17 07:47:29 2019 From: songofacandy at gmail.com (Inada Naoki) Date: Wed, 17 Apr 2019 20:47:29 +0900 Subject: [Python-Dev] PEP 7: Adding anonymous union / struct In-Reply-To: References: Message-ID: On Wed, Apr 17, 2019 at 8:27 PM Victor Stinner wrote: > > AIX is somehow supported and uses xlc compiler: does xlc support this > C11 feature? I found the Language Reference for v11.1 (2010/4/13): https://www-01.ibm.com/support/docview.wss?uid=swg27017991 It mentions "anonymous union" on p. 73. I cannot find a language reference for versions older than v11.1, and I cannot find "anonymous struct" in v11.1 either. Maybe we should consider only anonymous unions? > > Do you want to use it in Python 3.8 and newer only? > Yes. In the case of bpo-27987, Python 3.6 and 3.7 use a named union for PyGC_Head, so changing the dummy from "double" to "long double" is enough. In the case of Python 3.8, I removed the dummy from PyGC_Head and stopped using a named union because it is already (implicitly) aligned to two words (16 bytes on 64-bit, 8 bytes on 32-bit platforms). But we can align it more explicitly by using an anonymous union, without adding many `.gc.` prefixes again. Regards, -- Inada Naoki From paul at ganssle.io Wed Apr 17 11:44:52 2019 From: paul at ganssle.io (Paul Ganssle) Date: Wed, 17 Apr 2019 11:44:52 -0400 Subject: [Python-Dev] Adding shlex.join? Message-ID: Hey all, I've been reviewing old "awaiting review" PRs recently, and about a week ago I found PR #7605 , adding shlex.join(), with a corresponding bug at bpo-22454 . The PR's implementation is simple and seems reasonable and decently well-tested, but it has been unreviewed for ~10 months. The reason I'm bringing it up here is that I believe the major blocker here is getting agreement to actually add the function. There doesn't seem to be much /opposition/ in the BPO issue, but given how infrequently the shlex module is changed I'm worried that there may be no one around who feels confident to judge how the interface should evolve. Does anyone feel strongly about this issue? Is there anyone who wants to make a yes/no decision on this feature? Best, Paul P.S. The PR's submitter seems responsive. I made a comment on the documentation and it was addressed in something like 5 minutes. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: OpenPGP digital signature URL: From guido at python.org Wed Apr 17 11:53:26 2019 From: guido at python.org (Guido van Rossum) Date: Wed, 17 Apr 2019 08:53:26 -0700 Subject: [Python-Dev] Adding shlex.join? In-Reply-To: References: Message-ID: I think it's fine to add this. On Wed, Apr 17, 2019 at 8:47 AM Paul Ganssle wrote: > Hey all, > > I've been reviewing old "awaiting review" PRs recently, and about a week > ago I found PR #7605 , > adding shlex.join(), with a corresponding bug at bpo-22454 > . The PR's implementation is simple > and seems reasonable and decently well-tested, but it has been unreviewed > for ~10 months.
> > The reason I'm bringing it up here is that I believe the major blocker > here is getting agreement to actually add the function. There doesn't seem > to be much *opposition* in the BPO issue, but given how infrequently the > shlex module is changed I'm worried that there may be no one around who > feels confident to judge how the interface should evolve. > > Does anyone feel strongly about this issue? Is there anyone who wants to > make a yes/no decision on this feature? > > Best, > Paul > > P.S. The PR's submitter seems responsive. I made a comment on the > documentation and it was addressed in something like 5 minutes. > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (python.org/~guido) *Pronouns: he/him/his **(why is my pronoun here?)* -------------- next part -------------- An HTML attachment was scrubbed... URL: From esr at thyrsus.com Wed Apr 17 12:29:25 2019 From: esr at thyrsus.com (Eric S. Raymond) Date: Wed, 17 Apr 2019 12:29:25 -0400 Subject: [Python-Dev] Adding shlex.join? In-Reply-To: References: Message-ID: <20190417162925.GA16594@thyrsus.com> Paul Ganssle : > Hey all, > > I've been reviewing old "awaiting review" PRs recently, and about a week > ago I found PR #7605 , > adding shlex.join(), with a corresponding bug at bpo-22454 > . The PR's implementation is simple > and seems reasonable and decently well-tested, but it has been > unreviewed for ~10 months. > > The reason I'm bringing it up here is that I believe the major blocker > here is getting agreement to actually add the function. There doesn't > seem to be much /opposition/ in the BPO issue, but given how > infrequently the shlex module is changed I'm worried that there may be > no one around who feels confident to judge how the interface should evolve. > > Does anyone feel strongly about this issue? Is there anyone who wants to > make a yes/no decision on this feature? > > Best, > Paul > > P.S. The PR's submitter seems responsive. I made a comment on the > documentation and it was addressed in something like 5 minutes. I'm the person who originally wrote shlex, which I guess makes me the authority on designer's intention. Providing this addition is properly unit-tested (which apparently it is) I don't have any objection to it. Seems like a reasonable idea. So I'll say yes. But I haven't touched this code in a long time. Maybe somebody on the core dev team thinks they own it now; if so, they might well be right. If so, that person should speak up. I suspect, however, that this code has nobody actively maintaining it because it Just Works. In which case, the authority to make this change should rest with the person who took the responsibility to review it. That would be *you.* So my advice is: pull the trigger. Get forgiveness if it turns out you need it. I don't expect you will. -- Eric S. Raymond -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From hodgestar+pythondev at gmail.com Wed Apr 17 13:54:29 2019 From: hodgestar+pythondev at gmail.com (Simon Cross) Date: Wed, 17 Apr 2019 19:54:29 +0200 Subject: [Python-Dev] Adding shlex.join? 
In-Reply-To: <20190417162925.GA16594@thyrsus.com> References: <20190417162925.GA16594@thyrsus.com> Message-ID: Software that "Just Works" and hasn't needed maintenance in years is the best software. :D -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Thu Apr 18 10:44:38 2019 From: guido at python.org (Guido van Rossum) Date: Thu, 18 Apr 2019 07:44:38 -0700 Subject: [Python-Dev] PEP 591 discussion (final qualifier) happening at typing-sig@ In-Reply-To: References: Message-ID: Yes, please add this to the PEP in the rejected ideas section, with the motivation for rejection -- the example can show how to work around it. On Tue, Apr 16, 2019 at 12:51 AM Michael Sullivan wrote: > On Mon, Apr 15, 2019 at 8:12 PM Nathaniel Smith wrote: > >> On Mon, Apr 15, 2019 at 5:00 PM Michael Sullivan >> wrote: >> > >> > I've submitted PEP 591 (Adding a final qualifier to typing) for >> discussion to typing-sig [1]. >> >> I'm not on typing-sig [1] so I'm replying here. >> >> > Here's the abstract: >> > This PEP proposes a "final" qualifier to be added to the ``typing`` >> > module---in the form of a ``final`` decorator and a ``Final`` type >> > annotation---to serve three related purposes: >> > >> > * Declaring that a method should not be overridden >> > * Declaring that a class should not be subclassed >> > * Declaring that a variable or attribute should not be reassigned >> >> I've been meaning to start blocking subclassing at runtime (e.g. like >> [2]), so being able to express that to the typechecker seems like a >> nice addition. I'm assuming though that the '@final' decorator doesn't >> have any runtime effect, so I'd have to say it twice? >> >> @typing.final >> class MyClass(metaclass=othermod.Final): >> ... >> >> Or on 3.6+ with __init_subclass__, it's easy to define a @final >> decorator that works at runtime, but I guess this would have to be a >> different decorator? >> >> @typing.final >> @alsoruntime.final >> class MyClass: >> ... >> >> This seems kinda awkward. Have you considered giving it a runtime >> effect, or providing some way for users to combine these two things >> together on their own? >> >> Nothing else in typing does any type of runtime enforcement, so I'd be > reluctant to start here. > > One approach would be doing something like this (maybe in a support > module): > if typing.TYPE_CHECKING: > from typing import final > else: > from alsoruntime import final > > So that at checking time, the typechecker would use the typing final but > at runtime we'd get something that does enforcement. > (And for the pre-3.6 case, you could maybe use something like > six.add_metaclass in order to specify the metaclass as a decorator.) > > I can add this as an example to the PEP. > > -sully > > >> -n >> >> [1] https://github.com/willingc/pep-communication/issues/1 >> [2] https://stackoverflow.com/a/3949004/1925449 >> >> -- >> Nathaniel J. Smith -- https://vorpus.org >> > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (python.org/~guido) *Pronouns: he/him/his **(why is my pronoun here?)* -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From status at bugs.python.org Fri Apr 19 14:07:53 2019 From: status at bugs.python.org (Python tracker) Date: Fri, 19 Apr 2019 18:07:53 +0000 (UTC) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20190419180753.DC2B552B253@bugs.ams1.psf.io> ACTIVITY SUMMARY (2019-04-12 - 2019-04-19) Python tracker at https://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message. Issues counts and deltas: open 7058 ( -5) closed 41363 (+56) total 48421 (+51) Open issues with patches: 2807 Issues opened (33) ================== #30485: Element.findall(path, dict) doesn't insert null namespace https://bugs.python.org/issue30485 reopened by scoder #36345: Deprecate Tools/scripts/serve.py in favour of python -m http.s https://bugs.python.org/issue36345 reopened by matrixise #36620: Documentation missing parameter for Itertools.zip_longest https://bugs.python.org/issue36620 opened by CharlesMerriam #36621: shutil.rmtree follows junctions on windows https://bugs.python.org/issue36621 opened by Jordan Hueckstaedt #36624: cleanup the stdlib and tests with regard to sys.platform usage https://bugs.python.org/issue36624 opened by Michael.Felt #36630: failure of test_colors_funcs in test_curses with ncurses 6.1 https://bugs.python.org/issue36630 opened by xdegaye #36631: test_urllib2net: test_ftp_no_timeout() killed after a timeout https://bugs.python.org/issue36631 opened by vstinner #36632: test_multiprocessing_forkserver: test_rapid_restart() leaked a https://bugs.python.org/issue36632 opened by vstinner #36634: venv: activate.bat fails for venv with parentheses in PATH https://bugs.python.org/issue36634 opened by BWenzel #36635: Add _testinternalcapi module https://bugs.python.org/issue36635 opened by vstinner #36636: Inner exception is not being raised using asyncio.gather https://bugs.python.org/issue36636 opened by Drew Budwin #36640: python ibm_db setup.py post install script does not seem to wo https://bugs.python.org/issue36640 opened by sabakauser #36643: Forward reference is not resolved by dataclasses.fields() https://bugs.python.org/issue36643 opened by mdrachuk #36644: Improve documentation of slice.indices() https://bugs.python.org/issue36644 opened by pewscorner #36645: re.sub() library entry does not adequately document surprising https://bugs.python.org/issue36645 opened by mollison #36646: os.listdir() got permission error in Python3.6 but it's fine i https://bugs.python.org/issue36646 opened by Ryan_D at 163.com #36647: TextTestRunner doesn't honour "buffer" argument https://bugs.python.org/issue36647 opened by Jos?? 
Luis Segura Lucas #36648: MAP_SHARED isn't proper for anonymous mappings for VxWorks https://bugs.python.org/issue36648 opened by lzhao #36650: Cached method implementation no longer works on Python 3.7.3 https://bugs.python.org/issue36650 opened by jaraco #36654: Add example to tokenize.tokenize https://bugs.python.org/issue36654 opened by Windson Yang #36656: Allow os.symlink(src, target, force=True) to prevent race cond https://bugs.python.org/issue36656 opened by Tom Hale #36658: Py_Initialze() throws error 'unable to load the file system en https://bugs.python.org/issue36658 opened by rvq #36659: distutils UnixCCompiler: Remove standard library path from rpa https://bugs.python.org/issue36659 opened by vstinner #36661: Missing dataclass decorator import in dataclasses module docs https://bugs.python.org/issue36661 opened by mfisherlevine #36662: asdict/astuple Dataclass methods https://bugs.python.org/issue36662 opened by gsakkis #36663: pdb: store whole exception information in locals (via user_exc https://bugs.python.org/issue36663 opened by blueyed #36664: argparse: parser aliases in subparsers stores alias in dest va https://bugs.python.org/issue36664 opened by Peter McEldowney #36665: REPL doesn't ensure builtins are available when implicitly rec https://bugs.python.org/issue36665 opened by ncoghlan #36666: threading.Thread should have way to catch an exception thrown https://bugs.python.org/issue36666 opened by Joel Croteau #36667: pdb: restore SIGINT handler in sigint_handler already https://bugs.python.org/issue36667 opened by blueyed #36668: semaphore_tracker is not reused by child processes https://bugs.python.org/issue36668 opened by tomMoral #36669: weakref proxy doesn't support the matrix multiplication operat https://bugs.python.org/issue36669 opened by bup #36670: test suite broken due to cpu usage feature on win 10/ german https://bugs.python.org/issue36670 opened by LorenzMende Most recent 15 issues with no replies (15) ========================================== #36669: weakref proxy doesn't support the matrix multiplication operat https://bugs.python.org/issue36669 #36668: semaphore_tracker is not reused by child processes https://bugs.python.org/issue36668 #36667: pdb: restore SIGINT handler in sigint_handler already https://bugs.python.org/issue36667 #36663: pdb: store whole exception information in locals (via user_exc https://bugs.python.org/issue36663 #36654: Add example to tokenize.tokenize https://bugs.python.org/issue36654 #36647: TextTestRunner doesn't honour "buffer" argument https://bugs.python.org/issue36647 #36644: Improve documentation of slice.indices() https://bugs.python.org/issue36644 #36643: Forward reference is not resolved by dataclasses.fields() https://bugs.python.org/issue36643 #36621: shutil.rmtree follows junctions on windows https://bugs.python.org/issue36621 #36613: asyncio._wait() don't remove callback in case of exception https://bugs.python.org/issue36613 #36606: calling super() causes __class__ to be not defined when sys.se https://bugs.python.org/issue36606 #36603: should pty.openpty() set pty/tty inheritable? 
https://bugs.python.org/issue36603 #36590: Add Bluetooth RFCOMM Support for Windows https://bugs.python.org/issue36590 #36589: Incorrect error handling in curses.update_lines_cols() https://bugs.python.org/issue36589 #36583: Do not swallow exceptions in the _ssl module https://bugs.python.org/issue36583 Most recent 15 issues waiting for review (15) ============================================= #36668: semaphore_tracker is not reused by child processes https://bugs.python.org/issue36668 #36667: pdb: restore SIGINT handler in sigint_handler already https://bugs.python.org/issue36667 #36659: distutils UnixCCompiler: Remove standard library path from rpa https://bugs.python.org/issue36659 #36648: MAP_SHARED isn't proper for anonymous mappings for VxWorks https://bugs.python.org/issue36648 #36645: re.sub() library entry does not adequately document surprising https://bugs.python.org/issue36645 #36635: Add _testinternalcapi module https://bugs.python.org/issue36635 #36634: venv: activate.bat fails for venv with parentheses in PATH https://bugs.python.org/issue36634 #36624: cleanup the stdlib and tests with regard to sys.platform usage https://bugs.python.org/issue36624 #36618: clang expects memory aligned on 16 bytes, but pymalloc aligns https://bugs.python.org/issue36618 #36613: asyncio._wait() don't remove callback in case of exception https://bugs.python.org/issue36613 #36612: Unittest document is not clear on SetUpClass calls https://bugs.python.org/issue36612 #36610: os.sendfile can return EINVAL on Solaris https://bugs.python.org/issue36610 #36608: Replace bundled pip and setuptools with a downloader in the en https://bugs.python.org/issue36608 #36602: Recursive directory list with pathlib.Path.iterdir https://bugs.python.org/issue36602 #36601: signals can be caught by any thread https://bugs.python.org/issue36601 Top 10 most discussed issues (10) ================================= #35866: concurrent.futures deadlock https://bugs.python.org/issue35866 17 msgs #36618: clang expects memory aligned on 16 bytes, but pymalloc aligns https://bugs.python.org/issue36618 14 msgs #30485: Element.findall(path, dict) doesn't insert null namespace https://bugs.python.org/issue30485 10 msgs #36646: os.listdir() got permission error in Python3.6 but it's fine i https://bugs.python.org/issue36646 10 msgs #27987: obmalloc's 8-byte alignment causes undefined behavior https://bugs.python.org/issue27987 8 msgs #36624: cleanup the stdlib and tests with regard to sys.platform usage https://bugs.python.org/issue36624 8 msgs #16079: list duplicate test names with patchcheck https://bugs.python.org/issue16079 7 msgs #31904: Python should support VxWorks RTOS https://bugs.python.org/issue31904 6 msgs #36634: venv: activate.bat fails for venv with parentheses in PATH https://bugs.python.org/issue36634 6 msgs #32782: memoryview & ctypes: incorrect itemsize for empty array https://bugs.python.org/issue32782 5 msgs Issues closed (53) ================== #2007: cookielib lacks FileCookieJar class for Internet Explorer https://bugs.python.org/issue2007 closed by inada.naoki #15917: hg hook to detect unmerged changesets https://bugs.python.org/issue15917 closed by inada.naoki #16254: Make PyUnicode_AsWideCharString() increase temporary https://bugs.python.org/issue16254 closed by inada.naoki #18610: wsgiref.validate expects wsgi.input read to give exactly one a https://bugs.python.org/issue18610 closed by cheryl.sabella #22991: test_gdb leaves the terminal in raw mode with gdb 7.8.1 https://bugs.python.org/issue22991 closed 
by xdegaye #23768: assert on getting the representation of a thread in atexit fun https://bugs.python.org/issue23768 closed by xdegaye #27326: SIGSEV in test_window_funcs of test_curses https://bugs.python.org/issue27326 closed by xdegaye #28055: pyhash's siphash24 assumes alignment of the data pointer https://bugs.python.org/issue28055 closed by vstinner #28809: mention asyncio.gather non-deterministic task starting order https://bugs.python.org/issue28809 closed by cheryl.sabella #31658: xml.sax.parse won't accept path objects https://bugs.python.org/issue31658 closed by scoder #32849: Fatal Python error: Py_Initialize: can't initialize sys standa https://bugs.python.org/issue32849 closed by vstinner #32913: Improve regular expression HOWTO https://bugs.python.org/issue32913 closed by brett.cannon #33783: Use proper class markup for random.Random docs https://bugs.python.org/issue33783 closed by vstinner #34814: makesetup: must link C extensions to libpython when compiled i https://bugs.python.org/issue34814 closed by vstinner #35581: Document @typing.type_check_only https://bugs.python.org/issue35581 closed by gvanrossum #35697: _decimal: Implement the previously rejected changes from #7442 https://bugs.python.org/issue35697 closed by vstinner #35755: On Unix, shutil.which() and subprocess no longer look for the https://bugs.python.org/issue35755 closed by vstinner #36071: Add support for Windows ARM32 in ctypes/libffi https://bugs.python.org/issue36071 closed by steve.dower #36227: Add default_namespace argument to xml.etree.ElementTree.tostri https://bugs.python.org/issue36227 closed by scoder #36263: test_hashlib.test_scrypt() fails on Fedora 29 https://bugs.python.org/issue36263 closed by vstinner #36348: test_imaplib.RemoteIMAP_STARTTLSTest.test_logout() fails rando https://bugs.python.org/issue36348 closed by vstinner #36427: Document that PyEval_RestoreThread and PyGILState_Ensure can t https://bugs.python.org/issue36427 closed by pablogsal #36466: Adding a way to strip annotations from compiled bytecode https://bugs.python.org/issue36466 closed by cary #36508: python-config --ldflags must not contain LINKFORSHARED ("-Xlin https://bugs.python.org/issue36508 closed by vstinner #36558: Change time.mktime() return type from float to int? https://bugs.python.org/issue36558 closed by vstinner #36572: python-snappy install issue during Crossbar install with Pytho https://bugs.python.org/issue36572 closed by SilentGhost #36585: test_posix.py fails due to unsupported RWF_HIPRI https://bugs.python.org/issue36585 closed by pablogsal #36593: isinstance check fails for Mock objects with spec executed und https://bugs.python.org/issue36593 closed by pablogsal #36600: re-enable test in nntplib https://bugs.python.org/issue36600 closed by Marcin Niemira #36605: make tags should also parse Modules/_io/*.c and Modules/_io/*. 
https://bugs.python.org/issue36605 closed by vstinner #36611: Debug memory allocators: remove useless "serialno" field to re https://bugs.python.org/issue36611 closed by vstinner #36616: Optimize thread state handling in function call code https://bugs.python.org/issue36616 closed by vstinner #36622: Inconsistent exponent notation formatting https://bugs.python.org/issue36622 closed by mark.dickinson #36623: Clean unused parser headers https://bugs.python.org/issue36623 closed by pablogsal #36625: Obsolete comments in docstrings in fractions module https://bugs.python.org/issue36625 closed by mark.dickinson #36626: asyncio run_forever blocks indefinitely https://bugs.python.org/issue36626 closed by asvetlov #36627: composing generator expression doesn't work as expected https://bugs.python.org/issue36627 closed by SilentGhost #36628: Enhancement: i-Strings https://bugs.python.org/issue36628 closed by SilentGhost #36629: imaplib test fails with errno 101 https://bugs.python.org/issue36629 closed by vstinner #36633: py_compile.compile: AttributeError on importlib.utils https://bugs.python.org/issue36633 closed by xtreak #36637: Restrict syntax for tuple literals with one element https://bugs.python.org/issue36637 closed by brett.cannon #36638: typeperf.exe is not in all skus of Windows SKUs https://bugs.python.org/issue36638 closed by steve.dower #36639: Provide list.rindex() https://bugs.python.org/issue36639 closed by rhettinger #36641: make docstring in C const https://bugs.python.org/issue36641 closed by inada.naoki #36642: make unicodedata "const" https://bugs.python.org/issue36642 closed by inada.naoki #36649: Windows Store app install registry keys have incorrect paths https://bugs.python.org/issue36649 closed by steve.dower #36651: Asyncio Event Loop documentation inconsistency (call_later and https://bugs.python.org/issue36651 closed by asvetlov #36652: Non-embedded zip distribution https://bugs.python.org/issue36652 closed by steve.dower #36653: Dictionary Key is without ' ' quotes https://bugs.python.org/issue36653 closed by mrabarnett #36655: Division Precision Problem https://bugs.python.org/issue36655 closed by christian.heimes #36657: AttributeError https://bugs.python.org/issue36657 closed by xtreak #36660: TypeError https://bugs.python.org/issue36660 closed by vstinner #1402289: Allow mappings as globals (was: Fix dictionary subclass ...) https://bugs.python.org/issue1402289 closed by rhettinger From alessandro.cucci at gmail.com Sat Apr 20 04:14:08 2019 From: alessandro.cucci at gmail.com (Alessandro Cucci) Date: Sat, 20 Apr 2019 10:14:08 +0200 Subject: [Python-Dev] Python Documentation Translation in italian language Message-ID: Hello folks, I want to start a project for translating the Python Documentation in Italian. I'm reading the PEP545, trying to understand how it works. I founded a Python User Group in my city and I can work with them on the translations, plus next month I will be speaker at Pycon Italy, so I can easily sponsor this project during the talk and reclute more people to work on that. Is there anybody who can help me to start? Thanks, have a nice day. *Alessandro Cucci* -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ikamenshchikov at gmail.com Sat Apr 20 04:56:44 2019 From: ikamenshchikov at gmail.com (Ilya Kamenshchikov) Date: Sat, 20 Apr 2019 10:56:44 +0200 Subject: [Python-Dev] Concurrent.futures: no type discovery for PyCharm Message-ID: I am using concurrent.futures to parallelize independent tasks on multiple cores once in a while. Each time I have a difficulty remembering the specific syntax and have to look it up in old code or google. I would much prefer to be able to find the source through the PyCharm and have autocompletion. It takes adding two lines to the __init__.py of concurrent.futures: (insert on line 19) from .process import ProcessPoolExecutor from .thread import ThreadPoolExecutor I would also guess that it would make the __getattr__ redundant? Am I missing something or can this change be done this way and would indeed be an improvement? Best Regards, -- Ilya Kamenshchikov -------------- next part -------------- An HTML attachment was scrubbed... URL: From songofacandy at gmail.com Sat Apr 20 11:34:10 2019 From: songofacandy at gmail.com (Inada Naoki) Date: Sun, 21 Apr 2019 00:34:10 +0900 Subject: [Python-Dev] Concurrent.futures: no type discovery for PyCharm In-Reply-To: References: Message-ID: See https://bugs.python.org/issue32596 -------------- next part -------------- An HTML attachment was scrubbed... URL: From ikamenshchikov at gmail.com Sat Apr 20 12:43:19 2019 From: ikamenshchikov at gmail.com (Ilya Kamenshchikov) Date: Sat, 20 Apr 2019 18:43:19 +0200 Subject: [Python-Dev] Concurrent.futures: no type discovery for PyCharm In-Reply-To: References: Message-ID: alright, so would an import under TYPE_CHECKING guard be an option? like: from typing import TYPE_CHECKING if TYPE_CHECKING: from .process import ProcessPoolExecutor from .thread import ThreadPoolExecutor Perhaps we can have both clarity and performance. -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjreedy at udel.edu Sat Apr 20 13:17:57 2019 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 20 Apr 2019 13:17:57 -0400 Subject: [Python-Dev] Python Documentation Translation in italian language In-Reply-To: References: Message-ID: On 4/20/2019 4:14 AM, Alessandro Cucci wrote: > Hello folks, > I want to start a project for translating the Python Documentation in > Italian. > I'm reading the PEP545, trying to understand how it works. > > I founded a Python User Group in my city and I can work with them on the > translations, plus next month I will be speaker at Pycon Italy, so I can > easily sponsor this project during the talk and reclute more people to > work on that. > > Is there anybody who can help me to start? > Thanks, have a nice day. Devguide: "7.6. Translations There are now several official documentation translations (see section 21.5. Documentation Translations and PEP 545 for details). Discussions about translations occur on the doc-sig mailing list." There is no Italian translation yet. There may or may not be one in progress. Post to doc-sig. https://mail.python.org/mailman/listinfo/doc-sig -- Terry Jan Reedy From songofacandy at gmail.com Sat Apr 20 17:08:36 2019 From: songofacandy at gmail.com (Inada Naoki) Date: Sun, 21 Apr 2019 06:08:36 +0900 Subject: [Python-Dev] Concurrent.futures: no type discovery for PyCharm In-Reply-To: References: Message-ID: "import typing" is slow too. 2019?4?21?(?) 1:43 Ilya Kamenshchikov : > alright, so would an import under TYPE_CHECKING guard be an option? 
like: > > from typing import TYPE_CHECKING > if TYPE_CHECKING: > from .process import ProcessPoolExecutor > from .thread import ThreadPoolExecutor > > > Perhaps we can have both clarity and performance. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From v+python at g.nevcal.com Sat Apr 20 18:26:14 2019 From: v+python at g.nevcal.com (Glenn Linderman) Date: Sat, 20 Apr 2019 15:26:14 -0700 Subject: [Python-Dev] Concurrent.futures: no type discovery for PyCharm In-Reply-To: References: Message-ID: <4bb10b8b-5cf7-926c-f6a0-f7f365594037@g.nevcal.com> On 4/20/2019 2:08 PM, Inada Naoki wrote: > "import typing" is slow too. > > 2019?4?21?(?) 1:43 Ilya Kamenshchikov >: > > alright, so would an import under TYPE_CHECKING guard be an > option? like: > > from typingimport TYPE_CHECKING > if TYPE_CHECKING: > from .processimport ProcessPoolExecutor > from .threadimport ThreadPoolExecutor > > Perhaps we can have both clarity and performance. > How about: from faketyping import TYPE_CHECKING where faketyping.py: TYPE_CHECKING = None I don't know enough about how TYPE_CHECKING (or typing) is optionally enabled to come up with an exactly correct proposal. -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Sat Apr 20 20:13:27 2019 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 20 Apr 2019 17:13:27 -0700 Subject: [Python-Dev] Concurrent.futures: no type discovery for PyCharm In-Reply-To: References: Message-ID: On Sat, Apr 20, 2019 at 2:11 PM Inada Naoki wrote: > > "import typing" is slow too. Many static analysis tools will also accept: TYPE_CHECKING = False if TYPE_CHECKING: ... At least mypy and pylint both treat all variables named TYPE_CHECKING as true, regardless of where they came from. I'm not sure if this is intentional or because they're cutting corners, but it works... -n -- Nathaniel J. Smith -- https://vorpus.org From chris.barker at noaa.gov Mon Apr 22 13:06:20 2019 From: chris.barker at noaa.gov (Chris Barker) Date: Mon, 22 Apr 2019 10:06:20 -0700 Subject: [Python-Dev] Proposal: dict.with_values(iterable) In-Reply-To: References: Message-ID: On Fri, Apr 12, 2019 at 10:20 AM Brett Cannon wrote: > >> This doesn't strike me as needing an optimization through a dedicated > method. > maybe a new dict mapping type -- "shared_dict" -- it would be used in places like the csv reader where it makes sense, but wouldn't impact the regular dict at all. you could get really clever an have it auto-convert to a regular dict when any changes were made that are incompatible with the shared keys... -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Mon Apr 22 15:40:42 2019 From: brett at python.org (Brett Cannon) Date: Mon, 22 Apr 2019 12:40:42 -0700 Subject: [Python-Dev] Concurrent.futures: no type discovery for PyCharm In-Reply-To: References: Message-ID: On Sat, Apr 20, 2019 at 2:10 PM Inada Naoki wrote: > "import typing" is slow too. > But is it so slow as to not do the right thing here and use the 'typing' module as expected? If you have so much work you need to launch some threads or processes to deal with it then a single import isn't going to be your biggest bottleneck. -Brett > > 2019?4?21?(?) 
1:43 Ilya Kamenshchikov : > >> alright, so would an import under TYPE_CHECKING guard be an option? like: >> >> from typing import TYPE_CHECKING >> if TYPE_CHECKING: >> from .process import ProcessPoolExecutor >> from .thread import ThreadPoolExecutor >> >> >> Perhaps we can have both clarity and performance. >> >> _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/brett%40python.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrew.svetlov at gmail.com Mon Apr 22 18:21:35 2019 From: andrew.svetlov at gmail.com (Andrew Svetlov) Date: Tue, 23 Apr 2019 01:21:35 +0300 Subject: [Python-Dev] Concurrent.futures: no type discovery for PyCharm In-Reply-To: References: Message-ID: I see the chicken and egg problem here. If we are talking about typing module usage -- typeshed is the type hints provider. If PyCharm doesn't want to use it -- it is not CPython problem. I think there is no need to change python code itself but used tooling. On Mon, Apr 22, 2019 at 11:06 PM Brett Cannon wrote: > > > > On Sat, Apr 20, 2019 at 2:10 PM Inada Naoki wrote: >> >> "import typing" is slow too. > > > But is it so slow as to not do the right thing here and use the 'typing' module as expected? If you have so much work you need to launch some threads or processes to deal with it then a single import isn't going to be your biggest bottleneck. > > -Brett > >> >> >> 2019?4?21?(?) 1:43 Ilya Kamenshchikov : >>> >>> alright, so would an import under TYPE_CHECKING guard be an option? like: >>> >>> from typing import TYPE_CHECKING >>> if TYPE_CHECKING: >>> from .process import ProcessPoolExecutor >>> from .thread import ThreadPoolExecutor >>> >>> >>> Perhaps we can have both clarity and performance. >> >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> https://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: https://mail.python.org/mailman/options/python-dev/brett%40python.org > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/andrew.svetlov%40gmail.com -- Thanks, Andrew Svetlov From steve.dower at python.org Mon Apr 22 18:46:55 2019 From: steve.dower at python.org (Steve Dower) Date: Mon, 22 Apr 2019 15:46:55 -0700 Subject: [Python-Dev] Concurrent.futures: no type discovery for PyCharm In-Reply-To: References: Message-ID: On 22Apr2019 1521, Andrew Svetlov wrote: > I see the chicken and egg problem here. > If we are talking about typing module usage -- typeshed is the type > hints provider. > If PyCharm doesn't want to use it -- it is not CPython problem. > > I think there is no need to change python code itself but used tooling. It's not typeshed related, it's most likely because Python 3.7 Lib/concurrent/future/__init__.py switched from always importing the subclasses to doing it lazily in a module __getattr__ function. I assume for performance, since either submodule may have deep import chains. 
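For anyone who wants to see what that looks like, the pattern is roughly the following (a simplified sketch of the PEP 562 approach, not the exact stdlib code):

# package __init__.py with expensive submodules (simplified sketch)
# "import concurrent.futures" stays cheap; the heavy submodules are only
# imported the first time one of these attributes is actually requested.

__all__ = ['ThreadPoolExecutor', 'ProcessPoolExecutor']   # plus the other names

def __getattr__(name):
    # Module-level __getattr__ (PEP 562): called only when normal attribute
    # lookup on the module fails, i.e. on first access to a lazy name.
    global ThreadPoolExecutor, ProcessPoolExecutor
    if name == 'ThreadPoolExecutor':
        from .thread import ThreadPoolExecutor
        return ThreadPoolExecutor
    if name == 'ProcessPoolExecutor':
        from .process import ProcessPoolExecutor
        return ProcessPoolExecutor
    raise AttributeError("module %r has no attribute %r" % (__name__, name))

def __dir__():
    # Keep dir() and REPL completion honest even though the executor classes
    # are not bound as module attributes until first access.
    return sorted(set(globals()) | set(__all__))

The global statement makes the first access also cache the class on the module, so __getattr__ only runs once per name.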
Presumably PyCharm has not yet added support for this, and so it simply doesn't know how to resolve ThreadPoolExecutor or ProcessPoolExecutor without actually executing code (which most static analysers will hesitate to do, since you don't know if that code is "os.system('rm -rf /')" until it's too late). Perhaps for the sake of IDEs and static analysers we could make a policy for standard library modules to include an "if False:" or "TYPE_CHECKING = False; if TYPE_CHECKING:" block that includes the import statement when adding lazy module attribute resolution? Cheers, Steve From songofacandy at gmail.com Mon Apr 22 18:55:16 2019 From: songofacandy at gmail.com (Inada Naoki) Date: Tue, 23 Apr 2019 07:55:16 +0900 Subject: [Python-Dev] Concurrent.futures: no type discovery for PyCharm In-Reply-To: References: Message-ID: On Tue, Apr 23, 2019 at 4:40 AM Brett Cannon wrote: > > On Sat, Apr 20, 2019 at 2:10 PM Inada Naoki wrote: >> >> "import typing" is slow too. > > But is it so slow as to not do the right thing here and use the 'typing' module as expected? I don't know it is not a "right thing" yet. It feel it is just a workaround for PyCharm at the moment. __dir__ and __all__ has ProcessPoolExecutor and ThreadPoolExecutor for interactive shell. So Python REPL can complete them. But we didn't discussed about "static hinting" version of __all__ in PEP 562. If we decide it's a "right way", we can update example code in PEP 562. But when we use lazy import, we want to make import faster. Adding more 3~5ms import time seems not so happy solution. Maybe, can we add TYPE_CHECKING=False in builtins? > If you have so much work you need to launch some threads or processes to deal with it then a single import isn't going to be your biggest bottleneck. Importing futures module doesn't mean the app really need thread or processes. That's why we defer importing ThreadPoolExecutor and ProcessPoolExecutor. And people who want apps like vim starts quickly (~200ms), we want avoid every "significant overhead" as possible. Not only "the biggest bottleneck" is the problem. -- Inada Naoki From songofacandy at gmail.com Mon Apr 22 19:03:05 2019 From: songofacandy at gmail.com (Inada Naoki) Date: Tue, 23 Apr 2019 08:03:05 +0900 Subject: [Python-Dev] Proposal: dict.with_values(iterable) In-Reply-To: References: Message-ID: On Tue, Apr 23, 2019 at 2:18 AM Chris Barker via Python-Dev wrote: > > On Fri, Apr 12, 2019 at 10:20 AM Brett Cannon wrote: >>> >>> >> This doesn't strike me as needing an optimization through a dedicated method. > > maybe a new dict mapping type -- "shared_dict" -- it would be used in places like the csv reader where it makes sense, but wouldn't impact the regular dict at all. > > you could get really clever an have it auto-convert to a regular dict when any changes were made that are incompatible with the shared keys... My current idea is adding builder in somewhere in stdlib (maybe collections?): builder = DictBuilder(keys_tuple) value = builder(values) # repeatedly called. I don't want to add new mapping type because we already have shared key dict, and changing mapping type may cause backward compatibility problem. 
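To make the intended semantics concrete, a rough pure-Python equivalent would be (illustration only; the names and the error behaviour are placeholders, and the real point would be a C implementation that hashes keys_tuple once and stamps out dicts sharing that key table instead of repeating the lookups for every row):

class DictBuilder:
    # Illustrative stand-in for the proposed builder, written in pure Python.
    # A C version would build and validate the key table once in __init__ and
    # reuse it for every produced dict.
    def __init__(self, keys):
        self._keys = tuple(keys)

    def __call__(self, values):
        values = tuple(values)
        if len(values) != len(self._keys):
            raise ValueError("expected %d values, got %d"
                             % (len(self._keys), len(values)))
        return dict(zip(self._keys, values))

# typical use, e.g. in a csv.DictReader-style loop:
builder = DictBuilder(('id', 'name', 'price'))
rows = [builder(row) for row in ((1, 'spam', 12.0), (2, 'eggs', 3.5))]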
Regards, -- Inada Naoki From v+python at g.nevcal.com Mon Apr 22 19:36:32 2019 From: v+python at g.nevcal.com (Glenn Linderman) Date: Mon, 22 Apr 2019 16:36:32 -0700 Subject: [Python-Dev] Proposal: dict.with_values(iterable) In-Reply-To: References: Message-ID: <4ba59cfd-c82a-df10-9ff2-bd8d000f85d6@g.nevcal.com> On 4/22/2019 4:03 PM, Inada Naoki wrote: > On Tue, Apr 23, 2019 at 2:18 AM Chris Barker via Python-Dev > wrote: >> On Fri, Apr 12, 2019 at 10:20 AM Brett Cannon wrote: >>>> >>> This doesn't strike me as needing an optimization through a dedicated method. >> maybe a new dict mapping type -- "shared_dict" -- it would be used in places like the csv reader where it makes sense, but wouldn't impact the regular dict at all. >> >> you could get really clever an have it auto-convert to a regular dict when any changes were made that are incompatible with the shared keys... > > My current idea is adding builder in somewhere in stdlib (maybe collections?): > > builder = DictBuilder(keys_tuple) > value = builder(values) # repeatedly called. > > I don't want to add new mapping type because we already have shared key dict, > and changing mapping type may cause backward compatibility problem. > > > Regards, As a heavy user of some self-written code that does stuff very similar to csv reader, and creates lots of same-key dicts, I'd be supportive of a performance enhancing solution here, although I haven't done a detailed study of where the time is currently spent. Is the problem that the existing shared key dict isn't always detected? Or just that knowing in advance that it is expected to be a shared key dict can save the detection work? I do know that in my code, I have a complete list of keys and values when I create each dict, and would be happy to tweak it to use the most performance technique. The above looks like a nice interface, assuming that values is expected to be in the same iterable order as keys_tuple (but is there a need for keys_tuple to be a tuple? could it be any iterable?). -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Mon Apr 22 20:19:30 2019 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 23 Apr 2019 10:19:30 +1000 Subject: [Python-Dev] Proposal: dict.with_values(iterable) In-Reply-To: References: Message-ID: <20190423001930.GL3010@ando.pearwood.info> On Mon, Apr 22, 2019 at 10:06:20AM -0700, Chris Barker via Python-Dev wrote: > maybe a new dict mapping type -- "shared_dict" -- it would be used in > places like the csv reader where it makes sense, but wouldn't impact the > regular dict at all. > > you could get really clever an have it auto-convert to a regular dict when > any changes were made that are incompatible with the shared keys... Oh, you mean just like regular dicts with shared keys already do :-) https://www.python.org/dev/peps/pep-0412/ Perhaps I've missed something in this discussion, but isn't this a matter of just making the existing shared-keys functionality explicitly usable rather than just purely implicit? Quoting from the PEP: When dictionaries are created to fill the __dict__ slot of an object, they are created in split form. The keys table is cached in the type, potentially allowing all attribute dictionaries of instances of one class to share keys. In the event of the keys of these dictionaries starting to diverge, individual dictionaries will lazily convert to the combined-table form. There's no explicit interface to control this; it all happens by magic, behind the scenes. 
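You can see the magic indirectly, if you're curious. The sharing itself isn't exposed anywhere, but on current CPython the per-instance dicts are usually reported as smaller than an equivalent literal dict because the keys live on the type (exact numbers vary by version and build):

import sys

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

p, q = Point(1, 2), Point(3, 4)

# Both instance dicts share one keys table cached on the class, so each
# per-instance dict only needs to store its own values.
print(sys.getsizeof(p.__dict__), sys.getsizeof(q.__dict__))

# A standalone dict with the same contents carries its own keys and hashes,
# so it normally shows up larger.
print(sys.getsizeof({'x': 1, 'y': 2}))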
I think the proposal here is to add some sort of interface, possibly a new method, to explicitly use key sharing. -- Steven From v+python at g.nevcal.com Mon Apr 22 21:22:30 2019 From: v+python at g.nevcal.com (Glenn Linderman) Date: Mon, 22 Apr 2019 18:22:30 -0700 Subject: [Python-Dev] Proposal: dict.with_values(iterable) In-Reply-To: <20190423001930.GL3010@ando.pearwood.info> References: <20190423001930.GL3010@ando.pearwood.info> Message-ID: <1514bfce-6aed-5b88-bbb6-78a9cc261282@g.nevcal.com> On 4/22/2019 5:19 PM, Steven D'Aprano wrote: > Oh, you mean just like regular dicts with shared keys already do:-) > > https://www.python.org/dev/peps/pep-0412/ > > Perhaps I've missed something in this discussion, but isn't this a > matter of just making the existing shared-keys functionality explicitly > usable rather than just purely implicit? Quoting from the PEP: > > When dictionaries are created to fill the __dict__ slot of an object, > they are created in split form. The keys table is cached in the type, > potentially allowing all attribute dictionaries of instances of one > class to share keys. In the event of the keys of these dictionaries > starting to diverge, individual dictionaries will lazily convert to > the combined-table form. > > There's no explicit interface to control this; it all happens by magic, > behind the scenes. I think the proposal here is to add some sort of > interface, possibly a new method, to explicitly use key sharing. Thanks for the PEP reference; I'd forgotten some of the details, and hadn't yet gone to look them up. Yes, it is all magic, but is only available for object __dict__ slot dicts. I'd forgotten that that was the "detection" mechanism.? In the general case, it would be too time-consuming to examine all existing dicts to discover some that might accidentally have the same keys, whereas Mark realized that objects very frequently have __dict__ slot dictionaries with the same keys, and were ripe for (significant memory and minor performance) optimization. Inada is now proposing a way to allow the coder to suggest a group of dictionaries that might benefit from the same gains, by preclassifying non-__dict__ slot dictionaries to do similar sharing. CSV reader is an exemplary candidate, because it creates groups of dicts that use the same keys. (column names). I have other code that does similar things, that would get similar benefits. Seems like since it is just an interface to existing builtin code, that the one interface function (or dictionary factory class) could just as well be a builtin function, instead of requiring an import. -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve.dower at python.org Mon Apr 22 22:21:41 2019 From: steve.dower at python.org (Steve Dower) Date: Mon, 22 Apr 2019 19:21:41 -0700 Subject: [Python-Dev] Proposal: dict.with_values(iterable) In-Reply-To: <1514bfce-6aed-5b88-bbb6-78a9cc261282@g.nevcal.com> References: <20190423001930.GL3010@ando.pearwood.info> <1514bfce-6aed-5b88-bbb6-78a9cc261282@g.nevcal.com> Message-ID: On 22Apr2019 1822, Glenn Linderman wrote: > Inada is now proposing a way to allow the coder to suggest a group of > dictionaries that might benefit from the same gains, by preclassifying > non-__dict__ slot dictionaries to do similar sharing. > > CSV reader is an exemplary candidate, because it creates groups of dicts > that use the same keys. (column names). I have other code that does > similar things, that would get similar benefits. 
> > Seems like since it is just an interface to existing builtin code, that > the one interface function (or dictionary factory class) could just as > well be a builtin function, instead of requiring an import. Sounds like a similar optimisation to sys.intern() is for strings. I see no reason to try and avoid an import here - it's definitely a special-case situation - but otherwise having a function to say "clone and update this dict" that starts by sharing the keys in the same way that __dict__ does (including the transformation when necessary) seems like an okay addition. Maybe copy() could just be enabled for this? Cheers, Steve From steve.dower at python.org Mon Apr 22 22:27:43 2019 From: steve.dower at python.org (Steve Dower) Date: Mon, 22 Apr 2019 19:27:43 -0700 Subject: [Python-Dev] Proposal: dict.with_values(iterable) In-Reply-To: References: <20190423001930.GL3010@ando.pearwood.info> <1514bfce-6aed-5b88-bbb6-78a9cc261282@g.nevcal.com> Message-ID: <25ca35f9-ae2b-ed4e-cb07-8d8fa87f6dbf@python.org> On 22Apr2019 1921, Steve Dower wrote: > On 22Apr2019 1822, Glenn Linderman wrote: >> Inada is now proposing a way to allow the coder to suggest a group of >> dictionaries that might benefit from the same gains, by preclassifying >> non-__dict__ slot dictionaries to do similar sharing. >> >> CSV reader is an exemplary candidate, because it creates groups of >> dicts that use the same keys. (column names). I have other code that >> does similar things, that would get similar benefits. >> >> Seems like since it is just an interface to existing builtin code, >> that the one interface function (or dictionary factory class) could >> just as well be a builtin function, instead of requiring an import. > > Sounds like a similar optimisation to sys.intern() is for strings. > > I see no reason to try and avoid an import here - it's definitely a > special-case situation - but otherwise having a function to say "clone > and update this dict" that starts by sharing the keys in the same way > that __dict__ does (including the transformation when necessary) seems > like an okay addition. Maybe copy() could just be enabled for this? Or possibly just "dict(existing_dict).update(new_items)". My primary concern is still to avoid making CPython performance characteristics part of the Python language definition. That only makes it harder for alternate implementations. (Even though I was out-voted last time on this issue since all the publicly-known alternate implementations said it would be okay... I'm still going to put in a vote for avoiding new language semantics for the sake of a single runtime's performance characteristics.) Cheers, Steve From v+python at g.nevcal.com Tue Apr 23 00:19:08 2019 From: v+python at g.nevcal.com (Glenn Linderman) Date: Mon, 22 Apr 2019 21:19:08 -0700 Subject: [Python-Dev] Proposal: dict.with_values(iterable) In-Reply-To: <25ca35f9-ae2b-ed4e-cb07-8d8fa87f6dbf@python.org> References: <20190423001930.GL3010@ando.pearwood.info> <1514bfce-6aed-5b88-bbb6-78a9cc261282@g.nevcal.com> <25ca35f9-ae2b-ed4e-cb07-8d8fa87f6dbf@python.org> Message-ID: On 4/22/2019 7:27 PM, Steve Dower wrote: > On 22Apr2019 1921, Steve Dower wrote: >> On 22Apr2019 1822, Glenn Linderman wrote: >>> Inada is now proposing a way to allow the coder to suggest a group >>> of dictionaries that might benefit from the same gains, by >>> preclassifying non-__dict__ slot dictionaries to do similar sharing. >>> >>> CSV reader is an exemplary candidate, because it creates groups of >>> dicts that use the same keys. 
(column names). I have other code that >>> does similar things, that would get similar benefits. >>> >>> Seems like since it is just an interface to existing builtin code, >>> that the one interface function (or dictionary factory class) could >>> just as well be a builtin function, instead of requiring an import. >> >> Sounds like a similar optimisation to sys.intern() is for strings. >> >> I see no reason to try and avoid an import here - it's definitely a >> special-case situation - but otherwise having a function to say >> "clone and update this dict" that starts by sharing the keys in the >> same way that __dict__ does (including the transformation when >> necessary) seems like an okay addition. Maybe copy() could just be >> enabled for this? > > Or possibly just "dict(existing_dict).update(new_items)". > > My primary concern is still to avoid making CPython performance > characteristics part of the Python language definition. That only > makes it harder for alternate implementations. (Even though I was > out-voted last time on this issue since all the publicly-known > alternate implementations said it would be okay... I'm still going to > put in a vote for avoiding new language semantics for the sake of a > single runtime's performance characteristics.) While Inada's suggested DictBuilder interface was immediately obvious, I don't get how either copy or update would achieve the goal. Perhaps you could explain? Particularly, what would be the trigger that would make dict() choose to create a shared key dictionary from the start? Unless it is known that there will be lots of (mostly static) dictionaries with the same set of keys at the time of creation of the first one, creating a shared key dictionary in every case would cause later inefficiencies in converting them, when additional items are added? (I'm assuming without knowledge that a single shared key dictionary is less efficient than a single regular dictionary.) -------------- next part -------------- An HTML attachment was scrubbed... URL: From v+python at g.nevcal.com Tue Apr 23 00:28:46 2019 From: v+python at g.nevcal.com (Glenn Linderman) Date: Mon, 22 Apr 2019 21:28:46 -0700 Subject: [Python-Dev] Proposal: dict.with_values(iterable) In-Reply-To: <25ca35f9-ae2b-ed4e-cb07-8d8fa87f6dbf@python.org> References: <20190423001930.GL3010@ando.pearwood.info> <1514bfce-6aed-5b88-bbb6-78a9cc261282@g.nevcal.com> <25ca35f9-ae2b-ed4e-cb07-8d8fa87f6dbf@python.org> Message-ID: <72b0f6a8-1110-1b5b-d796-94a1e5d28d70@g.nevcal.com> On 4/22/2019 7:27 PM, Steve Dower wrote: > On 22Apr2019 1921, Steve Dower wrote: >> On 22Apr2019 1822, Glenn Linderman wrote: >>> Inada is now proposing a way to allow the coder to suggest a group >>> of dictionaries that might benefit from the same gains, by >>> preclassifying non-__dict__ slot dictionaries to do similar sharing. >>> >>> CSV reader is an exemplary candidate, because it creates groups of >>> dicts that use the same keys. (column names). I have other code that >>> does similar things, that would get similar benefits. >>> >>> Seems like since it is just an interface to existing builtin code, >>> that the one interface function (or dictionary factory class) could >>> just as well be a builtin function, instead of requiring an import. >> >> Sounds like a similar optimisation to sys.intern() is for strings. 
>> >> I see no reason to try and avoid an import here - it's definitely a >> special-case situation - but otherwise having a function to say >> "clone and update this dict" that starts by sharing the keys in the >> same way that __dict__ does (including the transformation when >> necessary) seems like an okay addition. Maybe copy() could just be >> enabled for this? > > Or possibly just "dict(existing_dict).update(new_items)". > > My primary concern is still to avoid making CPython performance > characteristics part of the Python language definition. That only > makes it harder for alternate implementations. (Even though I was > out-voted last time on this issue since all the publicly-known > alternate implementations said it would be okay... I'm still going to > put in a vote for avoiding new language semantics for the sake of a > single runtime's performance characteristics.) I note that dict() doesn't have a method to take two parallel iterables of keys/values and create a dict... if it did, that could be a trigger that a shared key dict might be appropriate... it seems more likely that data in that form is dealing with rows and columns, instead of the forms currently accepted by dict(). Perhaps an alternate constructor that took data in that form, AND defined an optional parameter to trigger a shared dict, would be a useful addition to the language.? Other implementations could ignore the optional parameter if they want, and the implementation would be a one-liner calling the current constructor and zip()ing the parameters. The alternate constructor would be nice even if shared key dicts were not particularly needed in an application, and would provide a method of adding a trigger for the shared key optimization when appropriate. -------------- next part -------------- An HTML attachment was scrubbed... URL: From songofacandy at gmail.com Tue Apr 23 00:43:07 2019 From: songofacandy at gmail.com (Inada Naoki) Date: Tue, 23 Apr 2019 13:43:07 +0900 Subject: [Python-Dev] Proposal: dict.with_values(iterable) In-Reply-To: <25ca35f9-ae2b-ed4e-cb07-8d8fa87f6dbf@python.org> References: <20190423001930.GL3010@ando.pearwood.info> <1514bfce-6aed-5b88-bbb6-78a9cc261282@g.nevcal.com> <25ca35f9-ae2b-ed4e-cb07-8d8fa87f6dbf@python.org> Message-ID: On Tue, Apr 23, 2019 at 11:30 AM Steve Dower wrote: > > Or possibly just "dict(existing_dict).update(new_items)". > Do you mean .update accepts values tuple? I can't think it's > My primary concern is still to avoid making CPython performance > characteristics part of the Python language definition. That only makes > it harder for alternate implementations. Note that this proposal is not only for key sharing dict: * We can avoid rebuilding hash table again and again. * We can avoid checking duplicated keys again and again. These characteristics are not only for Python, but for all mapping implementations using hash table. -- Inada Naoki From steve.dower at python.org Tue Apr 23 01:54:39 2019 From: steve.dower at python.org (Steve Dower) Date: Mon, 22 Apr 2019 22:54:39 -0700 Subject: [Python-Dev] Proposal: dict.with_values(iterable) In-Reply-To: References: <20190423001930.GL3010@ando.pearwood.info> <1514bfce-6aed-5b88-bbb6-78a9cc261282@g.nevcal.com> <25ca35f9-ae2b-ed4e-cb07-8d8fa87f6dbf@python.org> Message-ID: <31b4e1aa-ef17-6ece-5698-acb1aee9fb35@python.org> On 22Apr2019 2143, Inada Naoki wrote: > On Tue, Apr 23, 2019 at 11:30 AM Steve Dower wrote: >> >> Or possibly just "dict(existing_dict).update(new_items)". 
>> > > Do you mean .update accepts values tuple? > I can't think it's Not sure what you were going to go on to say here, but why not? If it's a key-sharing dict, then all the keys are strings. We know that when we go to do the update, so we can intern all the strings (going to do that anyway) and then it's a quick check if it already exists. If it's a regular dict, then we calculate hashes as normal. Updating the value is just a decref, incref and assignment. If not all these conditions are met, we convert to a regular dict. The proposed function was going to raise an error in this case, so all we've done is make it transparent. The biggest downside is now you don't get a warning that your preferred optimization isn't actually working when you pass in new_items with different keys from what were in existing_dict. Note that it .update() would probably require a dict or key/value tuples here - but if you have the keys in a tuple already then zip() is going to be good enough for setting it (in fact, zip(existing_dict, new_values) should be fine, and we can internally special-case that scenario, too). I'd assumed the benefit was in memory usage after construction, rather than speed-to-construct, since everyone keeps talking about "key-sharing dictionaries" and not "arrays" ;) (Randomizing side note: is this scenario enough to make a case for a built-in data frame type?) >> My primary concern is still to avoid making CPython performance >> characteristics part of the Python language definition. That only makes >> it harder for alternate implementations. > > Note that this proposal is not only for key sharing dict: > > * We can avoid rebuilding hash table again and again. > * We can avoid checking duplicated keys again and again. > > These characteristics are not only for Python, but for all mapping > implementations using hash table. I believe all of these are met by making d2=dict(d1) construct a dict d2 that shares keys with d1 by default. Can you show how they are not? * when you only d2.update existing keys, no need to rebuild the table * a duplicated key overwrites multiple times - what else are you going to do? This is already easiest, fastest, uses the least memory and is most consistent with every other form of setting dict items. Why complicate things by checking them? Let the caller do it Cheers, Steve From steve.dower at python.org Tue Apr 23 01:59:56 2019 From: steve.dower at python.org (Steve Dower) Date: Mon, 22 Apr 2019 22:59:56 -0700 Subject: [Python-Dev] Proposal: dict.with_values(iterable) In-Reply-To: References: <20190423001930.GL3010@ando.pearwood.info> <1514bfce-6aed-5b88-bbb6-78a9cc261282@g.nevcal.com> <25ca35f9-ae2b-ed4e-cb07-8d8fa87f6dbf@python.org> Message-ID: On 22Apr2019 2119, Glenn Linderman wrote: > While Inada's suggested DictBuilder interface was immediately obvious, I > don't get how either copy or update would achieve the goal. Perhaps you > could explain? Particularly, what would be the trigger that would make > dict() choose to create a shared key dictionary from the start? Unless > it is known that there will be lots of (mostly static) dictionaries with > the same set of keys at the time of creation of the first one, creating > a shared key dictionary in every case would cause later inefficiencies > in converting them, when additional items are added? (I'm assuming > without knowledge that a single shared key dictionary is less efficient > than a single regular dictionary.) 
Passing a dictionary to the dict() constructor creates a copy of that dictionary (as does copy.copy()). What other trigger do you need to decide "it contains the same keys"? It's a copy of the original dict, so by definition at that point it may as well share its entire contents with the original. Basically this is just a partial copy-on-write, where we copy values eagerly - since they're almost certainly going to change - and keys lazily - since there are known scenarios where they are not going to be changed, but we'll pay the cost later if it turns out they are. Cheers, Steve From songofacandy at gmail.com Tue Apr 23 03:08:01 2019 From: songofacandy at gmail.com (Inada Naoki) Date: Tue, 23 Apr 2019 16:08:01 +0900 Subject: [Python-Dev] Proposal: dict.with_values(iterable) In-Reply-To: <31b4e1aa-ef17-6ece-5698-acb1aee9fb35@python.org> References: <20190423001930.GL3010@ando.pearwood.info> <1514bfce-6aed-5b88-bbb6-78a9cc261282@g.nevcal.com> <25ca35f9-ae2b-ed4e-cb07-8d8fa87f6dbf@python.org> <31b4e1aa-ef17-6ece-5698-acb1aee9fb35@python.org> Message-ID: On Tue, Apr 23, 2019 at 2:54 PM Steve Dower wrote: > > On 22Apr2019 2143, Inada Naoki wrote: > > On Tue, Apr 23, 2019 at 11:30 AM Steve Dower wrote: > >> > >> Or possibly just "dict(existing_dict).update(new_items)". > >> > > > > Do you mean .update accepts values tuple? > > I can't think it's > > Not sure what you were going to go on to say here, but why not? Sorry, I sent mail without finishing. dict.update() has too many overloading. Adding values_tuple is impossible without breaking backward compatibility. But I think you're saying about items_sequence, not values. > > If it's a key-sharing dict, then all the keys are strings. We know that > when we go to do the update, so we can intern all the strings (going to > do that anyway) and then it's a quick check if it already exists. If > it's a regular dict, then we calculate hashes as normal. Updating the > value is just a decref, incref and assignment. There are some problem. 1. Searching hash table is not zero-cost, comparing to appending to sequence. This cost is very near to building new hash tables. 2. In my proposal, main user is csv.DictReader or sql.DictCursor. They parse only values on each rows. So they need to use map. 3. (CPython only) dict.copy(), dict(dict), and dict.update() are general purpose methods. There is no obvious place to start using key-sharing dict. That's why I proposed specific method / function for specific purpose. > > Note that it .update() would probably require a dict or key/value tuples > here - but if you have the keys in a tuple already then zip() is going > to be good enough for setting it (in fact, zip(existing_dict, > new_values) should be fine, and we can internally special-case that > scenario, too). If *CPython* specialized dict(zip(dict, values)), it still be CPython implementation detail. Do you want recommend using such CPython hacky optimization? Should we use such optimization in stdlib, even if it will be slower than dict(zip(keys_tuple, values)) on some other Python implementations? Or do you propose making dict(zip(dict, values)) optimization as language specification? One obvious advantage of having DictBuilder is it is for specific purpose. It has at least same performance to dict(zip(keys, values)) on all Python implementations. Libraries like csv parser can use it without worrying about its performance on Python other than CPython. 
> I'd assumed the benefit was in memory usage after > construction, rather than speed-to-construct, since everyone keeps > talking about "key-sharing dictionaries" and not "arrays" ;) Both is important. I had talked about non key-sharing dict. > (Randomizing side note: is this scenario enough to make a case for a > built-in data frame type?) https://xkcd.com/927/ > >> My primary concern is still to avoid making CPython performance > >> characteristics part of the Python language definition. That only makes > >> it harder for alternate implementations. > > > > Note that this proposal is not only for key sharing dict: > > > > * We can avoid rebuilding hash table again and again. > > * We can avoid checking duplicated keys again and again. > > > > These characteristics are not only for Python, but for all mapping > > implementations using hash table. > > I believe all of these are met by making d2=dict(d1) construct a dict d2 > that shares keys with d1 by default. Can you show how they are not? If you want only copy, it's same. > > * when you only d2.update existing keys, no need to rebuild the table > * a duplicated key overwrites multiple times - what else are you going > to do? But all keys should be looked up. It is very similar overhead to rebuilding hash table. > This is already easiest, fastest, uses the least memory and is > most consistent with every other form of setting dict items. Why > complicate things by checking them? Let the caller do it As I wrote above, it is: * slower than my proposal. * no obvious place to start using key sharing dict. -- Inada Naoki From v+python at g.nevcal.com Tue Apr 23 03:34:53 2019 From: v+python at g.nevcal.com (Glenn Linderman) Date: Tue, 23 Apr 2019 00:34:53 -0700 Subject: [Python-Dev] Proposal: dict.with_values(iterable) In-Reply-To: References: <20190423001930.GL3010@ando.pearwood.info> <1514bfce-6aed-5b88-bbb6-78a9cc261282@g.nevcal.com> <25ca35f9-ae2b-ed4e-cb07-8d8fa87f6dbf@python.org> Message-ID: <5feb2cd6-8075-3589-e091-a79ce27cc25f@g.nevcal.com> On 4/22/2019 10:59 PM, Steve Dower wrote: > On 22Apr2019 2119, Glenn Linderman wrote: >> While Inada's suggested DictBuilder interface was immediately >> obvious, I don't get how either copy or update would achieve the >> goal. Perhaps you could explain? Particularly, what would be the >> trigger that would make dict() choose to create a shared key >> dictionary from the start? Unless it is known that there will be lots >> of (mostly static) dictionaries with the same set of keys at the time >> of creation of the first one, creating a shared key dictionary in >> every case would cause later inefficiencies in converting them, when >> additional items are added? (I'm assuming without knowledge that a >> single shared key dictionary is less efficient than a single regular >> dictionary.) > > Passing a dictionary to the dict() constructor creates a copy of that > dictionary (as does copy.copy()). What other trigger do you need to > decide "it contains the same keys"? It's a copy of the original dict, > so by definition at that point it may as well share its entire > contents with the original. But if the original dictionary wasn't created with shared keys... the copy can't share them either.? Or are you suggesting adding new code to create a shared key dictionary from one that isn't? 
> > Basically this is just a partial copy-on-write, where we copy values > eagerly - since they're almost certainly going to change - and keys > lazily - since there are known scenarios where they are not going to > be changed, but we'll pay the cost later if it turns out they are. > > Cheers, > Steve > -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrew.svetlov at gmail.com Tue Apr 23 07:58:53 2019 From: andrew.svetlov at gmail.com (Andrew Svetlov) Date: Tue, 23 Apr 2019 14:58:53 +0300 Subject: [Python-Dev] Concurrent.futures: no type discovery for PyCharm In-Reply-To: References: Message-ID: I agree that `from typing import TYPE_CHECKING` is not desirable from the import time reduction perspective. >From my understanding code completion *can* be based on type hinting to avoid actual code execution. That's why I've mentioned that typeshed already has the correct type information. if TYPE_CHECKING: import ... requires mypy modification. if False: import ... Works right now for stdlib (mypy ignores stdlib code but uses typeshed anyway) but looks a little cryptic. Requires a comprehensive comment at least. On Tue, Apr 23, 2019 at 1:59 AM Inada Naoki wrote: > > On Tue, Apr 23, 2019 at 4:40 AM Brett Cannon wrote: > > > > On Sat, Apr 20, 2019 at 2:10 PM Inada Naoki wrote: > >> > >> "import typing" is slow too. > > > > But is it so slow as to not do the right thing here and use the 'typing' module as expected? > > I don't know it is not a "right thing" yet. It feel it is just a > workaround for PyCharm at the moment. > > __dir__ and __all__ has ProcessPoolExecutor and ThreadPoolExecutor for > interactive shell. So Python REPL can complete them. But we didn't discussed > about "static hinting" version of __all__ in PEP 562. > > If we decide it's a "right way", we can update example code in PEP 562. > > But when we use lazy import, we want to make import faster. > Adding more 3~5ms import time seems not so happy solution. > > Maybe, can we add TYPE_CHECKING=False in builtins? > > > > If you have so much work you need to launch some threads or processes to deal with it then a single import isn't going to be your biggest bottleneck. > > Importing futures module doesn't mean the app really need > thread or processes. That's why we defer importing ThreadPoolExecutor > and ProcessPoolExecutor. > > And people who want apps like vim starts quickly (~200ms), we want avoid > every "significant overhead" as possible. Not only "the biggest bottleneck" > is the problem. 
> > -- > Inada Naoki > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/andrew.svetlov%40gmail.com -- Thanks, Andrew Svetlov From steve.dower at python.org Tue Apr 23 11:13:32 2019 From: steve.dower at python.org (Steve Dower) Date: Tue, 23 Apr 2019 08:13:32 -0700 Subject: [Python-Dev] Proposal: dict.with_values(iterable) In-Reply-To: <5feb2cd6-8075-3589-e091-a79ce27cc25f@g.nevcal.com> References: <20190423001930.GL3010@ando.pearwood.info> <1514bfce-6aed-5b88-bbb6-78a9cc261282@g.nevcal.com> <25ca35f9-ae2b-ed4e-cb07-8d8fa87f6dbf@python.org> <5feb2cd6-8075-3589-e091-a79ce27cc25f@g.nevcal.com> Message-ID: <8939429a-cc81-6cc5-c4a7-2f0a5ca01ac6@python.org> On 23Apr2019 0034, Glenn Linderman wrote: > On 4/22/2019 10:59 PM, Steve Dower wrote: >> On 22Apr2019 2119, Glenn Linderman wrote: >>> While Inada's suggested DictBuilder interface was immediately >>> obvious, I don't get how either copy or update would achieve the >>> goal. Perhaps you could explain? Particularly, what would be the >>> trigger that would make dict() choose to create a shared key >>> dictionary from the start? Unless it is known that there will be lots >>> of (mostly static) dictionaries with the same set of keys at the time >>> of creation of the first one, creating a shared key dictionary in >>> every case would cause later inefficiencies in converting them, when >>> additional items are added? (I'm assuming without knowledge that a >>> single shared key dictionary is less efficient than a single regular >>> dictionary.) >> >> Passing a dictionary to the dict() constructor creates a copy of that >> dictionary (as does copy.copy()). What other trigger do you need to >> decide "it contains the same keys"? It's a copy of the original dict, >> so by definition at that point it may as well share its entire >> contents with the original. > > But if the original dictionary wasn't created with shared keys... the > copy can't share them either.? Or are you suggesting adding new code to > create a shared key dictionary from one that isn't? This is a good point. Maybe dict.fromkeys() could do it? Or a sys.intern-like function (which is why I brought up that precedent). The point is to make it an optional benefit rather than strict language/library semantics. Is there a cost to using a key sharing dict that is prohibitive when the keys aren't actually being shared? Cheers, Steve From steve.dower at python.org Tue Apr 23 11:33:07 2019 From: steve.dower at python.org (Steve Dower) Date: Tue, 23 Apr 2019 08:33:07 -0700 Subject: [Python-Dev] Proposal: dict.with_values(iterable) In-Reply-To: References: <20190423001930.GL3010@ando.pearwood.info> <1514bfce-6aed-5b88-bbb6-78a9cc261282@g.nevcal.com> <25ca35f9-ae2b-ed4e-cb07-8d8fa87f6dbf@python.org> <31b4e1aa-ef17-6ece-5698-acb1aee9fb35@python.org> Message-ID: On 23Apr2019 0008, Inada Naoki wrote: > On Tue, Apr 23, 2019 at 2:54 PM Steve Dower wrote: >> >> On 22Apr2019 2143, Inada Naoki wrote: >>> On Tue, Apr 23, 2019 at 11:30 AM Steve Dower wrote: >>>> >>>> Or possibly just "dict(existing_dict).update(new_items)". >>>> >>> >>> Do you mean .update accepts values tuple? >>> I can't think it's >> >> Not sure what you were going to go on to say here, but why not? > > Sorry, I sent mail without finishing. > > dict.update() has too many overloading. > Adding values_tuple is impossible without breaking backward compatibility. 
> > But I think you're saying about items_sequence, not values. Right. I'm specifically trying to avoid changing public APIs at all (including adding anything new, if possible) by identifying suitable patterns that we can handle specially to provide a transparent speed improvement. >> If it's a key-sharing dict, then all the keys are strings. We know that >> when we go to do the update, so we can intern all the strings (going to >> do that anyway) and then it's a quick check if it already exists. If >> it's a regular dict, then we calculate hashes as normal. Updating the >> value is just a decref, incref and assignment. > > There are some problem. > > 1. Searching hash table is not zero-cost, comparing to appending to sequence. > This cost is very near to building new hash tables. If we know that you're sharing keys with the new items then we can skip the search. This was my point about the d2 = copy(d1); d2.update(zip(d2, values)) idea: def update(self, items): if isinstance(items, ZipObject): # whatever the type is called if are_sharing_keys(self, items.sequence_1): # fast update from iter(items.sequence_2) return # regular update from iter(items) Totally transparent and encourages composition of existing builtins. It's a bit of a trick and may not be as obvious as a new method, but it's backwards compatible at least as far as ordered dicts (which is a requirement of any of these approaches anyway, yes?) > 2. In my proposal, main user is csv.DictReader or sql.DictCursor. > They parse only values on each rows. So they need to use map. In that case, use a private helper. _csv already has a native module. We don't need to add new public APIs for internal optimisations, provided there is a semantically equivalent way to do it without the internal API. > 3. (CPython only) dict.copy(), dict(dict), and dict.update() are general purpose > methods. There is no obvious place to start using key-sharing dict. See my reply to Glenn, but potentially fromkeys() could start with the key-sharing dict and then copy()/dict() could continue sharing it (hopefully they already do?). > That's why I proposed specific method / function for specific purpose. > >> >> Note that it .update() would probably require a dict or key/value tuples >> here - but if you have the keys in a tuple already then zip() is going >> to be good enough for setting it (in fact, zip(existing_dict, >> new_values) should be fine, and we can internally special-case that >> scenario, too). > > If *CPython* specialized dict(zip(dict, values)), it still be CPython > implementation detail. > Do you want recommend using such CPython hacky optimization? > Should we use such optimization in stdlib, even if it will be slower > than dict(zip(keys_tuple, values)) on some other Python implementations? We do "hacky" optimisations everywhere :) The point of the runtime is to let users write code that works and we do the effort behind the scenes to make it efficient. We're not C - we're here to help our users. The point is that it will work on other implementations - including previous versions of CPython - and those are free to optimise it however they like. > Or do you propose making dict(zip(dict, values)) optimization as > language specification? Definitely not! It's just a pattern that we have the ability to recognize and optimize at runtime, so why not do it? > One obvious advantage of having DictBuilder is it is for specific > purpose. It has at least same performance to dict(zip(keys, values)) > on all Python implementations. 
> Libraries like csv parser can use it without worrying about its performance > on Python other than CPython. A singular purpose isn't necessarily an obvious advantage. We're better off with generic building blocks that our users can compose in ways that were originally non-obvious (and then as patterns emerge we can look at ways to simplify or formalise them). >> (Randomizing side note: is this scenario enough to make a case for a >> built-in data frame type?) > > https://xkcd.com/927/ Yep. The difference is that as the language team, our standard wins by default ;) (For those who don't click links, it's pointing at the "let's make a new standard" XKCD comic) >> * when you only d2.update existing keys, no need to rebuild the table >> * a duplicated key overwrites multiple times - what else are you going >> to do? > > But all keys should be looked up. It is very similar overhead to rebuilding > hash table. See my suggestion above - when we know the keys are shared, we can skip the lookup, and there are ways we can detect that they're shared. (Perhaps it is also faster to start by assuming they are shared and test each one, rather than assuming they are unshared? That might be worth testing.) Cheers, Steve From songofacandy at gmail.com Tue Apr 23 11:53:10 2019 From: songofacandy at gmail.com (Inada Naoki) Date: Wed, 24 Apr 2019 00:53:10 +0900 Subject: [Python-Dev] Proposal: dict.with_values(iterable) In-Reply-To: <8939429a-cc81-6cc5-c4a7-2f0a5ca01ac6@python.org> References: <20190423001930.GL3010@ando.pearwood.info> <1514bfce-6aed-5b88-bbb6-78a9cc261282@g.nevcal.com> <25ca35f9-ae2b-ed4e-cb07-8d8fa87f6dbf@python.org> <5feb2cd6-8075-3589-e091-a79ce27cc25f@g.nevcal.com> <8939429a-cc81-6cc5-c4a7-2f0a5ca01ac6@python.org> Message-ID: On Wed, Apr 24, 2019 at 12:28 AM Steve Dower wrote: > > > > > But if the original dictionary wasn't created with shared keys... the > > copy can't share them either. Or are you suggesting adding new code to > > create a shared key dictionary from one that isn't? > > This is a good point. Maybe dict.fromkeys() could do it? Or a > sys.intern-like function (which is why I brought up that precedent). The > point is to make it an optional benefit rather than strict > language/library semantics. > Then, why not support values when creating key sharing dict? That's one form of my proposal :) -- Inada Naoki From songofacandy at gmail.com Tue Apr 23 12:29:39 2019 From: songofacandy at gmail.com (Inada Naoki) Date: Wed, 24 Apr 2019 01:29:39 +0900 Subject: [Python-Dev] Proposal: dict.with_values(iterable) In-Reply-To: References: <20190423001930.GL3010@ando.pearwood.info> <1514bfce-6aed-5b88-bbb6-78a9cc261282@g.nevcal.com> <25ca35f9-ae2b-ed4e-cb07-8d8fa87f6dbf@python.org> <31b4e1aa-ef17-6ece-5698-acb1aee9fb35@python.org> Message-ID: On Wed, Apr 24, 2019 at 12:34 AM Steve Dower wrote: > > >> If it's a key-sharing dict, then all the keys are strings. We know that > >> when we go to do the update, so we can intern all the strings (going to > >> do that anyway) and then it's a quick check if it already exists. If > >> it's a regular dict, then we calculate hashes as normal. Updating the > >> value is just a decref, incref and assignment. > > > > There are some problem. > > > > 1. Searching hash table is not zero-cost, comparing to appending to sequence. > > This cost is very near to building new hash tables. > > If we know that you're sharing keys with the new items then we can skip > the search. 
This was my point about the d2 = copy(d1); d2.update(zip(d2, > values)) idea: > OK, I got it. But note that zip object doesn't expose items, neither Python level or C level. > > 2. In my proposal, main user is csv.DictReader or sql.DictCursor. > > They parse only values on each rows. So they need to use map. > > In that case, use a private helper. _csv already has a native module. We > don't need to add new public APIs for internal optimisations, provided > there is a semantically equivalent way to do it without the internal API. csv is stdlib. But there are some third party extensions similar to csv. > > > 3. (CPython only) dict.copy(), dict(dict), and dict.update() are general purpose > > methods. There is no obvious place to start using key-sharing dict. > > See my reply to Glenn, but potentially fromkeys() could start with the > key-sharing dict and then copy()/dict() could continue sharing it > (hopefully they already do?). Key-sharing dict is used only for instance dict at the moment. 2nd argument of dict.fromkeys() is value, not values. How about adding dict.fromkeyvalues(keys, values)? When keys is dict, it's behavior is same to my first proposal (`dict.with_values(d1, values)`). > > > > If *CPython* specialized dict(zip(dict, values)), it still be CPython > > implementation detail. > > Do you want recommend using such CPython hacky optimization? > > Should we use such optimization in stdlib, even if it will be slower > > than dict(zip(keys_tuple, values)) on some other Python implementations? > > We do "hacky" optimisations everywhere :) The point of the runtime is to > let users write code that works and we do the effort behind the scenes > to make it efficient. We're not C - we're here to help our users. But we avoid CPython-only hack which will make stdlib slower on other Python implementations as possible. For example, we optimize `s1 += s` loop. But we use `''.join(list_of_str)` instead of it. > > The point is that it will work on other implementations - including > previous versions of CPython - and those are free to optimise it however > they like. > > > Or do you propose making dict(zip(dict, values)) optimization as > > language specification? > > Definitely not! It's just a pattern that we have the ability to > recognize and optimize at runtime, so why not do it? Why we need to recommend patterns fast only in CPython? d2 = dict.fromkeys(keys_dict) # make key sharing dict, only in CPython 3.8+ d2.update(zip(d2, row)) # update values without key lookup, only in CPython 3.8+ Obviously, this may be much slower than `d2 = dict(zip(keys_tuple, row))` on current CPython and other Python implementations. Note that this pattern will be used when dict creation is bottleneck. If we has specialized API, libraries can use it if the API is available, and use dict(zip(keys, row)) otherwise. > > > One obvious advantage of having DictBuilder is it is for specific > > purpose. It has at least same performance to dict(zip(keys, values)) > > on all Python implementations. > > Libraries like csv parser can use it without worrying about its performance > > on Python other than CPython. > > A singular purpose isn't necessarily an obvious advantage. We're better > off with generic building blocks that our users can compose in ways that > were originally non-obvious (and then as patterns emerge we can look at > ways to simplify or formalise them). In generic building blocks, we can not know user will create massive dicts with same keys or just creating one copy. 
We need to guess, and the guess may be wrong. Regards, -- Inada Naoki From njs at pobox.com Tue Apr 23 12:54:20 2019 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 23 Apr 2019 09:54:20 -0700 Subject: [Python-Dev] Concurrent.futures: no type discovery for PyCharm In-Reply-To: References: Message-ID: On Tue, Apr 23, 2019, 05:09 Andrew Svetlov wrote: > I agree that `from typing import TYPE_CHECKING` is not desirable from > the import time reduction perspective. > > From my understanding code completion *can* be based on type hinting > to avoid actual code execution. > That's why I've mentioned that typeshed already has the correct type > information. > > if TYPE_CHECKING: > import ... > > requires mypy modification. > > if False: > import ... > > Works right now for stdlib (mypy ignores stdlib code but uses typeshed > anyway) but looks a little cryptic. > Requires a comprehensive comment at least. > Last time I looked at this, I'm pretty sure `if False` broke at least one popular static analysis tool (ie it was clever enough to ignore everything inside `if False`) ? I think either pylint or jedi? I'd suggest checking any clever hacks against at least: mypy, pylint/astroid, jedi, pyflakes, and pycharm. They all have their own static analysis engines, and each one has its own idiosyncratic quirks. We've struggled with this a *lot* in trio, and eventually ended up giving up on all forms of dynamic export cleverness; we've even banned the use of __all__ entirely. Static analysis has gotten good enough that users won't accept it not working, but it hasn't gotten good enough to handle anything but the simplest static exports in a reliable way: https://github.com/python-trio/trio/pull/316 https://github.com/python-trio/trio/issues/542 The stdlib has more leeway because when tools don't work on the stdlib then they tend to eventually add workarounds. I'm just saying, think twice before diving into clever hacks to workaround static analysis limits, and if you're going to do it then be careful to be thorough. You're basically relying on undocumented bugs, and it gets really messy really quickly. -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Tue Apr 23 13:05:18 2019 From: guido at python.org (Guido van Rossum) Date: Tue, 23 Apr 2019 10:05:18 -0700 Subject: [Python-Dev] Concurrent.futures: no type discovery for PyCharm In-Reply-To: References: Message-ID: In any case I think this should be filed (by the OP) as an issue against JetBrains' PyCharm issue tracker. Who knows they may be able to special-case this in a jiffy. I don't think we should add any clever hacks to the stdlib for this. On Tue, Apr 23, 2019 at 9:59 AM Nathaniel Smith wrote: > On Tue, Apr 23, 2019, 05:09 Andrew Svetlov > wrote: > >> I agree that `from typing import TYPE_CHECKING` is not desirable from >> the import time reduction perspective. >> >> From my understanding code completion *can* be based on type hinting >> to avoid actual code execution. >> That's why I've mentioned that typeshed already has the correct type >> information. >> >> if TYPE_CHECKING: >> import ... >> >> requires mypy modification. >> >> if False: >> import ... >> >> Works right now for stdlib (mypy ignores stdlib code but uses typeshed >> anyway) but looks a little cryptic. >> Requires a comprehensive comment at least. >> > > Last time I looked at this, I'm pretty sure `if False` broke at least one > popular static analysis tool (ie it was clever enough to ignore everything > inside `if False`) ? 
I think either pylint or jedi? > > I'd suggest checking any clever hacks against at least: mypy, > pylint/astroid, jedi, pyflakes, and pycharm. They all have their own static > analysis engines, and each one has its own idiosyncratic quirks. > > We've struggled with this a *lot* in trio, and eventually ended up giving > up on all forms of dynamic export cleverness; we've even banned the use of > __all__ entirely. Static analysis has gotten good enough that users won't > accept it not working, but it hasn't gotten good enough to handle anything > but the simplest static exports in a reliable way: > https://github.com/python-trio/trio/pull/316 > https://github.com/python-trio/trio/issues/542 > > The stdlib has more leeway because when tools don't work on the stdlib > then they tend to eventually add workarounds. I'm just saying, think twice > before diving into clever hacks to workaround static analysis limits, and > if you're going to do it then be careful to be thorough. You're basically > relying on undocumented bugs, and it gets really messy really quickly. > > -n > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (python.org/~guido) *Pronouns: he/him/his **(why is my pronoun here?)* -------------- next part -------------- An HTML attachment was scrubbed... URL: From storchaka at gmail.com Tue Apr 23 13:48:15 2019 From: storchaka at gmail.com (Serhiy Storchaka) Date: Tue, 23 Apr 2019 20:48:15 +0300 Subject: [Python-Dev] Proposal: dict.with_values(iterable) In-Reply-To: References: Message-ID: 12.04.19 19:17, Inada Naoki ????: > Maybe, collections.DictBuilder may be another option. e.g. > >>>> from collections import DictBuilder >>>> builder = DictBuilder(tuple("abc")) >>>> builder.build(range(3)) > {"a": 0, "b": 1, "c": 2} Nitpicking: this is rather a factory than a builder. The difference between the patterns is that you create a new builder object for every dict: builder = DictBuilder() builder['a'] = 0 builder['b'] = 1 builder['c'] = 2 result = builder.build() and create a fabric only for the whole class of dicts: factory = DictFactory(tuple("abc")) # only once ... result = factory(range(3)) I like the idea of a factory object more than the idea of the dict method. From ikamenshchikov at gmail.com Tue Apr 23 15:32:22 2019 From: ikamenshchikov at gmail.com (Ilya Kamenshchikov) Date: Tue, 23 Apr 2019 21:32:22 +0200 Subject: [Python-Dev] Concurrent.futures: no type discovery for PyCharm In-Reply-To: References: Message-ID: How would we answer the same question if it was not a part of stdlib? I am not sure it is fair to expect of Pycharm to parse / execute the __getattr__ on modules, as more elaborate implementation could even contain different types per some condition at the runtime or anything at all. The code: TYPE_CHECKING = False if TYPE_CHECKING: from .process import ProcessPoolExecutor from .thread import ThreadPoolExecutor works for type checking in PyCharm and is fast. This is how stdlib can be an example to how side libraries can be implemented. If we can agree that this is the only clear, performant and sufficient code - then perhaps modifying mypy is a reasonable price to pay. Perhaps this particular case can be just patched locally by PyCharm /JetBrains, but what is a general solution to this class of problems? 
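For reference, a minimal pure-Python sketch of the "factory" shape Serhiy describes above; DictFactory is purely an illustration, not a proposed stdlib API, and it has none of the key-sharing or speed benefits a C implementation would aim for:

    class DictFactory:
        def __init__(self, keys):
            self._keys = tuple(keys)

        def __call__(self, values):
            values = tuple(values)
            if len(values) != len(self._keys):
                raise ValueError("number of values does not match number of keys")
            return dict(zip(self._keys, values))

    factory = DictFactory(tuple("abc"))   # created once per record layout
    row = factory(range(3))               # {'a': 0, 'b': 1, 'c': 2}

The builder variant would instead be instantiated once per resulting dict, which is the distinction drawn above.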
Best Regards, -- Ilya Kamenshchikov On Tue, Apr 23, 2019 at 7:05 PM Guido van Rossum wrote: > In any case I think this should be filed (by the OP) as an issue against > JetBrains' PyCharm issue tracker. Who knows they may be able to > special-case this in a jiffy. I don't think we should add any clever hacks > to the stdlib for this. > > On Tue, Apr 23, 2019 at 9:59 AM Nathaniel Smith wrote: > >> On Tue, Apr 23, 2019, 05:09 Andrew Svetlov >> wrote: >> >>> I agree that `from typing import TYPE_CHECKING` is not desirable from >>> the import time reduction perspective. >>> >>> From my understanding code completion *can* be based on type hinting >>> to avoid actual code execution. >>> That's why I've mentioned that typeshed already has the correct type >>> information. >>> >>> if TYPE_CHECKING: >>> import ... >>> >>> requires mypy modification. >>> >>> if False: >>> import ... >>> >>> Works right now for stdlib (mypy ignores stdlib code but uses typeshed >>> anyway) but looks a little cryptic. >>> Requires a comprehensive comment at least. >>> >> >> Last time I looked at this, I'm pretty sure `if False` broke at least one >> popular static analysis tool (ie it was clever enough to ignore everything >> inside `if False`) ? I think either pylint or jedi? >> >> I'd suggest checking any clever hacks against at least: mypy, >> pylint/astroid, jedi, pyflakes, and pycharm. They all have their own static >> analysis engines, and each one has its own idiosyncratic quirks. >> >> We've struggled with this a *lot* in trio, and eventually ended up giving >> up on all forms of dynamic export cleverness; we've even banned the use of >> __all__ entirely. Static analysis has gotten good enough that users won't >> accept it not working, but it hasn't gotten good enough to handle anything >> but the simplest static exports in a reliable way: >> https://github.com/python-trio/trio/pull/316 >> https://github.com/python-trio/trio/issues/542 >> >> The stdlib has more leeway because when tools don't work on the stdlib >> then they tend to eventually add workarounds. I'm just saying, think twice >> before diving into clever hacks to workaround static analysis limits, and >> if you're going to do it then be careful to be thorough. You're basically >> relying on undocumented bugs, and it gets really messy really quickly. >> >> -n >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> https://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: >> https://mail.python.org/mailman/options/python-dev/guido%40python.org >> > > > -- > --Guido van Rossum (python.org/~guido) > *Pronouns: he/him/his **(why is my pronoun here?)* > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Tue Apr 23 15:43:54 2019 From: guido at python.org (Guido van Rossum) Date: Tue, 23 Apr 2019 12:43:54 -0700 Subject: [Python-Dev] Concurrent.futures: no type discovery for PyCharm In-Reply-To: References: Message-ID: The general solution is import typing if typing.TYPE_CHECKING: The hack of starting with TYPE_CHECKING = False happens to work but is not endorsed by PEP 484 so is not guaranteed for the future. Note that 3rd party code is rarely in such a critical part for script startup that the cost of `import typing` is too much. 
But the stdlib often *is* in the critical path for script startup, and some consider the time spent in that import too much (startup time should be in the order of tens of msec so every msec counts -- but once you start importing 3rd party code you basically can't make it that fast regardless). Anyway, the stdlib should almost never be used as an example for non-stdlib code -- there are many reasons for this that I don't want to have to repeat here. On Tue, Apr 23, 2019 at 12:33 PM Ilya Kamenshchikov < ikamenshchikov at gmail.com> wrote: > How would we answer the same question if it was not a part of stdlib? > I am not sure it is fair to expect of Pycharm to parse / execute the > __getattr__ on modules, as more elaborate implementation could even contain > different types per some condition at the runtime or anything at all. > The code: > > TYPE_CHECKING = False > if TYPE_CHECKING: > from .process import ProcessPoolExecutor > from .thread import ThreadPoolExecutor > > works for type checking in PyCharm and is fast. > > This is how stdlib can be an example to how side libraries can be implemented. If we can agree that this is the only clear, performant and sufficient code - then perhaps modifying mypy is a reasonable price to pay. > > Perhaps this particular case can be just patched locally by PyCharm > /JetBrains, but what is a general solution to this class of problems? > > Best Regards, > -- > Ilya Kamenshchikov > > > On Tue, Apr 23, 2019 at 7:05 PM Guido van Rossum wrote: > >> In any case I think this should be filed (by the OP) as an issue against >> JetBrains' PyCharm issue tracker. Who knows they may be able to >> special-case this in a jiffy. I don't think we should add any clever hacks >> to the stdlib for this. >> >> On Tue, Apr 23, 2019 at 9:59 AM Nathaniel Smith wrote: >> >>> On Tue, Apr 23, 2019, 05:09 Andrew Svetlov >>> wrote: >>> >>>> I agree that `from typing import TYPE_CHECKING` is not desirable from >>>> the import time reduction perspective. >>>> >>>> From my understanding code completion *can* be based on type hinting >>>> to avoid actual code execution. >>>> That's why I've mentioned that typeshed already has the correct type >>>> information. >>>> >>>> if TYPE_CHECKING: >>>> import ... >>>> >>>> requires mypy modification. >>>> >>>> if False: >>>> import ... >>>> >>>> Works right now for stdlib (mypy ignores stdlib code but uses typeshed >>>> anyway) but looks a little cryptic. >>>> Requires a comprehensive comment at least. >>>> >>> >>> Last time I looked at this, I'm pretty sure `if False` broke at least >>> one popular static analysis tool (ie it was clever enough to ignore >>> everything inside `if False`) ? I think either pylint or jedi? >>> >>> I'd suggest checking any clever hacks against at least: mypy, >>> pylint/astroid, jedi, pyflakes, and pycharm. They all have their own static >>> analysis engines, and each one has its own idiosyncratic quirks. >>> >>> We've struggled with this a *lot* in trio, and eventually ended up >>> giving up on all forms of dynamic export cleverness; we've even banned the >>> use of __all__ entirely. Static analysis has gotten good enough that users >>> won't accept it not working, but it hasn't gotten good enough to handle >>> anything but the simplest static exports in a reliable way: >>> https://github.com/python-trio/trio/pull/316 >>> https://github.com/python-trio/trio/issues/542 >>> >>> The stdlib has more leeway because when tools don't work on the stdlib >>> then they tend to eventually add workarounds. 
I'm just saying, think twice >>> before diving into clever hacks to workaround static analysis limits, and >>> if you're going to do it then be careful to be thorough. You're basically >>> relying on undocumented bugs, and it gets really messy really quickly. >>> >>> -n >>> _______________________________________________ >>> Python-Dev mailing list >>> Python-Dev at python.org >>> https://mail.python.org/mailman/listinfo/python-dev >>> Unsubscribe: >>> https://mail.python.org/mailman/options/python-dev/guido%40python.org >>> >> >> >> -- >> --Guido van Rossum (python.org/~guido) >> *Pronouns: he/him/his **(why is my pronoun here?)* >> >> > -- --Guido van Rossum (python.org/~guido) *Pronouns: he/him/his **(why is my pronoun here?)* -------------- next part -------------- An HTML attachment was scrubbed... URL: From levkivskyi at gmail.com Tue Apr 23 15:44:14 2019 From: levkivskyi at gmail.com (Ivan Levkivskyi) Date: Tue, 23 Apr 2019 20:44:14 +0100 Subject: [Python-Dev] Concurrent.futures: no type discovery for PyCharm In-Reply-To: References: Message-ID: Mypy doesn't use source code of stlib for analysis and instead uses stub files from typeshed. IIUC PyCharm can also do that (i.e. use the typeshed stubs). The whole idea of typeshed is to avoid changing stlib solely for the sake of static analysis. Please open an issue on typeshed an/or PyCharm tracker. -- Ivan On Tue, 23 Apr 2019 at 20:38, Ilya Kamenshchikov wrote: > How would we answer the same question if it was not a part of stdlib? > I am not sure it is fair to expect of Pycharm to parse / execute the > __getattr__ on modules, as more elaborate implementation could even contain > different types per some condition at the runtime or anything at all. > The code: > > TYPE_CHECKING = False > if TYPE_CHECKING: > from .process import ProcessPoolExecutor > from .thread import ThreadPoolExecutor > > works for type checking in PyCharm and is fast. > > This is how stdlib can be an example to how side libraries can be implemented. If we can agree that this is the only clear, performant and sufficient code - then perhaps modifying mypy is a reasonable price to pay. > > Perhaps this particular case can be just patched locally by PyCharm > /JetBrains, but what is a general solution to this class of problems? > > Best Regards, > -- > Ilya Kamenshchikov > > > On Tue, Apr 23, 2019 at 7:05 PM Guido van Rossum wrote: > >> In any case I think this should be filed (by the OP) as an issue against >> JetBrains' PyCharm issue tracker. Who knows they may be able to >> special-case this in a jiffy. I don't think we should add any clever hacks >> to the stdlib for this. >> >> On Tue, Apr 23, 2019 at 9:59 AM Nathaniel Smith wrote: >> >>> On Tue, Apr 23, 2019, 05:09 Andrew Svetlov >>> wrote: >>> >>>> I agree that `from typing import TYPE_CHECKING` is not desirable from >>>> the import time reduction perspective. >>>> >>>> From my understanding code completion *can* be based on type hinting >>>> to avoid actual code execution. >>>> That's why I've mentioned that typeshed already has the correct type >>>> information. >>>> >>>> if TYPE_CHECKING: >>>> import ... >>>> >>>> requires mypy modification. >>>> >>>> if False: >>>> import ... >>>> >>>> Works right now for stdlib (mypy ignores stdlib code but uses typeshed >>>> anyway) but looks a little cryptic. >>>> Requires a comprehensive comment at least. 
>>>> >>> >>> Last time I looked at this, I'm pretty sure `if False` broke at least >>> one popular static analysis tool (ie it was clever enough to ignore >>> everything inside `if False`) ? I think either pylint or jedi? >>> >>> I'd suggest checking any clever hacks against at least: mypy, >>> pylint/astroid, jedi, pyflakes, and pycharm. They all have their own static >>> analysis engines, and each one has its own idiosyncratic quirks. >>> >>> We've struggled with this a *lot* in trio, and eventually ended up >>> giving up on all forms of dynamic export cleverness; we've even banned the >>> use of __all__ entirely. Static analysis has gotten good enough that users >>> won't accept it not working, but it hasn't gotten good enough to handle >>> anything but the simplest static exports in a reliable way: >>> https://github.com/python-trio/trio/pull/316 >>> https://github.com/python-trio/trio/issues/542 >>> >>> The stdlib has more leeway because when tools don't work on the stdlib >>> then they tend to eventually add workarounds. I'm just saying, think twice >>> before diving into clever hacks to workaround static analysis limits, and >>> if you're going to do it then be careful to be thorough. You're basically >>> relying on undocumented bugs, and it gets really messy really quickly. >>> >>> -n >>> _______________________________________________ >>> Python-Dev mailing list >>> Python-Dev at python.org >>> https://mail.python.org/mailman/listinfo/python-dev >>> Unsubscribe: >>> https://mail.python.org/mailman/options/python-dev/guido%40python.org >>> >> >> >> -- >> --Guido van Rossum (python.org/~guido) >> *Pronouns: he/him/his **(why is my pronoun here?)* >> >> > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/levkivskyi%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ikamenshchikov at gmail.com Tue Apr 23 15:58:02 2019 From: ikamenshchikov at gmail.com (Ilya Kamenshchikov) Date: Tue, 23 Apr 2019 21:58:02 +0200 Subject: [Python-Dev] Concurrent.futures: no type discovery for PyCharm In-Reply-To: References: Message-ID: Ok thanks for explaining. I will proceed by trying it with typeshed. Best Regards, -- Ilya Kamenshchikov On Tue, Apr 23, 2019 at 9:44 PM Ivan Levkivskyi wrote: > Mypy doesn't use source code of stlib for analysis and instead uses stub > files from typeshed. IIUC PyCharm can also do that (i.e. use the typeshed > stubs). > The whole idea of typeshed is to avoid changing stlib solely for the sake > of static analysis. Please open an issue on typeshed an/or PyCharm tracker. > > -- > Ivan > > > > On Tue, 23 Apr 2019 at 20:38, Ilya Kamenshchikov > wrote: > >> How would we answer the same question if it was not a part of stdlib? >> I am not sure it is fair to expect of Pycharm to parse / execute the >> __getattr__ on modules, as more elaborate implementation could even contain >> different types per some condition at the runtime or anything at all. >> The code: >> >> TYPE_CHECKING = False >> if TYPE_CHECKING: >> from .process import ProcessPoolExecutor >> from .thread import ThreadPoolExecutor >> >> works for type checking in PyCharm and is fast. >> >> This is how stdlib can be an example to how side libraries can be implemented. If we can agree that this is the only clear, performant and sufficient code - then perhaps modifying mypy is a reasonable price to pay. 
>> >> Perhaps this particular case can be just patched locally by PyCharm >> /JetBrains, but what is a general solution to this class of problems? >> >> Best Regards, >> -- >> Ilya Kamenshchikov >> >> >> On Tue, Apr 23, 2019 at 7:05 PM Guido van Rossum >> wrote: >> >>> In any case I think this should be filed (by the OP) as an issue against >>> JetBrains' PyCharm issue tracker. Who knows they may be able to >>> special-case this in a jiffy. I don't think we should add any clever hacks >>> to the stdlib for this. >>> >>> On Tue, Apr 23, 2019 at 9:59 AM Nathaniel Smith wrote: >>> >>>> On Tue, Apr 23, 2019, 05:09 Andrew Svetlov >>>> wrote: >>>> >>>>> I agree that `from typing import TYPE_CHECKING` is not desirable from >>>>> the import time reduction perspective. >>>>> >>>>> From my understanding code completion *can* be based on type hinting >>>>> to avoid actual code execution. >>>>> That's why I've mentioned that typeshed already has the correct type >>>>> information. >>>>> >>>>> if TYPE_CHECKING: >>>>> import ... >>>>> >>>>> requires mypy modification. >>>>> >>>>> if False: >>>>> import ... >>>>> >>>>> Works right now for stdlib (mypy ignores stdlib code but uses typeshed >>>>> anyway) but looks a little cryptic. >>>>> Requires a comprehensive comment at least. >>>>> >>>> >>>> Last time I looked at this, I'm pretty sure `if False` broke at least >>>> one popular static analysis tool (ie it was clever enough to ignore >>>> everything inside `if False`) ? I think either pylint or jedi? >>>> >>>> I'd suggest checking any clever hacks against at least: mypy, >>>> pylint/astroid, jedi, pyflakes, and pycharm. They all have their own static >>>> analysis engines, and each one has its own idiosyncratic quirks. >>>> >>>> We've struggled with this a *lot* in trio, and eventually ended up >>>> giving up on all forms of dynamic export cleverness; we've even banned the >>>> use of __all__ entirely. Static analysis has gotten good enough that users >>>> won't accept it not working, but it hasn't gotten good enough to handle >>>> anything but the simplest static exports in a reliable way: >>>> https://github.com/python-trio/trio/pull/316 >>>> https://github.com/python-trio/trio/issues/542 >>>> >>>> The stdlib has more leeway because when tools don't work on the stdlib >>>> then they tend to eventually add workarounds. I'm just saying, think twice >>>> before diving into clever hacks to workaround static analysis limits, and >>>> if you're going to do it then be careful to be thorough. You're basically >>>> relying on undocumented bugs, and it gets really messy really quickly. >>>> >>>> -n >>>> _______________________________________________ >>>> Python-Dev mailing list >>>> Python-Dev at python.org >>>> https://mail.python.org/mailman/listinfo/python-dev >>>> Unsubscribe: >>>> https://mail.python.org/mailman/options/python-dev/guido%40python.org >>>> >>> >>> >>> -- >>> --Guido van Rossum (python.org/~guido) >>> *Pronouns: he/him/his **(why is my pronoun here?)* >>> >>> >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> https://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: >> https://mail.python.org/mailman/options/python-dev/levkivskyi%40gmail.com >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mark at hotpy.org Tue Apr 23 17:17:02 2019 From: mark at hotpy.org (Mark Shannon) Date: Tue, 23 Apr 2019 22:17:02 +0100 Subject: [Python-Dev] Proposal: dict.with_values(iterable) In-Reply-To: References: Message-ID: Hi, On 12/04/2019 2:44 pm, Inada Naoki wrote: > Hi, all. > > I propose adding new method: dict.with_values(iterable) You can already do something like this, if memory saving is the main concern. This should work on all versions from 3.3. def shared_keys_dict_maker(keys): class C: pass instance = C() for key in keys: for key in keys: setattr(instance, key, None) prototype = instance.__dict__ def maker(values): result = prototype.copy() result.update(zip(keys, values)) return result return maker m = shared_keys_dict_maker(('a', 'b')) >>> d1 = {'a':1, 'b':2} >>> print(sys.getsizeof(d1)) ... 248 >>> d2 = m((1,2)) >>> print(sys.getsizeof(d2)) ... 120 >>> d3 = m((None,"Hi")) >>> print(sys.getsizeof(d3)) ... 120 > > # Motivation > > Python is used to handle data. > While dict is not efficient way to handle may records, it is still > convenient way. > > When creating many dicts with same keys, dict need to > lookup internal hash table while inserting each keys. > > It is costful operation. If we can reuse existing keys of dict, > we can skip this inserting cost. > > Additionally, we have "Key-Sharing Dictionary (PEP 412)". > When all keys are string, many dict can share one key. > It reduces memory consumption. > > This might be usable for: > > * csv.DictReader > * namedtuple._asdict() > * DB-API 2.0 implementations: (e.g. DictCursor of mysqlclient-python) > > > # Draft implementation > > pull request: https://github.com/python/cpython/pull/12802 > > with_values(self, iterable, /) > Create a new dictionary with keys from this dict and values from iterable. > > When length of iterable is different from len(self), ValueError is raised. > This method does not support dict subclass. > > > ## Memory usage (Key-Sharing dict) > >>>> import sys >>>> keys = tuple("abcdefg") >>>> keys > ('a', 'b', 'c', 'd', 'e', 'f', 'g') >>>> d = dict(zip(keys, range(7))) >>>> d > {'a': 0, 'b': 1, 'c': 2, 'd': 3, 'e': 4, 'f': 5, 'g': 6} >>>> sys.getsizeof(d) > 360 > >>>> keys = dict.fromkeys("abcdefg") >>>> d = keys.with_values(range(7)) >>>> d > {'a': 0, 'b': 1, 'c': 2, 'd': 3, 'e': 4, 'f': 5, 'g': 6} >>>> sys.getsizeof(d) > 144 > > ## Speed > > $ ./python -m perf timeit -o zip_dict.json -s 'keys = > tuple("abcdefg"); values=[*range(7)]' 'dict(zip(keys, values))' > > $ ./python -m perf timeit -o with_values.json -s 'keys = > dict.fromkeys("abcdefg"); values=[*range(7)]' > 'keys.with_values(values)' > > $ ./python -m perf compare_to zip_dict.json with_values.json > Mean +- std dev: [zip_dict] 935 ns +- 9 ns -> [with_values] 109 ns +- > 2 ns: 8.59x faster (-88%) > > > How do you think? > Any comments are appreciated. > > Regards, > From vstinner at redhat.com Tue Apr 23 19:44:17 2019 From: vstinner at redhat.com (Victor Stinner) Date: Wed, 24 Apr 2019 01:44:17 +0200 Subject: [Python-Dev] Use C extensions compiled in release mode on a Python compiled in debug mode Message-ID: Hi, Two weeks ago, I started a thread "No longer enable Py_TRACE_REFS by default in debug build", but I lost myself in details, I forgot the main purpose of my proposal... Let me retry from scratch with a more explicit title: I would like to be able to run C extensions compiled in release mode on a Python compiled in debug mode ("pydebug"). 
The use case is to debug bugs in C extensions thanks to additional runtime checks of a Python debug build, and more generally get a better debugging experiences on Python. Even for pure Python, a debug build is useful (to get the Pyhon traceback in gdb using "py-bt" command). Currently, using a Python compiled in debug mode means to have to recompile C extensions in debug mode. Compile a C extension requires a C compiler, header files, pull dependencies, etc. It can be very complicated in practical (and pollute your system with all these additional dependencies). On Linux, it's already hard, but on Windows it can be even harder. Just one concrete example: no debug build of numpy is provided at https://pypi.org/project/numpy/ Good luck to build numpy in debug mode manually (install OpenBLAS, ATLAS, Fortran compiler, Cython, etc.) :-) -- The first requirement for the use case is that a Python debug build supports the ABI of a release build. The current blocker issue is that the Py_DEBUG define imply the Py_TRACE_REFS define: PyObject gets 2 extra fields (_ob_prev and _ob_next) which change the offset of all attributes of all objects and makes the ABI completely incompatible. I propose to no longer imply Py_TRACE_REFS *by default* (but keep the code): https://bugs.python.org/issue36465 https://github.com/python/cpython/pull/12615 (Py_TRACE_REFS would be a different ABI.) The second issue is that library filenames are different for a debug build: SOABI gets an additional "d" flag for Py_DEBUG. A debug build should first look for "NAME.cpython-38dm.so" (flags: "dm"), but then also look for "NAME.cpython-38m.so" (flags: "m"). The opposite is not possible: a debug build contains many additional functions missing from a release build. For Windows, maybe we should provide a Python compiled in debug mode with the same C Runtime than a Python compiled in release mode. Otherwise, the debug C Runtime is causing another ABI issue. Maybe pip could be enhanced to support installing C extensions compiled in release mode when using a debug mode. But that's more for convenience, it's not really required, since it is easy to switch the Python runtime between release and debug build. Apart of Py_TRACE_REFS, I'm not aware of other ABI differences in structures. I know that the COUNT_ALLOCS define changes the ABI, but it's not implied by Py_DEBUG: you have to opt-in for COUNT_ALLOCS. (I propose to do the same for Py_TRACE_REFS ;-)) Note: Refleaks buildbots don't use Py_TRACE_REFS to track memory leaks, only sys.gettotalrefcount(). -- Python debug build has many benefit. If you ignore C extensions, the debug build is usually compiled with compiler optimization disabled which makes debugging in gdb a much better experience. If you never tried: on a release build, most (if not all) variables are "" and it's really painful to basic debug functions like displaying the current Python frame. Assertions are removed in release modes, whereas they can detect a wide range of bugs way earlier: integer overflow, buffer under- and overflow, exceptions ignored silently, etc. Nobody likes to see a bug for the first time in production. For example, I modified Python 3.8 to now logs I/O errors when a file is closed implicitly, but only in debug or development mode. In release Python silently ignored EBADF error on such case, whereas it can lead to very nasty bugs causing Python to call abort() (which creates a coredump on Linux): see https://bugs.python.org/issue18748 ... 
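For readers who want to check which kind of build they are running, a few lines are enough today (the values shown are typical of a Unix build; Windows reports them differently):

    import sys, sysconfig

    print(hasattr(sys, "gettotalrefcount"))      # True only on a --with-pydebug build
    print(sysconfig.get_config_var("Py_DEBUG"))  # 1 on a debug build, 0 on a release build
    print(sysconfig.get_config_var("ABIFLAGS"))  # e.g. 'dm' (debug) vs 'm' (release) on 3.7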
DeprecationWarning and ResourceWarning are shown by default in debug mode :-) There are too many different additional checks done at runtime: I cannot list them all here. -- Being able to switch between Python in release mode and Python in debug mode is a first step. My long term plan would be to better separate "Python" from its "runtime". CPython in release mode would be one runtime, CPython in debug mode would be another runtime, PyPy can seeen as another runtime, etc. The more general idea is: "compile your C extension once and use any Python runtime". https://pythoncapi.readthedocs.io/runtimes.html#runtimes If you opt-in for the stable ABI, you can already switch between runtimes of different Python versions (ex: Python 3.6 or Python 3.8). Victor -- Night gathers, and now my watch begins. It shall not end until my death. From vstinner at redhat.com Tue Apr 23 20:09:27 2019 From: vstinner at redhat.com (Victor Stinner) Date: Wed, 24 Apr 2019 02:09:27 +0200 Subject: [Python-Dev] Use C extensions compiled in release mode on a Python compiled in debug mode In-Reply-To: References: Message-ID: Le mer. 24 avr. 2019 ? 01:44, Victor Stinner a ?crit : > The first requirement for the use case is that a Python debug build > supports the ABI of a release build. (...) I > propose to no longer imply Py_TRACE_REFS (...) > > Apart of Py_TRACE_REFS, I'm not aware of other ABI differences in > structures. (...) I tested manually: just by disabling Py_TRACE_REFS, the release ABI looks *fully* compatible with a debug build! I modified Python 3.7 to disable Py_TRACE_REFS and to omit "d" from SOABI when build in debug mode. I built Python in debug mode. I ran tests on numpy and lxml.etree: I can use .so libraries from /usr/lib64/python3.7/site-packages (compiled in release mode), it just works! I was very surprised of not getting any crash on such non trivial C extension, so I checked manually that I was running a debug build: yes, sys.gettotalrefcount is present :-) I also wanted to test an even more complex application: I installed gajim, a Jabber client written in Python 3 with PyGTK. It uses many C extensions. Running Gajim with my debug build is slower, well, that's not a surprise, but it works well! (no crash) About the SOABI, maybe we should only keep "d" when Py_TRACE_REFS is used, since technically, the ABI is same between release and debug mode without Py_TRACE_REFS. In that case, pip doesn't need to be modified ;-) If you also want to try, use: PYTHONPATH=/usr/lib64/python3.7/site-packages:/usr/lib/python3.7/site-packages ./python /usr/bin/gajim On a Python compiled with "./configure --with-pydebug && make" and the following patch: diff --git a/Include/object.h b/Include/object.h index bcf78afe6b..4c807981c4 100644 --- a/Include/object.h +++ b/Include/object.h @@ -51,13 +51,8 @@ A standard interface exists for objects that contain an array of items whose size is determined when the object is allocated. */ -/* Py_DEBUG implies Py_TRACE_REFS. */ -#if defined(Py_DEBUG) && !defined(Py_TRACE_REFS) -#define Py_TRACE_REFS -#endif - -/* Py_TRACE_REFS implies Py_REF_DEBUG. */ -#if defined(Py_TRACE_REFS) && !defined(Py_REF_DEBUG) +/* Py_DEBUG implies Py_REF_DEBUG. 
*/ +#if defined(Py_DEBUG) && !defined(Py_REF_DEBUG) #define Py_REF_DEBUG #endif diff --git a/configure b/configure index 2db11e6e86..7271e9de40 100755 --- a/configure +++ b/configure @@ -6365,7 +6365,6 @@ $as_echo "#define Py_DEBUG 1" >>confdefs.h { $as_echo "$as_me:${as_lineno-$LINENO}: result: yes" >&5 $as_echo "yes" >&6; }; Py_DEBUG='true' - ABIFLAGS="${ABIFLAGS}d" else { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5 $as_echo "no" >&6; }; Py_DEBUG='false' fi diff --git a/configure.ac b/configure.ac index e5fb7e7b0b..fa4bb1944f 100644 --- a/configure.ac +++ b/configure.ac @@ -1246,7 +1246,6 @@ then [Define if you want to build an interpreter with many run-time checks.]) AC_MSG_RESULT(yes); Py_DEBUG='true' - ABIFLAGS="${ABIFLAGS}d" else AC_MSG_RESULT(no); Py_DEBUG='false' fi], [AC_MSG_RESULT(no)]) Victor From vano at mail.mipt.ru Tue Apr 23 20:50:25 2019 From: vano at mail.mipt.ru (Ivan Pozdeev) Date: Wed, 24 Apr 2019 03:50:25 +0300 Subject: [Python-Dev] Use C extensions compiled in release mode on a Python compiled in debug mode In-Reply-To: References: Message-ID: <73aa488a-614a-25a7-df7c-94f1fd285c9f@mail.mipt.ru> On 24.04.2019 2:44, Victor Stinner wrote: > Hi, > > Two weeks ago, I started a thread "No longer enable Py_TRACE_REFS by > default in debug build", but I lost myself in details, I forgot the > main purpose of my proposal... > > Let me retry from scratch with a more explicit title: I would like to > be able to run C extensions compiled in release mode on a Python > compiled in debug mode ("pydebug"). This is going to be impossible because debug Python links against debug C runtime which is binary incompatible with the release one (at least, in Windows). > The use case is to debug bugs in C > extensions thanks to additional runtime checks of a Python debug > build, and more generally get a better debugging experiences on > Python. Even for pure Python, a debug build is useful (to get the > Pyhon traceback in gdb using "py-bt" command). That said, debug vs release extension compilation is currently bugged. It's impossible to make a debug build of an extension against a release Python (linked against release runtime, so not fully debug, just without optimizations) and vice versa. pip fails to build extensions for a debug Python for the same reason. I've no idea how (and if at all) people manage to diagnose problems in extensions. https://bugs.python.org/issue33637 > > Currently, using a Python compiled in debug mode means to have to > recompile C extensions in debug mode. Compile a C extension requires a > C compiler, header files, pull dependencies, etc. It can be very > complicated in practical (and pollute your system with all these > additional dependencies). On Linux, it's already hard, but on Windows > it can be even harder. > > Just one concrete example: no debug build of numpy is provided at > https://pypi.org/project/numpy/ Good luck to build numpy in debug mode > manually (install OpenBLAS, ATLAS, Fortran compiler, Cython, etc.) > :-) The above paragraph is probably the reason ;-) > > -- > > The first requirement for the use case is that a Python debug build > supports the ABI of a release build. The current blocker issue is that > the Py_DEBUG define imply the Py_TRACE_REFS define: PyObject gets 2 > extra fields (_ob_prev and _ob_next) which change the offset of all > attributes of all objects and makes the ABI completely incompatible. 
I > propose to no longer imply Py_TRACE_REFS *by default* (but keep the > code): > > https://bugs.python.org/issue36465 > https://github.com/python/cpython/pull/12615 > > (Py_TRACE_REFS would be a different ABI.) > > The second issue is that library filenames are different for a debug > build: SOABI gets an additional "d" flag for Py_DEBUG. A debug build > should first look for "NAME.cpython-38dm.so" (flags: "dm"), but then > also look for "NAME.cpython-38m.so" (flags: "m"). The opposite is not > possible: a debug build contains many additional functions missing > from a release build. > > For Windows, maybe we should provide a Python compiled in debug mode > with the same C Runtime than a Python compiled in release mode. > Otherwise, the debug C Runtime is causing another ABI issue. > > Maybe pip could be enhanced to support installing C extensions > compiled in release mode when using a debug mode. But that's more for > convenience, it's not really required, since it is easy to switch the > Python runtime between release and debug build. > > Apart of Py_TRACE_REFS, I'm not aware of other ABI differences in > structures. I know that the COUNT_ALLOCS define changes the ABI, but > it's not implied by Py_DEBUG: you have to opt-in for COUNT_ALLOCS. (I > propose to do the same for Py_TRACE_REFS ;-)) > > Note: Refleaks buildbots don't use Py_TRACE_REFS to track memory > leaks, only sys.gettotalrefcount(). > > -- > > Python debug build has many benefit. If you ignore C extensions, the > debug build is usually compiled with compiler optimization disabled > which makes debugging in gdb a much better experience. If you never > tried: on a release build, most (if not all) variables are " out>" and it's really painful to basic debug functions like displaying > the current Python frame. > > Assertions are removed in release modes, whereas they can detect a > wide range of bugs way earlier: integer overflow, buffer under- and > overflow, exceptions ignored silently, etc. Nobody likes to see a bug > for the first time in production. For example, I modified Python 3.8 > to now logs I/O errors when a file is closed implicitly, but only in > debug or development mode. In release Python silently ignored EBADF > error on such case, whereas it can lead to very nasty bugs causing > Python to call abort() (which creates a coredump on Linux): see > https://bugs.python.org/issue18748 ... > > DeprecationWarning and ResourceWarning are shown by default in debug mode :-) > > There are too many different additional checks done at runtime: I > cannot list them all here. > > -- > > Being able to switch between Python in release mode and Python in > debug mode is a first step. My long term plan would be to better > separate "Python" from its "runtime". CPython in release mode would be > one runtime, CPython in debug mode would be another runtime, PyPy can > seeen as another runtime, etc. The more general idea is: "compile your > C extension once and use any Python runtime". > > https://pythoncapi.readthedocs.io/runtimes.html#runtimes > > If you opt-in for the stable ABI, you can already switch between > runtimes of different Python versions (ex: Python 3.6 or Python 3.8). > > Victor > -- > Night gathers, and now my watch begins. It shall not end until my death. 
> _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/vano%40mail.mipt.ru -- Regards, Ivan From vstinner at redhat.com Tue Apr 23 21:04:49 2019 From: vstinner at redhat.com (Victor Stinner) Date: Wed, 24 Apr 2019 03:04:49 +0200 Subject: [Python-Dev] Use C extensions compiled in release mode on a Python compiled in debug mode In-Reply-To: References: Message-ID: Le mer. 24 avr. 2019 ? 01:44, Victor Stinner a ?crit : > The current blocker issue is that > the Py_DEBUG define imply the Py_TRACE_REFS define (...): > > https://bugs.python.org/issue36465 > https://github.com/python/cpython/pull/12615 I updated my PR: """ Release build and debug build are now ABI compatible: the Py_DEBUG define no longer implies Py_TRACE_REFS define which introduces the only ABI incompatibility. A new "./configure --with-trace-refs" build option is now required to get Py_TRACE_REFS define which adds sys.getobjects() function and PYTHONDUMPREFS environment variable. Changes: * Add ./configure --with-trace-refs * Py_DEBUG no longer implies Py_TRACE_REFS * The "d" flag of SOABI (sys.implementation.cache_tag) is now only added by --with-trace-refs. It is no longer added by --with-pydebug. """ > Maybe pip could be enhanced to support installing C extensions > compiled in release mode when using a debug mode. In fact, pip doesn't have to be modified. I "fixed" sys.implementation.cache_tag by removing "d" in debug mode instead ;-) By the way, the "m" ABI flag for pymalloc is outdated. I proposed the following change to simply remove it: https://bugs.python.org/issue36707 https://github.com/python/cpython/pull/12931/files With my PR 12931 and my PR 12615, the only remaining ABI flag which be "d" which would only be enabled by ./configure --with-trace-refs, whereas ./configure --with-pydebug has no more effect on SOABI (sys.implementation.cache_tag). Victor -- Night gathers, and now my watch begins. It shall not end until my death. From songofacandy at gmail.com Tue Apr 23 23:13:43 2019 From: songofacandy at gmail.com (Inada Naoki) Date: Wed, 24 Apr 2019 12:13:43 +0900 Subject: [Python-Dev] Proposal: dict.with_values(iterable) In-Reply-To: References: Message-ID: On Wed, Apr 24, 2019 at 6:17 AM Mark Shannon wrote: > > Hi, > > On 12/04/2019 2:44 pm, Inada Naoki wrote: > > Hi, all. > > > > I propose adding new method: dict.with_values(iterable) > > You can already do something like this, if memory saving is the main > concern. This should work on all versions from 3.3. > Of course, performance is main concern too. -- Inada Naoki From J.Demeyer at UGent.be Wed Apr 24 03:24:47 2019 From: J.Demeyer at UGent.be (Jeroen Demeyer) Date: Wed, 24 Apr 2019 09:24:47 +0200 Subject: [Python-Dev] Use C extensions compiled in release mode on a Python compiled in debug mode In-Reply-To: References: Message-ID: <5CC00F3F.9090602@UGent.be> On 2019-04-24 01:44, Victor Stinner wrote: > I would like to > be able to run C extensions compiled in release mode on a Python > compiled in debug mode That seems like a very good idea. I would certainly use the debug mode while developing CPython or C extensions. 
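To make the filename tags in this thread concrete, these are the build variables that change between the two builds; the exact strings below are from a typical Linux CPython 3.7 and vary per platform:

    import sysconfig

    print(sysconfig.get_config_var("SOABI"))
    # release build:  cpython-37m-x86_64-linux-gnu
    # debug build:    cpython-37dm-x86_64-linux-gnu   (the extra 'd' is the ABI flag discussed above)

    print(sysconfig.get_config_var("EXT_SUFFIX"))
    # '.cpython-37m-x86_64-linux-gnu.so' -- the suffix a release extension gets;
    # a debug interpreter currently looks only for the 'd' variant of this name.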
From vano at mail.mipt.ru Wed Apr 24 07:44:35 2019 From: vano at mail.mipt.ru (Ivan Pozdeev) Date: Wed, 24 Apr 2019 14:44:35 +0300 Subject: [Python-Dev] Use C extensions compiled in release mode on a Python compiled in debug mode In-Reply-To: <73aa488a-614a-25a7-df7c-94f1fd285c9f@mail.mipt.ru> References: <73aa488a-614a-25a7-df7c-94f1fd285c9f@mail.mipt.ru> Message-ID: <8d9ffb09-2a9a-c07a-357d-e7a35b22909d@mail.mipt.ru> On 24.04.2019 3:50, Ivan Pozdeev via Python-Dev wrote: > On 24.04.2019 2:44, Victor Stinner wrote: >> Hi, >> >> Two weeks ago, I started a thread "No longer enable Py_TRACE_REFS by >> default in debug build", but I lost myself in details, I forgot the >> main purpose of my proposal... >> >> Let me retry from scratch with a more explicit title: I would like to >> be able to run C extensions compiled in release mode on a Python >> compiled in debug mode ("pydebug"). > > This is going to be impossible because debug Python links against debug C runtime which is binary incompatible with the release one (at > least, in Windows). To elaborate: As per https://stackoverflow.com/questions/37541210/whats-the-difference-in-usage-between-shared-libraries-built-in-debug-and-relea/37580323#37580323 , Problems will occur if you have two modules that 1. use different versions or binary representations of a type and 2. exchange objects of that type Now, I trust Victor has ensured no discrepancies in explicitly exchanged types. But I'm not sure if Python and the extension still rely on implicitly sharing some C runtime entities. (In Py2, that would at least be descriptor table that MSVCRT maintains privately but Py3 doesn't rely on it AFAIK). >> The use case is to debug bugs in C >> extensions thanks to additional runtime checks of a Python debug >> build, and more generally get a better debugging experiences on >> Python. Even for pure Python, a debug build is useful (to get the >> Pyhon traceback in gdb using "py-bt" command). > That said, debug vs release extension compilation is currently bugged. It's impossible to make a debug build of an extension against a > release Python (linked against release runtime, so not fully debug, just without optimizations) and vice versa. pip fails to build > extensions for a debug Python for the same reason. I've no idea how (and if at all) people manage to diagnose problems in extensions. > https://bugs.python.org/issue33637 >> >> Currently, using a Python compiled in debug mode means to have to >> recompile C extensions in debug mode. Compile a C extension requires a >> C compiler, header files, pull dependencies, etc. It can be very >> complicated in practical (and pollute your system with all these >> additional dependencies). On Linux, it's already hard, but on Windows >> it can be even harder. >> >> Just one concrete example: no debug build of numpy is provided at >> https://pypi.org/project/numpy/ Good luck to build numpy in debug mode >> manually (install OpenBLAS, ATLAS, Fortran compiler, Cython, etc.) >> :-) > The above paragraph is probably the reason ;-) >> >> -- >> >> The first requirement for the use case is that a Python debug build >> supports the ABI of a release build. The current blocker issue is that >> the Py_DEBUG define imply the Py_TRACE_REFS define: PyObject gets 2 >> extra fields (_ob_prev and _ob_next) which change the offset of all >> attributes of all objects and makes the ABI completely incompatible. 
I >> propose to no longer imply Py_TRACE_REFS *by default* (but keep the >> code): >> >> https://bugs.python.org/issue36465 >> https://github.com/python/cpython/pull/12615 >> >> (Py_TRACE_REFS would be a different ABI.) >> >> The second issue is that library filenames are different for a debug >> build: SOABI gets an additional "d" flag for Py_DEBUG. A debug build >> should first look for "NAME.cpython-38dm.so" (flags: "dm"), but then >> also look for "NAME.cpython-38m.so" (flags: "m"). The opposite is not >> possible: a debug build contains many additional functions missing >> from a release build. >> >> For Windows, maybe we should provide a Python compiled in debug mode >> with the same C Runtime than a Python compiled in release mode. >> Otherwise, the debug C Runtime is causing another ABI issue. >> >> Maybe pip could be enhanced to support installing C extensions >> compiled in release mode when using a debug mode. But that's more for >> convenience, it's not really required, since it is easy to switch the >> Python runtime between release and debug build. >> >> Apart of Py_TRACE_REFS, I'm not aware of other ABI differences in >> structures. I know that the COUNT_ALLOCS define changes the ABI, but >> it's not implied by Py_DEBUG: you have to opt-in for COUNT_ALLOCS. (I >> propose to do the same for Py_TRACE_REFS ;-)) >> >> Note: Refleaks buildbots don't use Py_TRACE_REFS to track memory >> leaks, only sys.gettotalrefcount(). >> >> -- >> >> Python debug build has many benefit. If you ignore C extensions, the >> debug build is usually compiled with compiler optimization disabled >> which makes debugging in gdb a much better experience. If you never >> tried: on a release build, most (if not all) variables are "> out>" and it's really painful to basic debug functions like displaying >> the current Python frame. >> >> Assertions are removed in release modes, whereas they can detect a >> wide range of bugs way earlier: integer overflow, buffer under- and >> overflow, exceptions ignored silently, etc. Nobody likes to see a bug >> for the first time in production. For example, I modified Python 3.8 >> to now logs I/O errors when a file is closed implicitly, but only in >> debug or development mode. In release Python silently ignored EBADF >> error on such case, whereas it can lead to very nasty bugs causing >> Python to call abort() (which creates a coredump on Linux): see >> https://bugs.python.org/issue18748 ... >> >> DeprecationWarning and ResourceWarning are shown by default in debug mode :-) >> >> There are too many different additional checks done at runtime: I >> cannot list them all here. >> >> -- >> >> Being able to switch between Python in release mode and Python in >> debug mode is a first step. My long term plan would be to better >> separate "Python" from its "runtime". CPython in release mode would be >> one runtime, CPython in debug mode would be another runtime, PyPy can >> seeen as another runtime, etc. The more general idea is: "compile your >> C extension once and use any Python runtime". >> >> https://pythoncapi.readthedocs.io/runtimes.html#runtimes >> >> If you opt-in for the stable ABI, you can already switch between >> runtimes of different Python versions (ex: Python 3.6 or Python 3.8). >> >> Victor >> -- >> Night gathers, and now my watch begins. It shall not end until my death. 
>> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> https://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: https://mail.python.org/mailman/options/python-dev/vano%40mail.mipt.ru > -- Regards, Ivan From solipsis at pitrou.net Wed Apr 24 10:03:22 2019 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 24 Apr 2019 16:03:22 +0200 Subject: [Python-Dev] Use C extensions compiled in release mode on a Python compiled in debug mode References: Message-ID: <20190424160322.63323ce1@fsol> On Wed, 24 Apr 2019 01:44:17 +0200 Victor Stinner wrote: > > The first requirement for the use case is that a Python debug build > supports the ABI of a release build. The current blocker issue is that > the Py_DEBUG define imply the Py_TRACE_REFS define: PyObject gets 2 > extra fields (_ob_prev and _ob_next) which change the offset of all > attributes of all objects and makes the ABI completely incompatible. I > propose to no longer imply Py_TRACE_REFS *by default* (but keep the > code): > > https://bugs.python.org/issue36465 > https://github.com/python/cpython/pull/12615 +1 from me. > The second issue is that library filenames are different for a debug > build: SOABI gets an additional "d" flag for Py_DEBUG. A debug build > should first look for "NAME.cpython-38dm.so" (flags: "dm"), but then > also look for "NAME.cpython-38m.so" (flags: "m"). Sounds fair (but only on Unix, I guess). > Maybe pip could be enhanced to support installing C extensions > compiled in release mode when using a debug mode. But that's more for > convenience, it's not really required, since it is easy to switch the > Python runtime between release and debug build. Not sure what you mean by "easy to switch the Python runtime". As soon as I want to use pip, I have to use a release build, right? Regards Antoine. From vstinner at redhat.com Wed Apr 24 12:02:18 2019 From: vstinner at redhat.com (Victor Stinner) Date: Wed, 24 Apr 2019 18:02:18 +0200 Subject: [Python-Dev] Use C extensions compiled in release mode on a Python compiled in debug mode In-Reply-To: References: Message-ID: Hum, I found issues with libpython: C extensions are explicitly linked to libpython built in release mode. So a debug python loading a C extension may load libpython in release mode, whereas libpython in debug mode is already loaded. When Python is built with --enable-shared, the python3.7 program is linked to libpython3.7m.so.1.0 on Linux. C extensions are explicitly linked to libpython3.7m as well: $ python3.7-config --ldflags ... -lpython3.7m ... Example with numpy: $ ldd /usr/lib64/python3.7/site-packages/numpy/core/umath.cpython-37m-x86_64-linux-gnu.so ... libpython3.7m.so.1.0 => /lib64/libpython3.7m.so.1.0 (...) ... When Python 3.7 is compiled in debug mode, libpython gets a "d" flag for debug: libpython3.7dm.so.1.0. I see 2 solutions: (1) Use a different directory. If "libpython" gets the same filename in release and debug mode, at least, they must be installed in different directories. If libpython build in debug mode is installed in /usr/lib64/python3.7-dbg/ for example, python3.7-dbg should be compiled with -rpath /usr/lib64/python3.7-dbg/ to get the debug libpython. (2) If "libpython" gets a different filename in debug mode, C extensions should not be linked to libpython explicitly but *implicitly* to avoid picking the wrong libpython. For example, remove "-lpython3.7m" from "python3.7-config --ldflags" output. 
The option (1) rely on rpath which is discouraged by Linux vendors and may not be supported by all operating systems. The option (2) is simpler and likely more portable. Currently, C extensions of the standard library may or may not be linked to libpython depending on they are built. In practice, both work since python3.7 is already linked to libpython: so libpython is already loaded in memory before C extensions are loaded. I opened https://bugs.python.org/issue34814 to discuss how C extensions of the standard library should be linked but I closed it because we failed to find a consensus and the initial use case became a non-issue. It seems like we should reopen the discussion :-) Victor From solipsis at pitrou.net Wed Apr 24 13:00:21 2019 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 24 Apr 2019 19:00:21 +0200 Subject: [Python-Dev] Use C extensions compiled in release mode on a Python compiled in debug mode References: Message-ID: <20190424190021.6e4852ec@fsol> On Wed, 24 Apr 2019 18:02:18 +0200 Victor Stinner wrote: > > I see 2 solutions: > > (1) Use a different directory. If "libpython" gets the same filename > in release and debug mode, at least, they must be installed in > different directories. If libpython build in debug mode is installed > in /usr/lib64/python3.7-dbg/ for example, python3.7-dbg should be > compiled with -rpath /usr/lib64/python3.7-dbg/ to get the debug > libpython. > > (2) If "libpython" gets a different filename in debug mode, C > extensions should not be linked to libpython explicitly but > *implicitly* to avoid picking the wrong libpython. For example, remove > "-lpython3.7m" from "python3.7-config --ldflags" output. > > The option (1) rely on rpath which is discouraged by Linux vendors and > may not be supported by all operating systems. > > The option (2) is simpler and likely more portable. > > Currently, C extensions of the standard library may or may not be > linked to libpython depending on they are built. In practice, both > work since python3.7 is already linked to libpython: so libpython is > already loaded in memory before C extensions are loaded. You can participate in https://bugs.python.org/issue21536 Regards Antoine. From stefan_ml at behnel.de Wed Apr 24 13:22:02 2019 From: stefan_ml at behnel.de (Stefan Behnel) Date: Wed, 24 Apr 2019 19:22:02 +0200 Subject: [Python-Dev] Use C extensions compiled in release mode on a Python compiled in debug mode In-Reply-To: <5CC00F3F.9090602@UGent.be> References: <5CC00F3F.9090602@UGent.be> Message-ID: Jeroen Demeyer schrieb am 24.04.19 um 09:24: > On 2019-04-24 01:44, Victor Stinner wrote: >> I would like to >> be able to run C extensions compiled in release mode on a Python >> compiled in debug mode > > That seems like a very good idea. I would certainly use the debug mode > while developing CPython or C extensions. +1 Stefan From guido at python.org Wed Apr 24 14:23:40 2019 From: guido at python.org (Guido van Rossum) Date: Wed, 24 Apr 2019 11:23:40 -0700 Subject: [Python-Dev] Typing Summit at PyCon US In-Reply-To: References: Message-ID: I'd like to remind everyone that in 8 days, at PyCon US in Cleveland, we'll have the Typing Summit (the day after the Language Summit). There's still room to register ! So far I've received just under 20 registrations -- there's room for at least 20 more! The summit is for both developers and users of static type checkers for Python. 
Topics will include (not necessarily in this order): Michael Sullivan: Annotation growth at Dropbox, and how we used mypyc to speed up mypy 4x. Jelle Zijlstra: The future of typeshed. Jukka Lehtosalo: Modular typeshed. Ivan Levkivskyi: Typing and mypy usability. Andrey Vlasovskikh: Incremental static analysis in PyCharm. Guido van Rossum: Overview of upcoming typing PEPs (544: Protocols; 586: Literal; 589: TypedDict; 591: Final). There's also room to discuss more speculative changes to the type system, especially changes needed to support numpy, such as integer generics and variadic type variables, and special cases for wrapper functions using *(*args, **kwds). I'm looking for volunteers to speak about these topics. On Fri, Mar 22, 2019 at 11:23 AM Guido van Rossum wrote: > The typing summit is primarily a place for developers of type checkers to > collaborate, but we are also inviting (potential) users of type checkers. > For example, there are plans to extend the standard Python type system with > features intended to support numpy, Pandas, tensorflow and similar > libraries, and we will discuss these at the summit. Therefore developers > and power-users of such frameworks are especially welcome at the summit. > > With Ewa's and Dropbox's help I've arranged a room at PyCon. > > > *When: Thursday May 2nd, 1-5 pm (i.e. the day between the Language Summit > and the conference proper)* > *Where: Room 6 at PyCon in Cleveland* > > If you're planning to attend, please fill out this form: > > https://goo.gl/forms/rG9dVTBbgyBgDK8H2 > -- --Guido van Rossum (python.org/~guido) *Pronouns: he/him/his **(why is my pronoun here?)* -------------- next part -------------- An HTML attachment was scrubbed... URL: From vano at mail.mipt.ru Wed Apr 24 14:55:21 2019 From: vano at mail.mipt.ru (Ivan Pozdeev) Date: Wed, 24 Apr 2019 21:55:21 +0300 Subject: [Python-Dev] Use C extensions compiled in release mode on a Python compiled in debug mode In-Reply-To: <20190424160322.63323ce1@fsol> References: <20190424160322.63323ce1@fsol> Message-ID: On 24.04.2019 17:03, Antoine Pitrou wrote: > On Wed, 24 Apr 2019 01:44:17 +0200 > Victor Stinner wrote: >> The first requirement for the use case is that a Python debug build >> supports the ABI of a release build. The current blocker issue is that >> the Py_DEBUG define imply the Py_TRACE_REFS define: PyObject gets 2 >> extra fields (_ob_prev and _ob_next) which change the offset of all >> attributes of all objects and makes the ABI completely incompatible. I >> propose to no longer imply Py_TRACE_REFS *by default* (but keep the >> code): >> >> https://bugs.python.org/issue36465 >> https://github.com/python/cpython/pull/12615 > +1 from me. > >> The second issue is that library filenames are different for a debug >> build: SOABI gets an additional "d" flag for Py_DEBUG. A debug build >> should first look for "NAME.cpython-38dm.so" (flags: "dm"), but then >> also look for "NAME.cpython-38m.so" (flags: "m"). > Sounds fair (but only on Unix, I guess). > >> Maybe pip could be enhanced to support installing C extensions >> compiled in release mode when using a debug mode. But that's more for >> convenience, it's not really required, since it is easy to switch the >> Python runtime between release and debug build. > Not sure what you mean by "easy to switch the Python runtime". As soon > as I want to use pip, I have to use a release build, right? No, pip works with a debug Python just as well (python.bat -m ensurepip) and installs modules to `/site-packages` IIRC. 
But building extensions is broken in this case as per https://mail.python.org/pipermail/python-dev/2019-April/157180.html . > Regards > > Antoine. > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/vano%40mail.mipt.ru -- Regards, Ivan From nas-python at arctrix.com Wed Apr 24 14:55:01 2019 From: nas-python at arctrix.com (Neil Schemenauer) Date: Wed, 24 Apr 2019 12:55:01 -0600 Subject: [Python-Dev] Use C extensions compiled in release mode on a Python compiled in debug mode In-Reply-To: References: Message-ID: <20190424185501.j6kjdbmb4adj4x6x@python.ca> On 2019-04-24, Victor Stinner wrote: > The current blocker issue is that the Py_DEBUG define imply the > Py_TRACE_REFS define I think your change to make Py_TRACE_REFS as separate configure flag is fine. I've used the trace fields to debug occasionally but I don't use it often enough to need it enabled by Py_DEBUG. > Being able to switch between Python in release mode and Python in > debug mode is a first step. My long term plan would be to better > separate "Python" from its "runtime". Regarding the Py_TRACE_REFS fields, I think we can't do them without breaking the ABI because of the following. For GC objects, they are always allocated by _PyObject_GC_New/_PyObject_GC_NewVar. So, we can allocate the extra space needed for the GC linked list. For non-GC objects, that's not the case. Extensions can allocate using malloc() directly or their own allocator and then pass that memory to be initialized as a PyObject. I think that's a poor design and I think we should try to make slow progress in fixing it. I think non-GC objects should also get allocated by a Python API. In that case, the Py_TRACE_REFS functionality could be implemented in a way that doesn't break the ABI. It also makes the CPython API more friendly for alternative Python runtimes like PyPy, etc. Note that this change would not prevent an extension from allocating memory with it's own allocator. It just means that memory can't hold a PyObject. The extension PyObject would need to have a pointer that points to this externally allocated memory. I can imagine there could be some situations when people really want a PyObject to reside in a certain memory location. E.g. maybe you have some kind of special shared memory area. In that case, I think we could have specialized APIs to create PyObjects using a specialized allocator. Those APIs would not be supported by some runtimes (e.g. tracing/moving GC for PyObjects) and the APIs would not be used by most extensions. Regards, Neil From pviktori at redhat.com Wed Apr 24 18:24:15 2019 From: pviktori at redhat.com (Petr Viktorin) Date: Wed, 24 Apr 2019 18:24:15 -0400 Subject: [Python-Dev] PEP 580/590 discussion In-Reply-To: References: <6c8356e3-f9b2-c39e-63c4-17f146d326b7@hotpy.org> <15b8a3d7-00ed-a5eb-475c-a3adee671b5f@hotpy.org> <5C9FEF82.50207@UGent.be> <421f8182-4bc8-b8cf-82d6-ca4a4fbd2013@hotpy.org> <50d675b4-839c-6502-ad1a-a33ea9330000@redhat.com> Message-ID: So, I spent another day pondering the PEPs. I love PEP 590's simplicity and PEP 580's extensibility. As I hinted before, I hope they can they be combined, and I believe we can achieve that by having PEP 590's (o+offset) point not just to function pointer, but to a {function pointer; flags} struct with flags defined for two optimizations: - "Method-like", i.e. compatible with LOAD_METHOD/CALL_METHOD. 
- "Argument offsetting request", allowing PEP 590's PY_VECTORCALL_ARGUMENTS_OFFSET optimization. This would mean one basic call signature (today's METH_FASTCALL | METH_KEYWORD), with individual optimizations available if both the caller and callee support them. In case you want to know my thoughts or details, let me indulge in some detailed comparisons and commentary that led to this. I also give a more detailed proposal below. Keep in mind I wrote this before I distilled it to the paragraph above, and though the distillation is written as a diff to PEP 590, I still think of this as merging both PEPs. PEP 580 tries hard to work with existing call conventions (like METH_O, METH_VARARGS), making them fast. PEP 590 just defines a new convention. Basically, any callable that wants performance improvements must switch to METH_VECTORCALL (fastcall). I believe PEP 590's approach is OK. To stay as performant as possible, C extension authors will need to adapt their code regularly. If they don't, no harm -- the code will still work as before, and will still be about as fast as it was before. In exchange for this, Python (and Cython, etc.) can focus on optimizing one calling convention, rather than a variety, each with its own advantages and drawbacks. Extending PEP 580 to support a new calling convention will involve defining a new CCALL_* constant, and adding to existing dispatch code. Extending PEP 590 to support a new calling convention will most likely require a new type flag, and either changing the vectorcall semantics or adding a new pointer. To be a bit more concrete, I think of possible extensions to PEP 590 as things like: - Accepting a kwarg dict directly, without copying the items to tuple/array (as in PEP 580's CCALL_VARARGS|CCALL_KEYWORDS) - Prepending more than one positional argument, or appending positional arguments - When an optimization like LOAD_METHOD/CALL_METHOD turns out to no longer be relevant, removing it to simplify/speed up code. I expect we'll later find out that something along these lines might improve performance. PEP 590 would make it hard to experiment. I mentally split PEP 590 into two pieces: formalizing fastcall, plus one major "extension" -- making bound methods fast. When seen this way, this "extension" is quite heavy: it adds an additional type flag, Py_TPFLAGS_METHOD_DESCRIPTOR, and uses a bit in the "Py_ssize_t nargs" argument as additional flag. Both type flags and nargs bits are very limited resources. If I was sure vectorcall is the final best implementation we'll have, I'd go and approve it ? but I think we still need room for experimentation, in the form of more such extensions. PEP 580, with its collection of per-instance data and flags, is definitely more extensible. What I don't like about it is that it has the extensions built-in; mandatory for all callers/callees. PEP 580 adds a common data struct to callable instances. Currently these are all data bound methods want to use (cc_flags, cc_func, cc_parent, cr_self). Various flags are consulted in order to deliver the needed info to the underlying function. PEP 590 lets the callable object store data it needs independently. It provides a clever mechanism for pre-allocating space for bound methods' prepended "self" argument, so data can be provided cheaply, though it's still done by the callable itself. Callables that would need to e.g. prepend more than one argument won't be able to use this mechanism, but y'all convinced me that is not worth optimizing for. 
PEP 580's goal seems to be that making a callable behave like a Python function/method is just a matter of the right set of flags. Jeroen called this "complexity in the protocol". PEP 590, on the other hand, leaves much to individual callable types. This is "complexity in the users of the protocol". I now don't see a problem with PEP 590's approach. Not all users will need the complexity. We need to give CPython and Cython the tools to make implementing "def"-like functions possible (and fast), but if other extensions need to match the behavior of Python functions, they should just use Cython. Emulating Python functions is a special-enough use case that it doesn't justify complicating the protocol, and the same goes for implementing Python's built-in functions (with all their historical baggage). My more full proposal for a compromise between PEP 580 and 590 would go something like below. The type flag (Py_TPFLAGS_HAVE_VECTORCALL/Py_TPFLAGS_HAVE_CCALL) and offset (tp_vectorcall_offset/tp_ccalloffset; in tp_print's place) stay. The offset identifies a per-instance structure with two fields: - Function pointer (with the vectorcall signature) - Flags Storing any other per-instance data (like PEP 580's cr_self/cc_parent) is the responsibility of each callable type. Two flags are defined initially: 1. "Method-like" (like Py_TPFLAGS_METHOD_DESCRIPTOR in PEP 580, or non-NULL cr_self in PEP 580). Having the flag here instead of a type flag will prevent tp_call-only callables from taking advantage of LOAD_METHOD/CALL_METHOD optimisation, but I think that's OK. 2. Request to reserve space for one argument before the args array, as in PEP 590's argument offsetting. If the flag is missing, nargs may not include PY_VECTORCALL_ARGUMENTS_OFFSET. A mechanism incompatible with offsetting may use the bit for another purpose. Both flags may be simply ignored by the caller (or not be set by the callee in the first place), reverting to a more straightforward (but less performant) code path. This should also be the case for any flags added in the future. Note how without these flags, the protocol (and its documentation) will be extremely simple. This mechanism would work with my examples of possible future extensions: - "kwarg dict": A flag would enable the `kwnames` argument to be a dict instead of a tuple. - prepending/appending several positional arguments: The callable's request for how much space to allocate stored right after the {func; flags} struct. As in argument offsetting, a bit in nargs would indicate that the request was honored. (If this was made incompatible with one-arg offsetting, it could reuse the bit.) - removing an optimization: CPython would simply stop using an optimizations (but not remove the flag). Extensions could continue to use the optimization between themselves. As in PEP 590, any class that uses this mechanism shall not be usable as a base class. This will simplify implementation and tests, but hopefully the limitation will be removed in the future. (Maybe even in the initial implementation.) The METH_VECTORCALL (aka CCALL_FASTCALL|CCALL_KEYWORDS) calling convention is added to the public API. The other calling conventions (PEP 580's CCALL_O, CCALL_NOARGS, CCALL_VARARGS, CCALL_KEYWORDS, CCALL_FASTCALL, CCALL_DEFARG) as well as argument type checking (CCALL_OBJCLASS) and self slicing (CCALL_SELFARG) are left up to the callable. No equivalent of PEP 580's restrictions on the __name__ attribute. 
In my opinion, the PyEval_GetFuncName function should just be deprecated in favor of getting the __name__ attribute and checking if it's a string. It would be possible to add a public helper that returns a proper reference, but that doesn't seem worth it. Either way, I consider this out of scope of this PEP. No equivalent of PEP 580's PyCCall_GenericGetParent and PyCCall_GenericGetQualname either -- again, if needed, they should be retrieved as normal attributes. As I see it, the operation doesn't need to be particularly fast. No equivalent of PEP 580's PyCCall_Call, and no support for dict in PyCCall_FastCall's kwds argument. To be fast, extensions should avoid passing kwargs in a dict. Let's see how far that takes us. (FWIW, this also avoids subtle issues with dict mutability.) Profiling stays as in PEP 580: only exact function types generate the events. As in PEP 580, PyCFunction_GetFlags and PyCFunction_GET_FLAGS are deprecated As in PEP 580, nothing is added to the stable ABI Does that sound reasonable? From pviktori at redhat.com Wed Apr 24 18:24:04 2019 From: pviktori at redhat.com (Petr Viktorin) Date: Wed, 24 Apr 2019 18:24:04 -0400 Subject: [Python-Dev] PEP 590 discussion In-Reply-To: <5CAE76B6.6090501@UGent.be> References: <6c8356e3-f9b2-c39e-63c4-17f146d326b7@hotpy.org> <15b8a3d7-00ed-a5eb-475c-a3adee671b5f@hotpy.org> <5C9FEF82.50207@UGent.be> <421f8182-4bc8-b8cf-82d6-ca4a4fbd2013@hotpy.org> <50d675b4-839c-6502-ad1a-a33ea9330000@redhat.com> <5CAE76B6.6090501@UGent.be> Message-ID: <19993545-3fdf-4bf6-56dd-e926d2032a3a@redhat.com> On 4/10/19 7:05 PM, Jeroen Demeyer wrote: > On 2019-04-10 18:25, Petr Viktorin wrote: >> Hello! >> I've had time for a more thorough reading of PEP 590 and the reference >> implementation. Thank you for the work! > > And thank you for the review! > >> I'd now describe the fundamental >> difference between PEP 580 and PEP 590 as: >> - PEP 580 tries to optimize all existing calling conventions >> - PEP 590 tries to optimize (and expose) the most general calling >> convention (i.e. fastcall) > > And PEP 580 has better performance overall, even for METH_FASTCALL. See > this thread: > https://mail.python.org/pipermail/python-dev/2019-April/156954.html > > Since these PEPs are all about performance, I consider this a very > relevant argument in favor of PEP 580. All about performance as well as simplicity, correctness, testability, teachability... And PEP 580 touches some introspection :) >> PEP 580 also does a number of other things, as listed in PEP 579. But I >> think PEP 590 does not block future PEPs for the other items. >> On the other hand, PEP 580 has a much more mature implementation -- and >> that's where it picked up real-world complexity. > About complexity, please read what I wrote in > https://mail.python.org/pipermail/python-dev/2019-March/156853.html > > I claim that the complexity in the protocol of PEP 580 is a good thing, > as it removes complexity from other places, in particular from the users > of the protocol (better have a complex protocol that's simple to use, > rather than a simple protocol that's complex to use). I think we're talking past each other. I see now it as: PEP 580 takes existing complexity and makes it available to all users, in a simpler way. It makes existing code faster. PEP 590 defines a new simple/fast protocol for its users, and instead of making existing complexity faster and easier to use, it's left to be deprecated/phased out (or kept in existing classes for backwards compatibility). 
It makes it possible for future code to be faster/simpler. I think things should be simple by default, but if people want some extra performance, they can opt in to some extra complexity. > As a more concrete example of the simplicity that PEP 580 could bring, > CPython currently has 2 classes for bound methods implemented in C: > - "builtin_function_or_method" for normal C methods > - "method-descriptor" for slot wrappers like __eq__ or __add__ > > With PEP 590, these classes would need to stay separate to get maximal > performance. With PEP 580, just one class for bound methods would be > sufficient and there wouldn't be any performance loss. And this extends > to custom third-party function/method classes, for example as > implemented by Cython. Yet, for backwards compatibility reasons, we can't merge the classes. Also, I think CPython and Cython are exactly the users that can trade some extra complexity for better performance. >> Jeroen's analysis from >> https://mail.python.org/pipermail/python-dev/2018-July/154238.html seems >> to miss a step at the top: >> >> a. CALL_FUNCTION* / CALL_METHOD opcode >> ?????? calls >> b. _PyObject_FastCallKeywords() >> ?????? which calls >> c. _PyCFunction_FastCallKeywords() >> ?????? which calls >> d. _PyMethodDef_RawFastCallKeywords() >> ?????? which calls >> e. the actual C function (*ml_meth)() >> >> I think it's more useful to say that both PEPs bridge a->e (via >> _Py_VectorCall or PyCCall_Call). > > Not quite. For a builtin_function_or_method, we have with PEP 580: > > a. call_function() > ??? calls > d. PyCCall_FastCall > ??? which calls > e. the actual C function > > and with PEP 590 it's more like: > > a. call_function() > ??? calls > c. _PyCFunction_FastCallKeywords > ??? which calls > d. _PyMethodDef_RawFastCallKeywords > ??? which calls > e. the actual C function > > Level c. above is the vectorcall wrapper, which is a level that PEP 580 > doesn't have. PEP 580 optimizes all the code paths, where PEP 590 optimizes the fast path, and makes sure most/all use cases can use (or switch to) the fast path. Both fast paths are fast: bridging a->e using zero-copy arg passing with some C calls and flag checks. The PEP 580 approach is faster; PEP 590's is simpler. >> Jeroen, is there something in PEPs 579/580 that PEP 590 blocks, or >> should address? > > Well, PEP 580 is an extensible protocol while PEP 590 is not. But, > PyTypeObject is extensible, so even with PEP 590 one can always extend > that (for example, PEP 590 uses a type flag Py_TPFLAGS_METHOD_DESCRIPTOR > where PEP 580 instead uses the structs for the C call protocol). But I > guess that extending PyTypeObject will be harder to justify (say, in a > future PEP) than extending the C call protocol. That's a good point. > Also, it's explicitly allowed for users of the PEP 580 protocol to > extend the PyCCallDef structure with custom fields. But I don't have a > concrete idea of whether that will be useful. Unless I'm missing something, that would be effectively the same as extending their own instance struct. To bring any benefits, the extended PyCCallDef would need to be standardized in a PEP. 
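For readers following along, the distinct C-level method classes mentioned above can be inspected directly from a plain CPython release build:

    >>> type(list.append)       # C method, unbound
    <class 'method_descriptor'>
    >>> type([].append)         # C method, bound
    <class 'builtin_function_or_method'>
    >>> type(list.__add__)      # slot wrapper, unbound
    <class 'wrapper_descriptor'>
    >>> type([].__add__)        # slot wrapper, bound
    <class 'method-wrapper'>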
From pviktori at redhat.com Wed Apr 24 18:25:40 2019 From: pviktori at redhat.com (Petr Viktorin) Date: Wed, 24 Apr 2019 18:25:40 -0400 Subject: [Python-Dev] PEP 590 discussion In-Reply-To: References: <6c8356e3-f9b2-c39e-63c4-17f146d326b7@hotpy.org> <15b8a3d7-00ed-a5eb-475c-a3adee671b5f@hotpy.org> <5C9FEF82.50207@UGent.be> <421f8182-4bc8-b8cf-82d6-ca4a4fbd2013@hotpy.org> <50d675b4-839c-6502-ad1a-a33ea9330000@redhat.com> Message-ID: <5a0eda4c-1e34-132e-05bf-72bcb40f44e3@redhat.com> Hi Mark! See my more general reply; here I'll just tie loose ends with a few +1s. On 4/14/19 7:30 AM, Mark Shannon wrote: > On 10/04/2019 5:25 pm, Petr Viktorin wrote: [...] >> PEP 590 is built on a simple idea, formalizing fastcall. But it is >> complicated by PY_VECTORCALL_ARGUMENTS_OFFSET and >> Py_TPFLAGS_METHOD_DESCRIPTOR. >> As far as I understand, both are there to avoid intermediate >> bound-method object for LOAD_METHOD/CALL_METHOD. (They do try to be >> general, but I don't see any other use case.) >> Is that right? > > Not quite. > Py_TPFLAGS_METHOD_DESCRIPTOR is for LOAD_METHOD/CALL_METHOD, it allows > any callable descriptor to benefit from the LOAD_METHOD/CALL_METHOD > optimisation. > > PY_VECTORCALL_ARGUMENTS_OFFSET exists so that callables that make onward > calls with an additional argument can do so efficiently. The obvious > example is bound-methods, but classes are at least as important. > cls(*args) -> cls.new(cls, *args) -> cls.__init__(self, *args) I see. Thanks! >> (I'm running out of time today, but I'll write more on why I'm asking, >> and on the case I called "impossible" (while avoiding creation of a >> "bound method" object), later.) Let me drop this thread; I stand corrected. >> Another point I'd like some discussion on is that vectorcall function >> pointer is per-instance. It looks this is only useful for type >> objects, but it will add a pointer to every new-style callable object >> (including functions). That seems wasteful. >> Why not have a per-type pointer, and for types that need it (like >> PyTypeObject), make it dispatch to an instance-specific function? > > Firstly, each callable has different behaviour, so it makes sense to be > able to do the dispatch from caller to callee in one step. Having a > per-object function pointer allows that. > Secondly, callables are either large or transient. If large, then the > extra few bytes makes little difference. If transient then, it matters > even less. > The total increase in memory is likely to be only a few tens of > kilobytes, even for a large program. That makes sense. From njs at pobox.com Thu Apr 25 02:31:04 2019 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 24 Apr 2019 23:31:04 -0700 Subject: [Python-Dev] Use C extensions compiled in release mode on a Python compiled in debug mode In-Reply-To: References: Message-ID: You don't necessarily need rpath actually. The Linux loader has a bug/feature where once it has successfully loaded a library with a given soname, then any future requests for that soname within the same process will automatically return that same library, regardless of rpath settings etc. So as long as the main interpreter has loaded libpython.whatever from the correct directory, then extension modules will all get that same version. The rpath won't matter at all. It is annoying in general that on Linux, we have these two different ways to build extension modules. It definitely violates TOOWTDI :-). It would be nice at some point to get rid of one of them. 
Note that we can't get rid of the two different ways entirely though ? on Windows, extension modules *must* link to libpython.dll, and on macOS, extension modules *can't* link to libpython.dylib. So the best we can hope for is to make Linux consistently do one of these, instead of supporting both. In principle, having extension modules link to libpython.so is a good thing. Suppose that someone wants to dynamically load the python interpreter into their program as some kind of plugin. (Examples: Apache's mod_python, LibreOffice's support for writing macros in Python.) It would be nice to be able to load python2 and python3 simultaneously into the same process as distinct plugins. And this is totally doable in theory, *but* it means that you can't assume that the interpreter's symbols will be automagically injected into extension modules, so it's only possible if extension modules link to libpython.so. In practice, extension modules have never consistently linked to libpython.so, so everybody who loads the interpreter as a plugin has already worked around this. Specifically, they use RTLD_GLOBAL to dump all the interpreter's symbols into the global namespace. This is why you can't have python2 and python3 mod_python at the same time in the same Apache. And since everyone is already working around this, linking to libpython.so currently has zero benefit... in fact manylinux wheels are actually forbidden to link to libpython.so, because this is the only way to get wheels that work on every interpreter. -n On Wed, Apr 24, 2019, 09:54 Victor Stinner wrote: > Hum, I found issues with libpython: C extensions are explicitly linked > to libpython built in release mode. So a debug python loading a C > extension may load libpython in release mode, whereas libpython in > debug mode is already loaded. > > When Python is built with --enable-shared, the python3.7 program is > linked to libpython3.7m.so.1.0 on Linux. C extensions are explicitly > linked to libpython3.7m as well: > > $ python3.7-config --ldflags > ... -lpython3.7m ... > > Example with numpy: > > $ ldd /usr/lib64/python3.7/site-packages/numpy/core/ > umath.cpython-37m-x86_64-linux-gnu.so > ... > libpython3.7m.so.1.0 => /lib64/libpython3.7m.so.1.0 (...) > ... > > When Python 3.7 is compiled in debug mode, libpython gets a "d" flag > for debug: libpython3.7dm.so.1.0. > > I see 2 solutions: > > (1) Use a different directory. If "libpython" gets the same filename > in release and debug mode, at least, they must be installed in > different directories. If libpython build in debug mode is installed > in /usr/lib64/python3.7-dbg/ for example, python3.7-dbg should be > compiled with -rpath /usr/lib64/python3.7-dbg/ to get the debug > libpython. > > (2) If "libpython" gets a different filename in debug mode, C > extensions should not be linked to libpython explicitly but > *implicitly* to avoid picking the wrong libpython. For example, remove > "-lpython3.7m" from "python3.7-config --ldflags" output. > > The option (1) rely on rpath which is discouraged by Linux vendors and > may not be supported by all operating systems. > > The option (2) is simpler and likely more portable. > > Currently, C extensions of the standard library may or may not be > linked to libpython depending on they are built. In practice, both > work since python3.7 is already linked to libpython: so libpython is > already loaded in memory before C extensions are loaded. 
> > I opened https://bugs.python.org/issue34814 to discuss how C > extensions of the standard library should be linked but I closed it > because we failed to find a consensus and the initial use case became > a non-issue. It seems like we should reopen the discussion :-) > > Victor > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/njs%40pobox.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From doko at ubuntu.com Thu Apr 25 03:18:54 2019 From: doko at ubuntu.com (Matthias Klose) Date: Thu, 25 Apr 2019 09:18:54 +0200 Subject: [Python-Dev] Use C extensions compiled in release mode on a Python compiled in debug mode In-Reply-To: References: Message-ID: <92ffccf0-b990-6dd1-1ca5-5361ecb0bf31@ubuntu.com> On 24.04.19 18:02, Victor Stinner wrote: > Hum, I found issues with libpython: C extensions are explicitly linked > to libpython built in release mode.
It would > be nice to be able to load python2 and python3 simultaneously into the same > process as distinct plugins. And this is totally doable in theory, *but* it > means that you can't assume that the interpreter's symbols will be > automagically injected into extension modules, so it's only possible if > extension modules link to libpython.so. > > In practice, extension modules have never consistently linked to > libpython.so, so everybody who loads the interpreter as a plugin has > already worked around this. Specifically, they use RTLD_GLOBAL to dump all > the interpreter's symbols into the global namespace. This is why you can't > have python2 and python3 mod_python at the same time in the same Apache. > And since everyone is already working around this, linking to libpython.so > currently has zero benefit... in fact manylinux wheels are actually > forbidden to link to libpython.so, because this is the only way to get > wheels that work on every interpreter. extensions in Debian/Ubuntu packages are not linked against libpython.so, but the main reason here is that sometimes you have to extensions built in transition periods like for 3.6 and 3.7. And this is also the default when not configuring with --enable-shared. From doko at ubuntu.com Thu Apr 25 03:24:57 2019 From: doko at ubuntu.com (Matthias Klose) Date: Thu, 25 Apr 2019 09:24:57 +0200 Subject: [Python-Dev] Use C extensions compiled in release mode on a Python compiled in debug mode In-Reply-To: References: Message-ID: <7aad8d69-2d66-54a3-26a4-20dcc5150f66@ubuntu.com> On 24.04.19 01:44, Victor Stinner wrote: > Hi, > > Two weeks ago, I started a thread "No longer enable Py_TRACE_REFS by > default in debug build", but I lost myself in details, I forgot the > main purpose of my proposal... > > Let me retry from scratch with a more explicit title: I would like to > be able to run C extensions compiled in release mode on a Python > compiled in debug mode ("pydebug"). The use case is to debug bugs in C > extensions thanks to additional runtime checks of a Python debug > build, and more generally get a better debugging experiences on > Python. Even for pure Python, a debug build is useful (to get the > Pyhon traceback in gdb using "py-bt" command). > > Currently, using a Python compiled in debug mode means to have to > recompile C extensions in debug mode. Compile a C extension requires a > C compiler, header files, pull dependencies, etc. It can be very > complicated in practical (and pollute your system with all these > additional dependencies). On Linux, it's already hard, but on Windows > it can be even harder. > > Just one concrete example: no debug build of numpy is provided at > https://pypi.org/project/numpy/ Good luck to build numpy in debug mode > manually (install OpenBLAS, ATLAS, Fortran compiler, Cython, etc.) > :-) there's a simple solution: apt install python3-numpy-dbg cython3-dbg ;) So depending on the package maintainer, you already have that available, but it is extra maintenance cost. Simplifying that would be a good idea. However I still would like to be able to have "debug" and "non-debug" builds co-installable at the same time. 
From J.Demeyer at UGent.be Thu Apr 25 05:12:18 2019 From: J.Demeyer at UGent.be (Jeroen Demeyer) Date: Thu, 25 Apr 2019 11:12:18 +0200 Subject: [Python-Dev] PEP 590 discussion In-Reply-To: <19993545-3fdf-4bf6-56dd-e926d2032a3a@redhat.com> References: <6c8356e3-f9b2-c39e-63c4-17f146d326b7@hotpy.org> <15b8a3d7-00ed-a5eb-475c-a3adee671b5f@hotpy.org> <5C9FEF82.50207@UGent.be> <421f8182-4bc8-b8cf-82d6-ca4a4fbd2013@hotpy.org> <50d675b4-839c-6502-ad1a-a33ea9330000@redhat.com> <5CAE76B6.6090501@UGent.be> <19993545-3fdf-4bf6-56dd-e926d2032a3a@redhat.com> Message-ID: <5CC179F2.5050200@UGent.be> On 2019-04-25 00:24, Petr Viktorin wrote: > PEP 590 defines a new simple/fast protocol for its users, and instead of > making existing complexity faster and easier to use, it's left to be > deprecated/phased out (or kept in existing classes for backwards > compatibility). It makes it possible for future code to be faster/simpler. Can you elaborate on what you mean with this deprecating/phasing out? What's your view on dealing with method classes (not necessarily right now, but in the future)? Do you think that having separate method classes like method-wrapper (for example [].__add__) is good or bad? Since the way how PEP 580 and PEP 590 deal with bound method classes is very different, I would like to know the roadmap for this. Jeroen. From vstinner at redhat.com Thu Apr 25 06:39:44 2019 From: vstinner at redhat.com (Victor Stinner) Date: Thu, 25 Apr 2019 12:39:44 +0200 Subject: [Python-Dev] Use C extensions compiled in release mode on a Python compiled in debug mode In-Reply-To: References: Message-ID: Hi, I'm now convinced that C extensions must *not* be linked to libpython on Unix. I wrote PR 12946: https://github.com/python/cpython/pull/12946 bpo-21536: C extensions are no longer linked to libpython On Unix, C extensions are no longer linked to libpython. It is now possible to load a C extension built using a shared library Python with a statically linked Python. When Python is embedded, libpython must not be loaded with RTLD_LOCAL, but RTDL_GLOBAL instead. Previously, using RTLD_LOCAL, it was already not possible to load C extensions which were not linked to libpython, like C extensions of the standard library built by the "*shared*" section of Modules/Setup. distutils, python-config and python-config.py have been modified. This PR allows to load a C extension built by a shared libpython with a statically linked Python: https://bugs.python.org/issue21536#msg340819 It also allows to load C extension built in release mode with a Python built in debug mode: https://bugs.python.org/issue21536#msg340821 Le jeu. 25 avr. 2019 ? 08:31, Nathaniel Smith a ?crit : > In principle, having extension modules link to libpython.so is a good thing. Suppose that someone wants to dynamically load the python interpreter into their program as some kind of plugin. (Examples: Apache's mod_python, LibreOffice's support for writing macros in Python.) It would be nice to be able to load python2 and python3 simultaneously into the same process as distinct plugins. And this is totally doable in theory, *but* it means that you can't assume that the interpreter's symbols will be automagically injected into extension modules, so it's only possible if extension modules link to libpython.so. 
I'm aware of 2 special use cases of libpython: (A) Embed Python using RTLD_LOCAL: dlopen("libpython2.7.so.1.0", RTLD_LOCAL | RTLD_NOW) Example of issues describing this use case: * 2003: https://bugs.python.org/issue832799 * 2006: https://bugs.python.org/issue1429775 * 2018: https://bugs.python.org/issue34814 and https://bugzilla.redhat.com/show_bug.cgi?id=1585201 Python started to link C extensions to libpython in 2006 for this use case. (B) Load "libpython2" (Python 2) and "libpython3" (Python 3). I heard this idea... but I never saw anyone doing it in practice. I don't understand how it could work in a single address space. Linking C extensions to libpython is causing different issues: (1) C extension built by a shared library Python cannot be loaded with a statically linked Python: https://bugs.python.org/issue21536 (2) C extension built in release mode cannot be loaded with Python built in debug mode. That's the issue discussed in this thread ;-) (3) C extension built by Python 3.6 cannot be loaded in Python 3.7, even if it has been compiled using the stable ABI (Py_LIMITED_API). (4) C extensions of the standard library built by "*shared*" of Modules/Setup are *not* linked to libpython. For example, _struct.so on Fedora is not linked to libpython, whereas . If libpython is loaded with RTLD_LOCAL (use case A), import _struct fails. The use case (A) (RTLD_LOCAL) is trivial to fix: replace RTLD_LOCAL with RTLD_GLOBAL. The use case (B) (libpython2 + libpython3) is also easy to workaround: just use 2 separated processes. Python 2 will reach its end of life at the end of the year, I'm not sure that we should worry too much about this use case. The issue (1) (statically/shared) is a very practical issue. Fedora/RHEL uses libpython whereas Debian/Ubuntu uses statically linked Python. C extension compiled on Fedora/RHEL is linked to libpython and so cannot be loaded on Debian/Ubuntu (their Python doesn't have libpython!). That's why manylinux forbid link to link to libpython: be able to load C extensions on all Linux distributions. IMHO issues (1), (2), (3), (4) are more valuable to be fixed than supporting use cases (A) and (B). Victor -- Night gathers, and now my watch begins. It shall not end until my death. From vstinner at redhat.com Thu Apr 25 06:53:23 2019 From: vstinner at redhat.com (Victor Stinner) Date: Thu, 25 Apr 2019 12:53:23 +0200 Subject: [Python-Dev] Use C extensions compiled in release mode on a Python compiled in debug mode In-Reply-To: <92ffccf0-b990-6dd1-1ca5-5361ecb0bf31@ubuntu.com> References: <92ffccf0-b990-6dd1-1ca5-5361ecb0bf31@ubuntu.com> Message-ID: Le jeu. 25 avr. 2019 ? 09:30, Matthias Klose a ?crit : > the purpose of python-config here is not clear. Whether it's intended to be used > for linking extensions, or embedded interpreters. Currently you are using the > same for both use cases. My PR 12946 removes libpython from distutils, python-config and python-config.py: https://github.com/python/cpython/pull/12946 Do you mean that this change will break the build of applications embedding Python? If yes, what can done to fix that? Provide a different script to the specific case of embedded Python? Or add a new option to specify that you are embedding Python? In Python 3.7, the required linker flag is "-lpython3.7m". It's not trivial to guess the "m" suffix. FYI Python 3.8 it becames just "-lpython3.8": I removed the "m" suffix which was useless. Victor -- Night gathers, and now my watch begins. It shall not end until my death. 
From vstinner at redhat.com Thu Apr 25 07:14:10 2019 From: vstinner at redhat.com (Victor Stinner) Date: Thu, 25 Apr 2019 13:14:10 +0200 Subject: [Python-Dev] Use C extensions compiled in release mode on a Python compiled in debug mode In-Reply-To: <7aad8d69-2d66-54a3-26a4-20dcc5150f66@ubuntu.com> References: <7aad8d69-2d66-54a3-26a4-20dcc5150f66@ubuntu.com> Message-ID: Le jeu. 25 avr. 2019 ? 09:34, Matthias Klose a ?crit : > there's a simple solution: apt install python3-numpy-dbg cython3-dbg ;) So > depending on the package maintainer, you already have that available, but it is > extra maintenance cost. Simplifying that would be a good idea. Fedora provides "debuginfo" for all binarry packages (like numpy), but that's different from a debug build. Usually, C code of packages are optimized by gcc -O2 or even gcc -O3 which makes the debugging experience very painful: gdb fails to read C local variables and just say "". To debug internals, you want a debug build compiled by gcc -Og or (better IMHO) gcc -O0. If you want to inspect *Python* internals but you don't need to inspect numpy internals, being able to run a release numpy on a debug Python is convenient. With an additional change on SOABI (I will open a separated issue for that), my PR 12946 (no longer link C extensions to libpython) allows to load lxml built in release mode in a Python built in debug mode! That's *very* useful for debugging. I show an example of the gdb experience with a release Python vs debug Python: https://bugs.python.org/issue21536#msg340821 With a release Python, the basic function "py-bt" works as expected, but inspecting Python internals doesn't work: most local C variables are "optimized out" :-( With a debug Python, the debugging experience is *much* better: it's possible to inspect Python internals! > However I still > would like to be able to have "debug" and "non-debug" builds co-installable at > the same time. One option is to keep "d" flag in the SOABI so C extensions get a different SO filename (no change compared to Python 3.7): "NAME.cpython-38-x86_64-linux-gnu.so" for release vs "NAME.cpython-38d-x86_64-linux-gnu.so" for debug, debug gets "d" suffix ("cpython-38" vs "cpython-38d"). *But* modify importlib when Python is compiled in debug mode to look also to SO without the "d" suffix: first try load "NAME.cpython-38d-x86_64-linux-gnu.so" (debug: "d" suffix). If there is no match, look for "NAME.cpython-38-x86_64-linux-gnu.so" (release: no suffix). Since the ABI is now compatible in Python 3.8, it should "just work" :-) >From a Linux packager perspective, nothing changes ;-) We can still provide "apt install python3-numpy-dbg" (debug) which can is co-installable with "apt install python3-numpy" (release). The benefit is that it will be possible to load C extensions which are only available in the release flavor with a debug Python ;-) Victor -- Night gathers, and now my watch begins. It shall not end until my death. From vstinner at redhat.com Thu Apr 25 07:26:35 2019 From: vstinner at redhat.com (Victor Stinner) Date: Thu, 25 Apr 2019 13:26:35 +0200 Subject: [Python-Dev] Use C extensions compiled in release mode on a Python compiled in debug mode In-Reply-To: References: <92ffccf0-b990-6dd1-1ca5-5361ecb0bf31@ubuntu.com> Message-ID: I looked how fonforge gets compiler and linker flags to embed Python: it seems like to "pkg-config --libs python-2.7" which returns "-lpython2.7". My PR doesn't change Misc/python.pc. Should I modify Misc/python.pc as well... or not? :-) I'm not used to pkg-config. 
I don't know if it's common that C extensions are built using pkg-config. I guess that distutils is more commonly used to build C extensions. Victor Le jeu. 25 avr. 2019 ? 12:53, Victor Stinner a ?crit : > Le jeu. 25 avr. 2019 ? 09:30, Matthias Klose a ?crit : > > the purpose of python-config here is not clear. Whether it's intended to be used > > for linking extensions, or embedded interpreters. Currently you are using the > > same for both use cases. > > My PR 12946 removes libpython from distutils, python-config and > python-config.py: > https://github.com/python/cpython/pull/12946 > > Do you mean that this change will break the build of applications > embedding Python? If yes, what can done to fix that? > > Provide a different script to the specific case of embedded Python? Or > add a new option to specify that you are embedding Python? > > In Python 3.7, the required linker flag is "-lpython3.7m". It's not > trivial to guess the "m" suffix. FYI Python 3.8 it becames just > "-lpython3.8": I removed the "m" suffix which was useless. > > Victor > -- > Night gathers, and now my watch begins. It shall not end until my death. -- Night gathers, and now my watch begins. It shall not end until my death. From doko at ubuntu.com Thu Apr 25 07:41:38 2019 From: doko at ubuntu.com (Matthias Klose) Date: Thu, 25 Apr 2019 13:41:38 +0200 Subject: [Python-Dev] Use C extensions compiled in release mode on a Python compiled in debug mode In-Reply-To: References: <92ffccf0-b990-6dd1-1ca5-5361ecb0bf31@ubuntu.com> Message-ID: <9085f0c0-d84a-a339-3a34-ce3ac8e10ee7@ubuntu.com> On 25.04.19 13:26, Victor Stinner wrote: > I looked how fonforge gets compiler and linker flags to embed Python: > it seems like to "pkg-config --libs python-2.7" which returns > "-lpython2.7". My PR doesn't change Misc/python.pc. Should I modify > Misc/python.pc as well... or not? :-) I'm not used to pkg-config. I > don't know if it's common that C extensions are built using > pkg-config. I guess that distutils is more commonly used to build C > extensions. ... except for all the software which is doing some embedding (e.g. vim), or is building some bindings as part of the upstream software. So yes, there is some stuff ... The tendency seems to deprecate your own config helper in favor of pkgconfig. However I'm not sure how this would do with the current MacOS python-config python script. If we want to differentiate between embedding and extensions, then we need two different module names, maybe keeping the current one for extensions, and having a new one for embedding. Not sure about python-config, if we want a new helper for embedding, or add new options for the existing script. > Victor > > Le jeu. 25 avr. 2019 ? 12:53, Victor Stinner a ?crit : >> Le jeu. 25 avr. 2019 ? 09:30, Matthias Klose a ?crit : >>> the purpose of python-config here is not clear. Whether it's intended to be used >>> for linking extensions, or embedded interpreters. Currently you are using the >>> same for both use cases. >> >> My PR 12946 removes libpython from distutils, python-config and >> python-config.py: >> https://github.com/python/cpython/pull/12946 >> >> Do you mean that this change will break the build of applications >> embedding Python? If yes, what can done to fix that? >> >> Provide a different script to the specific case of embedded Python? Or >> add a new option to specify that you are embedding Python? >> >> In Python 3.7, the required linker flag is "-lpython3.7m". It's not >> trivial to guess the "m" suffix. 
FYI Python 3.8 it becames just >> "-lpython3.8": I removed the "m" suffix which was useless. >> >> Victor >> -- >> Night gathers, and now my watch begins. It shall not end until my death. > > > From doko at ubuntu.com Thu Apr 25 07:48:05 2019 From: doko at ubuntu.com (Matthias Klose) Date: Thu, 25 Apr 2019 13:48:05 +0200 Subject: [Python-Dev] Use C extensions compiled in release mode on a Python compiled in debug mode In-Reply-To: References: <7aad8d69-2d66-54a3-26a4-20dcc5150f66@ubuntu.com> Message-ID: <6a395e77-d681-69f8-ba32-f6fd47f261cb@ubuntu.com> On 25.04.19 13:14, Victor Stinner wrote: > Le jeu. 25 avr. 2019 ? 09:34, Matthias Klose a ?crit : >> there's a simple solution: apt install python3-numpy-dbg cython3-dbg ;) So >> depending on the package maintainer, you already have that available, but it is >> extra maintenance cost. Simplifying that would be a good idea. > > Fedora provides "debuginfo" for all binarry packages (like numpy), but > that's different from a debug build. Usually, C code of packages are > optimized by gcc -O2 or even gcc -O3 which makes the debugging > experience very painful: gdb fails to read C local variables and just > say "". To debug internals, you want a debug build > compiled by gcc -Og or (better IMHO) gcc -O0. > > If you want to inspect *Python* internals but you don't need to > inspect numpy internals, being able to run a release numpy on a debug > Python is convenient. yes, the Debian/Ubuntu packages contain both the debug build, and the debug info for they normal build, e.g. /usr/lib/debug/.build-id/3a/8ea2ab6ee85ff68879a48170966873eb8da781.debug /usr/lib/debug/.build-id/78/5ff95f8d2d06c5990ae4e03cdff99452ca0de9.debug /usr/lib/debug/.build-id/92/e008cffa3f09106214bfb6b80b7fd02ceab74f.debug /usr/lib/debug/.build-id/ab/33160518c41acc0488bbc3af878995ef74e07f.debug /usr/lib/debug/.build-id/bd/65896626a4c6566e96ad008362922cf6a39cd6.debug /usr/lib/debug/.build-id/f1/e83b14a76dd9564e962dcdd2f70202e6fdb2b1.debug /usr/lib/debug/.build-id/ff/5eab5fd2d14f4bfa6a1ef2300358efdc7dd800.debug /usr/lib/python3/dist-packages/lxml/_elementpath.cpython-37dm-x86_64-linux-gnu.so /usr/lib/python3/dist-packages/lxml/builder.cpython-37dm-x86_64-linux-gnu.so /usr/lib/python3/dist-packages/lxml/etree.cpython-37dm-x86_64-linux-gnu.so /usr/lib/python3/dist-packages/lxml/html/clean.cpython-37dm-x86_64-linux-gnu.so /usr/lib/python3/dist-packages/lxml/html/diff.cpython-37dm-x86_64-linux-gnu.so /usr/lib/python3/dist-packages/lxml/objectify.cpython-37dm-x86_64-linux-gnu.so /usr/lib/python3/dist-packages/lxml/sax.cpython-37dm-x86_64-linux-gnu.so > With an additional change on SOABI (I will open a separated issue for > that), my PR 12946 (no longer link C extensions to libpython) allows > to load lxml built in release mode in a Python built in debug mode! > That's *very* useful for debugging. I show an example of the gdb > experience with a release Python vs debug Python: > > https://bugs.python.org/issue21536#msg340821 > > With a release Python, the basic function "py-bt" works as expected, > but inspecting Python internals doesn't work: most local C variables > are "optimized out" :-( > > With a debug Python, the debugging experience is *much* better: it's > possible to inspect Python internals! > > >> However I still >> would like to be able to have "debug" and "non-debug" builds co-installable at >> the same time. 
> > One option is to keep "d" flag in the SOABI so C extensions get a > different SO filename (no change compared to Python 3.7): > "NAME.cpython-38-x86_64-linux-gnu.so" for release vs > "NAME.cpython-38d-x86_64-linux-gnu.so" for debug, debug gets "d" > suffix ("cpython-38" vs "cpython-38d"). > > *But* modify importlib when Python is compiled in debug mode to look > also to SO without the "d" suffix: first try load > "NAME.cpython-38d-x86_64-linux-gnu.so" (debug: "d" suffix). If there > is no match, look for "NAME.cpython-38-x86_64-linux-gnu.so" (release: > no suffix). Since the ABI is now compatible in Python 3.8, it should > "just work" :-) > > From a Linux packager perspective, nothing changes ;-) We can still > provide "apt install python3-numpy-dbg" (debug) which can is > co-installable with "apt install python3-numpy" (release). > > The benefit is that it will be possible to load C extensions which are > only available in the release flavor with a debug Python ;-) yes, that sounds good. Are there use cases where you only want to load *some* debug extensions, even if more are installed? From J.Demeyer at UGent.be Thu Apr 25 10:42:45 2019 From: J.Demeyer at UGent.be (Jeroen Demeyer) Date: Thu, 25 Apr 2019 16:42:45 +0200 Subject: [Python-Dev] PEP 580/590 discussion In-Reply-To: References: <6c8356e3-f9b2-c39e-63c4-17f146d326b7@hotpy.org> <15b8a3d7-00ed-a5eb-475c-a3adee671b5f@hotpy.org> <5C9FEF82.50207@UGent.be> <421f8182-4bc8-b8cf-82d6-ca4a4fbd2013@hotpy.org> <50d675b4-839c-6502-ad1a-a33ea9330000@redhat.com> Message-ID: <5CC1C765.1020608@UGent.be> On 2019-04-25 00:24, Petr Viktorin wrote: > I believe we can achieve > that by having PEP 590's (o+offset) point not just to function pointer, > but to a {function pointer; flags} struct with flags defined for two > optimizations: What's the rationale for putting the flags in the instance? Do you expect flags to be different between one instance and another instance of the same class? > Both type flags and > nargs bits are very limited resources. Type flags are only a limited resource if you think that all flags ever added to a type must be put into tp_flags. There is nothing wrong with adding new fields tp_extraflags or tp_vectorcall_flags to a type. > What I don't like about it is that it has > the extensions built-in; mandatory for all callers/callees. I don't agree with the above sentence about PEP 580: - callers should use APIs like PyCCall_FastCall() and shouldn't need to worry about the implementation details at all. - callees can opt out of all the extensions by not setting any special flags and setting cr_self to a non-NULL value. When using the flags CCALL_FASTCALL | CCALL_KEYWORDS, then implementing the callee is exactly the same as PEP 590. > As in PEP 590, any class that uses this mechanism shall not be usable as > a base class. Can we please lift this restriction? There is really no reason for it. I'm not aware of any similar restriction anywhere in CPython. Note that allowing subclassing is not the same as inheriting the protocol. As a compromise, we could simply never inherit the protocol. Jeroen. 
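(Side note for readers following the thread: the FASTCALL-with-keywords
layout that both proposals build on can be modelled in pure Python. The
sketch below only illustrates the argument layout, a flat positional array
plus a tuple of keyword names, and is not either PEP's actual C API.)

# tp_call style: a tuple of positional arguments plus a dict of keyword
# arguments. FASTCALL|KEYWORDS style: one flat sequence holding positional
# arguments followed by keyword values, plus a tuple of the keyword names.

def call_tp_call_style(func, args, kwargs):
    return func(*args, **kwargs)

def call_fastcall_style(func, stack, kwnames):
    nkw = len(kwnames)
    npos = len(stack) - nkw
    positional = stack[:npos]
    keywords = dict(zip(kwnames, stack[npos:]))
    return func(*positional, **keywords)

# Both calls are equivalent to sorted([3, 1, 2], reverse=True).
print(call_tp_call_style(sorted, ([3, 1, 2],), {"reverse": True}))
print(call_fastcall_style(sorted, [[3, 1, 2], True], ("reverse",)))
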
From pviktori at redhat.com Thu Apr 25 12:17:53 2019 From: pviktori at redhat.com (Petr Viktorin) Date: Thu, 25 Apr 2019 12:17:53 -0400 Subject: [Python-Dev] PEP 580/590 discussion In-Reply-To: <5CC1C765.1020608@UGent.be> References: <6c8356e3-f9b2-c39e-63c4-17f146d326b7@hotpy.org> <15b8a3d7-00ed-a5eb-475c-a3adee671b5f@hotpy.org> <5C9FEF82.50207@UGent.be> <421f8182-4bc8-b8cf-82d6-ca4a4fbd2013@hotpy.org> <50d675b4-839c-6502-ad1a-a33ea9330000@redhat.com> <5CC1C765.1020608@UGent.be> Message-ID: <1309c111-359e-1af6-2655-a16ffaad2994@redhat.com> On 4/25/19 10:42 AM, Jeroen Demeyer wrote: > On 2019-04-25 00:24, Petr Viktorin wrote: >> I believe we can achieve >> that by having PEP 590's (o+offset) point not just to function pointer, >> but to a {function pointer; flags} struct with flags defined for two >> optimizations: > > What's the rationale for putting the flags in the instance? Do you > expect flags to be different between one instance and another instance > of the same class? I'm not tied to that idea. If there's a more reasonable place to put the flags, let's go for it, but it's not a big enough issue so it shouldn't complicate the protocol much. Quoting Mark from the other subthread: > Callables are either large or transient. If large, then the extra few bytes makes little difference. If transient then, it matters even less. >> Both type flags and >> nargs bits are very limited resources. > > Type flags are only a limited resource if you think that all flags ever > added to a type must be put into tp_flags. There is nothing wrong with > adding new fields tp_extraflags or tp_vectorcall_flags to a type. Indeed. Extra flags are just what I think PEP 590 is missing. >> What I don't like about it is that it has >> the extensions built-in; mandatory for all callers/callees. > > I don't agree with the above sentence about PEP 580: > - callers should use APIs like PyCCall_FastCall() and shouldn't need to > worry about the implementation details at all. > - callees can opt out of all the extensions by not setting any special > flags and setting cr_self to a non-NULL value. When using the flags > CCALL_FASTCALL | CCALL_KEYWORDS, then implementing the callee is exactly > the same as PEP 590. Imagine an extension author sitting down to read the docs and implement a callable: - PEP 580 introduces 6 CCALL_* combinations: you need to select the best one for your use case. Also, add two structs to the instance & link them via pointers, make sure you support descriptor behavior and the __name__ attribute. (Plus there are features for special purposes: CCALL_DEFARG, CCALL_OBJCLASS, self-slicing, but you can skip that initially.) - My proposal: to the instance, add a function pointer with known signature and flags which you set to zero. Add an offset to the type, and set a type flag. (There are additional possible optimizations, but you can skip them initially.) PEP 580 makes a lot of sense if you read it all, but I fear there'll be very few people who read and understand it. And is not important just for extension authors (admittedly, implementing a callable directly using the C API is often a bad idea). The more people understand the mechanism, the more people can help with further improvements. I don't see the benefit of supporting METH_VARARGS, METH_NOARGS, and METH_O calling conventions (beyond backwards compatibility and comptibility with Python's *args syntax). 
For keywords, I see a benefit in supporting *only one* of kwarg dict or kwarg tuple: if the caller and callee don't agree on which one to use, you need an expensive conversion. If we say tuple is the way, some of them will need to adapt, but within the set of those that do it any caller/callee combination will be fast. (And if tuple only turns out to be wrong choice, adding dict support in the future shouldn't be hard.) That leaves fastcall (with tuple only) as the focus of this PEP, and the other calling conventions essentially as implementation details of builtin functions/methods. >> As in PEP 590, any class that uses this mechanism shall not be usable as >> a base class. > > Can we please lift this restriction? There is really no reason for it. > I'm not aware of any similar restriction anywhere in CPython. Note that > allowing subclassing is not the same as inheriting the protocol. Sure, let's use PEP 580 treatment of inheritance. Even if we don't, I don't think dropping this restriction would be a PEP-level change. It can be dropped as soon as an implementation and tests are ready, and inheritance issues ironed out. But it doesn't need to be in the initial implementation. > As a compromise, we could simply never inherit the protocol. That also sounds reasonable for the initial implementation. From pviktori at redhat.com Thu Apr 25 17:11:36 2019 From: pviktori at redhat.com (Petr Viktorin) Date: Thu, 25 Apr 2019 17:11:36 -0400 Subject: [Python-Dev] PEP 590 discussion In-Reply-To: <5CC179F2.5050200@UGent.be> References: <6c8356e3-f9b2-c39e-63c4-17f146d326b7@hotpy.org> <15b8a3d7-00ed-a5eb-475c-a3adee671b5f@hotpy.org> <5C9FEF82.50207@UGent.be> <421f8182-4bc8-b8cf-82d6-ca4a4fbd2013@hotpy.org> <50d675b4-839c-6502-ad1a-a33ea9330000@redhat.com> <5CAE76B6.6090501@UGent.be> <19993545-3fdf-4bf6-56dd-e926d2032a3a@redhat.com> <5CC179F2.5050200@UGent.be> Message-ID: <49c8c9d4-1537-fc54-1ebd-e908d4a858d7@redhat.com> On 4/25/19 5:12 AM, Jeroen Demeyer wrote: > On 2019-04-25 00:24, Petr Viktorin wrote: >> PEP 590 defines a new simple/fast protocol for its users, and instead of >> making existing complexity faster and easier to use, it's left to be >> deprecated/phased out (or kept in existing classes for backwards >> compatibility). It makes it possible for future code to be >> faster/simpler. > > Can you elaborate on what you mean with this deprecating/phasing out? Kept for backwards compatibility, but not actively recommended or optimized. Perhaps made slower if that would help performance elsewhere. > What's your view on dealing with method classes (not necessarily right > now, but in the future)? Do you think that having separate method > classes like method-wrapper (for example [].__add__) is good or bad? I fully agree with PEP 579's point on complexity: > There are a huge number of classes involved to implement all variations of methods. This is not a problem by itself, but a compounding issue. The main problem is that, currently, you sometimes need to care about this (due to CPython special casing its own classes, without fallback to some public API). Ideally, what matters is the protocols the class implements rather than the class itself. If that is solved, having so many different classes becomes curious but unimportant -- merging them shouldn't be a priority. I'd concentrate on two efforts instead: - Calling should have a fast public API. (That's this PEP.) - Introspection should have well-defined, consistently used public API (but not necessarily fast). 
For introspection, I think the way is implementing the necessary API (e.g. dunder attributes) and changing things like inspect, traceback generation, etc. to use them. CPython's callable classes should stay as internal implementation details. (Specifically: I'm against making them subclassable: allowing subclasses basically makes everything about the superclass an API.) > Since the way how PEP 580 and PEP 590 deal with bound method classes is > very different, I would like to know the roadmap for this. My thoughts are not the roadmap, of course :) Speaking about roadmaps, I often use PEP 579 to check what I'm forgetting. Here are my thoughts on it: ## Naming (The word "built-in" is overused in Python) This is a social/docs problem, and out of scope of the technical efforts. PEPs should always define the terms they use (even in the case where there is an official definition, but it doesn't match popular usage). ## Not extendable As I mentioned above, I'm against opening the callables for subclassing. We should define and use protocols instead. ## cfunctions do not become methods If we were designing Python from scratch, this should have been done differently. Now this is a problem for Cython to solve. CPython should provide the tools to do so. ## Semantics of inspect.isfunction I don't like inspect.isfunction, because "Is it a function?" is almost never what you actually want to ask. I'd like to deprecate it in favor of explicit functions like "Does it have source code?", "Is it callable?", or even "Is it exactly types.FunctionType?". But I'm against changing its behavior -- people are expecting the current answer. ## C functions should have access to the function object That's where my stake in all this is; I want to move on with PEP 573 after 580/590 is sorted out. ## METH_FASTCALL is private and undocumented This is the intersection of PEP 580 and 590. ## Allowing native C arguments This would be a very experimental feature. Argument Clinic itself is not intended for public use, locking its "impl" functions as part of public API is off the table at this point. Cython's cpdef allows this nicely, and CPython's API is full of C functions. That should be good enough good for now. ## Complexity We should simpify, but I think the number of callable classes is not the best metric to focus on. ## PyMethodDef is too limited This is a valid point. But the PyMethodDef array is little more than a shortcut to creating methods directly in a loop. The immediate workaround could be to create a new constructor for methods. Then we can look into expressing the data declaratively again. ## Slot wrappers have no custom documentation I think this can now be done with a new custom slot wrapper class. Perhaps that can be added to CPython when it matures. ## Static methods and class methods should be callable This is a valid, though minor, point. I don't event think it would be a PEP-level change. From stefan_ml at behnel.de Thu Apr 25 17:56:56 2019 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 25 Apr 2019 23:56:56 +0200 Subject: [Python-Dev] Any core dev event plans for EP19? Message-ID: Hi core devs, there are several core dev events happening at the US PyCon this year, so I was wondering if we could organise something similar at EuroPython. Does anyone have any plans or ideas already? And, how many of us are planning to attend EP19 in Basel this year? Unless there's something already going on that I missed, I can (try to) set up a poll on dpo to count the interest and collect ideas. 
Sprints would probably be a straight-forward option, a mentoring session could be another, a language summit or PEP discussion/mentoring round would also be a possibility. More ideas welcome. Stefan From levkivskyi at gmail.com Thu Apr 25 18:05:20 2019 From: levkivskyi at gmail.com (Ivan Levkivskyi) Date: Thu, 25 Apr 2019 15:05:20 -0700 Subject: [Python-Dev] Any core dev event plans for EP19? In-Reply-To: References: Message-ID: Hi, I want to come to EP this year, but didn't register yet, is registration already open? -- Ivan On Thu, 25 Apr 2019 at 15:01, Stefan Behnel wrote: > Hi core devs, > > there are several core dev events happening at the US PyCon this year, so I > was wondering if we could organise something similar at EuroPython. Does > anyone have any plans or ideas already? And, how many of us are planning to > attend EP19 in Basel this year? Unless there's something already going on > that I missed, I can (try to) set up a poll on dpo to count the interest > and collect ideas. > > Sprints would probably be a straight-forward option, a mentoring session > could be another, a language summit or PEP discussion/mentoring round would > also be a possibility. More ideas welcome. > > Stefan > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/levkivskyi%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From berker.peksag at gmail.com Thu Apr 25 19:15:16 2019 From: berker.peksag at gmail.com (=?UTF-8?Q?Berker_Peksa=C4=9F?=) Date: Fri, 26 Apr 2019 02:15:16 +0300 Subject: [Python-Dev] Any core dev event plans for EP19? In-Reply-To: References: Message-ID: On Fri, Apr 26, 2019 at 1:01 AM Stefan Behnel wrote: > there are several core dev events happening at the US PyCon this year, so I > was wondering if we could organise something similar at EuroPython. Does > anyone have any plans or ideas already? And, how many of us are planning to > attend EP19 in Basel this year? Unless there's something already going on > that I missed, I can (try to) set up a poll on dpo to count the interest > and collect ideas. Note that this year's core dev sprint will be held in London. See https://discuss.python.org/t/2019-core-dev-sprint-location-date/489 for the previous discussion. There are only two months between both events, so perhaps we can leave things like discussions on active PEPs to the core dev sprint? (And welcome to the team!) --Berker From stefan_ml at behnel.de Fri Apr 26 01:56:06 2019 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 26 Apr 2019 07:56:06 +0200 Subject: [Python-Dev] Any core dev event plans for EP19? In-Reply-To: References: Message-ID: Berker Peksa? schrieb am 26.04.19 um 01:15: > Note that this year's core dev sprint will be held in London. See > https://discuss.python.org/t/2019-core-dev-sprint-location-date/489 > for the previous discussion. There are only two months between both > events, so perhaps we can leave things like discussions on active PEPs > to the core dev sprint? > (And welcome to the team!) Ah, nice! Thanks for telling me, I wasn't aware of it. London is just a day by train from where I live, I'm totally in for that. 
Stefan From J.Demeyer at UGent.be Fri Apr 26 06:54:35 2019 From: J.Demeyer at UGent.be (Jeroen Demeyer) Date: Fri, 26 Apr 2019 12:54:35 +0200 Subject: [Python-Dev] PEP 580/590 discussion In-Reply-To: <1309c111-359e-1af6-2655-a16ffaad2994@redhat.com> References: <6c8356e3-f9b2-c39e-63c4-17f146d326b7@hotpy.org> <15b8a3d7-00ed-a5eb-475c-a3adee671b5f@hotpy.org> <5C9FEF82.50207@UGent.be> <421f8182-4bc8-b8cf-82d6-ca4a4fbd2013@hotpy.org> <50d675b4-839c-6502-ad1a-a33ea9330000@redhat.com> <5CC1C765.1020608@UGent.be> <1309c111-359e-1af6-2655-a16ffaad2994@redhat.com> Message-ID: <5CC2E36B.1080302@UGent.be> Hello, after reading the various comments and thinking about it more, let me propose a real compromise between PEP 580 and PEP 590. My proposal is: take the general framework of PEP 580 but support only a single calling convention like PEP 590. The single calling convention supported would be what is currently specified by the flag combination CCALL_DEFARG|CCALL_FASTCALL|CCALL_KEYWORDS. This way, the flags CCALL_VARARGS, CCALL_FASTCALL, CCALL_O, CCALL_NOARGS, CCALL_KEYWORDS, CCALL_DEFARG can be dropped. This calling convention is very similar to the calling convention of PEP 590, except that: - the callable is replaced by a pointer to a PyCCallDef (the structure from PEP 580, but possibly without cc_parent) - there is a self argument like PEP 580. This implies support for the CCALL_SELFARG flag from PEP 580 and no support for the PY_VECTORCALL_ARGUMENTS_OFFSET trick of PEP 590. Background: I added support for all those calling conventions in PEP 580 because I didn't want to make any compromise regarding performance. When writing PEP 580, I assumed that any kind of performance regression would be a reason to reject PEP 580. However, it seems now that you're willing to accept PEP 590 instead which does introduce performance regressions in certain code paths. So that suggests that we could keep the good parts of PEP 580 but reduce its complexity by having a single calling convention like PEP 590. If you compare this compromise to PEP 590, the main difference is dealing with bound methods. Personally, I really like the idea of having a *single* bound method class which would be used by all kinds of function classes without any loss of performance (not only in CPython itself, but also by Cython and other C extensions). To support that, we need something like the PyCCallRoot structure from PEP 580, together with the special handling for self. About cc_parent and CCALL_OBJCLASS: I prefer to keep that because it allows to merge classes for bare functions (not inside a class) and unbound methods (functions inside a class). Concretely, that could reduce code duplication between builtin_function_or_method and method_descriptor. But I'm also fine with removing cc_parent and CCALL_OBJCLASS. In any case, we can decide that later. What do you think? Jeroen. From tir.karthi at gmail.com Fri Apr 26 06:56:45 2019 From: tir.karthi at gmail.com (Karthikeyan) Date: Fri, 26 Apr 2019 16:26:45 +0530 Subject: [Python-Dev] Any core dev event plans for EP19? In-Reply-To: References: Message-ID: On Fri, Apr 26, 2019 at 3:40 AM Ivan Levkivskyi wrote: > Hi, > > I want to come to EP this year, but didn't register yet, is registration > already open? > > Just to add to this core developers are eligible for free entry to the conference : https://www.europython-society.org/core-grant -- Regards, Karthikeyan S -------------- next part -------------- An HTML attachment was scrubbed... 
From pierre.glaser at inria.fr  Fri Apr 26 08:32:31 2019
From: pierre.glaser at inria.fr (Pierre Glaser)
Date: Fri, 26 Apr 2019 14:32:31 +0200 (CEST)
Subject: [Python-Dev] Increasing the C-optimized pickle extensibility
Message-ID: <2040555306.8345029.1556281951734.JavaMail.zimbra@inria.fr>

Hi All,

We (Antoine Pitrou, Olivier Grisel and myself) spent some effort recently on
enabling pickle extensions to extend the C-optimized Pickler instead of the
pure Python one.

Pickle extensions have a crucial role in many distributed computing libraries:
cloudpickle (https://github.com/cloudpipe/cloudpickle) for example is vendored
in dask, pyspark, ray, and joblib.
Early benchmarks show that relying on the C-optimized pickle yields
significant serialization speed improvements (up to 30x faster).
(draft PR of the CPickler-backed version of cloudpickle:
https://github.com/cloudpipe/cloudpickle/pull/253)

To make extending the C Pickler possible, we are currently moving forward with
a few enhancements to the public pickle API.

* First, we are enabling Pickler subclasses to implement a reducer_override
method, that will have priority over the registered reducers in the
dispatch_table and over the default handling of classes and functions.
(PR link: https://github.com/python/cpython/pull/12499)

* Then, we are adding a new keyword argument to save_reduce called
state_setter (consequently we allow a reducer's return value to have a new,
6th item). This state setter callable is useful to programmatically override
the state-updating behavior of an object, which would otherwise be restricted
to its static ``__setstate__`` method.
(PR link: https://github.com/python/cpython/pull/12588)

The PR review process of these changes is in progress, and anyone is welcome
to chime in and share some thoughts.

The first addition is very non-invasive. We estimated that the second point
did not require introducing a new opcode, as this change could be implemented
as a simple sequence of standard pickle instructions. We therefore think that
it is not necessary to make this change dependent on the new protocol 5
proposed in PEP 574.

The key advantage of not creating a new opcode is that this makes our change
backward-compatible, meaning that 3.8-written pickles will not break because
of our change if read using earlier Python versions.

OTOH, one might argue that a new OPCODE might
* make the code a little bit cleaner
* make it easier to interpret disassembled pickle strings.

If you are interested, here is an example of a disassembled pickle string
using our currently proposed solution:
https://github.com/pierreglaser/cpython/pull/2#issuecomment-486243350

Does anyone have an opinion on this?

Thanks,

Pierre

From guido at python.org  Fri Apr 26 10:49:45 2019
From: guido at python.org (Guido van Rossum)
Date: Fri, 26 Apr 2019 07:49:45 -0700
Subject: [Python-Dev] Steering Council Update for April 2019
Message-ID: 

I've posted an update from the Steering Council to our repo:

https://github.com/python/steering-council/blob/master/updates/2019-04-26_steering-council-update.md

I will also link to this on Discourse (discuss.python.org).

--
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him/his **(why is my pronoun here?)*
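(To make Pierre's proposal above a bit more concrete, here is a rough sketch
of how the two proposed hooks might look from user code. The Point class and
the set_point_state function are hypothetical, and the exact API may still
change while the PRs are under review.)

import io
import pickle

class Point:                       # hypothetical example class
    def __init__(self, x, y):
        self.x, self.y = x, y

def set_point_state(obj, state):   # hypothetical state setter callable
    obj.x, obj.y = state["x"], state["y"]

class MyPickler(pickle.Pickler):
    # Proposed hook: consulted before the dispatch_table and before the
    # default handling of classes and functions; returning NotImplemented
    # falls back to the regular reduction behaviour.
    def reducer_override(self, obj):
        if isinstance(obj, Point):
            # A reduce value with a 6th item: the state setter callable.
            return (Point, (0, 0), {"x": obj.x, "y": obj.y},
                    None, None, set_point_state)
        return NotImplemented

buf = io.BytesIO()
MyPickler(buf).dump(Point(1, 2))
restored = pickle.loads(buf.getvalue())
print(restored.x, restored.y)      # 1 2
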
From J.Demeyer at UGent.be  Fri Apr 26 10:52:18 2019
From: J.Demeyer at UGent.be (Jeroen Demeyer)
Date: Fri, 26 Apr 2019 16:52:18 +0200
Subject: [Python-Dev] PEP 590 discussion
In-Reply-To: <49c8c9d4-1537-fc54-1ebd-e908d4a858d7@redhat.com>
References: <6c8356e3-f9b2-c39e-63c4-17f146d326b7@hotpy.org> <15b8a3d7-00ed-a5eb-475c-a3adee671b5f@hotpy.org> <5C9FEF82.50207@UGent.be> <421f8182-4bc8-b8cf-82d6-ca4a4fbd2013@hotpy.org> <50d675b4-839c-6502-ad1a-a33ea9330000@redhat.com> <5CAE76B6.6090501@UGent.be> <19993545-3fdf-4bf6-56dd-e926d2032a3a@redhat.com> <5CC179F2.5050200@UGent.be> <49c8c9d4-1537-fc54-1ebd-e908d4a858d7@redhat.com>
Message-ID: <5CC31B22.7070705@UGent.be>

On 2019-04-25 23:11, Petr Viktorin wrote:
> My thoughts are not the roadmap, of course :)

I asked about methods because we should be aware of the consequences when
choosing between PEP 580 and PEP 590 (or some compromise). There are
basically 3 different ways of dealing with bound methods:

(A) put methods inside the protocol. This is PEP 580 and my 580/590
compromise proposal. The disadvantage here is complexity in the protocol.

(B) don't put methods inside the protocol and use a single generic method
class types.MethodType. This is the status quo for Python functions. It has
the disadvantage of being slightly slower: there is an additional level of
indirection when calling a bound method object.

(C) don't put methods inside the protocol but use multiple method classes,
one for every function class. This is the status quo for functions
implemented in C. This has the disadvantage of code duplication.

I think that the choice between PEP 580 or 590 should be done together with
a choice of one of the above options. For example, I really don't like the
code duplication of (C), so I would prefer PEP 590 with (B) over PEP 590
with (C).

From guido at python.org  Fri Apr 26 11:15:02 2019
From: guido at python.org (Guido van Rossum)
Date: Fri, 26 Apr 2019 08:15:02 -0700
Subject: [Python-Dev] Increasing the C-optimized pickle extensibility
In-Reply-To: <2040555306.8345029.1556281951734.JavaMail.zimbra@inria.fr>
References: <2040555306.8345029.1556281951734.JavaMail.zimbra@inria.fr>
Message-ID: 

I think it's better not to introduce a new opcode, for the reason you
stated -- you don't want your pickles to be unreadable by older Python
versions, if you can help it.

On Fri, Apr 26, 2019 at 5:59 AM Pierre Glaser wrote:
> Hi All,
>
> We (Antoine Pitrou, Olivier Grisel and myself) spent some effort recently on
> enabling pickle extensions to extend the C-optimized Pickler instead of the
> pure Python one.
>
> Pickle extensions have a crucial role in many distributed computing
> libraries: cloudpickle (https://github.com/cloudpipe/cloudpickle) for
> example is vendored in dask, pyspark, ray, and joblib.
> Early benchmarks show that relying on the C-optimized pickle yields
> significant serialization speed improvements (up to 30x faster).
> (draft PR of the CPickler-backed version of cloudpickle:
> https://github.com/cloudpipe/cloudpickle/pull/253)
>
> To make extending the C Pickler possible, we are currently moving forward
> with a few enhancements to the public pickle API.
>
> * First, we are enabling Pickler subclasses to implement a reducer_override
> method, that will have priority over the registered reducers in the
> dispatch_table and over the default handling of classes and functions.
> (PR link: https://github.com/python/cpython/pull/12499)
>
> * Then, we are adding a new keyword argument to save_reduce called
> state_setter.
> (consequently we allow a reducer's return value to have a new, 6th item). > This state setter callable is useful to override programmatically the > state updating > behavior of an object, that would otherwise be restricted to its static > ``__setstate__`` method. > (PR link: https://github.com/python/cpython/pull/12588) > > The PR review process of these changes is in progress, and anyone is > welcomed > to chime in and share some thoughts. > > The first addition is very non-invasive. We estimated that the second > point did > not require introducing a new opcode, as this change could be implemented > as > simple sequence of standard pickle instructions. We therefore think that > it is > not necessary to make this change dependent on the new protocol 5 proposed > in > PEP 574. > > The key advantage in not creating a new opcode that this makes our change > backward-compatible, meaning that 3.8-written pickles will not break > because of > our change if read using earlier Python versions. > > OTOH, one might argue that a new OPCODE might > * make the code a little bit cleaner > * make it easier to interpret disassembled pickle strings. > > If you are interested, here is an example of a disassembled pickle string > using our currently proposed solution: > https://github.com/pierreglaser/cpython/pull/2#issuecomment-486243350 > > Does anyone have an opinion on this? > > Thanks, > > Pierre > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (python.org/~guido) *Pronouns: he/him/his **(why is my pronoun here?)* -------------- next part -------------- An HTML attachment was scrubbed... URL: From ericsnowcurrently at gmail.com Fri Apr 26 11:18:27 2019 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Fri, 26 Apr 2019 09:18:27 -0600 Subject: [Python-Dev] Steering Council Update for April 2019 In-Reply-To: References: Message-ID: Thanks to each of you for the update and the hard work! -eric On Fri, Apr 26, 2019 at 8:52 AM Guido van Rossum wrote: > > I've posted an update from the Steering Council to our repo: > > https://github.com/python/steering-council/blob/master/updates/2019-04-26_steering-council-update.md > > I will also link to this on Discourse (discuss.python.org). > > -- > --Guido van Rossum (python.org/~guido) > Pronouns: he/him/his (why is my pronoun here?) > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ericsnowcurrently%40gmail.com From ericsnowcurrently at gmail.com Fri Apr 26 12:46:31 2019 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Fri, 26 Apr 2019 10:46:31 -0600 Subject: [Python-Dev] Actor Model in Python In-Reply-To: References: Message-ID: On Sun, Mar 31, 2019 at 9:19 AM Aratz Manterola Lasa via Python-Dev wrote: > I was wondering if there was any Python project aiming to implement the actor model for python concurrency. As far as the standard library goes, the explicitly supported concurrency models are: threading, multiprocessing, and async/await. Between these (and a few other parts provided by Python) anyone can build libraries that emulate various other concurrency models. 
Such libraries exist on the cheeseshop (PyPI), though I don't know about packages for the actor model specifically. I recommend searching there for such packages. If you don't find one then perhaps you've found a new project to start. :) Also, I have a proposal [1] for Python 3.9 that provides first class [low level] support for concurrency models like CSP and the actor model. This is done with multiple [mostly] isolated interpreters per process and with basic "channels" for safely passing messages between them. While the proposed library is intended to be useful on its own, it is also intended to provide effective building blocks for library authors. Note that the PEP has not been accepted and is not guaranteed to be accepted (though I'm hopeful). Regardless, consider posting to python-list at python.org for feedback from the broader Python community. This list is specifically used for the development of the Python language itself. Thanks! -eric [1] https://www.python.org/dev/peps/pep-0554/ From status at bugs.python.org Fri Apr 26 14:07:55 2019 From: status at bugs.python.org (Python tracker) Date: Fri, 26 Apr 2019 18:07:55 +0000 (UTC) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20190426180755.412D452BE98@bugs.ams1.psf.io> ACTIVITY SUMMARY (2019-04-19 - 2019-04-26) Python tracker at https://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message. Issues counts and deltas: open 7083 (+25) closed 41402 (+39) total 48485 (+64) Open issues with patches: 2821 Issues opened (46) ================== #11871: test_default_timeout() of test_threading.BarrierTests failure: https://bugs.python.org/issue11871 reopened by vstinner #36672: A compiler warning in winreg.SetValue() https://bugs.python.org/issue36672 opened by ZackerySpytz #36673: Comment/PI parsing support for ElementTree https://bugs.python.org/issue36673 opened by scoder #36674: "unittest.TestCase.debug" should honour "skip" (and other test https://bugs.python.org/issue36674 opened by dmaurer #36675: Doctest directives and comments not visible or missing from co https://bugs.python.org/issue36675 opened by steven.daprano #36676: Make ET.XMLParser target aware of namespace prefixes https://bugs.python.org/issue36676 opened by scoder #36678: duplicate method definitions in Lib/test/test_dataclasses.py https://bugs.python.org/issue36678 opened by xdegaye #36679: duplicate method definition in Lib/test/test_genericclass.py https://bugs.python.org/issue36679 opened by xdegaye #36680: duplicate method definition in Lib/test/test_importlib/test_u https://bugs.python.org/issue36680 opened by xdegaye #36681: duplicate method definition in Lib/test/test_logging.py https://bugs.python.org/issue36681 opened by xdegaye #36683: duplicate method definition in Lib/test/test_utf8_mode.py https://bugs.python.org/issue36683 opened by xdegaye #36684: codecov.io code coverage has not updated since 2019-04-13 https://bugs.python.org/issue36684 opened by gphemsley #36685: C implementation of xml.etree.ElementTree does not make a copy https://bugs.python.org/issue36685 opened by gphemsley #36686: Docs: asyncio.loop.subprocess_exec documentation is confusing, https://bugs.python.org/issue36686 opened by sbstp #36688: _dummy_thread lacks an RLock implementaiton https://bugs.python.org/issue36688 opened by xtreak #36689: docs: os.path.commonpath raises ValueError for different drive https://bugs.python.org/issue36689 opened by lazka #36691: SystemExit & sys.exit : Allow both 
exit status and message https://bugs.python.org/issue36691 opened by takluyver #36692: Unexpected stderr output from test_sys_settrace https://bugs.python.org/issue36692 opened by ncoghlan #36694: Excessive memory use or memory fragmentation when unpickling m https://bugs.python.org/issue36694 opened by Ellenbogen #36697: inspect.getclosurevars returns wrong globals dict https://bugs.python.org/issue36697 opened by Noitul #36698: Shell restart when error message contains non-BMP characters https://bugs.python.org/issue36698 opened by TheMathsGod #36699: building for riscv multilib (patch attached) https://bugs.python.org/issue36699 opened by Andreas K. H??ttel #36700: base64 has old references that should be updated https://bugs.python.org/issue36700 opened by paulehoffman #36702: test_dtrace failed https://bugs.python.org/issue36702 opened by sayno996 #36703: [Easy][Windows] test_subprocess: test_close_fds_with_stdio() h https://bugs.python.org/issue36703 opened by vstinner #36704: logging.FileHandler currently hardcodes errors='strict' https://bugs.python.org/issue36704 opened by sourcejedi #36706: Python script on startup stucks at import https://bugs.python.org/issue36706 opened by serge g #36709: Asyncio SSL keep-alive connections raise errors after loop clo https://bugs.python.org/issue36709 opened by tomchristie #36710: Pass _PyRuntimeState as an argument rather than using the _PyR https://bugs.python.org/issue36710 opened by vstinner #36711: duplicate method definition in Lib/email/feedparser.py https://bugs.python.org/issue36711 opened by xdegaye #36712: duplicate method definition in Lib/email/test/test_email_renam https://bugs.python.org/issue36712 opened by xdegaye #36713: uplicate method definition in Lib/ctypes/test/test_unicode.py https://bugs.python.org/issue36713 opened by xdegaye #36714: Tweak doctest 'example' regex to allow a leading ellipsis in ' https://bugs.python.org/issue36714 opened by bskinn #36715: Dictionary initialization https://bugs.python.org/issue36715 opened by Aditya Sane #36716: Embedded Python fails to import module files with version_plat https://bugs.python.org/issue36716 opened by ecosatto #36717: Allow retrieval of return value from the target of a threading https://bugs.python.org/issue36717 opened by Joel Croteau2 #36719: regrtest --findleaks should fail if an uncollectable object is https://bugs.python.org/issue36719 opened by vstinner #36721: Add pkg-config python-3.8-embed https://bugs.python.org/issue36721 opened by vstinner #36723: Unittest Discovery for namespace subpackages dot notation fail https://bugs.python.org/issue36723 opened by mrwaffles #36724: Clear _PyRuntime at exit https://bugs.python.org/issue36724 opened by vstinner #36725: Reference leak regression with Python3.8a3 https://bugs.python.org/issue36725 opened by kayhayen #36728: Remove PyEval_ReInitThreads() from the public C API https://bugs.python.org/issue36728 opened by vstinner #36729: Delete unused text variable on tests https://bugs.python.org/issue36729 opened by eamanu #36730: Change outdated references to macOS https://bugs.python.org/issue36730 opened by Sebastian Bassi #36732: test_asyncio: test_huge_content_recvinto() fails randomly https://bugs.python.org/issue36732 opened by vstinner #36734: Modules/faulthandler.c does not compile on HP-UX due to bpo-35 https://bugs.python.org/issue36734 opened by michael-o Most recent 15 issues with no replies (15) ========================================== #36732: test_asyncio: test_huge_content_recvinto() fails randomly 
https://bugs.python.org/issue36732 #36730: Change outdated references to macOS https://bugs.python.org/issue36730 #36729: Delete unused text variable on tests https://bugs.python.org/issue36729 #36728: Remove PyEval_ReInitThreads() from the public C API https://bugs.python.org/issue36728 #36723: Unittest Discovery for namespace subpackages dot notation fail https://bugs.python.org/issue36723 #36721: Add pkg-config python-3.8-embed https://bugs.python.org/issue36721 #36717: Allow retrieval of return value from the target of a threading https://bugs.python.org/issue36717 #36713: uplicate method definition in Lib/ctypes/test/test_unicode.py https://bugs.python.org/issue36713 #36712: duplicate method definition in Lib/email/test/test_email_renam https://bugs.python.org/issue36712 #36703: [Easy][Windows] test_subprocess: test_close_fds_with_stdio() h https://bugs.python.org/issue36703 #36702: test_dtrace failed https://bugs.python.org/issue36702 #36699: building for riscv multilib (patch attached) https://bugs.python.org/issue36699 #36692: Unexpected stderr output from test_sys_settrace https://bugs.python.org/issue36692 #36676: Make ET.XMLParser target aware of namespace prefixes https://bugs.python.org/issue36676 #36675: Doctest directives and comments not visible or missing from co https://bugs.python.org/issue36675 Most recent 15 issues waiting for review (15) ============================================= #36734: Modules/faulthandler.c does not compile on HP-UX due to bpo-35 https://bugs.python.org/issue36734 #36729: Delete unused text variable on tests https://bugs.python.org/issue36729 #36725: Reference leak regression with Python3.8a3 https://bugs.python.org/issue36725 #36724: Clear _PyRuntime at exit https://bugs.python.org/issue36724 #36719: regrtest --findleaks should fail if an uncollectable object is https://bugs.python.org/issue36719 #36715: Dictionary initialization https://bugs.python.org/issue36715 #36710: Pass _PyRuntimeState as an argument rather than using the _PyR https://bugs.python.org/issue36710 #36688: _dummy_thread lacks an RLock implementaiton https://bugs.python.org/issue36688 #36685: C implementation of xml.etree.ElementTree does not make a copy https://bugs.python.org/issue36685 #36683: duplicate method definition in Lib/test/test_utf8_mode.py https://bugs.python.org/issue36683 #36681: duplicate method definition in Lib/test/test_logging.py https://bugs.python.org/issue36681 #36680: duplicate method definition in Lib/test/test_importlib/test_u https://bugs.python.org/issue36680 #36679: duplicate method definition in Lib/test/test_genericclass.py https://bugs.python.org/issue36679 #36678: duplicate method definitions in Lib/test/test_dataclasses.py https://bugs.python.org/issue36678 #36676: Make ET.XMLParser target aware of namespace prefixes https://bugs.python.org/issue36676 Top 10 most discussed issues (10) ================================= #36710: Pass _PyRuntimeState as an argument rather than using the _PyR https://bugs.python.org/issue36710 12 msgs #36725: Reference leak regression with Python3.8a3 https://bugs.python.org/issue36725 8 msgs #35224: PEP 572: Assignment Expressions https://bugs.python.org/issue35224 6 msgs #36670: test suite broken due to cpu usage feature on win 10/ german https://bugs.python.org/issue36670 6 msgs #16079: list duplicate test names with patchcheck https://bugs.python.org/issue16079 5 msgs #35824: http.cookies._CookiePattern modifying regular expressions https://bugs.python.org/issue35824 5 msgs #36661: Missing dataclass decorator 
import in dataclasses module docs https://bugs.python.org/issue36661 5 msgs #36719: regrtest --findleaks should fail if an uncollectable object is https://bugs.python.org/issue36719 5 msgs #32424: Synchronize copy methods between Python and C implementations https://bugs.python.org/issue32424 4 msgs #36624: cleanup the stdlib and tests with regard to sys.platform usage https://bugs.python.org/issue36624 4 msgs Issues closed (40) ================== #9194: winreg:fixupMultiSZ should check that P < Q in the inner loop https://bugs.python.org/issue9194 closed by steve.dower #17349: wsgiref.simple_server.demo_app is not PEP-3333 compatible https://bugs.python.org/issue17349 closed by berker.peksag #18372: _Pickler_New() doesn't call PyObject_GC_Track(self) https://bugs.python.org/issue18372 closed by inada.naoki #21536: extension built with a shared python cannot be loaded with a s https://bugs.python.org/issue21536 closed by vstinner #23078: unittest.mock patch autospec doesn't work on staticmethods https://bugs.python.org/issue23078 closed by berker.peksag #24011: Add error checks to PyInit_signal() https://bugs.python.org/issue24011 closed by berker.peksag #28552: Distutils fail if sys.executable is None https://bugs.python.org/issue28552 closed by vstinner #30840: Contrary to documentation, relative imports cannot pass throug https://bugs.python.org/issue30840 closed by ncoghlan #35149: pip3 show causing Error for ConfigParaser https://bugs.python.org/issue35149 closed by eryksun #36454: test_time: test_monotonic() failed on AMD64 FreeBSD 10-STABLE https://bugs.python.org/issue36454 closed by vstinner #36465: Make release and debug ABI compatible https://bugs.python.org/issue36465 closed by vstinner #36469: Stuck during interpreter exit, attempting to take the GIL https://bugs.python.org/issue36469 closed by mocramis #36523: Add docstring to io.IOBase.writelines https://bugs.python.org/issue36523 closed by inada.naoki #36546: Add quantiles() to the statistics module https://bugs.python.org/issue36546 closed by rhettinger #36635: Add _testinternalcapi module https://bugs.python.org/issue36635 closed by vstinner #36645: re.sub() library entry does not adequately document surprising https://bugs.python.org/issue36645 closed by berker.peksag #36650: Cached method implementation no longer works on Python 3.7.3 https://bugs.python.org/issue36650 closed by rhettinger #36658: Py_Initialze() throws error 'unable to load the file system en https://bugs.python.org/issue36658 closed by steve.dower #36659: distutils UnixCCompiler: Remove standard library path from rpa https://bugs.python.org/issue36659 closed by vstinner #36668: semaphore_tracker is not reused by child processes https://bugs.python.org/issue36668 closed by pitrou #36669: weakref proxy doesn't support the matrix multiplication operat https://bugs.python.org/issue36669 closed by SilentGhost #36671: str.lower() looses character information when working with UTF https://bugs.python.org/issue36671 closed by SilentGhost #36677: support visual studio multiprocess compile https://bugs.python.org/issue36677 closed by Manjusaka #36682: duplicate method definitions in Lib/test/test_sys_setprofile.p https://bugs.python.org/issue36682 closed by steve.dower #36687: subprocess encoding https://bugs.python.org/issue36687 closed by sbstp #36690: A typing error in demo rpython.py https://bugs.python.org/issue36690 closed by berker.peksag #36693: Reversing large ranges results in a minor type inconsistency https://bugs.python.org/issue36693 closed by 
rhettinger #36695: Change (regression?) in v3.8.0a3 doctest output after capturin https://bugs.python.org/issue36695 closed by bskinn #36696: possible multiple regressions on AIX https://bugs.python.org/issue36696 closed by vstinner #36701: module 'urllib' has no attribute 'request' https://bugs.python.org/issue36701 closed by brett.cannon #36705: Unexpected Behaviour of pprint.pprint https://bugs.python.org/issue36705 closed by fdrake #36707: The "m" ABI flag of SOABI for pymalloc is no longer needed https://bugs.python.org/issue36707 closed by vstinner #36708: can not execute the python + version, to launch python under w https://bugs.python.org/issue36708 closed by brett.cannon #36718: Python 2.7 compilation fails on AMD64 Ubuntu Shared 2.7 buildb https://bugs.python.org/issue36718 closed by vstinner #36720: Correct Should to Must in Definition of object.__len__ https://bugs.python.org/issue36720 closed by brett.cannon #36722: In debug build, load also C extensions compiled in release mod https://bugs.python.org/issue36722 closed by vstinner #36726: Empty select() on windows gives error. https://bugs.python.org/issue36726 closed by martin.panter #36727: python 3.6+ docs use ul tags instead of ol tags https://bugs.python.org/issue36727 closed by eric.smith #36731: Add example to priority queue https://bugs.python.org/issue36731 closed by rhettinger #36733: make regen-all doesn't work in subfolder: No module named Pars https://bugs.python.org/issue36733 closed by vstinner From pwang at anaconda.com Fri Apr 26 17:57:31 2019 From: pwang at anaconda.com (Peter Wang) Date: Fri, 26 Apr 2019 16:57:31 -0500 Subject: [Python-Dev] Increasing the C-optimized pickle extensibility In-Reply-To: References: <2040555306.8345029.1556281951734.JavaMail.zimbra@inria.fr> Message-ID: I strongly second not breaking backwards compatibility and interoperability, especially for persistent artifacts, unless there is a *REALLY* good reason. A potential unintended side effect of such breakages is that it slows down adoption of the new version. -Peter On Fri, Apr 26, 2019 at 10:27 AM Guido van Rossum wrote: > I think it's better not to introduce a new opcode, for the reason you > stated -- you don't want your pickles to be unreadable by older Python > versions, if you can help it. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wes.turner at gmail.com Fri Apr 26 22:14:16 2019 From: wes.turner at gmail.com (Wes Turner) Date: Fri, 26 Apr 2019 22:14:16 -0400 Subject: [Python-Dev] Actor Model in Python In-Reply-To: References: Message-ID: https://trio.readthedocs.io/en/latest/reference-core.html#synchronizing-and-communicating-between-tasks https://pypi.org/search/?q=Actor+model https://en.wikipedia.org/wiki/Actor_model https://en.wikipedia.org/wiki/Bulk_synchronous_parallel#The_model On Friday, April 26, 2019, Eric Snow wrote: > On Sun, Mar 31, 2019 at 9:19 AM Aratz Manterola Lasa via Python-Dev > wrote: > > I was wondering if there was any Python project aiming to implement the > actor model for python concurrency. > > As far as the standard library goes, the explicitly supported > concurrency models are: threading, multiprocessing, and async/await. > Between these (and a few other parts provided by Python) anyone can > build libraries that emulate various other concurrency models. Such > libraries exist on the cheeseshop (PyPI), though I don't know about > packages for the actor model specifically. I recommend searching > there for such packages. 
If you don't find one then perhaps you've > found a new project to start. :) > > Also, I have a proposal [1] for Python 3.9 that provides first class > [low level] support for concurrency models like CSP and the actor > model. This is done with multiple [mostly] isolated interpreters per > process and with basic "channels" for safely passing messages between > them. While the proposed library is intended to be useful on its own, > it is also intended to provide effective building blocks for library > authors. Note that the PEP has not been accepted and is not > guaranteed to be accepted (though I'm hopeful). > > > Regardless, consider posting to python-list at python.org for feedback > from the broader Python community. This list is specifically used for > the development of the Python language itself. Thanks! Or python-ideas at python.org , though I'm not sure what would be needed from core Python or stdlib to create another actor model abstraction on top of the actual concurrency primitives. Truly functional actors are slow when/because the memory is not shared inter-process https://arrow.apache.org/docs/python/memory.html#referencing-and-allocating-memory https://arrow.apache.org/docs/python/ipc.html#arbitrary-object-serialization https://www.python.org/dev/peps/pep-0554/#interpreter-isolation > > -eric > > > [1] https://www.python.org/dev/peps/pep-0554/ "PEP 554 -- Multiple Interpreters in the Stdlib" https://www.python.org/dev/peps/pep-0554/ Is there / are there Issues, PRs, and Mailing List Threads regarding the status of this proposal? So sorry to interrupt, > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > wes.turner%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From armin.rigo at gmail.com Sat Apr 27 01:22:55 2019 From: armin.rigo at gmail.com (Armin Rigo) Date: Sat, 27 Apr 2019 07:22:55 +0200 Subject: [Python-Dev] Use C extensions compiled in release mode on a Python compiled in debug mode In-Reply-To: <20190424185501.j6kjdbmb4adj4x6x@python.ca> References: <20190424185501.j6kjdbmb4adj4x6x@python.ca> Message-ID: Hi Neil, On Wed, 24 Apr 2019 at 21:17, Neil Schemenauer wrote: > Regarding the Py_TRACE_REFS fields, I think we can't do them without > breaking the ABI because of the following. For GC objects, they are > always allocated by _PyObject_GC_New/_PyObject_GC_NewVar. So, we > can allocate the extra space needed for the GC linked list. For > non-GC objects, that's not the case. Extensions can allocate using > malloc() directly or their own allocator and then pass that memory > to be initialized as a PyObject. > > I think that's a poor design and I think we should try to make slow > progress in fixing it. Such progress needs to start with the global static PyTypeObjects that all extensions define. This is going to be impossible to fix without requiring a big fix in of *all* of them. (Unless of course you mean to still allow them, but then Py_TRACE_REF can't be implemented in a way that doesn't break the ABI.) A bient?t, Armin. On Wed, 24 Apr 2019 at 21:17, Neil Schemenauer wrote: > > On 2019-04-24, Victor Stinner wrote: > > The current blocker issue is that the Py_DEBUG define imply the > > Py_TRACE_REFS define > > I think your change to make Py_TRACE_REFS as separate configure flag > is fine. 
I've used the trace fields to debug occasionally but I > don't use it often enough to need it enabled by Py_DEBUG. > > > Being able to switch between Python in release mode and Python in > > debug mode is a first step. My long term plan would be to better > > separate "Python" from its "runtime". > > Regarding the Py_TRACE_REFS fields, I think we can't do them without > breaking the ABI because of the following. For GC objects, they are > always allocated by _PyObject_GC_New/_PyObject_GC_NewVar. So, we > can allocate the extra space needed for the GC linked list. For > non-GC objects, that's not the case. Extensions can allocate using > malloc() directly or their own allocator and then pass that memory > to be initialized as a PyObject. > > I think that's a poor design and I think we should try to make slow > progress in fixing it. I think non-GC objects should also get > allocated by a Python API. In that case, the Py_TRACE_REFS > functionality could be implemented in a way that doesn't break the > ABI. It also makes the CPython API more friendly for alternative > Python runtimes like PyPy, etc. > > Note that this change would not prevent an extension from allocating > memory with it's own allocator. It just means that memory can't > hold a PyObject. The extension PyObject would need to have a > pointer that points to this externally allocated memory. > > I can imagine there could be some situations when people really > want a PyObject to reside in a certain memory location. E.g. maybe > you have some kind of special shared memory area. In that case, I > think we could have specialized APIs to create PyObjects using a > specialized allocator. Those APIs would not be supported by > some runtimes (e.g. tracing/moving GC for PyObjects) and the APIs > would not be used by most extensions. > > Regards, > > Neil > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/armin.rigo%40gmail.com From mark at hotpy.org Sat Apr 27 05:19:15 2019 From: mark at hotpy.org (Mark Shannon) Date: Sat, 27 Apr 2019 10:19:15 +0100 Subject: [Python-Dev] PEP 590 discussion In-Reply-To: <5CB442F3.5060705@UGent.be> References: <6c8356e3-f9b2-c39e-63c4-17f146d326b7@hotpy.org> <15b8a3d7-00ed-a5eb-475c-a3adee671b5f@hotpy.org> <5C9FEF82.50207@UGent.be> <421f8182-4bc8-b8cf-82d6-ca4a4fbd2013@hotpy.org> <50d675b4-839c-6502-ad1a-a33ea9330000@redhat.com> <5CB442F3.5060705@UGent.be> Message-ID: <988d377d-fcf1-bbcc-152f-3f54d820bdbf@hotpy.org> Hi Jeroen, On 15/04/2019 9:38 am, Jeroen Demeyer wrote: > On 2019-04-14 13:30, Mark Shannon wrote: >> PY_VECTORCALL_ARGUMENTS_OFFSET exists so that callables that make onward >> calls with an additional argument can do so efficiently. The obvious >> example is bound-methods, but classes are at least as important. >> cls(*args) -> cls.new(cls, *args) -> cls.__init__(self, *args) > > But tp_new and tp_init take the "cls" and "self" as separate arguments, > not as part of *args. So I don't see why you need > PY_VECTORCALL_ARGUMENTS_OFFSET for this. Here's some (untested) code for an implementation of vectorcall for object subtypes implemented in Python. It uses PY_VECTORCALL_ARGUMENTS_OFFSET to save memory allocation when calling the __init__ method. https://github.com/python/cpython/commit/9ff46e3ba0747f386f9519933910d63d5caae6ee#diff-c3cf251f16d5a03a9e7d4639f2d6f998R3820 Cheers, Mark. 
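For readers without the C context, the pattern being optimized is roughly the following pure-Python shape (a sketch only, not the actual C code linked above): each onward call needs the caller's argument array with exactly one extra object placed in front of it.

    def construct(cls, args, kwargs):
        # What type.__call__ does, conceptually: two onward calls, each
        # prepending one object to the same arguments -- cls for __new__,
        # then the new instance for __init__.
        self = cls.__new__(cls, *args, **kwargs)
        if isinstance(self, cls):
            self.__init__(*args, **kwargs)
        return self

The PY_VECTORCALL_ARGUMENTS_OFFSET flag is the C-level counterpart: the caller leaves one spare slot in front of the argument array, so a callee that only needs to prepend a single object can reuse the caller's array instead of allocating and copying a new one.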
From mark at hotpy.org Sat Apr 27 05:26:29 2019 From: mark at hotpy.org (Mark Shannon) Date: Sat, 27 Apr 2019 10:26:29 +0100 Subject: [Python-Dev] PEP 580 and PEP 590 comparison. In-Reply-To: <5CB4422D.8030507@UGent.be> References: <5CB4422D.8030507@UGent.be> Message-ID: <04dc374d-a648-a3f0-5a1f-fee4ec2b1b98@hotpy.org> Hi, On 15/04/2019 9:34 am, Jeroen Demeyer wrote: > On 2019-04-14 13:34, Mark Shannon wrote: >> I'll address capability first. > > I don't think that comparing "capability" makes a lot of sense since > neither PEP 580 nor PEP 590 adds any new capabilities to CPython. They > are meant to allow doing things faster, not to allow more things. > > And yes, the C call protocol can be implemented on top of the vectorcall > protocol and conversely, but that doesn't mean much. That isn't true. You cannot implement PEP 590 on top of PEP 580. PEP 580 isn't as general. Specifically, and this is important, PEP 580 cannot implement efficient calls to class objects without breaking the ABI. > >> Now performance. >> >> Currently the PEP 590 implementation is intentionally minimal. It does >> nothing for performance. > > So, we're missing some information here. What kind of performance > improvements are possible with PEP 590 which are not in the reference > implementation? Performance improvements include, but aren't limited to: 1. Much faster calls to common classes: range(), set(), type(), list(), etc. 2. Modifying argument clinic to produce C functions compatible with the vectorcall protocol, allowing the interpreter to call the C function directly, with no additional overhead beyond the vectorcall call sequence. 3. Customization of the C code for function objects depending on the Python code. This would probably be limited to treating closures and generator functions differently, but optimizing other aspects of the Python function call is possible. > >> The benchmark Jeroen provides is a >> micro-benchmark that calls the same functions repeatedly. This is >> trivial and unrealistic. > > Well, it depends what you want to measure... I'm trying to measure > precisely the thing that makes PEP 580 and PEP 590 different from the > status-quo, so in that sense those benchmarks are very relevant. > > I think that the following 3 statements are objectively true: > > (A) Both PEP 580 and PEP 590 add a new calling convention, which is > equally fast as builtin functions (and hence faster than tp_call). Yes > (B) Both PEP 580 and PEP 590 keep roughly the same performance as the > status-quo for existing function/method calls. For the minimal implementation of PEP 590, yes. I would expect a small improvement with an implementation of PEP 590 including optimizations. > (C) While the performance of PEP 580 and PEP 590 is roughly the same, > PEP 580 is slightly faster (based on the reference implementations > linked from PEP 580 and PEP 590) I quite deliberately used the term "minimal" to describe the implementation of PEP 590 you have been using. PEP 590 allows many optimizations. Comparing the performance of the four hundred line minimal diff for PEP 590 with the full four thousand line diff for PEP 580 is misleading. > > Two caveats concerning (C): > - the difference may be too small to matter. Relatively, it's a few > percent of the call time but in absolute numbers, it's less than 10 CPU > clock cycles. > - there might be possible improvements to the reference implementation > of either PEP 580/PEP 590. I don't expect big differences though.
> >> To repeat an example >> from an earlier email, which may have been overlooked, this code reduces >> the time to create ranges and small lists by about 30% > > That's just a special case of the general fact (A) above and using the > new calling convention for "type". It's an argument in favor of both PEP > 580 and PEP 590, not for PEP 590 specifically. It very much is an argument in favor of PEP 590. PEP 580 cannot do this. Cheers, Mark. From mark at hotpy.org Sat Apr 27 07:32:54 2019 From: mark at hotpy.org (Mark Shannon) Date: Sat, 27 Apr 2019 12:32:54 +0100 Subject: [Python-Dev] PEP 580/590 discussion In-Reply-To: References: <6c8356e3-f9b2-c39e-63c4-17f146d326b7@hotpy.org> <15b8a3d7-00ed-a5eb-475c-a3adee671b5f@hotpy.org> <5C9FEF82.50207@UGent.be> <421f8182-4bc8-b8cf-82d6-ca4a4fbd2013@hotpy.org> <50d675b4-839c-6502-ad1a-a33ea9330000@redhat.com> Message-ID: <4da05381-9cce-2e41-f76f-cc33f359268c@hotpy.org> Hi Petr, On 24/04/2019 11:24 pm, Petr Viktorin wrote: > So, I spent another day pondering the PEPs. > > I love PEP 590's simplicity and PEP 580's extensibility. As I hinted > before, I hope they can be combined, and I believe we can achieve > that by having PEP 590's (o+offset) point not just to function pointer, > but to a {function pointer; flags} struct with flags defined for two > optimizations: > - "Method-like", i.e. compatible with LOAD_METHOD/CALL_METHOD. > - "Argument offsetting request", allowing PEP 590's > PY_VECTORCALL_ARGUMENTS_OFFSET optimization. A big problem with adding another field to the structure is that it prevents classes from implementing vectorcall. A 30% reduction in the time to create ranges, small lists and sets and to call type(x) is easily worth a single tp_flag, IMO. As an aside, there are currently over 10 spare flags. As long as we don't consume more than one a year, we have over a decade to make tp_flags a uint64_t. It already consumes 64 bits on any 64 bit machine, due to the struct layout. As I've said before, PEP 590 is universal and capable of supporting an implementation of PEP 580 on top of it. Therefore, adding any flags or fields from PEP 580 to PEP 590 will not increase its capability. Since any extra fields will require at least as many memory accesses as before, it will not improve performance and by restricting layout may decrease it. > > This would mean one basic call signature (today's METH_FASTCALL | > METH_KEYWORD), with individual optimizations available if both the > caller and callee support them. > That would prevent the code having access to the callable object. That access is a fundamental part of both PEP 580 and PEP 590 and the key motivating factor for both. > > > In case you want to know my thoughts or details, let me indulge in some > detailed comparisons and commentary that led to this. > I also give a more detailed proposal below. > Keep in mind I wrote this before I distilled it to the paragraph above, > and though the distillation is written as a diff to PEP 590, I still > think of this as merging both PEPs. > > > PEP 580 tries hard to work with existing call conventions (like METH_O, > METH_VARARGS), making them fast. > PEP 590 just defines a new convention. Basically, any callable that > wants performance improvements must switch to METH_VECTORCALL (fastcall). > I believe PEP 590's approach is OK. To stay as performant as possible, C > extension authors will need to adapt their code regularly. If they > don't, no harm -- the code will still work as before, and will still be > about as fast as it was before.
As I see it, authors of C extensions have five options with PEP 590. Option 4, do nothing, is the recommended option :) 1. Use the PyMethodDef protocol, it will work exactly the same as before. It's already fairly quick in most cases. 2. Use Cython and let Cython take care of handling the vectorcall interface. 3. Use Argument Clinic, and let Argument Clinic take care of handling the vectorcall interface. 4. Do nothing. This the same as 1-3 above depending on what you were already doing. 5. Implement the vectorcall call directly. This might be a bit quicker than the above, but probably not enough to be worth it, unless you are implementing numpy or something like that. > In exchange for this, Python (and Cython, etc.) can focus on optimizing > one calling convention, rather than a variety, each with its own > advantages and drawbacks. > > Extending PEP 580 to support a new calling convention will involve > defining a new CCALL_* constant, and adding to existing dispatch code. > Extending PEP 590 to support a new calling convention will most likely > require a new type flag, and either changing the vectorcall semantics or > adding a new pointer. > To be a bit more concrete, I think of possible extensions to PEP 590 as > things like: > - Accepting a kwarg dict directly, without copying the items to > tuple/array (as in PEP 580's CCALL_VARARGS|CCALL_KEYWORDS) > - Prepending more than one positional argument, or appending positional > arguments > - When an optimization like LOAD_METHOD/CALL_METHOD turns out to no > longer be relevant, removing it to simplify/speed up code. > I expect we'll later find out that something along these lines might > improve performance. PEP 590 would make it hard to experiment. > > I mentally split PEP 590 into two pieces: formalizing fastcall, plus one > major "extension" -- making bound methods fast. Not just bound methods, any callable that adds an extra argument before dispatching to another callable. This includes builtin-methods, classes and a few others. Setting the Py_TPFLAGS_METHOD_DESCRIPTOR flag states the behaviour of the object when used as a descriptor. It is up to the implementation to use that information how it likes. If LOAD_METHOD/CALL_METHOD gets replaced, then the new implementation can still use this information. > When seen this way, this "extension" is quite heavy: it adds an > additional type flag, Py_TPFLAGS_METHOD_DESCRIPTOR, and uses a bit in > the "Py_ssize_t nargs" argument as additional flag. Both type flags and > nargs bits are very limited resources. If I was sure vectorcall is the > final best implementation we'll have, I'd go and approve it ? but I > think we still need room for experimentation, in the form of more such > extensions. > PEP 580, with its collection of per-instance data and flags, is > definitely more extensible. What I don't like about it is that it has > the extensions built-in; mandatory for all callers/callees. > > PEP 580 adds a common data struct to callable instances. Currently these > are all data bound methods want to use (cc_flags, cc_func, cc_parent, > cr_self). Various flags are consulted in order to deliver the needed > info to the underlying function. > PEP 590 lets the callable object store data it needs independently. It > provides a clever mechanism for pre-allocating space for bound methods' > prepended "self" argument, so data can be provided cheaply, though it's > still done by the callable itself. > Callables that would need to e.g. 
prepend more than one argument won't > be able to use this mechanism, but y'all convinced me that is not worth > optimizing for. > > PEP 580's goal seems to be that making a callable behave like a Python > function/method is just a matter of the right set of flags. Jeroen > called this "complexity in the protocol". > PEP 590, on the other hand, leaves much to individual callable types. > This is "complexity in the users of the protocol". > I now don't see a problem with PEP 590's approach. Not all users will > need the complexity. We need to give CPython and Cython the tools to > make implementing "def"-like functions possible (and fast), but if other > extensions need to match the behavior of Python functions, they should > just use Cython. Emulating Python functions is a special-enough use case > that it doesn't justify complicating the protocol, and the same goes for > implementing Python's built-in functions (with all their historical > baggage). > > > > My more full proposal for a compromise between PEP 580 and 590 would go > something like below. > > The type flag (Py_TPFLAGS_HAVE_VECTORCALL/Py_TPFLAGS_HAVE_CCALL) and > offset (tp_vectorcall_offset/tp_ccalloffset; in tp_print's place) stay. > > The offset identifies a per-instance structure with two fields: > - Function pointer (with the vectorcall signature) > - Flags > Storing any other per-instance data (like PEP 580's cr_self/cc_parent) > is the responsibility of each callable type. > > Two flags are defined initially: > 1. "Method-like" (like Py_TPFLAGS_METHOD_DESCRIPTOR in PEP 580, or > non-NULL cr_self in PEP 580). Having the flag here instead of a type > flag will prevent tp_call-only callables from taking advantage of > LOAD_METHOD/CALL_METHOD optimisation, but I think that's OK. > > 2. Request to reserve space for one argument before the args array, as > in PEP 590's argument offsetting. If the flag is missing, nargs may not > include PY_VECTORCALL_ARGUMENTS_OFFSET. A mechanism incompatible with > offsetting may use the bit for another purpose. > > Both flags may be simply ignored by the caller (or not be set by the > callee in the first place), reverting to a more straightforward (but > less performant) code path. This should also be the case for any flags > added in the future. > Note how without these flags, the protocol (and its documentation) will > be extremely simple. > This mechanism would work with my examples of possible future extensions: > - "kwarg dict": A flag would enable the `kwnames` argument to be a dict > instead of a tuple. > - prepending/appending several positional arguments: The callable's > request for how much space to allocate stored right after the {func; > flags} struct. As in argument offsetting, a bit in nargs would indicate > that the request was honored. (If this was made incompatible with > one-arg offsetting, it could reuse the bit.) > - removing an optimization: CPython would simply stop using an > optimizations (but not remove the flag). Extensions could continue to > use the optimization between themselves. This seems a lot more complex than the caller setting a bit to tell the callee whether it has allocated extra space. > > As in PEP 590, any class that uses this mechanism shall not be usable as > a base class. This will simplify implementation and tests, but hopefully > the limitation will be removed in the future. (Maybe even in the initial > implementation.) > > The METH_VECTORCALL (aka CCALL_FASTCALL|CCALL_KEYWORDS) calling > convention is added to the public API. 
The other calling conventions > (PEP 580's CCALL_O, CCALL_NOARGS, CCALL_VARARGS, CCALL_KEYWORDS, > CCALL_FASTCALL, CCALL_DEFARG) as well as argument type checking > (CCALL_OBJCLASS) and self slicing (CCALL_SELFARG) are left up to the > callable. > > No equivalent of PEP 580's restrictions on the __name__ attribute. In my > opinion, the PyEval_GetFuncName function should just be deprecated in > favor of getting the __name__ attribute and checking if it's a string. > It would be possible to add a public helper that returns a proper > reference, but that doesn't seem worth it. Either way, I consider this > out of scope of this PEP. > > No equivalent of PEP 580's PyCCall_GenericGetParent and > PyCCall_GenericGetQualname either -- again, if needed, they should be > retrieved as normal attributes. As I see it, the operation doesn't need > to be particularly fast. > > No equivalent of PEP 580's PyCCall_Call, and no support for dict in > PyCCall_FastCall's kwds argument. To be fast, extensions should avoid > passing kwargs in a dict. Let's see how far that takes us. (FWIW, this > also avoids subtle issues with dict mutability.) > > Profiling stays as in PEP 580: only exact function types generate the > events. > > As in PEP 580, PyCFunction_GetFlags and PyCFunction_GET_FLAGS are > deprecated > > As in PEP 580, nothing is added to the stable ABI > > > Does that sound reasonable? From mark at hotpy.org Sat Apr 27 08:07:25 2019 From: mark at hotpy.org (Mark Shannon) Date: Sat, 27 Apr 2019 13:07:25 +0100 Subject: [Python-Dev] PEP 580/590 discussion In-Reply-To: <5CC1C765.1020608@UGent.be> References: <6c8356e3-f9b2-c39e-63c4-17f146d326b7@hotpy.org> <15b8a3d7-00ed-a5eb-475c-a3adee671b5f@hotpy.org> <5C9FEF82.50207@UGent.be> <421f8182-4bc8-b8cf-82d6-ca4a4fbd2013@hotpy.org> <50d675b4-839c-6502-ad1a-a33ea9330000@redhat.com> <5CC1C765.1020608@UGent.be> Message-ID: Hi Jeroen, On 25/04/2019 3:42 pm, Jeroen Demeyer wrote: > On 2019-04-25 00:24, Petr Viktorin wrote: >> I believe we can achieve >> that by having PEP 590's (o+offset) point not just to function pointer, >> but to a {function pointer; flags} struct with flags defined for two >> optimizations: > > What's the rationale for putting the flags in the instance? Do you > expect flags to be different between one instance and another instance > of the same class? > >> Both type flags and >> nargs bits are very limited resources. > > Type flags are only a limited resource if you think that all flags ever > added to a type must be put into tp_flags. There is nothing wrong with > adding new fields tp_extraflags or tp_vectorcall_flags to a type. > >> What I don't like about it is that it has >> the extensions built-in; mandatory for all callers/callees. > > I don't agree with the above sentence about PEP 580: > - callers should use APIs like PyCCall_FastCall() and shouldn't need to > worry about the implementation details at all. > - callees can opt out of all the extensions by not setting any special > flags and setting cr_self to a non-NULL value. When using the flags > CCALL_FASTCALL | CCALL_KEYWORDS, then implementing the callee is exactly > the same as PEP 590. > >> As in PEP 590, any class that uses this mechanism shall not be usable as >> a base class. > > Can we please lift this restriction? There is really no reason for it. > I'm not aware of any similar restriction anywhere in CPython. Note that > allowing subclassing is not the same as inheriting the protocol. As a > compromise, we could simply never inherit the protocol. 
AFAICT, any limitations on subclassing exist solely to prevent tp_call and the PEP 580/590 function pointer being in conflict. This limitation is inherent and the same for both PEPs. Do you agree? Let us consider a class C that sets the Py_TPFLAGS_HAVE_CCALL/Py_TPFLAGS_HAVE_VECTORCALL flag. It will set the function pointer in a new instance, C(), when the object is created. If we create a new class D: class D(C): def __call__(self, ...): ... and then create an instance `d = D()` then calling d will have two contradictory behaviours; the one installed by C in the function pointer and the one specified by D.__call__ We can ensure correct behaviour by setting the function pointer to NULL or a forwarding function (depending on the implementation) if __call__ has been overridden. This would be enforced at class creation/readying time. Cheers, Mark. From J.Demeyer at UGent.be Sat Apr 27 08:27:53 2019 From: J.Demeyer at UGent.be (Jeroen Demeyer) Date: Sat, 27 Apr 2019 14:27:53 +0200 Subject: [Python-Dev] PEP 580/590 discussion In-Reply-To: References: <6c8356e3-f9b2-c39e-63c4-17f146d326b7@hotpy.org> <15b8a3d7-00ed-a5eb-475c-a3adee671b5f@hotpy.org> <5C9FEF82.50207@UGent.be> <421f8182-4bc8-b8cf-82d6-ca4a4fbd2013@hotpy.org> <50d675b4-839c-6502-ad1a-a33ea9330000@redhat.com> <5CC1C765.1020608@UGent.be> Message-ID: <5CC44AC9.9020501@UGent.be> On 2019-04-27 14:07, Mark Shannon wrote: > class D(C): > def __call__(self, ...): > ... > > and then create an instance `d = D()` then calling d will have two > contradictory behaviours; the one installed by C in the function pointer > and the one specified by D.__call__ It's true that the function pointer in D will be wrong but it's also irrelevant since the function pointer won't be used: class D won't have the flag Py_TPFLAGS_HAVE_CCALL/Py_TPFLAGS_HAVE_VECTORCALL set. From stefan_ml at behnel.de Sat Apr 27 10:44:29 2019 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sat, 27 Apr 2019 16:44:29 +0200 Subject: [Python-Dev] Use C extensions compiled in release mode on a Python compiled in debug mode In-Reply-To: <6a395e77-d681-69f8-ba32-f6fd47f261cb@ubuntu.com> References: <7aad8d69-2d66-54a3-26a4-20dcc5150f66@ubuntu.com> <6a395e77-d681-69f8-ba32-f6fd47f261cb@ubuntu.com> Message-ID: Matthias Klose schrieb am 25.04.19 um 13:48: > Are there use cases where you only want to load *some* > debug extensions, even if more are installed? Not sure if there are _important_ use cases (that could justify certain design decisions), but I can certainly imagine using a non-debug (and therefore faster) Pandas or NumPy for preparing some data that I need to debug my own code. More generally, whenever I can avoid using a debug version of a *dependency* that I don't need to include in my debug analysis, it's probably a good idea to not use the debug version. Even given venvs and virtualisation techniques, it would probably be nice if users could install debug+nondebug versions of libraries once and then import the right one at need, rather than having to set up a new environment (while they're on a train in the middle of nowhere without fast access to PyPI).
Stefan From mark at hotpy.org Sat Apr 27 05:47:46 2019 From: mark at hotpy.org (Mark Shannon) Date: Sat, 27 Apr 2019 10:47:46 +0100 Subject: [Python-Dev] PEP 590 discussion In-Reply-To: <19993545-3fdf-4bf6-56dd-e926d2032a3a@redhat.com> References: <6c8356e3-f9b2-c39e-63c4-17f146d326b7@hotpy.org> <15b8a3d7-00ed-a5eb-475c-a3adee671b5f@hotpy.org> <5C9FEF82.50207@UGent.be> <421f8182-4bc8-b8cf-82d6-ca4a4fbd2013@hotpy.org> <50d675b4-839c-6502-ad1a-a33ea9330000@redhat.com> <5CAE76B6.6090501@UGent.be> <19993545-3fdf-4bf6-56dd-e926d2032a3a@redhat.com> Message-ID: <73165133-21ec-b030-cbd8-0f57b85c3873@hotpy.org> Hi Petr, On 24/04/2019 11:24 pm, Petr Viktorin wrote: > On 4/10/19 7:05 PM, Jeroen Demeyer wrote: >> On 2019-04-10 18:25, Petr Viktorin wrote: >>> Hello! >>> I've had time for a more thorough reading of PEP 590 and the reference >>> implementation. Thank you for the work! >> >> And thank you for the review! >> >>> I'd now describe the fundamental >>> difference between PEP 580 and PEP 590 as: >>> - PEP 580 tries to optimize all existing calling conventions >>> - PEP 590 tries to optimize (and expose) the most general calling >>> convention (i.e. fastcall) >> >> And PEP 580 has better performance overall, even for METH_FASTCALL. >> See this thread: >> https://mail.python.org/pipermail/python-dev/2019-April/156954.html >> >> Since these PEPs are all about performance, I consider this a very >> relevant argument in favor of PEP 580. > > All about performance as well as simplicity, correctness, testability, > teachability... And PEP 580 touches some introspection :) > >>> PEP 580 also does a number of other things, as listed in PEP 579. But I >>> think PEP 590 does not block future PEPs for the other items. >>> On the other hand, PEP 580 has a much more mature implementation -- and >>> that's where it picked up real-world complexity. >> About complexity, please read what I wrote in >> https://mail.python.org/pipermail/python-dev/2019-March/156853.html >> >> I claim that the complexity in the protocol of PEP 580 is a good >> thing, as it removes complexity from other places, in particular from >> the users of the protocol (better have a complex protocol that's >> simple to use, rather than a simple protocol that's complex to use). > > I think we're talking past each other. I see now it as: > > PEP 580 takes existing complexity and makes it available to all users, > in a simpler way. It makes existing code faster. > > PEP 590 defines a new simple/fast protocol for its users, and instead of > making existing complexity faster and easier to use, it's left to be > deprecated/phased out (or kept in existing classes for backwards > compatibility). It makes it possible for future code to be faster/simpler. > > I think things should be simple by default, but if people want some > extra performance, they can opt in to some extra complexity. > > >> As a more concrete example of the simplicity that PEP 580 could bring, >> CPython currently has 2 classes for bound methods implemented in C: >> - "builtin_function_or_method" for normal C methods >> - "method-descriptor" for slot wrappers like __eq__ or __add__ >> >> With PEP 590, these classes would need to stay separate to get maximal >> performance. With PEP 580, just one class for bound methods would be >> sufficient and there wouldn't be any performance loss. And this >> extends to custom third-party function/method classes, for example as >> implemented by Cython. > > Yet, for backwards compatibility reasons, we can't merge the classes. 
> Also, I think CPython and Cython are exactly the users that can trade > some extra complexity for better performance. > >>> Jeroen's analysis from >>> https://mail.python.org/pipermail/python-dev/2018-July/154238.html seems >>> to miss a step at the top: >>> >>> a. CALL_FUNCTION* / CALL_METHOD opcode >>> ?????? calls >>> b. _PyObject_FastCallKeywords() >>> ?????? which calls >>> c. _PyCFunction_FastCallKeywords() >>> ?????? which calls >>> d. _PyMethodDef_RawFastCallKeywords() >>> ?????? which calls >>> e. the actual C function (*ml_meth)() >>> >>> I think it's more useful to say that both PEPs bridge a->e (via >>> _Py_VectorCall or PyCCall_Call). >> >> Not quite. For a builtin_function_or_method, we have with PEP 580: >> >> a. call_function() >> ???? calls >> d. PyCCall_FastCall >> ???? which calls >> e. the actual C function >> >> and with PEP 590 it's more like: >> >> a. call_function() >> ???? calls >> c. _PyCFunction_FastCallKeywords >> ???? which calls >> d. _PyMethodDef_RawFastCallKeywords >> ???? which calls >> e. the actual C function >> >> Level c. above is the vectorcall wrapper, which is a level that PEP >> 580 doesn't have. > > PEP 580 optimizes all the code paths, where PEP 590 optimizes the fast > path, and makes sure most/all use cases can use (or switch to) the fast > path. > Both fast paths are fast: bridging a->e using zero-copy arg passing with > some C calls and flag checks. > > The PEP 580 approach is faster; PEP 590's is simpler. Why do you say that PEP 580's approach is faster? There is no evidence for this. The only evidence so far is a couple of contrived benchmarks. Jeroen's showed a ~1% speedup for PEP 580 and mine showed a ~30% speed up for PEP 590. This clearly shows that I am better and coming up with contrived benchmarks :) PEP 590 was chosen as the fastest protocol I could come up with that was fully general, and wasn't so complex as to be unusable. > > >>> Jeroen, is there something in PEPs 579/580 that PEP 590 blocks, or >>> should address? >> >> Well, PEP 580 is an extensible protocol while PEP 590 is not. But, >> PyTypeObject is extensible, so even with PEP 590 one can always extend >> that (for example, PEP 590 uses a type flag >> Py_TPFLAGS_METHOD_DESCRIPTOR where PEP 580 instead uses the structs >> for the C call protocol). But I guess that extending PyTypeObject will >> be harder to justify (say, in a future PEP) than extending the C call >> protocol. > > That's a good point. Saying that PEP 590 is not extensible is true, but misleading. PEP 590 is fully universal, it supports callables that can do anything with anything. There is no need for it to be extended because it already supports any possible behaviour. Cheers, Mark. From paul at ganssle.io Sat Apr 27 12:39:57 2019 From: paul at ganssle.io (Paul Ganssle) Date: Sat, 27 Apr 2019 12:39:57 -0400 Subject: [Python-Dev] datetime.fromisocalendar Message-ID: <63756c7f-e07e-3556-7ab0-47c3fc3072de@ganssle.io> Greetings, Some time ago, I proposed adding a `.fromisocalendar` alternate constructor to `datetime` (bpo-36004 ), with a corresponding implementation (PR #11888 ). I advertised it on datetime-SIG some time ago but haven't seen much discussion there, so I'd like to bring it to python-dev's attention as we near the cut-off for new Python 3.8 features. 
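The behaviour I have in mind is simply the inverse of the existing isocalendar() method. A rough sketch of the intended round-trip, assuming the constructor takes the same (year, week, day) triple that isocalendar() returns (the exact signature is still open):

    from datetime import datetime

    dt = datetime(2019, 4, 27)                       # midnight, for a clean round-trip
    iso_year, iso_week, iso_day = dt.isocalendar()   # (2019, 17, 6)
    # proposed inverse:
    assert datetime.fromisocalendar(iso_year, iso_week, iso_day) == dt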
Other than the fact that I've needed this functionality in the past, I also think a good general principle for the datetime module is that when a class (time, date, datetime) has a "serialization" method (.strftime, .timestamp, .isoformat, .isocalendar, etc), there should be a corresponding /deserialization/ method (.strptime, .fromtimestamp, .fromisoformat) that constructs a datetime from the output. Now that `fromisoformat` was introduced in Python 3.7, I think `isocalendar` is the only remaining method without an inverse. Do people agree with this principle? Should we add the `fromisocalendar` method? Thanks, Paul -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: OpenPGP digital signature URL: From jp at nexedi.com Sat Apr 27 12:04:38 2019 From: jp at nexedi.com (Jean-Paul Smets) Date: Sat, 27 Apr 2019 16:04:38 +0000 Subject: [Python-Dev] Actor Model in Python In-Reply-To: References: Message-ID: Hello, ERP5 (https://erp5.nexedi.com) implements the "Actalk" actor model in a library called "CMFActivity". Processing (ex. financial transactions, machine learning) can be distributed on a cluster of servers. Actalk is interesting because it provides a way to unify and combine multiple OOCP models within the same runtime, rather than being forced to use only one. * Actalk: http://www-poleia.lip6.fr/~briot/actalk/papers/PAPERS.html * CMFActivity: https://lab.nexedi.com/nexedi/erp5/tree/master/product/CMFActivity Go channels concurrency model is ported to python: https://pypi.org/project/pygolang/ Nexedi has plans to experiment a port of Actalk to Cython with GIL-less concurrency. Regards, JPS. Le 2019-03-31 12:30, Aratz Manterola Lasa via Python-Dev a écrit : > Hello, > I was wondering if there was any Python project aiming to implement the actor model for python concurrency. Does anyone know it? > Aratz. > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/nicolas%40nexedi.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris at withers.org Sat Apr 27 13:30:10 2019 From: chris at withers.org (Chris Withers) Date: Sat, 27 Apr 2019 18:30:10 +0100 Subject: [Python-Dev] git history conundrum Message-ID: <3140efd6-54ff-40d0-5906-fbb19e99ea76@withers.org> Hi All, I'm in the process of bringing the mock backport up to date, but this has got me stumped: $ git log --oneline --no-merges 5943ea76d529f9ea18c73a61e10c6f53bdcc864f.. -- Lib/unittest/mock.py Lib/unittest/test/testmock/ | tail 362f058a89 Issue #28735: Fixed the comparison of mock.MagickMock with mock.ANY. d9c956fb23 Issue #20804: The unittest.mock.sentinel attributes now preserve their identity when they are copied or pickled. 84b6fb0eea Fix unittest.mock._Call: don't ignore name 161a4dd495 Issue #28919: Simplify _copy_func_details() in unittest.mock ac5084b6c7 Fixes issue28380: unittest.mock Mock autospec functions now properly support assert_called, assert_not_called, and assert_called_once. 0be894b2f6 Issue #27895: Spelling fixes (Contributed by Ville Skyttä). 15f44ab043 Issue #27895: Spelling fixes (Contributed by Ville Skyttä). d4583d7fea Issue #26750: use inspect.isdatadescriptor instead of our own _is_data_descriptor().
9854789efe Issue #26750: unittest.mock.create_autospec() now works properly for subclasses of property() and other data descriptors. 204bf0b9ae English spelling and grammar fixes Right, so I've merged up to 15f44ab043, what comes next? $ git log --oneline? --no-merges 15f44ab043.. -- Lib/unittest/mock.py Lib/unittest/test/testmock/ | tail -n 3 161a4dd495 Issue #28919: Simplify _copy_func_details() in unittest.mock ac5084b6c7 Fixes issue28380: unittest.mock Mock autospec functions now properly support assert_called, assert_not_called, and assert_called_once. 0be894b2f6 Issue #27895:? Spelling fixes (Contributed by Ville Skytt?). Okay, no idea why 0be894b2f6 is there, appears to be a totally identical commit to 15f44ab043, so let's skip it: $ git log --oneline? --no-merges 0be894b2f6.. -- Lib/unittest/mock.py Lib/unittest/test/testmock/ | tail -n 3 161a4dd495 Issue #28919: Simplify _copy_func_details() in unittest.mock ac5084b6c7 Fixes issue28380: unittest.mock Mock autospec functions now properly support assert_called, assert_not_called, and assert_called_once. 15f44ab043 Issue #27895:? Spelling fixes (Contributed by Ville Skytt?). Wat?! Why is 15f44ab043 showing up again?! What's the git subtlety I'm missing here? Chris From jp at nexedi.com Sat Apr 27 12:02:02 2019 From: jp at nexedi.com (Jean-Paul Smets) Date: Sat, 27 Apr 2019 16:02:02 +0000 Subject: [Python-Dev] Actor Model in Python In-Reply-To: References: Message-ID: <61d5f630775e33d2e6d790e2398af069@nexedi.com> Hello, ERP5 https://erp5.nexedi.com) implements the "Actalk" actor model in a library called "CMFActivity". Processing (ex. financial transactions, machine learning) can be distributed on a cluster of servers. Actalk is interesting because it provides a way to unify and combine multiple OOCP models within the same runtime, rather than being forced to use only one. * Actalk: http://www-poleia.lip6.fr/~briot/actalk/papers/PAPERS.html * CMFActivity: https://lab.nexedi.com/nexedi/erp5/tree/master/product/CMFActivity Go channels concurrency model is ported to python: https://pypi.org/project/pygolang/ Nexedi has plans to experiment a port of Actalk to Cython with GIL-less concurrency. Regards, JPS. Le 2019-03-31 12:30, Aratz Manterola Lasa via Python-Dev a ?crit : > Hello, > I was wondering if there was any Python project aiming to implement the actor model for python concurrency. ?Does anyone know it? > Aratz. > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/nicolas%40nexedi.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Sat Apr 27 15:24:10 2019 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 27 Apr 2019 12:24:10 -0700 Subject: [Python-Dev] Use C extensions compiled in release mode on a Python compiled in debug mode In-Reply-To: References: <20190424185501.j6kjdbmb4adj4x6x@python.ca> Message-ID: On Sat, Apr 27, 2019, 04:27 Armin Rigo wrote: > Hi Neil, > > On Wed, 24 Apr 2019 at 21:17, Neil Schemenauer > wrote: > > Regarding the Py_TRACE_REFS fields, I think we can't do them without > > breaking the ABI because of the following. For GC objects, they are > > always allocated by _PyObject_GC_New/_PyObject_GC_NewVar. So, we > > can allocate the extra space needed for the GC linked list. For > > non-GC objects, that's not the case. 
Extensions can allocate using > > malloc() directly or their own allocator and then pass that memory > > to be initialized as a PyObject. > > > > I think that's a poor design and I think we should try to make slow > > progress in fixing it. > > Such progress needs to start with the global static PyTypeObjects that > all extensions define. This is going to be impossible to fix without > requiring a big fix in of *all* of them. (Unless of course you mean > to still allow them, but then Py_TRACE_REF can't be implemented in a > way that doesn't break the ABI.) > For Py_TRACE_REFS specifically, IIUC the only goal is to be able to produce a list of all live objects on demand. If that's the goal, then static type objects aren't a huge deal. You can't add extra data into the type objects themselves, but since there's a fixed set of them and they're immortal, you can just build a static list of all of them in PyType_Ready. -n > -------------- next part -------------- An HTML attachment was scrubbed... URL: From J.Demeyer at UGent.be Sat Apr 27 15:40:18 2019 From: J.Demeyer at UGent.be (Jeroen Demeyer) Date: Sat, 27 Apr 2019 21:40:18 +0200 Subject: [Python-Dev] PEP 580 and PEP 590 comparison. In-Reply-To: <04dc374d-a648-a3f0-5a1f-fee4ec2b1b98@hotpy.org> References: <5CB4422D.8030507@UGent.be> <04dc374d-a648-a3f0-5a1f-fee4ec2b1b98@hotpy.org> Message-ID: <5CC4B022.1080802@UGent.be> On 2019-04-27 11:26, Mark Shannon wrote: > Specifically, and this is important, PEP 580 cannot implement efficient > calls to class objects without breaking the ABI. First of all, the layout of PyTypeObject isn't actually part of the stable ABI (see PEP 384). So, we wouldn't be breaking anything by extending PyTypeObject. Second, even if you don't buy this argument and you really think that we should guarantee ABI-compatibility, we can still solve that in PEP 580 by special-casing instances of "type". Sure, that's an annoyance but it's not a fundamental obstruction. Jeroen. From J.Demeyer at UGent.be Sat Apr 27 16:04:28 2019 From: J.Demeyer at UGent.be (Jeroen Demeyer) Date: Sat, 27 Apr 2019 22:04:28 +0200 Subject: [Python-Dev] PEP 580 and PEP 590 comparison. In-Reply-To: <04dc374d-a648-a3f0-5a1f-fee4ec2b1b98@hotpy.org> References: <5CB4422D.8030507@UGent.be> <04dc374d-a648-a3f0-5a1f-fee4ec2b1b98@hotpy.org> Message-ID: <5CC4B5CC.3010102@UGent.be> On 2019-04-27 11:26, Mark Shannon wrote: > Performance improvements include, but aren't limited to: > > 1. Much faster calls to common classes: range(), set(), type(), list(), > etc. That's not specific to PEP 590. It can be done with any proposal. I know that there is the ABI issue with PEP 580, but that's not such a big problem as you seem to think (see my last e-mail). > 2. Modifying argument clinic to produce C functions compatible with the > vectorcall, allowing the interpreter to call the C function directly, > with no additional overhead beyond the vectorcall call sequence. This is a very good point. Doing this will certainly reduce the overhead of PEP 590 over PEP 580. > 3. Customization of the C code for function objects depending on the > Python code. The would probably be limited to treating closures and > generator function differently, but optimizing other aspects of the > Python function call is possible. I'm not entirely sure what you mean, but I'm pretty sure that it's not specific to PEP 590. Jeroen. 
From guido at python.org Sat Apr 27 16:33:45 2019 From: guido at python.org (Guido van Rossum) Date: Sat, 27 Apr 2019 13:33:45 -0700 Subject: [Python-Dev] datetime.fromisocalendar In-Reply-To: <63756c7f-e07e-3556-7ab0-47c3fc3072de@ganssle.io> References: <63756c7f-e07e-3556-7ab0-47c3fc3072de@ganssle.io> Message-ID: I think it?s a good idea. On Sat, Apr 27, 2019 at 11:43 AM Paul Ganssle wrote: > Greetings, > > Some time ago, I proposed adding a `.fromisocalendar` alternate > constructor to `datetime` (bpo-36004 ), > with a corresponding implementation (PR #11888 > ). I advertised it on > datetime-SIG some time ago but haven't seen much discussion there, so I'd > like to bring it to python-dev's attention as we near the cut-off for new > Python 3.8 features. > > Other than the fact that I've needed this functionality in the past, I > also think a good general principle for the datetime module is that when a > class (time, date, datetime) has a "serialization" method (.strftime, > .timestamp, .isoformat, .isocalendar, etc), there should be a corresponding > *deserialization* method (.strptime, .fromtimestamp, .fromisoformat) that > constructs a datetime from the output. Now that `fromisoformat` was > introduced in Python 3.7, I think `isocalendar` is the only remaining > method without an inverse. Do people agree with this principle? Should we > add the `fromisocalendar` method? > > Thanks, > Paul > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido (mobile) -------------- next part -------------- An HTML attachment was scrubbed... URL: From vadmium+py at gmail.com Sat Apr 27 22:51:14 2019 From: vadmium+py at gmail.com (Martin Panter) Date: Sun, 28 Apr 2019 02:51:14 +0000 Subject: [Python-Dev] git history conundrum In-Reply-To: <3140efd6-54ff-40d0-5906-fbb19e99ea76@withers.org> References: <3140efd6-54ff-40d0-5906-fbb19e99ea76@withers.org> Message-ID: On Sat, 27 Apr 2019 at 19:07, Chris Withers wrote: > Right, so I've merged up to 15f44ab043, what comes next? > > $ git log --oneline --no-merges 15f44ab043.. -- Lib/unittest/mock.py > Lib/unittest/test/testmock/ | tail -n 3 This Git command line means list all the revisions except 15f44ab043 and those leading up to it. > 161a4dd495 Issue #28919: Simplify _copy_func_details() in unittest.mock > ac5084b6c7 Fixes issue28380: unittest.mock Mock autospec functions now > properly support assert_called, assert_not_called, and assert_called_once. > 0be894b2f6 Issue #27895: Spelling fixes (Contributed by Ville Skytt?). > > Okay, no idea why 0be894b2f6 is there, appears to be a totally identical > commit to 15f44ab043 Git revision 15f44ab043 is the original spelling fixes, which were pushed to the Mercurial ?default? branch (= Git master) for Python 3.6 by Raymond. Revision 0be894b2f6 is my backport to the 3.5 branch, done about a week later. The backport is probably a subset of the original, rather than identical (e.g. the datetime.rst change was not applicable to 3.5). The convention at the time was to keep the 3.5 branch merged into Default (Master). That is why my 3.5 backport appears in your history of Master. > so let's skip it: > > $ git log --oneline --no-merges 0be894b2f6.. 
-- Lib/unittest/mock.py > Lib/unittest/test/testmock/ | tail -n 3 > 161a4dd495 Issue #28919: Simplify _copy_func_details() in unittest.mock > ac5084b6c7 Fixes issue28380: unittest.mock Mock autospec functions now > properly support assert_called, assert_not_called, and assert_called_once. > 15f44ab043 Issue #27895: Spelling fixes (Contributed by Ville Skytt?). > > Wat?! Why is 15f44ab043 showing up again?! Because you are asked for all the revisions except my backport and its ancestors. As far as Git is concerned, the original spelling fixes are not an ancestor of my backport. I don?t have a copy of the Git repository to try, but I suggest the following command is what you want: git log --oneline --no-merges HEAD ^15f44ab043 ^0be894b2f6 -- Lib/unittest/mock.py Lib/unittest/test/testmock/ From chris at withers.org Sun Apr 28 03:25:30 2019 From: chris at withers.org (Chris Withers) Date: Sun, 28 Apr 2019 08:25:30 +0100 Subject: [Python-Dev] git history conundrum In-Reply-To: References: <3140efd6-54ff-40d0-5906-fbb19e99ea76@withers.org> Message-ID: <3c7b8ee1-f119-20f9-0c25-3a96923976c8@withers.org> On 28/04/2019 03:51, Martin Panter wrote: > On Sat, 27 Apr 2019 at 19:07, Chris Withers wrote: >> Right, so I've merged up to 15f44ab043, what comes next? >> >> $ git log --oneline --no-merges 15f44ab043.. -- Lib/unittest/mock.py >> Lib/unittest/test/testmock/ | tail -n 3 > > This Git command line means list all the revisions except 15f44ab043 > and those leading up to it. That seems at odds with what I've found searching online and with the backporting instructions left in the mock backport docs. My understanding is that 15f44ab043.. expands out to 15f44ab043..HEAD and means "all revs between 15f44ab043 and master": https://stackoverflow.com/a/7693298/216229 Can you explain what leads you to expect that to behave differently? > The convention at the time was to keep the 3.5 branch merged into > Default (Master). That is why my 3.5 backport appears in your history > of Master. Ah, okay. > Because you are asked for all the revisions except my backport and its > ancestors. As far as Git is concerned, the original spelling fixes are > not an ancestor of my backport. > > I don?t have a copy of the Git repository to try, but I suggest the > following command is what you want: > > git log --oneline --no-merges HEAD ^15f44ab043 ^0be894b2f6 -- > Lib/unittest/mock.py Lib/unittest/test/testmock/ What's the best way to spell "show me all the revisions on master that affect {mock files} from commit x to HEAD, not including x"? cheers, Chris From chris at withers.org Sun Apr 28 03:10:14 2019 From: chris at withers.org (Chris Withers) Date: Sun, 28 Apr 2019 08:10:14 +0100 Subject: [Python-Dev] git history conundrum In-Reply-To: References: <3140efd6-54ff-40d0-5906-fbb19e99ea76@withers.org> Message-ID: <6b6e1fff-808b-2724-b392-21b03387323c@withers.org> On 28/04/2019 03:51, Martin Panter wrote: > On Sat, 27 Apr 2019 at 19:07, Chris Withers wrote: >> Right, so I've merged up to 15f44ab043, what comes next? >> >> $ git log --oneline --no-merges 15f44ab043.. -- Lib/unittest/mock.py >> Lib/unittest/test/testmock/ | tail -n 3 > > This Git command line means list all the revisions except 15f44ab043 > and those leading up to it. That seems at odds with what I've found searching online and with the backporting instructions left in the mock backport docs. My understanding is that 15f44ab043.. 
expands out to 15f44ab043..HEAD and means "all revs between 15f44ab043 and master": https://stackoverflow.com/a/7693298/216229 Can you explain what leads you to expect that to behave differently? > The convention at the time was to keep the 3.5 branch merged into > Default (Master). That is why my 3.5 backport appears in your history > of Master. Ah, okay. > Because you are asked for all the revisions except my backport and its > ancestors. As far as Git is concerned, the original spelling fixes are > not an ancestor of my backport. > > I don?t have a copy of the Git repository to try, but I suggest the > following command is what you want: > > git log --oneline --no-merges HEAD ^15f44ab043 ^0be894b2f6 -- > Lib/unittest/mock.py Lib/unittest/test/testmock/ What's the best way to spell "show me all the revisions on master that affect {mock files} from commit x to HEAD, not including x"? cheers, Chris From vstinner at redhat.com Sun Apr 28 04:53:25 2019 From: vstinner at redhat.com (Victor Stinner) Date: Sun, 28 Apr 2019 10:53:25 +0200 Subject: [Python-Dev] Use C extensions compiled in release mode on a Python compiled in debug mode In-Reply-To: References: <7aad8d69-2d66-54a3-26a4-20dcc5150f66@ubuntu.com> <6a395e77-d681-69f8-ba32-f6fd47f261cb@ubuntu.com> Message-ID: FYI I pushed my 3 changes to implement my idea. It is now possible to install some extensions in release mode and some others in debug mode. Python in debug mode prefers debug extensions. I documented changes here: https://docs.python.org/dev/whatsnew/3.8.html#debug-build-uses-the-same-abi-as-release-build The library filename has to be different in debug mode, so it can be co-installable with release build of a C extension. Victor Le samedi 27 avril 2019, Stefan Behnel a ?crit : > Matthias Klose schrieb am 25.04.19 um 13:48: >> Are there use cases where you only want to load *some* >> debug extensions, even if more are installed? > > Not sure if there are _important_ use cases (that could justify certain > design decisions), but I can certainly imagine using a non-debug (and > therefore faster) Pandas or NumPy for preparing some data that I need to > debug my own code. More generally, whenever I can avoid using a debug > version of a *dependency* that I don't need to include in my debug > analysis, it's probably a good idea to not use the debug version. > > Even given venvs and virtualisation techniques, it would probably be nice > if users could install debug+nondebug versions of libraries once and then > import the right one at need, rather than having to set up a new > environment (while they're on a train in the middle of nowhere without fast > access to PyPI). > > Stefan > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/vstinner%40redhat.com > -- Night gathers, and now my watch begins. It shall not end until my death. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From solipsis at pitrou.net Sun Apr 28 16:51:01 2019 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 28 Apr 2019 22:51:01 +0200 Subject: [Python-Dev] git history conundrum References: <3140efd6-54ff-40d0-5906-fbb19e99ea76@withers.org> <3c7b8ee1-f119-20f9-0c25-3a96923976c8@withers.org> Message-ID: <20190428225101.63d2860b@fsol> On Sun, 28 Apr 2019 08:25:30 +0100 Chris Withers wrote: > > What's the best way to spell "show me all the revisions on master that > affect {mock files} from commit x to HEAD, not including x"? Something like: $ git log x...HEAD -- {mock files} perhaps? Regards Antoine. From robertc at robertcollins.net Sun Apr 28 17:21:39 2019 From: robertc at robertcollins.net (Robert Collins) Date: Mon, 29 Apr 2019 09:21:39 +1200 Subject: [Python-Dev] git history conundrum In-Reply-To: <3140efd6-54ff-40d0-5906-fbb19e99ea76@withers.org> References: <3140efd6-54ff-40d0-5906-fbb19e99ea76@withers.org> Message-ID: Thank you! If I understand correctly this is just the hg style branch backport consequence, multiple copies of a change. Should be safe to skip those. Rob On Sun, 28 Apr 2019, 07:11 Chris Withers, wrote: > Hi All, > > I'm in the process of bringing the mock backport up to date, but this > has got me stumped: > > $ git log --oneline --no-merges > 5943ea76d529f9ea18c73a61e10c6f53bdcc864f.. -- Lib/unittest/mock.py > Lib/unittest/test/testmock/ | tail > 362f058a89 Issue #28735: Fixed the comparison of mock.MagickMock with > mock.ANY. > d9c956fb23 Issue #20804: The unittest.mock.sentinel attributes now > preserve their identity when they are copied or pickled. > 84b6fb0eea Fix unittest.mock._Call: don't ignore name > 161a4dd495 Issue #28919: Simplify _copy_func_details() in unittest.mock > ac5084b6c7 Fixes issue28380: unittest.mock Mock autospec functions now > properly support assert_called, assert_not_called, and assert_called_once. > 0be894b2f6 Issue #27895: Spelling fixes (Contributed by Ville Skytt?). > 15f44ab043 Issue #27895: Spelling fixes (Contributed by Ville Skytt?). > d4583d7fea Issue #26750: use inspect.isdatadescriptor instead of our own > _is_data_descriptor(). > 9854789efe Issue #26750: unittest.mock.create_autospec() now works > properly for subclasses of property() and other data descriptors. > 204bf0b9ae English spelling and grammar fixes > > Right, so I've merged up to 15f44ab043, what comes next? > > $ git log --oneline --no-merges 15f44ab043.. -- Lib/unittest/mock.py > Lib/unittest/test/testmock/ | tail -n 3 > 161a4dd495 Issue #28919: Simplify _copy_func_details() in unittest.mock > ac5084b6c7 Fixes issue28380: unittest.mock Mock autospec functions now > properly support assert_called, assert_not_called, and assert_called_once. > 0be894b2f6 Issue #27895: Spelling fixes (Contributed by Ville Skytt?). > > Okay, no idea why 0be894b2f6 is there, appears to be a totally identical > commit to 15f44ab043, so let's skip it: > > $ git log --oneline --no-merges 0be894b2f6.. -- Lib/unittest/mock.py > Lib/unittest/test/testmock/ | tail -n 3 > 161a4dd495 Issue #28919: Simplify _copy_func_details() in unittest.mock > ac5084b6c7 Fixes issue28380: unittest.mock Mock autospec functions now > properly support assert_called, assert_not_called, and assert_called_once. > 15f44ab043 Issue #27895: Spelling fixes (Contributed by Ville Skytt?). > > Wat?! Why is 15f44ab043 showing up again?! > > What's the git subtlety I'm missing here? 
> > Chris > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/robertc%40robertcollins.net > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris at withers.org Sun Apr 28 17:55:10 2019 From: chris at withers.org (Chris Withers) Date: Sun, 28 Apr 2019 22:55:10 +0100 Subject: [Python-Dev] git history conundrum In-Reply-To: References: <3140efd6-54ff-40d0-5906-fbb19e99ea76@withers.org> Message-ID: <6259012a-f7ea-c294-3bef-f3bef7484a7f@withers.org> On 28/04/2019 22:21, Robert Collins wrote: > Thank you! Thank me when we get there ;-) Currently in Dec 2018 with a wonderful Py2 failure: ====================================================================== ERROR: test_autospec_getattr_partial_function (mock.tests.testhelpers.SpecSignatureTest) ---------------------------------------------------------------------- Traceback (most recent call last): File "mock/tests/testhelpers.py", line 973, in test_autospec_getattr_partial_function autospec = create_autospec(proxy) File "mock/mock.py", line 2392, in create_autospec for entry in dir(spec): TypeError: __dir__() must return a list, not str Once we're done, I'll need a username/password that can write to https://pypi.org/project/mock/ ... > If I understand correctly this is just the hg style branch backport > consequence, multiple copies of a change. Should be safe to skip those. Yep, current script I've been using is here, high level highlighted: https://github.com/cjw296/mock/blob/backporting/backport.py#L102-L125 cheers, Chris From fuzzyman at voidspace.org.uk Sun Apr 28 17:57:52 2019 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Sun, 28 Apr 2019 22:57:52 +0100 Subject: [Python-Dev] git history conundrum In-Reply-To: <6259012a-f7ea-c294-3bef-f3bef7484a7f@withers.org> References: <3140efd6-54ff-40d0-5906-fbb19e99ea76@withers.org> <6259012a-f7ea-c294-3bef-f3bef7484a7f@withers.org> Message-ID: > On 28 Apr 2019, at 22:55, Chris Withers wrote: > >> On 28/04/2019 22:21, Robert Collins wrote: >> Thank you! > > Thank me when we get there ;-) Currently in Dec 2018 with a wonderful Py2 failure: > > ====================================================================== > ERROR: test_autospec_getattr_partial_function (mock.tests.testhelpers.SpecSignatureTest) > ---------------------------------------------------------------------- > Traceback (most recent call last): > File "mock/tests/testhelpers.py", line 973, in test_autospec_getattr_partial_function > autospec = create_autospec(proxy) > File "mock/mock.py", line 2392, in create_autospec > for entry in dir(spec): > TypeError: __dir__() must return a list, not str > > Once we're done, I'll need a username/password that can write to https://pypi.org/project/mock/ ... I can add you as a maintainer. Ping me off-list. Michael > >> If I understand correctly this is just the hg style branch backport consequence, multiple copies of a change. Should be safe to skip those. 
> > Yep, current script I've been using is here, high level highlighted: > > https://github.com/cjw296/mock/blob/backporting/backport.py#L102-L125 > > cheers, > > Chris > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk From robertc at robertcollins.net Sun Apr 28 18:03:43 2019 From: robertc at robertcollins.net (Robert Collins) Date: Mon, 29 Apr 2019 10:03:43 +1200 Subject: [Python-Dev] git history conundrum In-Reply-To: <6259012a-f7ea-c294-3bef-f3bef7484a7f@withers.org> References: <3140efd6-54ff-40d0-5906-fbb19e99ea76@withers.org> <6259012a-f7ea-c294-3bef-f3bef7484a7f@withers.org> Message-ID: Share your own username with Michael or I and we'll add you there. Rob On Mon, 29 Apr 2019, 09:55 Chris Withers, wrote: > On 28/04/2019 22:21, Robert Collins wrote: > > Thank you! > > Thank me when we get there ;-) Currently in Dec 2018 with a wonderful > Py2 failure: > > ====================================================================== > ERROR: test_autospec_getattr_partial_function > (mock.tests.testhelpers.SpecSignatureTest) > ---------------------------------------------------------------------- > Traceback (most recent call last): > File "mock/tests/testhelpers.py", line 973, in > test_autospec_getattr_partial_function > autospec = create_autospec(proxy) > File "mock/mock.py", line 2392, in create_autospec > for entry in dir(spec): > TypeError: __dir__() must return a list, not str > > Once we're done, I'll need a username/password that can write to > https://pypi.org/project/mock/ ... > > > If I understand correctly this is just the hg style branch backport > > consequence, multiple copies of a change. Should be safe to skip those. > > Yep, current script I've been using is here, high level highlighted: > > https://github.com/cjw296/mock/blob/backporting/backport.py#L102-L125 > > cheers, > > Chris > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vstinner at redhat.com Mon Apr 29 09:30:41 2019 From: vstinner at redhat.com (Victor Stinner) Date: Mon, 29 Apr 2019 15:30:41 +0200 Subject: [Python-Dev] datetime.fromisocalendar In-Reply-To: References: <63756c7f-e07e-3556-7ab0-47c3fc3072de@ganssle.io> Message-ID: I reviewed and merged Paul's PR. I concur with Guido, the new constructor perfectly makes sense and is useful. About the implementation: date and time are crazy beasts. Extract of the code: if not 0 < week < 53: out_of_range = True if week == 53: # ISO years have 53 weeks in them on years starting with a # Thursday and leap years starting on a Wednesday first_weekday = _ymd2ord(year, 1, 1) % 7 if (first_weekday == 4 or (first_weekday == 3 and _is_leap(year))): out_of_range = False if out_of_range: raise ValueError(f"Invalid week: {week}") "ISO years have 53 weeks in them on years starting with a Thursday and leap years starting on a Wednesday" !?! Victor Le sam. 27 avr. 2019 ? 22:37, Guido van Rossum a ?crit : > > I think it?s a good idea. > > On Sat, Apr 27, 2019 at 11:43 AM Paul Ganssle wrote: >> >> Greetings, >> >> Some time ago, I proposed adding a `.fromisocalendar` alternate constructor to `datetime` (bpo-36004), with a corresponding implementation (PR #11888). I advertised it on datetime-SIG some time ago but haven't seen much discussion there, so I'd like to bring it to python-dev's attention as we near the cut-off for new Python 3.8 features. 
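Coming back to the 53-week rule Victor quotes above: it can be cross-checked with nothing more than the calendar machinery that already exists, because 28 December always falls in the last ISO week of its year. A small, self-contained sanity check, independent of the merged implementation:

    import calendar
    from datetime import date

    def iso_weeks_in_year(year):
        # 28 December always falls in the last ISO week of its year.
        return date(year, 12, 28).isocalendar()[1]

    for year in range(1, 4000):
        jan1 = date(year, 1, 1).isoweekday()     # 1 = Monday ... 7 = Sunday
        long_year = jan1 == 4 or (jan1 == 3 and calendar.isleap(year))
        assert (iso_weeks_in_year(year) == 53) == long_year

So the comment is accurate: the 53-week ISO years are exactly the years starting on a Thursday, plus the leap years starting on a Wednesday.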
>> >> Other than the fact that I've needed this functionality in the past, I also think a good general principle for the datetime module is that when a class (time, date, datetime) has a "serialization" method (.strftime, .timestamp, .isoformat, .isocalendar, etc), there should be a corresponding deserialization method (.strptime, .fromtimestamp, .fromisoformat) that constructs a datetime from the output. Now that `fromisoformat` was introduced in Python 3.7, I think `isocalendar` is the only remaining method without an inverse. Do people agree with this principle? Should we add the `fromisocalendar` method? >> >> Thanks, >> Paul >> >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> https://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: https://mail.python.org/mailman/options/python-dev/guido%40python.org > > -- > --Guido (mobile) > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/vstinner%40redhat.com -- Night gathers, and now my watch begins. It shall not end until my death. From vano at mail.mipt.ru Mon Apr 29 16:56:08 2019 From: vano at mail.mipt.ru (Ivan Pozdeev) Date: Mon, 29 Apr 2019 23:56:08 +0300 Subject: [Python-Dev] datetime.fromisocalendar In-Reply-To: References: <63756c7f-e07e-3556-7ab0-47c3fc3072de@ganssle.io> Message-ID: <16fa808e-cb0c-c31e-274d-b60b53ef2a23@mail.mipt.ru> On 29.04.2019 16:30, Victor Stinner wrote: > I reviewed and merged Paul's PR. I concur with Guido, the new > constructor perfectly makes sense and is useful. > > About the implementation: date and time are crazy beasts. Extract of the code: > > if not 0 < week < 53: > out_of_range = True > > if week == 53: > # ISO years have 53 weeks in them on years starting with a > # Thursday and leap years starting on a Wednesday > first_weekday = _ymd2ord(year, 1, 1) % 7 > if (first_weekday == 4 or (first_weekday == 3 and > _is_leap(year))): > out_of_range = False > > if out_of_range: > raise ValueError(f"Invalid week: {week}") > > "ISO years have 53 weeks in them on years starting with a Thursday and > leap years starting on a Wednesday" !?! https://www.staff.science.uu.nl/~gent0113/calendar/isocalendar.htm , linked from https://docs.python.org/3/library/datetime.html?highlight=isocalendar#datetime.date.isocalendar > Victor > > Le sam. 27 avr. 2019 ? 22:37, Guido van Rossum a ?crit : >> I think it?s a good idea. >> >> On Sat, Apr 27, 2019 at 11:43 AM Paul Ganssle wrote: >>> Greetings, >>> >>> Some time ago, I proposed adding a `.fromisocalendar` alternate constructor to `datetime` (bpo-36004), with a corresponding implementation (PR #11888). I advertised it on datetime-SIG some time ago but haven't seen much discussion there, so I'd like to bring it to python-dev's attention as we near the cut-off for new Python 3.8 features. >>> >>> Other than the fact that I've needed this functionality in the past, I also think a good general principle for the datetime module is that when a class (time, date, datetime) has a "serialization" method (.strftime, .timestamp, .isoformat, .isocalendar, etc), there should be a corresponding deserialization method (.strptime, .fromtimestamp, .fromisoformat) that constructs a datetime from the output. Now that `fromisoformat` was introduced in Python 3.7, I think `isocalendar` is the only remaining method without an inverse. 
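That inverse is also easy to state, which supports the pairing principle: 4 January is always in ISO week 1, so an (ISO year, ISO week, ISO weekday) triple maps back to a date with plain timedelta arithmetic. An illustrative textbook recipe only, not the implementation from the PR (which also has to validate its inputs):

    from datetime import date, timedelta

    def date_from_isocalendar(iso_year, iso_week, iso_day):
        # 4 January is always in ISO week 1 of its ISO year.
        jan4 = date(iso_year, 1, 4)
        week1_monday = jan4 - timedelta(days=jan4.isoweekday() - 1)
        return week1_monday + timedelta(weeks=iso_week - 1, days=iso_day - 1)

    # Round-trips against the existing "serialization" direction:
    for d in (date(2019, 4, 27), date(2020, 1, 1), date(2021, 1, 3)):
        assert date_from_isocalendar(*d.isocalendar()) == d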
Do people agree with this principle? Should we add the `fromisocalendar` method? >>> >>> Thanks, >>> Paul >>> >>> _______________________________________________ >>> Python-Dev mailing list >>> Python-Dev at python.org >>> https://mail.python.org/mailman/listinfo/python-dev >>> Unsubscribe: https://mail.python.org/mailman/options/python-dev/guido%40python.org >> -- >> --Guido (mobile) >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> https://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: https://mail.python.org/mailman/options/python-dev/vstinner%40redhat.com > > -- Regards, Ivan From nas-python at arctrix.com Mon Apr 29 20:01:41 2019 From: nas-python at arctrix.com (Neil Schemenauer) Date: Mon, 29 Apr 2019 18:01:41 -0600 Subject: [Python-Dev] Use C extensions compiled in release mode on a Python compiled in debug mode In-Reply-To: References: <20190424185501.j6kjdbmb4adj4x6x@python.ca> Message-ID: <20190430000141.sbdhdez3xumf2qgz@python.ca> On 2019-04-27, Nathaniel Smith wrote: > For Py_TRACE_REFS specifically, IIUC the only goal is to be able to produce > a list of all live objects on demand. If that's the goal, then static type > objects aren't a huge deal. You can't add extra data into the type objects > themselves, but since there's a fixed set of them and they're immortal, you > can just build a static list of all of them in PyType_Ready. As far as I understand, we have a similar problem already for gc.get_objects() because those static type objects don't have a PyGC_Head. My 2-cent proposal for fixing things in the long term would be to introduce a function like PyType_Ready that returns a pointer to the new type. The argument to it would be what is the current static type structure. The function would copy things from the static type structure into a newly allocated type structure. We have a kind of solution already with PyType_FromSpec, etc. However, I think it is harder to convert existing extension module source code to use that API. We want to make it very easy for people to fix source code. If we can remove static types, that would allow us to kill off Py_TYPE(o)->tp_is_gc(o). I understand why that exists but I think it is quite an ugly detail of the current GC implementation. I wonder about the performance impact of it given current memory latencies. When we do a full GC run, we call PyObject_IS_GC() on many objects. I fear having to lookup and call tp_is_gc could be quite expensive. I've been playing with the idea of using memory bitmaps rather then the PyGC_Head. That idea seems to depend on removing static type objects. Initially I was thinking of it as reducing the memory overhead for GC types. Now I think the memory overhead doesn't matter too much but perhaps the bitmaps would be much faster due to memory latency. There is an interesting Youtube video that compares vector traversals vs linked list traversals in C++. Linked lists on modern machines are really terrible. 
Regards,

Neil

From vstinner at redhat.com Mon Apr 29 20:22:59 2019
From: vstinner at redhat.com (Victor Stinner)
Date: Tue, 30 Apr 2019 02:22:59 +0200
Subject: [Python-Dev] Use C extensions compiled in release mode on a Python compiled in debug mode
In-Reply-To: <20190430000141.sbdhdez3xumf2qgz@python.ca>
References: <20190424185501.j6kjdbmb4adj4x6x@python.ca> <20190430000141.sbdhdez3xumf2qgz@python.ca>
Message-ID: 

You have my support if you work on removing static types :-)

Here are my notes on the current C APIs to define a type:
https://pythoncapi.readthedocs.io/type_object.html

IMHO static types should go away in the long term. They are causing
too many practical issues.

Victor

On Tue, 30 Apr 2019 at 02:01, Neil Schemenauer wrote:
>
> On 2019-04-27, Nathaniel Smith wrote:
> > For Py_TRACE_REFS specifically, IIUC the only goal is to be able to produce
> > a list of all live objects on demand. If that's the goal, then static type
> > objects aren't a huge deal. You can't add extra data into the type objects
> > themselves, but since there's a fixed set of them and they're immortal, you
> > can just build a static list of all of them in PyType_Ready.
>
> As far as I understand, we have a similar problem already for
> gc.get_objects() because those static type objects don't have a
> PyGC_Head.
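The gap quoted just above is visible from pure Python: a class defined in Python code is a heap type and is tracked by the collector, while a built-in static type has no PyGC_Head and therefore never appears in gc.get_objects(). A minimal CPython-specific illustration:

    import gc

    class Heapy:                  # heap type, allocated with a PyGC_Head
        pass

    print(gc.is_tracked(Heapy))   # True  -> shows up in gc.get_objects()
    print(gc.is_tracked(int))     # False -> static type, invisible to the collector
    print(any(obj is int for obj in gc.get_objects()))   # False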
My 2-cent proposal for fixing things in the long term > would be to introduce a function like PyType_Ready that returns a > pointer to the new type. The argument to it would be what is the > current static type structure. The function would copy things from > the static type structure into a newly allocated type structure. I doubt you'll be able to get rid of static types entirely, due to the usual issues with C API breakage. And I'm guessing that static types make up such a tiny fraction of the address space that merely tweaking the percent up or down won't affect performance. But your proposed new API would make it *way* easier to migrate existing code to the stable ABI. -n -- Nathaniel J. Smith -- https://vorpus.org From solipsis at pitrou.net Tue Apr 30 04:14:28 2019 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 30 Apr 2019 10:14:28 +0200 Subject: [Python-Dev] PEP 574 ready for review Message-ID: <20190430101428.204457e6@fsol> Hello, I've put the final touches to PEP 574 - Pickle protocol 5 with out-of-band data (*). It is now ready for review. The implementation is fully functional, as well as its PyPI backport (**), and has regression tests against Numpy. Numpy and PyArrow have their own tests against the pickle5 backport. (*) https://www.python.org/dev/peps/pep-0574/ (**) https://pypi.org/project/pickle5/ Regards Antoine. From chris at withers.org Tue Apr 30 05:26:19 2019 From: chris at withers.org (Chris Withers) Date: Tue, 30 Apr 2019 10:26:19 +0100 Subject: [Python-Dev] drop jython support in mock backport? Message-ID: [resending to python-dev in case there are Jython users here...] Hi All, If you need Jython support in the mock backport, please shout now: https://github.com/testing-cabal/mock/issues/453 cheers, Chris From chris at withers.org Tue Apr 30 17:24:53 2019 From: chris at withers.org (Chris Withers) Date: Tue, 30 Apr 2019 22:24:53 +0100 Subject: [Python-Dev] "if __name__ == '__main__'" at the bottom of python unittest files Message-ID: <01ebef8d-d370-22c5-cb7a-194704a3906c@withers.org> Hi All, I have a crazy idea of getting unittest.mock up to 100% code coverage. I noticed at the bottom of all of the test files in testmock/, there's a: if __name__ == '__main__': ??? unittest.main() ...block. How would people feel about these going away? I don't *think* they're needed now that we have unittest discover, but thought I'd ask. Chris From robertc at robertcollins.net Tue Apr 30 18:41:45 2019 From: robertc at robertcollins.net (Robert Collins) Date: Wed, 1 May 2019 10:41:45 +1200 Subject: [Python-Dev] "if __name__ == '__main__'" at the bottom of python unittest files In-Reply-To: <01ebef8d-d370-22c5-cb7a-194704a3906c@withers.org> References: <01ebef8d-d370-22c5-cb7a-194704a3906c@withers.org> Message-ID: They were never needed ? Removal is fine with me. On Wed, 1 May 2019, 09:27 Chris Withers, wrote: > Hi All, > > I have a crazy idea of getting unittest.mock up to 100% code coverage. > > I noticed at the bottom of all of the test files in testmock/, there's a: > > if __name__ == '__main__': > unittest.main() > > ...block. > > How would people feel about these going away? I don't *think* they're > needed now that we have unittest discover, but thought I'd ask. 
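For what it's worth, discovery does cover these files: the per-file blocks only matter when a test module is executed directly. A sketch of the programmatic equivalent of `python -m unittest discover` for the directory in question; the start directory is illustrative and should be adjusted to the repository layout:

    import unittest

    # Roughly what `python -m unittest discover -s testmock` does:
    suite = unittest.defaultTestLoader.discover("testmock", pattern="test*.py")
    unittest.TextTestRunner(verbosity=2).run(suite)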
> > Chris > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/robertc%40robertcollins.net > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ja.py at farowl.co.uk Tue Apr 30 18:00:25 2019 From: ja.py at farowl.co.uk (Jeff Allen) Date: Tue, 30 Apr 2019 23:00:25 +0100 Subject: [Python-Dev] drop jython support in mock backport? In-Reply-To: References: Message-ID: Cross-posting to jython-users for obvious reasons. Jeff Allen On 30/04/2019 10:26, Chris Withers wrote: > [resending to python-dev in case there are Jython users here...] > > Hi All, > > If you need Jython support in the mock backport, please shout now: > > https://github.com/testing-cabal/mock/issues/453 > > cheers, > > Chris > -------------- next part -------------- An HTML attachment was scrubbed... URL: