From k.kowitski at icloud.com Sat Sep 1 12:32:15 2018 From: k.kowitski at icloud.com (Kevin Kowitski) Date: Sat, 01 Sep 2018 12:32:15 -0400 Subject: [Python-Dev] New Subscriber Message-ID: Hey Everyone! I am new to this mailing list. I have experience with software development in the medical device industry and SaaS technology solutions. I have primarily worked with R, Java, and various unique and specific machine languages, but I have recently picked up Python. Part of what has brought me to this mailing list was my search for insights on Python?s usability as a shareable, executable, desktop application. I often make tools for streamlining my professional tasks and decided I wanted to give Python a try. I have read that there are some fantastic GUI libraries and development environments, but how practical is this language for distributing those programs in a nicely packaged exe? If there is anyone here with some insight I would love to start off my membership with a discussion and some knowledge sharing! Best, Kevin Sent from my iPhone From brett at python.org Sat Sep 1 13:48:58 2018 From: brett at python.org (Brett Cannon) Date: Sat, 1 Sep 2018 10:48:58 -0700 Subject: [Python-Dev] New Subscriber In-Reply-To: References: Message-ID: Hi, Kevin! This mailing list is actually about the development *of* Python, not *with* it. To have a discussion about GUI programming with Python is probably python-list is a better place to discuss this. On Sat, 1 Sep 2018 at 10:37 Kevin Kowitski via Python-Dev < python-dev at python.org> wrote: > Hey Everyone! > > I am new to this mailing list. I have experience with software > development in the medical device industry and SaaS technology solutions. I > have primarily worked with R, Java, and various unique and specific machine > languages, but I have recently picked up Python. > Part of what has brought me to this mailing list was my search for > insights on Python?s usability as a shareable, executable, desktop > application. I often make tools for streamlining my professional tasks and > decided I wanted to give Python a try. I have read that there are some > fantastic GUI libraries and development environments, but how practical is > this language for distributing those programs in a nicely packaged exe? > If there is anyone here with some insight I would love to start off my > membership with a discussion and some knowledge sharing! > > Best, > Kevin > > Sent from my iPhone > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/brett%40python.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: From aixtools at felt.demon.nl Sat Sep 1 13:55:42 2018 From: aixtools at felt.demon.nl (Michael) Date: Sat, 1 Sep 2018 19:55:42 +0200 Subject: [Python-Dev] make patchcheck and git path In-Reply-To: <23429.93.348012.547338@turnbull.sk.tsukuba.ac.jp> References: <5f9b58cd-224a-6851-0bd0-54a2731aa9e4@felt.demon.nl> <23429.93.348012.547338@turnbull.sk.tsukuba.ac.jp> Message-ID: <174f24fe-18e5-4a7e-b03f-245c78ab38ca@felt.demon.nl> On 28/08/2018 09:57, Stephen J. Turnbull wrote: > Michael Felt (aixtools) writes: > > > When building out of tree there is no .git reference. If I > > understand the process it uses git to see what files have changed, > > and does further processing on those. 
> > Just guessing based on generic git knowledge here: > > If you build in a sibling directory of the .git directory, git should > "see" the GITDIR, and it should work. Where is your build directory > relative to the GITDIR? I work in "parallel" /data/prj/python/python-version /data/prj/python/git/python-version I suppose I should try setting GITDIR - but, I think it would be better, at least nicer, if "patchcheck" as a target did some checking for git early on, rather than bail out at the end. The results of the check might be just a message to set GITDIR, e.g.. > I suspect you could also set GITDIR=/path/to/python/source/.git in > make's process environment, and do "make patchcheck" outside of the > Python source tree successfully. I'll give this a try next time around. (vacation, so not really 'active' atm). Thanks for the suggestions. > > Regards, > -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From k.kowitski at icloud.com Sat Sep 1 14:12:02 2018 From: k.kowitski at icloud.com (Kevin Kowitski) Date: Sat, 01 Sep 2018 14:12:02 -0400 Subject: [Python-Dev] New Subscriber In-Reply-To: References: Message-ID: Oh! I?m so sorry haha. I thought this was a developers forum for users of Python. So you are saying Python-list would be that type of forum? -Kevin Sent from my iPhone > On Sep 1, 2018, at 1:48 PM, Brett Cannon wrote: > > Hi, Kevin! This mailing list is actually about the development of Python, not with it. To have a discussion about GUI programming with Python is probably python-list is a better place to discuss this. > >> On Sat, 1 Sep 2018 at 10:37 Kevin Kowitski via Python-Dev wrote: >> Hey Everyone! >> >> I am new to this mailing list. I have experience with software development in the medical device industry and SaaS technology solutions. I have primarily worked with R, Java, and various unique and specific machine languages, but I have recently picked up Python. >> Part of what has brought me to this mailing list was my search for insights on Python?s usability as a shareable, executable, desktop application. I often make tools for streamlining my professional tasks and decided I wanted to give Python a try. I have read that there are some fantastic GUI libraries and development environments, but how practical is this language for distributing those programs in a nicely packaged exe? >> If there is anyone here with some insight I would love to start off my membership with a discussion and some knowledge sharing! >> >> Best, >> Kevin >> >> Sent from my iPhone >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> https://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: https://mail.python.org/mailman/options/python-dev/brett%40python.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Sat Sep 1 14:23:20 2018 From: brett at python.org (Brett Cannon) Date: Sat, 1 Sep 2018 11:23:20 -0700 Subject: [Python-Dev] New Subscriber In-Reply-To: References: Message-ID: Yep, python-list sounds like the mailing list that you're after. On Sat, 1 Sep 2018 at 11:12 Kevin Kowitski wrote: > Oh! I?m so sorry haha. I thought this was a developers forum for users of > Python. So you are saying Python-list would be that type of forum? > > -Kevin > > Sent from my iPhone > > On Sep 1, 2018, at 1:48 PM, Brett Cannon wrote: > > Hi, Kevin! 
This mailing list is actually about the development *of* > Python, not *with* it. To have a discussion about GUI programming with > Python is probably python-list is a better place to discuss this. > > On Sat, 1 Sep 2018 at 10:37 Kevin Kowitski via Python-Dev < > python-dev at python.org> wrote: > >> Hey Everyone! >> >> I am new to this mailing list. I have experience with software >> development in the medical device industry and SaaS technology solutions. I >> have primarily worked with R, Java, and various unique and specific machine >> languages, but I have recently picked up Python. >> Part of what has brought me to this mailing list was my search for >> insights on Python?s usability as a shareable, executable, desktop >> application. I often make tools for streamlining my professional tasks and >> decided I wanted to give Python a try. I have read that there are some >> fantastic GUI libraries and development environments, but how practical is >> this language for distributing those programs in a nicely packaged exe? >> If there is anyone here with some insight I would love to start off >> my membership with a discussion and some knowledge sharing! >> >> Best, >> Kevin >> >> Sent from my iPhone >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> https://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: >> https://mail.python.org/mailman/options/python-dev/brett%40python.org >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan_ml at behnel.de Sat Sep 1 18:10:31 2018 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 2 Sep 2018 00:10:31 +0200 Subject: [Python-Dev] Use of Cython In-Reply-To: References: <20180730110120.6d03e6d8@fsol> <74a848fa0eff42fc8ae5aa58e3fe71d0@xmail101.UGent.be> <5B600F47.3090503@UGent.be> <20180731094528.118471f9@fsol> <17ebc01b-e0f2-1ebc-1229-b4ca84843f9c@python.org> <8CA25A41-D9F4-4634-9509-604F84B09E46@mac.com> <2B17179F-29B4-40E2-824D-749359E33089@mac.com> <57363ED3-4851-4A5D-A4B9-F9F0A9053F2C@mac.com> Message-ID: Yury, given that people are starting to quote enthusiastically the comments you made below, let me set a couple of things straight. Yury Selivanov schrieb am 07.08.2018 um 19:34: > On Mon, Aug 6, 2018 at 11:49 AM Ronald Oussoren via Python-Dev wrote: > >> I have no strong opinion on using Cython for tests or in the stdlib, other than that it is a fairly large dependency. I do think that adding a ?Cython-lite? tool the CPython distribution would be less ideal, creating and maintaining that tool would be a lot of work without clear benefits over just using Cython. > > Speaking of which, Dropbox is working on a new compiler they call "mypyc". > > mypyc will compile type-annotated Python code to an optimized C. That's their plan. Saying that "it will" is a bit premature at this point. The list of failed attempts at writing static Python compilers is rather long, even if you only count those that compile the usual "easy subset" of Python. I wish them the best of luck and endurance, but they have a long way to go. > The > first goal is to compile mypy with it to make it faster, so I hope > that the project will be completed. That's not "the first goal". It's the /only/ goal. The only intention of mypyc is to be able to compile and optimise enough of Python to speed up the kind or style of code that mypy uses. > Essentially, mypyc will be similar > to Cython, but mypyc is a *subset of Python*, not a superset. Which is bad, right? 
It means that there will be many things that simply don't work, and that you need to change your code in order to make it compile at all. Cython is way beyond that point by now. Even RPython will probably continue to be way better than mypyc for quite a while, maybe forever, who knows. > Interfacing with C libraries can be easily achieved with cffi. Except that it will be fairly slow. cffi is not designed for static analysis but for runtime operations. You can obviously also use cffi from Cython ? but then, why would you, if you can get much faster code much more easily without using cffi? That being said, if someone wants to write a static cffi optimiser for Cython, why not, I'd be happy to help with my advice. The cool thing is that this can be improved gradually, because compiling the cffi code probably already works out of the box. It's just not (much) faster than when interpreted. > Being a > strict subset of Python means that mypyc code will execute just fine > in PyPy. So does normal (non-subset) Python code. You can run it in PyPy, have CPython interpret it, or compile it with Cython if you want it to run faster in CPython, all without having to limit yourself to a subset of Python. Seriously, you make this sound like requiring users to rewrite their code to make it compilable with mypyc was a good thing. > They can even apply some optimizations to it eventually, as > it has a strict and static type system. In case "they" refers to PyPy here, then I remember the PyPy project stating very clearly that they are not interested in PEP-484 typing because it is completely irrelevant for their JIT. It's really best for them to ignore it. That's similar for Cython, simply because PEP-484 typing isn't designed for optimisation at all, definitely not for C-level optimisation. Still, Cython can make some use of PEP-484 typing, if you use it to define specific C types. That allows normal execution in CPython, static analysis with PEP-484 analyser tools (e.g. PyCharm or mypy), and efficient optimisation by Cython. The best of all worlds. See the docs on how to do that, it's been supported for about a year now (and has been around in a similar, non-PEP-484 form for years before that PEP even existed). > I'd be more willing to start using mypyc+cffi in CPython stdlib > *eventually*, than Cython now. Cython is a relatively complex and > still poorly documented language. You are free to improve the documentation or otherwise help us find and discuss concrete problems with it. Calling Cython a "poorly documented language" could easily feel offensive towards those who have put a lot of work into the documentation, wiki, tutorials, trainings and what not that help people use the language. Even stack overflow is getting better and better in documenting Cython these days, even though responses over there that describe work-arounds tend to get outdated fairly quickly. Besides, don't forget that it's Python, so consider reading the Python documentation first if something is unclear. And maybe some documentation of C data types as well. (.5 wink) > I'm speaking from experience after > writing thousands of lines of Cython in uvloop & asyncpg. In skillful > hands Cython is amazing, but I'd be cautious to advertise and use it > in CPython. Why not? You didn't actually give any reasons for that. > I'm also -1 on using Cython to test C API. While writing C tests is > annoying (I wrote a fair share myself), their very purpose is to make > third-party tools/extensions more stable. 
Using a third-party tool to > test C API to track regressions that break third-party tools feels > wrong. I don't understand that argument. What's wrong about using a tool that helps you get around writing boiler plate code? The actual testing does not need to be done by Cython at all, you can write it any way you like. Stefan From ncoghlan at gmail.com Sun Sep 2 12:56:55 2018 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 3 Sep 2018 02:56:55 +1000 Subject: [Python-Dev] UTF-8 Mode now also enabled by the POSIX locale In-Reply-To: References: Message-ID: On Tue, 28 Aug 2018 at 23:02, Victor Stinner wrote: > > Hi, > > While working on test_utf8_mode on AIX (bpo-34347) and HP-UX > (bpo-34403), I noticed that FreeBSD doesn't work properly with the > POSIX locale (bpo-34527). I also noticed that my implementation of my > PEP 540 "UTF-8 Mode" doesn't respect the PEP: the UTF-8 Mode should be > enabled by the POSIX locale, not only by the C locale. > > I just modified Python 3.7 and master (future 3.8) to enable UTF-8 > Mode if the LC_CTYPE locale is "POSIX": > https://bugs.python.org/issue34527 > > I also fixed FreeBSD to support the "POSIX" locale as well (3.6, 3.7 > and master branches). > > Note: The C locale coercion (PEP 538) is only enabled if the LC_CTYPE > locale is "C". https://bugs.python.org/issue30672 is the open issue noting that it should also handle the case where POSIX isn't a simple alias for the C locale the way it is in glibc. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Sun Sep 2 13:02:32 2018 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 3 Sep 2018 03:02:32 +1000 Subject: [Python-Dev] AIX and python tests In-Reply-To: <061047f4-6bbc-a662-412d-bb5febfc72bb@felt.demon.nl> References: <061047f4-6bbc-a662-412d-bb5febfc72bb@felt.demon.nl> Message-ID: On Mon, 6 Aug 2018 at 07:03, Michael wrote: > > As I have time, I'll dig into these. > > I have a couple of PR already 'out there', which I hope someone will be looking at when/as he/she/they have time. My time will also be intermittent. > > My next test - and I hope not too difficult - would be the test_utf8. The test: > > FAIL: test_cmd_line (test.test_utf8_mode.UTF8ModeTests) > > fails - and I am wondering if it is as simple as AIX default mode is ISO8559-1 > and the test looks to be comparing UTF8 with the locale_default. > If that is the case, obviously this test will never succeed - asis. > > Am I understanding the test properly. > If yes, then I'll see what I can come up with for a patch to the test for AIX. > If no, I'll need some hand holding to help me understand the test UTF-8 mode relates to PEP 540, and the intent is that the default C/POSIX locale should either be coerced to a UTF-8 based one (by the PEP 538 mechanism), or else UTF-8 mode will activate, and CPython will set its *own* encoding to UTF-8, and ignore the locale one. We did need to make the PEP 538 tests AIX-aware [1] so they knew what to expect as the default encoding when locale coercion was disabled, so it's possible some further special casing will be needed in the UTF-8 mode tests as well. Cheers, Nick. 
[1] https://github.com/python/cpython/blob/master/Lib/test/test_c_locale_coercion.py#L40 -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From cmawebsite at gmail.com Sun Sep 2 17:37:59 2018 From: cmawebsite at gmail.com (Collin Anderson) Date: Sun, 2 Sep 2018 17:37:59 -0400 Subject: [Python-Dev] Python 2.7 EOL date In-Reply-To: <1535223033.3211797.1486041560.6B4E95EE@webmail.messagingengine.com> References: <1535223033.3211797.1486041560.6B4E95EE@webmail.messagingengine.com> Message-ID: Thanks all for clarifying! -Collin On Sat, Aug 25, 2018 at 2:50 PM Benjamin Peterson wrote: > I was operating under the optimistic assumption whatever the precise time > of 2.7's official demise would only be an amusing piece of trivia for a > world of happy Python 3 users. > > It's still to early to promise exact release dates; that will depend on > the day-to-day schedules of the release manager and binary builders circa > January 2020. A conservative assumption is that no 2.7 changes that land > after December 31 2019 will ever be released. > > We could make the last release of 2.7 in July 2020. But what does that buy > anyone? > > On Thu, Aug 23, 2018, at 11:53, Collin Anderson wrote: > > Hi All, > > > > Sorry if this has been mentioned before, but I noticed the Python 2.7 EOL > > date was recently set to Jan 1st, 2020. > > > > My understanding was Python releases get 5 years of support from their > > initial release, and Python 2.7 was extended an additional 5 years. > > > > Python 2.7 was originally released on 2010-07-03, and with an original > EOL > > of 2015-07-03. Extended 5 years, shouldn't the EOL be 2020-07-03? > > > > Also, this statement is a little unclear to me: > > > > > Specifically, 2.7 will receive bugfix support until January 1, 2020. > All > > 2.7 development work will cease in 2020. > > > > This statement makes it sound like bugfixes end on Jan 1st, but seems to > > leave open the possibility that security fixes could continue through the > > year. > > > > Thanks! > > Collin > > _______________________________________________ > > Python-Dev mailing list > > Python-Dev at python.org > > https://mail.python.org/mailman/listinfo/python-dev > > Unsubscribe: > > https://mail.python.org/mailman/options/python-dev/benjamin%40python.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: From talha.noyon10 at gmail.com Mon Sep 3 11:43:35 2018 From: talha.noyon10 at gmail.com (Md Abu Talha) Date: Mon, 3 Sep 2018 21:43:35 +0600 Subject: [Python-Dev] pyOpengl text render Message-ID: How can I rendered text using python and opengl? please help -------------- next part -------------- An HTML attachment was scrubbed... URL: From phd at phdru.name Mon Sep 3 12:24:40 2018 From: phd at phdru.name (Oleg Broytman) Date: Mon, 3 Sep 2018 18:24:40 +0200 Subject: [Python-Dev] pyOpengl text render In-Reply-To: References: Message-ID: <20180903162440.gxudwqfnsv2gjfff@phdru.name> Hello. This mailing list is to work on developing Python (adding new features to Python itself and fixing bugs); if you're having problems learning, understanding or using Python, please find another forum. Probably python-list/comp.lang.python mailing list/news group is the best place; there are Python developers who participate in it; you may get a faster, and probably more complete, answer there. See https://www.python.org/community/ for other lists/news groups/fora. Thank you for understanding. 
On Mon, Sep 03, 2018 at 09:43:35PM +0600, Md Abu Talha wrote: > How can I rendered text using python and opengl? please help https://www.google.com/search?hl=en&pws=0&q=python+opengl+render+text https://stackoverflow.com/search?q=%5Bpython%5D+%5Bopengl%5D+render+text Oleg. -- Oleg Broytman https://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN. From oono0114 at gmail.com Tue Sep 4 10:37:23 2018 From: oono0114 at gmail.com (=?UTF-8?B?5aSn6YeO6ZqG5byY?=) Date: Tue, 4 Sep 2018 23:37:23 +0900 Subject: [Python-Dev] AES cipher implementation in standard library Message-ID: Dear all, Have we tried cipher implementation includes AES as a standard library in the past? https://docs.python.org/3.6/library/crypto.html if possible I want to try to implement AES because famous 3rd party library is not maintained and general cipher programs should be used for multiple purpose.Though the implementation is tough, I believe this should be worth to it. In my case, I want to use AES implementation for zipfile module. Thanks and Regards, --------------- Takahiro Ono -------------- next part -------------- An HTML attachment was scrubbed... URL: From oono0114 at gmail.com Tue Sep 4 10:43:27 2018 From: oono0114 at gmail.com (=?UTF-8?B?5aSn6YeO6ZqG5byY?=) Date: Tue, 4 Sep 2018 23:43:27 +0900 Subject: [Python-Dev] Cipher implementation (such as AES) in standard library Message-ID: Dear all, Have we tried cipher implementation includes AES as a standard library in the past? https://docs.python.org/3.6/library/crypto.html if possible I want to try to implement AES because famous 3rd party library is not maintained and general cipher programs should be used for multiple purpose.Though the implementation is tough, I believe this should be worth to it. In my case, I want to use AES implementation for zipfile module. Thanks and Regards, --------------- Takahiro Ono -------------- next part -------------- An HTML attachment was scrubbed... URL: From eelizondo at fb.com Tue Sep 4 11:13:54 2018 From: eelizondo at fb.com (Eddie Elizondo) Date: Tue, 4 Sep 2018 15:13:54 +0000 Subject: [Python-Dev] Heap-allocated StructSequences Message-ID: PEP-384 talks about defining a Stable ABI by making PyTypeObject opaque. Thus, banning the use of static PyTypeObjects. Specifically, I?d like to focus on those static PyTypeObjects that are initialized as StructSequences. As of today, there are 14 instances of these types (in timemodule.c, posixmodule.c, etc.) within cpython/Modules. These are all initialized through PyStructSequence_InitType2. This is very problematic when trying to make these types conform to PEP-384 as they are used through static PyTypeObjects. Problems: * PyStructSequence_InitType2 overrides the PyTypeObject: This C-API does a direct memcpy from a ?prototype? structure. This effectively overrides anything set within the PyTypeObject. For example, if we were to initialize a heap allocated PyTypeObject and pass it on to this function, the C-API would just get rid of the Py_TPFLAG_HEAPTYPE flag, causing issues with the GC. * PyStructSequence_InitType2 does not work with heap allocated PyTypeObjects: Even if the function is fixed to preserve the state of the PyTypeObject and only overriding the specific slots (i.e. tp_new, tp_repr, etc.), it is expected that PyStructSequence_InitType2 will call PyType_Ready on the object. That means that the incoming object shouldn?t be initialized by a function such as PyType_FromSpec, as that would have already called PyType_Ready on it. 
Therefore, PyStructSequence_InitType2 will now have the responsibility of setting all the slots and properties of the PyHeapTypeObject, which is not feasible. * PyStructSequence_NewType is non-functional: This function was meant to be used as a way of creating a heap-allocated PyTypeObject that be passed to PyStructSequence_InitType2, effectively returning a heap allocated PyTypeObject. The current implementation doesn?t work in practice. Given that this struct is created in the heap, the GC has control over it. Thus, when the GC tries to traverse the type it complains with: ?Error: type_traverse() called for non-heap type?, since it doesn?t have the Py_TPFLAG_HEAPTYPE flag. If we add the flag, we run into bullet point 1, if we are able to preserve the flag then we will still run into the problem of bullet point 2. Extra note: This C-API is not being used anywhere within CPython itself. Solution: * Fix the implementation of PyStructSequence_NewType: The best solution would be to fix the implementation of this function. This can easily be done by dynamically creating a PyType_Spec and calling PyType_FromSpec ``` PyObject* PyStructSequence_NewType(PyStructSequence_Desc *desc) { // ? PyType_Spec* spec = PyMem_NEW(PyType_Spec, 1); spec->name = desc->name; spec->basicsize = sizeof(PyStructSequence) - sizeof(PyObject *); spec->itemsize = sizeof(PyObject *); spec->flags = Py_TPFLAGS_DEFAULT; spec->slots = PyMem_NEW(PyType_Slot, 6); spec->slots[0].slot = Py_tp_dealloc; spec->slots[0].pfunc = (destructor)structseq_dealloc; // ? bases = PyTuple_Pack(1, &PyTuple_Type); type = PyType_FromSpecWithBases(spec, bases); // ? ``` This will cleanly create a heap allocated PyStructSequence which can be used just like any stack allocated PyTypeObject initialized through PyStructSequence_InitType2. This means that any C-Extension should be using PyStructSequence_NewType and only built-in types should be calling PyStructSequence_InitType2. This will enable these types to comply with PEP-384 As an extra, I already have patches for this proposal. They can be found here: Branch: https://github.com/eduardo-elizondo/cpython/tree/heap-structseq Reimplement PyStructSequence_NewType: https://github.com/eduardo-elizondo/cpython/commit/413f8ca5bc008d84b3397ca1c9565c604d54b661 Patch timemodule with NewType: https://github.com/eduardo-elizondo/cpython/commit/0a35ea263a531cb03c06be9efc9e96d68162b308 Thoughts? -------------- next part -------------- An HTML attachment was scrubbed... URL: From yselivanov.ml at gmail.com Tue Sep 4 12:19:49 2018 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Tue, 4 Sep 2018 12:19:49 -0400 Subject: [Python-Dev] Use of Cython In-Reply-To: References: <20180730110120.6d03e6d8@fsol> <74a848fa0eff42fc8ae5aa58e3fe71d0@xmail101.UGent.be> <5B600F47.3090503@UGent.be> <20180731094528.118471f9@fsol> <17ebc01b-e0f2-1ebc-1229-b4ca84843f9c@python.org> <8CA25A41-D9F4-4634-9509-604F84B09E46@mac.com> <2B17179F-29B4-40E2-824D-749359E33089@mac.com> <57363ED3-4851-4A5D-A4B9-F9F0A9053F2C@mac.com> Message-ID: Hi Stefan, On Sat, Sep 1, 2018 at 6:12 PM Stefan Behnel wrote: > > Yury, > > given that people are starting to quote enthusiastically the comments you > made below, let me set a couple of things straight. To everyone reading this thread please keep in mind that I'm not in position to "defend" mypyc or to "promote" it, and I'm not affiliated with the project at all. I am just excited about yet another tool to statically compile Python and I'm discussing it only from a theoretical standpoint. 
> > Yury Selivanov schrieb am 07.08.2018 um 19:34: > > On Mon, Aug 6, 2018 at 11:49 AM Ronald Oussoren via Python-Dev wrote: > > > >> I have no strong opinion on using Cython for tests or in the stdlib, other than that it is a fairly large dependency. I do think that adding a ?Cython-lite? tool the CPython distribution would be less ideal, creating and maintaining that tool would be a lot of work without clear benefits over just using Cython. > > > > Speaking of which, Dropbox is working on a new compiler they call "mypyc". > > > > mypyc will compile type-annotated Python code to an optimized C. > > That's their plan. Saying that "it will" is a bit premature at this point. > The list of failed attempts at writing static Python compilers is rather > long, even if you only count those that compile the usual "easy subset" of > Python. > > I wish them the best of luck and endurance, but they have a long way to go. I fully agree with you here. > > > > The > > first goal is to compile mypy with it to make it faster, so I hope > > that the project will be completed. > > That's not "the first goal". It's the /only/ goal. The only intention of > mypyc is to be able to compile and optimise enough of Python to speed up > the kind or style of code that mypy uses. > > > > Essentially, mypyc will be similar > > to Cython, but mypyc is a *subset of Python*, not a superset. > > Which is bad, right? It means that there will be many things that simply > don't work, and that you need to change your code in order to make it > compile at all. Cython is way beyond that point by now. Even RPython will > probably continue to be way better than mypyc for quite a while, maybe > forever, who knows. To be clear I'm not involved with mypyc, but my understanding is that the entire Python syntax will be supported, except some dynamic features like patching `globals()`, `locals()`, or classes, or __class__. IMO this is *good* and in general Python programs don't do that anyways. > > > > Interfacing with C libraries can be easily achieved with cffi. > > Except that it will be fairly slow. cffi is not designed for static > analysis but for runtime operations. Could you please clarify this point? My current understanding is that you can build a static compiler with a knowledge about cffi so that it can compile calls like `ffi.new("something_t[]", 80)` to pure C. > You can obviously also use cffi from > Cython ? but then, why would you, if you can get much faster code much more > easily without using cffi? The "much more easily" part is debatable here and is highly subjective. For me using Cython is also easier *at this point* because I've spent so much time working with it. Although getting there wasn't easy for me :( > > That being said, if someone wants to write a static cffi optimiser for > Cython, why not, I'd be happy to help with my advice. The cool thing is > that this can be improved gradually, because compiling the cffi code > probably already works out of the box. It's just not (much) faster than > when interpreted. Yeah, statically compiling cffi-enabled code is probably the way to go for mypyc and Cython. > > > > Being a > > strict subset of Python means that mypyc code will execute just fine > > in PyPy. > > So does normal (non-subset) Python code. You can run it in PyPy, have > CPython interpret it, or compile it with Cython if you want it to run > faster in CPython, all without having to limit yourself to a subset of > Python. 
Seriously, you make this sound like requiring users to rewrite > their code to make it compilable with mypyc was a good thing. But that's the point: unless you add Cython types to your Python code it gets only moderate speedups. Using Cython/C types usually means that you need to use pxd/pyx files which means that the code isn't Python anymore. I know that Cython has a mode to use decorators in pure Python code to annotate types, but they are less intuitive than using typing annotations in 3.6+. [..] > > I'd be more willing to start using mypyc+cffi in CPython stdlib > > *eventually*, than Cython now. Cython is a relatively complex and > > still poorly documented language. > > You are free to improve the documentation or otherwise help us find and > discuss concrete problems with it. Fair point. > Calling Cython a "poorly documented > language" could easily feel offensive towards those who have put a lot of > work into the documentation, wiki, tutorials, trainings and what not that > help people use the language. Even stack overflow is getting better and > better in documenting Cython these days, even though responses over there > that describe work-arounds tend to get outdated fairly quickly. Didn't mean to offend anyone, sorry if I did. I'm myself partly responsible for poor asyncio docs and I know how it is to be on the receiving end :( [..] > > I'm speaking from experience after > > writing thousands of lines of Cython in uvloop & asyncpg. In skillful > > hands Cython is amazing, but I'd be cautious to advertise and use it > > in CPython. > > Why not? You didn't actually give any reasons for that. I've listed a couple: (1) To get significant speedup one needs to learn a lot of new syntax. For CPython it means that we'd have Python, C, and Cython to learn to understand code written in Cython. There's a very popular assumption that you have to be proficient in C in order to become a CPython core dev and people are genuinely surprised when I tell them that it's not a requirement. At the three conferences I've been this summer at least 5 people complained to me that they didn't even consider contributing to CPython because they don't know C. Adding yet another language would simply raise this bar even higher, IMHO. (2) My point about documentation still stands, even though I feel extremely uncomfortable using it, sorry. > > > > I'm also -1 on using Cython to test C API. While writing C tests is > > annoying (I wrote a fair share myself), their very purpose is to make > > third-party tools/extensions more stable. Using a third-party tool to > > test C API to track regressions that break third-party tools feels > > wrong. > > I don't understand that argument. What's wrong about using a tool that > helps you get around writing boiler plate code? The actual testing does not > need to be done by Cython at all, you can write it any way you like. Because you don't have 100% control over how exactly Cython (or different versions of it) will compile your code to C. In my experience writing a few C API tests in C is relatively easy compared to introducing these new C APIs in the first place. To summarize my personal position: I'm -1 on using Cython to write C API tests/boilerplate in CPython. I'm -1 on giving green light to use Cython's pxd/pyx syntaxes in CPython. I'd be +0.5 on using Cython (optionally?) to compile some pure Python code to make it 30-50% faster. asyncio, for instance, would certainly benefit from that. 
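To make the "optional" part concrete, the build hook could be as small as the sketch below. This is only a rough, untested illustration; the "_hot_path.py" module name is made up:

```python
# setup.py -- rough sketch of *optional* Cython compilation: the module
# stays plain Python and is only compiled when Cython happens to be installed.
from setuptools import setup

try:
    from Cython.Build import cythonize
    # "_hot_path.py" is a made-up example module name.
    ext_modules = cythonize(
        ["_hot_path.py"],
        compiler_directives={"language_level": "3"},
    )
except ImportError:
    ext_modules = []  # fall back to the interpreted version

setup(name="example", ext_modules=ext_modules)
```

With a layout like this the same .py file keeps working everywhere; Cython is purely an accelerator when it is available.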
Y From christian at python.org Tue Sep 4 12:48:43 2018 From: christian at python.org (Christian Heimes) Date: Tue, 4 Sep 2018 18:48:43 +0200 Subject: [Python-Dev] AES cipher implementation in standard library In-Reply-To: References: Message-ID: <1b14178e-8cdb-5061-917e-6d63a0b3d832@python.org> On 2018-09-04 16:37, ???? wrote: > Dear all, > > Have we tried?cipher?implementation includes AES as a standard library > in the past? > https://docs.python.org/3.6/library/crypto.html > > if possible I want to try to implement AES because famous 3rd party > library is not maintained and general cipher programs should be used for > multiple purpose.Though the implementation is tough,? I believe this > should be worth to it. > In my case, I want to use AES implementation for zipfile module. strong -1 The Python standard library doesn't contain any encryption, signing, and other cryptographic algorithms for multiple reasons. The only exception from the rule are hashing algorithms and HMAC construct. There are legal implications like export restrictions. Crypto is just too hard to get right and we don't want to give the user additional rope. We already had a very lengthy and exhausting discussion for the secrets module. That module just provides a user-friendly interface to CPRNG. By the way, AES by itself is a useless to borderline dangerous algorithm. It must be embedded within additional layers like block mode, authenticated encryption / MAC, and more. There isn't a single correct answer for block mode and AD algorithm, too. It highly depends on the problem space. While GCM AEAD mode is good choice for network communication, it can be a pretty bad idea for persistent storage. There is one excellent Python library with high level and low level cryptographic algorithms: http://cryptography.readthedocs.io/ . It's t Regards, Christian From stefan_ml at behnel.de Tue Sep 4 14:55:56 2018 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 4 Sep 2018 20:55:56 +0200 Subject: [Python-Dev] Use of Cython In-Reply-To: References: <20180730110120.6d03e6d8@fsol> <74a848fa0eff42fc8ae5aa58e3fe71d0@xmail101.UGent.be> <5B600F47.3090503@UGent.be> <20180731094528.118471f9@fsol> <17ebc01b-e0f2-1ebc-1229-b4ca84843f9c@python.org> <8CA25A41-D9F4-4634-9509-604F84B09E46@mac.com> <2B17179F-29B4-40E2-824D-749359E33089@mac.com> <57363ED3-4851-4A5D-A4B9-F9F0A9053F2C@mac.com> Message-ID: Yury Selivanov schrieb am 04.09.2018 um 18:19: > On Sat, Sep 1, 2018 at 6:12 PM Stefan Behnel wrote: >> Yury Selivanov schrieb am 07.08.2018 um 19:34: >>> The first goal is to compile mypy with it to make it faster, so I hope >>> that the project will be completed. >> >> That's not "the first goal". It's the /only/ goal. The only intention of >> mypyc is to be able to compile and optimise enough of Python to speed up >> the kind or style of code that mypy uses. >> >>> Essentially, mypyc will be similar >>> to Cython, but mypyc is a *subset of Python*, not a superset. >> >> Which is bad, right? It means that there will be many things that simply >> don't work, and that you need to change your code in order to make it >> compile at all. Cython is way beyond that point by now. Even RPython will >> probably continue to be way better than mypyc for quite a while, maybe >> forever, who knows. > > To be clear I'm not involved with mypyc, but my understanding is that > the entire Python syntax will be supported, except some dynamic > features like patching `globals()`, `locals()`, or classes, or > __class__. 
No, that's not the goal, at least from what I understood from my discussions with Jukka. The goal is to make it compile mypy, be it by supporting Python features in mypyc or by avoiding Python features in mypy. I'm sure they will take any shortcut they can in order to avoid having to make mypyc too capable, because mypyc is not more than a means to an end. For example, they may easily get away without supporting generators and closures, which are quite difficult to implement in C. But finding a non-trivial piece of Python code out there that uses neither of the two is probably not easy. I'm also sure they will avoid Python semantics wherever they can, because implementing them in the same way as CPython and Cython would mean that certain constructs cannot safely be statically reasoned about, and thus cannot be optimised. Avoiding (full) Python semantics relieves you from these restrictions, and if you control both sides, the compiler and the code that it compiles, then it becomes much easier to apply arbitrary optimisations at will. IMHO, what they are implementing is much closer to ShedSkin than to Cython. >>> Interfacing with C libraries can be easily achieved with cffi. >> >> Except that it will be fairly slow. cffi is not designed for static >> analysis but for runtime operations. > > Could you please clarify this point? My current understanding is that > you can build a static compiler with a knowledge about cffi so that it > can compile calls like `ffi.new("something_t[]", 80)` to pure C. I'm sure there is a relatively large subset of cffi's API that could be compiled statically, as long as the declartions and their usage are kept simple and fully visible to the compiler. What that subset is remains to be seen once someone actually tries to do it. > Yeah, statically compiling cffi-enabled code is probably the way to go > for mypyc and Cython. I doubt it, given the expected restrictions and verbosity. But debating this is useless as long as no-one attempts to actually write a static compiler for cffi(-like) code. > Using Cython/C types usually means > that you need to use pxd/pyx files which means that the code isn't > Python anymore. I'm aware that this is a very common misconception that is difficult to get out of people's heads. You probably got this idea from wrapping a native library, in which case the only choice you have in order to declare an external C-API is really to use Cython's special syntax. However, this would not apply to most use cases in the CPython project context, and it also does not necessarily apply to most of the code in a Cython module even if it uses external libraries. Cython has four ways to provide type declarations: cdef statements in Cython code, external .pxd files for Python or Cython files, special decorators and declaration functions, and PEP-484/526 type annotations. All four have their use cases (e.g. syntax support in different Python versions, efficiency of expression, readability for people with different backgrounds, etc.), and all but the first allow users to keep their module code in Python syntax. As long as you do not call into external native code, it's your choice which of these you prefer for your code base, project context and developer background. You can even mix them at will, if you feel like it. > I know that Cython has a mode to use decorators in > pure Python code to annotate types, but they are less intuitive than > using typing annotations in 3.6+. 
You can use PEP-484/526 type annotations to declare Cython types in Python code that you intend to compile. It's entirely up to you, and it's an entirely subjective measure which "is better". Many people prefer Cython's non-Python syntax because it allows them to apply their existing C knowledge. For them, PEP-484 annotations may easily be non-intuitive in comparison. > For CPython it means that we'd have Python, C, and Cython to learn to > understand code written in Cython. There's a very popular assumption > that you have to be proficient in C in order to become a CPython core > dev and people are genuinely surprised when I tell them that it's not > a requirement. At the three conferences I've been this summer at > least 5 people complained to me that they didn't even consider > contributing to CPython because they don't know C. Adding yet another > language would simply raise this bar even higher, IMHO. Adding the right language would lower the bar, IMHO. Cython is Python. It allows users with a Python background to implement C things without having to thoroughly learn C /and/ the CPython C-API first. So, the way I see it, rather than /adding/ a "third" language to the mix, it substantially lowers the entry level from the current two and a half languages (Python + C + C-API) to one and a half (Python + Cython). > I'd be +0.5 on using Cython (optionally?) to compile some pure Python > code to make it 30-50% faster. asyncio, for instance, would certainly > benefit from that. Since most of this (stdlib) Python code doesn't need to stay syntax compatible with Python < 3.6 (actually 3.8) anymore, you can probably get much higher speedups than that by statically typing some variables and functions here and there. I recently tried that with difflib, makes a big difference. Stefan From ethan at stoneleaf.us Tue Sep 4 15:47:31 2018 From: ethan at stoneleaf.us (Ethan Furman) Date: Tue, 04 Sep 2018 12:47:31 -0700 Subject: [Python-Dev] Use of Cython In-Reply-To: References: <20180730110120.6d03e6d8@fsol> <74a848fa0eff42fc8ae5aa58e3fe71d0@xmail101.UGent.be> <5B600F47.3090503@UGent.be> <20180731094528.118471f9@fsol> <17ebc01b-e0f2-1ebc-1229-b4ca84843f9c@python.org> <8CA25A41-D9F4-4634-9509-604F84B09E46@mac.com> <2B17179F-29B4-40E2-824D-749359E33089@mac.com> <57363ED3-4851-4A5D-A4B9-F9F0A9053F2C@mac.com> Message-ID: <5B8EE153.1020104@stoneleaf.us> On 09/04/2018 11:55 AM, Stefan Behnel wrote: > Adding the right language would lower the bar, IMHO. Cython is Python. It > allows users with a Python background to implement C things without having > to thoroughly learn C/and/ the CPython C-API first. So, the way I see it, > rather than/adding/ a "third" language to the mix, it substantially lowers > the entry level from the current two and a half languages (Python + C + > C-API) to one and a half (Python + Cython). As somebody who only has light exposure to C, I would very much like to have Cython be an option. 
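For anyone curious what the annotation style discussed above looks like in practice, here is a small, untested sketch (the harmonic-sum function is just a toy example): plain Python that CPython or PyPy run unchanged, but that Cython compiles with C int/double types when annotation typing is enabled (the default in current Cython releases, as far as I know). Running it interpreted only needs the `cython` shadow module, which ships with Cython itself.

```python
# toy_module.py -- a toy example; runs as plain Python, compiles with Cython
import cython

def harmonic(n: cython.int) -> cython.double:
    """Return 1/1 + 1/2 + ... + 1/n."""
    total: cython.double = 0.0
    i: cython.int
    for i in range(1, n + 1):
        total += 1.0 / i
    return total

if __name__ == "__main__":
    print(harmonic(10))  # same result interpreted or compiled, just slower/faster
```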
-- ~Ethan~ From yselivanov.ml at gmail.com Tue Sep 4 15:51:37 2018 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Tue, 4 Sep 2018 15:51:37 -0400 Subject: [Python-Dev] Use of Cython In-Reply-To: References: <20180730110120.6d03e6d8@fsol> <74a848fa0eff42fc8ae5aa58e3fe71d0@xmail101.UGent.be> <5B600F47.3090503@UGent.be> <20180731094528.118471f9@fsol> <17ebc01b-e0f2-1ebc-1229-b4ca84843f9c@python.org> <8CA25A41-D9F4-4634-9509-604F84B09E46@mac.com> <2B17179F-29B4-40E2-824D-749359E33089@mac.com> <57363ED3-4851-4A5D-A4B9-F9F0A9053F2C@mac.com> Message-ID: On Tue, Sep 4, 2018 at 2:58 PM Stefan Behnel wrote: [..] > Cython has four ways to provide type declarations: cdef statements in > Cython code, external .pxd files for Python or Cython files, special > decorators and declaration functions, and PEP-484/526 type annotations. Great to hear that PEP 484 type annotations are supported. Here's a link to the docs: https://cython.readthedocs.io/en/latest/src/tutorial/pure.html#static-typing [..] > > I know that Cython has a mode to use decorators in > > pure Python code to annotate types, but they are less intuitive than > > using typing annotations in 3.6+. > > You can use PEP-484/526 type annotations to declare Cython types in Python > code that you intend to compile. It's entirely up to you, and it's an > entirely subjective measure which "is better". Many people prefer Cython's > non-Python syntax because it allows them to apply their existing C > knowledge. For them, PEP-484 annotations may easily be non-intuitive in > comparison. Yeah, but if we decide to use Cython in CPython we probably need to come up with something like PEP 7 to recommend one particular style and have an overall guideline. Using PEP 484 annotations means that we have pure Python code that PyPy and other interpreters can still run. [..] > > I'd be +0.5 on using Cython (optionally?) to compile some pure Python > > code to make it 30-50% faster. asyncio, for instance, would certainly > > benefit from that. > > Since most of this (stdlib) Python code doesn't need to stay syntax > compatible with Python < 3.6 (actually 3.8) anymore, you can probably get > much higher speedups than that by statically typing some variables and > functions here and there. I recently tried that with difflib, makes a big > difference. I'd be willing to try this in asyncio if we start using Cython. Yury From vstinner at redhat.com Wed Sep 5 05:52:50 2018 From: vstinner at redhat.com (Victor Stinner) Date: Wed, 5 Sep 2018 11:52:50 +0200 Subject: [Python-Dev] Workflow blocked on the 3.6 because of AppVeyor; who owns the AppVeyor project? Message-ID: Hi, It's no longer possible to merge any change in the 3.6 branch of CPython, because the AppVeyor job fails: https://bugs.python.org/issue34575 It seems like AppVeyor has a build cache and this cache is outdated. I tried to use the REST API but I'm not allowed to invalidate the cache: even the most basic REST API query (list my own roles) fails with: {"message":"You do not have required permissions to perform this action."} Who ows the "python" AppVeyor project? Can someone please give me the administrator permission on this project, so I will be able to invalid the build cache? Moreover, would it be possible to give me the administrator permission on the CPython GitHub project, so I would be able to mark the AppVeyor as optional until the issue is solved (to unblock the workflow at least)? 
I promise I will not mess up the Python project ;-) Thanks in advance, Victor From vstinner at redhat.com Wed Sep 5 05:56:24 2018 From: vstinner at redhat.com (Victor Stinner) Date: Wed, 5 Sep 2018 11:56:24 +0200 Subject: [Python-Dev] Schedule of the Python 3.7.1 release? Message-ID: Hi, Someone asked somewhere (oops, I forgot where!) when is Python 3.7.1 scheduled. I wanted to reply when I saw that it was scheduled for 2 months ago: "3.7.1: 2018-07-xx" https://www.python.org/dev/peps/pep-0537/#maintenance-releases Is there any blocker for 3.7.1? I fixed dozens of bugs in the 3.7 branch since 3.7.0 has been released. I'm ashamed of most of them, since I introduced regressions in 3.7. (If you insist, I can name them ;-)) Note: the latest Python activity of Ned Deily, our 3.7 release manager, was an email sent to python-committers mi-July. Victor From p.f.moore at gmail.com Wed Sep 5 06:03:48 2018 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 5 Sep 2018 11:03:48 +0100 Subject: [Python-Dev] Workflow blocked on the 3.6 because of AppVeyor; who owns the AppVeyor project? In-Reply-To: References: Message-ID: On Wed, 5 Sep 2018 at 10:55, Victor Stinner wrote: > > Hi, > > It's no longer possible to merge any change in the 3.6 branch of > CPython, because the AppVeyor job fails: > https://bugs.python.org/issue34575 > > It seems like AppVeyor has a build cache and this cache is outdated. I > tried to use the REST API but I'm not allowed to invalidate the cache: > even the most basic REST API query (list my own roles) fails with: > > {"message":"You do not have required permissions to perform this action."} > > Who ows the "python" AppVeyor project? Can someone please give me the > administrator permission on this project, so I will be able to invalid > the build cache? > > Moreover, would it be possible to give me the administrator permission > on the CPython GitHub project, so I would be able to mark the AppVeyor > as optional until the issue is solved (to unblock the workflow at > least)? I promise I will not mess up the Python project ;-) I don't appear to have admin rights on Appveyor either. Also, there doesn't appear to be an appveyor.yml file in the CPython repository, so I'm not clear how the build process has been configured. Does anyone have that information? (And I'd strongly recommend that if we're somehow configuring the builds via the Appveyor UI, we move to using a config file like we do for Travis, so that diagnosis and fixes can be done without needing to access the Appveyor admin interface...) Paul From evpok.padding at gmail.com Wed Sep 5 06:10:57 2018 From: evpok.padding at gmail.com (Evpok Padding) Date: Wed, 5 Sep 2018 12:10:57 +0200 Subject: [Python-Dev] Comparisions for collections.Counters Message-ID: Hello everyone, According to the [doc][1], `collections.Counter` convenience intersection and union functions are meant to help it represent multisets. However, it currently lacks comparisons, which would make sense and seems straightforward to implement. Am I missing something here or should I send a PR ASAP?? :-) Cheers, E [1]: https://docs.python.org/3/library/collections.html#collections.Counter -------------- next part -------------- An HTML attachment was scrubbed... URL: From christian at python.org Wed Sep 5 06:22:34 2018 From: christian at python.org (Christian Heimes) Date: Wed, 5 Sep 2018 12:22:34 +0200 Subject: [Python-Dev] Schedule of the Python 3.7.1 release? 
In-Reply-To: References: Message-ID: <16e28d4a-41e4-8b46-64e9-40a48a1f4911@python.org> On 2018-09-05 11:56, Victor Stinner wrote: > Hi, > > Someone asked somewhere (oops, I forgot where!) when is Python 3.7.1 > scheduled. I wanted to reply when I saw that it was scheduled for 2 > months ago: > > "3.7.1: 2018-07-xx" > https://www.python.org/dev/peps/pep-0537/#maintenance-releases > > Is there any blocker for 3.7.1? I fixed dozens of bugs in the 3.7 > branch since 3.7.0 has been released. I'm ashamed of most of them, > since I introduced regressions in 3.7. (If you insist, I can name them > ;-)) > > Note: the latest Python activity of Ned Deily, our 3.7 release > manager, was an email sent to python-committers mi-July. Hi, can we do a release after the core dev sprints, please? I have like to discuss and land some TLS 1.3 / OpenSSL 1.1.1 related changes and improvements first. Christian From solipsis at pitrou.net Wed Sep 5 07:22:01 2018 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 5 Sep 2018 13:22:01 +0200 Subject: [Python-Dev] Workflow blocked on the 3.6 because of AppVeyor; who owns the AppVeyor project? References: Message-ID: <20180905132201.29d06e72@fsol> On Wed, 5 Sep 2018 11:03:48 +0100 Paul Moore wrote: > On Wed, 5 Sep 2018 at 10:55, Victor Stinner wrote: > > > > Hi, > > > > It's no longer possible to merge any change in the 3.6 branch of > > CPython, because the AppVeyor job fails: > > https://bugs.python.org/issue34575 > > > > It seems like AppVeyor has a build cache and this cache is outdated. I > > tried to use the REST API but I'm not allowed to invalidate the cache: > > even the most basic REST API query (list my own roles) fails with: > > > > {"message":"You do not have required permissions to perform this action."} > > > > Who ows the "python" AppVeyor project? Can someone please give me the > > administrator permission on this project, so I will be able to invalid > > the build cache? > > > > Moreover, would it be possible to give me the administrator permission > > on the CPython GitHub project, so I would be able to mark the AppVeyor > > as optional until the issue is solved (to unblock the workflow at > > least)? I promise I will not mess up the Python project ;-) > > I don't appear to have admin rights on Appveyor either. Also, there > doesn't appear to be an appveyor.yml file in the CPython repository, > so I'm not clear how the build process has been configured. For some reason it seems to be located in a hidden directory (".github/appveyor.yml"). Not the most intuitive decision IMHO. Travis' own config file ".travis.yml" is still at repository root, which makes things more confusing. Regards Antoine. From eric at trueblade.com Wed Sep 5 06:50:31 2018 From: eric at trueblade.com (Eric V. Smith) Date: Wed, 5 Sep 2018 06:50:31 -0400 Subject: [Python-Dev] Schedule of the Python 3.7.1 release? In-Reply-To: <16e28d4a-41e4-8b46-64e9-40a48a1f4911@python.org> References: <16e28d4a-41e4-8b46-64e9-40a48a1f4911@python.org> Message-ID: <44a5c5f1-eef8-4d6d-97a9-7ae3cf45f3a0@trueblade.com> On 9/5/2018 6:22 AM, Christian Heimes wrote: > On 2018-09-05 11:56, Victor Stinner wrote: >> Hi, >> >> Someone asked somewhere (oops, I forgot where!) when is Python 3.7.1 >> scheduled. I wanted to reply when I saw that it was scheduled for 2 >> months ago: >> >> "3.7.1: 2018-07-xx" >> https://www.python.org/dev/peps/pep-0537/#maintenance-releases >> >> Is there any blocker for 3.7.1? I fixed dozens of bugs in the 3.7 >> branch since 3.7.0 has been released. 
I'm ashamed of most of them, >> since I introduced regressions in 3.7. (If you insist, I can name them >> ;-)) >> >> Note: the latest Python activity of Ned Deily, our 3.7 release >> manager, was an email sent to python-committers mi-July. > > Hi, > > can we do a release after the core dev sprints, please? I have like to > discuss and land some TLS 1.3 / OpenSSL 1.1.1 related changes and > improvements first. Agreed. I have at least one dataclasses bug I'm planning on addressing, and I'd like to get some feedback on it at the sprints. Eric From p.f.moore at gmail.com Wed Sep 5 07:54:42 2018 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 5 Sep 2018 12:54:42 +0100 Subject: [Python-Dev] Workflow blocked on the 3.6 because of AppVeyor; who owns the AppVeyor project? In-Reply-To: <20180905132201.29d06e72@fsol> References: <20180905132201.29d06e72@fsol> Message-ID: On Wed, 5 Sep 2018 at 12:24, Antoine Pitrou wrote: > For some reason it seems to be located in a hidden directory > (".github/appveyor.yml"). Not the most intuitive decision IMHO. > Travis' own config file ".travis.yml" is still at repository root, which > makes things more confusing. Thanks, agreed that's confusing. I'd prefer appveyor.yml to be at the project root, as that's what nearly all projects I deal with do. But at least I know where it is now :-) Paul From vstinner at redhat.com Wed Sep 5 09:06:14 2018 From: vstinner at redhat.com (Victor Stinner) Date: Wed, 5 Sep 2018 15:06:14 +0200 Subject: [Python-Dev] Workflow blocked on the 3.6 because of AppVeyor; who owns the AppVeyor project? In-Reply-To: References: Message-ID: I wrote some notes about our CIs. Link to AppVeyor notes: https://pythondev.readthedocs.io/ci.html#appveyor Victor Le mer. 5 sept. 2018 ? 12:04, Paul Moore a ?crit : > > On Wed, 5 Sep 2018 at 10:55, Victor Stinner wrote: > > > > Hi, > > > > It's no longer possible to merge any change in the 3.6 branch of > > CPython, because the AppVeyor job fails: > > https://bugs.python.org/issue34575 > > > > It seems like AppVeyor has a build cache and this cache is outdated. I > > tried to use the REST API but I'm not allowed to invalidate the cache: > > even the most basic REST API query (list my own roles) fails with: > > > > {"message":"You do not have required permissions to perform this action."} > > > > Who ows the "python" AppVeyor project? Can someone please give me the > > administrator permission on this project, so I will be able to invalid > > the build cache? > > > > Moreover, would it be possible to give me the administrator permission > > on the CPython GitHub project, so I would be able to mark the AppVeyor > > as optional until the issue is solved (to unblock the workflow at > > least)? I promise I will not mess up the Python project ;-) > > I don't appear to have admin rights on Appveyor either. Also, there > doesn't appear to be an appveyor.yml file in the CPython repository, > so I'm not clear how the build process has been configured. Does > anyone have that information? (And I'd strongly recommend that if > we're somehow configuring the builds via the Appveyor UI, we move to > using a config file like we do for Travis, so that diagnosis and fixes > can be done without needing to access the Appveyor admin interface...) > > Paul From berker.peksag at gmail.com Wed Sep 5 09:08:05 2018 From: berker.peksag at gmail.com (=?UTF-8?Q?Berker_Peksa=C4=9F?=) Date: Wed, 5 Sep 2018 16:08:05 +0300 Subject: [Python-Dev] Workflow blocked on the 3.6 because of AppVeyor; who owns the AppVeyor project? 
In-Reply-To: References: Message-ID: On Wed, Sep 5, 2018 at 12:55 PM Victor Stinner wrote: > > Hi, > > It's no longer possible to merge any change in the 3.6 branch of > CPython, because the AppVeyor job fails: > https://bugs.python.org/issue34575 > > It seems like AppVeyor has a build cache and this cache is outdated. I > tried to use the REST API but I'm not allowed to invalidate the cache: > even the most basic REST API query (list my own roles) fails with: > > {"message":"You do not have required permissions to perform this action."} > > Who ows the "python" AppVeyor project? Can someone please give me the > administrator permission on this project, so I will be able to invalid > the build cache? > > Moreover, would it be possible to give me the administrator permission > on the CPython GitHub project, so I would be able to mark the AppVeyor > as optional until the issue is solved (to unblock the workflow at > least)? I promise I will not mess up the Python project ;-) I've just made the "continuous-integration/appveyor/pr" status check optional on the 3.6 branch to unblock the development for now. Indeed, AppVeyor's REST API doesn't work: $ curl -H "Authorization: Bearer $APPVEYOR_TOKEN" -H "Content-Type: application/json" -X DELETE https://ci.appveyor.com/api/projects/python/cpython/buildcache {"message":"You do not have required permissions to perform this action."} I'm going to try to make you an admin on python/cpython, but I find GitHub's user/team management UI a bit confusing, so no promise :) --Berker From zachary.ware+pydev at gmail.com Wed Sep 5 09:44:06 2018 From: zachary.ware+pydev at gmail.com (Zachary Ware) Date: Wed, 5 Sep 2018 08:44:06 -0500 Subject: [Python-Dev] Workflow blocked on the 3.6 because of AppVeyor; who owns the AppVeyor project? In-Reply-To: <20180905132201.29d06e72@fsol> References: <20180905132201.29d06e72@fsol> Message-ID: On Wed, Sep 5, 2018 at 6:23 AM Antoine Pitrou wrote: > On Wed, 5 Sep 2018 11:03:48 +0100 > Paul Moore wrote: > > On Wed, 5 Sep 2018 at 10:55, Victor Stinner wrote: > > > Who ows the "python" AppVeyor project? That seems to have fallen to me for the most part. > > > Can someone please give me the > > > administrator permission on this project, so I will be able to invalid > > > the build cache? > > > > I don't appear to have admin rights on Appveyor either. I've attempted to make a change that should give you both more access; even odds on whether it did anything :). I've never tried to use their REST API, so I don't know whether it will help with that at all. > For some reason it seems to be located in a hidden directory > (".github/appveyor.yml"). Not the most intuitive decision IMHO. > Travis' own config file ".travis.yml" is still at repository root, which > makes things more confusing. The idea there was to avoid proliferation of root-level dotfiles where possible, but if we would rather keep it at the project root it's a relatively simple change to make. For the actual issue at hand, the problem arises from doing builds on 3.6 with both the VS2015 and VS2017 images. Apparently something built in `/externals` by the VS2015 build gets cached, which then breaks the VS2017 build; I haven't tracked down how exactly that is happening. I think the preferred solution is probably to just drop the VS2017 build on 3.6 on AppVeyor; VSTS runs on VS2017 and dropping one of the builds from 3.6 will make AppVeyor significantly quicker on that branch. 
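For reference, the cache invalidation Berker attempted above can also be scripted. A minimal sketch, assuming the third-party requests package and an AppVeyor API token whose account actually has the required role (the missing permission is exactly what the error message above is about):

import os
import requests

# Clear the build cache of the python/cpython AppVeyor project --
# the same endpoint as the curl command shown earlier in the thread.
resp = requests.delete(
    "https://ci.appveyor.com/api/projects/python/cpython/buildcache",
    headers={"Authorization": "Bearer " + os.environ["APPVEYOR_TOKEN"]},
)
resp.raise_for_status()  # raises on the 403 "permissions" error seen above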
-- Zach From vstinner at redhat.com Wed Sep 5 10:06:21 2018 From: vstinner at redhat.com (Victor Stinner) Date: Wed, 5 Sep 2018 16:06:21 +0200 Subject: [Python-Dev] Workflow blocked on the 3.6 because of AppVeyor; who owns the AppVeyor project? In-Reply-To: References: <20180905132201.29d06e72@fsol> Message-ID: Le mer. 5 sept. 2018 ? 15:47, Zachary Ware a ?crit : > For the actual issue at hand, the problem arises from doing builds on > 3.6 with both the VS2015 and VS2017 images. Apparently something > built in `/externals` by the VS2015 build gets cached, which then > breaks the VS2017 build; I haven't tracked down how exactly that is > happening. I think the preferred solution is probably to just drop > the VS2017 build on 3.6 on AppVeyor; VSTS runs on VS2017 and dropping > one of the builds from 3.6 will make AppVeyor significantly quicker on > that branch. Do we have a VS2017 buildbot on the 3.6 branch? If yes, I vote to drop VS2017 in the pre-commit hook (AppVeyor) since it would make AppVeyor twice faster on 3.6! Victor From ncoghlan at gmail.com Wed Sep 5 10:22:43 2018 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 6 Sep 2018 00:22:43 +1000 Subject: [Python-Dev] Python REPL doesn't work on Windows over remote powershell session (winrm) In-Reply-To: References: Message-ID: On Sun, 26 Aug 2018 at 12:16, David Bolen wrote: > I'm not sure if there's any better way for Python to detect a remote > shell as being interactive under Windows that would cover such cases. > Perhaps some of the newer pty changes I read Microsoft is making might > help, assuming it flows through to the isatty() test. Based on https://blogs.msdn.microsoft.com/commandline/2018/08/02/windows-command-line-introducing-the-windows-pseudo-console-conpty/, that seems plausible to me, since they're adding a true pseudo-tty system, and mostly modeling it's behaviour on the *nix one. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From p.f.moore at gmail.com Wed Sep 5 10:30:13 2018 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 5 Sep 2018 15:30:13 +0100 Subject: [Python-Dev] Workflow blocked on the 3.6 because of AppVeyor; who owns the AppVeyor project? In-Reply-To: References: <20180905132201.29d06e72@fsol> Message-ID: On Wed, 5 Sep 2018 at 14:47, Zachary Ware wrote: > > On Wed, Sep 5, 2018 at 6:23 AM Antoine Pitrou wrote: > > On Wed, 5 Sep 2018 11:03:48 +0100 > > Paul Moore wrote: > > > On Wed, 5 Sep 2018 at 10:55, Victor Stinner wrote: > > > > Who ows the "python" AppVeyor project? > > That seems to have fallen to me for the most part. > > > > > Can someone please give me the > > > > administrator permission on this project, so I will be able to invalid > > > > the build cache? > > > > > > I don't appear to have admin rights on Appveyor either. > > I've attempted to make a change that should give you both more access; > even odds on whether it did anything :). I've never tried to use > their REST API, so I don't know whether it will help with that at all. I do indeed now seem to have admin access on Appveyor. Thanks for that. I guess I should therefore say that if anyone needs help with Appveyor stuff, feel free to ping me and save Zach from getting all the work :-) > > For some reason it seems to be located in a hidden directory > > (".github/appveyor.yml"). Not the most intuitive decision IMHO. > > Travis' own config file ".travis.yml" is still at repository root, which > > makes things more confusing. 
> > The idea there was to avoid proliferation of root-level dotfiles where > possible, but if we would rather keep it at the project root it's a > relatively simple change to make. When working via github on the web (which I was) rather than on a local checkout where I can search, putting it in a subdiretory is a bit less discoverable (made worse because there's nothing about the name ".github" that suggests it would have Appveyor files in it :-)) I'd prefer it at the top level - but not enough to submit a PR for that at the moment, so I'm fine with it staying where it is. > For the actual issue at hand, the problem arises from doing builds on > 3.6 with both the VS2015 and VS2017 images. Apparently something > built in `/externals` by the VS2015 build gets cached, which then > breaks the VS2017 build; I haven't tracked down how exactly that is > happening. I think the preferred solution is probably to just drop > the VS2017 build on 3.6 on AppVeyor; VSTS runs on VS2017 and dropping > one of the builds from 3.6 will make AppVeyor significantly quicker on > that branch. Nice catch. I'd agree, it's probably not worth having both (particularly as, if Victor says, we have buildbots for the one Appveyor doesn't cover - but even if we don't I think VSTS has it covered). I presume you're suggesting keeping 2017 is so that we don't have stray 2015-built artifacts in the cache, which makes sense to me, and I have a mild preference for keeping the latest compiler, as that's likely the one that people will find easier to get. But 2015 is presumably the version the official 3.6 builds are made with, so there's an argument for keeping that one (although if we do that I guess we need to find a *different* way of fixing the cached artifact issue). tl; dr; I'm inclined to agree with you that just using VS2017 on Appveyor is the simplest option. Paul From zachary.ware+pydev at gmail.com Wed Sep 5 10:48:58 2018 From: zachary.ware+pydev at gmail.com (Zachary Ware) Date: Wed, 5 Sep 2018 09:48:58 -0500 Subject: [Python-Dev] Workflow blocked on the 3.6 because of AppVeyor; who owns the AppVeyor project? In-Reply-To: References: <20180905132201.29d06e72@fsol> Message-ID: On Wed, Sep 5, 2018 at 9:30 AM Paul Moore wrote: > I presume you're suggesting keeping 2017 is so that we don't have > stray 2015-built artifacts in the cache, which makes sense to me, and > I have a mild preference for keeping the latest compiler, as that's > likely the one that people will find easier to get. But 2015 is > presumably the version the official 3.6 builds are made with, so > there's an argument for keeping that one (although if we do that I > guess we need to find a *different* way of fixing the cached artifact > issue). My fix was actually to keep VS2015 on AppVeyor and leave VS2017 to VSTS, that way we get pre-commit coverage on both compilers. There shouldn't be any caching issues between branches, since PCbuild is sufficiently different between each branch. I wish there was a cache per branch, but there doesn't seem to be. -- Zach From erik.m.bray at gmail.com Wed Sep 5 10:55:59 2018 From: erik.m.bray at gmail.com (Erik Bray) Date: Wed, 5 Sep 2018 16:55:59 +0200 Subject: [Python-Dev] Workflow blocked on the 3.6 because of AppVeyor; who owns the AppVeyor project? 
In-Reply-To: References: <20180905132201.29d06e72@fsol> Message-ID: On Wed, Sep 5, 2018 at 4:32 PM Paul Moore wrote: > > On Wed, 5 Sep 2018 at 14:47, Zachary Ware wrote: > > > > On Wed, Sep 5, 2018 at 6:23 AM Antoine Pitrou wrote: > > > On Wed, 5 Sep 2018 11:03:48 +0100 > > > Paul Moore wrote: > > > > On Wed, 5 Sep 2018 at 10:55, Victor Stinner wrote: > > > > > Who ows the "python" AppVeyor project? > > > > That seems to have fallen to me for the most part. > > > > > > > Can someone please give me the > > > > > administrator permission on this project, so I will be able to invalid > > > > > the build cache? > > > > > > > > I don't appear to have admin rights on Appveyor either. > > > > I've attempted to make a change that should give you both more access; > > even odds on whether it did anything :). I've never tried to use > > their REST API, so I don't know whether it will help with that at all. > > I do indeed now seem to have admin access on Appveyor. Thanks for > that. I guess I should therefore say that if anyone needs help with > Appveyor stuff, feel free to ping me and save Zach from getting all > the work :-) > > > > For some reason it seems to be located in a hidden directory > > > (".github/appveyor.yml"). Not the most intuitive decision IMHO. > > > Travis' own config file ".travis.yml" is still at repository root, which > > > makes things more confusing. > > > > The idea there was to avoid proliferation of root-level dotfiles where > > possible, but if we would rather keep it at the project root it's a > > relatively simple change to make. > > When working via github on the web (which I was) rather than on a > local checkout where I can search, putting it in a subdiretory is a > bit less discoverable (made worse because there's nothing about the > name ".github" that suggests it would have Appveyor files in it :-)) > I'd prefer it at the top level - but not enough to submit a PR for > that at the moment, so I'm fine with it staying where it is. > > > For the actual issue at hand, the problem arises from doing builds on > > 3.6 with both the VS2015 and VS2017 images. Apparently something > > built in `/externals` by the VS2015 build gets cached, which then > > breaks the VS2017 build; I haven't tracked down how exactly that is > > happening. I think the preferred solution is probably to just drop > > the VS2017 build on 3.6 on AppVeyor; VSTS runs on VS2017 and dropping > > one of the builds from 3.6 will make AppVeyor significantly quicker on > > that branch. > > Nice catch. I'd agree, it's probably not worth having both > (particularly as, if Victor says, we have buildbots for the one > Appveyor doesn't cover - but even if we don't I think VSTS has it > covered). > > I presume you're suggesting keeping 2017 is so that we don't have > stray 2015-built artifacts in the cache, which makes sense to me, and > I have a mild preference for keeping the latest compiler, as that's > likely the one that people will find easier to get. But 2015 is > presumably the version the official 3.6 builds are made with, so > there's an argument for keeping that one (although if we do that I > guess we need to find a *different* way of fixing the cached artifact > issue). > > tl; dr; I'm inclined to agree with you that just using VS2017 on > Appveyor is the simplest option. 
Hello, Let me take this note as an opportunity to nag that I have a still open pull request to add testing of Python on Cygwin to the AppVeyor build, which in theory works quite well: https://github.com/python/cpython/pull/8463 So +1 for dropping one build configuration from AppVeyor if that will make it easier in the future to add this one :) However, Victor has asked that as a prerequisite to adding a Cygwin build to AppVeyor, we first have a relatively stable buildbot. I had thought maybe adding advisory CI on AppVeyor *first* would make getting a stable buildbot easier, but I can see the argument either way, so we have added said buildbot: https://buildbot.python.org/all/#/builders/164 Unfortunately, for the last ~120 builds it has been all but useless due to at least two small, but long outstanding issues preventing 3.7.x from building on Cygwin. Both of those issues have proposed fixes pending review, both of which have PRs linked to from my AppVeyor PR. If anyone is interested in having a look at those I'd appreciate it, thanks (one of them also got some review from Inada Naoki, but we didn't ever agree on some concrete action items for making the patch acceptable, and it has stalled again...) Best, E From christian at python.org Wed Sep 5 11:08:02 2018 From: christian at python.org (Christian Heimes) Date: Wed, 5 Sep 2018 17:08:02 +0200 Subject: [Python-Dev] AES cipher implementation in standard library In-Reply-To: References: <1b14178e-8cdb-5061-917e-6d63a0b3d832@python.org> Message-ID: <10baa5ef-2e9d-156c-04cd-90c6d3b0ad86@python.org> On 2018-09-05 16:01, ???? wrote: > Christian,? really appreciated?the?details. I understood. > > Is wrapper library like ssl module with openssl on platform also not > good idea? > My intention is not re-invention but single standard way as standard > library. > > If I can read past discussion somewhere, it's also appreciated The Python standard library doesn't have to solve all problems. Although the slogan is "Batteries Included", we stopped including all batteries a long time. We try not to add new modules, especially complex and security-relevant modules like a generic crypto interface. The Python core developer team doesn't have any resources to design, create, and maintain a crypto interface. The ssl module is a bit special, because pip and other download tools need it. Without the ssl module, pip wouldn't be able to download packages over HTTPS. If you need cryptographic algorithms and primitives, then use the PyCA cryptography package. It's *the* recommended library for cryptography, and X.509. Christian From oono0114 at gmail.com Wed Sep 5 10:01:14 2018 From: oono0114 at gmail.com (=?UTF-8?B?5aSn6YeO6ZqG5byY?=) Date: Wed, 5 Sep 2018 23:01:14 +0900 Subject: [Python-Dev] AES cipher implementation in standard library In-Reply-To: <1b14178e-8cdb-5061-917e-6d63a0b3d832@python.org> References: <1b14178e-8cdb-5061-917e-6d63a0b3d832@python.org> Message-ID: Christian, really appreciated the details. I understood. Is wrapper library like ssl module with openssl on platform also not good idea? My intention is not re-invention but single standard way as standard library. If I can read past discussion somewhere, it's also appreciated Thanks and Regards, Takahiro Ono 2018?9?5?(?) 1:48 Christian Heimes : > On 2018-09-04 16:37, ???? wrote: > > Dear all, > > > > Have we tried cipher implementation includes AES as a standard library > > in the past? 
> > https://docs.python.org/3.6/library/crypto.html > > > > if possible I want to try to implement AES because famous 3rd party > > library is not maintained and general cipher programs should be used for > > multiple purpose.Though the implementation is tough, I believe this > > should be worth to it. > > In my case, I want to use AES implementation for zipfile module. > > strong -1 > > The Python standard library doesn't contain any encryption, signing, and > other cryptographic algorithms for multiple reasons. The only exception > from the rule are hashing algorithms and HMAC construct. There are legal > implications like export restrictions. Crypto is just too hard to get > right and we don't want to give the user additional rope. We already had > a very lengthy and exhausting discussion for the secrets module. That > module just provides a user-friendly interface to CPRNG. > > By the way, AES by itself is a useless to borderline dangerous > algorithm. It must be embedded within additional layers like block mode, > authenticated encryption / MAC, and more. There isn't a single correct > answer for block mode and AD algorithm, too. It highly depends on the > problem space. While GCM AEAD mode is good choice for network > communication, it can be a pretty bad idea for persistent storage. > > There is one excellent Python library with high level and low level > cryptographic algorithms: http://cryptography.readthedocs.io/ . It's t > > Regards, > Christian > -------------- next part -------------- An HTML attachment was scrubbed... URL: From oono0114 at gmail.com Wed Sep 5 10:25:39 2018 From: oono0114 at gmail.com (=?UTF-8?B?5aSn6YeO6ZqG5byY?=) Date: Wed, 5 Sep 2018 23:25:39 +0900 Subject: [Python-Dev] AES cipher implementation in standard library In-Reply-To: References: <1b14178e-8cdb-5061-917e-6d63a0b3d832@python.org> Message-ID: Sorry, allow me to ask one more thing. If I want to use AES in zipfile module, what the good way to implement? Thanks and Regards, ----------------- Takahiro Ono 2018?9?5?(?) 23:01 ???? : > Christian, really appreciated the details. I understood. > > Is wrapper library like ssl module with openssl on platform also not good > idea? > My intention is not re-invention but single standard way as standard > library. > > If I can read past discussion somewhere, it's also appreciated > > Thanks and Regards, > Takahiro Ono > > > > > 2018?9?5?(?) 1:48 Christian Heimes : > >> On 2018-09-04 16:37, ???? wrote: >> > Dear all, >> > >> > Have we tried cipher implementation includes AES as a standard library >> > in the past? >> > https://docs.python.org/3.6/library/crypto.html >> > >> > if possible I want to try to implement AES because famous 3rd party >> > library is not maintained and general cipher programs should be used for >> > multiple purpose.Though the implementation is tough, I believe this >> > should be worth to it. >> > In my case, I want to use AES implementation for zipfile module. >> >> strong -1 >> >> The Python standard library doesn't contain any encryption, signing, and >> other cryptographic algorithms for multiple reasons. The only exception >> from the rule are hashing algorithms and HMAC construct. There are legal >> implications like export restrictions. Crypto is just too hard to get >> right and we don't want to give the user additional rope. We already had >> a very lengthy and exhausting discussion for the secrets module. That >> module just provides a user-friendly interface to CPRNG. 
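To make the recommendation concrete for the zipfile question above: a minimal sketch of an authenticated AES (AES-GCM) round trip with the third-party PyCA cryptography package. This only illustrates the high-level AEAD interface; it is not the format the ZIP AES extension actually uses (WinZip's scheme is, roughly, AES-CTR with an HMAC and PBKDF2-derived keys, so a real zipfile integration would have to follow that spec instead):

import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = AESGCM.generate_key(bit_length=256)  # store or derive this securely
aesgcm = AESGCM(key)
nonce = os.urandom(12)                     # never reuse a nonce with the same key

ciphertext = aesgcm.encrypt(nonce, b"archive member data", None)
plaintext = aesgcm.decrypt(nonce, ciphertext, None)
assert plaintext == b"archive member data"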
>> >> By the way, AES by itself is a useless to borderline dangerous >> algorithm. It must be embedded within additional layers like block mode, >> authenticated encryption / MAC, and more. There isn't a single correct >> answer for block mode and AD algorithm, too. It highly depends on the >> problem space. While GCM AEAD mode is good choice for network >> communication, it can be a pretty bad idea for persistent storage. >> >> There is one excellent Python library with high level and low level >> cryptographic algorithms: http://cryptography.readthedocs.io/ . It's t >> >> Regards, >> Christian >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mike at selik.org Wed Sep 5 12:12:37 2018 From: mike at selik.org (Michael Selik) Date: Wed, 5 Sep 2018 09:12:37 -0700 Subject: [Python-Dev] Comparisions for collections.Counters In-Reply-To: References: Message-ID: On Wed, Sep 5, 2018 at 3:13 AM Evpok Padding wrote: > According to the [doc][1], `collections.Counter` convenience intersection > and union functions are meant to help it represent multisets. However, it > currently lacks comparisons, which would make sense and seems > straightforward to implement. > x = Counter(a=1, b=2) y = Counter(a=2, b=1) x > y ? -------------- next part -------------- An HTML attachment was scrubbed... URL: From storchaka at gmail.com Wed Sep 5 12:25:16 2018 From: storchaka at gmail.com (Serhiy Storchaka) Date: Wed, 5 Sep 2018 19:25:16 +0300 Subject: [Python-Dev] Comparisions for collections.Counters In-Reply-To: References: Message-ID: 05.09.18 13:10, Evpok Padding ????: > According to the [doc][1], `collections.Counter` convenience > intersection and union functions are meant to help it represent > multisets. However, it currently lacks comparisons, which would make > sense and seems straightforward to implement. > Am I missing something here or should I send a PR ASAP?? :-) There is a closed issue for this: https://bugs.python.org/issue22515. From evpok.padding at gmail.com Wed Sep 5 13:52:19 2018 From: evpok.padding at gmail.com (Evpok Padding) Date: Wed, 5 Sep 2018 19:52:19 +0200 Subject: [Python-Dev] Comparisions for collections.Counters In-Reply-To: References: Message-ID: On Wed, 5 Sep 2018 at 18:28, Serhiy Storchaka wrote: > There is a closed issue for this: https://bugs.python.org/issue22515. > Oh, thanks, I had missed that. I guess I can live with it, although I agree with [the last comment][1] that this decision does not make a lot of sense. But hey, who am I to second-guess Guido anr Raymond H. Cheers, E [1]: https://bugs.python.org/issue22515#msg253251 -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Wed Sep 5 16:14:38 2018 From: brett at python.org (Brett Cannon) Date: Wed, 5 Sep 2018 13:14:38 -0700 Subject: [Python-Dev] Workflow blocked on the 3.6 because of AppVeyor; who owns the AppVeyor project? In-Reply-To: References: <20180905132201.29d06e72@fsol> Message-ID: On Wed, 5 Sep 2018 at 04:56 Paul Moore wrote: > On Wed, 5 Sep 2018 at 12:24, Antoine Pitrou wrote: > > For some reason it seems to be located in a hidden directory > > (".github/appveyor.yml"). Not the most intuitive decision IMHO. > > Travis' own config file ".travis.yml" is still at repository root, which > > makes things more confusing. > > Thanks, agreed that's confusing. I'd prefer appveyor.yml to be at the > project root, as that's what nearly all projects I deal with do. 
But > at least I know where it is now :-) > This was on purpose as the CI files are not directly related to Python itself so they are easier to leave out of any source tarball and such. The reason .travis.yml is top-level is because Travis won't let us have it anywhere else. -------------- next part -------------- An HTML attachment was scrubbed... URL: From vstinner at redhat.com Thu Sep 6 10:18:33 2018 From: vstinner at redhat.com (Victor Stinner) Date: Thu, 6 Sep 2018 16:18:33 +0200 Subject: [Python-Dev] We cannot fix all issues: let's close XML security issues (not fix them) Message-ID: Hi, The Python bug tracker is full of bugs, and sadly we don't have enough people to take care of all of them. There are 3 open bugs about security issues in XML and I simply propose to close it: https://bugs.python.org/issue17318 https://bugs.python.org/issue17239 https://bugs.python.org/issue24238 The XML documentation already starts with a red warning explaining the security limitations of the Python implementation and points to defusedxml and defusedexpat which are existing and working counter-measures: https://docs.python.org/dev/library/xml.html Note: Christian Heimes, author of these 2 packages, told me that these modules may not work on Python 3.7, he didn't have time to maintain them recently. Maybe someone might want to help him? I suggest to close the 3 Python bugs without doing anything. Are you ok with that? Keeping the issue open for 3 years doesn't help anyone, and there is already a security warning in all supported version (I checked 2.7 and 3.4). It seems like XML is getting less popular because of JSON becoming more popular (even if JSON obviously comes with its own set of security issues...). It seems like less core developers care about XML (today than 3 years ago). We should just accept that core developers have limited availability and that documenting security issues is an *acceptable* trade-off. I don't see any value of keeping these 3 issues open. Victor From solipsis at pitrou.net Thu Sep 6 10:29:42 2018 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 6 Sep 2018 16:29:42 +0200 Subject: [Python-Dev] We cannot fix all issues: let's close XML security issues (not fix them) References: Message-ID: <20180906162942.0a1fb35c@fsol> On Thu, 6 Sep 2018 16:18:33 +0200 Victor Stinner wrote: > > It seems like XML is getting less popular because of JSON becoming > more popular (even if JSON obviously comes with its own set of > security issues...). It seems like less core developers care about XML > (today than 3 years ago). > > We should just accept that core developers have limited availability > and that documenting security issues is an *acceptable* trade-off. I > don't see any value of keeping these 3 issues open. If we consider fixing these issues to be desirable, then the issues should be kept open. Closing issues because no-one is working on them sounds a bit silly to me. Regards Antoine. From vstinner at redhat.com Thu Sep 6 10:40:16 2018 From: vstinner at redhat.com (Victor Stinner) Date: Thu, 6 Sep 2018 16:40:16 +0200 Subject: [Python-Dev] We cannot fix all issues: let's close XML security issues (not fix them) In-Reply-To: <20180906162942.0a1fb35c@fsol> References: <20180906162942.0a1fb35c@fsol> Message-ID: Le jeu. 6 sept. 2018 ? 16:33, Antoine Pitrou a ?crit : > If we consider fixing these issues to be desirable, then the issues > should be kept open. Closing issues because no-one is working on them > sounds a bit silly to me. 
I forgot to mention that closing these issues is my reply to Larry's call to fix 3 security issues: https://mail.python.org/pipermail/python-committers/2018-August/006031.html Larry wrote "If they're really all wontfix, maybe we should mark them as wontfix, thus giving 3.4 a sendoff worthy of its heroic stature." For these XML issues, the security vulnerabilities can also been seen as XML features. Loading an external DTD is part of the XML specification, as well as entity expansion. I'm also dubious about PyYAML which allows to run arbitrary Python code in a configuration *by default*. But well, it seems like nobody stepped in to change the default. Victor From antoine at python.org Thu Sep 6 10:47:09 2018 From: antoine at python.org (Antoine Pitrou) Date: Thu, 6 Sep 2018 16:47:09 +0200 Subject: [Python-Dev] We cannot fix all issues: let's close XML security issues (not fix them) In-Reply-To: References: <20180906162942.0a1fb35c@fsol> Message-ID: <51e63340-6007-323d-be7c-f4b7362214e0@python.org> Le 06/09/2018 ? 16:40, Victor Stinner a ?crit?: > Le jeu. 6 sept. 2018 ? 16:33, Antoine Pitrou a ?crit : >> If we consider fixing these issues to be desirable, then the issues >> should be kept open. Closing issues because no-one is working on them >> sounds a bit silly to me. > > I forgot to mention that closing these issues is my reply to Larry's > call to fix 3 security issues: > > https://mail.python.org/pipermail/python-committers/2018-August/006031.html > > Larry wrote "If they're really all wontfix, maybe we should mark them > as wontfix, thus giving 3.4 a sendoff worthy of its heroic stature." "wontfix" on 3.4 doesn't mean we won't fix them later, e.g. in 3.8. > For these XML issues, the security vulnerabilities can also been seen > as XML features. Loading an external DTD is part of the XML > specification, as well as entity expansion. That doesn't mean there shouldn't be any hard limits to expansion depth or breadth. Function calls are a Python feature, yet we limit the amount of recursion allowed. Regards Antoine. From vstinner at redhat.com Thu Sep 6 10:58:02 2018 From: vstinner at redhat.com (Victor Stinner) Date: Thu, 6 Sep 2018 16:58:02 +0200 Subject: [Python-Dev] We cannot fix all issues: let's close XML security issues (not fix them) In-Reply-To: <51e63340-6007-323d-be7c-f4b7362214e0@python.org> References: <20180906162942.0a1fb35c@fsol> <51e63340-6007-323d-be7c-f4b7362214e0@python.org> Message-ID: Are you volunteer to fix the XML modules? Victor Le jeu. 6 sept. 2018 ? 16:50, Antoine Pitrou a ?crit : > > > Le 06/09/2018 ? 16:40, Victor Stinner a ?crit : > > Le jeu. 6 sept. 2018 ? 16:33, Antoine Pitrou a ?crit : > >> If we consider fixing these issues to be desirable, then the issues > >> should be kept open. Closing issues because no-one is working on them > >> sounds a bit silly to me. > > > > I forgot to mention that closing these issues is my reply to Larry's > > call to fix 3 security issues: > > > > https://mail.python.org/pipermail/python-committers/2018-August/006031.html > > > > Larry wrote "If they're really all wontfix, maybe we should mark them > > as wontfix, thus giving 3.4 a sendoff worthy of its heroic stature." > > "wontfix" on 3.4 doesn't mean we won't fix them later, e.g. in 3.8. > > > For these XML issues, the security vulnerabilities can also been seen > > as XML features. Loading an external DTD is part of the XML > > specification, as well as entity expansion. > > That doesn't mean there shouldn't be any hard limits to expansion depth > or breadth. 
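To illustrate the kind of hard limit being discussed: the classic "billion laughs" document stays under a kilobyte but expands to gigabytes, and the defusedxml package mentioned earlier simply refuses to expand it. A minimal sketch, assuming the third-party defusedxml package is installed (with the caveat above that it may need updating for 3.7):

import defusedxml.ElementTree as ET
from defusedxml import EntitiesForbidden

# A truncated "billion laughs" document: the full attack nests about
# nine levels of ten-fold expansion (10**9 copies of "lol", roughly
# 3 GB of text from a document of under 1 KB).
bomb = """<?xml version="1.0"?>
<!DOCTYPE lolz [
 <!ENTITY lol "lol">
 <!ENTITY lol2 "&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;">
 <!ENTITY lol3 "&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;">
]>
<lolz>&lol3;</lolz>"""

try:
    ET.fromstring(bomb)          # same API as xml.etree.ElementTree
except EntitiesForbidden:
    print("entity declarations rejected, expansion never happens")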
> > Function calls are a Python feature, yet we limit the amount of > recursion allowed. > > Regards > > Antoine. > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/vstinner%40redhat.com From antoine at python.org Thu Sep 6 10:59:10 2018 From: antoine at python.org (Antoine Pitrou) Date: Thu, 6 Sep 2018 16:59:10 +0200 Subject: [Python-Dev] We cannot fix all issues: let's close XML security issues (not fix them) In-Reply-To: References: <20180906162942.0a1fb35c@fsol> <51e63340-6007-323d-be7c-f4b7362214e0@python.org> Message-ID: <8d2a77cb-81e6-b8cc-9059-fc14169ca57c@python.org> Le 06/09/2018 ? 16:58, Victor Stinner a ?crit?: > Are you volunteer to fix the XML modules? No. That doesn't mean nobody else will be. Regards Antoine. > > Victor > Le jeu. 6 sept. 2018 ? 16:50, Antoine Pitrou a ?crit : >> >> >> Le 06/09/2018 ? 16:40, Victor Stinner a ?crit : >>> Le jeu. 6 sept. 2018 ? 16:33, Antoine Pitrou a ?crit : >>>> If we consider fixing these issues to be desirable, then the issues >>>> should be kept open. Closing issues because no-one is working on them >>>> sounds a bit silly to me. >>> >>> I forgot to mention that closing these issues is my reply to Larry's >>> call to fix 3 security issues: >>> >>> https://mail.python.org/pipermail/python-committers/2018-August/006031.html >>> >>> Larry wrote "If they're really all wontfix, maybe we should mark them >>> as wontfix, thus giving 3.4 a sendoff worthy of its heroic stature." >> >> "wontfix" on 3.4 doesn't mean we won't fix them later, e.g. in 3.8. >> >>> For these XML issues, the security vulnerabilities can also been seen >>> as XML features. Loading an external DTD is part of the XML >>> specification, as well as entity expansion. >> >> That doesn't mean there shouldn't be any hard limits to expansion depth >> or breadth. >> >> Function calls are a Python feature, yet we limit the amount of >> recursion allowed. >> >> Regards >> >> Antoine. >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> https://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: https://mail.python.org/mailman/options/python-dev/vstinner%40redhat.com From guido at python.org Thu Sep 6 11:03:13 2018 From: guido at python.org (Guido van Rossum) Date: Thu, 6 Sep 2018 08:03:13 -0700 Subject: [Python-Dev] We cannot fix all issues: let's close XML security issues (not fix them) In-Reply-To: <51e63340-6007-323d-be7c-f4b7362214e0@python.org> References: <20180906162942.0a1fb35c@fsol> <51e63340-6007-323d-be7c-f4b7362214e0@python.org> Message-ID: FWIW I'm with Antoine here -- XML is still important and I'd like us to go the extra mile here, not just give up because the issues have been inactive for a long time. We can't control what PyYAML does, but for the stdlib XML code, the buck stops here, and we should do the responsible thing. On Thu, Sep 6, 2018 at 7:49 AM Antoine Pitrou wrote: > > Le 06/09/2018 ? 16:40, Victor Stinner a ?crit : > > Le jeu. 6 sept. 2018 ? 16:33, Antoine Pitrou a > ?crit : > >> If we consider fixing these issues to be desirable, then the issues > >> should be kept open. Closing issues because no-one is working on them > >> sounds a bit silly to me. 
> > > > I forgot to mention that closing these issues is my reply to Larry's > > call to fix 3 security issues: > > > > > https://mail.python.org/pipermail/python-committers/2018-August/006031.html > > > > Larry wrote "If they're really all wontfix, maybe we should mark them > > as wontfix, thus giving 3.4 a sendoff worthy of its heroic stature." > > "wontfix" on 3.4 doesn't mean we won't fix them later, e.g. in 3.8. > > > For these XML issues, the security vulnerabilities can also been seen > > as XML features. Loading an external DTD is part of the XML > > specification, as well as entity expansion. > > That doesn't mean there shouldn't be any hard limits to expansion depth > or breadth. > > Function calls are a Python feature, yet we limit the amount of > recursion allowed. > > Regards > > Antoine. > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From rymg19 at gmail.com Thu Sep 6 11:05:48 2018 From: rymg19 at gmail.com (Ryan Gonzalez) Date: Thu, 6 Sep 2018 10:05:48 -0500 Subject: [Python-Dev] We cannot fix all issues: let's close XML security issues (not fix them) In-Reply-To: References: <20180906162942.0a1fb35c@fsol> <51e63340-6007-323d-be7c-f4b7362214e0@python.org> Message-ID: Thought: what if there's a label on the bug tracker meaning roughly "we're probably not going to fix this anytime soon, but we won't mind someone stepping up"? On Thu, Sep 6, 2018, 10:04 AM Guido van Rossum wrote: > FWIW I'm with Antoine here -- XML is still important and I'd like us to go > the extra mile here, not just give up because the issues have been inactive > for a long time. We can't control what PyYAML does, but for the stdlib XML > code, the buck stops here, and we should do the responsible thing. > > On Thu, Sep 6, 2018 at 7:49 AM Antoine Pitrou wrote: > >> >> Le 06/09/2018 ? 16:40, Victor Stinner a ?crit : >> > Le jeu. 6 sept. 2018 ? 16:33, Antoine Pitrou a >> ?crit : >> >> If we consider fixing these issues to be desirable, then the issues >> >> should be kept open. Closing issues because no-one is working on them >> >> sounds a bit silly to me. >> > >> > I forgot to mention that closing these issues is my reply to Larry's >> > call to fix 3 security issues: >> > >> > >> https://mail.python.org/pipermail/python-committers/2018-August/006031.html >> > >> > Larry wrote "If they're really all wontfix, maybe we should mark them >> > as wontfix, thus giving 3.4 a sendoff worthy of its heroic stature." >> >> "wontfix" on 3.4 doesn't mean we won't fix them later, e.g. in 3.8. >> >> > For these XML issues, the security vulnerabilities can also been seen >> > as XML features. Loading an external DTD is part of the XML >> > specification, as well as entity expansion. >> >> That doesn't mean there shouldn't be any hard limits to expansion depth >> or breadth. >> >> Function calls are a Python feature, yet we limit the amount of >> recursion allowed. >> >> Regards >> >> Antoine. 
>> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> https://mail.python.org/mailman/listinfo/python-dev >> > Unsubscribe: >> https://mail.python.org/mailman/options/python-dev/guido%40python.org > > >> > > -- > --Guido van Rossum (python.org/~guido) > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/rymg19%40gmail.com > -- Ryan (????) Yoko Shimomura, ryo (supercell/EGOIST), Hiroyuki Sawano >> everyone else https://refi64.com/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From hodgestar+pythondev at gmail.com Thu Sep 6 11:08:18 2018 From: hodgestar+pythondev at gmail.com (Simon Cross) Date: Thu, 6 Sep 2018 17:08:18 +0200 Subject: [Python-Dev] We cannot fix all issues: let's close XML security issues (not fix them) In-Reply-To: References: <20180906162942.0a1fb35c@fsol> <51e63340-6007-323d-be7c-f4b7362214e0@python.org> Message-ID: On Thu, Sep 6, 2018 at 5:06 PM Ryan Gonzalez wrote: > Thought: what if there's a label on the bug tracker meaning roughly "we're probably not going to fix this anytime soon, but we won't mind someone stepping up"? Maybe "wouldlikehelpfixing"? :D From tseaver at palladion.com Thu Sep 6 12:30:53 2018 From: tseaver at palladion.com (Tres Seaver) Date: Thu, 6 Sep 2018 12:30:53 -0400 Subject: [Python-Dev] We cannot fix all issues: let's close XML security issues (not fix them) In-Reply-To: References: <20180906162942.0a1fb35c@fsol> <51e63340-6007-323d-be7c-f4b7362214e0@python.org> Message-ID: On 09/06/2018 11:05 AM, Ryan Gonzalez wrote: > Thought: what if there's a label on the bug tracker meaning roughly "we're > probably not going to fix this anytime soon, but we won't mind someone > stepping up"? "help-wanted" Tres. -- =================================================================== Tres Seaver +1 540-429-0999 tseaver at palladion.com Palladion Software "Excellence by Design" http://palladion.com From arj.python at gmail.com Thu Sep 6 13:09:58 2018 From: arj.python at gmail.com (Abdur-Rahmaan Janhangeer) Date: Thu, 6 Sep 2018 21:09:58 +0400 Subject: [Python-Dev] We cannot fix all issues: let's close XML security issues (not fix them) In-Reply-To: References: Message-ID: No time? I have seen that mentioned countless times on this list, e.g. "no, don't implement this in the workflow as my volunteer time will be lost", etc. I guess a call for more core contributors would be nice. For myself I have some translations ahead (finally getting the chance to read the docs from cover to cover), but I guess actually contributing to core would be a nice experience. The problem with getting contributors is that the docs need to be more readable and more tutorials need to be written (fewer people are contributors / pyramid effect -> fewer guides written). The devs are doing a nice job guiding etc., but the first step must be made easier. A lack of time for such an exceedingly popular project, and for a system as open as Python, hints at a bottleneck somewhere: not that there is no interest, but that it gets blocked. Abdur-Rahmaan Janhangeer https://github.com/Abdur-rahmaanJ Mauritius -------------- next part -------------- An HTML attachment was scrubbed...
URL: From steve.dower at python.org Thu Sep 6 15:10:33 2018 From: steve.dower at python.org (Steve Dower) Date: Thu, 6 Sep 2018 12:10:33 -0700 Subject: [Python-Dev] We cannot fix all issues: let's close XML security issues (not fix them) In-Reply-To: References: <20180906162942.0a1fb35c@fsol> <51e63340-6007-323d-be7c-f4b7362214e0@python.org> Message-ID: <108b8b0f-c9eb-9fdb-65a4-0fc45abc45b5@python.org> On 06Sep2018 0758, Victor Stinner wrote: > Are you volunteer to fix the XML modules? If Christian is not able to keep maintaining the defused* packages, then I may take a look at this next week at the sprints. The built-in XML packages actually don't meet Microsoft's internal security requirements, so I have some business motivation to do it. Hopefully it doesn't turn me into the sole XML maintainer... Ultimately, however, I think we're looking at technically incompatible design changes, which is why simply dropping in a "fix" for 3.4 would not work whereas adding new options (with more secure defaults) may work for 3.8. So I'm agreed with nearly everyone else - bugs should stay open as long as we're interested in taking a fix, even if they've already been open for a long time. Our issue tracker is a backlog, not a plan, so there is no penalty for something sitting in there for a long time. Cheers, Steve From vstinner at redhat.com Fri Sep 7 03:00:54 2018 From: vstinner at redhat.com (Victor Stinner) Date: Fri, 7 Sep 2018 09:00:54 +0200 Subject: [Python-Dev] We cannot fix all issues: let's close XML security issues (not fix them) In-Reply-To: <108b8b0f-c9eb-9fdb-65a4-0fc45abc45b5@python.org> References: <20180906162942.0a1fb35c@fsol> <51e63340-6007-323d-be7c-f4b7362214e0@python.org> <108b8b0f-c9eb-9fdb-65a4-0fc45abc45b5@python.org> Message-ID: Le jeu. 6 sept. 2018 ? 21:10, Steve Dower a ?crit : > If Christian is not able to keep maintaining the defused* packages, then > I may take a look at this next week at the sprints. The built-in XML > packages actually don't meet Microsoft's internal security requirements, > so I have some business motivation to do it. Great! The best would be to be able to merge defuse* features into the stdlib. Maybe not change the default, but add an option to enable security counter-measures. Victor From christian at python.org Fri Sep 7 04:20:02 2018 From: christian at python.org (Christian Heimes) Date: Fri, 7 Sep 2018 10:20:02 +0200 Subject: [Python-Dev] We cannot fix all issues: let's close XML security issues (not fix them) In-Reply-To: References: <20180906162942.0a1fb35c@fsol> <51e63340-6007-323d-be7c-f4b7362214e0@python.org> Message-ID: <399f48ca-c4a9-5aa3-2fe6-bfe9ebc3dc0c@python.org> On 2018-09-06 17:03, Guido van Rossum wrote: > FWIW I'm with Antoine here -- XML is still important and I'd like us to > go the extra mile here, not just give up because the issues have been > inactive for a long time. We can't control what PyYAML does, but for the > stdlib XML code, the buck stops here, and we should do the responsible > thing. Back in the days, I didn't push hard for the necessary fixes, because all fixes were breaking changes. After all I'd have to disable some features that people may have relied upon. The XML security stuff was my first major security topic for Python, even before SipHash24. I was more concerned not to break people's software than to keep the majority of users safe. I have changed my opinion over the last six, seven years. By the way I couldn't fix some problems in Python and our expat wrapper either. 
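One of the few knobs that does exist in the standard library today is on the SAX side, where fetching external entities can at least be switched off; a minimal sketch (stdlib only, no third-party packages):

import io
import xml.sax
from xml.sax.handler import feature_external_ges, feature_external_pes

parser = xml.sax.make_parser()
# Do not fetch external general/parameter entities (the XXE vector);
# limits on entity *expansion* are a separate, still-missing feature.
parser.setFeature(feature_external_ges, False)
parser.setFeature(feature_external_pes, False)
parser.parse(io.BytesIO(b"<root>safe enough for this part</root>"))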
The expat parser was missing features to properly implement security measures. I need to check if expat has been improved over the years. The topic is on the agenda for the core dev sprint. Christian From vstinner at redhat.com Fri Sep 7 04:27:49 2018 From: vstinner at redhat.com (Victor Stinner) Date: Fri, 7 Sep 2018 10:27:49 +0200 Subject: [Python-Dev] We cannot fix all issues: let's close XML security issues (not fix them) In-Reply-To: <399f48ca-c4a9-5aa3-2fe6-bfe9ebc3dc0c@python.org> References: <20180906162942.0a1fb35c@fsol> <51e63340-6007-323d-be7c-f4b7362214e0@python.org> <399f48ca-c4a9-5aa3-2fe6-bfe9ebc3dc0c@python.org> Message-ID: Le ven. 7 sept. 2018 à 10:23, Christian Heimes a écrit : > Back in the days, I didn't push hard for the necessary fixes, because > all fixes were breaking changes. After all I'd have to disable some > features that people may have relied upon. The XML security stuff was my > first major security topic for Python, even before SipHash24. I was more > concerned not to break people's software than to keep the majority of > users safe. I have changed my opinion over the last six, seven years. I understood that Python 2.7.9, which required a valid TLS certificate, annoyed many customers. So I don't think that it would be a good idea to enforce XML security in a minor Python release. But would it make sense to make XML stricter in Python 3.8 and add an option to opt out? Or do we need a cycle of 1.5 years (Python 3.8) with a warning, and change the default in the next cycle? > The topic is on the agenda for the core dev sprint. Great :-) Thanks, things are moving on. Victor From jwilk at jwilk.net Fri Sep 7 04:33:22 2018 From: jwilk at jwilk.net (Jakub Wilk) Date: Fri, 7 Sep 2018 10:33:22 +0200 Subject: [Python-Dev] We cannot fix all issues: let's close XML security issues (not fix them) In-Reply-To: References: <20180906162942.0a1fb35c@fsol> Message-ID: <20180907083322.fgi33epizycrth7b@jwilk.net> * Victor Stinner , 2018-09-06, 16:40: >I'm also dubious about PyYAML which allows to run arbitrary Python code >in a configuration *by default*. But well, it seems like nobody stepped >in to change the default. PyYAML maintainers intend to change the default soon: https://github.com/yaml/pyyaml/issues/207 -- Jakub Wilk From pms.coder at yandex.com Fri Sep 7 03:51:49 2018 From: pms.coder at yandex.com (PMS PMS) Date: Fri, 07 Sep 2018 09:51:49 +0200 Subject: [Python-Dev] Fwd: We cannot fix all issues: let's close XML security issues (not fix them) In-Reply-To: Message-ID: <21826511536306709@myt1-bc8ef50fb490.qloud-c.yandex.net> Thank you Victor. XML support in Python is critical and desired in many sectors like banking and telecoms, and code bases built on XML are still on the rise in that world. That's why keeping such bugs open is important: it is not impossible that someone (banks, telecoms, google camps, government grants) would simply fund a small project aimed at fixing those bugs in XML. We never know. -------- Beginning of forwarded message -------- 07.09.2018, 09:03, "Victor Stinner" : Le jeu. 6 sept. 2018 à 21:10, Steve Dower a écrit : > If Christian is not able to keep maintaining the defused* packages, then > I may take a look at this next week at the sprints. The built-in XML > packages actually don't meet Microsoft's internal security requirements, > so I have some business motivation to do it. Great! The best would be to be able to merge defuse* features into the stdlib. Maybe not change the default, but add an option to enable security counter-measures.
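The PyYAML default that Jakub mentions above is the same safe-versus-convenient trade-off. A minimal sketch of the distinction, assuming the third-party PyYAML package:

import yaml

doc = "retries: 3\nhosts: [a.example.org, b.example.org]"
config = yaml.safe_load(doc)     # plain dicts, lists and scalars only
print(config["retries"])

# yaml.load(doc) with the historical default loader can instantiate
# arbitrary Python objects from tagged nodes -- the behaviour that
# https://github.com/yaml/pyyaml/issues/207 is about changing.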
Victor _______________________________________________ Python-Dev mailing list Python-Dev at python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/pms.coder%40yandex.ru -------- End of forwarded message -------- From vstinner at redhat.com Fri Sep 7 11:46:44 2018 From: vstinner at redhat.com (Victor Stinner) Date: Fri, 7 Sep 2018 17:46:44 +0200 Subject: [Python-Dev] Fwd: We cannot fix all issues: let's close XML security issues (not fix them) In-Reply-To: <21826511536306709@myt1-bc8ef50fb490.qloud-c.yandex.net> References: <21826511536306709@myt1-bc8ef50fb490.qloud-c.yandex.net> Message-ID: Le ven. 7 sept. 2018 ? 17:02, PMS PMS a ?crit : > XML support in Python is critical and desired for many sectors like banking or telecoms, > and code base based on XML is still on rise in such world. Would it be possible to send money to the PSF? I'm sure that the PSF will be able to find you a developer able to quickly fix these XML issues! Victor From arj.python at gmail.com Fri Sep 7 12:05:38 2018 From: arj.python at gmail.com (Abdur-Rahmaan Janhangeer) Date: Fri, 7 Sep 2018 20:05:38 +0400 Subject: [Python-Dev] Fwd: We cannot fix all issues: let's close XML security issues (not fix them) In-Reply-To: References: <21826511536306709@myt1-bc8ef50fb490.qloud-c.yandex.net> Message-ID: @VictorStinner snif, que dire? il me semble que cet issue ait pris une nouvelle dimension @appinv Abdur-Rahmaan Janhangeer https://github.com/Abdur-rahmaanJ Mauritius -------------- next part -------------- An HTML attachment was scrubbed... URL: From status at bugs.python.org Fri Sep 7 12:10:02 2018 From: status at bugs.python.org (Python tracker) Date: Fri, 7 Sep 2018 18:10:02 +0200 (CEST) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20180907161002.8C81C11682C@psf.upfronthosting.co.za> ACTIVITY SUMMARY (2018-08-31 - 2018-09-07) Python tracker at https://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message. Issues counts and deltas: open 6841 (+13) closed 39517 (+38) total 46358 (+51) Open issues with patches: 2729 Issues opened (40) ================== #34557: When sending binary file to a Microsoft FTP server over FTP TL https://bugs.python.org/issue34557 opened by James Campbell2 #34559: multiprocessing AuthenticationError when nesting with non-defa https://bugs.python.org/issue34559 opened by natedogith1 #34560: Backport of uuid1() failure fix https://bugs.python.org/issue34560 opened by Riccardo Mottola #34561: Replace list sorting merge_collapse()? https://bugs.python.org/issue34561 opened by tim.peters #34562: cannot install versions 3.6.5+ on Windows https://bugs.python.org/issue34562 opened by Zyg #34564: Tutorial Section 2.1 Windows Installation Path Correction https://bugs.python.org/issue34564 opened by aperture #34565: Launcher does not validate major versions https://bugs.python.org/issue34565 opened by bgerrity #34568: Types in `typing` not anymore instances of `type` or subclasse https://bugs.python.org/issue34568 opened by pekka.klarck #34569: test__xxsubinterpreters.ShareableTypeTests._assert_values fail https://bugs.python.org/issue34569 opened by Michael.Felt #34570: Segmentation fault in _PyType_Lookup https://bugs.python.org/issue34570 opened by Pablosky #34572: C unpickling bypasses import thread safety https://bugs.python.org/issue34572 opened by tjb900 #34573: Simplify __reduce__() of set and dict iterators. 
https://bugs.python.org/issue34573 opened by sir-sigurd #34574: OrderedDict iterators are exhausted during pickling https://bugs.python.org/issue34574 opened by sir-sigurd #34575: Python 3.6 compilation fails on AppVeyor: libeay.lib was creat https://bugs.python.org/issue34575 opened by vstinner #34576: SimpleHTTPServer: warn users on security https://bugs.python.org/issue34576 opened by vstinner #34578: Pipenv lock : ModuleNotFoundError: No module named '_ctypes' https://bugs.python.org/issue34578 opened by Arselon #34579: test_embed.InitConfigTests fail on AIX https://bugs.python.org/issue34579 opened by Michael.Felt #34580: sqlite doc: clarify the scope of the context manager https://bugs.python.org/issue34580 opened by vigdis #34582: VSTS builds should use new YAML syntax and pools https://bugs.python.org/issue34582 opened by David Staheli #34583: os.stat() wrongfully returns False for symlink on Windows 10 v https://bugs.python.org/issue34583 opened by Isaac Shabtay #34584: subprocess https://bugs.python.org/issue34584 opened by JokeNeverSoke #34585: Don't use AC_RUN_IFELSE to determine float endian https://bugs.python.org/issue34585 opened by rossburton #34586: collections.ChainMap should have a get_where method https://bugs.python.org/issue34586 opened by Zahari.Dim #34587: test_socket: testCongestion() hangs on my Fedora 28 https://bugs.python.org/issue34587 opened by vstinner #34588: traceback formatting can drop a frame https://bugs.python.org/issue34588 opened by benjamin.peterson #34589: Py_Initialize() and Py_Main() should not enable C locale coerc https://bugs.python.org/issue34589 opened by vstinner #34590: "Logging HOWTO" should share an example of best practices for https://bugs.python.org/issue34590 opened by Nathaniel Manista #34591: smtplib mixes RFC821 and RFC822 addresses https://bugs.python.org/issue34591 opened by daurnimator #34592: cdll.LoadLibrary allows None as an argument https://bugs.python.org/issue34592 opened by superbobry #34595: PyUnicode_FromFormat(): add %T format for an object type name https://bugs.python.org/issue34595 opened by vstinner #34596: [unittest] raise error if @skip is used with an argument that https://bugs.python.org/issue34596 opened by Naitree Zhu #34597: Python needs to check existence of functions at runtime for ta https://bugs.python.org/issue34597 opened by Zorg #34598: How to fix? 
Error in Kali linux python 2.7 - Collecting pip Fr https://bugs.python.org/issue34598 opened by andy polandski #34600: python3 regression ElementTree.iterparse() unable to capture c https://bugs.python.org/issue34600 opened by Martin Hosken #34602: python3 resource.setrlimit strange behaviour under macOS https://bugs.python.org/issue34602 opened by marche147 #34603: ctypes on Windows: error calling C function that returns a str https://bugs.python.org/issue34603 opened by mattneri #34604: Possible mojibake in pwd.getpwnam and grp.getgrnam https://bugs.python.org/issue34604 opened by wg #34605: Avoid master/slave terminology https://bugs.python.org/issue34605 opened by vstinner #34606: Unable to read zip file with extra https://bugs.python.org/issue34606 opened by altendky #34607: test_multiprocessing_forkserver is altering the environment on https://bugs.python.org/issue34607 opened by pablogsal Most recent 15 issues with no replies (15) ========================================== #34607: test_multiprocessing_forkserver is altering the environment on https://bugs.python.org/issue34607 #34604: Possible mojibake in pwd.getpwnam and grp.getgrnam https://bugs.python.org/issue34604 #34603: ctypes on Windows: error calling C function that returns a str https://bugs.python.org/issue34603 #34600: python3 regression ElementTree.iterparse() unable to capture c https://bugs.python.org/issue34600 #34598: How to fix? Error in Kali linux python 2.7 - Collecting pip Fr https://bugs.python.org/issue34598 #34592: cdll.LoadLibrary allows None as an argument https://bugs.python.org/issue34592 #34591: smtplib mixes RFC821 and RFC822 addresses https://bugs.python.org/issue34591 #34589: Py_Initialize() and Py_Main() should not enable C locale coerc https://bugs.python.org/issue34589 #34585: Don't use AC_RUN_IFELSE to determine float endian https://bugs.python.org/issue34585 #34583: os.stat() wrongfully returns False for symlink on Windows 10 v https://bugs.python.org/issue34583 #34582: VSTS builds should use new YAML syntax and pools https://bugs.python.org/issue34582 #34579: test_embed.InitConfigTests fail on AIX https://bugs.python.org/issue34579 #34578: Pipenv lock : ModuleNotFoundError: No module named '_ctypes' https://bugs.python.org/issue34578 #34572: C unpickling bypasses import thread safety https://bugs.python.org/issue34572 #34564: Tutorial Section 2.1 Windows Installation Path Correction https://bugs.python.org/issue34564 Most recent 15 issues waiting for review (15) ============================================= #34605: Avoid master/slave terminology https://bugs.python.org/issue34605 #34604: Possible mojibake in pwd.getpwnam and grp.getgrnam https://bugs.python.org/issue34604 #34596: [unittest] raise error if @skip is used with an argument that https://bugs.python.org/issue34596 #34595: PyUnicode_FromFormat(): add %T format for an object type name https://bugs.python.org/issue34595 #34589: Py_Initialize() and Py_Main() should not enable C locale coerc https://bugs.python.org/issue34589 #34588: traceback formatting can drop a frame https://bugs.python.org/issue34588 #34585: Don't use AC_RUN_IFELSE to determine float endian https://bugs.python.org/issue34585 #34582: VSTS builds should use new YAML syntax and pools https://bugs.python.org/issue34582 #34580: sqlite doc: clarify the scope of the context manager https://bugs.python.org/issue34580 #34579: test_embed.InitConfigTests fail on AIX https://bugs.python.org/issue34579 #34575: Python 3.6 compilation fails on AppVeyor: libeay.lib was creat 
https://bugs.python.org/issue34575 #34574: OrderedDict iterators are exhausted during pickling https://bugs.python.org/issue34574 #34573: Simplify __reduce__() of set and dict iterators. https://bugs.python.org/issue34573 #34572: C unpickling bypasses import thread safety https://bugs.python.org/issue34572 #34565: Launcher does not validate major versions https://bugs.python.org/issue34565 Top 10 most discussed issues (10) ================================= #34605: Avoid master/slave terminology https://bugs.python.org/issue34605 16 msgs #34575: Python 3.6 compilation fails on AppVeyor: libeay.lib was creat https://bugs.python.org/issue34575 12 msgs #34561: Replace list sorting merge_collapse()? https://bugs.python.org/issue34561 11 msgs #33613: test_multiprocessing_fork: test_semaphore_tracker_sigint() fai https://bugs.python.org/issue33613 10 msgs #34596: [unittest] raise error if @skip is used with an argument that https://bugs.python.org/issue34596 8 msgs #34568: Types in `typing` not anymore instances of `type` or subclasse https://bugs.python.org/issue34568 7 msgs #34200: importlib: python -m test test_pkg -m test_7 fails randomly https://bugs.python.org/issue34200 6 msgs #34543: _struct.Struct: calling functions without calling __init__ res https://bugs.python.org/issue34543 6 msgs #34570: Segmentation fault in _PyType_Lookup https://bugs.python.org/issue34570 6 msgs #25750: tp_descr_get(self, obj, type) is called without owning a refer https://bugs.python.org/issue25750 5 msgs Issues closed (36) ================== #4453: MSI installer shows error message if "Compile .py files to byt https://bugs.python.org/issue4453 closed by zach.ware #13081: Crash in Windows with unknown cause https://bugs.python.org/issue13081 closed by zach.ware #16892: Windows bug picking up stdin from a pipe https://bugs.python.org/issue16892 closed by zach.ware #17480: pyvenv should be installed someplace more obvious on Windows https://bugs.python.org/issue17480 closed by zach.ware #19955: When adding .PY and .PYW to PATHEXT, it replaced string instea https://bugs.python.org/issue19955 closed by zach.ware #21877: External.bat and pcbuild of tkinter do not match. 
https://bugs.python.org/issue21877 closed by zach.ware #23089: Update libffi config files https://bugs.python.org/issue23089 closed by zach.ware #23949: Number of elements display in error message is wrong while unp https://bugs.python.org/issue23949 closed by serhiy.storchaka #24401: Windows 8.1 install gives DLL required to complete could not r https://bugs.python.org/issue24401 closed by zach.ware #24698: get_externals.bat script fails https://bugs.python.org/issue24698 closed by zach.ware #26544: platform.libc_ver() returns incorrect version number https://bugs.python.org/issue26544 closed by vstinner #26901: Argument Clinic test is broken https://bugs.python.org/issue26901 closed by vstinner #30977: reduce uuid.UUID() memory footprint https://bugs.python.org/issue30977 closed by vstinner #30985: Set closing variable in asyncore at close https://bugs.python.org/issue30985 closed by vstinner #31371: Remove deprecated tkinter.tix module https://bugs.python.org/issue31371 closed by zach.ware #32310: Remove _Py_PyAtExit from Python.h https://bugs.python.org/issue32310 closed by vstinner #33083: math.factorial accepts non-integral Decimal instances https://bugs.python.org/issue33083 closed by mark.dickinson #34007: test_gdb fails in s390x SLES buildbots https://bugs.python.org/issue34007 closed by pablogsal #34499: Extend registering of single-dispatch functions to parametrize https://bugs.python.org/issue34499 closed by lukasz.langa #34500: Fix ResourceWarning in difflib.py https://bugs.python.org/issue34500 closed by terry.reedy #34530: distutils: find_executable() fails if the PATH environment var https://bugs.python.org/issue34530 closed by vstinner #34544: FreeBSD: Fatal Python error: get_locale_encoding: failed to ge https://bugs.python.org/issue34544 closed by vstinner #34549: unittest docs could use another header https://bugs.python.org/issue34549 closed by napsterinblue #34555: AF_VSOCK not unset because of wrong nesting https://bugs.python.org/issue34555 closed by benjamin.peterson #34558: ctypes.find_library processing - missing parenthesis prevents https://bugs.python.org/issue34558 closed by Mariatta #34563: invalid assert on big output of multiprocessing.Process https://bugs.python.org/issue34563 closed by vstinner #34566: pipe read sometimes returns EOF but returncode is still None https://bugs.python.org/issue34566 closed by martin.panter #34567: test.pythoninfo: dump interpreter _PyCoreConfig https://bugs.python.org/issue34567 closed by vstinner #34571: help(hashlib.blake2b) causes RuntimeError in 3.7 https://bugs.python.org/issue34571 closed by serhiy.storchaka #34577: imaplib Cyrillic password https://bugs.python.org/issue34577 closed by christian.heimes #34581: Windows : use of __pragma msvc extension without ifdef https://bugs.python.org/issue34581 closed by benjamin.peterson #34593: Missing inttypes.h https://bugs.python.org/issue34593 closed by benjamin.peterson #34594: Some tests use hardcoded errno values https://bugs.python.org/issue34594 closed by benjamin.peterson #34599: improve performance of _Py_bytes_capitalize() https://bugs.python.org/issue34599 closed by benjamin.peterson #34601: Typo: "which would rather raise MemoryError than give up", tha https://bugs.python.org/issue34601 closed by mark.dickinson #1654408: Windows installer should split tcl/tk and tkinter install opti https://bugs.python.org/issue1654408 closed by zach.ware From christian at python.org Fri Sep 7 13:35:27 2018 From: christian at python.org (Christian Heimes) Date: Fri, 7 Sep 2018 
19:35:27 +0200 Subject: [Python-Dev] Fwd: We cannot fix all issues: let's close XML security issues (not fix them) In-Reply-To: References: <21826511536306709@myt1-bc8ef50fb490.qloud-c.yandex.net> Message-ID: On 2018-09-07 17:46, Victor Stinner wrote: > Le ven. 7 sept. 2018 ? 17:02, PMS PMS a ?crit : >> XML support in Python is critical and desired for many sectors like banking or telecoms, >> and code base based on XML is still on rise in such world. > > Would it be possible to send money to the PSF? I'm sure that the PSF > will be able to find you a developer able to quickly fix these XML > issues! Feel free to send the money directly to me. After all I found the bugs, documented them, and fixed them in defusedxml. From tjreedy at udel.edu Fri Sep 7 16:16:11 2018 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 7 Sep 2018 16:16:11 -0400 Subject: [Python-Dev] We cannot fix all issues: let's close XML security issues (not fix them) In-Reply-To: References: <20180906162942.0a1fb35c@fsol> <51e63340-6007-323d-be7c-f4b7362214e0@python.org> Message-ID: On 9/6/2018 11:05 AM, Ryan Gonzalez wrote: > Thought: what if there's a label on the bug tracker meaning roughly > "we're probably not going to fix this anytime soon, but we won't mind > someone stepping up"? Not needed. Good patches are always welcome. And if there is no current PR or other information indicating otherwise, a fix 'soon' is usually unlikely. But what we mostly need is not more patches, but more reviews. Anyone can act like a core dev up to the point of actually pushing the green merge button. -- Terry Jan Reedy From josephcsible at gmail.com Sat Sep 8 23:11:27 2018 From: josephcsible at gmail.com (Joseph C. Sible) Date: Sat, 8 Sep 2018 23:11:27 -0400 Subject: [Python-Dev] Why does the Contributor Agreement need my address? Message-ID: I'm used to signing CLA's that require nothing beyond a name and a check box. When I went to sign the PSF Contributor Agreement so I can submit a PR for CPython, I was surprised to see that it wants my address. Why does the Python Software Foundation need this, especially when nobody else does? Joseph C. Sible -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Sat Sep 8 23:46:54 2018 From: tim.peters at gmail.com (Tim Peters) Date: Sat, 8 Sep 2018 22:46:54 -0500 Subject: [Python-Dev] Why does the Contributor Agreement need my address? In-Reply-To: References: Message-ID: [Joseph C. Sible > I'm used to signing CLA's that require nothing beyond a name and a check > box. When I went to sign the PSF Contributor Agreement so I can submit > a PR for CPython, I was surprised to see that it wants my address. Why > does the Python Software Foundation need this, especially when nobody > else does? So that our marketing partners can deliver exciting consumer shopping opportunities directly to your front door ;-) Seriously, "nobody else does" shows you haven't looked much. For example, the first two I just looked at also require a mailing address: Apache CLA https://www.apache.org/licenses/icla.pdf Android CLA https://cla.developers.google.com/clas/new?domain=DOMAIN_GOOGLE&kind=KIND_INDIVIDUAL So I'll guess that projects big enough to hire actual lawyers require an address. As to why they want an address, you'll have to ask a lawyer! There aren't any on this list. 
So, if you really want to pursue this, I suggest you direct the question instead to the Python Software Foundation, which deals with the project's legalities: psf at python.org From josephcsible at gmail.com Sat Sep 8 23:59:35 2018 From: josephcsible at gmail.com (Joseph C. Sible) Date: Sat, 8 Sep 2018 23:59:35 -0400 Subject: [Python-Dev] Why does the Contributor Agreement need my address? In-Reply-To: References: Message-ID: On Sat, Sep 8, 2018 at 11:47 PM Tim Peters wrote: > Seriously, "nobody else does" shows you haven't looked much. For > example, the first two I just looked at also require a mailing > address: > > Apache CLA > https://www.apache.org/licenses/icla.pdf > > Android CLA > https://cla.developers.google.com/clas/new?domain=DOMAIN_GOOGLE&kind=KIND_INDIVIDUAL > > So I'll guess that projects big enough to hire actual lawyers require > an address. Fair. I guess I should have said "nobody else that I've ever contributed code upstream to does". (I suppose I have mostly contributed to smaller, less formal projects in the past.) > As to why they want an address, you'll have to ask a > lawyer! There aren't any on this list. So, if you really want to > pursue this, I suggest you direct the question instead to the Python > Software Foundation, which deals with the project's legalities: > > psf at python.org Thanks, this is useful information. Joseph C. Sible From steve at holdenweb.com Sun Sep 9 07:32:51 2018 From: steve at holdenweb.com (Steve Holden) Date: Sun, 9 Sep 2018 12:32:51 +0100 Subject: [Python-Dev] Why does the Contributor Agreement need my address? In-Reply-To: References: Message-ID: On Sun, Sep 9, 2018 at 5:24 AM Joseph C. Sible wrote: > On Sat, Sep 8, 2018 at 11:47 PM Tim Peters wrote: > [...] > > As to why they want an address, you'll have to ask a > > lawyer! There aren't any on this list. So, if you really want to > > pursue this, I suggest you direct the question instead to the Python > > Software Foundation, which deals with the project's legalities: > > > > psf at python.org > > Thanks, this is useful information. > > There's a reason he was called "the timbot" ;-) -------------- next part -------------- An HTML attachment was scrubbed... URL: From Richard at Damon-Family.org Sun Sep 9 08:14:54 2018 From: Richard at Damon-Family.org (Richard Damon) Date: Sun, 9 Sep 2018 08:14:54 -0400 Subject: [Python-Dev] Why does the Contributor Agreement need my address? In-Reply-To: References: Message-ID: On 9/8/18 11:46 PM, Tim Peters wrote: > [Joseph C. Sible >> I'm used to signing CLA's that require nothing beyond a name and a check >> box. When I went to sign the PSF Contributor Agreement so I can submit >> a PR for CPython, I was surprised to see that it wants my address. Why >> does the Python Software Foundation need this, especially when nobody >> else does? > So that our marketing partners can deliver exciting consumer shopping > opportunities directly to your front door ;-) > > Seriously, "nobody else does" shows you haven't looked much. For > example, the first two I just looked at also require a mailing > address: > > Apache CLA > https://www.apache.org/licenses/icla.pdf > > Android CLA > https://cla.developers.google.com/clas/new?domain=DOMAIN_GOOGLE&kind=KIND_INDIVIDUAL > > So I'll guess that projects big enough to hire actual lawyers require > an address. As to why they want an address, you'll have to ask a > lawyer! There aren't any on this list. 
So, if you really want to > pursue this, I suggest you direct the question instead to the Python > Software Foundation, which deals with the project's legalities: > > psf at python.org > _ While I am not a lawyer, or even play one on TV, I can imagine that for such an agreement to really be enforced, they need to know who actually is agreeing to it. Just a name isn't a unique identifier, so more information is needed. -- Richard Damon From solipsis at pitrou.net Sun Sep 9 13:49:19 2018 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 9 Sep 2018 19:49:19 +0200 Subject: [Python-Dev] Why does the Contributor Agreement need my address? References: Message-ID: <20180909194919.3b439370@fsol> On Sat, 8 Sep 2018 23:11:27 -0400 "Joseph C. Sible" wrote: > I'm used to signing CLA's that require nothing beyond a name and a check > box. When I went to sign the PSF Contributor Agreement so I can submit a PR > for CPython, I was surprised to see that it wants my address. Why does the > Python Software Foundation need this, especially when nobody else does? I don't think I've ever received anything from the PSF by postal mail, so if you don't want to give out your postal address, or simply don't have one, then you can probably submit a fake one. (I've never given that advice of course, you've invented it all by yourself ;-)) Regards Antoine. From tjreedy at udel.edu Sun Sep 9 15:15:36 2018 From: tjreedy at udel.edu (Terry Reedy) Date: Sun, 9 Sep 2018 15:15:36 -0400 Subject: [Python-Dev] Why does the Contributor Agreement need my address? In-Reply-To: <20180909194919.3b439370@fsol> References: <20180909194919.3b439370@fsol> Message-ID: On 9/9/2018 1:49 PM, Antoine Pitrou wrote: > On Sat, 8 Sep 2018 23:11:27 -0400 > "Joseph C. Sible" wrote: >> I'm used to signing CLA's that require nothing beyond a name and a check >> box.
When I went to sign the PSF Contributor Agreement so I can submit a PR > >> for CPython, I was surprised to see that it wants my address. Why does the > >> Python Software Foundation need this, especially when nobody else does? > > I presume others are correct that an address helps as an identifier. It probably does, though it's hardly perfect. Mostly it can serve as a contact point, but these days an e-mail address might be more durable than a postal address (and it's probably a much better identifier too). > Part of the CLA is informing > contributors that we only want code that can be legally contributed, and > contributors agreeing that they will offer such. You don't need someone's postal address to inform them, if you're not sending them any paper material (which the PSF doesn't, AFAIR). Asking a contributor their postal address does not make them better informed. > > I don't think I've ever received anything from the PSF by postal mail, > > so if you don't want to give out your postal address, or simply don't > > have one, then you can probably submit a fake one. > > DON'T DO THIS, anyone. If you were to be a defendant or witness in a > lawsuit, expect to be asked "Did you reside at this address when you > signed the CLA?" I'm not sure why anyone would ask that question. Residing somewhere doesn't have much to do with copyright issues (except when determining which national law should apply, and perhaps even not). And a postal address doesn't have to be where you reside, either : people can very well have their postal address at a friend's or relative's while not living there, and they can very well give different postal addresses for different purposes (just like you can give a different e-mail address to your professional and personal contacts). Besides, even in a dysfunctional legal system, I'd be surprised if a Python copyright lawsuit would involve asking all past Python contributors (the thousands of them, assuming they can all be successfully contacted and brought to the court) whether they did really live at the postal address they once declared on their CLA. But, yes, perhaps better to leave the entry blank or write "irrelevant", if the PSF accepts that :-) Regards Antoine. From jackiekazil at gmail.com Sun Sep 9 15:43:13 2018 From: jackiekazil at gmail.com (Jacqueline Kazil) Date: Sun, 9 Sep 2018 15:43:13 -0400 Subject: [Python-Dev] Official citation for Python Message-ID: The PSF has received a few inquiries asking the question ? ?How do I cite Python??So, I am reaching out to you all to figure this out. (For those that don?t know my background, I have been in academia for a bit as a Ph.D student and have worked at the Library of Congress writing code to process Marc records , among other things.) IMHO the citation for Python should be decided upon by the Python developers and should live somewhere on the site. Two questions to be answered? 1. What format should it take? 2. Where does it live on the site? To help frame the first one, I quickly wrote this up ? https://docs.google.com/document/d/1R0mo8EYVIPNkmNBImpcZTbk0e78T2oU71ioX5NvVTvY/edit# tldr; Summary of possibilities? 1. Article for one citation (1 DOI, generated by the publication) 2. No article (many DOIs ? one for each major version through Zenodo (or similar service)) Discuss. -Jackie Jackie Kazil Board of Directors, PSF -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ericsnowcurrently at gmail.com Sun Sep 9 19:03:53 2018 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Sun, 9 Sep 2018 17:03:53 -0600 Subject: [Python-Dev] Official citation for Python In-Reply-To: References: Message-ID: On Sun, Sep 9, 2018, 14:19 Jacqueline Kazil wrote: > The PSF has received a few inquiries asking the question ? ?How do I cite > Python??So, I am reaching out to you all to figure this out. > > (For those that don?t know my background, I have been in academia for a > bit as a Ph.D student and have worked at the Library of Congress writing > code to process Marc records , > among other things.) > > IMHO the citation for Python should be decided upon by the Python > developers and should live somewhere on the site. > > Two questions to be answered? > > 1. What format should it take? > 2. Where does it live on the site? > > To help frame the first one, I quickly wrote this up ? > https://docs.google.com/document/d/1R0mo8EYVIPNkmNBImpcZTbk0e78T2oU71ioX5NvVTvY/edit# > > tldr; Summary of possibilities? > > 1. Article for one citation (1 DOI, generated by the publication) > 2. No article (many DOIs ? one for each major version through Zenodo > (or similar service)) > > Discuss. > Hi Jackie! FWIW, this has come up a few times in the past on python-ideas and/or python-dev. Sorry I don't have more info. Alas, if I were at my computer I could offer specifics. :) -eric -------------- next part -------------- An HTML attachment was scrubbed... URL: From marcidy at gmail.com Sun Sep 9 19:42:44 2018 From: marcidy at gmail.com (Matt Arcidy) Date: Sun, 9 Sep 2018 16:42:44 -0700 Subject: [Python-Dev] Why does the Contributor Agreement need my address? In-Reply-To: <20180909215703.0aab45a5@fsol> References: <20180909194919.3b439370@fsol> <20180909215703.0aab45a5@fsol> Message-ID: On Sun, Sep 9, 2018, 12:59 Antoine Pitrou wrote: > > > I'm not sure why anyone would ask that question. because if they can discredit a witness, they will. Matt > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/marcidy%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Richard at Damon-Family.org Sun Sep 9 20:33:02 2018 From: Richard at Damon-Family.org (Richard Damon) Date: Sun, 9 Sep 2018 20:33:02 -0400 Subject: [Python-Dev] Why does the Contributor Agreement need my address? In-Reply-To: <20180909215703.0aab45a5@fsol> References: <20180909194919.3b439370@fsol> <20180909215703.0aab45a5@fsol> Message-ID: <9f6b0893-b1e1-8ca4-8251-5476ccb4580d@Damon-Family.org> On 9/9/18 3:57 PM, Antoine Pitrou wrote: > On Sun, 9 Sep 2018 15:15:36 -0400 > Terry Reedy wrote: >> On 9/9/2018 1:49 PM, Antoine Pitrou wrote: >>> On Sat, 8 Sep 2018 23:11:27 -0400 >>> "Joseph C. Sible" wrote: >>>> I'm used to signing CLA's that require nothing beyond a name and a check >>>> box. When I went to sign the PSF Contributor Agreement so I can submit a PR >>>> for CPython, I was surprised to see that it wants my address. Why does the >>>> Python Software Foundation need this, especially when nobody else does? >> I presume others are correct that an address helps as an identifier. > It probably does, though it's hardly perfect. Mostly it can serve as a > contact point, but these days an e-mail address might be more durable > than a postal address (and it's probably a much better identifier too). 
> A Name + Address is a practically perfect identifier, as most people have a specific legal address of residence and at that address it is very unlikely two people have identical legal names. It is this legal address and legal name that people should be using for these sorts of legal documents. Government tend to have a vested interest in keeping track of legal addresses as this tends to have implications in things like taxes, so piggy backing on this identification can help with identification for other purposes. There also tends to be official government documents that can track back your 'official' address over time, so confirming that you are the Joe Smith from 15 Main ST, Anytown USA, is possible. Try to think how you could legally prove you were or were not the owner of joe.smith at example.com 10 years ago, where example.com is some major free email providers. -- Richard Damon From steve at pearwood.info Sun Sep 9 20:40:44 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 10 Sep 2018 10:40:44 +1000 Subject: [Python-Dev] Official citation for Python In-Reply-To: References: Message-ID: <20180910004044.GU27312@ando.pearwood.info> On Sun, Sep 09, 2018 at 03:43:13PM -0400, Jacqueline Kazil wrote: > The PSF has received a few inquiries asking the question ? ?How do I cite > Python??So, I am reaching out to you all to figure this out. If you figure it out, it would be lovely to see some movement on this ticket: https://bugs.python.org/issue26597 -- Steve From wes.turner at gmail.com Sun Sep 9 20:46:54 2018 From: wes.turner at gmail.com (Wes Turner) Date: Sun, 9 Sep 2018 20:46:54 -0400 Subject: [Python-Dev] Official citation for Python In-Reply-To: <20180910004044.GU27312@ando.pearwood.info> References: <20180910004044.GU27312@ando.pearwood.info> Message-ID: "Python Programming Language" (van Rossum, et. Al) YYYY ? Should there be a URL and/or a DOI? Figshare and Zenodo will archive a [e.g. tagged] [GitHub] revision and generate a DOI, AFAIU On Sunday, September 9, 2018, Steven D'Aprano wrote: > On Sun, Sep 09, 2018 at 03:43:13PM -0400, Jacqueline Kazil wrote: > > The PSF has received a few inquiries asking the question ? ?How do I cite > > Python??So, I am reaching out to you all to figure this out. > > If you figure it out, it would be lovely to see some movement on this > ticket: > > https://bugs.python.org/issue26597 > > > -- > Steve > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > wes.turner%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjreedy at udel.edu Sun Sep 9 23:32:21 2018 From: tjreedy at udel.edu (Terry Reedy) Date: Sun, 9 Sep 2018 23:32:21 -0400 Subject: [Python-Dev] Official citation for Python In-Reply-To: References: Message-ID: On 9/9/2018 3:43 PM, Jacqueline Kazil wrote: > The PSF has received a few inquiries asking the question ? > ?How do I cite Python??So, I am reaching out to you all to figure this out. > > (For those that don?t know my background, I have been in academia for a > bit as a Ph.D student and have worked at the Library of Congress writing > code to process Marc records , > among other things.) > > IMHO the citation for Python should be decided upon by the Python > developers and should live somewhere on the site. > > Two questions to be answered? > > 1. What format should it take? 
There are by now formats for citing web documents. I presume style guides now include such. Try a current version of the Chicago Manual of Style. (not sure of exact title). I will ask a university professor who should know more than I. > 2. Where does it live on the site? On https://bugs.python.org/issue26597, I suggested the Copyright page. I now think a link to 'Citing these Documents' on https://docs.python.org/3/ would be even better. -- Terry Jan Reedy From jackiekazil at gmail.com Sun Sep 9 23:39:40 2018 From: jackiekazil at gmail.com (Jacqueline Kazil) Date: Sun, 9 Sep 2018 23:39:40 -0400 Subject: [Python-Dev] Official citation for Python In-Reply-To: References: Message-ID: Terry -- For clarification, the format question was not a style question. It was a reference to the one versus many that I wrote in the explainer. Yes... there are many prescribed formats already. That is the easy part. -Jackie On Sun, Sep 9, 2018 at 11:33 PM Terry Reedy wrote: > On 9/9/2018 3:43 PM, Jacqueline Kazil wrote: > > The PSF has received a few inquiries asking the question ? > > ?How do I cite Python??So, I am reaching out to you all to figure this > out. > > > > (For those that don?t know my background, I have been in academia for a > > bit as a Ph.D student and have worked at the Library of Congress writing > > code to process Marc records , > > among other things.) > > > > IMHO the citation for Python should be decided upon by the Python > > developers and should live somewhere on the site. > > > > Two questions to be answered? > > > > 1. What format should it take? > > There are by now formats for citing web documents. I presume style > guides now include such. Try a current version of the Chicago Manual of > Style. (not sure of exact title). I will ask a university professor > who should know more than I. > > > 2. Where does it live on the site? > > On https://bugs.python.org/issue26597, I suggested the Copyright page. > I now think a link to 'Citing these Documents' on > https://docs.python.org/3/ > would be even better. > > -- > Terry Jan Reedy > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/jackiekazil%40gmail.com > -- Jacqueline Kazil | @jackiekazil -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjreedy at udel.edu Sun Sep 9 23:53:18 2018 From: tjreedy at udel.edu (Terry Reedy) Date: Sun, 9 Sep 2018 23:53:18 -0400 Subject: [Python-Dev] Official citation for Python In-Reply-To: References: Message-ID: <2b9cec3e-eea7-395f-912d-1cf765a07030@udel.edu> On 9/9/2018 11:39 PM, Jacqueline Kazil wrote: > Terry -- For clarification, the format question was not a style > question. It was a reference to the one versus many that I wrote in the > explainer. I don't know what you mean by this. > Yes... there are many prescribed formats already. That is the easy part. Different publications use different citation formats. We cannot dictate which format an author or publication uses. We could, and I think should, suggest the content of the different fields that go into the various formats. And we could give examples of citing, say, the Reference Manual, in the most common formats. > On Sun, Sep 9, 2018 at 11:33 PM Terry Reedy > wrote: > > On 9/9/2018 3:43 PM, Jacqueline Kazil wrote: > > The PSF has received a few inquiries asking the question ? 
> > ?How do I cite Python??So, I am reaching out to you all to figure > this out. > > > > (For those that don?t know my background, I have been in academia > for a > > bit as a Ph.D student and have worked at the Library of Congress > writing > > code to process Marc records > , > > among other things.) > > > > IMHO the citation for Python should be decided upon by the Python > > developers and should live somewhere on the site. The PSF is the publisher. It seems that you might be more competent to make some of the decisions than are we developers, who have mostly left academia some time ago. > > Two questions to be answered? > > > >? 1. What format should it take? > > There are by now formats for citing web documents.? I presume style > guides now include such.? Try a current version of the Chicago > Manual of > Style.? (not sure of exact title).? I will ask a university professor > who should know more than I. > > >? 2. Where does it live on the site? > > On https://bugs.python.org/issue26597, I suggested the Copyright page. To make the answer more visible, > I now think a link to 'Citing these Documents' on > https://docs.python.org/3/ > would be even better. tjr From wes.turner at gmail.com Mon Sep 10 00:05:43 2018 From: wes.turner at gmail.com (Wes Turner) Date: Mon, 10 Sep 2018 00:05:43 -0400 Subject: [Python-Dev] Official citation for Python In-Reply-To: References: Message-ID: On Sunday, September 9, 2018, Terry Reedy wrote: > On 9/9/2018 3:43 PM, Jacqueline Kazil wrote: > >> The PSF has received a few inquiries asking the question ? >> ?How do I cite Python??So, I am reaching out to you all to figure this out. >> >> (For those that don?t know my background, I have been in academia for a >> bit as a Ph.D student and have worked at the Library of Congress writing >> code to process Marc records , >> among other things.) >> >> IMHO the citation for Python should be decided upon by the Python >> developers and should live somewhere on the site. >> >> Two questions to be answered? >> >> 1. What format should it take? >> > > There are by now formats for citing web documents. I presume style guides > now include such. Try a current version of the Chicago Manual of Style. > (not sure of exact title). I will ask a university professor who should > know more than I. Citation Style Language -- supported by a number of citation tools such as Zotero and Mendeley -- is used to generate citations in very many citation styles such as Chicago, MLA, https://citationstyles.org https://citationstyles.org/authors/ https://www.zotero.org/styles (9141 styles, really) https://github.com/citation-style-language/styles BibTeX can be generated with CSL. SciPy, scikit-learn, statsmodels, and pandas all list bibtex citations in their docs. https://www.scipy.org/citing.html - [ ] Add the Python citation to this list http://scikit-learn.org/stable/about.html#citing-scikit-learn https://www.statsmodels.org/stable/#citation https://pandas.pydata.org/talks.html BibTeX is rather unspecified; in terms of which fields/attributes to define. Search engines index schema.org metadata; which can be represented as RDF in HTML (RDFa), microdata, JSON-LD: https://schema.org/CreativeWork https://schema.org/Code https://schema.org/SoftwareApplication https://schema.org/ScholarlyArticle Ideally, a tool such as Zotero or Mendeley can auto-detect and parse the citation for reformatting (with CSL) into whichever citation style is used for a bibliography / works cited / tools section. 
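To make the schema.org / JSON-LD point above concrete: a minimal sketch, in Python, of the kind of machine-readable record a search engine or reference manager could pick up for a piece of software. Every value in it is a placeholder chosen for illustration, not an agreed official citation record for Python.

    import json

    # Illustrative schema.org record for a software project, serialized as JSON-LD.
    # All values below are placeholders, not an official Python citation.
    record = {
        "@context": "https://schema.org",
        "@type": "SoftwareApplication",
        "name": "Python",
        "softwareVersion": "3.7.0",  # placeholder; cite the release actually used
        "url": "https://www.python.org/",
        "author": {"@type": "Organization", "name": "Python Software Foundation"},
    }

    print(json.dumps(record, indent=2))

A record like this could sit alongside the human-readable citation text rather than replace it, so that tools such as Zotero or Mendeley can parse it automatically.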
Are there separate citations for Python and CPython? https://westurner.github.io/tools/#cpython https://sphinxcontrib-bibtex.readthedocs.io/en/latest/quickstart.html > This extension allows BibTeX citations to be inserted into documentation generated by Sphinx, via a bibliography directive, and a cite role, which work similarly to LaTeX?s thebibliography environment and \cite command. > 2. Where does it live on the site? >> > > On https://bugs.python.org/issue26597, I suggested the Copyright page. I > now think a link to 'Citing these Documents' on https://docs.python.org/3/ > would be even better. A heading in the docs would be great. People may also be likely to read the README. If there are potentially multiple citations, sphinxcontrib-bibtex may be worth adding to the CPython docs. > > -- > Terry Jan Reedy > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/wes. > turner%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From python.00 at klix.ch Mon Sep 10 03:51:33 2018 From: python.00 at klix.ch (Gerald Klix) Date: Mon, 10 Sep 2018 09:51:33 +0200 Subject: [Python-Dev] Official citation for Python In-Reply-To: References: Message-ID: <233a848c-c8a6-fda5-1c83-6cbf281b28eb@klix.ch> Wouldn't it make sense to ask the developers of the other Python implementations too? Just my 0.02?, Gerald Am 09.09.2018 um 21:43 schrieb Jacqueline Kazil: > The PSF has received a few inquiries asking the question ? ?How do I cite > Python??So, I am reaching out to you all to figure this out. > > (For those that don?t know my background, I have been in academia for a bit > as a Ph.D student and have worked at the Library of Congress writing code > to process Marc records , among > other things.) > > IMHO the citation for Python should be decided upon by the Python > developers and should live somewhere on the site. > > Two questions to be answered? > > 1. What format should it take? > 2. Where does it live on the site? > > To help frame the first one, I quickly wrote this up ? > https://docs.google.com/document/d/1R0mo8EYVIPNkmNBImpcZTbk0e78T2oU71ioX5NvVTvY/edit# > > tldr; Summary of possibilities? > > 1. Article for one citation (1 DOI, generated by the publication) > 2. No article (many DOIs ? one for each major version through Zenodo > (or similar service)) > > Discuss. > > -Jackie > > Jackie Kazil > Board of Directors, PSF > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/python.00%40klix.ch From greg at krypto.org Mon Sep 10 14:56:23 2018 From: greg at krypto.org (Gregory P. Smith) Date: Mon, 10 Sep 2018 11:56:23 -0700 Subject: [Python-Dev] AES decryption/encryption support for zipfile In-Reply-To: References: <1b14178e-8cdb-5061-917e-6d63a0b3d832@python.org> Message-ID: On Wed, Sep 5, 2018 at 8:24 AM ???? wrote: > Sorry, allow me to ask one more thing. > If I want to use AES in zipfile module, what the good way to implement? > If anyone wants to add support for additional zipfile encryption/decryption methods, there are a few options: (a) Fork the stdlib zipfile module and create one that supports the additional features, posting it on PyPI. 
That way it could have dependencies on other third party libraries such as https://cryptography.io/en/latest/. (b) Figure out the set of hooks necessary for the zipfile module to support pluggable encryption as an API so that external libraries could provide encryption support to it. (c) Write a library that wraps an existing third party zip file creation tool or library instead of reusing the stdlib zipfile code. Option (a) is probably easiest to start with... but creates a maintenance burden of keeping that module up to date. Option (b) will be more challenging, the zipfile API modifications and their tests would need merging and would only show up in a future CPython release (3.8 today). Option (c) is entirely different, but would get you out of the business of dealing with the zipfile spec itself. Unstated option (n): write something entirely new not based on existing code or tools. An entirely different form of challenge. In general the existing stdlib zipfile module code is not loved by any of us who have had to work on it in the past decade, it is a hairy mess (but does work, so it's got that going for it). Granted, the zip format as a specification vs the many implementations out there to be compatible with is also what I'd call an underspecified mess... -gps > Thanks and Regards, > ----------------- > Takahiro Ono > > 2018?9?5?(?) 23:01 ???? : > >> Christian, really appreciated the details. I understood. >> >> Is wrapper library like ssl module with openssl on platform also not good >> idea? >> My intention is not re-invention but single standard way as standard >> library. >> >> If I can read past discussion somewhere, it's also appreciated >> >> Thanks and Regards, >> Takahiro Ono >> >> >> >> >> 2018?9?5?(?) 1:48 Christian Heimes : >> >>> On 2018-09-04 16:37, ???? wrote: >>> > Dear all, >>> > >>> > Have we tried cipher implementation includes AES as a standard library >>> > in the past? >>> > https://docs.python.org/3.6/library/crypto.html >>> > >>> > if possible I want to try to implement AES because famous 3rd party >>> > library is not maintained and general cipher programs should be used >>> for >>> > multiple purpose.Though the implementation is tough, I believe this >>> > should be worth to it. >>> > In my case, I want to use AES implementation for zipfile module. >>> >>> strong -1 >>> >>> The Python standard library doesn't contain any encryption, signing, and >>> other cryptographic algorithms for multiple reasons. The only exception >>> from the rule are hashing algorithms and HMAC construct. There are legal >>> implications like export restrictions. Crypto is just too hard to get >>> right and we don't want to give the user additional rope. We already had >>> a very lengthy and exhausting discussion for the secrets module. That >>> module just provides a user-friendly interface to CPRNG. >>> >>> By the way, AES by itself is a useless to borderline dangerous >>> algorithm. It must be embedded within additional layers like block mode, >>> authenticated encryption / MAC, and more. There isn't a single correct >>> answer for block mode and AD algorithm, too. It highly depends on the >>> problem space. While GCM AEAD mode is good choice for network >>> communication, it can be a pretty bad idea for persistent storage. >>> >>> There is one excellent Python library with high level and low level >>> cryptographic algorithms: http://cryptography.readthedocs.io/ . 
It's t >>> >>> Regards, >>> Christian >>> >> _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/greg%40krypto.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Mon Sep 10 15:24:03 2018 From: brett at python.org (Brett Cannon) Date: Mon, 10 Sep 2018 12:24:03 -0700 Subject: [Python-Dev] Official citation for Python In-Reply-To: <2b9cec3e-eea7-395f-912d-1cf765a07030@udel.edu> References: <2b9cec3e-eea7-395f-912d-1cf765a07030@udel.edu> Message-ID: On Sun, 9 Sep 2018 at 20:55 Terry Reedy wrote: > On 9/9/2018 11:39 PM, Jacqueline Kazil wrote: > > Terry -- For clarification, the format question was not a style > > question. It was a reference to the one versus many that I wrote in the > > explainer. > > I don't know what you mean by this. > > > Yes... there are many prescribed formats already. That is the easy part. > > Different publications use different citation formats. We cannot > dictate which format an author or publication uses. We could, and I > think should, suggest the content of the different fields that go into > the various formats. And we could give examples of citing, say, the > Reference Manual, in the most common formats. > What we can suggest, though, is what the information should be and that's what Jackie is initially asking about, i.e. do we want a single reference for the language that is version-agnostic or one for each Python release? My vote is a single reference and leave it up to the person referencing to clarify the version they are using. Seems the simplest to maintain long-term. -Brett > > > On Sun, Sep 9, 2018 at 11:33 PM Terry Reedy > > wrote: > > > > On 9/9/2018 3:43 PM, Jacqueline Kazil wrote: > > > The PSF has received a few inquiries asking the > question ? > > > ?How do I cite Python??So, I am reaching out to you all to figure > > this out. > > > > > > (For those that don?t know my background, I have been in academia > > for a > > > bit as a Ph.D student and have worked at the Library of Congress > > writing > > > code to process Marc records > > , > > > among other things.) > > > > > > IMHO the citation for Python should be decided upon by the Python > > > developers and should live somewhere on the site. > > The PSF is the publisher. It seems that you might be more competent to > make some of the decisions than are we developers, who have mostly left > academia some time ago. > > > > Two questions to be answered? > > > > > > 1. What format should it take? > > > > There are by now formats for citing web documents. I presume style > > guides now include such. Try a current version of the Chicago > > Manual of > > Style. (not sure of exact title). I will ask a university professor > > who should know more than I. > > > > > 2. Where does it live on the site? > > > > On https://bugs.python.org/issue26597, I suggested the Copyright > page. > > To make the answer more visible, > > > I now think a link to 'Citing these Documents' on > > https://docs.python.org/3/ > > would be even better. > > tjr > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/brett%40python.org > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From chris.barker at noaa.gov Mon Sep 10 15:25:29 2018 From: chris.barker at noaa.gov (Chris Barker) Date: Mon, 10 Sep 2018 21:25:29 +0200 Subject: [Python-Dev] Official citation for Python In-Reply-To: <233a848c-c8a6-fda5-1c83-6cbf281b28eb@klix.ch> References: <233a848c-c8a6-fda5-1c83-6cbf281b28eb@klix.ch> Message-ID: I'd like to know what the citations are expected to be used for? i.e. -- usually, academic papers have a collection of citations to acknowledge where you got an idea, or fact, or .... It serves both to justify something and make it clear that it is not your own idea (i.e. not plagiarism). In the enclosed doc, it says: """ *If someone publishes research, they will cite the exact major version that was used, so if someone was trying to recreate the research they would be able to do it.* *"""* That is about reproducible results, which is really a different thing than the usual citations. In that case, you would want some way to identify the actual source code (CPython version 3.6.4, and probably a URL to the source -- but how long might that last???) And you would need to post your own source anyway. Again, regular citation is about acknowledging the source of an idea or fact, or something of that nature. I can imagine a paper about computer language design or some such might need to reference Python -- in which case it should reference the Language Reference, I suppose. Maybe we should have a DOI for each version of the standard docs. But "I wrote some code in Python to produce these statistics" -- does that need a citation? If so, maybe that would take a different form. Anyway, hard to make this decision without some idea how the citation is intended to be used. -CHB On Mon, Sep 10, 2018 at 9:51 AM, Gerald Klix wrote: > Wouldn't it make sense to ask the developers of the other Python > implementations too? > > Just my 0.02?, > > > Gerald > > > > Am 09.09.2018 um 21:43 schrieb Jacqueline Kazil: > >> The PSF has received a few inquiries asking the question ? ?How do I cite >> Python??So, I am reaching out to you all to figure this out. >> >> (For those that don?t know my background, I have been in academia for a >> bit >> as a Ph.D student and have worked at the Library of Congress writing code >> to process Marc records , among >> other things.) >> >> IMHO the citation for Python should be decided upon by the Python >> developers and should live somewhere on the site. >> >> Two questions to be answered? >> >> 1. What format should it take? >> 2. Where does it live on the site? >> >> To help frame the first one, I quickly wrote this up ? >> https://docs.google.com/document/d/1R0mo8EYVIPNkmNBImpcZTbk0 >> e78T2oU71ioX5NvVTvY/edit# >> >> tldr; Summary of possibilities? >> >> 1. Article for one citation (1 DOI, generated by the publication) >> 2. No article (many DOIs ? one for each major version through Zenodo >> (or similar service)) >> >> Discuss. >> >> -Jackie >> >> Jackie Kazil >> Board of Directors, PSF >> >> >> >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> https://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: https://mail.python.org/mailman/options/python-dev/python. >> 00%40klix.ch >> > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/chris. > barker%40noaa.gov > -- Christopher Barker, Ph.D.
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From nad at python.org Mon Sep 10 18:17:00 2018 From: nad at python.org (Ned Deily) Date: Mon, 10 Sep 2018 15:17:00 -0700 Subject: [Python-Dev] 3.7.1 and 3.6.7 Releases Coming Soon Message-ID: <3ABAB3B5-6346-49F2-98F7-185303166016@python.org> I have updated the schedules for the next maintenance releases of Python 3.7.x and 3.6.x. My original plan had been to get 3.7.1, the first 3.7 maintenance release, out by the end of July. It was solely my fault that that did not happen so I hope you'll accept my apologies; I will try to not let it happen again. I have now scheduled a 3.7.1 release candidate and rescheduled the 3.6.7 release candidate for 2018-09-18, about a week from today, and 3.7.1 final and 3.6.7 final for 2018-09-28. That allows us to take advantage of fixes generated at the Core Developers sprint taking place this week. Please review any open issues you are working on or are interested in and try to get them merged in to the 3.7 and/or 3.6 branches soon - by the beginning of next week at the latest. As usual, if there are any issues you believe need to be addressed prior to these releases, please ensure there are open issues for them in the bug tracker (bugs.python.org) and that their priorities are set accordingly (e.g. "release blocker"). Thanks for your support! --Ned -- Ned Deily nad at python.org -- [] From steve at pearwood.info Mon Sep 10 21:48:08 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 11 Sep 2018 11:48:08 +1000 Subject: [Python-Dev] Official citation for Python In-Reply-To: References: <233a848c-c8a6-fda5-1c83-6cbf281b28eb@klix.ch> Message-ID: <20180911014808.GC1596@ando.pearwood.info> On Mon, Sep 10, 2018 at 09:25:29PM +0200, Chris Barker via Python-Dev wrote: > I"d like ot know what thee citations are expected to be used for? > > i.e. -- usually, academic papers have a collection of citiations to > acknowledge where you got an idea, or fact, or .... It serves both to > jusstify something and make it clear that it is not your own idea (i.e. not > pagerism). [ > That is about reproducible results, which is really a different thing than > the usual citations. I don't think it is. I think you are seeing a distinction that is not there. If citations were just about acknowledgement, we could say "I got this idea from Bob" and be done with it. Citations are about identifying the *exact* source so that anyone can reproduce the given ideas by checking not just "Bob" but the specific page number of a specific edition of a specific work. So the requirement for precision is no different between papers and software, and the academic standards for citing software already take that into account. There are challenges with software, to be sure -- code is much more ephemeral, there may be literally hundreds of authors, etc. But in principle, the kinds of information needed to cite a software package is known. The major citation styles already include this. When you are using a specific style, this page: https://openresearchsoftware.metajnl.com/about/ suggests a few formats, depending on how you got access to the software. The bottom line is, we don't have to guess what information to provide. People like Jacqueline can tell us what they need, and we'll just fill in the values. 
The people citing Python know what information they need, we just have to help them get it. I think that the best way to do that is to provide the correct information in a single place, in a single, standard format, and let them choose the appropriate citation style for their publication. Jackie, do I have that right? -- Steve From wes.turner at gmail.com Mon Sep 10 23:44:07 2018 From: wes.turner at gmail.com (Wes Turner) Date: Mon, 10 Sep 2018 23:44:07 -0400 Subject: [Python-Dev] Official citation for Python In-Reply-To: <20180911014808.GC1596@ando.pearwood.info> References: <233a848c-c8a6-fda5-1c83-6cbf281b28eb@klix.ch> <20180911014808.GC1596@ando.pearwood.info> Message-ID: I also see reproducibility and citation graphs as distinct concepts. If it's reproducibility you're after, bibliographic citations are very unlikely to enable someone else to assemble an identical build environment from which the same conclusion should be repeatably derivable. A ScholarlyArticle can be reproducible with no citations whatsoever. A ScholarlyArticle may very likely have many citations and still be woefully unreproducible. This citation doesn't contain a URL, but still isn't quite useless (while the paper is excellent); because there's at least a DOI string: Sandve GK, Nekrutenko A, Taylor J, Hovig E (2013) Ten Simple Rules for Reproducible Computational Research. PLoS Comput Biol 9(10): e1003285. doi:10.1371/journal.pcbi.1003285 > Rule 3: Archive the Exact Versions of All External Programs Used mybinder.org builds Jupyter containers from git repositories that contain config files with repo2docker. https://repo2docker.readthedocs.io/en/latest/config_files.html#configuration-files """ Dockerfile environment.yml requirements.txt REQUIRE install.R apt.txt setup.py postBuild runtime.txt """ Specifying the exact version of Python (and what package it was installed from and/or what URL the source was obtained and built from) is no substitute for hashes of the 'pinned' versions of said artifacts. # includes the python version $ conda env export -f environment.yml # these do not include the python version $ pip freeze -r requirements.txt --all $ pipenv lock # > Pipfile.lock $ pipenv sync # < Pipfile.lock Uploading a built container or VM image to e.g. Docker Hub / GitLab Container Registry / Vagrant Cloud is another way to ensure that research findings are reproducible. - Dockerfile, docker-compose.yml - Vagrantfile > Rule 4: Version Control All Custom Scripts https://mozillascience.github.io/code-research-object/ (FigShare + GitHub => DOI citation URI) https://guides.github.com/activities/citable-code/ (Zenodo + GitHub => DOI citation URI) ... Is it necessary to cite Python (or all packages) if you're not building a derivative of Python or said packages? It's definitely a good idea to "Archive the Exact Versions of All External Programs Used"; but IDK that those are best represented with bibliographic citations. Really, a link to the Homepage, Source, Docs, and Wikipedia page are probably more helpful to a reviewer that's not familiar with and wants to help support by linking dereferenceable URLs and https://5stardata.info. While out of scope and OT, it's worth mentioning that search engines index https://schema.org/Dataset metadata; which is helpful for data reuse and autodiscovering requisite premises for the argument presented in a https://schema.org/ScholarlyArticle . A citation for each MAJ.MIN.PATCH revision of CPython (and/or other excellent packages) might be a bit much. 
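As a small companion to Rule 3 above ("Archive the Exact Versions of All External Programs Used"): the interpreter build details and installed package versions can also be captured from inside Python and archived next to the analysis code. A sketch using the standard library plus pkg_resources, which ships with setuptools and is assumed to be available here:

    import platform
    import sys

    import pkg_resources  # provided by setuptools; assumed installed

    def environment_record():
        """Collect interpreter and package versions for a reproducibility record."""
        packages = sorted(
            "%s==%s" % (dist.project_name, dist.version)
            for dist in pkg_resources.working_set
        )
        return {
            "implementation": platform.python_implementation(),  # e.g. 'CPython'
            "version": platform.python_version(),                 # e.g. '3.7.0'
            "build": platform.python_build(),                      # (build tag, build date)
            "compiler": platform.python_compiler(),
            "full_version": sys.version,
            "packages": packages,
        }

    if __name__ == "__main__":
        for field, value in environment_record().items():
            print(field, value, sep=": ")

A record like this complements, rather than replaces, pinned environment files such as environment.yml or requirements.txt.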
On Monday, September 10, 2018, Steven D'Aprano wrote: > On Mon, Sep 10, 2018 at 09:25:29PM +0200, Chris Barker via Python-Dev > wrote: > > I"d like ot know what thee citations are expected to be used for? > > > > i.e. -- usually, academic papers have a collection of citiations to > > acknowledge where you got an idea, or fact, or .... It serves both to > > jusstify something and make it clear that it is not your own idea (i.e. > not > > pagerism). > > [ > > That is about reproducible results, which is really a different thing > than > > the usual citations. > > I don't think it is. I think you are seeing a distinction that is not > there. If citations were just about acknowledgement, we could say "I got > this idea from Bob" and be done with it. Citations are about identifying > the *exact* source so that anyone can reproduce the given ideas by > checking not just "Bob" but the specific page number of a specific > edition of a specific work. > > So the requirement for precision is no different between papers and > software, and the academic standards for citing software already take > that into account. There are challenges with software, to be sure -- > code is much more ephemeral, there may be literally hundreds of > authors, etc. But in principle, the kinds of information needed to > cite a software package is known. The major citation styles already > include this. When you are using a specific style, this page: > > https://openresearchsoftware.metajnl.com/about/ > > suggests a few formats, depending on how you got access to the software. > > The bottom line is, we don't have to guess what information to provide. > People like Jacqueline can tell us what they need, and we'll just fill > in the values. > > The people citing Python know what information they need, we just have > to help them get it. I think that the best way to do that is to provide > the correct information in a single place, in a single, standard format, > and let them choose the appropriate citation style for their > publication. > > Jackie, do I have that right? > > > > -- > Steve > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > wes.turner%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Tue Sep 11 05:16:08 2018 From: chris.barker at noaa.gov (Chris Barker) Date: Tue, 11 Sep 2018 11:16:08 +0200 Subject: [Python-Dev] Official citation for Python In-Reply-To: <20180911014808.GC1596@ando.pearwood.info> References: <233a848c-c8a6-fda5-1c83-6cbf281b28eb@klix.ch> <20180911014808.GC1596@ando.pearwood.info> Message-ID: On Tue, Sep 11, 2018 at 3:48 AM, Steven D'Aprano wrote: > > That is about reproducible results, which is really a different thing > than > > the usual citations. > > I don't think it is. I think you are seeing a distinction that is not > there. no need for us to agree on that, but there are still multiple reasons / ways you might want to cite Python, and what you would want to cite would be different. Lets say one were to write an article about how different computer languages express functional programming concepts -- you may want to cite Python, but you are not trying to identify a specific version for reproducible results. 
And see Wes Turner's note -- it is highly unlikely that a single citation to a standard document or something will be enough for reproducibility anyway. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Tue Sep 11 08:45:09 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 11 Sep 2018 22:45:09 +1000 Subject: [Python-Dev] Official citation for Python In-Reply-To: References: <233a848c-c8a6-fda5-1c83-6cbf281b28eb@klix.ch> <20180911014808.GC1596@ando.pearwood.info> Message-ID: <20180911124508.GK1596@ando.pearwood.info> On Tue, Sep 11, 2018 at 11:16:08AM +0200, Chris Barker wrote: > On Tue, Sep 11, 2018 at 3:48 AM, Steven D'Aprano > wrote: > > > > That is about reproducible results, which is really a different thing > > than > > > the usual citations. > > > > I don't think it is. I think you are seeing a distinction that is not > > there. > > > no need for us to agree on that, but there are still multiple reasons / > ways you might want to cite Python, and what you would want to cite would > be different. I think this thread is about *academic* citations. I know the feature request I linked to earlier is, because I opened it and that's what I intended :-) Informal citations can include as much or as little information as you care to give. It could be as little as "use Python" or it could be a link to a specific branch or tag in a repo, complete with detailed instructions on building the environment up to and including the OS and processor type. But those sorts of detailed build instructions aren't really a *citation*, they are more along the lines of the experimental design ("First, compile Linux using gcc ..."). There's a metric ton of information on the web about citing software, there are existing standards, and I really think you are over-complicating this. See, for example: https://www.software.ac.uk/how-cite-software http://www.citethisforme.com/cite/software https://openresearchsoftware.metajnl.com/about/#q12 Its not our job to tell academics how to cite, they already have a number of standardised templates that they use, but it is our job to tell them what information to fill into the template. > Lets say one were to write an article about how different computer > languages express functional programming concepts -- you may want to cite > Python, but you are not trying to identify a specific version for > reproducible results. I don't think we need to lose any sleep over how random bloggers and Redditors informally cite Python. I think the focus here is on academic citations, which have rather precise and standard requirements. No need to expand the scope of this problem to arbitrary mentions of Python. Besides, if we have a sys.cite() function that provides the relevant information, bloggers etc will soon learn to pick and choose the bits they care about from it, even if they don't give a proper academic style citation. Of course it is possible that I've completely misunderstood Jackie's request. If so, hopefully she will speak up soon. > And see Wes Turner's note -- it is highly unlikely that a single citation > to a standard document or something will be enough for reproducibility > anyway. The academic community seems to think that it is. We don't have to tell them that they're wrong. 
-- Steve From wes.turner at gmail.com Tue Sep 11 10:04:40 2018 From: wes.turner at gmail.com (Wes Turner) Date: Tue, 11 Sep 2018 10:04:40 -0400 Subject: [Python-Dev] Official citation for Python In-Reply-To: References: <20180910004044.GU27312@ando.pearwood.info> Message-ID: On Sunday, September 9, 2018, Wes Turner wrote: > "Python Programming Language" (van Rossum, et. Al) YYYY > > ? > > Should there be a URL and/or a DOI? > http://www.python.org/~guido/Publications.html https://gvanrossum.github.io/Publications.html lists a number of Python citations: """ Python Guido van Rossum: Scripting the Web with Python. In "Scripting Languages: Automating the Web", World Wide Web Journal, Volume 2, Issue 2, Spring 1997, O'Reilly. Aaron Watters, Guido van Rossum, James C. Ahlstrom: Internet Programming with Python. MIS Press/Henry Holt publishers, New York, 1996. Guido van Rossum: Python Library Reference. May 1995. CWI Report CS-R9524. Guido van Rossum: Python Reference Manual. May 1995. CWI Report CS-R9525. Guido van Rossum: Python Tutorial. May 1995. CWI Report CS-R9526. Guido van Rossum: Extending and Embedding the Python Interpreter. May 1995. CWI Report CS-R9527. Guido van Rossum, Jelke de Boer: Linking a Stub Generator (AIL) to a Prototyping Language (Python). Spring 1991 EurOpen Conference Proceedings (May 20-24, 1991) Tromso, Norway. """ https://en.wikipedia.org/wiki/History_of_Python#Version_release_dates cites http://python-history.blogspot.com/2009/01/brief-timeline-of-python.html as a reference (for Python versions up to 2.6 and 3.0): - 0.9 - 1991 - 1.0 - 1994 - 2.0 - 2000 - 3.0 - 2008 - 3.7 - 2018 Maybe it would be most productive for us to discuss the fields in the proposed citation? ~"PSF is the author" @article{python, title={{P}ython ...}, author={Van Rossum, G. and Python Software Foundation (PSF), The.} , journal={ }, volume={ }, pages={ }, year={ } } -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Tue Sep 11 16:35:04 2018 From: chris.barker at noaa.gov (Chris Barker) Date: Tue, 11 Sep 2018 22:35:04 +0200 Subject: [Python-Dev] Official citation for Python In-Reply-To: <20180911124508.GK1596@ando.pearwood.info> References: <233a848c-c8a6-fda5-1c83-6cbf281b28eb@klix.ch> <20180911014808.GC1596@ando.pearwood.info> <20180911124508.GK1596@ando.pearwood.info> Message-ID: On Tue, Sep 11, 2018 at 2:45 PM, Steven D'Aprano wrote: > I think this thread is about *academic* citations. yes, I assumed that as well, what in any of my posts made you think otherwise? > There's a metric ton of information on the web about citing software, > there are existing standards, and I really think you are > over-complicating this. See, for example: > > https://www.software.ac.uk/how-cite-software > > http://www.citethisforme.com/cite/software > > https://openresearchsoftware.metajnl.com/about/#q12 The fact that those posts exist demonstrates that this is anything but a solved problem. Its not our job to tell academics how to cite, they already have a > number of standardized templates that they use, but it is our job to > tell them what information to fill into the template. > yes, of course -- I don't know why this thread got sidetracked into citation formats, that has nothing to do with it. 
Or as the op said, that's "the easy part" > Lets say one were to write an article about how different computer > > languages express functional programming concepts -- you may want to cite > > Python, but you are not trying to identify a specific version for > > reproducible results. > > I don't think we need to lose any sleep over how random bloggers and > Redditors informally cite Python. Why in the world would you think "article" meant random bloggers? In BiBTex, for instance, a paper in a peer reviewed journal is called an "article", as apposed to a book, or chapter, or inproceedings, or techreport, or.... As this whole thread is about academic citations, I assumed that... I think the focus here is on academic > citations, which have rather precise and standard requirements. not for software, yet. > No need > to expand the scope of this problem to arbitrary mentions of Python. > I was not expanding it -- I was hoping to contract it -- or at least better define it. > Of course it is possible that I've completely misunderstood Jackie's > request. If so, hopefully she will speak up soon. I think we're all on the same page about that, actually. My point, to be more pedantic about it, is that an academic paper might be *about* Python in some way, or it might describe work that *used* Python as a tool to accomplish some other understanding. These *may* require a different citation. And a citation that satisfies academic criteria for using Python may not be enough to assure reproducible results. > And see Wes Turner's note -- it is highly unlikely that a single citation > > to a standard document or something will be enough for reproducibility > > anyway. > > The academic community seems to think that it is. We don't have to tell > them that they're wrong. The Academic community has a really bad track record with reproducible results for computationally based research -- it is not a solved problem. And it's not a "they" -- many of us on this list are part of the academic community. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Tue Sep 11 17:01:16 2018 From: chris.barker at noaa.gov (Chris Barker) Date: Tue, 11 Sep 2018 23:01:16 +0200 Subject: [Python-Dev] Official citation for Python In-Reply-To: References: <20180910004044.GU27312@ando.pearwood.info> Message-ID: Thanks Wes. """ > Python > > Guido van Rossum: Scripting the Web with Python. In "Scripting Languages: > Automating the Web", World Wide Web Journal, Volume 2, Issue 2, Spring > 1997, O'Reilly. > > Aaron Watters, Guido van Rossum, James C. Ahlstrom: Internet Programming > with Python. MIS Press/Henry Holt publishers, New York, 1996. > > Guido van Rossum: Python Library Reference. May 1995. CWI Report CS-R9524. > > Guido van Rossum: Python Reference Manual. May 1995. CWI Report CS-R9525. > > Guido van Rossum: Python Tutorial. May 1995. CWI Report CS-R9526. > > Guido van Rossum: Extending and Embedding the Python Interpreter. May > 1995. CWI Report CS-R9527. > > Guido van Rossum, Jelke de Boer: Linking a Stub Generator (AIL) to a > Prototyping Language (Python). Spring 1991 EurOpen Conference Proceedings > (May 20-24, 1991) Tromso, Norway. 
> """ > Of these, I think the Python Reference Manual is the only one that comes close as a general citation for the language itself. And it would be a particular version, presumably. But those old versions are published as tech reports by a known institution -- easier to cite than the PSF. I've seen folks cite various academic articles to satisfy a citation for a language or library, but often that is simply because that is the one thing they could find that is citable in the standard way. Sure -- though I don't think "article" is the correct category -- probably a technical report. Looking at the current "official" docs, the copyright is held by the PSF, but I see no author mentioned (I recall it said Fred Drake many years back...) I take that back -- the PDF version has an author. So I'm thinking maybe: @techreport{techreport, author = {Guido van Rossum and the Python development team}, title = {Python Language Reference, release 3.7.0}, institution = {Python Software Foundation}, year = 2018, address = {9450 SW Gemini Dr. ECM# 90772, Beaverton, OR 97008 USA}, number = 3.7, url = {https://docs.python.org/release/3.7.0/}, } I used "Guido van Rossum and the Python development team" as the Author, as that's what it says on the PDF version. Not sure how bibliography software will deal with that... And what do we do with the version? Part of the title, or (ab)use number (or volume, or ...)? And it would be better to cite the entire standard docs, rather than just the language reference, but I'm not sure what to call it -- there is no single document with a title. And a DOI for each release would be nice. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From vstinner at redhat.com Tue Sep 11 18:23:45 2018 From: vstinner at redhat.com (Victor Stinner) Date: Wed, 12 Sep 2018 00:23:45 +0200 Subject: [Python-Dev] bpo-34595: How to format a type name? Message-ID: Hi, Last week, I opened an issue to propose to add a new %T formatter to PyUnicode_FromFormatV() and so indirectly to PyUnicode_FromFormat() and PyErr_Format(): https://bugs.python.org/issue34595 I merged my change, but then Serhiy Storchaka asked if we can add something to get the "fully qualified name" (FQN) of a type, ex "datetime.timedelta" (FQN) vs "timedelta" (what I call "short" name). I proposed a second pull request to add %t (short) in addition to %T (FQN). But then Petr Viktorin asked me to open a thread on python-dev to get a wider discussion. So here I am. The rationale for this change is to fix multiple issues: * C extensions use Py_TYPE(obj)->tp_name which returns a fully qualified name for C types, but only the name (without the module) for Python types. Python modules use type(obj).__name__ which always returns the short name. * currently, many C extensions truncate the type name: use "%.80s" instead of "%s" to format a type name * "%s" with Py_TYPE(obj)->tp_name is used more than 200 times in the C code, and I dislike this complex pattern. IMHO "%t" with obj would be simpler to read, write and maintain. * I want C extensions and Python modules to have the same behavior: respect the PEP 399. Petr considers that error messages are not part of the PEP 399, but the issue is wider than only error messages.
The main issue is that at the C level, Py_TYPE(obj)->tp_name is "usually" the fully qualified name for types defined in C, but it's only the "short" name for types defined in Python. For example, if you get the C accelerator "_datetime", PyTYPE(obj)->tp_name of a datetime.timedelta object gives you "datetime.timedelta", but if you don't have the accelerator, tp_name is just "timedelta". Another example, this script displays "mytimedelta(0)" if you have the C accelerator, but "__main__.mytimedelta(0)" if you use the Python implementation: --- import sys #sys.modules['_datetime'] = None import datetime class mytimedelta(datetime.timedelta): pass print(repr(mytimedelta())) --- So I would like to fix this kind of issue. Type names are mainly used for two purposes: * format an error message * obj.__repr__() It's unclear to me if we should use the "short" or the "fully qualified" name. It should maybe be decided on a case by case basis. There is also a 3rd usage: to implement __reduce__, here backward compatibility matters. Note: The discussion evolved since my first implementation of %T which just used the not well defined Py_TYPE(obj)->tp_name. -- Petr asked me why not exposing functions to get these names. For example, with my second PR (not merged), there are 3 (private) functions: /* type.__name__ */ const char* _PyType_Name(PyTypeObject *type); /* type.__qualname__ */ PyObject* _PyType_QualName(PyTypeObject *type); * type.__module__ "." type.__qualname__ (but type.__qualname__ for builtin types) */ PyObject * _PyType_FullName(PyTypeObject *type); My concern here is that each caller has to handler error: PyErr_Format(PyExc_TypeError, "must be str, not %.100s", Py_TYPE(obj)->tp_name); would become: PyObject *type_name = _PyType_FullName(Py_TYPE(obj)); if (name == NULL) { /* do something with this error ... */ PyErr_Format(PyExc_TypeError, "must be str, not %U", type_name); Py_DECREF(name); When I report an error, I dislike having to handle *new* errors... I prefer that the error handling is done inside PyErr_Format() for me, to reduce the risk of additional bugs. -- Serhiy also asked if we could expose the same feature at the *Python* level: provide something to get the fully qualified name of a type. It's not just f"{type(obj).__module}.{type(obj).__name__}", but you have to skip the module for builtin types like "str" (not return "builtins.str"). Maybe we can have "name: {0:t}, FQN: {0:T}".format(type(obj)). "t" for name and "T" for fully qualfied name. We would only have to modify type.__format__(). I'm not sure if we need to add new formatters to str % args. Example of Python code: raise TypeError("must be str, not %s" % type(fmt).__name__) I'm not sure about Python changes. My first concern was just to avoid Py_TYPE(obj)->tp_name at the C level. But again, we should keep C and Python consistent. If the behavior of C extensions change, Python modules should be adapted as well, to get the same behavior. Note: I reverted my change which added the %T formatter from PyUnicode_FromFormatV() to clarify the status of this issue. Victor From encukou at gmail.com Tue Sep 11 18:57:14 2018 From: encukou at gmail.com (Petr Viktorin) Date: Tue, 11 Sep 2018 15:57:14 -0700 Subject: [Python-Dev] bpo-34595: How to format a type name? 
In-Reply-To: References: Message-ID: <609ce9c8-ce1b-bfd9-958c-7baa22e2dd7d@gmail.com> On 09/11/18 15:23, Victor Stinner wrote: > Hi, > > Last week, I opened an issue to propose to add a new %T formatter to > PyUnicode_FromFormatV() and so indirectly to PyUnicode_FromFormat() > and PyErr_Format(): > > https://bugs.python.org/issue34595 > > I merged my change, but then Serhiy Storchaka asked if we can add > something to get the "fully qualified name" (FQN) of a type, ex > "datetime.timedelta" (FQN) vs "timedelta" (what I call "short" name). > I proposed a second pull request to add %t (short) in addition to %T > (FQN). > > But then Petr Viktorin asked me to open a thread on python-dev to get > a wider discussion. So here I am. > > > The rationale for this change is to fix multiple issues: > > * C extensions use Py_TYPE(obj)->tp_name which returns a fully > qualified name for C types, but the name (without the module) for > Python name. Python modules use type(obj).__name__ which always return > the short name. That might be a genuine problem, but I wonder if "%T" is fixing the symptom rather than the cause here. Or is this only an issue for PyUnicode_FromFormat()? > * currently, many C extensions truncate the type name: use "%.80s" > instead of "%s" to format a type name That's an orthogonal issue -- you can change "%.80s" to "%s", and presumably you could use "%.80t" as well. > * "%s" with Py_TYPE(obj)->tp_name is used more than 200 times in the C > code, and I dislike this complex pattern. IMHO "%t" with obj would be > simpler to read, write and maintain. I consider `Py_TYPE(obj)->tp_name` much more understandable than "%t". It's longer to spell out, but it's quite self-documenting. > * I want C extensions and Python modules to have the same behavior: > respect the PEP 399. Petr considers that error messages are not part > of the PEP 399, but the issue is wider than only error messages. The other major use is for __repr__, which AFAIK we also don't guarantee to be stable, so I don't think PEP 399 applies to it. Having the same behavior between C and Python versions of a module is nice, but PEP 399 doesn't prescribe it. There are other differences as well -- for example, `_datetime.datetime` is immutable, and that's OK. If error messages and __repr__s should be consistent between Python and the C accelerator, are you planning to write tests for all the affected modules when switching them to %T/%t? > The main issue is that at the C level, Py_TYPE(obj)->tp_name is > "usually" the fully qualified name for types defined in C, but it's > only the "short" name for types defined in Python. > > For example, if you get the C accelerator "_datetime", > PyTYPE(obj)->tp_name of a datetime.timedelta object gives you > "datetime.timedelta", but if you don't have the accelerator, tp_name > is just "timedelta". > > Another example, this script displays "mytimedelta(0)" if you have the > C accelerator, but "__main__.mytimedelta(0)" if you use the Python > implementation: > --- > import sys > #sys.modules['_datetime'] = None > import datetime > > class mytimedelta(datetime.timedelta): > pass > > print(repr(mytimedelta())) > --- > > So I would like to fix this kind of issue. > > > Type names are mainly used for two purposes: > > * format an error message > * obj.__repr__() > > It's unclear to me if we should use the "short" or the "fully > qualified" name. It should maybe be decided on a case by case basis. > > There is also a 3rd usage: to implement __reduce__, here backward > compatibility matters. 
> > > Note: The discussion evolved since my first implementation of %T which > just used the not well defined Py_TYPE(obj)->tp_name. > > -- > > Petr asked me why not exposing functions to get these names. For > example, with my second PR (not merged), there are 3 (private) > functions: > > /* type.__name__ */ > const char* _PyType_Name(PyTypeObject *type); > /* type.__qualname__ */ > PyObject* _PyType_QualName(PyTypeObject *type); > * type.__module__ "." type.__qualname__ (but type.__qualname__ for > builtin types) */ > PyObject * _PyType_FullName(PyTypeObject *type); > > My concern here is that each caller has to handler error: > > PyErr_Format(PyExc_TypeError, "must be str, not %.100s", > Py_TYPE(obj)->tp_name); > > would become: > > PyObject *type_name = _PyType_FullName(Py_TYPE(obj)); > if (name == NULL) { /* do something with this error ... */ > PyErr_Format(PyExc_TypeError, "must be str, not %U", type_name); > Py_DECREF(name); > > When I report an error, I dislike having to handle *new* errors... I > prefer that the error handling is done inside PyErr_Format() for me, > to reduce the risk of additional bugs. > > -- > > Serhiy also asked if we could expose the same feature at the *Python* > level: provide something to get the fully qualified name of a type. > It's not just f"{type(obj).__module}.{type(obj).__name__}", but you > have to skip the module for builtin types like "str" (not return > "builtins.str"). > > Maybe we can have "name: {0:t}, FQN: {0:T}".format(type(obj)). "t" for > name and "T" for fully qualfied name. We would only have to modify > type.__format__(). > > I'm not sure if we need to add new formatters to str % args. > > Example of Python code: > > raise TypeError("must be str, not %s" % type(fmt).__name__) > > I'm not sure about Python changes. My first concern was just to avoid > Py_TYPE(obj)->tp_name at the C level. But again, we should keep C and > Python consistent. If the behavior of C extensions change, Python > modules should be adapted as well, to get the same behavior. > > > Note: I reverted my change which added the %T formatter from > PyUnicode_FromFormatV() to clarify the status of this issue. > > Victor > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/encukou%40gmail.com > From python at mrabarnett.plus.com Tue Sep 11 19:06:42 2018 From: python at mrabarnett.plus.com (MRAB) Date: Wed, 12 Sep 2018 00:06:42 +0100 Subject: [Python-Dev] bpo-34595: How to format a type name? In-Reply-To: References: Message-ID: On 2018-09-11 23:23, Victor Stinner wrote: > Hi, > > Last week, I opened an issue to propose to add a new %T formatter to > PyUnicode_FromFormatV() and so indirectly to PyUnicode_FromFormat() > and PyErr_Format(): > > https://bugs.python.org/issue34595 > > I merged my change, but then Serhiy Storchaka asked if we can add > something to get the "fully qualified name" (FQN) of a type, ex > "datetime.timedelta" (FQN) vs "timedelta" (what I call "short" name). > I proposed a second pull request to add %t (short) in addition to %T > (FQN). > > But then Petr Viktorin asked me to open a thread on python-dev to get > a wider discussion. So here I am. > > > The rationale for this change is to fix multiple issues: > > * C extensions use Py_TYPE(obj)->tp_name which returns a fully > qualified name for C types, but the name (without the module) for > Python name. 
Python modules use type(obj).__name__ which always return > the short name. > > * currently, many C extensions truncate the type name: use "%.80s" > instead of "%s" to format a type name > > * "%s" with Py_TYPE(obj)->tp_name is used more than 200 times in the C > code, and I dislike this complex pattern. IMHO "%t" with obj would be > simpler to read, write and maintain. > > * I want C extensions and Python modules to have the same behavior: > respect the PEP 399. Petr considers that error messages are not part > of the PEP 399, but the issue is wider than only error messages. > > > The main issue is that at the C level, Py_TYPE(obj)->tp_name is > "usually" the fully qualified name for types defined in C, but it's > only the "short" name for types defined in Python. > > For example, if you get the C accelerator "_datetime", > PyTYPE(obj)->tp_name of a datetime.timedelta object gives you > "datetime.timedelta", but if you don't have the accelerator, tp_name > is just "timedelta". > > Another example, this script displays "mytimedelta(0)" if you have the > C accelerator, but "__main__.mytimedelta(0)" if you use the Python > implementation: > --- > import sys > #sys.modules['_datetime'] = None > import datetime > > class mytimedelta(datetime.timedelta): > pass > > print(repr(mytimedelta())) > --- > > So I would like to fix this kind of issue. > > > Type names are mainly used for two purposes: > > * format an error message > * obj.__repr__() > > It's unclear to me if we should use the "short" or the "fully > qualified" name. It should maybe be decided on a case by case basis. > > There is also a 3rd usage: to implement __reduce__, here backward > compatibility matters. > > > Note: The discussion evolved since my first implementation of %T which > just used the not well defined Py_TYPE(obj)->tp_name. > > -- > > Petr asked me why not exposing functions to get these names. For > example, with my second PR (not merged), there are 3 (private) > functions: > > /* type.__name__ */ > const char* _PyType_Name(PyTypeObject *type); > /* type.__qualname__ */ > PyObject* _PyType_QualName(PyTypeObject *type); > * type.__module__ "." type.__qualname__ (but type.__qualname__ for > builtin types) */ > PyObject * _PyType_FullName(PyTypeObject *type); > > My concern here is that each caller has to handler error: > > PyErr_Format(PyExc_TypeError, "must be str, not %.100s", > Py_TYPE(obj)->tp_name); > > would become: > > PyObject *type_name = _PyType_FullName(Py_TYPE(obj)); > if (name == NULL) { /* do something with this error ... */ > PyErr_Format(PyExc_TypeError, "must be str, not %U", type_name); > Py_DECREF(name); > > When I report an error, I dislike having to handle *new* errors... I > prefer that the error handling is done inside PyErr_Format() for me, > to reduce the risk of additional bugs. > > -- > > Serhiy also asked if we could expose the same feature at the *Python* > level: provide something to get the fully qualified name of a type. > It's not just f"{type(obj).__module}.{type(obj).__name__}", but you > have to skip the module for builtin types like "str" (not return > "builtins.str"). > > Maybe we can have "name: {0:t}, FQN: {0:T}".format(type(obj)). "t" for > name and "T" for fully qualfied name. We would only have to modify > type.__format__(). > > I'm not sure if we need to add new formatters to str % args. > > Example of Python code: > > raise TypeError("must be str, not %s" % type(fmt).__name__) > > I'm not sure about Python changes. 
My first concern was just to avoid > Py_TYPE(obj)->tp_name at the C level. But again, we should keep C and > Python consistent. If the behavior of C extensions change, Python > modules should be adapted as well, to get the same behavior. > > > Note: I reverted my change which added the %T formatter from > PyUnicode_FromFormatV() to clarify the status of this issue. > I'm not sure about having 2 different, though similar, format codes for 2 similar, though slightly different, cases. (And, for all we know, we might want to use "%t" at some later date for something else.) Perhaps we could have a single format code plus an optional '#' for the "alternate form": %T for short form %#T for fully qualified name From guido at python.org Tue Sep 11 19:35:23 2018 From: guido at python.org (Guido van Rossum) Date: Tue, 11 Sep 2018 16:35:23 -0700 Subject: [Python-Dev] bpo-34595: How to format a type name? In-Reply-To: References: Message-ID: FWIW, I personally think this went to python-dev prematurely. This is a shallow problem and IMO doesn't need all that much handwringing. Yes, there are a few different alternatives. So list them all concisely in the tracker and think about it for a few minutes and then pick one. No matter what's picked it'll be better than the status quo -- which is that printing a class name produces either a full or a short name depending on whether it's defined in C or Python, and there's no simple pattern (in C) to print either the full or the short name. On Tue, Sep 11, 2018 at 4:09 PM MRAB wrote: > On 2018-09-11 23:23, Victor Stinner wrote: > > Hi, > > > > Last week, I opened an issue to propose to add a new %T formatter to > > PyUnicode_FromFormatV() and so indirectly to PyUnicode_FromFormat() > > and PyErr_Format(): > > > > https://bugs.python.org/issue34595 > > > > I merged my change, but then Serhiy Storchaka asked if we can add > > something to get the "fully qualified name" (FQN) of a type, ex > > "datetime.timedelta" (FQN) vs "timedelta" (what I call "short" name). > > I proposed a second pull request to add %t (short) in addition to %T > > (FQN). > > > > But then Petr Viktorin asked me to open a thread on python-dev to get > > a wider discussion. So here I am. > > > > > > The rationale for this change is to fix multiple issues: > > > > * C extensions use Py_TYPE(obj)->tp_name which returns a fully > > qualified name for C types, but the name (without the module) for > > Python name. Python modules use type(obj).__name__ which always return > > the short name. > > > > * currently, many C extensions truncate the type name: use "%.80s" > > instead of "%s" to format a type name > > > > * "%s" with Py_TYPE(obj)->tp_name is used more than 200 times in the C > > code, and I dislike this complex pattern. IMHO "%t" with obj would be > > simpler to read, write and maintain. > > > > * I want C extensions and Python modules to have the same behavior: > > respect the PEP 399. Petr considers that error messages are not part > > of the PEP 399, but the issue is wider than only error messages. > > > > > > The main issue is that at the C level, Py_TYPE(obj)->tp_name is > > "usually" the fully qualified name for types defined in C, but it's > > only the "short" name for types defined in Python. > > > > For example, if you get the C accelerator "_datetime", > > PyTYPE(obj)->tp_name of a datetime.timedelta object gives you > > "datetime.timedelta", but if you don't have the accelerator, tp_name > > is just "timedelta". 
> > > > Another example, this script displays "mytimedelta(0)" if you have the > > C accelerator, but "__main__.mytimedelta(0)" if you use the Python > > implementation: > > --- > > import sys > > #sys.modules['_datetime'] = None > > import datetime > > > > class mytimedelta(datetime.timedelta): > > pass > > > > print(repr(mytimedelta())) > > --- > > > > So I would like to fix this kind of issue. > > > > > > Type names are mainly used for two purposes: > > > > * format an error message > > * obj.__repr__() > > > > It's unclear to me if we should use the "short" or the "fully > > qualified" name. It should maybe be decided on a case by case basis. > > > > There is also a 3rd usage: to implement __reduce__, here backward > > compatibility matters. > > > > > > Note: The discussion evolved since my first implementation of %T which > > just used the not well defined Py_TYPE(obj)->tp_name. > > > > -- > > > > Petr asked me why not exposing functions to get these names. For > > example, with my second PR (not merged), there are 3 (private) > > functions: > > > > /* type.__name__ */ > > const char* _PyType_Name(PyTypeObject *type); > > /* type.__qualname__ */ > > PyObject* _PyType_QualName(PyTypeObject *type); > > * type.__module__ "." type.__qualname__ (but type.__qualname__ for > > builtin types) */ > > PyObject * _PyType_FullName(PyTypeObject *type); > > > > My concern here is that each caller has to handler error: > > > > PyErr_Format(PyExc_TypeError, "must be str, not %.100s", > > Py_TYPE(obj)->tp_name); > > > > would become: > > > > PyObject *type_name = _PyType_FullName(Py_TYPE(obj)); > > if (name == NULL) { /* do something with this error ... */ > > PyErr_Format(PyExc_TypeError, "must be str, not %U", type_name); > > Py_DECREF(name); > > > > When I report an error, I dislike having to handle *new* errors... I > > prefer that the error handling is done inside PyErr_Format() for me, > > to reduce the risk of additional bugs. > > > > -- > > > > Serhiy also asked if we could expose the same feature at the *Python* > > level: provide something to get the fully qualified name of a type. > > It's not just f"{type(obj).__module}.{type(obj).__name__}", but you > > have to skip the module for builtin types like "str" (not return > > "builtins.str"). > > > > Maybe we can have "name: {0:t}, FQN: {0:T}".format(type(obj)). "t" for > > name and "T" for fully qualfied name. We would only have to modify > > type.__format__(). > > > > I'm not sure if we need to add new formatters to str % args. > > > > Example of Python code: > > > > raise TypeError("must be str, not %s" % type(fmt).__name__) > > > > I'm not sure about Python changes. My first concern was just to avoid > > Py_TYPE(obj)->tp_name at the C level. But again, we should keep C and > > Python consistent. If the behavior of C extensions change, Python > > modules should be adapted as well, to get the same behavior. > > > > > > Note: I reverted my change which added the %T formatter from > > PyUnicode_FromFormatV() to clarify the status of this issue. > > > I'm not sure about having 2 different, though similar, format codes for > 2 similar, though slightly different, cases. (And, for all we know, we > might want to use "%t" at some later date for something else.) 
> > Perhaps we could have a single format code plus an optional '#' for the > "alternate form": > > %T for short form > %#T for fully qualified name > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From barry at python.org Tue Sep 11 20:21:01 2018 From: barry at python.org (Barry Warsaw) Date: Tue, 11 Sep 2018 17:21:01 -0700 Subject: [Python-Dev] bpo-34595: How to format a type name? In-Reply-To: References: Message-ID: MRAB wrote on 9/11/18 16:06: > On 2018-09-11 23:23, Victor Stinner wrote: >> Last week, I opened an issue to propose to add a new %T formatter to >> PyUnicode_FromFormatV() and so indirectly to PyUnicode_FromFormat() >> and PyErr_Format(): >> >> ??? https://bugs.python.org/issue34595 >> >> I merged my change, but then Serhiy Storchaka asked if we can add >> something to get the "fully qualified name" (FQN) of a type, ex >> "datetime.timedelta" (FQN) vs "timedelta" (what I call "short" name). >> I proposed a second pull request to add %t (short) in addition to %T >> (FQN). >> >> But then Petr Viktorin asked me to open a thread on python-dev to get >> a wider discussion. So here I am. +1 for adding format specs for the types, and for giving a consistent way to specify which you want. %t (short) and %T (long) do seem like the logical choices, and I think in context it'll be pretty evident what they mean. > Perhaps we could have a single format code plus an optional '#' for the > "alternate form": > > %T for short form > %#T for fully qualified name OTOH, if %T and variants meant "type" but %t mean something entirely different, that *would* probably be confusing. -Barry From encukou at gmail.com Tue Sep 11 20:54:22 2018 From: encukou at gmail.com (Petr Viktorin) Date: Tue, 11 Sep 2018 17:54:22 -0700 Subject: [Python-Dev] bpo-34595: How to format a type name? In-Reply-To: References: Message-ID: <061c208a-b022-67f3-fc07-14a4501896ec@gmail.com> On 09/11/18 15:23, Victor Stinner wrote: > Hi, > > Last week, I opened an issue to propose to add a new %T formatter to > PyUnicode_FromFormatV() and so indirectly to PyUnicode_FromFormat() > and PyErr_Format(): > > https://bugs.python.org/issue34595 > > I merged my change, but then Serhiy Storchaka asked if we can add > something to get the "fully qualified name" (FQN) of a type, ex > "datetime.timedelta" (FQN) vs "timedelta" (what I call "short" name). > I proposed a second pull request to add %t (short) in addition to %T > (FQN). > > But then Petr Viktorin asked me to open a thread on python-dev to get > a wider discussion. So here I am. After a discussion with Victor. I'll summarize where we are now. There are actually two inconsistencies to fix: - Python modules use `type(obj).__name__` and C extensions use `Py_TYPE(obj)->tp_name`, which inconsistent. - Usage __name__ or __qualname__, and prepending __module__ or not, is inconsistent across types/modules. It turns out that today, when you want to print out a type name, you nearly always want the fully qualified name (including the module unless it's "builtins"). So we can just have "%T" and not "%t". (Or we can add "%t" if a use case arises). It should be possible to do this also in Python, preferably using a name similar to "%T". 
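As a rough illustration of the rule described above (module plus qualified name, with "builtins" omitted), a pure-Python helper would look something like the sketch below; the function name fully_qualified_name() is hypothetical, not an existing API:

---
# Sketch of the "fully qualified name" rule discussed above:
# module + qualname, with the "builtins" module left out.
import collections

def fully_qualified_name(tp):
    module = tp.__module__
    if module == "builtins":
        return tp.__qualname__
    return f"{module}.{tp.__qualname__}"

class Spam:
    pass

print(fully_qualified_name(str))                      # str
print(fully_qualified_name(collections.OrderedDict))  # collections.OrderedDict
print(fully_qualified_name(Spam))                     # __main__.Spam
---

Something like this is what a "T" format code on type.__format__() would have to compute.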
Most of the usage is in error messages and __repr__, where we don't need to worry about compatibility too much. It should be possible to get the name if you have the type object, but not an instance of it. So, the proposed `PyUnicode_FromFormat("%T", obj)` is incomplete -- if we go that way, we'll also need a function like PyType_GetFullName. Making "%T" work on the type, e.g. `PyUnicode_FromFormat("%T", Py_TYPE(obj))`, would be more general. --- So, I propose adding a "%T" formatter to PyUnicode_FromFormat, to be used like this: PyUnicode_FromFormat("%T", Py_TYPE(obj)) and a "T" format code for type.__format__, to be used like this: f"{type(obj):T}" From ethan at stoneleaf.us Tue Sep 11 21:00:57 2018 From: ethan at stoneleaf.us (Ethan Furman) Date: Tue, 11 Sep 2018 18:00:57 -0700 Subject: [Python-Dev] bpo-34595: How to format a type name? In-Reply-To: References: Message-ID: On 09/11/2018 05:21 PM, Barry Warsaw wrote: > MRAB wrote on 9/11/18 16:06: >> Perhaps we could have a single format code plus an optional '#' for >> the "alternate form": >> >> %T for short form >> %#T for fully qualified name > > OTOH, if %T and variants meant "type" but %t mean something entirely > different, that *would* probably be confusing. I think folks used to %-formatting are already used to un-related but similar codes (and related but dissimilar): - %M for minute - %m for month (or maybe I have that backwards) - %H for 24-hour clock - %I for 12-hour clock - %w for weekday as decimal number - %W for week number of the year I always have to look it up. :( -- ~Ethan~ From python at mrabarnett.plus.com Tue Sep 11 21:57:09 2018 From: python at mrabarnett.plus.com (MRAB) Date: Wed, 12 Sep 2018 02:57:09 +0100 Subject: [Python-Dev] bpo-34595: How to format a type name? In-Reply-To: References: Message-ID: <40ba772e-430f-bf4a-72f5-facbc42adb09@mrabarnett.plus.com> On 2018-09-12 02:00, Ethan Furman wrote: > On 09/11/2018 05:21 PM, Barry Warsaw wrote: >> MRAB wrote on 9/11/18 16:06: > >>> Perhaps we could have a single format code plus an optional '#' for >>> the "alternate form": >>> >>> %T for short form >>> %#T for fully qualified name >> >> OTOH, if %T and variants meant "type" but %t mean something entirely >> different, that *would* probably be confusing. > > I think folks used to %-formatting are already used to un-related but > similar codes (and related but dissimilar): > > - %M for minute > - %m for month (or maybe I have that backwards) > - %H for 24-hour clock > - %I for 12-hour clock > - %w for weekday as decimal number > - %W for week number of the year > > I always have to look it up. :( > Well, for the time of day (24-hour) it's %H:%M:%S, all uppercase. From storchaka at gmail.com Wed Sep 12 01:53:24 2018 From: storchaka at gmail.com (Serhiy Storchaka) Date: Wed, 12 Sep 2018 08:53:24 +0300 Subject: [Python-Dev] bpo-34595: How to format a type name? In-Reply-To: References: Message-ID: 12.09.18 01:23, Victor Stinner ????: > But then Petr Viktorin asked me to open a thread on python-dev to get > a wider discussion. So here I am. Thank you for opening this discussion Victor. I wanted to do it myself, but you have wrote much better description of the problem. See also related issues: https://bugs.python.org/issue21861 https://bugs.python.org/issue22032 (solved) https://bugs.python.org/issue22033 (solved) https://bugs.python.org/issue27541 (solved) https://bugs.python.org/issue28062 There were also attempts to change repr/str of types so that they return just a FQN. 
It would help to solve the issue from the Python side. This idea was initially suggested by Guido, but later he changed his mind. > The rationale for this change is to fix multiple issues: > > * C extensions use Py_TYPE(obj)->tp_name which returns a fully > qualified name for C types, but the name (without the module) for > Python name. Python modules use type(obj).__name__ which always return > the short name. Sometimes Python modules use the FQN, but this is not common, and the code is cumbersome. It is more common to use obj.__class__ instead of type(obj); the difference is intentionally ignored. > * currently, many C extensions truncate the type name: use "%.80s" > instead of "%s" to format a type name AFAIK the rationale for this in PyUnicode_FromFormat() is that if you have a corrupted type object, tp_name can point to an arbitrary place in memory, and an attempt to interpret it as a null-terminated string can output a large amount of trash. It is better to get a truncated type name in an error message (names of real types are usually below that limit) than to get tons of trash or an error in the attempt to format it. > Maybe we can have "name: {0:t}, FQN: {0:T}".format(type(obj)). "t" for > name and "T" for fully qualfied name. We would only have to modify > type.__format__(). This will make the feature inconsistent between Python and C. In Python, the argument is a type; in C it is an instance of the type. We need a way to format a FQN in C for types themselves. It is a less common case, but using _PyType_FullName() for it is very inconvenient, as you have shown above. From steve at pearwood.info Wed Sep 12 02:58:03 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 12 Sep 2018 16:58:03 +1000 Subject: [Python-Dev] Official citation for Python In-Reply-To: References: <233a848c-c8a6-fda5-1c83-6cbf281b28eb@klix.ch> <20180911014808.GC1596@ando.pearwood.info> <20180911124508.GK1596@ando.pearwood.info> Message-ID: <20180912065802.GN1596@ando.pearwood.info> On Tue, Sep 11, 2018 at 10:35:04PM +0200, Chris Barker wrote: > On Tue, Sep 11, 2018 at 2:45 PM, Steven D'Aprano > wrote: > > > I think this thread is about *academic* citations. > > yes, I assumed that as well, what in any of my posts made you think > otherwise? When you started talking about *articles* rather than *papers*. Articles are far more general and include anything up to and including posts on Reddit. Anyway, until Jackie returns to clarify what precisely she hopes to gain, I don't think further discussion on-list is warranted. -- Steve From storchaka at gmail.com Wed Sep 12 04:33:14 2018 From: storchaka at gmail.com (Serhiy Storchaka) Date: Wed, 12 Sep 2018 11:33:14 +0300 Subject: [Python-Dev] Stop automerging In-Reply-To: <42908G1267zFrdW@mail.python.org> References: <42908G1267zFrdW@mail.python.org> Message-ID: 12.09.18 01:34, Miss Islington (bot) wrote: > https://github.com/python/cpython/commit/d13e59c1b512069d90efe7ee9b613d3913e79c56 > commit: d13e59c1b512069d90efe7ee9b613d3913e79c56 > branch: master > author: Benjamin Peterson > committer: Miss Islington (bot) <31488909+miss-islington at users.noreply.github.com> > date: 2018-09-11T15:29:57-07:00 > summary: > > Make sure the line comes from the same node as the col offset. (GH-9189) > > Followup to 90fc8980bbcc5c7dcced3627fe172b0bfd193a3b. > > This commit message looks awful (and it is duplicated in maintained branches). Please stop automerging to master by bots.
The reason of automating merging before is that the core dev that performs merging is responsible for editing the commit message. There were mistakes from time to time, but usually regular commiters did care of this. I often use "git log", and such commit messages spoil the history. From zachary.ware+pydev at gmail.com Wed Sep 12 05:09:18 2018 From: zachary.ware+pydev at gmail.com (Zachary Ware) Date: Wed, 12 Sep 2018 02:09:18 -0700 Subject: [Python-Dev] Stop automerging In-Reply-To: References: <42908G1267zFrdW@mail.python.org> Message-ID: It is still up to the core dev to set the message properly, but the HTML comments are invisible on GitHub until you edit the message. That bug is now fixed, though; HTML comments are stripped from the message before creating the commit. -- Zach (Top-posted in HTML from a phone) On Wed, Sep 12, 2018, 01:34 Serhiy Storchaka wrote: > 12.09.18 01:34, Miss Islington (bot) ????: > > > https://github.com/python/cpython/commit/d13e59c1b512069d90efe7ee9b613d3913e79c56 > > commit: d13e59c1b512069d90efe7ee9b613d3913e79c56 > > branch: master > > author: Benjamin Peterson > > committer: Miss Islington (bot) < > 31488909+miss-islington at users.noreply.github.com> > > date: 2018-09-11T15:29:57-07:00 > > summary: > > > > Make sure the line comes from the same node as the col offset. (GH-9189) > > > > Followup to 90fc8980bbcc5c7dcced3627fe172b0bfd193a3b. > > > > > > This commit message looks awful (and it is duplicated in maintained > branches). Please stop automerging to master by bots. The reason of > automating merging before is that the core dev that performs merging is > responsible for editing the commit message. There were mistakes from > time to time, but usually regular commiters did care of this. > > I often use "git log", and such commit messages spoil the history. > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/zachary.ware%2Bpydev%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mariatta.wijaya at gmail.com Wed Sep 12 10:16:38 2018 From: mariatta.wijaya at gmail.com (Mariatta Wijaya) Date: Wed, 12 Sep 2018 07:16:38 -0700 Subject: [Python-Dev] Stop automerging In-Reply-To: References: <42908G1267zFrdW@mail.python.org> Message-ID: Thanks Zach for fixing it quickly. Even if that bug has been fixed, per my instructions to python-committers, core devs should still edit the PR title and PR description *before* adding the '? automerge' label. The YouTube video (link in python-committers email) shows to edit those. The PR title and body will be used as the squashed commit message. And remember, you can still merge the PR manually. Just don't apply the '? automerge' label. On Wed, Sep 12, 2018, 2:09 AM Zachary Ware wrote: > It is still up to the core dev to set the message properly, but the HTML > comments are invisible on GitHub until you edit the message. That bug is > now fixed, though; HTML comments are stripped from the message before > creating the commit. > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From benjamin at python.org Wed Sep 12 11:15:46 2018 From: benjamin at python.org (Benjamin Peterson) Date: Wed, 12 Sep 2018 08:15:46 -0700 Subject: [Python-Dev] Stop automerging In-Reply-To: References: <42908G1267zFrdW@mail.python.org> Message-ID: <1536765346.1917281.1505707984.5FAF6C18@webmail.messagingengine.com> On Wed, Sep 12, 2018, at 01:33, Serhiy Storchaka wrote: > 12.09.18 01:34, Miss Islington (bot) ????: > > https://github.com/python/cpython/commit/d13e59c1b512069d90efe7ee9b613d3913e79c56 > > commit: d13e59c1b512069d90efe7ee9b613d3913e79c56 > > branch: master > > author: Benjamin Peterson > > committer: Miss Islington (bot) <31488909+miss-islington at users.noreply.github.com> > > date: 2018-09-11T15:29:57-07:00 > > summary: > > > > Make sure the line comes from the same node as the col offset. (GH-9189) > > > > Followup to 90fc8980bbcc5c7dcced3627fe172b0bfd193a3b. > > > > > > This commit message looks awful (and it is duplicated in maintained > branches). Please stop automerging to master by bots. The reason of > automating merging before is that the core dev that performs merging is > responsible for editing the commit message. There were mistakes from > time to time, but usually regular commiters did care of this. (Just checking) Is there something wrong with this message besides the <-- comment? From vstinner at redhat.com Wed Sep 12 11:56:31 2018 From: vstinner at redhat.com (Victor Stinner) Date: Wed, 12 Sep 2018 17:56:31 +0200 Subject: [Python-Dev] Stop automerging In-Reply-To: <1536765346.1917281.1505707984.5FAF6C18@webmail.messagingengine.com> References: <42908G1267zFrdW@mail.python.org> <1536765346.1917281.1505707984.5FAF6C18@webmail.messagingengine.com> Message-ID: Hi Benjamin, https://github.com/python/cpython/commit/d13e59c1b512069d90efe7ee9b613d3913e79c56 Le mer. 12 sept. 2018 ? 17:19, Benjamin Peterson a ?crit : > (Just checking) Is there something wrong with this message besides the <-- comment? Since the commit is described as a follow-up of 90fc8980bbcc5c7dcced3627fe172b0bfd193a3b, maybe it could also include "bpo-31902: " prefix in it's title. Just to ease following where the change comes from. IMHO the main complain of Serhiy was the giant comment ;-) I agree that this one has to go, and hopefully it's already fixed. We are now good! Victor From vstinner at redhat.com Wed Sep 12 12:02:03 2018 From: vstinner at redhat.com (Victor Stinner) Date: Wed, 12 Sep 2018 18:02:03 +0200 Subject: [Python-Dev] Stop automerging In-Reply-To: References: <42908G1267zFrdW@mail.python.org> <1536765346.1917281.1505707984.5FAF6C18@webmail.messagingengine.com> Message-ID: Oh yes, one issue of missing bpo-xxx is that bots don't report merged commits into the bpo. I like using the bpo issue to track backports: https://bugs.python.org/issue31902 It was just a general remark, it's fine for these commits. Someone can add them manually to the bpo if you want. Victor Le mer. 12 sept. 2018 ? 17:56, Victor Stinner a ?crit : > > Hi Benjamin, > > https://github.com/python/cpython/commit/d13e59c1b512069d90efe7ee9b613d3913e79c56 > > Le mer. 12 sept. 2018 ? 17:19, Benjamin Peterson a ?crit : > > (Just checking) Is there something wrong with this message besides the <-- comment? > > Since the commit is described as a follow-up of > 90fc8980bbcc5c7dcced3627fe172b0bfd193a3b, maybe it could also include > "bpo-31902: " prefix in it's title. Just to ease following where the > change comes from. 
> > IMHO the main complain of Serhiy was the giant comment > ;-) I agree that this one has to go, and hopefully it's already fixed. > We are now good! > > Victor From encukou at gmail.com Wed Sep 12 20:26:47 2018 From: encukou at gmail.com (Petr Viktorin) Date: Wed, 12 Sep 2018 17:26:47 -0700 Subject: [Python-Dev] PEP 579 and PEP 580: refactoring C functions and methods In-Reply-To: <5B2A15FE.4000608@UGent.be> References: <5B2A15FE.4000608@UGent.be> Message-ID: On 06/20/18 01:53, Jeroen Demeyer wrote: > Hello, > > Let me present PEP 579 and PEP 580. > > PEP 579 is an informational meta-PEP, listing some of the issues with > functions/methods implemented in C. The idea is to create several PEPs > each fix some part of the issues mentioned in PEP 579. > > PEP 580 is a standards track PEP to introduce a new "C call" protocol, > which is an important part of PEP 579. In the reference implementation > (which is work in progress), this protocol will be used by built-in > functions and methods. However, it should be used by more classes in the > future. > > You find the texts at > https://www.python.org/dev/peps/pep-0579 > https://www.python.org/dev/peps/pep-0580 Hi! I finally had time to read the PEPs carefully. Overall, great work! PEP 580 does look complicated, but it's well thought out and addresses real problems. I think the main advantage over the competing PEP 576 is that it's a better foundation for solving Cython (and other C-API users) and my PEP 573 (module state access from methods). With that, I do have some comments. The reference to PEP 573 is premature. If PEP 580 is implemented then PEP 573 will build on top, and I don't plan to update PEP 573 before that. So, I think 580 should be independent. If you agree I can summarize rationale for "parent", as much as it concerns 580. # Using tp_print The tp_print gimmick is my biggest worry. AFAIK there's no guarantee that a function pointer and Py_ssize_t are the same size. That makes the backwards-compatibility typedef in the implementation is quite worrying: typedef Py_ssize_t printfunc I can see the benefit for backporting to earlier Python versions, and maybe that outweighs worries about exotic architectures, but the PEP should at least have more words on why this is not a problem. # The C Call protocol I really like the fact that, in the reference implementation, the flags are arranged in a way that allows a switch statement to select what to call. That should be noted, if only to explain why there's no guarantee of compatibility between Python versions. # Descriptor behavior I'd say "SHOULD" rather than "MUST" here. The section describes how to implement expected/reasonable behavior, but I see no need to limit that. "if func supports the C call protocol, then func.__set__ must not be implemented." -- also, __delete__ should not be implemented, right?. # Generic API functions I'm a bit worried about PyCCall_FASTCALL's "kwds" argument accepting a dict, which is mutable. I wouldn't mind dropping that capability, but if it stays, we need to require that the callable promises to not modify it. PyCCall_FASTCALL is not a macro, shouldn't it be named PyCCall_FastCall? # C API functions The function PyCFunction_GetFlags is, for better or worse, part of the stable ABI. We shouldn't just give up on it. I'm fine with documenting that it shouldn't be used, but for functions defined using PyCFunction_New etc. it should continue behaving as before. 
One solution could be to preserve the "definition time" METH_* flags in the 0xFFF bits of cc_flags and use the other bits for CCALL_*. # Stable ABI The section should repeat that PyCFunction_ClsNew is added to the stable ABI (but nothing else). From vstinner at redhat.com Wed Sep 12 20:33:23 2018 From: vstinner at redhat.com (Victor Stinner) Date: Thu, 13 Sep 2018 02:33:23 +0200 Subject: [Python-Dev] bpo-34595: How to format a type name? In-Reply-To: References: Message-ID: Hi, For the type name, sometimes, we only get a type (not an instance), and we want to format its FQN. IMHO we need to provide ways to format the FQN of a type for *types* and for *instances*. Here is my proposal: * Add !t conversion to format string * Add ":T" format to type.__format__() * Add "%t" and "%T" formatters to PyUnicode_FromUnicodeV() * Add a read-only type.__fqn__ property # Python: "!t" for instance raise TypeError(f"must be str, not {obj!t}") /* C: "%t" for instance */ PyErr_Format(PyExc_TypeError, "must be str, not %t", obj); /* C: "%T" for type */ PyErr_Format(PyExc_TypeError, "must be str, not %T", mytype); # Python: ":T" for type raise TypeError(f"must be str, not {mytype!T}") Open question: Should we also add "%t" and "%T" formatters to the str % args operator at the Python level? I have a proof-of-concept implementation: https://github.com/python/cpython/pull/9251 Victor Victor From turnbull.stephen.fw at u.tsukuba.ac.jp Wed Sep 12 23:59:44 2018 From: turnbull.stephen.fw at u.tsukuba.ac.jp (Stephen J. Turnbull) Date: Thu, 13 Sep 2018 12:59:44 +0900 Subject: [Python-Dev] Official citation for Python In-Reply-To: References: <233a848c-c8a6-fda5-1c83-6cbf281b28eb@klix.ch> Message-ID: <23449.57520.710702.426310@turnbull.sk.tsukuba.ac.jp> Chris Barker via Python-Dev writes: > But "I wrote some code in Python to produce these statistics" -- > does that need a citation? That depends on what you mean by "statistics" and whether (as one should) one makes the code available. If the code is published or "available on request", definitely, Python should be cited. If not, and by "statistics" you mean the kind of things provided by Steven d'Aprano's excellent statistics module (mean, median, standard deviation, etc), maybe no citation is needed. But anything more esoteric than that (even linear regression), yeah, I would say you should cite both Python and any reference you used to learn the algorithm or formulas, in the context of mentioning that your statistics are home-brew, not produced by one of the recognized applications for doing so. > If so, maybe that would take a different form. Yes, it would. But not so different: eg, version is analogous to edition when citing a book. > Anyway, hard to make this decision without some idea how the > citation is intended to be used. Same as any other citation, (1) to give credit to those responsible for providing a resource (this is why publishers and their metadata of city are still conventionally included), and (2) to show where that resource can be obtained. AFAICS, both motivations are universally applicable in polite society. NB: Replication is an important reason for wanting to acquire the resource, but it's not the only one. I think underlying your comment is the question of *what* resource is being cited. I can think of three offhand that might be characterized as "Python". First, the PSF, as a provider of funding. 
There is a conventional form for this: a footnote on the title or author's name saying "The author acknowledges [a] grant [grant identifier if available] from the Python Software Foundation." I usually orally mention them in presentations, too. That one's easy; *everybody* should *always* do that. The rest of these, sort of an ideal to strive for. If you keep a bibliographic database, and there are now quite a few efforts to crowd source them, it's easier to go the whole 9 yards than to skimp. But except in cases where we don't need to even mention the code, probably we should be citing, for reasons of courtesy to readers as well as authors, editors, and publishers (as disgusting as many publishers are as members of society, they do play a role in providing many resources ---we should find ways to compete them into good behavior, not ostracize them). The second is the Python *language and standard library*. Then the Language Reference and/or the Library Reference should be cited briefly when Python is first mentioned, and in the text introducing a program or program fragment, with a full citation in the bibliography. I tentatively suggest that the metadata for the Language Reference would be Author: principal author(s) (Guido?) et al. OR python.org OR Python Contributors Title: The Python Language Reference Version: to match Python version used (if relevant, different versions each get full citations), probably should not be "current" Publisher: Python Software Foundation Date: of the relevant version Location: City of legal address of PSF URL: to version used (probably should not be the default) Date accessed: if "current" was used The Library reference would be the same except for Title. The third is a *particular implementation*. In that case the metadata would be Author: principal author(s) (Guido) et al. OR python.org OR Python Contributors Title: The cPython Python distribution Python Version: as appropriate (if relevant, different versions each get full citations), never "current" Distributor Version: if different from Python version (eg, additional Debian cruft) Publisher: Distributor (eg, PSF, Debian Project, Anaconda Inc.) Date: of the relevant version Location: City of legal address of distributor If downloaded: URL: to version used (including git commit SHA1 if available) Date accessed: download from distributor, not installation date If received on physical medium: use the "usual" form of citation for a collection of individual works (even if Python was the only thing on it). Probably the only additional information needed would be the distributor as editor of the collection and the name of the collection. In most cases I can think of, if the implementation is cited, the Language and Library References should be cited, too. Finally, if Python or components were modified for the project, the modified version should be preserved in a repository and a VCS identifier provided. This does not imply the repository need be publicly accessible, of course, although it might be for other reasons (eg, in a GSoC project,wherever or if hosted for free on GitHub). I doubt that "URNs" like DOI and ISBN are applicable, but if available they should be included in all cases as well. 
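As a practical aside, the interpreter details such a citation needs (implementation, version, build date, compiler) can be read off at run time from the standard library; a minimal sketch, using only platform and sys:

import platform
import sys

# Everything below is standard library; recording it alongside the analysis
# makes it easy to name the exact interpreter that was used.
print(platform.python_implementation())   # e.g. "CPython"
print(platform.python_version())          # e.g. "3.7.0"
print(platform.python_build())            # (build number, build date)
print(platform.python_compiler())         # e.g. "GCC 8.1.1"
print(sys.version)                        # full version and build string

None of this replaces the bibliographic entry, of course; it only makes the "Version" and "Date" fields above easy to get right.
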
Steve From wes.turner at gmail.com Thu Sep 13 00:41:26 2018 From: wes.turner at gmail.com (Wes Turner) Date: Thu, 13 Sep 2018 00:41:26 -0400 Subject: [Python-Dev] Official citation for Python In-Reply-To: <23449.57520.710702.426310@turnbull.sk.tsukuba.ac.jp> References: <233a848c-c8a6-fda5-1c83-6cbf281b28eb@klix.ch> <23449.57520.710702.426310@turnbull.sk.tsukuba.ac.jp> Message-ID: Do you guys think we should all cite Grub and BusyBox and bash and libc and setuptools and pip and openssl and GNU/Linux and LXC and Docker; or else it's plagiarism for us all? #OpenAccess On Wednesday, September 12, 2018, Stephen J. Turnbull < turnbull.stephen.fw at u.tsukuba.ac.jp> wrote: > Chris Barker via Python-Dev writes: > > > But "I wrote some code in Python to produce these statistics" -- > > does that need a citation? > > That depends on what you mean by "statistics" and whether (as one > should) one makes the code available. If the code is published or > "available on request", definitely, Python should be cited. If not, > and by "statistics" you mean the kind of things provided by Steven > d'Aprano's excellent statistics module (mean, median, standard > deviation, etc), maybe no citation is needed. But anything more > esoteric than that (even linear regression), yeah, I would say you > should cite both Python and any reference you used to learn the > algorithm or formulas, in the context of mentioning that your > statistics are home-brew, not produced by one of the recognized > applications for doing so. > > > If so, maybe that would take a different form. > > Yes, it would. But not so different: eg, version is analogous to > edition when citing a book. > > > Anyway, hard to make this decision without some idea how the > > citation is intended to be used. > > Same as any other citation, (1) to give credit to those responsible > for providing a resource (this is why publishers and their metadata of > city are still conventionally included), and (2) to show where that > resource can be obtained. AFAICS, both motivations are universally > applicable in polite society. NB: Replication is an important reason > for wanting to acquire the resource, but it's not the only one. > > I think underlying your comment is the question of *what* resource is > being cited. I can think of three offhand that might be characterized > as "Python". First, the PSF, as a provider of funding. There is a > conventional form for this: a footnote on the title or author's name > saying "The author acknowledges [a] > grant [grant identifier if available] from the Python Software > Foundation." I usually orally mention them in presentations, too. > That one's easy; *everybody* should *always* do that. > > The rest of these, sort of an ideal to strive for. If you keep a > bibliographic database, and there are now quite a few efforts to crowd > source them, it's easier to go the whole 9 yards than to skimp. But > except in cases where we don't need to even mention the code, probably > we should be citing, for reasons of courtesy to readers as well as > authors, editors, and publishers (as disgusting as many publishers are > as members of society, they do play a role in providing many resources > ---we should find ways to compete them into good behavior, not > ostracize them). > > The second is the Python *language and standard library*. 
Then the > Language Reference and/or the Library Reference should be cited > briefly when Python is first mentioned, and in the text introducing a > program or program fragment, with a full citation in the bibliography. > I tentatively suggest that the metadata for the Language Reference > would be > > Author: principal author(s) (Guido?) et al. OR python.org OR > Python Contributors > Title: The Python Language Reference > Version: to match Python version used (if relevant, different > versions each get full citations), probably should not be > "current" > Publisher: Python Software Foundation > Date: of the relevant version > Location: City of legal address of PSF > URL: to version used (probably should not be the default) > Date accessed: if "current" was used > > The Library reference would be the same except for Title. > > The third is a *particular implementation*. In that case the metadata > would be > > Author: principal author(s) (Guido) et al. OR python.org OR > Python Contributors > Title: The cPython Python distribution > Python Version: as appropriate (if relevant, different versions each > get full citations), never "current" > Distributor Version: if different from Python version (eg, additional > Debian cruft) > Publisher: Distributor (eg, PSF, Debian Project, Anaconda Inc.) > Date: of the relevant version > Location: City of legal address of distributor > > If downloaded: > > URL: to version used (including git commit SHA1 if available) > Date accessed: download from distributor, not installation date > > If received on physical medium: use the "usual" form of citation for a > collection of individual works (even if Python was the only thing on > it). Probably the only additional information needed would be the > distributor as editor of the collection and the name of the > collection. > > In most cases I can think of, if the implementation is cited, the > Language and Library References should be cited, too. > > Finally, if Python or components were modified for the project, the > modified version should be preserved in a repository and a VCS > identifier provided. This does not imply the repository need be > publicly accessible, of course, although it might be for other reasons > (eg, in a GSoC project,wherever or if hosted for free on GitHub). > > I doubt that "URNs" like DOI and ISBN are applicable, but if available > they should be included in all cases as well. > > Steve > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > wes.turner%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wes.turner at gmail.com Thu Sep 13 00:46:17 2018 From: wes.turner at gmail.com (Wes Turner) Date: Thu, 13 Sep 2018 00:46:17 -0400 Subject: [Python-Dev] Official citation for Python In-Reply-To: References: <233a848c-c8a6-fda5-1c83-6cbf281b28eb@klix.ch> <23449.57520.710702.426310@turnbull.sk.tsukuba.ac.jp> Message-ID: There was a thread about adding __cite__ to things and a tool to collect those citations awhile back. "[Python-ideas] Add a __cite__ method for scientific packages" http://markmail.org/thread/rekmbmh64qxwcind Which CPython source file should contain this __cite__ value? ... On a related note, you should ask the list admin to append a URL to each mailing list message whenever this list is upgraded to mm3; so that you can all be appropriately cited. 
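To make the __cite__ idea concrete: it is purely hypothetical (no such attribute exists in CPython today), but a collection tool along the lines of that python-ideas thread could be as simple as walking sys.modules:

import sys

def collect_citations():
    # Hypothetical sketch only: __cite__ is not a real attribute today.
    # This just shows how a tool could gather citation strings from any
    # imported modules that chose to define one.
    citations = {}
    for name, module in list(sys.modules.items()):
        cite = getattr(module, "__cite__", None)
        if cite:
            citations[name] = cite
    return citations

print(collect_citations())  # empty today, since nothing defines __cite__
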
On Thursday, September 13, 2018, Wes Turner wrote: > Do you guys think we should all cite Grub and BusyBox and bash and libc > and setuptools and pip and openssl and GNU/Linux and LXC and Docker; or > else it's plagiarism for us all? > > #OpenAccess > > On Wednesday, September 12, 2018, Stephen J. Turnbull < > turnbull.stephen.fw at u.tsukuba.ac.jp> wrote: > >> Chris Barker via Python-Dev writes: >> >> > But "I wrote some code in Python to produce these statistics" -- >> > does that need a citation? >> >> That depends on what you mean by "statistics" and whether (as one >> should) one makes the code available. If the code is published or >> "available on request", definitely, Python should be cited. If not, >> and by "statistics" you mean the kind of things provided by Steven >> d'Aprano's excellent statistics module (mean, median, standard >> deviation, etc), maybe no citation is needed. But anything more >> esoteric than that (even linear regression), yeah, I would say you >> should cite both Python and any reference you used to learn the >> algorithm or formulas, in the context of mentioning that your >> statistics are home-brew, not produced by one of the recognized >> applications for doing so. >> >> > If so, maybe that would take a different form. >> >> Yes, it would. But not so different: eg, version is analogous to >> edition when citing a book. >> >> > Anyway, hard to make this decision without some idea how the >> > citation is intended to be used. >> >> Same as any other citation, (1) to give credit to those responsible >> for providing a resource (this is why publishers and their metadata of >> city are still conventionally included), and (2) to show where that >> resource can be obtained. AFAICS, both motivations are universally >> applicable in polite society. NB: Replication is an important reason >> for wanting to acquire the resource, but it's not the only one. >> >> I think underlying your comment is the question of *what* resource is >> being cited. I can think of three offhand that might be characterized >> as "Python". First, the PSF, as a provider of funding. There is a >> conventional form for this: a footnote on the title or author's name >> saying "The author acknowledges [a] >> grant [grant identifier if available] from the Python Software >> Foundation." I usually orally mention them in presentations, too. >> That one's easy; *everybody* should *always* do that. >> >> The rest of these, sort of an ideal to strive for. If you keep a >> bibliographic database, and there are now quite a few efforts to crowd >> source them, it's easier to go the whole 9 yards than to skimp. But >> except in cases where we don't need to even mention the code, probably >> we should be citing, for reasons of courtesy to readers as well as >> authors, editors, and publishers (as disgusting as many publishers are >> as members of society, they do play a role in providing many resources >> ---we should find ways to compete them into good behavior, not >> ostracize them). >> >> The second is the Python *language and standard library*. Then the >> Language Reference and/or the Library Reference should be cited >> briefly when Python is first mentioned, and in the text introducing a >> program or program fragment, with a full citation in the bibliography. >> I tentatively suggest that the metadata for the Language Reference >> would be >> >> Author: principal author(s) (Guido?) et al. 
OR python.org OR >> Python Contributors >> Title: The Python Language Reference >> Version: to match Python version used (if relevant, different >> versions each get full citations), probably should not be >> "current" >> Publisher: Python Software Foundation >> Date: of the relevant version >> Location: City of legal address of PSF >> URL: to version used (probably should not be the default) >> Date accessed: if "current" was used >> >> The Library reference would be the same except for Title. >> >> The third is a *particular implementation*. In that case the metadata >> would be >> >> Author: principal author(s) (Guido) et al. OR python.org OR >> Python Contributors >> Title: The cPython Python distribution >> Python Version: as appropriate (if relevant, different versions each >> get full citations), never "current" >> Distributor Version: if different from Python version (eg, additional >> Debian cruft) >> Publisher: Distributor (eg, PSF, Debian Project, Anaconda Inc.) >> Date: of the relevant version >> Location: City of legal address of distributor >> >> If downloaded: >> >> URL: to version used (including git commit SHA1 if available) >> Date accessed: download from distributor, not installation date >> >> If received on physical medium: use the "usual" form of citation for a >> collection of individual works (even if Python was the only thing on >> it). Probably the only additional information needed would be the >> distributor as editor of the collection and the name of the >> collection. >> >> In most cases I can think of, if the implementation is cited, the >> Language and Library References should be cited, too. >> >> Finally, if Python or components were modified for the project, the >> modified version should be preserved in a repository and a VCS >> identifier provided. This does not imply the repository need be >> publicly accessible, of course, although it might be for other reasons >> (eg, in a GSoC project,wherever or if hosted for free on GitHub). >> >> I doubt that "URNs" like DOI and ISBN are applicable, but if available >> they should be included in all cases as well. >> >> Steve >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> https://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: https://mail.python.org/mailman/options/python-dev/wes. >> turner%40gmail.com >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From walter at livinglogic.de Thu Sep 13 05:00:28 2018 From: walter at livinglogic.de (Walter =?utf-8?q?D=C3=B6rwald?=) Date: Thu, 13 Sep 2018 11:00:28 +0200 Subject: [Python-Dev] bpo-34595: How to format a type name? In-Reply-To: References: Message-ID: <9BB6B29C-A889-411A-95BA-6598379C3E28@livinglogic.de> On 13 Sep 2018, at 2:33, Victor Stinner wrote: > Hi, > > For the type name, sometimes, we only get a type (not an instance), > and we want to format its FQN. IMHO we need to provide ways to format > the FQN of a type for *types* and for *instances*. Here is my > proposal: > > * Add !t conversion to format string > * Add ":T" format to type.__format__() > * Add "%t" and "%T" formatters to PyUnicode_FromUnicodeV() As far as I can remember, the distinction between lowercase and uppercase format letter for PyUnicode_FromUnicodeV() and friends was: lowercase letters are for formatting C types (like `char *` etc.) and uppercase formatting letters are for Python types (i.e. the C type is `PyObject *`). IMHO we should keep that distinction. 
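Before going on to the next point: the "fully qualified name" (FQN) discussed in this thread is essentially the type's __module__ joined with its __qualname__, with builtins left bare. A rough pure-Python sketch of that rule (not taken from the proof-of-concept patch):

import collections

def fqn(tp):
    # Rough illustration only: module plus qualified name, with the
    # "builtins" module omitted so that int stays just "int".
    module = getattr(tp, "__module__", None)
    if module in (None, "builtins"):
        return tp.__qualname__
    return f"{module}.{tp.__qualname__}"

print(fqn(int))                      # int
print(fqn(collections.OrderedDict))  # collections.OrderedDict
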
> * Add a read-only type.__fqn__ property I like that. > # Python: "!t" for instance > raise TypeError(f"must be str, not {obj!t}") > > /* C: "%t" for instance */ > PyErr_Format(PyExc_TypeError, "must be str, not %t", obj); > > > /* C: "%T" for type */ > PyErr_Format(PyExc_TypeError, "must be str, not %T", mytype); > > # Python: ":T" for type > raise TypeError(f"must be str, not {mytype!T}") We could solve the problem with instances and classes by adding two new ! operators to str.format/f-strings and making them chainable. The !t operator would get the class of the argument and the !c operator would require a class argument and would convert it to its name (which is obj.__module__ + "." + obj.__qualname__ (or only obj.__qualname__ for builtin types)). So: >>> import pathlib >>> p = pathlib.Path("spam.py") >>> print(f"{pathlib.Path}") >>> print(f"{pathlib.Path!c}") pathlib.Path >>> print(f"{pathlib.Path!c!r}") 'pathlib.Path' >>> print(f"{p!t}") >>> print(f"{p!t!c}") pathlib.Path >>> print(f"{p!c}") Traceback (most recent call last): File "", line 1, in TypeError: object is not a class This would also give us: >>> print(f"{p!s!r}") 'spam.py' Which is different from: >>> print(f"{p}") spam.py >>> print(f"{p!r}") PosixPath('spam.py') > Open question: Should we also add "%t" and "%T" formatters to the str > % args operator at the Python level? > > I have a proof-of-concept implementation: > https://github.com/python/cpython/pull/9251 > > Victor Servus, Walter From J.Demeyer at UGent.be Thu Sep 13 05:22:36 2018 From: J.Demeyer at UGent.be (Jeroen Demeyer) Date: Thu, 13 Sep 2018 11:22:36 +0200 Subject: [Python-Dev] PEP 579 and PEP 580: refactoring C functions and methods In-Reply-To: References: <5B2A15FE.4000608@UGent.be> Message-ID: <5B9A2C5C.6040008@UGent.be> On 2018-09-13 02:26, Petr Viktorin wrote: > The reference to PEP 573 is premature. It seems to me that PEP 580 helps with the use case of PEP 573. In fact, it implements part of what PEP 573 proposes. So I don't see the problem with the reference to PEP 573. Even if the implementation of PEP 573 changes, the problem statement will remain and that's what I'm referring to. > If you agree I can > summarize rationale for "parent", as much as it concerns 580. Sure. I still think that we should refer to PEP 573, but maybe we can summarize it also in PEP 580. > # Using tp_print > > The tp_print gimmick is my biggest worry. > AFAIK there's no guarantee that a function pointer and Py_ssize_t are > the same size. I'm not actually claiming anywhere that it is the same size. > # Descriptor behavior > > I'd say "SHOULD" rather than "MUST" here. The section describes how to > implement expected/reasonable behavior, but I see no need to limit that. There *is* actually an important reason to limit it: it allows code to make assumptions on what __get__ does. This enables optimizations which wouldn't be possible otherwise. If you cannot be sure what __get__ does, then you cannot optimize obj.method(x) to type(obj).method(obj, x) > "if func supports the C call protocol, then func.__set__ must not be > implemented." -- also, __delete__ should not be implemented, right?. Indeed. I write Python but I think C API, so for me these are both really tp_descr_set. > PyCCall_FASTCALL is not a macro, shouldn't it be named PyCCall_FastCall? What's the convention for that anyway? I assumed that capital letters meant a "really know what you are doing" function which could segfault if used badly. 
For me, whether something is a function or macro is just an implementation detail (which can change between Python versions) which should not affect the naming. > # C API functions > > The function PyCFunction_GetFlags is, for better or worse, part of the > stable ABI. We shouldn't just give up on it. I'm fine with documenting > that it shouldn't be used, but for functions defined using > PyCFunction_New etc. it should continue behaving as before. > One solution could be to preserve the "definition time" METH_* flags in > the 0xFFF bits of cc_flags and use the other bits for CCALL_*. I'm fine with that if you insist. However, it would be a silly solution to formally satisfy the "stable ABI" requirement without actually helping. I agree with your other points that I didn't reply to and will make some edits to PEP 580. Jeroen. From eric at trueblade.com Thu Sep 13 10:01:08 2018 From: eric at trueblade.com (Eric V. Smith) Date: Thu, 13 Sep 2018 10:01:08 -0400 Subject: [Python-Dev] bpo-34595: How to format a type name? In-Reply-To: References: Message-ID: <55d6bf1f-208d-973d-e283-684b8da43c92@trueblade.com> On 9/12/2018 8:33 PM, Victor Stinner wrote: > Hi, > > For the type name, sometimes, we only get a type (not an instance), > and we want to format its FQN. IMHO we need to provide ways to format > the FQN of a type for *types* and for *instances*. Here is my > proposal: > > * Add !t conversion to format string I'm strongly opposed to this. This !t conversion would not be widely applicable enough to be generally useful, and would need to be exposed in the f-string and str.format() documentation, even though 99% of programmers would never need or see it. The purpose of the conversions is not to save you from making a function call when you know the type of the arguments. The purpose was specifically to convert arguments to strings so that your format specifier could always use the string formatting mini-language. It was more useful in str.format(), where the format string might be written separately (as user input or a translation, say) and not know the types of the arguments. You can (and I have!) argued that the conversions are completely unneeded in f-strings. raise TypeError(f"must be str, not {obj!t}") Should be written as: raise TypeError(f"must be str, not {type(obj)}") > * Add ":T" format to type.__format__() As you know (I've read the patch) this is just "T". I mention it here for future readers. They should understand that the ":" is a str.format() and f-string construct, and is unknown to __format__(). That said, I think this is a good idea. type.__format__() could also understand "#"? to specify qualname. > * Add "%t" and "%T" formatters to PyUnicode_FromUnicodeV() I think "T" is a good idea, but I think you're adding in obj vs type(obj) just because of the borrowed reference issue in Py_TYPE(). That issue is so much larger than string formatting the type of an object that it shouldn't be addressed here. > * Add a read-only type.__fqn__ property I'm not sure of the purpose of this. When in your examples is it used? > # Python: "!t" for instance > raise TypeError(f"must be str, not {obj!t}") > > /* C: "%t" for instance */ > PyErr_Format(PyExc_TypeError, "must be str, not %t", obj); > > > /* C: "%T" for type */ > PyErr_Format(PyExc_TypeError, "must be str, not %T", mytype); > > # Python: ":T" for type > raise TypeError(f"must be str, not {mytype!T}") > > > Open question: Should we also add "%t" and "%T" formatters to the str > % args operator at the Python level? No. 
Again, I think any formatting of type names should not be in a widely used interface, and should remain in our type-specific interface, __format__. %-formatting has no per-type extensibility, and I don't think we should start adding codes for every possible use case. Format codes for datetimes would be way more useful that %t, and I'd be opposed to adding them, too. (I realize my analogy is stretched, because every object has a type. But still.) Eric > > I have a proof-of-concept implementation: > https://github.com/python/cpython/pull/9251 > > Victor > > Victor > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/eric%2Ba-python-dev%40trueblade.com From encukou at gmail.com Thu Sep 13 14:18:11 2018 From: encukou at gmail.com (Petr Viktorin) Date: Thu, 13 Sep 2018 11:18:11 -0700 Subject: [Python-Dev] PEP 579 and PEP 580: refactoring C functions and methods In-Reply-To: <5B9A2C5C.6040008@UGent.be> References: <5B2A15FE.4000608@UGent.be> <5B9A2C5C.6040008@UGent.be> Message-ID: On 09/13/18 02:22, Jeroen Demeyer wrote: > On 2018-09-13 02:26, Petr Viktorin wrote: >> The reference to PEP 573 is premature. > > It seems to me that PEP 580 helps with the use case of PEP 573. In fact, > it implements part of what PEP 573 proposes. So I don't see the problem > with the reference to PEP 573. Even if the implementation of PEP 573 > changes, the problem statement will remain and that's what I'm referring > to. > >> If you agree I can >> summarize rationale for "parent", as much as it concerns 580. > > Sure. I still think that we should refer to PEP 573, but maybe we can > summarize it also in PEP 580. I want to make it clear that PEP 580 doesn't depend on 579. Reviewers don't need to agree with PEP 579 to accept 580. Here's my proposed rewording: https://github.com/python/peps/pull/775/files?short_path=b34f00e#diff-b34f00eeb75773c32f9b22fd7fee9771 >> # Using tp_print >> >> The tp_print gimmick is my biggest worry. >> AFAIK there's no guarantee that a function pointer and Py_ssize_t are >> the same size. > > I'm not actually claiming anywhere that it is the same size. Indeed, I was thinking ahead here. Backporting this to earlier versions of CPython will not be completely trivial, but let's leave it to Cython. >> # Descriptor behavior >> >> I'd say "SHOULD" rather than "MUST" here. The section describes how to >> implement expected/reasonable behavior, but I see no need to limit that. > > There *is* actually an important reason to limit it: it allows code to > make assumptions on what __get__ does. This enables optimizations which > wouldn't be possible otherwise. If you cannot be sure what __get__ does, > then you cannot optimize > > obj.method(x) > > to > > type(obj).method(obj, x) I see now. Yes, that's reasonable. >> "if func supports the C call protocol, then func.__set__ must not be >> implemented." -- also, __delete__ should not be implemented, right?. > > Indeed. I write Python but I think C API, so for me these are both > really tp_descr_set. > >> PyCCall_FASTCALL is not a macro, shouldn't it be named PyCCall_FastCall? > > What's the convention for that anyway? I assumed that capital letters > meant a "really know what you are doing" function which could segfault > if used badly. Well, I don't think that's a useful distinction either. This is C; pretty much anything can segfault when used badly. 
Macros tend to be "fast": Py_TYPE just gets a member of a struct; Py_INCREF just increments a number. METH_NOARGS is just a number. None of them are very dangerous. IMO, PyTuple_GET_ITEM is not uppercase because it's dangerous, but because it just reaches into memory. > For me, whether something is a function or macro is just an > implementation detail (which can change between Python versions) which > should not affect the naming. True. I'm not saying the convention is very strict or useful. >> # C API functions >> >> The function PyCFunction_GetFlags is, for better or worse, part of the >> stable ABI. We shouldn't just give up on it. I'm fine with documenting >> that it shouldn't be used, but for functions defined using >> PyCFunction_New etc. it should continue behaving as before. >> One solution could be to preserve the "definition time" METH_* flags in >> the 0xFFF bits of cc_flags and use the other bits for CCALL_*. > > I'm fine with that if you insist. However, it would be a silly solution > to formally satisfy the "stable ABI" requirement without actually helping. Yes, it's definitely very silly. But that's not a reason to break our promise to the users. After all it's called "stable ABI", not "useful ABI" :) > I agree with your other points that I didn't reply to and will make some > edits to PEP 580. Thank you! From larry at hastings.org Thu Sep 13 14:20:36 2018 From: larry at hastings.org (Larry Hastings) Date: Thu, 13 Sep 2018 11:20:36 -0700 Subject: [Python-Dev] bpo-34595: How to format a type name? In-Reply-To: <55d6bf1f-208d-973d-e283-684b8da43c92@trueblade.com> References: <55d6bf1f-208d-973d-e283-684b8da43c92@trueblade.com> Message-ID: <08985312-bb2d-ce2c-4e5f-a660d7df5f91@hastings.org> On 09/13/2018 07:01 AM, Eric V. Smith wrote: > On 9/12/2018 8:33 PM, Victor Stinner wrote: > >> Hi, >> >> For the type name, sometimes, we only get a type (not an instance), >> and we want to format its FQN. IMHO we need to provide ways to format >> the FQN of a type for *types* and for *instances*. Here is my >> proposal: >> >> * Add !t conversion to format string > > I'm strongly opposed to this. This !t conversion would not be widely > applicable enough to be generally useful, and would need to be exposed > in the f-string and str.format() documentation, even though 99% of > programmers would never need or see it. I discussed this with Eric in-person this morning at the core dev sprints.? Eric's understanding is that this is motivated by the fact that Py_TYPE() returns a borrowed reference, and by switching to this !t conversion we could avoid using Py_TYPE() when formatting error messages.? My quick thoughts on this: * If Py_TYPE() is a bad API, then it's a bad API and should be replaced.? We should have a new version of Py_TYPE() that returns a strong reference. * If we're talking about formatting error messages, we're formatting an exception, which means we're already no longer in performance-sensitive code.? So we should use the new API that returns a strong reference.? The negligible speed hit of taking the extra reference will be irrelevant. Cheers, //arry/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mark at hotpy.org Thu Sep 13 15:26:20 2018 From: mark at hotpy.org (Mark Shannon) Date: Thu, 13 Sep 2018 20:26:20 +0100 Subject: [Python-Dev] Request for review Message-ID: <1a7b9fda-806c-edc1-7f3b-f6673eb3cfbe@hotpy.org> Hi, Can I request a review of https://github.com/python/cpython/pull/6641. It has been open for a few months now. 
Cheers, Mark. From vstinner at redhat.com Thu Sep 13 17:08:01 2018 From: vstinner at redhat.com (Victor Stinner) Date: Thu, 13 Sep 2018 23:08:01 +0200 Subject: [Python-Dev] bpo-34595: How to format a type name? In-Reply-To: <55d6bf1f-208d-973d-e283-684b8da43c92@trueblade.com> References: <55d6bf1f-208d-973d-e283-684b8da43c92@trueblade.com> Message-ID: Le jeu. 13 sept. 2018 ? 16:01, Eric V. Smith a ?crit : > > * Add !t conversion to format string > > I'm strongly opposed to this. This !t conversion would not be widely > applicable enough to be generally useful, and would need to be exposed > in the f-string and str.format() documentation, even though 99% of > programmers would never need or see it. (I'm thinking aloud.) In the Python code base, I found 115 lines using type(obj).__name__ and 228 lines using obj.__class__.__name__. $ scm.py grep 'type(.*).__name__'|wc -l 115 $ scm.py grep '.__class__.__name__'|wc -l 228 I don't know how to compare these numbers, so I tried to count the number of f-strings: $ git grep '[^"%-]\) x.__class__: --- Moreover, it's also possible to override the "type" symbol in the global or local scope: --- type = id num = 42 print(f"type(num): {type(num)}") # Output: "type(num): 139665950357856" --- One advantage of having a builtin formatter would be to always use internally the builtin type() function to get the type of an object, or not use "type()" in the current scope. The second advantage is to prevent the need of having to decide between type(obj) and obj.__class__ :-) > raise TypeError(f"must be str, not {obj!t}") > > Should be written as: > raise TypeError(f"must be str, not {type(obj)}") f"{type(obj)}" behaves as str(type(obj)), but in practice it uses repr(type(obj)): >>> f"{type(42)}" "" My proposed f"{obj!t}" returns the fully qualified name of the object type: >>> f"{42!t}" "int" Do you want to modify str(type) to return a value different than repr(type)? Or maybe it's just a typo and you wanted to write f"{type(obj):T}"? > That said, I think this is a good idea. type.__format__() could also > understand "#" to specify qualname. When I discussed with Petr Viktorin, we failed to find an usecase where __qualname__ was needed. We agreed that we always want the fully qualified name, not just the qualified name. > I think "T" is a good idea, but I think you're adding in obj vs > type(obj) just because of the borrowed reference issue in Py_TYPE(). > That issue is so much larger than string formatting the type of an > object that it shouldn't be addressed here. Right, that's a side effect of the discussion on the C API. It seems like Py_TYPE() has to go in the new C API. Sorry, the rationale is not written down yet, but Dino convinced me that Py_TYPE() has to go :-) > > Open question: Should we also add "%t" and "%T" formatters to the str > > % args operator at the Python level? > > No. Again, I think any formatting of type names should not be in a > widely used interface, (...) Ok, that's fine with me :-) Victor From encukou at gmail.com Thu Sep 13 17:52:42 2018 From: encukou at gmail.com (Petr Viktorin) Date: Thu, 13 Sep 2018 14:52:42 -0700 Subject: [Python-Dev] bpo-34595: How to format a type name? In-Reply-To: References: <55d6bf1f-208d-973d-e283-684b8da43c92@trueblade.com> Message-ID: <87ae702f-0583-9726-ac26-21ea9bdd9c69@gmail.com> On 09/13/18 14:08, Victor Stinner wrote: > Le jeu. 13 sept. 2018 ? 16:01, Eric V. Smith a ?crit : >>> * Add !t conversion to format string >> >> I'm strongly opposed to this. 
This !t conversion would not be widely >> applicable enough to be generally useful, and would need to be exposed >> in the f-string and str.format() documentation, even though 99% of >> programmers would never need or see it. > > (I'm thinking aloud.) > > In the Python code base, I found 115 lines using type(obj).__name__ > and 228 lines using obj.__class__.__name__. [...] "!t" is not a big improvement over ":T" and "type(obj)". > I'm not sure if type(obj) or obj.__class__ should be used, but I can > say that they are different: obj.__class__ can be overriden: [...] > > > Moreover, it's also possible to override the "type" symbol in the > global or local scope: [...] I don't think either of those are problematic. If you override `__class__` or `type`, things will behave weirdly, and that's OK. > One advantage of having a builtin formatter would be to always use > internally the builtin type() function to get the type of an object, > or not use "type()" in the current scope. The second advantage is to > prevent the need of having to decide between type(obj) and > obj.__class__ :-) > > >> raise TypeError(f"must be str, not {obj!t}") >> >> Should be written as: >> raise TypeError(f"must be str, not {type(obj)}") [...] > > Do you want to modify str(type) to return a value different than repr(type)? > > Or maybe it's just a typo and you wanted to write f"{type(obj):T}"? Yes, AFAIK that was a typo. >> I think "T" is a good idea, but I think you're adding in obj vs >> type(obj) just because of the borrowed reference issue in Py_TYPE(). >> That issue is so much larger than string formatting the type of an >> object that it shouldn't be addressed here. > > Right, that's a side effect of the discussion on the C API. It seems > like Py_TYPE() has to go in the new C API. Sorry, the rationale is not > written down yet, but Dino convinced me that Py_TYPE() has to go :-) I'll be happy when we get rid of Py_TYPE and get to use moving garbage collectors... but now is not the time. The API for "%T" should be "give me the type". The best way to do that might change in the future. But at this point, we're bikeshedding. I think all the relevant voices have been heard. From eric at trueblade.com Thu Sep 13 18:06:25 2018 From: eric at trueblade.com (Eric V. Smith) Date: Thu, 13 Sep 2018 18:06:25 -0400 Subject: [Python-Dev] bpo-34595: How to format a type name? In-Reply-To: <87ae702f-0583-9726-ac26-21ea9bdd9c69@gmail.com> References: <55d6bf1f-208d-973d-e283-684b8da43c92@trueblade.com> <87ae702f-0583-9726-ac26-21ea9bdd9c69@gmail.com> Message-ID: <147003fa-5c42-6642-e87b-7c31a167c56f@trueblade.com> On 9/13/2018 5:52 PM, Petr Viktorin wrote: > On 09/13/18 14:08, Victor Stinner wrote: >> Le jeu. 13 sept. 2018 ? 16:01, Eric V. Smith a >> ?crit : >>>> * Add !t conversion to format string >>> >>> I'm strongly opposed to this. This !t conversion would not be widely >>> applicable enough to be generally useful, and would need to be exposed >>> in the f-string and str.format() documentation, even though 99% of >>> programmers would never need or see it. >> >> (I'm thinking aloud.) >> >> In the Python code base, I found 115 lines using type(obj).__name__ >> and 228 lines using obj.__class__.__name__. > [...] > > "!t" is not a big improvement over ":T" and "type(obj)". > >> I'm not sure if type(obj) or obj.__class__ should be used, but I can >> say that they are different: obj.__class__ can be overriden: > [...] >> >> >> Moreover, it's also possible to override the "type" symbol in the >> global or local scope: > [...] 
> > I don't think either of those are problematic. If you override > `__class__` or `type`, things will behave weirdly, and that's OK. > >> One advantage of having a builtin formatter would be to always use >> internally the builtin type() function to get the type of an object, >> or not use "type()" in the current scope. The second advantage is to >> prevent the need of having to decide between type(obj) and >> obj.__class__ :-) >> >> >>> raise TypeError(f"must be str, not {obj!t}") >>> >>> Should be written as: >>> raise TypeError(f"must be str, not {type(obj)}") > [...] >> >> Do you want to modify str(type) to return a value different than >> repr(type)? >> >> Or maybe it's just a typo and you wanted to write f"{type(obj):T}"? > > Yes, AFAIK that was a typo. f'{type(obj)}' becomes type(obj).__format__(''), so you can return something other than __str__ or __repr__ does. It's only by convention that an object's __format__ returns __str__: it need not do so. Eric > >>> I think "T" is a good idea, but I think you're adding in obj vs >>> type(obj) just because of the borrowed reference issue in Py_TYPE(). >>> That issue is so much larger than string formatting the type of an >>> object that it shouldn't be addressed here. >> >> Right, that's a side effect of the discussion on the C API. It seems >> like Py_TYPE() has to go in the new C API. Sorry, the rationale is not >> written down yet, but Dino convinced me that Py_TYPE() has to go :-) > > I'll be happy when we get rid of Py_TYPE and get to use moving garbage > collectors... but now is not the time. > The API for "%T" should be "give me the type". The best way to do that > might change in the future. > > > But at this point, we're bikeshedding. I think all the relevant voices > have been heard. > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/eric%2Ba-python-dev%40trueblade.com From christian at python.org Thu Sep 13 19:47:23 2018 From: christian at python.org (Christian Heimes) Date: Thu, 13 Sep 2018 16:47:23 -0700 Subject: [Python-Dev] We cannot fix all issues: let's close XML security issues (not fix them) In-Reply-To: References: Message-ID: On 06/09/2018 07.18, Victor Stinner wrote: > Hi, > > The Python bug tracker is full of bugs, and sadly we don't have enough > people to take care of all of them. There are 3 open bugs about > security issues in XML and I simply propose to close it: > > https://bugs.python.org/issue17318 > https://bugs.python.org/issue17239 > https://bugs.python.org/issue24238 > > The XML documentation already starts with a red warning explaining the > security limitations of the Python implementation and points to > defusedxml and defusedexpat which are existing and working > counter-measures: > > https://docs.python.org/dev/library/xml.html > > Note: Christian Heimes, author of these 2 packages, told me that these > modules may not work on Python 3.7, he didn't have time to maintain > them recently. Maybe someone might want to help him? > > I suggest to close the 3 Python bugs without doing anything. Are you > ok with that? Keeping the issue open for 3 years doesn't help anyone, > and there is already a security warning in all supported version (I > checked 2.7 and 3.4). > > It seems like XML is getting less popular because of JSON becoming > more popular (even if JSON obviously comes with its own set of > security issues...). 
It seems like less core developers care about XML > (today than 3 years ago). > > We should just accept that core developers have limited availability > and that documenting security issues is an *acceptable* trade-off. I > don't see any value of keeping these 3 issues open. Hi, during the Python core developer sprint, Steve Dower forced ^H^H^H^H^H^H convinced me into looking into the XML security bugs again. I come with fixes for all issues. However all security fixes require a change of behavior. I strongly believe that the change doesn't affect the majority of users in a negative way. For entity expansion attacks (billion laughs, quadratic blowup), the issue cannot be fixed in a libexpat callback. I decided that it's better to fix the issue in expat directly. libxml2 added limits for entity expansion many years ago, too. I created a patch for libexpat to limit nesting depths, entity length and ratio between XML data and expansion, https://github.com/libexpat/libexpat/pull/220 . The PR is a proof of concept. For the external entity and DTD bug in SAX and pulldom parser, I changed the default setting in PR https://github.com/python/cpython/pull/9217 . When accepted, the parsers no longer load and embed files from local directories or network locations. Regards, Christian From songofacandy at gmail.com Thu Sep 13 19:50:50 2018 From: songofacandy at gmail.com (INADA Naoki) Date: Fri, 14 Sep 2018 08:50:50 +0900 Subject: [Python-Dev] PEP 579 and PEP 580: refactoring C functions and methods In-Reply-To: <5B9A2C5C.6040008@UGent.be> References: <5B2A15FE.4000608@UGent.be> <5B9A2C5C.6040008@UGent.be> Message-ID: 2018?9?13?(?) 18:22 Jeroen Demeyer : > On 2018-09-13 02:26, Petr Viktorin wrote: > > > PyCCall_FASTCALL is not a macro, shouldn't it be named PyCCall_FastCall? > > What's the convention for that anyway? I assumed that capital letters > meant a "really know what you are doing" function which could segfault > if used badly. > > For me, whether something is a function or macro is just an > implementation detail (which can change between Python versions) which > should not affect the naming. > https://www.python.org/dev/peps/pep-0007/#naming-conventions All capital name is used for macros. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vstinner at redhat.com Thu Sep 13 20:04:53 2018 From: vstinner at redhat.com (Victor Stinner) Date: Fri, 14 Sep 2018 02:04:53 +0200 Subject: [Python-Dev] bpo-34595: How to format a type name? In-Reply-To: <147003fa-5c42-6642-e87b-7c31a167c56f@trueblade.com> References: <55d6bf1f-208d-973d-e283-684b8da43c92@trueblade.com> <87ae702f-0583-9726-ac26-21ea9bdd9c69@gmail.com> <147003fa-5c42-6642-e87b-7c31a167c56f@trueblade.com> Message-ID: Le ven. 14 sept. 2018 ? 00:09, Eric V. Smith a ?crit : > f'{type(obj)}' becomes type(obj).__format__(''), so you can return > something other than __str__ or __repr__ does. It's only by convention > that an object's __format__ returns __str__: it need not do so. What's New in Python 3.7 contains: > object.__format__(x, '') is now equivalent to str(x) rather than format(str(self), ''). > (Contributed by Serhiy Storchaka in bpo-28974.) https://bugs.python.org/issue28974 Oh, I didn't know that a type is free to change this behavior: return something different than str(obj) if the format spec is an empty string. So are you suggesting to change type(obj).__format__('') to return the fully qualified name instead of repr(type)? 
So "%s" % type(obj) would use repr(), but "{}".format(type(obj)) and f"{type(obj)}" would return the fully qualified name? Victor From eric at trueblade.com Thu Sep 13 20:24:00 2018 From: eric at trueblade.com (Eric V. Smith) Date: Thu, 13 Sep 2018 20:24:00 -0400 Subject: [Python-Dev] bpo-34595: How to format a type name? In-Reply-To: References: <55d6bf1f-208d-973d-e283-684b8da43c92@trueblade.com> <87ae702f-0583-9726-ac26-21ea9bdd9c69@gmail.com> <147003fa-5c42-6642-e87b-7c31a167c56f@trueblade.com> Message-ID: <5b10464c-0726-21a1-9ef5-d056cc4e1a09@trueblade.com> On 9/13/2018 8:04 PM, Victor Stinner wrote: > Le ven. 14 sept. 2018 ? 00:09, Eric V. Smith a ?crit : >> f'{type(obj)}' becomes type(obj).__format__(''), so you can return >> something other than __str__ or __repr__ does. It's only by convention >> that an object's __format__ returns __str__: it need not do so. > What's New in Python 3.7 contains: > >> object.__format__(x, '') is now equivalent to str(x) rather than format(str(self), ''). >> (Contributed by Serhiy Storchaka in bpo-28974.) > https://bugs.python.org/issue28974 > > Oh, I didn't know that a type is free to change this behavior: return > something different than str(obj) if the format spec is an empty > string. True! That issue was specific to object.__format__, not any other classes implementation of __format__. > So are you suggesting to change type(obj).__format__('') to return the > fully qualified name instead of repr(type)? I'm not suggesting it, I'm saying it's possible. It indeed might be the most useful behavior. > So "%s" % type(obj) would use repr(), but "{}".format(type(obj)) and > f"{type(obj)}" would return the fully qualified name? > "%s" % type(obj) would use str(), not repr. You could either: - keep with convention and have type(obj).__format__('') return type(obj).__str__(), while type(obj).__format__('#') (or what other char you want to use) return the qualname; or - just have type(obj).__format__('') return the qualname, if that's the more useful behavior. Eric From nas-python at arctrix.com Fri Sep 14 01:47:53 2018 From: nas-python at arctrix.com (Neil Schemenauer) Date: Thu, 13 Sep 2018 23:47:53 -0600 Subject: [Python-Dev] bpo-34595: How to format a type name? In-Reply-To: References: <55d6bf1f-208d-973d-e283-684b8da43c92@trueblade.com> Message-ID: <20180914054753.nicou5cgmncumq7w@python.ca> On 2018-09-13, Victor Stinner wrote: > Right, that's a side effect of the discussion on the C API. It seems > like Py_TYPE() has to go in the new C API. Sorry, the rationale is not > written down yet, but Dino convinced me that Py_TYPE() has to go :-) My understanding is that using Py_TYPE() inside the CPython internals is okay (i.e. using a borrowed reference). However, extension modules would preferrably not use APIs that give back borrowed references. A clean API redesign would remove all of those. So, what are extension modules supposed to do? We want to give them an easy to use API. If we give them %t that takes an object and internally does the Py_TYPE() call, they have a simple way to do the right thing. E.g. PyErr_Format(PyExc_TypeError, "\"%s\" must be string, not %.200s", name, src->ob_type->tp_name); becomes PyErr_Format(PyExc_TypeError, "\"%s\" must be string, not %t", name, src); This kind of code occurs often in extension modules. If you make them get a strong reference to the type, they have to remember to decref it. It's not a huge deal but is a bit harder to use. I like the proposal to provide both %t and %T. 
Our format code is a bit more complicated but many extension modules get a bit simpler. That's a win, IMHO. For the Python side, I don't think you need the % format codes. You need a idiomatic way of getting the type name. repr() and str() of the type object is not it. I don't think changing them at this point is a good idea. So, having a new property would seem the obvious solution. E.g. f'"{name}" must be string, not {src.__class__.__qualname__}' That __qualname__ property will be useful for other things, not just building type error messages. From nas-python at arctrix.com Fri Sep 14 02:34:59 2018 From: nas-python at arctrix.com (Neil Schemenauer) Date: Fri, 14 Sep 2018 00:34:59 -0600 Subject: [Python-Dev] Heap-allocated StructSequences In-Reply-To: References: Message-ID: <20180914063459.sqid2n6bera5or6l@python.ca> On 2018-09-04, Eddie Elizondo wrote: > Solution: > > * Fix the implementation of PyStructSequence_NewType: > > The best solution would be to fix the implementation of this > function. This can easily be done by dynamically creating a > PyType_Spec and calling PyType_FromSpec Hello Eddie, Thank you for spending time to look into this. Without studying the details of your patch, your approach sounds correct to me. I think we should be allocating types from the heap and use PyType_FromSpec. Having static type definitions living in the data segment cause too many issues. We have to assess how 3rd party extension modules would be affected by this change. Unless it is too hard to do, they should still compile (perhaps with warnings) after your fix. Do you know if that's the case? Looking at your changes to structseq.c, I can't tell easily. In any case, this should go into Victor's pythoncapi fork. That fork includes all the C-API cleanup we are hoping to make to CPython (assuming we can figure out the backwards and forwards compatibility issues). Here is the project site: https://pythoncapi.readthedocs.io Regards, Neil From encukou at gmail.com Fri Sep 14 12:04:46 2018 From: encukou at gmail.com (Petr Viktorin) Date: Fri, 14 Sep 2018 09:04:46 -0700 Subject: [Python-Dev] Heap-allocated StructSequences In-Reply-To: <20180914063459.sqid2n6bera5or6l@python.ca> References: <20180914063459.sqid2n6bera5or6l@python.ca> Message-ID: <87c5b936-d1d5-713a-ee35-189062a718c4@gmail.com> On 09/13/18 23:34, Neil Schemenauer wrote: > On 2018-09-04, Eddie Elizondo wrote: >> Solution: >> >> * Fix the implementation of PyStructSequence_NewType: >> >> The best solution would be to fix the implementation of this >> function. This can easily be done by dynamically creating a >> PyType_Spec and calling PyType_FromSpec > > Hello Eddie, > > Thank you for spending time to look into this. Without studying the > details of your patch, your approach sounds correct to me. I think > we should be allocating types from the heap and use PyType_FromSpec. > Having static type definitions living in the data segment cause too > many issues. > > We have to assess how 3rd party extension modules would be affected > by this change. Unless it is too hard to do, they should still > compile (perhaps with warnings) after your fix. Do you know if > that's the case? Looking at your changes to structseq.c, I can't > tell easily. > > In any case, this should go into Victor's pythoncapi fork. That > fork includes all the C-API cleanup we are hoping to make to CPython > (assuming we can figure out the backwards and forwards compatibility > issues). Nope, Victor's fork doesn't include all C-API cleanup. 
There's an older long-term effort (PEP-384, PEP-489, the current contenders 576/579/580, and PEP-573 for the future). Converting things to use PyType_FromSpec falls in there. As long as the old API still works, these changes should go in (but they might need a PEP). From status at bugs.python.org Fri Sep 14 12:09:58 2018 From: status at bugs.python.org (Python tracker) Date: Fri, 14 Sep 2018 18:09:58 +0200 (CEST) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20180914160958.22E905761C@psf.upfronthosting.co.za> ACTIVITY SUMMARY (2018-09-07 - 2018-09-14) Python tracker at https://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message. Issues counts and deltas: open 6819 (-22) closed 39612 (+95) total 46431 (+73) Open issues with patches: 2716 Issues opened (38) ================== #23354: Loading 2 GiLOC file which raises exception causes wrong trace https://bugs.python.org/issue23354 reopened by benjamin.peterson #26544: platform.libc_ver() returns incorrect version number https://bugs.python.org/issue26544 reopened by vstinner #33083: math.factorial accepts non-integral Decimal instances https://bugs.python.org/issue33083 reopened by vstinner #34609: Idle Unitest https://bugs.python.org/issue34609 opened by piscvau at yahoo.fr #34610: Incorrect iteration of Manager.dict() method of the multiproce https://bugs.python.org/issue34610 opened by deltaclock #34613: asyncio.StreamReader initialization documentation incorrectly https://bugs.python.org/issue34613 opened by psycojoker #34616: implement "Async exec" https://bugs.python.org/issue34616 opened by mbussonn #34617: socket.recvfrom(): docs should warn about packet truncation wh https://bugs.python.org/issue34617 opened by Anees Ahmed #34620: Octal byte literals with a decimal value > 255 are silently tr https://bugs.python.org/issue34620 opened by bup #34623: _elementtree.c doesn't call XML_SetHashSalt() https://bugs.python.org/issue34623 opened by christian.heimes #34624: -W option does not accept module regexes https://bugs.python.org/issue34624 opened by coldfix #34626: PEP 384's PyType_Spec and PyType_Slot are not documented https://bugs.python.org/issue34626 opened by petr.viktorin #34628: urllib.request.urlopen fails when userinfo is present in URL https://bugs.python.org/issue34628 opened by ytvwld #34629: Python3 regression for urllib(2).urlopen(...).fp for chunked h https://bugs.python.org/issue34629 opened by tkruse #34630: Don't log ssl cert errors in asyncio https://bugs.python.org/issue34630 opened by asvetlov #34631: Upgrade to OpenSSL 1.1.1 https://bugs.python.org/issue34631 opened by christian.heimes #34632: Port importlib_metadata to Python 3.8 https://bugs.python.org/issue34632 opened by barry #34634: New asyncio streams API https://bugs.python.org/issue34634 opened by asvetlov #34639: PYTHONCOERCECLOCALE is ignored when using -E or -I option https://bugs.python.org/issue34639 opened by vstinner #34643: How to build Release Version of Python in Windows? https://bugs.python.org/issue34643 opened by Valentin Zhao #34648: Confirm the types of parameters of traceback.format_list and t https://bugs.python.org/issue34648 opened by Nathaniel Manista #34651: Disallow fork in a subinterpreter. 
https://bugs.python.org/issue34651 opened by eric.snow #34655: Support sendfile in asyncio streams API https://bugs.python.org/issue34655 opened by asvetlov #34656: memory exhaustion in Modules/_pickle.c:1393 https://bugs.python.org/issue34656 opened by shuoz #34659: Inconsistency between functools.reduce & itertools.accumulate https://bugs.python.org/issue34659 opened by lycantropos #34662: tarfile.TarFile may write corrupt files if not closed https://bugs.python.org/issue34662 opened by lordmauve #34663: Support POSIX_SPAWN_USEVFORK flag in posix_spawn https://bugs.python.org/issue34663 opened by pablogsal #34665: Py_FinalizeEx() - Bugs & caveats - Add info that NumPy and Pan https://bugs.python.org/issue34665 opened by jcmuel #34667: Review documentation section by section https://bugs.python.org/issue34667 opened by willingc #34669: test_ssl fails if SSLv2 is enabled https://bugs.python.org/issue34669 opened by benjamin.peterson #34670: Add set_post_handshake_auth for TLS 1.3 https://bugs.python.org/issue34670 opened by christian.heimes #34672: '%Z' strftime specifier never works with musl https://bugs.python.org/issue34672 opened by benjamin.peterson #34673: make the eval loop more editable https://bugs.python.org/issue34673 opened by benjamin.peterson #34676: Guarantie that divmod() and PyNumber_Divmod() return a 2-tuple https://bugs.python.org/issue34676 opened by serhiy.storchaka #34677: Event scheduler page example https://bugs.python.org/issue34677 opened by mediator42 #34678: need to remove the term "white space" https://bugs.python.org/issue34678 opened by dwaygig #34679: asyncio.add_signal_handler call fails if not on main thread https://bugs.python.org/issue34679 opened by jnwatson #34680: asyncio event_loop close fails off main thread if signal handl https://bugs.python.org/issue34680 opened by jnwatson Most recent 15 issues with no replies (15) ========================================== #34680: asyncio event_loop close fails off main thread if signal handl https://bugs.python.org/issue34680 #34679: asyncio.add_signal_handler call fails if not on main thread https://bugs.python.org/issue34679 #34677: Event scheduler page example https://bugs.python.org/issue34677 #34676: Guarantie that divmod() and PyNumber_Divmod() return a 2-tuple https://bugs.python.org/issue34676 #34673: make the eval loop more editable https://bugs.python.org/issue34673 #34672: '%Z' strftime specifier never works with musl https://bugs.python.org/issue34672 #34670: Add set_post_handshake_auth for TLS 1.3 https://bugs.python.org/issue34670 #34667: Review documentation section by section https://bugs.python.org/issue34667 #34665: Py_FinalizeEx() - Bugs & caveats - Add info that NumPy and Pan https://bugs.python.org/issue34665 #34655: Support sendfile in asyncio streams API https://bugs.python.org/issue34655 #34634: New asyncio streams API https://bugs.python.org/issue34634 #34632: Port importlib_metadata to Python 3.8 https://bugs.python.org/issue34632 #34631: Upgrade to OpenSSL 1.1.1 https://bugs.python.org/issue34631 #34629: Python3 regression for urllib(2).urlopen(...).fp for chunked h https://bugs.python.org/issue34629 #34626: PEP 384's PyType_Spec and PyType_Slot are not documented https://bugs.python.org/issue34626 Most recent 15 issues waiting for review (15) ============================================= #34676: Guarantie that divmod() and PyNumber_Divmod() return a 2-tuple https://bugs.python.org/issue34676 #34673: make the eval loop more editable https://bugs.python.org/issue34673 #34672: '%Z' 
strftime specifier never works with musl https://bugs.python.org/issue34672 #34663: Support POSIX_SPAWN_USEVFORK flag in posix_spawn https://bugs.python.org/issue34663 #34656: memory exhaustion in Modules/_pickle.c:1393 https://bugs.python.org/issue34656 #34651: Disallow fork in a subinterpreter. https://bugs.python.org/issue34651 #34634: New asyncio streams API https://bugs.python.org/issue34634 #34630: Don't log ssl cert errors in asyncio https://bugs.python.org/issue34630 #34626: PEP 384's PyType_Spec and PyType_Slot are not documented https://bugs.python.org/issue34626 #34623: _elementtree.c doesn't call XML_SetHashSalt() https://bugs.python.org/issue34623 #34613: asyncio.StreamReader initialization documentation incorrectly https://bugs.python.org/issue34613 #34610: Incorrect iteration of Manager.dict() method of the multiproce https://bugs.python.org/issue34610 #34604: Possible mojibake in pwd.getpwnam and grp.getgrnam https://bugs.python.org/issue34604 #34603: ctypes on Windows: error calling C function that returns a str https://bugs.python.org/issue34603 #34596: [unittest] raise error if @skip is used with an argument that https://bugs.python.org/issue34596 Top 10 most discussed issues (10) ================================= #34580: sqlite doc: clarify the scope of the context manager https://bugs.python.org/issue34580 10 msgs #34589: Py_Initialize() and Py_Main() should not enable C locale coerc https://bugs.python.org/issue34589 10 msgs #34595: PyUnicode_FromFormat(): add %T format for an object type name https://bugs.python.org/issue34595 10 msgs #34421: Cannot install package with unicode module names on Windows https://bugs.python.org/issue34421 9 msgs #34604: Possible mojibake in pwd.getpwnam and grp.getgrnam https://bugs.python.org/issue34604 9 msgs #31577: crash in os.utime() in case of a bad ns argument https://bugs.python.org/issue31577 8 msgs #34597: Python needs to check existence of functions at runtime for ta https://bugs.python.org/issue34597 8 msgs #33649: asyncio docs overhaul https://bugs.python.org/issue33649 7 msgs #1621: Do not assume signed integer overflow behavior https://bugs.python.org/issue1621 6 msgs #34600: python3 regression ElementTree.iterparse() unable to capture c https://bugs.python.org/issue34600 6 msgs Issues closed (89) ================== #7713: implement ability to disable automatic search path additions https://bugs.python.org/issue7713 closed by willingc #21258: Add __iter__ support for mock_open https://bugs.python.org/issue21258 closed by berker.peksag #23855: Missing Sanity Check for malloc() in PC/_msi.c https://bugs.python.org/issue23855 closed by berker.peksag #24696: Don't use None as sentinel for traceback https://bugs.python.org/issue24696 closed by benjamin.peterson #25041: document AF_PACKET socket address format https://bugs.python.org/issue25041 closed by benjamin.peterson #25083: Python can sometimes create incorrect .pyc files https://bugs.python.org/issue25083 closed by petr.viktorin #29051: Improve error reporting involving f-strings (PEP 498) https://bugs.python.org/issue29051 closed by eric.smith #29386: select.epoll.poll may behave differently if timeout = -1 vs ti https://bugs.python.org/issue29386 closed by berker.peksag #29832: Don't refer to getsockaddrarg in error messages https://bugs.python.org/issue29832 closed by benjamin.peterson #30576: http.server should support HTTP compression (gzip) https://bugs.python.org/issue30576 closed by brett.cannon #31132: test_prlimit from test_resource fails when building python3 
in https://bugs.python.org/issue31132 closed by benjamin.peterson #31141: Start should be a keyword argument of the built-in sum https://bugs.python.org/issue31141 closed by rhettinger #31608: crash in methods of a subclass of _collections.deque with a ba https://bugs.python.org/issue31608 closed by benjamin.peterson #31704: HTTP check lowercase response from proxy https://bugs.python.org/issue31704 closed by benjamin.peterson #31734: crash or SystemError in sqlite3.Cache in case it is uninitiali https://bugs.python.org/issue31734 closed by berker.peksag #31801: vars() manipulation encounters problems with Enum https://bugs.python.org/issue31801 closed by ethan.furman #31902: Fix col_offset for ast nodes: AsyncFor, AsyncFunctionDef, Asyn https://bugs.python.org/issue31902 closed by miss-islington #31903: `_scproxy` calls SystemConfiguration functions in a way that c https://bugs.python.org/issue31903 closed by benjamin.peterson #32270: subprocess closes redirected fds even if they are in pass_fds https://bugs.python.org/issue32270 closed by izbyshev #32490: subprocess: duplicate filename in exception message https://bugs.python.org/issue32490 closed by benjamin.peterson #32933: mock_open does not support iteration around text files. https://bugs.python.org/issue32933 closed by berker.peksag #33032: Mention implicit cache in struct.Struct docs https://bugs.python.org/issue33032 closed by gregory.p.smith #33073: Add as_integer_ratio() to int() objects https://bugs.python.org/issue33073 closed by rhettinger #33217: x in enum.Flag member is True when x is not a Flag https://bugs.python.org/issue33217 closed by ethan.furman #33487: BZ2File(buffering=None) does not emit deprecation warning, dep https://bugs.python.org/issue33487 closed by gregory.p.smith #33604: HMAC default to MD5 marked as to be removed in 3.6 https://bugs.python.org/issue33604 closed by gregory.p.smith #33625: Release GIL for grp.getgr{nam,gid} and pwd.getpw{nam,uid} https://bugs.python.org/issue33625 closed by serhiy.storchaka #33774: Document that @lru_cache caches based on exactly how the funct https://bugs.python.org/issue33774 closed by rhettinger #33883: doc Mention mypy, pyrex, pytype and PyAnnotate in FAQ https://bugs.python.org/issue33883 closed by benjamin.peterson #33986: asyncio: Typo in documentation: BaseSubprocessTransport -> Sub https://bugs.python.org/issue33986 closed by asvetlov #34004: Acquiring locks not interrupted by signals on musl libc https://bugs.python.org/issue34004 closed by benjamin.peterson #34082: EnumMeta.__new__ should use enum_class.__new__ https://bugs.python.org/issue34082 closed by ethan.furman #34194: test_ssl, AIX, and defaults for _ssl connections https://bugs.python.org/issue34194 closed by Michael.Felt #34200: importlib: python -m test test_pkg fails semi-randomly https://bugs.python.org/issue34200 closed by gregory.p.smith #34213: Frozen dataclass __init__ fails for "object" property" https://bugs.python.org/issue34213 closed by eric.smith #34246: Gentoo Refleaks 3.7: test_smtplib has dangling threads https://bugs.python.org/issue34246 closed by pablogsal #34286: lib2to3 tests fail on the 3.7 branch (used to work with 3.7.0) https://bugs.python.org/issue34286 closed by ned.deily #34365: datetime's documentation refers to "comparison [...] 
falling b https://bugs.python.org/issue34365 closed by Mariatta #34405: Upgrade to OpenSSL 1.1.0i / 1.0.2p https://bugs.python.org/issue34405 closed by ned.deily #34409: Add a way to customize iteration over fields in asdict() for t https://bugs.python.org/issue34409 closed by eric.smith #34455: Tkinter crashing when pressing Command + ^ (OSX) https://bugs.python.org/issue34455 closed by ned.deily #34487: enum _sunder_ names mix metaclass and enum class attributes https://bugs.python.org/issue34487 closed by ethan.furman #34490: transport.get_extra_info('sockname') of test_asyncio fails on https://bugs.python.org/issue34490 closed by asvetlov #34525: smtplib's authobject return value wrongly documented https://bugs.python.org/issue34525 closed by benjamin.peterson #34546: Add encryption support to zipfile https://bugs.python.org/issue34546 closed by serhiy.storchaka #34578: Pipenv lock : ModuleNotFoundError: No module named '_ctypes' https://bugs.python.org/issue34578 closed by ned.deily #34586: collections.ChainMap should have a get_where method https://bugs.python.org/issue34586 closed by rhettinger #34588: traceback formatting can drop a frame https://bugs.python.org/issue34588 closed by benjamin.peterson #34598: How to fix? Error in Kali linux python 2.7 - Collecting pip Fr https://bugs.python.org/issue34598 closed by benjamin.peterson #34605: Avoid master/slave terminology https://bugs.python.org/issue34605 closed by gvanrossum #34606: Unable to read zip file with extra https://bugs.python.org/issue34606 closed by serhiy.storchaka #34608: gc.get_referrers behavior change 3.6 to 3.7 https://bugs.python.org/issue34608 closed by inada.naoki #34611: some examples in 'itertools' modules docs are inaccuracy. https://bugs.python.org/issue34611 closed by rhettinger #34612: doc Some classes are treated as functions in Built-in Function https://bugs.python.org/issue34612 closed by rhettinger #34614: Builtin `abs(Path)` should return `Path.absolute()`. https://bugs.python.org/issue34614 closed by pitrou #34615: subprocess.call wrong exit code https://bugs.python.org/issue34615 closed by kayhayen #34618: Encoding error running in subprocess with captured output https://bugs.python.org/issue34618 closed by eryksun #34619: Typo in docs.python.jp https://bugs.python.org/issue34619 closed by brett.cannon #34621: uuid.UUID objects can't be unpickled in older Python versions https://bugs.python.org/issue34621 closed by taleinat #34622: Extract asyncio exceptions into a separate file https://bugs.python.org/issue34622 closed by asvetlov #34625: update to expat 2.2.6 https://bugs.python.org/issue34625 closed by benjamin.peterson #34627: Python incorrect execution order https://bugs.python.org/issue34627 closed by badrussians #34633: Simplify __reduce__ for ordered dict iterators https://bugs.python.org/issue34633 closed by pablogsal #34635: inspect: add tools for inspecting subclasses https://bugs.python.org/issue34635 closed by bmintz #34636: re module microoptimization: speed up bytes \w \s \d matching https://bugs.python.org/issue34636 closed by gregory.p.smith #34637: Make *start* usable as a keyword argument for sum(). https://bugs.python.org/issue34637 closed by rhettinger #34638: Avoid circular references in asyncio streams https://bugs.python.org/issue34638 closed by asvetlov #34640: remove the configure check TANH_PRESERVES_ZERO_SIGN https://bugs.python.org/issue34640 closed by benjamin.peterson #34641: Curiosity: f((a)=1) is not a syntax error -- why? 
https://bugs.python.org/issue34641 closed by benjamin.peterson #34642: time.ctime() uses %3d instead of %.2d to format. https://bugs.python.org/issue34642 closed by martin.panter #34644: Bug in reverse method https://bugs.python.org/issue34644 closed by steven.daprano #34645: math and numpy yield different results (nan) https://bugs.python.org/issue34645 closed by steven.daprano #34646: remove PyAPI_* from function definitions https://bugs.python.org/issue34646 closed by benjamin.peterson #34647: print sys.thread_info in regrtest header https://bugs.python.org/issue34647 closed by benjamin.peterson #34649: Modules/_json.c: Missing NULL checks in _encoded_const() https://bugs.python.org/issue34649 closed by berker.peksag #34650: test_posix fails with musl https://bugs.python.org/issue34650 closed by benjamin.peterson #34652: never enable lchmod on Linux https://bugs.python.org/issue34652 closed by benjamin.peterson #34653: PyParser_SimpleParseStringFilename should be deleted https://bugs.python.org/issue34653 closed by eric.smith #34654: test_time needs to handle '+' at the beginning of large years https://bugs.python.org/issue34654 closed by benjamin.peterson #34657: pyconfig.h macro "timezone" name clashes with user source are https://bugs.python.org/issue34657 closed by zach.ware #34658: subprocess with preexec_fn when fork() fails could corrupt PyE https://bugs.python.org/issue34658 closed by gregory.p.smith #34660: Remove ableist terms and pejoratives from source code and docs https://bugs.python.org/issue34660 closed by willingc #34661: test_shutil fails with busybox unzip https://bugs.python.org/issue34661 closed by benjamin.peterson #34664: test.test_os.MakedirTests.test_mode is too strict https://bugs.python.org/issue34664 closed by benjamin.peterson #34666: Implement async write and async close in asyncio StreamWriter https://bugs.python.org/issue34666 closed by asvetlov #34668: test_resource fails if test has CAP_SYS_RESOURCE but isn't roo https://bugs.python.org/issue34668 closed by petr.viktorin #34671: Remove references to Benevolent Dictator https://bugs.python.org/issue34671 closed by Mariatta #34674: assume unistd.h exists https://bugs.python.org/issue34674 closed by benjamin.peterson #34675: Avoid terminology related to slavery https://bugs.python.org/issue34675 closed by Mariatta From larry at hastings.org Fri Sep 14 17:27:37 2018 From: larry at hastings.org (Larry Hastings) Date: Fri, 14 Sep 2018 14:27:37 -0700 Subject: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement? Message-ID: What follows is the text of issue 34690: https://bugs.python.org/issue34690 The PR is here: https://github.com/python/cpython/pull/9320 I don't know if we should be discussing this here on python-dev, or on bpo, or on Zulip, or on the soon-to-be-created Discourse. But maybe we can talk about it somewhere! //arry/ ---- This patch was sent to me privately by Jeethu Rao at Facebook. It's a change they're working with internally to improve startup time.? What I've been told by Carl Shapiro at Facebook is that we have their blessing to post it publicly / merge it / build upon it for CPython.? Their patch was written for 3.6, I have massaged it to the point where it minimally works with 3.8. What the patch does: it takes all the Python modules that are loaded as part of interpreter startup and deserializes the marshalled .pyc file into precreated objects stored as static C data.? You add this .C file to the Python build.? 
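(For anyone who hasn't looked at freezing before, here's a rough sketch of
the conventional approach -- a build-time script that marshals a module and
dumps the bytes as static C data, roughly what Tools/freeze already does.
The helper name below is made up for illustration and is not part of the
patch; the patch goes a step further and emits the already-unmarshalled
objects themselves, so no unmarshal work is left to do at run time.)

    import marshal

    def emit_frozen_module(module_name, source_path, c_path):
        # Compile the module and marshal the code object -- the same
        # payload a .pyc file carries after its header.
        with open(source_path, encoding="utf-8") as f:
            code = compile(f.read(), source_path, "exec")
        data = marshal.dumps(code)
        symbol = "frozen_" + module_name.replace(".", "_")
        with open(c_path, "w") as out:
            out.write("/* Generated -- do not edit. */\n")
            out.write("const unsigned char %s[] = {\n" % symbol)
            for i in range(0, len(data), 16):
                chunk = ", ".join("0x%02x" % b for b in data[i:i + 16])
                out.write("    %s,\n" % chunk)
            out.write("};\n")
            out.write("const int %s_size = %d;\n" % (symbol, len(data)))
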
Then there's a patch to Python itself (about 250 lines iirc) that teaches it to load modules from these data structures. I wrote a quick dumb test harness to compare this patch vs 3.8 stock.? It runs a command line 500 times and uses time.perf_counter to time the process.? On a fast quiescent laptop I observe a 21-22% improvement: cmdline: ['./python', '-c', 'pass'] 500 runs: sm38 ? average time 0.006302303705982922 ????????? best 0.006055746000129147 ???????? worst 0.00816565500008437 clean38 ? average time 0.007969956444008858 ????????? best 0.007829047999621253 ???????? worst 0.008812210000542109 improvement 0.20924239043734505 % cmdline: ['./python', '-c', 'import io'] 500 runs: sm38 ? average time 0.006297688038004708 ????????? best 0.005980765999993309 ???????? worst 0.0072462130010535475 clean38 ? average time 0.007996319670004595 ????????? best 0.0078091849991324125 ???????? worst 0.009175700999549008 improvement 0.21242667903482038 % The downside of the patch: for these modules it ignores the Python files on disk--it doesn't even stat them.? If you add stat calls you lose half of the speed improvement.? I believe they added a work-around, where you can set a flag (command-line? environment variable? I don't know, I didn't go looking for it) that tells Python "don't use the frozen modules" and it loads all those files from disk. I don't propose to merge the patch in its current state.? I think it would need a lot of work both in terms of "doing things the way Python does it" as well as just code smell (the serializer is implemented in both C and Python and jumps back and forth, also the build process for the serialized modules is pretty tiresome). Is it worth working on? -------------- next part -------------- An HTML attachment was scrubbed... URL: From nas-python at arctrix.com Fri Sep 14 17:54:24 2018 From: nas-python at arctrix.com (Neil Schemenauer) Date: Fri, 14 Sep 2018 15:54:24 -0600 Subject: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement? In-Reply-To: References: Message-ID: <20180914215424.hjxvq5l7m66schas@python.ca> On 2018-09-14, Larry Hastings wrote: [...] > improvement 0.21242667903482038 % I assume that should be 21.2 % othewise I recommend you abandon the idea. ;-P > The downside of the patch: for these modules it ignores the Python files on > disk--it doesn't even stat them. Having a command-line/env var to turn this on/off would be an acceptable fix, IMHO. If I'm running Python a server, I don't need to be editing .py modules and have them be recognized. Maybe have it turned off by default, at least at first. > Is it worth working on? I wonder how much of the speedup relies on putting it in the data segment (i.e. using linker/loader to essentially handle the unmarshal). What if you had a new marshal format that only needed a light 2nd pass in order to fix up the data loaded from disk? Yuri suggested looking at formats like Cap'n Proto. If the cost of the 2nd pass was not bad, you wouldn't have to rely on the platform C toolchain. Instead we can write .pyc files that hold this data. Then the speedup can work on all compiled Python modules, not just the ones you go through the special process that links them into the data segment. I suppose that might mean that .pyc files become arch specific. Maybe that's okay. As you said last night, there doesn't seem to be much low hanging fruit around anymore. So, 21% looks pretty decent. 
Regards, Neil From larry at hastings.org Fri Sep 14 18:06:18 2018 From: larry at hastings.org (Larry Hastings) Date: Fri, 14 Sep 2018 15:06:18 -0700 Subject: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement? In-Reply-To: <20180914215424.hjxvq5l7m66schas@python.ca> References: <20180914215424.hjxvq5l7m66schas@python.ca> Message-ID: <58cdf8c3-1b23-326c-2244-835f16397b51@hastings.org> On 09/14/2018 02:54 PM, Neil Schemenauer wrote: > On 2018-09-14, Larry Hastings wrote: > [...] >> improvement 0.21242667903482038 % > I assume that should be 21.2 % othewise I recommend you abandon the > idea. ;-P Yeah, that thing you said. > I wonder how much of the speedup relies on putting it in the data > segment (i.e. using linker/loader to essentially handle the > unmarshal). What if you had a new marshal format that only needed a > light 2nd pass in order to fix up the data loaded from disk? Some experimentation would be in order.? I can suggest that, based on conversation from Carl, that adding the stat calls back in costs you half the startup.? So any mechanism where we're talking to the disk _at all_ simply isn't going to be as fast. //arry/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From nas-python at arctrix.com Fri Sep 14 18:25:58 2018 From: nas-python at arctrix.com (Neil Schemenauer) Date: Fri, 14 Sep 2018 16:25:58 -0600 Subject: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement? In-Reply-To: <58cdf8c3-1b23-326c-2244-835f16397b51@hastings.org> References: <20180914215424.hjxvq5l7m66schas@python.ca> <58cdf8c3-1b23-326c-2244-835f16397b51@hastings.org> Message-ID: <20180914222558.yuromalgkggfp2re@python.ca> On 2018-09-14, Larry Hastings wrote: > [..] adding the stat calls back in costs you half the startup.? So > any mechanism where we're talking to the disk _at all_ simply > isn't going to be as fast. Okay, so if we use hundreds of small .pyc files scattered all over the disk, that's bad? Who would have thunk it. ;-P We could have a new format, .pya (compiled python archive) that has data for many .pyc files in it. In normal runs you would have one or just and handlful of these things (e.g. one for stdlib, one for your app and all the packages it uses). Then you mmap these just once and rely on OS page faults to bring in the data as you need it. The .pya would have a hash table at the start or end that tells you the offset for each module. Regards, Neil From shoyer at gmail.com Fri Sep 14 21:39:49 2018 From: shoyer at gmail.com (Stephan Hoyer) Date: Fri, 14 Sep 2018 18:39:49 -0700 Subject: [Python-Dev] Request for review: binary op dispatch rules for subclasses Message-ID: Over a year ago, I made a pull request ( https://github.com/python/cpython/pull/1325) to fix a long-standing issue with how Python handles dispatch for arithmetic binary operations involving subclasses (https://bugs.python.org/issue30140). I pinged the bug several times, but I'm still waiting for a review, which would be greatly appreciated! Best, Stephan -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.f.moore at gmail.com Sat Sep 15 05:53:20 2018 From: p.f.moore at gmail.com (Paul Moore) Date: Sat, 15 Sep 2018 10:53:20 +0100 Subject: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement? 
In-Reply-To: <20180914222558.yuromalgkggfp2re@python.ca> References: <20180914215424.hjxvq5l7m66schas@python.ca> <58cdf8c3-1b23-326c-2244-835f16397b51@hastings.org> <20180914222558.yuromalgkggfp2re@python.ca> Message-ID: On Fri, 14 Sep 2018 at 23:28, Neil Schemenauer wrote: > > On 2018-09-14, Larry Hastings wrote: > > [..] adding the stat calls back in costs you half the startup. So > > any mechanism where we're talking to the disk _at all_ simply > > isn't going to be as fast. > > Okay, so if we use hundreds of small .pyc files scattered all over > the disk, that's bad? Who would have thunk it. ;-P > > We could have a new format, .pya (compiled python archive) that has > data for many .pyc files in it. In normal runs you would have one > or just and handlful of these things (e.g. one for stdlib, one for > your app and all the packages it uses). Then you mmap these just > once and rely on OS page faults to bring in the data as you need it. > The .pya would have a hash table at the start or end that tells you > the offset for each module. Isn't that essentially what putting the stdlib in a zipfile does? (See the windows embedded distribution for an example). It probably uses normal IO rather than mmap, but maybe adding a "use mmap" flag to the zipfile module would be a more general enhancement that zipimport could use for free. Paul From jackiekazil at gmail.com Sat Sep 15 10:58:26 2018 From: jackiekazil at gmail.com (Jacqueline Kazil) Date: Sat, 15 Sep 2018 10:58:26 -0400 Subject: [Python-Dev] Official citation for Python In-Reply-To: References: <233a848c-c8a6-fda5-1c83-6cbf281b28eb@klix.ch> <23449.57520.710702.426310@turnbull.sk.tsukuba.ac.jp> Message-ID: I just got caught up on the thread. This is a really great discussion. Thank you for all the contributions. Before we get into the details, let's go back to the main use case we are trying to solve. *As a user, I am writing an academic paper and I need to cite Python. * Let's throw reproducibility out the window for now (<--- something I never thought I would say), because that should be captured in the code, not in the citations. So, if we don't need the specific version of Python, then maybe creating one citation is all we need. And that gives it some good Google juice as well. Thoughts? (Once we nail down one or many, I think we can then move into the details of the content of the citation.) -Jackie On Thu, Sep 13, 2018 at 12:47 AM Wes Turner wrote: > There was a thread about adding __cite__ to things and a tool to collect > those citations awhile back. > > "[Python-ideas] Add a __cite__ method for scientific packages" > http://markmail.org/thread/rekmbmh64qxwcind > > Which CPython source file should contain this __cite__ value? > > ... On a related note, you should ask the list admin to append a URL to > each mailing list message whenever this list is upgraded to mm3; so that > you can all be appropriately cited. > > On Thursday, September 13, 2018, Wes Turner wrote: > >> Do you guys think we should all cite Grub and BusyBox and bash and libc >> and setuptools and pip and openssl and GNU/Linux and LXC and Docker; or >> else it's plagiarism for us all? >> >> #OpenAccess >> >> On Wednesday, September 12, 2018, Stephen J. Turnbull < >> turnbull.stephen.fw at u.tsukuba.ac.jp> wrote: >> >>> Chris Barker via Python-Dev writes: >>> >>> > But "I wrote some code in Python to produce these statistics" -- >>> > does that need a citation? 
>>> >>> That depends on what you mean by "statistics" and whether (as one >>> should) one makes the code available. If the code is published or >>> "available on request", definitely, Python should be cited. If not, >>> and by "statistics" you mean the kind of things provided by Steven >>> d'Aprano's excellent statistics module (mean, median, standard >>> deviation, etc), maybe no citation is needed. But anything more >>> esoteric than that (even linear regression), yeah, I would say you >>> should cite both Python and any reference you used to learn the >>> algorithm or formulas, in the context of mentioning that your >>> statistics are home-brew, not produced by one of the recognized >>> applications for doing so. >>> >>> > If so, maybe that would take a different form. >>> >>> Yes, it would. But not so different: eg, version is analogous to >>> edition when citing a book. >>> >>> > Anyway, hard to make this decision without some idea how the >>> > citation is intended to be used. >>> >>> Same as any other citation, (1) to give credit to those responsible >>> for providing a resource (this is why publishers and their metadata of >>> city are still conventionally included), and (2) to show where that >>> resource can be obtained. AFAICS, both motivations are universally >>> applicable in polite society. NB: Replication is an important reason >>> for wanting to acquire the resource, but it's not the only one. >>> >>> I think underlying your comment is the question of *what* resource is >>> being cited. I can think of three offhand that might be characterized >>> as "Python". First, the PSF, as a provider of funding. There is a >>> conventional form for this: a footnote on the title or author's name >>> saying "The author acknowledges [a] >>> grant [grant identifier if available] from the Python Software >>> Foundation." I usually orally mention them in presentations, too. >>> That one's easy; *everybody* should *always* do that. >>> >>> The rest of these, sort of an ideal to strive for. If you keep a >>> bibliographic database, and there are now quite a few efforts to crowd >>> source them, it's easier to go the whole 9 yards than to skimp. But >>> except in cases where we don't need to even mention the code, probably >>> we should be citing, for reasons of courtesy to readers as well as >>> authors, editors, and publishers (as disgusting as many publishers are >>> as members of society, they do play a role in providing many resources >>> ---we should find ways to compete them into good behavior, not >>> ostracize them). >>> >>> The second is the Python *language and standard library*. Then the >>> Language Reference and/or the Library Reference should be cited >>> briefly when Python is first mentioned, and in the text introducing a >>> program or program fragment, with a full citation in the bibliography. >>> I tentatively suggest that the metadata for the Language Reference >>> would be >>> >>> Author: principal author(s) (Guido?) et al. OR python.org OR >>> Python Contributors >>> Title: The Python Language Reference >>> Version: to match Python version used (if relevant, different >>> versions each get full citations), probably should not be >>> "current" >>> Publisher: Python Software Foundation >>> Date: of the relevant version >>> Location: City of legal address of PSF >>> URL: to version used (probably should not be the default) >>> Date accessed: if "current" was used >>> >>> The Library reference would be the same except for Title. 
>>> >>> The third is a *particular implementation*. In that case the metadata >>> would be >>> >>> Author: principal author(s) (Guido) et al. OR python.org OR >>> Python Contributors >>> Title: The cPython Python distribution >>> Python Version: as appropriate (if relevant, different versions each >>> get full citations), never "current" >>> Distributor Version: if different from Python version (eg, additional >>> Debian cruft) >>> Publisher: Distributor (eg, PSF, Debian Project, Anaconda Inc.) >>> Date: of the relevant version >>> Location: City of legal address of distributor >>> >>> If downloaded: >>> >>> URL: to version used (including git commit SHA1 if available) >>> Date accessed: download from distributor, not installation date >>> >>> If received on physical medium: use the "usual" form of citation for a >>> collection of individual works (even if Python was the only thing on >>> it). Probably the only additional information needed would be the >>> distributor as editor of the collection and the name of the >>> collection. >>> >>> In most cases I can think of, if the implementation is cited, the >>> Language and Library References should be cited, too. >>> >>> Finally, if Python or components were modified for the project, the >>> modified version should be preserved in a repository and a VCS >>> identifier provided. This does not imply the repository need be >>> publicly accessible, of course, although it might be for other reasons >>> (eg, in a GSoC project,wherever or if hosted for free on GitHub). >>> >>> I doubt that "URNs" like DOI and ISBN are applicable, but if available >>> they should be included in all cases as well. >>> >>> Steve >>> _______________________________________________ >>> Python-Dev mailing list >>> Python-Dev at python.org >>> https://mail.python.org/mailman/listinfo/python-dev >>> Unsubscribe: >>> https://mail.python.org/mailman/options/python-dev/wes.turner%40gmail.com >>> >> _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/jackiekazil%40gmail.com > -- Jacqueline Kazil | @jackiekazil -------------- next part -------------- An HTML attachment was scrubbed... URL: From wes.turner at gmail.com Sat Sep 15 13:12:32 2018 From: wes.turner at gmail.com (Wes Turner) Date: Sat, 15 Sep 2018 13:12:32 -0400 Subject: [Python-Dev] Official citation for Python In-Reply-To: References: <233a848c-c8a6-fda5-1c83-6cbf281b28eb@klix.ch> <23449.57520.710702.426310@turnbull.sk.tsukuba.ac.jp> Message-ID: On Saturday, September 15, 2018, Jacqueline Kazil wrote: > I just got caught up on the thread. This is a really great discussion. > Thank you for all the contributions. > > Before we get into the details, let's go back to the main use case we are > trying to solve. > *As a user, I am writing an academic paper and I need to cite Python. * > > Let's throw reproducibility out the window for now (<--- something I never > thought I would say), because that should be captured in the code, not in > the citations. > > So, if we don't need the specific version of Python, then maybe creating > one citation is all we need. > And that gives it some good Google juice as well. 
> https://scholar.google.com/scholar?hl=en&q=python+van+Rossum * https://www.semanticscholar.org/search?q=Python%20van%20Rossum https://www.mendeley.com/research-papers/?query=Python+van+Rossum https://www.zotero.org/search/q/Python/type/group With an e.g. {Zotero,} group, it would be easy to cite the Python citation with the greatest centrality. https://networkx.github.io/documentation/stable/reference/algorithms/centrality.html A DOI URN/URI/URL really is easiest to aggregate the edges of/for. - [ ] Link to the new citation(s) page in the Python docs from the SciPy citing page https://www.scipy.org/citing.html NP. YW! > Thoughts? > > (Once we nail down one or many, I think we can then move into the details > of the content of the citation.) > > -Jackie > > On Thu, Sep 13, 2018 at 12:47 AM Wes Turner wrote: > >> There was a thread about adding __cite__ to things and a tool to collect >> those citations awhile back. >> >> "[Python-ideas] Add a __cite__ method for scientific packages" >> http://markmail.org/thread/rekmbmh64qxwcind >> >> Which CPython source file should contain this __cite__ value? >> >> ... On a related note, you should ask the list admin to append a URL to >> each mailing list message whenever this list is upgraded to mm3; so that >> you can all be appropriately cited. >> >> On Thursday, September 13, 2018, Wes Turner wrote: >> >>> Do you guys think we should all cite Grub and BusyBox and bash and libc >>> and setuptools and pip and openssl and GNU/Linux and LXC and Docker; or >>> else it's plagiarism for us all? >>> >>> #OpenAccess >>> >>> On Wednesday, September 12, 2018, Stephen J. Turnbull < >>> turnbull.stephen.fw at u.tsukuba.ac.jp> wrote: >>> >>>> Chris Barker via Python-Dev writes: >>>> >>>> > But "I wrote some code in Python to produce these statistics" -- >>>> > does that need a citation? >>>> >>>> That depends on what you mean by "statistics" and whether (as one >>>> should) one makes the code available. If the code is published or >>>> "available on request", definitely, Python should be cited. If not, >>>> and by "statistics" you mean the kind of things provided by Steven >>>> d'Aprano's excellent statistics module (mean, median, standard >>>> deviation, etc), maybe no citation is needed. But anything more >>>> esoteric than that (even linear regression), yeah, I would say you >>>> should cite both Python and any reference you used to learn the >>>> algorithm or formulas, in the context of mentioning that your >>>> statistics are home-brew, not produced by one of the recognized >>>> applications for doing so. >>>> >>>> > If so, maybe that would take a different form. >>>> >>>> Yes, it would. But not so different: eg, version is analogous to >>>> edition when citing a book. >>>> >>>> > Anyway, hard to make this decision without some idea how the >>>> > citation is intended to be used. >>>> >>>> Same as any other citation, (1) to give credit to those responsible >>>> for providing a resource (this is why publishers and their metadata of >>>> city are still conventionally included), and (2) to show where that >>>> resource can be obtained. AFAICS, both motivations are universally >>>> applicable in polite society. NB: Replication is an important reason >>>> for wanting to acquire the resource, but it's not the only one. >>>> >>>> I think underlying your comment is the question of *what* resource is >>>> being cited. I can think of three offhand that might be characterized >>>> as "Python". First, the PSF, as a provider of funding. 
There is a >>>> conventional form for this: a footnote on the title or author's name >>>> saying "The author acknowledges [a] >>>> grant [grant identifier if available] from the Python Software >>>> Foundation." I usually orally mention them in presentations, too. >>>> That one's easy; *everybody* should *always* do that. >>>> >>>> The rest of these, sort of an ideal to strive for. If you keep a >>>> bibliographic database, and there are now quite a few efforts to crowd >>>> source them, it's easier to go the whole 9 yards than to skimp. But >>>> except in cases where we don't need to even mention the code, probably >>>> we should be citing, for reasons of courtesy to readers as well as >>>> authors, editors, and publishers (as disgusting as many publishers are >>>> as members of society, they do play a role in providing many resources >>>> ---we should find ways to compete them into good behavior, not >>>> ostracize them). >>>> >>>> The second is the Python *language and standard library*. Then the >>>> Language Reference and/or the Library Reference should be cited >>>> briefly when Python is first mentioned, and in the text introducing a >>>> program or program fragment, with a full citation in the bibliography. >>>> I tentatively suggest that the metadata for the Language Reference >>>> would be >>>> >>>> Author: principal author(s) (Guido?) et al. OR python.org OR >>>> Python Contributors >>>> Title: The Python Language Reference >>>> Version: to match Python version used (if relevant, different >>>> versions each get full citations), probably should not be >>>> "current" >>>> Publisher: Python Software Foundation >>>> Date: of the relevant version >>>> Location: City of legal address of PSF >>>> URL: to version used (probably should not be the default) >>>> Date accessed: if "current" was used >>>> >>>> The Library reference would be the same except for Title. >>>> >>>> The third is a *particular implementation*. In that case the metadata >>>> would be >>>> >>>> Author: principal author(s) (Guido) et al. OR python.org OR >>>> Python Contributors >>>> Title: The cPython Python distribution >>>> Python Version: as appropriate (if relevant, different versions each >>>> get full citations), never "current" >>>> Distributor Version: if different from Python version (eg, >>>> additional >>>> Debian cruft) >>>> Publisher: Distributor (eg, PSF, Debian Project, Anaconda Inc.) >>>> Date: of the relevant version >>>> Location: City of legal address of distributor >>>> >>>> If downloaded: >>>> >>>> URL: to version used (including git commit SHA1 if available) >>>> Date accessed: download from distributor, not installation date >>>> >>>> If received on physical medium: use the "usual" form of citation for a >>>> collection of individual works (even if Python was the only thing on >>>> it). Probably the only additional information needed would be the >>>> distributor as editor of the collection and the name of the >>>> collection. >>>> >>>> In most cases I can think of, if the implementation is cited, the >>>> Language and Library References should be cited, too. >>>> >>>> Finally, if Python or components were modified for the project, the >>>> modified version should be preserved in a repository and a VCS >>>> identifier provided. This does not imply the repository need be >>>> publicly accessible, of course, although it might be for other reasons >>>> (eg, in a GSoC project,wherever or if hosted for free on GitHub). 
>>>> >>>> I doubt that "URNs" like DOI and ISBN are applicable, but if available >>>> they should be included in all cases as well. >>>> >>>> Steve >>>> _______________________________________________ >>>> Python-Dev mailing list >>>> Python-Dev at python.org >>>> https://mail.python.org/mailman/listinfo/python-dev >>>> Unsubscribe: https://mail.python.org/mailman/options/python-dev/ >>>> wes.turner%40gmail.com >>>> >>> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> https://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: https://mail.python.org/mailman/options/python-dev/ >> jackiekazil%40gmail.com >> > > > -- > Jacqueline Kazil | @jackiekazil > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Sat Sep 15 16:46:06 2018 From: chris.barker at noaa.gov (Chris Barker) Date: Sat, 15 Sep 2018 22:46:06 +0200 Subject: [Python-Dev] Official citation for Python In-Reply-To: References: <233a848c-c8a6-fda5-1c83-6cbf281b28eb@klix.ch> <23449.57520.710702.426310@turnbull.sk.tsukuba.ac.jp> Message-ID: > > On Saturday, September 15, 2018, Jacqueline Kazil > wrote: > >> I just got caught up on the thread. This is a really great discussion. >> Thank you for all the contributions. >> >> Before we get into the details, let's go back to the main use case we are >> trying to solve. >> *As a user, I am writing an academic paper and I need to cite Python. * >> > ai'd still like to know *why* you need to cite python 0 I can imagine multiple reasons, and that may influence the best document to cite. > Let's throw reproducibility out the window for now (<--- something I never >> thought I would say), because that should be captured in the code, not in >> the citations. >> > thanks for that clarification. > So, if we don't need the specific version of Python, then maybe creating >> one citation is all we need. >> > well, Python does evolve over time, so depending on why you are citing it, version may matter. But i suggest hat the language reference be used as the "primary" citation for Python, and then you can cite the version that is current at the time of your paper writing (Or the version that's relevant to your paper). And that gives it some good Google juice as well. >> > > https://scholar.google.com/scholar?hl=en&q=python+van+Rossum * > looks like the language reference shows there -- so good to go. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From chema at rinzewind.org Sat Sep 15 19:21:44 2018 From: chema at rinzewind.org (=?iso-8859-1?Q?Jos=E9_Mar=EDa?= Mateos) Date: Sat, 15 Sep 2018 19:21:44 -0400 Subject: [Python-Dev] Official citation for Python In-Reply-To: References: Message-ID: <20180915232144.GA5339@equipaje> On Sun, Sep 09, 2018 at 03:43:13PM -0400, Jacqueline Kazil wrote: > The PSF has received a few inquiries asking the question ? ?How do I cite > Python??So, I am reaching out to you all to figure this out. What about the R approach? --- > citation() To cite R in publications use: R Core Team (2018). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/. 
A BibTeX entry for LaTeX users is @Manual{, title = {R: A Language and Environment for Statistical Computing}, author = {{R Core Team}}, organization = {R Foundation for Statistical Computing}, address = {Vienna, Austria}, year = {2018}, url = {https://www.R-project.org/}, } We have invested a lot of time and effort in creating R, please cite it when using it for data analysis. See also ?citation("pkgname")? for citing R packages. --- Cheers, -- Jos? Mar?a (Chema) Mateos https://rinzewind.org/blog-es || https://rinzewind.org/blog-en From turnbull.stephen.fw at u.tsukuba.ac.jp Sun Sep 16 04:35:32 2018 From: turnbull.stephen.fw at u.tsukuba.ac.jp (Stephen J. Turnbull) Date: Sun, 16 Sep 2018 17:35:32 +0900 Subject: [Python-Dev] Official citation for Python In-Reply-To: References: <233a848c-c8a6-fda5-1c83-6cbf281b28eb@klix.ch> <23449.57520.710702.426310@turnbull.sk.tsukuba.ac.jp> Message-ID: <23454.5588.561636.164398@turnbull.sk.tsukuba.ac.jp> Jacqueline Kazil writes: > *As a user, I am writing an academic paper and I need to cite Python. * I don't understand the meaning of "need" and "Python". To understand your code, one likely needs the Language Reference and surely the Library Reference, and probably documentation of the APIs and semantics of various third party code. To just give credit to the Python project for the suite of tools you've used, a citation like the R Project's should do (I think this has appeared more than once, I copy it from Jos? Mar?a Mateos's parallel post): > To cite R in publications use: > R Core Team (2018). R: A language and environment for statistical > computing. R Foundation for Statistical Computing, Vienna, Austria. > URL https://www.R-project.org/. I guess for Python that would be something like """ Python Core Developers [2018]. Python: A general purpose language for computing, with batteries included. Python Software Foundation, Beaverton, OR. https://www.python.org/. """ I like R's citation() builtin. One caveat: I get the impression that the R Project is far more centralized than Python is, that there are not huge independent projects like SciPy and NumPy and Twisted and so on, nor independent implementations of the core language like PyPy and Jython. So I suspect that for most serious scientific computing you would need to cite one or more third-pary projects as well, and perhaps an implementation such as PyPy or Jython. Jacqueline again: > Let's throw reproducibility out the window for now (<--- something > I never thought I would say), because that should be captured in > the code, not in the citations. > > So, if we don't need the specific version of Python, then maybe > creating one citation is all we need. Do you realize that `3 / 2` means different computations depending on the version of Python? And that `"a string"` produces different objects with different duck-types depending on the version? As far as handling versions, this would do, I think: f""" Python Core Developers [{release_year}]. Python: A general purpose language for computing, with batteries included, version {version_number}. Python Software Foundation, Beaverton, OR. Project URL: https://www.python.org/. """ From wes.turner at gmail.com Sun Sep 16 10:07:09 2018 From: wes.turner at gmail.com (Wes Turner) Date: Sun, 16 Sep 2018 10:07:09 -0400 Subject: [Python-Dev] SEC: Spectre variant 2: GCC: -mindirect-branch=thunk -mindirect-branch-register In-Reply-To: References: Message-ID: Should Python builds add `-mindirect-branch=thunk -mindirect-branch-register` to CFLAGS? 
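(To make the question concrete: a minimal, hypothetical probe -- the helper
below is invented for illustration, not an existing distutils hook -- that
checks whether the active compiler even accepts the retpoline flags before
appending them to CFLAGS. Per the distutils-sig exchange quoted below, only
GCC 7.3 and 8 accept them.)

    import os
    import subprocess
    import tempfile

    RETPOLINE_FLAGS = ["-mindirect-branch=thunk", "-mindirect-branch-register"]

    def compiler_accepts(flags, cc="gcc"):
        # Compile an empty program with the flags; a nonzero exit status
        # means the compiler rejected them (e.g. GCC older than 7.3).
        with tempfile.TemporaryDirectory() as tmp:
            src = os.path.join(tmp, "probe.c")
            with open(src, "w") as f:
                f.write("int main(void) { return 0; }\n")
            proc = subprocess.run(
                [cc] + flags + [src, "-o", os.path.join(tmp, "probe")],
                stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
            return proc.returncode == 0

    if __name__ == "__main__":
        if compiler_accepts(RETPOLINE_FLAGS):
            print("CFLAGS += " + " ".join(RETPOLINE_FLAGS))
        else:
            print("no retpoline support in this compiler (needs GCC 7.3+)")

For CPython itself this would presumably just be extra CFLAGS handed to
./configure; for third-party extensions it is whatever the build backend
honors.
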
Where would this be to be added in the build scripts with which architectures? /QSpectre is the MSVC build flag for Spectre Variant 1: > The /Qspectre option is available in Visual Studio 2017 version 15.7 and later. https://docs.microsoft.com/en-us/cpp/build/reference/qspectre?view=vs-2017 security@ directed me to the issue tracker / lists, so I'm forwarding this to python-dev and python-ideas, as well. # Forwarded message From: *Wes Turner* Date: Wednesday, September 12, 2018 Subject: SEC: Spectre variant 2: GCC: -mindirect-branch=thunk -mindirect-branch-register To: distutils-sig Should C extensions that compile all add `-mindirect-branch=thunk -mindirect-branch-register` [1] to mitigate the risk of Spectre variant 2 (which does indeed affect user space applications as well as kernels)? [1] https://github.com/speed47/spectre-meltdown-checker/ issues/119#issuecomment-361432244 [2] https://en.wikipedia.org/wiki/Spectre_(security_vulnerability) [3] https://en.wikipedia.org/wiki/Speculative_Store_Bypass# Speculative_execution_exploit_variants On Wednesday, September 12, 2018, Wes Turner wrote: > >> On Wednesday, September 12, 2018, Joni Orponen >> wrote: >> >>> On Wed, Sep 12, 2018 at 8:48 PM Wes Turner wrote: >>> >>>> Should C extensions that compile all add >>>> `-mindirect-branch=thunk -mindirect-branch-register` [1] to mitigate >>>> the risk of Spectre variant 2 (which does indeed affect user space >>>> applications as well as kernels)? >>>> >>> >>> Are those available on GCC <= 4.2.0 as per PEP 513? >>> >> >> AFAIU, only >> GCC 7.3 and 8 have the retpoline (indirect-branch=thunk) support enabled >> by the `-mindirect-branch=thunk -mindirect-branch-register` CFLAGS. >> > On Wednesday, September 12, 2018, Wes Turner wrote: > "What is a retpoline and how does it work?" > https://stackoverflow.com/questions/48089426/what-is-a- > retpoline-and-how-does-it-work > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wes.turner at gmail.com Sun Sep 16 10:16:19 2018 From: wes.turner at gmail.com (Wes Turner) Date: Sun, 16 Sep 2018 10:16:19 -0400 Subject: [Python-Dev] SEC: Spectre variant 2: GCC: -mindirect-branch=thunk -mindirect-branch-register In-Reply-To: References: Message-ID: On Sunday, September 16, 2018, Wes Turner wrote: > Should Python builds add `-mindirect-branch=thunk > -mindirect-branch-register` to CFLAGS? > > Where would this be to be added in the build scripts with which > architectures? > > /QSpectre is the MSVC build flag for Spectre Variant 1: > > > The /Qspectre option is available in Visual Studio 2017 version 15.7 and > later. > > https://docs.microsoft.com/en-us/cpp/build/reference/qspectre?view=vs-2017 > > security@ directed me to the issue tracker / lists, > so I'm forwarding this to python-dev and python-ideas, as well. > > # Forwarded message > From: *Wes Turner* > Date: Wednesday, September 12, 2018 > Subject: SEC: Spectre variant 2: GCC: -mindirect-branch=thunk > -mindirect-branch-register > To: distutils-sig > > > Should C extensions that compile all add > `-mindirect-branch=thunk -mindirect-branch-register` [1] to mitigate the > risk of Spectre variant 2 (which does indeed affect user space applications > as well as kernels)? 
> > [1] https://github.com/speed47/spectre-meltdown-checker/issues/ > 119#issuecomment-361432244 > [2] https://en.wikipedia.org/wiki/Spectre_(security_vulnerability) > [3] https://en.wikipedia.org/wiki/Speculative_Store_Bypass#Specu > lative_execution_exploit_variants > > On Wednesday, September 12, 2018, Wes Turner wrote: >> >>> On Wednesday, September 12, 2018, Joni Orponen >>> wrote: >>> >>>> On Wed, Sep 12, 2018 at 8:48 PM Wes Turner >>>> wrote: >>>> >>>>> Should C extensions that compile all add >>>>> `-mindirect-branch=thunk -mindirect-branch-register` [1] to mitigate >>>>> the risk of Spectre variant 2 (which does indeed affect user space >>>>> applications as well as kernels)? >>>>> >>>> >>>> Are those available on GCC <= 4.2.0 as per PEP 513? >>>> >>> >>> AFAIU, only >>> GCC 7.3 and 8 have the retpoline (indirect-branch=thunk) support enabled >>> by the `-mindirect-branch=thunk -mindirect-branch-register` CFLAGS. >>> >> > On Wednesday, September 12, 2018, Wes Turner > wrote: > >> "What is a retpoline and how does it work?" >> https://stackoverflow.com/questions/48089426/what-is-a-retpo >> line-and-how-does-it-work >> >> There's probably already been an ANN announce about this? If not, someone with appropriate security posture and syntax could address: Whether python.org binaries are already rebuilt Whether OS package binaries are already rebuilt Whether anaconda binaries are already rebuilt Whether C extension binaries on pypi are already rebuilt -------------- next part -------------- An HTML attachment was scrubbed... URL: From nas-python at arctrix.com Sun Sep 16 16:13:53 2018 From: nas-python at arctrix.com (Neil Schemenauer) Date: Sun, 16 Sep 2018 14:13:53 -0600 Subject: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement? In-Reply-To: References: <20180914215424.hjxvq5l7m66schas@python.ca> <58cdf8c3-1b23-326c-2244-835f16397b51@hastings.org> <20180914222558.yuromalgkggfp2re@python.ca> Message-ID: <20180916201353.b5dwga73zm7llgg4@python.ca> On 2018-09-15, Paul Moore wrote: > On Fri, 14 Sep 2018 at 23:28, Neil Schemenauer wrote: > > We could have a new format, .pya (compiled python archive) that has > > data for many .pyc files in it. [..] > Isn't that essentially what putting the stdlib in a zipfile does? (See > the windows embedded distribution for an example). It probably uses > normal IO rather than mmap, but maybe adding a "use mmap" flag to the > zipfile module would be a more general enhancement that zipimport > could use for free. Yeah, it's close to the same thing. If the syscalls are what gives the speedup, using a better zipfile implementation might give nearly the same benefit. At the sprint we dicussed a variation of Larry's (FB's) patch. Allow the frozen data to be in DLLs as well as in the python executable data segment. So, importlib would be frozen into the exe. The standard library could become another DLL. The user could provide one or more DLLs that contains their app code and package deps. In general, I think there would only be two DLLs: stdlib and app+deps. My suggestion of a special format (similar to zipfile) was motivated by the wish to avoid platform build tools. E.g. Windows users would have a harder time to build DLLs. However, I now think depending on platform build tools is fine. The people who will build these DLLs will have the tools and skills to do so. Even if there is only a DLLs for the stdlib, it will be a win. If no DLLs are provided, you get the same behavior as current Python (i.e. 
importlib is frozen in, everything else can come from .py files). I think there is no question that Larry's PR will be faster than the zipfile approach. It removes the umarshal step. Maybe that benefit will but small but I think it should count. Also, I suspect the OS can page-in the DLL on-demand and perhaps leave parts of module .pyc data on disk. Larry had the idea of keeping code objects frozen until they need to be executed. It's a cool idea that would be enabled by this first step. I'm excited about Larry's PR. I think if we get it cleanup up and into Python 3.8, we will clearly leave Python 2.7 behind in terms of startup performance. That has been a goal of mine for a couple years now. Regards, Neil From solipsis at pitrou.net Sun Sep 16 16:24:41 2018 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 16 Sep 2018 22:24:41 +0200 Subject: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement? References: Message-ID: <20180916222441.13600498@fsol> On Fri, 14 Sep 2018 14:27:37 -0700 Larry Hastings wrote: > > I don't propose to merge the patch in its current state.? I think it > would need a lot of work both in terms of "doing things the way Python > does it" as well as just code smell (the serializer is implemented in > both C and Python and jumps back and forth, also the build process for > the serialized modules is pretty tiresome). > > Is it worth working on? I think it's of limited interest if it only helps with modules used during the startup sequence, not arbitrary stdlib or third-party modules. To give an idea, on my machine the baseline Python startup is about 20ms (`time python -c pass`), but if I import Numpy it grows to 100ms, and with Pandas it's more than 200ms. Saving 4ms on the baseline startup would make no practical difference for concrete usage. I'm ready to think there are other use cases where it matters, though. Regards Antoine. From jackiekazil at gmail.com Sun Sep 16 18:22:47 2018 From: jackiekazil at gmail.com (Jacqueline Kazil) Date: Sun, 16 Sep 2018 18:22:47 -0400 Subject: [Python-Dev] Official citation for Python In-Reply-To: <23454.5588.561636.164398@turnbull.sk.tsukuba.ac.jp> References: <233a848c-c8a6-fda5-1c83-6cbf281b28eb@klix.ch> <23449.57520.710702.426310@turnbull.sk.tsukuba.ac.jp> <23454.5588.561636.164398@turnbull.sk.tsukuba.ac.jp> Message-ID: RE: Why cite Python?. I would say that in this paper ? http://conference.scipy.org/proceedings/scipy2015/pdfs/jacqueline_kazil.pdf, where we introduced a new library, we should have cited Python, because the library was based in Python. We were riding on the coattails of Python and if Python did not exist, then this library would not exist. (taking this a level higher) Just as someone doing research (a specific application) should cite the Mesa library. Without the good and bad that is Mesa, their research would have taken a different form. Since my Ph.D is on Mesa, I will be citing Python there. I think for more insight we can look at who has cited some of Guido?s stuff? For example: https://scholar.google.com/scholar?cites=900267235435084077&as_sdt=20005&sciodt=0,9&hl=en Does that help? RE: Just like R - Versions @Stephen Are you suggesting major versions or minor versions? RE: Guido?s prio works Some of those have weight already. Should we be picking one those and pointing people to that? Final decision I am going to the NumFocus summit for maintainers of Science Python libraries next week. 
I believe that the Science Python community is where the main audience for this is? correct me if you think this is a wrong assumption. I thought I could take two to three concrete formats and user test there and report on how community members who would be using the citation feel. Good idea? Bad idea? On Sun, Sep 16, 2018 at 4:35 AM Stephen J. Turnbull < turnbull.stephen.fw at u.tsukuba.ac.jp> wrote: > Jacqueline Kazil writes: > > > *As a user, I am writing an academic paper and I need to cite Python. * > > I don't understand the meaning of "need" and "Python". To understand > your code, one likely needs the Language Reference and surely the > Library Reference, and probably documentation of the APIs and > semantics of various third party code. > > To just give credit to the Python project for the suite of tools > you've used, a citation like the R Project's should do (I think this > has appeared more than once, I copy it from Jos? Mar?a Mateos's > parallel post): > > > To cite R in publications use: > > > R Core Team (2018). R: A language and environment for statistical > > computing. R Foundation for Statistical Computing, Vienna, Austria. > > URL https://www.R-project.org/. > > I guess for Python that would be something like > > """ > Python Core Developers [2018]. Python: A general purpose language for > computing, with batteries included. Python Software Foundation, > Beaverton, OR. https://www.python.org/. > """ > > I like R's citation() builtin. > > One caveat: I get the impression that the R Project is far more > centralized than Python is, that there are not huge independent > projects like SciPy and NumPy and Twisted and so on, nor independent > implementations of the core language like PyPy and Jython. So I > suspect that for most serious scientific computing you would need to > cite one or more third-pary projects as well, and perhaps an > implementation such as PyPy or Jython. > > Jacqueline again: > > > Let's throw reproducibility out the window for now (<--- something > > I never thought I would say), because that should be captured in > > the code, not in the citations. > > > > So, if we don't need the specific version of Python, then maybe > > creating one citation is all we need. > > Do you realize that `3 / 2` means different computations depending on > the version of Python? And that `"a string"` produces different > objects with different duck-types depending on the version? > > As far as handling versions, this would do, I think: > > f""" > Python Core Developers [{release_year}]. Python: A general purpose > language for computing, with batteries included, version > {version_number}. Python Software Foundation, Beaverton, OR. > Project URL: https://www.python.org/. > """ > -- Jacqueline Kazil | @jackiekazil -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Sun Sep 16 19:19:00 2018 From: brett at python.org (Brett Cannon) Date: Sun, 16 Sep 2018 16:19:00 -0700 Subject: [Python-Dev] Official citation for Python In-Reply-To: References: <233a848c-c8a6-fda5-1c83-6cbf281b28eb@klix.ch> <23449.57520.710702.426310@turnbull.sk.tsukuba.ac.jp> <23454.5588.561636.164398@turnbull.sk.tsukuba.ac.jp> Message-ID: On Sun, 16 Sep 2018 at 15:23 Jacqueline Kazil wrote: > RE: Why cite Python?. > > I would say that in this paper ? > http://conference.scipy.org/proceedings/scipy2015/pdfs/jacqueline_kazil.pdf, > where we introduced a new library, we should have cited Python, because the > library was based in Python. 
We were riding on the coattails of Python and > if Python did not exist, then this library would not exist. > > (taking this a level higher) > Just as someone doing research (a specific application) should cite the > Mesa library. Without the good and bad that is Mesa, their research would > have taken a different form. > > Since my Ph.D is on Mesa, I will be citing Python there. > > I think for more insight we can look at who has cited some of Guido?s > stuff? > For example: > https://scholar.google.com/scholar?cites=900267235435084077&as_sdt=20005&sciodt=0,9&hl=en > > Does that help? > RE: Just like R - Versions > > @Stephen > Are you suggesting major versions or minor versions? > RE: Guido?s prio works > > Some of those have weight already. Should we be picking one those and > pointing people to that? > Final decision > > I am going to the NumFocus summit for maintainers of Science Python > libraries next week. I believe that the Science Python community is where > the main audience for this is? correct me if you think this is a wrong > assumption. > > I thought I could take two to three concrete formats and user test there > and report on how community members who would be using the citation feel. > > Good idea? Bad idea? > I think seeing how some other academics other than the ones here definitely wouldn't hurt. -Brett > > On Sun, Sep 16, 2018 at 4:35 AM Stephen J. Turnbull < > turnbull.stephen.fw at u.tsukuba.ac.jp> wrote: > >> Jacqueline Kazil writes: >> >> > *As a user, I am writing an academic paper and I need to cite Python. * >> >> I don't understand the meaning of "need" and "Python". To understand >> your code, one likely needs the Language Reference and surely the >> Library Reference, and probably documentation of the APIs and >> semantics of various third party code. >> >> To just give credit to the Python project for the suite of tools >> you've used, a citation like the R Project's should do (I think this >> has appeared more than once, I copy it from Jos? Mar?a Mateos's >> parallel post): >> >> > To cite R in publications use: >> >> > R Core Team (2018). R: A language and environment for statistical >> > computing. R Foundation for Statistical Computing, Vienna, Austria. >> > URL https://www.R-project.org/. >> >> I guess for Python that would be something like >> >> """ >> Python Core Developers [2018]. Python: A general purpose language for >> computing, with batteries included. Python Software Foundation, >> Beaverton, OR. https://www.python.org/. >> """ >> >> I like R's citation() builtin. >> >> One caveat: I get the impression that the R Project is far more >> centralized than Python is, that there are not huge independent >> projects like SciPy and NumPy and Twisted and so on, nor independent >> implementations of the core language like PyPy and Jython. So I >> suspect that for most serious scientific computing you would need to >> cite one or more third-pary projects as well, and perhaps an >> implementation such as PyPy or Jython. >> >> Jacqueline again: >> >> > Let's throw reproducibility out the window for now (<--- something >> > I never thought I would say), because that should be captured in >> > the code, not in the citations. >> > >> > So, if we don't need the specific version of Python, then maybe >> > creating one citation is all we need. >> >> Do you realize that `3 / 2` means different computations depending on >> the version of Python? And that `"a string"` produces different >> objects with different duck-types depending on the version? 
>> >> As far as handling versions, this would do, I think: >> >> f""" >> Python Core Developers [{release_year}]. Python: A general purpose >> language for computing, with batteries included, version >> {version_number}. Python Software Foundation, Beaverton, OR. >> Project URL: https://www.python.org/. >> """ >> > > > -- > Jacqueline Kazil | @jackiekazil > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/brett%40python.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jackiekazil at gmail.com Sun Sep 16 19:30:07 2018 From: jackiekazil at gmail.com (Jacqueline Kazil) Date: Sun, 16 Sep 2018 19:30:07 -0400 Subject: [Python-Dev] Official citation for Python In-Reply-To: References: <233a848c-c8a6-fda5-1c83-6cbf281b28eb@klix.ch> <23449.57520.710702.426310@turnbull.sk.tsukuba.ac.jp> <23454.5588.561636.164398@turnbull.sk.tsukuba.ac.jp> Message-ID: Cool, thanks! On Sun, Sep 16, 2018 at 7:19 PM Brett Cannon wrote: > > > On Sun, 16 Sep 2018 at 15:23 Jacqueline Kazil > wrote: > >> RE: Why cite Python?. >> >> I would say that in this paper ? >> http://conference.scipy.org/proceedings/scipy2015/pdfs/jacqueline_kazil.pdf, >> where we introduced a new library, we should have cited Python, because the >> library was based in Python. We were riding on the coattails of Python and >> if Python did not exist, then this library would not exist. >> >> (taking this a level higher) >> Just as someone doing research (a specific application) should cite the >> Mesa library. Without the good and bad that is Mesa, their research would >> have taken a different form. >> >> Since my Ph.D is on Mesa, I will be citing Python there. >> >> I think for more insight we can look at who has cited some of Guido?s >> stuff? >> For example: >> https://scholar.google.com/scholar?cites=900267235435084077&as_sdt=20005&sciodt=0,9&hl=en >> >> Does that help? >> RE: Just like R - Versions >> >> @Stephen >> Are you suggesting major versions or minor versions? >> RE: Guido?s prio works >> >> Some of those have weight already. Should we be picking one those and >> pointing people to that? >> Final decision >> >> I am going to the NumFocus summit for maintainers of Science Python >> libraries next week. I believe that the Science Python community is where >> the main audience for this is? correct me if you think this is a wrong >> assumption. >> >> I thought I could take two to three concrete formats and user test there >> and report on how community members who would be using the citation feel. >> >> Good idea? Bad idea? >> > I think seeing how some other academics other than the ones here > definitely wouldn't hurt. > > -Brett > > >> >> On Sun, Sep 16, 2018 at 4:35 AM Stephen J. Turnbull < >> turnbull.stephen.fw at u.tsukuba.ac.jp> wrote: >> >>> Jacqueline Kazil writes: >>> >>> > *As a user, I am writing an academic paper and I need to cite Python. >>> * >>> >>> I don't understand the meaning of "need" and "Python". To understand >>> your code, one likely needs the Language Reference and surely the >>> Library Reference, and probably documentation of the APIs and >>> semantics of various third party code. >>> >>> To just give credit to the Python project for the suite of tools >>> you've used, a citation like the R Project's should do (I think this >>> has appeared more than once, I copy it from Jos? 
Mar?a Mateos's >>> parallel post): >>> >>> > To cite R in publications use: >>> >>> > R Core Team (2018). R: A language and environment for statistical >>> > computing. R Foundation for Statistical Computing, Vienna, Austria. >>> > URL https://www.R-project.org/. >>> >>> I guess for Python that would be something like >>> >>> """ >>> Python Core Developers [2018]. Python: A general purpose language for >>> computing, with batteries included. Python Software Foundation, >>> Beaverton, OR. https://www.python.org/. >>> """ >>> >>> I like R's citation() builtin. >>> >>> One caveat: I get the impression that the R Project is far more >>> centralized than Python is, that there are not huge independent >>> projects like SciPy and NumPy and Twisted and so on, nor independent >>> implementations of the core language like PyPy and Jython. So I >>> suspect that for most serious scientific computing you would need to >>> cite one or more third-pary projects as well, and perhaps an >>> implementation such as PyPy or Jython. >>> >>> Jacqueline again: >>> >>> > Let's throw reproducibility out the window for now (<--- something >>> > I never thought I would say), because that should be captured in >>> > the code, not in the citations. >>> > >>> > So, if we don't need the specific version of Python, then maybe >>> > creating one citation is all we need. >>> >>> Do you realize that `3 / 2` means different computations depending on >>> the version of Python? And that `"a string"` produces different >>> objects with different duck-types depending on the version? >>> >>> As far as handling versions, this would do, I think: >>> >>> f""" >>> Python Core Developers [{release_year}]. Python: A general purpose >>> language for computing, with batteries included, version >>> {version_number}. Python Software Foundation, Beaverton, OR. >>> Project URL: https://www.python.org/. >>> """ >>> >> >> >> -- >> Jacqueline Kazil | @jackiekazil >> >> >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> https://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: >> https://mail.python.org/mailman/options/python-dev/brett%40python.org >> > -- Jacqueline Kazil | @jackiekazil -------------- next part -------------- An HTML attachment was scrubbed... URL: From paul at ganssle.io Sun Sep 16 19:35:12 2018 From: paul at ganssle.io (Paul Ganssle) Date: Sun, 16 Sep 2018 19:35:12 -0400 Subject: [Python-Dev] Official citation for Python In-Reply-To: References: <233a848c-c8a6-fda5-1c83-6cbf281b28eb@klix.ch> <23449.57520.710702.426310@turnbull.sk.tsukuba.ac.jp> <23454.5588.561636.164398@turnbull.sk.tsukuba.ac.jp> Message-ID: <57be8f7d-08fe-1f0f-5779-67ea7a2de01f@ganssle.io> I think the "why" in this case should be a bit deeper than that, because until recently, it's been somewhat unusual to cite the /tools you use/ to create a paper. I see three major reasons why people cite software packages, and the form of the citation would have different requirements for each one: 1. *Academic credit / Academic use metrics* The weird way that academia has evolved, academics are largely judged by their publications and how influential those publications are. 
A lot of the people who work on statistical and scientific python libraries are doing excellent and incredibly influential work, but that's largely invisible to the metrics used by funding and tenure committees, so there's been an effort do things like getting DOIs for libraries or publishing articles in journals like the journal of open source software: https://joss.theoj.org Then you cite the libraries if you use them, and the people who contribute to the work can say, "Look I'm a regular contributor to this core library that is cited in 90% of papers". This seems less important to CPython, where the majority of core contributors (as far as I can tell) are not academics and have little use for high h-index papers. That said, even if no one involved cares about the academic credit, if every paper that used Python cited the language, it probably /would/ provide useful metrics to the PSF and others interested in this. If all you want is a formal way to say "I used Python for this" as a citation so that it can be tracked, then a single DOI for the entire language should be sufficient. 2. *As a primary source or example for some claims * If you are writing an article about language design and you are referencing how Python handles async or scoping or unicode or something, you want to make it easy for your readers to see the context of your statement, to verify that it's true and to get more details than you might want to include as part of what may be a tangential mention in your paper. I have a sense that this is closer to the original reason people cited things in papers and books before citations became a metric for measuring influence - and subsequently a way to give credit for the source of ideas. If this is why you are citing Python, you should probably be citing a specific sub-section of the language reference and/or documentation, and that citation should probably be versioned, since new features are added in every minor version, and the way some of these things are handled may change over time. In this case, a separate DOI for each minor version that points to the documentation as built by a specific commit or git tag or whatever would probably be ideal. 3. *To aid reproducibility* It won't go all the way towards reproducing your research, but given that Python is a living language that is always changing - both in implementation and the spec itself - to the extent that you have a "methods" section, it should probably include things like operating system version, CPython version and the versions of all libraries you used so that if someone is failing to replicate your results, they know how to build an environment where it /should work/. If you want to include this information in the form of a citation, then I would think that you would not want to be both more granular - citing the specific interpreter you used (CPython, Jython, Pypy), the full version (3.6.6 rather than 3.6) and possibly even other factors like operating system, etc, and /less/ granular in that you don't need to cite a specific subset of the interpreter (e.g. async), but just the interpreter as a whole. -- My thoughts on the matter are that I think the CPython core dev team probably cares a lot less about #1 than, say, the R dev team, which is one reason why there's no clear way to cite "CPython" as a whole. 
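On the reproducibility side (#3), the kind of setup summary I have in mind could be as small as the rough sketch below - the helper name and the package list are made-up examples for illustration, not an existing API:

    import platform

    def describe_python_setup(packages=("numpy", "pandas")):
        """Collect interpreter, OS and package versions for a methods section."""
        info = {
            "implementation": platform.python_implementation(),  # CPython, PyPy, ...
            "python_version": platform.python_version(),          # e.g. 3.6.6, not just 3.6
            "os": platform.platform(),
        }
        for name in packages:
            try:
                module = __import__(name)
                info[name] = getattr(module, "__version__", "unknown")
            except ImportError:
                info[name] = "not installed"
        return info

    if __name__ == "__main__":
        for key, value in sorted(describe_python_setup().items()):
            print(key, value)

Run once under the interpreter actually used, that records the details that matter for replication without overloading the citation itself.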
I think that #3 is a very laudable goal, but probably should be in some sort of "methods" section of the document being prepared rather than overloading citations for it, though having a standardized way to describe your Python setup (similar to, say, the pandas debugging feature `pandas.show_versions()`) that is optimized for publication would probably be super helpful. While #2 is probably only a small fraction of all the times where people would want to "cite CPython", I think it's probably the most important one, since it's performing a very specific function useful to the reader of the paper. It also seems not terribly difficult to come up with some guidance for unambiguously referencing sections of the documentation and/or language reference, and having "get a DOI for the documentation" be part of the release cycle. Best, Paul P.S. I will also be at the NumFocus summit. It's been some time since I've been an academic, but hopefully there will be an interesting discussion about this there! On 9/16/18 6:22 PM, Jacqueline Kazil wrote: > > RE: Why cite Python?. > > I would say that in this paper ? > http://conference.scipy.org/proceedings/scipy2015/pdfs/jacqueline_kazil.pdf, > where we introduced a new library, we should have cited Python, > because the library was based in Python. We were riding on the > coattails of Python and if Python did not exist, then this library > would not exist. > > (taking this a level higher) > Just as someone doing research (a specific application) should cite > the Mesa library. Without the good and bad that is Mesa, their > research would have taken a different form. > > Since my Ph.D is on Mesa, I will be citing Python there. > > I think for more insight we can look at who has cited some of Guido?s > stuff? > For example: > https://scholar.google.com/scholar?cites=900267235435084077&as_sdt=20005&sciodt=0,9&hl=en > > Does that help? > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: OpenPGP digital signature URL: From wes.turner at gmail.com Sun Sep 16 20:29:06 2018 From: wes.turner at gmail.com (Wes Turner) Date: Sun, 16 Sep 2018 20:29:06 -0400 Subject: [Python-Dev] SEC: Spectre variant 2: GCC: -mindirect-branch=thunk -mindirect-branch-register In-Reply-To: References: Message-ID: Are all current Python builds and C extensions vulnerable to Spectre variants {1, 2, *}? There are now multiple threads: "SEC: Spectre variant 2: GCC: -mindirect-branch=thunk -mindirect-branch-register" - https://mail.python.org/mm3/archives/list/distutils-sig at python.org/thread/4BGE226DB5EWIAT5VCJ75QD5ASOVJZCM/ - https://mail.python.org/pipermail/python-ideas/2018-September/053473.html - https://mail.python.org/pipermail/python-dev/2018-September/155199.html Original thread (that I forwarded to security@): "[Python-ideas] Executable space protection: NX bit," https://mail.python.org/pipermail/python-ideas/2018-September/053175.html > ~ Do trampolines / nested functions in C extensions switch off the NX bit? On Sunday, September 16, 2018, Nathaniel Smith wrote: > On Wed, Sep 12, 2018, 12:29 Joni Orponen wrote: > >> On Wed, Sep 12, 2018 at 8:48 PM Wes Turner wrote: >> >>> Should C extensions that compile all add >>> `-mindirect-branch=thunk -mindirect-branch-register` [1] to mitigate the >>> risk of Spectre variant 2 (which does indeed affect user space applications >>> as well as kernels)? 
>>> >> >> Are those available on GCC <= 4.2.0 as per PEP 513? >> > > Pretty sure no manylinux1 compiler is ever going to get these mitigations. > > For manylinux2010 on x86-64, we can easily use a much newer compiler: RH > maintains a recent compiler, currently gcc 7.3, or if that doesn't work for > some reason then the conda folks have be apparently figured out how to > build the equivalent from gcc upstream releases. > Are there different CFLAGS and/or gcc compatibility flags in conda builds of Python and C extensions? Where are those set in conda builds? What's the best way to set CFLAGS in Python builds and C extensions? export CFLAGS="-mindirect-branch=thunk -mindirect-branch-register" ./configure make ? Why are we supposed to use an old version of GCC that doesn't have the retpoline patches that only mitigate Spectre variant 2? > > Unfortunately, the manylinux2010 infrastructure is not quite ready... I'm > pretty sure it needs some volunteers to push it to the finish line, though > unfortunately I haven't had enough time to keep track. > "PEP 571 -- The manylinux2010 Platform Tag" https://www.python.org/dev/peps/pep-0571/ "Tracking issue for manylinux2010 rollout" https://github.com/pypa/manylinux/issues/179 Are all current Python builds and C extensions vulnerable to Spectre variants {1, 2, *}? > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jeremy at alum.mit.edu Mon Sep 17 00:05:30 2018 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Mon, 17 Sep 2018 00:05:30 -0400 Subject: [Python-Dev] Official citation for Python In-Reply-To: References: <233a848c-c8a6-fda5-1c83-6cbf281b28eb@klix.ch> <23449.57520.710702.426310@turnbull.sk.tsukuba.ac.jp> Message-ID: I wanted to start with an easy answer that is surely unsatisfying: http://blog.apastyle.org/apastyle/2015/01/how-to-cite-software-in-apa-style.html APA style is pretty popular, and it says that standard software doesn't need to be specified. Standard software includes "Microsoft Word, Java, and Adobe Photoshop." So I'd say Python fits well in that category, and doesn't need to be cited. I said you wouldn't be satisfied... On Sat, Sep 15, 2018 at 11:02 AM Jacqueline Kazil wrote: > I just got caught up on the thread. This is a really great discussion. > Thank you for all the contributions. > > Before we get into the details, let's go back to the main use case we are > trying to solve. > *As a user, I am writing an academic paper and I need to cite Python. * > The goal here is ambiguous. Python means many things--a language described by the language specification, the source code of a particular implementation of the language (Python often refers to C Python), a particular binary release of the implementation of the language (Python 1.5.2 for Windows). Which one is relevant in the context of the paper? If you're talking about a bug in timsort in a particular version of C Python, then you probably want to cite that specific version of the implementation. I suspect the most common goal for a citation is just to describe the language "in general" where 1.5.2 or 3.7.0 and Jython or CPython are all details that don't matter. In that case I'd cite the language specification. We're talking about putting a citation in a paper (a written source) and the written language specification captures what we think of as essential for the language. 
If you want to cite Turing's proof of the undecidability of the halting problem, you'd cite the paper where he wrote it down (in Proceedings of the London Mathematical Society). If you want to cite a programming language in the abstract, cite the specification that describes it. I think style guides are relevant here. They give guidance on how to cite an item based on its category. For example, the MLA style guide describes how to cite a digital file, a physical object, and many other things. My favorite example under "physical object" is "Physical objects found online." (Think about it :-). There's some discussion of how to cite source code here: http://integrity.mit.edu/handbook/writing-code. Notably this is talking about citing source code in the context of other source code, and it mostly recommends using URLs. If you wanted to cite a particular piece of source code in an written article, you'd probably follow one of the approaches for citing online resources. Try to identify who / when / what / where. For example MLA style for a blog post would be : Editor, screen name, author, or compiler name (if available). ?Posting Title.? Name of Site, Version number (if available), Name of institution/organization affiliated with the site (sponsor or publisher), URL. Date of access. You could cite a particular source file this way or a particular source release. The date usually refers to the original publication date. I think that was with the 1.0 release, although I'm not sure. I'd probably pick that date, but someone can correct me if there's an earlier date. It would suggest somehow that current Python and the original Python were mostly the same thing, which is an idea I like. van Rossum, Guido (1994). "The Python Language Reference". Python Software Foundation, https://docs.python.org/reference/index.html. Retrieved 16 September 2018. I'd say that's all settled. If anyone asks you, "How can you be sure that settles it?" You can answer, "Some guy said it on a mailing list." And then you can site the message: Jeremy Hylton. "[Python-Dev] Official citation for Python." Sep. 17, 2018. python-dev, https://mail.python.org/mailman/listinfo/python-dev. Accessed 18 September 2018. Jeremy > Let's throw reproducibility out the window for now (<--- something I never > thought I would say), because that should be captured in the code, not in > the citations. > > So, if we don't need the specific version of Python, then maybe creating > one citation is all we need. > And that gives it some good Google juice as well. > > Thoughts? > > (Once we nail down one or many, I think we can then move into the details > of the content of the citation.) > > -Jackie > > On Thu, Sep 13, 2018 at 12:47 AM Wes Turner wrote: > >> There was a thread about adding __cite__ to things and a tool to collect >> those citations awhile back. >> >> "[Python-ideas] Add a __cite__ method for scientific packages" >> http://markmail.org/thread/rekmbmh64qxwcind >> >> Which CPython source file should contain this __cite__ value? >> >> ... On a related note, you should ask the list admin to append a URL to >> each mailing list message whenever this list is upgraded to mm3; so that >> you can all be appropriately cited. >> >> On Thursday, September 13, 2018, Wes Turner wrote: >> >>> Do you guys think we should all cite Grub and BusyBox and bash and libc >>> and setuptools and pip and openssl and GNU/Linux and LXC and Docker; or >>> else it's plagiarism for us all? >>> >>> #OpenAccess >>> >>> On Wednesday, September 12, 2018, Stephen J. 
Turnbull < >>> turnbull.stephen.fw at u.tsukuba.ac.jp> wrote: >>> >>>> Chris Barker via Python-Dev writes: >>>> >>>> > But "I wrote some code in Python to produce these statistics" -- >>>> > does that need a citation? >>>> >>>> That depends on what you mean by "statistics" and whether (as one >>>> should) one makes the code available. If the code is published or >>>> "available on request", definitely, Python should be cited. If not, >>>> and by "statistics" you mean the kind of things provided by Steven >>>> d'Aprano's excellent statistics module (mean, median, standard >>>> deviation, etc), maybe no citation is needed. But anything more >>>> esoteric than that (even linear regression), yeah, I would say you >>>> should cite both Python and any reference you used to learn the >>>> algorithm or formulas, in the context of mentioning that your >>>> statistics are home-brew, not produced by one of the recognized >>>> applications for doing so. >>>> >>>> > If so, maybe that would take a different form. >>>> >>>> Yes, it would. But not so different: eg, version is analogous to >>>> edition when citing a book. >>>> >>>> > Anyway, hard to make this decision without some idea how the >>>> > citation is intended to be used. >>>> >>>> Same as any other citation, (1) to give credit to those responsible >>>> for providing a resource (this is why publishers and their metadata of >>>> city are still conventionally included), and (2) to show where that >>>> resource can be obtained. AFAICS, both motivations are universally >>>> applicable in polite society. NB: Replication is an important reason >>>> for wanting to acquire the resource, but it's not the only one. >>>> >>>> I think underlying your comment is the question of *what* resource is >>>> being cited. I can think of three offhand that might be characterized >>>> as "Python". First, the PSF, as a provider of funding. There is a >>>> conventional form for this: a footnote on the title or author's name >>>> saying "The author acknowledges [a] >>>> grant [grant identifier if available] from the Python Software >>>> Foundation." I usually orally mention them in presentations, too. >>>> That one's easy; *everybody* should *always* do that. >>>> >>>> The rest of these, sort of an ideal to strive for. If you keep a >>>> bibliographic database, and there are now quite a few efforts to crowd >>>> source them, it's easier to go the whole 9 yards than to skimp. But >>>> except in cases where we don't need to even mention the code, probably >>>> we should be citing, for reasons of courtesy to readers as well as >>>> authors, editors, and publishers (as disgusting as many publishers are >>>> as members of society, they do play a role in providing many resources >>>> ---we should find ways to compete them into good behavior, not >>>> ostracize them). >>>> >>>> The second is the Python *language and standard library*. Then the >>>> Language Reference and/or the Library Reference should be cited >>>> briefly when Python is first mentioned, and in the text introducing a >>>> program or program fragment, with a full citation in the bibliography. >>>> I tentatively suggest that the metadata for the Language Reference >>>> would be >>>> >>>> Author: principal author(s) (Guido?) et al. 
OR python.org OR >>>> Python Contributors >>>> Title: The Python Language Reference >>>> Version: to match Python version used (if relevant, different >>>> versions each get full citations), probably should not be >>>> "current" >>>> Publisher: Python Software Foundation >>>> Date: of the relevant version >>>> Location: City of legal address of PSF >>>> URL: to version used (probably should not be the default) >>>> Date accessed: if "current" was used >>>> >>>> The Library reference would be the same except for Title. >>>> >>>> The third is a *particular implementation*. In that case the metadata >>>> would be >>>> >>>> Author: principal author(s) (Guido) et al. OR python.org OR >>>> Python Contributors >>>> Title: The cPython Python distribution >>>> Python Version: as appropriate (if relevant, different versions each >>>> get full citations), never "current" >>>> Distributor Version: if different from Python version (eg, >>>> additional >>>> Debian cruft) >>>> Publisher: Distributor (eg, PSF, Debian Project, Anaconda Inc.) >>>> Date: of the relevant version >>>> Location: City of legal address of distributor >>>> >>>> If downloaded: >>>> >>>> URL: to version used (including git commit SHA1 if available) >>>> Date accessed: download from distributor, not installation date >>>> >>>> If received on physical medium: use the "usual" form of citation for a >>>> collection of individual works (even if Python was the only thing on >>>> it). Probably the only additional information needed would be the >>>> distributor as editor of the collection and the name of the >>>> collection. >>>> >>>> In most cases I can think of, if the implementation is cited, the >>>> Language and Library References should be cited, too. >>>> >>>> Finally, if Python or components were modified for the project, the >>>> modified version should be preserved in a repository and a VCS >>>> identifier provided. This does not imply the repository need be >>>> publicly accessible, of course, although it might be for other reasons >>>> (eg, in a GSoC project,wherever or if hosted for free on GitHub). >>>> >>>> I doubt that "URNs" like DOI and ISBN are applicable, but if available >>>> they should be included in all cases as well. >>>> >>>> Steve >>>> _______________________________________________ >>>> Python-Dev mailing list >>>> Python-Dev at python.org >>>> https://mail.python.org/mailman/listinfo/python-dev >>>> Unsubscribe: >>>> https://mail.python.org/mailman/options/python-dev/wes.turner%40gmail.com >>>> >>> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> https://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: >> https://mail.python.org/mailman/options/python-dev/jackiekazil%40gmail.com >> > > > -- > Jacqueline Kazil | @jackiekazil > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/jeremy%40alum.mit.edu > -------------- next part -------------- An HTML attachment was scrubbed... URL: From aixtools at felt.demon.nl Mon Sep 17 03:39:22 2018 From: aixtools at felt.demon.nl (Michael) Date: Mon, 17 Sep 2018 09:39:22 +0200 Subject: [Python-Dev] debugging test_importlib.test_bad_traverse - script status is SUCCESS - but FAIL is expected. Message-ID: <10c3d2a1-85e3-5747-77a2-8c282565e8ce@felt.demon.nl> I read the discussion related to issue32374. 
That seems to be there to make sure that other events that could cause the test to fail (i.e., the program executing successfully) are caught early and/or ignored, so that the program fails - and the test succeeds.

I am having trouble figuring out why the script below does not fail on AIX, and would appreciate your assistance in debugging what is happening, i.e., getting deeper. Many thanks!

    @unittest.skipIf(not hasattr(sys, 'gettotalrefcount'),
            '--with-pydebug has to be enabled for this test')
    def test_bad_traverse(self):
        ''' Issue #32374: Test that traverse fails when accessing per-module
            state before Py_mod_exec was executed.
            (Multiphase initialization modules only)
        '''
        script = """if True:
                try:
                    from test import support
                    import importlib.util as util
                    spec = util.find_spec('_testmultiphase')
                    spec.name = '_testmultiphase_with_bad_traverse'

                    with support.SuppressCrashReport():
                        m = spec.loader.create_module(spec)
                except:
                    # Prevent Python-level exceptions from
                    # ending the process with non-zero status
                    # (We are testing for a crash in C-code)
                    pass"""
        assert_python_failure("-c", script)

To make sure the full debug info is loaded I added "-X dev", and for your reading added some additional print statements - and for speed I run the command directly. Regardless of how I run it (calling it as a test, or directly) the end result is the same.

# Note: I was not able to find the default "loader.create_module()" code to add debugging statements.
# A pointer for that is welcome!

./python -X dev '-X' 'faulthandler' '-I' '-c' "if True:
                try:
                    from test import support
                    import importlib.util as util
                    spec = util.find_spec('_testmultiphase')
                    spec.name = '_testmultiphase_with_bad_traverse'
                    m = spec.loader.create_module(spec)
                    print(m)
                    print(dir(m))
                    print(m.__doc__)
                    print(m.__loader__)
                    print(m.__name__)
                    print(m.__package__)
                    print(m.__spec__)
                except:
                    # Prevent Python-level exceptions from
                    # ending the process with non-zero status
                    # (We are testing for a crash in C-code)
                    print('in except')"
['__doc__', '__loader__', '__name__', '__package__', '__spec__']
Test module _testmultiphase_with_bad_traverse
None
_testmultiphase_with_bad_traverse
None
None
root at x066:[/data/prj/python/git/Python3-3.8.0]echo $?
0

To get some additional idea of what is happening I added some fprintf statements (see diff below). The additional debug info is:

1. bad_traverse:0
2. bad_traverse:0
1. bad_traverse:0
2. bad_traverse:0
1. bad_traverse:0
2. bad_traverse:0

*** To my SURPRISE *** only one routine with these print statements is ever called. I was expecting more. (Only bad_traverse(...) gets called; I was expecting both bad_traverse_test (Objects/moduleobject.c) and some kind of initialization of m_state->integer.)
Since the macro Py_VISIT includes a return() statement, and my debug statement always prints the second line, I assume Py_VISIT(m_state->integer) is not doing anything (i.e., vret == 0):

/* Utility macro to help write tp_traverse functions.
 * To use this macro, the tp_traverse function must name its arguments
 * "visit" and "arg".  This is intended to keep tp_traverse functions
 * looking as much alike as possible.
 */
#define Py_VISIT(op)                                                    \
    do {                                                                \
        if (op) {                                                       \
            int vret = visit((PyObject *)(op), arg);                    \
            if (vret)                                                   \
                return vret;                                            \
        }                                                               \
    } while (0)

Is this what it should be?

root at x066:[/data/prj/python/git/Python3-3.8.0]git status
On branch aix-pr
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

        modified:   Modules/_testmultiphase.c
        modified:   Objects/moduleobject.c

no changes added to commit (use "git add" and/or "git commit -a")
root at x066:[/data/prj/python/git/Python3-3.8.0]git diff
root at x066:[/data/prj/python/git/Python3-3.8.0]git diff | cat
diff --git a/Modules/_testmultiphase.c b/Modules/_testmultiphase.c
index 5776df7d76..c28aef1455 100644
--- a/Modules/_testmultiphase.c
+++ b/Modules/_testmultiphase.c
@@ -622,23 +622,34 @@ PyInit__testmultiphase_exec_unreported_exception(PyObject *spec)
 static int
 bad_traverse(PyObject *self, visitproc visit, void *arg) {
     testmultiphase_state *m_state;
+FILE *errmsg = fopen("/tmp/err", "a");
     m_state = PyModule_GetState(self);
+fprintf(errmsg,"1. bad_traverse:%ld\n", m_state->integer);
     Py_VISIT(m_state->integer);
+fprintf(errmsg,"2. bad_traverse:%ld\n", m_state->integer);
+fclose(errmsg);
     return 0;
 }

 static int
 execfunc_with_bad_traverse(PyObject *mod) {
     testmultiphase_state *m_state;
+FILE *errmsg;
+errmsg = fopen("/tmp/err", "a");
     m_state = PyModule_GetState(mod);
     if (m_state == NULL) {
+fprintf(errmsg,"0.execfunc:\n");
+fclose(errmsg);
         return -1;
     }
     m_state->integer = PyLong_FromLong(0x7fffffff);
+fprintf(errmsg,"1.execfunc:%ld\n", m_state->integer);
     Py_INCREF(m_state->integer);
+fprintf(errmsg,"2.execfunc:%ld\n", m_state->integer);
+fclose(errmsg);
     return 0;
 }
diff --git a/Objects/moduleobject.c b/Objects/moduleobject.c
index ccf5f8e6d1..603611c686 100644
--- a/Objects/moduleobject.c
+++ b/Objects/moduleobject.c
@@ -27,6 +27,9 @@ static PyMemberDef module_members[] = {
 #ifdef Py_DEBUG
 static int
 bad_traverse_test(PyObject *self, void *arg) {
+FILE *errmsg = fopen("/tmp/err","a");
+fprintf(errmsg,"bad_traverse_test: self!=NULL:%d\n", self != NULL);
+fclose(errmsg);
     assert(self != NULL);
     return 0;
 }
root at x066:[/data/prj/python/git/Python3-3.8.0]
~
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 488 bytes
Desc: OpenPGP digital signature
URL:

From turnbull.stephen.fw at u.tsukuba.ac.jp Mon Sep 17 04:58:28 2018
From: turnbull.stephen.fw at u.tsukuba.ac.jp (Stephen J.
Turnbull) Date: Mon, 17 Sep 2018 17:58:28 +0900 Subject: [Python-Dev] Official citation for Python In-Reply-To: References: <233a848c-c8a6-fda5-1c83-6cbf281b28eb@klix.ch> <23449.57520.710702.426310@turnbull.sk.tsukuba.ac.jp> <23454.5588.561636.164398@turnbull.sk.tsukuba.ac.jp> Message-ID: <23455.27828.567687.265452@turnbull.sk.tsukuba.ac.jp> Jacqueline Kazil writes: > I thought I could take two to three concrete formats and user test > there and report on how community members who would be using the > citation feel. +1 From solipsis at pitrou.net Mon Sep 17 05:25:56 2018 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 17 Sep 2018 11:25:56 +0200 Subject: [Python-Dev] SEC: Spectre variant 2: GCC: -mindirect-branch=thunk -mindirect-branch-register References: Message-ID: <20180917112556.6ca45678@fsol> Hi, Please don't cross-post so heavily. python-dev is sufficient for this. If you want to push this forward, I suggest you measure performance of Python compiled with and without the Spectre mitigation options, and report the results here. That will help vendors and packagers decide whether they want to pursue the route of enabling those options. Note there are plenty of data-driven conditional jumps in Python. It will not be easy to determine which ones are vulnerable to exploiting through speculative execution of a mispredicted branch. The bytecode evaluation loop sounds like a potential attack target, but it's also performance-sensitive. Regards Antoine. On Sun, 16 Sep 2018 20:29:06 -0400 Wes Turner wrote: > Are all current Python builds and C extensions vulnerable to Spectre > variants {1, 2, *}? > > There are now multiple threads: > > "SEC: Spectre variant 2: GCC: -mindirect-branch=thunk > -mindirect-branch-register" > - > https://mail.python.org/mm3/archives/list/distutils-sig at python.org/thread/4BGE226DB5EWIAT5VCJ75QD5ASOVJZCM/ > - https://mail.python.org/pipermail/python-ideas/2018-September/053473.html > - https://mail.python.org/pipermail/python-dev/2018-September/155199.html > > > Original thread (that I forwarded to security@): > "[Python-ideas] Executable space protection: NX bit," > https://mail.python.org/pipermail/python-ideas/2018-September/053175.html > > ~ Do trampolines / nested functions in C extensions switch off the NX bit? > > On Sunday, September 16, 2018, Nathaniel Smith wrote: > > > On Wed, Sep 12, 2018, 12:29 Joni Orponen wrote: > > > >> On Wed, Sep 12, 2018 at 8:48 PM Wes Turner wrote: > >> > >>> Should C extensions that compile all add > >>> `-mindirect-branch=thunk -mindirect-branch-register` [1] to mitigate the > >>> risk of Spectre variant 2 (which does indeed affect user space applications > >>> as well as kernels)? > >>> > >> > >> Are those available on GCC <= 4.2.0 as per PEP 513? > >> > > > > Pretty sure no manylinux1 compiler is ever going to get these mitigations. > > > > For manylinux2010 on x86-64, we can easily use a much newer compiler: RH > > maintains a recent compiler, currently gcc 7.3, or if that doesn't work for > > some reason then the conda folks have be apparently figured out how to > > build the equivalent from gcc upstream releases. > > > > Are there different CFLAGS and/or gcc compatibility flags in conda builds > of Python and C extensions? > > Where are those set in conda builds? > > What's the best way to set CFLAGS in Python builds and C extensions? > > export CFLAGS="-mindirect-branch=thunk -mindirect-branch-register" > ./configure > make > > ? 
>
> Why are we supposed to use an old version of GCC that doesn't have the
> retpoline patches that only mitigate Spectre variant 2?
>
> > Unfortunately, the manylinux2010 infrastructure is not quite ready... I'm
> > pretty sure it needs some volunteers to push it to the finish line, though
> > unfortunately I haven't had enough time to keep track.
>
> "PEP 571 -- The manylinux2010 Platform Tag"
> https://www.python.org/dev/peps/pep-0571/
>
> "Tracking issue for manylinux2010 rollout"
> https://github.com/pypa/manylinux/issues/179
>
> Are all current Python builds and C extensions vulnerable to Spectre
> variants {1, 2, *}?

From aixtools at felt.demon.nl Mon Sep 17 06:50:11 2018
From: aixtools at felt.demon.nl (Michael)
Date: Mon, 17 Sep 2018 12:50:11 +0200
Subject: [Python-Dev] Nearly - all tests PASS for AIX
Message-ID: <908465e6-d28b-a4ec-d687-1ff3826e7e84@felt.demon.nl>

Dear all,

The last two months I have spent nearly all my free time cleaning up "a frustration" - from my side - the long list of failing tests for AIX (there were nearly 20 when I started).

atm - I am stuck on one - test_importlib (mail elsewhere), and the one I just finished (test_httpservers) may be overly simplified (just skipping the trailing-slash tests) - see issue34711 for a discussion. I would be grateful for feedback before I post it as a PR - to avoid working in circles.

I hope you, the developers and development-minded community, consider it a useful contribution.

Currently - with all my proposed patches combined - I have:

393 tests OK.

1 test failed:
    test_importlib

25 tests skipped:
    test_dbm_gnu test_devpoll test_epoll test_gdb test_idle
    test_kqueue test_lzma test_msilib test_ossaudiodev test_readline
    test_spwd test_sqlite test_startfile test_tcl test_tix test_tk
    test_ttk_guionly test_ttk_textonly test_turtle test_unicode_file
    test_unicode_file_functions test_winconsoleio test_winreg
    test_winsound test_zipfile64

1 re-run test:
    test_importlib

Awaiting comments and suggestions. Many thanks for your time.

Michael Felt
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 488 bytes
Desc: OpenPGP digital signature
URL:

From steve.dower at python.org Mon Sep 17 12:44:03 2018
From: steve.dower at python.org (Steve Dower)
Date: Mon, 17 Sep 2018 09:44:03 -0700
Subject: [Python-Dev] SEC: Spectre variant 2: GCC: -mindirect-branch=thunk -mindirect-branch-register
In-Reply-To: References:
Message-ID: <395ca1a4-f997-f533-241c-c3d3edc25f7e@python.org>

I investigated this thoroughly some time ago (when the MSVC flags became available) and determined (with the help of some of the original Spectre/Meltdown investigation team) that there is no significant value in enabling these flags for Python.

It boiled down to:
* Python allows arbitrary code execution by design
* Pure Python code in CPython has very long per-instruction opcode sequences that cannot easily be abused or timed
* Injected pure Python code cannot be coerced into generating native code that is able to abuse Spectre/Meltdown but not able to abuse other attacks more easily
* Code injection itself is outside of this particular threat model

By comparison with JavaScript, most JS JITs can be easily coerced into generating specific native code that can break sandbox guarantees (e.g. browser tabs). Python offers none of these guarantees.
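(As an aside: if you want to see whether a particular build already has any of these options turned on, a rough sketch is to look at the flags the build recorded at configure time - this only works for POSIX builds that expose CFLAGS through sysconfig, and the flag names below are just the ones mentioned in this thread:)

    import sysconfig

    SPECTRE_FLAGS = ("-mindirect-branch=thunk", "-mindirect-branch-register", "/Qspectre")

    def recorded_mitigation_flags():
        """Best-effort check of the configure-time CFLAGS recorded by the build."""
        cflags = sysconfig.get_config_var("CFLAGS") or ""
        return [flag for flag in SPECTRE_FLAGS if flag in cflags]

    print(recorded_mitigation_flags() or "no Spectre mitigation flags recorded")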
Distributors are of course free to enable these flags for their own builds, but I recommend against it for the official binaries, and would suggest that it's worth more PR than actual security and nobody else needs to enable it either. (Extension authors with significant scriptable C code need to perform their own analysis. I'm only talking about CPython here.) Cheers, Steve On 16Sep2018 0707, Wes Turner wrote: > Should Python builds add `-mindirect-branch=thunk > -mindirect-branch-register` to CFLAGS? > > Where would this be to be added in the build scripts with which > architectures? > > /QSpectre is the MSVC build flag for Spectre Variant 1: > > >?The /Qspectre option is available in Visual Studio 2017 version 15.7 > and later. > > https://docs.microsoft.com/en-us/cpp/build/reference/qspectre?view=vs-2017 > > security@ directed me to the issue tracker / lists, > so I'm forwarding this to python-dev and python-ideas, as well. > > # Forwarded message > From: *Wes Turner* > > Date: Wednesday, September 12, 2018 > Subject: SEC: Spectre variant 2: GCC: -mindirect-branch=thunk > -mindirect-branch-register > To: distutils-sig > > > > Should C extensions that compile all add > `-mindirect-branch=thunk -mindirect-branch-register` [1] to mitigate the > risk of Spectre variant 2 (which does indeed affect user space > applications as well as kernels)? > > [1] > https://github.com/speed47/spectre-meltdown-checker/issues/119#issuecomment-361432244 > > [2] https://en.wikipedia.org/wiki/Spectre_(security_vulnerability) > > [3] > https://en.wikipedia.org/wiki/Speculative_Store_Bypass#Speculative_execution_exploit_variants > > > On Wednesday, September 12, 2018, Wes Turner > wrote: > > On Wednesday, September 12, 2018, Joni Orponen > > wrote: > > On Wed, Sep 12, 2018 at 8:48 PM Wes Turner > > wrote: > > Should C extensions that compile all add > `-mindirect-branch=thunk -mindirect-branch-register` [1] > to mitigate the risk of Spectre variant 2 (which does > indeed affect user space applications as well as kernels)? > > > Are those available on GCC <= 4.2.0 as per PEP 513? > > > AFAIU, only > GCC 7.3 and 8 have the retpoline (indirect-branch=thunk) support > enabled by the `-mindirect-branch=thunk > -mindirect-branch-register` CFLAGS. > > > ?On Wednesday, September 12, 2018, Wes Turner > wrote: > > "What is a retpoline and how does it work?" > https://stackoverflow.com/questions/48089426/what-is-a-retpoline-and-how-does-it-work > > > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/steve.dower%40python.org > From python at mrabarnett.plus.com Mon Sep 17 13:11:11 2018 From: python at mrabarnett.plus.com (MRAB) Date: Mon, 17 Sep 2018 18:11:11 +0100 Subject: [Python-Dev] Official citation for Python In-Reply-To: References: <233a848c-c8a6-fda5-1c83-6cbf281b28eb@klix.ch> <23449.57520.710702.426310@turnbull.sk.tsukuba.ac.jp> Message-ID: On 2018-09-17 05:05, Jeremy Hylton wrote: > > I wanted to start with an easy answer that is surely unsatisfying: > http://blog.apastyle.org/apastyle/2015/01/how-to-cite-software-in-apa-style.html > > APA style is pretty popular, and it says that standard software doesn't > need to be specified. Standard software includes "Microsoft Word, Java, > and Adobe Photoshop." So I'd say Python fits well in that category, and > doesn't need to be cited. > > I said you wouldn't be satisfied... 
> It goes on to say """Note: We don?t keep a comprehensive list of what programs are ?standard.? You make the call.""". [snip] From wes.turner at gmail.com Mon Sep 17 14:58:30 2018 From: wes.turner at gmail.com (Wes Turner) Date: Mon, 17 Sep 2018 14:58:30 -0400 Subject: [Python-Dev] SEC: Spectre variant 2: GCC: -mindirect-branch=thunk -mindirect-branch-register In-Reply-To: <395ca1a4-f997-f533-241c-c3d3edc25f7e@python.org> References: <395ca1a4-f997-f533-241c-c3d3edc25f7e@python.org> Message-ID: On Monday, September 17, 2018, Steve Dower wrote: > I investigated this thoroughly some time ago (when the MSVC flags became > available) and determined (with the help of some of the original > Spectre/Meltdown investigation team) that there is no significant value in > enabling these flags for Python. What did you fuzz with? Does that assume that e.g. Fortify has identified all bugs in CPython C? There have been a number of variants that have been disclosed; which did who test for? > > It boiled down to: > * Python allows arbitrary code execution by design Yet binaries built with GCC do have NX? Unless nested functions in C extensions? > * Pure Python code in CPython has very long per-instruction opcode > sequences that cannot easily be abused or timed A demonstration of this would be helpful. * Injected pure Python code cannot be coerced into generating native code > that is able to abuse Spectre/Meltdown but not able to abuse other attacks > more easily So, not impossible. * Code injection itself is outside of this particular threat model [Jupyter] Notebook servers are as wide open to arbitrary code execution as browser JS JITs; often with VMs and/or containers as a 'sandbox' `pip install requirements.txt` installs and executes unsigned code: Python, C extensions What can a container do to contain a speculative execution exploit intending to escape said container? I thought I read that RH has a kernel flag for userspace? > By comparison with JavaScript, most JS JITs can be easily coerced into > generating specific native code that can break sandbox guarantees (e.g. > browser tabs). Python offers none of these guarantees. This is faulty logic. Because Python does not have a JIT sandbox, speculative execution is not a factor for Python? > > Distributors are of course free to enable these flags for their own > builds, but I recommend against it for the official binaries, and would > suggest that it's worth more PR than actual security and nobody else needs > to enable it either. > > (Extension authors with significant scriptable C code need to perform > their own analysis. I'm only talking about CPython here.) Extension installers (and authors) are not likely to perform any such analysis. Extensions are composed of arbitrary C, which certainly can both directly exploit and indirectly enable remote exploitation of Spectre and Meltdown vulnerabilities. Most users of python are installing arbitrary packages (without hashes or signatures). > > Cheers, > Steve > > On 16Sep2018 0707, Wes Turner wrote: > >> Should Python builds add `-mindirect-branch=thunk >> -mindirect-branch-register` to CFLAGS? >> >> Where would this be to be added in the build scripts with which >> architectures? >> >> /QSpectre is the MSVC build flag for Spectre Variant 1: >> >> > The /Qspectre option is available in Visual Studio 2017 version 15.7 >> and later. 
>> >> https://docs.microsoft.com/en-us/cpp/build/reference/qspectr >> e?view=vs-2017 >> >> security@ directed me to the issue tracker / lists, >> so I'm forwarding this to python-dev and python-ideas, as well. >> >> # Forwarded message >> From: *Wes Turner* > >> Date: Wednesday, September 12, 2018 >> Subject: SEC: Spectre variant 2: GCC: -mindirect-branch=thunk >> -mindirect-branch-register >> To: distutils-sig > org>> >> >> >> Should C extensions that compile all add >> `-mindirect-branch=thunk -mindirect-branch-register` [1] to mitigate the >> risk of Spectre variant 2 (which does indeed affect user space applications >> as well as kernels)? >> >> [1] https://github.com/speed47/spectre-meltdown-checker/issues/ >> 119#issuecomment-361432244 > ectre-meltdown-checker/issues/119#issuecomment-361432244> >> [2] https://en.wikipedia.org/wiki/Spectre_(security_vulnerability) < >> https://en.wikipedia.org/wiki/Spectre_%28security_vulnerability%29> >> [3] https://en.wikipedia.org/wiki/Speculative_Store_Bypass#Specu >> lative_execution_exploit_variants > /Speculative_Store_Bypass#Speculative_execution_exploit_variants> >> >> On Wednesday, September 12, 2018, Wes Turner > > wrote: >> >> On Wednesday, September 12, 2018, Joni Orponen >> > wrote: >> >> On Wed, Sep 12, 2018 at 8:48 PM Wes Turner >> > wrote: >> >> Should C extensions that compile all add >> `-mindirect-branch=thunk -mindirect-branch-register` [1] >> to mitigate the risk of Spectre variant 2 (which does >> indeed affect user space applications as well as kernels)? >> >> >> Are those available on GCC <= 4.2.0 as per PEP 513? >> >> >> AFAIU, only >> GCC 7.3 and 8 have the retpoline (indirect-branch=thunk) support >> enabled by the `-mindirect-branch=thunk >> -mindirect-branch-register` CFLAGS. >> >> >> On Wednesday, September 12, 2018, Wes Turner > > wrote: >> >> "What is a retpoline and how does it work?" >> https://stackoverflow.com/questions/48089426/what-is-a-retpo >> line-and-how-does-it-work >> > oline-and-how-does-it-work> >> >> >> >> >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> https://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: https://mail.python.org/mailman/options/python-dev/steve. >> dower%40python.org >> >> > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/wes. > turner%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wes.turner at gmail.com Mon Sep 17 15:41:55 2018 From: wes.turner at gmail.com (Wes Turner) Date: Mon, 17 Sep 2018 15:41:55 -0400 Subject: [Python-Dev] SEC: Spectre variant 2: GCC: -mindirect-branch=thunk -mindirect-branch-register In-Reply-To: References: <395ca1a4-f997-f533-241c-c3d3edc25f7e@python.org> Message-ID: On Mon, Sep 17, 2018 at 2:58 PM Wes Turner wrote: > > I thought I read that RH has a kernel flag for userspace? > "Controlling the Performance Impact of Microcode and Security Patches for CVE-2017-5754 CVE-2017-5715 and CVE-2017-5753 using Red Hat Enterprise Linux Tunables" https://access.redhat.com/articles/3311301 > Indirect Branch Restricted Speculation (ibrs) > [...] When ibrs_enabled is set to 1 (spectre_v2=ibrs) the kernel runs with indirect branch restricted speculation, which protects the kernel space from attacks (even from hyperthreading/simultaneous multi-threading attacks). 
When IBRS is set to 2 (spectre_v2=ibrs_always), both userland and kernel runs with indirect branch restricted speculation. This protects userspace from hyperthreading/simultaneous multi-threading attacks as well, and is also the default on certain old AMD processors (family 10h, 12h and 16h). This feature addresses CVE-2017-5715, variant #2. > [...] > echo 2 > /sys/kernel/debug/x86/ibrs_enabled https://wiki.ubuntu.com/SecurityTeam/KnowledgeBase/SpectreAndMeltdown/MitigationControls > echo 2 > /proc/sys/kernel/ibrs_enabled will turn on IBRS in both userspace and kernel ... On Mon, Sep 17, 2018 at 5:26 AM Antoine Pitrou wrote: > If you want to push this forward, I suggest you measure performance of > Python compiled with and without the Spectre mitigation options, and > report the results here. That will help vendors and packagers decide > whether they want to pursue the route of enabling those options. "Speculative Execution Exploit Performance Impacts - Describing the performance impacts to security patches for CVE-2017-5754 CVE-2017-5753 and CVE-2017-5715" https://access.redhat.com/articles/3307751 - Revised worst-case peformance impact: 4-8% -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve.dower at python.org Mon Sep 17 16:13:16 2018 From: steve.dower at python.org (Steve Dower) Date: Mon, 17 Sep 2018 13:13:16 -0700 Subject: [Python-Dev] SEC: Spectre variant 2: GCC: -mindirect-branch=thunk -mindirect-branch-register In-Reply-To: References: <395ca1a4-f997-f533-241c-c3d3edc25f7e@python.org> Message-ID: <7f61f443-8599-31dc-58ee-609843f86250@python.org> On 17Sep2018 1158, Wes Turner wrote: > On Monday, September 17, 2018, Steve Dower > wrote: > > I investigated this thoroughly some time ago (when the MSVC flags > became available) and determined (with the help of some of the > original Spectre/Meltdown investigation team) that there is no > significant value in enabling these flags for Python. > > What did you fuzz with? > Does that assume that e.g. Fortify has identified all bugs in CPython C? > There have been a number of variants that have been disclosed; which did > who test for? Don't change the subject. > It boiled down to: > * Python allows arbitrary code execution by design > > Yet binaries built with GCC do have NX? Unless nested functions in C > extensions? I don't know anything about GCC settings. Binaries for Windows have been built with this option for over a decade. It's unrelated to Spectre/Meltdown. > * Pure Python code in CPython has very long per-instruction opcode > sequences that cannot easily be abused or timed > > A demonstration of this would be helpful. That's not how proof-of-concepts work. You can't assume that the lack of a demonstration proves it is possible - at best you have to assume that it proves it is *not* possible, but really it just proves that nobody has a demonstration yet. What I could demonstrate (again) if I thought it would be worthwhile is that the changes enabled by the flag do not affect the normal interpreter loop, and do not affect any code that can be called fast enough to potentially leak information. Feel free to go ahead and build with/without the flags and compare the disassembly (and if you do this and find that compilers are detecting new cases since I looked, *that* would be very helpful to share directly with the security team). 
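(If anyone does want to run that comparison, the timing half of it is only a few lines - a rough sketch, where the two interpreter paths are placeholders for your own with/without-flags builds rather than anything that exists:)

    import subprocess

    # Placeholder paths - point these at a build compiled with the flags and one without.
    BUILDS = {
        "baseline": "/opt/python-baseline/bin/python3",
        "retpoline": "/opt/python-retpoline/bin/python3",
    }
    # A bytecode-heavy statement so the eval loop dominates the measurement.
    STMT = "sum(i * i for i in range(10000))"

    for label, interpreter in BUILDS.items():
        result = subprocess.run([interpreter, "-m", "timeit", STMT],
                                stdout=subprocess.PIPE, universal_newlines=True)
        print(label, result.stdout.strip())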
> * Injected pure Python code cannot be coerced into generating native > code that is able to abuse Spectre/Meltdown but not able to abuse > other attacks more easily > > ?So, not impossible. Of course it's not impossible. But why would you > * Code injection itself is outside of this particular threat model > > [Jupyter] Notebook servers are as wide open to arbitrary code execution > as browser JS JITs; often with VMs and/or containers as a 'sandbox' > `pip install requirements.txt` installs and executes unsigned code: > Python, C extensions > > What can a container do to contain a speculative execution exploit > intending to escape said container? Python's threat model does not treat the Python process as a sandbox. To say it another way, if you assume the Python process is a sandbox, you're on your own. Arbitrary code, Python or otherwise, can totally escape the process, and then it's up to the OS to protect against escaping the machine. We do what we can to reduce unnecessary arbitrary code, but unless you've properly protected your environment then you have a lot more to worry about besides speculative execution vulnerabilities. > By comparison with JavaScript, most JS JITs can be easily coerced > into generating specific native code that can break sandbox > guarantees (e.g. browser tabs). Python offers none of these guarantees. > > > This is faulty logic. Because Python does not have a JIT sandbox, > speculative execution is not a factor for Python? Because Python does not have a (native) JIT at all, speculative execution relies on identifying vulnerable and reusable code patterns within the C code and being able to invoke those directly. Because pure Python code does not allow this (without relying on other bugs), there is no way to do this within the threat model we use. Once you allow arbitrary or unvalidated native code, you are outside the threat model and hence on your own. And if you find a bug that lets pure Python code move the instruction pointer to arbitrary native code, that should be reported to the security team. > Distributors are of course free to enable these flags for their own > builds, but I recommend against it for the official binaries, and > would suggest that it's worth more PR than actual security and > nobody else needs to enable it either. > > (Extension authors with significant scriptable C code need to > perform their own analysis. I'm only talking about CPython here.) > > > Extension installers (and authors) are not likely to perform any such > analysis. Then it is their fault if they are compromised. Open source software relies on users validating the software themselves, as there is no legal recourse against developers who do not do it. > Extensions are composed of arbitrary C, which certainly can both > directly exploit and indirectly enable remote exploitation of Spectre > and Meltdown vulnerabilities. If arbitrary C is running, we can't help you anymore. > Most users of python are installing arbitrary packages (without hashes > or signatures). If they are concerned about Spectre/Meltdown, they should stop doing this. They should also stop if they are concerned about 1000 other issues that are much more likely than Spectre/Meltdown. 
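For extension authors who, after doing that analysis, still decide to opt in, the GCC flags from the start of this thread can be passed per-extension through setuptools. A minimal sketch, assuming GCC 7.3 or 8 and a hypothetical extension named "fastthing":

from setuptools import setup, Extension

# Retpoline flags discussed in this thread; only understood by GCC 7.3+ / 8.
SPECTRE_V2_CFLAGS = ["-mindirect-branch=thunk", "-mindirect-branch-register"]

setup(
    name="fastthing",
    ext_modules=[
        Extension(
            "fastthing._native",
            sources=["src/fastthing.c"],
            extra_compile_args=SPECTRE_V2_CFLAGS,
        ),
    ],
)

Whether that protection is worth its runtime cost is the trade-off Antoine asked to be measured earlier in the thread.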
Cheers, Steve From njs at pobox.com Mon Sep 17 16:13:37 2018 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 17 Sep 2018 13:13:37 -0700 Subject: [Python-Dev] SEC: Spectre variant 2: GCC: -mindirect-branch=thunk -mindirect-branch-register In-Reply-To: References: <395ca1a4-f997-f533-241c-c3d3edc25f7e@python.org> Message-ID: Hi Wes, It's great you're passionate about python security, but this is the wrong way to go about it. Spectre is inherently super subtle and confusing, so if there's something that people need to do, then we need a clear, comprehensive write-up of what the threat is and how to address it. Perhaps you could find some collaborators with expertise in these things and work with them off-list to put something like that together ? that could be quite helpful. What isn't helpful is what you've been doing instead: sending incoherent jumbles of vaguely-related text, to multiple highly-subscribed mailing lists, multiple times a day, for a week now. This is worse than useless. Please stop. -n On Mon, Sep 17, 2018, 12:44 Wes Turner wrote: > > > On Mon, Sep 17, 2018 at 2:58 PM Wes Turner wrote: > >> >> I thought I read that RH has a kernel flag for userspace? >> > > "Controlling the Performance Impact of Microcode and Security Patches for > CVE-2017-5754 CVE-2017-5715 and CVE-2017-5753 using Red Hat Enterprise > Linux Tunables" > https://access.redhat.com/articles/3311301 > > > Indirect Branch Restricted Speculation (ibrs) > > [...] When ibrs_enabled is set to 1 (spectre_v2=ibrs) the kernel runs > with indirect branch restricted speculation, which protects the kernel > space from attacks (even from hyperthreading/simultaneous multi-threading > attacks). When IBRS is set to 2 (spectre_v2=ibrs_always), both userland and > kernel runs with indirect branch restricted speculation. This protects > userspace from hyperthreading/simultaneous multi-threading attacks as well, > and is also the default on certain old AMD processors (family 10h, 12h and > 16h). This feature addresses CVE-2017-5715, variant #2. > > [...] > > echo 2 > /sys/kernel/debug/x86/ibrs_enabled > > > https://wiki.ubuntu.com/SecurityTeam/KnowledgeBase/SpectreAndMeltdown/MitigationControls > > echo 2 > /proc/sys/kernel/ibrs_enabled will turn on IBRS in both > userspace and kernel > > ... > On Mon, Sep 17, 2018 at 5:26 AM Antoine Pitrou > wrote: > >> If you want to push this forward, I suggest you measure performance of >> Python compiled with and without the Spectre mitigation options, and >> report the results here. That will help vendors and packagers decide >> whether they want to pursue the route of enabling those options. > > > "Speculative Execution Exploit Performance Impacts - Describing the > performance impacts to security patches for CVE-2017-5754 CVE-2017-5753 and > CVE-2017-5715" > https://access.redhat.com/articles/3307751 > > - Revised worst-case peformance impact: 4-8% > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/njs%40pobox.com > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From wes.turner at gmail.com Mon Sep 17 18:08:24 2018 From: wes.turner at gmail.com (Wes Turner) Date: Mon, 17 Sep 2018 18:08:24 -0400 Subject: [Python-Dev] SEC: Spectre variant 2: GCC: -mindirect-branch=thunk -mindirect-branch-register In-Reply-To: <7f61f443-8599-31dc-58ee-609843f86250@python.org> References: <395ca1a4-f997-f533-241c-c3d3edc25f7e@python.org> <7f61f443-8599-31dc-58ee-609843f86250@python.org> Message-ID: To summarize: - CPython may be vulnerable to speculative execution vulnerabilities, but none are known. - In general, CPython is currently too slow for speculative execution exploitation to be practical. - Sandboxed, JIT'ed JS is not too slow for speculative execution exploitation to be practical - (Not otherwise discussed here: PyPy's sandboxed JIT may not be too slow for speculative execution exploitation to be practical.) - C extensions may be vulnerable to speculative execution vulnerabilities; but that's the authors' and users' problem (and so it's appropriate to mention this to extension owners following distutils-sig/PyPA) - C extensions can set indirect branch CFLAGS only with GCC 7.3 and GCC 8+; which are only usable with conda and the forthcoming manylinux2010 spec - Linux kernels with [IBRS, STIBP, IBPB] can enable userspace protection - The revised worst-case performance impacts for variant 2 mitigations are 4-8% - MSVC has a /Qspectre flag for variant 1 - Because there is no exploit provided (or currently thought possible with just CPython), this security-related dialogue is regarded as a nuisance. - There is no published or official statement or investigation from the Python community regarding Spectre (or Meltdown) vulnerabilities. Here's a good write-up: Safety_instructions_for_Meltdown_and_Spectre Have a good day! On Monday, September 17, 2018, Steve Dower wrote: > On 17Sep2018 1158, Wes Turner wrote: > >> On Monday, September 17, 2018, Steve Dower > > wrote: >> >> I investigated this thoroughly some time ago (when the MSVC flags >> became available) and determined (with the help of some of the >> original Spectre/Meltdown investigation team) that there is no >> significant value in enabling these flags for Python. >> >> What did you fuzz with? >> Does that assume that e.g. Fortify has identified all bugs in CPython C? >> There have been a number of variants that have been disclosed; which did >> who test for? >> > > Don't change the subject. > > It boiled down to: >> * Python allows arbitrary code execution by design >> >> Yet binaries built with GCC do have NX? Unless nested functions in C >> extensions? >> > > I don't know anything about GCC settings. Binaries for Windows have been > built with this option for over a decade. It's unrelated to > Spectre/Meltdown. > > * Pure Python code in CPython has very long per-instruction opcode >> sequences that cannot easily be abused or timed >> >> A demonstration of this would be helpful. >> > > That's not how proof-of-concepts work. You can't assume that the lack of a > demonstration proves it is possible - at best you have to assume that it > proves it is *not* possible, but really it just proves that nobody has a > demonstration yet. > > What I could demonstrate (again) if I thought it would be worthwhile is > that the changes enabled by the flag do not affect the normal interpreter > loop, and do not affect any code that can be called fast enough to > potentially leak information. 
Feel free to go ahead and build with/without > the flags and compare the disassembly (and if you do this and find that > compilers are detecting new cases since I looked, *that* would be very > helpful to share directly with the security team). > > * Injected pure Python code cannot be coerced into generating native >> code that is able to abuse Spectre/Meltdown but not able to abuse >> other attacks more easily >> >> So, not impossible. >> > > Of course it's not impossible. But why would you > > * Code injection itself is outside of this particular threat model >> >> [Jupyter] Notebook servers are as wide open to arbitrary code execution >> as browser JS JITs; often with VMs and/or containers as a 'sandbox' >> > > `pip install requirements.txt` installs and executes unsigned code: >> Python, C extensions >> >> What can a container do to contain a speculative execution exploit >> intending to escape said container? >> > > Python's threat model does not treat the Python process as a sandbox. To > say it another way, if you assume the Python process is a sandbox, you're > on your own. > > Arbitrary code, Python or otherwise, can totally escape the process, and > then it's up to the OS to protect against escaping the machine. We do what > we can to reduce unnecessary arbitrary code, but unless you've properly > protected your environment then you have a lot more to worry about besides > speculative execution vulnerabilities. > > By comparison with JavaScript, most JS JITs can be easily coerced >> into generating specific native code that can break sandbox >> guarantees (e.g. browser tabs). Python offers none of these >> guarantees. >> >> >> This is faulty logic. Because Python does not have a JIT sandbox, >> speculative execution is not a factor for Python? >> > > Because Python does not have a (native) JIT at all, speculative execution > relies on identifying vulnerable and reusable code patterns within the C > code and being able to invoke those directly. Because pure Python code does > not allow this (without relying on other bugs), there is no way to do this > within the threat model we use. > > Once you allow arbitrary or unvalidated native code, you are outside the > threat model and hence on your own. And if you find a bug that lets pure > Python code move the instruction pointer to arbitrary native code, that > should be reported to the security team. > > Distributors are of course free to enable these flags for their own >> builds, but I recommend against it for the official binaries, and >> would suggest that it's worth more PR than actual security and >> nobody else needs to enable it either. >> >> (Extension authors with significant scriptable C code need to >> perform their own analysis. I'm only talking about CPython here.) >> >> >> Extension installers (and authors) are not likely to perform any such >> analysis. >> > > Then it is their fault if they are compromised. Open source software > relies on users validating the software themselves, as there is no legal > recourse against developers who do not do it. > > Extensions are composed of arbitrary C, which certainly can both directly >> exploit and indirectly enable remote exploitation of Spectre and Meltdown >> vulnerabilities. >> > > If arbitrary C is running, we can't help you anymore. > > Most users of python are installing arbitrary packages (without hashes or >> signatures). >> > > If they are concerned about Spectre/Meltdown, they should stop doing this. 
> They should also stop if they are concerned about 1000 other issues that > are much more likely than Spectre/Meltdown. > > Cheers, > Steve > -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Mon Sep 17 19:37:37 2018 From: guido at python.org (Guido van Rossum) Date: Mon, 17 Sep 2018 16:37:37 -0700 Subject: [Python-Dev] Request for review: binary op dispatch rules for subclasses In-Reply-To: References: Message-ID: FWIW I wrote up what I recall about the issue and kicked it to the next BDFL: https://bugs.python.org/issue30140#msg325553 On Fri, Sep 14, 2018 at 6:42 PM Stephan Hoyer wrote: > Over a year ago, I made a pull request ( > https://github.com/python/cpython/pull/1325) to fix a long-standing issue > with how Python handles dispatch for arithmetic binary operations involving > subclasses (https://bugs.python.org/issue30140). > > I pinged the bug several times, but I'm still waiting for a review, which > would be greatly appreciated! > > Best, > Stephan > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From carl.shapiro at gmail.com Mon Sep 17 20:23:26 2018 From: carl.shapiro at gmail.com (Carl Shapiro) Date: Mon, 17 Sep 2018 17:23:26 -0700 Subject: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement? In-Reply-To: <20180916222441.13600498@fsol> References: <20180916222441.13600498@fsol> Message-ID: On Sun, Sep 16, 2018 at 1:24 PM, Antoine Pitrou wrote: > I think it's of limited interest if it only helps with modules used > during the startup sequence, not arbitrary stdlib or third-party > modules. > This should help any use-case that is already using the freeze module already bundled with CPython. Third-party code, like py2exe, py2app, pyinstaller, and XAR could build upon this to create applications that start faster. > To give an idea, on my machine the baseline Python startup is about 20ms > (`time python -c pass`), but if I import Numpy it grows to 100ms, and > with Pandas it's more than 200ms. Saving 4ms on the baseline startup > would make no practical difference for concrete usage. > Do you have a feeling for how many of those milliseconds are spend loading bytecode from disk? If so standalone executables that contain numpy and pandas (and mercurial) would start faster > I'm ready to think there are other use cases where it matters, though. > I think so. I hope you will, too :-) -------------- next part -------------- An HTML attachment was scrubbed... URL: From avery.richards at gmail.com Mon Sep 17 21:12:33 2018 From: avery.richards at gmail.com (Avery Richards) Date: Mon, 17 Sep 2018 18:12:33 -0700 Subject: [Python-Dev] [help] where to learn how to upgrade from 2.7 to 3 Message-ID: I am having so much fun learning python! I did not install the best version into my mac at first. Now I can't find out how to upgrade, (pip is awesome but not as conversational as I need it to be on the subject). I've downloaded the packages from python.org, installed all sorts of stuff, I configured my text editor to recognize python3, resolving formatting strings output, but now as I progress the [end = ' '] is not recognized. 
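The end=' ' form is print-function syntax that only exists on Python 3; under Python 2.7 the same line is a SyntaxError, so "not recognized" usually means the script is still being run by the old interpreter rather than the newly installed one. A minimal check, run explicitly as "python3 whichpython.py" (file name made up):

import sys

print(sys.version)                       # shows which interpreter actually ran this file
print("no newline after this", end=' ')  # keyword form accepted on Python 3 only
print("so this text continues the same line")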
I have figured out a lot on my own, can you help me upgrade to 3.6 once and for all? Again I consulted with pip and followed faq websites (maybe a mistake there, idk). please please thank you! ~Avery -------------- next part -------------- An HTML attachment was scrubbed... URL: From vstinner at redhat.com Mon Sep 17 21:20:27 2018 From: vstinner at redhat.com (Victor Stinner) Date: Tue, 18 Sep 2018 03:20:27 +0200 Subject: [Python-Dev] Late Python 3.7.1 changes to fix the C locale coercion (PEP 538) implementation Message-ID: Hi Unicode and locales lovers, tl; dr Nick, Ned, INADA-san: I modified 3.7.1 to add a new "-X coerce_c_locale=value" option and make sure that the C locale coercion cannot be when Python in embedded: are you ok with these changes? Before 3.7.0 release, during the implementation of the UTF-8 Mode (PEP 540), I changed two things in Nick Coghlan's implementation of the C locale coercion (PEP 538): (1) PYTHONCOERCECLOCALE environment variable is now ignored when -E or -I command line option is used. (2) When Python is embeded, the C locale coercion is now enabled if the LC_CTYPE locale is "C". Nick asked me to change the behavior: https://bugs.python.org/issue34589 I just pushed this change in the 3.7 branch which adds a new "-X coerce_c_locale=value" option: https://github.com/python/cpython/commit/144f1e2c6f4a24bd288c045986842c65cc289684 Examples using Pyhon 3.7 (future 3.7.1) with UTF-8 Mode disabled, to only test the C locale coercion: --- $ cat test.py import codecs, locale enc = locale.getpreferredencoding() enc = codecs.lookup(enc).name print(enc) $ export LC_ALL= LC_CTYPE=C LANG= # Disable C locale coercion: get ASCII as expected $ PYTHONCOERCECLOCALE=0 ./python -X utf8=0 test.py ascii # -E ignores PYTHONCOERCECLOCALE=0: # C locale is coerced, we get UTF-8 $ PYTHONCOERCECLOCALE=0 ./python -E -X utf8=0 test.py utf-8 # -X coerce_c_locale=0 is not affected by -E: # C locale coercion disabled as expected, get ASCII as expected $ ./python -E -X utf8=0 -X coerce_c_locale=0 test.py ascii --- For (1), Nick's use case is to get Python 3.6 behavior (C locale not coerced) on Python 3.7 using PYTHONCOERCECLOCALE. Nick proposed to use PYTHONCOERCECLOCALE even with -E or -I, but I dislike introducing a special case for -E option. I chose to add a new "-X coerce_c_locale=0" to Python 3.7.1 to provide a solution for this use case. (Python 3.7.0 and older ignore this option.) Note: Python 3.7.0 is fine with PYTHONCOERCECLOCALE=0, we are only talking about the special case of -E and -I options. For (2), I modified Python 3.7.1 to make sure the C locale is never coerced when the C API is used to embed Python inside an application: Py_Initialize() and Py_Main(). The C locale can only be coerced by the official Python program ("python3.7"). I don't know if it should be possible to enable C locale coercion when Python is embedded. So I just made the change requested by Nick :-) I dislike doing such late changes in 3.7.1, especially since PEP 538 has been designed by Nick Coghlan, and we disagree on the fix. But Ned Deily, our Python 3.7 release manager, wants to see last 3.7 fixes merged before Tuesday, so here we are. Nick, Ned, INADA-san: are you ok with these changes? The other choices for 3.7.1 are: * Revert my change: C locale coercion can still be enabled when Python is embedded, -E option ignores PYTHONCOERCECLOCALE env var. 
* Revert my change and apply Nick's PR 9257: C locale coercion cannot be enabled when Python is embedded and -E option doesn't ignore PYTHONCOERCECLOCALE env var. I spent months to fix the master branch to support all possible locales and encodings, and get a consistent CLI: https://vstinner.github.io/python3-locales-encodings.html So I'm not excited by Nick's PR which IMHO moves Python backward, especially it breaks the -E option contract: it doesn't ignore PYTHONCOERCECLOCALE env var. Victor From nad at python.org Mon Sep 17 21:42:41 2018 From: nad at python.org (Ned Deily) Date: Mon, 17 Sep 2018 21:42:41 -0400 Subject: [Python-Dev] Late Python 3.7.1 changes to fix the C locale coercion (PEP 538) implementation In-Reply-To: References: Message-ID: <942002A1-4319-4370-BC07-A628BB1B8FC1@python.org> On Sep 17, 2018, at 21:20, Victor Stinner wrote: > tl; dr Nick, Ned, INADA-san: I modified 3.7.1 to add a new "-X > coerce_c_locale=value" option and make sure that the C locale coercion > cannot be when Python in embedded: are you ok with these changes? > > > Before 3.7.0 release, during the implementation of the UTF-8 Mode (PEP > 540), I changed two things in Nick Coghlan's implementation of the C > locale coercion (PEP 538): > > (1) PYTHONCOERCECLOCALE environment variable is now ignored when -E or > -I command line option is used. > > (2) When Python is embeded, the C locale coercion is now enabled if > the LC_CTYPE locale is "C". > > Nick asked me to change the behavior: > https://bugs.python.org/issue34589 > > I just pushed this change in the 3.7 branch which adds a new "-X > coerce_c_locale=value" option: > https://github.com/python/cpython/commit/144f1e2c6f4a24bd288c045986842c65cc289684 > > Examples using Pyhon 3.7 (future 3.7.1) with UTF-8 Mode disabled, to > only test the C locale coercion: > --- > $ cat test.py > import codecs, locale > enc = locale.getpreferredencoding() > enc = codecs.lookup(enc).name > print(enc) > > $ export LC_ALL= LC_CTYPE=C LANG= > > # Disable C locale coercion: get ASCII as expected > $ PYTHONCOERCECLOCALE=0 ./python -X utf8=0 test.py > ascii > > # -E ignores PYTHONCOERCECLOCALE=0: > # C locale is coerced, we get UTF-8 > $ PYTHONCOERCECLOCALE=0 ./python -E -X utf8=0 test.py > utf-8 > > # -X coerce_c_locale=0 is not affected by -E: > # C locale coercion disabled as expected, get ASCII as expected > $ ./python -E -X utf8=0 -X coerce_c_locale=0 test.py > ascii > --- > > > For (1), Nick's use case is to get Python 3.6 behavior (C locale not > coerced) on Python 3.7 using PYTHONCOERCECLOCALE. Nick proposed to use > PYTHONCOERCECLOCALE even with -E or -I, but I dislike introducing a > special case for -E option. > > I chose to add a new "-X coerce_c_locale=0" to Python 3.7.1 to provide > a solution for this use case. (Python 3.7.0 and older ignore this > option.) > > Note: Python 3.7.0 is fine with PYTHONCOERCECLOCALE=0, we are only > talking about the special case of -E and -I options. > > > For (2), I modified Python 3.7.1 to make sure the C locale is never > coerced when the C API is used to embed Python inside an application: > Py_Initialize() and Py_Main(). The C locale can only be coerced by the > official Python program ("python3.7"). > > I don't know if it should be possible to enable C locale coercion when > Python is embedded. So I just made the change requested by Nick :-) > > > I dislike doing such late changes in 3.7.1, especially since PEP 538 > has been designed by Nick Coghlan, and we disagree on the fix. 
But Ned > Deily, our Python 3.7 release manager, wants to see last 3.7 fixes > merged before Tuesday, so here we are. Just because the 3.7.1rc is scheduled doesn't mean we should throw something in, particularly if it's not fully reviewed and fully agreed upon. If it's important enough, we could delay the rc a few days ... or decide to wait for 3.7.2. > Nick, Ned, INADA-san: are you ok with these changes? > The other choices for 3.7.1 are: > > * Revert my change: C locale coercion can still be enabled when Python > is embedded, -E option ignores PYTHONCOERCECLOCALE env var. > > * Revert my change and apply Nick's PR 9257: C locale coercion cannot > be enabled when Python is embedded and -E option doesn't ignore > PYTHONCOERCECLOCALE env var. > > > I spent months to fix the master branch to support all possible > locales and encodings, and get a consistent CLI: > https://vstinner.github.io/python3-locales-encodings.html > > So I'm not excited by Nick's PR which IMHO moves Python backward, > especially it breaks the -E option contract: it doesn't ignore > PYTHONCOERCECLOCALE env var. I would like to see Nick review the merged 3.7 PR and have both him and you agree that this is the thing to do for 3.7.1. I also want to make sure we understand what affect this will have on 3.7.0 users. Let's not potentially make things worse. I'm not planning to tag 3.7.1rc for at least another 18 hours. I'm marking bpo-34589 as "release blocker" and I will not proceed until this is resolved. Thanks! --Ned -- Ned Deily nad at python.org -- [] From rymg19 at gmail.com Mon Sep 17 23:26:34 2018 From: rymg19 at gmail.com (Ryan Gonzalez) Date: Mon, 17 Sep 2018 22:26:34 -0500 Subject: [Python-Dev] [help] where to learn how to upgrade from 2.7 to 3 In-Reply-To: References: Message-ID: Python-dev is for development *of* Python, not *in* Python! You want python-list instead. Also, make sure you include some full example code where the error occurs and what exactly is failing. Right now, it's hard for me to tell what exactly is going on... On Mon, Sep 17, 2018, 8:21 PM Avery Richards wrote: > I am having so much fun learning python! I did not install the best > version into my mac at first. Now I can't find out how to upgrade, (pip is > awesome but not as conversational as I need it to be on the subject). I've > downloaded the packages from python.org, installed all sorts of stuff, I > configured my text editor to recognize python3, resolving formatting > strings output, but now as I progress the > > [end = ' '] > > is not recognized. I have figured out a lot on my own, can you help me > upgrade to 3.6 once and for all? Again I consulted with pip and followed > faq websites (maybe a mistake there, idk). > > please please thank you! > > ~Avery > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/rymg19%40gmail.com > -- Ryan (????) Yoko Shimomura, ryo (supercell/EGOIST), Hiroyuki Sawano >> everyone else https://refi64.com/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From leewangzhong+python at gmail.com Tue Sep 18 00:48:39 2018 From: leewangzhong+python at gmail.com (Franklin? 
Lee) Date: Tue, 18 Sep 2018 00:48:39 -0400 Subject: [Python-Dev] SEC: Spectre variant 2: GCC: -mindirect-branch=thunk -mindirect-branch-register In-Reply-To: References: <395ca1a4-f997-f533-241c-c3d3edc25f7e@python.org> <7f61f443-8599-31dc-58ee-609843f86250@python.org> Message-ID: I believe this is the article Wes wanted to link to: https://www.thomas-krenn.com/en/wiki/Safety_instructions_for_Meltdown_and_Spectre On Mon, Sep 17, 2018 at 6:10 PM Wes Turner wrote: > > To summarize: > > - CPython may be vulnerable to speculative execution vulnerabilities, but none are known. > - In general, CPython is currently too slow for speculative execution exploitation to be practical. > - Sandboxed, JIT'ed JS is not too slow for speculative execution exploitation to be practical > - (Not otherwise discussed here: PyPy's sandboxed JIT may not be too slow for speculative execution exploitation to be practical.) I'm no security researcher, and I barely remember much about Spectre/Meltdown, but I think the idea is that, if Python takes about 2 milliseconds to run your code, then a difference of +- 10 microseconds is indistinguishable from noise. Try to write software that can learn to distinguish two similar computers using the running time of certain functions. Javascript can be crafted to get close enough to some C programs. ASM.js and WebAssembly might help. PyPy's need for mitigation is independent of CPython's. > - Because there is no exploit provided (or currently thought possible with just CPython), this security-related dialogue is regarded as a nuisance. More than that. Steve says that he looked into it and decided there wasn't really anything to worry about. He believes that any exploit of it will also imply an easier exploit is possible. He also says that this particular fix won't really help. Nathaniel is annoyed because Spectre is tricky to understand, and he assumes you don't understand it as well as you think because you haven't shown him that you have the expertise to understand it. > Here's a good write-up: > Safety_instructions_for_Meltdown_and_Spectre But how does that apply to CPython? What specifically about CPython makes the interpreter vulnerable to the attack? Under what conditions would CPython be vulnerable to this attack, but not an easier attack of at least the same severity? The article I linked at the top of this email does not give advice for interpreter writers at all. It only says what end users and chip manufacturers ought to do. It is not relevant. From songofacandy at gmail.com Tue Sep 18 02:38:04 2018 From: songofacandy at gmail.com (INADA Naoki) Date: Tue, 18 Sep 2018 15:38:04 +0900 Subject: [Python-Dev] SEC: Spectre variant 2: GCC: -mindirect-branch=thunk -mindirect-branch-register In-Reply-To: References: <395ca1a4-f997-f533-241c-c3d3edc25f7e@python.org> <7f61f443-8599-31dc-58ee-609843f86250@python.org> Message-ID: On Tue, Sep 18, 2018 at 7:08 AM Wes Turner wrote: > > To summarize: > > - CPython may be vulnerable to speculative execution vulnerabilities, but none are known. > - In general, CPython is currently too slow for speculative execution exploitation to be practical. > - Sandboxed, JIT'ed JS is not too slow for speculative execution exploitation to be practical > - (Not otherwise discussed here: PyPy's sandboxed JIT may not be too slow for speculative execution exploitation to be practical.) > As far as I know, execution speed is important for attacker, not victim. In case of JavaScript, browser may load attacking code and run it while user watching websites. 
Browsers provides sandbox for JS, but attacker code may be able to bypass the sandbox by Spectre or Meltdown. So browsers disabled high precision timer until OSes are updated. This topic is totally unrelated to compiler options: these compiler options doesn't prohibit running attacking code, it just guard branches from branch target injection. Does my understanding collect? Why should we discuss about execution speed? I think this topic should split to two topics: (1) Guard Python process from Spectre/Meltdown attack from other process, (2) Prohibit Python code attack other processes by using Spectre/Meltdown. Regards, -- INADA Naoki From solipsis at pitrou.net Tue Sep 18 04:31:42 2018 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 18 Sep 2018 10:31:42 +0200 Subject: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement? In-Reply-To: References: <20180916222441.13600498@fsol> Message-ID: <20180918103142.60cd2ec5@fsol> On Mon, 17 Sep 2018 17:23:26 -0700 Carl Shapiro wrote: > > > To give an idea, on my machine the baseline Python startup is about 20ms > > (`time python -c pass`), but if I import Numpy it grows to 100ms, and > > with Pandas it's more than 200ms. Saving 4ms on the baseline startup > > would make no practical difference for concrete usage. > > > > Do you have a feeling for how many of those milliseconds are spend loading > bytecode from disk? No idea. In my previous experiments with module import speed, I concluded that executing module bytecode generally was the dominating contributor, but that doesn't mean loading bytecode is costless. Regards Antoine. From aixtools at felt.demon.nl Tue Sep 18 05:48:53 2018 From: aixtools at felt.demon.nl (Michael) Date: Tue, 18 Sep 2018 11:48:53 +0200 Subject: [Python-Dev] debugging test_importlib.test_bad_traverse - script status is SUCCESS - but FAIL is expected. In-Reply-To: <10c3d2a1-85e3-5747-77a2-8c282565e8ce@felt.demon.nl> References: <10c3d2a1-85e3-5747-77a2-8c282565e8ce@felt.demon.nl> Message-ID: On 17/09/2018 09:39, Michael wrote: > I read the discussion related to issue32374. That seems to be sure that > other events that could > cause the test to fail (i.e., the program executes successfully) are > caught early, and/or ignored > so that the program fails - and the test succeeds. After reading below, I would appreciate knowing whether to ask that issue32374 be reopened and the test adjusted so that the test is "SkipIf" AIX? Or, something else? I'll work on something else, but I do not want to guess the current intent of this test module. +++++++ In: Modules/_testmultiphase.c - found where AIX and Linux differ in their response to accessing a NULL pointer, in this case m_state->integer ? +624? static int ? +625? bad_traverse(PyObject *self, visitproc visit, void *arg) { ? +626????? testmultiphase_state *m_state; ? +627???? FILE *err = fopen("/tmp/err","a"); ? +628 ? +629????? m_state = PyModule_GetState(self); ? +630 ? +631? fprintf(err,"%s:%d\n", __FILE__,__LINE__); fflush(err); ? +632? fprintf(err, "m_state:08%lx &m_state->integer:%08lx\n", ? +633????????? m_state, &(m_state->integer)); ? +634? fclose(err); ? +635????? Py_VISIT(m_state->integer); ? +636? /* ? +637? #define Py_VISIT(op) ? +638????? do { ? +639????????? if (m_state->integer) { ? +640????????????? int vret = visit((PyObject *)(m_state->integer), arg); ? +641????????????? if (vret) { ? +642????????????????? return vret; ? +643????????????? } ? +644????????? } ? +645????? } while (0); ? +646? */ ? +647????? return 0; ? +648? 
}

The "m_state" and m_state->integer values are identical, but the response
is not.

root@x066:[/data/prj/python/git]uname
AIX
/data/prj/python/git/python3-3.8/Modules/_testmultiphase.c:631
m_state:080 &m_state->integer:00000000

root@x074:/data/prj/python/git# uname
Linux
/data/prj/python/git/Python3-3.8.0/Modules/_testmultiphase.c:631
m_state:080 &m_state->integer:00000000

++++++ Test program to demonstrate +++++++
AIX does not segmentfault on access of a NULL pointer
++++++++++++++++++++++++++++++++++++++++++
root@x074:/data/prj/python/git# cat nullpr.c
#include <stdio.h>
main()
{
        int *vpt = NULL;
        fprintf(stdout, "vpt = %08lx\n", vpt);
        if (*vpt)
                fprintf(stdout,"True\n");
        else
                fprintf(stdout,"False\n");
}
root@x074:/data/prj/python/git# rm -f nullpr; make nullpr; ./nullpr
make: Warning: File 'nullpr.c' has modification time 387 s in the future
cc     nullpr.c   -o nullpr
nullpr.c:2:1: warning: return type defaults to 'int' [-Wimplicit-int]
 main()
 ^
make: warning:  Clock skew detected.  Your build may be incomplete.
vpt = 00000000
Segmentation fault

++++++++++ AIX does not 'Segmentation fault' +++++++++++++
root@x066:[/data/prj/python/git]rm -r nullpr; make nullpr; ./nullpr
cc     nullpr.c   -o nullpr
vpt = 00000000
False
-------------- next part --------------
A non-text attachment was scrubbed...
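If the answer turns out to be "skip this test on AIX", the adjustment being asked about would look roughly like the sketch below (a hypothetical test class, not the actual test_importlib code):

import sys
import unittest

class BadTraverseTests(unittest.TestCase):

    @unittest.skipIf(sys.platform.startswith("aix"),
                     "AIX reads through a NULL pointer without faulting, "
                     "so the expected crash never happens")
    def test_bad_traverse(self):
        ...  # the existing crash-expecting body would stay unchanged

if __name__ == "__main__":
    unittest.main()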
Some questions though: During the import process, Python can already deal with folders and .zip files in sys.path... now, instead of having special handling for a new concept with a custom command line, etc, why not just say that this is a special file (e.g.: files with a .pyfrozen extension) and make importlib be able to deal with it when it's on sys.path (that way there could be multiple of those and there should be no need to turn it on/off, custom command line, etc)? Another question: doesn't importlib already provide hooks for external contributors which could address that use case? (so, this could initially be available as a third party library for maturing outside of CPython and then when it's deemed to be mature it could be integrated into CPython -- not that this can't happen on Python 3.8 timeframe, but it'd be useful checking its use against the current Python version and measuring benefits with real world code). To give an idea, on my machine the baseline Python startup is about 20ms >> (`time python -c pass`), but if I import Numpy it grows to 100ms, and >> with Pandas it's more than 200ms. Saving 4ms on the baseline startup >> would make no practical difference for concrete usage. >> > > Do you have a feeling for how many of those milliseconds are spend loading > bytecode from disk? If so standalone executables that contain numpy and > pandas (and mercurial) would start faster > > >> I'm ready to think there are other use cases where it matters, though. >> > > I think so. I hope you will, too :-) > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > fabiofz%40gmail.com > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From aixtools at felt.demon.nl Tue Sep 18 09:56:21 2018 From: aixtools at felt.demon.nl (Michael) Date: Tue, 18 Sep 2018 15:56:21 +0200 Subject: [Python-Dev] Nearly - all tests PASS for AIX In-Reply-To: <908465e6-d28b-a4ec-d687-1ff3826e7e84@felt.demon.nl> References: <908465e6-d28b-a4ec-d687-1ff3826e7e84@felt.demon.nl> Message-ID: On 17/09/2018 12:50, Michael wrote: > Dear all, > > The last two months I have spent nearly all my free time to cleanup "a > frustration" - from my side - the long list of failing tests for AIX > (there were nearly 20 when I started). == Tests result: SUCCESS == 393 tests OK. 1 test altered the execution environment: ??? test_threading 25 tests skipped: ??? test_dbm_gnu test_devpoll test_epoll test_gdb test_idle ??? test_kqueue test_lzma test_msilib test_ossaudiodev test_readline ??? test_spwd test_sqlite test_startfile test_tcl test_tix test_tk ??? test_ttk_guionly test_ttk_textonly test_turtle test_unicode_file ??? test_unicode_file_functions test_winconsoleio test_winreg ??? test_winsound test_zipfile64 Total duration: 13 min 30 sec Tests result: SUCCESS May I put this up as a PR - not for merging - but to see how it performs, or does not perform, with the Travis Ci, etc. tests? Regards, Michael p.s. - most of the time test_threading just passes. Going to Rinse and repeat! -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From carl.shapiro at gmail.com Tue Sep 18 13:36:52 2018 From: carl.shapiro at gmail.com (Carl Shapiro) Date: Tue, 18 Sep 2018 10:36:52 -0700 Subject: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement? In-Reply-To: <20180918103142.60cd2ec5@fsol> References: <20180916222441.13600498@fsol> <20180918103142.60cd2ec5@fsol> Message-ID: On Tue, Sep 18, 2018 at 1:31 AM, Antoine Pitrou wrote: > No idea. In my previous experiments with module import speed, I > concluded that executing module bytecode generally was the dominating > contributor, but that doesn't mean loading bytecode is costless. > My observations might not be so different. On a large application, we measured ~25-30% of start-up time being spent in the loading of compiled bytecode. That includes: probing the filesystem, reading the bytecode off disk, allocating heap storage, and un-marshaling objects into the heap. Making that percentage go to ~0% using this change does not make the non-import parts of our module body functions execute faster. It does create a greater opportunity for the application developer to do less work in module body functions which is where the largest start-up time gains are now likely to happen. -------------- next part -------------- An HTML attachment was scrubbed... URL: From carl.shapiro at gmail.com Tue Sep 18 13:57:16 2018 From: carl.shapiro at gmail.com (Carl Shapiro) Date: Tue, 18 Sep 2018 10:57:16 -0700 Subject: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement? In-Reply-To: References: <20180916222441.13600498@fsol> Message-ID: On Tue, Sep 18, 2018 at 5:55 AM, Fabio Zadrozny wrote: > During the import process, Python can already deal with folders and .zip > files in sys.path... now, instead of having special handling for a new > concept with a custom command line, etc, why not just say that this is a > special file (e.g.: files with a .pyfrozen extension) and make importlib be > able to deal with it when it's on sys.path (that way there could be > multiple of those and there should be no need to turn it on/off, custom > command line, etc)? > That is an interesting idea but it might not be easy to work into this design. The improvement in start-up time comes from eliminating the overheads of filesystem I/O, memory allocation, and un-marshaling bytecode. Having this data on the filesystem would reintroduce the cost of filesystem I/O and it would add a load-time relocation to the equation so the overall performance benefits would be greatly lessened. > Another question: doesn't importlib already provide hooks for external > contributors which could address that use case? (so, this could initially > be available as a third party library for maturing outside of CPython and > then when it's deemed to be mature it could be integrated into CPython -- > not that this can't happen on Python 3.8 timeframe, but it'd be useful > checking its use against the current Python version and measuring benefits > with real world code). > This may be possible but, for the same reasons I outline above, it would certainly come at the expense of performance. I think many people are interested in a better .pyc format but our goals are much more modest. We are actually trying to not introduce a whole new way to externalize .py data in CPython. 
Rather, we think of this as just making the existing frozen module capability much faster so its use can be broadened to making start-up performance better. The user visible part, the command line interface to bypass the frozen module, would be a nice-to-have for developers but is something we could live without. -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve.dower at python.org Tue Sep 18 14:38:10 2018 From: steve.dower at python.org (Steve Dower) Date: Tue, 18 Sep 2018 11:38:10 -0700 Subject: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement? In-Reply-To: References: <20180916222441.13600498@fsol> Message-ID: <397028a3-8368-4d57-57de-916f0847f90f@python.org> On 18Sep2018 1057, Carl Shapiro wrote: > On Tue, Sep 18, 2018 at 5:55 AM, Fabio Zadrozny > wrote: > > During the import process, Python can already deal with folders and > .zip files in sys.path... now, instead of having special handling > for a new concept with a custom command line, etc, why not just say > that this is a special file (e.g.: files with a .pyfrozen extension) > and make importlib be able to deal with it when it's on sys.path > (that way there could be multiple of those and there should be no > need to turn it on/off, custom command line, etc)? > > > That is an interesting idea but it might not be easy to work into this > design.? The improvement in start-up time comes from eliminating the > overheads of filesystem I/O, memory allocation, and un-marshaling > bytecode.? Having this data on the filesystem would reintroduce the cost > of filesystem I/O and it would add a load-time relocation to the > equation so the overall performance benefits would be greatly lessened. > > Another question: doesn't importlib already provide hooks for > external contributors which could address that use case? (so, this > could initially be available as a third party library for maturing > outside of CPython and then when it's deemed to be mature it could > be integrated into CPython -- not that this can't happen on Python > 3.8 timeframe, but it'd be useful checking its use against the > current Python version and measuring benefits with real world code). > > > This may be possible but, for the same reasons I outline above, it would > certainly come at the expense of performance. > > I think many people are interested in a better .pyc format but our goals > are much more modest.? We are actually trying to not introduce a whole > new way to externalize .py data in CPython.? Rather, we think of this as > just making the existing frozen module capability much faster so its use > can be broadened to making start-up performance better.? The user > visible part, the command line interface to bypass the frozen module, > would be a nice-to-have for developers but is something we could live > without. The primary benefit of the importlib hook approach is that it would not require rebuilding CPython each time you make a change. Since we need to consider a wide range of users across a wide range of platforms, having the ability to load a single native module that contains many "pre-loaded" modules allows many more people to access the benefits. It would not prevent some specific modules from being compiled into the main binary, but for those who do not build their own Python it would also allow specific applications to use the feature as well. FWIW, I don't read this as being pushed back on Carl to implement before the idea is accepted. 
I think we're taking the (now proven) core idea and shaping it into a suitable form for the main CPython distribution, which has to take more use cases into account. Cheers, Steve From fabiofz at gmail.com Tue Sep 18 14:46:11 2018 From: fabiofz at gmail.com (Fabio Zadrozny) Date: Tue, 18 Sep 2018 15:46:11 -0300 Subject: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement? In-Reply-To: References: <20180916222441.13600498@fsol> Message-ID: On Tue, Sep 18, 2018 at 2:57 PM, Carl Shapiro wrote: > On Tue, Sep 18, 2018 at 5:55 AM, Fabio Zadrozny wrote: > >> During the import process, Python can already deal with folders and .zip >> files in sys.path... now, instead of having special handling for a new >> concept with a custom command line, etc, why not just say that this is a >> special file (e.g.: files with a .pyfrozen extension) and make importlib be >> able to deal with it when it's on sys.path (that way there could be >> multiple of those and there should be no need to turn it on/off, custom >> command line, etc)? >> > > That is an interesting idea but it might not be easy to work into this > design. The improvement in start-up time comes from eliminating the > overheads of filesystem I/O, memory allocation, and un-marshaling > bytecode. Having this data on the filesystem would reintroduce the cost of > filesystem I/O and it would add a load-time relocation to the equation so > the overall performance benefits would be greatly lessened. > > >> Another question: doesn't importlib already provide hooks for external >> contributors which could address that use case? (so, this could initially >> be available as a third party library for maturing outside of CPython and >> then when it's deemed to be mature it could be integrated into CPython -- >> not that this can't happen on Python 3.8 timeframe, but it'd be useful >> checking its use against the current Python version and measuring benefits >> with real world code). >> > > This may be possible but, for the same reasons I outline above, it would > certainly come at the expense of performance. > > I think many people are interested in a better .pyc format but our goals > are much more modest. We are actually trying to not introduce a whole new > way to externalize .py data in CPython. Rather, we think of this as just > making the existing frozen module capability much faster so its use can be > broadened to making start-up performance better. The user visible part, > the command line interface to bypass the frozen module, would be a > nice-to-have for developers but is something we could live without. > Just to make sure we're in the same page, the approach I'm talking about would still be having a dll, not a better .pyc format, so, during the import a custom importer would open that dll once and provide modules from it -- do you think this would be much more overhead than what's proposed now? I guess it may be a bit slower because it'd have to obey the existing import capabilities, but that shouldn't mean more time is spent on IO, memory allocation nor un-marshaling bytecode (although it may be that I misunderstood the approach or the current import capabilities don't provide the proper api for that). -------------- next part -------------- An HTML attachment was scrubbed... URL: From leewangzhong+python at gmail.com Tue Sep 18 15:22:37 2018 From: leewangzhong+python at gmail.com (Franklin? 
Lee) Date: Tue, 18 Sep 2018 15:22:37 -0400 Subject: [Python-Dev] SEC: Spectre variant 2: GCC: -mindirect-branch=thunk -mindirect-branch-register In-Reply-To: References: <395ca1a4-f997-f533-241c-c3d3edc25f7e@python.org> <7f61f443-8599-31dc-58ee-609843f86250@python.org> Message-ID: On Tue, Sep 18, 2018 at 2:40 AM INADA Naoki wrote: > > On Tue, Sep 18, 2018 at 7:08 AM Wes Turner wrote: > > > > To summarize: > > > > - CPython may be vulnerable to speculative execution vulnerabilities, but none are known. > > - In general, CPython is currently too slow for speculative execution exploitation to be practical. > > - Sandboxed, JIT'ed JS is not too slow for speculative execution exploitation to be practical > > - (Not otherwise discussed here: PyPy's sandboxed JIT may not be too slow for speculative execution exploitation to be practical.) > > > > As far as I know, execution speed is important for attacker, not victim. > In case of JavaScript, browser may load attacking code and run it while > user watching websites. > Browsers provides sandbox for JS, but attacker code may be able to > bypass the sandbox by Spectre or Meltdown. So browsers disabled > high precision timer until OSes are updated. > > This topic is totally unrelated to compiler options: these compiler options > doesn't prohibit running attacking code, it just guard branches from > branch target injection. > > Does my understanding collect? Why should we discuss about execution speed? According to this article, the malicious program needs to act in the amount of time it takes for the CPU to load a value from memory and invalidate a branch prediction: https://hackernoon.com/timing-is-everything-understanding-the-meltdown-and-spectre-attacks-5e1946e44f9f From carl.shapiro at gmail.com Tue Sep 18 16:44:10 2018 From: carl.shapiro at gmail.com (Carl Shapiro) Date: Tue, 18 Sep 2018 13:44:10 -0700 Subject: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement? In-Reply-To: <397028a3-8368-4d57-57de-916f0847f90f@python.org> References: <20180916222441.13600498@fsol> <397028a3-8368-4d57-57de-916f0847f90f@python.org> Message-ID: On Tue, Sep 18, 2018 at 11:38 AM, Steve Dower wrote: > The primary benefit of the importlib hook approach is that it would not > require rebuilding CPython each time you make a change. Since we need to > consider a wide range of users across a wide range of platforms, having the > ability to load a single native module that contains many "pre-loaded" > modules allows many more people to access the benefits. > > It would not prevent some specific modules from being compiled into the > main binary, but for those who do not build their own Python it would also > allow specific applications to use the feature as well. > How might people feel about using the linker to bundle a list of pre-loaded modules into a single-file executable? That would avoid the inconvenience of rebuilding all of CPython by shipping a static libpython and having the tool generate a .o or .S file with the un-marshaled data. (Linkers and assemblers are small enough to be bundled on systems that do not have them.) -------------- next part -------------- An HTML attachment was scrubbed... URL: From nas-python at arctrix.com Tue Sep 18 18:00:03 2018 From: nas-python at arctrix.com (Neil Schemenauer) Date: Tue, 18 Sep 2018 16:00:03 -0600 Subject: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement? 
In-Reply-To: References: <20180916222441.13600498@fsol> <397028a3-8368-4d57-57de-916f0847f90f@python.org> Message-ID: <20180918220003.ulk4ypmxtbym6s4v@python.ca> On 2018-09-18, Carl Shapiro wrote: > How might people feel about using the linker to bundle a list of pre-loaded > modules into a single-file executable? The users of Python are pretty diverse so it depends on who you ask. Some would like a giant executable that includes everything they need (so of like the Go approach). Other people want an executable that has just importlib inside it and then mix-and-match different shared libs for their different purposes. Some will not want work "old school" and load from separate .py or .pyc files. I see no reason why we can't support all these options. Regards, Neil From stefan_ml at behnel.de Wed Sep 19 03:32:34 2018 From: stefan_ml at behnel.de (Stefan Behnel) Date: Wed, 19 Sep 2018 09:32:34 +0200 Subject: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement? In-Reply-To: References: <20180916222441.13600498@fsol> <397028a3-8368-4d57-57de-916f0847f90f@python.org> Message-ID: Carl Shapiro schrieb am 18.09.2018 um 22:44: > How might people feel about using the linker to bundle a list of pre-loaded > modules into a single-file executable? One way to do that would be to compile Python modules with Cython and link them in statically, instead of compiling them to .pyc files. Advantage: you get native C .o files, fast and straight forward to link. Disadvantage: native code is much more voluminous than byte code, so the overall binary size would grow substantially. Also, one thing that would be interesting to find out is whether constant Python data structures can actually be pre-allocated in the data segment (and I mean their object structs) . Then things like tuples of strings (argument lists and what not) could be loaded and the objects quickly initialised (although, is that even necessary?), rather than having to heap allocate and create them. Probably something that we should try out in Cython. Stefan From ncoghlan at gmail.com Wed Sep 19 03:49:40 2018 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 19 Sep 2018 17:49:40 +1000 Subject: [Python-Dev] Late Python 3.7.1 changes to fix the C locale coercion (PEP 538) implementation In-Reply-To: <942002A1-4319-4370-BC07-A628BB1B8FC1@python.org> References: <942002A1-4319-4370-BC07-A628BB1B8FC1@python.org> Message-ID: I think the changes to both master and the 3.7 branch should be reverted. For 3.7, I already said that I think we should just accept that that ship has sailed with 3.7.0 and leave the as-shipped implementation alone for the rest of the 3.7 series: https://bugs.python.org/issue34589#msg325242 It isn't the way I intended it to work, but the kinds of large scale architectural changes the intended implementation is designed to cope with aren't going to happen on a maintenance branch anyway. For 3.8, after Victor's rushed changes have been reverted, my PR should be conflict free again, and we'll be able to get PEP 538 back to working the way it was always supposed to work (while keeping the genuine stdio handling fixes that Victor's refactoring provided): https://github.com/python/cpython/pull/9257 Regards, Nick. 
On Tue, 18 Sep 2018 at 11:42, Ned Deily wrote: > > On Sep 17, 2018, at 21:20, Victor Stinner wrote: > > tl; dr Nick, Ned, INADA-san: I modified 3.7.1 to add a new "-X > > coerce_c_locale=value" option and make sure that the C locale coercion > > cannot be when Python in embedded: are you ok with these changes? > > > > > > Before 3.7.0 release, during the implementation of the UTF-8 Mode (PEP > > 540), I changed two things in Nick Coghlan's implementation of the C > > locale coercion (PEP 538): > > > > (1) PYTHONCOERCECLOCALE environment variable is now ignored when -E or > > -I command line option is used. > > > > (2) When Python is embeded, the C locale coercion is now enabled if > > the LC_CTYPE locale is "C". > > > > Nick asked me to change the behavior: > > https://bugs.python.org/issue34589 > > > > I just pushed this change in the 3.7 branch which adds a new "-X > > coerce_c_locale=value" option: > > https://github.com/python/cpython/commit/144f1e2c6f4a24bd288c045986842c65cc289684 > > > > Examples using Pyhon 3.7 (future 3.7.1) with UTF-8 Mode disabled, to > > only test the C locale coercion: > > --- > > $ cat test.py > > import codecs, locale > > enc = locale.getpreferredencoding() > > enc = codecs.lookup(enc).name > > print(enc) > > > > $ export LC_ALL= LC_CTYPE=C LANG= > > > > # Disable C locale coercion: get ASCII as expected > > $ PYTHONCOERCECLOCALE=0 ./python -X utf8=0 test.py > > ascii > > > > # -E ignores PYTHONCOERCECLOCALE=0: > > # C locale is coerced, we get UTF-8 > > $ PYTHONCOERCECLOCALE=0 ./python -E -X utf8=0 test.py > > utf-8 > > > > # -X coerce_c_locale=0 is not affected by -E: > > # C locale coercion disabled as expected, get ASCII as expected > > $ ./python -E -X utf8=0 -X coerce_c_locale=0 test.py > > ascii > > --- > > > > > > For (1), Nick's use case is to get Python 3.6 behavior (C locale not > > coerced) on Python 3.7 using PYTHONCOERCECLOCALE. Nick proposed to use > > PYTHONCOERCECLOCALE even with -E or -I, but I dislike introducing a > > special case for -E option. > > > > I chose to add a new "-X coerce_c_locale=0" to Python 3.7.1 to provide > > a solution for this use case. (Python 3.7.0 and older ignore this > > option.) > > > > Note: Python 3.7.0 is fine with PYTHONCOERCECLOCALE=0, we are only > > talking about the special case of -E and -I options. > > > > > > For (2), I modified Python 3.7.1 to make sure the C locale is never > > coerced when the C API is used to embed Python inside an application: > > Py_Initialize() and Py_Main(). The C locale can only be coerced by the > > official Python program ("python3.7"). > > > > I don't know if it should be possible to enable C locale coercion when > > Python is embedded. So I just made the change requested by Nick :-) > > > > > > I dislike doing such late changes in 3.7.1, especially since PEP 538 > > has been designed by Nick Coghlan, and we disagree on the fix. But Ned > > Deily, our Python 3.7 release manager, wants to see last 3.7 fixes > > merged before Tuesday, so here we are. > > Just because the 3.7.1rc is scheduled doesn't mean we should throw something in, particularly if it's not fully reviewed and fully agreed upon. If it's important enough, we could delay the rc a few days ... or decide to wait for 3.7.2. > > > Nick, Ned, INADA-san: are you ok with these changes? > > The other choices for 3.7.1 are: > > > > * Revert my change: C locale coercion can still be enabled when Python > > is embedded, -E option ignores PYTHONCOERCECLOCALE env var. 
> > > > * Revert my change and apply Nick's PR 9257: C locale coercion cannot > > be enabled when Python is embedded and -E option doesn't ignore > > PYTHONCOERCECLOCALE env var. > > > > > > I spent months to fix the master branch to support all possible > > locales and encodings, and get a consistent CLI: > > https://vstinner.github.io/python3-locales-encodings.html > > > > So I'm not excited by Nick's PR which IMHO moves Python backward, > > especially it breaks the -E option contract: it doesn't ignore > > PYTHONCOERCECLOCALE env var. > > I would like to see Nick review the merged 3.7 PR and have both him and you agree that this is the thing to do for 3.7.1. I also want to make sure we understand what affect this will have on 3.7.0 users. Let's not potentially make things worse. > > I'm not planning to tag 3.7.1rc for at least another 18 hours. I'm marking bpo-34589 as "release blocker" and I will not proceed until this is resolved. > > Thanks! > --Ned > > -- > Ned Deily > nad at python.org -- [] > -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From vstinner at redhat.com Wed Sep 19 07:47:36 2018 From: vstinner at redhat.com (Victor Stinner) Date: Wed, 19 Sep 2018 13:47:36 +0200 Subject: [Python-Dev] Late Python 3.7.1 changes to fix the C locale coercion (PEP 538) implementation In-Reply-To: References: <942002A1-4319-4370-BC07-A628BB1B8FC1@python.org> Message-ID: Le mer. 19 sept. 2018 ? 09:50, Nick Coghlan a ?crit : > I think the changes to both master and the 3.7 branch should be reverted. Ok, I prepared a PR to revert the 3.7 change: https://github.com/python/cpython/pull/9416 > For 3.7, I already said that I think we should just accept that that > ship has sailed with 3.7.0 and leave the as-shipped implementation > alone for the rest of the 3.7 series: (...) For 3.8, (...), my PR > should be conflict free again, and we'll be able to get PEP 538 back > to working the way it was always supposed to work (...) I read all your comments, and honestly, I don't understand you. Once you say: "we don't actually want anyone turning off locale coercion except for debugging purposes" https://bugs.python.org/issue34589#msg325554 but you also say that Python 3.7.0 is broken on Centos 7 because it's not possible to disable C locale coercion using -E flag: https://bugs.python.org/issue34589#msg325246 And here (your email), one more time, you insist to support "PYTHONCOERCECLOCALE=0 python3 -E". I don't understand if you want PYTHONCOERCECLOCALE to be ignored when using -E or not. Since the PEP 538 is something new, we don't have much feedback of users to know if it causes any troubles, so I agree that we should provide a way to disable the feature, as I provided a way to disable the UTF-8 Mode when the LC_CTYPE is C or POSIX. Just to give user a full control on locales and encodings. That's why I came up with a new -X coerce_c_locale option which can be used even with -E. I understood that you like the option, since you proposed to use it: https://bugs.python.org/issue34589#msg325493 -- Moreover, you asked me to make sure that Py_Initialize() and Py_Main() cannot enable C locale coercion. That's what I did. -- IMHO the implementation is really a secondary concern here, the main question is: what is the correct behavior? Nick: * Do we agree that we need to provide a way to disable C locale coercion (PEP 538) even when -E is used? * Do you agree that Py_Initialize() and Py_Main() must not enable the C locale coercion (PEP 538)? 
I understood that your reply is yes for the second question, since you insist to push your change which also prevent Py_Initialize() and Py_Main() to enable C locale coercion. If we change 3.7.0 behavior in 3.8, I would prefer to change the behavior in 3.7.1. IMHO it's not too late to *fix* 3.7. -- I decided to push a concrete implementation because I understood that you was ok for the -X coerce_c_locale option and you asked me to fix my mistakes. I feel guilty that I broke the implementation of your PEP :-( Moreover, I'm also exhausted by fixing locales and encodings, I'm doing that for one year now, and I expected many times that I was done with all regressions and corner cases... We are discussing these issues since 3 weeks and we failed to fix them, whereas Ned asked to push last fixes before 3.7.1. I sent an email to make sure that we all agree on the solution. Well, it seems like again, we failed to agree on the expected *behavior*. Victor From vstinner at redhat.com Wed Sep 19 08:07:30 2018 From: vstinner at redhat.com (Victor Stinner) Date: Wed, 19 Sep 2018 14:07:30 +0200 Subject: [Python-Dev] Late Python 3.7.1 changes to fix the C locale coercion (PEP 538) implementation In-Reply-To: References: <942002A1-4319-4370-BC07-A628BB1B8FC1@python.org> Message-ID: > IMHO the implementation is really a secondary concern here, the main > question is: what is the correct behavior? > > Nick: > > * Do we agree that we need to provide a way to disable C locale > coercion (PEP 538) even when -E is used? > * Do you agree that Py_Initialize() and Py_Main() must not enable the > C locale coercion (PEP 538)? > > I understood that your reply is yes for the second question, since you > insist to push your change which also prevent Py_Initialize() and > Py_Main() to enable C locale coercion. Hum, I'm not sure if I explained properly my opinion on these questions. I consider that Python 3.7.0 introduced a regression compared to Python 3.6: it changes the LC_CTYPE locale for Python and all child processes and it's not possible to opt-out for that when using -E command line option. I proposed (and implemented) -X coerce_c_locale=0 for that. Unicode and locales are so hard to get right that I consider that it's important that we provide an option to opt-out,. Otherwise, someone will find an use case where Python 3.7 doesn't behave as expected and break one specific use case. I didn't notice a complain yet, but there are very few Python 3.7 users at this point. For example, very few Linux distributions use it yet. I consider that PYTHONCOERCECLOCALE must not introduce an exception in -E: it must be ignored when -E or -I is used. For security reasons, it's important to really ignore all PYTHON* environment variables. "Unicode" (in general) has been abused in the past to exploit vulnerabilities in applications. Locales and encodings are so hard, that it's easy to mess up and introduce a vulnerability just caused by encodings. It's also important to get deterministic and reproducible programs. 
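To make the -E contract concrete, here is a small self-contained check. PYTHONDONTWRITEBYTECODE is only an arbitrary example of a PYTHON* variable, any of them behaves the same way:

---
import os, subprocess, sys

code = "import sys; print(sys.flags.dont_write_bytecode)"
env = dict(os.environ, PYTHONDONTWRITEBYTECODE="1")
for flags in ([], ["-E"]):
    result = subprocess.run([sys.executable, *flags, "-c", code], env=env,
                            stdout=subprocess.PIPE, universal_newlines=True)
    print(flags or "(default)", result.stdout.strip())
# (default) 1  -> the PYTHON* variable is honoured
# ['-E'] 0     -> -E makes the child interpreter ignore it entirely
---

That is exactly the contract I want PYTHONCOERCECLOCALE to respect as well.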
For Py_Initialize() and Py_Main(): I have no opinion, so I rely on Nick's request to make sure that the C locale is not coerced when Python is embeded :-) Victor From yselivanov.ml at gmail.com Wed Sep 19 13:30:26 2018 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Wed, 19 Sep 2018 13:30:26 -0400 Subject: [Python-Dev] Late Python 3.7.1 changes to fix the C locale coercion (PEP 538) implementation In-Reply-To: References: <942002A1-4319-4370-BC07-A628BB1B8FC1@python.org> Message-ID: Ned, Nick, Victor, There's an issue with the new PEP 567 (contextvars) C API. Currently it's designed to expose "PyContext*" and "PyContextVar*" pointers. I want to change that to "PyObject*" as using non-PyObject pointers turned out to be a very bad idea (interfacing with Cython is particularly challenging). Is it a good idea to change this in Python 3.7.1? Yury From vstinner at redhat.com Wed Sep 19 15:08:08 2018 From: vstinner at redhat.com (Victor Stinner) Date: Wed, 19 Sep 2018 21:08:08 +0200 Subject: [Python-Dev] Late Python 3.7.1 changes to fix the C locale coercion (PEP 538) implementation In-Reply-To: References: Message-ID: Le mardi 18 septembre 2018, Victor Stinner a ?crit : > Hi Unicode and locales lovers, > > tl; dr Nick, Ned, INADA-san: I modified 3.7.1 to add a new "-X > coerce_c_locale=value" option and make sure that the C locale coercion > cannot be when Python in embedded: are you ok with these changes? Nick asked me to revert, which means that no, he is not ok with these changes. I reverted my change in 3.7. Victor -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at holdenweb.com Wed Sep 19 15:19:04 2018 From: steve at holdenweb.com (Steve Holden) Date: Wed, 19 Sep 2018 20:19:04 +0100 Subject: [Python-Dev] [help] where to learn how to upgrade from 2.7 to 3 In-Reply-To: References: Message-ID: You can find information about python-list at https://mail.python.org/mailman/listinfo/python-list regards Steve Holden On Tue, Sep 18, 2018 at 4:28 AM Ryan Gonzalez wrote: > Python-dev is for development *of* Python, not *in* Python! You want > python-list instead. > > Also, make sure you include some full example code where the error occurs > and what exactly is failing. Right now, it's hard for me to tell what > exactly is going on... > > On Mon, Sep 17, 2018, 8:21 PM Avery Richards > wrote: > >> I am having so much fun learning python! I did not install the best >> version into my mac at first. Now I can't find out how to upgrade, (pip is >> awesome but not as conversational as I need it to be on the subject). I've >> downloaded the packages from python.org, installed all sorts of stuff, >> I configured my text editor to recognize python3, resolving formatting >> strings output, but now as I progress the >> >> [end = ' '] >> >> is not recognized. I have figured out a lot on my own, can you help me >> upgrade to 3.6 once and for all? Again I consulted with pip and followed >> faq websites (maybe a mistake there, idk). >> >> please please thank you! >> >> ~Avery >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> https://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: >> https://mail.python.org/mailman/options/python-dev/rymg19%40gmail.com >> > -- > > Ryan (????) 
> Yoko Shimomura, ryo (supercell/EGOIST), Hiroyuki Sawano >> everyone else > https://refi64.com/ > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/steve%40holdenweb.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From nad at python.org Wed Sep 19 16:24:08 2018 From: nad at python.org (Ned Deily) Date: Wed, 19 Sep 2018 16:24:08 -0400 Subject: [Python-Dev] Late Python 3.7.1 changes to fix the C locale coercion (PEP 538) implementation In-Reply-To: References: <942002A1-4319-4370-BC07-A628BB1B8FC1@python.org> Message-ID: On Sep 19, 2018, at 13:30, Yury Selivanov wrote: > Ned, Nick, Victor, > > There's an issue with the new PEP 567 (contextvars) C API. > > Currently it's designed to expose "PyContext*" and "PyContextVar*" > pointers. I want to change that to "PyObject*" as using non-PyObject > pointers turned out to be a very bad idea (interfacing with Cython is > particularly challenging). > > Is it a good idea to change this in Python 3.7.1? It's hard to make an informed decision without a concrete PR to review. What would be the impact on any user code that has already adopted it in 3.7.0? -- Ned Deily nad at python.org -- [] From nad at python.org Wed Sep 19 16:48:46 2018 From: nad at python.org (Ned Deily) Date: Wed, 19 Sep 2018 16:48:46 -0400 Subject: [Python-Dev] Late Python 3.7.1 changes to fix the C locale coercion (PEP 538) implementation In-Reply-To: References: Message-ID: On Sep 19, 2018, at 15:08, Victor Stinner wrote: > Le mardi 18 septembre 2018, Victor Stinner a ?crit : > > Hi Unicode and locales lovers, > > > > tl; dr Nick, Ned, INADA-san: I modified 3.7.1 to add a new "-X > > coerce_c_locale=value" option and make sure that the C locale coercion > > cannot be when Python in embedded: are you ok with these changes? > > Nick asked me to revert, which means that no, he is not ok with these changes. > > I reverted my change in 3.7. Thank you, Victor! Nick, with regard to this does the current state of the 3.7 branch look acceptable now for a 3.7.1? -- Ned Deily nad at python.org -- [] From nad at python.org Wed Sep 19 17:12:47 2018 From: nad at python.org (Ned Deily) Date: Wed, 19 Sep 2018 17:12:47 -0400 Subject: [Python-Dev] 3.7.1 and 3.6.7 Releases Coming Soon In-Reply-To: <3ABAB3B5-6346-49F2-98F7-185303166016@python.org> References: <3ABAB3B5-6346-49F2-98F7-185303166016@python.org> Message-ID: Update: not surprisingly, there have been a number of issues that have popped up during and since the sprint that we would like to ensure are addressed in 3.7.1 and 3.6.7. In order to do so, I've been holding off on starting the releases. I think we are now getting close to having the important ones resolved so I'm going to plan on cutting off code for 3.7.1rc1 and 3.6.7rc1 by the end of 2018-09-20 (23:59 AoE). That's roughly 38 hours from now. Thanks for all of your help in improving Python for everyone! --Ned On Sep 10, 2018, at 18:17, Ned Deily wrote: > I have now scheduled a 3.7.1 release candidate and rescheduled the 3.6.7 release candidate for 2018-09-18, about a week from today, and 3.7.1 final and 3.6.7 final for 2018-09-28. That allows us to take advantage of fixes generated at the Core Developers sprint taking place this week. 
> > Please review any open issues you are working on or are interested in and try to get them merged in to the 3.7 and/or 3.6 branches soon - by the beginning of next week at the latest. As usual, if there are any issues you believe need to be addressed prior to these releases, please ensure there are open issues for them in the bug tracker (bugs.python.org) and that their priorities are set accordingly (e.g. "release blocker"). -- Ned Deily nad at python.org -- [] From tjreedy at udel.edu Wed Sep 19 18:08:37 2018 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 19 Sep 2018 18:08:37 -0400 Subject: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement? In-Reply-To: <397028a3-8368-4d57-57de-916f0847f90f@python.org> References: <20180916222441.13600498@fsol> <397028a3-8368-4d57-57de-916f0847f90f@python.org> Message-ID: On 9/18/2018 2:38 PM, Steve Dower wrote: > The primary benefit of the importlib hook approach is that it would not > require rebuilding CPython each time you make a change. If one edits a .c or .h file, one must rebuild to test. If one edits a .py module, one does not, and it would be a major nuisance to have to. My first suggested patches on the tracker (to .py files) were developed in my installed version (after backing up a module). I have occasionally told people on StackOverflow how to edit an idlelib file to get a future change 'now'. Other people have occasional reported there own custom modifications. If Python usually used derived stdlib code, but could optionally use the original .py files via a command-line switch, experimenting with changes to .py files would be easier. -- Terry Jan Reedy From larry at hastings.org Wed Sep 19 19:54:52 2018 From: larry at hastings.org (Larry Hastings) Date: Wed, 19 Sep 2018 16:54:52 -0700 Subject: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement? In-Reply-To: References: <20180916222441.13600498@fsol> <397028a3-8368-4d57-57de-916f0847f90f@python.org> Message-ID: On 09/19/2018 03:08 PM, Terry Reedy wrote: > If Python usually used derived stdlib code, but could optionally use > the original .py files via a command-line switch, experimenting with > changes to .py files would be easier. When Carl described the patch to me, he said there was already a switch in there somewhere to do exactly that.? I don't remember if it was command-line, it might have been an environment variable.? (I admit I didn't go hunting for it--I didn't need it to test the patch itself, and I had enough to do.)? Regardless, we would definitely have that functionality in before the patch would ever be considered for merging. We talked about it last week at the core dev sprint, and I thought about it some more.? As a result here's the behavior I propose.? I'm going to call the process "freezing" and the result "frozen modules", even though that's an already-well-overused name and I hope we'll pick something else before it gets merged. First, .py files that get frozen will have their date/time stamps set to a known value, both as part of the tarball / zip file, and when installed (a la "make install", the Win32 installer, etc).? There are some constraints on this; we distribute Python via .zip files, and .zip only supports 2 second resolution for date/time stamps.? So maybe something like this: the date is the approximate date of the release, and the time is the version number (e.g. 03:08:00 for all 3.8.x releases). When attempting to load a frozen Python module, Python will stat the .py file.? 
If the date/time and size match what we expected, Python will use the frozen module.? Otherwise it'll fall back to conventional behavior, including supporting .pyc files. There will also be a switch (command-line? environment variable? compile-time flag? all three?) for people who control their environments where you can skip the .py file and use the frozen module every time. In short: correctness by default, and more speed available if you know it's safe for your use case.? Use of the optimization is intentionally a little fragile, to ensure correctness. Cheers, //arry/ p.s. Why not 03:08:01 for 3.8.1?? That wouldn't be stored properly in the .zip file with its only-two-second resolution.? And multiplying the tertiary version number by 2--or 10, etc--would be surprising. -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg at krypto.org Wed Sep 19 20:34:01 2018 From: greg at krypto.org (Gregory P. Smith) Date: Wed, 19 Sep 2018 17:34:01 -0700 Subject: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement? In-Reply-To: References: <20180914215424.hjxvq5l7m66schas@python.ca> <58cdf8c3-1b23-326c-2244-835f16397b51@hastings.org> <20180914222558.yuromalgkggfp2re@python.ca> Message-ID: On Sat, Sep 15, 2018 at 2:53 AM Paul Moore wrote: > On Fri, 14 Sep 2018 at 23:28, Neil Schemenauer > wrote: > > > > On 2018-09-14, Larry Hastings wrote: > > > [..] adding the stat calls back in costs you half the startup. So > > > any mechanism where we're talking to the disk _at all_ simply > > > isn't going to be as fast. > > > > Okay, so if we use hundreds of small .pyc files scattered all over > > the disk, that's bad? Who would have thunk it. ;-P > > > > We could have a new format, .pya (compiled python archive) that has > > data for many .pyc files in it. In normal runs you would have one > > or just and handlful of these things (e.g. one for stdlib, one for > > your app and all the packages it uses). Then you mmap these just > > once and rely on OS page faults to bring in the data as you need it. > > The .pya would have a hash table at the start or end that tells you > > the offset for each module. > > Isn't that essentially what putting the stdlib in a zipfile does? (See > the windows embedded distribution for an example). It probably uses > normal IO rather than mmap, but maybe adding a "use mmap" flag to the > zipfile module would be a more general enhancement that zipimport > could use for free. > > Paul > To share a lesson learned: Putting the stdlib in a zip file is doable, but comes with a caveats that would likely make OS distros want to undo the change if done with CPython today: We did that for one of our internal pre-built Python 2.7 distributions used internally at Google used in the 2012-2014 timeframe. Thinking at the time "yay, less inodes and disk space and stat calls by the interpreter on all machines." The caveat we didn't anticipate was unfortunately that zipimport.c cannot handle the zip file changing out from underneath a running process. Ever. It does not hold an open file handle to the zip file (which on posix systems would ameliorate the problem) but instead regularly reopens it by name while using a startup-time cached zip file index. So when you deploy a change to your Python interpreter (as any OS distro package update, security update, upgrade, etc.) 
existing running processes that go on to do another import of a stdlib module that hadn't already been imported (statistically likely to be a codec related module, as those are often imported upon first use rather than at startup time with most modules the way people tend to structure their code) read a different zipfile using a cached index from a previous one and... boom. A strange rolling error in production that is not pretty to debug. Fixing zipimport.c to deal with this properly was tried, but still ran into issues, and was deemed ultimately infeasible. There's a BPO issue or three filed about this if you go hunting. On the contrary, having compiled in constants in the executable is fine and will never suffer from this problem. Those are mapped as RO data by the dynamic loader and demand paged. No complicated code in CPython required to manage them aside from the stdlib startup code import intercepting logic (which should be reasonably small, even without having looked at the patch in the PR yet). There's ongoing work to rewrite zipimport.c in python using zipfile itself which if used for the stdlib will require everything that it needs to be frozen into C data similar to existing bootstrap import logic - and being a different implementation of zip file reading code might be possible to do without suffering the same caveat. But storing the data on the C side still sounds like a much simpler code path to me. The maintenance concern is mostly about testing and building to make sure we include everything needed by the interpreter and keep it up to date. I'd like a configure flag controlling when the feature is to be "on by default". Having it off by default and enabled by an interpreter command line flag otherwise. Consider adding the individual configure flag to the set of things that --with-optimizations turns on for people. Don't be surprised if Facebook reports a startup time speedup greater than what you ever measure yourself. Their applications are different, and if they're using their XAR thing that mounts applications as a FUSE filesystem - that increases stat() overhead beyond what it already is with additional kernel round trips so it'll benefit that design even more. Any savings in startup time by not doing a crazy amount of sequential high latency blocking system calls is a good thing regardless. Not just for command line tools. Serving applications that are starting up are effectively spinning consuming CPUs to ultimately compute the same result everywhere for every application every time before performing useful work... You can measure such an optimization in a worthwhile amount of $ or carbon footprint saved around the world. Heat death of the universe by a billion cuts. Thanks for working on this! -G -------------- next part -------------- An HTML attachment was scrubbed... URL: From barry at python.org Wed Sep 19 21:25:07 2018 From: barry at python.org (Barry Warsaw) Date: Wed, 19 Sep 2018 21:25:07 -0400 Subject: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement? In-Reply-To: References: <20180914215424.hjxvq5l7m66schas@python.ca> <58cdf8c3-1b23-326c-2244-835f16397b51@hastings.org> <20180914222558.yuromalgkggfp2re@python.ca> Message-ID: On Sep 19, 2018, at 20:34, Gregory P. Smith wrote: > There's ongoing work to rewrite zipimport.c in python using zipfile itself Great timing! Serhiy?s rewrite of zipimport in Python has just landed in 3.8, although it doesn?t use zipfile. 
What?s in git now is a pretty straightforward translation from the original C, so it could use some clean ups (and I think Serhiy is planning that). So the problem you describe should be easier to fix now in 3.8. It would be interesting to see if we can squeeze more performance and better behavior out of it now that it?s in Python. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: Message signed with OpenPGP URL: From eric at trueblade.com Wed Sep 19 21:37:30 2018 From: eric at trueblade.com (Eric V. Smith) Date: Wed, 19 Sep 2018 21:37:30 -0400 Subject: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement? In-Reply-To: References: <20180914215424.hjxvq5l7m66schas@python.ca> <58cdf8c3-1b23-326c-2244-835f16397b51@hastings.org> <20180914222558.yuromalgkggfp2re@python.ca> Message-ID: <8d100c3b-e829-6c4f-7f8f-64bf64689717@trueblade.com> On 9/19/2018 9:25 PM, Barry Warsaw wrote: > On Sep 19, 2018, at 20:34, Gregory P. Smith wrote: > >> There's ongoing work to rewrite zipimport.c in python using zipfile itself > > Great timing! Serhiy?s rewrite of zipimport in Python has just landed in 3.8, although it doesn?t use zipfile. What?s in git now is a pretty straightforward translation from the original C, so it could use some clean ups (and I think Serhiy is planning that). So the problem you describe should be easier to fix now in 3.8. It would be interesting to see if we can squeeze more performance and better behavior out of it now that it?s in Python. You don't hear "better performance" and "now that it's in Python" together very often! Although I agree with your point: it's like how we tried and failed to make progress on namespace packages when import was written in C, and then once it was in Python it was easy to add the functionality. Eric From ncoghlan at gmail.com Thu Sep 20 06:11:45 2018 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 20 Sep 2018 20:11:45 +1000 Subject: [Python-Dev] Late Python 3.7.1 changes to fix the C locale coercion (PEP 538) implementation In-Reply-To: References: <942002A1-4319-4370-BC07-A628BB1B8FC1@python.org> Message-ID: On Wed, 19 Sep 2018 at 22:07, Victor Stinner wrote: > > > IMHO the implementation is really a secondary concern here, the main > > question is: what is the correct behavior? > > > > Nick: > > > > * Do we agree that we need to provide a way to disable C locale > > coercion (PEP 538) even when -E is used? > > * Do you agree that Py_Initialize() and Py_Main() must not enable the > > C locale coercion (PEP 538)? > > > > I understood that your reply is yes for the second question, since you > > insist to push your change which also prevent Py_Initialize() and > > Py_Main() to enable C locale coercion. > > Hum, I'm not sure if I explained properly my opinion on these questions. > > I consider that Python 3.7.0 introduced a regression compared to > Python 3.6: it changes the LC_CTYPE locale for Python and all child > processes and it's not possible to opt-out for that when using -E > command line option. This *wasn't* broken in the original PEP 538 implementation - it was only broken when you ignored the PEP and tried to make everything work the same way PEP 540 did, including moving the coercion out of the Python CLI and into the runtime library APIs. 
I still think the locale coercion handling in Python 3.7.x is broken, but adding MORE code is NOT the right answer: going back to the original (correct) implementation is. So changing it back to the way the PEP is supposed to work is fine, making everything more complicated for no good reason whatsoever is not fine. What changed is the fact I decided it wasn't worth holding up 3.7.1 over (and it certainly isn't worth adding a new -X option in a point release). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Thu Sep 20 06:20:53 2018 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 20 Sep 2018 20:20:53 +1000 Subject: [Python-Dev] Late Python 3.7.1 changes to fix the C locale coercion (PEP 538) implementation In-Reply-To: References: Message-ID: On Thu, 20 Sep 2018 at 06:48, Ned Deily wrote: > > On Sep 19, 2018, at 15:08, Victor Stinner wrote: > > Le mardi 18 septembre 2018, Victor Stinner a ?crit : > > > Hi Unicode and locales lovers, > > > > > > tl; dr Nick, Ned, INADA-san: I modified 3.7.1 to add a new "-X > > > coerce_c_locale=value" option and make sure that the C locale coercion > > > cannot be when Python in embedded: are you ok with these changes? > > > > Nick asked me to revert, which means that no, he is not ok with these changes. > > > > I reverted my change in 3.7. > > Thank you, Victor! > > Nick, with regard to this does the current state of the 3.7 branch look acceptable now for a 3.7.1? It's still broken relative to the PEP in the following respects: - Py_Initialize() coerces the C locale to C.UTF-8, even though it's not supposed to - Py_Main() coerces the C locale to C.UTF-8, even though it's not supposed to - PYTHONCOERCECLOCALE=0 doesn't work if -E or -I are passed on the command line (but it's supposed to) - PYTHONCOERCECLOCALE=warn doesn't work if -E or -I are passed on the command line (it's nominally supposed to do this too, but I'm less concerned about this one) The problem with Victor's patch is that instead of reverting to the as-designed-and-accepted PEP the way my PR (mostly) does, it instead introduces a whole new command line option (which then needs to be documented and tested), and still coerces *far* too late (not until Py_Initialise is already running, after who knows how much code in the embedding application has already executed). I don't have the time required to push through Victor's insistence that -E and -I are sacrosanct and must always be respected (despite PEP 538 explicitly saying that they won't be where PYTHONCOERCECLOCALE is concerned), and so we can't *possibly* change back to having the locale coercion work the way I originally implemented it, so I wrote the 3.7.x series off as a lost cause, and decided to devote my energies to getting things back to the way they were supposed to be for 3.8+. Regards, Nick. 
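P.S. For anyone who wants to see the PYTHONCOERCECLOCALE=0 / -E interaction for
themselves, Victor's test.py from earlier in the thread can be wrapped into one
rough self-contained sketch. It assumes a platform where a suitable coercion
target locale such as C.UTF-8 exists (e.g. a typical Linux system), and it
disables UTF-8 Mode so that only the PEP 538 coercion is visible:

---
import os, subprocess, sys

child = ("import codecs, locale;"
         "print(codecs.lookup(locale.getpreferredencoding()).name)")
env = dict(os.environ, LC_ALL="", LANG="", LC_CTYPE="C",
           PYTHONCOERCECLOCALE="0")
out = subprocess.run([sys.executable, "-E", "-X", "utf8=0", "-c", child],
                     env=env, stdout=subprocess.PIPE, universal_newlines=True)
print(out.stdout.strip())
# PEP 538 says this should print "ascii" (coercion explicitly disabled);
# with the 3.7 behaviour described above it prints "utf-8" instead,
# because -E discards PYTHONCOERCECLOCALE.
---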
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Thu Sep 20 07:06:44 2018 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 20 Sep 2018 21:06:44 +1000 Subject: [Python-Dev] Late Python 3.7.1 changes to fix the C locale coercion (PEP 538) implementation In-Reply-To: References: Message-ID: On Thu, 20 Sep 2018 at 20:20, Nick Coghlan wrote: > > On Thu, 20 Sep 2018 at 06:48, Ned Deily wrote: > > > > On Sep 19, 2018, at 15:08, Victor Stinner wrote: > > > Le mardi 18 septembre 2018, Victor Stinner a ?crit : > > > > Hi Unicode and locales lovers, > > > > > > > > tl; dr Nick, Ned, INADA-san: I modified 3.7.1 to add a new "-X > > > > coerce_c_locale=value" option and make sure that the C locale coercion > > > > cannot be when Python in embedded: are you ok with these changes? > > > > > > Nick asked me to revert, which means that no, he is not ok with these changes. > > > > > > I reverted my change in 3.7. > > > > Thank you, Victor! > > > > Nick, with regard to this does the current state of the 3.7 branch look acceptable now for a 3.7.1? > > It's still broken relative to the PEP in the following respects: > > - Py_Initialize() coerces the C locale to C.UTF-8, even though it's > not supposed to > - Py_Main() coerces the C locale to C.UTF-8, even though it's not supposed to > - PYTHONCOERCECLOCALE=0 doesn't work if -E or -I are passed on the > command line (but it's supposed to) > - PYTHONCOERCECLOCALE=warn doesn't work if -E or -I are passed on the > command line (it's nominally supposed to do this too, but I'm less > concerned about this one) It's worth noting that even though the PYTHONCOERCECLOCALE=0 off switch doesn't currently work as described in PEP 538 when passing -E or -I, setting "LC_ALL=C" does (since that's handled by the C library, independently of any CPython command line flags). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Thu Sep 20 09:18:15 2018 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 20 Sep 2018 23:18:15 +1000 Subject: [Python-Dev] Nearly - all tests PASS for AIX In-Reply-To: References: <908465e6-d28b-a4ec-d687-1ff3826e7e84@felt.demon.nl> Message-ID: On Wed, 19 Sep 2018 at 00:00, Michael wrote: > On 17/09/2018 12:50, Michael wrote: > > Dear all, > > > > The last two months I have spent nearly all my free time to cleanup "a > > frustration" - from my side - the long list of failing tests for AIX > > (there were nearly 20 when I started). > > == Tests result: SUCCESS == > > 393 tests OK. Nice! > 1 test altered the execution environment: > test_threading > > 25 tests skipped: > test_dbm_gnu test_devpoll test_epoll test_gdb test_idle > test_kqueue test_lzma test_msilib test_ossaudiodev test_readline > test_spwd test_sqlite test_startfile test_tcl test_tix test_tk > test_ttk_guionly test_ttk_textonly test_turtle test_unicode_file > test_unicode_file_functions test_winconsoleio test_winreg > test_winsound test_zipfile64 > > Total duration: 13 min 30 sec > Tests result: SUCCESS > > May I put this up as a PR - not for merging - but to see how it > performs, or does not perform, with the Travis Ci, etc. tests? That seems like a reasonable approach to me - it will also allow folks to give the changes a quick skim and provide suggestions for splitting it up into more easily reviewed PRs. Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Thu Sep 20 09:26:46 2018 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 20 Sep 2018 23:26:46 +1000 Subject: [Python-Dev] debugging test_importlib.test_bad_traverse - script status is SUCCESS - but FAIL is expected. In-Reply-To: References: <10c3d2a1-85e3-5747-77a2-8c282565e8ce@felt.demon.nl> Message-ID: On Tue, 18 Sep 2018 at 19:52, Michael wrote: > > On 17/09/2018 09:39, Michael wrote: > > I read the discussion related to issue32374. That seems to be sure that > > other events that could > > cause the test to fail (i.e., the program executes successfully) are > > caught early, and/or ignored > > so that the program fails - and the test succeeds. > > After reading below, I would appreciate knowing whether to ask that > issue32374 be reopened and the test adjusted so that the test is > "SkipIf" AIX? Or, something else? I'll work on something else, but I do > not want to guess the current intent of this test module. Reviewing https://bugs.python.org/issue32374, the purpose of the test case is to check that failing to handle the m_state == NULL case will always segfault (because the import machinery always checks that m_traverse is valid after the create stage), rather than only segfaulting sometimes (based on whether or not a cyclic gc run triggers at an inopportune moment). Since the AIX case won't segfault in either the deliberately triggered traversal *or* in a GC-induced traversal, skipping the test case on AIX seems fine (with the note that null pointer accesses are just zero on AIX, not a segfault). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From stefanrin at gmail.com Thu Sep 20 13:29:22 2018 From: stefanrin at gmail.com (Stefan Ring) Date: Thu, 20 Sep 2018 19:29:22 +0200 Subject: [Python-Dev] SEC: Spectre variant 2: GCC: -mindirect-branch=thunk -mindirect-branch-register In-Reply-To: References: <395ca1a4-f997-f533-241c-c3d3edc25f7e@python.org> <7f61f443-8599-31dc-58ee-609843f86250@python.org> Message-ID: On Tue, Sep 18, 2018 at 8:38 AM INADA Naoki wrote: > I think this topic should split to two topics: (1) Guard Python > process from Spectre/Meltdown > attack from other process, (2) Prohibit Python code attack other > processes by using > Spectre/Meltdown. (3) Guard Python from performance degradation by overly aggressive Spectre "mitigation". From wes.turner at gmail.com Thu Sep 20 14:08:26 2018 From: wes.turner at gmail.com (Wes Turner) Date: Thu, 20 Sep 2018 14:08:26 -0400 Subject: [Python-Dev] SEC: Spectre variant 2: GCC: -mindirect-branch=thunk -mindirect-branch-register In-Reply-To: References: <395ca1a4-f997-f533-241c-c3d3edc25f7e@python.org> <7f61f443-8599-31dc-58ee-609843f86250@python.org> Message-ID: On Thursday, September 20, 2018, Stefan Ring wrote: > On Tue, Sep 18, 2018 at 8:38 AM INADA Naoki > wrote: > > > I think this topic should split to two topics: (1) Guard Python > > process from Spectre/Meltdown > > attack from other process, (2) Prohibit Python code attack other > > processes by using > > Spectre/Meltdown. > > (3) Guard Python from performance degradation by overly aggressive > Spectre "mitigation". > Spectre has the potential of having a greater impact on cloud providers than Meltdown. 
Whereas Meltdown allows unauthorized applications to read from privileged memory to obtain sensitive data from processes running on the same cloud server, Spectre can allow malicious programs to induce a hypervisor to transmit the data to a guest system running on top of it. - Private SSL certs - Cached keys and passwords in non-zeroed RAM - [...] https://en.wikipedia.org/wiki/Spectre_(security_vulnerability) I really shouldn't need to apologise for bringing this up here. Here's one: https://github.com/Eugnis/spectre-attack/blob/master/Source.c Is this too slow in CPython with: - Coroutines (asyncio (tulip)) - PyPy JIT * - Numba JIT * - C Extensions * - Cython * * Not anyone here's problem. -------------- next part -------------- An HTML attachment was scrubbed... URL: From aixtools at felt.demon.nl Thu Sep 20 13:58:43 2018 From: aixtools at felt.demon.nl (Michael) Date: Thu, 20 Sep 2018 19:58:43 +0200 Subject: [Python-Dev] Nearly - all tests PASS for AIX In-Reply-To: References: <908465e6-d28b-a4ec-d687-1ff3826e7e84@felt.demon.nl> Message-ID: On 20/09/2018 15:18, Nick Coghlan wrote: > That seems like a reasonable approach to me - it will also allow folks > to give the changes a quick skim and provide suggestions for splitting > it up into more easily reviewed PRs. I already have them as individual PR's (8 Open) - https://github.com/python/cpython/pulls?q=is%3Aopen+is%3Apr+author%3Aaixtools+sort%3Aupdated-desc But I'll add the combined one to get it through grinder and see if there are unexpected surprises. Michael > > Cheers, > Nick. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From carl.shapiro at gmail.com Thu Sep 20 14:21:06 2018 From: carl.shapiro at gmail.com (Carl Shapiro) Date: Thu, 20 Sep 2018 11:21:06 -0700 Subject: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement? In-Reply-To: References: <20180916222441.13600498@fsol> <397028a3-8368-4d57-57de-916f0847f90f@python.org> Message-ID: On Wed, Sep 19, 2018 at 12:32 AM, Stefan Behnel wrote: > Also, one thing that would be interesting to find out is whether constant > Python data structures can actually be pre-allocated in the data segment > (and I mean their object structs) . Then things like tuples of strings > (argument lists and what not) could be loaded and the objects quickly > initialised (although, is that even necessary?), rather than having to heap > allocate and create them. Probably something that we should try out in > Cython. > I might not be fully understanding the scope of your question but this patch does allocate constant data structures in the data segment. We could be more aggressive with that but we limit our scope to what is presented to the un-marshaling code. This may be relevant to Cython, as well. -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan_ml at behnel.de Fri Sep 21 01:53:05 2018 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 21 Sep 2018 07:53:05 +0200 Subject: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement? 
In-Reply-To: References: <20180916222441.13600498@fsol> <397028a3-8368-4d57-57de-916f0847f90f@python.org> Message-ID: Carl Shapiro schrieb am 20.09.2018 um 20:21: > On Wed, Sep 19, 2018 at 12:32 AM, Stefan Behnel wrote: > >> Also, one thing that would be interesting to find out is whether constant >> Python data structures can actually be pre-allocated in the data segment >> (and I mean their object structs) . Then things like tuples of strings >> (argument lists and what not) could be loaded and the objects quickly >> initialised (although, is that even necessary?), rather than having to heap >> allocate and create them. Probably something that we should try out in >> Cython. > > I might not be fully understanding the scope of your question but this > patch does allocate constant data structures in the data segment. We could > be more aggressive with that but we limit our scope to what is presented to > the un-marshaling code. Ah, thanks, yes, it works recursively, also for tuples and code objects. Took me a while to figure out how to open the "frozemodules.c" file, but looking at that makes it clear. Yes, that's what I meant. > This may be relevant to Cython, as well. Totally. This might actually be more relevant for Cython than for CPython in the end, because it wouldn't be limited to the stdlib and its core modules. It's a bit more difficult for us, because this probably won't work easily across Python releases (2.[67] and 3.[45678] for now) and also definitely not for PyPy, but that just means some multiplication of the generated code, and we have the dynamic part of it already. Supporting that for Unicode strings will be fun, I'm sure. :) Stefan From stefan_ml at behnel.de Fri Sep 21 02:20:47 2018 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 21 Sep 2018 08:20:47 +0200 Subject: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement? In-Reply-To: References: Message-ID: Larry Hastings schrieb am 14.09.2018 um 23:27: > What the patch does: it takes all the Python modules that are loaded as > part of interpreter startup and deserializes the marshalled .pyc file into > precreated objects stored as static C data. What about the small integers cache? The C serialisation generates several PyLong objects that would normally reside in the cache. Is this handled somewhere? I guess the cache could entirely be loaded from the data segment. And the same would have to be done for interned strings. Basically anything that CPython only wants to have one instance of. That would severely limit the application of this optimisation to external modules, though. I don't see a way how they could load their data structures from the data segment without duplicating all sorts of "singletons". Stefan From leewangzhong+python at gmail.com Fri Sep 21 02:24:52 2018 From: leewangzhong+python at gmail.com (Franklin? Lee) Date: Fri, 21 Sep 2018 02:24:52 -0400 Subject: [Python-Dev] SEC: Spectre variant 2: GCC: -mindirect-branch=thunk -mindirect-branch-register In-Reply-To: References: <395ca1a4-f997-f533-241c-c3d3edc25f7e@python.org> <7f61f443-8599-31dc-58ee-609843f86250@python.org> Message-ID: On Thu, Sep 20, 2018 at 2:10 PM Wes Turner wrote: > > On Thursday, September 20, 2018, Stefan Ring wrote: >> >> On Tue, Sep 18, 2018 at 8:38 AM INADA Naoki wrote: >> >> > I think this topic should split to two topics: (1) Guard Python >> > process from Spectre/Meltdown >> > attack from other process, (2) Prohibit Python code attack other >> > processes by using >> > Spectre/Meltdown. 
>> >> (3) Guard Python from performance degradation by overly aggressive >> Spectre "mitigation". > > > > Spectre has the potential of having a greater impact on cloud providers than Meltdown. Whereas Meltdown allows unauthorized applications to read from privileged memory to obtain sensitive data from processes running on the same cloud server, Spectre can allow malicious programs to induce a hypervisor to transmit the data to a guest system running on top of it. > - Private SSL certs > - Cached keys and passwords in non-zeroed RAM > - [...] > > https://en.wikipedia.org/wiki/Spectre_(security_vulnerability) It's true that the attacks should worry cloud providers. Doesn't that mean that companies like Amazon, Microsoft (Steve), and Docker should have done analyses on CPython's vulnerability to these exploits? Has/should/can anyone officially representing Python contact the companies and ask them? When I followed your quote to find the context, I found it uses, as its source, a Forbes article. The source cited by THAT article is Daniel Gruss, who was one of the researchers. Should someone from the PSF contact the researchers? Steve says he spoke to some of them to judge whether the proposed compiler flags would help, and decided against it. Absent of expert input, here's my non-expert take: That quote requires an OS-level fix. A Python program without the proper permissions can't do such things unless there is a vulnerability with the OS, and it is extremely unlikely for anyone to update Python for Spectre but not update the OS (and they'd be screwed in any case). And even if there is a vulnerability in the OS, maybe the way to exploit it is by using arbitrary Python execution (which you need before you can TRY to use Spectre) on this Python interpreter. You can then write a new binary file and run THAT, and it will be fast enough. That's not something you can fix about CPython. Also, (again with my understanding) the problem of Spectre and Meltdown are that you can escape sandboxes and the like, such as the user/kernel divide, or a software sandbox like that provided by a JavaScript VM. For CPython to be "vulnerable" to these attacks, it needs to have some kind of sandbox or protection to break out of. Instead, we sometimes have sandboxes AROUND CPython (like Jupyter) or WITHIN CPython. I don't see how it makes sense to talk about a sandbox escape FOR CPython (yet). Your original post linked to a discussion about Linux using those build flags. Linux is a kernel, and has such protections that can be bypassed, so it has something to worry about. Malicious code can be native code, which (to my understanding) will be fast enough to exploit the cache miss time. Here's Google's article about the retpoline and why it helps: https://support.google.com/faqs/answer/7625886 As of yet, you have quoted passages that have little relevance to interpreter devs, especially non-JIT interpreters, and you have linked to entire articles for non-experts with little relevance to interpreter devs. This doesn't show that you have any better of an understanding than I have, which is less than the understanding that some of the Python devs have, and much less than what Steve has. In short, it LOOKS like you don't know what you're talking about. If you have a different and deeper understanding of the problem, then you need to show it, and say why there is a problem for CPython specifically. Or find someone who can do that for you. 
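To put a rough number on the "too slow" question: what a Spectre gadget
ultimately needs is a timer fine enough to tell a cached load from an uncached
one, a difference somewhere in the tens to low hundreds of nanoseconds. Here is
a minimal sketch of that measurement primitive in pure Python (illustrative
only, not exploit code, and the helper name is made up):

---
import time

def access_time_ns(buf, index, trials=32):
    # Take the minimum over several timed reads of one element. A real gadget
    # compares this for probe cache lines that were or were not touched
    # speculatively.
    best = None
    for _ in range(trials):
        t0 = time.perf_counter_ns()
        _ = buf[index]
        dt = time.perf_counter_ns() - t0
        best = dt if best is None else min(best, dt)
    return best

probe = bytearray(4096)
print(access_time_ns(probe, 0))
---

Whether a measurement like that can resolve a cache hit from a cache miss
through the interpreter's own per-bytecode overhead is exactly the open
question here.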
> Here's one: > https://github.com/Eugnis/spectre-attack/blob/master/Source.c > > Is this too slow in CPython with: > - Coroutines (asyncio (tulip)) > - PyPy JIT * > - Numba JIT * > - C Extensions * > - Cython * > > * Not anyone here's problem. C extensions are obviously fast enough. I think most of the other starred examples are fast enough, but it's probably more subtle than I think and requires further analysis by their devs. I also think there's something important I'm still missing about what's required and what it can do. I don't see what coroutines have to do with it. Coroutines are still Python code, and they're subject to the GIL. From wes.turner at gmail.com Fri Sep 21 03:11:59 2018 From: wes.turner at gmail.com (Wes Turner) Date: Fri, 21 Sep 2018 03:11:59 -0400 Subject: [Python-Dev] SEC: Spectre variant 2: GCC: -mindirect-branch=thunk -mindirect-branch-register In-Reply-To: References: <395ca1a4-f997-f533-241c-c3d3edc25f7e@python.org> <7f61f443-8599-31dc-58ee-609843f86250@python.org> Message-ID: I feel like you are actively undermining attempts to prevent exploitation of known vulnerabilities because the software in question is currently too slow. For a 4-8% performance penalty, we could just add the CFLAGS to the build now and not worry about it. I give up. On Friday, September 21, 2018, Franklin? Lee wrote: > On Thu, Sep 20, 2018 at 2:10 PM Wes Turner wrote: > > > > On Thursday, September 20, 2018, Stefan Ring > wrote: > >> > >> On Tue, Sep 18, 2018 at 8:38 AM INADA Naoki > wrote: > >> > >> > I think this topic should split to two topics: (1) Guard Python > >> > process from Spectre/Meltdown > >> > attack from other process, (2) Prohibit Python code attack other > >> > processes by using > >> > Spectre/Meltdown. > >> > >> (3) Guard Python from performance degradation by overly aggressive > >> Spectre "mitigation". > > > > > > > Spectre has the potential of having a greater impact on cloud > providers than Meltdown. Whereas Meltdown allows unauthorized applications > to read from privileged memory to obtain sensitive data from processes > running on the same cloud server, Spectre can allow malicious programs to > induce a hypervisor to transmit the data to a guest system running on top > of it. > > - Private SSL certs > > - Cached keys and passwords in non-zeroed RAM > > - [...] > > > > https://en.wikipedia.org/wiki/Spectre_(security_vulnerability) > > It's true that the attacks should worry cloud providers. Doesn't that > mean that companies like Amazon, Microsoft (Steve), and Docker should > have done analyses on CPython's vulnerability to these exploits? > Has/should/can anyone officially representing Python contact the > companies and ask them? > > When I followed your quote to find the context, I found it uses, as > its source, a Forbes article. The source cited by THAT article is > Daniel Gruss, who was one of the researchers. Should someone from the > PSF contact the researchers? Steve says he spoke to some of them to > judge whether the proposed compiler flags would help, and decided > against it. > > Absent of expert input, here's my non-expert take: That quote requires > an OS-level fix. A Python program without the proper permissions can't > do such things unless there is a vulnerability with the OS, and it is > extremely unlikely for anyone to update Python for Spectre but not > update the OS (and they'd be screwed in any case). 
And even if there > is a vulnerability in the OS, maybe the way to exploit it is by using > arbitrary Python execution (which you need before you can TRY to use > Spectre) on this Python interpreter. You can then write a new binary > file and run THAT, and it will be fast enough. That's not something > you can fix about CPython. > > Also, (again with my understanding) the problem of Spectre and > Meltdown are that you can escape sandboxes and the like, such as the > user/kernel divide, or a software sandbox like that provided by a > JavaScript VM. For CPython to be "vulnerable" to these attacks, it > needs to have some kind of sandbox or protection to break out of. > Instead, we sometimes have sandboxes AROUND CPython (like Jupyter) or > WITHIN CPython. I don't see how it makes sense to talk about a sandbox > escape FOR CPython (yet). > > Your original post linked to a discussion about Linux using those > build flags. Linux is a kernel, and has such protections that can be > bypassed, so it has something to worry about. Malicious code can be > native code, which (to my understanding) will be fast enough to > exploit the cache miss time. Here's Google's article about the > retpoline and why it helps: > https://support.google.com/faqs/answer/7625886 > > As of yet, you have quoted passages that have little relevance to > interpreter devs, especially non-JIT interpreters, and you have linked > to entire articles for non-experts with little relevance to > interpreter devs. This doesn't show that you have any better of an > understanding than I have, which is less than the understanding that > some of the Python devs have, and much less than what Steve has. In > short, it LOOKS like you don't know what you're talking about. If you > have a different and deeper understanding of the problem, then you > need to show it, and say why there is a problem for CPython > specifically. Or find someone who can do that for you. > > > Here's one: > > https://github.com/Eugnis/spectre-attack/blob/master/Source.c > > > > Is this too slow in CPython with: > > - Coroutines (asyncio (tulip)) > > - PyPy JIT * > > - Numba JIT * > > - C Extensions * > > - Cython * > > > > * Not anyone here's problem. > > C extensions are obviously fast enough. I think most of the other > starred examples are fast enough, but it's probably more subtle than I > think and requires further analysis by their devs. I also think > there's something important I'm still missing about what's required > and what it can do. > > I don't see what coroutines have to do with it. Coroutines are still > Python code, and they're subject to the GIL. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From leewangzhong+python at gmail.com Fri Sep 21 03:23:03 2018 From: leewangzhong+python at gmail.com (Franklin? Lee) Date: Fri, 21 Sep 2018 03:23:03 -0400 Subject: [Python-Dev] SEC: Spectre variant 2: GCC: -mindirect-branch=thunk -mindirect-branch-register In-Reply-To: References: <395ca1a4-f997-f533-241c-c3d3edc25f7e@python.org> <7f61f443-8599-31dc-58ee-609843f86250@python.org> Message-ID: There is much reason to believe it won't help, and you have given little reason to believe that it will help, or that you even properly understand how the flag mitigates Spectre in programs where it does help. I am not undermining your attempts for the sake of performance. I am trying to help you understand why no one is taking you seriously, and suggesting what you can do so that they will. 
One way is to explain why an unsandboxed interpreter needs to protect against a sandbox escape. On Sep 21, 2018 3:12 AM, "Wes Turner" wrote: I feel like you are actively undermining attempts to prevent exploitation of known vulnerabilities because the software in question is currently too slow. For a 4-8% performance penalty, we could just add the CFLAGS to the build now and not worry about it. I give up. On Friday, September 21, 2018, Franklin? Lee wrote: > On Thu, Sep 20, 2018 at 2:10 PM Wes Turner wrote: > > > > On Thursday, September 20, 2018, Stefan Ring > wrote: > >> > >> On Tue, Sep 18, 2018 at 8:38 AM INADA Naoki > wrote: > >> > >> > I think this topic should split to two topics: (1) Guard Python > >> > process from Spectre/Meltdown > >> > attack from other process, (2) Prohibit Python code attack other > >> > processes by using > >> > Spectre/Meltdown. > >> > >> (3) Guard Python from performance degradation by overly aggressive > >> Spectre "mitigation". > > > > > > > Spectre has the potential of having a greater impact on cloud > providers than Meltdown. Whereas Meltdown allows unauthorized applications > to read from privileged memory to obtain sensitive data from processes > running on the same cloud server, Spectre can allow malicious programs to > induce a hypervisor to transmit the data to a guest system running on top > of it. > > - Private SSL certs > > - Cached keys and passwords in non-zeroed RAM > > - [...] > > > > https://en.wikipedia.org/wiki/Spectre_(security_vulnerability) > > It's true that the attacks should worry cloud providers. Doesn't that > mean that companies like Amazon, Microsoft (Steve), and Docker should > have done analyses on CPython's vulnerability to these exploits? > Has/should/can anyone officially representing Python contact the > companies and ask them? > > When I followed your quote to find the context, I found it uses, as > its source, a Forbes article. The source cited by THAT article is > Daniel Gruss, who was one of the researchers. Should someone from the > PSF contact the researchers? Steve says he spoke to some of them to > judge whether the proposed compiler flags would help, and decided > against it. > > Absent of expert input, here's my non-expert take: That quote requires > an OS-level fix. A Python program without the proper permissions can't > do such things unless there is a vulnerability with the OS, and it is > extremely unlikely for anyone to update Python for Spectre but not > update the OS (and they'd be screwed in any case). And even if there > is a vulnerability in the OS, maybe the way to exploit it is by using > arbitrary Python execution (which you need before you can TRY to use > Spectre) on this Python interpreter. You can then write a new binary > file and run THAT, and it will be fast enough. That's not something > you can fix about CPython. > > Also, (again with my understanding) the problem of Spectre and > Meltdown are that you can escape sandboxes and the like, such as the > user/kernel divide, or a software sandbox like that provided by a > JavaScript VM. For CPython to be "vulnerable" to these attacks, it > needs to have some kind of sandbox or protection to break out of. > Instead, we sometimes have sandboxes AROUND CPython (like Jupyter) or > WITHIN CPython. I don't see how it makes sense to talk about a sandbox > escape FOR CPython (yet). > > Your original post linked to a discussion about Linux using those > build flags. 
Linux is a kernel, and has such protections that can be > bypassed, so it has something to worry about. Malicious code can be > native code, which (to my understanding) will be fast enough to > exploit the cache miss time. Here's Google's article about the > retpoline and why it helps: > https://support.google.com/faqs/answer/7625886 > > As of yet, you have quoted passages that have little relevance to > interpreter devs, especially non-JIT interpreters, and you have linked > to entire articles for non-experts with little relevance to > interpreter devs. This doesn't show that you have any better of an > understanding than I have, which is less than the understanding that > some of the Python devs have, and much less than what Steve has. In > short, it LOOKS like you don't know what you're talking about. If you > have a different and deeper understanding of the problem, then you > need to show it, and say why there is a problem for CPython > specifically. Or find someone who can do that for you. > > > Here's one: > > https://github.com/Eugnis/spectre-attack/blob/master/Source.c > > > > Is this too slow in CPython with: > > - Coroutines (asyncio (tulip)) > > - PyPy JIT * > > - Numba JIT * > > - C Extensions * > > - Cython * > > > > * Not anyone here's problem. > > C extensions are obviously fast enough. I think most of the other > starred examples are fast enough, but it's probably more subtle than I > think and requires further analysis by their devs. I also think > there's something important I'm still missing about what's required > and what it can do. > > I don't see what coroutines have to do with it. Coroutines are still > Python code, and they're subject to the GIL. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From christian at python.org Fri Sep 21 05:37:25 2018 From: christian at python.org (Christian Heimes) Date: Fri, 21 Sep 2018 11:37:25 +0200 Subject: [Python-Dev] 3.7.1 and 3.6.7 Releases Coming Soon In-Reply-To: References: <3ABAB3B5-6346-49F2-98F7-185303166016@python.org> Message-ID: On 19/09/2018 23.12, Ned Deily wrote: > Update: not surprisingly, there have been a number of issues that have popped up during and since the sprint that we would like to ensure are addressed in 3.7.1 and 3.6.7. In order to do so, I've been holding off on starting the releases. I think we are now getting close to having the important ones resolved so I'm going to plan on cutting off code for 3.7.1rc1 and 3.6.7rc1 by the end of 2018-09-20 (23:59 AoE). That's roughly 38 hours from now. > > Thanks for all of your help in improving Python for everyone! Hi Ned, I'm really sorry, but would it be possible to delay the RCs until Sunday or Monday AoE? Some of the XML security fixes, OpenSSL 1.1.1 fixes (TLS 1.3 post-handshake authentication), and SSL module regression haven't landed yet. I'm confident that I can land most to all fixes during the weekend. Related PRs are: * https://github.com/python/cpython/pull/9468 * https://github.com/python/cpython/pull/9460 * https://github.com/python/cpython/pull/9217 * https://github.com/python/cpython/pull/9265 I'm also still collaborating with Sebastian Pipping (libexpat maintainer) on the DoS mitigations (CVE-2013-0340). My initial patch had some flaws. I might be able to get expat release 2.3.0 in time, too. 
https://github.com/libexpat/libexpat/pull/220 Christian From nad at python.org Fri Sep 21 09:08:52 2018 From: nad at python.org (Ned Deily) Date: Fri, 21 Sep 2018 09:08:52 -0400 Subject: [Python-Dev] 3.7.1 and 3.6.7 Releases Coming Soon In-Reply-To: References: <3ABAB3B5-6346-49F2-98F7-185303166016@python.org> Message-ID: <8629D385-F3BF-417C-A12E-4FC3FA4F4F12@python.org> On Sep 21, 2018, at 05:37, Christian Heimes wrote: > On 19/09/2018 23.12, Ned Deily wrote: >> Update: not surprisingly, there have been a number of issues that have popped up during and since the sprint that we would like to ensure are addressed in 3.7.1 and 3.6.7. In order to do so, I've been holding off on starting the releases. I think we are now getting close to having the important ones resolved so I'm going to plan on cutting off code for 3.7.1rc1 and 3.6.7rc1 by the end of 2018-09-20 (23:59 AoE). That's roughly 38 hours from now. > I'm really sorry, but would it be possible to delay the RCs until Sunday > or Monday AoE? > > Some of the XML security fixes, OpenSSL 1.1.1 fixes (TLS 1.3 > post-handshake authentication), and SSL module regression haven't landed > yet. I'm confident that I can land most to all fixes during the weekend. > > Related PRs are: > > * https://github.com/python/cpython/pull/9468 > * https://github.com/python/cpython/pull/9460 > * https://github.com/python/cpython/pull/9217 > * https://github.com/python/cpython/pull/9265 > > I'm also still collaborating with Sebastian Pipping (libexpat > maintainer) on the DoS mitigations (CVE-2013-0340). My initial patch had > some flaws. I might be able to get expat release 2.3.0 in time, too. > > https://github.com/libexpat/libexpat/pull/220 I agree that it would be good to get the security-related and OpenSSL-related fixes in sooner than later and there has been a lot going on recently. Since you have asked so nicely, I have rescheduled the cutoffs for 3.7.1rc1 and 3.6.7rc1 to be by the end of 2018-09-24 (23:59 AoE) and the final releases now on 2018-10-04. Everyone else: here are a few more days to get important things in to these releases. -- Ned Deily nad at python.org -- [] From guido at python.org Fri Sep 21 10:26:26 2018 From: guido at python.org (Guido van Rossum) Date: Fri, 21 Sep 2018 07:26:26 -0700 Subject: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement? In-Reply-To: References: Message-ID: > What about the small integers cache? I believe the small integers cache is only used to reduce the number of objects -- I don't think there's any code (in CPython itself) that just *assumes* that because an int is small it must be in the cache. So it should be fine. On Thu, Sep 20, 2018 at 11:23 PM Stefan Behnel wrote: > Larry Hastings schrieb am 14.09.2018 um 23:27: > > What the patch does: it takes all the Python modules that are loaded as > > part of interpreter startup and deserializes the marshalled .pyc file > into > > precreated objects stored as static C data. > > What about the small integers cache? The C serialisation generates several > PyLong objects that would normally reside in the cache. Is this handled > somewhere? I guess the cache could entirely be loaded from the data > segment. And the same would have to be done for interned strings. Basically > anything that CPython only wants to have one instance of. > > That would severely limit the application of this optimisation to external > modules, though. 
I don't see a way how they could load their data > structures from the data segment without duplicating all sorts of > "singletons". > > Stefan > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From christian at python.org Fri Sep 21 11:13:27 2018 From: christian at python.org (Christian Heimes) Date: Fri, 21 Sep 2018 17:13:27 +0200 Subject: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement? In-Reply-To: References: Message-ID: On 21/09/2018 16.26, Guido van Rossum wrote: >> What about the small integers cache? > > I believe the small integers cache is only used to reduce the number of > objects -- I don't think there's any code (in CPython itself) that just > *assumes* that because an int is small it must be in the cache. So it > should be fine. Some places may assume that PyLong_FromLong() for a small int never fails. I certainly expect this in coverity scan modeling. Christian From yselivanov.ml at gmail.com Fri Sep 21 11:15:18 2018 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Fri, 21 Sep 2018 11:15:18 -0400 Subject: [Python-Dev] Late Python 3.7.1 changes to fix the C locale coercion (PEP 538) implementation In-Reply-To: References: <942002A1-4319-4370-BC07-A628BB1B8FC1@python.org> Message-ID: On Wed, Sep 19, 2018 at 4:26 PM Ned Deily wrote: > On Sep 19, 2018, at 13:30, Yury Selivanov wrote: [..] > > Currently it's designed to expose "PyContext*" and "PyContextVar*" > > pointers. I want to change that to "PyObject*" as using non-PyObject > > pointers turned out to be a very bad idea (interfacing with Cython is > > particularly challenging). > > > > Is it a good idea to change this in Python 3.7.1? > > It's hard to make an informed decision without a concrete PR to review. What would be the impact on any user code that has already adopted it in 3.7.0? Ned, I've created an issue to track this: https://bugs.python.org/issue34762 Yury From jeethu at jeethurao.com Fri Sep 21 03:19:18 2018 From: jeethu at jeethurao.com (Jeethu Rao) Date: Fri, 21 Sep 2018 08:19:18 +0100 Subject: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement? In-Reply-To: References: <20180916222441.13600498@fsol> <397028a3-8368-4d57-57de-916f0847f90f@python.org> Message-ID: > On Sep 21, 2018, at 06:53, Stefan Behnel wrote: > > Totally. This might actually be more relevant for Cython than for CPython > in the end, because it wouldn't be limited to the stdlib and its core modules. > > It's a bit more difficult for us, because this probably won't work easily > across Python releases (2.[67] and 3.[45678] for now) and also definitely > not for PyPy, but that just means some multiplication of the generated > code, and we have the dynamic part of it already. Supporting that for > Unicode strings will be fun, I'm sure. :) I?m glad to hear that this might be relevant to Cython. I believe it should be straightforward to parametrize the code generator to generate code targeted at specific cPython versions. While we originally targeted 3.6, Larry Hastings managed to quickly port it to 3.8. The two changes in cPython?s data structures between 3.6 and 3.8 that needed changes to the code-gen were [1] from 3.7 and [2] from 3.8. 
And internally, I?ve still got a task open to back-port this to support 2.7. -- Jeethu [1]: https://bugs.python.org/issue18896 [2]: https://bugs.python.org/issue33597 From status at bugs.python.org Fri Sep 21 12:10:04 2018 From: status at bugs.python.org (Python tracker) Date: Fri, 21 Sep 2018 18:10:04 +0200 (CEST) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20180921161004.5F0F257BC7@psf.upfronthosting.co.za> ACTIVITY SUMMARY (2018-09-14 - 2018-09-21) Python tracker at https://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message. Issues counts and deltas: open 6795 (-24) closed 39723 (+111) total 46518 (+87) Open issues with patches: 2711 Issues opened (63) ================== #19756: test_nntplib: sporadic failures, network isses? server down? https://bugs.python.org/issue19756 reopened by berker.peksag #34682: Typo reports on docs@ https://bugs.python.org/issue34682 opened by zach.ware #34683: Caret positioned wrong for SyntaxError reported by ast.c https://bugs.python.org/issue34683 opened by gvanrossum #34686: Add `-r`, as opposed to `-R` to Python core interpreter https://bugs.python.org/issue34686 opened by lepaperwan #34687: asyncio: is it time to make ProactorEventLoop as the default e https://bugs.python.org/issue34687 opened by vstinner #34688: Segfault in pandas that works fine on 3.7 https://bugs.python.org/issue34688 opened by xtreak #34689: Lib/sysconfig.py expands non-variables https://bugs.python.org/issue34689 opened by lepaperwan #34690: Store startup modules as C structures for 20%+ startup speed i https://bugs.python.org/issue34690 opened by larry #34691: _contextvars missing in x64 master branch Windows build? https://bugs.python.org/issue34691 opened by tim.peters #34696: PyByteArray_FromObject() has undocumented (and untested) behav https://bugs.python.org/issue34696 opened by ZackerySpytz #34697: ctypes: Crash if manually-created CField instance is used https://bugs.python.org/issue34697 opened by izbyshev #34698: urllib.request.Request.set_proxy doesn't (necessarily) replace https://bugs.python.org/issue34698 opened by tburke #34699: allow path-like objects in program arguments in Windows https://bugs.python.org/issue34699 opened by guoci #34700: typing.get_type_hints doesn't know about typeshed https://bugs.python.org/issue34700 opened by Ben.Darnell #34701: Asyncio documentation for recursive coroutines is lacking https://bugs.python.org/issue34701 opened by azaria.zornberg #34702: urlopen doesn't handle query strings with "file" scheme https://bugs.python.org/issue34702 opened by liZe #34704: Do not access ob_type directly, introduce Py_TP https://bugs.python.org/issue34704 opened by nascheme #34705: Python 3.8 changes how returns through finally clauses are tra https://bugs.python.org/issue34705 opened by nedbat #34706: Signature.from_callable sometimes drops subclassing https://bugs.python.org/issue34706 opened by bukzor #34707: Python not reentrant https://bugs.python.org/issue34707 opened by skaller #34708: Odd crashes/freezes when sys.stdout.shell.console is typed https://bugs.python.org/issue34708 opened by Harrison Chudleigh #34709: Suggestion: make getuser.getpass() also look at SUDO_USER envi https://bugs.python.org/issue34709 opened by Amos S #34711: return ENOTDIR when open() accepts filenames with a trailing s https://bugs.python.org/issue34711 opened by Michael.Felt #34713: csvwriter.writerow()'s return type is undocumented https://bugs.python.org/issue34713 
opened by nchammas #34714: timeout in test_multiprocessing_spawn x86 Windows7 3.x buildbo https://bugs.python.org/issue34714 opened by pablogsal #34716: MagicMock.__divmod__ should return a pair https://bugs.python.org/issue34716 opened by serhiy.storchaka #34720: Fix test_importlib.test_bad_traverse for AIX https://bugs.python.org/issue34720 opened by Michael.Felt #34722: Non-deterministic bytecode generation https://bugs.python.org/issue34722 opened by Peter Ebden #34724: argparse subparser help indent too short https://bugs.python.org/issue34724 opened by TakingItCasual #34725: Py_GetProgramFullPath() odd behaviour in Windows https://bugs.python.org/issue34725 opened by mariofutire #34726: Add support of checked hash-based pycs in zipimport https://bugs.python.org/issue34726 opened by serhiy.storchaka #34728: deprecate *loop* argument for asyncio.sleep https://bugs.python.org/issue34728 opened by yselivanov #34730: aclose() doesn't stop raise StopAsyncIteration / GeneratorExit https://bugs.python.org/issue34730 opened by dfee #34732: uuid returns version more than 5 https://bugs.python.org/issue34732 opened by jophine pranjal #34734: Azure linux buildbot failure https://bugs.python.org/issue34734 opened by xtreak #34736: Confusing base64.b64decode output https://bugs.python.org/issue34736 opened by mark.dickinson #34737: Python upgrade with SYSTEM account uninstalls python https://bugs.python.org/issue34737 opened by JatinGoel #34738: Distutils: ZIP files don't include directory entries https://bugs.python.org/issue34738 opened by serhiy.storchaka #34739: Get rid of tp_getattro in xml.etree.ElementTree.XMLParser https://bugs.python.org/issue34739 opened by serhiy.storchaka #34740: Get rid of tp_getattro in ossaudiodev.oss_audio_device https://bugs.python.org/issue34740 opened by serhiy.storchaka #34741: Get rid of tp_getattro and tp_setattro in pyexpat.xmlparser https://bugs.python.org/issue34741 opened by serhiy.storchaka #34742: Add optional argument for exit status in argparse.ArgumentPars https://bugs.python.org/issue34742 opened by Ankit Goel #34744: New %(flag)s format specifier for argparse.add_argument help s https://bugs.python.org/issue34744 opened by helmsman helmsman #34745: asyncio ssl memory leak https://bugs.python.org/issue34745 opened by thehesiod #34747: SSLSocket.context cannot be changed on non-connected sockets https://bugs.python.org/issue34747 opened by vincent-nexedi #34748: Incorrect HTML link in functools.partial https://bugs.python.org/issue34748 opened by xitop #34749: improve performance of binascii.a2b_base64() https://bugs.python.org/issue34749 opened by sir-sigurd #34750: locals().update doesn't work in Enum body, even though direct https://bugs.python.org/issue34750 opened by Antony.Lee #34751: Hash collisions for tuples https://bugs.python.org/issue34751 opened by jdemeyer #34752: warnings.warn fails silently with unicode input https://bugs.python.org/issue34752 opened by nparslow #34753: Use coroutine object or coroutine function instead of coroutin https://bugs.python.org/issue34753 opened by Windson Yang #34756: Few changes in sys.breakpointhook() https://bugs.python.org/issue34756 opened by serhiy.storchaka #34757: Placeholder for discussion on Combined patches for AIX - to re https://bugs.python.org/issue34757 opened by Michael.Felt #34758: http.server module sets incorrect mimetype for WebAssembly fil https://bugs.python.org/issue34758 opened by travisoneill #34759: Possible regression in ssl module in 3.7.1 and master 
https://bugs.python.org/issue34759 opened by njs #34760: Regression in abc in combination with passing a function to is https://bugs.python.org/issue34760 opened by glaubich #34761: str(super()) != super().__str__() https://bugs.python.org/issue34761 opened by Guillaume Dominici #34762: Change contextvars C API to use PyObject https://bugs.python.org/issue34762 opened by yselivanov #34763: Python lacks 0x4E17 https://bugs.python.org/issue34763 opened by ????????? #34764: Improve documentation example for using iter() with sentinel v https://bugs.python.org/issue34764 opened by ChrisRands #34765: Update install-sh https://bugs.python.org/issue34765 opened by cstratak #34766: BaseProxy cache should be cleaned when Manager client is recon https://bugs.python.org/issue34766 opened by wynfred #34767: Optimize asyncio.Lock https://bugs.python.org/issue34767 opened by yselivanov Most recent 15 issues with no replies (15) ========================================== #34767: Optimize asyncio.Lock https://bugs.python.org/issue34767 #34766: BaseProxy cache should be cleaned when Manager client is recon https://bugs.python.org/issue34766 #34764: Improve documentation example for using iter() with sentinel v https://bugs.python.org/issue34764 #34763: Python lacks 0x4E17 https://bugs.python.org/issue34763 #34758: http.server module sets incorrect mimetype for WebAssembly fil https://bugs.python.org/issue34758 #34757: Placeholder for discussion on Combined patches for AIX - to re https://bugs.python.org/issue34757 #34752: warnings.warn fails silently with unicode input https://bugs.python.org/issue34752 #34749: improve performance of binascii.a2b_base64() https://bugs.python.org/issue34749 #34748: Incorrect HTML link in functools.partial https://bugs.python.org/issue34748 #34747: SSLSocket.context cannot be changed on non-connected sockets https://bugs.python.org/issue34747 #34741: Get rid of tp_getattro and tp_setattro in pyexpat.xmlparser https://bugs.python.org/issue34741 #34740: Get rid of tp_getattro in ossaudiodev.oss_audio_device https://bugs.python.org/issue34740 #34720: Fix test_importlib.test_bad_traverse for AIX https://bugs.python.org/issue34720 #34716: MagicMock.__divmod__ should return a pair https://bugs.python.org/issue34716 #34714: timeout in test_multiprocessing_spawn x86 Windows7 3.x buildbo https://bugs.python.org/issue34714 Most recent 15 issues waiting for review (15) ============================================= #34763: Python lacks 0x4E17 https://bugs.python.org/issue34763 #34762: Change contextvars C API to use PyObject https://bugs.python.org/issue34762 #34759: Possible regression in ssl module in 3.7.1 and master https://bugs.python.org/issue34759 #34758: http.server module sets incorrect mimetype for WebAssembly fil https://bugs.python.org/issue34758 #34757: Placeholder for discussion on Combined patches for AIX - to re https://bugs.python.org/issue34757 #34756: Few changes in sys.breakpointhook() https://bugs.python.org/issue34756 #34751: Hash collisions for tuples https://bugs.python.org/issue34751 #34749: improve performance of binascii.a2b_base64() https://bugs.python.org/issue34749 #34744: New %(flag)s format specifier for argparse.add_argument help s https://bugs.python.org/issue34744 #34741: Get rid of tp_getattro and tp_setattro in pyexpat.xmlparser https://bugs.python.org/issue34741 #34740: Get rid of tp_getattro in ossaudiodev.oss_audio_device https://bugs.python.org/issue34740 #34739: Get rid of tp_getattro in xml.etree.ElementTree.XMLParser 
https://bugs.python.org/issue34739 #34738: Distutils: ZIP files don't include directory entries https://bugs.python.org/issue34738 #34732: uuid returns version more than 5 https://bugs.python.org/issue34732 #34728: deprecate *loop* argument for asyncio.sleep https://bugs.python.org/issue34728 Top 10 most discussed issues (10) ================================= #34589: Py_Initialize() and Py_Main() should not enable C locale coerc https://bugs.python.org/issue34589 21 msgs #34751: Hash collisions for tuples https://bugs.python.org/issue34751 14 msgs #19756: test_nntplib: sporadic failures, network isses? server down? https://bugs.python.org/issue19756 13 msgs #17239: XML vulnerabilities in Python https://bugs.python.org/issue17239 10 msgs #34759: Possible regression in ssl module in 3.7.1 and master https://bugs.python.org/issue34759 10 msgs #34732: uuid returns version more than 5 https://bugs.python.org/issue34732 9 msgs #34623: _elementtree.c doesn't call XML_SetHashSalt() https://bugs.python.org/issue34623 7 msgs #34686: Add `-r`, as opposed to `-R` to Python core interpreter https://bugs.python.org/issue34686 7 msgs #31635: test_strptime failure on OpenBSD https://bugs.python.org/issue31635 6 msgs #32557: allow shutil.disk_usage to take a file path on Windows also https://bugs.python.org/issue32557 6 msgs Issues closed (105) =================== #8450: httplib: false BadStatusLine() raised https://bugs.python.org/issue8450 closed by miss-islington #9148: os.execve puts process to background on windows https://bugs.python.org/issue9148 closed by eryksun #15713: PEP 3121, 384 Refactoring applied to zipimport module https://bugs.python.org/issue15713 closed by serhiy.storchaka #18174: Make regrtest with --huntrleaks check for fd leaks https://bugs.python.org/issue18174 closed by vstinner #18295: Possible integer overflow in PyCode_New() https://bugs.python.org/issue18295 closed by vstinner #19431: Document PyFrame_FastToLocals() and PyFrame_FastToLocalsWithEr https://bugs.python.org/issue19431 closed by vstinner #20414: Python 3.4 has two Overlapped types https://bugs.python.org/issue20414 closed by vstinner #21702: asyncio: remote_addr of create_datagram_endpoint() is not docu https://bugs.python.org/issue21702 closed by vstinner #23236: asyncio: add timeout to StreamReader read methods https://bugs.python.org/issue23236 closed by vstinner #23295: [Windows] asyncio: add UDP support to ProactorEventLoop https://bugs.python.org/issue23295 closed by vstinner #23327: zipimport to import from non-ascii pathname on Windows https://bugs.python.org/issue23327 closed by serhiy.storchaka #25750: tp_descr_get(self, obj, type) is called without owning a refer https://bugs.python.org/issue25750 closed by jdemeyer #26119: Windows Installer can sometimes silently fail pip stage https://bugs.python.org/issue26119 closed by steve.dower #26568: Add a new warnings.showwarnmsg() function taking a warnings.Wa https://bugs.python.org/issue26568 closed by vstinner #28955: Not matched behavior of numeric comparison with the documentat https://bugs.python.org/issue28955 closed by benjamin.peterson #29306: Check usage of Py_EnterRecursiveCall() and Py_LeaveRecursiveCa https://bugs.python.org/issue29306 closed by vstinner #29419: Argument Clinic: inline PyArg_UnpackTuple and PyArg_ParseStack https://bugs.python.org/issue29419 closed by vstinner #29451: Use _PyArg_Parser for _PyArg_ParseStack(): support positional https://bugs.python.org/issue29451 closed by vstinner #29465: Modify _PyObject_FastCall() to reduce 
stack consumption https://bugs.python.org/issue29465 closed by vstinner #29471: AST: add an attribute to FunctionDef to distinguish functions https://bugs.python.org/issue29471 closed by vstinner #29674: Use GCC __attribute__((alloc_size(x, y))) on PyMem_Malloc() fu https://bugs.python.org/issue29674 closed by vstinner #29881: Add a new private API clear private variables, which are initi https://bugs.python.org/issue29881 closed by vstinner #30170: "tests may fail, unable to create temporary directory" warning https://bugs.python.org/issue30170 closed by vstinner #30198: distutils build_ext: don't run newer_group() in parallel in mu https://bugs.python.org/issue30198 closed by vstinner #30227: test_site must not write outside the build directory: must not https://bugs.python.org/issue30227 closed by vstinner #30244: Emit a ResourceWarning in concurrent.futures executor destruct https://bugs.python.org/issue30244 closed by vstinner #30317: test_timeout() of test_multiprocessing_spawn.WithManagerTestBa https://bugs.python.org/issue30317 closed by vstinner #30318: test_distutils is too verbose on Windows https://bugs.python.org/issue30318 closed by vstinner #30816: test_open() of test_eintr timeout after 10 min on "x86-64 El C https://bugs.python.org/issue30816 closed by vstinner #30884: regrtest -jN --timeout=TIMEOUT should kill child process runni https://bugs.python.org/issue30884 closed by vstinner #31321: traceback.clear_frames() doesn't clear *all* frames https://bugs.python.org/issue31321 closed by vstinner #31392: Upgrade installers to OpenSSL 1.1.0g and 1.0.2n https://bugs.python.org/issue31392 closed by ned.deily #31531: crash and SystemError in case of a bad zipimport._zip_director https://bugs.python.org/issue31531 closed by serhiy.storchaka #31687: test_semaphore_tracker() of test_multiprocessing_spawn fails r https://bugs.python.org/issue31687 closed by vstinner #31958: UUID versions are not validated to lie in the documented range https://bugs.python.org/issue31958 closed by berker.peksag #32075: Expose ZipImporter Type Object in the include header files. 
https://bugs.python.org/issue32075 closed by serhiy.storchaka #32093: macOS: implement time.thread_time() using thread_info() https://bugs.python.org/issue32093 closed by vstinner #32128: test_nntplib: test_article_head_body() fails in SSL mode https://bugs.python.org/issue32128 closed by vstinner #32183: Coverity: CID 1423264: Insecure data handling (TAINTED_SCALA https://bugs.python.org/issue32183 closed by vstinner #32245: OSError: raw write() returned invalid length on latest Win 10 https://bugs.python.org/issue32245 closed by eryksun #32455: PyCompile_OpcodeStackEffect() and dis.stack_effect() are not p https://bugs.python.org/issue32455 closed by serhiy.storchaka #33070: Add platform triplet for RISC-V https://bugs.python.org/issue33070 closed by benjamin.peterson #33216: [3.5] Wrong order of stack for CALL_FUNCTION_VAR and CALL_FUNC https://bugs.python.org/issue33216 closed by serhiy.storchaka #33486: regen autotools related files https://bugs.python.org/issue33486 closed by petr.viktorin #33531: test_asyncio: test_subprocess test_stdin_broken_pipe() failure https://bugs.python.org/issue33531 closed by vstinner #33613: test_multiprocessing_fork: test_semaphore_tracker_sigint() fai https://bugs.python.org/issue33613 closed by vstinner #33649: asyncio docs overhaul https://bugs.python.org/issue33649 closed by yselivanov #33676: test_multiprocessing_fork: dangling threads warning https://bugs.python.org/issue33676 closed by vstinner #33680: regrtest: re-run failed tests in a subprocess https://bugs.python.org/issue33680 closed by vstinner #33683: asyncio: sendfile tests ignore SO_SNDBUF on Windows https://bugs.python.org/issue33683 closed by vstinner #33686: test_concurrent_futures: test_pending_calls_race() failed on x https://bugs.python.org/issue33686 closed by vstinner #33717: Enhance test.pythoninfo: meta-ticket for multiple changes https://bugs.python.org/issue33717 closed by vstinner #33718: Enhance regrtest: meta-ticket for multiple changes https://bugs.python.org/issue33718 closed by vstinner #33721: os.path.exists() ought to return False if pathname contains NU https://bugs.python.org/issue33721 closed by serhiy.storchaka #33868: test__xxsubinterpreters: test_subinterpreter() fails randomly https://bugs.python.org/issue33868 closed by vstinner #33920: test_asyncio: test_run_coroutine_threadsafe_with_timeout() fai https://bugs.python.org/issue33920 closed by vstinner #33966: test_multiprocessing_spawn.WithProcessesTestPool.test_tracebac https://bugs.python.org/issue33966 closed by vstinner #34011: Default preference not given to venv DLL's https://bugs.python.org/issue34011 closed by steve.dower #34125: Profiling depends on whether **kwargs is given https://bugs.python.org/issue34125 closed by jdemeyer #34131: test_threading: BarrierTests.test_default_timeout() failed on https://bugs.python.org/issue34131 closed by vstinner #34150: test_multiprocessing_spawn: Dangling processes leaked on AMD64 https://bugs.python.org/issue34150 closed by vstinner #34247: PYTHONOPTIMIZE ignored in 3.7.0 when using custom launcher https://bugs.python.org/issue34247 closed by ncoghlan #34285: regrtest: in case of test failure, add "always look on the bri https://bugs.python.org/issue34285 closed by vstinner #34341: Appending to ZIP archive blows up existing Central Directory e https://bugs.python.org/issue34341 closed by serhiy.storchaka #34354: Memory leak on _testCongestion https://bugs.python.org/issue34354 closed by petr.viktorin #34363: dataclasses.asdict() mishandles dataclass instance 
attributes https://bugs.python.org/issue34363 closed by eric.smith #34382: test_os.test_mode fails when directory base directory has g+s https://bugs.python.org/issue34382 closed by Michael.Felt #34458: No way to alternate options https://bugs.python.org/issue34458 closed by paul.j3 #34479: ArgumentParser subparser error display at the wrong level https://bugs.python.org/issue34479 closed by paul.j3 #34515: lib2to3: support non-ASCII identifiers https://bugs.python.org/issue34515 closed by benjamin.peterson #34560: Backport of uuid1() failure fix https://bugs.python.org/issue34560 closed by Riccardo Mottola #34585: Don't use AC_RUN_IFELSE to determine float endian https://bugs.python.org/issue34585 closed by benjamin.peterson #34587: test_socket: testCongestion() hangs on my Fedora 28 https://bugs.python.org/issue34587 closed by vstinner #34595: PyUnicode_FromFormat(): add %T format for an object type name https://bugs.python.org/issue34595 closed by vstinner #34603: ctypes on Windows: error calling C function that returns a str https://bugs.python.org/issue34603 closed by vstinner #34639: PYTHONCOERCECLOCALE is ignored when using -E or -I option https://bugs.python.org/issue34639 closed by vstinner #34651: Disallow fork in a subinterpreter. https://bugs.python.org/issue34651 closed by eric.snow #34656: memory exhaustion in Modules/_pickle.c:1393 https://bugs.python.org/issue34656 closed by benjamin.peterson #34663: Support POSIX_SPAWN_USEVFORK flag in posix_spawn https://bugs.python.org/issue34663 closed by pablogsal #34673: make the eval loop more editable https://bugs.python.org/issue34673 closed by benjamin.peterson #34681: Incorrect class name Pattern in sre_parse.py https://bugs.python.org/issue34681 closed by serhiy.storchaka #34684: Generate _geoslib.c with Cython 0.28.2 for Python 3.7 transiti https://bugs.python.org/issue34684 closed by zach.ware #34685: scheduler tests for posix_spawn fail on AMD64 FreeBSD 10.x Sha https://bugs.python.org/issue34685 closed by pablogsal #34692: ok https://bugs.python.org/issue34692 closed by serhiy.storchaka #34693: PEPping distutils/core.py https://bugs.python.org/issue34693 closed by berker.peksag #34694: Dismiss To Avoid Slave/Master wording cause it easier for non https://bugs.python.org/issue34694 closed by Mariatta #34695: sqlite3: Cache.get() crashes if Cache.__init__() was not calle https://bugs.python.org/issue34695 closed by berker.peksag #34703: Unexpected Arithmetic Result https://bugs.python.org/issue34703 closed by mark.dickinson #34710: SSL Module build fails with more pedantic compiler flags https://bugs.python.org/issue34710 closed by christian.heimes #34712: Style fixes in examples of "Input and Output" tutorial section https://bugs.python.org/issue34712 closed by ned.deily #34715: timemodule.c fails to compile on BSD https://bugs.python.org/issue34715 closed by benjamin.peterson #34717: docs: disable numbered sections for stdlib in html https://bugs.python.org/issue34717 closed by yselivanov #34718: Syntax error on factorial example https://bugs.python.org/issue34718 closed by magmax #34719: Deprecate set to frozenset conversion in set.__contains__ https://bugs.python.org/issue34719 closed by rhettinger #34721: json module loads function https://bugs.python.org/issue34721 closed by ammar2 #34723: lower() on Turkish letter "??" 
returns a 2-chars-long string https://bugs.python.org/issue34723 closed by vstinner #34727: Windows/2.7.15 IOError [Errno 0] when user interacts with cmd https://bugs.python.org/issue34727 closed by eryksun #34729: bz2/lzma: Compressor/decompressor crash if __init__ is not cal https://bugs.python.org/issue34729 closed by izbyshev #34731: pathlib path.match misshandles multiple `**` https://bugs.python.org/issue34731 closed by Ronny.Pfannschmidt #34733: Missing search bar on docs.python.org https://bugs.python.org/issue34733 closed by yselivanov #34735: Modules/timemodule.c: Memory leak in time_strftime() https://bugs.python.org/issue34735 closed by serhiy.storchaka #34743: test_database_source_name fails with SQLite 3.7.9 https://bugs.python.org/issue34743 closed by berker.peksag #34746: Asyncio documentation have a error https://bugs.python.org/issue34746 closed by yselivanov #34754: test_flush_return_value fails on FreeBSD https://bugs.python.org/issue34754 closed by berker.peksag #34755: Few minor optimizations in _asynciomodule.c https://bugs.python.org/issue34755 closed by serhiy.storchaka From guido at python.org Fri Sep 21 13:35:45 2018 From: guido at python.org (Guido van Rossum) Date: Fri, 21 Sep 2018 10:35:45 -0700 Subject: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement? In-Reply-To: References: Message-ID: On Fri, Sep 21, 2018 at 8:16 AM Christian Heimes wrote: > On 21/09/2018 16.26, Guido van Rossum wrote: > >> What about the small integers cache? > > > > I believe the small integers cache is only used to reduce the number of > > objects -- I don't think there's any code (in CPython itself) that just > > *assumes* that because an int is small it must be in the cache. So it > > should be fine. > > Some places may assume that PyLong_FromLong() for a small int never > fails. I certainly expect this in coverity scan modeling. > Ah, that goes in the other direction. That function will always return a value from the cache if it's in range for the cache. and nothing change there. I was talking about situations where code might assume that if an object's address is not that of the canonical cached zero-valued PyLong object, it couldn't be a PyLong with value zero (same for other values in range of the cache). I'd be very surprised if there was code assuming that, and I'd say it was always wrong. (It's like beginners' code using 'x is 0' instead of 'x == 0'.) Though now I start worrying about interned strings. That's a concept that's a little closer to being a feature. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From e2qb2a44f at prolan-power.hu Fri Sep 21 14:02:32 2018 From: e2qb2a44f at prolan-power.hu (e2qb2a44f at prolan-power.hu) Date: Fri, 21 Sep 2018 20:02:32 +0200 Subject: [Python-Dev] switch statement Message-ID: <87a6ff86-01f8-0355-8a8a-0ce0e64b5ccb@prolan-power.hu> Hi, A humble proposal for a switch-like statement syntax for Python: - - - switch blah in (100, 2, 30, 'bumm'): dosomething1() x = 88 case blah in (44, 55): otherstuff(9) case blah in (8): boo() else: wawa() - - - So, let's use, and allow only *tuples*. As early as possible, build a jump table, based on (foreknown) small integer values. As in other languages. Strings may have to be hashed (in "compile time"), to obtain small integer value. Some secondary checking may have to be done for exact content equality. (Alternative: do no allow strings at all.) 
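(For comparison, the baseline such a statement has to beat is the dict-of-callables dispatch people write by hand today. The sketch below is only that existing idiom, not part of the proposal; dosomething1, otherstuff, boo and wawa are the placeholder handlers from the example above, and the "x = 88" assignment is left out to keep it short.)
- - -
# Hand-written dispatch table: built once, one dict lookup per call.
def dosomething1():
    print("case 100 / 2 / 30 / 'bumm'")

def otherstuff(n):
    print("case 44 / 55 with", n)

def boo():
    print("case 8")

def wawa():
    print("default case")

_dispatch = {}
for key in (100, 2, 30, 'bumm'):
    _dispatch[key] = dosomething1
for key in (44, 55):
    _dispatch[key] = lambda: otherstuff(9)
_dispatch[8] = boo

def switch(blah):
    _dispatch.get(blah, wawa)()

switch(30)      # runs dosomething1()
switch(55)      # runs otherstuff(9)
switch('zzz')   # falls back to wawa()
- - -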
For gaps in the integer range: maybe apply some very basic dividing/shifting to "compact" the range. (As compilers optimize in other languages, I guess -- but I may be totally wrong.) (For example, find "unused bits" in the numbers (in base-2 representation). newnum = orignum >> 3 & 6 | orignum & ~6. newnum is smaller (roughly 1/8) than orignum.) The (achievable) goal is to be faster than hash table lookup. (A hash table with keys 100, 2, 30, 'bumm' etc.) And approach the speed of a single array-index lookup. (Or even faster in some cases as there will be just jumps instead of calls?) (I am not an "expert"!) Allow fallthrough or not? - To be decided. (Either is compatible with the above.) I know about PEP 3103 and https://docs.python.org/3.8/faq/design.html?highlight=switch#why-isn-t-there-a-switch-or-case-statement-in-python (I do not know how to comment on a PEP, where to discuss a PEP. If this is an inappropriate place, please forward it.) --
From carl.shapiro at gmail.com Fri Sep 21 14:22:08 2018 From: carl.shapiro at gmail.com (Carl Shapiro) Date: Fri, 21 Sep 2018 11:22:08 -0700 Subject: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement? In-Reply-To: <20180918220003.ulk4ypmxtbym6s4v@python.ca> References: <20180916222441.13600498@fsol> <397028a3-8368-4d57-57de-916f0847f90f@python.org> <20180918220003.ulk4ypmxtbym6s4v@python.ca> Message-ID: On Tue, Sep 18, 2018 at 3:00 PM, Neil Schemenauer wrote: > The users of Python are pretty diverse so it depends on who you ask. > Some would like a giant executable that includes everything they > need (sort of like the Go approach). Other people want an executable > that has just importlib inside it and then mix-and-match different > shared libs for their different purposes. Some will want to work > "old school" and load from separate .py or .pyc files. > > I see no reason why we can't support all these options. > Supporting those options is possible if some of our simplifying assumptions are revisited. Here are a few. We know about all the objects being stored in the data segment. That makes it easy to ensure that immutable objects are unique. Knowing anything less, that work would have to be done at load-time. We do not have to worry about the relocation cost of the pointers we add to the data segment. We are compiled into an executable that typically gets loaded at a fixed address. This could become a performance concern if we wrote our data into a shared library. Because we are compiled into the runtime, we do not have versioning issues. There is no possibility of PyObject_HEAD or any PyObject subclass being changed out from under us. The existing marshal format abstracts away from these details but our format is very sensitive to the host environment. All of these problems have technical solutions. They should be evaluated carefully to ensure that the added overhead does not wipe out the performance wins or add lots of complexity to the runtime.
-------------- next part -------------- An HTML attachment was scrubbed... URL:
From stefan_ml at behnel.de Fri Sep 21 14:36:37 2018 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 21 Sep 2018 20:36:37 +0200 Subject: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement? In-Reply-To: References: Message-ID: Guido van Rossum schrieb am 21.09.2018 um 19:35: > Though now I start worrying about interned strings. That's a concept that's > a little closer to being a feature. True.
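A tiny illustration of the kind of identity assumption at stake -- this is just today's interning behaviour (the module name used here is an arbitrary placeholder), nothing specific to the patch:

import sys

# Identifier-like string constants are interned by the compiler, but
# strings built at runtime are not, even though they compare equal.
runtime_name = "".join(["startup", "_module"])   # built at runtime, not interned
interned_name = sys.intern("startup_module")     # canonical interned instance

print(runtime_name == interned_name)              # True  -- equal values
print(runtime_name is interned_name)              # False -- distinct objects
print(sys.intern(runtime_name) is interned_name)  # True  -- intern() maps equal strings to one object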
While there's the general '"ab"+"cd" is (not) "abcd"' caveat, I'm sure quite a bit of code out there assumes that parsed identifiers in a module, such as the names of functions and classes, are interned, since this was often communicated. And in fact, explicitly interning the same name might return a different string object with this change than what's in the module/class dict. Stefan From mike at selik.org Fri Sep 21 14:40:10 2018 From: mike at selik.org (Michael Selik) Date: Fri, 21 Sep 2018 11:40:10 -0700 Subject: [Python-Dev] switch statement In-Reply-To: <87a6ff86-01f8-0355-8a8a-0ce0e64b5ccb@prolan-power.hu> References: <87a6ff86-01f8-0355-8a8a-0ce0e64b5ccb@prolan-power.hu> Message-ID: First, this sounds like it belongs on python-ideas, not python-dev. Second, when you do send a message to python-ideas, it'll help to accompany it with a realistic example usage that motivates your proposal. On Fri, Sep 21, 2018 at 11:18 AM wrote: > > Hi, > > A humble proposal for a switch-like statement syntax for Python: > > - - - > switch blah in (100, 2, 30, 'bumm'): > dosomething1() > x = 88 > case blah in (44, 55): > otherstuff(9) > case blah in (8): > boo() > else: > wawa() > - - - > > So, let's use, and allow only *tuples*. > As early as possible, build a jump table, based on (foreknown) small integer values. As in other languages. > Strings may have to be hashed (in "compile time"), to obtain small integer value. Some secondary checking may > have to be done for exact content equality. (Alternative: do no allow strings at all.) > For gaps in the integer range: maybe apply some very basic dividing/shifting to "compact" the range. (As > compilers optimize in other languages, I guess -- but I may be totally wrong.) (For example find "unused bits" > in the numbers (in 2-base representation). newnum = orignum >> 3 & 6 | orignum & ~6. newnum is smaller (roughly > 1/8) than orignum.) > The (achievable) goal is to be faster than hash table lookup. (A hash table with keys 100, 2, 30, 'bumm' etc.) > And approach the speed of single array-index lookup. (Or even faster in some cases as there will be just jumps > instead of calls?) > (I am not an "expert"!) > > Let allow fallthrough or not? - To be decided. (Either is compatible with the above.) > > > I know about PEP 3103 and > https://docs.python.org/3.8/faq/design.html?highlight=switch#why-isn-t-there-a-switch-or-case-statement-in-python > > (I do not know how to comment on a PEP, where to discuss a PEP. If this is inappropriate place, please forward it.) > > -- > > > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/mike%40selik.org From jcgoble3 at gmail.com Fri Sep 21 14:53:03 2018 From: jcgoble3 at gmail.com (Jonathan Goble) Date: Fri, 21 Sep 2018 14:53:03 -0400 Subject: [Python-Dev] switch statement In-Reply-To: <87a6ff86-01f8-0355-8a8a-0ce0e64b5ccb@prolan-power.hu> References: <87a6ff86-01f8-0355-8a8a-0ce0e64b5ccb@prolan-power.hu> Message-ID: As Michael said, this belongs on python-ideas, but it's interesting. I'd support it, though my input in that regard is worth approximately $0.00. ;) It's the core devs and especially the eventual BDFL replacement whom you would have to convince. 
Without getting into an extended discussion here on python-dev, I'd like to offer one brief comment to try to help improve the proposal: On Fri, Sep 21, 2018 at 2:17 PM wrote: > Let allow fallthrough or not? - To be decided. (Either is compatible with > the above.) > I would argue for non-fallthrough as the default with support for explicitly requesting fallthrough when desired by using "continue". I don't know of any prior art for doing it that way, but it makes sense to me. It eliminates hidden bugs from missing a "break" while still allowing it when needed. Good luck with the proposal! -------------- next part -------------- An HTML attachment was scrubbed... URL: From carl.shapiro at gmail.com Fri Sep 21 15:01:15 2018 From: carl.shapiro at gmail.com (Carl Shapiro) Date: Fri, 21 Sep 2018 12:01:15 -0700 Subject: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement? In-Reply-To: References: Message-ID: On Thu, Sep 20, 2018 at 11:20 PM, Stefan Behnel wrote: > What about the small integers cache? The C serialisation generates several > PyLong objects that would normally reside in the cache. Is this handled > somewhere? I guess the cache could entirely be loaded from the data > segment. And the same would have to be done for interned strings. Basically > anything that CPython only wants to have one instance of. > Un-marshaled immutable objects are tracked in a table to ensure their uniqueness. Thanks for mentioning the small integer cache. It is not part of the change, but it could be brought under this framework. By doing so, we could store the small integer objects instances in the data segment and other data segment objects could reference those unique small integer instances. That would severely limit the application of this optimisation to external > modules, though. I don't see a way how they could load their data > structures from the data segment without duplicating all sorts of > "singletons". Yes, additional load-time work would have to be done to ensure the uniqueness of those objects. -------------- next part -------------- An HTML attachment was scrubbed... URL: From breamoreboy at gmail.com Fri Sep 21 16:20:10 2018 From: breamoreboy at gmail.com (Mark Lawrence) Date: Fri, 21 Sep 2018 21:20:10 +0100 Subject: [Python-Dev] SEC: Spectre variant 2: GCC: -mindirect-branch=thunk -mindirect-branch-register In-Reply-To: References: <395ca1a4-f997-f533-241c-c3d3edc25f7e@python.org> <7f61f443-8599-31dc-58ee-609843f86250@python.org> Message-ID: On 21/09/18 08:11, Wes Turner wrote: > I feel like you are actively undermining attempts to prevent > exploitation of known vulnerabilities because the software in question > is currently too slow. > Did you write "How to Win Friends And Influence People"? -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence From guido at python.org Fri Sep 21 17:10:00 2018 From: guido at python.org (Guido van Rossum) Date: Fri, 21 Sep 2018 14:10:00 -0700 Subject: [Python-Dev] switch statement In-Reply-To: <87a6ff86-01f8-0355-8a8a-0ce0e64b5ccb@prolan-power.hu> References: <87a6ff86-01f8-0355-8a8a-0ce0e64b5ccb@prolan-power.hu> Message-ID: There's already a rejected PEP about a switch statement: https://www.python.org/dev/peps/pep-3103/. There's no point bringing this up again unless you have a new use case. There have been several promising posts to python-ideas about the much more powerful idea of a "match" statement. 
Please search for those before re-posting on python-ideas. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From levkivskyi at gmail.com Fri Sep 21 17:22:22 2018 From: levkivskyi at gmail.com (Ivan Levkivskyi) Date: Fri, 21 Sep 2018 22:22:22 +0100 Subject: [Python-Dev] switch statement In-Reply-To: References: <87a6ff86-01f8-0355-8a8a-0ce0e64b5ccb@prolan-power.hu> Message-ID: > There have been several promising posts to python-ideas about the much more powerful idea of a "match" statement I actually even started working on a PEP about this (pattern matching), but then decided to postpone it because it is unlikely that anything of this size can be discussed/accepted in current situation. We can return back to the idea when decision-making model will clarify. -- Ivan On Fri, 21 Sep 2018 at 22:12, Guido van Rossum wrote: > There's already a rejected PEP about a switch statement: > https://www.python.org/dev/peps/pep-3103/. There's no point bringing this > up again unless you have a new use case. > > There have been several promising posts to python-ideas about the much > more powerful idea of a "match" statement. Please search for those before > re-posting on python-ideas. > > -- > --Guido van Rossum (python.org/~guido) > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/levkivskyi%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ericsnowcurrently at gmail.com Fri Sep 21 19:01:45 2018 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Fri, 21 Sep 2018 17:01:45 -0600 Subject: [Python-Dev] Questions about signal handling. Message-ID: Hi all, I've got a pretty good sense of how signal handling works in the runtime (i.e. via a dance with the eval loop), but still have some questions: 1. Why do we restrict calls to signal.signal() to the main thread? 2. Why must signal handlers run in the main thread? 3. Why does signal handling operate via the "pending calls" machinery and not distinctly? More details are below. My interest in the topic relates to improving in-process interpreter isolation. #1 & #2 ----------- Toward the top of signalmodule.c we find the following comment [1] (written in 1994): /* NOTES ON THE INTERACTION BETWEEN SIGNALS AND THREADS When threads are supported, we want the following semantics: - only the main thread can set a signal handler - any thread can get a signal handler - signals are only delivered to the main thread I.e. we don't support "synchronous signals" like SIGFPE (catching this doesn't make much sense in Python anyway) nor do we support signals as a means of inter-thread communication, since not all thread implementations support that (at least our thread library doesn't). We still have the problem that in some implementations signals generated by the keyboard (e.g. SIGINT) are delivered to all threads (e.g. SGI), while in others (e.g. Solaris) such signals are delivered to one random thread (an intermediate possibility would be to deliver it to the main thread -- POSIX?). For now, we have a working implementation that works in all three cases -- the handler ignores signals if getpid() isn't the same as in the main thread. XXX This is a hack. 
*/ At the very top of the file we see another relevant comment: /* XXX Signals should be recorded per thread, now we have thread state. */ That one was written in 1997, right after PyThreadState was introduced. So is the constraint about the main thread just a historical artifact? If not, what would be an appropriate explanation for why signals must be strictly bound to the main thread? #3 ----- Regarding the use of Py_MakePendingCalls() for signal handling, I can imagine the history there. However, is there any reason signal handling couldn't be separated from the "pending calls" machinery at this point? As far as I can tell there is no longer any strong relationship between the two. -eric [1] https://github.com/python/cpython/blob/master/Modules/signalmodule.c#L71 From vstinner at redhat.com Fri Sep 21 19:10:59 2018 From: vstinner at redhat.com (Victor Stinner) Date: Sat, 22 Sep 2018 01:10:59 +0200 Subject: [Python-Dev] Questions about signal handling. In-Reply-To: References: Message-ID: Le sam. 22 sept. 2018 ? 01:05, Eric Snow a ?crit : > 3. Why does signal handling operate via the "pending calls" machinery > and not distinctly? Signals can be received anytime, between two instructions at the machine code level. But the Python code base is rarely reentrant. Moreover, you can get the signal while you don't hold the GIL :-) Victor From wes.turner at gmail.com Fri Sep 21 22:49:50 2018 From: wes.turner at gmail.com (Wes Turner) Date: Fri, 21 Sep 2018 22:49:50 -0400 Subject: [Python-Dev] SEC: Spectre variant 2: GCC: -mindirect-branch=thunk -mindirect-branch-register In-Reply-To: References: <395ca1a4-f997-f533-241c-c3d3edc25f7e@python.org> <7f61f443-8599-31dc-58ee-609843f86250@python.org> Message-ID: On Friday, September 21, 2018, Mark Lawrence wrote: > On 21/09/18 08:11, Wes Turner wrote: > >> I feel like you are actively undermining attempts to prevent exploitation >> of known vulnerabilities because the software in question is currently too >> slow. >> >> > Did you write "How to Win Friends And Influence People"? "Andrew Carnegie - Rags to Riches Power to Peace" ... Vredespaleis "Think and Grow Rich" by Napoleon Hill Today is International Day of Peace. But, now, I've gone way OT, So, we'll just have to hope IBRS is enabled, On all of the Jupyter cloud hosts. > > -- > My fellow Pythonistas, ask not what our language can do for you, ask > what you can do for our language. > > Mark Lawrence > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/wes. > turner%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at drhagen.com Sat Sep 22 08:18:40 2018 From: david at drhagen.com (David Hagen) Date: Sat, 22 Sep 2018 08:18:40 -0400 Subject: [Python-Dev] Postponed annotations break inspection of dataclasses Message-ID: The new postponed annotations have an unexpected interaction with dataclasses. Namely, you cannot get the type hints of any of the data classes methods. For example, I have some code that inspects the type parameters of a class's `__init__` method. (The real use case is to provide a default serializer for the class, but that is not important here.) 
``` from dataclasses import dataclass from typing import get_type_hints class Foo: pass @dataclass class Bar: foo: Foo print(get_type_hints(Bar.__init__)) ``` In Python 3.6 and 3.7, this does what is expected; it prints `{'foo': , 'return': }`. However, if in Python 3.7, I add `from __future__ import annotations`, then this fails with an error: ``` NameError: name 'Foo' is not defined ``` I know why this is happening. The `__init__` method is defined in the `dataclasses` module which does not have the `Foo` object in its environment, and the `Foo` annotation is being passed to `dataclass` and attached to `__init__` as the string `"Foo"` rather than as the original object `Foo`, but `get_type_hints` for the new annotations only does a name lookup in the module where `__init__` is defined not where the annotation is defined. I know that the use of lambdas to implement PEP 563 was rejected for performance reasons. I could be wrong, but I think this was motivated by variable annotations because the lambda would have to be constructed each time the function body ran. I was wondering if I could motivate storing the annotations as lambdas in class bodies and function signatures, in which the environment is already being captured and is code that usually only runs once. -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Sat Sep 22 12:41:22 2018 From: guido at python.org (Guido van Rossum) Date: Sat, 22 Sep 2018 09:41:22 -0700 Subject: [Python-Dev] Postponed annotations break inspection of dataclasses In-Reply-To: References: Message-ID: This is a good catch -- thanks for bringing it up. I'm adding Eric Smith (author of dataclasses) and Ivan Levkivskyi (co-author of typing) as well as ?ukasz Langa (author of PEP 563) to the thread to see if they have further insights. Personally I don't think it's feasible to change PEP 563 to use lambdas (if it were even advisable, which would be a long discussion), but I do think we might be able to make small improvements to the dataclasses and/or typing modules to make sure your use case works. Probably a bugs.python.org issue is a better place to dive into the details than python-dev. Thanks again, --Guido (top-poster in chief) On Sat, Sep 22, 2018 at 8:32 AM David Hagen wrote: > The new postponed annotations have an unexpected interaction with > dataclasses. Namely, you cannot get the type hints of any of the data > classes methods. > > For example, I have some code that inspects the type parameters of a > class's `__init__` method. (The real use case is to provide a default > serializer for the class, but that is not important here.) > > ``` > from dataclasses import dataclass > from typing import get_type_hints > > class Foo: > pass > > @dataclass > class Bar: > foo: Foo > > print(get_type_hints(Bar.__init__)) > ``` > > In Python 3.6 and 3.7, this does what is expected; it prints `{'foo': > , 'return': }`. > > However, if in Python 3.7, I add `from __future__ import annotations`, > then this fails with an error: > > ``` > NameError: name 'Foo' is not defined > ``` > > I know why this is happening. 
The `__init__` method is defined in the > `dataclasses` module which does not have the `Foo` object in its > environment, and the `Foo` annotation is being passed to `dataclass` and > attached to `__init__` as the string `"Foo"` rather than as the original > object `Foo`, but `get_type_hints` for the new annotations only does a name > lookup in the module where `__init__` is defined not where the annotation > is defined. > > I know that the use of lambdas to implement PEP 563 was rejected for > performance reasons. I could be wrong, but I think this was motivated by > variable annotations because the lambda would have to be constructed each > time the function body ran. I was wondering if I could motivate storing the > annotations as lambdas in class bodies and function signatures, in which > the environment is already being captured and is code that usually only > runs once. > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From eric at trueblade.com Sat Sep 22 14:29:21 2018 From: eric at trueblade.com (Eric V. Smith) Date: Sat, 22 Sep 2018 14:29:21 -0400 Subject: [Python-Dev] Postponed annotations break inspection of dataclasses In-Reply-To: References: Message-ID: On 9/22/2018 12:41 PM, Guido van Rossum wrote: > This is a good catch -- thanks for bringing it up. I'm adding Eric Smith > (author of dataclasses) and Ivan Levkivskyi (co-author of typing) as > well as ?ukasz Langa (author of PEP 563) to the thread to see if they > have further insights. I don't see Ivan and ?ukasz cc'd, so I'm adding them here. > Personally I don't think it's feasible to change PEP 563 to use lambdas > (if it were even advisable, which would be a long discussion), but I do > think we might be able to make small improvements to the dataclasses > and/or typing modules to make sure your use case works. > > Probably a bugs.python.org issue is a better > place to dive into the details than python-dev. Agreed that opening a bug would be good. And then I'll ruin that suggestion by answering here, too: I think this problem is endemic to get_type_hints(). I've never understood how you're supposed to use the globals and locals arguments to it, but this works: print(get_type_hints(Bar.__init__, globals())) as does: print(get_type_hints(Bar.__init__, Bar.__module__)) But that seems like you'd have to know a lot about how a class were declared in order to call get_type_hints on it. I'm not sure __module__ is always correct (but again, I haven't really thought about it). The docs for get_type_hints() says: "In addition, forward references encoded as string literals are handled by evaluating them in globals and locals namespaces." Every once in a while someone will bring up the idea of delayed evaluation, and the answer is always "use a lambda". If we ever wanted to do something more with delayed evaluation, this is a good use case for it. Eric > > Thanks again, > > --Guido (top-poster in chief) > > On Sat, Sep 22, 2018 at 8:32 AM David Hagen > wrote: > > The new postponed annotations have an unexpected interaction with > dataclasses. Namely, you cannot get the type hints of any of the > data classes methods. > > For example, I have some code that inspects the type parameters of a > class's `__init__` method. 
(The real use case is to provide a > default serializer for the class, but that is not important here.) > > ``` > from dataclasses import dataclass > from typing import get_type_hints > > class Foo: > ? ? pass > > @dataclass > class Bar: > ? ? foo: Foo > > print(get_type_hints(Bar.__init__)) > ``` > > In Python 3.6 and 3.7, this does what is expected; it prints > `{'foo': , 'return': }`. > > However, if in Python 3.7, I add `from __future__ import > annotations`, then this fails with an error: > > ``` > NameError: name 'Foo' is not defined > ``` > > I know why this is happening. The `__init__` method is defined in > the `dataclasses` module which does not have the `Foo` object in its > environment, and the `Foo` annotation is being passed to `dataclass` > and attached to `__init__` as the string `"Foo"` rather than as the > original object `Foo`, but `get_type_hints` for the new annotations > only does a name lookup in the module where `__init__` is defined > not where the annotation is defined. > > I know that the use of lambdas to implement PEP 563 was rejected for > performance reasons. I could be wrong, but I think this was > motivated by variable annotations because the lambda would have to > be constructed each time the function body ran. I was wondering if I > could motivate storing the annotations as lambdas in?class bodies > and function signatures, in which the environment is already being > captured and is code that usually only runs once. > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/guido%40python.org > > > > -- > --Guido van Rossum (python.org/~guido ) > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/eric%2Ba-python-dev%40trueblade.com > From guido at python.org Sat Sep 22 15:09:22 2018 From: guido at python.org (Guido van Rossum) Date: Sat, 22 Sep 2018 12:09:22 -0700 Subject: [Python-Dev] Postponed annotations break inspection of dataclasses In-Reply-To: References: Message-ID: On Sat, Sep 22, 2018 at 11:29 AM Eric V. Smith wrote: > I think this problem is endemic to get_type_hints(). I've never > understood how you're supposed to use the globals and locals arguments > to it, but this works: > > print(get_type_hints(Bar.__init__, globals())) > > as does: > > print(get_type_hints(Bar.__init__, Bar.__module__)) > > But that seems like you'd have to know a lot about how a class were > declared in order to call get_type_hints on it. I'm not sure __module__ > is always correct (but again, I haven't really thought about it). > Still, I wonder if there's a tweak possible of the globals and locals used when exec()'ing the function definitions in dataclasses.py, so that get_type_hints() gets the right globals for this use case. It's really tough to be at the intersection of three PEPs... -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From yselivanov.ml at gmail.com Sat Sep 22 16:38:22 2018 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Sat, 22 Sep 2018 16:38:22 -0400 Subject: [Python-Dev] Postponed annotations break inspection of dataclasses In-Reply-To: References: Message-ID: On Sat, Sep 22, 2018 at 3:11 PM Guido van Rossum wrote: [..] 
> Still, I wonder if there's a tweak possible of the globals and locals used when exec()'ing the function definitions in dataclasses.py, so that get_type_hints() gets the right globals for this use case. > > It's really tough to be at the intersection of three PEPs... If it's possible to fix exec() to accept any Mapping (not just dicts), then we can create a proxy mapping for "Dataclass.__init__.__module__" module and everything would work as expected. Here's a very hack-ish fix we can use in meanwhile (even in 3.7.1?): https://gist.github.com/1st1/37fdd3cc84cd65b9af3471b935b722df Yury From david at drhagen.com Sun Sep 23 06:40:29 2018 From: david at drhagen.com (David Hagen) Date: Sun, 23 Sep 2018 06:40:29 -0400 Subject: [Python-Dev] Postponed annotations break inspection of dataclasses In-Reply-To: References: Message-ID: On Sat, Sep 22, 2018 at 3:11 PM Guido van Rossum wrote: > Still, I wonder if there's a tweak possible of the globals and locals used when exec()'ing the function definitions in dataclasses.py, so that get_type_hints() gets the right globals for this use case. On Sat, Sep 22, 2018 at 4:38 PM Yury Selivanov wrote: > If it's possible to fix exec() to accept any Mapping (not just dicts), > then we can create a proxy mapping for "Dataclass.__init__.__module__" > module and everything would work as expected. Another possible solution is that `__annotations__` are *permitted* to be lambdas, but not created as such by default, and `get_type_hints` is aware of this. Operations that are known to break type hints (such as when data classes copy the type hints from the class body to the `__init__` method) could covert the type hint from a string to a lambda before moving it. It does not even have to be a lambda; an instance of some `TypeHint` class, which stores the string and pointers to the original `globals` and `locals` (or just the original object whose `globals` and `locals` are relevant), but that's essentially what a lambda is anyway. -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at drhagen.com Sun Sep 23 06:48:38 2018 From: david at drhagen.com (David Hagen) Date: Sun, 23 Sep 2018 06:48:38 -0400 Subject: [Python-Dev] Postponed annotations break inspection of dataclasses In-Reply-To: References: Message-ID: On Sat, Sep 22, 2018 at 12:41 PM Guido van Rossum wrote: > Probably a bugs.python.org issue is a better place to dive into the details than python-dev. Issue tracker issue created: https://bugs.python.org/issue34776 -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralshammrei2 at gmail.com Sun Sep 23 13:59:19 2018 From: ralshammrei2 at gmail.com (R Alshammrei) Date: Sun, 23 Sep 2018 20:59:19 +0300 Subject: [Python-Dev] (no subject) Message-ID: ------------------------------ ?? 8 ?????? ? 2017 ?? ???? ?????? 12:30 ??? ????? ????? ? ????? Masayuki YAMAMOTO < ma3yuki.8mamo10 at gmail.com > ???: > *????? ?? ???? * > > *I ????? PEP 539 ??????? ??????? ?????????. ???? ????? ??????? * > *?????????! * ???? ?? ???? ????????? ??? ????? ???? ??? ??? PEP! ???? ????????? ???? ? ???? ?????? ?? ???? ?? BDFL-Delegate ??? ???????? ????? ???? :) ?? ??????? ???. - ??? ?????? ncoghlan ?? gmail.com | ???????? ???????? ------------------------------ - Previous message (by thread): [Python-Dev] PEP 539 v3: A C new API for Thread-Local Storage in CPython - ??????? ??????? (?????? ???? ???????): [Python-Dev] PEP 539 v3: ????? ????? ??????? C ????? ??????? ?????? ?????? ?? CPython - *?? ??? ??????? 
???: *[date] [thread] [subject] [author] ------------------------------ ???? ?? ????????? ??? ??????? ???????? ?? Python-Dev -------------- next part -------------- An HTML attachment was scrubbed... URL: From jackiekazil at gmail.com Sun Sep 23 19:29:56 2018 From: jackiekazil at gmail.com (Jacqueline Kazil) Date: Sun, 23 Sep 2018 19:29:56 -0400 Subject: [Python-Dev] Official citation for Python In-Reply-To: References: <233a848c-c8a6-fda5-1c83-6cbf281b28eb@klix.ch> <23449.57520.710702.426310@turnbull.sk.tsukuba.ac.jp> Message-ID: I wanted to send an update. At the NumFocus Summit, I found out about... This: https://www.force11.org/group/software-citation-working-group & this: https://github.com/adrn/CitationPEP I am going to work on a citation approach based off of those two sources and come back with a more developed proposal. -Jackie On Mon, Sep 17, 2018 at 1:12 PM MRAB wrote: > On 2018-09-17 05:05, Jeremy Hylton wrote: > > > > I wanted to start with an easy answer that is surely unsatisfying: > > > http://blog.apastyle.org/apastyle/2015/01/how-to-cite-software-in-apa-style.html > > > > APA style is pretty popular, and it says that standard software doesn't > > need to be specified. Standard software includes "Microsoft Word, Java, > > and Adobe Photoshop." So I'd say Python fits well in that category, and > > doesn't need to be cited. > > > > I said you wouldn't be satisfied... > > > It goes on to say """Note: We don?t keep a comprehensive list of what > programs are ?standard.? You make the call.""". > > [snip] > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/jackiekazil%40gmail.com > -- Jacqueline Kazil | @jackiekazil -------------- next part -------------- An HTML attachment was scrubbed... URL: From raymond.hettinger at gmail.com Mon Sep 24 02:23:47 2018 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Sun, 23 Sep 2018 23:23:47 -0700 Subject: [Python-Dev] Postponed annotations break inspection of dataclasses In-Reply-To: References: Message-ID: > On Sep 22, 2018, at 1:38 PM, Yury Selivanov wrote: > > On Sat, Sep 22, 2018 at 3:11 PM Guido van Rossum wrote: > [..] >> Still, I wonder if there's a tweak possible of the globals and locals used when exec()'ing the function definitions in dataclasses.py, so that get_type_hints() gets the right globals for this use case. >> >> It's really tough to be at the intersection of three PEPs... > > If it's possible to fix exec() to accept any Mapping (not just dicts), > then we can create a proxy mapping for "Dataclass.__init__.__module__" > module and everything would work as expected FWIW, the locals() dict for exec() already accepts any mapping (not just dicts): >>> class M: def __getitem__(self, key): return key.upper() def __setitem__(self, key, value): print(f'{key!r}: {value!r}') >>> exec('a=b', globals(), M()) 'a': 'B' Raymond From ncoghlan at gmail.com Mon Sep 24 08:18:13 2018 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 24 Sep 2018 22:18:13 +1000 Subject: [Python-Dev] Questions about signal handling. In-Reply-To: References: Message-ID: On Sat, 22 Sep 2018 at 09:14, Victor Stinner wrote: > > Le sam. 22 sept. 2018 ? 01:05, Eric Snow a ?crit : > > 3. Why does signal handling operate via the "pending calls" machinery > > and not distinctly? > > Signals can be received anytime, between two instructions at the > machine code level. 
But the Python code base is rarely reentrant. > Moreover, you can get the signal while you don't hold the GIL :-) This would actually be the main reason to keep the current behaviour: at least some folks are running their applications in subthreads as a workaround to avoid https://bugs.python.org/issue29988 and the general fact that Ctrl-C handling and deterministic resource cleanup generally don't get along overly well. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ericsnowcurrently at gmail.com Mon Sep 24 12:51:33 2018 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Mon, 24 Sep 2018 10:51:33 -0600 Subject: [Python-Dev] Questions about signal handling. In-Reply-To: References: Message-ID: On Fri, Sep 21, 2018 at 5:11 PM Victor Stinner wrote: > Le sam. 22 sept. 2018 ? 01:05, Eric Snow a ?crit : > > 3. Why does signal handling operate via the "pending calls" machinery > > and not distinctly? > > Signals can be received anytime, between two instructions at the > machine code level. But the Python code base is rarely reentrant. > Moreover, you can get the signal while you don't hold the GIL :-) Sorry, I wasn't clear. I'm not suggesting that signals should be handled outside the interpreter. Instead, why do we call PyErr_CheckSignals() in Py_MakePendingCalls() rather than distinctly, right before we call Py_MakePendingCalls()? -eric From leewangzhong+python at gmail.com Mon Sep 24 12:52:35 2018 From: leewangzhong+python at gmail.com (Franklin? Lee) Date: Mon, 24 Sep 2018 12:52:35 -0400 Subject: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement? In-Reply-To: <58cdf8c3-1b23-326c-2244-835f16397b51@hastings.org> References: <20180914215424.hjxvq5l7m66schas@python.ca> <58cdf8c3-1b23-326c-2244-835f16397b51@hastings.org> Message-ID: On Fri, Sep 14, 2018 at 6:08 PM Larry Hastings wrote: > I can suggest that, based on conversation from Carl, that adding the stat calls back in costs you half the startup. So any mechanism where we're talking to the disk _at all_ simply isn't going to be as fast. Is that cost for when the stat calls are done in parallel with the new loading mechanism? From yselivanov.ml at gmail.com Mon Sep 24 13:14:32 2018 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Mon, 24 Sep 2018 13:14:32 -0400 Subject: [Python-Dev] Questions about signal handling. In-Reply-To: References: Message-ID: On Fri, Sep 21, 2018 at 7:04 PM Eric Snow wrote: > > Hi all, > > I've got a pretty good sense of how signal handling works in the > runtime (i.e. via a dance with the eval loop), but still have some > questions: > > 1. Why do we restrict calls to signal.signal() to the main thread? > 2. Why must signal handlers run in the main thread? > 3. Why does signal handling operate via the "pending calls" machinery > and not distinctly? Here's my take on this: Handling signals in a multi-threaded program is hard. Some signals can be delivered to an arbitrary thread, some to the one that caused them. Posix provides lots of mechanisms to tune how signals are received (or blocked) by individual threads, but (a) Python doesn't expose those APIs, (b) using those APIs correctly is insanely hard. By restricting that we can only receive signals in the main thread we remove all that complexity. Restricting that signal.signal() can only be called from the main thread just makes this API more consistent (and also IIRC avoids weird sigaction() behaviour when it is called from different threads within one program). 
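The main-thread restriction is easy to observe from pure Python. A minimal sketch (the exact wording of the ValueError varies between CPython versions, but the behaviour is the same):

```python
import signal
import threading

def handler(signum, frame):
    print("got signal", signum)

def try_from_worker():
    try:
        signal.signal(signal.SIGINT, handler)
    except ValueError as exc:
        # CPython only lets the main thread install signal handlers.
        print("worker thread:", exc)

t = threading.Thread(target=try_from_worker)
t.start()
t.join()

# The same call succeeds in the main thread.
signal.signal(signal.SIGINT, handler)
print("main thread: handler installed")
```
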
Next, you can only call reentrant functions in your signal handlers. For instance, printf() function isn't safe to use. Therefore one common practice is to set a flag that a signal was received and check it later (exactly what we do with the pending calls machinery). Therefore, IMO, the current way we handle signals in Python is the safest, most predictable, and most cross-platform option there is. And changing how Python signals API works with threads in any way will actually break the world. Yury From eryksun at gmail.com Mon Sep 24 14:34:33 2018 From: eryksun at gmail.com (eryk sun) Date: Mon, 24 Sep 2018 13:34:33 -0500 Subject: [Python-Dev] Questions about signal handling. In-Reply-To: References: Message-ID: On Fri, Sep 21, 2018 at 6:10 PM, Victor Stinner wrote: > > Moreover, you can get the signal while you don't hold the GIL :-) Note that, in Windows, SIGINT and SIGBREAK are implemented in the C runtime and linked to the corresponding console control events in a console application, such as python.exe. Console control events are delivered on a new thread (i.e. no Python thread state) that starts at CtrlRoutine in kernelbase.dll. The session server (csrss.exe) creates this thread remotely upon request from the console host process (conhost.exe). From ericsnowcurrently at gmail.com Mon Sep 24 16:19:38 2018 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Mon, 24 Sep 2018 14:19:38 -0600 Subject: [Python-Dev] Questions about signal handling. In-Reply-To: References: Message-ID: On Mon, Sep 24, 2018 at 11:14 AM Yury Selivanov wrote: > On Fri, Sep 21, 2018 at 7:04 PM Eric Snow wrote: > > 1. Why do we restrict calls to signal.signal() to the main thread? > > 2. Why must signal handlers run in the main thread? > > 3. Why does signal handling operate via the "pending calls" machinery > > and not distinctly? > > Here's my take on this: > > Handling signals in a multi-threaded program is hard. Some signals can > be delivered to an arbitrary thread, some to the one that caused them. > Posix provides lots of mechanisms to tune how signals are received (or > blocked) by individual threads, but (a) Python doesn't expose those > APIs, (b) using those APIs correctly is insanely hard. By restricting > that we can only receive signals in the main thread we remove all that > complexity. Just to be clear, I'm *not* suggesting that we allow folks to specify in which Python (or kernel) thread a Python-level signal handler is called. The reason I've asked about signals is because of the semantics under subinterpreters (where there is no "main" thread). However, I don't have any plans to introduce per-interpreter signal handlers. Mostly I want to understand about the "main" thread restriction for the possible impact on subinterpreters. FWIW, I'm also mildly curious about the value of the "main" thread restriction currently. From what I can tell the restriction was made early on and there are hints in the C code that it's no longer needed. I suspect we still have the restriction solely because no one has bothered to change it. However, I wasn't sure so I figured I'd ask. :) > Restricting that signal.signal() can only be called from > the main thread just makes this API more consistent Yeah, that's what I thought. > (and also IIRC > avoids weird sigaction() behaviour when it is called from different > threads within one program). Is there a good place where this weirdness is documented? > Next, you can only call reentrant functions in your signal handlers. 
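The "set a flag and check it later" idiom mentioned above has a direct pure-Python analogue, which may help make the discussion concrete. A minimal sketch, assuming a POSIX platform (SIGUSR1 does not exist on Windows):

```python
import os
import signal
import time

pending = []  # written by the handler, drained by the "main loop" below

def handler(signum, frame):
    # Keep the handler tiny: just record that the signal arrived.
    pending.append(signum)

signal.signal(signal.SIGUSR1, handler)
os.kill(os.getpid(), signal.SIGUSR1)  # deliver a signal to ourselves

for _ in range(3):  # stand-in for an application's main loop
    while pending:
        print("handling signal", pending.pop())
    time.sleep(0.1)
```

The Python-level handler itself is already deferred by the interpreter to a safe point between bytecodes, so the list append here only illustrates the pattern rather than working around a real reentrancy hazard.
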
> For instance, printf() function isn't safe to use. Therefore one > common practice is to set a flag that a signal was received and check > it later (exactly what we do with the pending calls machinery). We don't actually use the pending calls machinery for signals though. The only thing we do is *always* call PyErr_CheckSignals() before making any pending calls. Wouldn't it be equivalent if we called PyErr_CheckSignals() at the beginning of the eval loop right before calling Py_MakePendingCalls()? This matters to me because I'd like to use "pending" calls for subinterpreters, which means dealing with signals *in* Py_MakePendingCalls() is problematic. Pulling the PyErr_CheckSignals() call out would eliminate that problem. > Therefore, IMO, the current way we handle signals in Python is the > safest, most predictable, and most cross-platform option there is. > And changing how Python signals API works with threads in any way will > actually break the world. I agree that the way we deal with signals (i.e. set a flag that is later handled in PyErr_CheckSignals(), protected by the GIL) shouldn't change. My original 3 questions do not relate to that. Looks like that wasn't terribly clear. :) -eric From yselivanov.ml at gmail.com Mon Sep 24 17:10:27 2018 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Mon, 24 Sep 2018 17:10:27 -0400 Subject: [Python-Dev] Questions about signal handling. In-Reply-To: References: Message-ID: On Mon, Sep 24, 2018 at 4:19 PM Eric Snow wrote: [..] > Is there a good place where this weirdness is documented? I'll need to look through uvloop & libuv commit log to remember that; will try to find time tonight/tomorrow. [..] > This matters to me because I'd like to use "pending" calls for > subinterpreters, which means dealing with signals *in* > Py_MakePendingCalls() is problematic. Pulling the > PyErr_CheckSignals() call out would eliminate that problem. Py_MakePendingCalls is a public API, even though it's not documented. If we change it to not call PyErr_CheckSignals and if there are C extensions that block pure Python code execution for long time (but call Py_MakePendingCalls explicitly), such extensions would stop reacting to ^C. Maybe a better workaround would be to introduce a concept of "main" sub-interpreter? We can then fix Py_MakePendingCalls to only check for signals when it's called from the main interpreter. Yury From ericsnowcurrently at gmail.com Mon Sep 24 19:24:02 2018 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Mon, 24 Sep 2018 17:24:02 -0600 Subject: [Python-Dev] Questions about signal handling. In-Reply-To: References: Message-ID: On Mon, Sep 24, 2018 at 3:10 PM Yury Selivanov wrote: > On Mon, Sep 24, 2018 at 4:19 PM Eric Snow wrote: > > This matters to me because I'd like to use "pending" calls for > > subinterpreters, which means dealing with signals *in* > > Py_MakePendingCalls() is problematic. Pulling the > > PyErr_CheckSignals() call out would eliminate that problem. > > Py_MakePendingCalls is a public API, even though it's not documented. > If we change it to not call PyErr_CheckSignals and if there are C > extensions that block pure Python code execution for long time (but > call Py_MakePendingCalls explicitly), such extensions would stop > reacting to ^C. > > Maybe a better workaround would be to introduce a concept of "main" > sub-interpreter? We can then fix Py_MakePendingCalls to only check for > signals when it's called from the main interpreter. 
I'm planning on making Py_MakePendingCalls() a backward-compatible wrapper around a new private _Py_MakePendingCalls() which supports per-interpreter operation. Then the eval loop will call the new internal function. So nothing would change for users. -eric From vstinner at redhat.com Tue Sep 25 03:44:57 2018 From: vstinner at redhat.com (Victor Stinner) Date: Tue, 25 Sep 2018 09:44:57 +0200 Subject: [Python-Dev] Questions about signal handling. In-Reply-To: References: Message-ID: Please don't rely on this ugly API. *By design*, Py_AddPendingCall() tries 100 times to acquire the lock: if it fails to acquire the lock, it does notthing... your callback is ignored... By the way, recently, we had to fix yet another bug in signal handling. A new function has been added: void _PyEval_SignalReceived(void) { /* bpo-30703: Function called when the C signal handler of Python gets a signal. We cannot queue a callback using Py_AddPendingCall() since that function is not async-signal-safe. */ SIGNAL_PENDING_CALLS(); } If you want to exchange commands between two interpreters, which I see as two threads, I suggest to use two queues and something to consume queues. Victor Le lun. 24 sept. 2018 ? 22:23, Eric Snow a ?crit : > > On Mon, Sep 24, 2018 at 11:14 AM Yury Selivanov wrote: > > On Fri, Sep 21, 2018 at 7:04 PM Eric Snow wrote: > > > 1. Why do we restrict calls to signal.signal() to the main thread? > > > 2. Why must signal handlers run in the main thread? > > > 3. Why does signal handling operate via the "pending calls" machinery > > > and not distinctly? > > > > Here's my take on this: > > > > Handling signals in a multi-threaded program is hard. Some signals can > > be delivered to an arbitrary thread, some to the one that caused them. > > Posix provides lots of mechanisms to tune how signals are received (or > > blocked) by individual threads, but (a) Python doesn't expose those > > APIs, (b) using those APIs correctly is insanely hard. By restricting > > that we can only receive signals in the main thread we remove all that > > complexity. > > Just to be clear, I'm *not* suggesting that we allow folks to specify > in which Python (or kernel) thread a Python-level signal handler is > called. > > The reason I've asked about signals is because of the semantics under > subinterpreters (where there is no "main" thread). However, I don't > have any plans to introduce per-interpreter signal handlers. Mostly I > want to understand about the "main" thread restriction for the > possible impact on subinterpreters. > > FWIW, I'm also mildly curious about the value of the "main" thread > restriction currently. From what I can tell the restriction was made > early on and there are hints in the C code that it's no longer needed. > I suspect we still have the restriction solely because no one has > bothered to change it. However, I wasn't sure so I figured I'd ask. > :) > > > Restricting that signal.signal() can only be called from > > the main thread just makes this API more consistent > > Yeah, that's what I thought. > > > (and also IIRC > > avoids weird sigaction() behaviour when it is called from different > > threads within one program). > > Is there a good place where this weirdness is documented? > > > Next, you can only call reentrant functions in your signal handlers. > > For instance, printf() function isn't safe to use. Therefore one > > common practice is to set a flag that a signal was received and check > > it later (exactly what we do with the pending calls machinery). 
> > We don't actually use the pending calls machinery for signals though. > The only thing we do is *always* call PyErr_CheckSignals() before > making any pending calls. Wouldn't it be equivalent if we called > PyErr_CheckSignals() at the beginning of the eval loop right before > calling Py_MakePendingCalls()? > > This matters to me because I'd like to use "pending" calls for > subinterpreters, which means dealing with signals *in* > Py_MakePendingCalls() is problematic. Pulling the > PyErr_CheckSignals() call out would eliminate that problem. > > > Therefore, IMO, the current way we handle signals in Python is the > > safest, most predictable, and most cross-platform option there is. > > And changing how Python signals API works with threads in any way will > > actually break the world. > > I agree that the way we deal with signals (i.e. set a flag that is > later handled in PyErr_CheckSignals(), protected by the GIL) shouldn't > change. My original 3 questions do not relate to that. Looks like > that wasn't terribly clear. :) > > -eric > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/vstinner%40redhat.com From barry at python.org Tue Sep 25 10:01:56 2018 From: barry at python.org (Barry Warsaw) Date: Tue, 25 Sep 2018 10:01:56 -0400 Subject: [Python-Dev] Documenting the private C API (was Re: Questions about signal handling.) In-Reply-To: References: Message-ID: <105BF294-CA1E-43C4-9CE2-A71683371DDB@python.org> On Sep 25, 2018, at 03:44, Victor Stinner wrote: > By the way, recently, we had to fix yet another bug in signal > handling. A new function has been added: > > void > _PyEval_SignalReceived(void) > { > /* bpo-30703: Function called when the C signal handler of Python gets a > signal. We cannot queue a callback using Py_AddPendingCall() since > that function is not async-signal-safe. */ > SIGNAL_PENDING_CALLS(); > } Is anybody else concerned about the proliferation of undocumented private C API functions? I?m fine with adding leading underscore functions and macros when it makes sense, but what concerns me is that they don?t appear in the Python C API documentation (AFAICT). That means they are undiscoverable, and their existence and utility is buried in institutional knowledge and obscure places within the C code. And yet, the interpreter relies heavily on them. Maybe this is better off discussed in doc-sig but I think we need to consider documenting the private C API. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: Message signed with OpenPGP URL: From J.Demeyer at UGent.be Tue Sep 25 10:08:24 2018 From: J.Demeyer at UGent.be (Jeroen Demeyer) Date: Tue, 25 Sep 2018 16:08:24 +0200 Subject: [Python-Dev] Documenting the private C API (was Re: Questions about signal handling.) In-Reply-To: <177b20ddea5a41bbaa46716ccdcfec42@xmail103.UGent.be> References: <177b20ddea5a41bbaa46716ccdcfec42@xmail103.UGent.be> Message-ID: <5BAA4158.6090504@UGent.be> On 2018-09-25 16:01, Barry Warsaw wrote: > Maybe this is better off discussed in doc-sig but I think we need to consider documenting the private C API. Even the *public* C API is not fully documented. For example, none of the PyCFunction_... functions appears in the documentation. 
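A rough way to measure that gap is to scan the public headers for PyAPI_FUNC declarations and check which of those names never appear under Doc/c-api/. A quick heuristic sketch, assuming it is run from the root of a CPython checkout (names documented under a different spelling will show up as false positives):

```python
import re
from pathlib import Path

# Names declared as PyAPI_FUNC(<ret>) <name>(...) in the public headers.
decl = re.compile(r"PyAPI_FUNC\([^)]*\)\s*(\w+)\s*\(")
declared = set()
for header in Path("Include").glob("*.h"):
    declared |= set(decl.findall(header.read_text(encoding="utf-8", errors="replace")))

# Everything mentioned anywhere in the C API docs, as one big string.
documented = "".join(
    rst.read_text(encoding="utf-8", errors="replace")
    for rst in Path("Doc/c-api").glob("*.rst")
)

missing = sorted(
    name for name in declared
    if not name.startswith("_") and name not in documented
)
print(len(missing), "public functions not mentioned in Doc/c-api/")
print("\n".join(missing[:20]))
```

Nothing authoritative, but it produces a concrete list to triage.
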
From solipsis at pitrou.net Tue Sep 25 10:18:07 2018 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 25 Sep 2018 16:18:07 +0200 Subject: [Python-Dev] Documenting the private C API (was Re: Questions about signal handling.) References: <105BF294-CA1E-43C4-9CE2-A71683371DDB@python.org> Message-ID: <20180925161807.5e13e90e@fsol> On Tue, 25 Sep 2018 10:01:56 -0400 Barry Warsaw wrote: > On Sep 25, 2018, at 03:44, Victor Stinner wrote: > > > By the way, recently, we had to fix yet another bug in signal > > handling. A new function has been added: > > > > void > > _PyEval_SignalReceived(void) > > { > > /* bpo-30703: Function called when the C signal handler of Python gets a > > signal. We cannot queue a callback using Py_AddPendingCall() since > > that function is not async-signal-safe. */ > > SIGNAL_PENDING_CALLS(); > > } > > Is anybody else concerned about the proliferation of undocumented private C API functions? I?m fine with adding leading underscore functions and macros when it makes sense, but what concerns me is that they don?t appear in the Python C API documentation (AFAICT). Not really. Many are just like "static" (i.e. module-private) functions, except that they need to be shared by two or three different C modules. It's definitely the case for _PyEval_SignalReceived(). Putting them in the C API documentation risks making the docs harder to browse through for third-party users. I think it's enough if there's a comment in the .h file explaining the given function. Regards Antoine. From barry at python.org Tue Sep 25 10:30:00 2018 From: barry at python.org (Barry Warsaw) Date: Tue, 25 Sep 2018 10:30:00 -0400 Subject: [Python-Dev] Documenting the private C API (was Re: Questions about signal handling.) In-Reply-To: <20180925161807.5e13e90e@fsol> References: <105BF294-CA1E-43C4-9CE2-A71683371DDB@python.org> <20180925161807.5e13e90e@fsol> Message-ID: On Sep 25, 2018, at 10:18, Antoine Pitrou wrote: > > Not really. Many are just like "static" (i.e. module-private) > functions, except that they need to be shared by two or three different > C modules. It's definitely the case for _PyEval_SignalReceived(). Purely static functions which appear only in the file they are defined in are probably fine not to document, although I do still think we should take care to comment on their semantics and external behaviors (i.e. reference counting). But if they?re used in multiple C files, then I think they *can* deserve placement within the documentation. > Putting them in the C API documentation risks making the docs harder to > browse through for third-party users. I think it's enough if there's a > comment in the .h file explaining the given function. It?s a trade-off for sure. I don?t have any great ideas about how to balance that, and I don?t know what documentation techniques would help, but it does often bother me that I can?t search for them on docs.python.org. Cheers, -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: Message signed with OpenPGP URL: From barry at python.org Tue Sep 25 10:31:29 2018 From: barry at python.org (Barry Warsaw) Date: Tue, 25 Sep 2018 10:31:29 -0400 Subject: [Python-Dev] Documenting the private C API (was Re: Questions about signal handling.) 
In-Reply-To: <5BAA4158.6090504@UGent.be> References: <177b20ddea5a41bbaa46716ccdcfec42@xmail103.UGent.be> <5BAA4158.6090504@UGent.be> Message-ID: <8273D318-BE52-4797-9DF7-77548E9DE69A@python.org> On Sep 25, 2018, at 10:08, Jeroen Demeyer wrote: > > On 2018-09-25 16:01, Barry Warsaw wrote: >> Maybe this is better off discussed in doc-sig but I think we need to consider documenting the private C API. > > Even the *public* C API is not fully documented. For example, none of the PyCFunction_... functions appears in the documentation. That?s clearly a documentation bug :). -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: Message signed with OpenPGP URL: From ericsnowcurrently at gmail.com Tue Sep 25 11:09:26 2018 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Tue, 25 Sep 2018 09:09:26 -0600 Subject: [Python-Dev] Questions about signal handling. In-Reply-To: References: Message-ID: On Tue, Sep 25, 2018 at 1:45 AM Victor Stinner wrote: > Please don't rely on this ugly API. *By design*, Py_AddPendingCall() > tries 100 times to acquire the lock: if it fails to acquire the lock, > it does notthing... your callback is ignored... Yeah, there are issues with pending calls as implemented. Furthermore, I'm not clear on why it was made a public API in the first place. Ultimately I'd like to deprecate Py_AddPendingCall and Py_MakePendingCalls (but that's not my priority right now). Regardless, the underlying machinery matches what I need for interpreter isolation right now. See below. > By the way, recently, we had to fix yet another bug in signal > handling. A new function has been added: > [snip] I saw that. If anything, it's more justification for separating signals from the pending calls machinery. :) > If you want to exchange commands between two interpreters, which I see > as two threads, I suggest to use two queues and something to consume > queues. Sure. However, to make that work I'd end up with something that's almost identical to the existing pending calls machinery. Since the function-to-run may call Python code, there must be an active PyThreadState and the GIL (or, eventually, target interpreter's lock) must be held. That is what Py_MakePendingCalls gives us already. Injecting the pending call into the eval loop (as we do today) means we don't have to create a new thread just for handling pending calls. [1] In the short term I'd like to stick with existing functionality as much as possible. Doing so has a number of benefits. That's been one of my guiding principles as I've worked toward the multi-core Python goal. That said, I agree that the pending calls machinery has some deficiencies that should be fixed or something better should replace it. I just don't want that to get in the way of short-term goals, especially as I have limited time for this. -eric [1] FWIW, a separate thread for "asynchronous" operations (ones that interrupt the eval loop, e.g. "pending" calls and signals) might be a viable approach now that we require platforms to support threading. The way we interrupt the eval loop currently seems more complex (and inefficient) than necessary. I was thinking about this last week and plan to explore it further at some point in the future. For now, though, I'd like to keep the focus on more immediate needs. 
:) From yselivanov.ml at gmail.com Tue Sep 25 11:15:55 2018 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Tue, 25 Sep 2018 11:15:55 -0400 Subject: [Python-Dev] Documenting the private C API (was Re: Questions about signal handling.) In-Reply-To: <105BF294-CA1E-43C4-9CE2-A71683371DDB@python.org> References: <105BF294-CA1E-43C4-9CE2-A71683371DDB@python.org> Message-ID: On Tue, Sep 25, 2018 at 10:04 AM Barry Warsaw wrote: > > Is anybody else concerned about the proliferation of undocumented private C API functions? I am concerned about that too. In my opinion having all those semi-private undocumented C API just contributes to the noise and artificially inflates the grey area of C API that alternative implementations *have to* support. We already have a mechanism for private header files: the "Include/internal/" directory. I think it should be mandatory to always put private C API-like functions/structs there. Yury From ericsnowcurrently at gmail.com Tue Sep 25 11:18:04 2018 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Tue, 25 Sep 2018 09:18:04 -0600 Subject: [Python-Dev] Documenting the private C API (was Re: Questions about signal handling.) In-Reply-To: References: <105BF294-CA1E-43C4-9CE2-A71683371DDB@python.org> <20180925161807.5e13e90e@fsol> Message-ID: On Tue, Sep 25, 2018 at 8:30 AM Barry Warsaw wrote: > On Sep 25, 2018, at 10:18, Antoine Pitrou wrote: > > Putting them in the C API documentation risks making the docs harder to > > browse through for third-party users. I think it's enough if there's a > > comment in the .h file explaining the given function. > > It?s a trade-off for sure. I don?t have any great ideas about how to balance that, and I don?t know what documentation techniques would help, but it does often bother me that I can?t search for them on docs.python.org. FWIW, I've run into the same issue. Perhaps we could have a single dedicated page in the C-API docs for internal API. It could be just a big table with a bold red warning at the top (e.g. "These are internal-only APIs, here for the benefit of folks working on CPython itself."). We *could* even have a CI check to ensure that new internal API (which doesn't happen often) gets added to the table. -eric From solipsis at pitrou.net Tue Sep 25 11:19:11 2018 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 25 Sep 2018 17:19:11 +0200 Subject: [Python-Dev] Questions about signal handling. References: Message-ID: <20180925171911.3495105f@fsol> On Tue, 25 Sep 2018 09:09:26 -0600 Eric Snow wrote: > On Tue, Sep 25, 2018 at 1:45 AM Victor Stinner wrote: > > Please don't rely on this ugly API. *By design*, Py_AddPendingCall() > > tries 100 times to acquire the lock: if it fails to acquire the lock, > > it does notthing... your callback is ignored... > > Yeah, there are issues with pending calls as implemented. > Furthermore, I'm not clear on why it was made a public API in the > first place. I don't know, but I think Eve Online used the API at some point (not sure they're still Python-based nowadays). Perhaps Kristj?n may confirm if he's reading this. Regards Antoine. From ericsnowcurrently at gmail.com Tue Sep 25 11:24:43 2018 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Tue, 25 Sep 2018 09:24:43 -0600 Subject: [Python-Dev] Documenting the private C API (was Re: Questions about signal handling.) 
In-Reply-To: References: <105BF294-CA1E-43C4-9CE2-A71683371DDB@python.org> Message-ID: On Tue, Sep 25, 2018 at 9:16 AM Yury Selivanov wrote: > We already have a mechanism for private header files: the > "Include/internal/" directory. I think it should be mandatory to > always put private C API-like functions/structs there. +1 This is the main reason I created that directory. (Victor had a similar idea about the same time.) Having the separate "Include/internal" directory definitely makes it much easier to discover internal APIs, to distinguish between public and private, and to keep extension authors (and embedders) from using internal APIs without knowing what they're doing. That said, having docs for the internal APIs would still be awesome! -eric From vstinner at redhat.com Tue Sep 25 11:28:51 2018 From: vstinner at redhat.com (Victor Stinner) Date: Tue, 25 Sep 2018 17:28:51 +0200 Subject: [Python-Dev] Documenting the private C API (was Re: Questions about signal handling.) In-Reply-To: <105BF294-CA1E-43C4-9CE2-A71683371DDB@python.org> References: <105BF294-CA1E-43C4-9CE2-A71683371DDB@python.org> Message-ID: Nobody should use _PyEval_SignalReceived(). It should only be used the the C signal handler. But if we have a separated documented for CPython internals, why not documenting private functions. At least, I would prefer to not put it at the same place an the *public* C API. (At least, a different directory.) Victor Le mar. 25 sept. 2018 ? 16:05, Barry Warsaw a ?crit : > > On Sep 25, 2018, at 03:44, Victor Stinner wrote: > > > By the way, recently, we had to fix yet another bug in signal > > handling. A new function has been added: > > > > void > > _PyEval_SignalReceived(void) > > { > > /* bpo-30703: Function called when the C signal handler of Python gets a > > signal. We cannot queue a callback using Py_AddPendingCall() since > > that function is not async-signal-safe. */ > > SIGNAL_PENDING_CALLS(); > > } > > Is anybody else concerned about the proliferation of undocumented private C API functions? I?m fine with adding leading underscore functions and macros when it makes sense, but what concerns me is that they don?t appear in the Python C API documentation (AFAICT). That means they are undiscoverable, and their existence and utility is buried in institutional knowledge and obscure places within the C code. And yet, the interpreter relies heavily on them. > > Maybe this is better off discussed in doc-sig but I think we need to consider documenting the private C API. > > -Barry > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/vstinner%40redhat.com From greg at krypto.org Tue Sep 25 11:30:03 2018 From: greg at krypto.org (Gregory P. Smith) Date: Tue, 25 Sep 2018 08:30:03 -0700 Subject: [Python-Dev] Questions about signal handling. In-Reply-To: References: Message-ID: On Mon, Sep 24, 2018 at 1:20 PM Eric Snow wrote: > On Mon, Sep 24, 2018 at 11:14 AM Yury Selivanov > wrote: > > On Fri, Sep 21, 2018 at 7:04 PM Eric Snow > wrote: > > > 1. Why do we restrict calls to signal.signal() to the main thread? > > > 2. Why must signal handlers run in the main thread? > > > 3. Why does signal handling operate via the "pending calls" machinery > > > and not distinctly? > > > > Here's my take on this: > > > > Handling signals in a multi-threaded program is hard. 
Some signals can > > be delivered to an arbitrary thread, some to the one that caused them. > > Posix provides lots of mechanisms to tune how signals are received (or > > blocked) by individual threads, but (a) Python doesn't expose those > > APIs, (b) using those APIs correctly is insanely hard. By restricting > > that we can only receive signals in the main thread we remove all that > > complexity. > > Just to be clear, I'm *not* suggesting that we allow folks to specify > in which Python (or kernel) thread a Python-level signal handler is > called. > > The reason I've asked about signals is because of the semantics under > subinterpreters (where there is no "main" thread). However, I don't > have any plans to introduce per-interpreter signal handlers. Mostly I > want to understand about the "main" thread restriction for the > possible impact on subinterpreters. > > FWIW, I'm also mildly curious about the value of the "main" thread > restriction currently. From what I can tell the restriction was made > early on and there are hints in the C code that it's no longer needed. > I suspect we still have the restriction solely because no one has > bothered to change it. However, I wasn't sure so I figured I'd ask. > :) > We can't change the API of the main thread being where signal handlers are executed by default. If a signal handler raised an exception in a daemon thread, the process would not die when it goes uncaught (ie: why KeyboardInterrupt works). The daemon thread ends and the rest of the process is unaware of that. Many existing Python signal handlers expect to only be called from the main thread. If we wanted to change this, we'd probably want to have users declare which thread(s) are allowed to execute which signal handlers at signal handler registration time and whether they are executed by only one of those threads or by all of those threads. Not semantics I expect most people are used to because I'm not aware of any other language doing that. But I don't see a compelling use case for implementing such complexity. Maybe something like that would make sense for subinterpreter delegation only? I'm not sure. I'd start without signals at all in subinterpreters before making such a decision. Python keeping signals simple has long been cited as a feature of the VM. -gps > > > Restricting that signal.signal() can only be called from > > the main thread just makes this API more consistent > > Yeah, that's what I thought. > > > (and also IIRC > > avoids weird sigaction() behaviour when it is called from different > > threads within one program). > > Is there a good place where this weirdness is documented? > > > Next, you can only call reentrant functions in your signal handlers. > > For instance, printf() function isn't safe to use. Therefore one > > common practice is to set a flag that a signal was received and check > > it later (exactly what we do with the pending calls machinery). > > We don't actually use the pending calls machinery for signals though. > The only thing we do is *always* call PyErr_CheckSignals() before > making any pending calls. Wouldn't it be equivalent if we called > PyErr_CheckSignals() at the beginning of the eval loop right before > calling Py_MakePendingCalls()? > > This matters to me because I'd like to use "pending" calls for > subinterpreters, which means dealing with signals *in* > Py_MakePendingCalls() is problematic. Pulling the > PyErr_CheckSignals() call out would eliminate that problem. 
> > > Therefore, IMO, the current way we handle signals in Python is the > > safest, most predictable, and most cross-platform option there is. > > And changing how Python signals API works with threads in any way will > > actually break the world. > > I agree that the way we deal with signals (i.e. set a flag that is > later handled in PyErr_CheckSignals(), protected by the GIL) shouldn't > change. My original 3 questions do not relate to that. Looks like > that wasn't terribly clear. :) > > -eric > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/greg%40krypto.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ericsnowcurrently at gmail.com Tue Sep 25 11:39:47 2018 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Tue, 25 Sep 2018 09:39:47 -0600 Subject: [Python-Dev] Questions about signal handling. In-Reply-To: References: Message-ID: On Tue, Sep 25, 2018 at 9:30 AM Gregory P. Smith wrote: > We can't change the API of the main thread being where signal handlers are > executed by default. > > If a signal handler raised an exception in a daemon thread, the process would > not die when it goes uncaught (ie: why KeyboardInterrupt works). The daemon > thread ends and the rest of the process is unaware of that. Many existing > Python signal handlers expect to only be called from the main thread. Ah, that's good to know. Thanks, Greg! > If we wanted to change this, we'd probably want to have users declare which > thread(s) are allowed to execute which signal handlers at signal handler > registration time and whether they are executed by only one of those threads or > by all of those threads. Not semantics I expect most people are used to because > I'm not aware of any other language doing that. But I don't see a compelling use > case for implementing such complexity. That's similar to what I imagined, based on how signals and posix threads interact. Likewise I consider it not nearly worth doing. :) > Maybe something like that would make sense for subinterpreter delegation only? > I'm not sure. I'd start without signals at all in subinterpreters before making such > a decision. > > Python keeping signals simple has long been cited as a feature of the VM. Exactly. For now I was planning on keeping signals main-thread-only (consequently main-interpreter-only). There's the possibility we could support per-interpreter signal handlers, but I don't plan on exploring that idea until well after the more important stuff is finished. -eric From barry at python.org Tue Sep 25 11:53:08 2018 From: barry at python.org (Barry Warsaw) Date: Tue, 25 Sep 2018 11:53:08 -0400 Subject: [Python-Dev] Documenting the private C API (was Re: Questions about signal handling.) In-Reply-To: References: <105BF294-CA1E-43C4-9CE2-A71683371DDB@python.org> Message-ID: <458835EA-0924-4FF0-B1F3-EB2A162CBC9D@python.org> On Sep 25, 2018, at 11:28, Victor Stinner wrote: > > But if we have a separated documented for CPython internals, why not > documenting private functions. At least, I would prefer to not put it > at the same place an the *public* C API. (At least, a different > directory.) I like the idea of an ?internals? C API documentation, separate from the public API. -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: Message signed with OpenPGP URL: From guido at python.org Tue Sep 25 12:01:58 2018 From: guido at python.org (Guido van Rossum) Date: Tue, 25 Sep 2018 09:01:58 -0700 Subject: [Python-Dev] Documenting the private C API (was Re: Questions about signal handling.) In-Reply-To: <458835EA-0924-4FF0-B1F3-EB2A162CBC9D@python.org> References: <105BF294-CA1E-43C4-9CE2-A71683371DDB@python.org> <458835EA-0924-4FF0-B1F3-EB2A162CBC9D@python.org> Message-ID: On Tue, Sep 25, 2018 at 8:55 AM Barry Warsaw wrote: > On Sep 25, 2018, at 11:28, Victor Stinner wrote: > > > > But if we have a separated documented for CPython internals, why not > > documenting private functions. At least, I would prefer to not put it > > at the same place an the *public* C API. (At least, a different > > directory.) > > I like the idea of an ?internals? C API documentation, separate from the > public API. > Right. IMO it should be physically separate from the public C API docs. I.e. reside in a different subdirectory of Doc, and be published at a different URL (perhaps not even under docs.python.org), since the audience here is exclusively people who want to modify the CPython interpreter, *not* people who want to write extension modules for use with CPython. And we should fully reserve the right to change their behavior incompatibly, even in bugfix releases. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From yselivanov.ml at gmail.com Tue Sep 25 12:09:33 2018 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Tue, 25 Sep 2018 12:09:33 -0400 Subject: [Python-Dev] Documenting the private C API (was Re: Questions about signal handling.) In-Reply-To: <458835EA-0924-4FF0-B1F3-EB2A162CBC9D@python.org> References: <105BF294-CA1E-43C4-9CE2-A71683371DDB@python.org> <458835EA-0924-4FF0-B1F3-EB2A162CBC9D@python.org> Message-ID: On Tue, Sep 25, 2018 at 11:55 AM Barry Warsaw wrote: > > On Sep 25, 2018, at 11:28, Victor Stinner wrote: > > > > But if we have a separated documented for CPython internals, why not > > documenting private functions. At least, I would prefer to not put it > > at the same place an the *public* C API. (At least, a different > > directory.) > > I like the idea of an ?internals? C API documentation, separate from the public API. For that we can just document them in the code, right? Like this one, from Include/internal/pystate.h: /* Initialize _PyRuntimeState. Return NULL on success, or return an error message on failure. */ PyAPI_FUNC(_PyInitError) _PyRuntime_Initialize(void); My main concern with maintaining a *separate* documentation of internals is that it would make it harder to keep it in sync with the actual implementation. We often struggle to keep the comments in the code in sync with that code. Yury From barry at python.org Tue Sep 25 15:24:49 2018 From: barry at python.org (Barry Warsaw) Date: Tue, 25 Sep 2018 15:24:49 -0400 Subject: [Python-Dev] Documenting the private C API (was Re: Questions about signal handling.) In-Reply-To: References: <105BF294-CA1E-43C4-9CE2-A71683371DDB@python.org> <458835EA-0924-4FF0-B1F3-EB2A162CBC9D@python.org> Message-ID: <119B0CFA-FF9F-49F0-8301-F3D11F21C772@python.org> On Sep 25, 2018, at 12:09, Yury Selivanov wrote: > > My main concern with maintaining a *separate* documentation of > internals is that it would make it harder to keep it in sync with the > actual implementation. 
We often struggle to keep the comments in the > code in sync with that code. Well, my goal is that the internal API would show up when I search for function names on docs.python.org. Right now, I believe the ?quick search? box does search the entire documentation suite. I don?t care too much whether they would reside in a separate section in the current C API, or in a separate directory, listed or not under ?Parts of the documentation? on the front landing page. But I agree they shouldn?t be intermingled with the public C API. Cheers, -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: Message signed with OpenPGP URL: From yselivanov.ml at gmail.com Tue Sep 25 15:32:16 2018 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Tue, 25 Sep 2018 15:32:16 -0400 Subject: [Python-Dev] Documenting the private C API (was Re: Questions about signal handling.) In-Reply-To: <119B0CFA-FF9F-49F0-8301-F3D11F21C772@python.org> References: <105BF294-CA1E-43C4-9CE2-A71683371DDB@python.org> <458835EA-0924-4FF0-B1F3-EB2A162CBC9D@python.org> <119B0CFA-FF9F-49F0-8301-F3D11F21C772@python.org> Message-ID: On Tue, Sep 25, 2018 at 3:27 PM Barry Warsaw wrote: > > On Sep 25, 2018, at 12:09, Yury Selivanov wrote: > > > > My main concern with maintaining a *separate* documentation of > > internals is that it would make it harder to keep it in sync with the > > actual implementation. We often struggle to keep the comments in the > > code in sync with that code. > > Well, my goal is that the internal API would show up when I search for function names on docs.python.org. Right now, I believe the ?quick search? box does search the entire documentation suite. I don?t care too much whether they would reside in a separate section in the current C API, or in a separate directory, listed or not under ?Parts of the documentation? on the front landing page. But I agree they shouldn?t be intermingled with the public C API. An idea: it would be cool to have something like Sphinx autodoc for C headers to pull this documentation from source. Yury From myriachan at gmail.com Tue Sep 25 15:38:17 2018 From: myriachan at gmail.com (Myria) Date: Tue, 25 Sep 2018 12:38:17 -0700 Subject: [Python-Dev] Bug in _portable_fseek on Windows in 2.7.x Message-ID: Sorry for mailing about a bug instead of putting in a bug tracker ticket. The bug tracker's login system just sits there for a minute then says "an error has occurred". This line of code is broken in Windows: https://github.com/python/cpython/blob/v2.7.15/Objects/fileobject.c#L721 _lseeki64 only modifies the kernel's seek position, not the cached position stored in stdio. This sometimes leads to problems where fgetpos does not return the correct file pointer. The solution is to not do that "SIZEOF_FPOS_T >= 8" code at all on Windows, and just do this instead (or make a new HAVE_FSEEKI64 macro): #elif defined(MS_WINDOWS) return _fseeki64(fp, offset, whence); 3.x is unaffected because it uses the Unix-like FD API in Windows instead of stdio, in which its usage of _lseeki64 is correct. 
Thanks, Melissa From jab at math.brown.edu Wed Sep 26 07:26:17 2018 From: jab at math.brown.edu (jab at math.brown.edu) Date: Wed, 26 Sep 2018 07:26:17 -0400 Subject: [Python-Dev] Change in Python 3's "round" behavior In-Reply-To: <1537960170.25.0.545547206417.issue32956@psf.upfronthosting.co.za> References: <1519658635.56.0.467229070634.issue32956@psf.upfronthosting.co.za> <1537960170.25.0.545547206417.issue32956@psf.upfronthosting.co.za> Message-ID: I recently found out about Python 3's round-to-even change (via https://github.com/cosmologicon/pywat!) and am having trouble finding where that change was discussed. I did find the revealingly-invalid bug report https://bugs.python.org/issue32956 ("python 3 round bug"), so I asked there, but wanted to invite anyone else on this list who might be interested to help. If interested, please see the comments there (copy/pasted below for convenience), and +nosy or comment on that issue. Thanks! Joshua Bronson added the comment: This was so surprising to me that I had to check some other languages that I had handy. It turns out that not one of JavaScript, Ruby, Perl, C++, Java, Go, or Rust agrees with Python. In fact they all agreed with one another that 2.5 should round to 3. Examples below. I understand from https://github.com/cosmologicon/pywat/pull/40#discussion_r219962259 that "to always round up... can theoretically skew the data" but it's not clear why that's a good enough reason to differ from the "round" function in all these other languages (as opposed to e.g. offering this alternative behavior in some additional "round_unskewed" function). I assume the rationale for having Python 3's "round" differ from that of so many other languages was written down when this decision was made, but I searched and couldn't find it. Could anyone link to it in a comment here? And would it be worth including rationale and a larger callout in the https://docs.python.org/3/library/functions.html#round docs? The documentation of this behavior is a bit buried among other things, and the rationale for it is missing entirely. $ node -e 'console.log(Math.round(2.5))' 3 $ ruby -e 'puts (2.5).round()' 3 $ perl -e 'use Math::Round; print round(2.5)' 3 $ cat test_round.cpp #include #include int main(void) { printf("%f\n", round(2.5)); } $ g++ test_round.cpp && ./a.out 3.000000 $ cat TestRound.java class TestRound { public static void main(String[] args) { System.out.println(Math.round(2.5)); } } $ javac TestRound.java && java TestRound 3 $ cat test_round.go package main import "fmt" import "math" func main() { fmt.Println(math.Round(2.5)) } $ go build test_round.go && ./test_round 3 $ cat test_round.rs fn main() { println!("{}", (2.5_f64).round()); } $ rustc test_round.rs && ./test_round 3 Serhiy Storchaka added the comment: See the discussion on the Python-Dev mailing list: https://mail.python.org/pipermail/python-dev/2008-January/075863.html. For C look at the rint() function. It is a common knowledge that rounding half-to-even is what users want in most cases, but it is a tiny bit more expensive in C. In Python the additional cost of such rounding is insignificant. Joshua Bronson added the comment: Thanks Serhiy, I read the Python-Dev thread you linked to, but that doesn't resolve the issues: - Its topic is Python 2.6 (where this behavior does not occur) rather than Python 3 (where it does). 
- A few messages into the thread Guido does address Python 3, but in fact says "I think the consensus is against round-to-even in 3.0" (see https://mail.python.org/pipermail/python-dev/2008-January/075897.html). - There is no discussion of the fact that this behavior differs from the function named "round" in all the other programming languages I mentioned, and whether it would therefore be better exposed as an additional function (e.g. "round_to_even" or "round_unbiased", and in the math or statistics package rather than builtins). Surprisingly, Excel is the only other programming environment I saw discussed in the thread. (And round(2.5) == 3 there.) So that all suggests there must be some other thread or issue where this change for Python 3 have been discussed, but I looked again and could not find it. The C "rint" example you gave just seems to prove the point that this behavior should have a distinct name from "round". Regarding: > It is a common knowledge that rounding half-to-even is what users want in most cases I don't think that's common knowledge; seems like citation needed? Based on all the other languages where this differs (not to mention Python 2), it's not clear users would want Python 3 to be the only different one. And this is definitely a surprise for the majority of programmers, whose experience with "round" is how it works everywhere else. (This is making it into pywat after all: https://github.com/cosmologicon/pywat/pull/40) I can submit a PR for at least updating the docs about this (as per my previous comment) if that would be welcomed. From phoenix1987 at gmail.com Wed Sep 26 17:01:17 2018 From: phoenix1987 at gmail.com (Gabriele) Date: Wed, 26 Sep 2018 22:01:17 +0100 Subject: [Python-Dev] What is the purpose of the _PyThreadState_Current symbol in Python 3? Message-ID: In trying to find the location of a valid instance of PyInterpreterState in the virtual memory of a running Python (3.6) application (using process_vm_read on Linux), I have noticed that I can only rely on _PyThreadState_Current.interp at the very beginning of the execution. If I try to attach to a running Python process, then _PythreadState_Current.interp doesn't seem to point to anything useful to derive the currently running threads and the frame stacks for each of them. This makes me wonder about the purpose of this symbol in the .dynsym section. Apart from a brute force approach for finding a valid PyInterpreterState, is there a more reliable approach for the version of Python that I'm targeting? Thanks, Gabriele -------------- next part -------------- An HTML attachment was scrubbed... URL: From nad at python.org Wed Sep 26 22:21:30 2018 From: nad at python.org (Ned Deily) Date: Wed, 26 Sep 2018 22:21:30 -0400 Subject: [Python-Dev] [RELEASE] Python 3.7.1rc1 and 3.6.7rc1 now available for testing Message-ID: <5C0A8514-FE4D-46DB-A4A3-8EC5F36D8F9B@python.org> Python 3.7.1rc1 and 3.6.7rc1 are now available. 3.7.1rc1 is the release preview of the first maintenance release of Python 3.7, the latest feature release of Python. 3.6.7rc1 is the release preview of the next maintenance release of Python 3.6, the previous feature release of Python. Assuming no critical problems are found prior to 2018-10-06, no code changes are planned between these release candidates and the final releases. These release candidates are intended to give you the opportunity to test the new security and bug fixes in 3.7.1 and 3.6.7. 
We strongly encourage you to test your projects and report issues found to bugs.python.org as soon as possible. Please keep in mind that these are preview releases and, thus, their use is not recommended for production environments. You can find these releases and more information here: https://www.python.org/downloads/release/python-371rc1/ https://www.python.org/downloads/release/python-367rc1/ -- Ned Deily nad at python.org -- [] From turnbull.stephen.fw at u.tsukuba.ac.jp Wed Sep 26 23:34:05 2018 From: turnbull.stephen.fw at u.tsukuba.ac.jp (Stephen J. Turnbull) Date: Thu, 27 Sep 2018 12:34:05 +0900 Subject: [Python-Dev] Documenting the private C API (was Re: Questions about signal handling.) In-Reply-To: <458835EA-0924-4FF0-B1F3-EB2A162CBC9D@python.org> References: <105BF294-CA1E-43C4-9CE2-A71683371DDB@python.org> <458835EA-0924-4FF0-B1F3-EB2A162CBC9D@python.org> Message-ID: <23468.20397.232047.112263@turnbull.sk.tsukuba.ac.jp> Barry Warsaw writes: > I like the idea of an ?internals? C API documentation, separate > from the public API. FWIW, this worked well for XEmacs ("came for flamewars, stayed for the internals manual"). Much of the stuff we inherited from GNU only got documented when there was massive bugginess (yes, that happened ;-), but we were pretty good at getting *something* (if only a "Here be Dragons" in Portuguese) in for most new global identifiers, and that was generally enough. From steve at pearwood.info Thu Sep 27 01:29:59 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 27 Sep 2018 15:29:59 +1000 Subject: [Python-Dev] Change in Python 3's "round" behavior In-Reply-To: References: <1519658635.56.0.467229070634.issue32956@psf.upfronthosting.co.za> <1537960170.25.0.545547206417.issue32956@psf.upfronthosting.co.za> Message-ID: <20180927052958.GC19437@ando.pearwood.info> On Wed, Sep 26, 2018 at 07:26:17AM -0400, jab at math.brown.edu wrote: > I did find the revealingly-invalid bug report > https://bugs.python.org/issue32956 ("python 3 round bug"), so I asked > there, but wanted to invite anyone else on this list who might be > interested to help. What about those of us who are interested in hindering? But seriously, even if round-to-even was a mistake, Python 3.x has used it for seven releases now, about a decade. Backwards compatibility means we cannot just change it. By now, there are people relying on this behaviour. Changing it would need to go through a deprecation cycle, which probably means one release with a silent warning, a second release with warning enabled, and not until 3.10 would the default change. That's a lot of inconvenience just for the sake of almost-but-not-quite matching the behaviour of some other programming languages, while breaking compatibility with others: julia> round(2.5) 2.0 julia> round(3.5) 4.0 In any case, I would oppose any proposal to revert this change. Round- to-even ("banker's rounding") is generally mathematically better, and its been said (half in jest) that if you're not using it, you're probably up to shenanigans :-) For users who don't specifically care about the rounding mode, round-to- even generally makes the safest default, even if it is surprising to those used to the naive technique taught in primary schools. For those who care about compatibility with some other language, well, there are a lot of languages and we can't match them *all* by default: # Javascript js> Math.round(-2.5) -2 # Ruby irb(main):001:0> (-2.5).round() => -3 so you probably need your own custom round function. 
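A minimal sketch of such a custom round function, built on the decimal
module's ROUND_HALF_UP mode (ties away from zero, which matches Ruby and
C's round() but not Javascript's); the names and the rough bias check at
the end are illustrative only:

```
from decimal import Decimal, ROUND_HALF_UP

def round_half_away(x, ndigits=0):
    # Ties round away from zero, like Ruby's round() and C's round().
    # Decimal(str(x)) treats 2.5 as the literal 2.5 rather than its
    # exact binary approximation.
    exp = Decimal(1).scaleb(-ndigits)   # ndigits=2 -> Decimal('0.01')
    return float(Decimal(str(x)).quantize(exp, rounding=ROUND_HALF_UP))

assert round_half_away(2.5) == 3.0
assert round_half_away(-2.5) == -3.0   # Javascript's Math.round gives -2 here

# Rough illustration of the upward drift of always rounding .5 up,
# summing 0.0, 0.1, ..., 99.9 after rounding each value to an integer:
vals = [n / 10 for n in range(1000)]
print(sum(round(v) for v in vals) - sum(vals))            # approximately 0
print(sum(round_half_away(v) for v in vals) - sum(vals))  # approximately +50
```

Any of the other decimal rounding modes can be dropped in the same way to
get a different tie-breaking rule.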
On the other hand, I wouldn't object out of hand to a feature request to support the same eight rounding modes as the decimal module. But as always, the Devil is in the details. -- Steve From greg.ewing at canterbury.ac.nz Thu Sep 27 01:55:07 2018 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 27 Sep 2018 17:55:07 +1200 Subject: [Python-Dev] Change in Python 3's "round" behavior In-Reply-To: References: <1519658635.56.0.467229070634.issue32956@psf.upfronthosting.co.za> <1537960170.25.0.545547206417.issue32956@psf.upfronthosting.co.za> Message-ID: <5BAC70BB.2040707@canterbury.ac.nz> jab at math.brown.edu wrote: > I understand from > https://github.com/cosmologicon/pywat/pull/40#discussion_r219962259 > that "to always round up... can theoretically skew the data" *Very* theoretically. If the number is even a whisker bigger than 2.5 it's going to get rounded up regardless: >>> round(2.500000000000001) 3 That difference is on the order of the error you expect from representing decimal fractions in binary, so I would be surprised if anyone can actually measure this bias in a real application. >>It is a common knowledge that rounding half-to-even is what users want in most cases > > I don't think that's common knowledge; seems like citation needed? It's not common enough for me to have heard of it before. (BTW, how do you provide a citation for "common knowledge"?-) -- Greg From steve at holdenweb.com Thu Sep 27 06:50:53 2018 From: steve at holdenweb.com (Steve Holden) Date: Thu, 27 Sep 2018 11:50:53 +0100 Subject: [Python-Dev] Questions about signal handling. In-Reply-To: <20180925171911.3495105f@fsol> References: <20180925171911.3495105f@fsol> Message-ID: I'm afraid Kristjan left CCP some time ago, and may not subscribe to this list any more. Steve Holden On Tue, Sep 25, 2018 at 4:23 PM Antoine Pitrou wrote: > On Tue, 25 Sep 2018 09:09:26 -0600 > Eric Snow wrote: > > On Tue, Sep 25, 2018 at 1:45 AM Victor Stinner > wrote: > > > Please don't rely on this ugly API. *By design*, Py_AddPendingCall() > > > tries 100 times to acquire the lock: if it fails to acquire the lock, > > > it does notthing... your callback is ignored... > > > > Yeah, there are issues with pending calls as implemented. > > Furthermore, I'm not clear on why it was made a public API in the > > first place. > > I don't know, but I think Eve Online used the API at some point (not > sure they're still Python-based nowadays). Perhaps Kristj?n may > confirm if he's reading this. > > Regards > > Antoine. > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/steve%40holdenweb.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Thu Sep 27 09:53:33 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 27 Sep 2018 23:53:33 +1000 Subject: [Python-Dev] Change in Python 3's "round" behavior In-Reply-To: <5BAC70BB.2040707@canterbury.ac.nz> References: <1519658635.56.0.467229070634.issue32956@psf.upfronthosting.co.za> <1537960170.25.0.545547206417.issue32956@psf.upfronthosting.co.za> <5BAC70BB.2040707@canterbury.ac.nz> Message-ID: <20180927135327.GE19437@ando.pearwood.info> On Thu, Sep 27, 2018 at 05:55:07PM +1200, Greg Ewing wrote: > jab at math.brown.edu wrote: > >I understand from > >https://github.com/cosmologicon/pywat/pull/40#discussion_r219962259 > >that "to always round up... 
can theoretically skew the data" > > *Very* theoretically. If the number is even a whisker bigger than > 2.5 it's going to get rounded up regardless: > > >>> round(2.500000000000001) > 3 > > That difference is on the order of the error you expect from > representing decimal fractions in binary, so I would be surprised > if anyone can actually measure this bias in a real application. I think you may have misunderstood the nature of the bias. It's not about individual roundings and it definitely has nothing to do with binary representation. Any one round operation will introduce a bias. You had a number, say 2.3, and it gets rounded down to 2.0, introducing an error of -0.3. But if you have lots of rounds, some will round up, and some will round down, and we want the rounding errors to cancel. The errors *almost* cancel using the naive rounding algorithm as most of the digits pair up: .1 rounds down, error = -0.1 .9 rounds up, error = +0.1 .2 rounds down, error = -0.2 .8 rounds up, error = +0.2 etc. If each digit is equally likely, then on average they'll cancel and we're left with *almost* no overall error. The problem is that while there are four digits rounding down (.1 through .4) there are FIVE which round up (.5 through .9). Two digits don't pair up: .0 stays unchanged, error = 0 .5 always rounds up, error = +0.5 Given that for many purposes, our data is recorded only to a fixed number of decimal places, we're dealing with numbers like 0.5 rather than 0.5000000001, so this can become a real issue. Every ten rounding operations will introduce an average error of +0.05 instead of cancelling out. Rounding introduces a small but real bias. The most common (and, in many experts' opinion, the best default behaviour) is Banker's Rounding, or round-to-even. All the other digits round as per the usual rule, but .5 rounds UP half the time and DOWN the rest of the time: 0.5, 2.5, 3.5 etc round down, error = -0.5 1.5, 3.5, 5.5 etc round up, error = +0.5 thus on average the .5 digit introduces no error and the bias goes away. -- Steve From aixtools at felt.demon.nl Thu Sep 27 10:55:47 2018 From: aixtools at felt.demon.nl (Michael Felt) Date: Thu, 27 Sep 2018 16:55:47 +0200 Subject: [Python-Dev] [RELEASE] Python 3.7.1rc1 and 3.6.7rc1 now available for testing In-Reply-To: <5C0A8514-FE4D-46DB-A4A3-8EC5F36D8F9B@python.org> References: <5C0A8514-FE4D-46DB-A4A3-8EC5F36D8F9B@python.org> Message-ID: Not critical - but I note a difference between Python3 3.6.7 and 3.7.1 - no support for the configure option --with-openssl. On AIX I was able to run both configure and "make install" without incident. I also ran the "make test" command. v3.7.1: 9 tests failed again: ??? test_ctypes test_distutils test_httpservers test_importlib ??? test_site test_socket test_time? test_utf8_mode test_venv ? There are, for most of above, a PR for these waiting final review and merge. test_utf8_mode: I thought this was already merged. Will research. test_venv, test_site: new test failures (I am not familiar with). Will need more research. v3.6.1: 16 tests failed: ??? test_asyncio test_ctypes test_distutils test_ftplib test_httplib ??? test_httpservers test_importlib test_locale ??? test_multiprocessing_fork test_multiprocessing_forkserver ??? test_multiprocessing_spawn test_socket test_ssl test_strptime ??? test_time test_tools FYI: again, there are PR for many of these, but, for now, I'll assume they will not be considered for backport. FYI only. 
On 9/27/2018 4:21 AM, Ned Deily wrote: > Assuming no > critical problems are found prior to 2018-10-06, no code changes are > planned between these release candidates and the final releases. These > release candidates are intended to give you the opportunity to test the > new security and bug fixes in 3.7.1 and 3.6.7. We strongly encourage you > to test your projects and report issues found to bugs.python.org as soon > as possible. From levkivskyi at gmail.com Thu Sep 27 12:05:57 2018 From: levkivskyi at gmail.com (Ivan Levkivskyi) Date: Thu, 27 Sep 2018 17:05:57 +0100 Subject: [Python-Dev] Postponed annotations break inspection of dataclasses In-Reply-To: References: Message-ID: Do we have a b.p.o. issue about this? If no, then I would recommend to open one, so that we will not loose track of this. -- Ivan On Sat, 22 Sep 2018 at 16:32, David Hagen wrote: > The new postponed annotations have an unexpected interaction with > dataclasses. Namely, you cannot get the type hints of any of the data > classes methods. > > For example, I have some code that inspects the type parameters of a > class's `__init__` method. (The real use case is to provide a default > serializer for the class, but that is not important here.) > > ``` > from dataclasses import dataclass > from typing import get_type_hints > > class Foo: > pass > > @dataclass > class Bar: > foo: Foo > > print(get_type_hints(Bar.__init__)) > ``` > > In Python 3.6 and 3.7, this does what is expected; it prints `{'foo': > , 'return': }`. > > However, if in Python 3.7, I add `from __future__ import annotations`, > then this fails with an error: > > ``` > NameError: name 'Foo' is not defined > ``` > > I know why this is happening. The `__init__` method is defined in the > `dataclasses` module which does not have the `Foo` object in its > environment, and the `Foo` annotation is being passed to `dataclass` and > attached to `__init__` as the string `"Foo"` rather than as the original > object `Foo`, but `get_type_hints` for the new annotations only does a name > lookup in the module where `__init__` is defined not where the annotation > is defined. > > I know that the use of lambdas to implement PEP 563 was rejected for > performance reasons. I could be wrong, but I think this was motivated by > variable annotations because the lambda would have to be constructed each > time the function body ran. I was wondering if I could motivate storing the > annotations as lambdas in class bodies and function signatures, in which > the environment is already being captured and is code that usually only > runs once. > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/levkivskyi%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From eric at trueblade.com Thu Sep 27 12:28:57 2018 From: eric at trueblade.com (Eric V. Smith) Date: Thu, 27 Sep 2018 12:28:57 -0400 Subject: [Python-Dev] Postponed annotations break inspection of dataclasses In-Reply-To: References: Message-ID: Yes, it?s https://bugs.python.org/issue34776 -- Eric > On Sep 27, 2018, at 12:05 PM, Ivan Levkivskyi wrote: > > Do we have a b.p.o. issue about this? If no, then I would recommend to open one, so that we will not loose track of this. 
> > -- > Ivan > > > >> On Sat, 22 Sep 2018 at 16:32, David Hagen wrote: >> The new postponed annotations have an unexpected interaction with dataclasses. Namely, you cannot get the type hints of any of the data classes methods. >> >> For example, I have some code that inspects the type parameters of a class's `__init__` method. (The real use case is to provide a default serializer for the class, but that is not important here.) >> >> ``` >> from dataclasses import dataclass >> from typing import get_type_hints >> >> class Foo: >> pass >> >> @dataclass >> class Bar: >> foo: Foo >> >> print(get_type_hints(Bar.__init__)) >> ``` >> >> In Python 3.6 and 3.7, this does what is expected; it prints `{'foo': , 'return': }`. >> >> However, if in Python 3.7, I add `from __future__ import annotations`, then this fails with an error: >> >> ``` >> NameError: name 'Foo' is not defined >> ``` >> >> I know why this is happening. The `__init__` method is defined in the `dataclasses` module which does not have the `Foo` object in its environment, and the `Foo` annotation is being passed to `dataclass` and attached to `__init__` as the string `"Foo"` rather than as the original object `Foo`, but `get_type_hints` for the new annotations only does a name lookup in the module where `__init__` is defined not where the annotation is defined. >> >> I know that the use of lambdas to implement PEP 563 was rejected for performance reasons. I could be wrong, but I think this was motivated by variable annotations because the lambda would have to be constructed each time the function body ran. I was wondering if I could motivate storing the annotations as lambdas in class bodies and function signatures, in which the environment is already being captured and is code that usually only runs once. >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> https://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: https://mail.python.org/mailman/options/python-dev/levkivskyi%40gmail.com > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/eric%2Ba-python-dev%40trueblade.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From vstinner at redhat.com Thu Sep 27 16:44:37 2018 From: vstinner at redhat.com (Victor Stinner) Date: Thu, 27 Sep 2018 22:44:37 +0200 Subject: [Python-Dev] What is the purpose of the _PyThreadState_Current symbol in Python 3? In-Reply-To: References: Message-ID: Hi, Le mer. 26 sept. 2018 ? 23:27, Gabriele a ?crit : > In trying to find the location of a valid instance of PyInterpreterState in the virtual memory of a running Python (3.6) application (using process_vm_read on Linux), I understand that you are writing a debugger and you can only *read* modify, not execute code, right? > I have noticed that I can only rely on _PyThreadState_Current.interp at the very beginning of the execution. If I try to attach to a running Python process, then _PythreadState_Current.interp doesn't seem to point to anything useful to derive the currently running threads and the frame stacks for each of them. In the master branch, it's now _PyRuntime.gilstate.tstate_current. If you run time.sleep(3600) and look into _PyRuntime.gilstate.tstate_current using gdb, you can a NULL pointer (tstate_current=0) because Python releases the GIL.. 
In faulthandler, I call PyGILState_GetThisThreadState() from signal handlers to get the Python thread state of the current thread... But this one is implemented using PyThread_tss_get() (pthread_getspecific() on most platforms). Moreover, it returns NULL if the current thread is not a Python thread. There is also _PyGILState_GetInterpreterStateUnsafe() which gives access to the current Python interpreter: _PyRuntime.gilstate.autoInterpreterState. From the interpreter, you can use the linked list of thread states from interp->tstate_head. I hope that I helped :-) Obviously, when you access Python internals, the details change at each Python release... I described the master branch. Victor From phoenix1987 at gmail.com Fri Sep 28 10:18:51 2018 From: phoenix1987 at gmail.com (Gabriele) Date: Fri, 28 Sep 2018 15:18:51 +0100 Subject: [Python-Dev] What is the purpose of the _PyThreadState_Current symbol in Python 3? Message-ID: Hi Victor, > I understand that you are writing a debugger and you can only *read* > modify, not execute code, right? I'm working on a frame stack sampler that runs independently from the Python process. The project is "Austin" (https://github.com/P403n1x87/austin). Whilst I could, in principle, execute code with other system calls, I prefer not to in this case. > In the master branch, it's now _PyRuntime.gilstate.tstate_current. If > you run time.sleep(3600) and look into > _PyRuntime.gilstate.tstate_current using gdb, you can a NULL pointer > (tstate_current=0) because Python releases the GIL.. I would like my application to make as few assumptions as possible. The _PyRuntime symbol might not be available if all the symbols have been stripped out of the binaries. That's why I was trying to rely on _PyThreadState_Current, which is in the .dynsym section. Judging by the output of nm -D `which python3` (I'm on Python 3.6.6 at the moment) I cannot see anything more useful than that. My current strategy is to try and make something out of this symbol and then fall back to a brute force approach to scan the .bss section for valid PyInterpreterState instances (which works reliably well and is quite fast too, but a bit ugly). > There is also _PyGILState_GetInterpreterStateUnsafe() which gives > access to the current Python interpreter: > _PyRuntime.gilstate.autoInterpreterState. From the interpreter, you > can use the linked list of thread states from interp->tstate_head. > > I hope that I helped :-) Yes thanks! Your comment made me realise why I can use PyThreadState_Current at the very beginning, and it is because Python is going through the intensive startup process, which involves, among other things, the loading of frozen modules (I can clearly see most if not all the steps in the output of Austin, as mentioned in the repo's README). During this phase, the main (and only thread) holds the GIL and is quite busy doing stuff. The long-running applications that I was trying to attach to have very long wait periods where they sit idle waiting for a timer to trigger the next operations, that fire very quickly and put the threads back to sleep again. If this is what the _PyThreadState_Current is designed for, then I guess I cannot really rely on it, especially when attaching Austin to another process. Best regards, Gabriele -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ncoghlan at gmail.com Fri Sep 28 10:29:44 2018 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 29 Sep 2018 00:29:44 +1000 Subject: [Python-Dev] Documenting the private C API (was Re: Questions about signal handling.) In-Reply-To: References: <105BF294-CA1E-43C4-9CE2-A71683371DDB@python.org> <20180925161807.5e13e90e@fsol> Message-ID: On Wed, 26 Sep 2018 at 00:33, Barry Warsaw wrote: > > On Sep 25, 2018, at 10:18, Antoine Pitrou wrote: > > > > Not really. Many are just like "static" (i.e. module-private) > > functions, except that they need to be shared by two or three different > > C modules. It's definitely the case for _PyEval_SignalReceived(). > > Purely static functions which appear only in the file they are defined in are probably fine not to document, although I do still think we should take care to comment on their semantics and external behaviors (i.e. reference counting). But if they?re used in multiple C files, then I think they *can* deserve placement within the documentation. We run into this problem with the test.support helpers as well (we have more helpers than just those in the docs, but the others tend to rely on contributors and/or PR reviewers having looked at other tests that already use them). Fleshing out on the "internals" docs idea that some folks have mentioned: 1. Call it "Doc/_internals" and keep the leading underscore in the published docs 2. Use it to cover both C internals and Python internals (such as test.support) 3. Permit use of autodoc tools that we don't allow in the main docs (as these docs would be for CPython contributors, so the intended audience for the docs is the same as the audience for the code) 4. Potentially pull in some specific files and sections from the source code as literal include blocks (as per http://docutils.sourceforge.net/docs/ref/rst/directives.html#include) rather than rewriting them Cheers, Nick. P.S. While it wouldn't be usable directly, https://github.com/jnikula/hawkmoth at least demonstrates the principle of extracting Sphinx API docs from C source files. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From status at bugs.python.org Fri Sep 28 12:10:05 2018 From: status at bugs.python.org (Python tracker) Date: Fri, 28 Sep 2018 18:10:05 +0200 (CEST) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20180928161005.5DD0E56A89@psf.upfronthosting.co.za> ACTIVITY SUMMARY (2018-09-21 - 2018-09-28) Python tracker at https://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message. 
Issues counts and deltas: open 6781 (-14) closed 39803 (+80) total 46584 (+66) Open issues with patches: 2703 Issues opened (54) ================== #12782: Multiple context expressions do not support parentheses for co https://bugs.python.org/issue12782 reopened by lukasz.langa #28655: Tests altered the execution environment in isolated mode https://bugs.python.org/issue28655 reopened by vstinner #32528: Change base class for futures.CancelledError https://bugs.python.org/issue32528 reopened by yselivanov #34768: Add documentation explaining __init__.py in packages https://bugs.python.org/issue34768 opened by bkestelman #34769: _asyncgen_finalizer_hook running in wrong thread https://bugs.python.org/issue34769 opened by twisteroid ambassador #34771: test_ctypes failing on Linux SPARC64 https://bugs.python.org/issue34771 opened by kelledin-3 #34773: sqlite3 module inconsistently returning only some rows from a https://bugs.python.org/issue34773 opened by shankargopal #34774: IDLE: use theme colors for help viewer https://bugs.python.org/issue34774 opened by terry.reedy #34775: pathlib.PurePath division raises TypeError instead of returnin https://bugs.python.org/issue34775 opened by Roger Aiudi #34776: Postponed annotations break inspection of dataclasses https://bugs.python.org/issue34776 opened by drhagen #34778: Memoryview for column-major (f_contiguous) arrays from bytes i https://bugs.python.org/issue34778 opened by lgautier #34779: IDLE internals show up in tracebacks when returning objects th https://bugs.python.org/issue34779 opened by ppperry #34780: Hang on startup if stdin refers to a pipe with an outstanding https://bugs.python.org/issue34780 opened by izbyshev #34781: infinite waiting in multiprocessing.Pool https://bugs.python.org/issue34781 opened by coells #34782: Pdb crashes when code is executed in a mapping that does not d https://bugs.python.org/issue34782 opened by ppperry #34784: Heap-allocated StructSequences https://bugs.python.org/issue34784 opened by eelizondo #34785: pty.spawn -- auto-termination after child process is dead (a z https://bugs.python.org/issue34785 opened by jarryshaw #34788: ipaddress module fails on rfc4007 scoped IPv6 addresses https://bugs.python.org/issue34788 opened by Jeremy McMillan #34789: Make xml.sax.make_parser accept iterables https://bugs.python.org/issue34789 opened by adelfino #34790: Deprecate passing coroutine objects to asyncio.wait() https://bugs.python.org/issue34790 opened by yselivanov #34791: xml package does not obey sys.flags.ignore_environment https://bugs.python.org/issue34791 opened by christian.heimes #34792: Tutorial doesn''t discuss / and * function arguments https://bugs.python.org/issue34792 opened by diekhans #34793: Remove support for "with (await asyncio.lock):" https://bugs.python.org/issue34793 opened by yselivanov #34794: memory leak in TkApp:_createbytearray https://bugs.python.org/issue34794 opened by dtalkin #34795: loop.sock_recv failure because of delayed callback handling https://bugs.python.org/issue34795 opened by kyuupichan #34796: Tkinter scrollbar issues on Mac. https://bugs.python.org/issue34796 opened by terry.reedy #34797: Convert heapq to the argument clinic https://bugs.python.org/issue34797 opened by pablogsal #34798: pprint ignores the compact parameter for dicts https://bugs.python.org/issue34798 opened by Nicolas Hug #34799: When function in tracing returns None, tracing continues. 
https://bugs.python.org/issue34799 opened by fabioz #34800: email.contentmanager raises error when policy.max_line_length= https://bugs.python.org/issue34800 opened by silane #34801: codecs.getreader() splits lines containing control characters https://bugs.python.org/issue34801 opened by nascheme #34804: Repetition of 'for example' in documentation https://bugs.python.org/issue34804 opened by rarblack #34805: Explicitly specify `MyClass.__subclasses__()` returns classes https://bugs.python.org/issue34805 opened by pekka.klarck #34806: distutils tests fail with recent 3.7 branch https://bugs.python.org/issue34806 opened by doko #34807: pathlib.[r]glob fails when the toplevel directory is not reada https://bugs.python.org/issue34807 opened by Antony.Lee #34810: Maximum and minimum value of C types integers from Python https://bugs.python.org/issue34810 opened by scls #34811: test_gdb fails with latest gdb https://bugs.python.org/issue34811 opened by cstratak #34812: support.args_from_interpreter_flags() doesn't inherit -I (isol https://bugs.python.org/issue34812 opened by vstinner #34814: makesetup: must link C extensions to libpython when compiled i https://bugs.python.org/issue34814 opened by vstinner #34816: ctypes + hasattr https://bugs.python.org/issue34816 opened by lfriedri #34817: Ellipsis docs has extra dot in the markdown that makes it look https://bugs.python.org/issue34817 opened by xtreak #34818: test.test_ssl.ThreadedTests.test_tls1_3 fails in 2.7 with Attr https://bugs.python.org/issue34818 opened by xnox #34820: binascii.c:1578:1: error: the control flow of function ???bina https://bugs.python.org/issue34820 opened by wencan #34821: Crash after run Python interpreter from removed directory https://bugs.python.org/issue34821 opened by ?????????? ?????????????? 
#34822: Simplify AST for slices https://bugs.python.org/issue34822 opened by serhiy.storchaka #34823: libffi detection doesn???t work in my setup https://bugs.python.org/issue34823 opened by stapelberg #34824: _ssl.c: Possible null pointer dereference https://bugs.python.org/issue34824 opened by ZackerySpytz #34825: Add more entries to os module to pathlib reference table https://bugs.python.org/issue34825 opened by xtreak #34826: io.BufferedReader crashes in 2.7 on memoryview attribute acces https://bugs.python.org/issue34826 opened by gregory.p.smith #34828: sqlite.iterdump does not work for (most) databases with autoin https://bugs.python.org/issue34828 opened by itssme #34829: Add missing selection_ methods to tkinter Spinbox https://bugs.python.org/issue34829 opened by j-4321-i #34831: Asyncio Tutorial https://bugs.python.org/issue34831 opened by cjrh #34832: "Short circuiting" in base64's b64decode, decode, decodebytes https://bugs.python.org/issue34832 opened by pw.michael.harris #34833: [CI] Azure Pipeline: Initialize Agent failed https://bugs.python.org/issue34833 opened by vstinner Most recent 15 issues with no replies (15) ========================================== #34832: "Short circuiting" in base64's b64decode, decode, decodebytes https://bugs.python.org/issue34832 #34831: Asyncio Tutorial https://bugs.python.org/issue34831 #34829: Add missing selection_ methods to tkinter Spinbox https://bugs.python.org/issue34829 #34825: Add more entries to os module to pathlib reference table https://bugs.python.org/issue34825 #34824: _ssl.c: Possible null pointer dereference https://bugs.python.org/issue34824 #34823: libffi detection doesn???t work in my setup https://bugs.python.org/issue34823 #34822: Simplify AST for slices https://bugs.python.org/issue34822 #34812: support.args_from_interpreter_flags() doesn't inherit -I (isol https://bugs.python.org/issue34812 #34807: pathlib.[r]glob fails when the toplevel directory is not reada https://bugs.python.org/issue34807 #34805: Explicitly specify `MyClass.__subclasses__()` returns classes https://bugs.python.org/issue34805 #34801: codecs.getreader() splits lines containing control characters https://bugs.python.org/issue34801 #34799: When function in tracing returns None, tracing continues. https://bugs.python.org/issue34799 #34797: Convert heapq to the argument clinic https://bugs.python.org/issue34797 #34796: Tkinter scrollbar issues on Mac. 
https://bugs.python.org/issue34796 #34785: pty.spawn -- auto-termination after child process is dead (a z https://bugs.python.org/issue34785 Most recent 15 issues waiting for review (15) ============================================= #34829: Add missing selection_ methods to tkinter Spinbox https://bugs.python.org/issue34829 #34828: sqlite.iterdump does not work for (most) databases with autoin https://bugs.python.org/issue34828 #34825: Add more entries to os module to pathlib reference table https://bugs.python.org/issue34825 #34824: _ssl.c: Possible null pointer dereference https://bugs.python.org/issue34824 #34822: Simplify AST for slices https://bugs.python.org/issue34822 #34818: test.test_ssl.ThreadedTests.test_tls1_3 fails in 2.7 with Attr https://bugs.python.org/issue34818 #34814: makesetup: must link C extensions to libpython when compiled i https://bugs.python.org/issue34814 #34800: email.contentmanager raises error when policy.max_line_length= https://bugs.python.org/issue34800 #34797: Convert heapq to the argument clinic https://bugs.python.org/issue34797 #34794: memory leak in TkApp:_createbytearray https://bugs.python.org/issue34794 #34791: xml package does not obey sys.flags.ignore_environment https://bugs.python.org/issue34791 #34790: Deprecate passing coroutine objects to asyncio.wait() https://bugs.python.org/issue34790 #34789: Make xml.sax.make_parser accept iterables https://bugs.python.org/issue34789 #34785: pty.spawn -- auto-termination after child process is dead (a z https://bugs.python.org/issue34785 #34784: Heap-allocated StructSequences https://bugs.python.org/issue34784 Top 10 most discussed issues (10) ================================= #34751: Hash collisions for tuples https://bugs.python.org/issue34751 56 msgs #34814: makesetup: must link C extensions to libpython when compiled i https://bugs.python.org/issue34814 18 msgs #22490: Using realpath for __PYVENV_LAUNCHER__ makes Homebrew installs https://bugs.python.org/issue22490 12 msgs #32892: Remove specific constant AST types in favor of ast.Constant https://bugs.python.org/issue32892 12 msgs #34162: idlelib/NEWS.txt for 3.8.0 (and backports) https://bugs.python.org/issue34162 9 msgs #34521: Multiple tests (test_socket, test_multiprocessing_*) fail due https://bugs.python.org/issue34521 9 msgs #34781: infinite waiting in multiprocessing.Pool https://bugs.python.org/issue34781 9 msgs #34806: distutils tests fail with recent 3.7 branch https://bugs.python.org/issue34806 9 msgs #28655: Tests altered the execution environment in isolated mode https://bugs.python.org/issue28655 8 msgs #31405: shutil.which doesn't find files without PATHEXT extension on W https://bugs.python.org/issue31405 7 msgs Issues closed (79) ================== #5950: Make zipimport work with zipfile containing comments https://bugs.python.org/issue5950 closed by barry #12458: Tracebacks should contain the first line of continuation lines https://bugs.python.org/issue12458 closed by serhiy.storchaka #16360: argparse: comma in metavar causes assertion failure when forma https://bugs.python.org/issue16360 closed by paul.j3 #23584: test_doctest lineendings fails in verbose mode https://bugs.python.org/issue23584 closed by xtreak #24937: Multiple problems in getters & setters in capsulethunk.h https://bugs.python.org/issue24937 closed by petr.viktorin #24997: mock.call_args compares as equal and not equal https://bugs.python.org/issue24997 closed by berker.peksag #26000: Crash in Tokenizer - Heap-use-after-free https://bugs.python.org/issue26000 
closed by xtreak #26144: test_pkg test_4 and/or test_7 sometimes fail https://bugs.python.org/issue26144 closed by xtreak #26452: Wrong line number attributed to comprehension expressions https://bugs.python.org/issue26452 closed by xtreak #28418: Raise Deprecation warning for tokenize.generate_tokens https://bugs.python.org/issue28418 closed by xtreak #30350: devguide suggests to use VS 2008 to build Python 2.7, but VS 2 https://bugs.python.org/issue30350 closed by vstinner #30964: Mention ensurepip in package installation docs https://bugs.python.org/issue30964 closed by brett.cannon #31007: ERROR: test_pipe_handle (test.test_asyncio.test_windows_utils. https://bugs.python.org/issue31007 closed by vstinner #31425: Expose AF_QIPCRTR in socket module https://bugs.python.org/issue31425 closed by taleinat #31511: test_normalization: test.support.open_urlresource() doesn't ha https://bugs.python.org/issue31511 closed by vstinner #31535: configparser unable to write comment with a upper cas letter https://bugs.python.org/issue31535 closed by xtreak #31737: Documentation renders incorrectly https://bugs.python.org/issue31737 closed by mdk #31837: ParseError in test_all_project_files() https://bugs.python.org/issue31837 closed by xtreak #31986: [2.7] test_urllib2net.test_sites_no_connection_close() randoml https://bugs.python.org/issue31986 closed by vstinner #32117: Tuple unpacking in return and yield statements https://bugs.python.org/issue32117 closed by gvanrossum #32282: When using a Windows XP compatible toolset, `socketmodule.c` f https://bugs.python.org/issue32282 closed by steve.dower #32533: SSLSocket read/write thread-unsafety https://bugs.python.org/issue32533 closed by steve.dower #32552: Improve text for file arguments in argparse.ArgumentDefaultsHe https://bugs.python.org/issue32552 closed by paul.j3 #32556: support bytes paths in nt _getdiskusage, _getvolumepathname, a https://bugs.python.org/issue32556 closed by steve.dower #32557: allow shutil.disk_usage to take a file path on Windows also https://bugs.python.org/issue32557 closed by steve.dower #32718: Install PowerShell activation scripts for venv for all platfor https://bugs.python.org/issue32718 closed by brett.cannon #33016: nt._getfinalpathname may use uninitialized memory https://bugs.python.org/issue33016 closed by steve.dower #33091: ssl.SSLError: Invalid error code (_ssl.c:2217) https://bugs.python.org/issue33091 closed by steve.dower #33180: Flag for unusable sys.executable https://bugs.python.org/issue33180 closed by steve.dower #33309: Unittest Mock objects do not freeze arguments they are called https://bugs.python.org/issue33309 closed by xtreak #33415: When add_mutually_exclusive_group is built without argument, t https://bugs.python.org/issue33415 closed by paul.j3 #33442: Python 3 doc sidebar dosnt follow page scrolling like 2.7 doc https://bugs.python.org/issue33442 closed by xtreak #33782: VSTS Windows-PR: internal error https://bugs.python.org/issue33782 closed by steve.dower #33871: Possible integer overflow in iov_setup() https://bugs.python.org/issue33871 closed by serhiy.storchaka #33937: test_socket: SendmsgSCTPStreamTest.testSendmsgTimeout() failed https://bugs.python.org/issue33937 closed by vstinner #33977: [Windows] test_compileall fails randomly with PermissionError: https://bugs.python.org/issue33977 closed by vstinner #34046: subparsers -> add_parser doesn't support hyphen char '-' https://bugs.python.org/issue34046 closed by paul.j3 #34188: Allow dict choices to "transform" values in argpagse 
https://bugs.python.org/issue34188 closed by paul.j3 #34223: PYTHONDUMPREFS=1 ./python -c pass does crash https://bugs.python.org/issue34223 closed by vstinner #34248: dbm errors should contain file names https://bugs.python.org/issue34248 closed by berker.peksag #34267: find_python.bat doesn't find installed Python 3.7 https://bugs.python.org/issue34267 closed by steve.dower #34320: Creating dict from OrderedDict doesn't preserve order https://bugs.python.org/issue34320 closed by inada.naoki #34372: Parenthesized expression has incorrect line numbers https://bugs.python.org/issue34372 closed by serhiy.storchaka #34472: zipfile: does not include optional descriptor signature https://bugs.python.org/issue34472 closed by serhiy.storchaka #34533: Apply PEP384 to _csv module https://bugs.python.org/issue34533 closed by berker.peksag #34537: test_gdb fails with LC_ALL=C https://bugs.python.org/issue34537 closed by vstinner #34539: namedtuple's exec() throws segmentation fault https://bugs.python.org/issue34539 closed by xtreak #34548: IDLE: Make TextView use the configured theme colors https://bugs.python.org/issue34548 closed by taleinat #34575: Python 3.6 compilation fails on AppVeyor: libeay.lib was creat https://bugs.python.org/issue34575 closed by vstinner #34582: VSTS builds should use jobs, pools, and test results https://bugs.python.org/issue34582 closed by steve.dower #34609: Importing certain modules while debugging raises an exception https://bugs.python.org/issue34609 closed by terry.reedy #34610: Incorrect iteration of Manager.dict() method of the multiproce https://bugs.python.org/issue34610 closed by serhiy.storchaka #34659: Inconsistency between functools.reduce & itertools.accumulate https://bugs.python.org/issue34659 closed by lisroach #34683: Caret positioned wrong for SyntaxError reported by ast.c https://bugs.python.org/issue34683 closed by gvanrossum #34687: asyncio: is it time to make ProactorEventLoop as the default e https://bugs.python.org/issue34687 closed by vstinner #34728: deprecate *loop* argument for asyncio.sleep https://bugs.python.org/issue34728 closed by willingc #34734: Azure linux buildbot failure https://bugs.python.org/issue34734 closed by xtreak #34736: Confusing base64.b64decode output https://bugs.python.org/issue34736 closed by taleinat #34744: New %(flag)s format specifier for argparse.add_argument help s https://bugs.python.org/issue34744 closed by paul.j3 #34759: Possible regression in ssl module in 3.7.1 and master https://bugs.python.org/issue34759 closed by njs #34760: Regression in abc in combination with passing a function to is https://bugs.python.org/issue34760 closed by levkivskyi #34761: str(super()) != super().__str__() https://bugs.python.org/issue34761 closed by eric.snow #34762: Change contextvars C API to use PyObject https://bugs.python.org/issue34762 closed by yselivanov #34770: pyshellext.cpp: Possible null pointer dereference https://bugs.python.org/issue34770 closed by xiang.zhang #34772: Python will suddenly not plot https://bugs.python.org/issue34772 closed by ammar2 #34777: urllib.request accepts anything as a header parameter for some https://bugs.python.org/issue34777 closed by xtreak #34783: [3.7] segmentation-fault/core dump when try to run non-existin https://bugs.python.org/issue34783 closed by vstinner #34786: ProcessPoolExecutor documentation reports wrong exception bein https://bugs.python.org/issue34786 closed by xiang.zhang #34787: imghdr raise TypeError for PNG https://bugs.python.org/issue34787 closed by 
christian.heimes #34802: asyncio.iscoroutine() documentation is wrong https://bugs.python.org/issue34802 closed by yselivanov #34803: argparse int type does not accept scientific notation https://bugs.python.org/issue34803 closed by paul.j3 #34808: bytes[0] != bytes[0:1] https://bugs.python.org/issue34808 closed by ucyo #34809: On MacOSX with 3.7 python getting "Symbol not found: _PyString https://bugs.python.org/issue34809 closed by zach.ware #34813: child process disappears when removing a print statement after https://bugs.python.org/issue34813 closed by calimeroteknik #34815: Change Py_Ellipse __str__ behavior. https://bugs.python.org/issue34815 closed by photofone #34819: Executor.map and as_completed timeouts are able to deviate https://bugs.python.org/issue34819 closed by pitrou #34827: Make argparse.NameSpace iterable https://bugs.python.org/issue34827 closed by serhiy.storchaka #34830: functools.partial is weak referncable https://bugs.python.org/issue34830 closed by mdk #1529353: Squeezer - squeeze large output in IDLE's shell https://bugs.python.org/issue1529353 closed by terry.reedy From seanharr11 at gmail.com Fri Sep 28 17:07:33 2018 From: seanharr11 at gmail.com (Sean Harrington) Date: Fri, 28 Sep 2018 17:07:33 -0400 Subject: [Python-Dev] bpo-34837: Multiprocessing.Pool API Extension - Pass Data to Workers w/o Globals Message-ID: I am proposing an extension to the multiprocessing.Pool API that allows for an alternative way to pass data to Pool worker processes, *without* using globals. A PR has been opened , extensive test coverage is also included, with all tests & CI passing on github. Please see this blog post for details, motivation, and use cases of the API extension before reading on. In *short*, the implementation of the feature works as follows: 1. Exposes a kwarg on Pool.__init__ called `expect_initret`, that defaults to False. When set to True: 1. Capture the return value of the initializer kwarg of Pool 2. Pass this value to the function being applied, as a kwarg. Again, in *short,* the motivation of the feature is to provide an explicit "flow of data" from parent process to worker process, and to avoid being *forced* to using the *global* keyword in initializer, or being *forced* to create global variables in the parent process. The interface is 100% backwards compatible through Python3.x (and perhaps beyond). -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Fri Sep 28 18:11:49 2018 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 28 Sep 2018 15:11:49 -0700 Subject: [Python-Dev] What is the purpose of the _PyThreadState_Current symbol in Python 3? In-Reply-To: References: Message-ID: What information do you wish the interpreter provided, that would make your program simpler and more reliable? On Fri, Sep 28, 2018, 07:21 Gabriele wrote: > Hi Victor, > > > I understand that you are writing a debugger and you can only *read* > > modify, not execute code, right? > > I'm working on a frame stack sampler that runs independently from the > Python process. The project is "Austin" > (https://github.com/P403n1x87/austin). Whilst I could, in principle, > execute code with other system calls, I prefer not to in this case. > > > In the master branch, it's now _PyRuntime.gilstate.tstate_current. If > > you run time.sleep(3600) and look into > > _PyRuntime.gilstate.tstate_current using gdb, you can a NULL pointer > > (tstate_current=0) because Python releases the GIL.. 
> > I would like my application to make as few assumptions as possible. > The _PyRuntime symbol might not be available if all the symbols have > been stripped out of the binaries. That's why I was trying to rely on > _PyThreadState_Current, which is in the .dynsym section. Judging by > the output of nm -D `which python3` (I'm on Python 3.6.6 at the > moment) I cannot see anything more useful than that. > > My current strategy is to try and make something out of this symbol > and then fall back to a brute force approach to scan the .bss section > for valid PyInterpreterState instances (which works reliably well and > is quite fast too, but a bit ugly). > > > There is also _PyGILState_GetInterpreterStateUnsafe() which gives > > access to the current Python interpreter: > > _PyRuntime.gilstate.autoInterpreterState. From the interpreter, you > > can use the linked list of thread states from interp->tstate_head. > > > > I hope that I helped :-) > > Yes thanks! Your comment made me realise why I can use > PyThreadState_Current at the very beginning, and it is because Python > is going through the intensive startup process, which involves, among > other things, the loading of frozen modules (I can clearly see most if > not all the steps in the output of Austin, as mentioned in the repo's > README). During this phase, the main (and only thread) holds the GIL > and is quite busy doing stuff. The long-running applications that I > was trying to attach to have very long wait periods where they sit > idle waiting for a timer to trigger the next operations, that fire > very quickly and put the threads back to sleep again. > > If this is what the _PyThreadState_Current is designed for, then I > guess I cannot really rely on it, especially when attaching Austin to > another process. > > Best regards, > Gabriele > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/njs%40pobox.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From phoenix1987 at gmail.com Fri Sep 28 18:29:15 2018 From: phoenix1987 at gmail.com (Gabriele) Date: Fri, 28 Sep 2018 23:29:15 +0100 Subject: [Python-Dev] What is the purpose of the _PyThreadState_Current symbol in Python 3? In-Reply-To: References: Message-ID: On Fri, 28 Sep 2018 at 23:12, Nathaniel Smith wrote: > What information do you wish the interpreter provided, that would make your program simpler and more reliable? An exported global variable that points to the head of the PyInterpreterState linked list (i.e. the return value of PyInterpreterState_Head). This way my program could just look this up from the dynsym section instead of scanning a dump of the bss section in memory to find a possible candidate. It would be grand if also the string in the rodata section that gives the Python version could be dereferenced from dynsym, but that's a different question. From solipsis at pitrou.net Fri Sep 28 18:44:54 2018 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 29 Sep 2018 00:44:54 +0200 Subject: [Python-Dev] bpo-34837: Multiprocessing.Pool API Extension - Pass Data to Workers w/o Globals References: Message-ID: <20180929004454.3715f0e8@fsol> Hi, On Fri, 28 Sep 2018 17:07:33 -0400 Sean Harrington wrote: > > In *short*, the implementation of the feature works as follows: > > 1. Exposes a kwarg on Pool.__init__ called `expect_initret`, that > defaults to False. 
When set to True: > 1. Capture the return value of the initializer kwarg of Pool > 2. Pass this value to the function being applied, as a kwarg. > > Again, in *short,* the motivation of the feature is to provide an explicit > "flow of data" from parent process to worker process, and to avoid being > *forced* to using the *global* keyword in initializer, or being *forced* to > create global variables in the parent process. Thanks for taking the time to explain your use case and write a proposal. My reactions to this are: 1. The proposed API is ugly. This basically allows you to pass an argument which changes with which arguments another function is later called... 2. A global variable seems like the adequate way to represent a process-global object (which is exactly your use case). 3. If you don't like globals, you could probably do something like lazily-initialize the resource when a function needing it is executed; this also avoids creating the resource if the child doesn't use it at all. Would that work for you? As a more general remark, I understand the desire to make the Pool object more flexible, but we can also not pile up features until it satisfies all use cases. As another general remark, concurrent.futures is IMHO the preferred API for the future, and where feature work should probably concentrate. Regards Antoine. From seanharr11 at gmail.com Fri Sep 28 19:23:06 2018 From: seanharr11 at gmail.com (Sean Harrington) Date: Fri, 28 Sep 2018 19:23:06 -0400 Subject: [Python-Dev] bpo-34837: Multiprocessing.Pool API Extension - Pass Data to Workers w/o Globals In-Reply-To: <20180929004454.3715f0e8@fsol> References: <20180929004454.3715f0e8@fsol> Message-ID: Hi Antoine - see inline below for my response...thanks for your time! On Fri, Sep 28, 2018 at 6:45 PM Antoine Pitrou wrote: > > Hi, > > On Fri, 28 Sep 2018 17:07:33 -0400 > Sean Harrington wrote: > > > > In *short*, the implementation of the feature works as follows: > > > > 1. Exposes a kwarg on Pool.__init__ called `expect_initret`, that > > defaults to False. When set to True: > > 1. Capture the return value of the initializer kwarg of Pool > > 2. Pass this value to the function being applied, as a kwarg. > > > > Again, in *short,* the motivation of the feature is to provide an > explicit > > "flow of data" from parent process to worker process, and to avoid being > > *forced* to using the *global* keyword in initializer, or being *forced* > to > > create global variables in the parent process. > > Thanks for taking the time to explain your use case and write a > proposal. > > My reactions to this are: > > 1. The proposed API is ugly. This basically allows you to pass an > argument which changes with which arguments another function is later > called... > Yes I agree that this is a not-perfect contract, but isn't this also a concern with the current implementation? And isn't this pattern arguably more explicit than "The function-being-applied relying on the initializer to create a global variable from within it's lexical scope"? 2. A global variable seems like the adequate way to represent a > process-global object (which is exactly your use case) > There is nothing wrong with using a global variable, especially in nearly every toy example found on the internet of using multiprocessing.Pool (i.e. optimizing a simple script). But what happens when you have lots of nested function calls in your applied function? 
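For concreteness, the pattern in question looks roughly like this. It is
only a minimal sketch, and the names (expensive_setup, init_worker,
process_record) are hypothetical; the initializer hands its result to the
mapped function by rebinding a module-level global:

import multiprocessing

_resource = None  # worker-process "global", populated once per worker

def expensive_setup(config):
    # Stand-in for opening a DB connection, loading a model, etc.
    return {"config": config}

def init_worker(config):
    # Runs once in each worker process and stashes its result in the
    # module-level name above.
    global _resource
    _resource = expensive_setup(config)

def process_record(record):
    # Reads the worker "global" set up by init_worker().
    return (record, _resource["config"])

if __name__ == "__main__":
    with multiprocessing.Pool(initializer=init_worker,
                              initargs=("some-config",)) as pool:
        print(pool.map(process_record, range(3)))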
My simple argument is that the developer should not be constrained to make the objects passed globally available in the process, as this MAY break encapsulation for large projects. 3. If you don't like globals, you could probably do something like > lazily-initialize the resource when a function needing it is executed; > this also avoids creating the resource if the child doesn't use it at > all. Would that work for you? > > I have nothing against globals, my gripe is with being enforced to use them for every Pool use case. Further, if initializing the resource is expensive, we only want to do this ONE time per worker process. So no, this will not ~always~ work. > As a more general remark, I understand the desire to make the Pool > object more flexible, but we can also not pile up features until it > satisfies all use cases. > > I understand that this is a legitimate concern, but this is about API approachability. Python end-users of Pool are forced to declare a global from a lexical scope. Most Python end-users probably don't even know this is possible. Sure, this is adding a feature for a use case that I outlined, but really this is one of the two major use cases of "initializer" and "initargs" (see my blog post for the 2 generalized use cases ), not some obscure use case. This is making that *very common* use case more approachable. > As another general remark, concurrent.futures is IMHO the preferred API > for the future, and where feature work should probably concentrate. > > This is good to hear and know. And will keep this mind moving forward! > Regards > > Antoine. > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/seanharr11%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mike at selik.org Fri Sep 28 21:27:13 2018 From: mike at selik.org (Michael Selik) Date: Fri, 28 Sep 2018 18:27:13 -0700 Subject: [Python-Dev] bpo-34837: Multiprocessing.Pool API Extension - Pass Data to Workers w/o Globals In-Reply-To: References: Message-ID: On Fri, Sep 28, 2018 at 2:11 PM Sean Harrington wrote: > kwarg on Pool.__init__ called `expect_initret`, that defaults to False. When set to True: > Capture the return value of the initializer kwarg of Pool > Pass this value to the function being applied, as a kwarg. The parameter name you chose, "initret" is awkward, because nowhere else in Python does an initializer return a value. Initializers mutate an encapsulated scope. For a class __init__, that scope is an instance's attributes. For a subprocess managed by Pool, that encapsulated scope is its "globals". I'm using quotes to emphasize that these "globals" aren't shared. On Fri, Sep 28, 2018 at 4:39 PM Sean Harrington wrote: > On Fri, Sep 28, 2018 at 6:45 PM Antoine Pitrou wrote: >> 3. If you don't like globals, you could probably do something like >> lazily-initialize the resource when a function needing it is executed > > if initializing the resource is expensive, we only want to do this ONE time per worker process. We must have a different concept of "lazily-initialize". I understood Antoine's suggestion to be a one-time initialize per worker process. 
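Something along these lines, as a sketch with hypothetical names: the
resource is built the first time a task in a given worker needs it, then
reused for the rest of that worker's lifetime.

import functools
import multiprocessing

@functools.lru_cache(maxsize=None)
def get_resource():
    # Built on the first call in each worker process, then cached, so
    # each worker pays the setup cost once, and only if it needs it.
    return {"connection": "stand-in for an expensive object"}

def process_record(record):
    resource = get_resource()  # lazy, one-time-per-worker initialization
    return (record, resource["connection"])

if __name__ == "__main__":
    with multiprocessing.Pool() as pool:
        print(pool.map(process_record, range(3)))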
On Fri, Sep 28, 2018 at 4:39 PM Sean Harrington wrote: > My simple argument is that the developer should not be constrained to make the objects passed globally available in the process, as this MAY break encapsulation for large projects. I could imagine someone switching from Pool to ThreadPool and getting into trouble, but in my mind using threads is caveat emptor. Are you worried about breaking encapsulation in a different scenario? From turnbull.stephen.fw at u.tsukuba.ac.jp Sat Sep 29 03:43:14 2018 From: turnbull.stephen.fw at u.tsukuba.ac.jp (Stephen J. Turnbull) Date: Sat, 29 Sep 2018 16:43:14 +0900 Subject: [Python-Dev] Change in Python 3's "round" behavior In-Reply-To: <5BAC70BB.2040707@canterbury.ac.nz> References: <1519658635.56.0.467229070634.issue32956@psf.upfronthosting.co.za> <1537960170.25.0.545547206417.issue32956@psf.upfronthosting.co.za> <5BAC70BB.2040707@canterbury.ac.nz> Message-ID: <23471.11538.998069.413468@turnbull.sk.tsukuba.ac.jp> Greg Ewing writes: > (BTW, how do you provide a citation for "common knowledge"?-) Aumann, Robert J. [1976], "Agreeing to Disagree." Annals of Statistics 4, pp. 1236-1239 is what I usually use. :-) From njs at pobox.com Sat Sep 29 06:00:27 2018 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 29 Sep 2018 03:00:27 -0700 Subject: [Python-Dev] What is the purpose of the _PyThreadState_Current symbol in Python 3? In-Reply-To: References: Message-ID: On Fri, Sep 28, 2018 at 3:29 PM, Gabriele wrote: > On Fri, 28 Sep 2018 at 23:12, Nathaniel Smith wrote: >> What information do you wish the interpreter provided, that would make your program simpler and more reliable? > > An exported global variable that points to the head of the > PyInterpreterState linked list (i.e. the return value of > PyInterpreterState_Head). This way my program could just look this up > from the dynsym section instead of scanning a dump of the bss section > in memory to find a possible candidate. Hmm, it looks like in 3.7+, _PyRuntime is marked PyAPI_DATA, which I think should make it exported from dynsym? https://github.com/python/cpython/blob/4b430e5f6954ef4b248e95bfb4087635dcdefc6d/Include/internal/pystate.h#L206 And PyInterpreterState_Head is just _PyRuntime.interpreters.head. So maybe this is already done... -n -- Nathaniel J. Smith -- https://vorpus.org From solipsis at pitrou.net Sat Sep 29 06:04:47 2018 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 29 Sep 2018 12:04:47 +0200 Subject: [Python-Dev] bpo-34837: Multiprocessing.Pool API Extension - Pass Data to Workers w/o Globals In-Reply-To: References: <20180929004454.3715f0e8@fsol> Message-ID: <20180929120447.6f7cd478@fsol> Hi Sean, On Fri, 28 Sep 2018 19:23:06 -0400 Sean Harrington wrote: > My simple argument is that the > developer should not be constrained to make the objects passed globally > available in the process, as this MAY break encapsulation for large > projects. IMHO, global variables don't break encapsulation if they remain private to the module where they actually play a role. Of course, there are also global-like alternatives to globals, such as class attributes... The multiprocessing module itself uses globals (or quasi-globals) internally for various implementation details. > 3. If you don't like globals, you could probably do something like > > lazily-initialize the resource when a function needing it is executed; > > this also avoids creating the resource if the child doesn't use it at > > all. Would that work for you? 
> > > > I have nothing against globals, my gripe is with being enforced to use > them for every Pool use case. Further, if initializing the resource is > expensive, we only want to do this ONE time per worker process. That's what I meant with lazy initialization: initialize it if not already done, otherwise just use the already-initialized resource. It's a common pattern. (you can view it as a 1-element cache if you prefer) > > As a more general remark, I understand the desire to make the Pool > > object more flexible, but we can also not pile up features until it > > satisfies all use cases. > > > > I understand that this is a legitimate concern, but this is about API > approachability. Python end-users of Pool are forced to declare a global > from a lexical scope. Most Python end-users probably don't even know this > is possible. Hmm... We might have a disagreement on the target audience of the multiprocessing module. multiprocessing isn't very high-level, I would expect it to be used by experienced programmers who know how to mutate a global variable from a lexical scope. For non-programmer end-users, such as data scientists, there are higher-level libraries such as Celery (http://www.celeryproject.org/) and Dask distributed (https://distributed.readthedocs.io/en/latest/). Perhaps it would be worth mentioning them in the documentation. Regards Antoine. From seanharr11 at gmail.com Sat Sep 29 08:13:19 2018 From: seanharr11 at gmail.com (Sean Harrington) Date: Sat, 29 Sep 2018 08:13:19 -0400 Subject: [Python-Dev] bpo-34837: Multiprocessing.Pool API Extension - Pass Data to Workers w/o Globals In-Reply-To: <20180929120447.6f7cd478@fsol> References: <20180929004454.3715f0e8@fsol> <20180929120447.6f7cd478@fsol> Message-ID: On Sat, Sep 29, 2018 at 6:24 AM Antoine Pitrou wrote: > > Hi Sean, > > On Fri, 28 Sep 2018 19:23:06 -0400 > Sean Harrington wrote: > > My simple argument is that the > > developer should not be constrained to make the objects passed globally > > available in the process, as this MAY break encapsulation for large > > projects. > > IMHO, global variables don't break encapsulation if they remain private > to the module where they actually play a role. > > Of course, there are also global-like alternatives to globals, such as > class attributes... The multiprocessing module itself uses globals (or > quasi-globals) internally for various implementation details. > >>> Yes, class attributes are a viable alternative. I've written about this here. Still, the argument is not against global variables, class attributes or any close cousins -- it is simply that developers shouldn't be forced to use these. > > 3. If you don't like globals, you could probably do something like > > > lazily-initialize the resource when a function needing it is executed; > > > this also avoids creating the resource if the child doesn't use it at > > > all. Would that work for you? > > > > > > I have nothing against globals, my gripe is with being enforced to > use > > them for every Pool use case. Further, if initializing the resource is > > expensive, we only want to do this ONE time per worker process. > > That's what I meant with lazy initialization: initialize it if not > already done, otherwise just use the already-initialized resource. > It's a common pattern. > > (you can view it as a 1-element cache if you prefer) > >>> Sorry - I wasn't following your initial suggestion. This is a valid solution for ONE of the general use cases (where we initialize objects in each worker post-fork). 
However it fails for the other Pool use case of "initializing a big object in your parent, and passing to each worker, without using globals." > > As a more general remark, I understand the desire to make the Pool > > > object more flexible, but we can also not pile up features until it > > > satisfies all use cases. > > > > > > I understand that this is a legitimate concern, but this is about API > > approachability. Python end-users of Pool are forced to declare a global > > from a lexical scope. Most Python end-users probably don't even know this > > is possible. > > Hmm... We might have a disagreement on the target audience of the > multiprocessing module. multiprocessing isn't very high-level, I would > expect it to be used by experienced programmers who know how to mutate > a global variable from a lexical scope. > >>> It is one thing to MUTATE a global from a lexical scope. No gripes there. The specific concept I'm referencing here, is "DECLARING a global variable, from within a lexical scope". This is not as a intuitive for most programmers. > For non-programmer end-users, such as data scientists, there are > higher-level libraries such as Celery (http://www.celeryproject.org/) > and Dask distributed (https://distributed.readthedocs.io/en/latest/). > Perhaps it would be worth mentioning them in the documentation. > >>> We likely do NOT have disagreements on the multiprocessing module. Multiprocessing is NOT high-level, I agree. But the beauty of the "Pool" API is that it gives non-programmer end-users (like data scientists) the ability to leverage multiple cores, without (in most cases) needing to know implementation details about multiprocessing. All they need to understand is the higher-order-function "map()", which is a very simple concept. (I even sound over-complicated myself calling it a "higher-order-function"...) > Regards > > Antoine. > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/seanharr11%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Sat Sep 29 08:17:59 2018 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 29 Sep 2018 14:17:59 +0200 Subject: [Python-Dev] bpo-34837: Multiprocessing.Pool API Extension - Pass Data to Workers w/o Globals In-Reply-To: References: <20180929004454.3715f0e8@fsol> <20180929120447.6f7cd478@fsol> Message-ID: <20180929141759.3d613436@fsol> On Sat, 29 Sep 2018 08:13:19 -0400 Sean Harrington wrote: > > > > Hmm... We might have a disagreement on the target audience of the > > multiprocessing module. multiprocessing isn't very high-level, I would > > expect it to be used by experienced programmers who know how to mutate > > a global variable from a lexical scope. > > > > >>> It is one thing to MUTATE a global from a lexical scope. No gripes > there. The specific concept I'm referencing here, is "DECLARING a global > variable, from within a lexical scope". This is not as a intuitive for most > programmers. Well, you don't have to. You can bind it to None in the top-level scope and then mutate it from the lexical scope: my_resource = None def do_work(): global my_resource my_resource = ... Regards Antoine. 
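The same bind-at-top-level-then-rebind idea can also be namespaced on a
class, which is the "class attributes" alternative mentioned earlier in
the thread. A sketch with hypothetical names:

import multiprocessing

class WorkerState:
    resource = None  # class attribute: global-like, but namespaced

def init_worker(arg):
    # No `global` statement is needed: the initializer mutates an
    # attribute of WorkerState rather than rebinding a module-level name.
    WorkerState.resource = {"data": arg}

def do_work(item):
    return (item, WorkerState.resource["data"])

if __name__ == "__main__":
    with multiprocessing.Pool(initializer=init_worker,
                              initargs=("shared",)) as pool:
        print(pool.map(do_work, range(3)))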
From seanharr11 at gmail.com Sat Sep 29 08:23:49 2018 From: seanharr11 at gmail.com (Sean Harrington) Date: Sat, 29 Sep 2018 08:23:49 -0400 Subject: [Python-Dev] bpo-34837: Multiprocessing.Pool API Extension - Pass Data to Workers w/o Globals In-Reply-To: References: Message-ID: On Fri, Sep 28, 2018 at 9:27 PM Michael Selik wrote: > On Fri, Sep 28, 2018 at 2:11 PM Sean Harrington > wrote: > > kwarg on Pool.__init__ called `expect_initret`, that defaults to False. > When set to True: > > Capture the return value of the initializer kwarg of Pool > > Pass this value to the function being applied, as a kwarg. > > The parameter name you chose, "initret" is awkward, because nowhere > else in Python does an initializer return a value. Initializers mutate > an encapsulated scope. For a class __init__, that scope is an > instance's attributes. For a subprocess managed by Pool, that > encapsulated scope is its "globals". I'm using quotes to emphasize > that these "globals" aren't shared. > >> Yes - if you bucket the "initializer" arg of Pool into the "Python initializers" then I see your point here. And yes initializer mutates the global scope of the worker subprocess. Again, my gripe is not with globals. I am looking for the ability to have a clear, explicit flow of data from parent -> child process, without being constrained to using globals. > > On Fri, Sep 28, 2018 at 4:39 PM Sean Harrington > wrote: > > On Fri, Sep 28, 2018 at 6:45 PM Antoine Pitrou > wrote: > >> 3. If you don't like globals, you could probably do something like > >> lazily-initialize the resource when a function needing it is executed > > > > if initializing the resource is expensive, we only want to do this ONE > time per worker process. > > We must have a different concept of "lazily-initialize". I understood > Antoine's suggestion to be a one-time initialize per worker process. > >> See my response to Anotoine earlier. I missed the point made. This is a valid solution to the problem of "initializing objects after a worker has been forked", but fails to address the "create big object in parent, pass to each worker". > > On Fri, Sep 28, 2018 at 4:39 PM Sean Harrington > wrote: > > My simple argument is that the developer should not be constrained to > make the objects passed globally available in the process, as this MAY > break encapsulation for large projects. > > I could imagine someone switching from Pool to ThreadPool and getting > into trouble, but in my mind using threads is caveat emptor. Are you > worried about breaking encapsulation in a different scenario? > >> Without a specific example on-hand, you could imagine a tree of function calls that occur in the worker process (even newly created objects), that should not necessarily have access to objects passed from parent -> worker. In every case given the current implementation, they will. -------------- next part -------------- An HTML attachment was scrubbed... URL: From seanharr11 at gmail.com Sat Sep 29 08:26:06 2018 From: seanharr11 at gmail.com (Sean Harrington) Date: Sat, 29 Sep 2018 08:26:06 -0400 Subject: [Python-Dev] bpo-34837: Multiprocessing.Pool API Extension - Pass Data to Workers w/o Globals In-Reply-To: <20180929141759.3d613436@fsol> References: <20180929004454.3715f0e8@fsol> <20180929120447.6f7cd478@fsol> <20180929141759.3d613436@fsol> Message-ID: On Sat, Sep 29, 2018 at 8:18 AM Antoine Pitrou wrote: > On Sat, 29 Sep 2018 08:13:19 -0400 > Sean Harrington wrote: > > > > > > Hmm... 
We might have a disagreement on the target audience of the > > > multiprocessing module. multiprocessing isn't very high-level, I would > > > expect it to be used by experienced programmers who know how to mutate > > > a global variable from a lexical scope. > > > > > > > >>> It is one thing to MUTATE a global from a lexical scope. No gripes > > there. The specific concept I'm referencing here, is "DECLARING a global > > variable, from within a lexical scope". This is not as a intuitive for > most > > programmers. > > Well, you don't have to. You can bind it to None in the top-level > scope and then mutate it from the lexical scope: > > my_resource = None > > def do_work(): > global my_resource > my_resource = ... > > >>> Yes but this is even more constraining, as it forces the parent process to declare a global variable that it likely never uses! > Regards > > Antoine. > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/seanharr11%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Sat Sep 29 12:04:20 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 30 Sep 2018 02:04:20 +1000 Subject: [Python-Dev] switch statement In-Reply-To: References: <87a6ff86-01f8-0355-8a8a-0ce0e64b5ccb@prolan-power.hu> Message-ID: <20180929160420.GE19437@ando.pearwood.info> On Fri, Sep 21, 2018 at 02:10:00PM -0700, Guido van Rossum wrote: > There's already a rejected PEP about a switch statement: > https://www.python.org/dev/peps/pep-3103/. There's no point bringing this > up again unless you have a new use case. > > There have been several promising posts to python-ideas about the much more > powerful idea of a "match" statement. Please search for those before > re-posting on python-ideas. The Coconut transpiler also includes some interesting ideas for a match and case statement: http://coconut-lang.org/ https://coconut.readthedocs.io/en/master/DOCS.html#match https://coconut.readthedocs.io/en/master/DOCS.html#case -- Steve From phoenix1987 at gmail.com Sat Sep 29 12:14:22 2018 From: phoenix1987 at gmail.com (Gabriele) Date: Sat, 29 Sep 2018 17:14:22 +0100 Subject: [Python-Dev] What is the purpose of the _PyThreadState_Current symbol in Python 3? In-Reply-To: References: Message-ID: Ah ok, this might be related to Victor's observation based on the latest sources. I haven't tested 3.7 yet, but if _PyRuntime is available from dynsym then this is already enough. Thanks, Gabriele On Sat, 29 Sep 2018 at 11:00, Nathaniel Smith wrote: > > On Fri, Sep 28, 2018 at 3:29 PM, Gabriele wrote: > > On Fri, 28 Sep 2018 at 23:12, Nathaniel Smith wrote: > >> What information do you wish the interpreter provided, that would make your program simpler and more reliable? > > > > An exported global variable that points to the head of the > > PyInterpreterState linked list (i.e. the return value of > > PyInterpreterState_Head). This way my program could just look this up > > from the dynsym section instead of scanning a dump of the bss section > > in memory to find a possible candidate. > > Hmm, it looks like in 3.7+, _PyRuntime is marked PyAPI_DATA, which I > think should make it exported from dynsym? > > https://github.com/python/cpython/blob/4b430e5f6954ef4b248e95bfb4087635dcdefc6d/Include/internal/pystate.h#L206 > > And PyInterpreterState_Head is just _PyRuntime.interpreters.head. 
So > maybe this is already done... > > -n > > -- > Nathaniel J. Smith -- https://vorpus.org -- "Egli ? scritto in lingua matematica, e i caratteri son triangoli, cerchi, ed altre figure geometriche, senza i quali mezzi ? impossibile a intenderne umanamente parola; senza questi ? un aggirarsi vanamente per un oscuro laberinto." -- G. Galilei, Il saggiatore. From mike at selik.org Sat Sep 29 15:00:09 2018 From: mike at selik.org (Michael Selik) Date: Sat, 29 Sep 2018 12:00:09 -0700 Subject: [Python-Dev] bpo-34837: Multiprocessing.Pool API Extension - Pass Data to Workers w/o Globals In-Reply-To: References: Message-ID: On Sat, Sep 29, 2018 at 5:24 AM Sean Harrington wrote: >> On Fri, Sep 28, 2018 at 4:39 PM Sean Harrington wrote: >> > My simple argument is that the developer should not be constrained to make the objects passed globally available in the process, as this MAY break encapsulation for large projects. >> >> I could imagine someone switching from Pool to ThreadPool and getting >> into trouble, but in my mind using threads is caveat emptor. Are you >> worried about breaking encapsulation in a different scenario? > > >> Without a specific example on-hand, you could imagine a tree of function calls that occur in the worker process (even newly created objects), that should not necessarily have access to objects passed from parent -> worker. In every case given the current implementation, they will. Echoing Antoine: If you want some functions to not have access to a module's globals, you can put those functions in a different module. Note that multiprocessing already encapsulates each subprocesses' globals in essentially a separate namespace. Without a specific example, this discussion is going to go around in circles. You have a clear aversion to globals. Antoine and I do not. No one else seems to have found this conversation interesting enough to participate, yet. From tritium-list at sdamon.com Sat Sep 29 21:40:03 2018 From: tritium-list at sdamon.com (Alex Walters) Date: Sat, 29 Sep 2018 21:40:03 -0400 Subject: [Python-Dev] Change in Python 3's "round" behavior In-Reply-To: <20180927135327.GE19437@ando.pearwood.info> References: <1519658635.56.0.467229070634.issue32956@psf.upfronthosting.co.za> <1537960170.25.0.545547206417.issue32956@psf.upfronthosting.co.za> <5BAC70BB.2040707@canterbury.ac.nz> <20180927135327.GE19437@ando.pearwood.info> Message-ID: <234e01d4585e$829acc20$87d06460$@sdamon.com> > -----Original Message----- > From: Python-Dev list=sdamon.com at python.org> On Behalf Of Steven D'Aprano > Sent: Thursday, September 27, 2018 9:54 AM > To: python-dev at python.org > Subject: Re: [Python-Dev] Change in Python 3's "round" behavior > > On Thu, Sep 27, 2018 at 05:55:07PM +1200, Greg Ewing wrote: > > jab at math.brown.edu wrote: > > >I understand from > > >https://github.com/cosmologicon/pywat/pull/40#discussion_r219962259 > > >that "to always round up... can theoretically skew the data" > > > > *Very* theoretically. If the number is even a whisker bigger than > > 2.5 it's going to get rounded up regardless: > > > > >>> round(2.500000000000001) > > 3 > > > > That difference is on the order of the error you expect from > > representing decimal fractions in binary, so I would be surprised > > if anyone can actually measure this bias in a real application. > > I think you may have misunderstood the nature of the bias. It's not > about individual roundings and it definitely has nothing to do with > binary representation. > > Any one round operation will introduce a bias. 
You had a number, say > 2.3, and it gets rounded down to 2.0, introducing an error of -0.3. But > if you have lots of rounds, some will round up, and some will round > down, and we want the rounding errors to cancel. > > The errors *almost* cancel using the naive rounding algorithm as most of > the digits pair up: > > .1 rounds down, error = -0.1 > .9 rounds up, error = +0.1 > > .2 rounds down, error = -0.2 > .8 rounds up, error = +0.2 > > etc. If each digit is equally likely, then on average they'll cancel and > we're left with *almost* no overall error. > > The problem is that while there are four digits rounding down (.1 > through .4) there are FIVE which round up (.5 through .9). Two digits > don't pair up: > > .0 stays unchanged, error = 0 > .5 always rounds up, error = +0.5 > > Given that for many purposes, our data is recorded only to a fixed > number of decimal places, we're dealing with numbers like 0.5 rather > than 0.5000000001, so this can become a real issue. Every ten rounding > operations will introduce an average error of +0.05 instead of > cancelling out. Rounding introduces a small but real bias. > > The most common (and, in many experts' opinion, the best default > behaviour) is Banker's Rounding, or round-to-even. All the other digits > round as per the usual rule, but .5 rounds UP half the time and DOWN the > rest of the time: > > 0.5, 2.5, 3.5 etc round down, error = -0.5 > 1.5, 3.5, 5.5 etc round up, error = +0.5 > > thus on average the .5 digit introduces no error and the bias goes away. > > ...and we have a stats module that would be a great place for a round function that needs to cancel rounding errors. The simple case should be the intuitive case for most users. My experience and that of many users of the python irc channel on freenode is that round-half-to-even is not the intuitive, or even desired, behavior - round-half-up is. This wouldn't be frustrating to the human user if the round built-in let you pick the method, instead you have to use the very complicated decimal module with a modified context to get intuitive behavior. I could live with `round(2.5) -> 2.0` if `round(2.5, method='up') -> 3.0` (or some similar spelling) was an option. As it stands, this is a wart on the language. "Statistically balanced rounding" is a special case, not the default case. > > -- > Steve > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/tritium- > list%40sdamon.com From greg.ewing at canterbury.ac.nz Sat Sep 29 21:49:49 2018 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 30 Sep 2018 14:49:49 +1300 Subject: [Python-Dev] Change in Python 3's "round" behavior In-Reply-To: <234e01d4585e$829acc20$87d06460$@sdamon.com> References: <1519658635.56.0.467229070634.issue32956@psf.upfronthosting.co.za> <1537960170.25.0.545547206417.issue32956@psf.upfronthosting.co.za> <5BAC70BB.2040707@canterbury.ac.nz> <20180927135327.GE19437@ando.pearwood.info> <234e01d4585e$829acc20$87d06460$@sdamon.com> Message-ID: <5BB02BBD.5090804@canterbury.ac.nz> I don't really get the statistical argument. If you're doing something like calculating an average and care about accuracy, why are you rounding the values before averaging? Why not average first and then round the result if you need to? 
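The bias being described is easy to see numerically. A quick sketch:
only exact binary halves are used, so floating-point representation
error does not muddy the comparison, and round_half_up is a hypothetical
stand-in for the schoolbook rule.

import math

def round_half_up(x):
    # Naive schoolbook rounding for non-negative values; fine for the
    # exact .5 ties used below.
    return math.floor(x + 0.5)

ties = [n + 0.5 for n in range(10)]         # 0.5, 1.5, ..., 9.5
print(sum(ties))                            # 50.0  (the true sum)
print(sum(round_half_up(x) for x in ties))  # 55    (half-up drifts upward)
print(sum(round(x) for x in ties))          # 50    (ties-to-even cancel out)

Summing or averaging after rounding is exactly where that half-a-unit-per-tie
drift shows up; averaging first and rounding once sidesteps it.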
-- Greg From tritium-list at sdamon.com Sun Sep 30 06:26:49 2018 From: tritium-list at sdamon.com (Alex Walters) Date: Sun, 30 Sep 2018 06:26:49 -0400 Subject: [Python-Dev] Change in Python 3's "round" behavior In-Reply-To: <5BB02BBD.5090804@canterbury.ac.nz> References: <1519658635.56.0.467229070634.issue32956@psf.upfronthosting.co.za> <1537960170.25.0.545547206417.issue32956@psf.upfronthosting.co.za> <5BAC70BB.2040707@canterbury.ac.nz> <20180927135327.GE19437@ando.pearwood.info> <234e01d4585e$829acc20$87d06460$@sdamon.com> <5BB02BBD.5090804@canterbury.ac.nz> Message-ID: <24b501d458a8$19962f90$4cc28eb0$@sdamon.com> > -----Original Message----- > From: Python-Dev list=sdamon.com at python.org> On Behalf Of Greg Ewing > Sent: Saturday, September 29, 2018 9:50 PM > To: python-dev at python.org > Subject: Re: [Python-Dev] Change in Python 3's "round" behavior > > I don't really get the statistical argument. If you're doing something > like calculating an average and care about accuracy, why are you rounding > the values before averaging? Why not average first and then round the > result if you need to? > Other use case is finance, where you can end up with interest calculations that are fractional of the base unit of currency. US$2.345 is impossible to represent in real currency, so it has to be rounded. With half-towards-even, that rounds to $2.34, and $2.355 rounds to $2.36. It evens out in the long run. While that is very helpful for finance calculations, if you are doing finance with that level of precision, you should be using decimal instead of float anyways and decimal's round has configurable round method. > -- > Greg > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/tritium- > list%40sdamon.com From steve at pearwood.info Sun Sep 30 08:17:03 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 30 Sep 2018 22:17:03 +1000 Subject: [Python-Dev] Change in Python 3's "round" behavior In-Reply-To: <234e01d4585e$829acc20$87d06460$@sdamon.com> References: <1519658635.56.0.467229070634.issue32956@psf.upfronthosting.co.za> <1537960170.25.0.545547206417.issue32956@psf.upfronthosting.co.za> <5BAC70BB.2040707@canterbury.ac.nz> <20180927135327.GE19437@ando.pearwood.info> <234e01d4585e$829acc20$87d06460$@sdamon.com> Message-ID: <20180930121703.GK19437@ando.pearwood.info> On Sat, Sep 29, 2018 at 09:40:03PM -0400, Alex Walters wrote: > ...and we have a stats module that would be a great place for a round > function that needs to cancel rounding errors. This has nothing to do with statistics. You should consider that this is often called "Banker's Rounding" and what that tells you. (It's also called Dutch Rounding.) > The simple case should be the intuitive case for most users. Should it? I think that having the most correct behaviour should be the default. Who decides what is "intuitive"? I asked my three year old nephew whether 1.5 should round to down to 1 or up to 2, and he said that he didn't care about numbers because he was sailing across the ocean and I was standing in the way of his boat. > My experience and that of many users of > the python irc channel on freenode is that round-half-to-even is not the > intuitive, or even desired, behavior - round-half-up is. It would be very informative to ask *why* they want round-half-up. 
I expect that the reason given will boil down to "because it is the rounding method I learned in school" even if they can't articulate it that way, and start going on about it being "intuitive" as if rounding ties upwards was more intuitive than rounding ties downward. Compatibility with "other languages" isn't the answer, because other languages differ in how they do rounding and we can't match them all: # Javascript js> Math.round(2.5) + Math.round(-2.5) 1 # Ruby steve at orac ~ $ ruby -e 'puts (2.5).round() + (-2.5).round()' 0 VBScript is another language which uses Bankers Rounding: https://blogs.msdn.microsoft.com/ericlippert/2003/09/26/bankers-rounding/ although the example given (calculating an average) is misleading, because as I said this is not about statistics. Bankers Rounding produces better *averages* because it produces better *sums* (to quote one of the comments). Similarly for differences. If you perform many subtractions (let's say you are paying off a loan, and calculating interest, then rounding to the nearest cent) you have to care about bias. If each rounding introduces a 0.5 cent bias (as round-half-up does) then the total bias increases as the number of transactions increases. > This wouldn't be frustrating to the human user Did you intend to imply I'm not human, or was it an accident? -- Steve From rosuav at gmail.com Sun Sep 30 10:14:26 2018 From: rosuav at gmail.com (Chris Angelico) Date: Mon, 1 Oct 2018 00:14:26 +1000 Subject: [Python-Dev] Change in Python 3's "round" behavior In-Reply-To: <20180930121703.GK19437@ando.pearwood.info> References: <1519658635.56.0.467229070634.issue32956@psf.upfronthosting.co.za> <1537960170.25.0.545547206417.issue32956@psf.upfronthosting.co.za> <5BAC70BB.2040707@canterbury.ac.nz> <20180927135327.GE19437@ando.pearwood.info> <234e01d4585e$829acc20$87d06460$@sdamon.com> <20180930121703.GK19437@ando.pearwood.info> Message-ID: On Sun, Sep 30, 2018 at 10:18 PM Steven D'Aprano wrote: > > On Sat, Sep 29, 2018 at 09:40:03PM -0400, Alex Walters wrote: > > My experience and that of many users of > > the python irc channel on freenode is that round-half-to-even is not the > > intuitive, or even desired, behavior - round-half-up is. > > It would be very informative to ask *why* they want round-half-up. > > I expect that the reason given will boil down to "because it is the > rounding method I learned in school" even if they can't articulate it > that way, and start going on about it being "intuitive" as if rounding > ties upwards was more intuitive than rounding ties downward. Let's start by assuming that real numbers are a perfectly continuous space of values, and that every actually-recorded value is *already* the result of rounding some number to fit within our available space (rather than assuming that recorded values are perfectly precise and correct). Further, assume that representable numbers are equally spaced - not strictly true, but extremely hard to compensate for. That means that any representable number actually has to indicate a range of values centered on that value. For the sake of argument, pretend we can represent one digit before the decimal and one after; in actual usage, this would occur at the extreme of precision, 53 bits down the line. So the number 2.0 actually means the range (1.95, 2.05), the number 2.1 really means (2.05, 2.15), 2.5 means (2.45, 2.55), 2.9 means (2.85, 2.95), 3.0 means (2.95, 3.05). Now we take our values and round them to integer. 
If we round all 0.5 values up, that means that the rounded value 2 will now catch all values in the range (1.45, 2.45), and the rounded value 3 catches (2.45, 3.45). In effect, our values are being skewed low by half a ULP. By using "round to even", you make the rounded value 2 catch all values in the range (1.45, 2.55), and the rounded value 3 now catches (2.55, 3.45). Values are now evenly spread around the stated value, but there is an entire ULP of discrepancy between the span of even numbers and the span of odd numbers. Which is more important? For a number's effective range to be centered around it, or for its range to be the same size as the range of every other number? ChrisA From greg.ewing at canterbury.ac.nz Sun Sep 30 17:50:36 2018 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 01 Oct 2018 10:50:36 +1300 Subject: [Python-Dev] Change in Python 3's "round" behavior In-Reply-To: <24b501d458a8$19962f90$4cc28eb0$@sdamon.com> References: <1519658635.56.0.467229070634.issue32956@psf.upfronthosting.co.za> <1537960170.25.0.545547206417.issue32956@psf.upfronthosting.co.za> <5BAC70BB.2040707@canterbury.ac.nz> <20180927135327.GE19437@ando.pearwood.info> <234e01d4585e$829acc20$87d06460$@sdamon.com> <5BB02BBD.5090804@canterbury.ac.nz> <24b501d458a8$19962f90$4cc28eb0$@sdamon.com> Message-ID: <5BB1452C.5020402@canterbury.ac.nz> Alex Walters wrote: > Other use case is finance, where you can end up with interest calculations > that are fractional of the base unit of currency. US$2.345 is impossible to > represent in real currency, so it has to be rounded. This brings us back to my original point about floating point accuracy. If you do your interest calculation in floating point binary, first it's very unlikely that it will come out ending in exactly 0.5 of a cent, and secondly if you care about the details that much, you should be calculating in decimal, and being explicit about exactly what kind of rounding you're doing. -- Greg From greg.ewing at canterbury.ac.nz Sun Sep 30 18:04:31 2018 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 01 Oct 2018 11:04:31 +1300 Subject: [Python-Dev] Change in Python 3's "round" behavior In-Reply-To: <20180930121703.GK19437@ando.pearwood.info> References: <1519658635.56.0.467229070634.issue32956@psf.upfronthosting.co.za> <1537960170.25.0.545547206417.issue32956@psf.upfronthosting.co.za> <5BAC70BB.2040707@canterbury.ac.nz> <20180927135327.GE19437@ando.pearwood.info> <234e01d4585e$829acc20$87d06460$@sdamon.com> <20180930121703.GK19437@ando.pearwood.info> Message-ID: <5BB1486F.9060109@canterbury.ac.nz> Steven D'Aprano wrote: > (It's also called Dutch Rounding.) Oh, so *that's* why Python does it! Fair enough. :-) > Similarly for differences. If you perform many subtractions (let's say > you are paying off a loan, and calculating interest, then rounding to > the nearest cent) you have to care about bias. If I'm paying off a loan, it's what the bank calculates that matters, not what I calculate. And I hope the bank isn't relying on the vagaries of Python floating point arithmetic for its critical financial calculations. 
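With the decimal module, being explicit about the rounding rule is a
single argument; a short sketch using the $2.345 figure from earlier in
the thread:

from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

cent = Decimal("0.01")
amount = Decimal("2.345")  # exact in decimal, unlike a binary float
print(amount.quantize(cent, rounding=ROUND_HALF_EVEN))  # 2.34
print(amount.quantize(cent, rounding=ROUND_HALF_UP))    # 2.35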
-- Greg From greg.ewing at canterbury.ac.nz Sun Sep 30 18:15:53 2018 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 01 Oct 2018 11:15:53 +1300 Subject: [Python-Dev] Change in Python 3's "round" behavior In-Reply-To: References: <1519658635.56.0.467229070634.issue32956@psf.upfronthosting.co.za> <1537960170.25.0.545547206417.issue32956@psf.upfronthosting.co.za> <5BAC70BB.2040707@canterbury.ac.nz> <20180927135327.GE19437@ando.pearwood.info> <234e01d4585e$829acc20$87d06460$@sdamon.com> <20180930121703.GK19437@ando.pearwood.info> Message-ID: <5BB14B19.5040208@canterbury.ac.nz> Chris Angelico wrote: > ]That > means that any representable number actually has to indicate a range > of values centered on that value. That's not always true -- it depends on the source of the information. For example, a reading of 5 seconds on a clock with 1 second resolution actually represents a value between 5 and 6 seconds. So if you're fussy about rounding, you might want to round clock readings differently from measurements on a ruler. -- Greg From rosuav at gmail.com Sun Sep 30 18:27:36 2018 From: rosuav at gmail.com (Chris Angelico) Date: Mon, 1 Oct 2018 08:27:36 +1000 Subject: [Python-Dev] Change in Python 3's "round" behavior In-Reply-To: <5BB14B19.5040208@canterbury.ac.nz> References: <1519658635.56.0.467229070634.issue32956@psf.upfronthosting.co.za> <1537960170.25.0.545547206417.issue32956@psf.upfronthosting.co.za> <5BAC70BB.2040707@canterbury.ac.nz> <20180927135327.GE19437@ando.pearwood.info> <234e01d4585e$829acc20$87d06460$@sdamon.com> <20180930121703.GK19437@ando.pearwood.info> <5BB14B19.5040208@canterbury.ac.nz> Message-ID: On Mon, Oct 1, 2018 at 8:17 AM Greg Ewing wrote: > > Chris Angelico wrote: > > ]That > > means that any representable number actually has to indicate a range > > of values centered on that value. > > That's not always true -- it depends on the source of the > information. For example, a reading of 5 seconds on a clock > with 1 second resolution actually represents a value between > 5 and 6 seconds. > > So if you're fussy about rounding, you might want to round > clock readings differently from measurements on a ruler. True. I gave a number of assumptions, and if those assumptions don't hold, you may need to vary things. If you have something like you describe here, you probably want to round-to-zero or something. ChrisA From Richard at Damon-Family.org Sun Sep 30 18:55:20 2018 From: Richard at Damon-Family.org (Richard Damon) Date: Sun, 30 Sep 2018 18:55:20 -0400 Subject: [Python-Dev] Change in Python 3's "round" behavior In-Reply-To: <5BB14B19.5040208@canterbury.ac.nz> References: <1519658635.56.0.467229070634.issue32956@psf.upfronthosting.co.za> <1537960170.25.0.545547206417.issue32956@psf.upfronthosting.co.za> <5BAC70BB.2040707@canterbury.ac.nz> <20180927135327.GE19437@ando.pearwood.info> <234e01d4585e$829acc20$87d06460$@sdamon.com> <20180930121703.GK19437@ando.pearwood.info> <5BB14B19.5040208@canterbury.ac.nz> Message-ID: <7af22058-f6aa-ede9-26fb-d73686bcddd8@Damon-Family.org> On 9/30/18 6:15 PM, Greg Ewing wrote: > Chris Angelico wrote: >> ]That >> means that any representable number actually has to indicate a range >> of values centered on that value. > > That's not always true -- it depends on the source of the > information. For example, a reading of 5 seconds on a clock > with 1 second resolution actually represents a value between > 5 and 6 seconds. 
> > So if you're fussy about rounding, you might want to round > clock readings differently from measurements on a ruler. > Actually it could be from 4+ to 6- seconds, say the first reading is 1, that could be anything from 1.000 to 1.999 and the second reading be 6, that could be from 6.000 to 6.999, thus the interval be from 6.000 - 1.999 = 4.001 tp 6.999 - 1.000 = 5.999 seconds. Now if you waited for the start time to roll over so you knew you were near 1.000, that would be different, but from just sampling you get ranges. Now if it was a stop watch that started at the beginning it depends on how it presents the time, it might respond 5 for 5.000 to 5.999 seconds, or it might intentionally round the data and say 5 from about 4.5 to 5.5. Now, one case where there is an intentional bias to the bottom is Map Grid Coordinate system, where you specify 1 meter resolution within a grid with 5 digits, but if you want to specify to less precision, the specification it to ALWAYS truncate so map coordinate 1234 represent the range from 12340.0000 to 12349.9999 -- Richard Damon From steve at pearwood.info Sun Sep 30 19:35:30 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 1 Oct 2018 09:35:30 +1000 Subject: [Python-Dev] Change in Python 3's "round" behavior In-Reply-To: <5BB1452C.5020402@canterbury.ac.nz> References: <1519658635.56.0.467229070634.issue32956@psf.upfronthosting.co.za> <1537960170.25.0.545547206417.issue32956@psf.upfronthosting.co.za> <5BAC70BB.2040707@canterbury.ac.nz> <20180927135327.GE19437@ando.pearwood.info> <234e01d4585e$829acc20$87d06460$@sdamon.com> <5BB02BBD.5090804@canterbury.ac.nz> <24b501d458a8$19962f90$4cc28eb0$@sdamon.com> <5BB1452C.5020402@canterbury.ac.nz> Message-ID: <20180930233529.GP19437@ando.pearwood.info> On Mon, Oct 01, 2018 at 10:50:36AM +1300, Greg Ewing wrote: > Alex Walters wrote: > >Other use case is finance, where you can end up with interest calculations > >that are fractional of the base unit of currency. US$2.345 is impossible > >to > >represent in real currency, so it has to be rounded. > > This brings us back to my original point about floating point > accuracy. If you do your interest calculation in floating > point binary, first it's very unlikely that it will come > out ending in exactly 0.5 of a cent, And yet people (Alex, and he says others) are complaining about this change in behaviour. If getting exactly 0.5 is as unlikely as you claim, how would they notice? > and secondly if you > care about the details that much, you should be calculating > in decimal, and being explicit about exactly what kind of > rounding you're doing. Why should people using float have a biased round just because "they should be using Decimal"? The choice to use Decimal is not up to us and there's nothing wrong with using float for many purposes. Those who do shouldn't be burdened with a biased round. Regardless of whether it meets with the approval of the mathematically naive who think that primary school rounding is the "intuitive" (or only) way to round, the change was made something like a decade ago. It matches the behaviour of Julia, .Net, VBScript and I expect other languages and makes for a technically better default rounding mode. With no overwhelmingly strong case for reverting to a biased rounding mode, I think this discussion is dead. If people want to discuss something more productive, we could talk about adding an optional argument to round() to take a rounding mode, or adding an equivalent to the math library. I'll start off... 
How about we move the rounding mode constants out of the decimal module and into the math module? That makes them more easily discoverable and importable (the math module is lightweight, the decimal module is not). The decimal module would then import the constants from math (it already imports math so that's no extra dependency). Then we can add a keyword only argument to round: round(number, ndigits=0, *, mode=ROUND_HALF_EVEN) To use it, you can import the rounding mode you want from math: from math import ROUND_CEILING round(x, 3, mode=ROUND_CEILING) and everyone is happy (he says optimistically). It's a bit funny to have constants in the math module not actually used there, for the benefit of a builtin and Decimal, but I prefer that to either importing them from decimal or making them builtins. Thoughts? -- Steve From rosuav at gmail.com Sun Sep 30 19:41:22 2018 From: rosuav at gmail.com (Chris Angelico) Date: Mon, 1 Oct 2018 09:41:22 +1000 Subject: [Python-Dev] Change in Python 3's "round" behavior In-Reply-To: <20180930233529.GP19437@ando.pearwood.info> References: <1519658635.56.0.467229070634.issue32956@psf.upfronthosting.co.za> <1537960170.25.0.545547206417.issue32956@psf.upfronthosting.co.za> <5BAC70BB.2040707@canterbury.ac.nz> <20180927135327.GE19437@ando.pearwood.info> <234e01d4585e$829acc20$87d06460$@sdamon.com> <5BB02BBD.5090804@canterbury.ac.nz> <24b501d458a8$19962f90$4cc28eb0$@sdamon.com> <5BB1452C.5020402@canterbury.ac.nz> <20180930233529.GP19437@ando.pearwood.info> Message-ID: On Mon, Oct 1, 2018 at 9:36 AM Steven D'Aprano wrote: > Then we can add a keyword only argument to round: > > round(number, ndigits=0, *, mode=ROUND_HALF_EVEN) > > To use it, you can import the rounding mode you want from math: > > from math import ROUND_CEILING > round(x, 3, mode=ROUND_CEILING) I have no problem with this. > and everyone is happy (he says optimistically). And I am as dubious as you are about this :) IMO, the biggest problem with round() is that it's TOO discoverable. People reach for it when what they really should be using is string formatting ("I want to display all these values to three decimal places"), and then sometimes get bitten when something doesn't actually display the way they think it will. When it's used correctly, it's usually fine. ChrisA From mansourmoufid at gmail.com Sun Sep 30 21:23:26 2018 From: mansourmoufid at gmail.com (Mansour Moufid) Date: Sun, 30 Sep 2018 21:23:26 -0400 Subject: [Python-Dev] Change in Python 3's "round" behavior In-Reply-To: References: <1519658635.56.0.467229070634.issue32956@psf.upfronthosting.co.za> <1537960170.25.0.545547206417.issue32956@psf.upfronthosting.co.za> Message-ID: On Wed, Sep 26, 2018 at 7:29 AM wrote: > > I recently found out about Python 3's round-to-even change (via > https://github.com/cosmologicon/pywat!) and am having trouble finding > where that change was discussed. That GitHub project is hilarious especially the NaN stuff... Rounding is from engineering so there is more than one definition, and one is not more correct than the others, it just depends on the specific application. Functions like ceiling and floor do have mathematical definitions. Whichever definition of rounding the Python standard library adopts, it should be very explicitly defined in the documentation in terms of ceiling and floor. In applications where rounding is actually important, it's a good idea to do calculations with one rounding function, and again with another, and compare results. 
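A sketch of that run-it-both-ways idea using decimal contexts; the
amounts are invented for illustration and chosen so that every one is an
exact half-cent tie:

from decimal import Decimal, localcontext, ROUND_HALF_EVEN, ROUND_HALF_UP

CENT = Decimal("0.01")

def rounded_total(amounts, rounding):
    # Round each amount to the cent under the given rule, then total.
    with localcontext() as ctx:
        ctx.rounding = rounding
        return sum(a.quantize(CENT) for a in amounts)

amounts = [Decimal(n) * Decimal("0.005") for n in range(1, 200, 2)]  # 0.005 .. 0.995
print(rounded_total(amounts, ROUND_HALF_EVEN))  # 50.00
print(rounded_total(amounts, ROUND_HALF_UP))    # 50.50

Half a cent of drift over one hundred ties: the difference that cancels
out under ties-to-even but accumulates under half-up.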
From tjreedy at udel.edu  Sun Sep 30 23:11:29 2018
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 30 Sep 2018 23:11:29 -0400
Subject: [Python-Dev] Change in Python 3's "round" behavior
In-Reply-To: 
References: <1519658635.56.0.467229070634.issue32956@psf.upfronthosting.co.za>
 <1537960170.25.0.545547206417.issue32956@psf.upfronthosting.co.za>
Message-ID: 

On 9/26/2018 7:26 AM, jab at math.brown.edu wrote:

To paraphrase:

1. Where was the 3.0 change discussed?
2. What was the rationale?

I think these have been answered as well as possible.

3. Can the change be reverted?

It 'could be', but will not be reverted.

4. Should something be added to the doc?

Maybe, but I don't see any enthusiasm from core devs.

This list is for development of future Python and CPython.  The continued
discussion of what other languages do and how best to use rounding is
off-topic here (and, given the above, on python-ideas).  Please take these
comparison and use discussions to python-list.

-- 
Terry Jan Reedy