From ncoghlan at gmail.com Thu Mar 1 00:51:30 2018
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 1 Mar 2018 15:51:30 +1000
Subject: [Python-Dev] PEP 467 (Minor API improvements for binary sequences) - any thoughts?
In-Reply-To: <5A959219.3040008@stoneleaf.us>
References: <5A959219.3040008@stoneleaf.us>
Message-ID:

On 28 February 2018 at 03:15, Ethan Furman wrote:

> On 02/26/2018 11:34 PM, Elias Zamaria wrote:
>
>> I don't know how I would feel working on something so general, of use to
>> so many people for lots of different purposes.
>> Do I know enough about all of the use cases and what everyone wants? I am
>> not completely against it but I'd need to
>> think about it.
>
> Part of the PEP writing process is asking for and collecting use-cases;
> if possible, looking at other code projects for use-cases is also useful.
>
> Time needed can vary widely depending on the subject; if I recall
> correctly, PEP 409 only took a few days, while PEP 435 took several weeks.
> PEP 467 has already gone through a few iterations, so hopefully not too
> much more time is required.

One of the main developments not yet accounted for in the PEP is the fact
that `memoryview` now supports efficient bytes-based iteration over
arbitrary buffer-exporting objects:

    def iterbytes(obj):
        with memoryview(obj) as m:
            return iter(m.cast('c'))

This means that aspect of PEP 467 will need to lean more heavily on
discoverability arguments (since the above approach isn't obvious at all
unless you're already very familiar with the use of `memoryview`), since
the runtime benefit from avoiding the upfront cost of allocating and
initialising two memoryview objects by using a custom iterator type
instead is likely to be fairly small.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
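A minimal sketch of how the `iterbytes` helper from the message above behaves, for readers less familiar with `memoryview`. The helper body is taken from the message; the sample inputs and the comparison against plain iteration are illustrative assumptions, not part of PEP 467.

```python
from array import array


def iterbytes(obj):
    """Return an iterator over the bytes of a buffer-exporting object.

    Each item is a length-1 bytes object rather than an int.
    """
    # cast('c') reinterprets the underlying buffer as single characters,
    # so iterating the cast view produces length-1 bytes objects.
    with memoryview(obj) as m:
        return iter(m.cast('c'))


# Plain iteration over bytes produces integers ...
print(list(b"abc"))
# ... while iterbytes produces length-1 bytes objects.
print(list(iterbytes(b"abc")))
# The same helper accepts other buffer exporters, such as bytearray
# and array.array (one item per underlying byte, not per element).
print(list(iterbytes(bytearray(b"hi"))))
print(len(list(iterbytes(array("H", [1, 2])))))
```

The cast only works for buffers that `memoryview.cast` accepts (C-contiguous ones), so this sketch should not be read as covering every possible exporter.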
From ncoghlan at gmail.com Thu Mar 1 01:02:51 2018
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 1 Mar 2018 16:02:51 +1000
Subject: [Python-Dev] Should the dataclass frozen property apply to subclasses?
In-Reply-To: <2362d149-1b72-62e2-55b6-ce2d4143b04f@trueblade.com>
References: <2362d149-1b72-62e2-55b6-ce2d4143b04f@trueblade.com>
Message-ID:

On 28 February 2018 at 10:37, Eric V. Smith wrote:

> So, given a frozen dataclass "C" with field names in "field_names", I
> propose changing __setattr__ to be:
>
> def __setattr__(self, name, value):
>     if type(self) is C or name in field_names:
>         raise FrozenInstanceError(f'cannot assign to field {name!r}')
>     super(cls, self).__setattr__(name, value)
>
> In the current 3.7.0a2 implementation of frozen dataclasses, __setattr__
> always raises. The change is the test and then call to super().__setattr__
> if it's a derived class. The result is an exception if either self is an
> instance of C, or if self is an instance of a derived class, but the
> attribute being set is a field of C.

I'm assuming you meant "3.7.0b2" here (and similarly alpha->beta for the
other version numbers below)

> So going back to original questions above, my suggestions are:
>
> 1. What happens when a frozen dataclass inherits from a non-frozen
> dataclass? The result is a frozen dataclass, and all fields are
> non-writable. No non-fields can be added. This is a reversion to the
> 3.7.0a1 behavior.
>
> 2. What happens when a non-frozen dataclass inherits from a frozen
> dataclass? The result is a frozen dataclass, and all fields are
> non-writable. No non-fields can be added. This is a reversion to the
> 3.7.0a1 behavior. I'd also be okay with this case being an error, and you'd
> have to explicitly mark the derived class as frozen. This is the 3.7.0a2
> behavior.
>
> 3. What happens when a non-dataclass inherits from a frozen dataclass? The
> fields that are in the dataclass are non-writable, but new non-field
> attributes can be added. This is new behavior.
>
> 4. Can new non-field attributes be created for frozen dataclasses? No.
> This is existing behavior.

+1 from me for the variant of this where dataclasses inheriting from frozen
data classes must explicitly mark themselves as frozen (at least for now).
That way we leave the door open to allowing a variant where a non-frozen
dataclass that inherits from a frozen dataclass can set "hash=False" on all
of the new fields it adds to avoid becoming frozen itself.

> I'm hoping this change isn't so large that we can't get it in to 3.7.0a3
> next month.

I think this qualifies as the beta period serving its intended purpose
(i.e. reviewing and refining the behaviour of already added features,
without allowing completely new features).

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From eric at trueblade.com Thu Mar 1 01:43:10 2018
From: eric at trueblade.com (Eric V. Smith)
Date: Thu, 1 Mar 2018 01:43:10 -0500
Subject: [Python-Dev] Should the dataclass frozen property apply to subclasses?
In-Reply-To:
References: <2362d149-1b72-62e2-55b6-ce2d4143b04f@trueblade.com>
Message-ID:

On 3/1/2018 1:02 AM, Nick Coghlan wrote:

> I'm assuming you meant "3.7.0b2" here (and similarly alpha->beta for the
> other version numbers below)

Oops, yes. Thanks.

>     So going back to original questions above, my suggestions are:
>
>     1. What happens when a frozen dataclass inherits from a non-frozen
>     dataclass? The result is a frozen dataclass, and all fields are
>     non-writable. No non-fields can be added. This is a reversion to the
>     3.7.0a1 behavior.
>
>     2. What happens when a non-frozen dataclass inherits from a frozen
>     dataclass? The result is a frozen dataclass, and all fields are
>     non-writable. No non-fields can be added. This is a reversion to the
>     3.7.0a1 behavior. I'd also be okay with this case being an error,
>     and you'd have to explicitly mark the derived class as frozen. This
>     is the 3.7.0a2 behavior.
>
>     3. What happens when a non-dataclass inherits from a frozen
>     dataclass? The fields that are in the dataclass are non-writable,
>     but new non-field attributes can be added. This is new behavior.
>
>     4. Can new non-field attributes be created for frozen dataclasses?
>     No. This is existing behavior.
>
> +1 from me for the variant of this where dataclasses inheriting from
> frozen data classes must explicitly mark themselves as frozen (at least
> for now). That way we leave the door open to allowing a variant where a
> non-frozen dataclass that inherits from a frozen dataclass can set
> "hash=False" on all of the new fields it adds to avoid becoming frozen
> itself.

I tend to agree. It's not like this is a huge burden, and at least the
author is acknowledging that the class will end up frozen.

>     I'm hoping this change isn't so large that we can't get it in to
>     3.7.0a3 next month.
>
> I think this qualifies as the beta period serving its intended purpose
> (i.e. reviewing and refining the behaviour of already added features,
> without allowing completely new features).

Thanks.

Eric

From julien at editx.eu Thu Mar 1 08:36:42 2018
From: julien at editx.eu (Julien Carlier)
Date: Thu, 1 Mar 2018 14:36:42 +0100
Subject: [Python-Dev] Python Challenge Online - 30 Questions
Message-ID:

Hello,

Cisco & Dimension Data organize a Python Challenge on EDITx. It is a good
way to test your skills & have fun.

-> https://editx.eu/it-challenge/python-challenge-2018-cisco-and-dimension-data

Rules are simple: 15 minutes online, 30 questions & 3 jokers. Everyone is
allowed to participate but only belgian residents can win the prizes.

Do not hesitate to share it with people that might be interested.
Feedbacks are welcome :) Regards, Julien -- Frankenstraat - Rue des Francs 79 1000 Brussels -------------- next part -------------- An HTML attachment was scrubbed... URL: From status at bugs.python.org Fri Mar 2 12:09:55 2018 From: status at bugs.python.org (Python tracker) Date: Fri, 2 Mar 2018 18:09:55 +0100 (CET) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20180302170955.3520C11A977@psf.upfronthosting.co.za> ACTIVITY SUMMARY (2018-02-23 - 2018-03-02) Python tracker at https://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message. Issues counts and deltas: open 6491 ( +6) closed 38243 (+53) total 44734 (+59) Open issues with patches: 2527 Issues opened (49) ================== #32604: Expose the subinterpreters C-API in Python for testing use. https://bugs.python.org/issue32604 reopened by eric.snow #32924: Python 3.7 docs in docs.p.o points to GitHub's master branch https://bugs.python.org/issue32924 reopened by ned.deily #32925: AST optimizer: Change a list into tuple in iterations and cont https://bugs.python.org/issue32925 opened by serhiy.storchaka #32926: Add public TestCase method/property to get result of current t https://bugs.python.org/issue32926 opened by Victor Engmark #32927: Add typeshed documentation for unittest.TestCase._feedErrorsTo https://bugs.python.org/issue32927 opened by Victor Engmark #32932: better error message when __all__ contains non-str objects https://bugs.python.org/issue32932 opened by xiang.zhang #32933: mock_open does not support iteration around text files. 
https://bugs.python.org/issue32933 opened by anthony-flury #32934: logging.handlers.BufferingHandler capacity is unclearly specif https://bugs.python.org/issue32934 opened by enrico #32935: Documentation falsely leads to believe that MemoryHandler can https://bugs.python.org/issue32935 opened by enrico #32936: RobotFileParser.parse() should raise an exception when the rob https://bugs.python.org/issue32936 opened by Guinness #32937: Multiprocessing worker functions not terminating with a large https://bugs.python.org/issue32937 opened by Ericg #32938: webbrowser: Add options for private mode https://bugs.python.org/issue32938 opened by csabella #32939: IDLE: self.use_context_ps1 defined in editor, but always False https://bugs.python.org/issue32939 opened by csabella #32940: IDLE: pyparse - simplify StringTranslatePseudoMapping https://bugs.python.org/issue32940 opened by csabella #32941: mmap should expose madvise() https://bugs.python.org/issue32941 opened by pitrou #32942: Regression: test_script_helper fails https://bugs.python.org/issue32942 opened by abrezovsky #32943: confusing error message for rot13 codec https://bugs.python.org/issue32943 opened by xiang.zhang #32946: Speed up import from non-packages https://bugs.python.org/issue32946 opened by serhiy.storchaka #32947: Support OpenSSL 1.1.1 https://bugs.python.org/issue32947 opened by christian.heimes #32948: clang compiler warnings on Travis https://bugs.python.org/issue32948 opened by christian.heimes #32949: Simplify "with"-related opcodes https://bugs.python.org/issue32949 opened by serhiy.storchaka #32950: profiling python gc https://bugs.python.org/issue32950 opened by Luavis #32951: Prohibit direct instantiation of SSLSocket and SSLObject https://bugs.python.org/issue32951 opened by christian.heimes #32952: Add __qualname__ for attributes of Mock instances https://bugs.python.org/issue32952 opened by s_kostyuk #32953: Dataclasses: frozen should not be inherited for non-dataclass 
https://bugs.python.org/issue32953 opened by eric.smith #32954: Lazy Literal String Interpolation (PEP-498-based fl-strings) https://bugs.python.org/issue32954 opened by arcivanov #32955: IDLE crashes when trying to save a file https://bugs.python.org/issue32955 opened by zaphod424 #32957: distutils.command.install checks truthiness of .ext_modules in https://bugs.python.org/issue32957 opened by korijn #32958: Urllib proxy_bypass crashes for urls containing long basic aut https://bugs.python.org/issue32958 opened by ablack #32959: zipimport fails when the ZIP archive contains more than 65535 https://bugs.python.org/issue32959 opened by mchouza #32960: dataclasses: disallow inheritance between frozen and non-froze https://bugs.python.org/issue32960 opened by eric.smith #32962: test_gdb fails in debug build with `-mcet -fcf-protection -O0` https://bugs.python.org/issue32962 opened by ishcherb #32963: Python 2.7 tutorial claims source code is UTF-8 encoded https://bugs.python.org/issue32963 opened by mjpieters #32964: Reuse a testing implementation of the path protocol in tests https://bugs.python.org/issue32964 opened by serhiy.storchaka #32966: Python 3.6.4 - 0x80070643 - Fatal Error during installation https://bugs.python.org/issue32966 opened by exceltw #32968: Fraction modulo infinity should behave consistently with other https://bugs.python.org/issue32968 opened by elias #32969: Add more constants to zlib module https://bugs.python.org/issue32969 opened by xiang.zhang #32970: Improve disassembly of the MAKE_FUNCTION instruction https://bugs.python.org/issue32970 opened by serhiy.storchaka #32971: Docs on unittest.TestCase.assertRaises() should clarify contex https://bugs.python.org/issue32971 opened by nodakai #32972: unittest.TestCase coroutine support https://bugs.python.org/issue32972 opened by pdxjohnny #32973: Importing the same extension module under multiple names break https://bugs.python.org/issue32973 opened by twouters #32974: Add bitwise operations 
and other missing comparison methods to https://bugs.python.org/issue32974 opened by Kyle Agronick #32975: mailbox: It would be nice to move mailbox.Message from legacy https://bugs.python.org/issue32975 opened by maxking #32976: linux/random.h present but cannot be compiled https://bugs.python.org/issue32976 opened by mfschmidt #32977: added acts_like decorator to dataclasses module https://bugs.python.org/issue32977 opened by ninjaaron #32978: Issues with reading large float values in AIFC files https://bugs.python.org/issue32978 opened by serhiy.storchaka #32981: Catastrophic backtracking in poplib and difflib https://bugs.python.org/issue32981 opened by davisjam #32982: Parse out invisible Unicode characters? https://bugs.python.org/issue32982 opened by leewz #32983: UnicodeDecodeError 'ascii' codec can't decode byte in position https://bugs.python.org/issue32983 opened by Jiri Prajzner Most recent 15 issues with no replies (15) ========================================== #32982: Parse out invisible Unicode characters? 
https://bugs.python.org/issue32982 #32981: Catastrophic backtracking in poplib and difflib https://bugs.python.org/issue32981 #32976: linux/random.h present but cannot be compiled https://bugs.python.org/issue32976 #32974: Add bitwise operations and other missing comparison methods to https://bugs.python.org/issue32974 #32971: Docs on unittest.TestCase.assertRaises() should clarify contex https://bugs.python.org/issue32971 #32970: Improve disassembly of the MAKE_FUNCTION instruction https://bugs.python.org/issue32970 #32969: Add more constants to zlib module https://bugs.python.org/issue32969 #32962: test_gdb fails in debug build with `-mcet -fcf-protection -O0` https://bugs.python.org/issue32962 #32958: Urllib proxy_bypass crashes for urls containing long basic aut https://bugs.python.org/issue32958 #32957: distutils.command.install checks truthiness of .ext_modules in https://bugs.python.org/issue32957 #32952: Add __qualname__ for attributes of Mock instances https://bugs.python.org/issue32952 #32950: profiling python gc https://bugs.python.org/issue32950 #32948: clang compiler warnings on Travis https://bugs.python.org/issue32948 #32946: Speed up import from non-packages https://bugs.python.org/issue32946 #32943: confusing error message for rot13 codec https://bugs.python.org/issue32943 Most recent 15 issues waiting for review (15) ============================================= #32978: Issues with reading large float values in AIFC files https://bugs.python.org/issue32978 #32970: Improve disassembly of the MAKE_FUNCTION instruction https://bugs.python.org/issue32970 #32968: Fraction modulo infinity should behave consistently with other https://bugs.python.org/issue32968 #32964: Reuse a testing implementation of the path protocol in tests https://bugs.python.org/issue32964 #32960: dataclasses: disallow inheritance between frozen and non-froze https://bugs.python.org/issue32960 #32957: distutils.command.install checks truthiness of .ext_modules in 
https://bugs.python.org/issue32957 #32951: Prohibit direct instantiation of SSLSocket and SSLObject https://bugs.python.org/issue32951 #32949: Simplify "with"-related opcodes https://bugs.python.org/issue32949 #32947: Support OpenSSL 1.1.1 https://bugs.python.org/issue32947 #32946: Speed up import from non-packages https://bugs.python.org/issue32946 #32943: confusing error message for rot13 codec https://bugs.python.org/issue32943 #32940: IDLE: pyparse - simplify StringTranslatePseudoMapping https://bugs.python.org/issue32940 #32932: better error message when __all__ contains non-str objects https://bugs.python.org/issue32932 #32925: AST optimizer: Change a list into tuple in iterations and cont https://bugs.python.org/issue32925 #32924: Python 3.7 docs in docs.p.o points to GitHub's master branch https://bugs.python.org/issue32924 Top 10 most discussed issues (10) ================================= #32940: IDLE: pyparse - simplify StringTranslatePseudoMapping https://bugs.python.org/issue32940 14 msgs #32932: better error message when __all__ contains non-str objects https://bugs.python.org/issue32932 11 msgs #17288: cannot jump from a 'return' or 'exception' trace event https://bugs.python.org/issue17288 9 msgs #32056: Improve exceptions in aifc, sunau and wave https://bugs.python.org/issue32056 9 msgs #32968: Fraction modulo infinity should behave consistently with other https://bugs.python.org/issue32968 9 msgs #31961: subprocess._execute_child doesn't accept a single PathLike arg https://bugs.python.org/issue31961 8 msgs #32880: IDLE: Fix and update and cleanup pyparse https://bugs.python.org/issue32880 8 msgs #32954: Lazy Literal String Interpolation (PEP-498-based fl-strings) https://bugs.python.org/issue32954 8 msgs #32911: Doc strings no longer stored in body of AST https://bugs.python.org/issue32911 7 msgs #32604: Expose the subinterpreters C-API in Python for testing use. 
https://bugs.python.org/issue32604 6 msgs Issues closed (49) ================== #10507: Check well-formedness of reST markup within "make patchcheck" https://bugs.python.org/issue10507 closed by Mariatta #13897: Move fields relevant to sys.exc_info out of frame into generat https://bugs.python.org/issue13897 closed by serhiy.storchaka #15663: Investigate providing Tcl/Tk 8.6 with OS X installers https://bugs.python.org/issue15663 closed by ned.deily #15873: datetime: add ability to parse RFC 3339 dates and times https://bugs.python.org/issue15873 closed by belopolsky #17232: Improve -O docs https://bugs.python.org/issue17232 closed by terry.reedy #17611: Move unwinding of stack for "pseudo exceptions" from interpret https://bugs.python.org/issue17611 closed by serhiy.storchaka #18293: ssl.wrap_socket (cert_reqs=...), getpeercert, and unvalidated https://bugs.python.org/issue18293 closed by christian.heimes #18855: Inconsistent README filenames https://bugs.python.org/issue18855 closed by Mariatta #21541: Provide configure option --with-ssl for compilation with custo https://bugs.python.org/issue21541 closed by christian.heimes #24334: SSLSocket extra level of indirection https://bugs.python.org/issue24334 closed by christian.heimes #25059: Mistake in input-output tutorial regarding print() seperator https://bugs.python.org/issue25059 closed by Mariatta #25115: SSL_set_verify_depth not exposed by the ssl module https://bugs.python.org/issue25115 closed by christian.heimes #25404: ssl.SSLcontext.load_dh_params() does not handle unicode filena https://bugs.python.org/issue25404 closed by christian.heimes #27876: Add SSLContext.set_version_range(minver, maxver=None) https://bugs.python.org/issue27876 closed by christian.heimes #28414: SSL match_hostname fails for internationalized domain names https://bugs.python.org/issue28414 closed by njs #29237: Create enum for pstats sorting options https://bugs.python.org/issue29237 closed by ethan.furman #29480: Mac OSX 
Installer SSL Roots https://bugs.python.org/issue29480 closed by ned.deily #30607: Extract documentation theme into a separate package https://bugs.python.org/issue30607 closed by ned.deily #31013: gcc7 throws warning when pymem.h development header is used https://bugs.python.org/issue31013 closed by ned.deily #31399: Let OpenSSL verify hostname and IP address https://bugs.python.org/issue31399 closed by christian.heimes #31518: ftplib, urllib2, poplib, httplib, urllib2_localnet use ssl.PRO https://bugs.python.org/issue31518 closed by christian.heimes #31997: SSL lib does not handle trailing dot (period) in hostname or c https://bugs.python.org/issue31997 closed by christian.heimes #32185: SSLContext.wrap_socket sends SNI Extension when server_hostnam https://bugs.python.org/issue32185 closed by christian.heimes #32304: Upload failed (400): Digests do not match on .tar.gz ending wi https://bugs.python.org/issue32304 closed by eric.araujo #32378: test_npn_protocols broken with LibreSSL 2.6.1+ https://bugs.python.org/issue32378 closed by christian.heimes #32394: socket lib beahavior change in 3.6.4 https://bugs.python.org/issue32394 closed by steve.dower #32500: PySequence_Length() raises TypeError on dict type https://bugs.python.org/issue32500 closed by Mariatta #32609: Add setter and getter for min/max protocol version https://bugs.python.org/issue32609 closed by christian.heimes #32647: Undefined references when compiling ctypes on binutils 2.29.1 https://bugs.python.org/issue32647 closed by cstratak #32732: LoggingAdapter ignores extra kwargs of Logger#log() https://bugs.python.org/issue32732 closed by vinay.sajip #32819: match_hostname() error reporting bug https://bugs.python.org/issue32819 closed by christian.heimes #32875: Add __exit__() method to event loops https://bugs.python.org/issue32875 closed by terry.reedy #32877: Login to bugs.python.org with Google account NOT working https://bugs.python.org/issue32877 closed by terry.reedy #32901: Update 3.7 and 
3.8 Windows and macOS installer builds to tcl/t https://bugs.python.org/issue32901 closed by ned.deily #32903: os.chdir() may leak memory on Windows https://bugs.python.org/issue32903 closed by xiang.zhang #32916: IDLE: change 'str' to 'code' in idlelib.pyparse.PyParse and us https://bugs.python.org/issue32916 closed by terry.reedy #32923: Typo in documentation of unittest: whilst instead of while https://bugs.python.org/issue32923 closed by Mariatta #32928: _findvs failing on Windows 10 (Release build only) https://bugs.python.org/issue32928 closed by WildCard65 #32929: Change dataclasses hashing to use unsafe_hash boolean (default https://bugs.python.org/issue32929 closed by eric.smith #32930: [help][webbrowser always opens new tab. I want to open in the https://bugs.python.org/issue32930 closed by csabella #32931: Python 3.70b1 specifies non-existent compiler gcc++ https://bugs.python.org/issue32931 closed by ned.deily #32944: Need Guidance on Solving the Tcl problem https://bugs.python.org/issue32944 closed by terry.reedy #32945: sorted(generator) is slower than sorted(list-comprehension) https://bugs.python.org/issue32945 closed by rhettinger #32956: python 3 round bug https://bugs.python.org/issue32956 closed by serhiy.storchaka #32961: namedtuple displaying the internal code https://bugs.python.org/issue32961 closed by eric.smith #32965: Passing a bool to io.open() should raise a TypeError, not read https://bugs.python.org/issue32965 closed by ned.deily #32967: make check in devguide failing https://bugs.python.org/issue32967 closed by Mariatta #32979: dict get() function equivalent for lists. 
https://bugs.python.org/issue32979 closed by rhettinger #32980: Remove functions that do nothing in _Py_InitializeCore() https://bugs.python.org/issue32980 closed by ncoghlan From mikez302 at gmail.com Fri Mar 2 21:15:54 2018 From: mikez302 at gmail.com (Elias Zamaria) Date: Fri, 2 Mar 2018 18:15:54 -0800 Subject: [Python-Dev] Any way to only receive emails for threads that I am participating in? Message-ID: I am trying to participate in the discussion about PEP 467, but I'm still kind of getting used to the mailing list. It seems like I can either subscribe and get emails for all of the threads, or unsubscribe and not get any emails, making me unable to reply to the threads I want to reply to. The batched daily digest feature makes the emails more tolerable, but ideally, I would like to only get emails for a few specific subjects I care a lot about. I can probably set up some archiving or filtering on my end, but I am wondering if anyone has a way to only subscribe to certain threads, or any general suggestions for dealing with the mailing list. Also, I was unsubscribed for a while, when someone sent a message in the PEP 467 thread. Is there a way to reply to messages that were sent when I was not subscribed? -------------- next part -------------- An HTML attachment was scrubbed... URL: From drsalists at gmail.com Fri Mar 2 23:41:05 2018 From: drsalists at gmail.com (Dan Stromberg) Date: Fri, 2 Mar 2018 20:41:05 -0800 Subject: [Python-Dev] Any way to only receive emails for threads that I am participating in? In-Reply-To: References: Message-ID: On Fri, Mar 2, 2018 at 6:15 PM, Elias Zamaria wrote: > It seems like I can either subscribe and get emails for all of the threads, > or unsubscribe and not get any emails, making me unable to reply to the > threads I want to reply to. The batched daily digest feature makes the > emails more tolerable, but ideally, I would like to only get emails for a > few specific subjects I care a lot about. 
Maybe gmane combined with a threaded newsreader (NNTP client) would be
close enough?

From barry at python.org Sat Mar 3 03:03:49 2018
From: barry at python.org (Barry Warsaw)
Date: Sat, 3 Mar 2018 00:03:49 -0800
Subject: [Python-Dev] Any way to only receive emails for threads that I am participating in?
In-Reply-To:
References:
Message-ID: <64172F1C-785E-41F0-9B51-634F934F9004@python.org>

On Mar 2, 2018, at 20:41, Dan Stromberg wrote:
>
> Maybe gmane combined with a threaded newsreader (NNTP client) would be
> close enough?

While I guzzle the firehose of python-dev from my inbox, Gmane+NNTP is how
I read many other Python mailing lists, including python-ideas. It's a
great solution if you want to participate in just a few threads. I
personally use Postbox for that, but there are good free open source MUAs
too, depending on your platform. My favorite on Linux is Claws Mail.

Cheers,
-Barry

From anthony.flury at btinternet.com Sat Mar 3 12:54:28 2018
From: anthony.flury at btinternet.com (anthony.flury at btinternet.com)
Date: Sat, 3 Mar 2018 17:54:28 +0000 (UTC)
Subject: [Python-Dev] Issue 32933
References: <1162216174.14491085.1520099668991.ref@mail.yahoo.com>
Message-ID: <1162216174.14491085.1520099668991@mail.yahoo.com>

I mentioned when I raised issue 32933 (mock_open doesn't support dunder
iter protocol) that I have a one line fix for the issue.

Who should I send this detail to? Obviously it will need test cases too -
I wrote a single very simple one just to prove to myself it worked - but
it isn't complete by any stretch of the imagination.

--
Anthony Flury
anthony.flury at btinternet.com
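For context on issue 32933, here is a small sketch of the usage pattern it concerns: code under test that iterates directly over a file object opened through `mock_open`. It assumes an interpreter in which `mock_open` handles support iteration (to the best of my knowledge Python 3.8 and later, but treat that version as an assumption); on interpreters without that support the loop may not see the mocked data. The function name `read_lines` and the file name are hypothetical.

```python
from unittest.mock import mock_open, patch


def read_lines(path):
    """Return the lines of a text file by iterating over the file object."""
    with open(path) as f:
        # This relies on the file handle supporting the iterator protocol,
        # which is the behaviour issue 32933 asks mock_open to provide.
        return [line for line in f]


fake_open = mock_open(read_data="first\nsecond\n")

# Replace the built-in open() for the duration of the block; nothing on
# disk is touched, so the path is purely illustrative.
with patch("builtins.open", fake_open):
    lines = read_lines("ignored.txt")

print(lines)
fake_open.assert_called_once_with("ignored.txt")
```

A complete test for the fix would presumably also cover `readline()` and `readlines()` interleaved with iteration, which this sketch does not attempt.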
From sanyam.khurana01 at gmail.com Sat Mar 3 13:54:34 2018
From: sanyam.khurana01 at gmail.com (Sanyam Khurana)
Date: Sun, 4 Mar 2018 00:24:34 +0530
Subject: [Python-Dev] Issue 32933
In-Reply-To: <1162216174.14491085.1520099668991@mail.yahoo.com>
References: <1162216174.14491085.1520099668991.ref@mail.yahoo.com> <1162216174.14491085.1520099668991@mail.yahoo.com>
Message-ID:

Hey Tony,

You can raise a PR and then start working on writing a test case of it.
People would then be able to see what exactly you've done and suggest
changes if need be.

Let me know if you've any more questions.

On Sat, Mar 3, 2018 at 11:24 PM, TonyFlury via Python-Dev wrote:
> I mentioned when I raised issue 32933 (mock_open doesn't support dunder iter
> protocol) that I have a one line fix for the issue.
>
> Who should I send this detail too ? Obviously it will need test cases too -
> I wrote a single very simple one just to prove to myself it worked - but it
> isn't complete by any stretch of the imagination.
>
> --
> Anthony Flury
> anthony.flury at btinternet.com
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> https://mail.python.org/mailman/options/python-dev/sanyam.khurana01%40gmail.com
>

--
Mozilla Rep
http://www.SanyamKhurana.com
Github: CuriousLearner

From anthony.flury at btinternet.com Sun Mar 4 14:36:58 2018
From: anthony.flury at btinternet.com (anthony.flury at btinternet.com)
Date: Sun, 4 Mar 2018 19:36:58 +0000 (UTC)
Subject: [Python-Dev] Python Test suite hangining
In-Reply-To: <1162216174.14491085.1520099668991@mail.yahoo.com>
References: <1162216174.14491085.1520099668991.ref@mail.yahoo.com> <1162216174.14491085.1520099668991@mail.yahoo.com>
Message-ID: <1496840137.15189488.1520192218828@mail.yahoo.com>

All,
Sorry to trouble you all - but I am trying to get the Python 3.8 test
suite running on Ubuntu 16.0.4.

As per the dev guide - having cloned the repo and run the build I am
running the test suite by : ./python -m test -j1

This runs through to test 414/415 - and then starts getting this message

    running: test_poplib (nnn sec)

every 30 seconds - nnn got to 1077 secs before I decided to stop it.

When I ran test_poplib on its own - I got this Traceback :

    Traceback (most recent call last):
        .......
        .......
      File "/home/tony/Development/python/cpython/Lib/ssl.py", line 1108, in do_handshake
        self._sslobj.do_handshake()
    ssl.SSLError: [SSL: SSLV3_ALERT_CERTIFICATE_UNKNOWN] sslv3 alert certificate unknown (_ssl.c:1038)

And then lots of reports of unexpected EOFs.

I have the latest CA-certificates installed

Any advice - I couldn't find anything on google.

--
Anthony Flury
anthony.flury at btinternet.com

From: "anthony.flury at btinternet.com"
To: "Python-Dev at python.org"
Sent: Saturday, March 3, 2018 5:54 PM
Subject: Issue 32933

I mentioned when I raised issue 32933 (mock_open doesn't support dunder
iter protocol) that I have a one line fix for the issue.

Who should I send this detail too ? Obviously it will need test cases too -
I wrote a single very simple one just to prove to myself it worked - but
it isn't complete by any stretch of the imagination.

--
Anthony Flury
anthony.flury at btinternet.com
URL: From brett at python.org Mon Mar 5 13:13:47 2018 From: brett at python.org (Brett Cannon) Date: Mon, 05 Mar 2018 18:13:47 +0000 Subject: [Python-Dev] Python Test suite hangining In-Reply-To: <1496840137.15189488.1520192218828@mail.yahoo.com> References: <1162216174.14491085.1520099668991.ref@mail.yahoo.com> <1162216174.14491085.1520099668991@mail.yahoo.com> <1496840137.15189488.1520192218828@mail.yahoo.com> Message-ID: On Sun, 4 Mar 2018 at 11:38 TonyFlury via Python-Dev wrote: > All, > Sorry to trouble you all - but I am trying to get the Python 3.8 test > suite running on Ubuntu 16.0.4. > > As per the dev guide - having cloned the repo and run the build I am > running the test suite by : ./python -m test -j1 > > This runs through to test 414/415 - and then start getting this message > > running: test_poplib (nnn sec) > > every 30 seconds - nnn got to 1077 secs before I decided to stop it. > > > When I ran test_poplib on it's own - I got this Traceback : > > Traceback (most recent call last): > ....... > ....... > File "/home/tony/Development/python/cpython/Lib/ssl.py", line 1108, > in do_handshake > self._sslobj.do_handshake() > ssl.SSLError: [SSL: SSLV3_ALERT_CERTIFICATE_UNKNOWN] sslv3 alert > certificate unknown (_ssl.c:1038) > > And then lots of reports of unexpected EOFs. > > I have the latest CA-certificates installed > > Any advice - I couldn't find anything on google. > > CI is testing, so this isn't a universal issue: https://travis-ci.org/python/cpython/jobs/349398791 . So I would see if you could diagnose whether your latest certs are so new compared to what others who run the test suite are using that that's what is causing your failure compared to others. 
-Brett > > -- > Anthony Flury > anthony.flury at btinternet.com > > > ------------------------------ > *From:* "anthony.flury at btinternet.com" > *To:* "Python-Dev at python.org" > *Sent:* Saturday, March 3, 2018 5:54 PM > *Subject:* Issue 32933 > > I mentioned when I raised issue 32933 (mock_open doesn't support dunder > iter protocol) that I have a one line fix for the issue. > > Who should I send this detail too ? Obviously it will need test cases too > - I wrote a single very simple one just to prove to myself it worked - but > it isn't complete by any stretch of the imagination. > > -- > Anthony Flury > anthony.flury at btinternet.com > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/brett%40python.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mikez302 at gmail.com Mon Mar 5 13:03:53 2018 From: mikez302 at gmail.com (Elias Zamaria) Date: Mon, 5 Mar 2018 10:03:53 -0800 Subject: [Python-Dev] Any way to only receive emails for threads that I am participating in? In-Reply-To: <64172F1C-785E-41F0-9B51-634F934F9004@python.org> References: <64172F1C-785E-41F0-9B51-634F934F9004@python.org> Message-ID: Thanks for the suggestions, everyone. I think for now, I'll just try to deal with it and maybe set up some filtering in Gmail. I'm not sure if it is worth changing my habits and getting over the learning curve of these tools just to deal with this one mailing list. On Sat, Mar 3, 2018 at 12:03 AM, Barry Warsaw wrote: > On Mar 2, 2018, at 20:41, Dan Stromberg wrote: > > > > Maybe gmane combined with a threaded newsreader (NNTP client) would be > > close enough? > > While I guzzle the firehose of python-dev from my inbox, Gmane+NNTP is how > I read many other Python mailing lists, including python-ideas. 
It's a
> great solution if you want to participate in just a few threads. I
> personally use Postbox for that, but there good free open source MUAs too,
> depending on your platform. My favorite on Linux is Claws Mail.
>
> Cheers,
> -Barry
>
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: https://mail.python.org/mailman/options/python-dev/mikez302%40gmail.com
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From christian at python.org Mon Mar 5 18:40:57 2018
From: christian at python.org (Christian Heimes)
Date: Mon, 5 Mar 2018 18:40:57 -0500
Subject: [Python-Dev] Python Test suite hangining
In-Reply-To:
References: <1162216174.14491085.1520099668991.ref@mail.yahoo.com> <1162216174.14491085.1520099668991@mail.yahoo.com> <1496840137.15189488.1520192218828@mail.yahoo.com>
Message-ID:

On 2018-03-05 13:13, Brett Cannon wrote:
>
>
> On Sun, 4 Mar 2018 at 11:38 TonyFlury via Python-Dev wrote:
>
>     All,
>     Sorry to trouble you all - but I am trying to get the Python 3.8
>     test suite running on Ubuntu 16.0.4.
>
>     As per the dev guide - having cloned the repo and run the build I am
>     running the test suite by : ./python -m test -j1
>
>     This runs through to test 414/415 - and then start getting this message
>
>         running: test_poplib (nnn sec)
>
>         every 30 seconds - nnn got to 1077 secs before I decided to stop it.
>
>
>     When I ran test_poplib on it's own - I got this Traceback :
>
>         Traceback (most recent call last):
>           .......
>           .......
>           File "/home/tony/Development/python/cpython/Lib/ssl.py", line 1108, in do_handshake
>             self._sslobj.do_handshake()
>         ssl.SSLError: [SSL: SSLV3_ALERT_CERTIFICATE_UNKNOWN] sslv3 alert certificate unknown (_ssl.c:1038)
>
>     And then lots of reports of unexpected EOFs.
>
>     I have the latest CA-certificates installed
>
>     Any advice - I couldn't find anything on google.
>
>
> CI is testing, so this isn't a universal issue:
> https://travis-ci.org/python/cpython/jobs/349398791 . So I would see if
> you could diagnose whether your latest certs are so new compared to what
> others who run the test suite are using that that's what is causing your
> failure compared to others.

Python's test suite should depend on public CAs. The issue in poplib is
related to https://bugs.python.org/issue32706 . I'll address the problem
when I'm back home in Germany next week.

Christian

From ncoghlan at gmail.com Mon Mar 5 20:34:44 2018
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 6 Mar 2018 11:34:44 +1000
Subject: [Python-Dev] Any way to only receive emails for threads that I am participating in?
In-Reply-To:
References: <64172F1C-785E-41F0-9B51-634F934F9004@python.org>
Message-ID:

On 6 March 2018 at 04:03, Elias Zamaria wrote:

> Thanks for the suggestions, everyone. I think for now, I'll just try to
> deal with it and maybe set up some filtering in Gmail. I'm not sure if it
> is worth changing my habits and getting over the learning curve of these
> tools just to deal with this one mailing list.
>

One useful feature in Gmail for getting the entire list out of your inbox
is the "Filter messages like this" option in the per-message drop down (it
prepopulates a new filter definition with the appropriate list headers).

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From mcepl at cepl.eu Tue Mar 6 01:08:13 2018
From: mcepl at cepl.eu (Matěj Cepl)
Date: Tue, 06 Mar 2018 07:08:13 +0100
Subject: [Python-Dev] Any way to only receive emails for threads that I am participating in?
References:
Message-ID:

On 2018-03-03, 02:15 GMT, Elias Zamaria wrote:
> It seems like I can either subscribe and get emails for all of
> the threads, or unsubscribe and not get any emails, making me
> unable to reply to the threads I want to reply to.

Go to https://mail.python.org/mailman/listinfo/python-dev, fill in
your email in the last input box, and click on [Unsubscribe or
edit options]. On the next page fill in your password (you get
it every month in email), and on the settings page switch "Mail
delivery" to "Disabled".

You will not get any messages from the list, but you will
still be subscribed, so you can post to the list.

Then open your NNTP newsreader
(https://en.wikipedia.org/wiki/Newsreader_(Usenet)) (you use
one, right? It is the only sensible way to deal with
large community lists) and subscribe to
nntp://news.gmane.org/gmane.comp.python.devel . You will have
all the advantages of a newsreader (killfile, easily ignoring whole
threads at once) in the comfort of your computer, perhaps even
offline.

If you don't want to go to all that length, just use an email program
which can kill threads (e.g., Thunderbird
https://support.mozilla.org/en-US/kb/ignore-threads )

Best,

Matěj
--
https://matej.ceplovi.cz/blog/, Jabber: mcepl at ceplovi.cz
GPG Finger: 3C76 A027 CA45 AD70 98B5 BC1D 7920 5802 880B C9D8

Of all tyrannies, a tyranny exercised for the good of its
victims may be the most oppressive. It may be better to live
under robber barons than under omnipotent moral busybodies. The
robber baron's cruelty may sometimes sleep, his cupidity may at
some point be satiated; but those who torment us for our own
good will torment us without end, for they do so with the
approval of their consciences.
    -- C. S.
Lewis

From storchaka at gmail.com Tue Mar 6 06:52:00 2018
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Tue, 6 Mar 2018 13:52:00 +0200
Subject: [Python-Dev] Fix strncpy warning with gcc 8 (#5840)
In-Reply-To: <3zwY7G1QqjzFr9b@mail.python.org>
References: <3zwY7G1QqjzFr9b@mail.python.org>
Message-ID:

06.03.18 12:34, Xiang Zhang wrote:
> https://github.com/python/cpython/commit/efd2bac1564f8141a4eab1bf8779b412974b8d69
> commit: efd2bac1564f8141a4eab1bf8779b412974b8d69
> branch: master
> author: Siddhesh Poyarekar
> committer: Xiang Zhang
> date: 2018-03-06T18:34:35+08:00
> summary:
>
> Fix strncpy warning with gcc 8 (#5840)
>
> The length in strncpy is one char too short and as a result it leads
> to a build warning with gcc 8. Comment out the strncpy since the
> interpreter aborts immediately after anyway.
>
> files:
> M Python/pystrtod.c
>
> diff --git a/Python/pystrtod.c b/Python/pystrtod.c
> index 9bf936386210..601f7c691edf 100644
> --- a/Python/pystrtod.c
> +++ b/Python/pystrtod.c
> @@ -1060,8 +1060,8 @@ format_float_short(double d, char format_code,
>          else {
>              /* shouldn't get here: Gay's code should always return
>                 something starting with a digit, an 'I', or 'N' */
> -            strncpy(p, "ERR", 3);
> -            /* p += 3; */
> +            /* strncpy(p, "ERR", 3);
> +               p += 3; */
>              Py_UNREACHABLE();
>          }
>          goto exit;
>

I think this code was added on purpose. In the case of a programming error
we could get a meaningful value in post-mortem debugging. But after
replacing assert(0) with Py_UNREACHABLE this perhaps no longer makes sense.

If this code is no longer needed, it is better to remove it than to keep
an obscure comment.

What are your thoughts @warsaw and @ericvsmith?

Py_UNREACHABLE was added in issue31338 by Barry. The original code was
added in issue1580 by Eric Smith (maybe it was copied from another place).
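
For readers who do not have the C semantics in their head, the "one char too short" remark in the commit message can be illustrated with a small pure-Python model. This is only a sketch under stated assumptions: `strncpy_model` is a hypothetical helper written for this note, not anything in CPython or the C library, and it models only the documented copy-and-pad behaviour of `strncpy` on a Python `bytearray`.

```python
def strncpy_model(dest: bytearray, offset: int, src: bytes, n: int) -> None:
    """Hypothetical model of C strncpy(dest + offset, src, n) on a bytearray."""
    # Treat src as a NUL-terminated C string: its bytes plus one trailing NUL.
    c_string = src + b"\x00"
    for i in range(n):
        # Copy source bytes (including the NUL if n allows), then pad with NULs.
        dest[offset + i] = c_string[i] if i < len(c_string) else 0


buf = bytearray(b"########")       # stand-in for a buffer with old contents
strncpy_model(buf, 0, b"ERR", 3)   # same length argument as the removed call
unterminated = bytes(buf)          # the three letters are copied, no NUL follows

buf2 = bytearray(b"########")
strncpy_model(buf2, 0, b"ERR", 4)  # one more byte leaves room for the NUL
terminated = bytes(buf2)
```

With a length of 3 the model writes only `E`, `R`, `R` and leaves whatever was in the buffer after them, which matches the commit message's description of the length being one char too short; with a length of 4 the terminator is written as well.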
From mikez302 at gmail.com Tue Mar 6 12:02:01 2018
From: mikez302 at gmail.com (Elias Zamaria)
Date: Tue, 6 Mar 2018 09:02:01 -0800
Subject: [Python-Dev] Any way to only receive emails for threads that I am participating in?
In-Reply-To:
References:
Message-ID:

Matěj, thanks for the suggestion. I haven't used Usenet in over a decade. I
may try using a Usenet reader if I feel like trying something new (actually
very old, but kind of unfamiliar to me since it is no longer fresh in my
head), but for now, I'm going to use Gmail's filtering so I can prevent the
python-dev messages from being mixed with the other stuff in my inbox. If
that is somehow not good enough for me, after taking a few days or weeks to
get used to it, then I will try other ways of dealing with it.

On Mon, Mar 5, 2018 at 10:08 PM, Matěj Cepl wrote:

> On 2018-03-03, 02:15 GMT, Elias Zamaria wrote:
> > It seems like I can either subscribe and get emails for all of
> > the threads, or unsubscribe and not get any emails, making me
> > unable to reply to the threads I want to reply to.
>
> Go to https://mail.python.org/mailman/listinfo/python-dev in the
> last input box fill in your email, and click on [Unsubscribe or
> edit options]. On the next page fill-in your password (you get
> it every month in email), and on the settings page switch "Mail
> delivery" to "Disabled".
>
> You will not get any message from the list, but you will be
> still subscribed, so you can post to the list.
>
> Then open your NNTP newsreader
> (https://en.wikipedia.org/wiki/Newsreader_(Usenet)) (you use
> one, right? It is the only sensible way how to deal with the
> large community lists) and subscribe to
> nntp://news.gmane.org/gmane.comp.python.devel . You will have
> all advantages of newsreader (killfile, easy ignoring whole
> threads at once) in the comfort of your computer, perhaps even
> in the offline state.
>
> If you don't want to go all that length, just use email program
> which can kill threads (e.g., Thunderbird
> https://support.mozilla.org/en-US/kb/ignore-threads )
>
> Best,
>
> Matěj
> --
> https://matej.ceplovi.cz/blog/, Jabber: mcepl at ceplovi.cz
> GPG Finger: 3C76 A027 CA45 AD70 98B5 BC1D 7920 5802 880B C9D8
>
> Of all tyrannies, a tyranny exercised for the good of its
> victims may be the most oppressive. It may be better to live
> under robber barons than under omnipotent moral busybodies. The
> robber baron's cruelty may sometimes sleep, his cupidity may at
> some point be satiated; but those who torment us for our own
> good will torment us without end, for they do so with the
> approval of their consciences.
>     -- C. S. Lewis
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: https://mail.python.org/mailman/options/python-dev/mikez302%40gmail.com
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From eric at trueblade.com Tue Mar 6 14:45:59 2018
From: eric at trueblade.com (Eric V. Smith)
Date: Tue, 6 Mar 2018 14:45:59 -0500
Subject: [Python-Dev] Fix strncpy warning with gcc 8 (#5840)
In-Reply-To:
References: <3zwY7G1QqjzFr9b@mail.python.org>
Message-ID:

On 3/6/2018 6:52 AM, Serhiy Storchaka wrote:
> 06.03.18 12:34, Xiang Zhang wrote:
>> https://github.com/python/cpython/commit/efd2bac1564f8141a4eab1bf8779b412974b8d69
>>
>> commit: efd2bac1564f8141a4eab1bf8779b412974b8d69
>> branch: master
>> author: Siddhesh Poyarekar
>> committer: Xiang Zhang
>> date: 2018-03-06T18:34:35+08:00
>> summary:
>>
>> Fix strncpy warning with gcc 8 (#5840)
>>
>> The length in strncpy is one char too short and as a result it leads
>> to a build warning with gcc 8. Comment out the strncpy since the
>> interpreter aborts immediately after anyway.
>>
>> files:
>> M Python/pystrtod.c
>>
>> diff --git a/Python/pystrtod.c b/Python/pystrtod.c
>> index 9bf936386210..601f7c691edf 100644
>> --- a/Python/pystrtod.c
>> +++ b/Python/pystrtod.c
>> @@ -1060,8 +1060,8 @@ format_float_short(double d, char format_code,
>>          else {
>>              /* shouldn't get here: Gay's code should always return
>>                 something starting with a digit, an 'I', or 'N' */
>> -            strncpy(p, "ERR", 3);
>> -            /* p += 3; */
>> +            /* strncpy(p, "ERR", 3);
>> +               p += 3; */
>>              Py_UNREACHABLE();
>>          }
>>          goto exit;
>>
>
> I think this code was added for purpose. In the case of programming
> error we could get meaningful value in post-mortal debugging. But after
> replacing assert(0) with Py_UNREACHABLE this perhaps lost a sense.
>
> If this code is no longer needed it is better to remove it than keeping
> an obscure comment.
>
> What are your thoughts @warsaw and @ericvsmith?
>
> Py_UNREACHABLE was added in issue31338 by Barry. The original code was
> added in issue1580 by Eric Smith (maybe it was copied from other place).

Mark Dickinson and/or I wrote that.

I agree that leaving the two commented out lines is confusing. I suggest
deleting them, and of course leave the comment about Gay's code.

Eric

From mark at hotpy.org Wed Mar 7 12:26:24 2018
From: mark at hotpy.org (Mark Shannon)
Date: Wed, 7 Mar 2018 17:26:24 +0000
Subject: [Python-Dev] Backward incompatible change about docstring AST
In-Reply-To:
References:
Message-ID:

On 27/02/18 13:37, INADA Naoki wrote:
> Hi, all.
>
> There is design discussion which is deferred blocker of 3.7.
> https://bugs.python.org/issue32911
>
> ## Background
>
> An year ago, I moved docstring in AST from statements list to field of
> module, class and functions.
> https://bugs.python.org/issue29463
>
> Without this change, AST-level constant folding was complicated because
> "foo" can be docstring but "fo" + "o" can't be docstring.
>
> This simplified some other edge cases. For example, future import must
> be on top of the module, but docstring can be before it.
> Docstring is very special than other expressions/statement.
>
> Of course, this change was backward incompatible.
> Tools reading/writing docstring via AST will be broken by this change.
> For example, it broke PyFlakes, and PyFlakes solved it already.
>
> https://github.com/PyCQA/pyflakes/pull/273
>
> Since AST doesn't guarantee backward compatibility, we can change
> AST if it's reasonable.

The AST module does make some guarantees. The general advice for anyone
wanting to do bytecode generation is "don't generate bytecodes directly,
use the AST module."
However, as long as the AST -> bytecode conversion remains the same, I
think it is OK to change source -> AST conversion.

>
> Last week, Mark Shannon reported issue about this backward incompatibility.
> As he said, this change losted lineno and column of docstring from AST.
>
> https://bugs.python.org/issue32911#msg312567
>
>
> ## Design discussion
>
> And as he said, there are three options:
>
> https://bugs.python.org/issue32911#msg312625
>
>> It seems to be that there are three reasonable choices:
>> 1. Revert to 3.6 behaviour, with the addition of `docstring` attribute.
>> 2. Change the docstring attribute to an AST node, possibly by modifying the grammar.
>> 3. Do nothing.
>
> 1 is backward compatible about reading docstring.
> But when writing, it's not DRY or SSOT. There are two source of docstring.
> For example: `ast.Module([ast.Str("spam")], docstring="egg")`
>
> 2 is considerable. I tried to implement this idea by adding `DocString`
> statement AST.
> https://github.com/python/cpython/pull/5927/files

This is my preferred option now.
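
To make the two technical points in this exchange concrete (a string literal versus `"fo" + "o"`, and the line number of the docstring), here is a minimal sketch using only `ast.parse` and `ast.get_docstring`. It assumes an interpreter in which the docstring is stored as the first statement of `body`, i.e. the 3.6-style layout that option 1 refers to; on the 3.7 betas discussed in this thread the layout was different, so the `tree.body[0]` lines would not apply there.

```python
import ast

source = '"""Module docstring."""\nx = 1\n'
tree = ast.parse(source)

# ast.get_docstring is the layout-independent way to read a docstring.
doc = ast.get_docstring(tree)

# Assumption: 3.6-style layout, where the docstring is the first statement
# in the body, so its source position is carried by that node.
first = tree.body[0]
doc_lineno = first.lineno

# "fo" + "o" is an expression, not a plain string literal, in the tree
# returned by ast.parse, so it is not reported as a docstring.
not_a_docstring = ast.get_docstring(ast.parse('"fo" + "o"\n'))
```

Tools that only need the text can stay on `ast.get_docstring`; it is code that inspects `body[0]` directly, or that needs `lineno` and the column, that is sensitive to which of the three options is chosen.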
> > While it seems large change, most changes are reverting the AST changes. > So it's more closer to 3.6 codebase. (especially, test_ast is very > close to 3.6) > > In this PR, `ast.Module([ast.Str("spam")])` doesn't have docstring for > simplicity. So it's backward incompatible for both of reading and > writing docstring too. > But it keeps lineno and column of docstring in AST. > > 3 is most conservative because 3.7b2 was cut now and there are some tools > supporting 3.7 already. > > > I prefer 2 or 3. If we took 3, I don't want to do 2 in 3.8. One > backward incompatible > change is better than two. I agree. Whatever we do, we should stick with it. Cheers, Mark. From nad at python.org Thu Mar 8 16:29:50 2018 From: nad at python.org (Ned Deily) Date: Thu, 8 Mar 2018 16:29:50 -0500 Subject: [Python-Dev] Reminder: 3.6.5rc1 cutoff coming up Message-ID: <43180E5E-EF4E-4C2D-ADB7-A85B12382C9E@python.org> A quick reminder that it's time for our next quarterly maintenance release of Python 3.6. The cutoff for 3.6.5rc1 is planned for 2018-03-12 end-of-the-day AOE. Please get any bug fixes and doc changes in before then. Expect that any changes merged after the 3.6.5rc1 cutoff will be released in 3.6.6 which is currently scheduled for 2018-06 (along with 3.7.0). Also, a reminder that 3.6.x has been out in the field for nearly 15 months now and thanks to all of your hard work in previous feature releases and during the 3.6 development phase, the 3.6 release series has seen remarkably quick adoption to overall great acclaim. Now that 3.6 has reached a certain level of maturity, it is important for all of us to continue to focus on stability for all of its downstream users. A key assumption of our maintenance strategy for years has been that we as a project will only maintain the most recent bugfix (or security) release. In other words, when we release x.y.z, we immediately drop support for x.y.z-1. 
To do that, we implicitly promise to users that they can "painlessly" upgrade from any x.y.n to x.y.z where n < z. To try to keep that promise, we strive to make no incompatible changes in x.y.z without *really* good reasons. I think it is important as 3.6 moves along in its lifecycle to put ourselves in the shoes of our users and ask ourselves if a change is really appropriate at this stage. There's no hard and fast rule here, just continue to use your best judgement. When in doubt, ask! FYI, I've adjusted the 3.6.x release schedule to allow for 2 additional quarterly maintenance releases after 3.7.0 releases instead of just one. That means the final bugfix release for the 3.6 series is planned to be 3.6.8 in 2018-12, 6 months after 3.7.0 releases and 2 years after 3.6.0 first released. Thereafter, only security issues will be accepted and addressed for the remaining life of 3.6. Thanks again! --Ned https://www.python.org/dev/peps/pep-0494/ -- Ned Deily nad at python.org -- [] From status at bugs.python.org Fri Mar 9 12:09:49 2018 From: status at bugs.python.org (Python tracker) Date: Fri, 9 Mar 2018 18:09:49 +0100 (CET) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20180309170949.4F28811A901@psf.upfronthosting.co.za> ACTIVITY SUMMARY (2018-03-02 - 2018-03-09) Python tracker at https://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message. 
Issues counts and deltas: open 6516 (+25) closed 38271 (+28) total 44787 (+53) Open issues with patches: 2546 Issues opened (39) ================== #31500: IDLE: Tiny font on HiDPI display https://bugs.python.org/issue31500 reopened by terry.reedy #32985: subprocess.Popen: Confusing documentation for restore_signals https://bugs.python.org/issue32985 opened by ntrrgc #32986: multiprocessing, default assumption of Pool size unhelpful https://bugs.python.org/issue32986 opened by M J Harvey #32987: tokenize.py parses unicode identifiers incorrectly https://bugs.python.org/issue32987 opened by steve #32989: IDLE: Fix pyparse.find_good_parse_start and its bad editor cal https://bugs.python.org/issue32989 opened by csabella #32990: Supporting extensible format(PCM) for wave.open(read-mode) https://bugs.python.org/issue32990 opened by acelletti #32993: urllib and webbrowser.open() can open w/ file: protocol https://bugs.python.org/issue32993 opened by yao zhihua #32995: Add a glossary entry for context variables https://bugs.python.org/issue32995 opened by serhiy.storchaka #32996: Improve What's New in 3.7 https://bugs.python.org/issue32996 opened by serhiy.storchaka #33000: IDLE Doc: Text consumes unlimited RAM, consoles likely not https://bugs.python.org/issue33000 opened by jbrearley #33001: Buffer overflow vulnerability in os.symlink on Windows (CVE-20 https://bugs.python.org/issue33001 opened by steve.dower #33002: Making a class formattable as hex/oct integer with printf-styl https://bugs.python.org/issue33002 opened by josh.r #33003: urllib: Document parse_http_list https://bugs.python.org/issue33003 opened by labrat #33006: docstring of filter function is incorrect https://bugs.python.org/issue33006 opened by Pierre Thibault #33007: Objects referencing private-mangled names do not roundtrip pro https://bugs.python.org/issue33007 opened by Antony.Lee #33008: urllib.request.parse_http_list incorrectly strips backslashes https://bugs.python.org/issue33008 opened by 
labrat #33010: os.path.isdir() returns True for broken directory symlinks or https://bugs.python.org/issue33010 opened by izbyshev #33012: Invalid function cast warnings with gcc 8 for METH_NOARGS https://bugs.python.org/issue33012 opened by siddhesh #33014: Clarify doc string for str.isidentifier() https://bugs.python.org/issue33014 opened by dabeaz #33015: Fix function cast warning in thread_pthread.h https://bugs.python.org/issue33015 opened by siddhesh #33016: nt._getfinalpathname may use uninitialized memory https://bugs.python.org/issue33016 opened by izbyshev #33017: Special set-cookie setting will bypass Cookielib https://bugs.python.org/issue33017 opened by LCatro #33018: Improve issubclass() error checking and message https://bugs.python.org/issue33018 opened by jab #33019: Review usage of environment variables in the stdlib https://bugs.python.org/issue33019 opened by pitrou #33020: Tkinter old style classes https://bugs.python.org/issue33020 opened by benkir07 #33021: Some fstat() calls do not release the GIL, possibly hanging al https://bugs.python.org/issue33021 opened by nirs #33023: Unable to copy ssl.SSLContext https://bugs.python.org/issue33023 opened by vitaly.krug #33024: asyncio.WriteTransport.set_write_buffer_limits orders its args https://bugs.python.org/issue33024 opened by vitaly.krug #33025: urlencode produces bad output from ssl.CERT_NONE and friends t https://bugs.python.org/issue33025 opened by vitaly.krug #33026: Fix jumping out of "with" block https://bugs.python.org/issue33026 opened by serhiy.storchaka #33027: handling filename encoding in Content-Disposition by cgi.Field https://bugs.python.org/issue33027 opened by pawciobiel #33028: tempfile.TemporaryDirectory incorrectly documented https://bugs.python.org/issue33028 opened by Richard Neumann #33029: Invalid function cast warnings with gcc 8 for getter and sette https://bugs.python.org/issue33029 opened by siddhesh #33030: GetLastError() may be overwritten by Py_END_ALLOW_THREADS 
https://bugs.python.org/issue33030 opened by steve.dower #33031: Questionable code in OrderedDict definition https://bugs.python.org/issue33031 opened by serhiy.storchaka #33032: Mention implicit cache in struct.Struct docs https://bugs.python.org/issue33032 opened by ncoghlan #33033: Clarify that the signed number convertors to PyArg_ParseTuple. https://bugs.python.org/issue33033 opened by Antony.Lee #33034: urllib.parse.urlparse and urlsplit not raising ValueError for https://bugs.python.org/issue33034 opened by jonathan-lp #33036: test_selectors.PollSelectorTestCase failing on macOS 10.13.3 https://bugs.python.org/issue33036 opened by n8henrie Most recent 15 issues with no replies (15) ========================================== #33031: Questionable code in OrderedDict definition https://bugs.python.org/issue33031 #33029: Invalid function cast warnings with gcc 8 for getter and sette https://bugs.python.org/issue33029 #33028: tempfile.TemporaryDirectory incorrectly documented https://bugs.python.org/issue33028 #33027: handling filename encoding in Content-Disposition by cgi.Field https://bugs.python.org/issue33027 #33025: urlencode produces bad output from ssl.CERT_NONE and friends t https://bugs.python.org/issue33025 #33017: Special set-cookie setting will bypass Cookielib https://bugs.python.org/issue33017 #33015: Fix function cast warning in thread_pthread.h https://bugs.python.org/issue33015 #33008: urllib.request.parse_http_list incorrectly strips backslashes https://bugs.python.org/issue33008 #33007: Objects referencing private-mangled names do not roundtrip pro https://bugs.python.org/issue33007 #33006: docstring of filter function is incorrect https://bugs.python.org/issue33006 #33003: urllib: Document parse_http_list https://bugs.python.org/issue33003 #32996: Improve What's New in 3.7 https://bugs.python.org/issue32996 #32995: Add a glossary entry for context variables https://bugs.python.org/issue32995 #32987: tokenize.py parses unicode identifiers 
incorrectly https://bugs.python.org/issue32987 #32985: subprocess.Popen: Confusing documentation for restore_signals https://bugs.python.org/issue32985 Most recent 15 issues waiting for review (15) ============================================= #33027: handling filename encoding in Content-Disposition by cgi.Field https://bugs.python.org/issue33027 #33026: Fix jumping out of "with" block https://bugs.python.org/issue33026 #33021: Some fstat() calls do not release the GIL, possibly hanging al https://bugs.python.org/issue33021 #33016: nt._getfinalpathname may use uninitialized memory https://bugs.python.org/issue33016 #33015: Fix function cast warning in thread_pthread.h https://bugs.python.org/issue33015 #33012: Invalid function cast warnings with gcc 8 for METH_NOARGS https://bugs.python.org/issue33012 #33006: docstring of filter function is incorrect https://bugs.python.org/issue33006 #33001: Buffer overflow vulnerability in os.symlink on Windows (CVE-20 https://bugs.python.org/issue33001 #32996: Improve What's New in 3.7 https://bugs.python.org/issue32996 #32989: IDLE: Fix pyparse.find_good_parse_start and its bad editor cal https://bugs.python.org/issue32989 #32986: multiprocessing, default assumption of Pool size unhelpful https://bugs.python.org/issue32986 #32981: Catastrophic backtracking in poplib (CVE-2018-1060) and diffli https://bugs.python.org/issue32981 #32978: Issues with reading large float values in AIFC files https://bugs.python.org/issue32978 #32970: Improve disassembly of the MAKE_FUNCTION instruction https://bugs.python.org/issue32970 #32968: Fraction modulo infinity should behave consistently with other https://bugs.python.org/issue32968 Top 10 most discussed issues (10) ================================= #32972: unittest.TestCase coroutine support https://bugs.python.org/issue32972 18 msgs #33018: Improve issubclass() error checking and message https://bugs.python.org/issue33018 13 msgs #32986: multiprocessing, default assumption of Pool size 
unhelpful https://bugs.python.org/issue32986 11 msgs #33001: Buffer overflow vulnerability in os.symlink on Windows (CVE-20 https://bugs.python.org/issue33001 10 msgs #33016: nt._getfinalpathname may use uninitialized memory https://bugs.python.org/issue33016 9 msgs #32978: Issues with reading large float values in AIFC files https://bugs.python.org/issue32978 6 msgs #32989: IDLE: Fix pyparse.find_good_parse_start and its bad editor cal https://bugs.python.org/issue32989 6 msgs #33030: GetLastError() may be overwritten by Py_END_ALLOW_THREADS https://bugs.python.org/issue33030 6 msgs #29708: support reproducible Python builds https://bugs.python.org/issue29708 5 msgs #33000: IDLE Doc: Text consumes unlimited RAM, consoles likely not https://bugs.python.org/issue33000 5 msgs Issues closed (27) ================== #4173: PDF documentation: long verbatim lines are cut off at right ha https://bugs.python.org/issue4173 closed by csabella #22822: IPv6Network constructor docs incorrect about valid input https://bugs.python.org/issue22822 closed by xiang.zhang #23159: argparse: Provide equivalent of optparse.OptionParser.{option_ https://bugs.python.org/issue23159 closed by emcd #25197: Allow documentation switcher to change url to /3/ and /dev/ https://bugs.python.org/issue25197 closed by Mariatta #29417: Sort entries in foo.dist-info/RECORD https://bugs.python.org/issue29417 closed by eric.araujo #30147: Change in re.escape output is not documented in 3.7 whatsnew https://bugs.python.org/issue30147 closed by ned.deily #30353: ctypes: pass by value for structs broken on Cygwin/MinGW 64-bi https://bugs.python.org/issue30353 closed by ned.deily #32854: Add ** Map Unpacking Support for namedtuple https://bugs.python.org/issue32854 closed by serhiy.storchaka #32874: IDLE: Add tests for pyparse https://bugs.python.org/issue32874 closed by terry.reedy #32963: Python 2.7 tutorial claims source code is UTF-8 encoded https://bugs.python.org/issue32963 closed by brett.cannon #32964: 
Reuse a testing implementation of the path protocol in tests https://bugs.python.org/issue32964 closed by serhiy.storchaka #32969: Add more constants to zlib module https://bugs.python.org/issue32969 closed by xiang.zhang #32984: IDLE: set and unset __file__ for startup files https://bugs.python.org/issue32984 closed by terry.reedy #32988: datetime.datetime.strftime('%s') always uses local timezone, e https://bugs.python.org/issue32988 closed by adamwill #32991: AttributeError in doctest.DocTestFinder.find https://bugs.python.org/issue32991 closed by jason.coombs #32992: unittest: Automatically run coroutines in a loop https://bugs.python.org/issue32992 closed by yselivanov #32994: Building the html documentation is broken https://bugs.python.org/issue32994 closed by ned.deily #32997: Catastrophic backtracking in fpformat https://bugs.python.org/issue32997 closed by benjamin.peterson #32998: regular expression regression in python 3.7 https://bugs.python.org/issue32998 closed by serhiy.storchaka #32999: issubclass(obj, abc.ABC) causes a segfault https://bugs.python.org/issue32999 closed by inada.naoki #33004: Shutil module functions could accept Path-like objects https://bugs.python.org/issue33004 closed by rougeth #33005: 3.7.0b2 Interpreter crash in dev mode (or with PYTHONMALLOC=de https://bugs.python.org/issue33005 closed by vstinner #33009: Assertion failure in inspect.signature on unbound partialmetho https://bugs.python.org/issue33009 closed by yselivanov #33011: Embedded 3.6.4 distribution does not add script parent as sys. 
https://bugs.python.org/issue33011 closed by steve.dower

#33013: Underscore in str.format with x option
https://bugs.python.org/issue33013 closed by serhiy.storchaka

#33022: Floating Point Arithmetic Inconsistency (internal off-by-one)
https://bugs.python.org/issue33022 closed by tim.peters

#33035: Some examples in documentation section 4.7.2 are incorrect
https://bugs.python.org/issue33035 closed by ebarry

From cuthbert at mit.edu Sat Mar 10 16:59:07 2018
From: cuthbert at mit.edu (Michael Scott Cuthbert)
Date: Sat, 10 Mar 2018 21:59:07 +0000
Subject: [Python-Dev] Python 2.7 -- bugfix or security before EOL?
Message-ID:

I notice on https://devguide.python.org that Python 3.5 is in "security"
status with an EOL of 2020-09-13 but Python 2.7 is in "bugfix" and has a
likely earlier EOL. Will there be a period where Py2.7 is in security-only
status before hitting EOL? Even if the EOL is set at the last possible date
of 2020-12-31, it still is in the time period before EOL that other recent
versions have gone to security only. (obviously recognizing that Py2.7 EOL
is not just another EOL)

I tried searching in archives for anything related to this status, but
couldn't find anything. Apologies if I missed a discussion.

- Michael Cuthbert
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From tjreedy at udel.edu Sat Mar 10 20:36:34 2018
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat, 10 Mar 2018 20:36:34 -0500
Subject: [Python-Dev] Python 2.7 -- bugfix or security before EOL?
In-Reply-To:
References:
Message-ID:

On 3/10/2018 4:59 PM, Michael Scott Cuthbert wrote:
> I notice on https://devguide.python.org that Python 3.5 is in "security"
> status with an EOL of 2020-09-13 but Python 2.7 is in "bugfix" and has a
> likely earlier EOL.

There is no relation between the two, or between 2.7 and any other
version. 2.7 is a completely special case.

> Will there be a period where Py2.7 is in security-only status before hitting EOL?
https://www.python.org/dev/peps/pep-0373 gives the public status. When Benjamin Peterson wants to add something, he will.

Already, the main emphasis is on security, build, and test infrastructure fixes. Backporting bug and doc fixes is at developer discretion.

> Even if the EOL is set at the last possible date of 2020-12-31,

Benjamin Peterson will decide when he decides. He has not yet announced a date for a 2018 release.

People have mostly proposed either Jan 1 or sometime late spring related to PyCon. If you want something definite for your own planning, I recommend that you assume Jan 1.

> it still is in the time period before
> EOL that other recent versions have gone to security only.

Again, not relevant.

You might want to read http://python3statement.org/.

Some major projects (like Django, I believe) have already put their last 2.x compatible version into bug-fix only mode and expect to stop patching it before 2020.

--
Terry Jan Reedy

From guido at python.org Sat Mar 10 20:54:35 2018
From: guido at python.org (Guido van Rossum)
Date: Sat, 10 Mar 2018 17:54:35 -0800
Subject: [Python-Dev] Python 2.7 -- bugfix or security before EOL?
In-Reply-To:
References:
Message-ID:

Let's not play games with semantics. The way I see the situation for 2.7 is that EOL is January 1st, 2020, and there will be no updates, not even source-only security patches, after that date. Support (from the core devs, the PSF, and python.org) stops completely on that date. If you want support for 2.7 beyond that day you will have to pay a commercial vendor. Of course it's open source so people are also welcome to fork it. But the core devs have toiled long enough, and the 2020 EOL date (an extension from the originally announced 2015 EOL!) was announced with sufficient lead time and fanfare that I don't feel bad about stopping to support it at all.
From anthony.flury at btinternet.com Sat Mar 10 21:58:26 2018
From: anthony.flury at btinternet.com (Anthony Flury)
Date: Sun, 11 Mar 2018 02:58:26 +0000
Subject: [Python-Dev] Git hub : CLA Not Signed label
In-Reply-To:
References:
Message-ID: <6c728f1c-052b-669c-ee89-c8f5bf390796@btinternet.com>

All,

I submitted two Pull Requests last Sunday, only a few hours after I signed the CLA.

I understand why the 'Knights who say ni' marked the Pull request as 'CLA Not Signed' Label at the time I submitted the Pull requests, but I was wondering when the Labels get reset.

How often (if at all) does the bot look at old pull requests?

Thanks for any help you can give, I am sorry if the question sounds basic.

From nad at python.org Sat Mar 10 22:12:46 2018
From: nad at python.org (Ned Deily)
Date: Sat, 10 Mar 2018 22:12:46 -0500
Subject: [Python-Dev] Git hub : CLA Not Signed label
In-Reply-To: <6c728f1c-052b-669c-ee89-c8f5bf390796@btinternet.com>
References: <6c728f1c-052b-669c-ee89-c8f5bf390796@btinternet.com>
Message-ID: <695D40DF-BFBA-4FB6-9814-9F4E5FC18047@python.org>

On Mar 10, 2018, at 21:58, Anthony Flury via Python-Dev wrote:
> I submitted two Pull Requests last Sunday, only a few hours after I signed the CLA.
>
> I understand why the 'Knights who say ni' marked the Pull request as 'CLA Not Signed' Label at the time I submitted the Pull requests, but I was wondering when the Labels get reset.

I've reset both manually. Thanks for the heads up. Now to get reviews!

--
Ned Deily
nad at python.org -- []

From tjreedy at udel.edu Sat Mar 10 22:40:42 2018
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat, 10 Mar 2018 22:40:42 -0500
Subject: [Python-Dev] Python 2.7 -- bugfix or security before EOL?
In-Reply-To:
References:
Message-ID:

On 3/10/2018 8:54 PM, Guido van Rossum wrote:
> Let's not play games with semantics.
The way I see the situation for 2.7 > is that EOL is January 1st, 2020, and there will be no updates, not even > source-only security patches, after that date. Support (from the core > devs, the PSF, and python.org ) stops completely on > that date. +1 from me. If so, then that should be added to the PEP and announced at PyCon, to end major questions* and speculation. * There are still minor details of when patches are cutoff (is that EOL, on Jan 1?) and when the rc and final releases appear (whenever ready after cutoff?) -- Terry Jan Reedy From tjreedy at udel.edu Sat Mar 10 22:45:46 2018 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 10 Mar 2018 22:45:46 -0500 Subject: [Python-Dev] Git hub : CLA Not Signed label In-Reply-To: <6c728f1c-052b-669c-ee89-c8f5bf390796@btinternet.com> References: <6c728f1c-052b-669c-ee89-c8f5bf390796@btinternet.com> Message-ID: On 3/10/2018 9:58 PM, Anthony Flury via Python-Dev wrote: > All, > I submitted two Pull Requests last Sunday, only a few hours after I > signed the CLA. When processed properly, a day to a week, usually, a * will appear after your name on any tracker (bpo) post. > I understand why the 'Knights who say ni' marked the Pull request as > 'CLA Not Signed' Label at the time I submitted the Pull requests, but I > was wondering when the Labels get reset. When a core developer manually removes the label, as Ned did, and the bot checks with the tracker and agrees. > How often (if at all) does the bot look at old pull requests ? Never, that I know of. > Thanks for any help you can give, I am sorry if the question sounds basic. One can also request help on core-mentorship list. A link to the PR helps. 
--
Terry Jan Reedy

From mcepl at cepl.eu Sun Mar 11 03:34:02 2018
From: mcepl at cepl.eu (Matěj Cepl)
Date: Sun, 11 Mar 2018 08:34:02 +0100
Subject: [Python-Dev] Git hub : CLA Not Signed label
References: <6c728f1c-052b-669c-ee89-c8f5bf390796@btinternet.com>
Message-ID:

On 2018-03-11, 03:45 GMT, Terry Reedy wrote:
> When processed properly, a day to a week, usually, a * will
> appear after your name on any tracker (bpo) post.

So, I got my star after the name, where is the list of maintainers of individual packages, so I could know whom to bother to take a look at my PR https://github.com/python/cpython/pull/5999 (damn, just one number too small!).

Best,

Matěj
--
https://matej.ceplovi.cz/blog/, Jabber: mcepl at ceplovi.cz
GPG Finger: 3C76 A027 CA45 AD70 98B5 BC1D 7920 5802 880B C9D8

There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies, and the other way is to make it so complicated that there are no obvious deficiencies.
-- C. A. R. Hoare

From tjreedy at udel.edu Sun Mar 11 08:35:09 2018
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 11 Mar 2018 08:35:09 -0400
Subject: [Python-Dev] Git hub : CLA Not Signed label
In-Reply-To:
References: <6c728f1c-052b-669c-ee89-c8f5bf390796@btinternet.com>
Message-ID:

On 3/11/2018 3:34 AM, Matěj Cepl wrote:
> On 2018-03-11, 03:45 GMT, Terry Reedy wrote:
>> When processed properly, a day to a week, usually, a * will
>> appear after your name on any tracker (bpo) post.
>
> So, I got my star after the name, where is the list of
> maintainers of individual packages, so I could know whom to
> bother to take a look at my PR
> https://github.com/python/cpython/pull/5999 (damn, just one
> number too small!).

https://devguide.python.org/experts/

I added Langa for configparser.
I have no idea what is going on with the numerous errors, like ====================================================================== ERROR: test_write_filename (test.test_configparser.SortedTestCase) ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\projects\cpython\lib\test\test_configparser.py", line 717, in test_write_filename os.unlink(f.name) PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\\Users\\appveyor\\AppData\\Local\\Temp\\1\\tmpklkbe8_y' -- Terry Jan Reedy From benjamin at python.org Mon Mar 12 02:24:41 2018 From: benjamin at python.org (Benjamin Peterson) Date: Sun, 11 Mar 2018 23:24:41 -0700 Subject: [Python-Dev] Python 2.7 -- bugfix or security before EOL? In-Reply-To: References: Message-ID: <1520835881.257567.1299640120.2039696A@webmail.messagingengine.com> Sounds good to me. I've updated the PEP to say 2.7 is completely dead on Jan 1 2020. The final release may not literally be on January 1st, but we certainly don't want to support 2.7 through all of 2020. On Sat, Mar 10, 2018, at 18:54, Guido van Rossum wrote: > Let's not play games with semantics. The way I see the situation for 2.7 is > that EOL is January 1st, 2020, and there will be no updates, not even > source-only security patches, after that date. Support (from the core devs, > the PSF, and python.org) stops completely on that date. If you want support > for 2.7 beyond that day you will have to pay a commercial vendor. Of course > it's open source so people are also welcome to fork it. But the core devs > have toiled long enough, and the 2020 EOL date (an extension from the > originally annouced 2015 EOL!) was announced with sufficient lead time and > fanfare that I don't feel bad about stopping to support it at all. 
From ncoghlan at gmail.com Mon Mar 12 09:13:19 2018
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 12 Mar 2018 23:13:19 +1000
Subject: [Python-Dev] Python 2.7 -- bugfix or security before EOL?
In-Reply-To:
References:
Message-ID:

On 11 March 2018 at 11:54, Guido van Rossum wrote:
> Let's not play games with semantics. The way I see the situation for 2.7
> is that EOL is January 1st, 2020, and there will be no updates, not even
> source-only security patches, after that date. Support (from the core devs,
> the PSF, and python.org) stops completely on that date. If you want
> support for 2.7 beyond that day you will have to pay a commercial vendor.
> Of course it's open source so people are also welcome to fork it. But the
> core devs have toiled long enough, and the 2020 EOL date (an extension from
> the originally announced 2015 EOL!) was announced with sufficient lead time
> and fanfare that I don't feel bad about stopping to support it at all.
>

+1 from me, as even if commercial redistributors do decide they want to collaborate on a post-2020 Python 2.7 maintenance branch, there's no technical reason that that needs to live under the "python" GitHub organisation, and some solid logistical reasons for it to live somewhere more explicitly vendor managed.
For example, a 2.7 vendor branch would need its own issue tracker that's independent of bugs.python.org, since the ability to report bugs against 2.7 will be removed from bpo (and all remaining 2.7-only bugs will be closed).

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From mcepl at cepl.eu Mon Mar 12 10:19:52 2018
From: mcepl at cepl.eu (Matěj Cepl)
Date: Mon, 12 Mar 2018 15:19:52 +0100
Subject: [Python-Dev] Python 2.7 -- bugfix or security before EOL?
References:
Message-ID:

On 2018-03-12, 13:13 GMT, Nick Coghlan wrote:
> +1 from me, as even if commercial redistributors do decide
> they want to collaborate on a post-2020 Python 2.7 maintenance
> branch, there's no technical reason that that needs to live
> under the "python" GitHub organisation, and some solid
> logistical reasons for it to live somewhere more explicitly
> vendor managed.

It would be good to have some email list of the commercial redistributors (Linux distro maintainers + people from Anaconda etc.). Could python.org host it?

Best,

Matěj
--
https://matej.ceplovi.cz/blog/, Jabber: mcepl at ceplovi.cz
GPG Finger: 3C76 A027 CA45 AD70 98B5 BC1D 7920 5802 880B C9D8

..every Man has a Property in his own Person. This no Body has any Right to but himself. The Labour of his Body, and the Work of his Hands, we may say, are properly his. .... The great and chief end therefore, of Mens uniting into Commonwealths, and putting themselves under Government, is the Preservation of their Property.
-- John Locke, "A Treatise Concerning Civil Government"

From raymond.hettinger at gmail.com Mon Mar 12 12:49:27 2018
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Mon, 12 Mar 2018 09:49:27 -0700
Subject: [Python-Dev] Symmetry arguments for API expansion
Message-ID: <52406AEA-FB0B-4056-8966-4FF85548241F@gmail.com>

There is a feature request and patch to propagate the float.is_integer() API through the rest of the numeric types ( https://bugs.python.org/issue26680 ).

While I don't think it is a good idea, the OP has been persistent and wants his patch to go forward.

It may be worthwhile to discuss on this list to help resolve this particular request and to address the more general, recurring design questions. Once a feature with a marginally valid use case is added to an API, it is common for us to get downstream requests to propagate that API to other places where it makes less sense but does restore a sense of symmetry or consistency. In cases where an abstract base class is involved, acceptance of the request is usually automatic (i.e. range() and tuple() objects growing index() and count() methods). However, when our hand hasn't been forced, there is still an opportunity to decline. That said, proponents of symmetry requests tend to feel strongly about it and tend to never fully accept such a request being declined (it leaves them with a sense that Python is disordered and unbalanced).

Raymond

---- My thoughts on the feature request -----

What is the proposal?
* Add an is_integer() method to int(), Decimal(), Fraction(), and Real(). Modify Rational() to provide a default implementation.

Starting point: Do we need this?
* We already have a simple, traditional, portable, and readable way to make the test: int(x) == x
* In the context of ints, the test x.is_integer() always returns True. This isn't very useful.
* Aside from the OP, this behavior has never been requested in Python's 27 year history.

Does it cost us anything?
* Yes, adding a method to the numeric tower makes it a requirement for every class that ever has or ever will register or inherit from the tower ABCs.
* Adding methods to a core object such as int() increases the cognitive load for everyday users who look at dir(), call help(), or read the main docs.
* It conflicts with a design goal for the decimal module to not invent new functionality beyond the spec unless essential for integration with the rest of the language. The reasons included portability with other implementations and not trying to guess what the committee would have decided in the face of tricky questions such as whether Decimal('1.000001').is_integer() should return True when the context precision is only three decimal places (i.e. whether context precision and rounding traps should be applied before the test and whether context flags should change after the test).

Shouldn't everything in a concrete class also be in an ABC and all its subclasses?
* In general, the answer is no. The ABCs are intended to span only basic functionality. For example, GvR intentionally omitted update() from the Set() ABC because the need was fulfilled by __ior__().

But int() already has real, imag, numerator, and denominator, why is this different?
* Those attributes are central to the functioning of the numeric tower.
* In contrast, the is_integer() method is a peripheral and incidental concept.

What does "API Parsimony" mean?
* Avoidance of feature creep.
* Preference for only one obvious way to do things.
* Practicality (not craving things you don't really need) beats purity (symmetry and foolish consistency).
* YAGNI suggests holding off in the absence of clear need.
* Recognition that smaller APIs are generally better for users.

Are there problems with symmetry/consistency arguments?
* The need for guard rails on an overpass doesn't imply the same need on an underpass even though both are in the category of grade changing byways.
* "In for a penny, in for a pound" isn't a principle of good design; rather, it is a slippery slope whereby the acceptance of a questionable feature in one place seems to compel later decisions to propagate the feature to other places where the cost / benefit trade-offs are less favorable.

Should float.is_integer() have ever been added in the first place?
* Likely, it should have been a math module function like isclose() and isinf() so that it would not have been type specific.
* However, that ship has sailed; instead, the question is whether we now have to double down and have to dispatch other ships as well.
* There is some question as to whether it is even a good idea to be testing the results of floating point calculations for exact values. It may be useful for testing inputs, but is likely a trap for people using it in other contexts.

Have we ever had problems with just accepting requests solely based on symmetry?
* Yes. The str.startswith() and str.endswith() methods were given optional start/end arguments to be consistent with str.index(), not because there were known use cases where code was made better with the new feature. This ended up conflicting with a later feature request that did have valid use cases (supporting multiple test prefixes/suffixes). As a result, we ended up with an awkward and error-prone API that requires double parentheses for the valid use case: url.endswith(('.html', '.css')).

From greg at krypto.org Mon Mar 12 13:33:00 2018
From: greg at krypto.org (Gregory P.
Smith) Date: Mon, 12 Mar 2018 17:33:00 +0000 Subject: [Python-Dev] Symmetry arguments for API expansion In-Reply-To: <52406AEA-FB0B-4056-8966-4FF85548241F@gmail.com> References: <52406AEA-FB0B-4056-8966-4FF85548241F@gmail.com> Message-ID: On Mon, Mar 12, 2018 at 9:51 AM Raymond Hettinger < raymond.hettinger at gmail.com> wrote: > There is a feature request and patch to propagate the float.is_integer() > API through rest of the numeric types ( https://bugs.python.org/issue26680 > ). > > While I don't think it is a good idea, the OP has been persistent and > wants his patch to go forward. > > It may be worthwhile to discuss on this list to help resolve this > particular request and to address the more general, recurring design > questions. Once a feature with a marginally valid use case is added to an > API, it is common for us to get downstream requests to propagate that API > to other places where it makes less sense but does restore a sense of > symmetry or consistency. In cases where an abstract base class is > involved, acceptance of the request is usually automatic (i.e. range() and > tuple() objects growing index() and count() methods). However, when our > hand hasn't been forced, there is still an opportunity to decline. That > said, proponents of symmetry requests tend to feel strongly about it and > tend to never fully accept such a request being declined (it leaves them > with a sense that Python is disordered and unbalanced). > > > Raymond > > > ---- My thoughts on the feature request ----- > > What is the proposal? > * Add an is_integer() method to int(), Decimal(), Fraction(), and Real(). > Modify Rational() to provide a default implementation. > > Starting point: Do we need this? 
> * We already have a simple, traditional, portable, and readable way to
> make the test: int(x) == x
>

Mark Dickinson left a comment on the bug pointing out that such a test is not great as it can lead to an excessive amount of computation to create the int from some numeric types such as Decimal when all that is desired is something the type itself may be able to answer without that.

> * In the context of ints, the test x.is_integer() always returns True.
> This isn't very useful.
> * Aside from the OP, this behavior has never been requested in Python's 27
> year history.
>
> Does it cost us anything?
> * Yes, adding a method to the numeric tower makes it a requirement for
> every class that ever has or ever will register or inherit from the tower
> ABCs.
> * Adding methods to a core object such as int() increases the cognitive
> load for everyday users who look at dir(), call help(), or read the main
> docs.
> * It conflicts with a design goal for the decimal module to not invent new
> functionality beyond the spec unless essential for integration with the
> rest of the language. The reasons included portability with other
> implementations and not trying to guess what the committee would have
> decided in the face of tricky questions such as whether
> Decimal('1.000001').is_integer()
> should return True when the context precision is only three decimal places
> (i.e. whether context precision and rounding traps should be applied before
> the test and whether context flags should change after the test).
>
> Shouldn't everything in a concrete class also be in an ABC and all its
> subclasses?
> * In general, the answer is no. The ABCs are intended to span only basic
> functionality. For example, GvR intentionally omitted update() from the
> Set() ABC because the need was fulfilled by __ior__().
>
> But int() already has real, imag, numerator, and denominator, why is this
> different?
> * Those attributes are central to the functioning of the numeric tower.
> * In contrast, the is_integer() method is a peripheral and incidental
> concept.
>
> What does "API Parsimony" mean?
> * Avoidance of feature creep.
> * Preference for only one obvious way to do things.
> * Practicality (not craving things you don't really need) beats purity
> (symmetry and foolish consistency).
> * YAGNI suggests holding off in the absence of clear need.
> * Recognition that smaller APIs are generally better for users.
>
> Are there problems with symmetry/consistency arguments?
> * The need for guard rails on an overpass doesn't imply the same need on an
> underpass even though both are in the category of grade changing byways.
> * "In for a penny, in for a pound" isn't a principle of good design;
> rather, it is a slippery slope whereby the acceptance of a questionable
> feature in one place seems to compel later decisions to propagate the
> feature to other places where the cost / benefit trade-offs are less
> favorable.
>
> Should float.is_integer() have ever been added in the first place?
> * Likely, it should have been a math module function like isclose() and
> isinf() so that it would not have been type specific.
> * However, that ship has sailed; instead, the question is whether we now
> have to double down and have to dispatch other ships as well.
> * There is some question as to whether it is even a good idea to be
> testing the results of floating point calculations for exact values. It may
> be useful for testing inputs, but is likely a trap for people using it in
> other contexts.
>

I don't think that ship has sailed. We could still add a math.is_integer or math.is_integral_value function (I'll let others bikeshed the name) and have it understand all stdlib numeric types. For non-stdlib types it could fall back to looking for an .is_integer() method on the type as a protocol before just raising a TypeError.

-gps

> Have we ever had problems with just accepting requests solely based on
> symmetry?
> * Yes.
The str.startswith() and str.endswith() methods were given
> optional start/end arguments to be consistent with str.index(), not because
> there were known use cases where code was made better with the new
> feature. This ended up conflicting with a later feature request that did
> have valid use cases (supporting multiple test prefixes/suffixes). As a
> result, we ended up with an awkward and error-prone API that requires
> double parentheses for the valid use case: url.endswith(('.html', '.css')).

From storchaka at gmail.com Mon Mar 12 13:59:07 2018
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Mon, 12 Mar 2018 19:59:07 +0200
Subject: [Python-Dev] Symmetry arguments for API expansion
In-Reply-To: <52406AEA-FB0B-4056-8966-4FF85548241F@gmail.com>
References: <52406AEA-FB0B-4056-8966-4FF85548241F@gmail.com>
Message-ID:

12.03.18 18:49, Raymond Hettinger wrote:
> There is a feature request and patch to propagate the float.is_integer() API through the rest of the numeric types ( https://bugs.python.org/issue26680 ).
>
> While I don't think it is a good idea, the OP has been persistent and wants his patch to go forward.
>
> It may be worthwhile to discuss on this list to help resolve this particular request and to address the more general, recurring design questions. Once a feature with a marginally valid use case is added to an API, it is common for us to get downstream requests to propagate that API to other places where it makes less sense but does restore a sense of symmetry or consistency. In cases where an abstract base class is involved, acceptance of the request is usually automatic (i.e.
range() and tuple() objects growing index() and count() methods). However, when our hand hasn't been forced, there is still an opportunity to decline. That said, proponents of symmetry requests tend to feel strongly about it and tend to never fully accept such a request being declined (it leaves them with a sense
> that Python is disordered and unbalanced).
>
>
> Raymond
>
>
> ---- My thoughts on the feature request -----

I concur with Raymond on all points, both about this concrete feature and about the design in general.

From p.f.moore at gmail.com Mon Mar 12 14:22:15 2018
From: p.f.moore at gmail.com (Paul Moore)
Date: Mon, 12 Mar 2018 18:22:15 +0000
Subject: [Python-Dev] Symmetry arguments for API expansion
In-Reply-To:
References: <52406AEA-FB0B-4056-8966-4FF85548241F@gmail.com>
Message-ID:

On 12 March 2018 at 17:59, Serhiy Storchaka wrote:
> 12.03.18 18:49, Raymond Hettinger wrote:
>>
>> There is a feature request and patch to propagate the float.is_integer()
>> API through the rest of the numeric types ( https://bugs.python.org/issue26680
>> ).
>>
>> While I don't think it is a good idea, the OP has been persistent and
>> wants his patch to go forward.
>>
>> It may be worthwhile to discuss on this list to help resolve this
>> particular request and to address the more general, recurring design
>> questions. Once a feature with a marginally valid use case is added to an
>> API, it is common for us to get downstream requests to propagate that API to
>> other places where it makes less sense but does restore a sense of symmetry
>> or consistency. In cases where an abstract base class is involved,
>> acceptance of the request is usually automatic (i.e. range() and tuple()
>> objects growing index() and count() methods). However, when our hand hasn't
>> been forced, there is still an opportunity to decline. That said,
That said, >> proponents of symmetry requests tend to feel strongly about it and tend to >> never fully accept such a request being declined (it leaves them with a >> sense >> that Python is disordered and unbalanced). >> >> >> Raymond >> >> >> ---- My thoughts on the feature request ----- > > > I concur with Raymond at all points about this concrete feature and about > the general design in general. So do I. I do think that there is an element of considered judgement in all of these types of request, and it's right that each such request be considered on its merits. But I do not think that "symmetry with other cases" is a merit in itself, it needs to be subjected to precisely the same sort of scrutiny as any other argument. In this case, Raymond's arguments seem persuasive to me (in particular, the uselessness of int.is_integer() and the complexities in deciding correct behaviour for Decimal.is_integer() in the absence of an answer in the standard). I would ask why a standalone is_integer() function, targeted at somewhere in the stdlib like the math module[1] isn't acceptable, and I'd be wanting to see use cases for the functionality - in particular use cases for a generic "can be used for an unknown type" solution, as opposed to type-specific solutions like float.is_integer or (Fraction.denominator == 1). I imagine these questions have already been thrashed out on the tracker, though. It's certainly true that people get particularly wedded to symmetry/consistency arguments. Sometimes such arguments have value (discoverability, teachability, ease of writing type-agnostic code) but we should keep that value in perspective. 
Paul [1] with possibly a period for development as a 3rd party library, although I can see that "being built in" may be a key benefit of a proposal like this From solipsis at pitrou.net Mon Mar 12 14:10:27 2018 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 12 Mar 2018 19:10:27 +0100 Subject: [Python-Dev] Symmetry arguments for API expansion References: <52406AEA-FB0B-4056-8966-4FF85548241F@gmail.com> Message-ID: <20180312191027.7654fee9@fsol> On Mon, 12 Mar 2018 09:49:27 -0700 Raymond Hettinger wrote: > > Starting point: Do we need this? > * We already have a simple, traditional, portable, and readable way to make the test: int(x) == x It doesn't look that obvious to me. As a reviewer I would request to add a comment explaining the test. > * Aside from the OP, this behavior has never been requested in Python's 27 year history. That's possible. One thing I often see is suboptimal compatibility with third-party integer types such as Numpy ints, but that's a slightly different request (as it usually doesn't imply accepting Numpy floats that exactly represent integrals). > Does it cost us anything? > * Yes, adding a method to the numeric tower makes it a requirement for every class that ever has or ever will register or inherit from the tower ABCs. Well, the big question is whether the notion of numeric tower is useful in Python at all. If it's useful then there's a point to expand its usability with such a feature. Personally I don't care much :-) > As a result, we ended-up with an awkward and error-prone API that requires double parenthesis for the valid use case: url.endswith(('.html', '.css')). It doesn't look that awkward to me. Regards Antoine. 
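Editorial sketch: the standalone-function idea raised above could look roughly like the following. This is only one possible shape, built on functools.singledispatch. The helper name is_integer and the per-type rules are assumptions made for illustration, not an existing stdlib API; in particular, the Decimal rule here (finite and exactly integral, with no use of the context) is an assumption.

```python
from decimal import Decimal
from fractions import Fraction
from functools import singledispatch


@singledispatch
def is_integer(x):
    """Hypothetical standalone helper; not an existing stdlib function."""
    raise TypeError(f"is_integer() does not support {type(x).__name__}")


@is_integer.register(int)
def _(x):
    # Every int (and bool, as an int subclass) is integral.
    return True


@is_integer.register(float)
def _(x):
    # float.is_integer() already returns False for inf and nan.
    return x.is_integer()


@is_integer.register(Fraction)
def _(x):
    # Fractions are stored in lowest terms, so this check is exact.
    return x.denominator == 1


@is_integer.register(Decimal)
def _(x):
    # Assumed rule: finite and exactly integral; avoids int(x) entirely.
    return x.is_finite() and x == x.to_integral_value()
```

A third-party numeric type could opt in with is_integer.register(MyType) without the helper having to know about that type in advance.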
From dickinsm at gmail.com Mon Mar 12 15:13:36 2018
From: dickinsm at gmail.com (Mark Dickinson)
Date: Mon, 12 Mar 2018 19:13:36 +0000
Subject: [Python-Dev] Symmetry arguments for API expansion
In-Reply-To: <52406AEA-FB0B-4056-8966-4FF85548241F@gmail.com>
References: <52406AEA-FB0B-4056-8966-4FF85548241F@gmail.com>
Message-ID:

On Mon, Mar 12, 2018 at 4:49 PM, Raymond Hettinger <raymond.hettinger at gmail.com> wrote:

> What is the proposal?
> * Add an is_integer() method to int(), Decimal(), Fraction(), and Real().
> Modify Rational() to provide a default implementation.
>

From the issue discussion, it sounds to me as though the OP would be content with adding is_integer to int and Fraction (leaving the decimal module and the numeric tower alone).

> Starting point: Do we need this?
> * We already have a simple, traditional, portable, and readable way to
> make the test: int(x) == x
>

As already pointed out in the issue discussion, this solution isn't particularly portable (it'll fail for infinities and nans), and can be horribly inefficient in the case of a Decimal input with large exponent:

In [1]: import decimal
In [2]: x = decimal.Decimal('1e99999')
In [3]: %timeit x == int(x)
1.42 s ± 6.27 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [4]: %timeit x == x.to_integral_value()
230 ns ± 2.03 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

> * In the context of ints, the test x.is_integer() always returns True.
> This isn't very useful.
>

It's useful in the context of duck typing, which I believe is a large part of the OP's point. For a value x that's known to be *either* float or int (which is not an uncommon situation), it makes x.is_integer() valid without needing to know the specific type of x.

> * It conflicts with a design goal for the decimal module to not invent new
> functionality beyond the spec unless essential for integration with the
> rest of the language.
The reasons included portability with other
> implementations and not trying to guess what the committee would have
> decided in the face of tricky questions such as whether
> Decimal('1.000001').is_integer()
> should return True when the context precision is only three decimal places
> (i.e. whether context precision and rounding traps should be applied before
> the test and whether context flags should change after the test).
>

I don't believe there's any ambiguity here. The correct behaviour looks clear: the context isn't used, no flags are touched, and the method returns True if and only if the value is finite and an exact integer. This is analogous to the existing is-sNaN, is-signed, is-finite, is-zero, is-infinite tests, none of which are affected by (or affect) context.

--
Mark

From guido at python.org Mon Mar 12 15:15:47 2018
From: guido at python.org (Guido van Rossum)
Date: Mon, 12 Mar 2018 12:15:47 -0700
Subject: [Python-Dev] Symmetry arguments for API expansion
In-Reply-To: <20180312191027.7654fee9@fsol>
References: <52406AEA-FB0B-4056-8966-4FF85548241F@gmail.com> <20180312191027.7654fee9@fsol>
Message-ID:

There's a reason why adding this to int feels right to me. In mypy we treat int as a sub*type* of float, even though technically it isn't a sub*class*. The absence of an is_integer() method on int means that this code has a bug that mypy doesn't catch:

def f(x: float):
    if x.is_integer():
        "do something"
    else:
        "do something else"

f(12)

This passes the type check (because 12 is considered an acceptable substitute for float) but currently fails at runtime (because x is an int and does not have that method).

You may think that mypy is obviously wrong here, but in fact we (the Python community) have gone through considerable hoops to make other cases like this work at runtime (e.g.
adding .imag and .real to int), and disallowing ints where a float is expected in mypy would cause unacceptable noise about many valid programs (the difference in runtime behavior between int and float was much more pronounced in Python 2, where integer division truncated, and we intentionally changed that for the same reason). So I think the OP of the bug has a valid point, 27 years without this feature notwithstanding. And while mypy does not endorse or use the numeric tower, given the strong argument for adding the method to int, it makes sense to add it to all types in the numeric tower as well. I have no strong opinion about what to do for Decimal, which in general doesn't like to play nice with other ABCs (in general I think Decimal is doing itself a disfavor by favoring the language-independent Decimal standard over Python conventions, but that's a discussion for another time). On Mon, Mar 12, 2018 at 11:10 AM, Antoine Pitrou wrote: > On Mon, 12 Mar 2018 09:49:27 -0700 > Raymond Hettinger wrote: > > > > Starting point: Do we need this? > > * We already have a simple, traditional, portable, and readable way to > make the test: int(x) == x > > It doesn't look that obvious to me. As a reviewer I would request to > add a comment explaining the test. > > > * Aside from the OP, this behavior has never been requested in Python's > 27 year history. > > That's possible. One thing I often see is suboptimal compatibility > with third-party integer types such as Numpy ints, but that's a > slightly different request (as it usually doesn't imply accepting > Numpy floats that exactly represent integrals). > > > Does it cost us anything? > > * Yes, adding a method to the numeric tower makes it a requirement for > every class that ever has or ever will register or inherit from the tower > ABCs. > > Well, the big question is whether the notion of numeric tower is useful > in Python at all. If it's useful then there's a point to expand its > usability with such a feature. 
Personally I don't care much :-)
>
> > As a result, we ended-up with an awkward and error-prone API that
> requires double parenthesis for the valid use case: url.endswith(('.html',
> '.css')).
>
> It doesn't look that awkward to me.
>
> Regards
>
> Antoine.
>

--
--Guido van Rossum (python.org/~guido)

From storchaka at gmail.com Mon Mar 12 15:53:51 2018
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Mon, 12 Mar 2018 21:53:51 +0200
Subject: [Python-Dev] Symmetry arguments for API expansion
In-Reply-To:
References: <52406AEA-FB0B-4056-8966-4FF85548241F@gmail.com> <20180312191027.7654fee9@fsol>
Message-ID:

12.03.18 21:15, Guido van Rossum wrote:
> There's a reason why adding this to int feels right to me. In mypy we
> treat int as a sub*type* of float, even though technically it isn't a
> sub*class*. The absence of an is_integer() method on int means that
> this code has a bug that mypy doesn't catch:
>
> def f(x: float):
>     if x.is_integer():
>         "do something"
>     else:
>         "do something else"

What is the real use case of float.is_integer()? I searched on GitHub and found only misuses of it like (x/5).is_integer() (x % 5 == 0 would be more correct and clear) or (x**0.5).is_integer() (returns wrong result for large ints and some floats) in short examples. Some of these snippets look like book examples, and they propagate bad practices (like "if a.is_integer() == True:").
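Editorial sketch: the two misuses described above can be reproduced with large ints, where the float intermediate is rounded before is_integer() ever sees it. The helper names below are hypothetical, and math.isqrt assumes Python 3.8 or later (it did not exist when this thread was written). math.sqrt is used in place of x**0.5 because its rounding is well defined; the hazard is the same.

```python
import math


def divisible_by_5(n: int) -> bool:
    """Exact integer test; no float rounding is involved."""
    return n % 5 == 0


def is_perfect_square(n: int) -> bool:
    """Exact test using math.isqrt (assumes Python 3.8 or later)."""
    if n < 0:
        return False
    root = math.isqrt(n)
    return root * root == n


n = 10**17 + 1   # leaves remainder 1 when divided by 5
m = 10**20 + 1   # one more than the perfect square (10**10)**2

# The float quotient and the float square root are both rounded to an
# exactly integral float, so is_integer() reports True for each.
float_div_says = (n / 5).is_integer()
float_sqrt_says = math.sqrt(m).is_integer()

print(float_div_says, divisible_by_5(n))
print(float_sqrt_says, is_perfect_square(m))
```

Staying in integer arithmetic avoids the problem because Python ints are arbitrary precision, so no information is lost before the comparison.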
From raymond.hettinger at gmail.com Mon Mar 12 16:03:42 2018 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Mon, 12 Mar 2018 13:03:42 -0700 Subject: [Python-Dev] Symmetry arguments for API expansion In-Reply-To: References: <52406AEA-FB0B-4056-8966-4FF85548241F@gmail.com> <20180312191027.7654fee9@fsol> Message-ID: <46E16295-E5B0-4808-B3AC-34241DE50926@gmail.com> > On Mar 12, 2018, at 12:15 PM, Guido van Rossum wrote: > > There's a reason why adding this to int feels right to me. In mypy we treat int as a sub*type* of float, even though technically it isn't a sub*class*. The absence of an is_integer() method on int means that this code has a bug that mypy doesn't catch: > > def f(x: float): > if x.is_integer(): > "do something" > else: > "do something else" > > f(12) Do you have any thoughts about the other non-corresponding float methods? >>> set(dir(float)) - set(dir(int)) {'as_integer_ratio', 'hex', '__getformat__', 'is_integer', '__setformat__', 'fromhex'} In general, would you prefer that functionality like is_integer() be a math module function or that is should be a method on all numeric types except Complex? I expect questions like this to recur over time. Also, do you have any thoughts on the feature itself? Serhiy ran a Github search and found that it was baiting people into worrisome code like: (x/5).is_integer() or (x**0.5).is_integer() > So I think the OP of the bug has a valid point, 27 years without this feature notwithstanding. 
Okay, I'll ask the OP to update his patch :-) Raymond From guido at python.org Mon Mar 12 16:41:59 2018 From: guido at python.org (Guido van Rossum) Date: Mon, 12 Mar 2018 13:41:59 -0700 Subject: [Python-Dev] Symmetry arguments for API expansion In-Reply-To: <46E16295-E5B0-4808-B3AC-34241DE50926@gmail.com> References: <52406AEA-FB0B-4056-8966-4FF85548241F@gmail.com> <20180312191027.7654fee9@fsol> <46E16295-E5B0-4808-B3AC-34241DE50926@gmail.com> Message-ID: On Mon, Mar 12, 2018 at 1:03 PM, Raymond Hettinger < raymond.hettinger at gmail.com> wrote: > > > On Mar 12, 2018, at 12:15 PM, Guido van Rossum wrote: > > > > There's a reason why adding this to int feels right to me. In mypy we > treat int as a sub*type* of float, even though technically it isn't a > sub*class*. The absence of an is_integer() method on int means that this > code has a bug that mypy doesn't catch: > > > > def f(x: float): > > if x.is_integer(): > > "do something" > > else: > > "do something else" > > > > f(12) > > Do you have any thoughts about the other non-corresponding float methods? > Not really, but I'll try below. > >>> set(dir(float)) - set(dir(int)) > {'as_integer_ratio', 'hex', '__getformat__', 'is_integer', > '__setformat__', 'fromhex'} > IIUC fromhex is a class method so the story isn't the same there -- typical use is float.fromhex(). as_integer_ratio() seems mostly cute (it has Tim Peters all over it), OTOH it looks like Decimal has it, so I think this ship has sailed too and maybe it's best to add it to the numeric tower just to be done with it. I found a comment for __getformat__ saying "You probably don't want to use this function. It exists mainly to be used in Python's test suite" so let's skip that. So that leaves hex(). There I think it's preposterous that for ints you have to write hex(i) but for floats you must write x.hex(). 
The idea that the user always knows whether they have an int or a float is outdated (it stems back to the very early Python days when 3.14 + 42 was a type error -- Tim talked me out of that in '91 or '92). If you force me to choose between allowing hex(3.14) or 42.hex() I'll choose the latter -- we also have bytes.hex() and it's an easier change to add a hex() method to int than to extend the hex() function -- we'd have to add a __hex__ protocol first. > In general, would you prefer that functionality like is_integer() be a > math module function or that is should be a method on all numeric types > except Complex? I expect questions like this to recur over time. > That feels like a loaded question -- we have a math module because C has one and back in 1990 I didn't want to spend time thinking about such design issues. > Also, do you have any thoughts on the feature itself? Serhiy ran a Github > search and found that it was baiting people into worrisome code like: > (x/5).is_integer() or (x**0.5).is_integer() > Finding bad example of floating point use is like stealing candy from babies. The feature seems venerable so I think there would have to be a very high bar to deprecate it -- I don't think you want to go there. > > So I think the OP of the bug has a valid point, 27 years without this > feature notwithstanding. > > Okay, I'll ask the OP to update his patch :-) > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Mon Mar 12 17:18:06 2018 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 12 Mar 2018 16:18:06 -0500 Subject: [Python-Dev] Symmetry arguments for API expansion In-Reply-To: References: <52406AEA-FB0B-4056-8966-4FF85548241F@gmail.com> <20180312191027.7654fee9@fsol> <46E16295-E5B0-4808-B3AC-34241DE50926@gmail.com> Message-ID: [Guido] > .... as_integer_ratio() seems mostly cute (it has Tim Peters all > over it), Nope! I had nothing to do with it. 
I would have been -0.5 on adding it had I been aware at the time. - I expect the audience is tiny. - While, ya, _I_ have uses for it, I had a utility function for it approximately forever (it's easily built on top of math.frexp()). - Especially now, fractions.Fraction(some_float) is the same thing except for return type. > OTOH it looks like Decimal has it, Looks like ints got it first, and then spread to Decimal because "why not?" ;-) The first attempt to spread it to Decimal I found was rejected (which would have been my vote too): https://bugs.python.org/issue8947 > so I think this ship has sailed too and maybe it's best to add it to the > numeric tower just to be done with it. Or rip it out of everything. Either way works for me ;-) From mertz at gnosis.cx Mon Mar 12 17:40:23 2018 From: mertz at gnosis.cx (David Mertz) Date: Mon, 12 Mar 2018 21:40:23 +0000 Subject: [Python-Dev] Symmetry arguments for API expansion In-Reply-To: References: <52406AEA-FB0B-4056-8966-4FF85548241F@gmail.com> <20180312191027.7654fee9@fsol> <46E16295-E5B0-4808-B3AC-34241DE50926@gmail.com> Message-ID: If anyone cares, my vote is to rip out both .as_integer_ratio() and .is_integer() from Python. I've never used either and wouldn't want to. Both seem like perfectly good functions for the `math` module, albeit the former is simply the Fraction() constructor. I can see no sane reason why anyone would ever call float.is_integer() actually. That should always be spelled math.isclose(x, int(x)) because IEEE-754. Attractive nuisance is probably too generous, I'd simply call the method a bug. On Mon, Mar 12, 2018, 2:21 PM Tim Peters wrote: > [Guido] > > .... as_integer_ratio() seems mostly cute (it has Tim Peters all > > over it), > > Nope! I had nothing to do with it. I would have been -0.5 on adding > it had I been aware at the time. > > - I expect the audience is tiny. 
> > - While, ya, _I_ have uses for it, I had a utility function for it > approximately forever (it's easily built on top of math.frexp()). > > - Especially now, fractions.Fraction(some_float) is the same thing > except for return type. > > > > OTOH it looks like Decimal has it, > > Looks like ints got it first, and then spread to Decimal because "why > not?" ;-) The first attempt to spread it to Decimal I found was > rejected (which would have been my vote too): > > https://bugs.python.org/issue8947 > > > > so I think this ship has sailed too and maybe it's best to add it to the > > numeric tower just to be done with it. > > Or rip it out of everything. Either way works for me ;-) > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/mertz%40gnosis.cx > -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Mon Mar 12 18:21:30 2018 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 12 Mar 2018 18:21:30 -0400 Subject: [Python-Dev] Symmetry arguments for API expansion In-Reply-To: References: <52406AEA-FB0B-4056-8966-4FF85548241F@gmail.com> <20180312191027.7654fee9@fsol> <46E16295-E5B0-4808-B3AC-34241DE50926@gmail.com> Message-ID: On Mon, Mar 12, 2018 at 5:18 PM, Tim Peters wrote: > [Guido] >> .... as_integer_ratio() seems mostly cute (it has Tim Peters all >> over it), > > Nope! I had nothing to do with it. I would have been -0.5 on adding > it had I been aware at the time. > > - I expect the audience is tiny. The datetime module would benefit from having as_integer_ratio() supported by more types. 
It's been hard to resist requests to allow Decimal in timedelta constructors and/or arithmetics >>> timedelta(Decimal('1.5')) Traceback (most recent call last): File "", line 1, in TypeError: unsupported type for timedelta days component: decimal.Decimal but >>> timedelta(1.5) datetime.timedelta(days=1, seconds=43200) I don't recall why we decided not to accept anything with an .as_integer_ratio() method. See for additional discussion. From tim.peters at gmail.com Mon Mar 12 18:25:30 2018 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 12 Mar 2018 17:25:30 -0500 Subject: [Python-Dev] Symmetry arguments for API expansion In-Reply-To: References: <52406AEA-FB0B-4056-8966-4FF85548241F@gmail.com> <20180312191027.7654fee9@fsol> <46E16295-E5B0-4808-B3AC-34241DE50926@gmail.com> Message-ID: [David Mertz ] > ... > I can see no sane reason why anyone would ever call float.is_integer() > actually. That should always be spelled math.isclose(x, int(x)) because > IEEE-754. Attractive nuisance is probably too generous, I'd simply call the > method a bug. Sometimes it's necessary to know, and especially when _implementing_ 754-conforming functions. For example, what negative infinity raised to a power needs to return depends on whether the power is an integer (specifically on whether it's an odd integer): >>> (-math.inf) ** random.random() inf >>> (-math.inf) ** random.random() inf >>> (-math.inf) ** random.random() inf >>> (-math.inf) ** 3.1 inf >>> (-math.inf) ** 3.0 # NOTE THIS ONE -inf >>> (-math.inf) ** 2.9 inf But, ya, for most people most of the time I agree is_integer() is an attractive nuisance. 
People implementing math functions are famous for cheerfully enduring any amount of pain needed to get the job done ;-) From mertz at gnosis.cx Mon Mar 12 19:14:50 2018 From: mertz at gnosis.cx (David Mertz) Date: Mon, 12 Mar 2018 23:14:50 +0000 Subject: [Python-Dev] Symmetry arguments for API expansion In-Reply-To: References: <52406AEA-FB0B-4056-8966-4FF85548241F@gmail.com> <20180312191027.7654fee9@fsol> <46E16295-E5B0-4808-B3AC-34241DE50926@gmail.com> Message-ID: On Mon, Mar 12, 2018, 3:25 PM Tim Peters wrote: > [David Mertz ] > > ... > > I can see no sane reason why anyone would ever call float.is_integer() > > actually. That should always be spelled math.isclose(x, int(x)) because > > IEEE-754. Attractive nuisance is probably too generous, I'd simply call > the > > method a bug. > > Sometimes it's necessary to know, and especially when _implementing_ > 754-conforming functions. For example, what negative infinity raised > to a power needs to return depends on whether the power is an integer > (specifically on whether it's an odd integer): > > >>> (-math.inf) ** 3.1 > inf > Weird. I take it that's what IEEE-754 says. NaN would sure be more intuitive here since inf+inf-j is not in the domain of Reals. Well, technically neither is inf, but at least it's the limit of the domain. :-). >>> (-math.inf) ** 3.0 # NOTE THIS ONE > -inf > >>> (-math.inf) ** 2.9 > inf > > But, ya, for most people most of the time I agree is_integer() is an > attractive nuisance. People implementing math functions are famous > for cheerfully enduring any amount of pain needed to get the job done > ;-) > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From tim.peters at gmail.com Mon Mar 12 20:06:16 2018 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 12 Mar 2018 19:06:16 -0500 Subject: [Python-Dev] Symmetry arguments for API expansion In-Reply-To: References: <52406AEA-FB0B-4056-8966-4FF85548241F@gmail.com> <20180312191027.7654fee9@fsol> <46E16295-E5B0-4808-B3AC-34241DE50926@gmail.com> Message-ID: [Tim Peters] >> ... >> >>> (-math.inf) ** 3.1 >> inf [David Mertz] > Weird. I take it that's what IEEE-754 says. NaN would sure be more intuitive > here since inf+inf-j is not in the domain of Reals. Well, technically > neither is inf, but at least it's the limit of the domain. :-). Mathematical reals have all sorts of properties floats fail to capture, while mathematical reals don't distinguish between -0 and +0 at all. "Practical' symmetry arguments often underlie what float standards require. At heart , the rules for infinite arguments are often _consequences_ of "more obvious" rules for signed zero arguments, following from replacing +-inf with 1/+-0 in the latter. More explanation here: https://stackoverflow.com/questions/10367011/why-is-pow-infinity-positive-non-integer-infinity But we're not required to _like_ it; we just have to implement it ;-) >> >>> (-math.inf) ** 3.0 # NOTE THIS ONE >> -inf >> >>> (-math.inf) ** 2.9 >> inf From tim.peters at gmail.com Tue Mar 13 01:01:50 2018 From: tim.peters at gmail.com (Tim Peters) Date: Tue, 13 Mar 2018 00:01:50 -0500 Subject: [Python-Dev] Symmetry arguments for API expansion In-Reply-To: References: <52406AEA-FB0B-4056-8966-4FF85548241F@gmail.com> <20180312191027.7654fee9@fsol> <46E16295-E5B0-4808-B3AC-34241DE50926@gmail.com> Message-ID: [Tim. on as_integer_ratio()] >> - I expect the audience is tiny. [Alexander Belopolsky] > The datetime module would benefit from having as_integer_ratio() > supported by more types. It's been hard to resist requests to allow > Decimal in timedelta constructors and/or arithmetics I don't see the connection. 
That timedelta construction may use as_integer_ratio() today doesn't mean it _has_ to use as_integer_ratio() forever, and is no reason (to my mind) to add as_integer_ratio all over the place.

Why not drop that, and in oddball cases see whether fractions.Fraction() can handle the input?

>>> fractions.Fraction(decimal.Decimal("1.76"))
Fraction(44, 25)

Probably less efficient, but I don't care ;-) And then, e.g., timedelta would also automagically allow Fraction arguments (which, BTW, don't support as_integer_ratio() either).

Bonus: if datetime is bothering with hand-coding rational arithmetic now out of concern to get every bit right, Fraction could handle that too by itself.

At heart, the Fraction() constructor is _all about_ creating integer ratios, so is the most natural place to put knowledge of how to do so. A protocol for allowing new numeric types to get converted to Fraction would be more generally useful than just a weird method only datetime uses ;-)

From larry at hastings.org Tue Mar 13 04:35:25 2018
From: larry at hastings.org (Larry Hastings)
Date: Tue, 13 Mar 2018 08:35:25 +0000
Subject: [Python-Dev] Symmetry arguments for API expansion
In-Reply-To:
References: <52406AEA-FB0B-4056-8966-4FF85548241F@gmail.com> <20180312191027.7654fee9@fsol> <46E16295-E5B0-4808-B3AC-34241DE50926@gmail.com>
Message-ID: <043042d0-2557-01f6-79cf-8fb94dede3ed@hastings.org>

On 03/12/2018 08:41 PM, Guido van Rossum wrote:
> If you force me to choose between allowing hex(3.14) or 42.hex() I'll
> choose the latter

I assume you meant (42).hex() here. If you're also interested in changing the language to permit 42.hex(), well, color me shocked :D

(For those who haven't seen this before: it's a well-known gotcha. When Python's grammar sees "42.hex()", it thinks "42." is the start of a floating-point constant. But "42.hex" isn't a valid floating-point constant, so it throws a SyntaxError.)

//arry/
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From storchaka at gmail.com Tue Mar 13 05:02:57 2018
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Tue, 13 Mar 2018 11:02:57 +0200
Subject: [Python-Dev] Symmetry arguments for API expansion
In-Reply-To: <043042d0-2557-01f6-79cf-8fb94dede3ed@hastings.org>
References: <52406AEA-FB0B-4056-8966-4FF85548241F@gmail.com> <20180312191027.7654fee9@fsol> <46E16295-E5B0-4808-B3AC-34241DE50926@gmail.com> <043042d0-2557-01f6-79cf-8fb94dede3ed@hastings.org>
Message-ID:

13.03.18 10:35, Larry Hastings wrote:
> On 03/12/2018 08:41 PM, Guido van Rossum wrote:
>> If you force me to choose between allowing hex(3.14) or 42.hex() I'll
>> choose the latter
>
> I assume you meant (42).hex() here. If you're also interested in
> changing the language to permit 42.hex(), well, color me shocked :D
>
> (For those who haven't seen this before: it's a well-known gotcha. When
> Python's grammar sees "42.hex()", it thinks "42." is the start of a
> floating-point constant. But "42.hex" isn't a valid floating-point
> constant, so it throws a SyntaxError.)

"42." is a valid floating-point constant. But a floating-point constant followed by an identifier is invalid syntax. This is the same error as in "42. hex" and "42 hex".

From dickinsm at gmail.com Tue Mar 13 06:14:42 2018
From: dickinsm at gmail.com (Mark Dickinson)
Date: Tue, 13 Mar 2018 10:14:42 +0000
Subject: [Python-Dev] Symmetry arguments for API expansion
In-Reply-To:
References: <52406AEA-FB0B-4056-8966-4FF85548241F@gmail.com> <20180312191027.7654fee9@fsol> <46E16295-E5B0-4808-B3AC-34241DE50926@gmail.com>
Message-ID:

On Mon, Mar 12, 2018 at 9:18 PM, Tim Peters wrote:

> [Guido]
> > .... as_integer_ratio() seems mostly cute (it has Tim Peters all
> > over it),
>
> Nope! I had nothing to do with it. I would have been -0.5 on adding
> it had I been aware at the time.
>

Looks like it snuck into the float type as part of the fractions.Fraction work in https://bugs.python.org/issue1682 .
I couldn't find much related discussion; I suspect that the move was primarily for optimization (see https://github.com/python/cpython/commit/3ea7b41b5805c60a05e697211d0bfc14a62a19fb). Decimal.as_integer_ratio was added here: https://bugs.python.org/issue25928 . I do have significant uses of `float.as_integer_ratio` in my own code, and wouldn't enjoy seeing it being deprecated/ripped out, though I guess I'd cope. Some on this thread have suggested that things like is_integer and as_integer_ratio should be math module functions. Any suggestions for how that might be made to work? Would we special-case the types we know about, and handle only those (so the math module would end up having to know about the fractions and decimal modules)? Or add a new magic method (e.g., __as_integer_ratio__) for each case we want to handle, like we do for math.__floor__, math.__trunc__ and math.__ceil__? Or use some form of single dispatch, so that custom types can register their own handlers? The majority of current math module functions simply convert their arguments to a float, so a naive implementation of math.is_integer in the same style wouldn't work: it would give incorrect results for a non-integral Decimal instance that ended up getting rounded to an integral value by the float conversion. Mark -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From steve at pearwood.info Tue Mar 13 07:53:07 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 13 Mar 2018 22:53:07 +1100 Subject: [Python-Dev] Symmetry arguments for API expansion In-Reply-To: <52406AEA-FB0B-4056-8966-4FF85548241F@gmail.com> References: <52406AEA-FB0B-4056-8966-4FF85548241F@gmail.com> Message-ID: <20180313115307.GS18868@ando.pearwood.info> On Mon, Mar 12, 2018 at 09:49:27AM -0700, Raymond Hettinger wrote: > * We already have a simple, traditional, portable, and readable way to > make the test: int(x) == x Alas, the simple way is not always the correct way: py> x = float('inf') py> x == int(x) Traceback (most recent call last): File "", line 1, in OverflowError: cannot convert float infinity to integer So to be correct, you need to catch OverflowError, ValueError (in case of NANs), and TypeError (in case of complex numbers). Or guard against them with isinstance() and math.isfinite() tests. But doing so has its own problems: py> x = Decimal('snan') py> math.isfinite(x) Traceback (most recent call last): File "", line 1, in ValueError: cannot convert signaling NaN to float > * In the context of ints, the test x.is_integer() always returns True. > This isn't very useful. It is if you don't know what type x is ahead of time. if x.is_integer(): versus: if isinstance(x, int) or isinstance(x, float) and x.is_integer() > Does it cost us anything? > * Yes, adding a method to the numeric tower makes it a requirement for > every class that ever has or ever will register or inherit from the > tower ABCs. Could the numeric tower offer a default implementation that should work for most numeric types? The default could possibly even be int(self) == self Then you only have to implement your own if you have special cases to consider, like floats, or can optimise the test. Many numbers ought to know if they are integer valued, without bothering to do a full conversion to int. 
For example, Fractions could return self.denominator == 1 as a cheap test for integerness. > * Adding methods to a core object such as int() increases the > cognitive load for everyday users who look at dir(), call help(), or > read the main docs. This is a good point, but not an overwhelming one. > What does "API Parsimony" mean? > * Avoidance of feature creep. > * Preference for only one obvious way to do things. > * Practicality (not craving things you don't really need) beats purity (symmetry and foolish consistency). > * YAGNI suggests holding off in the absence of clear need. > * Recognition that smaller APIs are generally better for users. A very nice list! Thank you for that! But the last one is true only to a point. It is possible to be too small. (The Python 1.5 API is *much* smaller than Python 3.6. I don't think that it was better.) And consider that a *consistent* API is often more important than a *minimalist* API. -- Steve From guido at python.org Tue Mar 13 11:23:38 2018 From: guido at python.org (Guido van Rossum) Date: Tue, 13 Mar 2018 08:23:38 -0700 Subject: [Python-Dev] Symmetry arguments for API expansion In-Reply-To: References: <52406AEA-FB0B-4056-8966-4FF85548241F@gmail.com> <20180312191027.7654fee9@fsol> <46E16295-E5B0-4808-B3AC-34241DE50926@gmail.com> Message-ID: On Mon, Mar 12, 2018 at 10:01 PM, Tim Peters wrote: > At heart, the Fraction() constructor is _all about_ creating integer > ratios, so is the most natural place to put knowledge of how to do so. > A protocol for allowing new numeric types to get converted to Fraction > would be more generally useful than just a weird method only datetime > uses ;-) > Ironically, the various Fraction constructors *calls* as_integer_ratio() for floats and Decimals. From which follows IMO that the float and Decimal classes are the right place to encapsulate the knowledge on how to do it. 
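Editorial sketch: the relationship between as_integer_ratio() and Fraction described above can be checked from the outside. The helper to_fraction is hypothetical and only shows that the two routes agree for float and Decimal; it makes no claim about how fractions.Fraction is implemented internally. Decimal.as_integer_ratio() assumes Python 3.6 or later.

```python
from decimal import Decimal
from fractions import Fraction


def to_fraction(x):
    """Build a Fraction from anything exposing as_integer_ratio().

    Illustrative helper only; it is not part of the fractions module.
    """
    numerator, denominator = x.as_integer_ratio()
    return Fraction(numerator, denominator)


from_float = to_fraction(0.75)                # float.as_integer_ratio()
from_decimal = to_fraction(Decimal("1.76"))   # Decimal.as_integer_ratio()

# Both routes give the same exact ratio as the Fraction constructor.
print(from_float == Fraction(0.75))
print(from_decimal == Fraction(Decimal("1.76")))
```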
-- --Guido van Rossum (python.org/~guido) From tim.peters at gmail.com Tue Mar 13 12:37:00 2018 From: tim.peters at gmail.com (Tim Peters) Date: Tue, 13 Mar 2018 11:37:00 -0500 Subject: [Python-Dev] Symmetry arguments for API expansion In-Reply-To: References: <52406AEA-FB0B-4056-8966-4FF85548241F@gmail.com> <20180312191027.7654fee9@fsol> <46E16295-E5B0-4808-B3AC-34241DE50926@gmail.com> Message-ID: [Tim] >> At heart, the Fraction() constructor is _all about_ creating integer >> ratios, so is the most natural place to put knowledge of how to do so. >> A protocol for allowing new numeric types to get converted to Fraction >> would be more generally useful than just a weird method only datetime >> uses ;-) [Guido] > Ironically, the various Fraction constructors *calls* as_integer_ratio() for > floats and Decimals. From which follows IMO that the float and Decimal > classes are the right place to encapsulate the knowledge on how to do it. It appears that as_integer_ratio was slammed into floats and Decimals precisely _so that_ Fraction() could call them, while Fraction has its own self-contained knowledge of how to convert ints and Fractions and strings and numbers.Rationals to Fraction (and the former types don't support as_integer_ratio). That's fine, but my objection is subtler: the actual answer to "can this thing be converted to an integer ratio?" is not "does it support as_integer_ratio?", but rather "can Fraction() deal with it?" - and there's currently no way for a new numeric type to say "and here's how I can be converted to Fraction". An obvious way to extend it is for Fraction() to look for a special method too, say "_as_integer_ratio()". The leading underscore would reflect the truth: that this wasn't really intended to be a public method on its own, but is an internal protocol for use by the Fraction() constructor.
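The protocol idea in the paragraph above could look roughly like the sketch below. Everything in it is an assumption made for illustration: `to_fraction` is a hypothetical stand-in for a lookup that Fraction() does not perform, `FixedPoint` is an invented numeric type, and `_as_integer_ratio` is only the name floated in this message, not an existing hook.

```python
from fractions import Fraction


def to_fraction(value):
    """Hypothetical helper standing in for the lookup Fraction() would do."""
    # Look the hook up on the type, the way special methods are found.
    hook = getattr(type(value), "_as_integer_ratio", None)
    if hook is not None:
        numerator, denominator = hook(value)
        return Fraction(numerator, denominator)
    # Otherwise fall back to what Fraction() already understands.
    return Fraction(value)


class FixedPoint:
    """Invented example type: an integer count of hundredths."""

    def __init__(self, hundredths):
        self.hundredths = hundredths

    def _as_integer_ratio(self):
        # Return a pair of ints; Fraction reduces it to lowest terms.
        return (self.hundredths, 100)


print(to_fraction(FixedPoint(125)))  # 5/4
print(to_fraction(0.5))              # 1/2
print(to_fraction(3))                # 3
```

Under this scheme a new type opts in by defining one method, and types the constructor already handles (int, float, Decimal, strings) need no change.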
Then it would be obvious that, e.g., it would be just plain stupid ;-) for `int` to bother implementing _as_integer_ratio. The only real point of the method is to play nice with the Fraction constructor. _As is_, it's jarring that int.as_integer_ratio() doesn't exist - for the same reason it's jarring int.hex() doesn't exist. If Mark or I wanted to use float._as_integer_ratio() directly too, that's fine: we're numeric grownups and won't throw a hissy fit if ints don't support it too ;-) From guido at python.org Tue Mar 13 13:43:29 2018 From: guido at python.org (Guido van Rossum) Date: Tue, 13 Mar 2018 10:43:29 -0700 Subject: [Python-Dev] Symmetry arguments for API expansion In-Reply-To: References: <52406AEA-FB0B-4056-8966-4FF85548241F@gmail.com> <20180312191027.7654fee9@fsol> <46E16295-E5B0-4808-B3AC-34241DE50926@gmail.com> Message-ID: So let's make as_integer_ratio() the standard protocol for "how to make a Fraction out of a number that doesn't implement numbers.Rational". We already have two examples of this (float and Decimal) and perhaps numpy or the sometimes proposed fixed-width decimal type can benefit from it too. If this means we should add it to int, that's fine with me. On Tue, Mar 13, 2018 at 9:37 AM, Tim Peters wrote: > [Tim] > >> At heart, the Fraction() constructor is _all about_ creating integer > >> ratios, so is the most natural place to put knowledge of how to do so. > >> A protocol for allowing new numeric types to get converted to Fraction > >> would be more generally useful than just a weird method only datetime > >> uses ;-) > > [Guido] > > Ironically, the various Fraction constructors *calls* as_integer_ratio() > for > > floats and Decimals. From which follows IMO that the float and Decimal > > classes are the right place to encapsulate the knowledge on how to do it. 
> > It appears that as_integer_ratio was slammed into floats and Decimals > precisely _so that_ Fraction() could call them, while Fraction has its > own self-contained knowledge of how to convert ints and Fractions and > strings and numbers.Rationals to Fraction (and the former types don't > support as_integer_ratio). > > That's fine, but my objection is subtler: the actual answer to "can > this thing be converted to an integer ratio?" is not "does it support > as_integer_ratio?", but rather "can Fraction() deal with it?" - and > there's currently no way for a new numeric type to say "and here's how > I can be converted to Fraction". > > An obvious way to extend it is for Fraction() to look for a special > method too, say "_as_integer_ratio()". The leading underscore would > reflect the truth: that this wasn't really intended to be a public > method on its own, but is an internal protocol for use by the > Fraction() constructor. > > Then it would be obvious that, e.g., it would be just plain stupid ;-) > for `int` to bother implementing _as_integer_ratio. The only real > point of the method is to play nice with the Fraction constructor. > _As is_, it's jarring that int.as_integer_ratio() doesn't exist - for > the same reason it's jarring int.hex() doesn't exist. > > If Mark or I wanted to use float._as_integer_ratio() directly too, > that's fine: we're numeric grownups and won't throw a hissy fit if > ints don't support it too ;-) > -- --Guido van Rossum (python.org/~guido)
From tim.peters at gmail.com Tue Mar 13 14:38:53 2018 From: tim.peters at gmail.com (Tim Peters) Date: Tue, 13 Mar 2018 13:38:53 -0500 Subject: [Python-Dev] Symmetry arguments for API expansion In-Reply-To: References: <52406AEA-FB0B-4056-8966-4FF85548241F@gmail.com> <20180312191027.7654fee9@fsol> <46E16295-E5B0-4808-B3AC-34241DE50926@gmail.com> Message-ID: [Guido] > So let's make as_integer_ratio() the standard protocol for "how to make a > Fraction out of a number that doesn't implement numbers.Rational". We > already have two examples of this (float and Decimal) and perhaps numpy or > the sometimes proposed fixed-width decimal type can benefit from it too. Yup, that works. I only would have preferred that you went back in time to add a leading underscore. > If this means we should add it to int, that's fine with me. Given that int.numerator and int.denominator already exist, there's no plausible "good reason" to refuse to return them as twople. Still, I'd wait for someone to complain ;-) From raymond.hettinger at gmail.com Tue Mar 13 14:39:12 2018 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Tue, 13 Mar 2018 11:39:12 -0700 Subject: [Python-Dev] Symmetry arguments for API expansion In-Reply-To: References: <52406AEA-FB0B-4056-8966-4FF85548241F@gmail.com> <20180312191027.7654fee9@fsol> <46E16295-E5B0-4808-B3AC-34241DE50926@gmail.com> Message-ID: > On Mar 13, 2018, at 10:43 AM, Guido van Rossum wrote: > > So let's make as_integer_ratio() the standard protocol for "how to make a Fraction out of a number that doesn't implement numbers.Rational". We already have two examples of this (float and Decimal) and perhaps numpy or the sometimes proposed fixed-width decimal type can benefit from it too. If this means we should add it to int, that's fine with me. I would like that outcome. The signature x.as_integer_ratio() -> (int, int) is pleasant to work with. The output is easy to explain, and the denominator isn't tied to powers of two or ten.
Since Python ints are exact and unbounded, there isn't worry about range or rounding issues. In contrast, math.frexp(float) ->(float, int) is a bit of pain because it still leaves you in the domain of floats rather than letting you decompose to more basic types. It's nice to have a way to move down the chain from ℂ, ℝ, or ℚ to the more basic ℤ (of course, that only works because floats and complex are implemented in a way that precludes exact irrationals). Raymond From guido at python.org Tue Mar 13 15:07:15 2018 From: guido at python.org (Guido van Rossum) Date: Tue, 13 Mar 2018 12:07:15 -0700 Subject: [Python-Dev] Symmetry arguments for API expansion In-Reply-To: References: <52406AEA-FB0B-4056-8966-4FF85548241F@gmail.com> <20180312191027.7654fee9@fsol> <46E16295-E5B0-4808-B3AC-34241DE50926@gmail.com> Message-ID: OK, please make it so. On Tue, Mar 13, 2018 at 11:39 AM, Raymond Hettinger < raymond.hettinger at gmail.com> wrote: > > > > On Mar 13, 2018, at 10:43 AM, Guido van Rossum wrote: > > > > So let's make as_integer_ratio() the standard protocol for "how to make > a Fraction out of a number that doesn't implement numbers.Rational". We > already have two examples of this (float and Decimal) and perhaps numpy or > the sometimes proposed fixed-width decimal type can benefit from it too. If > this means we should add it to int, that's fine with me. > > I would like that outcome. > > The signature x.as_integer_ratio() -> (int, int) is pleasant to work > with. The output is easy to explain, and the denominator isn't tied to > powers of two or ten. Since Python ints are exact and unbounded, there > isn't worry about range or rounding issues. > > In contrast, math.frexp(float) ->(float, int) is a bit of pain because it > still leaves you in the domain of floats rather than letting you decompose > to more basic types. It's nice to have a way to move down the chain > from ℂ, ℝ, or ℚ to the more basic ℤ
(of course, that only works because > floats and complex are implemented in a way that precludes exact > irrationals). > > > Raymond > > > -- --Guido van Rossum (python.org/~guido) From raymond.hettinger at gmail.com Tue Mar 13 17:16:30 2018 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Tue, 13 Mar 2018 14:16:30 -0700 Subject: [Python-Dev] Symmetry arguments for API expansion In-Reply-To: References: <52406AEA-FB0B-4056-8966-4FF85548241F@gmail.com> <20180312191027.7654fee9@fsol> <46E16295-E5B0-4808-B3AC-34241DE50926@gmail.com> Message-ID: <9FAF7FB8-1F92-4F69-A58D-58EA881C09C3@gmail.com> > On Mar 13, 2018, at 12:07 PM, Guido van Rossum wrote: > > OK, please make it so. Will do. I'll create a tracker issue right away. Since this one looks easy (as many things do at first), I would like to assign it to Nofar Schnider (one of my mentees). Raymond > > On Tue, Mar 13, 2018 at 11:39 AM, Raymond Hettinger wrote: > > > > On Mar 13, 2018, at 10:43 AM, Guido van Rossum wrote: > > > > So let's make as_integer_ratio() the standard protocol for "how to make a Fraction out of a number that doesn't implement numbers.Rational". We already have two examples of this (float and Decimal) and perhaps numpy or the sometimes proposed fixed-width decimal type can benefit from it too. If this means we should add it to int, that's fine with me. > > I would like that outcome. > > The signature x.as_integer_ratio() -> (int, int) is pleasant to work with. The output is easy to explain, and the denominator isn't tied to powers of two or ten. Since Python ints are exact and unbounded, there isn't worry about range or rounding issues. > > In contrast, math.frexp(float) ->(float, int) is a bit of pain because it still leaves you in the domain of floats rather than letting you decompose to more basic types. It's nice to have a way to move down the chain from ℂ, ℝ, or ℚ to the more basic ℤ
(of course, that only works because floats and complex are implemented in a way that precludes exact irrationals). > > > Raymond > > > > > > -- > --Guido van Rossum (python.org/~guido) From greg.ewing at canterbury.ac.nz Tue Mar 13 18:23:47 2018 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 14 Mar 2018 11:23:47 +1300 Subject: [Python-Dev] Symmetry arguments for API expansion In-Reply-To: References: <52406AEA-FB0B-4056-8966-4FF85548241F@gmail.com> <20180312191027.7654fee9@fsol> <46E16295-E5B0-4808-B3AC-34241DE50926@gmail.com> Message-ID: <5AA84F73.3030409@canterbury.ac.nz> Tim Peters wrote: > An obvious way to extend it is for Fraction() to look for a special > method too, say "_as_integer_ratio()". Why not __as_integer_ratio__? -- Greg From tim.peters at gmail.com Tue Mar 13 18:29:19 2018 From: tim.peters at gmail.com (Tim Peters) Date: Tue, 13 Mar 2018 17:29:19 -0500 Subject: [Python-Dev] Symmetry arguments for API expansion In-Reply-To: <5AA84F73.3030409@canterbury.ac.nz> References: <52406AEA-FB0B-4056-8966-4FF85548241F@gmail.com> <20180312191027.7654fee9@fsol> <46E16295-E5B0-4808-B3AC-34241DE50926@gmail.com> <5AA84F73.3030409@canterbury.ac.nz> Message-ID: [Tim] >> An obvious way to extend it is for Fraction() to look for a special >> method too, say "_as_integer_ratio()". [Greg Ewing] > Why not __as_integer_ratio__? Because, at this point, that would be beating a dead horse ;-) From nad at python.org Wed Mar 14 00:52:32 2018 From: nad at python.org (Ned Deily) Date: Wed, 14 Mar 2018 00:52:32 -0400 Subject: [Python-Dev] [RELEASE] Python 3.6.5rc1 is now available for testing Message-ID: Announcing the immediate availability of Python 3.6.5 release candidate 1! Python 3.6.5rc1 is the first release candidate for Python 3.6.5, the next maintenance release of Python 3.6.
While 3.6.5rc1 is a preview release and, thus, not intended for production environments, we encourage you to explore it and provide feedback via the Python bug tracker (https://bugs.python.org). 3.6.5 is planned for final release on 2018-03-26 with the next maintenance release expected to follow in about 3 months. You can find Python 3.6.5rc1 and more information here: https://www.python.org/downloads/release/python-365rc1/ Attention macOS users: as of 3.6.5rc1, there is a new additional installer variant for macOS 10.9+ that includes a built-in version of Tcl/Tk 8.6. This variant is expected to become the default variant in future releases. Check it out! -- Ned Deily nad at python.org -- [] From cuthbert at mit.edu Wed Mar 14 09:16:55 2018 From: cuthbert at mit.edu (Michael Scott Cuthbert) Date: Wed, 14 Mar 2018 13:16:55 +0000 Subject: [Python-Dev] Python 2.7 -- bugfix or security before EOL? Message-ID: <5F405E20-BB57-4275-B39F-19205BE07D26@mit.edu> > it still is in the time period before > EOL that other recent versions have gone to security only. Again, not relevant. You might want to read http://python3statement.org/. I'm guessing my first message was unclear or able to be misunderstood in some part -- I'm one of the frequent contributors to python3statement.org and have moved my own Python projects to Py3 only (the main one, music21, gets its 3.4+-only release this Saturday). I have NO desire to prolong the 2.7 pain. What I am referring to is the number of "needs backport to 2.7" tags for non-security-related bug-fixes in the issue tracker. (https://github.com/python/cpython/pulls?q=is%3Apr+is%3Aopen+label%3A%22needs+backport+to+2.7%22) My question was between now and 1 Jan 2020 should we still be fixing things in 2.7 that we're not fixing in 3.5, or leave 2.7 in a security-only mode for the next 21 months? Looking at what has been closed recently, without getting a bpo for actually backporting, it appears that we're sort of doing this in practice anyhow.
Thanks! and even if my message was read differently than I intended, glad that it had a good effect. Michael Cuthbert (https://music21-mit.blogspot.com) From steve at holdenweb.com Wed Mar 14 13:21:16 2018 From: steve at holdenweb.com (Steve Holden) Date: Wed, 14 Mar 2018 17:21:16 +0000 Subject: [Python-Dev] Python 2.7 -- bugfix or security before EOL? In-Reply-To: <5F405E20-BB57-4275-B39F-19205BE07D26@mit.edu> References: <5F405E20-BB57-4275-B39F-19205BE07D26@mit.edu> Message-ID: Speaking from the sidelines, I'd say that any further backporting of non-security fixes would appear to be throwing good development effort away. This software is less than two years from the extremely well-heralded end of its life and people are expecting enhancements? It's a cold, ungrateful world we live in! It might be useful to retain the issues for the benefit of those who may wish to maintain the release after EOL, or at least get a list of them before the tags are wiped. regards Steve Steve Holden On Wed, Mar 14, 2018 at 1:16 PM, Michael Scott Cuthbert wrote: > > it still is in the time period before > > EOL that other recent versions have gone to security only. > > Again, not relevant. > > You might want to read http://python3statement.org/. > > I'm guessing my first message was unclear or able to be misunderstood in > some part -- I'm one of the frequent contributors to python3statement.org > and have moved my own Python projects to Py3 only (the main one, music21, > gets its 3.4+-only release this Saturday). I have NO desire to prolong the > 2.7 pain. > > What I am referring to is the number of "needs backport to 2.7" tags for > non-security-related bug-fixes in the issue tracker.
( > https://github.com/python/cpython/pulls?q=is%3Apr+is%3Aopen+label%3A%22needs+backport+to+2.7%22 > ) > My question was between now and 1 Jan 2020 should we still be fixing things > in 2.7 that we're not fixing in 3.5, or leave 2.7 in a security-only mode > for the next 21 months? Looking at what has been closed recently, without > getting a bpo for actually backporting, it appears that we're sort of doing > this in practice anyhow. > > Thanks! and even if my message was read differently than I intended, glad > that it had a good effect. > > Michael Cuthbert (https://music21-mit.blogspot.com) > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/steve%40holdenweb.com > > From tjreedy at udel.edu Wed Mar 14 15:08:12 2018 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 14 Mar 2018 15:08:12 -0400 Subject: [Python-Dev] Python 2.7 -- bugfix or security before EOL? In-Reply-To: <5F405E20-BB57-4275-B39F-19205BE07D26@mit.edu> References: <5F405E20-BB57-4275-B39F-19205BE07D26@mit.edu> Message-ID: On 3/14/2018 9:16 AM, Michael Scott Cuthbert wrote: > I'm guessing my first message was unclear or able to be misunderstood in > some part -- I'm one of the frequent contributors to python3statement.org > and have moved my own Python projects to > Py3 only (the main one, music21, gets its 3.4+-only release this > Saturday). I have NO desire to prolong the 2.7 pain. Yes, sorry I mis-read you -- though like you I am happy about the resulting decision/clarification. > What I am referring to is the number of "needs backport to 2.7" tags for > non-security-related bug-fixes in the issue tracker.
> (https://github.com/python/cpython/pulls?q=is%3Apr+is%3Aopen+label%3A%22needs+backport+to+2.7%22 > ) 14 is a small fraction of open fixes, which is perhaps your point. > My question was between now and 1 Jan 2020 should we still be fixing > things in 2.7 that we're not fixing in 3.5, or leave 2.7 in a > security-only mode for the next 21 months? Looking at what has been > closed recently, without getting a bpo for actually backporting, it > appears that we're sort of doing this in practice anyhow. The only people who can do substantive backports are those currently familiar with 2.7 and the old code and some of the subtle semantic differences. It seems that a decreasing fraction of those still want to backport fixes. > Thanks! and even if my message was read differently than I intended, > glad that it had a good effect. -- Terry Jan Reedy From chris.jerdonek at gmail.com Wed Mar 14 15:09:38 2018 From: chris.jerdonek at gmail.com (Chris Jerdonek) Date: Wed, 14 Mar 2018 19:09:38 +0000 Subject: [Python-Dev] Python 2.7 -- bugfix or security before EOL? In-Reply-To: <5F405E20-BB57-4275-B39F-19205BE07D26@mit.edu> References: <5F405E20-BB57-4275-B39F-19205BE07D26@mit.edu> Message-ID: Oh, that makes your original email make much more sense (at least to me). I also interpreted it to mean you were interested in extending the EOL date out further, rather than pointing out that it should probably already have been switched from "bugfix" to "security" status. --Chris On Wed, Mar 14, 2018 at 8:46 AM Michael Scott Cuthbert wrote: > > it still is in the time period before > > EOL that other recent versions have gone to security only. > > Again, not relevant. > > You might want to read http://python3statement.org/. > > I'm guessing my first message was unclear or able to be misunderstood in > some part --
I'm one of the frequent contributors to python3statement.org > and have moved my own Python projects to Py3 only (the main one, music21, > gets its 3.4+-only release this Saturday). I have NO desire to prolong the > 2.7 pain. > > What I am referring to is the number of "needs backport to 2.7" tags for > non-security-related bug-fixes in the issue tracker. ( > https://github.com/python/cpython/pulls?q=is%3Apr+is%3Aopen+label%3A%22needs+backport+to+2.7%22 > ) > My question was between now and 1 Jan 2020 should we still be fixing things > in 2.7 that we're not fixing in 3.5, or leave 2.7 in a security-only mode > for the next 21 months? Looking at what has been closed recently, without > getting a bpo for actually backporting, it appears that we're sort of doing > this in practice anyhow. > > Thanks! and even if my message was read differently than I intended, glad > that it had a good effect. > > Michael Cuthbert (https://music21-mit.blogspot.com) > From status at bugs.python.org Fri Mar 16 13:09:56 2018 From: status at bugs.python.org (Python tracker) Date: Fri, 16 Mar 2018 18:09:56 +0100 (CET) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20180316170956.8856311BDFC@psf.upfronthosting.co.za> ACTIVITY SUMMARY (2018-03-09 - 2018-03-16) Python tracker at https://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message.
Issues counts and deltas: open 6525 ( +9) closed 38312 (+41) total 44837 (+50) Open issues with patches: 2546 Issues opened (44) ================== #27645: Supporting native backup facility of SQLite https://bugs.python.org/issue27645 reopened by berker.peksag #33014: Clarify str.isidentifier docstring; fix keyword.iskeyword docs https://bugs.python.org/issue33014 reopened by terry.reedy #33037: Skip sending/receiving after SSL transport closing https://bugs.python.org/issue33037 opened by asvetlov #33038: GzipFile doesn't always ignore None as filename https://bugs.python.org/issue33038 opened by da #33039: int() and math.trunc don't accept objects that only define __i https://bugs.python.org/issue33039 opened by ncoghlan #33041: Issues with "async for" https://bugs.python.org/issue33041 opened by serhiy.storchaka #33042: New 3.7 startup sequence crashes PyInstaller https://bugs.python.org/issue33042 opened by htgoebel #33043: Add a 'Contributing to Docs' link at the bottom of docs.python https://bugs.python.org/issue33043 opened by willingc #33044: pdb from base class, get inside a method of derived class https://bugs.python.org/issue33044 opened by ishanSrt #33046: IDLE option to strip trailing whitespace automatically on save https://bugs.python.org/issue33046 opened by rhettinger #33047: "RuntimeError: dictionary changed size during iteration" using https://bugs.python.org/issue33047 opened by Delgan #33048: macOS job broken on Travis CI https://bugs.python.org/issue33048 opened by pitrou #33049: itertools.count() confusingly mentions zip() and sequence numb https://bugs.python.org/issue33049 opened by trey #33050: Centralized documentation of assumptions made by C code https://bugs.python.org/issue33050 opened by tvanslyke #33051: IDLE: Create new tab for editor options in configdialog https://bugs.python.org/issue33051 opened by csabella #33052: Sporadic segmentation fault in test_datetime https://bugs.python.org/issue33052 opened by pitrou #33053: Running a 
module with `-m` will add empty directory to sys.pat https://bugs.python.org/issue33053 opened by ztane #33054: unittest blocks when testing function using multiprocessing.Po https://bugs.python.org/issue33054 opened by Kenneth Chik #33055: bytes does not implement __bytes__() https://bugs.python.org/issue33055 opened by FHTMitchell #33057: logging.Manager.logRecordFactory is never used https://bugs.python.org/issue33057 opened by feinsteinben #33058: Enhance Python's Memory Instrumentation with COUNT_ALLOCS https://bugs.python.org/issue33058 opened by elizondo93 #33059: netrc module validates file mode only for /home/user/.netrc https://bugs.python.org/issue33059 opened by akoeltringer #33061: NoReturn missing from __all__ in typing.py https://bugs.python.org/issue33061 opened by Allen Tracht #33062: ssl_renegotiate() doesn't seem to be exposed https://bugs.python.org/issue33062 opened by vitaly.krug #33063: failed to build _ctypes: undefined reference to `ffi_closure_F https://bugs.python.org/issue33063 opened by siming85 #33065: IDLE debugger: problem importing user created module https://bugs.python.org/issue33065 opened by jcdlr #33066: raise an exception from multiple positions break the traceback https://bugs.python.org/issue33066 opened by hubo1016 #33067: http.client no longer sends HTTP request in one TCP package https://bugs.python.org/issue33067 opened by christian.heimes #33069: Maintainer information discarded when writing PKG-INFO https://bugs.python.org/issue33069 opened by p-ganssle #33070: Add platform triplet for RISC-V https://bugs.python.org/issue33070 opened by schwab #33071: Document that PyPI no longer requires 'register' https://bugs.python.org/issue33071 opened by p-ganssle #33073: Add as_integer_ratio() to int() objects https://bugs.python.org/issue33073 opened by rhettinger #33074: dbm corrupts index on macOS (_dbm module) https://bugs.python.org/issue33074 opened by nneonneo #33076: Trying to cleanly terminate a threaded Queue at exit 
of progra https://bugs.python.org/issue33076 opened by Delgan #33077: typing: Unexpected result with value of instance of class inhe https://bugs.python.org/issue33077 opened by ?????????????? ???????????????? #33078: Queue with maxsize can lead to deadlocks https://bugs.python.org/issue33078 opened by tomMoral #33079: subprocess: document the interaction between subprocess.Popen https://bugs.python.org/issue33079 opened by sloonz #33080: regen-importlib is causing build races against other regen-all https://bugs.python.org/issue33080 opened by Alexander Kanavin #33081: multiprocessing Queue leaks a file descriptor associated with https://bugs.python.org/issue33081 opened by Henrique Andrade #33082: multiprocessing docs bury very important 'callback=' args https://bugs.python.org/issue33082 opened by chadmiller-amzn #33083: math.factorial accepts non-integral Decimal instances https://bugs.python.org/issue33083 opened by mark.dickinson #33084: Computing median, median_high an median_low in statistics libr https://bugs.python.org/issue33084 opened by dcasmr #33085: *** Error in `python': double free or corruption (out): 0x0000 https://bugs.python.org/issue33085 opened by chenkai #33086: pip: IndexError https://bugs.python.org/issue33086 opened by hearot Most recent 15 issues with no replies (15) ========================================== #33086: pip: IndexError https://bugs.python.org/issue33086 #33085: *** Error in `python': double free or corruption (out): 0x0000 https://bugs.python.org/issue33085 #33080: regen-importlib is causing build races against other regen-all https://bugs.python.org/issue33080 #33079: subprocess: document the interaction between subprocess.Popen https://bugs.python.org/issue33079 #33078: Queue with maxsize can lead to deadlocks https://bugs.python.org/issue33078 #33076: Trying to cleanly terminate a threaded Queue at exit of progra https://bugs.python.org/issue33076 #33071: Document that PyPI no longer requires 'register' 
https://bugs.python.org/issue33071 #33070: Add platform triplet for RISC-V https://bugs.python.org/issue33070 #33067: http.client no longer sends HTTP request in one TCP package https://bugs.python.org/issue33067 #33066: raise an exception from multiple positions break the traceback https://bugs.python.org/issue33066 #33061: NoReturn missing from __all__ in typing.py https://bugs.python.org/issue33061 #33059: netrc module validates file mode only for /home/user/.netrc https://bugs.python.org/issue33059 #33053: Running a module with `-m` will add empty directory to sys.pat https://bugs.python.org/issue33053 #33052: Sporadic segmentation fault in test_datetime https://bugs.python.org/issue33052 #33051: IDLE: Create new tab for editor options in configdialog https://bugs.python.org/issue33051 Most recent 15 issues waiting for review (15) ============================================= #33082: multiprocessing docs bury very important 'callback=' args https://bugs.python.org/issue33082 #33078: Queue with maxsize can lead to deadlocks https://bugs.python.org/issue33078 #33070: Add platform triplet for RISC-V https://bugs.python.org/issue33070 #33069: Maintainer information discarded when writing PKG-INFO https://bugs.python.org/issue33069 #33063: failed to build _ctypes: undefined reference to `ffi_closure_F https://bugs.python.org/issue33063 #33058: Enhance Python's Memory Instrumentation with COUNT_ALLOCS https://bugs.python.org/issue33058 #33057: logging.Manager.logRecordFactory is never used https://bugs.python.org/issue33057 #33051: IDLE: Create new tab for editor options in configdialog https://bugs.python.org/issue33051 #33048: macOS job broken on Travis CI https://bugs.python.org/issue33048 #33042: New 3.7 startup sequence crashes PyInstaller https://bugs.python.org/issue33042 #33041: Issues with "async for" https://bugs.python.org/issue33041 #33038: GzipFile doesn't always ignore None as filename https://bugs.python.org/issue33038 #33037: Skip sending/receiving 
after SSL transport closing https://bugs.python.org/issue33037 #33034: urllib.parse.urlparse and urlsplit not raising ValueError for https://bugs.python.org/issue33034 #33031: Questionable code in OrderedDict definition https://bugs.python.org/issue33031 Top 10 most discussed issues (10) ================================= #26680: Incorporating float.is_integer into the numeric tower and Deci https://bugs.python.org/issue26680 27 msgs #33073: Add as_integer_ratio() to int() objects https://bugs.python.org/issue33073 16 msgs #33014: Clarify str.isidentifier docstring; fix keyword.iskeyword docs https://bugs.python.org/issue33014 15 msgs #27645: Supporting native backup facility of SQLite https://bugs.python.org/issue27645 10 msgs #32758: Stack overflow when parse long expression to AST https://bugs.python.org/issue32758 10 msgs #33042: New 3.7 startup sequence crashes PyInstaller https://bugs.python.org/issue33042 9 msgs #33077: typing: Unexpected result with value of instance of class inhe https://bugs.python.org/issue33077 8 msgs #32972: unittest.TestCase coroutine support https://bugs.python.org/issue32972 7 msgs #33034: urllib.parse.urlparse and urlsplit not raising ValueError for https://bugs.python.org/issue33034 7 msgs #33041: Issues with "async for" https://bugs.python.org/issue33041 7 msgs Issues closed (37) ================== #17288: cannot jump from a 'return' or 'exception' trace event https://bugs.python.org/issue17288 closed by serhiy.storchaka #21611: int() docstring - unclear what number is https://bugs.python.org/issue21611 closed by csabella #22674: RFE: Add signal.strsignal(): string describing a signal https://bugs.python.org/issue22674 closed by pitrou #25054: Capturing start of line '^' https://bugs.python.org/issue25054 closed by serhiy.storchaka #26701: Documentation for int constructor mentions __int__ but not __t https://bugs.python.org/issue26701 closed by ncoghlan #27984: singledispatch register should typecheck its argument 
https://bugs.python.org/issue27984 closed by lukasz.langa #28788: ConfigParser should be able to write config to a given filenam https://bugs.python.org/issue28788 closed by berker.peksag #29719: "Date" of what's new is confusing https://bugs.python.org/issue29719 closed by ned.deily #29804: test_ctypes test_pass_by_value fails on arm64 (aarch64) archit https://bugs.python.org/issue29804 closed by ned.deily #30249: improve struct.unpack_from's error message like struct.pack_in https://bugs.python.org/issue30249 closed by xiang.zhang #32227: singledispatch support for type annotations https://bugs.python.org/issue32227 closed by lukasz.langa #32328: ttk.Treeview: _tkinter.TclError: list element in quotes follow https://bugs.python.org/issue32328 closed by terry.reedy #32338: Save OrderedDict import in re https://bugs.python.org/issue32338 closed by serhiy.storchaka #32367: [Security] CVE-2017-17522: webbrowser.py in Python does not va https://bugs.python.org/issue32367 closed by ned.deily #32719: fatal error raised when Ctrl-C print loop https://bugs.python.org/issue32719 closed by pitrou #32757: Python 2.7 : Buffer Overflow vulnerability in exec() function https://bugs.python.org/issue32757 closed by serhiy.storchaka #32799: returned a resul https://bugs.python.org/issue32799 closed by ned.deily #32925: AST optimizer: Change a list into tuple in iterations and cont https://bugs.python.org/issue32925 closed by serhiy.storchaka #32946: Speed up import from non-packages https://bugs.python.org/issue32946 closed by serhiy.storchaka #32970: Improve disassembly of the MAKE_FUNCTION instruction https://bugs.python.org/issue32970 closed by serhiy.storchaka #32981: Catastrophic backtracking in poplib (CVE-2018-1060) and diffli https://bugs.python.org/issue32981 closed by ned.deily #32987: tokenize.py parses unicode identifiers incorrectly https://bugs.python.org/issue32987 closed by terry.reedy #32993: urllib and webbrowser.open() can open w/ file: protocol 
https://bugs.python.org/issue32993 closed by martin.panter #33020: Tkinter old style classes https://bugs.python.org/issue33020 closed by ned.deily #33021: Some fstat() calls do not release the GIL, possibly hanging al https://bugs.python.org/issue33021 closed by pitrou #33026: Fix jumping out of "with" block https://bugs.python.org/issue33026 closed by serhiy.storchaka #33036: test_selectors.PollSelectorTestCase failing on macOS 10.13.3 https://bugs.python.org/issue33036 closed by ned.deily #33040: Make itertools.islice supports negative values for start and s https://bugs.python.org/issue33040 closed by rhettinger #33045: SSL Dcumentation Error https://bugs.python.org/issue33045 closed by berker.peksag #33056: LEaking files in concurrent.futures.process https://bugs.python.org/issue33056 closed by pitrou #33060: Installation hangs at "Publishing product information" https://bugs.python.org/issue33060 closed by willingc #33064: lib2to3 fails on a trailing comma after **kwargs in a function https://bugs.python.org/issue33064 closed by lukasz.langa #33068: Inconsistencies in parsing (evaluating?) 
longstrings https://bugs.python.org/issue33068  closed by serhiy.storchaka

#33072: The interpreter bytecodes for with statements are overly compl
       https://bugs.python.org/issue33072  closed by serhiy.storchaka

#33075: typing.NamedTuple does not deduce Optional[] from using None a
       https://bugs.python.org/issue33075  closed by levkivskyi

#1647489: zero-length match confuses re.finditer()
       https://bugs.python.org/issue1647489  closed by serhiy.storchaka

#1693050: \w not helpful for non-Roman scripts
       https://bugs.python.org/issue1693050  closed by terry.reedy

From truestarecat at gmail.com  Fri Mar 16 06:54:51 2018
From: truestarecat at gmail.com (Igor Yakovchenko)
Date: Fri, 16 Mar 2018 14:54:51 +0400
Subject: [Python-Dev] ttk.Treeview.insert() does not allow to insert item with iid=0
Message-ID:

Hello,

I found a possible bug in the ttk.Treeview widget.
I'm working on a program that uses a tkinter UI. I use ttk.Treeview to
display some objects and I want to use integer iids for the items.
For example, I insert a row with treeview.insert(... iid=0, ...). But I
encountered a problem when I try to get this item from the treeview by
iid when iid = 0.
There is no item with such an iid. The item has an autogenerated iid,
just as if none had been specified.
I investigated the problem and found that in ttk.py, the body of
Treeview.insert(... iid=None, ...) has this check:

        if iid:
            res = self.tk.call(self._w, "insert", parent, index,
                "-id", iid, *opts)
        else:
            res = self.tk.call(self._w, "insert", parent, index, *opts)

It means that if iid is truthy it is used, otherwise an iid is
autogenerated.
Maybe the check should be "if iid is not None" rather than "if iid"? Or
is there some reason to do the check this way?

Igor Yakovchenko

-------------- next part --------------
An HTML attachment was scrubbed...
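The difference between the two checks discussed in the message above can be shown without a Tk display. The following is only a minimal sketch: the two helper functions are hypothetical stand-ins that build the argument tuple which would be handed to `self.tk.call`, and they are not the actual `ttk.py` code.

```python
def insert_args_truthy(parent, index, iid=None):
    """Mimic the `if iid:` check: every falsy iid counts as "not given"."""
    if iid:
        return ("insert", parent, index, "-id", iid)
    return ("insert", parent, index)


def insert_args_explicit(parent, index, iid=None):
    """Mimic `if iid is not None:`: only None counts as "not given"."""
    if iid is not None:
        return ("insert", parent, index, "-id", iid)
    return ("insert", parent, index)


# With iid=0 the truthiness check silently drops the requested id ...
print(insert_args_truthy("", "end", iid=0))
# ... while the explicit None check passes it through.
print(insert_args_explicit("", "end", iid=0))
```

For any truthy iid, such as 1 or '0', both helpers build the same tuple, so only falsy values like 0, 0.0 and '' are affected.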
From tjreedy at udel.edu  Fri Mar 16 16:22:20 2018
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 16 Mar 2018 16:22:20 -0400
Subject: [Python-Dev] ttk.Treeview.insert() does not allow to insert item with iid=0
In-Reply-To:
References:
Message-ID:

On 3/16/2018 6:54 AM, Igor Yakovchenko wrote:

This might fit python-list better, and may become a bugs.python.org
issue, but I will answer now, having just investigated.

> I found a possible bug with ttk.Treeview widget.

See below.

> I'm working on program that uses tkinter UI. I use ttk.Treeview to
> display some objects and I want to use integer iid of items.

For tk, the underlying graphics framework, iids are strings. "If iid is
specified, it is used as the item identifier", from the Treeview doc,
oversimplifies. When iid is specified, the iid used internally and
returned by Treeview() is a string based on the value passed. 1 becomes
'1', 1.0 becomes '1.0', (1,2,3) and [1,2,3] both become '1 2 3' (this is
consistent in tkinter), and int becomes "".

Since the transformation is the same when referencing an item, as with
tv.index(iid), one can usually ignore it. But it shows up if one prints
a return value (as I did to discover the above), or tries to compare it
with the original objects, or tries to use two different objects with
the same tk string as iids.

> For example, I insert a row with treeview.insert(... iid=0, ...).
> But I encountered a problem when I try to get this item from treeview
> by iid when iid =0.

Other things being equal, I agree that one should be able to pass 0 and
0.0 along with all other ints and floats. In the meanwhile, you could
pass '0' and retrieve with either '0' or 0. To not special-case 0,
always pass iid=str(row).

> There is no item with such iid. This item has autogenerated iid just
> like it's not specified.
> I investigated problem and found that in ttk.py, Treeview.insert(...
> iid=None, ...) in method's body has a check:
>         if iid:
>             res = self.tk.call(self._w, "insert", parent, index,
>                 "-id", iid, *opts)
>         else:
>             res = self.tk.call(self._w, "insert", parent, index, *opts)
> It means that if iid is "True" then use it else autogenerate it.
> Maybe there should be "if iid is not None", not "if iid"? Or there are
> some reasons to do check this way?

It might be that accepting '' as an iid would be a problem.
Our current tkinter expert, Serhiy Storchaka, should know. If so, "if
iid in (None, '')" would be needed.

--
Terry Jan Reedy

From python at mrabarnett.plus.com  Fri Mar 16 20:39:06 2018
From: python at mrabarnett.plus.com (MRAB)
Date: Sat, 17 Mar 2018 00:39:06 +0000
Subject: [Python-Dev] ttk.Treeview.insert() does not allow to insert item with iid=0
In-Reply-To:
References:
Message-ID: <4dd658cf-52f3-ddc3-8ef8-f9511dd85ad6@mrabarnett.plus.com>

On 2018-03-16 20:22, Terry Reedy wrote:
> On 3/16/2018 6:54 AM, Igor Yakovchenko wrote:
[snip]
>> There is no item with such iid. This item has autogenerated iid just
>> like it's not specified.
>> I investigated problem and found that in ttk.py, Treeview.insert(...
>> iid=None, ...) in method's body has a check:
>>         if iid:
>>             res = self.tk.call(self._w, "insert", parent, index,
>>                 "-id", iid, *opts)
>>         else:
>>             res = self.tk.call(self._w, "insert", parent, index, *opts)
>> It means that if iid is "True" then use it else autogenerate it.
>> Maybe there should be "if iid is not None", not "if iid"? Or there are
>> some reasons to do check this way?
>
> It might be that accepting '' as an iid would be a problem.
> Our current tkinter expert, Serhiy Storchaka, should know. If so, "if
> iid in (None, '')" would be needed.
>
The root of the tree has the iid ''.

From agnosticdev at gmail.com  Tue Mar 20 07:15:52 2018
From: agnosticdev at gmail.com (Agnostic Dev)
Date: Tue, 20 Mar 2018 06:15:52 -0500
Subject: [Python-Dev] Hello!
Message-ID:

My name is Matt Eaton (Agnostic Dev) and I am just joining the mailing
list so I wanted to reach out and say hello!

From ethan at stoneleaf.us  Tue Mar 20 12:20:05 2018
From: ethan at stoneleaf.us (Ethan Furman)
Date: Tue, 20 Mar 2018 09:20:05 -0700
Subject: [Python-Dev] Hello!
In-Reply-To:
References:
Message-ID: <5AB134B5.2070103@stoneleaf.us>

On 03/20/2018 04:15 AM, Agnostic Dev wrote:

> My name is Matt Eaton (Agnostic Dev) and I am just joining the mailing list so I wanted to reach out and say hello!

Hello back! Welcome to the list!

Just as a reminder, this list is for discussions of the development OF
Python, or how to make Python itself better. If you want to discuss
development WITH Python, then you'll want the general Python List instead.

--
~Ethan~

From chris.barker at noaa.gov  Tue Mar 20 20:02:26 2018
From: chris.barker at noaa.gov (Chris Barker)
Date: Wed, 21 Mar 2018 00:02:26 +0000
Subject: [Python-Dev] ttk.Treeview.insert() does not allow to insert item with iid=0
In-Reply-To:
References:
Message-ID:

On Fri, Mar 16, 2018 at 10:54 AM, Igor Yakovchenko wrote:

> I investigated problem and found that in ttk.py, Treeview.insert(...
> iid=None, ...) in method's body has a check:
>         if iid:
>             res = self.tk.call(self._w, "insert", parent, index,
>                 "-id", iid, *opts)
>         else:
>             res = self.tk.call(self._w, "insert", parent, index, *opts)
> It means that if iid is "True" then use it else autogenerate it.
> Maybe there should be "if iid is not None", not "if iid"? Or there are
> some reasons to do check this way?
>

isn't it considered pythonic to both: use None as a default for "not
specified" AND use:

if something is None

to check if the parameter has been specified?

however, this is a bit of an odd case:

ids are strings, but it allows you to pass in a non-string and stringified version will be used.
so None should be the only special case -- not "anything false"

(but if the empty string is the root, then it's another special case --
again, good to check for None rather than anything falsey)

so it probably should do something like:

        if iid is not None:
            res = self.tk.call(self._w, "insert", parent, index,
                "-id", str(iid), *opts)
        else:
            res = self.tk.call(self._w, "insert", parent, index, *opts)

note both the check for None and the str() call.

I'm assuming the str() call happens under the hood at the boundary
already, but better to make it explicit in the Python.

Alternatively: this has been around a LONG time, so maybe the answer is
"don't do that" -- i.e. don't use anything falsey as an iid. But it
would still be good to make the docs clearer about that.

-CHB

-------

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From chris.barker at noaa.gov  Tue Mar 20 20:32:11 2018
From: chris.barker at noaa.gov (Chris Barker)
Date: Wed, 21 Mar 2018 00:32:11 +0000
Subject: [Python-Dev] Symmetry arguments for API expansion
In-Reply-To:
References: <52406AEA-FB0B-4056-8966-4FF85548241F@gmail.com> <20180312191027.7654fee9@fsol> <46E16295-E5B0-4808-B3AC-34241DE50926@gmail.com> <5AA84F73.3030409@canterbury.ac.nz>
Message-ID:

It seems .as_integer_ratio() has been resolved.

What about the original .is_integer() request? (Or did I miss that
somehow?)

Anyway, it seems like __index__() should play a role here somehow...
isn't that how you ask an object for the integer version of itself?

Could float et al. add an __index__ method that would return a ValueError
if the value was not an integer?

Of course, as pointed out earlier in this thread, an "exact" integer is probably not what you want with a float anyway....
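For reference, a minimal sketch of how `__index__` behaves for the cases raised in the question above: a type that defines `__index__` can be used as a sequence index, while float defines no such method and is rejected regardless of its value. `WholeNumber` is a hypothetical class invented purely for illustration.

```python
import operator


class WholeNumber:
    """Hypothetical int-like type that opts in to being used as an index."""

    def __init__(self, value):
        self._value = int(value)

    def __index__(self):
        return self._value


letters = "abcdefgh"

# An object with __index__ is accepted wherever a true integer is required.
print(letters[WholeNumber(5)])
print(operator.index(WholeNumber(5)))

# A float is refused even when its value is a whole number.
try:
    letters[5.0]
except TypeError as exc:
    print("float index rejected:", exc)

try:
    operator.index(5.0)
except TypeError as exc:
    print("operator.index rejected float:", exc)
```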
-CHB On Tue, Mar 13, 2018 at 10:29 PM, Tim Peters wrote: > [Tim] > >> An obvious way to extend it is for Fraction() to look for a special > >> method too, say "_as_integer_ratio()". > > [Greg Ewing] > > Why not __as_integer_ratio__? > > Because. at this point, that would be beating a dead horse ;-) > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > chris.barker%40noaa.gov > -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Tue Mar 20 21:32:12 2018 From: guido at python.org (Guido van Rossum) Date: Tue, 20 Mar 2018 18:32:12 -0700 Subject: [Python-Dev] Symmetry arguments for API expansion In-Reply-To: References: <52406AEA-FB0B-4056-8966-4FF85548241F@gmail.com> <20180312191027.7654fee9@fsol> <46E16295-E5B0-4808-B3AC-34241DE50926@gmail.com> <5AA84F73.3030409@canterbury.ac.nz> Message-ID: No, the whole point of __index__ is that it refuses *all* floats -- otherwise people will do approximate computations that for their simple test inputs give whole numbers, use them as sequence indices, and then find their code broken only when the computation incurs some floating point approximation. OTOH, is_integer() specifically asks whether a given real value is a whole number so you can cast it to int() without rounding, etc. On Tue, Mar 20, 2018 at 5:32 PM, Chris Barker wrote: > It seems .as_integer_ratio() has been resolved. > > what about the original .is_integer() request? (Or did I miss that > somehow?) > > Anyway, it seems like __index__() should play a role here somehow... 
isn't > that how you ask an object for the integer version of itself? > > Could float et al. add an __index__ method that would return a ValueError > if the value was not an integer? > > Of course, as pointed out earlier in this thread, an "exact" integer is > probably not what you want with a float anyway.... > > -CHB > > > On Tue, Mar 13, 2018 at 10:29 PM, Tim Peters wrote: > >> [Tim] >> >> An obvious way to extend it is for Fraction() to look for a special >> >> method too, say "_as_integer_ratio()". >> >> [Greg Ewing] >> > Why not __as_integer_ratio__? >> >> Because. at this point, that would be beating a dead horse ;-) >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> https://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: https://mail.python.org/mailman/options/python-dev/chris. >> barker%40noaa.gov >> > > > > -- > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > guido%40python.org > > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From agnosticdev at gmail.com Tue Mar 20 21:32:10 2018 From: agnosticdev at gmail.com (Agnostic Dev) Date: Tue, 20 Mar 2018 20:32:10 -0500 Subject: [Python-Dev] Hello! In-Reply-To: <5AB134B5.2070103@stoneleaf.us> References: <5AB134B5.2070103@stoneleaf.us> Message-ID: Thank you, Ethan. It sounds like I landed in the right place. Thank you for the heads up. 
On Tue, Mar 20, 2018 at 11:20 AM, Ethan Furman wrote: > On 03/20/2018 04:15 AM, Agnostic Dev wrote: > > My name is Matt Eaton (Agnostic Dev) and I am just joining the mailing >> list so I wanted to reach out and say hello! >> > > Hello back! Welcome to the list! > > Just as a reminder, this list is for discussions of the development OF > Python, or how to make Python itself better. If you want to discuss > development WITH Python, then you'll want the general Python List instead. > > -- > ~Ethan~ > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/agnosticd > ev%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Wed Mar 21 00:34:29 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 21 Mar 2018 15:34:29 +1100 Subject: [Python-Dev] Symmetry arguments for API expansion In-Reply-To: References: <46E16295-E5B0-4808-B3AC-34241DE50926@gmail.com> <5AA84F73.3030409@canterbury.ac.nz> Message-ID: <20180321043429.GP16661@ando.pearwood.info> On Wed, Mar 21, 2018 at 12:32:11AM +0000, Chris Barker wrote: > Could float et al. add an __index__ method that would return a ValueError > if the value was not an integer? That would allow us to write things like: "abcdefgh"[5.0] which is one of the things __index__ was invented to prevent. > Of course, as pointed out earlier in this thread, an "exact" integer is > probably not what you want with a float anyway.... Not always. For example, I might want a function factorial(x) which returns x! when x is an exact integer value, and gamma(x+1) when it is not. That is what the HP-48 series of calculators do. (This is just an illustration.) Another example is that pow() functions sometimes swap to an exact algorithm if the power is an int. 
There's no particular reason why
x**n and x**n.0 ought to be different, but they are:

py> 123**10
792594609605189126649

py> 123**10.0
7.925946096051892e+20

On the other hand, some might argue that by passing 10.0 as the power, I
am specifically requesting a float implementation and result. I don't
wish to argue in favour of either position, but just to point out that
it is sometimes reasonable to want to know whether a float represents an
exact integer value or not.

--
Steve

From storchaka at gmail.com  Wed Mar 21 04:06:08 2018
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Wed, 21 Mar 2018 10:06:08 +0200
Subject: [Python-Dev] Deprecating float.is_integer()
Message-ID:

I searched for usages of is_integer() on GitHub and found that it is
used *only* in silly code like (x/5).is_integer(), (x**0.5).is_integer()
(or even (x**(1/3)).is_integer()) and in loops like:

    i = 0
    while i < 20:
        if i.is_integer():
            print(i)
        i += 0.1

(x/5).is_integer() is an awful way of determining divisibility by 5.
It returns the wrong result for large integers and some floats. (x % 5 ==
0) is a clearer and more reliable way (or the PEP 8 compliant (not x % 5)).

Does anybody know examples of the correct use of float.is_integer() in
real programs? For now it looks just like a bug magnet. I suggest
deprecating it in 3.7 or 3.8 and removing it in 3.9 or 3.10. If you ever
need to test if a float is an exact integer, you could use (not x % 1.0).
It is even faster than x.is_integer().

From chris.barker at noaa.gov  Wed Mar 21 06:31:19 2018
From: chris.barker at noaa.gov (Chris Barker)
Date: Wed, 21 Mar 2018 10:31:19 +0000
Subject: [Python-Dev] Symmetry arguments for API expansion
In-Reply-To: <20180321043429.GP16661@ando.pearwood.info>
References: <46E16295-E5B0-4808-B3AC-34241DE50926@gmail.com> <5AA84F73.3030409@canterbury.ac.nz> <20180321043429.GP16661@ando.pearwood.info>
Message-ID:

On Wed, Mar 21, 2018 at 4:42 AM Steven D'Aprano wrote:

>
> > Could float et al.
add an __index__ method that would return a ValueError
> > if the value was not an integer?
>
> That would allow us to write things like:
>
>     "abcdefgh"[5.0]
>
> which is one of the things __index__ was invented to prevent.

I'm not so sure -- it was invented to prevent using e.g. 6.1 as an index,
which int(I) would allow. More specifically, it was invented to allow true
integers that aren't a Python int (like numpy int types).

But, in fact, it is common to use floating point computation to compute an
index -- though usually one would make a conscious choice between round()
and floor() and ceil() when doing so.

Passing floor(a_float) as an index is a perfectly reasonable thing to do.

But Guido's point is well taken -- having __index__ fail based on value is
setting people up for bugs down the line.

However, it seems use of is_integer() on a float is setting people up for
exactly the same sorts of bugs.

> Another example is that pow() functions sometimes swap to an exact
> algorithm if the power is an int. There's no particular reason why
> x**n and x**n.0 ought to be different, but they are:
>
> py> 123**10
> 792594609605189126649
>
> py> 123**10.0
> 7.925946096051892e+20

I think this is exactly like the __index__ use case. If the exponent is a
literal, use what you mean. If the exponent is a computed float, then you
really don't want a different result depending on whether the computed
value is exactly an integer or one ULP off.

The user should check/convert to an integer with a method appropriate to
the problem at hand.

If it wasn't too heavyweight, it might be nice to have some sort of flag on
floats indicating whether they really ARE an integer, rather than happen to
be:

- created from an integer literal
- created from an integer object
- result of floor(), ceil() or round()

Any others?

But that would be too heavyweight, and not that useful.

In short, is_integer() is an attractive nuisance.

-CHB

PS: for the power example, the "right"
solution is to have two operators:
integer power and float power, like we do for float vs floor division. No,
it's not worth it in this case, but having it be value dependent would be
worse than type dependent.

>
> On the other hand, some might argue that by passing 10.0 as the power, I
> am specifically requesting a float implementation and result. I don't
> wish to argue in favour of either position, but just to point out that
> it is sometimes reasonable to want to know whether a float represents an
> exact integer value or not.
>
> --
> Steve
>

--

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From chris.barker at noaa.gov  Wed Mar 21 06:31:21 2018
From: chris.barker at noaa.gov (Chris Barker)
Date: Wed, 21 Mar 2018 10:31:21 +0000
Subject: [Python-Dev] Deprecating float.is_integer()
In-Reply-To:
References:
Message-ID:

>
> Does anybody know examples of the correct use of float.is_integer() in
> real programs? For now it looks just like a bug magnet. I suggest to
> deprecate it in 3.7 or 3.8 and remove in 3.9 or 3.10.

+1

It really doesn't appear to be the right solution for any problem.

-CHB

--

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

-------------- next part --------------
An HTML attachment was scrubbed...
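To make the "bug magnet" concern in this thread concrete, here is a small sketch of why the `(x/5).is_integer()` idiom misreports divisibility for large integers, and why the accumulating `i += 0.1` loop rarely lands on a whole number. The two helper names are hypothetical and exist only for this illustration.

```python
def divisible_by_float_division(x, d):
    """The criticised idiom: divide, then ask whether the quotient is whole."""
    return (x / d).is_integer()


def divisible_by_modulo(x, d):
    """The suggested replacement, which is exact for ints of any size."""
    return x % d == 0


big = 10**17 + 1  # ends in ...0001, so it cannot be a multiple of 5

# The true quotient is 20000000000000000.2, but floats of that magnitude
# are spaced 4 apart, so the division rounds to a whole-valued float.
print(divisible_by_float_division(big, 5))  # True (wrong answer)
print(divisible_by_modulo(big, 5))          # False (right answer)

# Ten additions of 0.1 accumulate rounding error and miss 1.0.
i = 0.0
for _ in range(10):
    i += 0.1
print(i, i.is_integer())
```

The modulo form never converts to float, which is why it stays correct for integers beyond the 53-bit precision of a double.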
From storchaka at gmail.com  Wed Mar 21 06:52:27 2018
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Wed, 21 Mar 2018 12:52:27 +0200
Subject: [Python-Dev] What is the purpose of NEXT_BLOCK()?
Message-ID:

There is the NEXT_BLOCK() macro in compile.c. It creates a new block,
creates an implicit jump from the current block to the new block, and
sets it as the current block.

But why is it used? Everything seems to work if NEXT_BLOCK() is removed.
If there is a need for NEXT_BLOCK() (if it reduces the computational
complexity of compilation or allows some optimizations), it should be
documented, and we should analyze the code and add the missing
NEXT_BLOCK() calls where they are needed, and perhaps add new tests.
Otherwise it can be removed.

From ncoghlan at gmail.com  Wed Mar 21 08:34:27 2018
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 21 Mar 2018 22:34:27 +1000
Subject: [Python-Dev] Symmetry arguments for API expansion
In-Reply-To:
References: <52406AEA-FB0B-4056-8966-4FF85548241F@gmail.com> <20180312191027.7654fee9@fsol> <46E16295-E5B0-4808-B3AC-34241DE50926@gmail.com> <5AA84F73.3030409@canterbury.ac.nz>
Message-ID:

On 14 March 2018 at 08:29, Tim Peters wrote:

> [Tim]
> >> An obvious way to extend it is for Fraction() to look for a special
> >> method too, say "_as_integer_ratio()".
>
> [Greg Ewing]
> > Why not __as_integer_ratio__?
>
> Because. at this point, that would be beating a dead horse ;-)
>

I'm not so sure about that, as if we define a protocol method for it, then
we'd presumably also define an "operator.as_integer_ratio" function, and
that function could check __index__ in addition to checking the new protocol method.
For example:

    def as_integer_ratio(n):
        # Automatically accept true integers
        if hasattr(n, "__index__"):
            return (n.__index__(), 1)
        # New reserved protocol method
        if hasattr(n, "__integer_ratio__"):
            return n.__integer_ratio__()
        # Historical public protocol method
        if hasattr(n, "as_integer_ratio"):
            return n.as_integer_ratio()
        # Check for lossless integer conversion
        try:
            int_n = int(n)
        except TypeError:
            pass
        else:
            if int_n == n:
                return (int_n, 1)
        raise TypeError(f"{type(n)} does not support conversion to an integer ratio")

Similarly, on the "operator.is_integer" front:

    def is_integer(n):
        # Automatically accept true integers
        if hasattr(n, "__index__"):
            return True
        # New reserved protocol method
        if hasattr(n, "__is_integer__"):
            return n.__is_integer__()
        # Historical public protocol method
        if hasattr(n, "is_integer"):
            return n.is_integer()
        # As a last resort, check for lossless int conversion
        return int(n) == n

Cheers,
Nick.

P.S. I've suggested "operator" as a possible location, since that's where
we put "operator.index", and it's a low level module that doesn't bring in
any transitive dependencies. However, putting these protocol wrappers
somewhere else (e.g. in "math" or "numbers") may also make sense.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From steve at pearwood.info  Wed Mar 21 08:38:32 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 21 Mar 2018 23:38:32 +1100
Subject: [Python-Dev] Symmetry arguments for API expansion
In-Reply-To:
References: <5AA84F73.3030409@canterbury.ac.nz> <20180321043429.GP16661@ando.pearwood.info>
Message-ID: <20180321123830.GR16661@ando.pearwood.info>

On Wed, Mar 21, 2018 at 10:31:19AM +0000, Chris Barker wrote:
> On Wed, Mar 21, 2018 at 4:42 AM Steven D'Aprano wrote:

> > > Could float et al. add an __index__ method that would return a ValueError
> > > if the value was not an integer?
> >
> > That would allow us to write things like:
> >
> >     "abcdefgh"[5.0]
> >
> > which is one of the things __index__ was invented to prevent.
>
> I'm not so sure -- it was invented to prevent using e.g. 6.1 as an index,
> which int(I) would allow.

As would int(6.0). If we wanted 6.0 to be accepted as an index, then
floats would already have an __index__ method :-)

[...]
> But Guido's point is well taken -- having __index__ fail based on value is
> setting people up for bugs down the line.
>
> However, it seems use of is_integer() on a float is setting people up for
> exactly the same sorts of bugs.

I don't think so. You aren't going to stop people from testing whether a
float is an integer. (And why should you? It isn't *wrong* to do so.
Some floats simply are integer valued.)

All you will do is force them to write code which is even worse than
what they have now. One wrong solution:

    int(x) == x

That can raise ValueError and OverflowError, but at least it is somewhat
understandable. Serhiy suggested that people should use the cryptic:

    (not x % 1.0)

but that's hardly self-documenting: it's not obvious what it does or how
it works. Suppose I see that snippet in a code review, and let's suppose
I recognise it and am not totally perplexed by it. Will it pass the
review? I have to make a decision:

- will it fail when x is an INF or NAN?
- does it give the correct results when x is negative?
- does this suffer from rounding errors that could affect the result?
- what if x is not a float, but a Decimal, a Fraction or an int too
  big to convert to a float?

None of the answers are obvious at a glance. In fact, Serhiy's
suggestion is not correct when x is not a float:

py> from fractions import Fraction
py> x = Fraction(1) + Fraction(1, 10**500)  # certainly not an integer
py> x.denominator == 1  # sadly Fraction doesn't support is_integer
False
py> not x % 1.0
True

> > Another example is that pow() functions sometimes swap to an exact
> > algorithm if the power is an int.
There's no particular reason why
> > x**n and x**n.0 ought to be different, but they are:
> >
> > py> 123**10
> > 792594609605189126649
> >
> > py> 123**10.0
> > 7.925946096051892e+20
>
>
> I think this is exactly like the __index__ use case. If the exponent is a
> literal, use what you mean.

Naturally. I already alluded to that in my earlier post. Nevertheless,
this is just an example, and we shouldn't expect that the power will be
a literal. I'm just illustrating the concept.

> If the exponent is a computed float, then you
> really don't want a different result depending on whether the computed
> value is exactly an integer or one ULP off.

I don't think you actually mean to say that. I'm pretty sure that we
*do* want different results if the exponent differs from an integer by
one ULP. After all, that's what happens now:

py> x = 25
py> x**1.0
25.0
py> x**(1.0+(2**-52))  # one ULP above
25.000000000000018
py> x**(1.0-(2**-53))  # one ULP below
24.99999999999999

I don't want to change the behaviour of pow(), but we shouldn't dismiss
the possibility of some other numeric function wanting to treat values
N.0 and N the same. Let's say, an is_prime(x) function that supports
floats as well as ints:

    is_prime(3.0)  # True
    is_prime(3.00001)  # False

If the argument x.is_integer() returns True, then we convert to an int
and test for primality. If not, then it's definitely not prime.

> The user should check/convert to an integer with a method appropriate to
> the problem at hand.

Oh, you mean something like x.is_integer()? I agree!

*wink*

> If it wasn't too heavyweight, it might be nice to have some sort of flag on
> floats indicating whether they really ARE an integer, rather than happen to
> be:
>
> - created from an integer literal
> - created from an integer object
> - result of floor(), ceil() or round()

I don't understand this.
You seem to be saying that none of the
following are "really" integer valued:

    float(10)
    floor(10.1)
    ceil(10.1)
    round(10.1)

If they're not all exactly equal to the integer 10, what on earth should
they equal?

--
Steve

From robertomartinezp at gmail.com  Wed Mar 21 11:03:20 2018
From: robertomartinezp at gmail.com (Roberto Martínez)
Date: Wed, 21 Mar 2018 15:03:20 +0000
Subject: [Python-Dev] ThreadedProcessPoolExecutor
Message-ID:

Hi,

I've made a custom concurrent.futures.Executor mixing the
ProcessPoolExecutor and ThreadPoolExecutor.

I've published it here:

https://github.com/nilp0inter/threadedprocess

This executor is very similar to a ProcessPoolExecutor, but each process in
the pool has its own ThreadPoolExecutor inside.

The motivation for this executor is to mitigate a problem we have in a
project where we have a very large number of long-running IO-bound tasks
that have to run concurrently. Those long-running tasks contain sparse
CPU-bound operations.

To resolve this problem I considered multiple solutions:

   1. Use asyncio to run the IO part as tasks and use a ProcessPoolExecutor
   to run the CPU-bound operations with "run_in_executor". Unfortunately
   the CPU operations depend on a large memory context, and using a
   ProcessPoolExecutor this way forces the parent process to pickle all of
   the context to send it to the task, and because the context is so large,
   this operation is itself very CPU demanding. So it doesn't work.
   2. Executing the IO/CPU-bound operations in different processes with
   multiprocessing.Process. This actually works, but the number of idle
   processes in the system is too large, resulting in a bad memory footprint.
   3. Executing the IO/CPU-bound operations in threads. This doesn't work
   because the sum of all CPU operations saturates the core where the Python
   process is running and the other cores are wasted doing nothing.
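The general idea of combining the two pool types can be sketched with the standard library alone. This is only an illustration under stated assumptions, not the implementation in the threadedprocess package: `threaded_process_map`, `run_chunk_in_threads` and `simulated_task` are hypothetical names, and the sketch distributes a fixed batch of work rather than providing a full `Executor` subclass.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def run_chunk_in_threads(func, chunk, threads_per_process):
    """Run func over one chunk with a thread pool (executed inside a worker process)."""
    with ThreadPoolExecutor(max_workers=threads_per_process) as threads:
        return list(threads.map(func, chunk))


def threaded_process_map(func, items, processes=2, threads_per_process=4):
    """Map func over items using a few processes, each running its own thread pool."""
    items = list(items)
    # Round-robin split so every process receives a similar share of the work.
    chunks = [items[i::processes] for i in range(processes)]
    with ProcessPoolExecutor(max_workers=processes) as pool:
        futures = [
            pool.submit(run_chunk_in_threads, func, chunk, threads_per_process)
            for chunk in chunks
        ]
        chunk_results = [future.result() for future in futures]
    # Undo the round-robin split so results line up with the input order.
    merged = [None] * len(items)
    for offset, result in enumerate(chunk_results):
        merged[offset::processes] = result
    return merged


def simulated_task(n):
    """Stand-in for a task mixing IO waits with short CPU-bound work."""
    return n * n
```

A call such as `threaded_process_map(simulated_task, range(10))` should be made from a script under an `if __name__ == "__main__":` guard, since the worker processes may need to import the module that defines the functions. Only `func` and the items are pickled here, which sidesteps the large-context problem only if each worker builds or loads its own context.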

So I coded the ThreadedProcessPoolExecutor, which helped me keep the number of processes under control (I just have one process per CPU core) while allowing very high concurrency (hundreds of threads per process).

I have a couple of questions:

The first one is about the license. Given that I copied the majority of the code from the concurrent.futures library, I understand that I have to publish the code under the PSF LICENSE. Is this correct?

My second question is about the package namespace. Given that this is a concurrent.futures.Executor subclass, I understand that the most intuitive place to locate it is under concurrent.futures. Is this a suitable use case for namespace packages? Is this a good idea?

Best regards,
Roberto
From guido at python.org Wed Mar 21 11:08:22 2018
From: guido at python.org (Guido van Rossum)
Date: Wed, 21 Mar 2018 08:08:22 -0700
Subject: [Python-Dev] Deprecating float.is_integer()
In-Reply-To:
References:
Message-ID:

I searched 6M LoC of Python code at Dropbox and found only three uses. They seem legit. Two are about formatting a number that's given as a float, deciding whether to print a float as 42 or 3.14. The third is attempting a conversion from float to integer where a non-integer must raise a specific exception (the same function also supports a string as long as it can be parsed as an int).

I don't doubt we would get by if is_integer() was deprecated.

On Wed, Mar 21, 2018 at 3:31 AM, Chris Barker wrote:

>
>> Does anybody know examples of the correct use of float.is_integer() in
>> real programs? For now it looks just like a bug magnet. I suggest to
>> deprecate it in 3.7 or 3.8 and remove in 3.9 or 3.10.
>
>
> +1
>
> It really doesn't appear to be the right solution for any problem.
>
> -CHB
> --
>
> Christopher Barker, Ph.D.
> Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > guido%40python.org > > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Wed Mar 21 11:13:32 2018 From: guido at python.org (Guido van Rossum) Date: Wed, 21 Mar 2018 08:13:32 -0700 Subject: [Python-Dev] What is the purpose of NEXT_BLOCK()? In-Reply-To: References: Message-ID: Maybe spelunking in the Python 2 branch will help? It seems it was introduced in 2005 by Jeremy Hylton with this comment: /* The distinction between NEW_BLOCK and NEXT_BLOCK is subtle. (I'd like to find better names.) NEW_BLOCK() creates a new block and sets it as the current block. NEXT_BLOCK() also creates an implicit jump from the current block to the new block. */ That comment (and NEW_BLOCK()) are no longer found in the Python 3 source. On Wed, Mar 21, 2018 at 3:52 AM, Serhiy Storchaka wrote: > There is the NEXT_BLOCK() macro in compile.c. It creates a new block, > creates an implicit jump from the current block to the new block, and sets > it as the current block. > > But why it is used? All seems working if remove NEXT_BLOCK(). If there was > a need of NEXT_BLOCK() (if it reduces the computational complexity of > compilation or allows some optimizations), it should be documented, and we > should analyze the code and add missed NEXT_BLOCK() where they are needed, > and perhaps add new tests. Otherwise it can be removed. 
> > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/guido% > 40python.org > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Wed Mar 21 11:23:08 2018 From: guido at python.org (Guido van Rossum) Date: Wed, 21 Mar 2018 08:23:08 -0700 Subject: [Python-Dev] ThreadedProcessPoolExecutor In-Reply-To: References: Message-ID: Roberto, That looks like an interesting class. I presume you're intending to publish this as a pip package on PyPI.python.org? I'm no lawyer, but I believe you can license your code under a new license (I recommend BSD) as long as you keep a copy and a mention of the PSF license in your distribution as well. (Though perhaps you could structure your code differently and inherit from the standard library modules rather than copying them?) In terms of the package namespace, do not put it in the same namespace as standard library code! It probably won't work and will cause world-wide pain and suffering for the users of your code. Invent your project name and use that as a top-level namespace, like everyone else. :-) Good luck with your project, --Guido On Wed, Mar 21, 2018 at 8:03 AM, Roberto Mart?nez < robertomartinezp at gmail.com> wrote: > Hi, > > I've made a custom concurrent.futures.Executor mixing the > ProcessPoolExecutor and ThreadPoolExecutor. > > I've published it here: > > https://github.com/nilp0inter/threadedprocess > > This executor is very similar to a ProcessPoolExecutor, but each process > in the pool have it's own ThreadPoolExecutor inside. > > The motivation for this executor is mitigate the problem we have in a > project were we have a very large number of long running IO bounded tasks, > that have to run concurrently. 
Those long running tasks have sparse CPU > bounded operations. > > To resolve this problem I considered multiple solutions: > > 1. Use asyncio to run the IO part as tasks and use a > ProcessPoolExecutor to run the CPU bounded operations with > "run_in_executor". Unfortunately the CPU operations depends on a large > memory context, and using a ProcessPoolExecutor this way force the parent > process to picklelize all the context to send it to the task, and because > the context is so large, this operation is itself very CPU demanding. So it > doesn't work. > 2. Executing the IO/CPU bounded operations in different processes with > multiprocessing.Process. This actually works, but the number of idle > processes in the system is too large, resulting in a bad memory footprint. > 3. Executing the IO/CPU bounded operations in threads. This doesn't > work because the sum of all CPU operations saturate the core where the > Python process is running and the other cores are wasted doing nothing. > > So I coded the ThreadedProcessPoolExecutor that helped me maintaining the > number of processes under control (I just have one process per CPU core) > allowing me to have a very high concurrency (hundreds of threads per > process). > > I have a couple of questions: > > The first one is about the license. Given that I copied the majority of > the code from the concurrent.futures library, I understand that I have to > publish the code under the PSF LICENSE. Is this correct? > > My second question is about the package namespace. Given that this is an > concurrent.futures.Executor subclass I understand that more intuitive place > to locate it is under concurrent.futures. Is this a suitable use case for > namespace packages? Is this a good idea? 
> > Best regards, > Roberto > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > guido%40python.org > > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From rob at sixty-north.com Wed Mar 21 04:33:25 2018 From: rob at sixty-north.com (Robert Smallshire) Date: Wed, 21 Mar 2018 09:33:25 +0100 Subject: [Python-Dev] Symmetry arguments for API expansion In-Reply-To: <20180321043429.GP16661@ando.pearwood.info> References: <46E16295-E5B0-4808-B3AC-34241DE50926@gmail.com> <5AA84F73.3030409@canterbury.ac.nz> <20180321043429.GP16661@ando.pearwood.info> Message-ID: As requested on the bug tracker, I've submitted a pull request for is_integer() support on the other numeric types. https://github.com/python/cpython/pull/6121 These are the tactics I used to implement it: - float: is_integer() already exists, so no changes - int: return True - Real: return x == int(x). Although Real doesn't explicitly support conversation to int with __int__, it does support conversion to int with __trunc__. The int constructor falls back to using __trunc__. - Rational (also inherited by Fraction): return x.denominator == 1 as Rational requires that all numbers must be represented in lowest form. - Integral: return True - Decimal: expose the existing dec_mpd_isinteger C function to Python as is_integer() -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Wed Mar 21 12:12:14 2018 From: guido at python.org (Guido van Rossum) Date: Wed, 21 Mar 2018 09:12:14 -0700 Subject: [Python-Dev] Symmetry arguments for API expansion In-Reply-To: References: <46E16295-E5B0-4808-B3AC-34241DE50926@gmail.com> <5AA84F73.3030409@canterbury.ac.nz> <20180321043429.GP16661@ando.pearwood.info> Message-ID: Thank you! 
As you may or may not have noticed in a different thread, we're going through a small existential crisis regarding the usefulness of is_integer() -- Serhiy believes it is not useful (and even an attractive nuisance) and should be deprecated. OTOH the existence of dec_mpd_isinteger() seems to validate to me that it actually exposes useful functionality (and every Python feature can be abused, so that alone should not be a strong argument for deprecation). On Wed, Mar 21, 2018 at 1:33 AM, Robert Smallshire wrote: > As requested on the bug tracker, I've submitted a pull request for > is_integer() support on the other numeric types. > https://github.com/python/cpython/pull/6121 > > These are the tactics I used to implement it: > > - float: is_integer() already exists, so no changes > > - int: return True > > - Real: return x == int(x). Although Real doesn't explicitly support > conversation to int with __int__, it does support conversion to int with > __trunc__. The int constructor falls back to using __trunc__. > > - Rational (also inherited by Fraction): return x.denominator == 1 as > Rational requires that all numbers must be represented in lowest form. > > - Integral: return True > > - Decimal: expose the existing dec_mpd_isinteger C function to Python as > is_integer() > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > guido%40python.org > > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From storchaka at gmail.com Wed Mar 21 12:13:03 2018 From: storchaka at gmail.com (Serhiy Storchaka) Date: Wed, 21 Mar 2018 18:13:03 +0200 Subject: [Python-Dev] What is the purpose of NEXT_BLOCK()? In-Reply-To: References: Message-ID: 21.03.18 17:13, Guido van Rossum ????: > Maybe spelunking in the Python 2 branch will help? 
It seems it was
> introduced in 2005 by Jeremy Hylton with this comment:
>
> /* The distinction between NEW_BLOCK and NEXT_BLOCK is subtle.  (I'd
>    like to find better names.)  NEW_BLOCK() creates a new block and sets
>    it as the current block.  NEXT_BLOCK() also creates an implicit jump
>    from the current block to the new block.
> */
>
> That comment (and NEW_BLOCK()) are no longer found in the Python 3 source.

It explains what NEXT_BLOCK() does, but not why it was needed to do this.

NEW_BLOCK() was never used at all.

From steve at holdenweb.com Wed Mar 21 12:22:53 2018
From: steve at holdenweb.com (Steve Holden)
Date: Wed, 21 Mar 2018 16:22:53 +0000
Subject: [Python-Dev] Deprecating float.is_integer()
In-Reply-To:
References:
Message-ID:

On Wed, Mar 21, 2018 at 3:08 PM, Guido van Rossum wrote:

> I searched 6M LoC of Python code at Dropbox and found only three uses.
> They seem legit. Two are about formatting a number that's given as a float,
> deciding whether to print a float as 42 or 3.14. The third is attempting a
> conversion from float to integer where a non-integer must raise a specific
> exception (the same function also supports a string as long as it can be
> parsed as an int).
>
> I don't doubt we would get by if is_integer() was deprecated.
>

Since code that's been deleted can't have bugs, +1.
From njs at pobox.com Wed Mar 21 12:46:06 2018
From: njs at pobox.com (Nathaniel Smith)
Date: Wed, 21 Mar 2018 09:46:06 -0700
Subject: [Python-Dev] Symmetry arguments for API expansion
In-Reply-To: <20180321123830.GR16661@ando.pearwood.info>
References: <5AA84F73.3030409@canterbury.ac.nz> <20180321043429.GP16661@ando.pearwood.info> <20180321123830.GR16661@ando.pearwood.info>
Message-ID:

On Mar 21, 2018 05:40, "Steven D'Aprano" wrote:

I don't want to change the behaviour of pow(), but we shouldn't dismiss the possibility of some other numeric function wanting to treat values N.0 and N the same. Let's say, an is_prime(x) function that supports floats as well as ints:

is_prime(3.0)  # True
is_prime(3.00001)  # False


For me this is an argument against is_integer() rather than for it :-). is_prime(float) should *obviously*[1] be a TypeError. Primality is only meaningfully defined over the domain of integers, and this is a case where operator.index is exactly what you want.

Of course it's just an example, and perhaps there are other, better examples. But it makes me nervous that this is the best example you could quickly come up with.

-n

[1] Warning: I am not Dutch.
From rob at sixty-north.com Wed Mar 21 12:52:20 2018
From: rob at sixty-north.com (Robert Smallshire)
Date: Wed, 21 Mar 2018 17:52:20 +0100
Subject: [Python-Dev] Deprecating float.is_integer()
In-Reply-To:
References:
Message-ID:

Here's an excerpted (and slightly simplified for consumption here) usage of float.is_integer() from the top of a function which does some convolution/filtering in a geophysics application.
I've mostly seen it used in guard clauses in this way to reject either illegal numeric arguments directly, or particular combinations of arguments as in this case:

def filter_convolve(x, y, xf, yf, stride=1, padding=1):
    x_out = (x - xf + 2*padding) / stride + 1
    y_out = (y - yf + 2*padding) / stride + 1

    if not (x_out.is_integer() and y_out.is_integer()):
        raise ValueError("Invalid convolution filter_convolve({x}, {y}, {xf}, {yf}, {stride}, {padding})"
                         .format(x=x, y=y, xf=xf, yf=yf, stride=stride, padding=padding))
    x_out = int(x_out)
    y_out = int(y_out)

    # ...

Of course, there are other ways to do this check, but the approach here is obvious and easy to comprehend.
From dickinsm at gmail.com Wed Mar 21 14:07:40 2018
From: dickinsm at gmail.com (Mark Dickinson)
Date: Wed, 21 Mar 2018 18:07:40 +0000
Subject: [Python-Dev] Deprecating float.is_integer()
In-Reply-To:
References:
Message-ID:

I'd prefer to see `float.is_integer` stay. There _are_ occasions when one wants to check that a floating-point number is integral, and on those occasions, using `x.is_integer()` is the one obvious way to do it. I don't think the fact that it can be misused should be grounds for deprecation.

As far as real uses: I didn't find uses of `is_integer` in our code base here at Enthought, but I did find plenty of places where it _could_ reasonably have been used, and where something less readable like `x % 1 == 0` was being used instead. For evidence that it's generally useful: it's already been noted that the decimal module uses it internally. The mpmath package defines its own "isint" function and uses it in several places: see https://github.com/fredrik-johansson/mpmath/blob/2858b1000ffdd8596defb50381dcb83de2bcccc6/mpmath/ctx_mp_python.py#L764. MPFR also has an mpfr_integer_p predicate: http://www.mpfr.org/mpfr-current/mpfr.html#index-mpfr_005finteger_005fp.
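
As a minimal sketch of the guard-clause use described in this thread (not code taken from any of the projects mentioned), the check and the conversion can be wrapped in one small helper. The name `as_exact_int` is hypothetical; only `float.is_integer()` is existing API.

```python
def as_exact_int(x):
    """Return x as an int, or raise ValueError if x is not a whole number.

    Hypothetical helper for illustration only.
    """
    if isinstance(x, int):
        return x
    if isinstance(x, float) and x.is_integer():
        # is_integer() is False for inf and nan, so int(x) cannot raise here.
        return int(x)
    raise ValueError("expected a whole number, got {!r}".format(x))


print(as_exact_int(42.0))   # a float holding a whole number
print((3.5).is_integer())   # a float with a fractional part
```

Because infinities and NaNs fail the `is_integer()` test, they take the same ValueError path as ordinary non-integers, so the caller sees a single exception type.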

A concrete use-case: suppose you wanted to implement the beta function (https://en.wikipedia.org/wiki/Beta_function) for real arguments in Python. You'll likely need special handling for the poles, which occur only for some negative integer arguments, so you'll need an is_integer test for those. For small positive integer arguments, you may well want the accuracy advantage that arises from computing the beta function in terms of factorials (giving a correctly-rounded result) instead of via the log of the gamma function. So again, you'll want an is_integer test to identify those cases. (Oddly enough, I found myself looking at this recently as a result of the thread about quartile definitions: there are links between the beta function, the beta distribution, and order statistics, and the (k-1/3)/(n+1/3) expression used in the recommended quartile definition comes from an approximation to the median of a beta distribution with integral parameters.)

Or, you could look at the SciPy implementation of the beta function, which does indeed do the C equivalent of is_integer in many places: https://github.com/scipy/scipy/blob/11509c4a98edded6c59423ac44ca1b7f28fba1fd/scipy/special/cephes/beta.c#L67

In sum: it's an occasionally useful operation; there's no other obvious, readable spelling of the operation that does the right thing in all cases, and it's _already_ in Python! In general, I'd think that deprecation of an existing construct should not be done lightly, and should only be done when there's an obvious and significant benefit to that deprecation. I don't see that benefit here.

-- 
Mark
From robertomartinezp at gmail.com Wed Mar 21 14:09:28 2018
From: robertomartinezp at gmail.com (Roberto Martínez)
Date: Wed, 21 Mar 2018 18:09:28 +0000
Subject: [Python-Dev] ThreadedProcessPoolExecutor
In-Reply-To:
References:
Message-ID:

On Wed, 21 Mar 2018 at 16:23, Guido van Rossum () wrote:

> Roberto,
>
> That looks like an interesting class. I presume you're intending to
> publish this as a pip package on PyPI.python.org?
>
>
Precisely.

> I'm no lawyer, but I believe you can license your code under a new license
> (I recommend BSD) as long as you keep a copy and a mention of the PSF
> license in your distribution as well. (Though perhaps you could structure
> your code differently and inherit from the standard library modules rather
> than copying them?)
>

I am using inheritance as much as I can. But because some functions live at the module level instead of being Executor methods (for the sake of being picklable, I suppose), I am forced to copy some of them just to modify a couple of lines.

> In terms of the package namespace, do not put it in the same namespace as
> standard library code! It probably won't work and will cause world-wide
> pain and suffering for the users of your code. Invent your project name and
> use that as a top-level namespace, like everyone else. :-)
>
>
Ok, I don't want to cause world-wide pain (yet).

Thank you!

Best regards,
Roberto

>
From mertz at gnosis.cx Wed Mar 21 14:14:06 2018
From: mertz at gnosis.cx (David Mertz)
Date: Wed, 21 Mar 2018 18:14:06 +0000
Subject: [Python-Dev] Deprecating float.is_integer()
In-Reply-To:
References:
Message-ID:

I've been using and teaching python for close to 20 years and I never noticed that x.is_integer() exists until this thread. I would say the "one obvious way" is less than obvious.

On the other hand, `x == int(x)` is genuinely obvious... and it immediately suggests the probably better `math.isclose(x, int(x))` that is what you usually mean.

On Wed, Mar 21, 2018, 2:08 PM Mark Dickinson wrote:

> I'd prefer to see `float.is_integer` stay.
There _are_ occasions when one > wants to check that a floating-point number is integral, and on those > occasions, using `x.is_integer()` is the one obvious way to do it. I don't > think the fact that it can be misused should be grounds for deprecation. > > As far as real uses: I didn't find uses of `is_integer` in our code base > here at Enthought, but I did find plenty of places where it _could_ > reasonably have been used, and where something less readable like `x % 1 == > 0` was being used instead. For evidence that it's generally useful: it's > already been noted that the decimal module uses it internally. The mpmath > package defines its own "isint" function and uses it in several places: see > https://github.com/fredrik-johansson/mpmath/blob/2858b1000ffdd8596defb50381dcb83de2bcccc6/mpmath/ctx_mp_python.py#L764. > MPFR also has an mpfr_integer_p predicate: > http://www.mpfr.org/mpfr-current/mpfr.html#index-mpfr_005finteger_005fp. > > A concrete use-case: suppose you wanted to implement the beta function ( > https://en.wikipedia.org/wiki/Beta_function) for real arguments in > Python. You'll likely need special handling for the poles, which occur only > for some negative integer arguments, so you'll need an is_integer test for > those. For small positive integer arguments, you may well want the accuracy > advantage that arises from computing the beta function in terms of > factorials (giving a correctly-rounded result) instead of via the log of > the gamma function. So again, you'll want an is_integer test to identify > those cases. (Oddly enough, I found myself looking at this recently as a > result of the thread about quartile definitions: there are links between > the beta function, the beta distribution, and order statistics, and the > (k-1/3)/(n+1/3) expression used in the recommended quartile definition > comes from an approximation to the median of a beta distribution with > integral parameters.) 
> > Or, you could look at the SciPy implementation of the beta function, which > does indeed do the C equivalent of is_integer in many places: > https://github.com/scipy/scipy/blob/11509c4a98edded6c59423ac44ca1b7f28fba1fd/scipy/special/cephes/beta.c#L67 > > In sum: it's an occasionally useful operation; there's no other obvious, > readable spelling of the operation that does the right thing in all cases, > and it's _already_ in Python! In general, I'd think that deprecation of an > existing construct should not be done lightly, and should only be done when > there's an obvious and significant benefit to that deprecation. I don't see > that benefit here. > > -- > Mark > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/mertz%40gnosis.cx > -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Wed Mar 21 14:59:47 2018 From: guido at python.org (Guido van Rossum) Date: Wed, 21 Mar 2018 11:59:47 -0700 Subject: [Python-Dev] Deprecating float.is_integer() In-Reply-To: References: Message-ID: On Wed, Mar 21, 2018 at 11:14 AM, David Mertz wrote: > I've been using and teaching python for close to 20 years and I never > noticed that x.is_integer() exists until this thread. I would say the "one > obvious way" is less than obvious. > > On the other hand, `x == int(x)` is genuinely obvious... and it > immediately suggests the probably better `math.isclose(x, int(x))` that is > what you usually mean. > We can argue about this forever, but I don't think I would have come up with that either when asked "how to test a float for being a whole number". I would probably have tried "x%1 == 0" which is terrible. 
I like to have an API that doesn't have the pitfalls of any of the "obvious" solutions that numerically naive people would come up with, and x.is_integer() is it. Let's keep it.

-- 
--Guido van Rossum (python.org/~guido)
From tim.peters at gmail.com Wed Mar 21 15:02:37 2018
From: tim.peters at gmail.com (Tim Peters)
Date: Wed, 21 Mar 2018 14:02:37 -0500
Subject: [Python-Dev] Deprecating float.is_integer()
In-Reply-To:
References:
Message-ID:

[David Mertz]
> I've been using and teaching python for close to 20 years and I never
> noticed that x.is_integer() exists until this thread.

Except it was impossible to notice across most of those years, because it didn't exist across most of those years ;-)

> I would say the "one obvious way" is less than obvious.

When it was introduced, it _became_ the one obvious way.

> On the other hand, `x == int(x)` is genuinely obvious..

But a bad approach: it can raise OverflowError (for infinite x); it can raise ValueError (for x a NaN); and can waste relative mountains of time creating huge integers, e.g.,

>>> int(1e306)
1000000000000000017216064596736454828831087825013238982328892017892380671244575047987920451875459594568606138861698291060311049225532948520696938805711440650122628514669428460356992624968028329550689224175284346730060716088829214255439694630119794546505512415617982143262670862918816362862119154749127262208

In Python 2, x == math.floor(x) was much better on the latter count, but not in Python 3 (math.floor used to return a float, but returns an int now).

As to Serhiy's `not x % 1.0`, after 5 minutes I gave up trying to prove it's always correct. Besides infinities and NaNs, there's also that Python's float mod can be surprising:

>>> (-1e-20) % 1.0
1.0

There isn't a "clean" mathematical definition of what Python's float % does, which is why proof is strained.
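
The failure modes listed above are easy to reproduce. The following is only an illustrative sketch; the helper name `naive_is_whole` is made up here and is not proposed API.

```python
def naive_is_whole(x):
    # The "obvious" spelling; for some floats it raises instead of answering.
    return x == int(x)


for value in (float("inf"), float("nan")):
    try:
        naive_is_whole(value)
    except (OverflowError, ValueError) as exc:
        print(type(exc).__name__, "for", value)
    # The float method answers without raising.
    print(value.is_integer())

# Python's float % takes the sign of the divisor, so a tiny negative
# value produces a remainder that looks like a whole number.
tiny = -1e-20
print(tiny // 1.0, tiny % 1.0)
```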
In general, the "natural" result is patched when and if needed to maintain that

    x == y*(x//y) + x%y

is approximately true. The odd % result above is a consequence of that, and that (-1e-20) // 1.0 is inarguably -1.0.

> and it immediately suggests the probably better `math.isclose(x, int(x))` that
> is what you usually mean.

Even in some of the poor cases Serhiy found, that wouldn't be a lick better. For example, math.isclose(x/5, int(x/5)) is still a plain wrong way to check whether x is divisible by 5.

>>> x = 1e306
>>> math.isclose(x/5, int(x/5))
True
>>> x/5 == int(x/5)
True
>>> int(x) % 5
3

The problem there isn't how "is it an integer?" is spelled, it's that _any_ way of spelling "is it an integer?" doesn't answer the question they're trying to answer. They're just plain confused about how floating point works. The use of `.is_integer()` (however spelled!) isn't the cause of that, it's a symptom.
From mertz at gnosis.cx Wed Mar 21 16:49:51 2018
From: mertz at gnosis.cx (David Mertz)
Date: Wed, 21 Mar 2018 16:49:51 -0400
Subject: [Python-Dev] Deprecating float.is_integer()
In-Reply-To:
References:
Message-ID:

On Wed, Mar 21, 2018 at 3:02 PM, Tim Peters wrote:

> [David Mertz]
> > I've been using and teaching python for close to 20 years and I never
> > noticed that x.is_integer() exists until this thread.
>
> Except it was impossible to notice across most of those years, because
> it didn't exist across most of those years ;-)
>

That's probably some of the reason. I wasn't sure if someone used the time machine to stick it back into Python 1.4.

> > On the other hand, `x == int(x)` is genuinely obvious..
>
> But a bad approach: it can raise OverflowError (for infinite x); it
> can raise ValueError (for x a NaN);

These are the CORRECT answers! Infinity neither is nor is not an integer. Returning a boolean as an answer is bad behavior; I might argue about *which* exception is best, but False is not a good answer to `float('inf').is_integer()`.
Infinity is neither in the Reals nor in the Integers, but it's just as much the limit of either. Likewise Not-a-Number isn't any less an integer than it is a real number (approximated by a floating point number). It's NOT a number, which is just as much not an integer. > and can waste relative mountains > of time creating huge integers, e.g., > True enough. But it's hard to see where that should matter. No floating point number on the order of 1e306 is sufficiently precise as to be an integer in any meaningful sense. If you are doing number theory with integers of that size (or larger is perfectly fine too) the actual test is `isinstance(x, int)`. Using a float is just simply wrong for the task to start with, whether or not those bits happen to represent something Integral... the only case where you should see this is "measuring/estimating something VERY big, very approximately." For example, this can be true (even without reaching inf): >>> x.is_integer() True >>> (math.sqrt(x**2)).is_integer() False The problem there isn't how "is it an integer?" is spelled, it's that > _any_ way of spelling "is it an integer?" doesn't answer the question > they're trying to answer. They're just plain confused about how > floating point works. The use of `.is_integer()` (however spelled!) > isn't the cause of that, it's a symptom. > Agreed! -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From dickinsm at gmail.com Wed Mar 21 18:53:09 2018 From: dickinsm at gmail.com (Mark Dickinson) Date: Wed, 21 Mar 2018 22:53:09 +0000 Subject: [Python-Dev] Deprecating float.is_integer() In-Reply-To: References: Message-ID: On Wed, Mar 21, 2018 at 8:49 PM, David Mertz wrote: > For example, this can be true (even without reaching inf): > > >>> x.is_integer() > True > >>> (math.sqrt(x**2)).is_integer() > False > If you have a moment to share it, I'd be interested to know what value of `x` you used to achieve this, and what system you were on. This can't happen under IEEE 754 arithmetic. -- Mark -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Wed Mar 21 19:02:49 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 22 Mar 2018 10:02:49 +1100 Subject: [Python-Dev] Symmetry arguments for API expansion In-Reply-To: References: <5AA84F73.3030409@canterbury.ac.nz> <20180321043429.GP16661@ando.pearwood.info> <20180321123830.GR16661@ando.pearwood.info> Message-ID: <20180321230249.GS16661@ando.pearwood.info> On Wed, Mar 21, 2018 at 09:46:06AM -0700, Nathaniel Smith wrote: [...] > For me this is an argument against is_integer() rather than for it :-). > is_prime(float) should *obviously*[1] be a TypeError. Primality is only > meaningfully defined over the domain of integers And 3.0 is an integer. Just because it is float *object* does not mean it is not an integer *value*. Do not mistake the leaky abstraction of multiple numeric types for the mathematical number three. Primality-related functions are not limited to integers. For example, the prime counting function is defined on the reals: https://en.wikipedia.org/wiki/Prime-counting_function and there's no reason not to extend the domain of is_prime to any real. "Practicality beats purity" -- why should the result be different just because the input has a ".0" at the end? Mathematically it doesn't: the answer to something like "Is 3.0 a prime?" 
is a clear Yes, not "I'm sorry, I don't understand the question!" which an exception would imply.

As programmers, we always face a tension between the leaky abstraction of our numeric types and the mathematical equality of:

3 == 3.0 == 9/3 == 3+0j

etc. The decision on whether to be more or less restrictive on the *types* a function accepts is up to the individual developer. Having decided to be *less* restrictive, an is_integer method would be useful.

For what it's worth, Wolfram|Alpha gives inconsistent results. It allows testing of rationals for primality:

"Is 9/3 a prime?"

evaluates as true, but:

"Is 3.0 a prime?"

gets parsed as "Is 3 a prime number?" and yet evaluates as false. A clear bug for software using a natural-language interface and intended to be used by students and non-mathematicians.

> and this is a case where
> operator.index is exactly what you want.

It is exactly not what I want.

> Of course it's just an example, and perhaps there are other, better
> examples. But it makes me nervous that this is the best example you could
> quickly come up with.

I actually had to work hard to come up with an example as simple and understandable as primality testing. The first example I thought of was Bessel functions of the 1st and 2nd kind with arbitrary real-valued orders, where you *absolutely* do want order 3.0 (float) and order 3 (int) to be precisely the same.

But I avoided giving it because I thought it would be too technical and it would intimidate people. I thought that the prime number example would be easier to understand.

Next time I want to make a point, I'll go for argument by intimidation.

*wink*


-- 
Steve
From tim.peters at gmail.com Wed Mar 21 19:43:04 2018
From: tim.peters at gmail.com (Tim Peters)
Date: Wed, 21 Mar 2018 18:43:04 -0500
Subject: [Python-Dev] Deprecating float.is_integer()
In-Reply-To:
References:
Message-ID:

Note: this is a top-posted essay much more about floating-point philosophy than about details.
Details follow from the philosophy, and if philosophies don't match the desired details will never match either.

Understanding floating point requires accepting that they're a funky subset of rational numbers, augmented with some oddballs (NaNs, "infinities", minus zero). At best the reals are a vague inspiration, and floats have their own terminology serving their actual nature. Thinking about reals instead is often unhelpful.

For example, it's bog standard terminology to call all IEEE-754 values that aren't infinities or NaNs "finite". Which, by no coincidence, is how Python's math.isfinite() discriminates. Within the finites - which are all rational numbers - the distinction between integers and non-integers is obvious, but only after you're aware of it and give it some thought. Which most people aren't and don't - but that's no reason to prevent the rest of us from getting work done ;-)

This isn't anything new in Python - it's as old as floating-point. For example, look up C's ancient "modf" function (which breaks a float/double into its "integer" and "fractional" parts, and treats all finite floats of sufficiently large magnitude as having fractional parts of 0.0 - because they are in fact exact integers).

The idea that floats are "just approximations - so all kinds of slop is acceptable and all kinds of fear inescapable" went out of style when IEEE-754 was introduced. That standard codified an alternative view: that functions on floats should behave as if their inputs were _exactly_ correct, and - given that - produce the closest representable value to the infinitely precise result. That proved to be extremely valuable in practice, allowing the development of shorter, faster, more robust, and more accurate numerical algorithms.
The trend ever since has been to do more & more along those lines, from trig functions doing argument reduction as if pi were represented with infinite precision, to adding single-rounding dot product primitives (all again acting as if all the inputs were exactly correct). Since that approach has been highly productive in real life, it's the one I favor. Arguments like "no floating point number on the order of 1e306 is sufficiently precise as to be an integer in any meaningful sense" don't even start to get off the ground in that approach. Maybe in 1970 ;-) You can have no actual idea of whether 1e306 is exactly right or off by a factor of a million just from staring at it, and real progress has been made by assuming all values are exactly what they appear to be, then acting accordingly. If you want to model that some values are uncertain, that's fine, but then you need something like interval arithmetic instead. >From that fundamental "take floats exactly at face value" view, what .is_integer() should do for floats is utterly obvious: there is no possible argument about whether a given IEEE-754 float is or is not an integer, provided you're thinking about IEEE-754 floats (and not, e.g., about mathematical reals), and making even a tiny attempt to honor the spirit of the IEEE-754 standard. Whether that's _useful_ to you depends on the application you're writing at the time. The advantage of the philosophy is that it often gives clear guidance about what implementations "should do" regardless, and following that guidance has repeatedly proved to be a boon to those writing numerical methods. And, yes, also a pain in the ass ;-) --- nothing new below --- On Wed, Mar 21, 2018 at 3:49 PM, David Mertz wrote: > On Wed, Mar 21, 2018 at 3:02 PM, Tim Peters wrote: >> >> [David Mertz] >> > I've been using and teaching python for close to 20 years and I never >> > noticed that x.is_integer() exists until this thread. 
>> >> Except it was impossible to notice across most of those years, because >> it didn't exist across most of those years ;-) > > > That's probably some of the reason. I wasn't sure if someone used the time > machine to stick it back into Python 1.4. > >> >> > On the other hand, `x == int(x)` is genuinely obvious.. >> >> But a bad approach: it can raise OverflowError (for infinite x); it >> can raise ValueError (for x a NaN); > > > These are the CORRECT answers! Infinity neither is nor is not an integer. > Returning a boolean as an answer is bad behavior; I might argue about > *which* exception is best, but False is not a good answer to > `float('inf').is_integer()`. Infinity is neither in the Reals nor in the > Integers, but it's just as much the limit of either. > > Likewise Not-a-Number isn't any less an integer than it is a real number > (approximated by a floating point number). It's NOT a number, which is just > as much not an integer. > >> >> and can waste relative mountains >> of time creating huge integers, e.g., > > > True enough. But it's hard to see where that should matter. No floating > point number on the order of 1e306 is sufficiently precise as to be an > integer in any meaningful sense. If you are doing number theory with > integers of that size (or larger is perfectly fine too) the actual test is > `isinstance(x, int)`. Using a float is just simply wrong for the task to > start with, whether or not those bits happen to represent something > Integral... the only case where you should see this is "measuring/estimating > something VERY big, very approximately." > > For example, this can be true (even without reaching inf): > >>>> x.is_integer() > True >>>> (math.sqrt(x**2)).is_integer() > False > >> The problem there isn't how "is it an integer?" is spelled, it's that >> _any_ way of spelling "is it an integer?" doesn't answer the question >> they're trying to answer. They're just plain confused about how >> floating point works. 
The use of `.is_integer()` (however spelled!) >> isn't the cause of that, it's a symptom. > > > Agreed! > > -- > Keeping medicines from the bloodstreams of the sick; food > from the bellies of the hungry; books from the hands of the > uneducated; technology from the underdeveloped; and putting > advocates of freedom in prisons. Intellectual property is > to the 21st century what the slave trade was to the 16th. From tim.peters at gmail.com Wed Mar 21 21:11:03 2018 From: tim.peters at gmail.com (Tim Peters) Date: Wed, 21 Mar 2018 20:11:03 -0500 Subject: [Python-Dev] Deprecating float.is_integer() In-Reply-To: References: Message-ID: [David Mertz ] >> For example, this can be true (even without reaching inf): >> >> >>> x.is_integer() >> True >> >>> (math.sqrt(x**2)).is_integer() >> False [Mark Dickinson ] > If you have a moment to share it, I'd be interested to know what value of > `x` you used to achieve this, and what system you were on. This can't happen > under IEEE 754 arithmetic. I expect it might happen under one of the directed rounding modes (like "to +infinity"). 

But under 754 binary round-nearest/even arithmetic, it's been formally proved that sqrt(x*x) == x exactly for all non-negative finite x such that x*x neither overflows nor underflows (and .is_integer() has nothing to do with that very strong result):

https://hal.inria.fr/hal-01148409/document

OTOH, the paper notes that it's not necessarily true for IEEE decimal arithmetic; e.g.,

>>> import decimal
>>> decimal.getcontext().prec = 4
>>> (decimal.Decimal("31.66") ** 2).sqrt() # result is 1 ulp smaller
Decimal('31.65')

>>> decimal.getcontext().prec = 5
>>> (decimal.Decimal("31.660") ** 2).sqrt() # result is 1 ulp larger
Decimal('31.661')

From jeanpierreda at gmail.com Wed Mar 21 21:17:58 2018
From: jeanpierreda at gmail.com (Devin Jeanpierre)
Date: Wed, 21 Mar 2018 18:17:58 -0700
Subject: [Python-Dev] Deprecating float.is_integer()
In-Reply-To:
References:
Message-ID:

> [Mark Dickinson ]
>> If you have a moment to share it, I'd be interested to know what value of
>> `x` you used to achieve this, and what system you were on. This can't happen
>> under IEEE 754 arithmetic.
>
> I expect it might happen under one of the directed rounding modes
> (like "to +infinity").

PyPy (5.8):
>>>> x = 1e300
>>>> x.is_integer()
True
>>>> math.sqrt(x**2).is_integer()
False
>>>> x**2
inf

(It gives an OverflowError on my CPython installs.)

I believe this is allowed, and Python is not required to raise OverflowError here: https://docs.python.org/3.6/library/exceptions.html#OverflowError says:

> for historical reasons, OverflowError is sometimes raised for integers that are outside a required range.
Because of the lack of standardization of floating point exception handling in C, most floating point operations are not checked -- Devin From chris.barker at noaa.gov Wed Mar 21 21:27:25 2018 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 22 Mar 2018 01:27:25 +0000 Subject: [Python-Dev] Deprecating float.is_integer() In-Reply-To: References: Message-ID: On Wed, Mar 21, 2018 at 11:43 PM, Tim Peters wrote: > Note: this is a top-posted essay much more about floating-point > philosophy than about details. Details follow from the philosophy, > and if philosophies don't match the desired details will never match > either. > but of course :-) > From that fundamental "take floats exactly at face value" view, what > .is_integer() should do for floats is utterly obvious: sure -- but I don't think anyone is arguing that -- the question is whether the function should exist -- and that means not "how should it work?" or "is it clearly and appropriately defined?" but rather, "is it the "right" thing to do in most cases, when deployed by folks that haven't thought deeply about floating point. > Whether that's _useful_ to you depends on the application you're > writing at the time. exactly. I think pretty much all the real world code that's been shown here for using .is_integer() is really about type errors (issues). The function at hand really wants integer inputs -- but wants to allow the user to be sloppy and provide a float type that happens to be an int. Given Python's duck-typing nature, maybe that's a good thing? I know I really discourage dynamic type checking.... Also, every example has been for small-ish integers -- exponents, factorials, etc -- not order 1e300 -- or inf or NaN, etc. Finally, the use-cases where the value that happens-to-be-an-int is computed via floating point -- .is_integer() is probably the wrong check -- you probably want isclose(). 
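
To make the two checks being compared here concrete, the following is a minimal sketch, not code from anyone in the thread. `is_near_integer` is a hypothetical helper name, and its tolerances are just the `math.isclose()` defaults; which tolerance is right depends on the application. It shows a computed value that is one rounding error away from an integer: the exact test `.is_integer()` rejects it, while a tolerance-based test accepts it.

```python
import math


def is_near_integer(x, *, rel_tol=1e-9, abs_tol=0.0):
    """Hypothetical helper: is x within tolerance of the nearest integer?

    Note that with abs_tol=0.0 a tiny non-zero x is never "close" to 0,
    which is how math.isclose() itself behaves.
    """
    if not math.isfinite(x):
        return False  # round() would raise for inf and NaN
    return math.isclose(x, round(x), rel_tol=rel_tol, abs_tol=abs_tol)


# Ten additions of 0.1 accumulate rounding error and miss 1.0 slightly.
total = sum([0.1] * 10)

print(total.is_integer())      # exact test on the stored value
print(is_near_integer(total))  # tolerant test
print((3.0).is_integer())      # an exactly integral float
```

The exact test answers "is this stored value an integer?", while the tolerant one answers "could this be an integer, give or take rounding error?". These are different questions, and which one applies is a property of the use case rather than of the type.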

The other use-cases: floor(), ceil() and round() all produce actual integers -- so no need for it there anymore.

All this points to: we don't need .is_integer

All that being said -- the standard for deprecation is a much higher bar than not-adding-it-in-the-first-place.

-CHB

-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From tim.peters at gmail.com Wed Mar 21 21:30:18 2018
From: tim.peters at gmail.com (Tim Peters)
Date: Wed, 21 Mar 2018 20:30:18 -0500
Subject: [Python-Dev] Deprecating float.is_integer()
In-Reply-To:
References:
Message-ID:

[Devin Jeanpierre ]
> PyPy (5.8):
> >>>> x = 1e300
> >>>> x.is_integer()
> True
> >>>> math.sqrt(x**2).is_integer()
> False
> >>>> x**2
> inf

I think you missed that David said "even without reaching inf" (you did reach inf), and that I said "such that x*x neither overflows nor underflows". Those are technical words related to IEEE-754: your x*x sets the IEEE overflow flag, although CPython may or may not raise the Python OverflowError exception.

>
> (It gives an OverflowError on my CPython installs.)
>
> I believe this is allowed, and Python is not required to raise
> OverflowError here:
> https://docs.python.org/3.6/library/exceptions.html#OverflowError
> says:
>
>> for historical reasons, OverflowError is sometimes raised for integers that are outside a required range. Because of the lack of standardization of floating point exception handling in C, most floating point operations are not checked

You can avoid the OverflowError (but not the IEEE overflow condition!)
under CPython by multiplying instead:

>>> x = 1e300
>>> x*x
inf

From chris.barker at noaa.gov Wed Mar 21 21:45:08 2018
From: chris.barker at noaa.gov (Chris Barker)
Date: Thu, 22 Mar 2018 01:45:08 +0000
Subject: [Python-Dev] Symmetry arguments for API expansion
In-Reply-To: <20180321123830.GR16661@ando.pearwood.info>
References: <5AA84F73.3030409@canterbury.ac.nz> <20180321043429.GP16661@ando.pearwood.info> <20180321123830.GR16661@ando.pearwood.info>
Message-ID:

On Wed, Mar 21, 2018 at 12:38 PM, Steven D'Aprano wrote:

> In fact, Serhiy's suggestion is not correct when x is not a float:
>

This is a key point -- see some of Tim's posts -- like it or not, you probably need to know when you are working with a float and when you aren't -- and what the implications of that are.

I think the key point is this:

where does the logic of whether a given object represents an integer belong?

At first glance, it clearly belongs with the type -- floats know how they are represented, as do fractions and decimals -- they can determine it in an efficient and clearly defined way.

However, my argument is that while an integer-valued float is clearly defined in the binary representation, what "is this an integer?" means is actually use-case dependent, and thus the user should be deciding how to do it (i.e. with isclose() or int(x) == x or ....)

> > If the exponent is a computed float, then you
> > really don't want a different result depending on whether the computed
> > value is exactly an integer or one ULP off.
>
> I don't think you actually mean to say that. I'm pretty sure that we
> *do* want different results if the exponent differs from an integer by
> one ULP.

yes -- poorly worded -- I mean you want the slightly different result, not a different type and algorithm - i.e. continuity if you had a range of slightly less than an integer to slightly more than an integer.

> > The user should check/convert to an integer with a method appropriate to
> > the problem at hand.
> > Oh, you mean something like x.is_integer()? I agree! > > *wink* > That's the point -- is_integer may or may not be appropriate, and whether it is is a function of use-case, not type. > If it wasn?t too heavyweight, it might be nice to have some sort of flag > on > > floats indicating whether they really ARE an integer, rather than happen > to > > be: > > > > -Created from an integer literal > > - created from an integer object > > - result of floor(), ceil() or round() > > I don't understand this. > poorly worked again -- I shoud not write these on a phone.... > You seem to be saying that none of the following are "really" integer > valued: > > float(10) > floor(10.1) > ceil(10.1) > round(10.1) > I meant those are the ones that ARE really integer valued. turns out that py3 returns an actual int for all of those other than float() (of course) anyway, so that's pretty much been done -- and no need for is_integer() well, it's been fun, but looks like it's sticking around..... -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Wed Mar 21 21:48:09 2018 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 22 Mar 2018 01:48:09 +0000 Subject: [Python-Dev] Symmetry arguments for API expansion In-Reply-To: References: <46E16295-E5B0-4808-B3AC-34241DE50926@gmail.com> <5AA84F73.3030409@canterbury.ac.nz> <20180321043429.GP16661@ando.pearwood.info> Message-ID: On Wed, Mar 21, 2018 at 4:12 PM, Guido van Rossum wrote: > Thank you! As you may or may not have noticed in a different thread, we're > going through a small existential crisis regarding the usefulness of > is_integer() -- Serhiy believes it is not useful (and even an attractive > nuisance) and should be deprecated. 
OTOH the existence of > dec_mpd_isinteger() seems to validate to me that it actually exposes useful > functionality (and every Python feature can be abused, so that alone should > not be a strong argument for deprecation). > if not deprecated, then do we add it to all the other numeric types? Which was the original suggestion, yes? -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From mertz at gnosis.cx Wed Mar 21 21:15:20 2018 From: mertz at gnosis.cx (David Mertz) Date: Thu, 22 Mar 2018 01:15:20 +0000 Subject: [Python-Dev] Deprecating float.is_integer() In-Reply-To: References: Message-ID: Ok. I'm wrong on that example. On Wed, Mar 21, 2018, 9:11 PM Tim Peters wrote: > [David Mertz ] > >> For example, this can be true (even without reaching inf): > >> > >> >>> x.is_integer() > >> True > >> >>> (math.sqrt(x**2)).is_integer() > >> False > > [Mark Dickinson ] > > If you have a moment to share it, I'd be interested to know what value of > > `x` you used to achieve this, and what system you were on. This can't > happen > > under IEEE 754 arithmetic. > > I expect it might happen under one of the directed rounding modes > (like "to +infinity"). 
> > But under 754 binary round-nearest/even arithmetic, it's been formally > proved that sqrt(x*x) == x exactly for all non-negative finite x such > that x*x neither overflows nor underflows (and .as_integer() has > nothing to do with that very strong result): > > https://hal.inria.fr/hal-01148409/document > > OTOH, the paper notes that it's not necessarily true for IEEE decimal > arithmetic; e.g., > > >>> import decimal > >>> decimal.getcontext().prec = 4 > >>> (decimal.Decimal("31.66") ** 2).sqrt() # result is 1 ulp smaller > Decimal('31.65') > > >>> decimal.getcontext().prec = 5 > >>> (decimal.Decimal("31.660") ** 2).sqrt() # result is 1 ulp larger > Decimal('31.661') > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Wed Mar 21 22:29:16 2018 From: tim.peters at gmail.com (Tim Peters) Date: Wed, 21 Mar 2018 21:29:16 -0500 Subject: [Python-Dev] Deprecating float.is_integer() In-Reply-To: References: Message-ID: [Chris Barker ] > ... > ... "is it the "right" thing to do in most cases, when deployed by folks > that haven't thought deeply about floating point. Gimme a break ;-) Even people who _believe_ they've thought about floating point still litter the bug tracker with >>> .1 + .2 0.30000000000000004 "bug reports". .is_integer() is easy to explain compared to that - and you have to go out of your way to use it. > ... > I think pretty much all the real world code that's been shown here for using > .is_integer() is really about type errors (issues). The function at hand > really wants integer inputs -- but wants to allow the user to be sloppy and > provide a float type that happens to be an int. Given Python's duck-typing > nature, maybe that's a good thing? I know I really discourage dynamic type > checking.... So you identified a use case. 
One you don't approve of (nor do I), but not strongly enough to demand they suffer instead ;-) > Also, every example has been for small-ish integers -- exponents, > factorials, etc -- not order 1e300 -- or inf or NaN, etc. > > Finally, the use-cases where the value that happens-to-be-an-int is computed > via floating point -- .is_integer() is probably the wrong check -- you > probably want isclose(). Everyone who has implemented a production math library can recall cases where the functionality was needed. Here, that includes at least Stefan Krah and me. You could also follow the link from Mark Dickinson to SciPy's implementation of the beta function. In every case I've needed the functionality, isclose() would have been utterly useless. Behold: >>> (-1.0) ** 3.0 -1.0 >>> (-1.0) ** 3.000000000001 # different result _type_ (-1-3.142007854859299e-12j) >>> math.isclose(3.0, 3.000000000001) True And another showing that the same functionality is needed regardless of how large the power: >>> (-1.0) ** 1e300 # an even integer power 1.0 When implementing an externally defined standard, when it says "and if such-and-such is an integer ...", it _means_ exactly an integer, not "or a few ULP away from an integer". IEEE pow()-like functions bristle with special cases for integers. >>> (-math.inf) ** 3.1 inf >>> (-math.inf) ** 3.0 # note: this one has a negative result (odd integer power) -inf >>> (-math.inf) ** 2.9 inf > ... > All this points to: we don't need .is_integer I'll grant that you don't think you need it. So don't use it ;-) > All the being said -- the standard for depreciation is much higher bar than > not-adding-it-in-the-first-place. I would not have added it as a method to begin with - but I agree with Guido that it doesn't reach the bar for deprecation. The only examples of "bad" uses we saw were from people still so naive about floating-point behavior that they'll easily fall into other ways to get it wrong. 
What we haven't seen: a single person here saying "you know, I think _I'd_ be seduced into misusing it!". It's not _inherently_ confusing at all. From guido at python.org Wed Mar 21 23:31:31 2018 From: guido at python.org (Guido van Rossum) Date: Wed, 21 Mar 2018 20:31:31 -0700 Subject: [Python-Dev] Symmetry arguments for API expansion In-Reply-To: References: <46E16295-E5B0-4808-B3AC-34241DE50926@gmail.com> <5AA84F73.3030409@canterbury.ac.nz> <20180321043429.GP16661@ando.pearwood.info> Message-ID: On Wed, Mar 21, 2018 at 6:48 PM, Chris Barker wrote: > On Wed, Mar 21, 2018 at 4:12 PM, Guido van Rossum > wrote: > >> Thank you! As you may or may not have noticed in a different thread, >> we're going through a small existential crisis regarding the usefulness of >> is_integer() -- Serhiy believes it is not useful (and even an attractive >> nuisance) and should be deprecated. OTOH the existence of >> dec_mpd_isinteger() seems to validate to me that it actually exposes useful >> functionality (and every Python feature can be abused, so that alone should >> not be a strong argument for deprecation). >> > > if not deprecated, then do we add it to all the other numeric types? Which > was the original suggestion, yes? > Yes. That's a pronouncement, so we can end this thread (and more importantly the other). -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Wed Mar 21 23:40:33 2018 From: guido at python.org (Guido van Rossum) Date: Wed, 21 Mar 2018 20:40:33 -0700 Subject: [Python-Dev] Deprecating float.is_integer() In-Reply-To: References: Message-ID: OK, we'll keep float.is_integer(), and that's a pronouncement, so that we can hopefully end this thread soon. It should also be added to int. After all that's what started this thread, with the observation that mypy and PEP 484 consider an int valid whenever a float is expected. 
Since PEP 484 and mypy frown upon the numeric tower I don't care too much about whether it's added it there (or to Decimal) but given that we're keeping float.is_integer() and adding int.is_integer(), I don't see what's gained by not adding it to the numeric tower and Decimal either. Numpy can do what it likes, it doesn't play by the same rules as the stdlib anyway. On Wed, Mar 21, 2018 at 7:29 PM, Tim Peters wrote: > [Chris Barker ] > > ... > > ... "is it the "right" thing to do in most cases, when deployed by folks > > that haven't thought deeply about floating point. > > Gimme a break ;-) Even people who _believe_ they've thought about > floating point still litter the bug tracker with > > >>> .1 + .2 > 0.30000000000000004 > > "bug reports". .is_integer() is easy to explain compared to that - > and you have to go out of your way to use it. > > > ... > > I think pretty much all the real world code that's been shown here for > using > > .is_integer() is really about type errors (issues). The function at hand > > really wants integer inputs -- but wants to allow the user to be sloppy > and > > provide a float type that happens to be an int. Given Python's > duck-typing > > nature, maybe that's a good thing? I know I really discourage dynamic > type > > checking.... > > So you identified a use case. One you don't approve of (nor do I), > but not strongly enough to demand they suffer instead ;-) > > > > Also, every example has been for small-ish integers -- exponents, > > factorials, etc -- not order 1e300 -- or inf or NaN, etc. > > > > Finally, the use-cases where the value that happens-to-be-an-int is > computed > > via floating point -- .is_integer() is probably the wrong check -- you > > probably want isclose(). > > Everyone who has implemented a production math library can recall > cases where the functionality was needed. Here, that includes at > least Stefan Krah and me. 
You could also follow the link from Mark > Dickinson to SciPy's implementation of the beta function. > > In every case I've needed the functionality, isclose() would have been > utterly useless. Behold: > > >>> (-1.0) ** 3.0 > -1.0 > >>> (-1.0) ** 3.000000000001 # different result _type_ > (-1-3.142007854859299e-12j) > >>> math.isclose(3.0, 3.000000000001) > True > > And another showing that the same functionality is needed regardless > of how large the power: > > >>> (-1.0) ** 1e300 # an even integer power > 1.0 > > When implementing an externally defined standard, when it says "and if > such-and-such is an integer ...", it _means_ exactly an integer, not > "or a few ULP away from an integer". IEEE pow()-like functions > bristle with special cases for integers. > > >>> (-math.inf) ** 3.1 > inf > >>> (-math.inf) ** 3.0 # note: this one has a negative result (odd integer > power) > -inf > >>> (-math.inf) ** 2.9 > inf > > > > ... > > All this points to: we don't need .is_integer > > I'll grant that you don't think you need it. So don't use it ;-) > > > > All the being said -- the standard for depreciation is much higher bar > than > > not-adding-it-in-the-first-place. > > I would not have added it as a method to begin with - but I agree with > Guido that it doesn't reach the bar for deprecation. The only > examples of "bad" uses we saw were from people still so naive about > floating-point behavior that they'll easily fall into other ways to > get it wrong. What we haven't seen: a single person here saying "you > know, I think _I'd_ be seduced into misusing it!". It's not > _inherently_ confusing at all. > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > guido%40python.org > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From greg.ewing at canterbury.ac.nz Thu Mar 22 02:16:47 2018 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 22 Mar 2018 19:16:47 +1300 Subject: [Python-Dev] Deprecating float.is_integer() In-Reply-To: References: Message-ID: <5AB34A4F.4010506@canterbury.ac.nz> Tim Peters wrote: > from trig functions doing argument reduction as if pi were represented > with infinite precision, That sounds like an interesting trick! Can you provide pointers to any literature describing how it's done? Not doubting it's possible, just curious. -- Greg From tim.peters at gmail.com Thu Mar 22 02:37:06 2018 From: tim.peters at gmail.com (Tim Peters) Date: Thu, 22 Mar 2018 01:37:06 -0500 Subject: [Python-Dev] Deprecating float.is_integer() In-Reply-To: <5AB34A4F.4010506@canterbury.ac.nz> References: <5AB34A4F.4010506@canterbury.ac.nz> Message-ID: [Tim] >> from trig functions doing argument reduction as if pi were represented >> with infinite precision, [Greg Ewing ] > That sounds like an interesting trick! Can you provide > pointers to any literature describing how it's done? > > Not doubting it's possible, just curious. As I recall, when it was first done a "lazy" routine produced as many bits of pi as a given argument required, doing gonzo arbitrary precision arithmetic. Later, computer-aided analysis based on continued fraction expansions identified the worst possible case across all IEEE doubles (& singles). For example, it's possible in reasonable time to find the IEEE double that comes closest to being an exact integer multiple of pi/4 (or whatever other range you want to reduce to). Then it's only necessary to precompute pi to as many bits as needed to handle the worst case. In practice, falling back to that is necessary only for "large" arguments, and the usual double-precision numeric tricks suffice for smaller arguments. 
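
To illustrate why so many bits of pi are needed, here is a rough sketch (not FDLIBM's algorithm, and not anyone's production code). It swaps in Python's `decimal` module with a 50-digit constant for pi; `PI_50` and `reduce_mod_2pi` are hypothetical names introduced only for this example. For x = 1e22 there are about 1.6e21 whole periods to strip off, so the roughly 2.4e-16 rounding error in the double `2 * math.pi` is magnified to a phase error of several hundred thousand radians in the naive reduction.

```python
import math
from decimal import Decimal, localcontext

# pi to 50 decimal places; a double carries only about 17 significant digits.
PI_50 = Decimal("3.14159265358979323846264338327950288419716939937510")


def reduce_mod_2pi(x):
    """Hypothetical helper: reduce a finite float x modulo 2*pi with 50-digit pi."""
    with localcontext() as ctx:
        ctx.prec = 60  # enough digits for the quotient and the remainder
        # Decimal(x) converts the float exactly; the remainder is then
        # computed against a 2*pi that is accurate to about 50 digits.
        return float(Decimal(x) % (2 * PI_50))


x = 1e22  # exactly representable as a double

careful = math.sin(reduce_mod_2pi(x))
naive = math.sin(math.fmod(x, 2 * math.pi))  # reduces by the *double* 2*pi

print(careful)  # agrees with the published value of sin(1e22) to double accuracy
print(naive)    # phase is off by far more than one period, so this is unrelated
```

Fifty digits is enough for this particular argument but not for every double; handling the worst case across all doubles is what the precomputed bits in a library such as FDLIBM are for.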
Search the web for "trig argument reduction" for whatever the state of the art may be today ;-)

For actual code, FDLIBM does "as if infinite precision" trig argument reduction, using a precomputed number of pi bits sufficient to handle the worst possible IEEE double case, and is available for free from NETLIB:

http://www.netlib.org/fdlibm/

The code is likely to be baffling, though, as there's scant explanation. Reading a paper or two first would be a huge help.

From rohadhik at gmail.com Thu Mar 22 02:27:31 2018
From: rohadhik at gmail.com (Rohit Adhikari)
Date: Thu, 22 Mar 2018 11:57:31 +0530
Subject: [Python-Dev] Do we have vlookup function
Message-ID:

Do we have vlookup function which can be used in dataframe same as used in excel.
Can you please provide the pointer for the same?

Thanks!!!

From stephane at wirtel.be Thu Mar 22 03:00:22 2018
From: stephane at wirtel.be (Stephane Wirtel)
Date: Thu, 22 Mar 2018 08:00:22 +0100
Subject: [Python-Dev] Do we have vlookup function
In-Reply-To:
References:
Message-ID:

This mailing list is for the development of CPython, not for end users; could you please move your question to python-list at python.org?

Thank you,

On 22/03/18 at 07:27, Rohit Adhikari wrote:
> Do we have vlookup function which can be used in dataframe same as used
> in excel.
> Can you please provide the pointer for the same?
>
> Thanks!!!
> > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/stephane%40wirtel.be > From kirillbalunov at gmail.com Thu Mar 22 05:51:14 2018 From: kirillbalunov at gmail.com (Kirill Balunov) Date: Thu, 22 Mar 2018 12:51:14 +0300 Subject: [Python-Dev] Deprecating float.is_integer() In-Reply-To: References: <5AB34A4F.4010506@canterbury.ac.nz> Message-ID: I apologize that I get into the discussion. Obviously in some situations it will be useful to check that a floating-point number is integral, but from the examples given it is clear that they are very rare. Why the variant with the inclusion of this functionality into the math module was not considered at all. If the answer is - consistency upon the numeric tower - will it go for complex type and what will it mean (there can be two point of views)? Is this functionality so often used and practical to be a method of float, int, ..., and not just to be an auxiliary function? p.s.: The same thoughts about `as_integer_ratio` discussion. With kind regards, -gdg -------------- next part -------------- An HTML attachment was scrubbed... URL: From rob at sixty-north.com Thu Mar 22 12:42:17 2018 From: rob at sixty-north.com (Robert Smallshire) Date: Thu, 22 Mar 2018 17:42:17 +0100 Subject: [Python-Dev] Deprecating float.is_integer() In-Reply-To: References: <5AB34A4F.4010506@canterbury.ac.nz> Message-ID: In the PR which implements is_integer() for int, the numeric tower, and Decimal I elected not to implement it for Complex or complex. This was principally because complex instances, even if they have an integral real value, are not convertible to int and it seems reasonable to me that any number for which is_integer() returns True should be convertible to int successfully, and without loss of information. 
    >>> int(complex(2, 0))
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: can't convert complex to int

There could be an argument that a putative complex.is_integral() should therefore return False, but I expect that would get even less support than the other suggestions in these threads.

*Robert Smallshire | *Managing Director
*Sixty North* | Applications | Consulting | Training
rob at sixty-north.com | T +47 63 01 04 44 | M +47 924 30 350
http://sixty-north.com

On 22 March 2018 at 10:51, Kirill Balunov wrote:

> I apologize that I get into the discussion. Obviously in some situations
> it will be useful to check that a floating-point number is integral, but
> from the examples given it is clear that they are very rare. Why the
> variant with the inclusion of this functionality into the math module was
> not considered at all. If the answer is - consistency upon the numeric
> tower - will it go for complex type and what will it mean (there can be two
> point of views)?
>
> Is this functionality so often used and practical to be a method of float,
> int, ..., and not just to be an auxiliary function?
>
> p.s.: The same thoughts about `as_integer_ratio` discussion.
>
> With kind regards,
> -gdg

From tim.peters at gmail.com  Thu Mar 22 12:47:11 2018
From: tim.peters at gmail.com (Tim Peters)
Date: Thu, 22 Mar 2018 11:47:11 -0500
Subject: [Python-Dev] Deprecating float.is_integer()
In-Reply-To:
References: <5AB34A4F.4010506@canterbury.ac.nz>
Message-ID:

[Kirill Balunov ]
> I apologize that I get into the discussion.
Obviously in some situations it > will be useful to check that a floating-point number is integral, but from > the examples given it is clear that they are very rare. Why the variant with > the inclusion of this functionality into the math module was not considered > at all. Nobody here really discussed the history, and I don't know. The questions here have been about what to do given that `is_integer` and `as_integer_ratio` are _already_ advertised (public) methods on some numeric types. > If the answer is - consistency upon the numeric tower - will it go > for complex type and what will it mean (there can be two point of views)? I haven't seen anyone suggest either method be added to Complex. There are lots of methods that don't show up in the tower before hitting Real. For example, given that Complex doesn't support __float__, it would be bizarre if it _did_ support as_integer_ratio. > Is this functionality so often used and practical to be a method of float, > int, ..., and not just to be an auxiliary function? > > p.s.: The same thoughts about `as_integer_ratio` discussion. I would have added them as functions in the `math` module instead. perhaps supported by dunder methods (__as_integer_ratio__, __is_integer__). But that's not what happened, and whether or not they have double underscores on each end doesn't really make all that much difference except to dedicated pedants ;-) From gregory.szorc at gmail.com Thu Mar 22 12:58:07 2018 From: gregory.szorc at gmail.com (Gregory Szorc) Date: Thu, 22 Mar 2018 09:58:07 -0700 Subject: [Python-Dev] Better support for consuming vendored packages Message-ID: I'd like to start a discussion around practices for vendoring package dependencies. I'm not sure python-dev is the appropriate venue for this discussion. If not, please point me to one and I'll gladly take it there. I'll start with a problem statement. 
Not all consumers of Python packages wish to consume Python packages in the common `pip install ` + `import ` manner. Some Python applications may wish to vendor Python package dependencies such that known compatible versions are always available. For example, a Python application targeting a general audience may not wish to expose the existence of Python nor want its users to be concerned about Python packaging. This is good for the application because it reduces complexity and the surface area of things that can go wrong. But at the same time, Python applications need to be aware that the Python environment may contain more than just the Python standard library and whatever Python packages are provided by that application. If using the system Python executable, other system packages may have installed Python packages in the system site-packages and those packages would be visible to your application. A user could `pip install` a package and that would be in the Python environment used by your application. In short, unless your application distributes its own copy of Python, all bets are off with regards to what packages are installed. (And even then advanced users could muck with the bundled Python, but let's ignore that edge case.) In short, `import X` is often the wild west. For applications that want to "just work" without requiring end users to manage Python packages, `import X` is dangerous because `X` could come from anywhere and be anything - possibly even a separate code base providing the same package name! Since Python applications may not want to burden users with Python packaging, they may vendor Python package dependencies such that a known compatible version is always available. In most cases, a Python application can insert itself into `sys.path` to ensure its copies of packages are picked up first. This works a lot of the time. But the strategy can fall apart. Some Python applications support loading plugins or extensions. 
When user-provided code can be executed, that code could have dependencies on additional Python packages. Or that custom code could perform `sys.path` modifications to provide its own package dependencies. What this means is that `import X` from the perspective of the main application becomes dangerous again. You want to pick up the packages that you provided. But you just aren't sure that those packages will actually be picked up. And to complicate matters even more, an extension may wish to use a *different* version of a package from what you distribute. e.g. they may want to adopt the latest version that you haven't ported to yet, or they may want to use an old version because they haven't ported yet. So now you have the requirement that multiple versions of packages be available. In Python's shared module namespace, that means having separate package names.

A partial solution to this quagmire is using relative - not absolute - imports. e.g. say you have a package named "knights." It has a dependency on a 3rd party package named "shrubbery." Let's assume you distribute your application with a copy of "shrubbery" which is installed at some packages root, alongside "knights:"

    /
    /knights/__init__.py
    /knights/ni.py
    /shrubbery/__init__.py

If from `knights.ni` you `import shrubbery`, you /could/ get the copy of "shrubbery" distributed by your application. Or you could pick up some other random copy that is also installed somewhere in `sys.path`.

Whereas if you vendor "shrubbery" into your package, e.g.

    /
    /knights/__init__.py
    /knights/ni.py
    /knights/vendored/__init__.py
    /knights/vendored/shrubbery/__init__.py

and from `knights.ni` you `from .vendored import shrubbery`, you are *guaranteed* to get your local copy of the "shrubbery" package. This reliable behavior is highly desired by Python applications.

But there are problems.

What we've done is effectively rename the "shrubbery" package to "knights.vendored.shrubbery."
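The renaming can be observed directly with a throwaway copy of the vendored layout. The following is a minimal sketch, not anyone's actual packaging code: the file contents (the `HEIGHT` constant, the empty `__init__.py` files) and the helper `build_tree` are invented for illustration, and the tree is written to a temporary directory that is put on `sys.path` only for the duration of the import.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical file contents for the vendored layout described above.
FILES = {
    "knights/__init__.py": "",
    "knights/ni.py": "from .vendored import shrubbery\n",
    "knights/vendored/__init__.py": "",
    "knights/vendored/shrubbery/__init__.py": "HEIGHT = 'low'\n",
}


def build_tree(root: Path) -> None:
    """Write the example package tree under root."""
    for rel_path, text in FILES.items():
        target = root / rel_path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(text)


with tempfile.TemporaryDirectory() as tmp:
    build_tree(Path(tmp))
    sys.path.insert(0, tmp)
    try:
        ni = importlib.import_module("knights.ni")
        # The vendored copy is registered under its new dotted name.
        vendored_name = ni.shrubbery.__name__
        registered = "knights.vendored.shrubbery" in sys.modules
    finally:
        sys.path.remove(tmp)

print(vendored_name)  # knights.vendored.shrubbery
print(registered)     # True
```

Because the module is registered in `sys.modules` as `knights.vendored.shrubbery`, any code that asks for plain `shrubbery` goes through a separate lookup on `sys.path` and is not satisfied by the vendored copy.
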
If a module inside that package attempts an `import shrubbery.x`, this could fail because "shrubbery" is no longer the package name. Or worse, it could pick up a separate copy of "shrubbery" somewhere else in `sys.path` and you could have a Frankenstein package pulling its code from multiple installs. So for this to work, all package-local imports must be using relative imports. e.g. `from . import x`. The takeaway is that packages using relative imports for their own modules are much more flexible and therefore friendly to downstream consumers that may wish to vendor them under different names. Packages using relative imports can be dropped in and used, often without source modifications. This is a big deal, as downstream consumers don't want to be modifying/forking packages they don't maintain. Because of the advantages of relative imports, *I've individually reached the conclusion that relative imports within packages should be considered a best practice.* I would encourage the Python community to discuss adopting that practice more formally (perhaps as a PEP or something). But package-local relative imports aren't a cure-all. There is a major problem with nested dependencies. e.g. if "shrubbery" depends on the "herring" package. There's no reasonable way of telling "shrubbery" that "herring" is actually provided by "knights.vendored." You might be tempted to convert non package-local imports to relative. e.g. `from .. import herring`. But the importer doesn't allow relative imports outside the current top-level package and this would break classic installs where "shrubbery" and "herring" are proper top-level packages and not sub-packages in e.g. a "vendored" sub-package. For cases where this occurs, the easiest recourse today is to rewrite imported source code to use relative imports. That's annoying, but it works. In summary, some Python applications may want to vendor and distribute Python package dependencies. 
Reliance on absolute imports is dangerous because the global Python environment is effectively undefined from the perspective of the application. The safest thing to do is use relative imports from within the application. But because many packages don't use relative imports themselves, vendoring a package can require rewriting source code so imports are relative. And even if relative imports are used within that package, relative imports can't be used for other top-level packages. So source code rewriting is required to handle these. If you vendor your Python package dependencies, your world often consists of a lot of pain. It's better to absorb that pain than inflict it on the end-users of your application (who shouldn't need to care about Python packaging). But this is a pain that Python application developers must deal with. And I feel that pain undermines the health of the Python ecosystem because it makes Python a less attractive platform for standalone applications. I would very much welcome a discussion and any ideas on improving the Python package dependency problem for standalone Python applications. I think encouraging the use of relative imports within packages is a solid first step. But it obviously isn't a complete solution. Gregory -------------- next part -------------- An HTML attachment was scrubbed... URL: From phd at phdru.name Thu Mar 22 13:48:21 2018 From: phd at phdru.name (Oleg Broytman) Date: Thu, 22 Mar 2018 18:48:21 +0100 Subject: [Python-Dev] Better support for consuming vendored packages In-Reply-To: References: Message-ID: <20180322174821.d4bfabpud3zuymeg@phdru.name> Hi! On Thu, Mar 22, 2018 at 09:58:07AM -0700, Gregory Szorc wrote: > Not all consumers of Python packages wish to consume Python packages in the > common `pip install ` IMO `pip` is for developers. To package and distribute end-user applications there are rpm, dpkg/deb, PyInstaller, cx_Freeze, py2exe (+ installer like NSIS or InnoSetup), py2app, etc... 
Most of them pack a copy of Python interpreter and necessary parts of stdlib, so there is no problem with `sys.path` and wrong imports. > Gregory Oleg. -- Oleg Broytman http://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN. From kirillbalunov at gmail.com Thu Mar 22 14:59:27 2018 From: kirillbalunov at gmail.com (Kirill Balunov) Date: Thu, 22 Mar 2018 21:59:27 +0300 Subject: [Python-Dev] Deprecating float.is_integer() In-Reply-To: References: <5AB34A4F.4010506@canterbury.ac.nz> Message-ID: 2018-03-22 19:47 GMT+03:00 Tim Peters : > > > Is this functionality so often used and practical to be a method of > float, > > int, ..., and not just to be an auxiliary function? > > > > p.s.: The same thoughts about `as_integer_ratio` discussion. > > I would have added them as functions in the `math` module instead. > perhaps supported by dunder methods (__as_integer_ratio__, > __is_integer__). But that's not what happened, and whether or not > they have double underscores on each end doesn't really make all that > much difference except to dedicated pedants ;-) > Yes, this was my point. In spite of the fact that the pronouncement has already been made, there may still be an opportunity to influence this decision. I do not think that this is only a matter of choice, how this functionality will be accessed through a method or function, in fact these highly specialized methods heavily pollute the API and open the door for persistent questions. Given the frequency and activity of using this `.is_integer` method the deprecation of this method is unlikely to greatly affect someone. (for `as_integer_ratio` I think the bar is higher). Summarizing this thread it seems to me that with deprecation of `is_integer` method and with addition of `is_integer` function in math module will make everyone happy: PROS: 1. Those who do not like this method, and do not want to see it as a redundant part of `int`, ... will be happy 2. 
Those who need it will have this functionality through math module 3. Compatible packages do not have to quack louder 4. Cleaner API (no need to add this through numeric tower) 5. Make everyone happy and stop this thread :) CONS: 1. Backward incompatible change I do not want to restart this topic, but I think that there is an opportunity for improvement that can be missed. With kind regards, -gdg -------------- next part -------------- An HTML attachment was scrubbed... URL: From gregory.szorc at gmail.com Thu Mar 22 15:00:59 2018 From: gregory.szorc at gmail.com (Gregory Szorc) Date: Thu, 22 Mar 2018 12:00:59 -0700 Subject: [Python-Dev] Better support for consuming vendored packages In-Reply-To: <20180322174821.d4bfabpud3zuymeg@phdru.name> References: <20180322174821.d4bfabpud3zuymeg@phdru.name> Message-ID: On 3/22/2018 10:48 AM, Oleg Broytman wrote: > Hi! > > On Thu, Mar 22, 2018 at 09:58:07AM -0700, Gregory Szorc wrote: >> Not all consumers of Python packages wish to consume Python packages in the >> common `pip install ` > > IMO `pip` is for developers. To package and distribute end-user > applications there are rpm, dpkg/deb, PyInstaller, cx_Freeze, py2exe > (+ installer like NSIS or InnoSetup), py2app, etc... > > Most of them pack a copy of Python interpreter and necessary parts > of stdlib, so there is no problem with `sys.path` and wrong imports. Yes, there are tools to create standalone packages. Some even bundle a Python install so the execution environment is more deterministic. These are great ways to distribute Python applications! However, if you are a Python application that is maintained by your distribution's package manager, you pretty much must use a Python installed by the system package manager. And that leaves us with the original problem of an undefined execution environment. So packaging tools for standalone Python applications only work if you control the package distribution channel. 
If you are a successful Python application that is packaged by your distro, you lose the ability to control your own destiny and must confront these problems for users who installed your application through their distro's package manager. i.e. the cost of success for your Python application is a lot of pain inflicted by policies of downstream packagers.

Also, not vendoring dependencies puts the onus on downstream packagers to deal with those dependencies. There can be package version conflicts between various packaged Python applications ("dependency hell"). Vendoring dependencies under application-local package names removes the potential for version conflicts. It's worth noting that some downstream packagers do insist on unbundling dependencies. So they may get stuck with work regardless. But if you vendor dependencies, a downstream packager is at least capable of packaging a Python application without having to deal with "dependency hell."

Maybe what I'm asking for here is import machinery where an application can forcefully limit or influence import mechanisms for modules in a certain package. But this seems difficult to achieve given the constraint of a single, global modules namespace (`sys.modules`) per interpreter.

From barry at python.org  Thu Mar 22 15:30:02 2018
From: barry at python.org (Barry Warsaw)
Date: Thu, 22 Mar 2018 12:30:02 -0700
Subject: [Python-Dev] Better support for consuming vendored packages
In-Reply-To:
References:
Message-ID: <671CCBB6-A712-4B6C-93A5-CD9EBE571C11@python.org>

On Mar 22, 2018, at 09:58, Gregory Szorc wrote:
>
> Not all consumers of Python packages wish to consume Python packages in the common `pip install ` + `import ` manner. Some Python applications may wish to vendor Python package dependencies such that known compatible versions are always available.

It's important to understand how painful vendoring is to some downstream consumers. Debian is a good example.
There we often have to go through a lot of hoops to unvendor packages, both for policy and for good distribution practices. The classic example is a security vulnerability in a library. It's the distro's responsibility to fix that, but in the face of vendored dependencies, you can't just patch the system package. Now you also have to hunt down all the vendored versions and figure out if *they're* vulnerable, etc. It certainly doesn't help that you can easily have vendored libraries vendoring their own dependencies. I think I found one application in Debian once that had like 4 or 5 versions of urllib3 inside it!

You mention dependency hell for downstream consumers like a Linux distro, but this type of integration work is exactly the job of a distro. They have to weigh the health and security of all the applications and libraries they support, so it doesn't bother me that it's sometimes challenging to work out the right versions of library dependencies. It bothers me a lot that I have to (sometimes heavily) modify packages to devendorize dependencies, especially because it's not always clearly evident that that has happened.

That said, I completely understand the desire for application and library authors to pin their dependency versions. We've had some discussions in the past (not really leading anywhere) about how to satisfy both communities. I definitely don't go so far as to discourage global imports, and I definitely don't like vendoring all your dependencies.

For applications distributed outside of a distro, there are lots of options, from zip apps (e.g. pex) to frozen binaries, etc.

Developers are mostly going to use pip, and maybe a requirements.txt, so I think that use case is well covered.

Downstream consumers need to be able to easily devendorize, but I think ultimately, the need to vendorize just points to something more fundamental missing from Python's distribution and import system.
Cheers,
-Barry

From phd at phdru.name  Thu Mar 22 15:33:44 2018
From: phd at phdru.name (Oleg Broytman)
Date: Thu, 22 Mar 2018 20:33:44 +0100
Subject: [Python-Dev] Better support for consuming vendored packages
In-Reply-To: <671CCBB6-A712-4B6C-93A5-CD9EBE571C11@python.org>
References: <671CCBB6-A712-4B6C-93A5-CD9EBE571C11@python.org>
Message-ID: <20180322193344.bwk6hhx3z4yjih4l@phdru.name>

On Thu, Mar 22, 2018 at 12:30:02PM -0700, Barry Warsaw wrote:
> Developers are mostly going to use pip, and maybe a requirements.txt,

   +virtual envs to avoid problems with global site-packages.

   IMO virtualenv for development and frozen app for distribution solve the problem much better than vendoring.

> Cheers,
> -Barry

Oleg.
--
     Oleg Broytman            http://phdru.name/            phd at phdru.name
           Programmers don't die, they just GOSUB without RETURN.

From barry at python.org  Thu Mar 22 15:51:14 2018
From: barry at python.org (Barry Warsaw)
Date: Thu, 22 Mar 2018 12:51:14 -0700
Subject: [Python-Dev] Better support for consuming vendored packages
In-Reply-To: <20180322193344.bwk6hhx3z4yjih4l@phdru.name>
References: <671CCBB6-A712-4B6C-93A5-CD9EBE571C11@python.org> <20180322193344.bwk6hhx3z4yjih4l@phdru.name>
Message-ID: <65C8BEC0-F39A-465B-9408-E9FF082ADBB1@python.org>

On Mar 22, 2018, at 12:33, Oleg Broytman wrote:
>
> On Thu, Mar 22, 2018 at 12:30:02PM -0700, Barry Warsaw wrote:
>> Developers are mostly going to use pip, and maybe a requirements.txt,
>
>   +virtual envs to avoid problems with global site-packages.

Yep, that was implied but of course it's better to be explicit. :)

>   IMO virtualenv for development and frozen app for distribution solve
> the problem much better than vendoring.

+1

-Barry

From tim.peters at gmail.com  Fri Mar 23 01:21:03 2018
From: tim.peters at gmail.com (Tim Peters)
Date: Fri, 23 Mar 2018 00:21:03 -0500
Subject: [Python-Dev] Deprecating float.is_integer()
In-Reply-To:
References: <5AB34A4F.4010506@canterbury.ac.nz>
Message-ID:

[Kirill Balunov ]
> ...
>.... In spite of the fact that the pronouncement has
> already been made, there may still be an opportunity to influence this
> decision.

That's not really how this works. Guido has been doing this for decades, and when he Pronounces he's done with it :-)

> I do not think that this is only a matter of choice, how this
> functionality will be accessed through a method or function, in fact these
> highly specialized methods heavily pollute the API

"Heavily"? Seems oversold.

> and open the door for persistent questions.

That's a door that can never be closed, no matter what.

> Given the frequency and activity of using this
> `.is_integer` method the deprecation of this method is unlikely to greatly
> affect someone. (for `as_integer_ratio` I think the bar is higher).
> Summarizing this thread it seems to me that with deprecation of `is_integer`
> method and with addition of `is_integer` function in math module will make
> everyone happy:

Not at all, but that's already been explained. Deprecation is _serious_ business: it's not only the presumably relative handful of direct users who are directly annoyed, but any number of worldwide web pages, blogs, books, papers, slides, handouts, message boards ... that so much as mentioned the now-deprecated feature. The language implementation is the tiniest part of what's affected, yet is the _only_ part we (Python developers) can repair.

Deprecation really requires that something is a security hole that can't be repaired, impossible to make work as intended, approximately senseless, or is superseded by a new way to accomplish a thing that's near-universally agreed to be vastly superior. Maybe others? Regardless, they're all "really big deals".

The "harm" done by keeping these methods seems approximately insignificant. Serhiy certainly found examples where uses made no good sense, but that's _common_ among floating-point features. For example, here's a near-useless implementation of Newton's method for computing square roots:

    def mysqrt(x):
        guess = x / 2.0
        while guess ** 2 != x:
            guess = (guess + x / guess) / 2.0
        return guess

And here I'll use it:

    >>> mysqrt(25.0)
    5.0
    >>> mysqrt(25.2)
    5.019960159204453

Works great! Ship it :-)

    >>> mysqrt(25.1)

Oops. It just sits there, consuming cycles.

That's because there is no IEEE double x such that x*x == 25.1. While that's not at all obvious, it's true. Some people really have argued to deprecate (in)equality testing of floats because of "things like that", but that's fundamentally nuts. We may as well remove floats entirely then.

In short, that an fp feature can be misused, and _is_ misused, is no argument for deprecating it. If it can _only_ be misused, that's different, but that doesn't apply to is_integer.

That someone - or even almost everyone - is merely annoyed by seeing an API they have no personal use for doesn't get close to "really big deal". The time to stop it was before it was added.

> PROS:
> ...
> 5. Make everyone happy and stop this thread :)

This thread ended before you replied to it - I'm just a ghost haunting its graveyard to keep you from feeling ignored -)

From steve.dower at python.org  Fri Mar 23 09:28:31 2018
From: steve.dower at python.org (Steve Dower)
Date: Fri, 23 Mar 2018 06:28:31 -0700
Subject: [Python-Dev] Better support for consuming vendored packages
In-Reply-To: <65C8BEC0-F39A-465B-9408-E9FF082ADBB1@python.org>
References: <671CCBB6-A712-4B6C-93A5-CD9EBE571C11@python.org> <20180322193344.bwk6hhx3z4yjih4l@phdru.name> <65C8BEC0-F39A-465B-9408-E9FF082ADBB1@python.org>
Message-ID:

FWIW, this is a topic I was planning to bring up at the language summit this year, so for those who are going to be there and want to toss around ideas (mine is nearly developed enough to present, but not quite yet), bring them.

That said, I don't think relying on relative imports within a package should be at all controversial, but perhaps it needs an official endorsement somehow? PEP 8 is what people read to find these, but I don't know if it makes sense for the stdlib (maybe it could deal with some of the shadowing issues people run into? If they manage to import the top level module before their own appears ahead of it on sys.path... thinking out loud here).

Cheers,
Steve

Top-posted from my Windows phone

From: Barry Warsaw
Sent: Thursday, March 22, 2018 12:56
To: Python-Dev
Subject: Re: [Python-Dev] Better support for consuming vendored packages

On Mar 22, 2018, at 12:33, Oleg Broytman wrote:
>
> On Thu, Mar 22, 2018 at 12:30:02PM -0700, Barry Warsaw wrote:
>> Developers are mostly going to use pip, and maybe a requirements.txt,
>
>   +virtual envs to avoid problems with global site-packages.

Yep, that was implied but of course it's better to be explicit. :)

>   IMO virtualenv for development and frozen app for distribution solve
> the problem much better than vendoring.

+1

-Barry
-------------- next part --------------
An HTML attachment was scrubbed...
From mmangoba at python.org  Fri Mar 23 11:04:15 2018
From: mmangoba at python.org (Mark Mangoba)
Date: Fri, 23 Mar 2018 08:04:15 -0700
Subject: [Python-Dev] PEP 541 - Accepted
Message-ID:

Hi All,

As the BDFL-Delegate, I'm happy to announce PEP 541 has been accepted.

PEP 541 has been voted by the packaging-wg (https://wiki.python.org/psf/PackagingWG/Charter):

- Donald Stufft
- Dustin Ingram
- Ernest W. Durbin III
- Ewa Jodlowska
- Kenneth Reitz
- Mark Mangoba
- Nathaniel J. Smith
- Nick Coghlan
- Nicole Harris
- Sumana Harihareswara

Thank you to the packaging-wg and to everyone that has contributed to PEP 541.

Best regards,
Mark

--
Mark Mangoba | PSF IT Manager | Python Software Foundation | mmangoba at python.org | python.org | Infrastructure Staff: infrastructure-staff at python.org | GPG: 2DE4 D92B 739C 649B EBB8 CCF6 DC05 E024 5F4C A0D1

From guido at python.org  Fri Mar 23 11:51:35 2018
From: guido at python.org (Guido van Rossum)
Date: Fri, 23 Mar 2018 08:51:35 -0700
Subject: [Python-Dev] PEP 541 - Accepted
In-Reply-To:
References:
Message-ID:

Thank you all!

PS for those who don't recall what this is: https://www.python.org/dev/peps/pep-0541/ "Package Index Name Retention" was how the mypy project got its project name back on PyPI (there was an ancient inactive project by that name whose owner did not respond to any email).

On Fri, Mar 23, 2018 at 8:04 AM, Mark Mangoba wrote:

> Hi All,
>
> As the BDFL-Delegate, I'm happy to announce PEP 541 has been accepted.
>
> PEP 541 has been voted by the packaging-wg
> (https://wiki.python.org/psf/PackagingWG/Charter):
>
> - Donald Stufft
> - Dustin Ingram
> - Ernest W. Durbin III
> - Ewa Jodlowska
> - Kenneth Reitz
> - Mark Mangoba
> - Nathaniel J. Smith
> - Nick Coghlan
> - Nicole Harris
> - Sumana Harihareswara
>
> Thank you to the packaging-wg and to everyone that has contributed to PEP
> 541.
> > Best regards, > Mark > > -- > Mark Mangoba | PSF IT Manager | Python Software Foundation | > mmangoba at python.org | python.org | Infrastructure Staff: > infrastructure-staff at python.org | GPG: 2DE4 D92B 739C 649B EBB8 CCF6 DC05 > E024 5F4C A0D1 > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > guido%40python.org > > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From status at bugs.python.org Fri Mar 23 13:10:00 2018 From: status at bugs.python.org (Python tracker) Date: Fri, 23 Mar 2018 18:10:00 +0100 (CET) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20180323171000.97EF611BF7E@psf.upfronthosting.co.za> ACTIVITY SUMMARY (2018-03-16 - 2018-03-23) Python tracker at https://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message. Issues counts and deltas: open 6528 ( +3) closed 38349 (+37) total 44877 (+40) Open issues with patches: 2542 Issues opened (30) ================== #33087: No reliable clean shutdown method https://bugs.python.org/issue33087 opened by Void2258 #33088: Cannot pass a SyncManager proxy to a multiprocessing subproces https://bugs.python.org/issue33088 opened by jjdmon #33089: Add multi-dimensional Euclidean distance function to the math https://bugs.python.org/issue33089 opened by rhettinger #33090: race condition between send and recv in _ssl with non-zero tim https://bugs.python.org/issue33090 opened by nneonneo #33091: ssl.SSLError: Invalid error code (_ssl.c:2217) https://bugs.python.org/issue33091 opened by devkid #33092: The bytecode for f-string formatting is inefficient. 
https://bugs.python.org/issue33092 opened by Mark.Shannon #33093: Fatal error on SSL transport https://bugs.python.org/issue33093 opened by Eric Toombs #33095: Cross-reference isolated mode from relevant locations https://bugs.python.org/issue33095 opened by ncoghlan #33096: ttk.Treeview.insert() does not allow to insert item with "Fals https://bugs.python.org/issue33096 opened by igor.yakovchenko #33097: concurrent futures Executors accept tasks after interpreter sh https://bugs.python.org/issue33097 opened by mrknmc #33099: test_poplib hangs with the changes done in PR https://bugs.python.org/issue33099 opened by jayyin11043 #33102: get the nth folder of a given path https://bugs.python.org/issue33102 opened by amjad ben hedhili #33105: os.path.isfile returns false on Windows when file path is long https://bugs.python.org/issue33105 opened by ldconejo #33106: Deleting a key in a read-only gdbm results in KeyError, not gd https://bugs.python.org/issue33106 opened by sam-s #33109: argparse: make new 'required' argument to add_subparsers defau https://bugs.python.org/issue33109 opened by wolma #33110: Adding a done callback to a concurrent.futures Future once it https://bugs.python.org/issue33110 opened by samm #33111: Merely importing tkinter breaks parallel code (multiprocessing https://bugs.python.org/issue33111 opened by ezwelty #33113: Query performance is very low and can even lead to denial of s https://bugs.python.org/issue33113 opened by ghi5107 #33114: random.sample() behavior is unexpected/unclear from docs https://bugs.python.org/issue33114 opened by Scott Eilerman #33115: Asyncio loop blocks with a lot of parallel tasks https://bugs.python.org/issue33115 opened by decaz #33117: asyncio example uses non-existing/documented method https://bugs.python.org/issue33117 opened by hfingler #33118: No clean way to get notified when a Transport's write buffer e https://bugs.python.org/issue33118 opened by vitaly.krug #33119: python sys.argv argument parsing not 
clear https://bugs.python.org/issue33119 opened by Jonathan Huot #33120: infinite loop in inspect.unwrap(unittest.mock.call) https://bugs.python.org/issue33120 opened by peterdemin #33121: recv returning 0 on closed connection not documented https://bugs.python.org/issue33121 opened by joders #33122: ftplib: FTP_TLS seems to have problems with sites that close t https://bugs.python.org/issue33122 opened by jottbe #33123: Path.unlink should have a missing_ok parameter https://bugs.python.org/issue33123 opened by rbu #33124: Lazy execution of module bytecode https://bugs.python.org/issue33124 opened by nascheme #33125: Windows 10 ARM64 platform support https://bugs.python.org/issue33125 opened by Steven Noonan #33126: Some C buffer protocol APIs not documented https://bugs.python.org/issue33126 opened by pitrou Most recent 15 issues with no replies (15) ========================================== #33126: Some C buffer protocol APIs not documented https://bugs.python.org/issue33126 #33124: Lazy execution of module bytecode https://bugs.python.org/issue33124 #33123: Path.unlink should have a missing_ok parameter https://bugs.python.org/issue33123 #33122: ftplib: FTP_TLS seems to have problems with sites that close t https://bugs.python.org/issue33122 #33121: recv returning 0 on closed connection not documented https://bugs.python.org/issue33121 #33119: python sys.argv argument parsing not clear https://bugs.python.org/issue33119 #33117: asyncio example uses non-existing/documented method https://bugs.python.org/issue33117 #33113: Query performance is very low and can even lead to denial of s https://bugs.python.org/issue33113 #33110: Adding a done callback to a concurrent.futures Future once it https://bugs.python.org/issue33110 #33099: test_poplib hangs with the changes done in PR https://bugs.python.org/issue33099 #33096: ttk.Treeview.insert() does not allow to insert item with "Fals https://bugs.python.org/issue33096 #33095: Cross-reference isolated mode from 
relevant locations https://bugs.python.org/issue33095 #33090: race condition between send and recv in _ssl with non-zero tim https://bugs.python.org/issue33090 #33088: Cannot pass a SyncManager proxy to a multiprocessing subproces https://bugs.python.org/issue33088 #33085: *** Error in `python': double free or corruption (out): 0x0000 https://bugs.python.org/issue33085 Most recent 15 issues waiting for review (15) ============================================= #33124: Lazy execution of module bytecode https://bugs.python.org/issue33124 #33123: Path.unlink should have a missing_ok parameter https://bugs.python.org/issue33123 #33097: concurrent futures Executors accept tasks after interpreter sh https://bugs.python.org/issue33097 #33092: The bytecode for f-string formatting is inefficient. https://bugs.python.org/issue33092 #33083: math.factorial accepts non-integral Decimal instances https://bugs.python.org/issue33083 #33082: multiprocessing docs bury very important 'callback=' args https://bugs.python.org/issue33082 #33070: Add platform triplet for RISC-V https://bugs.python.org/issue33070 #33063: failed to build _ctypes: undefined reference to `ffi_closure_F https://bugs.python.org/issue33063 #33061: NoReturn missing from __all__ in typing.py https://bugs.python.org/issue33061 #33058: Enhance Python's Memory Instrumentation with COUNT_ALLOCS https://bugs.python.org/issue33058 #33057: logging.Manager.logRecordFactory is never used https://bugs.python.org/issue33057 #33051: IDLE: Create new tab for editor options in configdialog https://bugs.python.org/issue33051 #33048: macOS job broken on Travis CI https://bugs.python.org/issue33048 #33042: New 3.7 startup sequence crashes PyInstaller https://bugs.python.org/issue33042 #33038: GzipFile doesn't always ignore None as filename https://bugs.python.org/issue33038 Top 10 most discussed issues (10) ================================= #33053: Avoid adding an empty directory to sys.path when running a mod 
https://bugs.python.org/issue33053 18 msgs #33089: Add multi-dimensional Euclidean distance function to the math https://bugs.python.org/issue33089 12 msgs #33118: No clean way to get notified when a Transport's write buffer e https://bugs.python.org/issue33118 9 msgs #6083: Reference counting bug in PyArg_ParseTuple and PyArg_ParseTupl https://bugs.python.org/issue6083 7 msgs #33115: Asyncio loop blocks with a lot of parallel tasks https://bugs.python.org/issue33115 7 msgs #25345: Unable to install Python 3.5 on Windows 10 https://bugs.python.org/issue25345 5 msgs #32949: Simplify "with"-related opcodes https://bugs.python.org/issue32949 5 msgs #33109: argparse: make new 'required' argument to add_subparsers defau https://bugs.python.org/issue33109 5 msgs #33111: Merely importing tkinter breaks parallel code (multiprocessing https://bugs.python.org/issue33111 5 msgs #31550: Inconsistent error message for TypeError with subscripting https://bugs.python.org/issue31550 4 msgs Issues closed (36) ================== #18802: ipaddress documentation errors https://bugs.python.org/issue18802 closed by xiang.zhang #23203: Aliasing import of sub-{module,package} from the package raise https://bugs.python.org/issue23203 closed by ncoghlan #25836: Documentation of MAKE_FUNCTION/MAKE_CLOSURE_EXTENDED_ARG is mi https://bugs.python.org/issue25836 closed by serhiy.storchaka #27683: ipaddress subnet slicing iterator malfunction https://bugs.python.org/issue27683 closed by xiang.zhang #28247: Add an option to zipapp to produce a Windows executable https://bugs.python.org/issue28247 closed by csabella #31639: http.server and SimpleHTTPServer hang after a few requests https://bugs.python.org/issue31639 closed by mdk #32056: Improve exceptions in aifc, sunau and wave https://bugs.python.org/issue32056 closed by serhiy.storchaka #32421: Keeping an exception in cache can segfault the interpreter https://bugs.python.org/issue32421 closed by zunger #32489: Allow 'continue' in 'finally' 
clause https://bugs.python.org/issue32489 closed by serhiy.storchaka #32505: dataclasses: make field() with no annotation an error https://bugs.python.org/issue32505 closed by eric.smith #32506: dataclasses: no need for OrderedDict now that dict guarantees https://bugs.python.org/issue32506 closed by eric.smith #32829: Lib/ be more pythonic https://bugs.python.org/issue32829 closed by r.david.murray #32838: Fix Python versions in the table of magic numbers https://bugs.python.org/issue32838 closed by serhiy.storchaka #32896: Error when subclassing a dataclass with a field that uses a de https://bugs.python.org/issue32896 closed by eric.smith #32953: Dataclasses: frozen should not be inherited for non-dataclass https://bugs.python.org/issue32953 closed by eric.smith #32960: dataclasses: disallow inheritance between frozen and non-froze https://bugs.python.org/issue32960 closed by eric.smith #33018: Improve issubclass() error checking and message https://bugs.python.org/issue33018 closed by levkivskyi #33027: handling filename encoding in Content-Disposition by cgi.Field https://bugs.python.org/issue33027 closed by pawciobiel #33034: urllib.parse.urlparse and urlsplit not raising ValueError for https://bugs.python.org/issue33034 closed by berker.peksag #33041: Issues with "async for" https://bugs.python.org/issue33041 closed by serhiy.storchaka #33044: pdb from base class, get inside a method of derived class https://bugs.python.org/issue33044 closed by terry.reedy #33049: itertools.count() confusingly mentions zip() and sequence numb https://bugs.python.org/issue33049 closed by rhettinger #33069: Maintainer information discarded when writing PKG-INFO https://bugs.python.org/issue33069 closed by ncoghlan #33078: Queue with maxsize can lead to deadlocks https://bugs.python.org/issue33078 closed by pitrou #33081: multiprocessing Queue leaks a file descriptor associated with https://bugs.python.org/issue33081 closed by pitrou #33086: pip: IndexError 
https://bugs.python.org/issue33086 closed by eric.smith #33094: dataclasses: ClassVar attributes are not working properly https://bugs.python.org/issue33094 closed by eric.smith #33098: add implicit conversion for random.choice() on a dict https://bugs.python.org/issue33098 closed by tim.peters #33100: dataclasses and __slots__ - non-default argument (member_descr https://bugs.python.org/issue33100 closed by eric.smith #33101: Possible name inversion in heapq implementation https://bugs.python.org/issue33101 closed by rhettinger #33103: Syntax to get multiple arbitrary items from an sequence https://bugs.python.org/issue33103 closed by serhiy.storchaka #33104: Documentation for EXTENDED_ARG in dis module is incorrect for https://bugs.python.org/issue33104 closed by Eric Appelt #33107: Feature request: more typing.SupportsXXX https://bugs.python.org/issue33107 closed by gvanrossum #33108: Unicode char 304 in lowercase has len = 2 https://bugs.python.org/issue33108 closed by serhiy.storchaka #33112: SequenceMatcher bug https://bugs.python.org/issue33112 closed by tim.peters #33116: Field is not exposed in dataclasses.__all__ https://bugs.python.org/issue33116 closed by eric.smith From nad at python.org Fri Mar 23 14:39:20 2018 From: nad at python.org (Ned Deily) Date: Fri, 23 Mar 2018 14:39:20 -0400 Subject: [Python-Dev] IMPORTANT - 3.7.0b3 cutoff / 3.7.0 ABI freeze Message-ID: Just a reminder that 3.7.0b3 is almost upon us. Please get your feature fixes, bug fixes, and documentation updates in before 2018-03-26 ~23:59 Anywhere on Earth (UTC-12:00). That's a little over 3.5 days from now. IMPORTANT: We are now entering the final phases of 3.7.0. After the tagging for 3.7.0b3, the intention is that the ABI for 3.7.0 is frozen. 
After next week's 3.7.0b3, there will only be two more opportunities planned for changes prior to 3.7.0 final: - 2018-04-30 3.7.0 beta 4 - 2018-05-31 3.7.0 release candidate As I've noted in previous communications, we need to start locking down 3.7.0 so that our downstream users, that is, third-party package developers, Python distributors, and end users, can test their code with confidence that the actual release of 3.7.0 will hold no unpleasant surprises. So after 3.7.0b3, you should treat the 3.7 branch as if it is already released and in maintenance mode. That means you should only push the kinds of changes that are appropriate for a maintenance release: non-ABI-changing bug and feature fixes and documentation updates. If you find a problem that requires an ABI-altering or other significant user-facing change (for example, something likely to introduce an incompatibility with existing users' code or require rebuilding of user extension modules), please make sure to set the b.p.o issue to "release blocker" priority and describe there why you feel the change is necessary. If you are reviewing PRs for 3.7 (and please do!), be on the lookout for and flag potential incompatibilities (we've all made them). Thanks again for all of your hard work towards making 3.7.0 yet another great release! --Ned -- Ned Deily nad at python.org -- [] From truestarecat at gmail.com Sat Mar 24 02:06:09 2018 From: truestarecat at gmail.com (Igor Yakovchenko) Date: Sat, 24 Mar 2018 10:06:09 +0400 Subject: [Python-Dev] ttk.Treeview.insert() does not allow to insert item with iid=0 Message-ID: I had opened a tracker issue . -- Igor Yakovchenko ??? ???????. www.avast.ru <#DAB4FAD8-2DD7-40BB-A1B8-4E2AA1F9FDF2> -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From storchaka at gmail.com Sat Mar 24 04:50:43 2018 From: storchaka at gmail.com (Serhiy Storchaka) Date: Sat, 24 Mar 2018 10:50:43 +0200 Subject: [Python-Dev] Move ensurepip blobs to external place Message-ID: Currently the repository contains bundled pip and setuptools (2 MB total) which are updated with every release of pip and setuptools. This increases the size of the repository by around 2 MB several times per year. There were total 37 updates of Lib/ensurepip/_bundled, therefore the repository contains up to 70 MB of unused blobs. The size of the repository is 350 MB. Currently blobs takes up to 20% of the size of the repository, but this percent will likely grow in future, because they where added only 4 years ago. Wouldn't be better to put them into a separate repository like Tcl/Tk and other external binaries for Windows, and download only the recent version? From ncoghlan at gmail.com Sat Mar 24 05:29:24 2018 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 24 Mar 2018 19:29:24 +1000 Subject: [Python-Dev] Better support for consuming vendored packages In-Reply-To: References: Message-ID: On 23 March 2018 at 02:58, Gregory Szorc wrote: > I'd like to start a discussion around practices for vendoring package > dependencies. I'm not sure python-dev is the appropriate venue for this > discussion. If not, please point me to one and I'll gladly take it there. > > Since you mainly seem interested in the import side of things (rather than the initial vendoring process), python-ideas is probably the most suitable location (we're not at the stage of a concrete design proposal that would be appropriate for python-dev, and this doesn't get far enough into import system arcana to really need to be an import-sig discussion rather than a python-ideas one). > What we've done is effectively rename the "shrubbery" package to > "knights.vendored.shrubbery." 
If a module inside that package attempts an > `import shrubbery.x`, this could fail because "shrubbery" is no longer the > package name. Or worse, it could pick up a separate copy of "shrubbery" > somewhere else in `sys.path` and you could have a Frankenstein package > pulling its code from multiple installs. So for this to work, all > package-local imports must be using relative imports. e.g. `from . import > x`. > If it's the main application doing the vendoring, then the following kind of snippet can be helpful: from knights.vendored import shrubbery import sys sys.path["shrubbery"] = shrubbery So doing that kind of aliasing on a process-wide basis is already possible, as long as you have a point where you can inject the alias (and by combining it with a lazy importer, you can defer the actual import until someone actually uses the module). Limiting aliasing to a particular set of modules *doing* imports would be much harder though, since we don't pass that information along (although context variables would potentially give us a way to make it available without having to redefine all the protocol APIs) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Sat Mar 24 06:15:31 2018 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 24 Mar 2018 20:15:31 +1000 Subject: [Python-Dev] Better support for consuming vendored packages In-Reply-To: References: Message-ID: On 24 March 2018 at 19:29, Nick Coghlan wrote: > On 23 March 2018 at 02:58, Gregory Szorc wrote: > >> I'd like to start a discussion around practices for vendoring package >> dependencies. I'm not sure python-dev is the appropriate venue for this >> discussion. If not, please point me to one and I'll gladly take it there. 
>> >> > Since you mainly seem interested in the import side of things (rather than > the initial vendoring process), python-ideas is probably the most suitable > location (we're not at the stage of a concrete design proposal that would > be appropriate for python-dev, and this doesn't get far enough into import > system arcana to really need to be an import-sig discussion rather than a > python-ideas one). > > >> What we've done is effectively rename the "shrubbery" package to >> "knights.vendored.shrubbery." If a module inside that package attempts an >> `import shrubbery.x`, this could fail because "shrubbery" is no longer the >> package name. Or worse, it could pick up a separate copy of "shrubbery" >> somewhere else in `sys.path` and you could have a Frankenstein package >> pulling its code from multiple installs. So for this to work, all >> package-local imports must be using relative imports. e.g. `from . import >> x`. >> > > If it's the main application doing the vendoring, then the following kind > of snippet can be helpful: > > from knights.vendored import shrubbery > import sys > sys.path["shrubbery"] = shrubbery > Oops, s/path/modules/ :) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at holdenweb.com Sat Mar 24 06:30:35 2018 From: steve at holdenweb.com (Steve Holden) Date: Sat, 24 Mar 2018 10:30:35 +0000 Subject: [Python-Dev] Better support for consuming vendored packages In-Reply-To: References: Message-ID: On Sat, Mar 24, 2018 at 9:29 AM, Nick Coghlan wrote: > On 23 March 2018 at 02:58, Gregory Szorc wrote: > >> I'd like to start a discussion around practices for vendoring package >> dependencies. I'm not sure python-dev is the appropriate venue for this >> discussion. If not, please point me to one and I'll gladly take it there. >> >> > ?[...]? 
> > If it's the main application doing the vendoring, then the following kind > of snippet can be helpful: > > from knights.vendored import shrubbery > import sys > sys.path["shrubbery"] = shrubbery > > I suspect you meant > sys.modules["shrubbery"] = shrubbery From ncoghlan at gmail.com Sat Mar 24 06:50:12 2018 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 24 Mar 2018 20:50:12 +1000 Subject: [Python-Dev] Move ensurepip blobs to external place In-Reply-To: References: Message-ID: On 24 March 2018 at 18:50, Serhiy Storchaka wrote: > Currently the repository contains bundled pip and setuptools (2 MB total) > which are updated with every release of pip and setuptools. This increases > the size of the repository by around 2 MB several times per year. There > were total 37 updates of Lib/ensurepip/_bundled, therefore the repository > contains up to 70 MB of unused blobs. The size of the repository is 350 MB. > Currently blobs takes up to 20% of the size of the repository, but this > percent will likely grow in future, because they where added only 4 years > ago. > > Wouldn't be better to put them into a separate repository like Tcl/Tk and > other external binaries for Windows, and download only the recent version? > Specifically, I believe that would entail adding them to https://github.com/python/cpython-bin-deps, and then updating the make file to do a shallow clone of the relevant branch and copy the binaries to a point where ensurepip expects to find them? I'm fine with the general idea of moving these out to the bin-deps repo, as long as cloning the main CPython repo and running "./configure && make && ./python -m test test_ensurepip" still works. We'd also want to add docs to the developer guide on how to update them (those docs are missing at the moment, since the update process is just dropping the new wheel files directly into the right place) Cheers, Nick.
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.f.moore at gmail.com Sat Mar 24 07:15:05 2018 From: p.f.moore at gmail.com (Paul Moore) Date: Sat, 24 Mar 2018 11:15:05 +0000 Subject: [Python-Dev] Move ensurepip blobs to external place In-Reply-To: References: Message-ID: On 24 March 2018 at 10:50, Nick Coghlan wrote: > On 24 March 2018 at 18:50, Serhiy Storchaka wrote: >> >> Currently the repository contains bundled pip and setuptools (2 MB total) >> which are updated with every release of pip and setuptools. This increases >> the size of the repository by around 2 MB several times per year. There were >> total 37 updates of Lib/ensurepip/_bundled, therefore the repository >> contains up to 70 MB of unused blobs. The size of the repository is 350 MB. >> Currently blobs takes up to 20% of the size of the repository, but this >> percent will likely grow in future, because they where added only 4 years >> ago. >> >> Wouldn't be better to put them into a separate repository like Tcl/Tk and >> other external binaries for Windows, and download only the recent version? > > > Specifically, I believe that would entail adding them to > https://github.com/python/cpython-bin-deps, and then updating the make file > to do a shallow clone of the relevant branch and copy the binaries to a > point where ensurepip expects to find them? > > I'm fine with the general idea of moving these out to the bin-deps repo, as > long as cloning the main CPython repo and running "./configure && make && > ./python -m test test_ensurepip" still works. 
We'd also want to add docs to > the developer guide on how to update them (those docs are missing at the > moment, since the update process is just dropping the new wheel files > directly into the right place) I don't have a problem with moving the pip/setuptools wheels - as long as (as a pip dev doing a release) I know where to put the files, it makes little difference to me. But as Nick says, if the files aren't in the main CPython repository, the build process (both the Unix and the Windows processes) will need updating to ensure that the files are taken from where they do reside and put in the right places. I'd assume that a change like that is big enough that it would be targeted at 3.8, BTW (and so won't affect what I need to do for 3.7). Paul From tinchester at gmail.com Sat Mar 24 10:18:14 2018 From: tinchester at gmail.com (=?UTF-8?Q?Tin_Tvrtkovi=C4=87?=) Date: Sat, 24 Mar 2018 14:18:14 +0000 Subject: [Python-Dev] Replacing self.__dict__ in __init__ Message-ID: Hi Python-dev, I'm one of the core attrs contributors, and I'm contemplating applying an optimization to our generated __init__s. Before someone warns me python-dev is for the development of the language itself, there are two reasons I'm posting this here: 1) it's a very low level question that I'd really like the input of the core devs on, and 2) maybe this will find its way into dataclasses if it works out. I've found that, if a class has more than one attribute, instead of creating an init like this: self.a = a self.b = b self.c = c it's faster to do this: self.__dict__ = {'a': a, 'b': b, 'c': c} i.e. to replace the instance dictionary altogether. On PyPy, their core devs inform me this is a bad idea because the instance dictionary is special there, so we won't be doing this on PyPy. But is it safe to do on CPython? To make the question simpler, disregard the possibility of custom setters on the attributes. Thanks in advance! 
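For readers following along, the two styles of generated `__init__` being compared can be sketched roughly as below. This is only an illustration and not the code attrs actually generates; the class names `AssignInit` and `DictInit` and the helper `compare` are made up for the example. Timings depend on the machine and the CPython version, so no numbers are claimed here.

```python
import timeit


class AssignInit:
    """Conventional __init__: one attribute assignment per field."""

    def __init__(self, a, b, c):
        self.a = a
        self.b = b
        self.c = c


class DictInit:
    """Variant under discussion: replace the instance dict in one step."""

    def __init__(self, a, b, c):
        self.__dict__ = {'a': a, 'b': b, 'c': c}


def compare(number=200_000):
    """Return (assign_seconds, dict_seconds) for creating `number` instances."""
    assign = timeit.timeit(lambda: AssignInit(1, 2, 3), number=number)
    replace = timeit.timeit(lambda: DictInit(1, 2, 3), number=number)
    return assign, replace
```

Calling `compare()` returns the two elapsed times in seconds; both classes end up with the same instance attributes, so only the construction cost differs.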
From kirillbalunov at gmail.com Sat Mar 24 11:15:09 2018 From: kirillbalunov at gmail.com (Kirill Balunov) Date: Sat, 24 Mar 2018 18:15:09 +0300 Subject: [Python-Dev] Replacing self.__dict__ in __init__ In-Reply-To: References: Message-ID: 2018-03-24 17:18 GMT+03:00 Tin Tvrtković: > > I've found that, if a class has more than one attribute, instead of > creating an init like this: > > self.a = a > self.b = b > self.c = c > > it's faster to do this: > > self.__dict__ = {'a': a, 'b': b, 'c': c} > > i.e. to replace the instance dictionary altogether. On PyPy, their core > devs inform me this is a bad idea because the instance dictionary is > special there, so we won't be doing this on PyPy. > But why you need to replace it? When you can just update it: class C: def __init__(self, a, b, c): self.__dict__.update({'a': a, 'b': b, 'c': c}) I'm certainly not a developer. Just out of curiosity. With kind regards, -gdg From steve at pearwood.info Sat Mar 24 12:09:59 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 25 Mar 2018 03:09:59 +1100 Subject: [Python-Dev] Replacing self.__dict__ in __init__ In-Reply-To: References: Message-ID: <20180324160958.GY16661@ando.pearwood.info> On Sat, Mar 24, 2018 at 02:18:14PM +0000, Tin Tvrtković wrote: > self.__dict__ = {'a': a, 'b': b, 'c': c} > > i.e. to replace the instance dictionary altogether. On PyPy, their core > devs inform me this is a bad idea because the instance dictionary is > special there, so we won't be doing this on PyPy. > > But is it safe to do on CPython?
I don't know if it's safe, but replacing __dict__ is certainly an old and famous idiom: https://code.activestate.com/recipes/66531-singleton-we-dont-need-no-stinkin-singleton-the-bo/ -- Steve From raymond.hettinger at gmail.com Sat Mar 24 12:20:22 2018 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Sat, 24 Mar 2018 09:20:22 -0700 Subject: [Python-Dev] Replacing self.__dict__ in __init__ In-Reply-To: References: Message-ID: > On Mar 24, 2018, at 7:18 AM, Tin Tvrtković wrote: > > it's faster to do this: > > self.__dict__ = {'a': a, 'b': b, 'c': c} > > i.e. to replace the instance dictionary altogether. On PyPy, their core devs inform me this is a bad idea because the instance dictionary is special there, so we won't be doing this on PyPy. > > But is it safe to do on CPython? This should work. I've seen it done in other production tools without any ill effect. The dict can be replaced during __init__() and still get benefits of key-sharing. That benefit is lost only when the instance dict keys are modified downstream from __init__(). So, from a dict size point of view, your optimization is fine. Still, you should look at whether this would affect static type checkers, lint tools, and other tooling. Raymond From steve.dower at python.org Sat Mar 24 16:13:33 2018 From: steve.dower at python.org (Steve Dower) Date: Sat, 24 Mar 2018 13:13:33 -0700 Subject: [Python-Dev] Move ensurepip blobs to external place In-Reply-To: References: Message-ID: Or we could just pull the right version directly from PyPI? (Note that updating the version should be an explicit step, as it is today, but the file should be identical to what's on PyPI, right? And a urlretrieve is easier than pulling from a git repo.)
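A minimal sketch of what "pull a pinned file with urlretrieve" might look like follows. This is an assumption-laden illustration, not how ensurepip or the CPython build works: the function name `fetch_pinned` is hypothetical, and the URL and expected SHA-256 digest would have to be recorded explicitly whenever the bundled version is bumped.

```python
import hashlib
from urllib.request import urlretrieve


def fetch_pinned(url, dest, expected_sha256):
    """Download `url` to `dest` and verify its SHA-256 digest.

    Hypothetical helper: the caller supplies the exact URL and the
    digest recorded when the pinned version was last updated.
    """
    path, _headers = urlretrieve(url, dest)
    with open(path, 'rb') as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    if digest != expected_sha256:
        raise ValueError(f'hash mismatch for {url}: got {digest}')
    return path
```

Checking the digest keeps the version update an explicit step: a new release on the index does not change what gets used until the recorded URL and hash are edited.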
Top-posted from my Windows phone From: Paul Moore Sent: Saturday, March 24, 2018 4:17 To: Nick Coghlan Cc: Serhiy Storchaka; python-dev Subject: Re: [Python-Dev] Move ensurepip blobs to external place On 24 March 2018 at 10:50, Nick Coghlan wrote: > On 24 March 2018 at 18:50, Serhiy Storchaka wrote: >> >> Currently the repository contains bundled pip and setuptools (2 MB total) >> which are updated with every release of pip and setuptools. This increases >> the size of the repository by around 2 MB several times per year. There were >> total 37 updates of Lib/ensurepip/_bundled, therefore the repository >> contains up to 70 MB of unused blobs. The size of the repository is 350 MB. >> Currently blobs takes up to 20% of the size of the repository, but this >> percent will likely grow in future, because they where added only 4 years >> ago. >> >> Wouldn't be better to put them into a separate repository like Tcl/Tk and >> other external binaries for Windows, and download only the recent version? > > > Specifically, I believe that would entail adding them to > https://github.com/python/cpython-bin-deps, and then updating the make file > to do a shallow clone of the relevant branch and copy the binaries to a > point where ensurepip expects to find them? > > I'm fine with the general idea of moving these out to the bin-deps repo, as > long as cloning the main CPython repo and running "./configure && make && > ./python -m test test_ensurepip" still works. We'd also want to add docs to > the developer guide on how to update them (those docs are missing at the > moment, since the update process is just dropping the new wheel files > directly into the right place) I don't have a problem with moving the pip/setuptools wheels - as long as (as a pip dev doing a release) I know where to put the files, it makes little difference to me. 
But as Nick says, if the files aren't in the main CPython repository, the build process (both the Unix and the Windows processes) will need updating to ensure that the files are taken from where they do reside and put in the right places. I'd assume that a change like that is big enough that it would be targeted at 3.8, BTW (and so won't affect what I need to do for 3.7). Paul _______________________________________________ Python-Dev mailing list Python-Dev at python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/steve.dower%40python.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From nad at python.org Sat Mar 24 16:52:01 2018 From: nad at python.org (Ned Deily) Date: Sat, 24 Mar 2018 16:52:01 -0400 Subject: [Python-Dev] Move ensurepip blobs to external place In-Reply-To: <407sB262wwzFrBs@mail.python.org> References: <407sB262wwzFrBs@mail.python.org> Message-ID: <8218AEF9-86A5-4A8E-83B4-4A1B7CF362BA@python.org> On Mar 24, 2018, at 16:13, Steve Dower wrote: > Or we could just pull the right version directly from PyPI? (Note that updating the version should be an explicit step, as it is today, but the file should be identical to what?s on PyPI, right? And a urlretrieve is easier than pulling from a git repo.) I think the primary original rationale for having the pip wheel and its dependencies checked into the cpython repo was so that users would be able to install pip even if they did not have an Internet connection. But perhaps that requirement can be relaxed a bit if we say that the necessary wheels are vendored into all of our downloadable release items, that is, included in the packaging of source release files (the various tarballs) and the Windows and macOS binary installers. The main change would likely be making ensurepip a bit smarter to download if the bundled wheels are not present in the source directory. 
Assuming that people building from a cpython repo need to have a network connection if they want to run ensurepip, at least for the first time, is probably not an onerous requirement. -- Ned Deily nad at python.org -- [] From ncoghlan at gmail.com Sat Mar 24 23:23:55 2018 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 25 Mar 2018 13:23:55 +1000 Subject: [Python-Dev] Replacing self.__dict__ in __init__ In-Reply-To: References: Message-ID: On 25 March 2018 at 00:18, Tin Tvrtković wrote: > But is it safe to do on CPython? > That depends on what you mean by "safe" :) It won't crash, but it will lose any existing entries that a metaclass, subclass, or __new__ method implementation might have added to the instance dictionary before calling the __init__ method. That can be OK in a tightly controlled application specific class hierarchy, but it would be questionable in a general purpose utility library that may be combined with arbitrary other types. As Kirill suggests, `self.__dict__.update(new_attrs)` is likely to be faster than repeated assignment statements, without the potentially odd interactions with other instance initialisation code. It should also be explicitly safe to do in the case of "type(self) is __class__ and not self.__dict__", which would let you speed up the common case of direct instantiation, while falling back to the update based approach when combined with other classes at runtime. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed...
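The guarded fast path described in the message above could be sketched as follows. The classes `Point` and `Tagged` are hypothetical and exist only to show the two branches; this is one reading of the suggestion, not code from attrs or dataclasses.

```python
class Point:
    """Hypothetical class using the guarded fast path."""

    def __init__(self, x, y):
        new_attrs = {'x': x, 'y': y}
        # __class__ is the implicit closure cell that refers to Point,
        # so the first test is true only for direct instantiation.
        if type(self) is __class__ and not self.__dict__:
            # Nothing has been stored on the instance yet: safe to replace.
            self.__dict__ = new_attrs
        else:
            # A subclass or __new__ may have added entries: keep them.
            self.__dict__.update(new_attrs)


class Tagged(Point):
    """Subclass whose __new__ stores an attribute before __init__ runs."""

    def __new__(cls, *args, **kwargs):
        self = super().__new__(cls)
        self.tag = 'set in __new__'
        return self
```

With this shape, `Point(1, 2)` takes the replacement branch, while `Tagged(3, 4)` falls back to `update()` and keeps the `tag` attribute that an unconditional replacement would have discarded.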
URL: From ncoghlan at gmail.com Sat Mar 24 23:27:18 2018 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 25 Mar 2018 13:27:18 +1000 Subject: [Python-Dev] Move ensurepip blobs to external place In-Reply-To: <8218AEF9-86A5-4A8E-83B4-4A1B7CF362BA@python.org> References: <407sB262wwzFrBs@mail.python.org> <8218AEF9-86A5-4A8E-83B4-4A1B7CF362BA@python.org> Message-ID: On 25 March 2018 at 06:52, Ned Deily wrote: > On Mar 24, 2018, at 16:13, Steve Dower wrote: > > Or we could just pull the right version directly from PyPI? (Note that > updating the version should be an explicit step, as it is today, but the > file should be identical to what?s on PyPI, right? And a urlretrieve is > easier than pulling from a git repo.) > > I think the primary original rationale for having the pip wheel and its > dependencies checked into the cpython repo was so that users would be able > to install pip even if they did not have an Internet connection. But > perhaps that requirement can be relaxed a bit if we say that the necessary > wheels are vendored into all of our downloadable release items, that is, > included in the packaging of source release files (the various tarballs) > and the Windows and macOS binary installers. The main change would likely > be making ensurepip a bit smarter to download if the bundled wheels are not > present in the source directory. Assuming that people building from a > cpython repo need to have a network connection if they want to run > ensurepip, at least for the first time, is probably not an onerous > requirement. > Right, having the wheels in the release artifacts is a requirement, as is having them available for use when running the test suite, but having them in the git repo isn't. Adding them directly to the repo was just the simplest approach to getting ensurepip working, since it didn't require any changes to the build process. Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From matt at vazor.com Sun Mar 25 01:44:15 2018 From: matt at vazor.com (Matt Billenstein) Date: Sun, 25 Mar 2018 05:44:15 +0000 Subject: [Python-Dev] Move ensurepip blobs to external place In-Reply-To: References: <407sB262wwzFrBs@mail.python.org> <8218AEF9-86A5-4A8E-83B4-4A1B7CF362BA@python.org> Message-ID: <010101625baf92f3-0fe6f030-9f72-42e6-915c-02d44aaae493-000000@us-west-2.amazonses.com> As i recall git LFS makes storing large binary objects in some external object storage fairly seamless - might be a good fit for keeping the same workflow and not bloating the repo. M -- Matt Billenstein matt at vazor.com Sent from my iPhone 6 (this put here so you know I have one) > On Mar 24, 2018, at 8:27 PM, Nick Coghlan wrote: > >> On 25 March 2018 at 06:52, Ned Deily wrote: >> On Mar 24, 2018, at 16:13, Steve Dower wrote: >> > Or we could just pull the right version directly from PyPI? (Note that updating the version should be an explicit step, as it is today, but the file should be identical to what?s on PyPI, right? And a urlretrieve is easier than pulling from a git repo.) >> >> I think the primary original rationale for having the pip wheel and its dependencies checked into the cpython repo was so that users would be able to install pip even if they did not have an Internet connection. But perhaps that requirement can be relaxed a bit if we say that the necessary wheels are vendored into all of our downloadable release items, that is, included in the packaging of source release files (the various tarballs) and the Windows and macOS binary installers. The main change would likely be making ensurepip a bit smarter to download if the bundled wheels are not present in the source directory. 
Assuming that people building from a cpython repo need to have a network connection if they want to run ensurepip, at least for the first time, is probably not an onerous requirement. > > Right, having the wheels in the release artifacts is a requirement, as is having them available for use when running the test suite, but having them in the git repo isn't. > > Adding them directly to the repo was just the simplest approach to getting ensurepip working, since it didn't require any changes to the build process. > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/matt%40vazor.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From tinchester at gmail.com Sun Mar 25 11:08:49 2018 From: tinchester at gmail.com (=?UTF-8?Q?Tin_Tvrtkovi=C4=87?=) Date: Sun, 25 Mar 2018 15:08:49 +0000 Subject: [Python-Dev] Replacing self.__dict__ in __init__ In-Reply-To: References: Message-ID: That's reassuring, thanks. On Sat, Mar 24, 2018 at 5:20 PM Raymond Hettinger < raymond.hettinger at gmail.com> wrote: > This should work. I've seen it done in other production tools without any > ill effect. > > The dict can be replaced during __init__() and still get benefits of > key-sharing. That benefit is lost only when the instance dict keys are > modified downstream from __init__(). So, from a dict size point of view, > your optimization is fine. > > Still, you should look at whether this would affect static type checkers, > lint tools, and other tooling. > > > Raymond -------------- next part -------------- An HTML attachment was scrubbed... 
URL:
From davidhalter88 at gmail.com  Sun Mar 25 08:36:17 2018
From: davidhalter88 at gmail.com (Dave Halter)
Date: Sun, 25 Mar 2018 14:36:17 +0200
Subject: [Python-Dev] Use more Argument Clinic Annotations?
Message-ID:

Hi Python Devs

I recently started testing Jedi with Python 3.7. Some tests broke. I
realized that one of the things that changed in 3.7 was the use of
argument clinic in methods like str.replace.

The issue is that the text signature doesn't contain a return annotation.

>>> str.replace.__text_signature__
'($self, old, new, count=-1, /)'

In Python < 3.7 there was a `S.replace(old, new[, count]) -> str` at
the top of the __doc__.

If the __text_signature__ was `'($self, old, new, count=-1, /) -> str'`
a lot of tools would be able to have the information again.

Is this intentional or was this just forgotten? I'd like to note that
this information is insanely helpful (at least for Jedi) to pick up
type information. I really hope this information can make it back into
3.7, since it was there in earlier versions.

If you don't have time I might have some. Just give me some
instructions.

~ Dave

PS: Don't get me wrong, I love argument clinic/inspect.signature and
am generally in favor of using it everywhere. It helps tools like
jedi, pycharm and others get accurate information about a builtin
function.

From tinchester at gmail.com  Sun Mar 25 11:38:58 2018
From: tinchester at gmail.com (Tin Tvrtković)
Date: Sun, 25 Mar 2018 15:38:58 +0000
Subject: [Python-Dev] Replacing self.__dict__ in __init__
In-Reply-To:
References:
Message-ID:

On Sun, Mar 25, 2018 at 5:23 AM Nick Coghlan wrote:

> That depends on what you mean by "safe" :)
>
> It won't crash, but it will lose any existing entries that a metaclass,
> subclass, or __new__ method implementation might have added to the instance
> dictionary before calling the __init__ method.
That can be OK in a tightly > controlled application specific class hierarchy, but it would be > questionable in a general purpose utility library that may be combined with > arbitrary other types. > > As Kirill suggests, `self.__dict__.update(new_attrs)` is likely to be > faster than repeated assignment statements, without the potentially odd > interactions with other instance initialisation code. > > It should also be explicitly safe to do in the case of "type(self) is > __class__ and not self.__dict__", which would let you speed up the common > case of direct instantiation, while falling back to the update based > approach when combined with other classes at runtime. > > Hm, food for thought, thank you. The entire point of the exercise is to shave nanoseconds off of __init__. Using Victor Stinner's excellent pyperf tool and CPython 3.6.3 on Linux, I see the dict replacement approach always beating the series of assignments approach, and the update approach always losing to the series of assignments. For example, for a simple class with 9 attributes: Series of assignments: Mean +- std dev: 1.31 us +- 0.06 us Dict replacement: Mean +- std dev: 1.04 us +- 0.04 us Dict update: Mean +- std dev: 1.67 us +- 0.06 us Nick's guard: 1.34 us +- 0.03 us Based on these numbers, I don't think the update approach and the guard approach are worth doing. The dict replacement approach is 30% faster though, so it's hard to ignore. The attrs generated __init__ was always a little special, for example it never calls super().__init__. Now we just need to figure out how common are the special cases you called out, and whether to make this vroom-vroom init opt-in or opt-out. Kind regards, Tin -------------- next part -------------- An HTML attachment was scrubbed... URL: From storchaka at gmail.com Sun Mar 25 12:38:14 2018 From: storchaka at gmail.com (Serhiy Storchaka) Date: Sun, 25 Mar 2018 19:38:14 +0300 Subject: [Python-Dev] Use more Argument Clinic Annotations? 
In-Reply-To: References: Message-ID: 25.03.18 15:36, Dave Halter ????: > I recently started testing Jedi with Python 3.7. Some tests broke. I > realized that one of the things that changed in 3.7 was the use of > argument clinic in methods like str.replace. > > The issue is that the text signature doesn't contain a return annotation. > >>>> str.replace.__text_signature__ > '($self, old, new, count=-1, /) > > > In Python < 3.7 there was a `S.replace(old, new[, count]) -> str` at > the top of the __doc__. T > > If the __text_signature__ was `'($self, old, new, count=-1, /) -> str` > a lot of tools would be able to have the information again. > > Is this intentional or was this just forgotten? I'd like to note that > this information is insanely helpful (at least for Jedi) to pick up > type information. I really hope this information can make it back into > 3.7, since it was there in earlier versions. Argument Clinic convertors don't have any relations with annotations. Annotations are not supported by Argument Clinic. From storchaka at gmail.com Sun Mar 25 12:51:42 2018 From: storchaka at gmail.com (Serhiy Storchaka) Date: Sun, 25 Mar 2018 19:51:42 +0300 Subject: [Python-Dev] Replacing self.__dict__ in __init__ In-Reply-To: References: Message-ID: 25.03.18 18:38, Tin Tvrtkovi? ????: > For example, for a simple class with 9 attributes: What are results for classes with 2 or 100 attributes? What are results in Python 3.5? I think you are playing on thin ice. Your results depend on implementation details of the bytecode (in particularly on adding BUILD_CONST_KEY_MAP in 3.6). Other implementations use different bytecode or don't use bytecode at all. CPython can introduce new opcodes in future versions which will change your result drastically. And the straightforward way could become the most fast. I suggest you to not worry and just wait for more general optimizations in CPython interpreter. 
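
For readers following the "Replacing self.__dict__ in __init__" thread, the guarded variant Nick described ("type(self) is __class__ and not self.__dict__") can be sketched roughly as below. This is only an illustration under stated assumptions: the classes Point and Tagged are hypothetical and are not taken from attrs or from any message in the thread, and no timing claims are attached to it.

```python
class Point:
    """Hypothetical class showing the guarded dict replacement."""

    def __init__(self, x, y):
        new_attrs = {'x': x, 'y': y}
        if type(self) is __class__ and not self.__dict__:
            # Direct instantiation and an empty instance dict: nothing
            # can be lost, so the dict is replaced wholesale.
            self.__dict__ = new_attrs
        else:
            # A subclass, metaclass, or __new__ may already have stored
            # entries, so merge into the existing dict instead.
            self.__dict__.update(new_attrs)


class Tagged(Point):
    """Hypothetical subclass whose __new__ pre-populates the instance."""

    def __new__(cls, *args):
        self = super().__new__(cls)
        self.tag = 'set in __new__'
        return self


p = Point(1, 2)    # takes the replacement branch
t = Tagged(1, 2)   # takes the update branch, so 'tag' survives
print(vars(p))
print(vars(t))
```

With an unconditional `self.__dict__ = new_attrs`, the `tag` entry written by `Tagged.__new__` would be discarded, which is the loss Nick warned about.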
From davidhalter88 at gmail.com Sun Mar 25 12:47:38 2018 From: davidhalter88 at gmail.com (Dave Halter) Date: Sun, 25 Mar 2018 18:47:38 +0200 Subject: [Python-Dev] Use more Argument Clinic Annotations? In-Reply-To: References: Message-ID: 2018-03-25 18:38 GMT+02:00 Serhiy Storchaka : > 25.03.18 15:36, Dave Halter ????: >> >> I recently started testing Jedi with Python 3.7. Some tests broke. I >> realized that one of the things that changed in 3.7 was the use of >> argument clinic in methods like str.replace. >> >> The issue is that the text signature doesn't contain a return annotation. >> >>>>> str.replace.__text_signature__ >> >> '($self, old, new, count=-1, /) >> >> >> In Python < 3.7 there was a `S.replace(old, new[, count]) -> str` at >> the top of the __doc__. T >> >> If the __text_signature__ was `'($self, old, new, count=-1, /) -> str` >> a lot of tools would be able to have the information again. >> >> Is this intentional or was this just forgotten? I'd like to note that >> this information is insanely helpful (at least for Jedi) to pick up >> type information. I really hope this information can make it back into >> 3.7, since it was there in earlier versions. > > > Argument Clinic convertors don't have any relations with annotations. > Annotations are not supported by Argument Clinic. Is there a way though in which the __text_signature__ could contain the information `-> str` or do we need to engineer that first? IMO it's just a small thing in which Python 3.7 got worse than 3.6 and I hope we can still fix that. Everything else looks great. From storchaka at gmail.com Sun Mar 25 12:58:15 2018 From: storchaka at gmail.com (Serhiy Storchaka) Date: Sun, 25 Mar 2018 19:58:15 +0300 Subject: [Python-Dev] Use more Argument Clinic Annotations? In-Reply-To: References: Message-ID: 25.03.18 19:47, Dave Halter ????: > Is there a way though in which the __text_signature__ could contain > the information `-> str` or do we need to engineer that first? 
There is no such way currently. From drsalists at gmail.com Sun Mar 25 14:38:58 2018 From: drsalists at gmail.com (Dan Stromberg) Date: Sun, 25 Mar 2018 11:38:58 -0700 Subject: [Python-Dev] Replacing self.__dict__ in __init__ In-Reply-To: References: Message-ID: On Sun, Mar 25, 2018 at 9:51 AM, Serhiy Storchaka wrote: > 25.03.18 18:38, Tin Tvrtkovi? ????: >> >> For example, for a simple class with 9 attributes: > What are results for classes with 2 or 100 attributes? What are results in > Python 3.5? > > I think you are playing on thin ice. Your results depend on implementation > details of the bytecode > > I suggest you to not worry and just wait for more general optimizations in > CPython interpreter. Indeed, sometimes strange code that was once faster, is made slower than the favored way by a future version of your favorite python interpreter. I hope there are more important things to worry about? From jelle.zijlstra at gmail.com Sun Mar 25 17:58:44 2018 From: jelle.zijlstra at gmail.com (Jelle Zijlstra) Date: Sun, 25 Mar 2018 14:58:44 -0700 Subject: [Python-Dev] Use more Argument Clinic Annotations? In-Reply-To: References: Message-ID: 2018-03-25 5:36 GMT-07:00 Dave Halter : > Hi Python Devs > > I recently started testing Jedi with Python 3.7. Some tests broke. I > realized that one of the things that changed in 3.7 was the use of > argument clinic in methods like str.replace. > > The issue is that the text signature doesn't contain a return annotation. > > >>> str.replace.__text_signature__ > '($self, old, new, count=-1, /) > > > In Python < 3.7 there was a `S.replace(old, new[, count]) -> str` at > the top of the __doc__. T > > If the __text_signature__ was `'($self, old, new, count=-1, /) -> str` > a lot of tools would be able to have the information again. > > Is this intentional or was this just forgotten? I'd like to note that > this information is insanely helpful (at least for Jedi) to pick up > type information. 
I really hope this information can make it back into > 3.7, since it was there in earlier versions. > > If you lack don't have time I might have some. Just give me some > instructions. > > Perhaps you should use https://github.com/python/typeshed/ to get type information? > ~ Dave > > > PS: Don't get me wrong, I love argument clinic/inspect.signature and > am generally in favor of using it everywhere. It helps tools like > jedi, pycharm and others get accurate information about a builtin > function. > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > jelle.zijlstra%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From larry at hastings.org Sun Mar 25 20:23:32 2018 From: larry at hastings.org (Larry Hastings) Date: Sun, 25 Mar 2018 17:23:32 -0700 Subject: [Python-Dev] Use more Argument Clinic Annotations? In-Reply-To: References: Message-ID: <92c96cbf-a07d-b209-33a1-9b0e2682a017@hastings.org> On 03/25/2018 09:58 AM, Serhiy Storchaka wrote: > 25.03.18 19:47, Dave Halter ????: >> Is there a way though in which the __text_signature__ could contain >> the information `-> str` or do we need to engineer that first? > > There is no such way currently. Are you sure?? I'm pretty sure Argument Clinic signatures support "return converters", which are emitted in the text signature as a return annotation.? See the section "Using a return converter" in Doc/howto/clinic.rst. What appears to be lacking is a "return converter" that handles strings.? I don't know how easy or hard it would be to write one--I haven't thought about Argument Clinic in years.? However, the DecodeFSDefault() returns a string type, with some extra implied constraints on the value I think.? So my guess is, it wouldn't be too hard to add a simple "str" return converter. 
Cheers,

//arry/

From songofacandy at gmail.com  Sun Mar 25 21:43:45 2018
From: songofacandy at gmail.com (INADA Naoki)
Date: Mon, 26 Mar 2018 10:43:45 +0900
Subject: [Python-Dev] Replacing self.__dict__ in __init__
In-Reply-To:
References:
Message-ID:

>
> The dict can be replaced during __init__() and still get benefits of
> key-sharing. That benefit is lost only when the instance dict keys are
> modified downstream from __init__(). So, from a dict size point of view,
> your optimization is fine.
>

I think replacing __dict__ loses key-sharing.

Python 3.6.4 (default, Mar  9 2018, 23:15:03)
[GCC 4.2.1 Compatible Apple LLVM 9.0.0 (clang-900.0.39.2)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> class C:
...     def __init__(self, a, b, c):
...         self.a, self.b, self.c = a, b, c
...
>>> class D:
...     def __init__(self, a, b, c):
...         self.__dict__ = {'a':a, 'b':b, 'c':c}
...
>>> import sys
>>> sys.getsizeof(C(1,2,3).__dict__)
112
>>> sys.getsizeof(D(1,2,3).__dict__)
240

--
INADA Naoki

From raymond.hettinger at gmail.com  Sun Mar 25 23:40:41 2018
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Sun, 25 Mar 2018 20:40:41 -0700
Subject: [Python-Dev] Replacing self.__dict__ in __init__
In-Reply-To:
References:
Message-ID:

On Mar 25, 2018, at 8:08 AM, Tin Tvrtković wrote:
>
> That's reassuring, thanks.

I misspoke. The object size is the same but the underlying dictionary
loses key-sharing and doubles in size.

Raymond

From tinchester at gmail.com  Mon Mar 26 05:19:50 2018
From: tinchester at gmail.com (Tin Tvrtković)
Date: Mon, 26 Mar 2018 09:19:50 +0000
Subject: [Python-Dev] Replacing self.__dict__ in __init__
In-Reply-To:
References:
Message-ID:

Thank you to everyone who participated (Kirill, Raymond, Nick, Naoki). I've
decided there are too many caveats for this approach to be worthwhile and
I'm giving up on it.

Kind regards,
Tin

On Sat, Mar 24, 2018 at 3:18 PM Tin Tvrtković wrote:

> Hi Python-dev,
>
> I'm one of the core attrs contributors, and I'm contemplating applying an
> optimization to our generated __init__s. Before someone warns me python-dev
> is for the development of the language itself, there are two reasons I'm
> posting this here:
>
> 1) it's a very low level question that I'd really like the input of the
> core devs on, and
> 2) maybe this will find its way into dataclasses if it works out.
>
> I've found that, if a class has more than one attribute, instead of
> creating an init like this:
>
>     self.a = a
>     self.b = b
>     self.c = c
>
> it's faster to do this:
>
>     self.__dict__ = {'a': a, 'b': b, 'c': c}
>
> i.e. to replace the instance dictionary altogether. On PyPy, their core
> devs inform me this is a bad idea because the instance dictionary is
> special there, so we won't be doing this on PyPy.
>
> But is it safe to do on CPython?
>
> To make the question simpler, disregard the possibility of custom setters
> on the attributes.
>
> Thanks in advance!
>

From eric at trueblade.com  Mon Mar 26 10:40:10 2018
From: eric at trueblade.com (Eric V. Smith)
Date: Mon, 26 Mar 2018 10:40:10 -0400
Subject: [Python-Dev] descriptor __set_name__ and dataclasses
Message-ID: <481b6c5b-d5e0-f11a-814b-a3382cecb0c0@trueblade.com>

https://bugs.python.org/issue33141 points out an interesting issue with
dataclasses and descriptors.

Given this code:

from dataclasses import *

class D:
    """A descriptor class that knows its name."""
    def __set_name__(self, owner, name):
        self.name = name
    def __get__(self, instance, owner):
        if instance is not None:
            return 1
        return self


@dataclass
class C:
    d: int = field(default=D(), init=False)

C.d.name is not set, because d.__set_name__ is never called. However, in
this case:

class X:
    d: int = D()

X.d.name is set to 'd' when d.__set_name__ is called during type.__new__.

The problem, of course, is that in the dataclass case, when class C is
initialized, and before the decorator is called, C.d is set to a Field()
object, not to D(). It's only when the dataclass decorator is run that I
change C.d from a Field to the value of D(). That means that the call to
d.__set_name__(C, 'd') is skipped. See
https://www.python.org/dev/peps/pep-0487/#implementation-details for
details on how type.__new__ works.

The only workaround I can think of is to emulate the part of PEP 487
where __set_name__ is called. I can do this from within the @dataclass
decorator when I'm initializing C.d. I'm not sure how great this solution
is, since it's moving the call from class creation time to class
decorator time. I think in 99+% of cases this would be fine, but you
could likely write code that depends on side effects of being called
during type.__new__.

Unless anyone has strong objections, I'm going to make the call to
__set_name__ in the @dataclass decorator. Since this is such a niche use
case, I don't feel strongly that it needs to be in today's beta release,
but if possible I'll get it in. I already have the patch written. And if
it does get in but the consensus is that it's a bad idea, we can back it
out.

Eric

From ncoghlan at gmail.com  Mon Mar 26 11:08:57 2018
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 27 Mar 2018 01:08:57 +1000
Subject: [Python-Dev] descriptor __set_name__ and dataclasses
In-Reply-To: <481b6c5b-d5e0-f11a-814b-a3382cecb0c0@trueblade.com>
References: <481b6c5b-d5e0-f11a-814b-a3382cecb0c0@trueblade.com>
Message-ID:

On 27 March 2018 at 00:40, Eric V. Smith wrote:
> https://bugs.python.org/issue33141 points out an interesting issue with
> dataclasses and descriptors.
> > Given this code: > > from dataclasses import * > > class D: > """A descriptor class that knows its name.""" > def __set_name__(self, owner, name): > self.name = name > def __get__(self, instance, owner): > if instance is not None: > return 1 > return self > > > @dataclass > class C: > d: int = field(default=D(), init=False) > > C.d.name is not set, because d.__set_name__ is never called. However, in > this case: > > class X: > d: int = D() > > X.d.name is set to 'd' when d.__set_name__ is called during type.__new__. > > The problem of course, is that in the dataclass case, when class C is > initialized, and before the decorator is called, C.d is set to a Field() > object, not to D(). It's only when the dataclass decorator is run that I > change C.d from a Field to the value of D(). That means that the call to > d.__set_name__(C, 'd') is skipped. See > https://www.python.org/dev/peps/pep-0487/#implementation-details for details > on how type.__new__ works. > > The only workaround I can think of is to emulate the part of PEP 487 where > __set_name__ is called. I can do this from within the @dataclass decorator > when I'm initializing C.d. I'm not sure how great this solution is, since > it's moving the call from class creation time to class decorator time. I > think in 99+% of cases this would be fine, but you could likely write code > that depends on side effects of being called during type.__new__. > > Unless anyone has strong objections, I'm going to make the call to > __set_name__ in the @datacalss decorator. Since this is such a niche use > case, I don't feel strongly that it needs to be in today's beta release, but > if possible I'll get it in. I already have the patch written. And if it does > get in but the consensus is that it's a bad idea, we can back it out. Would it be feasible to define `Field.__set_name__`, and have that call `default.__set_name__` when the latter exists, and be a no-op otherwise? Cheers, Nick. 
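
The forwarding idea in the question above can be sketched roughly as follows. This is a minimal illustration and not the dataclasses implementation: `Field` here is a hypothetical stand-in that only stores a default value, and `C` is a plain class rather than a dataclass, so the sketch shows only how type.__new__ reaches the wrapped descriptor.

```python
class Field:
    """Hypothetical stand-in for dataclasses.Field; stores a default only."""

    def __init__(self, default):
        self.default = default

    def __set_name__(self, owner, name):
        # Look the hook up on the type of the default, the way the
        # interpreter looks up special methods, and forward the call.
        # If the default has no __set_name__, this is a no-op.
        func = getattr(type(self.default), '__set_name__', None)
        if func is not None:
            func(self.default, owner, name)


class D:
    """A descriptor class that knows its name."""

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, instance, owner):
        if instance is not None:
            return 1
        return self


class C:
    d = Field(D())   # type.__new__ calls Field.__set_name__(C, 'd')
    n = Field(0)     # an int default has no __set_name__, so nothing happens


print(C.__dict__['d'].default.name)
```

Because the call still happens inside type.__new__, the wrapped descriptor learns its name at class creation time rather than at decorator time.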
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From eric at trueblade.com Mon Mar 26 11:10:25 2018 From: eric at trueblade.com (Eric V. Smith) Date: Mon, 26 Mar 2018 11:10:25 -0400 Subject: [Python-Dev] descriptor __set_name__ and dataclasses In-Reply-To: References: <481b6c5b-d5e0-f11a-814b-a3382cecb0c0@trueblade.com> Message-ID: On 3/26/18 11:08 AM, Nick Coghlan wrote: > On 27 March 2018 at 00:40, Eric V. Smith wrote: >> https://bugs.python.org/issue33141 points out an interesting issue with >> dataclasses and descriptors. >> >> Given this code: >> >> from dataclasses import * >> >> class D: >> """A descriptor class that knows its name.""" >> def __set_name__(self, owner, name): >> self.name = name >> def __get__(self, instance, owner): >> if instance is not None: >> return 1 >> return self >> >> >> @dataclass >> class C: >> d: int = field(default=D(), init=False) >> >> C.d.name is not set, because d.__set_name__ is never called. However, in >> this case: >> >> class X: >> d: int = D() >> >> X.d.name is set to 'd' when d.__set_name__ is called during type.__new__. >> >> The problem of course, is that in the dataclass case, when class C is >> initialized, and before the decorator is called, C.d is set to a Field() >> object, not to D(). It's only when the dataclass decorator is run that I >> change C.d from a Field to the value of D(). That means that the call to >> d.__set_name__(C, 'd') is skipped. See >> https://www.python.org/dev/peps/pep-0487/#implementation-details for details >> on how type.__new__ works. >> >> The only workaround I can think of is to emulate the part of PEP 487 where >> __set_name__ is called. I can do this from within the @dataclass decorator >> when I'm initializing C.d. I'm not sure how great this solution is, since >> it's moving the call from class creation time to class decorator time. 
I >> think in 99+% of cases this would be fine, but you could likely write code >> that depends on side effects of being called during type.__new__. >> >> Unless anyone has strong objections, I'm going to make the call to >> __set_name__ in the @datacalss decorator. Since this is such a niche use >> case, I don't feel strongly that it needs to be in today's beta release, but >> if possible I'll get it in. I already have the patch written. And if it does >> get in but the consensus is that it's a bad idea, we can back it out. > > Would it be feasible to define `Field.__set_name__`, and have that > call `default.__set_name__` when the latter exists, and be a no-op > otherwise? A clever idea! I'll look in to it. Eric. From eric at trueblade.com Mon Mar 26 11:17:33 2018 From: eric at trueblade.com (Eric V. Smith) Date: Mon, 26 Mar 2018 11:17:33 -0400 Subject: [Python-Dev] descriptor __set_name__ and dataclasses In-Reply-To: References: <481b6c5b-d5e0-f11a-814b-a3382cecb0c0@trueblade.com> Message-ID: <1e31ee8a-fa14-a360-4747-1506f78ecba3@trueblade.com> On 3/26/18 11:10 AM, Eric V. Smith wrote: > On 3/26/18 11:08 AM, Nick Coghlan wrote: >> On 27 March 2018 at 00:40, Eric V. Smith wrote: >> Would it be feasible to define `Field.__set_name__`, and have that >> call `default.__set_name__` when the latter exists, and be a no-op >> otherwise? > > A clever idea! I'll look in to it. It looks like that does work. Thank, Nick! Eric From ncoghlan at gmail.com Tue Mar 27 10:20:39 2018 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 28 Mar 2018 00:20:39 +1000 Subject: [Python-Dev] descriptor __set_name__ and dataclasses In-Reply-To: <1e31ee8a-fa14-a360-4747-1506f78ecba3@trueblade.com> References: <481b6c5b-d5e0-f11a-814b-a3382cecb0c0@trueblade.com> <1e31ee8a-fa14-a360-4747-1506f78ecba3@trueblade.com> Message-ID: On 27 March 2018 at 01:17, Eric V. Smith wrote: > On 3/26/18 11:10 AM, Eric V. 
Smith wrote:
>>
>> On 3/26/18 11:08 AM, Nick Coghlan wrote:
>>>
>>> On 27 March 2018 at 00:40, Eric V. Smith wrote:
>
>
>>> Would it be feasible to define `Field.__set_name__`, and have that
>>> call `default.__set_name__` when the latter exists, and be a no-op
>>> otherwise?
>>
>>
>> A clever idea! I'll look into it.
>
> It looks like that does work. Thanks, Nick!

Cool!

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From chris.barker at noaa.gov  Tue Mar 27 15:26:59 2018
From: chris.barker at noaa.gov (Chris Barker)
Date: Tue, 27 Mar 2018 12:26:59 -0700
Subject: [Python-Dev] Symmetry arguments for API expansion
In-Reply-To:
References: <46E16295-E5B0-4808-B3AC-34241DE50926@gmail.com> <5AA84F73.3030409@canterbury.ac.nz> <20180321043429.GP16661@ando.pearwood.info>
Message-ID:

I know this is all done, but for completeness' sake:

I just noticed math.trunc() and __trunc__().

So wouldn't the "correct" way to check for an integral value be something
like:

obj.__trunc__() == obj

I don't think this has any bearing on adding is_integer() methods to
numeric objects, but might if we wanted to add a generic is_integer()
function somewhere.

In any case, I don't recall it being mentioned in the conversation, so
thought I'd complete the record.

-CHB

On Wed, Mar 21, 2018 at 8:31 PM Guido van Rossum wrote:

> On Wed, Mar 21, 2018 at 6:48 PM, Chris Barker wrote:
>
>> On Wed, Mar 21, 2018 at 4:12 PM, Guido van Rossum wrote:
>>
>>> Thank you! As you may or may not have noticed in a different thread,
>>> we're going through a small existential crisis regarding the usefulness of
>>> is_integer() -- Serhiy believes it is not useful (and even an attractive
>>> nuisance) and should be deprecated.
>>> OTOH the existence of
>>> dec_mpd_isinteger() seems to validate to me that it actually exposes useful
>>> functionality (and every Python feature can be abused, so that alone should
>>> not
>>>
>> )
>

From rob at sixty-north.com  Tue Mar 27 16:16:25 2018
From: rob at sixty-north.com (Robert Smallshire)
Date: Tue, 27 Mar 2018 20:16:25 +0000
Subject: [Python-Dev] Symmetry arguments for API expansion
In-Reply-To:
References: <46E16295-E5B0-4808-B3AC-34241DE50926@gmail.com> <5AA84F73.3030409@canterbury.ac.nz> <20180321043429.GP16661@ando.pearwood.info>
Message-ID:

In the PR I've submitted, that's essentially what I'm doing for the default
Real.is_integer() implementation. The details differ slightly, in that I
rely on the int() constructor to call __trunc__(), rather than introduce a
new dependency on the math module.

On Tue, 27 Mar 2018 at 21:29, Chris Barker wrote:

> I know this is all done, but for completeness' sake:
>
> I just noticed math.trunc() and __trunc__().
>
> So wouldn't the "correct" way to check for an integral value be something
> like:
>
> obj.__trunc__() == obj
>
> I don't think this has any bearing on adding is_integer() methods to
> numeric objects, but might if we wanted to add a generic is_integer()
> function somewhere.
>
> In any case, I don't recall it being mentioned in the conversation, so
> thought I'd complete the record.
>
> -CHB
>
> On Wed, Mar 21, 2018 at 8:31 PM Guido van Rossum wrote:
>
>> On Wed, Mar 21, 2018 at 6:48 PM, Chris Barker wrote:
>>
>>> On Wed, Mar 21, 2018 at 4:12 PM, Guido van Rossum wrote:
>>>
>>>> Thank you! As you may or may not have noticed in a different thread,
>>>> we're going through a small existential crisis regarding the usefulness of
>>>> is_integer() -- Serhiy believes it is not useful (and even an attractive
>>>> nuisance) and should be deprecated.
>>>> OTOH the existence of
>>>> dec_mpd_isinteger() seems to validate to me that it actually exposes useful
>>>> functionality (and every Python feature can be abused, so that alone should
>>>> not
>>>>
>>> )
>>
>
--
*Robert Smallshire | *Managing Director
*Sixty North* | Applications | Consulting | Training
rob at sixty-north.com | T +47 63 01 04 44 | M +47 924 30 350
http://sixty-north.com

From storchaka at gmail.com  Wed Mar 28 11:27:19 2018
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Wed, 28 Mar 2018 18:27:19 +0300
Subject: [Python-Dev] Subtle difference between f-strings and str.format()
Message-ID:

There is a subtle semantic difference between str.format() and the
"equivalent" f-string.

    '{}{}'.format(a, b)
    f'{a}{b}'

In the former case b is evaluated before formatting a. This is
equivalent to

    t1 = a
    t2 = b
    t3 = format(t1)
    t4 = format(t2)
    r = t3 + t4

In the latter case a is formatted before evaluating b. This is
equivalent to

    t1 = a
    t2 = format(t1)
    t3 = b
    t4 = format(t3)
    r = t2 + t4

In most cases this doesn't matter, but when implementing the optimization
that transforms the former expression into the latter one ([1], [2]) we
have to decide what to do with this difference.

1. Keep the exact semantics of str.format() when optimizing it. This
means that it should be transformed into an AST node different from the
AST node used for f-strings. Either introduce a new AST node type, or add
a boolean flag to JoinedStr.

2. Change the semantics of f-strings. Make them closer to the semantics
of str.format(): evaluate all subexpressions first, then format them.
This can be implemented in two ways:

2a) Add additional instructions for stack manipulations. This will slow
down f-strings.

2b) Introduce a new complex opcode that will replace FORMAT_VALUE and
BUILD_STRING. This will speed up f-strings.

3. Transform str.format() into an f-string, changing the semantics, and
ignore this change. This is not new. The optimizer already changes
semantics. Non-optimized "if a and True:" would call bool(a) twice, but
optimized code calls it only once.

[1] https://bugs.python.org/issue28307
[2] https://bugs.python.org/issue28308

From guido at python.org  Wed Mar 28 12:20:22 2018
From: guido at python.org (Guido van Rossum)
Date: Wed, 28 Mar 2018 09:20:22 -0700
Subject: [Python-Dev] Subtle difference between f-strings and str.format()
In-Reply-To:
References:
Message-ID:

Hm, without thinking too much about it I'd say it's okay to change the
evaluation order.

Can these optimizations be disabled with something like -O0?

On Wed, Mar 28, 2018 at 8:27 AM, Serhiy Storchaka wrote:

> There is a subtle semantic difference between str.format() and the
> "equivalent" f-string.
>
>     '{}{}'.format(a, b)
>     f'{a}{b}'
>
> In the former case b is evaluated before formatting a. This is
> equivalent to
>
>     t1 = a
>     t2 = b
>     t3 = format(t1)
>     t4 = format(t2)
>     r = t3 + t4
>
> In the latter case a is formatted before evaluating b. This is
> equivalent to
>
>     t1 = a
>     t2 = format(t1)
>     t3 = b
>     t4 = format(t3)
>     r = t2 + t4
>
> In most cases this doesn't matter, but when implementing the optimization
> that transforms the former expression into the latter one ([1], [2]) we
> have to decide what to do with this difference.
>
> 1. Keep the exact semantics of str.format() when optimizing it. This
> means that it should be transformed into an AST node different from the
> AST node used for f-strings. Either introduce a new AST node type, or add
> a boolean flag to JoinedStr.
>
> 2. Change the semantics of f-strings. Make them closer to the semantics
> of str.format(): evaluate all subexpressions first, then format them.
This can > be implemented in two ways: > > 2a) Add additional instructions for stack manipulations. This will slow > down f-strings. > > 2b) Introduce a new complex opcode that will replace FORMAT_VALUE and > BUILD_STRING. This will speed up f-strings. > > 3. Transform str.format() into an f-string with changing semantic, and > ignore this change. This is not new. The optimizer already changes > semantic. Non-optimized "if a and True:" would call bool(a) twice, but > optimized code calls it only once. > > [1] https://bugs.python.org/issue28307 > [2] https://bugs.python.org/issue28308 > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/guido% > 40python.org > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Wed Mar 28 14:26:01 2018 From: tim.peters at gmail.com (Tim Peters) Date: Wed, 28 Mar 2018 13:26:01 -0500 Subject: [Python-Dev] Subtle difference between f-strings and str.format() In-Reply-To: References: Message-ID: [Serhiy Storchaka ] > ... > This is not new. The optimizer already changes semantic. > Non-optimized "if a and True:" would call bool(a) twice, but optimized code > calls it only once. I have a hard time imaging how that could have come to be, but if it's true I'd say the unoptimized code was plain wrong. The dumbest possible way to implement `f() and g()` is also the correct ;-) way: result = f() if not bool(result): result = g() For the thing you really care about here, the language guarantees `a` will be evaluated before `b` in: '{}{}'.format(a, b) but I'm not sure it says anything about how the format operations are interleaved. 
So your proposed transformation is fine by me (your #3: still evaluate `a` before `b` but ignore that the format operations may occur in a different order with respect to those). From tim.peters at gmail.com Wed Mar 28 14:30:04 2018 From: tim.peters at gmail.com (Tim Peters) Date: Wed, 28 Mar 2018 13:30:04 -0500 Subject: [Python-Dev] Subtle difference between f-strings and str.format() In-Reply-To: References: Message-ID: [Tim] > I have a hard time imaging how that could have come to be, but if it's > true I'd say the unoptimized code was plain wrong. The dumbest > possible way to implement `f() and g()` is also the correct ;-) way: > > result = f() > if not bool(result): > result = g() Heh - that's entirely wrong, isn't it? That's how `or` is implemented ;-) Same top-level point, though: result = f() if bool(result): result = g() From solipsis at pitrou.net Wed Mar 28 14:39:51 2018 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 28 Mar 2018 20:39:51 +0200 Subject: [Python-Dev] PEP 574 -- Pickle protocol 5 with out-of-band data Message-ID: <20180328203951.3488ca41@fsol> Hi, I'd like to submit this PEP for discussion. It is quite specialized and the main target audience of the proposed changes is users and authors of applications/libraries transferring large amounts of data (read: the scientific computing & data science ecosystems). https://www.python.org/dev/peps/pep-0574/ The PEP text is also inlined below. Regards Antoine. PEP: 574 Title: Pickle protocol 5 with out-of-band data Version: $Revision$ Last-Modified: $Date$ Author: Antoine Pitrou Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 23-Mar-2018 Post-History: Resolution: Abstract ======== This PEP proposes to standardize a new pickle protocol version, and accompanying APIs to take full advantage of it: 1. A new pickle protocol version (5) to cover the extra metadata needed for out-of-band data buffers. 2. 
A new ``PickleBuffer`` type for ``__reduce_ex__`` implementations to return out-of-band data buffers. 3. A new ``buffer_callback`` parameter when pickling, to handle out-of-band data buffers. 4. A new ``buffers`` parameter when unpickling to provide out-of-band data buffers. The PEP guarantees unchanged behaviour for anyone not using the new APIs. Rationale ========= The pickle protocol was originally designed in 1995 for on-disk persistency of arbitrary Python objects. The performance of a 1995-era storage medium probably made it irrelevant to focus on performance metrics such as use of RAM bandwidth when copying temporary data before writing it to disk. Nowadays the pickle protocol sees a growing use in applications where most of the data isn't ever persisted to disk (or, when it is, it uses a portable format instead of Python-specific). Instead, pickle is being used to transmit data and commands from one process to another, either on the same machine or on multiple machines. Those applications will sometimes deal with very large data (such as Numpy arrays or Pandas dataframes) that need to be transferred around. For those applications, pickle is currently wasteful as it imposes spurious memory copies of the data being serialized. As a matter of fact, the standard ``multiprocessing`` module uses pickle for serialization, and therefore also suffers from this problem when sending large data to another process. Third-party Python libraries, such as Dask [#dask]_, PyArrow [#pyarrow]_ and IPyParallel [#ipyparallel]_, have started implementing alternative serialization schemes with the explicit goal of avoiding copies on large data. Implementing a new serialization scheme is difficult and often leads to reduced generality (since many Python objects support pickle but not the new serialization scheme). Falling back on pickle for unsupported types is an option, but then you get back the spurious memory copies you wanted to avoid in the first place. 
For example, ``dask`` is able to avoid memory copies for Numpy arrays and built-in containers thereof (such as lists or dicts containing Numpy arrays), but if a large Numpy array is an attribute of a user-defined object, ``dask`` will serialize the user-defined object as a pickle stream, leading to memory copies. The common theme of these third-party serialization efforts is to generate a stream of object metadata (which contains pickle-like information about the objects being serialized) and a separate stream of zero-copy buffer objects for the payloads of large objects. Note that, in this scheme, small objects such as ints, etc. can be dumped together with the metadata stream. Refinements can include opportunistic compression of large data depending on its type and layout, like ``dask`` does. This PEP aims to make ``pickle`` usable in a way where large data is handled as a separate stream of zero-copy buffers, letting the application handle those buffers optimally. Example ======= To keep the example simple and avoid requiring knowledge of third-party libraries, we will focus here on a bytearray object (but the issue is conceptually the same with more sophisticated objects such as Numpy arrays). Like most objects, the bytearray object isn't immediately understood by the pickle module and must therefore specify its decomposition scheme. 
Here is how a bytearray object currently decomposes for pickling:: >>> b.__reduce_ex__(4) (, (b'abc',), None) This is because the ``bytearray.__reduce_ex__`` implementation reads morally as follows:: class bytearray: def __reduce_ex__(self, protocol): if protocol == 4: return type(self), bytes(self), None # Legacy code for earlier protocols omitted In turn it produces the following pickle code:: >>> pickletools.dis(pickletools.optimize(pickle.dumps(b, protocol=4))) 0: \x80 PROTO 4 2: \x95 FRAME 30 11: \x8c SHORT_BINUNICODE 'builtins' 21: \x8c SHORT_BINUNICODE 'bytearray' 32: \x93 STACK_GLOBAL 33: C SHORT_BINBYTES b'abc' 38: \x85 TUPLE1 39: R REDUCE 40: . STOP (the call to ``pickletools.optimize`` above is only meant to make the pickle stream more readable by removing the MEMOIZE opcodes) We can notice several things about the bytearray's payload (the sequence of bytes ``b'abc'``): * ``bytearray.__reduce_ex__`` produces a first copy by instantiating a new bytes object from the bytearray's data. * ``pickle.dumps`` produces a second copy when inserting the contents of that bytes object into the pickle stream, after the SHORT_BINBYTES opcode. * Furthermore, when deserializing the pickle stream, a temporary bytes object is created when the SHORT_BINBYTES opcode is encountered (inducing a data copy). What we really want is something like the following: * ``bytearray.__reduce_ex__`` produces a *view* of the bytearray's data. * ``pickle.dumps`` doesn't try to copy that data into the pickle stream but instead passes the buffer view to its caller (which can decide on the most efficient handling of that buffer). * When deserializing, ``pickle.loads`` takes the pickle stream and the buffer view separately, and passes the buffer view directly to the bytearray constructor. We see that several conditions are required for the above to work: * ``__reduce__`` or ``__reduce_ex__`` must be able to return *something* that indicates a serializable no-copy buffer view. 
* The pickle protocol must be able to represent references to such buffer views, instructing the unpickler that it may have to get the actual buffer out of band. * The ``pickle.Pickler`` API must provide its caller with a way to receive such buffer views while serializing. * The ``pickle.Unpickler`` API must similarly allow its caller to provide the buffer views required for deserialization. * For compatibility, the pickle protocol must also be able to contain direct serializations of such buffer views, such that current uses of the ``pickle`` API don't have to be modified if they are not concerned with memory copies. Producer API ============ We are introducing a new type ``pickle.PickleBuffer`` which can be instantiated from any buffer-supporting object, and is specifically meant to be returned from ``__reduce__`` implementations:: class bytearray: def __reduce_ex__(self, protocol): if protocol == 5: return type(self), PickleBuffer(self), None # Legacy code for earlier protocols omitted ``PickleBuffer`` is a simple wrapper that doesn't have all the memoryview semantics and functionality, but is specifically recognized by the ``pickle`` module if protocol 5 or higher is enabled. It is an error to try to serialize a ``PickleBuffer`` with pickle protocol version 4 or earlier. Only the raw *data* of the ``PickleBuffer`` will be considered by the ``pickle`` module. Any type-specific *metadata* (such as shapes or datatype) must be returned separately by the type's ``__reduce__`` implementation, as is already the case. PickleBuffer objects -------------------- The ``PickleBuffer`` class supports a very simple Python API. Its constructor takes a single PEP 3118-compatible object [#pep-3118]_. ``PickleBuffer`` objects themselves support the buffer protocol, so consumers can call ``memoryview(...)`` on them to get additional information about the underlying buffer (such as the original type, shape, etc.). 
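As a rough illustration of the paragraph above, the following sketch wraps a buffer-supporting object and inspects it through ``memoryview``. It assumes the ``pickle.PickleBuffer`` class as it later shipped in Python 3.8; the helper name ``describe_buffer`` is invented for this example and is not part of the proposal.

```python
import pickle


def describe_buffer(obj):
    """Wrap *obj* in a PickleBuffer and report basic facts about its buffer.

    Hypothetical helper for illustration; requires Python 3.8 or later.
    """
    wrapped = pickle.PickleBuffer(obj)
    # PickleBuffer supports the buffer protocol itself, so a consumer can
    # take a memoryview of it to learn the size, mutability and item format.
    with memoryview(wrapped) as view:
        return view.nbytes, view.readonly, view.format


# A bytearray exports a writable buffer, a bytes object a readonly one.
writable_info = describe_buffer(bytearray(b"abc"))
readonly_info = describe_buffer(b"abc")
```

The readonly flag is what a consumer would look at when deciding whether a mutable buffer has to be recreated on the receiving side.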
On the C side, a simple API will be provided to create and inspect PickleBuffer objects: ``PyObject *PyPickleBuffer_FromObject(PyObject *obj)`` Create a ``PickleBuffer`` object holding a view over the PEP 3118-compatible *obj*. ``PyPickleBuffer_Check(PyObject *obj)`` Return whether *obj* is a ``PickleBuffer`` instance. ``const Py_buffer *PyPickleBuffer_GetBuffer(PyObject *picklebuf)`` Return a pointer to the internal ``Py_buffer`` owned by the ``PickleBuffer`` instance. ``PickleBuffer`` can wrap any kind of buffer, including non-contiguous buffers. It's up to consumers to decide how best to handle different kinds of buffers (for example, some consumers may find it acceptable to make a contiguous copy of non-contiguous buffers). Consumer API ============ ``pickle.Pickler.__init__`` and ``pickle.dumps`` are augmented with an additional ``buffer_callback`` parameter:: class Pickler: def __init__(self, file, protocol=None, ..., buffer_callback=None): """ If *buffer_callback* is not None, then it is called with a list of out-of-band buffer views when deemed necessary (this could be once every buffer, or only after a certain size is reached, or once at the end, depending on implementation details). The callback should arrange to store or transmit those buffers without changing their order. If *buffer_callback* is None (the default), buffer views are serialized into *file* as part of the pickle stream. It is an error if *buffer_callback* is not None and *protocol* is None or smaller than 5. """ def pickle.dumps(obj, protocol=None, *, ..., buffer_callback=None): """ See above for *buffer_callback*. """ ``pickle.Unpickler.__init__`` and ``pickle.loads`` are augmented with an additional ``buffers`` parameter:: class Unpickler: def __init__(file, *, ..., buffers=None): """ If *buffers* is not None, it should be an iterable of buffer-enabled objects that is consumed each time the pickle stream references an out-of-band buffer view. 
Such buffers have been given in order to the *buffer_callback* of a Pickler object. If *buffers* is None (the default), then the buffers are taken from the pickle stream, assuming they are serialized there. It is an error for *buffers* to be None if the pickle stream was produced with a non-None *buffer_callback*. """ def pickle.loads(data, *, ..., buffers=None): """ See above for *buffers*. """ Protocol changes ================ Three new opcodes are introduced: * ``BYTEARRAY`` creates a bytearray from the data following it in the pickle stream and pushes it on the stack (just like ``BINBYTES8`` does for bytes objects); * ``NEXT_BUFFER`` fetches a buffer from the ``buffers`` iterable and pushes it on the stack. * ``READONLY_BUFFER`` makes a readonly view of the top of the stack. When pickling encounters a ``PickleBuffer``, there can be four cases: * If a ``buffer_callback`` is given and the ``PickleBuffer`` is writable, the ``PickleBuffer`` is given to the callback and a ``NEXT_BUFFER`` opcode is appended to the pickle stream. * If a ``buffer_callback`` is given and the ``PickleBuffer`` is readonly, the ``PickleBuffer`` is given to the callback and a ``NEXT_BUFFER`` opcode is appended to the pickle stream, followed by a ``READONLY_BUFFER`` opcode. * If no ``buffer_callback`` is given and the ``PickleBuffer`` is writable, it is serialized into the pickle stream as if it were a ``bytearray`` object. * If no ``buffer_callback`` is given and the ``PickleBuffer`` is readonly, it is serialized into the pickle stream as if it were a ``bytes`` object. The distinction between readonly and writable buffers is explained below (see "Mutability"). Caveats ======= Mutability ---------- PEP 3118 buffers [#pep-3118]_ can be readonly or writable. Some objects, such as Numpy arrays, need to be backed by a mutable buffer for full operation. Pickle consumers that use the ``buffer_callback`` and ``buffers`` arguments will have to be careful to recreate mutable buffers. 
When doing I/O, this implies using buffer-passing API variants such as ``readinto`` (which are also often preferrable for performance). Data sharing ------------ If you pickle and then unpickle an object in the same process, passing out-of-band buffer views, then the unpickled object may be backed by the same buffer as the original pickled object. For example, it might be reasonable to implement reduction of a Numpy array as follows (crucial metadata such as shapes is omitted for simplicity):: class ndarray: def __reduce_ex__(self, protocol): if protocol == 5: return numpy.frombuffer, (PickleBuffer(self), self.dtype) # Legacy code for earlier protocols omitted Then simply passing the PickleBuffer around from ``dumps`` to ``loads`` will produce a new Numpy array sharing the same underlying memory as the original Numpy object (and, incidentally, keeping it alive):: >>> import numpy as np >>> a = np.zeros(10) >>> a[0] 0.0 >>> buffers = [] >>> data = pickle.dumps(a, protocol=5, buffer_callback=buffers.extend) >>> b = pickle.loads(data, buffers=buffers) >>> b[0] = 42 >>> a[0] 42.0 This won't happen with the traditional ``pickle`` API (i.e. without passing ``buffers`` and ``buffer_callback`` parameters), because then the buffer view is serialized inside the pickle stream with a copy. Alternatives ============ The ``pickle`` persistence interface is a way of storing references to designated objects in the pickle stream while handling their actual serialization out of band. 
For example, one might consider the following for zero-copy serialization of bytearrays:: class MyPickle(pickle.Pickler): def __init__(self, *args, **kwargs): super().__init__(*args, **kwargs) self.buffers = [] def persistent_id(self, obj): if type(obj) is not bytearray: return None else: index = len(self.buffers) self.buffers.append(obj) return ('bytearray', index) class MyUnpickle(pickle.Unpickler): def __init__(self, *args, buffers, **kwargs): super().__init__(*args, **kwargs) self.buffers = buffers def persistent_load(self, pid): type_tag, index = pid if type_tag == 'bytearray': return self.buffers[index] else: assert 0 # unexpected type This mechanism has two drawbacks: * Each ``pickle`` consumer must reimplement ``Pickler`` and ``Unpickler`` subclasses, with custom code for each type of interest. Essentially, N pickle consumers end up each implementing custom code for M producers. This is difficult (especially for sophisticated types such as Numpy arrays) and poorly scalable. * Each object encountered by the pickle module (even simple built-in objects such as ints and strings) triggers a call to the user's ``persistent_id()`` method, leading to a possible performance drop compared to nominal. Open questions ============== Should ``buffer_callback`` take a single buffers or a sequence of buffers? * Taking a single buffer would allow returning a boolean indicating whether the given buffer is serialized in-band or out-of-band. * Taking a sequence of buffers is potentially more efficient by reducing function call overhead. Related work ============ Dask.distributed implements a custom zero-copy serialization with fallback to pickle [#dask-serialization]_. PyArrow implements zero-copy component-based serialization for a few selected types [#pyarrow-serialization]_. PEP 554 proposes hosting multiple interpreters in a single process, with provisions for transferring buffers between interpreters as a communication scheme [#pep-554]_. 
Acknowledgements
================

Thanks to the following people for early feedback: Nick Coghlan, Olivier Grisel, Stefan Krah, MinRK, Matt Rocklin, Eric Snow.


References
==========

.. [#dask] Dask.distributed -- A lightweight library for distributed computing in Python
   https://distributed.readthedocs.io/

.. [#dask-serialization] Dask.distributed custom serialization
   https://distributed.readthedocs.io/en/latest/serialization.html

.. [#ipyparallel] IPyParallel -- Using IPython for parallel computing
   https://ipyparallel.readthedocs.io/

.. [#pyarrow] PyArrow -- A cross-language development platform for in-memory data
   https://arrow.apache.org/docs/python/

.. [#pyarrow-serialization] PyArrow IPC and component-based serialization
   https://arrow.apache.org/docs/python/ipc.html#component-based-serialization

.. [#pep-3118] PEP 3118 -- Revising the buffer protocol
   https://www.python.org/dev/peps/pep-3118/

.. [#pep-554] PEP 554 -- Multiple Interpreters in the Stdlib
   https://www.python.org/dev/peps/pep-0554/


Copyright
=========

This document has been placed into the public domain.

From storchaka at gmail.com Wed Mar 28 14:44:18 2018
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Wed, 28 Mar 2018 21:44:18 +0300
Subject: [Python-Dev] Subtle difference between f-strings and str.format()
In-Reply-To:
References:
Message-ID:

28.03.18 19:20, Guido van Rossum wrote:
> Hm, without thinking too much about it I'd say it's okay to change the
> evaluation order.

Do you mean the option 3, right? This is the simplest option. I have already written a PR for optimizing old-style formatting [1], but have not merged it yet due to this change of semantic.

> Can these optimizations be disabled with something like -O0?

Currently there is no way to disable optimizations. There is an open issue with a request for this.
[2]

[1] https://github.com/python/cpython/pull/5012
[2] https://bugs.python.org/issue2506

From tim.peters at gmail.com Wed Mar 28 14:54:20 2018
From: tim.peters at gmail.com (Tim Peters)
Date: Wed, 28 Mar 2018 13:54:20 -0500
Subject: [Python-Dev] Subtle difference between f-strings and str.format()
In-Reply-To:
References:
Message-ID:

[Tim]
> Same top-level point, though: [for evaluating `f() and g()`]:
>
>     result = f()
>     if bool(result):
>         result = g()

Ah, I think I see your point now. In the _context_ of `if f() and g()`, the dumbest possible code generation would do the above, and then go on to do

    if bool(result):
        ....

If in fact `f()` returned a false-like value, an optimizer could note that `bool(result)` had already been evaluated and skip the redundant evaluation. I think that's fine either way: what the language guarantees is that `f()` will be evaluated exactly once, and `g()` no more than once, and that's all so regardless.

From storchaka at gmail.com Wed Mar 28 14:54:51 2018
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Wed, 28 Mar 2018 21:54:51 +0300
Subject: [Python-Dev] Subtle difference between f-strings and str.format()
In-Reply-To:
References:
Message-ID: <7facd1f2-250e-a7dc-2417-457b055423cb@gmail.com>

28.03.18 21:30, Tim Peters wrote:
> [Tim]
>> I have a hard time imaging how that could have come to be, but if it's
>> true I'd say the unoptimized code was plain wrong. The dumbest
>> possible way to implement `f() and g()` is also the correct ;-) way:
>>
>>     result = f()
>>     if not bool(result):
>>         result = g()
> Heh - that's entirely wrong, isn't it? That's how `or` is implemented ;-)
>
> Same top-level point, though:
>
>     result = f()
>     if bool(result):
>         result = g()

Optimized

    if f() and g():
        spam()

is equivalent to

    result = f()
    if bool(result):
        result = g()
        if bool(result):
            spam()

Without optimization it would be equivalent to

    result = f()
    if bool(result):
        result = g()
    if bool(result):
        spam()

It calls bool() for the result of f() twice if it is false. Thus there is a small difference between

    if f() and g():
        spam()

and

    tmp = f() and g()
    if tmp:
        spam()

From guido at python.org Wed Mar 28 15:04:31 2018
From: guido at python.org (Guido van Rossum)
Date: Wed, 28 Mar 2018 19:04:31 +0000
Subject: [Python-Dev] Subtle difference between f-strings and str.format()
In-Reply-To:
References:
Message-ID:

Yes, #3, and what Tim says.

On Wed, Mar 28, 2018, 11:44 Serhiy Storchaka wrote:

> 28.03.18 19:20, Guido van Rossum wrote:
>
> > Hm, without thinking too much about it I'd say it's okay to change the
> > evaluation order.
>
> Do you mean the option 3, right? This is the simplest option. I have
> already written a PR for optimizing old-style formatting [1], but have not
> merged it yet due to this change of semantic.
>
> > Can these optimizations be disabled with something like -O0?
>
> Currently there is no way to disable optimizations. There is an open
> issue with a request for this. [2]
>
> [1] https://github.com/python/cpython/pull/5012
> [2] https://bugs.python.org/issue2506

From p.f.moore at gmail.com Wed Mar 28 15:05:33 2018
From: p.f.moore at gmail.com (Paul Moore)
Date: Wed, 28 Mar 2018 20:05:33 +0100
Subject: [Python-Dev] Subtle difference between f-strings and str.format()
In-Reply-To:
References:
Message-ID:

On 28 March 2018 at 19:44, Serhiy Storchaka wrote:
> 28.03.18 19:20, Guido van Rossum wrote:
>
>> Hm, without thinking too much about it I'd say it's okay to change the
>> evaluation order.
>
> Do you mean the option 3, right? This is the simplest option. I have already
> written a PR for optimizing old-style formatting [1], but have not merged it
> yet due to this change of semantic.

I can't imagine (non-contrived) code where the fact that a is formatted before b is evaluated would matter, so I'm fine with option 3.
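For anyone who wants to observe the ordering under discussion, here is a small sketch (not taken from the thread) that records when each argument is evaluated and when it is formatted. The names ``Traced``, ``make`` and the two ``order_with_*`` functions are invented for this illustration, and it assumes a CPython where ``str.format()`` calls are not rewritten into f-strings by the compiler.

```python
log = []


class Traced:
    """Value that records when its __format__ method is invoked."""

    def __init__(self, name):
        self.name = name

    def __format__(self, spec):
        log.append("format " + self.name)
        return self.name


def make(name):
    """Stand-in for evaluating a subexpression: log it, then return a value."""
    log.append("eval " + name)
    return Traced(name)


def order_with_str_format():
    log.clear()
    '{}{}'.format(make('a'), make('b'))
    return list(log)


def order_with_fstring():
    log.clear()
    f"{make('a')}{make('b')}"
    return list(log)


format_order = order_with_str_format()
fstring_order = order_with_fstring()
```

With ``str.format()`` both arguments are evaluated before either is formatted, while the f-string interleaves evaluation and formatting; the difference is only visible when evaluation or formatting has side effects or raises.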
Paul

From storchaka at gmail.com Wed Mar 28 15:12:06 2018
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Wed, 28 Mar 2018 22:12:06 +0300
Subject: [Python-Dev] Subtle difference between f-strings and str.format()
In-Reply-To:
References:
Message-ID: <5fab009b-939c-44dc-87b0-28c8dde37ad9@gmail.com>

28.03.18 22:05, Paul Moore wrote:
> I can't imagine (non-contrived) code where the fact that a is
> formatted before b is evaluated would matter, so I'm fine with option
> 3.

If formatting a and evaluating b both raise exceptions, the resulting exception depends on the order.

$ ./python -bb
>>> a = b'bytes'
>>> '{}{}'.format(a, b)
Traceback (most recent call last):
  File "", line 1, in
NameError: name 'b' is not defined
>>> f'{a}{b}'
Traceback (most recent call last):
  File "", line 1, in
BytesWarning: str() on a bytes instance

From storchaka at gmail.com Wed Mar 28 15:13:29 2018
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Wed, 28 Mar 2018 22:13:29 +0300
Subject: [Python-Dev] Subtle difference between f-strings and str.format()
In-Reply-To:
References:
Message-ID:

28.03.18 22:04, Guido van Rossum wrote:
> Yes, #3, and what Tim says.

Thank you. This helps a lot.

From stefan_ml at behnel.de Wed Mar 28 15:19:35 2018
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Wed, 28 Mar 2018 21:19:35 +0200
Subject: [Python-Dev] Subtle difference between f-strings and str.format()
In-Reply-To:
References:
Message-ID:

Serhiy Storchaka wrote on 28.03.2018 at 17:27:
> There is a subtle semantic difference between str.format() and "equivalent"
> f-string.
>
>     '{}{}'.format(a, b)
>     f'{a}{b}'
>
> In the former case b is evaluated before formatting a. This is equivalent to
>
>     t1 = a
>     t2 = b
>     t3 = format(t1)
>     t4 = format(t2)
>     r = t3 + t4
>
> In the latter case a is formatted before evaluating b. This is equivalent to
>
>     t1 = a
>     t2 = format(t1)
>     t3 = b
>     t4 = format(t3)
>     r = t2 + t4
>
> In most cases this doesn't matter, but when implement the optimization that
> transforms the former expression to the the latter one ([1], [2]) we have
> to make a decision what to do with this difference.

I agree that it's not normally a problem, but if the formatting of 'a' fails and raises an exception, then 'b' will not get evaluated at all in the second case. Whether this difference is subtle or not seems to depend largely on the code at hand.

Stefan

From p.f.moore at gmail.com Wed Mar 28 15:21:15 2018
From: p.f.moore at gmail.com (Paul Moore)
Date: Wed, 28 Mar 2018 20:21:15 +0100
Subject: [Python-Dev] Subtle difference between f-strings and str.format()
In-Reply-To: <5fab009b-939c-44dc-87b0-28c8dde37ad9@gmail.com>
References: <5fab009b-939c-44dc-87b0-28c8dde37ad9@gmail.com>
Message-ID:

On 28 March 2018 at 20:12, Serhiy Storchaka wrote:
> 28.03.18 22:05, Paul Moore wrote:
>
> I can't imagine (non-contrived) code where the fact that a is
> formatted before b is evaluated would matter, so I'm fine with option
> 3.
>
>
> If formatting a and evaluating b both raise exceptions, the resulting
> exception depends on the order.
>
> $ ./python -bb
>>>> a = b'bytes'
>>>> '{}{}'.format(a, b)
> Traceback (most recent call last):
>   File "", line 1, in
> NameError: name 'b' is not defined
>>>> f'{a}{b}'
> Traceback (most recent call last):
>   File "", line 1, in
> BytesWarning: str() on a bytes instance

Thanks, I hadn't thought of that. But I still say that code that depends on which exception was raised is "contrived".
Anyway, Guido said #3, so no reason to debate it any further :-) Paul From storchaka at gmail.com Wed Mar 28 16:03:08 2018 From: storchaka at gmail.com (Serhiy Storchaka) Date: Wed, 28 Mar 2018 23:03:08 +0300 Subject: [Python-Dev] PEP 574 -- Pickle protocol 5 with out-of-band data In-Reply-To: <20180328203951.3488ca41@fsol> References: <20180328203951.3488ca41@fsol> Message-ID: <34eaf991-fb6c-c540-ea6d-e7b6267df34f@gmail.com> 28.03.18 21:39, Antoine Pitrou ????: > I'd like to submit this PEP for discussion. It is quite specialized > and the main target audience of the proposed changes is > users and authors of applications/libraries transferring large amounts > of data (read: the scientific computing & data science ecosystems). Currently I'm working on porting some features from cloudpickle to the stdlib. For these of them which can't or shouldn't be implemented in the general purpose library (like serializing local functions by serializing their code objects, because it is not portable) I want to add hooks that would allow to implement them in cloudpickle using official API. This would allow cloudpickle to utilize C implementation of the pickler and unpickler. There is a private module _compat_pickle for supporting compatibility of moved stdlib classes with Python 2. I'm going to provide public API that would allow third-party libraries to support compatibility for moved classes and functions. This could also help to support classes and function moved in the stdlib after 3.0. It is well known that pickle is unsafe. Unpickling untrusted data can cause executing arbitrary code. It is less known that unpickling can be made safe by controlling resolution of global names in custom Unpickler.find_class(). I want to provide helpers which would help implementing safe unpickling by specifying just white lists of globals and attributes. This work still is not finished, but I think it is worth to include it in protocol 5 if some features will need bumping protocol version. 
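The idea of restricting global name resolution through ``Unpickler.find_class()`` can already be sketched with the existing API. The following is a minimal illustration along the lines of the "restricting globals" pattern in the pickle documentation, not the helper API proposed above; the names ``SAFE_BUILTINS``, ``RestrictedUnpickler`` and ``restricted_loads`` are assumptions chosen for this example. It narrows what a pickle may reference but should not be read as a guarantee of safety.

```python
import builtins
import io
import pickle

# Illustrative whitelist: only these builtins may be referenced by a pickle.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called for every global the pickle stream asks for.
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        # Refuse everything that is not explicitly whitelisted.
        raise pickle.UnpicklingError(
            "global '%s.%s' is forbidden" % (module, name))


def restricted_loads(data):
    """Like pickle.loads(), but resolving only whitelisted globals."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


allowed = restricted_loads(pickle.dumps([1, 2, range(3)]))

try:
    restricted_loads(pickle.dumps(print))  # references builtins.print
except pickle.UnpicklingError:
    blocked = True
else:
    blocked = False
```

A helper that takes just the white lists, as suggested above, would presumably generate a ``find_class()`` of this shape.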
From solipsis at pitrou.net Wed Mar 28 16:19:39 2018 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 28 Mar 2018 22:19:39 +0200 Subject: [Python-Dev] PEP 574 -- Pickle protocol 5 with out-of-band data References: <20180328203951.3488ca41@fsol> <34eaf991-fb6c-c540-ea6d-e7b6267df34f@gmail.com> Message-ID: <20180328221939.10d3cd01@fsol> On Wed, 28 Mar 2018 23:03:08 +0300 Serhiy Storchaka wrote: > 28.03.18 21:39, Antoine Pitrou ????: > > I'd like to submit this PEP for discussion. It is quite specialized > > and the main target audience of the proposed changes is > > users and authors of applications/libraries transferring large amounts > > of data (read: the scientific computing & data science ecosystems). > > Currently I'm working on porting some features from cloudpickle to the > stdlib. For these of them which can't or shouldn't be implemented in the > general purpose library (like serializing local functions by serializing > their code objects, because it is not portable) I want to add hooks that > would allow to implement them in cloudpickle using official API. This > would allow cloudpickle to utilize C implementation of the pickler and > unpickler. Yes, that's something that would benefit a lot of people. For the record, here are my notes on the topic: https://github.com/cloudpipe/cloudpickle/issues/58#issuecomment-339751408 > It is well known that pickle is unsafe. Unpickling untrusted data can > cause executing arbitrary code. It is less known that unpickling can be > made safe by controlling resolution of global names in custom > Unpickler.find_class(). I want to provide helpers which would help > implementing safe unpickling by specifying just white lists of globals > and attributes. I'm not sure how safe that would be, because 1) there may be other attack vectors, and 2) it's difficult to predict which functions are entirely safe for calling. 
I think the best way to make pickles safe is to cryptographically sign them so that they cannot be forged by an attacker. > This work still is not finished, but I think it is worth to include it > in protocol 5 if some features will need bumping protocol version. Agreed. Do you know by which timeframe you'll know which opcodes you want to add? Regards Antoine. From eric at trueblade.com Wed Mar 28 16:39:04 2018 From: eric at trueblade.com (Eric V. Smith) Date: Wed, 28 Mar 2018 16:39:04 -0400 Subject: [Python-Dev] Subtle difference between f-strings and str.format() In-Reply-To: References: Message-ID: <4C81641B-64D6-45FD-9C0D-15AAB7BA1D01@trueblade.com> I?d vote #3 as well. -- Eric > On Mar 28, 2018, at 11:27 AM, Serhiy Storchaka wrote: > > There is a subtle semantic difference between str.format() and "equivalent" f-string. > > '{}{}'.format(a, b) > f'{a}{b}' > > In the former case b is evaluated before formatting a. This is equivalent to > > t1 = a > t2 = b > t3 = format(t1) > t4 = format(t2) > r = t3 + t4 > > In the latter case a is formatted before evaluating b. This is equivalent to > > t1 = a > t2 = format(t1) > t3 = b > t4 = format(t3) > r = t2 + t4 > > In most cases this doesn't matter, but when implement the optimization that transforms the former expression to the the latter one ([1], [2]) we have to make a decision what to do with this difference. > > 1. Keep the exact semantic of str.format() when optimize it. This means that it should be transformed into AST node different from the AST node used for f-strings. Either introduce a new AST node type, or add a boolean flag to JoinedStr. > > 2. Change the semantic of f-strings. Make it closer to the semantic of str.format(): evaluate all subexpressions first than format them. This can be implemented in two ways: > > 2a) Add additional instructions for stack manipulations. This will slow down f-strings. > > 2b) Introduce a new complex opcode that will replace FORMAT_VALUE and BUILD_STRING. 
This will speed up f-strings.
>
> 3. Transform str.format() into an f-string with changing semantic, and ignore this change. This is not new. The optimizer already changes semantic. Non-optimized "if a and True:" would call bool(a) twice, but optimized code calls it only once.
>
> [1] https://bugs.python.org/issue28307
> [2] https://bugs.python.org/issue28308

From timothy.c.delaney at gmail.com Wed Mar 28 17:09:32 2018
From: timothy.c.delaney at gmail.com (Tim Delaney)
Date: Thu, 29 Mar 2018 08:09:32 +1100
Subject: [Python-Dev] Subtle difference between f-strings and str.format()
In-Reply-To: <4C81641B-64D6-45FD-9C0D-15AAB7BA1D01@trueblade.com>
References: <4C81641B-64D6-45FD-9C0D-15AAB7BA1D01@trueblade.com>
Message-ID:

On 29 March 2018 at 07:39, Eric V. Smith wrote:
> I'd vote #3 as well.
>
> > On Mar 28, 2018, at 11:27 AM, Serhiy Storchaka wrote:
> >
> > There is a subtle semantic difference between str.format() and
> > "equivalent" f-string.
> >
> > '{}{}'.format(a, b)
> > f'{a}{b}'
> >
> > In most cases this doesn't matter, but when implement the optimization
> > that transforms the former expression to the latter one ([1], [2]) we
> > have to make a decision what to do with this difference.
>

From timothy.c.delaney at gmail.com Wed Mar 28 17:10:25 2018
From: timothy.c.delaney at gmail.com (Tim Delaney)
Date: Thu, 29 Mar 2018 08:10:25 +1100
Subject: [Python-Dev] Subtle difference between f-strings and str.format()
In-Reply-To:
References: <4C81641B-64D6-45FD-9C0D-15AAB7BA1D01@trueblade.com>
Message-ID:

On 29 March 2018 at 08:09, Tim Delaney wrote:
> On 29 March 2018 at 07:39, Eric V. Smith wrote:
>
>> I'd vote #3 as well.
>> >> > On Mar 28, 2018, at 11:27 AM, Serhiy Storchaka >> wrote: >> > >> > There is a subtle semantic difference between str.format() and >> "equivalent" f-string. >> > >> > '{}{}'.format(a, b) >> > f'{a}{b}' >> > >> > In most cases this doesn't matter, but when implement the optimization >> that transforms the former expression to the the latter one ([1], [2]) we >> have to make a decision what to do with this difference. >> > Sorry about that - finger slipped and I sent an incomplete email ... If I'm not mistaken, #3 would result in the optimiser changing str.format() into an f-string in-place. Is this correct? We're not talking here about people manually changing the code from str.format() to f-strings, right? I would argue that any optimisation needs to have the same semantics as the original code - in this case, that all arguments are evaluated before the string is formatted. I also assumed (not having actually used an f-string) that all its formatting arguments were evaluated before formatting. So my preference would be (if my understanding in the first line is correct): 1: +0 2a: +0.5 2b: +1 3: -1 Tim Delaney -------------- next part -------------- An HTML attachment was scrubbed... URL: From nad at python.org Wed Mar 28 17:59:19 2018 From: nad at python.org (Ned Deily) Date: Wed, 28 Mar 2018 17:59:19 -0400 Subject: [Python-Dev] [RELEASE] Python 3.6.5 is now available Message-ID: Python 3.6.5 is now available. 3.6.5 is the fifth maintenance release of Python 3.6, which was initially released in 2016-12 to great interest. Detailed information about the changes made in 3.6.5 can be found in its change log. You can find Python 3.6.5 and more information here: https://www.python.org/downloads/release/python-365/ See the "What?s New In Python 3.6" document for more information about features included in the 3.6 series. 
Detailed information about the changes made in 3.6.5 can be found in the change log here: https://docs.python.org/3.6/whatsnew/changelog.html#python-3-6-5-final Attention macOS users: as of 3.6.5, there is a new additional installer variant for macOS 10.9+ that includes a built-in version of Tcl/Tk 8.6. This variant is expected to become the default variant in future releases. Check it out! The next maintenance release is expected to follow in about 3 months, around the end of 2018-06. Thanks to all of the many volunteers who help make Python Development and these releases possible! Please consider supporting our efforts by volunteering yourself or through organization contributions to the Python Software Foundation: https://www.python.org/psf/ -- Ned Deily nad at python.org -- [] From tim.peters at gmail.com Wed Mar 28 19:48:40 2018 From: tim.peters at gmail.com (Tim Peters) Date: Wed, 28 Mar 2018 18:48:40 -0500 Subject: [Python-Dev] Subtle difference between f-strings and str.format() In-Reply-To: References: <4C81641B-64D6-45FD-9C0D-15AAB7BA1D01@trueblade.com> Message-ID: [Tim Delaney ] > ... > If I'm not mistaken, #3 would result in the optimiser changing str.format() > into an f-string in-place. Is this correct? We're not talking here about > people manually changing the code from str.format() to f-strings, right? All correct. It's a magical transformation from one spelling to another. > I would argue that any optimisation needs to have the same semantics as the > original code - in this case, that all arguments are evaluated before the > string is formatted. That's why Serhiy is asking about it - there _are_ potentially visible changes in behavior under all but one of his suggestions. > I also assumed (not having actually used an f-string) that all its > formatting arguments were evaluated before formatting. It's a string - it doesn't have "arguments" as such. 
For example:

    def f(a, b, n):
        return f"{a+b:0{n}b}"  # the leading "f" makes it an f-string

Then

    >>> f(2, 3, 12)
    '000000000101'

The generated code currently interleaves evaluating expressions with formatting the results in a more-than-less obvious way, waiting until the end to paste all the formatted fragments together. As shown in the example, this can be more than one level deep (the example needs to paste together "0", str(n), and "b" to _build_ the format code for `a+b`).

> So my preference would be (if my understanding in the first line is
> correct):
>
> 1: +0

That's the only suggestion with no potentially visible changes. I'll add another: leave `.format()` alone entirely - there's no _need_ to "optimize" it, it's just a maybe-nice-to-have.

> 2a: +0.5
> 2b: +1

Those two don't change the behaviors of `.format()`, but _do_ change some end-case behaviors of f-strings. If you're overly ;-) concerned about the former, it would be consistent to be overly concerned about the latter too.

> 3: -1

And that's the one that proposes to let .format() also interleave expression evaluation (but still strictly "left to right") with formatting. If it were a general code transformation, I'm sure everyone would be -1. As is, it's hard to care. String formatting is a tiny area, and format methods are generally purely functional (no side effects). If anyone has a non-contrived example where the change would make a lick of real difference, they shouldn't be shy about posting it :-) I looked, and can't find any in my code.
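[Editorial sketch] The ordering difference discussed in this thread only becomes visible when evaluating or formatting a value has a side effect. The following deliberately contrived sketch records the order of events; the names log, Tracer, make and order_of are hypothetical helpers invented for this illustration, and the behaviour shown is that of CPython as described in the thread (str.format() left unoptimized).

```python
log = []


class Tracer:
    """Value whose formatting is recorded in the shared log."""

    def __init__(self, name):
        self.name = name

    def __format__(self, spec):
        log.append(f"format {self.name}")
        return self.name


def make(name):
    """Record that the expression was evaluated, then return a Tracer."""
    log.append(f"eval {name}")
    return Tracer(name)


def order_of(build):
    """Run build() and return the sequence of recorded events."""
    log.clear()
    build()
    return list(log)


# str.format(): both arguments are evaluated before anything is formatted.
format_order = order_of(lambda: '{}{}'.format(make('a'), make('b')))

# f-string: each expression is formatted before the next one is evaluated.
fstring_order = order_of(lambda: f"{make('a')}{make('b')}")

print(format_order)   # ['eval a', 'eval b', 'format a', 'format b']
print(fstring_order)  # ['eval a', 'format a', 'eval b', 'format b']
```

Both spellings produce the same string, 'ab'; only the interleaving of evaluation and formatting differs, which is why real code rarely notices.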
From njs at pobox.com Wed Mar 28 21:15:01 2018 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 28 Mar 2018 18:15:01 -0700 Subject: [Python-Dev] PEP 574 -- Pickle protocol 5 with out-of-band data In-Reply-To: <34eaf991-fb6c-c540-ea6d-e7b6267df34f@gmail.com> References: <20180328203951.3488ca41@fsol> <34eaf991-fb6c-c540-ea6d-e7b6267df34f@gmail.com> Message-ID: On Wed, Mar 28, 2018 at 1:03 PM, Serhiy Storchaka wrote: > 28.03.18 21:39, Antoine Pitrou ????: >> I'd like to submit this PEP for discussion. It is quite specialized >> and the main target audience of the proposed changes is >> users and authors of applications/libraries transferring large amounts >> of data (read: the scientific computing & data science ecosystems). > > Currently I'm working on porting some features from cloudpickle to the > stdlib. For these of them which can't or shouldn't be implemented in the > general purpose library (like serializing local functions by serializing > their code objects, because it is not portable) I want to add hooks that > would allow to implement them in cloudpickle using official API. This would > allow cloudpickle to utilize C implementation of the pickler and unpickler. There's obviously some tension here between pickle's use as a persistent storage format, and its use as a transient wire format. For the former, you definitely can't store code objects because there's no forwards- or backwards-compatibility guarantee for bytecode. But for the latter, transmitting bytecode is totally fine, because all you care about is whether it can be decoded once, right now, by some peer process whose python version you can control -- that's why cloudpickle exists. Would it make sense to have a special pickle version that the transient wire format users could opt into, that only promises compatibility within a given 3.X release cycle? Like version=-2 or version=pickle.NONPORTABLE or something? (This is orthogonal to Antoine's PEP.) -n -- Nathaniel J. 
Smith -- https://vorpus.org From robertc at robertcollins.net Wed Mar 28 21:40:17 2018 From: robertc at robertcollins.net (Robert Collins) Date: Thu, 29 Mar 2018 01:40:17 +0000 Subject: [Python-Dev] PEP 574 -- Pickle protocol 5 with out-of-band data In-Reply-To: <20180328203951.3488ca41@fsol> References: <20180328203951.3488ca41@fsol> Message-ID: One question.. On Thu., 29 Mar. 2018, 07:42 Antoine Pitrou, wrote: > ... > ======= > > Mutability > ---------- > > PEP 3118 buffers [#pep-3118]_ can be readonly or writable. Some objects, > such as Numpy arrays, need to be backed by a mutable buffer for full > operation. Pickle consumers that use the ``buffer_callback`` and > ``buffers`` > arguments will have to be careful to recreate mutable buffers. When doing > I/O, this implies using buffer-passing API variants such as ``readinto`` > (which are also often preferrable for performance). > > Data sharing > ------------ > > If you pickle and then unpickle an object in the same process, passing > out-of-band buffer views, then the unpickled object may be backed by the > same buffer as the original pickled object. > > For example, it might be reasonable to implement reduction of a Numpy array > as follows (crucial metadata such as shapes is omitted for simplicity):: > > class ndarray: > > def __reduce_ex__(self, protocol): > if protocol == 5: > return numpy.frombuffer, (PickleBuffer(self), self.dtype) > # Legacy code for earlier protocols omitted > > Then simply passing the PickleBuffer around from ``dumps`` to ``loads`` > will produce a new Numpy array sharing the same underlying memory as the > original Numpy object (and, incidentally, keeping it alive):: This seems incompatible with v4 semantics. There, a loads plus dumps combination is approximately a deep copy. This isn't. Sometimes. Sometimes it is. Other than that way, I like it. Rob > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From tjreedy at udel.edu Wed Mar 28 22:10:56 2018 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 28 Mar 2018 22:10:56 -0400 Subject: [Python-Dev] PEP 574 -- Pickle protocol 5 with out-of-band data In-Reply-To: References: <20180328203951.3488ca41@fsol> <34eaf991-fb6c-c540-ea6d-e7b6267df34f@gmail.com> Message-ID: On 3/28/2018 9:15 PM, Nathaniel Smith wrote: > There's obviously some tension here between pickle's use as a > persistent storage format, and its use as a transient wire format. For > the former, you definitely can't store code objects because there's no > forwards- or backwards-compatibility guarantee for bytecode. But for > the latter, transmitting bytecode is totally fine, because all you > care about is whether it can be decoded once, right now, by some peer > process whose python version you can control -- that's why cloudpickle > exists. An interesting observation. IDLE compiles user code in the user process to check for syntax errors. idlelib.rpc subclasses Pickler to pickle the resulting code objects via marshal.dumps so it can send them to the user code execution subprocess. > Would it make sense to have a special pickle version that the > transient wire format users could opt into, that only promises > compatibility within a given 3.X release cycle? Like version=-2 or > version=pickle.NONPORTABLE or something? > > (This is orthogonal to Antoine's PEP.) -- Terry Jan Reedy From julia.hiyeon.kim at gmail.com Thu Mar 29 00:14:53 2018 From: julia.hiyeon.kim at gmail.com (Julia Kim) Date: Wed, 28 Mar 2018 21:14:53 -0700 Subject: [Python-Dev] Sets, Dictionaries Message-ID: Hi, My name is Julia Hiyeon Kim. My suggestion is to change the syntax for creating an empty set and an empty dictionary as following. an_empty_set = {} an_empty_dictionary = {:} It would seem to make more sense. 
Warm regards, Julia Kim From hasan.diwan at gmail.com Thu Mar 29 00:27:59 2018 From: hasan.diwan at gmail.com (Hasan Diwan) Date: Wed, 28 Mar 2018 21:27:59 -0700 Subject: [Python-Dev] Sets, Dictionaries In-Reply-To: References: Message-ID: Hi, Julia, On 28 March 2018 at 21:14, Julia Kim wrote: > > My suggestion is to change the syntax for creating an empty set and an > empty dictionary as following. > You should craft your suggestion as a PEP and send it to the python-ideas mailing list. Good luck! -- H -- OpenPGP: https://sks-keyservers.net/pks/lookup?op=get&search=0xFEBAD7FFD041BBA1 If you wish to request my time, please do so using http://bit.ly/hd1ScheduleRequest. Si vous voudrais faire connnaisance, allez a http://bit.ly/hd1ScheduleRequest. Sent from my mobile device Envoye de mon portable -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Thu Mar 29 01:57:56 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 29 Mar 2018 16:57:56 +1100 Subject: [Python-Dev] Sets, Dictionaries In-Reply-To: References: Message-ID: <20180329055755.GM16661@ando.pearwood.info> Hi Julia, and welcome! On Wed, Mar 28, 2018 at 09:14:53PM -0700, Julia Kim wrote: > My suggestion is to change the syntax for creating an empty set and an > empty dictionary as following. > > an_empty_set = {} > an_empty_dictionary = {:} > > It would seem to make more sense. Indeed it would, and if sets had existed in Python since the beginning, that's probably exactly what we would have done. But unfortunately they didn't, and {} has meant an empty dict forever. The requirement to keep backwards-compatibility is a very, very hard barrier to cross. I think we all acknowledge that it is sad and a little bit confusing that {} means a dict not a set, but it isn't sad or confusing enough to justify breaking millions of existing scripts and applications. 
Not to mention the confusing transition period when the community would be using *both* standards at the same time, which could easily last ten years. Given that, I think we just have to accept that having to use set() for the empty set instead of {} is a minor wart on the language that we're stuck with. If you disagree, and think that you have a concrete plan that can make this transition work, we'll be happy to hear it, but you'll almost certainly need to write a PEP before it could be accepted. https://www.python.org/dev/peps/ Thanks, -- Steve From chris.jerdonek at gmail.com Thu Mar 29 03:56:59 2018 From: chris.jerdonek at gmail.com (Chris Jerdonek) Date: Thu, 29 Mar 2018 00:56:59 -0700 Subject: [Python-Dev] PEP 574 -- Pickle protocol 5 with out-of-band data In-Reply-To: References: <20180328203951.3488ca41@fsol> <34eaf991-fb6c-c540-ea6d-e7b6267df34f@gmail.com> Message-ID: On Wed, Mar 28, 2018 at 6:15 PM, Nathaniel Smith wrote: > On Wed, Mar 28, 2018 at 1:03 PM, Serhiy Storchaka wrote: >> 28.03.18 21:39, Antoine Pitrou ????: >>> I'd like to submit this PEP for discussion. It is quite specialized >>> and the main target audience of the proposed changes is >>> users and authors of applications/libraries transferring large amounts >>> of data (read: the scientific computing & data science ecosystems). >> >> Currently I'm working on porting some features from cloudpickle to the >> stdlib. For these of them which can't or shouldn't be implemented in the >> general purpose library (like serializing local functions by serializing >> their code objects, because it is not portable) I want to add hooks that >> would allow to implement them in cloudpickle using official API. This would >> allow cloudpickle to utilize C implementation of the pickler and unpickler. > > There's obviously some tension here between pickle's use as a > persistent storage format, and its use as a transient wire format. 
For > the former, you definitely can't store code objects because there's no > forwards- or backwards-compatibility guarantee for bytecode. But for > the latter, transmitting bytecode is totally fine, because all you > care about is whether it can be decoded once, right now, by some peer > process whose python version you can control -- that's why cloudpickle > exists. Is it really true you'll always be able to control the Python version on the other side? Even if they're internal services, it seems like there could be times / reasons preventing you from upgrading the environment of all of your services at the same rate. Or did you mean to say "often" all you care about ...? --Chris > > Would it make sense to have a special pickle version that the > transient wire format users could opt into, that only promises > compatibility within a given 3.X release cycle? Like version=-2 or > version=pickle.NONPORTABLE or something? > > (This is orthogonal to Antoine's PEP.) > > -n > > -- > Nathaniel J. Smith -- https://vorpus.org > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/chris.jerdonek%40gmail.com From njs at pobox.com Thu Mar 29 04:18:00 2018 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 29 Mar 2018 01:18:00 -0700 Subject: [Python-Dev] PEP 574 -- Pickle protocol 5 with out-of-band data In-Reply-To: References: <20180328203951.3488ca41@fsol> <34eaf991-fb6c-c540-ea6d-e7b6267df34f@gmail.com> Message-ID: On Thu, Mar 29, 2018 at 12:56 AM, Chris Jerdonek wrote: > On Wed, Mar 28, 2018 at 6:15 PM, Nathaniel Smith wrote: >> On Wed, Mar 28, 2018 at 1:03 PM, Serhiy Storchaka wrote: >>> 28.03.18 21:39, Antoine Pitrou ????: >>>> I'd like to submit this PEP for discussion. 
It is quite specialized >>>> and the main target audience of the proposed changes is >>>> users and authors of applications/libraries transferring large amounts >>>> of data (read: the scientific computing & data science ecosystems). >>> >>> Currently I'm working on porting some features from cloudpickle to the >>> stdlib. For these of them which can't or shouldn't be implemented in the >>> general purpose library (like serializing local functions by serializing >>> their code objects, because it is not portable) I want to add hooks that >>> would allow to implement them in cloudpickle using official API. This would >>> allow cloudpickle to utilize C implementation of the pickler and unpickler. >> >> There's obviously some tension here between pickle's use as a >> persistent storage format, and its use as a transient wire format. For >> the former, you definitely can't store code objects because there's no >> forwards- or backwards-compatibility guarantee for bytecode. But for >> the latter, transmitting bytecode is totally fine, because all you >> care about is whether it can be decoded once, right now, by some peer >> process whose python version you can control -- that's why cloudpickle >> exists. > > Is it really true you'll always be able to control the Python version > on the other side? Even if they're internal services, it seems like > there could be times / reasons preventing you from upgrading the > environment of all of your services at the same rate. Or did you mean > to say "often" all you care about ...? Yeah, maybe I spoke a little sloppily -- I'm sure there are cases where you're using pickle as a wire format between heterogenous interpreters, in which case you wouldn't use version=NONPORTABLE. But projects like dask, and everyone else who uses cloudpickle/dill, are already assuming homogenous interpreters. 
A typical way of using these kinds of systems is: you start your script, it spins up some cloud VMs or local cluster nodes (maybe sending them all a conda environment you made), they all chat for a while doing your computation, and then they spin down again and your script reports the results. So versioning and coordinated upgrades really aren't a thing you need to worry about :-). Another example is the multiprocessing module: it's very safe to assume that the parent and the child are using the same interpreter :-). There's no fundamental reason you shouldn't be able to send bytecode between them. Pickle's not really the ideal wire format for persistent services anyway, given the arbitrary code execution and tricky versioning -- even if you aren't playing games with bytecode, pickle still assumes that if two classes in two different interpreters have the same name, then their internal implementation details are all the same. You can make it work, but usually there are better options. It's perfect though for multi-core and multi-machine parallelism. -n -- Nathaniel J. Smith -- https://vorpus.org From solipsis at pitrou.net Thu Mar 29 04:08:02 2018 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 29 Mar 2018 10:08:02 +0200 Subject: [Python-Dev] PEP 574 -- Pickle protocol 5 with out-of-band data In-Reply-To: References: <20180328203951.3488ca41@fsol> Message-ID: <20180329100802.1c4bbce5@fsol> On Thu, 29 Mar 2018 01:40:17 +0000 Robert Collins wrote: > > > > Data sharing > > ------------ > > > > If you pickle and then unpickle an object in the same process, passing > > out-of-band buffer views, then the unpickled object may be backed by the > > same buffer as the original pickled object. 
> > > > For example, it might be reasonable to implement reduction of a Numpy array > > as follows (crucial metadata such as shapes is omitted for simplicity):: > > > > class ndarray: > > > > def __reduce_ex__(self, protocol): > > if protocol == 5: > > return numpy.frombuffer, (PickleBuffer(self), self.dtype) > > # Legacy code for earlier protocols omitted > > > > Then simply passing the PickleBuffer around from ``dumps`` to ``loads`` > > will produce a new Numpy array sharing the same underlying memory as the > > original Numpy object (and, incidentally, keeping it alive):: > > This seems incompatible with v4 semantics. There, a loads plus dumps > combination is approximately a deep copy. This isn't. Sometimes. Sometimes > it is. True. But it's only incompatible if you pass the new ``buffer_callback`` and ``buffers`` arguments. If you don't, then you always get a copy. This is something that consumers should keep in mind. Note there's a movement towards immutable data. For example, Dask arrays and Arrow arrays are designed as immutable. Regards Antoine. From rosuav at gmail.com Thu Mar 29 04:49:26 2018 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 29 Mar 2018 19:49:26 +1100 Subject: [Python-Dev] PEP 574 -- Pickle protocol 5 with out-of-band data In-Reply-To: References: <20180328203951.3488ca41@fsol> <34eaf991-fb6c-c540-ea6d-e7b6267df34f@gmail.com> Message-ID: On Thu, Mar 29, 2018 at 7:18 PM, Nathaniel Smith wrote: > Another example is the multiprocessing module: it's very safe to > assume that the parent and the child are using the same interpreter > :-). There's no fundamental reason you shouldn't be able to send > bytecode between them. You put a smiley on it, but is this actually guaranteed on all platforms? On Unix-like systems, presumably it's using fork() and thus will actually use the exact same binary, but what about on Windows, where a new process has to be spawned? 
Can you say "spawn me another of this exact binary blob", or do you have to identify it by a file name? It wouldn't be a problem for the nonportable mode to toss out an exception in weird cases like this, but it _would_ be a problem if that causes a segfault or something. ChrisA From p.f.moore at gmail.com Thu Mar 29 04:56:50 2018 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 29 Mar 2018 09:56:50 +0100 Subject: [Python-Dev] PEP 574 -- Pickle protocol 5 with out-of-band data In-Reply-To: References: <20180328203951.3488ca41@fsol> <34eaf991-fb6c-c540-ea6d-e7b6267df34f@gmail.com> Message-ID: On 29 March 2018 at 09:49, Chris Angelico wrote: > On Thu, Mar 29, 2018 at 7:18 PM, Nathaniel Smith wrote: >> Another example is the multiprocessing module: it's very safe to >> assume that the parent and the child are using the same interpreter >> :-). There's no fundamental reason you shouldn't be able to send >> bytecode between them. > > You put a smiley on it, but is this actually guaranteed on all > platforms? On Unix-like systems, presumably it's using fork() and thus > will actually use the exact same binary, but what about on Windows, > where a new process has to be spawned? Can you say "spawn me another > of this exact binary blob", or do you have to identify it by a file > name? > > It wouldn't be a problem for the nonportable mode to toss out an > exception in weird cases like this, but it _would_ be a problem if > that causes a segfault or something. If you're embedding, you need multiprocessing.set_executable() (https://docs.python.org/3.6/library/multiprocessing.html#multiprocessing.set_executable), so in that case you definitely *won't* have the same binary... 
Paul From rosuav at gmail.com Thu Mar 29 05:01:06 2018 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 29 Mar 2018 20:01:06 +1100 Subject: [Python-Dev] PEP 574 -- Pickle protocol 5 with out-of-band data In-Reply-To: References: <20180328203951.3488ca41@fsol> <34eaf991-fb6c-c540-ea6d-e7b6267df34f@gmail.com> Message-ID: On Thu, Mar 29, 2018 at 7:56 PM, Paul Moore wrote: > On 29 March 2018 at 09:49, Chris Angelico wrote: >> On Thu, Mar 29, 2018 at 7:18 PM, Nathaniel Smith wrote: >>> Another example is the multiprocessing module: it's very safe to >>> assume that the parent and the child are using the same interpreter >>> :-). There's no fundamental reason you shouldn't be able to send >>> bytecode between them. >> >> You put a smiley on it, but is this actually guaranteed on all >> platforms? On Unix-like systems, presumably it's using fork() and thus >> will actually use the exact same binary, but what about on Windows, >> where a new process has to be spawned? Can you say "spawn me another >> of this exact binary blob", or do you have to identify it by a file >> name? >> >> It wouldn't be a problem for the nonportable mode to toss out an >> exception in weird cases like this, but it _would_ be a problem if >> that causes a segfault or something. > > If you're embedding, you need multiprocessing.set_executable() > (https://docs.python.org/3.6/library/multiprocessing.html#multiprocessing.set_executable), > so in that case you definitely *won't* have the same binary... Ah, and that also showed me that forking isn't mandatory on Unix either. So yeah, there's no assuming that they use the same binary. I doubt it'll be a problem to pickle though as it'll use some form of versioning even in NONPORTABLE mode right? 
ChrisA

From ja.py at farowl.co.uk Thu Mar 29 06:17:10 2018
From: ja.py at farowl.co.uk (Jeff Allen)
Date: Thu, 29 Mar 2018 11:17:10 +0100
Subject: [Python-Dev] Subtle difference between f-strings and str.format()
In-Reply-To:
References: <4C81641B-64D6-45FD-9C0D-15AAB7BA1D01@trueblade.com>
Message-ID:

My credentials for this are that I re-worked str.format in Jython quite extensively, and I followed the design of f-strings a bit when they were introduced, but I haven't used them to write anything.

On 29/03/2018 00:48, Tim Peters wrote:
> [Tim Delaney]
>> ...
>> I also assumed (not having actually used an f-string) that all its
>> formatting arguments were evaluated before formatting.
> It's a string - it doesn't have "arguments" as such. For example:
> def f(a, b, n):
>     return f"{a+b:0{n}b}" # the leading "f" makes it an f-string
>

Agreed "argument" is the wrong word, but so is "string". It's an expression returning a string, in which a, b and n are free variables. I think we can understand it best as a string-display (https://docs.python.org/3/reference/expressions.html#list-displays), or a sort of eval() call.

The difference Serhiy identifies emerges (I think) because in the conventional interpretation of a format call, the arguments of format are evaluated left to right (all of them) and then formatted in the order references are encountered to these values in a tuple or dictionary. In an f-string, expressions are evaluated as they are encountered. A more testing example is therefore perhaps:

    '{1} {0}'.format(a(), b()) # E1

    f'{b()}{a()}' # E2

I think I would be very surprised to find b called before a in E1 because of the general contract on the meaning of method calls. I'm assuming that's what an AST-based optimisation would do? There's no reason in E2 to call them in any other order than b then a and the documentation tells me they are.

But do I expect a() to be called before the results of b() are formatted? In E1 I definitely expect that.
In E2 I don't think I'd be surprised either way. Forced to guess, I would guess that b() would be formatted and in the output buffer before a() was called, since it gives the implementation fewer things to remember. Then I hope I would not depend on this guesswork. Strictly-speaking the documentation doesn't say when the result is formatted in relation to the evaluation of other expressions, so there is permission for Serhiy's idea #2. I think the (internal) AST change implied in Serhiy's idea #1 is the price one has to pay *if* one insists on optimising str.format(). str.format just a method like any other. The reasons would have to be very strong to give it special-case semantics. I agree that the cases are rare in which one would notice a difference. (Mostly I think it would be a surprise during debugging.) But I think users should be able to rely on the semantics of call. Easier optimisation doesn't seem to me a strong enough argument. This leaves me at: 1: +1 2a, 2b: +0 3: -1 Jeff Allen -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Thu Mar 29 07:25:13 2018 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 29 Mar 2018 11:25:13 +0000 Subject: [Python-Dev] PEP 574 -- Pickle protocol 5 with out-of-band data In-Reply-To: References: <20180328203951.3488ca41@fsol> <34eaf991-fb6c-c540-ea6d-e7b6267df34f@gmail.com> Message-ID: On Thu, Mar 29, 2018, 02:02 Chris Angelico wrote: > On Thu, Mar 29, 2018 at 7:56 PM, Paul Moore wrote: > > On 29 March 2018 at 09:49, Chris Angelico wrote: > >> On Thu, Mar 29, 2018 at 7:18 PM, Nathaniel Smith wrote: > >>> Another example is the multiprocessing module: it's very safe to > >>> assume that the parent and the child are using the same interpreter > >>> :-). There's no fundamental reason you shouldn't be able to send > >>> bytecode between them. > >> > >> You put a smiley on it, but is this actually guaranteed on all > >> platforms? 
On Unix-like systems, presumably it's using fork() and thus > >> will actually use the exact same binary, but what about on Windows, > >> where a new process has to be spawned? Can you say "spawn me another > >> of this exact binary blob", or do you have to identify it by a file > >> name? > >> > >> It wouldn't be a problem for the nonportable mode to toss out an > >> exception in weird cases like this, but it _would_ be a problem if > >> that causes a segfault or something. > > > > If you're embedding, you need multiprocessing.set_executable() > > ( > https://docs.python.org/3.6/library/multiprocessing.html#multiprocessing.set_executable > ), > > so in that case you definitely *won't* have the same binary... > > Ah, and that also showed me that forking isn't mandatory on Unix > either. So yeah, there's no assuming that they use the same binary. > Normally it spawns children using `sys.executable`, which I think on Windows in particular is guaranteed to be the same binary that started the main process, because the OS locks the file while it's executing. But yeah, I didn't think about the embedding case, and apparently there's also a little-known set of features for using multiprocessing between arbitrary python processes: https://docs.python.org/3/library/multiprocessing.html#multiprocessing-listeners-clients > I doubt it'll be a problem to pickle though as it'll use some form of > versioning even in NONPORTABLE mode right? > I guess the (merged, but undocumented?) changes in https://bugs.python.org/issue28053 should make it possible to set the pickle version, and yeah, if we did add a NONPORTABLE mode then presumably it would have some kind of header saying which version of python it was created with, so version mismatches could give a sensible error message. -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From eric at trueblade.com Thu Mar 29 07:50:50 2018 From: eric at trueblade.com (Eric V. 
Smith) Date: Thu, 29 Mar 2018 07:50:50 -0400 Subject: [Python-Dev] Subtle difference between f-strings and str.format() In-Reply-To: References: <4C81641B-64D6-45FD-9C0D-15AAB7BA1D01@trueblade.com> Message-ID: <64a73b60-0668-561e-eab6-d17c2f4b995e@trueblade.com> On 3/29/2018 6:17 AM, Jeff Allen wrote: > My credentials for this are that I re-worked str.format in Jython quite > extensively, and I followed the design of f-strings a bit when they were > introduced, but I haven't used them to write anything. Thanks for your work on Jython. And hop on the f-string bandwagon! > The difference Serhiy identifies emerges (I think) because in the > conventional interpretation of a format call, the arguments of format > are evaluated left-to right (all of them) and then formatted in the > order references are encountered to these values in a tuple or > dictionary. In an f-string expressions are evaluated as they are > encountered. A more testing example is therefore perhaps: > > ??? '{1} {0}'.format(a(), b()) # E1 > > ??? f'{b()}{a()}' # E2 > > > I think I would be very surprised to find b called before a in E1 > because of the general contract on the meaning of method calls. I'm > assuming that's what an AST-based optimisation would do? There's no > reason in E2 to call them in any other order than b then a and the > documentation tells me they are. > > But do I expect a() to be called before the results of b() are > formatted? In E1 I definitely expect that. In E2 I don't think I'd be > surprised either way. Forced to guess, I would guess that b() would be > formatted and in the output buffer before a() was called, since it gives > the implementation fewer things to remember. Then I hope I would not > depend on this guesswork. Strictly-speaking the documentation doesn't > say when the result is formatted in relation to the evaluation of other > expressions, so there is permission for Serhiy's idea #2. 
I don't think we should restrict f-strings to having to evaluate all of the expressions before formatting. But, if we do restrict it, we should document whatever the order is in 3.6 and add tests to ensure the behavior doesn't change. > I think the (internal) AST change implied in Serhiy's idea #1 is the > price one has to pay *if* one insists on optimising str.format(). > > str.format just a method like any other. The reasons would have to be > very strong to give it special-case semantics. I agree that the cases > are rare in which one would notice a difference. (Mostly I think it > would be a surprise during debugging.) But I think users should be able > to rely on the semantics of call. Easier optimisation doesn't seem to me > a strong enough argument. > > This leaves me at: > 1: +1 > 2a, 2b: +0 > 3: -1 #1 seems so complex as to not be worth it, given the likely small overall impact of the optimization to a large program. If the speedup really is sufficiently important for a particular piece of code, I'd suggest just rewriting the code to use f-strings, and the author could then determine if the transformation breaks anything. Maybe write a 2to3 like tool that would identify places where str.format or %-formatting could be replaced by f-strings? I know I'd run it on my code, if it existed. Because the optimization can only work code with literals, I think manually modifying the source code is an acceptable solution if the possible change in semantics implied by #3 are unacceptable. Eric. From steve at pearwood.info Thu Mar 29 08:28:26 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 29 Mar 2018 23:28:26 +1100 Subject: [Python-Dev] Subtle difference between f-strings and str.format() In-Reply-To: References: Message-ID: <20180329122826.GN16661@ando.pearwood.info> On Wed, Mar 28, 2018 at 06:27:19PM +0300, Serhiy Storchaka wrote: > The optimizer already changes > semantic. 
Non-optimized "if a and True:" would call bool(a) twice, but > optimized code calls it only once. I don't understand this. Why would bool(a) be called twice, and when did this change? Surely calling it twice would be a bug. I just tried the oldest Python 3 I have on this computer, 3.2, and bool is only called once. -- Steve From rosuav at gmail.com Thu Mar 29 10:08:42 2018 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 30 Mar 2018 01:08:42 +1100 Subject: [Python-Dev] Subtle difference between f-strings and str.format() In-Reply-To: <20180329122826.GN16661@ando.pearwood.info> References: <20180329122826.GN16661@ando.pearwood.info> Message-ID: On Thu, Mar 29, 2018 at 11:28 PM, Steven D'Aprano wrote: > On Wed, Mar 28, 2018 at 06:27:19PM +0300, Serhiy Storchaka wrote: > >> The optimizer already changes >> semantic. Non-optimized "if a and True:" would call bool(a) twice, but >> optimized code calls it only once. > > I don't understand this. Why would bool(a) be called twice, and when did > this change? Surely calling it twice would be a bug. > > I just tried the oldest Python 3 I have on this computer, 3.2, and bool > is only called once. Technically not bool() itself, but the equivalent. Here's some similar code: From rosuav at gmail.com Thu Mar 29 10:18:15 2018 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 30 Mar 2018 01:18:15 +1100 Subject: [Python-Dev] Subtle difference between f-strings and str.format() In-Reply-To: References: <20180329122826.GN16661@ando.pearwood.info> Message-ID: On Fri, Mar 30, 2018 at 1:08 AM, Chris Angelico wrote: > On Thu, Mar 29, 2018 at 11:28 PM, Steven D'Aprano wrote: >> On Wed, Mar 28, 2018 at 06:27:19PM +0300, Serhiy Storchaka wrote: >> >>> The optimizer already changes >>> semantic. Non-optimized "if a and True:" would call bool(a) twice, but >>> optimized code calls it only once. >> >> I don't understand this. Why would bool(a) be called twice, and when did >> this change? Surely calling it twice would be a bug. 
>> >> I just tried the oldest Python 3 I have on this computer, 3.2, and bool >> is only called once. > > Technically not bool() itself, but the equivalent. Here's some similar code: Wow, I'm good. Premature send much? Nice going, Chris. Let's try that again. Here's some similar code: >>> def f(a): ... if a and x: ... print("Yep") ... >>> class Bool: ... def __bool__(self): ... print("True?") ... return True ... >>> x = 1 >>> f(Bool()) True? Yep This is, however, boolifying a, then boolifying x separately. To bool a twice, you'd need to write this instead: >>> def f(a): ... if a or False: ... print("Yep") ... In its optimized form, this still only boolifies a once. But we can defeat the optimization: >>> def f(a): ... cond = a or False ... if cond: ... print("Yep") ... >>> f(Bool()) True? True? Yep The "or False" part implies a booleanness check on its left operand, and the 'if' statement performs a boolean truthiness check on its result. That means two calls to __bool__ in the unoptimized form. But it gets optimized, on the assumption that __bool__ is a pure function. The version assigning to a temporary variable does one check before assigning, and then another check in the 'if'; the same thing without the temporary skips the second check, and just goes ahead and enters the body of the 'if'. Technically that's a semantic change. But I doubt it'll hurt anyone. ChrisA From solipsis at pitrou.net Thu Mar 29 10:18:35 2018 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 29 Mar 2018 16:18:35 +0200 Subject: [Python-Dev] PEP 574 -- Pickle protocol 5 with out-of-band data References: <20180328203951.3488ca41@fsol> <34eaf991-fb6c-c540-ea6d-e7b6267df34f@gmail.com> Message-ID: <20180329161835.5b552f67@fsol> On Thu, 29 Mar 2018 11:25:13 +0000 Nathaniel Smith wrote: > > > I doubt it'll be a problem to pickle though as it'll use some form of > > versioning even in NONPORTABLE mode right? > > > > I guess the (merged, but undocumented?) 
changes in > https://bugs.python.org/issue28053 should make it possible to set the > pickle version [...] Not only undocumented, but untested, and they actually look plain wrong in that diff. Notice how "reduction" is imported using `from .context import reduction` and then changed inside the "context" module using `globals()['reduction'] = reduction`. That seems unlikely to produce any effect. (not to mention the abstract base class that doesn't seem to define any abstract methods or properties) To be frank, such an unfinished patch should never have been committed. I may consider undoing it if I find some spare cycles. Regards Antoine. From tjreedy at udel.edu Thu Mar 29 11:06:47 2018 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 29 Mar 2018 11:06:47 -0400 Subject: [Python-Dev] Subtle difference between f-strings and str.format() In-Reply-To: References: Message-ID: On 3/28/2018 11:27 AM, Serhiy Storchaka wrote: > The optimizer already changes > semantic. Non-optimized "if a and True:" would call bool(a) twice, but > optimized code calls it only once. Perhaps the Ref 3.3.1 object.__bool__ entry, after " should return False or True.", should say something like "Should not have side-effects, as redundant bool calls may be optimized away (bool(bool(ob)) should have the same result as bool(ob))." -- Terry Jan Reedy From me at ixokai.io Thu Mar 29 02:45:11 2018 From: me at ixokai.io (Stephen Hansen) Date: Wed, 28 Mar 2018 23:45:11 -0700 Subject: [Python-Dev] Sets, Dictionaries In-Reply-To: References: Message-ID: <1522305911.3536456.1319921336.388C01A3@webmail.messagingengine.com> On Wed, Mar 28, 2018, at 9:14 PM, Julia Kim wrote: > My suggestion is to change the syntax for creating an empty set and an > empty dictionary as following. > > an_empty_set = {} > an_empty_dictionary = {:} > > It would seem to make more sense. The amount of code this would break is astronomical. -- Stephen Hansen m e @ i x o k a i .
i o From ncoghlan at gmail.com Thu Mar 29 11:54:50 2018 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 30 Mar 2018 01:54:50 +1000 Subject: [Python-Dev] PEP 574 -- Pickle protocol 5 with out-of-band data In-Reply-To: <20180328203951.3488ca41@fsol> References: <20180328203951.3488ca41@fsol> Message-ID: On 29 March 2018 at 04:39, Antoine Pitrou wrote: > > Hi, > > I'd like to submit this PEP for discussion. It is quite specialized > and the main target audience of the proposed changes is > users and authors of applications/libraries transferring large amounts > of data (read: the scientific computing & data science ecosystems). > > https://www.python.org/dev/peps/pep-0574/ > > The PEP text is also inlined below. +1 from me, which you already knew :) For folks that haven't read Eric Snow's PEP 554 about exposing multiple interpreter support as a Python level API, Antoine's proposed zero-copy-data-management enhancements for pickle complement that nicely, since they allow the three initial communication primitives in PEP 554 (passing None, bytes, memory views) to be more efficiently expanded to handling arbitrary objects by sending first the pickle data, then the out-of-band memory views, and finally None as an end-of-message marker. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Thu Mar 29 12:13:15 2018 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 30 Mar 2018 02:13:15 +1000 Subject: [Python-Dev] Subtle difference between f-strings and str.format() In-Reply-To: <64a73b60-0668-561e-eab6-d17c2f4b995e@trueblade.com> References: <4C81641B-64D6-45FD-9C0D-15AAB7BA1D01@trueblade.com> <64a73b60-0668-561e-eab6-d17c2f4b995e@trueblade.com> Message-ID: On 29 March 2018 at 21:50, Eric V. Smith wrote: > #1 seems so complex as to not be worth it, given the likely small overall > impact of the optimization to a large program. 
If the speedup really is > sufficiently important for a particular piece of code, I'd suggest just > rewriting the code to use f-strings, and the author could then determine if > the transformation breaks anything. Maybe write a 2to3 like tool that would > identify places where str.format or %-formatting could be replaced by > f-strings? I know I'd run it on my code, if it existed. Because the > optimization can only work code with literals, I think manually modifying > the source code is an acceptable solution if the possible change in > semantics implied by #3 are unacceptable. While more projects are starting to actively drop Python 2.x support, there are also quite a few still straddling the two different versions. The "rewrite to f-strings" approach requires explicitly dropping support for everything below 3.6, whereas implicit optimization of literal based formatting will work even for folks preserving backwards compatibility with older versions. As far as the semantics go, perhaps it would be possible to explicitly create a tuple as part of the implementation to ensure that the arguments are still evaluated in order, and everything gets calculated exactly once? This would have the benefit that even format strings that used numbered references could be optimised in a fairly straightforward way. '{}{}'.format(a, b) would become: _hidden_ref = (a, b) f'{_hidden_ref[0]}{_hidden_ref[1]}' while: '{1}{0}'.format(a, b) would become: _hidden_ref = (a, b) f'{_hidden_ref[1]}{_hidden_ref[0]}' This would probably need to be implemented as Serhiy's option 1 (generating a distinct AST node), which in turn leads to 2a: adding extra stack manipulation opcodes in order to more closely replicate str.format semantics. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From eric at trueblade.com Thu Mar 29 13:33:16 2018 From: eric at trueblade.com (Eric V. 
Smith) Date: Thu, 29 Mar 2018 13:33:16 -0400 Subject: [Python-Dev] Subtle difference between f-strings and str.format() In-Reply-To: References: <4C81641B-64D6-45FD-9C0D-15AAB7BA1D01@trueblade.com> <64a73b60-0668-561e-eab6-d17c2f4b995e@trueblade.com> Message-ID: On 3/29/2018 12:13 PM, Nick Coghlan wrote: > On 29 March 2018 at 21:50, Eric V. Smith wrote: >> #1 seems so complex as to not be worth it, given the likely small overall >> impact of the optimization to a large program. If the speedup really is >> sufficiently important for a particular piece of code, I'd suggest just >> rewriting the code to use f-strings, and the author could then determine if >> the transformation breaks anything. Maybe write a 2to3 like tool that would >> identify places where str.format or %-formatting could be replaced by >> f-strings? I know I'd run it on my code, if it existed. Because the >> optimization can only work code with literals, I think manually modifying >> the source code is an acceptable solution if the possible change in >> semantics implied by #3 are unacceptable. > > While more projects are starting to actively drop Python 2.x support, > there are also quite a few still straddling the two different > versions. The "rewrite to f-strings" approach requires explicitly > dropping support for everything below 3.6, whereas implicit > optimization of literal based formatting will work even for folks > preserving backwards compatibility with older versions. Sure. But 3.6 will be 3 years old before this optimization is released. I've been seeing 3.4 support dropping off, and expect to see 3.5 follow suit by the time 3.8 is released. Although maybe the thought is to do this in a bug-fix release? If we're changing semantics at all, that seems like a non-starter. 
> As far as the semantics go, perhaps it would be possible to explicitly > create a tuple as part of the implementation to ensure that the > arguments are still evaluated in order, and everything gets calculated > exactly once? This would have the benefit that even format strings > that used numbered references could be optimised in a fairly > straightforward way. > > '{}{}'.format(a, b) > > would become: > > _hidden_ref = (a, b) > f'{_hidden_ref[0]}{_hidden_ref[1]}' > > while: > > '{1}{0}'.format(a, b) > > would become: > > _hidden_ref = (a, b) > f'{_hidden_ref[1]}{_hidden_ref[0]}' > > This would probably need to be implemented as Serhiy's option 1 > (generating a distinct AST node), which in turn leads to 2a: adding > extra stack manipulation opcodes in order to more closely replicate > str.format semantics. I still think the complexity isn't worth it, but maybe I'm a lone voice on this. Eric. From mertz at gnosis.cx Thu Mar 29 13:11:47 2018 From: mertz at gnosis.cx (David Mertz) Date: Thu, 29 Mar 2018 17:11:47 +0000 Subject: [Python-Dev] Sets, Dictionaries In-Reply-To: <20180329055755.GM16661@ando.pearwood.info> References: <20180329055755.GM16661@ando.pearwood.info> Message-ID: I agree with everything Steven says. But it's true that even as a 20-year Python user, this is an error I make moderately often when I want an empty set... Notwithstanding that I typed it thousands of times before sets even existed (and still type it when I want an empty dictionary). That said, I've sort of got in the habit of using the type initializers: x = set() y = dict() z = list() I feel like those jump out a little better visually. But I'm inconsistent in my code. On Thu, Mar 29, 2018, 2:03 AM Steven D'Aprano wrote: > Hi Julia, and welcome! > > On Wed, Mar 28, 2018 at 09:14:53PM -0700, Julia Kim wrote: > > > My suggestion is to change the syntax for creating an empty set and an > > empty dictionary as following. 
> > > > an_empty_set = {} > > an_empty_dictionary = {:} > > > > It would seem to make more sense. > > Indeed it would, and if sets had existed in Python since the beginning, > that's probably exactly what we would have done. But unfortunately they > didn't, and {} has meant an empty dict forever. > > The requirement to keep backwards-compatibility is a very, very hard > barrier to cross. I think we all acknowledge that it is sad and a little > bit confusing that {} means a dict not a set, but it isn't sad or > confusing enough to justify breaking millions of existing scripts and > applications. > > Not to mention the confusing transition period when the community would > be using *both* standards at the same time, which could easily last ten > years. > > Given that, I think we just have to accept that having to use set() for > the empty set instead of {} is a minor wart on the language that we're > stuck with. > > If you disagree, and think that you have a concrete plan that can make > this transition work, we'll be happy to hear it, but you'll almost > certainly need to write a PEP before it could be accepted. > > https://www.python.org/dev/peps/ > > > Thanks, > > -- > Steve > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/mertz%40gnosis.cx > -------------- next part -------------- An HTML attachment was scrubbed... URL: From storchaka at gmail.com Thu Mar 29 14:13:10 2018 From: storchaka at gmail.com (Serhiy Storchaka) Date: Thu, 29 Mar 2018 21:13:10 +0300 Subject: [Python-Dev] PEP 574 -- Pickle protocol 5 with out-of-band data In-Reply-To: <20180328221939.10d3cd01@fsol> References: <20180328203951.3488ca41@fsol> <34eaf991-fb6c-c540-ea6d-e7b6267df34f@gmail.com> <20180328221939.10d3cd01@fsol> Message-ID: 28.03.18 23:19, Antoine Pitrou ????: > Agreed. 
Do you know by which timeframe you'll know which opcodes you > want to add? I'm currently in the middle of the first part, trying to implement pickling local classes with static and class methods without creating loops. The other parts exist only as general ideas; I haven't written code for them yet. I try to do this with existing protocols, but maybe some new opcodes will be needed for efficiency. We are now at the early stage of 3.8 development, and I think we have a lot of time. It wouldn't deserve bumping the pickle version on its own, but if we are doing that already, it would be worth adding shorter versions of FRAME. Currently it uses a 64-bit size, and 9 bytes is a large overhead for short pickles. An 8-bit size would reduce overhead for short pickles, and a 32-bit size would be enough for any practical use (larger data is not wrapped in a frame). From ericfahlgren at gmail.com Thu Mar 29 14:15:01 2018 From: ericfahlgren at gmail.com (Eric Fahlgren) Date: Thu, 29 Mar 2018 11:15:01 -0700 Subject: [Python-Dev] Sets, Dictionaries In-Reply-To: References: <20180329055755.GM16661@ando.pearwood.info> Message-ID: On Thu, Mar 29, 2018 at 10:11 AM, David Mertz wrote: > I agree with everything Steven says. But it's true that even as a 20-year > Python user, this is an error I make moderately often when I want an empty > set... Notwithstanding that I typed it thousands of times before sets even > existed (and still type it when I want an empty dictionary). > > That said, I've sort of got in the habit of using the type initializers: > > x = set() > y = dict() > z = list() > > I feel like those jump out a little better visually. But I'm inconsistent > in my code. > Yeah, we've been doing that for several years, too. A hair slower in some cases, but much more greppable... -------------- next part -------------- An HTML attachment was scrubbed...
URL: From storchaka at gmail.com Thu Mar 29 14:30:42 2018 From: storchaka at gmail.com (Serhiy Storchaka) Date: Thu, 29 Mar 2018 21:30:42 +0300 Subject: [Python-Dev] Subtle difference between f-strings and str.format() In-Reply-To: References: <4C81641B-64D6-45FD-9C0D-15AAB7BA1D01@trueblade.com> Message-ID: 29.03.18 13:17, Jeff Allen wrote: > '{1} {0}'.format(a(), b()) # E1 > > f'{b()}{a()}' # E2 > > > I think I would be very surprised to find b called before a in E1 > because of the general contract on the meaning of method calls. I'm > assuming that's what an AST-based optimisation would do? There's no > reason in E2 to call them in any other order than b then a and the > documentation tells me they are. I was going to optimize only formatting with implicit references: '{} {}', but not '{1} {0}', and not '{0} {1}' either. This guarantees in-order computation and referencing every subexpression only once. I don't have a goal of converting every string formatting, but only the most common and the most simple ones. If we go further, we will need to add several new AST nodes (like for comprehensions). From donald at stufft.io Thu Mar 29 17:28:30 2018 From: donald at stufft.io (Donald Stufft) Date: Thu, 29 Mar 2018 17:28:30 -0400 Subject: [Python-Dev] Move ensurepip blobs to external place In-Reply-To: References: Message-ID: <992C0319-6AF7-4E59-90CD-0AAA100A988F@stufft.io> From my POV I don't care where they live, just document how to update them going forward. Sent from my iPhone > On Mar 24, 2018, at 4:50 AM, Serhiy Storchaka wrote: > > Wouldn't be better to put them into a separate repository like Tcl/Tk and other external binaries for Windows, and download only the recent version?
From wes.turner at gmail.com Thu Mar 29 18:31:54 2018 From: wes.turner at gmail.com (Wes Turner) Date: Thu, 29 Mar 2018 18:31:54 -0400 Subject: [Python-Dev] Move ensurepip blobs to external place In-Reply-To: <992C0319-6AF7-4E59-90CD-0AAA100A988F@stufft.io> References: <992C0319-6AF7-4E59-90CD-0AAA100A988F@stufft.io> Message-ID: AFAIU, the objectives (with no particular ranking) are: - minimize git clone time and bandwidth - include latest pip with every python install - run the full test suite with CI for every PR (buildbot) - the full test suite requires pip - run the test suite locally when developing a PR - minimize PyPI bandwidth What are the proposed solutions? ... https://help.github.com/articles/about-storage-and-bandwidth-usage/ > All personal and organization accounts using Git LFS receive 1 GB of free storage and 1 GB a month of free bandwidth. If the bandwidth and storage quotas are not enough, you can choose to purchase an additional quota for Git LFS. > > Git LFS is available for every repository on GitHub, whether or not your account or organization has a paid plan. On Thursday, March 29, 2018, Donald Stufft wrote: > From my POV I don't care where they live, just document how to update them > going forward. > > Sent from my iPhone > > > On Mar 24, 2018, at 4:50 AM, Serhiy Storchaka > wrote: > > > > Wouldn't be better to put them into a separate repository like Tcl/Tk > and other external binaries for Windows, and download only the recent > version? > -------------- next part -------------- An HTML attachment was scrubbed...
URL: From steve at pearwood.info Thu Mar 29 19:16:47 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 30 Mar 2018 10:16:47 +1100 Subject: [Python-Dev] Subtle difference between f-strings and str.format() In-Reply-To: References: Message-ID: <20180329231647.GO16661@ando.pearwood.info> On Wed, Mar 28, 2018 at 06:27:19PM +0300, Serhiy Storchaka wrote: > 2. Change the semantic of f-strings. Make it closer to the semantic of > str.format(): evaluate all subexpressions first than format them. This > can be implemented in two ways: > > 2a) Add additional instructions for stack manipulations. This will slow > down f-strings. > > 2b) Introduce a new complex opcode that will replace FORMAT_VALUE and > BUILD_STRING. This will speed up f-strings. If the aim here is to be an optimization, then I vote strongly for 2b. That gives you *faster f-strings* that have the same order-of-evaluation of normal method calls, so that when you optimize str.format into an f-string, not only is the behaviour identical, but they will be even faster than with option 3. Python's execution model implies that obj.method(expression_a, expression_b) should fully evaluate both expressions before they are passed to the method. Making str.format a magical special case that violates that rule should be a last resort. In this case, we can have our cake and eat it too: both the str.format to f-string optimization and keeping the normal evaluation rules. And as a bonus, we make f-strings even faster. I say "we", but of course it is Serhiy doing the work, thank you. Is there a down-side to 2b? It sounds like something you might end up doing at a later date regardless of what you do now. 
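For readers following the thread, the ordering difference under discussion can be observed directly. The sketch below is not taken from any of the posts: the names Logged, make, run_format and run_fstring are made up for illustration, and it assumes CPython 3.6 or later. It records when each argument expression is evaluated and when its __format__ method runs.

```python
# Hypothetical helper names; this only illustrates the ordering question.
events = []


class Logged:
    """Value that records when its __format__ method is invoked."""

    def __init__(self, name):
        self.name = name

    def __format__(self, spec):
        events.append("format " + self.name)
        return self.name


def make(name):
    """Stand-in for a() / b(): records when the expression is evaluated."""
    events.append("eval " + name)
    return Logged(name)


def run_format():
    # Ordinary method call: both arguments are evaluated before format() runs.
    events.clear()
    "{}{}".format(make("a"), make("b"))
    return list(events)


def run_fstring():
    # f-string: each expression is formatted as soon as it has been evaluated.
    events.clear()
    f"{make('a')}{make('b')}"
    return list(events)


if __name__ == "__main__":
    print(run_format())
    print(run_fstring())
```

Under those assumptions, run_format() should report both evaluations before either formatting step, while run_fstring() should interleave them; that interleaving is what a str.format-to-f-string rewrite would have to either preserve or deliberately change.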
-- Steve From nad at python.org Thu Mar 29 21:44:28 2018 From: nad at python.org (Ned Deily) Date: Thu, 29 Mar 2018 21:44:28 -0400 Subject: [Python-Dev] [RELEASE] Python 3.7.0b3 is now available for testing Message-ID: On behalf of the Python development community and the Python 3.7 release team, I'm happy to announce the availability of Python 3.7.0b3. b3 is the third of four planned beta releases of Python 3.7, the next major release of Python, and marks the end of the feature development phase for 3.7. You can find Python 3.7.0b3 here: https://www.python.org/downloads/release/python-370b3/ Among the major new features in Python 3.7 are: * PEP 538, Coercing the legacy C locale to a UTF-8 based locale * PEP 539, A New C-API for Thread-Local Storage in CPython * PEP 540, UTF-8 mode * PEP 552, Deterministic pyc * PEP 553, Built-in breakpoint() * PEP 557, Data Classes * PEP 560, Core support for typing module and generic types * PEP 562, Module __getattr__ and __dir__ * PEP 563, Postponed Evaluation of Annotations * PEP 564, Time functions with nanosecond resolution * PEP 565, Show DeprecationWarning in __main__ * PEP 567, Context Variables Please see "What's New In Python 3.7" for more information. Additional documentation for these features and for other changes will be provided during the beta phase. https://docs.python.org/3.7/whatsnew/3.7.html Beta releases are intended to give you the opportunity to test new features and bug fixes and to prepare your projects to support the new feature release. We strongly encourage you to test your projects with 3.7 during the beta phase and report issues found to https://bugs.python.org as soon as possible. While the release is feature complete entering the beta phase, it is possible that features may be modified or, in rare cases, deleted up until the start of the release candidate phase (2018-05-21). Our goal is to have no ABI changes after beta 3 and no code changes after rc1.
To achieve that, it will be extremely important to get as much exposure for 3.7 as possible during the beta phase. Attention macOS users: there is a new installer variant for macOS 10.9+ that includes a built-in version of Tcl/Tk 8.6. This variant is expected to become the default version when 3.7.0 releases. Check it out! We welcome your feedback. As of 3.7.0b3, the legacy 10.6+ installer also includes a built-in Tcl/Tk 8.6. Please keep in mind that this is a preview release and its use is not recommended for production environments. The next planned release of Python 3.7 will be 3.7.0b4, currently scheduled for 2018-04-30. More information about the release schedule can be found here: https://www.python.org/dev/peps/pep-0537/ -- Ned Deily nad at python.org -- [] From ncoghlan at gmail.com Fri Mar 30 01:41:22 2018 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 30 Mar 2018 15:41:22 +1000 Subject: [Python-Dev] Subtle difference between f-strings and str.format() In-Reply-To: References: <4C81641B-64D6-45FD-9C0D-15AAB7BA1D01@trueblade.com> <64a73b60-0668-561e-eab6-d17c2f4b995e@trueblade.com> Message-ID: On 30 March 2018 at 03:33, Eric V. Smith wrote: > On 3/29/2018 12:13 PM, Nick Coghlan wrote: >> While more projects are starting to actively drop Python 2.x support, >> there are also quite a few still straddling the two different >> versions. The "rewrite to f-strings" approach requires explicitly >> dropping support for everything below 3.6, whereas implicit >> optimization of literal based formatting will work even for folks >> preserving backwards compatibility with older versions. > > > Sure. But 3.6 will be 3 years old before this optimization is released. I've > been seeing 3.4 support dropping off, and expect to see 3.5 follow suit by > the time 3.8 is released. Although maybe the thought is to do this in a > bug-fix release? If we're changing semantics at all, that seems like a > non-starter. Definitely 3.8+ only. 
The nice thing about doing this implicitly at the compiler level is that it potentially provides an automatic performance improvement for existing libraries and applications. The justification for the extra complexity would then come from whether or not it actually measurably improve things, either for the benchmark suite, or for folks' real-world applications. Steven D'Aprano also raises a good point on that front: a FORMAT_STRING super-opcode that sped up f-strings *and* allowed semantics preserving constant-folding of str.format calls on string literals could make more sense than a change that focused solely on the implicit optimisation case. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From songofacandy at gmail.com Fri Mar 30 02:28:47 2018 From: songofacandy at gmail.com (INADA Naoki) Date: Fri, 30 Mar 2018 15:28:47 +0900 Subject: [Python-Dev] How can we use 48bit pointer safely? Message-ID: Hi, As far as I know, most amd64 and arm64 systems use only 48bit address spaces. (except [1]) [1] https://software.intel.com/sites/default/files/managed/2b/80/5-level_paging_white_paper.pdf It means there are some chance to compact some data structures. I point two examples below. My question is; can we use 48bit pointer safely? It depends on CPU architecture & OS memory map. Maybe, configure option which is available on only (amd64, amd64) * (Linux, Windows, macOS)? # Possible optimizations by 48bit pointer ## PyASCIIObject [snip] unsigned int ready:1; /* Padding to ensure that PyUnicode_DATA() is always aligned to 4 bytes (see issue #19537 on m68k). */ unsigned int :24; } state; wchar_t *wstr; /* wchar_t representation (null-terminated) */ } PyASCIIObject; Currently, state is 8bit + 24bit padding. I think we can pack state and wstr in 64bit. ## PyDictKeyEntry typedef struct { /* Cached hash code of me_key. 
*/ Py_hash_t me_hash; PyObject *me_key; PyObject *me_value; /* This field is only meaningful for combined tables */ } PyDictKeyEntry; There are chance to compact it: Use only 32bit for hash and 48bit*2 for key and value. CompactEntry may be 16byte instead of 24byte. Regards, -- INADA Naoki From storchaka at gmail.com Fri Mar 30 06:05:59 2018 From: storchaka at gmail.com (Serhiy Storchaka) Date: Fri, 30 Mar 2018 13:05:59 +0300 Subject: [Python-Dev] Subtle difference between f-strings and str.format() In-Reply-To: <20180329231647.GO16661@ando.pearwood.info> References: <20180329231647.GO16661@ando.pearwood.info> Message-ID: 30.03.18 02:16, Steven D'Aprano wrote: > Is there a down-side to 2b? It sounds like something you might end up > doing at a later date regardless of what you do now. This complicate the compiler and the eval loop, especially in the case of nested substitutions in formats, like f'{value:+{width:d}.{prec:d}f}' The speed gain can be too small. The complex implementation of the opcode should be tightly integrated with the ceval loop, otherwise we can get a slow down, as in one of my experimental implementation of BUILD_STRING (https://bugs.python.org/issue27078#msg270505). The exact benefit is unknown until this feature be implemented. From storchaka at gmail.com Fri Mar 30 06:23:22 2018 From: storchaka at gmail.com (Serhiy Storchaka) Date: Fri, 30 Mar 2018 13:23:22 +0300 Subject: [Python-Dev] How can we use 48bit pointer safely? In-Reply-To: References: Message-ID: 30.03.18 09:28, INADA Naoki wrote: > As far as I know, most amd64 and arm64 systems use only 48bit address spaces. > (except [1]) > > [1] https://software.intel.com/sites/default/files/managed/2b/80/5-level_paging_white_paper.pdf > > It means there are some chance to compact some data structures. > I point two examples below. > > My question is; can we use 48bit pointer safely? > It depends on CPU architecture & OS memory map.
> Maybe, configure option which is available on only (amd64, amd64) * > (Linux, Windows, macOS)? If size is the main problem, we could use these 8 bit for encoding the type and the size of the value for some types, and even encode the value itself for some types in other 48 bits. For example 48 bit integers, 0-, 1- and 2-character Unicode strings, ASCII strings up to 6 characters (and even longer if using base 64 encodings for ASCII identifiers), singletons like None, True, False, Ellipsis, NotImplemented could be encoded in 64 bit word without using additional memory. But this would significantly complicate and slow down the code. From ncoghlan at gmail.com Fri Mar 30 06:25:01 2018 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 30 Mar 2018 20:25:01 +1000 Subject: [Python-Dev] Subtle difference between f-strings and str.format() In-Reply-To: References: <20180329231647.GO16661@ando.pearwood.info> Message-ID: On 30 March 2018 at 20:05, Serhiy Storchaka wrote: > 30.03.18 02:16, Steven D'Aprano wrote: >> >> Is there a down-side to 2b? It sounds like something you might end up >> doing at a later date regardless of what you do now. > > > This complicate the compiler and the eval loop, especially in the case of > nested substitutions in formats, like > > f'{value:+{width:d}.{prec:d}f}' This point reminded me that there's still https://www.python.org/dev/peps/pep-0536/ to consider as well (that's the PEP about migrating f-strings to a fully nested expression grammar rather than hijacking the existing string tokenisation code). I *think* that's an orthogonal concern (since it relates to the initial parsing and AST compilation phase, rather than the code generation and execution phase), but it's worth keeping in mind. Cheers, Nick.
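[Editorial sketch, not part of the original message.] For readers following along, here is a minimal illustration of the nested-substitution case quoted above. The sample values for value, width and prec are assumptions picked for the example. It only shows that the inner fields are evaluated first to build the format spec, which is then applied to the outer value, and that the f-string, str.format() and the format() built-in arrive at the same result.

```python
# Illustrative values only; they do not come from the thread.
value, width, prec = 3.14159, 10, 3

# f-string with nested replacement fields inside the format spec.
via_fstring = f'{value:+{width:d}.{prec:d}f}'

# The same formatting spelled with str.format(); the nested fields
# consume the next automatically numbered positional arguments.
via_format = '{:+{:d}.{:d}f}'.format(value, width, prec)

# What both forms boil down to: build the spec first, then apply it.
spec = '+' + format(width, 'd') + '.' + format(prec, 'd') + 'f'
via_builtin = format(value, spec)

print(repr(spec))
print(repr(via_fstring))
print(via_fstring == via_format == via_builtin)
```

Any optimisation that rewrites the str.format() call into the f-string form would have to preserve this two-stage evaluation, which is part of why the nested case complicates the compiler.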
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From storchaka at gmail.com Fri Mar 30 06:29:53 2018 From: storchaka at gmail.com (Serhiy Storchaka) Date: Fri, 30 Mar 2018 13:29:53 +0300 Subject: [Python-Dev] Subtle difference between f-strings and str.format() In-Reply-To: References: Message-ID: 29.03.18 18:06, Terry Reedy wrote: > On 3/28/2018 11:27 AM, Serhiy Storchaka wrote: >> The optimizer already changes semantic. Non-optimized "if a and True:" >> would call bool(a) twice, but optimized code calls it only once. > > Perhaps Ref 3.3.1 object.__bool__ entry, after " should return False or > True.", should say something like "Should not have side-effects, as > redundant bool calls may be optimized away (bool(bool(ob)) should have > the same result as bool(ob))." Do you meant that it should be idempotent operation? Because bool(bool(ob)) always have the same result as bool(ob)) if bool(ob) returns True or False. From steve at pearwood.info Fri Mar 30 06:50:27 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 30 Mar 2018 21:50:27 +1100 Subject: [Python-Dev] Subtle difference between f-strings and str.format() In-Reply-To: References: Message-ID: <20180330105026.GQ16661@ando.pearwood.info> On Fri, Mar 30, 2018 at 01:29:53PM +0300, Serhiy Storchaka wrote: > 29.03.18 18:06, Terry Reedy wrote: > >On 3/28/2018 11:27 AM, Serhiy Storchaka wrote: > >>The optimizer already changes semantic. Non-optimized "if a and True:" > >>would call bool(a) twice, but optimized code calls it only once. > > > >Perhaps Ref 3.3.1 object.__bool__ entry, after " should return False or > >True.", should say something like "Should not have side-effects, as > >redundant bool calls may be optimized away (bool(bool(ob)) should have > >the same result as bool(ob))." > > Do you meant that it should be idempotent operation? Because > bool(bool(ob)) always have the same result as bool(ob)) if bool(ob) > returns True or False.
Assuming that bool is the built-in, and hasn't been shadowed or monkey-patched. -- Steve From njs at pobox.com Fri Mar 30 07:16:47 2018 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 30 Mar 2018 04:16:47 -0700 Subject: [Python-Dev] Subtle difference between f-strings and str.format() In-Reply-To: References: Message-ID: On Fri, Mar 30, 2018 at 3:29 AM, Serhiy Storchaka wrote: > 29.03.18 18:06, Terry Reedy wrote: >> >> On 3/28/2018 11:27 AM, Serhiy Storchaka wrote: >>> >>> The optimizer already changes semantic. Non-optimized "if a and True:" >>> would call bool(a) twice, but optimized code calls it only once. >> >> >> Perhaps Ref 3.3.1 object.__bool__ entry, after " should return False or >> True.", should say something like "Should not have side-effects, as >> redundant bool calls may be optimized away (bool(bool(ob)) should have the >> same result as bool(ob))." > > > Do you meant that it should be idempotent operation? Because bool(bool(ob)) > always have the same result as bool(ob)) if bool(ob) returns True or False. And bool(obj) does always return True or False; if you define a __bool__ method that returns something else then bool rejects it and raises TypeError. So bool(bool(obj)) is already indistinguishable from bool(obj). However, the naive implementation of 'if a and True:' doesn't call bool(bool(a)), it calls bool(a) twice, and this *is* distinguishable by user code, at least in principle. If we want to change the language spec, I guess it would be with text like: "if bool(obj) would be called twice in immediate succession, with no other code in between, then the interpreter may assume that both calls would return the same value and elide one of them". -n -- Nathaniel J. Smith -- https://vorpus.org From solipsis at pitrou.net Fri Mar 30 07:33:25 2018 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 30 Mar 2018 13:33:25 +0200 Subject: [Python-Dev] How can we use 48bit pointer safely?
References: Message-ID: <20180330133325.199ed546@fsol> On Fri, 30 Mar 2018 15:28:47 +0900 INADA Naoki wrote: > Hi, > > As far as I know, most amd64 and arm64 systems use only 48bit address spaces. > (except [1]) > > [1] https://software.intel.com/sites/default/files/managed/2b/80/5-level_paging_white_paper.pdf > > It means there are some chance to compact some data structures. As that paper shows, effective virtual address width tends to increase over time to accommodate growing needs. Bigger systems like IBM POWER systems may already have larger virtual address spaces. So we can't safely assume that bits 48-63 are available for us. Another issue is the cost of the associated bit-twiddling. It will all depend how often it needs to be done. Note that pointers can be "negative", i.e. some of them will have all 1s in their upper bits, and you need to reproduce that when reconstituting the original pointer. A safer alternative is to use the *lower* bits of pointers. The bottom 3 bits are always available for storing ancillary information, since typically all heap-allocated data will be at least 8-bytes aligned (probably 16-bytes aligned on 64-bit processes). However, you also get less bits :-) Regards Antoine. From ncoghlan at gmail.com Fri Mar 30 07:41:46 2018 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 30 Mar 2018 21:41:46 +1000 Subject: [Python-Dev] Subtle difference between f-strings and str.format() In-Reply-To: References: Message-ID: On 30 March 2018 at 21:16, Nathaniel Smith wrote: > On Fri, Mar 30, 2018 at 3:29 AM, Serhiy Storchaka wrote: >> 29.03.18 18:06, Terry Reedy wrote: >>> >>> On 3/28/2018 11:27 AM, Serhiy Storchaka wrote: >>>> >>>> The optimizer already changes semantic. Non-optimized "if a and True:" >>>> would call bool(a) twice, but optimized code calls it only once.
>>> >>> >>> Perhaps Ref 3.3.1 object.__bool__ entry, after " should return False or >>> True.", should say something like "Should not have side-effects, as >>> redundant bool calls may be optimized away (bool(bool(ob)) should have the >>> same result as bool(ob))." >> >> >> Do you meant that it should be idempotent operation? Because bool(bool(ob)) >> always have the same result as bool(ob)) if bool(ob) returns True or False. > > And bool(obj) does always return True or False; if you define a > __bool__ method that returns something else then bool rejects it and > raises TypeError. So bool(bool(obj)) is already indistinguishable from > bool(obj). > > However, the naive implementation of 'if a and True:' doesn't call > bool(bool(a)), it calls bool(a) twice, and this *is* distinguishable > by user code, at least in principle. For example: >>> class FlipFlop: ... _state = False ... def __bool__(self): ... result = self._state ... self._state = not result ... return result ... >>> toggle = FlipFlop() >>> bool(toggle) False >>> bool(toggle) True >>> bool(toggle) False >>> bool(toggle) and bool(toggle) False >>> toggle and toggle <__main__.FlipFlop object at 0x7f35293604e0> >>> bool(toggle and toggle) True So the general principle is that __bool__ implementations shouldn't do anything that will change the result of the next call to __bool__, or else weirdness is going to result. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ronaldoussoren at mac.com Fri Mar 30 07:55:44 2018 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Fri, 30 Mar 2018 11:55:44 +0000 (GMT) Subject: [Python-Dev] How can we use 48bit pointer safely? Message-ID: <03f9be17-4e4c-422c-9623-e05f62e468a2@me.com> On Mar 30, 2018, at 08:31 AM, INADA Naoki wrote: Hi, As far as I know, most amd64 and arm64 systems use only 48bit address spaces. 
(except [1]) [1] https://software.intel.com/sites/default/files/managed/2b/80/5-level_paging_white_paper.pdf It means there are some chance to compact some data structures. I point two examples below. My question is; can we use 48bit pointer safely? Not really, at least some CPUs can also address more memory than that. See which talks about Linux support for 57-bit virtual addresses and 52-bit physical addresses. Ronald From jsbueno at python.org.br Fri Mar 30 09:08:50 2018 From: jsbueno at python.org.br (Joao S. O. Bueno) Date: Fri, 30 Mar 2018 10:08:50 -0300 Subject: [Python-Dev] How can we use 48bit pointer safely? In-Reply-To: <03f9be17-4e4c-422c-9623-e05f62e468a2@me.com> References: <03f9be17-4e4c-422c-9623-e05f62e468a2@me.com> Message-ID: Not only that, but afaik Linux could simply raise that 57bit virtual to 64bit virtual without previous warning on any version change. On 30 March 2018 at 08:55, Ronald Oussoren wrote: > > > On Mar 30, 2018, at 08:31 AM, INADA Naoki wrote: > > Hi, > > As far as I know, most amd64 and arm64 systems use only 48bit address > spaces. > (except [1]) > > [1] > https://software.intel.com/sites/default/files/managed/2b/80/5-level_paging_white_paper.pdf > > It means there are some chance to compact some data structures. > I point two examples below. > > My question is; can we use 48bit pointer safely? > > > Not really, at least some CPUs can also address more memory than that. See > which talks about Linux support for > 57-bit virtual addresses and 52-bit physical addresses.
> > Ronald > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/jsbueno%40python.org.br > From ronaldoussoren at mac.com Fri Mar 30 09:33:49 2018 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Fri, 30 Mar 2018 13:33:49 +0000 (GMT) Subject: [Python-Dev] How can we use 48bit pointer safely? Message-ID: <04c3dd38-4972-4a30-9e00-511aa5e02522@me.com> On Mar 30, 2018, at 03:11 PM, "Joao S. O. Bueno" wrote: Not only that, but afaik Linux could simply raise that 57bit virtual to 64bit virtual without previous warning on any version change. The change from 48-bit to 57-bit virtual addresses was not done without any warning because that would have broken too much code (IIRC due to at least some JS environments assuming 48bit pointers). Ronald From ronaldoussoren at mac.com Fri Mar 30 09:38:18 2018 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Fri, 30 Mar 2018 13:38:18 +0000 (GMT) Subject: [Python-Dev] How can we use 48bit pointer safely? Message-ID: <9efcdd2f-9c52-4577-8ee9-d16a3fd9b4e5@me.com> On Mar 30, 2018, at 01:40 PM, Antoine Pitrou wrote: A safer alternative is to use the *lower* bits of pointers. The bottom 3 bits are always available for storing ancillary information, since typically all heap-allocated data will be at least 8-bytes aligned (probably 16-bytes aligned on 64-bit processes). However, you also get less bits :-) The lower bits are more interesting to use. I'm still hoping to find some time to experiment with tagged pointers some day, that could be interesting w.r.t. performance and memory use (at the cost of being ABI incompatible). Ronald
From solipsis at pitrou.net Fri Mar 30 09:54:42 2018 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 30 Mar 2018 15:54:42 +0200 Subject: [Python-Dev] Nuking wstr [Re: How can we use 48bit pointer safely?] References: Message-ID: <20180330155442.6bf2cd9f@fsol> On Fri, 30 Mar 2018 15:28:47 +0900 INADA Naoki wrote: > > # Possible optimizations by 48bit pointer > > ## PyASCIIObject > > [snip] > unsigned int ready:1; > /* Padding to ensure that PyUnicode_DATA() is always aligned to > 4 bytes (see issue #19537 on m68k). */ > unsigned int :24; > } state; > wchar_t *wstr; /* wchar_t representation (null-terminated) */ > } PyASCIIObject; > > Currently, state is 8bit + 24bit padding. I think we can pack state and wstr > in 64bit. We could also simply nuke wstr. I frankly don't think it's very important. It's only used when calling system functions taking a wchar_t argument, as an "optimization". I'd be willing to guess that modern workloads aren't bottlenecked by the cost overhead of those system functions... Of course, the question is whether all this matters. Is it important to save 8 bytes on each unicode object? Only testing would tell. Regards Antoine. From tjreedy at udel.edu Fri Mar 30 10:33:22 2018 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 30 Mar 2018 10:33:22 -0400 Subject: [Python-Dev] Subtle difference between f-strings and str.format() In-Reply-To: References: Message-ID: On 3/30/2018 6:29 AM, Serhiy Storchaka wrote: > 29.03.18 18:06, Terry Reedy wrote: >> On 3/28/2018 11:27 AM, Serhiy Storchaka wrote: >>> The optimizer already changes semantic. Non-optimized "if a and >>> True:" would call bool(a) twice, but optimized code calls it only once. >> >> Perhaps Ref 3.3.1 object.__bool__ entry, after " should return False >> or True.", should say something like "Should not have side-effects, as >> redundant bool calls may be optimized away (bool(bool(ob)) should have >> the same result as bool(ob))."
> > Do you meant that it should be idempotent operation? Because > bool(bool(ob)) always have the same result as bool(ob)) if bool(ob) > returns True or False. That is what the parenthetical comment says, but it is not right in the context and should be deleted. For the "if a and True:" example, 'redundant bool calls may be optimized away.' might be better written as 'duplicate implied __bool__ calls may be avoided.' What I am trying to say is that *we* define the intended behavior of special methods, and we should define what an implementation may actually expect. The current optimizer expects __bool__ to have no side effects, at least none that it need respect. Having said what __bool__ should do, we can also say what it should not do to avoid possible surprises -- at least in production code, as opposed to 'testing' code like the examples in this thread. -- Terry Jan Reedy From status at bugs.python.org Fri Mar 30 12:09:58 2018 From: status at bugs.python.org (Python tracker) Date: Fri, 30 Mar 2018 18:09:58 +0200 (CEST) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20180330160958.87F8411A86A@psf.upfronthosting.co.za> ACTIVITY SUMMARY (2018-03-23 - 2018-03-30) Python tracker at https://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message. 
Issues counts and deltas: open 6543 (+14) closed 38393 (+46) total 44936 (+60) Open issues with patches: 2548 Issues opened (42) ================== #31793: Allow to specialize smart quotes in documentation translations https://bugs.python.org/issue31793 reopened by mdk #33128: PathFinder is twice on sys.meta_path https://bugs.python.org/issue33128 opened by htgoebel #33129: Add kwarg-only option to dataclass https://bugs.python.org/issue33129 opened by alan_du #33130: functools.reduce signature/docstring discordance https://bugs.python.org/issue33130 opened by vreuter #33131: Upgrade to pip 10 for Python 3.7 https://bugs.python.org/issue33131 opened by ncoghlan #33132: Possible refcount issues in the compiler https://bugs.python.org/issue33132 opened by serhiy.storchaka #33133: Don't return implicit optional types by get_type_hints https://bugs.python.org/issue33133 opened by levkivskyi #33135: Define field prefixes for the various config structs https://bugs.python.org/issue33135 opened by ncoghlan #33136: Harden ssl module against CVE-2018-8970 https://bugs.python.org/issue33136 opened by christian.heimes #33137: line traces may be missed on backward jumps when instrumented https://bugs.python.org/issue33137 opened by xdegaye #33138: Improve standard error for uncopyable types https://bugs.python.org/issue33138 opened by serhiy.storchaka #33139: Bdb doesn't find instruction in linecache after pdb.set_trace( https://bugs.python.org/issue33139 opened by prounce #33140: shutil.chown on Windows https://bugs.python.org/issue33140 opened by eryksun #33144: random._randbelow optimization https://bugs.python.org/issue33144 opened by wolma #33146: contextlib.suppress should capture exception for inspection an https://bugs.python.org/issue33146 opened by jason.coombs #33147: Update references for RFC 3548 to RFC 4648 https://bugs.python.org/issue33147 opened by paulehoffman #33148: RuntimeError('Event loop is closed') after cancelling getaddri 
https://bugs.python.org/issue33148 opened by vitaly.krug #33150: Signature error for methods of class configparser.Interpolatio https://bugs.python.org/issue33150 opened by acue #33152: Use list comprehension in timeit module instead of loop with a https://bugs.python.org/issue33152 opened by Windson Yang #33153: interpreter crash when multiplying large tuples https://bugs.python.org/issue33153 opened by imz #33154: subprocess.Popen ResourceWarning should have activation-deacti https://bugs.python.org/issue33154 opened by acue #33155: Use super().method instead in Logging https://bugs.python.org/issue33155 opened by madsjensen #33158: Add fileobj property to csv reader and writer objects https://bugs.python.org/issue33158 opened by samwyse #33159: Implement PEP 473 https://bugs.python.org/issue33159 opened by skreft #33161: Refactor of pathlib's _WindowsBehavior.gethomedir https://bugs.python.org/issue33161 opened by onlined #33162: TimedRotatingFileHandler in logging module https://bugs.python.org/issue33162 opened by Nikunj jain #33164: Blake 2 module update https://bugs.python.org/issue33164 opened by David Carlier #33165: Add stacklevel parameter to logging APIs https://bugs.python.org/issue33165 opened by ncoghlan #33166: os.cpu_count() returns wrong number of processors on specific https://bugs.python.org/issue33166 opened by yanirh #33167: RFC Documentation Updates to urllib.parse.rst https://bugs.python.org/issue33167 opened by agnosticdev #33168: distutils build/build_ext and --debug https://bugs.python.org/issue33168 opened by lazka #33169: importlib.invalidate_caches() doesn't clear all caches https://bugs.python.org/issue33169 opened by gvanrossum #33171: multiprocessing won't utilize all of platform resources https://bugs.python.org/issue33171 opened by yanirh #33173: GzipFile's .seekable() returns True even if underlying buffer https://bugs.python.org/issue33173 opened by Walt Askew #33174: error building the _sha3 module with Intel 2018 compilers 
https://bugs.python.org/issue33174 opened by wscullin #33176: Allow memoryview.cast(readonly=...) https://bugs.python.org/issue33176 opened by pitrou #33178: Add support for BigEndianUnion and LittleEndianUnion in ctypes https://bugs.python.org/issue33178 opened by emezh #33179: Investigate using a context variable for zero-arg super initia https://bugs.python.org/issue33179 opened by ncoghlan #33180: Flag for unusable sys.executable https://bugs.python.org/issue33180 opened by steve.dower #33181: SimpleHTTPRequestHandler shouldn't redirect to directories wit https://bugs.python.org/issue33181 opened by oulenz #33184: Update OpenSSL to 1.1.0h / 1.0.2o https://bugs.python.org/issue33184 opened by ned.deily #33185: Python 3.7.0b3 fails in pydoc where b2 did not. https://bugs.python.org/issue33185 opened by nedbat Most recent 15 issues with no replies (15) ========================================== #33185: Python 3.7.0b3 fails in pydoc where b2 did not. https://bugs.python.org/issue33185 #33184: Update OpenSSL to 1.1.0h / 1.0.2o https://bugs.python.org/issue33184 #33176: Allow memoryview.cast(readonly=...) 
https://bugs.python.org/issue33176 #33174: error building the _sha3 module with Intel 2018 compilers https://bugs.python.org/issue33174 #33173: GzipFile's .seekable() returns True even if underlying buffer https://bugs.python.org/issue33173 #33171: multiprocessing won't utilize all of platform resources https://bugs.python.org/issue33171 #33168: distutils build/build_ext and --debug https://bugs.python.org/issue33168 #33167: RFC Documentation Updates to urllib.parse.rst https://bugs.python.org/issue33167 #33165: Add stacklevel parameter to logging APIs https://bugs.python.org/issue33165 #33164: Blake 2 module update https://bugs.python.org/issue33164 #33162: TimedRotatingFileHandler in logging module https://bugs.python.org/issue33162 #33161: Refactor of pathlib's _WindowsBehavior.gethomedir https://bugs.python.org/issue33161 #33159: Implement PEP 473 https://bugs.python.org/issue33159 #33155: Use super().method instead in Logging https://bugs.python.org/issue33155 #33150: Signature error for methods of class configparser.Interpolatio https://bugs.python.org/issue33150 Most recent 15 issues waiting for review (15) ============================================= #33173: GzipFile's .seekable() returns True even if underlying buffer https://bugs.python.org/issue33173 #33167: RFC Documentation Updates to urllib.parse.rst https://bugs.python.org/issue33167 #33161: Refactor of pathlib's _WindowsBehavior.gethomedir https://bugs.python.org/issue33161 #33159: Implement PEP 473 https://bugs.python.org/issue33159 #33155: Use super().method instead in Logging https://bugs.python.org/issue33155 #33152: Use list comprehension in timeit module instead of loop with a https://bugs.python.org/issue33152 #33144: random._randbelow optimization https://bugs.python.org/issue33144 #33138: Improve standard error for uncopyable types https://bugs.python.org/issue33138 #33136: Harden ssl module against CVE-2018-8970 https://bugs.python.org/issue33136 #33132: Possible refcount issues in the 
compiler https://bugs.python.org/issue33132 #33129: Add kwarg-only option to dataclass https://bugs.python.org/issue33129 #33128: PathFinder is twice on sys.meta_path https://bugs.python.org/issue33128 #33124: Lazy execution of module bytecode https://bugs.python.org/issue33124 #33123: Path.unlink should have a missing_ok parameter https://bugs.python.org/issue33123 #33106: Deleting a key in a read-only gdbm results in KeyError, not gd https://bugs.python.org/issue33106 Top 10 most discussed issues (10) ================================= #33166: os.cpu_count() returns wrong number of processors on specific https://bugs.python.org/issue33166 16 msgs #33144: random._randbelow optimization https://bugs.python.org/issue33144 10 msgs #32850: Run gc_collect() before complaining about dangling threads https://bugs.python.org/issue32850 7 msgs #33023: Unable to copy ssl.SSLContext https://bugs.python.org/issue33023 7 msgs #33096: ttk.Treeview.insert() does not allow to insert item with "Fals https://bugs.python.org/issue33096 7 msgs #33131: Upgrade to pip 10 for Python 3.7 https://bugs.python.org/issue33131 6 msgs #32726: macOS installer and framework enhancements and changes for 3.7 https://bugs.python.org/issue32726 5 msgs #33153: interpreter crash when multiplying large tuples https://bugs.python.org/issue33153 5 msgs #33111: Merely importing tkinter breaks parallel code (multiprocessing https://bugs.python.org/issue33111 4 msgs #33128: PathFinder is twice on sys.meta_path https://bugs.python.org/issue33128 4 msgs Issues closed (44) ================== #17994: Change necessary in platform.py to support IronPython https://bugs.python.org/issue17994 closed by csabella #23388: datetime.strftime('%s') does not take timezone into account https://bugs.python.org/issue23388 closed by csabella #25782: CPython hangs on error __context__ set to the error itself https://bugs.python.org/issue25782 closed by gregory.p.smith #27428: Document WindowsRegistryFinder inherits from 
MetaPathFinder https://bugs.python.org/issue27428 closed by brett.cannon #31455: ElementTree.XMLParser() mishandles exceptions https://bugs.python.org/issue31455 closed by serhiy.storchaka #31550: Inconsistent error message for TypeError with subscripting https://bugs.python.org/issue31550 closed by rhettinger #31639: http.server and SimpleHTTPServer hang after a few requests https://bugs.python.org/issue31639 closed by mdk #32358: json.dump: fp must be a text file object https://bugs.python.org/issue32358 closed by berker.peksag #32517: test_read_pty_output() of test_asyncio hangs on macOS 10.13.2 https://bugs.python.org/issue32517 closed by ned.deily #32563: -Werror=declaration-after-statement expat build failure on Pyt https://bugs.python.org/issue32563 closed by ncoghlan #32844: subprocess may incorrectly redirect a low fd to stderr if anot https://bugs.python.org/issue32844 closed by izbyshev #32873: Pickling of typing types https://bugs.python.org/issue32873 closed by levkivskyi #32932: better error message when __all__ contains non-str objects https://bugs.python.org/issue32932 closed by xiang.zhang #32943: confusing error message for rot13 codec https://bugs.python.org/issue32943 closed by xiang.zhang #33042: New 3.7 startup sequence crashes PyInstaller https://bugs.python.org/issue33042 closed by ncoghlan #33053: Avoid adding an empty directory to sys.path when running a mod https://bugs.python.org/issue33053 closed by ncoghlan #33055: bytes does not implement __bytes__() https://bugs.python.org/issue33055 closed by serhiy.storchaka #33061: NoReturn missing from __all__ in typing.py https://bugs.python.org/issue33061 closed by levkivskyi #33079: subprocess: document the interaction between subprocess.Popen https://bugs.python.org/issue33079 closed by gregory.p.smith #33081: multiprocessing Queue leaks a file descriptor associated with https://bugs.python.org/issue33081 closed by pitrou #33093: Fatal error on SSL transport https://bugs.python.org/issue33093 
closed by ned.deily #33114: random.sample() behavior is unexpected/unclear from docs https://bugs.python.org/issue33114 closed by rhettinger #33115: Asyncio loop blocks with a lot of parallel tasks https://bugs.python.org/issue33115 closed by terry.reedy #33119: python sys.argv argument parsing not clear https://bugs.python.org/issue33119 closed by ncoghlan #33120: infinite loop in inspect.unwrap(unittest.mock.call) https://bugs.python.org/issue33120 closed by terry.reedy #33126: Some C buffer protocol APIs not documented https://bugs.python.org/issue33126 closed by pitrou #33127: Python 2.7.14 won't build ssl module with Libressl 2.7.0 https://bugs.python.org/issue33127 closed by christian.heimes #33134: dataclasses: use function dispatch instead of multiple tests f https://bugs.python.org/issue33134 closed by eric.smith #33141: descriptor __set_name__ feature broken for dataclass descripto https://bugs.python.org/issue33141 closed by eric.smith #33142: Fatal Python error: Py_Initialize: Unable to get the locale en https://bugs.python.org/issue33142 closed by ned.deily #33143: encode UTF-16 generates unexpected results https://bugs.python.org/issue33143 closed by serhiy.storchaka #33145: unaligned accesses in siphash24() lead to crashes on sparc https://bugs.python.org/issue33145 closed by serhiy.storchaka #33149: Parser stack overflows https://bugs.python.org/issue33149 closed by ned.deily #33151: importlib.resources breaks on subdirectories https://bugs.python.org/issue33151 closed by barry #33156: Use super().method instead in email classes. 
https://bugs.python.org/issue33156  closed by r.david.murray

#33157: Strings beginning with underscore not removed from lists - fea
https://bugs.python.org/issue33157  closed by xiang.zhang

#33160: Negative values in positional access inside formatting
https://bugs.python.org/issue33160  closed by serhiy.storchaka

#33163: Upgrade pip to 9.0.3 and setuptools to v39.0.1
https://bugs.python.org/issue33163  closed by ned.deily

#33170: New type based on int() created with typing.NewType is not con
https://bugs.python.org/issue33170  closed by avanov

#33172: Update built-in version of SQLite3
https://bugs.python.org/issue33172  closed by ned.deily

#33175: dataclasses should look up __set_name__ on class, not instance
https://bugs.python.org/issue33175  closed by eric.smith

#33177: make install hangs on macOS when there is an existing Python a
https://bugs.python.org/issue33177  closed by ned.deily

#33182: Python 3.7.0b3 fails to build with clang 6.0
https://bugs.python.org/issue33182  closed by ncoghlan

#33183: Refactoring: replacing some assertTrue by assertIn
https://bugs.python.org/issue33183  closed by serhiy.storchaka

From chris.jerdonek at gmail.com Fri Mar 30 13:33:07 2018
From: chris.jerdonek at gmail.com (Chris Jerdonek)
Date: Fri, 30 Mar 2018 10:33:07 -0700
Subject: [Python-Dev] Subtle difference between f-strings and str.format()
In-Reply-To:
References:
Message-ID:

On Fri, Mar 30, 2018 at 4:41 AM, Nick Coghlan wrote:
> On 30 March 2018 at 21:16, Nathaniel Smith wrote:
>> And bool(obj) does always return True or False; if you define a
>> __bool__ method that returns something else then bool rejects it and
>> raises TypeError. So bool(bool(obj)) is already indistinguishable from
>> bool(obj).
>>
>> However, the naive implementation of 'if a and True:' doesn't call
>> bool(bool(a)), it calls bool(a) twice, and this *is* distinguishable
>> by user code, at least in principle.
>
> For example:
>
>     >>> class FlipFlop:
>     ...     _state = False
>     ...     def __bool__(self):
>     ...         result = self._state
>     ...         self._state = not result
>     ...         return result
>     ...
>     >>> toggle = FlipFlop()
>     >>> bool(toggle)
>     False
>     >>> bool(toggle)
>     True
>     >>> bool(toggle)
>     False
>     >>> bool(toggle) and bool(toggle)
>     False
>     >>> toggle and toggle
>     <__main__.FlipFlop object at 0x7f35293604e0>
>     >>> bool(toggle and toggle)
>     True
>
> So the general principle is that __bool__ implementations shouldn't do
> anything that will change the result of the next call to __bool__, or
> else weirdness is going to result.

I don't think this way of stating it is general enough. For example,
you could have a nondeterministic implementation of __bool__ that
doesn't itself carry any state (e.g. flipping the result with some
probability), yet the next call could still return a different result.

So I think Nathaniel's way of stating it is probably better:

> If we want to change the language spec, I guess it would be with text
> like: "if bool(obj) would be called twice in immediate succession,
> with no other code in between, then the interpreter may assume that
> both calls would return the same value and elide one of them".

--Chris

From storchaka at gmail.com Fri Mar 30 14:40:21 2018
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Fri, 30 Mar 2018 21:40:21 +0300
Subject: [Python-Dev] Nuking wstr [Re: How can we use 48bit pointer safely?]
In-Reply-To: <20180330155442.6bf2cd9f@fsol>
References: <20180330155442.6bf2cd9f@fsol>
Message-ID:

30.03.18 16:54, Antoine Pitrou wrote:
> We could also simply nuke wstr. I frankly don't think it's very
> important. It's only used when calling system functions taking a
> wchar_t argument, as an « optimization ». I'd be willing to
> guess that modern workloads aren't bottlenecked by the cost overhead of
> those system functions...

This is possible only after removing the whole Py_UNICODE-related C API.
It has been deprecated since 3.3, but only in the documentation, and it
should stay until the EOL of 2.7.
Only in 3.7 did most of these functions start emitting deprecation
warnings at compile time (GCC-only). [1] It would be good to have them
emitted by other compilers too.

In future versions we could make them emit user-visible runtime
deprecation warnings, and finally make them always fail after 2020.

[1] https://bugs.python.org/issue19569

From tim.peters at gmail.com Fri Mar 30 20:11:10 2018
From: tim.peters at gmail.com (Tim Peters)
Date: Fri, 30 Mar 2018 19:11:10 -0500
Subject: [Python-Dev] Subtle difference between f-strings and str.format()
In-Reply-To: <20180329231647.GO16661@ando.pearwood.info>
References: <20180329231647.GO16661@ando.pearwood.info>
Message-ID:

[Steven D'Aprano]
> ...
> Is there a down-side to 2b? It sounds like something you might end up
> doing at a later date regardless of what you do now.

There are always downsides ;-)

As Serhiy noted later, the idea that "it's faster" is an educated
guess - you can't know before it's implemented. Changes to the very
complicated eval loop often have not only surprising speed
consequences on one platform, but even consequences in opposite
directions across platforms.

Not necessarily in the part you directly changed, either. Optimizing
C compilers just can't reliably guess what's most important in such a
massive pile of test-and-branch laden code. Indeed, which paths
through the eval loop _are_ most important depends on the Python
program you're running at the time (which is, e.g., why
"profile-guided optimization" was invented).

So there's an ocean of potential complications there, and wading
through those has opportunity cost too: Serhiy is a very productive
contributor, but time he spends on this is time he won't be spending
on other things of potentially greater value. That's all up to him,
though.

I'm not keen on changing the behavior of f-strings regardless (2a or
2b).
While their implementation details aren't documented, they were
intentional, and follow the pattern increasingly large parts of the
language and std library adopted after the iterator protocol was
introduced: compute intermediate results as they're needed, not all
in advance. That's proved to have many advantages.

It's certainly possible to write custom purely functional (no side
effects) __format__ methods such that memory use in an f-string
remains bounded under the current implementation, but can grow without
bound if all __format__ arguments need to be evaluated before any
formatting begins. It's akin to the difference between iterating over
range() and xrange() in Python 2.

I don't know that there's any real f-string code out there _relying_
on that - but don't know that there isn't either. It's more plausible
to me than that there are non-functional real __format__ methods.

I'd be happiest if no behaviors changed in anything. Then the only
downsides to optimizing are code bloat, code churn, new bugs, subtler
corner cases, less predictable behavior for end users, and increased
implementation complexity forever after ;-)

From stefan_ml at behnel.de Sat Mar 31 04:17:37 2018
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sat, 31 Mar 2018 10:17:37 +0200
Subject: [Python-Dev] Subtle difference between f-strings and str.format()
In-Reply-To:
References:
Message-ID:

Serhiy Storchaka wrote on 28.03.2018 at 17:27:
> There is a subtle semantic difference between str.format() and "equivalent"
> f-string.
>
>     '{}{}'.format(a, b)
>     f'{a}{b}'
>
> In the former case b is evaluated before formatting a. This is equivalent to
>
>     t1 = a
>     t2 = b
>     t3 = format(t1)
>     t4 = format(t2)
>     r = t3 + t4
>
> In the latter case a is formatted before evaluating b. This is equivalent to
>
>     t1 = a
>     t2 = format(t1)
>     t3 = b
>     t4 = format(t3)
>     r = t2 + t4
>
> In most cases this doesn't matter, but when implementing the optimization
> that transforms the former expression into the latter one ([1], [2]) we have
> to make a decision what to do with this difference.
>
> [1] https://bugs.python.org/issue28307
> [2] https://bugs.python.org/issue28308

Just for the record, I implemented the translation-time transformation
described in [1] (i.e. '%' string formatting with a tuple) in Cython 0.28
(released in mid-March, see [3]), and it has the same problem of changing
the behaviour. I was aware of this at the time I implemented it, but decided
to postpone the fix of evaluating the arguments before formatting them, as I
considered it low priority compared to the faster execution.

I expect failures during formatting to be rare in real-world code, where
string building tends to be at the end of a processing step that should
catch many value problems already. Rare, but not impossible.

I do consider this change in behaviour a bug that should be fixed, and I
would also consider it a bug in CPython if it was added there.

Stefan

[3] https://github.com/cython/cython/blob/de618c0141ae818e7a4c35d46256d98e6b6dba53/Cython/Compiler/Optimize.py#L4261

From solipsis at pitrou.net Sat Mar 31 05:35:10 2018
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sat, 31 Mar 2018 11:35:10 +0200
Subject: [Python-Dev] Nuking wstr [Re: How can we use 48bit pointer safely?]
References: <20180330155442.6bf2cd9f@fsol>
Message-ID: <20180331113510.48624002@fsol>

On Fri, 30 Mar 2018 21:40:21 +0300
Serhiy Storchaka wrote:
> 30.03.18 16:54, Antoine Pitrou wrote:
> > We could also simply nuke wstr. I frankly don't think it's very
> > important. It's only used when calling system functions taking a
> > wchar_t argument, as an « optimization ». I'd be willing to
> > guess that modern workloads aren't bottlenecked by the cost overhead of
> > those system functions...
>
> This is possible only after removing the whole Py_UNICODE-related C API.
> It has been deprecated since 3.3, but only in the documentation, and it
> should stay until the EOL of 2.7. Only in 3.7 did most of these functions
> start emitting deprecation warnings at compile time (GCC-only). [1] It
> would be good to have them emitted by other compilers too.

It should be possible with MSVC:
https://stackoverflow.com/a/295229/10194

and clang as well:
http://releases.llvm.org/3.9.1/tools/clang/docs/AttributeReference.html#deprecated-gnu-deprecated

Regards

Antoine.
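
As a footnote to the "Subtle difference between f-strings and
str.format()" thread above: the evaluation-order difference Serhiy
describes can be observed directly. What follows is only a minimal
sketch, not code from any of the messages. The names Traced, make,
trace_str_format and trace_fstring are hypothetical helpers introduced
for illustration; they record when each argument expression is
evaluated and when its __format__ method runs.

```python
# Illustrative sketch only: Traced, make, trace_str_format and
# trace_fstring are hypothetical names, not part of any library.
events = []


class Traced:
    """Object whose __format__ call is recorded in `events`."""

    def __init__(self, name):
        self.name = name

    def __format__(self, spec):
        events.append("format " + self.name)
        return self.name


def make(name):
    """Stand-in for an argument expression; records its evaluation."""
    events.append("evaluate " + name)
    return Traced(name)


def trace_str_format():
    # str.format(): both arguments are evaluated before any formatting.
    events.clear()
    _ = '{}{}'.format(make('a'), make('b'))
    return list(events)


def trace_fstring():
    # f-string: each value is formatted before the next is evaluated.
    events.clear()
    _ = f"{make('a')}{make('b')}"
    return list(events)


if __name__ == "__main__":
    print(trace_str_format())
    # ['evaluate a', 'evaluate b', 'format a', 'format b']
    print(trace_fstring())
    # ['evaluate a', 'format a', 'evaluate b', 'format b']
```

The two traces differ only in the middle two events, which is why the
difference goes unnoticed unless evaluating or formatting an argument
has a side effect or raises.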